C# consume rss feed containing xml-stylesheet? - c#

I have a problem parsing an RSS feed using C#.
I used to use this method to load the feed:
XDocument rssFeed = XDocument.Load(@url);
But I noticed that when the feed has an xml-stylesheet tag, this method crashes, saying the XML is not well formed...
Here's an RSS feed that contains this tag:
http://www.channelnews.fr/accueil.feed?type=rss
What would be the best way to parse any rss feed using c#?
Thanks for your help

This code works for me
static XDocument DownloadPage()
{
    var req = (HttpWebRequest)WebRequest.Create("http://www.channelnews.fr/accueil.feed?type=rss");
    req.UserAgent = "Mozilla";
    using (var response = req.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var reader = new StreamReader(stream))
        return XDocument.Load(reader);
}
Note that if you omit setting UserAgent, the response will contain the string 'DOS', which is definitely not XML :)

This one works nicer:
XDocument xdoc = XDocument.Load("http://pedroliska.wordpress.com/feed/");
var items = (from i in xdoc.Descendants("item")
             select new
             {
                 Title = i.Element("title").Value
             }).ToList(); // materialize the query so the results can be indexed
So now you can access the RSS titles with a loop, or by index:
items[0].Title
Just as the code pulls the title from the RSS feed, you can pull the description, link, pubDate, etc.
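As a rough sketch of pulling more than the title, the query can project several child elements at once. The class name RssItems, the Parse helper, and the inline sample feed are all hypothetical, chosen only so the example runs without a network call. The (string) cast on an XElement returns null when the element is missing, which avoids the NullReferenceException that .Value would throw on an item without a pubDate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RssItems
{
    // Hypothetical helper: projects each <item> into a tuple of common RSS 2.0 fields.
    public static List<(string Title, string Link, string PubDate)> Parse(string rssXml)
    {
        XDocument xdoc = XDocument.Parse(rssXml);
        return xdoc.Descendants("item")
            .Select(i => (
                Title: (string)i.Element("title"),      // null if <title> is absent
                Link: (string)i.Element("link"),
                PubDate: (string)i.Element("pubDate")))
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample feed; the second item deliberately has no <pubDate>.
        const string sample = @"<rss version=""2.0""><channel>
  <item><title>First</title><link>http://example.com/1</link><pubDate>Tue, 14 May 2013 20:43:28 GMT</pubDate></item>
  <item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>";

        foreach (var item in Parse(sample))
        {
            Console.WriteLine("{0} | {1} | {2}", item.Title, item.Link, item.PubDate ?? "(no date)");
        }
    }
}
```

With a real feed you would pass the text of the downloaded document instead of the sample string.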

Related

XDocument Load - cannot open

I'm trying to load an RSS feed with XDocument.
The url is:
http://www.ft.com/rss/home/uk
XDocument doc = XDocument.Load(url);
But I'm getting an error:
Cannot open 'http://www.ft.com/rss/home/uk'. The Uri parameter must be a file system relative or absolute path.
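The XDocument.Load overload available here does not accept URLs, only file paths, as the error message says. (On the full .NET Framework, XDocument.Load(string) does accept a URL.)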
Try something like the following code which I totally did not test:
using (var httpclient = new HttpClient())
{
    var response = await httpclient.GetAsync("http://www.ft.com/rss/home/uk");
    var xDoc = XDocument.Load(await response.Content.ReadAsStreamAsync());
}

Transform XML returned from a web request using XSLT

I see several questions that are close to this but none exactly cover it:
How to apply an XSLT Stylesheet in C#
XSLT Transform of XML using Xml data from a web form
How to transform an xml structure generated from a request to a web services
I can cobble something together from these but I worry I am passing it through too many steps to be efficient.
What I currently have is this, to read XML from a HTTP web request:
WebRequest request = WebRequest.Create(url);
WebResponse response = request.GetResponse();
Stream stream = response.GetResponseStream();
StreamReader streamReader = new StreamReader(stream);
string xml = streamReader.ReadToEnd();
This was written before an XSLT transform was needed. Now I have a (possibly null) XslCompiledTransform object.
So I want to add a block like:
if(transform != null)
{
xml = transform.Transform(xml);
}
Clearly this isn't possible as written. I see StringReaders and XmlReaders can be created but is it inefficient to get my xml as a string and then push it back into another object? Can I use my stream or streamReader objects directly to support the same basic flow, but with optional transformation?
Personally I'd use the XmlDocument.Load() function to load the XML from the URL, without using WebRequest in this case.
You can then pass the XmlDocument straight to XslCompiledTransform.Transform().
XmlDocument doc = new XmlDocument();
doc.Load(url);
if (transform != null)
{
    XmlDocument tempDoc = new XmlDocument();
    using (XmlWriter writer = tempDoc.CreateNavigator().AppendChild())
    {
        transform.Transform(doc, writer);
    }
    doc = tempDoc;
} // Use your XmlDocument for your transformed output
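For the part of the question about using the stream directly, here is a minimal sketch that wraps the stream in an XmlReader and hands it to XslCompiledTransform.Transform(XmlReader, XmlWriter), so the XML is never copied into an intermediate string. The class name StreamTransform, the ReadXml helper, the inline stylesheet, and the MemoryStream standing in for response.GetResponseStream() are assumptions made for illustration only.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

public static class StreamTransform
{
    // Hypothetical helper: returns the XML from the stream, transformed if a transform is supplied.
    public static string ReadXml(Stream stream, XslCompiledTransform transform)
    {
        if (transform == null)
        {
            // No transform: same flow as the original code.
            using (var streamReader = new StreamReader(stream))
            {
                return streamReader.ReadToEnd();
            }
        }

        var output = new StringBuilder();
        using (XmlReader input = XmlReader.Create(stream))
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            // The transform pulls from the stream-backed reader directly.
            transform.Transform(input, writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        // Made-up stylesheet that outputs the text of /root/name.
        const string xslt = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text""/>
  <xsl:template match=""/""><xsl:value-of select=""/root/name""/></xsl:template>
</xsl:stylesheet>";

        var transform = new XslCompiledTransform();
        using (XmlReader xsltReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(xsltReader);
        }

        // The MemoryStream stands in for response.GetResponseStream().
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes("<root><name>hello</name></root>")))
        {
            Console.WriteLine(ReadXml(stream, transform));
        }
    }
}
```

Passing transform.OutputSettings to XmlWriter.Create lets the stylesheet's xsl:output settings (text, HTML, indentation) carry through to the result.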

How Can I Read The XML

I'm getting geographic info from a web service.
I've been trying to parse the returned data for hours, but I'm getting nowhere.
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
    StreamReader reader = new StreamReader(response.GetResponseStream());
    string result = reader.ReadToEnd();
    XDocument document = XDocument.Parse(result, LoadOptions.None);
}
I got this:
document <html>
  <body>
    <state>Apure</state>
    <municipality>RÓMULO GALLEGOS</municipality>
    <parish>URBANA ELORZA</parish>
    <street>La Trinidad De Arauca</street>
  </body>
</html> System.Xml.Linq.XDocument
I tried:
document.Elements("state")
document.Descendants("body")
document.GetElementsByTagName("state");
But got nothing.
I'm sure there is a simple way of doing something so basic.
I'm seriously considering converting it to a string and doing the parsing myself.
Additional consideration:
The set of fields included in the result is variable,
because some records don't have all the fields.
OK, I made a change.
I read an XElement instead of an XDocument:
XElement sitemap = XElement.Parse(result, LoadOptions.None);
foreach (var bodyElement in sitemap.Elements("body"))
{
    foreach (var fieldElement in bodyElement.Elements())
    {
        Console.WriteLine(fieldElement.Name);
        Console.WriteLine(fieldElement.Value);
    }
}
There is probably a way to skip the first foreach, but I'm still looking for it.
@Jonesy's line works, but that means I have to know the field names. This way I just create the info for the values I got.
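One possible way to drop the outer loop, sketched under assumptions: LINQ to XML has an Elements() extension method on IEnumerable<XElement>, so sitemap.Elements("body").Elements() yields every field element in one sequence. The GeoFields class and Read helper below are hypothetical names, and the dictionary assumes each field name appears at most once. It also helps explain the earlier empty result: Elements("state") only matches direct children, and the document's only direct child is <html>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class GeoFields
{
    // Hypothetical helper: collects whatever fields are present under <body>, without naming them.
    public static Dictionary<string, string> Read(string result)
    {
        XElement root = XElement.Parse(result, LoadOptions.None);

        // Elements("body") selects <body>; the Elements() extension then flattens its children.
        // ToDictionary throws on duplicate keys, so this assumes field names are unique.
        return root.Elements("body")
                   .Elements()
                   .ToDictionary(e => e.Name.LocalName, e => e.Value);
    }

    public static void Main()
    {
        const string result =
            "<html><body><state>Apure</state><municipality>RÓMULO GALLEGOS</municipality>" +
            "<parish>URBANA ELORZA</parish><street>La Trinidad De Arauca</street></body></html>";

        foreach (KeyValuePair<string, string> field in Read(result))
        {
            Console.WriteLine("{0} = {1}", field.Key, field.Value);
        }
    }
}
```

Because the fields are read generically, a response that omits some of them simply produces a smaller dictionary.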

Converting JSON to XML

I am trying to convert JSON output into XML. Unfortunately I get this error:
JSON root object has multiple properties. The root object must have a single property in order to create a valid XML document. Consider specifing a DeserializeRootElementName.
This is what I have so far:
string url = string.Format("https://graph.facebook.com/{0}?fields=posts.fields(message)&access_token={1}", user_name, access_token);
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
    StreamReader reader = new StreamReader(response.GetResponseStream());
    jsonOutput = reader.ReadToEnd();
    Console.WriteLine("THIS IS JSON OUTPUT: " + jsonOutput);
}
XmlDocument doc = (XmlDocument)JsonConvert.DeserializeXmlNode(jsonOutput);
Console.WriteLine(doc);
And this is my JSON output:
{"id":"108013515952807","posts":{"data":[{"id":"108013515952807_470186843068804","created_time":"2013-05-14T20:43:28+0000"},{"message":"TEKST","id":"108013515952807_470178529736302","created_time":"2013-05-14T20:22:07+0000"}
How can I solve this problem?
Despite the fact your JSON provided in the question is not complete, you have multiple properties at the top level as indicated by the exception. You have to define the root for it to get valid XML:
var doc = JsonConvert.DeserializeXmlNode(jsonOutput, "root");
EDIT: In order to print out your XML with indentation you can use XDocument class from System.Xml.Linq namespace: XDocument.Parse(doc.InnerXml).
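A small sketch of that indentation step, assuming nothing beyond the base class library: the XmlDocument here is built with LoadXml from a made-up string and merely stands in for the document returned by DeserializeXmlNode. The XmlPrettyPrint class and Indent helper are hypothetical names. XDocument.ToString() indents its output by default.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlPrettyPrint
{
    // Hypothetical helper: re-parses the XmlDocument's markup as an XDocument and returns indented text.
    public static string Indent(XmlDocument doc)
    {
        return XDocument.Parse(doc.InnerXml).ToString();
    }

    public static void Main()
    {
        // Stand-in for: var doc = JsonConvert.DeserializeXmlNode(jsonOutput, "root");
        var doc = new XmlDocument();
        doc.LoadXml("<root><id>108013515952807</id><posts><data><message>TEKST</message></data></posts></root>");

        Console.WriteLine(Indent(doc));
    }
}
```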
I thought it's worth linking to the Documentation for turning xml to json and the other way around.
The guys are right..
// To convert an XML node contained in string xml into a JSON string
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
string jsonText = JsonConvert.SerializeXmlNode(doc);
// To convert JSON text contained in string json into an XML node
XmlDocument doc = (XmlDocument)JsonConvert.DeserializeXmlNode(json);
You can do JSON-to-XML also by using the .NET Framework (System.Runtime.Serialization.Json):
private static XDocument JsonToXml(string jsonString)
{
    // UTF-8 rather than ASCII, so non-ASCII characters in the JSON are preserved
    using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(jsonString)))
    {
        var quotas = new XmlDictionaryReaderQuotas();
        return XDocument.Load(JsonReaderWriterFactory.CreateJsonReader(stream, quotas));
    }
}
DeserializeXNode returns an XDocument (DeserializeXmlNode returns an XmlDocument).
If you need an XNode, use FirstNode.
//string jsonOutput="{"id":"108013515952807","posts":{"data":[{"id":"108013515952807_470186843068804","created_time":"2013-05-14T20:43:28+0000"},{"message":"TEKST","id":"108013515952807_470178529736302","created_time":"2013-05-14T20:22:07+0000"}";
var myelement = JsonConvert.DeserializeXNode(jsonOutput, "myelement").FirstNode;
The JSON you shared is invalid. Please run it through http://jsonformatter.curiousconcept.com/ and validate it first.
Your JSON should look like:
{
  "id":"108013515952807",
  "posts":{
    "data":[
      {
        "id":"108013515952807_470186843068804",
        "created_time":"2013-05-14T20:43:28+0000"
      },
      {
        "message":"TEKST",
        "id":"108013515952807_470178529736302",
        "created_time":"2013-05-14T20:22:07+0000"
      }
    ]
  }
}
Adding on to @jwaliszko's answer, converting JSON to an XDocument:
XDocument xml = JsonConvert.DeserializeXNode(json);

Pull RSS Feeds From Facebook Page

I need help pulling RSS feeds from a Facebook page. I'm using the following code, but it keeps giving me an error:
string url =
    "https://www.facebook.com/feeds/page.php?id=40796308305&format=rss20";
XmlReaderSettings settings =
    new XmlReaderSettings
    {
        XmlResolver = null,
        DtdProcessing = DtdProcessing.Parse,
    };
XmlReader reader = XmlReader.Create(url, settings);
SyndicationFeed feed = SyndicationFeed.Load(reader);
foreach (var item in feed.Items)
{
    Console.WriteLine(item.Id);
    Console.WriteLine(item.Title.Text);
    Console.WriteLine(item.Summary.Text);
}
if (reader != null) reader.Close();
This code works perfectly with any blog or page RSS feed, but with the Facebook RSS feed it throws an exception with the following message:
The element with name 'html' and namespace 'http://www.w3.org/1999/xhtml' is not an allowed feed format.
Thanks
Facebook will return HTML in this instance because it doesn't like the User Agent supplied by XmlReader. Since you can't customize it, you will need a different solution to grab the feed. This should solve your problem:
var req = (HttpWebRequest)WebRequest.Create(url);
req.Method = "GET";
req.UserAgent = "Fiddler";
var rep = req.GetResponse();
var reader = XmlReader.Create(rep.GetResponseStream());
SyndicationFeed feed = SyndicationFeed.Load(reader);
This is strictly a behavior of Facebook, but the proposed change should work equally well for other sites that are okay with your current implementation.
It works when using Gregory's code above if you change the feed format to atom10 instead of rss20.
Change the url:
string url =
"https://www.facebook.com/feeds/page.php?id=40796308305&format=atom10";
In my case the Facebook feed was also difficult to consume, so I used FeedBurner to burn the feed for my Facebook page. FeedBurner generated the feed in Atom 1.0 format, and I then consumed it successfully :) with the System.ServiceModel.Syndication classes. My code was:
string Main()
{
    var url = "http://feeds.feedburner.com/Per.........all";
    Atom10FeedFormatter formatter = new Atom10FeedFormatter();
    using (XmlReader reader = XmlReader.Create(url))
    {
        formatter.ReadFrom(reader);
    }
    var s = "";
    foreach (SyndicationItem item in formatter.Feed.Items)
    {
        s += String.Format("[{0}][{1}] {2}", item.PublishDate, item.Title.Text, ((TextSyndicationContent)item.Content).Text);
    }
    return s;
}
