Xpath expression to get paragraphs from rssfeed

Xpath expression to get paragraphs from rssfeed - c#

http://lankacnews.com/sinhala/feed/
this is a feed address of a news web site. It contains 11 latest news.
I want to get those news from that rss feed and show them on my program. It contains news like this.
http://i44.tinypic.com/a3cxsw.png
between content : encoded tag.
I want to get it and show it in my application.
This is the code I have used..
XmlDocument lkcnws = new XmlDocument();
lkcnws.Load(#"http://lankacnews.com/sinhala/feed/");
textBox1.Text = lkcnws.OuterXml;
XmlNodeList ndlst;
XmlNode root = lkcnws.DocumentElement;
ndlst = root.SelectNodes("//p");
foreach (XmlNode nd in ndlst)
{
textBox2.Text += nd.OuterXml;
}
but it does not work. Whats the wrong with this code and how can I solve this ?

Try this:
void test() {
XmlDocument lkcnws = new XmlDocument();
lkcnws.Load("http://lankacnews.com/sinhala/feed/");
textBox1.Text = lkcnws.OuterXml;
XmlNodeList ndlst = lkcnws.SelectNodes("//category['#*']"); // it's null with p
foreach (XmlNode nd in ndlst)
{
textBox2.Text += nd.InnerXml;
}
}

Related

Scrape data with HtmlAgilityPack for a tag which doesn't have class

Here is my C# code what i am trying to do is to scrape data from a website by using HtmlAgilityPack but it's showing nothing found every time don't know what i am doing wrong a bit confused
HtmlAgilityPack.HtmlWeb webb = new HtmlAgilityPack.HtmlWeb();
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
HtmlAgilityPack.HtmlDocument doc = webb.Load("mywebsite");
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//ul[#class='unstyled']//li//a");
if (nodes != null)
{
foreach (HtmlNode n in nodes)
{
q = n.InnerText;
q = System.Net.WebUtility.HtmlDecode(q);
q = q.Trim();
Console.WriteLine(q);
}
}
else
{
Console.WriteLine("nothing found");
}
Here is the picture of the tag from which i am trying to capture data i need data from <a> tag .

The XPath used to select the tag is incorrect.
HtmlNodeCollection nodes =
doc.DocumentNode.SelectNodes("//ul[#class='unstyled']/li/a");
This should select all the anchor nodes and then you can loop through the nodes to get the InnerHtml.
Working sample shown below
string s = "<ul class='unstyle no-overflow'><li><ul class='unstyled'><li><a href='http://www.smsconnexion.com'>SMS ConneXion</a></li></ul><ul class='unstyled'><li><a href='http://www.celusion.com'>Celusion</a></li></ul></li></ul>";
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(s);
HtmlNodeCollection nodes =
doc.DocumentNode.SelectNodes("//ul[#class='unstyled']/li/a");
foreach(var node in nodes)
{
Console.WriteLine(node.Attributes["href"].Value);
}
Console.ReadLine();

How to retrieve attribute from XML file in C#

I'm complete rookie I need to retrieve the value of last id in XML file here is

You can also use XPath within the Xml DOM like this :
string title;
XmlDocument xml = new XmlDocument();
xml.Load("~/purchases.xml");
// Or any other method to load your xml data in the XmlDocument.
// For example if your xml data are in a string, use the LoadXml method.
XmlElement elt = xml.SelectSingleNode("//SubMenu[#id='1']") as XmlElement;
if(elt!=null)
{
name=elt.GetAttribute("title");
}
Reference

As Jon suggested, you can use Linq To XML here.
XElement books = XElement.Load(filePath);
var lastId = books.Descendants("book").Select(x=>Int32.Parse(x.Attribute("ID").Value)).Last();
This will give you the last ID in the current list.You can now create your new Node
books.Add(new XElement("book",new XAttribute("ID",(lastId+1).ToString()),
new XAttribute("title","New Title"),
new XAttribute("price","1234")));
books.Save(filePath);

XmlDocument doc = new XmlDocument();
doc.Load("Yourxmlfilepath.xml");
//Display all the book titles.
XmlNodeList xNodeList = doc.SelectNodes("/bookstore/book");
foreach (XmlNode xNode in xNodeList)
{
var employeeName = xNode.OuterXml;
XmlDocument docnew = new XmlDocument();
docnew.LoadXml(employeeName);
foreach (XmlElement report in docnew.SelectNodes("book"))
{
string ID = report.GetAttribute("ID");
string title = report.GetAttribute("title");
string quantity = report.GetAttribute("quantity");
string price = report.GetAttribute("price");
}
}

XML to string (C#)

I have a XML loaded from a URL like this:
WebClient client = new WebClient();
client.Encoding = Encoding.UTF8;
try
{
string reply = client.DownloadString("http://Example.com/somefile.xml");
label1.Text = reply;
}
catch
{
label1.Text = "FAILED";
}
That XML belongs to a RSS Feed. I want that label1.Text shows just the titles of that XML. How can I achieve that?
Example of label1.Text
This is my first title - This is my 2nd title - And this is my last title

You can load your XML into an XmlDocument and then use XPath to Get the value of each node you're targeting.
XmlDocument doc = new XmlDocument();
doc.LoadXml(reply);
XmlNodeList nodes = doc.SelectNodes("//NodeToSelect");
foreach (XmlNode node in nodes)
{
//If the value you want is the content of the node
label1.Text = node.InnerText;
//If the value you want is an attribute of the node
label1.Text = node.Attributes["AttibuteName"].Value;
}
If you are not familiar with XPath you can always check here :
http://www.w3schools.com/xpath/xpath_syntax.asp

var xml= XElement.Parse(reply);
label1.Text = string.Join(Environment.NewLine, xml
.Descendants()
.Where (x => !string.IsNullOrEmpty(x.Value))
.Select(x=> string.Format("{0}: {1}", x.Name, x.Value))
.ToArray());

You probably need to parse the RSS XML manually to get the title. Here is some sample code for your reference:
private static List<FeedsItem> ParseFeeds(string feedsXml)
{
XDocument xDoc = XDocument.Parse(feedsXml);
XNamespace xmlns = "http://www.w3.org/2005/Atom";
var items = from entry in xDoc.Descendants(xmlns + "entry")
select new FeedsItem
{
Id = (string)entry.Element(xmlns + "id").Value,
Title = (string)entry.Element(xmlns + "title").Value,
AlternateLink = (string)entry.Descendants(xmlns + "link").Where(link => link.Attribute("rel").Value == "alternate").First().Attribute("href").Value
};
Console.WriteLine("Count = {0}", items.Count());
foreach(var i in items)
{
Console.WriteLine(i);
}
return null;
}

How to get xml element values using XmlNodeList in c#

I need to store the element values which are inside the nodes "member" . I have tried the following code but I can't achieve it. How to get the values. Any help would be appreciated
XML:
<ListInventorySupplyResponse xmlns="http://mws.amazonaws.com/FulfillmentInventory/2010-10-01/">
<ListInventorySupplyResult>
<InventorySupplyList>
<member>
<SellerSKU>043859634910</SellerSKU>
<FNSKU>X000IA4045</FNSKU>
<ASIN>B005YV4DJO</ASIN>
<Condition>NewItem</Condition>
<TotalSupplyQuantity>7</TotalSupplyQuantity>
<InStockSupplyQuantity>7</InStockSupplyQuantity>
<EarliestAvailability>
<TimepointType>Immediately</TimepointType>
</EarliestAvailability>
<SupplyDetail>
</SupplyDetail>
</member>
</InventorySupplyList>
</ListInventorySupplyResult>
<ResponseMetadata>
<RequestId>58c9f4f4-6f60-496a-8d71-8fe99ce301c9</RequestId>
</ResponseMetadata>
</ListInventorySupplyResponse>
C# Code:
string a = Convert.ToString(oInventorySupplyRes.ToXML());
XmlDocument oXdoc = new XmlDocument();
oXdoc.LoadXml(a);
XmlNodeList oInventorySupplyListxml = oXdoc.SelectNodes("//member");
foreach (XmlNode itmXml in oInventorySupplyListxml)
{
// var cond = itmXml.InnerXml.ToString();
var asinVal = itmXml.SelectSingleNode("ASIN").Value;
var TotalSupplyQuantityVal = itmXml.SelectSingleNode("TotalSupplyQuantity").Value;
}
ResultView : "Enumeration yielded no results" and count = 0;
Edit 1:
string a = Convert.ToString(oInventorySupplyRes.ToXML());
var status = oInventorySupplyResult.InventorySupplyList;
XmlDocument oXdoc = new XmlDocument();
var doc = XDocument.Parse(a);
var r = doc.Descendants("member")
.Select(member => new
{
ASIN = member.Element("ASIN").Value,
TotalSupplyQuantity = member.Element("TotalSupplyQuantity").Value
});

private string mStrXMLStk = Application.StartupPath + "\\Path.xml";
private System.Xml.XmlDocument mXDoc = new XmlDocument();
mXDoc.Load(mStrXMLStk);
XmlNode XNode = mXDoc.SelectSingleNode("/ListInventorySupplyResult/InventorySupplyList/member");
if (XNode != null)
{
int IntChildCount = XNode.ChildNodes.Count;
for (int IntI = 1; IntI <= IntChildCount ; IntI++)
{
string LocalName = XNode.ChildNodes[IntI].LocalName;
XmlNode Node = mXDoc.SelectSingleNode("/Response/" + LocalName);
// Store Value in Array assign value by "Node.InnerText"
}
}
Try This Code. Its Worked

try using this xpath
string xPath ="ListInventorySupplyResponse/ListInventorySupplyResult
/InventorySupplyList/member"
XmlNodeList oInventorySupplyListxml = oXdoc.SelectNodes(xpath);
when you do "//member", then, the code is trying to look for element named member from the root level, which is not present at the root level, rather it is nested beneath few layers.

I think this will help you..
string a = Convert.ToString(oInventorySupplyRes.ToXML());
XmlDocument oXdoc = new XmlDocument();
oXdoc.LoadXml(a);
XmlNodeList fromselectors;
XmlNodeList toselectors;
XmlElement root = oXdoc.DocumentElement;
fromselectors = root.SelectNodes("ListInventorySupplyResult/InventorySupplyList/member/ASIN");
toselectors = root.SelectNodes("ListInventorySupplyResult/InventorySupplyList/member/TotalSupplyQuantity");
foreach (XmlNode m in fromselectors)
{
you will have value in `m.InnerXml` use it whereever you want..
}
foreach (XmlNode n in toselectors)
{
you will have value in `n.InnerXml` use it whereever you want..
}

Deleting XML using a selected Xpath and a for loop

I currently have the following code:
XPathNodeIterator theNodes = theNav.Select(theXPath.ToString());
while (theNodes.MoveNext())
{
//some attempts i though were close
//theNodes.RemoveChild(theNodes.Current.OuterXml);
//theNodes.Current.DeleteSelf();
}
I have set xpath to what I want to return in xml and I want to delete everything that is looped. I have tried a few ways of deleting the information but it does't like my syntax. I found an example on Microsoft support: http://support.microsoft.com/kb/317666 but I would like to use this while instead of a for each.
Any comments or questions are appreciated.

Why not to use XDocument?
var xmlText = "<Elements><Element1 /><Element2 /></Elements>";
var document = XDocument.Parse(xmlText);
var element = document.XPathSelectElement("Elements/Element1");
element.Remove();
var result = document.ToString();
result will be <Elements><Element2 /></Elements>.
Or:
var document = XDocument.Load(fileName);
var element = document.XPathSelectElement("Elements/Element1");
element.Remove();
document.Savel(fileName);
[Edit] For .NET 2, you can use XmlDocument:
XmlDocument document = new XmlDocument();
document.Load(fileName);
XmlNode node = document.SelectSingleNode("Elements/Element1");
node.ParentNode.RemoveChild(node);
document.Save(fileName);
[EDIT]
If you need to remove all child elements and attributes:
XmlNode node = document.SelectSingleNode("Elements");
node.RemoveAll();
If you need to keep attributes, but delete elements:
XmlNode node = document.SelectSingleNode("Elements");
foreach (XmlNode childNode in node.ChildNodes)
node.RemoveChild(childNode);

string nodeXPath = "your x path";
XmlDocument document = new XmlDocument();
document.Load(/*your file path*/);
XmlNode node = document.SelectSingleNode(nodeXPath);
node.RemoveAll();
XmlNode parentnode = node.ParentNode;
parentnode.RemoveChild(node);
document.Save("File Path");

You can use XmlDocument:
string nodeXPath = "your x path";
XmlDocument document = new XmlDocument();
document.Load(/*your file path*/);//or document.LoadXml(...
XmlNode node = document.SelectSingleNode(nodeXPath);
if (node.HasChildNodes)
{
//note that you can use node.RemoveAll(); it will remove all child nodes, but it will also remove all node' attributes.
for (int childNodeIndex = 0; childNodeIndex < node.ChildNodes.Count; childNodeIndex++)
{
node.RemoveChild(node.ChildNodes[childNodeIndex]);
}
}
document.Save("your file path"));

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Xpath expression to get paragraphs from rssfeed - c#

Try this: void test() { XmlDocument lkcnws = new XmlDocument(); lkcnws.Load("http://lankacnews.com/sinhala/feed/"); textBox1.Text = lkcnws.OuterXml; XmlNodeList ndlst = lkcnws.SelectNodes("//category['#*']"); // it's null with p foreach (XmlNode nd in ndlst) { textBox2.Text += nd.InnerXml; } }

Related

Scrape data with HtmlAgilityPack for a tag which doesn't have class

How to retrieve attribute from XML file in C#

XML to string (C#)

How to get xml element values using XmlNodeList in c#

Deleting XML using a selected Xpath and a for loop

Categories

Resources