How to get enclosure url with XElement C# Console

How to get enclosure url with XElement C# Console - c#

I read multiple feed from many sources with C# Console, and i have this code where i load XML From sources:
XmlDocument doc = new XmlDocument();
doc.Load(sourceURLX);
XElement xdoc = XElement.Load(sourceURLX);
How to get enclosure url and show as variable?

If I understand your question correctly (I'm making a big assumption here) - you want to select an attribute from the root (or 'enclosing') tag, named 'url'?
You can make use of XPath queries here. Consider the following XML:
<?xml version="1.0" encoding="utf-8"?>
<root url='google.com'>
<inner />
</root>
You could use the following code to retrieve 'google.com':
String query = "/root[1]/#url";
XmlDocument doc = new XmlDocument();
doc.Load(sourceURLX);
String value = doc.SelectSingleNode(query).InnerText;
Further information about XPath syntax can be found here.
Edit: As you stated in your comment, you are working with the following XML:
<item>
<description>
</description>
<enclosure url="blablabla.com/img.jpg" />
</item>
Therefore, you can retrieve the url using the following XPath query:
/item[1]/enclosure[1]/#url

With xml like below
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>title</title>
<link>https://www.link.com</link>
<description>description</description>
<item>
<title>RSS</title>
<link>https://www.link.com/xml/xml_rss.asp</link>
<description>description</description>
<enclosure url="https://www.link.com/media/test.wmv"
length="10000"
type="video/wmv"/>
</item>
</channel>
</rss>
You will get url by reading attribute
var document = XDocument.Load(sourceURLX);
var url = document.Root
.Element("channel")
.Element("item")
.Element("enclosure")
.Attribute("url")
.Value;
To get multiple urls
var urls = document.Descendants("item")
.Select(item => item.Element("enclosure").Attribute("url").Value)
.ToList();
Using foreach loop
foreach (var item in document.Descendants("item"))
{
var title = item.Element("title").Value;
var link = item.Element("link").Value;
var description = item.Element("description").Value;
var url = item.Element("enclosure").Attribute("url").Value;
// save values to database
}

Related

Access Child nodes with namespace using xpath

How can I read the content of the childnotes using Xpath?
I have already tried this:
var xml = new XmlDocument();
xml.Load("server-status.xml");
var ns = new XmlNamespaceManager(xml.NameTable);
ns.AddNamespace("ns", "namespace");
var node = xml.SelectSingleNode("descendant::ns:server[ns:ip-address]", ns)
Console.WriteLine(node.InnerXml)
But I only get a string like this:
<ip-address>127.0.0.50</ip-address><name>Server 1</name><group>DATA</group>
How can I get the values individually?
Xml file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<server-status xmlns="namespace">
<server>
<ip-address>127.0.0.50</ip-address>
<name>Server 1</name>
<group>DATA</group>
</server>
</server-status>

You're using XML namespaces in XPath correctly.
However, your original XPath,
descendant::ns:server[ns:ip-address]
says to select all ns:server elements with ns:ip-address children.
If you wish to select the ns:ip-address children themselves, instead use
descendant::ns:server/ns:ip-address
Similarly, you could select ns:name or ns:group elements.

Unable to parse xml string using Xdocument and Linq

I would like to parse the below xml using XDocument in Linq.
<?xml version="1.0" encoding="UTF-8"?>
<string xmlns="http://tempuri.org/">
<Sources>
<Item>
<Id>1</Id>
<Name>John</Name>
</Item>
<Item>
<Id>2</Id>
<Name>Max</Name>
</Item>
<Item>
<Id>3</Id>
<Name>Ricky</Name>
</Item>
</Sources>
</string>
My parsing code is :
var xDoc = XDocument.Parse(xmlString);
var xElements = xDoc.Element("Sources")?.Elements("Item");
if (xElements != null)
foreach (var source in xElements)
{
Console.Write(source);
}
xElements is always null. I tried using namespace as well, it did not work. How can I resolve this issue?

Try below code:
string stringXml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><string xmlns=\"http://tempuri.org/\"><Sources><Item><Id>1</Id><Name>John</Name></Item><Item><Id>2</Id><Name>Max</Name></Item><Item><Id>3</Id><Name>Ricky</Name></Item></Sources></string>";
XDocument xDoc = XDocument.Parse(stringXml);
var items = xDoc.Descendants("{http://tempuri.org/}Sources")?.Descendants("{http://tempuri.org/}Item").ToList();
I tested it and it correctly shows that items has 3 lements :) Maybe you used namespaces differently (it's enough to inspect xDoc objct in object browser and see its namespace).

You need to concatenate the namespace and can directly use Descendants method to fetch all Item nodes like:
XNamespace ns ="http://tempuri.org/";
var xDoc = XDocument.Parse(xmlString);
var xElements = xDoc.Descendants(ns + "Item");
foreach (var source in xElements)
{
Console.Write(source);
}
This prints on Console:
<Item xmlns="http://tempuri.org/">
<Id>1</Id>
<Name>John</Name>
</Item><Item xmlns="http://tempuri.org/">
<Id>2</Id>
<Name>Max</Name>
</Item><Item xmlns="http://tempuri.org/">
<Id>3</Id>
<Name>Ricky</Name>
</Item>
See the working DEMO Fiddle

Need advice on removing xml row

I'm using a CMS and found a function to generate a rss feed from content within folders. However I would like one of the rows removing from the list. I've done my research and I 'think' I should be using XmlDocument class to help me remove the row I don't want. I've used Firebug and FirePath to get the XPath - but I cant seem to figure out how to apply it appropriately. I am also uncertain of whether I should be using .Load or .LoadXml - I've used the latter seing as though the feed displays fine. However I have had to convert ToString() to get rid of that overloaded match error....
The row I want removing is called "Archived Planes"
The XPath I get for FirePath is ".//*[#id='feedContent']/xhtml:div[11]/xhtml:h3/xhtml:a"
I am also assuming that .RemoveChild(node); will remove it out of rssData before I Response.Write. Thanks
Object rssData = new object();
Cms.UI.CommonUI.ApplicationAPI AppAPI = new Cms.UI.CommonUI.ApplicationAPI();
rssData = AppAPI.ecmRssSummary(50, true, "DateCreated", 0, "");
Response.ContentType = "text/xml";
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(rssData.ToString());
XmlNode node = xmlDocument.SelectSingleNode(#"xhtml:div/xhtml:h3/xhtml[a = 'Archived Planes']");
if (node != null)
{
node.ParentNode.RemoveChild(node);
}
Response.Write(rssData);
Edited to include output below
This is the what the response.write from rssData is pumping out:
<?xml version="1.0" ?>
<rss xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0">
<channel>
<title>Plane feed</title>
<link>http://www.domain.rss1.aspx</link>
<description></description>
<item>
<title>New Planes</title>
<link>http://www.domainx1.aspx</link>
<description>
This is the description
</description>
<author>Andrew</author>
<pubDate>Thu, 16 Aug 2012 15:55:53 GMT</pubDate>
</item>
<item>
<title>Archived Planes</title>
<link>http://www.domain23.aspx</link>
<description>
Description of Archived Planes
</description>
<author>Jan</author>
<pubDate>Wed, 15 Aug 2012 10:34:23 GMT</pubDate>
</item>
</channel>
</rss>

I suspect your xpath is incorrect, it looks like some funky dom element that you are referencing and not the xml element... e.g. for the following xml
<?xml version="1.0" encoding="UTF-16" standalone="yes"?>
<NewDataSet>
<userinfo>
<username>pqr2</username>
<pass>abc</pass>
<addr>abc</addr>
</userinfo>
<userinfo>
<username>pqr1</username>
<pass>pqr2</pass>
<addr>pqr3</addr>
</userinfo>
</NewDataSet>
This code will remove the userinfo node with an username element of pqr1
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.Load(#"file.xml");
XmlNode node = xmlDocument.SelectSingleNode(#"NewDataSet/userinfo[username = 'pqr1']");
if (node != null) {
node.ParentNode.RemoveChild(node);
xmlDocument.Save(#"file.xml");
}

Thought I would post the answer, although I'll mark Pauls as the answer, as his code/advice was the basis of this and my further research. Still don't know what the '#' in SelectSingleNode is and whether I should really have it - will do more research.
Object rssData = new object();
Cms.UI.CommonUI.ApplicationAPI AppAPI = new Cms.UI.CommonUI.ApplicationAPI();
rssData = AppAPI.ecmRssSummary(50, true, "DateCreated", 0, "");
Response.ContentType = "text/xml";
Response.ContentEncoding = System.Text.Encoding.UTF8;
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(rssData.ToString());
XmlNode node = xmlDocument.SelectSingleNode("rss/channel/item[title = 'Archived Planes']");
if (node != null)
try
{
node.ParentNode.RemoveChild(node);
xmlDocument.Save(Response.Output);
}
catch { }
else { Response.Write(rssData); }
}

The # symbol is simply to denote a verbatim string literal (allows you to have funky characters in the string compared to the normal string declaration) e.g.
string e = "Joe said \"Hello\" to me"; // Joe said "Hello" to me
string f = #"Joe said ""Hello"" to me"; // Joe said "Hello" to me
See this msdn link for more info

xml XPathSelectElements => to string type

I have the following XML structure:
<?xml version="1.0" encoding="utf-8"?>
<xml>
<root>
<Item>
<taxids>
<string>330</string>
<string>374</string>
<string>723</string>
<string>1087</string>
<string>1118</string>
<string>1121</string>
</taxids>
</Item>
</root>
</xml>
I need to get all the string nodes from the xml file to a string variable, like this:
var query = from ip in doc.XPathSelectElements("xml/root/Item")
where ip.XPathSelectElement("taxid").Value == "723"
select ip.XPathSelectElements("taxids").ToString();
But I am getting the following in one row of the variable query:
"System.Xml.XPath.XPathEvaluator+<EvaluateIterator>d__0`1[System.Xml.Linq.XElement]"
I want to get a string like this:
<taxids><string>330</string><string>374</string><string>723</string><string>1087</string><string>1118</string><string>1121</string></taxids>
Is this possible?
Thanks!

I would suggest you something like:
var values = from ids in doc.XPathSelectElements("/xml/root/Item/taxids")
from id in ids.XPathSelectElements("string")
where id.Value.Contains("723")
select ids.ToString();
var result = string.Join("", values);
The value variable will contain all the taxids, which have at least one string child with value 723.
Another variant, which doesn't use XPath for the children checking:
var values = from ids in doc.XPathSelectElements("/xml/root/Item/taxids")
from id in ids.Elements("string")
where id.Value.Contains("723")
select ids.ToString();
var result = string.Join("\n", values);

var doc = XDocument.Parse(#"<?xml version=""1.0"" encoding=""utf-8""?>
<xml>
<root>
<Item>
<taxids>
<string>330</string>
<string>374</string>
<string>723</string>
<string>1087</string>
<string>1118</string>
<string>1121</string>
</taxids>
</Item>
</root>
</xml>");
var query = doc.XPathSelectElement("xml/root/Item/taxids");
Console.WriteLine(query);

How to extract a part of xml code from an xml file using c# code

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<eRecon xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:noNamespaceSchemaLocation="eRecon.xsd">
<Header>
<Company Code="" />
<CommonCarrierCode />
<InputFileName InputIDPk="">F:\ReconNew\TmesysRec20100111.rec</InputFileName>
<BatchNumber>000152</BatchNumber>
<InputStartDateTime>2010-02-26 11:47:00</InputStartDateTime>
<InputFinishDateTime>2010-02-26 11:47:05</InputFinishDateTime>
<RecordCount>8</RecordCount>
</Header>
<Detail>
<CarrierStatusDate>2010-01-11</CarrierStatusDate>
<ClaimNum>YDF02892 C</ClaimNum>
<InvoiceNum>0108013775</InvoiceNum>
<LineItemNum>001</LineItemNum>
<NABP>10600211</NABP>
<RxNumber>4695045</RxNumber>
<RxDate>2008-07-21</RxDate>
<CheckNum />
<PaymentStatus>PENDING</PaymentStatus>
<RejectDescription />
<InvoiceChargeAmount>152.15</InvoiceChargeAmount>
<InvoicePaidAmount>131.00</InvoicePaidAmount>
</Detail>
</eRecon>
How can I extract the portion
<Header>
<Company Code="" />
<CommonCarrierCode />
<InputFileName InputIDPk="">F:\ReconNew\TmesysRec20100111.rec</InputFileName>
<BatchNumber>000152</BatchNumber>
<InputStartDateTime>2010-02-26 11:47:00</InputStartDateTime>
<InputFinishDateTime>2010-02-26 11:47:05</InputFinishDateTime>
<RecordCount>8</RecordCount>
</Header>
from the above xml file.
I need the c# code to extract a part of xml tag from an xml file.

If the file isn't too big (smaller than a few MB), you can load it into an XmlDocument:
XmlDocument doc = new XmlDocument();
doc.Load(#"C:\yourfile.xml");
and then you can parse for the <Header> element using an XPath expression:
XmlNode headerNode = doc.SelectSingleNode("/eRecon/Header");
if(headerNode != null)
{
string headerNodeXml = headerNode.OuterXml;
}

You can use XPath like in this tutorial:
http://www.codeproject.com/KB/cpp/myXPath.aspx

Use Linq-to-xml:
XDocument xmlDoc = XDocument.Load(#"c:\sample.xml");
var header = xmlDoc.Descendants("Header").FirstOrDefault();

Linq version
string fileName=#"d:\xml.xml";
var descendants = from i in XDocument.Load(fileName).Descendants("Header")
select i;

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

How to get enclosure url with XElement C# Console - c#

I read multiple feed from many sources with C# Console, and i have this code where i load XML From sources: XmlDocument doc = new XmlDocument(); doc.Load(sourceURLX); XElement xdoc = XElement.Load(sourceURLX); How to get enclosure url and show as variable?

Related

Access Child nodes with namespace using xpath

Unable to parse xml string using Xdocument and Linq

Need advice on removing xml row

xml XPathSelectElements => to string type

How to extract a part of xml code from an xml file using c# code

Categories

Resources