Need advice on removing xml row - c#

I'm using a CMS and found a function that generates an RSS feed from content within folders. However, I would like one of the rows removed from the list. I've done my research and I think I should be using the XmlDocument class to remove the row I don't want. I've used Firebug and FirePath to get the XPath, but I can't figure out how to apply it. I'm also unsure whether I should be using .Load or .LoadXml; I've used the latter, since the feed displays fine. However, I had to call ToString() to get rid of the overloaded-method match error.
The row I want removing is called "Archived Planes"
The XPath I get from FirePath is ".//*[@id='feedContent']/xhtml:div[11]/xhtml:h3/xhtml:a"
I am also assuming that .RemoveChild(node); will remove it out of rssData before I Response.Write. Thanks
Object rssData = new object();
Cms.UI.CommonUI.ApplicationAPI AppAPI = new Cms.UI.CommonUI.ApplicationAPI();
rssData = AppAPI.ecmRssSummary(50, true, "DateCreated", 0, "");
Response.ContentType = "text/xml";
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(rssData.ToString());
XmlNode node = xmlDocument.SelectSingleNode(@"xhtml:div/xhtml:h3/xhtml[a = 'Archived Planes']");
if (node != null)
{
node.ParentNode.RemoveChild(node);
}
Response.Write(rssData);
Edited to include the output below.
This is what the Response.Write of rssData produces:
<?xml version="1.0" ?>
<rss xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0">
<channel>
<title>Plane feed</title>
<link>http://www.domain.rss1.aspx</link>
<description></description>
<item>
<title>New Planes</title>
<link>http://www.domainx1.aspx</link>
<description>
This is the description
</description>
<author>Andrew</author>
<pubDate>Thu, 16 Aug 2012 15:55:53 GMT</pubDate>
</item>
<item>
<title>Archived Planes</title>
<link>http://www.domain23.aspx</link>
<description>
Description of Archived Planes
</description>
<author>Jan</author>
<pubDate>Wed, 15 Aug 2012 10:34:23 GMT</pubDate>
</item>
</channel>
</rss>

I suspect your XPath is incorrect. It looks like it references a DOM element of the rendered page rather than an element of the XML itself. For example, take the following XML:
<?xml version="1.0" encoding="UTF-16" standalone="yes"?>
<NewDataSet>
<userinfo>
<username>pqr2</username>
<pass>abc</pass>
<addr>abc</addr>
</userinfo>
<userinfo>
<username>pqr1</username>
<pass>pqr2</pass>
<addr>pqr3</addr>
</userinfo>
</NewDataSet>
This code will remove the userinfo node whose username element is pqr1:
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.Load(@"file.xml");
XmlNode node = xmlDocument.SelectSingleNode(@"NewDataSet/userinfo[username = 'pqr1']");
if (node != null) {
node.ParentNode.RemoveChild(node);
xmlDocument.Save(@"file.xml");
}

Thought I would post the answer, although I'll mark Paul's as the accepted answer, since his code and advice were the basis of this and my further research. I still don't know what the '@' in SelectSingleNode is or whether I should really have it; I'll do more research.
Object rssData = new object();
Cms.UI.CommonUI.ApplicationAPI AppAPI = new Cms.UI.CommonUI.ApplicationAPI();
rssData = AppAPI.ecmRssSummary(50, true, "DateCreated", 0, "");
Response.ContentType = "text/xml";
Response.ContentEncoding = System.Text.Encoding.UTF8;
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(rssData.ToString());
XmlNode node = xmlDocument.SelectSingleNode("rss/channel/item[title = 'Archived Planes']");
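if (node != null)
{
try
{
// Remove the unwanted item and send the modified document, not the original string.
node.ParentNode.RemoveChild(node);
xmlDocument.Save(Response.Output);
}
catch { } // Note: this silently swallows any error.
}
else
{
// No matching item: send the feed unchanged.
Response.Write(rssData);
}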
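The same idea can be tried outside the CMS. The following is a minimal sketch, assuming the feed is already available as a string; the class name FeedFilter and the method RemoveItem are hypothetical names used only for illustration, and the CMS call is left out. It removes every item whose title matches and returns the modified XML, which is what should be written to the response instead of the original rssData.

```csharp
using System;
using System.Linq;
using System.Xml;

static class FeedFilter
{
    // Returns the feed XML without any <item> whose <title> equals titleToRemove.
    public static string RemoveItem(string rssXml, string titleToRemove)
    {
        var doc = new XmlDocument();
        doc.LoadXml(rssXml);

        // Copy the matches into a list first, so that removing nodes
        // does not disturb the node list being iterated.
        var items = doc.SelectNodes("/rss/channel/item").Cast<XmlNode>().ToList();
        foreach (XmlNode item in items)
        {
            XmlNode title = item.SelectSingleNode("title");
            if (title != null && title.InnerText == titleToRemove)
            {
                item.ParentNode.RemoveChild(item);
            }
        }
        return doc.OuterXml;
    }

    static void Main()
    {
        const string feed =
            "<rss version=\"2.0\"><channel><title>Plane feed</title>" +
            "<item><title>New Planes</title></item>" +
            "<item><title>Archived Planes</title></item>" +
            "</channel></rss>";
        Console.WriteLine(RemoveItem(feed, "Archived Planes"));
    }
}
```

The title is compared in C# rather than spliced into the XPath string, so a title containing a quote character cannot break the expression.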

The @ symbol simply denotes a verbatim string literal, in which backslashes are not treated as escape characters and a double quote is written as two double quotes, e.g.
string e = "Joe said \"Hello\" to me"; // Joe said "Hello" to me
string f = @"Joe said ""Hello"" to me"; // Joe said "Hello" to me
See this msdn link for more info

Related

XDocument get and set values in XML nodes

I have this XML
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<S:Envelope xmlns:S="http://schemas.xmlsoap.org/soap/envelope/">
<S:Header>
<wss:Security xmlns:wss="http://schemas.xmlsoap.org/ws/2002/12/secext">
<wss:UsernameToken>
<wss:Username>username</wss:Username>
<wss:Password>password</wss:Password>
<wss:Nonce></wss:Nonce>
<wss:Created></wss:Created>
</wss:UsernameToken>
</wss:Security>
</S:Header>
<S:Body>
<TaxRegistrationNumber>213213123</TaxRegistrationNumber>
<CompanyName>sadsadasd</CompanyName>
</S:Body>
</S:Envelope>
I want to access the value of <wss:Username> and set a value in the <wss:Nonce> node.
I have already tried three ways to get the value of <wss:Username> in a C# project:
First:
XDocument xmlFile = XDocument.Load(xmlpathfile);
XmlNamespaceManager ns = new XmlNamespaceManager(new NameTable());
ns.AddNamespace("wss", "http://schemas.xmlsoap.org/ws/2002/12/secext/");
XElement UserFinanc = xmlFile.XPathSelectElement("wss:Security/wss:UsernameToken/wss:Username", ns);
Second:
XDocument xmlFile = XDocument.Load(xmlpathfile);
XmlNamespaceManager ns = new XmlNamespaceManager(new NameTable());
var element = xmlFile.Descendants(wss + "Security").Descendants(wss + "UsernameToken").Where(x => x.Descendants(wss + "Username").Any(y => y.Value != "")).First().Element(wss + "UsernameToken");
if (element != null)
MessageBox.Show(element.Element(wss + "Username").Value);
Third:
string grandChild = (string) (from el in xmlFile.Descendants(wss + "Username") select el).First();
MsgBox.Show(grandChild);
I always get similar errors, such as 'The sequence contains no elements'.

Your first attempt is almost right. There are a couple of things missing:
The namespace defined in code must exactly match the one in the XML. In your case, the namespace in code has an extra trailing slash. It should be http://schemas.xmlsoap.org/ws/2002/12/secext.
The XPath expression should be //wss:Security/wss:UsernameToken/wss:Username. The leading slashes mean "look for this node anywhere in the document". Alternatively, you could write out the whole path beginning with <S:Envelope>; you would then need to add the SOAP envelope namespace to your code as well.
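Putting both corrections together might look like the sketch below. It is only an illustration under a few assumptions: the class SoapHeaderEditor and the method SetNonce are hypothetical names, the nonce value is a placeholder, and the XML is parsed from a string rather than loaded from xmlpathfile.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

static class SoapHeaderEditor
{
    // Must match the namespace in the XML exactly (no trailing slash).
    const string WssUri = "http://schemas.xmlsoap.org/ws/2002/12/secext";

    // Sets <wss:Nonce> and returns the current <wss:Username> value (null if absent).
    public static string SetNonce(XDocument doc, string nonce)
    {
        var ns = new XmlNamespaceManager(new NameTable());
        ns.AddNamespace("wss", WssUri);

        // The leading // searches the whole document for the element.
        XElement token = doc.XPathSelectElement("//wss:Security/wss:UsernameToken", ns);
        if (token == null)
        {
            throw new InvalidOperationException("wss:UsernameToken was not found.");
        }

        XNamespace wss = WssUri;
        token.SetElementValue(wss + "Nonce", nonce);   // creates the element if it is missing
        return (string)token.Element(wss + "Username");
    }

    static void Main()
    {
        XDocument doc = XDocument.Parse(
            @"<S:Envelope xmlns:S=""http://schemas.xmlsoap.org/soap/envelope/"">
                <S:Header>
                  <wss:Security xmlns:wss=""http://schemas.xmlsoap.org/ws/2002/12/secext"">
                    <wss:UsernameToken>
                      <wss:Username>username</wss:Username>
                      <wss:Nonce></wss:Nonce>
                    </wss:UsernameToken>
                  </wss:Security>
                </S:Header>
              </S:Envelope>");
        Console.WriteLine(SetNonce(doc, "placeholder-nonce"));
        Console.WriteLine(doc);
    }
}
```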

How to get enclosure url with XElement C# Console

I read multiple feeds from many sources in a C# console application, and I have this code that loads the XML from each source:
XmlDocument doc = new XmlDocument();
doc.Load(sourceURLX);
XElement xdoc = XElement.Load(sourceURLX);
How do I get the enclosure URL and store it in a variable?

If I understand your question correctly (I'm making a big assumption here), you want to select an attribute named 'url' from the root (or 'enclosing') tag?
You can make use of XPath queries here. Consider the following XML:
<?xml version="1.0" encoding="utf-8"?>
<root url='google.com'>
<inner />
</root>
You could use the following code to retrieve 'google.com':
String query = "/root[1]/@url";
XmlDocument doc = new XmlDocument();
doc.Load(sourceURLX);
String value = doc.SelectSingleNode(query).InnerText;
Further information about XPath syntax can be found here.
Edit: As you stated in your comment, you are working with the following XML:
<item>
<description>
</description>
<enclosure url="blablabla.com/img.jpg" />
</item>
Therefore, you can retrieve the url using the following XPath query:
/item[1]/enclosure[1]/@url

With XML like the following:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>title</title>
<link>https://www.link.com</link>
<description>description</description>
<item>
<title>RSS</title>
<link>https://www.link.com/xml/xml_rss.asp</link>
<description>description</description>
<enclosure url="https://www.link.com/media/test.wmv"
length="10000"
type="video/wmv"/>
</item>
</channel>
</rss>
You can get the URL by reading the attribute:
var document = XDocument.Load(sourceURLX);
var url = document.Root
.Element("channel")
.Element("item")
.Element("enclosure")
.Attribute("url")
.Value;
To get multiple URLs:
var urls = document.Descendants("item")
.Select(item => item.Element("enclosure").Attribute("url").Value)
.ToList();
Using a foreach loop:
foreach (var item in document.Descendants("item"))
{
var title = item.Element("title").Value;
var link = item.Element("link").Value;
var description = item.Element("description").Value;
var url = item.Element("enclosure").Attribute("url").Value;
// save values to database
}
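One caveat with the snippets above: Element("enclosure") returns null for an item that has no enclosure, so the chained .Attribute("url").Value would throw a NullReferenceException on such feeds. A possible null-tolerant variant is sketched below; the class EnclosureReader and the method GetEnclosureUrls are hypothetical names, and the sketch assumes C# 6 or later for the ?. operator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class EnclosureReader
{
    // Collects the url attribute of every <enclosure>, skipping items without one.
    public static List<string> GetEnclosureUrls(XDocument document)
    {
        return document.Descendants("item")
            // The explicit (string) conversion yields null for a missing attribute
            // instead of throwing, unlike reading .Value on a null reference.
            .Select(item => (string)item.Element("enclosure")?.Attribute("url"))
            .Where(url => !string.IsNullOrEmpty(url))
            .ToList();
    }

    static void Main()
    {
        XDocument document = XDocument.Parse(
            "<rss version=\"2.0\"><channel>" +
            "<item><title>With media</title><enclosure url=\"https://www.link.com/media/test.wmv\" /></item>" +
            "<item><title>Text only</title></item>" +
            "</channel></rss>");
        foreach (string url in GetEnclosureUrls(document))
        {
            Console.WriteLine(url);
        }
    }
}
```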

Include XML CDATA in an element

UPDATE: Added more detail per request
I am trying to create an xml configuration file for my application. The file contains a list of criteria to search and replace in an html document. The problem is, I need to search for character strings like &nbsp. I do not want my code to read the decoded item, but the text itself.
Admitting to being very new to XML, I did make some attempts at meeting the requirements. I read a load of links here on Stackoverflow regarding CDATA and ATTRIBUTES and so on, but the examples here (and elsewhere) seem to focus on creating one single line in an xml file, not multiple.
Here is one of many attempts I have made to no avail:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE item [
<!ELEMENT item (id, replacewith)>
<!ELEMENT id (#CDATA)>
<!ELEMENT replacewith (#CDATA)>
]>
]>
<item id="&nbsp;" replacewith=" ">Non breaking space</item>
<item id="&#8209;" replacewith="-">Non breaking hyphen</item>
This document gives me a number of errors, including:
In the DOCTYPE, I get errors like <!ELEMENT id (#CDATA)>. In the CDATA area, Visual Studio informs me it is expecting a ',' or '|'.
]> gives me an error of invalid token at the root of the document.
And of course, after the second <item entry, I get an error stating XML document cannot contain multiple root level elements.
How can I write an xml file that includes multiple items and allows me to store and retrieve the text within the element, rather than the interpreted characters?
If it helps any, I am using .Net, C#, and Visual Studio.
EDIT:
The purpose of this xml file is to provide my code with a list of things to search and replace in an html file. The xml file simply contains a list of what to search for and what to replace with.
Here is the file I have in place right now:
<?xml version="1.0" encoding="utf-8" ?>
<Items>
<item id="&#8209;" replacewith="-">Non breaking hyphen</item>
<item id="&nbsp;" replacewith=" ">Non breaking hyphen</item>
</Items>
Using the first as an example, I want to read the text &#8209; but instead, when I read this, I get the hyphen character itself, because that is what the code represents.
Any help or pointers you can give would be helpful.

To elaborate on my comment: XML behaves like HTML where reserved characters are concerned. An ampersand introduces an entity or character reference, which any parser (browser, XML reader, etc.) translates into the corresponding character when it reads the document.
The easiest way to make sure the values are read back as the literal text you want is to escape them, as if you were encoding them for the web. For example, to create your XML document, I did this:
XmlDocument xmlDoc = new XmlDocument();
XmlElement xmlItem;
XmlAttribute xmlAttr;
XmlText xmlText;
// Declaration
XmlDeclaration xmlDec = xmlDoc.CreateXmlDeclaration("1.0", "UTF-8", null);
XmlElement xmlRoot = xmlDoc.DocumentElement;
xmlDoc.InsertBefore(xmlDec, xmlRoot);
// Items
XmlElement xmlItems = xmlDoc.CreateElement(string.Empty, "Items", string.Empty);
xmlDoc.AppendChild(xmlItems);
// Item #1
xmlItem = xmlDoc.CreateElement(string.Empty, "item", string.Empty);
xmlAttr = xmlDoc.CreateAttribute(string.Empty, "id", string.Empty);
xmlAttr.Value = "&#8209;";
xmlItem.Attributes.Append(xmlAttr);
xmlAttr = xmlDoc.CreateAttribute(string.Empty, "replacewith", string.Empty);
xmlAttr.Value = "-";
xmlItem.Attributes.Append(xmlAttr);
xmlText = xmlDoc.CreateTextNode("Non breaking hyphen");
xmlItem.AppendChild(xmlText);
xmlItems.AppendChild(xmlItem);
// Item #2
xmlItem = xmlDoc.CreateElement(string.Empty, "item", string.Empty);
xmlAttr = xmlDoc.CreateAttribute(string.Empty, "id", string.Empty);
xmlAttr.Value = "&nbsp;";
xmlItem.Attributes.Append(xmlAttr);
xmlAttr = xmlDoc.CreateAttribute(string.Empty, "replacewith", string.Empty);
xmlAttr.Value = " ";
xmlItem.Attributes.Append(xmlAttr);
xmlText = xmlDoc.CreateTextNode("Non breaking hyphen");
xmlItem.AppendChild(xmlText);
xmlItems.AppendChild(xmlItem);
// For formatting
StringBuilder xmlBuilder = new StringBuilder();
XmlWriterSettings xmlSettings = new XmlWriterSettings
{
Indent = true,
IndentChars = " ",
NewLineChars = "\r\n",
NewLineHandling = NewLineHandling.Replace
};
using (XmlWriter writer = XmlWriter.Create(xmlBuilder, xmlSettings))
{
xmlDoc.Save(writer);
}
xmlOutput.Text = xmlBuilder.ToString();
Notice that I put in your id values with what you are expecting. Now, look at how it gets encoded:
<?xml version="1.0" encoding="utf-16"?>
<Items>
<item id="&amp;#8209;" replacewith="-">Non breaking hyphen</item>
<item id="&amp;nbsp;" replacewith=" ">Non breaking hyphen</item>
</Items>
The only difference between yours and this one is that the ampersand was encoded as &amp; while the rest remained a string literal. This is normal behavior for XML. When you read it back in, it will come back as the literal &#8209; and &nbsp;.
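The round trip can be checked with a short sketch. This is only an illustration: the class EntityRoundTrip and its two methods are hypothetical names, and the document is reduced to a single item element. The serializer escapes the ampersand when writing, and the parser undoes that escaping when reading, so the attribute comes back as the original literal text.

```csharp
using System;
using System.Xml;

static class EntityRoundTrip
{
    // Builds <item id="..."/> and returns the serialized XML.
    public static string Serialize(string id)
    {
        var doc = new XmlDocument();
        XmlElement item = doc.CreateElement("item");
        item.SetAttribute("id", id);   // the writer escapes '&' as &amp; on output
        doc.AppendChild(item);
        return doc.OuterXml;
    }

    // Parses the XML and returns the id attribute as the parser decodes it.
    public static string ReadId(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.GetAttribute("id");
    }

    static void Main()
    {
        string xml = Serialize("&#8209;");
        Console.WriteLine(xml);           // the ampersand appears escaped in the markup
        Console.WriteLine(ReadId(xml));   // the literal seven-character text comes back
    }
}
```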

C# Edit XML, I am totally lost

I have this code to load XML files, and I am not sure whether it is complete. This is my code:
public void updateXML(string xmlFile, string chooseNode, string chooseSingleNode, string newNode, string selectedCategory)
{
XmlDocument xml = new XmlDocument();
xml.Load(xmlFile);
foreach (XmlElement element in xml.SelectNodes(chooseNode))
{
foreach (XmlElement element1 in element)
{
if (element.SelectSingleNode(chooseNode).InnerText == selectedCategory)
{
XmlNode newvalue = xml.CreateElement(newNode);
newvalue.InnerText = "MODIFIED";
element.ReplaceChild(newvalue, element1);
xml.Save(xmlFile);
}
}
}
}
Below is the method that I call in the end, where I set xmlFile and so on. (The updateXML method is in "data.cs" and is called from the repository.)
public void editCategory(string newNode)
{
string xmlFile = "Category.xml";
string chooseNodes = "ArrayOfCategory/Category";
string chooseSingleNode = "//Name";
string selectedCategory = "News";
repository.Update(xmlFile, chooseNodes, newNode, chooseSingleNode, selectedCategory);
}
I am unsure what to put in the different node parameters; I found the code above here on Stack Overflow.
Below is the XML file that I want to edit:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfCategory xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Category>
<Id>6b30511d-2cd1-4325-ad73-7b905f76ffc0</Id>
<Name>News</Name>
</Category>
<Category>
<Id>516401f4-b45c-46ef-b8f4-9d05021ae794</Id>
<Name>Pods</Name>
</Category>
<Category>
<Id>0c9cd216-86cf-4a62-884c-1b428150ebac</Id>
<Name>Pods</Name>
</Category>
</ArrayOfCategory>
I would really appreciate your help.

Look at this line:
if (element.SelectSingleNode(chooseNode).InnerText == selectedCategory)
with these values:
chooseNode = "ArrayOfCategory/Category";
selectedCategory = "News";
The InnerText of the chooseNode match will never be "News", because "News" is under <Name>. In fact, evaluated relative to a <Category> element, that path matches nothing, so SelectSingleNode returns null.
There's something wrong in your second foreach: did you forget to put element.SelectNodes or something?
Next thing: you can modify an XmlElement directly; there is no need to create a new one. You create (and add) one only if it's not there.
I strongly recommend you have a look at the MSDN documentation of XmlDocument, more specifically CreateElement and the simple example following the SelectNodes presentation.
Moreover, you may want to consider putting an @ in front of your strings:
see What's the @ in front of a string in C#?
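A sketch of the "modify the element directly" approach follows. It assumes the goal is to change the text of the <Name> element of the matching category; the class CategoryEditor and the method RenameCategory are hypothetical names, and the XML is loaded from a string instead of Category.xml.

```csharp
using System;
using System.Linq;
using System.Xml;

static class CategoryEditor
{
    // Sets the text of every <Name> equal to oldName to newName; returns how many changed.
    public static int RenameCategory(XmlDocument xml, string oldName, string newName)
    {
        // Snapshot the node list before editing the document.
        var names = xml.SelectNodes("/ArrayOfCategory/Category/Name").Cast<XmlNode>().ToList();
        int changed = 0;
        foreach (XmlNode name in names)
        {
            if (name.InnerText == oldName)
            {
                name.InnerText = newName;   // edit in place, no CreateElement needed
                changed++;
            }
        }
        return changed;
    }

    static void Main()
    {
        var xml = new XmlDocument();
        xml.LoadXml(
            "<ArrayOfCategory>" +
            "<Category><Id>1</Id><Name>News</Name></Category>" +
            "<Category><Id>2</Id><Name>Pods</Name></Category>" +
            "</ArrayOfCategory>");
        Console.WriteLine(RenameCategory(xml, "News", "MODIFIED"));
        Console.WriteLine(xml.OuterXml);
    }
}
```

After the call, xml.Save(xmlFile) would persist the change once, outside the loop, rather than saving on every iteration.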

How to correctly parse an XML document with arbitrary namespaces

I am trying to parse somewhat standard XML documents that use a schema called MARCXML from various sources.
Here are the first few lines of an example XML file that needs to be handled...
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
<marc:record>
<marc:leader>00925njm 22002777a 4500</marc:leader>
and one without namespace prefixes...
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
<leader>01142cam 2200301 a 4500</leader>
Key point: in order to get the XPaths to resolve further along in the program I have to go through a regex routine to add the namespaces to the NameTable (which doesn't add them by default). This seems unnecessary to me.
Regex xmlNamespace = new Regex("xmlns:(?<PREFIX>[^=]+)=\"(?<URI>[^\"]+)\"", RegexOptions.Compiled);
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xmlRecord);
XmlNamespaceManager nsMgr = new XmlNamespaceManager(xmlDoc.NameTable);
MatchCollection namespaces = xmlNamespace.Matches(xmlRecord);
foreach (Match n in namespaces)
{
nsMgr.AddNamespace(n.Groups["PREFIX"].ToString(), n.Groups["URI"].ToString());
}
The XPath call looks something like this...
XmlNode leaderNode = xmlDoc.SelectSingleNode(".//" + LeaderNode, nsMgr);
Where LeaderNode is a configurable value and would equal "marc:leader" in the first example and "leader" in the second example.
Is there a better, more efficient way to do this? Note: suggestions for solving this using LINQ are welcome, but I would mainly like to know how to solve this using XmlDocument.
EDIT: I took GrayWizardx's advice and now have the following code...
if (LeaderNode.Contains(":"))
{
string prefix = LeaderNode.Substring(0, LeaderNode.IndexOf(':'));
XmlNode root = xmlDoc.DocumentElement; // FirstChild can be the XML declaration rather than the root element
string nameSpace = root.GetNamespaceOfPrefix(prefix);
nsMgr.AddNamespace(prefix, nameSpace);
}
Now there's no more dependency on Regex!

If you know that a given element will be in the document (for instance, the root element), you could try calling GetNamespaceOfPrefix on it.
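A related option, which is not what the answer proposes but a variation on it, is to ignore whatever prefix the source document uses and bind a prefix of your own to the root element's namespace. XPath matches on namespace URI, not on prefix, so one expression then covers both sample documents. The sketch below assumes the root element is in the MARC namespace; the class MarcReader, the method ReadLeader, and the prefix "m" are hypothetical choices.

```csharp
using System;
using System.Xml;

static class MarcReader
{
    // Returns the text of the first leader element, or null if none is found.
    public static string ReadLeader(string xmlRecord)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xmlRecord);

        // Bind our own prefix to the namespace the root element is actually in.
        var nsMgr = new XmlNamespaceManager(doc.NameTable);
        nsMgr.AddNamespace("m", doc.DocumentElement.NamespaceURI);

        XmlNode leader = doc.SelectSingleNode("//m:leader", nsMgr);
        return leader == null ? null : leader.InnerText;
    }

    static void Main()
    {
        const string prefixed =
            "<marc:collection xmlns:marc=\"http://www.loc.gov/MARC21/slim\">" +
            "<marc:record><marc:leader>00925njm 22002777a 4500</marc:leader></marc:record>" +
            "</marc:collection>";
        const string unprefixed =
            "<collection xmlns=\"http://www.loc.gov/MARC21/slim\">" +
            "<record><leader>01142cam 2200301 a 4500</leader></record>" +
            "</collection>";
        Console.WriteLine(ReadLeader(prefixed));
        Console.WriteLine(ReadLeader(unprefixed));
    }
}
```

With this approach the configurable LeaderNode value would hold only the local name, and documents whose root has no namespace at all would need separate handling.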
