I am trying to parse an xml. All nodes have opening and closing tags except one node that in some lines in only has this tag: <persons/>
In most of the time it appears like this: <persons> ... </persons>
I cannot get values from the xml when this node is not closing like this
Here is my code:
foreach (HtmlNode man in bm.SelectNodes(".//persons"))
{
//store values
}
How can I overcome this issue? Even if some nodes are like this at the start:
<persons> </persons>
if there is a tag like this in the middle of the file
<persons/>
I cannot get the remaining <persons> </persons> values from the remaining lines
why are you using htmlnode? xmlnode would be just fine.
Or else, show more codes.
Did you step through the line? Did you encounter any error?
try this:
internal string ParseXML()
{
string ppl = "";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlString);
foreach (XmlElement node in doc.SelectNodes(".//person"))
{
string text = node.InnerText; //or loop through its children as well
ppl += text;
}
return ppl;
}
Related
I have looked everywhere, and cannot find anything to help me.
I am writing a program that connects to a webservice and then the webservice sends an XML response. After the response is received I have to retrieve certain values from it, but this is where it gets tricky
Here is a snippet of the returned XML:
<?xml version="1.0"?>
<MobilePortalSellingCategoriesHierarchy xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Response xmlns="http://blabla.com/service/">Successful</Response>
<ResponseNumber xmlns="http://blabla.com/service/">0</ResponseNumber>
<SellingCategoriesHierarchy xmlns="http://tym2sell.com/PortalService/">
<Response>Successful</Response>
<ResponseNumber>0</ResponseNumber>
<SellingCategories>
<PortalSellingCategory>
<SellingCategoryId xsi:nil="true" />
<SellingCategoryName>category1</SellingCategoryName>
<DeliveryMethod />
<DeliveryMethodNumber>0</DeliveryMethodNumber>
<SellingCategories>
<PortalSellingCategory>
<SellingCategoryId xsi:nil="true" />
<SellingCategoryName>category1_Item</SellingCategoryName>
<DeliveryMethod />
<DeliveryMethodNumber>0</DeliveryMethodNumber>
<SellingCategories>
<PortalSellingCategory>
<SellingCategoryId>2</SellingCategoryId>
<SellingCategoryName>Item2</SellingCategoryName>
<DeliveryMethod>Display</DeliveryMethod>
<DeliveryMethodNumber>1</DeliveryMethodNumber>
<VoucherValue>0.00</VoucherValue>
<IsVariablePrice>true</IsVariablePrice>
<MinimumVoucherValue>1.00</MinimumVoucherValue>
<MaximumVoucherValue>1000.00</MaximumVoucherValue>
<VoucherValueIncrement>1.00</VoucherValueIncrement>
<AdditionalInputItems>
<PortalAdditionalInputItem>
<InputItemId>-1</InputItemId>
<Label>Value:</Label>
<IsNumericOnly>true</IsNumericOnly>
<MaximumLength>7</MaximumLength>
<Hidden>false</Hidden>
</PortalAdditionalInputItem>
<PortalAdditionalInputItem>
<InputItemId>4</InputItemId>
<Label>Mobile Number</Label>
<IsNumericOnly>true</IsNumericOnly>
<MaximumLength>15</MaximumLength>
<Hidden>false</Hidden>
</PortalAdditionalInputItem>
</AdditionalInputItems>
<TwoStep>false</TwoStep>
<SelectedIcon>SamplePicture</SelectedIcon>
<UnSelectedIcon>SamplePicture</UnSelectedIcon>
This repeats from the SellingCategories node just under Response for a couple of times.
Here is a Snippet of my code where I get the XML as string.
XmlDocument xml = new XmlDocument();
xml.LoadXml(receivedData);
XmlNodeList xnList = xml.SelectNodes("/MobilePortalSellingCategoriesHierarchy");
foreach (XmlNode xn in xnList)
{
string sellingCategoryName = xn["SellingCategoryName"].InnerText;
string SelectedIcon = xn["SelectedIcon"].InnerText;
string UnSelectedIcon = xn["UnSelectedIcon"].InnerText;
richTextBox1.AppendText(string.Format("Name: {0} {1} {2}", sellingCategoryName, SelectedIcon, UnSelectedIcon));
}
I have tried changing xml.SelectNodes("/MobilePortalSellingCategoriesHierarchy");
to
xml.SelectNodes("/MobilePortalSellingCategoriesHierarchy/SellingCategoriesHierarchy/SellingCategories/PortalSellingCategory");
I need to select each SellingCategoryName and list the SellingCategoryName(s) and all the other items underneath it.
I was hoping to get something in the lines of:
Category1
Category1_Item
Item2
SamplePicture
Sample Picture
Mine only reads the First Node and then returns "Successful" to me.
I havve also tried:
XElement root = XElement.Load("FilePath");
var sellingCategoryNames = from PortalSellingCategory in root.Elements("MobilePortalSellingCategoriesHierarchy")
where (string)PortalSellingCategory.Element("SellingCategoriesHierarchy").Element("SellingCategories").Element("PortalSellingCategory") != ""
select PortalSellingCategory;
foreach (var xEle in sellingCategoryNames)
{
richTextBox1.Text = (string)xEle;
}
Any help would be greatly appreciated.
What you are doing with
xml.SelectNodes("/MobilePortalSellingCategoriesHierarchy");
is selecting your root node, which is just one. Thats why you only get one item in your list back. Is the hierarchy important? I can see that PortalSellingCategory can also be inside another PortalSellingCategory. If not maybe you could try the following select:
xml.SelectNodes("//PortalSellingCategory");
This will search for every node named "PortalSellingCategory" in your response, no mather where in the hierarchy it is.
EDIT:
And yes, you should look out for the namespaces, sry for didnt seeing that.
If you like geeting all the nodes with XPath you must create a new NamespaceManager and at it to your selectNodes call:
XmlDocument xml = new XmlDocument();
xml.LoadXml(data);
XmlNamespaceManager ns = new XmlNamespaceManager(xml.NameTable);
ns.AddNamespace("ns", "http://tym2sell.com/PortalService/");
XmlNodeList xnList = xml.SelectNodes("//ns:PortalSellingCategory", ns);
I would use XElement instead of XMLDocument, and then use Linq to query or pick the elmements like: XElement xContact = ....
int contactno = (int?)xContact.Element("command").Element("contactperson").Attribute("contactpersonid") ?? -1;
if (xContact.Element("command").Element("contactperson").Element("name").Element("firstname") != null)
console.writeline(xContact.Element("command").Element("contactperson").Element("name").Element("firstname").Value);
var doc= new XmlDocument();
doc.Load("FilePath");
var nodeList = xml.GetElementsByTagName("PortalSellingCategory");
Hi,
It returns the collection of nodes, and you just have to query it to get needed informations.
Feel free to ask help if needed.
Dimitri.
I'm parsing xml with HtmlAgilityPack on WebService worker role, but there is something wrong. When I select childnode "link" get empty char.
the xml like :
<link>
http://www.webtekno.com/google/google-ve-razer-dan-oyun-konsolu.html
</link>
my code for get link from rss is:
HtmlNodeCollection nodeList = doc.DocumentNode.SelectNodes("//item");
foreach (HtmlNode node in nodeList)
{
string newsUri = node.ChildNodes["link"].InnerText;
}
I think gets empty char cause link node includes new line and after link. How can I get link in the node?
Put that line before loading HtmlDocument
HtmlNode.ElementsFlags["link"] = HtmlElementFlag.Closed;
That is all.
By default, its value is HtmlElementFlag.Empty and treated like meta and img tags...
I have a XML file which contains about 850 XML nodes. Like this:
<NameValueItem>
<Text>Test</Text>
<Code>Test</Code>
</NameValueItem>
........ 849 more
And I want to add a new Childnode inside each and every Node. So I end up like this:
<NameValueItem>
<Text>Test</Text>
<Code>Test</Code>
<Description>TestDescription</Description>
</NameValueItem>
........ 849 more
I've tried the following:
XmlDocument doc = new XmlDocument();
doc.Load(xmlPath);
XmlNodeList nodes = doc.GetElementsByTagName("NameValueItem");
Which gives me all of the nodes, but from here am stuck(guess I need to iterate over all of the nodes and append to each and every) Any examples?
You need something along the lines of this example below. On each of your nodes, you need to create a new element to add to it. I assume you will be getting different values for the InnerText property, but I just used your example.
foreach (var rootNode in nodes)
{
XmlElement element = doc.CreateElement("Description");
element.InnerText = "TestDescription";
root.AppendChild(element);
}
You should just be able to use a foreach loop over your XmlNodeList and insert the node into each XmlNode:
foreach(XmlNode node in nodes)
{
node.AppendChild(new XmlNode()
{
Name = "Description",
Value = [value to insert]
});
}
This can also be done with XDocument using LINQ to XML as such:
XDocument doc = XDocument.Load(xmlDoc);
var updated = doc.Elements("NameValueItem").Select(n => n.Add(new XElement() { Name = "Description", Value = [newvalue]}));
doc.ReplaceWith(updated);
If you don't want to parse XML using proper classes (i.e. XDocument), you can use Regex to find a place to insert your tag and insert it:
string s = #"<NameValueItem>
<Text>Test</Text>
<Code>Test</Code>
</NameValueItem>";
string newTag = "<Description>TestDescription</Description>";
string result = Regex.Replace(s, #"(?<=</Code>)", Environment.NewLine + newTag);
but the best solution is Linq2XML (it's much better, than simple XmlDocument, that is deprecated at now).
string s = #"<root>
<NameValueItem>
<Text>Test</Text>
<Code>Test</Code>
</NameValueItem>
<NameValueItem>
<Text>Test2</Text>
<Code>Test2</Code>
</NameValueItem>
</root>";
var doc = XDocument.Load(new StringReader(s));
var elms = doc.Descendants("NameValueItem");
foreach (var element in elms)
{
element.Add(new XElement("Description", "TestDescription"));
}
var text = new StringWriter();
doc.Save(text);
Console.WriteLine(text);
I need to take an XML file and create multiple output xml files from the repeating nodes of the input file. The source file "AnimalBatch.xml" looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<Animals>
<Animal id="1001">
<Quantity>One</Quantity>
<Adjective>Red</Adjective>
<Name>Rooster</Name>
</Animal>
<Animal id="1002">
<Quantity>Two</Quantity>
<Adjective>Stubborn</Adjective>
<Name>Donkeys</Name>
</Animal>
<Animal id="1003">
<Quantity>Three</Quantity>
<Color>Blind</Color>
<Name>Mice</Name>
</Animal>
</Animals>
The program needs to split the repeating "Animal" and produce 3 files named: Animal_1001.xml, Animal_1002.xml, and Animal_1003.xml
Each output file should contain just their respective element (which will be the root). The id attribute from AnimalsBatch.xml will supply the sequence number for the Animal_xxxx.xml filenames. The id attribute does not need to be in the output files.
Animal_1001.xml:
<?xml version="1.0" encoding="utf-8"?>
<Animal>
<Quantity>One</Quantity>
<Adjective>Red</Adjective>
<Name>Rooster</Name>
</Animal>
Animal_1002.xml
<?xml version="1.0" encoding="utf-8"?>
<Animal>
<Quantity>Two</Quantity>
<Adjective>Stubborn</Adjective>
<Name>Donkeys</Name>
</Animal>
Animal_1003.xml>
<?xml version="1.0" encoding="utf-8"?>
<Animal>
<Quantity>Three</Quantity>
<Adjective>Blind</Adjective>
<Name>Mice</Name>
</Animal>
I want to do this with XmlDocument, since it needs to be able to run on .Net 2.0.
My program looks like this:
static void Main(string[] args)
{
string strFileName;
string strSeq;
XmlDocument doc = new XmlDocument();
doc.Load("D:\\Rick\\Computer\\XML\\AnimalBatch.xml");
XmlNodeList nl = doc.DocumentElement.SelectNodes("Animal");
foreach (XmlNode n in nl)
{
strSeq = n.Attributes["id"].Value;
XmlDocument outdoc = new XmlDocument();
XmlNode rootnode = outdoc.CreateNode("element", "Animal", "");
outdoc.AppendChild(rootnode); // Put the wrapper element into outdoc
outdoc.ImportNode(n, true); // place the node n into outdoc
outdoc.AppendChild(n); // This statement errors:
// "The node to be inserted is from a different document context."
strFileName = "Animal_" + strSeq + ".xml";
outdoc.Save(Console.Out);
Console.WriteLine();
}
Console.WriteLine("END OF PROGRAM: Press <ENTER>");
Console.ReadLine();
}
I think I have 2 problems.
A) After doing the ImportNode on node n into outdoc, I call outdoc.AppendChild(n) which complains: "The node to be inserted is from a different document context." I do not know if this is a scope issue referencing node n within the ForEach loop - or if I am somehow not using ImportNode() or AppendChild properly. 2nd argument on ImportNode() is set to true, because I want the child elements of Animal (3 fields arbitrarily named Quantity, Adjective, and Name) to end up in the destination file.
B) Second problem is getting the Animal element into outdoc. I'm getting '' but I need ' ' so I can place node n inside it. I think my problem is how I am doing: outdoc.AppendChild(rootnode);
To show the xml, I'm doing: outdoc.Save(Console.Out); I do have the code to save() to an output file - which does work, as long as I can get outdoc assembled properly.
There is a similar question at: Split XML in Multiple XML files, but I don't understand the solution code yet. I think I'm pretty close on this approach, and will appreciate any help you can provide.
I'm going to be doing this same task using XmlReader, since I'm going to need to be able to handle large input files, and I understand that XmlDocument reads the whole thing in and can cause memory issues.
That's a simple method that seems what you are looking for
public void test_xml_split()
{
XmlDocument doc = new XmlDocument();
doc.Load("C:\\animals.xml");
XmlDocument newXmlDoc = null;
foreach (XmlNode animalNode in doc.SelectNodes("//Animals/Animal"))
{
newXmlDoc = new XmlDocument();
var targetNode = newXmlDoc.ImportNode(animalNode, true);
newXmlDoc.AppendChild(targetNode);
newXmlDoc.Save(Console.Out);
Console.WriteLine();
}
}
This approach seems to work without using the "var targetnode" statement. It creates an XmlNode object called targetNode from outdoc's "Animal" element in the ForEach loop. I think the main things that were problems in my original code were: A) I was getting nodelist nl incorrectly. And B) I couldn't "Import" node n, I think because it was associated specifically with doc. It had to be created as its own Node.
The problem with the prior proposed solution was the use of the "var" keyword. My program has to assume 2.0 and that came in with v3.0. I like Rogers solution, in that it is concise. For me - I wanted to do each thing as a separate statement.
static void SplitXMLDocument()
{
string strFileName;
string strSeq;
XmlDocument doc = new XmlDocument(); // The input file
doc.Load("D:\\Rick\\Computer\\XML\\AnimalBatch.xml");
XmlNodeList nl = doc.DocumentElement.SelectNodes("//Animals/Animal");
foreach (XmlNode n in nl)
{
strSeq = n.Attributes["id"].Value; // Animal nodes have an id attribute
XmlDocument outdoc = new XmlDocument(); // Create the outdoc xml document
XmlNode targetNode = outdoc.CreateElement("Animal"); // Create a separate node to hold the Animal element
targetNode = outdoc.ImportNode(n, true); // Bring over that Animal
targetNode.Attributes.RemoveAll(); // Remove the id attribute in <Animal id="1001">
outdoc.ImportNode(targetNode, true); // place the node n into outdoc
outdoc.AppendChild(targetNode); // AppendChild to make it stick
strFileName = "Animal_" + strSeq + ".xml";
outdoc.Save(Console.Out); Console.WriteLine();
outdoc.Save("D:\\Rick\\Computer\\XML\\" + strFileName);
Console.WriteLine();
}
}
I have an XmlWriter that contains a xml that looks like the one below, just with a lot more nodes. what's the fastest and best way to remove all the ARTIST node from this xml ?
<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
</CD>
<CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
</CD>
</CATALOG>
As long as the file isn't gigabytes XmlDocument should be fine:
XmlDocument XDoc = new XmlDocument();
XDoc.Load(MapPath(#"~\xml\test.xml"));
XmlNodeList Nodes = XDoc.GetElementsByTagName("ARTIST");
foreach (XmlNode Node in Nodes)
Node.ParentNode.RemoveChild(Node);
XDoc.Save(MapPath(#"~\xml\test_out.xml"));
When I tried Steve's solution I received the following error "The element list has changed. The enumeration operation failed to continue". I know it is a bit of a hack but to get around this I used the following:
//Load XML
XmlDocument XDoc = new XmlDocument();
XDoc.Load(MapPath(#"~\xml\test.xml"));
//Get list of offending nodes
XmlNodeList Nodes = XDoc.GetElementsByTagName("ARTIST");
//Loop through the list
while (Nodes.Count != 0) {
foreach (XmlNode Node in Nodes) {
//Remove the offending node
Node.ParentNode.RemoveChild(Node); //<--This line messes with our iteration and forces us to get a new list after each remove
//Stop the loop
break;
}
//Get a refreshed list of offending nodes
Nodes = XDoc.GetElementsByTagName("ARTIST");
}
//Save the document
XDoc.Save(newfile);
I was in a bind and needing something quick and this got the job done.
From the way your question is worded, I'm assuming that you have an XmlWriter that has been used to write an XmlDocument and you want to manipulate what it is going to write before flush is called.
I'm afraid that what you're trying to do may be impossible. The XmlWriter is not meant to manipulate XML before it is written to a destination stream.