I'm parsing an XML file to compare it to another XML file. XML Diff works nicely, but we have found a lot of junk tags that exist in one file and not in the other; they have no bearing on our results but clutter up the report. I have loaded the XML file into memory to do some other things to it, and I'm wondering if there is an easy way, at the same time, to go through that file and remove all tags that start with, as an example, color=. The value of color is all over the map, so it is not easy to grab them all and remove them.
There doesn't seem to be any way in XML Diff to specify "ignore these tags".
I could roll through the file, find each instance, find the end of it, delete it out, but I'm hoping there will be something simpler. If not, oh well.
Edit: Here's a piece of the XML:
<numericValue color="-103" hidden="no" image="stuff.jpg" key="More stuff." needsQuestionFormatting="false" system="yes" systemEquivKey="Stuff." systemImage="yes">
<numDef increment="1" maximum="180" minimum="30">
<unit deprecated="no" key="BPM" system="yes" />
</numDef>
</numericValue>
If you are using Linq to XML, you can load your XML into an XDocument via:
var doc = XDocument.Parse(xml); // Load the XML from a string
Or
var doc = XDocument.Load(fileName); // Load the XML from a file.
Then search for all elements with matching names and use System.Xml.Linq.Extensions.Remove() to remove them all at once:
string prefix = "L"; // Or whatever.
// Use doc.Root.Descendants() instead of doc.Descendants() to avoid accidentally removing the root element.
var elements = doc.Root.Descendants().Where(e => e.Name.LocalName.StartsWith(prefix, StringComparison.Ordinal));
elements.Remove();
Update
In your XML, the color="-103" substring is an attribute of an element, rather than an element itself. To remove all such attributes, use the following method:
public static void RemovedNamedAttributes(XElement root, string attributeLocalName)
{
    if (root == null)
        throw new ArgumentNullException(nameof(root));
    // Extensions.Remove() snapshots the matching attributes before removing them.
    foreach (var node in root.DescendantsAndSelf())
        node.Attributes().Where(a => a.Name.LocalName == attributeLocalName).Remove();
}
Then call it like:
var doc = XDocument.Parse(xml); // Load the XML
RemovedNamedAttributes(doc.Root, "color");
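If the goal is to drop every attribute whose name merely starts with a given string (as the question's wording suggests), a prefix match could be used instead of an exact name match. The following is a minimal sketch under that assumption; the class name AttributeStripper and the method name RemoveAttributesByPrefix are hypothetical and not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeStripper
{
    // Hypothetical helper: removes every attribute whose local name
    // starts with the given prefix, on the root and all its descendants.
    public static void RemoveAttributesByPrefix(XElement root, string prefix)
    {
        if (root == null)
            throw new ArgumentNullException(nameof(root));

        // ToList() snapshots the matches so removal does not disturb the enumeration.
        var matches = root.DescendantsAndSelf()
                          .Attributes()
                          .Where(a => a.Name.LocalName.StartsWith(prefix, StringComparison.Ordinal))
                          .ToList();

        foreach (var attribute in matches)
            attribute.Remove();
    }

    public static void Main()
    {
        // Trimmed-down version of the XML from the question.
        var doc = XDocument.Parse(
            "<numericValue color=\"-103\" hidden=\"no\"><numDef increment=\"1\" /></numericValue>");
        RemoveAttributesByPrefix(doc.Root, "color");
        Console.WriteLine(doc);
    }
}
```

Note that a prefix match on "color" would also remove attributes such as colorMode, so the exact-name version above is the safer choice when only one attribute is unwanted.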
Related
I have the following XML file. I need to change the inner text of ANY tag that contains the value «Museum», or just one such tag for a start:
<src>
<riga>
<com>¾</com>
<riquadro797>Direction to move</riquadro797>
</riga>
<riga>
<llt>
<com>Museum</com>
<elemento797>Direction not to move</elemento797>
</llt>
</riga>
<operation>
<com> </com>
<riquadro797>Museum</riquadro797>
</operation>
<riga>
<gt>
<elemento797>Direction not to move</elemento797>
</gt>
</riga>
</src>
I've parsed this file to XElement. This is what I've tried, and it does not work:
var tt = xmlCluster.Elements(First(x => x.Value == "Museum");
This code is not proper, as I cannot predict which element will contain "Museum":
var el = rootElRecDocXml.SelectSingleNode("src/riga/gt/elemento797[text()='"+mFilePath+"']");
How to do it? Any help will be greatly appreciated!
Just grab all elements with Museum values:
var doc = XDocument.Parse(xml);
// ToList() takes a snapshot so the tree is not modified while it is being enumerated
var elements = doc.Descendants().Where(e => e.Value == "Museum").ToList();
foreach (var ele in elements)
ele.Value = "Test";
//doc is updated with new values
As Selman22 noted, doc will just be a working copy of your XML. You'll need to call doc.Save to write anything back to disk, or wherever you need it.
Elements() only looks at a single level in the hierarchy. I think you want Descendants() instead...
If you want an older-school XPath option, you need to do a global search on the tree - you can use the // XPath expression for this:
var els = rootElRecDocXml.SelectNodes("//*[text()='"+mFilePath+"']");
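For reference, here is a self-contained sketch of that global XPath search with XmlDocument, using a trimmed-down copy of the XML from the question. The class name TextSearch and the method name FindByText are made up for illustration, and the sketch assumes the search text contains no single quote, since it is concatenated straight into the XPath expression.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class TextSearch
{
    // Hypothetical helper: returns every element that has a text node equal to 'text'.
    // Assumption: 'text' contains no single quote.
    public static List<XmlNode> FindByText(XmlDocument doc, string text)
    {
        var matches = new List<XmlNode>();
        // "//*" visits every element in the document, whatever its name or depth.
        foreach (XmlNode node in doc.SelectNodes("//*[text()='" + text + "']"))
            matches.Add(node);
        return matches;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<src><riga><llt><com>Museum</com></llt></riga>" +
                    "<operation><riquadro797>Museum</riquadro797></operation></src>");

        // The matches were copied into a list first, so editing them here is safe.
        foreach (XmlNode node in FindByText(doc, "Museum"))
            node.InnerText = "Test";

        Console.WriteLine(doc.OuterXml);
    }
}
```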
I want to grab one specific value within an XML document at a url, I have managed to get a list of all values, but I'm not sure how to choose the specific value. The XML document is as follows;
<evec_api version="2.0" method="marketstat_xml">
<marketstat>
<type id="37">
<buy>
<volume>291092912</volume>
<avg>137.11</avg>
<max>156.06</max>
<min>53.46</min>
<stddev>31.00</stddev>
<median>140.28</median>
<percentile>156.05</percentile>
</buy>
<sell>
<volume>273042044</volume>
<avg>177.43</avg>
<max>339.00</max>
<min>166.22</min>
<stddev>30.83</stddev>
<median>170.38</median>
<percentile>166.26</percentile>
</sell>
<all>
<volume>574134956</volume>
<avg>154.64</avg>
<max>339.00</max>
<min>43.00</min>
<stddev>42.21</stddev>
<median>156.05</median>
<percentile>69.98</percentile>
</all>
</type>
</marketstat>
</evec_api>
The specific value I want is the min sell value, which is 166.22. My current code, which just retrieves all values in the document, is:
private void Form1_Load(object sender, EventArgs e)
{
string xmlDocPath = "http://api.eve-central.com/api/marketstat?typeid=37&regionlimit=10000002&usesystem=30000142";
XmlTextReader xmlReader = new XmlTextReader(xmlDocPath);
while (xmlReader.Read())
{
if (xmlReader.NodeType == XmlNodeType.Text)
{
textBox1.AppendText(xmlReader.Value + "\n");
}
}
}
I've tried a few different methods, like just throwing it all in a text box and taking the specific line, but that seems like a really silly solution. Most of the tutorials use the console; however, that doesn't work for me. I feel there's probably a simple solution, but I have yet to find one that works. Also, being fairly new to this, if there is anything terribly inefficient about this code, feel free to point it out.
Try to use LINQ to XML, it's very straightforward. Example is given below:
var doc = XDocument.Parse(xml); //use XDocument.Load if you have path to a file
string minSell = doc.Descendants("sell")
.First()
.Element("min")
.Value;
Console.WriteLine(minSell); //prints 166.22
If you load that XmlTextReader into an XmlDocument, you can then execute an XPath query on it to retrieve the specific node you're interested in:
var doc = new XmlDocument();
doc.Load(xmlReader);
var xpath = "/evec_api/marketstat/type[@id='37']/sell/min";
var myNode = doc.SelectSingleNode(xpath);
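A self-contained sketch of the XmlDocument approach follows. It loads a shortened copy of the XML from the question from a string rather than from the URL, so it makes no network call. The class name MinSellLookup and the method name GetMinSell are hypothetical.

```csharp
using System;
using System.Xml;

public static class MinSellLookup
{
    // Hypothetical helper: returns the text of <sell>/<min> for the given type id,
    // or null when no such node exists.
    public static string GetMinSell(string xml, string typeId)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // In XPath, attributes are addressed with '@'; the path starts at the root element.
        XmlNode node = doc.SelectSingleNode(
            "/evec_api/marketstat/type[@id='" + typeId + "']/sell/min");
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        string xml =
            "<evec_api version=\"2.0\" method=\"marketstat_xml\"><marketstat><type id=\"37\">" +
            "<buy><min>53.46</min></buy>" +
            "<sell><min>166.22</min></sell>" +
            "</type></marketstat></evec_api>";

        Console.WriteLine(GetMinSell(xml, "37"));
    }
}
```

In a Windows Forms application, the returned string could be assigned to textBox1.Text instead of being written to the console.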
Load function is already defined in xmlData class
public class XmlData
{
public void Load(XElement xDoc)
{
var id = xDoc.XPathSelectElements("//ID");
var listIds = xDoc.XPathSelectElements("/Lists//List/ListIDS/ListIDS");
}
}
I'm just calling the Load function from my end.
XmlData aXmlData = new XmlData();
string input, stringXML = "";
TextReader aTextReader = new StreamReader("D:\\test.xml");
while ((input = aTextReader.ReadLine()) != null)
{
stringXML += input;
}
XElement Content = XElement.Parse(stringXML);
aXmlData.Load(Content);
In the Load function, I'm getting both id and listIds as null.
My test.xml contains
<SEARCH>
<ID>11242</ID>
<Lists>
<List CURRENT="true" AGGREGATEDCHANGED="false">
<ListIDS>
<ListID>100567</ListID>
<ListID>100564</ListID>
<ListID>100025</ListID>
<ListID>2</ListID>
<ListID>1</ListID>
</ListIDS>
</List>
</Lists>
</SEARCH>
EDIT: Your sample XML doesn't have an id element in the namespace with the nss alias. It would be <nss:id> in that case, or there'd be a default namespace set up. I've assumed for this answer that in reality the element you're looking for is in the namespace.
Your query is trying to find an element called id at the root level. To find all id elements, you need:
var tempId = xDoc.XPathSelectElements("//nss:id", ns);
... although personally I'd use:
XDocument doc = XDocument.Parse(...);
XNamespace nss = "http://schemas.microsoft.com/SQLServer/reporting/reportdesigner";
// Or use FirstOrDefault(), or whatever...
XElement idElement = doc.Descendants(nss + "id").Single();
(I prefer using the query methods on LINQ to XML types instead of XPath... I find it easier to avoid silly syntax errors etc.)
Your sample code is also unclear as you're using xDoc which hasn't been declared... it helps to write complete examples, ideally including everything required to compile and run as a console app.
I am looking at the question 3 hours after it was submitted and 41 minutes after it was (last) edited.
There are no namespaces defined in the provided XML document.
var listIds = xDoc.XPathSelectElements("/Lists//List/ListIDS/ListIDS");
This XPath expression obviously doesn't select any node from the provided XML document, because the XML document doesn't have a top element named Lists (the name of the actual top element is SEARCH)
var id = xDoc.XPathSelectElements("//ID");
In the Load function, I'm getting both id and listIds as null.
This statement is false, because //ID selects the only element named ID in the provided XML document, thus the value of the C# variable id is non-null. Probably you didn't test thoroughly after editing the XML document.
Most probably the original ID element belonged to some namespace. But now it is in "no namespace" and the XPath expression above does select it.
string xmldocument = "<response xmlns:nss=\"http://schemas.microsoft.com/SQLServer/reporting/reportdesigner\"><action>test</action><id>1</id></response>";
XElement Content = XElement.Parse(xmldocument);
XPathNavigator navigator = Content.CreateNavigator();
XmlNamespaceManager ns = new XmlNamespaceManager(navigator.NameTable);
ns.AddNamespace("nss", "http://schemas.microsoft.com/SQLServer/reporting/reportdesigner");
var tempId = navigator.SelectSingleNode("/id");
The reason for the null value, or for a system-generated type name being shown instead of a value, is the following:
var id = xDoc.XPathSelectElements("//ID");
XPathSelectElements returns an IEnumerable<System.Xml.Linq.XElement>, which is a lazily evaluated LINQ query. It cannot be output directly as such.
To get the first matching element on its own,
use XPathSelectElement("//ID");
You can check the number of occurrences using XPathSelectElements:
var count = xDoc.XPathSelectElements("//ID").Count();
You can also extend the query with LINQ operators, for example to order the results or filter them by specific conditions.
In order to get the node values from a list, you can use this:
foreach (XElement listId in xDoc.XPathSelectElements("//ListIDS/ListID"))
{
Console.WriteLine(listId.Value);
}
For the second list you haven't got the values because the XPath for the list items is not correct.
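As an illustration, the list IDs in the posted test.xml could be read with a path that is relative to the SEARCH element, which sidesteps the question of what an absolute path means for an XElement that has no parent document. This is a minimal sketch; the class name SearchReader and the method name ReadListIds are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class SearchReader
{
    // Hypothetical helper: 'search' is the <SEARCH> element itself, so the path
    // is written relative to it and ends in ListID (not ListIDS).
    public static List<string> ReadListIds(XElement search)
    {
        return search.XPathSelectElements("Lists/List/ListIDS/ListID")
                     .Select(e => e.Value)
                     .ToList();
    }

    public static void Main()
    {
        XElement search = XElement.Parse(
            "<SEARCH><ID>11242</ID><Lists><List CURRENT=\"true\"><ListIDS>" +
            "<ListID>100567</ListID><ListID>100564</ListID>" +
            "</ListIDS></List></Lists></SEARCH>");

        // "ID" is a direct child of SEARCH, so a relative path finds it as well.
        Console.WriteLine(search.XPathSelectElement("ID").Value);

        foreach (string listId in ReadListIds(search))
            Console.WriteLine(listId);
    }
}
```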
I have an application that is on .NET 2.0, and I am having some difficulty with it as I am more used to LINQ.
The xml file look like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<updates>
<files>
<file url="files/filename.ext" checksum="06B9EEA618EEFF53D0E9B97C33C4D3DE3492E086" folder="bin" system="0" size="40448" />
<file url="files/filename.ext" checksum="CA8078D1FDCBD589D3769D293014154B8854D6A9" folder="" system="0" size="216" />
<file url="files/filename.ext" checksum="CA8078D1FDCBD589D3769D293014154B8854D6A9" folder="" system="0" size="216" />
</files>
</updates>
The file is downloaded and read on the fly:
XmlDocument readXML = new XmlDocument();
readXML.LoadXml(xmlData);
Initially I was thinking it would go something like this:
XmlElement root = doc.DocumentElement;
XmlNodeList nodes = root.SelectNodes("//files");
foreach (XmlNode node in nodes)
{
... im reading it ...
}
But before reading them I need to know how many there are, to use on my progress bar, and I am also clueless on how to grab the attributes of the file element in this case.
How could I count how many "file" ELEMENTS I have (count them before entering the foreach, of course) and read their attributes?
I need the count because it will be used to update the progress bar.
Overall it is not reading my xml very well.
before reading them I need to know how many they are to use on my progress bar
Use the XmlNodeList.Count property. Code example below.
Overall it is not reading my xml very well
Here are some tips on reading XML with the older XML library.
First, XPath is your friend. It lets you query elements pretty quickly, in a way that is (very) vaguely similar to Linq. In this case, you should change your XPath to get the list of child "file" elements, rather than the parent "files" element.
XmlNodeList nodes = root.SelectNodes("//files");
Becomes
XmlNodeList files = root.SelectNodes("//file");
The //ElementName syntax searches recursively for all elements with that name. XPath is pretty cool, and you should read up on it a bit. Here are some links:
http://msdn.microsoft.com/en-us/library/hcebdtae.aspx
http://msdn.microsoft.com/en-us/library/d271ytdx.aspx
Once you have those elements, you can use the XmlElement.Attributes property, coupled with the XmlAttribute.Value property (file.Attributes["url"].Value).
Or you can use the GetAttribute method.
See the documentation on XmlElement for more info. Remember to switch the .NET Framework version to 2.0 on that page.
XmlElement root = doc.DocumentElement;
XmlNodeList files = root.SelectNodes("//file"); // file element, not files element
int numberOfFiles = files.Count;
// Todo: Update progress bar here
foreach (XmlElement file in files) // These are elements, so this cast is safe-ish
{
string url = file.GetAttribute("url");
string folder = file.GetAttribute("folder");
// If not an integer, will throw. Could use int.TryParse instead
int system = int.Parse(file.GetAttribute("system"));
int size = int.Parse(file.GetAttribute("size"));
// convert this to a byte array later
string checksum = file.GetAttribute("checksum");
}
For how to convert your checksum into a byte array, see this question:
How can I convert a hex string to a byte array?
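One possible conversion is sketched here using only APIs that are available on .NET 2.0. The class name HexConverter and the method name HexToBytes are hypothetical, and the sketch assumes the checksum string contains only hexadecimal digits.

```csharp
using System;

public static class HexConverter
{
    // Hypothetical helper: converts a hex string such as "CA80" into { 0xCA, 0x80 }.
    public static byte[] HexToBytes(string hex)
    {
        if (hex == null)
            throw new ArgumentNullException("hex");
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even length.", "hex");

        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Each pair of hex digits becomes one byte; throws FormatException on bad input.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    public static void Main()
    {
        // Checksum value taken from the first file element in the question.
        byte[] checksum = HexToBytes("06B9EEA618EEFF53D0E9B97C33C4D3DE3492E086");
        Console.WriteLine(checksum.Length);
    }
}
```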
You can count the number of elements by reading the Count property of your collection:
int ElementsCount = nodes.Count;
You can read attributes as follows:
foreach(XmlNode node in nodes) {
Console.WriteLine("Value: " + node.Attributes["name_of_attribute"].Value);
}
EDIT:
You should be able to use nodes[0].ChildNodes.Count.
El Padrino showed a solution:
How to change XML Attribute
where an xml element can be loaded directly (no for each..), edited and saved!
My xml is:
<?xml version="1.0" encoding="ISO-8859-8"?>
<g>
<page no="1" href="page1.xml" title="נושא 1">
<row>
<pic pos="1" src="D:\RuthSiteFiles\webSiteGalleryClone\ruthCompPics\C_WebBigPictures\100CANON\IMG_0418.jpg" width="150" height="120">1</pic>
</row>
</page>
</g>
and I need to select a node by two attributes ("no" in the page tag and "pos" in the pic tag).
I've found :
How to access a xml node with attributes and namespace using selectsinglenode()
where direct access is possible, but besides the fact that I don't understand the solution, I think it uses the XPath object, which can't be modified to save changes.
What's the best way to
access directly an xml node (I'm responsible that the node will be unique)
edit that node
save changes to the xml
Thanks
Asaf
You can use the same pattern as the first answer you linked to, but you will need to include the conditions on the attributes in the XPath. Your basic XPath would be g/page/row/pic. Since you want the no attribute of page to be 1, you add [@no='1'] as a predicate on page. So, the full XPath query is something like g/page[@no='1']/row/pic[@pos='1']. SelectSingleNode will return a mutable XmlNode object, so you can modify that object and save the original document to save changes.
Putting the XPath together with El Padrino's answer:
//Here is the variable with which you assign a new value to the attribute
string newValue = string.Empty;
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlFile);
XmlNode node = xmlDoc.SelectSingleNode("g/page[@no='1']/row/pic[@pos='1']");
node.Attributes["src"].Value = newValue;
xmlDoc.Save(xmlFile);
//xmlFile is the path of your file to be modified
Use the new, well-designed XDocument/XElement instead of the old XmlDocument API.
In your example,
XDocument doc = XDocument.Load(filename);
var pages = doc.Root.Elements("page").Where(page => (int?) page.Attribute("no") == 1);
var rows = pages.SelectMany(page => page.Elements("row"));
var pics = rows.SelectMany(row => row.Elements("pic").Where(pic => (int?) pic.Attribute("pos") == 1));
foreach (var pic in pics)
{
// outputs <pic pos="1" src="D:\RuthSiteFiles\webSiteGalleryClone\ruthCompPics\C_WebBigPictures\100CANON\IMG_0418.jpg" width="150" height="120">1</pic>
Console.WriteLine(pic);
// outputs 1
Console.WriteLine(pic.Value);
// Changes the value (XElement.Value is a string)
pic.Value = "2";
}
doc.Save(filename);