I am having trouble parsing an XML file. A sample is below.
<G_LOG>
<LINE>9206</LINE>
<TEXT>Generating
</TEXT>
</G_LOG>
<G_LOG>
<LINE>9207</LINE>
<TEXT>Inserted Actual
</TEXT>
OK, so this is just a snapshot of thousands of nodes in the file. I need to search for the TEXT "Inserted Actual" and remove not only that node but the previous node as well. So it would find the text on line 9207 and remove 9206 as well (removing everything in the above snippet).
I can search for the lines I want to remove:
XDocument xmlDoc = XDocument.Load(filename);
var q = from c in xmlDoc.Descendants("G_LOG")
where c.Element("TEXT").Value.Contains("Inserted Actual")
select (string)c.Element("LINE");
foreach (string name in q)
{
Console.WriteLine("Actuals Success on ID : " + name);
}
But I am unsure how to obtain the previous node and remove it as well (without buckets of code).
var elementsToRemove =
    from logElement in xml.Descendants("G_LOG")
    where logElement.Element("TEXT").Value.Contains("Inserted Actual")
    from element in new[] { logElement, logElement.PreviousNode }
    where element != null // PreviousNode is null when the match is the first node
    select element;
foreach (var element in elementsToRemove.ToList())
{
    element.Remove();
}
A couple of things to note:
The second from flattens out each node and its previous node into one sequence
The .ToList() eagerly evaluates the query, ensuring we don't remove a node while evaluating
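The approach above can be tried end to end with a small sketch like the following. It is only an illustration under stated assumptions: the wrapper element `LOG`, the class name `LogCleaner`, and the helper `RemoveWithPrevious` are hypothetical (the question does not show the root element), and it picks the previous G_LOG sibling element rather than the raw previous node so that stray text nodes are not removed by accident.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LogCleaner
{
    // Hypothetical helper: removes every G_LOG whose TEXT contains the marker,
    // together with the G_LOG element immediately before it (if there is one).
    public static void RemoveWithPrevious(XContainer root, string marker)
    {
        var toRemove =
            (from log in root.Descendants("G_LOG")
             where ((string)log.Element("TEXT") ?? "").Contains(marker)
             from element in new[] { log, log.ElementsBeforeSelf("G_LOG").LastOrDefault() }
             where element != null
             select element)
            .Distinct()
            .ToList(); // materialize before mutating the tree

        foreach (var element in toRemove)
        {
            element.Remove();
        }
    }

    public static void Main()
    {
        // "LOG" is an assumed wrapper element, not taken from the question.
        var doc = XDocument.Parse(
            "<LOG>" +
            "<G_LOG><LINE>9205</LINE><TEXT>Starting</TEXT></G_LOG>" +
            "<G_LOG><LINE>9206</LINE><TEXT>Generating</TEXT></G_LOG>" +
            "<G_LOG><LINE>9207</LINE><TEXT>Inserted Actual</TEXT></G_LOG>" +
            "</LOG>");

        RemoveWithPrevious(doc, "Inserted Actual");

        // Only the untouched entry (9205) should remain.
        foreach (var line in doc.Descendants("LINE"))
        {
            Console.WriteLine((string)line);
        }
    }
}
```

The Distinct() call matters when two matching entries are adjacent: the first match would otherwise appear twice in the list (once as itself, once as the second match's predecessor), and removing an already detached element throws.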
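XElement xmlDoc = XElement.Load("c:\\temp.xml");
var q = xmlDoc.Descendants("G_LOG").Where(c => c.Element("TEXT").Value.Contains("Inserted Actual")).Select(d => d.Element("LINE"));
// ToList() snapshots the matches so that removing nodes does not disturb the enumeration
foreach (XElement elm in q.ToList())
{
    if (elm.Parent.ElementsBeforeSelf().Any())
        elm.Parent.PreviousNode.Remove();
    // Remove the matching G_LOG itself; RemoveNodes() would only empty it
    elm.Parent.Remove();
}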
I'm using XPath to read elements from an XML document. Specifically, I want to return the values of any element which is the child of a specified element (here the specified element is <SceneryType>), and these elements have single-digit values. So I want to return all of the children of <SceneryType> 1, for example.
Here is the XML:
<MissionObjectives>
<Theme themeName="Gothic">
<SceneryType>
1
<Objective>
Do a river thing.
</Objective>
<Objective>
Get all men to the other side of the river.
</Objective>
</SceneryType>
<SceneryType>
2
<Objective>
Climb some trees!
</Objective>
<Objective>
Shoot the tree!
</Objective>
</SceneryType>
</Theme>
</MissionObjectives>
I've tried various ways of getting these child elements, but I can't figure it out. The //Objective part of my expression seems to just return everything from the root, yet the iterator never runs, which seems odd: shouldn't it loop through every element if the expression returns a node list of all the elements?
XPathDocument missionDoc = new XPathDocument(objectivesPath + "MissionObjectives" + chosenTheme + ".xml");
XPathNavigator nav = missionDoc.CreateNavigator();
foreach (Scenery scenery in world.currentWorld)
{
int sceneryType = scenery.type;
XPathExpression expr = nav.Compile($"MissionObjectives/Theme/SceneryType[text()='{sceneryType}']//Objective");
XPathNodeIterator iterator = nav.Select(expr);
while (iterator.MoveNext())
{
XPathNavigator nav2 = iterator.Current.Clone();
compatibleObjectivesList.Add(nav2.Value);
}
}
I've tried looking through Stack Overflow for similar questions but I can't seem to find anything which applies to XPath. I can't use LINQ to XML for this. Any idea how I can return all the values of the various 'Objective' nodes?
Cheers for any help!
It's much simpler to use XDocument:
var doc = XDocument.Load(objectivesPath + "MissionObjectives" + chosenTheme + ".xml");
To get the first SceneryType element (and, through it, its child nodes):
var node = doc.XPathSelectElement("//MissionObjectives/Theme/SceneryType[1]");
To get the second Objective node:
var node = doc.XPathSelectElement("//MissionObjectives/Theme/SceneryType/Objective[2]");
more xpath samples
For one, your XML data has carriage returns, line feeds, and white space in the search element's text node. Keep in mind that an XML node can be an element, attribute, or text (among other node types). The solution below is a bit on the "long-handed" side and perhaps a little "hacky", but it should work. I wasn't certain whether you wanted the child element text data or the entire child element, so I return just the child text node data (without carriage returns and line feeds). Also, while this solution DOES NOT use LINQ to XML in the strictest sense, it does use one LINQ expression.
private List<string> getSceneryTypeObjectiveTextList(string xml, int sceneryTypeId, string xpath = "/MissionObjectives/Theme/SceneryType")
{
List<string> result = null;
XmlDocument doc = null;
XmlNodeList sceneryTypeNodes = null;
try
{
doc = new XmlDocument();
doc.LoadXml(xml);
sceneryTypeNodes = doc.SelectNodes(xpath);
if (sceneryTypeNodes != null)
{
if (sceneryTypeNodes.Count > 0)
{
foreach (XmlNode sceneryTypeNode in sceneryTypeNodes)
{
if (sceneryTypeNode.HasChildNodes)
{
var textNode = from XmlNode n in sceneryTypeNode.ChildNodes
where (n.NodeType == XmlNodeType.Text && n.Value.Replace("\r", "").Replace("\n", "").Replace(" ", "") == sceneryTypeId.ToString())
select n;
if (textNode.Count() > 0)
{
XmlNodeList objectiveNodes = sceneryTypeNode.SelectNodes("Objective");
if (objectiveNodes != null)
{
result = new List<string>(objectiveNodes.Count);
foreach (XmlNode objectiveNode in objectiveNodes)
{
result.Add(objectiveNode.InnerText.Replace("\r", "").Replace("\n", "").Trim());
}
// Could break out of the iteration, here, if we know that SceneryType is always unique (i.e. - no duplicates in Element text node)
}
}
}
}
}
}
}
catch (Exception ex)
{
// Handle error
}
finally
{
}
return result;
}
private void sampleCall(string filePath, int sceneryTypeId)
{
List<string> compatibleObjectivesList = null;
try
{
compatibleObjectivesList = getSceneryTypeObjectiveTextList(File.ReadAllText(filePath), sceneryTypeId);
}
catch (Exception ex)
{
// Handle error
}
finally
{
}
}
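Since the question asks for XPath without LINQ to XML, the whitespace problem described above can also be handled inside the expression itself. The following is a minimal sketch, assuming the digit is always the first text node of SceneryType; the class name `ObjectiveFinder` and the helper `GetObjectives` are hypothetical names introduced for illustration, and the inline XML is a trimmed copy of the sample.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class ObjectiveFinder
{
    // Hypothetical helper: returns the trimmed text of every Objective under the
    // SceneryType whose first text node equals sceneryType after whitespace normalization.
    public static List<string> GetObjectives(string xml, int sceneryType)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // normalize-space() strips the line breaks and indentation around the digit,
        // which is what makes a plain text()='1' comparison fail on this data.
        string xpath = "/MissionObjectives/Theme/SceneryType" +
                       "[normalize-space(text()[1])='" + sceneryType + "']/Objective";

        var result = new List<string>();
        XPathNodeIterator iterator = nav.Select(xpath);
        while (iterator.MoveNext())
        {
            result.Add(iterator.Current.Value.Trim());
        }
        return result;
    }

    public static void Main()
    {
        string xml = @"<MissionObjectives>
  <Theme themeName=""Gothic"">
    <SceneryType>
      1
      <Objective>
        Do a river thing.
      </Objective>
    </SceneryType>
    <SceneryType>
      2
      <Objective>
        Climb some trees!
      </Objective>
    </SceneryType>
  </Theme>
</MissionObjectives>";

        foreach (string objective in GetObjectives(xml, 1))
        {
            Console.WriteLine(objective);
        }
    }
}
```

Because sceneryType is an int, concatenating it into the expression cannot introduce quote characters; a string value would need escaping or a different lookup strategy.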
Is there a way I can reduce the foreach code below so I don't have to use a foreach loop to iterate over the xml nodes?
I just want to look and see if an item is present in the xml file
XmlDocument doc = new XmlDocument();
doc.Load("MyList.xml");
XmlNodeList list = doc.SelectNodes("/MyList/item");
foreach( XmlNode item in list)
{
string name = item.InnerText;
if(name == "blah blah")
{
//do something
}
}
The above works but I just want a smaller cooler way of doing it :)
If all you want to do is check whether a certain node exists, use SelectSingleNode with a filtered XPath:
XmlNode node = doc.SelectSingleNode("/MyList/item[. = 'blah blah']");
if (node != null)
{
// do something
}
One issue here is that if the value you want to match on is dynamic, you should not build up the XPath by concatenating strings together. A value that contains a quote character would produce an invalid XPath.
In that case, you can either use LINQ on an XmlNodeList:
var found = doc.SelectNodes("/MyList/item")
.Cast<XmlNode>()
.Any(n => n.InnerText == "blah blah");
or go ahead and use LINQ-to-XML:
XDocument doc = XDocument.Load("MyList.xml");
bool itemFound = doc.Element("MyList")
.Elements("item")
.Any(e => (string) e == "blah blah");
You can filter elements by inner text directly in the XPath expression, like so:
XmlNodeList list = doc.SelectNodes("/MyList/item[.='blah blah']");
I am reading an XML that contains a tag like this:
<source><bpt id="1">&lt;donottranslate&gt;</bpt><ph id="2">($ T_353_1 Parent ID $)</ph><ept id="1">&lt;/donottranslate&gt;</ept></source>
When reading source node I get that this node type is Text, but it should be Element.
This is an XML that I am receiving and I cannot change it.
Do you know how I can get this sorted out?
This is my code:
XDocument doc = XDocument.Load(fileName, LoadOptions.PreserveWhitespace);
foreach (var elUnit in doc.Descendants("trans-unit"))
{
if (elUnit.AttributeString("translate").ToString() == "no")
{
foreach (var elSource in elUnit.Elements("source"))
{
string text = "";
foreach (var node in elSource.DescendantNodes().Where(n => XmlNodeType.Text == n.NodeType).ToList())
{
//When reading that "source" node, it enters inside this code
Thanks
First, check whether your XML is well-formed:
http://www.w3schools.com/xml/xml_validator.asp
http://chris.photobooks.com/xml/default.htm
I could get this to work:
//using System.Xml.Linq;
var str = "<source><bpt id=\"1\">&lt;donottranslate&gt;</bpt>" +
"<ph id=\"2\">($ T_353_1 Parent ID $)</ph>" +
"<ept id=\"1\">&lt;/donottranslate&gt;</ept></source>";
XElement element = XElement.Parse(str);
Console.WriteLine(element);
The output is this:
<source>
<bpt id="1">&lt;donottranslate&gt;</bpt>
<ph id="2">($ T_353_1 Parent ID $)</ph>
<ept id="1">&lt;/donottranslate&gt;</ept>
</source>
Please provide a code sample for more help if this example is not sufficient.
Finally, I solved this checking if the node is correct or not:
if (System.Security.SecurityElement.IsValidText(text.XmlDecodeEntities()))
For an application I am working on, I have to display data from an XML File. There's a few transformations being done, but eventually the end result will be displayed in a treeview. When a user then clicks on a node, I want to pop up the details in a listview.
When no node has been selected, I basically use LINQ to grab the details of the first item I encounter.
Here's a simplified version of my XML
<root>
<parent label="parent1">
<child label="child1">
<element1>data</element1>
<element2>data</element2>
...
</child>
<child label="child2">
<element1>data</element1>
<element2>data</element2>
...
</child>
</parent>
</root>
And here's the code used to grab it (after selecting the parent node that the treeview has been set to by means of an XPathSelectElement call):
protected void listsSource_Selecting(object sender, LinqDataSourceSelectEventArgs e)
{
XElement rootElement = XElement.Load(MapPath(TreeSource.DataFile));
rootElement = rootElement.XPathSelectElement("//parent[@label='parent1']");
XElement parentElement;
parentElement = rootElement;
var query = (from itemElement in parentElement.Descendants("child")
select new
{
varElement1 = itemElement.Element("element1").Value,
varElement2 = itemElement.Element("element2").Value,
...
}).Take(1);
e.Result = query;
}
This works a treat, and I can read out the varElement1 and varElement2 values from there. However, when I try and implement a similar mechanism for when the user actually did select a node, I seem to run into a wall.
My approach was to use another XPathSelectElement call to get to the actual node:
parentElement = rootElement.XPathSelectElement("//child[@label='" + tvwChildren.SelectedNode.Text + "']");
But I am kind of stumped on how to now get a proper LINQ query built up to read in all elements nested under the child node. I tried using parentElement.Elements(), but that was yielding an error. I also looked at using Nodes(), but with similar results.
I suppose I could use a foreach loop to access the nodes, but then I'm not sure how to get the results into a LINQ query so I can return the same e.Result = query back.
I'm fairly new to LINQ, as you might have guessed, so any hints would be very much appreciated.
Here's the query that will give you the child element (given that there is only one child element with the specified label):
var childElement = rootNode.Descendants("child")
.Single(e=>e.Attribute("label").Value == "child1");
If you have more than one child element with label="child1", but those elements are under different parent elements, you can use the same approach to first get the parent element and then the child element.
Having the above, you can use this query to get all element nodes under the child node:
var elements = childElement.Descendants().Select(e=>e.Value);
I think data binding is much easier in this case.
XDocument doc = XDocument.Load(filePath);
if (doc.Root == null)
{
throw new ApplicationException("invalid data");
}
tvwChildren.Source=doc;
But if you want to do it this way, I hope the following helps (it is not the exact solution):
XElement root = XElement.Load("Employees.xml");
TreeNode rootNode = new TreeNode(root.Name.LocalName);
treeView1.Nodes.Add(rootNode);
foreach(XElement employee in root.Elements())
{
TreeNode employeeNode = new TreeNode("Employee ID :" + employee.Attribute("employeeid").Value);
rootNode.Nodes.Add(employeeNode);
if (employee.HasElements)
{
foreach(XElement employeechild in employee.Descendants())
{
TreeNode childNode = new TreeNode(employeechild.Value);
employeeNode.Nodes.Add(childNode);
}
}
}
You can also try the ReSharper tool to create better LINQ statements. It shows the possible conversions, and you can easily convert for and foreach loops into LINQ statements.
I'm not entirely sure I understand what you're trying to do, but it sounds like it could be this:
var data =
from p in xml.Root.Elements("parent")
where p.Attribute("label").Value == "parent1"
from c in p.Elements("child")
where c.Attribute("label").Value == "child2"
from d in c.Elements()
select d.Value;
Let me know if that helps.
Using this Xml library you can write your XPath like:
XElement child = rootElement.XPathElement(
"//parent[@label={0}]/child[@label={1}]", "parent1", "child2");
Sample xml:
<parent>
<child>test1</child>
<child>test2</child>
</parent>
If I look for parent.Value where parent is XElement, I get "test1test2".
What I am expecting is "" (since there is no text/value for <parent>).
What property of XElement should I be looking for?
When looking for text data in the <parent> element you should look for child nodes that have NodeType properties equal to XmlNodeType.Text. These nodes will be of type XText. The following sample illustrates this:
var p = XElement
.Parse("<parent>Hello<child>test1</child>World<child>test2</child>!</parent>");
var textNodes = from c in p.Nodes()
where c.NodeType == XmlNodeType.Text
select (XText)c;
foreach (var t in textNodes)
{
Console.WriteLine(t.Value);
}
Update: if all you want is the first Text node, if any, here's an example using LINQ method calls instead of query comprehension syntax:
var firstTextNode = p.Nodes().OfType<XText>().FirstOrDefault();
if (firstTextNode != null)
{
var textValue = firstTextNode.Value;
// ...do something interesting with the value
}
Note: using First() or FirstOrDefault() will be more performant than Count() > 0 in this scenario. Count always enumerates the whole collection while FirstOrDefault() will only enumerate until a match is found.
It is amazing that a coder somewhere at Microsoft thought that returning all text values as a concatenated and undelimited string would be useful. Luckily, another MS developer wrote an XElement extension to return what they call the "Shallow Value" of the text node here. For those who get the willies from clicking on links, the function is below...
public static string ShallowValue(this XElement element)
{
return element
.Nodes()
.OfType<XText>()
.Aggregate(new StringBuilder(),
(s, c) => s.Append(c),
s => s.ToString());
}
And you call it like this, because it gives you all the whitespace too (or, come to think of it, you could trim it in the extension, whatever)
// element is a var in your code of type XElement ...
string myTextContent = element.ShallowValue().Trim();
You could concatenate the value of all XText nodes in parent:
XElement parent = XElement.Parse(
@"<parent>Hello<child>test1</child>World<child>test2</child>!</parent>");
string result = string.Concat(
parent.Nodes().OfType<XText>().Select(t => t.Value));
// result == "HelloWorld!"
For comparison:
// parent.Value == "Hellotest1Worldtest2!"
// (parent.HasElements ? "" : parent.Value) == ""
MSDN says:
A String that contains all of the text content of this element. If there are multiple text nodes, they will be concatenated.
So the behaviour is to be expected.
You could solve your problem by doing:
string textContent = parent.HasElements ? "" : parent.Value;
// Create the XElement
XElement parent = XElement.Parse(
@"<parent>Hello<child>test1</child>World<child>test2</child>!</parent>");
// Make a copy
XElement temp = new XElement(parent);
// Remove all child nodes (text included), leaving only the empty root element
temp.RemoveNodes();
// Now do something with temp.Value, e.g.
Console.WriteLine(temp.Value);