I have yet another task I can't accomplish: I am supposed to parse the XML from this site, remove all the nodes that don't have "VIDEO" in their name and then save the result to another XML file. I have no problems with reading and writing, but the removal is giving me trouble. I have tried the Node -> Parent Node -> Child Node workaround, but it did not help:
static void Main(string[] args)
{
using (WebClient wc = new WebClient())
{
string s = wc.DownloadString("http://feeds.bbci.co.uk/news/health/rss.xml");
XmlElement tbr = null;
XmlDocument xml = new XmlDocument();
xml.LoadXml(s);
foreach (XmlNode node in xml["rss"]["channel"].ChildNodes)
{
if (node.Name.Equals("item") && node["title"].InnerText.StartsWith("VIDEO"))
{
Console.WriteLine(node["title"].InnerText);
}
else
{
node.ParentNode.RemoveChild(node);
}
}
xml.Save("NewXmlDoc.xml");
Console.WriteLine("\nDone...");
Console.Read();
}
}
I have also tried the RemoveAll method, which does not work either, because it removes all the nodes not satisfying the "VIDEO" condition.
//same code as above, just the else statement is changed
else
{
node.RemoveAll();
}
Could you help me, please?
I find LINQ to XML easier to use:
var xDoc = XDocument.Load("http://feeds.bbci.co.uk/news/health/rss.xml");
xDoc.Descendants("item")
.Where(item => !item.Element("title").Value.StartsWith("VIDEO"))
.ToList()
.ForEach(item=>item.Remove());
xDoc.Save("NewXmlDoc.xml");
You can also use XPath (this needs using System.Xml.XPath; for the XPathSelectElements extension method):
foreach (var item in xDoc.XPathSelectElements("//item[not(starts-with(title,'VIDEO:'))]")
.ToList())
{
item.Remove();
}
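For completeness: the XmlDocument loop in the question removes nodes from ChildNodes while enumerating it, which typically ends the enumeration after the first removal. Below is a minimal sketch of a fix that stays with XmlDocument, assuming the same rss/channel/item layout as the feed. The class and method names are made up for illustration, and like the LINQ answer it only removes item elements.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class VideoFilter
{
    // Removes every <item> under rss/channel whose <title> does not start with "VIDEO".
    public static void KeepVideoItems(XmlDocument xml)
    {
        XmlNode channel = xml["rss"]["channel"];

        // Snapshot the items first so that removals cannot disturb the enumeration.
        var items = channel.ChildNodes.Cast<XmlNode>()
            .Where(n => n.Name == "item")
            .ToList();

        foreach (XmlNode item in items)
        {
            XmlElement title = item["title"];
            if (title == null || !title.InnerText.StartsWith("VIDEO", StringComparison.Ordinal))
            {
                channel.RemoveChild(item);
            }
        }
    }
}
```

After calling VideoFilter.KeepVideoItems(xml), xml.Save("NewXmlDoc.xml") writes the filtered document as before.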
I'm using XPath to read elements from an XML document. Specifically, I want to return the values of any element which is the child of a specified element (here the specified element is <SceneryType>, and these elements have single-digit values). So I want to return all of the children of <SceneryType> 1, for example.
Here is the XML:
<MissionObjectives>
<Theme themeName="Gothic">
<SceneryType>
1
<Objective>
Do a river thing.
</Objective>
<Objective>
Get all men to the other side of the river.
</Objective>
</SceneryType>
<SceneryType>
2
<Objective>
Climb some trees!
</Objective>
<Objective>
Shoot the tree!
</Objective>
</SceneryType>
</Theme>
</MissionObjectives>
I've tried various ways of getting these child elements, but I can't figure it out. The //Objective part of my expression seems to just return everything from the root, yet the iterator never runs, which seems odd: shouldn't it loop through every element if the expression returns a node list of all the elements?
XPathDocument missionDoc = new XPathDocument(objectivesPath + "MissionObjectives" + chosenTheme + ".xml");
XPathNavigator nav = missionDoc.CreateNavigator();
foreach (Scenery scenery in world.currentWorld)
{
int sceneryType = scenery.type;
XPathExpression expr = nav.Compile($"MissionObjectives/Theme/SceneryType[text()='{sceneryType}']//Objective");
XPathNodeIterator iterator = nav.Select(expr);
while (iterator.MoveNext())
{
XPathNavigator nav2 = iterator.Current.Clone();
compatibleObjectivesList.Add(nav2.Value);
}
}
I've tried looking through Stack Overflow for similar questions but I can't seem to find anything which applies to XPath. I can't use LINQ to XML for this. Any idea how I can return all the values of the various 'Objective' nodes?
Cheers for any help!
It's much simpler to use XDocument:
var doc = XDocument.Load(objectivesPath + "MissionObjectives" + chosenTheme + ".xml");
To get the first SceneryType element (and with it all of its child nodes):
var node = doc.XPathSelectElement("//MissionObjectives/Theme/SceneryType[1]");
To get the second Objective node:
var node = doc.XPathSelectElement("//MissionObjectives/Theme/SceneryType/Objective[2]");
more xpath samples
For one, your XML data has carriage returns, line feeds, and whitespace in the search element's text node. Keep in mind that an XML node can be an element, an attribute, or text (among other node types). The solution below is a bit on the "long-handed" side and perhaps a little "hacky", but it should work. I wasn't certain whether you wanted the child element text data or the entire child element, so I return just the child text node data (without carriage returns and line feeds). Also, while this solution DOES NOT use LINQ to XML in the strictest sense, it does use one LINQ expression.
private List<string> getSceneryTypeObjectiveTextList(string xml, int sceneryTypeId, string xpath = "/MissionObjectives/Theme/SceneryType")
{
List<string> result = null;
XmlDocument doc = null;
XmlNodeList sceneryTypeNodes = null;
try
{
doc = new XmlDocument();
doc.LoadXml(xml);
sceneryTypeNodes = doc.SelectNodes(xpath);
if (sceneryTypeNodes != null)
{
if (sceneryTypeNodes.Count > 0)
{
foreach (XmlNode sceneryTypeNode in sceneryTypeNodes)
{
if (sceneryTypeNode.HasChildNodes)
{
var textNode = from XmlNode n in sceneryTypeNode.ChildNodes
where (n.NodeType == XmlNodeType.Text && n.Value.Replace("\r", "").Replace("\n", "").Replace(" ", "") == sceneryTypeId.ToString())
select n;
if (textNode.Count() > 0)
{
XmlNodeList objectiveNodes = sceneryTypeNode.SelectNodes("Objective");
if (objectiveNodes != null)
{
result = new List<string>(objectiveNodes.Count);
foreach (XmlNode objectiveNode in objectiveNodes)
{
result.Add(objectiveNode.InnerText.Replace("\r", "").Replace("\n", "").Trim());
}
// Could break out of the iteration, here, if we know that SceneryType is always unique (i.e. - no duplicates in Element text node)
}
}
}
}
}
}
}
catch (Exception ex)
{
// Handle error
}
finally
{
}
return result;
}
private void sampleCall(string filePath, int sceneryTypeId)
{
List<string> compatibleObjectivesList = null;
try
{
compatibleObjectivesList = getSceneryTypeObjectiveTextList(File.ReadAllText(filePath), sceneryTypeId);
}
catch (Exception ex)
{
// Handle error
}
finally
{
}
}
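Since the question rules out LINQ to XML, the whitespace problem described above can also be handled inside the XPath expression itself with normalize-space(). The following is only a sketch, assuming the XML layout from the question; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class ObjectiveLookup
{
    // Returns the trimmed text of every <Objective> under the <SceneryType>
    // whose leading text node equals sceneryType once whitespace is normalized.
    public static List<string> GetObjectives(string xml, int sceneryType)
    {
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();

        // text()[1] is the first text child ("\n1\n" in the sample);
        // normalize-space() strips the surrounding line breaks and spaces.
        string expr = "/MissionObjectives/Theme/SceneryType" +
                      "[normalize-space(text()[1])='" + sceneryType + "']/Objective";

        var result = new List<string>();
        XPathNodeIterator iterator = nav.Select(expr);
        while (iterator.MoveNext())
        {
            result.Add(iterator.Current.Value.Trim());
        }
        return result;
    }
}
```

The original predicate text()='1' compares against the raw text node, line breaks included, which would explain why the iterator in the question never advances.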
How can I remove the xmlns namespace from an XElement?
I tried: attributes.remove, xElement.Name.NameSpace.Remove(0), etc, etc. No success.
My xml:
<event xmlns="http://www.blablabla.com/bla" version="1.00">
<retEvent version="1.00">
</retEvent>
</event>
How can I accomplish this?
@octaviocc's answer did not work for me because xelement.Attributes() was empty; it wasn't returning the namespace as an attribute.
The following will remove the declaration in your case:
element.Name = element.Name.LocalName;
If you want to do it recursively for your element and all child elements use the following:
private static void RemoveAllNamespaces(XElement element)
{
element.Name = element.Name.LocalName;
foreach (var node in element.DescendantNodes())
{
var xElement = node as XElement;
if (xElement != null)
{
RemoveAllNamespaces(xElement);
}
}
}
I'd like to expand upon the existing answers. Specifically, I'd like to refer to a common use-case for removing namespaces from an XElement, which is: to be able to use Linq queries in the usual way.
When a tag contains a namespace, one has to use this namespace as an XNamespace on every Linq query (as explained in this answer), so that with the OP's xml, it would be:
XNamespace ns = "http://www.blablabla.com/bla";
var element = xelement.Descendants(ns + "retEvent").Single();
But usually, we don't want to use this namespace every time. So we need to remove it.
Now, @octaviocc's suggestion does remove the namespace attribute from a given element. However, the element name still contains that namespace, so the usual LINQ queries won't work.
Console.WriteLine(xelement.Attributes().Count()); // prints 1
xelement.Attributes().Where( e => e.IsNamespaceDeclaration).Remove();
Console.WriteLine(xelement.Attributes().Count()); // prints 0
Console.WriteLine(xelement.Name.Namespace); // prints "http://www.blablabla.com/bla"
XNamespace ns = "http://www.blablabla.com/bla";
var element1 = xelement.Descendants(ns + "retEvent").SingleOrDefault(); // works
var element2 = xelement.Descendants("retEvent").SingleOrDefault(); // returns null
Thus, we need to use @Sam Shiles's suggestion, but it can be simplified (no need for recursion):
private static void RemoveAllNamespaces(XElement xElement)
{
foreach (var node in xElement.DescendantsAndSelf())
{
node.Name = node.Name.LocalName;
}
}
And if one needs to use an XDocument:
private static void RemoveAllNamespaces(XDocument xDoc)
{
foreach (var node in xDoc.Root.DescendantsAndSelf())
{
node.Name = node.Name.LocalName;
}
}
And now it works:
var element = xelement.Descendants("retEvent").SingleOrDefault();
You could use IsNamespaceDeclaration to detect which attributes are namespace declarations:
xelement.Attributes()
.Where( e => e.IsNamespaceDeclaration)
.Remove();
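The two ideas in this thread can be combined: renaming each element to its local name makes plain LINQ queries work, and dropping the namespace declaration attributes keeps the xmlns attribute out of the saved output. This is a sketch under that assumption; the class and method names are illustrative only, and attributes that carry their own namespace prefix are not touched.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NamespaceStripper
{
    // Moves every element into "no namespace" and drops xmlns declarations.
    public static void StripNamespaces(XElement root)
    {
        foreach (XElement el in root.DescendantsAndSelf())
        {
            // Keep only the local part of the element name.
            el.Name = el.Name.LocalName;

            // Remove xmlns="..." and xmlns:prefix="..." attributes;
            // Remove() snapshots the sequence before deleting.
            el.Attributes().Where(a => a.IsNamespaceDeclaration).Remove();
        }
    }
}
```

After NamespaceStripper.StripNamespaces(xelement), a query such as xelement.Descendants("retEvent") matches without an XNamespace.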
I am reading an XML that contains a tag like this:
<source><bpt id="1">&lt;donottranslate&gt;</bpt><ph id="2">($ T_353_1 Parent ID $)</ph><ept id="1">&lt;/donottranslate&gt;</ept></source>
When reading source node I get that this node type is Text, but it should be Element.
This is an XML that I am receiving and I cannot change it.
Do you know how I can get this sorted out?
This is my code:
XDocument doc = XDocument.Load(fileName, LoadOptions.PreserveWhitespace);
foreach (var elUnit in doc.Descendants("trans-unit"))
{
if (elUnit.AttributeString("translate").ToString() == "no")
{
foreach (var elSource in elUnit.Elements("source"))
{
string text = "";
foreach (var node in elSource.DescendantNodes().Where(n => XmlNodeType.Text == n.NodeType).ToList())
{
//When reading that "source" node, it enters inside this code
Thanks
First check whether your XML is well-formed:
http://www.w3schools.com/xml/xml_validator.asp
http://chris.photobooks.com/xml/default.htm
I could get this to work
//using System.Xml.Linq;
var str = "<source><bpt id=\"1\">&lt;donottranslate&gt;</bpt>" +
"<ph id=\"2\">($ T_353_1 Parent ID $)</ph>" +
"<ept id=\"1\">&lt;/donottranslate&gt;</ept></source>";
XElement element = XElement.Parse(str);
Console.WriteLine(element);
The output is this
<source>
<bpt id="1">&lt;donottranslate&gt;</bpt>
<ph id="2">($ T_353_1 Parent ID $)</ph>
<ept id="1">&lt;/donottranslate&gt;</ept>
</source>
Please provide a code sample if this example is not sufficient and you need more help.
Finally, I solved this by checking whether the node text is valid:
if (System.Security.SecurityElement.IsValidText(text.XmlDecodeEntities()))
I am having trouble parsing an XML file. A sample is below.
<G_LOG>
<LINE>9206</LINE>
<TEXT>Generating
</TEXT>
</G_LOG>
<G_LOG>
<LINE>9207</LINE>
<TEXT>Inserted Actual
</TEXT>
</G_LOG>
OK, so this is just a snapshot of thousands of nodes in the file. I need to search for the TEXT "Inserted Actual" and remove not only this node, but the previous node as well. So it would find the text on line 9207 and remove 9206 as well (removing everything in the above snippet).
I can search for the lines I want to remove.
XDocument xmlDoc = XDocument.Load(filename);
var q = from c in xmlDoc.Descendants("G_LOG")
where c.Element("TEXT").Value.Contains("Inserted Actual")
select (string)c.Element("LINE");
foreach (string name in q)
{
Console.WriteLine("Actuals Success on ID : " + name);
}
But I am unsure how to obtain the previous node and remove it as well (without buckets of code).
var elementsToRemove =
from logElement in xml.Descendants("G_LOG")
where logElement.Element("TEXT").Value.Contains("Inserted Actual")
from element in new[] { logElement, logElement.PreviousNode }
select element;
foreach(var element in elementsToRemove.ToList())
{
element.Remove();
}
A couple of things to note:
The second from flattens out each node and its previous node into one sequence
The .ToList() eagerly evaluates the query, ensuring we don't remove a node while evaluating
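A slightly more defensive variant of the same idea is sketched below. It assumes the "previous node" should be the preceding G_LOG element, skips a match that has no predecessor instead of hitting a null PreviousNode, and collects elements in a set so that nothing is removed twice. The class and method names are made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LogCleaner
{
    // Removes each G_LOG whose TEXT contains marker, plus the G_LOG before it.
    // Returns the number of elements removed.
    public static int RemoveMatchesWithPrevious(XContainer xml, string marker)
    {
        var toRemove = new HashSet<XElement>();

        foreach (XElement log in xml.Descendants("G_LOG"))
        {
            XElement text = log.Element("TEXT");
            if (text == null || !text.Value.Contains(marker))
            {
                continue;
            }

            toRemove.Add(log);

            // The closest preceding G_LOG sibling, or null for the first entry.
            XElement previous = log.ElementsBeforeSelf("G_LOG").LastOrDefault();
            if (previous != null)
            {
                toRemove.Add(previous);
            }
        }

        // Removal happens only after the query above has finished.
        foreach (XElement element in toRemove)
        {
            element.Remove();
        }
        return toRemove.Count;
    }
}
```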
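XElement xmlDoc = XElement.Load("c:\\temp.xml");
// ToList() snapshots the matches so that removing nodes cannot disturb the query
var q = xmlDoc.Descendants("G_LOG").Where(c=>c.Element("TEXT").Value.Contains("Inserted Actual")).Select(d=>d.Element("LINE")).ToList();
foreach(XElement elm in q)
{
if(elm.Parent.ElementsBeforeSelf().Any())
elm.Parent.PreviousNode.Remove();
// Remove() deletes the G_LOG element itself; RemoveNodes() would only empty it
elm.Parent.Remove();
}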
I want to change the order of XML using XDocument
<root>
<one>1</one>
<two>2</two>
</root>
I want to change the order so that 2 appears before 1. Is this capability baked in, or do I have to do it myself, for example with Remove and then AddBeforeSelf()?
Thanks
Similar to above, but wrapping it in an extension method. In my case this works fine for me as I just want to ensure a certain element order is applied in my document before the user saves the xml.
public static class XElementExtensions
{
public static void OrderElements(this XElement parent, params string[] orderedLocalNames)
{
List<string> order = new List<string>(orderedLocalNames);
var orderedNodes = parent.Elements().OrderBy(e => order.IndexOf(e.Name.LocalName) >= 0? order.IndexOf(e.Name.LocalName): Int32.MaxValue);
parent.ReplaceNodes(orderedNodes);
}
}
// using the extension method before persisting xml
this.Root.Element("parentNode").OrderElements("one", "two", "three", "four");
Try this solution...
XElement node = ...get the element...
//Move up
if (node.PreviousNode != null) {
node.PreviousNode.AddBeforeSelf(node);
node.Remove();
}
//Move down
if (node.NextNode != null) {
node.NextNode.AddAfterSelf(node);
node.Remove();
}
This should do the trick. It orders the child nodes of the root based on their content and then changes their order in the document. This is likely not the most efficient way, but judging by your tags you wanted to see it with LINQ.
static void Main(string[] args)
{
XDocument doc = new XDocument(
new XElement("root",
new XElement("one", 1),
new XElement("two", 2)
));
var results = from XElement el in doc.Element("root").Elements()
orderby el.Value descending
select el;
foreach (var item in results)
Console.WriteLine(item);
doc.Root.ReplaceAll( results.ToArray());
Console.WriteLine(doc);
Console.ReadKey();
}
Outside of writing C# code to achieve this, you could use XSLT to transform the XML.
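If the transform still has to be triggered from C#, XslCompiledTransform can apply such a stylesheet. The sketch below hard-codes a tiny stylesheet for the root/one/two sample from the question. The stylesheet, class name and method name are assumptions made for illustration, not part of the original answer.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

public static class XsltReorder
{
    // Illustrative stylesheet: rebuilds <root> with <two> ahead of <one>.
    private const string Stylesheet =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:template match='/root'>" +
        "<root><xsl:copy-of select='two'/><xsl:copy-of select='one'/></root>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    public static XDocument Reorder(XDocument input)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        // Transform straight from one XDocument into another.
        var output = new XDocument();
        using (XmlReader reader = input.CreateReader())
        using (XmlWriter writer = output.CreateWriter())
        {
            xslt.Transform(reader, writer);
        }
        return output;
    }
}
```

For a one-off reordering, the ReplaceNodes-based answers above are probably simpler; a stylesheet pays off mainly when the ordering rules live outside the code.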