I have an xmlString that I am parsing to an XDocument:
string xmlString =
"<TestXml>" +
"<Data>" +
"<leadData>" +
"<Email>testEmail@yahoo.ca</Email>" +
"<FirstName>John</FirstName>" +
"<LastName>Doe</LastName>" +
"<Phone>555-555-5555</Phone>" +
"<AddressLine1>123 Fake St</AddressLine1>" +
"<AddressLine2></AddressLine2>" +
"<City>Metropolis</City>" +
"<State>DC</State>" +
"<Zip>20016</Zip>" +
"</leadData>" +
"</Data>" +
"</TestXml>";
I parse the string to an XDocument, and then try and iterate through the nodes:
XDocument xDoc = XDocument.Parse(xmlString);
Dictionary<string, string> xDict = new Dictionary<string, string>();
//Convert xDocument to Dictionary
foreach (var child in xDoc.Root.Elements())
{
//xDict.Add();
}
This will only iterate once, and the one iteration seems to have all of the data in it. I realize I am doing something wrong, but after googling around I have no idea what.
Try xDoc.Root.Descendants() instead of xDoc.Root.Elements() in your foreach loop.
Your root has only one child, Data, so the loop iterates only once. To build the dictionary from the fields under leadData:
var xDict = XDocument.Parse(xmlString)
.Descendants("leadData")
.Elements()
.ToDictionary(e => e.Name.LocalName, e => (string)e);
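To see why the original loop ran only once, it can help to count what each method returns. This is a minimal sketch, assuming a trimmed-down copy of the question's XML; the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ElementsVsDescendants
{
    // Hypothetical, shortened sample with the same nesting as the question's XML.
    public const string SampleXml =
        "<TestXml><Data><leadData>" +
        "<FirstName>John</FirstName><LastName>Doe</LastName><AddressLine2></AddressLine2>" +
        "</leadData></Data></TestXml>";

    // Elements() returns only the direct children of the root.
    public static int DirectChildCount(string xml) =>
        XDocument.Parse(xml).Root.Elements().Count();

    // Descendants() walks the whole subtree below the root.
    public static int DescendantCount(string xml) =>
        XDocument.Parse(xml).Root.Descendants().Count();

    public static void Main()
    {
        Console.WriteLine(DirectChildCount(SampleXml)); // 1 (just Data)
        Console.WriteLine(DescendantCount(SampleXml));  // 5 (Data, leadData and the three fields)
    }
}
```

Note that Descendants() also yields the container elements Data and leadData, which is why the dictionary version above narrows the query to the children of leadData.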
I am developing a quiz program using Unity (C#) where the program gets its questions from XML files. How can I add math formulas to the XML file so that the equations are displayed correctly in the program? I tried to use MathML, but Unity doesn't recognize it: instead of rendering the equation, it displays the raw MathML markup. Here is a sample of my code for loading the XML file from Resources and parsing it:
public string parseXmlFile(string xmlData)
{
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(new StringReader(xmlData));
string xmlPathPattern = "//data-set/record";
XmlNodeList myNodeList = xmlDoc.SelectNodes(xmlPathPattern);
foreach (XmlNode node in myNodeList)
{
XmlNode q1 = node.FirstChild;
XmlNode an1 = q1.NextSibling;
XmlNode q2 = an1.NextSibling;
XmlNode an2 = q2.NextSibling;
XmlNode q3 = an2.NextSibling;
XmlNode an3 = q3.NextSibling;
totValue += q1.InnerXml + "|" + q2.InnerXml + "|" + q3.InnerXml+ "|"+ an1.InnerXml + "|" + an2.InnerXml + "|" + an3.InnerXml + "|"
+ "/";
}
return totValue;
}
_xml = Resources.Load<TextAsset>(xmlfile);
string totval = parseXmlFile(_xml.text);
totval = totval.Replace("\n", "").Replace("\r", "");
allFile = totval.Split('/');
Text2.text = allFile[0];
After that, I use the elements of allFile to display the text in the UI.
I'm having trouble with a loop.
I'm using HtmlAgilityPack. I have a TXT file with several links (one per line). For each link in the file, I want to load the page, extract a node via XPath, and write the result to a memo control.
The problem is that the code only shows the result for the last line of the TXT file. Where am I going wrong?
var Webget = new HtmlWeb();
foreach (string line in File.ReadLines("c:\\test.txt"))
{
var doc = Webget.Load(line);
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//*[@id='title-article']"))
{
memoEdit1.Text = node.ChildNodes[0].InnerHtml + "\r\n";
break;
}
}
try to change
memoEdit1.Text = node.ChildNodes[0].InnerHtml + "\r\n";
to
memoEdit1.Text += node.ChildNodes[0].InnerHtml + "\r\n";
You're overwriting memoEdit1.Text every time. Try
memoEdit1.Text += node.ChildNodes[0].InnerHtml + "\r\n";
instead - note the += instead of =, which adds the new text every time.
Incidentally, constantly appending strings together isn't really the best way. Something like this might be better:
var Webget = new HtmlWeb();
var builder = new StringBuilder();
foreach (string line in File.ReadLines("c:\\test.txt"))
{
var doc = Webget.Load(line);
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//*[@id='title-article']"))
{
builder.AppendFormat("{0}\r\n", node.ChildNodes[0].InnerHtml);
break;
}
}
memoEdit1.Text = builder.ToString();
Or, using LINQ:
var Webget = new HtmlWeb();
memoEdit1.Text = string.Join(
"\r\n",
File.ReadAllLines("c:\\test.txt")
.Select(line => Webget.Load(line).DocumentNode.SelectNodes("//*[@id='title-article']").First().ChildNodes[0].InnerHtml));
If you are only selecting one node in the inner loop, use SelectSingleNode instead. Also, the better practice when concatenating strings in a loop is to use a StringBuilder:
StringBuilder builder = new StringBuilder();
var Webget = new HtmlWeb();
foreach (string line in File.ReadLines("c:\\test.txt"))
{
var doc = Webget.Load(line);
builder.AppendLine(doc.DocumentNode.SelectSingleNode("//*[@id='title-article']").InnerHtml);
}
memoEdit1.Text = builder.ToString();
Using linq it will look like this:
var Webget = new HtmlWeb();
var result = File.ReadLines("c:\\test.txt")
.Select(line => Webget.Load(line).DocumentNode.SelectSingleNode("//*[@id='title-article']").InnerHtml);
memoEdit1.Text = string.Join(Environment.NewLine, result);
I'm trying to get the inner text of an XML tag called share_count, and I need to look it up by the normalized_url tag, like so:
string ShareCount;
string Url = "'" + "http://www.example.com/Article.aspx?id=" + GuideID + "'";
XmlNode Node;
try
{
//return Share count (using Xpath).
XmlDoc.SelectSingleNode("fql_query_response/link_stat[normalized_url=" + Url + "]/share_count");
ShareCount = XmlDoc.InnerText;
int.TryParse(ShareCount, out Value); //Turn string to int.
return Value;
}
catch
{
return 0;
}
And this is the XML:
<fql_query_response xmlns="http://api.facebook.com/1.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" list="true">
<link_stat>
<url>http://www.example.com/Article.aspx?id=1909</url>
<normalized_url>http://www.example.com/Article.aspx?id=1909</normalized_url>
<share_count>11</share_count>
<like_count>3</like_count>
<comment_count>0</comment_count>
<total_count>14</total_count>
<commentsbox_count>8</commentsbox_count>
<comments_fbid>10150846665210566</comments_fbid>
<click_count>0</click_count>
</link_stat>
<link_stat>
<url>http://www.example.com/Article.aspx?id=1989</url>
<normalized_url>http://www.example.com/Article.aspx?id=1989</normalized_url>
<share_count>11</share_count>
<like_count>3</like_count>
<comment_count>0</comment_count>
<total_count>14</total_count>
<commentsbox_count>8</commentsbox_count>
<comments_fbid>10150846665210566</comments_fbid>
<click_count>0</click_count>
</link_stat>
</fql_query_response>
The thing is, the return value I get is: "www.example.com/Article.aspx?id=1132http://www.example.com/Article.aspx?id=190900000101502138970422760". What am I doing wrong? Thanks!
The problem is that you are selecting a single node but then reading the content from the document root (XmlDoc.InnerText) instead of from the selected node. On top of that, the root element declares a default namespace, so an XmlNamespaceManager is required for XPath searches. Try this sample:
string GuideID = "1989";
string Url = "'" + "http://www.example.com/Article.aspx?id=" + GuideID + "'";
var XmlDoc = new XmlDocument();
XmlDoc.Load(new FileStream("XMLFile1.xml",FileMode.Open,FileAccess.Read));
var nsm = new XmlNamespaceManager(XmlDoc.NameTable);
nsm.AddNamespace("s", "http://api.facebook.com/1.0/");
var node = XmlDoc.SelectSingleNode("s:fql_query_response/s:link_stat[s:normalized_url=" + Url + "]/s:share_count", nsm);
var ShareCount = node.InnerText;
And here is the LINQ to XML way, using the same namespace manager with XPathSelectElements (requires using System.Xml.XPath):
XDocument xdoc = XDocument.Load(new FileStream("XMLFile1.xml", FileMode.Open, FileAccess.Read));
var lnode = xdoc.XPathSelectElements("s:fql_query_response/s:link_stat[s:normalized_url=" + Url + "]/s:share_count",nsm).First();
var ret = lnode.Value;
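If you would rather avoid XPath strings altogether, the same lookup can be written with XNamespace. This is only a sketch under assumptions: the ShareCountLookup class, its method name, and the shortened sample XML are invented for illustration, and it returns 0 when nothing matches, mirroring the catch block in the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ShareCountLookup
{
    // Hypothetical, shortened sample that keeps the default namespace from the question.
    public const string SampleXml =
        "<fql_query_response xmlns=\"http://api.facebook.com/1.0/\">" +
        "<link_stat>" +
        "<normalized_url>http://www.example.com/Article.aspx?id=1909</normalized_url>" +
        "<share_count>11</share_count>" +
        "</link_stat>" +
        "</fql_query_response>";

    public static int GetShareCount(string xml, string normalizedUrl)
    {
        // Every element inherits the default namespace, so each name must be qualified.
        XNamespace ns = "http://api.facebook.com/1.0/";

        XElement stat = XDocument.Parse(xml)
            .Descendants(ns + "link_stat")
            .FirstOrDefault(e => (string)e.Element(ns + "normalized_url") == normalizedUrl);

        if (stat == null)
        {
            return 0; // no link_stat for this URL
        }

        // The (string) cast yields null for a missing element, and TryParse then fails cleanly.
        return int.TryParse((string)stat.Element(ns + "share_count"), out int value) ? value : 0;
    }

    public static void Main()
    {
        Console.WriteLine(GetShareCount(SampleXml, "http://www.example.com/Article.aspx?id=1909")); // 11
        Console.WriteLine(GetShareCount(SampleXml, "http://www.example.com/Article.aspx?id=42"));   // 0
    }
}
```

Comparing the URL in C# rather than splicing it into an XPath string also sidesteps quoting problems if the URL ever contains an apostrophe.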
Thank you very much for reading my question.
This is my XML file (the Songs node has many child nodes named Song):
<?xml version="1.0" encoding="utf-8" ?>
<xmlData>
<version>1.0</version>
<Songs>
<Song>
<artist>mic</artist>
<track>2</track>
<column>happy</column>
<date>14</date>
</Song>
<Song>
<artist>cool</artist>
<track>2</track>
<column>work</column>
<date>4</date>
</Song>
</Songs>
</xmlData>
To read the XML, I use the following code:
XmlDocument doc = new XmlDocument();
doc.Load(xmlFilePath);
XmlNode versionNode = doc.SelectSingleNode(@"/xmlData/version");
Console.WriteLine(versionNode.Name + ":\t" + versionNode.InnerText);
XmlNode SongsNode = doc.SelectSingleNode(@"/xmlData/Songs");
Console.WriteLine(SongsNode.Name + "\n");
XmlDocument docSub = new XmlDocument();
docSub.LoadXml(SongsNode.OuterXml);
XmlNodeList SongList = docSub.SelectNodes(@"/Songs/Song");
if (SongList != null)
{
foreach (XmlNode SongNode in SongList)
{
XmlNode artistDetail = SongNode.SelectSingleNode("artist");
Console.WriteLine(artistDetail.Name + "\t: " + artistDetail.InnerText);
XmlNode trackDetail = SongNode.SelectSingleNode("track");
Console.WriteLine(trackDetail.Name + "\t: " + trackDetail.InnerText);
XmlNode columnDetail = SongNode.SelectSingleNode("column");
Console.WriteLine(columnDetail.Name + "\t: " + columnDetail.InnerText);
XmlNode dateDetail = SongNode.SelectSingleNode("date");
Console.WriteLine(dateDetail.Name + "\t: " + dateDetail.InnerText + "\n");
}
}
It seems to work.
But how can I write changes back to the XML file?
I may want to change some child node of a Song, or delete a whole Song node by artist keyword.
Is it possible with functions such as these?
bool DeleteSongByArtist(string sArtist);
bool ChangeNodeInSong(string sArtist, string sNodeName, string value);
Because the reading solution uses XmlDocument, it would be better if the changing solution also used XmlDocument.
But if you have a better idea for reading and changing the XML file, please give me sample code. Please don't just write the name of a solution such as "Linq to xml"; actually, I did many tests with it, but failed.
Welcome to Stackoverflow!
You can change the nodes simply by setting a new .Value or in your case .InnerText.
Sample
// change the node
trackDetail.InnerText = "NewValue";
// save the document
doc.Save(xmlFilePath);
More Information
How To: Modify an Existing Xml File
MSDN - XmlDocument.Save Method
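Putting those two steps together, the ChangeNodeInSong function from the question could be sketched roughly as follows. The extra XmlDocument parameter, the SongEditor class name, and the shortened sample XML are assumptions for illustration; the question's own signature has no doc parameter.

```csharp
using System;
using System.Xml;

public static class SongEditor
{
    // Sets the text of <nodeName> in every Song whose <artist> equals the given artist.
    // Returns true if at least one node was changed.
    public static bool ChangeNodeInSong(XmlDocument doc, string artist, string nodeName, string value)
    {
        bool changed = false;
        foreach (XmlNode song in doc.SelectNodes("/xmlData/Songs/Song"))
        {
            XmlNode artistNode = song["artist"]; // first child element named "artist", or null
            XmlNode target = song[nodeName];
            if (artistNode != null && artistNode.InnerText == artist && target != null)
            {
                target.InnerText = value;
                changed = true;
            }
        }
        return changed;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<xmlData><Songs><Song><artist>mic</artist><track>2</track></Song></Songs></xmlData>");

        Console.WriteLine(ChangeNodeInSong(doc, "mic", "track", "5"));                  // True
        Console.WriteLine(doc.SelectSingleNode("/xmlData/Songs/Song/track").InnerText); // 5

        // With a document loaded from disk, doc.Save(xmlFilePath) would then persist the change.
    }
}
```

The artist comparison is done in C# instead of inside the XPath expression, so artist names containing quotes do not break the query.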
You need to use an XmlWriter. The easiest way to do it would be something like this...
using (XmlWriter writer = XmlWriter.Create(textWriter))
{
doc.WriteTo(writer);
}
Where textWriter is your initialized Text Writer.
Actually, forget that... the easiest way is to call...
doc.Save(xmlFilePath);
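To delete songs by artist name, add the following method:
bool DeleteSongByArtist(XmlDocument doc, string artistName)
{
    bool removed = false;
    // Query from the document root, whose element is xmlData
    XmlNodeList SongList = doc.SelectNodes(@"/xmlData/Songs/Song");
    if (SongList != null)
    {
        // Iterate backwards so removals do not disturb the remaining indexes
        for (int i = SongList.Count - 1; i >= 0; i--)
        {
            XmlNode artistNode = SongList[i]["artist"];
            if (artistNode != null && artistNode.InnerText == artistName && SongList[i].ParentNode != null)
            {
                SongList[i].ParentNode.RemoveChild(SongList[i]);
                removed = true;
            }
        }
    }
    return removed;
}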
You probably want to clean it up a bit more to be more resilient. When you call it, change your initial code to be like this. Don't create the subDocument as you want to work with the entire XmlDocument.
XmlDocument doc = new XmlDocument();
doc.Load(xmlFilePath);
XmlNode versionNode = doc.SelectSingleNode(@"/xmlData/version");
Console.WriteLine(versionNode.Name + ":\t" + versionNode.InnerText);
XmlNode SongsNode = doc.SelectSingleNode(@"/xmlData/Songs");
Console.WriteLine(SongsNode.Name + "\n");
XmlNodeList SongList = doc.SelectNodes(@"/xmlData/Songs/Song");
if (SongList != null)
{
foreach (XmlNode SongNode in SongList)
{
XmlNode artistDetail = SongNode.SelectSingleNode("artist");
Console.WriteLine(artistDetail.Name + "\t: " + artistDetail.InnerText);
XmlNode trackDetail = SongNode.SelectSingleNode("track");
Console.WriteLine(trackDetail.Name + "\t: " + trackDetail.InnerText);
XmlNode columnDetail = SongNode.SelectSingleNode("column");
Console.WriteLine(columnDetail.Name + "\t: " + columnDetail.InnerText);
XmlNode dateDetail = SongNode.SelectSingleNode("date");
Console.WriteLine(dateDetail.Name + "\t: " + dateDetail.InnerText + "\n");
}
}
You aren't able to save your changes because you made changes to an entirely new document!
You likely meant to do the following:
XmlNode SongsNode = doc.SelectSingleNode(@"/xmlData/Songs");
Console.WriteLine(SongsNode.Name + "\n");
// Don't make a new XmlDocument here! Use your existing one
// (relative path: a leading "/" would search from the document root)
XmlNodeList SongList = SongsNode.SelectNodes("Song");
At this point SongList is still living inside doc. Now when you call:
doc.Save(xmlFilePath);
Your changes will be saved as you intended.
If you're looking to delete nodes that match certain criteria:
// Use XPath to find the matching node
XmlNode song = SongsNode.SelectSingleNode("Song[artist='" + artist + "']");
// Remove it from its parent (SelectSingleNode returns null when nothing matches)
if (song != null) SongsNode.RemoveChild(song);
If you're looking to add a new node:
// Create the new nodes using doc
XmlNode newSong = doc.CreateElement("Song");
XmlNode artist = doc.CreateElement("artist");
artist.InnerText = "Hello";
// Begin the painstaking process of creation/appending
newSong.AppendChild(artist);
// rinse...repeat...
// Finally add the new song to the SongsNode
SongsNode.AppendChild(newSong);
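A note on the XPath strings passed to SongsNode.SelectNodes and SongsNode.SelectSingleNode: in XPath, an expression that begins with / is resolved from the document root even when it is called on a child node, so a path meant to be relative to SongsNode should be written without the leading slash. A minimal sketch, using a hypothetical RelativeXPathDemo class and a cut-down copy of the question's XML:

```csharp
using System;
using System.Xml;

public static class RelativeXPathDemo
{
    // Hypothetical, shortened sample with two songs.
    public const string SampleXml =
        "<xmlData><version>1.0</version><Songs>" +
        "<Song><artist>mic</artist></Song>" +
        "<Song><artist>cool</artist></Song>" +
        "</Songs></xmlData>";

    // Runs the given XPath with the Songs node as context and counts the matches.
    public static int CountFromSongsNode(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode songsNode = doc.SelectSingleNode("/xmlData/Songs");
        return songsNode.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        // "/Song" asks for a root element named Song; the root is xmlData, so nothing matches.
        Console.WriteLine(CountFromSongsNode(SampleXml, "/Song")); // 0

        // "Song" is relative to the context node and finds its Song children.
        Console.WriteLine(CountFromSongsNode(SampleXml, "Song"));  // 2
    }
}
```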
You could do
XmlNodeList SongList = doc.SelectNodes(@"//Songs/Song");
The // tells it to select the Songs node anywhere in the document. This is more flexible than spelling out the full path, as in
doc.SelectNodes(@"/document/level1/music/Songs")
Note that the path above is obviously not for your XML; it only illustrates the point about //.
Using // removes the need for your docSub document and SongsNode element.
To add then delete a song, just use the following
XmlDocument doc = new XmlDocument();
doc.Load(xmlFilePath);
//element names are case-sensitive: "Songs", not "songs"
XmlElement ea = (XmlElement)doc.SelectSingleNode("//Songs");
XmlElement el = doc.CreateElement("Song");
XmlElement er;
ea.AppendChild(el);
//doing my work with el
//you could use InnerXml.
el.InnerXml = "<artist>Judas Priest</artist><track>7</track><column>good</column><date>1</date>";
//or you can build each child node one at a time
er = doc.CreateElement("Name");
el.AppendChild(er);
er.InnerText = "The Ripper";
//but you don't need this song any more?
ea.RemoveChild(el);
//so it's gone.
And that's all there is to it.
Please don't just write the name of a solution such as "Linq to xml"; actually, I did many tests with it, but failed.
Still I think, this is a very good time to start to use Linq2Xml. If you don't like it, just ignore.
XDocument xDoc = XDocument.Load(new StringReader(xml));
//Load Songs
var songs = xDoc.Descendants("Song")
.Select(s => new
{
Artist = s.Element("artist").Value,
Track = s.Element("track").Value,
Column = s.Element("column").Value,
Date = s.Element("date").Value,
})
.ToArray();
//Delete Songs
string songByArtist="mic";
xDoc.Descendants("Song")
.Where(s => s.Element("artist").Value == songByArtist)
.Remove();
string newXml = xDoc.ToString();
XDocument coordinates = XDocument.Load("http://feeds.feedburner.com/TechCrunch");
System.IO.StreamWriter StreamWriter1 = new System.IO.StreamWriter(DestFile);
XNamespace nsContent = "http://purl.org/rss/1.0/modules/content/";
string pchild = null;
foreach (var item in coordinates.Descendants("item"))
{
string link = item.Element("guid").Value;
//string content = item.Element(nsContent + "encoded").Value;
foreach (var child in item.Descendants(nsContent + "encoded"))
{
pchild = pchild + child.Element("p").Value;
}
StreamWriter1.WriteLine(link + Environment.NewLine + Environment.NewLine + pchild + Environment.NewLine);
}
StreamWriter1.Close();
If I use the commented-out line (string content = item.Element(nsContent + "encoded").Value;) instead of the inner foreach loop, it fetches the value of the <content:encoded> element, but that contains all the links, images, and so on, and I want only the text.
To filter it, I tried the inner foreach loop, but it throws this error:
Object reference not set to an instance of an object.
Please suggest code that stores only the text and removes everything else (links, <img> tags, etc.).
The content of item.Element(nsContent + "encoded").Value is HTML, not XML. You should parse it accordingly, for example with Html Agility Pack.
See the example below:
string content = item.Element(nsContent + "encoded").Value;
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(new StringReader(content));
var text = String.Join(Environment.NewLine + Environment.NewLine,
doc.DocumentNode
.Descendants("p")
.Select(n => "\t" + System.Web.HttpUtility.HtmlDecode(n.InnerText))
);
Firstly, I would start by using a StringBuilder:
StringBuilder sb = new StringBuilder();
Then, I suspect that sometimes, the "child" doesn't have a "p" element, so you can check before using it:
foreach (var child in item.Descendants(nsContent + "encoded"))
{
if (child.Element("p") != null)
{
sb.Append(child.Element("p").Value);
}
}
StreamWriter1.WriteLine(link + Environment.NewLine + Environment.NewLine + sb.ToString() + Environment.NewLine);
Does that work for you?
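As a side note on the null check: LINQ to XML's explicit (string) conversion returns null for a missing element instead of throwing, so the same guard can be written in one expression. A minimal sketch with a hypothetical helper and made-up sample elements:

```csharp
using System;
using System.Xml.Linq;

public static class SafeElementValue
{
    // Returns the text of the first <p> child, or an empty string when there is none.
    public static string ParagraphOrEmpty(XElement parent)
    {
        // parent.Element("p") may be null; calling .Value on it would throw,
        // whereas the (string) cast simply yields null.
        return (string)parent.Element("p") ?? string.Empty;
    }

    public static void Main()
    {
        XElement withP = XElement.Parse("<encoded><p>Hello</p></encoded>");
        XElement withoutP = XElement.Parse("<encoded><div>No paragraph</div></encoded>");

        Console.WriteLine(ParagraphOrEmpty(withP));           // Hello
        Console.WriteLine(ParagraphOrEmpty(withoutP).Length); // 0
    }
}
```

This only avoids the exception. Because the encoded content is an HTML string rather than child XML elements, Element("p") will typically find nothing there, which is why the HTML-parser approach is still needed to extract the text.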