XML: searching recursively for a specific value - C#

In my C# app I am reading an XML document that contains tags holding the paths where .png and .jpg files are kept. These tags are <Image>, <BackgroundImage> and <ForegroundImage>.
I could simply create an XmlNodeList object for each of these tags, such as
XmlNodeList image = _doc.GetElementsByTagName("Image");
XmlNodeList background = _doc.GetElementsByTagName("BackgroundImage");
XmlNodeList foreground = _doc.GetElementsByTagName("ForegroundImage");
for (int i = 0; i < image.Count; i++)
{
//..code
}
for (int i = 0; i < background.Count; i++)
{
//..code
}
for (int i = 0; i < foreground.Count; i++)
{
//..code
}
Clunky, I know. But is there a way to have the application recursively find every tag whose name contains the word "Image" and return the matches as a single XmlNodeList? Can it be done, and would it be the best approach? Many thanks in advance.
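One possible approach is a single XPath query instead of three GetElementsByTagName calls. The sketch below assumes the tag names contain "Image" with exactly that casing (XPath's contains() is case-sensitive) and that the document uses no namespaces that matter here. ImageTagFinder, FindImageNodes and the sample XML are hypothetical names and data introduced only for illustration.

```csharp
using System;
using System.Xml;

public static class ImageTagFinder
{
    // Returns every element, at any depth, whose local name contains "Image",
    // as one XmlNodeList.
    public static XmlNodeList FindImageNodes(XmlDocument doc)
    {
        return doc.SelectNodes("//*[contains(local-name(), 'Image')]");
    }

    public static void Main()
    {
        // Hypothetical sample document, not taken from the question.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<Page><Image>a.png</Image>" +
            "<Panel><BackgroundImage>b.jpg</BackgroundImage>" +
            "<ForegroundImage>c.png</ForegroundImage></Panel>" +
            "<Title>ignored</Title></Page>");

        foreach (XmlNode node in FindImageNodes(doc))
        {
            Console.WriteLine("{0}: {1}", node.Name, node.InnerText);
        }
    }
}
```

Whether this is the "best" approach depends on the document: the query visits every element, which is fine for small files but may be worth narrowing (for example to a known parent element) on large ones.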

Related

Getting RSS item <description> with XmlNodeList

I'm trying to get the description from a podcast item in an RSS feed, but I want it to start reading after any extra tags inside the description, such as a paragraph tag.
XmlNodeList xmlNodeListDesc = myXmlDocument.GetElementsByTagName("description");
List<Podd> poddLista = new List<Podd>();
for (int i = 2; i < xmlNodeListDur.Count; i++)
{
podd.description = xmlNodeListDesc[i].InnerText;
poddLista.Add(podd);
}
<description><![CDATA[<p>Desired text</p>]]></description>
My current code starts reading right after the CDATA marker, so the <p> tag and any style tags are included in the description.
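Since InnerText returns the CDATA content verbatim, one option is to strip the markup from the string afterwards. This is a minimal sketch, assuming the descriptions contain only simple tags like the <p> shown above; DescriptionCleaner and StripTags are hypothetical names. A regular expression is a crude tool for HTML, so for messier feeds a real HTML parser would likely be more robust.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class DescriptionCleaner
{
    // Removes anything that looks like an HTML tag, then decodes
    // entities such as &amp; and trims surrounding whitespace.
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return string.Empty;
        }

        string withoutTags = Regex.Replace(html, "<[^>]+>", string.Empty);
        return WebUtility.HtmlDecode(withoutTags).Trim();
    }

    public static void Main()
    {
        // The same content the CDATA section in the question would yield.
        Console.WriteLine(StripTags("<p>Desired text</p>"));
    }
}
```

In the loop from the question this would be applied to xmlNodeListDesc[i].InnerText before assigning it to podd.description.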

c# HtmlAgilityPack for on nodes array

I'm using Html Agility Pack, and I have an array of nodes:
HtmlNode[] nodes = document.DocumentNode.SelectNodes("//tbody[@class='table']").ToArray();
Now I want to run a for loop over each nodes[i]. I've tried this:
for (int i = 0; i < 1; i++)
{
if (t == null)
t = new Model.Track();
HtmlNode[] itemText = nodes[i].SelectNodes("//td[@class='artist']").ToArray();
for (int x = 0; x < itemText.Length; x++)
{
// doing something
}
}
The problem is that the itemText array isn't scoped to nodes[i]; it contains every match of "//td[@class='artist']" in the whole HTML document.
Any help?
Using //td[@class='artist'] will fetch all columns with the artist class from document.DocumentNode, that is, from the whole document.
Using .//td[@class='artist'] (notice the dot at the beginning) will fetch only the columns with the artist class under the currently selected node, which in your case is nodes[i].
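The difference between the two expressions is a general XPath rule rather than something specific to Html Agility Pack, so it can be illustrated with the built-in System.Xml classes. This is a sketch under that assumption: it uses XmlDocument instead of HtmlDocument, and RelativeXPathDemo, CountFromContext and the sample markup are hypothetical names and data made up for the example.

```csharp
using System;
using System.Xml;

public static class RelativeXPathDemo
{
    // Counts the nodes an XPath expression selects when evaluated
    // with the given node as the context node.
    public static int CountFromContext(XmlNode context, string xpath)
    {
        return context.SelectNodes(xpath).Count;
    }

    public static XmlDocument BuildSample()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<html>" +
            "<tbody class='table'><tr><td class='artist'>A</td></tr></tbody>" +
            "<tbody class='table'><tr><td class='artist'>B</td>" +
            "<td class='artist'>C</td></tr></tbody>" +
            "</html>");
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = BuildSample();
        XmlNode first = doc.SelectSingleNode("//tbody[@class='table']");

        // Leading "//" restarts the search at the document root.
        Console.WriteLine(CountFromContext(first, "//td[@class='artist']"));

        // Leading ".//" searches only below the context node.
        Console.WriteLine(CountFromContext(first, ".//td[@class='artist']"));
    }
}
```

With this sample the first call sees all three artist cells, while the second sees only the one cell inside the first tbody.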

Cannot correctly insert text into a bookmark from another bookmark

I'm writing a Windows Forms application that must exchange the content of Word bookmarks between two documents.
There are two similar documents (wordDocument and wordPattern) with a similar number of bookmarks. I'm trying this:
for (int i = 1; i <= wordDocument.Bookmarks.Count; i++)
{
object j = i;
wordDocument.Bookmarks.get_Item(ref j).Range.Text = wordPattern.Bookmarks.get_Item(ref j).Range.Text.ToString();
//MessageBox.Show(wordDocument.Bookmarks[i].Range.Text);
//MessageBox.Show(wordPattern.Bookmarks[i].Range.Text);
}
But it does the task incorrectly: it processes the bookmarks in the wrong order and deletes them. Please show me the right way to exchange the text inside the bookmarks.
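// listOfRanges is not declared in the original snippet;
// it is assumed here to be a List<Word.Range>.
List<Word.Range> listOfRanges = new List<Word.Range>();
int count1 = 0;
int count2 = 0;
// First collect the range of every bookmark in wordDocument.
foreach (Word.Bookmark bookmark1 in wordDocument.Bookmarks)
{
Word.Range bmRange = bookmark1.Range;
//bmRange.Text = "note" + count1;
listOfRanges.Add(bmRange);
count1++;
}
// Then copy the collected text into the bookmarks of wordPattern, in order.
foreach (Word.Bookmark bookmark2 in wordPattern.Bookmarks)
{
Word.Range mbRange = bookmark2.Range;
mbRange.Text = listOfRanges[count2].Text;
count2++;
}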
Solved it that way.

Pictures in DOC file

I have a file in DOC format (MS Word 97-2003) and I want to get a list of all images used in it. I tried the "Microsoft.Office.Interop.Word" namespace, as in the code below:
Application application = new Application();
Document document = application.Documents.Open(dataPath);
var words = document.InlineShapes;
int count = words.Count;
for (int i = 0; i < count; i++)
{
if (words[i] != null)
{
Console.WriteLine("{0} : {1}", i, words[i].PictureFormat);
}
}
but I cannot find any images in this file (it actually contains two). Am I doing something wrong? Could you recommend a library that would make this easier? I can't convert the file to DOCX.
Use document.InlineShapes to grab the images.
It may seem odd, but in this case I think the numbering starts from 1. That's why you get a COMException: "Element doesn't exist in collection".
Try:
for (int i = 1; i <= count; i++)
{
if (words[i] != null)
{
Console.WriteLine("{0} : {1}", i, words[i].PictureFormat);
}
}

Why do I need to count the number of XmlNodes before iterating through and deleting some of them?

I believe I have found a weird bug, as follows:
I want to delete the first two nodes in an XmlNodeList.
I know there are surely other ways of doing this, but what I am interested in is why one of the code segments works and the other doesn't (the only difference being the Count line).
var strXml = @"<food><fruit type=""apple""/><fruit type=""pear""/><fruit type=""banana""/></food>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(strXml);
XmlNodeList nlFruit = doc.SelectNodes("food/fruit");
for(int i = 0; i < 2; i++)
{
// This produces a null reference exception:
nlFruit[i].ParentNode.RemoveChild(nlFruit[i]);
}
However, if I count the number of nodes in the XmlNodeList it works and I am left with the desired outcome:
var strXml = @"<food><fruit type=""apple""/><fruit type=""pear""/><fruit type=""banana""/></food>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(strXml);
XmlNodeList nlFruit = doc.SelectNodes("food/fruit");
// Count the nodes..
Debug.WriteLine(nlFruit.Count);
for(int i = 0; i < 2; i++)
{
nlFruit[i].ParentNode.RemoveChild(nlFruit[i]);
}
// doc is now: <food><fruit type="banana" /></food>
Both are wrong; you should delete from the end:
for(int i = 1; i >= 0; i--)
{
nlFruit[i].ParentNode.RemoveChild(nlFruit[i]);
}
because once you remove the 0th element, the 1st element becomes the 0th, and then you try to remove the 1st element, which is null.
Maybe this will help:
Halloween Problem: http://blogs.msdn.com/mikechampion/archive/2006/07/20/672208.aspx
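Another way to avoid modifying a list while reading from it is to copy the nodes you want to delete into a separate list before touching the document. This is a minimal sketch, assuming the same <food> document as in the question; NodeRemoval and RemoveFirst are hypothetical names introduced for illustration. As far as I can tell, the list returned by SelectNodes is populated lazily, which would explain why reading Count first changes the behavior, but treat that as an assumption rather than a documented guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class NodeRemoval
{
    // Snapshots the first `howMany` matches into a plain List<XmlNode>,
    // then removes them, so no removal happens while the query result
    // is still being read.
    public static void RemoveFirst(XmlDocument doc, string xpath, int howMany)
    {
        List<XmlNode> toRemove = doc.SelectNodes(xpath)
            .Cast<XmlNode>()
            .Take(howMany)
            .ToList();

        foreach (XmlNode node in toRemove)
        {
            node.ParentNode.RemoveChild(node);
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(@"<food><fruit type=""apple""/><fruit type=""pear""/><fruit type=""banana""/></food>");

        RemoveFirst(doc, "food/fruit", 2);
        Console.WriteLine(doc.OuterXml);
    }
}
```

The snapshot costs one extra list, but it makes the loop independent of how the node list is implemented internally.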
