Set value of root XElement without affecting child elements - c#

UPDATED: I still have this problem; here is a better explanation.
I have a list of XElements and I'm iterating through them to check whether each one matches a regex pattern. If there's a match, I need to replace the value of the current element without affecting its child elements.
For example,
<root>{REGEX:#Here}<child>Element</child> more content</root>
In that case, I need to replace {REGEX:#Here}, which sits directly under the root element but is not a child element. If I use:
string newValue = xElement.ToString();
if(ReplaceRegex(ref newValue))
xElement.ReplaceAll(newValue);
I'm losing the child elements and the tags get converted to &lt;child&gt;Element in the value.
If I use:
xElement.SetValue(newValue);
The value of the xElement will be,
"{REGEX:Replaced} Element more content"
thus losing child elements as well.
What can I do to replace the value while keeping the child elements, in a way that works whether the regex pattern is under the root element or under child elements?
PS: Here is the regex function, for reference:
private bool ReplaceRegex(ref string text)
{
bool match = false;
Regex linkRegex = new Regex(@"\{XPath:.*?\}", System.Text.RegularExpressions.RegexOptions.Multiline);
Match m = linkRegex.Match(text);
while (m.Success)
{
match = true;
string substring = m.Value;
string xpath = substring.Replace("{XPath:", string.Empty).Replace("}", string.Empty);
object temp = this.Container.Data.XPathEvaluate(xpath);
text = text.Replace(substring, Utility.XPathResultToString(temp));
m = m.NextMatch();
}
return match;
}

private void ReplaceRegex(XElement xElement)
{
if(xElement.HasElements)
{
foreach (XElement subElement in xElement.Elements())
this.ReplaceRegex(subElement);
}
foreach(var node in xElement.Nodes().OfType<XText>())
{
string value = node.Value;
if(this.ReplaceRegex(ref value))
node.Value = value;
}
}
EDIT :
Regarding your mixed-content comment, edited the code to take care of text nodes. See if it works.
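For reference, the text-node approach above can be reduced to a small self-contained sketch. It assumes a made-up placeholder syntax ({KEY:name}) and a caller-supplied resolver in place of the XPath evaluation from the question; both are illustrative and not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class MixedContentSketch
{
    // Hypothetical placeholder syntax, used only for this illustration.
    private static readonly Regex Placeholder = new Regex(@"\{KEY:(.*?)\}");

    // Rewrites placeholders inside text nodes only, so element nodes are never touched.
    public static void ReplaceInTextNodes(XElement root, Func<string, string> resolve)
    {
        // DescendantNodes() reaches text at any depth, which replaces the manual recursion.
        // ToList() takes a snapshot so the tree is not enumerated while values change.
        foreach (XText text in root.DescendantNodes().OfType<XText>().ToList())
        {
            text.Value = Placeholder.Replace(text.Value, m => resolve(m.Groups[1].Value));
        }
    }
}
```

Because only XText values are assigned, a child such as <child> stays an element instead of being flattened into the parent's string value.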

Related

Using XPath to add elements to multiple elements doesn't work as intended

My logic goes as follows: I want to find the first element that is missing a given attribute, add the attribute, then find the next element that is missing the attribute, add it, and so forth.
I find the first element missing the amount attribute in the following way:
private XmlNode GetFirstElementWithoutAmount()
{
string productXPathQuery = "//XML/Products";
XmlNodeList productList = ParentXmlDocument.SelectNodes(productXPathQuery);
foreach (XmlNode element in productList)
{
string passengerXPathQuery = "//XML/Products[ID=" + element.FirstChild.InnerText + "]/Amount";
var amount = element.SelectSingleNode(passengerXPathQuery);
if (amount == null)
{
return element;
}
}
return null;
}
When I've found the first element missing the attribute, the amount is added in the following way:
private XmlNode GetOrCreateChildXMLNode(string NewNodeName, XmlNode ParentXMLNode)
{
if (ParentXMLNode == null)
{
return null;
}
XmlNode NewXMLNode = ParentXMLNode.SelectSingleNode("//" + NewNodeName);
if (NewXMLNode == null)
{
NewXMLNode = ParentXmlDocument.CreateNode(XmlNodeType.Element, NewNodeName, string.Empty);
ParentXMLNode.AppendChild(NewXMLNode);
}
return NewXMLNode;
}
The problem is that it only adds to the first element, and from then on the first function always returns the second element, even though there are more elements to come. Any ideas why this is?
You are already inside //XML/Products during your foreach loop. Point directly to the subnode:
string passengerXPathQuery = "./Amount";
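A minimal sketch of the same idea, assuming the document shape implied by the question (an XML root with Products children that may contain an Amount element). The class and method names are made up for illustration. An expression that starts with "//" is evaluated from the document root even when it is called on a child node, which is why the lookup has to be relative.

```csharp
using System;
using System.Xml;

public static class RelativeXPathSketch
{
    // Returns the first Products element that has no Amount child, or null if all have one.
    public static XmlNode FirstProductWithoutAmount(XmlDocument doc)
    {
        foreach (XmlNode product in doc.SelectNodes("/XML/Products"))
        {
            // "Amount" (same as "./Amount") is resolved relative to 'product'.
            // "//Amount" would instead match any Amount element in the whole document.
            if (product.SelectSingleNode("Amount") == null)
            {
                return product;
            }
        }
        return null;
    }
}
```

The same reasoning applies to the "//" + NewNodeName lookup in GetOrCreateChildXMLNode: as written it searches the whole document rather than the children of ParentXMLNode.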

HTMLAgilityPack error: "Multiple node elements can't be created."

I'm attempting to use the HTMLAgilityPack to retrieve and edit the inner text of some HTML. The inner text of each node I retrieve needs to be checked for matching strings, and those matching strings need to be highlighted, like so:
var HtmlDoc = new HtmlDocument();
HtmlDoc.LoadHtml(item.Content);
var nodes = HtmlDoc.DocumentNode.SelectNodes("//div[@class='guide_subtitle_cell']/p");
foreach (HtmlNode htmlNode in nodes)
{
htmlNode.ParentNode.ReplaceChild(HtmlTextNode.CreateNode(Methods.HighlightWords(htmlNode.InnerText, searchstring)), htmlNode);
}
This is the code for the HighlightWords method I use:
public static string HighlightWords(string input, string searchstring)
{
if (input == null || searchstring == null)
{
return input;
}
var lowerstring = searchstring.ToLower();
var words = lowerstring.Split(' ').ToList();
for (var i = 0; i < words.Count; i++)
{
Match m = Regex.Match(input, words[i], RegexOptions.IgnoreCase);
if (m.Success)
{
string ReplaceWord = string.Format("<span class='search_highlight'>{0}</span>", m.Value);
input = Regex.Replace(input, words[i], ReplaceWord, RegexOptions.IgnoreCase);
}
}
return input;
}
Can anyone suggest how to get this working or indicate what I'm doing wrong?
The problem is that HtmlTextNode.CreateNode can only create one node. When you add a <span> inside, that's another node, and CreateNode throws the exception you see.
Make sure that you are only doing a search and replace on the lowest leaf nodes (nodes with no children). Then rebuild that node by:
1. Create a new empty node to replace the old one.
2. Search for the text in .InnerText.
3. Use HtmlTextNode.Create to add the plain text before the text you want to highlight.
4. Then add your new <span> with the highlighted text with HtmlNode.CreateNode.
5. Then search for the next occurrence (start back at 1) until no more occurrences are found.
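The search-and-split part of these steps can be sketched without the library. The helper below is hypothetical and deliberately leaves out every HtmlAgilityPack call: it only cuts a string into alternating plain and matched segments, each of which would then become either a text node or a <span> element.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HighlightSegments
{
    // Splits 'text' into segments; Value is true for a segment that matched 'word'.
    public static List<KeyValuePair<string, bool>> Split(string text, string word)
    {
        var segments = new List<KeyValuePair<string, bool>>();
        if (string.IsNullOrEmpty(text)) return segments;
        if (string.IsNullOrEmpty(word))
        {
            segments.Add(new KeyValuePair<string, bool>(text, false));
            return segments;
        }

        int last = 0;
        // Regex.Escape treats the search word as literal text, not as a pattern.
        foreach (Match m in Regex.Matches(text, Regex.Escape(word), RegexOptions.IgnoreCase))
        {
            if (m.Index > last)
            {
                segments.Add(new KeyValuePair<string, bool>(text.Substring(last, m.Index - last), false));
            }
            segments.Add(new KeyValuePair<string, bool>(m.Value, true));
            last = m.Index + m.Length;
        }
        if (last < text.Length)
        {
            segments.Add(new KeyValuePair<string, bool>(text.Substring(last), false));
        }
        return segments;
    }
}
```

Building one node per segment avoids handing a string with several top-level nodes to a single CreateNode call.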
Your function HighlightWords must be returning multiple top-level HTML nodes. For example:
<p>foo</p>
<span>bar</span>
The HtmlAgilityPack only allows one top-level node to be returned. You can hardcode the return value for HighlightWords to test.
Also, this post has run across the same problem.

dynamic content control mapping for MS word c#

I am using code like this:
public void BindControlsToCustomXmlPart()
{
wordApp = (Word.Application)System.Runtime.InteropServices.Marshal.GetActiveObject("Word.Application");
foreach (Word.ContentControl contentControl in wordApp.ActiveDocument.ContentControls)
{
if (contentControl.Tag == "FieldName")
{
string xPathFieldName = "ns:records/ns:record/ns:FieldName";
contentControl.XMLMapping.SetMapping(xPathFieldName,
prefix, currentWordDocumentXMLPart);
}
What ends up happening is every new field I want to add, I have to repeat this redundant code:
if (contentControl.Tag == "FieldName2")
{
string xPathFieldName2 = "ns:records/ns:record/ns:FieldName2";
contentControl.XMLMapping.SetMapping(xPathFieldName2,
prefix, currentWordDocumentXMLPart);
}
Is there a way that I can write this code once and have the "FieldName" portion updated for each field dynamically? That is, some type of loop that steps through each XML node in an XML file: in this case it would map the XML node FieldName to the content control with a tag of FieldName, and then map the XML node FieldName2 to the content control with a tag of FieldName2.
A good start would be to create a function that maps a single control and reuse it, as follows:
public Word.ContentControl BindControlsOperation(Word.ContentControl control, string pFieldName)
{
// only map the control whose tag matches the requested field name
if (control.Tag == pFieldName)
{
string xPathFieldName = String.Format("ns:records/ns:record/ns:{0}", pFieldName);
control.XMLMapping.SetMapping(xPathFieldName, prefix, currentWordDocumentXMLPart);
}
return control;
}
You could then use it in the following fashion
foreach (Word.ContentControl contentControl in wordApp.ActiveDocument.ContentControls)
{
BindControlsOperation(contentControl, "FieldName");
}
The next step would be to keep a list of the field names you want to map and feed it to the function in a loop:
....
List<string> names = new List<string> { "x", "y", "z" };
foreach (Word.ContentControl contentControl in wordApp.ActiveDocument.ContentControls)
{
foreach (string name in names)
{
BindControlsOperation(contentControl, name);
}
}
Hope this helps

How to test whether a node contains particular string or character as its text value?

How can I test whether a node contains a particular string or character, using C# code?
example:
<abc>
<foo>data testing</foo>
<foo>test data</foo>
<bar>data value</bar>
</abc>
Now I need to test whether a particular node's value contains the string "testing".
The output would be "foo[1]"
You can also load that into an XPathDocument and then use a query:
var xPathDocument = new XPathDocument("myfile.xml");
var query = XPathExpression.Compile(@"/abc/foo[contains(text(),""testing"")]");
var navigator = xPathDocument.CreateNavigator();
var iterator = navigator.Select(query);
while(iterator.MoveNext())
{
Console.WriteLine(iterator.Current.Name);
Console.WriteLine(iterator.Current.Value);
}
This will determine whether any elements (not just foo) contain the desired value and will print each element's name and its entire value. You didn't specify what the exact result should be, but this should get you started. If loading from a file, use XElement.Load(filename).
var xml = XElement.Parse(@"<abc>
<foo>data testing</foo>
<foo>test data</foo>
<bar>data value</bar>
</abc>");
// or to load from a file use this
// var xml = XElement.Load("sample.xml");
var query = xml.Elements().Where(e => e.Value.Contains("testing"));
if (query.Any())
{
foreach (var item in query)
{
Console.WriteLine("{0}: {1}", item.Name, item.Value);
}
}
else
{
Console.WriteLine("Value not found!");
}
You can use Linq to Xml
string someXml = @"<abc>
<foo>data testing</foo>
<foo>test data</foo>
</abc>";
XDocument doc = XDocument.Parse(someXml);
bool containTesting = doc
.Descendants("abc")
.Descendants("foo")
.Where(i => i.Value.Contains("testing"))
.Count() >= 1;
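The question asks for output in the form "foo[1]". A minimal LINQ to XML sketch of that, assuming the position wanted is the XPath-style 1-based index among siblings with the same name; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NodeFinder
{
    // Returns "name[position]" for each child of 'root' whose text contains 'needle'.
    public static List<string> FindChildrenContaining(XElement root, string needle)
    {
        var results = new List<string>();
        foreach (XElement child in root.Elements())
        {
            if (!child.Value.Contains(needle)) continue;

            // Count earlier siblings with the same name, then add one for a 1-based position.
            int position = child.ElementsBeforeSelf(child.Name).Count() + 1;
            results.Add(child.Name.LocalName + "[" + position + "]");
        }
        return results;
    }
}
```

Only direct children are inspected here; switching Elements() to Descendants() would search deeper, at the cost of losing the parent part of the path.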

How to get xpath from an XmlNode instance

Could someone supply some code that would get the xpath of a System.Xml.XmlNode instance?
Thanks!
Okay, I couldn't resist having a go at it. It'll only work for attributes and elements, but hey... what can you expect in 15 minutes :) Likewise there may very well be a cleaner way of doing it.
It is superfluous to include the index on every element (particularly the root one!) but it's easier than trying to work out whether there's any ambiguity otherwise.
using System;
using System.Text;
using System.Xml;
class Test
{
static void Main()
{
string xml = @"
<root>
<foo />
<foo>
<bar attr='value'/>
<bar other='va' />
</foo>
<foo><bar /></foo>
</root>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
XmlNode node = doc.SelectSingleNode("//@attr");
Console.WriteLine(FindXPath(node));
Console.WriteLine(doc.SelectSingleNode(FindXPath(node)) == node);
}
static string FindXPath(XmlNode node)
{
StringBuilder builder = new StringBuilder();
while (node != null)
{
switch (node.NodeType)
{
case XmlNodeType.Attribute:
builder.Insert(0, "/@" + node.Name);
node = ((XmlAttribute) node).OwnerElement;
break;
case XmlNodeType.Element:
int index = FindElementIndex((XmlElement) node);
builder.Insert(0, "/" + node.Name + "[" + index + "]");
node = node.ParentNode;
break;
case XmlNodeType.Document:
return builder.ToString();
default:
throw new ArgumentException("Only elements and attributes are supported");
}
}
throw new ArgumentException("Node was not in a document");
}
static int FindElementIndex(XmlElement element)
{
XmlNode parentNode = element.ParentNode;
if (parentNode is XmlDocument)
{
return 1;
}
XmlElement parent = (XmlElement) parentNode;
int index = 1;
foreach (XmlNode candidate in parent.ChildNodes)
{
if (candidate is XmlElement && candidate.Name == element.Name)
{
if (candidate == element)
{
return index;
}
index++;
}
}
throw new ArgumentException("Couldn't find element within parent");
}
}
Jon's correct that there are any number of XPath expressions that will yield the same node in an instance document. The simplest way to build an expression that unambiguously yields a specific node is a chain of node tests that use the node position in the predicate, e.g.:
/node()[1]/node()[2]/node()[6]/node()[1]/node()[2]
Obviously, this expression isn't using element names, but then if all you're trying to do is locate a node within a document, you don't need its name. It also can't be used to find attributes (because attributes aren't child nodes and don't have a position; you can only find them by name), but it will find all other node types.
To build this expression, you need to write a method that returns a node's position in its parent's child nodes, because XmlNode doesn't expose that as a property:
static int GetNodePosition(XmlNode child)
{
for (int i=0; i<child.ParentNode.ChildNodes.Count; i++)
{
if (child.ParentNode.ChildNodes[i] == child)
{
// tricksy XPath, not starting its positions at 0 like a normal language
return i + 1;
}
}
throw new InvalidOperationException("Child node somehow not found in its parent's ChildNodes property.");
}
(There's probably a more elegant way to do that using LINQ, since XmlNodeList implements IEnumerable, but I'm going with what I know here.)
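One possible LINQ version of that helper, as a sketch rather than a drop-in replacement. It relies on Cast<XmlNode>() because XmlNodeList only implements the non-generic IEnumerable, and it compares nodes by reference, as the loop above does. The class name is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class NodePositionSketch
{
    // Returns the 1-based position of 'child' among its parent's child nodes.
    public static int GetNodePosition(XmlNode child)
    {
        XmlNodeList siblings = child.ParentNode.ChildNodes;

        // Count the siblings that come before 'child' (reference comparison).
        int preceding = siblings.Cast<XmlNode>().TakeWhile(n => n != child).Count();

        if (preceding == siblings.Count)
        {
            throw new InvalidOperationException("Child node not found in its parent's ChildNodes.");
        }
        return preceding + 1;
    }
}
```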
Then you can write a recursive method like this:
static string GetXPathToNode(XmlNode node)
{
if (node.NodeType == XmlNodeType.Attribute)
{
// attributes have an OwnerElement, not a ParentNode; also they have
// to be matched by name, not found by position
return String.Format(
"{0}/@{1}",
GetXPathToNode(((XmlAttribute)node).OwnerElement),
node.Name
);
}
if (node.ParentNode == null)
{
// the only node with no parent is the root node, which has no path
return "";
}
// the path to a node is the path to its parent, plus "/node()[n]", where
// n is its position among its siblings.
return String.Format(
"{0}/node()[{1}]",
GetXPathToNode(node.ParentNode),
GetNodePosition(node)
);
}
As you can see, I hacked in a way for it to find attributes as well.
Jon slipped in with his version while I was writing mine. There's something about his code that's going to make me rant a bit now, and I apologize in advance if it sounds like I'm ragging on Jon. (I'm not. I'm pretty sure that the list of things Jon has to learn from me is exceedingly short.) But I think the point I'm going to make is a pretty important one for anyone who works with XML to think about.
I suspect that Jon's solution emerged from something I see a lot of developers do: thinking of XML documents as trees of elements and attributes. I think this largely comes from developers whose primary use of XML is as a serialization format, because all the XML they're used to using is structured this way. You can spot these developers because they're using the terms "node" and "element" interchangeably. This leads them to come up with solutions that treat all other node types as special cases. (I was one of these guys myself for a very long time.)
This feels like it's a simplifying assumption while you're making it. But it's not. It makes problems harder and code more complex. It leads you to bypass the pieces of XML technology (like the node() function in XPath) that are specifically designed to treat all node types generically.
There's a red flag in Jon's code that would make me query it in a code review even if I didn't know what the requirements are, and that's GetElementsByTagName. Whenever I see that method in use, the question that leaps to mind is always "why does it have to be an element?" And the answer is very often "oh, does this code need to handle text nodes too?"
I know this is an old post, but the version I liked the most (the one with names) was flawed:
When a parent node has child nodes with different names, it stopped counting the index as soon as it reached the first sibling with a non-matching name.
Here is my fixed version of it:
/// <summary>
/// Gets the X-Path to a given Node
/// </summary>
/// <param name="node">The Node to get the X-Path from</param>
/// <returns>The X-Path of the Node</returns>
public string GetXPathToNode(XmlNode node)
{
if (node.NodeType == XmlNodeType.Attribute)
{
// attributes have an OwnerElement, not a ParentNode; also they have
// to be matched by name, not found by position
return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
}
if (node.ParentNode == null)
{
// the only node with no parent is the root node, which has no path
return "";
}
// Get the Index
int indexInParent = 1;
XmlNode siblingNode = node.PreviousSibling;
// Loop thru all Siblings
while (siblingNode != null)
{
// Increase the Index if the Sibling has the same Name
if (siblingNode.Name == node.Name)
{
indexInParent++;
}
siblingNode = siblingNode.PreviousSibling;
}
// the path to a node is the path to its parent, plus "/name[n]", where n is its position among its same-named siblings.
return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, indexInParent);
}
Here's a simple method that I've used, worked for me.
static string GetXpath(XmlNode node)
{
if (node.Name == "#document")
return String.Empty;
return GetXpath(node.SelectSingleNode("..")) + "/" + (node.NodeType == XmlNodeType.Attribute ? "@":String.Empty) + node.Name;
}
My 10p worth is a hybrid of Robert and Corey's answers. I can only claim credit for the actual typing of the extra lines of code.
private static string GetXPathToNode(XmlNode node)
{
if (node.NodeType == XmlNodeType.Attribute)
{
// attributes have an OwnerElement, not a ParentNode; also they have
// to be matched by name, not found by position
return String.Format(
"{0}/@{1}",
GetXPathToNode(((XmlAttribute)node).OwnerElement),
node.Name
);
}
if (node.ParentNode == null)
{
// the only node with no parent is the root node, which has no path
return "";
}
//get the index
int iIndex = 1;
XmlNode xnIndex = node;
while (xnIndex.PreviousSibling != null) { iIndex++; xnIndex = xnIndex.PreviousSibling; }
// the path to a node is the path to its parent, plus "/node()[n]", where
// n is its position among its siblings.
return String.Format(
"{0}/node()[{1}]",
GetXPathToNode(node.ParentNode),
iIndex
);
}
There's no such thing as "the" xpath of a node. For any given node there may well be many xpath expressions which will match it.
You can probably work up the tree to build up an expression which will match it, taking into account the index of particular elements etc, but it's not going to be terribly nice code.
Why do you need this? There may be a better solution.
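To illustrate the point that several expressions can match one node, here is a minimal sketch. The helper name is made up; it simply checks that every expression passed in selects the same XmlNode instance.

```csharp
using System;
using System.Xml;

public static class EquivalentXPathSketch
{
    // True when every expression selects the same node object in 'doc'.
    public static bool AllSelectSameNode(XmlDocument doc, params string[] expressions)
    {
        if (expressions.Length == 0) return false;

        XmlNode first = doc.SelectSingleNode(expressions[0]);
        if (first == null) return false;

        foreach (string expression in expressions)
        {
            // Reference comparison: the same node must come back each time.
            if (doc.SelectSingleNode(expression) != first) return false;
        }
        return true;
    }
}
```

For a document such as <root><foo /><foo><bar attr='value'/></foo></root>, the expressions /root/foo[2]/bar[1], //bar[@attr='value'] and /root/foo/bar[@attr] are three different ways of reaching the same bar element.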
If you do this, you will get a path with both the names of the nodes and their positions, which matters if you have nodes with the same name, like this:
"/Service[1]/System[1]/Group[1]/Folder[2]/File[2]"
public string GetXPathToNode(XmlNode node)
{
if (node.NodeType == XmlNodeType.Attribute)
{
// attributes have an OwnerElement, not a ParentNode; also they have
// to be matched by name, not found by position
return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
}
if (node.ParentNode == null)
{
// the only node with no parent is the root node, which has no path
return "";
}
//get the index
int iIndex = 1;
XmlNode xnIndex = node;
while (xnIndex.PreviousSibling != null && xnIndex.PreviousSibling.Name == xnIndex.Name)
{
iIndex++;
xnIndex = xnIndex.PreviousSibling;
}
// the path to a node is the path to its parent, plus "/name[n]", where
// n is its position among its same-named siblings.
return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, iIndex);
}
I found that none of the above worked with XDocument, so I wrote my own code to support XDocument and used recursion. I think this code handles multiple identical nodes better than some of the other code here because it first tries to go as deep in to the XML path as it can and then backs up to build only what is needed. So if you have /home/white/bob and /home/white/mike and you want to create /home/white/bob/garage the code will know how to create that. However, I didn't want to mess with predicates or wildcards, so I explicitly disallowed those; but it would be easy to add support for them.
Private Sub NodeItterate(XDoc As XElement, XPath As String)
'get the deepest path
Dim nodes As IEnumerable(Of XElement)
nodes = XDoc.XPathSelectElements(XPath)
'if it doesn't exist, try the next shallow path
If nodes.Count = 0 Then
NodeItterate(XDoc, XPath.Substring(0, XPath.LastIndexOf("/")))
'by this time all the required parent elements will have been constructed
Dim ParentPath As String = XPath.Substring(0, XPath.LastIndexOf("/"))
Dim ParentNode As XElement = XDoc.XPathSelectElement(ParentPath)
Dim NewElementName As String = XPath.Substring(XPath.LastIndexOf("/") + 1, XPath.Length - XPath.LastIndexOf("/") - 1)
ParentNode.Add(New XElement(NewElementName))
End If
'if we find there are more than 1 elements at the deepest path we have access to, we can't proceed
If nodes.Count > 1 Then
Throw New ArgumentOutOfRangeException("There are too many paths that match your expression.")
End If
'if there is just one element, we can proceed
If nodes.Count = 1 Then
'just proceed
End If
End Sub
Public Sub CreateXPath(ByVal XDoc As XElement, ByVal XPath As String)
If XPath.Contains("//") Or XPath.Contains("*") Or XPath.Contains(".") Then
Throw New ArgumentException("Can't create a path based on searches, wildcards, or relative paths.")
End If
If Regex.IsMatch(XPath, "[\[\]()@='<>|]") Then
Throw New ArgumentException("Can't create a path based on predicates.")
End If
'we will process this recursively.
NodeItterate(XDoc, XPath)
End Sub
What about using extension methods? ;)
My version (building on others' work) uses the syntax name[index], with the index omitted if the element has no same-named siblings.
The loop that finds the element index lives in a separate routine (also an extension method).
Just paste the following into any static utility class (or into the main Program class, if it is static):
static public int GetRank( this XmlNode node )
{
// return 0 if unique, else return position 1...n in siblings with same name
try
{
if( node is XmlElement )
{
int rank = 1;
bool alone = true, found = false;
foreach( XmlNode n in node.ParentNode.ChildNodes )
if( n.Name == node.Name ) // sibling with same name
{
if( n.Equals(node) )
{
if( ! alone ) return rank; // no need to continue
found = true;
}
else
{
if( found ) return rank; // no need to continue
alone = false;
rank++;
}
}
}
}
catch{}
return 0;
}
static public string GetXPath( this XmlNode node )
{
try
{
if( node is XmlAttribute )
return String.Format( "{0}/@{1}", (node as XmlAttribute).OwnerElement.GetXPath(), node.Name );
if( node is XmlText || node is XmlCDataSection )
return node.ParentNode.GetXPath();
if( node.ParentNode == null ) // the only node with no parent is the root node, which has no path
return "";
int rank = node.GetRank();
if( rank == 0 ) return String.Format( "{0}/{1}", node.ParentNode.GetXPath(), node.Name );
else return String.Format( "{0}/{1}[{2}]", node.ParentNode.GetXPath(), node.Name, rank );
}
catch{}
return "";
}
I produced VBA for Excel to do this for a work project. It outputs tuples of an XPath and the associated text from an element or attribute. The purpose was to allow business analysts to identify and map some XML. I appreciate that this is a C# forum, but thought this might be of interest.
Sub Parse2(oSh As Long, inode As IXMLDOMNode, Optional iXstring As String = "", Optional indexes)
Dim chnode As IXMLDOMNode
Dim attr As IXMLDOMAttribute
Dim oXString As String
Dim chld As Long
Dim idx As Variant
Dim addindex As Boolean
chld = 0
idx = 0
addindex = False
'determine the node type:
Select Case inode.NodeType
Case NODE_ELEMENT
If inode.ParentNode.NodeType = NODE_DOCUMENT Then 'This gets the root node name but ignores all the namespace attributes
oXString = iXstring & "//" & fp(inode.nodename)
Else
'Need to deal with indexing. Where an element has siblings with the same nodeName,it needs to be indexed using [index], e.g swapstreams or schedules
For Each chnode In inode.ParentNode.ChildNodes
If chnode.NodeType = NODE_ELEMENT And chnode.nodename = inode.nodename Then chld = chld + 1
Next chnode
If chld > 1 Then '//inode has siblings of the same nodeName, so needs to be indexed
'Lookup the index from the indexes array
idx = getIndex(inode.nodename, indexes)
addindex = True
Else
End If
'build the XString
oXString = iXstring & "/" & fp(inode.nodename)
If addindex Then oXString = oXString & "[" & idx & "]"
'If type is element then check for attributes
For Each attr In inode.Attributes
'If the element has attributes then extract the data pair XString + Element.Name, @Attribute.Name=Attribute.Value
Call oSheet(oSh, oXString & "/@" & attr.Name, attr.Value)
Next attr
End If
Case NODE_TEXT
'build the XString
oXString = iXstring
Call oSheet(oSh, oXString, inode.NodeValue)
Case NODE_ATTRIBUTE
'Do nothing
Case NODE_CDATA_SECTION
'Do nothing
Case NODE_COMMENT
'Do nothing
Case NODE_DOCUMENT
'Do nothing
Case NODE_DOCUMENT_FRAGMENT
'Do nothing
Case NODE_DOCUMENT_TYPE
'Do nothing
Case NODE_ENTITY
'Do nothing
Case NODE_ENTITY_REFERENCE
'Do nothing
Case NODE_INVALID
'do nothing
Case NODE_NOTATION
'do nothing
Case NODE_PROCESSING_INSTRUCTION
'do nothing
End Select
'Now call Parse2 on each of inode's children.
If inode.HasChildNodes Then
For Each chnode In inode.ChildNodes
Call Parse2(oSh, chnode, oXString, indexes)
Next chnode
Set chnode = Nothing
Else
End If
End Sub
Manages the counting of elements using:
Function getIndex(tag As Variant, indexes) As Variant
'Function to get the latest index for an xml tag from the indexes array
'indexes array is passed from one parser function to the next up and down the tree
Dim i As Integer
Dim n As Integer
If IsArrayEmpty(indexes) Then
ReDim indexes(1, 0)
indexes(0, 0) = "Tag"
indexes(1, 0) = "Index"
Else
End If
For i = 0 To UBound(indexes, 2)
If indexes(0, i) = tag Then
'tag found, increment and return the index then exit
'also destroy all recorded tag names BELOW that level
indexes(1, i) = indexes(1, i) + 1
getIndex = indexes(1, i)
ReDim Preserve indexes(1, i) 'should keep all tags up to i but remove all below it
Exit Function
Else
End If
Next i
'tag not found so add the tag with index 1 at the end of the array
n = UBound(indexes, 2)
ReDim Preserve indexes(1, n + 1)
indexes(0, n + 1) = tag
indexes(1, n + 1) = 1
getIndex = 1
End Function
Another solution to your problem might be to 'mark' the xmlnodes which you will want to later identify with a custom attribute:
var id = _currentNode.OwnerDocument.CreateAttribute("some_id");
id.Value = Guid.NewGuid().ToString();
_currentNode.Attributes.Append(id);
which you can store in a Dictionary for example.
And you can later identify the node with an xpath query:
newOrOldDocument.SelectSingleNode(string.Format("//*[contains(@some_id,'{0}')]", id.Value));
I know this is not a direct answer to your question, but it can help if the reason you wish to know the xpath of a node is to have a way of 'reaching' the node later after you have lost the reference to it in code.
This also overcomes problems when the document gets elements added/moved, which can mess up the xpath (or indexes, as suggested in other answers).
This is even easier
''' <summary>
''' Gets the full XPath of a single node.
''' </summary>
''' <param name="node"></param>
''' <returns></returns>
''' <remarks></remarks>
Private Function GetXPath(ByVal node As Xml.XmlNode) As String
Dim temp As String
Dim sibling As Xml.XmlNode
Dim previousSiblings As Integer = 1
'I don't want to know that it was a generic document
If node.Name = "#document" Then Return ""
'Prime it
sibling = node.PreviousSibling
'Percolate up, counting all of this node's siblings before it.
While sibling IsNot Nothing
'Only count if the sibling has the same name as this node
If sibling.Name = node.Name Then
previousSiblings += 1
End If
sibling = sibling.PreviousSibling
End While
'Mark this node's index, if it has one
' Also mark the index to 1 or the default if it does have a sibling just no previous.
temp = node.Name + IIf(previousSiblings > 0 OrElse node.NextSibling IsNot Nothing, "[" + previousSiblings.ToString() + "]", "").ToString()
If node.ParentNode IsNot Nothing Then
Return GetXPath(node.ParentNode) + "/" + temp
End If
Return temp
End Function
I had to do this recently. Only elements needed to be considered. This is what I came up with:
private string GetPath(XmlElement el)
{
List<string> pathList = new List<string>();
XmlNode node = el;
while (node is XmlElement)
{
pathList.Add(node.Name);
node = node.ParentNode;
}
pathList.Reverse();
string[] nodeNames = pathList.ToArray();
return String.Join("/", nodeNames);
}
public static string GetFullPath(this XmlNode node)
{
if (node.ParentNode == null)
{
return "";
}
else
{
return $"{GetFullPath(node.ParentNode)}/{node.Name}";
}
}
