C# parsing XML with an apostrophe throws exception - c#

I am parsing an XML file and am running into an issue when trying to find a node that has an apostrophe in its name. When the item name does not contain an apostrophe, everything works fine. I have tried replacing the apostrophe with different escape characters, but am not having much luck.
string s = "/itemDB/item[@name='" + itemName + "']";
// Things I have tried that did not work
// s.Replace("'", "''");
// .Replace("'", "\'");
XmlNode parent = root.SelectSingleNode(s);
I always receive an XPathException. What is the proper way to do this? Thanks.

For an apostrophe, replace it with &apos;

You can do it like this:
XmlDocument root = new XmlDocument();
root.LoadXml(@"<itemDB><item name=""abc'def""/></itemDB>");
XmlNode node = root.SelectSingleNode(@"itemDB/item[@name=""abc'def""]");
Note the verbatim string literal '@' and the double quotes.
Your code would then look like this, and there is no need to replace anything:
var itemName = @"abc'def";
string s = @"/itemDB/item[@name=""" + itemName + @"""]";

Related

XMLException. List of all the invalid characters

I am trying to execute the following code sample.
var xmlDocument = new XmlDocument();
var documentTagName = "testName)";
XmlNode headerElement = xmlDocument.CreateElement(documentTagName);
Of course, I get an XmlException:
The ')' character, hexadecimal value 0x... (doesn't matter), cannot be included in a name
That is because documentTagName contains the ) symbol. And of course I get the same exception if documentTagName looks like this:
documentTagName = "testName(";
or like this:
documentTagName = "testName:";
All of these characters ('(', ')', ':') are invalid in an XML tag name. But I have checked many links (and even this) and cannot find a list of all the invalid characters for an XML tag name. Can anybody help me?
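The XML specification defines which characters a name may contain (through its name productions) rather than listing the invalid ones, so one option is to let the framework do the check. The following is a minimal sketch, assuming the tag name is meant to be a local name without a prefix. The class and method names are hypothetical; XmlConvert.VerifyNCName, XmlConvert.EncodeLocalName and XmlConvert.DecodeName are existing .NET methods.

```csharp
using System;
using System.Xml;

// Hypothetical helper class, not part of the original question.
public static class TagNames
{
    // Returns true when the string is usable as an unprefixed element name.
    public static bool IsValidLocalName(string name)
    {
        try
        {
            XmlConvert.VerifyNCName(name); // throws when the name is not valid
            return true;
        }
        catch (XmlException)
        {
            return false; // contains a character that is not allowed, such as ')' or ':'
        }
        catch (ArgumentException)
        {
            return false; // null or empty name
        }
    }

    // Replaces each invalid character with an _xHHHH_ escape sequence.
    public static string ToSafeLocalName(string name)
    {
        return XmlConvert.EncodeLocalName(name);
    }
}
```

With this sketch, TagNames.IsValidLocalName("testName)") returns false, and the result of TagNames.ToSafeLocalName("testName)") can be passed to CreateElement. XmlConvert.DecodeName turns the escaped name back into the original string.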

string.replace() not working with xml string

I have read all similar questions and acted accordingly. But still can't figure out what's wrong with my code.
This is my code, super simple. (I know this isn't valid XML. It's just for the example).
string replacement = "TimeSheetsReplaced";
string word = "TimeSheets";
string result = "<?xml version=\"1.0\" encoding=\"utf-16\"?><DisplayName>Timesheets</DisplayName>";
result = result.Replace("<DisplayName>" + word + "</DisplayName>", "<DisplayName>" + replacement + "</DisplayName>");
The result string remains unchanged. What am I doing wrong?
TimeSheets != Timesheets
Casing does not match
It's because your string contains Timesheets, but you're looking for TimeSheets (with a capital S).
Your word TimeSheets has a capital S, but the string has a lowercase s.
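If the casing of the input cannot be relied on, one option is a case-insensitive replace. This is a minimal sketch, not the asker's code: the class and method names are hypothetical, and it uses Regex.Replace with RegexOptions.IgnoreCase because that overload is available in both .NET Framework and newer runtimes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class for illustration only.
public static class TextReplace
{
    // Replaces every occurrence of 'search' in 'input', ignoring case.
    public static string ReplaceIgnoreCase(string input, string search, string replacement)
    {
        // Regex.Escape makes the search text literal; the match evaluator
        // returns the replacement verbatim, so '$' in it is not interpreted.
        return Regex.Replace(input, Regex.Escape(search), _ => replacement, RegexOptions.IgnoreCase);
    }
}
```

Calling TextReplace.ReplaceIgnoreCase(result, "<DisplayName>" + word + "</DisplayName>", "<DisplayName>" + replacement + "</DisplayName>") would then match Timesheets as well as TimeSheets.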

Xpath error has an Invalid Token

I have the following C# code:
var selectNode = xmlDoc.SelectSingleNode("//CodeType[@name='" + codetype +
"']/Section[@title='" + section + "']/Code[@code='" + code + "' and " +
"@description='" + codedesc + "']") as XmlElement;
When I run my code, it raises an error saying "the above statement has an invalid token".
These are the values for the above statement.
codeType=cbc
section="Mental"
codedesc="Injection, enzyme (eg, collagenase), palmar fascial cord (ie,
Dupuytren's contracture"
Notice the apostrophe (') in codedesc?
You need to escape it somehow. The XPath interpreter considers it as a string delimiter and doesn't know how to treat the other apostrophe after it.
One way you can do that is by enclosing your string in double quotes instead of apostrophes.
Your code could therefore become:
var selectNode = xmlDoc.SelectSingleNode(
"//CodeType[@name='" + codetype + "']" +
"/Section[@title='" + section + "']" +
"/Code[@code='" + code + "' and @description=\"" + codedesc + "\"]")
as XmlElement;
(note that on the fourth line, the apostrophes (') around codedesc became double quotes (\"))
While this approach works for the data you presented, you are still not 100% safe: other records could contain double quotes themselves. If that happens, we'll need to think of something for that case as well.
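For values that may contain both kinds of quotes, a common workaround in XPath 1.0 (which has no escape sequence inside string literals) is to build the literal with concat(). The following is a minimal sketch under that assumption; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical helper class for illustration only.
public static class XPathLiteral
{
    // Wraps a value so it can be embedded in an XPath 1.0 expression.
    public static string Quote(string value)
    {
        // No apostrophe: apostrophe delimiters are safe.
        if (!value.Contains("'")) return "'" + value + "'";

        // Apostrophes but no double quotes: double-quote delimiters are safe.
        if (!value.Contains("\"")) return "\"" + value + "\"";

        // Both present: split on apostrophes, quote each piece with apostrophes,
        // and join the pieces with a double-quoted apostrophe inside concat().
        var parts = value.Split('\'').Select(p => "'" + p + "'");
        return "concat(" + string.Join(", \"'\", ", parts) + ")";
    }
}
```

The predicate from the question could then be written as "/Code[@code=" + XPathLiteral.Quote(code) + " and @description=" + XPathLiteral.Quote(codedesc) + "]", without any quote characters in the surrounding string.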
If the XML contains special characters, you can select the node by its index instead. The implementation below deletes the node at the selected index from the XML document.
XML SelectSingleNode Delete Operation
var schemaDocument = new XmlDocument();
schemaDocument.LoadXml(codesXML);
var xmlNameSpaceManager = new XmlNamespaceManager(schemaDocument.NameTable);
if (schemaDocument.DocumentElement != null)
xmlNameSpaceManager.AddNamespace("x", schemaDocument.DocumentElement.NamespaceURI);
var codesNode = schemaDocument.SelectSingleNode(@"/x:integration-engine-codes/x:code-categories/x:code-category/x:codes", xmlNameSpaceManager);
var codeNode = codesNode.ChildNodes.Item(Convert.ToInt32(index) - 1);
if (codeNode == null || codeNode.ParentNode == null)
{
throw new Exception("Invalid node found");
}
codesNode.RemoveChild(codeNode);
return schemaDocument.OuterXml;
Duplicate the single quote, so that it reads "Dupuytren''s contracture"
This way you can escape the single quote in the xpath expression.

How to prevent XElement from decoding character entity references

I have an XML string that contains an apostrophe. I replace the apostrophe with its equivalent, &#39;, and parse the revised string into an XElement. The XElement, however, is turning the &#39; back into an apostrophe.
How do I force XElement.Parse to preserve the encoded string?
string originalXML = @"<Description><data>Mark's Data</data></Description>"; //for illustration purposes only
string encodedApostrophe = originalXML.Replace("'", "&#39;");
XElement xe = XElement.Parse(encodedApostrophe);
This is correct behavior. In places where ' is allowed, &#39; works the same as &apos;, &#x27; or '. If you want to include the literal string &#39; in the XML, you should encode the &:
originalXML.Replace("'", "&amp;#39;")
Or parse the original XML and modify that:
XElement xe = XElement.Parse(originalXML);
var data = xe.Element("data");
data.Value = data.Value.Replace("'", "&#39;");
But doing this seems really weird. Maybe there is a better solution to the problem you're trying to solve.
Also, these encodings are not “ASCII equivalents”; they are called character entity references. The numeric ones are based on the Unicode code point of the character.
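The two behaviours described above can be checked with a short sketch. The class and method names are hypothetical; XElement.Parse, Value and ToString are the standard LINQ to XML members.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical demo class for illustration only.
public static class EntityDemo
{
    // Parsing decodes the numeric character reference into a plain apostrophe.
    public static string ParsedValue()
    {
        return XElement.Parse("<data>Mark&#39;s Data</data>").Value;
    }

    // Setting text that contains the characters "&#39;" makes the serializer
    // escape the ampersand, so the literal text survives in the markup.
    public static string SerializedLiteral()
    {
        return new XElement("data", "Mark&#39;s Data").ToString();
    }
}
```

Here ParsedValue() returns Mark's Data, while SerializedLiteral() returns <data>Mark&amp;#39;s Data</data>.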

Escaping ONLY contents of Node in XML

I have a piece of code like the one below.
//Reading from a file and assign to the variable named "s"
string s = "<item><name> Foo </name></item>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(s);
But it stops working if the contents include characters such as "<", ">", etc.
string s = "<item><name> Foo > Bar </name></item>";
I know I have to escape those characters before loading, but if I do it like this
doc.LoadXml(System.Security.SecurityElement.Escape(s));
the tags (<, >) are also escaped and, as a result, an error occurs.
How can I solve this problem?
A tricky solution:
string s = "<item><name> Foo > Bar </name></item>";
s = Regex.Replace(s, @"<[^>]+?>", m => HttpUtility.HtmlEncode(m.Value)).Replace("<","ojlovecd").Replace(">","cdloveoj");
s = HttpUtility.HtmlDecode(s).Replace("ojlovecd", "&lt;").Replace("cdloveoj", "&gt;");
XmlDocument doc = new XmlDocument();
doc.LoadXml(s);
Assuming your content will never contain the characters "]]>", you can use CDATA.
string s = "<item><name><![CDATA[ Foo > Bar ]]></name></item>";
Otherwise, you'll need to HTML-encode your special characters, and decode them before you use/display them (unless it's in a browser).
string s = "<item><name> Foo &gt; Bar </name></item>";
Assign the content of string to the InnerXml property of node.
var node = doc.CreateElement("root");
node.InnerXml = s;
Take a look at - Different ways how to escape an XML string in C#
It looks like the strings that you have generated are plain strings, and not valid XML. You can either get the strings generated as valid XML or, if you know that the strings are always going to be the name, leave the XML <item> and <name> tags out of the data.
Then, when you create the XmlDocument, do a CreateElement and assign your string before re-saving the results.
XmlDocument doc = new XmlDocument();
XmlElement root = doc.CreateElement("item");
doc.AppendChild(root);
XmlElement name = doc.CreateElement("name");
name.InnerText = "the contents from your file";
root.AppendChild(name);
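The reason this approach avoids manual escaping is that InnerText treats its value as text, and the serializer escapes markup characters when the document is written out. A minimal sketch of the same steps wrapped in a function (the class and method names are hypothetical):

```csharp
using System;
using System.Xml;

// Hypothetical helper class for illustration only.
public static class ItemBuilder
{
    // Builds <item><name>...</name></item> from raw, unescaped text.
    public static string BuildItemXml(string rawName)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("item");
        doc.AppendChild(root);

        XmlElement name = doc.CreateElement("name");
        name.InnerText = rawName; // stored as text, escaped on output
        root.AppendChild(name);

        return doc.OuterXml;
    }
}
```

For example, ItemBuilder.BuildItemXml("Foo < Bar & Baz") produces a string that LoadXml accepts, and reading InnerText from the reloaded name element gives back the original text.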
