I'm trying to call a method from a catch block, but the XmlNode inside the method doesn't seem to work: I'm getting null. If I call the same method from the try block, it works.
var doc = new XmlDocument();
try
{
doc.Load(f.FullPath);
// do some work
}
catch (Exception e)
{
if (e is XmlException)
{
checkXML(ref doc);
}
}
public void checkXML(ref XmlDocument doc)
{
XmlNode xn = doc.SelectSingleNode("/BroadcastMonitor/Current");
xn["name1"].InnerText = SecurityElement.Escape(xn["name1"].InnerText);
xn["name2"].InnerText = SecurityElement.Escape(xn["name2"].InnerText);
}
Now when the catch block calls the method 'checkXML', I get xn as null. But if I execute the same call from the 'try' block just to check, 'xn' has a value. 'doc' also has a value regardless of whether the method is called from the try or the catch block.
Why is this happening? Please help me understand.
EDIT
<BroadcastMonitor>
<updated>2014-10-17T07:56:30</updated>
<Name>TESTING</Name>
<Current>
<artistName>اصاله& نصرى</artistName>
<albumName>شخصيه عنيده</albumName>
<CategoryName>ARABIC & SONGS</CategoryName>
</Current>
</BroadcastMonitor>
Thank you.
Your XML contains a bare & character, which is not valid inside XML text and must be escaped as &amp;.
<CategoryName>ARABIC & SONGS</CategoryName>
So it's causing your Load() method to throw the exception.
What you should do is escape the invalid characters in the text content of your XML (not the markup itself) before passing the string on to an XML parser. For this file it is enough to replace the bare & like so
yourXmlString = yourXmlString.Replace("&", "&amp;");
You can then pass the yourXmlString on to the parser like so
var xDoc = XDocument.Parse(yourXmlString);
or if you don't want to or can't use the XDocument class you will need to make sure you save the xml encoded so that the Load() method of the XmlDocument class will be reading a file that is properly encoded.
Note that the XmlDocument and XDocument classes are not the same thing and have some significant differences. XDocument parses a string with Parse(); the XmlDocument counterpart for strings is LoadXml().
EDIT :
You can read the xml file into a string using the File class
var yourXmlString = File.ReadAllText(filePath);
XmlDocument is a reference type, so there is no need to pass it with ref.
And my guess is that it's failing to load in the first place, so doc never receives its nodes (the object itself is not null, but it has no root element).
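To illustrate the point about the failed load, here is a minimal sketch. It assumes the XML is passed in as a string rather than read from f.FullPath, and the class and method names are made up for the example. When parsing throws part-way through, the root element is never attached to the document, so SelectSingleNode has nothing to find.

```csharp
using System;
using System.Xml;

public static class FailedLoadDemo
{
    // Hypothetical helper: tries to load the XML, then looks up the node
    // regardless of whether loading succeeded.
    public static XmlNode TryFindCurrent(string xml)
    {
        var doc = new XmlDocument();
        try
        {
            doc.LoadXml(xml);
        }
        catch (XmlException)
        {
            // Parsing stopped at the invalid character; the root element
            // was never attached to doc.
        }

        // doc is never null here, but after a failed load it has no root
        // element, so this lookup returns null.
        return doc.SelectSingleNode("/BroadcastMonitor/Current");
    }
}
```

Calling it with text that contains a bare & should give null, while the same text with &amp; should give the Current node.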
It looks like this document is missing its xml declaration tag.
try:
XmlDocument doc = new XmlDocument();
using(StreamReader reader = new StreamReader(f.FullPath))
{
doc.LoadXml(reader.ReadToEnd());
}
You can use System.IO.File.ReadAllText() to get all text from file into a string variable :
string invalidXml = System.IO.File.ReadAllText(f.FullPath);
For this particular XML, you can simply replace & with its encoded version &amp; to make a valid XML string :
string validXml = invalidXml.Replace("&", "&amp;");
doc.LoadXml(validXml);
.....
Related question for reference : Reading XML with an "&" into C# XMLDocument Object
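A plain Replace also rewrites ampersands that are already part of an entity (turning &amp; into &amp;amp;). The following is a rough sketch of a more careful variant, not a general XML repair tool: it only escapes an & that does not already start something shaped like an entity reference. The class and method names are invented for the example.

```csharp
using System.Text.RegularExpressions;
using System.Xml;

public static class AmpersandFixer
{
    // Matches '&' unless it is followed by something that looks like a
    // named (&amp;), decimal (&#38;) or hexadecimal (&#x26;) reference.
    private static readonly Regex BareAmpersand = new Regex(
        @"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)",
        RegexOptions.Compiled);

    public static string EscapeBareAmpersands(string xml)
    {
        return BareAmpersand.Replace(xml, "&amp;");
    }

    // Escapes bare ampersands, then parses the result.
    public static XmlDocument LoadLenient(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(EscapeBareAmpersands(xml));
        return doc;
    }
}
```

Named references other than the five predefined XML ones (for example &copy;) are left untouched by this sketch and would still make the parser throw.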
This would be my solution:
private static Regex InnerValues = new Regex(@"(?<=<(.*?>)).*?(?=</\1)",RegexOptions.Compiled);
private static XmlDocument LoadInvalidDocument(string path)
{
XmlDocument result = new XmlDocument();
string content = File.ReadAllText(path);
var matches = InnerValues.Matches(content);
foreach (Match match in matches)
{
content = content.Replace(match.Value, HttpUtility.HtmlEncode(match.Value));
}
result.LoadXml(content);
return result;
}
Related
How do you get the data from description in the tag <something description = "something else"> </something> using C# in UWP?
Here is the test code to show how to get the XML node attribute value:
private void GetContent()
{
string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><body><content title =\"XML File!\"></content></body>";
var doc = new XmlDocument();
doc.LoadXml(xml);
var tags=doc.GetElementsByTagName("content");
if (tags.Count > 0)
{
var firstContent = tags.First();
string result = firstContent.Attributes.GetNamedItem("title").InnerText;
}
}
Tips
In UWP, loading an XmlDocument via a path is not recommended. It is best to get the XML file first, read all the text, and load the XmlDocument via text.
The XmlDocument prefix namespace is Windows.Data.Xml.Dom, NOT System.Xml
Best regards.
When I try to save the XML document I edited, an IOException ("file used by another process") occurs.
Any ideas how to solve this?
Note: This method is called every time a new element should be written to the XmlDocument.
public void saveRectangleAsXMLFragment()
{
XmlDocument doc = new XmlDocument();
doc.Load("test.xml");
XmlDocumentFragment xmlDocFrag = doc.CreateDocumentFragment();
String input = generateXMLInput();
xmlDocFrag.InnerXml = input;
XmlElement mapElement = doc.DocumentElement;
mapElement.AppendChild(xmlDocFrag);
input = null;
mapElement = null;
xmlDocFrag = null;
doc.Save("test.xml");
}
It's probably one of your other methods, or another part of the code, that opened the file and didn't close it properly. Try searching for that kind of problem.
Try this if your application is the only one accessing that .xml file:
1. Create an object globally
object lockData = new object();
2. Use that object in a lock statement wherever you load and save the XML
lock(lockData )
{
doc.Load("test.xml");
}
lock(lockData )
{
doc.Save("test.xml");
}
From Jon Skeet's related answer (see https://stackoverflow.com/a/8354736/4151626)
There seems to be a bug in XmlDocument.Save()'s treatment of the file stream, where it becomes pinned and is neither Closed() nor Disposed(). By taking direct control of the creation and disposition of the stream outside of the XmlDocument.Save() I was able to get around this halting error.
//e.g.
using (XmlWriter xw = XmlWriter.Create("test.xml"))
{
doc.Save(xw);
} // the writer and the underlying file are closed and disposed here
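Putting the two suggestions together (a shared lock plus explicit control over the streams), a sketch might look like the following. It assumes the application is the only process touching the file; XmlFileStore and AppendFragment are hypothetical names, and the fragment is passed in as a string instead of coming from generateXMLInput().

```csharp
using System.IO;
using System.Xml;

public static class XmlFileStore
{
    // One lock shared by every load/save of the file within this process.
    private static readonly object FileLock = new object();

    public static void AppendFragment(string path, string fragmentXml)
    {
        lock (FileLock)
        {
            var doc = new XmlDocument();

            // Read through an explicit stream so the handle is released
            // as soon as the using block ends.
            using (var input = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                doc.Load(input);
            }

            XmlDocumentFragment fragment = doc.CreateDocumentFragment();
            fragment.InnerXml = fragmentXml;
            doc.DocumentElement.AppendChild(fragment);

            // Write through an explicit stream that is disposed right after saving.
            using (var output = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                doc.Save(output);
            }
        }
    }
}
```

The lock only serializes callers inside the same process; if another program holds the file open, the IOException can still occur.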
I have a piece of code which works well for normal files. But for really big files, it makes the server stop working.
Here it is:
XmlReader reader = null;
try
{
reader = XmlReader.Create(file_name + ".xml");
XDocument xml = XDocument.Load(reader);
XmlNamespaceManager namespaceManager = GetNamespaceManager(reader);
XElement root = xml.Root;
//XAttribute supplier = root.XPathSelectElement("//sh:Receive/sh:Id", namespaceManager).Attribute("Authority");
//string version = root.XPathSelectElement("//sh:DocumentId/sh:Version", namespaceManager).Value;
var nodes = root.XPathSelectElements("//eanucc:msg/eanucc:transact", namespaceManager);
return nodes;
}
catch
{ }
I think this is the part which causes the memory problem which happens on the server. How can I fix this?
It sounds like there's simply too much data to read in one go. You'll have to iterate over the elements one at a time, using XmlReader as a cursor, and converting one element to XElement at a time.
public static IEnumerable<XElement> ReadTransactions()
{
using (var reader = XmlReader.Create(file_name + ".xml"))
{
while (reader.ReadToFollowing("transact", eanuccNamespaceUri))
{
using (var subtree = reader.ReadSubtree())
{
yield return XElement.Load(subtree);
}
}
}
}
Note: this assumes there are never "transact" elements at any other level. If there are, you'll need to be more careful with your XmlReader than just calling ReadToFollowing. Also note that you'll need to find the actual namespace URI of the eanucc alias.
Don't forget that if you try to read all of this information in one go (e.g. by calling ToList()) then you'll still run out of memory. You need to stream the information. (It's not clear what you're trying to do with the elements, but you need to think about it carefully.)
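As a sketch of what streaming consumption could look like, the variant below takes the source, element name and namespace URI as parameters (all of which are assumptions made for the example, as are the class and method names) and processes one element at a time without ever building a list.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class TransactionStreamer
{
    // Lazily yields each matching element; only one element's subtree is
    // materialized as an XElement at any time.
    public static IEnumerable<XElement> ReadElements(TextReader source, string localName, string namespaceUri)
    {
        using (var reader = XmlReader.Create(source))
        {
            while (reader.ReadToFollowing(localName, namespaceUri))
            {
                using (var subtree = reader.ReadSubtree())
                {
                    yield return XElement.Load(subtree);
                }
            }
        }
    }

    // Example consumer: handles elements one by one instead of calling ToList().
    public static int CountElements(TextReader source, string localName, string namespaceUri)
    {
        int count = 0;
        foreach (XElement element in ReadElements(source, localName, namespaceUri))
        {
            count++; // real code would process 'element' here and then let it go
        }
        return count;
    }
}
```

For a file, the TextReader could be a StreamReader over the path; the namespace URI must be the real one bound to the eanucc prefix in the document.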
Try putting the reader in a using(){} clause so it gets disposed of after use.
try
{
using(var reader = XmlReader.Create(file_name + ".xml"))
{
XDocument xml = XDocument.Load(reader);
XmlNamespaceManager namespaceManager = GetNamespaceManager(reader);
XElement root = xml.Root;
var nodes = root.XPathSelectElements("//eanucc:msg/eanucc:transact", namespaceManager);
return nodes;
}
}
catch
{ }
I have a string input, and I do not know whether or not it is valid XML.
I think the simplest approach is to wrap
new XmlDocument().LoadXml(strINPUT);
in a try/catch.
The problem I'm facing is that sometimes strINPUT is an HTML file. If the header of this file contains
<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">
<html xml:lang=""en-GB"" xmlns=""http://www.w3.org/1999/xhtml"" lang=""en-GB"">
...like many do, it actually tries to make a connection to the w3.org URL, which I really don't want it doing.
Does anyone know if it's possible to just parse the string without trying to be clever and checking external URLs? Failing that, is there an alternative to XmlDocument?
Try the following:
XmlDocument doc = new XmlDocument();
using (var reader = XmlReader.Create(new StringReader(xml), new XmlReaderSettings() {
ProhibitDtd = true,
ValidationType = ValidationType.None
})) {
doc.Load(reader);
}
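The code creates a reader that prohibits DTDs and turns off validation, so no external DTD is fetched. Note that with ProhibitDtd = true, a document containing a DOCTYPE is rejected with an XmlException rather than parsed. Checking for well-formedness still applies.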
Alternatively you can use XDocument.Parse if you can switch to using XDocument instead of XmlDocument.
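On .NET Framework 4 and later, ProhibitDtd is marked obsolete in favor of the DtdProcessing setting. A sketch of a well-formedness check that skips the DOCTYPE instead of rejecting it could look like this (WellFormedCheck and IsWellFormedXml are names invented for the example):

```csharp
using System.IO;
using System.Xml;

public static class WellFormedCheck
{
    // Returns true if the input parses as well-formed XML without
    // fetching any external DTD.
    public static bool IsWellFormedXml(string input)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Ignore, // skip the DOCTYPE instead of processing it
            XmlResolver = null,                   // do not resolve external resources
            ValidationType = ValidationType.None
        };

        try
        {
            using (var reader = XmlReader.Create(new StringReader(input), settings))
            {
                while (reader.Read())
                {
                    // Reading to the end is enough to trigger any parse error.
                }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

One caveat with ignoring the DTD: entities that only the DTD defines (such as &nbsp; in XHTML) are then unknown to the reader, so input that uses them is reported as not well-formed.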
I am not sure about the reason behind the problem, but have you tried the XDocument and XElement classes in System.Xml.Linq?
XDocument document = XDocument.Load(strINPUT , LoadOptions.None);
XElement element = XElement.Load(strINPUT );
EDIT: for XML as a string, try the following
XDocument document = XDocument.Parse(strINPUT , LoadOptions.None );
Use XmlDocument's Load method to load the XML document, use XmlNodeList to get at the elements, then retrieve the data ...
Try the following:
XmlDocument xmlDoc = new XmlDocument();
//use the Load method to load the XML document from the specified file.
xmlDoc.Load("myXMLDoc.xml");
//Use the method GetElementsByTagName() to get elements that match the specified name.
XmlNodeList item = xmlDoc.GetElementsByTagName("item");
XmlNodeList url = xmlDoc.GetElementsByTagName("url");
Console.WriteLine("The item is: " + item[0].InnerText);
Add a try/catch block around the above code, see what you catch, and modify your code to address that situation.
How do I know if my XML file has data besides the namespace info?
Some of the files contain only this:
<?xml version="1.0" encoding="UTF-8"?>
And if I encounter such a file, I want to place the file in an error directory.
You could use the XmlReader to avoid the overhead of XmlDocument. In your case, you will receive an exception because the root element is missing.
string xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
using (StringReader strReader = new StringReader(xml))
{
//You can replace the StringReader object with the path of your xml file.
//In that case, do not forget to remove the "using" lines above.
using (XmlReader reader = XmlReader.Create(strReader))
{
try
{
while (reader.Read())
{
}
}
catch (XmlException ex)
{
//Catch xml exception
//in your case: root element is missing
}
}
}
You can add a condition in the while(reader.Read()) loop that exits after the first nodes have been checked, to avoid reading the entire XML file, since you just want to check whether the root element is missing.
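One way to stop early, sketched under the assumption that only the presence of a root element matters: MoveToContent() skips the declaration, comments and whitespace and stops at the first content node, so nothing beyond the root's start tag is read. The class and method names are hypothetical.

```csharp
using System.IO;
using System.Xml;

public static class RootElementCheck
{
    // Returns true as soon as a root element start tag is found; returns
    // false if the reader hits the end (or an error) before finding one.
    public static bool HasRootElement(TextReader source)
    {
        try
        {
            using (var reader = XmlReader.Create(source))
            {
                return reader.MoveToContent() == XmlNodeType.Element;
            }
        }
        catch (XmlException)
        {
            // For a declaration-only file this is "Root element is missing".
            return false;
        }
    }
}
```

Because it stops at the root's start tag, this does not tell you whether the rest of the file is well-formed.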
I think the only way is to catch an exception when you try and load it, like this:
try
{
System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
doc.Load(Server.MapPath("XMLFile.xml"));
}
catch (System.Xml.XmlException xmlEx)
{
if (xmlEx.Message.Contains("Root element is missing"))
{
// Xml file is empty
}
}
Yes, there is some overhead, but you should be performing sanity checks like this anyway. You should never trust input and the only way to reliably verify it is XML is to treat it like XML and see what .NET says about it!
XmlDocument xDoc = new XmlDocument();
xDoc.Load("test.xml"); // throws XmlException if the root element is missing
if (xDoc.ChildNodes.Count == 0)
{
// XML document is empty
}
if (xDoc.ChildNodes.Count == 1)
{
// XML document contains only the declaration node (if you are sure that the declaration is always at the beginning)
}
if (xDoc.ChildNodes.Count > 1)
{
// there is a declaration + n nodes (usually this count is 2: declaration + root node)
}
Haven't tried this...but should work.
try
{
XmlDocument doc = new XmlDocument();
doc.Load("test.xml");
}
catch (XmlException exc)
{
//invalid file
}
EDIT: Based on feedback comments
For large XML documents see Thomas's answer. This approach can have performance issues.
But, if it is a valid xml and the program wants to process it then this approach seems better.
If you aren't worried about validity, just check to see if there is anything after the first ?>. I'm not entirely sure of the C# syntax (it's been too long since I used it), but read the file, look for the first instance of ?>, and see if there is anything after that index.
However, if you want to use the XML later or you want to process the XML later, you should consider PK's answer and load the XML into an XmlDocument object. But if you have large XML documents that you don't need to process, then a solution more like mine, reading the file as text, might have less overhead.
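In C#, the text-only check described above might be sketched as follows. It assumes the text has already been read into a string, and the class and method names are invented for the example.

```csharp
using System;

public static class DeclarationOnlyCheck
{
    // Returns true if anything other than whitespace follows the first "?>".
    public static bool HasContentAfterDeclaration(string text)
    {
        int end = text.IndexOf("?>", StringComparison.Ordinal);

        // No declaration found: inspect the whole text instead.
        string rest = end < 0 ? text : text.Substring(end + 2);

        return rest.Trim().Length > 0;
    }
}
```

For large files it should be enough to pass in only the first few kilobytes, since the declaration sits at the very start; how much to read is a judgment call.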
You could check if the XML document has a node (the root node) and check if that node has inner text or other children.
As long as you aren't concerned with the validity of the XML document, and only want to ensure that it has a tag other than the declaration, you could use simple text processing:
var regEx = new Regex("<[A-Za-z]");
bool foundTags = false;
string curLine = "";
using (var reader = new StreamReader(fileName)) {
while (!reader.EndOfStream) {
curLine = reader.ReadLine();
if (regEx.IsMatch(curLine)) {
foundTags = true;
break;
}
}
}
if (!foundTags) {
// file is bad, copy.
}
Keep in mind that there are a million other reasons that the file may be invalid, and the code above would accept a file consisting only of "<a". If your intent is to verify that the XML document can actually be read, you should use the XmlDocument approach.