XmlDocument file name with space - c#

I have an XML file that contains Windows 7 service names. One of the service names contains white space (i.e. "service name"), and I get an exception when I load the file:
fileName = file;
pathToFile = path;
XmlDocument ServerList = new XmlDocument();
ServerList.Load(pathToFile + fileName);
the XML:
<systems>
<Groups>
<Myervices>
<Dialogic/>
<BoardServer/>
<HmpElements/>
<Service-1 Agent/>
</Myervices>
</Groups>
</systems>
The service name contains the white space. Is there a way to read it anyway? I cannot change the service name.
the exception I get:
'/' is an unexpected token. The expected token is '='. Line 824, position 23.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
at System.Xml.XmlTextReaderImpl.ThrowUnexpectedToken(String expectedToken1, String expectedToken2)
at System.Xml.XmlTextReaderImpl.ParseAttributes()
at System.Xml.XmlTextReaderImpl.ParseElement()
at System.Xml.XmlTextReaderImpl.ParseElementContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace)
at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.Load(String filename)
at Stop_Start_systems.Functions..ctor(String path, String file) in c:\Stop_Start_systems\Functions.cs:line 32
at Stop_Start_systems.Default.Page_Load(Object sender, EventArgs e) in c:\Stop_Start_systems\Default.aspx.cs:line 31
System.Collections.ListDictionaryInterna
Thanks

The problem has nothing to do with the name of the XML file, or the code you posted. It has everything to do with the XML being invalid. XML element names can't contain spaces, so this isn't valid:
<Service-1 Agent/>
Instead, you should use the same element name for all services, putting the service name into an attribute instead, e.g.
<Service Name="Service-1 Agent" />
<Service Name="Some other service" />
etc. I would strongly advise you to create the XML file automatically using an API instead of by hand - that way you're much more likely to end up with valid XML.
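As a rough illustration of generating the file through an API, here is a minimal sketch using LINQ to XML. The element names mirror the question's sample; the service list, the output path services.xml, and the class name are assumptions made up for the example.

```csharp
using System;
using System.Xml.Linq;

class BuildServiceList
{
    static void Main()
    {
        // Hypothetical service names; in practice they would come from your own source.
        string[] serviceNames = { "Dialogic", "BoardServer", "HmpElements", "Service-1 Agent" };

        var services = new XElement("Myervices");
        foreach (string name in serviceNames)
        {
            // The name is stored as an attribute value, where spaces are allowed.
            services.Add(new XElement("Service", new XAttribute("Name", name)));
        }

        var doc = new XDocument(new XElement("systems", new XElement("Groups", services)));
        doc.Save("services.xml"); // assumed output path

        // Reading the file back: each attribute value keeps its space.
        foreach (XElement service in XDocument.Load("services.xml").Descendants("Service"))
        {
            Console.WriteLine((string)service.Attribute("Name"));
        }
    }
}
```

Because the API writes the markup, the service names never have to be legal element names.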

Related

XML Deserialization C# gives error for valid document

I have a lot of XML files with the same structure. Many of them work fine, but for some of them XmlSerializer throws an error, even though an XML validator says the document is correct.
Deserialization code:
var document = serializer.Deserialize(File.OpenRead(file));
Error:
System.InvalidOperationException: There is an error in XML document (504, 8). ---> System.Xml.XmlException: Unexpected node type Element. ReadElementString method can only be called on elements with simple or empty content. Line 504, position 8.
at System.Xml.XmlReader.ReadElementString()
at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderPatentdocument.Read33_Claimtext(Boolean isNullable, Boolean checkType)
at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderPatentdocument.Read34_Claim(Boolean isNullable, Boolean checkType)
at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderPatentdocument.Read35_Claims(Boolean isNullable, Boolean checkType)
at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderPatentdocument.Read43_Patentdocument(Boolean isNullable, Boolean checkType)
at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderPatentdocument.Read44_patentdocument()
--- End of inner exception stack trace ---
at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
at System.Xml.Serialization.XmlSerializer.Deserialize(Stream stream)
The part of the document where it gives the error:
<text>12. Führungsschiene nach einem der Ansprüche 2 bis 11, dadurch gekennzeichnet, daß in den beiden Nutwänden (<b>11<i>a</i>, 11</b><i>a′)</i> einander gegenüberliegende Bohrungen (<b>14</b><i>a</i>, <b>14</b><i>a</i>′) vorgesehen sind, von denen die eine Bohrung (<b>14</b><i>a</i>′) durch das Einsatzteil (<b>15</b><i>a)</i> ver­schlossen ist.</text>
I suppose the cause is the inline HTML tags, because the error position points at the i tag in this fragment:
<b>11<i>a</i>, 11</b>
But for example this xml is correct according to XmlSerializer and it is possible to deserialize it:
<text>9. Führungsschiene nach Anspruch 8, dadurch gekennzeichnet, daß der Ansatz (<b>20</b>) die Zuführfläche (<b>25</b>) aufweist.</text>
So my question is: why does the XML validator say the document is valid while XmlSerializer cannot deserialize it? Is there a workaround that does not require changing the document?
You're right to point at the inline HTML tags.
Your XML is well-formed, which is why the validator accepts it, but the text element contains child elements mixed in with its text. XmlSerializer maps it to a simple string and expects text-only content, so it throws when it reaches a nested tag.
If you generate the XML files yourself, you have to escape the markup inside such elements beforehand, either:
with HTML encoding
or by wrapping it in a CDATA section (<![CDATA[...]]>)
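A minimal sketch of the CDATA option, assuming you control how the files are written. The Claim class, its Text property, and the sample string are hypothetical stand-ins for the real document types.

```csharp
using System;
using System.IO;
using System.Xml.Linq;
using System.Xml.Serialization;

// Hypothetical type standing in for the real claim class.
public class Claim
{
    [XmlElement("text")]
    public string Text { get; set; }
}

class CDataDemo
{
    static void Main()
    {
        string markup = "Ansatz (<b>20</b>) mit <i>a</i>";

        // XCData wraps the markup so the parser sees character data, not child elements.
        var element = new XElement("Claim", new XElement("text", new XCData(markup)));
        string xml = element.ToString();

        var serializer = new XmlSerializer(typeof(Claim));
        using (var reader = new StringReader(xml))
        {
            var claim = (Claim)serializer.Deserialize(reader);
            // The string property receives the markup as plain text.
            Console.WriteLine(claim.Text);
        }
    }
}
```

This only helps when you produce the files; it does not change how existing documents with inline tags are read.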
Try serializing the instance that is causing you problems. Then you can compare the output of the serialization with the contents of the file you are attempting to deserialize. The difference between the two XML strings will show you where the problem is.
Here is a quick function to serialize an instance of a class to XML:
public static string Serialize<T>(T entity)
{
    if (entity == null)
        return String.Empty;
    try
    {
        XmlSerializer XS = new XmlSerializer(typeof(T));
        System.IO.StringWriter SW = new System.IO.StringWriter();
        XS.Serialize(SW, entity);
        return SW.ToString();
    }
    catch (Exception e)
    {
        Logging.Log(Severity.Error, "Unable to serialize entity", e);
        return String.Empty;
    }
}
If you haven't tried it yet, I would suggest the software BeyondCompare to easily see the difference between the two files.
Suppose we have the following class:
public class Foo
{
    //[XmlIgnore]
    public string Text { get; set; }
}
And xml of the following form:
<Foo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<text>12. Führungsschiene nach einem der Ansprüche 2 bis 11, dadurch gekennzeichnet, daß in den beiden Nutwänden (<b>11<i>a</i>, 11</b><i>a′)</i> einander gegenüberliegende Bohrungen (<b>14</b><i>a</i>, <b>14</b><i>a</i>′) vorgesehen sind, von denen die eine Bohrung (<b>14</b><i>a</i>′) durch das Einsatzteil (<b>15</b><i>a)</i> ver­schlossen ist.</text>
</Foo>
Then we can deserialize the data as follows.
var xs = new XmlSerializer(typeof(Foo));
xs.UnknownElement += Xs_UnknownElement;
Foo foo;
using (var fs = new FileStream("test.txt", FileMode.Open))
{
    foo = (Foo)xs.Deserialize(fs);
}
The code above subscribes to the XmlSerializer's UnknownElement event.
In the event handler, we set the property manually from the element's inner XML.
private static void Xs_UnknownElement(object sender, XmlElementEventArgs e)
{
    var foo = (Foo)e.ObjectBeingDeserialized;
    foo.Text = e.Element.InnerXml;
}
Please note that the event fires only when the property name does not match the XML element name (the comparison is case-sensitive); here the property is Text and the element is text. If the names do match, mark the property with the XmlIgnore attribute.

XDocument Invalid Characters On Load - '\v', hexadecimal value 0x0B, is an invalid character

I am downloading some XML content from the Adobe Connect API. I am loading the content into a XDocument and reading through all of the sco elements to save them to the database. However, one of the calls to the API contains an invalid character that gives the exception:
System.Xml.XmlException: '', hexadecimal value 0x0B, is an invalid character. Line 2, position 6495.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
at System.Xml.XmlTextReaderImpl.ParseText()
at System.Xml.XmlTextReaderImpl.ParseElementContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.Linq.XContainer.ReadContentFrom(XmlReader r)
at System.Xml.Linq.XContainer.ReadContentFrom(XmlReader r, LoadOptions o)
at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
at System.Xml.Linq.XDocument.Load(XmlReader reader)
at ACRS.DataRefresherApp.Program.GetFolderContents(Folder parentFolder, AcrsDbContext db) in xxx:line 164
Here is a sample of the XML coming from the Adobe Connect API. Note: this example does not contain an invalid character.
<?xml version="1.0"?>
<results>
<status code="ok"/>
<scos>
<sco is-folder="1" duration="" display-seq="0" icon="folder" type="folder" folder-id="xx" source-sco-id="" sco-id="xx">
<name>Shared Templates</name>
<url-path>/f1101964883/</url-path>
<date-created>2010-09-16T15:21:15.993+10:00</date-created>
<date-modified>2013-12-11T22:31:05.130+11:00</date-modified>
<is-seminar>false</is-seminar>
</sco>
.....
</scos>
</results>
Here is the code I am using to read/load the XML data.
Stream responseStream = response.GetResponseStream();
XmlReader xmlReader = XmlReader.Create(responseStream, new XmlReaderSettings() { CheckCharacters = false });
var xmlResponse = XDocument.Load(xmlReader);
var folders = xmlResponse.Elements("results").Elements("scos").Elements("sco").ToList();
The exception occurs when the XDocument attempts to load the data from the xmlReader.
var xmlResponse = XDocument.Load(xmlReader);
I realise that I do not need the XmlReader and can load the XDocument directly from the stream. However, I have included the XmlReader in response to this blog post by Paul Selles.
I have already read this thread:
How to prevent System.Xml.XmlException: Invalid character in the given encoding
However, this does not fix my problem. Apparently, XML standards cause the reader to default to the declared document encoding once the document is being read. In the case of my document where no declaration is being made, it should default to UTF-8. See this answer.

nbsp issue reading XML

There are several posts on this issue but none of the solutions seem to work for me. Not sure what I'm doing wrong.
I'm trying to read an XML file that looks like this:
<?xml version="1.0"?>
<content languageCode="en" languageID="1">
<ftypeparser>c:\users\pdeoliveira\Documents\Xmlparsernew.xml</ftypeparser>
<xmlparser>c:\users\pdeoliveira\Documents\Xmlparser.xml</xmlparser>
<sourcelocation>c:\localization\2015</sourcelocation>
<rpkfile>RPK_DefaultSyllabi_2015.xml</rpkfile>
<loglocation>c:\users\pdeoliveira\Documents\Buildloc_Log.txt</loglocation>
<losefiles><locfile>c:\localization\2015\Strings.xml</locfile></losefiles>
</content>
Here's the code that reads it:
XmlDocument doc = new XmlDocument();
string filetoload = "config.xml";
try
{
    doc.Load(filetoload);
}
catch (Exception e)
{
    Console.WriteLine(e);
}
And here's the exception I get:
System.Xml.XmlException: Reference to undeclared entity 'nbsp'. Line 1960, position 12.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.HandleGeneralEntityReference(String name, Boolean isInAttributeValue, Boolean pushFakeEntityIfNullResolver, Int32 entityStartLinePos)
at System.Xml.XmlTextReaderImpl.ResolveEntity()
at System.Xml.XmlLoader.LoadEntityReferenceNode(Boolean direct)
at System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace)
at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.Load(String filename)
at ConsoleApplication9.Program.getrestype(String fname) in c:\Users\pdeoliveira\Documents\Visual Studio 2013\Projects\BuildLoc\BuildLoc\Program.cs:line 1067
I tried declaring the nbsp entity, as suggested in other posts:
<!DOCTYPE doctypeName [
<!ENTITY nbsp " ">
]>
That didn't work either. The interesting thing is that my file is only 10 or so lines long and it points to an issue on line 1960...??
Any help would be appreciated.

Exact way to receive an XML by C# Web service

I appreciate your help.
I have created a web service that receives an XML file using the approach below. I published it and it worked fine for me:
....
XmlDocument xmldoc = new XmlDocument();
try
{
    if (HttpContext.Current.Request.InputStream != null)
    {
        StreamReader stream = new StreamReader(HttpContext.Current.Request.InputStream);
        string xmls = stream.ReadToEnd();
        xmldoc.LoadXml(xmls);
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
    }
}
catch (Exception ex)
{
    logger.Log(NLog.LogLevel.Error, ex.Message + ex.StackTrace);
}
...
My XML structure is:
<reports uis="5521452542">
<attribute1>val1</attribute1>
...
</reports>
However, when some friends tested it by calling my web service from a Linux platform, the error below appeared in my log file, even though their XML file validates.
Note that their XML file did not contain this declaration:
<?xml version="1.0" encoding="UTF-8"?>
Could that cause the error?
2014-04-03 03:56:53.7408|Error|Root element is missing.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.LoadXml(String xml)
at WebService.Service1.GetInfoService() in D:\yassine\Mobily\Log\WebService\WebService\WebService\Service1.asmx.cs:line 56
2014-04-03 03:56:53.8032|Error|Root element is missing.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
at WebService.Service1.GetInfoService() in D:\yassine\Mobily\Log\WebService\WebService\WebService\Service1.asmx.cs:line 71
Can you please help me find the exact error?
Thank you
The exception says exactly what's wrong: you are receiving invalid XML that has no root element. Ask your friends to send you the raw XML by mail so you can see what they're sending you.
You can use Altova XmlSpy to verify that the XML is valid.
A very basic but valid XML document would be:
<root>
    <child></child>
</root>

parsing almost well formed XML fragments: how to skip over multiple XML headers

I’m required to write a tool that can handle the XML fragment below, which is not well-formed because it contains XML declarations in the middle of the stream.
The company has been using these kinds of files for a long time, so changing the format is not an option.
No source code is available for the existing parser, and the platform of choice for new tooling is .NET 4 or newer, preferably with C#.
This is what the fragments look like:
<Header>
<Version>1</Version>
</Header>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
Using an XmlReader with the XmlReaderSettings.ConformanceLevel set to ConformanceLevel.Fragment, I can read the complete <Header> element fine.
Even the start of the <Entry> element is OK. However, while reading the <Detail> info, the XmlReader throws an XmlException, because it reads the <?xml...?> XML declaration, which it doesn't expect at that place.
What options do I have to skip over those XML declarations, besides heavy string manipulation?
Since the fragments can easily exceed 100 megabytes apiece, I'd rather not load everything into memory at once. But if that is what it takes, I am open to it.
Example of the exceptions I get:
System.Xml.XmlException: Unexpected XML declaration.
The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
Line ##, position ##.
I don't think the built-in classes will help; you'll probably have to do some preparsing and remove the extra headers. If your sample is accurate, you can just do badXml.Replace("<?xml version=\"1.0\"?>", "") and be on your way.
If you are unsure that the declarations stay the same all the time, replace <?xml with <XmlDeclaration and ?> with /> and use a regular parser ;)
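The substitution suggested above could look roughly like the sketch below. It is written for a small in-memory sample (the sample string and class name are made up) and assumes that "?>" only ever closes an XML declaration in the input, i.e. there are no other processing instructions.

```csharp
using System;
using System.IO;
using System.Xml;

class EmbeddedDeclarationDemo
{
    static void Main()
    {
        // Made-up sample in the same shape as the fragments in the question.
        string badXml =
            "<Header><Version>1</Version></Header>\n" +
            "<Entry><?xml version=\"1.0\"?><Detail>a</Detail></Entry>\n" +
            "<Entry><?xml version=\"1.0\"?><Detail>b</Detail></Entry>";

        // Turn each embedded declaration into an ordinary empty element:
        // <?xml version="1.0"?> becomes <XmlDeclaration version="1.0"/>.
        string patched = badXml.Replace("<?xml", "<XmlDeclaration").Replace("?>", "/>");

        // Fragment conformance allows several top-level elements.
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using (XmlReader reader = XmlReader.Create(new StringReader(patched), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Detail")
                {
                    Console.WriteLine(reader.ReadElementContentAsString());
                }
            }
        }
    }
}
```

For files in the 100 megabyte range, the same replacement would have to be applied while streaming (for example line by line into a temporary file) rather than on one large string.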
Also, have you tried passing the files through an XML tidy style program?
There might also be an SGML library you can use to preprocess the data and output correct XML.
I added this as an answer because it preserves syntax highlighting.
private void ProcessFile(string inputFileName, string outputFileName)
{
    using (StreamReader reader = new StreamReader(inputFileName, new UTF8Encoding(false, true)))
    {
        using (StreamWriter writer = new StreamWriter(outputFileName, false, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                const string xmlDeclarationStart = "<?xml";
                const string xmlDeclarationFinish = "?>";
                if (line.Contains(xmlDeclarationStart))
                {
                    string newLine = line.Substring(0, line.IndexOf(xmlDeclarationStart));
                    int endPosition = line.IndexOf(xmlDeclarationFinish, line.IndexOf(xmlDeclarationStart));
                    if (endPosition == -1)
                    {
                        throw new NotImplementedException(string.Format("Implementation assumption is wrong. {0} .. {1} spans multiple lines (or input file is severely misformed)", xmlDeclarationStart, xmlDeclarationFinish));
                    }
                    // the code completely strips the <?xml ... ?> part
                    // an alternative would be to make this a new XML element containing
                    // the information inside the <?xml ... ?> part as attributes
                    // just like Daren Thomas suggested
                    newLine += line.Substring(endPosition + 2);
                    line = newLine;
                }
                writer.WriteLine(line);
            }
        }
    }
}
