Modifying Word Document XML using Packages - c#

I am attempting to modify a simple MS Word template's XML. I realize there are SDKs available that could make this process easier, but what I am tasked with maintaining uses packages and I was told to do the same.
I have a basic test document with two placeholders mapped to the following XML:
<root>
<element>
Fubar
</element>
<second>
This is the second placeholder
</second>
</root>
What I am doing is creating a stream using the word doc, removing the existing XML, getting some hard coded test XML and trying to write that to the stream.
Here is the code I am using:
string strRelRoot = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
byte[] buffer = File.ReadAllBytes("dev.docx");
//stream with the template
MemoryStream stream = new MemoryStream(buffer, true);
//create a package using the stream
Package package = Package.Open(stream, FileMode.Open, FileAccess.ReadWrite);
PackageRelationshipCollection pkgrcOfficeDocument = package.GetRelationshipsByType(strRelRoot);
foreach (PackageRelationship pkgr in pkgrcOfficeDocument)
{
if (pkgr.SourceUri.OriginalString == "/")
{
Uri uriData = new Uri("/customXML/item1.xml", UriKind.Relative);
//remove the existing part
if (package.PartExists(uriData))
{
// Delete template "/customXML/item1.xml" part
package.DeletePart(uriData);
}
//create a new part
PackagePart pkgprtData = package.CreatePart(uriData, "application/xml");
//test data
string xml = @"<root>
<element>
Changed
</element>
<second>
The second placeholder changed
</second>
</root>";
//stream created from the xml string
MemoryStream fromStream = new MemoryStream();
UnicodeEncoding uniEncoding = new UnicodeEncoding();
byte[] fromBuffer = uniEncoding.GetBytes(xml);
fromStream.Write(fromBuffer, 0, fromBuffer.Length);
fromStream.Seek(0L, SeekOrigin.Begin);
Stream toStream = pkgprtData.GetStream();
//copy the xml to the part stream
fromStream.CopyTo(toStream);
//copy part stream to the byte stream
toStream.CopyTo(stream);
}
}
This is currently not modifying the document although I feel like I am close to a solution. Any advice would be very much appreciated. Thanks!
Edit: To clarify, the result I am getting is that the document is unchanged. I get no exceptions or the like, but the document's XML is not modified.

OK, so not quite the timely response I promised, but here goes!
There are several aspects to the problem. Sample code is from memory and documentation, not necessarily compiled and tested.
Read the template XML
Before you delete the package part containing the template XML, you need to open its stream and read the XML. How you get the XML if the part doesn't exist to begin with is up to you.
My example code uses classes from the LINQ to XML API, though you could use whichever set of XML APIs you prefer.
XElement templateXml = null;
using (Stream stream = package.GetPart(uriData).GetStream())
templateXml = XElement.Load(stream);
// Now you can delete the part.
At this point you have an in-memory representation of the template XML in templateXml.
Substitute values into the placeholders
templateXml.SetElementValue("element", "Replacement value of first placeholder");
templateXml.SetElementValue("second", "Replacement value of second placeholder");
Check out the methods on XElement if you need to do anything more advanced than this, e.g. read the original content in order to determine the replacement value.
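As a minimal sketch of that more advanced case, the snippet below reads each placeholder's original text and derives the replacement from it. The names PlaceholderSketch and FillPlaceholders and the dictionary of values are hypothetical, introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class PlaceholderSketch
{
    // Replaces the text of each direct child of the root element whose name
    // has an entry in 'values'. Elements without an entry are left untouched.
    public static XElement FillPlaceholders(XElement templateXml, IDictionary<string, string> values)
    {
        foreach (XElement child in templateXml.Elements())
        {
            string replacement;
            if (values.TryGetValue(child.Name.LocalName, out replacement))
            {
                // child.Value is the original placeholder text; it is trimmed
                // because the template puts the text on its own line.
                string original = child.Value.Trim();
                child.Value = replacement + " (was: " + original + ")";
            }
        }
        return templateXml;
    }
}
```

Called with the question's template and a dictionary containing only "element", this would change the first placeholder and leave "second" as it was.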
Save the document
This is your original code, modified and annotated.
// The very first thing to do is create the Package in a using statement.
// This makes sure it's saved and closed when you're done.
using (Package package = Package.Open(...))
{
// XML reading, substituting etc. goes here.
// Eventually...
//create a new part
PackagePart pkgprtData = package.CreatePart(uriData, "application/xml");
// Don't need the test data anymore.
// Assuming you need UnicodeEncoding, set it up like this.
var writerSettings = new XmlWriterSettings
{
Encoding = Encoding.Unicode,
};
// Shouldn't need a MemoryStream at all; write straight to the part stream.
// Note using statements to ensure streams are flushed and closed.
using (Stream toStream = pkgprtData.GetStream())
using (XmlWriter writer = XmlWriter.Create(toStream, writerSettings))
templateXml.Save(writer);
// No other copying should be necessary.
// In particular, your toStream.CopyTo(stream) appeared
// to be appending the part's data to the package's stream
// (the physical file), which is a bug.
} // This closes the using statement for the package, which saves the file.
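Putting the three steps together, a sketch of the whole round trip might look like the following. It assumes a reference to System.IO.Packaging (WindowsBase on .NET Framework, the System.IO.Packaging NuGet package elsewhere); the class and method names are hypothetical, and the code has not been run against a real Word template.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Xml.Linq;

public static class CustomXmlPartSketch
{
    // Hypothetical helper: rewrites /customXML/item1.xml inside the package
    // held by 'packageStream', which must be seekable, writable and resizable.
    public static void ReplacePlaceholders(Stream packageStream, string first, string second)
    {
        Uri uriData = new Uri("/customXML/item1.xml", UriKind.Relative);

        using (Package package = Package.Open(packageStream, FileMode.Open, FileAccess.ReadWrite))
        {
            // 1. Read the template XML before touching the part.
            XElement templateXml;
            using (Stream input = package.GetPart(uriData).GetStream(FileMode.Open, FileAccess.Read))
                templateXml = XElement.Load(input);

            // 2. Substitute the placeholder values.
            templateXml.SetElementValue("element", first);
            templateXml.SetElementValue("second", second);

            // 3. Replace the part and write the XML straight to its stream.
            package.DeletePart(uriData);
            PackagePart part = package.CreatePart(uriData, "application/xml");
            using (Stream output = part.GetStream())
                templateXml.Save(output);
        } // Disposing the package writes the changes into packageStream.
    }
}
```

One detail worth checking in the question's code: a MemoryStream constructed from an existing byte array, as in new MemoryStream(buffer, true), cannot grow. Creating an empty MemoryStream and writing the file's bytes into it gives the package a resizable stream, and stream.ToArray() afterwards yields the modified document.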

Related

Open XML WordprocessingDocument with MemoryStream is 0KB

I am trying to learn how to use Microsoft's Open XML SDK. I followed their steps on how to create a Word document using a FileStream and it worked perfectly. Now I want to create a Word document but only in memory, and wait for the user to specify whether they would like to save the file or not.
This document by Microsoft says how to deal with in-memory documents using MemoryStream; however, the document is first loaded from an existing file and "dumped" into a MemoryStream. What I want is to create a document entirely in memory (not based on a file on a drive). What I thought would achieve that was this code:
// This is almost the same as Microsoft's code except I don't
// dump any files into the MemoryStream
using (var mem = new MemoryStream())
{
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
using (var file = new FileStream(destination, FileMode.CreateNew))
{
mem.WriteTo(file);
}
}
}
But the result is a file that is 0KB and that can't be read by Word. At first I thought it was because of the size of the MemoryStream, so I provided it with an initial size of 1024, but the results were the same. On the other hand, if I change the MemoryStream for a FileStream it works perfectly.
My question is whether what I want to do is possible, and if so, how? I guess it must be possible, just not how I'm doing it. If it isn't possible what alternative do I have?
There's a couple of things going on here:
First, unlike Microsoft's sample, I was nesting the using block that writes the file to disk inside the block that creates and modifies the document. The WordprocessingDocument is not saved to the stream until it is disposed or the Save() method is called, and it is disposed automatically at the end of its using block. Had I not nested the third using statement, the second using block would have ended before I tried to save the file, and the document would have been written to the MemoryStream. Instead I was writing a still-empty stream to disk (hence the 0KB file).
I suppose calling Save() might have helped, but it is not supported by .NET Core (which is what I'm using). You can check whether Save() is supported on your system by checking CanSave.
/// <summary>
/// Gets a value indicating whether saving the package is supported by calling <see cref="Save"/>. Some platforms (such as .NET Core), have limited support for saving.
/// If <c>false</c>, in order to save, the document and/or package needs to be fully closed and disposed and then reopened.
/// </summary>
public static bool CanSave { get; }
So the code ended up being almost identical to Microsoft's code except I don't read any files beforehand, rather I just begin with an empty MemoryStream:
using (var mem = new MemoryStream())
{
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
}
using (var file = new FileStream(destination, FileMode.CreateNew))
{
mem.WriteTo(file);
}
}
Also, you don't need to reopen the document before saving it, but if you do, remember to use Open() instead of Create(), because Create() will empty the MemoryStream and you'll again end up with a 0KB file.
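The ordering issue can be reproduced without the Open XML SDK. The sketch below uses ZipArchive from System.IO.Compression as a stand-in for WordprocessingDocument (a .docx file is a ZIP container); whether the SDK buffers in exactly the same way is an assumption, and the class and method names are hypothetical. It records the stream length while the archive is still open and again after it has been disposed.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DisposeBeforeCopySketch
{
    // Returns (length while the archive is open, length after it is disposed).
    public static Tuple<long, long> Measure()
    {
        using (var mem = new MemoryStream())
        {
            long whileOpen;
            // leaveOpen: true keeps 'mem' usable after the archive is disposed.
            using (var zip = new ZipArchive(mem, ZipArchiveMode.Update, true))
            {
                ZipArchiveEntry entry = zip.CreateEntry("word/document.xml");
                using (var writer = new StreamWriter(entry.Open(), Encoding.UTF8))
                    writer.Write("<document/>");

                // Copying 'mem' to a file at this point would lose data,
                // which is the mistake made by the nested using block.
                whileOpen = mem.Length;
            }
            // Only now is the archive complete in 'mem'.
            return Tuple.Create(whileOpen, mem.Length);
        }
    }
}
```

The second length should be larger than the first, which is the same reason mem.WriteTo(file) has to come after the document's using block.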
You're passing mem to WordprocessingDocument.Create(), which is creating the document from the (now-empty) MemoryStream, however, I don't think that is associating the MemoryStream as the backing store of the document. That is, mem is only the input of the document, not the output as well. Therefore, when you call mem.WriteTo(file);, mem is still empty (the debugger would confirm this).
Then again, the linked document does say "you must supply a resizable memory stream to [Open()]", which implies that the stream will be written to, so maybe mem does become the backing store but nothing has been written to it yet because the AutoSave property (for which you specified true in Create()) hasn't had a chance to take effect yet (emphasis mine)...
Gets a flag that indicates whether the parts should be saved when disposed.
I see that WordprocessingDocument has a SaveAs() method, and substituting that for the FileStream in the original code...
using (var mem = new MemoryStream())
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
// Explicitly close the OpenXmlPackage returned by SaveAs() so destination doesn't stay locked
doc.SaveAs(destination).Close();
}
...produces the expected file for me. Interestingly, after the call to doc.SaveAs(), and even if I insert a call to doc.Save(), mem.Length and mem.Position are both still 0, which does suggest that mem is only used for initialization.
One other thing I would note is that the sample code is calling Open(), whereas you are calling Create(). The documentation is pretty sparse as far as how those two methods differ, but I would have suggested you try creating your document with Open() instead...
using (MemoryStream mem = new MemoryStream())
using (WordprocessingDocument doc = WordprocessingDocument.Open(mem, true))
{
// ...
}
...however when I do that Open() throws an exception, presumably because mem has no data. So, it seems the names are somewhat self-explanatory in that Create() initializes new document data whereas Open() expects existing data. I did find that if I feed Create() a MemoryStream filled with random garbage...
using (var mem = new MemoryStream())
{
// Fill mem with garbage
byte[] buffer = new byte[1024];
new Random().NextBytes(buffer);
mem.Write(buffer, 0, buffer.Length);
mem.Position = 0;
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
// ...
}
}
...it still produces the exact same document XML as the first code snippet above, which makes me wonder why Create() even needs an input Stream at all.
I was facing the same problem today. The solution is to close the document so that the MemoryStream gets filled. Lance U. Matthews's example helped me a lot. After checking how other document types are exported, I finally realized that each one calls the Close method after being filled, but Microsoft's example doesn't show it. Here is the example:
private MemoryStream GenerateWord(DataTable dt)
{
MemoryStream mStream = new MemoryStream();
// Create Document
OpenXMLPackaging.WordprocessingDocument wordDocument =
OpenXMLPackaging.WordprocessingDocument.Create(mStream, OpenXML.WordprocessingDocumentType.Document, true);
// Add a main document part.
OpenXMLPackaging.MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
mainPart.Document = new OpenXMLWordprocessing.Document();
OpenXMLWordprocessing.Body body = mainPart.Document.AppendChild(new OpenXMLWordprocessing.Body());
OpenXMLWordprocessing.Table table = new OpenXMLWordprocessing.Table();
body.AppendChild(table);
OpenXMLWordprocessing.TableRow tr = new OpenXMLWordprocessing.TableRow();
foreach (DataColumn c in dt.Columns)
{
tr.Append(new OpenXMLWordprocessing.TableCell(new OpenXMLWordprocessing.Paragraph(new OpenXMLWordprocessing.Run(new OpenXMLWordprocessing.Text(c.ColumnName.ToString())))));
}
table.Append(tr);
foreach (DataRow r in dt.Rows)
{
if (dt.Rows.Count > 0)
{
OpenXMLWordprocessing.TableRow dataRow = new OpenXMLWordprocessing.TableRow();
for (int h = 0; h < dt.Columns.Count; h++)
{
dataRow.Append(new OpenXMLWordprocessing.TableCell(new OpenXMLWordprocessing.Paragraph(new OpenXMLWordprocessing.Run(new OpenXMLWordprocessing.Text(r[h].ToString())))));
}
table.Append(dataRow);
}
}
mainPart.Document.Save();
wordDocument.Close();
mStream.Position = 0;
return mStream;
}

How to read the text element of an XML node without dereferencing entities using XmlReader

I am attempting to read an XML document containing elements like the data mentioned below.
Accessing the text node via reader.Value, reader.ReadContentAsString(), reader.ReadContentAsObject() results in the value read being truncated to the last ampersand, so in the case of the data below that would be ISO^urn:ihe:iti:xds:2013:referral. Using XmlDocument the text nodes can be read properly so I am assuming there has to be a way to make this work using the reader as well.
<Slot name="urn:ihe:iti:xds:2013:referenceIdList">
<ValueList>
<Value>123456^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
<Value>098765^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
</ValueList>
</Slot>
Clarification Edit
After asking the question I was able to determine that my issue came from creating an XmlReader from an XPathNavigator instance created from a MessageBuffer executing in the context of a WCF service call. Thus @DarkGray's answer was correct for the original question but did not really address the root of the problem. I provided a second answer which addressed my corner case.
System.ServiceModel.Channels.Message message; // the inbound SOAP message
var buffer = message.CreateBufferedCopy(11 * 1024 * 1024);
var navigator = buffer.CreateNavigator();
var reader = navigator.ReadSubtree();
// advance the reader to the text element
//
// `reader.Value` now produces ISO^urn:ihe:iti:xds:2013:referral
Answer: reader.Value
Output:
123456^^^&orgID&ISO^urn:ihe:iti:xds:2013:referral
098765^^^&orgID&ISO^urn:ihe:iti:xds:2013:referral
Example:
public static void Execute()
{
var xml = @"
<Slot name='urn:ihe:iti:xds:2013:referenceIdList'>
<ValueList>
<Value>123456^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
<Value>098765^^^&amp;orgID&amp;ISO^urn:ihe:iti:xds:2013:referral</Value>
</ValueList>
</Slot>
";
var reader = System.Xml.XmlReader.Create(new System.IO.StringReader(xml));
for (; ; )
{
if (!reader.Read())
break;
if (reader.NodeType == System.Xml.XmlNodeType.Text)
Console.WriteLine(reader.Value);
}
}
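A variant of the same idea, sketched here as an assumption rather than as part of the original answer, collects each Value element with ReadElementContentAsString, which returns the element's full text content in one call. The class and method names are hypothetical, and this is not claimed to fix the WCF case described in the question's edit.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ValueReaderSketch
{
    // Returns the text of every <Value> element in document order.
    public static List<string> ReadValues(string xml)
    {
        var values = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Value")
                {
                    // Reads the content and moves past </Value>, so the loop
                    // must not call Read() again on this iteration.
                    values.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return values;
    }
}
```

The loop checks the current node before advancing so that adjacent Value elements with no whitespace between them are not skipped.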
My question ended up being too broad, as the incorrect behavior (truncation when using reader.Value) only manifested when the code was executing within the context of a WCF call. It worked perfectly fine when exercising the logic of the containing class from a unit test.
So the salient setup can be reproduced as follows
The Failing Code
System.ServiceModel.Channels.Message message; // the inbound SOAP message
var buffer = message.CreateBufferedCopy(11 * 1024 * 1024);
var navigator = buffer.CreateNavigator();
var reader = navigator.ReadSubtree();
// advance the reader to the text element
//
// `reader.Value` now produces ISO^urn:ihe:iti:xds:2013:referral
Once this reader instance was created, any XmlText node read from it produced the truncated value when the text contained a character entity reference. The only way I found that allowed the original value to be read with full fidelity was to eschew the XPathNavigator completely and instead take the hit of creating another Message instance. Note that the fix takes the long way around to write the SOAP envelope to the stream because the affected service uses MTOM encoding. Writing to the stream directly from the MessageBuffer resulted in the MIME fences being written out.
The Fix
System.ServiceModel.Channels.Message message; // the inbound SOAP message
var buffer = message.CreateBufferedCopy(MaxMessageSize);
// A fresh copy of the message; it needs its own name because 'message' is already declared.
var copy = buffer.CreateMessage();
using (MemoryStream stream = new MemoryStream())
using (XmlWriter writer = XmlWriter.Create(stream))
{
copy.WriteMessage(writer);
writer.Flush();
stream.Position = 0;
using (XmlReader reader = XmlReader.Create(stream))
{
// business logic goes here
}
}

Save object to file - What are common alternate serialization methods?

Edit: The original question was made out of a misunderstanding, and has been updated.
Coming from other languages it seems odd that C# does not seem to have a simple way to dump things like objects and components straight to file.
Let's take Java as an example, where I can dump any object to file and load it with no apparent restrictions:
//Save object to file:
FileOutputStream saveFile = new FileOutputStream(---filePath---);
ObjectOutputStream save = new ObjectOutputStream(saveFile);
save.writeObject(---YorObjectHere---);
//Load object from file
FileInputStream saveFile = new FileInputStream(---filePath---);
ObjectInputStream save = new ObjectInputStream(saveFile);
---ObjectType--- loadedObject = (---ObjectType---) save.readObject();
Can this sort of thing be easily achieved in C#?
I have tried the standard method of serialization:
//Save object to file:
IFormatter formatterSave = new BinaryFormatter();
Stream streamSave = new FileStream(---filePath---, FileMode.Create, FileAccess.Write, FileShare.None);
formatterSave.Serialize(streamSave, ---ObjectToSave---);
//Load object from file
IFormatter formatterLoad = new BinaryFormatter();
Stream streamLoad = new FileStream(---filePath---, FileMode.Open, FileAccess.Read, FileShare.Read);
---ObjectType--- ---LoadedObjectName--- = (---ObjectType---)formatterLoad.Deserialize(streamLoad);
While this method is quite simple to do, it does not always work with existing or locked code because the [Serializable] attribute cannot always be added.
So what is the best alternate for serialization?
Thanks to comments I tried the XML method, but it did not work either because it cannot serialize an ArrayList as shown on MSDN
//Save object to file:
var serializerSave = new XmlSerializer(typeof(objectType));
writer = new StreamWriter(filePath, append);
serializerSave.Serialize(writer, objectToWrite);
//Load object from file
var serializerLoad = new XmlSerializer(typeof(objectType));
reader = new StreamReader(filePath);
return (T)serializerLoad.Deserialize(reader);
It looks like JSON is the only way to do it easily, or is there another alternate way to serialize with normal C# libraries without needing massive amounts of code?
Thanks to those who commented and pointed me in the right direction.
Although it's not covered in the basic libraries, using JSON looks like the best alternative in this situation. I plan to use Json.NET, which can be found here: http://www.nuget.org/packages/Newtonsoft.Json
Another good option is Google Gson: https://code.google.com/p/google-gson/
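For a sense of how little code the JSON route needs, here is a minimal sketch. It uses System.Text.Json, which ships with recent .NET versions, as a stand-in for Json.NET; that substitution, the JsonFileSketch helper and the Player type are all assumptions made for illustration. With Json.NET the corresponding calls would be JsonConvert.SerializeObject and JsonConvert.DeserializeObject<T>.

```csharp
using System.IO;
using System.Text.Json;

public static class JsonFileSketch
{
    // Writes 'value' to 'path' as JSON text.
    public static void Save<T>(string path, T value)
    {
        File.WriteAllText(path, JsonSerializer.Serialize(value));
    }

    // Reads the JSON text at 'path' back into a T.
    public static T Load<T>(string path)
    {
        return JsonSerializer.Deserialize<T>(File.ReadAllText(path));
    }
}

// Hypothetical type for illustration; it carries no [Serializable] attribute.
// System.Text.Json serializes public properties by default, not fields.
public class Player
{
    public string Name { get; set; }
    public int Score { get; set; }
}
```

A round trip is then JsonFileSketch.Save(path, player) followed by JsonFileSketch.Load<Player>(path).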

OutOfMemoryException loading xml document winmo 6.1

I am using C# on a Windows Mobile 6.1 device with Compact Framework 3.5.
I am getting an OutOfMemoryException when loading a large string of XML.
The handset has limited memory, but should be more than enough to handle the size of the xml string. The string of xml contains the base64 contents of a 2 MB file. The code will work when the xml string contains files of up to 1.8 MB.
I am completely puzzled as to what to do. Not sure how to change any memory settings.
I have included a condensed copy of the code below. Any help is appreciated.
Stream newStream = myRequest.GetRequestStream();
// Send the data.
newStream.Write(data, 0, data.Length);
//close the write stream
newStream.Close();
// Get the response.
HttpWebResponse response = (HttpWebResponse)myRequest.GetResponse();
// Get the stream containing content returned by the server.
Stream dataStream = response.GetResponseStream();
//Process the return
//Set the buffer
byte[] server_response_buffer = new byte[8192];
int response_count = -1;
string tempString = null;
StringBuilder response_sb = new StringBuilder();
//Loop through the stream until it is all written
do
{
// Read content into a buffer
response_count = dataStream.Read(server_response_buffer, 0, server_response_buffer.Length);
// Translate from bytes to ASCII text
tempString = Encoding.ASCII.GetString(server_response_buffer, 0, response_count);
// Write content to a file from buffer
response_sb.Append(tempString);
}
while (response_count != 0);
responseFromServer = response_sb.ToString();
// Cleanup the streams and the response.
dataStream.Close();
response.Close();
}
catch {
MessageBox.Show("There was an error with the communication.");
comm_error = true;
}
if(comm_error == false){
//Load the xml file into an XML object
XmlDocument xdoc = new XmlDocument();
xdoc.LoadXml(responseFromServer);
}
The error occurs on the xdoc.LoadXml line. I have tried writing the stream to a file and then loading the file directly into the XmlDocument, but it was no better.
Completely stumped at this point.
I would recommend that you use the XmlTextReader class instead of the XmlDocument class. I am not sure what your requirements are for reading of the xml, but XmlDocument is very memory intensive as it creates numerous objects and attempts to load the entire xml string. The XmlTextReader class on the other hand simply scans through the xml as you read from it.
Assuming you have the string, this means you would do something like the following
String xml = "<someXml />";
using(StringReader textReader = new StringReader(xml)) {
using(XmlTextReader xmlReader = new XmlTextReader(textReader)) {
xmlReader.MoveToContent();
xmlReader.Read();
// the reader is now pointed at the first element in the document...
}
}
Have you tried loading from a stream instead of from a string? (This is different from writing to a stream, because in your example you are still trying to load it all at once into memory with the XmlDocument.)
There are other .NET components for XML files that work with the XML as a stream instead of loading it all at once. The problem is that LoadXml probably tries to process the entire document at once, loading it into memory. Not only that, but you've already loaded it into a string, so it exists in two different forms in memory, further increasing the chance that you do not have enough free contiguous memory available.
What you want is some way to read it piecemeal from a stream through an XmlReader, so that you can process the XML document without loading the entire thing into memory. Of course there are limitations to this approach, because an XmlReader is forward-only and read-only, and whether it will work depends on what you want to do with the XML once it is loaded.
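As a sketch of that piecemeal approach for the base64 payload in the question, the helper below decodes one element's content in small chunks straight into an output stream, so neither the XML text nor the decoded file is held in memory as a whole. The class name, method name and the element name passed in are hypothetical, and availability of ReadElementContentAsBase64 on the Compact Framework has not been verified here.

```csharp
using System.IO;
using System.Xml;

public static class Base64StreamingSketch
{
    // Streams the base64 content of the first <elementName> element from
    // 'xmlInput' into 'output', decoding at most 4096 bytes at a time.
    // Returns the number of decoded bytes written.
    public static long CopyBase64Element(Stream xmlInput, string elementName, Stream output)
    {
        long total = 0;
        using (XmlReader reader = XmlReader.Create(xmlInput))
        {
            if (!reader.ReadToFollowing(elementName))
                return 0; // element not present

            byte[] buffer = new byte[4096];
            int read;
            // Each call decodes the next chunk; 0 means the element is finished.
            while ((read = reader.ReadElementContentAsBase64(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

In the question's code this would be called with response.GetResponseStream() as the input and a FileStream as the output, replacing the StringBuilder loop and the XmlDocument.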
I'm unsure why you are reading the xml in the way you are, but it could be very memory inefficient. If the garbage collector hasn't kicked in you could have 3+ copies of the document in memory: in the string builder, in the string and in the XmlDocument.
Much better to do something like:
XmlDocument xDoc = new XmlDocument();
Stream dataStream = null;
try {
dataStream = response.GetResponseStream();
xDoc.Load(dataStream);
} catch {
MessageBox.Show("There was an error with the communication.");
} finally {
if(dataStream != null)
dataStream.Dispose();
}

Why does C# XmlDocument.LoadXml(string) fail when an XML header is included?

Does anyone have any idea why the following code sample fails with an XmlException "Data at the root level is invalid. Line 1, position 1."
var body = "<?xml version="1.0" encoding="utf-16"?><Report> ......"
XmlDocument bodyDoc = new XmlDocument();
bodyDoc.LoadXml(body);
Background
Although your question does have the encoding set as UTF-16, you don't have the string properly escaped so I wasn't sure if you did, in fact, accurately transpose the string into your question.
I ran into the same exception:
System.Xml.XmlException: Data at the
root level is invalid. Line 1,
position 1.
However, my code looked like this:
string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(xml);
The Problem
The problem is that strings are stored internally as UTF-16 in .NET; however, the encoding specified in the XML document header may be different. E.g.:
<?xml version="1.0" encoding="utf-8"?>
From the MSDN documentation for String here:
Each Unicode character in a string is
defined by a Unicode scalar value,
also called a Unicode code point or
the ordinal (numeric) value of the
Unicode character. Each code point is
encoded using UTF-16 encoding, and the
numeric value of each element of the
encoding is represented by a Char
object.
This means that when you pass XmlDocument.LoadXml() your string with an XML header, it must say the encoding is UTF-16. Otherwise, the actual underlying encoding won't match the encoding reported in the header and will result in an XmlException being thrown.
The Solution
The solution for this problem is to make sure the encoding used in whatever you pass to the Load or LoadXml method matches what you say it is in the XML header. In my example above, either change the XML header to state UTF-16, or encode the input in UTF-8 and use one of the XmlDocument.Load methods.
Below is sample code demonstrating how to use a MemoryStream to build an XmlDocument from a string which defines a UTF-8 encoded XML document (but which is, of course, stored as a UTF-16 .NET string).
string xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>\n<event>This is a Test</event>";
// Encode the XML string in a UTF-8 byte array
byte[] encodedString = Encoding.UTF8.GetBytes(xml);
// Put the byte array into a stream and rewind it to the beginning
MemoryStream ms = new MemoryStream(encodedString);
ms.Flush();
ms.Position = 0;
// Build the XmlDocument from the MemoryStream of UTF-8 encoded bytes
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(ms);
Simple and effective solution: Instead of using the LoadXml() method use the Load() method
For example:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load("sample.xml");
I figured it out. The MSDN documentation says to use Load instead of LoadXml when reading from strings, and I found that this works 100% of the time. Oddly enough, using StringReader causes problems. I think the main reason is that this is a Unicode-encoded string, and that could cause problems because StringReader is UTF-8 only.
MemoryStream stream = new MemoryStream();
byte[] data = body.PayloadEncoding.GetBytes(body.Payload);
stream.Write(data, 0, data.Length);
stream.Seek(0, SeekOrigin.Begin);
XmlTextReader reader = new XmlTextReader(stream);
// MSDN recommends we use Load instead of LoadXml when using in-memory XML payloads
bodyDoc.Load(reader);
Try this:
XmlDocument bodyDoc = new XmlDocument();
bodyDoc.XmlResolver = null;
bodyDoc.Load(body);
This worked for me:
var xdoc = new XmlDocument { XmlResolver = null };
xdoc.LoadXml(xmlFragment);
This really saved my day.
I have written an extension method based on Zach's answer. I also extended it to take the encoding as a parameter, allowing encodings other than UTF-8 to be used, and I wrapped the MemoryStream in a 'using' statement.
public static class XmlHelperExtentions
{
/// <summary>
/// Loads a string through .Load() instead of .LoadXml()
/// This prevents character encoding problems.
/// </summary>
/// <param name="xmlDocument"></param>
/// <param name="xmlString"></param>
public static void LoadString(this XmlDocument xmlDocument, string xmlString, Encoding encoding = null) {
if (encoding == null) {
encoding = Encoding.UTF8;
}
// Encode the XML string in a byte array
byte[] encodedString = encoding.GetBytes(xmlString);
// Put the byte array into a stream and rewind it to the beginning
using (var ms = new MemoryStream(encodedString)) {
ms.Flush();
ms.Position = 0;
// Build the XmlDocument from the MemoryStream of encoded bytes
xmlDocument.Load(ms);
}
}
}
I had the same problem when switching from absolute to relative path for my xml file.
The following solves both loading and using relative source path issues.
Using a XmlDataProvider, which is defined in xaml (should be possible in code too) :
<Window.Resources>
<XmlDataProvider
x:Name="myDP"
x:Key="MyData"
Source=""
XPath="/RootElement/Element"
IsAsynchronous="False"
IsInitialLoadEnabled="True"
debug:PresentationTraceSources.TraceLevel="High" />
</Window.Resources>
The data provider automatically loads the document once the source is set. Here's the code :
m_DataProvider = this.FindResource("MyData") as XmlDataProvider;
FileInfo file = new FileInfo("MyXmlFile.xml");
m_DataProvider.Document = new XmlDocument();
m_DataProvider.Source = new Uri(file.FullName);
Simple line:
bodyDoc.Load(new MemoryStream(Encoding.Unicode.GetBytes(body)));
I had the same issue because the XML file I was uploading was encoded using UTF-8-BOM (UTF-8 byte-order mark).
Switched the encoding to UTF-8 in Notepad++ and was able to load the XML file in code.
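When the file cannot be re-saved, the mark can also be dropped in code. A byte-order mark that survives decoding (for example via Encoding.UTF8.GetString) shows up as a leading U+FEFF character in the string, and the sketch below removes it before calling LoadXml. The class and method names are hypothetical.

```csharp
using System.Xml;

public static class BomSketch
{
    // Loads XML text after removing any leading U+FEFF characters left over
    // from a byte-order mark that was not skipped during decoding.
    public static XmlDocument LoadXmlIgnoringBom(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml.TrimStart('\uFEFF'));
        return doc;
    }
}
```

Strings that do not start with the mark pass through unchanged, so the helper can be applied unconditionally.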
