Investigating XMLWriter object - c#

How can I see the XML contents of fully populated XmlWriter object while debugging. My silverlight application doesn't permit to actually write to a file and check the contents.

Have it write to a MemoryStream or StringBuilder instead of a file. That will allow you to check the output.

You can create the XmlWriter based on a MemoryStream, then unencode the bytes from the memory stream and display it in a text box, for example.
MemoryStream ms = new MemoryStream();
XmlWriterSettings ws = new XmlWriterSettings();
ws.Encoding = Encoding.UTF8;
XmlWriter w = XmlWriter.Create(ms, ws);
// populate the writer
w.Flush();
textBox1.Text = Encoding.UTF8.GetString(ms.GetBuffer(), 0, (int)ms.Position);

An XmlReader is not "populated". It represents the state of an XML parsing operation, as that operation is in progress. This state will change as the XML is read.

Related

XmlSerializer.Serialize BOM missing

I am using this code to store my class:
FileStream stream = new FileStream(myPath, FileMode.Create);
XmlSerializer serializer = new XmlSerializer(typeof(myClass));
serializer.Serialize(stream, myClass);
stream.Close();
This writes a file that I can read alright with XmlSerializer.Deserialize. The generated file, however, is not a proper text file. XmlSerializer.Serialize doesn't store a BOM, but still inserts multibyte characters. Thus it is implicitely declared an ANSI file (because we expect an XML file to be a text file, and a text file without a BOM is considered ANSI by Windows), showing ö as ö in some editors.
Is this a known bug? Or some setting that I'm missing?
Here is what the generated file starts with:
<?xml version="1.0"?>
<SvnProjects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
The first byte in the file is hex 3C, i.e the <.
Having or not having a BOM is not a definition of a "proper text file". In fact, I'd say that the most typical format these days is UTF-8 without BOM; I don't think I've ever seen anyone actually use the UTF-8 BOM in real systems! But: if you want a BOM, that's fine: just pass the correct Encoding in; if you want UTF-8 with BOM:
using (var writer = XmlWriter.Create(myPath, s_settings))
{
XmlSerializer serializer = new XmlSerializer(typeof(MyClass));
serializer.Serialize(writer, obj);
}
with:
static readonly XmlWriterSettings s_settings =
new XmlWriterSettings { Encoding = new UTF8Encoding(true) };
The result of this is a file that starts EF-BB-BF, the UTF-8 BOM.
If you want a different encoding, then just replace new UTF8Encoding with whatever you did want, remembering to enable the BOM.
(note: the static Encoding.UTF8 instance has the BOM enabled, but IMO it is better to be very explicit here if you specifically intend to use a BOM, just like you should be very explicit about what Encoding you intended to use)
Edit: the key difference here is that Serialize(Stream, object) ends up using:
XmlTextWriter xmlWriter = new XmlTextWriter(stream, encoding: null) {
Formatting = Formatting.Indented,
Indentation = 2
};
which then ends up using:
public StreamWriter(Stream stream) : this(stream,
encoding: UTF8NoBOM, // <==== THIS IS THE PROBLEM
bufferSize: 1024, leaveOpen: false)
{
}
so: UTF-8 without BOM is the default if you use that API.
you must xml an instance not a class definition
for getting Unicode you must declare a XmlWriter or TextWriter
FileStream stream = new FileStream(myPath, FileMode.Create);
XmlSerializer serializer = new XmlSerializer(typeof(myClass));
XmlWriter writer = new XmlTextWriter(fs, Encoding.Unicode);
serializer.Serialize(writer, myClass);
stream.Close();

How can I check result of adding to a MemoryStream, before adding to the MemoryStream?

I am creating an XML file using a Memory stream, in the following manner -
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
MemoryStream ms = new MemoryStream();
using (XmlWriter writer = XmlWriter.Create(ms, settings))
{
// CREATE XML WITH STATEMENTS LIKE THIS
writer.WriteStartElement("url", myUrl);
writer.WriteEndElement();
}
Is there some way to check what the length of the resulting MemoryStream will be, before actually adding the element to the MemoryStream?
Something like -
var sizeTotal = ms + (writer.WriteStartElement("url", myUrl);
The purpose of this is so I can check sizeTotal, make sure it's not too big before I write it to the MemoryStream.
I have a size limit of 10MB for the XML file I am generating. Could I write to a "temp" stream 1st, check the length, if the sizeTotal is less than 10MB, write to the element to the "real" stream. Else, if the sizeTotal is more than 10MB, I want to not add the element to the stream, generate the XML file, then start a new file.
There's no way I know of to determine how long an element will be before you actually write it. But you can determine how long the thing was, and then back up if you need to. The key is to get the current position of the MemoryStream before you begin the write, and subtract that from the position after writing. For example:
using (XmlWriter writer = XmlWriter.Create(ms, settings))
{
long startPos = ms.Position;
// write an element
writer.WriteStartElement("url", myUrl);
writer.WriteEndElement();
writer.Flush(); // make sure data is flushed to the stream
long endPos = ms.Position;
}
So if endPos goes beyond your size limit, you can copy the memory stream from 0 to startPos, and save it. You can then position the stream at 0 and retry your write. Something like:
using (XmlWriter writer = XmlWriter.Create(ms, settings))
{
// write an element
while (true)
{
long startPos = ms.Position;
writer.WriteStartElement("url", myUrl);
writer.WriteEndElement();
writer.Flush(); // make sure data is flushed to the stream
if (ms.Position < maxSize)
break;
// here, copy memory buffer from 0 to startPos and save to disk
// then, reset position
ms.Position = 0;
}
// write another element
while (true)
{
// same thing, different element
}
// make sure to write the final buffer!
}
That's the general idea. Of course you'd want to make sure you don't go into an infinite loop if a single element exceeds the threshold value.
Note that I don't address the problem of XML fragments, which this technique will undoubtedly create. If you want your individual files to be valid XML, you'll have to keep track of which elements are opened, so when you write the partial file all those elements are closed. You'll also need to open those elements when you initialize the memory stream for the next bit. You can either do that in place of the ms.Position = 0;, or you can do that when you write the next fragment out.
Without knowing more about the larger problem (i.e. what you're doing with these partial files), I can't make specific recommendations about how to handle it.
You mean something like this?
var myUrl = "http://www.stackoverflow.com";
var encoding = Encoding.UTF8;
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.Encoding = encoding;
var SOME_MAX_SIZE_IN_BYTES = 1024;
MemoryStream ms = new MemoryStream();
for(var i=0; i<10; i++)
{
using (XmlWriter writer = XmlWriter.Create(ms, settings))
{
// CREATE XML WITH STATEMENTS LIKE THIS
writer.WriteStartElement("url", myUrl);
writer.WriteEndElement();
// in order to get a semi-accurate byte count,
// you need to force the flush to the underlying stream
writer.Flush();
var lengthInBytes = ms.Length;
if(lengthInBytes > SOME_MAX_SIZE_IN_BYTES)
{
Console.WriteLine("TOO BIG!");
break;
}
}
}
var xml = encoding.GetString(ms.ToArray());
Console.WriteLine(xml);

Live SDK - Uploading XML file trough Memory Stream

I have a bit of a problem with the client.UploadAsync method of Live SDK (SkyDrive SDK). My code for some reason doesn't work or more specifically it uploads an empty file. It doesn't throw any error and the serialization to stream works (I know that for sure).
It even seems that the Memory Stream is OK. (since I have no tool to really see the data in it I just guess it is OK by looking at its 'Length' property).
The UploadAsync method is fine as well or at least it worked well when I first serialized the data into a .xml file in IsolatedStorage then read it with IsolatedStorageFileStream and then eventualy send that stream. (then it uploaded the data)
Any advice on why this may be happening?
public void UploadFile<T>(string skyDriveFolderID, T data, string fileNameInSkyDrive)
{
this.fileNameInSkyDrive = fileNameInSkyDrive;
{
try
{
memoryStream = new MemoryStream();
XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.Indent = true;
XmlSerializer serializer = new XmlSerializer(typeof(T));
using (XmlWriter xmlWriter = XmlWriter.Create(memoryStream, xmlWriterSettings))
{
serializer.Serialize(xmlWriter, data);
}
client.UploadAsync(skyDriveFolderID, fileNameInSkyDrive, true, memoryStream, null);
}
catch (Exception ex)
{
if (memoryStream != null) { memoryStream.Dispose(); }
}
}
}
You have to "rewind" the memorystream to the start before calling the UploadAsync method. Imagine the memorystream being like a tape which you record things on. The "read/write-head" is always floating over some point of the tape, which is the end in your case because you just wrote all serialized data onto it. The uploading method tries to read from it by moving forward on the tape, realizing it is already at its end. Thus you get an empty file uploaded.
The method you need for rewinding is:
memoryStream.Seek(0, SeekOrigin.Begin);
Also, it is good practice to use the using directive for IDisposable objects, which the memorystream is. This way you don't need a try {...} finally { ...Dispose(); } (this is done by the using).
Your method could then look like:
public void UploadFile<T>(string skyDriveFolderID, T data, string fileNameInSkyDrive)
{
this.fileNameInSkyDrive = fileNameInSkyDrive;
XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.Indent = true;
XmlSerializer serializer = new XmlSerializer(typeof(T));
using (var memoryStream = new MemoryStream())
{
using (var xmlWriter = XmlWriter.Create(memoryStream, xmlWriterSettings))
{
serializer.Serialize(xmlWriter, data);
}
memoryStream.Seek(0, SeekOrigin.Begin);
client.UploadAsync(skyDriveFolderID, fileNameInSkyDrive, true, memoryStream, null);
}
}

How to write System.Xml.Linq.XElement using XmlWriter to a stream

I have an XElement instance and I wish to write to a stream using XmlWriter class. Why? Well, one of the configuration settings defines whether to use binary Xml or not. Based on this setting a suitable XmlWriter instance is created - either by XmlWriter.Create(stream) or XmlDictionaryWriter.CreateBinaryWriter(stream)).
Anyway, I am trying the following code, but it leaves the stream empty:
using (var stream = new MemoryStream())
{
var xmlReader = new XDocument(xml).CreateReader();
xmlReader.MoveToContent();
var xmlWriter = GetXmlWriter(stream);
xmlWriter.WriteNode(xmlReader, true);
return stream.ToArray();
}
I have checked, xmlReader is properly aligned after MoveToContent at the root XML element.
I must be doing something wrong, but what?
Thanks.
You haven't shown what GetXmlWriter does... but have you tried just flushing the writer?
xmlWriter.Flush();
Alternatively, wrap the XmlWriter in another using statement:
using (var stream = new MemoryStream())
{
var xmlReader = new XDocument(xml).CreateReader();
xmlReader.MoveToContent();
using (var xmlWriter = GetXmlWriter(stream))
{
xmlWriter.WriteNode(xmlReader, true);
}
return stream.ToArray();
}
You might want to do the same for the XmlReader as well, although in this particular case I don't believe you particularly need to.
Having said all this, I'm not entirely sure why you're using an XmlReader at all. Any reason you can't just find the relevant XElement and use XElement.WriteTo(XmlWriter)? Or if you're trying to copy the whole document, just use XDocument.WriteTo(XmlWriter)

Writing XML files using XmlTextWriter with ISO-8859-1 encoding

I'm having a problem writing Norwegian characters into an XML file using C#. I have a string variable containing some Norwegian text (with letters like æøå).
I'm writing the XML using an XmlTextWriter, writing the contents to a MemoryStream like this:
MemoryStream stream = new MemoryStream();
XmlTextWriter xmlTextWriter = new XmlTextWriter(stream, Encoding.GetEncoding("ISO-8859-1"));
xmlTextWriter.Formatting = Formatting.Indented;
xmlTextWriter.WriteStartDocument(); //Start doc
Then I add my Norwegian text like this:
xmlTextWriter.WriteCData(myNorwegianText);
Then I write the file to disk like this:
FileStream myFile = new FileStream(myPath, FileMode.Create);
StreamWriter sw = new StreamWriter(myFile);
stream.Position = 0;
StreamReader sr = new StreamReader(stream);
string content = sr.ReadToEnd();
sw.Write(content);
sw.Flush();
myFile.Flush();
myFile.Close();
Now the problem is that in the file on this, all the Norwegian characters look funny.
I'm probably doing the above in some stupid way. Any suggestions on how to fix it?
Why are you writing the XML first to a MemoryStream and then writing that to the actual file stream? That's pretty inefficient. If you write directly to the FileStream it should work.
If you still want to do the double write, for whatever reason, do one of two things. Either
Make sure that the StreamReader and StreamWriter objects you use all use the same encoding as the one you used with the XmlWriter (not just the StreamWriter, like someone else suggested), or
Don't use StreamReader/StreamWriter. Instead just copy the stream at the byte level using a simple byte[] and Stream.Read/Write. This is going to be, btw, a lot more efficient anyway.
Both your StreamWriter and your StreamReader are using UTF-8, because you're not specifying the encoding. That's why things are getting corrupted.
As tomasr said, using a FileStream to start with would be simpler - but also MemoryStream has the handy "WriteTo" method which lets you copy it to a FileStream very easily.
I hope you've got a using statement in your real code, by the way - you don't want to leave your file handle open if something goes wrong while you're writing to it.
Jon
You need to set the encoding everytime you write a string or read binary data as a string.
Encoding encoding = Encoding.GetEncoding("ISO-8859-1");
FileStream myFile = new FileStream(myPath, FileMode.Create);
StreamWriter sw = new StreamWriter(myFile, encoding);
stream.Position = 0;
StreamReader sr = new StreamReader(stream, encoding);
string content = sr.ReadToEnd();
sw.Write(content);
sw.Flush();
myFile.Flush();
myFile.Close();
As mentioned in above answers, the biggest issue here is the Encoding, which is being defaulted due to being unspecified.
When you do not specify an Encoding for this kind of conversion, the default of UTF-8 is used - which may or may not match your scenario. You are also converting the data needlessly by pushing it into a MemoryStream and then out into a FileStream.
If your original data is not UTF-8, what will happen here is that the first transition into the MemoryStream will attempt to decode using default Encoding of UTF-8 - and corrupt your data as a result. When you then write out to the FileStream, which is also using UTF-8 as encoding by default, you simply persist that corruption into the file.
In order to fix the issue, you likely need to specify Encoding into your Stream objects.
You can actually skip the MemoryStream process entirely, also - which will be faster and more efficient. Your updated code might look something more like:
FileStream fs = new FileStream(myPath, FileMode.Create);
XmlTextWriter xmlTextWriter =
new XmlTextWriter(fs, Encoding.GetEncoding("ISO-8859-1"));
xmlTextWriter.Formatting = Formatting.Indented;
xmlTextWriter.WriteStartDocument(); //Start doc
xmlTextWriter.WriteCData(myNorwegianText);
StreamWriter sw = new StreamWriter(fs);
fs.Position = 0;
StreamReader sr = new StreamReader(fs);
string content = sr.ReadToEnd();
sw.Write(content);
sw.Flush();
fs.Flush();
fs.Close();
Which encoding do you use for displaying the result file? If it is not in ISO-8859-1, it will not display correctly.
Is there a reason to use this specific encoding, instead of for example UTF8?
After investigating, this is that worked best for me:
var doc = new XDocument(new XDeclaration("1.0", "ISO-8859-1", ""));
using (XmlWriter writer = doc.CreateWriter()){
writer.WriteStartDocument();
writer.WriteStartElement("Root");
writer.WriteElementString("Foo", "value");
writer.WriteEndElement();
writer.WriteEndDocument();
}
doc.Save("dte.xml");

Categories

Resources