Odd behaviour with XmlReader.Read and Stream.Read

I'm encountering an unusual problem with .NET Framework 3.5 and the System.Xml.XmlReader class.
Before my application calls the XmlReader.Read method, it first reads the content of the stream for logging purposes using the Stream.Read method. It then seeks back to the beginning of the stream before calling XmlReader.Read. When I do this I get the following error:
Unhandled Exception: System.Xml.XmlException: Unexpected end of file while parsing Name has occurred. Line 1, position 4097.
If, however, I call XmlReader.Read, seek to the beginning of the stream and then call the Stream.Read method, it all works fine. This only appears to happen with large streams, however. I've just seen one of about 2000 characters go through the system and it worked fine.
I've included a code sample below to give an idea of what I'm doing.
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.Schemas.Add(null, args[1]);
readerSettings.ValidationType = ValidationType.Schema;
readerSettings.ValidationEventHandler += new ValidationEventHandler(XmlValidatingReaderValidationEventHandler);
XmlReader reader = XmlReader.Create(fileReader, readerSettings);
byte[] buffer = new byte[fs.Length];
fs.Read(buffer, 0, buffer.Length);
string content = System.Text.UTF8Encoding.UTF8.GetString(buffer, 0, buffer.Length);
fs.Seek(0, SeekOrigin.Begin);
while(reader.Read());
Console.WriteLine("Done");
Thanks

Messing around with the stream which is backing something like XmlReader is generally a bad idea. If you want to do two different things with the same file, I suggest you open two different streams. That way they won't interfere with each other.
Note that using File.ReadAllText is a simpler way of loading the contents of a text file into a string.

That's because XmlReader buffers the data from the Stream. If you mess with the current position of the Stream, you mess with the XmlReader too...
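One way to keep the logging read and the XML read from interfering is to read the bytes once and hand the XmlReader its own in-memory copy. The following is only a minimal sketch: the class and method names are hypothetical, schema validation is left out, and Stream.CopyTo assumes .NET Framework 4 or later (on 3.5 a manual read loop would be needed).

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class XmlLoggingSketch
{
    // Reads the source stream once, exposes the text for logging,
    // and parses an independent copy so the XmlReader's internal
    // buffer is never disturbed by a Seek on a shared stream.
    public static int LogThenParse(Stream source, out string content)
    {
        byte[] buffer;
        using (var copy = new MemoryStream())
        {
            source.CopyTo(copy);
            buffer = copy.ToArray();
        }

        // Assumes the document is UTF-8, as in the question.
        content = Encoding.UTF8.GetString(buffer);

        int nodes = 0;
        using (var parseStream = new MemoryStream(buffer))
        using (XmlReader reader = XmlReader.Create(parseStream))
        {
            while (reader.Read())
            {
                nodes++;
            }
        }
        return nodes;
    }
}
```

For files too large to hold in memory, opening two separate FileStream instances on the same path follows the same idea.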

Related

Sending stream via socket

Sorry for this type of question but I am writing a test soon on this and have no clue on the following possible question:
A web server uses the following C# code fragment to write a static web-object into the socket object 'sock'. For which types of web-objects does the code work, and for which doesn't it? With which .NET class could the code be improved?
...
f = new FileStream(pathName, FileMode.Open, FileAccess.Read);
StreamReader sReader = new StreamReader(f);
sReader.BaseStream.Seek(0, SeekOrigin.Begin);
String s = sReader.ReadLine();
while (s != null)
{
sock.Send(System.Text.Encoding.ASCII.GetBytes(s));
s = sReader.ReadLine();
}
sReader.Close();
...

What's a "web-object"? I think your teacher made that term up. I assume this means "file".
Anyway, this will fail if the content is not exactly representable as ASCII.
There is no need to go through text at all. Just copy over the bytes:
f.CopyTo(new NetworkStream(sock));
Any other way to copy the bytes unmodified is also fine.
Be aware that you need to wrap resources such as all those streams and sockets in using blocks so that they do not leak.
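The byte-for-byte copy can be sketched as below. This is an illustration under assumptions, not the exam's expected answer: the class and method names are made up, and the destination is any writable Stream (in the web-server case it would be a NetworkStream created over 'sock'). Copying raw bytes also avoids a second problem in the original loop: ReadLine strips the line terminators, so even plain ASCII text would arrive without its line breaks.

```csharp
using System.IO;

public static class RawFileSender
{
    // Copies the file's bytes unchanged to the destination stream.
    // The using block guarantees the file handle is released even
    // if the copy throws part-way through.
    public static long SendFile(string pathName, Stream destination)
    {
        using (var file = new FileStream(pathName, FileMode.Open, FileAccess.Read))
        {
            file.CopyTo(destination);
            return file.Length;
        }
    }
}
```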

Memory stream is empty

I need to generate a huge XML file from different sources (functions). I decided to use XmlTextWriter since it uses less memory than XmlDocument.
First, I create an XmlTextWriter with an underlying MemoryStream:
MemoryStream ms = new MemoryStream();
XmlTextWriter xmlWriter = new XmlTextWriter(ms, new UTF8Encoding(false, false));
xmlWriter.Formatting = Formatting.Indented;
Then I pass the XmlWriter (note xml writer is kept open until the very end) to a function to generate the beginning of the XML file:
xmlWriter.WriteStartDocument();
xmlWriter.WriteStartElement("Root");
// xmlWriter.WriteEndElement(); // Do not write the end of root element in first function, to add more xml elements in following functions
xmlWriter.WriteEndDocument();
xmlWriter.Flush();
But I found that the underlying memory stream is empty (checked by converting its byte array to a string and printing the string). Any ideas why?
Also, I have a general question about how to generate a huge XML file from different sources (functions). What I do now is keep the XmlWriter open (I assume the underlying memory stream stays open as well), pass it to each function and write. In the first function, I do not write the end of the root element. After the last function, I manually add the end of the root element with:
string endRoot = "</Root>";
byte[] byteEndRoot = Encoding.ASCII.GetBytes(endRoot);
ms.Write(byteEndRoot, 0, byteEndRoot.Length);
Not sure if this works or not.
Thanks a lot!

Technically you should only ask one question per question, so I'm only going to answer the first one because this is just a quick visit to SO for me at the moment.
You need to call Flush before attempting to read from the Stream I think.
Edit
Just bubbling up my second hunch from the comments below to justify the accepted answer here.
In addition to the call to Flush, if reading from the Stream is done using the Read method and its brethren, then the position in the stream must first be reset back to the start. Otherwise no bytes will be read.
ms.Position = 0; /*reset Position to start*/
StreamReader reader = new StreamReader(ms);
string text = reader.ReadToEnd();
Console.WriteLine(text);

Perhaps you need to call Flush() on the XML writer before checking the memory stream.

Make sure you call Flush on the XmlTextWriter before checking the memory stream.
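For the second part of the question, a possible arrangement is sketched below: one writer is passed to every function, the root element is closed through the writer at the very end, and only then is the stream flushed and rewound. The class, method and element names are hypothetical. Note that WriteEndDocument closes any elements that are still open, so calling it in the first function would close the root early, and writing "</Root>" as raw bytes bypasses the writer's own bookkeeping.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class MultiPartXmlSketch
{
    // First function: starts the document and leaves the root open.
    static void WriteHeader(XmlWriter w)
    {
        w.WriteStartDocument();
        w.WriteStartElement("Root");
    }

    // Any later function: appends children under the open root.
    static void WriteItems(XmlWriter w, string name, int count)
    {
        for (int i = 0; i < count; i++)
        {
            w.WriteElementString(name, i.ToString());
        }
    }

    public static string Build()
    {
        var ms = new MemoryStream();
        var xmlWriter = new XmlTextWriter(ms, new UTF8Encoding(false));
        xmlWriter.Formatting = Formatting.Indented;

        WriteHeader(xmlWriter);
        WriteItems(xmlWriter, "A", 2);
        WriteItems(xmlWriter, "B", 1);

        xmlWriter.WriteEndElement();   // closes Root through the writer
        xmlWriter.WriteEndDocument();
        xmlWriter.Flush();             // push buffered text into ms

        ms.Position = 0;               // rewind before reading
        using (var reader = new StreamReader(ms, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```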

What is ADODB.Stream?

What exactly is it (or was it), as an interop type, used for?
Here, this is the method where I use it:
public void SaveAttachmentMime(String fileName, CDO.Message message)
{
ADODB.Stream stream = message.BodyPart.GetStream();
String messageString = stream.ReadText(stream.Size);
StreamWriter outputStream = new StreamWriter(fileName);
outputStream.Write(messageString);
outputStream.Flush();
outputStream.Close();
}

The ADODB.Stream object was used to read files and other streams. What it does is part of what StreamReader, StreamWriter, FileStream and Stream do in the .NET Framework.
For what the code in that method uses it for, in .NET you would use a StreamReader to read from a Stream.
Note that the code in the method only works properly if the stream contains non-Unicode data, as it uses the size in bytes to determine how many characters to read. With a Unicode encoding some characters may be encoded as several bytes, so the stream would run into the end of the stream before it could read the number of characters specified.
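As a rough illustration of the .NET-side equivalent, the sketch below reads text from an ordinary Stream with a StreamReader and saves it to a file. The names are hypothetical and it does not touch CDO or ADODB at all. Because ReadToEnd reads until the stream ends rather than taking a character count, the byte-size versus character-count mismatch described above does not arise.

```csharp
using System.IO;
using System.Text;

public static class StreamTextSketch
{
    // Decodes the whole stream using the given encoding.
    public static string ReadAllText(Stream source, Encoding encoding)
    {
        using (var reader = new StreamReader(source, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Reads the stream as text and writes it to a file in the same encoding.
    public static void SaveText(Stream source, Encoding encoding, string fileName)
    {
        File.WriteAllText(fileName, ReadAllText(source, encoding), encoding);
    }
}
```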

It is a COM object, which is used to represent a stream of data or text. The data can be binary. If I recall correctly, it implements the IStream interface, which stores data in a structured storage object. You can find the interop representation of the interface in System.Runtime.InteropServices.ComTypes.IStream.

StreamWriter not writing out the last few characters to a file

We are having an issue with one server and its use of the StreamWriter class. Has anyone experienced something similar to the issue below? If so, what was the solution?
using( StreamWriter logWriter = File.CreateText( logFileName ) )
{
for (int i = 0; i < 500; i++)
logWriter.WriteLine( "Process completed successfully." );
}
When writing out the file the following output is generated:
Process completed successfully.
... (497 more lines)
Process completed successfully.
Process completed s
Tried adding logWriter.Flush() before close without any help. The more lines of text I write out the more data loss occurs.

Had a very similar issue myself. I found that if I enabled AutoFlush before doing any writes to the stream, it started working as expected.
logWriter.AutoFlush = true;

Sometimes even calling Flush() won't do the trick, because in my experience Flush() causes the stream to write most of its data but not the last block of its buffer.
try
{
// ... write method
// I don't recommend 'using' for unmanaged resources
}
finally
{
stream.Flush();
stream.Close();
stream.Dispose();
}

Cannot reproduce this.
Under normal conditions, this should not and will not fail.
Is this the actual code that fails? The text "Process completed" suggests it's an extract.
Any threading involved?
Network drive or local?
etc.

This certainly appears to be a "flushing" problem to me, even though you say you added a call to Flush(). The problem may be that your StreamWriter is just a wrapper for an underlying FileStream object.
I don't typically use the File.CreateText method to create a stream for writing to a file; I usually create my own FileStream and then wrap it with a StreamWriter if desired. Regardless, I've run into situations where I've needed to call Flush on both the StreamWriter and the FileStream, so I imagine that is your problem.
Try adding the following code:
logWriter.Flush();
if (logWriter.BaseStream != null)
logWriter.BaseStream.Flush();

In my case, this is what I found with the output file:
Case 1: Without Flush() and without Close()
Character Length = 23,371,776
Case 2: With Flush() and without Close()
logWriter.Flush()
Character Length = 23,371,201
Case 3: When properly closed
logWriter.Close()
Character Length = 23,375,887 (Required)
So, in order to get the proper result, you always need to close the writer instance.
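The buffering behind these observations can be seen in a small experiment. This is a sketch with hypothetical names that uses a MemoryStream instead of a file so the byte count can be inspected directly; it only demonstrates that text sits in the StreamWriter's buffer until Flush, Close or Dispose is called, and does not reproduce the truncation reported in the question.

```csharp
using System.IO;

public static class WriterBufferSketch
{
    // Returns the number of bytes visible in the underlying stream
    // before and after the writer is flushed.
    public static long[] MeasureFlush(string text)
    {
        var ms = new MemoryStream();
        var writer = new StreamWriter(ms);

        writer.Write(text);
        long before = ms.Length;   // short text is still in the writer's buffer

        writer.Flush();
        long after = ms.Length;    // now the encoded bytes have reached the stream

        writer.Dispose();          // also closes the MemoryStream
        return new[] { before, after };
    }
}
```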

I faced same problem
Following worked for me
using (StreamWriter tw = new StreamWriter(@"D:\Users\asbalach\Desktop\NaturalOrder\NatOrd.txt"))
{
tw.Write(abc.ToString());// + Environment.NewLine);
}

Using .NET Framework 4.6.1 and under heavy stress, it still has this problem. I'm not sure why it does this, though I found a very different way to solve it (which strengthens my feeling that it is indeed a .NET bug).
In my case I tried to write huge jagged arrays to disk (video caching).
Since the jagged array is quite large, it took a lot of repeated writes to store a large set of video frames, and although they were uncompressed and each cache file got exactly 1000 frames, the cache files all had different sizes.
I had the problem when I used this:
//note, generateLogfileName is just a function to create a filename()
using (FileStream fs = new FileStream(generateLogfileName(), FileMode.OpenOrCreate))
{
using (StreamWriter sw = new StreamWriter(fs))
{
// do your stuff, but it will be unreliable
}
}
However, when I provided an Encoding, all logged files got an equal size and the problem was gone.
using (FileStream fs = new FileStream(generateLogfileName(), FileMode.OpenOrCreate))
{
using (StreamWriter sw = new StreamWriter(fs,Encoding.Unicode))
{
// all data written correctly, no data lost.
}
}
Note: also read the file with the same encoding!

This did the trick for me:
streamWriter.Flush();

where is leak in my code?

Here is my code, which opens an XML file (old.xml), filters out invalid characters and writes the result to another XML file (abc.xml). Finally, I load the new XML file (abc.xml) again. When executing the following line, an exception says the XML file is being used by another process:
xDoc.Load("C:\\abc.xml");
Does anyone have any ideas what is wrong? Any leaks in my code and why (I am using "using" keyword all the time, confused to see leaks...)?
Here is my whole code, I am using C# + VSTS 2008 under Windows Vista x64.
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
Encoding encoding = Encoding.GetEncoding("utf-8", new EncoderReplacementFallback(String.Empty), new DecoderReplacementFallback(String.Empty));
using (TextWriter writer = new StreamWriter(new FileStream("C:\\abc.xml", FileMode.Create), Encoding.UTF8))
{
using (StreamReader sr = new StreamReader(
"C:\\old.xml",
encoding
))
{
int bufferSize = 10 * 1024 * 1024; //could be anything
char[] buffer = new char[bufferSize];
// Read from the file until the end of the file is reached.
int actualsize = sr.Read(buffer, 0, bufferSize);
writer.Write(buffer, 0, actualsize);
while (actualsize > 0)
{
actualsize = sr.Read(buffer, 0, bufferSize);
writer.Write(buffer, 0, actualsize);
}
}
}
try
{
XmlDocument xDoc = new XmlDocument();
xDoc.Load("C:\\abc.xml");
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
EDIT1: I have tried to change the size of buffer from 10M to 1M and it works! I am so confused, any ideas?
EDIT2: I find this issue is very easy to reproduce when the input XML file is very big, like 100M or so. I suspect it might be a known .NET bug. I am going to use tools like ProcessExplorer/ProcessMonitor to see which process locks the file and keeps it from being accessed by XmlDocument.Load.

That works fine for me.
Purely a guess, but maybe a virus checker is scanning the file?
To investigate, try disabling your virus checker and see if it works (and then re-enable your virus checker).
As an aside, there is one way it can leave the file open: if the StreamReader constructor throws an exception; but then you won't reach the XmlDocument stuff anyway... but consider:
using (FileStream fs = new FileStream("C:\\abc.xml", FileMode.Create))
using (TextWriter writer = new StreamWriter(fs, Encoding.UTF8))
{
...
}
Now fs is disposed in the edge-case where new StreamWriter(...) throws. However, I do not believe that this is the problem here.
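Put together, the copy step with stacked using statements might look like the sketch below. The names and the 64K buffer size are hypothetical, and the question's custom encoding with replacement fallbacks is replaced by plain UTF-8 for brevity. The loop only writes when Read returned data, and all three objects are disposed before XmlDocument.Load opens the output file. It shows the intended disposal order and is not claimed to resolve the lock reported in the question.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class CopyThenLoadSketch
{
    public static XmlDocument CopyAndLoad(string inputPath, string outputPath)
    {
        // Disposal runs in reverse order: reader, writer, then the FileStream.
        using (var fs = new FileStream(outputPath, FileMode.Create))
        using (var writer = new StreamWriter(fs, Encoding.UTF8))
        using (var sr = new StreamReader(inputPath, Encoding.UTF8))
        {
            char[] buffer = new char[64 * 1024];
            int read;
            // Read returns 0 at end of file, which ends the loop.
            while ((read = sr.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, read);
            }
        }

        // All handles are closed by the time the file is reopened here.
        var doc = new XmlDocument();
        doc.Load(outputPath);
        return doc;
    }
}
```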

You running a FileSystemWatcher on the root perhaps?
You can also use ProcessMonitor to see who accesses that file.

The problem is your char[], which seems to be too big. If it is too big, it is allocated on the large object heap rather than the small object heap. Since the large object heap is not compacted while the software is running, space once allocated there may not be used again, which looks like a memory leak. Try splitting your array into smaller chunks.

I second Leppie's suggestion to use ProcessMonitor (or equivalent) to see for sure who is locking the file. Anything else is just speculation.

Your buffer isn't being deallocated, is it?

Have you checked that no other process tries to access the file?

Code works fine. Just checked.

using will call Dispose, but will Dispose call Close on the writing stream? If it does not, the system may still consider the file to be open for writing.
I'd try closing the writer explicitly just before the end of its using block.
Edit: Just tried out the code myself as well. It compiled and ran without the problem you are seeing. Try turning off virus scanners, as some others have mentioned, and make sure you don't have a window somewhere with the file open.

The fact that it works for some people and not for others makes me think that the file isn't being closed. Close the writer before trying to load the file.

My bet is that you have some Antivirus solution running, which locks the file after it is being closed. To verify, try adding a delay (like, 1 second) before loading the file. If that works, you probably found the cause.

Run Process Explorer
Make sure it's your program locking the file first.
