XpsDocumentWriter is leaving a **source** file open - c#

I have a routine that reads XPS documents and chops pages off into separate documents. Originally it read one document, decided how to chop it, closed it, and wrote out the new files.
As features were added, this caused headaches with cleaning up old files before running the routine, so now I save all the chopped pieces and write them out at the end.
ChoppedXPS is a dictionary, the key is the filename, the data is the FixedDocument prepared from the original:
foreach (String OneReport in ChoppedXPS.Keys)
{
    File.Delete(OneReport);
    using (XpsDocument TargetFile = new XpsDocument(OneReport, FileAccess.ReadWrite))
    {
        XpsDocumentWriter Writer = XpsDocument.CreateXpsDocumentWriter(TargetFile);
        Writer.Write(ChoppedXPS[OneReport]);
        Logger($"{OneReport} written to disk", 2);
    }
    Application.DoEvents();
}
If the FixedDocument being written out here contains graphics, the source file is opened by the Writer.Write line and left open until the program exits.
The XpsDocumentWriter does not seem to implement anything that can be used to clean it up.
(Yeah, that Application.DoEvents is ugly--this is an in-house program used by two people, so it's not worth the hassle of making it run in the background, and without it a big enough task can cause Windows to decide the program is non-responsive and kill it.)
.NET 4.5, using some C# 8.0 features.

I found a workaround for this problem. I'm not going to try to post the whole thing as I had to change the whole data handling but the heart of it:
using (XpsDocument Source = new XpsDocument(SourceFile, FileAccess.Read))
{
    // [the using loop from my question]
}
I'm still hoping for understanding and something more appropriate than this approach.
Yes--this produces a warning that Source is unused, but the compiler isn't eliminating it so it does work.
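Put together, the workaround looks roughly like this -- a sketch assuming SourceFile holds the path of the original document the FixedDocuments were built from:

using (XpsDocument Source = new XpsDocument(SourceFile, FileAccess.Read))
{
    // Keeping the source open for the whole write-out phase means the handle
    // Writer.Write grabs internally is released when Source is disposed.
    foreach (String OneReport in ChoppedXPS.Keys)
    {
        File.Delete(OneReport);
        using (XpsDocument TargetFile = new XpsDocument(OneReport, FileAccess.ReadWrite))
        {
            XpsDocumentWriter Writer = XpsDocument.CreateXpsDocumentWriter(TargetFile);
            Writer.Write(ChoppedXPS[OneReport]);
            Logger($"{OneReport} written to disk", 2);
        }
        Application.DoEvents();
    }
}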

Related

What could cause an XML file to be filled with null characters?

This is a tricky question. I suspect it will require some advanced knowledge of file systems to answer.
I have a WPF application, "App1," targeting .NET framework 4.0. It has a Settings.settings file that generates a standard App1.exe.config file where default settings are stored. When the user modifies settings, the modifications go in AppData\Roaming\MyCompany\App1\X.X.0.0\user.config. This is all standard .NET behavior. However, on occasion, we've discovered that the user.config file on a customer's machine isn't what it's supposed to be, which causes the application to crash.
The problem looks like this: user.config is about the size it should be if it were filled with XML, but instead of XML it's just a bunch of NUL characters. It's character 0 repeated over and over again. We have no information about what had occurred leading up to this file modification.
We can fix that problem on a customer's device if we just delete user.config because the Common Language Runtime will just generate a new one. They'll lose the changes they've made to the settings, but the changes can be made again.
However, I've encountered this problem in another WPF application, "App2," with another XML file, info.xml. This time it's different because the file is generated by my own code rather than by the CLR. The common themes are that both are C# WPF applications, both are XML files, and in both cases we are completely unable to reproduce the problem in our testing. Could this have something to do with the way C# applications interact with XML files or files in general?
Not only can we not reproduce the problem in our current applications, but I can't even reproduce the problem by writing custom code that generates errors on purpose. I can't find a single XML serialization error or file access error that results in a file that's filled with nulls. So what could be going on?
App1 accesses user.config by calling Upgrade() and Save() and by getting and setting the properties. For example:
if (Settings.Default.UpgradeRequired)
{
    Settings.Default.Upgrade();
    Settings.Default.UpgradeRequired = false;
    Settings.Default.Save();
}
App2 accesses info.xml by serializing and deserializing the XML:
public Info Deserialize(string xmlFile)
{
    if (File.Exists(xmlFile) == false)
    {
        return null;
    }
    XmlSerializer xmlReadSerializer = new XmlSerializer(typeof(Info));
    Info overview = null;
    using (StreamReader file = new StreamReader(xmlFile))
    {
        overview = (Info)xmlReadSerializer.Deserialize(file);
        file.Close();
    }
    return overview;
}

public void Serialize(Info infoObject, string fileName)
{
    XmlSerializer writer = new XmlSerializer(typeof(Info));
    using (StreamWriter fileWrite = new StreamWriter(fileName))
    {
        writer.Serialize(fileWrite, infoObject);
        fileWrite.Close();
    }
}
We've encountered the problem on both Windows 7 and Windows 10. When researching the problem, I came across this post where the same XML problem was encountered in Windows 8.1: Saved files sometime only contains NUL-characters
Is there something I could change in my code to prevent this, or is the problem too deep within the behavior of .NET?
It seems to me that there are three possibilities:
The CLR is writing null characters to the XML files.
The file's memory address pointer gets switched to another location without moving the file contents.
The file system attempts to move the file to another memory address and the file contents get moved but the pointer doesn't get updated.
I feel like 2 and 3 are more likely than 1. This is why I said it may require advanced knowledge of file systems.
I would greatly appreciate any information that might help me reproduce, fix, or work around the problem. Thank you!
It's well known that this can happen after power loss. It occurs when a cached write extends a file (new or existing) and power is lost shortly thereafter. In this scenario the file has three expected possible states when the machine comes back up:
1) The file doesn't exist at all or has its original length, as if the write never happened.
2) The file has the expected length as if the write happened, but the data is zeros.
3) The file has the expected length and the correct data that was written.
State 2 is what you are describing. It occurs because when you do the cached write, NTFS initially just extends the file size accordingly but leaves VDL (valid data length) untouched. Data beyond VDL always reads back as zeros. The data you were intending to write is sitting in memory in the file cache. It will eventually get written to disk, usually within a few seconds, and following that VDL will get advanced on disk to reflect the data written. If power loss occurs before the data is written or before VDL gets increased, you will end up in state 2.
This is fairly easy to repro, for example by copying a file (the copy engine uses cached writes), and then immediately pulling the power plug on your computer.
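If you want to shrink that window in code (my sketch, not something from the question), you can flush through the OS cache before trusting the write, and only then swap the file into place; the names fileName, infoObject and Info are borrowed from the question's Serialize method:

// Write to a temp file, force it to disk, then swap it in.
string tmp = fileName + ".tmp";
using (var fs = new FileStream(tmp, FileMode.Create, FileAccess.Write))
using (var sw = new StreamWriter(fs))
{
    new XmlSerializer(typeof(Info)).Serialize(sw, infoObject);
    sw.Flush();
    fs.Flush(true); // flushToDisk: true pushes the data past the OS cache
}
// File.Replace needs an existing destination; use File.Move for a brand-new file.
File.Replace(tmp, fileName, fileName + ".bak");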
I had a similar problem and I was able to trace it to a corrupted HDD.
Description of my problem (all related information):
Disk attached to mainboard (SATA):
SSD (system),
3 * HDD.
One of the HDDs had bad blocks, and there were even problems reading the disk structure (directories and file listing).
Operation system: Windows 7 x64
file system (on all disks): NTFS
When the system tried to read from or write to the corrupted disk (user request, automatic scan, or any other reason) and the attempt failed, all write operations (to the other disks) were incorrect. The files created on the system disk (mostly configuration files from other applications) were written and appeared valid on a direct check of their content (probably because the files were cached in RAM).
Unfortunately, after a restart, all the files written after the failed read/write access to the corrupted drive had the correct size, but their content was all zero bytes (exactly like in your case).
Try to rule out hardware problems: copy the file (after a change) to a different machine (upload it to web/FTP), or save known content to a fixed file. If the copy on the other machine is correct, or the fixed-content file comes back 'empty', the cause is probably on the local machine. Try swapping hardware components, or reinstall the system.
There is no documented reason for this behavior; it happens to users, but nobody can tell the origin of these odd conditions.
It might be a CLR problem, although that is very unlikely: the CLR doesn't just write null characters, and an XML document cannot contain null characters unless xsi:nil is defined for the nodes.
Anyway, the only documented way to fix this is to delete the corrupted file using this line of code:
try
{
    ConfigurationManager.OpenExeConfiguration(ConfigurationUserLevel.PerUserRoamingAndLocal);
}
catch (ConfigurationErrorsException ex)
{
    string filename = ex.Filename;
    _logger.Error(ex, "Cannot open config file");
    if (File.Exists(filename) == true)
    {
        _logger.Error("Config file {0} content:\n{1}", filename, File.ReadAllText(filename));
        File.Delete(filename);
        _logger.Error("Config file deleted");
        Properties.Settings.Default.Upgrade();
        // Properties.Settings.Default.Reload();
        // you could optionally restart the app instead
    }
    else
    {
        _logger.Error("Config file {0} does not exist", filename);
    }
}
It will restore user.config using Properties.Settings.Default.Upgrade(), again without null values.
I ran into a similar issue, but on a server. The server restarted while a program was writing to a file, which caused the file to contain all null characters and become unusable to the program writing to and reading from it. The logs confirmed that the server restarted, and the corrupted file's last-modified time matched the time of the restart.
I had the same problem: there was an extra NUL character at the end of the serialized XML file.
I am using XmlWriter like this:
using (var stringWriter = new Utf8StringWriter())
{
    using (var xmlWriter = XmlWriter.Create(stringWriter, new XmlWriterSettings { Indent = true, IndentChars = "\t", NewLineChars = "\r\n", NewLineHandling = NewLineHandling.Replace }))
    {
        xmlSerializer.Serialize(xmlWriter, data, nameSpaces);
        xml = stringWriter.ToString();
        var xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xml);
        if (removeEmptyNodes)
        {
            RemoveEmptyNodes(xmlDocument);
        }
        xml = xmlDocument.InnerXml;
    }
}
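One thing worth ruling out in the snippet above (my guess, the poster didn't confirm it): stringWriter.ToString() is called while the XmlWriter is still open, so buffered output may not have reached the StringWriter yet. Flushing first removes that doubt:

xmlSerializer.Serialize(xmlWriter, data, nameSpaces);
xmlWriter.Flush();              // push any buffered output into stringWriter
xml = stringWriter.ToString();  // now the string holds the complete document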

Reading a text file nearly 4 MB in size - c#

I wrote code to read many text files and group them into one file called all.txt. After that I read all.txt to count word frequencies, and the result appears in a RichTextBox. The code works, but when I run the program, part of the result appears and then the program hangs without responding. I think it may be a memory issue; my computer has 4 GB of RAM. Any help would be appreciated.
Note: my code works well on small text files. Here's part of my code:
StreamWriter w = new StreamWriter(@"C:\documents\all.txt");
w.Write(all);
w.Close();
If your problem is reading a large file, try using MemoryMappedFile as indicated here.
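A minimal sketch of that idea, reusing the all.txt path from the question (adjust as needed):

// Stream the big file through a memory-mapped view instead of
// materializing it as one huge string.
using (var mmf = MemoryMappedFile.CreateFromFile(@"C:\documents\all.txt", FileMode.Open))
using (var view = mmf.CreateViewStream())
using (var reader = new StreamReader(view))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // update word-frequency counts here, one line at a time
    }
}

(MemoryMappedFile lives in System.IO.MemoryMappedFiles and needs .NET 4.0 or later.)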

How do I remove blank lines from text File in c#.net

I want to remove blank lines from my file; for that I am using the code below.
private void ReadFile(string Address)
{
    var tempFileName = Path.GetTempFileName();
    try
    {
        //using (var streamReader = new StreamReader(Server.MapPath("~/Images/") + FileName))
        using (var streamReader = new StreamReader(Address))
        using (var streamWriter = new StreamWriter(tempFileName))
        {
            string line;
            while ((line = streamReader.ReadLine()) != null)
            {
                if (!string.IsNullOrWhiteSpace(line))
                    streamWriter.WriteLine(line);
            }
        }
        File.Copy(tempFileName, Address, true);
    }
    finally
    {
        File.Delete(tempFileName);
    }
    Response.Write("Completed");
}
But the problem is my file is too large (8 lakh lines, i.e. 800,000), so it's taking a lot of time. Is there any other way to do it faster?
Instead of doing a ReadLine(), I would do a StreamReader.ReadToEnd() to load the entire file into memory, then do a Replace("\n\n", "\n") on the text, and then do a single Write() of the result back to the file. That way there is not a lot of thrashing, either memory or disk, going on.
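Taken literally, that looks something like the sketch below. Two caveats (mine): Windows text files use \r\n, and a single Replace only collapses one blank line per pair, so it has to loop; it also won't catch lines containing only spaces, unlike the IsNullOrWhiteSpace check in the question.

string text;
using (var reader = new StreamReader(Address))
{
    text = reader.ReadToEnd(); // whole file in memory at once
}
string previous;
do
{
    previous = text;
    text = text.Replace("\r\n\r\n", "\r\n"); // collapses one blank line per pass
} while (text != previous);
using (var writer = new StreamWriter(Address))
{
    writer.Write(text);
}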
The best solution may well depend on the disk type - SSDs and spinning rust behave differently. Your current approach has the advantage over Steve's answer of being able to do processing (such as encoding text data back as binary) while data is still coming off the disk. (With buffering and background IO, there's a lot of potential asynchrony here.) It's definitely worth trying both approaches. (Obviously your approach uses less memory, too.)
However, there's one aspect of your code which is definitely suboptimal: creating a copy of the results. You don't need to do that. You can use file moves instead, which are a lot more efficient, assuming the files are all on the same drive. To make sure you don't lose data, you can do two moves and a delete:
Move the old file to a backup filename
Move the new file to the old filename
Delete the backup filename
It looks like this is what File.Replace does for you, which makes it considerably simpler, and also preserves the original metadata.
If something goes wrong after the first move, you're left without the "proper" file from either old or new, but you can detect that and use the backup filename to read next time.
Of course, if this is meant to happen as part of a web request, you may want to do all the processing in a background task - processing 800,000 lines of text is likely to take longer than you really want a web request to take...
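For what it's worth, here is a sketch of both variants, using the tempFileName and Address names from your method (the .bak name is mine):

// Two moves and a delete, done by hand:
string backup = Address + ".bak";
File.Move(Address, backup);       // 1. old file out of the way
File.Move(tempFileName, Address); // 2. new file into place
File.Delete(backup);              // 3. drop the backup

// ...or let File.Replace do the same swap and keep the original metadata:
File.Replace(tempFileName, Address, Address + ".bak");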

Reading an image file from local storage on Mono for Android

In Mono for Android I have an app that saves images to local storage for caching purposes. When the app launches, it tries to load images from the cache before trying to load them from the web.
I'm currently having a hard time finding a good way to read and load them from local storage.
I'm currently using something equivalent to this:
List<byte> byteList = new List<byte>();
using (System.IO.BinaryReader binaryReader = new System.IO.BinaryReader(context.OpenFileInput("filename.jpg")))
{
    while (binaryReader.BaseStream.IsDataAvailable())
    {
        byteList.Add(binaryReader.ReadByte());
    }
}
return byteList.ToArray();
OpenFileInput() returns a stream that does not give me a length, so I have to read one byte at a time. It also can't seek. This seems to be causing images to load much more slowly than they ought to. Loading images from Resource.Drawable is almost instantaneous by comparison, but with my method there is a very noticeable pause, maybe 300 ms, for loading an 8 KB file. This seems like a really obvious task to be able to do, but I've tried many solutions and searched a lot for advice, to no avail.
I've also noticed this code seems to crash with an EndOfStream exception when not run on the UI thread.
Any help would be hugely appreciated
What do you intend on doing with the List<byte>? You want to "load images from the cache," but you don't specify what you want to load them into.
If you want to load them into a Android.Graphics.Bitmap, you could use BitmapFactory.DecodeStream(Stream):
Bitmap bitmap = BitmapFactory.DecodeStream(context.OpenFileInput("filename.jpg"));
This would remove the List<byte> intermediary.
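If you also want the stream cleaned up deterministically, a small variation (my sketch) wraps it in a using:

Bitmap bitmap;
using (var input = context.OpenFileInput("filename.jpg"))
{
    bitmap = BitmapFactory.DecodeStream(input); // dispose the stream once decoded
}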
If you really need all the bytes (for whatever reason), you can rely on the fact that System.Environment.GetFolderPath(System.Environment.SpecialFolder.Personal) is the same as Context.FilesDir, which is what context.OpenFileInput() will use, permitting:
byte[] bytes = System.IO.File.ReadAllBytes(
    Path.Combine(
        System.Environment.GetFolderPath(System.Environment.SpecialFolder.Personal),
        "filename.jpg"));
However, if this is truly a cache, you should be using Context.CacheDir instead of Context.FilesDir; Context.CacheDir is what Path.GetTempPath() returns:
byte[] cachedBytes = System.IO.File.ReadAllBytes(
    Path.Combine(System.IO.Path.GetTempPath(), "filename.jpg"));

Where is the leak in my code?

Here is my code, which opens an XML file (old.xml), filters out invalid characters, and writes the result to another XML file (abc.xml). Finally I load abc.xml again. When the following line executes, there is an exception saying the XML file is being used by another process:
xDoc.Load("C:\\abc.xml");
Does anyone have any idea what is wrong? Are there any leaks in my code, and why? (I am using the "using" keyword all the time, so I am confused to see leaks...)
Here is my whole code. I am using C# + VSTS 2008 under Windows Vista x64.
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
Encoding encoding = Encoding.GetEncoding("utf-8", new EncoderReplacementFallback(String.Empty), new DecoderReplacementFallback(String.Empty));
using (TextWriter writer = new StreamWriter(new FileStream("C:\\abc.xml", FileMode.Create), Encoding.UTF8))
{
    using (StreamReader sr = new StreamReader("C:\\old.xml", encoding))
    {
        int bufferSize = 10 * 1024 * 1024; // could be anything
        char[] buffer = new char[bufferSize];
        // Read from the file until the end of the file is reached.
        int actualsize = sr.Read(buffer, 0, bufferSize);
        writer.Write(buffer, 0, actualsize);
        while (actualsize > 0)
        {
            actualsize = sr.Read(buffer, 0, bufferSize);
            writer.Write(buffer, 0, actualsize);
        }
    }
}
try
{
    XmlDocument xDoc = new XmlDocument();
    xDoc.Load("C:\\abc.xml");
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
}
EDIT1: I have tried changing the buffer size from 10 MB to 1 MB and it works! I am so confused; any ideas?
EDIT2: I find this issue is very easy to reproduce when the input XML file is very big, like 100 MB or more. I suspect it may be a known .NET bug. I am going to use tools like Process Explorer/Process Monitor to see which process locks the file and keeps it from being accessed by XmlDocument.Load.
That works fine for me.
Purely a guess, but maybe a virus checker is scanning the file?
To investigate, try disabling your virus checker and see if it works (and then re-enable your virus checker).
As an aside, there is one way it can leave the file open: if the StreamReader constructor throws an exception; but then you won't reach the XmlDocument stuff anyway... but consider:
using (FileStream fs = new FileStream("C:\\abc.xml", FileMode.Create))
using (TextWriter writer = new StreamWriter(fs, Encoding.UTF8))
{
    ...
}
Now fs is disposed in the edge-case where new StreamWriter(...) throws. However, I do not believe that this is the problem here.
Are you running a FileSystemWatcher on the root, perhaps?
You can also use ProcessMonitor to see who accesses that file.
The problem is your char[], which seems to be too big. If it is too big, it is allocated on the large object heap rather than the normal heap. Since the large object heap is not compacted while the program is running, the once-allocated space there may not be reused, which looks like a memory leak. Try splitting your array into smaller chunks.
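Concretely (my numbers, not the poster's): objects of 85,000 bytes or more go on the large object heap, and a char is 2 bytes, so any char[] beyond roughly 42,500 elements lands there. Something like this stays on the normal heap:

int bufferSize = 32 * 1024;            // 32K chars = 64 KB per buffer
char[] buffer = new char[bufferSize];  // well under the 85,000-byte LOH threshold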
I second Leppie's suggestion to use ProcessMonitor (or equivalent) to see for sure who is locking the file. Anything else is just speculation.
Your buffer isn't being deallocated, is it?
Have you checked that no other process tries to access the file?
Code works fine. Just checked.
using will call Dispose, but will Dispose call Close on the writing stream? If it does not, the system may still consider the file to be open for writing.
I'd try putting in a Close of the writer just before the end of its using block.
Edit: Just tried out the code myself as well. It compiled and ran without the problem you are seeing. Try turning off virus scanners, as others have mentioned, and make sure you don't have a window somewhere with the file open.
The fact that it works for some people and not for others makes me think that the file isn't being closed. Close the writer before trying to load the file.
My bet is that you have some antivirus solution running which locks the file after it is closed. To verify, try adding a delay (like 1 second) before loading the file. If that works, you have probably found the cause.
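A sketch of that delay, made slightly more robust with a few retries (the one-second pause and five attempts are arbitrary):

// Retry the load a few times in case a scanner briefly holds the file.
XmlDocument xDoc = new XmlDocument();
for (int attempt = 1; attempt <= 5; attempt++)
{
    try
    {
        xDoc.Load("C:\\abc.xml");
        break; // success
    }
    catch (IOException)
    {
        if (attempt == 5) throw;             // give up after the last attempt
        System.Threading.Thread.Sleep(1000); // wait for the other process to let go
    }
}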
Run Process Explorer
Make sure it's your program locking the file first.
