Overwrite file but still read content - c#

I'm trying to find the most reasonable way to open a file, modify its content and then write it back to file.
If I have the following "MyFile.xml"
<?xml version="1.0" encoding="utf-8"?>
<node>
<data>this is my data which is long</data>
</node>
And then want to modify it according to this:
private static void Main(string[] args)
{
using (FileStream stream = new FileStream("MyFile.xml", FileMode.Open))
{
XDocument doc = XDocument.Load(stream);
doc.Descendants("data").First().Value = "less data";
stream.Position = 0;
doc.Save(stream);
}
}
I get the following result. Note that, since the total file length is less than before, I get incorrect data at the end.
<?xml version="1.0" encoding="utf-8"?>
<node>
<data>less data</data>
</node>/node>
I guess I could use File.ReadAll* and File.WriteAll* but that would mean two File openings. Isn't there some way to say "I want to open this file, read its data and when I save delete the old content" without closing and reopening the file? Other solutions that I have found include FileMode.Truncate, but that would imply that I cannot read the content.

You'll have to use FileStream.SetLength like this:
stream.SetLength(stream.Position);
Call it after you have finished writing.
Of course, this assumes that the position is at the end of the written data.
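Put together with the code from the question, this could look roughly like the sketch below. The class name XmlInPlaceEditor and the method ReplaceData are made up for illustration; the stream is opened with FileAccess.ReadWrite so that the same handle can be read, rewound and written.

```csharp
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class XmlInPlaceEditor
{
    // Hypothetical helper: loads the XML, replaces the value of the first
    // <data> element, and writes the result back through the same FileStream.
    public static void ReplaceData(string path, string newValue)
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            XDocument doc = XDocument.Load(stream);
            doc.Descendants("data").First().Value = newValue;

            stream.Position = 0;               // rewind so Save overwrites from the start
            doc.Save(stream);                  // writes the (possibly shorter) document
            stream.SetLength(stream.Position); // cut off leftover bytes of the old content
        }
    }
}
```

Calling XmlInPlaceEditor.ReplaceData("MyFile.xml", "less data") should leave a file that ends with a single </node>.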

Why do you read the file into a filestream first?
You can do the following:
private static void Main(string[] args)
{
string path = "MyFile.xml";
XDocument doc = XDocument.Load(path);
// Check if the root-Node is not null and other validation-stuff
doc.Descendants("data").First().Value = "less data";
doc.Save(path);
}
The problem with the stream is that you can either read or write.
I've read that with .NET Framework 4.5 it's also possible to read and write on the same stream, but I haven't tried it yet.

Related

C# XmlSerializer bug when serializing root tag

I'm currently serializing HighScoreData from a C# application to an XML file using the XmlSerializer class. This produces inconsistent results in the output XML file.
The object I'm serializing is the following struct:
namespace GameProjectTomNijs.GameComponents
{
[Serializable]
public struct HighScoreData
{
public string[] PlayerName;
public int[] Score;
public int Count;
public readonly string HighScoresFilename;
public HighScoreData(int count)
{
PlayerName = new string[count];
Score = new int[count];
Count = count;
HighScoresFilename = "highscores.lst";
}
}
}
Questionable variable accessibility levels aside, it contains an array of strings, an array of integers and an integer holding the total number of entries. This is the data that is being serialized. The output is usually:
<?xml version="1.0"?>
<HighScoreData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<PlayerName>
<string />
<string>MISTERT</string>
<string>TESTER</string>
<string>PAULA</string>
<string>JENS</string>
</PlayerName>
<Score>
<int>554</int>
<int>362</int>
<int>332</int>
<int>324</int>
<int>218</int>
</Score>
<Count>5</Count>
</HighScoreData>
However, about 20-30% of the time it writes the XML file in a peculiar manner, with the closing root tag looking like this: </HighScoreData>oreData>
It seems the method writing to the XML file is appending the values rather than overwriting I guess?
The following code is the method actually writing to the XML file:
public static void SaveHighScores(HighScoreData data, string fullpath)
{
// Open the file, creating it if necessary
FileStream stream = File.Open(fullpath, FileMode.OpenOrCreate);
try
{
// Convert the object to XML data and put it in the stream
XmlSerializer serializer = new XmlSerializer(typeof(HighScoreData));
serializer.Serialize(stream, data);
}
finally
{
// Close the file
stream.Close();
}
}
Is there anything I'm currently overlooking? I've used this method in a large number of projects to great success.
Any help would be greatly appreciated !
This is the problem:
FileStream stream = File.Open(fullpath, FileMode.OpenOrCreate);
That doesn't truncate the file if it already exists - it just overwrites the data that it writes, but if the original file is longer than the data written by the serializer, the "old" data is left at the end.
Just use
using (var stream = File.Create(fullpath))
{
...
}
(Then you don't need try/finally either - always use a using statement for resources...)
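To see the difference between the two modes in isolation, here is a small sketch. The class TruncationDemo and its method are invented for this illustration, and it uses FileMode.Create directly, which is the mode File.Create opens the file with.

```csharp
using System.IO;
using System.Text;

public static class TruncationDemo
{
    // Writes 'text' to 'path' using the given FileMode and returns
    // whatever the file contains afterwards.
    public static string WriteWithMode(string path, FileMode mode, string text)
    {
        using (FileStream stream = new FileStream(path, mode, FileAccess.Write))
        {
            byte[] bytes = Encoding.ASCII.GetBytes(text);
            stream.Write(bytes, 0, bytes.Length);
        }
        return File.ReadAllText(path);
    }
}
```

If the file already contains AAAAAAAAAA, writing BBB with FileMode.OpenOrCreate leaves BBBAAAAAAA, while writing BBB with FileMode.Create leaves just BBB. That is the same effect as the stray oreData> after the closing tag.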

Load XML in C# InnerText

I have a C# service where I loop every second through a directory looking for XML files.
These XML files may look like this:
<?xml version="1.0" encoding="UTF-8"?>
<job>
<type>freelance</type>
<text>blah</text>
</job>
In a foreach I do the following:
var doc = new XmlDocument();
doc.LoadXml(xmlFile);
XmlNode xmltype = doc.DocumentElement.SelectSingleNode("/job/type");
And then I would like to use these strings in my program. However, using xmltype.InnerText does not work. The documentation on MSDN does not tell me anything new, and I would like to know what I am doing wrong.
First, check whether the XML file contains any data.
After that, take the one particular node and check its InnerText.
For example:
This is the Text
XmlNode xmlType = doc.DocumentElement.SelectSingleNode("/job/type");
xmlType.InnerText = "This is the Text";
xmlType.Value = "Stack";
The following console program will output "freelance". I think the issue may be with some of your XML - do all of your XML docs follow the same schema? I am guessing that the code fails with a NullReferenceException at some point. I've added a null check to protect against this possible scenario.
To help debug your service I tend to use the technique described here to run the app as console application (for easy debugging) or windows service.
using System;
using System.Xml;
public class Program
{
static string xmlFile = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<job>
<type>freelance</type>
<text>blah</text>
</job>";
public static void Main()
{
var doc = new XmlDocument();
doc.LoadXml(xmlFile);
XmlNode xmltype = doc.DocumentElement.SelectSingleNode("/job/type");
if(xmltype==null)
{
Console.WriteLine("/job/type not found");
} else {
Console.WriteLine(xmltype.InnerText);
}
}
}
Try this:
string str = xmltype.Value;
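For comparison, here is a small sketch of what InnerText and Value return for the nodes in this document. The class NodeTextDemo and its methods are made up for illustration. For an element node such as <type>, XmlNode.Value is null and the text lives in InnerText (or in the Value of its text() child). Note also that LoadXml expects the XML text itself; if xmlFile in the question holds a file path, doc.Load(path) would be the call to use instead.

```csharp
using System.Xml;

public static class NodeTextDemo
{
    // Returns InnerText of the node selected by 'xpath', or null if nothing matches.
    public static string ReadInnerText(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // 'xml' must be XML content, not a file name
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }

    // Returns XmlNode.Value of the selected node; for element nodes this is null.
    public static string ReadValue(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.Value;
    }
}
```

With the job XML from the question, ReadInnerText(xml, "/job/type") gives "freelance", ReadValue(xml, "/job/type") gives null, and ReadValue(xml, "/job/type/text()") gives "freelance" again.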

parsing almost well formed XML fragments: how to skip over multiple XML headers

I’m required to write a tool that can handle the below XML fragment that is not well formed as it contains XML declarations in the middle of the stream.
The company has had these kinds of files in use for a long time, so there is no option to change the format.
There is no source code available that does the parsing, and the platform of choice for new tooling is .NET 4 or newer, preferably with C#.
This is what the fragments look like:
<Header>
<Version>1</Version>
</Header>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
<Entry><?xml version="1.0"?><Detail>...snip...</Detail></Entry>
Using an XmlReader with the XmlReaderSettings.ConformanceLevel set to ConformanceLevel.Fragment, I can read the complete <Header> element fine.
Even the <Entry> element start is OK; however, while reading the <Detail> info the XmlReader throws an XmlException, as it reads the <?xml...?> XML declaration, which it doesn't expect at that place.
What options do I have to skip over those XML declarations, besides heavy string manipulations?
Since the fragments can easily go above 100 megabytes apiece, I'd rather not load everything into memory at once. But if that is what it takes, I am open to it.
Example of the exceptions I get:
System.Xml.XmlException: Unexpected XML declaration.
The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
Line ##, position ##.
I don't think the built-in classes will help; you'll probably have to do some preparsing and remove the extra headers. If your sample is accurate, you can just do a badXml.Replace("<?xml version=\"1.0\"?>", "") and be on your way.
If you are unsure that the declarations stay the same all the time, replace <?xml with <XmlDeclaration and ?> with /> and use a regular parser ;)
Also, have you tried passing the files through an XML tidy style program?
There might also be an SGML library you can use to preprocess the data and output correct XML.
I added this as an answer because it preserves syntax highlighting.
private void ProcessFile(string inputFileName, string outputFileName)
{
using (StreamReader reader = new StreamReader(inputFileName, new UTF8Encoding(false, true)))
{
using (StreamWriter writer = new StreamWriter(outputFileName, false, Encoding.UTF8))
{
string line;
while ((line = reader.ReadLine()) != null)
{
const string xmlDeclarationStart = "<?xml";
const string xmlDeclarationFinish = "?>";
if (line.Contains(xmlDeclarationStart))
{
string newLine = line.Substring(0, line.IndexOf(xmlDeclarationStart));
int endPosition = line.IndexOf(xmlDeclarationFinish, line.IndexOf(xmlDeclarationStart));
if (endPosition == -1)
{
throw new NotImplementedException(string.Format("Implementation assumption is wrong. {0} .. {1} spans multiple lines (or input file is severely misformed)", xmlDeclarationStart, xmlDeclarationFinish));
}
// the code completely strips the <?xml ... ?> part
// an alternative would be to make this a new XML element containing
// the information inside the <?xml ... ?> part as attributes
// just like Daren Thomas suggested
newLine += line.Substring(endPosition + 2);
line = newLine;
}
writer.WriteLine(line);
}
}
}
}
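Once the embedded declarations are stripped, the output can be read in a streaming fashion with the XmlReader settings mentioned in the question. The following is only a sketch under that assumption; the class FragmentReader and the method CountTopLevelElements are invented here, and counting the top-level elements stands in for whatever real processing is needed.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class FragmentReader
{
    // Streams through a sequence of XML fragments and counts the
    // top-level elements by name, without loading the input into memory.
    public static Dictionary<string, int> CountTopLevelElements(TextReader input)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var counts = new Dictionary<string, int>();
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (reader.Read())
            {
                // Depth 0 means the element is not nested inside another one.
                if (reader.NodeType == XmlNodeType.Element && reader.Depth == 0)
                {
                    int seen;
                    counts.TryGetValue(reader.Name, out seen);
                    counts[reader.Name] = seen + 1;
                }
            }
        }
        return counts;
    }
}
```

For real files, a StreamReader over the cleaned output file can be passed in as the TextReader.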

Changing Word Document XML

What I am trying to do is change the value of a Microsoft Office Word document's XML and save it as a new file. I know that there are SDKs that I could use to make this easier, but the project I am tasked with maintaining does things this way and I was told I had to as well.
I have a basic test document with two placeholders mapped to the following XML:
<root>
<element>
Fubar
</element>
<second>
This is the second placeholder
</second>
</root>
In my test project I have the following:
string strRelRoot = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
//the word template
byte[] buffer = File.ReadAllBytes("dev.docx");
MemoryStream stream = new MemoryStream(buffer, true);
Package package = Package.Open(stream, FileMode.Open, FileAccess.ReadWrite);
//get the document relationship
PackageRelationshipCollection pkgrcOfficeDocument = package.GetRelationshipsByType(strRelRoot);
//iterate through relationship
foreach (PackageRelationship pkgr in pkgrcOfficeDocument)
{
if (pkgr.SourceUri.OriginalString == "/")
{
//uri for the custom xml
Uri uriData = new Uri("/customXML/item1.xml", UriKind.Relative);
//delete the existing xml if it exists
if (package.PartExists(uriData))
{
// Delete template "/customXML/item1.xml" part
package.DeletePart(uriData);
}
PackagePart pkgprtData = package.CreatePart(uriData, "application/xml");
//hard coded test data
string xml = @"<root>
<element>
Changed
</element>
<second>
The second placeholder changed
</second>
</root>";
Stream fromStream = pkgprtData.GetStream();
//write the string
fromStream.Write(Encoding.UTF8.GetBytes(xml),0,xml.Length);
//destination file
Stream dest = File.Create("test.docx");
//write to the destination file
for (int a = fromStream.ReadByte(); a != -1; a = fromStream.ReadByte())
{
dest.WriteByte((byte)a);
}
}
}
What is happening right now is the file test.docx is being created but it is a blank document. I'm not sure why this is happening. Any suggestions anyone could offer on this approach and/or what I am doing incorrectly would be very much appreciated. Thanks much!
After your fromStream.Write call, the stream pointer is positioned after the data you've just written. So your first call to fromStream.ReadByte is already at the end of the stream, and you read (and write) nothing.
You need to either Seek to the beginning of the stream after writing (if the stream returned by the package supports seeking), or close fromStream (to ensure the data you've written is flushed) and reopen it for reading.
fromStream.Seek(0L, SeekOrigin.Begin);
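The effect is easy to reproduce without a package. This sketch uses a plain MemoryStream in place of the part stream, so it only illustrates the position behaviour; StreamRewindDemo and WriteThenRead are names made up for the example. It also passes the length of the byte array to Write rather than the string length, since the two differ as soon as the XML contains non-ASCII characters.

```csharp
using System.IO;
using System.Text;

public static class StreamRewindDemo
{
    // Writes 'text' to a stream, optionally rewinds it, then copies
    // whatever can still be read from the current position.
    public static string WriteThenRead(string text, bool rewind)
    {
        using (var source = new MemoryStream())
        using (var dest = new MemoryStream())
        {
            byte[] bytes = Encoding.UTF8.GetBytes(text);
            source.Write(bytes, 0, bytes.Length); // position is now at the end

            if (rewind)
            {
                source.Seek(0L, SeekOrigin.Begin); // back to the start before reading
            }

            source.CopyTo(dest); // copies from the current position to the end
            return Encoding.UTF8.GetString(dest.ToArray());
        }
    }
}
```

WriteThenRead("<root/>", false) returns an empty string, while WriteThenRead("<root/>", true) returns "<root/>".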

XML reading from web and displaying content

I'm reading a file like this from the web:
<?xml version='1.0' encoding='UTF-8'?>
<eveapi version="2">
<currentTime>2011-07-30 16:08:53</currentTime>
<result>
<rowset name="characters" key="characterID" columns="name,characterID,corporationName,corporationID">
<row name="Conqrad Echerie" characterID="91048359" corporationName="Federal Navy Academy" corporationID="1000168" />
</rowset>
</result>
<cachedUntil>2011-07-30 17:05:48</cachedUntil>
</eveapi>
I'm still new to XML and I see there are many ways to read XML data. Is there a particular way I should do this? What I want to do is load all the data into a StreamReader, and then use get; set; to pull the data later?
If you want object-based access, put the example xml in a file and run
xsd.exe my.xml
xsd.exe my.xsd /classes
This will create my.cs, an object model matching the XML, which you can use with XmlSerializer:
var ser = new XmlSerializer(typeof(eveapi));
var obj = (eveapi)ser.Deserialize(source);
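If running xsd.exe is not an option, a hand-written model can serve the same purpose. The classes below are not what xsd.exe generates; they are a minimal sketch with made-up class names (EveApi, EveResult, EveRowset, EveRow, EveApiParser) that maps only the parts of the sample that are read here. Attributes that are not mapped, such as key and columns, are skipped by the serializer.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("eveapi")]
public class EveApi
{
    [XmlAttribute("version")] public string Version;
    [XmlElement("currentTime")] public string CurrentTime;
    [XmlElement("result")] public EveResult Result;
    [XmlElement("cachedUntil")] public string CachedUntil;
}

public class EveResult
{
    [XmlElement("rowset")] public EveRowset Rowset;
}

public class EveRowset
{
    [XmlAttribute("name")] public string Name;
    // Each <row> child becomes one list entry.
    [XmlElement("row")] public List<EveRow> Rows = new List<EveRow>();
}

public class EveRow
{
    [XmlAttribute("name")] public string Name;
    [XmlAttribute("characterID")] public string CharacterId;
    [XmlAttribute("corporationName")] public string CorporationName;
    [XmlAttribute("corporationID")] public string CorporationId;
}

public static class EveApiParser
{
    // Deserializes the XML text into the object model above.
    public static EveApi Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(EveApi));
        using (var reader = new StringReader(xml))
        {
            return (EveApi)serializer.Deserialize(reader);
        }
    }
}
```

After EveApi api = EveApiParser.Parse(xml), the character name is available as api.Result.Rowset.Rows[0].Name.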
Use XmlReader Class or XmlTextReader Class
http://msdn.microsoft.com/en-us/library/aa720470(v=vs.71).aspx
http://msdn.microsoft.com/en-us/library/system.xml.xmltextreader(v=vs.71).aspx
If you need to use the data in the easy way, especially when you're new to XML, use XmlDocument.
To load the document:
using System.Xml;
using System.IO;
public class someclass {
void somemethod () {
//Initiate the XmlDocument object
XmlDocument xdoc = new XmlDocument();
//To load from file
xdoc.Load("SomeFolder\\SomeFile.xml");
//Or to load from XmlTextReader, from a file for example
FileStream fs = new FileStream("SomeFolder\\SomeFile.xml", FileMode.Open, FileAccess.Read);
XmlTextReader reader = new XmlTextReader(fs);
xdoc.Load(reader);
//In fact, you can load the stream directly
//(rewind it first, because the reader above has already consumed it)
fs.Position = 0;
xdoc.Load(fs);
fs.Close();
//Or, you can load from a string
xdoc.LoadXml(@"<rootElement>
<element1>value1</element1>
<element2>value2</element2>
</rootElement>");
}
}
I personally find XmlDocument far easier to use for navigating an Xml file.
To use it efficiently, you need to learn XPath. For example, to get the name of the first row:
string name = xdoc.SelectSingleNode("/eveapi/result/rowset/row").Attributes["name"].InnerText;
or even more XPath:
string name = xdoc.SelectSingleNode("/eveapi/result/rowset/row/@name").InnerText;
you can even filter:
XmlNodeList elems = xdoc.SelectNodes("//*[@name=\"characters\"]");
gives you the rowset element.
But that's off topic.
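For anyone who wants to try these XPath expressions against the sample from the question, here is a small self-contained sketch. EveXPathDemo and its method names are invented for the example, and the null check covers the case where the path matches nothing.

```csharp
using System.Xml;

public static class EveXPathDemo
{
    // Returns the "name" attribute of the first <row>, or null if there is none.
    public static string FirstRowName(string xml)
    {
        var xdoc = new XmlDocument();
        xdoc.LoadXml(xml);
        XmlNode attr = xdoc.SelectSingleNode("/eveapi/result/rowset/row/@name");
        return attr == null ? null : attr.Value;
    }

    // Counts elements anywhere in the document whose "name" attribute
    // equals the given value.
    public static int CountByName(string xml, string name)
    {
        var xdoc = new XmlDocument();
        xdoc.LoadXml(xml);
        // Assumes 'name' contains no double quotes, since it is pasted into the XPath string.
        XmlNodeList nodes = xdoc.SelectNodes("//*[@name=\"" + name + "\"]");
        return nodes.Count;
    }
}
```

With the eveapi sample, FirstRowName returns "Conqrad Echerie" and CountByName(xml, "characters") returns 1, the rowset element.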
