how to read xml files in C# ? - c#

I have a code which reads an xml file. There are some parts I dont understand.
From my understanding , the code will create an xml file with 2 elements,
"Product" and "OtherDetails" . How come we only have to use writer.WriteEndElement();
once when we used writer.WriteStartElement twice ? , shouldn't we close each
writer.WriteStartElement statement with a writer.WriteEndElement() statement ?
using System.Xml;
public class Program
{
public static void Main()
{
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
XmlWriter writer = XmlWriter.Create("Products.xml", settings);
writer.WriteStartDocument();
writer.WriteComment("This file is generated by the program.");
writer.WriteStartElement("Product"); // first s
writer.WriteAttributeString("ID", "001");
writer.WriteAttributeString("Name", "Soap");
writer.WriteElementString("Price", "10.00")
// Second Element
writer.WriteStartElement("OtherDetails");
writer.WriteElementString("BrandName", "X Soap");
writer.WriteElementString("Manufacturer", "X Company");
writer.WriteEndElement();
writer.WriteEndDocument();
writer.Flush();
writer.Close();
}
}
using System;
using System.Xml;
public class Program
{
public static void Main()
{
XmlReader reader = XmlReader.Create("Products.xml");
while (reader.Read())
{
if (reader.NodeType == XmlNodeType.Element
&& reader.Name == "Product")
{
Console.WriteLine("ID = " + reader.GetAttribute(0));
Console.WriteLine("Name = " + reader.GetAttribute(1));
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.Name == "Price")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Price = {0:C}", Double.Parse(reader.Value));
}
}
reader.Read();
} //end if
if (reader.Name == "OtherDetails")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.Name == "BrandName")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Brand Name = " + reader.Value);
}
}
reader.Read();
} //end if
if (reader.Name == "Manufacturer")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Manufacturer = " + reader.Value);
}
}
} //end if
}
} //end if
} //end while
} //end if
} //end while
}
}
I don't get this part:
if (reader.Name == "OtherDetails")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.Name == "BrandName")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
Console.WriteLine("Brand Name = " + reader.Value);
}
}
notice how the condition while (reader.NodeType != XmlNodeType.EndElement) has been used twice ?
why is that we don't have to specify
if (reader.NodeType == XmlNodeType.Element for OtherDetails) as we did for Product,
like this
if (reader.NodeType == XmlNodeType.Element
&& reader.Name == "OtherDetails")
{}

To answer your first question:
As the MSDN documentation for XmlWriter.WriteEndDocument() says:
Closes any open elements or attributes and puts the writer back in the Start state.
So it will automatically close any open elements for you. In fact, you can remove the call to WriteEndElement() altogether and it will still work ok.
And as people are saying in the comments above, you should perhaps consider using Linq-to-XML.
It can make things much easier. For example, to create the XML structure from your program using Linq-to-XML you can do this:
var doc = new XDocument(
new XElement("Product",
new XAttribute("ID", "001"), new XAttribute("Name", "Soap"),
new XElement("Price", 10.01),
new XElement("OtherDetails",
new XElement("BrandName", "X Soap"),
new XElement("Manufacturer", "X Company"))));
File.WriteAllText("Products.xml", doc.ToString());
If you were reading data from the XML, you can use var doc = XDocument.Load("Filename.xml") to load the XML from a file, and then getting the data out is as simple as:
double price = double.Parse(doc.Descendants("Price").Single().Value);
string brandName = doc.Descendants("BrandName").Single().Value;
Or alternatively (casting):
double price = (double) doc.Descendants("Price").Single();
string brandName = (string) doc.Descendants("BrandName").Single();
(In case you're wondering how on earth we can cast an object of type XElement like that: It's because a load of explict conversion operators are defined for XElement.)

If you need anything strait forward (no reading or research), here is what I did:
I recently wrote a custom XML parsing method for my MenuStrip for WinForms (it had hundreds of items and XML was my best bet).
// load the document
// I loaded mine from my C# resource file called TempResources
XDocument doc = XDocument.Load(new MemoryStream(Encoding.UTF8.GetBytes(TempResources.Menu)));
// get the root element
// (var is an auto token, it becomes what ever you assign it)
var elements = doc.Root.Elements();
// iterate through the child elements
foreach (XElement node in elements)
{
// if you know the name of the attribute, you can call it
// mine was 'name'
// (if you don't know, you can call node.Attributes() - this has the name and value)
Console.WriteLine("Loading list: {0}", node.Attribute("name").Value);
// in my case, every child had additional children, and them the same
// *.Cast<XElement>() would give me the array in a datatype I can work with
// menu_recurse(...) is just a resursive helper method of mine
menu_recurse(node.Elements().Cast<XElement>().ToArray()));
}
(My answer can also be found here: Reading an XML File With Linq - though it unfortunately is not Linq)

Suppose if you want to read an xml file we need to use a dataset,because xml file internally converts into datatables using a dataset.Use the following line of code to access the file and to bind the dataset with the xml data.
DataSet ds=new DataSet();
ds.ReadXml(HttpContext.Current.Server.MapPath("~/Labels.xml");
DataSet comprises of many datatables,count of those datatables depends on number of Parent Child Tags in an xml file

Related

Update XLSX file changes whilst reading the file with XmlReader

We had a code which was loading the Excel XLSX document into the memory, doing some modifications with it and saving it back.
XmlDocument doc = new XmlDocument();
doc.Load(pp.GetStream());
XmlNode rootNode = doc.DocumentElement;
if (rootNode == null) return;
ProcessNode(rootNode);
if (this.fileModified)
{
doc.Save(pp.GetStream(FileMode.Create, FileAccess.Write));
}
This was working good with small files, but throwing OutOfMemory exceptions with some large Excel files. So we decided to change the approach and use XmlReader class to not load the file into the memory at once.
PackagePartCollection ppc = this.Package.GetParts();
foreach (PackagePart pp in ppc)
{
if (!this.xmlContentTypesXlsx.Contains(pp.ContentType)) continue;
using (XmlReader reader = XmlReader.Create(pp.GetStream()))
{
reader.MoveToContent();
while (reader.EOF == false)
{
XmlDocument doc;
XmlNode rootNode;
if (reader.NodeType == XmlNodeType.Element && reader.Name == "hyperlinks")
{
doc = new XmlDocument();
rootNode = doc.ReadNode(reader);
if (rootNode != null)
{
doc.AppendChild(rootNode);
ProcessNode(rootNode); // how can I save updated changes back to the file?
}
}
else if (reader.NodeType == XmlNodeType.Element && reader.Name == "row")
{
doc = new XmlDocument();
rootNode = doc.ReadNode(reader);
if (rootNode != null)
{
doc.AppendChild(rootNode);
ProcessNode(rootNode); // how can I save updated changes back to the file?
}
}
else
{
reader.Read();
}
}
}
}
This reads the file node by node and processes nodes we need (and changes some values there). However, I'm not sure how we can update those values back to the original Excel file.
I tried to use XmlWriter together with the XmlReader, but was not able to make it work. Any ideas?
UPDATE:
I tried to use #dbc's suggestions from the comments section, but it seems too slow to me. It probably will not throw OutOfMemory exceptions for huge files, but processing will take forever.
PackagePartCollection ppc = this.Package.GetParts();
foreach (PackagePart pp in ppc)
{
if (!this.xmlContentTypesXlsx.Contains(pp.ContentType)) continue;
StringBuilder strBuilder = new StringBuilder();
using (XmlReader reader = XmlReader.Create(pp.GetStream()))
{
using (XmlWriter writer = this.Package.FileOpenAccess == FileAccess.ReadWrite ? XmlWriter.Create(strBuilder) : null)
{
reader.MoveToContent();
while (reader.EOF == false)
{
XmlDocument doc;
XmlNode rootNode;
if (reader.NodeType == XmlNodeType.Element && reader.Name == "hyperlinks")
{
doc = new XmlDocument();
rootNode = doc.ReadNode(reader);
if (rootNode != null)
{
doc.AppendChild(rootNode);
ProcessNode(rootNode);
writer?.WriteRaw(rootNode.OuterXml);
}
}
else if (reader.NodeType == XmlNodeType.Element && reader.Name == "row")
{
doc = new XmlDocument();
rootNode = doc.ReadNode(reader);
if (rootNode != null)
{
doc.AppendChild(rootNode);
ProcessNode(rootNode);
writer?.WriteRaw(rootNode.OuterXml);
}
}
else
{
WriteShallowNode(writer, reader); // Used from the #dbc's suggested stackoverflow answers
reader.Read();
}
}
writer?.Flush();
}
}
}
NOTE 1: I'm using StringBuilder for the test, but was planning to switch to a temp file in the end.
NOTE 2: I tried flushing the XmlWriter after every 100 elements, but it's still slow.
Any ideas?
Try following. I've been use for a long time with huge xml files that give Out of Memory
using (XmlReader reader = XmlReader.Create("File Stream", readerSettings))
{
while (!reader.EOF)
{
if (reader.Name != "row")
{
reader.ReadToFollowing("row");
}
if (!reader.EOF)
{
XElement row = (XElement)XElement.ReadFrom(reader);
}
}
}
}
I did some more modifications with #dbc's help and now it works as I wanted.
PackagePartCollection ppc = this.Package.GetParts();
foreach (PackagePart pp in ppc)
{
try
{
if (!this.xmlContentTypesXlsx.Contains(pp.ContentType)) continue;
string tempFilePath = GetTempFilePath();
using (XmlReader reader = XmlReader.Create(pp.GetStream()))
{
using (XmlWriter writer = this.Package.FileOpenAccess == FileAccess.ReadWrite ? XmlWriter.Create(tempFilePath) : null)
{
while (reader.EOF == false)
{
if (reader.NodeType == XmlNodeType.Element && reader.Name == "hyperlinks")
{
XmlDocument doc = new XmlDocument();
XmlNode rootNode = doc.ReadNode(reader);
if (rootNode != null)
{
ProcessNode(rootNode);
if (writer != null)
{
rootNode.WriteTo(writer);
}
}
}
else if (reader.NodeType == XmlNodeType.Element && reader.Name == "row")
{
XmlDocument doc = new XmlDocument();
XmlNode rootNode = doc.ReadNode(reader);
if (rootNode != null)
{
ProcessNode(rootNode);
if (writer != null)
{
rootNode.WriteTo(writer);
}
}
}
else
{
WriteShallowNode(writer, reader); // Used from the #dbc's suggested StackOverflow answers
reader.Read();
}
}
}
}
if (this.packageChanged) // is being set in ProcessNode method
{
this.packageChanged = false;
using (var tempFile = File.OpenRead(tempFilePath))
{
tempFile.CopyTo(pp.GetStream(FileMode.Create, FileAccess.Write));
}
}
}
catch (OutOfMemoryException)
{
throw;
}
catch (Exception ex)
{
Log.Exception(ex, #"Failed to process a file."); // our inner log method
}
finally
{
if (!string.IsNullOrWhiteSpace(tempFilePath))
{
// Delete temp file
}
}
}

C# XmlReader confused about assigning values from read() function

Have the following code, new to this so be kind, it looks clunky and isn't returning what I expect it to return. Basically I'm trying to read nodes for Operator, Password, and Group values into vars and return via a tuple.
public static Tuple<string, string, string> ReadSecurity()
{
XmlReader reader = XmlReader.Create("Operator.xml");
string sOperator = "";
string sPassword = "";
string sGroup = "";
while (reader.Read())
{
if(reader.NodeType == XmlNodeType.Element && reader.Name == "Security")
{
while (reader.NodeType != XmlNodeType.EndElement)
{
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
sOperator = reader.Value;
}
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
sPassword = reader.Value;
}
reader.Read();
if (reader.NodeType == XmlNodeType.Text)
{
sGroup = reader.Value;
}
}
}
}
return Tuple.Create(sOperator, sPassword, sGroup);
}
Seem to be missing the first value each time but have no idea how to change this, online tutorials are assuming a lot more knowledge than I currently have.
For example:
See below for the current iteration (yes, I know the password should be encrypted).
<?xml version="1.0" encoding="utf-8"?>
<Security ver="beta">
<Operator>Ted</Operator>
<Password>password</Password>
<Group>op</Group>
</Security>
Put a breakpoint at the beginning of your method and run it step by step pressing F11. You will see that the XmlReader reads, among others, the Whitespace nodes. See Debug > Windows > Autos or Locals window.
You must ignore these Whitespace nodes and correctly navigate to nodes of type Text, to read their values. You also need to correctly handle start and end tags.
As a result, the code might look like this:
using (var reader = XmlReader.Create("Operator.xml"))
{
string sOperator = "";
string sPassword = "";
string sGroup = "";
while (reader.Read())
{
if (reader.NodeType == XmlNodeType.Element && reader.Name == "Operator")
{
reader.Read(); // move to Text node
sOperator = reader.Value;
}
if (reader.NodeType == XmlNodeType.Element && reader.Name == "Password")
{
reader.Read(); // move to Text node
sPassword = reader.Value;
}
if (reader.NodeType == XmlNodeType.Element && reader.Name == "Group")
{
reader.Read(); // move to Text node
sGroup = reader.Value;
}
}
return Tuple.Create(sOperator, sPassword, sGroup);
}
However the XmlReader class has many useful methods. If you use them correctly, you can make its use simple and enjoyable.
using (var reader = XmlReader.Create("Operator.xml"))
{
reader.ReadToFollowing("Operator");
var sOperator = reader.ReadElementContentAsString();
reader.ReadToFollowing("Password");
var sPassword = reader.ReadElementContentAsString();
reader.ReadToFollowing("Group");
var sGroup = reader.ReadElementContentAsString();
return Tuple.Create(sOperator, sPassword, sGroup);
}

ReadOuterXml is throwing OutOfMemoryException reading part of large (1 GB) XML file

I am working on a large XML file and while running the application, XmlTextReader.ReadOuterXml() method is throwing memory exception.
Lines of codes are like,
XmlTextReader xr = null;
try
{
xr = new XmlTextReader(fileName);
while (xr.Read() && success)
{
if (xr.NodeType != XmlNodeType.Element)
continue;
switch (xr.Name)
{
case "A":
var xml = xr.ReadOuterXml();
var n = GetDetails(xml);
break;
}
}
}
catch (Exception ex)
{
//Do stuff
}
Using:
private int GetDetails (string xml)
{
var rootNode = XDocument.Parse(xml);
var xnodes = rootNode.XPathSelectElements("//A/B").ToList();
//Then working on list of nodes
}
Now while loading the XML files, the application throwing exception on the xr.ReadOuterXml() line. What can be done to avoid this? The size of XML is almost 1 GB.
The most likely reason you are getting a OutOfMemoryException in ReadOuterXml() is that you are trying to read in a substantial portion of the 1 GB XML document into a string, and are hitting the Maximum string length in .Net.
So, don't do that. Instead load directly from the XmlReader using XDocument.Load() with XmlReader.ReadSubtree():
using (var xr = XmlReader.Create(fileName))
{
while (xr.Read() && success)
{
if (xr.NodeType != XmlNodeType.Element)
continue;
switch (xr.Name)
{
case "A":
{
// ReadSubtree() positions the reader at the EndElement of the element read, so the
// next call to Read() moves to the next node.
using (var subReader = xr.ReadSubtree())
{
var doc = XDocument.Load(subReader);
GetDetails(doc);
}
}
break;
}
}
}
And then in GetDetails() do:
private int GetDetails(XDocument rootDocument)
{
var xnodes = rootDocument.XPathSelectElements("//A/B").ToList();
//Then working on list of nodes
return xnodes.Count;
}
Not only will this use less memory, it will also be more performant. ReadOuterXml() uses a temporary XmlWriter to copy the XML in the input stream to an output StringWriter (which you then parse a second time). This version of the algorithm completely skips this extra work. It also avoids creating strings large enough to go on the large object heap which can cause additional performance issues.
If this is still using too much memory you will need to implement SAX-like parsing for your XML where you only load one element <B> at a time. First, introduce the following extension method:
public static partial class XmlReaderExtensions
{
public static IEnumerable<XElement> WalkXmlElements(this XmlReader xmlReader, Predicate<Stack<XName>> filter)
{
Stack<XName> names = new Stack<XName>();
while (xmlReader.Read())
{
if (xmlReader.NodeType == XmlNodeType.Element)
{
names.Push(XName.Get(xmlReader.LocalName, xmlReader.NamespaceURI));
if (filter(names))
{
using (var subReader = xmlReader.ReadSubtree())
{
yield return XElement.Load(subReader);
}
}
}
if ((xmlReader.NodeType == XmlNodeType.Element && xmlReader.IsEmptyElement)
|| xmlReader.NodeType == XmlNodeType.EndElement)
{
names.Pop();
}
}
}
}
Then, use it as follows:
using (var xr = XmlReader.Create(fileName))
{
Predicate<Stack<XName>> filter =
(stack) => stack.Peek().LocalName == "B" && stack.Count > 1 && stack.ElementAt(1).LocalName == "A";
foreach (var element in xr.WalkXmlElements(filter))
{
//Then working on the specific node.
}
}
using (var reader = XmlReader.Create(fileName))
{
XmlDocument oXml = new XmlDocument();
while (reader.Read())
{
oXml.Load(reader);
}
}
For me above code resolved the issue when we return it to XmlDocument through XmlDocument Load method

using XmlReader to get child nodes without knowing their names(in .net)

How do I get the the top level child nodes(unknownA) of root node with XmlReader in .net? Because their names are unknown, ReadToDescendant(string) and ReadToNextSibling(string)won't work.
<root>
<unknownA/>
<unknownA/>
<unknownA>
<unknownB/>
<unknownB/>
</unknownA>
<unknownA/>
<unknownA>
<unknownB/>
<unknownB>
<unknownC/>
<unknownC/>
</unknownB>
</unknownA>
<unknownA/>
</root>
You can walk through the file using XmlReader.Read(), checking the current Depth against the initial depth, until reaching an element end at the initial depth, using the following extension method:
public static class XmlReaderExtensions
{
public static IEnumerable<string> ReadChildElementNames(this XmlReader xmlReader)
{
if (xmlReader == null)
throw new ArgumentNullException();
if (xmlReader.NodeType == XmlNodeType.Element && !xmlReader.IsEmptyElement)
{
var depth = xmlReader.Depth;
while (xmlReader.Read())
{
if (xmlReader.Depth == depth + 1 && xmlReader.NodeType == XmlNodeType.Element)
yield return xmlReader.Name;
else if (xmlReader.Depth == depth && xmlReader.NodeType == XmlNodeType.EndElement)
break;
}
}
}
public static bool ReadToFirstElement(this XmlReader xmlReader)
{
if (xmlReader == null)
throw new ArgumentNullException();
while (xmlReader.NodeType != XmlNodeType.Element)
if (!xmlReader.Read())
return false;
return true;
}
}
Then it could be used as follows:
var xml = GetXml(); // Your XML string
using (var textReader = new StringReader(xml))
using (var xmlReader = XmlReader.Create(textReader))
{
xmlReader.ReadToFirstElement();
var names = xmlReader.ReadChildElementNames().ToArray();
Console.WriteLine(string.Join("\n", names));
}

Strategy for splitting a large JSON file

I'm trying to split very large JSON files into smaller files for a given array. For example:
{
"headerName1": "headerVal1",
"headerName2": "headerVal2",
"headerName3": [{
"element1Name1": "element1Value1"
},
{
"element2Name1": "element2Value1"
},
{
"element3Name1": "element3Value1"
},
{
"element4Name1": "element4Value1"
},
{
"element5Name1": "element5Value1"
},
{
"element6Name1": "element6Value1"
}]
}
...down to { "elementNName1": "elementNValue1" } where N is a large number
The user provides the name which represents the array to be split (in this example "headerName3") and the number of array objects per file, e.g. 1,000,000
This would result in N files each containing the top name:value pairs (headerName1, headerName3) and up to 1,000,000 of the headerName3 objects in each file.
I'm using the excellent Newtonsof JSON.net and understand that I need to do this using a stream.
So far I have looked a reading in JToken objects to establish where the PropertyName == "headerName3" occurs when reading in the tokens but what I would like to do is then read in the entire JSON object for each object in the array and not have to continue parsing JSON into JTokens;
Here's a snippet of the code I am building so far:
using (StreamReader oSR = File.OpenText(strInput))
{
using (var reader = new JsonTextReader(oSR))
{
while (reader.Read())
{
if (reader.TokenType == JsonToken.StartObject)
{
intObjectCount++;
}
else if (reader.TokenType == JsonToken.EndObject)
{
intObjectCount--;
if (intObjectCount == 1)
{
intArrayRecordCount++;
// Here I want to read the entire object for this record into an untyped JSON object
if( intArrayRecordCount % 1000000 == 0)
{
//write these to the split file
}
}
}
}
}
}
I don't know - and in fact, and am not concerned with - the structure of the JSON itself, and the objects can be of varying structures within the array. I am therefore not serializing to classes.
Is this the right approach? Is there a set of methods in the JSON.net library I can easily use to perform such operation?
Any help appreciated.
You can use JsonWriter.WriteToken(JsonReader reader, true) to stream individual array entries and their descendants from a JsonReader to a JsonWriter. You can also use JProperty.Load(JsonReader reader) and JProperty.WriteTo(JsonWriter writer) to read and write entire properties and their descendants.
Using these methods, you can create a state machine that parses the JSON file, iterates through the root object, loads "prefix" and "postfix" properties, splits the array property, and writes the prefix, array slice, and postfix properties out to new file(s).
Here's a prototype implementation that takes a TextReader and a callback function to create sequential output TextWriter objects for the split file:
enum SplitState
{
InPrefix,
InSplitProperty,
InSplitArray,
InPostfix,
}
public static void SplitJson(TextReader textReader, string tokenName, long maxItems, Func<int, TextWriter> createStream, Formatting formatting)
{
List<JProperty> prefixProperties = new List<JProperty>();
List<JProperty> postFixProperties = new List<JProperty>();
List<JsonWriter> writers = new List<JsonWriter>();
SplitState state = SplitState.InPrefix;
long count = 0;
try
{
using (var reader = new JsonTextReader(textReader))
{
bool doRead = true;
while (doRead ? reader.Read() : true)
{
doRead = true;
if (reader.TokenType == JsonToken.Comment || reader.TokenType == JsonToken.None)
continue;
if (reader.Depth == 0)
{
if (reader.TokenType != JsonToken.StartObject && reader.TokenType != JsonToken.EndObject)
throw new JsonException("JSON root container is not an Object");
}
else if (reader.Depth == 1 && reader.TokenType == JsonToken.PropertyName)
{
if ((string)reader.Value == tokenName)
{
state = SplitState.InSplitProperty;
}
else
{
if (state == SplitState.InSplitProperty)
state = SplitState.InPostfix;
var property = JProperty.Load(reader);
doRead = false; // JProperty.Load() will have already advanced the reader.
if (state == SplitState.InPrefix)
{
prefixProperties.Add(property);
}
else
{
postFixProperties.Add(property);
}
}
}
else if (reader.Depth == 1 && reader.TokenType == JsonToken.StartArray && state == SplitState.InSplitProperty)
{
state = SplitState.InSplitArray;
}
else if (reader.Depth == 1 && reader.TokenType == JsonToken.EndArray && state == SplitState.InSplitArray)
{
state = SplitState.InSplitProperty;
}
else if (state == SplitState.InSplitArray && reader.Depth == 2)
{
if (count % maxItems == 0)
{
var writer = new JsonTextWriter(createStream(writers.Count)) { Formatting = formatting };
writers.Add(writer);
writer.WriteStartObject();
foreach (var property in prefixProperties)
property.WriteTo(writer);
writer.WritePropertyName(tokenName);
writer.WriteStartArray();
}
count++;
writers.Last().WriteToken(reader, true);
}
else
{
throw new JsonException("Internal error");
}
}
}
foreach (var writer in writers)
using (writer)
{
writer.WriteEndArray();
foreach (var property in postFixProperties)
property.WriteTo(writer);
writer.WriteEndObject();
}
}
finally
{
// Make sure files are closed in the event of an exception.
foreach (var writer in writers)
using (writer)
{
}
}
}
This method leaves all the files open until the end in case "postfix" properties, appearing after the array property, need to be appended. Be aware that there is a limit of 16384 open files at one time, so if you need to create more split files, this won't work. If postfix properties are never encountered in practice, you can just close each file before opening the next and throw an exception in case any postfix properties are found. Otherwise you may need to parse the large file in two passes or close and reopen the split files to append them.
Here is an example of how to use the method with an in-memory JSON string:
private static void TestSplitJson(string json, string tokenName)
{
var builders = new List<StringBuilder>();
using (var reader = new StringReader(json))
{
SplitJson(reader, tokenName, 2, i => { builders.Add(new StringBuilder()); return new StringWriter(builders.Last()); }, Formatting.Indented);
}
foreach (var s in builders.Select(b => b.ToString()))
{
Console.WriteLine(s);
}
}
Prototype fiddle.

Categories

Resources