I am writing a C# method that accepts a string containing an XML document and an array of streams containing XSDs. The document is validated against the XSDs:
private static XmlValidationResult ValidateDocumentInternal(string document, params Stream[] xsdStreams)
{
XmlReaderSettings settings = new XmlReaderSettings
{
ValidationType = ValidationType.Schema
};
foreach (var xsdStream in xsdStreams)
{
using (xsdStream)
{
XmlReader xmlReader = XmlReader.Create(xsdStream);
try
{
settings.Schemas.Add(null, xmlReader);
}
finally
{
xmlReader.Close();
}
}
}
var validationErrors = new List<string>();
settings.ValidationEventHandler += (object sender, System.Xml.Schema.ValidationEventArgs e) =>
{
validationErrors.Add($"({e.Exception.LineNumber}): {e.Message}");
};
using (var stream = document.ToStream())
{
var reader = XmlReader.Create(stream, settings);
while (reader.Read())
{
}
}
return new XmlValidationResult
{
Success = validationErrors.Count == 0,
ValidationErrors = validationErrors
};
}
My question is, should this method be disposing of the XSD streams or should that be the responsibility of the caller? Imagine the following code which passes in the document and XSD and expects ValidateDocumentInternal to dispose the XSD stream:
var document = GetDocument();
Stream xsd = GetXSD();
var validationResult = ValidateDocumentInternal(document, xsd);
or should it be like (not disposing of the stream in ValidateDocumentInternal):
var document = GetDocument();
using (Stream xsd = GetXSD()) {
var validationResult = ValidateDocumentInternal(document, xsd);
}
or alternatively should I just pass in a bool saying whether to dispose or not?
I think it is the caller's responsibility: the stream is a parameter handed to the function by someone else. The function can't know whether the stream is used in another context, and disposing of it there is effectively a "side effect", which I personally try hard to avoid.
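A minimal sketch of the caller-owned approach, assuming a hypothetical SchemaValidation.Validate helper (not from the question) that returns only the list of error messages and reads the document through a StringReader instead of the question's ToStream() extension. It relies on XmlReaderSettings.CloseInput defaulting to false, so disposing the XmlReader does not close the caller's stream:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaValidation
{
    // Hypothetical variant: reads the XSD streams but leaves them open,
    // so the caller stays responsible for disposing of them.
    public static List<string> Validate(string document, params Stream[] xsdStreams)
    {
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        foreach (var xsdStream in xsdStreams)
        {
            // CloseInput is false by default, so disposing the reader
            // does not close the underlying stream.
            using (var xsdReader = XmlReader.Create(xsdStream))
            {
                settings.Schemas.Add(null, xsdReader);
            }
        }

        var errors = new List<string>();
        settings.ValidationEventHandler += (sender, e) =>
            errors.Add($"({e.Exception.LineNumber}): {e.Message}");

        using (var reader = XmlReader.Create(new StringReader(document), settings))
        {
            // Reading to the end triggers validation of the whole document.
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

With this shape, the caller wraps the XSD stream in its own using block, as in the second calling example in the question.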
Below is the code I am using to read CSV files from a stream source, but I get the error "No header record found". The library version is 15.0 and I am already using .ToList() as suggested in some solutions, but the error persists. Below are the method, the TableField class, and the ReadStream method.
Also note that I get the desired result if I pass the source as a MemoryStream, but it fails if I pass it as a Stream, which I need to do to avoid writing to memory each time.
public async Task<Stream> DownloadBlob(string containerName, string fileName, string connectionString)
{
// MemoryStream memoryStream = new MemoryStream();
if (string.IsNullOrEmpty(connectionString))
{
connectionString = @"UseDevelopmentStorage=true";
containerName = "testblobs";
}
Microsoft.Azure.Storage.CloudStorageAccount storageAccount = Microsoft.Azure.Storage.CloudStorageAccount.Parse(connectionString);
CloudBlobClient serviceClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = serviceClient.GetContainerReference(containerName);
CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
if (!blob.Exists())
{
throw new Exception($"Blob Not found");
}
return await blob.OpenReadAsync();
}
public class TableField
{
public string Name { get; set; }
public string Type { get; set; }
public Type DataType
{
get
{
switch( Type.ToUpper() )
{
case "STRING":
return typeof(string);
case "INT":
return typeof( int );
case "BOOL":
case "BOOLEAN":
return typeof( bool );
case "FLOAT":
case "SINGLE":
case "DOUBLE":
return typeof( double );
case "DATETIME":
return typeof( DateTime );
default:
throw new NotSupportedException( $"CSVColumn data type '{Type}' not supported" );
}
}
}
}
private IEnumerable<Dictionary<string, EntityProperty>> ReadCSV(Stream source, IEnumerable<TableField> cols)
{
using (TextReader reader = new StreamReader(source, Encoding.UTF8))
{
var cache = new TypeConverterCache();
cache.AddConverter<float>(new CSVSingleConverter());
cache.AddConverter<double>(new CSVDoubleConverter());
var csv = new CsvReader(reader,
new CsvHelper.Configuration.CsvConfiguration(global::System.Globalization.CultureInfo.InvariantCulture)
{
Delimiter = ";",
HasHeaderRecord = true,
CultureInfo = global::System.Globalization.CultureInfo.InvariantCulture,
TypeConverterCache = cache
});
csv.Read();
csv.ReadHeader();
var map = (
from col in cols
from src in col.Sources()
let index = csv.GetFieldIndex(src, isTryGet: true)
where index != -1
select new { col.Name, Index = index, Type = col.DataType }).ToList();
while (csv.Read())
{
yield return map.ToDictionary(
col => col.Name,
col => EntityProperty.CreateEntityPropertyFromObject(csv.GetField(col.Type, col.Index)));
}
}
}
StreamReading code:
public async Task<Stream> ReadStream(string containerName, string digestFileName, string fileName, string connectionString)
{
string data = string.Empty;
string fileExtension = Path.GetExtension(fileName);
var contents = await DownloadBlob(containerName, digestFileName, connectionString);
return contents;
}
Sample CSV to be read:
PartitionKey;Time;RowKey;State;RPM;Distance;RespirationConfidence;HeartBPM
te123;2020-11-06T13:33:37.593Z;10;1;8;20946;26;815
te123;2020-11-06T13:33:37.593Z;4;2;79944;8;36635;6
te123;2020-11-06T13:33:37.593Z;3;3;80042;9;8774;5
te123;2020-11-06T13:33:37.593Z;1;4;0;06642;6925;37
te123;2020-11-06T13:33:37.593Z;6;5;04740;74753;94628;21
te123;2020-11-06T13:33:37.593Z;7;6;6;2;14;629
te123;2020-11-06T13:33:37.593Z;9;7;126;86296;9157;05
te123;2020-11-06T13:33:37.593Z;5;8;5;3;7775;08
te123;2020-11-06T13:33:37.593Z;2;9;44363;65;70;229
te123;2020-11-06T13:33:37.593Z;8;10;02;24666;2;2
I tried to reproduce the problem with version 15.0 of the library, but could not get it working with the classes CSVSingleConverter and CSVDoubleConverter. With CsvHelper's standard converter classes, however, reading the header works:
using System;
using System.IO;
using System.Text;
using CsvHelper;
using CsvHelper.TypeConversion;
namespace ConsoleApp2
{
class Program
{
static void Main(string[] args)
{
using (Stream stream = new FileStream(@"e:\demo.csv", FileMode.Open, FileAccess.Read))
{
ReadCSV(stream);
}
}
private static void ReadCSV(Stream source)
{
using (TextReader reader = new StreamReader(source, Encoding.UTF8))
{
var cache = new TypeConverterCache();
cache.AddConverter<float>(new SingleConverter());
cache.AddConverter<double>(new DoubleConverter());
var csv = new CsvReader(reader,
new CsvHelper.Configuration.CsvConfiguration(global::System.Globalization.CultureInfo.InvariantCulture)
{
Delimiter = ";",
HasHeaderRecord = true,
CultureInfo = global::System.Globalization.CultureInfo.InvariantCulture,
TypeConverterCache = cache
});
csv.Read();
csv.ReadHeader();
foreach (string headerRow in csv.Context.HeaderRecord)
{
Console.WriteLine(headerRow);
}
}
}
}
}
I've changed the lines ...
cache.AddConverter<float>(new CSVSingleConverter());
cache.AddConverter<double>(new CSVDoubleConverter());
... to ...
cache.AddConverter<float>(new SingleConverter());
cache.AddConverter<double>(new DoubleConverter());
I put the CSV data into a UTF-8 text file. Output at the console is:
PartitionKey
Time
RowKey
State
RPM
Distance
RespirationConfidence
HeartBPM
EDIT 2020-12-24:
Put the whole source text online, not just part of it.
Related to my answer to your other question (it has more detail; you can read it there): I didn't encounter any problem connecting CsvHelper to a stream sourced from blob storage.
This was the code used (I took the CSV data you posted, added it to a file, and uploaded it to blob storage):
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private async void button1_Click(object sender, EventArgs e)
{
var cstr = "YOUR CONNSTR HERE";
var bbc = new BlockBlobClient(cstr, "temp", "ankit.csv");
var s = await bbc.OpenReadAsync(new BlobOpenReadOptions(true) { BufferSize = 16384 });
var sr = new StreamReader(s);
var csv = new CsvHelper.CsvReader(sr, new CsvConfiguration(CultureInfo.CurrentCulture) { HasHeaderRecord = true, Delimiter = ";" });
//try by read/getrecord
while(await csv.ReadAsync())
{
var rec = csv.GetRecord<X>();
Console.WriteLine(rec.PartitionKey);
}
var x = new X();
//try by await foreach
await foreach (var r in csv.EnumerateRecordsAsync(x))
{
Console.WriteLine(r.PartitionKey);
}
}
}
class X {
public string PartitionKey { get; set; }
}
Try setting the source stream back to the start.
private IEnumerable<Dictionary<string, EntityProperty>> ReadCSV(Stream source, IEnumerable<TableField> cols)
{
source.Position = 0;
You also can't use yield return there. It delays execution of the code until you access the IEnumerable<Dictionary<string, EntityProperty>> returned from the ReadCSV method. The problem is at that point you have already closed the using statement with the TextReader that CsvHelper needs to read your data, so you get a NullReferenceException.
You either need to remove the yield return:
var result = new List<Dictionary<string, EntityProperty>>();
while (csv.Read()){
// Add to result
}
return result;
Or pass the TextReader to your method. Any enumeration of the IEnumerable<Dictionary<string, EntityProperty>> must occur before leaving the using statement, which disposes of the TextReader needed by the CsvReader:
IEnumerable<Dictionary<string, EntityProperty>> result;
using (TextReader reader = new StreamReader(source, Encoding.UTF8)){
// Calling ToList() will enumerate your yield statement
result = ReadCSV(reader, cols).ToList();
}
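To illustrate the deferred-execution point in isolation, here is a small sketch that uses a plain StringReader instead of CsvHelper. The class and method names (DeferredDemo, ReadLines, ReadAllSafely, ReadAllUnsafely) are hypothetical and exist only for this example:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DeferredDemo
{
    // Iterator method: none of this body runs until the result is enumerated.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    public static List<string> ReadAllSafely(string text)
    {
        using (var reader = new StringReader(text))
        {
            // ToList() forces enumeration while the reader is still open.
            return ReadLines(reader).ToList();
        }
    }

    public static IEnumerable<string> ReadAllUnsafely(string text)
    {
        using (var reader = new StringReader(text))
        {
            // Returns a lazy sequence; the reader is disposed before
            // anyone has a chance to enumerate it.
            return ReadLines(reader);
        }
    }
}
```

In this sketch, enumerating the result of ReadAllUnsafely fails with an ObjectDisposedException because the StringReader is already closed, while ReadAllSafely returns the materialized list.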
I was getting the same error, 'No header found...', after several hundred successful reads of the same file. I added delimiter=",":
reader = csv.reader(filename, delimiter=",")
and that solved the problem. I think the csv reader attempts to determine the delimiter if it is not specified, and fails after a while (maybe a memory leak?). The comma is the default, but if the reader has to determine it programmatically, it is more likely to fail.
I am working with an XML-based API that can return either a SuccessResponse or an ErrorResponse as its root node.
I am using the below to deserialize the data but I am not sure how to handle the case of the return not being a SuccessResponse.
What is the best way to handle the situation where the returned XML isn't in the expected format?
I know I could do it the hacky way and look for the occurrence of either SuccessResponse or ErrorResponse, but that doesn't feel right.
TheIconicApiResult result = this.apiService.SendGetRequest("GetProducts", new List<AbstractParam>() { new FilterParam("live"), new LimitParam(5000) });
IXmlSerialiser xmlSerialiser = new XmlSerialiser();
var xmlBody = xmlSerialiser.ParseXML<SuccessResponse>(result.ResponseBody);
public TObject ParseXML<TObject>(string xml)
{
using (TextReader reader = new StreamReader(GetMemoryStream(xml)))
{
XmlSerializer serialiser = new XmlSerializer(typeof(TObject));
return (TObject)serialiser.Deserialize(reader);
}
}
In situations where you have an XML stream containing one of several possible document types, you can construct an XmlSerializer for each type and call XmlSerializer.CanDeserialize(XmlReader) to successively test whether the document can be deserialized into that type. This method does not advance the XmlReader past the root element so it can be called multiple times without re-reading the stream.
For instance, you could introduce the following extension method:
public static partial class XmlSerializerExtensions
{
public static object DeserializePolymorphicXml(this string xml, params Type[] types)
{
using (var textReader = new StringReader(xml))
{
return textReader.DeserializePolymorphicXml(types);
}
}
public static object DeserializePolymorphicXml(this TextReader textReader, params Type[] types)
{
if (textReader == null || types == null)
throw new ArgumentNullException();
var settings = new XmlReaderSettings { CloseInput = false }; // Let caller close the input.
using (var xmlReader = XmlReader.Create(textReader, settings))
{
foreach (var type in types)
{
var serializer = new XmlSerializer(type);
if (serializer.CanDeserialize(xmlReader))
return serializer.Deserialize(xmlReader);
}
}
throw new XmlException("Invalid root type.");
}
}
Then use it as follows:
var xmlBody = result.ResponseBody.DeserializePolymorphicXml(typeof(SuccessResponse), typeof(FailResponse));
if (xmlBody is SuccessResponse)
{
// Handle successful response
}
else if (xmlBody is FailResponse)
{
// Handle failed response
}
else
{
// unknown response
throw new InvalidOperationException("unknown response");
}
Sample fiddle.
There is an error in XML document (8, 20). Inner 1: Unexpected XML declaration. The XML declaration must be the first node in the document, and no white space characters are allowed to appear before it.
OK, I understand this error.
How I get it, however, is what perplexes me.
I create the document with Microsoft's Serialize tool. Then, I turn around and attempt to read it back, again, using Microsoft's Deserialize tool.
I am not in control of writing the XML file in the correct format - that I can see.
Here is the single routine I use to read and write.
private string xmlPath = System.Web.Hosting.HostingEnvironment.MapPath(WebConfigurationManager.AppSettings["DATA_XML"]);
private object objLock = new Object();
public string ErrorMessage { get; set; }
public StoredMsgs Operation(string from, string message, FileAccess access) {
StoredMsgs list = null;
lock (objLock) {
ErrorMessage = null;
try {
if (!File.Exists(xmlPath)) {
var root = new XmlRootAttribute(rootName);
var serializer = new XmlSerializer(typeof(StoredMsgs), root);
if (String.IsNullOrEmpty(message)) {
from = "Code Window";
message = "Created File";
}
var item = new StoredMsg() {
From = from,
Date = DateTime.Now.ToString("s"),
Message = message
};
using (var stream = File.Create(xmlPath)) {
list = new StoredMsgs();
list.Add(item);
serializer.Serialize(stream, list);
}
} else {
var root = new XmlRootAttribute("MessageHistory");
var serializer = new XmlSerializer(typeof(StoredMsgs), root);
var item = new StoredMsg() {
From = from,
Date = DateTime.Now.ToString("s"),
Message = message
};
using (var stream = File.Open(xmlPath, FileMode.Open, FileAccess.ReadWrite)) {
list = (StoredMsgs)serializer.Deserialize(stream);
if ((access == FileAccess.ReadWrite) || (access == FileAccess.Write)) {
list.Add(item);
serializer.Serialize(stream, list);
}
}
}
} catch (Exception error) {
var sb = new StringBuilder();
int index = 0;
sb.AppendLine(String.Format("Top Level Error: <b>{0}</b>", error.Message));
var err = error.InnerException;
while (err != null) {
index++;
sb.AppendLine(String.Format("\tInner {0}: {1}", index, err.Message));
err = err.InnerException;
}
ErrorMessage = sb.ToString();
}
}
return list;
}
Is something wrong with my routine? If Microsoft writes the file, it seems to me that it should be able to read it back.
It should be generic enough for anyone to use.
Here is my StoredMsg class:
[Serializable()]
[XmlType("StoredMessage")]
public class StoredMessage {
public StoredMessage() {
}
[XmlElement("From")]
public string From { get; set; }
[XmlElement("Date")]
public string Date { get; set; }
[XmlElement("Message")]
public string Message { get; set; }
}
[Serializable()]
[XmlRoot("MessageHistory")]
public class MessageHistory : List<StoredMessage> {
}
The file it generates doesn't look to me like it has any issues.
I saw the solution here:
Error: The XML declaration must be the first node in the document
But, in that case, it seems someone already had an XML document they wanted to read. They just had to fix it.
I have an XML document created by Microsoft, so it should be readable by Microsoft.
The problem is that you are adding to the file. You deserialize, then re-serialize to the same stream without rewinding and resizing to zero. This gives you multiple root elements:
<?xml version="1.0"?>
<StoredMessage>
</StoredMessage>
<?xml version="1.0"?>
<StoredMessage>
</StoredMessage>
Multiple root elements, and multiple XML declarations, are invalid according to the XML standard, thus the .NET XML parser throws an exception in this situation by default.
For possible solutions, see XML Error: There are multiple root elements, which suggests you either:
1. Enclose your list of StoredMessage elements in some synthetic outer element, e.g. StoredMessageList.
This would require you to load the list of messages from the file, add the new message, and then truncate the file and re-serialize the entire list when adding a single item. Thus the performance may be worse than in your current approach, but the XML will be valid.
2. When deserializing a file containing concatenated root elements, create an XML reader using XmlReaderSettings.ConformanceLevel = ConformanceLevel.Fragment and iteratively walk through the concatenated root node(s), deserializing each one individually as shown, e.g., here. Using ConformanceLevel.Fragment allows the reader to parse streams with multiple root elements (although multiple XML declarations will still cause an error to be thrown).
Later, when adding a new element to the end of the file using XmlSerializer, seek to the end of the file and serialize using an XML writer returned from XmlWriter.Create(TextWriter, XmlWriterSettings) with XmlWriterSettings.OmitXmlDeclaration = true. This prevents output of multiple XML declarations as explained here.
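The first approach (load the list, add the item, truncate, re-serialize) can be sketched roughly as follows. This is a simplified illustration using a List<string> on any seekable, writable stream; RewriteDemo and AppendItem are hypothetical names, not part of the question's code:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public static class RewriteDemo
{
    // Hypothetical helper: keeps a single valid XML document in the stream
    // by rewriting the whole list each time an item is added.
    public static void AppendItem(Stream stream, string item)
    {
        var serializer = new XmlSerializer(typeof(List<string>));

        // Load the existing list, or start a new one if the stream is empty.
        stream.Position = 0;
        var list = stream.Length == 0
            ? new List<string>()
            : (List<string>)serializer.Deserialize(stream);

        list.Add(item);

        // Rewind and truncate so the new document replaces the old one
        // instead of being appended after it.
        stream.Position = 0;
        stream.SetLength(0);
        serializer.Serialize(stream, list);
    }
}
```

Without the rewind and truncate step, the second Serialize call would append a second XML declaration and root element after the first, which is the situation that produces the error in the question.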
For option #2, your Operation would look something like the following:
private string xmlPath = System.Web.Hosting.HostingEnvironment.MapPath(WebConfigurationManager.AppSettings["DATA_XML"]);
private object objLock = new Object();
public string ErrorMessage { get; set; }
const string rootName = "MessageHistory";
static readonly XmlSerializer serializer = new XmlSerializer(typeof(StoredMessage), new XmlRootAttribute(rootName));
public MessageHistory Operation(string from, string message, FileAccess access)
{
var list = new MessageHistory();
lock (objLock)
{
ErrorMessage = null;
try
{
using (var file = File.Open(xmlPath, FileMode.OpenOrCreate))
{
list.AddRange(XmlSerializerHelper.ReadObjects<StoredMessage>(file, false, serializer));
if (list.Count == 0 && String.IsNullOrEmpty(message))
{
from = "Code Window";
message = "Created File";
}
var item = new StoredMessage()
{
From = from,
Date = DateTime.Now.ToString("s"),
Message = message
};
if ((access == FileAccess.ReadWrite) || (access == FileAccess.Write))
{
file.Seek(0, SeekOrigin.End);
var writerSettings = new XmlWriterSettings
{
OmitXmlDeclaration = true,
Indent = true, // Optional; remove if compact XML is desired.
};
using (var textWriter = new StreamWriter(file))
{
if (list.Count > 0)
textWriter.WriteLine();
using (var xmlWriter = XmlWriter.Create(textWriter, writerSettings))
{
serializer.Serialize(xmlWriter, item);
}
}
}
list.Add(item);
}
}
catch (Exception error)
{
var sb = new StringBuilder();
int index = 0;
sb.AppendLine(String.Format("Top Level Error: <b>{0}</b>", error.Message));
var err = error.InnerException;
while (err != null)
{
index++;
sb.AppendLine(String.Format("\tInner {0}: {1}", index, err.Message));
err = err.InnerException;
}
ErrorMessage = sb.ToString();
}
}
return list;
}
Using the following helper method adapted from Read nodes of a xml file in C#:
public partial class XmlSerializerHelper
{
public static List<T> ReadObjects<T>(Stream stream, bool closeInput = true, XmlSerializer serializer = null)
{
var list = new List<T>();
serializer = serializer ?? new XmlSerializer(typeof(T));
var settings = new XmlReaderSettings
{
ConformanceLevel = ConformanceLevel.Fragment,
CloseInput = closeInput,
};
using (var xmlTextReader = XmlReader.Create(stream, settings))
{
while (xmlTextReader.Read())
{ // Skip whitespace
if (xmlTextReader.NodeType == XmlNodeType.Element)
{
using (var subReader = xmlTextReader.ReadSubtree())
{
var logEvent = (T)serializer.Deserialize(subReader);
list.Add(logEvent);
}
}
}
}
return list;
}
}
Note that if you are going to create an XmlSerializer using a custom XmlRootAttribute, you must cache the serializer to avoid a memory leak.
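One possible way to cache such serializers is sketched below, assuming a hypothetical XmlSerializerCache class keyed by type and root element name. Only the XmlSerializer(Type) and XmlSerializer(Type, String) constructors reuse their generated assemblies; other overloads, including the XmlRootAttribute one, generate a new dynamic assembly on every call, which is why caching matters here:

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

public static class XmlSerializerCache
{
    // One serializer per (type, root name) pair, shared across callers.
    private static readonly ConcurrentDictionary<Tuple<Type, string>, XmlSerializer> cache =
        new ConcurrentDictionary<Tuple<Type, string>, XmlSerializer>();

    public static XmlSerializer Get(Type type, string rootName)
    {
        // GetOrAdd may invoke the factory more than once under contention,
        // but only one instance is stored and returned afterwards.
        return cache.GetOrAdd(
            Tuple.Create(type, rootName),
            key => new XmlSerializer(key.Item1, new XmlRootAttribute(key.Item2)));
    }
}
```

The static readonly serializer field in the Operation example above achieves the same effect for a single type and root name.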
Sample fiddle.
I am calling the method below in a loop with the same xmlRequestPath and xmlResponsePath files. The first two iterations execute fine, but in the third iteration I get the exception "The process cannot access the file because it is being used by another process.".
public static void UpdateBatchID(String xmlRequestPath, String xmlResponsePath)
{
String batchId = "";
XDocument requestDoc = null;
XDocument responseDoc = null;
lock (locker)
{
using (var sr = new StreamReader(xmlRequestPath))
{
requestDoc = XDocument.Load(sr);
var element = requestDoc.Root;
batchId = element.Attribute("BatchID").Value;
if (batchId.Length >= 16)
{
batchId = batchId.Remove(0, 16).Insert(0, DateTime.Now.ToString("yyyyMMddHHmmssff"));
}
else if (batchId != "") { batchId = DateTime.Now.ToString("yyyyMMddHHmmssff"); }
element.SetAttributeValue("BatchID", batchId);
}
using (var sw = new StreamWriter(xmlRequestPath))
{
requestDoc.Save(sw);
}
using (var sr = new StreamReader(xmlResponsePath))
{
responseDoc = XDocument.Load(sr);
var elementResponse = responseDoc.Root;
elementResponse.SetAttributeValue("BatchID", batchId);
}
using (var sw = new StreamWriter(xmlResponsePath))
{
responseDoc.Save(sw);
}
}
Thread.Sleep(500);
requestDoc = null;
responseDoc = null;
}
The exception occurs at using (var sw = new StreamWriter(xmlResponsePath)) in the code above.
Exception:
The process cannot access the file 'D:\Projects\ESELServer20130902\trunk\Testing\ESL Server Testing\ESLServerTesting\ESLServerTesting\TestData\Assign\Expected Response\Assign5kMACResponse.xml' because it is being used by another process.
Maybe on the third iteration the stream is still being closed, so the file is reported as inaccessible. Try waiting a bit before calling the method again in the loop, for example:
while (...)
{
UpdateBatchID(xmlRequestPath, xmlResponsePath);
System.Threading.Thread.Sleep(500);
}
Or close the stream explicitly instead of leaving the work to the garbage collector:
var sr = new StreamReader(xmlResponsePath);
responseDoc = XDocument.Load(sr);
....
sr.Close();
Instead of using two streams, one for reading and one for writing, try using a single FileStream, since the problem might be that after loading the file the stream remains open until the garbage collector runs.
using (FileStream f = new FileStream(xmlResponsePath, FileMode.Open, FileAccess.ReadWrite))
{
responseDoc = XDocument.Load(f);
var elementResponse = responseDoc.Root;
elementResponse.SetAttributeValue("BatchID", batchId);
// Rewind and truncate so the saved document replaces the old contents.
f.Position = 0;
f.SetLength(0);
responseDoc.Save(f);
}