CsvHelper CsvWriter is empty when source DataTable contains less than 12 rows - c#

When writing to a stream (and maybe other destinations too), CsvHelper does not return anything if my DataTable contains fewer than 12 rows. I tested by adding rows one by one until I got a result in the myCsvAsString variable.
Has anyone run into this problem? Here is the code I am using to reproduce it:
var stream = new MemoryStream();
using (var writer = new StreamWriter(stream))
using (var csvWriter = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    if (includeHeaders)
    {
        foreach (DataColumn column in dataTable.Columns)
        {
            csvWriter.WriteField(column.ColumnName);
        }
        csvWriter.NextRecord();
    }

    foreach (DataRow row in dataTable.Rows)
    {
        for (var i = 0; i < dataTable.Columns.Count; i++)
        {
            csvWriter.WriteField(row[i]);
        }
        csvWriter.NextRecord();
    }

    csvWriter.Flush();
    stream.Position = 0;

    StreamReader reader = new StreamReader(stream);
    string myCsvAsString = reader.ReadToEnd();
}

OK, I found the issue: I was flushing the csvWriter but not the underlying StreamWriter.
I added writer.Flush() just after csvWriter.Flush() and the stream is now complete.
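For reference, the tail end of the code with the fix applied (same code as above, plus the extra flush on the StreamWriter):

csvWriter.Flush();
writer.Flush(); // flush the underlying StreamWriter so the buffered CSV text reaches the MemoryStream
stream.Position = 0;

StreamReader reader = new StreamReader(stream);
string myCsvAsString = reader.ReadToEnd();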

Related

CSV appears to be corrupt on Double quotes in Headers - C#

I was trying to read a CSV file in C#.
I tried the File.ReadAllLines(path).Select(a => a.Split(';')) approach, but it does not work when a cell contains \n (a multi-line value).
So I tried the following instead:
using LumenWorks.Framework.IO.Csv;

var csvTable = new DataTable();
using (TextReader fileReader = File.OpenText(path))
using (var csvReader = new CsvReader(fileReader, false))
{
    csvTable.Load(csvReader);
}

for (int i = 0; i < csvTable.Rows.Count; i++)
{
    if (!(csvTable.Rows[i][0] is DBNull))
    {
        var row1 = csvTable.Rows[i][0];
    }
    if (!(csvTable.Rows[i][1] is DBNull))
    {
        var row2 = csvTable.Rows[i][1];
    }
}
The issue is that the above code throws the following exception:
The CSV appears to be corrupt near record '0' field '5 at position '63'
This is because the CSV's header contains doubled double quotes, as below:
"Header1",""Header2""
Is there a way to ignore the double quotes and process the CSVs?
Update
I have tried TextFieldParser as below:
public static void GetCSVData()
{
    using (var parser = new TextFieldParser(path))
    {
        parser.HasFieldsEnclosedInQuotes = false;
        parser.Delimiters = new[] { "," };
        while (parser.PeekChars(1) != null)
        {
            string[] fields = parser.ReadFields();
            foreach (var field in fields)
            {
                Console.Write(field + " ");
            }
            Console.WriteLine(Environment.NewLine);
        }
    }
}
(The output and the sample CSV data I used were attached as screenshots.)
Any help is appreciated.
Hope this works!
Replace the doubled double quotes in the CSV first, as below:
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
{
    StreamReader sr = new StreamReader(fs);
    string contents = sr.ReadToEnd();
    // replace "" with "
    contents = contents.Replace("\"\"", "\"");
    // go back to the beginning of the stream
    fs.Seek(0, SeekOrigin.Begin);
    // adjust the length to make sure all of the original
    // contents are overwritten
    fs.SetLength(contents.Length);
    StreamWriter sw = new StreamWriter(fs);
    sw.Write(contents);
    sw.Close();
}
Then use the same CSV reading code as before:
using LumenWorks.Framework.IO.Csv;

var csvTable = new DataTable();
using (TextReader fileReader = File.OpenText(path))
using (var csvReader = new CsvReader(fileReader, false))
{
    csvTable.Load(csvReader);
}
Thanks.
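A lighter-weight variant of the same idea (a sketch, not part of the original answer): do the replacement in memory and hand the cleaned text to the reader through a StringReader, so the file on disk stays untouched.

using LumenWorks.Framework.IO.Csv;

// Sketch: strip the doubled quotes in memory instead of rewriting the file.
string cleaned = File.ReadAllText(path).Replace("\"\"", "\"");
var csvTable = new DataTable();
using (TextReader textReader = new StringReader(cleaned))
using (var csvReader = new CsvReader(textReader, false))
{
    csvTable.Load(csvReader);
}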

Using Enumerable method with yield keyword and MemoryStream [duplicate]

This question already has an answer here:
When using yield within a "using" statement, when does Dispose occur? (1 answer)
Closed 1 year ago.
I wrote the code below, which works:
// VERSION 1
static IEnumerable<string> ReadAsLines(string filename)
{
    using (StreamReader reader = new StreamReader(filename))
    {
        while (!reader.EndOfStream)
            yield return reader.ReadLine();
    }
}
Using the above method:
const string fileData = @"path\to\somePipeDelimitedData.txt";

var reader = ReadAsLines(fileData);
var headerArr = reader.First().Split('|');
foreach (var column in headerArr)
{
    var dummy = column;
}

var recordsEnumerable = reader.Skip(1); // skip the header line
// Read the remaining lines...
foreach (var record in recordsEnumerable)
{
    // read each line
    var rowArray = record.Split('|');
    // etc...
}
Now suppose I start off with a Stream instead of a file.
I tried rewriting the above code, but I am struggling with the stream getting closed.
How can I fix the version below?
// VERSION 2
static IEnumerable<string> ReadAsLines(Stream stream)
{
    using (StreamReader reader = new StreamReader(stream))
    {
        while (!reader.EndOfStream)
            yield return reader.ReadLine();
    }
}
Calling version 2:
byte[] dataByteArr = File.ReadAllBytes(fileData);
MemoryStream memStr = new MemoryStream(dataByteArr);

var reader2 = ReadAsLines(memStr);
var headerArr2 = reader2.First().Split('|'); // *** STREAM gets closed after this line
foreach (var column in headerArr2)
{
    var dummy = column;
}

var recordsEnumerable2 = reader2.Skip(1); // skip the header line
// Read the remaining lines... *** ERROR OCCURS HERE, as the Stream is closed.
foreach (var record in recordsEnumerable2)
{
    // read each line
    var rowArray = record.Split('|');
    // etc...
}
I re-organized my initial attempt by pulling the StreamReader out of the enumerable method and disposing it outside, once I am really done. (The problem in version 2: First() runs and then disposes the iterator, which disposes its StreamReader and with it the underlying MemoryStream, so the later Skip(1) enumeration starts a new StreamReader over a stream that is already closed.)
byte[] dataByteArr = File.ReadAllBytes(fileData); // decoded bytes
var memStr = new MemoryStream(dataByteArr);

using (StreamReader sr = new StreamReader(memStr))
{
    var dataAsEnumerable = ReadAsLines(sr, memStr);
    var headerArr2 = dataAsEnumerable.First().Split('|');
    // *** HA! stream is still open!
    foreach (var column in headerArr2)
    {
        var dummy = column;
    }

    var dataMinusHeader = dataAsEnumerable.Skip(1);
    // Read the remaining lines...
    foreach (var record in dataMinusHeader)
    {
        // read each line
        var rowArray = record.Split('|');
        // etc...
    }
}
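The reworked ReadAsLines itself is not shown above; a plausible sketch matching the ReadAsLines(sr, memStr) call (an assumption: the extra Stream parameter exists only so the method can rewind before reading) would be:

static IEnumerable<string> ReadAsLines(StreamReader reader, Stream stream)
{
    // Assumed behaviour: rewind the stream and discard the reader's buffer so that
    // each enumeration (First(), then Skip(1)) starts again from the first line.
    // The reader is NOT disposed here; the caller's using block owns it.
    stream.Position = 0;
    reader.DiscardBufferedData();
    while (!reader.EndOfStream)
        yield return reader.ReadLine();
}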

How can we find the expected table row count when converting JSON data into a DataTable

I know how to convert JSON data into a DataTable; what I need to know is whether there is any formula to get the expected DataTable row count without actually converting the JSON into a DataTable.
As already commented, parse the big JSON as a stream to handle huge amounts of data.
It is then up to you to count the rows or process them into DataTables without memory exceptions:
using (FileStream s = File.OpenRead("big.json")) // or any other stream
using (StreamReader streamReader = new StreamReader(s))
using (JsonTextReader reader = new JsonTextReader(streamReader))
{
    reader.SupportMultipleContent = true;
    int rowCount = 0;
    var serializer = new JsonSerializer();
    while (reader.Read())
    {
        if (reader.TokenType == JsonToken.StartObject)
        {
            // deserialize one row object (Contact is the example row type here)
            Contact r = serializer.Deserialize<Contact>(reader);
            rowCount++;
        }
    }
}
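If only the row count is needed, here is a sketch under the assumption that the rows are the objects sitting directly inside a top-level JSON array (the file name is a placeholder); nothing is deserialized at all:

using System.IO;
using Newtonsoft.Json;

// Count the row objects in a large JSON file without materializing a DataTable.
static int CountRows(string path)
{
    using (var streamReader = new StreamReader(path))
    using (var jsonReader = new JsonTextReader(streamReader))
    {
        int rowCount = 0;
        while (jsonReader.Read())
        {
            // Depth 1 = an object that starts directly inside the top-level array.
            if (jsonReader.TokenType == JsonToken.StartObject && jsonReader.Depth == 1)
            {
                rowCount++;
            }
        }
        return rowCount;
    }
}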
You can also filter using JObject, this way:
string jsonData = "";
using (StreamReader reader = new StreamReader("big.json"))
{
    jsonData = reader.ReadToEnd();
}
JObject o = JObject.Parse(jsonData);
var results = o["datatable"].Where(x => (bool)x["filter"]).ToArray();
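If only the row count is needed with this approach, and assuming "datatable" is a JSON array of rows, it reduces to one line:

int rowCount = ((JArray)o["datatable"]).Count;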

C# ZipArchive losing data

I'm trying to copy the contents of one Excel file to another Excel file while replacing a string inside the file on the copy. It's working for the most part, but the file is losing 27 KB of data. Any suggestions?
public void ReplaceString(string what, string with, string path)
{
    List<string> doneContents = new List<string>();
    List<string> doneNames = new List<string>();

    using (ZipArchive archive = ZipFile.Open(_path, ZipArchiveMode.Read))
    {
        int count = archive.Entries.Count;
        for (int i = 0; i < count; i++)
        {
            ZipArchiveEntry entry = archive.Entries[i];
            using (var entryStream = entry.Open())
            using (StreamReader reader = new StreamReader(entryStream))
            {
                string txt = reader.ReadToEnd();
                if (txt.Contains(what))
                {
                    txt = txt.Replace(what, with);
                }
                doneContents.Add(txt);
                string name = entry.FullName;
                doneNames.Add(name);
            }
        }
    }

    using (MemoryStream zipStream = new MemoryStream())
    {
        using (ZipArchive newArchive = new ZipArchive(zipStream, ZipArchiveMode.Create, true, Encoding.UTF8))
        {
            for (int i = 0; i < doneContents.Count; i++)
            {
                int spot = i;
                ZipArchiveEntry entry = newArchive.CreateEntry(doneNames[spot]);
                using (var entryStream = entry.Open())
                using (var sw = new StreamWriter(entryStream))
                {
                    sw.Write(doneContents[spot]);
                }
            }
        }

        using (var fileStream = new FileStream(path, FileMode.Create))
        {
            zipStream.Seek(0, SeekOrigin.Begin);
            zipStream.CopyTo(fileStream);
        }
    }
}
I've used Microsoft's DocumentFormat.OpenXML and Excel Interop; however, they are both lacking a few main components that I need.
Update:
using (var fileStream = new FileStream(path, FileMode.Create))
{
    var wrapper = new StreamWriter(fileStream);
    wrapper.AutoFlush = true;
    zipStream.Seek(0, SeekOrigin.Begin);
    zipStream.CopyTo(wrapper.BaseStream);
    wrapper.Flush();
    wrapper.Close();
}
Try the process without changing the string and see if the file size is the same. If so, your copy is working correctly; however, as Marc B suggested, with compression even a small change can result in a larger change in the overall size.
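A minimal sketch of that test (paths are placeholders): copy every entry byte-for-byte, with no StreamReader/StreamWriter text round-trip that could alter binary parts such as images.

using System.IO;
using System.IO.Compression;

public static void CopyArchiveUnchanged(string sourcePath, string destinationPath)
{
    using (ZipArchive source = ZipFile.Open(sourcePath, ZipArchiveMode.Read))
    using (FileStream destFile = new FileStream(destinationPath, FileMode.Create))
    using (ZipArchive destination = new ZipArchive(destFile, ZipArchiveMode.Create))
    {
        foreach (ZipArchiveEntry entry in source.Entries)
        {
            ZipArchiveEntry copy = destination.CreateEntry(entry.FullName);
            using (Stream input = entry.Open())
            using (Stream output = copy.Open())
            {
                input.CopyTo(output); // raw bytes, so every part survives unchanged
            }
        }
    }
}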

MemoryStream - Adding Zip File to Stream

I have some code which processes one or more DataTables, writing the row and column data as CSV files to a stream. Each DataTable's contents are saved to a separate CSV file and finally saved to a zip file (using DotNetZip). This code works fine when there is only one DataTable to be processed, but when there are multiple DataTables, row and column data is saved to only one CSV (the other CSVs are empty) and the data is chopped off at random places.
MemoryStream stream = new MemoryStream();
MemoryStream outputStream = new MemoryStream();
StreamWriter streamWriter = new StreamWriter(stream);
StreamWriter outStreamWriter = new StreamWriter(stream);
CsvConfiguration config = new CsvConfiguration();
config.QuoteAllFields = true;
streamWriter.WriteLine("sep=" + config.Delimiter);
var zip = new ZipFile();
var csv = new CsvWriter(streamWriter, config);

foreach (DataTable dt in dataTables)
{
    foreach (DataColumn dc in dt.Columns)
    {
        csv.WriteField(dc.ColumnName.ToString());
    }
    csv.NextRecord();

    foreach (DataRow dr in dt.Rows)
    {
        foreach (DataColumn dc in dt.Columns)
        {
            csv.WriteField(dr[dc].ToString());
        }
        csv.NextRecord();
    }

    zip.AddEntry(report.Title.ToString() + dt.GetHashCode() + ".csv", stream);
    stream.Position = 0;
}

zip.Save(outputStream);
streamWriter.Flush();
outStreamWriter.Flush();
outputStream.Position = 0;
return outputStream;
I suspect that my usage of zip.AddEntry() may not be the correct way to save files to a stream. Any help appreciated, as always. Also note that I know I don't have any using statements in my code; I was too lazy to add them for this example.
There are two possible problem places I see:
1) outputStream.Position = 0;
and
2) var csv = new CsvWriter(streamWriter, config);
The first is not the correct way to reset the stream. The second may have problems with rewound streams.
1) To rectify the first one, either reset the stream properly:
ms.Seek(0, SeekOrigin.Begin);
ms.SetLength(0);
or just create a new MemoryStream for each table.
2) To rectify the second one, just create a new CsvWriter for each table:
foreach (DataTable dt in dataTables)
{
    var csv = new CsvWriter(streamWriter, config);
    ...
}
I'd recommend you handle both problems: there aren't any enormous advantages in reusing the old objects (or have you profiled your code?), and reuse can lead to all sorts of inconsistent behaviour if the objects are mis-disposed or mis-reset.
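Putting both fixes together, a rough sketch that reuses the question's own variables (dataTables, config, report.Title) and its CsvHelper/DotNetZip API usage might look like this:

var zip = new ZipFile();
foreach (DataTable dt in dataTables)
{
    // New stream and new CsvWriter per table, so each CSV entry gets its own data.
    var tableStream = new MemoryStream();
    var tableWriter = new StreamWriter(tableStream);
    var csv = new CsvWriter(tableWriter, config);

    foreach (DataColumn dc in dt.Columns)
    {
        csv.WriteField(dc.ColumnName);
    }
    csv.NextRecord();

    foreach (DataRow dr in dt.Rows)
    {
        foreach (DataColumn dc in dt.Columns)
        {
            csv.WriteField(dr[dc].ToString());
        }
        csv.NextRecord();
    }

    tableWriter.Flush();        // flush the StreamWriter so the CSV text reaches the stream
    tableStream.Position = 0;   // rewind so the zip entry is read from the start
    zip.AddEntry(report.Title.ToString() + dt.GetHashCode() + ".csv", tableStream);
}

var outputStream = new MemoryStream();
zip.Save(outputStream);
outputStream.Position = 0;
return outputStream;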
