How can I write and read using a BinaryWriter? - C#

I have this code, which works when writing a binary file:
using (BinaryWriter binWriter =
    new BinaryWriter(File.Open(f.fileName, FileMode.Create)))
{
    for (int i = 0; i < f.histogramValueList.Count; i++)
    {
        binWriter.Write(f.histogramValueList[i]);
    }
}
And this code to read back from the DAT file on the hard disk:
fileName = Options_DB.get_histogramFileDirectory();
if (File.Exists(fileName))
{
    BinaryReader binReader =
        new BinaryReader(File.Open(fileName, FileMode.Open));
    try
    {
        int pos = 0;
        int length = (int)binReader.BaseStream.Length;
        binReader.BaseStream.Seek(0, SeekOrigin.Begin);
        while (pos < length)
        {
            long[] l = new long[256];
            for (int i = 0; i < 256; i++)
            {
                if (pos < length)
                    l[i] = binReader.ReadInt64();
                else
                    break;
                pos += sizeof(Int64);
            }
            list_of_histograms.Add(l);
        }
    }
    catch
    {
        // note: an empty catch silently swallows any read errors
    }
    finally
    {
        binReader.Close();
    }
}
But what I want to do is extend the writing code so it writes three more lists to the file, like this:
binWriter.Write(f.histogramValueList[i]);
binWriter.Write(f.histogramValueListR[i]);
binWriter.Write(f.histogramValueListG[i]);
binWriter.Write(f.histogramValueListB[i]);
The first question is: how can I write all of this so that each list is identified in the file by a string or something, so that when I'm reading the file back I can load each list into a new one?
The second question is: how do I read the file back so that each list is added to a new list?
Right now it's easy: I write one list, then read it and add it to a list.
But now that I've added three more lists, how can I do it?
Thanks.

To get the answer, think about how to get the number of items in the list that you've just serialized.
Cheat code: write the number of items in the collection before the items. When reading, do the reverse.
writer.Write(items.Count());
// write items.Count() items.
Reading:
int count = reader.ReadInt32();
items = new List<ItemType>();
// read count item objects and add to items collection.
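To make that concrete, here is a minimal sketch of that scheme applied to the four histogram lists from the question. It assumes the lists hold Int64 values (as the original reading code suggests) and reuses the f.fileName / fileName variables from above:

// Writing: prefix each list with its count, always in the same fixed order.
using (var binWriter = new BinaryWriter(File.Open(f.fileName, FileMode.Create)))
{
    foreach (var list in new[] { f.histogramValueList, f.histogramValueListR,
                                 f.histogramValueListG, f.histogramValueListB })
    {
        binWriter.Write(list.Count);          // length prefix (Int32)
        foreach (long value in list)
            binWriter.Write(value);           // the items themselves
    }
}

// Reading: do the reverse, one list at a time.
var allLists = new List<List<long>>();
using (var binReader = new BinaryReader(File.Open(fileName, FileMode.Open)))
{
    for (int listIndex = 0; listIndex < 4; listIndex++)
    {
        int count = binReader.ReadInt32();    // read the length prefix back first
        var list = new List<long>(count);
        for (int i = 0; i < count; i++)
            list.Add(binReader.ReadInt64());
        allLists.Add(list);
    }
}

Because the lists are written in a fixed order, no string tag is strictly necessary; if you want one anyway, write it with binWriter.Write("R") before each count and read it back with binReader.ReadString().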


Large data table to multiple CSV files of a specific size in .NET

I have one large data table with some millions of records. I need to export it into multiple CSV files of a specific size. For example, if I choose a file size of 5MB, then when I export, the DataTable should be written to four CSV files of 5MB each, with the last file's size varying depending on the remaining records. I went through many solutions here and also looked at the csvhelper library, but they all deal with splitting large files into multiple CSVs, not with splitting an in-memory data table into multiple CSV files based on the specified file size. I want to do this in C#. Any help in this direction would be great.
Thanks
Jay
Thanks @H.G.Sandhagen and @jdweng for the inputs. Currently I have written the following code, which does the work needed. I know it is not perfect; some enhancements can surely be made, and it could be more efficient if we could pre-determine the length of the data table's item array, as pointed out by Nick.McDermaid. For now, I will go with this code to unblock myself and will post the final optimized version when I have it coded.
public void WriteToCsv(DataTable table, string path, int size)
{
    int fileNumber = 0;
    StreamWriter sw = new StreamWriter(string.Format(path, fileNumber), false);
    // headers
    for (int i = 0; i < table.Columns.Count; i++)
    {
        sw.Write(table.Columns[i]);
        if (i < table.Columns.Count - 1)
        {
            sw.Write(",");
        }
    }
    sw.Write(sw.NewLine);
    foreach (DataRow row in table.AsEnumerable())
    {
        sw.WriteLine(string.Join(",", row.ItemArray.Select(x => x.ToString())));
        if (sw.BaseStream.Length > size) // Time to create new file!
        {
            sw.Close();
            sw.Dispose();
            fileNumber++;
            sw = new StreamWriter(string.Format(path, fileNumber), false);
            // note: as written, files after the first do not get a header row
        }
    }
    sw.Close();
}
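One thing that's easy to miss: because each file name is built with string.Format(path, fileNumber), the path argument must contain a {0} placeholder. A hypothetical call (the path and 5 MB limit here are example values, not from the question) might look like:

// "{0}" is replaced by the running file number: export_0.csv, export_1.csv, ...
WriteToCsv(myTable, @"C:\Temp\export_{0}.csv", 5 * 1024 * 1024);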
I had a similar problem and this is how I solved it with CsvHelper.
The answer could easily be adapted to use a DataTable as the source.
public void SplitCsvTest()
{
    var inventoryRecords = new List<InventoryCsvItem>();
    for (int i = 0; i < 100000; i++)
    {
        inventoryRecords.Add(new InventoryCsvItem { ListPrice = i + 1, Quantity = i + 1 });
    }

    const decimal MAX_BYTES = 5 * 1024 * 1024; // 5 MB

    List<byte[]> parts = new List<byte[]>();
    using (var memoryStream = new MemoryStream())
    {
        using (var streamWriter = new StreamWriter(memoryStream))
        using (var csvWriter = new CsvWriter(streamWriter))
        {
            csvWriter.WriteHeader<InventoryCsvItem>();
            csvWriter.NextRecord();
            csvWriter.Flush();
            streamWriter.Flush();

            var headerSize = memoryStream.Length;
            foreach (var record in inventoryRecords)
            {
                csvWriter.WriteRecord(record);
                csvWriter.NextRecord();
                csvWriter.Flush();
                streamWriter.Flush();

                if (memoryStream.Length > (MAX_BYTES - headerSize))
                {
                    parts.Add(memoryStream.ToArray());
                    memoryStream.SetLength(0);
                    memoryStream.Position = 0;
                    csvWriter.WriteHeader<InventoryCsvItem>();
                    csvWriter.NextRecord();
                }
            }

            if (memoryStream.Length > headerSize)
            {
                parts.Add(memoryStream.ToArray());
            }
        }
    }

    for (int i = 0; i < parts.Count; i++)
    {
        var part = parts[i];
        File.WriteAllBytes($"C:/Temp/Part {i + 1} of {parts.Count}.csv", part);
    }
}
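The point of this design is that the size check happens against the MemoryStream, so each part (header included) is measured entirely in memory before anything touches the disk. One caveat from me: new CsvWriter(streamWriter) matches an older CsvHelper API; recent versions of the library also expect a CultureInfo argument in the constructor.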

How to Stream string data from a txt file into an array

I'm doing this exercise from a lab. The instructions are as follows:
This method should read the product catalog from a text file called “catalog.txt” that you should
create alongside your project. Each product should be on a separate line. Use the instructions in the video to create the file and add it to your project, and to return an array with the first 200 lines from the file (use the StreamReader class and a while loop to read from the file). If the file has more than 200 lines, ignore them. If the file has fewer than 200 lines, it’s OK if some of the array elements are empty (null).
I don't understand how to stream the data into the string array; any clarification would be greatly appreciated!
static string[] ReadCatalogFromFile()
{
    //create instance of the catalog.txt
    StreamReader readCatalog = new StreamReader("catalog.txt");

    //store the information in this array
    string[] storeCatalog = new string[200];

    int i = 0;
    //test and store the array information
    while (storeCatalog != null)
    {
        //store each string in the elements of the array?
        storeCatalog[i] = readCatalog.ReadLine();
        i = i + 1;
        if (storeCatalog != null)
        {
            //test to see if its properly stored
            Console.WriteLine(storeCatalog[i]);
        }
    }
    readCatalog.Close();
    Console.ReadLine();
    return storeCatalog;
}
Here are some hints:
int i = 0;
Make sure this counter stays outside your loop, so it isn't reset on each iteration.
In your while() you should check the result of readCatalog.ReadLine() and/or the maximum number of lines to read (i.e. the size of your array).
Thus: if you reached the end of the file -> stop; or if your array is full -> stop.
static string[] ReadCatalogFromFile()
{
    var lines = new string[200];
    using (var reader = new StreamReader("catalog.txt"))
        for (var i = 0; i < 200 && !reader.EndOfStream; i++)
            lines[i] = reader.ReadLine();
    return lines;
}
A for loop is used when you know the exact number of iterations beforehand, so you can say it should iterate exactly 200 times and you won't cross the index boundaries. At the moment you just check that your array isn't null, which it never will be.
using (var readCatalog = new StreamReader("catalog.txt"))
{
    string[] storeCatalog = new string[200];
    for (int i = 0; i < 200; i++)
    {
        string temp = readCatalog.ReadLine();
        if (temp != null)
            storeCatalog[i] = temp;
        else
            break;
    }
    return storeCatalog;
}
As soon as there are no more lines in the file, temp will be null and the loop is stopped by the break.
I suggest you wrap your disposable resources (like any stream) in a using statement: after the operations in its braces, the resource is automatically disposed.

Intersect and Union in byte array of 2 files

I have two files: the first is the source file and the second is the destination file.
Below is my code to intersect and union the two files using byte arrays.
FileStream frsrc = new FileStream("Src.bin", FileMode.Open);
FileStream frdes = new FileStream("Des.bin", FileMode.Open);
int length = 24; // get file length
byte[] src = new byte[length];
byte[] des = new byte[length]; // create buffer
int Counter = 0; // actual number of bytes read
int subcount = 0;
while (frsrc.Read(src, 0, length) > 0)
{
    try
    {
        Counter = 0;
        frdes.Position = subcount * length;
        while (frdes.Read(des, 0, length) > 0)
        {
            var data = src.Intersect(des);
            var data1 = src.Union(des);
            Counter++;
        }
        subcount++;
        Console.WriteLine(subcount.ToString());
    }
    catch (Exception ex)
    {
    }
}
It works fine and is fast.
But now the problem is that I want a count of the results, and when I use the code below it becomes very slow.
var data = src.Intersect(des).Count();
var data1 = src.Union(des).Count();
So, is there any solution for that?
If yes, then please let me know as soon as possible.
Thanks
Intersect and Union are not the fastest operations. The reason you see it being fast is that you never actually enumerate the results!
Both return an enumerable, not the actual results of the operation. You're supposed to enumerate that result, otherwise nothing happens - this is called "deferred execution". Now, when you do Count, you actually enumerate the enumerable and incur the full cost of the Intersect and Union - believe me, the Count itself is relatively trivial (though still an O(n) operation!).
You'll need to make your own methods, most likely. You want to avoid the enumerable overhead, and more importantly, you'll probably want a lookup table.
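To illustrate the lookup-table idea, here is a sketch of one possible approach (my sketch, not the asker's code): mark which byte values occur in each buffer, then count distinct values from two fixed 256-entry tables instead of running set operations per block. This matches the distinct-value semantics of Enumerable.Intersect/Union at a fraction of the cost:

// Count the distinct-byte intersection and union of two buffers
// using two fixed 256-entry lookup tables.
static void CountIntersectUnion(byte[] src, byte[] des,
                                out int intersect, out int union)
{
    bool[] inSrc = new bool[256];
    bool[] inDes = new bool[256];
    foreach (byte b in src) inSrc[b] = true;
    foreach (byte b in des) inDes[b] = true;

    intersect = 0;
    union = 0;
    for (int v = 0; v < 256; v++)
    {
        if (inSrc[v] && inDes[v]) intersect++;  // value occurs in both
        if (inSrc[v] || inDes[v]) union++;      // value occurs in either
    }
}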
A few points: the comment // get file length is misleading, as it is actually the buffer size. Counter is not the number of bytes read, it is the number of blocks read. data and data1 will end up with the result of the last block read, ignoring any data before them. That is assuming that nothing goes wrong in the while loop - you need to remove the try structure to see if there are any errors.
What you can do is count the number of occurrences of each byte in each file; then if the count of a byte is greater than zero in any file it is a member of the union of the files, and if the count of a byte is greater than zero in all the files it is a member of the intersection of the files.
It is just as easy to write the code for more than two files as it is for two, whereas LINQ is easy for two but a little more fiddly for more than two. (I put in a comparison with using LINQ in a naïve fashion for only two files at the end.)
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var file1 = @"C:\Program Files (x86)\Electronic Arts\Crysis 3\Bin32\Crysis3.exe"; // 26MB
            var file2 = @"C:\Program Files (x86)\Electronic Arts\Crysis 3\Bin32\d3dcompiler_46.dll"; // 3MB
            List<string> files = new List<string> { file1, file2 };

            var sw = System.Diagnostics.Stopwatch.StartNew();

            // Prepare array of counters for the bytes
            var nFiles = files.Count;
            int[][] count = new int[nFiles][];
            for (int i = 0; i < nFiles; i++)
            {
                count[i] = new int[256];
            }

            // Get the counts of bytes in each file
            int bufLen = 32768;
            byte[] buffer = new byte[bufLen];
            int bytesRead;
            for (int fileNum = 0; fileNum < nFiles; fileNum++)
            {
                using (var sr = new FileStream(files[fileNum], FileMode.Open, FileAccess.Read))
                {
                    bytesRead = bufLen;
                    while (bytesRead > 0)
                    {
                        bytesRead = sr.Read(buffer, 0, bufLen);
                        for (int i = 0; i < bytesRead; i++)
                        {
                            count[fileNum][buffer[i]]++;
                        }
                    }
                }
            }

            // Find which bytes are in any of the files or in all the files
            var inAny = new List<byte>(); // union
            var inAll = new List<byte>(); // intersect
            for (int i = 0; i < 256; i++)
            {
                bool all = true;
                for (int fileNum = 0; fileNum < nFiles; fileNum++)
                {
                    if (count[fileNum][i] > 0)
                    {
                        if (!inAny.Contains((byte)i)) // avoid adding same value more than once
                        {
                            inAny.Add((byte)i);
                        }
                    }
                    else
                    {
                        all = false;
                    }
                }
                if (all)
                {
                    inAll.Add((byte)i);
                }
            }
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            // Display the results
            Console.WriteLine("Union: " + string.Join(",", inAny.Select(x => x.ToString("X2"))));
            Console.WriteLine();
            Console.WriteLine("Intersect: " + string.Join(",", inAll.Select(x => x.ToString("X2"))));
            Console.WriteLine();

            // Compare to using LINQ.
            // N.B. Will need adjustments for more than two files.
            var srcBytes1 = File.ReadAllBytes(file1);
            var srcBytes2 = File.ReadAllBytes(file2);
            sw.Restart();
            var intersect = srcBytes1.Intersect(srcBytes2).ToArray().OrderBy(x => x);
            var union = srcBytes1.Union(srcBytes2).ToArray().OrderBy(x => x);
            Console.WriteLine(sw.ElapsedMilliseconds);
            Console.WriteLine("Union: " + String.Join(",", union.Select(x => x.ToString("X2"))));
            Console.WriteLine();
            Console.WriteLine("Intersect: " + String.Join(",", intersect.Select(x => x.ToString("X2"))));
            Console.ReadLine();
        }
    }
}
On my computer, the counting-the-byte-occurrences method is roughly five times faster than the LINQ method across a range of file sizes (a few KB to a few MB), even though the LINQ timing excludes loading the files.

Reading input from a text file, converting to int, and storing in an array

I need some help. I have a text file that looks like so:
21,M,S,1
22,F,M,2
19,F,S,3
65,F,M,4
66,M,M,4
What I need to do is put the first column into an array int[] age and the last column into an array int[] districts. This is for a college project due in a week. I've been having a lot of trouble trying to figure this out; any help would be greatly appreciated. I did try searching for an answer already but didn't find anything that I understood. I also cannot use anything we haven't learned from the book, so that rules out List<> and the like.
FileStream census = new FileStream("census.txt", FileMode.Open, FileAccess.Read);
StreamReader inFile = new StreamReader(census);

string input = "";
string[] fields;
int[] districts = new int[SIZE];
int[] ageGroups = new int[SIZE];

input = inFile.ReadLine();
while (input != null)
{
    fields = input.Split(',');
    for (int i = 0; i < 1; i++)
    {
        int x = int.Parse(fields[i]);
        districts[i] = x;
    }
    input = inFile.ReadLine();
}
Console.WriteLine(districts[0]);
If your file is nothing but this, then File.ReadAllLines() will return a string array with each element being a line of your file. Having done that, you can then use the length of the returned array to initialize the other two arrays into which the data will be stored.
Once you have your string array, you call string.Split() on each element with "," as your delimiter; now you have another array of strings minus the commas. You then take the values you want by their index positions, 0 and 3 respectively, and store them somewhere. Your code would look something like this:
// you will need to replace path with the actual path to the file.
string[] file = File.ReadAllLines("path");
int[] age = new int[file.Length];
int[] districts = new int[file.Length];
int counter = 0;
foreach (var item in file)
{
    string[] values = item.Split(',');
    age[counter] = Convert.ToInt32(values[0]);
    districts[counter] = Convert.ToInt32(values[3]);
    counter++;
}
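One caveat with this approach (my note, not the answerer's): Convert.ToInt32 throws a FormatException on a blank or malformed line, so it assumes a clean input file.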
Proper way of writing this code:
Write down each step you're trying to perform:
// open file
// for each line
//     parse line
Then refine "parse line":
// split into fields
// parse and handle age
// parse and handle gender
// parse and handle marital status
// parse and handle ....
Then start writing the missing code.
At that point you should figure out that iterating through the fields of a single record is not going to do you any good, as the fields all have different meanings.
So you'll need to remove the for loop and replace it with field-by-field parsing/assignments.
Instead of looping through all your fields, simply refer to the actual index of the field:
Wrong:
for (int i = 0; i < 1; i++)
{
    int x = int.Parse(fields[i]);
    districts[i] = x;
}
Right:
districts[i] = int.Parse(fields[0]);
ageGroups[i] = int.Parse(fields[3]);
i++;
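(Here i is a running row counter that you declare before the read loop; it advances once per line read.)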
So I just made some BS to do what you are seeking. I do not agree with it, because I hate hardcoding indexes for Split, but since you can't use a List this is what you get:
FileStream census = new FileStream(path, FileMode.Open, FileAccess.Read);
StreamReader inFile = new StreamReader(census);

int[] districts = new int[1024];
int[] ageGroups = new int[1024];
int counter = 0;
string line;
while ((line = inFile.ReadLine()) != null)
{
    string[] splitString = line.Split(',');
    int.TryParse(splitString[0], out ageGroups[counter]);
    int.TryParse(splitString[3], out districts[counter]);
    counter++;
}
This will give you two arrays, districts and ageGroups, each of length 1024, containing the values for each row in the census.txt file.
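If the fixed length of 1024 bothers you, one follow-up option (my suggestion, not part of the answer above) is to trim both arrays to the number of rows actually read once the loop finishes:

// Shrink both arrays to the number of rows actually parsed.
Array.Resize(ref districts, counter);
Array.Resize(ref ageGroups, counter);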

For testing I would like to generate a lot of files with some content - any easy way?

I'm using C# and my code reads and moves some files. The problem is that there are not many files to read and move to other folders, but I would like to test my code with 500, 1000 or more files at once.
I could create every single file by myself -> not so smart. I could generate these files and write my own code for this -> could work, but isn't there an easier way? Maybe there are already some tools for developers to create test files? Or is there another solution in C#/.NET?
PS: Ah, forgot to say - I'm reading normal ASCII files. Later I would like to create "csv-like" files (strings split by ";") if possible.
This code will create an arbitrary number of files, each with an arbitrary number of lines, each containing an arbitrary number of comma-separated random integer values.
I hope it gets you started on creating some test data for your application.
static void Main(string[] args)
{
    int numFiles = 30;
    for (int fileIndex = 0; fileIndex < numFiles; fileIndex++)
    {
        string randomFileName = Path.Combine(@"c:\temp", Path.GetRandomFileName() + ".csv");
        GenerateTestFile(randomFileName, 20, 10);
    }
}

static void GenerateTestFile(string fileName, int numLines, int numValues)
{
    int[] values = new int[numValues];
    Random random = new Random(DateTime.Now.Millisecond);
    FileInfo f = new FileInfo(fileName);

    using (TextWriter fs = f.CreateText())
    {
        for (int lineIndex = 0; lineIndex < numLines; lineIndex++)
        {
            for (int valIndex = 0; valIndex < values.Length; valIndex++)
            {
                values[valIndex] = random.Next(100);
            }
            fs.WriteLine(string.Join(",", values));
        }
    }
}
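Another answer takes a shorter route: keep a small pool of sample lines and write a random subset of them to a fresh temp file on each iteration: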
var yourSampleTextStringArray = new[] { "dada", "dada", "aaa" /*.....*/ };
var rnd = new Random();
for (int i = 0; i < 10e3; i++) // 10e3 == 10,000 files
{
    // GetTempFileName creates a uniquely named, empty file and returns its path
    var temp = Path.GetTempFileName();
    File.WriteAllLines(temp, yourSampleTextStringArray.Where(line => rnd.NextDouble() > 0.5));
}
