How to read large file and split by "\r\n" - c#

I have a large file (>200 MB). The file is a CSV file from an external party, but sadly I cannot just read it line by line, because \r\n is used to define a new line.
Currently I am reading in all the lines using this approach:
var file = File.ReadAllText(filePath, Encoding.Default);
var lines = Regex.Split(file, "\r\n");
for (int i = 0; i < lines.Length; i++)
{
    string line = lines[i];
    ...
}
How can I optimize this? After calling ReadAllText on my 225 MB file, the process uses more than 1 GB of RAM. Is it possible to use a streaming approach in my case, where I need to split the file using my \r\n pattern?
EDIT1:
Your solutions using File.ReadLines or a StreamReader will not work, because they treat every \r or \n as a line break; I need to split only on the exact \r\n sequence. Reading the file using my code results in 758,371 lines (which is correct), whereas a normal line count gives more than 1.5 million.
SOLUTION
public static IEnumerable<string> ReadLines(string path)
{
    const string delim = "\r\n";
    using (StreamReader sr = new StreamReader(path))
    {
        StringBuilder sb = new StringBuilder();
        while (!sr.EndOfStream)
        {
            // Read characters one at a time and only treat the exact
            // "\r\n" pair as a line break; lone \r or \n stay in the line.
            for (int i = 0; i < delim.Length; i++)
            {
                char c = (char)sr.Read();
                sb.Append(c);
                if (c != delim[i])
                    break;
                if (i == delim.Length - 1)
                {
                    // Full delimiter matched: drop it and emit the line.
                    sb.Remove(sb.Length - delim.Length, delim.Length);
                    yield return sb.ToString();
                    sb = new StringBuilder();
                    break;
                }
            }
        }
        if (sb.Length > 0)
            yield return sb.ToString();
    }
}
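A minimal usage sketch for this method (the file path and the ';' field separator are just assumptions for illustration):
foreach (string line in ReadLines(filePath))
{
    // each "line" here ends only at an exact \r\n pair; lone \r or \n stay inside the line
    var fields = line.Split(';'); // hypothetical CSV delimiter, adjust to the actual format
}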

You can use File.ReadLines, which returns an IEnumerable<string> instead of loading the whole file into memory.
foreach (var line in File.ReadLines(filePath, Encoding.Default)
                         .Where(l => !String.IsNullOrEmpty(l)))
{
}

Using a StreamReader, it is easy:
using (StreamReader sr = new StreamReader(path))
{
    foreach (string line in GetLine(sr))
    {
        //
    }
}
IEnumerable<string> GetLine(StreamReader sr)
{
while (!sr.EndOfStream)
yield return new string(GetLineChars(sr).ToArray());
}
IEnumerable<char> GetLineChars(StreamReader sr)
{
if (sr.EndOfStream)
yield break;
var c1 = sr.Read();
if (c1 == '\\')
{
var c2 = sr.Read();
if (c2 == 'r')
{
var c3 = sr.Read();
if (c3 == '\\')
{
var c4 = sr.Read();
if (c4 == 'n')
{
yield break;
}
else
{
yield return (char)c1;
yield return (char)c2;
yield return (char)c3;
yield return (char)c4;
}
}
else
{
yield return (char)c1;
yield return (char)c2;
yield return (char)c3;
}
}
else
{
yield return (char)c1;
yield return (char)c2;
}
}
else
yield return (char)c1;
}

Use StreamReader to read the file line by line:
using (StreamReader sr = new StreamReader(filePath))
{
while (true)
{
string line = sr.ReadLine();
if (line == null)
break;
}
}

How about
StreamReader sr = new StreamReader(path);
while (!sr.EndOfStream)
{
string line = sr.ReadLine();
}
Using the stream reader approach means the whole file won't get loaded into memory.

This was my lunch break :)
Set MAXREAD to the amount of data you want to hold in memory at a time; since I'm using yield return, it works nicely in a foreach. Use the code at your own risk, I've only tried it on smaller sets of data :)
Your usage would be something like:
foreach (var row in new StreamReader(FileName).SplitByChar(new char[] { '\r', '\n' }))
{
    // Do something awesome! :)
}
And the extension method like this:
public static class FileStreamExtensions
{
    public static IEnumerable<string> SplitByChar(this StreamReader stream, char[] splitter)
    {
        int MAXREAD = 1024 * 1024;
        var chars = new List<char>(MAXREAD);
        var buffer = new char[MAXREAD];
        var lastStop = 0;
        var read = 0;
        while (!stream.EndOfStream)
        {
            read = stream.Read(buffer, 0, MAXREAD);
            lastStop = 0;
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == splitter[0])
                {
                    // Check the rest of the delimiter (note: a delimiter that
                    // straddles two buffers is not handled by this sketch).
                    var assume = i + splitter.Length <= read;
                    for (int p = 1; assume && p < splitter.Length; p++)
                    {
                        assume &= splitter[p] == buffer[i + p];
                    }
                    if (assume)
                    {
                        chars.AddRange(buffer.Skip(lastStop).Take(i - lastStop));
                        var res = new String(chars.ToArray());
                        chars.Clear();
                        yield return res;
                        i += splitter.Length - 1;
                        lastStop = i + 1;
                    }
                }
            }
            // Keep the unprocessed tail of this buffer for the next round.
            chars.AddRange(buffer.Skip(lastStop).Take(read - lastStop));
        }
        if (chars.Count > 0)
            yield return new String(chars.ToArray());
    }
}

Related

How to merge .txt files in c#? [duplicate]

using (StreamWriter writer = File.CreateText(FinishedFile))
{
int lineNum = 0;
while (lineNum < FilesLineCount.Min())
{
for (int i = 0; i <= FilesToMerge.Count() - 1; i++)
{
if (i != FilesToMerge.Count() - 1)
{
var CurrentFile = File.ReadLines(FilesToMerge[i]).Skip(lineNum).Take(1);
string CurrentLine = string.Join("", CurrentFile);
writer.Write(CurrentLine + ",");
}
else
{
var CurrentFile = File.ReadLines(FilesToMerge[i]).Skip(lineNum).Take(1);
string CurrentLine = string.Join("", CurrentFile);
writer.Write(CurrentLine + "\n");
}
}
lineNum++;
}
}
The current way I am doing this is just too slow. I am merging files that are each 50k+ lines long with various amounts of data.
for ex:
File 1
1
2
3
4
File 2
4
3
2
1
I need these to merge into a third file:
File 3
1,4
2,3
3,2
4,1
P.S. The user can pick as many files as they want, from any location.
Thanks for the help.
Your approach is slow because of the Skip and Take calls in the loops: File.ReadLines has to re-enumerate the file from the beginning for every line you take, so the work grows quadratically with the number of lines.
You could instead use a dictionary to collect, for each line index, the lines from all files:
string[] allFileLocationsToMerge = { "filepath1", "filepath2", "..." };
var mergedLists = new Dictionary<int, List<string>>();
foreach (string file in allFileLocationsToMerge)
{
string[] allLines = File.ReadAllLines(file);
for (int lineIndex = 0; lineIndex < allLines.Length; lineIndex++)
{
bool indexKnown = mergedLists.TryGetValue(lineIndex, out List<string> allLinesAtIndex);
if (!indexKnown)
allLinesAtIndex = new List<string>();
allLinesAtIndex.Add(allLines[lineIndex]);
mergedLists[lineIndex] = allLinesAtIndex;
}
}
IEnumerable<string> mergeLines = mergedLists.Values.Select(list => string.Join(",", list));
File.WriteAllLines("targetPath", mergeLines);
Here's another approach: this implementation only keeps one line from each file in memory at a time, thus reducing memory pressure significantly (if that is an issue).
public static void MergeFiles(string output, params string[] inputs)
{
var files = inputs.Select(File.ReadLines).Select(iter => iter.GetEnumerator()).ToArray();
StringBuilder line = new StringBuilder();
bool any;
using (var outFile = File.CreateText(output))
{
do
{
line.Clear();
any = false;
foreach (var iter in files)
{
if (!iter.MoveNext())
continue;
if (line.Length != 0)
line.Append(", ");
line.Append(iter.Current);
any = true;
}
if (any)
outFile.WriteLine(line.ToString());
}
while (any);
}
foreach (var iter in files)
{
iter.Dispose();
}
}
This also handles files of different lengths.
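A possible call site, using made-up file names (note that this version joins fields with ", " rather than the bare "," from the question; adjust the separator if that matters):
MergeFiles("merged.txt", "file1.txt", "file2.txt", "file3.txt");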

Editing a line in a file by its number [duplicate]

I have to write an array implementation that stores its values on the hard drive instead of in RAM (I know how stupid it sounds, but it's intended to teach us how different sorting algorithms perform on RAM versus the hard drive). This is what I've written so far:
class HDDArray : IEnumerable<int>
{
private string filePath;
public int this[int index]
{
get
{
using (var reader = new StreamReader(filePath))
{
string line = reader.ReadLine();
for (int i = 0; i < index; i++)
{
line = reader.ReadLine();
}
return Convert.ToInt32(line);
}
}
set
{
using (var fs = File.Open(filePath, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
var reader = new StreamReader(fs);
var writer = new StreamWriter(fs);
for (int i = 0; i < index; i++)
{
reader.ReadLine();
}
writer.WriteLine(value);
writer.Dispose();
}
}
}
public int Length
{
get
{
int length = 0;
using (var reader = new StreamReader(filePath))
{
while (reader.ReadLine() != null)
{
length++;
}
}
return length;
}
}
public HDDArray(string file)
{
filePath = file;
if (File.Exists(file))
File.WriteAllText(file, String.Empty);
else
File.Create(file).Dispose();
}
public IEnumerator<int> GetEnumerator()
{
using (var reader = new StreamReader(filePath))
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return Convert.ToInt32(line);
}
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
The problem I'm facing is that when trying to edit a line (in the set portion of the indexer) I end up adding a new line instead of editing the old one (it's pretty obvious why, I just can't figure out how to fix it).
Your array is designed to work with integers. Such a class is quite easy to create because every value has a fixed length of 4 bytes.
class HDDArray : IEnumerable<int>, IDisposable
{
readonly FileStream stream;
readonly BinaryWriter writer;
readonly BinaryReader reader;
public HDDArray(string file)
{
stream = new FileStream(file, FileMode.Create, FileAccess.ReadWrite);
writer = new BinaryWriter(stream);
reader = new BinaryReader(stream);
}
public int this[int index]
{
get
{
stream.Position = index * 4;
return reader.ReadInt32();
}
set
{
stream.Position = index * 4;
writer.Write(value);
}
}
public int Length
{
get
{
return (int)stream.Length / 4;
}
}
public IEnumerator<int> GetEnumerator()
{
stream.Position = 0;
while (reader.PeekChar() != -1)
yield return reader.ReadInt32();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
public void Dispose()
{
reader?.Dispose();
writer?.Dispose();
stream?.Dispose();
}
}
Since the size of each array element is known, we can seek within the stream simply by changing its Position property.
BinaryWriter and BinaryReader are very convenient for writing and reading numbers.
Opening a stream is a relatively heavy operation, so it is done once when the class is created. At the end of the work you need to clean up after yourself, which is why I implemented the IDisposable interface.
Usage:
HDDArray arr = new HDDArray("test.dat");
Console.WriteLine("Length: " + arr.Length);
for (int i = 0; i < 10; i++)
arr[i] = i;
Console.WriteLine("Length: " + arr.Length);
foreach (var n in arr)
Console.WriteLine(n);
// Console.WriteLine(arr[20]); // Exception!
arr.Dispose(); // release resources
I stand to be corrected, but I don't think there is an easy way to rewrite a specific line in place, so you will probably find it easier to rewrite the whole file, modifying that line.
You could change your set code as follows:
set
{
    var allLinesInFile = File.ReadAllLines(filePath);
    allLinesInFile[index] = value.ToString(); // the indexer stores ints, so convert to string
    File.WriteAllLines(filePath, allLinesInFile);
}
It goes without saying that there should be some safety checks in there, e.g. that the file exists and that index < allLinesInFile.Length.
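A sketch of the same setter with those checks added (purely illustrative):
set
{
    if (!File.Exists(filePath))
        throw new FileNotFoundException("Backing file not found.", filePath);
    var allLinesInFile = File.ReadAllLines(filePath);
    if (index < 0 || index >= allLinesInFile.Length)
        throw new ArgumentOutOfRangeException(nameof(index));
    allLinesInFile[index] = value.ToString(); // the indexer stores ints, so convert explicitly
    File.WriteAllLines(filePath, allLinesInFile);
}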
I think that for the sake of a sorting-algorithms homework assignment you needn't bother with memory-size issues.
Of course, please add a check that the file exists before reading.
Note: line counting in this example starts from 0.
string[] lines = File.ReadAllLines(filePath);
using (StreamWriter writer = new StreamWriter(filePath))
{
for (int currentLineNmb = 0; currentLineNmb < lines.Length; currentLineNmb++ )
{
if (currentLineNmb == lineToEditNmb)
{
writer.WriteLine(lineToWrite);
continue;
}
writer.WriteLine(lines[currentLineNmb]);
}
}
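Wrapped as a small method with the file-existence check mentioned above (the parameter names are just illustrative):
static void EditLine(string filePath, int lineToEditNmb, string lineToWrite)
{
    if (!File.Exists(filePath))
        return; // or throw, depending on what the caller expects
    string[] lines = File.ReadAllLines(filePath);
    using (StreamWriter writer = new StreamWriter(filePath))
    {
        for (int currentLineNmb = 0; currentLineNmb < lines.Length; currentLineNmb++)
        {
            writer.WriteLine(currentLineNmb == lineToEditNmb ? lineToWrite : lines[currentLineNmb]);
        }
    }
}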

c# How to run an application faster

I am creating a word list of possible passwords to prove how insecure 8-character passwords are. This code will write aaaaaaaa, then aaaaaaab, then aaaaaaac, and so on until zzzzzzzz:
class Program
{
static string path;
static int file = 0;
static void Main(string[] args)
{
new_file();
var alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var q = alphabet.Select(x => x.ToString());
int size = 3;
int counter = 0;
for (int i = 0; i < size - 1; i++)
{
q = q.SelectMany(x => alphabet, (x, y) => x + y);
}
foreach (var item in q)
{
if (counter >= 20000000)
{
new_file();
counter = 0;
}
if (File.Exists(path))
{
using (StreamWriter sw = File.AppendText(path))
{
sw.WriteLine(item);
Console.WriteLine(item);
/*if (!(Regex.IsMatch(item, @"(.)\1")))
{
sw.WriteLine(item);
counter++;
}
else
{
Console.WriteLine(item);
}*/
}
}
else
{
new_file();
}
}
}
static void new_file()
{
path = @"C:\" + "list" + file + ".txt";
if (!File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
}
}
file++;
}
}
The code works fine but it takes weeks to run. Does anyone know a way to speed it up, or do I have to wait? If anyone has an idea, please tell me.
Performance:
size 3: 0.02s
size 4: 1.61s
size 5: 144.76s
Hints:
removed LINQ for combination generation
removed Console.WriteLine for each password
removed StreamWriter
large buffer (128k) for file writing
const string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var byteAlphabet = alphabet.Select(ch => (byte)ch).ToArray();
var alphabetLength = alphabet.Length;
var newLine = new[] { (byte)'\r', (byte)'\n' };
const int size = 4;
var number = new byte[size];
var password = Enumerable.Range(0, size).Select(i => byteAlphabet[0]).Concat(newLine).ToArray();
var watcher = new System.Diagnostics.Stopwatch();
watcher.Start();
var isRunning = true;
for (var counter = 0; isRunning; counter++)
{
Console.Write("{0}: ", counter);
Console.Write(password.Select(b => (char)b).ToArray());
using (var file = System.IO.File.Create(string.Format(#"list.{0:D5}.txt", counter), 2 << 16))
{
for (var i = 0; i < 2000000; ++i)
{
file.Write(password, 0, password.Length);
var j = size - 1;
for (; j >= 0; j--)
{
if (number[j] < alphabetLength - 1)
{
password[j] = byteAlphabet[++number[j]];
break;
}
else
{
number[j] = 0;
password[j] = byteAlphabet[0];
}
}
if (j < 0)
{
isRunning = false;
break;
}
}
}
}
watcher.Stop();
Console.WriteLine(watcher.Elapsed);
}
Try the following modified code. In LINQPad it runs in < 1 second. With your original code I gave up after 40 seconds. It removes the overhead of opening and closing the file for every WriteLine operation. You'll need to test and ensure it gives the same results because I'm not willing to run your original code for 24 hours to ensure the output is the same.
class Program
{
static string path;
static int file = 0;
static void Main(string[] args)
{
new_file();
var alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var q = alphabet.Select(x => x.ToString());
int size = 3;
int counter = 0;
for (int i = 0; i < size - 1; i++)
{
q = q.SelectMany(x => alphabet, (x, y) => x + y);
}
StreamWriter sw = File.AppendText(path);
try
{
foreach (var item in q)
{
if (counter >= 20000000)
{
sw.Dispose();
new_file();
sw = File.AppendText(path); // reopen the writer for the new file
counter = 0;
}
sw.WriteLine(item);
Console.WriteLine(item);
counter++; // without this the 20,000,000-item rollover never triggers
}
}
finally
{
if(sw != null)
{
sw.Dispose();
}
}
}
static void new_file()
{
path = @"C:\temp\list" + file + ".txt";
if (!File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
}
}
file++;
}
}
Your alphabet is missing 0.
With that fixed there would be 90 chars in your set. Let's call it 100 for simplicity. The set you are looking for is all the 8-character-length strings drawn from that set. There are 100^8 of these, i.e. 10,000,000,000,000,000.
The disk space they will take up depends on how you encode them. Let's be generous: assume you use some 8-bit character set that contains these characters, and you don't put in carriage returns, so one byte per character. Each string is 8 bytes, giving 8 × 10,000,000,000,000,000 bytes ≈ 80 petabytes.
Do you have 80 petabytes of disk (80,000 TB)?
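For what it's worth, a quick back-of-the-envelope check in C# (using the exact 90-character set once the 0 is added, and the 8-bytes-per-entry, no-newline assumption from above):
using System;
using System.Numerics;

class ListSize
{
    static void Main()
    {
        BigInteger combinations = BigInteger.Pow(90, 8); // all 8-character strings from a 90-character set
        BigInteger totalBytes = combinations * 8;        // 1 byte per character, no newlines
        Console.WriteLine("{0:N0} passwords", combinations);
        Console.WriteLine("{0:N0} bytes, roughly {1:F0} petabytes", totalBytes, (double)totalBytes / 1e15);
    }
}
With the exact 90-character set this comes out to about 34 PB; the ~80 PB figure above uses the rounded 100-character set.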
[EDIT] In response to 'this is not an answer':
The original motivation is to create the list? This shows how large the list would be. It's hard to see what could be DONE with the list if it were actually materialised, i.e. it would always be quicker to reproduce it than to load it. Surely whatever point could be made by producing the list can also be made by simply knowing its size, which the above shows how to work out.
There are LOTS of inefficiencies in your code, but if your question is 'how can I quickly produce this list and write it to disk', the answer is 'you literally cannot'.
[/EDIT]

getting multiple images from a single stream piped from ffmpeg stdout

I start a process to retrieve a few frames from a video file with ffmpeg,
ffmpeg -i "<videofile>.mp4" -frames:v 10 -f image2pipe pipe:1
and pipe the images to stdout -
var cmd = Process.Start(p);
var stream = cmd.StandardOutput.BaseStream;
var img = Image.FromStream(stream);
Getting the first image this way works, but how do I get all of them?
OK, this was gobsmackingly easy; I'm kind of embarrassed I asked here. I'll post the answer in case it helps anyone else.
The first few bytes in the stream will be repeated every time there is a new image. I guessed the first 8 would do and voila.
static IEnumerable<Image> GetThumbnails(Stream stream)
{
byte[] allImages;
using (var ms = new MemoryStream())
{
stream.CopyTo(ms);
allImages = ms.ToArray();
}
var bof = allImages.Take(8).ToArray(); //??
var prevOffset = -1;
foreach (var offset in GetBytePatternPositions(allImages, bof))
{
if (prevOffset > -1)
yield return GetImageAt(allImages, prevOffset, offset);
prevOffset = offset;
}
if (prevOffset > -1)
yield return GetImageAt(allImages, prevOffset, allImages.Length);
}
static Image GetImageAt(byte[] data, int start, int end)
{
using (var ms = new MemoryStream(end - start))
{
ms.Write(data, start, end - start);
ms.Position = 0; // rewind before decoding, otherwise Image.FromStream reads from the end of the stream
// note: GDI+ expects the stream to stay open for the lifetime of the Image
return Image.FromStream(ms);
}
}
static IEnumerable<int> GetBytePatternPositions(byte[] data, byte[] pattern)
{
var dataLen = data.Length;
var patternLen = pattern.Length - 1;
int scanData = 0;
int scanPattern = 0;
while (scanData < dataLen)
{
if (pattern[0] == data[scanData])
{
scanPattern = 1;
scanData++;
while (pattern[scanPattern] == data[scanData])
{
if (scanPattern == patternLen)
{
yield return scanData - patternLen;
break;
}
scanPattern++;
scanData++;
}
}
scanData++;
}
}
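Tying this back to the question's code, a usage sketch might look like this (the file name is a placeholder; UseShellExecute must be false for the stdout redirection to work):
var psi = new ProcessStartInfo
{
    FileName = "ffmpeg",
    Arguments = "-i \"video.mp4\" -frames:v 10 -f image2pipe pipe:1",
    RedirectStandardOutput = true,
    UseShellExecute = false
};
using (var cmd = Process.Start(psi))
{
    foreach (Image img in GetThumbnails(cmd.StandardOutput.BaseStream))
    {
        // use each frame here, e.g. img.Save(...)
    }
}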

Read double value from a file C#

I have a txt file that the format is:
0.32423 1.3453 3.23423
0.12332 3.1231 9.23432432
9.234324234 -1.23432 12.23432
...
Each line has three double values. There are more than 10,000 lines in this file. I can use StreamReader.ReadLine and String.Split, then convert the values.
I want to know is there any faster method to do it.
Best Regards,
StreamReader.ReadLine, String.Split and Double.TryParse sounds like a good solution here.
No need for improvement.
There may be some little micro-optimisations you can perform, but the way you've suggested sounds about as simple as you'll get.
10000 lines shouldn't take very long - have you tried it and found you've actually got a performance problem? For example, here are two short programs - one creates a 10,000 line file and the other reads it:
CreateFile.cs:
using System;
using System.IO;
public class Test
{
static void Main()
{
Random rng = new Random();
using (TextWriter writer = File.CreateText("test.txt"))
{
for (int i = 0; i < 10000; i++)
{
writer.WriteLine("{0} {1} {2}", rng.NextDouble(),
rng.NextDouble(), rng.NextDouble());
}
}
}
}
ReadFile.cs:
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
public class Test
{
static void Main()
{
Stopwatch sw = Stopwatch.StartNew();
using (TextReader reader = File.OpenText("test.txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
string[] bits = line.Split(' ');
foreach (string bit in bits)
{
double value;
if (!double.TryParse(bit, out value))
{
Console.WriteLine("Bad value");
}
}
}
}
sw.Stop();
Console.WriteLine("Total time: {0}ms",
sw.ElapsedMilliseconds);
}
}
On my netbook (which admittedly has an SSD in it) it only takes 82ms to read the file. I would suggest that's probably not a problem :)
I would suggest reading all your lines at once with
string[] lines = System.IO.File.ReadAllLines(fileName);
This would ensure that the I/O is done with maximum efficiency. You would have to measure (profile), but I would expect the conversions to take far less time.
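For example, combining ReadAllLines with the split-and-parse step could look like this (the single-space separator is an assumption based on the sample data):
string[] lines = System.IO.File.ReadAllLines(fileName);
var values = new List<double[]>(lines.Length);
foreach (string line in lines)
{
    string[] parts = line.Split(' ');
    var row = new double[parts.Length];
    for (int i = 0; i < parts.Length; i++)
        row[i] = double.Parse(parts[i], CultureInfo.InvariantCulture); // System.Globalization; the file uses '.' decimals
    values.Add(row);
}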
Your method is already good!
You can improve it by writing a ReadLine-style helper that returns an array of doubles, so you can reuse it in other programs.
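A sketch of such a helper (the name, the space separator, and returning null at end-of-file are illustrative choices):
static double[] ReadDoubles(StreamReader reader)
{
    string line = reader.ReadLine();
    if (line == null)
        return null; // end of file
    string[] parts = line.Split(' ');
    var result = new double[parts.Length];
    for (int i = 0; i < parts.Length; i++)
        result[i] = double.Parse(parts[i]);
    return result;
}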
This solution is a little bit slower (see benchmarks at the end), but it's nicer to read. It should also be more memory efficient, because only the current value is buffered at a time (instead of the whole file or line).
Reading arrays is an additional feature of this reader; it assumes that the size of the array always comes first as an int value.
IParsable is another feature that makes it easy to implement Parse methods for various types.
class StringStreamReader {
private StreamReader sr;
public StringStreamReader(StreamReader sr) {
this.sr = sr;
this.Separator = ' ';
}
private StringBuilder sb = new StringBuilder();
public string ReadWord() {
eol = false;
sb.Clear();
char c;
while (!sr.EndOfStream) {
c = (char)sr.Read();
if (c == Separator) break;
if (IsNewLine(c)) {
eol = true;
char nextch = (char)sr.Peek();
while (IsNewLine(nextch)) {
sr.Read(); // consume all newlines
nextch = (char)sr.Peek();
}
break;
}
sb.Append(c);
}
return sb.ToString();
}
private bool IsNewLine(char c) {
return c == '\r' || c == '\n';
}
public int ReadInt() {
return int.Parse(ReadWord());
}
public double ReadDouble() {
return double.Parse(ReadWord());
}
public bool EOF {
get { return sr.EndOfStream; }
}
public char Separator { get; set; }
bool eol;
public bool EOL {
get { return eol || sr.EndOfStream; }
}
public T ReadObject<T>() where T : IParsable, new() {
var obj = new T();
obj.Parse(this);
return obj;
}
public int[] ReadIntArray() {
int size = ReadInt();
var a = new int[size];
for (int i = 0; i < size; i++) {
a[i] = ReadInt();
}
return a;
}
public double[] ReadDoubleArray() {
int size = ReadInt();
var a = new double[size];
for (int i = 0; i < size; i++) {
a[i] = ReadDouble();
}
return a;
}
public T[] ReadObjectArray<T>() where T : IParsable, new() {
int size = ReadInt();
var a = new T[size];
for (int i = 0; i < size; i++) {
a[i] = ReadObject<T>();
}
return a;
}
internal void NextLine() {
eol = false;
}
}
interface IParsable {
void Parse(StringStreamReader r);
}
It can be used like this:
public void Parse(StringStreamReader r) {
double x = r.ReadDouble();
int y = r.ReadInt();
string z = r.ReadWord();
double[] arr = r.ReadDoubleArray();
MyParsableObject o = r.ReadObject<MyParsableObject>();
MyParsableObject [] oarr = r.ReadObjectArray<MyParsableObject>();
}
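MyParsableObject itself is not shown in the answer; an implementation might look something like this (the fields are invented for illustration):
class MyParsableObject : IParsable
{
    public double Value;
    public int Count;
    public void Parse(StringStreamReader r)
    {
        // consume the fields in the order they appear in the input
        Value = r.ReadDouble();
        Count = r.ReadInt();
    }
}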
I did some benchmarking, comparing StringStreamReader with some of the other approaches already proposed (StreamReader.ReadLine and File.ReadAllLines). Here are the methods I used for benchmarking:
private static void Test_StringStreamReader(string filename) {
var sw = new Stopwatch();
sw.Start();
using (var sr = new StreamReader(new FileStream(filename, FileMode.Open, FileAccess.Read))) {
var r = new StringStreamReader(sr);
r.Separator = ' ';
while (!r.EOF) {
var dbls = new List<double>();
while (!r.EOF) {
dbls.Add(r.ReadDouble());
}
}
}
sw.Stop();
Console.WriteLine("elapsed: {0}", sw.Elapsed);
}
private static void Test_ReadLine(string filename) {
var sw = new Stopwatch();
sw.Start();
using (var sr = new StreamReader(new FileStream(filename, FileMode.Open, FileAccess.Read))) {
var dbls = new List<double>();
while (!sr.EndOfStream) {
string line = sr.ReadLine();
string[] bits = line.Split(' ');
foreach(string bit in bits) {
dbls.Add(double.Parse(bit));
}
}
}
sw.Stop();
Console.WriteLine("elapsed: {0}", sw.Elapsed);
}
private static void Test_ReadAllLines(string filename) {
var sw = new Stopwatch();
sw.Start();
string[] lines = System.IO.File.ReadAllLines(filename);
var dbls = new List<double>();
foreach(var line in lines) {
string[] bits = line.Split(' ');
foreach (string bit in bits) {
dbls.Add(double.Parse(bit));
}
}
sw.Stop();
Console.WriteLine("Test_ReadAllLines: {0}", sw.Elapsed);
}
I used a file with 1,000,000 lines of double values (3 values per line). The file is located on an SSD and each test was repeated multiple times in release mode. These are the results (on average):
Test_StringStreamReader: 00:00:01.1980975
Test_ReadLine: 00:00:00.9117553
Test_ReadAllLines: 00:00:01.1362452
So, as mentioned, StringStreamReader is a bit slower than the other approaches. For 10,000 lines, the times are around 120 ms / 95 ms / 100 ms respectively.
