StreamReader doesn't stop reading text file - c#

I have a program to read a million-line file. Each line has one floating-point value on it. Each value is to be read in and stored in an element of an array.
using System;
using System.Diagnostics;
using System.IO;
namespace sort1mRandFloat
{
public class Program
{
static void Main()
{
Console.WriteLine("Creating Single array...");
Single[] fltArray = new Single[1000000];
Console.WriteLine("Array created, making string...");
String line;
Console.WriteLine("String created, opening file...");
StreamReader file = new StreamReader(@"C:\\Users\\Aaron\\Desktop\\rand1mFloats.txt");
Console.WriteLine("File opened, creating stopwatch and starting main execution event. See you on the other side.");
int i;
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
while((line = file.ReadLine()) != null)
{
for(i=0; i < 1000000; i++)
{
fltArray[i] = Convert.ToSingle(line);
if (i == 999999)
Console.WriteLine("At 999999");
}
}
file.Close();
stopWatch.Stop();
TimeSpan ts = stopWatch.Elapsed;
String elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}",
ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds/10);
Console.WriteLine("It took " + elapsedTime + " to read a thousand lines into the array.\n");
Console.WriteLine("Element 0 is: " + fltArray[0]);
Console.WriteLine("Element 999999 is: " + fltArray[999999]);
Console.ReadLine();
}
}
}
When this code is run on the file, it never stops. It seems to be looking for something to tell it that it's at the end of the file, and it's not finding it. Upon filling the 999,999th element, it loops back to 0 and starts again.
This code is more or less based on what Microsoft recommends on their website... any idea on what I'm doing wrong?
The file can be found below. As I have not been able to store the file in the array yet, I cannot say how long it will take for it to work. There's quite a few values in the file. Metered connection warning: 18 MB file.
1 million line file- OneDrive

You should not have for inside while. You only need one loop:
var i = 0;
while((line = file.ReadLine()) != null)
{
fltArray[i] = Convert.ToSingle(line);
if (i == 999999)
Console.WriteLine("At 999999");
i++;
}
or with for:
for(i=0; i < 1000000 && (line = file.ReadLine()) != null; i++)
{
fltArray[i] = Convert.ToSingle(line);
if (i == 999999)
Console.WriteLine("At 999999");
}
Update
I'm getting following results for your file:
Creating Single array...
Array created, making string...
String created, opening file...
File opened, creating stopwatch and starting main execution event. See you on the other side.
At 999999
It took 00:00:00.42 to read a thousand lines into the array.
Element 0 is: 0,9976465
Element 999999 is: 0,04730097
Release build, run outside of VS, i5-3317U @ 1.7GHz.

I'm on my phone, so I apologize for the brevity. Your outer while loop will hit each of your 1 million lines, and your inner for loop is iterating 1 million times for a total of 1 trillion iterations. Also, your while condition can utilize the file.EndOfStream property.

Basically you are converting every line 1000000 times, because you have the for-loop within your while-loop that does the reading.
Simply remove the for-loop and replace it with i++.
Every time file.ReadLine is called it reads a single line from the file. Once it reaches the end of the file it returns null, which ends your while-loop.
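The single-loop fix from the answers above can be sketched as a small helper. This is only an illustration: the names FloatFileReader and ReadFloats are hypothetical, and it assumes the file uses '.' as the decimal separator, which is why it parses with the invariant culture. The extra bounds check keeps the loop from running past the array if the file has more lines than expected.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class FloatFileReader
{
    // Reads one float per line into target and returns how many values were stored.
    // Stops at end of input or when the array is full, whichever comes first.
    public static int ReadFloats(TextReader reader, float[] target)
    {
        int count = 0;
        string line;
        while (count < target.Length && (line = reader.ReadLine()) != null)
        {
            // Assumption: values are written with '.' as the decimal separator.
            target[count] = float.Parse(line, CultureInfo.InvariantCulture);
            count++;
        }
        return count;
    }
}
```

With the original program this would be called as FloatFileReader.ReadFloats(file, fltArray), where file is the StreamReader that is already open.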

Related

WPF C#: Splitting a long string

I have been thinking about this problem for a long time, and have finally decided to ask those who know. I have code that is supposed to read text from a big file (a couple of GB) line by line. Each line can be around 500 MB, since it holds a video converted to base64 together with the video's name. Here I read the current line and separate the video name from its content (starting from the else).
string[] fileline = GetFileLine(resPath, currentRow).Split(); //Here split causes SystemOutOfMemory
try
{
string base64 = fileline[0].Replace(specSymbol, ' ');
try
{
if (!IsVideo(ref base64) && !IsGif(ref base64))
{
ShowPrimary();
imgFile.Source = BytesToBitmap(Convert.FromBase64String(base64));
}
else
btnLoadFile.Background = readyColor;
if (fileline.Length > 1)
return fileline[1].Replace(specSymbol, ' ');
}
catch (Exception ex3) { MessageBox.Show("Next(4):" + ex3.Message); }
}
catch (Exception ex2) { MessageBox.Show("Next(3):" + ex2.Message); }
So my question is: is there a way to split long strings, or do I have to store the names in a separate file so that no splitting is needed?
UPD1: I have written a method following the advice @canton7 gave me. I tested it on really small files (around 100 characters), where it works well. I am now testing it on a 25 MB file, and the reading speed is awful (about 10 MB in an hour). On the other hand, reading really big files did not crash the program, so I think I'm on the right track.
I still wonder if there is a better method. If you have advice on improving this method, please share it here.
static string ReadFirstHalfAfter(string path, int skips = 0)
{
int skipsDone = 0;
int ri = 0;
char[] buffer = new char[1];
StreamReader reader = new StreamReader(path);
while (reader.Peek() >= 0)//while reader is not at the end of file
{
reader.Read(buffer, ri, 1);//reading one element from the current position
if (skipsDone < skips)//line skips not enough
{
if (buffer[buffer.Length - 1] == '\n')//current symbol is line end
{
skipsDone++;//line skip counted
continue;
}
}
else//enough line skips
{
if (buffer[buffer.Length - 1] == ' ') break; //if line separator - stop
ExpandArray(ref buffer); //adding one more free element
ri++; //switching element to read next
}
if (ri % 10000 == 0) Console.Write('.');
}
return new string(buffer).Trim(' ');
}
To separate the string into two pieces you can use Substring, which saves some memory. If you need to save more memory than that, the only way is to write the two parts of each line on separate rows.
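As a rough sketch of the streaming approach from the question's update, the first part of a line can be collected in a StringBuilder instead of an array that grows by one element per character. The names FirstFieldReader and ReadFirstField are hypothetical, and the sketch assumes a single-character separator and '\n' line endings; it has not been measured against the original method.

```csharp
using System;
using System.IO;
using System.Text;

public static class FirstFieldReader
{
    // Returns the text of line number `skips` (0-based) up to the first separator.
    // The rest of that line is never read into memory.
    public static string ReadFirstField(TextReader reader, int skips = 0, char separator = ' ')
    {
        int c;
        // Skip whole lines by consuming characters up to and including '\n'.
        for (int skipped = 0; skipped < skips; )
        {
            c = reader.Read();
            if (c < 0) return string.Empty; // the input has fewer lines than requested
            if (c == '\n') skipped++;
        }
        // Collect characters until the separator, a line break, or end of input.
        var field = new StringBuilder();
        while ((c = reader.Read()) >= 0 && c != separator && c != '\n' && c != '\r')
        {
            field.Append((char)c);
        }
        return field.ToString();
    }
}
```

Passing a StreamReader works because StreamReader derives from TextReader, and its internal buffer keeps the character-by-character Read calls cheap.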

Calculate estimated time for an exponential delay

Curious as I am, I wrote a small program that writes one space into a text file, then 2, then 4, etc. I record the time each loop takes, and of course it grows exponentially. While it takes just about 0.003 seconds at the beginning, it gets to the minute mark really fast. Now I want to calculate the estimated time for the program to finish.
This is the code I use so far:
//This creates the file if it doesn't exist
File.AppendAllText("C:/1G.txt", "");
//I am starting with 30 iterations
for (int i = 0; i < 30; i++)
{
DateTime start = DateTime.Now;
//The 1<<i will loop 1, 2, 4, 8, etc. times
for (int j = 0; j < 1 << i; j++)
{
File.AppendAllText("C:/1G.txt", " ");
}
DateTime end = DateTime.Now;
Console.WriteLine($"i = {i}, 1<<i = {1 << i}, Time: {(end-start)}");
}
Now normally when you try to calculate an estimated time, you take the time spans you already needed for each task, sum them up and divide them by the number of timestamps you have. But here this is not possible, as we can be sure that the next iteration will take longer than the first one.
Now I could just double the time for each iteration and have the time it can take. But my "problem" is, that it's not doubling 100% (which would be impossible):
Time: 00:00:00.0150
Time: 00:00:00.0020
Time: 00:00:00.0010
Time: 00:00:00.0020
Time: 00:00:00.0060
Time: 00:00:00.0090
Time: 00:00:00.0850
Time: 00:00:00.0708
Time: 00:00:00.3261
Time: 00:00:00.6483
Time: 00:00:01.0382
Time: 00:00:02.1114
Time: 00:00:02.4375
Time: 00:00:04.3125
Time: 00:00:09.0887
Time: 00:00:17.9730
...
How could I calculate a vague estimated time for this case?
Are you trying to prove that bad practices can impact your code's performance? If not, and you really want to measure execution time, first use a Stopwatch (create it once and reset it after the inner loop finishes); it is much better suited to measuring durations than comparing DateTime.Now values. Next, File.AppendAllText opens and closes a stream to the file on every call. It would be much better to open the stream once, write the data you want, and close it once afterwards.
Could you elaborate on what you are actually trying to achieve? I can't really tell what you are asking. You're doing exponentially more work, so with your implementation the time also rises exponentially. If I understand correctly, you want the average time of writing a single space to the file. To get that, you have to compensate for the number of samples. I'd implement it the following way:
// Requires: using System; using System.Diagnostics; using System.IO; using System.Linq;
static void Main()
{
var stopwatch = new Stopwatch();
var samples = new double[30];
for (var i = 0; i < 30; i++)
{
stopwatch.Start();
// File.OpenWrite creates the file if it doesn't exist
// Move these usings outside of the loop if you don't want to measure opening/closing stream to file
using (var fileStream = File.OpenWrite("D:\\1G.txt"))
using (var streamWriter = new StreamWriter(fileStream))
{
// Option A
// This will create a string with desired number of spaces,
// no internal loop necessary, but allocates a lot of memory
streamWriter.Write(new string(' ', 1 << i));
// Option B
// If you insist on creating a loop
//for (int j = 0; j < 1 << i; j++)
//{
// streamWriter.Write(' ');
//}
}
stopwatch.Stop();
// Stopwatch ticks are not necessarily TimeSpan ticks, so read Elapsed directly
var writeDurationTimeSpan = stopwatch.Elapsed;
var writeDurationInMs = writeDurationTimeSpan.TotalMilliseconds;
var singleSpaceWriteDurationInMs = writeDurationInMs / (1 << i);
samples[i] = singleSpaceWriteDurationInMs;
Console.WriteLine("i = {0}, 1<<i = {1}, Execution duration: {2} ms, Single space execution duration: {3} ms",
i,
1 << i,
writeDurationInMs,
singleSpaceWriteDurationInMs.ToString("F20").TrimEnd('0')
);
stopwatch.Reset();
}
Console.WriteLine("Average single space invocation time: {0} ms",
samples.Average().ToString("F20").TrimEnd('0')
);
}
By the way I really recommend using BenchmarkDotNet for benchmarks, execution time measuring etc. Do give it a try - it's a fantastic library.
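For the estimate the question asks about, one possible approach (not taken from the answer above) is to treat the recent durations as a geometric series: derive an average growth ratio from the last few completed iterations and extrapolate it over the iterations that remain. The names EtaEstimator and EstimateRemainingSeconds and the window parameter are hypothetical, and the result is only as good as the assumption that growth stays roughly constant.

```csharp
using System;
using System.Collections.Generic;

public static class EtaEstimator
{
    // completed: duration in seconds of each finished iteration, in order.
    // Returns the estimated seconds still needed, or NaN if no estimate is possible.
    public static double EstimateRemainingSeconds(IReadOnlyList<double> completed, int totalIterations, int window = 4)
    {
        int done = completed.Count;
        if (done >= totalIterations) return 0.0;
        if (done < 2) return double.NaN;

        // Average growth factor per iteration over the last `window` steps
        // (the geometric mean of the individual ratios).
        int first = Math.Max(0, done - 1 - window);
        if (completed[first] <= 0.0) return double.NaN;
        double ratio = Math.Pow(completed[done - 1] / completed[first], 1.0 / (done - 1 - first));

        // Extrapolate: each remaining iteration is `ratio` times the previous one.
        double remaining = 0.0;
        double next = completed[done - 1];
        for (int i = done; i < totalIterations; i++)
        {
            next *= ratio;
            remaining += next;
        }
        return remaining;
    }
}
```

Using a short window rather than all samples keeps the noisy early timings (a few milliseconds each) from distorting the ratio.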

Move specific set of lines from a Flat File into another

I am trying to dump the contents of a file into another file using c#. Not the entire content but only specific set of lines. I read the file into an array.
Now what I want to do is I want to remove certain number of lines, say from a total of 50 lines of file, first 10 lines and bottom 20 lines are to be excluded.
My code looks like
System.IO.StreamWriter file = new System.IO.StreamWriter(@"C:\manoj\File.txt");
string[] lines = System.IO.File.ReadAllLines(@"C:\manoj\sample.txt");
for (i = 10; i <= 30; i++)
{
foreach(string line in lines)
{
file.writeline(line[i]);
}
}
Index is out of bound for the array is the error that I am getting.
Can someone please advise me?
For each line of the file, you are accessing each character at index 10 through 30 (line[i] indexes into the string, not into the array of lines).
As soon as a line has fewer than 31 characters, the program raises the error you get.
You should not nest the loop that counts to 30 inside the loop that iterates over the lines in your file. Try this:
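var total_lines = lines.Length; // lines is a string[], so use Length
var linecount = 0;
foreach(string line in lines) {
linecount++;
// keep a line only if it is past the first 10 AND before the last 20
if (linecount > 10 && linecount <= total_lines - 20) {
file.WriteLine(line);
}
}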
This uses the linecount variable to count your lines, and then selectively outputs based on the value of that variable.
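The same selection can also be expressed with LINQ's Skip and Take. This is a minimal sketch with hypothetical names (LineSlicer, MiddleLines), assuming the whole file fits in memory as it already does with ReadAllLines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineSlicer
{
    // Returns the lines that remain after dropping skipTop lines from the start
    // and skipBottom lines from the end. Returns an empty list if nothing is left.
    public static List<string> MiddleLines(IEnumerable<string> lines, int skipTop, int skipBottom)
    {
        var all = lines.ToList();
        int keep = all.Count - skipTop - skipBottom;
        return all.Skip(skipTop).Take(Math.Max(keep, 0)).ToList();
    }
}
```

The result can be passed straight to System.IO.File.WriteAllLines together with the output path, which also takes care of closing the output file.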

C# add line numbers to a text file

I am trying to read a text file in C# and add line numbers to the lines.
This my input file:
This is line one
this is line two
this is line three
And this should be the output:
1 This is line one
2 this is line two
3 this is line three
This is my code so far:
class Program
{
public static void Main()
{
string path = Directory.GetCurrentDirectory() + @"\MyText.txt";
StreamReader sr1 = File.OpenText(path);
string s = "";
while ((s = sr1.ReadLine()) != null)
{
for (int i = 1; i < 4; i++)
Console.WriteLine(i + " " + s);
}
sr1.Close();
Console.WriteLine();
StreamWriter sw1 = File.AppendText(path);
for (int i = 1; i < 4; i++)
{
sw1.WriteLine(s);
}
sw1.Close();
}
}
I am 90% sure I need to use for cycle to get the line numbers there but so far with this code I get this output in the console:
1 This is line one
2 This is line one
3 This is line one
1 this is line two
2 this is line two
3 this is line two
1 this is line three
2 this is line three
3 this is line three
And this is in the output file:
This is line number one.
This is line number two.
This is line number three.1
2
3
I am not sure why the string variable s is not used when writing in the file even though it is defined earlier (another block, another rules maybe?).
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
namespace AppendText
{
class Program
{
public static void Main()
{
string path = Directory.GetCurrentDirectory() + @"\MyText.txt";
StreamReader sr1 = File.OpenText(path);
string s = "";
int counter = 1;
StringBuilder sb = new StringBuilder();
while ((s = sr1.ReadLine()) != null)
{
var lineOutput = counter++ + " " + s;
Console.WriteLine(lineOutput);
sb.AppendLine(lineOutput); // AppendLine keeps each numbered line on its own row
}
sr1.Close();
Console.WriteLine();
StreamWriter sw1 = File.AppendText(path);
sw1.Write(sb);
sw1.Close();
}
}
}
IEnumerable<string> lines = File.ReadLines(file)
.Select((line,i)=>(i + 1) + " " + line) // i is zero-based, so add 1
.ToList();
File.WriteAllLines(file, lines);
OPEN STREAM
read the whole line and store it in a temp variable.
Use a counter to keep track which line you have read.
concatenate the counter with the temp variable.
save it to the file.
move your line pointer to next line and
repeat.
THEN CLOSE YOUR STREAM
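The steps above can be sketched roughly as follows. The names LineNumberer and NumberLines are hypothetical, and the sketch writes to a separate writer rather than back into the file being read, so the input is never modified while it is still open.

```csharp
using System;
using System.IO;

public static class LineNumberer
{
    // Copies every line from input to output, prefixed with a 1-based line number.
    public static void NumberLines(TextReader input, TextWriter output)
    {
        int counter = 1;   // lives outside the loop so it keeps counting across lines
        string line;
        while ((line = input.ReadLine()) != null)
        {
            output.WriteLine(counter + " " + line);
            counter++;
        }
    }
}
```

With files, the reader and writer would be a StreamReader and a StreamWriter opened before the call and closed after it, for example inside using blocks.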
I could give you the working code, but because this is homework I will just ask questions that should lead you to the right answer:
Why do you close the StreamReader inside your while loop? You still access it afterwards, which can cause an error.
Why do you write to your StreamWriter without the prepended index?
Why do you open the StreamWriter inside the loop? Wouldn't it be better to open the StreamWriter and StreamReader outside the loop, do your job in the loop, and then close the streams?
You need to prepend the line number to each line string. Check out String.Format. Also, try a counter variable that sits outside the while loop to keep the line number count.
Hopefully that's enough to get you on the right path without handing you the exact answer.
Are you sure you want to close the stream inside the while loop?
Watch out for the for loops: you put them inside the while, so basically you are saying:
while ((s = sr1.ReadLine()) != null)
Every row read
for (int i = 1; i < 4; i++)
Repeat 3 times a write.
Also, you are closing the stream inside the while, so after the first row read.
Here is one major issue for you:
for (int i = 1; i < 4; i++)
Console.WriteLine(i + " " + s);
}
You are closing the for loop with a curly brace but not using a curly brace to open it. This means that the curly brace quoted above is actually closing the while loop so you loop through doing all the console.writeline and then when you come to writing to the file you are actually not reading from the file at all - s is "" due to scoping.
An alternative to @Hasan's answer for in-memory strings as a one-liner:
static string AddLineNumbers(string input) =>
String.Join('\n', input.Split('\n').Select((text, i) => $"{i+1}: {text}"));

file handling in C# .net

There is a list of things I want to do. I have a forms application.
Go to a particular line. I know how to read lines sequentially, but is there a way to jump to a particular line number?
Find out the total number of lines.
If the file is not too big, you can try the ReadAllLines.
This reads the whole file, into a string array, where every line is an element of the array.
Example:
var fileName = @"C:\MyFolder\MyFileName.txt";
var contents = System.IO.File.ReadAllLines(fileName);
Console.WriteLine("Line: 10: " + contents[9]);
Console.WriteLine("Number of lines:");
Console.WriteLine(contents.Length);
But be aware: This reads in the whole file into memory.
If the file is too big:
Open the file (OpenText), and create a Dictionary to store the offset of every line. Scan every line, and store the offset. Now you can go to every line, and you have the number of lines.
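var lineOffset = new Dictionary<int, long>();
using (var rdr = System.IO.File.OpenText(fileName)) {
int lineNr = 0;
lineOffset.Add(0,0);
while (rdr.ReadLine() != null) {
lineNr++;
// Caveat: StreamReader reads ahead in buffered blocks, so BaseStream.Position
// can be past the end of the line that was just returned.
lineOffset.Add(lineNr, rdr.BaseStream.Position);
}
// Goto line 10
rdr.BaseStream.Position = lineOffset[10];
rdr.DiscardBufferedData(); // drop buffered text so the next read starts at the new position
var line10 = rdr.ReadLine();
}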
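Because StreamReader buffers its input, the position of the underlying stream does not track line boundaries exactly. One way to build the offset index described above is to scan the raw bytes for line feeds instead. The sketch below uses hypothetical names (LineIndex, BuildLineOffsets, ReadLineAt) and assumes a seekable stream containing ASCII or UTF-8 text with '\n' or "\r\n" line endings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineIndex
{
    // Returns the byte offset at which each line starts; the list's Count is the line count.
    public static List<long> BuildLineOffsets(Stream stream)
    {
        var offsets = new List<long>();
        var buffer = new byte[64 * 1024];
        long position = 0;        // byte offset of buffer[0] within the stream
        bool atLineStart = true;  // true when the next byte begins a new line
        int read;
        stream.Position = 0;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (atLineStart) { offsets.Add(position + i); atLineStart = false; }
                // In UTF-8 the byte 0x0A only ever occurs as a real line feed.
                if (buffer[i] == (byte)'\n') atLineStart = true;
            }
            position += read;
        }
        return offsets;
    }

    // Seeks to a recorded offset and reads the single line that starts there.
    public static string ReadLineAt(Stream stream, long offset)
    {
        stream.Position = offset;
        // A fresh reader per call avoids stale buffered data; leaveOpen keeps the stream usable.
        using (var reader = new StreamReader(stream, Encoding.UTF8, false, 4096, leaveOpen: true))
        {
            return reader.ReadLine();
        }
    }
}
```

For a file, the stream would typically come from System.IO.File.OpenRead(fileName); line 10 is then ReadLineAt(stream, offsets[9]).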
This would help for your first point: jump into file line c#
