How to read text file lines after Specific lines using StreamReader - c#

I have a text file that I am reading with a StreamReader. I don't want to read lines again that I have already read on a previous pass. I added File.ReadLines(FileToCopy).Count(); to get the number of lines read the first time, and now I want to start reading after that many lines.
Here is my code.
string FileToCopy = "E:\\vikas\\call.txt";
if (System.IO.File.Exists(FileToCopy) == true)
{
lineCount = File.ReadLines(FileToCopy).Count();
using (StreamReader reader = new StreamReader(FileToCopy))
{
}
}
What condition do I need to specify here? Please help me.
while ((line = reader.ReadLine()) != null)
{
var nextLines = File.ReadLines(FileToCopy).Skip(lineCount);
if (line != "")
{
}

There's a much faster way to do this that doesn't require you to read the entire file in order to get to the point where you left off. The key is to keep track of the file's length. Then you open the file as a FileStream, position to the previous length (i.e. the end of where you read before), and then create a StreamReader. So it looks like this:
long previousLength = 0;
Then, when you want to copy new stuff:
using (var fs = File.OpenRead(FileToCopy))
{
// position to just beyond where you read before
fs.Position = previousLength;
// and update the length for next time
previousLength = fs.Length;
// now open a StreamReader and read
using (var sr = new StreamReader(fs))
{
while (!sr.EndOfStream)
{
var line = sr.ReadLine();
// do something with the line
}
}
}
This will save you huge amounts of time if the file gets large. For example, if the file was a gigabyte in size the last time you read it, File.ReadLines(filename).Skip(count) has to read through that whole gigabyte again (on the order of 20 seconds) before it reaches the new lines. The method I described above seeks directly to the old end of the file, so it will take much less time, probably less than a second.
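As a rough sketch of how the seek-to-previous-length idea could be packaged, the class below remembers the byte offset between calls. The name IncrementalLineReader and its members are hypothetical, not part of the answer above. It assumes the file is only ever appended to and that the writer always appends complete lines. Unlike the snippet above, it records the offset after reading, so that bytes appended during the read are not returned twice.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: returns only the lines appended since the previous call.
public sealed class IncrementalLineReader
{
    private readonly string _path;
    private long _position; // byte offset where the previous read stopped

    public IncrementalLineReader(string path)
    {
        _path = path;
    }

    public List<string> ReadNewLines()
    {
        var lines = new List<string>();

        // FileShare.ReadWrite lets another process keep appending while we read.
        using (var fs = new FileStream(_path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            // Assumption: a shorter file means it was truncated or replaced, so start over.
            if (fs.Length < _position)
                _position = 0;

            // Jump straight to where the previous read ended.
            fs.Position = _position;

            using (var sr = new StreamReader(fs))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                    lines.Add(line);

                // The reader has consumed everything up to the end of the stream,
                // so the stream position is the offset to resume from next time.
                _position = fs.Position;
            }
        }

        return lines;
    }
}
```

Typical use would be to keep one instance per file and call ReadNewLines() each time you want the newly appended lines.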

This:
lineCount = File.ReadLines(FileToCopy).Count();
will return the total line count of your file, which is no use to you here. You need to store the number of lines that you have already read from the file. Then, every time you read again, use the Skip method:
var nextLines = File.ReadLines(filePath).Skip(lineCount);
You don't need a StreamReader here. For example, if you read 10 lines the first time:
var lines = File.ReadLines(filePath).Take(10);
lineCount += 10;
The second time, skip the first 10 lines, read more, and update lineCount:
var nextLines = File.ReadLines(filePath).Skip(lineCount).Take(20);
lineCount += 20;
More generally, you can write a method for this and call it whenever you want to read the next lines:
public static string[] ReadFromFile(string filePath, int count, ref int lineCount)
{
// skip the lines already read, then take the next batch
string[] lines = File.ReadLines(filePath).Skip(lineCount).Take(count).ToArray();
// advance the counter only after reading, by the number of lines actually returned
lineCount += lines.Length;
return lines;
}
private static int lineCount = 0;
private static void Main(string[] args)
{
// read the first ten lines
string[] lines = ReadFromFile("sample.txt", 10, ref lineCount);
// read the next 30 lines
string[] otherLines = ReadFromFile("sample.txt", 30, ref lineCount);
}
I hope you get the idea.
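One thing to keep in mind with Skip is that every call re-reads the file from the first line. If all the batches are read within a single run of the program, a possible alternative is to keep one enumerator open and pull successive batches from it. This is only a sketch; BatchLineReader and ReadNext are hypothetical names, and it assumes the file is not modified while the reader is open.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: hands out consecutive batches of lines from one open enumerator.
public sealed class BatchLineReader : IDisposable
{
    private readonly IEnumerator<string> _lines;

    public BatchLineReader(string path)
    {
        // File.ReadLines streams the file lazily; we keep its enumerator between calls.
        _lines = File.ReadLines(path).GetEnumerator();
    }

    // Returns up to 'count' further lines; fewer (or none) once the file is exhausted.
    public List<string> ReadNext(int count)
    {
        var batch = new List<string>(count);
        while (batch.Count < count && _lines.MoveNext())
            batch.Add(_lines.Current);
        return batch;
    }

    public void Dispose()
    {
        // Disposing the enumerator closes the underlying file.
        _lines.Dispose();
    }
}
```

Wrapped in a using statement, ReadNext(10) followed by ReadNext(30) returns lines 1 to 10 and then lines 11 to 40 without rescanning the file.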

Just read lineCount lines from your new stream:
for(int n=0; n<lineCount; n++)
{
reader.ReadLine();
}
That is the easiest method, when you have to actually skip N lines (not N bytes).

Related

Reading the 2 last line from a text

I am new to C# and I am working on an app that displays the time difference between the two dates on the last two lines of a text file.
I want to read the second-to-last line of a text file. I already know how to read the last line, but I also need the one before it.
This is my code:
var lastLine = File.ReadAllLines("C:\\test.log").Last();
richTextBox1.Text = lastLine.ToString();
All the previous answers eagerly load the whole file into memory before returning the requested last lines. This can be an issue if the file is big. Luckily, it is easily avoidable.
public static IEnumerable<string> ReadLastLines(string path, int count)
{
if (count < 1)
return Enumerable.Empty<string>();
var queue = new Queue<string>(count);
foreach (var line in File.ReadLines(path))
{
if (queue.Count == count)
queue.Dequeue();
queue.Enqueue(line);
}
return queue;
}
This keeps only the last count lines read in memory, avoiding memory issues with large files.
Since
File.ReadAllLines("C:\\test.log");
returns an array you can take the last two items of the array:
var data = File.ReadAllLines("C:\\test.log");
string last = data[data.Length - 1];
string lastButOne = data[data.Length - 2];
In the general case, with long files (which is why ReadAllLines is a bad choice), you can implement
public static partial class EnumerableExtensions {
public static IEnumerable<T> Tail<T>(this IEnumerable<T> source, int count) {
if (null == source)
throw new ArgumentNullException("source");
else if (count < 0)
throw new ArgumentOutOfRangeException("count");
else if (0 == count)
yield break;
Queue<T> queue = new Queue<T>(count + 1);
foreach (var item in source) {
queue.Enqueue(item);
if (queue.Count > count)
queue.Dequeue();
}
foreach (var item in queue)
yield return item;
}
}
...
var lastTwolines = File
.ReadLines("C:\\test.log") // Not all lines
.Tail(2);
You can try to do this
var lastLines = File.ReadAllLines("C:\\test.log").Reverse().Take(2).Reverse();
But depending on how large your file is there are probably more efficient methods to process this than reading all lines at once. See Get last 10 lines of very large text file > 10GB and How to read last “n” lines of log file
Simply store the result of ReadAllLines in a variable and then take the last two:
var allText = File.ReadAllLines("C:\\test.log");
var lastLines = allText.Skip(allText.Length - 2);
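You can use Skip() like this:
var allLines = File.ReadAllLines("C:\\test.log");
var lastTwo = allLines.Skip(allLines.Length - 2);
// join the two lines; calling ToString() on the array would only print its type name
richTextBox1.Text = string.Join(Environment.NewLine, lastTwo);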
You can use a StreamReader in combination with a Queue<string>, since you have to read the whole file either way.
// if you want to keep more lines, change this to the amount of lines you want
const int LINES_KEPT = 2;
Queue<string> meQueue = new Queue<string>();
using ( StreamReader reader = new StreamReader(File.OpenRead("C:\\test.log")) )
{
string line = string.Empty;
while ( ( line = reader.ReadLine() ) != null )
{
if ( meQueue.Count == LINES_KEPT )
meQueue.Dequeue();
meQueue.Enqueue(line);
}
}
Now you can use these 2 lines like this:
string line1 = meQueue.Dequeue();
string line2 = meQueue.Dequeue(); // <-- this is the last line.
Or, to add them to the RichTextBox:
richTextBox1.Text = string.Empty; // clear the text
while ( meQueue.Count != 0 )
{
richTextBox1.Text += meQueue.Dequeue() + Environment.NewLine; // add all lines in the same order as they were in the file
}
Using File.ReadAllLines reads the whole text, and LINQ then iterates over the lines that were already read. This method does everything in one pass.
string line;
string[] lines = new string[]{"",""};
int index = 0;
using ( StreamReader reader = new StreamReader(File.OpenRead("C:\\test.log")) )
{
while ( ( line = reader.ReadLine() ) != null )
{
lines[index] = line;
index = 1-index;
}
}
// second-to-last line = lines[index]
// last line = lines[1 - index]

How to improve performance of accessing lines in IEnumerable<T> File.ReadLines()

I am loading a file using File.ReadLines method (Files could get very large so I used this rather than ReadAllLines)
I need to access each line and perform an action on it. So my code is like this
IEnumerable<String> lines = File.ReadLines(@"c:\myfile.txt", new UTF8Encoding());
StringBuilder sb = new StringBuilder();
int totalLines = lines.Count(); //used for progress calculation
//use for instead of foreach here - easier to know the line I'm on for progress percent complete calculation
for(int i = 0; i < totalLines; i++){
//for example get the line and do something
sb.Append(lines.ElementAt(i) + "\r\n");
//get the line again using ElementAt(i) and do something else
//...ElementAt(I)...
}
So my bottleneck is that each call to ElementAt(i) has to iterate over the IEnumerable from the start to reach position i.
Is there any way to keep using File.ReadLines, but improve this somehow?
EDIT - the reason I count at the beginning is so I can calculate progress complete for display to the user. Which is why I removed foreach in favor of the for.
How about using foreach? It's designed to handle exactly this situation.
IEnumerable<String> lines = File.ReadLines(@"c:\myfile.txt", new UTF8Encoding());
StringBuilder sb = new StringBuilder();
string previousLine = null;
int lineCounter = 0;
int totalLines = lines.Count();
foreach (string line in lines) {
// show progress
float done = (float)++lineCounter / totalLines; // cast first, otherwise integer division yields 0
Debug.WriteLine($"{done*100:0.00}% complete");
//get the line and do something
sb.AppendLine(line);
//do something else, like look at the previous line to compare
if (line == previousLine) {
Debug.WriteLine($"Line {lineCounter} is the same as the previous line.");
}
previousLine = line;
}
Sure, you can use a foreach instead of the for loop, so you don't have to go back and reference the line via its index:
foreach (string line in lines)
{
sb.AppendLine(line);
}
You will also no longer need the int totalLines = lines.Count(); line because you don't need the count for anything (unless you're using it somewhere you're not showing).
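If the count is only needed for the progress display, one option is to derive progress from the position in the underlying stream instead, which avoids reading the file twice. The following is a minimal sketch under that assumption. ProgressReader and ProcessWithProgress are hypothetical names. Because StreamReader reads ahead in buffer-sized blocks, the reported percentage is approximate and moves in steps.

```csharp
using System;
using System.IO;
using System.Text;

public static class ProgressReader
{
    // Hypothetical helper: calls onLine for every line and onProgress with an
    // approximate percentage based on how many bytes have been consumed.
    public static void ProcessWithProgress(string path, Action<string> onLine, Action<double> onProgress)
    {
        using (var fs = File.OpenRead(path))
        using (var reader = new StreamReader(fs, Encoding.UTF8))
        {
            long totalBytes = fs.Length;
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                onLine(line);

                // fs.Position reflects what the reader has buffered so far,
                // so it can run ahead of the line currently being processed.
                double percent = totalBytes == 0 ? 100.0 : 100.0 * fs.Position / totalBytes;
                onProgress(percent);
            }
        }
    }
}
```

A caller could pass line => sb.AppendLine(line) as onLine and update a progress bar in onProgress.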

CSV File Splitting with specific size

Hi guys, I have a function that creates multiple CSV files from a DataTable in smaller chunks, based on a size passed through an app.config key/value pair.
Issues with the code below:
I've hardcoded the unit to 1 KB, so when I pass a value of 20 it should create CSV files of 20 KB. Currently it creates files of 5 KB for the same value.
No file is created for the remaining records at the end.
Kindly help me fix this. Thanks!
Code:
public static void CreateCSVFile(DataTable dt, string CSVFileName)
{
int size = Int32.Parse(ConfigurationManager.AppSettings["FileSize"]);
size *= 1024; //1 KB size
string CSVPath = ConfigurationManager.AppSettings["CSVPath"];
StringBuilder FirstLine = new StringBuilder();
StringBuilder records = new StringBuilder();
int num = 0;
int length = 0;
IEnumerable<string> columnNames = dt.Columns.Cast<DataColumn>().Select(column => column.ColumnName);
FirstLine.AppendLine(string.Join(",", columnNames));
records.AppendLine(FirstLine.ToString());
length += records.ToString().Length;
foreach (DataRow row in dt.Rows)
{
//Putting field values in double quotes
IEnumerable<string> fields = row.ItemArray.Select(field =>
string.Concat("\"", field.ToString().Replace("\"", "\"\""), "\""));
records.AppendLine(string.Join(",", fields));
length += records.ToString().Length;
if (length > size)
{
//Create a new file
num++;
File.WriteAllText(CSVPath + CSVFileName + DateTime.Now.ToString("yyyyMMddHHmmss") + num.ToString("_000") + ".csv", records.ToString());
records.Clear();
length = 0;
records.AppendLine(FirstLine.ToString());
}
}
}
Use File.ReadLines. It uses deferred execution, so lines are read one at a time as you enumerate them.
foreach(var line in File.ReadLines(FilePath))
{
// logic here.
}
From MSDN
The ReadLines and ReadAllLines methods differ as follows: When you use
ReadLines, you can start enumerating the collection of strings before
the whole collection is returned; when you use ReadAllLines, you must
wait for the whole array of strings be returned before you can access
the array. Therefore, when you are working with very large files,
ReadLines can be more efficient.
Now so, you could rewrite your method as below.
public static void SplitCSV(string FilePath, string FileName)
{
//Read Specified file size
int size = Int32.Parse(ConfigurationManager.AppSettings["FileSize"]);
size *= 1024 * 1024; //1 MB size
int total = 0;
int num = 0;
string FirstLine = null; // header to new file
var writer = new StreamWriter(GetFileName(FileName, num));
// Loop through all source lines
foreach (var line in File.ReadLines(FilePath))
{
if (string.IsNullOrEmpty(FirstLine)) FirstLine = line;
// Length of current line
int length = line.Length;
// See if adding this line would exceed the size threshold
if (total + length >= size)
{
// Create a new file
num++;
total = 0;
writer.Dispose();
writer = new StreamWriter(GetFileName(FileName, num));
writer.WriteLine(FirstLine);
length += FirstLine.Length;
}
// Write the line to the current file
writer.WriteLine(line);
// Add length of line (in characters, which only approximates bytes) to running size
total += length;
// Add size of newlines
total += Environment.NewLine.Length;
}
// Flush and close the last file
writer.Dispose();
}
The solution is quite simple... you don't need to put all your lines in memory (as you do in string[] arr = File.ReadAllLines(FilePath);).
Instead, create a StreamReader on the input file and read it line by line into a line buffer. When the buffer exceeds your threshold size, write it to disk as a single CSV file. The code should be something like this:
using (var sr = new System.IO.StreamReader(filePath))
{
var linesBuffer = new List<string>();
while (sr.Peek() >= 0)
{
linesBuffer.Add(sr.ReadLine());
if (linesBuffer.Count > yourThreshold)
{
// TODO: implement function WriteLinesToPartialCsv
WriteLinesToPartialCsv(linesBuffer);
// Clear the buffer:
linesBuffer.Clear();
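// Try forcing C# to clear the memory (usually unnecessary, the GC reclaims it on its own):
GC.Collect();
}
}
// Write whatever is left in the buffer after the last full chunk
if (linesBuffer.Count > 0)
WriteLinesToPartialCsv(linesBuffer);
}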
As you can see, by reading the stream line by line (instead of loading the whole CSV input file, as your code did) you have better control over memory usage.
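Two things in the question's loop appear to cause the reported symptoms. First, length += records.ToString().Length adds the size of the whole buffer on every row, so the running total grows much faster than the real size. Second, nothing is written after the loop, so the final partial chunk is lost. The sketch below shows one way to address both, measuring each line once in UTF-8 bytes. CsvChunker and SplitBySize are hypothetical names, and UTF-8 output with Environment.NewLine line endings is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvChunker
{
    // Hypothetical helper: groups data lines into chunks, each starting with the header.
    // A chunk is closed when the next line would push its UTF-8 size past maxBytes.
    // A single line larger than maxBytes still gets a chunk of its own.
    public static IEnumerable<List<string>> SplitBySize(string header, IEnumerable<string> dataLines, int maxBytes)
    {
        Encoding utf8 = Encoding.UTF8;
        int newlineBytes = utf8.GetByteCount(Environment.NewLine);
        int headerBytes = utf8.GetByteCount(header) + newlineBytes;

        var chunk = new List<string> { header };
        int chunkBytes = headerBytes;

        foreach (string line in dataLines)
        {
            // Measure only the current line, not the whole buffer.
            int lineBytes = utf8.GetByteCount(line) + newlineBytes;

            // Start a new chunk if this line would overflow one that already holds data.
            if (chunk.Count > 1 && chunkBytes + lineBytes > maxBytes)
            {
                yield return chunk;
                chunk = new List<string> { header };
                chunkBytes = headerBytes;
            }

            chunk.Add(line);
            chunkBytes += lineBytes;
        }

        // Emit the remaining rows so the last partial chunk is not dropped.
        if (chunk.Count > 1)
            yield return chunk;
    }
}
```

Each returned chunk can then be written with File.WriteAllLines using a numbered file name, as in the question.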

How to Stream string data from a txt file into an array

I'm doing this exercise from a lab. The instructions are as follows:
This method should read the product catalog from a text file called “catalog.txt” that you should
create alongside your project. Each product should be on a separate line. Use the instructions in the video to create the file and add it to your project, and to return an
array with the first 200 lines from the file (use the StreamReader class and a while loop to read
from the file). If the file has more than 200 lines, ignore them. If the file has less than 200 lines,
it’s OK if some of the array elements are empty (null).
I don't understand how to stream data into the string array. Any clarification would be greatly appreciated!
static string[] ReadCatalogFromFile()
{
//create instance of the catalog.txt
StreamReader readCatalog = new StreamReader("catalog.txt");
//store the information in this array
string[] storeCatalog = new string[200];
int i = 0;
//test and store the array information
while (storeCatalog != null)
{
//store each string in the elements of the array?
storeCatalog[i] = readCatalog.ReadLine();
i = i + 1;
if (storeCatalog != null)
{
//test to see if its properly stored
Console.WriteLine(storeCatalog[i]);
}
}
readCatalog.Close();
Console.ReadLine();
return storeCatalog;
}
Here are some hints:
int i = 0;
This needs to stay outside your loop (inside the loop it would be reset to 0 on each iteration).
In your while() you should check the result of readCatalog.ReadLine() and/or the maximum number of lines to read (i.e. the size of your array).
Thus: if you have reached the end of the file, stop; if your array is full, stop.
static string[] ReadCatalogFromFile()
{
var lines = new string[200];
using (var reader = new StreamReader("catalog.txt"))
for (var i = 0; i < 200 && !reader.EndOfStream; i++)
lines[i] = reader.ReadLine();
return lines;
}
A for-loop is used when you know the number of iterations beforehand. Here you can say it should iterate at most 200 times, so you won't go past the array bounds. At the moment you just check that your array isn't null, which it never will be.
using(var readCatalog = new StreamReader("catalog.txt"))
{
string[] storeCatalog = new string[200];
for(int i = 0; i<200; i++)
{
string temp = readCatalog.ReadLine();
if(temp != null)
storeCatalog[i] = temp;
else
break;
}
return storeCatalog;
}
As soon as there are no more lines in the file, temp will be null and the loop will be stopped by the break.
I suggest you use your disposable resources (like any stream) in a using statement. After the operations in the braces, the resource will automatically get disposed.

Reading input from a text file, converting into int, and storing into array.

I need some help. I have a text file that looks like so:
21,M,S,1
22,F,M,2
19,F,S,3
65,F,M,4
66,M,M,4
What I need to do is put the first column into an array int[] age and the last column into an array int[] districts. This is for a college project due in a week. I've been having a lot of trouble trying to figure this out, and any help would be greatly appreciated. I did search for an answer already but didn't find anything that I understood. I also cannot use anything we haven't learned from the book, which rules out List<T> and the like.
FileStream census = new FileStream("census.txt", FileMode.Open, FileAccess.Read);
StreamReader inFile = new StreamReader(census);
string input = "";
string[] fields;
int[] districts = new int[SIZE];
int[] ageGroups = new int[SIZE];
input = inFile.ReadLine();
while (input != null)
{
fields = input.Split(',');
for (int i = 0; i < 1; i++)
{
int x = int.Parse(fields[i]);
districts[i] = x;
}
input = inFile.ReadLine();
}
Console.WriteLine(districts[0]);
If your file contains nothing but this, then File.ReadAllLines() will return a string array in which each element is one line of your file. Having done that, you can use the length of the returned array to initialize the other two arrays, into which the data will be stored.
Once you have your string array, call string.Split() on each element with ',' as the delimiter. That gives you another array of strings without the commas. Then take the values you want by their index positions, 0 and 3 respectively, and store them. Your code would look something like this:
//you will need to replace path with the actual path to the file.
string[] file = File.ReadAllLines("path");
int[] age = new int[file.Length];
int[] districts = new int[file.Length];
int counter = 0;
foreach (var item in file)
{
string[] values = item.Split(',');
age[counter] = Convert.ToInt32(values[0]);
districts[counter] = Convert.ToInt32(values[3]);
counter++;
}
The proper way of writing this code:
Write down each step you're trying to perform:
// open file
// for each line
// parse line
Then refine "parse line"
// split by fields
// parse and handle age
// parse and handle gender
// parse and handle marital status
// parse and handle ....
Then start writing the missing code.
At that point you should figure out that iterating through the fields of a single record is not going to do you any good, as all the fields have different meanings.
So you'll need to remove the for loop and replace it with field-by-field parsing/assignments.
Instead of looping through all your fields, simply refer to the actual index of the field:
Wrong:
for (int i = 0; i < 1; i++)
{
int x = int.Parse(fields[i]);
districts[i] = x;
}
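Right:
// declare int i = 0 before the while loop; it counts the rows read so far
ageGroups[i] = int.Parse(fields[0]); // first column: age
districts[i] = int.Parse(fields[3]); // last column: district
i++;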
So I just put together something quick to do what you are after. I don't like hardcoding the indexes for the split, but since you can't use a list this is what you get:
FileStream census = new FileStream(path, FileMode.Open, FileAccess.Read);
StreamReader inFile = new StreamReader(census);
int[] districts = new int[1024];
int[] ageGroups = new int[1024];
int counter = 0;
string line;
while ((line = inFile.ReadLine()) != null)
{
string[] splitString = line.Split(',');
int.TryParse(splitString[0], out ageGroups[counter]);
int.TryParse(splitString[3], out districts[counter]);
counter++;
}
This will give you two arrays, districts and ageGroups, each of length 1024. The first counter elements contain the values for each row in the census.txt file, and the rest stay 0. The loop assumes the file has no more than 1024 rows.
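If the unused tail of the fixed-size arrays is a problem, one possibility that still avoids lists is to shrink the arrays with Array.Resize once the number of rows is known. The sketch below assumes that Array.Resize and int.TryParse are allowed by the course. CensusParser and Parse are hypothetical names, and the capacity is an assumed upper bound on the number of rows. Malformed rows are skipped instead of being stored as zeros.

```csharp
using System;
using System.IO;

public static class CensusParser
{
    // Hypothetical helper: reads "age,gender,status,district" rows from any TextReader
    // and returns arrays trimmed to the number of valid rows.
    public static void Parse(TextReader reader, int capacity, out int[] ages, out int[] districts)
    {
        int[] ageBuffer = new int[capacity];
        int[] districtBuffer = new int[capacity];
        int count = 0;

        string line;
        while (count < capacity && (line = reader.ReadLine()) != null)
        {
            string[] fields = line.Split(',');

            // Skip rows that do not have four fields or whose numbers do not parse.
            int age;
            int district;
            if (fields.Length < 4) continue;
            if (!int.TryParse(fields[0], out age)) continue;
            if (!int.TryParse(fields[3], out district)) continue;

            ageBuffer[count] = age;
            districtBuffer[count] = district;
            count++;
        }

        // Shrink both arrays so their Length equals the number of rows stored.
        Array.Resize(ref ageBuffer, count);
        Array.Resize(ref districtBuffer, count);

        ages = ageBuffer;
        districts = districtBuffer;
    }
}
```

For the file in the question, it could be called with new StreamReader("census.txt") inside a using statement and a capacity of 1024.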
