Reading the last two lines from a text file - C#

I am new to C# and I am working on an app that displays the time difference between the two dates on the last two lines of a text file.
I want to read the second-to-last line of the file. I already know how to read the last line, but I also need the one before it.
This is my code:
var lastLine = File.ReadAllLines("C:\\test.log").Last();
richTextBox1.Text = lastLine.ToString();

All the previous answers eagerly load the whole file into memory before returning the requested last lines. This can be an issue if the file is big. Luckily, it is easily avoidable.
public static IEnumerable<string> ReadLastLines(string path, int count)
{
if (count < 1)
return Enumerable.Empty<string>();
var queue = new Queue<string>(count);
foreach (var line in File.ReadLines(path))
{
if (queue.Count == count)
queue.Dequeue();
queue.Enqueue(line);
}
return queue;
}
This will only keep in memory the last n read lines avoiding memory issues with large files.
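For the original goal (the time difference between the dates on the last two lines), the queue approach can be combined with date parsing. This is only a minimal sketch: it assumes each of the last two lines contains nothing but a timestamp that DateTime.Parse accepts under the invariant culture, and the names LogTail and LastTwoLinesDifference are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public static class LogTail
{
    // Streams the file and keeps only the last `count` lines in memory.
    public static List<string> ReadLastLines(string path, int count)
    {
        var queue = new Queue<string>();
        if (count < 1)
            return new List<string>();
        foreach (var line in File.ReadLines(path))
        {
            if (queue.Count == count)
                queue.Dequeue();
            queue.Enqueue(line);
        }
        return new List<string>(queue);
    }

    // Assumption: the last two lines each hold a parseable timestamp and nothing else.
    public static TimeSpan LastTwoLinesDifference(string path)
    {
        var tail = ReadLastLines(path, 2);
        if (tail.Count < 2)
            throw new InvalidOperationException("The file has fewer than two lines.");
        var earlier = DateTime.Parse(tail[0], CultureInfo.InvariantCulture);
        var later = DateTime.Parse(tail[1], CultureInfo.InvariantCulture);
        return later - earlier;
    }
}
```

In the form from the question this could be used as richTextBox1.Text = LogTail.LastTwoLinesDifference("C:\\test.log").ToString(); if the real log lines carry more than a timestamp, the timestamp has to be cut out of each line before parsing.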

Since
File.ReadAllLines("C:\\test.log");
returns an array you can take the last two items of the array:
var data = File.ReadAllLines("C:\\test.log");
string last = data[data.Length - 1];
string lastButOne = data[data.Length - 2];
In the general case, with long files (which is why ReadAllLines is a bad choice), you can implement a Tail extension method:
public static partial class EnumerableExtensions {
public static IEnumerable<T> Tail<T>(this IEnumerable<T> source, int count) {
if (null == source)
throw new ArgumentNullException("source");
else if (count < 0)
throw new ArgumentOutOfRangeException("count");
else if (0 == count)
yield break;
Queue<T> queue = new Queue<T>(count + 1);
foreach (var item in source) {
queue.Enqueue(item);
if (queue.Count > count)
queue.Dequeue();
}
foreach (var item in queue)
yield return item;
}
}
...
var lastTwolines = File
.ReadLines("C:\\test.log") // Not all lines
.Tail(2);

You can try to do this
var lastLines = File.ReadAllLines("C:\\test.log").Reverse().Take(2).Reverse();
But depending on how large your file is there are probably more efficient methods to process this than reading all lines at once. See Get last 10 lines of very large text file > 10GB and How to read last “n” lines of log file
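One such method is to scan backwards from the end of the file instead of reading it from the start. The following is a sketch under stated assumptions, not a definitive implementation: it assumes a UTF-8 or ASCII file whose lines end in \n or \r\n (it will not work for UTF-16), and the names TailReader and ReadLastLinesFromEnd are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class TailReader
{
    // Returns up to `count` trailing lines by scanning backwards for '\n' bytes.
    public static string[] ReadLastLinesFromEnd(string path, int count)
    {
        if (count < 1)
            return new string[0];
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long length = fs.Length;
            long pos = length;      // start of the chunk currently being scanned
            long start = 0;         // byte offset where the wanted lines begin
            int newlines = 0;
            bool found = false;
            var buffer = new byte[4096];
            while (pos > 0 && !found)
            {
                int toRead = (int)Math.Min(buffer.Length, pos);
                pos -= toRead;
                fs.Seek(pos, SeekOrigin.Begin);
                int read = 0;
                while (read < toRead)
                {
                    int n = fs.Read(buffer, read, toRead - read);
                    if (n == 0) break;
                    read += n;
                }
                for (int i = read - 1; i >= 0; i--)
                {
                    if (buffer[i] != (byte)'\n') continue;
                    // A newline that terminates the file does not start a new line.
                    if (pos + i == length - 1) continue;
                    if (++newlines == count)
                    {
                        start = pos + i + 1;
                        found = true;
                        break;
                    }
                }
            }
            // Decode only the tail of the file.
            fs.Seek(start, SeekOrigin.Begin);
            using (var reader = new StreamReader(fs, Encoding.UTF8))
            {
                var lines = new List<string>();
                string line;
                while ((line = reader.ReadLine()) != null)
                    lines.Add(line);
                return lines.ToArray();
            }
        }
    }
}
```

Searching for the byte 0x0A is safe in UTF-8 because that value never occurs inside a multi-byte sequence. For small files the simpler approaches above are perfectly adequate.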

Simply store the result of ReadAllLines in a variable and then take the last two lines:
var allText = File.ReadAllLines("C:\\test.log");
var lastLines = allText.Skip(allText.Length - 2);

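You can use Skip() and Take() like this:
var lines = File.ReadAllLines("C:\\test.log");
var lastTwo = lines.Skip(lines.Length - 2).Take(2);
// Join the lines; calling ToString() on the array would only print its type name
richTextBox1.Text = string.Join(Environment.NewLine, lastTwo);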

You can use StreamReader in a combination of Queue<string> since you have to read whole file either way.
// if you want to read more lines, change this to the number of lines you want
const int LINES_KEPT = 2;
Queue<string> meQueue = new Queue<string>();
using ( StreamReader reader = new StreamReader(File.OpenRead("C:\\test.log")) )
{
string line = string.Empty;
while ( ( line = reader.ReadLine() ) != null )
{
if ( meQueue.Count == LINES_KEPT )
meQueue.Dequeue();
meQueue.Enqueue(line);
}
}
Now you can use these two lines like this:
string line1 = meQueue.Dequeue();
string line2 = meQueue.Dequeue(); // <-- this is the last line.
Or to add this to the RichTextBox :
richTextBox1.Text = string.Empty; // clear the text
while ( meQueue.Count != 0 )
{
richTextBox1.Text += meQueue.Dequeue() + Environment.NewLine; // add all lines in the same order as they were in the file, one per line
}
File.ReadAllLines reads the whole text first, and LINQ then iterates over the lines that have already been read. This method does everything in one pass.

string line;
string[] lines = new string[]{"",""};
int index = 0;
using ( StreamReader reader = new StreamReader(File.OpenRead("C:\\test.log")) )
{
while ( ( line = reader.ReadLine() ) != null )
{
lines[index] = line;
index = 1-index;
}
}
// Second-to-last line = lines[index]
// Last line = lines[1-index]

Related

How to Stream string data from a txt file into an array

I'm doing this exercise from a lab. The instructions are as follows:
This method should read the product catalog from a text file called “catalog.txt” that you should
create alongside your project. Each product should be on a separate line. Use the instructions in the video to create the file and add it to your project, and to return an
array with the first 200 lines from the file (use the StreamReader class and a while loop to read
from the file). If the file has more than 200 lines, ignore them. If the file has less than 200 lines,
it’s OK if some of the array elements are empty (null).
I don't understand how to stream data into the string array. Any clarification would be greatly appreciated!
static string[] ReadCatalogFromFile()
{
//create instance of the catalog.txt
StreamReader readCatalog = new StreamReader("catalog.txt");
//store the information in this array
string[] storeCatalog = new string[200];
int i = 0;
//test and store the array information
while (storeCatalog != null)
{
//store each string in the elements of the array?
storeCatalog[i] = readCatalog.ReadLine();
i = i + 1;
if (storeCatalog != null)
{
//test to see if its properly stored
Console.WriteLine(storeCatalog[i]);
}
}
readCatalog.Close();
Console.ReadLine();
return storeCatalog;
}
Here are some hints:
int i = 0;
This needs to be outside your loop (now it is reset to 0 each time).
In your while() you should check the result of readCatalog.ReadLine() and/or the maximum number of lines to read (i.e. the size of your array).
Thus: if you have reached the end of the file -> stop, or if your array is full -> stop.
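Since the lab asks for a while loop, the two stop conditions from the hints can be combined in the loop condition. A minimal sketch, where the class name CatalogReader and the maxLines parameter are assumptions made for illustration (the lab itself fixes the file name to catalog.txt and the size to 200):

```csharp
using System.IO;

public static class CatalogReader
{
    // Reads at most `maxLines` lines; slots beyond the end of the file stay null.
    public static string[] ReadCatalog(string path, int maxLines)
    {
        var catalog = new string[maxLines];
        using (var reader = new StreamReader(path))
        {
            int i = 0;
            string line;
            // Stop when the array is full or the file ends (ReadLine returns null).
            while (i < maxLines && (line = reader.ReadLine()) != null)
            {
                catalog[i] = line;
                i++;
            }
        }
        return catalog;
    }
}
```

Checking i < maxLines first means no extra line is consumed from the file once the array is full.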
static string[] ReadCatalogFromFile()
{
var lines = new string[200];
using (var reader = new StreamReader("catalog.txt"))
for (var i = 0; i < 200 && !reader.EndOfStream; i++)
lines[i] = reader.ReadLine();
return lines;
}
A for-loop is used when you know the number of iterations beforehand. Here you can say it should iterate at most 200 times, so you won't cross the index boundaries. At the moment you only check that your array isn't null, which it never will be.
using(var readCatalog = new StreamReader("catalog.txt"))
{
string[] storeCatalog = new string[200];
for(int i = 0; i<200; i++)
{
string temp = readCatalog.ReadLine();
if(temp != null)
storeCatalog[i] = temp;
else
break;
}
return storeCatalog;
}
As soon as there are no more lines in the file, temp will be null and the loop will be stopped by the break.
I suggest you use your disposable resources (like any stream) in a using statement. After the operations in the braces, the resource will automatically get disposed.

find all next lines when previous line contains a string

I'm working on an ASP.NET MVC application and I'm trying to get every line whose previous line contains a given word.
I've used the code below, but I only get the last line that contains the given word.
int counter = 0;
string line;
List<string> found = new List<string>();
// Read the file and display it line by line.
System.IO.StreamReader file = new System.IO.StreamReader("C:\\Users\\Chaimaa\\Documents\\path.txt");
while ((line = file.ReadLine()) != null)
{
if (line.Contains("fact"))
{
found.Add(line);
}
foreach (var i in found)
{
var output = i;
ViewBag.highlightedText = output;
}
}
Any help on what I should add to:
1- get ALL lines that contain the word
2- and preferably get ALL the lines that follow them
You can use an overload of Where that provides an index, store indexes in a hash set, and use the containment check to decide if a line should be kept or not, like this:
var seen = new HashSet<int>();
var res = data.Where((v, i) => {
if (v.Contains("fact")) {
seen.Add(i);
}
return seen.Contains(i-1);
});
As a side benefit, seen would contain indexes of all lines where the word "fact" has been found.
You can write a Pairwise method that takes in a sequence of values and returns a sequence containing each item paired with the item that came before it:
public static IEnumerable<Tuple<T, T>> Pairwise<T>(this IEnumerable<T> source)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
T prev = iterator.Current;
while (iterator.MoveNext())
{
yield return Tuple.Create(prev, iterator.Current);
prev = iterator.Current;
}
}
}
With this method we can pair off each of the lines, get the lines where the previous value contains a word, and then project out the second value, which is the line after it:
var query = lines.Pairwise()
.Where(pair => pair.Item1.Contains(word))
.Select(pair => pair.Item2);
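Note that the loop in the question overwrites ViewBag.highlightedText on every iteration, which is why only the last match is visible. The same "remember what the previous line did" idea from the answers above can also be written as a plain loop that collects both the matching lines and the lines that directly follow a match. This is only a sketch; the names LineSearch and Find are hypothetical.

```csharp
using System.Collections.Generic;

public static class LineSearch
{
    // Fills `matches` with lines containing `word` and `following`
    // with lines that come directly after such a line.
    public static void Find(IEnumerable<string> lines, string word,
                            List<string> matches, List<string> following)
    {
        bool previousMatched = false;
        foreach (var line in lines)
        {
            if (previousMatched)
                following.Add(line);
            previousMatched = line.Contains(word);
            if (previousMatched)
                matches.Add(line);
        }
    }
}
```

The two lists can then be passed to the view as a whole, for example via File.ReadLines for the input and a ViewBag property for each list.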

How to loop through and compare millions of values in two text files?

I have two text files (TXT) which contain over 2 million distinct file names. I want to loop through all the names in the first file and find those that are also present in the second text file.
I have tried looping through the StreamReader but it takes a lot of time. I also tried the code below, but it still takes too much time.
StreamReader first = new StreamReader(path);
string strFirst = first.ReadToEnd();
string[] strarrFirst = strFirst.Split('\n');
bool found = false;
StreamReader second = new StreamReader(path2);
string str = second.ReadToEnd();
string[] strarrSecond = str.Split('\n');
for (int j = 0; j < (strarrFirst.Length); j++)
{
found = false;
for (int i = 0; i < (strarrSecond .Length); i++)
{
if (strarrFirst[j] == strarrSecond[i])
{
found = true;
break;
}
}
if (!found)
{
Console.WriteLine(strarrFirst[j]);
}
}
What is a good way to compare the files?
How about this:
var commonNames = File.ReadLines(path).Intersect(File.ReadLines(path2));
That's O(N + M) instead of your current solution which tests every line in the first file with every line in the second file - O(N * M).
That's assuming you're using .NET 4. Otherwise, you could use File.ReadAllLines, but that will read the whole file into memory. Or you could write the equivalent of File.ReadLines yourself - it's not terribly hard.
Ultimately you're likely to be limited by file IO by the time you've got rid of the O(N * M) problem in your current code - there's not much way to get round that.
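To see where the O(N + M) comes from, here is a sketch of roughly what a set-based intersection does: one pass over the second sequence to build a hash set, then one pass over the first sequence to probe it. The names SetOps and CommonLines are made up for illustration, and this is not claimed to be the framework's exact source.

```csharp
using System.Collections.Generic;

public static class SetOps
{
    // One pass to build the set (M items), one pass to probe it (N items).
    public static IEnumerable<string> CommonLines(IEnumerable<string> first,
                                                  IEnumerable<string> second)
    {
        var set = new HashSet<string>(second);
        foreach (var item in first)
        {
            // Remove returns true only the first time a value is seen,
            // so each common value is yielded once.
            if (set.Remove(item))
                yield return item;
        }
    }
}
```

Each hash lookup takes constant time on average, which replaces the inner loop of the original code.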
EDIT: For .NET 2, first let's implement something like ReadLines:
public static IEnumerable<string> ReadLines(string file)
{
using (TextReader reader = File.OpenText(file))
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
Now we really want to use a HashSet<T>, but that wasn't in .NET 2 - so let's use Dictionary<TKey, TValue> instead:
Dictionary<string, string> map = new Dictionary<string, string>();
foreach (string line in ReadLines(path))
{
map[line] = line;
}
List<string> intersection = new List<string>();
foreach (string line in ReadLines(path2))
{
if (map.ContainsKey(line))
{
intersection.Add(line);
}
}
Try something like this to speed it up a bit ...
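var path = string.Empty; // placeholder: set this to the first file's path
var path2 = string.Empty; // placeholder: set this to the second file's path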
var strFirst = string.Empty;
var str = string.Empty;
var strarrFirst = new List<string>();
var strarrSecond = new List<string>();
using (var first = new StreamReader(path))
{
strFirst = first.ReadToEnd();
}
using (var second = new StreamReader(path2))
{
str = second.ReadToEnd();
}
strarrFirst.AddRange(strFirst.Split('\n'));
strarrSecond.AddRange(str.Split('\n'));
strarrSecond.Sort();
foreach(var value in strarrFirst)
{
var found = strarrSecond.BinarySearch(value) >= 0;
if (!found) Console.WriteLine(value);
}
Just for fun, I've tried Jon Skeet's method and my own:
var guidArray = Enumerable.Range(0, 1000000).Select(x => Guid.NewGuid().ToString()).ToList();
string path = "first.txt";
File.WriteAllLines(path, guidArray);
string path2 = "second.txt";
File.WriteAllLines(path2, guidArray.Select(x=>DateTime.UtcNow.Ticks % 2 == 0 ? x : Guid.NewGuid().ToString()));
var start = DateTime.Now;
var commonNames = File.ReadLines(path).Intersect(File.ReadLines(path2)).ToList();
Console.WriteLine((DateTime.Now - start).TotalMilliseconds);
start = DateTime.Now;
var lines = File.ReadAllLines(path);
var hashset = new HashSet<string>(lines);
var lines2 = File.ReadAllLines(path2);
var result = lines2.Where(hashset.Contains).ToList();
Console.WriteLine((DateTime.Now - start).TotalMilliseconds);
Console.ReadKey();
Skeet's method was a tiny bit faster (1453.0831 ms vs 1488.0851 ms; iDevForFun's method was quite slow at 12791.7316 ms), so I think the same thing happens under the hood as what I was doing manually with the HashSet.

randomly select a line in a textfile c# streamreader

Hi, I am reading from a text file and would like each line to be put into a separate variable. From what I remember from my programming classes, arrays cannot be dynamic. So if I set up 15 arrays and the text file has 1000 lines, what can I do and how do I implement it?
The thing is, only one line will be needed, but I want that line to be randomly selected. linktext is the whole text file with \r\n appended to the end of every line.
Maybe randomly select a \r\n, then count 4 and take the string after it up to the next \r\n. The problem with this idea is that the strings being read can also contain \, so any ideas?
if (crawler == true)
{
TextReader tr = new StreamReader("textfile.txt");
while (tr.Peek() != -1)
{
linktext = linktext + tr.ReadLine() + "\r\n";
}
//link = linktext;
hi.Text = linktext.ToString();
timer1.Interval = 7000; //1000ms = 1sec 7 seconds per cycle
timer1.Tick += new EventHandler(randomLink); //every cycle randomURL is called.
timer1.Start(); // start timer.
}
File.ReadAllLines(...) will read every line of the given file into an array of strings. I think that should be what you want but your question is kind of hard to follow.
You don't need to keep more than two lines in memory at a time... there's a sneaky trick you can use:
Create an instance of Random, or take one as a parameter
Read the first line. This automatically becomes the "current" line to return
Read the second line, and then call Random.Next(2). If the result is 0, make the second line the "current" line
Read the third line, and then call Random.Next(3). If the result is 0, make the third line the "current" line
... etc
When you reach the end of the file (reader.ReadLine returns null) return the "current" line.
Here's a general implementation for an IEnumerable<T> - if you're using .NET 4, you can use File.ReadLines() to get an IEnumerable<string> to pass to it. (This implementation has a bit more in it than is really needed - it's optimized for IList<T> etc.)
public static T RandomElement<T>(this IEnumerable<T> source,
Random random)
{
if (source == null)
{
throw new ArgumentNullException("source");
}
if (random == null)
{
throw new ArgumentNullException("random");
}
ICollection collection = source as ICollection;
if (collection != null)
{
int count = collection.Count;
if (count == 0)
{
throw new InvalidOperationException("Sequence was empty.");
}
int index = random.Next(count);
return source.ElementAt(index);
}
ICollection<T> genericCollection = source as ICollection<T>;
if (genericCollection != null)
{
int count = genericCollection.Count;
if (count == 0)
{
throw new InvalidOperationException("Sequence was empty.");
}
int index = random.Next(count);
return source.ElementAt(index);
}
using (IEnumerator<T> iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
throw new InvalidOperationException("Sequence was empty.");
}
int countSoFar = 1;
T current = iterator.Current;
while (iterator.MoveNext())
{
countSoFar++;
if (random.Next(countSoFar) == 0)
{
current = iterator.Current;
}
}
return current;
}
}
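Why the single-pass loop gives every item the same chance: for a sequence of n items, item k becomes the current choice with probability 1/k (the random.Next(countSoFar) == 0 check, with the first item chosen automatically) and then has to survive every later step. A short sketch of the argument:

```latex
P(\text{item } k \text{ is returned})
  = \frac{1}{k} \prod_{j=k+1}^{n} \left(1 - \frac{1}{j}\right)
  = \frac{1}{k} \prod_{j=k+1}^{n} \frac{j-1}{j}
  = \frac{1}{k} \cdot \frac{k}{n}
  = \frac{1}{n}
```

The product telescopes, so the result does not depend on k.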
A List<T> is a dynamically expanding list. You might want to use that instead of an array.
If there are only 1000 elements, just read them into the list and select a random element.
Regarding the array thing.. you could use a List<> instead, which is dynamic
Here is an example of how this can be achieved:
public static string GetRandomLine(string file) {
    List<string> lines = new List<string>();
    Random rnd = new Random();
    try {
        if (!File.Exists(file))
            return string.Empty;
        // using disposes the reader even if an exception is thrown
        using (StreamReader reader = new StreamReader(file)) {
            string line;
            while ((line = reader.ReadLine()) != null)
                lines.Add(line);
        }
        if (lines.Count == 0)
            return string.Empty; // empty file: nothing to pick
        return lines[rnd.Next(lines.Count)].Trim();
    }
    catch (IOException ex) {
        MessageBox.Show("Error: " + ex.Message);
        return string.Empty;
    }
}
If you create the file yourself, the ideal way would be to store metadata about the file, such as the number of lines, beforehand, and then decide which 'random' line to choose.
Otherwise, you can get around the "array" problem by not using arrays. Instead, use a List, which stores any number of strings. After that, picking a random one is as simple as generating a random number from 0 up to (but not including) the size of the list.
Your problem has been done before, I recommend googling for "C# read random line from file".

Split large file into smaller files by number of lines in C#?

I am trying to figure out how to split a file by the number of lines in each file. The files are CSV and I can't do it by bytes; I need to do it by lines. 20k seems to be a good number per file. What is the best way to read a stream at a given position? Stream.BaseStream.Position? So if I read the first 20k lines, I would start the position at 39,999? How do I know I am almost at the end of a file? Thanks all
using (System.IO.StreamReader sr = new System.IO.StreamReader("path"))
{
int fileNumber = 0;
while (!sr.EndOfStream)
{
int count = 0;
using (System.IO.StreamWriter sw = new System.IO.StreamWriter("other path" + ++fileNumber))
{
sw.AutoFlush = true;
while (!sr.EndOfStream && count++ < 20000) // post-increment so each file gets exactly 20000 lines
{
sw.WriteLine(sr.ReadLine());
}
}
}
}
int index=0;
var groups = from line in File.ReadLines("myfile.csv")
group line by index++/20000 into g
select g.AsEnumerable();
int file=0;
foreach (var group in groups)
File.WriteAllLines((file++).ToString(), group.ToArray());
I'd do it like this:
// helper method to break up into blocks lazily
// (declare it in a static class; "this" makes it callable as an extension method below)
public static IEnumerable<ICollection<T>> SplitEnumerable<T>
(this IEnumerable<T> Sequence, int NbrPerBlock)
{
List<T> Group = new List<T>(NbrPerBlock);
foreach (T value in Sequence)
{
Group.Add(value);
if (Group.Count == NbrPerBlock)
{
yield return Group;
Group = new List<T>(NbrPerBlock);
}
}
if (Group.Any()) yield return Group; // flush out any remaining
}
// now it's trivial; if you want to make smaller files, just foreach
// over this and write out the lines in each block to a new file
public static IEnumerable<ICollection<string>> SplitFile(string filePath)
{
return File.ReadLines(filePath).SplitEnumerable(20000);
}
Is that not sufficient for you? You mention moving from position to position, but I don't see why that's necessary.
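The "foreach over this and write out the lines in each block" step could look like the following sketch. It is self-contained, so it repeats a small block helper; the names FileSplitter, Blocks and Split and the output naming scheme are assumptions for illustration. It also assumes that no CSV field contains an embedded line break and it does not repeat a header row in each output file.

```csharp
using System.Collections.Generic;
using System.IO;

public static class FileSplitter
{
    // Lazily groups a sequence into lists of at most `size` items.
    public static IEnumerable<List<string>> Blocks(IEnumerable<string> lines, int size)
    {
        var block = new List<string>(size);
        foreach (var line in lines)
        {
            block.Add(line);
            if (block.Count == size)
            {
                yield return block;
                block = new List<string>(size);
            }
        }
        if (block.Count > 0)
            yield return block; // remaining lines, fewer than `size`
    }

    // Writes each block to "<outputPrefix><n>.csv"; returns the number of files written.
    public static int Split(string inputPath, string outputPrefix, int linesPerFile)
    {
        int fileNumber = 0;
        foreach (var block in Blocks(File.ReadLines(inputPath), linesPerFile))
        {
            fileNumber++;
            File.WriteAllLines(outputPrefix + fileNumber + ".csv", block);
        }
        return fileNumber;
    }
}
```

Because File.ReadLines streams the input, only one block of lines is held in memory at a time, and the end of the file needs no special handling: the last block is simply shorter.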
