C#: String.IndexOf to FileStream.Seek

Given a FileStream that I read with a StreamReader (the file is very large), how can I set the position of the FileStream to the first occurrence of a certain substring, so that I can start reading this large file from that point?
Thanks

What's in the file? Just lines of Unicode text? Then you've got a problem.
You will never know the position of the start of a line until you've read all the previous lines at least once. Unless the file is encoded in UTF-32, each character may take a variable number of bytes to represent it. Each line will have a variable length.
The best you can do is to scan through the file once and then make note of the positions of the starts of lines, in an index.
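A rough sketch of such an index follows. It assumes the file is UTF-8 or ASCII, where the byte 0x0A can only ever mean a line feed; the names LineIndex, Build and ReadLineAt are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineIndex
{
    // Scans the file once and returns the byte offset at which each line starts.
    // Assumption: UTF-8 or ASCII text, so 0x0A never appears inside another character.
    public static List<long> Build(string path)
    {
        var offsets = new List<long> { 0 };
        long position = 0; // byte offset of buffer[0] within the file
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            var buffer = new byte[64 * 1024];
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] == 0x0A)
                        offsets.Add(position + i + 1);
                }
                position += read;
            }
        }
        // A trailing newline does not start another line, so drop that entry.
        if (offsets.Count > 1 && offsets[offsets.Count - 1] == position)
            offsets.RemoveAt(offsets.Count - 1);
        return offsets;
    }

    // Seeks to a previously recorded offset and reads one line from there.
    public static string ReadLineAt(string path, long offset)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            stream.Seek(offset, SeekOrigin.Begin);
            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadLine();
            }
        }
    }
}
```

The index is only valid as long as the file does not change, so it has to be rebuilt after any edit.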

FileStream cannot do the search for you; you'll have to search for it manually. You'll probably want to use an efficient string searching algorithm such as Knuth-Morris-Pratt.
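As a sketch of what that manual search could look like, here is Knuth-Morris-Pratt run over the raw bytes of a stream. It assumes the file is UTF-8, so the substring can be encoded to bytes and matched directly; StreamSearch, IndexOf and OpenAt are hypothetical names, not framework APIs.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamSearch
{
    // Returns the byte offset of the first occurrence of pattern at or after the
    // stream's current position, or -1 if it is not found (Knuth-Morris-Pratt).
    public static long IndexOf(Stream stream, byte[] pattern)
    {
        if (pattern == null || pattern.Length == 0)
            throw new ArgumentException("Pattern must not be empty.", nameof(pattern));

        // failure[i] = length of the longest proper prefix of pattern[0..i]
        // that is also a suffix of it.
        var failure = new int[pattern.Length];
        for (int i = 1, k = 0; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = failure[k - 1];
            if (pattern[i] == pattern[k]) k++;
            failure[i] = k;
        }

        var buffer = new byte[64 * 1024];
        long position = stream.Position; // byte offset of buffer[0]
        int matched = 0;                 // pattern bytes matched so far
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                while (matched > 0 && buffer[i] != pattern[matched]) matched = failure[matched - 1];
                if (buffer[i] == pattern[matched]) matched++;
                if (matched == pattern.Length)
                    return position + i + 1 - pattern.Length;
            }
            position += read;
        }
        return -1;
    }

    // Opens the file and returns a reader positioned at the first occurrence of
    // text, or null if the text does not occur. Assumption: the file is UTF-8.
    public static StreamReader OpenAt(string path, string text)
    {
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read);
        long offset = IndexOf(stream, Encoding.UTF8.GetBytes(text));
        if (offset < 0)
        {
            stream.Dispose();
            return null;
        }
        stream.Seek(offset, SeekOrigin.Begin);
        return new StreamReader(stream, Encoding.UTF8);
    }
}
```

The Seek has to happen on the FileStream before the StreamReader is created, because the reader buffers ahead and its own position is not the stream's position. KMP suits a stream because it never needs to back up in the input, so matches that straddle two buffer reads are handled without extra work.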

Maybe this can help (Building a Regular Expression Stream Search with the .NET Framework):
https://www.developer.com/design/building-a-regular-expression-stream-search-with-the-net-framework/

If you mean the first time you read the file, then you will have to read through it to find the position of the particular string. If the content of the file does not change, you can remember this position (in a variable, for use within the same run of the program), set the stream position to it next time, and start reading from there.

Take a look at this example on MSDN
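// Note: s is the path string passed to FileStream, so this trims the path; it does not search the file's contents.
// Substring(startIndex) is used because Substring(startIndex, s.Length) throws unless startIndex is 0.
filestream = new FileStream(s.Substring(s.IndexOf("string")), FileMode.Open, FileAccess.Read);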

Related

Using StreamReader.ReadLine() to read a specific line number w/o reading entire file

I'm using StreamReader.ReadLine() in C# to read through a text file to find the specific content like "Step-xx" and then read and use the contents that point to the next occurrence of "Step-xx+1". I know the occurrence of the "Step-xx" line is 100 lines apart in my textfile. How can I jump to line 2500 and read the contents following "Step-25", rather than reading 2500 lines and comparing it to "Step-25", which I'm doing now. I need to speed up this search.
Thanks.
Files are not line-based (or even character-based), so you can't skip to a specific line in a file.
If you really need to skip ahead in the file, you would need to make a guess where the 2500th line might start based on average line lengths, seek to that position and start reading. You would need to use a FileStream directly, not a StreamReader, and read the file as bytes. You would be looking for the 0x0d 0x0a byte combination that is used as newline in a Windows text file. When you have the bytes between two newlines, you can decode them into a string and look for the Step-xx markers.
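A minimal sketch of the "guess, seek, then resynchronise on a newline" step, assuming the guessed byte offset comes from your own estimate of the average line length. LineSeek and SeekToNextLineStart are illustrative names. It looks only for 0x0A, which also covers the 0x0D 0x0A pair.

```csharp
using System;
using System.IO;

public static class LineSeek
{
    // Positions the stream at the start of the first line that begins at or
    // after the guessed byte offset, and returns that position.
    public static long SeekToNextLineStart(Stream stream, long guess)
    {
        if (guess <= 0)
        {
            stream.Seek(0, SeekOrigin.Begin);
            return 0;
        }

        // Start one byte early so a guess that already is a line start is kept.
        stream.Seek(guess - 1, SeekOrigin.Begin);
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (b == 0x0A)
                return stream.Position; // first byte after the newline
        }
        return stream.Position; // reached the end of the file without a newline
    }
}
```

You still do not know which line number you landed on, so after the seek you would read a line, look at its Step-xx marker, and adjust the guess forwards or backwards from there.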
Thanks for all the replies. This will do the trick.
string line = File.ReadLines(FileName).Skip(14).Take(1).First();
I need to figure out how changing from StreamReader to ReadLines would impact other things.
Thanks again

Read random string from a file without reading all file

How can I read a random line from a file without reading the whole file?
I tried the FileStream class with its Seek and Read functions, but had no luck finding a method that works 100% of the time.
This isn't possible as a line is defined by a carriage return or line break. There is no way to know how many lines there are in the file without reading all of it unless you know something about the structure of the file; for example, that every line is 80 characters. In that case you're picking a random 80 characters from a file of length N, in which case your FileStream seek is the way to go.
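For the fixed-length case, a sketch could look like this. It assumes ASCII text where every record, including its line terminator, occupies exactly recordLength bytes; FixedRecords and its methods are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class FixedRecords
{
    // Reads the record with the given zero-based index.
    // Assumption: every record is exactly recordLength bytes, terminator included.
    public static string ReadRecord(string path, int recordLength, long index)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            long count = stream.Length / recordLength;
            if (index < 0 || index >= count)
                throw new ArgumentOutOfRangeException(nameof(index));

            stream.Seek(index * recordLength, SeekOrigin.Begin);
            var buffer = new byte[recordLength];
            int total = 0;
            while (total < recordLength)
            {
                int read = stream.Read(buffer, total, recordLength - total);
                if (read == 0) break; // unexpected end of file
                total += read;
            }
            return Encoding.ASCII.GetString(buffer, 0, total).TrimEnd('\r', '\n');
        }
    }

    // Picks a record uniformly at random; returns null for a file with no records.
    public static string ReadRandomRecord(string path, int recordLength, Random random)
    {
        long count = new FileInfo(path).Length / recordLength;
        if (count == 0) return null;
        int index = random.Next((int)Math.Min(count, int.MaxValue));
        return ReadRecord(path, recordLength, index);
    }
}
```

Only one record is read from disk per call, regardless of the size of the file.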

C#: Search and replace txt line

I am looking for a way to search a comma-separated txt file for a keyword, and then replace another keyword on that exact line. For example, if I have the following line in a big txt file:
Help, 0
I want to find this line in the txt file (by telling the program to look for the first word 'Help') and replace the 0 with 1 to indicate that I have read it once, so that it looks like:
Help, 1
Thanks
It is generally a very bad idea to try and overwrite data in the same file: if your code throws an exception, you'll be left with a partially processed file; if your search target and replacement value have different lengths, you have to re-write the rest of the file. Note that these don't apply in your specific situation - but it's best not to let it become habit.
My recommendation:
Open both the input file and a temporary file (Path.GetTempFileName)
Process and write each line (StreamReader.ReadLine)
When finished with no errors, rename the original file to something like origFile.old
Rename the temporary file to the original file name.
If something goes wrong, delete the temporary file and exit. This way the original file is left intact in the event of an error.
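The steps above can be sketched roughly as follows. SafeRewrite and RewriteLines are made-up names, the ".old" suffix is just the example from the steps, and the writer uses StreamWriter's default UTF-8 encoding and the platform's line ending, which may differ from the original file.

```csharp
using System;
using System.IO;

public static class SafeRewrite
{
    // Copies the file line by line through 'transform' into a temporary file,
    // then swaps the files, keeping the original as "<path>.old".
    public static void RewriteLines(string path, Func<string, string> transform)
    {
        string tempPath = Path.GetTempFileName();
        try
        {
            using (var reader = new StreamReader(path))
            using (var writer = new StreamWriter(tempPath))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                    writer.WriteLine(transform(line));
            }

            string backupPath = path + ".old";
            File.Delete(backupPath);      // does nothing if no earlier backup exists
            File.Move(path, backupPath);  // keep the original around
            File.Move(tempPath, path);    // put the new content in its place
        }
        catch
        {
            // Any failure leaves the original content intact (under its own
            // name, or as the .old file); only the temporary file is discarded.
            File.Delete(tempPath);
            throw;
        }
    }
}
```

For the question's example, the call would be something like SafeRewrite.RewriteLines(path, line => line.StartsWith("Help,") ? "Help, 1" : line).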
If you want to do the replacement "in place" (meaning you don't want to use another, temporary, file) then you would do so with a FileStream.
You have a couple of options. You can Read through the file stream until you find the text that you're looking for, then issue a Write. Keep in mind that FileStream works at the byte level, so you'll need to take character encoding into consideration. Encoding.GetString will do the conversion.
Alternatively, you can search for the text, and note its position. Then you can open a FileStream and just Seek to that position. Then you can issue the Write.
This may be the most efficient way, but it's definitely more challenging than the naive option. With the naive implementation, you:
Read the entire file into memory (File.ReadAllText)
Perform the replace (Regex.Replace)
Write it back to disk (File.WriteAllText)
There's no second file, but you are bound by the amount of memory the system has. If you know you're always dealing with small files, then this could be an option. Otherwise, you need to read up on character encoding and file streams.
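For the in-place FileStream route, here is a minimal sketch that only handles the easy case where the replacement has exactly the same number of bytes as the text it replaces (as with "Help, 0" and "Help, 1"). It assumes UTF-8 text; InPlaceEdit and ReplaceFirst are illustrative names.

```csharp
using System;
using System.IO;
using System.Text;

public static class InPlaceEdit
{
    // Overwrites the first occurrence of oldText with newText inside the file.
    // Both must encode to the same number of bytes. Returns false if not found.
    public static bool ReplaceFirst(string path, string oldText, string newText)
    {
        byte[] oldBytes = Encoding.UTF8.GetBytes(oldText);
        byte[] newBytes = Encoding.UTF8.GetBytes(newText);
        if (oldBytes.Length == 0 || oldBytes.Length != newBytes.Length)
            throw new ArgumentException("Texts must be non-empty and of equal byte length.");

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            int matched = 0;
            int b;
            while ((b = stream.ReadByte()) != -1)
            {
                if (b == oldBytes[matched])
                {
                    matched++;
                    if (matched == oldBytes.Length)
                    {
                        // Step back to the start of the match and overwrite it.
                        stream.Seek(-oldBytes.Length, SeekOrigin.Current);
                        stream.Write(newBytes, 0, newBytes.Length);
                        return true;
                    }
                }
                else
                {
                    // Naive search: retry from the byte after the failed attempt's start.
                    stream.Seek(-matched, SeekOrigin.Current);
                    matched = 0;
                }
            }
            return false;
        }
    }
}
```

If the lengths differ, everything after the edit has to be rewritten, which is where the temporary-file approach becomes the simpler choice.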
Here's another SO question on the topic (including sample code): Editing a text file in place through C#

Adding a Line to the Middle of a File with .NET

Hello, I am working on something, and I need to be able to add text into a .txt file. Although I have this completed, I have a small problem: I need to write the string more or less in the middle of the file. Example:
Hello my name is Brandon,
I hope someone can help, //I want the string under this line.
Thank you.
Hopefully someone can help with a solution.
Edit: Alright, thanks guys, I'll try to figure it out; I'm probably going to just rewrite the whole file. The program I am making is related to the hosts file, and not everyone has the same hosts file, so I was wondering if there is a way to read their hosts file and copy all of it while adding the string to it?
With regular files there's no way around it - you must read the text that follows the line you wish to append after, overwrite the file, and then append the original trailing text.
Think of files on disk as arrays - if you want to insert some items into the middle of an array, you need to shift all of the following items down to make room. The difference is that .NET offers convenience methods for arrays and Lists that make this easy to do. The file I/O APIs offer no such convenience methods, as far as I'm aware.
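To make the "shift everything down" idea concrete, here is a small sketch that inserts bytes at a given offset by reading the tail, writing the new data, and then writing the tail back. It assumes the tail fits in memory, and the offset is a byte offset that you have found yourself; FileInsert and InsertAt are made-up names.

```csharp
using System;
using System.IO;

public static class FileInsert
{
    // Inserts 'data' at the given byte offset, pushing the rest of the file back.
    // Assumption: the part of the file after 'offset' fits in memory.
    public static void InsertAt(string path, long offset, byte[] data)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            if (offset < 0 || offset > stream.Length)
                throw new ArgumentOutOfRangeException(nameof(offset));

            // 1. Read everything that follows the insertion point.
            stream.Seek(offset, SeekOrigin.Begin);
            var tail = new byte[stream.Length - offset];
            int total = 0;
            while (total < tail.Length)
            {
                int read = stream.Read(tail, total, tail.Length - total);
                if (read == 0) break;
                total += read;
            }

            // 2. Write the new data over the old tail, then append the tail again.
            stream.Seek(offset, SeekOrigin.Begin);
            stream.Write(data, 0, data.Length);
            stream.Write(tail, 0, total);
        }
    }
}
```

If the process dies between the two writes, the file is left damaged, which is one reason writing a new file and renaming it is usually preferred.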
When you know in advance you need to insert in the middle of a file, it is often easier to simply write a new file with the altered content, and then perform a rename. If the file is small enough to read into memory, you can do this quite easily with some LINQ:
var allLines = File.ReadAllLines( filename ).ToList();
allLines.Insert( insertPos, "This is a new line..." );
File.WriteAllLines( filename, allLines.ToArray() );
This is a straightforward way to insert text in the middle of a text file, provided the file fits in memory and has at least 20 lines:
string[] full_file = File.ReadAllLines("test.txt");
List<string> l = new List<string>();
l.AddRange(full_file);
l.Insert(20, "Inserted String");
File.WriteAllLines("test.txt", l.ToArray());
One trick is to treat it as a file transaction. First, read the file up to the line where you want to add text, saving the lines you read into a separate file, for example tmp.txt. Then write your desired text to the end of tmp.txt. After that, continue reading from the source file until the end, still copying each line to tmp.txt. Finally, replace the source file with tmp.txt. You end up with a file that has the added text in the middle :)
Check out File.ReadAllLines(). Probably the easiest way.
If you know the line index, use ReadLine until you reach that line and write under it.
If you know the exact text of that line, do the same, but compare the text returned from ReadLine with the text that you are searching for, and then write under that line.
Or you can search for the index of a specified string and write after it, using the escape sequence \n.
As others mentioned, there is no way around rewriting the file after the point of the newly inserted text if you must stick with a simple text file. Depending on your requirements, though, it might be possible to speed up the finding of location to start writing. If you knew that you needed to add data after line N, then you could maintain a separate "index" of the offsets of line numbers. That would allow you to seek directly to the necessary location to start reading/writing.

How do you specify where to start reading in a file when using StreamReader?

How do you specify where to start reading in a file when using StreamReader?
I have created a StreamReader object, along with a FileStream object. After both objects are created, how would I go about controlling where the StreamReader starts reading from in a file?
Let's say the file's contents are as follows,
// song list.
// junk info.
1. Song Name
2. Song Name
3. Song Name
4. Song Name
5. Song Name
6. Song Name
How would I make the StreamReader start reading from, let's say, #2? Also, how could I make it stop reading at a similar delimiter, such as #5?
Edit: By delimiter I mean a way to make the StreamReader start reading from ('2.')
Are you trying to deserialize a file into some in-memory object? If so, you may want to simply parse the entire file using ReadLine or something similar, store each line, and then access it via a data structure such as a collection of KeyValuePair<int, string> (for example a Dictionary<int, string>).
Update: Ok... With the new info, I think you have two options. If you're looking at reading until you find a match, you can Peek(), check to see if the character is the one you're looking for, and then Read(). Alternatively, if you're looking for a set position, you can simply Read() that many characters and throw away the return value.
If you're looking for a complex delimiter, you can read the entire line or even the entire file into memory and use regular expressions.
Hope that helps...
If the file contains new line delimiters you can use ReadLine to read a line at a time.
So to start reading at line #2, you would read and discard the lines before it, and then keep reading lines until you reach line #5.
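A short sketch of that loop, using the "2." and "5." prefixes from the question as start and stop markers. LineRange and ReadBetween are hypothetical names, and both marker lines are included in the result.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineRange
{
    // Discards lines until one starts with startPrefix, then collects lines up to
    // and including the first one that starts with stopPrefix.
    public static List<string> ReadBetween(TextReader reader, string startPrefix, string stopPrefix)
    {
        var result = new List<string>();
        bool started = false;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (!started && line.StartsWith(startPrefix, StringComparison.Ordinal))
                started = true;

            if (started)
            {
                result.Add(line);
                if (line.StartsWith(stopPrefix, StringComparison.Ordinal))
                    break;
            }
        }
        return result;
    }
}
```

With a file, you would pass in a StreamReader, for example LineRange.ReadBetween(new StreamReader(path), "2.", "5."). The lines before the start marker are still read from disk; they are just thrown away.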
Well if the content is just plain text like that, you should use the StreamReader's ReadLine method.
http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx
-Oisin
