Writing text to the middle of a file - c#

Is there a way I can write text to a file from a certain point in the file?
For example, I open a file of 10 lines of text but I want to write a line of text to the 5th line.
I guess one way is to read the lines of the file into an array using the File.ReadAllLines method, and then add a line at a certain index in the array.
But some collections can only add members at the end, while others allow insertion at any position. To double check, an array would always allow me to add a value at any index, right? (I'm sure one of my books said otherwise.)
Also, is there a better way of going about this?
Thanks

Oh, sigh. Look up the "master file update" algorithm.
Here's pseudocode:
open master file for reading.
open output file for writing.
count := 0
while not EOF do
    read line from master file into buffer
    write line to output file
    count := count + 1
    if count = 5 then
        write added line to output file
    fi
od
close both files.
rename output file to replace input file
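The pseudocode above could be sketched in C# roughly as follows. This is only an illustration: the class name LineInserter, the method name InsertAfterLine and the ".tmp" suffix are made up for the example, and error handling (for instance, a crash between the delete and the move) is left out.

```csharp
using System;
using System.IO;

public static class LineInserter
{
    // Copies inputPath to a temporary file, writing newLine right after the
    // line numbered afterLine (1-based), then replaces the original file.
    public static void InsertAfterLine(string inputPath, int afterLine, string newLine)
    {
        string tempPath = inputPath + ".tmp"; // hypothetical temp-file name

        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            int count = 0;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                count++;
                if (count == afterLine)
                {
                    writer.WriteLine(newLine);
                }
            }
        }

        // Swap the new file in for the old one.
        File.Delete(inputPath);
        File.Move(tempPath, inputPath);
    }
}
```

Calling LineInserter.InsertAfterLine(path, 5, "added line") mirrors the "count = 5" test in the pseudocode. Only one line is held in memory at a time, so the file size does not matter.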

If you're reading/writing small files (say, under 20 megabytes--yes I consider 20M "small") and not writing them that often (as in, not several times a second) then just read/write the whole thing.
Serial files like text documents aren't designed for random access. That's what databases are for.
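For the small-file case, "read/write the whole thing" might look like the sketch below. It also touches on the array question: a C# array has a fixed length, so you can overwrite an element at any index but cannot insert a new one, whereas List<string> has an Insert method. The names WholeFileInsert and InsertLine are invented for the example.

```csharp
using System.Collections.Generic;
using System.IO;

public static class WholeFileInsert
{
    // Inserts text so that it becomes line number lineNumber (1-based).
    public static void InsertLine(string path, int lineNumber, string text)
    {
        // Copy the array into a List so that elements can be inserted.
        var lines = new List<string>(File.ReadAllLines(path));
        lines.Insert(lineNumber - 1, text);
        File.WriteAllLines(path, lines);
    }
}
```

InsertLine(path, 5, "new text") makes the new text the 5th line and pushes the old 5th line down to 6th. The whole file sits in memory while this runs, which is why it only suits small files.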

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
public class Class1
{
    static void Main()
    {
        var beatles = new LinkedList<string>();
        beatles.AddFirst("John");
        LinkedListNode<string> nextBeatles = beatles.AddAfter(beatles.First, "Paul");
        nextBeatles = beatles.AddAfter(nextBeatles, "George");
        nextBeatles = beatles.AddAfter(nextBeatles, "Ringo");
        // change the 1 to your 5th line
        LinkedListNode<string> paulsNode = beatles.NodeAt(1);
        LinkedListNode<string> recentHindrance = beatles.AddBefore(paulsNode, "Yoko");
        recentHindrance = beatles.AddBefore(recentHindrance, "Aunt Mimi");
        beatles.AddBefore(recentHindrance, "Father Jim");
        Console.WriteLine("{0}", string.Join("\n", beatles.ToArray()));
        Console.ReadLine();
    }
}
public static class Helper
{
    public static LinkedListNode<T> NodeAt<T>(this LinkedList<T> l, int index)
    {
        LinkedListNode<T> x = l.First;
        while ((index--) > 0) x = x.Next;
        return x;
    }
}

Related

How to read from a text file then calculate an average

I plan on reading the marks from a text file and then calculating what the average mark is based upon data written in previous code. I haven't been able to read the marks though or calculate how many marks there are as BinaryReader doesn't let you use .Length.
I have tried using an array to hold each mark but it doesn't like each mark being an integer
public static int CalculateAverage()
{
int count = 0;
int total = 0;
float average;
BinaryReader markFile;
markFile = new BinaryReader(new FileStream("studentMarks.txt", FileMode.Open));
//A loop to read each line of the file and add it to the total
{
//total = total + eachMark;
//count++;
}
//average = total / count;
//markFile.Close();
//Console.WriteLine("Average mark:", average);
return 0;
}
This is my studentMark.txt file in VS
First of all, don't use BinaryReader; you can use StreamReader, for example.
Also, with a using statement you don't need to call Close() yourself.
There is an answer using a while loop, but with LINQ you can do it in one line:
var avg = File.ReadAllLines("file.txt").ToArray().Average(a => Int32.Parse(a));
Console.WriteLine("avg = "+avg); //5
Also, according to the docs, File.ReadAllLines() loads the file into memory and then closes it, so there is no leaked file handle to worry about:
Opens a text file, reads all lines of the file into a string array, and then closes the file.
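For comparison, the StreamReader-with-a-loop version could look something like this. It is a sketch that assumes one integer mark per line; the class name Marks and the choice to skip lines that do not parse are mine, not part of the question.

```csharp
using System;
using System.IO;

public static class Marks
{
    // Reads one integer mark per line and returns the average.
    // Lines that are not valid integers are skipped.
    public static double CalculateAverage(string path)
    {
        int count = 0;
        int total = 0;
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                int mark;
                if (int.TryParse(line, out mark))
                {
                    total += mark;
                    count++;
                }
            }
        }
        // Cast before dividing so the result is not truncated to an integer.
        return count == 0 ? 0.0 : (double)total / count;
    }
}
```

The return type is double rather than int because integer division would drop the fractional part of the average.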
Edit to add the way to read using BinaryReader.
The first thing to know is that you are reading a txt file. Unless you created the file using BinaryWriter, BinaryReader will not work. And if you are creating a binary file, it is not good practice to give it a .txt extension.
So, assuming your file is binary, you need to loop and read every integer, so this code should work.
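var fileName = "file.txt";
int total = 0;
int count = 0;
if (File.Exists(fileName))
{
    using (BinaryReader reader = new BinaryReader(File.Open(fileName, FileMode.Open)))
    {
        // Read 4-byte integers until the end of the stream
        while (reader.BaseStream.Position < reader.BaseStream.Length)
        {
            total += reader.ReadInt32();
            count++;
        }
    }
    // Cast to double so the division is not truncated; guard against an empty file
    double average = count > 0 ? (double)total / count : 0;
    Console.WriteLine("Average = " + average); // 5
}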
I've used using to ensure the file is closed at the end.
If your file only contains numbers, you only have to use ReadInt32() and it will work.
Also, if your file is not binary, obviously, BinaryReader will not work. By the way, my binary file.txt created using BinaryWriter looks like this:
So I'm assuming you don't have a binary file...
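To see why BinaryReader only makes sense for a file produced by BinaryWriter, here is a small round-trip sketch. The names BinaryMarks, Write and Average are made up for illustration.

```csharp
using System.IO;

public static class BinaryMarks
{
    // Writes each mark as a raw 4-byte integer (not as text).
    public static void Write(string path, int[] marks)
    {
        using (var writer = new BinaryWriter(File.Open(path, FileMode.Create)))
        {
            foreach (int mark in marks)
            {
                writer.Write(mark);
            }
        }
    }

    // Reads the 4-byte integers back and averages them.
    public static double Average(string path)
    {
        long total = 0;
        int count = 0;
        using (var reader = new BinaryReader(File.Open(path, FileMode.Open)))
        {
            while (reader.BaseStream.Position < reader.BaseStream.Length)
            {
                total += reader.ReadInt32();
                count++;
            }
        }
        return count == 0 ? 0.0 : (double)total / count;
    }
}
```

Three marks written this way take exactly 12 bytes on disk. A text file containing "4", "5", "6" on separate lines has different bytes, so ReadInt32 on it would return unrelated numbers or run past the end of the stream.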

Read multiple lines from a large file in non-ascending order

I have a very large text file, over 1GB, and a list of integers that represent line numbers. I need to produce another file containing the text of those lines from the original file, in the order given by the list.
Example of original large file:
ogfile line 1
some text here
another line
blah blah
So when I get a List of "2,4,4,1" the output file should read:
some text here
blah blah
blah blah
ogfile line 1
I have tried
string lineString = File.ReadLines(filename).Skip(lineNumList[i]-1).Take(1).First();
but this takes way too long as the file has to be read in, skipped to the line in question, then reread the next time... and we are talking millions of lines in the 1GB file and my List<int> is thousands of line numbers.
Is there a better/faster way to read a single line, or have the reader skip to a specific line number without "skipping" line by line?
The high-order bit here is: you are trying to solve a database problem using text files. Databases are designed to solve big data problems; text files, as you've discovered, are terrible at random access. Use a database, not a text file.
If you are hell-bent upon using a text file, what you have to do is take advantage of stuff you know about the likely problem parameters. For example, if you know that, as you imply, there are ~1M lines, each line is ~1KB, and the set of lines to extract is ~0.1% of the total lines, then you can come up with an efficient solution like this:
Make a set containing the line numbers to be read. The set must be fast to check for membership.
Make a dictionary that maps from line numbers to line contents. This must be fast to look up by key and fast to add new key/value pairs.
Read each line of the file one at a time; if the line number is in the set, add the contents to the dictionary.
Now iterate the list of line numbers and map the dictionary contents; now we have a sequence of strings.
Dump that sequence to the destination file.
We have five operations, so hopefully it is around five lines of code.
void DoIt(string pathIn, IEnumerable<int> lineNumbers, string pathOut)
{
    var lines = new HashSet<int>(lineNumbers);
    var dict = File.ReadLines(pathIn)
        // index is 0-based; add 1 so keys match the 1-based line numbers in the question
        .Select((lineText, index) => new KeyValuePair<int, string>(index + 1, lineText))
        .Where(p => lines.Contains(p.Key))
        .ToDictionary(p => p.Key, p => p.Value);
    File.WriteAllLines(pathOut, lineNumbers.Select(i => dict[i]));
}
OK, got it in six. Pretty good.
Notice that I made use of all those assumptions; if the assumptions are violated then this stops being a good solution. In particular we assume that the dictionary is going to be small compared to the size of the input file. If that is not true, then you'll need a more sophisticated technique to get efficiencies.
Conversely, can we extract additional efficiencies? Yes, provided we know facts about likely inputs. Suppose for example we know that the same file will be iterated several times but with different line number sets, but those sets are likely to have overlap. In that case we can re-use dictionaries instead of rebuilding them. That is, suppose a previous operation has left a Dictionary<int, string> computed for lines (10, 20, 30, 40) and file X. If a request then comes in for lines (30, 20, 10) for file X, we already have the dictionary in memory.
The key thing I want to get across in this answer is that you must know something about the inputs in order to build an efficient solution; the more restrictions you can articulate on the inputs, the more efficient a solution you can build. Take advantage of all the knowledge you have about the problem domain.
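The dictionary re-use idea could be sketched like this. It is only an illustration under the same assumptions (the requested lines are a small fraction of the file, and the file does not change between requests). The class name LineCache and its Get method are invented here, and line numbers are taken to be 1-based as in the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class LineCache
{
    private readonly string _path;
    private readonly Dictionary<int, string> _lines = new Dictionary<int, string>();

    public LineCache(string path)
    {
        _path = path;
    }

    // Returns the requested 1-based lines in request order, duplicates included.
    // The file is scanned only if some requested line is not cached yet.
    public List<string> Get(IList<int> lineNumbers)
    {
        var missing = new HashSet<int>(lineNumbers.Where(n => !_lines.ContainsKey(n)));
        if (missing.Count > 0)
        {
            int last = missing.Max();
            int current = 0;
            foreach (string text in File.ReadLines(_path))
            {
                current++;
                if (missing.Contains(current)) _lines[current] = text;
                if (current >= last) break; // nothing further is needed
            }
        }
        // Throws KeyNotFoundException if a requested line is past the end of the file.
        return lineNumbers.Select(n => _lines[n]).ToList();
    }
}
```

A request for lines (10, 20, 30, 40) followed by one for (30, 20, 10) reads the file once; the second call is answered entirely from memory.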
Use a StreamReader, so you don't have to read the entire file, just up to the last desired line, and store the lines in a Dictionary for fast lookup later.
Edit: Thanks to Eric Lippert, I included a HashSet for fast lookup.
List<int> lineNumbers = new List<int>{2,4,4,1};
HashSet<int> lookUp = new HashSet<int>(lineNumbers);
Dictionary<int,string> lines = new Dictionary<int,string>();
using(StreamReader sr = new StreamReader(inputFile)){
int lastLine = lookUp.Max();
for(int currentLine=1;currentLine<=lastLine;currentLine++){
if(lookUp.Contains(currentLine)){
lines[currentLine]=sr.ReadLine();
}
else{
sr.ReadLine();
}
}
}
using(StreamWriter sw = new StreamWriter(outputFile)){
foreach(var line in lineNumbers){
sw.WriteLine(lines[line]);
}
}
You may use a StreamReader and its ReadLine method to read line by line without loading the whole file into memory:
var lines = new Dictionary<int, string>();
var indexesProcessed = new HashSet<int>();
var indexesNew = new List<int> { 2, 4, 4, 1 };
var indexesWanted = new HashSet<int>(indexesNew);
int lastIndex = indexesNew.Max();
using ( var reader = new StreamReader(@"c:\file.txt") )
    // Read up to the highest requested line, not just indexesNew.Count lines
    for ( int index = 1; index <= lastIndex; index++ )
    {
        string line = reader.ReadLine();
        if ( line == null ) break; // the file is shorter than requested
        if ( indexesWanted.Contains(index) )
        {
            lines[index] = line;
            indexesProcessed.Add(index);
        }
    }
using ( var writer = new StreamWriter(@"c:\file-new.txt", false) )
    foreach ( int index in indexesNew )
        if ( indexesProcessed.Contains(index) )
            writer.WriteLine(lines[index]);
It reads the file, selects the desired indexes, then saves them in the desired order.
We use HashSets for the wanted and processed indexes to speed up Contains calls, since you indicate the file can be over 1GB.
The code guards against out-of-range indexes in case of mismatches between the source file and the desired indexes, which slows the process down slightly. If you are sure there will be no mismatch, you can remove all usage of indexesProcessed.
Output:
some text here
blah blah
blah blah
ogfile line 1
One way to do this would be to simply read the input file once (and store the result in a variable), and then grab the lines you need and write them to the output file.
Since the line number is 1-based and arrays are 0-based (i.e. line number 1 is array index 0), we subtract 1 from the line number when specifying the array index:
static void Main(string[] args)
{
var inputFile = @"f:\private\temp\temp.txt";
var outputFile = @"f:\private\temp\temp2.txt";
var fileLines = File.ReadAllLines(inputFile);
var linesToDisplay = new[] {2, 4, 4, 1};
// Write each specified line in linesToDisplay from fileLines to the outputFile
File.WriteAllLines(outputFile,
linesToDisplay.Select(lineNumber => fileLines[lineNumber - 1]));
GetKeyFromUser("\n\nDone! Press any key to exit...");
}
Another way to do this that should be more efficient is to only read the file up to the maximum line number (using the ReadLines method), rather than reading the whole file (using the ReadAllLines method), and save just the lines we care about in a dictionary that maps the line number to the line text:
static void Main(string[] args)
{
var inputFile = @"f:\private\temp\temp.txt";
var outputFile = @"f:\private\temp\temp2.txt";
var linesToDisplay = new[] {2, 4, 4, 1};
var maxLineNumber = linesToDisplay.Max();
var fileLines = new Dictionary<int, string>(linesToDisplay.Distinct().Count());
// Start lineNumber at 1 instead of 0
int lineNumber = 1;
// Just read up to the largest line number we need
// and save the lines we care about in our dictionary
foreach (var line in File.ReadLines(inputFile))
{
if (linesToDisplay.Contains(lineNumber))
{
fileLines[lineNumber] = line;
}
// Increment our lineNumber and break if we're done
if (++lineNumber > maxLineNumber) break;
}
// Write the output to our file
File.WriteAllLines(outputFile, linesToDisplay.Select(line => fileLines[line]));
GetKeyFromUser("\n\nDone! Press any key to exit...");
}

Do I need to create an object to sort by my "number" element?

My file numbering system just rolled over 100,000 which is causing some issues. Namely it causes programs to organize #100,000 before #99,999 because it sees the 1 first.
For example, another program would read the files in ascending order like this:
XXXX_100000_XXXXXX.file
XXXX_10001_XXXXXX.file
XXXX_99999_XXXXXX.file
But it should go:
XXXX_10001_XXXXXX.file
XXXX_99999_XXXXXX.file
XXXX_100000_XXXXXX.file
I have a function that reads all the files, sorts them by number, and puts them in a new array in order. Here's some pseudo code:
while(my directory has more files)
//this entire chunk assigns the number part of the filename to an int
string filename = my file
string num = filename[5] through filename[11]
//checks if the number is 5 digits, if yes, removes the underscore
if(num at position [11] == "_"){
num = num[5] through num[10]
}
int fileNum = num.toInteger
//now I have the number as an int
EDIT:
I just realized I could much more easily get the number by calling .Split on the filename and converting arr[1] to an int. I'll leave the old code for fun though.
Here's where I'm stuck. I want to feed these into a new array, sorted, or make the array sortable after everything is in there.
Do I need to create an object with the filename and number as elements, feed all the objects in, and then sort the array by number? I know that would work, but I can't help but thinking there's a more efficient way of doing this.
I don't need code written for me, I just need help working out the algorithm logic, or if my way is already the best way, let me know!
If you have an unsorted array of file names e.g.
string[] fileNames = ...
and a function for extracting the number from the name e.g.
public static int GetFileNumber(string filename) {
    // Names look like XXXX_100000_XXXXXX.file, so the number is the second
    // underscore-separated field, whether it has 5 or 6 digits
    string num = filename.Split('_')[1];
    return int.Parse(num);
}
then you can sort them using Array.Sort:
Array.Sort(fileNames, (f1, f2) => GetFileNumber(f1).CompareTo(GetFileNumber(f2)));
You could try using an int array for the numbers and a string array for the file names, then push them into a collection and sort the collection at the end.
Create the collection with
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/collections
And then sort the collection with
https://sankarsan.wordpress.com/2011/05/07/sorting-collections-in-c/
Sorry if I was not much of a help.
You can sort the string-list using an "Alphanumeric sort". This will put "100" after "9".
Example: https://www.dotnetperls.com/alphanumeric-sorting
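No separate object is strictly needed if you sort by a computed key. A minimal sketch with LINQ's OrderBy, assuming every name follows the XXXX_number_XXXXXX.file pattern from the question (the class name FileNameSorter is made up):

```csharp
using System.Linq;

public static class FileNameSorter
{
    // Extracts the numeric field between the first and second underscore.
    public static int GetFileNumber(string fileName)
    {
        return int.Parse(fileName.Split('_')[1]);
    }

    // Returns a new array ordered by the numeric field, not alphabetically.
    public static string[] SortByNumber(string[] fileNames)
    {
        return fileNames.OrderBy(f => GetFileNumber(f)).ToArray();
    }
}
```

As far as I know, OrderBy in LINQ to Objects extracts each key once before sorting, while a comparison delegate passed to Array.Sort re-parses the names on every comparison. For a few thousand files either is fine.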

file handling in C# .net

There is a list of things I want to do. I have a forms application.
Go to a particular line. I know how to go through the file serially, but is there any way I can jump to a particular line number?
Find out the total number of lines.
If the file is not too big, you can try the ReadAllLines.
This reads the whole file, into a string array, where every line is an element of the array.
Example:
var fileName = @"C:\MyFolder\MyFileName.txt";
var contents = System.IO.File.ReadAllLines(fileName);
Console.WriteLine("Line 10: " + contents[9]);
Console.WriteLine("Number of lines:");
Console.WriteLine(contents.Length);
But be aware: This reads in the whole file into memory.
If the file is too big:
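Open the file and create a Dictionary to store the byte offset of the start of every line. Scan the file once and record each offset. After that you can jump to any line, and you also have the number of lines.
var lineOffset = new Dictionary<int, long>();
using (var fs = System.IO.File.OpenRead(fileName)) {
    int lineNr = 1;
    lineOffset.Add(1, 0); // line 1 starts at byte 0
    // Scan raw bytes (assumes ASCII or UTF-8). A StreamReader reads ahead in
    // blocks, so its BaseStream.Position does not mark the end of the line just read.
    int b;
    while ((b = fs.ReadByte()) != -1) {
        if (b == '\n') {
            lineNr++;
            lineOffset.Add(lineNr, fs.Position);
        }
    }
    // Go to line 10 (1-based)
    fs.Position = lineOffset[10];
    using (var rdr = new System.IO.StreamReader(fs)) {
        var line10 = rdr.ReadLine();
    }
}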
This would help for your first point: jump into file line c#
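If building an offset index is more than you need, File.ReadLines enumerates the file lazily, so both tasks can be done without holding the file in memory, at the cost of reading from the start each time. A small sketch (the class and method names are invented):

```csharp
using System.IO;
using System.Linq;

public static class LineTools
{
    // Counts lines by streaming through the file once.
    public static int CountLines(string path)
    {
        return File.ReadLines(path).Count();
    }

    // Returns the line with the given 1-based number, or null if the file is shorter.
    public static string GetLine(string path, int lineNumber)
    {
        return File.ReadLines(path).Skip(lineNumber - 1).FirstOrDefault();
    }
}
```

This is fine for an occasional lookup. For many lookups in a big file, an offset index or a database is the better fit.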

How to get String Line number in Foreach loop from reading array?

The program helps users to parse a text file by grouping certain part of the text files into "sections" array.
So the question is "Are there any methods to find out the line numbers/position within the array?" The program utilizes a foreach loop to read the "sections" array.
May someone please advise on the codes? Thanks!
namespace Testing
{
class Program
{
static void Main(string[] args)
{
TextReader tr = new StreamReader(@"C:\Test\new.txt");
String SplitBy = "----------------------------------------";
// Skip 5 lines of the original text file
for(var i = 0; i < 5; i++)
{
tr.ReadLine();
}
// Read the rest of the file as one string
String fullLog = tr.ReadToEnd();
String[] sections = fullLog.Split(new string[] { SplitBy }, StringSplitOptions.None);
//String[] lines = sections.Skip(5).ToArray();
int t = 0;
// Tried using foreach (String r in sections.skip(4)) but skips sections instead of the Text lines found within each sections
foreach (String r in sections)
{
Console.WriteLine("The times are : " + t);
// Is there a way to know or get the "r" line number?
Console.WriteLine(r);
Console.WriteLine("============================================================");
t++;
}
}
}
}
A foreach loop doesn't have a loop counter of any kind. You can keep your own counter:
int number = 1;
foreach (var element in collection) {
// Do something with element and number,
number++;
}
or, perhaps easier, make use of LINQ's Enumerable.Select that gives you the current index:
var numberedElements = collection.Select((element, index) => new { element, index });
with numberedElements being a collection of anonymous type instances with properties element and index. In the case of a file you can do this:
var numberedLines = File.ReadLines(filename)
.Select((Line,Number) => new { Line, Number });
with the advantage that the whole thing is processed lazily, so it will only read the parts of the file into memory that you actually use.
As far as I know, there is not a way to know which line number you are at within the file. You'd either have to keep track of the lines yourself, or read the file again until you get to that line and count along the way.
Edit:
So you're trying to get the line number of a string inside the array after the master string's been split by the SplitBy?
If there's a specific delimiter in that sub string, you could split it again - although, this might not give you what you're looking for, except...
You're essentially back at square one.
What you could do is try splitting the section string by newline characters. This should spit it out into an array that corresponds with line numbers inside the string.
Yes, you can use a for loop instead of foreach. Also, if you know the file isn't going to be too large, you can read all of the lines into an array with:
string[] lines = File.ReadAllLines(@"C:\Test\new.txt");
Well, don't use a foreach, use a for loop
for( int i = 0; i < sections.Length; ++i )
{
    string section = sections[i];
    int lineNum = i + 1;
}
You can of course maintain a counter when using a foreach loop as well, but there is no reason to since you have the standard for loop at your disposal which is made for this sort of thing.
Of course, this won't necessarily give you the line number of the string in the text file unless you split on Environment.NewLine. You are splitting on a large number of '-' characters and I have no idea how your file is structured. You'll likely end up underestimating the line number because all of the '---' bits will be discarded.
Not as your code is written. You must track the line number for yourself. Problematic areas of your code:
You skip 5 lines at the beginning of your code, you must track this.
Using the Split method, you are potentially "removing" lines from the original collection of lines. You must find a way to know how many splits you have made, because the separators are part of the original line count.
Rather than taking the approach you have, I suggest doing the parsing and searching within a classic indexed for-loop that visits each line of the file. This probably means giving up conveniences like Split, and rather looking for markers in the file manually with e.g. IndexOf.
I've got a much simpler solution to the questions after reading through all the answers yesterday.
As each section string has a newline after every line, it is possible to split each section into a new array, and the line number within the section then follows from the array position.
The code:
foreach (String r in sections)
{
    Console.WriteLine("The times are : " + t);
    IList<String> names = r.Split('\n').ToList<String>();
}
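Putting the answers together, the original file line number of each line can be recovered by keeping a running counter across sections. This sketch assumes the separator sits on a line of its own. SectionLines and Number are made-up names, and skippedLines is the number of lines read before ReadToEnd (5 in the question).

```csharp
using System;
using System.Collections.Generic;

public static class SectionLines
{
    // Returns (file line number, section index, text) for every non-empty line.
    public static List<Tuple<int, int, string>> Number(string fullLog, string splitBy, int skippedLines)
    {
        var result = new List<Tuple<int, int, string>>();
        string[] sections = fullLog.Split(new[] { splitBy }, StringSplitOptions.None);

        int start = skippedLines + 1; // file line on which the current section starts
        for (int s = 0; s < sections.Length; s++)
        {
            string[] pieces = sections[s].Split('\n');
            for (int j = 0; j < pieces.Length; j++)
            {
                string text = pieces[j].TrimEnd('\r');
                if (text.Length > 0)
                {
                    result.Add(Tuple.Create(start + j, s, text));
                }
            }
            // The last piece of this section and the first piece of the next one
            // both lie on the separator line, so advance by Length - 1.
            start += pieces.Length - 1;
        }
        return result;
    }
}
```

For example, with 5 skipped lines and the remaining text "a\nb\n-----\nc\nd\n" split on "-----", the lines a and b come out as file lines 6 and 7 in section 0, and c and d as lines 9 and 10 in section 1, with line 8 being the separator.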
