How to read from a text file and then calculate an average - C#

I plan on reading the marks from a text file and then calculating the average mark based upon data written by previous code. I haven't been able to read the marks, though, or count how many marks there are, as BinaryReader doesn't let you use .Length.
I have tried using an array to hold each mark, but it doesn't accept each mark being an integer.
public static int CalculateAverage()
{
    int count = 0;
    int total = 0;
    float average;
    BinaryReader markFile;
    markFile = new BinaryReader(new FileStream("studentMarks.txt", FileMode.Open));
    //A loop to read each line of the file and add it to the total
    {
        //total = total + eachMark;
        //count++;
    }
    //average = total / count;
    //markFile.Close();
    //Console.WriteLine("Average mark:", average);
    return 0;
}
This is my studentMark.txt file in VS

First of all, don't use BinaryReader here; you can use StreamReader, for example.
Also, with a using statement it is not necessary to call Close().
A while-loop version is sketched at the end of this answer; using LINQ you can do it in one line:
var avg = File.ReadAllLines("file.txt").Average(a => Int32.Parse(a));
Console.WriteLine("avg = "+avg); //5
Also, with File.ReadAllLines() the file is loaded into memory and then closed, so there is no memory-leak problem. From the docs:
Opens a text file, reads all lines of the file into a string array, and then closes the file.
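For completeness, here is a minimal while-loop sketch of the same calculation with StreamReader (assuming studentMarks.txt holds one integer mark per line, and using System.IO):

public static float CalculateAverage()
{
    int count = 0;
    int total = 0;
    using (StreamReader markFile = new StreamReader("studentMarks.txt"))
    {
        string line;
        while ((line = markFile.ReadLine()) != null)
        {
            total += int.Parse(line);   // assumes each line is a single integer
            count++;
        }
    }
    return (float)total / count;        // cast so this isn't integer division
}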
Edit to add the way to read using BinaryReader.
The first thing to know is that you are reading a .txt file. Unless you created the file using BinaryWriter, BinaryReader will not work. And if you are creating a binary file, naming it .txt is not good practice.
So, assuming your file really is binary, you need to loop and read every integer, so this code should work:
var fileName = "file.txt";
int total = 0;
int count = 0;
if (File.Exists(fileName))
{
    using (BinaryReader reader = new BinaryReader(File.Open(fileName, FileMode.Open)))
    {
        // Keep reading 4-byte integers until the end of the stream.
        while (reader.BaseStream.Position < reader.BaseStream.Length)
        {
            total += reader.ReadInt32();
            count++;
        }
    }
    float average = (float)total / count; // cast to avoid integer division
    Console.WriteLine("Average = " + average); // 5
}
I've used using to ensure the file is closed at the end.
If your file only contains numbers, you only have to use ReadInt32() and it will work.
Also, if your file is not binary then, obviously, the binary reader will not work. By the way, my binary file.txt created using BinaryWriter looks like unreadable gibberish in a text editor, so I'm assuming you don't have a binary file...

Related

Read and write specific lines to a text file in C#

I have a master file called FileName with IDs of people. It is in sorted order.
I want to divide IDs into 27 chunks and copy each chunk into a different text file.
using (FileStream fs = File.Open(FileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    string line;
    int numOfLines = File.ReadAllLines(FileName).Length; // I have 73467
    int eachSubSet = (numOfLines / 27);
    var lines = File.ReadAllLines(dataFileName).Take(eachSubSet);
    File.WriteAllLines(FileName1, lines);
}
I have 27 different text files, so I want the 73467 IDs divided equally and copied over to the 27 files. So the 1st file will have ID#1 to ID#2721,
the 2nd file will have ID#2722 to ID#(2722+2721), and so on. I do not know how to automate this and run it quickly.
Thanks
HR
The simplest way would be to read lines and write them inside a loop, deciding which file will receive each line (a streaming variant is sketched after the code below).
I wouldn't recommend parallelizing this routine, since it's an IO-bound operation, but a plain copy of the lines should be pretty fast.
Note that in your sample code you called File.ReadAllLines twice, so you actually parsed your entire input file twice.
Avoiding that should speed up the process; also, you didn't actually split the file, you only wrote the first of the 27 output files.
Untested, but something along these lines should work:
const int numOfFiles = 27;
string[] lines = File.ReadAllLines(FileName);
int numOfLines = lines.Length;
int eachSubSet = numOfLines / numOfFiles;
int firstSubset = numOfLines % numOfFiles + eachSubSet;
IEnumerable<string> linesLeftToWrite = lines;
for (int index = 0; index < numOfFiles; index++)
{
    int numToTake = index == 0 ? firstSubset : eachSubSet;
    File.WriteAllLines(string.Format("{0}_{1}.txt", FileName, index), linesLeftToWrite.Take(numToTake));
    linesLeftToWrite = linesLeftToWrite.Skip(numToTake);
}
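For reference, the line-by-line streaming variant mentioned at the start of this answer might look like the following sketch (it assumes the same FileName, writes 27 indexed output files, and needs System.IO and System.Linq):

const int numOfFiles = 27;
int totalLines = File.ReadLines(FileName).Count();              // first pass: just count
int linesPerFile = (totalLines + numOfFiles - 1) / numOfFiles;  // round up

using (var reader = new StreamReader(FileName))
{
    for (int index = 0; index < numOfFiles && !reader.EndOfStream; index++)
    {
        using (var writer = new StreamWriter(string.Format("{0}_{1}.txt", FileName, index)))
        {
            // Copy the next chunk of lines into this output file.
            for (int i = 0; i < linesPerFile && !reader.EndOfStream; i++)
            {
                writer.WriteLine(reader.ReadLine());
            }
        }
    }
}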

How to use Stream.Write Method to overwrite existing text

I am using StreamWriter to write records into a file. Now I want to overwrite specific record.
string file = "c:\\......";
StreamWriter sw = new StreamWriter(new FileStream(file, FileMode.Open, FileAccess.Write));
sw.Write(...);
sw.Close();
I read somewhere here that I can use the Stream.Write method to do that, but I have no previous experience with, or knowledge of, how to deal with bytes.
public override void Write(
    byte[] array,
    int offset,
    int count
)
So how do I use this method?
I need someone to explain what exactly byte[] array and int count are in this method, plus some simple sample code that shows how to use this method to overwrite an existing record in a file.
E.g. change a record like Mark1287,11100,25| to Bill9654,22100,30|.
If you want to overwrite a particular record, you must use the FileStream.Seek method to put your stream at the right position.
Example for Seek
using System;
using System.IO;

class FStream
{
    static void Main()
    {
        const string fileName = "Test####.dat";

        // Create random data to write to the file.
        byte[] dataArray = new byte[100000];
        new Random().NextBytes(dataArray);

        using (FileStream fileStream = new FileStream(fileName, FileMode.Create))
        {
            // Write the data to the file, byte by byte.
            for (int i = 0; i < dataArray.Length; i++)
            {
                fileStream.WriteByte(dataArray[i]);
            }

            // Set the stream position to the beginning of the file.
            fileStream.Seek(0, SeekOrigin.Begin);

            // Read and verify the data.
            for (int i = 0; i < fileStream.Length; i++)
            {
                if (dataArray[i] != fileStream.ReadByte())
                {
                    Console.WriteLine("Error writing data.");
                    return;
                }
            }
            Console.WriteLine("The data was written to {0} and verified.", fileStream.Name);
        }
    }
}
After having sought to the position, use Write, whose signature is:
public override void Write(
    byte[] array,
    int offset,
    int count
)
Parameters

array
    Type: System.Byte[]
    The buffer containing data to write to the stream.

offset
    Type: System.Int32
    The zero-based byte offset in array from which to begin copying bytes to the stream.

count
    Type: System.Int32
    The maximum number of bytes to write.
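Tying this together for your example: a minimal sketch, assuming an ASCII-encoded file with a hypothetical name records.txt, a known byte offset of the old record, and a replacement of exactly the same length (Mark1287,11100,25| and Bill9654,22100,30| are both 18 bytes):

using System.IO;
using System.Text;

long recordOffset = 0; // assumption: byte position where "Mark1287,11100,25|" starts
byte[] newRecord = Encoding.ASCII.GetBytes("Bill9654,22100,30|");

using (FileStream fs = new FileStream("records.txt", FileMode.Open, FileAccess.Write))
{
    fs.Seek(recordOffset, SeekOrigin.Begin);   // position the stream at the old record
    fs.Write(newRecord, 0, newRecord.Length);  // array = newRecord, offset = 0, count = all its bytes
}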
And most important: always consider the documentation when unsure!
So... in short:
Your file is text-based (but could become binary-based).
Your records have various sizes.
This means that, without analyzing your file, there is no way to know where a given record starts and ends. If you want to overwrite a record, the new record may be larger than the old one, so all records further on in the file would have to be moved.
This requires a complex management system. Options could be:
When your application starts, it analyzes your file and stores in memory the start and length of each record.
There is a separate (binary) file which holds, per record, the start and length of that record. This costs an additional 8 bytes per record (an Int32 each for start and length; perhaps you want to consider Int64).
If you want to rewrite a record, you can use this record/start/length system to know where to start writing. But before you do that, you have to make room, moving all records after the one being rewritten. Of course, you then have to update your management system with the new positions and lengths.
Another option is to do as a database does: every record consists of fixed-width columns. Even text columns have a maximum length. Because of this, you can calculate very easily where each record starts in the file. For example: if each record has a size of 200 bytes, then record #0 starts at position 0, the next record at position 200, the one after that at 400, and so on. You do not have to move records when a record is rewritten.
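A small sketch of that fixed-width idea (the 200-byte record size, the record index, and the file name records.dat are assumptions for illustration; needs System.IO and System.Text):

const int recordSize = 200;   // assumed fixed record width
int recordIndex = 3;          // the record to rewrite

// Pad to exactly recordSize so neighbouring records are untouched.
byte[] bytes = Encoding.ASCII.GetBytes("Bill9654,22100,30|".PadRight(recordSize));

using (FileStream fs = new FileStream("records.dat", FileMode.Open, FileAccess.Write))
{
    fs.Seek((long)recordIndex * recordSize, SeekOrigin.Begin);
    fs.Write(bytes, 0, recordSize);
}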
Another suggestion: create a management system like the way memory is managed. Once a record is written, it stays there. The management system keeps a list of allocated portions and free portions of the file. If a new record is written, a free and fitting portion is found by the management system and the record is written at that position (optionally leaving a smaller free portion). When a record is deleted, that space is freed up. When you rewrite a record, you actually delete the old record and write a new record (possibly at a totally different location).
My last suggestion: Use a database :)

How to generate string of a certain length to insert into a file to meet a file size criteria?

I have a requirement to test some load issues with regards to file size. I have a windows application written in C# which will automatically generate the files. I know the size of each file, ex. 100KB, and how many files to generate. What I need help with is how to generate a string less than or equal to the required file size.
pseudo code:
long fileSizeInKB = (1024 * 100); //100KB
int numberOfFiles = 5;
for (var i = 0; i < numberOfFiles - 1; i++) {
    var dataSize = fileSizeInKB;
    var buffer = new byte[dataSize];
    using (var fs = new FileStream(File, FileMode.Create, FileAccess.Write)) {
    }
}
You can always use the string constructor which takes a char and the number of times you want that character repeated:
string myString = new string('*', 5000);
This gives you a string of 5000 stars - tweak to your needs.
The easiest way would be the following code:
var content = new string('A', fileSizeInKB);
Now you've got a string with as many A's as required.
To fill it with Lorem Ipsum or some other repeating string build something like the following pseudocode:
string contentString = "Lorem Ipsum...";
for (int i = 0; i < fileSizeInKB / contentString.Length; i++)
    // write contentString to file
if (fileSizeInKB % contentString.Length > 0)
    // write remaining substring of contentString to file
Edit: If you're saving in Unicode you may need to halve the character count, because Unicode (UTF-16) uses two bytes per character.
There are so many variations on how you can do this. One would be: fill the file with a bunch of chars. You need 100KB? No problem: 100 * 1024 * 8 = 819,200 bits. A single char is 16 bits. 819,200 / 16 = 51,200. You need to stick 51,200 chars into a file. But consider that a file may have additional header/metadata, so you may need to account for that and decrease the number of chars written to the file.
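Putting those pieces together, a minimal sketch that generates files of an exact byte size, assuming ASCII encoding so that one char equals one byte (the file names are illustrative):

using System.IO;
using System.Text;

long targetSizeBytes = 100 * 1024;  // 100 KB
int numberOfFiles = 5;

for (int i = 0; i < numberOfFiles; i++)
{
    // One char per byte under ASCII, so the file lands at exactly targetSizeBytes.
    string content = new string('*', (int)targetSizeBytes);
    File.WriteAllText(string.Format("testfile_{0}.txt", i), content, Encoding.ASCII);
}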
As a partial answer to your question I recently created a portable WPF app that easily creates 'junk' files of almost any size: https://github.com/webmooch/FileCreator

How to read a large (1 GB) txt file in .NET?

I have a 1 GB text file which I need to read line by line. What is the best and fastest way to do this?
private void ReadTxtFile()
{
    string filePath = string.Empty;
    filePath = openFileDialog1.FileName;
    if (!string.IsNullOrEmpty(filePath))  // note: the check needs to be negated
    {
        using (StreamReader sr = new StreamReader(filePath))
        {
            String line;
            while ((line = sr.ReadLine()) != null)
            {
                FormatData(line);
            }
        }
    }
}
In FormatData() I check whether the starting word of the line matches a given word and, based on that, increment an integer variable.
void FormatData(string line)
{
    if (line.StartsWith(word))
    {
        globalIntVariable++;
    }
}
If you are using .NET 4.0, try MemoryMappedFile, which is a class designed for this scenario.
Otherwise you can use StreamReader.ReadLine.
Using StreamReader is probably the way to go, since you don't want the whole file in memory at once. MemoryMappedFile is more for random access than sequential reading (sequential stream reading is around ten times as fast as memory mapping, and memory mapping is around ten times as fast for random access).
You might also try creating your StreamReader from a FileStream with FileOptions set to SequentialScan (see the FileOptions enumeration), but I doubt it will make much of a difference.
There are, however, ways to make your example more efficient, since you do your formatting in the same loop as the reading. You're wasting clock cycles, so if you want even more performance it would be better to use a multithreaded asynchronous solution, where one thread reads data and another formats it as it becomes available. Check out BlockingCollection, which might fit your needs (a sketch appears at the end of this answer):
Blocking Collection and the Producer-Consumer Problem
If you want the fastest possible performance, in my experience the only way is to read in as large a chunk of binary data sequentially and deserialize it into text in parallel, but the code starts to get complicated at that point.
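For illustration, here is a minimal sketch of that producer-consumer pattern with BlockingCollection (the file path and search word are placeholders, not taken from the question):

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class ProducerConsumerCount
{
    static void Main()
    {
        var lines = new BlockingCollection<string>(boundedCapacity: 10000);
        int count = 0;

        // Producer: reads lines and hands them off as they are read.
        var reader = Task.Run(() =>
        {
            foreach (var line in File.ReadLines(@"big.txt")) // placeholder path
                lines.Add(line);
            lines.CompleteAdding();
        });

        // Consumer: processes lines as they become available.
        var processor = Task.Run(() =>
        {
            foreach (var line in lines.GetConsumingEnumerable())
                if (line.StartsWith("word")) // placeholder word
                    count++;
        });

        Task.WaitAll(reader, processor);
        Console.WriteLine(count);
    }
}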
You can use LINQ:
int result = File.ReadLines(filePath).Count(line => line.StartsWith(word));
File.ReadLines returns an IEnumerable<String> that lazily reads each line from the file without loading the whole file into memory.
Enumerable.Count counts the lines that start with the word.
If you are calling this from a UI thread, use a BackgroundWorker.
Probably read it line by line.
You should not try to force the whole file into memory by reading to the end and then processing.
StreamReader.ReadLine should work fine. Let the framework choose the buffering, unless you know by profiling you can do better.
TextReader.ReadLine()
I was facing the same problem on our production server at Agenty, where we see large files (sometimes 10–25 GB tab-delimited (\t) txt files). After lots of testing and research, I found the best way is to read large files in small chunks with a for/foreach loop, setting offset and limit logic with File.ReadLines().
int TotalRows = File.ReadLines(Path).Count(); // Count the number of rows in the file, with lazy loading
int Limit = 100000; // 100000 rows per batch
for (int Offset = 0; Offset < TotalRows; Offset += Limit)
{
    var table = Path.FileToTable(heading: true, delimiter: '\t', offset: Offset, limit: Limit);
    // Do all your processing here with limit and offset, and save to drive in append mode.
    // Append mode writes the output to the same file for each processed batch.
    table.TableToFile(@"C:\output.txt");
}
See the complete code in my GitHub library: https://github.com/Agenty/FileReader/
Full disclosure: I work for Agenty, the company that owns this library and website.
My file is over 13 GB:
You can use my class:
public static void Read(int length)
{
    StringBuilder resultAsString = new StringBuilder();

    using (MemoryMappedFile memoryMappedFile = MemoryMappedFile.CreateFromFile(@"D:\_Profession\Projects\Parto\HotelDataManagement\_Document\Expedia_Rapid.jsonl\Expedia_Rapi.json"))
    using (MemoryMappedViewStream memoryMappedViewStream = memoryMappedFile.CreateViewStream(0, length))
    {
        for (int i = 0; i < length; i++)
        {
            // Reads a byte and advances the position by one, or returns -1 at the end of the stream.
            int result = memoryMappedViewStream.ReadByte();
            if (result == -1)
            {
                break;
            }
            char letter = (char)result;
            resultAsString.Append(letter);
        }
    }
}
This code reads the text of the file from the start up to the length that you pass to the Read(int length) method and fills the resultAsString variable.
I'd read the file 10,000 bytes at a time. Then I'd analyse those 10,000 bytes and chop them into lines and feed them to the FormatData function.
Bonus points for splitting the reading and the line analysis onto multiple threads.
I'd definitely use a StringBuilder to collect all strings and might build a string buffer to keep about 100 strings in memory all the time.

Writing text to the middle of a file

Is there a way I can write text to a file from a certain point in the file?
For example, I open a file of 10 lines of text but I want to write a line of text to the 5th line.
I guess one way is to get the lines of text in the file back as an array using the ReadAllLines method, and then add a line at a certain index in the array.
But there is a distinction in that some collections can only add members at the end and some at any position. To double-check: an array would always allow me to insert a value at any index, right? (I'm sure one of my books said otherwise.)
Also, is there a better way of going about this?
Thanks
Oh, sigh. Look up the "master file update" algorithm.
Here's pseudocode (a C# sketch follows it):
open master file for reading
count := 0
while not EOF do
    read line from master file into buffer
    write line to output file
    count := count + 1
    if count = 5 then
        write added line to output file
    fi
od
rename output file to replace input file
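A C# sketch of that algorithm, with illustrative file names and inserted content:

// Master file update: copy every line to a temp file, inserting the
// new line after line 5, then replace the original file.
using System.IO;

string master = "master.txt";        // illustrative name
string temp = master + ".tmp";
int count = 0;

using (var reader = new StreamReader(master))
using (var writer = new StreamWriter(temp))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        writer.WriteLine(line);
        count++;
        if (count == 5)
        {
            writer.WriteLine("the added line");   // illustrative content
        }
    }
}

File.Delete(master);
File.Move(temp, master);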
If you're reading/writing small files (say, under 20 megabytes--yes I consider 20M "small") and not writing them that often (as in, not several times a second) then just read/write the whole thing.
Serial files like text documents aren't designed for random access. That's what databases are for.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public class Class1
{
    static void Main()
    {
        var beatles = new LinkedList<string>();
        beatles.AddFirst("John");
        LinkedListNode<string> nextBeatles = beatles.AddAfter(beatles.First, "Paul");
        nextBeatles = beatles.AddAfter(nextBeatles, "George");
        nextBeatles = beatles.AddAfter(nextBeatles, "Ringo");

        // change the 1 to your 5th line
        LinkedListNode<string> paulsNode = beatles.NodeAt(1);
        LinkedListNode<string> recentHindrance = beatles.AddBefore(paulsNode, "Yoko");
        recentHindrance = beatles.AddBefore(recentHindrance, "Aunt Mimi");
        beatles.AddBefore(recentHindrance, "Father Jim");

        Console.WriteLine("{0}", string.Join("\n", beatles.ToArray()));
        Console.ReadLine();
    }
}

public static class Helper
{
    public static LinkedListNode<T> NodeAt<T>(this LinkedList<T> l, int index)
    {
        LinkedListNode<T> x = l.First;
        while ((index--) > 0) x = x.Next;
        return x;
    }
}
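To connect this back to the file question, you could load the lines into a LinkedList, insert at the desired node using the NodeAt helper above, and write the list back out. A sketch, with an illustrative file name:

// Sketch: insert a line as the new 5th line of a text file via a LinkedList.
var lines = new LinkedList<string>(File.ReadAllLines("input.txt")); // illustrative name
LinkedListNode<string> node = lines.NodeAt(4);        // zero-based: the current 5th line
lines.AddBefore(node, "the new 5th line");            // illustrative content
File.WriteAllLines("input.txt", lines);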
