Ignore certain lines in a text file (C# Streamreader) - c#

I'm trying to work out a way of removing records in a program I'm writing. I have a text file with all the customer data spread over a set of lines; I read these lines in one at a time and store them in a List.
When writing, I simply append to the file. For deleting, I had the idea of adding a character such as * or # to the front of lines that are no longer needed, but I am unsure how to do this.
Below is how I currently read the data in. Thanks in advance.
StreamReader dataIn = null;
CustomerClass holdcus; //holdcus and holdacc are used as "holding pens" for the next customer/account
Accounts holdacc;
bool moreData = false;
string[] cusdata = new string[13]; //holds customer data
string[] accdata = new string[8]; //holds account data
if (fileIntegCheck(inputDataFile, ref dataIn))
{
moreData = getCustomer(dataIn, cusdata);
while (moreData == true)
{
holdcus = new CustomerClass(cusdata[0], cusdata[1], cusdata[2], cusdata[3], cusdata[4], cusdata[5], cusdata[6], cusdata[7], cusdata[8], cusdata[9], cusdata[10], cusdata[11], cusdata[12]);
customers.Add(holdcus);
int x = Convert.ToInt32(cusdata[12]);
for (int i = 0; i < x; i++) //Takes the ID number for the last customer, as uses it to set the first value of the following accounts
{ //this is done as a key to which accounts map to which customers
moreData = getAccount(dataIn, accdata);
accdata[0] = cusdata[0];
holdacc = new Accounts(accdata[0], accdata[1], accdata[2], accdata[3], accdata[4], accdata[5], accdata[6], accdata[7]);
accounts.Add(holdacc);
}
moreData = getCustomer(dataIn, cusdata);
}
}
if (dataIn != null) dataIn.Close(); //moreData is a bool, so the null check belongs on the reader

Since you're using string arrays, you can just do cusdata[index] = "#" + cusdata[index] to prepend the marker to the line. However, if your question is how to delete the record from the file, why not skip that step and simply leave out the line you want deleted when writing the file?
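The "mark deleted lines with #" idea can also be read as a filter applied while reading. The following is only a minimal sketch, assuming a line counts as deleted when its first character is '#'. The names RecordFilter, IsDeleted, ActiveLines and ReadActiveLines are hypothetical and not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class RecordFilter
{
    // Assumption: a record is "deleted" when its line starts with this marker.
    public const char DeletedMarker = '#';

    public static bool IsDeleted(string line)
    {
        return line.Length > 0 && line[0] == DeletedMarker;
    }

    // Returns only the lines that are not marked as deleted.
    public static List<string> ActiveLines(IEnumerable<string> lines)
    {
        return lines.Where(l => !IsDeleted(l)).ToList();
    }

    // Reads the file lazily and drops marked lines before they reach the caller.
    public static List<string> ReadActiveLines(string path)
    {
        return ActiveLines(File.ReadLines(path));
    }
}
```

Because each customer in the question appears to span several lines, every line of a deleted record would need the marker for this filter to work, or the check would have to move into the code that reads a whole customer.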

Here is a small read/write sample that should suit your needs. If it doesn't, let me know in the comments.
class Program
{
static readonly string filePath = "c:\\test.txt";
static void Main(string[] args)
{
// Read your file
List<string> lines = ReadLines();
//Create your remove logic here ..
lines = lines.Where(x => !x.Contains("Julia Roberts")).ToList();
// Rewrite the file
WriteLines(lines);
}
static List<string> ReadLines()
{
List<string> lines = new List<string>();
using (StreamReader sr = new StreamReader(new FileStream(filePath, FileMode.Open)))
{
while (!sr.EndOfStream)
{
string buffer = sr.ReadLine();
lines.Add(buffer);
// Just to show you the results
Console.WriteLine(buffer);
}
}
return lines;
}
static void WriteLines(List<string> lines)
{
using (StreamWriter sw = new StreamWriter(new FileStream(filePath, FileMode.Create)))
{
foreach (var line in lines)
{
sw.WriteLine(line);
}
}
}
}
I used the following data sample for this:
Matt Damon 100 222
Julia Roberts 125 152
Robert Downey Jr. 150 402
Tom Hanks 55 932

Related

Add two lines from csv file to array(s)

I have a csv file with the following data:
500000,0.005,6000
690000,0.003,5200
I need to add each line as a separate array. So 500000, 0.005, 6000 would be array1. How would I do this?
Currently my code adds each column into one element.
For example data[0] is showing 500000
690000
static void ReadFromFile(string filePath)
{
try
{
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
using (StreamReader sr = new StreamReader(filePath))
{
string line;
// Read and display lines from the file until the end of
// the file is reached.
while ((line = sr.ReadLine()) != null)
{
string[] data = line.Split(',');
Console.WriteLine(data[0] + " " + data[1]);
}
}
}
catch (Exception e)
{
// Let the user know what went wrong.
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
}
Using the limited data set you've provided...
const string test = @"500000,0.005,6000
690000,0.003,5200";
var result = test.Split('\n')
.Select(x=> x.Split(',')
.Select(y => Convert.ToDecimal(y))
.ToArray()
)
.ToArray();
foreach (var element in result)
{
Console.WriteLine($"{element[0]}, {element[1]}, {element[2]}");
}
Can it be done without LINQ? Yes, but it's messy...
const string test = @"500000,0.005,6000
690000,0.003,5200";
List<decimal[]> resultList = new List<decimal[]>();
string[] lines = test.Split('\n');
foreach (var line in lines)
{
List<decimal> decimalValueList = new List<decimal>();
string[] splitValuesByComma = line.Split(',');
foreach (string value in splitValuesByComma)
{
decimal convertedValue = Convert.ToDecimal(value);
decimalValueList.Add(convertedValue);
}
decimal[] decimalValueArray = decimalValueList.ToArray();
resultList.Add(decimalValueArray);
}
decimal[][] resultArray = resultList.ToArray();
That gives exactly the same output as the first example.
If you can use a List<string[]>, you do not have to worry about the array length.
In the following example, the variable lines will be a list of arrays, like:
["500000", "0.005", "6000"]
["690000", "0.003", "5200"]
static void ReadFromFile(string filePath)
{
try
{
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
using (StreamReader sr = new StreamReader(filePath))
{
List<string[]> lines = new List<string[]>();
string line;
// Read and display lines from the file until the end of
// the file is reached.
while ((line = sr.ReadLine()) != null)
{
string[] splittedLine = line.Split(',');
lines.Add(splittedLine);
}
}
}
catch (Exception e)
{
// Let the user know what went wrong.
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
}
While the other answers use Split, I will take a more formal, specification-driven approach.
You have CSV values in a file. Find a name for the object stored in the CSV, name every column, and give each one a type.
Define the default value of each field. Define what happens for a missing column or a malformed field. Is there a header?
Now that you know what you have, define what you want. Same again: object name -> property -> type.
Believe it or not, simply defining your input and output solves your issue.
Use CsvHelper to simplify your code.
CSV File Definition:
public class CsvItem_WithARealName
{
public int data1;
public decimal data2;
public int goodVariableNames;
}
public class CsvItemMapper : ClassMap<CsvItem_WithARealName>
{
public CsvItemMapper()
{ //mapping based on index. cause file has no header.
Map(m => m.data1).Index(0);
Map(m => m.data2).Index(1);
Map(m => m.goodVariableNames).Index(2);
}
}
A CSV reader method: point it at a document and it will give you the CSV items.
Here we have some configuration: no header, and InvariantCulture for decimal conversion.
private IEnumerable<CsvItem_WithARealName> GetCsvItems(string filePath)
{
using (var fileReader = File.OpenText(filePath))
using (var csvReader = new CsvHelper.CsvReader(fileReader))
{
csvReader.Configuration.CultureInfo = CultureInfo.InvariantCulture;
csvReader.Configuration.HasHeaderRecord = false;
csvReader.Configuration.RegisterClassMap<CsvItemMapper>();
while (csvReader.Read())
{
var record = csvReader.GetRecord<CsvItem_WithARealName>();
yield return record;
}
}
}
Usage :
var filename = "csvExemple.txt";
var items = GetCsvItems(filename);

C# Finding the mean of every xth value in a file using streamreader

I'm relatively new to C# and I am trying to write a program that finds the mean of every xth value in a file using StreamReader (for example, the mean of every fifth value in that file).
I have written some code that reads the file line by line and splits each line on commas. This works fine when I try to read each specific value.
However, I'm struggling to think of a way to pick out every xth value, such as every 4th one, and then find the mean of these and output it in the same program.
static void Main(string[] args)
{
using (var reader = new StreamReader(@"file"))
{
List<string> list = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
var values = line.Split(',');
list.Add(values[0]);
}
}
}
Any suggestions or help would be greatly appreciated.
Try it like this:
static void Main()
{
using (var reader = new StreamReader(@"file"))
{
int lineNumber = 4;
bool streamEnded = false;
List<string> list = new List<string>();
while (!streamEnded)
{
var line = ReadSpecificLine(reader, lineNumber,out streamEnded);
if (string.IsNullOrEmpty(line))
{
continue;
}
var values = line.Split(',');
list.Add(values[0]);
}
}
}
public static string ReadSpecificLine(StreamReader sr, int lineNumber,out bool streamEnded)
{
streamEnded = false;
for (int i = 1; i < lineNumber; i++)
{
if (sr.EndOfStream)
{
streamEnded = true;
return "";
}
sr.ReadLine();
}
if (sr.EndOfStream)
{
streamEnded = true;
return "";
}
return sr.ReadLine();
}
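The answer above collects every 4th line but stops before computing the mean. As a rough sketch of that last step, assuming the value of interest is the first comma-separated field of each selected line and that it parses as a number in the invariant culture (NthValueStats, EveryNth and MeanOfEveryNth are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class NthValueStats
{
    // Picks the n-th, 2n-th, 3n-th ... item (1-based) from the sequence.
    public static IEnumerable<T> EveryNth<T>(IEnumerable<T> source, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        return source.Where((item, index) => (index + 1) % n == 0);
    }

    // Parses the first comma-separated field of each selected line and averages them.
    // Returns null when no line was selected.
    public static double? MeanOfEveryNth(IEnumerable<string> lines, int n)
    {
        List<double> values = EveryNth(lines, n)
            .Select(line => double.Parse(line.Split(',')[0], CultureInfo.InvariantCulture))
            .ToList();
        return values.Count == 0 ? (double?)null : values.Average();
    }
}
```

With a file, this could be called as NthValueStats.MeanOfEveryNth(File.ReadLines("file"), 4), which streams the lines instead of loading them all first.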

Create list of arrays from text file in C#

I have a number of text files that all follow the same content format:
"Title section","Version of the app"
10
"<thing 1>","<thing 2>","<thing 3>","<thing 4>","<thing 5>","<thing 6>","<thing 7>","<thing 8>","<thing 9>","<thing 10>"
'Where:
' first line never changes, it always contains exactly these 2 items
' second line is a count of how many "line 3s" there are
' line 3 contains a command to execute and (up to) 9 parameters
' - there will always be 10 quote-delimited entries, even if some are blank
' - there can be N number of entries (in this example, there will be 10 commands to read)
I am reading each of these text files in, using StreamReader, and want to set each file up in its own class.
public class MyTextFile{
public string[] HeaderLine { get; set; }
public int ItemCount { get; set; }
public List<MyCommandLine> Commands { get; set; }
}
public class MyCommandLine{
public string[] MyCommand { get; set; }
}
private void btnGetMyFilesiles_Click(object sender, EventArgs e){
DirectoryInfo myFolder = new DirectoryInfo(@"C:\FileSpot");
FileInfo[] myFiles = myFolder.GetFiles("*.ses");
string str = "";
string line = "";
foreach(FileInfo file in myFiles){
str = str + ", " + file.Name;
// Read the file line by line; the using block closes the reader.
using (System.IO.StreamReader readingFile = new System.IO.StreamReader(file.FullName)){
MyTextFile myFileObject = new MyTextFile();
while ((line = readingFile.ReadLine()) != null){
// create the new MyTextFile here
}
}
}
}
The objective is to determine what the actual command being called is (""), and if any of the remaining parameters point to a pre-existing file, determine if that file exists. My problem is that I can't figure out how to read N number of "line 3" into their own objects and append these objects to the MyTextFile object. I'm 99% certain that I've led myself astray in reading each file line-by-line, but I don't know how to get out of it.
So, addressing the specific issue of getting N number of line 3 items into your class, you could do something like this (obviously you can make some changes so it is more specific to your application).
public class MyTextFile
{
public List<Array> Commands = new List<Array>();
public void EnumerateCommands()
{
for (int i = 0; i < Commands.Count; i++)
{
foreach (var c in Commands[i])
Console.Write(c + " ");
Console.WriteLine();
}
}
}
class Program
{
static void Main(string[] args)
{
string line = "";
int count = 0;
MyTextFile tf = new MyTextFile();
using (StreamReader sr = new StreamReader(@"path"))
{
while ((line = sr.ReadLine()) != null)
{
count += 1;
if (count >= 3)
{
object[] Arguments = line.Split(',');
tf.Commands.Add(Arguments);
}
}
}
tf.EnumerateCommands();
Console.ReadLine();
}
}
At least now you have a list of commands within your 'MyTextFile' class that you can enumerate through and do stuff with.
** I added the EnumerateCommands method so that you could actually see the list is storing the line items. The code should run in a Console application with the appropriate 'using' statements.
Hope this helps.
If all of the items are separated with commas, you can do something like this:
int length = Convert.ToInt32(reader.ReadLine());
string line = reader.ReadLine();
IEnumerable<string> things = line.Split(',').Select(thing => thing.Replace("\"", string.Empty)).Take(length);
Take indicates how many things to take from the line.
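Putting the file layout from the question together (header line, count line, then N command lines), a parser could look roughly like the sketch below. It assumes that fields never contain a comma inside the quotes. ParsedFile, CommandFileParser and SplitQuoted are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class ParsedFile
{
    public string[] HeaderLine { get; set; }
    public int ItemCount { get; set; }
    public List<string[]> Commands { get; } = new List<string[]>();
}

public static class CommandFileParser
{
    // Splits on commas and strips the surrounding quotes.
    // Assumption: fields never contain a comma inside the quotes.
    public static string[] SplitQuoted(string line)
    {
        return line.Split(',').Select(f => f.Trim().Trim('"')).ToArray();
    }

    public static ParsedFile Parse(TextReader reader)
    {
        var result = new ParsedFile();
        result.HeaderLine = SplitQuoted(reader.ReadLine() ?? string.Empty);
        result.ItemCount = int.Parse(reader.ReadLine() ?? "0");

        // Read exactly ItemCount command lines, stopping early if the file is short.
        for (int i = 0; i < result.ItemCount; i++)
        {
            string line = reader.ReadLine();
            if (line == null) break;
            result.Commands.Add(SplitQuoted(line));
        }
        return result;
    }
}
```

For a file on disk, Parse can be given a StreamReader, for example inside using (var reader = new StreamReader(file.FullName)). In each entry of Commands, element 0 would be the command and the remaining elements its parameters.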

How to avoid c# File.ReadLines First() locking file

I do not want to read the whole file at any point. I know there are answers on that question; I want to read only the first or last line.
I know that my code locks the file that it's reading for two reasons 1) The application that writes to the file crashes intermittently when I run my little app with this code but it never crashes when I am not running this code! 2) There are a few articles that will tell you that File.ReadLines locks the file.
There are some similar questions but that answer seems to involve reading the whole file which is slow for large files and therefore not what I want to do. My requirement to only read the last line most of the time is also unique from what I have read about.
I need to know how to read the first line (header row) and the last line (latest row). I do not want to read all lines at any point in my code because this file can become huge and reading the entire file will become slow.
I know that
line = File.ReadLines(fullFilename).First().Replace("\"", "");
... is the same as ...
FileStream fs = new FileStream(@fullFilename, FileMode.Open, FileAccess.Read, FileShare.Read);
My question is: how can I repeatedly read the first and last lines of a file that may be being written to by another application, without locking it in any way? I have no control over the application that is writing to the file. It is a data log which can be appended to at any time. The reason I am listening in this way is that this log can be appended to for days on end. I want to see the latest data in this log in my own C# programme without waiting for the log to finish being written to.
My code to call the reading / listening function ...
//Start Listening to the "data log"
private void btnDeconstructCSVFile_Click(object sender, EventArgs e)
{
MySandbox.CopyCSVDataFromLogFile copyCSVDataFromLogFile = new MySandbox.CopyCSVDataFromLogFile();
copyCSVDataFromLogFile.checkForLogData();
}
My class which does the listening. For now it simply adds the data to two generic lists ...
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using MySandbox.Classes;
using System.IO;
namespace MySandbox
{
public class CopyCSVDataFromLogFile
{
static private List<LogRowData> listMSDataRows = new List<LogRowData>();
static String fullFilename = string.Empty;
static LogRowData previousLineLogRowList = new LogRowData();
static LogRowData logRowList = new LogRowData();
static LogRowData logHeaderRowList = new LogRowData();
static Boolean checking = false;
public void checkForLogData()
{
//Initialise
string[] logHeaderArray = new string[] { };
string[] badDataRowsArray = new string[] { };
//Get the latest full filename (file with new data)
//Assumption: only 1 file is written to at a time in this directory.
String directory = "C:\\TestDir\\";
string pattern = "*.csv";
var dirInfo = new DirectoryInfo(directory);
var file = (from f in dirInfo.GetFiles(pattern) orderby f.LastWriteTime descending select f).First();
fullFilename = directory + file.ToString(); //This is the full filepath and name of the latest file in the directory!
if (logHeaderArray.Length == 0)
{
//Populate the Header Row
logHeaderRowList = getRow(fullFilename, true);
}
LogRowData tempLogRowList = new LogRowData();
if (!checking)
{
//Read the latest data in an asynchronous loop
callDataProcess();
}
}
private async void callDataProcess()
{
checking = true; //Begin checking
await checkForNewDataAndSaveIfFound();
}
private static Task checkForNewDataAndSaveIfFound()
{
return Task.Run(() => //Call the async "Task"
{
while (checking) //Loop (asynchronously)
{
LogRowData tempLogRowList = new LogRowData();
if (logHeaderRowList.ValueList.Count == 0)
{
//Populate the Header row
logHeaderRowList = getRow(fullFilename, true);
}
else
{
//Populate Data row
tempLogRowList = getRow(fullFilename, false);
if ((!Enumerable.SequenceEqual(tempLogRowList.ValueList, previousLineLogRowList.ValueList)) &&
(!Enumerable.SequenceEqual(tempLogRowList.ValueList, logHeaderRowList.ValueList)))
{
logRowList = getRow(fullFilename, false);
listMSDataRows.Add(logRowList);
previousLineLogRowList = logRowList;
}
}
//System.Threading.Thread.Sleep(10); //Wait for next row.
}
});
}
private static LogRowData getRow(string fullFilename, bool isHeader)
{
string line;
string[] logDataArray = new string[] { };
LogRowData logRowListResult = new LogRowData();
try
{
if (isHeader)
{
//Assign first (header) row data.
//Works but seems to block writing to the file!
line = File.ReadLines(fullFilename).First().Replace("\"", "");
}
else
{
//Assign data as last row (default behaviour).
line = File.ReadLines(fullFilename).Last().Replace("\"", "");
}
logDataArray = line.Split(',');
//Copy Array to Generics List and remove last value if it's empty.
for (int i = 0; i < logDataArray.Length; i++)
{
if (i < logDataArray.Length)
{
if (i < logDataArray.Length - 1)
{
//Value is not at the end, from observation, these always have a value (even if it's zero) and so we'll store the value.
logRowListResult.ValueList.Add(logDataArray[i]);
}
else
{
//This is the last value
if (logDataArray[i].Replace("\"", "").Trim().Length > 0)
{
//In this case, the last value is not empty, store it as normal.
logRowListResult.ValueList.Add(logDataArray[i]);
}
else { /*The last value is empty, e.g. "123,456,"; the final comma denotes another field but this field is empty so we will ignore it now. */ }
}
}
}
}
catch (Exception ex)
{
if (ex.Message == "Sequence contains no elements")
{ /*Empty file, no problem. The code will safely loop and then will pick up the header when it appears.*/ }
else
{
//TODO: catch this error properly
Int32 problemID = 10; //Unknown ERROR.
}
}
return logRowListResult;
}
}
}
I found the answer in a combination of other questions: one answer explaining how to read from the end of a file, which I adapted so that it reads only one line from the end, and another explaining how to read the entire file without locking it (I did not want to read the entire file, but the non-locking part was useful). So now you can read the last line of the file (if it contains end-of-line characters) without locking it. For other end-of-line delimiters, just replace my 10 and 13 with your end-of-line character bytes.
Add the method below to public class CopyCSVDataFromLogFile:
private static string Reverse(string str)
{
char[] arr = new char[str.Length];
for (int i = 0; i < str.Length; i++)
arr[i] = str[str.Length - 1 - i];
return new string(arr);
}
and replace this line ...
line = File.ReadLines(fullFilename).Last().Replace("\"", "");
with this code block ...
Int32 endOfLineCharacterCount = 0;
Int32 previousCharByte = 0;
Int32 currentCharByte = 0;
//Read the file, from the end, for 1 line, allowing other programmes to access it for read and write!
using (FileStream reader = new FileStream(fullFilename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 0x1000, FileOptions.SequentialScan))
{
int i = 0;
StringBuilder lineBuffer = new StringBuilder();
int byteRead;
while ((-i < reader.Length) /*Belt and braces: if there were no end of line characters, reading beyond the file would give a catastrophic error here (to be avoided thus).*/
&& (endOfLineCharacterCount < 2)/*Exit Condition*/)
{
reader.Seek(--i, SeekOrigin.End);
byteRead = reader.ReadByte();
currentCharByte = byteRead;
//Exit condition: the first 2 characters we read (reading backwards remember) were end of line ().
//So when we read the second end of line, we have read 1 whole line (the last line in the file)
//and we must exit now.
if (currentCharByte == 13 && previousCharByte == 10)
{
endOfLineCharacterCount++;
}
if (byteRead == 10 && lineBuffer.Length > 0)
{
line += Reverse(lineBuffer.ToString());
lineBuffer.Remove(0, lineBuffer.Length);
}
lineBuffer.Append((char)byteRead);
previousCharByte = byteRead;
}
reader.Close();
}
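The same idea can be written more compactly. The sketch below is one possible variant, not the code from the answer: it opens the file with FileShare.ReadWrite, reads the header with a StreamReader, and finds the last line by scanning backwards for byte 10 (LF). It assumes an ASCII or UTF-8 file, and it only helps if the writing application itself permits shared reads. SharedLogReader, ReadFirstLine and ReadLastLine are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class SharedLogReader
{
    // Opens the file so that other processes may keep reading and writing it.
    private static FileStream OpenShared(string path)
    {
        return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
    }

    // Reads only the first line (the header row); returns null for an empty file.
    public static string ReadFirstLine(string path)
    {
        using (var stream = OpenShared(path))
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadLine();
        }
    }

    // Reads only the last non-empty line by scanning backwards from the end.
    // Assumption: ASCII or UTF-8, where byte 10 (LF) never occurs inside a character.
    public static string ReadLastLine(string path)
    {
        using (var stream = OpenShared(path))
        {
            long end = stream.Length;
            // Skip trailing CR/LF bytes so a final newline does not yield an empty line.
            while (end > 0)
            {
                stream.Seek(end - 1, SeekOrigin.Begin);
                int b = stream.ReadByte();
                if (b != 10 && b != 13) break;
                end--;
            }
            // Walk back to the byte just after the previous LF (or to the file start).
            long start = end;
            while (start > 0)
            {
                stream.Seek(start - 1, SeekOrigin.Begin);
                if (stream.ReadByte() == 10) break;
                start--;
            }
            var buffer = new byte[end - start];
            stream.Seek(start, SeekOrigin.Begin);
            int read = stream.Read(buffer, 0, buffer.Length);
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }
}
```

In getRow, the two File.ReadLines calls could then be swapped for SharedLogReader.ReadFirstLine(fullFilename) and SharedLogReader.ReadLastLine(fullFilename). Seeking one byte at a time is simple rather than fast, which is acceptable here only because a single log line is short.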

Reading a large log file in C#

For my project, I need to extract message types from a log file. I have a 700 MB log file which contains about 4.7 million lines, and I need to read each entry line by line and extract the message field. I need to find the size of the message in each entry (which is the event size) and store it along with that message in a dictionary. There can be multiple messages for the same event size. But I get an OutOfMemoryException when I use the logic below.
Dictionary<Int32,List<String>> dt=new Dictionary<Int32,List<String>>();
List<String> entries=new List<String>();
StreamReader sr=new StreamReader("Bluegene.log");
String s;
int eventsize;
while((s=sr.ReadLine())!=null)
{
eventsize=s.Length - 9; //size of only the message field
entries.Add(s);
if (!dt.ContainsKey(eventsize))
{
dt.Add(eventsize, entries);
}
else
{
dt.Remove(eventsize);
dt.Add(eventsize, entries);
}
}
Will using MemoryMappedFile help?
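The problem is that your single entries list is ever-growing: every line is added to the same list, and that one list is stored under every key.
Create a separate list per event size instead, for example: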
Dictionary<Int32, List<String>> dt = new Dictionary<Int32, List<String>>();
int eventsize;
StreamReader sr = new StreamReader("Bluegene.log");
string s;
while ((s = sr.ReadLine()) != null)
{
eventsize = s.Length - 9; //size of only the message field
if (!dt.ContainsKey(eventsize))
{
List<String> entries = new List<String>();
entries.Add(s);
dt.Add(eventsize, entries);
}
else
{
dt[eventsize].Add(s);
}
}
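A slightly tighter variant of the same fix is sketched below, using TryGetValue so each key is looked up once and taking the lines as an IEnumerable so the file can be streamed. The 9-character prefix is carried over from the question. MessageGrouper and GroupBySize are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MessageGrouper
{
    // Assumption carried over from the question: the message field is the
    // line minus a 9-character prefix.
    private const int PrefixLength = 9;

    public static Dictionary<int, List<string>> GroupBySize(IEnumerable<string> lines)
    {
        var bySize = new Dictionary<int, List<string>>();
        foreach (string line in lines)
        {
            int eventSize = line.Length - PrefixLength;
            List<string> entries;
            if (!bySize.TryGetValue(eventSize, out entries))
            {
                // First line of this size: start a list that belongs to this key only.
                entries = new List<string>();
                bySize.Add(eventSize, entries);
            }
            entries.Add(line);
        }
        return bySize;
    }
}
```

It could be called as MessageGrouper.GroupBySize(File.ReadLines("Bluegene.log")), which also takes care of closing the file. Note that every line is still kept in memory once, so a very large log may still need a different approach, such as storing only a count per size.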
