C# CSV file to array/list - c#

I want to read 4-5 CSV files in some array in C#
I know that this question is been asked and I have gone through them...
But my use of CSVs is too much simpler for that...
I have csv fiels with columns of following data types....
string , string
These strings are without ',' so no tension...
That's it. And they aren't much big. Only about 20 records in each.
I just want to read them into array of C#....
Is there any very very simple and direct way to do that?

To read the file, use
TextReader reader = File.OpenText(filename);
To read a line:
string line = reader.ReadLine()
then
string[] tokens = line.Split(',');
to separate them.
By using a loop around the two last example lines, you could add each array of tokens into a list, if that's what you need.

This one includes the quotes & commas in fields. (assumes you're doing a line at a time)
using Microsoft.VisualBasic.FileIO; //For TextFieldParser
// blah blah blah
StringReader csv_reader = new StringReader(csv_line);
TextFieldParser csv_parser = new TextFieldParser(csv_reader);
csv_parser.SetDelimiters(",");
csv_parser.HasFieldsEnclosedInQuotes = true;
string[] csv_array = csv_parser.ReadFields();

Here is a simple way to get a CSV content to an array of strings. The CSV file can have double quotes, carriage return line feeds and the delimiter is a comma.
Here are the libraries that you need:
System.IO;
System.Collection.Generic;
System.IO is for FileStream and StreamReader class to access your file. Both classes implement the IDisposable interface, so you can use the using statements to close your streams. (example below)
System.Collection.Generic namespace is for collections, such as IList,List, and ArrayList, etc... In this example, we'll use the List class, because Lists are better than Arrays in my honest opinion. However, before I return our outbound variable, i'll call the .ToArray() member method to return the array.
There are many ways to get content from your file, I personally prefer to use a while(condition) loop to iterate over the contents. In the condition clause, use !lReader.EndOfStream. While not end of stream, continue iterating over the file.
public string[] GetCsvContent(string iFileName)
{
List<string> oCsvContent = new List<string>();
using (FileStream lFileStream =
new FileStream(iFilename, FileMode.Open, FileAccess.Read))
{
StringBuilder lFileContent = new StringBuilder();
using (StreamReader lReader = new StreamReader(lFileStream))
{
// flag if a double quote is found
bool lContainsDoubleQuotes = false;
// a string for the csv value
string lCsvValue = "";
// loop through the file until you read the end
while (!lReader.EndOfStream)
{
// stores each line in a variable
string lCsvLine = lReader.ReadLine();
// for each character in the line...
foreach (char lLetter in lCsvLine)
{
// check if the character is a double quote
if (lLetter == '"')
{
if (!lContainsDoubleQuotes)
{
lContainsDoubleQuotes = true;
}
else
{
lContainsDoubleQuotes = false;
}
}
// if we come across a comma
// AND it's not within a double quote..
if (lLetter == ',' && !lContainsDoubleQuotes)
{
// add our string to the array
oCsvContent.Add(lCsvValue);
// null out our string
lCsvValue = "";
}
else
{
// add the character to our string
lCsvValue += lLetter;
}
}
}
}
}
return oCsvContent.ToArray();
}
Hope this helps! Very easy and very quick.
Cheers!

Related

How to read a portion of a line in a text file?

So, I have a text file with thousands of lines formatted similarly to this:
123456:0.8525000:1590882780:91011
These files are almost always a different length, and I only need to read the first two parts of the line, being 123456:0.8525000.
I know that I can split each line using C#, but I'm unsure how to only read the first 2 parts. Anyone have any idea on how to do this? Sorry if my question doesn't make sense, I can restate it if needed.
The Split function returns a string[], an array of strings.
Just take the 2 first elements of the result of Split (with : as the separator).
var read = "123456:0.8525000:1590882780:91011";
var values = read.Split(":");
Console.WriteLine(values[0]); // 123456
Console.WriteLine(values[1]); // 0.8525000
.NET Fiddle
Don't forget that elements of values are string and not yet int or double values. See How to convert string to integer in C# for how to convert from string to number type.
There are TONS of ways to doing this but I am going to suggest some options that involving read the full line as its much easier to work with / understand and that your lines are of varying length. I did add a suggestion on using StreamReader on a file at the end in addendum but you may need to figure out serious work arounds on skipping lines you don't want, restarting a char iterating loop on new lines etc.
I first demonstrate the latest and greatest IAsyncEnumerable found in NetCore 3.x followed by a similar string-based approach. By sharing an Int example that is a slightly advanced and that will also be asynchronous, I hope to also help others and demonstrate a fairly modern approach in 2020. Streaming out only the data you need will be a huge benefit in keeping it fast and a low memory footprint.
public static async IAsyncEnumerable<int> StreamFileOutAsIntsAsync(string filePathName)
{
if (string.IsNullOrWhiteSpace(filePathName)) throw new ArgumentNullException(nameof(filePathName));
if (!File.Exists(filePathName)) throw new ArgumentException($"{filePathName} is not a valid file path.");
using var streamReader = File.OpenText(filePathName);
string currentLine;
while ((currentLine = await streamReader.ReadLineAsync().ConfigureAwait(false)) != null)
{
if (int.TryParse(currentLine.AsSpan(), out var output))
{
yield return output;
}
}
}
This streams every int out of a file, checking that file exists and that the filename path is not null or blank etc.
Streaming maybe too much for a beginner so I don't know your level.
You may want to start with just turning the file into a list of strings.
Modifying my previous example above to something less complex but split your strings for you. I recommend learning about streaming so you don't have every piece of string in memory while you work on it... or maybe you want them all. I am not here to judge.
Once you get your string line out from a file you can do whatever else needs to be done.
public static async Task<List<string>> GetStringsFromFileAsync(string filePathName)
{
if (string.IsNullOrWhiteSpace(filePathName)) throw new ArgumentNullException(nameof(filePathName));
if (!File.Exists(filePathName)) throw new ArgumentException($"{filePathName} is not a valid file path.");
using var streamReader = File.OpenText(filePathName);
string currentLine;
var strings = new List<string>();
while ((currentLine = await streamReader.ReadLineAsync().ConfigureAwait(false)) != null)
{
var lineAsArray = currentLine.Split(new string[] { ":" }, StringSplitOptions.RemoveEmptyEntries);
// Simple Data Validation
if (lineAsArray.Length == 4)
{
strings.Add($"{lineAsArray[0]}:{lineAsArray[1]}");
strings.Add($"{lineAsArray[2]}:{lineAsArray[3]}");
}
}
return strings;
}
The meat of the code is really simple, open the file for reading!
using var streamReader = File.OpenText(filePathName);
and then loop through that file...
while ((currentLine = await streamReader.ReadLineAsync()) != null)
{
var lineAsArray = currentLine.Split(new string[] { ":" }, StringSplitOptions.RemoveEmptyEntries);
// Simple Data Validation
if (lineAsArray.Length == 4)
{
// Do whatever you need to do with the first bits of information.
// In this case, we add them all to a list for return.
strings.Add($"{lineAsArray[0]}:{lineAsArray[1]}");
strings.Add($"{lineAsArray[2]}:{lineAsArray[3]}");
}
}
What this demonstrates is that, for every line that I read out that is not null, break into four parts (based on the ":") character removing all empty entries.
We then use a C# feature called String Interpolation ($"") to put the first two back together with ":" as a string. Then the second two. Or whatever you need to do with reading each part of the line.
That's really all there is to it! Hope it helps.
Addendum: If you really need to read parts of file, please use a StreamReader.Read and Peek()
using (var sr = new StreamReader(path))
{
while (sr.Peek() >= 0)
{
Console.Write((char)sr.Read());
}
}
Reading each character
Some bare bones code:
string fileName = #"c:\some folder\path\file.txt";
using (StreamReader sr = new StreamReader(fileName))
{
while (!sr.EndOfStream)
{
String[] values = sr.ReadLine().Split(":".ToCharArray());
if (values.Length >= 2)
{
// ... do something with values[0] and values[1] ...
Console.WriteLine(values[0] + ", " + values[1]);
}
}
}

Remove a specific column from a delimited file

I've been working with some big delimited text (~1GB) files these days. It looks like somewhat below
COlumn1 #COlumn2#COlumn3#COlumn4
COlumn1#COlumn2#COlumn3 #COlumn4
where # is the delimiter.
In case a column is invalid I might have to remove it from the whole text file. The output file when Column 3 is invalid should look like this.
COlumn1 #COlumn2#COlumn4
COlumn1#COlumn2#COlumn4
string line = "COlumn1# COlumn2 #COlumn3# COlumn4";
int junk =3;
int columncount = line.Split(new char[] { '#' }, StringSplitOptions.None).Count();
//remove the [junk-1]th '#' and the value till [junk]th '#'
//"COlumn1# COlumn2 # COlumn4"
I's not able to find a c# version of this in SO. Is there a way I can do that? Please help.
EDIT:
The solution which I found myself is like below which does the job. Is there a way I could modify this to a better way so that it narrows down the performance impact it might have in case of large text files?
int junk = 3;
string line = "COlumn1#COlumn2#COlumn3#COlumn4";
int counter = 0;
int colcount = line.Split(new char[] { '#' }, StringSplitOptions.None).Length - 1;
string[] linearray = line.Split(new char[] { '#' }, StringSplitOptions.None);
List<string> linelist = linearray.ToList();
linelist.RemoveAt(junk - 1);
string finalline = string.Empty;
foreach (string s in linelist)
{
counter++;
finalline += s;
if (counter < colcount)
finalline += "#";
}
Console.WriteLine(finalline);
EDITED
This method can be very memory expensive, as your can read in this post, the suggestion should be:
If you need to run complex queries against the data in the file, the right thing to do is to load the data to database and let DBMS to take care of data retrieval and memory management.
To avoid memory consumption you should use a StreamReader to read file line by line
This could be a start for your task, missing your invalid match logic
using System.Collections.Generic;
using System.IO;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
const string fileName = "temp.txt";
var results = FindInvalidColumns(fileName);
using (var reader = File.OpenText(fileName))
{
while (!reader.EndOfStream)
{
var builder = new StringBuilder();
var line = reader.ReadLine();
if (line == null) continue;
var split = line.Split(new[] { "#" }, 0);
for (var i = 0; i < split.Length; i++)
if (!results.Contains(i))
builder.Append(split[i]);
using (var fs = new FileStream("new.txt", FileMode.Append, FileAccess.Write))
using (var sw = new StreamWriter(fs))
{
sw.WriteLine(builder.ToString());
}
}
}
}
private static List<int> FindInvalidColumns(string fileName)
{
var invalidColumnIndexes = new List<int>();
using (var reader = File.OpenText(fileName))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line == null) continue;
var split = line.Split(new[] { "#" }, 0);
for (var i = 0; i < split.Length; i++)
{
if (IsInvalid(split[i]) && !invalidColumnIndexes.Contains(i))
invalidColumnIndexes.Add(i);
}
}
}
return invalidColumnIndexes;
}
private static bool IsInvalid(string s)
{
return false;
}
}
}
First, what you will do is re-write the line to a text file using a 0-length string for COlumn3. Therefore the line after being written correctly would look like this:
COlumun1#COlumn2##COlumn4
As you can see, there are two delimiters between COlumn2 and COlumn4. This is a cell with no data in it. (By "cell" I mean one column of a certain, single row.) Later, when some other process reads this using the Split function, it will still create a new value for Column 3, but in the array generated by Split, the 3rd position would be an empty string:
String[] columns = stream_reader.ReadLine().Split('#');
int lengthOfThirdItem = columns[2].Length; // for proof
// lengthOfThirdItem = 0
This reduces invalid values to null and persists them back in the text file.
For more on String.Split see C# StreamReader save to Array with separator.
It is not possible to write to lines internal to a text file while it is also open for read. This article discusses it some (simultaneous read-write a file in C#), but it looks like that question-asker just wants to be able to write lines to the end. You want to be able to write lines at any point in the interior. I think this is not possible without buffering the data in some way.
The simplest way to buffer the data is rename the file to a temp file first (using File.CoMovepy() // http://msdn.microsoft.com/en-us/library/system.io.file.move(v=vs.110).aspx). Then use the temp file as the data source. Just open the temp file that to read in the data which may have corrupt entries, and write the data afresh to the original file name using the approach I describe above to represent empty columns. After this is complete, then you should delete the temp file.
Important
Deleting the temp file may leave you vulnerable to power and data transients (or software 'transients'). (I.e., a power drop that interrupts part of the process could leave the data in an unusable state.) So you may also want to leave the temp file on the drive as an emergency backup in case of some problem.

Assign line in text file to a string

I'm making a simple text adventure in C# and I was wondering if it was possible to read certain lines from a .txt file and assign them to a string.
I am aware of how to read all the text from a .txt file but how exactly would I assign the contents of certain lines to a string?
Have you considered the ReadAllLines method?
It returns an array of lines from which you can choose your desired line.
So for eg, if you wish to choose the 3rd line (Assuming you have 3 lines in the file):
string[] lines = File.ReadAllLines(path);
string myThirdLine= lines[2];
Probably the easiest (and cheapest in terms of memory consumption) is File.ReadLines:
String stringAtLine10 = File.ReadLines(path).ElementAtOrDefault(9);
Note that it is null if there are less than 10 lines in the file. See: ElementAtOrDefault.
It's just the concise version of a StreamReader and a counter variable which increases on every line.
As an advanced alternative: ReadLines plus some LINQ:
var lines = File.ReadLines(myFilePath).Where(MyCondition).ToArray();
where MyCondition:
bool MyCondition(string line)
{
if (line == "something")
{
return true;
}
return false;
}
In case you don't want to load all lines atonce
using(StreamReader reader=new StreamReader(path))
{
String line;
while((line=reader.ReadLine())!=null)//process temp
}
Here's a example how you can assign the lines to a string, you can't decide which line is which via fields, you have to select them yourself.
which is the line of the string you want to assign.
For example, you want line one, you define which as one and not zero, you want line eight, you define which with eight.
string getWord(int which)
{
string readed = "";
using (Systen.IO.StreamReader read = new System.IO.StreamReader("PATH HERE"))
{
readed = read.ReadToEnd();
}
string[] toReturn = readed.Split('\n');
return toReturn[which - 1];
}

C# read txt file and store the data in formatted array

I have a text file which contains following
Name address phone salary
Jack Boston 923-433-666 10000
all the fields are delimited by the spaces.
I am trying to write a C# program, this program should read a this text file and then store it in the formatted array.
My Array is as follows:
address
salary
When ever I am trying to look in google I get is how to read and write a text file in C#.
Thank you very much for your time.
You can use File.ReadAllLines method to load the file into an array. You can then use a for loop to loop through the lines, and the string type's Split method to separate each line into another array, and store the values in your formatted array.
Something like:
static void Main(string[] args)
{
var lines = File.ReadAllLines("filename.txt");
for (int i = 0; i < lines.Length; i++)
{
var fields = lines[i].Split(' ');
}
}
Do not reinvent the wheel. Can use for example fast csv reader where you can specify a delimeter you need.
There are plenty others on internet like that, just search and pick that one which fits your needs.
This answer assumes you don't know how much whitespace is between each string in a given line.
// Method to split a line into a string array separated by whitespace
private string[] Splitter(string input)
{
return Regex.Split(intput, #"\W+");
}
// Another code snippet to read the file and break the lines into arrays
// of strings and store the arrays in a list.
List<String[]> arrayList = new List<String[]>();
using (FileStream fStream = File.OpenRead(#"C:\SomeDirectory\SomeFile.txt"))
{
using(TextReader reader = new StreamReader(fStream))
{
string line = "";
while(!String.IsNullOrEmpty(line = reader.ReadLine()))
{
arrayList.Add(Splitter(line));
}
}
}

Searching strings in txt file

I have a .txt file with a list of 174 different strings. Each string has an unique identifier.
For example:
123|this data is variable|
456|this data is variable|
789|so is this|
etc..
I wish to write a programe in C# that will read the .txt file and display only one of the 174 strings if I specify the ID of the string I want. This is because in the file I have all the data is variable so only the ID can be used to pull the string. So instead of ending up with the example about I get just one line.
eg just
123|this data is variable|
I seem to be able to write a programe that will pull just the ID from the .txt file and not the entire string or a program that mearly reads the whole file and displays it. But am yet to wirte on that does exactly what I need. HELP!
Well the actual string i get out from the txt file has no '|' they were just in the example. An example of the real string would be: 0111111(0010101) where the data in the brackets is variable. The brackets dont exsist in the real string either.
namespace String_reader
{
class Program
{
static void Main(string[] args)
{
String filepath = #"C:\my file name here";
string line;
if(File.Exists(filepath))
{
StreamReader file = null;
try
{
file = new StreamReader(filepath);
while ((line = file.ReadLine()) !=null)
{
string regMatch = "ID number here"; //this is where it all falls apart.
Regex.IsMatch (line, regMatch);
Console.WriteLine (line);// When program is run it just displays the whole .txt file
}
}
}
finally{
if (file !=null)
file.Close();
}
}
Console.ReadLine();
}
}
}
Use a Regex. Something along the lines of Regex.Match("|"+inputString+"|",#"\|[ ]*\d+\|(.+?)\|").Groups[1].Value
Oh, I almost forgot; you'll need to substitute the d+ for the actual index you want. Right now, that'll just get you the first one.
The "|" before and after the input string makes sure both the index and the value are enclosed in a | for all elements, including the first and last. There's ways of doing a Regex without it, but IMHO they just make your regex more complicated, and less readable.
Assuming you have path and id.
Console.WriteLine(File.ReadAllLines(path).Where(l => l.StartsWith(id + "|")).FirstOrDefault());
Use ReadLines to get a string array of lines then string split on the |
You could use Regex.Split method
FileInfo info = new FileInfo("filename.txt");
String[] lines = info.OpenText().ReadToEnd().Split(' ');
foreach(String line in lines)
{
int id = Convert.ToInt32(line.Split('|')[0]);
string text = Convert.ToInt32(line.Split('|')[1]);
}
Read the data into a string
Split the string on "|"
Read the items 2 by 2: key:value,key:value,...
Add them to a dictionary
Now you can easily find your string with dictionary[key].
first load the hole file to a string.
then try this:
string s = "123|this data is variable| 456|this data is also variable| 789|so is this|";
int index = s.IndexOf("123", 0);
string temp = s.Substring(index,s.Length-index);
string[] splitStr = temp.Split('|');
Console.WriteLine(splitStr[1]);
hope this is what you are looking for.
private static IEnumerable<string> ReadLines(string fspec)
{
using (var reader = new StreamReader(new FileStream(fspec, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
}
var dict = ReadLines("input.txt")
.Select(s =>
{
var split = s.Split("|".ToArray(), 2);
return new {Id = Int32.Parse(split[0]), Text = split[1]};
})
.ToDictionary(kv => kv.Id, kv => kv.Text);
Please note that with .NET 4.0 you don't need the ReadLines function, because there is ReadLines
You can now work with that as any dictionary:
Console.WriteLine(dict[12]);
Console.WriteLine(dict[999]);
No error handling here, please add your own
You can use Split method to divide the entire text into parts sepparated by '|'. Then all even elements will correspond to numbers odd elements - to strings.
StreamReader sr = new StreamReader(filename);
string text = sr.ReadToEnd();
string[] data = text.Split('|');
Then convert certain data elements to numbers and strings, i.e. int[] IDs and string[] Strs. Find the index of the given ID with idx = Array.FindIndex(IDs, ID.Equals) and the corresponding string will be Strs[idx]
List <int> IDs;
List <string> Strs;
for (int i = 0; i < data.Length - 1; i += 2)
{
IDs.Add(int.Parse(data[i]));
Strs.Add(data[i + 1]);
}
idx = Array.FindIndex(IDs, ID.Equals); // we get ID from input
answer = Strs[idx];

Categories

Resources