How can I efficiently process a delimited text file? - c#

I'm simply trying to execute File.ReadAllLines against a specific file and, for every line, split on |. I have to use regex on this one.
The code below doesn't work, but you'll see what I'm trying to do:
string[] contents = File.ReadAllLines(filename);
string[] splitlines = Regex.Split(contents, '|');
foreach (string split in splitlines)
{
//Regex line = content.Split('|');
//content.Split('|');
string prefix = prefix = Regex.Match(line, @"(\S+)(\d+)").Groups[0].Value;
File.AppendAllText(workingdirform2 + "configuration.txt", prefix+"\r\n");
}

It's not entirely clear to me what you are trying to do, but there are a number of errors in your code. I have tried to guess what you are doing, but if this isn't what you want, please explain what you do want preferably with some examples:
string inputFilename = "input.txt";
string outputFilename = "output.txt";
using (StreamWriter streamWriter = File.AppendText(outputFilename))
{
using (StreamReader streamReader = File.OpenText(inputFilename))
{
while (true)
{
string line = streamReader.ReadLine();
if (line == null)
{
break;
}
string[] splitlines = line.Split('|');
foreach (string split in splitlines)
{
Match match = Regex.Match(split, @"\S+\d+");
if (match.Success)
{
string prefix = match.Groups[0].Value;
streamWriter.WriteLine(prefix);
}
else
{
// Handle match failed...
}
}
}
}
}
Key points:
You seem to want to perform an operation on each line, so you need to iterate over the lines.
Use the simple string.Split method if you want to split on a single character. Regex.Split doesn't accept a character and "|" has a special meaning in regular expressions so it wouldn't have worked anyway unless you escaped it.
You were opening and closing the output file multiple times. You should open it just once and keep it open until you have finished writing to it. The using keyword is useful here.
Use WriteLine instead of appending "\r\n".
If the input file is large, use a StreamReader instead of ReadAllLines.
If the match fails, Groups[0].Value is an empty string, so your program would silently write a blank line. You should probably check match.Success before using the match and, if it returns false, handle the error appropriately (skip the line, report a warning, throw an exception with an appropriate message, etc.)
You aren't actually using groups 1 and 2 in the regular expression, so you can remove the parentheses to save the regular expression engine from having to store results that you won't use anyway.
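To make the point about "|" concrete, here is a minimal sketch showing that string.Split and Regex.Split with an escaped pipe yield the same fields. The class and method names (PipeSplitDemo, SplitPlain, SplitRegex) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PipeSplitDemo
{
    // Splits on a literal '|' using plain string.Split.
    public static string[] SplitPlain(string line)
    {
        return line.Split('|');
    }

    // Splits on a literal '|' using Regex.Split. The pipe is escaped
    // because an unescaped '|' means alternation in a regular expression.
    public static string[] SplitRegex(string line)
    {
        return Regex.Split(line, @"\|");
    }

    public static void Main()
    {
        string line = "Text1231|abc|abc";
        Console.WriteLine(string.Join(",", SplitPlain(line))); // Text1231,abc,abc
        Console.WriteLine(string.Join(",", SplitRegex(line))); // Text1231,abc,abc
    }
}
```

For a single-character delimiter the plain version is simpler and avoids the escaping question entirely.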

You should pass the original string to Regex.Split and not an array.
It looks like you are using line instead of split when setting the prefix. Without knowing more about your code I can't tell whether that's right, but in any case it sticks out as an error (it shouldn't build either).
This is also really inefficient on at least two levels :)

Regex.Split takes a string, not an array of strings.
I would recommend calling Regex.Split on each item of contents individually, then looping over the results of that call. This would mean nested for loops.
string[] contents = File.ReadAllLines(filename);
foreach (string line in contents)
{
string[] splitlines = Regex.Split(line, @"\|");
foreach (string splitline in splitlines)
{
string prefix = Regex.Match(splitline, @"(\S+)(\d+)").Groups[0].Value;
File.AppendAllText(workingdirform2 + "configuration.txt", prefix+"\r\n");
}
}
This, of course, isn't the most efficient way to go about it.
A more efficient way might be to split on both line breaks and pipes with a single regular expression. I think this works:
string[] splitlines = Regex.Split(File.ReadAllText(filename), @"\r?\n|\|");

I have to assume, based on the limited feedback, that this is what you're looking for:
string inputFile = filename;
string outputFile = Path.Combine( workingdirform2, "configuration.txt" );
using ( StreamReader inputFileStream = File.OpenText( inputFile ) )
{
using ( StreamWriter ouputFileStream = File.AppendText( outputFile ) )
{
// Iterate over the file contents to extract the prefix
string currentLine;
while ( ( currentLine = inputFileStream.ReadLine() ) != null )
{
// Notice the updated Regex - yours is a bit broken
string prefix = Regex.Match( currentLine, @"^(\S+?)\d+" ).Groups[1].Value;
ouputFileStream.WriteLine( prefix );
}
}
}
This would take a file full of:
Text1231|abc|abc
Text1232|abc|abc
Text1233|abc|abc
Text1234|abc|abc
and place:
Text
Text
Text
Text
into a new file.
I hope this, at least, gets you on the right path. My crystal ball is getting hazy.. haaazzzy..

Probably one of the best ways to process text files in C# is to use FileHelpers. Give it a look. It allows you to strongly type your import data.

Related

C# can't split a string into an array by newline (from StreamReader)

StreamReader login = new StreamReader("C:/Users/Me/Documents/logins.txt");
string ar = login.ReadToEnd();
string[] names = ar.Split("\r\n");
login.Close();
I'm reading a set of logins from a file, formatted as "username,password", then a newline, then "usr,pwd" or something else. I want to split the txt file into an array by splitting at the start of each new line, but "\r\n" doesn't seem to be working, coming up with the error "cannot convert from string to char". I have tried Environment.NewLine, but that is not working either, coming up with the same error message.
Instead of dealing with a stream just use File.ReadAllLines
string[] names = File.ReadAllLines("C:/Users/Me/Documents/logins.txt");
String.Split needs an array of either char or string values to split on. You need to change your code to:
string[] names = ar.Split(new string[]{"\r\n"}, StringSplitOptions.None);
You can also read each line individually like so:
using (StreamReader reader = new StreamReader(pathToFile)) {
string line;
while ((line = reader.ReadLine()) != null) {
// Use line here.
}
}
This may be preferable, as ReadLine handles the line endings for you, so you don't have to rely on the file using one particular line terminator.

Use everything before a specific character

So, I've been learning C# and I need to remove everything after the
":" character.
I've used a StreamReader to read the text file, but then I can't use the Split function, then I tried it by using an int function to import it, but then again I can't use the Split function?
What I want this to do is import a text file that's written like;
name:lastname
name2:lastname2
And so that it only shows name and name2.
I've been searching this for a couple of days but I can't seem to figure it out!
I don't know what I'm doing wrong and how to import the text file without using StreamReader or anything else.
Edit:
I'm trying to post something to a website that goes like;
example.com/q=(name without ":")
Edit 2:
StreamReader list = new StreamReader(@"list.txt");
string reader = list.ReadToEnd();
string[] split = reader.Split(":".ToCharArray());
Console.WriteLine(split);
gives output as;
System.String[]
You've got a few issues here. First, use File.ReadLines() instead of a StreamReader; it's much simpler and easier:
IEnumerable<string> lines = File.ReadLines("path/to/file");
Next, your lines variable needs to be iterated so you can get to each line of the collection:
foreach (string line in lines)
{
//TODO: write split logic here
}
Then you have to split each line on the ':' character:
string[] split = line.Split(':');
Your split variable is an array of string (i.e string[]) which means you have to access a specific index of the array if you want to see its value. This is your second issue, if you pass split to Console.WriteLine() under the hood it just calls .ToString() on the object you have passed it, and with a string[] it won't automatically give you all the values, you have to write that yourself.
So if your line variable was: "name:Steve", the split variable would have two indexes and look like this:
//split[0] = "name"
//split[1] = "Steve"
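As a small sketch of the point about Console.WriteLine, the following prints the array's type name first and then the actual values via string.Join. The class name SplitPrintDemo and the helper BeforeColon are hypothetical names chosen for this example.

```csharp
using System;

public static class SplitPrintDemo
{
    // Returns the part of the line before the first ':'
    // (or the whole line if it contains no ':').
    public static string BeforeColon(string line)
    {
        return line.Split(':')[0];
    }

    public static void Main()
    {
        string[] split = "name:Steve".Split(':');

        // Passing the array itself only calls ToString() on it,
        // which yields the type name rather than the contents.
        Console.WriteLine(split);

        // Joining the elements shows the values: name, Steve
        Console.WriteLine(string.Join(", ", split));

        // Only the part before the colon: name
        Console.WriteLine(BeforeColon("name:Steve"));
    }
}
```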
I made a fiddle here that demonstrates.
If your file is small and each name:lastname pair is on its own line, use:
var lines = File.ReadAllLines("filePath");
foreach (var line in lines)
{
var array = line.Split(':');
if (array.Length > 0)
{
var name = array[0];
}
}
If each name:lastname pair isn't on its own line, tell me how they are separated.

Alternative to File.AppendAllText for newline

I am trying to read characters from a file and then append them in another file after removing the comments (which are followed by semicolon).
sample data from parent file:
Name- Harley Brown ;Name is Harley Brown
Age- 20 ;Age is 20 years
Desired result:
Name- Harley Brown
Age- 20
I am trying the following code-
StreamReader infile = new StreamReader(floc + "G" + line + ".NC0");
while (infile.Peek() != -1)
{
letter = Convert.ToChar(infile.Read());
if (letter == ';')
{
infile.ReadLine();
}
else
{
System.IO.File.AppendAllText(path, Convert.ToString(letter));
}
}
But the output i am getting is-
Name- Harley Brown Age-20
It's because AppendAllText is not working for the newline. Is there any alternative?
Sure, why not use File.AppendAllLines. See documentation here.
Appends lines to a file, and then closes the file. If the specified file does not exist, this method creates a file, writes the specified lines to the file, and then closes the file.
It takes in any IEnumerable<string> and adds every line to the specified file. So it always adds the line on a new line.
Small example:
const string originalFile = @"D:\Temp\file.txt";
const string newFile = @"D:\Temp\newFile.txt";
// Retrieve all lines from the file.
string[] linesFromFile = File.ReadAllLines(originalFile);
List<string> linesToAppend = new List<string>();
foreach (string line in linesFromFile)
{
// 1. Split the line at the semicolon.
// 2. Take the first index, because the first part is your required result.
// 3. Trim the trailing and leading spaces.
string appendAbleLine = line.Split(';').FirstOrDefault().Trim();
// Add the line to the list of lines to append.
linesToAppend.Add(appendAbleLine);
}
// Append all lines to the file.
File.AppendAllLines(newFile, linesToAppend);
Output:
Name- Harley Brown
Age- 20
You could even change the foreach-loop into a LINQ-expression, if you prefer LINQ:
List<string> linesToAppend = linesFromFile.Select(line => line.Split(';').FirstOrDefault().Trim()).ToList();
Why use char by char comparison when .NET Framework is full of useful string manipulation functions?
Also, don't use a file write function multiple times when you can use it only one time, it's time and resources consuming!
string str = "";
using (StreamReader infile = new StreamReader("file1.txt"))
{
string line;
while ((line = infile.ReadLine()) != null) { // Get every line of the file.
line = line.Split(';')[0].Trim(); // Remove comment (right part of ;) and useless white characters.
str += line + "\n"; // Add it to our final file contents.
}
}
File.WriteAllText("file2.txt", str); // Write it to the new file.
You could do this with LINQ, System.IO.File.ReadLines(string), and System.IO.File.WriteAllLines(string, IEnumerable<string>). You could also use System.IO.File.AppendAllLines(string, IEnumerable<string>) if appending was, in fact, the functionality you were going for. The difference, as the names suggest, is whether it writes everything out as a new file or just appends to an existing one.
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
int semicolon = c.IndexOf(';');
if (semicolon > -1)
return c.Remove(semicolon);
else
return c;
}));
In case you aren't super familiar with LINQ syntax, the idea here is to loop through each line in the file, and if it contains a semicolon (that is, IndexOf returns something that is over -1) we cut that off, and otherwise, we just return the string. Then we write all of those to the file. The StreamReader equivalent to this would be:
using (StreamReader reader = new StreamReader(oldPath))
using (StreamWriter writer = new StreamWriter(newPath))
{
string line;
while ((line = reader.ReadLine()) != null)
{
int semicolon = line.IndexOf(';');
if (semicolon > -1)
line = line.Remove(semicolon);
writer.WriteLine(line);
}
}
As far as I know, both versions produce the same output: WriteAllLines and WriteLine each write a line terminator after every line, so the file ends with a trailing newline either way.
Another important thing to note, just looking at your original file, you might want to add in some Trim calls, since it looks like you can have spaces before your semicolons, and I don't imagine you want those copied through.
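A minimal sketch of what those Trim calls could look like, assuming you only want to drop whitespace left in front of the semicolon. The class name CommentStripper and the method StripComment are hypothetical, and the sample lines are taken from the question.

```csharp
using System;

public static class CommentStripper
{
    // Cuts the line at the first ';' (if any) and removes the
    // whitespace that was sitting in front of the comment.
    public static string StripComment(string line)
    {
        int semicolon = line.IndexOf(';');
        string kept = semicolon > -1 ? line.Remove(semicolon) : line;
        return kept.TrimEnd();
    }

    public static void Main()
    {
        string[] lines =
        {
            "Name- Harley Brown ;Name is Harley Brown",
            "Age- 20 ;Age is 20 years"
        };

        foreach (string line in lines)
        {
            // Brackets make any leftover whitespace visible.
            Console.WriteLine("[" + StripComment(line) + "]");
        }
    }
}
```

TrimEnd is used rather than Trim so that leading indentation, if the file has any, is preserved; swap in Trim if you want both ends cleaned.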

Assign line in text file to a string

I'm making a simple text adventure in C# and I was wondering if it was possible to read certain lines from a .txt file and assign them to a string.
I am aware of how to read all the text from a .txt file but how exactly would I assign the contents of certain lines to a string?
Have you considered the ReadAllLines method?
It returns an array of lines from which you can choose your desired line.
So, for example, to choose the 3rd line (assuming the file has at least 3 lines):
string[] lines = File.ReadAllLines(path);
string myThirdLine= lines[2];
Probably the easiest (and cheapest in terms of memory consumption) is File.ReadLines:
String stringAtLine10 = File.ReadLines(path).ElementAtOrDefault(9);
Note that it is null if there are fewer than 10 lines in the file. See: ElementAtOrDefault.
It's just the concise version of a StreamReader and a counter variable which increases on every line.
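For comparison, here is a rough sketch of that longer form: a reader plus a counter that stops at the requested line. The names LineLookup and LineAt are invented for this example, and it takes a TextReader (the base class of StreamReader) so it can be tried without a file on disk.

```csharp
using System;
using System.IO;

public static class LineLookup
{
    // Returns the line at the given zero-based index,
    // or null if the reader runs out of lines first.
    public static string LineAt(TextReader reader, int index)
    {
        string line;
        int current = 0;
        while ((line = reader.ReadLine()) != null)
        {
            if (current == index)
            {
                return line; // Stop reading as soon as the line is found.
            }
            current++;
        }
        return null;
    }

    public static void Main()
    {
        using (TextReader reader = new StringReader("first\nsecond\nthird"))
        {
            Console.WriteLine(LineAt(reader, 1)); // second
        }
    }
}
```

With a real file you would pass `new StreamReader(path)` instead of the StringReader.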
As an advanced alternative: ReadLines plus some LINQ:
var lines = File.ReadLines(myFilePath).Where(MyCondition).ToArray();
where MyCondition:
bool MyCondition(string line)
{
if (line == "something")
{
return true;
}
return false;
}
In case you don't want to load all lines at once:
using (StreamReader reader = new StreamReader(path))
{
string line;
while ((line = reader.ReadLine()) != null)
{
// Process line here.
}
}
Here's an example of how you can assign a line to a string. You can't pick lines by name; you have to select them by number yourself.
The which parameter is the 1-based number of the line you want: pass 1 for the first line, 8 for the eighth line, and so on.
string getWord(int which)
{
string readed = "";
using (System.IO.StreamReader read = new System.IO.StreamReader("PATH HERE"))
{
readed = read.ReadToEnd();
}
string[] toReturn = readed.Split('\n');
// Trim the '\r' that Windows line endings leave behind.
return toReturn[which - 1].TrimEnd('\r');
}

C# CSV file to array/list

I want to read 4-5 CSV files into an array in C#.
I know that this question has been asked before and I have gone through those answers...
But my use of CSVs is much simpler than that...
I have CSV files with columns of the following data types....
string , string
These strings contain no ',' so no worries there...
That's it. And they aren't very big. Only about 20 records in each.
I just want to read them into an array in C#....
Is there any very very simple and direct way to do that?
To read the file, use
TextReader reader = File.OpenText(filename);
To read a line:
string line = reader.ReadLine();
then
string[] tokens = line.Split(',');
to separate them.
By using a loop around the two last example lines, you could add each array of tokens into a list, if that's what you need.
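A minimal sketch of that loop, assuming (as the question states) that no field contains a comma or quotes. The names SimpleCsv and ReadRows are made up for this example; the method accepts any TextReader, so the result of File.OpenText(filename) can be passed in directly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SimpleCsv
{
    // Reads every line from the reader and splits it on commas.
    // Only suitable for simple files with no quoted or embedded commas.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            rows.Add(line.Split(','));
        }
        return rows;
    }

    public static void Main()
    {
        // A StringReader stands in for File.OpenText(filename) here.
        using (TextReader reader = new StringReader("a,b\nc,d"))
        {
            foreach (string[] tokens in ReadRows(reader))
            {
                Console.WriteLine(string.Join(" | ", tokens));
            }
        }
    }
}
```

Call ToArray() on the returned list if an array of rows is what you need.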
This one includes the quotes & commas in fields. (assumes you're doing a line at a time)
using Microsoft.VisualBasic.FileIO; //For TextFieldParser
// blah blah blah
StringReader csv_reader = new StringReader(csv_line);
TextFieldParser csv_parser = new TextFieldParser(csv_reader);
csv_parser.SetDelimiters(",");
csv_parser.HasFieldsEnclosedInQuotes = true;
string[] csv_array = csv_parser.ReadFields();
Here is a simple way to get a CSV content to an array of strings. The CSV file can have double quotes, carriage return line feeds and the delimiter is a comma.
Here are the namespaces that you need:
System.IO;
System.Collections.Generic;
System.Text;
System.IO is for the FileStream and StreamReader classes that access your file. Both classes implement the IDisposable interface, so you can use using statements to close your streams. (example below)
The System.Collections.Generic namespace is for generic collections such as IList<T> and List<T>. In this example, we'll use the List class, because Lists are better than Arrays in my honest opinion. However, before I return our outbound variable, I'll call the .ToArray() member method to return the array. System.Text is only needed for the StringBuilder declared in the example.
There are many ways to get content from your file, I personally prefer to use a while(condition) loop to iterate over the contents. In the condition clause, use !lReader.EndOfStream. While not end of stream, continue iterating over the file.
public string[] GetCsvContent(string iFilename)
{
List<string> oCsvContent = new List<string>();
using (FileStream lFileStream =
new FileStream(iFilename, FileMode.Open, FileAccess.Read))
{
StringBuilder lFileContent = new StringBuilder();
using (StreamReader lReader = new StreamReader(lFileStream))
{
// flag if a double quote is found
bool lContainsDoubleQuotes = false;
// a string for the csv value
string lCsvValue = "";
// loop through the file until you read the end
while (!lReader.EndOfStream)
{
// stores each line in a variable
string lCsvLine = lReader.ReadLine();
// for each character in the line...
foreach (char lLetter in lCsvLine)
{
// check if the character is a double quote
if (lLetter == '"')
{
if (!lContainsDoubleQuotes)
{
lContainsDoubleQuotes = true;
}
else
{
lContainsDoubleQuotes = false;
}
}
// if we come across a comma
// AND it's not within a double quote..
if (lLetter == ',' && !lContainsDoubleQuotes)
{
// add our string to the array
oCsvContent.Add(lCsvValue);
// null out our string
lCsvValue = "";
}
else
{
// add the character to our string
lCsvValue += lLetter;
}
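}
// at the end of the line, store the last value
// (unless we are still inside a quoted field)
if (!lContainsDoubleQuotes)
{
oCsvContent.Add(lCsvValue);
lCsvValue = "";
}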
}
}
}
return oCsvContent.ToArray();
}
Hope this helps! Very easy and very quick.
Cheers!
