I need to read a text file (10 MB) and convert it to .csv. See the portion of code below:
string DirPathForm = System.IO.Path.GetDirectoryName(System.Reflection.Assembly.GetEntryAssembly().Location);
string[] lines = File.ReadAllLines(DirPathForm + @"\file.txt");
Some portions of the text file follow a pattern, so I used the following:
string[] lines1 = lines.Select(x => x.Replace("abc[", "ab,")).ToArray();
Array.Clear(lines, 0, lines.Length);
lines = lines1.Select(x => x.Replace("] CDE ", ",")).ToArray();
Other portions do not follow a pattern that Replace can handle directly. The question is how to remove the characters, numbers and whitespace in those portions. Given the input below:
string[] lines = {
"a] 773 b",
"e] 1597 t",
"z] 0 c"
};
to get the result below:
string[] result = {
"a,b",
"e,t",
"z,c"
};
Note: the removed items need to be replaced by ",".
First of all, you should avoid ReadAllLines for a large file, because it loads every line into memory at once. Instead, read the lines one by one in a loop, for example with File.ReadLines or a StreamReader.
Secondly, you can definitely use a regular expression to turn the first form into the second one.
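As a minimal sketch of both points, assuming the unpatterned part always looks like a closing bracket followed by a number and whitespace (as in the three sample lines), the file can be streamed with File.ReadLines and each line rewritten with Regex.Replace. The class name LineConverter, its method names and the two path parameters are placeholders, not part of the original code:

```csharp
using System.IO;
using System.Text.RegularExpressions;

static class LineConverter
{
    // Matches "]" followed by optional whitespace, digits and optional whitespace, e.g. "] 773 ".
    private static readonly Regex BracketNumber = new Regex(@"\]\s*\d+\s*");

    public static string ToCsvLine(string line)
    {
        // Fixed patterns from the question first, then the variable "] <number> " part.
        string fixedPart = line.Replace("abc[", "ab,").Replace("] CDE ", ",");
        return BracketNumber.Replace(fixedPart, ",");
    }

    public static void Convert(string inputPath, string outputPath)
    {
        using (var writer = new StreamWriter(outputPath))
        {
            // File.ReadLines yields one line at a time instead of loading the whole file.
            foreach (string line in File.ReadLines(inputPath))
            {
                writer.WriteLine(ToCsvLine(line));
            }
        }
    }
}
```

With this sketch, LineConverter.ToCsvLine("a] 773 b") gives "a,b". If the real file has other shapes between the two letters, the regular expression would need to be adjusted.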
Related
string[] splittedText = File.ReadAllLines(@"file.txt");//.Split(',');
foreach (string data in splittedText)
{
}
I want to read a file in C# into an array of strings. Then I will iterate over the array to fetch the data I need.
If you want to read a CSV file, you should use a CSV parser. Values in a CSV file are separated by commas, and in some cases a value can itself contain a comma. In that case, the column value is wrapped in double quotes. The solution below does not handle that scenario.
var splittedText = File.ReadAllText("E:\\Test.txt").Split(',');
foreach (string data in splittedText)
{
Console.WriteLine(data.Trim());
}
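To illustrate the quoted-value case, here is a minimal hand-written sketch of splitting a single CSV line while respecting double quotes. CsvLineSplitter is a hypothetical helper, not a library API; it does not handle fields that span several lines, so a dedicated CSV parser is still the safer choice for real data:

```csharp
using System.Collections.Generic;
using System.Text;

static class CsvLineSplitter
{
    // Splits one CSV line on commas, keeping commas that appear inside double quotes.
    // A doubled quote ("") inside a quoted field is read as one literal quote.
    public static List<string> Split(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted field
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field
        return fields;
    }
}
```

For the line a,"b,c",d this returns three fields (a, then b,c, then d), whereas a plain Split(',') would return four.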
Hint: whether to read the file line by line or read the whole content at once depends on your use case. The code snippet below may give you some idea of how to split the content.
Please try it.
var inputtext = File.ReadAllText(@"inpufile.txt");
inputtext.Replace("\n", "")
.Split(',',
StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.ToList().ForEach(t =>
{
System.Console.WriteLine(t);
//Other manipulations
});
If you want to split on multiple characters, pass a character array to Split().
new char[] { ',', ':' };
Thank you.
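A short sketch of the character-array form mentioned above; MultiSplit and its method name are made up for the illustration:

```csharp
using System;

static class MultiSplit
{
    // Splits on either ',' or ':' and drops empty entries.
    public static string[] SplitOnCommaOrColon(string text)
    {
        char[] separators = { ',', ':' };
        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For the input "a,b:c" this returns the three entries a, b and c.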
You need to change File.ReadAllLines to File.ReadAllText(path); then you can use the Split method.
I have a text file that starts with many irrelevant values, followed by the values I have to load into a table. A sample of the file looks like this:
Some file description date
C D 8989898989898 some words
D F 8979797979 some more words
8 H 98988989989898 Some more words for the purpose
KD978787878 280000841 1974DIAA EIDER 320
KK967867668 280000551 1999OOOD FIDERN 680
I can't start from a fixed line number because the description part (4 lines here, excluding the empty line) can span a varying number of lines. It can be up to 40-50 lines per text file.
The only way I can think of to pick out the data is to select only those rows which have 5 columns with a certain number of spaces between them.
I have tried it using a foreach loop, but that didn't work out very well. Maybe I am just not able to implement it.
DataTable dt = new DataTable();
using (StreamWriter sw = File.CreateText(path))
{
string[] rows = content.Split('\n');
foreach (string s in rows)
{
// how to pick up rows when there are only 5 columns in a row separated by a definite number of space?
string[] columns = s.Split(' '); // how to calculate exact spaces here, because space count could be different from one column to the other. Ex: difference between first column and second is 16 and second to third is 8.
foreach (string t in columns)
{
}
}
}
A lot of this comes down to massaging and sanitizing the data (yuck!). I would:
1. Use String.Split on content to get all lines (like you did)
string[] lines = content.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);
2. Filter out empty lines and loop over the result
foreach (string line in lines.Where(x => !String.IsNullOrEmpty(x.Trim())))
3. Use String.Split on each line to split out each field for a particular row, stripping white space
string[] fields = line.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
At this point you can count the number of fields in the row or run a check against each individual field.
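The three steps can be put together roughly as follows. Note that counting fields alone is not enough for the sample file, because the description line "C D 8989898989898 some words" also splits into five fields. This sketch therefore adds one extra check, which is an assumption based on the two sample data rows: the last column must be an integer. RowFilter and ExtractFiveColumnRows are illustrative names:

```csharp
using System;
using System.Collections.Generic;

static class RowFilter
{
    // Returns the fields of every line that has exactly five columns
    // and whose fifth column parses as an integer (assumed from the sample rows).
    public static List<string[]> ExtractFiveColumnRows(string content)
    {
        var rows = new List<string[]>();
        string[] lines = content.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);

        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line))
                continue; // skip empty lines

            // RemoveEmptyEntries makes the number of spaces between columns irrelevant.
            string[] fields = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

            if (fields.Length == 5 && int.TryParse(fields[4], out _))
                rows.Add(fields);
        }

        return rows;
    }
}
```

Each returned array can then be added to the DataTable as one row. If real files contain description lines that end in a number, a stricter per-field check would be needed.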
This is an ideal place to use a regular expression: it finds only the lines that fit your needs, and with proper grouping it also gives you the trimmed values of the five columns directly.
The search expression would be something like "^(K[A-Z0-9]+) +([0-9]+) +([A-Z0-9]+) +([A-Z]+) +([0-9]+) *$" or similar. Knowing regex has helped me a lot in programming.
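A small sketch of how that expression could be applied per line. The pattern is the one quoted above; RowMatcher and TryParse are hypothetical names, and the pattern assumes every data row starts with K, as both sample rows do:

```csharp
using System.Text.RegularExpressions;

static class RowMatcher
{
    private static readonly Regex DataRow = new Regex(
        @"^(K[A-Z0-9]+) +([0-9]+) +([A-Z0-9]+) +([A-Z]+) +([0-9]+) *$");

    // Returns the five captured columns, or null when the line is not a data row.
    public static string[] TryParse(string line)
    {
        // Drop a trailing '\r' left over when the content was split on '\n' only.
        Match m = DataRow.Match(line.TrimEnd('\r'));
        if (!m.Success)
            return null;

        var columns = new string[5];
        for (int i = 0; i < 5; i++)
            columns[i] = m.Groups[i + 1].Value; // group 0 is the whole match
        return columns;
    }
}
```

Calling RowMatcher.TryParse on each element of rows yields null for the description lines and a five-element array for the data rows.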
So, I've been learning C# and I need to remove everything after the ":" character.
I've used a StreamReader to read the text file, but then I can't use the Split function. I then tried importing it with an int function, but again I can't use the Split function.
What I want this to do is import a text file that's written like:
name:lastname
name2:lastname2
so that it only shows name and name2.
I've been searching this for a couple of days but I can't seem to figure it out!
I don't know what I'm doing wrong and how to import the text file without using StreamReader or anything else.
Edit:
I'm trying to post something to a website that goes like:
example.com/q=(name without ":")
Edit 2:
StreamReader list = new StreamReader(@"list.txt");
string reader = list.ReadToEnd();
string[] split = reader.Split(":".ToCharArray());
Console.WriteLine(split);
gives this output:
System.String[]
You've got a few issues here. First, use File.ReadLines() instead of a StreamReader; it's much simpler and easier:
IEnumerable<string> lines = File.ReadLines("path/to/file");
Next, your lines variable needs to be iterated so you can get to each line of the collection:
foreach (string line in lines)
{
//TODO: write split logic here
}
Then you have to split each line on the ':' character:
string[] split = line.Split(':');
Your split variable is an array of strings (i.e. string[]), which means you have to access a specific index of the array to see its value. This is your second issue: if you pass split to Console.WriteLine(), under the hood it just calls .ToString() on the object you passed, and for a string[] that returns the type name (System.String[]) rather than the values. You have to write the values out yourself.
So if your line variable was "name:Steve", the split variable would have two elements and look like this:
//split[0] = "name"
//split[1] = "Steve"
I made a fiddle here that demonstrates.
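Putting those pieces together, a minimal sketch might look like this. NameExtractor is an illustrative name, and the sketch assumes one name:lastname pair per line, as in the question:

```csharp
using System.Collections.Generic;

static class NameExtractor
{
    // Returns the part of each line before the first ':'.
    // Lines without a ':' are returned whole; empty lines are skipped.
    public static List<string> FirstParts(IEnumerable<string> lines)
    {
        var names = new List<string>();
        foreach (string line in lines)
        {
            if (line.Length == 0)
                continue;

            string[] split = line.Split(':');
            names.Add(split[0]); // index 0 is everything before the first ':'
        }
        return names;
    }
}
```

It could be called as NameExtractor.FirstParts(File.ReadLines("list.txt")), after which each returned name can be printed with Console.WriteLine or appended to the URL.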
If your file is small and each name:lastname pair is on its own line, use:
var lines = File.ReadAllLines("filePath");
foreach (var line in lines)
{
var array = line.Split(':');
if (array.Length > 0)
{
var name = array[0];
}
}
If each name:lastname pair isn't on its own line, tell me how they are separated.
I have a text file which contains lines that I need to process. Here is the format of the lines in my text file:
07 IVIN 15:37 06/03 022 00:00:14 600 2265507967 0:03
08 ITRS 15:37 06/03 022 00:00:09 603 7878787887 0:03
08 ITRS 15:37 06/03 022 00:00:09 603 2265507967 0:03
As per my requirement, I have to read this text file line by line. As soon as I find ITRS in a line, I have to search the immediately preceding lines of the file for the same number, e.g. 2265507967. As soon as that number is found in one of the preceding lines, that line should be read.
Right now I am reading the lines into strings and splitting them into tokens on spaces. Here is my code:
var strings = line.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
My problem is that I can't find a way to traverse backwards through the text file and search for the substring, i.e. 2265507967. Please help.
I am not aware of being able to go backwards when reading a file (other than using the seek() method) but I might be wrong...
A simpler approach would be to:
1. Create a dictionary whose key is the long numeric value and whose value is the line it belongs to: <2265507967, 07 IVIN 15:37 06/03 022 00:00:14 600 2265507967 0:03>
2. Go through the file one line at a time and:
a. If the line contains ITRS, get the number from the line and look it up in the dictionary. Once you have found it, clear the dictionary and go back to step 1.
b. If it does not contain ITRS, simply add the number and the line as a key-value pair.
This should be quicker and simpler than re-reading the earlier lines for every ITRS line. The drawback is that it could be quite memory intensive.
EDIT: I do not have a .NET compiler handy, so I will provide some pseudo code to better explain my answer:
// Initialization
var previousLines = new Dictionary<string, string>();
using (var reader = new StreamReader(filePath))
{
    string line;
    // Read the file one line at a time
    while ((line = reader.ReadLine()) != null)
    {
        string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 8)
            continue; // skip lines that do not have the expected layout
        // In the sample lines the long number is the 8th field (index 7)
        string number = parts[7];
        if (line.Contains("ITRS"))
        {
            // Use the dictionary to fetch a line you have previously read.
            if (previousLines.TryGetValue(number, out string matchedLine))
            {
                previousLines.Clear(); // Remove the elements so that they do not interfere with the next searches. I am assuming that you want to search between lines which are found between ITRS tags. If you do not want this, simply omit this line.
                // ... Do your logic with matchedLine here.
            }
        }
        else
        {
            // The indexer overwrites an existing key, so a repeated number does not throw
            previousLines[number] = line;
        }
    }
}
I'm trying to replace the pipe symbol (|) with a new line (\n) in my text file (test1.txt). But when I save it to another text file (test2.txt), the result does not appear in test2.txt, although I do see it in my console window. Can anyone please help with this?
string lines = File.ReadAllText(@"C:\NetProject\Nag Assignments\hi.txt");
//string input = "abcd|efghijk|lmnopqrstuvwxyz";
lines = lines.Replace('|', '\n');
File.WriteAllText(@"C:\NetProject\Nag Assignments\hi2.txt", lines);
Console.WriteLine(lines);
You can try this one:
lines = lines.Replace("|", Environment.NewLine);
Environment.NewLine returns "\r\n" on non-Unix platforms and "\n" on Unix platforms, according to the documentation.
Seems like you want multiple things here. (both original question and subsequent comments)
One is to separate the lines and be able to reference them separately:
string[] separatedLines = lines.Split('|');
The other is to join them back together with a different separator:
string rejoinedLines = string.Join(Environment.NewLine, separatedLines);
You then have access to the individual lines from the separatedLines variable above such as separatedLines[0] and you can also write the rejoinedLines variable back to the other file like you wanted.
EDIT: For example, the following code:
string lines = "a|bc|def";
string[] separatedLines = lines.Split('|');
string rejoinedLines = string.Join(Environment.NewLine, separatedLines);
for (int i = 0; i < separatedLines.Length; i++)
{
Console.WriteLine("Line {0}: {1}", i + 1, separatedLines[i]);
}
Gives output of:
Line 1: a
Line 2: bc
Line 3: def
Instead of:
lines = lines.Replace('|', '\n');
Try:
lines = lines.Replace("|","\r\n");
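If you also want every substring, call Split on the original text before the replacement (afterwards there is no '|' left to split on):
string[] space = lines.Split('|');
This saves every substring in space.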
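On Windows the line break should be \r\n (carriage return followed by line feed). File.WriteAllText writes the string exactly as given and does not translate \n into \r\n, so editors that only recognize \r\n, such as older versions of Notepad, show the output on a single line.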