Reading and manipulating text files: variable suggestion - C#

I'm new to C# and am working on a fun, maybe useful, program for my education.
I've got data that's stored in files, one file for each entry, in the format below. I believe the values are in a particular order in the file as well.
There are around 100 values in each file. My program will basically alter a few of these
values and write them back to the file.
I'm trying to figure out how I should store these values. I know how to read the text file.
I thought about reading each line and storing it in an array. Does anyone have any other suggestions? Would this be a good use case for a class?
D:"value1"=00000800
D:"value2"=00000001
S:"value3"=full

Glad you have picked up C#. I hope you find learning it rewarding.
One of the methods I prefer when I want to modify a file in C# is to call File.ReadAllLines first and then File.WriteAllLines. For these two static methods, you will need using System.IO.
To parse the text, you might need String.Split.
Here's an example:
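using System;
using System.IO;
class Test
{
    public static void Main()
    {
        var filepath = @"myfile.txt";
        // Read all lines.
        var allLines = File.ReadAllLines(filepath);
        // Modify your text here.
        for (int i = 0; i < allLines.Length; i++)
        {
            // Split on ':', '"' and '='. For D:"value1"=00000800 this yields
            // { "D", "", "value1", "", "00000800" }: type, name and value sit at indices 0, 2 and 4.
            var components = allLines[i].Split(new char[] { ':', '"', '=' });
            // Leave lines that don't have the expected shape untouched.
            if (components.Length != 5) continue;
            // Change X:"value_i"=Y to X:"value_i"=5 and store the rebuilt line in the array.
            allLines[i] = components[0] + ":\"" + components[2] + "\"=" + "5";
        }
        // Write all lines.
        File.WriteAllLines(filepath, allLines);
    }
}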
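On the question of whether a class is a good fit: it can be, if you want to look values up by name instead of by array index. Below is a minimal sketch, not a definitive design. The Entry type, its member names, the file name taken from args[0], and the "value2" / "00000005" edit are all made up for illustration, and it assumes every line has the TYPE:"NAME"=VALUE shape with no blank lines.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical type holding one line such as D:"value1"=00000800.
public sealed class Entry
{
    public string Type = "";   // text before the colon, e.g. "D" or "S"
    public string Name = "";   // text between the quotes
    public string Value = "";  // text after the equals sign

    // Parses TYPE:"NAME"=VALUE. Assumes the name itself contains no '"' character.
    public static Entry Parse(string line)
    {
        int colon = line.IndexOf(':');
        int closing = line.IndexOf("\"=", StringComparison.Ordinal);
        if (colon < 0 || closing < colon + 2 || line[colon + 1] != '"')
            throw new FormatException("Unexpected line: " + line);
        return new Entry
        {
            Type = line.Substring(0, colon),
            Name = line.Substring(colon + 2, closing - colon - 2),
            Value = line.Substring(closing + 2)
        };
    }

    // Rebuilds the line in the original format.
    public override string ToString()
    {
        return Type + ":\"" + Name + "\"=" + Value;
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        if (args.Length != 1) { Console.Error.WriteLine("usage: <file>"); return; }
        // A List keeps the entries in file order, which matters when writing them back.
        List<Entry> entries = File.ReadAllLines(args[0]).Select(l => Entry.Parse(l)).ToList();
        // Illustrative edit: change one value by name.
        Entry target = entries.FirstOrDefault(e => e.Name == "value2");
        if (target != null) target.Value = "00000005";
        File.WriteAllLines(args[0], entries.Select(e => e.ToString()));
    }
}
```

A list of small objects keeps the original line order, so the file can be written back in the same order it was read.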

Related

How to extract text from multiple files

I have upwards of 200 files that I need to extract a certain sequence of lines from, and write the results in a new csv file. I am just learning C#, but have experience with other languages far in the past. I have tried looking up all the individual steps, along with Regex, which I don't understand, but I don't know how to stitch it all together.
Sample text:
--> SAT1_988_Connection_Verify
EA0683010A01030F15A40202004E2000
E0068300
E40683010278053A
>
(S45, 10:38:35 AM)
Algorithm Steps
1) I need to point the program at a directory with the files.
2) I need the program to search through each file in the directory.
3) I need to find the lines that start with "E40", of which there could be several or none. Additionally, these lines vary in length.
4) I need to grab that line, as well as the two before it, which are highlighted in the nested block quote above.
5) There is always a blank line after the target line.
6) I need to write those three lines separated by commas in a text document.
My code so far:
using System;
using System.Collections.Generic;
using System.IO;
namespace ConsoleApplication2
{
class Program
{
static void Main()
{
string path = @"C:\ETT\Test.txt";
string[] readText = File.ReadAllLines(path);
foreach (string s in readText)
{
}
}
public static string getBetween(string[] strSource, string strKey)
{
int Start, End;
if (strSource.Contains(strKey))
{
Start = Array.IndexOf(strSource, strKey) -2;
End = Array.IndexOf(strSource, strKey) + 1;
return strSource.Substring(Start, End - Start);
}
else
{
return "";
}
}
}
}
There are many ways of doing this. However, just to help you (and because you added a comparatively detailed amount of information for a first post), you need to look up the following topics:
Directory.EnumerateFiles Method
Returns an enumerable collection of file names that match a search
pattern in a specified path.
File.ReadAllLines Method
Opens a text file, reads all lines of the file into a string array,
and then closes the file.
Enumerable.Where<TSource> Method (IEnumerable, Func)
Filters a sequence of values based on a predicate.
String.StartsWith Method
Determines whether the beginning of this string instance matches a
specified string.
https://joshclose.github.io/CsvHelper/
A library for reading and writing CSV files. Extremely fast, flexible,
and easy to use. Supports reading and writing of custom class objects.
CSV helper implements RFC 4180. By default, it's very conservative in
its writing, but very liberal in its reading. There is a large set of
configuration that can be done to change how reading and writing
behaves, giving you the ability read/write non-standard files also.
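The only tricky part will be getting the two lines before each match:
List<T>.IndexOf Method (T)
Searches for the specified object and returns the zero-based index of
the first occurrence within the entire List.
From that index, you can use List[Index-1] and List[Index-2] to get the preceding lines. Note that IndexOf only returns the first occurrence, so a file with several matching lines needs a loop over the indices instead.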
Good luck.
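Stitched together, the pieces above could look roughly like the sketch below. It is only one way to do it, under a few assumptions that are not in the question: the files end in .txt, the directory and output file come from the command line, and plain string.Join is enough because the lines themselves contain no commas (otherwise a CSV library such as CsvHelper is the safer choice). The E40Extractor and Extract names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class E40Extractor
{
    // Returns one "line-2,line-1,line" record for every line that starts with "E40".
    public static List<string> Extract(IReadOnlyList<string> lines)
    {
        var records = new List<string>();
        // Start at index 2 so that two preceding lines always exist.
        for (int i = 2; i < lines.Count; i++)
        {
            if (lines[i].StartsWith("E40", StringComparison.Ordinal))
            {
                records.Add(string.Join(",", lines[i - 2], lines[i - 1], lines[i]));
            }
        }
        return records;
    }

    public static void Main(string[] args)
    {
        // Hypothetical usage: E40Extractor.exe <inputDirectory> <outputFile>
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: <inputDirectory> <outputFile>");
            return;
        }
        var output = new List<string>();
        // The "*.txt" pattern is an assumption about how the files are named.
        foreach (string file in Directory.EnumerateFiles(args[0], "*.txt"))
        {
            output.AddRange(Extract(File.ReadAllLines(file)));
        }
        File.WriteAllLines(args[1], output);
    }
}
```

Looping over indices rather than calling IndexOf handles files with several E40 lines, and files with none simply contribute nothing to the output.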

Get particular information from text file (C#)

I need a C# program which can split strings and copy some particular information into another file.
I've a text file like this:
BRIDGE.V2014R6I1.SOFT
icem.V4R12I2.SOFT
mygale.V4R1I1.SOFT,patch01_MAJ_APL.exe
photoshop.V2014R10I1.SOFT
rhino.V5R0I1.SOFT,patch01_Update_Files.exe
TSFX.V2R3I2.SOFT,patch01_corrections.exe,patch02_clock.exe,patch03_correction_tri_date.exe,patch04_gestion_chemins_unc.exe
and I need only some of this information in another file, as below:
BRIDGE,SOFT
ICEM,SOFT
MYGALE,SOFT
PHOTOSHOP,SOFT
any helps pls :)
As I don't know whether your text file is always like that, I can only provide a specific answer. First of all, as ThatsEli pointed out, you have to split each line at the dots:
var splitted = line.Split('.');
Now it seems that the item at index 2 (zero-based) holds the relevant part, followed by irrelevant information after a comma. So all you have to do is join item 0 and item 2, keeping only the part of item 2 before the first comma:
var res = $"{splitted[0]},{splitted[2].Split(',')[0]}";
However, you seem to want your result in uppercase:
var resUpper = res.ToUpper();
But this only works as long as you have a perfect input file; otherwise you have to check whether the split actually produced that many items, or you'll get an IndexOutOfRangeException.
I'm not sure whether you know how to read from and write to a file, so I'll provide examples of that as well.
Read
var path = @"Your\Path\To\The\Input\File";
if (!File.Exists(path))
{
    // In a non-console application, report the error some other way.
    Console.WriteLine("File doesn't exist!");
    return;
}
var inputString = File.ReadAllText(path);
Write
var outputPath = @"your\Output\Path";
// File.WriteAllText creates the file if it doesn't exist and overwrites it if it does,
// so no File.Exists check is needed before writing.
File.WriteAllText(outputPath, inputString);
I would split the string and create a new file from parts of the array you get from the split.
You can split a string with e.g. Split('.').
Then create a new string, e.g. string stringname = splitstring[0] + "," + splitstring[2];
That would add the first and third part back together.
That would apply to your first line.
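Putting the split, the uppercase step, and the bounds check together, a per-line helper might look like the sketch below. The SoftwareList and ToNameAndType names and the command-line arguments are made up for illustration, and it assumes every useful line has the name.version.type shape shown in the question.

```csharp
using System;
using System.IO;
using System.Linq;

public static class SoftwareList
{
    // Turns "mygale.V4R1I1.SOFT,patch01_MAJ_APL.exe" into "MYGALE,SOFT".
    // Returns null when the line does not have at least three dot-separated parts.
    public static string ToNameAndType(string line)
    {
        string[] parts = line.Split('.');
        if (parts.Length < 3) return null;
        // Keep only the text before the first comma, dropping any patch names.
        string type = parts[2].Split(',')[0];
        return (parts[0] + "," + type).ToUpperInvariant();
    }

    public static void Main(string[] args)
    {
        // Hypothetical usage: SoftwareList.exe <inputFile> <outputFile>
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: <inputFile> <outputFile>");
            return;
        }
        var results = File.ReadAllLines(args[0])
            .Select(l => ToNameAndType(l))
            .Where(r => r != null);   // skip malformed lines instead of throwing
        File.WriteAllLines(args[1], results);
    }
}
```

Returning null for malformed lines is just one choice; logging or throwing would work equally well, depending on how strict you want to be.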

C#: smart way to delete multiple occurrences of a character in a string

My program reads a file which has thousands of lines of something like this below
"Timestamp","LiveStandby","Total1","Total2","Total3", etc..
each line is different
What is the best way to split by , and delete the "" as well as put the values in a list
this is what I have
while ((line = file.ReadLine()) != null)
{
List<string> title_list = new List<string>(line.Split(','));
}
The step above is still missing the deletion of the quotes. I can do a foreach, but that kind of defeats the purpose of having List and Split in just one line. What is the best and smartest way to do it?
The best way in my opinion is to use a library that parses CSV, such as FileHelpers.
Concretely, in your case, this would be the solution using the FileHelpers library:
Define a class that describes the structure of a record:
[DelimitedRecord(",")]
public class MyDataRecord
{
[FieldQuoted('"')]
public string TimeStamp;
[FieldQuoted('"')]
public string LiveStandby;
[FieldQuoted('"')]
public string Total1;
[FieldQuoted('"')]
public string Total2;
[FieldQuoted('"')]
public string Total3;
}
Use this code to parse the entire file:
var csvEngine = new FileHelperEngine<MyDataRecord>(Encoding.UTF8)
{
Options = { IgnoreFirstLines = 1, IgnoreEmptyLines = true }
};
var parsedItems = csvEngine.ReadFile(@"D:\myfile.csv");
Please note that this code is for illustration only and I have not compiled/run it. However, the library is pretty straightforward to use and there are good examples and documentation on the website.
Keeping it simple like this should work:
List<string> strings = new List<string>();
string line;
while ((line = file.ReadLine()) != null)
    strings.AddRange(line.Replace("\"", "").Split(','));
I'm going to clarify this a bit. If you have a user-formatted file that has a predictable format (i.e. the user has generated the data out of Excel or a similar program), then you are way better off using an existing parser that is well tested.
Scenarios like the following are just a few examples that manual parsing will have problems with:
"column 1", 2, 0104400, $1,300, "This is an interesting question, he said"
... and there are more, with escaping, file formats etc., that can be a headache if you roll your own.
If you do use a parser, make sure you get one that can tolerate differences in the number of columns per row, as it can make a difference.
If, on the other hand, you know what's going into the data which is common in system generated files then using CSV parsers will cause more problems than they solve. For example, I have dealt with scenarios where the first part is fixed and can be strongly typed, but there are following parts in a row that are not. This can also happen if you're parsing flat file data in fixed width scenarios from legacy databases. A csv solution makes assumptions we don't want and is not the right solution in many of those cases.
If that is the case and you just want to strip out quotes after splitting on commas, then try a bit of LINQ. This can also be extended to replace other specific characters you are worried about.
line.Split(',').Select(i => i.Replace("\"", "")).ToArray()
Hope that clears up all the conflicting advice.
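To make the trade-off concrete, here is a small sketch of the LINQ one-liner wrapped in a method, run against one well-behaved line and one line with a comma inside a quoted field. The NaiveSplitDemo and NaiveSplit names and the second sample line are made up for illustration.

```csharp
using System;
using System.Linq;

public static class NaiveSplitDemo
{
    // Splits on every comma and then strips quotes. It has no notion of quoted fields.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',').Select(f => f.Replace("\"", "")).ToArray();
    }

    public static void Main()
    {
        // System-generated line without embedded commas: two fields, as intended.
        Console.WriteLine(NaiveSplit("\"Timestamp\",\"Total1\"").Length);

        // Comma inside a quoted field: the field is cut in two, giving three pieces.
        Console.WriteLine(NaiveSplit("\"id\",\"Hello, world\"").Length);
    }
}
```

If the second case can never occur in your data, the one-liner is fine; if it can, that is the point at which a real CSV parser earns its keep.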
You can use the Array.ConvertAll() function.
string line = "\"Timestamp\",\"LiveStandby\",\"Total1\",\"Total2\",\"Total3\"";
var list = new List<String>(Array.ConvertAll(line.Split(','), x=> x.Replace("\"","")));
Perform the Replace first, then Split into your List. Here's your code with Replace.
while ((line = file.ReadLine()) != null)
{
List<string> title_list = new List<string>(line.Replace("\"", "").Split(','));
}
Although, you're going to need a variable to hold all of the Lists, so look at using AddRange().

How should I use the following code in VS?

We recently received a bunch of files with tab-delimiters.
We were having difficulties importing them in sql server database.
The vendor who sent the files also sent the code below for us to use in converting the files from tab to comma delimiters.
How do I use this file in Visual Studio?
I have used Visual Studio several times before, but I have not used it with just a single file such as this.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace TabToComma
{
class Program
{
static void Main(string[] args)
{
StreamReader sr;
StreamWriter sw;
sr = new StreamReader(@"c:\input.txt");
sw = new StreamWriter(@"c:\output.txt");
string nextline;
string replacedline;
while (sr.Peek() >= 0)
{
nextline = sr.ReadLine();
replacedline = nextline.Replace('\t',','); // replace each tab in line with a comma
sw.WriteLine(replacedline);
}
sr.Close();
sw.Close();
}
}
}
Alternatively, if someone knows how I can accomplish same thing using vbscript please point me in the right direction.
Thanks a lot in advance.
Create a console app, and replace contents of generated program.cs with the text above. And then, hit RUN :)
You need to create a new Console application and then paste this code into the example file created as part of the solution. Then change the "c:\input.txt" to be the file you want to convert and then hit run.
Also, here's a replacement for the content of Main() that might make your life easier, as long as the files are small enough to be read into memory in one go:
foreach(string f in args) {
System.IO.File.WriteAllText(f, System.IO.File.ReadAllText(f).Replace('\t', ','));
}
Compile and drag and drop all your files onto the resulting executable. They'll be converted automatically.
You can even grab the compiled executable from here: http://dl.dropbox.com/u/2463964/TabsToCommas.exe if you're having trouble compiling it.
OK, it was nice to see all kinds of methods in the answers for replacing characters in a string. Unfortunately, reality is not as easy as that. How do you handle data with commas in it, for example? Telephone bill{tab}USD{tab}1,234.00 becomes Telephone bill,USD,1,234.00: an extra column is inserted and the data gets corrupted, because the database registers that your telephone bill was only one dollar. Luckily, the problem is not the other way around, because even The Scripting Guy doesn't have a watertight solution for that.
What your vendor should have delivered is a line-by-line reader in which every line is split on the tab character into an array of values. Each value is then checked for commas and, if it contains one or more, wrapped in double quotes. After that, the array is joined on the comma into a string to make it a 'real' CSV file.
But why go through all the hassle if you can tackle the problem at the source; why not flag your data as tab delimited in SQL?
BULK
INSERT TableYouWantToImportTo
FROM 'c:\input.txt'
WITH
(
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n'
)
GO
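If you do need a CSV file after all, the line-by-line conversion described above (split on tabs, quote the values that need it, join on commas) could be sketched as follows. This is not the vendor's code. The TabToCsv, QuoteIfNeeded and ConvertLine names and the command-line arguments are made up, and doubling embedded double quotes is an extra step borrowed from the usual CSV convention (RFC 4180), not something mentioned above.

```csharp
using System;
using System.IO;
using System.Linq;

public static class TabToCsv
{
    // Wraps a field in double quotes when it contains a comma, a quote or a line break,
    // doubling any embedded quotes (the RFC 4180 convention).
    public static string QuoteIfNeeded(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Converts one tab-delimited line into one comma-delimited line.
    public static string ConvertLine(string line)
    {
        return string.Join(",", line.Split('\t').Select(f => QuoteIfNeeded(f)));
    }

    public static void Main(string[] args)
    {
        // Hypothetical usage: TabToCsv.exe <inputFile> <outputFile>
        // The two paths must differ, because the input is read lazily while the output is written.
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: <inputFile> <outputFile>");
            return;
        }
        File.WriteAllLines(args[1], File.ReadLines(args[0]).Select(l => ConvertLine(l)));
    }
}
```

With this, Telephone bill{tab}USD{tab}1,234.00 becomes Telephone bill,USD,"1,234.00", so the amount stays in one column.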

Is there a C# language construct/framework object to apply a function to each line of a file?

What I'm looking for is something along these lines:
var f = UsefulFileObject(@"c:\temp.log");
f.ForEachLine( (line) => line.Trim() );
After that I'd expect each line in the file to be trimmed.
Is there such an object in the framework already or should I start making my own one?
After seeing several wrong solutions using ForEach on an immutable type such as string, I guess I'll post my own.
This will actually read and write the mentioned file. Still, it's for smallish files only as the entire contents will be in memory*.
string filename = @"C:\testfile.txt";
string[] fileLines = File.ReadAllLines(filename);
fileLines = Array.ConvertAll(fileLines, l => l.Trim());
File.WriteAllLines(filename, fileLines);
In the real world, you'd probably want to write to a different file first and rename it to the original file after the operation has succeeded. Otherwise parts of the file could be lost if something went wrong halfway writing.
* Regarding memory usage:
Actually, the file will be in memory twice momentarily. You could solve that by using a regular for loop instead of any fancy extension methods and lambda expressions. I'll skip that exercise and go straight to the 'proper' way of doing this for larger files:
using (StreamReader reader = new StreamReader(@"D:\infile.txt"))
using (StreamWriter writer = new StreamWriter(@"D:\outfile.txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
writer.WriteLine(line.Trim());
}
}
// Some File.Move(..) usage to rename the files
Implementing "ForEachLine":
If you want to implement a helper method that does what you describe, you could hide the ugly renaming logic in there.
The method signature would be something like:
public void ForEachLine(Func<string, string> func)
and the line doing the writing would just be:
writer.WriteLine(func(line));
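A minimal sketch of such a helper is shown below, written as a static method that takes the path as a parameter rather than as a member of a file object. The LineTransformer class name and the ".tmp" suffix for the temporary file are made up for illustration, and the delete-then-move swap at the end is simple rather than atomic.

```csharp
using System;
using System.IO;

public static class LineTransformer
{
    // Applies func to every line of the file at path, going through a temporary file.
    public static void ForEachLine(string path, Func<string, string> func)
    {
        // Hypothetical naming scheme: the temporary file sits next to the original.
        string tempPath = path + ".tmp";
        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(func(line));
            }
        }
        // Swap the files only after both streams are closed and every line was written.
        File.Delete(path);
        File.Move(tempPath, path);
    }

    public static void Main(string[] args)
    {
        // Hypothetical usage: LineTransformer.exe <file>
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: <file>");
            return;
        }
        ForEachLine(args[0], line => line.Trim());
    }
}
```

If func throws halfway through, the original file is still intact and only the temporary file is left behind, which is the main reason for the detour.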
Since you can't modify a text file in place (well, obviously you can, but it's just freaking hard), such an object would basically be centered around a read-modify-write system.
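For rather small files, use Array.ConvertAll(File.ReadAllLines(filename), l => l.Trim()) and write the result back with File.WriteAllLines. (Calling ForEach(l => l.Trim()) would throw the trimmed strings away, because strings are immutable.)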
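You have the System.IO.File.ReadAllLines() method, which returns a string[] (array of strings).
The array can then be used with a foreach loop, or you can convert it to a List and use the ForEach(Action) method. Note that strings are immutable, so ForEach(l => l.Trim()) discards the trimmed results; to keep them, project the lines into a new collection (this needs using System.Linq).
So this should work:
string[] logLines = System.IO.File.ReadAllLines(@"c:\temp.log");
List<string> trimmedLines = logLines.Select(l => l.Trim()).ToList();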
