Searching for text in a .txt - c#

What would be the best way to search a text file that looks like this..?
efee|| Nbr| Address| Name |Phone|City|State|Zip abc
||455|gsgd |first last|gsg |fef |jk |0393 gjgj||jfj|ddg
|first last|fht |ree |hn |th ...more lines...
I started by reading in the file and all its contexts with a streamreader
I was thinking to count the "|" and grab the text between the 5th and 6th using substring but i'm not sure how to do the count of the "|". Or if someone has a better idea I'm open to it.
Tried something like this:
StreamReader file = new StreamReader(#"...");
string line;
int num=0;
while ((line = file.ReadLine()) != null)
{
for (int i = 1; i <= 6; i++)
{
if (line.Contains("|"))
{
num++;
}
}
int start = line.IndexOf("|");
int end = line.IndexOf("|");
string result = line.Substring(start, end - start - 1);
}
The text I want I beleive is always between the 5th and 6th "|"

You can do it like this:
var res = File
.ReadLines(#"FileName.txt")
.Select(line => line.Split(new[]{'|'}, StringSplitOptions.None)[5])
.ToList();
This produces a List<strings> from the file, where each string is the part of the corresponding line of the file taken from between the fifth and the sixth '|' separator.

For a delimited file you should use a parser - there is one in the Microsoft.VisualBasic.FileIO namespace - the TextFieldParser class, though you could also look at third-party libraries like the popular FileHelpers.
A simpler approach would be to use string.Split on the | character and getting the value in the corresponding index of the returned string[], however, if any of the fields are escaped and can validly contain | internally, this will fail.

You could split each line into an array:
while ((line = file.ReadLine()) != null)
{
var values = line.Split('|');
}

This should work
string txt = File.ReadAllText("file.txt");
string res = Regex.Match(txt, "\\|*?{5}(.+?)\\|", RegexOptions.Singleline).Result("$1");

Related

Alternative to File.AppendAllText for newline

I am trying to read characters from a file and then append them in another file after removing the comments (which are followed by semicolon).
sample data from parent file:
Name- Harly Brown ;Name is Harley Brown
Age- 20 ;Age is 20 years
Desired result:
Name- Harley Brown
Age- 20
I am trying the following code-
StreamReader infile = new StreamReader(floc + "G" + line + ".NC0");
while (infile.Peek() != -1)
{
letter = Convert.ToChar(infile.Read());
if (letter == ';')
{
infile.ReadLine();
}
else
{
System.IO.File.AppendAllText(path, Convert.ToString(letter));
}
}
But the output i am getting is-
Name- Harley Brown Age-20
Its because AppendAllText is not working for the newline. Is there any alternative?
Sure, why not use File.AppendAllLines. See documentation here.
Appends lines to a file, and then closes the file. If the specified file does not exist, this method creates a file, writes the specified lines to the file, and then closes the file.
It takes in any IEnumerable<string> and adds every line to the specified file. So it always adds the line on a new line.
Small example:
const string originalFile = #"D:\Temp\file.txt";
const string newFile = #"D:\Temp\newFile.txt";
// Retrieve all lines from the file.
string[] linesFromFile = File.ReadAllLines(originalFile);
List<string> linesToAppend = new List<string>();
foreach (string line in linesFromFile)
{
// 1. Split the line at the semicolon.
// 2. Take the first index, because the first part is your required result.
// 3. Trim the trailing and leading spaces.
string appendAbleLine = line.Split(';').FirstOrDefault().Trim();
// Add the line to the list of lines to append.
linesToAppend.Add(appendAbleLine);
}
// Append all lines to the file.
File.AppendAllLines(newFile, linesToAppend);
Output:
Name- Harley Brown
Age- 20
You could even change the foreach-loop into a LINQ-expression, if you prefer LINQ:
List<string> linesToAppend = linesFromFile.Select(line => line.Split(';').FirstOrDefault().Trim()).ToList();
Why use char by char comparison when .NET Framework is full of useful string manipulation functions?
Also, don't use a file write function multiple times when you can use it only one time, it's time and resources consuming!
StreamReader stream = new StreamReader("file1.txt");
string str = "";
while ((string line = infile.ReadLine()) != null) { // Get every line of the file.
line = line.Split(';')[0].Trim(); // Remove comment (right part of ;) and useless white characters.
str += line + "\n"; // Add it to our final file contents.
}
File.WriteAllText("file2.txt", str); // Write it to the new file.
You could do this with LINQ, System.File.ReadLines(string), and System.File.WriteAllLines(string, IEnumerable<string>). You could also use System.File.AppendAllLines(string, IEnumerable<string>) in a find-and-replace fashion if that was, in fact, the functionality you were going for. The difference, as the names suggest, is whether it writes everything out as a new file or if it just appends to an existing one.
System.IO.File.WriteAllLines(newPath, System.IO.File.ReadLines(oldPath).Select(c =>
{
int semicolon = c.IndexOf(';');
if (semicolon > -1)
return c.Remove(semicolon);
else
return c;
}));
In case you aren't super familiar with LINQ syntax, the idea here is to loop through each line in the file, and if it contains a semicolon (that is, IndexOf returns something that is over -1) we cut that off, and otherwise, we just return the string. Then we write all of those to the file. The StreamReader equivalent to this would be:
using (StreamReader reader = new StreamReader(oldPath))
using (StreamWriter writer = new StreamWriter(newPath))
{
string line;
while ((line = reader.ReadLine()) != null)
{
int semicolon = line.IndexOf(';');
if (semicolon > -1)
line = c.Remove(semicolon);
writer.WriteLine(line);
}
}
Although, of course, this would feed an extra empty line at the end and the LINQ version wouldn't (as far as I know, it occurs to me that I'm not one hundred percent sure on that, but if someone reading this does know I would appreciate a comment).
Another important thing to note, just looking at your original file, you might want to add in some Trim calls, since it looks like you can have spaces before your semicolons, and I don't imagine you want those copied through.

Reading a participant list from a .csv into C# program

I'm writing a small program that reads some people's firstname, surname, ID and email from an Excel sheet into the console, which isn't the problem, but instead of getting this output:
Poul EjnarRovsingpersomething#mail.com
ReneBach2014914something#mail.com
JohnJohnsson3950185something#mail.com
I want the output to be similar to this:
Poul Ejnar Rovsing per something#mail.com
Rene Bach 2014914 something#mail.com
John Johnsson 3950185 something#mail.com
The code I'm using is giving me this output, which is certainly a step in the right direction, but not quite what I'm looking for:
Poul Ejnar Rovsing per something#mail.com
Rene Bach 2014914 something#mail.com
John Johnsson 3950185 something#mail.com
And for some reason it's only outputting every other row instead of all of them, which is also puzzling me quite a bit. What am I missing here?
This is my code:
static void Main(string[] args)
{
string[] tokens;
char[] separators = {';'};
string str = "";
string newSeparator = " ";
FileStream fs = new FileStream(#"D:\Dokumenter\Skole\6. semester\GUI\Exercises\Exercise2\02 deltagerliste.csv", FileMode.Open);
StreamReader sr = new StreamReader(fs, Encoding.Default);
while ((str = sr.ReadLine()) != null)
{
str = sr.ReadLine();
tokens = str.Split(separators, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(tokens[0] + newSeparator + tokens[1] + newSeparator + tokens[2] + newSeparator + tokens[3]);
}
Console.ReadLine();
}
Fixed Width Outputs
For fixed width formatting, you can take advantage of composite formatting and alignments using String.Format. For example:
String.Format("{0,10}", "name"); // output blocks of 10 characters, right aligned
String.Format("{0,-10}", "name"); // output blocks of 10 characters, left aligned
Format strings are of the form: {index[,alignment][:formatString]}. To left align an item, use a negative value for alignment.
To use this in a composite format string, you just add more format placeholders in curly brackets, the index corresponds to the index of the argument in String.Format:
var sString = "name";
var anInt = 1;
var aDecimal = 1.23M;
var s = String.Format("|{0,10}|{1,10:0}|{2,10:0.00}|", sString, anInt, aDecimal);
Output:
| name| 1| 1.23|
Line skipping
And, it is skipping every other line as every time you iterate in the while loop, you read one line, then read again:
while ((str = sr.ReadLine()) != null) // <--- first read
{
str = sr.ReadLine(); // <--- second read replaces the first one
try a do loop with the while and the read at the end
str = sr.ReadLine();
do {
... do stuff here ...
} while ((str = sr.ReadLine()) != null);
You invoke the ReadLine twice, and you are skipping a row also use \t, this indent your output in tabs.
You are reading every other line because there are two calls to the StreamReader
while ((str = sr.ReadLine()) != null)
{
str = sr.ReadLine(); // Don't need this one!
...
The first call in the While statement will advance the reader one line. The second call will advance it again, overwriting what you previously just read.
To get the spacing correct you could use \t to insert tabs, but you would still need to do some math on the size of each token so you can use the correct number of tabs. Alternatively you could use String.PadRight to make each column a specific length.

Searching strings in txt file

I have a .txt file with a list of 174 different strings. Each string has an unique identifier.
For example:
123|this data is variable|
456|this data is variable|
789|so is this|
etc..
I wish to write a programe in C# that will read the .txt file and display only one of the 174 strings if I specify the ID of the string I want. This is because in the file I have all the data is variable so only the ID can be used to pull the string. So instead of ending up with the example about I get just one line.
eg just
123|this data is variable|
I seem to be able to write a programe that will pull just the ID from the .txt file and not the entire string or a program that mearly reads the whole file and displays it. But am yet to wirte on that does exactly what I need. HELP!
Well the actual string i get out from the txt file has no '|' they were just in the example. An example of the real string would be: 0111111(0010101) where the data in the brackets is variable. The brackets dont exsist in the real string either.
namespace String_reader
{
class Program
{
static void Main(string[] args)
{
String filepath = #"C:\my file name here";
string line;
if(File.Exists(filepath))
{
StreamReader file = null;
try
{
file = new StreamReader(filepath);
while ((line = file.ReadLine()) !=null)
{
string regMatch = "ID number here"; //this is where it all falls apart.
Regex.IsMatch (line, regMatch);
Console.WriteLine (line);// When program is run it just displays the whole .txt file
}
}
}
finally{
if (file !=null)
file.Close();
}
}
Console.ReadLine();
}
}
}
Use a Regex. Something along the lines of Regex.Match("|"+inputString+"|",#"\|[ ]*\d+\|(.+?)\|").Groups[1].Value
Oh, I almost forgot; you'll need to substitute the d+ for the actual index you want. Right now, that'll just get you the first one.
The "|" before and after the input string makes sure both the index and the value are enclosed in a | for all elements, including the first and last. There's ways of doing a Regex without it, but IMHO they just make your regex more complicated, and less readable.
Assuming you have path and id.
Console.WriteLine(File.ReadAllLines(path).Where(l => l.StartsWith(id + "|")).FirstOrDefault());
Use ReadLines to get a string array of lines then string split on the |
You could use Regex.Split method
FileInfo info = new FileInfo("filename.txt");
String[] lines = info.OpenText().ReadToEnd().Split(' ');
foreach(String line in lines)
{
int id = Convert.ToInt32(line.Split('|')[0]);
string text = Convert.ToInt32(line.Split('|')[1]);
}
Read the data into a string
Split the string on "|"
Read the items 2 by 2: key:value,key:value,...
Add them to a dictionary
Now you can easily find your string with dictionary[key].
first load the hole file to a string.
then try this:
string s = "123|this data is variable| 456|this data is also variable| 789|so is this|";
int index = s.IndexOf("123", 0);
string temp = s.Substring(index,s.Length-index);
string[] splitStr = temp.Split('|');
Console.WriteLine(splitStr[1]);
hope this is what you are looking for.
private static IEnumerable<string> ReadLines(string fspec)
{
using (var reader = new StreamReader(new FileStream(fspec, FileMode.Open, FileAccess.Read, FileShare.Read)))
{
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
}
var dict = ReadLines("input.txt")
.Select(s =>
{
var split = s.Split("|".ToArray(), 2);
return new {Id = Int32.Parse(split[0]), Text = split[1]};
})
.ToDictionary(kv => kv.Id, kv => kv.Text);
Please note that with .NET 4.0 you don't need the ReadLines function, because there is ReadLines
You can now work with that as any dictionary:
Console.WriteLine(dict[12]);
Console.WriteLine(dict[999]);
No error handling here, please add your own
You can use Split method to divide the entire text into parts sepparated by '|'. Then all even elements will correspond to numbers odd elements - to strings.
StreamReader sr = new StreamReader(filename);
string text = sr.ReadToEnd();
string[] data = text.Split('|');
Then convert certain data elements to numbers and strings, i.e. int[] IDs and string[] Strs. Find the index of the given ID with idx = Array.FindIndex(IDs, ID.Equals) and the corresponding string will be Strs[idx]
List <int> IDs;
List <string> Strs;
for (int i = 0; i < data.Length - 1; i += 2)
{
IDs.Add(int.Parse(data[i]));
Strs.Add(data[i + 1]);
}
idx = Array.FindIndex(IDs, ID.Equals); // we get ID from input
answer = Strs[idx];

How to get specific line from a string in C#?

I have a string in C# and would like to get text from specific line, say 65. And if file does not have so many lines I would like to get "". How to do this?
Quick and easy, assuming \r\n or \n is your newline sequence
string GetLine(string text, int lineNo)
{
string[] lines = text.Replace("\r","").Split('\n');
return lines.Length >= lineNo ? lines[lineNo-1] : null;
}
private static string ReadLine(string text, int lineNumber)
{
var reader = new StringReader(text);
string line;
int currentLineNumber = 0;
do
{
currentLineNumber += 1;
line = reader.ReadLine();
}
while (line != null && currentLineNumber < lineNumber);
return (currentLineNumber == lineNumber) ? line :
string.Empty;
}
You could use a System.IO.StringReader over your string. Then you could use ReadLine() until you arrived at the line you wanted or ran out of string.
As all lines could have a different length, there is no shortcut to jump directly to line 65.
When you Split() a string you duplicate it, which would also double the memory consumption.
If you have a string instance already, you can use String.Split to split each line and check if line 65 is available and if so use it.
If the content is in a file use File.ReadAllLines to get a string array and then do the same check mentioned before. This will work well for small files, if your file is big consider reading one line at a time.
using (var reader = new StreamReader(File.OpenRead("example.txt")))
{
reader.ReadLine();
}
What you can do is, split the string based on the newline character.
string[] strLines = yourString.split(Environment.NewLine);
if(strLines.Length > lineNumber)
{
return strLines[lineNumber];
}
theString.Split("\n".ToCharArray())[64]
Other than taking advantage of a specific file structure and lower level file operations, I don't think theres any faster way than to read 64 lines, discard them and then read the 65th line and keep it. At each step, you can easily check if you've read the entire file.

C# writing out text files matching listbox and contents of another text file

I have a file created from a directory listing. From each of item a user selects from a ListBox, the application reads the directory and writes out a file that matches all the contents. Once that is done it goes through each item in the ListBox and copies out the item that matches the ListBox selection. Example:
Selecting 0001 matches:
0001456.txt
0001548.pdf.
The code i am using isn't handling 0s very well and is giving bad results.
var listItems = listBox1.Items.OfType<string>().ToArray();
var writers = new StreamWriter[listItems.Length];
for (int i = 0; i < listItems.Length; i++)
{
writers[i] = File.CreateText(
Path.Combine(destinationfolder, listItems[i] + "ANN.TXT"));
}
var reader = new StreamReader(File.OpenRead(masterdin + "\\" + "MasterANN.txt"));
string line;
while ((line = reader.ReadLine()) != null)
{
for (int i = 0; i < listItems.Length; i++)
{
if (line.StartsWith(listItems[i].Substring(0, listItems[i].Length - 1)))
writers[i].WriteLine(line);
}
}
Advice for correcting this?
Another Sample:
I have 00001 in my listbox: it returns these values:
00008771~63.txt
00002005~3.txt
00009992~1.txt
00001697~1.txt
00000001~1.txt
00009306~2.txt
00000577~1.txt
00001641~1.txt
00001647~1.txt
00001675~1.txt
00001670~1.txt
It should only return:
00001641~1.txt
00001647~1.txt
00001675~1.txt
00001670~1.txt
00001697~1.txt
Or if someone could just suggest a better method for taking each line in my listbox searching for line + "*" and whatever matches writes out a textfile...
This is all based pretty much on the one example you gave, but I believe the problem is that when you are performing your matching, you are getting the substring if your list item value and chopping off the last character.
In your sample you are attempting to match files starting with "00001", but when you do the match you are getting substring starting at zero and value.length-1 characters, which in this case would be "0000". For example:
string s = "00001";
Console.WriteLine(s.Substring(0,s.Length-1));
results in
0000
So I think if you just changed this line:
if (line.StartsWith(listItems[i].Substring(0, listItems[i].Length - 1)))
writers[i].WriteLine(line);
to this
if (line.StartsWith(listItems[i]))
writers[i].WriteLine(line);
you would be in good shape.
Sorry if I misunderstood your question, but let's start with this:
string line = String.Empty;
string selectedValue = "00001";
List<string> matched = new List<string>();
StreamReader reader = new StreamReader(Path.Combine(masterdin, "MasterANN.txt"));
while((line = reader.ReadLine()) != null)
{
if(line.StartsWith(selectedValue))
{
matched.Add(line);
}
}
This will match all lines from your MasterANN.txt file which begins with "00001" and add them into a collection (later we'll work on writing this into a file, if required).
This clarifies something?

Categories

Resources