How to find the location of a string in a text file

How to find the location of a string in a text file - c#

I'm trying to figure out how to find a certain string and display how many lines down it is in a text file.
For example let's saying I'm trying to find the string "I'm a string" in a text file then also have the location of the string(As in lines down) recorded in a variable.
Anyone got any tips too accomplish this?
Thanks

First, I would read in the file, then loop through each line searching for the text. Something like...
string[] lines = System.IO.File.ReadAllLines(#"C:\file.txt");
int count = 0;
foreach (string line in lines)
{
count++;
if (line.indexOf("I'm a string") > -1) {
// found it
}
}

Since this looks like a HW Question, I will not be posting the complete solution, but only pointers and guidelines.
You basically want to scan through your whole text file, letter by letter, reading the next n chars, where n is the length of your search string.
If that set matches your search string, you have your answer.
The number of "\n" you encounter is the number of lines you had to traverse through.
There exist simpler regex solutions also.. You should try looking at those.

Better than ReadAllLines:
public static IEnumerable<string> ReadLines(string path)

Related

Get partcular information from text file (C#)

I need a c# program which can split strings and copy some particular informations into another file.
I've a text file like this:
BRIDGE.V2014R6I1.SOFT
icem.V4R12I2.SOFT
mygale.V4R1I1.SOFT,patch01_MAJ_APL.exe
photoshop.V2014R10I1.SOFT
rhino.V5R0I1.SOFT,patch01_Update_Files.exe
TSFX.V2R3I2.SOFT,patch01_corrections.exe,patch02_clock.exe,patch03_correction_tri_date.exe,patch04_gestion_chemins_unc.exe
and I need only some of these information into another file as below :
BRIDGE,SOFT
ICEM,SOFT
MYGALE,SOFT
PHOTOSHOP,SOFT
any helps pls :)

As I don't know, wether your text file is always like that, I can only provide a specific answer. First of all you have to, as ThatsEli pointed out, split the string at the point:
var splitted = inputString.Split(".");
Now it seems as though your second (zero based index) item has the irrelevant information with a comma splitted from the relevant. So all you have to do is to build together the zeroth and the second, while the second only has the first part before the comma:
var res = $"{splitted[0]},{splitted[2].Split(",")[0]}";
However, you seem to want your result in uppercase:
var resUpper = res.ToUpper();
But actually this only works as long as you have a perfect input file - otherwise you have to check, wether it actually has that many items or you'll get an IndexOutOfRange exception.
Actually I'm not sure wether you know how to read/write from/to a file, so I'll provide examples on this as well.
Read
var path = #"Your\Path\To\The\Input\File";
if (!File.Exists(path))
{
Console.WriteLine("File doesn't exist! If you're using a console, otherwise use another way to print error messages");
return;
}
var inputString = File.ReadAllText(path);
Write
var outputPath = #"your\Output\Path";
if(!File.Exists(outputPath))
{
Console.WriteLine("You know what to put here");
return;
}
File.WriteAllText(outputPath, inputString);

I would split the string and create a new file with parts of the array you got from the split.
You can split a string with eg. Split(".");
And then e.g. create a new string stringname = splitstring[0] + "," + splitstring[2]
That would add the first and third part back together.
That would apply to your first line.

Split string to array when some elements are empty

I need to process a large amount of csv data in real time as it is spat out by a TCP port. Here is an example as displayed by Putty:
MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,8000,,,51.26582,-0.33783,,,0,0,0,0
MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,
MSG,1,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,BAW469,,,,,,,,,,,
MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,8000,,,51.26559,-0.33835,,,0,0,0,0
MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,
I need to put each line of data in string (line) into an array (linedata[]) so that I can read and process certain elements, but linedata = line.Split(','); seems to ignore the many empty elements, with the result that linedata[20], for example, may or may not exist, and if it doesn't I get an error if I try to read it. Even if element 20 in the line contains a value it won't necessarily be the 20th element in the array. And that's no good.
I can work out how to parse line character by character into linedata[], inserting an empty string where appropriate, but surely there must be a better way ? Have I missed something obvious ?
Many Thanks. Perhaps I'd better add that I'm quite new to C#, my past experience is all with Delphi 7. I really miss stringlists.
Edited: sorry, this is now resolved with the help of MSDN's documentation. This code works: lineData = line.Split(separators, StringSplitOptions.None); after setting "string[] separators = { "," };". My big mistake was to follow examples found on tutorial sites which didn't give any clues that the .split method had any options.

https://msdn.microsoft.com/en-us/library/system.stringsplitoptions(v=vs.110).aspx
That link has an example section, look at example 1b specifically. There is an extra parameter to Split called StringSplitOptions which does this.
For Example:
string[] linedata = line.Split(charSeparators, StringSplitOptions.None);
foreach (string line in linedata)
{
Console.Write("<{0}>", line);
}
Console.Write("\n\n");
The way to find this sort of information is to start with the Reference Documentation for the function, and hope it has an option or a link to a similar function.
If you want to also start validating types, handling variants in the format etc... you could move up to a CSV library. If you do not need that functionality, this is the easiest way and efficient for small files.

Some of the overloads for String.Split() take a StringSplitOptions argument, and if you use the RemoveEmptyEntries option, it will...remove the empty entries. So you can specify the None option:
linedata = line.Split(new [] { ',' }, StringSplitOptions.None);
Or better yet, use the overload that doesn't take a StringSplitOptions, which treats it as None by default:
linedata = line.Split(',');
The code in your question indicates that you are doing this, but your description of the problem suggests that you are not.
However, you're probably better off using an actual CSV parser, which would handle things like unescaping and so on.

The StringReader class provides methods for reading lines, characters, or blocks of characters from a string. Hope this could be the clue
string str = #"MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,8000,,,51.26582,-0.33783,,,0,0,0,0
MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,
MSG,1,1920,742,4009C5,14205994,2017/01/29,20:14:27.065,2017/01/29,20:14:27.972,BAW469,,,,,,,,,,,
MSG,3,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,8000,,,51.26559,-0.33835,,,0,0,0,0
MSG,4,1920,742,4009C5,14205994,2017/01/29,20:14:27.284,2017/01/29,20:14:27.972,,,212.9,242.0,,,0,,,,,";
using (StringReader reader = new StringReader(str))
do
{
string[] linedata = reader.ReadLine().Split(',');
} while (reader.Read() != -1);

While you should look into the various ways the String class can help you here, sometimes the quick and dirty "MAKE it fit" option is called for. In this case, that'd be to roll through the strings in advance and ensure you have at least one character between the commas.
public static string FixIt(string s)
{
return s.Replace(",,", ", ,");
}
You should be able to:
var lineData = FixIt(line).Split(',');
Edit: In response to the question below, I'm not sure what you meant, but if you mean doing it without creating a helper method, you can do so easily. The code will be harder to read and troubleshoot if you do it in one line though. My personal rule is, if you have to do it a LOT, it should probably be a method. If you only had to do it once, this is particularly clean. I'd actually do it this way and just wrap it in a method that does all the work for you.
var lineData = line.Replace(",,", ", ,").Split(',');
As a method, it'd be:
public static string[] GiveMeAnArray(string s)
{
return s.Replace(",,", ", ,").Split(',');
}

c# copy text till end of line richtextbox

I have a problem where I need to get specific strings out of a text file. This file is a settings file to something so most of the time what I need is contained in one line.
I need to copy these unknow strings into textBoxes, but I know the text right before the strings themselves (something like name = cannon where cannon is the string I need) How can I copy from the "=" to the end of the line? (I copied it out into a richTextBox)

Try this:
string settings = string.Empty;
IEnumerable<string> lines = File.ReadLines(myPath); //reads all lines of text file
foreach (string s in lines) //iterate thru all lines
{
if s.Contains("=")
{
settings = s.substring(s.IndexOf("=")); //get substring from "=" to end of line
break; //break out of the loop
}
}
This is basically an expansion of slobodan's answer. Your question is how to copy from "=" to the end of the line, which this does. Your comment on his answer is contradictory, however, as you mention there that it's on multiple lines instead. Let me know what you need and I'll try to change my answer.

Something like text.substring(text.IndexOf("="))

Advanced searching in Word documents

I have to build an application in C#.NET with which i can search for certain words in a Word document. I've seen that there are API's for this in C#.NET. But i need to take this a step further.
One thing i want to be able to do is search with a regex string.
And another thing i need to do is search for a range of numbers. So i should be able to say something like >500. And it should then find every "word" that has a larger value than 500.
So the last two things are my problem. I couldn't find any direct info about this. Is it possible to search in a Word document using regex with C# code? And is there a good way to specify a range if numbers that it should find?
I want to do this in C#.NET.
Any info on this is appreciated!

I've done it on a .txt file, you must change first line of code and open the word file however it should be :
string fileData = System.IO.File.ReadAllText(#"C:\1\1.txt");
string[] words = fileData.Split(' ');
List<int> integers = new List<int>();
foreach (string word in words)
{
try
{
int integer = int.Parse(word);
if(integer > 500)
integers.Add(integer);
}
catch (Exception)
{
//some code maybe
}
}
foreach (int integer in integers)
{
MessageBox.Show(integer.ToString());
}
and for opening word documents take a look at how to read .docx files.

Reading line by line

I have a program that generates a plain text file. The structure (layout) is always the same. Example:
Text File:
LinkLabel
"Hello, this text will appear in a LinkLabel once it has been
added to the form. This text may not always cover more than one line. But will always be surrounded by quotation marks."
240, 780
So, to explain what is going on in that file:
Control
Text
Location
And when a button on the Form is clicked, and the user opens one of these files from the OpenFileDialog dialog, I need to be able to Read each line. Starting from the top, I want to check to see what control it is, then starting on the second line I need to be able to get all text inside the quotation marks (regardless of whether is is one line of text or more), and on the next line (after the closing quotation mark), I need to extract the location (240, 780)... I have thought of a few ways of going about this but when I go to write it down and put it to practice, it doesn't make much sense and end up figuring out ways that it won't work.
Has anybody ever done this before? Would anybody be able to provide any help, suggestions or advice on how I'd go about doing this?
I have looked up CSV files but that seems too complicated for something that seems so simple.
Thanks
jase

You could use a regular expression to get the lines from the text:
MatchCollection lines = Regex.Matches(File.ReadAllText(fileName), #"(.+?)\r\n""([^""]+)""\r\n(\d+), (\d+)\r\n");
foreach (Match match in lines) {
string control = match.Groups[1].Value;
string text = match.Groups[2].Value;
int x = Int32.Parse(match.Groups[3].Value);
int y = Int32.Parse(match.Groups[4].Value);
Console.WriteLine("{0}, \"{1}\", {2}, {3}", control, text, x, y);
}

I'll try and write down the algorithm, the way I solve these problems (in comments):
// while not at end of file
// read control
// read line of text
// while last char in line is not "
// read line of text
// read location
Try and write code that does what each comment says and you should be able to figure it out.
HTH.

You are trying to implement a parser and the best strategy for that is to divide the problem into smaller pieces. And you need a TextReader class that enables you to read lines.
You should separate your ReadControl method into three methods: ReadControlType, ReadText, ReadLocation. Each method is responsible for reading only the item it should read and leave the TextReader in a position where the next method can pick up. Something like this.
public Control ReadControl(TextReader reader)
{
string controlType = ReadControlType(reader);
string text = ReadText(reader);
Point location = ReadLocation(reader);
... return the control ...
}
Of course, ReadText is the most interesting one, since it spans multiple lines. In fact it's a loop that calls TextReader.ReadLine until the line ends with a quotation mark:
private string ReadText(TextReader reader)
{
string text;
string line = reader.ReadLine();
text = line.Substring(1); // Strip first quotation mark.
while (!text.EndsWith("\"")) {
line = reader.ReadLine();
text += line;
}
return text.Substring(0, text.Length - 1); // Strip last quotation mark.
}

This kind of stuff gets irritating, it's conceptually simple, but you can end up with gnarly code. You've got a comparatively simple case:one record per file, it gets much harder if you have lots of records, and you want to deal nicely with badly formed records (consider writing a parser for a language such as C#.
For large scale problems one might use a grammar driven parser such as this: link text
Much of your complexity comes from the lack of regularity in the file. The first field is terminated by nwline, the second by delimited by quotes, the third terminated by comma ...
My first recomendation would be to adjust the format of the file so that it's really easy to parse. You write the file so you're in control. For example, just don't have new lines in the text, and each item is on its own line. Then you can just read four lines, job done.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.