Splitting a string seems not to work - c#

I am having trouble reading a file (a TextAsset) line by line and splitting each line into its fields.
Here is the file I am trying to read:
AUTHOR
COMMENT
INFO 1 X ARG 0001 0.581 2.180 1.470
INFO 2 X ARG 0001 1.400 0.974 1.724
INFO 3 X ARG 0001 2.553 0.934 0.751
INFO 4 X ARG 0001 3.650 0.494 1.053
INFO 5 X ARG 0001 1.188 3.073 1.532
INFO 6 X ARG 0001 2.312 1.415 -0.466
INFO 7 X ARG 0001 -0.232 2.249 2.180
END
Here is the code I am using:
//read file
string[] line = file.text.Split("\n"[0]);
for(int i = 0 ; i < line.Length ; i++)
{
if(line[i].Contains("INFO"))
{
//Replace each run of two or more spaces with a single underscore "_" (this works fine)
string l = Regex.Replace(line[i]," {2,}","_");
//In this Debug.Log I get correct results
//For example "INFO_1_X_ARG_0001_0.581_2.180_1.470"
Debug.Log(l);
string[] subline = Regex.Split(l,"_");
//Only for the first "INFO" line do I get correct results (INFO,1,X,ARG,0001,0.581,2.180,1.470)
//For all other "INFO" lines I get incomplete results (the first, fourth and fifth elements are not put into substrings,
//as if they had disappeared!)
foreach(string s in subline){Debug.Log(s);}
}
}
Explanation:
I first split the text into lines (works fine), then I read only the lines that contain INFO.
I loop over all lines that contain INFO and replace each run of spaces with an underscore _ (this works fine).
I split the lines that contain INFO into substrings on the underscore _.
When I print out the substrings, only the first INFO line seems to have all of them;
every following line is not split correctly (the first part, INFO, is omitted, as well as the third string).
It seems very unreliable. Is this the way to go with these things? Any help is appreciated! This should be simple; what am I doing wrong?
EDIT:
Something is wrong with this code (it should be simple, but it does not work).
Here is the updated code. I just made a List<string> list = new List<string>() and copied all the substrings into it. I use Unity3D, so the list contents show in the inspector. I was shocked when I saw all the substrings properly extracted there, while a simple
foreach(string s in list)
Debug.Log(s);
was indeed missing some values. So I tried different things, and this code:
for(int x = 0; x < list.Count ; x++)
{
Debug.Log("List: " + x.ToString() + " " + list[x].ToString());
}
shows the contents of the list properly, but this code (note that I just removed x.ToString()) is missing some elements of the list. It does not want to read them!
for(int x = 0; x < list.Count ; x++)
Debug.Log("List: " + list[x].ToString());
So I am not sure what is going on here.

There are a couple of potential problems:
1> The Contains method you are using is case sensitive, i.e. INFO != info.
You should use
line[i].ToLower().Contains("info")
2> Is the text always separated by spaces? It may also be separated by tabs. You are better off with
Regex.Replace(line[i]," {2,}|\t+","_");
//this replaces each run of one or more tabs, or of two or more spaces
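The two suggestions above can be sketched as a pair of helpers. This is only a sketch under assumptions: the class and method names (InfoLineParser, IsInfoLine, SplitColumns) are made up for illustration, IndexOf with StringComparison.OrdinalIgnoreCase stands in for ToLower().Contains(...), and the pattern treats any run of spaces or tabs (even a single space) as a separator, which is broader than the pattern in the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InfoLineParser
{
    // Case-insensitive check for the "INFO" marker, without
    // allocating a lower-cased copy of the line.
    public static bool IsInfoLine(string line)
    {
        return line.IndexOf("info", StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Collapses each run of spaces and/or tabs into one underscore,
    // then splits on the underscore (the same two steps as the question).
    public static string[] SplitColumns(string line)
    {
        // Trim() also removes a trailing '\r' left over from "\r\n" line endings.
        string joined = Regex.Replace(line.Trim(), "[ \t]+", "_");
        return joined.Split('_');
    }
}
```

With these helpers, the loop body in the question would reduce to a call such as InfoLineParser.SplitColumns(line[i]) for every line where InfoLineParser.IsInfoLine(line[i]) is true.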

The following seems to be working for me:
using (var fs = new FileStream(filePath, FileMode.Open))
using (var reader = new StreamReader(fs))
{
string line;
while ((line = reader.ReadLine()) != null)
{
if (line.StartsWith("INFO"))
{
line = Regex.Replace(line, "[ ]+", "_");
var subline = line.Split('_');
foreach (var str in subline)
{
Console.Write("{0} ",str);
}
Console.WriteLine();
}
}
}

You may want to try something like this:
for (int i = 0; i < line.Length; i++)
{
if (line[i].Contains("INFO"))
{
string l = Regex.Replace(line[i], @"\p{Zs}{2,}|\t+", "_");
string[] sublines = l.Split('_');
// If you want to see the debug output...
foreach (string s in sublines) Debug.Log(s);
}
}
The \p{Zs} will match all Unicode separator/space characters (e.g. space, non-breaking spaces, etc.). The following reference may be of some help to you: Character Classes in Regular Expressions.
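As a rough illustration of what \p{Zs} buys you, the sketch below splits a line directly on runs of Unicode space separators and tabs, skipping the intermediate underscore step. The class and method names (WhitespaceSplitter, SplitFields) are hypothetical, and treating a single separator as a column break is an assumption about the data.

```csharp
using System.Text.RegularExpressions;

public static class WhitespaceSplitter
{
    // Splits on runs of Unicode space separators (\p{Zs}, which includes
    // the ordinary space and the non-breaking space U+00A0) and/or tabs.
    public static string[] SplitFields(string line)
    {
        // Trim() strips leading and trailing white space, including a trailing '\r'.
        return Regex.Split(line.Trim(), @"[\p{Zs}\t]+");
    }
}
```

Splitting directly also avoids surprises if a field ever contains an underscore of its own.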

Try line[i].Split("\t"[0]). You probably have tab characters between the columns.

Related

Insert a value in a specific line by index

private void Parse_Click(object sender, EventArgs e)
{
for (int i = 0; i < keywordRanks.Lines.Length; i++)
{
int p = keywordRanks.Lines.Length;
MessageBox.Show(p.ToString());
string splitString = keywordRanks.Lines[i];
string[] s = splitString.Split(':');
for (int j = 0; j < keywords.Lines.Length; j++)
{
string searchString = keywords.Lines[j];
if (s[0].Equals(searchString))
{
richTextBox1.Lines[j] = searchString + ':' + s[1];
}
}
}
}
I have an issue with inserting a string at a particular line. I have two multiline TextBoxes and one RichTextBox.
My application searches textbox2 line by line for the strings from textbox1 and needs to insert the matched values into a RichTextBox control, at the exact index position where they were found in textbox2.
If the value is found in the 5th line of textbox2, then that line needs to be inserted as the 5th line of the RichTextBox.
Somehow my code is not working. I tried a lot but had no luck. I need something like the line below, but it does not work and an IndexOutOfBound exception is raised.
richTextBox1.Lines[j] = searchString + ':' + s[1];
Your RichTextBox must contain all the needed lines before you can set the value using the line index.
If the control contains no text or line breaks (\n), no lines are defined, and any attempt to set a specific Lines[index] value throws an IndexOutOfRangeException.
Here, I'm using a pre-built array, sized to the number of possible matches (the Lines.Length of the keywords TextBox). All matches found are stored in it at their original positions. The array is then assigned to the RichTextBox.Lines property.
Note: setting individual elements of RichTextBox.Lines directly has no effect, because the property getter returns a copy of the lines; the text will remain empty.
string[] MatchesFound = new string[keywords.Lines.Length];
foreach (string currentSourceLine in keywordRanks.Lines)
{
string[] SourceLineValue = currentSourceLine.Split(':');
var match = keywords.Lines.ToList().FindIndex(s => s.Equals(SourceLineValue[0]));
if (match > -1)
MatchesFound[match] = currentSourceLine;
}
richTextBox1.Lines = MatchesFound;
Source          Matches        Result
(keywordRanks)  (keywords)     (RichTextBox)
--------------------------------------------
aand:1          aand           aand:1
cnd:5           this one
cnds:9          cnds           cnds:9
fan:2           another one
gst:0           cnd            cnd:5
                fan            fan:2
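The same idea can be pulled out of the event handler into a plain function, which makes it easy to check against the table above. This is a sketch only: KeywordMatcher and BuildMatches are made-up names, and Array.IndexOf is used here in place of ToList().FindIndex(...).

```csharp
using System;

public static class KeywordMatcher
{
    // Returns an array with one slot per keyword. Slot i holds the
    // "keyword:rank" line whose key equals keywords[i], or null if
    // no rank line matches that keyword.
    public static string[] BuildMatches(string[] keywords, string[] keywordRanks)
    {
        var result = new string[keywords.Length];
        foreach (string rankLine in keywordRanks)
        {
            // The key is everything before the first ':'.
            string key = rankLine.Split(':')[0];
            int index = Array.IndexOf(keywords, key);
            if (index > -1)
                result[index] = rankLine;
        }
        return result;
    }
}
```

In the form, the result would then be assigned in one step, for example richTextBox1.Lines = KeywordMatcher.BuildMatches(keywords.Lines, keywordRanks.Lines);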

Cycle through text file to capture string

I'm looking for a solution to this issue that I'm having a hard time with: I'm trying to read text from a .txt file with this code. The line Monsters[a, b] = string(col.Split(' ')); is giving me trouble, telling me that string is an invalid expression. The file is a list of characters and their attributes separated by spaces. Thanks in advance!
String input = File.ReadAllText(@"CharacterAttributes.txt");
int a = 0, b = 0;
string[,] Monsters = new string[24,11];
foreach (var row in input.Split('\n'))
{
b = 0;
foreach (var col in row.Trim().Split(' '))
{
Monsters[a, b] = string(col.Split(' '));
b++;
}
b++;
}
From what I can see:
You've already separated the characters by row: var row in input.Split('\n')
You've already separated each character's attributes by space: var col in row.Trim().Split(' ')
So, when we get to Monsters[a,b] = string(col.Split(' ')) (and, by the way, string() is invalid syntax), I see no reason to split any further. What you're actually trying to do is store the value of col in Monsters[a,b], assuming a is the character and b is the attribute of that character.
Monsters[a,b] = col; may well be what you are looking for.
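Put together, the loop might look like the sketch below. Two things here are assumptions rather than part of the answer: the row index a is advanced after each row (the posted loop increments b at that point, so a never changes), and empty entries and out-of-range rows or columns are skipped. MonsterTable and Parse are hypothetical names.

```csharp
using System;

public static class MonsterTable
{
    // Fills a rows x cols table: one row per line, one column per
    // space-separated attribute. Unused cells stay null.
    public static string[,] Parse(string input, int rows, int cols)
    {
        var monsters = new string[rows, cols];
        int a = 0;
        foreach (var row in input.Split('\n'))
        {
            if (a >= rows) break;                 // table is full
            string trimmed = row.Trim();          // also drops a trailing '\r'
            if (trimmed.Length == 0) continue;    // skip blank lines

            int b = 0;
            foreach (var col in trimmed.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                if (b >= cols) break;             // ignore extra attributes
                monsters[a, b] = col;             // store the attribute as-is
                b++;
            }
            a++;                                  // next character (assumed intent)
        }
        return monsters;
    }
}
```

It could be called as MonsterTable.Parse(File.ReadAllText(@"CharacterAttributes.txt"), 24, 11) to get the same 24 by 11 table as in the question.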

How to skip txt file chunks

How do I skip reading the file at the red boxes only to continue reading the file at the blue boxes? What adjustments would I need to make to 'fileReader'?
So far, with the help of SO users, I've been able to successfully skip the first 8 lines (first red box) and read the rest of the file. But now I want to read ONLY the parts indicated in blue.
I'm thinking of making a method for each chunk in blue. Basically, it would start by skipping the first 8 lines of the file for the first blue box and about 23 for the next blue box, but ending the file reader is where I'm having problems. I simply don't know what to use.
private void button1_Click(object sender, EventArgs e)
{
// Reading/Inputing column values
OpenFileDialog ofd = new OpenFileDialog();
if (ofd.ShowDialog() == System.Windows.Forms.DialogResult.OK)
{
string[] lines = File.ReadAllLines(ofd.FileName).Skip(8).ToArray();
textBox1.Lines = lines;
int[] pos = new int[3] {0, 6, 18}; //setlen&pos to read specific colmn vals
int[] len = new int[3] {6, 12, 28}; // only doing 3 columns right now
foreach (string line in textBox1.Lines)
{
for (int j = 0; j < 3; j++) // 3 columns
{
val[j] = line.Substring(pos[j], len[j]).Trim();
list.Add(val[j]); // column values stored in list
}
}
}
}
Try something like this:
using System.Text.RegularExpressions; //add this using
foreach (string line in lines)
{
string[] tokens = Regex.Split(line.Trim(), " +");
int seq = 0;
DateTime dt;
if(tokens.Length > 0 && int.TryParse(tokens[0], out seq))
{
// parse this line - 1st type
}
else if (tokens.Length > 0 && DateTime.TryParse(tokens[0], out dt))
{
// parse this line - 2nd type
}
// else - don't parse the line
}
The Regex split is handy for breaking the line on the runs of spaces between tokens. The pattern " +" matches one or more spaces, and the line is split at each match. Based on your example, you only want to parse lines that begin with a number or a date, which this should accomplish. Note that I trimmed the line of leading and trailing spaces so that you don't split on any of those and get empty string tokens.
As I understand it, you want to read everything:
(1) from the line ending with Numerics (possibly starting one line after it)
(2) until the line starting with 0Total (that is a zero, right?);
(3) from the line ending with CURREN
(4) until the line with 1 as the first symbol in the row.
That shouldn't be hard. Read the file line by line. When (1) or (3) occurs, start collecting lines until (2) or (4), respectively.
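The start/stop idea described above could be sketched as a small state machine. The marker strings are taken literally from the description; the real file is not shown here, so whether they are exact (and whether a data line inside the second chunk can itself start with 1) is an assumption. ChunkReader and ExtractChunks are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkReader
{
    // Keeps only the lines that fall between a start marker and its end marker.
    // The marker lines themselves are not kept.
    public static List<string> ExtractChunks(IEnumerable<string> lines)
    {
        var kept = new List<string>();
        string endMarker = null;   // null means "currently outside a chunk"

        foreach (string line in lines)
        {
            if (endMarker == null)
            {
                // Outside a chunk: look for one of the two start markers.
                string trimmedEnd = line.TrimEnd();
                if (trimmedEnd.EndsWith("Numerics", StringComparison.Ordinal))
                    endMarker = "0Total";
                else if (trimmedEnd.EndsWith("CURREN", StringComparison.Ordinal))
                    endMarker = "1";
            }
            else if (line.StartsWith(endMarker, StringComparison.Ordinal))
            {
                // End marker reached: leave the chunk.
                endMarker = null;
            }
            else
            {
                kept.Add(line);
            }
        }
        return kept;
    }
}
```

The kept lines can then go through the same Substring or Regex.Split column parsing as before, for example ChunkReader.ExtractChunks(File.ReadLines(ofd.FileName)).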

Searching for text in a .txt

What would be the best way to search a text file that looks like this..?
efee|| Nbr| Address| Name |Phone|City|State|Zip abc
||455|gsgd |first last|gsg |fef |jk |0393 gjgj||jfj|ddg
|first last|fht |ree |hn |th ...more lines...
I started by reading in the file and all its contents with a StreamReader.
I was thinking of counting the "|" characters and grabbing the text between the 5th and 6th using Substring, but I'm not sure how to count the "|". If someone has a better idea, I'm open to it.
Tried something like this:
StreamReader file = new StreamReader(@"...");
string line;
int num=0;
while ((line = file.ReadLine()) != null)
{
for (int i = 1; i <= 6; i++)
{
if (line.Contains("|"))
{
num++;
}
}
int start = line.IndexOf("|");
int end = line.IndexOf("|");
string result = line.Substring(start, end - start - 1);
}
The text I want is, I believe, always between the 5th and 6th "|".
You can do it like this:
var res = File
.ReadLines(@"FileName.txt")
.Select(line => line.Split(new[]{'|'}, StringSplitOptions.None)[5])
.ToList();
This produces a List<string> from the file, where each string is the part of the corresponding line of the file taken from between the fifth and the sixth '|' separator.
For a delimited file you should use a parser - there is one in the Microsoft.VisualBasic.FileIO namespace - the TextFieldParser class, though you could also look at third-party libraries like the popular FileHelpers.
A simpler approach would be to use string.Split on the | character and getting the value in the corresponding index of the returned string[], however, if any of the fields are escaped and can validly contain | internally, this will fail.
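For the TextFieldParser route mentioned above, a minimal sketch might look like this. PipeFileReader and ReadField are hypothetical names, the project is assumed to reference the Microsoft.VisualBasic assembly, and lines with fewer fields than requested are simply skipped, which may or may not be what you want.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class PipeFileReader
{
    // Returns the field at fieldIndex (0-based) from every line that has one.
    public static List<string> ReadField(TextReader source, int fieldIndex)
    {
        var values = new List<string>();
        // Disposing the parser also closes the underlying reader.
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters("|");
            parser.TrimWhiteSpace = true;   // strip padding around each field

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (fields != null && fields.Length > fieldIndex)
                    values.Add(fields[fieldIndex]);
            }
        }
        return values;
    }
}
```

The text between the 5th and 6th "|" is the field at index 5, so a call could look like PipeFileReader.ReadField(new StreamReader(@"FileName.txt"), 5).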
You could split each line into an array:
while ((line = file.ReadLine()) != null)
{
var values = line.Split('|');
}
Something like this should work for the first matching line:
string txt = File.ReadAllText("file.txt");
string res = Regex.Match(txt, @"^(?:[^|\n]*\|){5}([^|\n]*)\|", RegexOptions.Multiline).Groups[1].Value;

C# writing out text files matching listbox and contents of another text file

I have a file created from a directory listing. For each item a user selects from a ListBox, the application reads the directory and writes out a file that matches all the contents. Once that is done, it goes through each item in the ListBox and copies out the items that match the ListBox selection. Example:
Selecting 0001 matches:
0001456.txt
0001548.pdf.
The code I am using isn't handling 0s very well and is giving bad results.
var listItems = listBox1.Items.OfType<string>().ToArray();
var writers = new StreamWriter[listItems.Length];
for (int i = 0; i < listItems.Length; i++)
{
writers[i] = File.CreateText(
Path.Combine(destinationfolder, listItems[i] + "ANN.TXT"));
}
var reader = new StreamReader(File.OpenRead(masterdin + "\\" + "MasterANN.txt"));
string line;
while ((line = reader.ReadLine()) != null)
{
for (int i = 0; i < listItems.Length; i++)
{
if (line.StartsWith(listItems[i].Substring(0, listItems[i].Length - 1)))
writers[i].WriteLine(line);
}
}
Advice for correcting this?
Another Sample:
I have 00001 in my listbox: it returns these values:
00008771~63.txt
00002005~3.txt
00009992~1.txt
00001697~1.txt
00000001~1.txt
00009306~2.txt
00000577~1.txt
00001641~1.txt
00001647~1.txt
00001675~1.txt
00001670~1.txt
It should only return:
00001641~1.txt
00001647~1.txt
00001675~1.txt
00001670~1.txt
00001697~1.txt
Or if someone could just suggest a better method for taking each line in my listbox searching for line + "*" and whatever matches writes out a textfile...
This is all based pretty much on the one example you gave, but I believe the problem is that when you perform your matching, you take a substring of your list item value and chop off the last character.
In your sample you are attempting to match files starting with "00001", but when you do the match you take the substring starting at zero with value.Length-1 characters, which in this case is "0000". For example:
string s = "00001";
Console.WriteLine(s.Substring(0,s.Length-1));
results in
0000
So I think if you just changed this line:
if (line.StartsWith(listItems[i].Substring(0, listItems[i].Length - 1)))
writers[i].WriteLine(line);
to this
if (line.StartsWith(listItems[i]))
writers[i].WriteLine(line);
you would be in good shape.
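To see the difference on the sample data from the question, the prefix test can be isolated in a small helper. This is just a sketch: PrefixFilter and Filter are made-up names, and an ordinal (culture-independent) comparison is assumed to be appropriate for these file names.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixFilter
{
    // Returns the lines that start with the given prefix.
    public static List<string> Filter(IEnumerable<string> lines, string prefix)
    {
        var matched = new List<string>();
        foreach (string line in lines)
        {
            if (line.StartsWith(prefix, StringComparison.Ordinal))
                matched.Add(line);
        }
        return matched;
    }
}
```

On the eleven sample lines, the full prefix "00001" keeps only the five expected files, while the shortened prefix "0000" matches all eleven, which is the bad result described in the question.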
Sorry if I misunderstood your question, but let's start with this:
string line;
string selectedValue = "00001";
List<string> matched = new List<string>();
// The using block closes the file when reading is done
using (StreamReader reader = new StreamReader(Path.Combine(masterdin, "MasterANN.txt")))
{
while((line = reader.ReadLine()) != null)
{
if(line.StartsWith(selectedValue))
{
matched.Add(line);
}
}
}
This will match all lines from your MasterANN.txt file that begin with "00001" and add them to a collection (later we can work on writing this to a file, if required).
Does this clarify anything?
