I encountered System.IndexOutOfRangeException when loading a txt file - c#

I am trying the load a txt file which is written under a certain format, then I have encountered System.IndexOutOfRangeException. Do you have any idea on what's wrong with my codes? Thank you!
txt.File:
P§Thomas§40899§2§§§
P§Damian§40726§1§§§
P=Person; Thomas=first name; 40899=ID; 2=status
here are my codes:
using (StreamReader file = new StreamReader(fileName))
{
while (file.Peek() >= 0)
{
string line = file.ReadLine();
char[] charSeparators = new char[] { '§' };
string[] parts = line.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);
foreach (PersonId personids in PersonIdDetails)
{
personids.ChildrenVisualisation.Clear();
foreach (PersonId personidchildren in personids.Children)
{
personidchildren.FirstName = parts[1];
personidchildren.ID = parts[2];
personidchildren.Status = parts[3];
personids.ChildrenVisualisation.Add(personidchildren);
}
}
}
}
at parts[1] the exception was thrown.

You should check if parts have enough items:
...
string[] parts = line.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);
foreach (PersonId personids in PersonIdDetails) {
personids.ChildrenVisualisation.Clear();
// Check if parts have enough infirmation: at least 3 items
if (parts.Length > 3) // <- Pay attention for "> 3"
foreach (PersonId personidchildren in personids.Children) {
//TODO: Check, do you really start with 1, not with 0?
personidchildren.FirstName = parts[1];
personidchildren.ID = parts[2];
personidchildren.Status = parts[3];
personids.ChildrenVisualisation.Add(personidchildren);
}
else {
// parts doesn't have enough data
//TODO: clear personidchildren or throw an exception
}
}
...

I'm given the impression that the actual file is not that big, so it might be useful to use File.ReadAllLines instead (the con is the you need to have the entire file in memory), which gives you all the lines.
Also, removing the lines which are either empty or just whitespace might be necessary.
foreach (var line in File.ReadAllLines(fileName).Where(l => !string.IsNullOrWhiteSpace(l))
{
char[] charSeparators = new char[] { '§' };
string[] parts = line.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);
foreach (PersonId personids in PersonIdDetails)
{
personids.ChildrenVisualisation.Clear();
foreach (PersonId personidchildren in personids.Children)
{
personidchildren.FirstName = parts[1];
personidchildren.ID = parts[2];
personidchildren.Status = parts[3];
personids.ChildrenVisualisation.Add(personidchildren);
}
}
}

Change the first line to
using (StreamReader file = new StreamReader(fileName, Encoding.GetEncoding("iso-8859-1")));

Possible way to do it, It's just one solution between more solutions
using (StreamReader file = new StreamReader(fileName))
{
while (file.Peek() >= 0)
{
string line = file.ReadLine();
char[] charSeparators = new char[] { '§' };
string[] parts = line.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);
foreach (PersonId personids in PersonIdDetails)
{
personids.ChildrenVisualisation.Clear();
foreach (PersonId personidchildren in personids.Children)
{
if(parts.Length > 3)//Only if you want to save lines with all parts but you can create an else clause for other lines with 1 or 2 parts depending on the length
{
personidchildren.FirstName = parts[1];
personidchildren.ID = parts[2];
personidchildren.Status = parts[3];
personids.ChildrenVisualisation.Add(personidchildren);
}
}
}
}
}

Related

How to read and separate segments of a txt file?

I have a txt file, that has headers and then 3 columns of values (i.e)
Description=null
area = 100
1,2,3
1,2,4
2,1,5 ...
... 1,2,1//(these are the values that I need in one list)
Then another segment
Description=null
area = 10
1,2,3
1,2,4
2,1,5 ...
... 1,2,1//(these are the values that I need in one list).
In fact I just need one list per "Table" of values, the values always are in 3 columns but, there are n segments, any idea?
Thanks!
List<double> VMM40xyz = new List<double>();
foreach (var item in VMM40blocklines)
{
if (item.Contains(','))
{
VMM40xyz.AddRange(item.Split(',').Select(double.Parse).ToList());
}
}
I tried this, but it just work with the values in just one big list.
It looks like you want your data to end up in a format like this:
public class SetOfData //Feel free to name these parts better.
{
public string Description = "";
public string Area = "";
public List<double> Data = new List<double>();
}
...stored somewhere in...
List<SetOfData> finalData = new List<SetOfData>();
So, here's how I'd read that in:
public static List<SetOfData> ReadCustomFile(string Filename)
{
if (!File.Exists(Filename))
{
throw new FileNotFoundException($"{Filename} does not exist.");
}
List<SetOfData> returnData = new List<SetOfData>();
SetOfData currentDataSet = null;
using (FileStream fs = new FileStream(Filename, FileMode.Open))
{
using (StreamReader reader = new StreamReader(fs))
{
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
//This will start a new object on every 'Description' line.
if (line.Contains("Description="))
{
//Save off the old data set if there is one.
if (currentDataSet != null)
returnData.Add(currentDataSet);
currentDataSet = new SetOfData();
//Now, to make sure there is something after "Description=" and to set the Description if there is.
//Your example data used "null" here, which this will take literally to be a string containing the letters "null". You can check the contents of parts[1] inside the if block to change this.
string[] parts = line.Split('=');
if (parts.Length > 1)
currentDataSet.Description = parts[1].Trim();
}
else if (line.Contains("area = "))
{
//Just in case your file didn't start with a "Description" line for some reason.
if (currentDataSet == null)
currentDataSet = new SetOfData();
//And then we do some string splitting like we did for Description.
string[] parts = line.Split('=');
if (parts.Length > 1)
currentDataSet.Area = parts[1].Trim();
}
else
{
//Just in case your file didn't start with a "Description" line for some reason.
if (currentDataSet == null)
currentDataSet = new SetOfData();
string[] parts = line.Split(',');
foreach (string part in parts)
{
if (double.TryParse(part, out double number))
{
currentDataSet.Data.Add(number);
}
}
}
}
//Make sure to add the last set.
returnData.Add(currentDataSet);
}
}
return returnData;
}

Removing the helping words from sentence

I have made a simple program in which a phrase is written and it displays the videos matching individual word.Let say i entered "I go to school". Here it should eliminate word "to"from the sentence and return just three words.
Here is a code which i have tried yet!! It works fine but when i enter some phrase it removes the helping verb and besides it, it replaces an empty string which makes problem. Any suggestions please
Code
class MyPlayer
{
string complete_name;
string root;
string[] supportedExtensions;
string videoname;
public MyPlayer(string snt)
{
videoname = snt;
}
public List<VideosDetail> test()
{
complete_name = videoname.ToLower() + ".wmv";
root = System.IO.Path.GetDirectoryName(#"C:\Users\Administrator\Desktop\VideosFrame\VideosFrame\Model\");
supportedExtensions = new[] { ".wmv" };
var files = Directory.GetFiles(Path.Combine(root, "Videos"), "*.*").Where(s => supportedExtensions.Contains(Path.GetExtension(s).ToLower()));
List<VideosDetail> videos = new List<VideosDetail>();
VideosDetail id;
bool flagefilefound = false;
foreach (var file in files)
{
id = new VideosDetail()
{
Path = file,
FileName = Path.GetFileName(file),
Extension = Path.GetExtension(file)
};
FileInfo fi = new FileInfo(file);
if (id.FileName == complete_name)
{
id.FileName = fi.Name;
id.Size = fi.Length;
videos.Add(id);
flagefilefound = true;
}
if (flagefilefound)
break;
}
if (!flagefilefound)
{
MessageBox.Show("no such video is available. ");
}
return videos;
}
}
private void play_Click(object sender, RoutedEventArgs e)
{
List<string> chk = new List<string>();
chk.Add("is");
chk.Add("am");
chk.Add("are");
chk.Add("were");
chk.Add("was");
chk.Add("do");
chk.Add("does");
chk.Add("has");
chk.Add("have");
chk.Add("an");
chk.Add("the");
chk.Add("to");
chk.Add("of");
string sen = vdo.Text;
List<string> tmp = new List<string>();
string[] split = sen.Split(' ');
foreach (var item in split)
{
tmp.Add(item);
}
foreach (var item in chk)
{
if( sen.Contains(item) )
{
int index = sen.IndexOf(item);
sen = sen.Remove(index,item.Length);
};
}
foreach (var i in tmp)
{
MyPlayer player = new MyPlayer(i);
VideoList.ItemsSource = player.test();
}
}
What you're actually doing is eliminating so called stop words, and, probably, creating bag of words:
private static HashSet<String> s_StopWords =
new HashSet<String>(StringComparer.OrdinalIgnoreCase) {
"is", "am", "are", "were", "was", "do", "does", "to", "from", // etc.
};
private static Char[] s_Separators = new Char[] {
'\r', '\n', ' ', '\t', '.', ',', '!', '?', '"', //TODO: check this list
};
...
String source = "I go to school";
// ["I", "go", "school"] - "to" being a stop word is removed
String[] words = source
.Split(s_Separators, StringSplitOptions.RemoveEmptyEntries)
.Where(word => !s_StopWords.Contains(word))
.ToArray();
// Combine back: "I go school"
String result = String.Join(" ", words);

How to read tab delimited lines by skipping alternate lines

I am currently able to parse and extract data from large tab delimited file. I am reading, parsing and extracting line by line and adding the split items in my Data table (Row Limit adding 3 rows at a time). I need to skip even lines i.e. Read first maximum tab delimited line and then skip 2nd one and read the third one directly.
My Tab delimited source file format
001Mean 26.975 1.1403 910.45
001Stdev 26.975 1.1403 910.45
002Mean 26.975 1.1403 910.45
002Stdev 26.975 1.1403 910.45
Need to skip or avoid reading Stdev tab delimited lines.
C# Code:
Getting the Maximum length of items in a tab delimited line of the file by splitting a line
using (var reader = new StreamReader(sourceFileFullName))
{
string line = null;
line = reader.ReadToEnd();
if (!string.IsNullOrEmpty(line))
{
var list_with_max_cols = line.Split('\n').OrderByDescending(y => y.Split('\t').Count()).Take(1);
foreach (var value in list_with_max_cols)
{
var values = value.ToString().Split(new[] { '\t', '\n' }).ToArray();
MAX_NO_OF_COLUMNS = values.Length;
}
}
}
Reading the file line by line until maximum length in a tab delimited line is satisfied as first line to parse and extract
using (var reader = new StreamReader(sourceFileFullName))
{
string new_read_line = null;
//Read and display lines from the file until the end of the file is reached.
while ((new_read_line = reader.ReadLine()) != null)
{
var items = new_read_line.Split(new[] { '\t', '\n' }).ToArray();
if (items.Length != MAX_NO_OF_COLUMNS)
continue;
//when reach first line it is column list need to create datatable based on that.
if (firstLineOfFile)
{
columnData = new_read_line;
firstLineOfFile = false;
continue;
}
if (firstLineOfChunk)
{
firstLineOfChunk = false;
chunkDataTable = CreateEmptyDataTable(columnData);
}
AddRow(chunkDataTable, new_read_line);
chunkRowCount++;
if (chunkRowCount == _chunkRowLimit)
{
firstLineOfChunk = true;
chunkRowCount = 0;
yield return chunkDataTable;
chunkDataTable = null;
}
}
}
Creating Data Table:
private DataTable CreateEmptyDataTable(string firstLine)
{
IList<string> columnList = Split(firstLine);
var dataTable = new DataTable("TableName");
for (int columnIndex = 0; columnIndex < columnList.Count; columnIndex++)
{
string c_string = columnList[columnIndex];
if (Regex.Match(c_string, "\\s").Success)
{
string tmp = Regex.Replace(c_string, "\\s", "");
string finaltmp = Regex.Replace(tmp, #" ?\[.*?\]", ""); // To strip strings inside [] and inclusive [] alone
columnList[columnIndex] = finaltmp;
}
}
dataTable.Columns.AddRange(columnList.Select(v => new DataColumn(v)).ToArray());
dataTable.Columns.Add("ID");
return dataTable;
}
How to skip lines by reading alternatively and split and then add to my datatable !!!
AddRow Function : Managed to achieve my requirement by adding following changes !!!
private void AddRow(DataTable dataTable, string line)
{
if (line.Contains("Stdev"))
{
return;
}
else
{
//Rest of Code
}
}
Considering you have tab separated values in each line, how about reading the odd lines and splitting them into arrays. This is just a sample; you can expand upon this.
Test data (file.txt)
luck is when opportunity meets preparation
this line needs to be skipped
microsoft visual studio
another line to be skipped
let us all code
Code
var oddLines = File.ReadLines(#"C:\projects\file.txt").Where((item, index) => index%2 == 0);
foreach (var line in oddLines)
{
var words = line.Split('\t');
}
Debug screen shots
EDIT
To get lines that don't contain 'Stdev'
var filteredLines = System.IO.File.ReadLines(#"C:\projects\file.txt").Where(item => !item.Contains("Stdev"));
Change
using (var reader = new StreamReader(sourceFileFullName))
{
string new_read_line = null;
//Read and display lines from the file until the end of the file is reached.
while ((new_read_line = reader.ReadLine()) != null)
{
var items = new_read_line.Split(new[] { '\t', '\n' }).ToArray();
if (items.Length != MAX_NO_OF_COLUMNS)
continue;
To
using (var reader = new StreamReader(sourceFileFullName))
{
int cnt = 0;
string new_read_line = null;
//Read and display lines from the file until the end of the file is reached.
while ((new_read_line = reader.ReadLine()) != null)
{
cnt++;
if(cnt % 2 == 0)
continue;
var items = new_read_line.Split(new[] { '\t', '\n' }).ToArray();
if (items.Length != MAX_NO_OF_COLUMNS)
continue;

get all lines from a huge textfile after a string

I have a lot of huge text files, I need to retrive all lines after certain string using c#,
fyi, the string will be there within last few lines, but not sure last how many lines.
sample text would be
someline
someline
someline
someline
etc
etc
"uniqueString"
line 1
line 2
line 3
I need to get lines
line 1
line 2
line 3
bool found=false;
List<String> lines = new List<String>();
foreach(var line in File.ReadLines(#"C:\MyFile.txt"))
{
if(found)
{
lines.Add(line);
}
if(!found && line.Contains("UNIQUEstring"))
{
found=true;
}
}
Try this code
public string[] GetLines()
{
List<string> lines = new List<string>();
bool startRead = false;
string uniqueString = "uniqueString";
using (StreamReader st = new StreamReader("File.txt"))
{
while (!st.EndOfStream)
{
if (!startRead && st.ReadLine().Equals(uniqueString))
startRead = true;
if (!startRead)
continue;
lines.Add(st.ReadLine());
}
}
return lines.ToArray();
}

Extract data from a big string

First of all, i'm using the function below to read data from a pdf file.
public string ReadPdfFile(string fileName)
{
StringBuilder text = new StringBuilder();
if (File.Exists(fileName))
{
PdfReader pdfReader = new PdfReader(fileName);
for (int page = 1; page <= pdfReader.NumberOfPages; page++)
{
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
text.Append(currentText);
pdfReader.Close();
}
}
return text.ToString();
}
As you can see , all data is saved in a string. The string looks like this:
label1: data1;
label2: data2;
label3: data3;
.............
labeln: datan;
My question: How can i get the data from string based on labels ?
I've tried this , but i'm getting stuck:
if ( string.Contains("label1"))
{
extracted_data1 = string.Substring(string.IndexOf(':') , string.IndexOf(';') - string.IndexOf(':') - 1);
}
if ( string.Contains("label2"))
{
extracted_data2 = string.Substring(string.IndexOf("label2") + string.IndexOf(':') , string.IndexOf(';') - string.IndexOf(':') - 1);
}
Have a look at the String.Split() function, it tokenises a string based on an array of characters supplied.
e.g.
string[] lines = text.Split(new[] {';'}, StringSplitOptions.RemoveEmptyEntries);
now loop through that array and split each one again
foreach(string line in lines) {
string[] pair = line.Split(new[] {':'});
string key = pair[0].Trim();
string val = pair[1].Trim();
....
}
Obviously check for empty lines, and use .Trim() where needed...
[EDIT]
Or alternatively as a nice Linq statement...
var result = from line in text.Split(new[] {';'}, StringSplitOptions.RemoveEmptyEntries)
let tokens = line.Split(new[] {':'})
select tokens;
Dictionary<string, string> =
result.ToDictionary (key => key[0].Trim(), value => value[1].Trim());
It's pretty hard-coded, but you could use something like this (with a little bit of trimming to your needs):
string input = "label1: data1;" // Example of your input
string data = input.Split(':')[1].Replace(";","").Trim();
You can do this by using Dictionary<string,string>,
Dictionary<string, string> dicLabelData = new Dictionary<string, string>();
List<string> listStrSplit = new List<string>();
listStrSplit = strBig.Split(';').ToList<string>();//strBig is big string which you want to parse
foreach (string strSplit in listStrSplit)
{
if (strSplit.Split(':').ToList<string>().Count > 1)
{
List<string> listLable = new List<string>();
listLable = strSplit.Split(':').ToList<string>();
dicLabelData.Add(listLable[0],listLable[1]);//Key=Label,Value=Data
}
}
dicLabelData contains data of all label....
i think you can use regex to solve this problem. Just split the string on the break line and use a regex to get the right number.
You can use a regex to do it:
Regex rx = new Regex("label([0-9]+): ([^;]*);");
var matches = rx.Matches("label1: a string; label2: another string; label100: a third string;");
foreach (Match match in matches) {
var id = match.Groups[1].ToString();
var data = match.Groups[2].ToString();
var idAsNumber = int.Parse(id);
// Here you use an array or a dictionary to save id/data
}

Categories

Resources