Convert text data into a multidimensional array in C#

I have the following text, with line breaks, in a text file:
Card No. Seq Account 1 Account 2 Account 3 Account 4 Customer Name Expiry Status
0100000184998 1 2500855884500 - - /NIRMAL PRADHAN 1302 Cold
0100000186936 1 - - - /RITA SHRESTHA 1302 Cold
0100000238562 1 2500211214500 - - /HARRY SHARMA 1301 Cold
0100000270755 0 1820823730100 - - /EXPRESS ACCOUNT 9999 Cold
0100000272629 0 1820833290100 - - - /ROMA MAHARJAN 1208 Cold
0100000272637 0 2510171014500 - - /NITIN KUMAR SHRESTHA 1208 Cold
0100000272645 0 1800505550100 - - - /DR HARI BHATTA 1208 Cold
Here:
Card No. and Seq have a fixed number of digits.
Account 1, Account 2, Account 3 and Account 4 can each hold a fixed-length number, a -, or nothing at all.
Customer Name can contain a first name, middle name, last name, and so on.
My desired result is:
array[0][0] = "0100000184998"
array[0][1] = "1"
array[0][2] = "2500855884500"
array[0][3] = " "
array[0][4] = "-"
array[0][6] = "NIRMAL PRADHAN "
array[1][0] = "0100000186936"
array[1][1] = "1"
array[1][3] = " "
array[1][4] = "-"
Here is what I tried:
var sourceFile = txtProcessingFile.Text;
string contents = System.IO.File.ReadAllText(sourceFile);
var newarr = contents.Split(new char[]{ '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
.Select (x =>
x.Split(new char[]{ ' ' }, StringSplitOptions.RemoveEmptyEntries).ToArray()
).ToArray();
DataTable dt = new DataTable("NewDataTable");
dt.Columns.Add("CardNo");
dt.Columns.Add("SNo");
dt.Columns.Add("Account1");
// and so on...
for (int row = 0; row < newarr.Length; row++)
{
for (int col = 0; col < newarr[col].Length; col++)
{
dt.Rows.Add(newarr[row]);
row++;
}
}
This works fine if no data field is empty and the customer name is a single word.
But what I am trying to achieve is:
The first name, middle name and last name must be stored in the same array element.
The account number element must be left blank if the field is blank in the file.
How can I store the data correctly in the DataTable?

I suggest that you learn to use the TextFieldParser class. Yes, it's in the Microsoft.VisualBasic namespace, but you can use it from C#. That class lets you easily parse text files that have fixed field widths. See the article How to: Read From Fixed-width Text Files in Visual Basic for an example. Again, the sample is in Visual Basic, but it should be very easy to convert to C#.

If you are willing to make the compromise of not making a difference between - and null values in the account values, you may try this:
var sourceFile = txtProcessingFile.Text;
string[] contents = System.IO.File.ReadAllLines(sourceFile);
DataTable dt = new DataTable("NewDataTable");
dt.Columns.Add("CardNo");
dt.Columns.Add("SNo");
dt.Columns.Add("Account1");
dt.Columns.Add("Account2");
dt.Columns.Add("Account3");
dt.Columns.Add("Account4");
dt.Columns.Add("CustomerName");
dt.Columns.Add("Expiry");
dt.Columns.Add("Status");
for (int row = 2; row < contents.Length; row++)
{
var newRow = dt.NewRow();
var regEx = new Regex(@"([\w]*)");
var matches = regEx.Matches(contents[row].ToString())
.Cast<Match>()
.Where(m => !String.IsNullOrEmpty(m.Value))
.ToList();
var numbers = matches.Where(m => Regex.IsMatch(m.Value, @"^\d+$")).ToList();
var names = matches.Where(m => !Regex.IsMatch(m.Value, @"^\d+$")).ToList();
for (int i = 0; i < numbers.Count() - 1; i++)
{
newRow[i] = numbers.Skip(i).First();
}
newRow[newRow.ItemArray.Length - 2] = numbers.Last();
newRow[newRow.ItemArray.Length - 1] = names.Last();
newRow[newRow.ItemArray.Length - 3] = names.Take(names.Count() - 1).Aggregate<Match, string>("", (a, b) => a += " " + b.Value);
dt.Rows.Add(newRow);
}

To get around the names with a single space in them, you could try splitting on a double-space instead of a single space:
x.Split(new string[]{ "  " }, StringSplitOptions.RemoveEmptyEntries)
This still won't fix the issue with columns that have no value in them. It appears that your text file has everything in a specific position. Seq is in position 16, Account 1 is in position 20, etc.
Once your lines are stored in newarr, you may just want to use String.Substring() with .Trim() to get the value in each column.
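The Substring() and Trim() idea above can be sketched roughly as follows. The column offsets and widths in the Columns table are hypothetical (only "Seq at 16" and "Account 1 at 20" come from the answer above); the real values have to be measured in the actual file. FixedWidthParser and ParseLine are illustrative names, not part of any library.

```csharp
using System;

public static class FixedWidthParser
{
    // Hypothetical (start, length) pairs, one per column.
    // Measure the real offsets in your own file before using this.
    private static readonly (int Start, int Length)[] Columns =
    {
        (0, 16),   // Card No.
        (16, 4),   // Seq
        (20, 16),  // Account 1
        (36, 16),  // Account 2
        (52, 16),  // Account 3
        (68, 16),  // Account 4
        (84, 30),  // Customer Name
        (114, 6),  // Expiry
        (120, 10)  // Status
    };

    // Cuts one line into trimmed fields; a column that lies beyond
    // the end of the line becomes an empty string.
    public static string[] ParseLine(string line)
    {
        var fields = new string[Columns.Length];
        for (int i = 0; i < Columns.Length; i++)
        {
            var (start, length) = Columns[i];
            if (start >= line.Length)
            {
                fields[i] = string.Empty;
                continue;
            }
            // Never read past the end of a short line.
            int available = Math.Min(length, line.Length - start);
            fields[i] = line.Substring(start, available).Trim();
        }
        return fields;
    }
}
```

Each returned array can be passed straight to dt.Rows.Add(...). Because every field is cut by position, a blank account column stays blank and a multi-word customer name stays in one element.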

Related

String replacement for Azure Translation notranslate

I have been trying to get Azure Translator to translate text stored in a database column. Here are a couple of examples of how the text is currently stored:
eg1. "Add %%objectives%% from predefined sets of %%objectives%%"
eg2. %%Risk%%
eg3. some text here %%model%%. Please refresh the page.
My goal is to translate everything except the text between the %% markers. The problem is that Azure Translator only skips text that is wrapped in <div class="notranslate">...</div>, which means I have to replace every %% with that syntax. I got this working when there is only one %%...%% placeholder in the string, but anything more went down a rabbit hole. Here is my code:
english = "Add %%objectives%% from predefined sets of %%objectives%%";
if (english.Contains("%%"))
{
Dictionary<int, int> positions = new Dictionary<int, int>(); // this is to hold the locations of where delims are in string
ArrayList l = new ArrayList();
char[] letters = english.ToCharArray();
// get the first location of %
for (int i = 0; i < english.Length; i++)
{
if (letters[i] == '%')
{
l.Add(i);
}
}
string temp = "";
// only works if there is one %%...%% placeholder in the string
if (l.Count == 4)
{
int loc = english.IndexOf('%'); //%%Model%% = 0
int lastloc = english.LastIndexOf('%');
temp = " <div class=\"notranslate\">" + english.Substring(loc + 2, (lastloc - 3) - loc) + "</div>";
var lang = Translate(convert(english, temp), "en", "it");
// need to convert back to %%
Console.WriteLine(lang);
dataNode.SelectSingleNode("value").InnerText = lang;
}
else if (l.Count > 4) // this means that there is more than one delimited placeholder
{
foreach(int i in l) // 4 , 5 , 16 ,17, 43, 44
// % % text % % text % %
{
}
Any help is appreciated!
Do not replace the '%%'. Insert the <div class="notranslate"> before every odd-numbered '%%' (1st, 3rd, ...) and insert the </div> after every even-numbered '%%' (2nd, 4th, ...). This way, the translation retains the %% markup.
If you have the option to translate the string after the variables have been replaced with real, human-readable values, you will get a better translation out of it.
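A minimal sketch of the wrapping approach, using a regular expression instead of counting the markers by hand. NoTranslateMarkup, Protect and Unprotect are hypothetical names, and Unprotect assumes the translation service returns the div tags unchanged, which should be checked against real responses.

```csharp
using System.Text.RegularExpressions;

public static class NoTranslateMarkup
{
    // Matches one %%...%% placeholder; the lazy quantifier stops at the nearest closing %%.
    private static readonly Regex Placeholder = new Regex(@"%%.+?%%");

    private const string Open = "<div class=\"notranslate\">";
    private const string Close = "</div>";

    // Wraps every placeholder, keeping the %% markers inside the div.
    public static string Protect(string text)
    {
        return Placeholder.Replace(text, m => Open + m.Value + Close);
    }

    // Removes the wrapper again after translation.
    // Assumption: the tags come back exactly as they were sent.
    public static string Unprotect(string translated)
    {
        return translated.Replace(Open, "").Replace(Close, "");
    }
}
```

With these helpers the call from the question would look like Unprotect(Translate(Protect(english), "en", "it")), and it handles any number of placeholders in one string.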

Converting a list of "word,word\n" in a txt file to an array [,] - getting error

I have a list of postal codes and the cities the codes are for in a text file. The data looks like this:
2450,København SV
2500,Valby
2600,Glostrup
2605,Brøndby
2610,Rødovre
2625,Vallensbæk
2630,Taastrup
2635,Ishøj
2640,Hedehusene
There are 580 lines of text there.
I started by converting the text to jagged array[][] but that's not really meeting my needs farther along in my code. So a simple array[,] is preferable.
Unfortunately I'm apparently too new in c# to be able to get there myself.
string testing = File.ReadAllText(@"U:\Testing.txt");
int i = 0, j = 0;
string[,] result = new string[580, 2];
foreach (var row in testing.Split('\n'))
{
j = 0;
foreach (var col in row.Trim().Split(','))
{
result[i, j] = col.Trim();
j++; //Line 26 - this is where I get the exception error
}
i++;
}
I can't figure out why I'm getting the following error and I've begun tearing out my hair. Any ideas??
System.IndexOutOfRangeException
HResult=0x80131508
Message=Index was outside the bounds of the array.
Source=Testing
StackTrace:
at Testing.Analysis.Main() in U:\Uddannelse\Testing\Testing\Program.cs:line 26
You are getting the error because somewhere in your file, some rows have a comma in the city name.
If you want to get the whole name, try something like this -
var row = "2450,København, SV";
var values = row.Split(new[] {','}, 2);
//With the extra int param to the Split function, you are saying no matter how many substrings you can form , only give me this specific number of substrings.
//With a value of 2, you are essentially splitting at the first instance of the comma.
This will give you two values, the first being "2450" and the second "København, SV"
This assumes that every row has at least one comma; if not, you'll need to put in a check for that as well.
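The check mentioned above could look roughly like this. It is only a sketch: PostalCodes and ToGrid are made-up names, and silently skipping blank or malformed lines is an assumption about what you want to happen with bad input.

```csharp
using System;
using System.Collections.Generic;

public static class PostalCodes
{
    // Builds a [rows, 2] array from "code,city" lines.
    // Blank lines and lines without a comma are skipped.
    public static string[,] ToGrid(IEnumerable<string> lines)
    {
        var rows = new List<string[]>();
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            // Split only at the first comma so "København, SV" stays whole.
            var parts = line.Trim().Split(new[] { ',' }, 2);
            if (parts.Length == 2) rows.Add(parts);
        }

        // Size the array from the rows actually kept, not from a fixed count.
        var grid = new string[rows.Count, 2];
        for (int i = 0; i < rows.Count; i++)
        {
            grid[i, 0] = rows[i][0].Trim();
            grid[i, 1] = rows[i][1].Trim();
        }
        return grid;
    }
}
```

Calling PostalCodes.ToGrid(File.ReadAllLines(path)) avoids both the hard-coded 580 and the overflow caused by an extra comma or a trailing empty line.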
You can try this that corrects the indexing.
static void Test()
{
var testing = File.ReadAllLines(@"c:\Testing.txt");
string[,] result = new string[testing.Length, 2];
int i = 0, j = 0;
foreach ( var line in testing )
{
j = 0;
foreach ( var col in line.Trim().Split(',') )
result[i, j++] = col.Trim();
i++;
}
for ( int index = result.GetLowerBound(0); index <= result.GetUpperBound(0); index++ )
Console.WriteLine($"Code = {result[index, 0]}, Name = {result[index,1]}");
}
Here is one way you can approach this:
string file_path = "...";
//read all lines from the file
var lines = File.ReadAllLines(file_path);
//You could also use StreamReader to read the file in line by line
string[,] result = new string[lines.Length, 2];
char[] separator = new char[] { ',' };
//loop over every line that was read
for(int current_line = 0; current_line < lines.Length; current_line++)
{
//second argument limits you to two parts, so additional commas will not cause issues
var parts = lines[current_line].Trim().Split(separator, 2);
//make sure the data was in your expected format (i.e. two parts)
if(parts.Length == 2)
{
result[current_line, 0] = parts[0];
result[current_line, 1] = parts[1];
}
}

Missing characters after string concatenate

My problem is that the character at every 39th position is replaced by the text I want to add. What I want is to insert the text at position 39 without losing the character that is already there. Can anyone guide me on this?
string description = variables[1]["value"].ToString();// where I get the text
int nInterval = 39;// for every 39 characters in the text I would have a newline
string res = String.Concat(description.Select((c, z) => z > 0 && (z % nInterval) == 0 ? Environment.NewLine +"Hello"+(((z/ nInterval)*18)+83).ToString()+"world": c.ToString()));
file_lines = file_lines.Replace("<<<terms_conditions>>>",resterms); //file_lines is where I read the text file
Original text
Present this redemption slip to receive: One
After String.Concat
Present this redemption slip to receive\r\n\u001bHello101world
One //: is gone
I also have an issue where I want to start a new line whenever the text contains a *. If anybody is able to help, that would be great.
Edit:
What I want to achieve is something like this
Input
*Item is considered paid if unsealed.*No replacement or compensation will be given for any expired coupons.
So I need to insert a newline at every 39th character and also before every *, so that it becomes
Output
*Item is considered paid if unsealed.
*No replacement or compensation will be
given for any expired coupons.
Try String.Insert(Int32, String) Method
Insert \n where you need new line.
If I understood your question properly, you want a newline after every 39 characters. You can use string.Insert(Int32, String) method for that.
And use String.Replace(String, String) for your * problem.
The code snippet below does that using a simple for loop.
string sampleStr = "Lorem Ipsum* is simply..";
for (int i = 39; i < sampleStr.Length; i = i + 39){
sampleStr = sampleStr.Insert(i, Environment.NewLine);
}
//sampleStr = sampleStr.Replace("*", Environment.NewLine);
int[] indexes = Enumerable.Range(0, sampleStr.Length).Where(x => sampleStr[x] == '*').ToArray();
// insert from the last index backwards so earlier positions are not shifted by previous insertions
for (int i = indexes.Length - 1; i >= 0; i--)
{
int position = indexes[i];
if (position > 0) sampleStr = sampleStr.Insert(position, Environment.NewLine);
}
If you want to do both together
int[] indexes = Enumerable.Range(0, sampleStr.Length).Where(x => sampleStr[x] == '*' || x % 39 == 0).ToArray();
int j = 0;
foreach (var position in indexes)
{
if (position > 0)
{
sampleStr = sampleStr.Insert(position + j, Environment.NewLine);
j = j + Environment.NewLine.Length; // later positions shift by the length of each inserted newline (2 on Windows, 1 on Unix)
}
}
Without debating the method chosen to achieve the desired result, the problem with the code is that at the 39th character it adds some text, but the character itself has been forgotten.
Changing the following line should give the expected output.
string res = String.Concat(description.Select((c, z) => z > 0 && (z % nInterval) == 0 ? Environment.NewLine + "Hello" + (((z / nInterval) * 18) + 83).ToString() + "world" + c.ToString() : c.ToString()));
<== UPDATED ANSWER BASED ON CLARIFICATION IN QUESTION ==>
This will do what you want, I believe. See comments in line.
var description = "*Item is considered paid if unsealed.*No replacement or compensation will be given for any expired coupons.";
var nInterval = 39; // for every 39 characters in the text I would have a newline
var newline = "\r\n"; // for clarity in the Linq statement. Can be set to Environment.NewLine if desired.
var z = 0; // we'll handle the count manually.
var res = string.Concat(
description.Select(
(c) => (++z == nInterval || c == '*') // increment z and check if we've hit the boundary OR if we've hit a *
&& ((z = 0)==0) // resetting the count - this only happens if the first condition was true
? newline + (c == ' ' ? string.Empty : c.ToString()) // if the first character of a newline is a space, we don't need it
: c.ToString()
));
Output:
*Item is considered paid if unsealed.
*No replacement or compensation will be
given for any expired coupons.
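For comparison, the same rule (break before every * and after every 39 characters) can also be written as a plain loop with a StringBuilder, which some may find easier to follow than the side effects inside the lambda. This is only a sketch; LineWrapper and Wrap are illustrative names, and like the answers above it breaks at a fixed width even in the middle of a word.

```csharp
using System;
using System.Text;

public static class LineWrapper
{
    // Starts a new line before each marker character and whenever the
    // current line already holds 'width' characters.
    public static string Wrap(string text, int width, char marker = '*')
    {
        var sb = new StringBuilder();
        int lineLength = 0; // characters written on the current line

        foreach (char c in text)
        {
            if (c == marker && lineLength > 0)
            {
                // A marker always begins its own line.
                sb.Append(Environment.NewLine);
                lineLength = 0;
            }
            else if (lineLength == width)
            {
                sb.Append(Environment.NewLine);
                lineLength = 0;
                // Drop a space that would otherwise start the new line.
                if (c == ' ') continue;
            }

            sb.Append(c);
            lineLength++;
        }
        return sb.ToString();
    }
}
```

Calling LineWrapper.Wrap(description, 39) on the sample text from the question should give the three lines shown in the desired output, without a leading blank line.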

Adding 'space' in C# textbox

Hi guys, so I need to add a 'space' between each character in my displayed text box.
I am giving the user a masked word like this He__o for him to guess and I want to convert this to H e _ _ o
I am using the following code to randomly replace characters with '_'
char[] partialWord = word.ToCharArray();
int numberOfCharsToHide = word.Length / 2; //divide word length by 2 to get chars to hide
Random randomNumberGenerator = new Random(); //generate rand number
HashSet<int> maskedIndices = new HashSet<int>(); //This is to make sure that I select unique indices to hide. Hashset helps in achieving this
for (int i = 0; i < numberOfCharsToHide; i++) //counter until it reaches words to hide
{
int rIndex = randomNumberGenerator.Next(0, word.Length); //init rindex
while (!maskedIndices.Add(rIndex))
{
rIndex = randomNumberGenerator.Next(0, word.Length); //This is to make sure that I select unique indices to hide. Hashset helps in achieving this
}
partialWord[rIndex] = '_'; //replace with _
}
return new string(partialWord);
I have tried partialWord[rIndex] = '_ '; however, this gives the error "Too many characters in literal".
I have tried partialWord[rIndex] = "_ "; however, this gives the error "Cannot convert type string to char".
Any idea how I can proceed to achieve a space between each character?
Thanks
The following code should do as you ask. I think the code is pretty self-explanatory, but feel free to ask if anything is unclear as to the why or how of the code.
// char[] partialWord is used from question code
char[] result = new char[(partialWord.Length * 2) - 1];
for(int i = 0; i < result.Length; i++)
{
result[i] = i % 2 == 0 ? partialWord[i / 2] : ' ';
}
return new string(result);
Since the resulting string is longer than the original string, you can't use only one char array because its length is constant.
Here's a solution with StringBuilder:
var builder = new StringBuilder(word);
for (int i = 0 ; i < word.Length ; i++) {
builder.Insert(i * 2, " ");
}
return builder.ToString().TrimStart(' '); // TrimStart is called here to remove the leading whitespace. If you want to keep it, delete the call.
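As a shorter alternative to both answers, string.Join can place the separator between the characters directly. A minimal sketch, where MaskedWord and SpaceOut are made-up names and partialWord is the masked char array from the question:

```csharp
using System;

public static class MaskedWord
{
    // Joins the characters with a single space between each pair,
    // so there is no leading or trailing space to trim.
    public static string SpaceOut(char[] partialWord)
    {
        // The explicit <char> picks the IEnumerable<T> overload of string.Join.
        return string.Join<char>(" ", partialWord);
    }
}
```

In the question's method, the last line would then become return MaskedWord.SpaceOut(partialWord); and an empty word simply yields an empty string.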

How to read a specific word in text file and copy it into excel?

I have a text file as shown below
Name:xxxx Address:xxxxx Contact No: xxx NIC No: xxxx
in a single string.
I want to read the text file and extract only the name address contact number and NIC no using c# into an excel sheet.
I was able to read the whole string and save it into an excel sheet.
Apparently, you already know how to read a textfile and how to write to Excel. Remains the problem of how to split the line into separate values.
IF all those lines have the same field labels and field order, then you could use a regex to parse the line:
string line = "Name: xx xx Address:yyy yyYY Contact No: 1234 NIC No: xXxX";
var regex = new Regex(@"Name:\s*(.*?)\s*Address:\s*(.*?)\s*Contact No:\s*(.*?)\s*NIC No:\s*(.*?)\s*$");
var match = regex.Match(line);
if (match.Success)
{
var name = match.Groups[1].Value;
var address = match.Groups[2].Value;
var contactNo = match.Groups[3].Value;
var nic = match.Groups[4].Value;
// TODO write these fields to Excel
}
This regex uses the field labels ("Name:", "Address:", etc) to find the values you need. The \s* parts mean that any whitespace around the values is ignored. The (.*?) parts capture the values lazily, so trailing whitespace is left out, and store them in Groups in the match class, counting from 1.
If your Name, Address, Contact No etc. fields are separated by a tab delimiter (\t), then you can split the string on it like this:
line.Split('\t');
Instead of \t you can use whatever delimiter is used in the text file.
If the delimiter is a space, you might have a problem, because the field labels and field values may themselves contain spaces.
It is not clear if you have only one record in each file.
Let's suppose your data in a single file is:
Name:N1 Address:A1 W. X, Y - Z Contact No: C1 NIC No: I1 Name:N2 Address:A2 W. X, Y - Z Contact No: C2 NIC No: I2
So there are 2 records on a single line (but there could be more)
Name:N1 Address:A1 W. X, Y - Z Contact No: C1 NIC No: I1
Name:N2 Address:A2 W. X, Y - Z Contact No: C2 NIC No: I2
I don't think splitting by white spaces is practical because fields like name and address may contain spaces. Ideally colon symbol (:) is used only as delimiter between keys and values and it's not used in any value. Otherwise the solution gets more complicated.
Also, I assume that the order of keys is guaranteed to be as in the example above:
Name
Address
Contact No
NIC No
Use a list of custom objects or a DataTable to hold your structured data.
In this example I will use DataTable:
var separators = new char[] { ':' };
var data = new DataTable();
data.Columns.Add("Name", typeof(string));
data.Columns.Add("Address", typeof(string));
data.Columns.Add("ContractNo", typeof(string));
data.Columns.Add("NICNo", typeof(string));
For each file with records, open the file, read the file content and "process" it:
foreach (string fileName in fileNames)
{
//read file content
string fileContent = ...;
string[] tokens = fileContent.Split(separators);
//we skip first token. It will always be 'Name'.
for(int i = 0; i < (tokens.Length - 1) / 4; i++)
{
var record = data.NewRow();
string token = tokens[i * 4 + 1];
record["Name"] = token.Substring(0, token.Length - 7).Trim(); // Remove 'Address' from end and trim spaces
token = tokens[i * 4 + 2];
record["Address"] = token.Substring(0, token.Length - 10).Trim(); //Remove 'Contact No' from end and trim spaces
token = tokens[i * 4 + 3];
record["ContractNo"] = token.Substring(0, token.Length - 6).Trim(); //Remove 'NIC No' from end and trim spaces
token = tokens[i * 4 + 4];
if (token.EndsWith("Name")) //if there are multiple records
token = token.Substring(0, token.Length - 4);
record["NICNo"] = token.Trim();
data.Rows.Add(record);
}
}
This will also work if each file contains only one record.
Now that you have the structured data in a data table it should be easy to insert them in excel worksheet.
static void Main(string[] args)
{
//Application.WorkbookBeforeSave += new Excel.AppEvents_WorkbookBeforeSaveEventHandler(Application_WorkbookBeforeSave);
string mydocpath = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
System.Data.DataTable dt = new System.Data.DataTable();
dt.Columns.Add("Name");
dt.Columns.Add("Address");
dt.Columns.Add("Contact No");
dt.Columns.Add("NIC");
foreach (string txtName in Directory.GetFiles(@"D:\unityapp\tab02", "*.txt"))
{
StreamReader sr = new StreamReader(txtName);
//string line = "Name: Address: Contact No: NIC No:";
string[] token1 = sr.ReadLine().Split(new string[] { "Name: ", "Address: ", "Contact No: ", "NIC No:" }, StringSplitOptions.None);
dt.Rows.Add(token1[1], token1[2], token1[3], token1[4]);
}
Microsoft.Office.Interop.Excel.Application x = new Microsoft.Office.Interop.Excel.Application();
// Workbook wb = x.Workbooks.Open(@"C:\Book1.xlsx");
Workbook wb = x.Workbooks.Add();
Worksheet sheet = (Microsoft.Office.Interop.Excel.Worksheet)wb.Worksheets.get_Item(1);
// Microsoft.Office.Interop.Excel.Workbook wb = new Microsoft.Office.Interop.Excel.Workbook();
// Microsoft.Office.Interop.Excel.Worksheet sheet = new Microsoft.Office.Interop.Excel.Worksheet();
sheet.Cells[1, 1] = "Name";
sheet.Cells[1, 1].Interior.ColorIndex = 10;
sheet.Cells[1, 2] = "Address";
sheet.Cells[1, 2].Interior.ColorIndex = 20;
sheet.Cells[1, 3] = "Contact No";
sheet.Cells[1, 3].Interior.ColorIndex = 30;
sheet.Cells[1, 4] = "NIC";
sheet.Cells[1, 4].Interior.ColorIndex = 40;
int rowCounter = 2;
int columnCounter = 1;
foreach (DataRow dr in dt.Rows)
{
sheet.Cells[rowCounter, columnCounter] = dr["Name"].ToString();
columnCounter += 1;
sheet.Cells[rowCounter, columnCounter] = dr["Address"].ToString();
columnCounter += 1;
sheet.Cells[rowCounter, columnCounter] = dr["Contact No"].ToString();
columnCounter += 1;
sheet.Cells[rowCounter, columnCounter] = dr["NIC"].ToString();
rowCounter += 1;
columnCounter = 1;
}
wb.SaveAs(@"D:\Unity.xlsx");
wb.Close();
x.Quit();
Process.Start(@"D:\Unity.xlsx");
}
}
}
