Convert txt with different number of spaces into xls file - c#

I tried searching for a solution here but couldn't find any answers. I have a text file that looks like this:
Nmr_test 101E-6 PASSED PASSED PASSED PASSED
Dc_volts 10V_100 CAL_+10V +9.99999000 +10.0000100 +9.99999740 +9.99999727
Dcv_lin 10V_6U 11.5 +0.0000E+000 +7.0000E+000 +2.0367E+001 +2.7427E+001
Dcv_lin 10V_6U 3 +0.0000E+000 +5.0000E+000 +1.3331E+001 +1.8872E+001
I have to convert this text file to an Excel (.xls) file, but I can't figure out how to put the values into the correct Excel columns, because the number of spaces between columns varies. I tried the code below, which uses a single space as the separator, but it fails because of the varying number of spaces between the columns:
var lines = File.ReadAllLines(string.Concat(Directory.GetCurrentDirectory(), "\\Temp_textfile.txt"));
var rowcounter = 1;
foreach(var line in lines)
{
var columncounter = 1;
var values = line.Split(' ');
foreach(var value in values)
{
excelworksheet.Cells[rowcounter, columncounter] = new Cell(value);
columncounter++;
}
rowcounter++;
}
excelworkbook.Worksheets.Add(excelworksheet);
excelworkbook.Save(string.Concat(Directory.GetCurrentDirectory(), "\\Exported_excelfile.xls"));
Any advice?
EDIT: Got it working using Substring, selecting each column by its fixed width.
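The fixed-width approach mentioned in the edit could look roughly like the sketch below. The column widths (10 characters each) are an assumption made for illustration, since the post does not show the real layout, and FixedWidthSplitter and Slice are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthSplitter
{
    // Cuts a line into trimmed fields using fixed column widths.
    // The widths are hypothetical; measure them from the real file.
    public static List<string> Slice(string line, int[] widths)
    {
        var fields = new List<string>();
        int start = 0;
        foreach (int width in widths)
        {
            if (start >= line.Length)
            {
                // The line ended early: keep the column count stable.
                fields.Add(string.Empty);
            }
            else
            {
                int length = Math.Min(width, line.Length - start);
                fields.Add(line.Substring(start, length).Trim());
            }
            start += width;
        }
        return fields;
    }

    public static void Main()
    {
        int[] widths = { 10, 10, 10 }; // assumed layout
        string line = "Dcv_lin   10V_6U    11.5";
        Console.WriteLine(string.Join("|", Slice(line, widths)));
    }
}
```

If no field ever contains a space, splitting with StringSplitOptions.RemoveEmptyEntries is a simpler alternative, because it ignores how many spaces separate the columns.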

Related

Load text file data into data table for specific length scenario

I have a text file that has many irrelevant values, followed by the values I have to load into a table. A sample of the file looks like this:
Some file description date
C D 8989898989898 some words
D F 8979797979 some more words
8 H 98988989989898 Some more words for the purpose
KD978787878 280000841 1974DIAA EIDER 320
KK967867668 280000551 1999OOOD FIDERN 680
I can't rely on line numbers, because the description part (4 lines here, excluding the empty line) can span multiple lines; it can run to 40-50 lines per text file.
The only way I can think of to pick out the data is to select only those rows that have 5 columns with a certain number of spaces between them.
I tried a foreach loop, but that didn't work out well. Maybe I'm not implementing it correctly.
DataTable dt = new DataTable();
using (StreamWriter sw = File.CreateText(path))
{
string[] rows = content.Split('\n');
foreach (string s in rows)
{
// how to pick up rows when there are only 5 columns in a row separated by a definite number of space?
string[] columns = s.Split(' '); // how to calculate exact spaces here, because space count could be different from one column to the other. Ex: difference between first column and second is 16 and second to third is 8.
foreach (string t in columns)
{
}
}
}
A lot of this comes down to massaging and sanitizing the data (yuck!). I would:
1. Use String.Split on content to get all lines (like you did)
string[] lines = content.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);
2. Filter out empty lines and loop over the result
foreach(string line in lines.Where(x => !String.IsNullOrEmpty(x.Trim())))
3. Use String.Split on each line to split out each field for a particular row, stripping white space
string[] fields = line.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
At this point you can count the number of fields in the row or throw something at each actual field.
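Put together, the three steps might be sketched as follows. Note that a field count alone is not enough for the sample file ("C D 8989898989898 some words" also has five fields), so this sketch additionally assumes that the last field of a data row is an integer. That extra check, and the names RowFilter and ExtractRows, are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowFilter
{
    // Returns the rows that have exactly expectedFields fields
    // and whose last field parses as an integer (assumed rule).
    public static List<string[]> ExtractRows(string content, int expectedFields)
    {
        int ignored;
        return content
            .Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None)
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(new[] { ' ', '\t' },
                                       StringSplitOptions.RemoveEmptyEntries))
            .Where(fields => fields.Length == expectedFields)
            .Where(fields => int.TryParse(fields[fields.Length - 1], out ignored))
            .ToList();
    }

    public static void Main()
    {
        string content =
            "Some file description date\n" +
            "C D 8989898989898 some words\n" +
            "\n" +
            "KD978787878      280000841   1974DIAA EIDER 320\n" +
            "KK967867668      280000551   1999OOOD FIDERN 680\n";

        foreach (string[] row in ExtractRows(content, 5))
        {
            Console.WriteLine(string.Join(",", row));
        }
    }
}
```

Each returned array can then be added to the DataTable as one row.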
This is an ideal place to use a regex: it finds only the lines that fit your needs, and with proper grouping you get the trimmed values of the five columns directly.
The search expression would be something like "^(K[A-Z0-9]+) +([0-9]+) +([A-Z0-9]+) +([A-Z]+) +([0-9]+) *$" or similar. Knowing regex has helped me a lot in programming.
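A minimal sketch of applying that expression with System.Text.RegularExpressions is shown here. The pattern is the one quoted above; the names RowPattern and TryParse are hypothetical, and the sketch assumes each line has already been stripped of any trailing carriage return.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RowPattern
{
    // Five capture groups, one per column, separated by one or more spaces.
    private static readonly Regex DataRow = new Regex(
        @"^(K[A-Z0-9]+) +([0-9]+) +([A-Z0-9]+) +([A-Z]+) +([0-9]+) *$");

    // Returns the five column values, or null when the line is not a data row.
    public static string[] TryParse(string line)
    {
        Match match = DataRow.Match(line);
        if (!match.Success)
        {
            return null;
        }

        var fields = new string[5];
        for (int i = 0; i < fields.Length; i++)
        {
            fields[i] = match.Groups[i + 1].Value; // group 0 is the whole match
        }
        return fields;
    }

    public static void Main()
    {
        string[] fields = TryParse("KD978787878    280000841  1974DIAA EIDER 320");
        Console.WriteLine(fields == null ? "no match" : string.Join(",", fields));
    }
}
```

Description lines such as "C D 8989898989898 some words" do not start with K followed by the expected columns, so they are rejected without any extra counting logic.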

Issue renaming two columns in a CSV file instead of one

I need to be able to rename the column in a spreadsheet from 'idn_prod' to 'idn_prod1', but there are two columns with this name.
I have tried implementing code from similar posts, but I've only been able to update both columns. Below you'll find the code I have that just renames both columns.
//locate and edit column in csv
string file1 = @"C:\Users\username\Documents\AppDevProjects\import.csv";
string[] lines = System.IO.File.ReadAllLines(file1);
System.IO.StreamWriter sw = new System.IO.StreamWriter(file1);
foreach(string s in lines)
{
sw.WriteLine(s.Replace("idn_prod", "idn_prod1"));
}
I expect only the 2nd column to be renamed, but the actual output is that both are renamed.
Here are the first couple rows of the CSV:
I'm assuming that you only need to update the column header, the actual rows need not be updated.
var file1 = @"test.csv";
var lines = System.IO.File.ReadAllLines(file1);
var columnHeaders = lines[0];
var textToReplace = "idn_prod";
var newText = "idn_prod1";
var indexToReplace = columnHeaders
.LastIndexOf("idn_prod");//LastIndex ensures that you pick the second idn_prod
columnHeaders = columnHeaders
.Remove(indexToReplace,textToReplace.Length)
.Insert(indexToReplace, newText);//I'm removing the second idn_prod and replacing it with the updated value.
using (System.IO.StreamWriter sw = new System.IO.StreamWriter(file1))
{
sw.WriteLine(columnHeaders);
foreach (var str in lines.Skip(1))
{
sw.WriteLine(str);
}
sw.Flush();
}
Replace the foreach (string s in lines) loop with a for loop over the line count, and rename only the 2nd column.
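One way to read that suggestion is to split only the header line and rename the n-th column that carries the name. The sketch below assumes a comma delimiter and a header without quoted fields; HeaderRenamer and RenameOccurrence are hypothetical names.

```csharp
using System;

public static class HeaderRenamer
{
    // Renames the occurrence-th column (1-based) named oldName in a header line.
    // Assumes the header contains no quoted fields with embedded delimiters.
    public static string RenameOccurrence(
        string headerLine, char delimiter, string oldName, string newName, int occurrence)
    {
        string[] columns = headerLine.Split(delimiter);
        int seen = 0;
        for (int i = 0; i < columns.Length; i++)
        {
            if (columns[i].Trim() == oldName && ++seen == occurrence)
            {
                columns[i] = newName;
                break; // stop after the requested occurrence
            }
        }
        return string.Join(delimiter.ToString(), columns);
    }

    public static void Main()
    {
        string header = "idn_prod,name,idn_prod,qty"; // made-up header for the demo
        Console.WriteLine(RenameOccurrence(header, ',', "idn_prod", "idn_prod1", 2));
    }
}
```

The data lines are then written back unchanged, so only lines[0] needs this treatment.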
I believe the only way to handle this properly is to crack the header line (first string that has column names) into individual parts, separated by commas or tabs or whatever, and run through the columns one at a time yourself.
Your loop would consider the first line from the file, use the Split function on the delimiter, and look for the column you're interested in:
bool headerSeen = false;
foreach (string s in lines)
{
if (!headerSeen)
{
// special: this is the header
string[] parts = s.Split('\t'); // char overload works on every .NET version
for (int i = 0; i < parts.Length; i++)
{
if (parts[i] == "idn_prod")
{
// only fix the *first* one seen
parts[i] = "idn_prod1";
break;
}
}
sw.WriteLine( string.Join("\t", parts));
headerSeen = true;
}
else
{
sw.WriteLine( s );
}
}
The only reason this is even remotely possible is that it's the header and not the individual lines; headers tend to be more predictable in format, and you worry less about quoting and fields that contain the delimiter, etc.
Trying this on the individual data lines will rarely work reliably: if your delimiter is a comma, what happens if an individual field contains a comma? Then you have to worry about quoting, and this enters all kinds of fun.
For doing any real CSV work in C#, it's really worth looking into a package that specializes in this, and I've been thrilled with CsvHelper from Josh Close. Highly recommended.

parsing text file to data table with irregular rows

I am trying to parse tabular data in a text file into a data table.
The text file contains:
PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
11 root 1 171 52 0K 12K RUN 23:46 80.42% idle
12 root 1 -20 -139 0K 12K RUN AS 0:56 7.96% swi7:
The code I have is:
public class Program
{
static void Main(string[] args)
{
var lines = File.ReadLines("bb.txt").ToArray();
var headerLine = lines[0];
var dt = new DataTable();
var columnsArray = headerLine.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
var dataColumns = columnsArray.Select(item => new DataColumn { ColumnName = item });
dt.Columns.AddRange(dataColumns.ToArray());
for (int i = 1; i < lines.Length; i++)
{
var rowLine = lines[i];
var rowArray = rowLine.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
var x = dt.NewRow();
x.ItemArray = rowArray;
dt.Rows.Add(x);
}
}
}
I get the error "Input array is longer than the number of columns in this table" on the second iteration at
x.ItemArray = rowArray;
Of course, this is because the second row has "RUN AS" as the value of the 8th column. That value contains a space, which is also the split character for the whole row, so the array length no longer matches the number of columns.
What is a possible solution for this kind of situation?
Assuming that "RUN AS" is the only value that causes this problem, you could run var sanitizedLine = rowLine.Replace("RUN AS", "RUNAS") before your split and then restore the space afterwards. If this happens more often, however, you may need to check whether the array generated by the split matches the length of the header and, if it does not, combine the offending elements into a new array of the correct length before adding it.
Ideally, however, you would instead have whatever is generating your input file wrap strings in quotes to make your life easier.
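The "combine the offending elements" idea could be sketched like this. It assumes that only one known column (STATE, index 7 in the sample) can contain spaces, so every surplus token is folded back into that column; ColumnFitter and FitToColumns are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnFitter
{
    // Merges surplus tokens into the column at mergeIndex so that the
    // result has exactly columnCount entries. Assumes only that one
    // column can contain spaces.
    public static string[] FitToColumns(string[] tokens, int columnCount, int mergeIndex)
    {
        int surplus = tokens.Length - columnCount;
        if (surplus <= 0)
        {
            return tokens; // nothing to merge
        }

        var result = new List<string>();
        result.AddRange(tokens.Take(mergeIndex));
        result.Add(string.Join(" ", tokens.Skip(mergeIndex).Take(surplus + 1)));
        result.AddRange(tokens.Skip(mergeIndex + surplus + 1));
        return result.ToArray();
    }

    public static void Main()
    {
        string rowLine = "12 root 1 -20 -139 0K 12K RUN AS 0:56 7.96% swi7:";
        string[] tokens = rowLine.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        string[] fitted = FitToColumns(tokens, 11, 7);
        Console.WriteLine(string.Join("|", fitted));
    }
}
```

The fitted array can be assigned to x.ItemArray without the length mismatch. If several columns may contain spaces, this heuristic is no longer sufficient and fixed-width parsing is the safer choice.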

How to insert value at specific column in text files

I have 3 txt files which are generated on a daily basis by one of our systems, that need values inserted at specific column positions.
I've accomplished this with the code below, however:
The specific value (**LineText) needs to be on all rows that have text and not just one row. I am not sure how to accomplish this.
My code currently inserts the value (**LineText), however it pushes everything over. Is there a way for the value to be inserted without pushing the rest of the data over?
Each day 3 files will be generated with the names REYYYYMMDD.TXT, TRYYYYMMDD.TXT and CTYYYYMMDD.TXT. Is there a way for the code to pick up these names? I've tried using wildcards such as RE*.TXT, TR*.TXT etc but it doesn't work.
Results example below (What my code currently does with the RE20150109.TXT file)
223016254 CSST45124
167520001 EUR SKBSUS12454
158013456 CSST15568
140490002 CSST14779
167520004 SKBSUS88897
515800001 CSST13679
149370003 CSST32897
161930009 RTVS10035
Below is what I would like it to do but am not sure how :
223016254 EUR CSST45124
167520001 EUR SKBSUS12454
158013456 EUR CSST15568
140490002 EUR CSST14779
167520004 EUR SKBSUS88897
515800001 EUR CSST13679
149370003 EUR CSST32897
161930009 EUR RTVS10035
My C# code is below:
using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace AstTXTEdit
{
class Program
{
static void Main(string[] args)
{
string REFilePath = @"C:\AstImport\RE20150109.TXT";
int RElineNo = 1; //How do i set this to be all rows within the text file?
string RELineText = "";
int REPosition = 12;
var REFullContent = File.ReadAllLines(REFilePath);
RELineText = REFullContent[RElineNo];
RELineText = RELineText.Insert(REPosition, "EUR");
REFullContent[RElineNo] = RELineText;
File.WriteAllLines(REFilePath, REFullContent);
string TRFilePath = @"C:\AstImport\TR20150109.TXT";
int TRlineNo = 1; //How do i set this to be all rows within the text file?
string TRLineText = "";
int TRPosition = 40;
var FullContent = File.ReadAllLines(TRFilePath);
TRLineText = FullContent[TRlineNo];
TRLineText = TRLineText.Insert(TRPosition, "Y");
FullContent[TRlineNo] = TRLineText;
File.WriteAllLines(TRFilePath, FullContent);
string CTFilePath = @"C:\AstImport\CT20150109.TXT";
int CTlineNo = 1; //How do i set this to be all rows within the text file?
string CTLineText = "";
int CTPosition = 36;
var CTFullContent = File.ReadAllLines(CTFilePath);
CTLineText = CTFullContent[CTlineNo];
CTLineText = CTLineText.Insert(CTPosition, "I");
CTFullContent[CTlineNo] = CTLineText;
File.WriteAllLines(CTFilePath, CTFullContent);
}
}
}
Any Assistance would be most appreciated.
Kind Regards,
Andrea
The first two questions:
The specific value (**LineText) needs to be on all rows that have text and not just one row. I am not sure how to accomplish this.
and
My code currently inserts the value (**LineText), however it pushes everything over. Is there a way for the value to be inserted without pushing the rest of the data over?
can be solved by iterating over all lines with Select and transforming each line: first insert the new string, and then remove the same number of characters after the insertion:
// Read entire file;
var lines = File.ReadAllLines("data2.txt");
var eur = "EUR"; // String to insert.
// Calc the position to insert at.
var insertAt = "223016254".Length + 1;
var result =
lines
.Select(x =>
// Insert the 'eur' string.
x.Insert(insertAt, eur)
// Remove the spaces after insertion.
.Remove(insertAt + eur.Length, eur.Length))
.ToList();
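An alternative to insert-then-remove is to overwrite the characters at the target position directly, padding short lines first so the position always exists. This is only a sketch: LineOverwriter and OverwriteAt are hypothetical names, and the sample line layout is assumed because the original spacing is not preserved in the post.

```csharp
using System;

public static class LineOverwriter
{
    // Writes value over the characters starting at position, keeping the
    // length of everything after it unchanged. Blank lines are returned as-is.
    // Caution: whatever occupied those positions is replaced.
    public static string OverwriteAt(string line, int position, string value)
    {
        if (string.IsNullOrWhiteSpace(line))
        {
            return line;
        }

        // Pad so that the target span exists even on short lines.
        string padded = line.PadRight(position + value.Length);
        return padded.Remove(position, value.Length).Insert(position, value);
    }

    public static void Main()
    {
        string line = "223016254       CSST45124"; // assumed layout
        Console.WriteLine(OverwriteAt(line, 12, "EUR"));
    }
}
```

Applied with lines.Select(x => LineOverwriter.OverwriteAt(x, 12, "EUR")), this touches every non-blank row, and rows that already contain EUR at that position are left effectively unchanged.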
As far as the third question is concerned:
Each day 3 files will be generated with the names REYYYYMMDD.TXT, TRYYYYMMDD.TXT and CTYYYYMMDD.TXT. Is there a way for the code to pick up these names?
I wouldn't use Directory.GetFiles wildcards because in most cases they are too simple. A LINQ query with a regex file name pattern could do much more:
string fileNamePattern = @"(CT|RE|TR)\d{8}\.TXT$";
string[] files = Directory.GetFiles(path);
files =
files
.Where(fileName =>
Regex.IsMatch(fileName, fileNamePattern , RegexOptions.IgnoreCase))
.ToArray();
Question 3:
List<string> fileNames = Directory.GetFiles(@"c:\AstImport", "RE*.TXT").ToList();
fileNames.AddRange(Directory.GetFiles(@"c:\myfolder", "TR*.TXT"));
foreach (string fileName in fileNames)
{
}
Question 1: It would be easy to insert the missing strings with linq
var v = File.ReadAllLines(path).Select(s =>
{
string[] arr = s.Split(' ');
if (arr[1] == "EUR") return s;
else return String.Join(" ", new string[] { arr[0], "EUR", arr[1] });
});
File.WriteAllLines(path, v);
Question 2: I did not understand the question
Question 3: The Linq Answer
Directory.GetFiles("path", "RE*.txt").ToList().ForEach(s => { /* Same as 1 */ });

How to display data from text file into many columns?

I have a text file which consists of many rows and 18 columns of data separated by tabs. I used this code, and it displays the entire data in a single column. What I need is for the data to be displayed in columns.
public static List<string> ReadDelimitedFile(string docPath)
{
var sepList = new List<string>();
// Read the file and display it line by line.
using (StreamReader file = new StreamReader(docPath))
{
string line;
while ((line = file.ReadLine()) != null)
{
var delimiters = new char[] { '\t' };
var segments = line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
foreach (var segment in segments)
{
//Console.WriteLine(segment);
sepList.Add(segment);
}
}
file.Close();
}
// Suspend the screen.
Console.ReadLine();
return sepList;
}
You're outputting everything in one column like this (pseudo-code, to illustrate structure):
while (reading lines)
for (reading entries)
WriteLine(entry)
That is, for every line in the file and for every entry in that line, you output a new line. Instead, you want to only write a new line for every line in the file, and write the entries with separators (tabs?). Something more like this:
while (reading lines)
for (reading entries)
Write(entry)
WriteLine(newline)
That way all the entries for any given line in the file are on the same line in the output.
How you delimit those entries in the output is up to you, of course. And to write a carriage return could be as simple as Console.WriteLine(string.Empty), though I bet there are lots of other ways to do it.
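In C#, the "write the entries, then one new line per file line" structure might be sketched as follows. The " | " output separator and the names ColumnPrinter and FormatLine are illustrative choices, not something the original code prescribes.

```csharp
using System;

public static class ColumnPrinter
{
    // Turns one tab-separated input line into one output line,
    // keeping all of its entries on the same row.
    public static string FormatLine(string line)
    {
        string[] segments = line.Split(new[] { '\t' },
                                       StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" | ", segments);
    }

    public static void Main()
    {
        string[] lines = { "a\tb\tc", "d\te\tf" }; // stand-in for the file contents
        foreach (string line in lines)
        {
            // One WriteLine per input line, not per segment.
            Console.WriteLine(FormatLine(line));
        }
    }
}
```

Using string.Join avoids a trailing separator, which a Write call inside the inner loop would otherwise leave at the end of each row.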
18 columns would seem to be served best by using a dataGridView.
// Create your DataGridView with the 18 columns using the designer.
int col = 0;
foreach (var segment in segments)
{
//Console.WriteLine(segment);
//sepList.Add(segment);
dataGridView1.Rows[whateverRow].Cells[col].Value = segment;
col++; // advance to the next column so each segment gets its own cell
}
So according to your code, you have a following loop:
while{
<reads the lines one by one>
for each line{
<reading each segment and adding to the list.>
}
}
Your code reads each segment of a line and appends it to a single list. Ideally you should have 18 lists, one for each of the 18 columns. In Java this problem can be solved with a HashMap (pseudo-code):
HashMap<String, ArrayList<String>> hmp = new HashMap<String, ArrayList<String>>();
while(read each line){
List<String> newList = new ArrayList<String>();
foreach(segment as segments){
newList.add(segment);
}
hmp.put(column1,segment);
}
return hmp;
so you will have hmp.put(column2, segment), hmp.put(column3, segment) and so on.
Hope it helps.
You should be using DataTable or similar type for that but if you want to use List you can "emulate" rows and columns like this:
var rows = new List<List<string>>();
foreach(var line in File.ReadAllLines(docPath))
{
var columns = line.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries).ToList();
rows.Add(columns);
}
That will give you row/column like structure
foreach(var row in rows)
{
foreach(var column in row)
{
Console.Write(column + ",");
}
Console.WriteLine();
}
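For the DataTable route mentioned at the start of this answer, a minimal sketch could look like this. The generated column names ("Column1" and so on) and the names TabTableLoader and Load are assumptions. Empty entries are deliberately kept here, so that a blank cell does not shift the values after it into the wrong column.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TabTableLoader
{
    // Loads tab-separated lines into a DataTable with columnCount string columns.
    public static DataTable Load(IEnumerable<string> lines, int columnCount)
    {
        var table = new DataTable();
        for (int i = 1; i <= columnCount; i++)
        {
            table.Columns.Add("Column" + i, typeof(string)); // placeholder names
        }

        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line))
            {
                continue; // skip blank lines
            }

            string[] cells = line.Split('\t');
            DataRow row = table.NewRow();
            for (int i = 0; i < columnCount && i < cells.Length; i++)
            {
                row[i] = cells[i];
            }
            table.Rows.Add(row);
        }
        return table;
    }

    public static void Main()
    {
        DataTable table = Load(new[] { "a\tb\tc", "d\te" }, 3);
        Console.WriteLine(table.Rows.Count + " rows, " + table.Columns.Count + " columns");
    }
}
```

For the file in the question, this would be called as Load(File.ReadLines(docPath), 18), and the resulting table can be bound to a grid control as its data source.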
