C#: load CSV file and sort columns

I have a DataGridView set up with elements and some identifying characteristics as column headers:
Col 1, Col 2, Col 3, Col 4, Col 5, Col 6, Col 7, ...
Sample, Symbol, Symbol Color, Na, K, Mg, Mn, ...
I can currently load CSV or tab-delimited text files, but their formatting has to match the DataGridView. Is there a way to load a CSV of element data with the column headers in a random order, and then place each one in the column you want?
Currently the CSV must be formatted in the same order as the DataGridView:
Na, K, Mg, Mn....
88, 5, 6, 16...
56, 7, 33, 12...
Is it possible, if the data is in a different order, to have it sorted to match the format of the existing DataGridView:
Mg, Mn, Na, K....
6, 16, 88, 5...
33, 12, 56, 7...
Sometimes columns may be missing from the imported file, and that's OK; I have already figured out how to hide columns with no data.

I would suggest using a file library like FileHelpers to perform these actions. It's free and open source, with some great features such as reading CSV or other formatted data files, reading data asynchronously, and defining the column order for an entity.
Edit:
Handling missing values: create a custom converter to handle the missing value, or define the field as nullable. For more info see http://www.filehelpers.net/example/MissingValues/MissingValuesNullable/
Custom order: use the FieldOrder attribute to define the column order.
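For illustration, a minimal sketch of the kind of record class FileHelpers works with, using the element names from the question (the class name, the field choice, and the file path are assumptions; FileHelpers maps delimited fields by position, with FieldOrder controlling that order):
using FileHelpers;

[DelimitedRecord(",")]
[IgnoreFirst(1)] // skip the header row
public class ElementRow
{
    [FieldOrder(1)]
    public string Sample;

    [FieldOrder(2)]
    public double? Na;   // nullable, so an empty value simply becomes null

    [FieldOrder(3)]
    public double? K;

    [FieldOrder(4), FieldOptional]
    public double? Mg;   // FieldOptional allows a trailing column to be missing entirely
}

// Reading the file into strongly typed records:
var engine = new FileHelperEngine<ElementRow>();
ElementRow[] rows = engine.ReadFile(@"C:\data\elements.csv");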

This is a very simple way to do it.
First, create a class like this:
private class MyColumns
{
    public string Na { get; set; }
    public string K { get; set; }
    public string Mg { get; set; }
    public string Mn { get; set; }
}
Then you can parse your CSV like this (assuming a plain comma-delimited file, as in your example):
var allLines = File.ReadAllLines(@"C:\kosalaw\myfile.csv"); // read all lines from the csv file
MyColumns[] allColumns = new MyColumns[allLines.Length - 1]; // create an array of MyColumns, one per data line
var colHeaders = allLines[0].Split(',').Select(h => h.Trim()).ToList(); // identify the column headers
for (int index = 1; index < allLines.Length; index++) // loop through the lines; we skip the first line as it is the column header
{
    var lineColumns = allLines[index].Split(',').Select(c => c.Trim()).ToArray(); // split each line into columns
    allColumns[index - 1] = new MyColumns // now use the column headers to identify the exact column
    {
        K  = lineColumns[colHeaders.IndexOf("K")],
        Mg = lineColumns[colHeaders.IndexOf("Mg")],
        Mn = lineColumns[colHeaders.IndexOf("Mn")],
        Na = lineColumns[colHeaders.IndexOf("Na")]
    };
}

Related

Add items to a specific Column of a ListView in winforms

I am creating a WinForms application in Visual Studio 2017, and I am adding two columns to my ListView:
ListView1.Columns.Add("Column1", -2, HorizontalAlignment.Left);
ListView1.Columns.Add("Column2", -2, HorizontalAlignment.Left);
I am looping over a List of strings, and I would like to split it in half, where the first half goes to Column1 and the second half goes to Column2.
List<String> strings;
I have looked at many solutions online using SubItems instead, but I cannot use SubItems because:
I need all the items to be selectable
Some of the strings vary in length, so I would like the columns to be flexible enough to display the entire string
I need all the strings to be aligned to the left side
A sample of what it should look like:
Column1 Column2
STRING 1 STRING 100002
STRING 10000 STRING 2222
STRING 144 STRING XCEZ
STRING 144 STRING IK?
STRING 144 STRING 5
Does anyone know how to do this? Thank you in advance.
I'm not sure why you have a List<string> rather than a List<MyClass>, where MyClass has two properties, Property1 and Property2.
Anyway, regarding your question, you can use a for loop like this:
var list = new List<string> { "1", "2", "3", "4" };
var count = list.Count;
listView1.BeginUpdate();
for (var i = 0; i < count / 2; i++)
listView1.Items.Add(list[i]).SubItems.Add(list[count / 2 + i]);
listView1.EndUpdate();

EPPlus number format

I have an Excel sheet generated with EPPlus. I am experiencing some pain points and would appreciate direction from someone who has solved a similar challenge.
I need to apply number formatting to a double value and I want to present it in Excel like this:
8 → 8.0
12 → 12.0
14.54 → 14.5
0 → 0.0
Here is my code
ws.Cells[row, col].Style.Numberformat.Format = "##0.0";
The final Excel file always appends E+0 to the end of this format and therefore presents the final values like this instead:
8 → 8.0E+0
12 → 12.0E+0
14.54 → 14.5E+0
0 → 000.0E+0
When I check Format Cells in the generated Excel sheet, I see that my format appears as ##0.0E+2 instead of the ##0.0 that I applied.
What may be wrong?
Here are some number format options for EPPlus:
//integer (not really needed unless you need to round numbers, Excel will use default cell properties)
ws.Cells["A1:A25"].Style.Numberformat.Format = "0";
//integer without displaying the number 0 in the cell
ws.Cells["A1:A25"].Style.Numberformat.Format = "#";
//number with 1 decimal place
ws.Cells["A1:A25"].Style.Numberformat.Format = "0.0";
//number with 2 decimal places
ws.Cells["A1:A25"].Style.Numberformat.Format = "0.00";
//number with 2 decimal places and thousand separator
ws.Cells["A1:A25"].Style.Numberformat.Format = "#,##0.00";
//number with 2 decimal places and thousand separator and money symbol
ws.Cells["A1:A25"].Style.Numberformat.Format = "€#,##0.00";
//percentage (1 = 100%, 0.01 = 1%)
ws.Cells["A1:A25"].Style.Numberformat.Format = "0%";
//accounting number format
ws.Cells["A1:A25"].Style.Numberformat.Format = "_-$* #,##0.00_-;-$* #,##0.00_-;_-$* \"-\"??_-;_-#_-";
Don't change the decimal and thousand separators to your own
localization. Excel will do that for you.
By request, some DateTime formatting options:
//default DateTime pattern
worksheet.Cells["A1:A25"].Style.Numberformat.Format = DateTimeFormatInfo.CurrentInfo.ShortDatePattern;
//custom DateTime pattern
worksheet.Cells["A1:A25"].Style.Numberformat.Format = "dd-MM-yyyy HH:mm";
An addition to the accepted answer: because Value accepts an object, you must pass a number to Value. For example, if your input is a string:
var input = "5";
ws.Cells["A1:A25"].Value = double.Parse(input);
Another addition to the accepted answer: you can use nullable values and the formatting all looks good, BUT the cell ends up being a string in Excel, so you can't SUM, AVG, etc.
So make sure you use the actual Value of the nullable.
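A minimal sketch of the point (assuming ws is an ExcelWorksheet and a nullable double, per the note above):
double? measurement = 14.54;

// Per the note above, assigning the nullable directly can leave the cell holding text:
ws.Cells["A1"].Value = measurement;

// Unwrapping it first keeps the cell numeric, so SUM/AVG and the number format still work:
if (measurement.HasValue)
    ws.Cells["A2"].Value = measurement.Value;
ws.Cells["A2"].Style.Numberformat.Format = "0.0";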
And if you want to apply a number format to a specific column, such as column "B", you can do it this way:
using (var package = new ExcelPackage())
{
    var worksheet = package.Workbook.Worksheets.Add("SHEET1");
    worksheet.Cells["A1"].LoadFromDataTable(dataTable, PrintHeaders: true);
    for (var col = 1; col < dataTable.Columns.Count + 1; col++)
    {
        if (col == 2) // column number 2 is equivalent to column B
        {
            worksheet.Column(col).Style.Numberformat.Format = "#"; // apply the number format you need
        }
        worksheet.Column(col).AutoFit();
    }
    return File(package.GetAsByteArray(), XlsxContentType, "report.xlsx"); // downloads the file
}
I solved it as follows: I just load the model's properties and set the format per property if it is an int or DateTime.
var li = typeof(Model).GetProperties().ToArray();
using (var package = new ExcelPackage(stream))
{
    var workSheet = package.Workbook.Worksheets.Add("Sheet1");
    var i = 0;
    foreach (var c in li)
    {
        i++;
        // unwrap Nullable<T> so int? and DateTime? are detected as well
        var type = Nullable.GetUnderlyingType(c.PropertyType) ?? c.PropertyType;
        if (type == typeof(DateTime))
            workSheet.Column(i).Style.Numberformat.Format = DateTimeFormatInfo.CurrentInfo.ShortDatePattern;
        else if (type == typeof(int))
            workSheet.Column(i).Style.Numberformat.Format = "0";
    }
}

Processing a text file where the fields are not consistent

A vendor is providing a delimited text file but the file can and likely will be custom for each customer. So if the specification provides 100 fields I may only receive 10 fields.
My concern is the overhead of each loop. In all, I am using a while and 2 for loops just for the header, and there will be at least as many for the detail.
My answer is as follows:
using (StreamReader sr = new StreamReader(flName))
{
    // Process the first line to get the field names
    flHeader = sr.ReadLine().Split(charDelimiters);
    // Check the first field to determine whether this is a header or detail file
    if (flHeader[0].ToUpper() == "ORDERID")
    {
        header = true;
    }
    else if (flHeader[0].ToUpper() == "ORDERITEMID")
    {
        detail = true;
    }
}
// Use TextFieldParser to read and parse the file
using (TextFieldParser parser = new TextFieldParser(flName))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(delimiters);
    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields();
        // Send the line that was read to the header or detail processor
        if (header == true)
        {
            if (flHeader[0] != fields[0])
            {
                ProcessHeader(fields);
            }
        }
        if (detail == true)
        {
            if (flHeader[0] != fields[0])
            {
                ProcessDetail(fields);
            }
        }
    }
}
//Header Processor snippet
//Declare the header class
Data.BLL.OrderExportHeader_BLL OrderHeaderBLL = new Data.BLL.OrderExportHeader_BLL();
foreach (string field in fields)
{
    int fldCnt = fields.Count();
    // Loop through each field, then use the switch to determine which property is to be filled in
    for (int flds = 0; flds < fldCnt; flds++)
    {
        string strField = field.Trim();
        switch (flHeader[flds].ToUpper())
        {
            case "ORDERID":
                OrderHeaderBLL.OrderID = strField;
                break;
        }
    }
}
//header file
OrderID ManufacturerID CustomerID SalesRepID PONumber OrderDate CustomerName CustomerNumber RepNumber Discount Terms ShipVia Notes ShipToCompanyName ShipToContactName ShipToContactPhone ShipToFax ShipToContactEmail ShipToAddress1 ShipToAddress2 ShipToCity ShipToState ShipToZip ShipToCountry ShipDate BillingAddress1 BillingAddress2 BillingCity BillingState BillingZip BillingCountry FreightTerm PriceLevel OrderType OrderStatus IsPlaced ContactName ContactPhone ContactEmail ContactFax Exported ExportDate Source ContainerName ContainerCubes Origin MarketName FOB SubTotal OrderTotal TaxRate TaxTotal ShippingTotal IsDeleted IsContainer OrderGUID CancelDate DoNotShipBefore WrittenByName WrittenForName WrittenForRepNumber CatalogCode CatalogName ShipToCode
491975 18 0 2621 1234 7/17/2014 RepZio 2499174 0 Test 561-351-7416 max#repzio.com 465 Ocean Ridge Way Juno Beach FL 33408 7/18/2014 465 Ocean Ridge Way Juno Beach FL 33408 USA 0 ShopZio True Max Fraser 561-351-7416 max#repzio.com False ShopZio 0.00 ShopZio 1500.0000 1500.0000 0.000 0.0000 0.0000 False False 63960a7b-86b7-47a2-ad11-9763a6b52fd0 7/31/2014 7/18/2014
Your sample data is the key, and your sample is currently obscure, but I think it matches the description that follows.
Per your example of 10 fields out of a possible 100.
When parsing each line, you only need to split it into 10 fields. It looks like your data is delimited by whitespace, but you have a problem in that fields can contain embedded whitespace. Perhaps your data is actually tab delimited, in which case you are OK.
For simplicity, I am going to assume your 100 fields are named 'fld0', 'fld1', ..., 'fld99'.
Now, assuming the received file contains this header
fld10, fld50, fld0, fld20, fld80, fld70, fld0, fld90, fld50, fld60
and a line of data looks like
Alpha Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliet
e.g.
split[0] = "Alpha", split[1] = "Bravo", etc.
You parse the header and find that the indexes in your master list of 100 fields are 10,50,0 etc.
So you build a lookupFld array with these index values, i.e. lookupFld[0] = 10, lookupFld[1] = 50, etc.
Now, as you process each line, split it into 10 fields and you have an immediate indexed lookup of the corresponding field in your master field list.
Now MasterList[0] = "fld0", MasterList[1] = "fld1", ..., MasterList[99] = "fld99"
for (int ii = 0; ii < lookupFld.Length; ++ii)
{
    // MasterField[lookupFld[ii]] is represented by split[ii]
    // when ii = 0:
    //   lookupFld[0] is 10
    //   so MasterField[10] /* fld10 */ is represented by split[0] /* Alpha */
}
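Putting those pieces together, a rough sketch of the idea (the tab delimiter, the masterList contents, and the ProcessField handler are assumptions for illustration):
using System;
using System.IO;
using System.Linq;

// Master list of all 100 possible field names: fld0 .. fld99
string[] masterList = Enumerable.Range(0, 100).Select(n => "fld" + n).ToArray();

string[] lines = File.ReadAllLines(flName);

// Map each received column to its index in the master list
string[] header = lines[0].Split('\t');
int[] lookupFld = header.Select(h => Array.IndexOf(masterList, h.Trim())).ToArray();

// Each data line now needs only one split and one indexed lookup per field
for (int lineNo = 1; lineNo < lines.Length; lineNo++)
{
    string[] split = lines[lineNo].Split('\t');
    for (int ii = 0; ii < lookupFld.Length; ii++)
    {
        // split[ii] holds the value for masterList[lookupFld[ii]]
        ProcessField(lookupFld[ii], split[ii]); // hypothetical handler
    }
}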

Parsing Plain Text Table

I'm trying to parse a table in plain text format. The program is written in Visual Studio using C#. I need to parse through the table and insert the data into the database.
Below is a sample table I will be reading in:
ID Name Value1 Value2 Value3 Value4 //header
1 nameA 3.0 0.2 2 6.2
2 nameB
3 nameC 2.9 3.0 7.3
4 nameD 1.5 3.0 1.8 1.1
5 nameE
6 nameF 1.2 2.4 3.3 2.5
7 nameG 3.0 3.2 2.1 4.5
8 nameH 88 12.4 28.9
In the example, I will need to capture data for id 1, 3, 4, 6, 7, and 8.
I thought of two ways to approach this, but neither of them works 100%.
Method 1:
By reading in the header, I can get the start index for each column. I will then use Substring to collect data from each row.
ISSUE: once it is past a certain row (and I have no way to know when this happens), the columns shift, and Substring will no longer collect the correct data.
This method will only collect correct data for 1, 3, and 4.
Method 2:
Using Regex to collect all the matches. I'm hoping this can collect ID, Name, Value1, Value2, Value3, Value4, in this order.
My pattern is (\d*?)\s\s\s+(.*?)\s\s\s+(\d*\.*\d*)\s\s\s+(\d*\.*\d*)\s\s\s+(\d*\.*\d*)\s\s\s+(\d*\.*\d*)
ISSUE: the collected data are shifted left for some rows. For example, for ID 3, Value2 should be blank, but the regex reads Value2 = 3.0, Value3 = 7.3, and Value4 = blank. The same goes for ID 8.
Question:
How can I read in the whole table and parse them correctly?
(1) I do not know from which row onward the values will be shifted, and
(2) I do not know by how many cells they will be shifted, or whether the shifts are consistent.
Additional Information
The table is in a PDF file; I converted the PDF to a text file so I can read in the data. The shifting happens when a table spans multiple pages, but it is not consistent.
EDIT
Below are some actual data:
68 BENZYL ALCOHOL 6.0 0.4 1 7.4
91 EVERNIA PRUNASTRI (OAK MOSS) 34 3 3 10
22 test 2323 23 12
OK, here you go. Use this regex pattern:
NOTE: you have to match this against each single line, not against the whole document! If you want to run it over the whole document, you have to add the multiline modifier ('m'). You can do this by adding (?m) at the beginning of the regex pattern.
EDIT:
You provided some lines of your real data. Here's my updated regex pattern:
^(?<id>\d+)(?:\s{2,25})(?<name>.+?)(?:\s{2,45})(?<val1>\d+(?:\.\d+)?)?(?:\s{2,33})(?<val2>\d+(?:\.\d+)?)?(?:\s{2,14})(?<val3>\d+(?:\.\d+)?)?(?:\s{2,19})(?<val4>\d+(?:\.\d+)?)?$
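For illustration, a minimal sketch of applying that pattern line by line with named groups (the input file name is an assumption):
using System.IO;
using System.Text.RegularExpressions;

string pattern = @"^(?<id>\d+)(?:\s{2,25})(?<name>.+?)(?:\s{2,45})(?<val1>\d+(?:\.\d+)?)?(?:\s{2,33})(?<val2>\d+(?:\.\d+)?)?(?:\s{2,14})(?<val3>\d+(?:\.\d+)?)?(?:\s{2,19})(?<val4>\d+(?:\.\d+)?)?$";

foreach (string line in File.ReadLines("table.txt"))
{
    Match m = Regex.Match(line, pattern);
    if (!m.Success) continue;

    string id   = m.Groups["id"].Value;
    string name = m.Groups["name"].Value;
    string val1 = m.Groups["val1"].Value; // empty string when an optional group did not match
    // val2..val4 are read the same way
}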
How about treating this file like a fixed-length file, where you define each column by an index and a length? Once you have defined your fixed-length columns, you can just get the value for each column with Substring, then Trim to clean it up.
You can wrap all this up in a LINQ statement to project to an anonymous type and filter for the IDs you want.
Something like this:
static void Main(string[] args)
{
    int[] select = new int[] { 1, 3, 4, 6, 7, 8 };
    string[] lines = File.ReadAllLines("TextFile1.txt");
    var q = lines.Skip(1).Select(l => new {
        Id = Int32.Parse(GetValue(l, 0, 6)),
        Name = GetValue(l, 6, 11),
        Value1 = GetValue(l, 17, 11),
        Value2 = GetValue(l, 28, 13),
        Value3 = GetValue(l, 41, 14),
        Value4 = GetValue(l, 55, 13),
    }).Where(o => select.Contains(o.Id));
    var r = q.ToArray();
}

static string GetValue(string line, int index, int length)
{
    string value = null;
    int lineLength = line.Length;
    // Take as much of the line as we can, up to the column length
    if (lineLength > index)
        value = line.Substring(index, Math.Min(length, lineLength - index)).Trim();
    // Return null if we just have whitespace
    return String.IsNullOrWhiteSpace(value) ? null : value;
}

Issue with data table Select statement

The following VB line, where _DSversionInfo is a DataSet, returns no rows:
_DSversionInfo.Tables("VersionInfo").Select("FileID=88")
but inspection shows that the table contains rows with FileIDs of 92, 93, 94, 90, 88, 89, 215, 216. The table columns are all of type string.
Further investigation showed that using the IDs 88, 215 and 216 will only return rows if the number is quoted,
i.e. _DSversionInfo.Tables("VersionInfo").Select("FileID='88'")
All other rows work regardless of whether the number is quoted or not.
Does anyone have an explanation of why this happens for some numbers but not others? I understand that the numbers should be quoted; I just don't understand why some work and others don't.
I discovered this in some VB.NET code but (despite my initial finger pointing) don't think it is VB.NET specific.
According to the MSDN documentation on building expressions, strings should always be quoted. Failing to do so produces some bizarro unpredictable behavior... You should quote your number strings to get predictable and proper behavior like the documentation says.
I've encountered what you're describing in the past, and kinda tried to figure it out. Here, pop open your favorite .NET editor and try the following:
Create a DataTable, and into a string column 'Stuff' of that table, insert rows in the following order: "6", "74", "710". Select with the filter expression "Stuff = 710" and you will get 1 row back. Now change the first row to any number greater than 7; suddenly, you get 0 rows back.
As long as the numbers are ordered in proper descending order using string ordering logic (i.e., 7 comes after 599) the unquoted query appears to work.
My guess is that this is a limitation of how DataSet filter expressions are parsed, and it wasn't meant to work this way...
The Code:
// Unquoted filter string bizarreness.
var table = new DataTable();
table.Columns.Add(new DataColumn("NumbersAsString", typeof(String)));
var row1 = table.NewRow(); row1["NumbersAsString"] = "9"; table.Rows.Add(row1); // Change to '66
var row2 = table.NewRow(); row2["NumbersAsString"] = "74"; table.Rows.Add(row2);
var row4 = table.NewRow(); row4["NumbersAsString"] = "90"; table.Rows.Add(row4);
var row3 = table.NewRow(); row3["NumbersAsString"] = "710"; table.Rows.Add(row3);
var results = table.Select("NumbersAsString = 710"); // Returns 0 rows.
var results2 = table.Select("NumbersAsString = 74"); // Throws exception "Min (1) must be less than or equal to max (-1) in a Range object." at System.Data.Select.GetBinaryFilteredRecords()
Conclusion: Based on the exception text in that last case, there appears to be some weird casting going on inside filter expressions that is not guaranteed to be safe. Explicitly putting single quotes around the value for which you're querying avoids this problem by letting .NET know that this is a literal.
DataTable builds an index on the columns to make Select() queries fast. That index is sorted by value, then it uses a binary search to select the range of records that matches the query expression.
So the records will be sorted like this: 215, 216, 88, 89, 90, 92, 93, 94. A binary search that treats them as integers (as per the unquoted filter expression) cannot locate certain records, because binary search is designed to work only on properly sorted collections.
The index sorts the data as strings while the binary search compares them as numbers. See the explanation below:
string[] strArr = new string[] { "115", "118", "66", "77", "80", "81", "82" }; // sorted as strings, not as numbers
int[] intArr = new int[] { 215, 216, 88, 89, 90, 92, 93, 94 }; // string-sorted order, but treated as integers
int i88 = Array.BinarySearch(intArr, 88); // returns a negative index (not found)
int i89 = Array.BinarySearch(intArr, 89); // returns a positive index (happens to lie on the search path)
This looks like a bug in the framework.
This error usually comes from an invalid DataTable column type for the column you are searching on.
I got this error when I was using colConsultDate instead of Convert(colConsultDate, 'System.DateTime'),
because colConsultDate was a DataTable column of type string that I had to convert into System.DateTime. Therefore your search query should look like:
string query = "Convert(colConsultDate, 'System.DateTime') >= #" + sdateDevFrom.ToString("MM/dd/yy") + "# AND Convert(colConsultDate, 'System.DateTime') <= #" + sdateDevTo.ToString("MM/dd/yy") + "#";
DataRow[] dr = yourDataTable.Select(query);
if (dr.Length > 0)
{
nextDataTabel = dr.CopyToDataTable();
}
@Val Akkapeddi, I just want to add something to your answer.
Doing something like the following helps especially when you have to use comparison operators, because if you put quotes around 74 it will be treated as a string. Please see for yourself by actually writing the code with comparison operators.
(Decimal is just for reference; you can use your desired data type instead.)
var results2 = table.Select("Convert(NumbersAsString , 'System.Decimal') = 74.0")
