LinqToExcel returns null - C#
I have an Excel sheet in .xls format. I am using LinqToExcel to read it and then import it into my DB.
The sheet consists of about 3K rows and only 6 columns. I am using .AddMapping to map my class properties to the column names.
The problem I have: the cells of the "web-code" column SOMETIMES come back as null, although there is data in the cells.
Here is a sample of the data that comes back as null (debugger watch screenshot omitted), and here is a sample where the data comes back correctly (screenshot omitted).
I have tried applying the ExcelColumn attribute for the mapping, but no luck!
Code:
var factory = new ExcelQueryFactory(_excelFilePath);
factory.AddMapping<ExcelPriceEntity>(x => x.WebCode, "WEB-CODE");
factory.AddMapping<ExcelPriceEntity>(x => x.Type, "TYPE");
factory.AddMapping<ExcelPriceEntity>(x => x.Style, "STYLE");
factory.AddMapping<ExcelPriceEntity>(x => x.Qty, "QTY");
factory.AddMapping<ExcelPriceEntity>(x => x.UnitPrice, "Unit Price");
factory.AddMapping<ExcelPriceEntity>(x => x.Bucket, "WEBCODE W/BUCKET");
factory.StrictMapping = StrictMappingType.ClassStrict;
factory.TrimSpaces = TrimSpacesType.Both;
factory.ReadOnly = true;
var prices = factory.Worksheet<ExcelPriceEntity>(_allPricesSheetName).ToList();
var priccerNP = prices.Where(p => p.Type.Contains("900 ARROW TAPE")).ToList();
My PriceEntity Class:
public class ExcelPriceEntity
{
//[ExcelColumn("TYPE")]
public string Type { get; set; }
public string WebCode { get; set; }
//[ExcelColumn("STYLE")]
public string Style { get; set; }
//[ExcelColumn("QTY")]
public string Qty { get; set; }
//[ExcelColumn("Unit Price")]
public string UnitPrice { get; set; }
//[ExcelColumn("WEBCODE W/BUCKET")]
public string Bucket { get; set; }
}
Alternate Solution:
I ended up saving the Excel sheet as a CSV file, then importing it into a SQL table. Then I used LINQ to SQL to read the data.
Root Cause:
After researching, I found out the problem was that the first cell of this column (web-code) was an integer, and Excel tries to figure out the datatype of a column by looking at the first rows!
The next rows of the web-code column contained text data, so Excel couldn't parse them as integers and assigned null to them!
What I could have done is assign a text value to the first cell, so Excel would guess the data type as string. But I didn't test that. For anyone reading this answer: try putting a text value in your first row if you come across the same problem.
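If you can't edit the sheet, the usual driver-level workaround for this is IMEX=1, which tells the Jet/ACE driver to read mixed-type columns as text rather than guessing a type from the first rows (the guess window is the TypeGuessRows registry setting, 8 rows by default). As far as I know LinqToExcel doesn't expose its connection string, so this is a minimal raw OleDb sketch; the file path and sheet name are placeholders, and the column name is taken from the question:

using System;
using System.Data.OleDb;

class ReadMixedColumn
{
    static void Main()
    {
        // IMEX=1: treat columns with mixed types as text instead of guessing
        // a type from the first rows and nulling values that don't fit it.
        string conStr = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\prices.xls;" +
                        "Extended Properties='Excel 8.0;HDR=YES;IMEX=1;'";

        using (var conn = new OleDbConnection(conStr))
        using (var cmd = new OleDbCommand("SELECT [WEB-CODE] FROM [Sheet1$]", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine(reader[0]); // numeric and text cells both arrive as strings
            }
        }
    }
}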
Here, Contains is not like String.Contains. It compares a list of cell values to the exact value you give inside the Contains method. Just try it with the full text "900 AMMONIA STOCK OR CUST...".
Another alternative to @alsafoo's solution is to convert the column format from "General" to "Text".
These are the steps:
1. Right-click any cell in the column.
2. Select Format Cells.
3. On the Number tab, select Text.
4. Press OK.
After that, the library will read all values as strings.
Related
How to detect if a row has extra columns (more than the header)
While reading a CSV file, how can I configure CsvHelper to enforce that each row has no extra columns that are not found in the header? I cannot find any obvious property under CsvConfiguration nor under CsvHelper.Configuration.Attributes.

Context: In our CSV file format, the last column is a string description, which our users (using plain-text editors) sometimes forget to quote when the description contains commas. Such "raw" commas cause that row to have extra columns, and the description read into the software is cut off after the first raw comma. I want to detect this and throw an exception that suggests to the user that they may have forgotten to quote the description cell.

It looks like CsvConfiguration.DetectColumnCountChanges might be related, but presently the 29.0.0 library lacks any IntelliSense description of CsvConfiguration properties, so I have no idea how to use it.

Similar information for other CSV libraries: with LINQtoCSV this was done by setting IgnoreUnknownColumns = false in CsvFileDescription. See also: Can the Lumenworks CSV parser error when there are too many columns in a row?
You were on the right track with CsvConfiguration.DetectColumnCountChanges.

void Main()
{
    var config = new CsvConfiguration(CultureInfo.InvariantCulture)
    {
        DetectColumnCountChanges = true
    };

    using (var reader = new StringReader("Id,Name\n1,MyName\n2,YourName,ExtraColumn"))
    using (var csv = new CsvReader(reader, config))
    {
        try
        {
            var records = csv.GetRecords<Foo>().ToList();
        }
        catch (BadDataException ex)
        {
            if (ex.Message.StartsWith("An inconsistent number of columns has been detected."))
            {
                Console.WriteLine("There is an issue with an inconsistent number of columns on row {0}", ex.Context.Parser.RawRow);
                Console.WriteLine("Row data: \"{0}\"", ex.Context.Parser.RawRecord);
                Console.WriteLine("Please check for commas in a field that were not properly quoted.");
            }
        }
    }
}

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
}
C# WinForms Excel reading through a query
This is my first time asking a question, so please be gentle with me :) My problem is, I want to format an Excel file generated by our time and attendance terminal (screenshot of the generated file omitted). What I want to do is read that Excel file using C# WinForms and format it the way I need it.

My problems are:

How can I select the sheet? I know that there is only one sheet, but I don't know how to point to a sheet using OleDbConnection. The usual OleDbConnection sample is "[Sheet1$]" to read a sheet, but I'm not sure what sheet name will be generated by the terminal. Can we use an index? For example: "from [Sheet1$]" would become "from [0$]"?

Same as the first problem, but for the Excel columns. Can I address them by index, for example "[0], 1"?

Last problem, which will probably explain it all: what I really want to do is use OleDbConnection with a command that looks like this: "SELECT DISTINCT [0$], Convert([1$], Date), MIN(Convert([1$], Time)), MAX(Convert([1$], Time)) FROM [0$] GROUP BY [0$], Convert([1$], Date)". Note: [0$] and [1$] are the indexes of either the columns or the sheets.

What I need is to generate a file that shows the employees' attendance, formatted with the date, their first time in for the day, and their last time out (see the image below for the idea of the output; screenshot omitted). I've tried to search but I'm not able to find a solution that fits what I need. Hope that anyone can help me. Thanks!
For your questions:

Q1: See the code note; I have not found a way to fix this, if anybody has one please share it with me.

Q2: Same as the first: I have not found a way to use an index for the DataTable name (or sheet name) or the columns in SQL.

Q3: Use LINQ. More LINQ samples: https://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b

These steps are how to get the expected data:

1. Read all data from the Excel file (you need to know the sheet name);
2. Filter invalid data, because many rows whose column values are empty, null, or whitespace strings get added;
3. Query the data using LINQ.

Hope that my answer can help you. Reference code (the middle of the original snippet was lost to formatting; the parts marked "reconstructed" are a best guess at what it did):

public void Test()
{
    string filePath = $"{Environment.CurrentDirectory}//test20170206.xls";
    string conStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath +
                    ";Extended Properties='Excel 8.0;HDR=NO;IMEX=1;'";

    // connect to the Excel file
    OleDbConnection conn = new OleDbConnection(conStr);
    conn.Open();

    // get all sheets from the Excel file
    DataTable tb = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
    var models = new List<Model20170206>();
    foreach (DataRow row in tb.Rows)
    {
        Console.WriteLine(row["TABLE_NAME"]);

        // read the whole sheet (reconstructed)
        var adapter = new OleDbDataAdapter($"SELECT * FROM [{row["TABLE_NAME"]}]", conn);
        var data = new DataTable();
        adapter.Fill(data);

        // filter invalid rows and map columns by index (reconstructed)
        foreach (DataRow r in data.Rows)
        {
            if (string.IsNullOrWhiteSpace(r[0].ToString())) continue;
            models.Add(new Model20170206
            {
                Id = int.Parse(r[0].ToString()),
                Datetime = DateTime.Parse(r[1].ToString()),
                Value1 = r[2].ToString(),
                Value2 = r[3].ToString(),
                Value3 = r[4].ToString(),
                Value4 = r[5].ToString()
            });
        }
    }

    // group by employee and day: first time in, last time out
    var query = from p in models
                group p by new { p.Id, p.Datetime.Date } into g
                select new
                {
                    Id = g.Key.Id,
                    Date = g.Key.Date.ToShortDateString(),
                    TimeMin = g.Min(p => p.Datetime).ToLongTimeString(),
                    TimeMax = g.Max(p => p.Datetime).ToLongTimeString()
                };
}

public class Model20170206
{
    public int Id { get; set; }
    public DateTime Datetime { get; set; }
    public string Value1 { get; set; }
    public string Value2 { get; set; }
    public string Value3 { get; set; }
    public string Value4 { get; set; }
}

Practice steps: I used some test data like this:

1,2017-01-28 03:47:54,1,1,1,0
2,2017-01-29 04:58:18,1,0,2,0
3,2017-01-28 08:44:43,1,1,3,0
4,2017-01-28 05:47:56,1,0,4,0
0,2017-02-05 12:12:53,1,1,5,0
1,2017-01-31 12:02:24,1,0,6,0
2,2017-02-05 12:30:34,1,1,7,0
3,2017-02-04 02:30:08,1,0,8,0
4,2017-02-01 11:39:53,1,1,9,0
0,2017-02-05 07:45:58,1,0,10,0
... more ...

Import them into an Excel file (screenshot omitted). My code's result data looks like this (screenshot omitted).
LinqToExcel Duplicate Column Names
I have a machine-generated Excel file that has a few columns with the same name, e.g.:

    A        B      C        D
    Group 1         Group 2
    Period | Name   Period | Name

And I have a DTO like this:

[ExcelColumn("Period")]
public string FirstPeriod { get; set; }

[ExcelColumn("Name")]
public string FirstName { get; set; }

[ExcelColumn("Period")]
public string SecondPeriod { get; set; }

[ExcelColumn("Name")]
public string SecondName { get; set; }

I use the following command to read the lines:

var excel = new ExcelQueryFactory(filePath);
excel.WorksheetRange<T>(beginCell, endColl + linesCount.ToString(), sheetIndex);

It reads the file just fine, but when I check the contents of my DTO I see that all the 'Second' properties have the same values as the 'First' ones. This post was the closest thing I found in my searches, and I think the problem could be solved with something like this:

excel.AddMapping<MyDto>(x => x.FirstPeriod, "A");
excel.AddMapping<MyDto>(x => x.FirstName, "B");
excel.AddMapping<MyDto>(x => x.SecondPeriod, "C");
excel.AddMapping<MyDto>(x => x.SecondName, "D");

But I don't know how to get the Excel column letters... Note: there is a bit more code behind this, but I don't think it's relevant to the problem.
The problem you're having is not possible to solve today with LinqToExcel, because it wraps the OleDb functions and maps properties based on column names, so you lose OleDb options like "FN" for specifying columns (like "F1" for "A"). There's an issue about this on the LinqToExcel GitHub repo: https://github.com/paulyoder/LinqToExcel/issues/85

I recommend you change the column names so there are no duplicates (e.g. Period1, Name1, Period2, Name2). If that's not possible because the file is machine-generated, try changing the header names at runtime. Another option is to make more than one query on the Excel file, with ranges that split your groups, and then merge the results later:

var excel = new ExcelQueryFactory(filePath);
var group1 = excel.WorksheetRange<T>("A1", "B" + rowCount);
var group2 = excel.WorksheetRange<T>("C1", "D" + rowCount);

Edit: I'll work on a feature to try to solve this problem in an elegant manner, so maybe in the future you'll have a more flexible option for mapping columns to properties (if they accept my pull request).
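One more workaround worth sketching: LinqToExcel can read rows without any header mapping via WorksheetNoHeader, letting you map cells by position yourself. This is only a sketch under assumptions, not tested against the file above: it assumes the WorksheetNoHeader(int) overload, that MyDto is the question's DTO, and that the four columns A-D hold the Period/Name pairs in order.

var excel = new ExcelQueryFactory(filePath);

// Read raw rows (no header mapping), drop the header row ourselves,
// and assign cells to DTO properties by index.
var dtos = excel.WorksheetNoHeader(sheetIndex)
    .ToList()   // materialize first; a positional Skip isn't translated to OleDb
    .Skip(1)    // skip the header row
    .Select(row => new MyDto
    {
        FirstPeriod  = row[0].ToString(),
        FirstName    = row[1].ToString(),
        SecondPeriod = row[2].ToString(),
        SecondName   = row[3].ToString(),
    })
    .ToList();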
Exporting a comma-separated values text file to use in Excel
In my C# WinForms program I want to make an export function that creates a comma-separated text file, or CSV. I am not sure about the best way to structure it. My exported file would look like this:

Family Name, First Name, Sex, Age
Dekker, Sean, Male, 23
Doe, John, Male, 40

So I want the first line to be the names of the columns, and the rest should be treated as values. Is this OK for later usage, or should I not include the column names? Would be nice to hear your experiences about this!
Sean, sorry, I don't have enough privilege points to comment directly on your post. I think you may be confusing CSV and Excel files here. A CSV is simply a text file where each value is separated by a comma; there is no special formatting etc. Excel will display CSV files, since it knows how to open them, but you can just as easily open them in Notepad. Excel .xlsx files are different and can contain all sorts of different formats, charts, etc.

To format these files it's important to understand that .xlsx files are essentially zips. So the first place to start is to create an Excel file with some data, save it, and then rename the extension to .zip.

Open the zip file and you will see a number of different folders and files; of these, the most important for your purposes is the xl directory. In this folder you will see a sharedStrings.xml file and a worksheets folder. Start by going into the worksheets folder and opening sheet1.xml. If there is text in a cell, i.e. data that Excel should read as text, then you will have something like <c r="A1" t="s"><v>0</v></c>. This indicates that cell A1 is of type string (t="s") and that the value is to be found as the first entry (index 0) in the sharedStrings.xml file. If there is a number in the cell, then you may have something like <c r="A1"><v>234</v></c>. In this case Excel knows to use the value 234 in this cell directly.

So in your case you will need to do the following:

1. Create the Excel document in C# - there are a number of libraries available for this.
2. Open the Excel file as a zip.
3. Modify, in your case, the styles and worksheet XML files.
4. Save the document.
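To illustrate step 2, here is a minimal sketch that opens an .xlsx as a plain zip and lists what's inside, using System.IO.Compression (on .NET Framework this needs a reference to System.IO.Compression.FileSystem; the file path is a placeholder):

using System;
using System.IO.Compression;

class InspectXlsx
{
    static void Main()
    {
        // An .xlsx is just a zip archive, so we can enumerate its entries directly.
        using (var archive = ZipFile.OpenRead(@"C:\temp\book1.xlsx"))
        {
            foreach (var entry in archive.Entries)
                Console.WriteLine(entry.FullName); // e.g. xl/worksheets/sheet1.xml, xl/sharedStrings.xml
        }
    }
}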
That is absolutely fine to do (to state the obvious...). Excel has a little checkbox that allows the user importing the file to use the first line as column headers if they select it. I would also suggest that you leave out the spaces at the start of each piece of data; they aren't necessary.
In general it's best practice to include the column headers. The only reason not to would be an external program, over which you have no control, accessing your data without realising that the first row contains the column headers, and which can't be changed. To create the export function, something like this should work:

private static List<Person> people = new List<Person>();

static void Main(string[] args)
{
    // add some people
    people.Add(new Person() { firstName = "John", familyName = "Smith", sex = Sex.Male, age = 12 });
    people.Add(new Person() { firstName = "Mary", familyName = "Doe", sex = Sex.Female, age = 25 });

    // write the data
    Write();
}

static void Write()
{
    using (TextWriter tw = new StreamWriter(@"c:\junk1\test.csv", false))
    {
        // write the header
        tw.WriteLine("Family Name, First Name, Sex, Age");

        // write the details
        foreach (Person person in people)
        {
            tw.WriteLine(String.Format("{0}, {1}, {2}, {3}",
                person.familyName, person.firstName, person.sex.ToString(), person.age.ToString()));
        }
    }
}

/// <summary>
/// Applicable sexes
/// </summary>
public enum Sex
{
    Male,
    Female
}

/// <summary>
/// Holds details about a person
/// </summary>
public class Person
{
    public string familyName { get; set; }
    public string firstName { get; set; }
    public Sex sex { get; set; }
    public int age { get; set; }
}
You can use a DataSet to do this.
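A minimal sketch of that idea, assuming your rows are already in a DataTable (note this naive version doesn't quote fields that contain commas):

using System.Data;
using System.IO;
using System.Linq;

static class CsvExport
{
    public static void WriteCsv(DataTable table, string path)
    {
        using (var writer = new StreamWriter(path))
        {
            // header row from the column names
            writer.WriteLine(string.Join(",", table.Columns.Cast<DataColumn>().Select(c => c.ColumnName)));

            // one line per data row
            foreach (DataRow row in table.Rows)
                writer.WriteLine(string.Join(",", row.ItemArray));
        }
    }
}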
Why not save the lines to a List<string> object first? Add your headers with string.Join(",", yourHeaderArray) - don't append "," yourself, the Join method handles that for you. Here is an example, retrieving the header values from a database using a SQL reader:

var reader = sqlcmdSomeQueryCommand.ExecuteReader();
var columns = new List<string>();

// get all the field names from the reader
for (int intCounter = 0; intCounter < reader.FieldCount; intCounter++)
{
    columns.Add(reader.GetName(intCounter));
}

string[] strarryTmpString = columns.ToArray();
string TmpFields = string.Join(", ", strarryTmpString);
columns.Clear();
columns.Add(TmpFields);

// You can save TmpFields and later append the rest of your comma-delimited
// fields, writing line by line in a foreach loop, or use List<string>.ForEach:
columns.ForEach(delegate(string delString) { someStreamWriterObject.WriteLine(delString); });
Parsing a CSV formatted text file
I have a text file that looks like this:

1,Smith, 249.24, 6/10/2010
2,Johnson, 1332.23, 6/11/2010
3,Woods, 2214.22, 6/11/2010
1,Smith, 219.24, 6/11/2010

I need to be able to find the balance for a client on a given date. I'm wondering if I should:

A. Start from the end and read each line into an array, one at a time. Check the last-name index to see if it is the client we're looking for, then display the balance index of the first match.

B. Use a regex to find a match and display it.

I don't have much experience with regular expressions, but I'll learn them if it's a no-brainer in a situation like this.
I would recommend using the FileHelpers open-source project: http://www.filehelpers.net/

Piece of cake. Define your class:

[DelimitedRecord(",")]
public class Customer
{
    public int CustId;
    public string Name;
    public decimal Balance;
    // the format string must match your data; the sample above uses M/d/yyyy
    [FieldConverter(ConverterKind.Date, "M/d/yyyy")]
    public DateTime AddedDate;
}

Use it:

var engine = new FileHelperAsyncEngine<Customer>();

// Read
using (engine.BeginReadFile("TestIn.txt"))
{
    // The engine is IEnumerable
    foreach (Customer cust in engine)
    {
        // your code here
        Console.WriteLine(cust.Name);
        // your condition >> add balance
    }
}
This looks like a pretty standard CSV-type layout, which is easy enough to process. You can actually do it with ADO.NET and the Jet provider, but I think it is probably easier in the long run to process it yourself.

So first off, you want to process the actual text data. Assuming each record is separated by a newline, you can utilize the ReadLine method to easily get each record:

StreamReader reader = new StreamReader(@"C:\Path\To\file.txt");
while (true)
{
    var line = reader.ReadLine();
    if (string.IsNullOrEmpty(line))
        break;
    // Process Line
}

Then, to process each line, you can split the string on commas and store the values in a data structure. So if you use a data structure like this:

public class MyData
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Balance { get; set; }
    public DateTime Date { get; set; }
}

...you can process the line data with a method like this:

public MyData GetRecord(string line)
{
    var fields = line.Split(',');
    return new MyData()
    {
        Id = int.Parse(fields[0]),
        Name = fields[1],
        Balance = decimal.Parse(fields[2]),
        Date = DateTime.Parse(fields[3])
    };
}

Now, this is the simplest example and doesn't account for cases where fields may be empty, in which case you would either need to support NULL for those fields (using the nullable types int?, decimal?, and DateTime?) or define some default value to assign to them.

Once you have that, you can store the collection of MyData objects in a list and easily perform calculations based on it. Given your example of finding the balance on a given date, you could do something like:

var data = customerDataList.First(d => d.Name == customerNameImLookingFor && d.Date == dateImLookingFor);

...where customerDataList is the collection of MyData objects read from the file, customerNameImLookingFor is a variable containing the customer's name, and dateImLookingFor is a variable containing the date.

I've used this technique to process data in text files in the past, for files ranging from a couple of records to tens of thousands of records, and it works pretty well.
I think the cleanest way is to load the entire file into an array of custom objects and work with that. For 3 MB of data this won't be a problem, and if you want to do a completely different search later, you can reuse most of the code. I would do it this way:

class Record
{
    public int Id { get; protected set; }
    public string Name { get; protected set; }
    public decimal Balance { get; protected set; }
    public DateTime Date { get; protected set; }

    public Record(int id, string name, decimal balance, DateTime date)
    {
        Id = id;
        Name = name;
        Balance = balance;
        Date = date;
    }
}

…

Record[] records = (from line in File.ReadAllLines(filename)
                    let fields = line.Split(',')
                    select new Record(
                        int.Parse(fields[0]),
                        fields[1],
                        decimal.Parse(fields[2]),
                        DateTime.Parse(fields[3])
                    )).ToArray();

Record wantedRecord = records.Single(r => r.Name == clientName && r.Date == givenDate);
Note that both of your options will scan the file. That is fine if you only want to search the file for one item. If you need to search for multiple client/date combinations in the same file, you could parse the file into a Dictionary<string, Dictionary<DateTime, decimal>> first. A direct answer: for a one-off, a regex will probably be faster.
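A sketch of that two-level lookup, assuming the parsed records from the other answers (with Name, Date, and Balance fields); if a client has several records on one date, the last one read wins:

// Build a per-client, per-date balance index once...
var balances = new Dictionary<string, Dictionary<DateTime, decimal>>();
foreach (var r in records)
{
    if (!balances.TryGetValue(r.Name, out var byDate))
        balances[r.Name] = byDate = new Dictionary<DateTime, decimal>();
    byDate[r.Date] = r.Balance; // last record on a date wins
}

// ...then each client/date lookup is O(1):
decimal balance = balances["Smith"][new DateTime(2010, 6, 11)];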
If you're just reading it, I'd consider reading the whole file into memory using StreamReader.ReadToEnd and then treating it as one long string to search through. When you find a record you want to look at, just look for the previous and next line breaks, and you have the transaction row you want. If it's on a server, or the file is refreshed all the time, this might not be a good solution though.
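A rough sketch of that approach (the file path and the ",Smith," needle are placeholders; a real version would need to handle the search text appearing inside other fields):

string all = File.ReadAllText(@"C:\Path\To\file.txt");

// Find the match, then expand outward to the surrounding line breaks.
int hit = all.IndexOf(",Smith,", StringComparison.Ordinal);
if (hit >= 0)
{
    int start = all.LastIndexOf('\n', hit) + 1;  // start of the line (0 if it's the first line)
    int end = all.IndexOf('\n', hit);            // end of the line
    if (end < 0) end = all.Length;               // last line may have no trailing newline
    string row = all.Substring(start, end - start).TrimEnd('\r');
    Console.WriteLine(row); // e.g. "1,Smith, 219.24, 6/11/2010"
}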
If it's all well-formatted CSV like this, then I'd use something like the Microsoft.VisualBasic.FileIO.TextFieldParser class or the Fast CSV class over on CodeProject to read it all in.

The data type is a little tricky, because I imagine not every client has a record for every day. That means you can't just use a plain nested dictionary for your lookups. Instead, you want to "index" by name first and then by date, but the date side needs to keep its records in order. I think I'd go for something like this as I read in each record:

Dictionary<string, SortedList<DateTime, double>>
Hey, hey, hey!!! Why not do it with this great project on CodeProject: LINQ to CSV. Way cool, rock solid!