I am exporting a DataSet to Excel that was generated by running an SQL script within my C# program. The DataSet correctly stores certain columns as DateTime values, but from my understanding it does not store formatting information. It appears, then, that the formatting is applied when I call
ws.FirstCell().InsertTable(dt, false);
Which gives it the default format dd/mm/yyyy. Unfortunately I can't just select the whole range with ClosedXML and change the number format, since some columns contain non-date numbers.
I suppose I could go through my data set, find which columns are DateTime columns, and format those columns accordingly with ClosedXML. I would imagine there has to be a better way, though. Does anyone know an easy option for doing this with ClosedXML (or even interop)? I was unable to find anything on this.
EDIT:
It appears this is the only way at the moment. If anyone needs it (and manages to find this post), it's easy enough to do:
// Select which columns are dates and change their default format
int colNum = 1;
foreach (DataColumn col in dt.Columns)
{
    if (col.DataType == typeof(System.DateTime))
    {
        Console.WriteLine(colNum);
        ws.Column(colNum).Style.NumberFormat.Format = "mmm-dd-yyyy";
    }
    colNum++;
}
ws.FirstCell().InsertTable(dt, false);
Related
I am trying to export a DataGrid's data to an Excel sheet in C#. Using the code below I managed to export it successfully:
private void ExportToExcelAndCsv()
{
    // Copy the whole grid (including headers) to the clipboard
    dataGrid.SelectAllCells();
    dataGrid.ClipboardCopyMode = DataGridClipboardCopyMode.IncludeHeader;
    ApplicationCommands.Copy.Execute(null, dataGrid);
    String resultat = (string)Clipboard.GetData(DataFormats.CommaSeparatedValue);
    String result = (string)Clipboard.GetData(DataFormats.Text);
    dataGrid.UnselectAllCells();
    // Write the tab-delimited clipboard text to an .xls file
    System.IO.StreamWriter file1 = new System.IO.StreamWriter(@"C:\Users\Fisniku\Desktop\" + tipiL + ".xls");
    file1.WriteLine(result.Replace(',', ' '));
    file1.Close();
}
My issue is that I have a column of String datatype in SQL Server containing data in the form of fractions, such as:
1/2
2/2
1/3
9/4
When the data is exported to Excel, that column becomes a date and shows values like 01-Jan in Custom format.
After I try to change the column in Excel to text format, it loses its value and becomes invalid.
How can I modify the code so that it preserves the same format as in the DataGrid?
I am assuming that this data in the grid originates from a SQL query or table based on the tags on your question. If that's the case, and if either the data is read-only or the data is in a datatable that is bound to the grid, then an easy way of accomplishing this is to use EPPlus instead of interop.
EPPlus has a single command to export a C# datatable (System.Data.DataTable) into a spreadsheet:
ws.Cells["A1"].LoadFromDataTable(dt, true);
And, per your original problem, it respects data types, meaning it won't clobber leading zeroes on text fields that contain only digits, or convert text that looks like a date into a date, etc.
If you're not using a datatable, you can always convert your content to a datatable (although there is a lot of overhead with this approach).
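A minimal sketch of that conversion, assuming the grid content is already available as rows of strings. The class and method names (GridToDataTable, FromRows) are hypothetical and not part of EPPlus; only System.Data is used. Every column is typed as string so that values such as 1/2 stay text.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class GridToDataTable
{
    // Builds a DataTable in which every column is typed as string.
    // Hypothetical helper for illustration; adapt the input shape to your grid.
    public static DataTable FromRows(
        IReadOnlyList<string> headers,
        IEnumerable<IReadOnlyList<string>> rows)
    {
        var table = new DataTable();
        foreach (var header in headers)
            table.Columns.Add(header, typeof(string));

        foreach (var row in rows)
        {
            var dataRow = table.NewRow();
            // Cells missing from a short row are left as DBNull.
            for (var i = 0; i < headers.Count && i < row.Count; i++)
                dataRow[i] = row[i];
            table.Rows.Add(dataRow);
        }
        return table;
    }
}
```

The resulting table can then be handed to the LoadFromDataTable call shown above.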
If you are married to interop, then just be sure you format the appropriate columns as text BEFORE you do the paste:
sheet.Range["A:A"].NumberFormat = "@";
You may need to format the cell as text. Take a look at this question:
Format an Excel column (or cell) as Text in C#?
I am working with a client to import a rather large Excel file (over 37K rows) into a custom system, using the excellent LinqToExcel library to do so. While reading all of the data in, I noticed it was breaking on records about 80% of the way in, so I dug a little further. The majority of records (with associated dates ranging from 2011 to 2015) are normal, e.g. 1/3/2015. Starting in 2016, however, the structure changes to look like this: '1/4/2016 (note the "tick" at the beginning of the date), and LinqToExcel starts returning a DBNull for that column.
Any ideas on why it would do that and ways around it? Note that this isn't a casting issue - I can use the Immediate Window to see all the values of the LinqToExcel.Row value and where that column index is, it's empty.
Edit
Here is the code I am using to read in the file:
var excel = new LinqToExcel.ExcelQueryFactory(Path.Combine(this.FilePath, this.CurrentFilename));
foreach (var row in excel.Worksheet(file.WorksheetName))
{
data.Add(this.FillEntity(row));
}
The problem I'm referring to is inside the row variable, which is a LinqToExcel.Row instance and contains the raw data from Excel. The values inside row all line up, with the exception of the column for the date which is empty.
Edit 2
I downloaded the LinqToExcel code from GitHub and connected it to my project, and it looks like the issue is even deeper than this library. It uses an IDataReader to read in all of the values, and the cells in question that aren't being read are already empty at that level. Here is the block of code from the LinqToExcel.ExcelQueryExecutor class that is failing:
private IEnumerable<object> GetRowResults(IDataReader data, IEnumerable<string> columns)
{
    var results = new List<object>();
    var columnIndexMapping = new Dictionary<string, int>();
    for (var i = 0; i < columns.Count(); i++)
        columnIndexMapping[columns.ElementAt(i)] = i;
    while (data.Read())
    {
        IList<Cell> cells = new List<Cell>();
        for (var i = 0; i < columns.Count(); i++)
        {
            var value = data[i];
            //I added this in, since the worksheet has over 37K rows and
            //I needed to snag right before it hit the values I was looking for
            //to see what the IDataReader was exposing. The row inside the
            //IDataReader relevant to the column I'm referencing is null,
            //even though the data definitely exists in the Excel file
            if (value.GetType() == typeof(DateTime) && value.Cast<DateTime>() == new DateTime(2015, 12, 31))
            {
            }
            value = TrimStringValue(value);
            cells.Add(new Cell(value));
        }
        results.CallMethod("Add", new Row(cells, columnIndexMapping));
    }
    return results.AsEnumerable();
}
Since their class uses an OleDbDataReader to retrieve the results, I think that is what can't find the value of the cell in question. I don't even know where to go from there.
Found it! Once I traced down that it was the OleDbDataReader that was failing and not the LinqToExcel library itself, it sent me down a different path to look around. Apparently, when an Excel file is read by an OleDbDataReader (as virtually all utilities do under the covers), the first few records are scanned to determine the type of content associated with the column. In my scenario, over 20K records had "normal" dates, so it assumed everything was a date. Once it got to the "bad" records, the ' in front of the date meant it couldn't be parsed into a date, so the value was null.
To circumvent this, I load the file and tell it to ignore column headers. Since the header for this column is a string and most of the values are dates, it makes everything a string because of the mismatched types and the values I need are loaded properly. From there, I can parse accordingly and get it to work.
Source: What is IMEX in the OLEDB connection string?
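For the "parse accordingly" step, something along these lines may work once every value in the column arrives as a string. This is only a sketch: the class name ExcelDateText and the list of accepted formats are assumptions, and the actual text the reader returns for a date cell may differ, so the format list should be checked against the real file.

```csharp
using System;
using System.Globalization;

public static class ExcelDateText
{
    // Assumed formats; extend this list to match what the reader actually returns.
    private static readonly string[] Formats = { "M/d/yyyy", "M/d/yyyy h:mm:ss tt" };

    // Returns the parsed date, or null when the cell is empty or not a date.
    public static DateTime? Parse(object raw)
    {
        if (raw == null || raw is DBNull) return null;
        if (raw is DateTime alreadyDate) return alreadyDate;

        // Drop surrounding whitespace and an optional leading tick (').
        var text = Convert.ToString(raw, CultureInfo.InvariantCulture)
                          .Trim()
                          .TrimStart('\'');

        return DateTime.TryParseExact(text, Formats, CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out var parsed)
            ? parsed
            : (DateTime?)null;
    }
}
```

The header row, which is now read as ordinary data, simply yields null here and can be skipped by the caller.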
I am populating a ListObject with data from a database, and am pre-formatting ListColumns which come from VarChar (& similar) as Text before inserting the data.
This works well, but some affected cells now are showing the 'Number Stored As Text' error.
The answer https://stackoverflow.com/a/21869098/1281429 suppresses the error correctly, but requires looping through all cells (as it is not possible to perform the action on a range).
Unfortunately for large ranges this is unacceptably slow.
(n.b. - if you do it manually in Excel it's lightning fast)
Here is a code snippet in C# (for a particular column):
var columnDataRange = listColumn.DataBodyRange;
var cells = columnDataRange.Cells;
// Excel collections are 1-based, so the last cell is at index cells.Count
for (var i = 1; i <= cells.Count; i++)
{
    InteropExcel.Range cell = cells[i, 1];
    if (cell.Count > 1) break;
    if (cell.Errors != null)
    {
        var item = cell.Errors.Item[InteropExcel.XlErrorChecks.xlNumberAsText];
        item.Ignore = true;
    }
}
Does anyone know of a faster way of doing this?
(Or, more generally, a faster way of iterating through cells in a range?)
Hope someone can help - thanks.
Edit: this is a VSTO Application-Level add-in for Excel 2010/2013.
Just to be sure - you are going from a database to an Excel export? Are you creating a new, clean spreadsheet or overwriting existing data in an existing spreadsheet?
If you are overwriting data in an existing spreadsheet, I would first clear the columns and format the columns in Excel (programmatically of course). It is likely old data and new data going into the same space are causing type issues.
So something like:
thisExcel.xlWorksheet.Range[yourrange].Value = ""
thisExcel.xlWorksheet.Range[yourrange].NumberFormat = choseyourformat
http://msdn.microsoft.com/en-us/library/office/ff196401(v=office.15).aspx
You should be able to apply that to a larger area.
I have a really weird problem with reading an xlsx file (I'm using OleDbDataReader).
I have a column there that consists of the following data:
50595855
59528522
C_213154
23141411
The problem is that when I read this column, the reader shows me that the third row is empty. The column format in Excel is set to 'General'. But when I set the format to 'Text', everything works fine and the reader sees the data in that row.
So, just for the sake of experiment, I prefixed the first two rows with a letter and made the column look like the following:
C_50595855
C_59528522
C_213154
23141411
And the reader reads everything without a problem, even when the column format is set to 'General'.
So apparently the data in the column is somehow analysed before it is loaded, and the analysis gets confused when the first cells of the column look numeric and some of the rest are text.
It is really weird to me as either there is data in the cell or there isn't.
Anyone have any ideas why this is happening?
Any help will be much appreciated.
Regards,
Igor
As you surmised it's an issue caused by mixed data types. If you search on "OleDBDataReader mixed types" you'll get some answers. Here's an MSDN page that describes the problem:
"This problem is caused by a limitation of the Excel ISAM driver in that once it determines the datatype of an Excel column, it will return a Null for any value that is not of the datatype the ISAM driver has defaulted to for that Excel column. The Excel ISAM driver determines the datatype of an Excel column by examining the actual values in the first few rows and then chooses a datatype that represents the majority of the values in its sampling."
... and the solution(s):
"Insure that the data in Excel is entered as text. Just reformatting the Excel column to Text will not accomplish this. You must re-enter the existing values after reformatting the Excel column. In Excel, you can use F5 to re-enter existing values in the selected cell.
You can add the option IMEX=1; to the Excel connect string in the OpenDatabase method. For example:
Set Db = OpenDatabase("C:\Temp\Book1.xls", False, True, "Excel 8.0; HDR=NO; IMEX=1;")
"
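In C#, the same IMEX option goes into the Extended Properties part of the OLE DB connection string. The helper below is a sketch under assumptions: the class name ExcelConnectionStrings is hypothetical, and it picks the provider purely from the file extension (Jet 4.0 with "Excel 8.0" for .xls, ACE 12.0 with "Excel 12.0 Xml" for .xlsx). Note that IMEX=1 only affects columns the driver has detected as mixed within the rows it samples, so it may not help if the odd values appear late in the file.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Builds an OLE DB connection string with IMEX=1 (import mode).
    public static string Build(string path, bool hasHeaderRow)
    {
        var isXlsx = string.Equals(Path.GetExtension(path), ".xlsx",
                                   StringComparison.OrdinalIgnoreCase);
        var provider = isXlsx ? "Microsoft.ACE.OLEDB.12.0" : "Microsoft.Jet.OLEDB.4.0";
        var version = isXlsx ? "Excel 12.0 Xml" : "Excel 8.0";
        var hdr = hasHeaderRow ? "YES" : "NO";

        // The extended properties must be wrapped in double quotes
        // because they contain semicolons themselves.
        return $"Provider={provider};Data Source={path};" +
               $"Extended Properties=\"{version};HDR={hdr};IMEX=1\";";
    }
}
```

The returned string would be passed to the OleDbConnection constructor, for example new OleDbConnection(ExcelConnectionStrings.Build(path, false)).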
I've been asked to import an Excel spreadsheet, which is fine, but I'm having a problem importing a certain cell that contains both numeric and alphanumeric values.
Excel example:
        Col A    Col B             Col C
Row 1   0123     8 Fake Address    CF11 1XX
Row 2   XX123    8 Fake Address    CF11 1XX
As per the example above, when the dataset is being loaded it treats Row 2, Col (A) as a numeric field, resulting in an empty value in the array.
My connection for the OleDb is
var dbImportConn = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + dataSource
    + @";Extended Properties=""Excel 8.0;HDR=No;IMEX=1"";");
In this connection I have set IMEX=1, which should parse all contents into the dataset as strings. Also, if I change Row 1, Col (A) to 'XX123', the entire Col (A) successfully parses as string! Unfortunately this is not going to help in my scenario, as the Excel file comes from an external client who has also advised that they do not have the means to send the file with a header row, which would solve my issue.
My one thought at this point is to edit the file programmatically when I receive it and insert a header, but as the client may change how many columns it contains, this would not be a safe option for me.
So basically I need to find a solution that deals with the current format of the spreadsheet and passes all cells through into the array. Has anyone come across this issue before? Or does anyone know how to solve it?
I await your thoughts
Thanks
Scott
ps If this is not clear just shout
Hi. There is a registry setting called TypeGuessRows that controls how many rows the Excel driver scans before deciding a column's type. Currently, it seems, it reads only the first x rows of a column (8 by default) and decides the type from those: if your first x rows are integers and row x+1 is a string, the import will fail because it has already decided that this is an integer column. You can change the registry setting so that it reads the whole column before deciding.
Please see this also:
http://jingyangli.wordpress.com/2009/02/13/imex1-revisit-and-typeguessrows-setting-change-to-0-watch-for-performance/
This isn't a direct answer, but I would recommend the Excel Data Reader, which is open source under the LGPL licence and is a lightweight and fast library written in C# for reading Microsoft Excel files ('97-2007).