I am using OLEDB to read data from an Excel file and store it in a DataSet.
My Excel file contents are as follows:
0 0 somestring somestring
500 200 somestring somestring
When I checked the contents of my DataSet, the values of columns 1 and 2 were stored not as integers but as DateTime values.
How can I have them stored as integer values instead of DateTime?
Have you tried adding IMEX=1 to your OLEDB connection string?
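For reference, a minimal sketch of where IMEX=1 goes in the connection string. It assumes an .xls file read through the Jet provider; the class name, the Build helper and the sample path are made up for illustration.

```csharp
using System;

public static class ExcelConnection
{
    // Builds a Jet OLEDB connection string for an .xls file.
    // IMEX=1 asks the driver to read mixed-type columns as text.
    public static string Build(string path, bool hasHeader)
    {
        string extended = "Excel 8.0;HDR=" + (hasHeader ? "Yes" : "No") + ";IMEX=1";
        // The Extended Properties value contains semicolons, so it must be quoted.
        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path
             + ";Extended Properties=\"" + extended + "\";";
    }
}
```

You would pass the result to new OleDbConnection(...). For .xlsx files the provider and version strings differ (for example Microsoft.ACE.OLEDB.12.0 with "Excel 12.0 Xml").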
Are you sure it's a number? The following are a few options:
Right-click the columns in Excel and change the format to Text/Custom.
Look into the NamedRange.FormatConditions property to change the format of the data when you read it from Excel; see MSDN.
Or try clearing the existing rules on a range, for example:
Excel.Worksheet sheet = this.Application.ActiveSheet as Excel.Worksheet;
Excel.Range range = sheet.get_Range("A1", "A5") as Excel.Range;
//delete previous validation rules
range.Validation.Delete();
You could use a 3rd party component like SpreadsheetGear for .NET which lets you get the underlying values of cells (with IWorkbook.Worksheets["MySheet"].Cells[rowIndex, colIndex].Value) regardless of the cell format, or you can get the formatted result with IRange.Text.
You can see live ASP.NET samples here and download the free trial here.
Disclaimer: I own SpreadsheetGear LLC
I am struggling with the Open XML SDK and I've already read a lot of posts on this topic but cannot figure it out. My goal is to have a locally created Excel file which contains a formula and edit the input online and retrieve the calculated value online.
I don't know if this is possible, since Open XML may only change the data, and I wonder whether it is also able to perform Excel's calculations.
For example, my local file contains three cells:
A1: 1
A2: 2
A3: =(A1+A2)
Using Open XML I adjust A2 to the value of 3, however the result of A3 remains 3 instead of 4.
I have already read about Excel having to recalculate, but my goal is to use an Excel file as a sort of calculation engine instead of transferring all calculations to C#.
All tips and advice are welcome.
Kind regards, Patrick
First of all thanks for all responses.
Second, I guess the responses answered my question: the Open XML SDK is only able to modify the file and won't recalculate existing formulas in it. That only happens when the file is opened in Excel. I will take a look at EPPlus.
You can use something like this.
Cell cell; //supposing this is your cell referencing A3
CellFormula cellformula = new CellFormula();
cellformula.Text = "SUM(A1, A2)";
CellValue cellValue = new CellValue();
cellValue.Text = "0";
cell.Append(cellformula);
cell.Append(cellValue);
A similar example can be found at this link: Formula cells in excel using openXML
I feel there is some ambiguity in the question. OpenXML is a way of storing documents, so it is not possible to do calculations with the OpenXML SDK. It is the spreadsheet engine (the Excel application) that performs the calculations.
When the inputs are updated and the spreadsheet is opened and saved in Excel, the calculated values should get updated.
I tried searching for examples and never found one for inserting data into an empty Excel file.
Insert into [Sheet1$] (columnname1, columnName2) values ("somevalue","somevalue");
If I understand correctly, you want a simple way to create a file that can be read in Excel. The simple solution I have used many times, when I don't need any advanced features of Excel sheets, is CSV (comma-separated values).
You format your data like this:
COLUMN1,COLUMN2,COLUMN3
ROW1_VALUE1,ROW1_VALUE2,ROW1_VALUE3
ROW2_VALUE1,ROW2_VALUE2,ROW2_VALUE3
Between the lines there are linebreaks. On Windows use \r\n.
You can construct the file any way that you wish, for example :
File.WriteAllText("test.csv","product,price\r\nbook,100\r\ncoffee,500");
This will produce a CSV that can be read in excel.
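One thing the one-liner above does not cover is values that themselves contain commas or quotes. A rough sketch of how that could be handled, using the usual CSV convention of wrapping such fields in double quotes and doubling embedded quotes; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvSketch
{
    // Quote a field only when it contains a comma, a quote or a line break.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Join fields with commas and rows with Windows line breaks.
    public static string ToCsv(IEnumerable<string[]> rows)
    {
        return string.Join("\r\n",
            rows.Select(r => string.Join(",", r.Select(f => Escape(f)))));
    }
}
```

The result can be written out the same way as before, e.g. File.WriteAllText("test.csv", CsvSketch.ToCsv(rows)).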
Excel.Worksheet oSheet;
//------
oSheet.Cells[Row,Column] = "Some Info";
// --- Row & Column starts with 1
I've been asked to import an Excel spreadsheet, which is fine, but I'm having a problem importing a certain column whose cells contain both numeric and alphanumeric values.
Excel example:
        Col A    Col B             Col C
Row 1   0123     8 Fake Address    CF11 1XX
Row 2   XX123    8 Fake Address    CF11 1XX
As per the example above, when the dataset is being loaded it treats Row 2, Col (A) as a numeric field, resulting in an empty column in the array.
My connection for the OleDb is
var dbImportConn = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + dataSource
+ @";Extended Properties=""Excel 8.0;HDR=No;IMEX=1"";");
In this connection I have set IMEX=1, which should parse all contents as strings into the dataset. Also, if I change Row 1, Col (A) to 'XX123', the entire Col (A) successfully parses as string! Unfortunately this is not going to help in my scenario, as the Excel file comes from an external client who has also advised that they do not have the means to send the file with a header row, which would solve my issue.
My one thought at this point is to edit the file programmatically when I receive it and insert a header, but as the client may change how many columns it contains, this would not be a safe option for me.
So basically I need a solution that deals with the current format of the spreadsheet and passes all cells through into the array. Has anyone come across this issue before, or know how to solve it?
I await your thoughts
Thanks
Scott
ps If this is not clear just shout
Hi. There is a registry setting called TypeGuessRows that tells the Excel driver how many rows of a column to scan before deciding its type. Currently, it seems, it is set to read some number x of rows and decide the column type from those, e.g. if your first x rows are integers and row x+1 is a string, the import will fail because the driver has already decided that this is an integer column. You can change the registry setting so that the whole column is read before the type is decided.
Please see this also:
http://jingyangli.wordpress.com/2009/02/13/imex1-revisit-and-typeguessrows-setting-change-to-0-watch-for-performance/
This isn't a direct answer, but I would recommend the Excel Data Reader, which is open source under the LGPL licence and is a lightweight and fast library written in C# for reading Microsoft Excel files ('97-2007).
I have an Excel 2007 workbook that contains tables of data that I'm importing into DataTable objects using ADO.NET.
Through some experimentation, I've managed to find two different ways to indicate that a cell should be treated as "null" by ADO.NET:
The cell is completely blank.
The cell contains #N/A.
Unfortunately, both of these are problematic:
Most of my columns of data in Excel are generated via formulas, but it's not possible in Excel to generate a formula that results in a completely blank cell. And only a completely blank cell will be considered null (an empty string will not work).
Any formula that evaluates to #N/A (either due to an actual lookup error or because the NA() function was used) will be considered null. This seemed like the ideal solution until I discovered that the Excel workbook must be open for this to work. As soon as you close the workbook, OLEDB suddenly starts seeing all those #N/As as strings. This causes exceptions like the following to be thrown when filling the DataTable:
Input string was not in a correct format. Couldn't store <#N/A> in Value Column. Expected type is Int32.
Question: How can I indicate a null value via an Excel formula without having to have the workbook open when I fill the DataTable? Or what can be done to make #N/A values be considered null even when the workbook is closed?
In case it's important, my connection string is built using the following method:
var builder = new OleDbConnectionStringBuilder
{
Provider = "Microsoft.ACE.OLEDB.12.0",
DataSource = _workbookPath
};
builder.Add("Extended Properties", "Excel 12.0 Xml;HDR=Yes;IMEX=0");
return builder.ConnectionString;
(_workbookPath is the full path to the workbook).
I've tried both IMEX=0 and IMEX=1 but it makes no difference.
You're hitting the brick wall that many very frustrated users of Excel run into. Excel is widespread as a company tool and seems quite robust, but because each cell/column/row has a variant data type, it is a nightmare to handle with other tools such as MySQL, SQL Server, R, RapidMiner, SPSS, and the list goes on. It seems that Excel 2007/2010 is not very well supported, even more so when taking 32/64-bit versions into account, which is scandalous in this day and age.
The main problem is that when ACE/Jet access each field in Excel they use a registry setting 'TypeGuessRows' to determine how many rows to use to assess the datatype. The default for "Rows to Scan" is 8 rows. The registry setting 'TypeGuessRows' can specify an integer value from one (1) to sixteen (16) rows, or you can specify zero (0) to scan all existing rows. If you can't change the registry setting (such as in 90% of office environments) it makes life difficult as the rows to guess are limited to the first 8.
For example, without the registry change
If the first occurrence of #N/A is within the first 8 rows then IMEX = 1 will return the error as a string "#N/A". If IMEX = 0 then an #N/A will return 'Null'.
If the first occurrence of #N/A is beyond the first 8 rows then IMEX = 0 and IMEX = 1 both return 'Null' (assuming the required data type is numeric).
With the registry change (TypeGuessRows = 0) then all should be fine.
Perhaps there are 4 options:
Change the registry setting TypeGuessRows = 0
List all possible type variations in the first 8 rows as 'dummy data' (eg memo fields/nchar(max)/ errors #N/A etc)
Correct ALL data type anomalies in Excel
Don't use Excel - Seriously worth considering!
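If none of those is practical, a further workaround (my own assumption, not one of the four options above) is to get the column into the DataTable as text and do the null handling yourself in C#. A minimal sketch; the class and method names are made up, and it assumes the cell arrives as a string, DBNull or a boxed number.

```csharp
using System;
using System.Globalization;

public static class CellParsing
{
    // Maps DBNull, blank cells and the "#N/A" error text to null;
    // anything else is parsed as an integer.
    public static int? ToNullableInt(object cell)
    {
        if (cell == null || cell is DBNull) return null;
        string text = Convert.ToString(cell, CultureInfo.InvariantCulture).Trim();
        if (text.Length == 0 || text == "#N/A") return null;
        // Throws FormatException for non-integer text, which is deliberate here.
        return int.Parse(text, NumberStyles.Integer, CultureInfo.InvariantCulture);
    }
}
```

The typed value can then be copied into an Int32 column of a second DataTable, so the "Couldn't store <#N/A>" exception never gets a chance to fire.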
Edit:
Just to put the boot in :) another two things that really annoy me: if the first field on a sheet is blank over the first 8 rows and you can't edit the registry setting, then the whole sheet is returned as blank (many fun conversations telling managers they're fools for merging cells!). Also, if in Excel 2007/2010 a department returns a sheet with more than 255 columns/fields, you have huge problems if you need a non-contiguous import (e.g. key in col 1 and data in cols 255+).
I am using the CarlosAg.ExcelXmlWriter library to generate an Excel file in C#. If I treat all data as strings everything works fine, but I need some cells to be recognized by Excel as date fields. When I try to set the data type accordingly, the resulting Excel file fails to open in Excel 2003 (or Excel 2007 for that matter).
In Excel 2003, I get the following error on load:
Problems came up in the following area
during load: Table
I am using the following code to generate the DateTime cells that are causing the problem:
string val = DateTime.Now.ToString("MM/dd/yyyy");
row.Cells.Add(new WorksheetCell(val, DataType.DateTime));
Thanks.
I eventually figured this out. Thanks to Lance Roberts for nudging me in the right direction.
First, define a WorksheetStyle called dateStyle and set its NumberFormat property to a valid Excel date format:
Workbook book = new Workbook();
Worksheet sheet = book.Worksheets.Add("myWorksheet");
WorksheetStyle dateStyle = book.Styles.Add("dateStyle");
dateStyle.NumberFormat = "Dd Mmm Yy";
WorksheetRow row = book.Worksheets[0].Table.Rows.Add();
Then, export the date using the .NET "s" date format and add the cell:
string val = DateTime.Now.ToString("s");
row.Cells.Add(val, DataType.DateTime, "dateStyle");
I haven't used his library, but I work in Excel a lot. I don't know what it's doing with a data type for a cell, since cells don't work that way. Dates in Excel cells are all stored as numbers with date formatting applied.
I would try to put the date in as a number; the trick is converting your string to the correct number. See this link for information on Excel's dates-as-numbers methodology. I would then set the WorksheetStyle.NumberFormat property.
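To illustrate the dates-as-numbers idea: in .NET the conversion can be done with DateTime.ToOADate(), which as far as I know matches Excel's serial numbers for any date from 1 March 1900 onwards (earlier dates differ because Excel treats 1900 as a leap year). A small sketch; the wrapper class and method names are made up.

```csharp
using System;

public static class ExcelDates
{
    // Whole part = days since 30 Dec 1899, fractional part = time of day.
    public static double ToExcelSerial(DateTime value)
    {
        return value.ToOADate();
    }

    // Inverse conversion, e.g. for a number read back from a cell.
    public static DateTime FromExcelSerial(double serial)
    {
        return DateTime.FromOADate(serial);
    }
}
```

For example, 1 January 2009 comes out as 39814, which Excel displays as a date once the cell has a date number format.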