Excel Library possible integer overflow - c#

I am using ExcelLibrary (http://code.google.com/p/excellibrary/) to generate an Excel 2003 spreadsheet. Everything works fine except when some large values are used.
These are some reference numbers that are used by a client and I simply need to present them as integer values in the spreadsheet.
int val = 1420007117;
worksheet.Cells[row, col] = new Cell(val); // Displays 352108063
This results in the value 352108063 being displayed in the spreadsheet. If the value is lower, it displays fine.
Does anyone know what the issue might be, or how to work around it? Outputting the value as a string is not an option, as it triggers the green "Number stored as text" warning.

I would say that Excel doesn't support 64-bit integers and ExcelLibrary doesn't account for that.
For numbers this large you'd better use floating point, which is how Excel itself handles big numbers.
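A minimal sketch of that suggestion, assuming the cell is handed a double instead of an int. The conversion itself is exact for any 32-bit integer, because a double has a 53-bit mantissa. Whether ExcelLibrary's Cell constructor accepts a boxed double is an assumption here, so that call appears only as a comment.

```csharp
using System;

// The reference number from the question; it fits in a 32-bit int.
int val = 1420007117;

// Widening an int to a double is exact (53-bit mantissa vs. 32-bit value).
double asDouble = val;

// Converting back yields the original number, so nothing was lost.
bool roundTrips = (int)asDouble == val;
Console.WriteLine(roundTrips);

// Assumption: the library accepts a double here, e.g.
// worksheet.Cells[row, col] = new Cell(asDouble);
```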

Related

Reading double's from Excel "the Excel way"

Background
I am trying to read a 22 x 22 matrix from an Excel worksheet. The matrix holds percent values, and the values of each row must sum to 100% (or 1 when dealing with the numbers behind the percent values). When I open such a worksheet in Excel and sum each row, the result is always 100% (1). Perfect.
But when I read the worksheet and sum up the (double) values read from the sheet I get a significant distance to 1 on most of the rows (significant means more than 0.00000000001 absolute distance to 1).
Investigation
I modified the matrix in Excel to show the numbers behind the percent values and then compared them to what I read using EPPlus. For example, I had:
99.86% (Excel with percent)
0.998610811163197 (Excel as number)
0.9986108111631975 (read with EPPlus)
I renamed my Excel document to a ZIP archive, unpacked it and opened the corresponding sheet in Visual Studio. The value stored was exactly the value I got with EPPlus, which wasn't really surprising.
Solution?
I decided to do what Excel does, or at least what I thought it does, and tried rounding the values to 15 digits. Funnily enough, the result wasn't the same as in Excel. Even worse, after looking at some other values I had:
0.00 % (Excel with percent)
0.00000330942432252678 (Excel as number)
3.3094243225267778E-6 (stored in the XML, read via EPPlus)
So, the question is: is there a way to round or read the values from Excel as Excel displays them?
Here is my code for reading the excel:
using (ExcelPackage excel = new ExcelPackage())
{
    excel.Load(File.OpenRead("data.xlsx"));
    var a1 = excel.Workbook.Worksheets.First().Cells["A1"].Value;
    var a2 = excel.Workbook.Worksheets.First().Cells["A2"].Value;
}
Apologies, I am not able to upload the excel file at the moment from my workplace to dropbox or something else, I'll attach it later.
Edit: here is the excel document.
If I understand your question, you have a problem with displaying double values, right?
You can use the appropriate format string for displaying double values. Note that the percent format multiplies the value by 100, so it expects the fraction, not the percentage. For example:
double val = 0.998610811163198;
Console.WriteLine(val.ToString("P", CultureInfo.InvariantCulture));
See the MSDN article on this: https://msdn.microsoft.com/en-us/library/kfsatb94(v=vs.110).aspx
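Since the question asks about rounding rather than display, here is a minimal sketch of rounding to 15 significant digits (not 15 decimal places), which is the precision Excel is documented to keep for numbers. The helper name RoundTo15SignificantDigits is hypothetical, and going through the "G15" format string is one possible approach, not necessarily what Excel does internally.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: rounds to 15 significant digits by
// formatting with "G15" and parsing the text back into a double.
static double RoundTo15SignificantDigits(double value)
{
    string text = value.ToString("G15", CultureInfo.InvariantCulture);
    return double.Parse(text, CultureInfo.InvariantCulture);
}

// Value taken from the sheet XML in the question.
double stored = 3.3094243225267778E-6;
double rounded = RoundTo15SignificantDigits(stored);
Console.WriteLine(rounded.ToString("G15", CultureInfo.InvariantCulture));
```

Rounding to 15 decimal places instead would cut a small value such as this one to far fewer significant digits, which may explain a mismatch with what Excel shows. For the row sums, comparing against 1 with a tolerance (for example Math.Abs(sum - 1.0) < 1e-9) is usually more robust than expecting exact equality.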

C# - Decimals not writing out to Excel

I have an ETL that's saving data to an Excel file. The issue is that the decimals are not being written out for integers. Example:
14.00
is being written out as
14
My code for writing out that line is
loWorksheet.Cells[liRowNum, 5] = lcAmount.ToString("0.00");
When I step through the code, it shows as 14.00, but on the Excel file it is not retaining the decimal places. Is this something that can be fixed in my code or is this an Excel issue? Any suggestions?
I'm quite sure you have to set format for your cells. I can't check right now, but it will be something like
xlYourRange.NumberFormat = "0.00";
You can check this question: "Set data type like number, text and date in excel column using Microsoft.Office.Interop.Excel in c#"
If you really want the data to be displayed literally the way it is in the source file, you have to deal with trade-offs. The simplest way is to format the data as text. You can do this a cell at a time or for entire columns:
loWorksheet.Columns["A:E"].NumberFormat = "@";
The trade-off is that it's just text at this point. You can't add, sum, average, whatever.
On the other hand, if your data looks like this:
4.0
4.00
4.000
You can't really keep it as numbers and expect to retain the original format without doing some funny business.
If it's consistently two decimal places, and you know it's going to be, then I agree with @RenatZamaletdinov's solution.
And you might want to consider other strings and what Excel might do to them:
0000123 becomes 123
10/23 will probably render as a date, depending on your localization
12345678901234567890 will probably render in scientific notation
These are all avoided if you make the number format text (@), but again, without knowing what you plan to do with the data, it's hard to say if this is the correct approach.
Wrap lcAmount.ToString("0.00") in a pair of quotes and put an equals sign in front of it. This will prevent Excel from overriding the format.
loWorksheet.Cells[liRowNum, 5] = "=" + '"' + lcAmount.ToString("0.00") + '"';

Excel Interop Value vs Text vs Value2

Having some problems parsing numbers out of the following excel spread sheet.
The code:
var curQOH = toolkit.ExcelWorksheet.Cells[i, 28] as Range;
var curQAV = toolkit.ExcelWorksheet.Cells[i, 29] as Range;
if (!curQOH.Text.Contains("("))
    Int32.TryParse(curQOH.Text, out lastQOH);
else
    Int32.TryParse(curQOH.Value as string, out lastQOH);
if (!curQAV.Text.Contains("("))
    Int32.TryParse(curQAV.Text, out lastQAV);
else
    Int32.TryParse(curQAV.Value as string, out lastQAV);
The code above parses the positive numbers just fine. No issues. But it seems like it cannot parse negative numbers.
To my knowledge, Text is supposed to give me what the viewer sees, so I would get (10) as output. Value does give the right number, but I cannot seem to parse it after casting to string. (This is the same issue as in "Excel cell value as string won't store as string", which is why I can't store the value as a string or cast it to an int.)
Stopped using Excel Interop and started using the OpenXML Excel library.
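For anyone staying with Interop, a minimal sketch of two ways around the problem. The likely cause is that Value holds a boxed number rather than a string for a numeric cell, so `Value as string` evaluates to null and TryParse fails. The helper names TryParseAccounting and FromCellValue are hypothetical, and the assumption that numeric cells come back as doubles is based on how Interop usually behaves, not on the asker's workbook.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: parses displayed cell text such as "(10)" or "1,250".
static bool TryParseAccounting(string text, out int result)
{
    const NumberStyles styles =
        NumberStyles.AllowParentheses |   // "(10)" is read as -10
        NumberStyles.AllowLeadingSign |
        NumberStyles.AllowThousands |
        NumberStyles.AllowLeadingWhite |
        NumberStyles.AllowTrailingWhite;
    return int.TryParse(text, styles, CultureInfo.InvariantCulture, out result);
}

// Hypothetical helper: converts a raw cell value (assumed to be a boxed
// number) without going through a string at all.
static int FromCellValue(object value) =>
    Convert.ToInt32(value, CultureInfo.InvariantCulture);

TryParseAccounting("(10)", out int negative);
Console.WriteLine(negative);              // -10
Console.WriteLine(FromCellValue(-10.0));  // -10
```

Converting Value (or Value2) directly avoids any dependence on the cell's display format, which is why it is usually the safer of the two.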

Easy way to increment cell numbers in Excel

I'm an intermediate C# programmer, but I'm just starting out with Office automation, specifically Excel for now. I've got to say, the Office API is lacking, or at least it forces you to think about problems differently. One thing that's driving me nuts is cell addresses, such as A1 and B5 and so on. I'm forced to manipulate them often, but there's no easy way to do this. For example, if I'm on cell C7 and want to copy or move something to B7, I can't just use --C7. Instead I have to figure out the numerical value of C, decrement it, turn it back into a letter, then concatenate it with the row number again.
I could write methods to do this myself (e.g. decrementColumn(), decrementRow(), addColumns( String currentCellName, int howManyToAdd) ), but I don't want to reinvent the wheel. Does a library of functions exist for such oft-needed conversions or am I going to have to roll my own?
To copy/move values easily, you can use the .Offset method, which returns a Range.
For example, if the range/cell you are working with is C7, where rng represents this Range object:
rng.Offset(0,-1).Value = rng.Value
This returns the range, offset by -1 columns.
rng.Offset(10,15) would return a cell/range 10 rows below, and 15 columns right, etc.
You may also look at the R1C1 address style in Excel, although I have never been fond of it. This link is for Excel 2007 but should be mostly applicable to any version of Excel.
http://msdn.microsoft.com/en-us/library/office/ee264226(v=office.12).aspx
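If you do end up rolling your own conversions for A1-style addresses, they are short. A minimal sketch follows; the function names ColumnNameToNumber and ColumnNumberToName are hypothetical, not part of any Office API.

```csharp
using System;

// "A" -> 1, "Z" -> 26, "AA" -> 27 (bijective base 26)
static int ColumnNameToNumber(string name)
{
    int number = 0;
    foreach (char c in name.ToUpperInvariant())
    {
        number = number * 26 + (c - 'A' + 1);
    }
    return number;
}

// 1 -> "A", 26 -> "Z", 27 -> "AA"
static string ColumnNumberToName(int number)
{
    string name = "";
    while (number > 0)
    {
        int remainder = (number - 1) % 26;
        name = (char)('A' + remainder) + name;
        number = (number - 1) / 26;
    }
    return name;
}

// Move one column to the left of C7.
string target = ColumnNumberToName(ColumnNameToNumber("C") - 1) + "7";
Console.WriteLine(target); // prints "B7"
```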

Excel Interop - Efficiency and performance

I was wondering what I could do to improve the performance of Excel automation, as it can be quite slow if you have a lot going on in the worksheet...
Here's a few I found myself:
ExcelApp.ScreenUpdating = false -- turn off the redrawing of the screen
ExcelApp.Calculation = Excel.XlCalculation.xlCalculationManual -- turning off the calculation engine so Excel doesn't automatically recalculate when a cell value changes (turn it back on after you're done)
Reduce calls to Worksheet.Cells.Item(row, col) and Worksheet.Range -- I had to poll hundreds of cells to find the cell I needed. Caching the cell locations reduced the execution time from ~40 to ~5 seconds.
What kind of interop calls take a heavy toll on performance and should be avoided? What else can you do to avoid unnecessary processing being done?
When using C# or VB.Net to either get or set a range, figure out what the total size of the range is, and then get one large 2 dimensional object array...
// get values: one interop call returns the whole block (the array is 1-based)
object[,] objectArray = (object[,])shtName.get_Range("A1:Z100").Value2;
iFace = Convert.ToInt32(objectArray[1,1]);
// set values: one interop call writes the whole block
object[,] newValues = new object[3,1] {{"A"},{"B"},{"C"}};
rngName.Value2 = newValues;
Note that it's important to know what data type Excel is storing (text or numbers), as values are not converted for you automatically when you read them back from the object array. Add tests to validate the data if you can't be sure of its type beforehand.
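When the data starts out as a list of rows, it first has to be packed into a rectangular array before it can be assigned in one call. A minimal sketch of that packing step, runnable without Excel; the helper name ToCellArray is hypothetical, and the final assignment to a range is shown only as a comment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: packs rows into a rectangular object[,].
// Short rows leave the remaining cells null (empty in Excel).
static object[,] ToCellArray(IReadOnlyList<object[]> rows, int columnCount)
{
    var cells = new object[rows.Count, columnCount];
    for (int r = 0; r < rows.Count; r++)
    {
        for (int c = 0; c < columnCount && c < rows[r].Length; c++)
        {
            cells[r, c] = rows[r][c];
        }
    }
    return cells;
}

var rows = new List<object[]>
{
    new object[] { "A", 1.5 },
    new object[] { "B", 2.5 },
    new object[] { "C", 3.5 },
};
object[,] block = ToCellArray(rows, 2);
Console.WriteLine($"{block.GetLength(0)} x {block.GetLength(1)}");

// With Interop, the whole block would then be written in a single call:
// rngName.Value2 = block;
```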
This is for anyone wondering what the best way is to populate an excel sheet from a db result set. This is not meant to be a full list by any means but it does list a few options.
Here are some performance numbers from populating an Excel sheet with 155 columns and 4200 records on an old Pentium 4 3GHz box. The times include data retrieval, which never took more than 10 seconds. In order from slowest to fastest:
1. One cell at a time - Just under 11 minutes
2. Populating a dataset by converting to html + Saving html to disk + Loading html into excel and saving worksheet as xls/xlsx - 5 minutes
3. One column at a time - 4 minutes
4. Using the deprecated sp_makewebtask procedure in SQL 2005 to create an HTML file - 9 seconds + Followed by loading the html file in excel and saving as XLS/XLSX - About 2 minutes.
5. Convert .Net dataset to ADO RecordSet and use the WorkSheet.Range[].CopyFromRecordset function to populate excel - 45 seconds!
I ended up using option 5. Hope this helps.
If you're polling values of many cells you can get all the cell values in a range stored in a variant array in one fell swoop:
Dim CellVals() as Variant
CellVals = Range("A1:B1000").Value
There is a tradeoff here, in terms of the size of the range you're getting values for. I'd guess if you need a thousand or more cell values this is probably faster than just looping through different cells and polling the values.
Use Excel's built-in functionality whenever possible. For example, instead of searching a whole column for a given string, use the Find command that is available in the GUI via Ctrl-F:
Set Found = Cells.Find(What:=SearchString, LookIn:=xlValues, _
    SearchOrder:=xlByRows, SearchDirection:=xlNext, _
    MatchCase:=False, SearchFormat:=False)
If Not Found Is Nothing Then
    Found.Activate
    (...)
End If
If you want to sort some lists, use the excel sort command, don't do it manually in VBA:
Selection.Sort Key1:=Range("A1"), Order1:=xlAscending, Header:=xlGuess, _
OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
DataOption1:=xlSortNormal
As Anonymous Type says: reading/writing large range blocks is very important to performance.
In cases where the COM-Interop overhead is still too large you may want to switch to using the XLL interface, which is the fastest Excel interface.
Although the XLL interface is primarily meant for C++ users, both Excel-DNA and Add-in Express provide a .NET to XLL bridge, which is significantly faster than COM Interop.
Performance also depends a lot on how you automate Excel. VBA is faster than COM automation, which in turn is faster than .NET automation. And typically early (compile-time) binding is faster than late binding, too.
If you have serious performance problems you could think of moving the critical parts of the code to a VBA module and call that code from your COM/.NET automation code.
If you use .NET you should also use the optimized primary interop assemblies available from Microsoft and not use custom-built interop assemblies.
Another big thing you can do in VBA is to use Option Explicit and avoid Variants wherever possible. Variants are not 100% avoidable in VBA, but they make the interpreter do more work at runtime and waste memory.
I found this article very helpful when I was starting with VBA in Excel.
http://www.ozgrid.com/VBA/SpeedingUpVBACode.htm
And this book
http://www.amazon.com/VB-VBA-Nutshell-Language-OReilly/dp/1565923588
Similar to
app.ScreenUpdating = false //and
app.Calculation = xlCalculationManual
you can also set
app.EnableEvents = false //Prevent Excel events
app.Interactive = false //Prevent user clicks and keystrokes
although they don't seem to make as big a difference as the first two.
Similar to setting Range values to arrays, if you are working with data that is mostly tables with the same formula in every row of a column, you can use R1C1 formula notation for your formula and set an entire column equal to the formula string to set the whole thing in one call.
app.ReferenceStyle = xlR1C1
app.ActiveSheet.Columns(2) = "=SUBSTITUTE(C[-1],""foo"",""bar"")"
Also, creating XLL add-ins using ExcelDNA & .NET (or the hard way in C) is also the only way you can get UDFs to run on multiple threads. (See Excel DNA's ExcelFunction attribute's IsThreadSafe property.)
Before I transitioned to Excel DNA completely, I also experimented with creating COM visible libraries in .NET to reference in VBA projects. Heavy text processing is a bit faster than VBA that way, as are using wrapped .NET List classes instead of VBA's Collection, but Excel DNA is better.
