I'm coming across an issue that I'm not sure there's a good answer for.
We have a bulk-insert spreadsheet template to allow people to define certain components of an online ad. They then upload the document, we process it, and set it up in the database.
Recently there was a feature request to change bulk-insert into bulk-edit; in other words, people will download an Excel sheet with information about the current ad prepopulated in the fields on the sheet. They would make changes as a set, then re-upload, and we'd process the changes and update the database.
The problem is that one of the pieces of information is an HTML snippet with a <script> tag, and it seems like Excel pretty much deletes that automatically, so that column is never populated when pulling down the sheet. It makes sense, in a way; it resembles executable code and could be a serious virus threat under some conditions. But even if I specify the column as pure text (using Style.NumberFormat = "#" in EPPlus), Excel just makes the entire piece of data go away. It also seems to skew the columns: the subsequent cells shift left by one cell.
Is there any way to (safely) make this work without requiring changes to the downloader's security settings?
I don't have time to check into this, but what if you saved the workbook as a macro-enabled workbook, to enable some of the less secure behavior within the workbook?
One other thing may be to escape the content with a single quote ' in the beginning of the cell, or wrap the entire "script" content with quotes.
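As a rough sketch of the single-quote idea, something like the helper below could add the prefix on export and strip it again on import. The class and method names (CellTextEscaper, Escape, Unescape) are made up for illustration and are not part of EPPlus or Excel, and I have not verified that the prefix alone stops Excel from dropping the cell.

```csharp
using System;

// Hypothetical helper (names are illustrative, not from any library):
// prefixes cell text with a single quote on export and removes it on import.
public static class CellTextEscaper
{
    private const string Prefix = "'";

    // Call before writing the HTML snippet into the cell.
    public static string Escape(string html)
    {
        return Prefix + (html ?? string.Empty);
    }

    // Call after reading the cell back. If the prefix is missing
    // (for example, because the user retyped the cell), the text is returned unchanged.
    public static string Unescape(string cellText)
    {
        if (cellText != null && cellText.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return cellText.Substring(Prefix.Length);
        }
        return cellText;
    }
}
```

One caveat with this approach: a snippet that genuinely starts with a single quote and loses its prefix along the way would be shortened by one character on import, so it only round-trips cleanly if every exported value goes through Escape.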
What version of Excel do you expect to encounter in the wild? I tested this with Excel 2013, and was able to save the following to a workbook and parse it into a DataTable using EPPlus 4.1.0.0:
<script type="text/javascript">$(document).ready(function() {var I =0; console.info(I+100);});</script>
'<script type="text/javascript">$(document).ready(function() {var I =0; console.info(I+100);});</script>
"<script type='text/javascript'>$(document).ready(function() {var I =0; console.info(I+100);});</script>"
Nothing fancy, just iterating over each cell in the workbook, pulling in the value and converting it to a string:
object obj = Worksheets[WorkSheetIndex].Cells[k, l].Value;
I have a tool I made for work. Every week there are 5-20 files for a certain process that fails and I have to find their job ids and rerun them.
I made a tool in C# that takes the names of the failed files in an Excel spreadsheet (we'll call it the Failed File Spreadsheet, or FFS if you're feeling cynical), cross-references them with a different Excel spreadsheet that has the job ids, and displays the result in the terminal. It reads the FFS with fairly simple OleDbDataAdapter code:
public static DataTable GetDataFromExcel(string filename, string sheetName)
{
    // CONN_STR is a connection-string template with <FILENAME> and <HDR> placeholders.
    // HDR=no: the first row is treated as data, not as column headers.
    using (var oledb = new OleDbConnection(CONN_STR.Replace("<FILENAME>", filename).Replace("<HDR>", "no")))
    {
        var result = new DataSet();
        new OleDbDataAdapter($"SELECT * FROM [{sheetName}]", oledb).Fill(result);
        return result.Tables[0];
    }
}
The tool works fine, mostly. It cross-references with another Excel sheet, I get my job ids, and I can carry on with my task.
However, there's one slight issue: when the tool reads from the FFS, it sometimes returns blank lines. For example, if last week I had 7 files, and this week I erased those and pasted in 5 files, my tool will show the job ids for those 5 files just fine, but also show two blanks, as if it's still reading those two extra rows from the previous week. If, however, I make a new blank spreadsheet in Excel, plug in my failed files and overwrite the saved file, I don't have this issue at all, which makes me think this is an Excel issue and not a C# coding issue.
Is there a reason why, if I delete the contents of a cell, the OleDbDataAdapter would still be reading those cells? Like are there whitespace characters or other hidden characters still present after deleting contents? I mean I could fix it in the code and just say "don't write it out if the values are whitespace or null" but I want to know why blank cells are even being read at all.
This is just a minor bug and it's not stopping me from doing my work and this tool is nothing more than a personal tool to help with a weekly task. But I'd still like to know why cells that had content, but then had that content deleted, are still being read.
Excel is a little bit quirky like that. If you are manually editing your "Failed File Spreadsheet" (FFS) and as you say, you are pasting 5 rows over the existing 7 rows, then you may still read in those extra rows after the data you expect, if there is any formatting on the cells. To avoid this, in Excel select the range of cells of the whole sheet and right-click and select "Clear Contents".
To be fair, as you alluded to, I think it would be simpler just to fix it in code and skip rows in the DataTable that are empty. There is also an SO post that shows how to remove empty rows from a DataTable.
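For the "fix it in code" route, a minimal sketch could look like the following. It assumes the DataTable comes from something like GetDataFromExcel in the question; the class and method names (DataTableCleanup, RemoveEmptyRows) are made up for illustration, and this is not necessarily the approach from the linked post.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical helper (names are illustrative): drops rows in which every cell
// is null, DBNull, or whitespace, which is how leftover "blank" Excel rows tend to arrive.
public static class DataTableCleanup
{
    public static DataTable RemoveEmptyRows(DataTable table)
    {
        // Clone() copies the schema (columns) but none of the rows.
        DataTable result = table.Clone();

        foreach (DataRow row in table.Rows)
        {
            bool hasContent = row.ItemArray.Any(
                value => value != null
                         && !(value is DBNull)
                         && !string.IsNullOrWhiteSpace(value.ToString()));

            if (hasContent)
            {
                result.ImportRow(row);
            }
        }

        return result;
    }
}
```

Usage would be along the lines of var files = DataTableCleanup.RemoveEmptyRows(GetDataFromExcel(path, sheet)); the original table is left untouched, so you can still inspect the blank rows if you want to see what Excel handed over.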
My Excel 2007 is set to Auto Calculation Mode.
I have two Excel files. MyUDF (multiple cells) is used in both files.
When I open one file in Excel, I notice no MyUDF is calculated.
But when I open the other file, all MyUDFs are calculated.
So I am a little confused: when will a UDF be calculated on open, and when not?
MyUDF is a UDF in MyAddIn written in C#
Edit
The two files are open in the same instance of Excel. MyUDF is not volatile.
Thanks
Edit
I found http://social.msdn.microsoft.com/Forums/en-US/b477c05a-ae0a-470c-8ad5-482ecd05944b/xll-addin-does-not-calculate-udf-when-opening-a-workbook?forum=exceldev
It says XLA functions will be calculated on open, while XLL and VBA functions will not.
Hmm, this does not match what I see.
Not necessarily. Something like:
Public Function MYUDF() As Date
    Application.Volatile
    MYUDF = Now
End Function
will be re-calculated, however something like:
Public Function MYUDF() As Date
    MYUDF = Now
End Function
may not be re-calculated.
It might be a good idea to set "automatic calculation" in your workbook's Open event. Something like this inside your ThisWorkbook object (this is important!):
Private Sub Workbook_Open()
    Application.Calculation = xlCalculationAutomatic
End Sub
This is just to make sure that the setting is set as you expect within your workbook to avoid relying on what calculation mode your Excel application has in its options/settings. If nothing else, at least it eliminates one possibility for why you are getting this behaviour.
Excel "helpfully" remembers where the function was called from.
With the same function in both spreadsheets, Excel may simply be waiting for the other to open so it can do the calculation. Check the formula in the spreadsheet that does not recalculate, and see if it has changed to something like 'C:\MyDocs\MyOtherSpreadsheet.xlsb'!MyUDF(A1).
If it has, search for the full spreadsheet name (including the extra characters it uses for delimiters) and replace with nothing.
If it hasn't, set the calculation to Automatic. To force a recalculation, either select Recalculate now from the ribbon, press F9, or do a search and replace, and replace all the = with exactly the same.
I have a table of data (calculation results) that the user should be able to export to different formats. I use Interop.Excel to prepare a table with the data and format it using visual formatting (fonts, colors, etc.) and NumberFormat. Example:
cellRange.NumberFormat = "#,##0";
When I save the table as an Excel file all formatting is ok when exporting to .xlsx and .xls:
excelWorkBook.SaveAs(exportFileName, Excel.XlFileFormat.xlOpenXMLWorkbook); // for .xlsx
excelWorkBook.SaveAs(exportFileName, Excel.XlFileFormat.xlExcel8); // for .xls
I also want to give the user the possibility to export this table to .pdf and .xps from the application without having to open the Excel file. As I have prepared the tables in Interop.Excel, I tried exporting the same table to those file formats:
excelWorkBook.ExportAsFixedFormat(Excel.XlFixedFormatType.xlTypePDF,exportFileName); // for .pdf
excelWorkBook.ExportAsFixedFormat(Excel.XlFixedFormatType.xlTypeXPS,exportFileName); // for .xps
Both of these result in good documents except that all NumberFormats are lost resulting in long decimal values of doubles. This is not appropriate for the customer's summary of the data. (Colors and fonts remain as defined in .pdf and .xps.)
I have tried setting .Style and .Styles to "Number" or the like. This does not solve the problem.
I have also tried to protect the Range of cells or the excelWorkSheet. This does not solve the problem either.
Someone suggested calling a VBA macro / sub through C#, but after looking into that, I get the impression that it's not a very straightforward (or stable) path.
I am looking for any help in resolving this issue through Interop.Excel or in another way.
lucn
After some testing it seems clear that the property I named in my comment must be set to false:
Microsoft.Office.Interop.Excel.Application.ActiveWindow.DisplayFormulas = false;
It is not evident why this influences the export to other formats such as *.pdf, but this is clearly the case, and setting .DisplayFormulas = false solves the issue.
Hope this helps somebody.
Excel 2010.
I have a C# app that has a dataset with multiple tables. I want to export this to a workbook where each table is a separate sheet (it is important to keep the order of the data tables and their names).
One possible solution is to loop through each table, put it in its own dataset, save this dataset as XML, then use the Application.Workbooks.OpenXML method.
MSDN OpenXML Documentation
But here is the problem: if I pass the third parameter (which gives a very nice import with filters and everything), Excel succeeds, but it warns me that some columns were imported as text, which is OK with me (one of the columns is UPC, which should be text, not a number).
By displaying this message, it stops the process until the user clicks to confirm that this is acceptable. Then I question myself about how the mother of all Excels is doing these days.
How to prevent this message from popping up?
Or is there another way to do this import with such nice results? (Copy and paste works, but not so nicely; writing to every cell using automation is way too slow; maybe using some Excel library...)
Your turn.
Try
var excelApplication = new Application { DisplayAlerts = false };
or
Workbook excelWorkBook = excelApplication.Workbooks.Open(...);
excelWorkBook.CheckCompatibility = false;
I am making an add-in and I am trying to format the output that my add-in generates, using the "Format as Table" table styles provided by Excel.
These are the ones you get from the 'Home' tab --> 'Format as Table' button on the ribbon.
I am using following code:
SourceRange.Worksheet.ListObjects.Add(XlListObjectSourceType.xlSrcRange,
    SourceRange, System.Type.Missing, xlYesNo, System.Type.Missing).Name = TableName;
SourceRange.Select();
SourceRange.Worksheet.ListObjects[TableName].TableStyle = TableStyleName;
TableStyleName is any style name like TableStyleMedium17; you can see it if you hover over a particular style in Excel.
My problem is that, even if I define the SourceRange as 10 columns, all the columns right to the end get selected and are considered one table.
Because of that, the table I populate right next to it is also considered part of the first table that was generated. Since both tables have the same column names, Excel automatically changes the column names in all the following tables that are generated.
Also, because I am generating the tables in a loop, after 2 tables are generated I get the error:
A table cannot overlap another table.
PS: I am explicitly defining SourceRange as:
var startCell = (Range)worksheet.Cells[startRow, startCol];
var endCell = (Range)worksheet.Cells[endRow, endCol];
var SourceRange = worksheet.get_Range(startCell, endCell);
Kindly suggest a way out.
We were able to figure out what was happening on our end for this:
on the
xlWorkbook.Worksheets.Add([before],[after], Type.Missing, Type.Missing)
call, we had to flip before and after since we wanted the sheets to move right, not left and then accessed
xlWorkbook.Worksheets[sheetCount]
by increasing sheetcount for however many sheets were being generated.
Having it the other way was causing the worksheet to access a previously assigned table format from the SourceRange.Worksheet.ListObjects[TableName].TableStyle = TableStyleName call.
So, I got around this problem a week after posting this; sorry, I did not update in the rush of things.
This actually is built-in Excel functionality.
You can't help it; the Excel application will keep doing this.
So, ultimately I wrote my own table styles in C# and applied them to the Excel range referred to as SourceRange. It's just like writing CSS.
If you are interested in the details, comment on this question or contact me by email from my profile.