I am calling a web service and the data from the web service is in csv format.
If I try to save data in xls/xlsx, then I get multiple sheets in a workbook.
So, how can I save the data in csv with multipletab/sheets in c#.
I know csv with multiple tabs is not practical, but is there any damn way or any library to save data in csv with multiple tabs/sheet?
CSV, as a file format, assumes one "table" of data; in Excel terms that's one sheet of a workbook. While it's just plain text, and you can interpret it any way you want, the "standard" CSV format does not support what your supervisor is thinking.
You can fudge what you want a couple of ways:
Use a different file for each sheet, with related but distinct names, like "Book1_Sheet1", "Book1_Sheet2" etc. You can then find groups of related files by the text before the first underscore. This is the easiest to implement, but requires users to schlep around multiple files per logical "workbook", and if one gets lost in the shuffle you've lost that data.
Do the above, and also "zip" the files into a single archive you can move around. You keep the pure CSV advantage of the above option, plus the convenience of having one file to move instead of several, but the downside of having to zip/unzip the archive to get to the actual files. To ease the pain, if you're in .NET 4.5 you have access to a built-in ZipFile implementation, and if you are not you can use the open-source DotNetZip or SharpZipLib, any of which will allow you to programmatically create and consume standard Windows ZIP files. You can also use the nearly universal .tar.gz (aka .tgz) combination, but your users will need either your program or a third-party compression tool like 7Zip or WinRAR to create the archive from a set of exported CSVs.
Implement a quasi-CSV format where a blank line (containing only a newline) acts as a "tab separator", and your parser would expect a new line of column headers followed by data rows in the new configuration. This variant of standard CSV may not readable by other consumers of CSVs as it doesn't adhere to the expected file format, and as such I would recommend you don't use the ".csv" extension as it will confuse and frustrate users expecting to be able to open it in other applications like spreadsheets.
If I try to save data in xls/xlsx, then I get multiple sheets in a workbook.
Your answer is in your question, don't use text/csv (which most certainly can not do multiple sheets, it can't even do one sheet; there's no such thing as a sheet in text/csv though there is in how some applications like Excel or Calc choose to import it into a format that does have sheets) but save it as xls, xlsx, ods or another format that does have sheets.
Both XLSX and ODS are much more complicated than text/csv, but are each probably the most straightforward of their respective sets of formats.
I've been using this library for a while now,
https://github.com/SheetJS/js-xlsx
in my projects to import data and structure from formats like: xls(x), csv and xml but you can for sure save in that formats as well (all from client)!
Hope that can help you,, take a look on online demo,
http://oss.sheetjs.com/js-xlsx/
peek in source code or file an issue on GH? but I think you will have to do most coding on youre own
I think you want to reduce the size of your excel file. If yes then you can do it by saving it as xlsb i.e., Excel Binary Workbook format. Further, you can reduce your file size by deleting all the blank cells.
Related
Good afternoon,
we have a small problem with performance of generating excel.
First, we was creating excel cell by cell - it is ... let's say unacceptable.
Second, we started insert into excel with one command - range creating and it is much faster, but still not perfect so we are searching next solutions.
Because we can load XML file from database, we tried used XSLT and from these two files create xls file. It is nice, but after open this file there is error message shown (it is because of problem or bug in registry). User has to accept this message and after excel is opened. We want to eliminate this error message. However we don't know how.
We was thinking about convert this xls file into xlsx but we are unable to do it becouse we can't install office on server (we cannot use Interop) and OpenXML libraries don't know work with normal xls file. So my question is:
Is possible to generate from XML file with using of some XLST (or something) the xlsx file?
Eventually can what files do we need to create and zip together if we want create xlsx file?
Thank you for information
You mention not being able to use the OpenXML libraries because they don't work with .xls files, but you also say "creating cell by cell", which implies that you are generating the file from scratch. Where is the xls file coming from? You mention excel opening, but then say you can't install it on the server. So, it appears to me that a user is uploading an xls file to your server, and then you are doing something with it and giving it back to them? If that is the case and you must be able to read/write an xls file without installing office, then I would suggest using ExcelLibrary, as mentioned in this post
Indeed, creating an xlsx file is much magnitudes faster with the open xml sdk.
I have processes running on Windows XP/7. They generate weekly .csv data files. I have a bunch of excel formulas that crunch the numbers for each .csv file produced for the week separately and then when adding the weekly data to the one big spreadsheet containing all the data put together.
The number of rows varies each week and for each process. So I can't hardcode that number in my dozens of formulas. So right now I go through this stupid process of manually entering the formulas each week into the .csv files.
There's got to be a way of automating this. Just now I quickly looked into doing this through C# or VB code. Could somebody recommend the best way to do this. Is C# or VB the right way to go? If so, any hints on how to put it all together - what's the model to use? For example, would it look something like this:
C# module reads in .csv data file
C# module creates an Excel spreadsheet and populates it with the .csv data
C# module runs my formulas on the all the rows.
Is that how one would approach it? Is there a better way for somebody who has very limited knowledge of C# or VB? I know Java and C++.
Any advice would be highly appreciated.
Thanks
From your explanations in comments, it appears that having a series of template Excel sheets would greatly facilitate the task.
So, for each process that generates data, you say the formulas are always the same, meaning that the columns are always the same (am I right?).
So, even if you don't know how many rows of data, you can still either create a template where only the first row is filled with formulas, and then you simply copy that row over and over, filling it with data as needed, or, you could fill a relatively "comfortable" number of rows with those same formulas, and fill in the data.
There are tons of atricles on how to Interop with Excel, so it's beyond my intent to provide you with specific code, but the idea is good.
If I can allow myself, I have worked in the past with a very interesting tool call Flexcel Studio for .NET, and I have found it to be of great help when it came to generating Excel sheets based on such templates.
Cheers
As others have suggested, I would recommend performing the calculations outside of excel if possible. There are plenty of stats libraries out there that are friendlier to work with than going through the hassle of moving data into excel, applying formulas to cell ranges, and so on.
If you really want to go the excel route, you can either use open-source libraries such as EPPLUS (.NET) or POI (Java) to work with .XLSX files directly. Some libraries do not support function evaluation so you will need to consider this when deciding on a library to use.
If you go with COM interop, you should read about about the following: Considerations for server-side Automation of Office.
As for the C# or VB (if not java with POI), I would go with C#. C# syntax is similar to java.
There might be a really simple solution to this problem.
Add 1 piece of auxiliary data to the .csv file either programmatically when running my process or when creating the .xlsx file (with all the formulas) from the .csv file. The auxiliary piece of data is the row count which will be in some known location.
Then modify all my formulas to use the INDIRECT function to specify the range using the cell
with the auxiliary piece of data.
I think that might work.
No, ADO.NET will not solve my problem because the excel files I'm working with do not contain information in tabular form. In other words, there is nothing to query, and the name of the sheets and number of sheets will vary.
essentially my job is to search every single cell in an excel document and validate it against some other data.
Right now all I have is a byte[] array that represents the contents of an .xls file. Converting to a string is meaningless since it's just binary data.
If I use COM interop and run Excel in the background, is it possible to inject it with binary data in byte[] array form or do I have to save the file to disk and then automate the process of opening it and scanning each row?
Isn't there an easier way to do it?
How do you read the binary data of an excel file (.xls) using .NET
There are a number of ways, the excel file format has changed a few times so reading the files natively is hard work and version dependent, it's usually not recommended. For reading tabular data most people choose ADO.NET, but as you allude, if you need any formatting or discovery then MS would recommend COM Interop.
If I use COM interop and run Excel in the background, is it possible to inject it with binary data in byte[] array form
The excel COM object model does allow you to bulk set data to a Range object you set it with a 2 dimensional object array (object[,])
or do I have to save the file to disk and then automate the process of opening it and scanning each row?
No, you can interact with the "out of process" COM server (Excel) without having to save first, you can set your data, format it etc in memory.
Isn't there an easier way to do it?
Yes there is, checkout Spreadsheet Gear their object model is nearly identical to the com model, however you do not need Excel involved at all, it is also an order of magnitude faster working with large data. Its not cheap ($1000 bucks last time I checked) but will save you way more than that in coding effort. (I am not affiliated with Spreadsheet gear in any way)
You could use NPOI to open & read your XLS files, you'll basically want to loop through your Sheets / Rows / Columns looking for data. I commonly use NPOI to read & write XLS forms that contain data in random cells throughout a worksheet.
I'd like to transform my XMLDocument containing what you'd call simple xml content into an xls file, possibly using XSLT (judging from what I've found so far) to transform the data, and I'd like the save the created file as an xls file to be opened with Excel whilst not being able to use Excel in the process of creating this file (thus being unable to use MS.Office.Interop.Excel from what I've heard). How exactly would I do this and which classes would I utilize to create this document?
Any help on this one would be greatly appreciated!
Thanks,
Dennis
Probably your best bet is NPOI, a .NET port of the Apache POI library.
You can't use XSLT to create a XLS file, because XLS is a binary format.
If you are talking about ExcelML, this can be done. If you want to do it by hand, I think the best way to start is the following:
Create an ExcelML file using Excel that looks like what you need. Open this file in a text editor and analyse it. Additionally, you should download the reference of the format (Open Help -> OfficeXMLSDK.chm).
There are 3 wasy to convert XML to Excel XLS.
- If the XLS file layout is simple, just transform your XML to an HTML table, and load in Excel (either from a disk file, or HTTP response). This works for many cases, however beware that column / row spanning really confuses Excel.
- If you have a complex layout, with lots of formatting / worksheets, jump into the OfficeXMLSDK, it's pretty hairy with tons of namespaces
- If you have Excel on the server, create a template document in Excel, and use the Excel.Interop
I have a DataTable with 3 columns (a,b,c) and a docx file with the corresponding MailMerge fields set up. What I'd like to do is perform a Mail Merge on the document with the data.
Presume that you can write to the hard disk (if you need to create a csv etc for doing the merge etc), you don't have word, excel etc, Open XML SDK is installed, but equally we can install anything else.
In terms of the answer, converting the input data to whatever is needed isn't really a problem, the problem is how to perform a mail merge in the Open XML SDK (or other FREE API).
As a side note, the output should be one file with n pages (where n is the number of rows in the data), i.e. not n documents (though I don't mind if the merge of documents is done at the end).
(I should add, I'm not tied to the MailMerge concept, being able to just do a replace for example would work - though obviously that then requires merging the files together at the end...)
I've got this working in a pretty horrible way - basically - at the moment, the algorithm follows this:
Unzip docx file
Read in document.xml (to a string)
string.Replace the fields
Rezip to temporary docx
Merge all the temporary documents created
The actual code for merging the documents comes from Eric White's Blog : http://blogs.msdn.com/ericwhite/archive/2009/02/05/move-insert-delete-paragraphs-in-word-processing-documents-using-the-open-xml-sdk.aspx