I'm currently working with an Excel file whose leading rows contain information I don't need. These extra rows also interfere with importing the data, because the header row sits below them. So I'm trying to remove them before working with the data.
using (var pack = new ExcelPackage(myFileInfo))
{
    // Should return the first worksheet
    var ws = pack.Workbook.Worksheets.FirstOrDefault();
    // Should delete rows 1-5 and shift up the rows after deletion
    ws.DeleteRow(1, 5, true);
}
I was thinking something like the above would work, but I've not had much success with it.
The goal would be to delete rows 1-5, shift up the rest of the data (maybe a merge would work?), then convert it into a DataTable.
Does anyone have tips or resources on removing rows from my Excel sheet (prior to moving it into a DataTable, since that is where the issue occurs)?
The code as you have it will remove the first 5 rows but you also need to do something with the amended file. You could save it in place with:
pack.Save();
or save to a new location with:
pack.SaveAs(new FileInfo(outputFilePath));
I have uploaded a complete example here:
static void Main(string[] args)
{
    var myFileInfo = new FileInfo("Demo.xlsx");
    using (var pack = new ExcelPackage(myFileInfo))
    {
        var ws = pack.Workbook.Worksheets.FirstOrDefault();
        ws.DeleteRow(1, 5, true);
        pack.SaveAs(new FileInfo("output.xlsx"));
    }
}
If you build and run the solution, you can see that it transforms the input file (Demo.xlsx) into the output file (output.xlsx), with the first 5 rows removed and everything shifted up.
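For the follow-on step of getting the trimmed sheet into a DataTable, a minimal sketch is below. It assumes the first remaining row is the header row and that the header names are unique (a DataTable rejects duplicate column names). `WorksheetExport` and `ToTable` are hypothetical names, not part of EPPlus. Every value is read as display text via `.Text`, so all columns come out as strings.

```csharp
using System.Data;
using OfficeOpenXml;

// Hypothetical helper, not part of EPPlus.
public static class WorksheetExport
{
    // Treats the first used row as the header and every later row as data.
    public static DataTable ToTable(ExcelWorksheet ws)
    {
        var table = new DataTable(ws.Name);
        if (ws.Dimension == null) return table; // empty sheet: no columns, no rows

        int firstRow = ws.Dimension.Start.Row;
        int lastRow = ws.Dimension.End.Row;
        int firstCol = ws.Dimension.Start.Column;
        int lastCol = ws.Dimension.End.Column;

        // Build the columns from the header row; blank headers get a generated name.
        for (int c = firstCol; c <= lastCol; c++)
        {
            string header = ws.Cells[firstRow, c].Text;
            table.Columns.Add(string.IsNullOrWhiteSpace(header) ? "Column" + c : header);
        }

        // Copy the remaining rows as text.
        for (int r = firstRow + 1; r <= lastRow; r++)
        {
            DataRow row = table.NewRow();
            for (int c = firstCol; c <= lastCol; c++)
                row[c - firstCol] = ws.Cells[r, c].Text;
            table.Rows.Add(row);
        }
        return table;
    }
}
```

It would be called right after `ws.DeleteRow(1, 5, true);`, for example `DataTable dt = WorksheetExport.ToTable(ws);`. Depending on the EPPlus version, a license setting may need to be configured before the package is opened.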
Just wondering if there's a way to mimic the "Format as Table" Excel function in C# for .csv files.
Context:
The WPF .NET Framework program I've created generates an 8x19 or 7x19 grid of data. The data collected is always different. My program can export this data into a CSV file. This is what it looks like when exported:
My customer wants the data in the CSV file to already be formatted into a table like so:
Is there a way to format it as a table after the data has been exported (besides manually doing it in Excel)?
Look into ClosedXML. Here is the documentation for your use case
It's published under MIT license.
Something like this should do the trick:
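var wb = new XLWorkbook();
var ws = wb.AddWorksheet("Sheet1");
// Fill the cells with your data here, then turn the populated range into a table
var range = ws.Range(1, 1, 50, 5);
var table = range.CreateTable();
table.Theme = XLTableTheme.TableStyleLight12;
// Save as .xlsx ("output.xlsx" is a placeholder path)
wb.SaveAs("output.xlsx");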
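To start from the CSV your program already exports, something along these lines might work. It is only a sketch: `CsvToTable`, `SplitLine`, `WriteTable`, and both path parameters are hypothetical, the first CSV line is assumed to be the header row, and the split is naive (no quoted fields or embedded commas). Since a CSV is plain text and cannot carry table formatting, the result is written as .xlsx.

```csharp
using System;
using System.IO;
using ClosedXML.Excel;

// Hypothetical helper, not part of ClosedXML.
public static class CsvToTable
{
    // Naive split: assumes no quoted fields and no commas inside values.
    public static string[] SplitLine(string line)
    {
        return line.Split(',');
    }

    // Copies the CSV into a worksheet and formats the used range as a table.
    public static void WriteTable(string csvPath, string xlsxPath)
    {
        string[] lines = File.ReadAllLines(csvPath);
        using (var wb = new XLWorkbook())
        {
            var ws = wb.AddWorksheet("Sheet1");
            int columnCount = 0;
            for (int r = 0; r < lines.Length; r++)
            {
                string[] fields = SplitLine(lines[r]);
                columnCount = Math.Max(columnCount, fields.Length);
                for (int c = 0; c < fields.Length; c++)
                    ws.Cell(r + 1, c + 1).Value = fields[c]; // cells are 1-based
            }
            if (lines.Length > 0 && columnCount > 0)
            {
                var table = ws.Range(1, 1, lines.Length, columnCount).CreateTable();
                table.Theme = XLTableTheme.TableStyleLight12;
            }
            wb.SaveAs(xlsxPath);
        }
    }
}
```

Usage would be a single call after the export, for example `CsvToTable.WriteTable("export.csv", "export.xlsx");` with whatever paths your program uses.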
Let me outline my requirement. I have an Excel spreadsheet with multiple pivot tables (linked to charts, slicers, etc.) and 2 worksheets with the data that those pivot tables refer to. Currently I have to manually execute a SQL query, copy the data, paste it over the current data in the spreadsheet and then refresh the pivot tables every day.
This is sub-optimal at best. So what I am trying to achieve is some C# code that I can execute on a schedule.
Using EPPlus, I have managed to load the excel file as a template, create a new one, get the data from SQL, update the 2 datasheets with the new data and then save the file.
using (var templateStream = new MemoryStream(File.ReadAllBytes(@"PATH_TO_TEMPLATE_FILE")))
{
    using (var newStream = new MemoryStream())
    {
        //Create a NEW excel doc from the given template
        using (ExcelPackage excelPackage = new ExcelPackage(newStream, templateStream))
        {
            //load the data from SQL
            DataSet data = LoadDatasetFromQuery(configs, QueueItem);
            //loop over the DataTables inside the DataSet
            for (int i = 1; i <= data.Tables.Count; i++)
            {
                //Resolve the worksheet to put the data on
                var worksheetName = configs.FirstOrDefault(c => c.Name.StartsWith($"Worksheet.{i}."));
                ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets[worksheetName.Value];
                //Put the data on the worksheet top/left = B3
                worksheet.Cells["B3"].LoadFromDataTable(data.Tables[i - 1], false);
            }
            //Save the file to the memory stream
            excelPackage.Save();
        }
        //Write the file to the file system
        File.WriteAllBytes(@"PATH_TO_OUTPUT_FILE", newStream.ToArray());
    }
}
The problem is, when I try to open the Excel file, it says it is corrupt and tries to repair it, which it does, by removing the pivot tables completely. My template file makes use of named ranges as referred to in this SO post, but that has not resolved the issue.
Herewith the excel log of how it completed the "repair"
I have also dabbled a little bit in using the interop library ( Microsoft.Office.Interop.Excel ) but that is really like a black hole when it comes to debugging / documentation etc. I'm not averse to using it, I just don't know how. ( well nothing I have tried works properly anyways )
Any help with the above will be greatly appreciated. If you need more information, feel free to ask.
OK, so it seems my code above was correct, but the Excel template I was loading was dodgy. To correct the issue, I had to make sure that all the pivot tables used named ranges to refer to their data source (click anywhere on the pivot table, then click the Formulas tab in the top ribbon, then click Name Manager), and then use the OFFSET calculation (to enable a dynamic range) as suggested in the link in my post above.
=OFFSET(DataSource!$A$1,0,0,COUNTA(DataSource!$A:$A),COUNTA(DataSource!$1:$1))
where DataSource = the name of the worksheet with the data
Finally, I set up the pivots to refresh their data on opening ( right click on the pivot table, go to data tab and tick the "refresh on open" option )
There is a bit of a pain in that when I open the generated doc it is in "Protected mode", so the data and calcs don't refresh, but if I just click "Enable Editing" it all updates and normal service is resumed. Happy days!
I mostly write number-crunching programs using Visual Studio C# (2019) where I am simply taking input data, calculating results and displaying them. No complicated network or Internet programming. Think first- or second-level college programming course from the early 1990s.
For inputs I was reading in data from an excel file using the following directive:
using Excel = Microsoft.Office.Interop.Excel;
This proved to be very slow when executing the program. I then learned this way of accessing an Excel file is no longer supported and has been superseded by Open XML SDK. Please see the following link to the Microsoft Dev Center page:
https://learn.microsoft.com/en-us/office/open-xml/how-to-parse-and-read-a-large-spreadsheet
For what I want to do the Document Object Model(DOM) approach seems most appropriate for the thousands of individual excel cells I want to read as input data. However, the Microsoft Dev Center is certainly not the most user-friendly resource and the code example provided for reading an Excel file using this DOM approach is writing to a console which I'm not using. I never did get my code to work.
Long and short of it is, I got my code working using the GetCellValue Method:
https://learn.microsoft.com/en-us/office/open-xml/how-to-retrieve-the-values-of-cells-in-a-spreadsheet
However, this 'GetCellValue' method is still taking way too long. I need to read in thousands or tens of thousands of Excel input data cells in seconds or fractions of seconds not 20 seconds to a minute.
I think if I had an example of the DOM method reading in Excel data to an Array Variable (instead of writing to the console) it would help. Can anyone provide an example of such code?
Below I have included my code example where I modified the DOM approach code copied from the Microsoft Office Dev Center to write values from a source Excel File to a DataGrid instead of the Console used by the Dev Center code:
C#
// The DOM approach.
// Note that the code below works only for cells that contain numeric values.
//
public void ReadExcelFileDOM(string fileName)
{
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
        SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
        DataGridView_Vessel.Rows.Clear();
        DataGridView_Vessel.Refresh();
        string text;
        int File_Row = 0;
        int File_Cell = 0;
        foreach (Row r in sheetData.Elements<Row>())
        {
            // Restart the column index for every row; without this, only the first row is filled
            File_Cell = 0;
            DataGridView_Vessel.Rows.Add();
            foreach (Cell c in r.Elements<Cell>())
            {
                if (c.CellValue == null)
                {
                    File_Cell++;
                    //continue;
                }
                else
                {
                    text = c.CellValue.Text;
                    if (File_Cell < 12)
                    {
                        DataGridView_Vessel.Rows[File_Row].Cells[File_Cell].Value = text;
                    }
                    File_Cell++;
                }
            }
            File_Row++;
        }
        //Console.WriteLine();
        //Console.ReadKey();
    }
}
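A DOM read into an in-memory collection instead of a grid or the console, as asked for above, might look like the sketch below. `SheetReader`, `ColumnIndex`, `ReadFirstSheet`, and the `columnCount` parameter are hypothetical names. It assumes cells are either numeric or shared strings, and it takes the column position from each cell's reference (such as "C7") rather than from a running counter, because the file format leaves empty cells out of a row. The shared strings are copied into an array once so that each lookup is a plain index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper, not part of the Open XML SDK.
public static class SheetReader
{
    // Converts the column letters of a cell reference to a zero-based index
    // ("A1" -> 0, "Z9" -> 25, "AB12" -> 27).
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char ch in cellReference)
        {
            if (!char.IsLetter(ch)) break;
            index = index * 26 + (char.ToUpperInvariant(ch) - 'A' + 1);
        }
        return index - 1;
    }

    // Reads the first worksheet part into one string array per row.
    public static List<string[]> ReadFirstSheet(string fileName, int columnCount)
    {
        var result = new List<string[]>();
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fileName, false))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;
            // Materialize the shared strings once instead of searching per cell.
            string[] shared = workbookPart.SharedStringTablePart == null
                ? new string[0]
                : workbookPart.SharedStringTablePart.SharedStringTable
                    .Elements<SharedStringItem>().Select(s => s.InnerText).ToArray();

            SheetData sheetData = workbookPart.WorksheetParts.First()
                .Worksheet.Elements<SheetData>().First();
            foreach (Row r in sheetData.Elements<Row>())
            {
                var values = new string[columnCount];
                foreach (Cell c in r.Elements<Cell>())
                {
                    if (c.CellValue == null || c.CellReference == null) continue;
                    int col = ColumnIndex(c.CellReference.Value);
                    if (col < 0 || col >= columnCount) continue;
                    bool isShared = c.DataType != null
                        && c.DataType.Value == CellValues.SharedString;
                    values[col] = isShared
                        ? shared[int.Parse(c.CellValue.Text)]
                        : c.CellValue.Text;
                }
                result.Add(values);
            }
        }
        return result;
    }
}
```

For the 12-column sheet above, the call would be `List<string[]> rows = SheetReader.ReadFirstSheet(fileName, 12);`, after which the values can be parsed into numbers without touching the file again.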
Sorry for my English. I use the EPPlus library and I really like it, but I've run into a problem: out of memory. I need to write large amounts of data, no matter what. Is it possible to append to the end of an Excel file without holding the whole file in memory? Or to create multiple files and then concatenate them into one? Thanks in advance.
1) If you retrieve your data from a database, use a DataReader instead of a DataTable.
2) Write the Excel file to a temp file and delete it when done (in a web environment, use Response.WriteFile, then delete it).
3) Write the header first, then append data to it.
Something like this (I'm typing this on my phone):
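// fileInfo (a FileInfo) and reader (a data reader) are assumed to exist already
var pck = new ExcelPackage();
var ws = pck.Workbook.Worksheets.Add("sheet1");
// write header here
pck.SaveAs(fileInfo);
pck.Dispose();
// re-open the file that was just written
pck = new ExcelPackage(fileInfo);
ws = pck.Workbook.Worksheets["sheet1"];
var rowIndex = 0;
while (reader.Read())
{
    if (++rowIndex % 100000 == 0)
    {
        // save and re-open
        pck.Save();
        pck.Dispose();
        pck = new ExcelPackage(fileInfo);
        ws = pck.Workbook.Worksheets["sheet1"];
    }
    // write row here
}
pck.Save();
pck.Dispose();
// send file / delete file etc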
What I'm trying to accomplish
My app generates some tabular data
I want the user to be able to launch Excel and click "paste" to place the data as cells in Excel
Windows accepts a format called "CommaSeparatedValue" that is used with its APIs, so this seems possible
Putting raw text on the clipboard works, but trying to use this format does not
NOTE: I can correctly retrieve CSV data from the clipboard, my problem is about pasting CSV data to the clipboard.
What I have tried that isn't working
Clipboard.SetText()
System.Windows.Forms.Clipboard.SetText(
    "1,2,3,4\n5,6,7,8",
    System.Windows.Forms.TextDataFormat.CommaSeparatedValue
);
Clipboard.SetData()
System.Windows.Forms.Clipboard.SetData(
    System.Windows.Forms.DataFormats.CommaSeparatedValue,
    "1,2,3,4\n5,6,7,8"
);
In both cases something is placed on the clipboard, but when pasted into Excel it shows up as one cell of garbage text: "–§žý;pC¦yVk²ˆû"
Update 1: Workaround using SetText()
As BFree's answer shows, SetText with tab-delimited text and TextDataFormat.Text serves as a workaround:
System.Windows.Forms.Clipboard.SetText(
"1\t2\t3\t4\n5\t6\t7\t8",
System.Windows.Forms.TextDataFormat.Text
);
I have tried this and confirm that now pasting into Excel and Word works correctly. In each case it pastes as a table with cells instead of plaintext.
Still curious why CommaSeparatedValue is not working.
The .NET Framework places DataFormats.CommaSeparatedValue on the clipboard as Unicode text. But as mentioned at http://www.syncfusion.com/faq/windowsforms/faq_c98c.aspx#q899q, Excel expects CSV data to be a UTF-8 memory stream (it is difficult to say whether .NET or Excel is at fault for the incompatibility).
The solution I've come up with in my own application is to place two versions of the tabular data on the clipboard simultaneously: as tab-delimited text and as a CSV memory stream. This allows the destination application to acquire the data in its preferred format. Notepad and Excel prefer the tab-delimited text, but you can force Excel to grab the CSV data via the Paste Special... command for testing purposes.
Here is some example code (note that the WPF equivalents of the WinForms classes are used here):
// Generate both tab-delimited and CSV strings.
string tabbedText = //...
string csvText = //...
// Create the container object that will hold both versions of the data.
var dataObject = new System.Windows.DataObject();
// Add tab-delimited text to the container object as is.
dataObject.SetText(tabbedText);
// Convert the CSV text to a UTF-8 byte stream before adding it to the container object.
var bytes = System.Text.Encoding.UTF8.GetBytes(csvText);
var stream = new System.IO.MemoryStream(bytes);
dataObject.SetData(System.Windows.DataFormats.CommaSeparatedValue, stream);
// Copy the container object to the clipboard.
System.Windows.Clipboard.SetDataObject(dataObject, true);
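The two strings elided at the top of that snippet could be built along these lines. This is only a sketch: `DelimitedText`, `ToTabbed`, and `ToCsv` are hypothetical names, and the quoting follows the usual CSV convention (a field containing a comma, a double quote, or a line break is wrapped in double quotes, with embedded quotes doubled). The tab-delimited version assumes no cell contains a tab or line break.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: turns rows of cell values into the two clipboard strings.
public static class DelimitedText
{
    // Tab-delimited text: tabs between cells, CRLF between rows.
    public static string ToTabbed(IEnumerable<IEnumerable<string>> rows)
    {
        return string.Join("\r\n", rows.Select(r => string.Join("\t", r)));
    }

    // CSV text: commas between cells, CRLF between rows, quoting where needed.
    public static string ToCsv(IEnumerable<IEnumerable<string>> rows)
    {
        return string.Join("\r\n",
            rows.Select(r => string.Join(",", r.Select(f => Quote(f)))));
    }

    // Quotes a field only when it contains a comma, a quote, or a line break.
    private static string Quote(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
            return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }
}
```

With `var rows = new[] { new[] { "1", "2" }, new[] { "3", "4" } };`, the assignments become `string tabbedText = DelimitedText.ToTabbed(rows);` and `string csvText = DelimitedText.ToCsv(rows);`.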
Use tabs instead of commas, i.e.:
Clipboard.SetText("1\t2\t3\t4\t3\t2\t3\t4", TextDataFormat.Text);
Just tested this myself, and it worked for me.
I have had success pasting into Excel using \t (see BFree's answer) as column separators and \n as row separators.
I got the most success defeating formatting issues by using a CSV library (KBCsv) to write the data into a CSV file in the temp folder then open it in Excel with Process.Start(). Once it is in Excel the formatting bit is easy(er), copy-paste from there.
string filePath = System.IO.Path.GetTempPath() + Guid.NewGuid().ToString() + ".csv";
using (var streamWriter = new StreamWriter(filePath))
using (CsvWriter csvWriter = new CsvWriter(streamWriter))
{
    // set the separator before writing any records
    csvWriter.ValueSeparator = ',';
    // optional header
    csvWriter.WriteRecord(new List<string>(){"Heading1", "Heading2", "YouGetTheIdea" });
    foreach (var thing in YourListOfThings ?? new List<OfThings>())
    {
        if (thing != null)
        {
            List<string> csvLine = new List<string>
            {
                thing.Property1, thing.Property2, thing.YouGetTheIdea
            };
            csvWriter.WriteRecord(csvLine);
        }
    }
}
// open the CSV in whatever application is registered for it (normally Excel)
Process.Start(filePath);
BYO error handling & logging.