C# ExcelDataReader Error - 'Invalid file signature' for XLSB format

I receive an 'Invalid file signature' error when I try to read an .xlsb file using the code below.
If I use CreateReader instead, I receive a 'Detected ZIP file, but not a valid OpenXml file' error. I have also tried the other options listed below, but none of them works.
Can somebody help me read an .xlsb file?
Stream stream = new MemoryStream(srcContent);
public static DataSet GetXLSBData(Stream stream)
{
DataSet dataSet;
using (var reader = ExcelReaderFactory.CreateBinaryReader(stream))
{
dataSet = reader.AsDataSet();
}
foreach (DataTable table in dataSet.Tables)
{
table.TableName = table.TableName.Trim();
}
return dataSet;
}
Other options tried:
var reader = ExcelReaderFactory.CreateOpenXmlReader(stream)
var reader = ExcelReaderFactory.CreateCsvReader(stream)
var reader = ExcelReaderFactory.CreateReader(stream)
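Before choosing a reader, it may help to check what the bytes in the stream actually are. As far as I know, CreateBinaryReader targets the legacy .xls format, which is stored in an OLE2 compound document, while .xlsb (like .xlsx) is a ZIP package, and that would be consistent with both error messages. The sketch below only inspects the first few bytes of a seekable stream. SpreadsheetSniffer and DescribeContainer are hypothetical names for illustration and are not part of ExcelDataReader.

```csharp
using System;
using System.IO;

public static class SpreadsheetSniffer
{
    // ZIP local file header "PK\x03\x04" (container used by .xlsx and .xlsb).
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };
    // OLE2 compound document header (container used by legacy .xls).
    private static readonly byte[] Ole2Magic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Returns "zip", "ole2" or "unknown" and leaves the stream position unchanged.
    public static string DescribeContainer(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("Stream must be seekable.", nameof(stream));

        long start = stream.Position;
        var header = new byte[8];
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) break; // end of stream
            read += n;
        }
        stream.Position = start; // rewind so a reader can still consume the stream

        if (StartsWith(header, read, ZipMagic)) return "zip";
        if (StartsWith(header, read, Ole2Magic)) return "ole2";
        return "unknown";
    }

    private static bool StartsWith(byte[] data, int length, byte[] magic)
    {
        if (length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
        {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }
}
```

Calling SpreadsheetSniffer.DescribeContainer(stream) before GetXLSBData would show which container the content really uses. If it reports "zip" and CreateReader still rejects the file, it may be worth checking whether the installed ExcelDataReader version lists .xlsb among its supported formats.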

My proposal
C# code:
using (XlsxOrXlsbReadOrEdit excelFile = new XlsxOrXlsbReadOrEdit())
{
excelFile.Open("file.xlsx");
excelFile.ActualSheetName = "sheet1";
object[] row = null;
while (excelFile.Read())
{
if (row == null)
{
row = new object[excelFile.FieldCount];
}
excelFile.GetValues(row);
}
}
Disclaimer: I am the creator of SpreadSheetTasks.
Links:
https://www.nuget.org/packages/SpreadSheetTasks/

Related

How to Export List Data using NPOI

I'm using NPOI to export data to Excel.
I created a List that pulls data from my database.
My question is: how can I read my list data and write it to my Excel sheet?
The following is part of my code:
IWorkbook workbook;
workbook = new NPOI.XSSF.UserModel.XSSFWorkbook();
ISheet excelSheet = workbook.CreateSheet("Candidates");
IRow row = excelSheet.CreateRow(0);
foreach (var data in ApplicationList)
{
}
workbook.Write(fs);
So basically I need help with the body of foreach (var data in ApplicationList).

When writing data, you create cells and call SetCellValue to set their contents.
Below I iterate over a single column and a list of strings.
This works fine on my system.
// HSSFWorkbook produces the legacy .xls format; use XSSFWorkbook for .xlsx.
IWorkbook workbook = new HSSFWorkbook();
ISheet excelSheet = workbook.CreateSheet("Candidates");
// No outer "row" variable here: it would clash with the "row" declared inside the lambda below.
var applicantList = new List<string> { "David", "Paul" };
var excelColumns = new[] { "Name" };
IRow headerRow = excelSheet.CreateRow(0);
var headerColumn = 0;
excelColumns.ToList().ForEach(excelColumn =>
{
var cell = headerRow.CreateCell(headerColumn);
cell.SetCellValue(excelColumn);
headerColumn++;
});
var rowCount = 1;
applicantList.ForEach(applicant => {
var row = excelSheet.CreateRow(rowCount);
var cellCount = 0;
excelColumns.ToList().ForEach(column => {
var cell = row.CreateCell(cellCount);
cell.SetCellValue(applicant);
cellCount++;
});
rowCount++;
});
var stream = new MemoryStream();
workbook.Write(stream);
string FilePath = "/Users/hemkumar/hem.xls"; //path to download
FileStream file = new FileStream(FilePath, FileMode.CreateNew,
FileAccess.Write);
stream.WriteTo(file);
file.Close();
stream.Close();
I hope it helps.

I know I am a little late here, but I think this may help others.
I have developed an Excel utility on top of the NPOI package, which
simply takes your data table or collection
and returns an Excel file while keeping the data types of the data table/list intact.
GitHub code repo: https://github.com/ansaridawood/.NET-Generic-Excel-Export-Sample/tree/master/GenericExcelExport/ExcelExport
If you are looking for an explanation of the code, you can find it here:
https://www.codeproject.com/Articles/1241654/Export-to-Excel-using-NPOI-Csharp-and-WEB-API
It uses the NPOI DLL and consists of two .cs files; include them and you are good to go.
Below is the first file for reference, AbstractDataExport.cs:
using NPOI.SS.UserModel;
using NPOI.XSSF.UserModel;
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
namespace GenericExcelExport.ExcelExport
{
public interface IAbstractDataExport
{
HttpResponseMessage Export<T>(List<T> exportData, string fileName, string sheetName);
}
public abstract class AbstractDataExport : IAbstractDataExport
{
protected string _sheetName;
protected string _fileName;
protected List<string> _headers;
protected List<string> _type;
protected IWorkbook _workbook;
protected ISheet _sheet;
private const string DefaultSheetName = "Sheet1";
public HttpResponseMessage Export<T>
(List<T> exportData, string fileName, string sheetName = DefaultSheetName)
{
_fileName = fileName;
_sheetName = sheetName;
_workbook = new XSSFWorkbook(); //Creating New Excel object
_sheet = _workbook.CreateSheet(_sheetName); //Creating New Excel Sheet object
var headerStyle = _workbook.CreateCellStyle(); //Formatting
var headerFont = _workbook.CreateFont();
headerFont.IsBold = true;
headerStyle.SetFont(headerFont);
WriteData(exportData); //your list object to NPOI excel conversion happens here
//Header
var header = _sheet.CreateRow(0);
for (var i = 0; i < _headers.Count; i++)
{
var cell = header.CreateCell(i);
cell.SetCellValue(_headers[i]);
cell.CellStyle = headerStyle;
}
for (var i = 0; i < _headers.Count; i++)
{
_sheet.AutoSizeColumn(i);
}
using (var memoryStream = new MemoryStream()) //creating memoryStream
{
_workbook.Write(memoryStream);
var response = new HttpResponseMessage(HttpStatusCode.OK)
{
Content = new ByteArrayContent(memoryStream.ToArray())
};
response.Content.Headers.ContentType = new MediaTypeHeaderValue
("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
response.Content.Headers.ContentDisposition =
new ContentDispositionHeaderValue("attachment")
{
FileName = $"{_fileName}_{DateTime.Now.ToString("yyyyMMddHHmmss")}.xlsx"
};
return response;
}
}
//Generic Definition to handle all types of List
public abstract void WriteData<T>(List<T> exportData);
}
}
And this is the second and final file, AbstractDataExportBridge.cs:
using NPOI.SS.UserModel;
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Text.RegularExpressions;
namespace GenericExcelExport.ExcelExport
{
public class AbstractDataExportBridge : AbstractDataExport
{
public AbstractDataExportBridge()
{
_headers = new List<string>();
_type = new List<string>();
}
public override void WriteData<T>(List<T> exportData)
{
PropertyDescriptorCollection properties = TypeDescriptor.GetProperties(typeof(T));
DataTable table = new DataTable();
foreach (PropertyDescriptor prop in properties)
{
var type = Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType;
_type.Add(type.Name);
table.Columns.Add(prop.Name, Nullable.GetUnderlyingType(prop.PropertyType) ??
prop.PropertyType);
string name = Regex.Replace(prop.Name, "([A-Z])", " $1").Trim(); //space separated
//name by caps for header
_headers.Add(name);
}
foreach (T item in exportData)
{
DataRow row = table.NewRow();
foreach (PropertyDescriptor prop in properties)
row[prop.Name] = prop.GetValue(item) ?? DBNull.Value;
table.Rows.Add(row);
}
IRow sheetRow = null;
for (int i = 0; i < table.Rows.Count; i++)
{
sheetRow = _sheet.CreateRow(i + 1);
for (int j = 0; j < table.Columns.Count; j++)
{
ICell Row1 = sheetRow.CreateCell(j);
string type = _type[j].ToLower();
var currentCellValue = table.Rows[i][j];
if (currentCellValue != null &&
!string.IsNullOrEmpty(Convert.ToString(currentCellValue)))
{
if (type == "string")
{
Row1.SetCellValue(Convert.ToString(currentCellValue));
}
else if (type == "int32")
{
Row1.SetCellValue(Convert.ToInt32(currentCellValue));
}
else if (type == "double")
{
Row1.SetCellValue(Convert.ToDouble(currentCellValue));
}
}
else
{
Row1.SetCellValue(string.Empty);
}
}
}
}
}
}
For a detailed explanation, refer to the CodeProject link given at the beginning.

I'm using NPOI to export data to Excel too.
My list, however, pulls its data from another Excel file that was created by NPOI.
Anyway, I think my solution, which is not much different from yours, can be effective.
Read the description after the code sample below.
await using var stream = new FileStream(@"C:\Users\Sina\Desktop\TestExcel.xlsx", FileMode.OpenOrCreate, FileAccess.Write);
IWorkbook workbook = new XSSFWorkbook();
var excelSheet = workbook.CreateSheet("TestSheet");
for (var i = 0; i < MyDataList.Count(); i++)
{
var row = excelSheet.CreateRow(i);
for (var j = 0; j < MyDataList[i].Cells.Count(); j++)
{
var cell = row.CreateCell(j);
cell.SetCellValue(MyDataList[i].Cells[j].ToString());
}
}
workbook.Write(stream);
As I said, instead of a list filled from a database, I used a list filled from another Excel file that I read through NPOI.
You can see it in the code snippet above (MyDataList).
It is of type List<IRow>.
You have to create as many rows as there are items in your list, so create one on each pass through the outer loop: var row = excelSheet.CreateRow(i)
Each row has several cells, which I fill in a second loop, so create one cell on each pass through the inner loop: var cell = row.CreateCell(j)
You can then call cell.SetCellValue() to set each cell's data, using the data from your own list in place of MyDataList[i].Cells[j].
Note that SetCellValue() has overloads for string, double, bool and DateTime; I convert everything to a string here with ToString().
I also tried the AddRange() method instead of the second loop (like this: row.Cells.AddRange(FailedRowList[i].Cells)), but it didn't work. If you manage to use it, I would appreciate it if you let me know. I hope my answer was helpful.
Thanks

Error: 'FileId' field header not found. Parameter name: name

I am new to CsvHelper, my apologies if I have missed something in the documentation.
I have a CSV file with 200-odd columns. Typically there would be close to 65,000 rows. Importing these rows into a SQL database table was fine until I added a new field to the table called "FileId", which does not exist in the CSV file. I wish to inject this field and the relevant value.
How do I do this please?
Please see code below I am using:
const string fileToWorkWith = @"C:\Data\Fidessa ETP Files\Import\2019\myCsvFile.csv";
Output.WriteLn($"Working with file {fileToWorkWith}.");
const string databaseConnectionString = "Server=MyServer;Database=DB;User Id=sa; Password = xyz;";
Output.WriteLn($"Checking if working file exists.");
if (new System.IO.FileInfo(fileToWorkWith).Exists == false)
{
Output.WriteLn("Working file does not exist.", Output.WriteTypes.Error);
return;
}
Output.WriteLn("Reading file.");
using (var reader = new CsvReader(new StreamReader(fileToWorkWith), true, char.Parse(",") ))
{
reader.Columns = new List<LumenWorks.Framework.IO.Csv.Column>
{
new LumenWorks.Framework.IO.Csv.Column { Name = "FileId", Type = typeof(int), DefaultValue = "1" },
};
reader.UseColumnDefaults = true;
Output.WriteLn("Checking fields in file exist in the Database.");
foreach (var fieldName in reader.GetFieldHeaders())
{
if (Fields.IsValid(fieldName.Replace(" ","_")) == false)
{
Output.WriteLn($"A new field named {fieldName} has been found in the file that does not exist in the database.", Output.WriteTypes.Error);
return;
}
}
using (var sbc = new SqlBulkCopy(databaseConnectionString))
{
sbc.DestinationTableName = "FidessaETP.tableARC_EventsOrderAndFlow_ImportTest";
sbc.BatchSize = 1000;
Output.WriteLn("Mapping available Csv Fields to DB Fields");
foreach (var field in reader.GetFieldHeaders().ToArray())
{
sbc.ColumnMappings.Add(field, field.Replace(" ", "_"));
}
sbc.WriteToServer(reader);
}
}
The Error Details
Message:
'FileId' field header not found. Parameter name: name
Source:
LumenWorks.Framework.IO
Stack Trace:
System.ArgumentException: 'FileId' field header not found. Parameter
name: name at
LumenWorks.Framework.IO.Csv.CsvReader.System.Data.IDataRecord.GetOrdinal(String
name) at
System.Data.SqlClient.SqlBulkCopy.WriteRowSourceToServerCommon(Int32
columnCount) at
System.Data.SqlClient.SqlBulkCopy.WriteRowSourceToServerAsync(Int32
columnCount, CancellationToken ctoken) at
System.Data.SqlClient.SqlBulkCopy.WriteToServer(IDataReader reader) at
Haitong.Test.CsvImporter.Program.Main(String[] args) in
C:\Development\Workspaces\UK OPS Data Warehouse\UK OPS Data
Warehouse\Haitong.Test.CsvImporter\Program.cs:line 86

You might be able to solve the problem by loading the CSV data into a DataTable, adding a FileId column with a default value to the table, and passing the DataTable into the SqlBulkCopy. It doesn't look like your current solution loads the whole file into memory, so you should monitor memory usage if you try this approach. You might be able to get your current solution to work if you dig through the documentation of the Columns property of the CsvReader. It looks like it does not behave the way you are trying to use it.
Here is an example of how you might load the file using a DataTable:
DataTable csvTable = new DataTable();
using (var reader = new StreamReader("path\\to\\file.csv"))
{
using (var csv = new CsvReader(reader, true))
{
csvTable.Load(csv);
}
}
DataColumn newColumn = new DataColumn("FileId", typeof(System.Int32));
newColumn.DefaultValue = 1;
csvTable.Columns.Add(newColumn);
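using (SqlBulkCopy sbc = new SqlBulkCopy(connectionString))
{
// SqlBulkCopy needs a destination table before WriteToServer is called;
// the name below is the one used in the question.
sbc.DestinationTableName = "FidessaETP.tableARC_EventsOrderAndFlow_ImportTest";
sbc.WriteToServer(csvTable);
}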
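The part of this approach that is easy to doubt is whether a column added after loading also fills the rows that are already in the table. A minimal sketch, using only System.Data and a made-up two-row table, suggests that it does. FileIdColumnDemo, AddFileId and BuildSample are hypothetical names, and the sample column "Order Id" is an assumption that stands in for the real CSV headers.

```csharp
using System;
using System.Data;

public static class FileIdColumnDemo
{
    // Adds an Int32 "FileId" column; its DefaultValue is applied to rows already in the table.
    public static DataTable AddFileId(DataTable table, int fileId)
    {
        var column = new DataColumn("FileId", typeof(int)) { DefaultValue = fileId };
        table.Columns.Add(column);
        return table;
    }

    // Stand-in for the table that csvTable.Load(csv) would produce.
    public static DataTable BuildSample()
    {
        var table = new DataTable("Import");
        table.Columns.Add("Order Id", typeof(string));
        table.Rows.Add("A-1");
        table.Rows.Add("A-2");
        return table;
    }
}
```

After FileIdColumnDemo.AddFileId(FileIdColumnDemo.BuildSample(), 1), both rows carry FileId = 1. When the table is then passed to SqlBulkCopy, keep in mind that columns are matched by ordinal position unless ColumnMappings are added, so a mapping for "FileId" (and for the renamed CSV headers) is probably still needed.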

Csv DataGridView Conversion to XML Winforms

I am working on a project in which I write a DataGridView, loaded from a CSV file, to an XML file. That part works, but I would like to know whether there is a way to sort the output, specifically in alphabetical order by a particular column. This is my code for saving the XML file:
if (saveFileDialogXml.ShowDialog() == DialogResult.OK)
{
//Xml Alphabetical order code goes here
DataTable dst = new DataTable();
dst = (DataTable)Datagridview1.DataSource;
dst.TableName = "Data";
dst.WriteXml(saveFileDialogXml.FileName);
}
}
but the output of this is
<?xml version="1.0" standalone="yes"?>
<Item_x0020_Code>Item Code</Item_x0020_Code>
<Item_x0020_Description>Item Description</Item_x0020_Description>
<Current_x0020_Count>Current Count</Current_x0020_Count>
<On_x0020_Order>On Order</On_x0020_Order>
As you can see, it even includes the hexadecimal escape and just dumps everything there, so I was wondering whether I can reformat it the way I want, for example by removing the x0020. I tried using LINQ to see whether there was a problem with the file, but I keep getting another error, which says:
System.Xml.XmlException: 'The ' ' character, hexadecimal value 0x20, cannot be included in a name.'
This is the LINQ code :
var xmlFile = new XElement("root",
from line in File.ReadAllLines(@"C:\StockFile\stocklist.csv")
.Where(n => !string.IsNullOrWhiteSpace(n))
where !line.StartsWith(",") && line.Length > 0
let parts = line.Split(',')
select new XElement("Item Code",
new XElement("Test1", parts[0]),
new XElement("Test2", parts[1])
)
);
Also, I am new to C# and this is my first post here, so please excuse the messy writing or placement.
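Both symptoms in this question come from the same rule: an XML element name cannot contain a space. DataTable.WriteXml escapes the space as _x0020_, while new XElement("Item Code", ...) refuses the name and throws. The sketch below shows the escaping with XmlConvert from System.Xml. XmlNameDemo and its method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlNameDemo
{
    // Escapes characters that are not allowed in an XML name, e.g. a space becomes _x0020_.
    public static string ToXmlName(string header) => XmlConvert.EncodeLocalName(header);

    // Reverses the escaping, turning _x0020_ back into a space.
    public static string FromXmlName(string name) => XmlConvert.DecodeName(name);

    // Builds an element for a header that may contain spaces without throwing XmlException.
    public static XElement MakeElement(string header, string value) =>
        new XElement(ToXmlName(header), value);
}
```

If the escaped form is unwanted in the output file, the usual way around it is to rename the columns (for example "Item Code" to "Item_Code" or "ItemCode") before writing, since the XML name itself can never contain the space.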

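Try the following:
// Start from the table bound to the grid; a new, empty DataTable has nothing to sort.
DataTable dst = (DataTable)Datagridview1.DataSource;
int startColumn = 5;
for(int i = dst.Columns.Count - 1; i >= startColumn; i--)
{
// Order by the cell value in column i, not by the DataColumn object itself.
dst = dst.AsEnumerable().OrderBy(x => x[i]).CopyToDataTable();
}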
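For sorting by one named column, a DataView may be simpler than LINQ. This is a minimal sketch that assumes the grid is bound to a DataTable and that the column name contains no ']' character. DataTableSortDemo and SortByColumn are hypothetical names.

```csharp
using System;
using System.Data;

public static class DataTableSortDemo
{
    // Returns a copy of the table with its rows sorted ascending by the given column.
    public static DataTable SortByColumn(DataTable source, string columnName)
    {
        var view = new DataView(source);
        // Square brackets let the sort expression contain names with spaces, e.g. [Item Code].
        view.Sort = "[" + columnName + "] ASC";
        DataTable sorted = view.ToTable();
        sorted.TableName = source.TableName; // WriteXml requires a table name
        return sorted;
    }
}
```

In the save handler this could be used as DataTableSortDemo.SortByColumn((DataTable)Datagridview1.DataSource, "Item Code").WriteXml(saveFileDialogXml.FileName), provided TableName has been set on the source table first.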

Sorry for the late reply. I figured it out and forgot to close the question or mark an answer. If any of you run into the same thing, this is all I did:
// Save file dialogue XML file.
if (saveFileDialogXml.ShowDialog() == DialogResult.OK)
{
//try block to catch exception and handle it.
try
{
//Changing Data Table name to stock.
string Stock = ((DataTable)Datagridview1.DataSource).TableName;
}
//Catching the exception and handling it.
catch (Exception)
{
string es = "Please Open The File Before Saving it";
string title = "Error";
MessageBox.Show(es, title);
}
// instatiate new DataTable.
DataTable dt = new DataTable
{
TableName = "Stock"
};
for (int i = 0; i < Datagridview1.Columns.Count; i++)
{
//if (dataGridView1.Columns[i].Visible) // Add's only Visible columns.
//{
string headerText = Datagridview1.Columns[i].HeaderText;
headerText = Regex.Replace(headerText, "[-/, ]", "_");
DataColumn column = new DataColumn(headerText);
dt.Columns.Add(column);
//}
}
foreach (DataGridViewRow DataGVRow in Datagridview1.Rows)
{
DataRow dataRow = dt.NewRow();
// Add's only the columns that I need
dataRow[0] = DataGVRow.Cells["Item Code"].Value;
dataRow[1] = DataGVRow.Cells["Item Description"].Value;
dataRow[2] = DataGVRow.Cells["Current Count"].Value;
dataRow[3] = DataGVRow.Cells["On Order"].Value;
dt.Rows.Add(dataRow); //dt.Columns.Add();
}
DataSet ds = new DataSet();
ds.Tables.Add(dt);
//Finally the save part:
XmlTextWriter xmlSave = new XmlTextWriter(saveFileDialogXml.FileName, Encoding.UTF8)
{
Formatting = Formatting.Indented
};
ds.DataSetName = "Data";
ds.WriteXml(xmlSave);
xmlSave.Close();
}

C# Reading CSV to DataTable and Invoke Rows/Columns

I am currently working on a small project and I got stuck on a problem I cannot manage to solve.
I have multiple .csv files I want to read; they all have the same layout, just with different values.
Header1;Value1;Info1
Header2;Value2;Info2
Header3;Value3;Info3
While reading the first file I need to create the headers. The problem is that they are not split into columns but into rows (as you can see above, Header1 to Header3).
Then it needs to read Value1 to Value3 (listed in the second column), and on top of that I need to create another header, Header4, with the data of "Info2", which is always in column 3, row 2 (the other values of column 3 can be ignored).
So the outcome after the first file should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Info2;
And after multiple files it should look like this:
Header1;Header2;Header3;Header4;
Value1;Value2;Value3;Value4;
Value1b;Value2b;Value3b;Value4b;
Value1c;Value2c;Value3c;Value4c;
I tried it with OleDb, but I get the error "missing ISAM", which I cannot manage to fix. The code I used is the following:
public DataTable ReadCsv(string fileName)
{
DataTable dt = new DataTable("Data");
/* using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" +
Path.GetDirectoryName(fileName) + "\";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
*/
using (OleDbConnection cn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
Path.GetDirectoryName(fileName) + ";Extendet Properties ='text;HDR=yes;FMT=Delimited(,)';"))
{
using(OleDbCommand cmd = new OleDbCommand(string.Format("select *from [{0}]", new FileInfo(fileName).Name,cn)))
{
cn.Open();
using(OleDbDataAdapter adapter = new OleDbDataAdapter(cmd))
{
adapter.Fill(dt);
}
}
}
return dt;
}
Another attempt used StreamReader, but the headers end up in the wrong place and I do not know how to change this or how to do it for every file. The code I tried is the following:
public static DataTable ReadCsvFilee(string path)
{
DataTable oDataTable = new DataTable();
var fileNames = Directory.GetFiles(path);
foreach (var fileName in fileNames)
{
//initialising a StreamReader type variable and will pass the file location
StreamReader oStreamReader = new StreamReader(fileName);
// CONTROLS WHETHER WE SKIP A ROW OR NOT
int RowCount = 0;
// CONTROLS WHETHER WE CREATE COLUMNS OR NOT
bool hasColumns = false;
string[] ColumnNames = null;
string[] oStreamDataValues = null;
//using while loop read the stream data till end
while (!oStreamReader.EndOfStream)
{
String oStreamRowData = oStreamReader.ReadLine().Trim();
if (oStreamRowData.Length > 0)
{
oStreamDataValues = oStreamRowData.Split(';');
//Bcoz the first row contains column names, we will poluate
//the column name by
//reading the first row and RowCount-0 will be true only once
// CHANGE TO CHECK FOR COLUMNS CREATED
if (!hasColumns)
{
ColumnNames = oStreamRowData.Split(';');
//using foreach looping through all the column names
foreach (string csvcolumn in ColumnNames)
{
DataColumn oDataColumn = new DataColumn(csvcolumn.ToUpper(), typeof(string));
//setting the default value of empty.string to newly created column
oDataColumn.DefaultValue = string.Empty;
//adding the newly created column to the table
oDataTable.Columns.Add(oDataColumn);
}
// SET COLUMNS CREATED
hasColumns = true;
// SET RowCount TO 0 SO WE KNOW TO SKIP COLUMNS LINE
RowCount = 0;
}
else
{
// IF RowCount IS 0 THEN SKIP COLUMN LINE
if (RowCount++ == 0) continue;
//creates a new DataRow with the same schema as of the oDataTable
DataRow oDataRow = oDataTable.NewRow();
//using foreach looping through all the column names
for (int i = 0; i < ColumnNames.Length; i++)
{
oDataRow[ColumnNames[i]] = oStreamDataValues[i] == null ? string.Empty : oStreamDataValues[i].ToString();
}
//adding the newly created row with data to the oDataTable
oDataTable.Rows.Add(oDataRow);
}
}
}
//close the oStreamReader object
oStreamReader.Close();
//release all the resources used by the oStreamReader object
oStreamReader.Dispose();
}
return oDataTable;
}
I am thankful for everyone who is willing to help. And Thanks for reading this far!
Sincerely yours

If I understood you correctly, the files have a strict layout that can be parsed like this:
string OpenAndParse(string filename, bool firstFile=false)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var header = $"{parsed[0][0]};{parsed[1][0]};{parsed[2][0]};{parsed[1][0]}\n";
var data = $"{parsed[0][1]};{parsed[1][1]};{parsed[2][1]};{parsed[1][2]}\n";
return firstFile
? $"{header}{data}"
: $"{data}";
}
Where it would return - if first file:
Header1;Header2;Header3;Header2
Value1;Value2;Value3;Value4
if not first file:
Value1;Value2;Value3;Value4
If I am correct, the rest is about running this against a list of files and joining the results in an output file.
EDIT: Against a directory:
void ProcessFiles(string folderName, string outputFileName)
{
bool firstFile = true;
foreach (var f in Directory.GetFiles(folderName))
{
File.AppendAllText(outputFileName, OpenAndParse(f, firstFile));
firstFile = false;
}
}
Note: I missed that you want a DataTable and not an output file. In that case you could collect the results in a list and make that list the data source, without joining the values with semicolons at all; it is probably enough to add the array values to a list.
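If a DataTable is the goal, the same fixed layout can be written straight into one. The sketch below assumes every file has the same "Header;Value;Info" lines in the same order, that the header names are unique, and that the extra column should be called "Header4" as in the question. RowHeaderCsv and AppendFile are hypothetical names.

```csharp
using System;
using System.Data;
using System.Linq;

public static class RowHeaderCsv
{
    // Appends one file's content (already read into lines) to the table as a single row.
    // Each line is "Header;Value;Info"; the extra column takes the Info field of the second line.
    public static void AppendFile(DataTable table, string[] lines, string extraHeader = "Header4")
    {
        string[][] parsed = lines
            .Where(l => !string.IsNullOrWhiteSpace(l)) // skip blank lines
            .Select(l => l.Split(';'))
            .ToArray();

        // The first file defines the columns: one per line, plus the extra column.
        if (table.Columns.Count == 0)
        {
            foreach (string[] parts in parsed)
            {
                table.Columns.Add(parts[0], typeof(string));
            }
            table.Columns.Add(extraHeader, typeof(string));
        }

        DataRow row = table.NewRow();
        for (int i = 0; i < parsed.Length; i++)
        {
            row[i] = parsed[i][1]; // the value sits in the second field
        }
        row[parsed.Length] = parsed[1][2]; // "Info2": third field of the second line
        table.Rows.Add(row);
    }
}
```

Against a directory this would be called once per file, for example foreach (var f in Directory.GetFiles(folderName)) RowHeaderCsv.AppendFile(table, File.ReadAllLines(f));, starting with an empty DataTable.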

(Adding as another answer just to make it uncluttered)
void ProcessMyFiles(string folderName)
{
List<MyData> d = new List<MyData>();
var files = Directory.GetFiles(folderName);
foreach (var file in files)
{
OpenAndParse(file, d);
}
string[] headers = GetHeaders(files[0]);
DataGridView dgv = new DataGridView {Dock=DockStyle.Fill};
dgv.DataSource = d;
dgv.ColumnAdded += (sender, e) => {e.Column.HeaderText = headers[e.Column.Index];};
Form f = new Form();
f.Controls.Add(dgv);
f.Show();
}
string[] GetHeaders(string filename)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
return new string[] { parsed[0][0], parsed[1][0], parsed[2][0], parsed[1][0] };
}
void OpenAndParse(string filename, List<MyData> d)
{
var lines = File.ReadAllLines(filename);
var parsed = lines.Select(l => l.Split(';')).ToArray();
var data = new MyData
{
Col1 = parsed[0][1],
Col2 = parsed[1][1],
Col3 = parsed[2][1],
Col4 = parsed[1][2]
};
d.Add(data);
}
public class MyData
{
public string Col1 { get; set; }
public string Col2 { get; set; }
public string Col3 { get; set; }
public string Col4 { get; set; }
}

I don't know if this is the best way to do it, but what I would do in your case is rewrite the CSVs in the conventional layout while reading all the files, then create a stream containing the newly created CSV.
It would look something like this:
var csv = new StringBuilder();
csv.AppendLine("Header1;Header2;Header3;Header4");
foreach (var item in file)
{
// Use the same ';' separator as the header line.
var newLine = string.Format("{0};{1};{2};{3}", item.value1, item.value2, item.value3, item.value4);
csv.AppendLine(newLine);
}
//Create a stream over the CSV text that was just built
MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(csv.ToString()));
StreamReader reader = new StreamReader(stream);
//Fill your data table here with your values
Hope this will help.

Defining a table rather than a range as a PivotTable 'cacheSource'

I am building a tool to automate the creation of an Excel workbook that contains a table and an associated PivotTable. The table structure is on one sheet, the data for which will be pulled from a database using another tool at a later point. The PivotTable is on a second sheet using the table from the previous sheet as the source.
I am using EPPlus to facilitate building the tool but am running into problems specifying the cacheSource. I am using the following to create the range and PivotTable:
var dataRange = dataWorksheet.Cells[dataWorksheet.Dimension.Address.ToString()];
var pivotTable = pivotWorksheet.PivotTables.Add(pivotWorksheet.Cells["B3"], dataRange, name);
This sets the cacheSource to:
<x:cacheSource type="worksheet" xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:worksheetSource ref="A1:X2" sheet="dataWorksheet" />
</x:cacheSource>
or within Excel, the data source is set to:
dataWorksheet!$A$1:$X$2
This works fine if the table size never changes, but as the number of rows will be dynamic, I am finding when the data is refreshed, data is only read from the initial range specified.
What I want to do is programmatically set the cacheSource to:
<x:cacheSource type="worksheet" xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:worksheetSource name="dataWorksheet" />
</x:cacheSource>
or in Excel, set the data source to:
dataWorksheet
I believe it may be possible to do this by accessing the XML directly (any pointers on this would be most welcome) but is there any way to do this using EPPlus?

It can be done, but it is not the prettiest thing in the world. You can extract the cache definition XML from the EPPlus pivot table object and edit it, but that will wreak havoc with the save logic when you call package.Save() (or GetAsByteArray()), since EPPlus parses the XML on save to generate the final file. This is the result of, as you said, EPPlus not being able to handle a table as the source.
So your alternative is to save the file with EPPlus normally and then manipulate the content of the xlsx, which is a renamed zip file, using a .NET ZipArchive. The trick is that you cannot manipulate the files in the zip out of order, otherwise Excel will complain when it opens the file. And since you cannot insert an entry (only add to the end), you have to recreate the zip. Here is an extension method on ZipArchive that will allow you to update the cache source:
public static bool SetCacheSourceToTable(this ZipArchive xlsxZip, FileInfo destinationFileInfo, string tablename, int cacheSourceNumber = 1)
{
var cacheFound = false;
var cacheName = String.Format("pivotCacheDefinition{0}.xml", cacheSourceNumber);
using (var copiedzip = new ZipArchive(destinationFileInfo.Open(FileMode.Create, FileAccess.ReadWrite), ZipArchiveMode.Update))
{
//Go through each file in the zip one by one and copy it over to the new file - entries need to be in order
xlsxZip.Entries.ToList().ForEach(entry =>
{
var newentry = copiedzip.CreateEntry(entry.FullName);
var newstream = newentry.Open();
var orgstream = entry.Open();
//Copy all other files except the cache def we are after
if (entry.Name != cacheName)
{
orgstream.CopyTo(newstream);
}
else
{
cacheFound = true;
//Load the xml document to manipulate
var xdoc = new XmlDocument();
xdoc.Load(orgstream);
//Get reference to the worksheet xml for proper namespace
var nsm = new XmlNamespaceManager(xdoc.NameTable);
nsm.AddNamespace("default", xdoc.DocumentElement.NamespaceURI);
//get the source
var worksheetSource = xdoc.SelectSingleNode("/default:pivotCacheDefinition/default:cacheSource/default:worksheetSource", nsm);
//Clear the attributes
var att = worksheetSource.Attributes["ref"];
worksheetSource.Attributes.Remove(att);
att = worksheetSource.Attributes["sheet"];
worksheetSource.Attributes.Remove(att);
//Create the new attribute for table
att = xdoc.CreateAttribute("name");
att.Value = tablename;
worksheetSource.Attributes.Append(att);
xdoc.Save(newstream);
}
orgstream.Close();
newstream.Flush();
newstream.Close();
});
}
return cacheFound;
}
And here is how to use it:
//Throw in some data
var datatable = new DataTable("tblData");
datatable.Columns.AddRange(new[]
{
new DataColumn("Col1", typeof (int)), new DataColumn("Col2", typeof (int)), new DataColumn("Col3", typeof (object))
});
for (var i = 0; i < 10; i++)
{
var row = datatable.NewRow();
row[0] = i; row[1] = i*10; row[2] = Path.GetRandomFileName();
datatable.Rows.Add(row);
}
const string tablename = "PivotTableSource";
using (var pck = new ExcelPackage())
{
var workbook = pck.Workbook;
var source = workbook.Worksheets.Add("source");
source.Cells.LoadFromDataTable(datatable, true);
var datacells = source.Cells["A1:C11"];
source.Tables.Add(datacells, tablename);
var pivotsheet = workbook.Worksheets.Add("pivot");
pivotsheet.PivotTables.Add(pivotsheet.Cells["A1"], datacells, "PivotTable1");
using (var orginalzip = new ZipArchive(new MemoryStream(pck.GetAsByteArray()), ZipArchiveMode.Read))
{
var fi = new FileInfo(@"c:\temp\Pivot_From_Table.xlsx");
if (fi.Exists)
fi.Delete();
var result = orginalzip.SetCacheSourceToTable(fi, tablename, 1);
Console.Write("Cache source was updated: ");
Console.Write(result);
}
}
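The XML edit at the heart of the extension method can also be tried in isolation, which makes it easier to check against the cacheSource fragments shown in the question. This is a minimal sketch using LINQ to XML instead of XmlDocument. It assumes the cache definition is available as a string, and PivotCacheXml and PointAtTable are hypothetical names.

```csharp
using System;
using System.Xml.Linq;

public static class PivotCacheXml
{
    // Takes the text of a pivotCacheDefinition part and points its worksheetSource at a named table.
    public static string PointAtTable(string cacheDefinitionXml, string tableName)
    {
        XDocument doc = XDocument.Parse(cacheDefinitionXml);
        XNamespace ns = doc.Root.Name.Namespace; // spreadsheetml main namespace, whatever its prefix

        XElement source = doc.Root
            .Element(ns + "cacheSource")
            ?.Element(ns + "worksheetSource");
        if (source == null)
            throw new InvalidOperationException("No worksheetSource element found.");

        // Drop the fixed range and refer to the table by name instead.
        source.Attribute("ref")?.Remove();
        source.Attribute("sheet")?.Remove();
        source.SetAttributeValue("name", tableName);

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

Note that XDocument.ToString() omits the XML declaration; when writing back into the zip entry, doc.Save(stream) is probably the better choice because it emits the declaration as well.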
