I am using the EPPlus library to upload data from Excel files. My code works perfectly for files in a standard layout, i.e. where the first row holds the column names and every following row holds data for those columns. Recently, though, I have regularly been receiving Excel files with a different structure, and I am not able to read them.
An example of such a file is shown below.
From the third row I want only Region and Location Id and their values. The 7th row holds the column names for the values in rows 8 to 15. Finally, the 17th row holds the column names for rows 18 to 20. How can I load each of these blocks into a separate DataTable?
The code I am using is shown below.
I created an extension method:
public static DataSet Exceltotable(this string path)
{
DataSet ds = null;
using (var pck = new OfficeOpenXml.ExcelPackage())
{
try
{
using (var stream = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
pck.Load(stream);
}
ds = new DataSet();
var wss = pck.Workbook.Worksheets;
////////////////////////////////////
//Application app = new Application();
//app.Visible = true;
//app.Workbooks.Add("");
//app.Workbooks.Add(@"c:\MyWork\WorkBook1.xls");
//app.Workbooks.Add(@"c:\MyWork\WorkBook2.xls");
//for (int i = 2; i <= app.Workbooks.Count; i++)
//{
// for (int j = 1; j <= app.Workbooks[i].Worksheets.Count; j++)
// {
// Worksheet ws = app.Workbooks[i].Worksheets[j];
// ws.Copy(app.Workbooks[1].Worksheets[1]);
// }
//}
///////////////////////////////////////////////////
//for(int s=0;s<5;s++)
//{
foreach (var ws in wss)
{
System.Data.DataTable tbl = new System.Data.DataTable();
bool hasHeader = true; // adjust it accordingly( i've mentioned that this is a simple approach)
string ErrorMessage = string.Empty;
foreach (var firstRowCell in ws.Cells[1, 1, 1, ws.Dimension.End.Column])
{
tbl.Columns.Add(hasHeader ? firstRowCell.Text : string.Format("Column {0}", firstRowCell.Start.Column));
}
var startRow = hasHeader ? 2 : 1;
for (var rowNum = startRow; rowNum <= ws.Dimension.End.Row; rowNum++)
{
var wsRow = ws.Cells[rowNum, 1, rowNum, ws.Dimension.End.Column];
var row = tbl.NewRow();
foreach (var cell in wsRow)
{
//modifed by faras
if (cell.Text != null)
{
row[cell.Start.Column - 1] = cell.Text;
}
}
tbl.Rows.Add(row);
tbl.TableName = ws.Name;
}
DataTable dt = RemoveEmptyRows(tbl);
ds.Tables.Add(dt);
}
}
catch (Exception exp)
{
}
return ds;
}
}
If you're providing the template for users to upload, you can mitigate this some by using named ranges in your spreadsheet. That's a good idea anyway when programmatically working with Excel because it helps when you modify your own spreadsheet, not just when the user does.
You probably know how to name a range, but for the sake of completeness, here's how to name a range.
When you're working with the spreadsheet in code you can get a reference to the range using [yourworkbook].Names["yourNamedRange"]. If it's just a single cell and you need to reference the row or column index you can use .Start.Row or .Start.Column.
I add named ranges for anything: cells containing particular values, columns, header rows, rows where sets of data begin. If I need row or column indexes, I assign them to variables with useful names. That protects you from having all sorts of "magic numbers" in your code. You (or your users) can move quite a bit around in the spreadsheet without breaking anything.
If they modify the structure too much then it won't work. You can also use protection on the workbook and worksheet to ensure that they can't accidentally modify the structure - tabs, rows, columns.
This is loosely taken from a test I was working with last weekend when I was learning this. It was just a "hello world" so I wasn't trying to make it all streamlined and perfect. (I was working on populating a spreadsheet, not reading one, so I'm just learning the properties as I go.)
// Open the workbook
using (var package = new ExcelPackage(new FileInfo("PriceQuoteTemplate.xlsx")))
{
// Get the worksheet I'm looking for
var quoteSheet = package.Workbook.Worksheets["Quote"];
//If I wanted to get the text from one named range
var cellText = quoteSheet.Workbook.Names["myNamedRange"].Text;
//If I wanted to get the cell's value as some other type
var cellValue = quoteSheet.Workbook.Names["myNamedRange"].GetValue<int>();
//If I had a named range and I wanted to loop through the rows and get
//values from certain columns
var myRange = quoteSheet.Workbook.Names["rangeContainingRows"];
//This is a named range used to mark a column. So instead of using a
//magic number, I'll read from whatever column has this named range.
var someColumn = quoteSheet.Workbook.Names["columnLabel"].Start.Column;
for (var rowNumber = myRange.Start.Row; rowNumber < myRange.Start.Row + myRange.Rows; rowNumber++)
{
var getTheTextForTheRowAndColumn = quoteSheet.Cells[rowNumber, someColumn].Text;
}
}
There might be a more elegant way to go about it. I just started using this myself. But the idea is you tell it to find a certain named range on the spreadsheet, and then you use the row or column number of that range instead of a magic row or column number.
Even though a range might be one cell, one row, or one column, it can potentially be a larger area. That's why I use .Start.Row. In other words, give me the row for the first cell in the range. If a range has more than one row, the .Rows property indicates the number of rows so I know how many there are. That means someone could even insert rows without breaking the code.
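For the layout in the question (several header-plus-data blocks on one sheet), the same idea can be pushed one step further: read each block into its own DataTable. Here's a minimal sketch of that, not something I've run against the asker's file. `SheetBlocks`, `ReadBlock` and the `cellText` accessor are names I made up for illustration; the accessor just returns the text at a 1-based row and column, so the helper itself doesn't depend on EPPlus.

```csharp
using System;
using System.Data;

public static class SheetBlocks
{
    // Reads one rectangular block (a header row followed by data rows) into a DataTable.
    // cellText is a hypothetical accessor that returns the text at a 1-based (row, column).
    public static DataTable ReadBlock(Func<int, int, string> cellText, string tableName,
        int headerRow, int lastDataRow, int firstColumn, int lastColumn)
    {
        var table = new DataTable(tableName);

        // Build the columns from the header row of this block.
        for (int col = firstColumn; col <= lastColumn; col++)
        {
            string header = cellText(headerRow, col);
            // Fall back to a generated name when a header cell is empty.
            table.Columns.Add(string.IsNullOrWhiteSpace(header) ? "Column " + col : header.Trim());
        }

        // Every row after the header, up to lastDataRow, becomes one DataRow.
        for (int rowNum = headerRow + 1; rowNum <= lastDataRow; rowNum++)
        {
            DataRow row = table.NewRow();
            for (int col = firstColumn; col <= lastColumn; col++)
            {
                row[col - firstColumn] = cellText(rowNum, col) ?? string.Empty;
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

With EPPlus the accessor could be `(r, c) => ws.Cells[r, c].Text`, and the bounds could come from a named range's `.Start.Row`, `.End.Row`, `.Start.Column` and `.End.Column` rather than hard-coded numbers such as 7 and 15. Note that duplicate header names inside one block would make `Columns.Add` throw.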
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.OleDb;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
namespace ReadData
{
public partial class ImportExelDataInGridView : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
}
protected void btnUpload_Click(object sender, EventArgs e)
{
//Connection string, empty by default
string ConStr = "";
//Store the extension of the uploaded file in ext, because
//Excel files come with two extensions, .xls and .xlsx
string ext = Path.GetExtension(FileUpload1.FileName).ToLower();
//getting the path of the file
string path = Server.MapPath("~/MyFolder/"+FileUpload1.FileName);
//saving the file inside the MyFolder of the server
FileUpload1.SaveAs(path);
Label1.Text = FileUpload1.FileName + "\'s Data showing into the GridView";
//check whether the extension is .xls or .xlsx
if (ext.Trim() == ".xls")
{
//connection string for files with the .xls extension
ConStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=2\"";
}
else if (ext.Trim() == ".xlsx")
{
//connection string for files with the .xlsx extension
ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";Extended Properties=\"Excel 12.0;HDR=Yes;IMEX=2\"";
}
//making query
string query = "SELECT * FROM [Sheet1$]";
//Providing connection
OleDbConnection conn = new OleDbConnection(ConStr);
//check whether the connection is closed and, if it is,
//open it
if (conn.State == ConnectionState.Closed)
{
conn.Open();
}
//create command object
OleDbCommand cmd = new OleDbCommand(query, conn);
// create a data adapter and get the data into dataadapter
OleDbDataAdapter da = new OleDbDataAdapter(cmd);
DataSet ds = new DataSet();
//fill the excel data to data set
da.Fill(ds);
if (ds.Tables != null && ds.Tables.Count > 0)
{
for (int i = 0; i < ds.Tables[0].Columns.Count; i++)
{
if (ds.Tables[0].Columns[0].ToString() == "ID" && ds.Tables[0].Columns[1].ToString() == "name")
{
}
//else if (ds.Tables[0].Rows[0][i].ToString().ToUpper() == "NAME")
//{
//}
//else if (ds.Tables[0].Rows[0][i].ToString().ToUpper() == "EMAIL")
//{
//}
}
}
//set data source of the grid view
gvExcelFile.DataSource = ds.Tables[0];
//binding the gridview
gvExcelFile.DataBind();
//close the connection
conn.Close();
}
}
}
try
{
System.Diagnostics.Process[] process = System.Diagnostics.Process.GetProcessesByName("Excel");
foreach (System.Diagnostics.Process p in process)
{
if (!string.IsNullOrEmpty(p.ProcessName))
{
try
{
p.Kill();
}
catch { }
}
}
REF_User oREF_User = new REF_User();
oREF_User = (REF_User)Session["LoggedUser"];
string pdfFilePath = Server.MapPath("~/FileUpload/" + oREF_User.USER_ID + "");
if (Directory.Exists(pdfFilePath))
{
System.IO.DirectoryInfo di = new DirectoryInfo(pdfFilePath);
foreach (FileInfo file in di.GetFiles())
{
file.Delete();
}
Directory.Delete(pdfFilePath);
}
Directory.CreateDirectory(pdfFilePath);
string path = Server.MapPath("~/FileUpload/" + oREF_User.USER_ID + "/");
if (Path.GetExtension(FileUpload1.FileName) == ".xlsx")
{
string fullpath1 = path + Path.GetFileName(FileUpload1.FileName);
if (FileUpload1.FileName != "")
{
FileUpload1.SaveAs(fullpath1);
}
FileStream Stream = new FileStream(fullpath1, FileMode.Open);
IExcelDataReader ExcelReader = ExcelReaderFactory.CreateOpenXmlReader(Stream);
DataSet oDataSet = ExcelReader.AsDataSet();
Stream.Close();
bool result = false;
foreach (System.Data.DataTable oDataTable in oDataSet.Tables)
{
//ToDO code
}
oBL_PlantTransactions.InsertList(oListREF_PlantTransactions, null);
ShowMessage("Successfully saved!", REF_ENUM.MessageType.Success);
}
else
{
ShowMessage("File Format Incorrect", REF_ENUM.MessageType.Error);
}
}
catch (Exception ex)
{
ShowMessage("Please check the details and submit again!", REF_ENUM.MessageType.Error);
System.Diagnostics.Process[] process = System.Diagnostics.Process.GetProcessesByName("Excel");
foreach (System.Diagnostics.Process p in process)
{
if (!string.IsNullOrEmpty(p.ProcessName))
{
try
{
p.Kill();
}
catch { }
}
}
}
I found this article to be very helpful.
It lists various libraries you can choose from. One of the libraries I used is EPPlus as shown below.
Nuget: EPPlus Library
Excel Sheet 1 Data
Cell A2 Value :
Cell A2 Color :
Cell B2 Formula :
Cell B2 Value :
Cell B2 Border :
Excel Sheet 2 Data
Cell A2 Formula :
Cell A2 Value :
static void Main(string[] args)
{
using(var package = new ExcelPackage(new FileInfo("Book.xlsx")))
{
var firstSheet = package.Workbook.Worksheets["First Sheet"];
Console.WriteLine("Sheet 1 Data");
Console.WriteLine($"Cell A2 Value : {firstSheet.Cells["A2"].Text}");
Console.WriteLine($"Cell A2 Color : {firstSheet.Cells["A2"].Style.Font.Color.LookupColor()}");
Console.WriteLine($"Cell B2 Formula : {firstSheet.Cells["B2"].Formula}");
Console.WriteLine($"Cell B2 Value : {firstSheet.Cells["B2"].Text}");
Console.WriteLine($"Cell B2 Border : {firstSheet.Cells["B2"].Style.Border.Top.Style}");
Console.WriteLine("");
var secondSheet = package.Workbook.Worksheets["Second Sheet"];
Console.WriteLine($"Sheet 2 Data");
Console.WriteLine($"Cell A2 Formula : {secondSheet.Cells["A2"].Formula}");
Console.WriteLine($"Cell A2 Value : {secondSheet.Cells["A2"].Text}");
}
}
I've developed an ASP.NET MVC application that runs on an IIS server. I've written code that reads a CSV and inserts its rows into a database.
[HttpPost]
public ActionResult InsertPosition(int id, HttpPostedFileBase position)
{
var posicoesExistentes = db.tbPositions.Where(s => s.id_unique == id).AsEnumerable();
foreach (tbPosition posicao in posicoesExistentes)
{
db.tbPositions.Remove(posicao);
}
if (!Directory.Exists(Server.MapPath("~/App_Data/")))
{
System.IO.Directory.CreateDirectory(Server.MapPath("~/App_Data/"));
}
string excelPath = Server.MapPath("~/App_Data/" + position.FileName);
if (System.IO.File.Exists(excelPath))
{
System.IO.File.Delete(excelPath);
}
position.SaveAs(excelPath);
string tempPath = Server.MapPath("~/App_Data/" + "tmp_" + position.FileName);
System.IO.File.Copy(excelPath, tempPath, true);
Excel.Application application = new Excel.Application();
Excel.Workbook workbook = application.Workbooks.Open(tempPath, ReadOnly: true,Editable:false);
Excel.Worksheet worksheet = workbook.ActiveSheet;
Excel.Range range = worksheet.UsedRange;
application.Visible = true;
for (int row = 1; row < range.Rows.Count - 1; row++)
{
tbPosition p = new tbPosition();
p.position = (((Excel.Range)range.Cells[row, 1]).Text == "") ? null : Convert.ToInt32(((Excel.Range)range.Cells[row, 1]).Text);
p.left = ((Excel.Range)range.Cells[row, 2]).Text;
p.right = ((Excel.Range)range.Cells[row, 3]).Text;
p.paper = ((Excel.Range)range.Cells[row, 4]).Text;
p.denomination = ((Excel.Range)range.Cells[row, 5]).Text;
p.material = ((Excel.Range)range.Cells[row, 6]).Text;
p.norme = ((Excel.Range)range.Cells[row, 7]).Text;
p.finalized_measures = ((Excel.Range)range.Cells[row, 8]).Text;
p.observation = ((Excel.Range)range.Cells[row, 9]).Text;
p.id_unique = id;
db.tbPositions.Add(p);
db.SaveChanges();
}
workbook.Close(true, Type.Missing, Type.Missing);
application.Quit();
System.IO.File.Delete(tempPath);
return Json("Success", JsonRequestBehavior.AllowGet);
}
But I get the error 'Microsoft Excel cannot access the file '...'. There are several possible reasons' when the code tries to open the requested Excel file.
I've already tried opening the file as read-only, granting permissions on the folders involved, closing the Excel file in several different ways, and creating a copy of the original file and reading that instead. None of these worked. What have I missed here?
Unsupported
The short answer is that programmatically manipulating an Excel document through the Automation API is not supported outside of a UI context. You will come across all sorts of frustrations (for example, the API is permitted to show dialogs; how are you going to click "OK" if it's running on a web server?).
Microsoft explicitly state this here
Microsoft does not recommend or support server-side Automation of Office.
So what do I use?
I would recommend using the OpenXML SDK - this is free, fully supported and much faster than the Automation API.
Aspose also has a set of products, but they are not free, and I've not used them.
But I HAVE to do it this way
However, if you absolutely have to use the COM API then the following might help you:
HERE BE DRAGONS
The big problem with automation in Excel is that you need to ensure you close every single reference whenever you use them (by calling ReleaseComObject on it).
For example, the following code will cause Excel to stay open:
Dim range As Excel.Range
range = excelApplication.Range("A1");
range = excelApplication.Range("A2");
System.Runtime.InteropServices.Marshal.ReleaseComObject(range)
range = Nothing
This is because there is still a reference left over from the call to get range "A1".
Therefore, I would recommend writing a wrapper around the Excel class so that any access to, e.g., a range frees any previous ranges accessed before accessing the new range.
For reference, here is the code I used to release COM objects in the class I wrote:
Private Sub ReleaseComObject(ByVal o As Object)
Try
If Not IsNothing(o) Then
While System.Runtime.InteropServices.Marshal.ReleaseComObject(o) > 0
'Wait for COM object to be released.'
End While
End If
o = Nothing
Catch exc As System.Runtime.InteropServices.COMException
LogError(exc) ' Suppress errors thrown here '
End Try
End Sub
Try this
protected void ImportCSV(object sender, EventArgs e)
{
importbtn();
}
public class Item
{
public Item(string line)
{
var split = line.Split(',');
string FIELD1 = split[0];
string FIELD2 = split[1];
string FIELD3 = split[2];
string mainconn = ConfigurationManager.ConnectionStrings["ConnectionString"].ConnectionString;
using (SqlConnection con = new SqlConnection(mainconn))
{
using (SqlCommand cmd = new SqlCommand("storedProcedureName", con))
{
cmd.CommandType = CommandType.StoredProcedure;
// Parameter names start with '@'; Add (not AddWithValue) takes the SqlDbType.
cmd.Parameters.Add("@FIELD1", SqlDbType.VarChar).Value = FIELD1;
cmd.Parameters.Add("@FIELD2", SqlDbType.VarChar).Value = FIELD2;
cmd.Parameters.Add("@FIELD3", SqlDbType.VarChar).Value = FIELD3;
con.Open();
cmd.ExecuteNonQuery();
}
}
}
}
private void importbtn()
{
try
{
string csvPath = Server.MapPath("~/Files/") + Path.GetFileName(FileUpload1.PostedFile.FileName);
FileUpload1.SaveAs(csvPath);
var listOfObjects = File.ReadLines(csvPath).Select(line => new Item(line)).ToList();
DataTable dt = new DataTable();
dt.Columns.AddRange(new DataColumn[3] { new DataColumn("FIELD1", typeof(string)),
new DataColumn("FIELD2", typeof(string)),
new DataColumn("FIELD3",typeof(string)) });
string csvData = File.ReadAllText(csvPath);
foreach (string row in csvData.Split('\n'))
{
if (!string.IsNullOrEmpty(row))
{
dt.Rows.Add();
int i = 0;
//Execute a loop over the columns.
foreach (string cell in row.Split(','))
{
dt.Rows[dt.Rows.Count - 1][i] = cell;
i++;
}
}
}
GridView1.DataSource = dt;
GridView1.DataBind();
Label1.Text = "File Attached Successfully";
}
catch (Exception ex)
{
Message.Text = "Please Attach any File" /*+ ex.Message*/;
}
}
I am creating a DataGrid by importing an Excel file. I want users to be able to change the column names manually from within the application.
Edit: Workaround at the bottom
My desktop app will have the following logic:
Load an Excel file and display the table in a DataGrid.
Manually change column names to match fixed text (e.g. column "PricesZZZ" renamed to "Prices", "LeadTimeXXX" to "LeadTime").
Export the DataGrid to a new Excel template containing only the relevant columns, which are matched by the fixed text (hence the need for correct names).
The Excel file can have many columns, and only a few of them hold relevant information. The only way to identify them is to match the header name, or to have the user "tell" the program in some other way which column holds which information.
I need to find a way to change a column name based on user input, as I think that's the most straightforward approach. I'm new to C#, so sorry if my thinking is a little backwards.
Below is the code I have so far. It might not be relevant to this specific problem, but it may help to visualize it. I use the EPPlus library.
Import excel
private void btnOpenXL_Click(object sender, RoutedEventArgs e)
{
// Create OpenFileDialog
Microsoft.Win32.OpenFileDialog dlg = new Microsoft.Win32.OpenFileDialog();
// Set filter for file extension and default file extension
dlg.DefaultExt = ".xls";
dlg.Filter = "Excel Files|*.xlsx;*.xls;*.xlsm;*.csv";
// Display OpenFileDialog by calling ShowDialog method
Nullable<bool> result = dlg.ShowDialog();
// Get the selected file name
if (result == true)
{
// Open document
string filename = dlg.FileName;
//call another class to draw the table
dataGrid.ItemsSource = GetDataTableFromExcel(filename).DefaultView;
MessageBox.Show("import done");
}
}
public static DataTable GetDataTableFromExcel(string path, bool hasHeader = true)
{
using (var pck = new OfficeOpenXml.ExcelPackage())
{
using (var stream = File.OpenRead(path))
{
pck.Load(stream);
}
var ws = pck.Workbook.Worksheets.First();
DataTable tbl = new DataTable();
foreach (var firstRowCell in ws.Cells[1, 1, 1, ws.Dimension.End.Column])
{
tbl.Columns.Add(hasHeader ? firstRowCell.Text : string.Format("Column {0}", firstRowCell.Start.Column));
}
var startRow = hasHeader ? 2 : 1;
for (int rowNum = startRow; rowNum <= ws.Dimension.End.Row; rowNum++)
{
var wsRow = ws.Cells[rowNum, 1, rowNum, ws.Dimension.End.Column];
DataRow row = tbl.Rows.Add();
foreach (var cell in wsRow)
{
row[cell.Start.Column - 1] = cell.Text;
}
}
return tbl;
}
}
Export excel
private void btnExportToXL_Click(object sender, RoutedEventArgs e)
{
DataTable dataTable = new DataTable();
dataTable = ((DataView)dataGrid.ItemsSource).ToTable();
ExportDataTableToExcel(dataTable);
MessageBox.Show("export done");
}
public void ExportDataTableToExcel(DataTable dataTable)
{
string path = "C:\\test";
var newFile = new FileInfo(path + "\\" +
DateTime.Now.Ticks + ".xlsx");
using (ExcelPackage pck = new ExcelPackage(newFile))
{
ExcelWorksheet ws = pck.Workbook.Worksheets.Add("Sheet1");
ws.Cells["A1"].LoadFromDataTable(dataTable, true);
pck.Save();
System.Diagnostics.Process.Start(newFile.ToString());
}
}
EDIT:
Workaround by double clicking on any cell in datagrid:
private void dataGrid_MouseDoubleClick(object sender, RoutedEventArgs e)
{
if (dataGrid.SelectedIndex == -1) //if column selected, cant use .CurrentColumn property
{
MessageBox.Show("Please double click on a row");
}
else
{
DataGridColumn columnHeader = dataGrid.CurrentColumn;
if (columnHeader != null)
{
string input = Interaction.InputBox("Title", "Prompt", "Default", 0, 0);
columnHeader.Header = input;
}
}
}
You can change the column names of the DataGrid. Note, however, that this change is limited to the grid and does not affect its data source. So, for simple presentational purposes, you can use the following code:
dataGrid.Columns[i].Header = "New Column Name"; //i is the index of the column (the WinForms DataGridView equivalent is HeaderText)
You can call this code from a Button click event or from the TextChanged event of the input where the user provides the header name. Additionally, if you know the column names beforehand, you can replace the column headers with the new values right after the data source has been bound to the grid. Change the headers after this line:
dataGrid.ItemsSource = GetDataTableFromExcel(filename).DefaultView;
//Set new column names here
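Because a header change in the grid doesn't reach the data source, the export step would still see the old names. One way around that is to rename the columns on the DataTable itself just before exporting. This is only a sketch under my own assumptions: `ColumnMapper` and `RenameAndSelect` are hypothetical names, and the dictionary of old-to-new names is assumed to come from whatever the user typed in.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ColumnMapper
{
    // Returns a copy of the table in which the mapped columns are renamed
    // and every unmapped column is dropped. The source table is left untouched.
    public static DataTable RenameAndSelect(DataTable source, IDictionary<string, string> renames)
    {
        DataTable copy = source.Copy();

        // Rename each column the user mapped, skipping names the file doesn't contain.
        foreach (KeyValuePair<string, string> pair in renames)
        {
            if (copy.Columns.Contains(pair.Key))
            {
                copy.Columns[pair.Key].ColumnName = pair.Value;
            }
        }

        // Keep only the columns that now carry one of the fixed names.
        string[] wanted = renames.Values.Where(name => copy.Columns.Contains(name)).ToArray();
        return copy.DefaultView.ToTable(false, wanted);
    }
}
```

In the export handler from the question this could sit between `((DataView)dataGrid.ItemsSource).ToTable()` and `ExportDataTableToExcel(...)`. Renaming a column to a name that already exists in the table throws, so the user input would need checking first.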
I am using the code below to load data from a CSV file into a DataTable.
Because the values are mixed, i.e. both numbers and letters, some of the columns do not make it into the DataTable.
I did some research and found that we need to set ImportMixedTypes = Text and TypeGuessRows = 0 in the registry, but even that did not solve the problem.
The code below does work for some files, even ones with mixed text.
Could someone tell me what is wrong with the code below? Am I missing something?
if (isFirstRowHeader)
{
header = "Yes";
}
using (OleDbConnection connection = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathOnly +
";Extended Properties=\"text;HDR=" + header + ";FMT=Delimited\";"))
{
using (OleDbCommand command = new OleDbCommand(sql, connection))
{
using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
{
adapter.Fill(table);
}
connection.Close();
}
}
For a comma-delimited file, this worked for me:
public DataTable CSVtoDataTable(string inputpath)
{
DataTable csvdt = new DataTable();
string Fulltext;
if (File.Exists(inputpath))
{
using (StreamReader sr = new StreamReader(inputpath))
{
while (!sr.EndOfStream)
{
Fulltext = sr.ReadToEnd().ToString();//read full content
string[] rows = Fulltext.Split('\n');//split file content to get the rows
for (int i = 0; i < rows.Count() - 1; i++)
{
var regex = new Regex("\\\"(.*?)\\\"");
var output = regex.Replace(rows[i], m => m.Value.Replace(",", "\\c"));//replace commas inside quotes
string[] rowValues = output.Split(',');//split rows with comma',' to get the column values
{
if (i == 0)
{
for (int j = 0; j < rowValues.Count(); j++)
{
csvdt.Columns.Add(rowValues[j].Replace("\\c",","));//headers
}
}
else
{
try
{
DataRow dr = csvdt.NewRow();
for (int k = 0; k < rowValues.Count(); k++)
{
if (k >= dr.Table.Columns.Count)// more columns may exist
{ csvdt .Columns.Add("clmn" + k);
dr = csvdt .NewRow();
}
dr[k] = rowValues[k].Replace("\\c", ",");
}
csvdt.Rows.Add(dr);//add other rows
}
catch
{
Console.WriteLine("error");
}
}
}
}
}
}
}
return csvdt;
}
The main thing that would probably help is to first stop using OleDB objects for reading a delimited file. I suggest using the 'TextFieldParser' which is what I have successfully used for over 2 years now for a client.
http://www.dotnetperls.com/textfieldparser
There may be other issues, but without seeing your .CSV file, I can't tell you where your problem may lie.
The TextFieldParser is specifically designed to parse comma delimited files. The OleDb objects are not. So, start there and then we can determine what the problem may be, if it persists.
If you look at an example on the link I provided, they are merely writing lines to the console. You can alter this code portion to add rows to a DataTable object, as I do, for sorting purposes.
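To make that last step concrete, here is roughly what feeding TextFieldParser output into a DataTable could look like. Treat it as a sketch: `CsvLoader` and `Load` are names I picked for illustration, every column is deliberately typed as string so that mixed numeric and text values are kept as-is, and I haven't seen the asker's file.

```csharp
using System.Data;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvLoader
{
    // Parses comma-delimited text into a DataTable whose columns are all strings.
    public static DataTable Load(TextReader reader, bool firstRowIsHeader)
    {
        var table = new DataTable();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // commas inside quotes stay in the field

            bool firstRow = true;
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (fields == null) break;

                // Grow the table if this row is wider than any row seen so far.
                while (table.Columns.Count < fields.Length)
                {
                    int index = table.Columns.Count;
                    string name = (firstRow && firstRowIsHeader) ? fields[index] : "Column" + (index + 1);
                    table.Columns.Add(name, typeof(string));
                }

                // The header row only supplies column names, not data.
                if (firstRow && firstRowIsHeader)
                {
                    firstRow = false;
                    continue;
                }
                firstRow = false;
                table.Rows.Add(fields);
            }
        }
        return table;
    }
}
```

For a file on disk you could call `CsvLoader.Load(new StreamReader(path), true)`. On .NET Framework this needs a reference to the Microsoft.VisualBasic assembly, and duplicate header names will make `Columns.Add` throw.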
I'm writing a list to an Excel sheet (.xlsx) generated with EPPlus. I then tried to fit the columns using the worksheet.Cells[worksheet.Dimension.Address].AutoFitColumns(); method.
This is how I write data
using (ExcelPackage xlPackage = new ExcelPackage(newFile))
{
System.Data.DataTable dt = new System.Data.DataTable();
var ws = xlPackage.Workbook.Worksheets.FirstOrDefault(x => x.Name == language.Culture);
if (ws == null)
{
int i = 1, j = 0;
worksheet = xlPackage.Workbook.Worksheets.Add(language.Culture);
foreach (ExcelFields fieldValues in UnmatchedFieldList)
{
//code
}
}
else
{
int i = 0;
worksheet = xlPackage.Workbook.Worksheets[language.Culture];
colCount = worksheet.Dimension.End.Column;
rowCount = worksheet.Dimension.End.Row;
foreach (ExcelFields fieldValues in UnmatchedFieldList)
{
worksheet.Cells[rowCount + 1, count + 1].Value = itemName;
}
}
worksheet.Cells[worksheet.Dimension.Address].AutoFitColumns();
xlPackage.Save();
}
I read data as
string sheetName = language.Culture;
var excelFile = new ExcelQueryFactory(excelPath);
IQueryable<Row> excelSheetValues = from workingSheet in excelFile.Worksheet(sheetName) select workingSheet;
string[] headerRow = excelFile.GetColumnNames(sheetName).ToArray();
The GetColumnNames call that fills headerRow throws the following exception when it tries to read the data from the Excel file:
External table is not in the expected format
I found out that this happens because the column widths are not set correctly. When I set the column widths manually by double-clicking the cell and then run the code, it works fine.
So I want to achieve the same thing in code.
The "External table is not in the expected format" exception is typically caused by a problem with the connection string, so check yours against the following sample:
public static string docPath= @"C:\sourcefolder\myfile.xlsx";
public static string ConnectionString= "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + docPath + ";Extended Properties=Excel 12.0;";
Or, if you use LinqToExcel, check whether you have set the DatabaseEngine property, as follows:
public string docPath= @"C:\sourcefolder\myfile.xlsx";
var excelFile = new ExcelQueryFactory(docPath);
excelFile.DatabaseEngine = DatabaseEngine.Ace;
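If the connection string is built by hand, one way to keep the provider and the file type in step is to derive both from the file extension. A small sketch, assuming the provider strings shown in this thread are the ones you want; `ExcelConnectionStrings` and `Build` are hypothetical names, not part of any library.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Picks the OLE DB provider and "Extended Properties" value from the file extension.
    public static string Build(string path, bool hasHeader)
    {
        string extension = Path.GetExtension(path).ToLowerInvariant();
        string hdr = hasHeader ? "Yes" : "No";

        switch (extension)
        {
            case ".xls":
                // Older binary workbooks: Jet provider with Excel 8.0.
                return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path +
                       ";Extended Properties=\"Excel 8.0;HDR=" + hdr + "\"";
            case ".xlsx":
                // Open XML workbooks: ACE provider with Excel 12.0.
                return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
                       ";Extended Properties=\"Excel 12.0;HDR=" + hdr + "\"";
            default:
                throw new NotSupportedException("Unsupported Excel extension: " + extension);
        }
    }
}
```

Failing loudly on an unknown extension is a deliberate choice here: it surfaces a wrong file type before the provider gets a chance to report the much vaguer "not in the expected format" error.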
I have an Excel file (.xls) with a column called Money. Almost all cells in the Money column are formatted as numbers, except for some that carry the marker saying the number is stored as text. I convert the Excel file to CSV with a C# script that opens it using IMEX=1 in the connection string. The cells marked as stored as text do not come through to the CSV file. The file is large, about 20 MB, which means that around 100 values such as 33344 are missing from the CSV file.
I tried adding a delay at the point where I open the Excel file. This worked on my PC but not on the development machine.
Does anyone have an idea how to get around this without manual intervention, such as formatting all columns with mixed data types as numbers? I am looking for an automated solution that works every time. This is on SSIS 2008.
static void ConvertExcelToCsv(string excelFilePath, string csvOutputFile, int worksheetNumber = 1) {
if (!File.Exists(excelFilePath)) throw new FileNotFoundException(excelFilePath);
if (File.Exists(csvOutputFile)) throw new ArgumentException("File exists: " + csvOutputFile);
// connection string
var cnnStr = String.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;IMEX=1;HDR=NO\"", excelFilePath);
var cnn = new OleDbConnection(cnnStr);
// get schema, then data
var dt = new DataTable();
try {
cnn.Open();
var schemaTable = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (schemaTable.Rows.Count < worksheetNumber) throw new ArgumentException("The worksheet number provided cannot be found in the spreadsheet");
string worksheet = schemaTable.Rows[worksheetNumber - 1]["table_name"].ToString().Replace("'", "");
string sql = String.Format("select * from [{0}]", worksheet);
var da = new OleDbDataAdapter(sql, cnn);
da.Fill(dt);
}
catch (Exception) {
// rethrow without resetting the stack trace
throw;
}
finally {
// free resources
cnn.Close();
}
// write out CSV data
using (var wtr = new StreamWriter(csvOutputFile)) {
foreach (DataRow row in dt.Rows) {
bool firstLine = true;
foreach (DataColumn col in dt.Columns) {
if (!firstLine) { wtr.Write(","); } else { firstLine = false; }
var data = row[col.ColumnName].ToString().Replace("\"", "\"\"");
wtr.Write(String.Format("\"{0}\"", data));
}
wtr.WriteLine();
}
}
}
My solution was to specify a format for the incoming files that forbids columns with mixed data types. The fix came from the business side rather than from technology.