Issue with importing .csv using MS Jet OLEDB - c#

I'm importing a CSV file into SQL Server for an ASP.NET application. I am able to import the .csv, but one column comes through as NULL whenever it contains anything other than numbers.
This row imports fine:
1109,003,IN,0093219095,3/17/2013,3/21/2013,,Sobeys Warehouse,4819.13,61.37,4880.50,RV,1109-003
The fourth column is NULL in SQL:
1109,999,IN,REF 44308/S. DRA,3/18/2013,3/21/2013,,"EC Rebates W/E -02 14, 2013",-200.02,0.00,-200.02,SA,1109-999
All other columns that have text in them import fine; only the fourth one is an issue. I can't figure out what could possibly be different about the affected column compared with the others. If I replace the text with numbers it imports, so it's something to do with text data. The SQL field is nvarchar(50), so it's not a data type issue on the SQL side.
My connection string (dir contains the path to the folder):
string connString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" + dir + "\\\";Extended Properties='text;HDR=Yes;FMT=Delimited(,)';";
Import code:
public DataTable GetCSV(string path)
{
if (!File.Exists(path))
{
return null;
}
DataTable dt = new DataTable();
string fullPath = Path.GetFullPath(path);
string file = Path.GetFileName(fullPath);
string dir = Path.GetDirectoryName(fullPath);
string connString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" + dir + "\\\";Extended Properties='text;HDR=Yes;FMT=Delimited(,)';";
string query = "SELECT * FROM " + file;
System.Data.OleDb.OleDbDataAdapter da = new System.Data.OleDb.OleDbDataAdapter(query, connString);
try
{
da.Fill(dt);
}
catch (InvalidOperationException)
{
}
da.Dispose();
return dt;
}

This solved my issue. Thanks to Pondlife for getting me pointed in the right direction. I figured it had something to do with data types, just not sure where it was getting walloped.
http://msdn.microsoft.com/en-us/library/windows/desktop/ms709353%28v=vs.85%29.aspx
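The linked page documents the Schema.ini file used by the text driver. The file the asker ended up with isn't shown in the thread, so the following is only a sketch of what such a file might look like for the sample rows above. It assumes the CSV is called import.csv and uses placeholder column names (both are made up here); it declares every one of the 13 columns as Text so the driver does not guess a numeric type for the fourth column from the first rows it scans. The file is saved as schema.ini in the same folder as the CSV.

```ini
; Hypothetical file name and column names, adjust to the real CSV.
[import.csv]
ColNameHeader=True
Format=CSVDelimited
Col1=Column1 Text
Col2=Column2 Text
Col3=Column3 Text
Col4=Reference Text
Col5=Column5 Text
Col6=Column6 Text
Col7=Column7 Text
Col8=Column8 Text
Col9=Column9 Text
Col10=Column10 Text
Col11=Column11 Text
Col12=Column12 Text
Col13=Column13 Text
```

Once the import works, individual columns could be tightened to other types (for example DateTime or Double) if that suits the destination table better.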

Related

Showing extra rows in database after importing to SQL Server

I'm building a program that needs the following features:
Import an Excel file into database -- Check
Avoid duplicates --- working on it
Ignore some rows that are in the header of the Excel file and the bottom --- that's what I want to ask you guys
Here's my code
protected void Upload_Click(object sender, EventArgs e)
{
string excelPath = Server.MapPath("~/Nova pasta/") + Path.GetFileName(FileUpload1.PostedFile.FileName);
string filepath = Server.MapPath("~/Nova pasta/") + Path.GetFileName(FileUpload1.FileName);
string filename = Path.GetFileName(filepath);
FileUpload1.SaveAs(excelPath);
string strConnection = @"Data Source=PEDRO-PC\SQLEXPRESS;Initial Catalog=costumizado;Persist Security Info=True;User ID=sa;Password=1234";
string excelConnectionString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filepath + ";Extended Properties=\"Excel 12.0 Xml;HDR=NO;IMEX=1;\"";
OleDbConnection excelConnection = new OleDbConnection(excelConnectionString);
excelConnection.Open();
DataTable schema = excelConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sheetName = schema.Rows[0]["TABLE_NAME"].ToString();
OleDbCommand cmd = new OleDbCommand("Select * from [" + sheetName + "]", excelConnection);
OleDbDataReader dReader;
dReader = cmd.ExecuteReader();
using (SqlBulkCopy sqlBulk = new SqlBulkCopy(strConnection))
{
sqlBulk.ColumnMappings.Add(0,0);
sqlBulk.ColumnMappings.Add(1,1);
sqlBulk.ColumnMappings.Add(2,2);
sqlBulk.ColumnMappings.Add(3,3);
sqlBulk.DestinationTableName = "Dados";
sqlBulk.WriteToServer(dReader);
}
excelConnection.Close();
}
What I'm struggling with is that I need my code to find the columns in the Excel file and ignore the rows that I don't need.
I thought that these lines were enough for the job:
sqlBulk.ColumnMappings.Add(0,0);
sqlBulk.ColumnMappings.Add(1,1);
sqlBulk.ColumnMappings.Add(2,2);
sqlBulk.ColumnMappings.Add(3,3);
sqlBulk.DestinationTableName = "Dados";
Here's the table that I want to import :
Dealing with fluctuating source files is tricky. One option is to load the whole file into a raw staging table and then call a stored procedure that moves only the complete rows into your production (final) table. For example, select from the raw table only the rows where Data Mov, Data Valor, Descricao do Movimento, and Valor em EUR are all not null. If needed, you could add more validation, such as checking the first two columns for a date format and the last column for a numeric value. I just think it might be easier to do this in SQL than in .NET code.
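If the filtering has to stay in .NET instead, the same checks could be sketched roughly as below. This is only an illustration: `RawRowFilter` and `KeepCompleteRows` are made-up names, and it assumes the sheet was first read into a DataTable whose first two columns are the dates, third the description, and last the amount.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class RawRowFilter
{
    // Returns a copy of 'raw' that keeps only rows whose first two columns
    // parse as dates, whose third column is non-empty, and whose last
    // column parses as a number. Header and footer junk rows fail these checks.
    public static DataTable KeepCompleteRows(DataTable raw, CultureInfo culture)
    {
        DataTable result = raw.Clone(); // same columns, no rows
        int last = raw.Columns.Count - 1;

        foreach (DataRow row in raw.Rows)
        {
            DateTime parsedDate;
            decimal parsedValue;

            // Convert.ToString turns DBNull into an empty string, which fails every check.
            bool complete =
                DateTime.TryParse(Convert.ToString(row[0]), culture, DateTimeStyles.None, out parsedDate) &&
                DateTime.TryParse(Convert.ToString(row[1]), culture, DateTimeStyles.None, out parsedDate) &&
                !string.IsNullOrWhiteSpace(Convert.ToString(row[2])) &&
                decimal.TryParse(Convert.ToString(row[last]), NumberStyles.Number, culture, out parsedValue);

            if (complete)
            {
                result.ImportRow(row);
            }
        }
        return result;
    }
}
```

The filtered table can then be handed to SqlBulkCopy through its WriteToServer(DataTable) overload instead of the data reader. The culture argument matters because the date and decimal formats in the sheet depend on the regional settings it was created with.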

How to read an xls file using OLEDB?

I want to read all data from an xls file using OLEDB, but I don't have any experience in that.
string filename = @"C:\Users\sasa\Downloads\user-account-creation_2.xls";
string connString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filename + ";Extended Properties='Excel 8.0;HDR=YES'";
using (System.Data.OleDb.OleDbConnection conn = new System.Data.OleDb.OleDbConnection(connString))
{
conn.Open();
System.Data.OleDb.OleDbCommand selectCommand = new System.Data.OleDb.OleDbCommand("select * from [Sheet1$]", conn);
System.Data.OleDb.OleDbDataAdapter adapter = new System.Data.OleDb.OleDbDataAdapter(selectCommand);
DataTable dt = new DataTable();
adapter.Fill(dt);
int counter = 0;
foreach (DataRow row in dt.Rows)
{
String dataA = row["email"].ToString();
// String dataB= row["DataB"].ToString();
Console.WriteLine(dataA + " = ");
counter++;
if (counter >= 40) break;
}
}
I want to read all the data from the email column.
I get this error:
'Sheet$' is not a valid name. Make sure that it does not include invalid characters or punctuation and that it is not too long
Well, you don't have a sheet called Sheet1 do you? Your sheet seems to be called "email address from username" so your query should be....
Select * From ['email address from username$']
Also please don't use Microsoft.Jet.OLEDB.4.0 as it's pretty much obsolete now. Use Microsoft.ACE.OLEDB.12.0. If you specify Excel 12.0 in the extended properties it will open both .xls and .xlsx files.
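As a small sketch of that provider swap, the connection string could be built like this. `ExcelConnection.Build` is a name invented for the example, and it assumes the ACE provider (Access Database Engine) is installed on the machine.

```csharp
using System;

public static class ExcelConnection
{
    // Builds an ACE OLEDB connection string for a workbook.
    // hasHeaderRow controls HDR: YES treats the first row as column names.
    public static string Build(string workbookPath, bool hasHeaderRow)
    {
        return string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" +
            "Extended Properties=\"Excel 12.0;HDR={1}\"",
            workbookPath,
            hasHeaderRow ? "YES" : "NO");
    }
}
```

In the question's code, the result would replace connString, for example `ExcelConnection.Build(filename, true)`.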
You can also load the DataTable with a single line...
dt.Load(new System.Data.OleDb.OleDbCommand("Select * From ['email address from username$']", conn).ExecuteReader());
To read the names of the tables in the file use...
DataTable dtTablesList = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drTable in dtTablesList.Rows)
{
//Do Something
//But be careful as this will also return Defined Names. i.e ranges created using the Defined Name functionality
//Actual Sheet names end with $ or $'
if (drTable["Table_Name"].ToString().EndsWith("$") || drTable["Table_Name"].ToString().EndsWith("$'"))
{
Console.WriteLine(drTable["Table_Name"]);
}
}
Is it possible to use the Open XML SDK?
https://learn.microsoft.com/en-us/office/open-xml/how-to-retrieve-the-values-of-cells-in-a-spreadsheet

Query string selections for XLS spreadsheet C#

I am trying to grab cells in XLS spreadsheets, assign them to string arrays, then manipulate the data and export to multiple CSV files.
The trouble is that the XLS spreadsheet contains information that is not relevant: usable data doesn't start until row 17, and the columns have no headings, with just the default Sheet1.
I have looked at related questions and tried figuring it out myself with no success. The following code to read the XLS kinda works but is messy to work with as the row lengths vary from one XLS file to another and it is automatically pulling empty columns and rows.
CODE
public static void xlsReader()
{
string fileName = string.Format("{0}\\LoadsAvailable.xls", Directory.GetCurrentDirectory());
string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileName + ";" + @"Extended Properties='Excel 8.0;HDR=Yes;'";
string queryString = "SELECT * FROM [Sheet1$]";
using (OleDbConnection connection = new OleDbConnection(connectionString))
{
OleDbCommand command = new OleDbCommand(queryString, connection);
connection.Open();
OleDbDataReader reader = command.ExecuteReader();
int counter = 0;
while (reader.Read())
{
Console.WriteLine("Line " + counter + ":" + reader[28].ToString()); // Just for testing
counter++;
}
}
}
I could do a bunch of trickery with loops to get the required data, but there has to be a query string that could get the data from row 17 onward with only 8 columns (not 26 columns with 18 empty)?
I have tried many query string examples and cannot seem to get any to work with a starting row index or to filter out the empty data.
Here is a handy method that converts an excel file to a flat file.
You may want to change the connection string properties to suit your needs. I needed headers for my case.
Note that you will need the Access Database Engine installed on your machine. I needed the 32-bit version since the app I developed was 32-bit; you will probably need the version that matches your own app as well.
I parameterized the delimiter for the flat file, because I had cases where I didn't need a comma but a pipe symbol.
Example call: ConvertExcelToFlatFile(openFileName, savePath, '|'); // pipe delimited
// Converts Excel To Flat file
private void ConvertExcelToFlatFile(string excelFilePath, string csvOutputFile, char delimeter, int worksheetNumber = 1)
{
if (!File.Exists(excelFilePath)) throw new FileNotFoundException(excelFilePath);
if (File.Exists(csvOutputFile)) throw new ArgumentException("File exists: " + csvOutputFile);
// connection string
var cnnStr = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml; IMEX=1; HDR=NO\"", excelFilePath);
var cnn = new OleDbConnection(cnnStr);
// get schema, then data
var dt = new DataTable();
try
{
cnn.Open();
var schemaTable = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (schemaTable.Rows.Count < worksheetNumber) throw new ArgumentException("The worksheet number provided cannot be found in the spreadsheet");
string worksheet = schemaTable.Rows[worksheetNumber - 1]["table_name"].ToString().Replace("'", "");
string sql = String.Format("select * from [{0}]", worksheet);
var da = new OleDbDataAdapter(sql, cnn);
da.Fill(dt);
}
catch (Exception)
{
throw; // rethrow without resetting the stack trace
}
finally
{
// free resources
cnn.Close();
}
// write out CSV data
using (var wtr = new StreamWriter(csvOutputFile)) // disposes file handle when done
{
foreach (DataRow row in dt.Rows)
{
//MessageBox.Show(row.ItemArray.ToString());
bool firstLine = true;
foreach (DataColumn col in dt.Columns)
{
// write the delimiter before every column except the first
if (!firstLine)
{
wtr.Write(delimeter);
}
else
{
firstLine = false;
}
var data = row[col.ColumnName].ToString();//.Replace("\"", "\"\""); // replace " with ""
wtr.Write(String.Format("{0}", data));
}
wtr.WriteLine();
}
}
}
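For the row-17, 8-column part of the question specifically, the Excel OLEDB driver also accepts a cell range after the sheet name, so the unwanted rows and columns never reach the DataTable. A minimal sketch, where `SheetRange.BuildQuery` is a made-up helper name and the range A17:H65536 assumes the data sits in columns A to H (65536 being the last possible row in an .xls sheet):

```csharp
using System;

public static class SheetRange
{
    // Builds a query limited to a cell range, e.g.
    // SELECT * FROM [Sheet1$A17:H65536]
    // sheetName is the bare sheet name without the trailing '$'.
    public static string BuildQuery(string sheetName, string topLeft, string bottomRight)
    {
        return string.Format("SELECT * FROM [{0}${1}:{2}]", sheetName, topLeft, bottomRight);
    }
}
```

With HDR=NO in the connection string, the first row of the range is treated as data rather than column names, which is what you want when the sheet has no headings.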

Generating different files from Source in different formats using customized templates

I need to take source data (basically a database table with different columns and data loaded, or a CSV file) and create different files (let's call them destination files) in different formats (text, Excel, PDF, etc.), using mappings from columns in the source data to columns in the destination files. The format and number of columns would differ between destination files, but the source data would be the same for all. If a new destination file needs to be created in the future, the design should make that as simple as adding another template for the new file to the existing set of templates.
Any ideas on the above approach, please? The source data can be a SQL database table or a CSV file. It needs to be written in C#/.NET.
I like using my CSV reader, which takes a CSV file and puts it into a DataTable (returned inside a DataSet).
public class CSVReader
{
public DataSet ReadCSVFile(string fullPath, bool headerRow)
{
string path = fullPath.Substring(0, fullPath.LastIndexOf("\\") + 1);
string filename = fullPath.Substring(fullPath.LastIndexOf("\\") + 1);
DataSet ds = new DataSet();
try
{
if (File.Exists(fullPath))
{
// The Extended Properties value is wrapped in plain double quotes.
string ConStr = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}" + ";Extended Properties=\"Text;HDR={1};FMT=Delimited\"", path, headerRow ? "Yes" : "No");
string SQL = string.Format("SELECT * FROM {0}", filename);
OleDbDataAdapter adapter = new OleDbDataAdapter(SQL, ConStr);
adapter.Fill(ds, "TextFile");
ds.Tables[0].TableName = "Table1";
// Rename columns only when a table was actually loaded.
foreach (DataColumn col in ds.Tables["Table1"].Columns)
{
col.ColumnName = col.ColumnName.Replace(" ", "_");
}
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
return ds;
}
}

How can I create a schema.ini file? I need to load my .csv file into a DataGridView

I want to load a CSV file into a DataGridView. I need to create the schema.ini file for it, but I don't know how to create it.
Here is my code:
public DataTable exceldenAl(string excelFile)
{
try
{
string fileName = Path.GetFileName(excelFile);
string pathOnly = Path.GetDirectoryName(excelFile);
string cmd = "Select * From [" + fileName + "$]";
string cnstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + pathOnly + "\\;Extended Properties=\"Text;HDR=Yes;FORMAT=Delimited\"";
OleDbConnection ocn = new OleDbConnection(cnstr);
ocn.Open();
OleDbCommand command = new OleDbCommand(cmd,ocn);
OleDbDataAdapter adap = new OleDbDataAdapter(command);
DataTable dt = new DataTable();
dt.Locale = CultureInfo.CurrentCulture;
adap.Fill(dt);
return dt;
}
finally {
}
}
private void btnExcelReader_Click(object sender, EventArgs e)
{
string dosya;
string cevap;
openFileDialog1.ShowDialog();
dosya = openFileDialog1.FileName.ToString();
ClsExcelReader er = new ClsExcelReader();
cevap = er.exceldenAl(dosya).ToString();
dataGridView1.DataSource = cevap;
//listViewExcelOku.DataBindings =
}
}
Open up notepad and create a file similar to this:
[YourCSVFileName.csv]
ColNameHeader=True
Format=CSVDelimited
DateTimeFormat=dd-MMM-yyyy
Col1=A DateTime
Col2=B Text Width 100
Col3=C Text Width 100
Col4=D Long
Col5=E Double
Modify the above file to fit your specific data schema. Save it as SCHEMA.ini in the same directory where your *.CSV file is located.
Read this link (Importing CSV File Into Database); it is a good example for understanding how Schema.ini works.
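If you would rather generate the file from code than type it in Notepad, something along these lines could work. It is only a sketch: `SchemaIni.BuildSection` is a name made up for this example, and the column names and types are whatever you pass in.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SchemaIni
{
    // Builds the text of one schema.ini section for a CSV file.
    // Each entry in 'columns' is a (name, type) pair, e.g. ("A", "DateTime")
    // or ("B", "Text Width 100"); ColN numbering follows the list order.
    public static string BuildSection(string csvFileName, bool hasHeader,
        IList<KeyValuePair<string, string>> columns)
    {
        var sb = new StringBuilder();
        sb.Append("[" + csvFileName + "]\r\n");
        sb.Append("ColNameHeader=" + (hasHeader ? "True" : "False") + "\r\n");
        sb.Append("Format=CSVDelimited\r\n");
        for (int i = 0; i < columns.Count; i++)
        {
            sb.Append("Col" + (i + 1) + "=" + columns[i].Key + " " + columns[i].Value + "\r\n");
        }
        return sb.ToString();
    }
}
```

The returned text can be written next to the CSV with `File.WriteAllText(Path.Combine(folder, "schema.ini"), text)`, where `folder` is the directory that holds the CSV file.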
I wrote these Excel formulas to generate the content of this file, for when you can get hold of the Excel sheet with the column headers. It's pretty basic; add and remove features as desired. It assumes every column is text and that the file is delimited. With the column headers in row 1, insert the following formula (including all your options) in A2:
="[no headers.csv]
"&"ColNameHeader=false
"&"MaxScanRows=0
"&"Format=Delimited(;)
Col"&COLUMN()&"="&A1&" text
"
And the following formula in B2.
=A2&"Col"&COLUMN()&"="&B1&" text
"
Then drag to the right to get your basic schema.ini (it ends up in the rightmost cell). You can adjust options in the Excel cell below the column name. Each column has its own definition. When I copied the result to a text file it came wrapped in opening and closing quotes, so you may need to strip those.
