Check excel file contents before import - c#

I am using VS2005 C# ASP.NET and SQL Server 2005.
I have a function that imports Excel data. When the data in the spreadsheet is malformed, the import brings down the SQL Server database.
For example, with SELECT [Username], [Password] FROM [userlist$]: if the spreadsheet contains more than just [Username] in one column, or has stray values below the columns, the server crashes.
How can I check the file for these errors before uploading? I would prefer if/else statements for the checks.
Thank you for any help or examples.
Below is my code snippet for the excel upload:
if (FileImport.HasFile)
{
    // Get the name of the Excel spreadsheet to upload.
    string strFileName = Server.HtmlEncode(FileImport.FileName);
    // Get the extension of the Excel spreadsheet.
    string strExtension = Path.GetExtension(strFileName);
    // Validate the file extension.
    if (strExtension == ".xls" || strExtension == ".xlsx")
    {
        // Generate the file name to save.
        string strUploadFileName = "C:/Documents and Settings/user01/My Documents/Visual Studio 2005/WebSites/MajorProject/UploadFiles/" + DateTime.Now.ToString("yyyyMMddHHmmss") + strExtension;
        // Save the Excel spreadsheet on the server.
        FileImport.SaveAs(strUploadFileName);
        // Create the connection to the Excel workbook.
        string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strUploadFileName + ";Extended Properties=Excel 8.0;";
        using (OleDbConnection connection = new OleDbConnection(connStr))
        {
            string selectStmt = string.Format("Select [COLUMNS] FROM [userlist$]");
            OleDbCommand command = new OleDbCommand(selectStmt, connection);
            connection.Open();
            Console.WriteLine("Connection Opened");
            // Create a DbDataReader over the worksheet data.
            using (DbDataReader dr = command.ExecuteReader())
            {
                // SQL Server connection string
                string sqlConnectionString = "Data Source=<datasource>";
                // Bulk copy to SQL Server
                using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnectionString))
                {
                    bulkCopy.DestinationTableName = "UserDB";
                    bulkCopy.WriteToServer(dr);
                    return;
                }
            }
        }
    }
}

Depending on how much data the spreadsheet holds, you could read each value from the column you are interested in into a Dictionary, for example, and fail the import as soon as you find the first conflict.
For example, using your existing data reader:
Dictionary<string, string> cValues = new Dictionary<string, string>();
// Create DbDataReader to Data Worksheet
using (DbDataReader dr = command.ExecuteReader())
{
    while (dr.Read())
    {
        string sValue = dr[0].ToString();
        if (cValues.ContainsKey(sValue))
        {
            // There is a duplicate value, so bail
            throw new Exception("Duplicate value " + sValue);
        }
        else
        {
            cValues.Add(sValue, sValue);
        }
    }
}
// Now execute the reader on the command again to perform the upload
using (DbDataReader dr = command.ExecuteReader())
{
    // ... bulk copy as in the question ...
}
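Since the question asks for if/else checks, here is a minimal sketch of the same idea applied to a DataTable rather than a reader. It assumes the worksheet has first been loaded into a DataTable (for example with DataTable.Load), which can then be passed to SqlBulkCopy.WriteToServer once it passes. The names ImportValidator and FindProblem are hypothetical, and the rules (exact column list, no empty or duplicate key values) are only examples of what "inappropriate data" might mean here.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ImportValidator
{
    // Returns null when the table looks importable, otherwise a message
    // describing the first problem found. (Hypothetical helper.)
    public static string FindProblem(DataTable table, string[] expectedColumns, string keyColumn)
    {
        if (table.Columns.Count != expectedColumns.Length)
        {
            return "Expected " + expectedColumns.Length + " columns but found " + table.Columns.Count + ".";
        }

        for (int i = 0; i < expectedColumns.Length; i++)
        {
            string actual = table.Columns[i].ColumnName;
            if (!string.Equals(actual, expectedColumns[i], StringComparison.OrdinalIgnoreCase))
            {
                return "Unexpected column '" + actual + "' at position " + (i + 1) + ".";
            }
        }

        // Dictionary rather than HashSet so the sketch also compiles on .NET 2.0.
        Dictionary<string, bool> seen = new Dictionary<string, bool>(StringComparer.OrdinalIgnoreCase);
        int rowNumber = 1; // row 1 is the header row in the spreadsheet
        foreach (DataRow row in table.Rows)
        {
            rowNumber++;
            string key = Convert.ToString(row[keyColumn]).Trim();
            if (key.Length == 0)
            {
                return "Empty value in column " + keyColumn + " at row " + rowNumber + ".";
            }
            else if (seen.ContainsKey(key))
            {
                return "Duplicate value '" + key + "' in column " + keyColumn + " at row " + rowNumber + ".";
            }
            else
            {
                seen.Add(key, true);
            }
        }

        return null;
    }
}
```

A caller would run the bulk copy only when FindProblem returns null, and otherwise show the returned message to the user.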

You should solve the cause of the problem, which is in the database. The data should be stored through stored procedures that handle invalid input.
Also, constraints on the database tables should prevent nonsensical data from being stored.

Related

how to Read xls file using OLEDB?

I want to read all data from an xls file using OLEDB, but I don't have any experience in that.
string filename = @"C:\Users\sasa\Downloads\user-account-creation_2.xls";
string connString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filename + ";Extended Properties='Excel 8.0;HDR=YES'";
using (System.Data.OleDb.OleDbConnection conn = new System.Data.OleDb.OleDbConnection(connString))
{
    conn.Open();
    System.Data.OleDb.OleDbCommand selectCommand = new System.Data.OleDb.OleDbCommand("select * from [Sheet1$]", conn);
    System.Data.OleDb.OleDbDataAdapter adapter = new System.Data.OleDb.OleDbDataAdapter(selectCommand);
    DataTable dt = new DataTable();
    adapter.Fill(dt);
    int counter = 0;
    foreach (DataRow row in dt.Rows)
    {
        String dataA = row["email"].ToString();
        // String dataB= row["DataB"].ToString();
        Console.WriteLine(dataA + " = ");
        counter++;
        if (counter >= 40) break;
    }
}
I want to read all the data from the email column, but I get this error:
'Sheet$' is not a valid name. Make sure that it does not include invalid characters or punctuation and that it is not too long
Well, you don't have a sheet called Sheet1, do you? Your sheet seems to be called "email address from username", so your query should be:
Select * From ['email address from username$']
Also please don't use Microsoft.Jet.OLEDB.4.0 as it's pretty much obsolete now. Use Microsoft.ACE.OLEDB.12.0. If you specify Excel 12.0 in the extended properties it will open both .xls and .xlsx files.
You can also load the DataTable with a single line...
dt.Load(new System.Data.OleDb.OleDbCommand("Select * From ['email address from username$']", conn).ExecuteReader());
To read the names of the tables in the file use...
DataTable dtTablesList = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
foreach (DataRow drTable in dtTablesList.Rows)
{
    // Do something.
    // Be careful: this also returns Defined Names, i.e. ranges created
    // using the Defined Name functionality.
    // Actual sheet names end with $ or $'
    if (drTable["Table_Name"].ToString().EndsWith("$") || drTable["Table_Name"].ToString().EndsWith("$'"))
    {
        Console.WriteLine(drTable["Table_Name"]);
    }
}
Is it possible to use the Open XML SDK?
https://learn.microsoft.com/en-us/office/open-xml/how-to-retrieve-the-values-of-cells-in-a-spreadsheet

SqlBulkCopy - Add rows along with additional values to database

I am trying a code snippet that adds Excel data to a SQL Server database using SqlBulkCopy. The snippet is given below:
OleDbConnection connection = null;
string FilePath = "";
try
{
    if (FileUpload1.HasFile)
    {
        FileUpload1.SaveAs(Server.MapPath("~/UploadedFolder/" + FileUpload1.FileName));
        FilePath = Server.MapPath("~/UploadedFolder/" + FileUpload1.FileName);
    }
    string path = FilePath;
    // Connection String to Excel Workbook
    string excelConnectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=Excel 8.0", path);
    connection = new OleDbConnection();
    connection.ConnectionString = excelConnectionString;
    OleDbCommand command = new OleDbCommand("select * from [Sheet1$]", connection);
    connection.Open();
    // Create DbDataReader to Data Worksheet
    DbDataReader dr = command.ExecuteReader();
    // SQL Server Connection String
    string sqlConnectionString = @"Data Source=sample;Initial Catalog=ExcelImport;User ID=sample;Password=sample";
    // Bulk Copy to SQL Server
    SqlBulkCopy bulkInsert = new SqlBulkCopy(sqlConnectionString);
    bulkInsert.DestinationTableName = "Customer_Table";
    bulkInsert.WriteToServer(dr);
}
catch (Exception ex)
{
    Response.Write(ex.Message);
}
finally
{
    // connection is still null if an exception was thrown before it was created.
    if (connection != null)
    {
        connection.Close();
    }
    Array.ForEach(Directory.GetFiles(Server.MapPath("~/UploadedFolder/")), File.Delete);
}
This adds the data from the Excel file to the SQL Server table. But my requirement is to add the values from the Excel sheet plus additional values of my own, such as an autogenerated StudentID.
So my question is: how do I add new values (say StudentID, BatchCode, etc.) along with the values read from Excel? These values need to be added to each row of the Excel data.
Example:
The Excel sheet contains the following columns:
CustomerID,City,Country,PostalCode
Now I need to add rows to SQL Server with some new columns:
StudentID,CustomerID,BatchCode,City,Country,Email,PostalCode
How can I do this?
Please help.
You could do the following:
1. Load the Excel data into a DataTable.
2. Add the remaining columns to the DataTable.
3. Set the new column values.
4. SqlBulkCopy the DataTable into SQL Server.
Something like this:
DbDataReader dr = command.ExecuteReader();
DataTable table = new DataTable("Customers");
table.Load(dr);
table.Columns.Add("StudentId", typeof(int));
table.Columns.Add("BatchCode", typeof(string));
table.Columns.Add("Email", typeof(string));
// GetStudentId, GetBatchCode and GetEmail are placeholders for your own logic.
foreach (DataRow row in table.Rows)
{
    row["StudentId"] = GetStudentId(row);
    row["BatchCode"] = GetBatchCode(row);
    row["Email"] = GetEmail(row);
}
// SQL Server Connection String
string sqlConnectionString = @"Data Source=sample;Initial Catalog=ExcelImport;User ID=sample;Password=sample";
// Bulk Copy to SQL Server
SqlBulkCopy bulkInsert = new SqlBulkCopy(sqlConnectionString);
bulkInsert.DestinationTableName = "Customer_Table";
bulkInsert.WriteToServer(table);
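One thing to watch with this approach: when no column mappings are defined, SqlBulkCopy matches source and destination columns by ordinal position, and the columns added above land at the end of the DataTable. A minimal sketch of reordering the DataTable before the copy follows. It assumes the destination table's columns are in the order listed in the question; ColumnOrder and Arrange are hypothetical names.

```csharp
using System;
using System.Data;

public static class ColumnOrder
{
    // Moves the named columns so they appear in exactly this order,
    // starting at position 0. (Hypothetical helper.)
    public static void Arrange(DataTable table, string[] destinationOrder)
    {
        for (int i = 0; i < destinationOrder.Length; i++)
        {
            // The string indexer returns null when the column does not exist.
            DataColumn column = table.Columns[destinationOrder[i]];
            if (column == null)
            {
                throw new ArgumentException("Missing column: " + destinationOrder[i]);
            }
            // SetOrdinal shifts the other columns; positions 0..i-1 are already final.
            column.SetOrdinal(i);
        }
    }
}
```

Calling Arrange(table, new string[] { "StudentID", "CustomerID", "BatchCode", "City", "Country", "Email", "PostalCode" }) before WriteToServer would line the columns up. The alternative is to leave the DataTable as it is and add one entry per column to bulkInsert.ColumnMappings, mapping each source column name to its destination column name.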

Reading an Excel file into a DataTable returns empty fields in the DataRows

I'm importing an Excel file into a DataTable and then reading the information I need from each DataRow, which is then inserted into a list.
I have a method that I call whenever I need to import an Excel (.xlsx or .xls) file into a DataTable. I use it in six or seven other places in my program, so I'm pretty sure there are no errors there.
My problem is that when I access a DataRow in this specific DataTable, the first few fields contain values, but everything else is just null.
If I look at it in the Locals window, the DataRow looks like this:
[0] {"Some string value"}
[1] {}
[2] {}
[3] {}
When it should look like this:
[0] {"Some string value"}
[1] {"Another string value"}
[2] {"Foo"}
[3] {"Bar"}
Here is the method that handles the import:
public List<DataTable> ImportExcel(string FileName)
{
    List<DataTable> _dataTables = new List<DataTable>();
    string _ConnectionString = string.Empty;
    string _Extension = Path.GetExtension(FileName);
    // Check the extension; for XLS, connect using Jet OleDb.
    if (_Extension.Equals(".xls", StringComparison.CurrentCultureIgnoreCase))
    {
        _ConnectionString =
            "Provider=Microsoft.Jet.OLEDB.4.0; Data Source={0};Extended Properties=Excel 8.0";
    }
    // For XLSX, use ACE OleDb.
    else if (_Extension.Equals(".xlsx", StringComparison.CurrentCultureIgnoreCase))
    {
        _ConnectionString =
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=Excel 8.0";
    }
    DataTable dataTable = null;
    var count = 0;
    using (OleDbConnection oleDbConnection =
        new OleDbConnection(string.Format(_ConnectionString, FileName)))
    {
        oleDbConnection.Open();
        // Get the metadata.
        // This DataTable describes the sheets in the Excel file.
        DataTable dbSchema = oleDbConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables_Info, null);
        foreach (DataRow item in dbSchema.Rows)
        {
            // Read data from Excel into a DataTable.
            using (OleDbCommand oleDbCommand = new OleDbCommand())
            {
                oleDbCommand.Connection = oleDbConnection;
                oleDbCommand.CommandText = string.Format("SELECT * FROM [{0}]",
                    item["TABLE_NAME"].ToString());
                using (OleDbDataAdapter oleDbDataAdapter = new OleDbDataAdapter())
                {
                    if (count < 3)
                    {
                        oleDbDataAdapter.SelectCommand = oleDbCommand;
                        dataTable = new DataTable(item["TABLE_NAME"].ToString());
                        oleDbDataAdapter.Fill(dataTable);
                        _dataTables.Add(dataTable);
                        count++;
                    }
                }
            }
        }
    }
    return _dataTables;
}
Any thoughts?
You may need to add ;IMEX=1 to "Extended Properties" in your connection string. But ultimately, reading Excel files with OleDb is flimsy at best. You should use a third-party library that handles them natively, such as:
NPOI for XLS https://code.google.com/p/npoi/
EPPlus for XLSX http://epplus.codeplex.com/
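For the IMEX=1 suggestion, here is a minimal sketch of building the connection string with that flag set. It assumes the ACE provider is installed and uses the commonly documented "Excel 8.0" and "Excel 12.0 Xml" values for .xls and .xlsx; ExcelConnectionStrings and Build are hypothetical names. IMEX=1 is generally described as making the driver read mixed-type columns as text, so treat that as the intent rather than a guarantee.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Builds an OLEDB connection string with HDR and IMEX set.
    // (Hypothetical helper; provider availability is an assumption.)
    public static string Build(string path)
    {
        string extension = Path.GetExtension(path);
        string excelVersion;
        if (string.Equals(extension, ".xls", StringComparison.OrdinalIgnoreCase))
        {
            excelVersion = "Excel 8.0";
        }
        else if (string.Equals(extension, ".xlsx", StringComparison.OrdinalIgnoreCase))
        {
            excelVersion = "Excel 12.0 Xml";
        }
        else
        {
            throw new ArgumentException("Unsupported extension: " + extension);
        }

        // Extended Properties holds several values, so it must be quoted.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path
            + ";Extended Properties=\"" + excelVersion + ";HDR=YES;IMEX=1\";";
    }
}
```

The returned string can be passed straight to the OleDbConnection constructor in place of the two hard-coded format strings in the question.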
Turns out the fault was with the Excel document.
Apparently Excel documents have a hidden table called Shared Strings.
It was because of this table that I couldn't find the values I was looking for.

Remove need of creating backup file while importing Excel spreadsheet to SQL Server database

I am new to ASP.NET and C# and I am using VS2005 C# and SQL Server 2005.
I have a web application with a function that imports data from an uploaded spreadsheet into the database.
However, my current function copies the uploaded spreadsheet into a directory and then reads the contents from that copy.
I would like to change it so that it does not create a backup copy of the uploaded Excel file, and instead reads the contents directly from the uploaded file.
Below is my code snippet for the import function for .xls and .xlsx spreadsheet:
if (FileImport.HasFile)
{
    // Get the name of the Excel spreadsheet to upload.
    string strFileName = Server.HtmlEncode(FileImport.FileName);
    // Get the extension of the Excel spreadsheet.
    string strExtension = Path.GetExtension(strFileName);
    // Validate the file extension.
    if (strExtension == ".xls" || strExtension == ".xlsx")
    {
        // Generate the file name to save.
        string strUploadFileName = "C:/Documents and Settings/user01/My Documents/Visual Studio 2005/WebSites/MajorProject/UploadFiles/" + DateTime.Now.ToString("yyyyMMddHHmmss") + strExtension;
        // Save the Excel spreadsheet on the server.
        FileImport.SaveAs(strUploadFileName);
        // Create the connection to the Excel workbook.
        string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + strUploadFileName + ";Extended Properties=Excel 8.0;";
        using (OleDbConnection connection = new OleDbConnection(connStr))
        {
            string selectStmt = string.Format("Select [COLUMNS] FROM [userlist$]");
            OleDbCommand command = new OleDbCommand(selectStmt, connection);
            connection.Open();
            Console.WriteLine("Connection Opened");
            // Create a DbDataReader over the worksheet data.
            using (DbDataReader dr = command.ExecuteReader())
            {
                // SQL Server connection string
                string sqlConnectionString = "Data Source=<datasource>";
                // Bulk copy to SQL Server
                using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnectionString))
                {
                    bulkCopy.DestinationTableName = "UserDB";
                    bulkCopy.WriteToServer(dr);
                    return;
                }
            }
        }
    }
}
Below is my code snippet for the import function for .csv spreadsheet:
if (strExtension == ".csv")
{
    // Generate the file name to save.
    string dir = @"C:\Documents and Settings\user01\My Documents\Visual Studio 2005\WebSites\MajorProject\UploadFiles\";
    string mycsv = DateTime.Now.ToString("yyyyMMddHHmmss") + strExtension;
    // Save the CSV file on the server.
    BaanImport.SaveAs(dir + mycsv);
    // Create the connection to the directory that holds the CSV file.
    string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + dir + ";Extended Properties=Text;";
    using (OleDbConnection ExcelConnection = new OleDbConnection(connStr))
    {
        string selectStmt = string.Format("SELECT [COLUMNS] FROM " + mycsv);
        OleDbCommand ExcelCommand = new OleDbCommand(selectStmt, ExcelConnection);
        OleDbDataAdapter ExcelAdapter = new OleDbDataAdapter(ExcelCommand);
        ExcelConnection.Open();
        using (DbDataReader dr = ExcelCommand.ExecuteReader())
        {
            // SQL Server connection string
            string sqlConnectionString = "Data Source=<datasource>";
            // Bulk copy to SQL Server
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnectionString))
            {
                bulkCopy.DestinationTableName = "UserDB";
                bulkCopy.WriteToServer(dr);
                return;
            }
        }
    }
}
May I know how I could change it so that it does not create a copy in the directory @"C:\Documents and Settings\user01\My Documents\Visual Studio 2005\WebSites\MajorProject\UploadFiles\", but reads the data directly from the uploaded file instead?
Thank you for any help in advance.
Assuming that the object you are working with for the imported files is an HttpPostedFile, then you can use its InputStream to read the file directly from its uploaded location.
See this MSDN documentation for more information and sample code.
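For the .csv case, a minimal sketch of reading straight from a stream into a DataTable, with no file saved to disk, might look like this. It assumes the stream comes from something like FileImport.PostedFile.InputStream and that the CSV is simple (comma-separated, header row, no quoted fields containing commas); CsvStreamReader and Read are hypothetical names. Note that the Jet and ACE OLEDB providers open a file by path, so this route bypasses OLEDB, and the binary .xls/.xlsx formats would need a library that can read from a stream.

```csharp
using System;
using System.Data;
using System.IO;

public static class CsvStreamReader
{
    // Naive CSV parser: splits on commas and does not handle quoted fields.
    // (Hypothetical helper, for illustration only.)
    public static DataTable Read(Stream input)
    {
        DataTable table = new DataTable();
        // Disposing the StreamReader also closes the underlying stream.
        using (StreamReader reader = new StreamReader(input))
        {
            string header = reader.ReadLine();
            if (header == null)
            {
                return table; // empty upload
            }
            foreach (string name in header.Split(','))
            {
                table.Columns.Add(name.Trim(), typeof(string));
            }

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length == 0)
                {
                    continue; // skip blank lines
                }
                string[] fields = line.Split(',');
                if (fields.Length != table.Columns.Count)
                {
                    throw new InvalidDataException("Wrong number of fields in line: " + line);
                }
                DataRow row = table.NewRow();
                for (int i = 0; i < fields.Length; i++)
                {
                    row[i] = fields[i].Trim();
                }
                table.Rows.Add(row);
            }
        }
        return table;
    }
}
```

The resulting DataTable can be handed to SqlBulkCopy.WriteToServer in the same way the question passes the data reader.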

How to Open a CSV or XLS with Jet OLEDB and attain Table Lock?

I am trying to figure out how to read/write lock a CSV or XLS file when I read it as a Database via Jet OLEDB.
The following code will open a CSV as a DB and load it into a DataTable object:
private DataTable OpenCSVasDB(string fullFileName)
{
    string file = Path.GetFileName(fullFileName);
    string dir = Path.GetDirectoryName(fullFileName);
    string cStr = "Provider=Microsoft.Jet.OLEDB.4.0;"
        + "Data Source=\"" + dir + "\\\";"
        + "Extended Properties=\"text;HDR=Yes;FMT=Delimited\";";
    string sqlStr = "SELECT * FROM [" + file + "]";
    OleDbDataAdapter da;
    DataTable dt = new DataTable();
    try
    {
        da = new OleDbDataAdapter(sqlStr, cStr);
        da.Fill(dt);
    }
    catch { dt = null; }
    return dt;
}
What I want to make sure of is that, while I have the CSV or XLS file open, I hold a read/write lock on the table (i.e. the file), so that any other application that tries to read or write the file has to wait its turn.
Does this happen automatically? If not, what do I need to do to make sure it does?
By the way, I'm working in C#/.NET 2.0, if that makes any difference.
Update: To clarify my requirements:
- XLS file (because I need SELECT and UPDATE functionality; CSV can only SELECT and INSERT)
- Lock the XLS file while the DB is open (I can't have multiple threads and/or processes stepping on each other's changes)
- Read into a DataTable object (for ease of working)
OLEDB's Jet driver locks flat files while there's an open OleDbDataReader to them. To verify this, look at the VerifyFileLockedByOleDB method in the code sample below. Note that having an open OleDbConnection is not enough-- you have to have an open Reader.
That said, your code posted above does not keep an open connection, since it uses OleDbDataAdapter.Fill() to quickly connect to the data source, suck out all the data, and then disconnect. The reader is never left open. The file is only locked for the (short) time that Fill() is running.
Furthermore, even if you open the reader yourself and pass it into DataTable.Load(), that method will close your DataReader for you once it's done, meaning that the file gets unlocked.
So if you really want to keep the file locked and still use a DataTable, you'll need to manually populate the datatable (schema and rows!) from an IDataReader, instead of relying on DataAdapter.Fill() or DataTable.Load().
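The manual population described above can be sketched roughly as follows. This is a minimal example, not a drop-in replacement for DataTable.Load(): it copies only column names and types (no keys or constraints), and the names ReaderCopy and ToDataTable are hypothetical. The point is that the method never closes the reader, so the caller decides when the reader, and with it any lock, is released.

```csharp
using System;
using System.Data;

public static class ReaderCopy
{
    // Copies the schema (column names and types) and all remaining rows
    // from the reader into a new DataTable, leaving the reader open.
    public static DataTable ToDataTable(IDataReader reader)
    {
        DataTable table = new DataTable();
        for (int i = 0; i < reader.FieldCount; i++)
        {
            table.Columns.Add(reader.GetName(i), reader.GetFieldType(i));
        }

        object[] values = new object[reader.FieldCount];
        while (reader.Read())
        {
            reader.GetValues(values);
            // Rows.Add copies the values into the table, so the buffer can be reused.
            table.Rows.Add(values);
        }
        return table;
    }
}
```

With an OleDbDataReader obtained from cmd.ExecuteReader(), the caller would work with the returned DataTable and only then dispose the reader and connection.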
Anyway, here's a code sample which shows:
- your original code
- an example which won't work because DataTable.Load() will close the DataReader and unlock the file
- an alternate approach which will keep the file locked while you're working with the data, by operating at the row level with a DataReader rather than using a DataTable
UPDATE: It looks like keeping a DataReader open will prevent the same process from opening the file, but another process (e.g. Excel) can still open (and write to!) the file. Go figure. At this point, if you really want to keep the file locked, I'd suggest using something other than OLEDB that gives you more fine-grained control over how (and when!) the file is opened and closed. I'd suggest the CSV reader from http://www.codeproject.com/KB/database/CsvReader.aspx, which is well-tested and fast, and comes with source code, so if you need to change the file locking, opening, or closing, you can do so.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.OleDb;
using System.Data;
using System.IO;

namespace TextFileLocking
{
    class Program
    {
        // Your original code.
        private static DataTable OpenCSVasDB(string fullFileName)
        {
            string file = Path.GetFileName(fullFileName);
            string dir = Path.GetDirectoryName(fullFileName);
            string cStr = "Provider=Microsoft.Jet.OLEDB.4.0;"
                + "Data Source=\"" + dir + "\\\";"
                + "Extended Properties=\"text;HDR=Yes;FMT=Delimited\";";
            string sqlStr = "SELECT * FROM [" + file + "]";
            OleDbDataAdapter da;
            DataTable dt = new DataTable();
            try
            {
                da = new OleDbDataAdapter(sqlStr, cStr);
                da.Fill(dt);
            }
            catch { dt = null; }
            return dt;
        }

        // Won't work: DataTable.Load() closes the reader.
        private static DataTable OpenCSVasDBWithLockWontWork(string fullFileName, out OleDbDataReader reader)
        {
            string file = Path.GetFileName(fullFileName);
            string dir = Path.GetDirectoryName(fullFileName);
            string cStr = "Provider=Microsoft.Jet.OLEDB.4.0;"
                + "Data Source=\"" + dir + "\\\";"
                + "Extended Properties=\"text;HDR=Yes;FMT=Delimited\";";
            string sqlStr = "SELECT * FROM [" + file + "]";
            OleDbConnection openConnection = new OleDbConnection(cStr);
            reader = null;
            DataTable dt = new DataTable();
            try
            {
                openConnection.Open();
                OleDbCommand cmd = new OleDbCommand(sqlStr, openConnection);
                reader = cmd.ExecuteReader();
                dt.Load(reader); // this will close the reader and unlock the file!
                return dt;
            }
            catch
            {
                return null;
            }
        }

        // Keeps the reader open while each row is processed.
        private static void OpenCSVasDBWithLock(string fullFileName, Action<IDataReader> dataRowProcessor)
        {
            string file = Path.GetFileName(fullFileName);
            string dir = Path.GetDirectoryName(fullFileName);
            string cStr = "Provider=Microsoft.Jet.OLEDB.4.0;"
                + "Data Source=\"" + dir + "\\\";"
                + "Extended Properties=\"text;HDR=Yes;FMT=Delimited\";";
            string sqlStr = "SELECT * FROM [" + file + "]";
            using (OleDbConnection conn = new OleDbConnection(cStr))
            {
                OleDbCommand cmd = new OleDbCommand(sqlStr, conn);
                conn.Open();
                using (OleDbDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        dataRowProcessor(reader);
                    }
                }
            }
        }

        private static void VerifyFileLockedByOleDB(string fullFileName)
        {
            string file = Path.GetFileName(fullFileName);
            string dir = Path.GetDirectoryName(fullFileName);
            string cStr = "Provider=Microsoft.Jet.OLEDB.4.0;"
                + "Data Source=\"" + dir + "\\\";"
                + "Extended Properties=\"text;HDR=Yes;FMT=Delimited\";";
            string sqlStr = "SELECT * FROM [" + file + "]";
            using (OleDbConnection conn = new OleDbConnection(cStr))
            {
                OleDbCommand cmd = new OleDbCommand(sqlStr, conn);
                conn.Open();
                using (OleDbDataReader reader = cmd.ExecuteReader())
                {
                    File.OpenRead(fullFileName); // should throw an exception
                    while (reader.Read())
                    {
                        File.OpenRead(fullFileName); // should throw an exception
                        StringBuilder b = new StringBuilder();
                        for (int i = 0; i < reader.FieldCount; i++)
                        {
                            b.Append(reader.GetValue(i));
                            b.Append(",");
                        }
                        string line = b.ToString().Substring(0, b.Length - 1);
                        Console.WriteLine(line);
                    }
                }
            }
        }

        static void Main(string[] args)
        {
            string filename = Directory.GetCurrentDirectory() + "\\SomeText.CSV";
            try
            {
                VerifyFileLockedByOleDB(filename);
            }
            catch { } // ignore exception due to locked file
            OpenCSVasDBWithLock(filename, delegate(IDataReader row)
            {
                StringBuilder b = new StringBuilder();
                for (int i = 0; i < row.FieldCount; i++)
                {
                    b.Append(row[i].ToString());
                    b.Append(",");
                }
                string line = b.ToString().Substring(0, b.Length - 1);
                Console.WriteLine(line);
            });
        }
    }
}
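As one illustration of the "something other than OLEDB" route suggested above, here is a minimal sketch that opens the file itself with FileShare.None and processes it line by line while the stream is open. FileShare.None is documented as declining any other request to open the file until it is closed, though how strictly that is enforced can depend on the platform and on the other application. ExclusiveFile and ProcessLinesLocked are hypothetical names, and the sketch does no CSV parsing of its own.

```csharp
using System;
using System.IO;

public static class ExclusiveFile
{
    // Opens the file with no sharing allowed and hands each line to the
    // callback while the stream (and therefore the lock) is still open.
    public static void ProcessLinesLocked(string path, Action<string> handleLine)
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
        using (StreamReader reader = new StreamReader(stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                handleLine(line);
            }
        } // the file is released here, when the stream is disposed
    }
}
```

Because the application owns the FileStream, an IOException raised when opening it can be attributed to the file being in use elsewhere, which is harder to tell apart when the provider opens the file on your behalf.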
UPDATE: The following does not appear to lock my DB as I had hoped...
After much more digging, I found this page:
ADO Provider Properties and Settings
It says:
Jet OLEDB:Database Locking Mode
A Long value (read/write) that specifies the mode used when locking the database to read or modify records.
The Jet OLEDB:Database Locking Mode property can be set to any of the following values:
Page-level Locking 0
Row-level Locking 1
Note: A database can only be open in one mode at a time. The first user to open the database determines the locking mode to be used while the database is open.
So I assume that my code would get changed to:
string cStr = "Provider=Microsoft.Jet.OLEDB.4.0;"
    + "Data Source=\"" + dir + "\\\";"
    + "Extended Properties=\"text;HDR=Yes;FMT=Delimited\";"
    + "Jet OLEDB:Database Locking Mode=0";
Which should give me Page-level locking. If I wanted Row-level locking, I'd switch the value to 1.
Unfortunately, this doesn't actually appear to do any table/row/page locking when opening a CSV file as a DB.
Ok, so that new function you wrote kinda works, but I still end up with a "Race condition" which then causes an exception to be thrown. So in this section of the code:
using (OleDbConnection conn = new OleDbConnection(cStr))
{
    OleDbCommand cmd = new OleDbCommand(sqlStr, conn);
    conn.Open();
    using (OleDbDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            reader.GetString(0); // breakpoint here
        }
    }
}
I put a breakpoint on the line with the comment "breakpoint here" and then ran the program. I then located the CSV file in File Explorer and tried to open it with Excel. It causes Excel to wait for the file to be unlocked, which is good.
But here's the bad part. When I clear the breakpoint and then tell it to continue debugging, Excel sneaks in, grabs a lock on the file and causes an exception in my running code.
(The exception is: The Microsoft Jet database engine cannot open the file ''. It is already opened exclusively by another user, or you need permission to view its data.)
I guess I can always wrap that code in a try-catch block, but when the exception occurs, I won't know if it is a legitimate exception or one caused by this weird condition.
The exception seems to occur when the reader has finished reading (after it reads the last row, but while still inside the "using (OleDbDataReader reader = cmd.ExecuteReader())" block).
