Parsing CSV using OleDb in C#

I know this topic has been done to death, but I am at my wits' end.
I need to parse a CSV. It's a pretty average CSV, and the parsing logic was written using OleDb by another developer who swore that it worked before he went on vacation :)
CSV sample:
Dispatch Date,Master Tape,Master Time Code,Material ID,Channel,Title,Version,Duration,Language,Producer,Edit Date,Packaging,1 st TX,Last TX,Usage,S&P Rating,Comments,Replace,Event TX Date,Alternate Title
,a,b,c,d,e,f,g,h,,i,,j,k,,l,m,,n,
The problem I have is that I get various errors depending on the connection string I try.
When I try the connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source="D:\TEST.csv\";Extended Properties="text;HDR=No;FMT=Delimited"
I get the error:
'D:\TEST.csv' is not a valid path. Make sure that the path name is spelled correctly and that you are connected to the server on which the file resides.
When I try the connection string:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\TEST.csv;Extended Properties=Excel 12.0;
or the connection string
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\TEST.csv;Extended Properties=Excel 8.0;
I get the error:
External table is not in the expected format.
I am considering throwing away all the code and starting from scratch. Is there something obvious I am doing wrong?

You should indicate only the directory name in your connection string. The file name will be used to query:
var filename = @"c:\work\test.csv";
// The data source is the directory; the file name is used as the table name.
var connString = string.Format(
    @"Provider=Microsoft.Jet.OleDb.4.0; Data Source={0};Extended Properties=""Text;HDR=YES;FMT=Delimited""",
    Path.GetDirectoryName(filename)
);
using (var conn = new OleDbConnection(connString))
{
    conn.Open();
    var query = "SELECT * FROM [" + Path.GetFileName(filename) + "]";
    using (var adapter = new OleDbDataAdapter(query, conn))
    {
        var ds = new DataSet("CSV File");
        adapter.Fill(ds);
    }
}
And instead of OleDB you could use a decent CSV parser (or another one).

An alternative solution is to use the TextFieldParser class (part of the .NET Framework itself): https://learn.microsoft.com/en-us/dotnet/api/microsoft.visualbasic.fileio.textfieldparser
This way you do not have to rely on the other developer who has gone on holiday. I have used it many times and have not hit any snags.
I have posted this from work (hence I cannot post an example snippet. I will do so when I go home this evening).
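A minimal sketch of the TextFieldParser approach, assuming a comma-delimited file with optional quoted fields. The class and method names (CsvSketch, ReadRows, ReadFile) are made up for illustration; only TextFieldParser and its members come from the framework.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvSketch
{
    // Reads every record from a comma-delimited source into a list of string arrays.
    // Empty fields come back as empty strings, so column positions are preserved.
    public static List<string[]> ReadRows(TextReader source)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // allows "a,b" as a single field
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    // Convenience wrapper for a file on disk (the path is supplied by the caller).
    public static List<string[]> ReadFile(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return ReadRows(reader);
        }
    }
}
```

Calling CsvSketch.ReadFile(@"D:\TEST.csv") would return the header row as the first array and one array per data row after it. On the .NET Framework this needs a reference to the Microsoft.VisualBasic assembly.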

It seems your first row contains the column names, so you need to include the HDR=YES property, like this:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\TEST.csv;Extended Properties="Excel 12.0;HDR=YES";

Try the connection string:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\TEST.csv;Extended Properties=\"Excel 8.0;IMEX=1\""

var s = @"D:\TEST.csv";
string dir = Path.GetDirectoryName(s);
string sConnection = "Provider=Microsoft.Jet.OLEDB.4.0;"
+ "Data Source=\"" + dir + "\\\";"
+ "Extended Properties=\"text;HDR=YES;FMT=Delimited\"";

Related

Using the Contain Function in C#

I am pulling an Excel file into my project. Everything was working fine, but now I have to grab all of the cells that contain WMI. I am having trouble with stbQuery. Does anyone know the correct syntax for selecting only the rows that contain certain characters?
OleDbConnection con = new OleDbConnection ("provider=Microsoft.Jet.OLEDB.4.0;data source=" + txtFileName.Text + ";Extended Properties=Excel 8.0;");
StringBuilder stbQuery = new StringBuilder();
stbQuery.Append("Select [Wireless Number (uneditable)], [* Last Name] FROM [" + txtSheets.Text + "$] WHERE [* Last Name] = LIKE '%WMI%' ");
This is a SQL syntax issue with LIKE.
The LIKE keyword doesn't take an = in SQL.
Example:
SELECT * FROM Customers
WHERE [* Last Name] LIKE 's%';
UPDATE
Look at D Stanley Comment.
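Applied to the query from the question, the fix might look like the sketch below. It also passes the search text as a parameter instead of concatenating it into the SQL, which is my addition rather than part of the answer above. LikeQuerySketch, BuildQuery and BuildCommand are hypothetical names; OleDb parameters are positional, so the parameter name is only a label.

```csharp
using System.Data.OleDb;

public static class LikeQuerySketch
{
    // Builds the SELECT text; note there is no "=" before LIKE.
    public static string BuildQuery(string sheetName)
    {
        return "SELECT [Wireless Number (uneditable)], [* Last Name] " +
               "FROM [" + sheetName + "$] " +
               "WHERE [* Last Name] LIKE ?";
    }

    // Creates a command whose single positional parameter is %fragment%.
    public static OleDbCommand BuildCommand(OleDbConnection con, string sheetName, string fragment)
    {
        var cmd = new OleDbCommand(BuildQuery(sheetName), con);
        cmd.Parameters.AddWithValue("@fragment", "%" + fragment + "%");
        return cmd;
    }
}
```

With the variables from the question, the call would be LikeQuerySketch.BuildCommand(con, txtSheets.Text, "WMI").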

Extract Data from Excel File and Store in SQL Server database

I am looking for advice on the best way to parse a Microsoft Excel file and update/store the data in a given SQL Server database. I am using ASP.NET MVC, so I plan on having a page/view take in an Excel spreadsheet. Using that user-supplied file, I will need to parse the data from the columns in C# and update the database based on matches with the spreadsheet column that contains the key column of the database table. The spreadsheet will always be in the same format, so I will only need to handle one format. It seems like this could be a pretty common task; I am just looking for the best way to approach it before getting started. I am using Entity Framework in my current application, but I don't have to use it.
I found this solution which seems like it could be a good option:
// Uses the ExcelDataReader library; MyEntity is a placeholder type.
public IEnumerable<MyEntity> ReadEntitiesFromFile(string filePath)
{
    var myEntities = new List<MyEntity>();
    // Dispose both the stream and the reader when done.
    using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
    using (var reader = ExcelReaderFactory.CreateOpenXmlReader(stream))
    {
        while (reader.Read())
        {
            var myEntity = new MyEntity();
            myEntity.MyProperty1 = reader.GetString(1);
            myEntity.MyProperty2 = reader.GetInt32(2);
            myEntities.Add(myEntity);
        }
    }
    return myEntities;
}
Here is an example of what a file might look like (Clock# is the key).
So given a file in this format, I want to match the user to the data table record using the clock # and update the record with the information in each of the cells. Each of the columns in the spreadsheet has a corresponding column in the data table. All help is much appreciated.
You can use the classes in the namespace Microsoft.Office.Interop.Excel, which abstracts all the solution you found. Instead of me rewriting it, you can check out this article: http://www.codeproject.com/Tips/696864/Working-with-Excel-Using-Csharp.
Better yet, why not bypass the middleman? You can use an existing ETL tool, such as Pentaho or Talend, to go straight from Excel to your database. These types of tools often offer a lot of customization and are fairly straightforward to use. I've used Pentaho quite a lot for exactly what you're describing, and it saved me the headache of writing the code myself. Unless you want or need to write it yourself, I think the latter is the best approach.
Try this:
public DataTable GetDataTableOfExcel(string file_path)
{
    DataTable dt = new DataTable();
    using (OleDbConnection conn = new OleDbConnection())
    {
        string Import_FileName = Server.MapPath(file_path);
        //Import_FileName = System.IO.Path.GetDirectoryName(file_path);
        string fileExtension = Path.GetExtension(Import_FileName);
        if (fileExtension == ".xlsx")
            conn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + Import_FileName + ";" + "Extended Properties='Excel 12.0 Xml;HDR=YES;'";
        using (OleDbCommand comm = new OleDbCommand())
        {
            comm.CommandText = "Select * from [Sheet1$]";
            comm.Connection = conn;
            using (OleDbDataAdapter da = new OleDbDataAdapter())
            {
                da.SelectCommand = comm;
                da.Fill(dt);
            }
        }
    }
    // Return the filled table so the caller can use it.
    return dt;
}
Your data is now in a DataTable. You can build INSERT queries from the DataTable's data.
file_path is the Excel file's full path, including the directory name.

Data is missing while reading excel file using OLEDB

I am using OLEDB to read an Excel file into a DataTable. The problem is that some values are missing (empty). In my Excel sheet, one column has the data type General and contains mixed values, both strings and integers. Most of the cell values are integers. Why is OLEDB skipping the string values?
OleDbConnection connection = new OleDbConnection();
connection.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + "; Extended Properties=\"Excel 12.0;IMEX=1\";";
OleDbCommand myAccessCommand = new OleDbCommand();
// The command needs a connection before the adapter can run it.
myAccessCommand.Connection = connection;
myAccessCommand.CommandText = "Select * from [" + sheetName + "]";
OleDbDataAdapter myDataAdapter = new OleDbDataAdapter(myAccessCommand);
myDataAdapter.Fill(myDataSet);
Check the following link and see the points under "RESOLUTION":
http://support.microsoft.com/kb/194124
Please see the NOTE under point 2.
The effect of IMEX=1 depends entirely on your registry settings. By default, the first 8 rows are checked to determine the data type. IMEX=1 can give unpredictable behavior, such as skipping string values. There is also a small workaround for this problem: add a single quote (') before every cell value in Excel, so that every cell is treated as a string.
Add IMEX=1 to the connection string as below:
string con = string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};" + @"Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'", fileName);

SQL Server Compact Insertion

I want to insert data into SQL Server Compact Edition. The database table screenshot is here >>>
I want to add data to Users. The insertion code is as follows:
SqlCeConnection Con = new SqlCeConnection();
Con.ConnectionString = "Data Source = 'Database.sdf';" +
"Password='Password';";
Con.Open();
int Amount=Convert.ToInt32(AmBox.Text),
Code=Convert.ToInt32(MCode.Text),
Num=Convert.ToInt32(MNum.Text);
string Name=Convert.ToString(NBox.Text),
FName=Convert.ToString(SOBox.Text),
Address=Convert.ToString(AdBox.Text);
SqlCeCommand Query =new SqlCeCommand("INSERT INTO Users VALUES " +
"(++ID,Name,FName,Address,Code,Num,Amount)",Con);
Query.ExecuteReader();
When it runs, it generates an error saying "The column name is not valid [Node Name (if any) =,Column name=ID ]".
I can't figure out the problem. Kindly tell me what is wrong. Thanks!
You should change your code to something like this
using (SqlCeConnection Con = new SqlCeConnection("Data Source = 'Database.sdf';" +
    "Password='Password';"))
{
    Con.Open();
    // Name the target columns and pass every value as a parameter.
    SqlCeCommand Query = new SqlCeCommand("INSERT INTO Users " +
        "(Name,FName,Address,MCode,MNum,Amount) " +
        "VALUES (@Name,@FName,@Address,@Code,@Num,@Amount)", Con);
    Query.Parameters.AddWithValue("@Name", NBox.Text);
    Query.Parameters.AddWithValue("@FName", SOBox.Text);
    Query.Parameters.AddWithValue("@Address", AdBox.Text);
    Query.Parameters.AddWithValue("@Code", Convert.ToInt32(MCode.Text));
    Query.Parameters.AddWithValue("@Num", Convert.ToInt32(MNum.Text));
    Query.Parameters.AddWithValue("@Amount", Convert.ToInt32(AmBox.Text));
    Query.ExecuteNonQuery();
}
The using statement guarantees that the connection is disposed correctly.
The parameter collection avoids SQL injection attacks and quoting problems.
ExecuteNonQuery is used because this is an INSERT query.
The ++ID was removed; it is not a valid value to pass to the database. If the ID field is an identity column, you don't pass any value from code but let the database calculate the next value.
Also, I'm not sure you really need the single quotes in your connection string around the data source and password values.
EDIT ---
Sometimes the .SDF database could be located in a different folder.
(Modern operating systems prevent writing in the application folder).
In this case it is necessary to set the path to the SDF file in the connection string.
For example, the SDF could be located in a subfolder of the C:\ProgramData directory.
string conString = "Data Source=" +
Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData),
"MyAppData\\database.sdf") + ";Password=yourPassword;";

How to use GetOleDbSchemaTable method on a long name dbf file

As part of a project I'm working on in C# I need to read in a .dbf file. The first thing I want to do is to get the schema table from the file. I have code that works as long as the filename (without the extension) is not longer than 8 characters.
For example, let's say I have a file named MyLongFilename.dbf. The following code does not work; it throws the following exception: “The Microsoft Jet database engine could not find the object 'MyLongFilename'. Make sure the object exists and that you spell its name and the path name correctly.”
string cxn = @"PROVIDER=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyLongFilename;Extended Properties=dBASE 5.0";
OleDbConnection connection = new OleDbConnection(cxn);
To get past this exception, the next step is to use a name the OleDbConnection likes ('MyLongF~1' instead of 'MyLongFilename'), which leads to this:
string cxn = @"PROVIDER=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyLongF~1;Extended Properties=dBASE 5.0";
OleDbConnection connection = new OleDbConnection(cxn);
This does successfully return an OleDbConnection. Now to get the schema table I try the following:
connection.Open();
DataTable schemaTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Columns,
new object[] { null, null, fileNameNoExt, null });
This returns a DataTable with no rows. If I rename the file so that its name is 8 or fewer characters, this code works and I get back a row for each field in the database.
With the long filename, I know the returned connection is valid because I can use it to fill a DataSet like so:
string selectQuery = "SELECT * FROM [MyLongF~1#DBF];";
OleDbCommand command = new OleDbCommand(selectQuery, connection);
connection.Open();
OleDbDataAdapter dataAdapter = new OleDbDataAdapter();
dataAdapter.SelectCommand = command;
DataSet dataSet = new DataSet();
dataAdapter.Fill(dataSet);
This gives me back a DataSet containing a DataTable with all of the data from the dbf file.
So the question is how can I get just the schema table for the long named dbf file? Of course I can work around the issue by renaming/copying the file, but that’s a hack I don’t want to have to make. Nor do I want to fill the DataSet with the top 1 record and deduce the schema from columns.
According to MSDN, the folder represents the database and the files represent tables. You should be using the directory path not including the filename in the connection string then, and the name of the table as part of the restrictions to GetOleDbSchemaTable.
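A rough, untested sketch of that suggestion follows. It assumes the Jet dBASE driver accepts the table name passed in the restriction; with a long file name the short (8.3) name may still be needed. DbfSchemaSketch, BuildConnectionString and GetColumns are hypothetical names.

```csharp
using System.Data;
using System.Data.OleDb;
using System.IO;

public static class DbfSchemaSketch
{
    // The data source is the folder that contains the .dbf files, not a file.
    public static string BuildConnectionString(string directory)
    {
        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + directory +
               ";Extended Properties=dBASE 5.0";
    }

    // Asks for the column schema of one table (one .dbf file) in that folder.
    public static DataTable GetColumns(string dbfPath)
    {
        string directory = Path.GetDirectoryName(dbfPath);
        string tableName = Path.GetFileNameWithoutExtension(dbfPath);
        using (var connection = new OleDbConnection(BuildConnectionString(directory)))
        {
            connection.Open();
            // Restrictions: catalog, schema, table name, column name.
            return connection.GetOleDbSchemaTable(
                OleDbSchemaGuid.Columns,
                new object[] { null, null, tableName, null });
        }
    }
}
```

For the file in the question, the call would be DbfSchemaSketch.GetColumns(@"C:\MyLongFilename.dbf").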
Well, I think the connection should be
string cxn = @"PROVIDER=Microsoft.Jet.OLEDB.4.0;Data Source=C:\;Extended Properties=dBASE 5.0";
OleDbConnection connection = new OleDbConnection(cxn);
Also, maybe you should try another provider. I saw a big improvement a long time ago when I used it like this:
string cxn = @"PROVIDER=VFPOLEDB.1;Data Source=C:\;Extended Properties=dBASE 5.0";
But you need to have VFP 7 installed,
or install the Microsoft OLE DB Provider for Visual FoxPro 9.0 from here
const string connectionString = @"Provider = vfpoledb; Data Source = {0}; Collating Sequence = general;";
OleDbConnection conn = new OleDbConnection(string.Format(connectionString, dirName));
conn.Open();
OleDbCommand cmd = new OleDbCommand(string.Format("select * from {0}", fileName), conn);
Is fileNameNoExt holding the short filename version? Also, MyLongF~1 is 9 characters, not 8.
If you have a single (and possibly small) dbf file, you can solve the problem by copying the dbf file elsewhere and opening the copy instead of the original file.
I believe that the DataSource should represent the directory that contains the .DBF files. Each .DBF file corresponds to a table in that directory.
My guess is c:\MyLongF~1 is a short name for a directory that contains a filename corresponding to MyLongF~1#DBF
Can you verify whether or not this is the case?
