Using .NET
I have a text file with comma-separated data. One of the columns contains text like the following: 1997/020269/07
When I do a select with an OdbcCommand, the string is treated as a numeric expression and the query returns the 'answer' (the result of the division) instead of the actual text!
How can I get the actual text? Am I going to be forced to parse the file manually?
Hope someone can help...please?! :)
Edit: Some code maybe? :)
string strConnString =
    @"Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + _FilePath +
    @"; Extensions=asc,csv,tab,txt;Persist Security Info=False";
var conn = new System.Data.Odbc.OdbcConnection(strConnString);
conn.Open(); // the connection must be open before ExecuteReader is called
var cmd = new System.Data.Odbc.OdbcCommand("select MyColumn from TextFile.txt", conn);
var reader = cmd.ExecuteReader();
while (reader.Read())
{
    Console.WriteLine(reader["MyColumn"]);
}
This returns 0.014074977 instead of 1997/020269/07
Have you tried using a schema.ini file? It lets you explicitly define the format of the text file, including the data type of each column.
Your schema.ini file might end up looking a little like:
[sourcefilename.txt]
ColNameHeader=true
Format=CSVDelimited
Col1=MyColumn Text Width 14
Col2=...
Try using schema.ini
[yourfile.txt]
ColNameHeader=false
MaxScanRows=0
Format=FixedLength
Col1=MyColumn Text Width 20
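If the schema.ini has to be created from code, a minimal sketch could look like the following. It assumes the file must sit in the same folder as the data file (that is where the text driver looks for it); the class and method names are made up for illustration, and only the first column is declared, so further ColN lines would be needed for the remaining columns.

```csharp
using System;
using System.IO;

public static class SchemaIniSketch
{
    // Builds schema.ini text that declares the first column as Text.
    // File name, column name and width are illustrative placeholders.
    public static string BuildSchema(string dataFileName, string columnName, int width)
    {
        return "[" + dataFileName + "]" + Environment.NewLine
             + "ColNameHeader=True" + Environment.NewLine
             + "Format=CSVDelimited" + Environment.NewLine
             + "Col1=" + columnName + " Text Width " + width + Environment.NewLine;
    }

    // Writes schema.ini into the folder that holds the data file and returns its path.
    public static string WriteSchema(string folder, string dataFileName, string columnName, int width)
    {
        string path = Path.Combine(folder, "schema.ini");
        File.WriteAllText(path, BuildSchema(dataFileName, columnName, width));
        return path;
    }
}
```

Calling SchemaIniSketch.WriteSchema(_FilePath, "TextFile.txt", "MyColumn", 14) before opening the connection should make the driver return MyColumn as text rather than as a number.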
Related
I have an Excel file without headers, and I'm inserting text into specific cells using excel interop. Everything is fine as long as the string does not contain a dot. When it does, the text is added to the cell, but the application stops working and does not add any more text. Below is my code.
string connectionString = String.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 12.0;HDR=No;IMEX=0'", filenamepath);
using (OleDbConnection cn = new OleDbConnection(connectionString))
{
    cn.Open();
    OleDbCommand cmd = new OleDbCommand("INSERT INTO[sheet1$B3:B3] VALUES ('" + "Some string with dot." + "')", cn);
    cmd.ExecuteNonQuery();
    cn.Close();
}
Try prepending the "string with dot" with a single quote '. This tells Excel to interpret the input as text, which is obviously required in your case. Also, note that a "comma" might give you the same problem, depending on the culture.
E.g.:
"INSERT INTO[sheet1$B3:B3] VALUES ('''" + "Some string with dot." + "')"
//                                  ^^
You can try it out in Excel. Without the ' prefix, Excel will format any number-like text as a number.
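A small helper can build that literal so the prefix is not forgotten. This is only a sketch: the class and method names are hypothetical, and it assumes the usual SQL rule that a single quote inside a string literal is written as two single quotes.

```csharp
public static class ExcelSqlSketch
{
    // Returns a SQL string literal whose value starts with an apostrophe,
    // which tells Excel to treat the cell content as text.
    // Single quotes inside the value are doubled so the literal stays valid.
    public static string ToTextLiteral(string value)
    {
        return "'''" + value.Replace("'", "''") + "'";
    }
}
```

It would be used as "INSERT INTO[sheet1$B3:B3] VALUES (" + ExcelSqlSketch.ToTextLiteral("Some string with dot.") + ")", which produces the same statement as the example above.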
I have a csv file with the following header:
"Pickup Date","Pickup Time","Pickup Address","From Zone", and so on..
Using OLE DB, I can only read the first 2 columns and nothing beyond them. I used a schema.ini file with all column names specified. Please advise.
Here is my sample csv.
"PickupDate","PickupTime","PickupAddress","FromZone"
"11/05/15","4:00:00 AM","9 Houston Rd, CityName, NC 28262,","262"
Here is my code:
Schema.ini
-----------
[ReportResults.csv]
ColNameHeader = True
Format = CSVDelimited
col1=Pickup Date DateTime
col2=Pickup Time Text width 100
col3=Pickup Address Text width 500
col4=FromZone short
oledb code
-----------
public static DataTable SelectCSV(string path, string query)
{
    // Since the file contains addresses with commas, the delimiter ", is used.
    // Each cell is written within "" in the file.
    var strConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path +
        "; Extended Properties='text;HDR=Yes;FMT=Delimited(\",)'";
    OleDbConnection selectConnection = (OleDbConnection)null;
    OleDbDataAdapter oleDbDataAdapter = (OleDbDataAdapter)null;
    selectConnection = new OleDbConnection(strConn);
    selectConnection.Open();
    using (OleDbCommand cmd = new OleDbCommand(query, selectConnection))
    using (oleDbDataAdapter = new OleDbDataAdapter(cmd))
    {
        DataTable dt = new DataTable();
        dt.Locale = CultureInfo.CurrentCulture;
        oleDbDataAdapter.Fill(dt);
        return dt;
    }
}
Every column is enclosed in double quotes, so a comma inside a pair of double quotes is not treated as a delimiter.
So you can import your file without using schema.ini, by specifying EXTENDED PROPERTIES='text;HDR=Yes;FMT=Delimited' in your connection string.
If you need a schema to solve other problems, please note that your schema.ini is not well-formed: the column names contain unquoted spaces. Use something like this:
[ReportResults.csv]
ColNameHeader = True
Format = CSVDelimited
col1=PickupDate DateTime
col2=PickupTime Text width 100
col3=PickupAddress Text width 500
col4=FromZone short
If you have problems extracting the DateTime column, specify the DateTimeFormat option; e.g. if your pickup date looks like 2015/11/13, specify DateTimeFormat=yyyy/MM/dd.
If you have problems extracting the Short column, verify that FromZone is an integer between -32768 and 32767; if not, use a different type. You can also set the DecimalSymbol option if you have problems with decimal separators.
You can find more info on MSDN.
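To see why the quoted commas do not split the address, here is a minimal sketch of quote-aware splitting in plain C#. It is not what the Jet driver does internally, just an illustration of the rule; the class name is hypothetical, and it handles a single line with comma delimiters and doubled quotes only.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuotedCsvSketch
{
    // Splits one CSV line on commas, ignoring commas inside double quotes.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote is a literal quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);   // commas land here while quoted
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());
        return fields;
    }
}
```

Applied to the sample data row from the question, this yields four fields, with the whole address (commas included) as the third one.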
Here is my code:
OdbcConnection conn = new OdbcConnection("Driver={Microsoft Text Driver (*.txt; *.csv)};DSN=scrapped.csv");
conn.Open();
OdbcCommand foo = new OdbcCommand(@"SELECT * FROM [scrapped.csv] WHERE KWOTA < 100.00", conn);
IDataReader dr = foo.ExecuteReader();
StreamWriter asd = new StreamWriter("outfile.txt");
while (dr.Read())
{
    int cols = dr.GetSchemaTable().Rows.Count;
    for (int i = 0; i < cols; i++)
    {
        asd.Write(string.Format("{0};", dr[i].ToString()));
    }
    asd.WriteLine();
}
asd.Flush();
asd.Close();
dr.Close();
conn.Close();
Here is my schema.ini
[scrapped.csv]
Format=Delimited(;)
NumberDigits=2
CurrencyThousandSymbol=
CurrencyDecimalSymbol=,
CurrencyDigits=2
Col1=DataOperacji Date
Col2=DataKsiegowania Date
Col3=OpisOperacji Text
Col4=Tytul Text
Col5=NadawcaOdbiorca Text
Col6=NumerKonta Text
Col7=Kwota Currency
Col8=SaldoPoOperacji Currency
Here I have sample from my CSV:
2013-01-22;2013-08-24;notmatter;"notmatter";"notmatter";'notmatter';7 111,55;10 222,20;
2013-03-26;2013-08-23;notmatter;"notmatter";"notmatter";'notmatter';-275,00;15 466,24;
So even though I have date and currency set in schema.ini and in the regional settings (which ODBC should use by default, but doesn't), the values I write to the output file are a total mess.
They are empty if there is a space (my local thousands delimiter), and if I have a value like 15,45 I get 15,4500 instead.
Date fields also behave abnormally, and even if I add DateTimeFormat to schema.ini I get nothing like the format I specified.
Any help would be appreciated. I would like to use ODBC and query the CSV data like a database, with WHERE something = something.
I added a line to your schema.ini and ran it against an ADODB connection, and it worked for me as far as dates are concerned; other bits are still not right. Note DateTimeFormat.
[scrapped.csv]
Format=Delimited(;)
NumberDigits=2
CurrencyThousandSymbol=
CurrencyDecimalSymbol=,
CurrencyDigits=2
DateTimeFormat="yyyy-mm-dd"
Col1=DataOperacji Date
Col2=DataKsiegowania Date
Col3=OpisOperacji Text
Col4=Tytul Text
Col5=NadawcaOdbiorca Text
Col6=NumerKonta Text
Col7=Kwota Currency
Col8=SaldoPoOperacji Currency
You may also need:
ColNameHeader=False
MaxScanRows=0
But at the moment, I cannot see a way to get a space accepted as the CurrencyThousandSymbol
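One possible fallback, which is not part of the answer above, is to declare Kwota and SaldoPoOperacji as Text in schema.ini and convert the values in C#. The sketch below assumes exactly that; the class and method names are hypothetical. The drawback is that a numeric comparison such as WHERE KWOTA < 100.00 would then have to be done in code rather than in SQL.

```csharp
using System.Globalization;

public static class CurrencyTextSketch
{
    // Parses amounts such as "7 111,55" or "-275,00":
    // spaces are thousands separators, the comma is the decimal separator.
    public static decimal ParseAmount(string text)
    {
        // Remove ordinary and non-breaking spaces before parsing.
        string compact = text.Replace(" ", "").Replace("\u00A0", "");
        var format = new NumberFormatInfo
        {
            NumberDecimalSeparator = ",",
            NumberGroupSeparator = " "
        };
        return decimal.Parse(
            compact,
            NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
            format);
    }
}
```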
I'm a bit confused here.
When I use Excel 2003 to export a sheet to CSV, it actually uses semicolons ...
Col1;Col2;Col3
shfdh;dfhdsfhd;fdhsdfh
dgsgsd;hdfhd;hdsfhdfsh
Now when I read the CSV using the Microsoft drivers, they expect commas and see each row as one big column.
I suspect Excel is exporting with semicolons because I have an AZERTY keyboard. However, shouldn't the CSV reader then also take the different delimiter into account?
How can I find out the appropriate delimiter, and/or read the CSV properly?
public static DataSet ReadCsv(string fileName)
{
    DataSet ds = new DataSet();
    string pathName = System.IO.Path.GetDirectoryName(fileName);
    string file = System.IO.Path.GetFileName(fileName);
    OleDbConnection excelConnection = new OleDbConnection
        (@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + pathName + ";Extended Properties=Text;");
    try
    {
        OleDbCommand excelCommand = new OleDbCommand(@"SELECT * FROM " + file, excelConnection);
        OleDbDataAdapter excelAdapter = new OleDbDataAdapter(excelCommand);
        excelConnection.Open();
        excelAdapter.Fill(ds);
    }
    catch (Exception)
    {
        throw; // rethrow without resetting the stack trace
    }
    finally
    {
        if (excelConnection.State != ConnectionState.Closed)
            excelConnection.Close();
    }
    return ds;
}
One way would be to just use a decent CSV library; one that lets you specify the delimiter:
using (var csvReader = new CsvReader("yourinputfile.csv"))
{
    csvReader.ValueSeparator = ';';
    csvReader.ReadHeaderRecord();
    while (csvReader.HasMoreRecords)
    {
        var record = csvReader.ReadDataRecord();
        var col1 = record["Col1"];
        var col2 = record["Col2"];
    }
}
Check what delimiter is specified on your computer. Control Panel > Regional and Language Options > Regional Options tab - click Customize button. There's an option there called "List separator". I suspect this is set to semi-colon.
Solution for German Windows 10:
Change the list separator as described above. Remember to change the decimal separator to . as well, and maybe the thousands separator to a thin space.
Can't believe this is true...Comma-separated values are separated by semicolon?
As mentioned by dendarii, the CSV separator that Excel uses is determined by your regional settings, specifically the 'list separator' character.
(And Excel does this erroneously in my opinion, as it is called a comma-separated file.)
HOWEVER, if that still does not solve your issue, there is another possible complication:
Check your 'digit grouping' character and ensure that it is NOT a comma.
Excel appears to revert to a semicolon when exporting decimal numbers if digit grouping is also set to a comma.
Setting the digit grouping to a full stop / period (.) solved this for me.
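On the reading side, .NET exposes the same 'list separator' setting through CultureInfo, so the delimiter can be looked up instead of guessed. The following is a rough sketch, assuming the file was exported on a machine with the same regional settings as the one reading it; the class and method names are hypothetical, and the generated text is meant to be saved as schema.ini next to the CSV file.

```csharp
using System;
using System.Globalization;

public static class DelimiterSketch
{
    // Maps a delimiter to the matching schema.ini Format value.
    public static string FormatLine(string separator)
    {
        if (separator == ",") return "Format=CSVDelimited";
        if (separator == "\t") return "Format=TabDelimited";
        return "Format=Delimited(" + separator + ")";
    }

    // Builds a schema.ini section using the list separator of the given culture.
    public static string BuildSection(string fileName, CultureInfo culture)
    {
        string separator = culture.TextInfo.ListSeparator;
        return "[" + fileName + "]" + Environment.NewLine
             + "ColNameHeader=True" + Environment.NewLine
             + FormatLine(separator) + Environment.NewLine;
    }
}
```

For example, DelimiterSketch.BuildSection("yourinputfile.csv", CultureInfo.CurrentCulture) would produce a Format=Delimited(;) line on a machine whose list separator is a semicolon.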
I am writing long text (1K to 2K characters, plain XML data) into a cell in an Excel workbook.
The statement below throws a COM error, Exception from HRESULT: 0x800A03EC:
range.set_Value(Type.Missing, data);
If I copy and paste the same XML manually into Excel it works fine, but the same does not work programmatically.
If I trim the text to something like 100/300 chars it works fine.
There is a limit (somewhere between 800 and 900 chars, if I remember correctly) that is nearly impossible to get around this way.
Try using an OLE DB connection and inserting the data with an SQL command. That might work better for you. You can then use interop to do any formatting if necessary.
The following KB article explains that the maximum is 911 characters. I checked this with my code, and it does work for strings of up to 911 chars.
http://support.microsoft.com/kb/818808
The workaround mentioned in this article is to make sure no cell holds more than 911 characters. That's lame!
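If spreading the text over several cells is acceptable, the KB workaround could be approximated by cutting the string into pieces of at most 911 characters before writing. This is only a sketch with hypothetical names; whoever reads the sheet back would have to concatenate the pieces again, since a chunk of XML is not valid on its own.

```csharp
using System;
using System.Collections.Generic;

public static class CellChunkSketch
{
    // Limit quoted from the KB article referenced above.
    public const int MaxCellLength = 911;

    // Splits data into consecutive pieces of at most maxLength characters.
    public static List<string> Split(string data, int maxLength = MaxCellLength)
    {
        var chunks = new List<string>();
        for (int offset = 0; offset < data.Length; offset += maxLength)
        {
            int length = Math.Min(maxLength, data.Length - offset);
            chunks.Add(data.Substring(offset, length));
        }
        return chunks;
    }
}
```

Each piece could then be written to its own cell with a separate range.set_Value call.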
Good Ole and excel article: http://support.microsoft.com/kb/316934
The following code updates a private variable that is the number of successful rows and returns a string which is the path to the excel file.
Remember the using directive for System.IO (for Path)!
string tempXlsFilePathName;
string result = null;
string sheetName;
string queryString;
int successCounter;

// set sheetName and queryString
sheetName = "sheetName";
queryString = "CREATE TABLE " + sheetName + "([columnTitle] char(255))";

// Write .xls
successCounter = 0;
tempXlsFilePathName = (_tempXlsFilePath + @"\literalFilename.xls");
using (OleDbConnection connection = new OleDbConnection(GetConnectionString(tempXlsFilePathName)))
{
    OleDbCommand command = new OleDbCommand(queryString, connection);
    connection.Open();
    command.ExecuteNonQuery();
    yourCollection.ForEach(dataItem =>
    {
        string SQL = "INSERT INTO [" + sheetName + "$] VALUES ('" + dataItem.ToString() + "')";
        OleDbCommand updateCommand = new OleDbCommand(SQL, connection);
        updateCommand.ExecuteNonQuery();
        successCounter++;
    });
    // update result with successfully written username filepath
    result = tempXlsFilePathName;
}
_successfulRowsCount = successCounter;
return result;
N.B. This was edited in a hurry, so may contain some mistakes.
To work around this limitation, write/update only one cell at a time and dispose of the Excel COM object immediately. Then recreate the object to write/update the next cell.
I can confirm this solution is working in VS2010 (VB.NET project) with Microsoft Excel 10.0 Object Library (Microsoft Office XP)
This limitation is supposed to have been removed in Excel 2007/2010. Using VBA, the following works:
Sub longstr()
    Dim str1 As String
    Dim str2 As String
    Dim j As Long
    For j = 1 To 2000
        str1 = str1 & "a"
    Next j
    Range("a1:a5").Value2 = str1
    str2 = Range("a5").Value2
    MsgBox Len(str2)
End Sub
I'll start by saying I haven't tried this myself, but my research says that you can use QueryTables to overcome the 911 character limitation.
This is the primary post I found which talks about using a record set as the data source for a QueryTable and adding it to a spreadsheet: http://www.excelforum.com/showthread.php?t=556493&p=1695670&viewfull=1#post1695670.
Here is some sample C# code of using QueryTables: import txt files using excel interop in C# (QueryTables.Add).