This is my first foray into C# as an SSIS and Informatica developer who has lived only in SQL. I have a Script Task that reads data from a single SQL Server table via a query and simply writes that data to a text file. Everything works except for what I think are two small formatting problems I can't figure out.
The following requirements are in place for this build. Thanks in advance; I'm here to answer any questions!
The SQL query is purposefully a SELECT * so it picks up any new columns that are added (already in code)
The first 3 columns are excluded from the write to file (already in code)
Problems:
" " wrappers need to be added to all values, column and rows.
Date in database is true Date but when writing to file it shows Datetime. Needs to be only date.
Current:
ID
Name
Date
Ratio
12345678
John Wayne
12/31/2018 12:00:00 AM
1/1
Needs to be:
"ID"
"Name"
"Date"
"Ratio"
"12345678"
"John Wayne"
"2018-12-31"
"1/1"
Code:
// Declare Variables
string DestinationFolder = Dts.Variables["User::Target_FilePath"].Value.ToString();
string QueryStage = Dts.Variables["User::Query_Stage"].Value.ToString();
//string TableName = Dts.Variables["User::TableName"].Value.ToString();
string FileName = Dts.Variables["User::OutputFileName"].Value.ToString();
string FileDelimiter = Dts.Variables["User::Target_FileDelim"].Value.ToString();
//string FileExtension = Dts.Variables["User::AC_Prefix"].Value.ToString();
//USE ADO.NET Connection from SSIS Package to get data from table
SqlConnection myADONETConnection = new SqlConnection();
myADONETConnection = (SqlConnection)(Dts.Connections["ADO_TEST_CONN"].AcquireConnection(Dts.Transaction) as SqlConnection);
// Read data from table or view to data table
string query = QueryStage;
SqlCommand cmd = new SqlCommand(query, myADONETConnection);
//myADONETConnection.Open();
DataTable d_table = new DataTable();
d_table.Load(cmd.ExecuteReader());
myADONETConnection.Close();
string FileFullPath = DestinationFolder + "\\" + FileName + ".txt";
StreamWriter sw = null;
sw = new StreamWriter(FileFullPath, false);
// Write the Header Row to File
int ColumnCount = d_table.Columns.Count;
for (int ic = 4; ic < ColumnCount; ic++)
{
sw.Write(d_table.Columns[ic]);
if (ic < ColumnCount - 1)
{
sw.Write(FileDelimiter);
}
}
sw.Write(sw.NewLine);
// Write All Rows to the File
foreach (DataRow dr in d_table.Rows)
{
for (int ir = 4; ir < ColumnCount; ir++)
{
if (!Convert.IsDBNull(dr[ir]))
{
sw.Write(dr[ir].ToString());
}
if (ir < ColumnCount - 1)
{
sw.Write(FileDelimiter);
}
}
sw.Write(sw.NewLine);
}
sw.Close();
Dts.TaskResult = (int)ScriptResults.Success;
Blindly calling .ToString() on an object, which is what the line sw.Write(dr[ir].ToString()); does, uses that data type's default conversion to a string. If the value is a DateTime (the C# data type, not the SQL Date column type), then it will include the time information.
C# maps SQL column types (such as Date) to C# data types (DateTime). You need to detect this, just as you're detecting whether a value is DBNull.
if (!Convert.IsDBNull(dr[ir]))
{
if (dr[ir] is DateTime dt)
{
// use DateTime's specific string rendering
sw.Write(dt.ToString("d"));
}
else
{
// fall back to standard string rendering
sw.Write(dr[ir].ToString());
}
}
You can change the format ("d" in this case) to something else if you need a different layout. Keep in mind that the Culture of the computer will affect how the string is rendered unless you explicitly use a named Culture.
The other part of your problem is adding quotes around the printed values. This can be done with string concatenation. For example:
string result = "\"" + "my string" + "\"";
// result is "my string", with quotes
Remember to escape the quote mark.
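Putting both fixes together, the header and data loops from the question might end up looking something like the sketch below. This is untested and reuses the question's variable names (d_table, sw, FileDelimiter, ColumnCount); adjust the starting index and the date pattern to taste. The "yyyy-MM-dd" pattern matches the desired output shown above.
// Write the header row, wrapping each column name in quotes
for (int ic = 4; ic < ColumnCount; ic++)
{
    sw.Write("\"" + d_table.Columns[ic].ColumnName + "\"");
    if (ic < ColumnCount - 1)
    {
        sw.Write(FileDelimiter);
    }
}
sw.Write(sw.NewLine);

// Write the data rows, wrapping each value in quotes and formatting dates as yyyy-MM-dd
foreach (DataRow dr in d_table.Rows)
{
    for (int ir = 4; ir < ColumnCount; ir++)
    {
        string value = string.Empty;
        if (!Convert.IsDBNull(dr[ir]))
        {
            if (dr[ir] is DateTime dt)
            {
                // an explicit pattern avoids Culture-dependent rendering
                value = dt.ToString("yyyy-MM-dd");
            }
            else
            {
                value = dr[ir].ToString();
            }
        }
        sw.Write("\"" + value + "\"");
        if (ir < ColumnCount - 1)
        {
            sw.Write(FileDelimiter);
        }
    }
    sw.Write(sw.NewLine);
}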
Related
I have a nice piece of C# code which allows me to import data into a table with fewer columns than in the SQL table (as the file format is consistently bad).
My problem comes when I have a blank entry in a column. The VALUES statement does not pick up an empty column from the CSV, and so I receive the error
You have more insert columns than values
Here is the query printed to a message box...
As you can see, there is nothing for Crew members 4 to 11; below is the file...
Please see my code:
SqlConnection ADO_DB_Connection = new SqlConnection();
ADO_DB_Connection = (SqlConnection)
(Dts.Connections["ADO_DB_Connection"].AcquireConnection(Dts.Transaction) as SqlConnection);
// Inserting data of file into table
int counter = 0;
string line;
string ColumnList = "";
// MessageBox.Show(fileName);
System.IO.StreamReader SourceFile =
new System.IO.StreamReader(fileName);
while ((line = SourceFile.ReadLine()) != null)
{
if (counter == 0)
{
ColumnList = "[" + line.Replace(FileDelimiter, "],[") + "]";
}
else
{
string query = "Insert into " + TableName + " (" + ColumnList + ") ";
query += "VALUES('" + line.Replace(FileDelimiter, "','") + "')";
// MessageBox.Show(query.ToString());
SqlCommand myCommand1 = new SqlCommand(query, ADO_DB_Connection);
myCommand1.ExecuteNonQuery();
}
counter++;
}
If you could advise how to include those fields in the insert that would be great.
Here is the same file but opened with a text editor and not given in picture format...
Date,Flight_Number,Origin,Destination,STD_Local,STA_Local,STD_UTC,STA_UTC,BLOC,AC_Reg,AC_Type,AdultsPAX,ChildrenPAX,InfantsPAX,TotalPAX,AOC,Crew 1,Crew 2,Crew 3,Crew 4,Crew 5,Crew 6,Crew 7,Crew 8,Crew 9,Crew 10,Crew 11
05/11/2022,241,BOG,SCL,15:34,22:47,20:34,02:47,06:13,N726AV,"AIRBUS A-319 ",0,0,0,36,AV,100612,161910,323227
I'm not touching the potential for SQL injection as I'm free-handing this code. If this is a system-generated file (mainframe extract, dump from Dynamics or a LoB app), the probability of SQL injection is awfully low.
// A char is required for the comparisons below
char FileDelimiterChar = FileDelimiter[0];
int columnCount = 0;
while ((line = SourceFile.ReadLine()) != null)
{
if (counter == 0)
{
ColumnList = "[" + line.Replace(FileDelimiterChar, "],[") + "]";
// How many columns in line 1. Assumes no embedded commas
// The following assumes FileDelimiter is of type char
// Add 1 as we will have one fewer delimiters than columns
columnCount = line.Count(x => x == FileDelimiterChar) +1;
}
else
{
string query = "Insert into " + TableName + " (" + ColumnList + ") ";
// HACK: this fails if there are embedded delimiters
int foundColumns = line.Count(x => x == FileDelimiterChar) + 1;
// at this point, we know how many columns we have
// and how many we should have.
string csv = line.Replace(FileDelimiter, "','");
// Pad out the current line with empty strings, i.e. append ','
// Probably a classier LINQ or string.Concat approach exists
for (int index = foundColumns; index < columnCount; index++)
{
csv += "','";
}
query += "VALUES('" + csv + "')";
// MessageBox.Show(query.ToString());
SqlCommand myCommand1 = new SqlCommand(query, ADO_DB_Connection);
myCommand1.ExecuteNonQuery();
}
counter++;
}
Something like that should give you a solid shove in the right direction. The concept is that you need to inspect the first line and see how many columns you should have. Then, for each line of data, count how many columns you actually have and stub in the empty strings.
If you change this up to use SqlCommand objects and parameters, the approximate logic is still the same. You'll add all the expected parameters by figuring out the columns in the first line, and then for each line you will add your values; if you have a short row, you just send the empty string (or DBNull or whatever your system expects).
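For what it's worth, a rough, untested sketch of that parameterized variant might look like the following. It reuses TableName, ColumnList, FileDelimiterChar, counter, and ADO_DB_Connection from the code above, treats every column as a string, and assumes using directives for System.Linq and System.Data.SqlClient are in place:
string[] headerColumns = null;
while ((line = SourceFile.ReadLine()) != null)
{
    if (counter == 0)
    {
        headerColumns = line.Split(FileDelimiterChar);
        ColumnList = "[" + string.Join("],[", headerColumns) + "]";
    }
    else
    {
        string[] values = line.Split(FileDelimiterChar);
        string paramList = string.Join(",", Enumerable.Range(0, headerColumns.Length).Select(i => "@p" + i));
        string query = "INSERT INTO " + TableName + " (" + ColumnList + ") VALUES (" + paramList + ")";
        using (SqlCommand cmd = new SqlCommand(query, ADO_DB_Connection))
        {
            for (int i = 0; i < headerColumns.Length; i++)
            {
                // short rows get an empty string for the missing trailing columns
                string value = i < values.Length ? values[i] : string.Empty;
                cmd.Parameters.AddWithValue("@p" + i, value);
            }
            cmd.ExecuteNonQuery();
        }
    }
    counter++;
}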
The big takeaway, IMO, is that CSV parsing libraries exist for a reason, and there are so many cases not addressed in the above pseudocode that you'll likely want to trash the current approach in favor of a standard parsing library and, while you're at it, address the potential security flaws.
I see your updated comment that you'll take the formatting concerns back to the source party. If they can't address them, I would envision your SSIS package being
Script Task -> Data Flow task.
The Script Task is going to wrangle the unruly data into a strict CSV dialect that a Data Flow Task can handle, preprocessing the data into a new file instead of trying to modify the existing one in place.
The Data Flow then becomes a chip shot of Flat File Source -> OLE DB Destination
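To make that idea concrete, a minimal sketch of such a preprocessing Script Task is shown below. The variable names (User::RawFilePath, User::CleanFilePath) are made up for illustration, the delimiter is assumed to be a comma, and the naive Split ignores quoted fields:
string rawPath = Dts.Variables["User::RawFilePath"].Value.ToString();
string cleanPath = Dts.Variables["User::CleanFilePath"].Value.ToString();
char delimiter = ',';

string[] rawLines = System.IO.File.ReadAllLines(rawPath);
int expectedColumns = rawLines[0].Split(delimiter).Length;

using (var writer = new System.IO.StreamWriter(cleanPath, false))
{
    foreach (string rawLine in rawLines)
    {
        // pad short rows with trailing empty columns so every row has the same shape
        int missing = Math.Max(0, expectedColumns - rawLine.Split(delimiter).Length);
        writer.WriteLine(rawLine + new string(delimiter, missing));
    }
}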
Here's how you can process this file... I would still ask for JSON or XML, though.
You need two outputs set up: Flight Info (the first 16 columns) and Flight Crew (a business key [flight number and date, maybe] and CrewID).
It seems to me the problem is how the crew is handled in the CSV.
So the basic steps are: read the file, use a regex to split it, write the first 16 columns to output 1 and the rest (with the key) to Flight Crew. And skip the header row on your read.
var lines = System.IO.File.ReadAllLines("filepath");
for (int i = 1; i < lines.Length; i++)
{
var r = new System.Text.RegularExpressions.Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)(?:[^\"]|\"\")*|[^,\"]*))\"?(?=,|$)"); //Some code I stole to split quoted CSVs
var m = r.Matches(lines[i]); //Gives you all matches in a MatchCollection
//first 16 columns are always correct
OutputBuffer0.AddRow();
OutputBuffer0.Date = m[0].Groups[2].Value;
OutputBuffer0.FlightNumber = m[1].Groups[2].Value;
// ... and so on until m[15]
for (int j = 16; j < m.Count; j++)
{
OutputBuffer1.AddRow(); //This is a new output that you need to set up
OutputBuffer1.FlightNumber = m[1].Groups[2].Value;
// keep adding columns here to build the business key
OutputBuffer1.CrewID = m[j].Groups[2].Value;
}
}
Be careful as I just typed all this out to give you a general plan without any testing. For example m[0] might actually be m[0].Value and all of the data types will be strings that will need to be converted.
To check out how regex processes your rows, please visit https://regex101.com/r/y8Ayag/1 for explanation. You can even paste in your row data.
UPDATE:
I just tested this and it works now. I needed to escape the quotes in the regex, specify that you want the value of group 2, and include IO in File.ReadAllLines.
The solution that I implemented in the end avoided the script task completely, which also means no SQL injection possibilities.
I did a flat file import: everything goes into one column, then a split_string and a pivot in SQL, then it is inserted into a staging table before tidy-up and off into the main table.
Flat File Import to a single-column table -> SQL transform -> Load
This also allowed me to iterate through the files better using a Foreach Loop Container.
ELT on this occasion.
Thanks for all the help and guidance.
/* all of these are output fields that are being populated by parsing input fields from an Excel file */
public void Import()
{
CRMRecord r;
DataTable dtCarrierData = LoadXL(true);
foreach (DataRow dr in dtCarrierData.Rows)
{
r = new CRMRecord();
r.FleetID = "TN045";
r.BillingCompany = "TCH";
r.StationCode = ParseField<string>(dr, "fp_truckstopcode");
r.DriverID = ParseField<string>(dr, "FP_unitnumber");
r.TransactionDate = ParseField<string>(dr, "FP_transdate"); /* I have a standard output and TransactionDate = FP_transdate, basically, but the trouble is that the FP_transdate format comes in as "yyddMM", e.g. 210120, 210121; there are 5 more just like those in the FP_transdate column */
DateTime FP_transdate = DateTime.ParseExact("yyddMM", "MM/dd/yyyy", CultureInfo.InvariantCulture); /* here is where I have an output that is reading an Excel file; everything reads fine except the date. I get an error saying "Processing exception - String was not recognized as a valid DateTime." It is not formatted correctly in the input (it's "yy/dd/MM"); I need it to be "MM/dd/yyyy", and it comes from an outside source so I can't just change the cell value in Excel to a date. */
FP_transdate.ToString("MM/dd/yyyy");
r.Ref1 = ParseField<string>(dr, "FP_truckstopinvnum");
decimal trcFuelCost = ParseField<decimal>(dr, "trcFuelCost");
decimal reefFuelCost = ParseField<decimal>(dr, "reefFuelCost");
I have figured it out.
r.TransactionDate = ParseField(dr, "FP_transdate").Substring(2, 2) + "/" + ParseField(dr, "FP_transdate").Substring(4, 2) + "/" + "20" + ParseField(dr, "FP_transdate").Substring(0, 2);
There may be a better way to write this but this worked for me.
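For what it's worth, DateTime.ParseExact can do the same reformatting in one step if you pass the incoming pattern as the format string. A sketch, assuming the raw value is always six digits; note that the substring fix above actually treats the value as yyMMdd (e.g. 210120 becomes 01/20/2021), so that is the pattern used here, and you would swap in "yyddMM" if the data really is year-day-month:
// parse the raw six-digit value and reformat it as MM/dd/yyyy
string raw = ParseField<string>(dr, "FP_transdate");
DateTime parsed = DateTime.ParseExact(raw, "yyMMdd", CultureInfo.InvariantCulture);
r.TransactionDate = parsed.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);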
Basically, I have some code that takes either an (1) alphanumeric or (2) numeric serial number and increments it. Everything works for the numeric serial numbers, but when I try to insert the alphanumeric serial number, it gives me the "invalid column name" error.
I've looked at a lot of the "invalid column name" posts here and none of them seem to answer my question. I've put breakpoints in and ran the code for both cases (numeric and alphanumeric) and I'm getting the same datatypes. Basically everything seems to line up correctly, so I'm at a loss.
The following code shows how I increment for both cases. Note that for the alphanumeric increment, I am calling a method IncrementAlphaNumeric, which takes the variable 'Output', which is the result of a SQL query that sorts the table and gets the last serial number.
// Increment Numeric Serial Numbers
if (isNum)
{
int lastNumber = Int32.Parse(Output);
int[] ints = Enumerable.Range(lastNumber + 1, printQuantity).Select(i => (int)i / 1).ToArray();
increments = ints.Select(x => x.ToString()).ToArray();
output.AppendText("Serial numbers to print: " + string.Join(", ", increments));
}
// Increment AlphaNumeric Serial Numbers
if (!isNum)
{
for (int i = 0; i < printQuantity; i++)
{
increments[i] = IncrementAlphaNumeric(Output);
snList.Add(increments[i]);
Output = increments[i];
}
output.AppendText("Serial numbers to print: " + string.Join(", ", increments));
}
Finally, I use Stringbuilder in order to insert the data into the database as follows:
// (5) Store new SNs in Database
StringBuilder sb = new StringBuilder();
foreach (string newSns in increments)
{
sb.AppendLine("INSERT INTO [Manufacturing].[dbo].[Device.Devices]([SerialNumber],[DeviceTypeID]) VALUES(" + newSns + "," + dType +")");
}
using (SqlCommand insertCommand = new SqlCommand(sb.ToString(), cnn))
{
var executeNonQuery = insertCommand.ExecuteNonQuery();
}
Again, it works for numeric, but not for alphanumeric. When I put my breakpoints in and step through the code, the datatypes (Strings) are the same for each of the cases, numeric and alphanumeric.
The error message I'm getting is, again, "invalid column name". Basically, the expected results should be that the serial number, regardless of if it's numeric or alphanumeric, should be inserted into the correct table of the database, which is based on the device type (dType).
As explained in comments, you should never concatenate strings to build your sql statement.
In your case I assume that you want to insert multiple records into your database using a single statement. This can also be done using parameters and manually building the VALUES part of your query (this syntax is available from SQL Server 2008 onwards).
// Sample values, replace them with your code that builds the increments array
string[] increments = new string[] {"VALUE1", "VALUE2","VALUE3", "VALUE4"};
// Invariant part of your query
string baseQuery = "INSERT INTO [Manufacturing].[dbo].[Device.Devices]([SerialNumber],[DeviceTypeID]) VALUES";
// Fixed value for the type
string dType = "42";
List<SqlParameter> prms = new List<SqlParameter>();
List<string> placeHolders = new List<String>();
// Build a list of parameter placeholders and a list of those parameter and their values
for(int x = 0; x < increments.Length; x++)
{
placeHolders.Add($"(@p{x},{dType})");
prms.Add(new SqlParameter { ParameterName = $"@p{x}", SqlDbType = SqlDbType.NVarChar, Value = increments[x]});
}
// Put the text together
string queryText = baseQuery + string.Join(",", placeHolders);
// This should be the final text
// INSERT INTO [Manufacturing].[dbo].[Device.Devices]([SerialNumber],[DeviceTypeID])
// VALUES(@p0,42),(@p1,42),(@p2,42),(@p3,42)
using (SqlCommand insertCommand = new SqlCommand(queryText, cnn))
{
// Add all parameters to the command...
insertCommand.Parameters.AddRange(prms.ToArray());
var executeNonQuery = insertCommand.ExecuteNonQuery();
}
I have this problem: I would like to create a CSV file using C#, so I tried to develop this code:
public static void creaExcel(Oggetto obj)
{
string filePath = #"C:\Temp\test.csv";
string delimiter = ",";
string[][] output = new string[][]{
new string[]{"TobRod Porosity", "Batch code", "Nu.","PAD","G.Po","L.PoD "},
new string[]{"Col1 Row 2", "Col2 Row 2", "Col3 Row 2"}
};
int length = output.GetLength(0);
StringBuilder sb = new StringBuilder();
for (int index = 0; index < length; index++)
sb.AppendLine(string.Join(delimiter, output[index]));
File.WriteAllText(filePath, sb.ToString());
// open xls file
}
This code runs, but when I open the file, all the values of a row ({"TobRod Porosity", "Batch code", "Nu.","PAD","G.Po","L.PoD "}) end up in a single cell, whereas I would like each value to go into its own cell.
Can you help me?
Best regards
The code is working fine because the result is:
TobRod Porosity,Batch code,Nu.,PAD,G.Po,L.PoD
Col1 Row 2,Col2 Row 2,Col3 Row 2
Can you confirm this?
Here is how it is displayed on my PC:
If you see all the values in a single cell on your machine, this means that there is a problem identifying the correct separator. In order to fix this, add this line: sep=, at the beginning of your CSV content, so the resulting content would be:
sep=,
TobRod Porosity,Batch code,Nu.,PAD,G.Po,L.PoD
Col1 Row 2,Col2 Row 2,Col3 Row 2
This way you can force certain devices (I know for sure that iPhones have an issue with this) to use the correct separator.
I would also suggest using " as a string qualifier. Example:
sep=,
"TobRod Porosity","Batch code","Nu.","PAD","G.Po","L.PoD"
"Col1 Row 2","Col2 Row 2","Col3 Row 2"
The created file is a more or less correct CSV (comma-separated values) file.
However, if you open that file with Excel and it puts all the values in one cell, Excel doesn't know that you want to separate them on the comma. You can, however, teach it to: in Excel 2013, select the column, go to the DATA tab, and use the "Text to Columns" button.
Edit: however, I have the feeling that you would like to use CSV to create Excel documents. That's not what CSV is made for. If you want to create real Excel sheets, have a look here: Create Excel (.XLS and .XLSX) file from C#
The thing is that you are selecting the full inner array when you make the insertion.
Here you are accessing the outer array and getting either the first inner array or the second:
output[index]
If you want to insert each value of the chosen array, you just have to loop over the selected array again:
output[index][anotherIndex]
For example
output[0][0]
will return "TobRod Porosity" as the selected value.
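So, if you want to handle each value individually (for example to quote it or write it to its own cell), loop over both dimensions. A quick, untested illustration using the output array and delimiter from the question:
StringBuilder sb = new StringBuilder();
for (int row = 0; row < output.Length; row++)
{
    for (int col = 0; col < output[row].Length; col++)
    {
        // one value at a time, e.g. output[0][0] is "TobRod Porosity"
        sb.Append(output[row][col]);
        if (col < output[row].Length - 1)
            sb.Append(delimiter);
    }
    sb.AppendLine();
}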
I have fixed my error; I wrote this method:
public static void creaExcel(Oggetto obj)
{
try
{
string filePath = #"TOBROD_POROSITY_" + Utility.getData() + ".csv";
string delimiter = ";";
string[][] output = new string[][]{
new string[]{"TobRod Porosity", "Batch code", "Nu.","PAD","G.Po","L.PoD "}
};
int length = output.GetLength(0);
StringBuilder sb = new StringBuilder();
for (int index = 0; index < length; index++)
sb.Append(string.Join(delimiter, output[index]));
sb.AppendLine("");
// once the file header is set, we need to insert the values
if (obj != null && obj.listaMisure != null)
{
for (int i = 0; i < obj.listaMisure.Count(); i++)
{
ValoriMisure v = obj.listaMisure[i];
sb.AppendLine(obj.tobaccoPorosity
+ delimiter + obj.batchCode
+ delimiter + v.nu
+ delimiter + v.pad
+ delimiter + v.gPo
+ delimiter + v.lPod);
}
}
File.WriteAllText(filePath, sb.ToString());
// move the file to the destination folder
File.Move(filePath, pathFolderDestination+"\\"+filePath);
}
catch (Exception e)
{
log.Error(e);
}
}
Note this line:
string delimiter = ";";
because if you use ";" as the delimiter, each value is written to a different cell when the CSV file is opened.
I had a look on the site and on Google, but I couldn't seem to find a good solution to what I'm trying to do.
Basically, I have a client-server application (C#) where I send the server a SQL SELECT statement (connecting to SQL Server 2008) and would like to return the results to the client in a CSV format.
So far I have the following:
if (sqlDataReader.HasRows)
{
while(sqlDataReader.Read())
{
//not really sure what to put here and if the while should be there!
}
}
Unfortunately, I'm really new to connecting C# with SQL. I need any tips on how to simply put the results into a string in a CSV format. The columns and fields are likely to be different each time, so I cannot use the something["something"] approach I've seen on a few sites. I'm not sure if I'm being comprehensible, to be honest!
I would really appreciate any tips / points on how to go about this please!
Here is a method I use to dump any IDataReader out to a StreamWriter. I generally create the StreamWriter like this: new StreamWriter(Response.OutputStream). I convert any double-quote characters in the input into single-quote characters (maybe not the best way to handle this, but it works for me).
public static void createCsvFile(IDataReader reader, StreamWriter writer) {
string Delimiter = "\"";
string Separator = ",";
// write header row
for (int columnCounter = 0; columnCounter < reader.FieldCount; columnCounter++) {
if (columnCounter > 0) {
writer.Write(Separator);
}
writer.Write(Delimiter + reader.GetName(columnCounter) + Delimiter);
}
writer.WriteLine(string.Empty);
// data loop
while (reader.Read()) {
// column loop
for (int columnCounter = 0; columnCounter < reader.FieldCount; columnCounter++) {
if (columnCounter > 0) {
writer.Write(Separator);
}
writer.Write(Delimiter + reader.GetValue(columnCounter).ToString().Replace('"', '\'') + Delimiter);
} // end of column loop
writer.WriteLine(string.Empty);
} // data loop
writer.Flush();
}
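For the client-server scenario in the question, calling this might look roughly like the following; the connection string, query, and output path are placeholders, and the usual System.Data.SqlClient and System.IO usings are assumed:
using (var conn = new SqlConnection("your-connection-string"))
using (var cmd = new SqlCommand("SELECT * FROM SomeTable", conn))
{
    conn.Open();
    using (IDataReader reader = cmd.ExecuteReader())
    using (var writer = new StreamWriter(@"C:\Temp\results.csv"))
    {
        // streams the header and data rows straight into the file
        createCsvFile(reader, writer);
    }
}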
As mentioned, there are quite a few issues with delimiters, escaping characters correctly, and formatting different types correctly. But if you are just looking for an example of putting data into a string, here is yet another one. It does not do any checking for the aforementioned complications.
public static void ReaderToString( IDataReader Reader )
{
while ( Reader.Read() )
{
StringBuilder str = new StringBuilder();
for ( int i = 0; i < Reader.FieldCount; i++ )
{
if ( Reader.IsDBNull( i ) )
str.Append( "null" );
else
str.Append( Reader.GetValue( i ).ToString() );
if ( i < Reader.FieldCount - 1 )
str.Append( ", " );
}
// do something with the string here
Console.WriteLine(str);
}
}
When dealing with CSV files I usually go for the FileHelpers library: it has a SqlServerStorage class which you can use to read records from SQL Server and write them to a CSV file.
You may be able to adapt the implementation of a CSV writer available here.
If you also need to parse CSV files, the implementation here is relatively good.
The CSV format is more complicated than it looks, particularly if you're going to deal with arbitrary data coming back from a query. You need to be able to handle escaping of special characters (like quotes and commas), dealing with line breaks, and the like. You are better off finding and using a proven implementation, especially if you're new to C#.
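To give a flavour of what that escaping involves, here is a minimal field-quoting helper in the RFC 4180 style (wrap a field in quotes when it contains the delimiter, a quote, or a line break, and double any embedded quotes); a proven library still covers far more edge cases:
static string EscapeCsvField(string field)
{
    if (field == null)
        return string.Empty;

    bool needsQuoting = field.Contains(",") || field.Contains("\"")
        || field.Contains("\r") || field.Contains("\n");
    if (!needsQuoting)
        return field;

    // double any embedded quotes and wrap the whole field in quotes
    return "\"" + field.Replace("\"", "\"\"") + "\"";
}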
You can get the table column names like this:
SqlConnection conn = new SqlConnection(connString);
conn.Open();
SqlCommand cmd = new SqlCommand(sql, conn);
SqlDataReader rdr = cmd.ExecuteReader();
DataTable schema = rdr.GetSchemaTable();
foreach (DataRow row in schema.Rows)
{
foreach (DataColumn col in schema.Columns)
Console.WriteLine(col.ColumnName + " = " + row[col]);
}
rdr.Close();
conn.Close();
Of course, you only need to determine the column names once (from the first row); here it does it for every row.
You can now put in your own code to join the columns into a CSV line pretty easily...
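For example, to turn that schema information into a quoted CSV header line (a small sketch building on the reader code above):
// collect the column names from the schema table and join them into one header line
var columnNames = new List<string>();
foreach (DataRow row in schema.Rows)
{
    columnNames.Add("\"" + row["ColumnName"].ToString() + "\"");
}
Console.WriteLine(string.Join(",", columnNames));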
Thanks