I'm writing a small app to do a little processing on some cells in a CSV file I have. I've figured out how to read and write CSV files with a library I found online, but I'm having trouble: the library parses CSV files into a DataTable, but, when I try to change a cell of the table, it isn't saving the change in the table!
Below is the code in question. I've separated the process into multiple variables and renamed some of the things to make it easier to debug for this question.
Code
Inside the loop:
string debug1 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString();
string debug2 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString().Trim();
string debug3 = readIn.Rows[i].ItemArray[numColumnToCopyFrom].ToString().Trim();
string towrite = debug2 + ", " + debug3;
readIn.Rows[i].ItemArray[numColumnToCopyTo] = (object)towrite;
After the loop:
readIn.AcceptChanges();
When I debug my code, I see that towrite is being formed correctly and everything's OK, except that the row isn't updated: why isn't it working? I have a feeling that I'm making a simple mistake here: the last time I worked with DataTables (quite a long time ago), I had similar problems.
If you're wondering why I'm adding another comma in towrite, it's because I'm combining a street address field with a zip code field - I hope that's not messing anything up.
My code is kind of messy, as I'm only trying to edit one file to make a small fix, so sorry.
The easiest way to edit individual column values is to use the DataRow.Item indexer property:
readIn.Rows[i][numColumnToCopyTo] = (object)towrite;
This isn't well-documented, but DataRow.ItemArray's get accessor returns a copy of the underlying data. Here's the implementation, courtesy of Reflector:
public object[] get_ItemArray() {
int defaultRecord = this.GetDefaultRecord();
object[] objArray = new object[this._columns.Count];
for (int i = 0; i < objArray.Length; i++) {
DataColumn column = this._columns[i];
objArray[i] = column[defaultRecord];
}
return objArray;
}
There's an awkward alternative method for editing column values: get a row's ItemArray, modify those values, then modify the row to use the updated array:
object[] values = readIn.Rows[i].ItemArray;
values[numColumnToCopyTo] = (object)towrite;
readIn.Rows.ItemArray = values;
use SetField<> method :
string debug1 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString();
string debug2 = readIn.Rows[i].ItemArray[numColumnToCopyTo].ToString().Trim();
string debug3 = readIn.Rows[i].ItemArray[numColumnToCopyFrom].ToString().Trim();
string towrite = debug2 + ", " + debug3;
readIn.Rows[i].SetField<string>(numColumnToCopyTo,towrite);
readIn.AcceptChanges();
Related
I have a nice piece of C# code which allows me to import data into a table with less columns than in the SQL table (as the file format is consistently bad).
My problem comes when I have a blank entry in a column. The values statement does not pickup an empty column from the csv. And so I receive the error
You have more insert columns than values
Here is the query printed to a message box...
As you can see there is nothing for Crew members 4 to 11, below is the file...
Please see my code:
SqlConnection ADO_DB_Connection = new SqlConnection();
ADO_DB_Connection = (SqlConnection)
(Dts.Connections["ADO_DB_Connection"].AcquireConnection(Dts.Transaction) as SqlConnection);
// Inserting data of file into table
int counter = 0;
string line;
string ColumnList = "";
// MessageBox.Show(fileName);
System.IO.StreamReader SourceFile =
new System.IO.StreamReader(fileName);
while ((line = SourceFile.ReadLine()) != null)
{
if (counter == 0)
{
ColumnList = "[" + line.Replace(FileDelimiter, "],[") + "]";
}
else
{
string query = "Insert into " + TableName + " (" + ColumnList + ") ";
query += "VALUES('" + line.Replace(FileDelimiter, "','") + "')";
// MessageBox.Show(query.ToString());
SqlCommand myCommand1 = new SqlCommand(query, ADO_DB_Connection);
myCommand1.ExecuteNonQuery();
}
counter++;
}
If you could advise how to include those fields in the insert that would be great.
Here is the same file but opened with a text editor and not given in picture format...
Date,Flight_Number,Origin,Destination,STD_Local,STA_Local,STD_UTC,STA_UTC,BLOC,AC_Reg,AC_Type,AdultsPAX,ChildrenPAX,InfantsPAX,TotalPAX,AOC,Crew 1,Crew 2,Crew 3,Crew 4,Crew 5,Crew 6,Crew 7,Crew 8,Crew 9,Crew 10,Crew 11
05/11/2022,241,BOG,SCL,15:34,22:47,20:34,02:47,06:13,N726AV,"AIRBUS A-319 ",0,0,0,36,AV,100612,161910,323227
Not touching the potential for sql injection as I'm free handing this code. If this a system generated file (Mainframe extract, dump from Dynamics or LoB app) the probability for sql injection is awfully low.
// Char required
char FileDelimiterChar = FileDelimiter.ToChar()[0];
int columnCount = 0;
while ((line = SourceFile.ReadLine()) != null)
{
if (counter == 0)
{
ColumnList = "[" + line.Replace(FileDelimiterChar, "],[") + "]";
// How many columns in line 1. Assumes no embedded commas
// The following assumes FileDelimiter is of type char
// Add 1 as we will have one fewer delimiters than columns
columnCount = line.Count(x => x == FileDelimiterChar) +1;
}
else
{
string query = "Insert into " + TableName + " (" + ColumnList + ") ";
// HACK: this fails if there are embedded delimiters
int foundDelimiters = line.Count(x => x == FileDelimiter) +1;
// at this point, we know how many delimiters we have
// and how many we should have.
string csv = line.Replace(FileDelimiterChar, "','");
// Pad out the current line with empty strings aka ','
// Note: I may be off by one here
// Probably a classier linq way of doing this or string.Concat approach
for (int index = foundDelimiters; index <= columnCount; index++)
{
csv += "','";
}
query += "VALUES('" + csv + "')";
// MessageBox.Show(query.ToString());
SqlCommand myCommand1 = new SqlCommand(query, ADO_DB_Connection);
myCommand1.ExecuteNonQuery();
}
counter++;
}
Something like that should get you a solid shove in the right direction. The concept is that you need to inspect the first line and see how many columns you should have. Then for each line of data, how many columns do you actually have and then stub in the empty string.
If you change this up to use SqlCommand objects and parameters, the approximate logic is still the same. You'll add all the expected parameters by figuring out columns in the first line and then for each line you will add your values and if you have a short row, you just send the empty string (or dbnull or whatever your system expects).
The big take away IMO is that CSV parsing libraries exist for a reason and there are so many cases not addressed in the above psuedocode that you'll likely want to trash the current approach in favor of a standard parsing library and then while you're at it, address the potential security flaws.
I see your updated comment that you'll take the formatting concerns back to the source party. If they can't address them, I would envision your SSIS package being
Script Task -> Data Flow task.
Script Task is going to wrangle the unruly data into a strict CSV dialect that a Data Flow task can handle. Preprocessing the data into a new file instead of trying to modify the existing in place.
The Data Flow then becomes a chip shot of Flat File Source -> OLE DB Destination
Here's how you can process this file... I would still ask for Json or XML though.
You need two outputs set up. Flight Info (the 1st 16 columns) and Flight Crew (a business key [flight number and date maybe] and CrewID).
Seems to me the problem is how the crew is handled in the CSV.
So basic steps are Read the file, use regex to split it, write out first 16 col to output1 and the rest (with key) to flight crew. And skip the header row on your read.
var lines = System.File.IO.ReadAllLines("filepath");
for(int i =1; i<lines.length; i++)
{
var = new System.Text.RegularExpressions.Regex("new Regex("(?:^|,)(?=[^\"]|(\")?)\"?((?(1)(?:[^\"]|\"\")*|[^,\"]*))\"?(?=,|$)"); //Some code I stole to split quoted CSVs
var m = r.Matches(line[i]); //Gives you all matches in a MatchCollection
//first 16 columns are always correct
OutputBuffer0.AddRow();
OutputBuffer0.Date = m[0].Groups[2].Value;
OutputBuffer0.FlightNumber = m[1].Groups[2].Value;
[And so on until m[15]]
for(int j=16; j<m.Length; j++)
{
OutputBuffer1.AddRow(); //This is a new output that you need to set up
OutputBuffer1.FlightNumber = m[1].Groups[2].Value;
[Keep adding to make a business key here]
OutputBuffer1.CrewID = m[j].Groups[2].Value;
}
}
Be careful as I just typed all this out to give you a general plan without any testing. For example m[0] might actually be m[0].Value and all of the data types will be strings that will need to be converted.
To check out how regex processes your rows, please visit https://regex101.com/r/y8Ayag/1 for explanation. You can even paste in your row data.
UPDATE:
I just tested this and it works now. Needed to escape the regex function. And specify that you wanted the value of group 2. Also needed to hit IO in the File.ReadAllLines.
The solution that I implemented in the end avoided the script task completely. Also meaning no SQL Injection possibilities.
I've done a flat file import. Everything into one column then using split_string and a pivot in SQL then inserted into a staging table before tidy up and off into main.
Flat File Import to single column table -> SQL transform -> Load
This also allowed me to iterate through the files better using a foreach loop container.
ELT on this occasion.
Thanks for all the help and guidance.
I have some data and I want to write them to a specific line in notepad using C#.
For example I have two textboxes and the data inside them are "123 Hello", for textBox1, and "565878 Hello2" for textBox2.
When I press SAVE button, those data will be saved into one file but with different line. I want to save the first data in the first line and the second data in the third line.
How can I do this?
This question is too broad. The simple answer is that you write the two lines to a file, but write a newline (either "\r\n" or Environment.NewLine) between each string. That will put the two strings on different lines. If you want the second string on the third line, then you should write two newlines between each string.
If neither of those are the answer, then you need to be a lot more specific about why not. Is the file empty to start with? What have you tried? Where, specifically, are you getting stuck? What platform?
And I really don't see what this has to do with NotePad.
EDIT:
You have clarified that you are starting with an existing text file and want to replace the content at the specified lines.
This is a more complex thing to do, and may be beyond your skills if you are just starting out. The basic approach is this:
Assuming you can read the entire file into memory, load the file into a string. You will have to parse new lines to find the lines you want to replace. You can then just replace those parts of the string with the new data. When finished, write the file back to disk.
If the file is too big to load into memory, then it becomes much more complex. I'm sorry, but since you've done such a poor job of describing the issue, I'm not going to the trouble of going over the details for this case. And such a task probably falls outside the scope of a stackoverflow answer any way.
If you line numbers are not fixed you can do something like below:
class Program
{
private static void Main()
{
var data = "";
const string data1 = "Data1";//First Data
const string data2 = "Data2";//Second Data
const int line1 = 1;//First Data Line
const int line2 = 3;//Second Data Line
var maxNoOfLines = Math.Max(line1, line2);
for (var i = 1; i <= maxNoOfLines; i++)
{
if (i == line1)
{
data += data1 + Environment.NewLine;
}
else if (i == line2)
{
data += data2 + Environment.NewLine;
}
else
{
data += Environment.NewLine;
}
}
File.WriteAllText(#"C:\NOBACKUP\test.txt", data);
}
}
Otherwise if line numbers are fixed it will be much more simpler. You can just remove the loop from above and hardcode the values.
This is a bit of an odd question, and I'm sure there was an easier way of doing this but...
I have a method that returns a List of names. I am parsing through this list with a foreach loop. I am adding each name to one long String so that I can set the text of a Table Cell to that string. But I can't seem to get it to add a new line. Is this possible?
Here's a snippet of the code:
Earlier my table loop:
TableCell tempCell = new TableCell();
The issue:
// Returns a List of Employees on the specified day and time.
List<string> EmployeeList = CheckForEmployees(currentDay, timeCount);
string PrintedList = "";
foreach (String s in EmployeeList)
{
PrintedList += s;
// PrintedList += s + System.Environment.NewLine;
// PrintedList += s + " \n";
}
tempCell.Text = PrintedList;
Both the commented code lines didn't work. Any ideas?
You need to append a break tag since you want the new line to show in HTML. So append <br/>. I would also recommend using a stringbuilder if it is more than a few iterations.
Table cell as in HTML table cell? Do you perhaps need to use the "br" tag instead of normal newline?
Another thing, you should use StringBuilder instead of doing string concatenation the way you do. Or even better, String.Join.
string PrintedList = String.Join("br-tag", EmployeeList);
Again, not sure if the br-tag is what you are after, but prefer to use the methods in the String class when possible.
I have the following data from an excel sheet:
06:07:00 6:07
Data1
Data2
Data3
Data4
06:15:00 06:15
Data5
Data6
Data7
Data8
I want to compare this to the following data from text file:
XXXXXXXXXX 06:08:32 13.0 Data1
XXXXXXXXXX 06:08:45 6.0 Data2
xxxxxxxxxx 06:08:51 5.0 Data3
xxxxxxxxxx 06:08:56 13.0 Data4
xxxxxxxxxx 06:13:44 9.0 Data5
xxxxxxxxxx 06:13:53 11.0 Data6
xxxxxxxxxx 06:14:04 6.0 Data7
xxxxxxxxxx 06:14:10 13.0 Data8
As I want to use the time to compare the two files (excel with text), Time is different for each group. Group1(data1 to Data4), group2 (Data5-data8).
Does anyone have any idea how to go about this situation.
EDIT1:
Here is what I tried to do:
private void doTest(string time)
{
TimeSpan ts = TimeSpan.Parse(time);
int hours = ts.Hours;
int min = ts.Minutes;
int sec = ts.Seconds;
int minstart, minend;
string str;
minstart = min - 5;
minend = min + 5;
while (min != minend)
{
sec = sec + 1;
if (sec < 60)
{
if (hours < 10)
str = hours.ToString().PadLeft(2, '0');
else str = hours.ToString();
if (minstart < 10)
str = str + minstart.ToString().PadLeft(2, '0');
else str = str + minstart.ToString();
if (sec < 10)
str = str + sec.ToString().PadLeft(2, '0');
else str = str + sec.ToString();
chkwithtext(str);
}
else if (sec == 60)
{
sec = 00;
min = min + 1;
str = hours.ToString() + min.ToString() + sec.ToString();
chkwithtext(str);
}
}
}
private void chkwithtext(string str)
{
// check with the text file here if time doesn't match go
// back increment the time with 1sec and then check here again
}
It's not precisely clear how you are 'comparing' the times, but for this answer I'll make the assumption that data from the text file is to be compared if, and only if, its timestamp is within x minutes (defaulting to x = 5) of the Excel timestamp.
My recommendation would be to use an Excel add-in called Schematiq for this - you can download this (approx. 9MB) from http://schematiq.htilabs.com/ (see screenshots below). It's free for personal, non-commercial use. (Disclaimer: I work for HTI Labs, the authors of Schematiq.)
However, I'd do the time handling in Excel. First we'll calculate the start/stop limits for the Excel timestamps. For example, for the first time (06:07:00) we want the range 6:02-6:12. We'll also break the actual, 'start' and 'end' times into hours, minutes and seconds for ease later on. The Excel data sheet looks like this:
Next we need a Schematiq 'template function' which will take the start and end times and return us a range of times. This template is shown here:
The input values to this function are effectively 'dummy' values - the function is compiled internally by Schematiq and can then be called with whatever inputs are required. The 'Result' cell contains text starting with '~#...' (and likewise several of the previous cells) - this indicates a Schematiq data-link containing a table, function or other structure. To view it, you can click the cell and look in the Schematiq Viewer which appears as a task pane within Excel like this:
In other words, Schematiq allows you to hold an entire table of data within a single cell.
Now everything is set up, we simply import the text file and get Schematiq to do the work for us. For each 'time group' within the Excel data, a suitable range of times is generated and this is matched against the text file. You are returned all matching data, plus any unmatched data from both Excel and the text file. The necessary calculations are shown here:
Your Excel worksheet is therefore tiny, and clicking on the final cell will display the final results in the Schematiq Viewer. The results, including the Excel data and the 'template calculation', are shown here:
To be clear, what you see in this screenshot is the entire contents of the workbook - there are no other calculations taking place anywhere other than in the actual cells you see.
The 'final results' themselves are shown enlarged here:
This is exactly the comparison you're after (with a deliberately introduced error - Data9 - in the text file, to demonstrate the matching). You can then carry out whatever comparisons or further analysis you need to.
All of the data-links represent the use of Schematiq functions - the syntax is very similar to Excel and therefore easy to pick up. As an example, the call in the final cell is:
=tbl.SelectColumns(D21, {"Data","Text file"}, TRUE)
This selects all columns from the Schematiq table in cell D21 apart from the 'Data' and 'Text file' columns (the final Boolean argument to this function indicates 'all but').
I'd recommend downloading Schematiq and trying this for yourself - I'd be very happy to email you a copy of the workbook I've put together, so it should just run immediately.
I'm not sure if I understand what do you mean, but I'd start with exporting excel file to csv with ; separator - it's way much easier to work this way. Then some simple container class:
public class DataTimeContainer
{
public string Data;
public string TimeValue1 = string.Empty;
public string TimeValue2 = string.Empty;
}
And use it this way:
//Processint first file
List<DataTimeContainer> Container1 = new List<DataTimeContainer>();
string[] lines = File.ReadAllLines("c:\\data1.csv");
string groupTimeValue1 = string.Empty;
string groupTimeValue2 = string.Empty;
foreach (string[] fields in lines.Select(l => l.Split(';')))
{
//iterating over every line, splited by ';' delimiter
if (!string.IsNullOrWhiteSpace(fields[0]))
{
//we're in a line having both values, like:
//06:07:00 ; 6:07
groupTimeValue1 = fields[0];
groupTimeValue2 = fields[1];
}
else
//we're in line looking like this:
// ; DataX
Container1.Add(new DataTimeContainer(){Data = fields[1], TimeValue1 = groupTimeValue1, TimeValue2 = groupTimeValue2});
}
//Processing second file
List<DataTimeContainer> Container2 = new List<DataTimeContainer>();
lines = File.ReadAllLines("c:\\data2.txt");
foreach (string[] fields in lines.Select(l => l.Split(';')))
{
Container2.Add(new DataTimeContainer() { TimeValue1 = fields[1], TimeValue2 = fields[2], Data = fields[3]});
}
DoSomeComparison();
Of course I'm using strings as data types because I do not know what kind of objects they're supposed to be. Let me know how's that working for you.
If this is a one-time comparison, I would recommend just pulling the text file into Excel (using the Text-to-Columns tools if needed) and running a comparison there with the built-in functions.
If however you need to do this frequently, something like Tarec suggested would be a good start. It seems like you're trying to compare separate event logs within a given timespan (?) - your life will be easier if you parse to objects with DateTime properties instead of comparing text strings.
Populate your Data from your 2 sources(excel and text file) into 2 lists .
Make sure that Lists are of same type .
I would recommend Convert your Excel data to Text File Format .. and then populate Each line of text file and Excel file data into string List.
And then you can compare your List by using the LINQ or Enumerable Methods .
Quickest way to compare two List<>
Hi i have the following which creates two worksheets in an excel spreadsheet based on the values in a datagrid, I am able to get it working for two datagrids, however i need to do it for 14 datagrids, this is what i have got so far;
var grid1Output = RadGridView1.ToExcelML();
var grid2Output = RadGridView2.ToExcelML().Replace("Worksheet1", "Worksheet2");
var workBook = grid1Output.Replace("</Worksheet>", "</Worksheet>" +
grid2Output.Substring(grid2Output.IndexOf("<Worksheet"),
grid2Output.IndexOf("</Worksheet")- grid2Output.IndexOf("<Worksheet")) + " </Worksheet>");
The above works fine, however I need to do it for 14 gridoutputs in total. My problem is, I am having trouble replacing strings at the right place. How do i do this?
I would probably do it with Linq for XML methods rather than string manipulation, but the choice is yours.
Either way, it shouldn't be that hard to write a method that takes a grid output (I am assuming it's a string), extracts the contents, and returns them. The calling routine then assembles the 14 XML strings and wraps them in a single Worksheet tag.
Here's a stab at it. Bear in mind that I'm not familiar with the RadGridView and the output of ToExcelML, so you probably won't be able to use this code without some modification. I'm making some assumptions that may not be valid.
First, I would create a method that takes an XML string as input. I am assuming that this string is entirely wrapped in a <Worksheetn> tag.
string ExtractWorksheetContents(string excelML, int index)
{
// You might also be able to do this with a regex, depending on how the contents are structured
// Since I don't know enough about the content, I will do this with string manipulation, as
// you did, rather than loading the XML and making assumptions.
string tagName = string.Format("Worksheet{0}", index);
int worksheetStart = excelML.IndexOf("<" + tagName);
int worksheetEnd = excelML.IndexOf("</" + tagName + ">") + tagName.Length + 3);
// Should contain some checks that neither w'sheet start nor end are -1
return excelML.Substring(worksheetStart, worksheetEnd-worksheetStart);
}
Then I would assemble the results. Again, I'm making assumptions about how the XML is structured.
StringBuilder sb = new StringBuilder();
sb.Append("<Worksheet>");
RadGridView[] gridViews = new RadGridView[] { RadGridView1, RadGridView2 .... RadGridView14 };
for(int i=0;i<14; i++)
{
var rgv = gridViews[i];
sb.Append(ExtractWorksheetContents(rgv.ToExcelML(),i+1));
}
sb.Append("</Worksheet>");
var workBook = sb.ToString();
Hope this helps somewhat.