I have a table below that contains the students' results in a DataGridView:-
ID NAME RESULT
1 Peter PASS
1 Peter SILVER
2 Sam FAIL
2 Sam SILVER
3 Simon FAIL
4 Cliff PASS
5 Jason FAIL
5 Jason FAIL
6 Leonard PASS
6 Leonard FAIL
I'm trying to produce a simple program that filters out certain rows based on the Result column at the click of a button. So far I have been able to filter out the rows with PASS and/or SILVER as their Result and display only the FAIL rows.
The problem is that whenever the button is clicked, it removes the rows with PASS and/or SILVER except for the 2nd row: 1 Peter SILVER. That leaves me with this table below as the end result:-
The only way to resolve this right now is to click the button again.
Below is the source code for the button:-
private void btnGenerate_Click(object sender, EventArgs e)
{
    if (dtList.Rows.Count != 0)
    {
        try
        {
            foreach (DataGridViewRow dr in dtList.Rows)
            {
                //Column names in excel file
                string colID = dr.Cells["ID"].Value.ToString();
                string colName = dr.Cells["Name"].Value.ToString();
                string colResult = dr.Cells["Result"].Value.ToString();
                if (!colResult.Equals("FAIL", StringComparison.InvariantCultureIgnoreCase))
                {
                    dtList.Rows.Remove(dr);
                }
            }
        }
        catch
        {
        }
    }
}
The problem is you are changing the list you are iterating over. This is never a great idea...
In this case you look at the first row (Peter/Pass) and remove it. You then look at the second row. But wait, we removed a row so the second row is in fact actually now the old third row - we have just skipped the original second row.
You don't notice this problem anywhere else because all other rows that want to be removed are followed by rows you want to keep.
The way to fix this is to do one of the following:
1. Create a new list with the items you want to keep and then bind that new list to wherever you are displaying it.
2. Create a list of the items that you want to remove while you are iterating over the table. Once the iteration is finished, iterate over that list and remove each item from the table.
3. Iterate through the list with a for loop starting at the last index. This means that when you remove an item you only affect those that come after it, which in this case you will have already processed.
The second is probably the easiest way to go in this situation. The first involves extra code and the third may not be obvious why you are doing that to somebody that comes after you.
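As a rough sketch of the second and third options, here is how they might look on a plain List instead of the grid. StudentRow and RowFilter are made-up names for illustration only and are not part of the original code. Note that List<T> is stricter than the grid: it throws an InvalidOperationException if you remove items inside a foreach over it, rather than silently skipping a row.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one grid row (not a type from the question).
public sealed class StudentRow
{
    public int Id { get; }
    public string Name { get; }
    public string Result { get; }

    public StudentRow(int id, string name, string result)
    {
        Id = id;
        Name = name;
        Result = result;
    }
}

public static class RowFilter
{
    private static bool IsFail(StudentRow row) =>
        string.Equals(row.Result, "FAIL", StringComparison.OrdinalIgnoreCase);

    // Option 2: remember what to remove, then remove it after the loop has finished.
    public static void KeepFailsTwoPass(List<StudentRow> rows)
    {
        var toRemove = new List<StudentRow>();
        foreach (StudentRow row in rows)
        {
            if (!IsFail(row)) toRemove.Add(row);
        }
        foreach (StudentRow row in toRemove)
        {
            rows.Remove(row);
        }
    }

    // Option 3: walk backwards so a removal only shifts rows that were already visited.
    public static void KeepFailsReverse(List<StudentRow> rows)
    {
        for (int i = rows.Count - 1; i >= 0; i--)
        {
            if (!IsFail(rows[i])) rows.RemoveAt(i);
        }
    }
}
```

In the DataGridView case the body of the reverse loop would presumably read the Result cell of dtList.Rows[i] and call dtList.Rows.RemoveAt(i) instead.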
I have a file source where the data is not in normalized form with any sort of primary key value or repeating group value. I am using Merge Join to put the multiple rows into one merged row. I need to apply some row numbering so that I have a join between the multiple rows, to get them into the one single row for the merge join.
Here is what the source data looks like:
Data Rows:
MSH|BLAH|||BLAHBLAH15|BLAHZ|||
EVN|MOREBLAH|BLAHBLAH11|BLAHY|||
PID|BLAHXX|BLAHBLAH655|BLAHX|||
PV1|BLAHX2|BLAHBLAH42|BLAHX|||||||||
DG1|1||84|XXXX||A
IN1|1||11400|TEST
IN1|2||20100|TEST2
MSH|BLAH2|BLAHBLAH5|BLAHZ|||
EVN|BLAH6|20220131123100
PID|BLAHGG|BLAH222|BLAHX|||
PV1|PV1|BLAHX2|BLAHBLAH42|BLAHX||||||||20220101|
DG1|1||84|XXXX||A
DG1|2||84|XXXX||A
IN1|1||11600|TEST2
What is consistent is that there is always an MSH line as the header, and everything below it belongs with the MSH line at the top.
So I'm trying to accomplish this by applying a row numbering as below, where it goes from 1,1,1,1 to 2,2,2,2,2 incrementing by one wherever it finds the MSH line, as per below:
Data Rows: Numbering Needed:
MSH|BLAH|||BLAHBLAH15|BLAHZ||| 1
EVN|MOREBLAH|BLAHBLAH11|BLAHY||| 1
PID|BLAHXX|BLAHBLAH655|BLAHX||| 1
PV1|BLAHX2|BLAHBLAH42|BLAHX||||||||| 1
DG1|1||84|XXXX||A 1
IN1|1||11400|TEST 1
IN1|2||20100|TEST2 1
MSH|BLAH2|BLAHBLAH5|BLAHZ||| 2
EVN|BLAH6|20220131123100 2
PV1|PV1|BLAHX2|BLAHBLAH42|BLAHX|||||| 2
DG1|1||84|XXXX||A 2
DG1|2||84|XXXX||A 2
IN1|1||11600|TEST2 2
I can't use a specific row count to reset the number, i.e. increment the row numbering every 5 rows, because the number of rows is inconsistent each time. In the example above, the first set is 7 rows and the 2nd set is 6 rows. I have to do my incrementing by the presence of the "MSH" row value, and apply the same number on down until it finds the next "MSH". I know that I have to use a script task (preferably in C#) to generate this row number since my source is a file. But I just can't seem to find the right logic that will do this, since my data doesn't have a repeating key for each row that I can partition by.
This is what I would do to meet your requirements:
Read each entire row as a single column with a flat file source.
Send it into a script component and add the row column as a read-only input column.
Set up an output for each row type.
Go into the code.
Set up a variable outside of the main row processing (in startup, or as a class-level field):
int counter = 0;
In main row processing:
// Row stands for the string value of the single input column here;
// in a real script component that would be Row.<YourColumnName>.
string[] details = Row.Split('|');
switch (details[0])
{
    case "MSH":
        counter++; //increment counter
        OutputBufferMSH.AddRow();
        OutputBufferMSH.Counter = counter;
        OutputBufferMSH.Col1 = details[1];
        // Repeat for each detail Column
        break;
    case "EVN":
        OutputBufferEVN.AddRow();
        OutputBufferEVN.Counter = counter;
        OutputBufferEVN.Col1 = details[1];
        // Repeat for each detail Column
        break;
    //Repeat for each row type
}
I personally would not use this counter approach but actually load the MSH row and return the identity column to replace the counter.
Honestly, I would do the whole thing in a console application instead and use a StreamReader to load the flat file. Read the lines, then use the above logic to push the data into DataTables and use a bulk insert to load the data. But the above is the solution to do this in SSIS.
There is a lot to unpack here if you are not familiar with C# or the script task object itself.
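If it helps to see the counter logic on its own, away from the SSIS buffers, here is a minimal sketch. GroupNumbering and Number are hypothetical names, and the lines are assumed to arrive as plain strings in file order. Any line that appears before the first MSH would get group 0 in this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class GroupNumbering
{
    // Pairs every line with the number of the MSH group it belongs to.
    // The counter goes up by one each time a line starts with the MSH segment.
    public static List<(int Group, string Line)> Number(IEnumerable<string> lines)
    {
        var numbered = new List<(int Group, string Line)>();
        int counter = 0;
        foreach (string line in lines)
        {
            string segment = line.Split('|')[0];
            if (segment == "MSH")
            {
                counter++;
            }
            numbered.Add((counter, line));
        }
        return numbered;
    }
}
```

The same counter is what gets written to each output buffer in the script above, so rows from different outputs can be joined back together on it.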
The saga of trying to chop flat files up into useable bits continues!
You may see from my other questions that I am trying to wrangle some flat file data into various bits using C# transformer in SSIS. The current challenge is trying to turn a selection of rows with one column into one row with many columns.
A friend has very kindly tipped me off to use List and then to somehow loop through that in the PostExecute().
The main problem is that I do not know how to loop through and create a row to add to the Output Buffer programmatically - there might be a variable number of fields listed in the flat file, there is no consistency. For now, I have allowed for 100 outputs and called these pos1, pos2, etc.
What I would really like to do is count everything in my list, and loop through that many times, incrementing the numbers accordingly - i.e. fieldlist[0] goes to OutputBuffer.pos1, fieldlist[1] goes to OutputBuffer.pos2, and if there is nothing after this then nothing is put in pos3 to pos100.
The secondary problem is that I can't even confirm that my list and the write to an output table are working when I use OutputBuffer directly in PostExecute, never mind working out a loop.
The file has all sorts in it, but the list of fields is handily contained between START-OF-FIELDS and END-OF-FIELDS, so I have used the same logic as before to only process the rows in the middle of those.
bool passedSOF;
bool passedEOF;
List<string> fieldlist = new List<string>();

public override void PostExecute()
{
    base.PostExecute();
    OutputBuffer.AddRow();
    OutputBuffer.field1=fieldlist[0];
    OutputBuffer.field2=fieldlist[1];
}

public override void Input_ProcessInputRow(InputBuffer Row)
{
    if (Row.RawData.Contains("END-OF-FIELDS"))
    {
        passedEOF = true;
        OutputBuffer.SetEndOfRowset();
    }
    if (passedSOF && !passedEOF)
    {
        fieldlist.Add(Row.RawData);
    }
    if(Row.RawData.Contains("START-OF-FIELDS"))
    {
        passedSOF = true;
    }
}
I have nothing underlined in red, but when I try to run this I get an error message about PostExecute() and "object reference not set to an instance of an object", which I thought meant something contained a null where it shouldn't, but in my test file I have more than two fields between START and END markers.
So first of all, what am I doing wrong in the example above, and secondly, how do I do this in a proper loop? There are only 100 possible outputs right now, but this could increase over time.
"Post execute" It's named that for a reason.
The execution of your data flow has ended and this method is for cleanup or anything that needs to happen after execution - like modification of SSIS variables. The buffers have gone away, so there's no way to interact with the contents of the buffers at this point.
As for the rest of your problem statement... it needs focus
So once again I have misunderstood a basic concept - PostExecute cannot be used to write out in the way I was trying. As people have pointed out, there is no way to do anything with the buffer contents here.
I cannot take credit for this answer, as again someone smarter than me came to the rescue, but I have got permission from them to post the code in case it is useful to anyone. I hope I have explained this OK, as I only just understand it myself and am very much learning as I go along.
First of all, make sure to have the following in your namespace:
using System.Reflection;
using System.Linq;
using System.Collections.Generic;
These are going to be used to get properties for the Output Buffer and to allow me to output the first item in the list to pos_1, the second to pos_2, etc.
As usual I have two boolean variables to determine if I have passed the row which indicates the rows of data I want have started or ended, and I have my List.
bool passedSOF;
bool passedEOF;
List<string> fieldList = new List<string>();
Here is where it is different - as I have something which indicates I am done processing my rows, which is the row containing END-OF-FIELDS, when I hit that point, I should be writing out my collected List to my output buffer. The aim is to take all of the multiple rows containing field names, and turn that into a single row with multiple columns, with the field names populated across those columns in the row order they appeared.
if (Row.RawData.Contains("END-OF-FIELDS"))
{
    passedEOF = true;
    //IF WE HAVE GOT TO THIS POINT, WE HAVE ALL THE DATA IN OUR LIST NOW
    OutputBuffer.AddRow();
    var fields = typeof(OutputBuffer).GetProperties();
    //SET UP AND INITIALISE A VARIABLE TO HOLD THE ROW NUMBER COUNT
    int rowNumber = 0;
    foreach (var fieldName in fieldList)
    {
        //ADD ONE TO THE CURRENT VALUE OF rowNumber
        rowNumber++;
        //MATCH THE ROW NUMBER TO THE OUTPUT FIELD NAME
        PropertyInfo field = fields.FirstOrDefault(x => x.Name == string.Format("pos{0}", rowNumber));
        if (field != null)
        {
            field.SetValue(OutputBuffer, fieldName);
        }
    }
    OutputBuffer.SetEndOfRowset();
}
if (passedSOF && !passedEOF)
{
    this.fieldList.Add(Row.RawData);
}
if (Row.RawData.Contains("START-OF-FIELDS"))
{
    passedSOF = true;
}
So instead of having something like this:
START-OF-FIELDS
FRUIT
DAIRY
STARCHES
END-OF-FIELDS
I have the output:
pos_1 | pos_2 | pos_3
FRUIT | DAIRY | STARCHES
So I can build a position key table to show which field will appear in which order in the current monthly file, and now I am looking forward into getting myself into more trouble splitting the actual data rows out into another table :)
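For anyone who wants to try the reflection step outside SSIS, here is a small sketch of the same idea. FakeOutputBuffer and PositionMapper are made-up names standing in for the generated buffer class, and the "pos" prefix is an assumption that has to match whatever your output columns are really called. Values beyond the last available property are simply ignored, and unused properties stay null.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for the generated output buffer class.
public class FakeOutputBuffer
{
    public string pos1 { get; set; }
    public string pos2 { get; set; }
    public string pos3 { get; set; }
}

public static class PositionMapper
{
    // Copies values[0] to pos1, values[1] to pos2, and so on,
    // looking each property up by name at run time.
    public static void Fill(object target, IList<string> values)
    {
        Type type = target.GetType();
        for (int i = 0; i < values.Count; i++)
        {
            PropertyInfo property = type.GetProperty("pos" + (i + 1));
            if (property != null && property.CanWrite)
            {
                property.SetValue(target, values[i]);
            }
        }
    }
}
```

Looking the property up by name inside the loop is what removes the need to write out one assignment per output column.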
I am trying to reset the number of columns in an Excel ListObject. I know you can add and remove columns one-by-one, but I want to avoid unnecessary loops. I instead decided to resize the ListObject using the Resize method.
Here is the code that I am using (where OutputCasesTable is the ListObject):
OutputCasesTable.DataBodyRange.Value2 = "";
OutputCasesTable.Resize(OutputCasesTable.Range.Resize[ColumnSize: CaseCount]);
OutputCasesTable.DataBodyRange.Value2 = OutputCasesAray;
The above lines of code appear to work perfectly, however if the ListObject only contains 1 row of data, the DataBodyRange of the ListObject becomes null on the second line - producing an error when I try to change its cell's value. The row in excel still appears to be present.
The MSDN documentation says the following:
"The header must remain in the same row and the resulting list must overlap the original list. The list must contain a header row and at least one row of data."
Now I understand that "one row of data" implies that the row contains values - so the cause of the error here must be that the DataBodyRange cells all contain no value (""). However, a table with two data rows containing "" still doesn't have a row with data, does it?
I know there are many ways of accomplishing this task, but I want to understand why this happens.
Temporary Solution:
Replaced the code to only set the values to empty strings in columns that will be removed (columns above the new column count). All other columns will be replaced:
if(OutputCasesTable.ListColumns.Count - CaseCount > 0)
OutputCasesTable.DataBodyRange.Offset[ColumnOffset: CaseCount].Resize[ColumnSize: OutputCasesTable.ListColumns.Count - CaseCount].Value2 = "";
OutputCasesTable.Resize(OutputCasesTable.Range.Resize[ColumnSize: CaseCount]);
OutputCasesTable.DataBodyRange.Value2 = OutputCasesAray;
Personally I prefer looking at the first solution!
Is there anything I can do to make it work with empty strings? Or do you have a better solution?
The Resize operation is the piece that kills the DataBodyRange, and clearly there's some internal logic that Resize uses, along the lines of "if there is only one row, and all the cells are empty, remove all the data rows. If there is more than one row, don't remove any".
I agree that this logic is a bit confounding. If your question is why did Microsoft implement it this way, I'd argue that although it's inconsistent, it's perhaps tidier in a way - it appears to the model that you're working with an empty table, and there's no way for the model to tell the difference graphically (it's not possible for a table to just have a header row).
When Resize turns up to do its work and finds a single-row blank table, it can't tell whether you have a zero-row table or a single-row table with empty strings. If it arrives and finds two empty rows, that's unambiguous (they must be meaningful rows).
For the workaround portion of your question, I'd suggest a tidier solution of just checking the ListRows.Count property, and adding one if necessary. Note that you can also use Clear instead of setting Value2 to blank; for me it reads as more self-explanatory.
OutputCasesTable.DataBodyRange.Clear();
OutputCasesTable.Resize(OutputCasesTable.Range.Resize[ColumnSize: CaseCount]);
if (OutputCasesTable.ListRows.Count == 0) OutputCasesTable.ListRows.Add();
OutputCasesTable.DataBodyRange.Value2 = OutputCasesAray;
I have a datatable containing certain columns. I am able to display them in a repeater as a whole. However, what I want to do is to display the row, one by one. I want to create like a go next and show previous button. I did something like the following:
myDataTable.Rows.Cast<DataRow>().Take(1).CopyToDataTable();
This is giving me the first row. Now how can I use this concept to get the next row (row 2), then row 3, and so on up to row n? The rows being returned are different for different cases. It is a web application.
To get a different row, you just need to Skip some:
myDataTable.Rows.Cast<DataRow>().Skip(n).Take(1).CopyToDataTable();
where n is how many rows you want to skip. So, for the second record n would be 1.
I greatly disagree with the use of CopyToDataTable(), but it would require a lot of knowledge of your code base to provide a better approach.
I would select the row from the database instead. If you stay with the DataTable, however, use Skip(n).Take(1):
var row3 = myDataTable.AsEnumerable().Skip(2).Take(1).CopyToDataTable();
Introduce the use of .Skip():
myDataTable.Rows.Cast<DataRow>().Skip(0).Take(1).CopyToDataTable();
Now you can simply track which record the user is currently viewing, and update the value sent to the .Skip() method. For example, if the user has pressed "next" 5 times, you've incremented the value 5 times:
myDataTable.Rows.Cast<DataRow>().Skip(5).Take(1).CopyToDataTable();
Keep a counter and use Skip(n-1).Take(1) for nth record.
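To tie the suggestions above together, here is a minimal sketch of a helper that returns one row for a given position. RowPager and GetPage are hypothetical names. The index is clamped because CopyToDataTable() throws an InvalidOperationException when the sequence is empty, which is what would happen if the user pressed "next" past the last row.

```csharp
using System;
using System.Data;
using System.Linq;

public static class RowPager
{
    // Returns a one-row table for the given zero-based index,
    // clamped so that "previous" and "next" never run off either end.
    public static DataTable GetPage(DataTable source, int index)
    {
        if (source.Rows.Count == 0)
        {
            return source.Clone(); // same columns, no rows
        }
        int clamped = Math.Max(0, Math.Min(index, source.Rows.Count - 1));
        return source.Rows.Cast<DataRow>().Skip(clamped).Take(1).CopyToDataTable();
    }
}
```

Since this is a web application, the current index would have to survive between postbacks, for example in ViewState or Session; that part is left out here.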
How do I set the source data of an excel interop chart to several entire rows?
I have a .csv file that is created by my program to display some results that are produced. For the sake of simplicity let's say these results and chart are displayed like this: (which is exactly how I want it to be)
Now the problem I am having is that the number of people is variable. So I really need to access the entire rows data.
Right now, I am doing this:
var range = worksheet.get_Range("A1","D3");
xlExcel.ActiveChart.SetSourceData(range);
and this works great if you only have three Persons, but I need to access the entire row of data.
So to restate my question, how can I set the source data of my chart to several entire rows?
I tried looking here but couldn't seem to make that work with rows instead of columns.
var range = worksheet.get_Range("A1").CurrentRegion;
xlExcel.ActiveChart.SetSourceData(range);
EDIT: I am assuming that the cells in the data region won't be blank.
To test this,
1) place cursor on cell A1
2) press F5
3) click on "Special"
4) choose "Current Region" as option
5) click "OK"
This will select the cells surrounding A1 which are filled, which I believe is what you are looking for.
The translation of that in VBA code points to CurrentRegion property. I think, that should work.
Check out Range.EntireRow. I'm not 100% sure how to expand that to a single range containing 3 entire rows, but it shouldn't be that difficult to accomplish.
Another thing you can do is scan to get the actual maximum column index you need (this is assuming that there are guaranteed to be no gaps in the names), then use that index as you declare your range.
Add Code
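int c = 2; // column B
while (true)
{
    // Cells is 1-based: row 1, column c. An empty cell has a null Value2.
    Range cell = (Range)worksheet.Cells[1, c];
    if (String.IsNullOrEmpty(Convert.ToString((object)cell.Value2)))
    {
        c--; // step back to the last filled column
        break;
    }
    c++;
}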
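The scan itself can be tried without Excel on a plain array standing in for one row of cell values. This is only a sketch: RangeScan and LastFilledColumn are made-up names, and the array is assumed to hold what Value2 would return for each cell, with null for an empty one.

```csharp
using System;

public static class RangeScan
{
    // Counts filled cells from the left and stops at the first blank one.
    // The result is the 1-based index of the last filled cell, or 0 if the first cell is blank.
    public static int LastFilledColumn(object[] rowValues)
    {
        int count = 0;
        while (count < rowValues.Length
               && !string.IsNullOrEmpty(Convert.ToString(rowValues[count])))
        {
            count++;
        }
        return count;
    }
}
```

Like the loop above, this relies on there being no gaps in the row, because it stops at the first blank cell.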
Take a column from A to D that you're sure has no empty cells.
Do some loop to find the first empty one in that column and it will be one after the last.
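Range Cell = Sheet.Range["A1"]; //or another column you're sure there's no empty data
int LineOffset = 0;
// An empty cell returns null rather than "", so convert to a string before comparing.
while (Convert.ToString((object)Cell.Offset[LineOffset, 0].Value2) != "")
{
    LineOffset++;
}
// Offset is 0-based but Cells is 1-based, so the number of filled rows is also the last row number.
int LastLine = LineOffset;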
Then you can get Range[Sheet.Cells[1,1], Sheet.Cells[LastLine, 4]]
Out of the box here, but why not transpose the data? Three columns for Name, Height, Weight. Convert this from an ordinary range to a Table.
When any formula, including a chart's SERIES formula references a column of a table, it always references that column, no matter how long the table gets. Add another person (another row) and the chart displays the data with the added person. Remove a few people, and the chart adjusts without leaving blanks at the end.
This is illustrated in my tutorial, Easy Dynamic Charts Using Lists or Tables.