Comparing Values in One DataTable Column - c#

I have a DataTable that I read in from a CSV. What I would like to do is find all the duplicate names within one column titled "name" and add them to another DataTable for use later. The code I have so far:
private DataTable MatcherTable(DataTable table)
{
DataTable match = new DataTable();
match = table.Clone();
var equalRows = table.Rows.Cast<DataRow>().Where(dataRow => dataRow["name"] == dataRow["name"]).ToList();
foreach (var equalRow in equalRows)
{
match.Rows.Add(equalRow.ItemArray);
}
return match;
}
However when I return the table that should be full of matches, it returns the exact same table that I read in. Am I missing something simple?

The code simply copies every DataRow into the output table, because the comparison expression compares each row's "name" value with itself, so every row passes the filter.
You could resolve your problem with a single LINQ expression:
private DataTable MatcherTable(DataTable table)
{
DataTable match = table.Rows.Cast<DataRow>()
.GroupBy(x => x["Name"])
.Where(g => g.Count() > 1)
.Select(k => k.FirstOrDefault())
.CopyToDataTable();
return match;
}
Here we group the rows by the value in the Name column and filter out all groups with an occurrence count of less than 2. Next we take the first row from each group, building a DataRow sequence that is finally copied into the output table. Note that CopyToDataTable throws an InvalidOperationException when the sequence is empty, which happens here if no name is duplicated.
The code above returns just one row for each duplicated name. If you want to keep all the duplicate rows, then you need:
DataTable match = table.Rows.Cast<DataRow>()
.GroupBy(x => x["Name"])
.Where(g => g.Count() > 1)
.SelectMany(k => k)
.CopyToDataTable();
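As a rough, self-contained sketch of the grouping approach above: the class name DuplicateFinder, the method FindDuplicates and the sample data are made up for illustration. It assumes that an empty table with the same schema is an acceptable result when nothing repeats, and it compares names as case-sensitive strings.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DuplicateFinder
{
    // Returns every row whose value in columnName occurs more than once.
    // Returns an empty table with the same schema when nothing repeats.
    public static DataTable FindDuplicates(DataTable table, string columnName)
    {
        List<DataRow> duplicates = table.AsEnumerable()
            .GroupBy(row => row[columnName].ToString())
            .Where(group => group.Count() > 1)
            .SelectMany(group => group)
            .ToList();

        // CopyToDataTable throws InvalidOperationException on an empty sequence,
        // so fall back to an empty clone of the source table.
        return duplicates.Count > 0 ? duplicates.CopyToDataTable() : table.Clone();
    }

    public static void Main()
    {
        // Hypothetical sample data.
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Rows.Add("Ann", 30);
        table.Rows.Add("Bob", 41);
        table.Rows.Add("Ann", 25);

        DataTable match = FindDuplicates(table, "Name");
        Console.WriteLine(match.Rows.Count); // prints 2
    }
}
```

Materializing the rows with ToList() first means the query runs only once, and the count check avoids the exception for tables without duplicates.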

Create an empty List to track the names you have already seen, then do something like this:
DataTable match = table.Clone();
List<string> names = new List<string>();
foreach (DataRow row in table.Rows)
{
    string name = row["name"].ToString();
    if (names.Contains(name))
    {
        // The name was seen before, so this row is a duplicate.
        match.Rows.Add(row.ItemArray);
    }
    else
    {
        names.Add(name);
    }
}
This copies only the second and later occurrences of each name. There may still be small mistakes in it, but it is just to give you an idea!

Related

How do I GroupBy one column on this DataTable

Suppose I have a call log DataTable where each row represents a call placed with the following columns:
AccountNumber1, AccountNumber2, AccountListDate, AccountDisposition
I want to GroupBy column AccountNumber1 and want a new DataTable with the same columns + 1 additional column NumCalls which will be the count of calls for each AccountNumber1.
New DataTable after GroupBy:
AccountNumber1, AccountNumber2, AccountListDate, AccountDisposition, NumCalls
So far I have the following:
table.AsEnumerable()
.GroupBy(x => x.Field<int>("AccountNumber1"))
.Select(x => new { x.Key.AccountNumber1, NumCalls = x.Count() })
.CopyToDataTable()
Which gives me a DataTable with just two columns AccountNumber1 and NumCalls. How do I get the other columns as I described above?? I would appreciate any help. Thank you.
There's no magic, you need to use a loop and initialize the new table with the new column:
DataTable tblResult = table.Clone();
tblResult.Columns.Add("NumCalls", typeof(int));
var query = table.AsEnumerable().GroupBy(r => r.Field<string>("AccountNumber1"));
foreach (var group in query)
{
DataRow newRow = tblResult.Rows.Add();
DataRow firstOfGroup = group.First();
newRow.SetField<string>("AccountNumber1", group.Key);
newRow.SetField<string>("AccountNumber2", firstOfGroup.Field<string>("AccountNumber2"));
newRow.SetField<DateTime>("AccountListDate", firstOfGroup.Field<DateTime>("AccountListDate"));
newRow.SetField<string>("AccountDisposition", firstOfGroup.Field<string>("AccountDisposition"));
newRow.SetField<int>("NumCalls", group.Count());
}
This takes arbitrary values from the first row of each group which seems to be desired.

How to convert to int and then compare in linq query c#

IEnumerable<classB> list = getItems();
//dt is datatable
list = list.Where(x => Convert.ToInt32( !dt.Columns["Id"]) == (x.Id));
I want to keep only the items in the list whose Id also appears in the DataTable's Id column. The rest should be removed. I'm not doing it right.
The datatable can have: ID - 1,3,4,5,7
The list can have: ID - 1,2,3,4,5,6,7,8,9,10
I want the output list to have: ID - 1,3,4,5,7
Your code won't work because you're comparing a definition of a column to an integer value. That's not a sensible comparison to make.
What you can do is put all of the values from the data table into a collection that can be effectively searched and then get all of the items in the list that are also in that collection:
var ids = new HashSet<int>(dt.AsEnumerable()
    .Select(row => row.Field<int>("Id")));
list = list.Where(x => ids.Contains(x.Id));
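The HashSet approach above can be sketched as a complete program. The Item class is a hypothetical stand-in for the question's classB, and the sketch assumes the Id column is typed as int; the null check is an extra precaution that the answer does not include.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical stand-in for the question's classB.
public class Item
{
    public int Id { get; set; }
}

public static class IdFilter
{
    // Keeps only the items whose Id appears in the table's "Id" column.
    public static List<Item> KeepMatching(IEnumerable<Item> items, DataTable dt)
    {
        // Collect the ids once so each lookup is a constant-time set probe.
        var ids = new HashSet<int>(dt.AsEnumerable()
            .Where(row => !row.IsNull("Id"))
            .Select(row => row.Field<int>("Id")));
        return items.Where(item => ids.Contains(item.Id)).ToList();
    }

    public static void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("Id", typeof(int));
        foreach (int id in new[] { 1, 3, 4, 5, 7 })
        {
            dt.Rows.Add(id);
        }

        IEnumerable<Item> list = Enumerable.Range(1, 10).Select(i => new Item { Id = i });
        List<Item> kept = KeepMatching(list, dt);
        Console.WriteLine(string.Join(",", kept.Select(item => item.Id))); // prints 1,3,4,5,7
    }
}
```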
Try this one
var idList = dt.AsEnumerable().Select(d => (int) d["Id"]).ToList();
list = list.Where(x => idList.Contains(x.Id));
You can't do it like that. Your dt.Columns["Id"] returns the DataColumn, not the value stored in that column for a specific DataRow. You need to join two LINQ queries: the first one you already have, and the other you need to build from the DataTable.
var queryDt = (from dtRow in dt.AsEnumerable()
               where !dtRow.IsNull("Id")
               select int.Parse(dtRow["Id"].ToString())).ToList();
Now the join:
var qry = from nonNull in queryDt
          join existing in list on nonNull equals existing.Id
          select existing;

How do I use LINQ to filter a datatable against a List of strings that need to be split?

I have a datatable and I want to use LINQ to filter it against a List of strings, where each string contains two values delimited by a pipe ('|').
The list of strings (List Actions) looks like this. There are only two strings in this list, but it can have many more.
8/1/2013 9:57:52 PM|Login for bill.lock#cap.com
8/1/2013 9:57:37 PM|Login for bill.lock#cap.com
The datatable has five (5) fields in each row, and I'm using each string from the list above to compare two fields (Text and Time) in the datatable to omit or delete those rows.
The datatable is structured like this
DataTable stdTable = new DataTable("Actions");
DataColumn col1 = new DataColumn("Area");
DataColumn col2 = new DataColumn("Action");
DataColumn col3 = new DataColumn("Time");
DataColumn col4 = new DataColumn("Text");
Currently I'm manually performing all this, but I know it can be done in LINQ with just a few lines of code. I'm not sure how to iterate thru the list and use the split. I saw this example, but the split is beyond me.
// Get all checked id's.
var ids = chkGodownlst.Items.OfType<ListItem>()
.Where(cBox => cBox.Selected)
.Select(cBox => cBox.Value)
.ToList();
// Now get all the rows that has a CountryID in the selected id's list.
var a = dt.AsEnumerable().Where(r =>
ids.Any(id => id == r.Field<int>("CountryID"))
);
// Create a new table.
DataTable newTable = a.CopyToDataTable();
Any help would be appreciated.
Thanks
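List<string> list = new List<string> {
    "8/1/2013 9:57:52 PM|Login for bill.lock#cap.com",
    "8/1/2013 9:57:37 PM|Login for bill.lock#cap.com"
};
// Keep only the rows whose Time and Text do not match any entry in the list.
var a = dt.AsEnumerable().Where(x =>
    !list.Select(y => new {
        Time = DateTime.Parse(y.Split('|')[0]),
        Text = y.Split('|')[1]
    })
    .Any(z => z.Time == Convert.ToDateTime(x["Time"]) && z.Text == x["Text"].ToString()));
or
// This variant assumes the Time column holds text in the same format as the list entries.
var a = dt.AsEnumerable().Where(x =>
    !list.Any(y => y == string.Format("{0}|{1}", x["Time"], x["Text"])));
DataTable newTable = a.CopyToDataTable();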

Gridview Error Data Contains No Rows when Used with Linq

I have this Linq statement that filters transactions. It works fine when it filters but I get an error in dt.AsEnumerable() when there is nothing being returned.
The error is Data contains no row. Anybody know how to handle when there is nothing returned?
newDataTable = dt.AsEnumerable()
.Where(r => !ListLinkedIds.Contains(r.Field<int>("LinkedTicketId")))
.CopyToDataTable();
gvMain.DataSource = newDataTable;
gvMain.DataBind();
You cannot use CopyToDataTable if the input sequence is empty. So you need to check that first:
var newDataTable = dt.Clone(); // an empty table with the same schema
var ticketRows = dt.AsEnumerable()
.Where(r => !ListLinkedIds.Contains(r.Field<int>("LinkedTicketId")));
if(ticketRows.Any())
newDataTable = ticketRows.CopyToDataTable();
Possible exceptions with CopyToDataTable:
ArgumentNullException
The source IEnumerable sequence is null and a new table cannot be created.
InvalidOperationException
A DataRow in the source sequence has a state of Deleted.
The source sequence does not contain any DataRow objects.
A DataRow in the source sequence is null.
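The guard above can be wrapped in a small helper so that every call site gets the same empty-sequence handling. This is only a sketch: SafeCopy and CopyOrEmpty are hypothetical names, not part of System.Data, and the sample table is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class SafeCopy
{
    // Hypothetical helper: returns an empty clone of schemaSource when rows is
    // empty, otherwise copies the rows into a new table.
    public static DataTable CopyOrEmpty(IEnumerable<DataRow> rows, DataTable schemaSource)
    {
        List<DataRow> materialized = rows.ToList(); // evaluate the query only once
        return materialized.Count > 0
            ? materialized.CopyToDataTable()
            : schemaSource.Clone();
    }

    public static void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("LinkedTicketId", typeof(int));
        dt.Rows.Add(1);
        dt.Rows.Add(2);

        var linkedIds = new List<int> { 1, 2 }; // every row is filtered out
        DataTable result = CopyOrEmpty(
            dt.AsEnumerable().Where(r => !linkedIds.Contains(r.Field<int>("LinkedTicketId"))),
            dt);

        Console.WriteLine(result.Rows.Count);    // prints 0
        Console.WriteLine(result.Columns.Count); // prints 1
    }
}
```

Because the result always has the source schema, it can be bound to a grid without a separate branch for the empty case.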
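Check whether the filtered sequence has any rows before calling CopyToDataTable(). Checking dt.Rows.Count alone is not enough, because the filter can still remove every row:
var filteredRows = dt.AsEnumerable()
    .Where(r => !ListLinkedIds.Contains(r.Field<int>("LinkedTicketId")))
    .ToList();
if (filteredRows.Count > 0)
{
    newDataTable = filteredRows.CopyToDataTable();
    gvMain.DataSource = newDataTable;
    gvMain.DataBind();
}
else
{
    // no rows left after filtering
}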

C# - Remove rows with the same column value from a DataTable

I have a DataTable which looks like this:
ID Name DateBirth
.......................
1 aa 1.1.11
2 bb 2.3.11
2 cc 1.2.12
3 cd 2.3.12
Which is the fastest way to remove the rows with the same ID, to get something like this (keep the first occurrence, delete the next ones):
ID Name DateBirth
.......................
1 aa 1.1.11
2 bb 2.3.11
3 cd 2.3.12
I don't want to pass over the table rows twice, because the number of rows is large.
I want to use some LINQ if possible, but I guess it will be a big query and I will have to use a comparer.
You can use LINQ to DataSet. To get distinct rows based on the ID column, group by that column and then select the first row of each group:
var result = dt.AsEnumerable()
.GroupBy(r => r.Field<int>("ID"))
.Select(g => g.First())
.CopyToDataTable();
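Since the question asks to avoid a second pass, here is a minimal single-pass sketch that swaps the grouping for a HashSet of already-seen IDs. The names FirstPerId and KeepFirstPerId are hypothetical, the ID column is assumed to be of type int, and the DateBirth values are kept as strings purely for the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class FirstPerId
{
    // Copies the first row seen for each ID into a new table, in one pass.
    public static DataTable KeepFirstPerId(DataTable dt)
    {
        var seen = new HashSet<int>();
        DataTable result = dt.Clone(); // same schema, no rows
        foreach (DataRow row in dt.Rows)
        {
            // HashSet<T>.Add returns false when the value is already present.
            if (seen.Add(row.Field<int>("ID")))
            {
                result.ImportRow(row);
            }
        }
        return result;
    }

    public static void Main()
    {
        var dt = new DataTable();
        dt.Columns.Add("ID", typeof(int));
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("DateBirth", typeof(string));
        dt.Rows.Add(1, "aa", "1.1.11");
        dt.Rows.Add(2, "bb", "2.3.11");
        dt.Rows.Add(2, "cc", "1.2.12");
        dt.Rows.Add(3, "cd", "2.3.12");

        DataTable result = KeepFirstPerId(dt);
        Console.WriteLine(result.Rows.Count);      // prints 3
        Console.WriteLine(result.Rows[1]["Name"]); // prints bb
    }
}
```

This leaves the source table untouched and, unlike CopyToDataTable, does not throw when the source table is empty.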
I was solving the same situation and found it quite interesting and would like to share my finding.
If rows are to be distinct based on all columns:
DataTable newDatatable = dt.DefaultView.ToTable(true, "ID", "Name", "DateBirth");
Only the columns you list here are returned in newDatatable.
If rows are to be distinct based on one column and the column type is int, then I would prefer a LINQ query:
DataTable newDatatable = dt.AsEnumerable()
    .GroupBy(dr => dr.Field<int>("ID"))
    .Select(dg => dg.First())
    .CopyToDataTable();
If rows are to be distinct based on one column and the column type is string, then I would prefer a loop:
List<string> toExclude = new List<string>();
for (int i = 0; i < dt.Rows.Count; i++)
{
    var idValue = (string)dt.Rows[i]["ID"];
    if (toExclude.Contains(idValue))
    {
        // Already seen: remove the row and re-check the same index.
        dt.Rows.Remove(dt.Rows[i]);
        i--;
    }
    else
    {
        toExclude.Add(idValue);
    }
}
Third being my favorite.
I may have answered few things which are not asked in the question. It was done in good intent and with little excitement as well.
Hope it helps.
You can try this (note that the resulting table contains only the ID column):
DataTable uniqueCols = dt.DefaultView.ToTable(true, "ID");
Not necessarily the most efficient approach, but maybe the most readable:
table = table.AsEnumerable()
.GroupBy(row => row.Field<int>("ID"))
.Select(rowGroup => rowGroup.First())
.CopyToDataTable();
Linq is also more powerful. For example, if you want to change the logic and not select the first (arbitrary) row of each id-group but the last according to DateBirth:
table = table.AsEnumerable()
.GroupBy(row => row.Field<int>("ID"))
.Select(rowGroup => rowGroup
.OrderByDescending(r => r.Field<DateTime>("DateBirth"))
.First())
.CopyToDataTable();
// Get a record count for each ID
var rowsToDelete =
    (from row in dataTable.AsEnumerable()
     group row by row.Field<int>("ID") into g
     where g.Count() > 1
     // Determine which record to keep (I don't know your criteria, so I just sort by DoB, then Name, and keep the first record) and select the rest
     select g.OrderBy( dr => dr.Field<DateTime>( "DateBirth" ) ).ThenBy( dr => dr.Field<string>( "Name" ) ).Skip(1))
    // Flatten
    .SelectMany( g => g )
    // Materialize the rows before deleting, so the table is not modified while it is being enumerated
    .ToList();
// Delete rows
rowsToDelete.ForEach( dr => dr.Delete() );
// Accept changes
dataTable.AcceptChanges();
Here's a way to achieve this.
All you need is the MoreLINQ library and its DistinctBy function.
Code:
protected void Page_Load(object sender, EventArgs e)
{
var DistinctByIdColumn = getDT2().AsEnumerable()
.DistinctBy(
row => new { Id = row["Id"] });
DataTable dtDistinctByIdColumn = DistinctByIdColumn.CopyToDataTable();
}
public DataTable getDT2()
{
DataTable dt = new DataTable();
dt.Columns.Add("Id", typeof(string));
dt.Columns.Add("Name", typeof(string));
dt.Columns.Add("Dob", typeof(string));
dt.Rows.Add("1", "aa","1.1.11");
dt.Rows.Add("2", "bb","2.3.11");
dt.Rows.Add("2", "cc","1.2.12");
dt.Rows.Add("3", "cd","2.3.12");
return dt;
}
Output: as you expected.
For moreLinq sample code view my blog
