How do I use SELECT GROUP BY in DataTable.Select(Expression)? - c#

I'm trying to remove duplicate rows by selecting the first row from every group.
For example:
PK Col1 Col2
1 A B
2 A B
3 C C
4 C C
I want this result:
PK Col1 Col2
1 A B
3 C C
I tried the following code, but it didn't work:
DataTable dt = GetSampleDataTable(); //Get the table above.
dt = dt.Select("SELECT MIN(PK), Col1, Col2 GROUP BY Col1, Col2");

DataTable's Select method only supports simple filtering expressions like {field} = {value}. It does not support complex expressions, let alone SQL/Linq statements.
You can, however, use Linq extension methods to extract a collection of DataRows then create a new DataTable.
dt = dt.AsEnumerable()
.GroupBy(r => new {Col1 = r["Col1"], Col2 = r["Col2"]})
.Select(g => g.OrderBy(r => r["PK"]).First())
.CopyToDataTable();
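For contrast, here is a minimal sketch of the kind of call Select does accept: a filter expression plus an optional sort string. The names SelectFilterDemo, BuildSample and RowsWhereCol1Is are made up for illustration, and the column types are an assumption based on the sample table in the question.

```csharp
using System;
using System.Data;

public static class SelectFilterDemo
{
    // Builds the sample table from the question (assumed types: int PK, string Col1/Col2).
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("PK", typeof(int));
        dt.Columns.Add("Col1", typeof(string));
        dt.Columns.Add("Col2", typeof(string));
        dt.Rows.Add(1, "A", "B");
        dt.Rows.Add(2, "A", "B");
        dt.Rows.Add(3, "C", "C");
        dt.Rows.Add(4, "C", "C");
        return dt;
    }

    // Select takes a filter expression and an optional sort string.
    // It returns the matching rows as an array, not a new DataTable.
    public static DataRow[] RowsWhereCol1Is(DataTable dt, string value)
    {
        // Escape single quotes so the value cannot break the filter expression.
        string filter = string.Format("Col1 = '{0}'", value.Replace("'", "''"));
        return dt.Select(filter, "PK DESC");
    }
}
```

Calling SelectFilterDemo.RowsWhereCol1Is(SelectFilterDemo.BuildSample(), "A") should return the two rows whose Col1 is A, ordered by PK descending. Grouping and aggregation still have to happen outside Select, as in the LINQ query above.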

dt = dt.AsEnumerable().GroupBy(r => r.Field<int>("ID")).Select(g => g.First()).CopyToDataTable();

dt.AsEnumerable()
.GroupBy(r => new { Col1 = r["Col1"], Col2 = r["Col2"] })
.Select(g =>
{
var row = dt.NewRow();
row["PK"] = g.Min(r => r.Field<int>("PK"));
row["Col1"] = g.Key.Col1;
row["Col2"] = g.Key.Col2;
return row;
})
.CopyToDataTable();

This solution sorts by Col1 and groups by Col2. It then extracts the value of Col2 from each group and displays the values in a message box.
var grouped = from DataRow dr in dt.Rows orderby dr["Col1"] group dr by dr["Col2"];
string x = "";
foreach (var k in grouped) x += (string)(k.ElementAt(0)["Col2"]) + Environment.NewLine;
MessageBox.Show(x);

Based on @Alfred Wallace's solution:
DataTable dt = new DataTable();
dt.Columns.Add("Col1");
dt.Columns.Add("Col2");
dt.Rows.Add("120", "34");
dt.Rows.Add("121", "34");
dt.Rows.Add("122", "34");
dt.Rows.Add("1", "345");
dt.Rows.Add("2", "345");
dt.Rows.Add("3", "345");
var grouped = from DataRow dr in dt.Rows orderby dr["Col1"] group dr by dr["Col2"];
string xxx = "", yyy = "";
foreach (var k_group in grouped)
{
xxx += (string)(k_group.ElementAt(0)["Col1"]) + Environment.NewLine;
foreach (DataRow item_dr in k_group)
{
yyy += (string)(item_dr["Col1"]) + Environment.NewLine;
// or use WhatEverMethod(item_dr);
}
var zzz = k_group.Max(g => g["Col1"]);
var qqq = k_group.Key;
}

Related

Linq - How to use 2 columns in where clause

I have two DataTables, and I want to use LINQ to select the rows from the first table that are not present in the second table, comparing on two columns (col1, col2).
Please see the example below.
I tried the example from this page:
Compare two DataTables and select the rows that are not present in second table
but that example uses only one column.
Edit 1
I have tried:
DataTable Table1 = new DataTable();
Table1.Columns.Add("col1", typeof(string));
Table1.Columns.Add("col2", typeof(string));
DataRow r1 = Table1.NewRow();
r1["col1"] = "A";
r1["col2"] = "A-1";
Table1.Rows.Add(r1);
DataRow r2 = Table1.NewRow();
r2["col1"] = "B";
r2["col2"] = "B-2";
Table1.Rows.Add(r2);
DataRow r3 = Table1.NewRow();
r3["col1"] = "C";
r3["col2"] = "C-3";
Table1.Rows.Add(r3);
DataRow r4 = Table1.NewRow();
r4["col1"] = "D";
r4["col2"] = "D-4";
Table1.Rows.Add(r4);
DataRow r5 = Table1.NewRow();
r5["col1"] = "E";
r5["col2"] = "E-5";
Table1.Rows.Add(r5);
DataTable Table2 = new DataTable();
Table2.Columns.Add("col1", typeof(string));
Table2.Columns.Add("col2", typeof(string));
DataRow r11 = Table2.NewRow();
r11["col1"] = "A";
r11["col2"] = "A-1";
Table2.Rows.Add(r11);
DataRow r22 = Table2.NewRow();
r22["col1"] = "B";
r22["col2"] = "B-2";
Table2.Rows.Add(r22);
DataRow r33 = Table2.NewRow();
r33["col1"] = "C";
r33["col2"] = "C-4";
Table2.Rows.Add(r33);
DataRow r44 = Table2.NewRow();
r44["col1"] = "D";
r44["col2"] = "DD";
Table2.Rows.Add(r44);
DataRow r55 = Table2.NewRow();
r55["col1"] = "E";
r55["col2"] = "EE";
Table2.Rows.Add(r55);
DataRow r66 = Table2.NewRow();
r66["col1"] = "F";
r66["col2"] = "FF";
Table2.Rows.Add(r66);
Example - 1
DataTable table3s = (from a in Table1.AsEnumerable()
where !Table2.AsEnumerable().Any(e => (e.Field<string>("col1") == a.Field<string>("col1"))
&& (e.Field<string>("col2") == a.Field<string>("col2")))
select a).CopyToDataTable();
Example - 2
DataTable TableC = Table1.AsEnumerable().Where(ra => !Table2.AsEnumerable()
.Any(rb => rb.Field<string>("col1") == ra.Field<string>("col1")
&& rb.Field<string>("col2") == ra.Field<string>("col2"))).CopyToDataTable();
Examples 1 and 2 throw this error when no rows match:
The source contains no DataRows
Please give a working example based on my sample code and suggest the most efficient approach, because the DataTables may contain 10,000 or 20,000 rows, or more.
Or to have something with a proper outer join without an implicit loop using Any:
var res = from a in Table1
join b in Table2
on (a.col1, a.col2) equals (b.col1, b.col2)
into temp
from b in temp.DefaultIfEmpty()
// b is null when no row in Table2 matched, so test b itself rather than b.col2
where b == null
select a;
It just joins the two tables using a composite key and puts it into the temp table. Then it does an outer join (DefaultIfEmpty) and takes only those entries from Table1 where the join returned an empty result.
Assuming you have
class Table1
{
public string col1 { get; set; }
public string col2 { get; set; }
}
class Table2
{
public string col1 { get; set; }
public string col2 { get; set; }
}
and
List<Table1> table1s = new List<Table1>();
List<Table2> table2s = new List<Table2>();
the query is
var table3s = from table1 in table1s
where !table2s.Any(e => (e.col1 == table1.col1) && (e.col2 == table1.col2))
select table1;
Try this. Basically, this line of code selects every element from Table1 whose "col1" and "col2" values do not exist in Table2.
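// The DataRow indexer returns object, so != would compare references; use value equality instead.
var results = Table1.AsEnumerable().Where(t1 => Table2.AsEnumerable().All(t2 =>
!object.Equals(t2["col1"], t1["col1"]) || !object.Equals(t2["col2"], t1["col2"])));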
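For larger tables, a hash-based lookup is one way to avoid rescanning Table2 for every row of Table1. The sketch below assumes both tables have string columns named col1 and col2, as in the question's sample code. The names AntiJoinDemo and RowsNotIn are hypothetical. It also guards against the "The source contains no DataRows" error by returning an empty clone when nothing is left.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class AntiJoinDemo
{
    // Returns the rows of 'first' whose (col1, col2) pair does not occur in 'second'.
    public static DataTable RowsNotIn(DataTable first, DataTable second)
    {
        // Collect the (col1, col2) pairs of the second table once.
        // Tuple<,> compares its items by value, so it works as a composite key.
        var keys = new HashSet<Tuple<string, string>>(
            second.AsEnumerable().Select(r =>
                Tuple.Create(r.Field<string>("col1"), r.Field<string>("col2"))));

        // Keep only the rows whose pair is absent from the set.
        List<DataRow> missing = first.AsEnumerable()
            .Where(r => !keys.Contains(
                Tuple.Create(r.Field<string>("col1"), r.Field<string>("col2"))))
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an empty clone.
        return missing.Count > 0 ? missing.CopyToDataTable() : first.Clone();
    }
}
```

With the sample tables from the question, AntiJoinDemo.RowsNotIn(Table1, Table2) should return the C, D and E rows. Each lookup in the set is roughly constant time, so the work grows with the combined row count rather than with the product of the two counts.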
I tried to resolve this using below logic. Please let me know if I missed something here?
static void LinkPerf()
{
string[] arr = { "A", "B", "C", "D", "E", "F", "G", "H", "I" };
DataTable table1 = new DataTable();
table1.Columns.Add("Id");
table1.Columns.Add("Col1");
table1.Columns.Add("Col2");
DataTable table2 = new DataTable();
table2.Columns.Add("Id");
table2.Columns.Add("Col1");
table2.Columns.Add("Col2");
DataTable ResultTable3 = new DataTable();
ResultTable3.Columns.Add("Id");
ResultTable3.Columns.Add("Col1");
ResultTable3.Columns.Add("Col2");
Random rand = new Random();
for (int i = 1; i <= 10000; i++)
{
DataRow row = table1.NewRow();
int index = rand.Next(arr.Length);
var colVal = arr[index];
//Table 1
row[0] = i.ToString();
row[1] = colVal;
row[2] = colVal + "-" + i.ToString();
table1.Rows.Add(row);
//Table 2
row = table2.NewRow();
row[0] = i.ToString();
row[1] = colVal;
row[2] = (i % 5 == 0) ? colVal + colVal + i.ToString() : colVal + "-" + i.ToString();
table2.Rows.Add(row);
}
Stopwatch watch = new Stopwatch();
watch.Start();
var result = table1.AsEnumerable()
.Where(ra => !table2.AsEnumerable()
.Any(rb => rb.Field<string>("Col1") == ra.Field<string>("Col1") && rb.Field<string>("Col2") == ra.Field<string>("Col2")));
if (result.Any())
{
foreach (var item in result)
{
ResultTable3.ImportRow(item);
}
}
watch.Stop();
var timeTaken = watch.Elapsed;
Console.WriteLine("Time taken: " + timeTaken.ToString(@"m\:ss\.fff"));
Console.ReadLine();
}

How to set order by descending in Linq query based on aggregated column?

The table queried below has 4 columns, but I need only 2 of them: Vegetables and Pricing. The result also has to be ordered by Pricing in descending order.
How do I add an order by on the aggregated value in a LINQ query?
DataTable Dt2 = new DataTable();
Dt2 = dt.AsEnumerable()
.GroupBy(r => r.Field<string>("Vegetables"))
.Select(g =>
{
var row = dt.NewRow();
row["Vegetables"] = g.Key;
row["Pricing"] = g.Average(r => ParseInt32(r.Field<string>("Pricing")));
return row;
}).CopyToDataTable();
Two Linq ways come to mind.
The first is using .OrderBy<T>() followed by .Reverse<T>():
DataTable Dt2 = new DataTable();
Dt2 = dt.AsEnumerable()
.GroupBy(r => r.Field<string>("Vegetables"))
.Select(g =>
{
var row = dt.NewRow();
row["Vegetables"] = g.Key;
row["Pricing"] = g.Average(r => Int32.Parse(r.Field<string>("Pricing")));
return row;
})
.OrderBy(row => row["Pricing"])
.Reverse()
.CopyToDataTable();
The second is just using .OrderByDescending<T>():
DataTable Dt2 = new DataTable();
Dt2 = dt.AsEnumerable()
.GroupBy(r => r.Field<string>("Vegetables"))
.Select(g =>
{
var row = dt.NewRow();
row["Vegetables"] = g.Key;
row["Pricing"] = g.Average(r => Int32.Parse(r.Field<string>("Pricing")));
return row;
})
.OrderByDescending(row => row["Pricing"])
.CopyToDataTable();
If you're looking for a non-Linq solution, you could also apply a sorted DataView to the DataTable to achieve a similar result.
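A minimal sketch of the DataView route, assuming the aggregated table already exists and that its Pricing column is numeric. If Pricing is stored as a string column, as the Field<string> call in the question suggests, the sort compares text rather than numbers. SortedViewDemo and SortByPricingDescending are illustrative names only.

```csharp
using System.Data;

public static class SortedViewDemo
{
    // Returns a copy of the table with rows ordered by Pricing, highest first.
    // Assumes a numeric Pricing column; a string column would sort lexicographically.
    public static DataTable SortByPricingDescending(DataTable averages)
    {
        var view = new DataView(averages);
        view.Sort = "Pricing DESC";   // same syntax as the sort argument of DataTable.Select
        return view.ToTable();        // materialise the sorted view as a new DataTable
    }
}
```

The original table is left untouched; the view only changes the order in which its rows are read.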

Need to merge 2 rows in dynamic table

Hi, I have created a dynamic table as shown below, and it contains 2 rows with the same id.
How can I merge them?
DataTable dt = new DataTable();
dt = (DataTable)Session["AddtoCart"];
DataRow dr1 = dt.NewRow();
foreach (var key in collection.AllKeys)
{
dr1["Description"] = collection["hdDescription"];
dr1["Title"] = collection["hdTitle"];
dr1["ActualQuantity"] = collection["hdactualquantity"];
dr1["PropertyId"] = collection["hdPropertyId"];
dr1["Quantity"] = collection["Quantity"];
TempData["AddedtoCart"] = ConfigurationManager.AppSettings["AddtoCart"].ToString();
}
dt.Rows.Add(dr1);
My added rows are:
Propertyid, Quantity, ActualQuantity
1 5 10
1 2 10
2 3 20
2 4 20
The result I need is:
Propertyid, Quantity, ActualQuantity
1 7 10
2 7 20
Update: I have tried this answer:
var query = dt.AsEnumerable()
.GroupBy(row => row.Field<int>("PropertyId"))// issue over this line
.Select(grp => new
{
PropertyId = grp.Key,
Quantity = grp.Sum(row => row.Field<int>("Quantity")),
ActualQuantity = grp.First().Field<int>("ActualQuantity"),
Title = grp.First().Field<int>("Title"),
Description = grp.First().Field<int>("Description")
});
var SumByIdTable = dt.Clone();
foreach (var x in query)
SumByIdTable.Rows.Add(x.PropertyId, x.Quantity, x.ActualQuantity,x.Title, x.Description);
Session["AddtoCart"] = SumByIdTable;
but I get the error "Specified cast is not valid." on .GroupBy(row => row.Field<int>("PropertyId"))
Update 2: I have tried the code below, but I still get an error:
DataTable dt = new DataTable();
dt.Columns.Add("PropertyId", typeof(int));
dt.Columns.Add("Quantity", typeof(int));
dt.Columns.Add("Description", Type.GetType("System.String"));
dt.Columns.Add("Title", Type.GetType("System.String"));
dt.Columns.Add("ActualQuantity", typeof(int));
DataRow dr1 = dt.NewRow();
foreach (var key in collection.AllKeys)
{
dr1["Description"] = collection["hdDescription"];
dr1["Title"] = collection["hdTitle"];
dr1["ActualQuantity"] = Convert.ToInt32(collection["hdactualquantity"]);
dr1["PropertyId"] = Convert.ToInt32(collection["hdPropertyId"]);
dr1["Quantity"] = Convert.ToInt32(collection["Quantity"]);
TempData["AddedtoCart"] = ConfigurationManager.AppSettings["AddtoCart"].ToString();
}
dt.Rows.Add(dr1);
var query = dt.AsEnumerable()
.GroupBy(row => row.Field<int>("PropertyId"))
.Select(grp => new
{
PropertyId = grp.Key,
Quantity = grp.Sum(row => row.Field<int>("Quantity")),
ActualQuantity = grp.First().Field<int>("ActualQuantity"),
Title = grp.First().Field<string>("Title"),
Description = grp.First().Field<string>("Description")
});
var SumByIdTable = dt.Clone();
foreach (var x in query)
SumByIdTable.Rows.Add(x.PropertyId, x.Quantity, x.ActualQuantity,x.Title, x.Description);
Session["AddtoCart"] = SumByIdTable;
but I get an error on this line:
SumByIdTable.Rows.Add(x.PropertyId, x.Quantity, x.ActualQuantity,x.Title, x.Description);
Input string was not in a correct format.
Update 3 (resolved): I have tried the code below and it works:
DataTable dt = new DataTable();
dt.Columns.Add("PropertyId", typeof(int));
dt.Columns.Add("Quantity", typeof(int));
dt.Columns.Add("Description", Type.GetType("System.String"));
dt.Columns.Add("Title", Type.GetType("System.String"));
dt.Columns.Add("ActualQuantity", typeof(int));
DataRow dr1 = dt.NewRow();
foreach (var key in collection.AllKeys)
{
dr1["Description"] = collection["hdDescription"];
dr1["Title"] = collection["hdTitle"];
dr1["ActualQuantity"] = Convert.ToInt32(collection["hdactualquantity"]);
dr1["PropertyId"] = Convert.ToInt32(collection["hdPropertyId"]);
dr1["Quantity"] = Convert.ToInt32(collection["Quantity"]);
TempData["AddedtoCart"] = ConfigurationManager.AppSettings["AddtoCart"].ToString();
}
dt.Rows.Add(dr1);
var query = dt.AsEnumerable()
.GroupBy(row => row.Field<int>("PropertyId"))
.Select(grp => new
{
PropertyId = grp.Key,
Quantity = grp.Sum(row => row.Field<int>("Quantity")),
ActualQuantity = grp.First().Field<int>("ActualQuantity"),
Title = grp.First().Field<string>("Title"),
Description = grp.First().Field<string>("Description")
});
var SumByIdTable = dt.Clone();
foreach (var x in query)
SumByIdTable.Rows.Add(Convert.ToInt32(x.PropertyId), Convert.ToInt32(x.Quantity), x.Description, x.Title, Convert.ToInt32(x.ActualQuantity));
Session["AddtoCart"] = SumByIdTable;
The change was that I had to add the values to SumByIdTable in the same column order as dt, because SumByIdTable is a clone of dt.
You can use LINQ (-to-DataTable):
var query = dt.AsEnumerable()
.GroupBy(row => row.Field<int>("Propertyid"))
.Select(grp => new {
Propertyid = grp.Key,
Quantity = grp.Sum(row => row.Field<int>("Quantity")),
ActualQuantity = grp.First().Field<int>("ActualQuantity")
});
var SumByIdTable = dt.Clone();
foreach(var x in query)
SumByIdTable.Rows.Add(x.Propertyid, x.Quantity, x.ActualQuantity);
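Rows.Add(params object[]) fills columns by position, so the values have to follow the column order of the cloned table. One way to sidestep that is to set values by column name, as in this sketch. It assumes PropertyId and Quantity are non-null int columns, and MergeRowsDemo and SumByPropertyId are hypothetical names.

```csharp
using System.Data;
using System.Linq;

public static class MergeRowsDemo
{
    // Sums Quantity per PropertyId; every other column is taken from the first row of each group.
    public static DataTable SumByPropertyId(DataTable cart)
    {
        DataTable result = cart.Clone(); // same columns as the source, no rows

        foreach (var grp in cart.AsEnumerable().GroupBy(r => r.Field<int>("PropertyId")))
        {
            DataRow merged = result.NewRow();
            merged.ItemArray = grp.First().ItemArray;  // copy all values of the group's first row
            merged["Quantity"] = grp.Sum(r => r.Field<int>("Quantity")); // overwrite by name, not position
            result.Rows.Add(merged);
        }
        return result;
    }
}
```

Because the quantity is assigned by column name, reordering the columns of the source table would not require any change to this method.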

Grouping Row In Datatable

I have data like this..
ID
1234-001
1234-002
1234-003
5678-001
7890-001
7890-002
I am holding this data in a datatable. I am attempting to do some processing on the rows by groups based on the base number i.e. 1234, 5678, 7890
How can I iterate through this datatable and hold in new (temp) datatable
1234-001,1234-002, 1234-003
clear the temp datatable then hold
5678-001
clear the temp datatable then hold
7890-001,7890-002
I am working on an old code base and LINQ is not available. I can't come up with an elegant solution. Maybe it has something to do with DataViews? I'm not sure.
You say you don't want to use LINQ but would prefer an elegant solution... Unless I am missing something vital in your question, this LINQified code seems to let you do what you want.
var grouped = from d in data
group d by d.Id.Split('-').FirstOrDefault();
foreach(var g in grouped) {
// do something with each group
}
Non-LINQ, non-var answer:
DataTable data = new DataTable();
data.Columns.Add("ID");
data.Columns.Add("Value");
data.Rows.Add("1234-001", "Row 1");
data.Rows.Add("1234-002", "Row 2");
data.Rows.Add("1234-003", "Row 3");
data.Rows.Add("5678-001", "Row 4");
data.Rows.Add("7890-001", "Row 5");
data.Rows.Add("7890-002", "Row 5");
Dictionary<String, List<DataRow>> grouped = new Dictionary<String, List<DataRow>>();
foreach(DataRow r in data.Select()) {
List<DataRow> groupedRows;
String key = r["ID"].ToString().Split('-')[0];
if(!grouped.TryGetValue(key, out groupedRows)) {
groupedRows = new List<DataRow>();
grouped[key] = groupedRows;
}
groupedRows.Add(r);
}
foreach(KeyValuePair<String, List<DataRow>> g in grouped) {
String groupKey = g.Key;
Console.WriteLine(groupKey);
foreach(DataRow r in g.Value) {
Console.WriteLine("\t{0}", r["Value"]);
}
}
I get the following output, so I'm not seeing "it only groups the first 3 and stops":
1234
    Row 1
    Row 2
    Row 3
5678
    Row 4
7890
    Row 5
    Row 5
Here is a non Linq example. Since you say it's sorted you can do it in one loop.
DataTable dt1 = new DataTable();
dt1.Columns.Add("ID", typeof (string));
dt1.Rows.Add("1234-001");
dt1.Rows.Add("1234-002");
dt1.Rows.Add("1234-003");
dt1.Rows.Add("5678-001");
dt1.Rows.Add("7890-001");
dt1.Rows.Add("7890-002");
int i = 0;
while (i < dt1.Rows.Count)
{
DataRow row = dt1.Rows[i];
string key = row.Field<string>("ID").Split('-')[0];
DataView dv = new DataView(dt1);
dv.RowFilter = String.Format("ID LIKE '{0}*'", key.Replace("'", "''"));
DataTable tempdt = dv.ToTable();
i = i + tempdt.Rows.Count;
}
Does this help some?
DataTable dt = new DataTable();
dt.Columns.Add("Data", typeof(string));
dt.Rows.Add("1234-001");
dt.Rows.Add("1234-002");
dt.Rows.Add("1234-003");
dt.Rows.Add("5678-001");
dt.Rows.Add("7890-001");
dt.Rows.Add("7890-002");
var stuff = from dr in dt.Select()
group dr by dr["Data"].ToString().Split('-')[0] into g
select new {First = g.Key, Records = g.ToList()};
stuff.Dump();
var groups = table.AsEnumerable();
List<List<DataRow>> groupList = (from g in table.AsEnumerable()
group g by g.Field<string>("id").ToString().Split('-').First() into Group1
select Group1.ToList()).ToList();

Add columns of datatables in a dataset

I have a DataSet with 2 DataTable's. Each DataTable contains a column called "cost".
I want to calculate the sum of all costs for the 2 tables in a table called Result table, like the example below. How can I do that?
Table 1
Name | cost
balan | 6
gt | 5
Table 2
Name | cost
balan | 2
gt | 8
Result table
Name | cost
balan | 8
gt | 12
This is a way to do it:
DataTable dt1 = new DataTable();
DataTable dt2 = new DataTable();
DataTable results = new DataTable();
dt1.Columns.Add("Name");
dt1.Columns.Add("cost", typeof(int));
dt2.Columns.Add("Name");
dt2.Columns.Add("cost", typeof(int));
results.Columns.Add("Name");
results.Columns.Add("cost", typeof(int));
dt1.Rows.Add("balan", 6);
dt2.Rows.Add("balan", 2);
dt1.Rows.Add("gt", 5);
dt2.Rows.Add("gt", 8);
foreach (DataRow dr1 in dt1.Rows)
{
results.Rows
.Add(
dr1["Name"],
(int)dr1["cost"] + (int)dt2.Select(String.Format("Name='{0}'", dr1["name"]))[0]["cost"]
);
}
To get the result, you can do something like:
var table1 = yourDataSet.Tables["Table 1"];
var table2 = yourDataSet.Tables["Table 2"];
var results = table1.AsEnumerable().Select(t1 => new {
name = t1.Field<string>("Name"),
cost = t1.Field<int>("cost")
})
.Concat(
table2.AsEnumerable().Select(t2 => new {
name = t2.Field<string>("Name"),
cost = t2.Field<int>("cost")
})
)
.GroupBy(m => m.name)
.Select(g => new {
name = g.Key,
cost = g.Sum(x => x.cost)
});
This gives you an IEnumerable of anonymous objects rather than a DataTable. To transform an IEnumerable into a DataTable, see for example here
Or, more simply, if table1 and table2 have the same rows:
var table1 = yourDataSet.Tables["Table 1"];
var table2 = yourDataSet.Tables["Table 2"];
var results = new DataTable();
results.Columns.Add("Name");
results.Columns.Add("cost", typeof(int));
// Select is lazy, so ToList() forces enumeration and actually adds the rows.
table1.AsEnumerable().Concat(table2.AsEnumerable())
.GroupBy(m => m.Field<string>("Name"))
.Select(g => results.Rows.Add(g.Key, g.Sum(x => x.Field<int>("cost"))))
.ToList();
