Merge columns of two DataTables using linq

I have two DataTables: dt1 and dt2.
dt1:
ID | Name | Address | QTY
-------+----------+---------+-----
A1 | Dog | C1 | 272
A2 | Cat | C3 | 235
A3 | Chicken | C2 | 254
A4 | Mouse | C4 | 259
A5 | Pig | C5 | 233
dt2:
ID | Name | Address | QTY MAX
-------+----------+---------+--------
A1 | Dog | C1 | 250
A2 | Cat | C3 | 200
A3 | Chicken | C2 | 300
A6 | Rabbit | C6 | 350
I want to merge dt1 and dt2 into dt3, like below:
ID | Name | Address | QTY | QTY MAX
-------+----------+---------+-------+--------
A1 | Dog | C1 | 272 | 250
A2 | Cat | C3 | 235 | 200
A3 | Chicken | C2 | 254 | 300
A4 | Mouse | C4 | 259 | 0
A5 | Pig | C5 | 233 | 0
A6 | Rabbit | C6 | 0 | 350
Can anyone help me?

If your DataTables don't have a primary key and you can't or don't want to change them, you can use code like this:
// First, define the result DataTable by cloning the schema of the first DataTable
var dt3 = dt1.Clone();
// Then add the extra column to it
dt3.Columns.Add("Qty Max", typeof(int));
// Second, add the rows of the first DataTable
foreach (DataRow row in dt1.Rows)
{
    // Without a primary key, the matching row has to be searched for:
    var dt2Row = dt2.Rows.OfType<DataRow>().SingleOrDefault(w => w["ID"].Equals(row["ID"]));
    var qtyMax = dt2Row?["Qty Max"] ?? 0; // Default to 0 when dt2 has no matching row
    dt3.Rows.Add(row["ID"], row["Name"], row["Address"], row["Qty"], qtyMax);
}
// Third, add the rows of the second DataTable that are not in the first.
// Equals is needed here: != on two object values compares references, not the IDs.
var dt2OnlyRows =
    dt2.Rows.OfType<DataRow>().Where(w => dt1.Rows.OfType<DataRow>().All(x => !x["ID"].Equals(w["ID"])));
foreach (var row in dt2OnlyRows)
{
    dt3.Rows.Add(row["ID"], row["Name"], row["Address"], 0, row["Qty Max"]);
}
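Since the question asks for LINQ, the same merge can also be sketched as a full outer join over two dictionaries keyed by ID. This is only a minimal sketch under assumptions: the sample tables are rebuilt from the question, IDs are assumed to be unique within each table, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;
using System.Linq;

// Sample data taken from the question.
var dt1 = new DataTable();
dt1.Columns.Add("ID", typeof(string));
dt1.Columns.Add("Name", typeof(string));
dt1.Columns.Add("Address", typeof(string));
dt1.Columns.Add("QTY", typeof(int));
dt1.Rows.Add("A1", "Dog", "C1", 272);
dt1.Rows.Add("A2", "Cat", "C3", 235);
dt1.Rows.Add("A3", "Chicken", "C2", 254);
dt1.Rows.Add("A4", "Mouse", "C4", 259);
dt1.Rows.Add("A5", "Pig", "C5", 233);

var dt2 = new DataTable();
dt2.Columns.Add("ID", typeof(string));
dt2.Columns.Add("Name", typeof(string));
dt2.Columns.Add("Address", typeof(string));
dt2.Columns.Add("QTY MAX", typeof(int));
dt2.Rows.Add("A1", "Dog", "C1", 250);
dt2.Rows.Add("A2", "Cat", "C3", 200);
dt2.Rows.Add("A3", "Chicken", "C2", 300);
dt2.Rows.Add("A6", "Rabbit", "C6", 350);

// Index both tables by ID so each lookup is a dictionary hit rather than a scan.
// ToDictionary throws if an ID occurs twice, which matches the uniqueness assumption.
var left = dt1.AsEnumerable().ToDictionary(r => (string)r["ID"]);
var right = dt2.AsEnumerable().ToDictionary(r => (string)r["ID"]);

// Result schema: the columns of dt1 plus the extra column of dt2.
var dt3 = dt1.Clone();
dt3.Columns.Add("QTY MAX", typeof(int));

// The union of the keys gives the full outer join; a missing side contributes 0.
foreach (var id in left.Keys.Union(right.Keys).OrderBy(k => k, StringComparer.Ordinal))
{
    left.TryGetValue(id, out var a);
    right.TryGetValue(id, out var b);
    var source = a ?? b; // whichever side exists supplies Name and Address
    dt3.Rows.Add(id, source["Name"], source["Address"],
        a != null ? (int)a["QTY"] : 0,
        b != null ? (int)b["QTY MAX"] : 0);
}
```

The dictionaries avoid rescanning dt2 for every row of dt1, which may matter once the tables grow beyond a handful of rows.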

This is not a LINQ solution, but you can get the desired output simply by using DataTable.Merge together with DataTable.PrimaryKey.
Here is a dummy example which you can use:
var dt1 = new DataTable();
var p1 = dt1.Columns.Add("a", typeof(int)); //Use this to add Primary Key constraint
dt1.Columns.Add("b");
dt1.Columns.Add("c");
dt1.Rows.Add("1", "apple", "10");
dt1.Rows.Add("2", "mango", "20");
dt1.Rows.Add("3", "orange", "30");
dt1.Rows.Add("4", "banana", "40");
dt1.PrimaryKey = new DataColumn[] { p1 }; //This removes duplication of rows
var dt2 = new DataTable();
var p2 = dt2.Columns.Add("a", typeof(int)); //Use this to add Primary Key constraint
dt2.Columns.Add("b");
dt2.Columns.Add("d");
dt2.Rows.Add("1", "apple", "50");
dt2.Rows.Add("2", "mango", "60");
dt2.Rows.Add("3", "orange", "70");
dt2.Rows.Add("5", "grapes", "80");
dt2.PrimaryKey = new DataColumn[] { p2 }; //This removes duplication of rows
var dt3 = dt1.Copy();
dt3.Merge(dt2); // Merge here merges the values from both provided DataTables
Taking your question into consideration:
var dt1 = new DataTable();
var p1 = dt1.Columns.Add("ID", typeof(string));
dt1.Columns.Add("Name", typeof(string));
dt1.Columns.Add("Address", typeof(string));
dt1.Columns.Add("Qty", typeof(int));
dt1.Columns["Qty"].DefaultValue = 0; //Setting default value
dt1.Rows.Add("A1", "Dog", "C1", 100);
dt1.Rows.Add("A2", "Cat", "C3", 200);
dt1.Rows.Add("A3", "Chicken", "C2", 300);
dt1.Rows.Add("A4", "Mouse", "C4", 400);
dt1.Rows.Add("A5", "Pig", "C5", 500);
dt1.PrimaryKey = new DataColumn[] { p1 };
var dt2 = new DataTable();
var p2 = dt2.Columns.Add("ID", typeof(string));
dt2.Columns.Add("Name", typeof(string));
dt2.Columns.Add("Address", typeof(string));
dt2.Columns.Add("Qty Max", typeof(int));
dt2.Columns["Qty Max"].DefaultValue = 0; //Setting default value
dt2.Rows.Add("A1", "Dog", "C1", 600);
dt2.Rows.Add("A2", "Cat", "C3", 700);
dt2.Rows.Add("A3", "Chicken", "C2", 800);
dt2.Rows.Add("A6", "Rabbit", "C6", 900);
dt2.PrimaryKey = new DataColumn[] { p2 };
var dt3 = dt1.Copy();
dt3.Merge(dt2);
Thanks to @shA.t for suggesting DataColumn.DefaultValue so that blank cells can be replaced with 0. His answer also uses LINQ features, which I guess is what you are looking for!

Related

How do I list distinct values as columns? [duplicate]

I received a request to export data from my ASP.NET MVC project, using LINQ, to an Excel spreadsheet. Usually this is an easy task; however, in this scenario the person requesting the data would like the export to change from the layout in Example A to the layout in Example B.
Example A (current export)
Id | CustomerNum | CustomerName | FruitName | Charge
____________________________________________________
1 | 1026 | Bob | Banana | 3.00
2 | 1032 | Jill | Apple | 2.00
3 | 1026 | Bob | Apple | 3.00
4 | 1144 | Marvin | Banana | 1.00
5 | 1753 | Sam | Pear | 4.00
6 | 1026 | Bob | Banana | 3.00
Example B (requested export format)
Id | CustomerNum | CustomerName | Banana | Apple | Pear
_________________________________________________________
1 | 1026 | Bob | 6.00 | 3.00 |
2 | 1032 | Jill | 0 | 2.00 |
3 | 1144 | Marvin | 1.00 | 0 |
5 | 1753 | Sam | 0 | 0 | 4.00
I have never seen distinct row values used as columns. How should I go about this?

Create a pivot table:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
DataTable dt = new DataTable();
dt.Columns.Add("Id", typeof(int));
dt.Columns.Add("CustomerNum", typeof(int));
dt.Columns.Add("CustomerName", typeof(string));
dt.Columns.Add("FruitName", typeof(string));
dt.Columns.Add("Charge", typeof(decimal));
dt.Rows.Add(new object[] {1,1026, "Bob", "Banana", 3.00});
dt.Rows.Add(new object[] {2,1032, "Jill", "Apple", 2.00});
dt.Rows.Add(new object[] {3,1026, "Bob", "Apple", 3.00});
dt.Rows.Add(new object[] {4,1144, "Marvin", "Banana", 1.00});
dt.Rows.Add(new object[] {5,1753, "Sam", "Pear", 4.00});
dt.Rows.Add(new object[] {6,1026, "Bob", "Banana", 3.00});
string[] fruits = dt.AsEnumerable().Select(x => x.Field<string>("FruitName")).Distinct().OrderBy(x => x).ToArray();
DataTable pivot = new DataTable();
pivot.Columns.Add("CustomerNum", typeof(int));
pivot.Columns.Add("CustomerName", typeof(string));
foreach (string fruit in fruits)
{
pivot.Columns.Add(fruit, typeof(decimal));
}
var groups = dt.AsEnumerable().GroupBy(x => x.Field<int>("CustomerNum"));
foreach (var group in groups)
{
DataRow newRow = pivot.Rows.Add();
newRow["CustomerNum"] = group.Key;
newRow["CustomerName"] = group.First().Field<string>("CustomerName");
foreach (DataRow row in group)
{
string fruitName = row.Field<string>("FruitName");
decimal oldvalue = (newRow[fruitName] == DBNull.Value) ? 0 : (decimal)newRow[fruitName];
newRow[fruitName] = oldvalue + row.Field<decimal>("Charge");
}
}
}
}
}

Copy row from datatable to another where there are common column headers

I have two DataTables and I am trying to copy a row from one to the other. I have tried this. The problem is that my tables are not exactly the same: both have common headers, but the second table has more columns. I therefore need a "smart" copy, i.e. one that copies the row according to the column header names.
d1:
+--------+--------+--------+
| ID | aaa | bbb |
+--------+--------+--------+
| 23 | value1 | value2 | <----copy this row
d2:
+--------+--------+--------+--------+
| ID | ccc | bbb | aaa |
+--------+--------+--------+--------+
| 23 | | value2 | value1 | <----I need this result
but this code:
int rowID = 23;
DataRow[] result = dt1.Select($"ID = {rowID}");
dt2.Rows.Add(result[0].ItemArray);
gives:
d2:
+--------+--------+--------+--------+
| ID | ccc | bbb | aaa |
+--------+--------+--------+--------+
| 23 | value1 | value2 | | <---- :( NOT what I need

I think this is your homework, but here is a simple, not very smart solution:
private DataTable DTCopySample()
{
int cnt = 0;
DataTable dt1 = new DataTable();
dt1.Columns.Add("ID");
dt1.Columns.Add("aaa");
dt1.Columns.Add("bbb");
DataTable dt2 = new DataTable();
dt2.Columns.Add("ID");
dt2.Columns.Add("ccc");
dt2.Columns.Add("bbb");
dt2.Columns.Add("aaa");
dt1.Rows.Add();
dt1.Rows[0]["ID"] = "23";
dt1.Rows[0]["aaa"] = "val1";
dt1.Rows[0]["bbb"] = "val2";
dt1.Rows.Add();
dt1.Rows[1]["ID"] = "99";
dt1.Rows[1]["aaa"] = "val99";
dt1.Rows[1]["bbb"] = "val98";
string colName = string.Empty;
foreach (DataRow row in dt1.Rows)
{
dt2.Rows.Add();
foreach (DataColumn col in dt1.Columns)
{
dt2.Rows[cnt][col.ColumnName] = row[col.ColumnName].ToString();
}
cnt++;
}
return dt2;
}
There are smarter and better solutions, but this one was written quickly (about 2 minutes) and it works.
Remember that you have not specified the column data types or anything else, so I assumed strings everywhere to keep the sample simple.
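A more reusable variant of the same idea is a small helper that copies a single row by matching column names, so that columns existing only in the target stay empty. The sketch below is an illustration under assumptions: the helper name CopyByColumnName is hypothetical, the ID column is assumed to be an int, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;

// Hypothetical helper: copies one row into target, matching columns by name
// rather than by position. Columns that exist only in target keep their default (DBNull).
DataRow CopyByColumnName(DataRow source, DataTable target)
{
    var newRow = target.NewRow();
    foreach (DataColumn col in source.Table.Columns)
    {
        // Skip source columns the target does not know about.
        if (target.Columns.Contains(col.ColumnName))
            newRow[col.ColumnName] = source[col];
    }
    target.Rows.Add(newRow);
    return newRow;
}

// d1 from the question: ID | aaa | bbb
var dt1 = new DataTable();
dt1.Columns.Add("ID", typeof(int));
dt1.Columns.Add("aaa", typeof(string));
dt1.Columns.Add("bbb", typeof(string));
dt1.Rows.Add(23, "value1", "value2");

// d2 from the question: ID | ccc | bbb | aaa
var dt2 = new DataTable();
dt2.Columns.Add("ID", typeof(int));
dt2.Columns.Add("ccc", typeof(string));
dt2.Columns.Add("bbb", typeof(string));
dt2.Columns.Add("aaa", typeof(string));

int rowID = 23;
DataRow[] result = dt1.Select($"ID = {rowID}");
CopyByColumnName(result[0], dt2);
```

Unlike Rows.Add(result[0].ItemArray), which assigns values by position, this keeps "aaa" and "bbb" in their own columns regardless of the column order in the target.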

Load a specified column in DataGridView with values from a database

I have a database with a table. What I want to do is programmatically load values from a column of the table into a column of the DataGridView.
I have a table "Actions", with a field "Total", which has some data: 10, 20, 35, 50, etc.
I want to put this field into the DataGridView in the 2nd column.
So the DataGridView should look like this (the other columns are already set):
| Name | Total | Something |
|:-----------|------------:|:------------:|
| adsad | 10 | This |
| sddssdf | 20 | column |
| name1 | 35 | will |
| name | 50 | be |
| nmas | 1 | center |
| gjghjhh | 67 | aligned |

You need to create the particular columns in the DataGridView yourself and then bind the data, for example:
// GetDataTable() is assumed to return the DataTable loaded from the database
DataGridView dataGridView1 = new DataGridView();
BindingSource bindingSource1 = new BindingSource();
dataGridView1.ColumnCount = 2;
dataGridView1.Columns[0].Name = "FieldOne";
dataGridView1.Columns[0].DataPropertyName = "FieldOne";
dataGridView1.Columns[1].Name = "FieldTwo";
dataGridView1.Columns[1].DataPropertyName = "FieldTwo";
bindingSource1.DataSource = GetDataTable();
dataGridView1.DataSource = bindingSource1;

You can add a new column to your DataTable and then bind it to your DataGridView.
//call SQL helper class to get initial data
DataTable dt = sql.ExecuteDataTable("sp_MyProc");
dt.Columns.Add("NewColumn", typeof(System.Int32));
foreach(DataRow row in dt.Rows)
{
//need to set value to NewColumn column
row["NewColumn"] = 0; // or set it to some other value
}
// possibly save your Dataset here, after setting all the new values
dataGridView1.DataSource = dt;

Compare 2 Datatables to find difference/accuracy between the columns

So, I have two separate DataTables that look nearly identical, but the values in their rows might differ.
EDIT:
I can have a unique ID by creating a temporary identity column that can be used as the primary key, if that makes it easier. So think of the ID column as the primary key, then.
Table A
ID | Name | Value1 | Value2 | Value3
-------------------------------------
1 | Bob | 50 | 150 | 35
2 | Bill | 55 | 47 | 98
3 | Pat | 10 | 15 | 45
4 | Cat | 70 | 150 | 35
Table B
ID | Name | Value1 | Value2 | Value3
-------------------------------------
1 | Bob | 30 | 34 | 67
2 | Bill | 55 | 47 | 98
3 | Pat | 100 | 15 | 45
4 | Cat | 70 | 100 | 20
Output Should be:
Table C
ID | Name | TableAValue1 | TableBValue1 | DiffValue1 | ... (same for Value2 and Value3)
------------------------------------------------------
1 | Bob | 50 | 30 | 20
2 | Bill | 55 | 55 | 0
3 | Pat | 10 | 100 | 90
4 | Cat | 70 | 70 | 0
I know the tedious way to do this is a for loop that goes through each row and compares the columns with each other, but I am not sure how to create a new Table C with the results I want. I also think there might be a simpler solution using LINQ, which I am not very familiar with; I would be interested in a LINQ solution if it is faster and takes fewer lines of code. I am looking for the most efficient way of going about this, as these DataTables can be anywhere between 5,000 and 15,000+ rows in size, so memory usage becomes an issue.

LINQ is not faster, at least not in general. But it can help to increase readability.
You can use Enumerable.Join, which might be more efficient than nested loops, but you need a loop to fill your third table anyway. The code below assumes that the first two columns are the identifiers and the rest are the values:
var query = from r1 in table1.AsEnumerable()
join r2 in table2.AsEnumerable()
on new { ID = r1.Field<int>("ID"), Name = r1.Field<string>("Name") }
equals new { ID = r2.Field<int>("ID"), Name = r2.Field<string>("Name") }
select new { r1, r2 };
var columnsToCompare = table1.Columns.Cast<DataColumn>().Skip(2);
foreach (var rowInfo in query)
{
var row = table3.Rows.Add();
row.SetField("ID", rowInfo.r1.Field<int>("ID"));
row.SetField("Name", rowInfo.r1.Field<string>("Name"));
foreach (DataColumn col in columnsToCompare)
{
int val1 = rowInfo.r1.Field<int>(col.ColumnName);
int val2 = rowInfo.r2.Field<int>(col.ColumnName);
int diff = (int)Math.Abs(val1-val2);
row.SetField(col.ColumnName, diff);
}
}
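The snippet above writes into table3 without showing how that table is created. One way to build it, sketched here under assumptions, is to mirror the two key columns and add one int column per compared value column. The MakeTable helper is hypothetical and only recreates the sample data from the question; the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical helper that builds a table shaped like Table A / Table B in the question.
DataTable MakeTable(params (int Id, string Name, int V1, int V2, int V3)[] rows)
{
    var t = new DataTable();
    t.Columns.Add("ID", typeof(int));
    t.Columns.Add("Name", typeof(string));
    t.Columns.Add("Value1", typeof(int));
    t.Columns.Add("Value2", typeof(int));
    t.Columns.Add("Value3", typeof(int));
    foreach (var r in rows)
        t.Rows.Add(r.Id, r.Name, r.V1, r.V2, r.V3);
    return t;
}

var table1 = MakeTable((1, "Bob", 50, 150, 35), (2, "Bill", 55, 47, 98),
                       (3, "Pat", 10, 15, 45), (4, "Cat", 70, 150, 35));
var table2 = MakeTable((1, "Bob", 30, 34, 67), (2, "Bill", 55, 47, 98),
                       (3, "Pat", 100, 15, 45), (4, "Cat", 70, 100, 20));

// Everything after the two identifier columns is treated as a value to compare.
var valueColumns = table1.Columns.Cast<DataColumn>().Skip(2).ToList();

// table3 mirrors the key columns and gets one int column per value column.
var table3 = new DataTable();
table3.Columns.Add("ID", typeof(int));
table3.Columns.Add("Name", typeof(string));
foreach (var col in valueColumns)
    table3.Columns.Add(col.ColumnName, typeof(int));

// Inner join on the composite key (ID, Name).
var query = from r1 in table1.AsEnumerable()
            join r2 in table2.AsEnumerable()
              on new { ID = r1.Field<int>("ID"), Name = r1.Field<string>("Name") }
              equals new { ID = r2.Field<int>("ID"), Name = r2.Field<string>("Name") }
            select new { r1, r2 };

foreach (var pair in query)
{
    var row = table3.Rows.Add();
    row["ID"] = pair.r1["ID"];
    row["Name"] = pair.r1["Name"];
    foreach (var col in valueColumns)
    {
        // Store the absolute difference under the original column name.
        row[col.ColumnName] = Math.Abs(
            pair.r1.Field<int>(col.ColumnName) - pair.r2.Field<int>(col.ColumnName));
    }
}
```

If the TableAValue1 / TableBValue1 / DiffValue1 layout from the question is needed, three columns per value column could be added in the same loop instead of one.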

var tableC = new DataTable();
tableC.Columns.Add(new DataColumn("ID"));
tableC.Columns.Add(new DataColumn("Name"));
tableC.Columns.Add(new DataColumn("TableAValue1"));
tableC.Columns.Add(new DataColumn("TableBValue1"));
tableC.Columns.Add(new DataColumn("DiffValue1"));
foreach (DataRow rowA in tableA.Rows)
{
foreach (DataRow rowB in tableB.Rows)
{
if (Convert.ToInt32(rowA["ID"]) == Convert.ToInt32(rowB["ID"]) &&
rowA["Name"].ToString() == rowB["Name"].ToString() &&
Convert.ToInt32(rowA["Value1"]) != Convert.ToInt32(rowB["Value1"]))
{
var newRow = tableC.NewRow();
newRow["ID"] = rowA["ID"];
newRow["Name"] = rowA["Name"];
newRow["TableAValue1"] = rowA["Value1"];
newRow["TableBValue1"] = rowB["Value1"];
newRow["DiffValue1"] = Convert.ToInt32(rowA["Value1"]) - Convert.ToInt32(rowB["Value1"]);
tableC.Rows.Add(newRow);
}
}
}

Using LINQ, create an anonymous type as follows
var joinedRows = (from rowA in TableA.AsEnumerable()
from rowB in TableB.AsEnumerable()
where rowA.Field<String>("Name") == rowB.Field<String>("Name")
select new
{
ID = rowA.Field<int>("ID"),
Name = rowA.Field<String>("Name"),
TableAValue1 = rowA.Field<int>("Value1"),
TableBValue1 = rowB.Field<int>("Value1"),
DiffValue1 = Math.Abs(rowA.Field<int>("Value1") - rowB.Field<int>("Value1")),
TableAValue2 = rowA.Field<int>("Value2"),
TableBValue2 = rowB.Field<int>("Value2"),
DiffValue2 = Math.Abs(rowA.Field<int>("Value2") - rowB.Field<int>("Value2")),
TableAValue3 = rowA.Field<int>("Value3"),
TableBValue3 = rowB.Field<int>("Value3"),
DiffValue3 = Math.Abs(rowA.Field<int>("Value3") - rowB.Field<int>("Value3"))
});
TableA.AsEnumerable() gives you an IEnumerable<DataRow>.
row.Field<T>() casts the value to the correct type for you.
You can now use the anonymous type of joinedRows to create your new DataTable.
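Turning a sequence of anonymous objects into a DataTable can be done with a little reflection. The following is a sketch, not a definitive implementation: the helper name ToDataTable is hypothetical, the sample rows are hard-coded stand-ins for the query result, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper: one column per public property of T, one row per item.
DataTable ToDataTable<T>(IEnumerable<T> items)
{
    var props = typeof(T).GetProperties();
    var table = new DataTable();
    foreach (var p in props)
    {
        // DataTable columns cannot be typed as Nullable<T>, so unwrap if needed.
        table.Columns.Add(p.Name, Nullable.GetUnderlyingType(p.PropertyType) ?? p.PropertyType);
    }
    foreach (var item in items)
    {
        // Null property values are stored as DBNull.
        table.Rows.Add(props.Select(p => p.GetValue(item) ?? DBNull.Value).ToArray());
    }
    return table;
}

// Stand-in for the anonymous objects produced by the query.
var joinedRows = new[]
{
    new { ID = 1, Name = "Bob", TableAValue1 = 50, TableBValue1 = 30, DiffValue1 = 20 },
    new { ID = 3, Name = "Pat", TableAValue1 = 10, TableBValue1 = 100, DiffValue1 = 90 },
};

var tableC = ToDataTable(joinedRows);
```

Because T is inferred from the sequence, the helper works with anonymous types without naming them.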

This uses a strategy similar to kippermand's, but will probably perform slightly better on large sets of data by avoiding the O(n²) complexity of checking every ID against every other ID, and by reusing the values extracted from the data table:
// joining by row location
var joinedTableRows =
dt1.AsEnumerable().Zip(dt2.AsEnumerable(),
(r1, r2) => new{r1, r2});
// or, joining by ID
var joinedTableRows2 =
dt1.AsEnumerable().Join(dt2.AsEnumerable(),
r => r.Field<int>("ID"),
r => r.Field<int>("ID"),
(r1, r2) => new{r1, r2});
var result =
from row in joinedTableRows
let rowA = row.r1
let rowB = row.r2
let tableAValue1 = rowA.Field<int>("Value1")
let tableBValue1 = rowB.Field<int>("Value1")
let tableAValue2 = rowA.Field<int>("Value2")
let tableBValue2 = rowB.Field<int>("Value2")
let tableAValue3 = rowA.Field<int>("Value3")
let tableBValue3 = rowB.Field<int>("Value3")
select new
{
ID = row.r1.Field<int>("ID"),
Name = row.r1.Field<string>("Name"),
TableAValue1 = tableAValue1,
TableBValue1 = tableBValue1,
DiffValue1 = Math.Abs(tableAValue1 - tableBValue1),
TableAValue2 = tableAValue2,
TableBValue2 = tableBValue2,
DiffValue2 = Math.Abs(tableAValue2 - tableBValue2),
TableAValue3 = tableAValue3,
TableBValue3 = tableBValue3,
DiffValue3 = Math.Abs(tableAValue3 - tableBValue3)
};
Depending on how your data needs to be consumed, you could either declare a class matching this anonymous type, and consume that directly (which is what I'd prefer), or you can create a DataTable from these objects, if you have to.

linq joining, grouping, with parent roll-up

say I've got a DataTable in this format:
id | key1 | key2 | data1 | data2 | parentID
10 | AA | one | 10.3 | 0.3 | -1
10 | AA | two | 20.1 | 16.2 | -1
10 | BB | one | -5.9 | 30.1 | -1
20 | AA | one | 403.1 | -20.4 | 10
30 | AA | one | 121.5 | 210.3 | -1
and a second DataTable like so:
id | data
10 | 5500
20 | -3000
30 | 500
What I want to do is aggregate the data at the "id" level, with the second table's "data" field added to the first's net "data1", and "data2" just summed up by itself. I figured out how to do this, but this is where I'm stuck: I want the data for anything with "parentID" != -1 to be added to its parent. So the output of the above data should be
id | data1 | data2
10 | 2927.6 | 26.2
30 | 621.5 | 210.3
is there an efficient way to do this?
edit: code sample
DataTable dt1 = new DataTable();
dt1.Columns.Add("id", typeof(int));
dt1.Columns.Add("key1", typeof(string));
dt1.Columns.Add("key2", typeof(string));
dt1.Columns.Add("data1", typeof(double));
dt1.Columns.Add("data2", typeof(double));
dt1.Columns.Add("parentID", typeof(int));
DataTable dt2 = new DataTable();
dt2.Columns.Add("id", typeof(int));
dt2.Columns.Add("data", typeof(double));
dt1.Rows.Add(new object[] { 10, "AA", "one", 10.3, 0.3, -1 });
dt1.Rows.Add(new object[] { 10, "AA", "two", 20.1, 16.2, -1 });
dt1.Rows.Add(new object[] { 10, "BB", "one", -5.9, 30.1, -1 });
dt1.Rows.Add(new object[] { 20, "AA", "one", 403.1, -20.4, 10 });
dt1.Rows.Add(new object[] { 30, "AA", "one", 121.5, 210.3, -1 });
dt2.Rows.Add(new object[] { 10, 5500 });
dt2.Rows.Add(new object[] { 20, -3000 });
dt2.Rows.Add(new object[] { 30, 500 });
var groups = dt1.AsEnumerable()
.GroupBy(e => e["id"])
.Select(e => new
{
id = e.Key,
net_data1 = e.Sum(w => (double)w["data1"]),
net_data2 = e.Sum(w => (double)w["data2"])
})
.GroupJoin(dt2.AsEnumerable(), e1 => e1.id, e2 => e2["id"],
(a1, a2) => new
{
id = a1.id,
net_data1 = a1.net_data1 + a2.Sum(w => (double)w["data"]),
net_data2 = a1.net_data2
});

Unfortunately, SQL (and, by extension, LINQ) is not well-suited to recursion. Can the parentID column go multiple levels deep? Like this:
ID Parent
------------------
10 -1
20 10
30 10
40 20
If you want to retrace the steps up from ID 40 to ID 10, then you should abandon a SQL/LINQ approach and just do it in code.
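Doing it "in code" could look like the sketch below: each id is first resolved to its top-level ancestor, then both tables are summed per ancestor. It is a minimal illustration under assumptions: every id has a single parent, the parent chain contains no cycles, the sample tables are rebuilt from the question, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

var dt1 = new DataTable();
dt1.Columns.Add("id", typeof(int));
dt1.Columns.Add("key1", typeof(string));
dt1.Columns.Add("key2", typeof(string));
dt1.Columns.Add("data1", typeof(double));
dt1.Columns.Add("data2", typeof(double));
dt1.Columns.Add("parentID", typeof(int));
dt1.Rows.Add(10, "AA", "one", 10.3, 0.3, -1);
dt1.Rows.Add(10, "AA", "two", 20.1, 16.2, -1);
dt1.Rows.Add(10, "BB", "one", -5.9, 30.1, -1);
dt1.Rows.Add(20, "AA", "one", 403.1, -20.4, 10);
dt1.Rows.Add(30, "AA", "one", 121.5, 210.3, -1);

var dt2 = new DataTable();
dt2.Columns.Add("id", typeof(int));
dt2.Columns.Add("data", typeof(double));
dt2.Rows.Add(10, 5500.0);
dt2.Rows.Add(20, -3000.0);
dt2.Rows.Add(30, 500.0);

// id -> parentID; -1 marks a top-level row.
var parentOf = dt1.AsEnumerable()
    .GroupBy(r => r.Field<int>("id"))
    .ToDictionary(g => g.Key, g => g.First().Field<int>("parentID"));

// Walk up the chain until an id with no parent is reached.
// An id that is unknown to dt1 is treated as its own root.
int RootOf(int id)
{
    while (parentOf.TryGetValue(id, out var parent) && parent != -1)
        id = parent;
    return id;
}

var totals = new SortedDictionary<int, (double Data1, double Data2)>();

// Sum data1 and data2 of every dt1 row under its root id.
foreach (var r in dt1.AsEnumerable())
{
    var root = RootOf(r.Field<int>("id"));
    totals.TryGetValue(root, out var t); // t is (0, 0) when the key is new
    totals[root] = (t.Data1 + r.Field<double>("data1"), t.Data2 + r.Field<double>("data2"));
}

// Add the "data" value of every dt2 row to data1 of its root id.
foreach (var r in dt2.AsEnumerable())
{
    var root = RootOf(r.Field<int>("id"));
    totals.TryGetValue(root, out var t);
    totals[root] = (t.Data1 + r.Field<double>("data"), t.Data2);
}

foreach (var kv in totals)
    Console.WriteLine($"{kv.Key} | {kv.Value.Data1:F1} | {kv.Value.Data2:F1}");
```

The loop in RootOf handles any depth of nesting, which is the case a single join cannot cover.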

It sounds like a good use of a group join. Something like this might work (though it's completely untested):
var items = from parent in context.dataTable
join child in context.dataTable on parent.id equals child.parentID into children
where parent.parentID == -1
select new { id = parent.id,
data1 = (parent.data1 + children.Sum(c => c.data1)),
data2 = (parent.data2 + children.Sum(c => c.data2)) };
