How to replace duplicates in datatable - c#

I have this DataTable:
ID | Name | Lastname
---+------+---------
1  | koki | ha
1  | lola | mi
2  | ka   | xe
It should look like this (Rows[1][0] is blanked because it has the same ID as Rows[0][0]):
ID | Name | Lastname
---+------+---------
1  | koki | ha
   | lola | mi
2  | ka   | xe
How can I replace "1" with "" (empty) if it already exists in an earlier row? I spent two hours on this but can't find a solution. I tried LINQ but couldn't find the right approach; maybe Distinct or GroupBy?
DataTable table = new DataTable("table");
table.Columns.Add("ID", typeof(Int32));
table.Columns.Add("Name", typeof(String));
table.Columns.Add("Lastname", typeof(String));
object[] o1 = { 1, "Kiki", "ha"};
object[] o2 = { 1,"lola","mi"};
object[] o4 = { 2, "ka", "xe" };
table.Rows.Add(o1);
table.Rows.Add(o2);
table.Rows.Add(o4);
dataGridView2.DataSource = table;

Here's how you can do this using LINQ:
var dataRows = table.Rows.Cast<System.Data.DataRow>()
.GroupBy(r => r[0])
.Where(g => g.Count() > 1);
foreach (var dataRowGroup in dataRows) {
int idx = 0;
foreach (DataRow row in dataRowGroup) {
if (idx++ > 0) {
row[0] = DBNull.Value;
}
}
}
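As an alternative to grouping, the same result can be sketched in a single pass that remembers which values it has already seen. This is a different technique from the GroupBy answer above, and the class and method names (DuplicateIdBlanker, BlankRepeatedValues) are hypothetical. It assumes the column allows nulls, which is the default for a DataColumn.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DuplicateIdBlanker
{
    // Hypothetical helper (not part of the original answer): sets columnName
    // to DBNull for every row whose value already appeared in an earlier row.
    public static void BlankRepeatedValues(DataTable table, string columnName)
    {
        var seen = new HashSet<object>();
        foreach (DataRow row in table.Rows)
        {
            object value = row[columnName];
            if (value == DBNull.Value)
                continue; // already blank, nothing to compare

            // HashSet.Add returns false when the value was seen before.
            if (!seen.Add(value))
                row[columnName] = DBNull.Value;
        }
    }
}
```

Calling DuplicateIdBlanker.BlankRepeatedValues(table, "ID") before assigning dataGridView2.DataSource should leave the first row of each ID untouched and clear the ID in later rows. A DataGridView normally renders DBNull as an empty cell.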

Related

Sort out duplicates in one row but keep a specific one

What I'm trying to do is a distinct (or group by) on one column, while keeping the entry in my list that has a value for a second column.
What I have:
Column1 Column 2 Column3 ...
1 | tada | smth
1 | | wefih
2 | tada | uitethgev
3 | | urifnvf
What I want:
Column1 Column 2 Column3 ...
1 | tada | smth
2 | tada | uitethgev
3 | | urifnvf
As I only have one "3", I want to keep it in my list. The same goes for the 2, but for the 1 only the row with a value in Column2 should stay.
I want to do this in a LINQ query. Each row is an object with properties that represent the columns.
Any clues on this? I know how to do it with multiple lists and a method that checks and copies between them, but I thought there could be a nice LINQ way. Also, please keep in mind that I have more than just 3 columns.
You can do it with LINQ, specifically with GroupBy:
1 - Create a class that models your data:
public class TestClass
{
public int Column1 { get; set; }
public string Column2 { get; set; }
public string Column3 { get; set; }
}
2 - Initialize a list of TestClass like below:
List<TestClass> testClasses = new List<TestClass>
{
new TestClass{Column1 = 1, Column2 = "tada", Column3 = "smth"},
new TestClass{Column1 = 1, Column2 = "msa", Column3 = "msa1"},
new TestClass{Column1 = 1, Column3 = "wefih"},
new TestClass{Column1 = 2, Column2 = "tada", Column3 = "uitethgev"},
new TestClass{Column1 = 3, Column3 = "urifnvf"},
};
3 - Use GroupBy to filter the list, based on whether a group has two or more elements:
if count >= 2: take the first element whose Column2 is not empty (falling back to the first element if every Column2 in the group is empty)
else: take the element without filtering
List<TestClass> groupedList = testClasses
.GroupBy(x => x.Column1)
.Select(y => y.Count() >= 2 ? (y.FirstOrDefault(z => !string.IsNullOrEmpty(z.Column2)) ?? y.First()) : y.First())
.ToList();
Result of 3 :
Column1|Column 2| Column3 ...
1 | tada | smth
2 | tada | uitethgev
3 | | urifnvf
4 - If you need every row with a non-empty Column2 whenever a group has two or more rows, try this code:
List<TestClass> groupedList = testClasses
.GroupBy(x => x.Column1)
.SelectMany(y => y.Count() >= 2 ? y.Where(z => !string.IsNullOrEmpty(z.Column2)) : y)
.ToList();
Result of 4 :
Column1|Column 2| Column3 ...
1 | tada | smth
1 | msa | msa1
2 | tada | uitethgev
3 | | urifnvf
I hope that answers your question.
The following LINQ works:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
DataTable dt = new DataTable();
dt.Columns.Add("Column1", typeof(int));
dt.Columns.Add("Column2", typeof(string));
dt.Columns.Add("Column3", typeof(string));
dt.Rows.Add(new object[] { 1, "tada", "smth"});
dt.Rows.Add(new object[] { 1, null, "wefih"});
dt.Rows.Add(new object[] { 2, "tada", "uitethgev"});
dt.Rows.Add(new object[] { 3, null, "urifnvf"});
DataTable dt2 = dt.AsEnumerable()
.GroupBy(x => x.Field<int>("Column1"))
.Select(x => x.All(y => (y.Field<object>("Column2") == null)) ? x.First() : x.Where(y => y.Field<object>("Column2") != null).First())
.CopyToDataTable();
}
}
}

How to load data from SQL into a DataTable to show in a TreeView

I have a chartTable with 2 columns:
ChildPersonID | ParentPersonID
--------------+-----------------
1 | 2
1 | 3
2 | 4
That is joined to personTable with 2 columns:
ID | PersonName
---+-----------------
1 | a
2 | b
3 | c
4 | d
I want a SELECT query that fills a DataTable with the PersonName values, to show in a TreeView.
Result:
parentname | parentid | childname | childid
-----------+----------+-----------+---------
a | 1 | b | 2
a | 1 | c | 3
b | 2 | d | 4
My code:
DECLARE @Table1 TABLE (ChildPersonID INT,ParentPersonID INT)
DECLARE @Table2 TABLE (ID INT, PersonName VARCHAR(10))
INSERT INTO @Table1 VALUES (1,2),(1,3),(2,4)
INSERT INTO @Table2 VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d')
SELECT T3.PersonName AS parentName, T1.ChildPersonID AS ParentId,
T2.PersonName AS childname, T1.ParentPersonID AS childid
FROM @Table1 T1
INNER JOIN @Table2 T2 ON T1.ParentPersonID = T2.Id
INNER JOIN @Table2 T3 ON T2.ChildPersonID = T3.id
This is the query you are looking for (the column names you mention are confusing, though; I think they should be reversed). The second join has to go through T1.ChildPersonID, because T2 is the person table and has no ChildPersonID column:
DECLARE @Table1 TABLE (ChildPersonID INT,ParentPersonID INT)
DECLARE @Table2 TABLE (ID INT, PersonName VARCHAR(10))
INSERT INTO @Table1 VALUES (1,2),(1,3),(2,4)
INSERT INTO @Table2 VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d')
SELECT T3.PersonName AS parentName, T1.ChildPersonID AS ParentId,
T2.PersonName AS childname, T1.ParentPersonID AS childid
FROM @Table1 T1
INNER JOIN @Table2 T2 ON T1.ParentPersonID = T2.Id
INNER JOIN @Table2 T3 ON T1.ChildPersonID = T3.id
See code below :
using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication94
{
class Program
{
static void Main(string[] args)
{
DataTable dt = new DataTable();
dt.Columns.Add("parentname", typeof(string));
dt.Columns.Add("parentid", typeof(int));
dt.Columns.Add("childname", typeof(string));
dt.Columns.Add("childid", typeof(int));
DataTable dtChildPerson = new DataTable();
dtChildPerson.Columns.Add("ChildPersonID", typeof(int));
dtChildPerson.Columns.Add("ParentPersonID", typeof(int));
dtChildPerson.Rows.Add(new object[] { 1, 2 }); // matches the (1, 2) row in the question's chartTable
dtChildPerson.Rows.Add(new object[] { 1, 3 });
dtChildPerson.Rows.Add(new object[] { 2, 4 });
DataTable personName = new DataTable();
personName.Columns.Add("ID", typeof(int));
personName.Columns.Add("PersonName", typeof(string));
personName.Rows.Add(new object[] { 1, "a" });
personName.Rows.Add(new object[] { 2, "b" });
personName.Rows.Add(new object[] { 3, "c" });
personName.Rows.Add(new object[] { 4, "d" });
foreach (DataRow row in dtChildPerson.AsEnumerable())
{
int parentID = row.Field<int>("ParentPersonID");
string parentName = personName.AsEnumerable().Where(x => x.Field<int>("ID") == parentID).Select(x => x.Field<string>("PersonName")).FirstOrDefault();
int childID = row.Field<int>("ChildPersonID");
foreach(DataRow childRow in personName.AsEnumerable().Where(x => x.Field<int>("ID") == childID))
{
string childName = childRow.Field<string>("PersonName");
dt.Rows.Add(new object[] { parentName, parentID, childName, childID });
}
}
}
}
}

DataTable.Select() to display summation of records in datatable

In my C# project I have a DataTable where I need to sum a few columns and display the aggregated records, but I am unable to create a filter query for that.
Records like:
|Col1|Col2|Col3|Col4|
| A | X | 10 | 10 |
| A | X | 10 | 20 |
| A | Y | 12 | 12 |
| A | Y | 10 | 10 |
Result will be:
|Col1|Col2|Col3|Col4|
| A | X | 20 | 30 |
| A | Y | 22 | 22 |
I have to use DataTable.Select("filter condition").
var result = (from DataRow s in yourDataTable.Select("filter conditions").AsEnumerable()
group s by new {g1 = s.Field<string>("Col1"), g2 = s.Field<string>("Col2") } into g
select new
{
Col1 = g.Key.g1,
Col2 = g.Key.g2,
Col3 = g.Sum(r => r.Field<decimal>("Col3")), // assumes Col3 and Col4 are decimal columns
Col4 = g.Sum(r => r.Field<decimal>("Col4")),
}).ToList();
If you want the result as a DataTable, you can convert the list like below (PropertyDescriptorCollection and TypeDescriptor live in System.ComponentModel):
var resultAsDataTable = ConvertListToDataTable(result);
public static DataTable ConvertListToDataTable<T>(IList<T> data)
{
PropertyDescriptorCollection props =
TypeDescriptor.GetProperties(typeof(T));
DataTable table = new DataTable();
for (int i = 0; i < props.Count; i++)
{
PropertyDescriptor prop = props[i];
table.Columns.Add(prop.Name, prop.PropertyType);
}
object[] values = new object[props.Count];
foreach (T item in data)
{
for (int i = 0; i < values.Length; i++)
{
values[i] = props[i].GetValue(item);
}
table.Rows.Add(values);
}
return table;
}
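Since the question insists on filter expressions, here is a rough sketch of an alternative that avoids LINQ: take the distinct (Col1, Col2) pairs with DataView.ToTable, then let DataTable.Compute sum each pair using a filter string. The class and method names (FilterExpressionSums, SumByKeys) are hypothetical. It assumes Col1 and Col2 are non-null string columns and Col3 and Col4 are numeric.

```csharp
using System;
using System.Data;

public static class FilterExpressionSums
{
    // Hypothetical helper: returns one row per distinct (Col1, Col2) pair,
    // with Col3 and Col4 summed over the matching source rows.
    public static DataTable SumByKeys(DataTable source)
    {
        // ToTable(true, ...) keeps only distinct combinations of the listed columns.
        DataTable result = new DataView(source).ToTable(true, "Col1", "Col2");
        result.Columns.Add("Col3", typeof(decimal));
        result.Columns.Add("Col4", typeof(decimal));

        foreach (DataRow row in result.Rows)
        {
            // Same expression syntax as DataTable.Select("filter condition").
            string filter = string.Format("Col1 = '{0}' AND Col2 = '{1}'",
                Escape((string)row["Col1"]), Escape((string)row["Col2"]));

            // Convert.ToDecimal copes with whatever numeric type Compute returns.
            row["Col3"] = Convert.ToDecimal(source.Compute("SUM(Col3)", filter));
            row["Col4"] = Convert.ToDecimal(source.Compute("SUM(Col4)", filter));
        }
        return result;
    }

    // Single quotes inside a value must be doubled in a filter expression.
    private static string Escape(string value)
    {
        return value.Replace("'", "''");
    }
}
```

This runs one Compute call per group and column, so it is likely slower than the GroupBy version on large tables. It does keep everything in the filter-expression style the question asks for.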

Merge columns of two DataTables using linq

I have two DataTables: dt1 and dt2.
dt1:
ID | Name | Address | QTY
-------+----------+---------+-----
A1 | Dog | C1 | 272
A2 | Cat | C3 | 235
A3 | Chicken | C2 | 254
A4 | Mouse | C4 | 259
A5 | Pig | C5 | 233
dt2:
ID | Name | Address | QTY MAX
-------+----------+---------+--------
A1 | Dog | C1 | 250
A2 | Cat | C3 | 200
A3 | Chicken | C2 | 300
A6 | Rabbit | C6 | 350
But, I want to merge dt1 and dt2 to dt3 like below:
ID | Name | Address | QTY | QTY MAX
-------+----------+---------+-------+--------
A1 | Dog | C1 | 272 | 250
A2 | Cat | C3 | 235 | 200
A3 | Chicken | C2 | 254 | 300
A4 | Mouse | C4 | 259 | 0
A5 | Pig | C5 | 233 | 0
A6 | Rabbit | C6 | 0 | 350
Can any one help me?
If your DataTables don't have a primary key, and you can't or don't want to change them, you can use code like this:
// At first you need to define your result `DataTable`
// So make it by cloning from first `DataTable`
var dt3 = dt1.Clone();
// Then add extra columns to it
dt3.Columns.Add("Qty Max", typeof(int));
// Second, you need to add rows of first `DataTable`
foreach (DataRow row in dt1.Rows)
{
// When you don't have a primary key you need a code like this to find same rows:
var dt2Row = dt2.Rows.OfType<DataRow>().SingleOrDefault(w => w["ID"].Equals(row["ID"]));
var qtyMax = dt2Row?["Qty Max"] ?? 0; // Here I set default value to `0`
dt3.Rows.Add(row["ID"], row["Name"], row["Address"], row["Qty"], qtyMax);
}
// Third, you need to add rows of second `DataTable` that is not in first
var dt2OnlyRows =
// Compare with Equals: the row indexer returns object, so != would compare references
dt2.Rows.OfType<DataRow>().Where(w => dt1.Rows.OfType<DataRow>().All(x => !x["ID"].Equals(w["ID"])));
foreach (var row in dt2OnlyRows)
{
dt3.Rows.Add(row["ID"], row["Name"], row["Address"], 0, row["Qty Max"]);
}
This is not a LINQ solution; you could simply use DataTable.Merge together with DataTable.PrimaryKey to get the desired output.
Here is a dummy example you can use:
var dt1 = new DataTable();
var p1 = dt1.Columns.Add("a", typeof(int)); //Use this to add Primary Key constraint
dt1.Columns.Add("b");
dt1.Columns.Add("c");
dt1.Rows.Add("1", "apple", "10");
dt1.Rows.Add("2", "mango", "20");
dt1.Rows.Add("3", "orange", "30");
dt1.Rows.Add("4", "banana", "40");
dt1.PrimaryKey = new DataColumn[] { p1 }; //This removes duplication of rows
var dt2 = new DataTable();
var p2 = dt2.Columns.Add("a", typeof(int)); //Use this to add Primary Key constraint
dt2.Columns.Add("b");
dt2.Columns.Add("d");
dt2.Rows.Add("1", "apple", "50");
dt2.Rows.Add("2", "mango", "60");
dt2.Rows.Add("3", "orange", "70");
dt2.Rows.Add("5", "grapes", "80");
dt2.PrimaryKey = new DataColumn[] { p2 }; //This removes duplication of rows
var dt3 = dt1.Copy();
dt3.Merge(dt2); // Merge here merges the values from both provided DataTables
Taking your question into consideration:
var dt1 = new DataTable();
var p1 = dt1.Columns.Add("ID", typeof(string));
dt1.Columns.Add("Name", typeof(string));
dt1.Columns.Add("Address", typeof(string));
dt1.Columns.Add("Qty", typeof(int));
dt1.Columns["Qty"].DefaultValue = 0; //Setting default value
dt1.Rows.Add("A1", "Dog", "C1", 100);
dt1.Rows.Add("A2", "Cat", "C3", 200);
dt1.Rows.Add("A3", "Chicken", "C2", 300);
dt1.Rows.Add("A4", "Mouse", "C4", 400);
dt1.Rows.Add("A5", "Pig", "C5", 500);
dt1.PrimaryKey = new DataColumn[] { p1 };
var dt2 = new DataTable();
var p2 = dt2.Columns.Add("ID", typeof(string));
dt2.Columns.Add("Name", typeof(string));
dt2.Columns.Add("Address", typeof(string));
dt2.Columns.Add("Qty Max", typeof(int));
dt2.Columns["Qty Max"].DefaultValue = 0; //Setting default value
dt2.Rows.Add("A1", "Dog", "C1", 600);
dt2.Rows.Add("A2", "Cat", "C3", 700);
dt2.Rows.Add("A3", "Chicken", "C2", 800);
dt2.Rows.Add("A6", "Rabbit", "C6", 900);
dt2.PrimaryKey = new DataColumn[] { p2 };
var dt3 = dt1.Copy();
dt3.Merge(dt2);
Thanks @shA.t for suggesting DataColumn.DefaultValue so that blank cells are replaced with 0. His answer also uses LINQ features, which I guess is what you are looking for!

linq joining, grouping, with parent roll-up

say I've got a DataTable in this format:
id | key1 | key2 | data1 | data2 | parentID
10 | AA | one | 10.3 | 0.3 | -1
10 | AA | two | 20.1 | 16.2 | -1
10 | BB | one | -5.9 | 30.1 | -1
20 | AA | one | 403.1 | -20.4 | 10
30 | AA | one | 121.5 | 210.3 | -1
and a second DataTable like so:
id | data
10 | 5500
20 | -3000
30 | 500
What I want to do is aggregate the data at the "id" level, with the second table's "data" field added to the first's net "data1", and "data2" just summed up by itself. I figured out how to do this, but here is where I'm stuck: I want the data for anything with "parentID" != -1 to be added to its parent. So the output of the above data should be
id | data1 | data2
10 | 2927.6 | 26.2
30 | 621.5 | 210.3
Is there an efficient way to do this?
Edit: code sample
DataTable dt1 = new DataTable();
dt1.Columns.Add("id", typeof(int));
dt1.Columns.Add("key1", typeof(string));
dt1.Columns.Add("key2", typeof(string));
dt1.Columns.Add("data1", typeof(double));
dt1.Columns.Add("data2", typeof(double));
dt1.Columns.Add("parentID", typeof(int));
DataTable dt2 = new DataTable();
dt2.Columns.Add("id", typeof(int));
dt2.Columns.Add("data", typeof(double));
dt1.Rows.Add(new object[] { 10, "AA", "one", 10.3, 0.3, -1 });
dt1.Rows.Add(new object[] { 10, "AA", "two", 20.1, 16.2, -1 });
dt1.Rows.Add(new object[] { 10, "BB", "one", -5.9, 30.1, -1 });
dt1.Rows.Add(new object[] { 20, "AA", "one", 403.1, -20.4, 10 });
dt1.Rows.Add(new object[] { 30, "AA", "one", 121.5, 210.3, -1 });
dt2.Rows.Add(new object[] { 10, 5500 });
dt2.Rows.Add(new object[] { 20, -3000 });
dt2.Rows.Add(new object[] { 30, 500 });
var groups = dt1.AsEnumerable()
.GroupBy(e => e["id"])
.Select(e => new
{
id = e.Key,
net_data1 = e.Sum(w => (double)w["data1"]),
net_data2 = e.Sum(w => (double)w["data2"])
})
.GroupJoin(dt2.AsEnumerable(), e1 => e1.id, e2 => e2["id"],
(a1, a2) => new
{
id = a1.id,
net_data1 = a1.net_data1 + a2.Sum(w => (double)w["data"]),
net_data2 = a1.net_data2
});
Unfortunately, SQL (and, by extension, LINQ) is not well-suited to recursion. Can the parentID column go multiple levels deep? Like this:
ID Parent
------------------
10 -1
20 10
30 10
40 20
If you want to retrace the steps up from ID 40 to ID 10, then you should abandon a SQL/LINQ approach and just do it in code.
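Doing it "in code" could look roughly like the sketch below: build an id -> parentID lookup, follow it upwards until a row with parentID == -1 is reached, and add every value to that root's totals. The names (ParentRollUp, Totals, FindRoot, Aggregate) are hypothetical. It assumes the column layout from the question's code sample, one consistent parentID per id, and no cycles in the parent chain (a cycle would loop forever).

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Running sums for one top-level id.
public sealed class Totals
{
    public double Data1;
    public double Data2;
}

public static class ParentRollUp
{
    // Walks up the parent chain; an id with parentID == -1 (or an id that is
    // not in the lookup at all) is treated as its own root.
    public static int FindRoot(int id, IDictionary<int, int> parentOf)
    {
        int parent;
        while (parentOf.TryGetValue(id, out parent) && parent != -1)
            id = parent;
        return id;
    }

    public static Dictionary<int, Totals> Aggregate(DataTable dt1, DataTable dt2)
    {
        // id -> parentID; ids repeat in dt1, so the last row seen wins.
        var parentOf = new Dictionary<int, int>();
        foreach (DataRow r in dt1.Rows)
            parentOf[(int)r["id"]] = (int)r["parentID"];

        var totals = new Dictionary<int, Totals>();
        foreach (DataRow r in dt1.Rows)
            Add(totals, parentOf, (int)r["id"], (double)r["data1"], (double)r["data2"]);
        // dt2's "data" column only contributes to data1.
        foreach (DataRow r in dt2.Rows)
            Add(totals, parentOf, (int)r["id"], (double)r["data"], 0.0);
        return totals;
    }

    private static void Add(Dictionary<int, Totals> totals, IDictionary<int, int> parentOf,
                            int id, double d1, double d2)
    {
        int root = FindRoot(id, parentOf);
        Totals t;
        if (!totals.TryGetValue(root, out t))
            totals[root] = t = new Totals();
        t.Data1 += d1;
        t.Data2 += d2;
    }
}
```

With the dt1 and dt2 from the question, ParentRollUp.Aggregate(dt1, dt2) should give about 2927.6 / 26.2 for id 10 and 621.5 / 210.3 for id 30, up to floating-point rounding. Because the lookup is followed in a loop, deeper chains such as 40 -> 20 -> 10 roll up to the top-level id as well.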
It sounds like a good use of a group join. Something like this might work (though it's completely untested):
var items = from parent in context.dataTable
join child in context.dataTable on parent.id equals child.parentID into children
where parent.parentID == -1
select new { id = parent.id,
data1 = (parent.data1 + children.Sum(c => c.data1)),
data2 = (parent.data2 + children.Sum(c => c.data2)) };
