How to replace duplicates in datatable

I have this DataTable:
ID | Name | Lastname
---+------+---------
1  | koki | ha
1  | lola | mi
2  | ka   | xe
It should look like this (merge Rows[0][0] with Rows[1][0] if they are the same):
ID | Name | Lastname
---+------+---------
1  | koki | ha
   | lola | mi
2  | ka   | xe
How can I replace a repeated "1" with "" (an empty value) when the same ID already exists in an earlier row? I spent two hours on this but can't find a solution. I tried LINQ but couldn't find the right approach; maybe Distinct or GroupBy?
DataTable table = new DataTable("table");
table.Columns.Add("ID", typeof(Int32));
table.Columns.Add("Name", typeof(String));
table.Columns.Add("Lastname", typeof(String));
object[] o1 = { 1, "Kiki", "ha"};
object[] o2 = { 1,"lola","mi"};
object[] o4 = { 2, "ka", "xe" };
table.Rows.Add(o1);
table.Rows.Add(o2);
table.Rows.Add(o4);
dataGridView2.DataSource = table;

Here's how you can do this using LINQ:
var dataRows = table.Rows.Cast<System.Data.DataRow>()
    .GroupBy(r => r[0])
    .Where(g => g.Count() > 1);
foreach (var dataRowGroup in dataRows) {
    int idx = 0;
    foreach (DataRow row in dataRowGroup) {
        // Keep the ID on the first row of each group and blank it on the rest.
        if (idx++ > 0) {
            row[0] = DBNull.Value;
        }
    }
}
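Because the ID column is typed as Int32, it cannot hold an empty string, which is why the answer writes DBNull.Value instead (a DataGridView normally renders that as an empty cell). The same idea can be sketched as a single pass with a HashSet instead of GroupBy. The class and method names below (`DuplicateIdBlanker`, `BlankRepeatedValues`) are hypothetical and not part of the original answer; the sketch assumes the column allows nulls, which is the DataColumn default.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DuplicateIdBlanker
{
    // Hypothetical helper: blanks the given column in every row whose value
    // has already appeared in an earlier row of the same table.
    public static void BlankRepeatedValues(DataTable table, string columnName)
    {
        var seen = new HashSet<object>();
        foreach (DataRow row in table.Rows)
        {
            if (row.IsNull(columnName))
            {
                continue; // already blank, nothing to compare
            }

            // HashSet.Add returns false when the value was seen before.
            if (!seen.Add(row[columnName]))
            {
                row[columnName] = DBNull.Value;
            }
        }
    }
}
```

With the table from the question, calling `DuplicateIdBlanker.BlankRepeatedValues(table, "ID")` before assigning the DataSource would leave the ID only on the first row of each run of equal IDs.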

Related

Sort out duplicates in one row but keep a specific one

What I'm trying to do is a distinct (or group by) on one column, while keeping the item in my list that has a value for a second column.
What I have:
Column1 Column 2 Column3 ...
1 | tada | smth
1 | | wefih
2 | tada | uitethgev
3 | | urifnvf
What I want:
Column1 Column 2 Column3 ...
1 | tada | smth
2 | tada | uitethgev
3 | | urifnvf
As I only have one "3", I want to keep it in my list. The same goes for the 2, but for the 1 only the row with a value in Column 2 should stay.
I want to do this in a LINQ query. Each row is an object with properties that represent the columns.
Any clues on this? I know how to do it with multiple lists and a method that checks and copies between them, but I thought there could be a nice LINQ way to do this. Also, please keep in mind that I have more than just 3 columns.

You can do it with LINQ, specifically with GroupBy:
1 - Create a class that models your data:
public class TestClass
{
    public int Column1 { get; set; }
    public string Column2 { get; set; }
    public string Column3 { get; set; }
}
2 - Initialize a list of TestClass like below:
List<TestClass> testClasses = new List<TestClass>
{
    new TestClass{Column1 = 1, Column2 = "tada", Column3 = "smth"},
    new TestClass{Column1 = 1, Column2 = "msa", Column3 = "msa1"},
    new TestClass{Column1 = 1, Column3 = "wefih"},
    new TestClass{Column1 = 2, Column2 = "tada", Column3 = "uitethgev"},
    new TestClass{Column1 = 3, Column3 = "urifnvf"},
};
3 - Use GroupBy to filter the list, checking whether each group contains two or more elements:
if count >= 2: take the first element whose Column2 is not empty (falling back to the first element if every Column2 in the group is empty)
else: take the single element without filtering
List<TestClass> groupedList = testClasses
    .GroupBy(x => x.Column1)
    .Select(y => y.Count() >= 2
        ? (y.FirstOrDefault(z => !string.IsNullOrEmpty(z.Column2)) ?? y.First())
        : y.First())
    .ToList();
Result of 3 :
Column1|Column 2| Column3 ...
1 | tada | smth
2 | tada | uitethgev
3 | | urifnvf
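4 - If you need every element with a non-empty Column2 when the count is greater than or equal to 2, try this code (Where is used rather than TakeWhile so that an empty Column2 early in a group does not cut off the later elements):
List<TestClass> groupedList = testClasses
    .GroupBy(x => x.Column1)
    .SelectMany(y => y.Count() >= 2 ? y.Where(z => !string.IsNullOrEmpty(z.Column2)) : y)
    .ToList();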
Result of 4 :
Column1|Column 2| Column3 ...
1 | tada | smth
1 | msa | msa1
2 | tada | uitethgev
3 | | urifnvf
I hope that answers your question.
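Step 3 above can also be written without the count check, by sorting each group so that items with a non-empty Column2 come first and then taking the first item. This is only a sketch: `Item`, `DuplicateFilter`, and `KeepPreferred` are hypothetical names standing in for the asker's real type, and it relies on `OrderBy` being a stable sort.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's row type.
public class Item
{
    public int Column1 { get; set; }
    public string Column2 { get; set; }
    public string Column3 { get; set; }
}

public static class DuplicateFilter
{
    // Keeps one item per Column1 value, preferring an item whose Column2 is set.
    public static List<Item> KeepPreferred(IEnumerable<Item> items)
    {
        return items
            .GroupBy(x => x.Column1)
            // false sorts before true, so items with a non-empty Column2 come first;
            // OrderBy is stable, so ties keep their original relative order.
            .Select(g => g.OrderBy(x => string.IsNullOrEmpty(x.Column2)).First())
            .ToList();
    }
}
```

A group in which every Column2 is empty simply yields its first item, so no special case is needed for groups of one or for groups without any value.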

The following LINQ query works:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            DataTable dt = new DataTable();
            dt.Columns.Add("Column1", typeof(int));
            dt.Columns.Add("Column2", typeof(string));
            dt.Columns.Add("Column3", typeof(string));
            dt.Rows.Add(new object[] { 1, "tada", "smth"});
            dt.Rows.Add(new object[] { 1, null, "wefih"});
            dt.Rows.Add(new object[] { 2, "tada", "uitethgev"});
            dt.Rows.Add(new object[] { 3, null, "erifnvf"});
            // Per Column1 group: keep the first row that has a Column2 value,
            // or the first row of the group if none has one.
            DataTable dt2 = dt.AsEnumerable()
                .GroupBy(x => x.Field<int>("Column1"))
                .Select(x => x.All(y => (y.Field<object>("Column2") == null)) ? x.First() : x.Where(y => y.Field<object>("Column2") != null).First())
                .CopyToDataTable();
        }
    }
}

How to load data from SQL into a DataTable to show in a TreeView

I have a chartTable with 2 columns:
ChildPersonID | ParentPersonID
--------------+-----------------
1 | 2
1 | 3
2 | 4
That is joined to personTable with 2 columns:
ID | PersonName
---+-----------------
1 | a
2 | b
3 | c
4 | d
I want a SELECT query that fills a DataTable with the PersonName values, to be shown in a TreeView.
Result:
parentname | parentid | childname | childid
-----------+----------+-----------+---------
a | 1 | b | 2
a | 1 | c | 3
b | 2 | d | 4
My code:
DECLARE @Table1 TABLE (ChildPersonID INT,ParentPersonID INT)
DECLARE @Table2 TABLE (ID INT, PersonName VARCHAR(10))
INSERT INTO @Table1 VALUES (1,2),(1,3),(2,4)
INSERT INTO @Table2 VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d')
SELECT T3.PersonName AS parentName, T1.ChildPersonID AS ParentId,
T2.PersonName AS childname, T1.ParentPersonID AS childid
FROM @Table1 T1
INNER JOIN @Table2 T2 ON T1.ParentPersonID = T2.Id
INNER JOIN @Table2 T3 ON T2.ChildPersonID = T3.id

This is the query you are looking for (though the column names you mentioned are confusing; I think they should be reversed). The second join must use T1.ChildPersonID, because the person table has no ChildPersonID column:
DECLARE @Table1 TABLE (ChildPersonID INT,ParentPersonID INT)
DECLARE @Table2 TABLE (ID INT, PersonName VARCHAR(10))
INSERT INTO @Table1 VALUES (1,2),(1,3),(2,4)
INSERT INTO @Table2 VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d')
SELECT T3.PersonName AS parentName, T1.ChildPersonID AS ParentId,
T2.PersonName AS childname, T1.ParentPersonID AS childid
FROM @Table1 T1
INNER JOIN @Table2 T2 ON T1.ParentPersonID = T2.Id
INNER JOIN @Table2 T3 ON T1.ChildPersonID = T3.id

See the code below:
using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication94
{
class Program
{
static void Main(string[] args)
{
DataTable dt = new DataTable();
dt.Columns.Add("parentname", typeof(string));
dt.Columns.Add("parentid", typeof(int));
dt.Columns.Add("childname", typeof(string));
dt.Columns.Add("childid", typeof(int));
DataTable dtChildPerson = new DataTable();
dtChildPerson.Columns.Add("ChildPersonID", typeof(int));
dtChildPerson.Columns.Add("ParentPersonID", typeof(int));
dtChildPerson.Rows.Add(new object[] { 1, 2 });
dtChildPerson.Rows.Add(new object[] { 1, 3 });
dtChildPerson.Rows.Add(new object[] { 2, 4 });
DataTable personName = new DataTable();
personName.Columns.Add("ID", typeof(int));
personName.Columns.Add("PersonName", typeof(string));
personName.Rows.Add(new object[] { 1, "a" });
personName.Rows.Add(new object[] { 2, "b" });
personName.Rows.Add(new object[] { 3, "c" });
personName.Rows.Add(new object[] { 4, "d" });
foreach (DataRow row in dtChildPerson.AsEnumerable())
{
int parentID = row.Field<int>("ParentPersonID");
string parentName = personName.AsEnumerable().Where(x => x.Field<int>("ID") == parentID).Select(x => x.Field<string>("PersonName")).FirstOrDefault();
int childID = row.Field<int>("ChildPersonID");
foreach(DataRow childRow in personName.AsEnumerable().Where(x => x.Field<int>("ID") == childID))
{
string childName = childRow.Field<string>("PersonName");
dt.Rows.Add(new object[] { parentName, parentID, childName, childID });
}
}
}
}
}

DataTable.Select() to display summation of records in datatable

In my C# project in a DataTable, I need to sum a few columns and display the aggregated record and I am unable to create filter query for that.
Records like:
|Col1|Col2|Col3|Col4|
| A | X | 10 | 10 |
| A | X | 10 | 20 |
| A | Y | 12 | 12 |
| A | Y | 10 | 10 |
Result will be:
|Col1|Col2|Col3|Col4|
| A | X | 20 | 30 |
| A | Y | 22 | 22 |
I have to use DataTable.Select("filter condition").

var result = (from DataRow s in yourDataTable.Select("filter conditions").AsEnumerable()
              group s by new { g1 = s.Field<string>("Col1"), g2 = s.Field<string>("Col2") } into g
              select new
              {
                  Col1 = g.Key.g1,
                  Col2 = g.Key.g2,
                  // Field<decimal> assumes Col3 and Col4 are decimal columns
                  Col3 = g.Sum(r => r.Field<decimal>("Col3")),
                  Col4 = g.Sum(r => r.Field<decimal>("Col4")),
              }).ToList();
If you want the result as a DataTable, you can convert the list like this:
var resultAsDataTable = ConvertListToDataTable(result);
// Requires: using System.ComponentModel; (TypeDescriptor, PropertyDescriptor)
public static DataTable ConvertListToDataTable<T>(IList<T> data)
{
    PropertyDescriptorCollection props =
        TypeDescriptor.GetProperties(typeof(T));
    DataTable table = new DataTable();
    // One column per public property of T
    for (int i = 0; i < props.Count; i++)
    {
        PropertyDescriptor prop = props[i];
        table.Columns.Add(prop.Name, prop.PropertyType);
    }
    object[] values = new object[props.Count];
    // One row per item, with the property values in column order
    foreach (T item in data)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = props[i].GetValue(item);
        }
        table.Rows.Add(values);
    }
    return table;
}
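If the reflection-based conversion feels heavy, the grouped sums can also be written straight into a clone of the source table. The sketch below is one possible way, not the answer's method: `GroupSum` and `SumByKeys` are hypothetical names, and it assumes the columns are called Col1 to Col4 as in the question and that Col3 and Col4 hold numeric values.

```csharp
using System;
using System.Data;
using System.Linq;

public static class GroupSum
{
    // Hypothetical helper: groups the rows returned by DataTable.Select(filter)
    // by Col1 and Col2, and writes the sums of Col3 and Col4 into a new table.
    public static DataTable SumByKeys(DataTable source, string filter)
    {
        DataTable result = source.Clone(); // same columns, no rows

        var groups = source.Select(filter)
            .GroupBy(r => new
            {
                Col1 = r.Field<string>("Col1"),
                Col2 = r.Field<string>("Col2")
            });

        foreach (var g in groups)
        {
            DataRow row = result.NewRow();
            row["Col1"] = g.Key.Col1;
            row["Col2"] = g.Key.Col2;
            // Convert.ToDecimal accepts int, double, or decimal cell values.
            row["Col3"] = g.Sum(r => Convert.ToDecimal(r["Col3"]));
            row["Col4"] = g.Sum(r => Convert.ToDecimal(r["Col4"]));
            result.Rows.Add(row);
        }

        return result;
    }
}
```

Cloning the source keeps the original column types, so the result can be bound or merged like any other DataTable with the same schema.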

Merge columns of two DataTables using linq

I have two DataTables: dt1 and dt2.
dt1:
ID | Name | Address | QTY
-------+----------+---------+-----
A1 | Dog | C1 | 272
A2 | Cat | C3 | 235
A3 | Chicken | C2 | 254
A4 | Mouse | C4 | 259
A5 | Pig | C5 | 233
dt2:
ID | Name | Address | QTY MAX
-------+----------+---------+--------
A1 | Dog | C1 | 250
A2 | Cat | C3 | 200
A3 | Chicken | C2 | 300
A6 | Rabbit | C6 | 350
But, I want to merge dt1 and dt2 to dt3 like below:
ID | Name | Address | QTY | QTY MAX
-------+----------+---------+-------+--------
A1 | Dog | C1 | 272 | 250
A2 | Cat | C3 | 235 | 200
A3 | Chicken | C2 | 254 | 300
A4 | Mouse | C4 | 259 | 0
A5 | Pig | C5 | 233 | 0
A6 | Rabbit | C6 | 0 | 350
Can any one help me?

If your DataTables don't have a primary key and you can't or don't want to change them, you can use code like this:
// First, define the result `DataTable`
// by cloning the structure of the first `DataTable`
var dt3 = dt1.Clone();
// Then add the extra column to it
dt3.Columns.Add("Qty Max", typeof(int));
// Second, add the rows of the first `DataTable`
foreach (DataRow row in dt1.Rows)
{
    // Without a primary key, you need code like this to find the matching row:
    var dt2Row = dt2.Rows.OfType<DataRow>().SingleOrDefault(w => w["ID"].Equals(row["ID"]));
    var qtyMax = dt2Row?["Qty Max"] ?? 0; // Here I set the default value to `0`
    dt3.Rows.Add(row["ID"], row["Name"], row["Address"], row["Qty"], qtyMax);
}
// Third, add the rows of the second `DataTable` that are not in the first.
// Use Equals here: `!=` on two `object` values compares references, not the IDs.
var dt2OnlyRows =
    dt2.Rows.OfType<DataRow>().Where(w => dt1.Rows.OfType<DataRow>().All(x => !x["ID"].Equals(w["ID"])));
foreach (var row in dt2OnlyRows)
{
    dt3.Rows.Add(row["ID"], row["Name"], row["Address"], 0, row["Qty Max"]);
}
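Since the question asks for LINQ, the same full outer join can also be sketched with dictionary lookups and `Union`. Treat this as an illustration under assumptions rather than a drop-in replacement: `TableMerger` and `MergeById` are hypothetical names, the column names follow the question ("QTY", "QTY MAX"), the quantity columns are assumed to be integers, and each ID is assumed to be non-null and unique within its table.

```csharp
using System;
using System.Data;
using System.Linq;

public static class TableMerger
{
    // Hypothetical helper: full outer join of dt1 and dt2 on the ID column.
    public static DataTable MergeById(DataTable dt1, DataTable dt2)
    {
        var result = new DataTable();
        result.Columns.Add("ID", typeof(string));
        result.Columns.Add("Name", typeof(string));
        result.Columns.Add("Address", typeof(string));
        result.Columns.Add("QTY", typeof(int));
        result.Columns.Add("QTY MAX", typeof(int));

        // ToDictionary throws on duplicate IDs, which matches the uniqueness assumption.
        var left = dt1.AsEnumerable().ToDictionary(r => r.Field<string>("ID"));
        var right = dt2.AsEnumerable().ToDictionary(r => r.Field<string>("ID"));

        // Union yields all dt1 IDs in order, then the IDs that appear only in dt2.
        var ids = dt1.AsEnumerable().Select(r => r.Field<string>("ID"))
            .Union(dt2.AsEnumerable().Select(r => r.Field<string>("ID")));

        foreach (string id in ids)
        {
            left.TryGetValue(id, out DataRow l);
            right.TryGetValue(id, out DataRow r);
            DataRow any = l ?? r; // at least one side always exists

            result.Rows.Add(
                id,
                any["Name"],
                any["Address"],
                l == null ? (object)0 : l["QTY"],      // missing on the left: 0
                r == null ? (object)0 : r["QTY MAX"]); // missing on the right: 0
        }

        return result;
    }
}
```

The dictionaries make each lookup constant time, which avoids rescanning the other table for every row as the nested `All`/`SingleOrDefault` calls do.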

This solution is not a linq solution as you could simply use DataTable.Merge & DataTable.PrimaryKey to get the desired output.
Here is a dummy example which you can use:
var dt1 = new DataTable();
var p1 = dt1.Columns.Add("a", typeof(int)); //Use this to add Primary Key constraint
dt1.Columns.Add("b");
dt1.Columns.Add("c");
dt1.Rows.Add("1", "apple", "10");
dt1.Rows.Add("2", "mango", "20");
dt1.Rows.Add("3", "orange", "30");
dt1.Rows.Add("4", "banana", "40");
dt1.PrimaryKey = new DataColumn[] { p1 }; //This removes duplication of rows
var dt2 = new DataTable();
var p2 = dt2.Columns.Add("a", typeof(int)); //Use this to add Primary Key constraint
dt2.Columns.Add("b");
dt2.Columns.Add("d");
dt2.Rows.Add("1", "apple", "50");
dt2.Rows.Add("2", "mango", "60");
dt2.Rows.Add("3", "orange", "70");
dt2.Rows.Add("5", "grapes", "80");
dt2.PrimaryKey = new DataColumn[] { p2 }; //This removes duplication of rows
var dt3 = dt1.Copy();
dt3.Merge(dt2); // Merge here merges the values from both provided DataTables
Taking your question into consideration:
var dt1 = new DataTable();
var p1 = dt1.Columns.Add("ID", typeof(string));
dt1.Columns.Add("Name", typeof(string));
dt1.Columns.Add("Address", typeof(string));
dt1.Columns.Add("Qty", typeof(int));
dt1.Columns["Qty"].DefaultValue = 0; //Setting default value
dt1.Rows.Add("A1", "Dog", "C1", 100);
dt1.Rows.Add("A2", "Cat", "C3", 200);
dt1.Rows.Add("A3", "Chicken", "C2", 300);
dt1.Rows.Add("A4", "Mouse", "C4", 400);
dt1.Rows.Add("A5", "Pig", "C5", 500);
dt1.PrimaryKey = new DataColumn[] { p1 };
var dt2 = new DataTable();
var p2 = dt2.Columns.Add("ID", typeof(string));
dt2.Columns.Add("Name", typeof(string));
dt2.Columns.Add("Address", typeof(string));
dt2.Columns.Add("Qty Max", typeof(int));
dt2.Columns["Qty Max"].DefaultValue = 0; //Setting default value
dt2.Rows.Add("A1", "Dog", "C1", 600);
dt2.Rows.Add("A2", "Cat", "C3", 700);
dt2.Rows.Add("A3", "Chicken", "C2", 800);
dt2.Rows.Add("A6", "Rabbit", "C6", 900);
dt2.PrimaryKey = new DataColumn[] { p2 };
var dt3 = dt1.Copy();
dt3.Merge(dt2);
Thanks to @shA.t for suggesting DataColumn.DefaultValue so that blank cells can be replaced with 0. His answer also uses LINQ features, which I guess is what you are looking for!

linq joining, grouping, with parent roll-up

say I've got a DataTable in this format:
id | key1 | key2 | data1 | data2 | parentID
10 | AA | one | 10.3 | 0.3 | -1
10 | AA | two | 20.1 | 16.2 | -1
10 | BB | one | -5.9 | 30.1 | -1
20 | AA | one | 403.1 | -20.4 | 10
30 | AA | one | 121.5 | 210.3 | -1
and a second DataTable like so:
id | data
10 | 5500
20 | -3000
30 | 500
What I want to do is aggregate the data at the "id" level, with the second table's "data" field added to the first table's net "data1", and "data2" just summed up by itself. I figured out how to do this, but here is where I'm stuck: I want the data for anything with "parentID" != -1 to be added to its parent. So the output for the above data should be:
id | data1 | data2
10 | 2927.6 | 26.2
30 | 621.5 | 210.3
Is there an efficient way to do this?
Edit: code sample
DataTable dt1 = new DataTable();
dt1.Columns.Add("id", typeof(int));
dt1.Columns.Add("key1", typeof(string));
dt1.Columns.Add("key2", typeof(string));
dt1.Columns.Add("data1", typeof(double));
dt1.Columns.Add("data2", typeof(double));
dt1.Columns.Add("parentID", typeof(int));
DataTable dt2 = new DataTable();
dt2.Columns.Add("id", typeof(int));
dt2.Columns.Add("data", typeof(double));
dt1.Rows.Add(new object[] { 10, "AA", "one", 10.3, 0.3, -1 });
dt1.Rows.Add(new object[] { 10, "AA", "two", 20.1, 16.2, -1 });
dt1.Rows.Add(new object[] { 10, "BB", "one", -5.9, 30.1, -1 });
dt1.Rows.Add(new object[] { 20, "AA", "one", 403.1, -20.4, 10 });
dt1.Rows.Add(new object[] { 30, "AA", "one", 121.5, 210.3, -1 });
dt2.Rows.Add(new object[] { 10, 5500 });
dt2.Rows.Add(new object[] { 20, -3000 });
dt2.Rows.Add(new object[] { 30, 500 });
var groups = dt1.AsEnumerable()
.GroupBy(e => e["id"])
.Select(e => new
{
id = e.Key,
net_data1 = e.Sum(w => (double)w["data1"]),
net_data2 = e.Sum(w => (double)w["data2"])
})
.GroupJoin(dt2.AsEnumerable(), e1 => e1.id, e2 => e2["id"],
(a1, a2) => new
{
id = a1.id,
net_data1 = a1.net_data1 + a2.Sum(w => (double)w["data"]),
net_data2 = a1.net_data2
});

Unfortunately, SQL (and, by extension, LINQ) is not well-suited to recursion. Can the parentID column go multiple levels deep? Like this:
ID Parent
------------------
10 -1
20 10
30 10
40 20
If you want to retrace the steps up from ID 40 to ID 10, then you should abandon a SQL/LINQ approach and just do it in code.
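Doing it "in code" could look roughly like the sketch below: build an id-to-parent map, walk each id up to its top-level ancestor, and accumulate the totals there. The names `ParentRollUp`, `FindRoot`, and `Aggregate` are hypothetical; the column names come from the question, and the sketch assumes that every row with the same id carries the same parentID and that the parent links contain no cycles.

```csharp
using System.Collections.Generic;
using System.Data;

public static class ParentRollUp
{
    // Follows parentID links upward until reaching an id whose parent is -1
    // (or an id that has no entry in the map at all).
    private static int FindRoot(int id, IDictionary<int, int> parentOf)
    {
        while (parentOf.TryGetValue(id, out int parent) && parent != -1)
        {
            id = parent;
        }
        return id;
    }

    // Returns, per top-level id, the summed data1 (including dt2's "data") and data2.
    public static Dictionary<int, (double Data1, double Data2)> Aggregate(DataTable dt1, DataTable dt2)
    {
        var parentOf = new Dictionary<int, int>();
        foreach (DataRow r in dt1.Rows)
        {
            parentOf[(int)r["id"]] = (int)r["parentID"];
        }

        var totals = new Dictionary<int, (double Data1, double Data2)>();

        // Adds one contribution to the totals of the id's top-level ancestor.
        void Add(int id, double d1, double d2)
        {
            int root = FindRoot(id, parentOf);
            totals.TryGetValue(root, out var t); // t is (0, 0) when root is new
            totals[root] = (t.Data1 + d1, t.Data2 + d2);
        }

        foreach (DataRow r in dt1.Rows)
        {
            Add((int)r["id"], (double)r["data1"], (double)r["data2"]);
        }
        foreach (DataRow r in dt2.Rows)
        {
            Add((int)r["id"], (double)r["data"], 0.0);
        }

        return totals;
    }
}
```

Because `FindRoot` loops until it reaches a top-level id, the roll-up works for any nesting depth, such as the 40 -> 20 -> 10 chain above, at the cost of one extra pass to build the map.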

It sounds like a good use of a group join. Something like this might work (though it's completely untested):
var items = from parent in context.dataTable
join child in context.dataTable on parent.id equals child.parentID into children
where parent.parentID == -1
select new { id = parent.id,
data1 = (parent.data1 + children.Sum(c => c.data1)),
data2 = (parent.data2 + children.Sum(c => c.data2)) };
