Taking distinct rows from datatable fails


I am trying to take distinct rows from my table based on a column value (the column is named Id).
My master DataTable is like this:
I want to take all rows with a distinct Id. My code is like this:
DataTable distinctDt = dt.DefaultView.ToTable(true, "Id", "ProductName",
"ProductDescription","ProductCategory_Id", "Fair_Id", "Price","ImagePath",
"FairName", "FairLogo", "StartDate", "EndDate", "picId");
But this still returns duplicate rows. What am I doing wrong here?

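Your approach does not work in this scenario because ToTable(true, ...) treats two rows as duplicates only when every listed column matches, so rows that share an Id but differ in any other column are all kept. You need all the columns for each unique Id; here is a method that does this: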
public static DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
    Hashtable hTable = new Hashtable();
    ArrayList duplicateList = new ArrayList();
    // Record each value of colName the first time it is seen (the Hashtable is used as a set of keys),
    // and collect every later row that repeats an already-seen value.
    foreach (DataRow drow in dTable.Rows)
    {
        if (hTable.Contains(drow[colName]))
            duplicateList.Add(drow);
        else
            hTable.Add(drow[colName], string.Empty);
    }
    // Remove the collected duplicate rows from the DataTable.
    foreach (DataRow dRow in duplicateList)
        dTable.Rows.Remove(dRow);
    // The same DataTable, now containing only unique records, is returned.
    return dTable;
}
Simply pass your master DataTable and the column name to this method and it will give you what you want. Note that it removes the duplicate rows from the table you pass in and returns that same table.
It is tested code and works as expected.
Hope it helps!
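If you would rather keep the master table untouched, a variant along the following lines might be useful. This is only a sketch: the class name DataTableDedup and the method name DistinctBy are made up for illustration, and it assumes the first row seen for each value is the one you want to keep.

```csharp
using System.Collections.Generic;
using System.Data;

public static class DataTableDedup
{
    // Hypothetical helper: returns a NEW table holding the first row seen for
    // each distinct value in colName. The source table is not modified.
    public static DataTable DistinctBy(DataTable source, string colName)
    {
        DataTable result = source.Clone();      // same schema, no rows
        var seen = new HashSet<object>();

        foreach (DataRow row in source.Rows)
        {
            // HashSet.Add returns false when the value is already present,
            // so only the first row per value is imported.
            if (seen.Add(row[colName]))
                result.ImportRow(row);
        }
        return result;
    }
}
```

Usage would be something like DataTable distinctDt = DataTableDedup.DistinctBy(dt, "Id"); after which dt still holds every original row.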

Related

Issue Understanding SQL Server Merge for Bulk Insert

I generate a DataTable in C# and first remove all duplicate rows from it.
Then I pass the DataTable to a stored procedure for the bulk insert, but I randomly get an error stating:
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row.
What confuses me is that I remove all duplicate rows from the DataTable before sending it to the stored procedure.
CODE:
public static DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
    Hashtable hTable = new Hashtable();
    ArrayList duplicateList = new ArrayList();
    // Record each value of colName the first time it is seen (the Hashtable is used as a set of keys),
    // and collect every later row that repeats an already-seen value.
    foreach (DataRow drow in dTable.Rows)
    {
        if (hTable.Contains(drow[colName]))
            duplicateList.Add(drow);
        else
            hTable.Add(drow[colName], string.Empty);
    }
    // Remove the collected duplicate rows from the DataTable.
    foreach (DataRow dRow in duplicateList)
        dTable.Rows.Remove(dRow);
    // The same DataTable, now containing only unique records, is returned.
    return dTable;
}
And this is the procedure:
ALTER PROCEDURE [dbo].[Update_DataFeed_Discoverable]
    @tblCustomers DateFeed_Discoverable READONLY
AS
BEGIN
    SET NOCOUNT ON;
    MERGE INTO DataFeed c1
    USING @tblCustomers c2
    ON c1.CheckSumProductName = checksum(c2.aw_deep_link)
    WHEN MATCHED THEN
        UPDATE SET
            c1.merchant_name = c2.merchant_name
            ,c1.aw_deep_link = c2.aw_deep_link
            ,c1.brand_name = c2.brand_name
            ,c1.product_name = c2.product_name
            ,c1.merchant_image_url = c2.merchant_image_url
            ,c1.Price = c2.Price
    WHEN NOT MATCHED THEN
        INSERT VALUES(c2.merchant_name, c2.aw_deep_link,
            c2.brand_name, c2.product_name, c2.merchant_image_url,
            c2.Price, 1, checksum(c2.aw_deep_link)
        );
END
So how does it end up with duplicate rows when they are removed from the DataTable before it is used?
Any help is appreciated.

GetChanges from merged datatables returns null

I have following code:
DataTable datTable3 = new DataTable();
datTable3 = datTable1.Clone();
datTable2.Merge(datTable1);
datTable3 = datTable2.GetChanges();
What I want to do is compare DataTable1 with DataTable2 and, when there are rows in DataTable1 that aren't in DataTable2, add those rows to a new DataTable (3). The code above gives me an empty DataTable3 every time, even though the rows in the first table are not equal to the rows in the second. What am I doing wrong? Sorry if the question is too easy, but I have only been using C# for a couple of months.
EDIT: I found this solution, which doesn't work for me. Why?
DataTable datTable3 = new DataTable();
datTable3 = datTable1.Clone();
foreach (DataRow row in datTable1.Rows)
{
datTable3.ImportRow(row);
}
foreach (DataRow row in datTable3.Rows)
{
row.SetAdded();
}
datTable2.Merge(datTable3);
DataTable datTableFinal = datTable2.GetChanges(DataRowState.Added);
// shows me a datatable with again the values from datTable1
// even if they are already in datTable2!
datTable2.RejectChanges();
datTable1.RejectChanges();

The DataTable.GetChanges() method gets a copy of the DataTable that contains all changes made to it since it was loaded or since AcceptChanges was last called.
In other words, GetChanges() depends on the DataRow.RowState property. DataTable.Merge() either preserves the RowState of the rows it brings in or resets it to Unchanged.
This means that when you merge two DataTables whose rows all have the Unchanged RowState, the merged table also contains only Unchanged rows, and DataTable.GetChanges() returns null (Nothing in VB).
EDIT: You can always iterate through the DataTable to see which rows were added to the merged table. Something like:
foreach (DataRow row in datTable2.Rows)
{
    Console.WriteLine("--- Row ---"); // Print separator.
    foreach (var item in row.ItemArray) // Loop over the items.
    {
        Console.Write("Item: "); // Print label.
        Console.WriteLine(item); // Calls ToString() on the value.
    }
}

Iterate through the rows and use LoadDataRow(object[] values, bool fAcceptChanges):
foreach (DataRow row in MergeTable.Rows)
{
    TargetTable.LoadDataRow(row.ItemArray, false);
}
var changes = TargetTable.GetChanges();
When I tried this, changes contained the expected rows.
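If the goal is simply "rows that are in the first table but not in the second", comparing the row values directly avoids relying on RowState at all. The sketch below uses LINQ's Except together with DataRowComparer.Default (from System.Data.DataSetExtensions on .NET Framework). The class and method names are invented for illustration, and it assumes both tables have the same columns in the same order.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTableDiff
{
    // Hypothetical helper: returns a new table with the rows of 'first'
    // whose values do not appear as a row in 'second'.
    public static DataTable RowsOnlyInFirst(DataTable first, DataTable second)
    {
        // DataRowComparer.Default compares rows value by value, not by reference.
        List<DataRow> onlyInFirst = first.AsEnumerable()
            .Except(second.AsEnumerable(), DataRowComparer.Default)
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an
        // empty table with the same schema.
        return onlyInFirst.Count > 0 ? onlyInFirst.CopyToDataTable() : first.Clone();
    }
}
```

Keep in mind that Except is a set operation, so identical rows that occur several times in the first table come back only once.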

Enumerate over DataTable, filter items, then revert to DataTable

I'd like to filter the rows in my DataTable by whether a column value is contained in a string array. To do that I convert the table to an IEnumerable<DataRow>; afterwards I'd like to convert the result back to a DataTable, since that's what my method has to return.
Here's my code so far:
string[] ids = /*Gets string array of IDs here*/
DataTable dt = /*Databasecall returning a DataTable here*/
IEnumerable<DataRow> ie = dt.AsEnumerable();
ie = ie.Where<DataRow>(row => ids.Contains(row["id"].ToString()));
/*At this point I've filtered out the entries I don't want, now how do I convert this back to a DataTable? The following does NOT work.*/
ie.CopyToDataTable(dt, System.Data.LoadOption.PreserveChanges);
return dt;

I would create an empty clone of the data table:
DataTable newTable = dt.Clone();
Then import the rows from the old table that match the filter:
foreach(DataRow row in ie)
{
newTable.ImportRow(row);
}

Assuming that you want to filter the rows in place, that is, the filtered rows should be returned in the same DataTable that was created by the original database query, you should first copy the values of the rows you want to keep to an array, then clear the DataTable.Rows collection and add the saved values back one row at a time. The values have to be copied before the call to Clear(), because a row's data is no longer accessible once it has been cleared from its table:
object[][] kept = ie.Where(row => ids.Contains(row["id"].ToString()))
    .Select(row => row.ItemArray)
    .ToArray();
dt.Rows.Clear();
foreach (object[] values in kept)
{
    dt.Rows.Add(values);
}
An alternative is to iterate through the rows once and delete the ones that should be filtered out. Looping over the array returned by dt.Select() rather than over dt.Rows avoids modifying the collection while it is being enumerated:
foreach (DataRow row in dt.Select())
{
    if (!ids.Contains(row["id"].ToString()))
    {
        row.Delete();
    }
}
dt.AcceptChanges();
Note that if the DataTable is part of a DataSet that is being used to update the database, all modifications made to the DataTable.Rows collection will be reflected in the corresponding database table during an update.

Selected columns from DataTable

How do I get selected columns from a DataTable? For example, my BaseTable has three columns: ColumnA, ColumnB and ColumnC. As part of some intermediate operations, I need to retrieve all the rows, but only ColumnA. Is there a predefined method for this, like DataTable.Select?

DataView.ToTable Method.
DataView view = new DataView(MyDataTable);
DataTable distinctValues = view.ToTable(true, "ColumnA");
Now you can select.
DataRow[] myRows = distinctValues.Select();

From this question: How to select distinct rows in a datatable and store into an array you can get the distinct values:
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "ColumnA");
If you're dealing with a large DataTable and care about the performance, I would suggest something like the following in .NET 2.0. I'm assuming the type of the data you're displaying is a string so please change as necessary.
Dictionary<string,string> colA = new Dictionary<string,string>();
foreach (DataRow row in table.Rows) {
colA[(string)row["ColumnA"]] = "";
}
return colA.Keys;

Remove duplicate column values from a datatable without using LINQ

Consider my datatable,
Id  Name  MobNo
1   ac    9566643707
2   bc    9944556612
3   cc    9566643707
How can I remove row 3, which contains a duplicate MobNo value, in C# without using LINQ? I have seen similar questions on SO, but all the answers use LINQ.

The following method did what I wanted:
public DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
    Hashtable hTable = new Hashtable();
    ArrayList duplicateList = new ArrayList();
    // Record each value of colName the first time it is seen (the Hashtable is used as a set of keys),
    // and collect every later row that repeats an already-seen value.
    foreach (DataRow drow in dTable.Rows)
    {
        if (hTable.Contains(drow[colName]))
            duplicateList.Add(drow);
        else
            hTable.Add(drow[colName], string.Empty);
    }
    // Remove the collected duplicate rows from the DataTable.
    foreach (DataRow dRow in duplicateList)
        dTable.Rows.Remove(dRow);
    // The same DataTable, now containing only unique records, is returned.
    return dTable;
}

Do it as you are reading your CSV file (readYourFile() and parseLine() are placeholders for your own code, but you get the picture):
List<string> uniqueMobiles = new List<string>();
string[] fileLines = readYourFile();
foreach (string line in fileLines)
{
    // parseLine() is assumed to build the row with yourDataTable.NewRow().
    DataRow row = parseLine(line);
    string mobile = row["MobNo"].ToString();
    if (uniqueMobiles.Contains(mobile))
    {
        continue; // This mobile number was already loaded; skip the row.
    }
    uniqueMobiles.Add(mobile);
    yourDataTable.Rows.Add(row);
}
This loads only the first record for each mobile number into your DataTable.

This is the simplest way:
var uniqueContacts = dt.AsEnumerable()
    .GroupBy(x => x.Field<string>("Email"))
    .Select(g => g.First());
I found it in this thread:
LINQ to remove duplicate rows from a datatable based on the value of a specific row
What actually worked for me was returning the result as a DataTable:
DataTable uniqueContacts = dt.AsEnumerable()
    .GroupBy(x => x.Field<string>("Email"))
    .Select(g => g.First()).CopyToDataTable();

Be sure to back up your live DB before running this. DISTINCT will not help here, because every (Id, MobNo) pair is already distinct and a subquery used with NOT IN can return only one column. Grouping by MobNo and keeping the lowest Id for each number should do it, using (something very similar to) the following SQL:
DELETE FROM YourTable WHERE Id NOT IN (SELECT MIN(Id) FROM YourTable GROUP BY MobNo);

You can use "IEqualityComparer" in C#
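As a rough sketch of what that could look like without LINQ: a comparer that treats two rows as equal when one column matches, plugged into a HashSet<DataRow>. The names ColumnComparer, Dedup and KeepFirstPerValue are invented for this example, and it assumes keeping the first row per value is acceptable.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical comparer: two rows are "equal" when their values in one column are equal.
public sealed class ColumnComparer : IEqualityComparer<DataRow>
{
    private readonly string _column;

    public ColumnComparer(string column) { _column = column; }

    public bool Equals(DataRow x, DataRow y)
    {
        return object.Equals(x[_column], y[_column]);
    }

    public int GetHashCode(DataRow row)
    {
        // Cell values are never null (missing values are DBNull.Value).
        return row[_column].GetHashCode();
    }
}

public static class Dedup
{
    // Returns a new table containing the first row for each value of 'column'.
    public static DataTable KeepFirstPerValue(DataTable source, string column)
    {
        var seen = new HashSet<DataRow>(new ColumnComparer(column));
        DataTable result = source.Clone();   // same schema, no rows

        foreach (DataRow row in source.Rows)
        {
            // Add returns false if a row with the same column value was already seen.
            if (seen.Add(row))
                result.ImportRow(row);
        }
        return result;
    }
}
```

For the table in the question, Dedup.KeepFirstPerValue(table, "MobNo") would keep rows 1 and 2 and drop row 3.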
