c# - Copy only selected data to new datatable with linq

I've searched the web for quite some time now and can't seem to find an elegant way to
1. read data from one datatable,
2. group it by two variables with linq,
3. select only those two variables (forget about the others in the source datatable), and
4. copy these items to a new datatable.
I got it working without selecting specific variables, but given the amount of data the program is going to process later, I'd rather only copy what's really needed.
var temp123 = from row in oldDataTable.AsEnumerable()
orderby row["Column1"] ascending
group row by new { Column1 = row["Column1"], Column2 = row["Column2"] } into grp
select grp.First();
newDataTable = temp123.CopyToDataTable();
Can anyone please be so kind to help me out here? Thanks!

You can use a custom implementation of the CopyToDataTable method from the article "How to: Implement CopyToDataTable Where the Generic Type T Is Not a DataRow":
newDataTable =
oldDataTable
.AsEnumerable()
// the lambda parameter is r, so index r (not row)
.GroupBy(r => new { Column1 = r["Column1"], Column2 = r["Column2"] })
// keep only the two grouping columns instead of the whole first row
.Select(g => g.Key)
.OrderBy(x => x.Column1)
.CopyToDataTable(); // your custom extension
Another option, as Tim suggested - manual creation of DataTable.
var newDataTable = new DataTable();
newDataTable.Columns.Add("Column1");
newDataTable.Columns.Add("Column2");
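// temp123 from the question yields DataRow objects, so read the two columns by name
foreach (var item in temp123)
    newDataTable.Rows.Add(item["Column1"], item["Column2"]);
And the last option (if possible): don't use a DataTable at all; simply use a collection of strongly typed objects.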
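The manual approach can be sketched end to end as follows. This is only an illustration under assumptions: the sample rows, the column types (string and int) and the extra "Other" column are made up, and the code uses top-level statements (C# 9 or later) on a runtime where the System.Data LINQ extensions are available. It projects each row to the two columns first, so the rest of the source row is never copied.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical source table with one extra column that should not be copied.
var oldDataTable = new DataTable();
oldDataTable.Columns.Add("Column1", typeof(string));
oldDataTable.Columns.Add("Column2", typeof(int));
oldDataTable.Columns.Add("Other", typeof(string));
oldDataTable.Rows.Add("b", 2, "x");
oldDataTable.Rows.Add("a", 1, "y");
oldDataTable.Rows.Add("a", 1, "z"); // same Column1/Column2 pair as the previous row

// Project each row to just the two columns, then de-duplicate and sort.
// Value tuples compare by value, so Distinct() removes repeated pairs.
var pairs = oldDataTable.AsEnumerable()
    .Select(r => (Column1: r.Field<string>("Column1"), Column2: r.Field<int>("Column2")))
    .Distinct()
    .OrderBy(p => p.Column1)
    .ToList();

// Build the target table with only the two columns that are needed.
var newDataTable = new DataTable();
newDataTable.Columns.Add("Column1", typeof(string));
newDataTable.Columns.Add("Column2", typeof(int));
foreach (var p in pairs)
    newDataTable.Rows.Add(p.Column1, p.Column2);

Console.WriteLine(newDataTable.Rows.Count); // 2
```

If no ordering or further LINQ processing is needed, oldDataTable.DefaultView.ToTable(true, "Column1", "Column2") should give a similar two-column distinct table without LINQ.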

Related

How do I select a single grouped column from a dataview?

I have a data table tblWorkList with multiple columns: RecordNr, GroupNum, Section, SubscriberID, and quite a few others.
What I need to do is create a dataview or second datatable that is the equivalent of:
SELECT SubscriberID FROM tblWorkList GROUP BY SubscriberID;
I'm doing it in the application because I need this to end up in a dataview that will then be filtered based on multiple user inputs. I have that part working. I've spent several hours now beating my head against the internet trying to figure out how to do this, but I keep running up against errors in solutions that LOOK like they should work but end up failing spectacularly. Although, that said, I'm VERY inexperienced with LINQ right now, so I'm sure I'm missing something pretty straightforward.
(The basic functionality is this: The table contains a list of records to be processed. Basically, I need to take the table full of records, pull the subscriber IDs into a dataview, allow the user to filter that dataview down by a variety of methods (and providing the user a running count of the number of SubscriberID's matching the selected criteria), and when they're done, assign all of the records associated with the resulting SubscriberID collection to a specific analyst to be processed.)
All of the methods I've attempted to use to create the list or dataview of SubscriberID values are enclosed in this:
using (DataTable dt = dsWorkData.Tables["tblWorkData"])
The table tblWorkData contains approximately 23,000 records.
Here are several of my attempts.
Attempt 1 - Error is
'Parameter may not be null. Parameter: source'
var result1 = from row in dt.AsEnumerable()
group row by row.Field<string>("SubscriberID") into grp
select new { SubscriberID = grp.Key };
ShowMessage(result1.Count().ToString());
Attempt 2 - Error is
'Cannot implicitly convert anonymous type: string SubscriberID to DataRow'
EnumerableRowCollection<DataRow> query =
from row in dt.AsEnumerable()
group row by row.Field<string>("SubscriberID") into grp
select new { SubscriberID = grp.Key };
Attempt 3 - Error is
'The [third] name 'row' does not exist in the current context.'
EnumerableRowCollection<DataRow> query2 =
from row in dt.AsEnumerable()
group row by row.Field<string>("SubscriberID") into grp
select row;
Attempt 4 - same error as Attempt 1:
DataTable newDt = dt.AsEnumerable()
.GroupBy(r => new { SubscriberID = r["SubscriberID"] })
.Select(g => g.OrderBy(r => r["SubscriberID"]).First())
.CopyToDataTable();
MessageBox.Show(newDt.Rows.Count.ToString());
Attempt 5 - same error as Attempt 1:
var result = dt.AsEnumerable().GroupBy(row => row.Field<string>("SubscriberID"));
MessageBox.Show(result.Count().ToString());
Attempt 6 - same error as Attempt 1:
var results = dt.AsEnumerable().GroupBy(g => g["SubscriberID"])
.Select(x => x.First());
MessageBox.Show(results.Count().ToString());
So can someone explain what I'm doing wrong here, or at least point me in the right direction? I don't really care WHICH approach gets used, for the record, as long as there's a way to do this.
Answer was a pair of comments from NetMage:
Your SQL query is really using GROUP BY to do DISTINCT, so just use the LINQ Distinct: dt.AsEnumerable().Select(r => r.Field<string>("SubscriberID") ).Distinct().
PS Your first error implies that dt is null - source is the parameter name to AsEnumerable.
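Both comments can be illustrated with a small sketch. The data set contents and the name "NoSuchTable" are made up for the example; the point is that looking up a table by a name the DataSet does not contain yields null rather than throwing, and that Distinct on the projected column gives the same values as the SQL GROUP BY. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical data set standing in for dsWorkData.
var dsWorkData = new DataSet();
var work = dsWorkData.Tables.Add("tblWorkList");
work.Columns.Add("RecordNr", typeof(int));
work.Columns.Add("SubscriberID", typeof(string));
work.Rows.Add(1, "A100");
work.Rows.Add(2, "A100");
work.Rows.Add(3, "B200");

// A lookup by a name that is not in the collection returns null;
// calling AsEnumerable() on that null is what reports "Parameter: source".
var missing = dsWorkData.Tables["NoSuchTable"];
Console.WriteLine(missing is null); // True

// Fail early with a clear message instead of a null argument error later.
DataTable dt = dsWorkData.Tables["tblWorkList"]
    ?? throw new InvalidOperationException("Table 'tblWorkList' not found.");

// Equivalent of SELECT SubscriberID FROM ... GROUP BY SubscriberID.
var subscriberIds = dt.AsEnumerable()
    .Select(r => r.Field<string>("SubscriberID"))
    .Distinct()
    .ToList();

Console.WriteLine(subscriberIds.Count); // 2
```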

Linq Distinct not bringing back the correct results

I'm trying to select distinct values from a DataTable using LINQ. The DataTable is populated from an Excel sheet whose columns are dynamic, except that every sheet has a mandatory column named SERIAL NUMBER.
I have a DataTable for demo purposes which consists of 4 serial numbers:
12345
12345
98765
98765
When I do
var distinctList = dt.AsEnumerable().Select(a => a).Distinct().ToList();
I get all four records instead of 2. If I do
var distinctList = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();
then I get the correct results, but the list only contains that one column from dt and not all the other columns.
Can someone tell me where I'm going wrong, please?
The problem is that the Distinct method uses the default equality comparer, which for DataRow compares by reference. To get the desired result, use the Distinct overload that accepts an IEqualityComparer<T> and pass DataRowComparer.Default:
The DataRowComparer<TRow> class is used to compare the values of the DataRow objects and does not compare the object references.
var distinctList = dt.AsEnumerable().Distinct(DataRowComparer.Default).ToList();
For more info, see Comparing DataRows (LINQ to DataSet).
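A minimal sketch of the difference, using the four serial numbers from the question plus a made-up "Model" column (an assumption, added only so the rows have more than one column). Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Data;
using System.Linq;

// Demo table: two pairs of rows whose column values are identical.
var dt = new DataTable();
dt.Columns.Add("SERIAL NUMBER", typeof(string));
dt.Columns.Add("Model", typeof(string)); // hypothetical extra column
dt.Rows.Add("12345", "A");
dt.Rows.Add("12345", "A");
dt.Rows.Add("98765", "B");
dt.Rows.Add("98765", "B");

// Default comparer: every DataRow is a distinct object, so nothing is removed.
var byReference = dt.AsEnumerable().Distinct().ToList();

// DataRowComparer.Default compares the column values of the rows instead.
var byValue = dt.AsEnumerable().Distinct(DataRowComparer.Default).ToList();

Console.WriteLine(byReference.Count); // 4
Console.WriteLine(byValue.Count);     // 2
```

Note that DataRowComparer treats two rows as equal only when all of their column values match, so rows that share a serial number but differ in another column would both be kept.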
So, you want to group them by Serial Number and retrieve the full DataRow? Assuming that after grouping them we want to retrieve the first item:
var distinctList = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER"))
.Select(a => a.FirstOrDefault()).Distinct().ToList();
EDIT: As requested
var distinctValues = dt.AsEnumerable().Select(a => a.Field<string>("SERIAL NUMBER")).Distinct().ToList();
var duplicateValues = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).SelectMany(a => a.Skip(1)).Distinct().ToList();
var duplicatesRemoved = dt.AsEnumerable().Except(duplicateValues);
In the ToTable method, the first parameter specifies whether you want distinct records, and the remaining parameters name the columns to include in the result (the rows are made distinct on those columns).
DataTable returnVals = dt.DefaultView.ToTable(true, "ColumnNameOnWhichYouWantDistinctRecords");
There is no need to use LINQ for this task!
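As a rough illustration of what ToTable returns, here is a sketch with the demo serial numbers and a made-up "Model" column (an assumption). Note that the resulting table contains only the named column, which may or may not be what the question needs. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Data;

// Demo table with duplicated serial numbers and a hypothetical second column.
var dt = new DataTable();
dt.Columns.Add("SERIAL NUMBER", typeof(string));
dt.Columns.Add("Model", typeof(string));
dt.Rows.Add("12345", "A");
dt.Rows.Add("12345", "B");
dt.Rows.Add("98765", "C");
dt.Rows.Add("98765", "D");

// distinct: true, and only the listed column is copied into the new table.
DataTable serials = dt.DefaultView.ToTable(true, "SERIAL NUMBER");

Console.WriteLine(serials.Rows.Count);    // 2
Console.WriteLine(serials.Columns.Count); // 1
```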
Using LINQ, a GroupBy would be better suited, by the sounds of it.
var groups = dt.AsEnumerable().GroupBy(a => a.Field<string>("SERIAL NUMBER")).Select(g => new { Key = g.Key, Items = g });
This will then contain groupings based on the serial number, each group holding the rows that share a serial number but may differ in their other column values.
Try this:
List<string> distinctValues = (from row in dt.AsEnumerable() select row.Field<string>("SERIAL NUMBER")).Distinct().ToList();
However, for me this also works:
List<string> distinctValues = dt.AsEnumerable().Select(row => row.Field<string>("SERIAL NUMBER")).Distinct().ToList();

How to convert to int and then compare in linq query c#

IEnumerable<classB> list = getItems();
//dt is datatable
list = list.Where(x => Convert.ToInt32( !dt.Columns["Id"]) == (x.Id));
I want to keep only the items in the list whose Id matches a value in the datatable's Id column. The rest should be removed. I'm not doing it right.
The datatable can have: ID - 1,3,4,5,7
The list can have: ID - 1,2,3,4,5,6,7,8,9,10
I want the output list to have: ID - 1,3,4,5,7
Your code won't work because you're comparing a definition of a column to an integer value. That's not a sensible comparison to make.
What you can do is put all of the values from the data table into a collection that can be effectively searched and then get all of the items in the list that are also in that collection:
var ids = new HashSet<int>(dt.AsEnumerable()
.Select(row => row.Field<int>("Id")));
list = list.Where(x => ids.Contains(x.Id));
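Here is a small sketch of that approach using the IDs from the question. The list items are modelled as named tuples with a made-up Name field, standing in for classB (an assumption, to keep the example self-contained), and the Id column is assumed to be of type int. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Data table holding the IDs to keep: 1, 3, 4, 5, 7.
var dt = new DataTable();
dt.Columns.Add("Id", typeof(int));
foreach (var id in new[] { 1, 3, 4, 5, 7 })
    dt.Rows.Add(id);

// Stand-in for the classB items: IDs 1 through 10.
var list = Enumerable.Range(1, 10).Select(i => (Id: i, Name: $"item{i}"));

// Collect the column values once; HashSet lookups are O(1) on average.
var ids = new HashSet<int>(dt.AsEnumerable().Select(row => row.Field<int>("Id")));

// Keep only the items whose Id appears in the table.
var kept = list.Where(x => ids.Contains(x.Id)).ToList();

Console.WriteLine(string.Join(",", kept.Select(x => x.Id))); // 1,3,4,5,7
```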
Try this one
var idList = dt.AsEnumerable().Select(d => (int) d["Id"]).ToList();
list = list.Where(x => idList.Contains(x.Id));
You can't do it like that. Your dt.Columns["Id"] returns the DataColumn and not the value inside that column for a specific DataRow. You need a join between two LINQ queries: the first you already have, the other you need to build from the DataTable.
// a DataTable is not directly queryable, so go through AsEnumerable()
var queryDt = (from dtRow in dt.AsEnumerable()
where !dtRow.IsNull("Id")
select Convert.ToInt32(dtRow["Id"])).ToList();
Now the join
var qry = from nonNull in queryDt
join existing in list on nonNull equals existing.Id
select existing;

Make a selection of items in a list

I have a DataTable which I have converted into a list. I would like to know how to query the list and create a new list where the ParentID is null.
DataTable myTable = new DataTable();
myTable.Columns.Add("ParentID", typeof(string));
myTable.Columns.Add("ID", typeof(string));
myTable.Rows.Add(null, "CEO");
myTable.Rows.Add("CEO", "FD");
myTable.Rows.Add("CEO", "CIO");
List<DataRow> lst = myTable.AsEnumerable().ToList();
I am trying something like:
List<DataRow> topNodes = lst.Select("ID is null")
Thanks.
You could try this - search for all cells in a row, for which the content is null. Note that in this case (for the given input - ParentId column is of type string) the row value has to be converted to a string:
// get only those items where ParentId is null
var topRows = myTable
.AsEnumerable()
.ToList()
.Where (row => string.IsNullOrEmpty(row["ParentID"].ToString()))
.ToList();
Output (as List of DataRows) for the given input is:
ParentID | ID
-------------------
null | CEO
Try this, assuming you can use LINQ:
var topNodes = lst.Where(l => !l.ParentID.HasValue).ToList();
Edit:
I see your column is a string, so it should be:
var topNodes = lst.Where(l => string.IsNullOrEmpty(l.ParentID)).ToList();
Edit:
The list holds DataRow objects, so:
var topNodes = lst.Where(row => row.IsNull("ParentID")).ToList();
Edit:
You added more information regarding what you want to do. If all you need is a list of DataRow, I would filter first and then convert to a list:
var topNodes = myTable.Select("ParentID is null").ToList();
Note: See comments under original post.
List<DataRow> lst = myTable.Select("ParentID is null").ToList();
Edit: I see that my answer is very similar to Kris Ivanov's but mine is a bit simpler and doesn't use the noxious var keyword and I suggested using Table.Select first in the comments. So, I'll leave it be.
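For comparison, here is a sketch that runs both styles of filter against the table from the question: a LINQ Where using DataRow.IsNull, and DataTable.Select with a filter expression. It assumes the goal is rows whose ParentID is null, as stated in the question text, and uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;
using System.Linq;

// Table from the question: the CEO row has no parent.
var myTable = new DataTable();
myTable.Columns.Add("ParentID", typeof(string));
myTable.Columns.Add("ID", typeof(string));
myTable.Rows.Add(null, "CEO"); // a null value is stored as DBNull
myTable.Rows.Add("CEO", "FD");
myTable.Rows.Add("CEO", "CIO");

// Option 1: LINQ over the rows; IsNull checks for DBNull in the column.
var viaLinq = myTable.AsEnumerable()
    .Where(row => row.IsNull("ParentID"))
    .ToList();

// Option 2: a filter expression; Select returns a DataRow[] array.
var viaSelect = myTable.Select("ParentID IS NULL").ToList();

Console.WriteLine(viaLinq.Count);                    // 1
Console.WriteLine(viaSelect[0].Field<string>("ID")); // CEO
```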

Remove Duplicate Rows and Count in C# DataTable

I have a DataTable similar to this.
If the Adults value and the Child value are the same in several rows, I need to remove the duplicates and count them. I need output similar to this.
Can anyone please help me with this?
Thank you.
You want to group by adults+child:
var groups = tblRoooms.AsEnumerable()
.GroupBy(r => new{ Adults = r.Field<int>("Adults"), Child = r.Field<int>("Child") });
var tblRooomsCopy = tblRoooms.Clone(); // creates an empty clone of the table
foreach(var grp in groups)
{
int roomCount = grp.Sum(r => r.Field<int>("Roomcount"));
DataRow row = tblRooomsCopy.Rows.Add();
row.SetField("RoomNo", grp.First().Field<int>("RoomNo"));
row.SetField("Roomcount", roomCount);
row.SetField("Adults", grp.Key.Adults);
row.SetField("Child", grp.Key.Child);
}
Now you have your desired result in tblRooomsCopy.
I won't write the complete code for you, but I will describe a suggested way. First, order the datatable by adults and child, which makes rows with the same values consecutive, and create a list to hold the rows to be deleted.
Then use foreach to compare each row with the previous one; if it has the same values, add it to the list of rows to be removed. Finally, delete the rows in the list.
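The steps described above could look roughly like this. The sample rows are invented, the column names and int types are taken from the other answer, and adding the duplicate's Roomcount to the row that is kept is an assumption about what "count that" means. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical sample data; rooms 1 and 3 share the same Adults/Child values.
var tblRoooms = new DataTable();
tblRoooms.Columns.Add("RoomNo", typeof(int));
tblRoooms.Columns.Add("Roomcount", typeof(int));
tblRoooms.Columns.Add("Adults", typeof(int));
tblRoooms.Columns.Add("Child", typeof(int));
tblRoooms.Rows.Add(1, 1, 2, 0);
tblRoooms.Rows.Add(2, 1, 2, 1);
tblRoooms.Rows.Add(3, 1, 2, 0);

// Step 1: order by Adults, then Child, so equal rows become consecutive.
var ordered = tblRoooms.AsEnumerable()
    .OrderBy(r => r.Field<int>("Adults"))
    .ThenBy(r => r.Field<int>("Child"))
    .ToList();

// Step 2: compare each row with the last kept row and collect duplicates.
var toRemove = new List<DataRow>();
for (int i = 1, keptIndex = 0; i < ordered.Count; i++)
{
    var kept = ordered[keptIndex];
    var current = ordered[i];
    bool same = current.Field<int>("Adults") == kept.Field<int>("Adults")
             && current.Field<int>("Child") == kept.Field<int>("Child");
    if (same)
    {
        // Fold the duplicate's count into the kept row, then mark it for removal.
        kept.SetField("Roomcount", kept.Field<int>("Roomcount") + current.Field<int>("Roomcount"));
        toRemove.Add(current);
    }
    else
    {
        keptIndex = i;
    }
}

// Step 3: delete the collected rows after the comparison pass is finished.
foreach (var row in toRemove)
    tblRoooms.Rows.Remove(row);

Console.WriteLine(tblRoooms.Rows.Count); // 2
```

Removing rows only after the loop avoids modifying the table while it is being compared row by row.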
