The following C# code takes a large DataTable with many columns and an array of two column names. It produces a new DataTable with two columns, holding one row for each combination of the two supplied fields (staff no and skill) that occurs more than once.
This is too specific: I need to be able to supply any number of fields as the group-by key.
Can someone help me?
string[] excelField = new string[2]; // field names that together must be unique
excelField[0] = "staff No";
excelField[1] = "skill";
DataTable dataTableDuplicateRows = new DataTable();
dataTableDuplicateRows.Clear();
dataTableDuplicateRows.Columns.Clear();
foreach (string fieldName in excelField)
{
dataTableDuplicateRows.Columns.Add(fieldName);
}
var duplicateValues = dataTableCheck.AsEnumerable()
.GroupBy(row => new { Field0 = row[excelField[0]], Field1 = row[excelField[1]] })
.Where(group => group.Count() > 1)
.Select(g => g.Key);
foreach (var duplicateValuesRow in duplicateValues)
{
dataTableDuplicateRows.Rows.Add(duplicateValuesRow.Field0, duplicateValuesRow.Field1);
}
I think what you need is a way to make the LINQ query more dynamic. You could achieve that with expression trees, but the DynamicLinq library looks like an easier way to solve your issue.
In your case, with the library, just use the GroupBy extension method with a string value.
More info about DynamicLinq library:
Scott Gu's blog
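If pulling in a library is not an option, one alternative (not the DynamicLinq approach described above) is to group on a composite key built from however many field names are supplied. The following is only a sketch: the class and method names are made up for illustration, and it assumes the separator character never occurs in the data.

```csharp
using System;
using System.Data;
using System.Linq;

static class DuplicateFinder
{
    // Hypothetical helper: returns one row per key combination that occurs
    // more than once in source, for any number of key fields.
    public static DataTable FindDuplicateKeys(DataTable source, string[] keyFields)
    {
        var result = new DataTable();
        foreach (string field in keyFields)
            result.Columns.Add(field, source.Columns[field].DataType);

        var duplicateGroups = source.AsEnumerable()
            // Build one string key from all key fields.
            // Assumption: the separator '\u001F' never appears in the values.
            .GroupBy(row => string.Join("\u001F",
                keyFields.Select(f => Convert.ToString(row[f]))))
            .Where(g => g.Count() > 1);

        foreach (var g in duplicateGroups)
        {
            // Copy the key values from the first row of each duplicate group.
            DataRow first = g.First();
            result.Rows.Add(keyFields.Select(f => first[f]).ToArray());
        }
        return result;
    }
}
```

With the question's variables this would be called as DuplicateFinder.FindDuplicateKeys(dataTableCheck, excelField). Because the key is compared as text, values that differ only in type but format identically are treated as equal.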
I was wondering whether there is a better way to write this code.
List<CollectionFormFieldRecord> dataFields = new List<CollectionFormFieldRecord>();
foreach (CollectionRelationModel relation in relations)
{
foreach (var field in visibleFields)
{
if (field.SourceCollectionsID == relation.ChildCollectionID)
dataFields.Add(field);
}
}
When a field (from visibleFields) has a SourceCollectionsID that exists in the relations list, the field must be added to a separate list.
I tried a few things with LINQ but didn't know how to compare a property with a property of the items in a list.
You can do this using LINQ:
dataFields = (from relation in relations
from field in visibleFields
where field.SourceCollectionsID == relation.ChildCollectionID
select field).ToList();
but I prefer the foreach loops.
The code you showed us has O(N × M) complexity, where N and M are the sizes of the two lists. Try the .Join method instead: because it hashes the keys, its complexity is close to O(N + M). The code you should use is:
dataFields = visibleFields.Join(relations, vF => vF.SourceCollectionsID, r => r.ChildCollectionID, (visibleField, relation) => visibleField).ToList();
For a better understanding of the complexity, look at my answer to this question.
It can be something like this:
var dataFields = visibleFields.Where(f => relations.Any(r => f.SourceCollectionsID == r.ChildCollectionID))
.ToList();
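A variant of the same filter that avoids rescanning relations for every field is to collect the child IDs in a HashSet first. This is a minimal sketch: the two classes are stripped-down stand-ins for the types in the question, the ID type is assumed to be int, and the helper name is invented for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal stand-ins for the question's types (ID type assumed to be int).
class CollectionRelationModel { public int ChildCollectionID { get; set; } }
class CollectionFormFieldRecord { public int SourceCollectionsID { get; set; } }

static class FieldFilter
{
    public static List<CollectionFormFieldRecord> VisibleFieldsWithRelation(
        IEnumerable<CollectionFormFieldRecord> visibleFields,
        IEnumerable<CollectionRelationModel> relations)
    {
        // One pass over relations to build the lookup set.
        var childIds = new HashSet<int>(relations.Select(r => r.ChildCollectionID));

        // One pass over the fields; each Contains call is a hash lookup.
        return visibleFields
            .Where(f => childIds.Contains(f.SourceCollectionsID))
            .ToList();
    }
}
```

Note one behavioral difference: this returns each field at most once, whereas the nested loops and the Join add a field once per matching relation.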
Here is a list of column names:
var colNames = new List<string> { "colE", "colL", "colO", "colN" };
Based on the position of the column names in the list, I want to make that column's visible index equal to the position of the column name, but without returning a list. In other words, the following lambda expression without "ToList()" at the end:
colNames.Select((x, index) => { grid_ctrl.Columns[x].VisibleIndex = index; return x; }).ToList();
Can this be coded in a one-line lambda expression?
Use a loop to make side-effects. Use queries to compute new data from existing data:
var updates =
colNames.Select((x, index) => new { col = grid_ctrl.Columns[x], index })
.ToList();
foreach (var u in updates)
u.col.VisibleIndex = u.index;
Hiding side-effects in queries can make for nasty surprises. We can still use a query to do the bulk of the work.
You could also use List.ForEach to make those side-effects. That approach is not very extensible, however. It is not as general as a query.
Yes, here you are:
colNames.ForEach((x) => grid_ctrl.Columns[x].VisibleIndex = colNames.IndexOf(x));
Note that you need unique strings in your list, otherwise .IndexOf will behave badly.
Unfortunately List<T>.ForEach, like its relative foreach, doesn't provide an enumeration index.
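When the index is needed for a side effect, a plain for loop supplies it directly and does not depend on the names being unique. The sketch below uses a dictionary of a made-up Column class as a stand-in for grid_ctrl.Columns, since the real grid type is not shown in the question.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the grid's column type.
class Column { public int VisibleIndex { get; set; } }

static class ColumnOrdering
{
    // Sets each named column's VisibleIndex to its position in colNames.
    public static void Apply(IDictionary<string, Column> columns, IList<string> colNames)
    {
        for (int i = 0; i < colNames.Count; i++)
            columns[colNames[i]].VisibleIndex = i;
    }
}
```

Against the real control, the loop body would presumably read grid_ctrl.Columns[colNames[i]].VisibleIndex = i.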
I wanted to ask for suggestions on how to simplify the foreach block below. I tried to do it all in one LINQ statement, but I couldn't figure out how to manipulate the "Count" values inside the query.
More details about what I'm trying to achieve:
- I have a huge list with potential duplicates, where the Ids are repeated but the "Count" property holds different numbers
- I want to get rid of the duplicates without losing those "Count" values
- so for the items with the same Id I sum up the "Count" properties
Still, the current code doesn't look pretty:
var grouped = bigList.GroupBy(c => c.Id).ToList();
foreach (var items in grouped)
{
var count = 0;
items.Each(c=> count += c.Count);
items.First().Count = count;
}
var filtered = grouped.Select(y => y.First());
I don't expect the whole solution, pieces of ideas will be also highly appreciated :)
Given that you're mutating the collection, I would personally just make a new "item" with the count:
var results = bigList.GroupBy(c => c.Id)
.Select(g => new Item(g.Key, g.Sum(i => i.Count)))
.ToList();
This performs a simple mapping from the original to a new collection of Item instances, with the proper Id and Count values.
var filtered = bigList.GroupBy(c=>c.Id)
.Select(g=> {
var f = g.First();
f.Count = g.Sum(c=>c.Count);
return f;
});
I know there have been posts about dynamic where clauses in C# LINQ; however, I'm fairly new to LINQ and don't think the proposed solutions apply to my case.
The problem is as follows:
I have a Dictionary<string, List<string>>. Each value in the dictionary represents a set of values. For example, for the key "food" the value can be {"apple", "tomato", "soup"}. In addition, I have a DataTable whose columns are the dictionary's keys.
My task is to build a LINQ query whose where clauses are built from the dictionary.
That is, the multiple values of one key are combined with "or", and the conditions for different keys are combined with either "and" or "or".
I can't hard-code it, since it must change dynamically according to the keys found in the dictionary.
I don't really know how to concatenate multiple where clauses so that they match my requirements.
Instead of concatenating linq expressions you can use .Any(), .All() and .Contains() to achieve what you want.
var filters = new Dictionary<string, List<string>> { {"Food", new List<string> { "apple", "tomato", "soup"} },
{"Drink", new List<string> { "tea" }}};
var table = new System.Data.DataTable();
table.Columns.Add("Food");table.Columns.Add("Drink");
table.Rows.Add("apple" , "water");
table.Rows.Add("tomato", "tea");
table.Rows.Add("cake" , "water");
table.Rows.Add("cake" , "tea");
// And: retrieves only (tomato, tea)
var andLinq = table.Rows.Cast<System.Data.DataRow>().Where(row => filters.All(filter => filter.Value.Contains(row[filter.Key].ToString())));
// Or: retrieves every row except (cake, water)
var orLinq = table.Rows.Cast<System.Data.DataRow>().Where(row => filters.Any(filter => filter.Value.Contains(row[filter.Key].ToString())));
Contains on the value list is equivalent to "equals val1 or equals val2 or ...".
I updated the answer. This is the most precise interpretation I could draw from your question.
Dictionary<string, List<string>> filters = GetFilters();
var filteredSource = GetSource();
bool useAndOperator = GetUseAndOperator();
foreach (var filter in filters.Values)
{
Func<myDataRow, bool> predicate = useAndOperator ? s => filter.All(f => f == s["key1"])
: s => filter.Any(f => f == s["key1"]);
filteredSource = filteredSource.Where(predicate);
}
This code doesn't make much sense as it stands; it demonstrates the principle rather than a complete solution, and you should adapt it to your needs.
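The principle of chaining one Where per filter can be sketched against a real DataTable as follows. This covers only the "and" case (every key must match one of its allowed values); the class and method names are invented for illustration, and cell values are assumed to be comparable as strings.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class RowFilter
{
    // Hypothetical helper: keeps rows that satisfy every filter in the dictionary.
    public static IEnumerable<DataRow> ApplyAll(
        DataTable table, Dictionary<string, List<string>> filters)
    {
        IEnumerable<DataRow> query = table.AsEnumerable();
        foreach (var filter in filters)
        {
            string column = filter.Key;
            var allowed = new HashSet<string>(filter.Value);

            // Each Where narrows the previous result, so the filters combine with "and".
            // Within one filter, Contains acts as "or" over the allowed values.
            query = query.Where(row => allowed.Contains(Convert.ToString(row[column])));
        }
        return query; // evaluated lazily when enumerated
    }
}
```

Chained Where calls can only express "and" between keys; the "or" case needs a single predicate such as the filters.Any(...) form shown in the other answer.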
I have a List<DataTable>. I would like to filter through all the rows in the list of tables to find the rows that are present in every DataTable in the list.
If possible, the compare needs to be on the "ID" column that is on every row.
I have tried to solve this with Linq but got stuck. This is what I have so far:
List<DataTable> dataTables = new List<DataTable>();
// fill up the list
List<DataRow> dataRows =
dataTables.SelectMany(dt => dt.Rows.Cast<DataRow>().AsEnumerable()).
Aggregate((r1, r2) => r1.Intersect(r2));
Any suggestions?
Not a simple question. Here's a solution (which seems too complicated to me, but it works).
- Obtain the Id value from each row using LINQ to DataSets
- Intersect the multiple lists to find all the common values
- Find a single occurrence of a row among all of the rows that have one of the matching ids
To use Linq on DataTable, see this article for a start.
You could get the ids from one table like this
var ids = dt.AsEnumerable().Select (d => d.Field<int>("ID")).OfType<int>();
and from multiple tables
var setsOfIds = dataTables.Select (
t => t.AsEnumerable().Select (x => x.Field<int>("ID")).OfType<int>());
To intersect multiple lists, try this article. Using one of the methods there you could obtain the intersection of all of the ids.
Using Jon Skeet's helper method
public static class MyExtensions
{
public static List<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> lists)
{
HashSet<T> hashSet = new HashSet<T>(lists.First());
foreach (var list in lists.Skip(1))
{
hashSet.IntersectWith(list);
}
return hashSet.ToList();
}
}
we can write
var commonIds = setsOfIds.IntersectAll();
Now flatten all the rows from the DataTables and filter by the common ids:
var rows = dataTables.SelectMany (t => t.AsEnumerable()).Where(
r => commonIds.Contains(r.Field<int>("ID")));
Now group by id and take the first instance of each row:
var result = rows.GroupBy (r => r.Field<int>("ID")).Select (r => r.First ());
Try this to find the intersection between the two lists:
r1.Join(r2, a => a.Id, b => b.Id, (a, b) => a);