Tasks Combining by groups into a single group - c#

Assuming I have all of the classes implementing IGenerator
List<IGenerator> generators = new List<IGenerator> { new Jane1Generator(), new Jane2Generator(), new JohnGenerator() };
And
public interface IGenerator
{
string GetFirstName();
Task<List<Items>> Generate();
}
So grouping generators by GetFirstName() will put Jane1 and Jane2 in the same category. How can I combine the Tasks for Jane1 and Jane2 into a single Task, and keep John separate. I want to combine the results of both Janes into a single List.
foreach (var groupedByName in generators.GroupBy(i => i.GetFirstName()))
{
//Combine Tasks of Jane1 & Jane2
//give me a new Task that is the sum of Jane1 & Jane2 tasks.
//almost like List<Items>.Join(AnotherList) wrapped with a Task, so I can wait for both Tasks and Get combined Results instead.
}
So if I create a List<Task<List<Items>>> tasks = new List<Task<List<Items>>>();
I want the tasks list to contain only two elements: one would be the combined task for Jane1 and Jane2, and the other just John's, so I can do
await Task.WhenAll(tasks);
Console.Write("Done");

You can start with GroupBy to group all of the generators by name.
Then select out those groups into the needed information, namely the name and the items.
To get a task representing all of the items for that group you can use Select to get a sequence of all of the items for each generator in the group. Giving that to WhenAll gives us a task that will be done when all of the generators finish. We can then add a continuation to that task that combines all of the items together into a single list.
var groups = generators.GroupBy(gen => gen.GetFirstName())
    .Select(group => new
    {
        Name = group.Key,
        Items = Task.WhenAll(group.Select(gen => gen.Generate()))
            .ContinueWith(t => t.Result.SelectMany(items => items).ToList())
    });
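As a minimal sketch of the same idea with async/await instead of ContinueWith: the NamedGenerator class, the GroupedTasks helper, and the use of string in place of the question's Items type are all assumptions made only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public interface IGenerator
{
    string GetFirstName();
    Task<List<string>> Generate(); // string stands in for the question's Items type
}

// Hypothetical generator used only for this illustration.
public sealed class NamedGenerator : IGenerator
{
    private readonly string _firstName;
    private readonly string[] _items;

    public NamedGenerator(string firstName, params string[] items)
    {
        _firstName = firstName;
        _items = items;
    }

    public string GetFirstName() => _firstName;

    public async Task<List<string>> Generate()
    {
        await Task.Delay(10); // simulate asynchronous work
        return _items.ToList();
    }
}

public static class GroupedTasks
{
    // Returns one task per first name. Each task completes when every
    // generator in its group has finished, and yields the merged items.
    public static List<Task<List<string>>> CombineByName(IEnumerable<IGenerator> generators)
    {
        return generators
            .GroupBy(gen => gen.GetFirstName())
            .Select(group => MergeAsync(group))
            .ToList();
    }

    private static async Task<List<string>> MergeAsync(IEnumerable<IGenerator> group)
    {
        // WhenAll returns the results in the same order as the tasks it was given.
        List<string>[] results = await Task.WhenAll(group.Select(gen => gen.Generate()));
        return results.SelectMany(items => items).ToList();
    }

    public static async Task Main()
    {
        var generators = new List<IGenerator>
        {
            new NamedGenerator("Jane", "a", "b"),
            new NamedGenerator("Jane", "c"),
            new NamedGenerator("John", "d"),
        };

        List<Task<List<string>>> tasks = CombineByName(generators);
        List<string>[] merged = await Task.WhenAll(tasks);

        Console.WriteLine(tasks.Count);                 // 2
        Console.WriteLine(string.Join(",", merged[0])); // a,b,c
        Console.WriteLine("Done");
    }
}
```

With await, an exception thrown by a generator surfaces directly from the merged task rather than wrapped in an AggregateException, which is one reason to prefer it over reading t.Result in a continuation.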

tasks.Add(Task<IEnumerable<Items>>.Factory.ContinueWhenAll(results.ToArray(),
myTasks =>
{
var newList = new List<Items>();
foreach (var i in results)
{
newList.AddRange(i.Result);
}
return DoSomething(newList.AsEnumerable());
}));
Here results is the collection of tasks for the generators that share a first name.

Related

Update a property field in a List

I have a List<Map> and I wanted to update the Map.Target property based from a matching value from another List<Map>.
Basically, the logic is:
If mapsList1.Name is equal to mapsList2.Name
Then mapsList1.Target = mapsList2.Name
The structure of the Map class looks like this:
public class Map {
public Guid Id { get; set; }
public string Name { get; set; }
public string Target { get; set; }
}
I tried the following but obviously it's not working:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
// populate the 2 lists here
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
The count of items in list 1 will be always greater than or equal to the count of items in list 2. No duplicates in both lists.
Assuming there are a small number of items in the lists and only one item in list 1 that matches:
list2.ForEach(l2m => list1.First(l1m => l1m.Name == l2m.Name).Target = l2m.Target);
If more than one item in list1 must be updated, enumerate all of list1 and look up each item's match in list2 with FirstOrDefault:
list1.ForEach(l1m => l1m.Target = list2.FirstOrDefault(l2m => l1m.Name == l2m.Name)?.Target ?? l1m.Target);
If there are a large number of items in list2, turn it into a dictionary
var d = list2.ToDictionary(m => m.Name);
list1.ForEach(m => m.Target = d.ContainsKey(m.Name) ? d[m.Name].Target : m.Target);
(Presumably list2 doesn't contain any repeated names)
If list1's names are unique and everything in list2 is in list1, you could even turn list1 into a dictionary and enumerate list2:
var d=list1.ToDictionary(m => m.Name);
list2.ForEach(m => d[m.Name].Target = m.Target);
If list2 has entries that are not in list1, or list1 has duplicate names, you could use a Lookup instead. You'd just have to do something to avoid the "Collection was modified; enumeration operation may not execute" exception you'd get if you tried to modify the collection it returns for a name while enumerating it.
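A minimal sketch of the Lookup variant, assuming the Map class from the question; the LookupUpdate class and CopyTargets method are hypothetical names. It only assigns a property on the matched objects and never adds to or removes from a collection, so the enumeration is not invalidated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Map
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Target { get; set; }
}

public static class LookupUpdate
{
    // Copies Target from list2 onto every list1 entry with the same Name.
    // Names in list2 that have no match in list1 are skipped.
    public static void CopyTargets(List<Map> list1, List<Map> list2)
    {
        ILookup<string, Map> byName = list1.ToLookup(m => m.Name);
        foreach (Map source in list2)
        {
            // Indexing a lookup with a missing key yields an empty sequence.
            foreach (Map match in byName[source.Name])
            {
                match.Target = source.Target;
            }
        }
    }

    public static void Main()
    {
        var list1 = new List<Map>
        {
            new Map { Name = "a", Target = "old" },
            new Map { Name = "a", Target = "old" },
            new Map { Name = "b", Target = "old" },
        };
        var list2 = new List<Map>
        {
            new Map { Name = "a", Target = "new" },
            new Map { Name = "zzz", Target = "unused" },
        };

        CopyTargets(list1, list2);
        Console.WriteLine(string.Join(",", list1.Select(m => m.Target))); // new,new,old
    }
}
```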
mapsList1.Where(m1 => mapsList2.Where(m2 => m1.Name == m2.Name) ) // don't know what to do next
LINQ Where doesn't really work like that / that's not a statement in itself. The m1 is the entry from list1, and the inner Where would produce an enumerable of list 2 items, but it doesn't result in the Boolean the outer Where is expecting, nor can you do anything to either of the sequences because LINQ operations are not supposed to have side effects. The only thing you can do with a Where is capture or use the sequence it returns in some other operation (like enumerating it), so Where isn't really something you'd use for this operation unless you use it to find all the objects you need to alter. It's probably worth pointing out that ForEach is a list thing, not a LINQ thing, and is basically just another way of writing foreach(var item in someList)
If the collections are big enough, a better approach would be to create a dictionary to look up the targets:
List<Map> mapsList1 = new List<Map>();
List<Map> mapsList2 = new List<Map>();
var dict = mapsList2
.GroupBy(map => map.Name)
.ToDictionary(maps => maps.Key, maps => maps.First().Target);
foreach (var map in mapsList1)
{
if (dict.TryGetValue(map.Name, out var target))
{
map.Target = target;
}
}
Note that this will discard any possible name duplicates from mapsList2.

Split list of strings by list of key words [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have a list of strings
e.g.{"apple.txt", "orange.sd.2.txt", "apple.2.tf.txt", "orange.txt"}
and another list of strings to group the first list
e.g. {"apple", "orange"}
so that the first list is split into a list of lists and looks like this:
{{"apple.txt", "apple.2.tf.txt"},{"orange.txt", "orange.sd.2.txt"}}
How can I achieve this with linq?
How about this:
var groupedList = firstList.GroupBy(x => secondList.Single(y => x.Contains(y)));
You could group the elements of the original list by all possible keys using Split, SelectMany, and GroupBy with an anonymous type:
var list = new List<string> { "apple.txt", "orange.sd.2.txt", "apple.2.tf.txt", "orange.txt" };
var groups = list
.SelectMany(element => element
.Split('.')
.Select(part => new { Part = part, Full = element }))
.GroupBy(entry => entry.Part);
Now you can select the groups you want to keep using Where, and convert the results into the nested lists using Select and ToList:
var keys = new List<string> { "apple", "orange" };
var result = groups
.Where(group => keys.Contains(group.Key))
.Select(group => group
.Select(entry => entry.Full)
.ToList())
.ToList();
N.B. Elements of the original list which do not contain any of the specified keys will not appear in the results, and elements which contain more than one of the specified keys will appear more than once in the result.
Edit: As @NetMage noted, I've made an incorrect assumption about splitting strings. Here's another version, although it's O(m * n):
var result = keys
.Select(key => list.Where(element => element.Contains(key)).ToList())
.ToList();
This is one simple way to do it. There are many ways, and, as I noted in my comment on your question, this one will include duplicates: if several keys match the same data, each of their groups will contain a copy.
// have the list of keys (groups)
var keyList = new List<string>() {"apple", "orange"};
// have the list of all the data to split
var dataToSplit = new List<string>()
{
"apple.txt",
"apple.2.tf.txt",
"orange.txt",
"orange.sd.2.txt"
};
// now split to get just as desired you select what you want for each keys
var groupedData = keyList.Select(key => dataToSplit.Where(data => data.Contains(key)).ToList()).ToList();
// groupedData is a List<List<string>>
A second option, which gets the values in a more "object" fashion, is to use an anonymous type. It is especially good if you will do lots of manipulation, though it is more verbose in the code. If you are new to this I do NOT recommend that approach, but here it is anyway.
// have the list of keys (groups)
var keyList = new List<string>() {"apple", "orange"};
// have the list of all the data to split
var dataToSplit = new List<string>()
{
"apple.txt",
"apple.2.tf.txt",
"orange.txt",
"orange.sd.2.txt"
};
// create the anonymous objects
var anonymousGroup = keyList.Select(key =>
{
    return new
    {
        Key = key,
        Data = dataToSplit.Where(data => data.Contains(key)).ToList()
    };
});
// anonymousGroup is a sequence of anonymous objects in key order; access all data for orange like this
var orangeGroup = anonymousGroup.FirstOrDefault(o => o.Key == "orange"); // get the anonymous object
var orangeData = orangeGroup.Data; // get the List<string> for that group
A third way, which often does less work than the O(m*n) approaches above. The trick is to remove data from the collection as you go, so items that were already processed are not rechecked. This is from my codebase; it's an extension method for List that removes the items matching a predicate from the list and returns what has been removed.
public static List<T> RemoveAndGet<T>(this List<T> list, Func<T, bool> predicate)
{
var itemsRemoved = new List<T>();
// iterate backward so that removing an item does not shift the indices still to be visited
for (int i = list.Count - 1; i >= 0; i--)
{
// keep item pointer
var item = list[i];
// if the item match the remove predicate
if (predicate(item))
{
// add the item to the returned list
itemsRemoved.Add(item);
// remove the item from the source list
list.RemoveAt(i);
}
}
return itemsRemoved;
}
Now with that extension when you have a list you can use it easily like this :
// have the list of keys (groups)
var keyList = new List<string>() {"apple", "orange"};
// have the list of all the data to split
var dataToSplit = new List<string>()
{
"apple.txt",
"apple.2.tf.txt",
"orange.txt",
"orange.sd.2.txt"
};
// now split to get just as desired you select what you want for each keys
var groupedData = keyList.Select(key => dataToSplit.RemoveAndGet(data => data.Contains(key))).ToList();
In that case, because of the order of both collections, the first key is apple, so it will iterate the 4 items in dataToSplit, keep only 2, and reduce dataToSplit to the 2 items that contain orange. For the second key it will iterate over only 2 items, which makes it faster in this case. Typically this method will be as fast as or faster than the first two I provided, while being as clear and still making use of LINQ.
You can achieve this using this simple code:
var list1 = new List<string>() {"apple.txt", "orange.sd.2.txt", "apple.2.tf.txt", "orange.txt"};
var list2 = new List<string>() {"apple", "orange"};
var result = new List<List<string>>();
list2.ForEach(e => {
result.Add(list1.Where(el => el.Contains(e)).ToList());
});
Tuples to the rescue!
var R = new List<(string, List<string>)> { ("orange", new List<string>()), ("apple", new List<string>()) };
var L = new List<string> { "apple.txt", "apple.2.tf.txt", "orange.txt", "orange.sd.2.txt" };
R.ForEach(r => L.ForEach(l => { if (l.Contains(r.Item1)) { r.Item2.Add(l); } }));
var resultString = string.Join("," , R.Select(x => "{" + string.Join(",", x.Item2) + "}"));
You can build R dynamically trivially if you need to.
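One way R might be built dynamically is sketched below. The TupleGrouping class, the Group method, and the tuple element names Key and Items are hypothetical names for this illustration; the matching rule is the same Contains check as above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TupleGrouping
{
    public static List<(string Key, List<string> Items)> Group(
        IEnumerable<string> keys, IEnumerable<string> values)
    {
        // Build one (key, empty list) tuple per key instead of writing them out by hand.
        var groups = keys
            .Select(key => (Key: key, Items: new List<string>()))
            .ToList();

        foreach (var value in values)
        {
            foreach (var g in groups)
            {
                // g is a copy of the tuple, but Items still refers to the same list.
                if (value.Contains(g.Key))
                {
                    g.Items.Add(value);
                }
            }
        }

        return groups;
    }

    public static void Main()
    {
        var keys = new List<string> { "orange", "apple" };
        var values = new List<string> { "apple.txt", "apple.2.tf.txt", "orange.txt", "orange.sd.2.txt" };

        var groups = Group(keys, values);
        var resultString = string.Join(",", groups.Select(g => "{" + string.Join(",", g.Items) + "}"));

        Console.WriteLine(resultString); // {orange.txt,orange.sd.2.txt},{apple.txt,apple.2.tf.txt}
    }
}
```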

Run method in parallel in C# and collate results

I have a method that returns an object. In my parent function I have a list of IDs.
I would like to call the method for each ID I have and then have the objects added to a list. Right now I have written a loop that calls the method passing each ID and waits for the returned object and then goes to the next ID.
Can this be done in parallel? Any help here would be most helpful.
Something like this maybe:
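// requires: using System.Collections.Concurrent; using System.Threading.Tasks;
List<int> ids = new List<int>();
// List<T>.Add is not thread-safe, so collect the results in a concurrent collection instead.
var result = new ConcurrentBag<object>();
Parallel.ForEach(ids, id => {
    result.Add(new { Id = id }); // Your class instance here.
});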
I think the Task Parallel Library will help you:
Task[] tasks = new Task[2];
tasks[0] = Task.Factory.StartNew(() => YourFunction());
tasks[1] = Task.Factory.StartNew(() => YourFunction());
Task.WaitAll(tasks); // here it will wait until all the functions have completed
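Neither snippet above shows how the returned objects end up in one list, so here is a minimal sketch of that step. Load is a hypothetical stand-in for the asker's method, and ParallelCollate, LoadAllAsync, and LoadAllPlinq are made-up names. Both variants return the results in the same order as the IDs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelCollate
{
    // Hypothetical stand-in for the method that builds an object for one ID.
    public static string Load(int id) => "item-" + id;

    // Variant 1: one task per ID, collected with Task.WhenAll.
    public static async Task<List<string>> LoadAllAsync(IEnumerable<int> ids)
    {
        // Task.Run queues the synchronous call on the thread pool.
        IEnumerable<Task<string>> tasks = ids.Select(id => Task.Run(() => Load(id)));
        // WhenAll returns the results in the same order as the tasks.
        string[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }

    // Variant 2: PLINQ; AsOrdered keeps the output in input order.
    public static List<string> LoadAllPlinq(IEnumerable<int> ids)
    {
        return ids.AsParallel().AsOrdered().Select(id => Load(id)).ToList();
    }

    public static async Task Main()
    {
        var ids = new List<int> { 1, 2, 3 };

        List<string> viaTasks = await LoadAllAsync(ids);
        List<string> viaPlinq = LoadAllPlinq(ids);

        Console.WriteLine(string.Join(",", viaTasks)); // item-1,item-2,item-3
        Console.WriteLine(string.Join(",", viaPlinq)); // item-1,item-2,item-3
    }
}
```

If the real method is I/O-bound and already returns a Task, the Task.Run wrapper can be dropped and the tasks passed to Task.WhenAll directly.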

C# compare one list with part of other list

I am trying to remove unwanted images from the website. The product image folder contains more than 200,000 images. I have a list of inactive product codes in one List, and the list of file names in another.
List<string> lFileList = files.ToList();
List<string> lNotinfiles = new List<string>();
foreach (var s in lFileList)
{
var s2 = (from s3 in lProductsList
where s.Contains(s3.cProductCode)
select s3.cProductCode).FirstOrDefault();
if (s2 == null)
{
lNotinfiles.Add(s);
}
}
Here lProductsList is the list containing ProductCodes that are not used.
The image list contains multiple images for the same product, but each image name contains the product code (mostly it starts with it, and there may be suffixes such as _1 or _2 before .jpg).
The above code works but it takes more than 5 min for a single folder to get the Not in list. I did try the following but that took more than 15 min.
var s2 = (from s3 in lProductsList
where s.IndexOf(s3.cProductCode) >= 0
select s3.cProductCode).FirstOrDefault();
I have tried to remove the loop altogether; that also didn't work.
What would be the best way to achieve this faster?
I'd suggest using a HashSet, holding off on ToList, and maybe GroupBy.
HashSet + use of ToList
Currently your code has a time complexity of O(n²): you iterate the outer list and, for each item, iterate all the items of the inner list.
Change the type of lProductsList from a list to a HashSet<string> containing the codes. Finding an item in a HashSet is O(1) (in a list it is O(n)). Then, when you iterate each of the items of lFileList to check whether they are in lProductsList, the time complexity will be O(n) instead of O(n²). Note that a HashSet lookup is an exact match, so each file name has to be reduced to its product code before the lookup.
This code will show you the time difference between when using 2 lists or when using a list and a HashSet:
var items = (new[] { "1", "2", "3","4","5","6","7","8","9","10" }).SelectMany(x => Enumerable.Repeat(x, 10000)).ToList();
var itemsToFilterOut = new List<string> { "1", "2", "3" };
var efficientItemsToFilterOut = new HashSet<string>(itemsToFilterOut);
var watch = System.Diagnostics.Stopwatch.StartNew();
var unwantedItems = items.Where(item => itemsToFilterOut.Contains(item)).ToList();
watch.Stop();
Console.WriteLine(watch.Elapsed.TotalMilliseconds);
watch = System.Diagnostics.Stopwatch.StartNew();
var efficientUnwantedItems = items.Where(item => efficientItemsToFilterOut.Contains(item)).ToList();
watch.Stop();
Console.WriteLine(watch.Elapsed.TotalMilliseconds);
As for putting it in the context of your code:
var notInUseItems = new HashSet<string>(from item in lProductsList
select item.cProductCode);
//Notice that here I am not using the materialized `lFileList`
lNotinfiles = files.Where(item => !notInUseItems.Contains(item)).ToList();
GroupBy
Moreover, you said that the list contains multiple items mapping to the same key. Use GroupBy before filtering. Check the performance of this addition:
watch = System.Diagnostics.Stopwatch.StartNew();
var moreEfficientUnwantedItems = items.GroupBy(item => item)
.Where(group => efficientItemsToFilterOut.Contains(group.Key))
.Select(group => group.Key)
.ToList(); // materialize so the query actually runs inside the timed region
watch.Stop();
Console.WriteLine(watch.Elapsed.TotalMilliseconds);
Check your data to see how significant the amount of duplication is and, if needed, use the GroupBy.
Two suggestions:
Do not materialize files with .ToList(), i.e. do not wait until all files are retrieved.
Organize NotInFiles as a HashSet<String> to get better lookup complexity: O(1) instead of O(N).
Something like this:
//TODO: you have to implement this
private static String ExtractProductCode(string fileName) {
    int p = fileName.IndexOf('_');
    if (p >= 0)
        return fileName.Substring(0, p);
    else
        return fileName;
}
...
HashSet<String> NotInFiles = new HashSet<String>(
    lNotinfiles,
    StringComparer.OrdinalIgnoreCase); // file names are case insensitive
..
var files = Directory
    .EnumerateFiles(@"C:\MyPictures", "*.jpeg", SearchOption.AllDirectories)
    .Select(path => Path.GetFileNameWithoutExtension(path))
    .Select(name => ExtractProductCode(name))
    .Where(code => !NotInFiles.Contains(code))
    .ToList(); // if you want List materialization
You are converting your (I assume) array to a List and then doing a foreach over it.
Using for directly on the array should make it at least a bit faster.
List<string> lNotinfiles = new List<string>();
for (int i = 0; i < files.Length; i++)
{
    var s = files[i];
    var s2 = (from s3 in lProductsList where s.Contains(s3.cProductCode) select s3.cProductCode).FirstOrDefault();
    if (s2 == null)
    {
        lNotinfiles.Add(s);
    }
}

Linq intersect to filter multiple criteria against list

I'm trying to filter users by department. The filter may contain multiple departments, the users may belong to multiple departments (n:m). I'm fiddling around with LINQ, but can't find the solution. Following example code uses simplified Tuples just to make it runnable, of course there are some real user objects.
Also on CSharpPad, so you have some runnable code: http://csharppad.com/gist/34be3e2dd121ffc161c4
string Filter = "Dep1"; //can also contain multiple filters
var users = new List<Tuple<string, string>>
{
Tuple.Create("Meyer", "Dep1"),
Tuple.Create("Jackson", "Dep2"),
Tuple.Create("Green", "Dep1;Dep2"),
Tuple.Create("Brown", "Dep1")
};
//this is the line I can't get to work like I want to
var tuplets = users.Where(u => u.Item2.Intersect(Filter).Any());
if (tuplets.Distinct().ToList().Count > 0)
{
foreach (var item in tuplets) Console.WriteLine(item.ToString());
}
else
{
Console.WriteLine("No results");
}
Right now it returns:
(Meyer, Dep1)
(Jackson, Dep2)
(Green, Dep1;Dep2)
(Brown, Dep1)
What I would want it to return is: Meyer,Green,Brown. If Filter were set to "Dep1;Dep2" I would want to do an or-comparison and find Meyer,Jackson,Green,Brown (and distinct, as I don't want Green twice). If Filter were set to "Dep2" I would only want Jackson,Green. I also played around with .Split(';'), but it got me nowhere.
Am I making sense? I have Users with single/multiple departments and want filtering for those departments. In my output I want to have all users from the specified department(s). The LINQ-magic is not so strong on me.
Since string implements IEnumerable<char>, what you're doing right now is an Intersect on an IEnumerable<char> (i.e. you're checking each letter in the string). You need to split on ; both on Item2 and Filter and intersect those.
var tuplets = users.Where(u =>
u.Item2.Split(new []{';'})
.Intersect(Filter.Split(new []{';'}))
.Any());
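To make the difference concrete, here is a self-contained sketch using the sample data from the question. DepartmentFilter and UsersIn are hypothetical names, and putting the filter into a HashSet is a choice made for this sketch rather than something the answer requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DepartmentFilter
{
    // Returns the names of users that belong to at least one department in the filter.
    public static List<string> UsersIn(IEnumerable<Tuple<string, string>> users, string filter)
    {
        // Split the filter once; comparing whole department names avoids the
        // character-by-character Intersect that the original query performed.
        var wanted = new HashSet<string>(filter.Split(';'));

        return users
            .Where(u => u.Item2.Split(';').Any(dep => wanted.Contains(dep)))
            .Select(u => u.Item1)
            .ToList();
    }

    public static void Main()
    {
        var users = new List<Tuple<string, string>>
        {
            Tuple.Create("Meyer", "Dep1"),
            Tuple.Create("Jackson", "Dep2"),
            Tuple.Create("Green", "Dep1;Dep2"),
            Tuple.Create("Brown", "Dep1")
        };

        Console.WriteLine(string.Join(",", UsersIn(users, "Dep1")));      // Meyer,Green,Brown
        Console.WriteLine(string.Join(",", UsersIn(users, "Dep1;Dep2"))); // Meyer,Jackson,Green,Brown
        Console.WriteLine(string.Join(",", UsersIn(users, "Dep2")));      // Jackson,Green
    }
}
```

Each user is tested once, so Green appears only once even when both of Green's departments are in the filter.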
string[] Filter = {"Dep1","Dep2"}; //Easier if this is an enumerable
var users = new List<Tuple<string, string>>
{
Tuple.Create("Meyer", "Dep1"),
Tuple.Create("Jackson", "Dep2"),
Tuple.Create("Green", "Dep1;Dep2"),
Tuple.Create("Brown", "Dep1")
};
//I would use Any/Split/Contains
var tuplets = users.Where(u => Filter.Any(y=> u.Item2.Split(';').Contains(y)));
if (tuplets.Distinct().ToList().Count > 0)
{
foreach (var item in tuplets) Console.WriteLine(item.ToString());
}
else
{
Console.WriteLine("No results");
}
In addition to the other answers, the Contains extension method may also be a good fit for what you're trying to do if you're matching on a value:
var result = list.Where(x => filter.Contains(x.Value));
Otherwise, the Any method will accept a delegate:
var result = list.Where(x => filter.Any(y => y.Value == x.Value));
