Venn Diagram style grouping in LINQ

OK, the title might be a little confusing, but here is what I am trying to do.
I have a series of natural numbers:
var series = Enumerable.Range(1, 100);
Now I want to use GroupBy to put the numbers into these 3 groups: Prime, Even, Odd.
series.Select(number => {
var type = "";
if (MyStaticMethods.IsPrime(number))
{
type = "Prime";
}
else if (number % 2 == 0)
{
type = "Even";
}
else
{
type = "Odd";
}
return new { Type=type, Number = number };
}).GroupBy(n => n.Type);
The above query fails to put prime numbers into the Even or Odd group as well; they end up only in the 'Prime' group. Is there any way for the above Select to yield multiple items per number?
I could try something like the following, but it requires an additional flattening of the sequence.
series.Select(number => {
// A list of named tuples, since anonymous objects cannot be added to a List<int>
var list = new List<(string Type, int Number)>();
if (MyStaticMethods.IsPrime(number))
{
list.Add(("prime", number));
}
if (number % 2 == 0)
{
list.Add(("even", number));
}
else
{
list.Add(("odd", number));
}
}
return list;
})
.SelectMany(n => n)
.GroupBy(n => n.Type);
The above code solves my issue, but is there a better way that would make my code look more "functional"?

You can use LINQ here, but you'll need to duplicate the values that can belong to more than one group. GroupBy only works for disjoint groups, so you need a way to distinguish 2 the even number from 2 the prime number. Your approach is essentially what you need to do, but it can be done a little more efficiently.
You can define a set of categories that can help classify the numbers. You don't necessarily need to define new classes to get this to work, but it helps to keep things clean and organized.
class Category<T>
{
public Category(string name, Predicate<T> predicate)
{
Name = name;
Predicate = predicate;
}
public string Name { get; }
public Predicate<T> Predicate { get; }
}
Then to group the numbers, you'd do this:
var series = Enumerable.Range(1, 100);
var categories = new[]
{
new Category<int>("Prime", i => MyStaticMethods.IsPrime(i)),
new Category<int>("Odd", i => i % 2 != 0),
new Category<int>("Even", i => i % 2 == 0),
};
var grouped =
from i in series
from c in categories
where c.Predicate(i)
group i by c.Name;
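For anyone who wants to run the idea end to end, here is a minimal self-contained sketch of the same pairing in method syntax. It is only an illustration: `IsPrime` below is a hypothetical stand-in for `MyStaticMethods.IsPrime` (simple trial division), and `VennGrouping` and `Group` are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VennGrouping
{
    // Hypothetical stand-in for MyStaticMethods.IsPrime (trial division).
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // Pairs every number with every category it satisfies,
    // then groups the pairs by category name.
    public static Dictionary<string, List<int>> Group(IEnumerable<int> series)
    {
        var categories = new (string Name, Func<int, bool> Test)[]
        {
            ("Prime", i => IsPrime(i)),
            ("Odd", i => i % 2 != 0),
            ("Even", i => i % 2 == 0),
        };

        return series
            .SelectMany(i => categories
                .Where(c => c.Test(i))
                .Select(c => (Name: c.Name, Number: i)))
            .GroupBy(p => p.Name, p => p.Number)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        foreach (var pair in Group(Enumerable.Range(1, 10)))
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
    }
}
```

Note that 2 ends up in both the Prime and the Even group, which is the overlap the question asks for.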

This is a good case for Reactive Extensions, since you avoid duplicating values.
In the code below, "series" is enumerated only once, because Publish() turns it into a hot (connectable) source.
The actual enumeration happens when Connect() is called.
using System.Reactive.Linq;
var list = new List<KeyValuePair<string, int>>();
var series= Observable.Range(1, 100).Publish();
series.Where(e => e % 2 == 0).Subscribe(e=>list.Add(new KeyValuePair<string, int>("Even",e)));
series.Where(e => e % 2 == 1).Subscribe(e => list.Add(new KeyValuePair<string, int>("Odd", e)));
series.Where(e => MyStaticMethods.IsPrime(e) ).Subscribe(e => list.Add(new KeyValuePair<string, int>("Prime", e)));
series.Connect();
var result = list.GroupBy(n => n.Key);

Related

Generate all possible coverage options

Suppose I have 2 lists: one containing strings, one containing integers, they differ in length. The application I am building will use these lists to generate combinations of vehicle and coverage areas. Strings represent area names and ints represent vehicle ID's.
My goal is to generate a list of all possible unique combinations used for further investigation. One vehicle can service many areas, but one area can't be served by multiple vehicles. Every area must receive service, and every vehicle must be used.
So to conclude the constraints:
Every area is used only once
Every vehicle is used at least once
No area can be left out.
No vehicle can be left out
Here is an example:
public class record {
public string areaId { get; set; }
public int vehicleId { get; set; }
}
List<string> areas = new List<string>{ "A","B","C","D"};
List<int> vehicles = new List<int>{ 1,2};
List<List<record>> uniqueCombinationLists = retrieveUniqueCombinations(areas,vehicles);
I just have no clue how to make the retrieveUniqueCombinations function. Maybe I am just looking wrong or thinking too hard. I am stuck thinking about massive loops and other brute force approaches. An explanation of a better approach would be much appreciated.
The results should resemble something like this, I think this contains all possibilities for this example.
A1;B1;C1;D2
A1;B1;C2;D1
A1;B2;C1;D1
A2;B1;C1;D1
A2;B2;C2;D1
A2;B2;C1;D2
A2;B1;C2;D2
A1;B2;C2;D2
A2;B1;C1;D2
A1;B2;C2;D1
A2;B2;C1;D1
A1;B1;C2;D2
A2;B1;C2;D1
A1;B2;C1;D2

Here's something I threw together that may or may not work. Borrowing heavily from dtb's work on this answer.
Basically, I generate them all, then remove the ones that don't meet the requirements.
List<string> areas = new List<string> { "A", "B", "C", "D" };
List<int> vehicles = new List<int> { 1, 2 };
var result = retrieveUniqueCombinations(areas, vehicles);
result.ToList().ForEach((recordList) => {
recordList.ToList().ForEach((record) =>
Console.Write("{0}{1};", record.areaId, record.vehicleId));
Console.WriteLine();
});
public IEnumerable<IEnumerable<record>> retrieveUniqueCombinations(IEnumerable<string> areas, IEnumerable<int> vehicles)
{
var items = from a in areas
from v in vehicles
select new record { areaId = a, vehicleId = v };
var result = items.GroupBy(i => i.areaId).CartesianProduct().ToList();
result.RemoveAll((records) =>
records.All(record =>
record.vehicleId == records.First().vehicleId));
return result;
}
public class record
{
public string areaId { get; set; }
public int vehicleId { get; set; }
}
static class Extensions
{
public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item }));
}
}
This produces the following:
A1;B1;C1;D2;
A1;B1;C2;D1;
A1;B1;C2;D2;
A1;B2;C1;D1;
A1;B2;C1;D2;
A1;B2;C2;D1;
A1;B2;C2;D2;
A2;B1;C1;D1;
A2;B1;C1;D2;
A2;B1;C2;D1;
A2;B1;C2;D2;
A2;B2;C1;D1;
A2;B2;C1;D2;
A2;B2;C2;D1;
Note that these are not in the same order as yours, but I'll leave the verification to you. Also, there's likely a better way of doing this (for instance, by putting the logic in the RemoveAll step in the CartesianProduct function), but hey, you get what you pay for ;).
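One caveat about the RemoveAll step: it only rejects combinations in which every area got the same vehicle. That matches "every vehicle is used" when there are exactly two vehicles, but not for three or more. A rough sketch of a more general filter follows, assuming the vehicle IDs are distinct; `Coverage` and `Combinations` are made-up names, and KeyValuePair stands in for the record class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Coverage
{
    // Assigns one vehicle to each area (Cartesian product built with Aggregate),
    // then keeps only the assignments in which every vehicle appears at least once.
    // Assumes the vehicle IDs in 'vehicles' are distinct.
    public static List<List<KeyValuePair<string, int>>> Combinations(
        IList<string> areas, IList<int> vehicles)
    {
        IEnumerable<List<KeyValuePair<string, int>>> seed =
            new[] { new List<KeyValuePair<string, int>>() };

        return areas
            .Aggregate(seed, (acc, area) =>
                from partial in acc
                from v in vehicles
                select new List<KeyValuePair<string, int>>(partial)
                {
                    new KeyValuePair<string, int>(area, v)
                })
            .Where(combo =>
                combo.Select(p => p.Value).Distinct().Count() == vehicles.Count)
            .ToList();
    }

    public static void Main()
    {
        var result = Combinations(
            new List<string> { "A", "B", "C", "D" },
            new List<int> { 1, 2 });

        foreach (var combo in result)
            Console.WriteLine(string.Join(";", combo.Select(p => p.Key + p.Value)));
    }
}
```

This still generates every assignment before filtering, so it is the same brute-force idea; only the filter changes.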

So let's use some helper extension methods to convert numbers to IEnumerable<int> digit sequences in different bases. It may be more efficient to use List<>, but since we are trying to use LINQ:
public static IEnumerable<int> LeadingZeros(this IEnumerable<int> digits, int minLength) {
var dc = digits.Count();
if (dc < minLength) {
for (int j1 = 0; j1 < minLength - dc; ++j1)
yield return 0;
}
foreach (var j2 in digits)
yield return j2;
}
public static IEnumerable<int> ToBase(this int num, int numBase) {
IEnumerable<int> ToBaseRev(int n, int nb) {
do {
yield return n % nb;
n /= nb;
} while (n > 0);
}
foreach (var n in ToBaseRev(num, numBase).Reverse())
yield return n;
}
Now we can create an enumeration that lists all the possible answers (and a few extras). I converted the Lists to Arrays for indexing efficiency.
var areas = new List<string> { "A", "B", "C", "D" };
var vehicles = new List<int> { 1, 2 };
var areasArray = areas.ToArray();
var vehiclesArray = vehicles.ToArray();
var numVehicles = vehiclesArray.Length;
var numAreas = areasArray.Length;
var NumberOfCombos = Convert.ToInt32(Math.Pow(numVehicles, numAreas));
var ansMap = Enumerable.Range(0, NumberOfCombos).Select(n => new { n, nd = n.ToBase(numVehicles).LeadingZeros(numAreas)});
Given the enumeration of the possible combinations, we can convert into areas and vehicles and exclude the ones that don't use all vehicles.
var ans = ansMap.Select(nnd => nnd.nd).Select(m => m.Select((d, i) => new { a = areasArray[i], v = vehiclesArray[d] })).Where(avc => avc.Select(av => av.v).Distinct().Count() == numVehicles);

How do I get total Qty using one linq query?

I have two LINQ queries: one to get confirmedQty and another to get unconfirmedQty.
There is a condition for unconfirmedQty: it should be the average instead of the sum.
result = Sum(confirmedQty) + Avg(unconfirmedQty)
Is there any way to just write one query and get the desired result instead of writing two separate queries?
My Code
class Program
{
static void Main(string[] args)
{
List<Item> items = new List<Item>(new Item[]
{
new Item{ Qty = 100, IsConfirmed=true },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
new Item{ Qty = 40, IsConfirmed=false },
});
int confirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty));
int unconfirmedQty = Convert.ToInt32(items.Where(o => o.IsConfirmed != true).Average(u => u.Qty));
//Output => Total : 140
Console.WriteLine("Total : " + (confirmedQty + unconfirmedQty));
Console.Read();
}
public class Item
{
public int Qty { get; set; }
public bool IsConfirmed { get; set; }
}
}

Actually, the accepted answer enumerates your items collection 2N + 1 times and adds unnecessary complexity to your original solution. If I came across this piece of code
(from t in items
let confirmedQty = items.Where(o => o.IsConfirmed == true).Sum(u => u.Qty)
let unconfirmedQty = items.Where(o => o.IsConfirmed != true).Average(u => u.Qty)
let total = confirmedQty + unconfirmedQty
select new { tl = total }).FirstOrDefault();
it would take me some time to understand what type of data you are projecting items to. Yes, this query is a strange projection. It creates a SelectIterator to project each item of the sequence, then it creates some range variables, which involves iterating items twice, and finally it selects the first projected item. Basically, you have wrapped your original queries in an additional, useless query:
items.Select(i => {
var confirmedQty = items.Where(o => o.IsConfirmed).Sum(u => u.Qty);
var unconfirmedQty = items.Where(o => !o.IsConfirmed).Average(u => u.Qty);
var total = confirmedQty + unconfirmedQty;
return new { tl = total };
}).FirstOrDefault();
The intent is hidden deep in the code, and you still have the same two nested queries. What can you do here? You can simplify your two queries, make them more readable, and show your intent clearly:
int confirmedTotal = items.Where(i => i.IsConfirmed).Sum(i => i.Qty);
// NOTE: Average will throw an exception if there are no unconfirmed items!
double unconfirmedAverage = items.Where(i => !i.IsConfirmed).Average(i => i.Qty);
int total = confirmedTotal + (int)unconfirmedAverage;
If performance is more important than readability, you can calculate the total in a single pass (moved to an extension method for readability):
public static int Total(this IEnumerable<Item> items)
{
int confirmedTotal = 0;
int unconfirmedTotal = 0;
int unconfirmedCount = 0;
foreach (var item in items)
{
if (item.IsConfirmed)
{
confirmedTotal += item.Qty;
}
else
{
unconfirmedCount++;
unconfirmedTotal += item.Qty;
}
}
if (unconfirmedCount == 0)
return confirmedTotal;
// NOTE: Will not throw if there are no unconfirmed items
return confirmedTotal + unconfirmedTotal / unconfirmedCount;
}
Usage is simple:
items.Total();
BTW, the second solution from the accepted answer is not correct. It's just a coincidence that it returns the correct value, because all your unconfirmed items have equal Qty. That solution calculates a sum instead of an average. A solution with grouping would look like:
var total =
items.GroupBy(i => i.IsConfirmed)
.Select(g => g.Key ? g.Sum(i => i.Qty) : (int)g.Average(i => i.Qty))
.Sum();
Here you group the items into two groups, confirmed and unconfirmed. Then you calculate either the sum or the average based on the group key, and sum the two group values. This is neither a readable nor an efficient solution, but it's correct.
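Since equal quantities can hide a sum-versus-average mistake, it helps to check against data where the unconfirmed quantities differ. The sketch below uses made-up sample values (100 confirmed; 10, 20 and 60 unconfirmed) and compares the two-query version with the grouping version; `QtyCheck` and its method names are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Qty { get; set; }
    public bool IsConfirmed { get; set; }
}

public static class QtyCheck
{
    // Two separate queries: sum of confirmed plus truncated average of unconfirmed.
    // Average throws if there are no unconfirmed items.
    public static int TwoQueryTotal(List<Item> items)
    {
        int confirmed = items.Where(i => i.IsConfirmed).Sum(i => i.Qty);
        double average = items.Where(i => !i.IsConfirmed).Average(i => i.Qty);
        return confirmed + (int)average;
    }

    // Grouping version: sum for the confirmed group, average for the other.
    public static int GroupedTotal(List<Item> items)
    {
        return items
            .GroupBy(i => i.IsConfirmed)
            .Select(g => g.Key ? g.Sum(i => i.Qty) : (int)g.Average(i => i.Qty))
            .Sum();
    }

    // Made-up sample data with unequal unconfirmed quantities.
    public static List<Item> Sample()
    {
        return new List<Item>
        {
            new Item { Qty = 100, IsConfirmed = true },
            new Item { Qty = 10, IsConfirmed = false },
            new Item { Qty = 20, IsConfirmed = false },
            new Item { Qty = 60, IsConfirmed = false },
        };
    }

    public static void Main()
    {
        var items = Sample();
        Console.WriteLine(TwoQueryTotal(items)); // 130 = 100 + (10 + 20 + 60) / 3
        Console.WriteLine(GroupedTotal(items));  // 130
    }
}
```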

Fastest way to select distinct values from list based on two properties

I have this list:
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
I'm looking for the fastest way to select distinct objects based on recordNumber, but if there is more than one object with the same recordNumber (here recordNumber=3), I want to select the object based on its name (the name is provided by a parameter).
Thanks.

It looks like you are really after something like:
Dictionary<int, List<myobject>> myDataStructure;
That allows you to quickly retrieve by record number. If the List<myobject> with that dictionary key contains more than one entry, you can then use the name to select the correct one.
Note that if your list is not terribly long, an O(n) check that just scans the list checking for the recordNumber and name may be fast enough, in the sense that other things happening in your program could obscure the list lookup cost. Consider that possibility before over-optimizing lookup times.
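As a rough sketch of that dictionary idea: the index can be built once with GroupBy and ToDictionary, and a lookup falls back to the name only when a record number has several entries. `RecordIndex`, `Build` and `Find` are hypothetical names, and returning null for "not found" is just one possible convention.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class myobject
{
    public string name { get; set; }
    public int recordNumber { get; set; }
}

public static class RecordIndex
{
    // Builds the lookup structure once: record number -> all objects with that number.
    public static Dictionary<int, List<myobject>> Build(IEnumerable<myobject> items)
    {
        return items
            .GroupBy(o => o.recordNumber)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    // Returns the single entry for a record number, or the entry matching 'name'
    // when there are several. Returns null when nothing matches.
    public static myobject Find(
        Dictionary<int, List<myobject>> index, int recordNumber, string name)
    {
        List<myobject> bucket;
        if (!index.TryGetValue(recordNumber, out bucket)) return null;
        if (bucket.Count == 1) return bucket[0];
        return bucket.FirstOrDefault(o => o.name == name);
    }

    public static void Main()
    {
        var index = Build(new[]
        {
            new myobject { name = "n1", recordNumber = 1 },
            new myobject { name = "n2", recordNumber = 2 },
            new myobject { name = "n3", recordNumber = 3 },
            new myobject { name = "n4", recordNumber = 3 },
        });

        Console.WriteLine(Find(index, 3, "n4").name); // n4
        Console.WriteLine(Find(index, 1, "n4").name); // n1 (only one entry, name ignored)
    }
}
```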

Here's the LINQ way of doing this:
Func<IEnumerable<myobject>, string, IEnumerable<myobject>> getDistinct =
(ms, n) =>
ms
.ToLookup(x => x.recordNumber)
.Select(xs => xs.Skip(1).Any()
? xs.Where(x => x.name == n).Take(1)
: xs)
.SelectMany(x => x)
.ToArray();
I just tested this with a 1,000,000 randomly created myobject list and it produced the result in 106ms. That should be fast enough for most situations.

Are you looking for something like this?
class Program
{
static void Main(string[] args)
{
List<myobject> list = new List<myobject>();
list.Add(new myobject { name = "n1", recordNumber = 1 });
list.Add(new myobject { name = "n2", recordNumber = 2 });
list.Add(new myobject { name = "n3", recordNumber = 3 });
list.Add(new myobject { name = "n4", recordNumber = 3 });
//Generates Row Number on the fly
var withRowNumbers = list
.Select((x, index) => new
{
Name = x.name,
RecordNumber = x.recordNumber,
RowNumber = index + 1
}).ToList();
//Generates Row Number with Partition by clause
var withRowNumbersPartitionBy = withRowNumbers
.OrderBy(x => x.RowNumber)
.GroupBy(x => x.RecordNumber)
.Select(g => new { g, count = g.Count() })
.SelectMany(t => t.g.Select(b => b)
.Zip(Enumerable.Range(1, t.count), (j, i) => new { Rn = i, j.RecordNumber, j.Name}))
.Where(i=>i.Rn == 1)
.ToList();
//print the result
withRowNumbersPartitionBy.ToList().ForEach(i => Console.WriteLine("Name = {0} RecordNumber = {1}", i.Name, i.RecordNumber));
Console.ReadKey();
}
}
class myobject
{
public int recordNumber { get; set; }
public string name { get; set; }
}
Result:
Name = n1 RecordNumber = 1
Name = n2 RecordNumber = 2
Name = n3 RecordNumber = 3

Are you looking for a method to do this?
List<myobject> list= new List<myobject>();
list.Add(new myobject{name="n1",recordNumber=1});
list.Add(new myobject{name="n2",recordNumber=2});
list.Add(new myobject{name="n3",recordNumber=3});
list.Add(new myobject{name="n4",recordNumber=3});
public myobject Find(int recordNumber, string name)
{
var matches = list.Where(l => l.recordNumber == recordNumber);
if (matches.Count() == 1)
return matches.Single();
else return matches.Single(m => m.name == name);
}
This will - of course - break if there are multiple matches, or zero matches. You need to write your own edge cases and error handling!

If the name and recordNumber combination is guaranteed to be unique, then you can always use a HashSet<T>.
You can then use recordNumber and name to generate the hash code by using a method described here.
class myobject
{
//override GetHashCode
public override int GetHashCode()
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + recordNumber.GetHashCode();
hash = hash * 23 + name.GetHashCode();
return hash;
}
}
//override Equals
}
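The Equals override is left out above. One possible shape for it, shown only as a sketch, compares the same two members that feed the hash code; the null handling for name is an assumption added here, and `HashSetDemo` is a made-up name for the usage example.

```csharp
using System;
using System.Collections.Generic;

public class myobject
{
    public string name { get; set; }
    public int recordNumber { get; set; }

    // Two objects are equal when both recordNumber and name match.
    public override bool Equals(object obj)
    {
        var other = obj as myobject;
        return other != null
            && recordNumber == other.recordNumber
            && name == other.name;
    }

    // Must be consistent with Equals: built from the same two members.
    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = 17;
            hash = hash * 23 + recordNumber.GetHashCode();
            hash = hash * 23 + (name == null ? 0 : name.GetHashCode());
            return hash;
        }
    }
}

public static class HashSetDemo
{
    public static void Main()
    {
        var set = new HashSet<myobject>();
        Console.WriteLine(set.Add(new myobject { name = "n3", recordNumber = 3 })); // True
        Console.WriteLine(set.Add(new myobject { name = "n3", recordNumber = 3 })); // False
        Console.WriteLine(set.Add(new myobject { name = "n4", recordNumber = 3 })); // True
    }
}
```

Because the hash depends on mutable properties, objects should not be modified while they sit in the set.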

IEnumerable<Object> Data Specific Ordering

I have a collection of objects with an ID property whose values are between 101 and 199. How can I order it like 199, 101, 102 ... 198?
In other words, I want to move the last item to the front.

The desired ordering makes no sense (some reasoning would be helpful), but this should do the trick:
int maxID = items.Max(x => x.ID); // If you want the Last item instead of the one
// with the greatest ID, you can use
// items.Last().ID instead.
var strangelyOrderedItems = items
.OrderBy(x => x.ID == maxID ? 0 : 1)
.ThenBy(x => x.ID);
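To see the two-level sort in action, here is a small runnable sketch that applies the same OrderBy/ThenBy idea to plain ints instead of objects with an ID property; `LastFirstOrdering` and `MaxFirst` are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastFirstOrdering
{
    // Puts the largest value first and sorts the remainder ascending.
    public static List<int> MaxFirst(IEnumerable<int> ids)
    {
        var list = ids.ToList();
        int max = list.Max(); // throws on an empty sequence
        return list
            .OrderBy(id => id == max ? 0 : 1) // the maximum sorts before everything else
            .ThenBy(id => id)                 // the rest in ascending order
            .ToList();
    }

    public static void Main()
    {
        var ordered = MaxFirst(Enumerable.Range(101, 99)); // 101..199
        Console.WriteLine(string.Join(",", ordered.Take(3))); // 199,101,102
    }
}
```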

Depending on whether you are interested in the largest item in the list or the last item in the list:
internal sealed class Object : IComparable<Object>
{
private readonly int mID;
public int ID { get { return mID; } }
public Object(int pID) { mID = pID; }
public static implicit operator int(Object pObject) { return pObject.mID; }
public static implicit operator Object(int pInt) { return new Object(pInt); }
public int CompareTo(Object pOther) { return mID - pOther.mID; }
public override string ToString() { return string.Format("{0}", mID); }
}
List<Object> myList = new List<Object> { 1, 2, 6, 5, 4, 3 };
// the last item first
List<Object> last = new List<Object> { myList.Last() };
List<Object> lastFirst =
last.Concat(myList.Except(last).OrderBy(x => x)).ToList();
lastFirst.ForEach(Console.Write);
Console.WriteLine();
// outputs: 312456
// or
// the largest item first
List<Object> max = new List<Object> { myList.Max() };
List<Object> maxFirst =
max.Concat(myList.Except(max).OrderBy(x => x)).ToList();
maxFirst.ForEach(Console.Write);
Console.WriteLine();
// outputs: 612345

Edit: missed the part about you wanting the last item first. You could do it like this :
var objectList = new List<DataObject>();
var lastob = objectList.Last();
objectList.Remove(lastob);
var newList = new List<DataObject>();
newList.Add(lastob);
newList.AddRange(objectList.OrderBy(o => o.ID).ToList());
If you are talking about a normal sorting you could use linq's order by method like this :
objectList = objectList.OrderBy(ob => ob.ID).ToList();

In result I want to put last item to first
First, sort the list:
List<int> values = new List<int>{100, 56, 89..};
var result = values.OrderBy(x=>x).ToList(); // ToList() so that Swap and Count are available
Then add an extension method for swapping elements in a List<T>:
static void Swap<T>(this List<T> list, int index1, int index2)
{
T temp = list[index1];
list[index1] = list[index2];
list[index2] = temp;
}
Then use it (note that this also moves the first item to the end):
result.Swap(0, result.Count - 1);

You can achieve this using a single LINQ statement.
var ordering = testData
.OrderByDescending(t => t.Id)
.Take(1)
.Union(testData.OrderBy(t => t.Id).Take(testData.Count() - 1));
Order it in reverse direction and take the top 1, then order it the "right way round" and take all but the last and union these together. There are quite a few variants of this approach, but the above should work.
This approach should work for arbitrary lists too, without the need to know the max number.

How about
var orderedItems = items.OrderBy(x => x.Id);
var orderedItemsLastFirst =
orderedItems.Reverse().Take(1).Concat(orderedItems.Take(orderedItems.Count() - 1)); // all but the last item
This will iterate the list several times, so it could perhaps be more efficient, but it doesn't use much code.
If more speed is important you could write a specialised IEnumerable extension that would allow you to sort and return without converting to an intermediate IEnumerable.

var myList = new List<MyObject>();
//initialize the list
var ordered = myList.OrderBy(c => c.Id); //or use OrderByDescending if you want reverse order

Query a list for only duplicates

I have a List of type string in a .NET 3.5 project. The list has thousands of strings in it, but for the sake of brevity we're going to say that it just has 5 strings in it.
List<string> lstStr = new List<string>() {
"Apple", "Banana", "Coconut", "Coconut", "Orange"};
Assume that the list is sorted (as you can tell above). What I need is a LINQ query that will remove all strings that are not duplicates. So the result would leave me with a list that only contains the two "Coconut" strings.
Is this possible to do with a LINQ query? If it is not then I'll have to resort to some complex for loops, which I can do, but I didn't want to unless I had to.

Here is code for finding duplicates (shown with an int array; the same query works for a string list):
int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
var duplicates = listOfItems
.GroupBy(i => i)
.Where(g => g.Count() > 1)
.Select(g => g.Key);
foreach (var d in duplicates)
Console.WriteLine(d);
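The query above yields each duplicated value once. The question asks to keep every occurrence (both "Coconut" strings), which can be sketched by flattening the groups instead of selecting their keys; `DuplicateFilter` and `OnlyDuplicates` are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFilter
{
    // Keeps every occurrence of each value that appears more than once.
    // Does not require the input to be sorted.
    public static List<string> OnlyDuplicates(IEnumerable<string> source)
    {
        return source
            .GroupBy(s => s)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g) // flatten the groups back into individual strings
            .ToList();
    }

    public static void Main()
    {
        var lstStr = new List<string> { "Apple", "Banana", "Coconut", "Coconut", "Orange" };
        Console.WriteLine(string.Join(", ", OnlyDuplicates(lstStr))); // Coconut, Coconut
    }
}
```

Enumerable.GroupBy is available in .NET 3.5, so this should fit the project constraints mentioned in the question.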

var dupes = lstStr.Where(x => lstStr.Sum(y => y==x ? 1 : 0) > 1);
OR
var dupes = lstStr.Where((x,i) => ( (i > 0 && x==lstStr[i-1])
|| (i < lstStr.Count-1 && x==lstStr[i+1])));
Note that the first one enumerates the list for every element which takes O(n²) time (but doesn't assume a sorted list). The second one is O(n) (and assumes a sorted list).

This should work, and is O(N) rather than the O(N^2) of the other answers. (Note that this relies on the list being sorted, so that really is a requirement.)
// Extension methods must be static and declared inside a static class
public static IEnumerable<T> OnlyDups<T>(this IEnumerable<T> coll)
where T: IComparable<T>
{
IEnumerator<T> iter = coll.GetEnumerator();
if (iter.MoveNext())
{
T last = iter.Current;
while(iter.MoveNext())
{
if (iter.Current.CompareTo(last) == 0)
{
yield return last;
do
{
yield return iter.Current;
}
while(iter.MoveNext() && iter.Current.CompareTo(last) == 0);
}
last = iter.Current;
}
}
}
Use it like this:
IEnumerable<string> onlyDups = lstStr.OnlyDups();
or
List<string> onlyDups = lstStr.OnlyDups().ToList();

var temp = new List<string>();
foreach(var item in list)
{
var stuff = (from m in list
where m == item
select m);
if (stuff.Count() > 1)
{
temp.Add(item); // Concat returns IEnumerable<string> and would add every match again on each pass
}
}
