I've got List of sctructs. In struct there is field x. I would like to select those of structs, which are rather close to each other by parameter x. In other words, I'd like to clusterise them by x.
I guess, there should be one-line solution.
Thanks in advance.
If I understood correctly what you want, then you might need to sort your list by the structure's field X.
Look at the GroupBy extension method:
var items = mylist.GroupBy(c => c.X);
This article gives a lot of examples using group by.
If you're doing graph-style clustering, the easiest way to do it is by building up a list of clusters which is initially empty. Then loop over the input and, for each value, find all of the clusters which have at least one element which is close to the current value. All those clusters should then be merged together with the value. If there aren't any, then the value goes into a cluster all by itself.
Here is some sample code for how to do it with a simple list of integers.
IEnumerable<int> input;
int threshold;
List<List<int>> clusters = new List<List<int>>();
foreach(var current in input)
{
// Search the current list of clusters for ones which contain at least one
// entry such that the difference between it and x is less than the threshold
var matchingClusters =
clusters.Where(
cluster => cluster.Any(
val => Math.Abs(current - val) <= threshold)
).ToList();
// Merge all the clusters that were found, plus x, into a new cluster.
// Replace all the existing clusters with this new one.
IEnumerable<int> newCluster = new List<int>(new[] { current });
foreach (var match in matchingClusters)
{
clusters.Remove(match);
newCluster = newCluster.Concat(match);
}
clusters.Add(newCluster.ToList());
}
Related
So I have a Visualstudio Forms where I have a NumericUpDown function that will allow users to input a 5 digit number such as 09456. And I need to be able to compare that number to an already existing array of similar 5 digit numbers, so essentially I need to get the inputted number and find the closest number to that.
var numbers = new List<float> {89456f, 23467f, 86453f, };
// the list is way longer but you get the idea
var target = numericUpDown.3 ;
var closest = numbers.Select(n => new { n, (n - target) })
.OrderBy(p => p.distance)
.First().n;
But the first problem I encounter is that I cannot use a "-" operation on a float. Is there any way I can avoid that error and be able to still find the closest input?
Anonymous type members need names, and you need to use the absolute value of the difference. eg
var numbers = new List<float> { 89456f, 23467f, 86453f, };
var target = 3;
var closest = numbers.Select(n => new { n, distance = Math.Abs(n - target) })
.OrderBy(p => p.distance)
.First().n;
Well, apart from some issues in your sample(like no distance property on float) it should work:
int target = 55555;
float closest = numbers.OrderBy(f => Math.Abs(f - target)).First();
Demo: https://dotnetfiddle.net/gqS50L
The answers that use OrderBy are correct, but have less than optimal performance. OrderBy is an O(N log N) operation, but why sort the whole collection when you only need the top element? By contrast, MinBy will give you the result in O(N) time:
var closest = numbers.MinBy(n => Math.Abs(n - target));
Apart from the compilation errors, using LINQ for this is very slow and time consuming. The entire list has to be scanned once to find the distance, then it needs to be sorted, which scans it all over again and caches the results before returning them in order.
Before .NET 6
A faster way would be to iterate only once, calculating the distance of the current item from the target, and keep track of which number is closest. That's how eg Min and Max work.
public static float? Closest(this IEnumerable<float> list, float target)
{
float? closest=null;
float bestDist=float.MaxValue;
foreach(var n in list)
{
var dist=Math.Abs(n-target);
if (dist<bestDist)
{
bestDist=dist;
closest=n;
}
}
return closest;
}
This will return the closest number in a single pass.
var numbers = new List<float> { 89456f, 23467f, 86453f, };
var closest=numbers.Closest(20000);
Console.WriteLine($"Closest is {closest}");
------------------
Closest is 23467
Using MoreLINQ and MinBy
The same can be done in a single line using the MinBy extension method from the MoreLINQ library:
var closest=numbers.MinBy(n=>Math.Abs(n-target));
Using MinBy
In .NET 6 and later, Enumerable.MinBy was added to the BCL:
var closest=numbers.MinBy(n=>Math.Abs(n-target));
The code is similar to the explicit loop once you look past the generic key selectors and comparers :
while (e.MoveNext())
{
TSource nextValue = e.Current;
TKey nextKey = keySelector(nextValue);
if (nextKey != null && comparer.Compare(nextKey, key) < 0)
{
key = nextKey;
value = nextValue;
}
}
What I am trying to achieve is to add one item to a List, multiple times without using a loop.
I am going to add 50 numbers to a List and want all of those number to be equal to, let's say, 42. I am aware that I can simply create a small loop that runs 50 times and adds the same item over and over again, as such;
List<int> listFullOfInts = new List<int>();
int addThis = 42;
for(int i = 0; i < 50; i++)
listFullOfInts.Add(addThis);
What I am trying to do is something on the lines of;
listFullOfInts.AddRange(addThis, 50);
Or something that is similar to this at least, maybe using Linq? I have a vague memory of seeing how to do this but am unable to find it. Any ideas?
You can use Repeat:
List<int> listFullOfInts = Enumerable.Repeat(42, 50).ToList();
Demo
If you already have a list and you don't want to create a new one with ToList:
listFullOfInts.AddRange(Enumerable.Repeat(42, 50));
If you want to do add reference types without repeating the same reference, you can use Enumerable.Range+Select:
List<SomeClass> itemList = Enumerable.Range(0, 50)
.Select(i => new SomeClass())
.ToList();
You can't do it directly with LINQ since LINQ is side effect free but you can use some of what's found in the System.linq namespace to build the required.
public static void AddRepeated<T>(this List<T> self,T item, int count){
var temp = Enumerable.Repeat(item,count);
self.AddRange(temp);
}
you can then use that as you propose in your post
listFullOfInts.AddRepeated(addThis, 50);
IMPORTANT NOTE
To the people who flagged this as a duplicate, please understand we do NOT want a LINQ-based solution. Our real-world example has several original lists in the tens-of-thousands range and LINQ-based solutions are not performant enough for our needs since they have to walk the lists several times to perform their function, expanding with each new source list.
That is why we are specifically looking for a non-LINQ algorithm, such as the one suggested in this answer below where they walk all lists simultaneously, and only once, via enumerators. That seems to be the best so far, but I am wondering if there are others.
Now back to the question...
For the sake of explaining our issue, consider this hypothetical problem:
I have multiple lists, but to keep this example simple, let's limit it to two, ListA and ListB, both of which are of type List<int>. Their data is as follows:
List A List B
1 2
2 3
4 4
5 6
6 8
8 9
9 10
...however the real lists can have tens of thousands of rows.
We next have a class called ListPairing that's simply defined as follows:
public class ListPairing
{
public int? ASide{ get; set; }
public int? BSide{ get; set; }
}
where each 'side' parameter really represents one of the lists. (i.e. if there were four lists, it would also have a CSide and a DSide.)
We are trying to do is construct a List<ListPairing> with the data initialized as follows:
A Side B Side
1 -
2 2
- 3
4 4
5 -
6 6
8 8
9 9
- 10
Again, note there is no row with '7'
As you can see, the results look like a full outer join. However, please see the update below.
Now to get things started, we can simply do this...
var finalList = ListA.Select(valA => new ListPairing(){ ASide = valA} );
Which yields...
A Side B Side
1 -
2 -
4 -
5 -
6 -
8 -
9 -
and now we want to go back-fill the values from List B. This requires checking first if there is an already existing ListPairing with ASide that matches BSide and if so, setting the BSide.
If there is no existing ListPairing with a matching ASide, a new ListPairing is instantiated with only the BSide set (ASide is blank.)
However, I get the feeling that's not the most efficient way to do this considering all of the required 'FindFirst' calls it would take. (These lists can be tens of thousands of items long.)
However, taking a union of those lists once up front yields the following values...
1, 2, 3, 4, 5, 6, 8, 9, 10 (Note there is no #7)
My thinking was to somehow use that ordered union of the values, then 'walking' both lists simultaneously, building up ListPairings as needed. That eliminates repeated calls to FindFirst, but I'm wondering if that's the most efficient way to do this.
Thoughts?
Update
People have suggested this is a duplicate of getting a full outer join using LINQ because the results are the same...
I am not after a LINQ full outer join. I'm after a performant algorithm.
As such, I have updated the question.
The reason I bring this up is the LINQ needed to perform that functionality is much too slow for our needs. In our model, there are actually four lists, and each can be in the tens of thousands of rows. That's why I suggested the 'Union' approach of the IDs at the very end to get the list of unique 'keys' to walk through, but I think the posted answer on doing the same but with the enumerators is an even better approach as you don't need the list of IDs up front. This would yield a single pass through all items in the lists simultaneously which would easily out-perform the LINQ-based approach.
This didn't turn out as neat as I'd hoped, but if both input lists are sorted then you can just walk through them together comparing the head elements of each one: if they're equal then you have a pair, else emit the smallest one on its own and advance that list.
public static IEnumerable<ListPairing> PairUpLists(IEnumerable<int> sortedAList,
IEnumerable<int> sortedBList)
{
// Should wrap these two in using() per Servy's comment with braces around
// the rest of the method.
var aEnum = sortedAList.GetEnumerator();
var bEnum = sortedBList.GetEnumerator();
bool haveA = aEnum.MoveNext();
bool haveB = bEnum.MoveNext();
while (haveA && haveB)
{
// We still have values left on both lists.
int comparison = aEnum.Current.CompareTo(bEnum.Current);
if (comparison < 0)
{
// The heads of the two remaining sequences do not match and A's is
// lower. Generate a partial pair with the head of A and advance the
// enumerator.
yield return new ListPairing() {ASide = aEnum.Current};
haveA = aEnum.MoveNext();
}
else if (comparison == 0)
{
// The heads of the two sequences match. Generate a pair.
yield return new ListPairing() {
ASide = aEnum.Current,
BSide = bEnum.Current
};
// Advance both enumerators
haveA = aEnum.MoveNext();
haveB = bEnum.MoveNext();
}
else
{
// No match and B is the lowest. Generate a partial pair with B.
yield return new ListPairing() {BSide = bEnum.Current};
// and advance the enumerator
haveB = bEnum.MoveNext();
}
}
if (haveA)
{
// We still have elements on list A but list B is exhausted.
do
{
// Generate a partial pair for all remaining A elements.
yield return new ListPairing() { ASide = aEnum.Current };
} while (aEnum.MoveNext());
}
else if (haveB)
{
// List A is exhausted but we still have elements on list B.
do
{
// Generate a partial pair for all remaining B elements.
yield return new ListPairing() { BSide = bEnum.Current };
} while (bEnum.MoveNext());
}
}
var list1 = new List<int?>(){1,2,4,5,6,8,9};
var list2 = new List<int?>(){2,3,4,6,8,9,10};
var left = from i in list1
join k in list2 on i equals k
into temp
from k in temp.DefaultIfEmpty()
select new {a = i, b = (i == k) ? k : (int?)null};
var right = from k in list2
join i in list1 on k equals i
into temp
from i in temp.DefaultIfEmpty()
select new {a = (i == k) ? i : (int?)i , b = k};
var result = left.Union(right);
If you need the ordering to be same as your example, then you will need to provide an index and order by that (then remove duplicates)
var result = left.Select((o,i) => new {o.a, o.b, i}).Union(right.Select((o, i) => new {o.a, o.b, i})).OrderBy( o => o.i);
result.Select( o => new {o.a, o.b}).Distinct();
I have a list, with even number of nodes (always even). My task is to "match" all the nodes in the least costly way.
So I could have listDegree(1,4,5,6), which represents all the odd-degree nodes in my graph. How can I pair the nodes in the listDegree, and save the least costly combination to a variable, say int totalCost.
Something like this, and I return the least totalCost amount.
totalCost = (1,4) + (5,6)
totalCost = (1,5) + (4,6)
totalCost = (1,6) + (4,5)
--------------- More details (or a rewriting of the upper) ---------------
I have a class, that read my input-file and store all the information I need, like the costMatrix for the graph, the edges, number of edges and nodes.
Next i have a DijkstrasShortestPath algorithm, which computes the shortest path in my graph (costMatrix) from a given start node to a given end node.
I also have a method that examines the graph (costMatrix) and store all the odd-degree nodes in a list.
So what I was looking for, was some hints to how I can pair all the odd-degree nodes in the least costly way (shortest path). To use the data I have is easy, when I know how to combine all the nodes in the list.
I dont need a solution, and this is not homework.
I just need a hint to know, when you have a list with lets say integers, how you can combine all the integers pairwise.
Hope this explenation is better... :D
Perhaps:
List<int> totalCosts = listDegree
.Select((num,index) => new{num,index})
.GroupBy(x => x.index / 2)
.Select(g => g.Sum(x => x.num))
.ToList();
Demo
Edit:
After you've edited your question i understand your requirement. You need a total-sum of all (pairwise) combinations of all elements in a list. I would use this combinatorics project which is quite efficient and informative.
var listDegree = new[] { 1, 4, 5, 6 };
int lowerIndex = 2;
var combinations = new Facet.Combinatorics.Combinations<int>(
listDegree,
lowerIndex,
Facet.Combinatorics.GenerateOption.WithoutRepetition
);
// get total costs overall
int totalCosts = combinations.Sum(c => c.Sum());
// get a List<List<int>> of all combination (the inner list count is 2=lowerIndex since you want pairs)
List<List<int>> allLists = combinations.Select(c => c.ToList()).ToList();
// output the result for demo purposes
foreach (IList<int> combis in combinations)
{
Console.WriteLine(String.Join(" ", combis));
}
(Without more details on the cost, I am going to assume cost(1,5) = 1-5, and you want the sum to get as closest as possible to 0.)
You are describing the even partition problem, which is NP-Complete.
The problem says: Given a list L, find two lists A,B such that sum(A) = sum(B) and #elements(A) = #elements(B), with each element from L must be in A or B (and never both).
The reduction to your problem is simple, each left element in the pair will go to A, and each right element in each pair will go to B.
Thus, there is no known polynomial solution to the problem, but you might want to try exponential exhaustive search approaches (search all possible pairs, there are Choose(2n,n) = (2n!)/(n!*n!) of those).
An alternative is pseudo-polynomial DP based solutions (feasible for small integers).
I have a scenario at work where we have several different tables of data in a format similar to the following:
Table Name: HingeArms
Hght Part #1 Part #2
33 S-HG-088-00 S-HG-089-00
41 S-HG-084-00 S-HG-085-00
49 S-HG-033-00 S-HG-036-00
57 S-HG-034-00 S-HG-037-00
Where the first column (and possibly more) contains numeric data sorted ascending and represents a range to determine the proper record of data to get (e.g. height <= 33 then Part 1 = S-HG-088-00, height <= 41 then Part 1 = S-HG-084-00, etc.)
I need to lookup and select the nearest match given a specified value. For example, given a height = 34.25, I need to get second record in the set above:
41 S-HG-084-00 S-HG-085-00
These tables are currently stored in a VB.NET Hashtable "cache" of data loaded from a CSV file, where the key for the Hashtable is a composite of the table name and one or more columns from the table that represent the "key" for the record. For example, for the above table, the Hashtable Add for the first record would be:
ht.Add("HingeArms,33","S-HG-088-00,S-HG-089-00")
This seems less than optimal and I have some flexibility to change the structure if necessary (the cache contains data from other tables where direct lookup is possible... these "range" tables just got dumped in because it was "easy"). I was looking for a "Next" method on a Hashtable/Dictionary to give me the closest matching record in the range, but that's obviously not available on the stock classes in VB.NET.
Any ideas on a way to do what I'm looking for with a Hashtable or in a different structure? It needs to be performant as the lookup will get called often in different sections of code. Any thoughts would be greatly appreciated. Thanks.
A hashtable is not a good data structure for this, because items are scattered around the internal array according to their hash code, not their values.
Use a sorted array or List<T> and perform a binary search, e.g.
Setup:
var values = new List<HingeArm>
{
new HingeArm(33, "S-HG-088-00", "S-HG-089-00"),
new HingeArm(41, "S-HG-084-00", "S-HG-085-00"),
new HingeArm(49, "S-HG-033-00", "S-HG-036-00"),
new HingeArm(57, "S-HG-034-00", "S-HG-037-00"),
};
values.Sort((x, y) => x.Height.CompareTo(y.Height));
var keys = values.Select(x => x.Height).ToList();
Lookup:
var index = keys.BinarySearch(34.25);
if (index < 0)
{
index = ~index;
}
var result = values[index];
// result == { Height = 41, Part1 = "S-HG-084-00", Part2 = "S-HG-085-00" }
You can use a sorted .NET array in combination with Array.BinarySearch().
If you get a non negative value this is the index of exact match.
Otherwise, if result is negative use formula
int index = ~Array.BinarySearch(sortedArray, value) - 1
to get index of previous "nearest" match.
The meaning of nearest is defined by a comparer you use. It must be the same you used when sorting the array. See:
http://gmamaladze.wordpress.com/2011/07/22/back-to-the-roots-net-binary-search-and-the-meaning-of-the-negative-number-of-the-array-binarysearch-return-value/
How about LINQ-to-Objects (This is by no means meant to be a performant solution, btw.)
var ht = new Dictionary<string, string>();
ht.Add("HingeArms,33", "S-HG-088-00,S-HG-089-00");
decimal wantedHeight = 34.25m;
var foundIt =
ht.Select(x => new { Height = decimal.Parse(x.Key.Split(',')[1]), x.Key, x.Value }).Where(
x => x.Height < wantedHeight).OrderBy(x => x.Height).SingleOrDefault();
if (foundIt != null)
{
// Do Something with your item in foundIt
}