NTILE function equivalent in C# - c#

I need to implement the following SQL in C# Linq:
SELECT NTILE (3) OVER (ORDER BY TransactionCount DESC) AS A...
I couldn't find any answer to a similar problem except this. However I don't think that is what I am looking for.
I don't even know where to start. If anyone could please give me at least a starting point, I'd appreciate it.
-- EDIT --
Trying to explain a little better.
I have one Store X with Transactions, Items, Units and other data that I retrieve from SQL and store in an object in C#.
I have a list of all stores with the same data but in this case I retrieve it from Analysis Services due to the large amount of data retrieved (and other reasons) and I store all of it in another object in C#.
So what I need is to order the list and find out if store X is in the top quartile of that list or second or third...
I hope that helps to clarify what I am trying to achieve.
Thank you

I believe that there is no simple LINQ equivalent of NTILE(n). Anyway, depending on your needs it's not that hard to write one.
The T-SQL documentation says
Distributes the rows in an ordered partition into a specified number of groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs.
(see here)
For a very crude implementation of NTILE you can use GroupBy. The following example uses an int[] for the sake of simplicity, but of course you are not restricted to that:
int n = 4;
int[] data = { 5, 2, 8, 2, 3, 8, 3, 2, 9, 5 };
var ntile = data.OrderBy(value => value)
.Select((value,index) => new {Value = value, Index = index})
.GroupBy(c => Math.Floor(c.Index / (data.Count() / (double)n)), c => c.Value);
First, our data is ordered ascending by its values. If you are not using simple ints, the key selector could be something like store => store.Revenue (assuming you'd like to get the quantiles by revenue of the stores). Furthermore, we project the ordered data into an anonymous type to include the indices. This is necessary because the indices are needed for grouping, and GroupBy does not appear to support index-aware lambdas the way Select does.
The third line is a bit less intuitive, but I'll try to explain: the NTILE function assigns each row to a group. To create n groups, we divide N (the number of items) by n to get the items per group, and then divide the current index by that to determine which group the current item belongs to. To get the number of groups right I had to make the number of items per group fractional and floor the calculated group number, but admittedly this is rather empirical.
ntile will contain n groups, each with a Key equal to the zero-based group number. Each group is enumerable. To determine whether an element is in the second quartile, check whether the group returned by ntile.Where(g => g.Key == 1) contains the element.
Remarks: The method I've used to determine the group may need some fine adjustment.
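As a possible refinement, here is a minimal sketch that follows the T-SQL rule that, when the rows do not divide evenly, the larger groups come first. The names NtileSketch, Ntile and NtileDemo are hypothetical helpers, not part of LINQ, and the input is assumed to be ordered already. Tile numbers are 1-based, as in SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NtileSketch
{
    // Hypothetical helper (not part of LINQ): pairs each item of an already
    // ordered sequence with its 1-based tile number, similar to NTILE(n).
    public static IEnumerable<(T Item, int Tile)> Ntile<T>(this IEnumerable<T> ordered, int n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        return Iterate(ordered.ToList(), n);
    }

    private static IEnumerable<(T Item, int Tile)> Iterate<T>(List<T> items, int n)
    {
        int baseSize = items.Count / n;   // minimum number of rows per tile
        int remainder = items.Count % n;  // the first 'remainder' tiles get one extra row
        int index = 0;
        for (int tile = 1; tile <= n; tile++)
        {
            int size = baseSize + (tile <= remainder ? 1 : 0);
            for (int k = 0; k < size; k++)
                yield return (items[index++], tile);
        }
    }
}

public static class NtileDemo
{
    public static void Main()
    {
        int[] transactionCounts = { 5, 2, 8, 2, 3, 8, 3, 2, 9, 5 };

        // Equivalent in spirit to NTILE(4) OVER (ORDER BY TransactionCount DESC).
        foreach (var (count, tile) in transactionCounts.OrderByDescending(c => c).Ntile(4))
            Console.WriteLine($"{count} -> tile {tile}");
    }
}
```

With ten values and four tiles this gives group sizes 3, 3, 2, 2. To find the quartile of store X, order the list of stores, call Ntile(4), and look up the Tile of the entry that matches store X.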

You can do it using GroupBy function by grouping based on index of the object. Consider a list of integers like this:-
List<int> numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8 };
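You can first project the Index of all elements using Select and finally group by their respective index. While calculating the Index we can divide it by the NTILE value (3 in this case). Note that this fixes the group size at 3 rather than the number of groups; with these eight elements it happens to produce three groups (of 3, 3 and 2 elements):-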
var result = numbers.Select((v, i) => new { Value = v, Index = i / 3 })
.GroupBy(x => x.Index)
.Select(x => x.Select(z => z.Value).ToList());
Fiddle.

Related

Merge (add values together, reduce to single element) all pairs in List (except for first and last pair)

Goal
Reduce the total length of a list by merging pairs in the list (except the first and last pair) into single elements.
Visual Example
I have a single dimension list, which looks somewhat like this:
Pair  | (A)        | (B)        | (C)        | (D)
Index | 0    1     | 2    3     | 4    5     | 6    7
Value | 12.0 10.0  | 19.0 34.0  | 16.0 12.0  | 99.0 68.0
An example of what I would like the list (or a new list) to look like instead:
Pair  | (A)        | (B)        | (C)
Index | 0    1     | 2    3     | 4    5
Value | 12.0 10.0  | 53.0 28.0  | 99.0 68.0
Still a single dimension list but the total length has been reduced by merging pairs of elements (ignoring the first and last pair) into a single element (instead of a pair of elements), with a new value calculated by adding the former pairs values together.
Theory
Using some combination of GetRange, Select, Aggregate and Where to either alter the original list, or return a new list somehow.
In-closing
I'd like to apologise for the dodgy wording, and formatting of my question: I'm obviously out of my depth in what I'm trying to achieve - any help would be greatly appreciated.
Cheers.
Sounds fairly simple to do with a loop
var outList = new List<int>();
outList.Add(inList[0]);
outList.Add(inList[1]);
for(int x = 2; x < inList.Count -2;x+=2){
outList.Add(inList[x] + inList[x+1]);
}
outList.Add(inList[^2]);
outList.Add(inList[^1]);
You take the first two elements unchanged. You then run a loop that steps in twos, summing each pair of elements and adding the result to outList, stopping short of the last two entries. Finally, you add those last two verbatim to outList.
If the intent was to modify the original list, work backwards in twos between the end and the start
for(int x = list.Count - 4; x >= 2;x-=2){
list[x] += list[x+1];
list.RemoveAt(x+1);
}
By working backwards, our manipulations (removing from the list) don't affect the elements we have yet to process as the list shortens, which makes the logic a bit easier.
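For reference, here is a self-contained sketch of the first loop applied to the values from the question. It assumes the list holds doubles (the question's values are written as 12.0, 10.0, ...) and that it has an even number of elements, at least four. MergePairsSketch and MergeInnerPairs are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class MergePairsSketch
{
    // Hypothetical helper: keeps the first and last pair as they are and
    // replaces every pair in between with the sum of its two elements.
    public static List<double> MergeInnerPairs(IReadOnlyList<double> input)
    {
        if (input.Count < 4 || input.Count % 2 != 0)
            throw new ArgumentException("Expected an even number of elements, at least 4.", nameof(input));

        var output = new List<double> { input[0], input[1] };
        for (int x = 2; x < input.Count - 2; x += 2)
            output.Add(input[x] + input[x + 1]);
        output.Add(input[input.Count - 2]);
        output.Add(input[input.Count - 1]);
        return output;
    }

    public static void Main()
    {
        var merged = MergeInnerPairs(new[] { 12.0, 10.0, 19.0, 34.0, 16.0, 12.0, 99.0, 68.0 });
        Console.WriteLine(string.Join(", ", merged));
    }
}
```

For the question's input this produces the six values 12, 10, 53, 28, 99, 68, which matches the desired list.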
--
LINQ's a hammer; not every problem is a nail. If you can only accept LINQ, it could look like:
list.Take(2).Concat(
list.Skip(2)
.Take(list.Count-4)
.Select((e, i) => new { E = e, N = i/2 })
.GroupBy(x => x.N, x => x.E)
.Select(g => g.Sum())
).Concat(
list.Skip(list.Count-2)
).ToList();
Or with ranges (this requires a type that supports range indexing, such as an array):
list[..2].Concat(
list[2..^2]
.Select((e, i) => new { E = e, N = i/2 })
.GroupBy(x => x.N, x => x.E)
.Select(g => g.Sum())
).Concat(
list[^2..]
).ToList();
Take the first 2, add on the result of taking the middle N values, projecting them together with their index, grouping on index/2 and summing each resulting group, then Concat on the last 2. I don't like it as much as the loops. I can think of other LINQ ways too, but none I really like.

I am trying to get a list of random numbers which I want to not be repeated

I have an ArrayList of names which get added one by one. The range of the random number is the number of people inside the ArrayList.
For example:
NameList => [James, Vince, Joe, Joseph, John]
I want the output to be
NameListNum => [James 3, Vince 2, Joe 5, Joseph 1, John 4]
or
NameListNum => [James 2, Vince 5, Joe 1, Joseph 4, John 3]
foreach (var name in nameList)
{
counter++;
int randomNum = rand.Next(Decimal.ToInt32(numOfShooters))+1;
nameListNum.Add(name + " "+randomNum);
foreach (var item in nameListNum)
{
}
}
I don't know if I am going in the right direction, but the second foreach loop would be the one that checks the other nameListNum strings, regenerates a random number and rewrites it to the name.
Given that it is very easy to find the code to Shuffle a list of generated numbers randomly, the code is as easy as
var namesList = new []{"James", "Vince", "Joe", "Joseph", "John"};
var numsList = Enumerable.Range(1,namesList.Length).ToList().Shuffle();
var namesNumsList = namesList.Select( (n,i) => $"{n} {numsList[i]}").ToList();
Live example: https://dotnetfiddle.net/MzOwQa
If you want to randomise the names make them a List<string>:
var namesList = new List<string>{"James", "Vince", "Joe", "Joseph", "John"}.Shuffle();
The only other change is that you'll need namesList.Count on the following line in place of namesList.Length
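Shuffle() is not a built-in List<T> method, so the snippets above presumably rely on an extension method such as the one in the linked fiddle. A minimal sketch of what it might look like, using a Fisher-Yates shuffle, follows. ShuffleSketch and ShuffleDemo are hypothetical names, and returning the same list instance (so the call can be chained) is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShuffleSketch
{
    private static readonly Random Rng = new Random();

    // Hypothetical extension: shuffles the list in place (Fisher-Yates)
    // and returns the same instance so that calls can be chained.
    public static List<T> Shuffle<T>(this List<T> list)
    {
        for (int i = list.Count - 1; i > 0; i--)
        {
            int j = Rng.Next(i + 1); // random index in [0, i]
            (list[i], list[j]) = (list[j], list[i]);
        }
        return list;
    }
}

public static class ShuffleDemo
{
    public static void Main()
    {
        var namesList = new[] { "James", "Vince", "Joe", "Joseph", "John" };
        var numsList = Enumerable.Range(1, namesList.Length).ToList().Shuffle();
        var namesNumsList = namesList.Select((n, i) => $"{n} {numsList[i]}").ToList();
        Console.WriteLine(string.Join(", ", namesNumsList));
    }
}
```

Each run prints the names in their original order, each followed by a distinct number between 1 and 5.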
Ok, let's go step by step. You have a list of names and you want to assign a unique random number to each name. The random numbers must be within the range [1, number of names in the list]
The naive brute force way would be to generate a random number, check if it has been rolled before and if not assign it to a name. Repeat the process for each name in the list in order and you are done.
With 4 or 5 names this will actually run pretty fast, but it's wasteful, and as lists get very big it performs horribly. Why? Because you need to roll many more times than necessary.
Is there a better way? Yes. Imagine your problem is: write a method that returns, one by one, random cards from a standard deck. Would you do it the same way? Or would you somehow store an ordered deck, shuffle it and then simply hand the cards out one by one?
Well, here it's the same. Your standard deck is simply an ordered list of numbers from 1 to the total number of shooters: 1, 2, 3, ..., numberOfShooters.
Now, how would you shuffle it? A naive way would be to create a list, randomly pick an index, take the number stored at that index and then remove it from the list to avoid rolling it again. That would work, but again, it's wasteful. Why? Because repeatedly removing items from a list can be expensive. Remember that lists are just wrappers over a standard array; removing an item mid-list means that all following items must be shifted one position toward the front of the array.
An easy way to shuffle a list without all these problems is to use linq (there are better ways but this should suffice in your case):
var numbersToShuffle = Enumerable.Range(1, numberOfShooters);
var rnd = new Random();
var numbersShuffled = numbersToShuffle.OrderBy(i => rnd.Next()).ToList();
The rest should be easy.
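One way the remaining step might look: pair each name with the shuffled number at the same position using Zip. This is only a sketch; AssignNumbersSketch and AssignUniqueNumbers are hypothetical names, and passing the Random instance in as a parameter is a design choice of this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AssignNumbersSketch
{
    // Hypothetical helper: gives every name a distinct number in [1, names.Count].
    public static List<string> AssignUniqueNumbers(IReadOnlyList<string> names, Random rnd)
    {
        // ToList() materializes the shuffled order once; without it the
        // OrderBy would be re-evaluated on every enumeration.
        var numbersShuffled = Enumerable.Range(1, names.Count)
            .OrderBy(_ => rnd.Next())
            .ToList();

        return names.Zip(numbersShuffled, (name, number) => $"{name} {number}").ToList();
    }

    public static void Main()
    {
        var nameList = new[] { "James", "Vince", "Joe", "Joseph", "John" };
        var nameListNum = AssignUniqueNumbers(nameList, new Random());
        Console.WriteLine(string.Join(", ", nameListNum));
    }
}
```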

How to match / connect / pair integers from a List <T>

I have a list, with even number of nodes (always even). My task is to "match" all the nodes in the least costly way.
So I could have listDegree(1,4,5,6), which represents all the odd-degree nodes in my graph. How can I pair the nodes in the listDegree, and save the least costly combination to a variable, say int totalCost.
Something like this, and I return the least totalCost amount.
totalCost = (1,4) + (5,6)
totalCost = (1,5) + (4,6)
totalCost = (1,6) + (4,5)
--------------- More details (or a rewriting of the upper) ---------------
I have a class that reads my input file and stores all the information I need, like the costMatrix for the graph, the edges, and the number of edges and nodes.
Next I have a DijkstrasShortestPath algorithm, which computes the shortest path in my graph (costMatrix) from a given start node to a given end node.
I also have a method that examines the graph (costMatrix) and stores all the odd-degree nodes in a list.
So what I was looking for was some hints on how I can pair all the odd-degree nodes in the least costly way (shortest path). Using the data I have is easy once I know how to combine all the nodes in the list.
I don't need a solution, and this is not homework.
I just need a hint: when you have a list of, let's say, integers, how can you combine all the integers pairwise?
Hope this explanation is better... :D
Perhaps:
List<int> totalCosts = listDegree
.Select((num,index) => new{num,index})
.GroupBy(x => x.index / 2)
.Select(g => g.Sum(x => x.num))
.ToList();
Demo
Edit:
After you've edited your question I understand your requirement. You need a total sum of all (pairwise) combinations of all elements in a list. I would use this combinatorics project, which is quite efficient and informative.
var listDegree = new[] { 1, 4, 5, 6 };
int lowerIndex = 2;
var combinations = new Facet.Combinatorics.Combinations<int>(
listDegree,
lowerIndex,
Facet.Combinatorics.GenerateOption.WithoutRepetition
);
// get total costs overall
int totalCosts = combinations.Sum(c => c.Sum());
// get a List<List<int>> of all combination (the inner list count is 2=lowerIndex since you want pairs)
List<List<int>> allLists = combinations.Select(c => c.ToList()).ToList();
// output the result for demo purposes
foreach (IList<int> combis in combinations)
{
Console.WriteLine(String.Join(" ", combis));
}
(Without more details on the cost, I am going to assume cost(1,5) = 1-5, and you want the sum to get as closest as possible to 0.)
You are describing the even partition problem, which is NP-Complete.
The problem says: given a list L, find two lists A, B such that sum(A) = sum(B) and #elements(A) = #elements(B), where each element from L must be in A or B (and never both).
The reduction to your problem is simple: the left element of each pair goes to A, and the right element of each pair goes to B.
Thus, there is no known polynomial solution to the problem, but you might want to try exponential exhaustive search approaches (search all possible pairs; there are Choose(2n,n) = (2n)!/(n!*n!) of those).
An alternative is pseudo-polynomial DP based solutions (feasible for small integers).
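As a hint for the "how do I combine all the integers pairwise" part, an exhaustive search can be sketched recursively: pair the first node with each remaining node in turn, then pair up the rest the same way. The sketch below assumes an even number of nodes. PairingSketch, AllPairings and MinTotalCost are hypothetical names, and the cost function used in Main (absolute difference) is only a placeholder; in the question it would be the shortest-path cost between the two nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairingSketch
{
    // Enumerates every way to split the nodes into unordered pairs.
    // Assumes nodes.Count is even; an odd count yields no pairings.
    public static IEnumerable<List<(int A, int B)>> AllPairings(IReadOnlyList<int> nodes)
    {
        if (nodes.Count == 0)
        {
            yield return new List<(int A, int B)>();
            yield break;
        }

        int first = nodes[0];
        for (int i = 1; i < nodes.Count; i++)
        {
            // Everything except the first node and its chosen partner.
            var rest = nodes.Where((_, idx) => idx != 0 && idx != i).ToList();
            foreach (var tail in AllPairings(rest))
            {
                var pairing = new List<(int A, int B)> { (first, nodes[i]) };
                pairing.AddRange(tail);
                yield return pairing;
            }
        }
    }

    // Returns the smallest total cost over all pairings for the given cost function.
    public static int MinTotalCost(IReadOnlyList<int> nodes, Func<int, int, int> cost) =>
        AllPairings(nodes).Min(p => p.Sum(pair => cost(pair.A, pair.B)));

    public static void Main()
    {
        var listDegree = new[] { 1, 4, 5, 6 };
        foreach (var pairing in AllPairings(listDegree))
            Console.WriteLine(string.Join(" + ", pairing.Select(p => $"({p.A},{p.B})")));

        // Placeholder cost, for illustration only.
        Console.WriteLine(MinTotalCost(listDegree, (a, b) => Math.Abs(a - b)));
    }
}
```

For the four nodes in the question this enumerates the three pairings listed there: (1,4) + (5,6), (1,5) + (4,6) and (1,6) + (4,5). The number of pairings grows very quickly with the list size, so this is only practical for small lists.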

LINQ OrderBy with one exception

I have a DataSet that contains a column, call it Type which contains ints. I'm using LINQ to manipulate the DataSet and I want to sort by Type. The Type column only contains 3 values right now (1, 2, 3). I would like to sort so that Type 2 are first in the list and then 1 and 3.
Is there an easy solution for this or am I going to have to customize the OrderBy?
A few solutions:
table.AsEnumerable()
.OrderBy(r => r.Field<int>("Type")==2 ? 0 : 1)
.ThenBy(r => r.Field<int>("Type"));
or probably better
table.AsEnumerable()
.OrderBy(r => r.Field<int>("Type")==2
? 0
: r.Field<int>("Type"));
or Tim Schmelter's elegant solution
table.AsEnumerable()
.OrderByDescending(r => r.Field<int>("Type")==2)
.ThenBy(r => r.Field<int>("Type"))
The advantage of Tim Schmelter's solution is that you don't depend on a "pseudo-value".
In the first two solutions, we assume that 0 is lower than the minimum possible value of the field.
Whether that is true, and whether it can change, we don't know.
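To see the ordering without a DataTable, here is a small sketch on plain ints. It relies on false sorting before true for bool keys, so OrderByDescending puts the matching items first. OrderBySketch and PreferredFirst are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderBySketch
{
    // Hypothetical helper: items equal to 'preferred' come first,
    // everything else follows in ascending order.
    public static IEnumerable<int> PreferredFirst(IEnumerable<int> source, int preferred) =>
        source.OrderByDescending(t => t == preferred)
              .ThenBy(t => t);

    public static void Main()
    {
        int[] types = { 1, 3, 2, 1, 2, 3 };
        Console.WriteLine(string.Join(", ", PreferredFirst(types, 2))); // 2, 2, 1, 1, 3, 3
    }
}
```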
To make the sample simpler I removed the fact that we start from a DataTable, it's just a detail, and I thought we could do this:
var list = new [] { 1, 3, 5, 2, 4, 6, 9, 8 };
var sorters = new [] { 3, 2, -1 };
var o = from s in sorters
from l in list.OrderBy(x => x)
where s == l || (s == -1 && !sorters.Contains(l))
select l;
The sorters array contains the preferred values, so we can have more than one if we need them, followed by a 'jolly' (a wildcard, here -1) that marks where all the remaining values go. The jolly could be handled in a better way; it's like that just to give you the idea. At this point the algorithm is generic and does not need any hard-coded check on the preferred values (just the jolly).
There are some potential inefficiencies, but I just wanted to show the general idea of this solution; with some more work it could be made more efficient.
Here you have about five ways of accomplishing this. It's a post about how to put a chosen value first in the order, followed by the lower values and then the higher ones, so if you have {1 2 3 4 5 6} and select item 4, the output is {4 1 2 3 5 6}. I prefer my own answer though. ^_^
https://stackoverflow.com/a/12580121/920359

Get array index values of the top 1000 largest entries inside an array using LINQ

I would like to have a nice clean LINQ code that can get an array of the index values of the top 1000 largest values inside an array.
For example:
int[] IndexArray = ArrayWithValues.Return_Indexes_Of_1000_Biggest_Values
The code is obviously bogus it is just to illustrate what I need.
UPDATE
I totally forgot to say that I need a second functionality. I have a second array, and I need to retrieve all the values in the second array which has the same indexes as contained inside the IndexArray.
I can do it easily using loops and all that but the code is big, and I want to learn to use LINQ more often but at the moment LINQ is still very foreign to me.
I have gone through similar questions asked here but I was not able to modify the code to suite my needs, since people usually only need the values and not the indexes of the values.
Thanks for the help!
Something like this should work. It uses the overload of Select that allows you to incorporate a second input that is the index of the item in the sequence.
var indexArray = sourceArray
.Select((value, index) => new { value, index })
.OrderByDescending(item => item.value)
.Take(1000)
.Select(item => item.index)
.ToArray();
Simply project the value and index into an object, order by the value, take the top 1000 items, and then select simply the indexes before converting to an array.
Testing by taking the top 5 indexes from the array { 10, 4, 6, 8, 2, 3, 5, 1, 9, 7 } yields { 0, 8, 3, 9, 2 }, which maps to values { 10, 9, 8, 7, 6 }.
As the comments have already addressed regarding your update, you can simply use these indices to select from the other array, provided you are confident the arrays are equal in length or that the lookup will otherwise not result in an IndexOutOfRangeException.
.Select(item => otherArray[item.index])
.ToArray();
Another method you could look up would be Enumerable.Zip.
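Putting both parts together, a self-contained sketch might look like the following. TopIndexesSketch and IndexesOfLargest are hypothetical names, the second array of letters is made-up sample data, and both arrays are assumed to have the same length.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopIndexesSketch
{
    // Hypothetical helper: indexes of the 'count' largest values, largest first.
    public static int[] IndexesOfLargest(IReadOnlyList<int> source, int count) =>
        source.Select((value, index) => new { value, index })
              .OrderByDescending(item => item.value)
              .Take(count)
              .Select(item => item.index)
              .ToArray();

    public static void Main()
    {
        int[] values = { 10, 4, 6, 8, 2, 3, 5, 1, 9, 7 };
        string[] otherArray = { "a", "b", "c", "d", "e", "f", "g", "h", "i", "j" };

        int[] indexArray = IndexesOfLargest(values, 5);
        Console.WriteLine(string.Join(", ", indexArray)); // 0, 8, 3, 9, 2

        // Second requirement: values from the other array at the same indexes.
        string[] picked = indexArray.Select(i => otherArray[i]).ToArray();
        Console.WriteLine(string.Join(", ", picked)); // a, i, d, j, c
    }
}
```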
