Get subset x elements before and x elements after LINQ - c#

I have a list and I need a subset of it containing the 5 elements before and the 5 elements after the one currently being processed in the for loop, so a list of 10 elements in total, ignoring the current item.
I am currently achieving this as follows:
var currentIndex = myList.ClassName.FindIndex(a => a.Id == plate.Id);
var fromIndex = currentIndex - 5;
if (fromIndex < 0) fromIndex = 0;
var toIndex = currentIndex + 5;
if ((myList.ClassName.ElementAtOrDefault(toIndex) == null))
toIndex = myList.ClassName.Count - 1;
var subsetList = myList.ClassName.GetRange(fromIndex, (11));
comparisonPlates.RemoveAt(currentIndex);
However, I am sure there is a much better and more efficient way of doing this using LINQ. Any guidance?

I would use Skip and Take so you have all the elements surrounding your current index (and the current index itself).
To exclude the current element, either add a RemoveAt, or use two Skip/Take pairs: one to take the elements before yours, and one to take the elements after.
Here is a sample:
const int currentIndex = 12;
const int nbElements = 5;
List<string> results = items.Skip(currentIndex - nbElements).Take(nbElements)
    .Concat(items.Skip(currentIndex + 1).Take(nbElements))
    .ToList();
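One boundary case to watch (my addition, not part of the original answer): when currentIndex is smaller than nbElements, Skip receives a negative count, which LINQ treats as zero, so the first Take can pick up the current element and some of the ones after it. A guarded sketch:
// Take only as many "before" elements as actually exist.
int before = Math.Min(nbElements, currentIndex);
List<string> guardedResults = items
    .Skip(currentIndex - before).Take(before)               // up to nbElements before the current one
    .Concat(items.Skip(currentIndex + 1).Take(nbElements))  // up to nbElements after it
    .ToList();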

I'd suggest you use LINQ to generate indices:
var subset = Enumerable.Range(currentIndex - 5, 5)
.Concat(Enumerable.Range(currentIndex + 1, 5))
.SkipWhile(index => index < 0)
.TakeWhile(index => index < items.Count)
.Select(index => items[index])
;
This can be more efficient than the Skip-based approach, because items.Skip(n) internally calls IEnumerator.MoveNext n times, stepping through the elements one by one; the greater your currentIndex, the more work it does.
Of course, SkipWhile and TakeWhile in the code above are slightly inefficient for the same reason, but in total they only ever loop over 10 indices.
If that inefficiency bothers you, you can calculate the indices and counts (the parameters of Enumerable.Range) beforehand and eliminate them, as sketched below.
(In my opinion, the code above is more readable.)
In addition, the runtime complexity of List<T>'s indexer ([]) is O(1), which means items[index] takes constant time regardless of the index value or the size of your list.
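A sketch of that precomputed variant (my addition, reusing items and currentIndex from the code above):
// Clamp both ranges up front so SkipWhile/TakeWhile are no longer needed.
int beforeStart = Math.Max(currentIndex - 5, 0);
int beforeCount = currentIndex - beforeStart;                                  // up to 5 indices before
int afterCount = Math.Max(Math.Min(5, items.Count - (currentIndex + 1)), 0);   // up to 5 indices after
var subset = Enumerable.Range(beforeStart, beforeCount)
    .Concat(Enumerable.Range(currentIndex + 1, afterCount))
    .Select(index => items[index]);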

Here is a solution that I feel is a little more readable.
var numElements = 5;
var fromIndex = currentIndex <= numElements ? 0 : currentIndex - numElements - 1;
var toIndex = myList.Count() - currentIndex <= numElements ? myList.Count() : currentIndex + numElements;
var subsetList = myList.Skip(fromIndex).Take(toIndex - fromIndex);
EDIT:
I'd suggest using Ripple's answer because of the performance reasons mentioned. I wasn't aware of the implications of Skip/Take, but after looking it up, it does make sense. For a small list of items it won't matter, but it would with a sizable amount of data.


How to Order By or Sort an integer List and select the Nth element

I have a list, and I want to select the fifth highest element from it:
List<int> list = new List<int>();
list.Add(2);
list.Add(18);
list.Add(21);
list.Add(10);
list.Add(20);
list.Add(80);
list.Add(23);
list.Add(81);
list.Add(27);
list.Add(85);
But OrderByDescending is not working for this int list...
int fifth = list.OrderByDescending(x => x).Skip(4).First();
Depending on how serious it is for the list to have fewer than 5 elements, you have 2 options.
If the list should never have fewer than 5 elements, I would treat that case as an exception and catch it:
int fifth;
try
{
fifth = list.OrderByDescending(x => x).ElementAt(4);
}
catch (ArgumentOutOfRangeException)
{
//Handle the exception
}
If you expect that the list may have fewer than 5 elements, you can fall back to the default value and check for it:
int fifth = list.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == 0)
{
//handle default
}
This is still somewhat flawed, because the fifth element could legitimately be 0. That can be solved by casting the list into a list of nullable ints before the LINQ query:
var newList = list.Select(i => (int?)i).ToList();
int? fifth = newList.OrderByDescending(x => x).ElementAtOrDefault(4);
if (fifth == null)
{
//handle default
}
Without LINQ expressions:
int result;
if(list != null && list.Count >= 5)
{
list.Sort();
result = list[list.Count - 5];
}
else // define behavior when list is null OR has less than 5 elements
This has better performance than the LINQ expressions, although the LINQ solutions presented in my second answer are convenient and reliable.
In case you need extreme performance for a huge List of integers, I'd recommend a more specialized algorithm, like in Matthew Watson's answer.
Attention: the list gets modified when the Sort() method is called. If you don't want that, you must work with a copy of your list, which you can create like this:
List<int> copy = new List<int>(original);
List<int> copy = original.ToList();
The easiest way to do this is to just sort the data and take N items from the front. This is the recommended way for small data sets - anything more complicated is just not worth it otherwise.
However, for large data sets it can be a lot quicker to do what's known as a Partial Sort.
There are two main ways to do this: Use a heap, or use a specialised quicksort.
The article I linked describes how to use a heap. I shall present a partial sort below:
public static IList<T> PartialSort<T>(IList<T> data, int k) where T : IComparable<T>
{
int start = 0;
int end = data.Count - 1;
while (end > start)
{
var index = partition(data, start, end);
var rank = index + 1;
if (rank >= k)
{
// The pivot's final position is at or beyond the k-th element, so
// everything to its right lies outside the first k; only the left
// part still needs work.
end = index - 1;
}
else if ((index - start) > (end - index))
{
// The pivot lands before position k, so both sides still matter.
// Fully sort the smaller (right) part and keep partitioning the
// larger (left) part.
quickSort(data, index + 1, end);
end = index - 1;
}
else
{
// Same idea with the sides swapped: fully sort the smaller (left)
// part and keep partitioning the larger (right) part.
quickSort(data, start, index - 1);
start = index + 1;
}
}
return data;
}
static int partition<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
T x = lst[start];
int i = start;
for (int j = start + 1; j <= end; j++)
{
if (lst[j].CompareTo(x) < 0) // Or "> 0" to reverse sort order.
{
i = i + 1;
swap(lst, i, j);
}
}
swap(lst, start, i);
return i;
}
static void swap<T>(IList<T> lst, int p, int q)
{
T temp = lst[p];
lst[p] = lst[q];
lst[q] = temp;
}
static void quickSort<T>(IList<T> lst, int start, int end) where T : IComparable<T>
{
if (start >= end)
return;
int index = partition(lst, start, end);
quickSort(lst, start, index - 1);
quickSort(lst, index + 1, end);
}
Then to access the 5th largest element in a list you could do this (after flipping the comparison in partition to "> 0", as noted in the comment, so that the largest items come first):
PartialSort(list, 5);
Console.WriteLine(list[4]);
For large data sets, a partial sort can be significantly faster than a full sort.
Addendum
See here for another (probably better) solution that uses a QuickSelect algorithm.
This LINQ approach retrieves the 5th biggest element OR throws an exception WHEN the list is null or contains less than 5 elements:
int fifth = list?.Count >= 5 ?
list.OrderByDescending(x => x).Take(5).Last() :
throw new Exception("list is null OR has not enough elements");
This one retrieves the 5th biggest element OR null WHEN the list is null or contains less than 5 elements:
int? fifth = list?.Count >= 5 ?
list.OrderByDescending(x => x).Take(5).Last() :
default(int?);
if(fifth == null) // define behavior
This one retrieves the 5th biggest element OR the smallest element WHEN the list contains less than 5 elements:
if(list == null || list.Count <= 0)
throw new Exception("Unable to retrieve Nth biggest element");
int fifth = list.OrderByDescending(x => x).Take(5).Last();
All these solutions are reliable, they should NEVER throw "unexpected" exceptions.
PS: I'm using .NET 4.7 in this answer.
Here is a C# implementation of the QuickSelect algorithm for selecting the nth element of an unordered IList<>.
You have to put all the code contained in that page in a static class, like:
public static class QuickHelpers
{
// Put the code here
}
Given that "library" (in truth a big fat block of code), then you can:
int resA = list.QuickSelect(2, (x, y) => Comparer<int>.Default.Compare(y, x));
int resB = list.QuickSelect(list.Count - 1 - 2);
Now... Normally the QuickSelect would select the nth lowest element. We reverse it in two ways:
For resA we create a reverse comparer based on the default int comparer, by swapping the parameters of the Compare method. Note that the index is 0-based, so there is a 0th, 1st, 2nd element and so on.
For resB we use the fact that the 0th element in reverse order is the (list.Count - 1)-th element in normal order, so we count from the back: the highest element is at list.Count - 1 in an ordered list, the next one at list.Count - 1 - 1, then list.Count - 1 - 2, and so on.
Theoretically, using QuickSelect should be better than ordering the list and then picking the nth element, because ordering a list is on average an O(N log N) operation and picking the nth element is then an O(1) operation, so the whole thing is O(N log N), while QuickSelect is on average an O(N) operation. Clearly there is a "but": the O notation doesn't show the constant factor, so an O(k1 * N log N) with a small k1 could be better than an O(k2 * N) with a big k2. Only multiple real-life benchmarks can tell us (you) which is better, and it depends on the size of the collection.
A small note about the algorithm:
As with quicksort, quickselect is generally implemented as an in-place algorithm, and beyond selecting the k'th element, it also partially sorts the data. See selection algorithm for further discussion of the connection with sorting.
So it modifies the ordering of the original list.
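The linked code is not reproduced here, but purely as an illustration of the idea (not the extension method from that page), a minimal QuickSelect sketch in C# might look like this:
// Minimal in-place QuickSelect: returns the element that would sit at
// 0-based position k if the list were fully sorted with the given comparer.
// Illustrative only; the linked library adds argument checks, randomized
// pivots and extension-method syntax.
static T QuickSelect<T>(IList<T> list, int k, IComparer<T> comparer)
{
    int start = 0, end = list.Count - 1;
    while (true)
    {
        int pivotIndex = Partition(list, start, end, comparer);
        if (pivotIndex == k) return list[pivotIndex];
        if (pivotIndex > k) end = pivotIndex - 1;
        else start = pivotIndex + 1;
    }
}

static int Partition<T>(IList<T> list, int start, int end, IComparer<T> comparer)
{
    T pivot = list[end];
    int i = start;
    for (int j = start; j < end; j++)
    {
        if (comparer.Compare(list[j], pivot) < 0)
        {
            T tmp = list[i]; list[i] = list[j]; list[j] = tmp;
            i++;
        }
    }
    T last = list[i]; list[i] = list[end]; list[end] = last;
    return i;
}
With that, the 5th highest element of a List<int> would be QuickSelect(list, list.Count - 5, Comparer<int>.Default), counting from the back exactly as described for resB above.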

C# - Linq optimize code with List and Where clause

I have the following code:
var tempResults = new Dictionary<Record, List<Record>>();
errors = new List<Record>();
foreach (Record record in diag)
{
var code = Convert.ToInt16(Regex.Split(record.Line, @"\s{1,}")[4], 16);
var cond = codes.Where(x => x.Value == code && x.Active).FirstOrDefault();
if (cond == null)
{
errors.Add(record);
continue;
}
var min = record.Datetime.AddSeconds(downDiff);
var max = record.Datetime.AddSeconds(upDiff);
//PROBLEM PART - It takes around 4,5ms
var possibleResults = cas.Where(x => x.Datetime >= min && x.Datetime <= max).ToList();
if (possibleResults.Count == 0)
errors.Add(record);
else
{
if (!CompareCond(record, possibleResults, cond, ref tempResults, false))
{
errors.Add(record);
}
}
}
The variable diag is a List of Record.
The variable cas is a List of Record with around 50k items.
The problem is that it is too slow. The Where clause marked above takes around 4.6599 ms, so for 3000 records in diag that makes 3000 * 4.6599 ms, roughly 14 seconds. Is there any option to optimize the code?
You can speed up that specific statement you emphasized
cas.Where(x => x.Datetime >= min && x.Datetime <= max).ToList();
With binary search over cas list. First pre-sort cas by Datetime:
cas.Sort((a,b) => a.Datetime.CompareTo(b.Datetime));
Then create comparer for Record which will compare only Datetime properties (implementation assumes there are no null records in the list):
private class RecordDateComparer : IComparer<Record> {
public int Compare(Record x, Record y) {
return x.Datetime.CompareTo(y.Datetime);
}
}
Then you can translate your Where clause like this:
var index = cas.BinarySearch(new Record { Datetime = min }, new RecordDateComparer());
if (index < 0)
index = ~index;
var possibleResults = new List<Record>();
// go backwards, for duplicates
for (int i = index - 1; i >= 0; i--) {
var res = cas[i];
if (res.Datetime <= max && res.Datetime >= min)
possibleResults.Add(res);
else break;
}
// go forward until item bigger than max is found
for (int i = index; i < cas.Count; i++) {
var res = cas[i];
if (res.Datetime <= max && res.Datetime >= min)
possibleResults.Add(res);
else break;
}
The idea is to find the first record with Datetime equal to or greater than your min, using BinarySearch. If an exact match is found, it returns the index of the matched element. If not, it returns a negative value, which can be translated into the index of the first element greater than the target with the ~index operation.
Once we have found that element, we can just walk forward through the list and grab items until we reach one with Datetime greater than max (which works because the list is sorted). We also need to go a little backwards, because if there are duplicates, binary search will not necessarily return the first one, so we check backwards for potential duplicates.
Additional improvements might include:
Putting the active codes into a Dictionary (keyed by Value) outside the foreach loop, and thus replacing the codes.Where search with a dictionary lookup (see the sketch after this list).
As suggested in the comments by @Digitalsa1nt: parallelize the foreach loop using Parallel.For, PLINQ, or any similar technique. It's a perfect case for parallelization, because the loop contains only CPU-bound work. You need to make a few adjustments to keep it thread-safe, of course, such as using a thread-safe collection for errors (or locking around adds to it).
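A small sketch of the dictionary idea from the first bullet (my addition; it assumes the code values are unique among active codes, adjust if they are not):
// Build the lookup once, outside the loop.
var activeCodes = codes.Where(c => c.Active).ToDictionary(c => c.Value);

foreach (Record record in diag)
{
    var code = Convert.ToInt16(Regex.Split(record.Line, @"\s{1,}")[4], 16);
    if (!activeCodes.TryGetValue(code, out var cond))
    {
        errors.Add(record);
        continue;
    }
    // ... compute min/max and run the binary search over cas as shown above ...
}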
Try adding AsNoTracking to the query.
The AsNoTracking method can save both execution times and memory usage. Applying this option really becomes important when we retrieve a large amount of data from the database.
var possibleResults = cas.Where(x => x.Datetime >= min && x.Datetime <= max).AsNoTracking().ToList(); //around 4,6599ms
There are a few improvements you can make here.
It might only be a minor performance increase, but you should try using GroupBy instead of Where in this circumstance.
So instead you should have something like this:
cas.GroupBy(x => x.DateTime >= min && x.DateTime <= max).Select(h => h.Key == true);
This usually works for searching through lists for distinct values, but in your case I'm unsure whether it will provide any benefit when used with a range condition like this.
Also, a few other things you can do throughout your code:
Avoid using ToList when possible and stick to IEnumerable. ToList performs an eager evaluation which is probably causing a lot of slowdown in your query.
use .Any() instead of Count when checking if values exist (This only applies if the list is IEnumerable)
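As a small sketch of those last two points applied to the original loop (my addition; note that if possibleResults is enumerated again afterwards, as it is in CompareCond, keeping the ToList may actually be cheaper):
// Defer execution and avoid materializing a list just to test for emptiness.
var possibleResults = cas.Where(x => x.Datetime >= min && x.Datetime <= max);
if (!possibleResults.Any())
    errors.Add(record);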

How would I find a range of numbers inside of an array that suit a condition?

I'm trying to understand arrays better and I've found this piece of code on this website that fills an array with a load of random numbers. I was wondering how you would go about, say, extracting a range of numbers. So if I wanted to find out how many of these random numbers inside the array were between 25 and 50, how would I go about doing this? I've heard about Array.FindAll, however I have no clue how to use it.
Thank you in advance.
Random r = new Random();
int count = 100;
// Create an array with count elements.
int[] numbers = new int[count];
// Loop over each index
for (int i = 0; i < count; i++)
{
// Generate and store a random number at current index
numbers[i] = r.Next(1, 100);
}
Using LINQ you can try the following:
var numbersInRange = numbers.Where(number => number > 25 && number < 60)
.ToArray();
The above keeps only the numbers in the open range (25, 60). If you also want the numbers that are equal to either 25 or 60, just use >= and <= respectively.
If you don't want to use LINQ (though I couldn't say why you wouldn't), you could try the following:
var numbersInRange = new List<int>();
foreach(var number in numbers)
{
if(number > 25 && number < 60)
{
numbersInRange.Add(number);
}
}
So at the end you would have a list that contains your numbers.
Comparing this to the LINQ version, I think that LINQ is the winner in simplicity and readability.
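Since the question actually asks how many of the numbers fall in the range, note that you can also count them directly without building an intermediate collection; a one-line sketch:
int howMany = numbers.Count(number => number > 25 && number < 60);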
Using LINQ it is a single line:
int[] inRange = numbers.Where(x => x >= 25 && x <= 50).ToArray();
The Enumerable methods are a collection of methods that operate on sequences of values and apply a predicate expression (x => x >= 25 && x <= 50) to each member (x) of the sequence (numbers). In your particular case the Where method filters the sequence based on the result of the boolean expression (the result includes the members for which the expression returns true).
We can use LINQ. First add using System.Linq to your namespace imports so the extension methods are available.
To find the numbers:
numbers.Where(x => x >= 25 && x <= 50)
This takes a lambda expression (in this case x => x >= 25 && x <= 50) that accepts an object of the type in the sequence (in this case int) and returns a bool; it is then evaluated for every contained item, filtering out those for which it returns false.
You can use ToArray() on the result to get it back into an array, or just work on the results directly.
To get which numbers, that is to say, which indices the matches where at, you can use:
numbers.Select((x, i) => new {El = x, Idx = i}).Where(x => x.El >= 25 && x.El <= 50).Select(x => x.Idx)
This first creates a new anonymous object for each item where El is the value held and Idx is the index it was at. Then it does the same Where as before, but on the El property of the object we created in the first step. Finally it extracts out the Idx property of the survivors so we have all of the matching indices.
Although "Linq" can do the job as shown by other answers ; there is also, as you heard about, Array.FindAll which behaves the same but already returns an array (and so doesn't need a call to ToArray)
int[] matchedNumbers = Array.FindAll (currentNumber => 25 <= currentNumber && currentNumber <= 50);
As for the "strange" arrow notation it's a lambda expression

What would be the shortest way to sum up the digits in odd and even places separately

I've always loved reducing number of code lines by using simple but smart math approaches. This situation seems to be one of those that need this approach. So what I basically need is to sum up digits in the odd and even places separately with minimum code. So far this is the best way I have been able to think of:
string number = "123456789";
int sumOfDigitsInOddPlaces=0;
int sumOfDigitsInEvenPlaces=0;
for (int i=0;i<number.length;i++){
if(i%2==0)//Means odd ones
sumOfDigitsInOddPlaces+=number[i];
else
sumOfDigitsInEvenPlaces+=number[i];
}
//The rest is not important
Do you have a better idea? Something without needing to use if else
int* sum[2] = {&sumOfDigitsInOddPlaces,&sumOfDigitsInEvenPlaces};
for (int i=0;i<number.length;i++)
{
*(sum[i&1])+=number[i];
}
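A C# rendering of the same branch-free idea (a sketch; it converts each character to its digit value and keeps the question's convention that indices 0, 2, 4, ... are the "odd places"):
string number = "123456789";
int[] sums = new int[2];             // sums[0] = odd places, sums[1] = even places
for (int i = 0; i < number.Length; i++)
{
    sums[i & 1] += number[i] - '0';  // i & 1 selects the accumulator without branching
}
int sumOfDigitsInOddPlaces = sums[0];
int sumOfDigitsInEvenPlaces = sums[1];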
You could use two separate loops, one for the odd indexed digits and one for the even indexed digits.
Also, your modulus conditional may be wrong: you're placing the even-indexed digits (0, 2, 4, ...) in the odd accumulator. It could just be that you're treating the number as 1-based while the array is 0-based (maybe what you intended), but for the algorithm's sake I will consider the number to be 0-based.
Here's my proposition
number = 123456789;
sumOfDigitsInOddPlaces=0;
sumOfDigitsInEvenPlaces=0;
//even digits
for (int i = 0; i < number.length; i = i + 2){
sumOfDigitsInEvenPlaces += number[i];
}
//odd digits, note the start at j = 1
for (int j = 1; j < number.length; j = j + 2){
sumOfDigitsInOddPlaces += number[j];
}
On the large scale this doesn't improve efficiency, still an O(N) algorithm, but it eliminates the branching
Since you added C# to the question:
var numString = "123456789";
var odds = numString.Select(c => c.ToString()).Where((v, i) => i % 2 == 1);
var evens = numString.Select(c => c.ToString()).Where((v, i) => i % 2 == 0);
var sumOfOdds = odds.Select(int.Parse).Sum();
var sumOfEvens = evens.Select(int.Parse).Sum();
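Since a string is already a sequence of chars, the same selection can be done without splitting at all; a sketch using the same index convention as above:
// c - '0' gives the numeric value of a digit character.
int sumOfOdds = numString.Where((c, i) => i % 2 == 1).Sum(c => c - '0');
int sumOfEvens = numString.Where((c, i) => i % 2 == 0).Sum(c => c - '0');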
Do you like Python?
num_string = "123456789"
odds = sum(map(int, num_string[::2]))
evens = sum(map(int, num_string[1::2]))
This Java solution requires no if/else, has no code duplication and is O(N):
number = "123456789";
int[] sums = new int[2]; //sums[0] == sum of even digits, sums[1] == sum of odd
for(int arrayIndex=0; arrayIndex < 2; ++arrayIndex)
{
for (int i=0; i < number.length()-arrayIndex; i += 2)
{
sums[arrayIndex] += Character.getNumericValue(number.charAt(i+arrayIndex));
}
}
Assuming number.length is even, it is quite simple; the corner case to handle is the last element when number.length is odd.
int i=0;
while(i<number.length-1)
{
sumOfDigitsInEvenPlaces += number[ i++ ];
sumOfDigitsInOddPlaces += number[ i++ ];
}
if( i < number.length )
sumOfDigitsInEvenPlaces += number[ i ];
Because the loop advances i two at a time, if number.length is even, subtracting 1 from the bound changes nothing.
If number.length is odd, it excludes the last item from the loop.
If number.length is odd, the value of i when exiting the loop is the index of the not-yet-visited last element.
If number.length is odd, then by modulo-2 reasoning you have to add that last item to sumOfDigitsInEvenPlaces.
This seems slightly more verbose, but also more readable, to me than Anonymous' (accepted) answer. However, benchmarks to come.
Well, the compiler seems to find my code more understandable as well, since it removes it all if I don't print the results (which explains why I kept getting a time of 0 all along...). The other code, though, is obfuscated enough even for the compiler.
In the end, even with huge arrays, it's pretty hard for clock_t to tell the difference between the two. You get about a third fewer instructions in the second case, but since everything is in cache (and your running sums even in registers) it doesn't matter much.
For the curious, I've put the disassembly of both versions (compiled from C) here : http://pastebin.com/2fciLEMw

List.IndexOf() - return index of final occurrence rather than the first?

int highestValue = someList.IndexOf(someList.Max())
someList contains a lot of duplicates, and someList.IndexOf(someList.Max()) returns the index of the first instance of the highest value.
Is there some trickery I can use (reversing the order of the list?) to get the index of the final occurrence of the highest value in the list, rather than resorting to writing a manual method?
Try this:
int highestValue = someList.LastIndexOf(someList.Max());
All the other answers being completely correct, it must be noted that this requires 2 iterations over the list (one to find the max element, second to find the last index). For a list of integers that's a non-issue, but if the iteration was more complicated, here's an alternative:
var highestValue = someList.Select((val, ind) => new { Value = val, Index = ind })
.Aggregate((x, y) => (x.Value > y.Value) ? x : y)
.Index;
You mean like getting the index of the last occurrence? That would be:
int highestValueIndex = someList.LastIndexOf(someList.Max())
You should, however, be aware of the fact that you're making two passes over the data in both your original code and the code above. If you want to do it in a single pass (and you should only worry about this if your data sets are large), you can do this with something like:
static int LastIndexOfMax(List<int> list)
{
// Empty list, no index.
if (list.Count == 0) return -1;
// Default to first element then check all others.
int maxIdx = 0, maxVal = list[0];
for (int idx = 1; idx < list.Count; ++idx) {
// Higher or equal-and-to-the-right, replace.
if (list[idx] >= maxVal) {
maxIdx = idx;
maxVal = list[idx];
}
}
return maxIdx;
}
Use LastIndexOf
int highestValue = someList.LastIndexOf(someList.Max());
