Using Linq; how can I do the "opposite" of Take?
I.e. instead of getting the first n elements such as in
aCollection.Take(n)
I want to get everything but the last n elements. Something like
aCollection.Leave(n)
(Don't ask why :-)
Edit
I suppose I can do it this way:
aCollection.TakeWhile((x, index) => index < aCollection.Count - n)
Or in the form of an extension method:
public static IEnumerable<TSource> Leave<TSource>(this IEnumerable<TSource> source, int n)
{
return source.TakeWhile((x, index) => index < source.Count() - n);
}
But in the case of Linq to SQL or NHibernate Linq it would have been nice if the generated SQL took care of it and generated something like (for SQL Server/T-SQL)
SELECT TOP (SELECT COUNT(*) - @n FROM ATable) * FROM ATable
Or some other, more clever SQL implementation.
I suppose there is nothing like it?
(But the edit was actually not part of the question.)
aCollection.Take(aCollection.Count() - n);
EDIT: Just as a piece of interesting information which came up in the comments - you may think that the IEnumerable's extension method .Count() is slow, because it would iterate through all elements. But in case the actual object implements ICollection or ICollection<T>, it will just use the .Count property which should be O(1). So performance will not suffer in that case.
You can see the source code of IEnumerable.Count() at TypeDescriptor.net.
I'm pretty sure there's no built-in method for this, but this can be done easily by chaining Reverse and Skip:
aCollection.Reverse().Skip(n).Reverse()
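As a rough illustration of the double-reverse idea above, here is a minimal sketch. The class and method names (ReverseSkipDemo, AllButLast) are made up for this example. Note that each Reverse() has to buffer the whole sequence before it can yield anything, so this trades memory for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReverseSkipDemo
{
    // Hypothetical helper: drops the last n items by reversing,
    // skipping n from the (reversed) front, and reversing back.
    public static List<T> AllButLast<T>(IEnumerable<T> source, int n) =>
        source.Reverse().Skip(n).Reverse().ToList();

    public static void Main()
    {
        IEnumerable<int> numbers = new List<int> { 1, 2, 3, 4, 5 };
        Console.WriteLine(string.Join(", ", AllButLast(numbers, 2))); // prints "1, 2, 3"
    }
}
```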
I don't believe there's a built-in function for this.
aCollection.Take(aCollection.Count - n)
should be suitable; taking the total number of items in the collection minus n should skip the last n elements.
Keeping with the IEnumerable philosophy, and going through the enumeration only once in cases where ICollection isn't implemented, you can use these extension methods:
public static IEnumerable<T> Leave<T>(this ICollection<T> src, int drop) => src.Take(src.Count - drop);
public static IEnumerable<T> Leave<T>(this IEnumerable<T> src, int drop) {
IEnumerable<T> IEnumHelper() {
using (var esrc = src.GetEnumerator()) {
var buf = new Queue<T>();
while (drop-- > 0)
if (esrc.MoveNext())
buf.Enqueue(esrc.Current);
else
break;
while (esrc.MoveNext()) {
buf.Enqueue(esrc.Current);
yield return buf.Dequeue();
}
}
}
return (src is ICollection<T> csrc) ? csrc.Leave(drop) : IEnumHelper();
}
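The buffering idea in the code above can also be sketched with a plain foreach. This is only an illustration under assumed names (LeaveBuffered and Numbers are not part of any library): it keeps at most `drop` items in a queue and emits the oldest one as soon as it can no longer be among the last `drop`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeaveSketch
{
    // Hypothetical name: yields every element except the last `drop` ones,
    // enumerating the source once.
    public static IEnumerable<T> LeaveBuffered<T>(this IEnumerable<T> src, int drop)
    {
        var buf = new Queue<T>();
        foreach (var item in src)
        {
            buf.Enqueue(item);
            // Once more than `drop` items are buffered, the oldest one
            // cannot be among the last `drop`, so it is safe to emit.
            if (buf.Count > drop)
                yield return buf.Dequeue();
        }
    }

    // A lazy source with no Count property, to exercise the buffered path.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 1; i <= count; i++)
            yield return i;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Numbers(6).LeaveBuffered(2))); // prints "1, 2, 3, 4"
        Console.WriteLine(Numbers(2).LeaveBuffered(5).Count());            // prints "0"
    }
}
```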
This will be much more efficient than the solutions with a double-reverse, since it creates only one list and only enumerates the list once.
public static class Extensions
{
public static IEnumerable<T> Leave<T>(this IEnumerable<T> items, int numToSkip)
{
var list = items.ToList();
// Assert numToSkip <= list count.
list.RemoveRange(list.Count - numToSkip, numToSkip);
return list;
}
}
string alphabet = "abcdefghijklmnopqrstuvwxyz";
var chars = alphabet.Leave(10); // abcdefghijklmnop
There is now a TakeLast(n) method, which takes the last n elements of a sequence (for a string, its last n characters).
See here: https://msdn.microsoft.com/en-us/library/hh212114(v=vs.103).aspx
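For the original question, the counterpart of TakeLast is SkipLast. The sketch below assumes a runtime whose System.Linq ships both methods (to the best of my knowledge that is .NET Core 2.0 and later, not the classic .NET Framework); the helper names DropLast and KeepLast are made up for the example.

```csharp
using System;
using System.Linq;

public static class SkipLastDemo
{
    // SkipLast drops the final n elements of the sequence.
    public static string DropLast(string s, int n) => string.Concat(s.SkipLast(n));

    // TakeLast keeps only the final n elements.
    public static string KeepLast(string s, int n) => string.Concat(s.TakeLast(n));

    public static void Main()
    {
        string alphabet = "abcdefghijklmnopqrstuvwxyz";
        Console.WriteLine(DropLast(alphabet, 10)); // prints "abcdefghijklmnop"
        Console.WriteLine(KeepLast(alphabet, 10)); // prints "qrstuvwxyz"
    }
}
```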
Related
Recently, I've got interested in the List.GetRange() function. It can retrieve a sub-list from a bigger list. Usage requires two arguments:
List<T> SubList = List<T>.GetRange(10, 20) //Get 20 items, starting from index 10
But what if I wanted to take every remaining item from a specific index, with this function?
List<T> RemainingItemsFromList = MyList.GetRange(7, /*REST*/) //What can I insert into REST to make it retrieve everything?
Is there
Any built-in RestOfTheList statement without doing something like Length - Index?
Any replacement function (that already exists)?
Any other alternative?
or am I simply doing something wrong?
Since List<T> does not provide a built-in method with the required functionality, your options are:
1) Create extension method yourself:
public static class ListExtensions {
public static List<T> GetRange<T>(this List<T> list, int start) {
return list.GetRange(start, list.Count - start);
}
}
var remaining = list.GetRange(7);
2) Use LINQ:
var remaining = list.Skip(7).ToList(); // a bit less efficient, but usually that does not matter
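One difference between the two options that may matter: GetRange validates its arguments, while Skip quietly returns an empty sequence when the start index is past the end. A small sketch, using a hypothetical helper named From so it does not shadow the GetRange overload above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RestOfListDemo
{
    // Hypothetical helper name; copies everything from `start` to the end.
    public static List<T> From<T>(this List<T> list, int start) =>
        list.GetRange(start, list.Count - start);

    public static void Main()
    {
        var list = Enumerable.Range(0, 10).ToList();
        Console.WriteLine(string.Join(",", list.From(7)));          // prints "7,8,9"
        Console.WriteLine(string.Join(",", list.Skip(7).ToList())); // prints "7,8,9"

        // Past the end: Skip yields nothing, GetRange throws.
        Console.WriteLine(list.Skip(12).Count());                   // prints "0"
        try { list.From(12); }
        catch (ArgumentException) { Console.WriteLine("GetRange rejected the range"); }
    }
}
```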
"Searching for alternative functionalities for "Skip" and "Take" functionalities"
One of the links says: "Every time you invoke Skip() it will have to iterate your collection from the beginning in order to skip the number of elements you desire, which gives a loop within a loop (n^2 behaviour)"
Conclusion: For large collections, don’t use Skip and Take. Find another way to iterate through your collection and divide it.
To access the last page of data in a huge collection, can you suggest an approach other than Skip and Take?
Looking at the source for Skip, you can see it enumerates over all the items, even over the first n items you want to skip.
It's strange though, because several LINQ-methods have optimizations for collections, like Count and Last.
Skip apparently does not.
If you have an array or IList<T>, you can use the indexer to truly skip over them:
for (int i = skipStartIndex; i < list.Count; i++) {
yield return list[i];
}
Internally it is really correct:
private static IEnumerable<TSource> SkipIterator<TSource>(IEnumerable<TSource> source, int count)
{
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while (count > 0 && enumerator.MoveNext())
--count;
if (count <= 0)
{
while (enumerator.MoveNext())
yield return enumerator.Current;
}
}
}
If you want to skip on a plain IEnumerable<T>, this is the right behaviour: there is no way other than enumeration to reach specific elements. But you can write your own extension method on IReadOnlyList<T> or IList<T> (if the collection holding your elements implements one of these interfaces).
public static class IReadOnlyListExtensions
{
public static IEnumerable<T> Skip<T>(this IReadOnlyList<T> collection, int count)
{
if (collection == null)
return null;
return YieldSkip(collection, count);
}
private static IEnumerable<T> YieldSkip<T>(IReadOnlyList<T> collection, int count)
{
for (int index = count; index < collection.Count; index++)
{
yield return collection[index];
}
}
}
In addition you can implement it for IEnumerable<T> but check inside for optimization:
if (collection is IReadOnlyList<T>)
{
// do optimized skip
}
This kind of type check is used in many places in the LINQ source code (but unfortunately not in Skip).
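Putting the type check and the indexed loop together could look roughly like this. It is a sketch, not the framework's implementation; SkipFast and SkipByIndex are hypothetical names, chosen so the method does not clash with Enumerable.Skip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FastSkipSketch
{
    // Hypothetical name: uses the indexer when the source supports it,
    // and falls back to the ordinary Enumerable.Skip otherwise.
    public static IEnumerable<T> SkipFast<T>(this IEnumerable<T> source, int count)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return source is IReadOnlyList<T> list
            ? SkipByIndex(list, Math.Max(count, 0))
            : source.Skip(count);
    }

    private static IEnumerable<T> SkipByIndex<T>(IReadOnlyList<T> list, int start)
    {
        // Jumps straight to `start`; the skipped items are never touched.
        for (int i = start; i < list.Count; i++)
            yield return list[i];
    }

    public static void Main()
    {
        int[] data = Enumerable.Range(1, 1000).ToArray();
        var page = data.SkipFast(990).Take(5);
        Console.WriteLine(string.Join(", ", page)); // prints "991, 992, 993, 994, 995"
    }
}
```

One trade-off of the indexed path: unlike a List<T> enumerator, it will not notice if the list is modified while it is being iterated.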
Depends on your implementation, but it would make sense to use indexed arrays for the purpose, instead.
Let's assume you have a function that returns a lazily-enumerated object:
struct AnimalCount
{
public int Chickens;
public int Goats;
public AnimalCount(int chickens, int goats) { Chickens = chickens; Goats = goats; }
}
IEnumerable<AnimalCount> FarmsInEachPen()
{
....
yield return new AnimalCount(x, y);
....
}
You also have two functions that consume two separate IEnumerables, for example:
ConsumeChicken(IEnumerable<int>);
ConsumeGoat(IEnumerable<int>);
How can you call ConsumeChicken and ConsumeGoat without a) converting FarmsInEachPen() with ToList() beforehand (it might have two zillion records) and b) using multi-threading?
Basically:
ConsumeChicken(FarmsInEachPen().Select(x => x.Chickens));
ConsumeGoat(FarmsInEachPen().Select(x => x.Goats));
But without forcing the double enumeration.
I can solve it with multithread, but it gets unnecessarily complicated with a buffer queue for each list.
So I'm looking for a way to split the AnimalCount enumerator into two int enumerators without fully evaluating AnimalCount. There is no problem running ConsumeGoat and ConsumeChicken together in lock-step.
I can feel the solution just out of my grasp but I'm not quite there. I'm thinking along the lines of a helper function that returns an IEnumerable being fed into ConsumeChicken and each time the iterator is used, it internally calls ConsumeGoat, thus executing the two functions in lock-step. Except, of course, I don't want to call ConsumeGoat more than once..
I don't think there is a way to do what you want, since ConsumeChickens(IEnumerable<int>) and ConsumeGoats(IEnumerable<int>) are being called sequentially, each of them enumerating a list separately - how do you expect that to work without two separate enumerations of the list?
Depending on the situation, a better solution is to have ConsumeChicken(int) and ConsumeGoat(int) methods (which each consume a single item), and call them in alternation. Like this:
foreach(var animal in animals)
{
ConsumeChicken(animal.Chickens);
ConsumeGoat(animal.Goats);
}
This will enumerate the animals collection only once.
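To make the single-pass claim concrete, here is a small self-contained sketch. All names (PenCount, Pens, ConsumeBoth, the Enumerations counter) are invented for the illustration; the two accumulator lines stand in for the per-item ConsumeChicken(int) and ConsumeGoat(int) calls.

```csharp
using System;
using System.Collections.Generic;

public struct PenCount
{
    public int Chickens;
    public int Goats;
}

public static class SinglePassDemo
{
    public static int Enumerations; // counts how often the lazy source is started

    public static IEnumerable<PenCount> Pens()
    {
        Enumerations++;
        for (int i = 1; i <= 3; i++)
            yield return new PenCount { Chickens = i, Goats = i * 10 };
    }

    public static (int chickens, int goats) ConsumeBoth(IEnumerable<PenCount> pens)
    {
        int chickens = 0, goats = 0;
        foreach (var pen in pens)
        {
            chickens += pen.Chickens; // stands in for ConsumeChicken(int)
            goats += pen.Goats;       // stands in for ConsumeGoat(int)
        }
        return (chickens, goats);
    }

    public static void Main()
    {
        var totals = ConsumeBoth(Pens());
        Console.WriteLine($"{totals.chickens} {totals.goats} {Enumerations}"); // prints "6 60 1"
    }
}
```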
Also, a note: depending on your LINQ-provider and what exactly it is you're trying to do, there may be better options. For example, if you're trying to get the total sum of both chickens and goats from a database using linq-to-sql or linq-to-entities, the following query..
from a in animals
group a by 0 into g
select new
{
TotalChickens = g.Sum(x => x.Chickens),
TotalGoats = g.Sum(x => x.Goats)
}
will result in a single query, and do the summation on the database-end, which is greatly preferable to pulling the entire table over and doing the summation on the client end.
The way you have posed your problem, there is no way to do this. IEnumerable<T> is a pull enumerable - that is, you can GetEnumerator to the front of the sequence and then repeatedly ask "Give me the next item" (MoveNext/Current). You can't, on one thread, have two different things pulling from the animals.Select(a => a.Chickens) and animals.Select(a => a.Goats) at the same time. You would have to do one then the other (which would require materializing the second).
The suggestion BlueRaja made is one way to change the problem slightly. I would suggest going that route.
The other alternative is to utilize IObservable<T> from Microsoft's reactive extensions (Rx), a push enumerable. I won't go into the details of how you would do that, but it's something you could look into.
Edit:
The above is assuming that ConsumeChickens and ConsumeGoats are both returning void or are at least not returning IEnumerable<T> themselves - which seems like an obvious assumption. I'd appreciate it if the lame downvoter would actually comment.
Actually, the simplest way to achieve what you want is to convert the FarmsInEachPen return value into a push collection (IObservable) and use Reactive Extensions to work with it:
var observable = new Subject<AnimalCount>();
// Subscribe (rather than Do) so the handlers actually run on each OnNext.
observable.Subscribe(x => DoSomethingWithChicken(x.Chickens));
observable.Subscribe(x => DoSomethingWithGoat(x.Goats));
foreach (var item in FarmsInEachPen())
{
observable.OnNext(item);
}
I figured it out, thanks in large part to the path that @Lee put me on.
You need to share a single enumerator between the two zips, and use an adapter function to project the correct element into the sequence.
private static IEnumerable<object> ConsumeChickens(IEnumerable<int> xList)
{
foreach (var x in xList)
{
Console.WriteLine("X: " + x);
yield return null;
}
}
private static IEnumerable<object> ConsumeGoats(IEnumerable<int> yList)
{
foreach (var y in yList)
{
Console.WriteLine("Y: " + y);
yield return null;
}
}
private static IEnumerable<int> SelectHelper(IEnumerator<AnimalCount> enumerator, int i)
{
bool c = i != 0 || enumerator.MoveNext();
while (c)
{
if (i == 0)
{
yield return enumerator.Current.Chickens;
c = enumerator.MoveNext();
}
else
{
yield return enumerator.Current.Goats;
}
}
}
private static void Main(string[] args)
{
var enumerator = GetAnimals().GetEnumerator();
var chickensList = ConsumeChickens(SelectHelper(enumerator, 0));
var goatsList = ConsumeGoats(SelectHelper(enumerator, 1));
var temp = chickensList.Zip(goatsList, (i, i1) => (object) null);
// Enumerating the zipped sequence drives both consumers in lock-step.
int iterations = temp.Count();
Console.WriteLine("Total iterations: " + iterations);
}
For the List object, we have a method called Reverse().
It reverses the order of the list 'in place'; it doesn't return anything.
For the IEnumerable object, we have an extension method called Reverse().
It returns another IEnumerable.
I need to iterate in reverse order through a list, so I can't directly use the second method, because I have a List, and I don't want to reverse it, just iterate backwards.
So I can either do this :
for(int i = list.Count - 1; i >=0; i--)
Or
foreach(var item in list.AsEnumerable().Reverse())
I find this less readable than what I could write if I had an IEnumerable:
foreach(var item in list.Reverse())
I can't understand why these two methods have been implemented this way, with the same name. It is pretty annoying and confusing.
Why is there no extension method called BackwardsIterator() in place of Reverse(), working for all IEnumerable?
I'm more interested in the historical reason for this choice than in the 'how to do it' stuff!
It is worth noting that the list method is a lot older than the extension method. The naming was likely kept the same as Reverse seems more succinct than BackwardsIterator.
If you want to bypass the list version and go to the extension method, you need to treat the list like an IEnumerable<T>:
var numbers = new List<int>();
numbers.Reverse(); // hits list
(numbers as IEnumerable<int>).Reverse(); // hits extension
Or call the extension method as a static method:
Enumerable.Reverse(numbers);
Note that the Enumerable version will need to iterate the underlying enumerable entirely in order to start iterating it in reverse. If you plan on doing this multiple times over the same enumerable, consider permanently reversing the order and iterating it normally.
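A short sketch of the difference between the two methods, for illustration only (the class name ReverseDemo is made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReverseDemo
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Enumerable.Reverse: returns a new lazy sequence, the list is untouched.
        IEnumerable<int> backwards = Enumerable.Reverse(numbers);
        Console.WriteLine(string.Join(",", backwards)); // prints "3,2,1"
        Console.WriteLine(string.Join(",", numbers));   // prints "1,2,3"

        // List<T>.Reverse: returns void and reorders the list itself.
        numbers.Reverse();
        Console.WriteLine(string.Join(",", numbers));   // prints "3,2,1"
    }
}
```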
Write your own BackwardsIterator then!
public static IEnumerable<T> BackwardsIterator<T>(this List<T> lst)
{
for(int i = lst.Count - 1; i >=0; i--)
{
yield return lst[i];
}
}
The existence of List<T>.Reverse long preceded the existence of IEnumerable<T>.Reverse. The reason they are named the same is ... incompetence. It's a horrible botch; clearly the Linq IEnumerable<T> function should have been given a different name ... e.g., Backwards ... since they have quite different semantics. As it is, it lays an awful trap for programmers -- someone might change the type of list from List<T> to, e.g., Collection<T>, and suddenly list.Reverse();, rather than reversing list in place, simply returns an IEnumerable<T> that is discarded. It cannot be overstated just how incompetent it was of MS to give these methods the same name.
To avoid the problem you can define your own extension method
public static IEnumerable<T> Backwards<T>(this IEnumerable<T> source) => source.Reverse();
You can even add a special case for efficient processing of indexable lists:
public static IEnumerable<T> Backwards<T>(this IEnumerable<T> source) =>
source is IList<T> list ? Backwards<T>(list) : source.Reverse();
public static IEnumerable<T> Backwards<T>(this IList<T> list)
{
for (int x = list.Count; --x >= 0;)
yield return list[x];
}
Can I somehow "instruct" LINQ to use binary search when the collection that I'm trying to search is ordered? I'm using an ObservableCollection<T>, populated with ordered data, and I'm trying to use Enumerable.First(<Predicate>). In my predicate, I'm filtering by the value of the field my collection is sorted by.
As far as I know, it's not possible with the built-in methods. However it would be relatively easy to write an extension method that would allow you to write something like that :
var item = myCollection.BinarySearch(i => i.Id, 42);
(assuming, of course, that your collection implements IList; there's no way to perform a binary search if you can't access the items randomly)
Here's a sample implementation :
public static T BinarySearch<T, TKey>(this IList<T> list, Func<T, TKey> keySelector, TKey key)
where TKey : IComparable<TKey>
{
if (list.Count == 0)
throw new InvalidOperationException("Item not found");
int min = 0;
int max = list.Count;
while (min < max)
{
int mid = min + ((max - min) / 2);
T midItem = list[mid];
TKey midKey = keySelector(midItem);
int comp = midKey.CompareTo(key);
if (comp < 0)
{
min = mid + 1;
}
else if (comp > 0)
{
max = mid - 1;
}
else
{
return midItem;
}
}
if (min == max &&
min < list.Count &&
keySelector(list[min]).CompareTo(key) == 0)
{
return list[min];
}
throw new InvalidOperationException("Item not found");
}
(not tested... a few adjustments might be necessary) Now tested and fixed ;)
The fact that it throws an InvalidOperationException may seem strange, but that's what Enumerable.First does when there's no matching item.
The accepted answer is very good.
However, I need that the BinarySearch returns the index of the first item that is larger, as the List<T>.BinarySearch() does.
So I looked at its implementation using ILSpy, then modified it to take a selector parameter. I hope it will be as useful to someone else as it is to me:
public static class ListExtensions
{
public static int BinarySearch<T, U>(this IList<T> tf, U target, Func<T, U> selector)
{
var lo = 0;
var hi = (int)tf.Count - 1;
var comp = Comparer<U>.Default;
while (lo <= hi)
{
var median = lo + (hi - lo >> 1);
var num = comp.Compare(selector(tf[median]), target);
if (num == 0)
return median;
if (num < 0)
lo = median + 1;
else
hi = median - 1;
}
return ~lo;
}
}
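The `~lo` return value follows the convention of the built-in List<T>.BinarySearch: a negative result is the bitwise complement of the position where the target would be inserted. The sketch below demonstrates that convention with the built-in method; the helper name InsertionIndex is made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ComplementDemo
{
    // Hypothetical helper: turns a BinarySearch result into the index of a
    // match, or the index where the target would have to be inserted.
    public static int InsertionIndex(int searchResult) =>
        searchResult >= 0 ? searchResult : ~searchResult;

    public static void Main()
    {
        var sorted = new List<int> { 10, 20, 30, 40 };
        Console.WriteLine(sorted.BinarySearch(30));                 // prints "2"
        Console.WriteLine(sorted.BinarySearch(25));                 // prints "-3"
        Console.WriteLine(InsertionIndex(sorted.BinarySearch(25))); // prints "2"
        Console.WriteLine(InsertionIndex(sorted.BinarySearch(99))); // prints "4"
    }
}
```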
Well, you can write your own extension method over ObservableCollection<T> - but then that will be used for any ObservableCollection<T> where your extension method is available, without knowing whether it's sorted or not.
You'd also have to indicate in the predicate what you wanted to find - which would be better done with an expression tree... but that would be a pain to parse. Basically, the signature of First isn't really suitable for a binary search.
I suggest you don't try to overload the existing signatures, but write a new one, e.g.
public static TElement BinarySearch<TElement, TKey>
(this IList<TElement> collection, Func<TElement, TKey> keySelector,
TKey key)
(I'm not going to implement it right now, but I can do so later if you want.)
By providing a function, you can search by the property the collection is sorted by, rather than by the items themselves.
Enumerable.First(predicate) works on an IEnumerable<T>, which only supports enumeration; therefore it does not have random access to the items within.
Also, your predicate contains arbitrary code that eventually results in true or false, and so cannot indicate whether the tested item was too low or too high. This information would be needed in order to do a binary search.
Enumerable.First(predicate) can only test each item in order as it walks through the enumeration.
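To illustrate the point about predicates: a binary search needs a three-way answer (too low, too high, match), which a bool cannot carry. A minimal sketch, assuming a hypothetical helper FindIndex that takes such a comparison delegate instead of a predicate:

```csharp
using System;
using System.Collections.Generic;

public static class CompareSearchSketch
{
    // Hypothetical helper: `compare` returns <0 if the item sorts before the
    // target, >0 if it sorts after, and 0 on a match.
    public static int FindIndex<T>(IReadOnlyList<T> sorted, Func<T, int> compare)
    {
        int lo = 0, hi = sorted.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            int c = compare(sorted[mid]);
            if (c == 0) return mid;
            if (c < 0) lo = mid + 1; else hi = mid - 1;
        }
        return -1; // not found
    }

    public static void Main()
    {
        var names = new[] { "ann", "bob", "cid", "dee" };
        int index = FindIndex(names, s => string.CompareOrdinal(s, "cid"));
        Console.WriteLine(index); // prints "2"
    }
}
```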
Keep in mind that all (or at least most) of the extension methods used by LINQ are implemented on IQueryable<T>, IEnumerable<T>, IOrderedEnumerable<T> or IOrderedQueryable<T>.
None of these interfaces supports random access, and therefore none of them can be used for a binary search. One of the benefits of something like LINQ is that you can work with large datasets without having to return the entire dataset from the database. Obviously you can't binary search something if you don't even have all of the data yet.
But as others have said, there is no reason at all you can't write this extension method for IList<T> or other collection types that support random access.