What is the best way to find something in a list? I know LINQ has some nice tricks, but let's also get suggestions for C# 2.0. Let's get the best refactorings for this common code pattern.
Currently I use code like this:
// mObjList is a List<MyObject>
MyObject match = null;
foreach (MyObject mo in mObjList)
{
if (Criteria(mo))
{
match = mo;
break;
}
}
or
// mObjList is a List<MyObject>
bool foundIt = false;
foreach (MyObject mo in mObjList)
{
if (Criteria(mo))
{
foundIt = true;
break;
}
}
@Konrad: So how do you use it? Let's say I want to match mo.ID to magicNumber.
In C# 2.0 you'd write:
result = mObjList.Find(delegate(MyObject x) { return x.ID == magicNumber; });
3.0 knows lambdas:
result = mObjList.Find(x => x.ID == magicNumber);
Using a Lambda expression:
List<MyObject> list = new List<MyObject>();
// populate the list with objects..
return list.Find(o => o.Id == myCriteria);
Put the code in a method and you save a temporary and a break (and you recycle code, as a bonus):
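T Find<T>(IEnumerable<T> items, Predicate<T> p) {
    foreach (T item in items)
        if (p(item))
            return item;
    // default(T) is null for reference types; "return null" does not compile for an unconstrained T.
    return default(T);
}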
… but of course this method already exists anyway for Lists, even in .NET 2.0.
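For reference, here is a minimal sketch of the built-in calls that cover both loops from the question. List<T>.Find and List<T>.Exists are available from .NET 2.0; FirstOrDefault and Any need System.Linq (.NET 3.5). The MyObject class with its ID field, the FindDemo class and the method names are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's MyObject.
class MyObject
{
    public int ID;
    public MyObject(int id) { ID = id; }
}

static class FindDemo
{
    // Replaces the first loop: returns the first match, or null when nothing matches.
    public static MyObject FindById(List<MyObject> list, int magicNumber)
    {
        return list.Find(delegate(MyObject x) { return x.ID == magicNumber; });
    }

    // Replaces the second loop: only reports whether a match exists.
    public static bool ContainsId(List<MyObject> list, int magicNumber)
    {
        return list.Exists(delegate(MyObject x) { return x.ID == magicNumber; });
    }

    // LINQ equivalents, which work on any IEnumerable<T>.
    public static MyObject FindByIdLinq(IEnumerable<MyObject> items, int magicNumber)
    {
        return items.FirstOrDefault(x => x.ID == magicNumber);
    }

    public static bool ContainsIdLinq(IEnumerable<MyObject> items, int magicNumber)
    {
        return items.Any(x => x.ID == magicNumber);
    }
}
```

Find and FirstOrDefault return default(T) when nothing matches, which is null for a class but 0 for an int, so for value types Exists or Any is the safer way to test for a match.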
Evidently the performance hit of anonymous delegates is pretty significant.
Test code:
static void Main(string[] args)
{
for (int kk = 0; kk < 10; kk++)
{
List<int> tmp = new List<int>();
for (int i = 0; i < 100; i++)
tmp.Add(i);
int sum = 0;
long start = DateTime.Now.Ticks;
for (int i = 0; i < 1000000; i++)
sum += tmp.Find(delegate(int x) { return x == 3; });
Console.WriteLine("Anonymous delegates: " + (DateTime.Now.Ticks - start));
start = DateTime.Now.Ticks;
sum = 0;
for (int i = 0; i < 1000000; i++)
{
int match = 0;
for (int j = 0; j < tmp.Count; j++)
{
if (tmp[j] == 3)
{
match = tmp[j];
break;
}
}
sum += match;
}
Console.WriteLine("Classic C++ Style: " + (DateTime.Now.Ticks - start));
Console.WriteLine();
}
}
Results:
Anonymous delegates: 710000
Classic C++ Style: 340000
Anonymous delegates: 630000
Classic C++ Style: 320000
Anonymous delegates: 630000
Classic C++ Style: 330000
Anonymous delegates: 630000
Classic C++ Style: 320000
Anonymous delegates: 610000
Classic C++ Style: 340000
Anonymous delegates: 630000
Classic C++ Style: 330000
Anonymous delegates: 650000
Classic C++ Style: 330000
Anonymous delegates: 620000
Classic C++ Style: 330000
Anonymous delegates: 620000
Classic C++ Style: 340000
Anonymous delegates: 620000
Classic C++ Style: 400000
Across these runs, the anonymous delegate version takes roughly 1.5 to 2 times as long as the hand-written loop.
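DateTime.Now is not really meant for fine-grained timing, so it may be worth repeating the measurement with System.Diagnostics.Stopwatch and a warm-up pass. The following is only a sketch of how that could look; the FindBenchmark class and its method names are invented for the example, and the numbers it prints will differ by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class FindBenchmark
{
    // Runs the action once and returns the elapsed time in milliseconds.
    public static long TimeMs(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Same work as the "Anonymous delegates" loop in the test above.
    public static int SumWithFind(List<int> list, int iterations)
    {
        int sum = 0;
        for (int i = 0; i < iterations; i++)
            sum += list.Find(delegate(int x) { return x == 3; });
        return sum;
    }

    // Same work as the hand-written loop in the test above.
    public static int SumWithLoop(List<int> list, int iterations)
    {
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            int match = 0;
            for (int j = 0; j < list.Count; j++)
            {
                if (list[j] == 3) { match = list[j]; break; }
            }
            sum += match;
        }
        return sum;
    }

    public static void Main()
    {
        List<int> list = new List<int>();
        for (int i = 0; i < 100; i++) list.Add(i);

        // Warm-up pass so JIT compilation is not part of the measurement.
        SumWithFind(list, 1000);
        SumWithLoop(list, 1000);

        Console.WriteLine("Find with delegate: " + TimeMs(() => SumWithFind(list, 1000000)) + " ms");
        Console.WriteLine("Hand-written loop:  " + TimeMs(() => SumWithLoop(list, 1000000)) + " ms");
    }
}
```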
int[] array = new int[5]{5,7,8,15,20};
int targetNumber = 13;
For a target number, I want to find the closest number in an array. For example, when the target number is 13, the closest number to it in the array above is 15. How would I accomplish that programmatically in C#?
EDIT: Have adjusted the queries below to convert to using long arithmetic, so that we avoid overflow issues.
I would probably use MoreLINQ's MinBy method:
var nearest = array.MinBy(x => Math.Abs((long) x - targetNumber));
Or you could just use:
var nearest = array.OrderBy(x => Math.Abs((long) x - targetNumber)).First();
... but that will sort the whole collection, which you really don't need. It won't make much difference for a small array, admittedly... but it just doesn't feel quite right, compared with describing what you're actually trying to do: find the element with the minimum value according to some function.
Note that both of these will fail if the array is empty, so you should check for that first.
If you're using .NET 3.5 or above, LINQ can help you here:
var closest = array.OrderBy(v => Math.Abs((long)v - targetNumber)).First();
Alternatively, you could write your own extension method:
public static int ClosestTo(this IEnumerable<int> collection, int target)
{
    // NB Method will return int.MaxValue for a sequence containing no elements.
    // Apply any defensive coding here as necessary.
    var closest = int.MaxValue;
    // Track the difference as a long: it can exceed int.MaxValue
    // (for example element = int.MinValue and target = int.MaxValue).
    var minDifference = long.MaxValue;
    foreach (var element in collection)
    {
        var difference = Math.Abs((long)element - target);
        if (minDifference > difference)
        {
            minDifference = difference;
            closest = element;
        }
    }
    return closest;
}
Useable like so:
var closest = array.ClosestTo(targetNumber);
Both Jon and Rich gave great answers with MinBy and ClosestTo. But I would never recommend using OrderBy if your intent is to find a single element. It's far too inefficient for those kinds of tasks. It's simply the wrong tool for the job.
Here's a technique that performs marginally better than MinBy and is already included in the .NET Framework, although it is less elegant than MinBy: Aggregate.
var nearest = array.Aggregate((current, next) => Math.Abs((long)current - targetNumber) < Math.Abs((long)next - targetNumber) ? current : next);
As I said, not as elegant as Jon's method, but viable.
Performance on my computer:
For(each) Loops = fastest
Aggregate = 2.5x slower than loops
MinBy = 3.5x slower than loops
OrderBy = 12x slower than loops
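I found this really sexy approach years ago in Math.NET Numerics https://numerics.mathdotnet.com/ which works with BinarySearch in the array. It was a good help in preparation for interpolations and works down to .NET 2.0. Note that the array must be sorted, and that the method returns the index of the left end of the segment containing t rather than the closest value itself: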
public static int LeftSegmentIndex(double[] array, double t)
{
int index = Array.BinarySearch(array, t);
if (index < 0)
{
index = ~index - 1;
}
return Math.Min(Math.Max(index, 0), array.Length - 2);
}
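To get from the segment index to the closest value asked for in the question, one option is to compare the two neighbours around the insertion point. This is a sketch of that idea for int arrays, not code from Math.NET Numerics; the SortedSearch class and the ClosestInSorted name are invented here, and the input is assumed to be sorted in ascending order.

```csharp
using System;

static class SortedSearch
{
    // Returns the element of a sorted, non-empty array that is closest to target.
    // When two elements are equally close, the smaller one is returned.
    public static int ClosestInSorted(int[] sorted, int target)
    {
        if (sorted == null || sorted.Length == 0)
            throw new ArgumentException("Array must not be empty.", "sorted");

        int index = Array.BinarySearch(sorted, target);
        if (index >= 0)
            return sorted[index];               // exact match

        int next = ~index;                      // index of the first element greater than target
        if (next == 0)
            return sorted[0];                   // target is below every element
        if (next == sorted.Length)
            return sorted[sorted.Length - 1];   // target is above every element

        // Compare both neighbours, using long arithmetic to avoid overflow.
        long below = (long)target - sorted[next - 1];
        long above = (long)sorted[next] - target;
        return below <= above ? sorted[next - 1] : sorted[next];
    }
}
```

For the array from the question, SortedSearch.ClosestInSorted(new int[] { 5, 7, 8, 15, 20 }, 13) returns 15. The search itself is O(log n), but that only pays off if the array is already sorted or is searched many times.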
If you need to find the value closest to the average, here it is in a very explicit style:
public static double Miidi(double[] list)
{
    // An empty array has no average; return 0 as before.
    if (list.Length == 0)
    {
        return 0;
    }
    double avg = list.Average();
    // Start from the first element rather than a fixed 100, so that
    // values far away from the average are handled correctly too.
    double closest = list[0];
    double shortest = double.MaxValue;
    for (int i = 0; i < list.Length; i++)
    {
        // Distance of this element from the average.
        double lgth = Math.Abs(list[i] - avg);
        if (lgth < shortest)
        {
            shortest = lgth;
            closest = list[i];
        }
    }
    return closest;
}
Performance-wise, custom code will be more useful.
public static int FindNearest(int targetNumber, IEnumerable<int> collection) {
    var results = collection.ToArray();
    // An exact match is always the nearest value.
    if (results.Contains(targetNumber))
        return targetNumber;
    // Nullable ints, so that a legitimate value of 0 is not mistaken for "not found".
    int? greaterThanTarget = null;
    int? lessThanTarget = null;
    if (results.Any(ab => ab > targetNumber)) {
        greaterThanTarget = results.Where(i => i > targetNumber).Min();
    }
    if (results.Any(ab => ab < targetNumber)) {
        lessThanTarget = results.Where(i => i < targetNumber).Max();
    }
    if (lessThanTarget == null && greaterThanTarget == null) {
        throw new InvalidOperationException("Collection is empty");
    }
    if (lessThanTarget == null) {
        return greaterThanTarget.Value;
    }
    if (greaterThanTarget == null) {
        return lessThanTarget.Value;
    }
    // Compare the distances in long arithmetic to avoid overflow.
    if ((long)targetNumber - lessThanTarget.Value < (long)greaterThanTarget.Value - targetNumber) {
        return lessThanTarget.Value;
    }
    return greaterThanTarget.Value;
}
I'm asking this question out of curiosity...
Here is my code:
for (int i = 0; i < myList.Count - 1; ++i)
{
for (int j = i+1; j < myList.Count; ++j)
{
DoMyStuff(myList[i], myList[j]);
}
}
Pretty simple loop, but obviously it only works with List...
But I was wondering... how can I code this loop so that it is independent of the collection's type (anything deriving from IEnumerable)?
My first thought:
IEnumerator it1 = myList.GetEnumerator();
while (it1.MoveNext())
{
IEnumerator it2 = it1; // this part is obviously wrong
while (it2.MoveNext())
{
DoMyStuff(it1.Current, it2.Current);
}
}
Because enumerators don't have an efficient way of getting the n'th element, your best bet is to copy the enumerable into a list, then use your existing code:
void CrossMap<T>(IEnumerable<T> enumerable)
{
List<T> myList = enumerable.ToList();
for (int i = 0; i < myList.Count - 1; ++i)
{
for (int j = i+1; j < myList.Count; ++j)
{
DoMyStuff(myList[i], myList[j]);
}
}
}
However, there is a rather tricksie hack you can do with some collection types. Because the enumerators of some of the collection types in the BCL are declared as value types, rather than reference types, you can create an implicit clone of the state of an enumerator by copying it to another variable:
// notice the struct constraint!
void CrossMap<TEnum, T>(TEnum enumerator) where TEnum : struct, IEnumerator<T>
{
while (enumerator.MoveNext())
{
TEnum enum2 = enumerator; // value type, so this makes an implicit clone!
while (enum2.MoveNext())
{
DoMyStuff(enumerator.Current, enum2.Current);
}
}
}
// to use (you have to specify the type args exactly)
List<int> list = Enumerable.Range(0, 10).ToList();
CrossMap<List<int>.Enumerator, int>(list.GetEnumerator());
This is quite obscure, and quite hard to use, so you should only do this if performance and space are critical.
Here is a way that will truly use the lazy IEnumerable paradigm to generate a stream of non-duplicated combinations from a single IEnumerable input. The first pair will return immediately (no caching of lists), but there will be increasing delays (still imperceptible except for very high values of n or very expensive IEnumerables) during the Skip(n) operation, which occurs after every move forward on the outer enumerator:
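public static IEnumerable<Tuple<T, T>> Combinate<T>(this IEnumerable<T> enumerable) {
    // Dispose the outer enumerator when iteration finishes or is abandoned.
    using (var outer = enumerable.GetEnumerator()) {
        var n = 1;
        while (outer.MoveNext()) {
            // Pair the current element with every element that follows it.
            foreach (var item in enumerable.Skip(n))
                yield return Tuple.Create(outer.Current, item);
            n++;
        }
    }
}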
Here is how you would use it in your case:
foreach(var pair in mySource.Combinate())
DoMyStuff(pair.Item1, pair.Item2);
Postscript
Everyone has pointed out (here and elsewhere) that there is no efficient way of getting the "nth" element of an IEnumerable. This is partly because IEnumerable does not require there to even be an underlying source collection. For example, here's a silly little function that dynamically generates values for an experiment as quickly as they can be consumed, and continues for a specified period of time rather than for any count:
public static IEnumerable<double> Sample(double milliseconds, Func<double> generator) {
var sw = new Stopwatch();
var timeout = TimeSpan.FromMilliseconds(milliseconds);
sw.Start();
while (sw.Elapsed < timeout)
yield return generator();
}
There are extension methods Count() and ElementAt(int) that are declared on IEnumerable<T>. They are declared in the System.Linq namespace, which should be included by default in your .cs files if you are using any C# version later than C# 3. That means that you could just do:
for (int i = 0; i < myList.Count() - 1; ++i)
{
for (int j = i+1; j < myList.Count(); ++j)
{
DoMyStuff(myList.ElementAt(i), myList.ElementAt(j));
}
}
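However, note that these are methods, and will be called over and over again during iteration. For a sequence that does not implement ICollection<T> or IList<T>, each Count() or ElementAt() call has to walk the sequence, so you might want to save their results to variables, like: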
var elementCount = myList.Count();
for (int i = 0; i < elementCount - 1; ++i)
{
var iElement = myList.ElementAt(i);
for (int j = i+1; j < elementCount; ++j)
{
DoMyStuff(iElement, myList.ElementAt(j));
}
}
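Even with the results cached, ElementAt(j) in the inner loop can still walk the sequence each time when the source is not a list. A possible middle ground, sketched below, is to index directly when the source already implements IList<T> and to copy it once otherwise. The PairHelper class, the ForEachPair name and the action parameter (standing in for DoMyStuff) are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PairHelper
{
    // Calls action for every pair of elements at positions i < j.
    public static void ForEachPair<T>(IEnumerable<T> source, Action<T, T> action)
    {
        // Reuse the source when it already supports indexing; otherwise copy it once.
        IList<T> list = source as IList<T> ?? source.ToList();
        for (int i = 0; i < list.Count - 1; ++i)
        {
            for (int j = i + 1; j < list.Count; ++j)
            {
                action(list[i], list[j]);
            }
        }
    }
}
```

It could be called as PairHelper.ForEachPair(myList, DoMyStuff), assuming DoMyStuff takes two arguments of the element type.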
You could also try some LINQ that will select all pairs of elements that are eligible, and then use a simple foreach to call the processing, something like:
var result = myList.SelectMany((avalue, aindex) =>
myList.Where((bvalue, bindex) => aindex < bindex)
.Select(bvalue => new {First = avalue, Second = bvalue}));
foreach (var item in result)
{
DoMyStuff(item.First, item.Second);
}
I'd write against IEnumerable<T> and pass a delegate for the indexing operation:
public static void DoStuff<T>(IEnumerable<T> seq, Func<int, T> selector)
{
int count = seq.Count();
for (int i = 0; i < count - 1; ++i)
{
for (int j = i+1; j < count; ++j)
{
DoMyStuff(selector(i), selector(j));
}
}
}
You can call it using:
List<T> list = //whatever
DoStuff(list, i => list[i]);
If you restrict the collection argument to ICollection<T> you can use the Count property instead of using the Count() extension method.
Not really efficient, but readable:
int i = 0;
foreach( var item1 in myList)
{
++i;
foreach( var item2 in myList.Skip(i))
DoMyStuff(item1, item2);
}
You can do it fairly succinctly using IEnumerable.Skip(), and it might even be fairly fast compared with copying the list into an array IF the list is short enough. It's bound to be a lot slower than the copying for lists of a sufficient size, though.
You'd have to do some timings with lists of various sizes to see where copying to an array becomes more efficient.
Here's the code. Note that it iterates the enumerable more than once, which will be OK as long as the enumerable can be enumerated repeatedly and yields the same sequence each time!
static void test(IEnumerable<int> myList)
{
int n = 0;
foreach (int v1 in myList)
{
foreach (int v2 in myList.Skip(++n))
{
DoMyStuff(v1, v2);
}
}
}
I've got the following to sort entities by their job position. The desired order is defined in another array. In C# this code works:
IEnumerable<CreditObject> query = credits.OrderBy(x =>
{
for (int i = 0; i < list.Length; i++)
{
if (x.Job == list[i])
return i;
}
throw new NotImplementedException("Job not within List");
});
However I will have to convert this to VB.net. I read the equivalent would be something like the following:
Dim query As IEnumerable(Of CreditObject) = credits.OrderBy(Function(x)
For j As Integer = 0 To templ.Length - 1
If x.Job = templ(j) Then
Return j
End If
Next
End Function)
This does not compile, gives me "Expression expected" right after the Function(x). What am I doing wrong?
First, you make that into a bona fide method:
public int GetCreditObjectPosition(CreditObject x, List<int> list) {
    // List<T> exposes Count; Length exists only on arrays.
    for (int i = 0; i < list.Count; i++) {
        if (x.Job == list[i]) {
            return i;
        }
    }
    throw new NotImplementedException("Job not within List");
}
Then, you just say:
IEnumerable<CreditObject> query =
credits.OrderBy(x => GetCreditObjectPosition(x, list));
That's easy enough to convert to VB.
Next, you rewrite GetCreditObjectPosition to use a dictionary lookup. For this to give the massive performance improvement, the dictionary should be built once and reused; as written here it is rebuilt on every call:
public int GetCreditObjectPosition(CreditObject x, List<int> list) {
    var jobDictionary =
        list.Select((job, index) => new { Job = job, Index = index } )
            .ToDictionary(item => item.Job, item => item.Index);
    int position;
    if(!jobDictionary.TryGetValue(x.Job, out position)) {
        throw new Exception("Job not within List");
    }
    return position;
}
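One way to build the lookup only once is to create it right before the sort and let the key selector close over it. This is a rough sketch under assumptions: the CreditObject class below is a stand-in with an int Job (the real type of Job is not shown in the question), and CreditSorter, SortByJobOrder and jobOrder are names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in; Job is assumed to be an int here.
class CreditObject
{
    public int Job;
}

static class CreditSorter
{
    public static IEnumerable<CreditObject> SortByJobOrder(
        IEnumerable<CreditObject> credits, IList<int> jobOrder)
    {
        // Build the job -> position lookup once, before sorting.
        // ToDictionary throws if jobOrder contains the same job twice.
        Dictionary<int, int> positions = jobOrder
            .Select((job, index) => new { Job = job, Index = index })
            .ToDictionary(item => item.Job, item => item.Index);

        // The key selector only performs a dictionary lookup per element.
        return credits.OrderBy(x =>
        {
            int position;
            if (!positions.TryGetValue(x.Job, out position))
                throw new ArgumentException("Job not within list");
            return position;
        });
    }
}
```

Because OrderBy is evaluated lazily, the exception for an unknown job is only thrown once the result is enumerated.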
Are there any differences in the references that are produced in code generated for anonymous methods by a .NET 2.0 or 4.0 compiler and code generated for an equivalent lambda by a .NET 4.0 compiler? I'm asking in particular about the this pointer. I know both anonymous methods and lambdas are a C# compiler feature, and that the compiler actually generates a nested class with a delegate and all the references required for outer variables. However, this article on the implementation of anonymous methods states that a reference is kept to the this pointer, and I cannot find any source describing anything similar for lambdas. Or am I not finding anything because the implementation for compiling anonymous methods maps one-to-one to that of lambdas?
Here's a bit of code to demonstrate anonymous methods and lambdas:
class AnonymousMethodMethodScope
{
private Func<bool> d;
public Func<int, bool> d2;
int j = 0;
public void Test(int i)
{
d = new Func<bool>(delegate { j = 10; return j > i; });
// what references does this anonymous method keep?
d2 = new Func<int, bool>(delegate(int x) { return x == j; });
Console.WriteLine("j = " + j + " result = " + d());
}
}
class LambdaMethodScope
{
private Func<bool> d;
public Func<int, bool> d2;
public void Test(int i)
{
int j = 0;
d = () => { j = 10; return j > i; };
// what references does this lambda keep?
d2 = x => x == j;
Console.WriteLine("j = " + j + " result = " + d());
}
}
Yes, lambda expressions will do (and have to do) the same thing as anonymous methods when it comes to capturing variables. (I'm assuming you're talking about lambda expressions which are converted into delegates; if they're converted into expression trees they may be a bit different - I'm not sure.)
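A small experiment can make the observable behaviour visible without looking at the generated classes. The sketch below (the CaptureDemo class and its member names are invented for this example) creates one anonymous method and one lambda over the same variable, changes the variable afterwards, and then calls both delegates. It only shows that both forms see the variable itself rather than a copy; it says nothing about how a particular compiler lays out the generated code.

```csharp
using System;

class CaptureDemo
{
    private int field = 1;

    // Returns { anonymous method result, lambda result } for a captured local.
    public static int[] CaptureLocal()
    {
        int j = 0;
        Func<int> viaAnonymousMethod = delegate { return j; };
        Func<int> viaLambda = () => j;
        j = 42; // changed after both delegates were created
        return new[] { viaAnonymousMethod(), viaLambda() };
    }

    // Same experiment for an instance field, which is reached through 'this'.
    public int[] CaptureThis()
    {
        Func<int> viaAnonymousMethod = delegate { return field; };
        Func<int> viaLambda = () => field;
        field = 7; // changed after both delegates were created
        return new[] { viaAnonymousMethod(), viaLambda() };
    }
}
```

Both delegates in CaptureLocal return 42 and both in CaptureThis return 7, because each one reads the captured variable (or the field through this) at call time.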