How to compare two lists in C#?

I have two lists to compare:
I want to compare List A and List B such that if any of the dates from ListB is present in List A then return true.
For example, if 14-01-2020 (which is in List B) is present in List A (which is definitely present) then it should return true.
How to do that?
Please note: The data in List A contains the dates of an entire month, whereas List B contains only a few dates.

If any of the dates from ListB is present in List A then return true.
return ListB.Any(x => ListA.Contains(x));
or vice versa:
return ListA.Any(x => ListB.Contains(x));
Which one is better for you will depend on the nature of your data, but I'd normally favor running Contains() over the shorter sequence.
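If both lists are large, a hash-based lookup avoids the repeated linear scans that List.Contains() performs. This is a different technique from the two one-liners above, shown only as a minimal sketch: the class and method names are made up for illustration, and it assumes the dates are stored as DateTime values with identical time components.

```csharp
using System;
using System.Collections.Generic;

public static class DateListOverlap
{
    // Builds a hash set from one list, then checks whether the other
    // list shares at least one element with it.
    public static bool AnyDateInCommon(IEnumerable<DateTime> listA, IEnumerable<DateTime> listB)
    {
        var lookup = new HashSet<DateTime>(listA);
        return lookup.Overlaps(listB);
    }
}
```

Calling DateListOverlap.AnyDateInCommon(ListA, ListB) should give the same answer as the Any/Contains versions.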
Additionally, I see this:
The data in List A contains the dates of an entire month
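Depending on exactly what you mean, you may be able to take advantage of that fact. If List A really contains every date of the month, it is enough to check whether any date in List B falls within A's range:
var start = ListA.Min();
var stop = ListA.Max();
return ListB.Any(x => x >= start && x <= stop);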
Finally, if you know the data in one or both of the sequences is sorted, you can optimize these significantly.
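As one possible illustration of the sorted case, here is a sketch that walks both lists in step, the way a merge does. It assumes both lists are already sorted ascending and hold DateTime values; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDateLists
{
    // Returns true if two ascending-sorted lists share at least one value.
    // Assumption: both inputs are sorted ascending; otherwise matches can be missed.
    public static bool SharesAnyDate(IList<DateTime> sortedA, IList<DateTime> sortedB)
    {
        int i = 0, j = 0;
        while (i < sortedA.Count && j < sortedB.Count)
        {
            int cmp = sortedA[i].CompareTo(sortedB[j]);
            if (cmp == 0) return true;   // common date found
            if (cmp < 0) i++;            // advance whichever side is behind
            else j++;
        }
        return false;
    }
}
```

Each element is visited at most once, so the cost grows with the combined length of the two lists rather than with their product.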

Related

C# element-wise difference between two lists of numbers

Assume
List<int> diff(List<int> a, List<int> b)
{
    // assume same length lists
    List<int> diff = new List<int>(a.Count);
    for (int i = 0; i < a.Count; ++i)
    {
        diff.Add(a[i] - b[i]);
    }
    return diff;
}
I would like to have some kind of one-liner do the same, or something that uses a lambda, rather than re-writing all the boilerplate.
for instance, in python, this would be either
[ai-bi for ai,bi in zip(a,b)]
or even
np.array(a) - np.array(b)
Is there a nice way to write this in C#? All my searches find ways to remove or add list elements, but nothing about element-wise actions.
Linq has a Zip method as well:
var diff = a.Zip(b, (ai, bi) => ai - bi);
Note that one potential bug in your code is if b has fewer elements than a then you'd get an exception when you try to access an element outside the range of b. Zip will only return items as long as both collections have items, which is effectively the shorter of the two collection lengths.
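To make the truncation behaviour concrete, here is a small sketch that wraps the Zip call and materialises the result as a list. The class and method names are invented for the example; Enumerable.Zip itself requires .NET 4.0 or later.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ElementWise
{
    // Element-wise difference; Zip stops at the end of the shorter input,
    // so no out-of-range access can occur.
    public static List<int> Diff(IEnumerable<int> a, IEnumerable<int> b)
    {
        return a.Zip(b, (ai, bi) => ai - bi).ToList();
    }
}
```

For example, ElementWise.Diff(new[] { 5, 7, 9 }, new[] { 1, 2 }) yields two elements, 4 and 5; the unmatched 9 is silently dropped, so check the lengths first if a mismatch should be an error.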

Efficiently calculating totals from a file using LINQ

I'm reading a file and turning each line within it into a class, let's call it Record, and returning each Record as it is read using IEnumerable<Record> and yield return.
Because of this I only start actually performing these reads whenever I do an operation on the enumeration, such as performing a sum on it or iterating through it with a foreach.
I do need to go through each record and then translate that into a database, but due to database design before my time I need the totals on each record in the database, so I need these totals before I start translating them into my database.
At the moment I have five separate .Count() or .Sum() operations on my enumeration before I start iterating the enumeration (example int i = records.Sum(r => r.SomeField) or int j = records.Count(r => r.IsSomethingTrue)). Each one of those counts or sums will loop through the entire file to calculate each one separately. I'm not really happy with this behaviour and would like to find a more efficient way of doing this.
I am using .NET 3.5 if that makes any difference.
You could use your own struct to calculate several values in a single pass through an enumerable object.
public struct ComplexAccumulator
{
    public int TotalSumField { get; set; }
    public int CountSomethingTrue { get; set; }
}
Now you can use the Aggregate extension method to accumulate the values:
records.Aggregate(default(ComplexAccumulator), (a, r) => new ComplexAccumulator
{
    TotalSumField = a.TotalSumField + r.SomeField,
    // the parentheses are required: + binds tighter than ?:
    CountSomethingTrue = a.CountSomethingTrue + (r.IsSomethingTrue ? 1 : 0),
});
Instead of the struct you could use a suitable Tuple instance, for example Tuple<int, int, int> (note that Tuple requires .NET 4 or later).
Efficiency is not a strength of LINQ... You need to replace some LINQ things with manual loops here.
You seem to need two passes over the data. One for aggregation:
var sum = 0; //etc.
foreach (var item in items) {
//compute all 5 aggregates here
}
And then one to translate the data:
items.Select(item => Translate(item, aggregates))
Whether you should buffer items (for example using ToList) or not depends on whether available memory can hold those items or not.
You can use Aggregate to perform all 5 aggregations in one pass but that's not better than a loop in any way. It's slower, far more code and the code arguably is illegible.
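A rough sketch of the loop-based approach, assuming the records fit in memory: the source is enumerated once, the totals are accumulated along the way, and the records are buffered so the translation pass does not re-read the file. Record, Totals and RecordTotals are hypothetical types; only SomeField and IsSomethingTrue come from the question.

```csharp
using System.Collections.Generic;

// Hypothetical record shape, using the member names from the question.
public class Record
{
    public int SomeField { get; set; }
    public bool IsSomethingTrue { get; set; }
}

// Hypothetical holder for the aggregates (only two of the five shown).
public class Totals
{
    public int SumSomeField;
    public int CountSomethingTrue;
}

public static class RecordTotals
{
    // Enumerates the lazy source exactly once, filling 'buffer' so that
    // a later pass can work from memory instead of the file.
    public static Totals Compute(IEnumerable<Record> records, List<Record> buffer)
    {
        var totals = new Totals();
        foreach (var r in records)
        {
            totals.SumSomeField += r.SomeField;
            if (r.IsSomethingTrue) totals.CountSomethingTrue++;
            buffer.Add(r);
        }
        return totals;
    }
}
```

If the file is too large to buffer, drop the buffer and accept a second read of the file for the translation step; that is still two passes instead of six.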

LINQ, Lambda, C#, extension methods

I've only been playing with linq to sql and lambda expressions for the first time for a few days and I want to do the following.
I've got a string extension method that returns a double. The extension method tests two strings and returns a similarity score.
I have a list of string values from a column in a table using linq to sql and I want to use the extension method as a way of filtering out only those strings whose similarity score against the input string is equal to or greater than a threshold.
I've got the below so far. I don't seem to be able to test the value of the returned double.
List<int> ids = dc.ErrorIndexTolerances
.Where(n => n.Token.Distance(s) => .85)
.Select(n => n.ID)
.ToList();
The Distance Method is the extension method that returns a double. Both Token and s are string. ID is an integer ID field within a table.
Does anyone have any tips?
The greater or equal operator is >=, not =>.
List<int> ids =
dc.ErrorIndexTolerances.Where(n => n.Token.Distance(s) >= .85)
.Select(n => n.ID).ToList();
Perhaps this should be
n.Token.Distance(s) >= .85
Just a typo :-)
Does anyone have any tips?
I have a tip... never use "greater than", only use "less than".
.Where(n => .85 <= n.Token.Distance(s))
I follow this rule mainly because of date logic. When comparing 5 sets of dates, it's good to never make the mistake of mis-reading the sign. The small one is on the left and the big one is on the right, 100% of the time.
.Where(acct => acct.CreateTime <= now
&& acct.StartTime <= order.OrderDate
&& order.FulfilledDate <= acct.EndTime)

How to Compare Values in Array

If you have a string of "1,2,3,1,5,7" you can put this in an array or hash table or whatever is deemed best.
How do you determine that all values are the same? In the above example it would fail but if you had "1,1,1" that would be true.
This can be done nicely using lambda expressions.
For an array, named arr:
var allSame = Array.TrueForAll(arr, x => x == arr[0]);
For a list (List<T>), named lst:
var allSame = lst.TrueForAll(x => x == lst[0]);
And for an iterable (IEnumerable<T>), named col:
var first = col.First();
var allSame = col.All(x => x == first);
Note that these methods don't handle empty arrays/lists/iterables (col.First(), for instance, throws on an empty sequence). Such support would be trivial to add.
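One way the empty case could be handled for any IEnumerable<T> is sketched below. It walks the sequence once with an explicit enumerator and stops at the first mismatch. The names are hypothetical, and treating an empty sequence as "all the same" is an assumption you may want to reverse.

```csharp
using System.Collections.Generic;

public static class SequenceChecks
{
    // Returns true when every element equals the first one.
    // Assumption: an empty sequence counts as "all the same" (returns true).
    public static bool AllSame<T>(IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) return true;   // empty: nothing can differ
            T first = e.Current;
            while (e.MoveNext())
            {
                // stop as soon as one element differs from the first
                if (!comparer.Equals(first, e.Current)) return false;
            }
            return true;
        }
    }
}
```

With the strings from the question, SequenceChecks.AllSame("1,1,1".Split(',')) is true and SequenceChecks.AllSame("1,2,3,1,5,7".Split(',')) is false.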
Iterate through each value, store the first value in a variable and compare the rest of the array to that variable. The instant one fails, you know all the values are not the same.
How about something like...
string numArray = "1,1,1,1,1";
return numArray.Split(',').Distinct().Count() <= 1;
I think using List<T>.TrueForAll would be a slick approach.
http://msdn.microsoft.com/en-us/library/kdxe4x4w.aspx
Not as efficient as a simple loop (as it always processes all items even if the result could be determined sooner), but:
if (new HashSet<string>(numbers.Split(',')).Count == 1) ...

Fastest way to compare two lists

I have a List (Foo) and I want to see if it's equal to another List (foo). What is the fastest way ?
From 3.5 onwards you may use a LINQ function for this:
List<string> l1 = new List<string> {"Hello", "World","How","Are","You"};
List<string> l2 = new List<string> {"Hello","World","How","Are","You"};
Console.WriteLine(l1.SequenceEqual(l2));
There is also an overload that lets you provide your own comparer.
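As a small example of the comparer overload, the sketch below compares two string sequences while ignoring case, using the built-in StringComparer.OrdinalIgnoreCase. The wrapper class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListEquality
{
    // Order-sensitive comparison of two string sequences that ignores case.
    public static bool EqualIgnoringCase(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.SequenceEqual(second, StringComparer.OrdinalIgnoreCase);
    }
}
```

The comparison is still positional: the same words in a different order are not considered equal.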
Here are the steps I would do:
Do an object.ReferenceEquals() if true, then return true.
Check the count, if not the same, return false.
Compare the elements one by one.
Here are some suggestions for the method:
Base the implementation on ICollection. This gives you the count, but doesn't restrict to specific collection type or contained type.
You can implement the method as an extension method to ICollection.
You will need to use the .Equals() for comparing the elements of the list.
Something like this:
public static bool CompareLists(List<int> l1, List<int> l2)
{
    if (l1 == l2) return true;
    if (l1.Count != l2.Count) return false;
    for (int i = 0; i < l1.Count; i++)
        if (l1[i] != l2[i]) return false;
    return true;
}
Some additional error checking (e.g. null-checks) might be required.
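Putting the suggestions together (reference check, count check, element-by-element comparison with Equals(), written as an extension method on ICollection), a generic version might look roughly like this. The names are hypothetical, and treating two null collections as equal is an assumption.

```csharp
using System.Collections.Generic;

public static class CollectionExtensions
{
    // Order-sensitive content comparison for any ICollection<T>.
    public static bool ContentEquals<T>(this ICollection<T> first, ICollection<T> second)
    {
        if (object.ReferenceEquals(first, second)) return true;   // same instance, or both null
        if (first == null || second == null) return false;
        if (first.Count != second.Count) return false;

        var comparer = EqualityComparer<T>.Default;                // falls back to Equals()
        using (var e1 = first.GetEnumerator())
        using (var e2 = second.GetEnumerator())
        {
            while (e1.MoveNext() && e2.MoveNext())
            {
                if (!comparer.Equals(e1.Current, e2.Current)) return false;
            }
        }
        return true;
    }
}
```

It is called as l1.ContentEquals(l2). For collections without a defined order (a HashSet<T>, for instance) the enumeration order is not guaranteed, so this is only meaningful for ordered collections such as lists and arrays.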
Something like this, maybe, using a match delegate:
public static bool CompareList<T>(IList<T> obj1, IList<T> obj2, Func<T, T, bool> match)
{
    if (obj1.Count != obj2.Count) return false;
    for (int i = 0; i < obj1.Count; i++)
    {
        // match decides whether two elements count as equal;
        // null entries in obj2 are skipped
        if (obj2[i] != null && !match(obj1[i], obj2[i]))
            return false;
    }
    return true;
}
Assuming you mean that you want to know if the CONTENTS are equal (not just the list's object reference.)
If you will be doing the equality check much more often than inserts then you may find it more efficient to generate a hashcode each time a value is inserted and compare hashcodes when doing the equality check. Note that you should consider if order is important or just that the lists have identical contents in any order.
Unless you are comparing very often I think this would usually be a waste.
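If order turns out not to matter, one common way to compare contents is to count occurrences with a dictionary. This is only a sketch with hypothetical names; it assumes the elements are non-null and have sensible Equals()/GetHashCode() implementations.

```csharp
using System.Collections.Generic;

public static class UnorderedEquality
{
    // True when both collections hold the same elements with the same
    // multiplicities, regardless of order. Assumes non-null elements.
    public static bool SameContents<T>(ICollection<T> first, ICollection<T> second)
    {
        if (first.Count != second.Count) return false;

        // Count how often each element occurs in the first collection.
        var counts = new Dictionary<T, int>();
        foreach (var item in first)
        {
            int n;
            counts.TryGetValue(item, out n);   // n stays 0 when the key is missing
            counts[item] = n + 1;
        }

        // Consume those counts with the second collection.
        foreach (var item in second)
        {
            int n;
            if (!counts.TryGetValue(item, out n) || n == 0) return false;
            counts[item] = n - 1;
        }
        return true;   // equal sizes and no shortfall means every count reached zero
    }
}
```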
One shortcut, that I didn't see mentioned, is that if you know how the lists were created, you may be able to join them into strings and compare directly.
For example...
In my case, I wanted to prompt the user for a list of words. I wanted to make sure that each word started with a letter, but after that, it could contain letters, numbers, or underscores. I'm particularly concerned that users will use dashes or start with numbers.
I use regular expressions (JavaScript in this case) to break it into 2 lists, and then join them back together and compare them as strings:
var testList = userInput.match(/[-|\w]+/g)
/* the above catches common errors:
   using a dash or starting with a numeric */
var listToUse = userInput.match(/[a-zA-Z]\w*/g)
if (listToUse.join(" ") != testList.join(" ")) {
    return "the lists don't match"
}
Since I knew that neither list would contain spaces, and that the lists only contained simple strings, I could join them together with a space, and compare them.
