Compare two List<string[]> objects in a C# unit test

I'm trying to create a Unit Test that compares two lists of string arrays.
I tried creating two of the exact same List<string[]> objects, but when I use CollectionAssert.AreEqual(expected, actual);, the test fails:
[TestMethod]
public void TestList()
{
List<string[]> expected = new List<string[]> {
new string[] { "John", "Smith", "200" },
new string[] { "John", "Doe", "-100" }
};
List<string[]> actual = new List<string[]> {
new string[] { "John", "Smith", "200" },
new string[] { "John", "Doe", "-100" }
};
CollectionAssert.AreEqual(expected, actual);
}
I've also tried Assert.IsTrue(expected.SequenceEqual(actual));, but that fails as well.
Both these methods work if I am comparing two Lists of strings or two arrays of strings, but they do not work when comparing two Lists of arrays of strings.
I'm assuming these methods are failing because they are comparing two Lists of object references instead of the array string values.
How can I compare the two List<string[]> objects and tell if they are really the same?

It is failing because the items in your list are objects (string[]), and since you did not specify how CollectionAssert.AreEqual should compare the elements of the two sequences, it falls back to the default behavior, which is to compare references. If you changed your lists to the following, for example, the test would pass, because both lists now reference the same arrays:
var first = new string[] { "John", "Smith", "200" };
var second = new string[] { "John", "Smith", "200" };
List<string[]> expected = new List<string[]> { first, second};
List<string[]> actual = new List<string[]> { first, second};
To avoid reference comparisons, you need to tell CollectionAssert.AreEqual how to compare the elements. You can do that by passing an IComparer when you call it (StructuralComparisons lives in the System.Collections namespace):
CollectionAssert.AreEqual(expected, actual, StructuralComparisons.StructuralComparer);
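To see what a structural comparison does outside a test framework, here is a minimal sketch. It uses StructuralComparisons.StructuralEqualityComparer, the equality counterpart of the comparer above; the StructuralDemo class and ListsMatch method are made-up names for illustration only.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class StructuralDemo
{
    // Returns true when both lists hold arrays with equal contents in the same order.
    public static bool ListsMatch(List<string[]> expected, List<string[]> actual)
    {
        if (expected.Count != actual.Count) return false;

        // Compares arrays element by element instead of by reference.
        IEqualityComparer comparer = StructuralComparisons.StructuralEqualityComparer;
        for (int i = 0; i < expected.Count; i++)
        {
            if (!comparer.Equals(expected[i], actual[i])) return false;
        }
        return true;
    }
}
```

Two separately constructed lists with the same names and numbers should compare as equal here, even though none of the inner arrays share a reference.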

CollectionAssert.AreEqual(expected, actual); fails, because it compares object references. expected and actual refer to different objects.
Assert.IsTrue(expected.SequenceEqual(actual)); fails for the same reason. This time the contents of expected and actual are compared, but the elements are themselves different array references.
Maybe try to flatten both sequences using SelectMany:
var expectedSequence = expected.SelectMany(x => x).ToList();
var actualSequence = actual.SelectMany(x => x).ToList();
CollectionAssert.AreEqual(expectedSequence, actualSequence);
As Enigmativity correctly noted in his comment, SelectMany may give a positive result when the number of arrays and/or the number of elements in them differ, as long as the flattened lists happen to contain the same elements in the same order. It is safe only when you always have the same number of arrays and the same number of elements in each array.
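The false positive is easy to reproduce. The sketch below (FlattenDemo and FlattenedEqual are illustrative names, not part of any library) compares only the flattened sequences:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FlattenDemo
{
    // Compares only the flattened element sequences, ignoring how they are grouped.
    public static bool FlattenedEqual(List<string[]> x, List<string[]> y)
    {
        return x.SelectMany(a => a).SequenceEqual(y.SelectMany(a => a));
    }
}
```

With x = { { "a", "b" }, { "c" } } and y = { { "a" }, { "b", "c" } }, FlattenedEqual returns true even though the two lists are grouped differently.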

The best solution would be to check both the items in each sub-collection as well as the number of items in each respective sub-collection.
Try with this:
bool equals = expected.Count == actual.Count &&
Enumerable.Range(0, expected.Count).All(i => expected[i].Length == actual[i].Length &&
expected[i].SequenceEqual(actual[i]));
Assert.IsTrue(equals);
This will check that:
Both the lists have the same length
Every pair of sub-collections in the two lists has the same length
The items in each pair of sub-collections are the same
Note: using SelectMany isn't a good idea, since it could create a false positive if you have the same items in your second list but spread out across different sub-collections. That is, it would consider two lists to be the same even if the second one had the same items all in a single sub-collection.
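Another way to package the same element-by-element check is a custom equality comparer, so that SequenceEqual can be used on the outer lists directly. This is a sketch under the assumption that order matters; StringArrayComparer is a hypothetical helper, not a framework type.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: treats two string arrays as equal when their contents match in order.
public sealed class StringArrayComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(string[] obj)
    {
        if (obj == null) return 0;
        // Combine element hashes so that equal arrays produce equal hash codes.
        int hash = 17;
        foreach (string s in obj)
        {
            hash = unchecked(hash * 31 + (s == null ? 0 : s.GetHashCode()));
        }
        return hash;
    }
}
```

It could then be used as Assert.IsTrue(expected.SequenceEqual(actual, new StringArrayComparer()));, which also fails when the outer lists have different lengths.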

Related

comparing two lists of string and if one of the item match do some processing

I have two lists of strings. I want to compare each element in one list with the other and, if at least one of them matches, do some processing; otherwise do nothing.
I don't know how to do this. I have the following lists, and the code I used was SequenceEqual, but my lead said it's wrong because it only checks whether the lists are equal. I couldn't disagree, and I want to achieve the functionality I described above. Please help. As you can see, order doesn't matter: here 123 is in both lists but at a different position, so it matches and should trigger the processing.
List<string> list1 = new List<string> () { "123", "234" };
List<string> list2 = new List<string> () { "333", "234" , "123"};
You can use the Any method for this:
var matchFound = list1.Any(x => list2.Contains(x));
Now you can put a conditional block on matchFound; if it is true, you can do whatever processing is required.
If you want a case-insensitive comparison, you will need to use String.Equals and specify that case does not matter when comparing.
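A minimal sketch of the case-insensitive variant follows. Note that it swaps in StringComparer.OrdinalIgnoreCase with the Contains overload that accepts a comparer, rather than calling String.Equals directly; MatchDemo and AnyMatchIgnoreCase are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MatchDemo
{
    // True when at least one string in first also appears in second, ignoring case.
    public static bool AnyMatchIgnoreCase(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Any(x => second.Contains(x, StringComparer.OrdinalIgnoreCase));
    }
}
```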
You can use Intersect to find common elements:
var intersecting = list1.Intersect(list2);
If you just want to know if there are common elements append .Any():
bool atLeastOneCommonElement = intersecting.Any();
If you want to process them:
foreach(var commonElement in intersecting)
{
// do something ...
}
You could check with Intersect and Any
var matchFound = list1.Intersect(list2).Any();
For example,
List<string> list1 = new List<string>{ "123", "234" };
List<string> list2 = new List<string>{ "333", "234" , "123"};
var result = list1.Intersect(list2).Any();
Output True
List<string> list3 = new List<string>{"5656","8989"};
result = list1.Intersect(list3).Any();
Output False
You need to take all the items that match in both lists and then run your code for each match, like this:
foreach (var item in list1.Where(x => list2.Contains(x)))
{
//do some processing here
Console.WriteLine($"Match found: {item}");
}
In the above code, the foreach body runs once for each item present in both lists.
Output (for list1 and list2 from the question):
Match found: 123
Match found: 234
Use LINQ to find the matches, and then check whether the result contains any elements:
var intersect = list1.Where(el1 => list2.Any(el2 => el2 == el1));
var isMatch = intersect.Any();
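For larger lists, the nested Contains or Any calls above scan the second list once per element. A hedged alternative, assuming default string equality is what you want, is to load one list into a HashSet<string> and ask whether it overlaps the other; OverlapDemo and HasCommonElement are made-up names.

```csharp
using System.Collections.Generic;

public static class OverlapDemo
{
    // True when the two sequences share at least one element.
    public static bool HasCommonElement(IEnumerable<string> list1, IEnumerable<string> list2)
    {
        var lookup = new HashSet<string>(list2);
        return lookup.Overlaps(list1);
    }
}
```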

Remove items from string array

I have two string arrays
string[] a = ...
string[] b = ...
I want to remove any items from a that also exist in b or return a new array with only those items that exist only in a.
So, for example, if
a={"a", "b", "c"};
and,
b={"b"}
then the result should be
{"a", "c"}
Is there a neat lambda expression or Linq or something I can use to do this?
Thanks,
Sachin
I believe Except will do what you want. Remember, Except, like most LINQ Extension methods, will not modify the existing collection. It will return a new collection.
string[] c = a.Except(b).ToArray();
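One detail worth knowing is that Except is a set operation, so repeated values in a appear only once in the result. The sketch below contrasts it with a plain filter that keeps every occurrence; ExceptDemo and its method names are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Set difference: duplicates in a collapse to a single entry.
    public static string[] SetDifference(string[] a, string[] b)
    {
        return a.Except(b).ToArray();
    }

    // Filter: keeps every occurrence in a whose value is not in b.
    public static string[] Filter(string[] a, string[] b)
    {
        var exclude = new HashSet<string>(b);
        return a.Where(x => !exclude.Contains(x)).ToArray();
    }
}
```

For a = { "a", "a", "b", "c" } and b = { "b" }, SetDifference gives { "a", "c" } while Filter gives { "a", "a", "c" }.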

List<List<int>> Remove() method

I'd like to use Remove() method on list of lists, but it's not working for me.
Simple example should say everything:
List<List<int>> list = new List<List<int>>();
list.Add(new List<int> { 0, 1, 2 });
list.Add(new List<int> { 1, 2 });
list.Add(new List<int> { 4 });
list.Add(new List<int> { 0, 1, });
list.Remove(new List<int> { 1, 2 });
If I use RemoveAt(1) it works fine but Remove() not.
It is obviously the same reason that this code returns false:
List<int> l1 = new List<int>();
List<int> l2 = new List<int>();
l1.Add(1);
l2.Add(1);
bool b1 = l1 == l2; // returns False
bool b2 = l1.Equals(l2); // returns False too
So it seems to me that I cannot simply compare two lists or even arrays. I can use loops instead of Remove(), but there must be easier way.
Thanks in advance.
The problem is that List<T> doesn't override Equals and GetHashCode, which is what List<T> will use when trying to find an item. (In fact, it will use the default equality comparer, which means it'll use the IEquatable<T> implementation if the object implements it, and fall back to object.Equals/GetHashCode if necessary). Equals will return false as you're trying to remove a different object, and the default implementation is to just compare references.
Basically you'd have to write a method to compare two lists for equality, and use that to find the index of the entry you want to remove. Then you'd remove by index (using RemoveAt). EDIT: As noted, Enumerable.SequenceEqual can be used to compare lists. This isn't as efficient as it might be, due to not initially checking whether the counts are equal when they can be easily computed. Also, if you only need to compare List<int> values, you can avoid the virtual method call to an equality comparer.
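The find-the-index-then-RemoveAt idea might look like the following sketch, which uses SequenceEqual for the comparison. NestedListDemo and RemoveFirstMatch are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NestedListDemo
{
    // Removes the first inner list whose contents equal target.
    // Returns true if an entry was removed.
    public static bool RemoveFirstMatch(List<List<int>> lists, List<int> target)
    {
        int index = lists.FindIndex(inner => inner.SequenceEqual(target));
        if (index < 0) return false;
        lists.RemoveAt(index);
        return true;
    }
}
```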
Another alternative is to avoid using a List<List<int>> in the first place - use a List<SomeCustomType> where SomeCustomType includes a List<int>. You can then implement IEquatable<T> in that type. Note that this may well also allow you to encapsulate appropriate logic in the custom type too. I often find that by the time you've got "nested" collection types, a custom type encapsulates the meaning of the inner collection more effectively.
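A rough sketch of such a custom type is shown below. IntSequence is an invented name standing in for SomeCustomType, and it assumes that order-sensitive equality is the meaning you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper type; equality is defined by the contained values, in order.
public sealed class IntSequence : IEquatable<IntSequence>
{
    private readonly List<int> _items;

    public IntSequence(params int[] items)
    {
        _items = new List<int>(items);
    }

    public bool Equals(IntSequence other)
    {
        return other != null && _items.SequenceEqual(other._items);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as IntSequence);
    }

    public override int GetHashCode()
    {
        // Combine item values so that equal sequences share a hash code.
        int hash = 17;
        foreach (int item in _items) hash = unchecked(hash * 31 + item);
        return hash;
    }
}
```

With a List<IntSequence>, a call such as list.Remove(new IntSequence(1, 2)) finds the matching entry, because List<T>.Remove goes through the default equality comparer, which picks up the IEquatable<T> implementation.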
First approach:
List<int> listToRemove = new List<int> { 1, 2 };
list.RemoveAll(innerList => innerList.Except(listToRemove).Count() == 0);
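This also removes the list { 2, 1 }, and in fact any inner list whose elements all appear in listToRemove, such as { 1 }, { 2 }, or an empty list.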
Second approach (preferred):
List<int> listToRemove = new List<int> { 1, 2 };
list.RemoveAll(innerList => innerList.SequenceEqual(listToRemove));
This removes all lists that contain the same sequence as the provided list.
List equality is reference equality. It won't remove the list unless it has the same reference as a list in the outer list. You could create a new type that implements equality as set equality rather than reference equality (or do you care about order as well?). Then you could make lists of this type instead.
This simply won't work because you're trying to remove a brand new list (the new keyword kind of dictates such), not one of the ones you just put in there. For example, the following code creates two different lists, inasmuch as they are not the same list, however much they look the same:
var list0 = new List<int> { 1, 2 };
var list1 = new List<int> { 1, 2 };
However, the following creates one single list, but two references to the same list:
var list0 = new List<int> { 1, 2 };
var list1 = list0;
Therefore, you ought to keep a reference to the lists you put in there should you want to act upon them with Remove in the future, such that:
var list0 = new List<int> { 1, 2 };
listOfLists.Add(list0);
listOfLists.Remove(list0); // succeeds, because it is the same reference that was added
They are different objects. Try this:
List<int> MyList = new List<int> { 1, 2 };
List<List<int>> list = new List<List<int>>();
list.Add(new List<int> { 0, 1, 2 });
list.Add(MyList);
list.Add(new List<int> { 4 });
list.Add(new List<int> { 0, 1, });
list.Remove(MyList);
You need to specify the reference to the list you want to remove:
list.Remove(list[1]);
which, really, is the same as
list.RemoveAt(1);

Check if one collection of values contains another

Suppose I have two collections as follows:
Collection1:
"A1"
"A1"
"M1"
"M2"
Collection2:
"M2"
"M3"
"M1"
"A1"
"A1"
"A2"
All the values are strings. I want to know if all the elements in Collection1 are contained in Collection2, but I have no guarantee on the order, and a collection may have multiple entries with the same value. In this case, Collection2 does contain Collection1 because Collection2 has two A1's, M1 and M2. There's the obvious way: sorting both collections and popping off values as I find matches, but I was wondering if there's a faster, more efficient way to do this. Again, with the initial collections I have no guarantee on the order or on how many times a given value will appear.
EDIT: Changed set to collection just to clear up that these aren't sets as they can contain duplicate values
The most concise way I know of:
//determine if Set2 contains all of the elements in Set1
bool containsAll = Set1.All(s => Set2.Contains(s));
Yes, there is a faster way, provided you're not space-constrained. (See space/time tradeoff.)
The algorithm:
Just insert all the elements in Set2 into a hashtable (in .NET 3.5 and later, that's a HashSet<string>), then go through all the elements of Set1 and check whether each one is in the hashtable. This method is faster (Θ(m + n) time complexity), but uses O(n) space.
Alternatively, just say:
bool isSuperset = new HashSet<string>(set2).IsSupersetOf(set1);
Edit 1:
For those people concerned about the possibility of duplicates (and hence the misnomer "set"), the idea can easily be extended:
Just make a new Dictionary<string, int> representing the count of each word in the super-list (add one to the count each time you see an instance of an existing word, and add the word with a count of 1 if it's not in the dictionary), and then go through the sub-list and decrement the count each time. If every word exists in the dictionary and the count is never zero when you try to decrement it, then the subset is in fact a sub-list; otherwise, you had too many instances of a word (or it didn't exist at all), so it's not a real sub-list.
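The counting idea described above could be sketched like this. MultisetDemo, ContainsAll, and the parameter names are illustrative assumptions, and default string equality is assumed.

```csharp
using System.Collections.Generic;

public static class MultisetDemo
{
    // True when every value in subList occurs in superList at least as many times.
    public static bool ContainsAll(IEnumerable<string> superList, IEnumerable<string> subList)
    {
        // Count the occurrences of each value in the super-list.
        var counts = new Dictionary<string, int>();
        foreach (string s in superList)
        {
            counts.TryGetValue(s, out int n); // n is 0 when s is not present yet
            counts[s] = n + 1;
        }

        // Consume one occurrence for each value in the sub-list.
        foreach (string s in subList)
        {
            if (!counts.TryGetValue(s, out int n) || n == 0) return false;
            counts[s] = n - 1;
        }
        return true;
    }
}
```

Unlike the Remove-based approach, this leaves both input collections unmodified.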
Edit 2:
If the strings are very big and you're concerned about space efficiency, and an algorithm that works with (very) high probability works for you, then try storing a hash of each string instead. It's technically not guaranteed to work, but the probability of it not working is pretty darn low.
The problem I see with the HashSet, Intersect, and other Set theory answers is that you do contain duplicates, and "A set is a collection that contains no duplicate elements". Here's a way to handle the duplicate cases.
var list1 = new List<string> { "A1", "A1", "M1", "M2" };
var list2 = new List<string> { "M2", "M3", "M1", "A1", "A1", "A2" };
// Remove returns true if it was able to remove it, and it won't be there to be matched again if there's a duplicate in list1
bool areAllPresent = list1.All(i => list2.Remove(i));
EDIT: I renamed from Set1 and Set2 to list1 and list2 to appease Mehrdad.
EDIT 2: The comment implies it, but I wanted to explicitly state that this does alter list2. Only do it this way if you're using it as a comparison or control but don't need the contents afterwards.
Check out linq...
string[] set1 = {"A1", "A1", "M1", "M2" };
string[] set2 = { "M2", "M3", "M1", "A1", "A1", "A2" };
var matching = set1.Intersect(set2);
foreach (string x in matching)
{
Console.WriteLine(x);
}
Similar one
string[] set1 = new string[] { "a1","a2","a3","a4","a5","aa","ab" };
string[] set2 = new string[] {"m1","m2","a4","a6","a1" };
var a = set1.Select(set => set2.Contains(set));

How do you concatenate Lists in C#?

If I have:
List<string> myList1;
List<string> myList2;
myList1 = getMeAList();
// Checked myList1, it contains 4 strings
myList2 = getMeAnotherList();
// Checked myList2, it contains 6 strings
myList1.Concat(myList2);
// Checked mylist1, it contains 4 strings... why?
I ran code similar to this in Visual Studio 2008 and set break points after each execution. After myList1 = getMeAList();, myList1 contains four strings, and I pressed the plus button to make sure they weren't all nulls.
After myList2 = getMeAnotherList();, myList2 contains six strings, and I checked to make sure they weren't null... After myList1.Concat(myList2); myList1 contained only four strings. Why is that?
Concat returns a new sequence without modifying the original list. Try myList1.AddRange(myList2).
Try this:
myList1 = myList1.Concat(myList2).ToList();
Concat returns an IEnumerable<T> that is the two lists put together; it doesn't modify either existing list. Also, since it returns an IEnumerable<T>, if you want to assign the result to a variable of type List<T>, you'll have to call ToList() on it.
targetList = list1.Concat(list2).ToList();
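The difference between the two options can be summarized in a short sketch (ConcatDemo and its method names are made up for illustration): Concat followed by ToList produces a new list, while AddRange changes the first list in place.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ConcatDemo
{
    // Returns a new list; neither input list is modified.
    public static List<string> Combined(List<string> first, List<string> second)
    {
        return first.Concat(second).ToList();
    }

    // Appends the elements of second to first, modifying first.
    public static void AppendInPlace(List<string> first, List<string> second)
    {
        first.AddRange(second);
    }
}
```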
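I think it works as designed. As previously said, Concat returns a new sequence, and converting the result to a List does the job perfectly.
It is also worth noting that the Concat call itself runs in constant time and constant memory, because it only builds a deferred query; the elements are copied only when the result is enumerated, for example by ToList().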
For example, the following code
var list1 = new List<long>();
var list2 = new List<long>();
long boundary = 60000000;
for (long i = 0; i < boundary; i++)
{
list1.Add(i);
list2.Add(i);
}
var listConcat = list1.Concat(list2);
var list = listConcat.ToList();
list1.AddRange(list2);
gives the following timing/memory metrics:
After lists filled mem used: 1048730 KB
concat two enumerables: 00:00:00.0023309 mem used: 1048730 KB
convert concat to list: 00:00:03.7430633 mem used: 2097307 KB
list1.AddRange(list2) : 00:00:00.8439870 mem used: 2621595 KB
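The deferred behavior behind these numbers can also be observed on a small scale. In the sketch below (DeferredDemo is an invented name), a source list is changed after Concat is called but before the result is enumerated, and the change still shows up:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Builds the Concat query first, then mutates a source list before enumerating.
    public static List<int> ConcatThenMutate()
    {
        var list1 = new List<int> { 1, 2 };
        var list2 = new List<int> { 3 };
        IEnumerable<int> query = list1.Concat(list2); // nothing is copied yet
        list1.Add(99);                                 // change made after the Concat call
        return query.ToList();                         // the sources are read here
    }
}
```

The returned list is 1, 2, 99, 3, which shows that Concat only recorded the two sources and read them when ToList ran.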
I know this is old but I came upon this post quickly thinking Concat would be my answer. Union worked great for me. Note, it returns only unique values but knowing that I was getting unique values anyway this solution worked for me.
namespace TestProject
{
public partial class Form1 :Form
{
public Form1()
{
InitializeComponent();
List<string> FirstList = new List<string>();
FirstList.Add("1234");
FirstList.Add("4567");
// In my code, I know I would not have this here but I put it in as a demonstration that it will not be in the secondList twice
FirstList.Add("Three");
List<string> secondList = GetList(FirstList);
foreach (string item in secondList)
Console.WriteLine(item);
}
private List<String> GetList(List<string> SortBy)
{
List<string> list = new List<string>();
list.Add("One");
list.Add("Two");
list.Add("Three");
list = list.Union(SortBy).ToList();
return list;
}
}
}
The output is:
One
Two
Three
1234
4567
Take a look at my implementation. It's safe from null lists.
IList<string> all= new List<string>();
if (letterForm.SecretaryPhone!=null)// first list may be null
all=all.Concat(letterForm.SecretaryPhone).ToList();
if (letterForm.EmployeePhone != null)// second list may be null
all= all.Concat(letterForm.EmployeePhone).ToList();
if (letterForm.DepartmentManagerPhone != null) // this is not a list (it's just a string variable), so wrap it in an array and then concat it
all = all.Concat(new []{letterForm.DepartmentManagerPhone}).ToList();
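The same null-tolerant idea can be written more compactly by skipping null sources before flattening. This is only a sketch; SafeConcat and ConcatAll are hypothetical names, and it assumes at least two arguments are passed so that a lone null is not taken as the params array itself.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SafeConcat
{
    // Concatenates all non-null sources in the order given.
    public static List<string> ConcatAll(params IEnumerable<string>[] sources)
    {
        return sources.Where(s => s != null).SelectMany(s => s).ToList();
    }
}
```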
