I want to compare 2 hash sets and take out the differences - c#

I have 2 hash sets like this.
Hash_1 = {1, 2, 3, 4, 5}
Hash_2 = {4, 5, 6, 7, 8}
I am using C#
I want to compare those two sets and want to get the output like
Hash_3 = {1, 2, 3, 6, 7, 8}

What you want is: Hash_1 without Hash_2, and Hash_2 without Hash_1, then combined into one set.
So let's start with Hash_1 without Hash_2:
var modified1 = Hash_1.Except(Hash_2);
and then Hash_2 without Hash_1:
var modified2 = Hash_2.Except(Hash_1);
And now let's combine them:
var result = modified1.Concat(modified2);
Or in short:
var result = Hash_1.Except(Hash_2).Concat(Hash_2.Except(Hash_1));
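A minimal sketch of the approach above as a complete program, assuming the two sets from the question. The names hash1, hash2 and hash3 are placeholders, and wrapping the result in a new HashSet<int> is an addition here, since Except and Concat return a lazy IEnumerable<int> rather than a set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var hash1 = new HashSet<int> { 1, 2, 3, 4, 5 };
var hash2 = new HashSet<int> { 4, 5, 6, 7, 8 };

// Elements only in hash1, followed by elements only in hash2.
// The two halves cannot overlap, so Concat introduces no duplicates.
var hash3 = new HashSet<int>(hash1.Except(hash2).Concat(hash2.Except(hash1)));

// Sort before printing because HashSet<T> does not guarantee an order.
Console.WriteLine(string.Join(", ", hash3.OrderBy(x => x))); // 1, 2, 3, 6, 7, 8
```

Neither input set is modified by this version, which may matter if they are still needed afterwards.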

Or you could use SymmetricExceptWith
Modifies the current HashSet<T> object to contain only elements that
are present either in that object or in the specified collection, but
not both.
var h1 = new HashSet<int>() { 1, 2, 3, 4, 5 };
var h2 = new HashSet<int>() { 4, 5, 6, 7, 8 };
h1.SymmetricExceptWith(h2);
Console.WriteLine(string.Join(",", h1));
Output
1,2,3,7,6,8
Internally it just uses
foreach (T item in other)
{
    if (!Remove(item))
    {
        AddIfNotPresent(item);
    }
}
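Because SymmetricExceptWith changes the set it is called on, a copy is needed when the original sets must stay intact. The following sketch assumes that requirement; the name difference is made up for illustration.

```csharp
using System;
using System.Collections.Generic;

var h1 = new HashSet<int> { 1, 2, 3, 4, 5 };
var h2 = new HashSet<int> { 4, 5, 6, 7, 8 };

// Copy h1 first so that the in-place operation leaves h1 itself unchanged.
var difference = new HashSet<int>(h1);
difference.SymmetricExceptWith(h2);

// Compare set contents rather than relying on enumeration order.
Console.WriteLine(difference.SetEquals(new[] { 1, 2, 3, 6, 7, 8 })); // True
Console.WriteLine(h1.Count); // 5
```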


Remove elements from List A that are in List B while keeping any duplicates in List A [duplicate]

I have a List A of strings that I want to trim of all elements that also appear in List B, while keeping the duplicate values in List A.
Such that with an input like:
List A: [1, 2, 2, 2, 3, 3, 4, 5, 6, 7, 7, 7]
List B: [2, 6, 8, 9, 10]
I am hoping to get an output like:
List C: [1, 3, 3, 4, 5, 7, 7, 7]
I originally thought this could be accomplished using ListA.Except(ListB), but that function leaves only one element of a duplicate value.
In the program, List B is much bigger than the example given and there are multiple instances of List A to go through, so I'd like to avoid nested for loops. I don't necessarily care about keeping the original order of List A either, since the output of this will be the input of a frequency dictionary.
Am I overlooking something? Is there a faster option than using nested for loops?
You can use List<T>.RemoveAll()
var a = new[] { 1, 2, 2, 2, 3, 3, 4, 5, 6, 7, 7, 7 }.ToList();
var b = new[] { 2, 6, 8, 9, 10 }.ToList();
var c = a.ToList(); // make a copy of 'a'
c.RemoveAll(i => b.Contains(i));
You can just use a Where() clause with a Contains() in it. To avoid O(n²) complexity (which is really what you're trying to avoid when you say "nested for loops"), you can create a HashSet out of List B.
var setB = listB.ToHashSet();
var aMinusB = listA.Where(item => !setB.Contains(item)).ToList();
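As a rough end-to-end sketch of the HashSet approach, here it is applied to the lists from the question, followed by the frequency dictionary the asker mentions as the next step. The names listA, listB and frequencies are illustrative, and integers are used because the example data is numeric.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listA = new List<int> { 1, 2, 2, 2, 3, 3, 4, 5, 6, 7, 7, 7 };
var listB = new List<int> { 2, 6, 8, 9, 10 };

// Build the lookup once; each Contains call is then O(1) on average.
var setB = new HashSet<int>(listB);

// Where keeps every surviving element, including duplicates.
List<int> aMinusB = listA.Where(item => !setB.Contains(item)).ToList();
Console.WriteLine(string.Join(", ", aMinusB)); // 1, 3, 3, 4, 5, 7, 7, 7

// Count the occurrences of each remaining value.
Dictionary<int, int> frequencies = aMinusB
    .GroupBy(item => item)
    .ToDictionary(group => group.Key, group => group.Count());
Console.WriteLine(frequencies[7]); // 3
```

When several instances of List A are processed against the same List B, setB only needs to be built once and can be reused.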
You can use LINQ for this with Any() (note that this still scans listB for every element of listA):
var result = listA.Where(el1 => !listB.Any(el2 => el2 == el1)).ToList();

How to use elements which are not in the second List<>()

I have two lists (List<>). The first contains some elements, and I want to use LINQ to get the elements of the second list that are not in the first. For example:
List one has: 1, 2
List two has: 1, 2, 3, 4, 5, 6
So my output should be: 3, 4, 5, 6.
You can use Except to subtract the first list from the second one.
var list3 = list2.Except(list1).ToList();
Use the Except method:
List<int> a = new List<int> { 1, 2 };
List<int> b = new List<int> { 1, 2, 3, 4, 5 };
var result = b.Except(a).ToList();
Yes, you could do that with a foreach loop, but you shouldn't do it this way. What you should do is read about IEquatable and override the Equals method. This will let you control which property is used to exclude the elements.
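To illustrate the IEquatable suggestion, here is a hedged sketch with a hypothetical Person class whose equality is based only on Id (the class and its properties are invented for this example). Except compares elements through a hash-based set, so GetHashCode has to be overridden consistently with Equals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var first = new List<Person> { new Person(1, "Ann"), new Person(2, "Bob") };
var second = new List<Person>
{
    new Person(1, "Ann B."), new Person(2, "Bob"), new Person(3, "Cid")
};

// Only Id takes part in the comparison, so "Ann B." counts as already present.
List<Person> onlyInSecond = second.Except(first).ToList();
Console.WriteLine(string.Join(", ", onlyInSecond.Select(p => p.Name))); // Cid

// Hypothetical type: two Person objects are equal when their Ids match.
sealed class Person : IEquatable<Person>
{
    public Person(int id, string name) { Id = id; Name = name; }

    public int Id { get; }
    public string Name { get; }

    public bool Equals(Person other) => other != null && Id == other.Id;
    public override bool Equals(object obj) => Equals(obj as Person);
    public override int GetHashCode() => Id.GetHashCode();
}
```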

Why can't I use array initialisation syntax separate from array declaration?

I can do this with an integer:
int a;
a = 5;
But I can't do this with an integer array:
int[] a;
a = { 1, 2, 3, 4, 5 };
Why not?
To clarify, I am not looking for the correct syntax (I can look that up); I know that this works:
int[] a = { 1, 2, 3, 4, 5 };
Which would be the equivalent of:
int a = 5;
What I am trying to understand is, why does the code fail for arrays? What is the reason behind the code failing to be recognised as valid?
The reason there is a difference is that the folks at Microsoft decided to lighten the syntax when declaring and initializing the array in the same statement, but did not add the required syntax to allow you to assign a new array to it later.
This is why this works:
int[] a = { 1, 2, 3, 4, 5 };
but this does not:
int[] a;
a = { 1, 2, 3, 4, 5 };
Could they have added the syntax to allow this? Sure, but they didn't. Most likely they felt that this use-case is so seldom used that it didn't warrant prioritizing over other features. All new features start with minus 100 points and this probably just didn't rank high enough on the priority list.
Note that { 1, 2, 3, 4, 5 } by itself has no meaning; it can only appear in two places:
As part of an array variable declaration:
int[] a = { 1, 2, 3, 4, 5 };
As part of an array creation expression:
new int[] { 1, 2, 3, 4, 5 }
The number 5, on the other hand, has a meaning everywhere it appears in C#, which is why this works:
int a;
a = 5;
So this is just special syntax the designers of C# decided to support, nothing more.
This syntax is documented in the C# specification, section 12.6 Array Initializers.
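The two permitted positions, together with the separate declaration and assignment from the question, can be put side by side in a small sketch. The variable names are arbitrary, and the invalid form is left in a comment because it does not compile.

```csharp
using System;

// 1. Array initializer as part of a variable declaration.
int[] declared = { 1, 2, 3, 4, 5 };

// 2. Array initializer as part of an array creation expression,
//    which is an ordinary expression and can be assigned later.
int[] assignedLater;
assignedLater = new int[] { 1, 2, 3, 4, 5 };

// The element type may also be inferred from the values.
int[] inferred;
inferred = new[] { 1, 2, 3, 4, 5 };

// Not valid C#: a bare initializer is not an expression.
// assignedLater = { 1, 2, 3, 4, 5 };

Console.WriteLine(declared.Length + assignedLater.Length + inferred.Length); // 15
```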
The reason your array example doesn't work is because of the difference between value and reference types. An int is a value type. It is a single location in memory whose value can be changed.
Your integer array is a reference type. The variable does not hold the elements themselves; it holds a reference to the place in memory where the array is stored.
In this first line, you declare a without assigning an array to it (a field would default to null; a local variable is simply unassigned).
int[] a;
In the next line, if you want to change the value of the array, you need to assign it to a new array.
a = new[] {1, 2, 3, 4, 5};
That is why you need the new[] before the list of values within the array if you strongly type your declaration.
int[] a = {1, 2, 3, 4, 5}; // This will work.
var a = {1, 2, 3, 4, 5}; // This will not.
However, as many of the other answers have said, if you declare it in a single line, then you do not need the new[]. If you separate the declaration and initialization, then you are required to use new[].
The {} syntax is only available when initializing an array as part of its declaration; it cannot be used in a later assignment.
To initialize an array, write it like this:
int[] a = { 1, 2, 3, 4, 5 };
Other ways to initialize a single-dimensional array:
int[] a = new int[] { 1, 2, 3, 4, 5 };
int[] a = new int[5] { 1, 2, 3, 4, 5 };
Have a look at this: different ways to initialize different kinds of arrays

Is there a way to organise an IEnumerable into batches in column-major format using Linq?

In several of my most recent projects, I've found the need to divide a single collection up into m batches of n elements.
There is an answer to this question that suggests using morelinq's Batch method. That is my preferred solution (no need to re-invent the wheel and all that).
While the Batch method divides the input up in row-major format, would it also be possible to write a similar extension that divides the input up in column-major format? That is, given the input
{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }
you would call ColumnBatch(4), generating an output of
{
{ 1, 4, 7, 10 },
{ 2, 5, 8, 11 },
{ 3, 6, 9, 12 }
}
Does morelinq already offer something like this?
UPDATE: I'm going to change the convention slightly. Instead of using an input of 4, I'll change it to 3 (the number of batches rather than the batch size).
The final extension method will have the signature
public static IEnumerable<IEnumerable<T>> ToColumns<T>(this IEnumerable<T> source, int numberOfColumns)
which should be clear to whoever is implementing the method, and does not require multiple enumerations.
int[] arr = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
int i=0;
var result = arr.GroupBy(x => i++ % 3).Select(g => g.ToList()).ToList();
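As an alternative to the GroupBy one-liner, here is a sketch of ToColumns that avoids the side-effecting counter and enumerates the source exactly once. It is written as a local function rather than the extension method with the signature given in the question, purely so the sample stays a single top-level program; it also returns buffered lists, which is an assumption about what the caller needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Deals the items out to the columns round-robin, enumerating source only once.
static List<List<T>> ToColumns<T>(IEnumerable<T> source, int numberOfColumns)
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (numberOfColumns < 1) throw new ArgumentOutOfRangeException(nameof(numberOfColumns));

    var columns = new List<List<T>>();
    for (int c = 0; c < numberOfColumns; c++) columns.Add(new List<T>());

    int index = 0;
    foreach (T item in source)
    {
        columns[index % numberOfColumns].Add(item);
        index++;
    }
    return columns;
}

var input = Enumerable.Range(1, 12);
foreach (var column in ToColumns(input, 3))
{
    Console.WriteLine(string.Join(", ", column));
}
// 1, 4, 7, 10
// 2, 5, 8, 11
// 3, 6, 9, 12
```

Unlike the GroupBy version, this sketch always returns numberOfColumns lists, so some of them are empty when the input has fewer items than columns.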

Adding items to a generic list (novice)

I'm somewhat embarrassed to even ask this, but I know there is a better way to do this; I just don't know how.
List<int> numbers = new List<int>(22);
numbers.Add(3);
numbers.Add(4);
numbers.Add(9);
numbers.Add(14);
numbers.Add(15);
//...
List<int> numbers = new List<int>(22) { 3, 4, 9, ..., 99 };
Shorter than that? Only if your numbers follow a pattern that can be expressed mathematically.
This is the collection initializer.
You can use a collection initializer:
List<int> numbers = new List<int>(22)
{
3, 4, 9,
14, // ...
};
As of C# 3.0, at least, you can use an initializer, like so:
List<int> numbers = new List<int>{ 3, 4, 9, ... , 99 };
(Specifying the initial capacity (22) isn't terribly necessary...)
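For the case mentioned in one of the answers, where the numbers follow a pattern, a sketch using Enumerable.Range might look like this. The multiples-of-three sequence and the extra values 20 and 21 are invented purely for illustration and are not the asker's data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Collection initializer: the compiler turns this into repeated Add calls.
var explicitNumbers = new List<int> { 3, 4, 9, 14, 15 };

// Generated sequence: the first 22 multiples of 3 (3, 6, ..., 66).
List<int> generated = Enumerable.Range(1, 22).Select(n => n * 3).ToList();

// AddRange appends several values to an existing list in one call.
explicitNumbers.AddRange(new[] { 20, 21 });

Console.WriteLine(explicitNumbers.Count); // 7
Console.WriteLine(generated.Count);       // 22
Console.WriteLine(generated[21]);         // 66
```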
