Using LINQ to remove duplicate content in list?


Suppose I have this list:
int[] list = { 1, 2, 3, 3, 1 };
What I would like is to remove duplicates that immediately follow the same number. So in this case I want to remove the second 3, but not the second 1.
New list should therefore be: {1, 2, 3, 1}
Another example is this list: {2, 7, 7, 7, 2, 6, 4} which will become {2, 7, 2, 6, 4}.
Can I do this with LINQ?

You could use Aggregate if you want an existing LINQ method, but that approach loses laziness. Alternatively, you can write your own extension method:
public static IEnumerable<T> RemoveConsecutiveDuplicates<T>(this IEnumerable<T> source, IEqualityComparer<T> comp = null)
{
comp = comp ?? EqualityComparer<T>.Default;
using (var e = source.GetEnumerator())
{
if (e.MoveNext())
{
T last = e.Current;
yield return e.Current;
while (e.MoveNext())
{
if (!comp.Equals(e.Current, last))
{
yield return e.Current;
last = e.Current;
}
}
}
}
}

If you insist on doing this with LINQ, you could use Aggregate:
var result = array.Aggregate(new List<int>(), (a, b) =>
{
if (!a.Any() || a.Last() != b)
a.Add(b);
return a;
});
But this isn't necessarily the most efficient solution because of Any and Last in each iteration. A simple foreach comparing the previous and current iteration value will perform much better.
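For comparison, here is a minimal sketch of the plain foreach approach mentioned above. The class and method names (ConsecutiveDuplicates, RemoveAdjacent) are made up for illustration, and it assumes int input as in the question:

```csharp
using System.Collections.Generic;

public static class ConsecutiveDuplicates
{
    // Hypothetical helper: copies items into a new list, skipping any
    // item that equals the one immediately before it.
    public static List<int> RemoveAdjacent(IEnumerable<int> source)
    {
        var result = new List<int>();
        bool hasPrevious = false; // false until the first item has been seen
        int previous = 0;
        foreach (int current in source)
        {
            if (!hasPrevious || current != previous)
            {
                result.Add(current);
            }
            previous = current;
            hasPrevious = true;
        }
        return result;
    }
}
```

Called with { 2, 7, 7, 7, 2, 6, 4 } this should give { 2, 7, 2, 6, 4 }, and an empty input gives an empty list.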

You can use Pairwise from MoreLINQ like this:
var result =
new[] {list[0]}
.Concat(
list
.Pairwise((x, y) => new {Item = y, Same = x == y})
.Where(x => !x.Same)
.Select(x => x.Item))
.ToArray();
Pairwise allows you to get a sequence that results from applying a function to each item in the original sequence along with the item before it (except for the first item).
What I am doing here is, for each item (except the first item), getting the item itself and a boolean (Same) indicating whether this item equals the item before it. Then I filter the sequence to keep only the items that do not equal the item before them. Finally, I prepend the first item of the original list to the new sequence.
Note: don't forget to handle the case where list is empty, because list[0] throws on an empty array.
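One possible way to cover the empty-list case is sketched below. Note that this swaps MoreLINQ's Pairwise for the built-in Enumerable.Zip (pairing the list with itself shifted by one) and uses Take(1) instead of list[0]. The class and method names are hypothetical:

```csharp
using System.Linq;

public static class AdjacentFilter
{
    // Hypothetical helper: same idea as the Pairwise version, but built on
    // Zip so it needs no extra package and tolerates an empty array.
    public static int[] RemoveAdjacentDuplicates(int[] list)
    {
        return list
            .Take(1) // first item, or nothing if the array is empty
            .Concat(
                list.Zip(list.Skip(1), (prev, cur) => new { Item = cur, Same = prev == cur })
                    .Where(x => !x.Same)
                    .Select(x => x.Item))
            .ToArray();
    }
}
```

For { 1, 2, 3, 3, 1 } the pairs are (1,2), (2,3), (3,3), (3,1), so the result should be { 1, 2, 3, 1 }.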

You could do the following (without a LINQ query):
var collection = new [] { 2, 7, 7, 7, 2, 6, 4 }.ToList();
for (int i = 0; i < collection.Count - 1; i++)
{
if (collection[i] == collection[i + 1])
{
collection.RemoveAt(i);
i--;
}
}
Another yield based solution would be this.
public static IEnumerable<T> RemoveConsecutiveDuplicates<T>(this IEnumerable<T> collection)
{
using (var enumerator = collection.GetEnumerator())
{
bool wasNotLast = enumerator.MoveNext(),
hasEntry = wasNotLast;
T last = hasEntry ? enumerator.Current : default(T);
while(wasNotLast)
{
if (!EqualityComparer<T>.Default.Equals(last, enumerator.Current)) // null-safe, unlike last.Equals(...)
yield return last;
last = enumerator.Current;
wasNotLast = enumerator.MoveNext();
}
if (hasEntry)
yield return last;
}
}

Related

How to filter members of more than one list with LINQ? [duplicate]

How do I select the unique elements from the list {0, 1, 2, 2, 2, 3, 4, 4, 5} so that I get {0, 1, 3, 5}, effectively removing all instances of the repeated elements {2, 4}?

var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
var uniqueNumbers =
from n in numbers
group n by n into nGroup
where nGroup.Count() == 1
select nGroup.Key;
// { 0, 1, 3, 5 }

var nums = new int[] { 0...4,4,5 };
var distinct = nums.Distinct();
Make sure you're using LINQ and .NET Framework 3.5 or later.

With lambda..
var all = new[] {0,1,1,2,3,4,4,4,5,6,7,8,8}.ToList();
var unique = all.GroupBy(i => i).Where(i => i.Count() == 1).Select(i=>i.Key);

C# 2.0 solution:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, int> counts = new Dictionary<T, int>();
foreach (T item in things)
{
int count;
if (counts.TryGetValue(item, out count))
counts[item] = ++count;
else
counts.Add(item, 1);
}
foreach (KeyValuePair<T, int> kvp in counts)
{
if (kvp.Value == 1)
yield return kvp.Key;
}
}

Here is another way that works if you have complex type objects in your List and want to get the unique values of a property:
var uniqueValues= myItems.Select(k => k.MyProperty)
.GroupBy(g => g)
.Where(c => c.Count() == 1)
.Select(k => k.Key)
.ToList();
Or to get distinct values:
var distinctValues = myItems.Select(p => p.MyProperty)
.Distinct()
.ToList();
If your property is also a complex type, you can pass a custom comparer to Distinct(), such as Distinct(new OrderComparer()), where OrderComparer could look like:
public class OrderComparer : IEqualityComparer<Order>
{
public bool Equals(Order o1, Order o2)
{
return o1.OrderID == o2.OrderID;
}
public int GetHashCode(Order obj)
{
return obj.OrderID.GetHashCode();
}
}
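To show how such a comparer might be wired up, here is a small sketch. The Order class is hypothetical (the original answer never shows it), as is the DistinctOrders helper; the comparer is the one above with null checks added:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Order type; only OrderID matters for this sketch.
public class Order
{
    public int OrderID { get; set; }
    public string Customer { get; set; } = "";
}

public class OrderComparer : IEqualityComparer<Order>
{
    public bool Equals(Order o1, Order o2)
    {
        if (ReferenceEquals(o1, o2)) return true;
        if (o1 == null || o2 == null) return false;
        return o1.OrderID == o2.OrderID;
    }

    public int GetHashCode(Order obj)
    {
        return obj.OrderID.GetHashCode();
    }
}

public static class OrderQueries
{
    // Distinct needs an instance of the comparer, not the type name.
    public static List<Order> DistinctOrders(IEnumerable<Order> orders)
    {
        return orders.Distinct(new OrderComparer()).ToList();
    }
}
```

Two orders with the same OrderID but different Customer values count as duplicates here, so only one of them survives.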

If LINQ isn't available to you because you have to support legacy code that can't be upgraded, then declare a Dictionary<int, int>, where the first int is the number and the second int is the number of occurrences. Loop through your List, loading up your Dictionary. When you're done, loop through your Dictionary, selecting only those elements where the number of occurrences is 1.

I believe Matt meant to say:
static IEnumerable<T> GetUniques<T>(IEnumerable<T> things)
{
Dictionary<T, bool> uniques = new Dictionary<T, bool>();
foreach (T item in things)
{
if (!(uniques.ContainsKey(item)))
{
uniques.Add(item, true);
}
}
return uniques.Keys;
}

There are many ways to skin a cat, but HashSet seems made for the task here.
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
HashSet<int> r = new HashSet<int>(numbers);
foreach( int i in r ) {
Console.Write( "{0} ", i );
}
The output:
0 1 2 3 4 5

Here's a solution with no LINQ:
var numbers = new[] { 0, 1, 2, 2, 2, 3, 4, 4, 5 };
// This assumes the numbers are sorted
var noRepeats = new List<int>();
int temp = numbers[0]; // Or .First() if using IEnumerable
var count = 1;
for(int i = 1; i < numbers.Length; i++) // Or foreach (var n in numbers.Skip(1)) if using IEnumerable
{
if (numbers[i] == temp) count++;
else
{
if(count == 1) noRepeats.Add(temp);
temp = numbers[i];
count = 1;
}
}
if(count == 1) noRepeats.Add(temp);
Console.WriteLine($"[{string.Join(separator: ",", values: numbers)}] -> [{string.Join(separator: ",", values: noRepeats)}]");
This prints:
[0,1,2,2,2,3,4,4,5] -> [0,1,3,5]

In .NET 2.0, I'm pretty sure this solution works:
public IEnumerable<T> Distinct<T>(IEnumerable<T> source)
{
List<T> uniques = new List<T>();
foreach (T item in source)
{
if (!uniques.Contains(item)) uniques.Add(item);
}
return uniques;
}

Cluster items by condition and order using linq

I want to cluster some numbers by condition and their order in the list.
int delta = 3;
var numbers = new List<int>() { 2, 4, 9, 6, 3, 2, 7, 7, 4, 1, 9, 1, 2 };
var g = numbers.GroupBy(n => n <= delta);
This gives two groups based on the condition. What I want is:
g1: 2
g2: 4, 9, 6
g3: 3, 2
g4: 7, 7, 4
g5: 1
g6: 9
g7: 1, 2
Edit:
The condition is to group them based on a condition (here it is number <= delta), but every group should only contain numbers that are next to each other in the first list.

If I understand the logic, you want to create a new group whenever item n passes the condition but item n - 1 fails, or vice-versa.
Well, normally you wouldn't use LINQ for such a thing. You'd have to iterate over the list one item at a time and build the result set yourself. For example:
List<int> list = null;
var result = new List<IEnumerable<int>>();
bool? prev = null;
foreach (var n in numbers)
{
bool cur = n <= delta;
if (prev != cur)
{
list = new List<int>();
result.Add(list);
prev = cur;
}
list.Add(n);
}
But here is a workable solution in LINQ. It depends upon side effects, which you should normally avoid:
var prev = numbers.First() <= delta;
var counter = 0;
var result = numbers.GroupBy(n => (prev != (prev = n <= delta)) ? ++counter : counter)
.ToList();

So what we are conceptually doing here is going through the list and grouping while a condition is met. We can write a corresponding operation just for that without too much difficulty:
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
this IEnumerable<T> source, Func<T, T, bool> predicate)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
List<T> list = new List<T>() { iterator.Current };
T previous = iterator.Current;
while (iterator.MoveNext())
{
if (!predicate(previous, iterator.Current))
{
yield return list;
list = new List<T>();
}
list.Add(iterator.Current);
previous = iterator.Current;
}
yield return list;
}
}
We can now write:
var groups = numbers.GroupWhile((prev,next) =>
(prev <= delta) == (next <= delta));
Here the predicate says when to stay in the current group: as long as the previous item's comparison result is the same as the current item's. A new group starts as soon as the two results differ.

If you're a fan of fold you can write it this way:
var groups = numbers.Skip(1).Aggregate(new List<List<int>>(new[] { new List<int> { numbers[0] } }), (acc, b) =>
{
if ((acc.Last().LastOrDefault() <= delta) == (b <= delta))
{
acc.Last().Add(b);
}
else
{
acc.Add(new List<int>() { b });
}
return acc;
});
Here groups is of type List<List<int>>.

LINQ to count continuous repeated items (int) in an int array?

Here is a scenario for my question: I have an array, say:
{ 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 }
The result should be something like this (array element => its count):
4 => 1
1 => 2
3 => 2
2 => 1
5 => 1
3 => 1
2 => 2
I know this can be achieved with a for loop.
But I have googled a lot for a way to do this in fewer lines of code using LINQ, without success.

I believe the best way to do this is to create a "LINQ-like" extension method using an iterator block. This allows you to perform the calculation in a single pass over your data. Note that performance isn't important at all if you just want to perform the calculation on a small array of numbers. Of course, this is really your for loop in disguise.
static class Extensions {
public static IEnumerable<Tuple<T, Int32>> ToRunLengths<T>(this IEnumerable<T> source) {
using (var enumerator = source.GetEnumerator()) {
// Empty input leads to empty output.
if (!enumerator.MoveNext())
yield break;
// Retrieve first item of the sequence.
var currentValue = enumerator.Current;
var runLength = 1;
// Iterate the remaining items in the sequence.
while (enumerator.MoveNext()) {
var value = enumerator.Current;
if (!Equals(value, currentValue)) {
// A new run is starting. Return the previous run.
yield return Tuple.Create(currentValue, runLength);
currentValue = value;
runLength = 0;
}
runLength += 1;
}
// Return the last run.
yield return Tuple.Create(currentValue, runLength);
}
}
}
Note that the extension method is generic and you can use it on any type. Values are compared for equality using Object.Equals. However, if you want to you could pass an IEqualityComparer<T> to allow for customization of how values are compared.
You can use the method like this:
var numbers = new[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var runLengths = numbers.ToRunLengths();
For your input data, the result will be these tuples:
4 1
1 2
3 2
2 1
5 1
3 1
2 2
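As a sketch of the IEqualityComparer<T> customization mentioned above, the method could take an optional comparer and fall back to EqualityComparer<T>.Default. The optional parameter and the RunLengthExtensions class name are my additions, not part of the original answer:

```csharp
using System;
using System.Collections.Generic;

public static class RunLengthExtensions
{
    // Same single-pass idea as ToRunLengths above, with an optional comparer.
    public static IEnumerable<Tuple<T, int>> ToRunLengths<T>(
        this IEnumerable<T> source, IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        using (var enumerator = source.GetEnumerator())
        {
            // Empty input leads to empty output.
            if (!enumerator.MoveNext())
                yield break;
            var currentValue = enumerator.Current;
            var runLength = 1;
            while (enumerator.MoveNext())
            {
                var value = enumerator.Current;
                if (comparer.Equals(value, currentValue))
                {
                    runLength++;
                }
                else
                {
                    // A new run is starting. Return the previous run.
                    yield return Tuple.Create(currentValue, runLength);
                    currentValue = value;
                    runLength = 1;
                }
            }
            // Return the last run.
            yield return Tuple.Create(currentValue, runLength);
        }
    }
}
```

For example, new[] { "a", "A", "b" }.ToRunLengths(StringComparer.OrdinalIgnoreCase) should treat "a" and "A" as one run of length 2, with the first value of the run ("a") reported as the key.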

(Adding another answer to avoid the two upvotes for my deleted one counting towards this...)
I've had a little think about this (now I've understood the question) and it's really not clear how you'd do this nicely in LINQ. There are definitely ways that it could be done, potentially using Zip or Aggregate, but they'd be relatively unclear. Using foreach is pretty simple:
// Simplest way of building an empty list of an anonymous type...
var results = new[] { new { Value = 0, Count = 0 } }.Take(0).ToList();
// TODO: Handle empty arrays
int currentValue = array[0];
int currentCount = 1;
foreach (var value in array.Skip(1))
{
if (currentValue != value)
{
results.Add(new { Value = currentValue, Count = currentCount });
currentCount = 0;
currentValue = value;
}
currentCount++;
}
// Handle tail, which we won't have emitted yet
results.Add(new { Value = currentValue, Count = currentCount });

Here's a LINQ expression that works (edit: tightened up code just a little more):
var data = new int[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var result = data.Select ((item, index) =>
new
{
Key = item,
Count = (index == 0 || data.ElementAt(index - 1) != item)
? data.Skip(index).TakeWhile (d => d == item).Count ()
: -1
}
)
.Where (d => d.Count != -1);
And here's a proof that shows it working.

This not short enough?
public static IEnumerable<KeyValuePair<T, int>> Repeats<T>(
this IEnumerable<T> source)
{
int count = 0;
T lastItem = source.First();
foreach (var item in source)
{
if (Equals(item, lastItem))
{
count++;
}
else
{
yield return new KeyValuePair<T, int>(lastItem, count);
lastItem = item;
count = 1;
}
}
yield return new KeyValuePair<T, int>(lastItem, count);
}
I'll be interested to see a LINQ way.

I already wrote the method you need over there. Here's how to call it.
foreach(var g in numbers.GroupContiguous(i => i))
{
Console.WriteLine("{0} => {1}", g.Key, g.Count);
}

Behold (you can run this directly in LINQPad -- rle is where the magic happens):
var xs = new[] { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var rle = Enumerable.Range(0, xs.Length)
.Where(i => i == 0 || xs[i - 1] != xs[i])
.Select(i => new { Key = xs[i], Count = xs.Skip(i).TakeWhile(x => x == xs[i]).Count() });
Console.WriteLine(rle);
Of course, this is O(n^2), but you didn't request linear efficiency in the spec.

var array = new int[] {1,1,2,3,5,6,6 };
foreach (var g in array.GroupBy(i => i))
{
Console.WriteLine("{0} => {1}", g.Key, g.Count());
}

var array = new int[] { }; // whatever your array is
var counts = array.Select(s => array.Where(s2 => s == s2).Count());
The only problem with this is that if a value such as 1 appears twice, you will get its count twice.

var array = new int[] {1,1,2,3,5,6,6 };
var arrayd = array.Distinct();
var arrayl= arrayd.Select(s => { return array.Where(s2 => s2 == s).Count(); }).ToArray();
Output
arrayl=[0]2 [1]1 [2]1 [3]1 [4]2

Try GroupBy through List<int>
List<int> list = new List<int>() { 4, 1, 1, 3, 3, 2, 5, 3, 2, 2 };
var res = list.GroupBy(val => val);
foreach (var v in res)
{
MessageBox.Show(v.Key.ToString() + "=>" + v.Count().ToString());
}

Is there an easy way to merge two ordered sequences using LINQ?

Given
IEnumerable<T> first;
IEnumerable<T> second;
and that both first and second are ordered by a comparer Func<T, T, int> that returns 0 for equality, -1 when the first is "smaller" and 1 when the second is "smaller".
Is there a straight-forward way using LINQ to merge the two sequences in a way that makes the resulting sequence also ordered by the same comparer?
We're currently using a hand-crafted algorithm that works, but the readability of a straight-forward LINQ statement would be preferable.

You could define an extension method for this. Something like
public static IEnumerable<T> MergeSorted<T>(this IEnumerable<T> first, IEnumerable<T> second, Func<T, T, int> comparer)
{
using (var firstEnumerator = first.GetEnumerator())
using (var secondEnumerator = second.GetEnumerator())
{
var elementsLeftInFirst = firstEnumerator.MoveNext();
var elementsLeftInSecond = secondEnumerator.MoveNext();
while (elementsLeftInFirst || elementsLeftInSecond)
{
if (!elementsLeftInFirst)
{
do
{
yield return secondEnumerator.Current;
} while (secondEnumerator.MoveNext());
yield break;
}
if (!elementsLeftInSecond)
{
do
{
yield return firstEnumerator.Current;
} while (firstEnumerator.MoveNext());
yield break;
}
if (comparer(firstEnumerator.Current, secondEnumerator.Current) < 0)
{
yield return firstEnumerator.Current;
elementsLeftInFirst = firstEnumerator.MoveNext();
}
else
{
yield return secondEnumerator.Current;
elementsLeftInSecond = secondEnumerator.MoveNext();
}
}
}
}
Usage:
var s1 = new[] { 1, 3, 5, 7, 9 };
var s2 = new[] { 2, 4, 6, 6, 6, 8 };
var merged = s1.MergeSorted(s2, (a, b) => a > b ? 1 : -1).ToList();
Console.WriteLine(string.Join(", ", merged));
Output:
1, 2, 3, 4, 5, 6, 6, 6, 7, 8, 9

I think converting the first enumerable to a list, adding the items of the second sequence to it, and then calling Sort will do the trick.
IEnumerable<int> first = new List<int>(){1,3};
IEnumerable<int> second = new List<int>(){2,4};
var temp = first.ToList();
temp.AddRange(second);
temp.Sort(new Comparison<int>(comparer)); // where comparer is Func<T,T,int>
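If a single LINQ statement is what you're after, the same sort-everything idea can be sketched with Concat and OrderBy. The MergeBySorting name is hypothetical, and Comparer<T>.Create assumes .NET Framework 4.5 or later. Like the list-based version, this re-sorts the combined data rather than doing a true linear merge, so it costs O((n + m) log(n + m)) and buffers everything before yielding the first item:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeExtensions
{
    // Hypothetical helper: concatenates both sequences and sorts the result
    // with the caller's comparison function. It relies on sorting only and
    // does not take advantage of the inputs already being ordered.
    public static IEnumerable<T> MergeBySorting<T>(
        this IEnumerable<T> first, IEnumerable<T> second, Func<T, T, int> comparer)
    {
        // Adapt the Func<T, T, int> to the IComparer<T> that OrderBy expects.
        var adapted = Comparer<T>.Create((x, y) => comparer(x, y));
        return first.Concat(second).OrderBy(item => item, adapted);
    }
}
```

Usage would look like s1.MergeBySorting(s2, (a, b) => a.CompareTo(b)). The comparison should return 0 for equal items, as the question specifies, since sorting with an inconsistent comparer is not guaranteed to behave.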

Take every 2nd object in list

I have an IEnumerable and I want to get a new IEnumerable containing every nth element.
Can this be done in Linq?

Just figured it out myself...
The IEnumerable<T>.Where() method has an overload that takes the index of the current element - just what the doctor ordered.
(new []{1,2,3,4,5}).Where((elem, idx) => idx % 2 == 0);
This would return
{1, 3, 5}
Update: In order to cover both my use case and Dan Tao's suggestion, let's also specify what the first returned element should be:
var firstIdx = 1;
var takeEvery = 2;
var list = new []{1,2,3,4,5};
var newList = list
.Skip(firstIdx)
.Where((elem, idx) => idx % takeEvery == 0);
...would return
{2, 4}

To implement Cristi's suggestion:
public static IEnumerable<T> Sample<T>(this IEnumerable<T> source, int interval)
{
// null check, out of range check go here
return source.Where((value, index) => (index + 1) % interval == 0);
}
Usage:
var upToTen = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var evens = upToTen.Sample(2);
var multiplesOfThree = upToTen.Sample(3);
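The checks left as a comment in Sample could be filled in roughly like this. The exception choices and message text are my assumptions, and the class name SamplingExtensions is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SamplingExtensions
{
    public static IEnumerable<T> Sample<T>(this IEnumerable<T> source, int interval)
    {
        // This method is not an iterator block, so these checks run
        // immediately at the call site, not on first enumeration.
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        if (interval < 1)
            throw new ArgumentOutOfRangeException(nameof(interval), "interval must be at least 1");

        // Keep the items at 1-based positions interval, 2 * interval, ...
        return source.Where((value, index) => (index + 1) % interval == 0);
    }
}
```

The interval check matters because an interval of 0 would otherwise surface as a DivideByZeroException only once the result is enumerated.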

While not LINQ you may also create an extension method with yield.
public static IEnumerable<T> EverySecondObject<T>(this IEnumerable<T> list)
{
using (var enumerator = list.GetEnumerator())
{
while (true)
{
if (!enumerator.MoveNext())
yield break;
if (enumerator.MoveNext())
yield return enumerator.Current;
else
yield break;
}
}
}
