Query a list for only duplicates


I have a List of type string in a .NET 3.5 project. The list has thousands of strings in it, but for the sake of brevity we're going to say that it just has 5 strings in it.
List<string> lstStr = new List<string>() {
"Apple", "Banana", "Coconut", "Coconut", "Orange"};
Assume that the list is sorted (as you can tell above). What I need is a LINQ query that will remove all strings that are not duplicates. So the result would leave me with a list that only contains the two "Coconut" strings.
Is this possible to do with a LINQ query? If it is not then I'll have to resort to some complex for loops, which I can do, but I didn't want to unless I had to.

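Here is code for finding duplicates in an array (the example uses ints, but the same query works for strings). Note that it returns each duplicated value once, not every occurrence: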
int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
var duplicates = listOfItems
.GroupBy(i => i)
.Where(g => g.Count() > 1)
.Select(g => g.Key);
foreach (var d in duplicates)
Console.WriteLine(d);
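The query above yields each duplicated value once. If every occurrence should be kept, as the question asks, one possible variation is to flatten the surviving groups with SelectMany. This is only a sketch: it is written with top-level statements (C# 9 or later), so on .NET 3.5 the same query would have to live inside a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lstStr = new List<string> { "Apple", "Banana", "Coconut", "Coconut", "Orange" };

// Group equal strings, keep only groups with more than one member,
// then flatten the groups back into a single sequence.
List<string> onlyDuplicates = lstStr
    .GroupBy(s => s)
    .Where(g => g.Count() > 1)
    .SelectMany(g => g)
    .ToList();

Console.WriteLine(string.Join(", ", onlyDuplicates)); // Coconut, Coconut
```

GroupBy does not require the list to be sorted, and the groups come out in order of first appearance.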

var dupes = lstStr.Where(x => lstStr.Sum(y => y==x ? 1 : 0) > 1);
or
var dupes = lstStr.Where((x,i) => (i > 0 && x==lstStr[i-1])
    || (i < lstStr.Count-1 && x==lstStr[i+1]));
Note that the first one enumerates the whole list for every element, which takes O(n²) time (but doesn't assume a sorted list). The second one is O(n) but assumes a sorted list, because it only compares each element with its neighbours.
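If the list is not sorted and the quadratic scan is too slow, a two-pass approach with hash sets is one alternative. This is a sketch, not one of the answers above: the names items, seen and repeated are made up for illustration, and the expected linear running time assumes well-distributed hash codes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var items = new List<string> { "b", "a", "b", "c", "a" }; // deliberately unsorted

// First pass: remember every value that is seen a second time.
var seen = new HashSet<string>();
var repeated = new HashSet<string>();
foreach (var s in items)
{
    if (!seen.Add(s))
        repeated.Add(s);
}

// Second pass: keep every element whose value was repeated.
List<string> dupes = items.Where(s => repeated.Contains(s)).ToList();

Console.WriteLine(string.Join(", ", dupes)); // b, a, b, a
```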

This should work, and is O(N) rather than the O(N^2) of the other answers. (Note that it relies on the list being sorted, so that really is a requirement.)
public static class EnumerableExtensions
{
    // Yields every element of a sorted sequence that equals one of its neighbours.
    public static IEnumerable<T> OnlyDups<T>(this IEnumerable<T> coll)
        where T : IComparable<T>
    {
        using (IEnumerator<T> iter = coll.GetEnumerator())
        {
            if (!iter.MoveNext())
                yield break;
            T last = iter.Current;
            bool inRun = false; // true while we are inside a run of equal items
            while (iter.MoveNext())
            {
                if (iter.Current.CompareTo(last) == 0)
                {
                    if (!inRun)
                        yield return last; // first item of the run
                    yield return iter.Current;
                    inRun = true;
                }
                else
                {
                    inRun = false;
                }
                last = iter.Current;
            }
        }
    }
}
Use it like this:
IEnumerable<string> onlyDups = lstStr.OnlyDups();
or
List<string> onlyDups = lstStr.OnlyDups().ToList();

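var temp = new List<string>();
foreach (var item in list)
{
    // All elements equal to the current item (rescans the list, so O(n²) overall).
    var stuff = (from m in list
                 where m == item
                 select m);
    if (stuff.Count() > 1)
    {
        // Add only the current item; adding all of "stuff" here would
        // insert each duplicate once per occurrence.
        temp.Add(item);
    }
}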

Related

How to check duplicate string array in list?

I declare a list of string arrays like this:
List<string[]> list = new List<string[]>();
and I add a few items to the list:
list.Add(new string[3] {"1","2","3"});
list.Add(new string[3] {"2","3","4"});
list.Add(new string[1] {"3"});
list.Add(new string[1] {"3"});
list.Add(new string[3] {"1","2","3"});
Now I want to find out which items are duplicated. I tried adding the duplicated items to a new list like this:
for (int j = 0; j < list.Count - 1; j++)
{
for (int k = list.Count - 1; k > j; k--)
{
if (j != k)
{
if (Enumerable.SequenceEqual(list[j], list[k]))
{
savedDistinctList.Add(list[j]);
}
}
}
}
Finally, I want to remove the duplicated items from the first list, so that it contains only 3 items: ([1,2,3],[2,3,4],[3]).
Is there a way to do this using LINQ or something else?

First we have to teach .NET how to compare arrays by content:
private sealed class ArrayEqualityComparer<T> : IEqualityComparer<T[]> {
public bool Equals(T[] left, T[] right) {
if (ReferenceEquals(left, right))
return true;
if (left is null || right is null)
return false;
return left.SequenceEqual(right);
}
public int GetHashCode(T[] array) => array is null
? -1
: array.Length;
}
Then you can use Linq Distinct with this class implemented:
using System.Linq;
...
savedDistinctList = list
.Distinct(new ArrayEqualityComparer<string>())
.ToList();
If you want to modify the existing list, you can use HashSet<T>:
var unique = new HashSet<string[]>(new ArrayEqualityComparer<string>());
for (int i = list.Count - 1; i >= 0; --i)
if (!unique.Add(list[i]))
list.RemoveAt(i);
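The backward loop above keeps the last occurrence of each array. If the first occurrence should survive instead (which gives the order [1,2,3],[2,3,4],[3] from the question), one option is List<T>.RemoveAll, which scans forward. The sketch below swaps the comparer for a joined-string key only to stay self-contained; that key is an assumption that holds only if the separator character never occurs in the data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<string[]>
{
    new[] { "1", "2", "3" },
    new[] { "2", "3", "4" },
    new[] { "3" },
    new[] { "3" },
    new[] { "1", "2", "3" },
};

// Assumption: '\u001F' (unit separator) never occurs inside the strings,
// so the joined text identifies an array's contents unambiguously.
var seen = new HashSet<string>();

// HashSet.Add returns false for a key that is already present,
// so every later copy of an array is removed.
list.RemoveAll(a => !seen.Add(string.Join("\u001F", a)));

foreach (var a in list)
    Console.WriteLine("[" + string.Join(",", a) + "]");
```

With a content-based IEqualityComparer<string[]> the same RemoveAll call works on a HashSet<string[]> directly, without the separator assumption.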

This has already been answered here: C# LINQ find duplicates in List by #Save
The easiest way to solve the problem is to group the elements based on their value, and then pick a representative of the group if there is more than one element in the group. In LINQ, this translates to:
var query = lst.GroupBy(x => x)
.Where(g => g.Count() > 1)
.Select(y => y.Key)
.ToList();
If you want to know how many times the elements are repeated, you can use:
var query = lst.GroupBy(x => x)
.Where(g => g.Count() > 1)
.Select(y => new { Element = y.Key, Counter = y.Count() })
.ToList();
This will return a List of an anonymous type, and each element will have the properties Element and Counter, to retrieve the information you need.
And lastly, if it's a dictionary you are looking for, you can use
var query = lst.GroupBy(x => x)
.Where(g => g.Count() > 1)
.ToDictionary(x => x.Key, y => y.Count());
This will return a dictionary, with your element as key, and the number of times it's repeated as value.
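You can then iterate over the result with a foreach. Since your list holds string arrays, which are compared by reference by default, pass an array comparer to GroupBy (for example an IEqualityComparer<string[]> based on SequenceEqual) so that arrays with equal contents fall into the same group.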

How to sort a loop order using linq query in C#?

Assume that I have a list of items from 1 - 3.
I could order them by 1,1,2,2,3,3.
But instead, I would like to order them by 1,2,3,1,2,3....
Is there an existing function that achieves that?

This approach separates the numbers into groups, then iterates through the groups in order, adding one number from each group to a result list per pass. There are probably ways to make this safer and more efficient, but this should give you a start. (If the source array doesn't contain equal counts of each number, a number is simply skipped once it runs out.)
int[] arr = new[] { 1,1,1,2,2,2,3,3,3,4,4,4,5,5,5 };
var orderList = arr.OrderBy(x => x).Distinct().ToArray();
var refList = arr.GroupBy(x => x).ToDictionary(k => k.Key, v => v.Count());
var result = new List<int>();
int i = 0;
while (result.Count < arr.Length)
{
if (refList.Values.Sum() == 0)
break;
if (refList[orderList[i]] > 0)
{
result.Add(orderList[i]);
refList[orderList[i]]--;
}
i++;
if (i >= orderList.Length)
i = 0;
}
// Result: [1,2,3,4,5,1,2,3,4,5,1,2,3,4,5]
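For comparison, the same round-robin order can probably be expressed as a single LINQ query: number each element within its group, then sort by that position first and by value second. This is a sketch of an alternative, not the method used above; the property name round is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] arr = { 1, 1, 2, 2, 3, 3, 3 };

List<int> result = arr
    .GroupBy(x => x)
    // Tag each element with its position inside its own group (0, 1, 2, ...).
    .SelectMany(g => g.Select((value, round) => new { value, round }))
    // All "round 0" elements come first, then "round 1", and so on.
    .OrderBy(p => p.round)
    .ThenBy(p => p.value)
    .Select(p => p.value)
    .ToList();

Console.WriteLine(string.Join(",", result)); // 1,2,3,1,2,3,3
```

Values with more occurrences than the others simply trail at the end, as the extra 3 does here.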

Venn Diagram style grouping in LINQ

Ok. The title might be a little confusing but here is what I am trying to do
I have a series of natural numbers
var series = Enumerable.Range(1, 100);
Now I want to use GroupBy to put numbers into these 3 groups, Prime, Even, Odd
series.Select(number => {
var type = "";
if (MyStaticMethods.IsPrime(number))
{
type = "prime";
}
else if (number % 2 == 0)
{
type = "Even";
}
else
{
type = "Odd";
}
return new { Type=type, Number = number };
}).GroupBy(n => n.Type);
The above query puts each prime number only in the 'prime' group, so it is missing from the 'Even' or 'Odd' group it also belongs to. Is there any way for the above Select to yield multiple entries per number?
I could try something like the following, but it requires an additional flattening of the sequence.
series.Select(number => {
    // A List<int> cannot hold Type/Number pairs, so use named tuples (C# 7+).
    var list = new List<(string Type, int Number)>();
    if (MyStaticMethods.IsPrime(number))
    {
        list.Add(("prime", number));
    }
    if (number % 2 == 0)
    {
        list.Add(("even", number));
    }
    else
    {
        list.Add(("odd", number));
    }
    return list;
})
.SelectMany(n => n)
.GroupBy(n => n.Type);
The above code solves my issue, is there any better way that could make my code look more "functional" ?

You can use LINQ here, but you'll need to duplicate the values that belong to more than one group. GroupBy only works for disjoint groups, so you need a way to distinguish 2 the even number from 2 the prime number. Your approach is essentially what you need to do, but it could be done a little more efficiently.
You can define a set of categories that can help classify the numbers. You don't necessarily need to define new classes to get this to work, but it helps to keep things clean and organized.
class Category<T>
{
public Category(string name, Predicate<T> predicate)
{
Name = name;
Predicate = predicate;
}
public string Name { get; }
public Predicate<T> Predicate { get; }
}
Then to group the numbers, you'd do this:
var series = Enumerable.Range(1, 100);
var categories = new[]
{
new Category<int>("Prime", i => MyStaticMethods.IsPrime(i)),
new Category<int>("Odd", i => i % 2 != 0),
new Category<int>("Even", i => i % 2 == 0),
};
var grouped =
from i in series
from c in categories
where c.Predicate(i)
group i by c.Name;
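To try the idea end to end, here is a compact sketch that uses named tuples instead of the Category<T> class and a smaller range. The IsPrime local function is a hypothetical stand-in for MyStaticMethods.IsPrime (plain trial division), and the names categories and grouped are only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MyStaticMethods.IsPrime (trial division).
bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int d = 2; d * d <= n; d++)
        if (n % d == 0) return false;
    return true;
}

var categories = new (string Name, Func<int, bool> Test)[]
{
    ("Prime", i => IsPrime(i)),
    ("Odd", i => i % 2 != 0),
    ("Even", i => i % 2 == 0),
};

// Pair every number with every category it satisfies, then group by category name.
Dictionary<string, List<int>> grouped = Enumerable.Range(1, 10)
    .SelectMany(i => categories
        .Where(c => c.Test(i))
        .Select(c => new { c.Name, Number = i }))
    .GroupBy(p => p.Name, p => p.Number)
    .ToDictionary(g => g.Key, g => g.ToList());

foreach (var kv in grouped)
    Console.WriteLine(kv.Key + ": " + string.Join(", ", kv.Value));
```

A number such as 2 ends up in both the "Prime" and the "Even" list, which is the overlap a plain GroupBy on a single key cannot produce.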

This is a good case for Reactive Extensions, as you avoid duplicating values.
In the code below, "series" is enumerated only once, because Publish() turns it into a hot source.
The actual enumeration happens when Connect() is called.
using System.Reactive.Linq;
var list = new List<KeyValuePair<string, int>>();
var series= Observable.Range(1, 100).Publish();
series.Where(e => e % 2 == 0).Subscribe(e=>list.Add(new KeyValuePair<string, int>("Even",e)));
series.Where(e => e % 2 == 1).Subscribe(e => list.Add(new KeyValuePair<string, int>("Odd", e)));
series.Where(e => MyStaticMethods.IsPrime(e) ).Subscribe(e => list.Add(new KeyValuePair<string, int>("Prime", e)));
series.Connect();
var result = list.GroupBy(n => n.Key);

How to find out the existence of a specific pattern in a list of integers with the help of LINQ (C#)

How can I determine whether a range in a list of integers follows a specific pattern?
For example, we have a list like this:
List<int> ints = new List<int>(){4,5,2,6,8,4,5,6,5,6,8,9,9};
Exists and Any can check whether an element satisfies a specific condition.
But what if I want to know whether there are any three items in a row with incrementing values (each one plus 1)? Here they are {4, 5, 6}.

Patrick already answered your question with a good solution, but if you're really looking for a LINQ-only way, you could use Aggregate:
var inputs = new List<IEnumerable<int>>
{
new List<int>{ 4,5,2,6,8,4,5,6,5,6,8,9,9 },
new List<int>{ 1,2,3 },
new List<int>{ 1,2,4 },
};
foreach(var input in inputs)
{
var result = input.Aggregate(Enumerable.Empty<int>(),
(agg, cur) => agg.Count() == 3 ? agg
: agg.Any() && cur == agg.Last() + 1
? agg.Concat(new []{cur})
: new []{cur});
Console.WriteLine(result.Count() >= 3 ? String.Join(", ", result) : "not found");
}

Another way is to take all of the groups of 3 consecutive items and then see which group(s) meet your n, n+1 and n+2 rule:
var results = Enumerable.Range(0, Math.Max(0, ints.Count - 2)) // one start index per window of 3
    .Select(n => ints.Skip(n).Take(3).ToArray())
    .Where(three => three[0]+1 == three[1] && three[0]+2 == three[2])
    .ToArray();
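If only a yes/no answer is needed, as with Exists or Any, the window idea can be reduced to an index-based check. The following is a sketch under that assumption; HasRunOfThree is a made-up helper name, and it relies on indexed access, so it expects an IList<int> rather than an arbitrary sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// True if xs contains three neighbouring items n, n+1, n+2.
bool HasRunOfThree(IList<int> xs) =>
    Enumerable.Range(0, Math.Max(0, xs.Count - 2)) // every valid window start
        .Any(i => xs[i] + 1 == xs[i + 1] && xs[i] + 2 == xs[i + 2]);

var ints = new List<int> { 4, 5, 2, 6, 8, 4, 5, 6, 5, 6, 8, 9, 9 };

Console.WriteLine(HasRunOfThree(ints));                      // True (4, 5, 6)
Console.WriteLine(HasRunOfThree(new List<int> { 1, 2, 4 })); // False
```

Any stops at the first matching window, so the rest of the list is not inspected once a run is found.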

I would drop the LINQ requirement. It is very hard, maybe even impossible. A regular foreach statement is better suited for this:
List<int> sequence = new List<int>();
List<int> longestSequence = null;
int previous = 0;
foreach (int i in ints)
{
if (i != previous + 1 && sequence.Count > 0)
{
if (longestSequence == null || longestSequence.Count < sequence.Count)
{
longestSequence = sequence;
}
sequence = new List<int>();
}
sequence.Add(i);
previous = i;
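}
// The last run is never compared inside the loop, so check it once more here.
if (longestSequence == null || longestSequence.Count < sequence.Count)
{
    longestSequence = sequence;
}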

How to find the number of duplicate values in an ArrayList

I have an ArrayList in which some of the values are repeated. I need the count of the repeated values. Is this possible in C#?

Here is how to do it with LINQ (adapted from a post on the topic):
var query =
    from object c in arrayList // ArrayList is non-generic, so the range variable needs an explicit type
    group c by c into g
    where g.Count() > 1
    select new { Item = g.Key, ItemCount = g.Count() };
foreach (var item in query)
{
    Console.WriteLine("Item {0} occurs {1} times", item.Item, item.ItemCount);
}

If your objects override Equals (and GetHashCode) correctly, just call Distinct() from the System.Linq namespace.
This requires the ArrayList to be homogeneous, and you have to call Cast<YourType>() before Distinct().
Then subtract the count of the Distinct sequence from the count of arrayList:
arrayList.Count - arrayList.Cast<YourType>().Distinct().Count()
Cast<YourType>() throws an exception if an item in arrayList is not of type YourType; OfType<YourType>() instead filters the items down to those of type YourType.
But if you want the count of each repeated item, this is not your answer.
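Note that "count of the repeated values" can mean two different numbers. The sketch below, using made-up sample data, contrasts the subtraction approach (how many surplus copies exist) with a grouping approach (how many distinct values occur more than once).

```csharp
using System;
using System.Collections;
using System.Linq;

var arrayList = new ArrayList { 1, 1, 1, 2, 3, 3 };

// Surplus copies: total items minus distinct items (6 - 3).
int surplusCopies = arrayList.Count - arrayList.Cast<int>().Distinct().Count();

// Distinct values that occur more than once (here 1 and 3).
int repeatedValues = arrayList.Cast<int>()
    .GroupBy(x => x)
    .Count(g => g.Count() > 1);

Console.WriteLine(surplusCopies);  // 3
Console.WriteLine(repeatedValues); // 2
```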

public Dictionary<T,int> CountOccurences<T>(IEnumerable<T> items) {
var occurences = new Dictionary<T,int>();
foreach(T item in items) {
if(occurences.ContainsKey(item)) {
occurences[item]++;
} else {
occurences.Add(item, 1);
}
}
return occurences;
}

myList.GroupBy(i => i).Count(g => g.Count() > 1)
and if you specifically need ArrayList
ArrayList arrayList = new ArrayList(new[] { 1, 1, 2, 3, 4, 4 });
Console.WriteLine(arrayList.ToArray().GroupBy(i => i).Count(g => g.Count() > 1));
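Based on the poster's comments, to count how often one specific value occurs:
ArrayList arrayList = new ArrayList(new[] { 1, 1, 2, 3, 4, 4 });
Console.WriteLine(arrayList.ToArray().Count(i => i.Equals(4))); // i is a boxed object, so i == 4 would not compile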

int countDup = ArrayList1.Count - ArrayList1.OfType<object>().Distinct().Count();

var items = arrayList.Cast<object>()
.GroupBy(o => o)
.Select(g => new { Item = g.Key, Count = g.Count() })
.ToList();
each item of result list will have two properties:
Item - source item
Count - count in source list

You can accomplish this in many ways. The first that comes to mind would be to group by the values within your ArrayList, and only count the groups that have more than one member.
ArrayList al = new ArrayList();
al.Add("a");
al.Add("b");
al.Add("c");
al.Add("f");
al.Add("a");
al.Add("f");
int count = al.ToArray().GroupBy(q => q).Count(q=>q.Count()>1);
count will return the value of 2 as a and f are duplicated.

You could sort it, then it becomes very easy.
Edit: sorting becomes a moot point when done this way.
ArrayList myList = new ArrayList();
myList = someStuff;
Dictionary<object, int> counts = new Dictionary<object,int>();
foreach (object item in myList)
{
if (!counts.ContainsKey(item))
{
counts.Add(item,1);
}
else
{
counts[item]++;
}
}
Edit:
Some minor things might vary (not certain about some of my square braces, I'm a little rusty with c#) but the concept should withstand scrutiny.
