How to check duplicate string array in list? - c#

How to check duplicate string array in list?
I declare a list of string arrays like this:
List<string[]> list = new List<string[]>();
and add a few items to it:
list.Add(new string[3] {"1","2","3"});
list.Add(new string[3] {"2","3","4"});
list.Add(new string[1] {"3"});
list.Add(new string[1] {"3"});
list.Add(new string[3] {"1","2","3"});
Now I want to find out which items are duplicated. I tried the following to add the duplicated items to a new list:
for (int j = 0; j < list.Count - 1; j++)
{
    for (int k = list.Count - 1; k > j; k--)
    {
        if (j != k)
        {
            if (Enumerable.SequenceEqual(list[j], list[k]))
            {
                savedDistinctList.Add(list[j]);
            }
        }
    }
}
Finally, I want to remove the duplicated items from the first list, so that it ends up with three items: [1,2,3], [2,3,4], [3].
Any ideas, using LINQ or something else?

First we have to teach .NET how to compare arrays:
private sealed class ArrayEqualityComparer<T> : IEqualityComparer<T[]> {
    public bool Equals(T[] left, T[] right) {
        if (ReferenceEquals(left, right))
            return true;
        if (left is null || right is null)
            return false;
        return left.SequenceEqual(right);
    }

    // Equal arrays always have equal lengths, so the length is a valid (if coarse) hash code.
    public int GetHashCode(T[] array) => array is null
        ? -1
        : array.Length;
}
Then you can use LINQ's Distinct with this comparer:
using System.Linq;
...
savedDistinctList = list
    .Distinct(new ArrayEqualityComparer<string>())
    .ToList();
If you want to modify the existing list, you can use HashSet<T>:
var unique = new HashSet<string[]>(new ArrayEqualityComparer<string>());
for (int i = list.Count - 1; i >= 0; --i)
    if (!unique.Add(list[i]))
        list.RemoveAt(i);
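The backward loop above keeps the last occurrence of each array, so for the sample data it leaves [2,3,4], [3], [1,2,3]. If the first occurrence should survive instead, as in the expected result from the question, one option is List<T>.RemoveAll with the same HashSet idea. This is only a sketch: the names ArrayDedup and SequenceComparer are made up for illustration, and it assumes RemoveAll evaluates the predicate in index order, which is how the current implementation behaves.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ArrayDedup
{
    // Hypothetical comparer: two arrays are equal when their elements match in order.
    private sealed class SequenceComparer<T> : IEqualityComparer<T[]>
    {
        public bool Equals(T[] left, T[] right) =>
            ReferenceEquals(left, right) ||
            (left != null && right != null && left.SequenceEqual(right));

        // Combine the element hashes so that arrays of equal length rarely collide.
        public int GetHashCode(T[] array)
        {
            if (array == null) return 0;
            var hash = 17;
            foreach (var item in array)
                hash = unchecked(hash * 31 + (item == null ? 0 : item.GetHashCode()));
            return hash;
        }
    }

    // Removes later duplicates in place and keeps the first occurrence of each array.
    public static void RemoveDuplicates<T>(List<T[]> list)
    {
        var seen = new HashSet<T[]>(new SequenceComparer<T>());
        // HashSet.Add returns false when an equal array is already in the set.
        list.RemoveAll(item => !seen.Add(item));
    }
}
```

Calling ArrayDedup.RemoveDuplicates(list) on the five arrays from the question should leave [1,2,3], [2,3,4], [3] in that order.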

This has already been answered here: C# LINQ find duplicates in List, by Save.
The easiest way to solve the problem is to group the elements by value, and then pick a representative of each group that contains more than one element. In LINQ, this translates to:
var query = lst.GroupBy(x => x)
    .Where(g => g.Count() > 1)
    .Select(y => y.Key)
    .ToList();
If you want to know how many times the elements are repeated, you can use:
var query = lst.GroupBy(x => x)
    .Where(g => g.Count() > 1)
    .Select(y => new { Element = y.Key, Counter = y.Count() })
    .ToList();
This returns a List of an anonymous type, where each element has the properties Element and Counter to retrieve the information you need.
And lastly, if it's a dictionary you are looking for, you can use:
var query = lst.GroupBy(x => x)
    .Where(g => g.Count() > 1)
    .ToDictionary(x => x.Key, y => y.Count());
This returns a dictionary with your element as the key and the number of times it is repeated as the value.
Apply this to your list and iterate over the result with a foreach.
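One caveat for the list in the question: GroupBy(x => x) on a List<string[]> compares the arrays by reference, so no two of them would end up in the same group. A possible workaround, sketched below, is to group on a string key built with string.Join. The names ArrayDuplicates and FindDuplicates are invented for this example, and it assumes the separator character never occurs inside an element.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ArrayDuplicates
{
    // Returns one representative of every array that occurs more than once.
    // Assumption: the separator character never appears inside an element.
    public static List<string[]> FindDuplicates(IEnumerable<string[]> source)
    {
        const string separator = "\u001F";
        return source
            .GroupBy(array => string.Join(separator, array))
            .Where(group => group.Count() > 1)
            .Select(group => group.First())
            .ToList();
    }
}
```

For the sample data this should report [1,2,3] and [3]. Passing an IEqualityComparer<string[]> as the second argument of GroupBy is an alternative that avoids the separator assumption.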

Related

Sorting array by frequency of elements in C#

Here is what I have so far:
int[] numbers = { 3,5,4,3,8,8,5,3,2,1,9,5 };
int[] n = new int[12];
int[] k;
foreach (int number in numbers)
{
    n[number]++;
}
Array.Sort(n);
Array.Reverse(n);
foreach (int value in n)
{
    Console.WriteLine(value);
}
I know I am missing the part where I sort the elements by frequency after counting them, and I just can't get my head around it. I'd appreciate some help. Thanks!
What's the problem with your solution?
You correctly count the frequencies of the numbers in the array called n in your code (which I will call frequencies from here on), but then you sort this array. This breaks your solution, because each frequency is tied to the number it counts only through its index in the array.
For example, suppose the array is [8,2,1,7,6]. Calling Sort on it produces [1,2,6,7,8]. Before sorting, the first element said that the number 0 (the index of the first element is 0) had been found 8 times in numbers. After sorting, the first element is 1, which would mean that the frequency of the number 0 is 1, which is clearly wrong.
If you want to keep it your way, then you could try something like this:
int[] numbers = { 1,2,2,9,1,2,5,5,5,5,2 };
int[] frequencies = new int[12];
int k = 3;
foreach (int number in numbers)
{
    frequencies[number]++;
}
var mostFrequentNumbers = frequencies.Select((frequency, index) => new
    {
        Number = index,
        Frequency = frequency
    })
    .OrderByDescending(item => item.Frequency)
    .Select(item => item.Number)
    .Take(k);
foreach (int mostFrequentNumber in mostFrequentNumbers)
{
    Console.WriteLine(mostFrequentNumber);
}
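Another way to keep the counting array without losing track of which number each count belongs to is the Array.Sort(keys, items) overload, which reorders a second array in step with the first. The sketch below is not from the answer above: FrequencySort and TopK are made-up names, it assumes all values are non-negative, and because Array.Sort is not a stable sort, numbers with equal frequency come out in no particular order.

```csharp
using System;
using System.Linq;

public static class FrequencySort
{
    // Returns the k most frequent values; assumes every value is non-negative.
    public static int[] TopK(int[] numbers, int k)
    {
        if (numbers.Length == 0) return new int[0];

        var frequencies = new int[numbers.Max() + 1];
        foreach (var number in numbers)
            frequencies[number]++;

        // values[i] starts out as i, the number that frequencies[i] counts.
        var values = Enumerable.Range(0, frequencies.Length).ToArray();

        // Sorts frequencies ascending and applies the same moves to values.
        Array.Sort(frequencies, values);
        Array.Reverse(values);

        return values.Take(k).ToArray();
    }
}
```

For example, FrequencySort.TopK(new[] { 7, 7, 7, 7, 3, 3, 3, 9, 9, 1 }, 3) should give 7, 3, 9. If k exceeds the number of distinct values, the result also contains numbers that never occurred, just like the Select-based version above.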
Are there any other approaches?
An easy way to do this is to use a data structure like a Dictionary, with the numbers as keys and their frequencies as values.
Then you can order the dictionary entries by descending value and keep the k most frequent numbers.
int[] numbers = { 1,2,2,9,1,2,5,5,5,5,2 };
int k = 3;
Dictionary<int, int> numberFrequencies = new Dictionary<int, int>();
foreach (int number in numbers)
{
    if (numberFrequencies.ContainsKey(number))
    {
        numberFrequencies[number] += 1;
    }
    else
    {
        numberFrequencies.Add(number, 1);
    }
}
var mostFrequentNumbers = numberFrequencies.OrderByDescending(numberFrequency => numberFrequency.Value)
    .Take(k)
    .Select(numberFrequency => numberFrequency.Key);
foreach (int mostFrequentNumber in mostFrequentNumbers)
{
    Console.WriteLine(mostFrequentNumber);
}
You can also achieve the same thing by only using LINQ:
int[] numbers = { 1,2,2,9,1,2,5,5,5,5,2 };
int k = 3;
var mostFrequentNumbers = numbers.GroupBy(number => number)
    .ToDictionary(gr => gr.Key, gr => gr.Count())
    .OrderByDescending(keyValue => keyValue.Value)
    .Take(k)
    .Select(numberFrequency => numberFrequency.Key);
foreach (int mostFrequentNumber in mostFrequentNumbers)
{
    Console.WriteLine(mostFrequentNumber);
}
You can just use the LINQ extension methods:
using System.Linq;
using System.Collections.Generic;
...
private static IEnumerable<int> Solve(int[] numbers, int k) {
    return numbers
        .GroupBy(x => x)
        .OrderByDescending(g => g.Count())
        .Select(g => g.Key)
        .Take(k);
}
Then you can call:
var numbers = new []{1,2,2,9,1,2,5,5,5,5,2};
var k = 3;
var result = Solve(numbers, k);
foreach (int n in result)
    Console.WriteLine(n);
To be very terse:
var frequents = numbers.GroupBy(t => t)
    .Where(grp => grp.Count() > 1)          // optional: skip values that occur only once
    .OrderByDescending(grp => grp.Count())  // most frequent first
    .Select(grp => grp.Key)
    .Take(k)
    .ToList();

convert nested for loop to linq with condition

How can I convert the C# nested for loop below to LINQ?
list = objBLForms.GetForms(Ids);
for (int i = 0; i < list.Count; i++)
{
    for (int j = 0; j < list.Count; j++)
    {
        if (list[i].StateId == list[j].StateId &&
            list[i].PayerId == list[j].PayerId && i != j)
        {
            if (string.IsNullOrEmpty(list[i].Tax))
            {
                list.Remove(list[i]);
            }
            else
            {
                list.Remove(list[j]);
            }
        }
    }
}
I want to remove duplicate payers with the same state. If one of the duplicates has a state tax, I want to remove the other one, i.e. the duplicate that has no state tax.
I have achieved this with the nested for loop shown above.
Is there any way to do it in LINQ? I am very new to LINQ and don't know anything about it.
Thanks in advance.
Your code actually removes EVERYTHING for which string.IsNullOrEmpty(Tax) is true, and keeps only the first record that has a value in Tax. So how about this:
list
    .Where(l => !string.IsNullOrEmpty(l.Tax))
    .GroupBy(l => new {l.StateId, l.PayerId})
    .Select(group => group.First())
    .ToArray();
This seems about right to me:
list =
    list
        .OrderByDescending(x => x.Tax)
        .GroupBy(x => new { x.StateId, x.PayerId })
        .SelectMany(x => x.Take(1))
        .ToList();
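To see how such a grouping behaves on concrete data, here is a small self-contained sketch. The Form class is a hypothetical stand-in for the asker's type (only StateId, PayerId and Tax appear in the question, and int keys are an assumption), and it orders each group on string.IsNullOrEmpty(Tax) rather than on Tax itself, simply to make the preference explicit.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's form type; only the members used in the question.
public sealed class Form
{
    public int StateId { get; set; }
    public int PayerId { get; set; }
    public string Tax { get; set; }
}

public static class FormDedup
{
    // Keeps one form per (StateId, PayerId), preferring a form whose Tax is filled in.
    public static List<Form> KeepOnePerPayerAndState(IEnumerable<Form> forms)
    {
        return forms
            .GroupBy(f => new { f.StateId, f.PayerId })
            // false sorts before true, so forms with a Tax value come first;
            // OrderBy is stable, so the original order breaks ties.
            .Select(g => g.OrderBy(f => string.IsNullOrEmpty(f.Tax)).First())
            .ToList();
    }
}
```

Unlike the nested loop, this never modifies a list while iterating over it, and a group in which no form has a Tax value still keeps one entry.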

How to sort a loop order using linq query in C#?

Assume that I have a list of items with values from 1 to 3.
I could order them as 1,1,2,2,3,3.
But instead, I would like to order them as 1,2,3,1,2,3....
Is there an existing function that does this?
This approach counts the occurrences of each number, then cycles through the distinct numbers in order, adding a number to the result list whenever it still has occurrences left. There are probably ways to make this safer and more efficient, but it should give you a start. (If the numbers do not all occur equally often in the source array, a number is simply skipped once it runs out during the iteration phase.)
int[] arr = new[] { 1,1,1,2,2,2,3,3,3,4,4,4,5,5,5 };
var orderList = arr.OrderBy(x => x).Distinct().ToArray();
var refList = arr.GroupBy(x => x).ToDictionary(k => k.Key, v => v.Count());
var result = new List<int>();
int i = 0;
while (result.Count < arr.Length)
{
    if (refList.Values.Sum() == 0)
        break;
    if (refList[orderList[i]] > 0)
    {
        result.Add(orderList[i]);
        refList[orderList[i]]--;
    }
    i++;
    if (i >= orderList.Length)
        i = 0;
}
// Result: [1,2,3,4,5,1,2,3,4,5,1,2,3,4,5]
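The same ordering can probably also be expressed as a single LINQ query: tag every item with its position inside its group of equal values, then sort by that position first and by the value second. This is only a sketch, and the names RoundRobin and Interleave are made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RoundRobin
{
    // Orders values as 1,2,3,1,2,3,... by sorting on the occurrence index first.
    public static List<int> Interleave(IEnumerable<int> source)
    {
        return source
            .GroupBy(x => x)
            // round is 0 for the first occurrence of a value, 1 for the second, and so on.
            .SelectMany(g => g.Select((value, round) => new { value, round }))
            .OrderBy(p => p.round)
            .ThenBy(p => p.value)
            .Select(p => p.value)
            .ToList();
    }
}
```

RoundRobin.Interleave(new[] { 1, 1, 2, 2, 3, 3 }) should give 1,2,3,1,2,3. With unequal counts, a number simply drops out of later rounds once it has no occurrences left.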

Select lists from lists with linq

I have a list with x items. I want to split this list into groups of a given size rather than group it by a property.
For example:
I have a list of 8 items and want to group them by 3.
I want to get a list that contains three lists, where the first two lists contain three items each and the last list contains the remaining two.
I want a more elegant solution than this:
private static List<List<string>> GroupBy(List<string> pages, int groupSize)
{
    var result = new List<List<string>>();
    // Keep taking groups until every page has been consumed.
    while (result.Count * groupSize < pages.Count)
    {
        int skip = result.Count * groupSize;
        var group = pages.Skip(skip).Take(groupSize).ToList();
        result.Add(group);
    }
    return result;
}
You can use the integer division trick:
List<List<string>> lists = pages
    .Select((str, index) => new { str, index })
    .GroupBy(x => x.index / groupSize)
    .Select(g => g.Select(x => x.str).ToList())
    .ToList();
Example:
int groupSize = 3;
var pages = new List<string> { "A", "B", "C", "D", "E", "F", "G" };
List<List<string>> lists = pages
    .Select((str, index) => new { str, index })
    .GroupBy(x => x.index / groupSize)
    .Select(g => g.Select(x => x.str).ToList())
    .ToList();
Result:
foreach(var list in lists)
    Console.WriteLine(string.Join(",", list));
Output:
A,B,C
D,E,F
G
So this approach gives you lists with the specified maximum size, in this case 3. If you instead want to ensure that you always get three lists (as long as there are at least three items), use % instead of /. Note that the items are then dealt out in turn, so the first list gets the items at index 0, 3, 6, and so on:
List<List<string>> lists = pages
    .Select((str, index) => new { str, index })
    .GroupBy(x => x.index % groupSize)
    .Select(g => g.Select(x => x.str).ToList())
    .ToList();
Try this:
var list = Enumerable.Range(1, 100);
var query = list
    .Select((x, i) => new { x, i })
    .GroupBy(v => v.i / 3)
    .Select(g => g.Select(v => v.x).ToList())
    .ToList();
Here's a simple solution using side effects (which is generally discouraged):
private static List<List<string>> GroupBy(List<string> pages, int groupSize)
{
    var i = 0;
    return pages.GroupBy(p => i++ / groupSize, (k, g) => g.ToList()).ToList();
}
Or if you want to avoid relying on side effects, you could use this:
private static List<List<string>> GroupBy(List<string> pages, int groupSize)
{
    // The Select overload with an index supplies i for each page.
    return pages.Select((p, i) => new { p, i })
        .GroupBy(x => x.i / groupSize)
        .Select(g => g.Select(x => x.p).ToList())
        .ToList();
}
LINQ is not the best solution. Often good old indexing is much more readable and efficient.
private static List<List<T>> GroupBy<T>(List<T> pages, int groupSize)
{
    var result = new List<List<T>>();
    List<T> l = null;
    for (int i = 0; i < pages.Count; i++)
    {
        // Start a new sublist every groupSize items.
        if (i % groupSize == 0)
        {
            l = new List<T>();
            result.Add(l);
        }
        l.Add(pages[i]);
    }
    return result;
}
You could also have a look at MoreLINQ, whose Batch method splits a sequence into fixed-size buckets.
It's available via NuGet.
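On .NET 6 or later there is also a built-in option, Enumerable.Chunk, which none of the answers above use. A minimal sketch, assuming a recent runtime; the names PageGrouping and Split are invented for this example:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PageGrouping
{
    // Requires .NET 6 or later, where Enumerable.Chunk was introduced.
    public static List<List<string>> Split(IEnumerable<string> pages, int groupSize)
    {
        // Chunk yields arrays of up to groupSize items; the last one may be shorter.
        return pages
            .Chunk(groupSize)
            .Select(chunk => chunk.ToList())
            .ToList();
    }
}
```

For a list of 8 pages and a groupSize of 3, this should produce lists of 3, 3 and 2 items.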

Query a list for only duplicates

I have a List of type string in a .NET 3.5 project. The list has thousands of strings in it, but for the sake of brevity we're going to say that it just has 5 strings in it.
List<string> lstStr = new List<string>() {
"Apple", "Banana", "Coconut", "Coconut", "Orange"};
Assume that the list is sorted (as you can tell above). What I need is a LINQ query that will remove all strings that are not duplicates. So the result would leave me with a list that only contains the two "Coconut" strings.
Is this possible to do with a LINQ query? If it is not then I'll have to resort to some complex for loops, which I can do, but I didn't want to unless I had to.
Here is code for finding the duplicates in an array (shown with ints; it works the same way for strings):
int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
var duplicates = listOfItems
    .GroupBy(i => i)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key);
foreach (var d in duplicates)
    Console.WriteLine(d);
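The query above yields each duplicated value once. The question asks for a list that still contains every copy (both "Coconut" strings), so one possible variation is to flatten the qualifying groups with SelectMany instead of selecting the key. The class and method names below are made up for illustration; GroupBy, Where and SelectMany themselves are available in .NET 3.5.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFilter
{
    // Keeps every copy of each value that occurs more than once; unique values are dropped.
    public static List<string> OnlyDuplicates(IEnumerable<string> source)
    {
        return source
            .GroupBy(s => s)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g)
            .ToList();
    }
}
```

For the sample list from the question this should return the two "Coconut" strings. It does not require the input to be sorted.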
var dupes = lstStr.Where(x => lstStr.Sum(y => y == x ? 1 : 0) > 1);
OR
var dupes = lstStr.Where((x, i) => (i > 0 && x == lstStr[i - 1])
                                || (i < lstStr.Count - 1 && x == lstStr[i + 1]));
Note that the first one enumerates the whole list for every element, which takes O(n²) time (but doesn't assume a sorted list). The second one is O(n) (and assumes a sorted list).
This should work, and is O(N) rather than the O(N^2) of the other answers. (Note that it relies on the list being sorted, so that really is a requirement.)
// Extension methods must be declared inside a static class.
public static IEnumerable<T> OnlyDups<T>(this IEnumerable<T> coll)
    where T : IComparable<T>
{
    IEnumerator<T> iter = coll.GetEnumerator();
    if (iter.MoveNext())
    {
        T last = iter.Current;
        while (iter.MoveNext())
        {
            if (iter.Current.CompareTo(last) == 0)
            {
                // Emit the first element of the run, then every following equal element.
                yield return last;
                do
                {
                    yield return iter.Current;
                }
                while (iter.MoveNext() && iter.Current.CompareTo(last) == 0);
            }
            last = iter.Current;
        }
    }
}
Use it like this:
IEnumerable<string> onlyDups = lstStr.OnlyDups();
or
List<string> onlyDups = lstStr.OnlyDups().ToList();
var temp = new List<string>();
// Visit each distinct value once so that duplicates are not added repeatedly.
foreach (var item in list.Distinct())
{
    var stuff = (from m in list
                 where m == item
                 select m);
    if (stuff.Count() > 1)
    {
        temp.AddRange(stuff);
    }
}
