How to find the number of duplicate values in an ArrayList - C#

I have an ArrayList in which some of the values are repeated. I need the count of the repeated values. Is this possible in C#?

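Here is how to do it with LINQ (adapted from a post on the subject):
var query =
    from object c in arrayList   // ArrayList is non-generic, so the range variable needs an explicit type
    group c by c into g
    where g.Count() > 1
    select new { Item = g.Key, ItemCount = g.Count() };
foreach (var item in query)
{
    // Each entry is a repeated value together with the number of times it occurs
    Console.WriteLine("{0} occurs {1} times", item.Item, item.ItemCount);
}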

If your objects override Equals (and GetHashCode) correctly, you can simply call Distinct() from the System.Linq namespace.
This requires the ArrayList to be homogeneous, and you must call Cast<YourType>() before Distinct().
Then subtract the count of the distinct sequence from the count of the ArrayList:
arrayList.Count - arrayList.Cast<YourType>().Distinct().Count()
Cast<YourType>() throws an exception if any item in the ArrayList is not of type YourType, whereas OfType<YourType>() filters the list down to the items that are of type YourType.
If you want the count of each repeated item, however, this is not your answer.
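The difference between Cast<T>() and OfType<T>(), and the subtraction described above, can be sketched as follows. The class name, the method name CountExtras and the sample values are made up for illustration; this is a minimal sketch, not a definitive implementation.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class DuplicateCountDemo
{
    // Hypothetical helper: total item count minus the number of distinct values.
    public static int CountExtras(ArrayList list)
    {
        return list.Count - list.Cast<object>().Distinct().Count();
    }

    public static void Main()
    {
        ArrayList mixed = new ArrayList { 1, 1, 2, "x", 2, 2 };

        // OfType<int>() silently skips the string "x".
        Console.WriteLine(mixed.OfType<int>().Count()); // 5

        // Cast<int>() throws InvalidCastException once it reaches "x".
        try
        {
            mixed.Cast<int>().ToList();
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Cast<int>() failed on a non-int item");
        }

        // 6 items, 3 distinct values (1, 2, "x"), so 3 items are repeats.
        Console.WriteLine(CountExtras(mixed)); // 3
    }
}
```

Note that the result counts the surplus items, not the number of distinct values that are repeated; for the latter, the GroupBy approaches in the other answers apply.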

public Dictionary<T,int> CountOccurences<T>(IEnumerable<T> items) {
    var occurences = new Dictionary<T,int>();
    foreach(T item in items) {
        if(occurences.ContainsKey(item)) {
            occurences[item]++;
        } else {
            occurences.Add(item, 1);
        }
    }
    return occurences;
}

myList.GroupBy(i => i).Count(g => g.Count() > 1)
And if you specifically need an ArrayList:
ArrayList arrayList = new ArrayList(new[] { 1, 1, 2, 3, 4, 4 });
Console.WriteLine(arrayList.ToArray().GroupBy(i => i).Count(g => g.Count() > 1));
Based on comments by the poster, to count the occurrences of one particular value:
ArrayList arrayList = new ArrayList(new[] { 1, 1, 2, 3, 4, 4 });
// ToArray() returns object[], so compare with Equals; object == int does not compile
Console.WriteLine(arrayList.ToArray().Count(i => i.Equals(4)));

int countDup = ArrayList1.Count - ArrayList1.OfType<object>().Distinct().Count();

var items = arrayList.Cast<object>()
    .GroupBy(o => o)
    .Select(g => new { Item = g.Key, Count = g.Count() })
    .ToList();
Each item of the resulting list has two properties:
Item - the source item
Count - the number of times it occurs in the source list

You can accomplish this in many ways. The first that comes to mind is to group by the values in your ArrayList and count only the groups that contain more than one item.
ArrayList al = new ArrayList();
al.Add("a");
al.Add("b");
al.Add("c");
al.Add("f");
al.Add("a");
al.Add("f");
int count = al.ToArray().GroupBy(q => q).Count(q=>q.Count()>1);
count will be 2, because a and f are duplicated.

You could sort it, then it becomes very easy.
Edit: sorting becomes a moot point when done this way.
ArrayList myList = new ArrayList();
myList = someStuff;
Dictionary<object, int> counts = new Dictionary<object,int>();
foreach (object item in myList)
{
    if (!counts.ContainsKey(item))
    {
        counts.Add(item,1);
    }
    else
    {
        counts[item]++;
    }
}
Edit:
Some minor things might vary (I'm not certain about some of my square brackets; I'm a little rusty with C#), but the concept should withstand scrutiny.

Related

VBA/Excel RANK in C#

I am working on creating calculations from a spreadsheet into C#, and I was wondering if C# has a similar method to Rank in Excel?
Rank in Excel
Returns the rank of a number in a list of numbers. The rank of a
number is its size relative to other values in a list. (If you were to
sort the list, the rank of the number would be its position.)
Syntax
RANK(number,ref,order)
Number is the number whose rank you want to find.
Ref is an array of, or a reference to, a list of numbers.
Nonnumeric values in ref are ignored.
Order is a number specifying how to rank number.
If order is 0 (zero) or omitted, Microsoft Excel ranks number as if
ref were a list sorted in descending order. If order is any nonzero
value, Microsoft Excel ranks number as if ref were a list sorted in
ascending order.
The same can be achieved through code, but I just wanted to check if there was anything I was missing first.

You can, sort of.
SortedList<int, object> list = new SortedList<int, object>();
// fill with unique ints, and then look for one
int rank = list.Keys.IndexOf(i);
Rank will be an ascending, zero-based position.
You could pretty it up by writing an extension method:
public static class Extensions
{
    public static int Rank(this int[] array, int find)
    {
        SortedList<int, object> list = new SortedList<int, object>();
        for (int i = 0; i < array.Length; i++)
        {
            // SortedList.Add throws on a duplicate key, so skip repeated values
            if (!list.ContainsKey(array[i]))
            {
                list.Add(array[i], null);
            }
        }
        if (list.ContainsKey(find))
        {
            return list.Keys.IndexOf(find);
        }
        else
        {
            return -1;
        }
    }
}
And use it like:
int[] ints = new int[] { 2, 7, 6, 3, 9, 12 };
int rank = ints.Rank(2);
...but I'm not convinced it's the most sensible thing to do.

To get the equivalent of RANK you'll need to get the minimum index of each item when you group:
var ranks = list.OrderBy(x => x)
    .Select((x, i) => new {x, i = i+1}) // get 1-based index of each item
    .GroupBy(xi => xi.x) // group by the item
    .Select(g => new {rank = g.Min(xi => xi.i), items = g}) // rank = min index of group
    .SelectMany(g => g.items, (g, gg) => new {g.rank, gg.x}); // select rank and item
Or, if you're grouping by a property of a class:
var ranks = list.OrderBy(x => x.{some property})
    .Select((x, i) => new {x, i = i+1}) // get 1-based index of each item
    .GroupBy(xi => xi.x.{some property}) // group by the item's property
    .Select(g => new {rank = g.Min(xi => xi.i), items = g}) // rank = min index of group
    .SelectMany(g => g.items, (g, gg) => new {g.rank, gg.x}); // select rank and item

This works for me so far (and it is simpler)
public static int Rank<T>(T value, IEnumerable<T> data)
{
return data.OrderByDescending(x => x).ToList().IndexOf(value) + 1;
}
I used T so it can take all numeric types (int/double/decimal).
The usage is similar to Excel
int[] data = new[] { 3, 2, 2, 3, 4 };
int rank = Rank(3, data); // returns 2
I hope I didn't miss anything
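The Excel behaviour quoted in the question, including the order argument and equal ranks for tied values, can also be sketched by counting how many values sort ahead of the number. The class name ExcelRank, the double element type and the -1 sentinel for a missing number are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExcelRank
{
    // Rank = 1 + the number of values that sort ahead of 'number'.
    // order == 0 ranks as if the data were sorted descending;
    // any other value ranks as if it were sorted ascending.
    // Returns -1 (a sentinel chosen for this sketch) when 'number' is not in the data.
    public static int Rank(double number, IEnumerable<double> data, int order)
    {
        List<double> values = data.ToList();
        if (!values.Contains(number))
        {
            return -1;
        }
        int ahead = order == 0
            ? values.Count(v => v > number)
            : values.Count(v => v < number);
        return ahead + 1;
    }

    public static void Main()
    {
        double[] data = { 3, 2, 2, 3, 4 };
        Console.WriteLine(Rank(3, data, 0)); // 2 (only 4 is larger)
        Console.WriteLine(Rank(2, data, 0)); // 4 (3, 3 and 4 are larger)
        Console.WriteLine(Rank(2, data, 1)); // 1 (nothing is smaller)
        Console.WriteLine(Rank(3, data, 1)); // 3 (2 and 2 are smaller)
    }
}
```

Counting avoids sorting altogether, so a single lookup is one pass over the data; if many ranks are needed from the same data, sorting once may be the better trade-off.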

Find Generic List record based on index

I am in a situation where I need to fetch records from a generic list by position: the 1st, 5th and 9th records, then the 2nd, 6th and 10th records, and so on.
The situation is this:
A list of projects is assigned to a list of teams.
So if we have 20 projects and 4 teams,
then the 1st project goes to the 1st team, the 2nd to the 2nd team, the 3rd to the 3rd team and the 4th to the 4th team;
then the 5th project goes to the 1st team again.
So it looks like this:
Projects Team
1 1
2 2
3 3
4 4
5 1
6 2
7 3
8 4
9 1
.
.
So now I want to run a query on the generic list to get the records for each team; for the first team, records 1, 5, 9, ... need to be fetched.
Something like:
List<Project> lst = list (from Database)
//For 1st team
lst = lst.Index(1,5,9...);
//For 2nd team
lst = lst.Index(2,6,10...);
I hope I have made my point clear.

You could do something like this with LINQ Select and GroupBy:
List<int> list = new List<int>{1,2,3,4,5,6,7,8,9,10};
int numberOfTeams = 4;
var projectsByTeam = list
    .Select((number, index) => new {Value = number, Index = index})
    .GroupBy(item => item.Index % numberOfTeams)
    .Select(item => new {TeamNumber = item.Key+1, ProjectIDs = item.Select(x => x.Value).ToList()})
    .ToList();
This splits the original list into:
{
{TeamNumber = 1, ProjectIDs = {1,5,9}},
{TeamNumber = 2, ProjectIDs = {2,6,10}},
{TeamNumber = 3, ProjectIDs = {3,7}},
{TeamNumber = 4, ProjectIDs = {4,8}},
}

First, this is not specific to generic lists.
You have to create a new list, and then, one by one, add the items from the original list that you want in the new list. You can access single items at a given position via the indexer (square brackets).
List<Project> lst = // list (from Database)
List<Project> firstTeam = new List<Project>();
firstTeam.Add(lst[0]);
firstTeam.Add(lst[4]);
firstTeam.Add(lst[8]);
List<Project> secondTeam = new List<Project>();
secondTeam.Add(lst[1]);
secondTeam.Add(lst[5]);
secondTeam.Add(lst[9]);
Of course, if the items are distributed that regularly throughout the original lst, you can automatically determine the items:
List<Project> firstTeam = new List<Project>();
for (int i = 0; i < lst.Count; i += 4) {
    firstTeam.Add(lst[i]);
}
i.e. you loop over the original list, taking every 4th item, starting at index 0 for the first team (index 1 for the second team, and so on).
If the items to add to one of the teams are not distributed regularly throughout lst, you will have to add them one by one, but you might be able to make use of the shorter list initializer syntax:
List<Project> firstTeam = new List<Project>() { lst[0], lst[4], lst[8] };
Lastly, note that List<T> starts counting indices at zero, so the very first item is lst[0], not lst[1]. That is why the 1st, 5th and 9th projects are lst[0], lst[4] and lst[8] above.
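The loop above generalises to any team once the starting index and the step are parameters. The following is a small sketch under that assumption; the helper name EveryNth and the string project names are hypothetical, and zero-based indexing is used throughout.

```csharp
using System;
using System.Collections.Generic;

public static class RoundRobin
{
    // Returns the items at indexes start, start + step, start + 2 * step, ...
    public static List<T> EveryNth<T>(IList<T> source, int start, int step)
    {
        if (start < 0) throw new ArgumentOutOfRangeException("start");
        if (step <= 0) throw new ArgumentOutOfRangeException("step");

        var result = new List<T>();
        for (int i = start; i < source.Count; i += step)
        {
            result.Add(source[i]);
        }
        return result;
    }

    public static void Main()
    {
        var projects = new List<string>
        {
            "P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8", "P9", "P10"
        };
        int teamCount = 4;

        // Team k (1-based) starts at zero-based index k - 1.
        for (int team = 0; team < teamCount; team++)
        {
            List<string> assigned = EveryNth(projects, team, teamCount);
            Console.WriteLine("Team {0}: {1}", team + 1, string.Join(", ", assigned));
        }
        // Team 1: P1, P5, P9
        // Team 2: P2, P6, P10
        // Team 3: P3, P7
        // Team 4: P4, P8
    }
}
```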

You are looking for the params keyword. It will allow you to pass in to Index an array of arguments, which are the indexes in your case.
In your case an extension method can do the trick:
public static List<Project> Index(this List<Project> list, params int[] indexes)
{
var newList = new List<Project>();
foreach(var index in indexes)
{
newList.Add(list[index]);
}
return newList;
}

// Define other methods and classes here
static IEnumerable<IEnumerable<T>> CustomSplit<T>(this IEnumerable<T> source, int max)
{
var results = new List<List<T>>();
for (int i = 0; i < max; i++)
{
results.Add(new List<T>());
}
int index = 0;
using (var enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
int circularIndex = index % max;
results[circularIndex].Add(enumerator.Current);
index++;
}
}
return results;
}
And here is how to use it:
void Main()
{
var elements = Enumerable.Range(0, 100).CustomSplit(4);
}

You can use:
List<Project> projects; // = something from db
var neededIndexes = new[] { 0, 4, 8 };
var result = projects.Where((project, index) => neededIndexes.Contains(index)).ToList();
Or if the indexes are evenly distributed:
List<Project> projects; // = something from db
var result = projects.Where((project, index) => index % 4 == 0).ToList();

This solves your problem.
Create a list for each team:
List<List<Project>> projectsPerTeam = new List<List<Project>>();
for (int i = 0; i < teamsList.Count(); i++)
{
    projectsPerTeam.Add(new List<Project>());
}
Now your actual issue (adding each project to the correct team):
for (int i = 0; i < projectsList.Count(); i++)
{
    projectsPerTeam[i % teamsList.Count()].Add(projectsList[i]);
}

IEnumerable<Object> Data Specific Ordering

I have objects with an ID property whose values range from 101 to 199. How can I order them as 199, 101, 102, ..., 198?
In the result, I want the last item to come first.

The desired ordering makes no sense (some reasoning would be helpful), but this should do the trick:
int maxID = items.Max(x => x.ID); // If you want the Last item instead of the one
// with the greatest ID, you can use
// items.Last().ID instead.
var strangelyOrderedItems = items
.OrderBy(x => x.ID == maxID ? 0 : 1)
.ThenBy(x => x.ID);

It depends on whether you are interested in the largest item in the list or the last item in the list:
internal sealed class Object : IComparable<Object>
{
private readonly int mID;
public int ID { get { return mID; } }
public Object(int pID) { mID = pID; }
public static implicit operator int(Object pObject) { return pObject.mID; }
public static implicit operator Object(int pInt) { return new Object(pInt); }
public int CompareTo(Object pOther) { return mID - pOther.mID; }
public override string ToString() { return string.Format("{0}", mID); }
}
List<Object> myList = new List<Object> { 1, 2, 6, 5, 4, 3 };
// the last item first
List<Object> last = new List<Object> { myList.Last() };
List<Object> lastFirst =
last.Concat(myList.Except(last).OrderBy(x => x)).ToList();
lastFirst.ForEach(Console.Write);
Console.WriteLine();
// outputs: 312456
// or
// the largest item first
List<Object> max = new List<Object> { myList.Max() };
List<Object> maxFirst =
max.Concat(myList.Except(max).OrderBy(x => x)).ToList();
maxFirst.ForEach(Console.Write);
Console.WriteLine();
// outputs: 612345

Edit: I missed the part about you wanting the last item first. You could do it like this:
var objectList = new List<DataObject>();
var lastob = objectList.Last();
objectList.Remove(lastob);
var newList = new List<DataObject>();
newList.Add(lastob);
newList.AddRange(objectList.OrderBy(o => o.ID).ToList());
If you are talking about normal sorting, you could use LINQ's OrderBy method like this:
objectList = objectList.OrderBy(ob => ob.ID).ToList();

To put the last item first:
First, sort the list:
List<int> values = new List<int>{100, 56, 89..};
var result = values.OrderBy(x => x).ToList(); // ToList() so that the List<T> extension below applies
Add an extension method for swapping two elements in a List<T>:
static void Swap<T>(this List<T> list, int index1, int index2)
{
    T temp = list[index1];
    list[index1] = list[index2];
    list[index2] = temp;
}
Then use it:
result.Swap(0, result.Count - 1);
Note that this exchanges the first and last elements, so the smallest value ends up at the end of the list rather than in second place.

You can achieve this using a single LINQ statement.
var ordering = testData
.OrderByDescending(t => t.Id)
.Take(1)
.Union(testData.OrderBy(t => t.Id).Take(testData.Count() - 1));
Order it in reverse direction and take the top 1, then order it the "right way round" and take all but the last and union these together. There are quite a few variants of this approach, but the above should work.
This approach should work for arbitrary lists too, without the need to know the max number.

How about
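var orderedItems = items.OrderBy(x => x.Id);
var orderedItemsLastFirst =
    // largest item first, then everything except the largest in ascending order
    orderedItems.Reverse().Take(1).Concat(orderedItems.Take(orderedItems.Count() - 1));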
This will iterate the list several times so perhaps could be more efficient but doesn't use much code.
If more speed is important you could write a specialised IEnumerable extension that would allow you to sort and return without converting to an intermediate IEnumerable.
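One possible shape for such an extension is sketched below. It sorts once into a buffer and then yields the last element followed by the rest, so the source is enumerated a single time. The name OrderByLastFirst is hypothetical, and the sketch still uses an intermediate list internally, so it is only a simplified illustration of the idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingExtensions
{
    // Sorts once by the given key, then yields the item with the largest key first,
    // followed by the remaining items in ascending order.
    public static IEnumerable<T> OrderByLastFirst<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        List<T> sorted = source.OrderBy(keySelector).ToList();
        if (sorted.Count == 0)
        {
            yield break;
        }

        yield return sorted[sorted.Count - 1];
        for (int i = 0; i < sorted.Count - 1; i++)
        {
            yield return sorted[i];
        }
    }
}

public static class OrderingDemo
{
    public static void Main()
    {
        int[] ids = { 103, 199, 101, 102 };
        Console.WriteLine(string.Join(",", ids.OrderByLastFirst(id => id)));
        // 199,101,102,103
    }
}
```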

var myList = new List<MyObject>();
//initialize the list
var ordered = myList.OrderBy(c => c.Id); //or use OrderByDescending if you want reverse order

Query a list for only duplicates

I have a List of type string in a .NET 3.5 project. The list has thousands of strings in it, but for the sake of brevity we're going to say that it just has 5 strings in it.
List<string> lstStr = new List<string>() {
"Apple", "Banana", "Coconut", "Coconut", "Orange"};
Assume that the list is sorted (as you can tell above). What I need is a LINQ query that will remove all strings that are not duplicates. So the result would leave me with a list that only contains the two "Coconut" strings.
Is this possible to do with a LINQ query? If it is not then I'll have to resort to some complex for loops, which I can do, but I didn't want to unless I had to.

Here is code for finding the duplicates in an array (shown with ints here; it works the same way for strings):
int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
var duplicates = listOfItems
.GroupBy(i => i)
.Where(g => g.Count() > 1)
.Select(g => g.Key);
foreach (var d in duplicates)
Console.WriteLine(d);

var dupes = lstStr.Where(x => lstStr.Sum(y => y==x ? 1 : 0) > 1);
OR
var dupes = lstStr.Where((x,i) => ( (i > 0 && x==lstStr[i-1])
|| (i < lstStr.Count-1 && x==lstStr[i+1]));
Note that the first one enumerates the list for every element which takes O(n²) time (but doesn't assume a sorted list). The second one is O(n) (and assumes a sorted list).
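A third option, sketched here as an assumption rather than taken from the answer, is to count occurrences in a Dictionary first and then filter. That runs in O(n) on average and does not require the list to be sorted. The names DuplicateFilter and OnlyDuplicates are hypothetical, and the sketch assumes the list contains no null items, since a Dictionary cannot hold a null key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFilter
{
    // Keeps every occurrence of each value that appears more than once,
    // preserving the original order.
    public static List<T> OnlyDuplicates<T>(IEnumerable<T> source)
    {
        List<T> items = source.ToList();

        // First pass: count how often each value occurs.
        var counts = new Dictionary<T, int>();
        foreach (T item in items)
        {
            int seen;
            counts.TryGetValue(item, out seen); // seen stays 0 for a new value
            counts[item] = seen + 1;
        }

        // Second pass: keep only the values seen more than once.
        return items.Where(item => counts[item] > 1).ToList();
    }

    public static void Main()
    {
        var lstStr = new List<string> { "Apple", "Banana", "Coconut", "Coconut", "Orange" };
        Console.WriteLine(string.Join(", ", OnlyDuplicates(lstStr)));
        // Coconut, Coconut
    }
}
```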

This should work, and is O(N) rather than the O(N^2) of the other answers. (Note that this relies on the list being sorted, so that really is a requirement.)
// Extension methods must be declared inside a static class
static IEnumerable<T> OnlyDups<T>(this IEnumerable<T> coll)
    where T : IComparable<T>
{
    IEnumerator<T> iter = coll.GetEnumerator();
    if (iter.MoveNext())
    {
        T last = iter.Current;
        while (iter.MoveNext())
        {
            if (iter.Current.CompareTo(last) == 0)
            {
                // Start of a run of equal items: emit the first one, then the rest
                yield return last;
                do
                {
                    yield return iter.Current;
                }
                while (iter.MoveNext() && iter.Current.CompareTo(last) == 0);
            }
            last = iter.Current;
        }
    }
}
Use it like this:
IEnumerable<string> onlyDups = lstStr.OnlyDups();
or
List<string> onlyDups = lstStr.OnlyDups().ToList();

var temp = new List<string>();
foreach (var item in list)
{
    var stuff = (from m in list
                 where m == item
                 select m);
    if (stuff.Count() > 1)
    {
        // item occurs more than once, so keep this occurrence
        temp.Add(item);
    }
}

Better way of comparing two lists with LINQ?

I have the 2 collections:
IEnumerable<Element> allElements
List<ElementId> someElements,
What is a concise way of doing the following together:
[1] Verifying that all elements in someElements exist in allElements, return quickly when condition fails.
and
[2] Obtain a list of Element objects that List<ElementId> someElements maps to.
Every Element object has an ElementId
Thank you.

I would do this:
var map = allElements.ToDictionary(x => x.Id);
if (!someElements.All(id => map.ContainsKey(id)))
{
    // Return early
}
var list = someElements.Select(x => map[x])
    .ToList();
Note that the first line will throw an exception if there are any duplicates in allElements.
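If duplicates in allElements are possible, one way around that exception is to build the map by hand and keep the first element for each id. The sketch below does that and combines the existence check and the lookup in a single pass with TryGetValue. The Element class with an int Id, the Resolve method and the choice to return null on a missing id are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's Element type.
public class Element
{
    public int Id { get; set; }
}

public static class ElementLookup
{
    // Returns the elements the ids map to, or null as soon as one id is missing.
    public static List<Element> Resolve(IEnumerable<Element> allElements, IEnumerable<int> ids)
    {
        // Build the map manually so that a repeated Id keeps the first element
        // instead of throwing, as ToDictionary would.
        var map = new Dictionary<int, Element>();
        foreach (Element e in allElements)
        {
            if (!map.ContainsKey(e.Id))
            {
                map.Add(e.Id, e);
            }
        }

        var result = new List<Element>();
        foreach (int id in ids)
        {
            Element found;
            if (!map.TryGetValue(id, out found))
            {
                return null; // early exit: this id does not exist in allElements
            }
            result.Add(found);
        }
        return result;
    }

    public static void Main()
    {
        var all = new List<Element>
        {
            new Element { Id = 1 }, new Element { Id = 2 }, new Element { Id = 2 }
        };

        List<Element> hit = Resolve(all, new[] { 2, 1 });
        Console.WriteLine(hit.Count);                          // 2
        Console.WriteLine(Resolve(all, new[] { 1, 3 }) == null); // True
    }
}
```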

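// [1] true only if every id in someElements belongs to some element in allElements
someElements.All(id => allElements.Any(e => e.ElementId.Equals(id)));
// [2] the Element objects that someElements maps to
allElements.Where(e => someElements.Contains(e.ElementId));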

Not as efficient as the Skeet answer, but good enough for reasonable-sized collections:
IEnumerable<Element> allElements = new List<Element>
{ new Element { Id = 1 }, new Element { Id = 2 } };
List<int> someElements = new List<int> { 1, 2 };
var query =
(from element in allElements
join id in someElements on element.Id equals id
select element)
.ToList();
if (query.Count != someElements.Count)
{
Console.WriteLine("Not all items found.");
}
foreach (var element in query)
{
Console.WriteLine ("Found: " + element.Id);
}
