Is there a "DataTable" with "named row" in C#? - c#

I need a data structure with both named columns and named rows. For example:
magic_data_table:
          col_foo   col_bar
row_foo      1         3
row_bar      2         4
I need to be able to access elements like magic_data_table["row_foo", "col_bar"] (which will give me 3)
I also need to be able to add new columns like:
magic_data_table.Columns.Add("col_new");
magic_data_table["row_foo", "col_new"] = 5;
AFAIK, DataTable only has named columns...
EDIT:
I don't need to change the name of a column or a row. However, I may need to insert new rows into the middle of the table.

While you could use a Dictionary<string, Dictionary<string, T>> to do what you want, that wouldn't be particularly efficient in terms of memory, and would have the potential for the inner dictionaries to get out of sync. If you create your own data structure though that is a facade for lists, using dictionaries to map column names to indexes, then it's simple enough:
public class MyDataStructure<T>//TODO come up with better name
{
private Dictionary<string, int> columns;
private Dictionary<string, int> rows;
private List<List<T>> data;
public MyDataStructure(
IEnumerable<string> rows,
IEnumerable<string> columns)
{
this.columns = columns.Select((name, index) => new { name, index })
.ToDictionary(x => x.name, x => x.index);
this.rows = rows.Select((name, index) => new { name, index })
.ToDictionary(x => x.name, x => x.index);
initData();
}
private void initData()
{
data = new List<List<T>>(rows.Count);
for (int i = 0; i < rows.Count; i++)
{
data.Add(new List<T>(columns.Count));
for (int j = 0; j < columns.Count; j++)
{
data[i].Add(default(T));
}
}
}
public T this[string row, string column]
{
//TODO error checking for invalid row/column values
get
{
return data[rows[row]][columns[column]];
}
set
{
data[rows[row]][columns[column]] = value;
}
}
public void AddColumn(string column)
{
columns.Add(column, columns.Count);
for (int i = 0; i < data.Count; i++)
{
data[i].Add(default(T));
}
}
public void AddRow(string row)
{
rows.Add(row, rows.Count);
var list = new List<T>(columns.Count);
data.Add(list);
for (int i = 0; i < columns.Count; i++)
{
list.Add(default(T));
}
}
public bool RenameRow(string oldRow, string newRow)
{
if (rows.ContainsKey(oldRow) && !rows.ContainsKey(newRow))
{
rows.Add(newRow, rows[oldRow]);
rows.Remove(oldRow);
return true;
}
return false;
}
}
Note that if you were willing to fix the rows/columns upon construction then you'd be able to use a T[,] as the backing for the data, which would both make the class dramatically simpler to implement, and further reduce the memory overhead, although that doesn't appear to work for your use cases.
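For illustration, a minimal sketch of that fixed-size variant might look like this (FixedDataStructure is a made-up name, AddRow/AddColumn are deliberately dropped, and the usual System.Collections.Generic and System.Linq usings are assumed):
public class FixedDataStructure<T>
{
    private readonly Dictionary<string, int> columns;
    private readonly Dictionary<string, int> rows;
    private readonly T[,] data;

    public FixedDataStructure(IEnumerable<string> rows, IEnumerable<string> columns)
    {
        this.rows = rows.Select((name, index) => new { name, index })
                        .ToDictionary(x => x.name, x => x.index);
        this.columns = columns.Select((name, index) => new { name, index })
                              .ToDictionary(x => x.name, x => x.index);
        // A rectangular array is enough once the dimensions are fixed.
        data = new T[this.rows.Count, this.columns.Count];
    }

    public T this[string row, string column]
    {
        get { return data[rows[row], columns[column]]; }
        set { data[rows[row], columns[column]] = value; }
    }
}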

Add a column for the name - "name" in the following:
DataTable table = ...
DataColumn nameCol = table.Columns["name"];
var index = table.Rows.Cast<DataRow>()
.ToDictionary(row => (string)row[nameCol]);
... // then when you need the values:
string rowName = ..., colName = ...
var val = index[rowName][colName];
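A fuller sketch of the same idea, filled in with the sample data from the question (assuming using System.Data and System.Linq; "name" is the extra key column):
var table = new DataTable();
table.Columns.Add("name");
table.Columns.Add("col_foo", typeof(int));
table.Columns.Add("col_bar", typeof(int));
table.Rows.Add("row_foo", 1, 3);
table.Rows.Add("row_bar", 2, 4);

// Index the rows by the value of the "name" column.
var index = table.Rows.Cast<DataRow>()
                 .ToDictionary(row => (string)row["name"]);

var val = (int)index["row_foo"]["col_bar"]; // 3

// Columns added later are visible through the same DataRow objects.
table.Columns.Add("col_new", typeof(int));
index["row_foo"]["col_new"] = 5;
Note that the index dictionary would need to be rebuilt if rows are added or removed afterwards.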

You may find that the Tuple (.net 4.0 and above) class suits your needs. It won't work strictly like a table but will give you a lot of flexibility.
You can use the List<> generic to store it and LINQ to query your data.
List<Tuple<string, string, int>> magicTable = new List<Tuple<string, string, int>>();
magicTable.AddRange(new Tuple<string, string, int>[] {
Tuple.Create("row_foo", "col_foo", 1),
Tuple.Create("row_foo", "col_bar", 2),
Tuple.Create("row_bar", "col_foo", 3),
Tuple.Create("row_bar", "col_bar", 4)});
magicTable.Add(Tuple.Create("row_foo", "col_new", 5));
int value = magicTable.Single(tuple => (tuple.Item1 == "row_foo" && tuple.Item2 == "col_new")).Item3;
It is going to be quite resource intensive due to the duplication of row/column names but you do get a lot of flexibility for small datasets.
Microsoft's Tuple documentation (3-tuple): http://msdn.microsoft.com/en-us/library/dd387150.aspx

Related

C# Compare 2 lists of ints of different size

I currently have 2 lists that I want to compare, which will never have equal length.
List<int> data which is of length n
List<int> numbersToSeekFor which contains the set of all distinct values in data
List<Color> colorsToAssign whose length is the same as numbersToSeekFor
What I want to achieve, and have not been very successful at, is to compare all the items in data to each index of numbersToSeekFor. If data[i] matches the first entry, the first index of colorsToAssign will be added to a list, if it matches the second entry then the second index, and so forth...
A very dumb example of this would be the following method, where I am assuming that there are 3 elements in numbersToSeekFor. The output list of this method should also be of equal size as data.
public List<Color> Foo(List<int> data, List<int>numbersToSeekFor, List<Color> colorsToAssign)
{
List<Color> colors = new List<Color>();
for (int i = 0; i < data.Count; i++)
{
if(data[i] == numbersToSeekFor[0])
{
colors.Add(colorsToAssign[0]);
}
if(data[i] == numbersToSeekFor[1] )
{
colors.Add(colorsToAssign[1]);
}
if(data[i] == numbersToSeekFor[2])
{
colors.Add(colorsToAssign[2]);
}
}
return colors;
}
What would be the cleanest way to achieve this?
Thank you for your help
Well, you could use a combination of LINQ .Where and .Select methods:
public static List<Color> Foo3(List<int> data, List<int> numbersToSeekFor, List<Color> colorsToAssign)
{
if (data?.Any() != true || numbersToSeekFor?.Any() != true || colorsToAssign?.Count != numbersToSeekFor.Count)
{
return new List<Color>();
}
List<Color> colors = data
.Select(d => numbersToSeekFor.IndexOf(d))
.Where(i => i > -1 && i < colorsToAssign.Count)
.Select(i => colorsToAssign[i])
.ToList();
return colors;
}
If I understand it correctly, numbersToSeekFor is meant for mapping a number in data to an index in colorsToAssign. So maybe it would be a good idea to convert it to a Dictionary first:
var mapNumberToIndex = new Dictionary<int, int>();
for (var i = 0; i < numbersToSeekFor.Count; i++)
mapNumberToIndex.Add(numbersToSeekFor[i], i);
Then you could simply use
colors.Add(colorsToAssign[mapNumberToIndex[data[i]]]);
in your loop.
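Putting those pieces together, a sketch of the complete method might look like this (Foo2 is just an illustrative name, and it assumes every value in data appears in numbersToSeekFor):
public static List<Color> Foo2(List<int> data, List<int> numbersToSeekFor, List<Color> colorsToAssign)
{
    // Map each number to the index of its colour.
    var mapNumberToIndex = new Dictionary<int, int>();
    for (var i = 0; i < numbersToSeekFor.Count; i++)
        mapNumberToIndex.Add(numbersToSeekFor[i], i);

    var colors = new List<Color>();
    foreach (var d in data)
        colors.Add(colorsToAssign[mapNumberToIndex[d]]);
    return colors;
}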
You could use a dictionary to map values onto colours, and do away with the need to pass in a separate list of all the distinct values:
public static List<Color> AssignColours(List<int> data, List<Color> coloursToAssign)
{
var result = new List<Color>();
var map = new Dictionary<int, Color>();
int n = 0;
foreach (var datum in data)
{
if (!map.TryGetValue(datum, out var colour))
{
// Next line will throw exception if number of distinct numbers
// is greater than length of coloursToAssign.
colour = coloursToAssign[n++];
map[datum] = colour;
}
result.Add(colour);
}
return result;
}
This code assumes that the count of distinct values is <= the length of coloursToAssign, otherwise there will be an exception at the commented line.
Alternatively (and more efficiently if you're calling this several times) you can precalculate the map like so:
public static Dictionary<int, Color> MapColours(List<int> numbersToSeekFor, List<Color> coloursToAssign)
{
var map = new Dictionary<int, Color>();
for (int i = 0; i < numbersToSeekFor.Count; ++i)
map[numbersToSeekFor[i]] = coloursToAssign[i];
return map;
}
and then use it like so:
var map = MapColours(numbersToSeekFor, coloursToAssign);
...
var colourToUse = map[data[someIndex]];

How to add values to var myDictionary = new Dictionary<int, Values>()

I created a public struct Values that has two public string fields, header and type.
public struct Values
{
public string header;
public string type;
}
My dictionary:
var myDictionary = new Dictionary<int, Values>();
Question: How do I add two values for each key?
while (true)
{
for (int i = 0; i < end; i++)
{
myDictionary.Add(i, value1 , value2);
}
}
If you want to generate dictionary, you can try using Linq:
var myDictionary = Enumerable
.Range(0, end)
.Select(i => new {
key = i,
value = new Values() {
header = HeaderFromIndex(i), //TODO: implement this
type = TypeFromIndex(i) //TODO: implement this
}})
.ToDictionary(item => item.key, item => item.value);
In case you want to add items into existing dictionary:
for (int i = 0; i < end; ++i)
myDictionary.Add(i, new Values() {
header = HeaderFromIndex(i), //TODO: implement this
type = TypeFromIndex(i) //TODO: implement this
});
Please notice that in any case a dictionary holds pairs: {key, value}; so if you want to have two items as the value for the corresponding key, you have to organize the values into a single type, new Values() {header = ..., type = ...} in your case.
If I get the question correctly, you have to initialize a Values object and then add it to your dictionary, like this:
while (true) {
for (int i = 0; i < end; i++) {
Values tmp_values;
tmp_values.header = "blabla";
tmp_values.type = "blabla type";
myDictionary.Add(i, tmp_values);
}
}

C# Generate An IEnumerable from Nested Sparse Dictionaries

So I have a sparse matrix of elements that is represented as
Dictionary<int, Dictionary<int, StructuredCell>> CellValues = new Dictionary<int, Dictionary<int, StructuredCell>>();
inside a class StructuredTable. I would like to be able to write a loop as
StructuredTable table = new StructuredTable();
// Fill the table with values
foreach(StructuredCell cell in table.Cells()) {
// Fill an alternate structure
}
where any x,y combination inside the bounds (up to the maximum number of columns and rows) that has no cell is returned as null. I can't seem to locate an example that uses yield this way.
Something like
public IEnumerable<StructuredCell> Cells(){
for (int i = 0; i < maxColumn; i++)
{
Dictionary<int, StructuredCell> row = null;
CellValues.TryGetValue(i, out row);
for (int j = 0; j < maxRow; j++)
{
if (row == null) { yield return null; continue; }
StructuredCell cell = null;
row.TryGetValue(j, out cell);
yield return cell;
}
}
}
Based on the fact that the keys are reasonably small, you can do a number of optimizations here.
public class DataStructure {
private const int MAX_VALUE = 100000;
private readonly Dictionary<long, StructuredCell> CellValues = new Dictionary<long, StructuredCell>();
private void Add(int keyOne, int keyTwo, StructuredCell cell) {
long hashKey = keyOne*MAX_VALUE + keyTwo;
CellValues[hashKey] = cell;
}
private void Remove(int keyOne, int keyTwo)
{
long hashKey = keyOne * MAX_VALUE + keyTwo;
CellValues.Remove(hashKey);
}
private IEnumerable<StructuredCell> GetCells() {
return CellValues.Values;
}
}
You can keep a simple Key->Value dictionary, where the
key = hash(keyOne, keyTwo)
You don't need any fancy lazy constructs (yield) since you already have the values available.
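If the "null for missing cells" enumeration from the question is still wanted on top of this flat dictionary, a TryGetValue per coordinate is enough. A sketch of an extra method on the same class (maxColumn and maxRow are assumed to be tracked elsewhere):
private IEnumerable<StructuredCell> GetGrid(int maxColumn, int maxRow)
{
    for (int i = 0; i < maxColumn; i++)
    {
        for (int j = 0; j < maxRow; j++)
        {
            // TryGetValue leaves cell as null when the composite key is absent.
            CellValues.TryGetValue((long)i * MAX_VALUE + j, out StructuredCell cell);
            yield return cell;
        }
    }
}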

List sorting by multiple parameters

I have a .csv with the following headers and an example line from the file.
AgentID,Profile,Avatar,In_Time,Out_Time,In_Location,Out_Location,Target_Speed(m/s),Distance_Traveled(m),Congested_Duration(s),Total_Duration(s),LOS_A_Duration(s),LOS_B_Duration(s),LOS_C_Duration(s),LOS_D_Duration(s),LOS_E_Duration(s),LOS_F_Duration(s)
2177,DefaultProfile,DarkGreen_LowPoly,08:00:00,08:00:53,East12SubwayportalActor,EWConcourseportalActor,1.39653,60.2243,5.4,52.8,26.4,23,3.4,0,0,0
I need to sort this .csv by the 4th column (In_Time) by increasing time (08:00:00, 08:00:01, ...) and then by the 6th (In_Location) alphabetically by direction (e.g. East, North, etc.).
So far my code looks like this:
List<string> list = new List<string>();
using (StreamReader reader = new StreamReader("JourneyTimes.csv"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
line.Split(',');
list.Add(line);
}
}
I read in the .csv and split it using a comma (there are no other commas so this is not a concern). I then add each line to a list. My issue is how do I sort the list on two parameters and by the headers of the .csv.
I have been looking all day at this, I am relatively new to programming, this is my first program so I apologize for my lack of knowledge.
You can use LINQ OrderBy/ThenBy:
e.g.
listOfObjects.OrderBy (c => c.LastName).ThenBy (c => c.FirstName)
But first off, you should map your CSV line to some object.
To map CSV line to object you can predefine some type or create it dynamically
from line in File.ReadLines(fileName).Skip(1) //header
let columns = line.Split(',') //really basic CSV parsing, consider removing empty entries and supporting quotes
select new
{
AgentID = columns[0],
Profile = columns[1],
Avatar = columns[2],
InTime = TimeSpan.Parse(columns[3]) // parse typed columns as needed
//other properties
}
And be aware that like many other LINQ methods, these two use deferred execution
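For example, a sketch that ties the projection above to the ordering and forces execution with ToList() (column indexes follow the header in the question; the usual System.IO and System.Linq usings are assumed):
var query =
    from line in File.ReadLines(fileName).Skip(1)
    let columns = line.Split(',')
    select new
    {
        InTime = TimeSpan.Parse(columns[3]),
        InLocation = columns[5]
    };

// Nothing has been read or sorted yet; OrderBy/ThenBy are deferred as well.
var sorted = query.OrderBy(x => x.InTime)
                  .ThenBy(x => x.InLocation)
                  .ToList();   // execution happens here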
You are dealing with two distinct problems.
First, ordering by two columns in C# can be achieved with OrderBy/ThenBy:
public class SpreadsheetExample
{
public DateTime InTime { get; set; }
public string InLocation { get; set; }
public SpreadsheetExample(DateTime inTime, string inLocation)
{
InTime = inTime;
InLocation = inLocation;
}
public static List<SpreadsheetExample> LoadMockData()
{
int maxMock = 10;
Random random = new Random();
var result = new List<SpreadsheetExample>();
for (int mockCount = 0; mockCount < maxMock; mockCount++)
{
var genNumber = random.Next(1, maxMock);
var genDate = DateTime.Now.AddDays(genNumber);
result.Add(new SpreadsheetExample(genDate, "Location" + mockCount));
}
return result;
}
}
internal class Class1
{
private static void Main()
{
var mockData = SpreadsheetExample.LoadMockData();
var orderedResult = mockData.OrderBy(m => m.InTime).ThenBy(m => m.InLocation);//Order, ThenBy can be used to perform ordering of two columns
foreach (var item in orderedResult)
{
Console.WriteLine("{0} : {1}", item.InTime, item.InLocation);
}
}
}
Now you can tackle the second issue of moving the data from the file into a class. VSTO is what you are looking for if the source is actually an Excel workbook; there are lots of examples online. Follow the example I posted above, replacing SpreadsheetExample with your own custom class.
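For instance, a sketch that reads the rows straight from the CSV into SpreadsheetExample instead of the mock data (column indexes follow the header in the question; parsing is deliberately naive):
var records = File.ReadLines("JourneyTimes.csv")
    .Skip(1)
    .Select(line => line.Split(','))
    .Select(cols => new SpreadsheetExample(
        DateTime.Parse(cols[3]),   // In_Time
        cols[5]))                  // In_Location
    .ToList();

var orderedResult = records.OrderBy(r => r.InTime).ThenBy(r => r.InLocation).ToList();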
You may use a DataTable:
var lines = File.ReadAllLines("test.csv");
DataTable dt = new DataTable();
var columnNames = lines[0].Split(new char[] { ',' });
for (int i = 0; i < columnNames.Length; i++)
{
dt.Columns.Add(columnNames[i]);
}
for (int i = 1; i < lines.Length; i++)
{
dt.Rows.Add(lines[i].Split(new char[] { ',' }));
}
var rows = dt.Rows.Cast<DataRow>();
var result = rows.OrderBy(i => i["In_time"])
.ThenBy(i => i["In_Location"]);
// sum
var sum = rows.Sum(i => Int32.Parse(i["AgentID"].ToString()));

Remove duplicates from a List<T> in C#

Anyone have a quick method for de-duplicating a generic List in C#?
If you're using .Net 3+, you can use Linq.
List<T> withDupes = LoadSomeData();
List<T> noDupes = withDupes.Distinct().ToList();
Perhaps you should consider using a HashSet.
From the MSDN documentation for HashSet<T>:
using System;
using System.Collections.Generic;
class Program
{
static void Main()
{
HashSet<int> evenNumbers = new HashSet<int>();
HashSet<int> oddNumbers = new HashSet<int>();
for (int i = 0; i < 5; i++)
{
// Populate numbers with just even numbers.
evenNumbers.Add(i * 2);
// Populate oddNumbers with just odd numbers.
oddNumbers.Add((i * 2) + 1);
}
Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
DisplaySet(evenNumbers);
Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
DisplaySet(oddNumbers);
// Create a new HashSet populated with even numbers.
HashSet<int> numbers = new HashSet<int>(evenNumbers);
Console.WriteLine("numbers UnionWith oddNumbers...");
numbers.UnionWith(oddNumbers);
Console.Write("numbers contains {0} elements: ", numbers.Count);
DisplaySet(numbers);
}
private static void DisplaySet(HashSet<int> set)
{
Console.Write("{");
foreach (int i in set)
{
Console.Write(" {0}", i);
}
Console.WriteLine(" }");
}
}
/* This example produces output similar to the following:
* evenNumbers contains 5 elements: { 0 2 4 6 8 }
* oddNumbers contains 5 elements: { 1 3 5 7 9 }
* numbers UnionWith oddNumbers...
* numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
*/
How about:
var noDupes = list.Distinct().ToList();
In .net 3.5?
Simply initialize a HashSet with a List of the same type:
var noDupes = new HashSet<T>(withDupes);
Or, if you want a List returned:
var noDupsList = new HashSet<T>(withDupes).ToList();
Sort it, then check adjacent pairs, as the duplicates will clump together.
Something like this:
list.Sort();
Int32 index = list.Count - 1;
while (index > 0)
{
if (list[index] == list[index - 1])
{
if (index < list.Count - 1)
(list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
list.RemoveAt(list.Count - 1);
index--;
}
else
index--;
}
Notes:
Comparison is done from back to front, to avoid having to resort list after each removal
This example now uses C# Value Tuples to do the swapping, substitute with appropriate code if you can't use that
The end-result is no longer sorted
I like to use this command:
List<Store> myStoreList = Service.GetStoreListbyProvince(provinceId)
.GroupBy(s => s.City)
.Select(grp => grp.FirstOrDefault())
.OrderBy(s => s.City)
.ToList();
I have these fields in my list: Id, StoreName, City, PostalCode
I wanted to show a list of cities in a dropdown, but the source list has duplicate values.
Solution: group by city, then pick the first one for the list.
It worked for me. Simply use
liIDs = liIDs.Distinct().ToList<Type>();
Replace "Type" with your desired type e.g. int.
As kronoz said, in .Net 3.5 you can use Distinct().
In .Net 2 you could mimic it:
public IEnumerable<T> DedupCollection<T> (IEnumerable<T> input)
{
var passedValues = new HashSet<T>();
// Relatively simple dupe check alg used as example
foreach(T item in input)
if(passedValues.Add(item)) // True if item is new
yield return item;
}
This could be used to dedupe any collection and will return the values in the original order.
It's normally much quicker to filter a collection (as both Distinct() and this sample does) than it would be to remove items from it.
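Usage might look like this, called from the class the method is declared in (the result is materialized with ToList because the method is lazy):
var noDupes = DedupCollection(withDupes).ToList();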
An extension method might be a decent way to go... something like this:
public static List<T> Deduplicate<T>(this List<T> listToDeduplicate)
{
return listToDeduplicate.Distinct().ToList();
}
And then call like this, for example:
List<int> myFilteredList = unfilteredList.Deduplicate();
In Java (I assume C# is more or less identical):
list = new ArrayList<T>(new HashSet<T>(list))
If you really wanted to mutate the original list:
List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);
To preserve order, simply replace HashSet with LinkedHashSet.
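For reference, a rough C# equivalent of that Java snippet might be the following (HashSet<T> does not guarantee any particular order, so use one of the order-preserving approaches above if order matters):
List<T> noDupes = new List<T>(new HashSet<T>(list));
list.Clear();
list.AddRange(noDupes);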
This takes the distinct elements (the elements without duplicates) and converts them into a list again:
List<type> myNoneDuplicateValue = listValueWithDuplicate.Distinct().ToList();
Use Linq's Union method.
Note: This solution requires no knowledge of Linq, aside from that it exists.
Code
Begin by adding the following to the top of your class file:
using System.Linq;
Now, you can use the following to remove duplicates from an object called obj1:
obj1 = obj1.Union(obj1).ToList();
Note: Rename obj1 to the name of your object.
How it works
The Union command lists one of each entry from two source objects. Since obj1 is both source objects, this reduces obj1 to one of each entry.
The ToList() returns a new List. This is necessary because Linq commands like Union return the result as an IEnumerable instead of modifying the original List or returning a new List.
As a helper method (without Linq):
public static List<T> Distinct<T>(this List<T> list)
{
return new List<T>(new HashSet<T>(list));
}
Here's an extension method for removing adjacent duplicates in-situ. Call Sort() first and pass in the same IComparer. This should be more efficient than Lasse V. Karlsen's version which calls RemoveAt repeatedly (resulting in multiple block memory moves).
public static void RemoveAdjacentDuplicates<T>(this List<T> List, IComparer<T> Comparer)
{
int NumUnique = 0;
for (int i = 0; i < List.Count; i++)
if ((i == 0) || (Comparer.Compare(List[NumUnique - 1], List[i]) != 0))
List[NumUnique++] = List[i];
List.RemoveRange(NumUnique, List.Count - NumUnique);
}
By installing the MoreLINQ package via NuGet, you can easily get a distinct object list by a property:
IEnumerable<Catalogue> distinctCatalogues = catalogues.DistinctBy(c => c.CatalogueCode);
If you have two classes, Product and Customer, and you want to remove duplicate items from their lists:
public class Product
{
public int Id { get; set; }
public string ProductName { get; set; }
}
public class Customer
{
public int Id { get; set; }
public string CustomerName { get; set; }
}
You must define a generic class in the form below
public class ItemEqualityComparer<T> : IEqualityComparer<T> where T : class
{
private readonly PropertyInfo _propertyInfo;
public ItemEqualityComparer(string keyItem)
{
_propertyInfo = typeof(T).GetProperty(keyItem, BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
}
public bool Equals(T x, T y)
{
var xValue = _propertyInfo?.GetValue(x, null);
var yValue = _propertyInfo?.GetValue(y, null);
return xValue != null && yValue != null && xValue.Equals(yValue);
}
public int GetHashCode(T obj)
{
var propertyValue = _propertyInfo.GetValue(obj, null);
return propertyValue == null ? 0 : propertyValue.GetHashCode();
}
}
Then you can remove the duplicate items in your list:
var products = new List<Product>
{
new Product{ProductName = "product 1" ,Id = 1,},
new Product{ProductName = "product 2" ,Id = 2,},
new Product{ProductName = "product 2" ,Id = 4,},
new Product{ProductName = "product 2" ,Id = 4,},
};
var productList = products.Distinct(new ItemEqualityComparer<Product>(nameof(Product.Id))).ToList();
var customers = new List<Customer>
{
new Customer{CustomerName = "Customer 1" ,Id = 5,},
new Customer{CustomerName = "Customer 2" ,Id = 5,},
new Customer{CustomerName = "Customer 2" ,Id = 5,},
new Customer{CustomerName = "Customer 2" ,Id = 5,},
};
var customerList = customers.Distinct(new ItemEqualityComparer<Customer>(nameof(Customer.Id))).ToList();
This code removes duplicate items by Id. If you want to remove duplicate items by another property, change nameof(YourClass.DuplicateProperty), e.g. nameof(Customer.CustomerName), to remove duplicate items by the CustomerName property.
If you don't care about the order you can just shove the items into a HashSet; if you do want to maintain the order you can do something like this:
var unique = new List<T>();
var hs = new HashSet<T>();
foreach (T t in list)
if (hs.Add(t))
unique.Add(t);
Or the Linq way (using Where rather than All, since All would stop enumerating at the first duplicate):
var hs = new HashSet<T>();
var unique = list.Where(x => hs.Add(x)).ToList();
Edit: The HashSet method is O(N) time and O(N) space, while sorting and then making unique (as suggested by @lassevk and others) is O(N*logN) time and O(1) space, so it's not so clear to me (as it was at first glance) that the sorting way is inferior.
Might be easier to simply make sure that duplicates are not added to the list.
if (items.IndexOf(new_item) < 0)
items.Add(new_item);
You can use Union
obj2 = obj1.Union(obj1).ToList();
Another way in .Net 2.0
static void Main(string[] args)
{
List<string> alpha = new List<string>();
for(char a = 'a'; a <= 'd'; a++)
{
alpha.Add(a.ToString());
alpha.Add(a.ToString());
}
Console.WriteLine("Data :");
alpha.ForEach(delegate(string t) { Console.WriteLine(t); });
alpha.ForEach(delegate (string v)
{
if (alpha.FindAll(delegate(string t) { return t == v; }).Count > 1)
alpha.Remove(v);
});
Console.WriteLine("Unique Result :");
alpha.ForEach(delegate(string t) { Console.WriteLine(t);});
Console.ReadKey();
}
There are many ways to solve the duplicates issue in the List; below is one of them:
List<Container> containerList = LoadContainer();//Assume it has duplicates
List<Container> filteredList = new List<Container>();
foreach (var container in containerList)
{
Container duplicateContainer = containerList.Find(delegate(Container checkContainer)
{ return (checkContainer.UniqueId == container.UniqueId); });
//Assume 'UniqueId' is the property of the Container class on which you are making a search
if (!filteredList.Contains(duplicateContainer)) //Add the object when its first occurrence is not already in the filtered list
{
filteredList.Add(container);
}
}
Cheers
Ravi Ganesan
Here's a simple solution that doesn't require any hard-to-read LINQ or any prior sorting of the list.
private static void CheckForDuplicateItems(List<string> items)
{
if (items == null ||
items.Count == 0)
return;
for (int outerIndex = 0; outerIndex < items.Count; outerIndex++)
{
for (int innerIndex = 0; innerIndex < items.Count; innerIndex++)
{
if (innerIndex == outerIndex) continue;
if (items[outerIndex].Equals(items[innerIndex]))
{
// Duplicate Found
}
}
}
}
David J.'s answer is a good method, no need for extra objects, sorting, etc. It can be improved on however:
for (int innerIndex = items.Count - 1; innerIndex > outerIndex ; innerIndex--)
So the outer loop goes top to bottom over the entire list, but the inner loop goes from the bottom up "until the outer loop position is reached".
The outer loop makes sure the entire list is processed, the inner loop finds the actual duplicates, those can only happen in the part that the outer loop hasn't processed yet.
Or if you don't want to do bottom up for the inner loop you could have the inner loop start at outerIndex + 1.
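A sketch of that improved loop, actually removing the duplicates it finds (RemoveDuplicateItems is an illustrative name):
private static void RemoveDuplicateItems(List<string> items)
{
    if (items == null || items.Count == 0)
        return;

    for (int outerIndex = 0; outerIndex < items.Count; outerIndex++)
    {
        // Walk backwards so RemoveAt does not shift the part we still have to visit.
        for (int innerIndex = items.Count - 1; innerIndex > outerIndex; innerIndex--)
        {
            if (items[outerIndex].Equals(items[innerIndex]))
                items.RemoveAt(innerIndex);
        }
    }
}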
A simple intuitive implementation:
public static List<PointF> RemoveDuplicates(List<PointF> listPoints)
{
List<PointF> result = new List<PointF>();
for (int i = 0; i < listPoints.Count; i++)
{
if (!result.Contains(listPoints[i]))
result.Add(listPoints[i]);
}
return result;
}
All answers copy lists, or create a new list, or use slow functions, or are just painfully slow.
To my understanding, this is the fastest and cheapest method I know (also backed by a very experienced programmer specializing in real-time physics optimization).
// Duplicates will be noticed after a sort O(nLogn)
list.Sort();
// Store the current and last items. Current item declaration is not really needed, and probably optimized by the compiler, but in case it's not...
int lastItem = -1;
int currItem = -1;
int size = list.Count;
// Store the index pointing to the last item we want to keep in the list
int last = size - 1;
// Travel the items from last to first O(n)
for (int i = last; i >= 0; --i)
{
currItem = list[i];
// If this item is the same as the previous one, we don't want it
// (the check is skipped on the very first iteration, where lastItem still holds the sentinel)
if (i != size - 1 && currItem == lastItem)
{
// Overwrite last in current place. It is a swap but we don't need the last
list[i] = list[last];
// Reduce the last index, we don't want that one anymore
last--;
}
// A new item, we store it and continue
else
lastItem = currItem;
}
// We now have an unsorted list with the duplicates at the end.
// Remove the last items just once
list.RemoveRange(last + 1, size - last - 1);
// Sort again O(n logn)
list.Sort();
Final cost is:
nlogn + n + nlogn = n + 2nlogn = O(nlogn) which is pretty nice.
Note about RemoveRange:
Since we cannot set the count of the list directly and so cannot avoid the Remove functions, I don't know exactly the speed of this operation, but I guess it is the fastest way.
Using HashSet this can be done easily.
List<int> listWithDuplicates = new List<int> { 1, 2, 1, 2, 3, 4, 5 };
HashSet<int> hashWithoutDuplicates = new HashSet<int> ( listWithDuplicates );
List<int> listWithoutDuplicates = hashWithoutDuplicates.ToList();
Using HashSet:
list = new HashSet<T>(list).ToList();
public static void RemoveDuplicates<T>(IList<T> list )
{
if (list == null)
{
return;
}
int i = 1;
while(i<list.Count)
{
int j = 0;
bool remove = false;
while (j < i && !remove)
{
if (list[i].Equals(list[j]))
{
remove = true;
}
j++;
}
if (remove)
{
list.RemoveAt(i);
}
else
{
i++;
}
}
}
If you need to compare complex objects, you will need to pass a Comparer object inside the Distinct() method.
private List<MyListItem> GetDistinctItemList(List<MyListItem> _listWithDuplicates)
{
//It might be a good idea to create MyListItemComparer
//elsewhere and cache it for performance.
List<MyListItem> _listWithoutDuplicates = _listWithDuplicates.Distinct(new MyListItemComparer()).ToList();
//Choose the line below instead, if you have a situation where there is a chance to change the list while Distinct() is running.
//ToArray() is used to solve "Collection was modified; enumeration operation may not execute" error.
//List<MyListItem> _listWithoutDuplicates = _listWithDuplicates.ToArray().Distinct(new MyListItemComparer()).ToList();
return _listWithoutDuplicates;
}
Assuming you have 2 other classes like:
public class MyListItemComparer : IEqualityComparer<MyListItem>
{
public bool Equals(MyListItem x, MyListItem y)
{
return x != null
&& y != null
&& x.A == y.A
&& x.B.Equals(y.B)
&& x.C.ToString().Equals(y.C.ToString());
}
public int GetHashCode(MyListItem codeh)
{
// Hash the same fields that Equals compares, so equal items get equal hash codes
return (codeh.A, codeh.B, codeh.C.ToString()).GetHashCode();
}
}
And:
public class MyListItem
{
public int A { get; }
public string B { get; }
public MyEnum C { get; }
public MyListItem(int a, string b, MyEnum c)
{
A = a;
B = b;
C = c;
}
}
I think the simplest way is:
Create a new list and add only the unique items.
Example:
class MyList
{
public int id;
public string date;
public string email;
}
List<MyList> ml = new List<MyList>();
ml.Add(new MyList()
{
id = 1,
date = "2020/09/06",
email = "zarezadeh#gmailcom"
});
ml.Add(new MyList()
{
id = 2,
date = "2020/09/01",
email = "zarezadeh#gmailcom"
});
List<MyList> New_ml = new List<MyList>();
foreach (var item in ml)
{
if (New_ml.Where(w => w.email == item.email).SingleOrDefault() == null)
{
New_ml.Add(new MyList()
{
id = item.id,
date = item.date,
email = item.email
});
}
}
