Dedupe list object using LINQ - C#

I am working on a function that should dedupe (remove duplicates from) a list of objects. Here are the requirements:
A tradeline is considered a duplicate if it has:
  - the same account number, account type and date, and it is not manual.
If duplicates are found, select only the one that has:
  - the latest reported date;
  - if the reported dates are the same, compare the (30, 60, 90) fields and select the tradeline that has the higher value in ANY of those three attributes.
I am having trouble implementing the last bullet point. Here is my code:
public IEnumerable<Tradeline> DedupeTradeline(IEnumerable<Tradeline> tradelines)
{
    //split tradeline into manual and non-manual
    var tradelineDictionary = tradelines.GroupBy(x => x.Source == "MAN").ToDictionary(x => x.Key, x => x.ToList());
    //create list of non-manual tradeline for dedupe logic
    var nonManualTradelines = tradelineDictionary.Where(x => x.Key == false).Select(x => x.Value).FirstOrDefault();
    var manualTradelines = tradelineDictionary.Where(x => x.Key).Select(x => x.Value).FirstOrDefault();
    //check if same reported date is present for dedupe tradelines
    var duplicate = nonManualTradelines?.GroupBy(x => new
    {
        x.ReportedDate,
        x.Account,
        x.AcctType,
        x.Date
    }).Any(g => g.Count() > 1);
    IEnumerable<Tradeline> dedupe;
    if (duplicate != null && (bool) !duplicate)
    {
        //logic for dedupe tradeline if no same reported date
        dedupe = nonManualTradelines.GroupBy(x => new
        {
            x.Account,
            x.AcctType,
            x.Date
        })
        //in case of duplicate tradelines select one with the latest date reported
        .Select(x => x.OrderByDescending(o => o.ReportedDate).First());
    }
    else
    {
        //logic for dedupe tradeline if same reported date
        dedupe = nonManualTradelines?.GroupBy(x => new
        {
            x.ReportedDate,
            x.Account,
            x.AcctType,
            x.Date
        })
        .Select();
        // Stuck here not sure what to do
    }
    //append manual tradeline to the output of dedupe tradelines
    var response = manualTradelines != null ? (dedupe).Union(manualTradelines) : dedupe;
    return response;
}
Tradeline class:
public class Tradeline
{
public string Account { get; set; }
public string AcctType { get; set; }
public string Late30 { get; set; }
public string Late60 { get; set; }
public string Late90 { get; set; }
public string Date { get; set; }
public string ReportedDate { get; set; }
public string Source { get; set; }
}

You can just sort descending by the maximum LateX value (the largest of Late30, Late60 and Late90) as a tie-breaker after the reported date. I also replaced the peculiar use of a Dictionary with a simpler and more efficient separation of the two categories.
public static class ObjectExt {
    public static int ToInt<T>(this T obj) => Convert.ToInt32(obj);
}
public IEnumerable<Tradeline> DedupeTradeline(IEnumerable<Tradeline> tradelines) {
    //split tradeline into manual and non-manual
    var nonManualTradelines = new List<Tradeline>();
    var manualTradelines = new List<Tradeline>();
    foreach (var t in tradelines) {
        if (t.Source == "MAN")
            manualTradelines.Add(t);
        else
            nonManualTradelines.Add(t);
    }
    IEnumerable<Tradeline> dedupe = nonManualTradelines.GroupBy(t => new {
            t.Account,
            t.AcctType,
            t.Date
        })
        //in case of duplicate tradelines select one with the latest date reported,
        //breaking ties by the highest Late30/Late60/Late90 value
        .Select(tg => tg.OrderByDescending(t => t.ReportedDate)
                        .ThenByDescending(t => Math.Max(t.Late90.ToInt(), Math.Max(t.Late60.ToInt(), t.Late30.ToInt())))
                        .First());
    //append manual tradeline to the output of dedupe tradelines
    return dedupe.Union(manualTradelines);
}
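One thing to watch with the query above: ReportedDate and the Late fields are strings, so OrderByDescending compares the dates as text, which only matches chronological order for formats such as yyyy-MM-dd. Below is a minimal sketch of the same grouping with the values parsed first. It assumes the dates are stored as MM/dd/yyyy and that a blank late count means 0; the question states neither, so adjust both. The names TradelineDedupe, ParseReported, ParseLate and MaxLate are made up for this illustration, and Concat is used in place of Union so that manual tradelines are appended as-is.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Tradeline
{
    public string Account { get; set; }
    public string AcctType { get; set; }
    public string Late30 { get; set; }
    public string Late60 { get; set; }
    public string Late90 { get; set; }
    public string Date { get; set; }
    public string ReportedDate { get; set; }
    public string Source { get; set; }
}

// Hypothetical helper class, not part of the original answer.
public static class TradelineDedupe
{
    // Assumption: ReportedDate is formatted as MM/dd/yyyy.
    private static DateTime ParseReported(string s) =>
        DateTime.ParseExact(s, "MM/dd/yyyy", CultureInfo.InvariantCulture);

    // Assumption: a blank or non-numeric late count is treated as 0.
    private static int ParseLate(string s) =>
        int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out var n) ? n : 0;

    // Highest of the three late counters for one tradeline.
    private static int MaxLate(Tradeline t) =>
        Math.Max(ParseLate(t.Late30), Math.Max(ParseLate(t.Late60), ParseLate(t.Late90)));

    public static List<Tradeline> Dedupe(IEnumerable<Tradeline> tradelines)
    {
        var all = tradelines.ToList();

        var deduped = all
            .Where(t => t.Source != "MAN")
            .GroupBy(t => new { t.Account, t.AcctType, t.Date })
            // Latest reported date first; ties broken by the highest late counter.
            .Select(g => g
                .OrderByDescending(t => ParseReported(t.ReportedDate))
                .ThenByDescending(t => MaxLate(t))
                .First());

        // Manual tradelines are never deduplicated, so append them unchanged.
        return deduped.Concat(all.Where(t => t.Source == "MAN")).ToList();
    }
}
```

Calling TradelineDedupe.Dedupe(tradelines) returns one tradeline per (Account, AcctType, Date) group, followed by every manual tradeline.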

Related

C# LINQ Filter list of complex objects by sub-list using a list of values

I want to return a list of active groups that are discounted in the requested states. Each group has a list of states, each of which includes the state abbreviation and a discount flag.
filter criteria:
string[] states //list of state abbreviations
List to filter:
public class WorksiteGroup
{
public long Id { get; set; }
public string Name { get; set; }
public string Type { get; set; }
public bool IsDiscontinued { get; set; }
public List<WorksiteGroupState> ActiveStates { get; set; } = new List<WorksiteGroupState>();
}
public class WorksiteGroupState
{
public string StateAbbrev { get; set; }
public bool IsDiscountApplied { get; set; }
}
Again, I want to return a list of WorksiteGroup with the full structure above where IsDiscontinued is false and have an ActiveState where StateAbbrev matches any of the filter criteria (states[]) and IsDiscountApplied is true for that state.
Let's do this step by step, and then we can merge operations where necessary.
"I want to return a list of WorksiteGroup with the full structure above where IsDiscontinued is false"
    source.Where(e => !e.IsDiscontinued);
"and have an ActiveState where StateAbbrev matches any of the filter criteria (states[])"
Now let's take the previous pipeline and chain this criterion onto it:
    source.Where(e => !e.IsDiscontinued)
          .Where(e => e.ActiveStates.Any(a => states.Contains(a.StateAbbrev)));
"and IsDiscountApplied is true for that state."
    source.Where(e => !e.IsDiscontinued)
          .Where(e => e.ActiveStates.Any(s => states.Contains(s.StateAbbrev) && s.IsDiscountApplied));
For efficiency, let's move the Contains call after s.IsDiscountApplied, so that the cheap boolean check short-circuits the array search:
    source.Where(e => !e.IsDiscontinued)
          .Where(e => e.ActiveStates.Any(s => s.IsDiscountApplied && states.Contains(s.StateAbbrev)));
You can try this using Linq:
string[] states = new string[] { "abbrev1", "abbrev2" };
var list = new List<WorksiteGroup>();
var item = new WorksiteGroup();
item.Name = "Test1";
item.IsDiscontinued = false;
var subitem = new WorksiteGroupState();
subitem.IsDiscountApplied = true;
subitem.StateAbbrev = "abbrev1";
item.ActiveStates.Add(subitem);
list.Add(item);
item = new WorksiteGroup();
item.Name = "Test2";
item.IsDiscontinued = true;
subitem = new WorksiteGroupState();
subitem.IsDiscountApplied = true;
subitem.StateAbbrev = "abbrev1";
item.ActiveStates.Add(subitem);
list.Add(item);
var result = list.Where(wg => wg.IsDiscontinued == false
    && wg.ActiveStates.Where(state => state.IsDiscountApplied == true
        && states.Contains(state.StateAbbrev)).Any());
foreach (var value in result)
    Console.WriteLine(value.Name);
Console.ReadKey();
You can play with items and add more to see results.
Pseudocode, but would something like the below work? I'm sure you could do this in one line:
var worksiteGroups = Populate();
var filteredWorksiteGroups = worksiteGroups
    .Where(x => x.IsDiscontinued == false)
    .Where(x => x.ActiveStates.Any(s => states.Contains(s.StateAbbrev)
        && s.IsDiscountApplied == true))
    .ToList();

Find all duplicates in a list in C#

I have a Custom class shown below
internal class RecurringClusterModel
{
    public int? From { get; set; }
    public int? To { get; set; }
    public string REC_Cluster_1 { get; set; }
    public string REC_Cluster_2 { get; set; }
    public string REC_Cluster_3 { get; set; }
    public string REC_Cluster_4 { get; set; }
    public string REC_Cluster_5 { get; set; }
    public string REC_Cluster_6 { get; set; }
    public string REC_Cluster_7 { get; set; }
    public string REC_Cluster_8 { get; set; }
    public string REC_Cluster_9 { get; set; }
    public string REC_Cluster_10 { get; set; }
    // ... and so on, up to REC_Cluster_50
}
I have a List of this class
List<RecurringClusterModel> recurringRecords = new List<RecurringClusterModel>();
The data can be in the below format
recurringRecords[0].REC_Cluster_1 = "USA";
recurringRecords[0].REC_Cluster_2 = "UK";
recurringRecords[0].REC_Cluster_3 = "India";
recurringRecords[0].REC_Cluster_4 = "France";
recurringRecords[0].REC_Cluster_5 = "China";
recurringRecords[1].REC_Cluster_1 = "France";
recurringRecords[1].REC_Cluster_2 = "Germany";
recurringRecords[1].REC_Cluster_3 = "Canada";
recurringRecords[1].REC_Cluster_4 = "Russia";
recurringRecords[1].REC_Cluster_5 = "India";
....
I want to find the values that are duplicated across all the Cluster properties. This is just a subset; I have 50 properties, up to REC_Cluster_50. I want to find out which countries are duplicated across the 50 cluster properties of the list.
So in this case India and France are duplicated. I can group by an individual property and then find the duplicates by getting the count, but then I'd have to do it for all 50 REC_Cluster properties. I'm not sure if there is a better way of doing it.
Thanks
Since you want to capture the From and To, I suggest you structure your class like this:
internal class RecurringClusterModel
{
public int? From { get; set; }
public int? To { get; set; }
public IEnumerable<string> REC_Clusters { get; set; }
}
Then you can search for duplicates:
var dupes = recs
.Select(r => new
{
r.From,
r.To,
DuplicateClusters = r.REC_Clusters.GroupBy(c => c)
.Where(g => g.Count() > 1) // duplicates
.SelectMany(g => g) // flatten it back
.ToArray() // indexed
})
.Where(r => r.DuplicateClusters.Any()) //only interested in clusters with duplicates
.ToArray();
EDIT
If you want all duplicates across every record, flatten the clusters first and then group the values:
var allDupes = recs.SelectMany(r => r.REC_Clusters) // flatten to one sequence of values
    .GroupBy(c => c)
    .Where(g => g.Count() > 1) // values that occur more than once
    .Select(g => g.Key)
    .ToArray();
But now you lose track of the From/To.
I would add an enumerable to your class that iterates over all properties of that class:
internal class RecurringClusterModel
{
public string REC_Cluster_1 { get; set; }
public string REC_Cluster_2 { get; set; }
public string REC_Cluster_3 { get; set; }
public IEnumerable<string> Clusters => GetAllClusters();
private IEnumerable<string> GetAllClusters()
{
if (!string.IsNullOrEmpty(REC_Cluster_1))
yield return REC_Cluster_1;
if (!string.IsNullOrEmpty(REC_Cluster_2))
yield return REC_Cluster_2;
if (!string.IsNullOrEmpty(REC_Cluster_3))
yield return REC_Cluster_3;
}
}
With this you can flatten the list to the individual clusters and then group by. If you need the original object back again, you have to provide it while flattening. Here is an example:
var clusters = Enumerable
.Range(1, 10)
.Select(_ => new RecurringClusterModel
{
REC_Cluster_1 = _Locations[_Random.Next(_Locations.Count)],
REC_Cluster_2 = _Locations[_Random.Next(_Locations.Count)],
REC_Cluster_3 = _Locations[_Random.Next(_Locations.Count)],
})
.ToList();
var dictionary = clusters
// Flatten the list and preserve original object
.SelectMany(model => model.Clusters.Select(cluster => (cluster, model)))
// Group by flattened value and put original object into each group
.GroupBy(node => node.cluster, node => node.model)
// Take only groups with more than one element (duplicates)
.Where(group => group.Skip(1).Any())
// Depending on further processing you could put the groups into a dictionary.
.ToDictionary(group => group.Key, group => group.ToList());
foreach (var cluster in dictionary)
{
Console.WriteLine(cluster.Key);
foreach (var item in cluster.Value)
{
Console.WriteLine(" " + String.Join(", ", item.Clusters));
}
}
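With 50 properties, writing one yield return per property gets long. As a rough sketch of an alternative (not something either answer above proposes), the properties could be collected by reflection and then flattened and grouped in the same way. It assumes every cluster property is a public string whose name starts with "REC_Cluster_"; the helper names ClusterDuplicates, GetClusters and FindDuplicates are invented for this example, and the class is declared public here only so the helper signatures compile.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public class RecurringClusterModel
{
    public int? From { get; set; }
    public int? To { get; set; }
    public string REC_Cluster_1 { get; set; }
    public string REC_Cluster_2 { get; set; }
    public string REC_Cluster_3 { get; set; }
    // remaining REC_Cluster_N properties omitted in this sketch
}

// Hypothetical helper class for illustration only.
public static class ClusterDuplicates
{
    // Look the properties up once; reflection is comparatively slow.
    // Assumption: cluster properties are public strings named "REC_Cluster_<n>".
    private static readonly PropertyInfo[] ClusterProperties =
        typeof(RecurringClusterModel)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.PropertyType == typeof(string)
                     && p.Name.StartsWith("REC_Cluster_", StringComparison.Ordinal))
            .ToArray();

    // All non-empty cluster values of one record.
    public static IEnumerable<string> GetClusters(RecurringClusterModel model) =>
        ClusterProperties
            .Select(p => (string)p.GetValue(model))
            .Where(v => !string.IsNullOrEmpty(v));

    // Values that occur more than once across all records and properties.
    public static List<string> FindDuplicates(IEnumerable<RecurringClusterModel> records) =>
        records
            .SelectMany(r => GetClusters(r))
            .GroupBy(v => v)
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
}
```

ClusterDuplicates.FindDuplicates(recurringRecords) would then return each duplicated country once. The trade-off is that the property list is no longer checked at compile time, so a renamed property silently drops out of the search.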

Only parameterless constructors and initializers are supported in LINQ to Entities error

I have a view model which contains a List<>. This list is a collection of another model, and I'm trying to fill it while filling an IEnumerable of my view model. While doing this I get the error “Only parameterless constructors and initializers are supported in LINQ to Entities”. The error is caused by the Locations = new List<>(...) part, in which I try to fill the list. What I would like to know is how to fill this list correctly.
Code:
IEnumerable<PickListLineViewModel> lineList = dbEntity.PickListLine
.Where(i => i.PickID == id && i.Status != "C")
.Select(listline => new PickListLineViewModel
{
ArticleName = dbEntity.Item
.Where(i => i.ItemId == dbEntity.SalesOrderLine
.Where(idb => idb.DocId == listline.BaseDocID &&
idb.DocType.Equals(listline.BaseDocType) &&
idb.LineNum == listline.BaseLineNum)
.Select(iid => iid.ItemId)
.FirstOrDefault())
.Select(p => p.Description)
.FirstOrDefault(),
PickID = listline.PickID,
BaseDocID = listline.BaseDocID,
BaseDocType = listline.BaseDocType,
BaseLineNum = listline.BaseLineNum,
LineNum = listline.LineNum,
Quantity = listline.Quantity,
ReleasedByQty = listline.ReleasedByQty,
Status = listline.Status,
PickedQuantity = listline.PickedQuantity,
Locations = new List<BinLocationItemModel>(dbEntity.BinLocation_Item
.Where(t => t.ItemId == dbEntity.SalesOrderLine
.Where(idb => idb.DocId == listline.BaseDocID &&
idb.DocType.Equals(listline.BaseDocType) &&
idb.LineNum == listline.BaseLineNum)
.Select(iid => iid.ItemId)
.FirstOrDefault())
.Select(locitem => new BinLocationItemModel
{
ItemId = locitem.ItemId,
Barcode = locitem.BinLocation.Barcode,
BinLocationCode = locitem.BinLocation.BinLocationCode,
BinLocationId = locitem.BinLocationId,
BinLocationItemId = locitem.ItemId,
StockAvailable = locitem.StockAvailable
}))
.ToList(),
ArticleID = dbEntity.Item
.Where(i => i.ItemId == dbEntity.SalesOrderLine
.Where(idb => idb.DocId == listline.BaseDocID &&
idb.DocType.Equals(listline.BaseDocType) &&
idb.LineNum == listline.BaseLineNum)
.Select(iid => iid.ItemId)
.FirstOrDefault())
.Select(p => p.ItemCode)
.FirstOrDefault()
})
.AsEnumerable();
BinLocationItemModel:
public class BinLocationItemModel
{
[Required]
public int BinLocationItemId { get; set; }
public string Barcode { get; set; }
public string BinLocationCode { get; set; }
[Required]
public int BinLocationId { get; set; }
[Required]
public int ItemId { get; set; }
public decimal? StockAvailable { get; set; }
}
In your PickListLineViewModel constructor you should initialise your Locations list like below
public class PickListLineViewModel
{
public PickListLineViewModel()
{
Locations = new List<BinLocationItemModel>();
}
public List<BinLocationItemModel> Locations { get; set; }
}
public class BinLocationItemModel
{
....
}
Then in your LINQ query you should be able to do the following; you won't need the new List<> wrapper inside the query:
Locations = dbEntity.BinLocation_Item
.Where(t => t.ItemId == dbEntity.SalesOrderLine
.Where(idb => idb.DocId == listline.BaseDocID &&
idb.DocType.Equals(listline.BaseDocType) &&
idb.LineNum == listline.BaseLineNum)
.Select(iid => iid.ItemId)
.FirstOrDefault())
.Select(locitem => new BinLocationItemModel
{
ItemId = locitem.ItemId,
Barcode = locitem.BinLocation.Barcode,
BinLocationCode = locitem.BinLocation.BinLocationCode,
BinLocationId = locitem.BinLocationId,
BinLocationItemId = locitem.ItemId,
StockAvailable = locitem.StockAvailable
})
.ToList(),

Using Linq(?) to get a property from a list inside of a list

I need help selecting all Titles from the List<FeedItem> inside each Feed of a List<Feed>, where Feed.Name matches a string from a combobox.
Below is my attempt, which is not successful; I might be on the wrong track.
var loadFeedData = fillFeed.GetAllFeeds();
var filteredOrders =
loadFeedData.SelectMany(x => x.Items)
.Select(y => y.Title)
.Where(z => z.Contains(flow)).ToList();
To understand things better I'll post the Feed.cs code as well.
public class Feed : IEntity
{
public string Url { get; set; }
public Guid Id { get; set; }
public string Category { get; set; }
public string Namn { get; set; }
public string UppdateInterval { get; set; }
public List<FeedItem> Items { get; set; }
}
This is the whole code that I'm trying to get working: it fills a ListView with the Titles of the Feed whose Name I select in another ListView.
private void listFlow_SelectionChanged(object sender, System.Windows.Controls.SelectionChangedEventArgs e)
{
listInfo.Items.Clear();
listEpisode.Items.Clear();
if (listFlow.SelectedItem != null)
{
string flow = listFlow.SelectedItem.ToString();
var loadFeedData = fillFeed.GetAllFeeds();
var filteredOrders = loadFeedData
.Where(f => f.Name == myStringFromComboBox)
.SelectMany(f => f.Items)
.Select(fi => fi.Title);
listEpisode.Items.Add(filteredOrders);
}
}
- Posted whole code to clear out some ??
loadFeedData
.Where(f => f.Name == myStringFromComboBox)
.SelectMany(f => f.Items)
.Select(fi => fi.Title);
I believe you are looking for:
List<string> titles = loadFeedData.Where(f => f.Name == "SomeName")
.SelectMany(f => f.Items.Select(subItem => subItem.Title))
.ToList();
First, filter the main list loadFeedData by Name.
Then select the Title from each List<FeedItem>.
SelectMany flattens those titles into a single IEnumerable<string>.
Optionally, call ToList to get a List<string> back.
If I understand you correctly, you want to use this one:
var filteredOrders = loadFeedData.Where(x => x.Name == flow)
.SelectMany(x => x.Items)
.Select(x => x.Title).ToList();
This gives you the Title of every FeedItem in each Feed whose Name equals flow.
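To try the filter-then-flatten query on in-memory data, here is a small self-contained sketch. The FeedItem class is not shown in the question, so its single Title property is an assumption, the Feed class is trimmed to the two members the query needs (using Name, as the queries do), and FeedQueries.TitlesFor is a made-up helper name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the question does not show FeedItem.
public class FeedItem
{
    public string Title { get; set; }
}

// Trimmed-down Feed with only the members used by the query.
public class Feed
{
    public string Name { get; set; }
    public List<FeedItem> Items { get; set; } = new List<FeedItem>();
}

// Hypothetical helper for illustration.
public static class FeedQueries
{
    public static List<string> TitlesFor(IEnumerable<Feed> feeds, string feedName) =>
        feeds
            .Where(f => f.Name == feedName)   // keep only the selected feed(s)
            .SelectMany(f => f.Items)         // flatten their items into one sequence
            .Select(item => item.Title)       // project each item to its title
            .ToList();
}
```

In the SelectionChanged handler, each returned title would then be added to listEpisode.Items one at a time (for example in a foreach loop), because passing the whole sequence to Items.Add adds it as a single entry.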

Grouping and flattening list with linq and lambda

I have the following class
public class SolicitacaoConhecimentoTransporte
{
public long ID { get; set; }
public string CodigoOriginal { get; set; }
public DateTime Data { get; set; }
public List<CaixaConhecimentoTransporte> Caixas { get; set; }
}
I would like to know if there is a way of achieving the same behavior as the code below using LINQ (with lambda expression syntax):
List<SolicitacaoConhecimentoTransporte> auxList = new List<SolicitacaoConhecimentoTransporte>();
foreach (SolicitacaoConhecimentoTransporte s in listaSolicitacao)
{
SolicitacaoConhecimentoTransporte existing =
auxList.FirstOrDefault(f => f.CodigoOriginal == s.CodigoOriginal &&
f.Data == s.Data &&
f.ID == s.ID);
if (existing == null)
{
auxList.Add(s);
}
else
{
existing.Caixas.AddRange(s.Caixas);
}
}
return auxList;
In other words, group all entities that have equal properties and flatten all their lists into one.
Thanks in advance.
Use anonymous object to group by three properties. Then project each group to new SolicitacaoConhecimentoTransporte instance. Use Enumerable.SelectMany to get flattened sequence of CaixaConhecimentoTransporte from each group:
listaSolicitacao.GroupBy(s => new { s.CodigoOriginal, s.Data, s.ID })
    .Select(g => new SolicitacaoConhecimentoTransporte {
        ID = g.Key.ID,
        Data = g.Key.Data,
        CodigoOriginal = g.Key.CodigoOriginal,
        Caixas = g.SelectMany(s => s.Caixas).ToList()
    }).ToList()
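For a version that can be compiled and run on its own, the same query is wrapped in a method below. CaixaConhecimentoTransporte is not shown in the question, so its Numero property is a placeholder, and SolicitacaoMerger.Merge is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Placeholder shape: the question does not show this class.
public class CaixaConhecimentoTransporte
{
    public string Numero { get; set; }
}

public class SolicitacaoConhecimentoTransporte
{
    public long ID { get; set; }
    public string CodigoOriginal { get; set; }
    public DateTime Data { get; set; }
    public List<CaixaConhecimentoTransporte> Caixas { get; set; } = new List<CaixaConhecimentoTransporte>();
}

// Hypothetical helper for illustration.
public static class SolicitacaoMerger
{
    public static List<SolicitacaoConhecimentoTransporte> Merge(
        IEnumerable<SolicitacaoConhecimentoTransporte> source) =>
        source
            // Anonymous types compare by value, so equal triples land in one group.
            .GroupBy(s => new { s.CodigoOriginal, s.Data, s.ID })
            .Select(g => new SolicitacaoConhecimentoTransporte
            {
                ID = g.Key.ID,
                Data = g.Key.Data,
                CodigoOriginal = g.Key.CodigoOriginal,
                // Concatenate the Caixas of every entity in the group.
                Caixas = g.SelectMany(s => s.Caixas).ToList()
            })
            .ToList();
}
```

One difference from the foreach loop in the question: the loop appends to the Caixas list of the first matching entity, whereas this query builds new instances and leaves the input objects unchanged.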
