Grouping using multiple columns, then summing a specific column using method syntax - c#

Currently I am looking for a way to get the sum of all piece counts from an enumerable while ignoring duplicates, all this while using method syntax. As things are right now my code will work, but I realize this is only temporary. More on this later.
Let's use the following class as an example:
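internal class Piece
{
public int Count { get; set; }
public DateTime Date { get; set; }
public string Description { get; set; }
// Constructor used by the list initializer below
public Piece(int count, DateTime date, string description)
{
Count = count; Date = date; Description = description;
}
}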
This class is then used to create a list with the following information
List<Piece> pieces = new List<Piece>
{
new Piece(41,DateTime.Parse("2019-07-12"),"BB"),
new Piece(41,DateTime.Parse("2019-07-12"),"BB"),
new Piece(21,DateTime.Parse("2019-07-12"),"JP"),
new Piece(23,DateTime.Parse("2019-07-14"),"AA")
};
To do the sum, I came up with the following
int total = pieces.Where(x => x.Count > 0)
.GroupBy(x => x.Count, x => x.Date,
(piece, date) => new { Count = piece,Date = date})
.Sum(x => x.Count);
This is where things get tricky. If another piece were to be added as follows
new Piece(23,DateTime.Parse("2019-07-14"),"AB")
that piece would be ignored due to how I am grouping. This is far from ideal.
I have found the following way to group by several columns
GroupBy( x => new {x.Count,x.Date,x.Description})
But I have found no way to use Sum on this grouping. Grouping by an anonymous type does not let me declare local variables (piece, date) as I am able to do in the prior GroupBy (as far as I know).
For now the code that I have will do the trick but it is only a matter of time before that is no longer the case.
Some extra details.
I am manipulating a query result using Razor, and I have no control on the data that I get from the server. Manipulating the data using linq is basically the only way I have at the moment.
Any help is greatly appreciated

For the count you just need this query:
int total = pieces
.Where(x => x.Count > 0)
.GroupBy(x => new { x.Count, x.Date, x.Description })
.Sum(g => g.Key.Count);
So you can access all key properties of the grouping.
This returns 85 for your initial sample and 108 if you add the new piece.
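To check those two totals, here is a minimal self-contained sketch of the same composite-key grouping. The PieceTotals class and its Sample method are hypothetical helpers added for illustration, and Piece is filled with an object initializer rather than a constructor. It works because anonymous types compare by value, so identical (Count, Date, Description) combinations fall into one group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Piece
{
    public int Count { get; set; }
    public DateTime Date { get; set; }
    public string Description { get; set; }
}

// Hypothetical helper class, not part of the original question.
public static class PieceTotals
{
    // Adds Count once per distinct (Count, Date, Description) combination.
    public static int SumDistinct(IEnumerable<Piece> pieces)
    {
        return pieces
            .Where(x => x.Count > 0)
            .GroupBy(x => new { x.Count, x.Date, x.Description })
            .Sum(g => g.Key.Count);
    }

    // Sample data mirroring the list from the question.
    public static List<Piece> Sample()
    {
        return new List<Piece>
        {
            new Piece { Count = 41, Date = new DateTime(2019, 7, 12), Description = "BB" },
            new Piece { Count = 41, Date = new DateTime(2019, 7, 12), Description = "BB" },
            new Piece { Count = 21, Date = new DateTime(2019, 7, 12), Description = "JP" },
            new Piece { Count = 23, Date = new DateTime(2019, 7, 14), Description = "AA" }
        };
    }
}
```

Calling PieceTotals.SumDistinct(PieceTotals.Sample()) gives 85; after adding the (23, 2019-07-14, "AB") piece it gives 108.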

Related

How to find items in sequence and group them?

Let's say I have the following:
public class Person
{
public string Name{get;set;}
public string Other{get;set;}
public string Other2{get;set;}
public int? Sequence{get;set;}
}
new Person("bob","other1","other2",1)
new Person("bob","other1","other2",2)
new Person("bob","other1","other2",3)
new Person("bob","other1","other2",4)
new Person("Alice","other1","other2")
new Person("Alice","other1","other2",1)
new Person("Alan","other1","other2",1)
new Person("Alan","other1","other2",2)
new Person("Alan","other1","other2",3)
new Person("Alex","other1","other2")
new Person("Alex","other1","other2",1)
new Person("Alex","other1","other2",2)
As shown some of the objects have sequence 1-n and some don't.
Could I use LINQ to pull objects by sequence from the given list?
Desired output would be:
Bob and all his related data where a sequence is there like 1,2,3,4 records
Alex 2 records as he only has sequences 1 and 2.
So the output would be another object by name and data by sequence.
new {Name="Bob", Data=new[]{
Other = "other"
Sequence = 1
Other2 = "Other2" //etc
}}
The sequence will always increment by 1 and be in order, but how many there might be is unknown.
If I have not made something clear just ask.
What I tried
I tried without using LINQ and looping through the list and processing each object and passing out a newly created object for each row using lots of if's.
I am just wondering if there is an easier way with LINQ; my way works, but it's ugly.
If I understand correctly:
var result = yourCollection
.Where(x => x.Sequence.HasValue)
.GroupBy(x => x.Name)
.Select(grp => new
{
Name = grp.Key,
Data = grp.Select(x => new
{
x.Other,
x.Sequence,
x.Other2
})
});
This assumes the list is already ordered by Sequence; if not, just add an .OrderBy(x => x.Sequence.Value) before the GroupBy.
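As a rough sketch of the unordered case, the following filters out rows without a sequence, orders by sequence, and then groups by name. The SequenceGrouping class and the choice of a dictionary as the result shape are assumptions made for illustration; Person uses object initializers here instead of the constructor from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public string Other { get; set; }
    public string Other2 { get; set; }
    public int? Sequence { get; set; }
}

// Hypothetical helper class, not part of the original answer.
public static class SequenceGrouping
{
    // Returns, per name, the rows that carry a sequence number, ordered by it.
    public static Dictionary<string, List<Person>> BySequence(IEnumerable<Person> people)
    {
        return people
            .Where(p => p.Sequence.HasValue)
            .OrderBy(p => p.Sequence.Value)   // GroupBy keeps this order inside each group
            .GroupBy(p => p.Name)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

Names that only appear without a sequence are dropped entirely, since the filter runs before the grouping.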

Finding duplicate items then selecting one with closest date to currentDate

Note: Using Windows Mobile 6.5 Compact Framework.
I have a collection of the following object.
public class RFileModel
{
public List<string> RequiredFilesForR = new List<string>();
public string Date { get; set; }
public string RouteId { get; set; }
}
var ListOfRFileModels = new List<RFileModel>();
There is the chance that the same RouteId will be in multiple instances of RFileModel but with a different Date.
I'm trying to identify the duplicates and select only one, the one closest to the current date.
I have the following LINQ so far:
var query = ListOfRFileModels.GroupBy(route => route.RouteId)
.OrderBy(newGroup => newGroup.Key)
.Select(newGroup => newGroup).ToList();
But I don't think this is what I need, since it still returns all elements. I was expecting a list of non unique RouteId, that way I can iterate each non-unique id and compare dates to see which one to keep.
How can I accomplish this with LINQ or just plain ole foreach?
Your expression sorts groups, not group elements. Here is how to fix it:
DateTime currentDate = ...
var query = ListOfRFileModels
.GroupBy(route => route.RouteId)
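.Select(g => g.OrderBy(fm => (currentDate - DateTime.Parse(fm.Date)).Duration()).First())
.ToList();
Because Date is declared as a string in RFileModel, it has to be parsed into a DateTime first. The expression (currentDate - DateTime.Parse(fm.Date)).Duration() then produces the absolute difference between the current date and the date of the RFileModel object, so dates in the past and in the future are treated alike. The object with the smallest difference ends up in the first position of the ordered sequence, and the call to First() picks it from the group to produce the final result.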
Assuming you want ONLY the members with duplicates, take @dasblinkenlight's answer and add a Where clause: .Where(grp => grp.Count()>1):
DateTime currentDate = DateTime.Now;
var query = ListOfRFileModels
.GroupBy(route => route.RouteId)
.Where(grp => grp.Count()>1)
.Select(g => g.OrderBy(fm => (currentDate - DateTime.Parse(fm.Date)).Duration()).First())
.ToList();
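Since Date is a string in RFileModel, a version that parses it explicitly may be easier to reason about. The sketch below is an assumption-laden illustration: RouteFilter and ClosestPerRoute are hypothetical names, the dates are assumed to be in a format that DateTime.Parse accepts under the invariant culture, and it has not been verified on the Compact Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class RFileModel
{
    public List<string> RequiredFilesForR = new List<string>();
    public string Date { get; set; }
    public string RouteId { get; set; }
}

// Hypothetical helper class, not part of the original answers.
public static class RouteFilter
{
    // Keeps one model per RouteId: the one whose date is nearest to currentDate.
    public static List<RFileModel> ClosestPerRoute(IEnumerable<RFileModel> models, DateTime currentDate)
    {
        return models
            .GroupBy(m => m.RouteId)
            .Select(g => g
                // Duration() is the absolute value of the difference,
                // so past and future dates are compared by distance only.
                .OrderBy(m => (currentDate - DateTime.Parse(m.Date, CultureInfo.InvariantCulture)).Duration())
                .First())
            .ToList();
    }
}
```

Parsing inside OrderBy happens once per element, because OrderBy computes each key a single time before sorting.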

Sorting a list of objects based on another

public class Product
{
public string Code { get; private set; }
public Product(string code)
{
Code = code;
}
}
List<Product> sourceProductsOrder =
new List<Product>() { new Product("BBB"), new Product("QQQ"),
new Product("FFF"), new Product("HHH"),
new Product("PPP"), new Product("ZZZ")};
List<Product> products =
new List<Product>() { new Product("ZZZ"), new Product("BBB"),
new Product("HHH")};
I have two product lists and I want to reorder the second one with the same order as the first.
How can I reorder the products list so that the result would be : "BBB", "HHH", "ZZZ"?
EDIT: Changed Code property to public as @juharr mentioned
You would use IndexOf:
var sourceCodes = sourceProductsOrder.Select(s => s.Code).ToList();
products = products.OrderBy(p => sourceCodes.IndexOf(p.Code)).ToList();
The only catch to this is if the second list has something not in the first list those will go to the beginning of the second list.
See the MSDN documentation for List<T>.IndexOf for details.
You could try something like this
products.OrderBy(p => sourceProductsOrder.IndexOf(p))
if it is the same Product object. Otherwise, you could try something like:
products.OrderBy(p => GetIndex(sourceProductsOrder, p))
and write a small GetIndex helper method. Or create an Index() extension method for List<>, which would yield
products.OrderBy(p => sourceProductsOrder.Index(p))
The GetIndex method is rather simple so I omit it here.
(I have no PC to run the code so please excuse small errors)
Here is an efficient way to do this:
var lookup = sourceProductsOrder.Select((p, i) => new { p.Code, i })
.ToDictionary(x => x.Code, x => x.i);
products = products.OrderBy(p => lookup[p.Code]).ToList();
This should have a running time complexity of O(N log N), whereas an approach using IndexOf() would be O(N²).
This assumes the following:
- there are no duplicate product codes in sourceProductsOrder
- sourceProductsOrder contains all of the product codes in products
- you make the Code field/property non-private
If needed, you can create a safeguard against the first bullet by replacing the first statement with this:
var lookup = sourceProductsOrder.GroupBy(p => p.Code)
.Select((g, i) => new { g.Key, i })
.ToDictionary(x => x.Key, x => x.i);
You can account for the second bullet by replacing the second statement with this:
products = products.OrderBy(p =>
lookup.ContainsKey(p.Code) ? lookup[p.Code] : Int32.MaxValue).ToList();
And you can use both if you need to. These will slow down the algorithm a bit, but it should continue to have an O(N log N) running time even with these alterations.
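Putting both safeguards together might look like the sketch below. ProductOrdering and OrderLike are hypothetical names, and TryGetValue is used in place of ContainsKey plus the indexer so that each code is looked up only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Code { get; private set; }
    public Product(string code) { Code = code; }
}

// Hypothetical helper class, not part of the original answer.
public static class ProductOrdering
{
    // Orders products by the position of their Code in source.
    // Codes missing from source are placed at the end.
    public static List<Product> OrderLike(IEnumerable<Product> products, IEnumerable<Product> source)
    {
        // GroupBy collapses duplicate codes, keeping the first position.
        Dictionary<string, int> lookup = source
            .GroupBy(p => p.Code)
            .Select((g, i) => new { g.Key, Index = i })
            .ToDictionary(x => x.Key, x => x.Index);

        return products
            .OrderBy(p =>
            {
                int index;
                return lookup.TryGetValue(p.Code, out index) ? index : int.MaxValue;
            })
            .ToList();
    }
}
```

OrderBy is a stable sort, so products whose codes are missing from the source list keep their original relative order at the end.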
I would implement a compare function that does a lookup of the order from sourceProductsOrder using a hash table. The lookup table would look like
(key) : (value)
"BBB" : 1
"QQQ" : 2
"FFF" : 3
"HHH" : 4
"PPP" : 5
"ZZZ" : 6
Your compare function could then look up the order of the two elements and compare them (a sketch, with lookupTable keyed by product code):
int compareFunction(Product a, Product b){
return lookupTable[a.Code].CompareTo(lookupTable[b.Code]);
}
Building the hash table would be linear, and doing the sort would generally be O(n log n).
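One way this compare-function idea could be written in C# is with List<T>.Sort and a lambda comparison. This is only a sketch: ProductSorter and SortBySource are hypothetical names, and it assumes every product code also appears in the source list (a missing code would throw a KeyNotFoundException).

```csharp
using System;
using System.Collections.Generic;

public class Product
{
    public string Code { get; private set; }
    public Product(string code) { Code = code; }
}

// Hypothetical helper class, not part of the original answer.
public static class ProductSorter
{
    // Sorts products in place according to the position of each Code in source.
    public static void SortBySource(List<Product> products, List<Product> source)
    {
        // Build the lookup table: code -> position in the source list.
        var rank = new Dictionary<string, int>();
        for (int i = 0; i < source.Count; i++)
        {
            if (!rank.ContainsKey(source[i].Code))
            {
                rank[source[i].Code] = i;
            }
        }

        // The comparison returns a negative, zero, or positive int, as Sort expects.
        products.Sort((a, b) => rank[a.Code].CompareTo(rank[b.Code]));
    }
}
```

Unlike OrderBy, List<T>.Sort is not a stable sort, so products with equal ranks may change their relative order.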
Easy come easy go:
IEnumerable<Product> result =
products.OrderBy(p => sourceProductsOrder.IndexOf(sourceProductsOrder.FirstOrDefault(p2 => p2.Code == p.Code)));
This will provide the desired result. Objects with product codes not available in the source list will be placed at the beginning of the result set. This will perform just fine for a couple of hundred items, I suppose.
If you have to deal with thousands of objects, then an answer like @Jon's will likely perform better. There you first create a kind of lookup value / score for each item and then use that for sorting / ordering.
The approach I described is O(n²).

NHibernate query extremely slow compared to hard coded SQL query

I'm re-writing some of my old NHibernate code to be more database agnostic and use NHibernate queries rather than hard coded SELECT statements or database views. I'm stuck with one that's incredibly slow after being re-written. The SQL query is as such:
SELECT
r.recipeingredientid AS id,
r.ingredientid,
r.recipeid,
r.qty,
r.unit,
i.conversiontype,
i.unitweight,
f.unittype,
f.formamount,
f.formunit
FROM recipeingredients r
INNER JOIN shoppingingredients i USING (ingredientid)
LEFT JOIN ingredientforms f USING (ingredientformid)
So, it's a pretty basic query with a couple JOINs that selects a few columns from each table. This query happens to return about 400,000 rows and has roughly a 5 second execution time. My first attempt to express it as an NHibernate query was as such:
var timer = new System.Diagnostics.Stopwatch();
timer.Start();
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.Fetch(prop => prop.Ingredient).Eager()
.Fetch(prop => prop.IngredientForm).Eager()
.List();
timer.Stop();
This code works and generates the desired SQL, however it takes 120,264ms to run. After that, I loop through recIngs and populate a List<T> collection, which takes under a second. So, something NHibernate is doing is extremely slow! I have a feeling this is simply the overhead of constructing instances of my model classes for each row. However, in my case, I'm only using a couple properties from each table, so maybe I can optimize this.
The first thing I tried was this:
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(r => joinForm.FormDisplayName)
.List<String>();
Here, I just grab a single value from one of my JOIN'ed tables. The SQL code is once again correct and this time it only grabs the FormDisplayName column in the select clause. This call takes 2498ms to run. I think we're on to something!!
However, I of course need to return several different columns, not just one. Here's where things get tricky. My first attempt is an anonymous type:
.Select(r => new { DisplayName = joinForm.FormDisplayName, IngName = joinIng.DisplayName })
Ideally, this should return a collection of anonymous types with both a DisplayName and an IngName property. However, this causes an exception in NHibernate:
Object reference not set to an instance of an object.
Plus, .List() is trying to return a list of RecipeIngredients, not anonymous types. I also tried .List<Object>() to no avail. Hmm. Well, perhaps I can create a new type and return a collection of those:
.Select(r => new TestType(r))
The TestType construction would take a RecipeIngredients object and do whatever. However, when I do this, NHibernate throws the following exception:
An unhandled exception of type 'NHibernate.MappingException' occurred
in NHibernate.dll
Additional information: No persister for: KitchenPC.Modeler.TestType
I guess NHibernate wants to generate a model matching the schema of RecipeIngredients.
How can I do what I'm trying to do? It seems that .Select() can only be used for selecting a list of a single column. Is there a way to use it to select multiple columns?
Perhaps one way would be to create a model with my exact schema, however I think that would end up being just as slow as the original attempt.
Is there any way to return this much data from the server without the massive overhead, without hard coding a SQL string into the program or depending on a VIEW in the database? I'd like to keep my code completely database agnostic. Thanks!
The QueryOver syntax for converting selected columns into an artificial object (DTO) is a bit different. See section 16.6, Projections, of the QueryOver documentation for more details and a nice example.
A draft of it could look like this. First, the DTO:
public class TestTypeDTO // the DTO
{
public string PropertyStr1 { get; set; }
...
public int PropertyNum1 { get; set; }
...
}
And this is an example of the usage
// DTO marker
TestTypeDTO dto = null;
// the query you need
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
// place for projections
.SelectList(list => list
// this set is an example of string and int
.Select(x => joinForm.FormDisplayName)
.WithAlias(() => dto.PropertyStr1) // this WithAlias is essential
.Select(x => joinIng.Weight) // it will help the below transformer
.WithAlias(() => dto.PropertyNum1)) // with conversion
...
.TransformUsing(Transformers.AliasToBean<TestTypeDTO>())
.List<TestTypeDTO>();
So, I came up with my own solution that's a bit of a mix between Radim's solution (using the AliasToBean transformer with a DTO) and Jake's solution, which involves selecting raw properties and converting each row to a list of object[] tuples.
My code is as follows:
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(
p => joinIng.IngredientId,
p => p.Recipe.RecipeId,
p => p.Qty,
p => p.Unit,
p => joinIng.ConversionType,
p => joinIng.UnitWeight,
p => joinForm.UnitType,
p => joinForm.FormAmount,
p => joinForm.FormUnit)
.TransformUsing(IngredientGraphTransformer.Create())
.List<IngredientBinding>();
I then implemented a new class called IngredientGraphTransformer which can convert that object[] array into a list of IngredientBinding objects, which is what I was ultimately doing with this list anyway. This is exactly how AliasToBeanTransformer is implemented, only it initializes a DTO based on a list of aliases.
public class IngredientGraphTransformer : IResultTransformer
{
public static IngredientGraphTransformer Create()
{
return new IngredientGraphTransformer();
}
IngredientGraphTransformer()
{
}
public IList TransformList(IList collection)
{
return collection;
}
public object TransformTuple(object[] tuple, string[] aliases)
{
Guid ingId = (Guid)tuple[0];
Guid recipeId = (Guid)tuple[1];
Single? qty = (Single?)tuple[2];
Units usageUnit = (Units)tuple[3];
UnitType convType = (UnitType)tuple[4];
Int32 unitWeight = (int)tuple[5];
Units rawUnit = Unit.GetDefaultUnitType(convType);
// Do a bunch of logic based on the data above
return new IngredientBinding
{
RecipeId = recipeId,
IngredientId = ingId,
Qty = qty,
Unit = rawUnit
};
}
}
Note, this is not as fast as doing a raw SQL query and looping through the results with an IDataReader, however it's much faster than joining in all the various models and building the full set of data.
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(r => r.column1, r => r.column2)
.List<object[]>();
Would this work?

How can I sort a txt file with 5000000 lines?

I've got an unordered file with 500000 lines. Its contents and the desired order look like the following:
For instance    Desired result
------------    --------------
723,80          1,4
14,50           1,5
723,2           10,8
1,5             14,50
10,8            723,2
1,4             723,80
Now, how can I implement such a thing?
I've tried SortedList and SortedDictionary, but there is no way to add a new value because the list contains repeated values.
I'd appreciate it if you could suggest the best possible method.
One more thing: I've seen the following question, but it uses a class while I am working with a file:
C# List<> Sort by x then y
var result = File.ReadAllLines("...filepath...")
.Select(line => line.Split(','))
.Select(parts => new
{
V1 = int.Parse(parts[0]),
V2 = int.Parse(parts[1])
})
.OrderBy(v => v.V1)
.ThenBy(v => v.V2)
.ToList();
Duplicates will be handled properly by default. If you want to remove them, add .Distinct() somewhere, for example after ReadAllLines.
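For testing without a file on disk, the same pipeline can be wrapped in a method that takes the lines directly. This is a minimal sketch: PairSorter and SortPairs are hypothetical names, blank lines are skipped as an added assumption, and every remaining line is assumed to hold exactly two comma-separated integers.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper class, not part of the original answer.
public static class PairSorter
{
    // Sorts "x,y" lines numerically by x, then by y, and returns them as text again.
    public static List<string> SortPairs(IEnumerable<string> lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))   // skip blank lines
            .Select(line => line.Split(','))
            .Select(parts => new
            {
                V1 = int.Parse(parts[0], CultureInfo.InvariantCulture),
                V2 = int.Parse(parts[1], CultureInfo.InvariantCulture)
            })
            .OrderBy(v => v.V1)
            .ThenBy(v => v.V2)
            .Select(v => v.V1 + "," + v.V2)
            .ToList();
    }
}
```

With a real file, the input could come from File.ReadLines(path) in System.IO, which reads lines lazily, although the sort itself still has to hold all parsed pairs in memory.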
You need to parse the file into an object defined by a class. Once it's in the object, you can start to sort it.
public class myObject
{
public int x { get; set; }
public int y { get; set; }
}
Now once you get the file parsed into a list of objects, you should be able to do something like the following:
var myList = new List<myObject>(); //obviously, you should have parsed the file into the list.
var sortedList = myList.OrderBy(l => l.x).ThenBy(l => l.y).ToList();
First, sort each row so that its values are in the correct order (e.g. [723,80] -> [80,723]).
Then sort all rows using a comparison something like this:
int Compare(Tuple<int,int> lhs, Tuple<int,int> rhs)
{
int res = lhs.Item1.CompareTo(rhs.Item1);
if(res == 0) res=lhs.Item2.CompareTo(rhs.Item2);
return res;
}
