I'm new to C#, SQL and LINQ. I have two lists, one "dataTransactions" (fuel from gas stations) and a similar one "dataTransfers" (fuel from slip tanks).
They each access a different table from SQL and get combined later.
List<FuelLightTruckDataSource> data = new List<FuelLightTruckDataSource>();
using (SystemContext ctx = new SystemContext())
{
List<FuelLightTruckDataSource> dataTransactions
= ctx.FuelTransaction
.Where(tx => DbFunctions.TruncateTime(tx.DateTime) >= from.Date && DbFunctions.TruncateTime(tx.DateTime) <= to.Date
//&& tx.AssetFilled.AssignedToEmployee.Manager
&& tx.AssetFilled.AssignedToEmployee != null
//&
&& tx.AssetFilled.AssetType.Code == "L"
&& (tx.FuelProductType.FuelProductClass.Code == "GAS" || tx.FuelProductType.FuelProductClass.Code == "DSL"))
.GroupBy(tx => new { tx.AssetFilled, tx.DateTime, tx.FuelProductType.FuelProductClass, tx.FuelCard.FuelVendor, tx.City, tx.Volume, tx.Odometer}) //Added tx.volume to have individual transactions
.Select(g => new FuelLightTruckDataSource()
{
Asset = g.FirstOrDefault().AssetFilled,
Employee = g.FirstOrDefault().AssetFilled.AssignedToEmployee,
ProductClass = g.FirstOrDefault().FuelProductType.FuelProductClass,
Vendor = g.FirstOrDefault().FuelCard.FuelVendor,
FillSource = FuelFillSource.Transaction,
Source = "Fuel Station",
City = g.FirstOrDefault().City.ToUpper(),
Volume = g.FirstOrDefault().Volume,
Distance = g.FirstOrDefault().Odometer,
Date = g.FirstOrDefault().DateTime
})
.ToList();
In the end, I use
data.AddRange(dataTransactions);
data.AddRange(dataTransfers);
to put the two lists together and generate a fuel consumption report.
Both lists are individually sorted by Date, but after "AddRange" the "dataTransfers" just gets added to the end, losing my sort by Date. How do I sort the combined result again by date after using the "AddRange" command?
Try this:
data = data.OrderBy(d => d.Date).ToList();
Or if you want to order descending:
data = data.OrderByDescending(d => d.Date).ToList();
You can call List<T>.Sort(Comparison<T>), passing a delegate.
https://msdn.microsoft.com/en-us/library/w56d4y5z(v=vs.110).aspx
Example:
data.Sort(delegate(FuelLightTruckDataSource x, FuelLightTruckDataSource y)
{
// Compare by date; swap x and y for descending order.
return x.Date.CompareTo(y.Date);
});
Advantage: this sorts the existing list in place, whereas OrderBy(...).ToList() allocates a new list. It's a small thing, but it can matter in performance- and memory-sensitive situations. Note that List<T>.Sort is not a stable sort, while OrderBy is.
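To make the two approaches concrete, here is a minimal sketch of both. FuelRow is a hypothetical stand-in for FuelLightTruckDataSource with only the fields needed, and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for FuelLightTruckDataSource (assumption: Date is a non-nullable DateTime).
public sealed class FuelRow
{
    public DateTime Date { get; set; }
    public string Source { get; set; } = "";
}

public static class FuelReport
{
    // Concatenates both lists and returns a NEW list ordered by Date.
    // OrderBy is stable, so rows with equal dates keep their relative order.
    public static List<FuelRow> CombineSorted(IEnumerable<FuelRow> transactions,
                                              IEnumerable<FuelRow> transfers)
    {
        return transactions.Concat(transfers).OrderBy(r => r.Date).ToList();
    }

    // Sorts an existing list in place, without allocating a second list.
    // List<T>.Sort is not guaranteed to preserve the order of equal dates.
    public static void SortInPlace(List<FuelRow> rows)
    {
        rows.Sort((x, y) => x.Date.CompareTo(y.Date));
    }
}
```

If the order of rows sharing the same date matters for the report, the OrderBy variant is probably the safer choice; you can also chain ThenBy to break ties explicitly.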
I have two lists filled with their own data.
Let's say there are two models, Human and AnotherHuman. Each model contains different fields, but they have some fields in common, such as LastName, FirstName, Birthday and PersonalID.
List<Human> humans = _unitOfWork.GetHumans();
List<AnotherHuman> anotherHumans = _unitofWork.GetAnotherHumans();
I would like to exclude the items from list anotherHumans where LastName, FirstName, Birthday are all equal to the corresponding fields of any item in list humans.
However if any item in anotherHumans list has PersonalID and item in list humans have the same PersonalID, then it is enough to compare Human with AnotherHuman only by this PersonalID, otherwise by LastName, FirstName and Birthday.
I tried to create a new list of duplicates and exclude it from anotherHumans:
List<AnotherHuman> duplicates = new List<AnotherHuman>();
foreach(Human human in humans)
{
AnotherHuman newAnotherHuman = new AnotherHuman();
newAnotherHuman.LastName = human.LastName;
newAnotherHuman.FirstName = human.FirstName;
newAnotherHuman.Birthday = human.Birthday;
duplicates.Add(newAnotherHuman);
}
anotherHumans = anotherHumans.Except(duplicates).ToList();
But how can I compare PersonalID from both lists when it is present (it is nullable)? Is there any way to avoid creating new instances of AnotherHuman and the list of duplicates, and use LINQ only?
Thanks in advance!
Instead of creating new objects, how about checking the properties as part of the LINQ query?
List<Human> humans = _unitOfWork.GetHumans();
List<AnotherHuman> anotherHumans = _unitofWork.GetAnotherHumans();
// Get all anotherHumans where the record does not exist in humans
var result = anotherHumans
.Where(ah => !humans.Any(h => h.LastName == ah.LastName
&& h.FirstName == ah.FirstName
&& h.Birthday == ah.Birthday
&& (!h.PersonalID.HasValue || h.PersonalID == ah.PersonalID)))
.ToList();
var duplicates = from h in humans
from a in anotherHumans
where (h.PersonalID.HasValue && h.PersonalID == a.PersonalID) || // two null IDs would otherwise compare equal
(h.LastName == a.LastName &&
h.FirstName == a.FirstName &&
h.Birthday == a.Birthday)
select a;
anotherHumans = anotherHumans.Except(duplicates).ToList();
var nonIdItems = anotherHumans
.Where(ah => !ah.PersonalID.HasValue)
.Join(humans,
ah => new{ah.LastName,
ah.FirstName,
ah.Birthday},
h => new{h.LastName,
h.FirstName,
h.Birthday},
(ah,h) => ah);
var idItems = anotherHumans
.Where(ah => ah.PersonalID.HasValue)
.Join(humans,
ah => ah.PersonalID,
h => h.PersonalID,
(ah,h) => ah);
var allAnotherHumansWithMatchingHumans = nonIdItems.Concat(idItems);
var allAnotherHumansWithoutMatchingHumans =
anotherHumans.Except(allAnotherHumansWithMatchingHumans);
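For larger lists, the same rule can be sketched with hash sets so each AnotherHuman is checked in roughly constant time. This is only a sketch under assumptions: the model classes below are minimal hypothetical versions, PersonalID is assumed to be int?, and a record counts as a duplicate if either the IDs match (both present) or LastName, FirstName and Birthday all match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal models; the real classes have more fields.
public class Human
{
    public int? PersonalID { get; set; }
    public string LastName { get; set; } = "";
    public string FirstName { get; set; } = "";
    public DateTime Birthday { get; set; }
}

public class AnotherHuman
{
    public int? PersonalID { get; set; }
    public string LastName { get; set; } = "";
    public string FirstName { get; set; } = "";
    public DateTime Birthday { get; set; }
}

public static class HumanMatcher
{
    // Returns the AnotherHuman items that have no matching Human.
    public static List<AnotherHuman> ExcludeMatches(List<Human> humans,
                                                    List<AnotherHuman> anotherHumans)
    {
        // IDs that exist on the Human side (nulls are skipped).
        var knownIds = new HashSet<int>(
            humans.Where(h => h.PersonalID.HasValue).Select(h => h.PersonalID.Value));

        // Name + birthday combinations on the Human side.
        var knownNames = new HashSet<(string, string, DateTime)>(
            humans.Select(h => (h.LastName, h.FirstName, h.Birthday)));

        return anotherHumans
            .Where(a => !(a.PersonalID.HasValue && knownIds.Contains(a.PersonalID.Value))
                     && !knownNames.Contains((a.LastName, a.FirstName, a.Birthday)))
            .ToList();
    }
}
```

If two records with the same name and birthday but different IDs should be treated as different people, the name check would need an extra condition; the question leaves that case open.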
I need to select 2 cities in a table with linq:
SOLUTION 1: just one query
var CityQuery = db.Cities.Where(c => c.CityId == City1Id || c.CityId == City2Id).Take(2);
foreach (var item in CityQuery)
{
if (item.CityId == City1Id)
{
City1Name = item.CityName;
}
else
{
City2Name = item.CityName;
}
}
or
SOLUTION 2: execute 2 queries
var City1Query = db.Cities.Where(c => c.CityId == City1Id).FirstOrDefault();
City1Name = City1Query.CityName;
var City2Query = db.Cities.Where(c => c.CityId == City2Id).FirstOrDefault();
City2Name = City2Query.CityName;
Which query is the most efficient? What is the best practice?
Generally speaking, your Solution 1 should be faster since it makes a single round-trip to the database. However, whether that difference is significant or not depends on your use case.
Here's an alternative solution that only brings back the city names from the database (instead of all columns). At first glance this might look like an inefficient Cartesian product, but the database engine will most likely optimize it, especially if there is an index on the CityId column.
var result = (from city1 in db.Cities where city1.CityId == City1Id
from city2 in db.Cities where city2.CityId == City2Id
select new
{
City1Name = city1.CityName,
City2Name = city2.CityName
}).FirstOrDefault();
// result is null if either city is missing.
City1Name = result?.City1Name;
City2Name = result?.City2Name;
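Another way to keep it to one round-trip is to fetch only the ID and name for both cities and read them from a dictionary. The following is a rough sketch, not tested against a real database: the City class and the CityLookup helper are hypothetical, and the IQueryable<City> parameter stands in for db.Cities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with only the two columns used in the question.
public sealed class City
{
    public int CityId { get; set; }
    public string CityName { get; set; } = "";
}

public static class CityLookup
{
    // 'cities' stands in for db.Cities; with a real provider the Where/Select
    // part is expected to run as a single query.
    public static Dictionary<int, string> GetNames(IQueryable<City> cities,
                                                   int city1Id, int city2Id)
    {
        var ids = new List<int> { city1Id, city2Id };
        return cities
            .Where(c => ids.Contains(c.CityId))
            .Select(c => new { c.CityId, c.CityName }) // only the needed columns
            .ToDictionary(c => c.CityId, c => c.CityName);
    }
}
```

You would then read each name with TryGetValue on the returned dictionary, which also makes the "city not found" case explicit instead of relying on the else branch of the loop.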
I've been using Stopwatch and it looks like the below query is very expensive in terms of performance, even though what I already have below I find most optimal based on various reading (change foreach loop with for, use arrays instead of collection, using anonymous type not to take the whole table from DB). Is there a way to make it faster? I need to fill the prices array, which needs to be nullable. I'm not sure if I'm missing something?
public float?[] getPricesOfGivenProducts(string[] lookupProducts)
{
var idsAndPrices = from r in myReadings select
new { ProductId = r.ProductId, Price = r.Price };
float?[] prices = new float?[lookupProducts.Length];
for(int i=0;i<lookupProducts.Length;i++)
{
string id = lookupProducts[i];
if (idsAndPrices.Any(r => r.ProductId == id))
{
prices[i] = idsAndPrices.Where(p => p.ProductId == id)
.Select(a=>a.Price).FirstOrDefault();
}
else
{
prices[i] = null;
}
}
return prices;
}
It's likely every time you call idsAndPrices.Any(r => r.ProductId == id), you are hitting the database, because you haven't materialized the result (.ToList() would somewhat fix it). That's probably the main cause of the bad performance. However, simply loading it all into memory still means you're searching the list for a productID every time (twice per product, in fact).
Use a Dictionary when you're trying to do lookups.
public float?[] getPricesOfGivenProducts(string[] lookupProducts)
{
var idsAndPrices = myReadings.ToDictionary(r => r.ProductId, r => r.Price);
float?[] prices = new float?[lookupProducts.Length];
for (int i = 0; i < lookupProducts.Length; i++)
{
string id = lookupProducts[i];
if (idsAndPrices.ContainsKey(id))
{
prices[i] = idsAndPrices[id];
}
else
{
prices[i] = null;
}
}
return prices;
}
To improve this further, we can identify that we only care about products passed to us in the array. So let's not load the entire database:
var idsAndPrices = myReadings
.Where(r => lookupProducts.Contains(r.ProductId))
.ToDictionary(r => r.ProductId, r => r.Price);
Now, we might want to avoid the 'return null price if we can't find the product' scenario. Perhaps the validity of the product id should be handled elsewhere. In that case, we can make the method a lot simpler (and we won't have to rely on having the array in order, either):
public Dictionary<string, float> getPricesOfGivenProducts(string[] lookupProducts)
{
return myReadings
.Where(r => lookupProducts.Contains(r.ProductId))
.ToDictionary(r => r.ProductId, r => r.Price);
}
And a note unrelated to performance: you should use decimal for money.
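As a variation on the dictionary approach, here is a small sketch that uses TryGetValue, so each product needs one lookup rather than ContainsKey followed by the indexer. The Reading class and the PriceLookup helper are hypothetical; the sketch also keeps the first price seen per product, since ToDictionary throws if the same ProductId appears twice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a row in myReadings.
public sealed class Reading
{
    public string ProductId { get; set; } = "";
    public float Price { get; set; }
}

public static class PriceLookup
{
    // 'readings' is assumed to be already filtered and loaded into memory.
    public static float?[] GetPrices(IEnumerable<Reading> readings, string[] lookupProducts)
    {
        // Keep the first price per product, mirroring FirstOrDefault in the question.
        var priceById = new Dictionary<string, float>();
        foreach (var r in readings)
        {
            if (!priceById.ContainsKey(r.ProductId))
            {
                priceById[r.ProductId] = r.Price;
            }
        }

        var prices = new float?[lookupProducts.Length];
        for (int i = 0; i < lookupProducts.Length; i++)
        {
            // One hash lookup per product; unknown products stay null.
            prices[i] = priceById.TryGetValue(lookupProducts[i], out var price)
                ? price
                : (float?)null;
        }
        return prices;
    }
}
```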
Assuming that idsAndPrices is an IEnumerable<T>, you should materialize it when you initialize it:
var idsAndPrices = (from r in myReadings select
new { ProductId = r.ProductId, Price = r.Price })
.ToList();
It's likely that the calls to:
idsAndPrices.Any(r => r.ProductId == id)
and:
idsAndPrices.Where(p => p.ProductId == id)
..are causing the IEnumerable<T> to be evaluated every time it's called.
Based on
using anonymous type not to take the whole table from DB
I assume myReadings is the database table and
var idsAndPrices =
from r in myReadings
select new { ProductId = r.ProductId, Price = r.Price };
is the database query.
Your implementation is far from optimal (I would rather say quite inefficient), because the above query is executed twice for each element of the lookupProducts array: once by the idsAndPrices.Any(...) statement and once by the idsAndPrices.Where(...) statement.
The optimal way I see is to filter as much as possible the database query, and then use the most efficient LINQ to Objects method for correlating two in memory sequences - join, in your case left outer join:
var dbQuery =
from r in myReadings
where lookupProducts.Contains(r.ProductId)
select new { ProductId = r.ProductId, Price = r.Price };
var query =
from p in lookupProducts
join r in dbQuery on p equals r.ProductId into rGroup
from r in rGroup.DefaultIfEmpty().Take(1)
select r?.Price;
var result = query.ToArray();
The Any and FirstOrDefault calls are both O(n), and the Any call is redundant. You can get roughly a 50% speedup just by removing the Any call. FirstOrDefault gives you back null when nothing matches, so use it to get the product object (remove the Select). If you really want to speed it up, loop through the products once and check whether prices[p.ProductId] != null before setting prices[p.ProductId] = p.Price.
There's a bit of extra code there:
var idsAndPrices = (from r in myReadings select
new { ProductId = r.ProductId, Price = r.Price })
.ToList();
for(int i=0;i<lookupProducts.Length;i++)
{
string id = lookupProducts[i];
prices[i] = idsAndPrices.FirstOrDefault(p => p.ProductId == id)?.Price;
}
better yet
Dictionary<string, float?> dp = new Dictionary<string, float?>();
foreach (var reading in myReadings)
dp.Add(reading.ProductId, reading.Price);
for (int i = 0; i < lookupProducts.Length; i++)
{
string id = lookupProducts[i];
if (dp.ContainsKey(id))
prices[i] = dp[id];
else
prices[i] = null;
}
This query I wrote is failing and I am not sure why.
What I'm doing is getting a list of user domain objects, projecting them to a view model while also calculating their ranking as the data will be shown on a leaderboard. This was how I thought of doing the query.
var users = Context.Users.Select(user => new
{
Points = user.UserPoints.Sum(p => p.Point.Value),
User = user
})
.Where(user => user.Points != 0 || user.User.UserId == userId)
.OrderByDescending(user => user.Points)
.Select((model, rank) => new UserScoreModel
{
Points = model.Points,
Country = model.User.Country,
FacebookId = model.User.FacebookUserId,
Name = model.User.FirstName + " " + model.User.LastName,
Position = rank + 1,
UserId = model.User.UserId,
});
return await users.FirstOrDefaultAsync(u => u.UserId == userId);
The exception message
System.NotSupportedException: LINQ to Entities does not recognize the method 'System.Linq.IQueryable`1[WakeSocial.BusinessProcess.Core.Domain.UserScoreModel] Select[<>f__AnonymousType0`2,UserScoreModel](System.Linq.IQueryable`1[<>f__AnonymousType0`2[System.Int32,WakeSocial.BusinessProcess.Core.Domain.User]], System.Linq.Expressions.Expression`1[System.Func`3[<>f__AnonymousType0`2[System.Int32,WakeSocial.BusinessProcess.Core.Domain.User],System.Int32,WakeSocial.BusinessProcess.Core.Domain.UserScoreModel]])' method, and this method cannot be translated into a store expression.
Unfortunately, EF does not know how to translate the version of Select which takes a lambda with two parameters (the value and the rank).
For your query two possible options are:
If the row set is very small, you could skip specifying Position in the query, read all UserScoreModels into memory (use ToListAsync), and calculate a value for Position in memory
If the row set is large, you could do something like:
var userPoints = Context.Users.Select(user => new
{
Points = user.UserPoints.Sum(p => p.Point.Value),
User = user
})
.Where(user => user.Points != 0 || user.User.UserId == userId);
var users = userPoints.OrderByDescending(user => user.Points)
.Select(model => new UserScoreModel
{
Points = model.Points,
Country = model.User.Country,
FacebookId = model.User.FacebookUserId,
Name = model.User.FirstName + " " + model.User.LastName,
Position = 1 + userPoints.Count(up => up.Points > model.Points), // users with more points rank ahead
UserId = model.User.UserId,
});
Note that this isn't exactly equivalent to the query you wrote, because two users with a tied point total get the same rank instead of being arbitrarily assigned different ranks. You could rewrite the logic to break ties on userId or some other measure if you want. This query might not be as nice and clean as you were hoping, but since you are ultimately selecting only one row by userId it hopefully won't be too bad. You could also split the rank-finding and the selection of base info into two separate queries, which might speed things up because each would be simpler.
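For the small-row-set option, the in-memory step could look roughly like this. It's a sketch under assumptions: UserScore is a cut-down hypothetical version of UserScoreModel, and the input is assumed to have been loaded already (for example with ToListAsync), so the indexed Select runs in LINQ to Objects, where it is supported.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced version of UserScoreModel.
public sealed class UserScore
{
    public int UserId { get; set; }
    public int Points { get; set; }
    public int Position { get; set; }
}

public static class Leaderboard
{
    // 'scores' must already be in memory; EF cannot translate the indexed Select.
    public static List<UserScore> Rank(IEnumerable<UserScore> scores)
    {
        return scores
            .OrderByDescending(s => s.Points)
            .Select((s, index) => new UserScore
            {
                UserId = s.UserId,
                Points = s.Points,
                Position = index + 1 // ties get consecutive positions in sort order
            })
            .ToList();
    }
}
```

Picking out the current user is then just a FirstOrDefault on the ranked list by UserId.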
I have two IList<Traffic> I need to combine.
Traffic is a simple class:
class Traffic
{
public long MegaBits;
public DateTime Time;
}
Each IList holds the same Times, and I need a single IList<Traffic>, where I have summed up the MegaBits, but kept the Time as key.
Is this possible using Linq ?
EDIT:
I forgot to mention that Time isn't necessarily unique in any list, multiple Traffic instances may have the same Time.
Also, I might run into X lists (more than 2). I should have mentioned that as well, sorry :-(
EXAMPLE:
IEnumerable<IList<Traffic>> trafficFromDifferentNics;
var combinedTraffic = trafficFromDifferentNics
.SelectMany(list => list)
.GroupBy(traffic => traffic.Time)
.Select(grp => new Traffic { Time = grp.Key, MegaBits = grp.Sum(tmp => tmp.MegaBits) });
The example above works, so thanks for your inputs :-)
this sounds more like
var store = firstList.Concat(secondList).Concat(thirdList)/* ... */;
var query = from item in store
group item by item.Time
into groupedItems
select new Traffic
{
MegaBits = groupedItems.Sum(groupedItem => groupedItem.MegaBits),
Time = groupedItems.Key
};
or, with your rework
IEnumerable<IList<Traffic>> stores;
var query = from store in stores
from item in store
group item by item.Time
into groupedItems
select new Traffic
{
MegaBits = groupedItems.Sum(groupedItem => groupedItem.MegaBits),
Time = groupedItems.Key
};
You could combine the items in both lists into a single set, then group on the key to get the sum before transforming back into a new set of Traffic instances.
var result = firstList.Concat(secondList)
.GroupBy(trf => trf.Time, trf => trf.MegaBits)
.Select(grp => new Traffic { Time = grp.Key, MegaBits = grp.Sum()});
That sounds like:
var query = from x in firstList
join y in secondList on x.Time equals y.Time
select new Traffic { MegaBits = x.MegaBits + y.MegaBits,
Time = x.Time };
Note that this will join in a pair-wise fashion, so if there are multiple elements with the same time in each list, you may not get the results you want.
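To illustrate that caveat, here is a small sketch comparing the join with the grouping approach when a Time value occurs more than once in one list. The TrafficCombiner class and its method names are made up for the example, and Traffic is written with public properties so it can be initialized from outside.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Traffic
{
    public long MegaBits { get; set; }
    public DateTime Time { get; set; }
}

public static class TrafficCombiner
{
    // Flattens any number of lists and sums MegaBits per Time: one row per Time.
    public static List<Traffic> SumByTime(IEnumerable<IEnumerable<Traffic>> lists)
    {
        return lists
            .SelectMany(list => list)
            .GroupBy(t => t.Time)
            .Select(g => new Traffic { Time = g.Key, MegaBits = g.Sum(t => t.MegaBits) })
            .ToList();
    }

    // Pair-wise join: produces one row per matching PAIR, so duplicate times
    // in either list multiply the output rows.
    public static List<Traffic> JoinPairwise(IEnumerable<Traffic> first,
                                             IEnumerable<Traffic> second)
    {
        return (from x in first
                join y in second on x.Time equals y.Time
                select new Traffic { MegaBits = x.MegaBits + y.MegaBits, Time = x.Time })
               .ToList();
    }
}
```

With two entries at the same Time in the first list and one in the second, the join returns two partial sums while the grouping returns a single total, which is why the grouping version matches the edited question better.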