NHibernate get first x distinct results does not get me x results? - c#

I have an ICriteria query like so:
var contentCriteria = DetachedCriteria.For<InvoiceItem>();
var countCriteria = DetachedCriteria.For<InvoiceItem>();
if (model.CurrentPage <= 0) model.CurrentPage = 1;
if (model.OnlyShowErrors)
{
contentCriteria.Add(Restrictions.Not(Restrictions.Eq("TroubleClass", TroubleClasses.Success)));
countCriteria.Add(Restrictions.Not(Restrictions.Eq("TroubleClass", TroubleClasses.Success)));
}
if (!string.IsNullOrEmpty(model.BatchId))
{
contentCriteria.Add(Restrictions.Eq("BatchId", model.BatchId));
countCriteria.Add(Restrictions.Eq("BatchId", model.BatchId));
}
if (model.DocumentStartDate != null)
{
contentCriteria.Add(Restrictions.Ge("DocumentDate", model.DocumentStartDate));
countCriteria.Add(Restrictions.Ge("DocumentDate", model.DocumentStartDate));
}
if (model.DocumentEndDate != null)
{
contentCriteria.Add(Restrictions.Le("DocumentDate", model.DocumentEndDate));
countCriteria.Add(Restrictions.Le("DocumentDate", model.DocumentEndDate));
}
if (!string.IsNullOrEmpty(model.VendorId))
{
contentCriteria.Add(Restrictions.Eq("VendorId", model.VendorId));
countCriteria.Add(Restrictions.Eq("VendorId", model.VendorId));
}
using (var session = GetSession())
{
var countC = countCriteria.GetExecutableCriteria(session)
.SetProjection(Projections.CountDistinct("RecordId"));
var contentC = contentCriteria
.AddOrder(Order.Desc("PersistedTimeStamp"))
.GetExecutableCriteria(session)
.SetResultTransformer(Transformers.DistinctRootEntity)
.SetFirstResult((model.CurrentPage * model.ItemsPerPage) - model.ItemsPerPage)
.SetMaxResults(model.ItemsPerPage);
var mq = session.CreateMultiCriteria()
.Add("total", countC)
.Add<InvoiceItem>("paged", contentC);
model.Invoices = ((IEnumerable<InvoiceItem>)mq.GetResult("paged"));
model.Invoices = model.Invoices
.OrderBy(x => x.PersistedTimeStamp);
model.TotalItems = (int)(mq.GetResult("total") as System.Collections.ArrayList)[0];
}
return model;
This returns results, but where I would expect each page to contain model.ItemsPerPage items, it rarely does. I think the .SetResultTransformer(Transformers.DistinctRootEntity) transform runs after the .SetMaxResults(model.ItemsPerPage) limit, and I don't know why or how to fix it. Can someone please enlighten me?

You need to look at the SQL generated by NHibernate. This is not really an NHibernate bug but the behavior of SQL queries when ROWNUM and DISTINCT are applied together. It has been on our known-issues list for a long time.
The following links might enlighten you:
ROW NUMBER vs DISTINCT
How ROWNUM works
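The effect described in the question can be reproduced without a database. The sketch below is only an in-memory illustration, assuming the mapping joins InvoiceItem to a child collection so that one invoice can occupy several SQL rows; the names PagingDemo, LimitThenDistinct and DistinctThenLimit are hypothetical. The row limit is applied to the joined rows first, and the distinct step only sees what the limit let through.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Mimics the original query: the row limit is applied to the joined rows,
    // then the distinct-root step runs in memory on whatever came back.
    public static List<int> LimitThenDistinct(IEnumerable<int> rowIds, int pageSize) =>
        rowIds.Take(pageSize).Distinct().ToList();

    // What the caller actually wants: distinct ids first, then the limit.
    public static List<int> DistinctThenLimit(IEnumerable<int> rowIds, int pageSize) =>
        rowIds.Distinct().Take(pageSize).ToList();

    public static void Main()
    {
        // Hypothetical joined result: invoice 1 has three child rows,
        // invoice 2 has two, the others have one each.
        var rowIds = new[] { 1, 1, 1, 2, 2, 3, 4, 5 };

        Console.WriteLine(string.Join(",", LimitThenDistinct(rowIds, 3))); // 1
        Console.WriteLine(string.Join(",", DistinctThenLimit(rowIds, 3))); // 1,2,3
    }
}
```

With a page size of 3, the first variant returns a single invoice, which matches the "rarely a full page" symptom.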

So this is directly related to what was written in this blog post. Additionally, I had the platform-specific complication of PostgreSQL not allowing a DISTINCT ordered set ordered by something not in the SELECT list. Ultimately, I had to make two calls to the database, like so:
using (var session = GetSession())
{
//I honestly hope I never have to reverse engineer this mess. Pagination in NHibernate
//when ordering by an additional column is a nightmare.
var countC = countCriteria.GetExecutableCriteria(session)
.SetProjection(Projections.CountDistinct("RecordId"));
var contentOrdered = contentCriteria
.SetProjection(Projections.Distinct(
Projections.ProjectionList()
.Add(Projections.Id())
.Add(Projections.Property("PersistedTimeStamp"))
))
.AddOrder(Order.Desc("PersistedTimeStamp"))
.SetFirstResult((model.CurrentPage * model.ItemsPerPage) - model.ItemsPerPage)
.SetMaxResults(model.ItemsPerPage);
var contentIds = contentOrdered.GetExecutableCriteria(session)
.List().OfType<IEnumerable<object>>()
.Select(s => (Guid)s.First())
.ToList();
var contentC = DetachedCriteria.For<InvoiceItem>()
.Add(Restrictions.In("RecordId", contentIds))
.SetResultTransformer(Transformers.DistinctRootEntity);
var mq = session.CreateMultiCriteria()
.Add("total", countC)
.Add("paged", contentC);
model.Invoices = (mq.GetResult("paged") as System.Collections.ArrayList)
.OfType<InvoiceItem>()
.OrderBy(x => x.PersistedTimeStamp);
model.TotalItems = (int)(mq.GetResult("total") as System.Collections.ArrayList)[0];
}
return model;
This is not pretty, but it worked; I think the folks over at NHibernate need to work on this and make it a tad bit easier.
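Stripped of the NHibernate specifics, the two-call approach above has the following shape. This is an in-memory sketch only, not the Criteria API; Row, PageIds and LoadDistinct are hypothetical names, and the list stands in for the joined SQL rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoStepPagingDemo
{
    // Hypothetical joined row: an invoice id repeats once per joined child row.
    public record Row(Guid Id, DateTime PersistedTimeStamp, string Child);

    // Step 1: page over distinct (id, timestamp) pairs, newest first.
    public static List<Guid> PageIds(IEnumerable<Row> rows, int page, int pageSize) =>
        rows.Select(r => (r.Id, r.PersistedTimeStamp))
            .Distinct()
            .OrderByDescending(p => p.PersistedTimeStamp)
            .Skip((page - 1) * pageSize)
            .Take(pageSize)
            .Select(p => p.Id)
            .ToList();

    // Step 2: fetch only the rows for those ids (the "IN (...)" query) and
    // collapse them to one entry per invoice, as DistinctRootEntity would.
    public static List<Guid> LoadDistinct(IEnumerable<Row> rows, ICollection<Guid> ids) =>
        rows.Where(r => ids.Contains(r.Id))
            .Select(r => r.Id)
            .Distinct()
            .ToList();

    public static void Main()
    {
        var a = Guid.NewGuid();
        var b = Guid.NewGuid();
        var c = Guid.NewGuid();
        var t = new DateTime(2020, 1, 1);
        var rows = new List<Row>
        {
            new Row(a, t.AddDays(3), "x"),
            new Row(a, t.AddDays(3), "y"),
            new Row(a, t.AddDays(3), "z"),
            new Row(b, t.AddDays(2), "x"),
            new Row(c, t.AddDays(1), "x"),
        };

        var ids = PageIds(rows, page: 1, pageSize: 2);
        Console.WriteLine(ids.Count);                     // 2
        Console.WriteLine(LoadDistinct(rows, ids).Count); // 2
    }
}
```

Because the limit is applied to distinct (id, timestamp) pairs, the page is full even though the first invoice spans three joined rows.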

Related

EF 6 - Performance of GroupBy

I don't have a problem currently, but I want to make sure that the performance is not too shabby for my use case. My search of Microsoft's documentation turned up nothing.
I have an entity named Reservation. I now want to add some statistics to the program, where I can see some metrics about the reservations (reservations per month and favorite spot/seat in particular).
Therefore, my first approach was the following:
public async Task<ICollection<StatisticElement<Seat>>> GetSeatUsage(Company company)
{
var allReservations = await this.reservationService.GetAll(company);
return await this.FetchGroupedSeatData(allReservations, company);
}
public async Task<ICollection<StatisticElement<DateTime>>> GetMonthlyReservations(Company company)
{
var allReservations = await this.reservationService.GetAll(company);
return this.FetchGroupedReservationData(allReservations);
}
private async Task<ICollection<StatisticElement<Seat>>> FetchGroupedSeatData(
IEnumerable<Reservation> reservations,
Company company)
{
var groupedReservations = reservations.GroupBy(r => r.SeatId).ToList();
var companySeats = await this.seatService.GetAll(company);
return (from companySeat in companySeats
let groupedReservation = groupedReservations.FirstOrDefault(s => s.Key == companySeat.Id)
select new StatisticElement<Seat>()
{
Value = companySeat,
StatisticalCount = groupedReservation?.Count() ?? 0,
}).OrderByDescending(s => s.StatisticalCount).ToList();
}
private ICollection<StatisticElement<DateTime>> FetchGroupedReservationData(IEnumerable<Reservation> reservations)
{
var groupedReservations = reservations.GroupBy(r => new { Month = r.Date.Month, Year = r.Date.Year }).ToList();
return groupedReservations.Select(
groupedReservation => new StatisticElement<DateTime>()
{
Value = new DateTime(groupedReservation.Key.Year, groupedReservation.Key.Month, 1),
StatisticalCount = groupedReservation.Count(),
}).
OrderBy(s => s.Value).
ToList();
}
To explain the code a little: with GetSeatUsage and GetMonthlyReservations I can get the above-mentioned data for a company. To do so, I first fetch ALL reservations (with reservationService.GetAll), and this is the point where I think performance will become a problem in the future.
Afterwards, I call either FetchGroupedSeatData or FetchGroupedReservationData, which first groups the reservations I previously fetched from the database and then converts them into a format I can use.
As I said, I think grouping after I have read ALL the data from the database MIGHT be a problem, but I cannot find any information regarding performance in the documentation.
My other idea was to create a new method in my ReservationService that returns the already grouped list. But, again, I can't find out whether EF adds the GroupBy to the DB query or performs it after all of the data has been read from the database. This method would look something like this:
return await this.Context.Set<Reservation>().Where(r => r.User.CompanyId == company.Id).GroupBy(r => r.SeatId).ToListAsync();
Is this already the solution? Where can I check that? Am I missing something completely obvious?
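One way to make the intent explicit is to project each group to a key and a count while the query is still an IQueryable. The sketch below uses an in-memory list with AsQueryable() as a stand-in for Context.Set<Reservation>(), and the Reservation class and CountPerSeat method are hypothetical minimal versions. As far as I know, EF 6 translates this GroupBy-plus-aggregate shape into a SQL GROUP BY, whereas grouping an IEnumerable returned by GetAll happens in memory; the generated SQL can be inspected (for example through the context's Database.Log property) to confirm this for a given model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // Hypothetical minimal shape of the Reservation entity from the question.
    public class Reservation
    {
        public int SeatId { get; set; }
        public DateTime Date { get; set; }
    }

    // Grouping and counting are composed on IQueryable, so a query provider
    // can translate them; only one (SeatId, Count) pair per seat is materialized.
    public static Dictionary<int, int> CountPerSeat(IQueryable<Reservation> reservations) =>
        reservations
            .GroupBy(r => r.SeatId)
            .Select(g => new { SeatId = g.Key, Count = g.Count() })
            .ToDictionary(x => x.SeatId, x => x.Count);

    public static void Main()
    {
        // In-memory stand-in for the DbSet (assumption for this sketch).
        var reservations = new List<Reservation>
        {
            new Reservation { SeatId = 1, Date = new DateTime(2020, 1, 5) },
            new Reservation { SeatId = 1, Date = new DateTime(2020, 2, 5) },
            new Reservation { SeatId = 2, Date = new DateTime(2020, 2, 9) },
        }.AsQueryable();

        var counts = CountPerSeat(reservations);
        Console.WriteLine(counts[1]); // 2
        Console.WriteLine(counts[2]); // 1
    }
}
```

Seats without any reservation do not appear in the result, so the caller would still fill in zero counts from the seat list, as FetchGroupedSeatData does.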

How to select multiple properties properly?

MyObject has two int properties named p1 and p2. For every MyObject I want to take p1 and p2 and add them up. I tried this:
int p1Sum = 0, p2Sum = 0;
foreach (int[] ps in new MyEntity().MyObject.Select(o => new { o.p1, o.p2 }))
{
p1Sum += ps[0];
p2Sum += ps[1];
}
but says:
cannot convert AnonymousType#1 to int[]
on foreach.
How can I fix this?
foreach (var ps in new MyEntity().MyObject.Select(o => new { o.p1, o.p2 }))
{
p1Sum += ps.p1;
p2Sum += ps.p2;
}
jyparask's answer will definitely work, but it's worth considering using Sum twice instead - it will involve two database calls, but it may (check!) avoid fetching all the individual values locally:
var entities = new MyEntity().MyObject;
var p1Sum = entities.Sum(x => x.p1);
var p2Sum = entities.Sum(x => x.p2);
Now there's at least logically the possibility of inconsistency here - some entities may be removed or added between the two Sum calls. However, it's possible that EF will ensure that doesn't happen (e.g. via caching) or it may not be relevant in your situation. It's definitely something you should consider.
In addition to Jon Skeet's and jyparask's answers, you can also try:
var result = new MyEntity().MyObject
.GroupBy(_ => 0)
.Select(r => new
{
p1Sum = r.Sum(x => x.p1),
p2Sum = r.Sum(x => x.p2)
})
.FirstOrDefault();
The above would result in a single query fetching only the sums of both columns. You may look at the generated query and its execution plan if you are concerned about the performance.
if(result != null)
{
Console.WriteLine("p1Sum = " + result.p1Sum);
Console.WriteLine("p2Sum = " + result.p2Sum);
}

Improve performance on LINQ Query

My LINQ query is slow when I loop through the results to build an XElement, which I later transform with XSLT.
Here is my code
public override XElement Search(SearchCriteria searchCriteria)
{
XElement root = new XElement("Root");
using (ReportOrderLogsDataContext dataContext = DataConnection.GetLinqDataConnection<ReportOrderLogsDataContext>(searchCriteria.GetConnectionString()))
{
try
{
IQueryable<vw_udisclosedDriverResponsePart> results = from a in dataContext.vw_udisclosedDriverResponseParts
where
(a.CreateDt.HasValue &&
a.CreateDt >= Convert.ToDateTime(searchCriteria.BeginDt) &&
a.CreateDt <= Convert.ToDateTime(searchCriteria.EndDt))
select a;
if (!string.IsNullOrEmpty(searchCriteria.AgentNumber))
{
results = results.Where(request => request.LgAgentNumber == searchCriteria.AgentNumber);
}
if (!string.IsNullOrEmpty(searchCriteria.AgentTitle))
{
results = results.Where(a => a.LgTitle == searchCriteria.AgentTitle);
}
if (!string.IsNullOrEmpty(searchCriteria.QuotePolicyNumber))
{
results = results.Where(a => a.QuotePolicyNumber == searchCriteria.QuotePolicyNumber);
}
if (!string.IsNullOrEmpty(searchCriteria.InsuredName))
{
results = results.Where(a => a.LgInsuredName.Contains(searchCriteria.InsuredName));
}
foreach (var match in results) // goes slow here, specifically times out before evaluating the first match when results are too large.
{
DateTime date;
string strDate = string.Empty;
if (DateTime.TryParse(match.CreateDt.ToString(), out date))
{
strDate = date.ToString("MM/dd/yyyy");
}
root.Add(new XElement("Record",
new XElement("System", "Not Supported"),
new XElement("Date", strDate),
new XElement("Agent", match.LgAgentNumber),
new XElement("UserId", match.LgUserId),
new XElement("UserTitle", match.LgTitle),
new XElement("QuoteNum", match.QuotePolicyNumber),
new XElement("AddressLine1", match.AddressLine1),
new XElement("AddressLine2", match.AddressLine2),
new XElement("City", match.City),
new XElement("State", match.State),
new XElement("Zip", match.Zip),
new XElement("DriverName", string.Concat(match.GivenName, " ", match.SurName)),
new XElement("DriverLicense", match.LicenseNumber),
new XElement("LicenseState", match.LicenseState)));
}
}
catch (Exception)
{
throw; // rethrow without resetting the stack trace
}
}
return root;
// return GetSearchedCriteriaFromStoredPocedure(searchCriteria);
}
I assume there is a better way to convert the results object into an XElement. Processing the view itself only takes about 2 seconds. Trying to loop through the results object is resulting in a timeout, even when many results are not returned.
Any help would be appreciated.
Thanks!
-James
AMENDED 7/10/2012
The issue is not with the LINQ query itself but with the execution of the view when a date range is specified. Executing the view by itself takes about 4-6 seconds. When a small date range (07/05/2012 - 07/10/2012) is used, the view takes around 1:30. Does anyone have any suggestions for improving the performance of the query when a date range is specified? It's faster if I get all of the results and loop through them checking the date.
i.e.
IQueryable<vw_udisclosedDriverResponsePart> results = from a in dataContext.vw_udisclosedDriverResponseParts select a;
foreach (var match in results) //results only takes 3 seconds to enumerate, before would timeout
{
// eval search criteria date here.
}
I can code it like I suggested above, but does anyone have a better way?
How does the database perform? The simplest test is to run a sample query - a query that will retrieve the data you need from the database, just to test database indexing and performance - because in 99% of cases that's the cause of slowness.
I would guess that the slowness is occurring because
you are iterating from the database, rather than retrieving all the rows up front, and
you are selecting on bad WHERE conditions (are your indexes correct?)
Firstly, call ToList to fetch the results up front, so you can confirm whether the slowness is happening in the database rather than in the XML construction:
if (!string.IsNullOrEmpty(searchCriteria.InsuredName))
{
//...
}
var matches = results.ToList();
foreach (var match in matches)
{
//...
Assuming that the var matches = results.ToList() is very slow, I'd look at the functions in the WHERE clause
(a.CreateDt.HasValue &&
a.CreateDt >= Convert.ToDateTime(searchCriteria.BeginDt) &&
a.CreateDt <= Convert.ToDateTime(searchCriteria.EndDt))
to check that they aren't being executed for every row.
If you use SQL Server, run Profiler (in the Tools menu) to trace the SQL that LINQ-to-SQL generates.
And, of course, do the conversion outside the LINQ query. The criteria won't change while the LINQ expression runs.
From what you posted, I made this example:
var begin = Convert.ToDateTime(searchCriteria.BeginDt);
var end = Convert.ToDateTime(searchCriteria.EndDt);
var results = from a in searchList
where ((a.CreateDt.HasValue &&
a.CreateDt >= begin &&
a.CreateDt <= end)
&& (string.IsNullOrEmpty(searchCriteria.AgentNumber) || a.LgAgentNumber == searchCriteria.AgentNumber)
&& (string.IsNullOrEmpty(searchCriteria.AgentTitle) || a.LgTitle == searchCriteria.AgentTitle)
&& (string.IsNullOrEmpty(searchCriteria.QuotePolicyNumber) || a.QuotePolicyNumber == searchCriteria.QuotePolicyNumber)
&& (string.IsNullOrEmpty(searchCriteria.InsuredName) || a.LgInsuredName.Contains(searchCriteria.InsuredName))
)
select a;
Perhaps this is helpful for you.
For measuring the time I used the following:
var watch = new Stopwatch();
watch.Start();
var arr = results.ToArray(); // force evaluation of linq
watch.Stop();
var elapsed = watch.ElapsedTicks;
The altered query already seems to be about 30-40% faster on average, but I only did a few runs.
I would suggest a few experiments:
One.
Put a
int count = results.Count();
before the foreach and see if this takes a long time.
Two.
Leave the Count() call in place and see if the foreach is still slow. If it is now fast, that would suggest that the initial connection to the db is slow.
As others suggested, have a look at how your query performs in the db (actually type it into the database, without C#).
You could also post a SHOW TABLE result so the community could inspect the indexes and help you with a fix.

NHibernate OR Criteria Query

I have the following mapped classes
Trade { ID, AccountFrom, AccountTo }
Account {ID, Company}
Company {ID}
Now I cannot figure out a way to select all trades where
AccountFrom.Company.ID = X OR AccountTo.Company.ID = X
I can get AND to work using the following:
criteria.CreateCriteria("AccountFrom").CreateCriteria("Company").Add(Restrictions.Eq("ID", X));
criteria.CreateCriteria("AccountTo").CreateCriteria("Company").Add(Restrictions.Eq("ID", X));
But how can I transform this into an OR rather than an AND? I have used Disjunction previously, but I cannot work out how to add separate criteria to it, only restrictions.
Try:
return session.CreateCriteria<Trade>()
.CreateAlias("AccountFrom", "af")
.CreateAlias("AccountTo", "at")
.Add(Restrictions.Or(
Restrictions.Eq("af.Company.CompanyId", companyId),
Restrictions.Eq("at.Company.CompanyId", companyId)))
.List<Trade>();
I don't think you will need to alias Company.
I think your NHibernate options depend on which version of NHibernate that you are using.
Disjunction = OR, Conjunction = AND
.Add(
Expression.Disjunction()
.Add(companyId1)
.Add(companyId2)
)
Same as this question here
Jamie Ide just answered more thoroughly...the gist of it goes like this:
.Add(Restrictions.Or(
Restrictions.Eq("object1.property1", criteriaValue),
Restrictions.Eq("object2.property3", criteriaValue)))
Using Linq to NHibernate:
var X = 0; // or whatever the identifier type.
var result = Session.Linq<Trade>()
.Where(trade => trade.AccountFrom.Company.ID == X ||
trade.AccountTo.Company.ID == X)
.ToList();
Using HQL:
var X = 0; // or whatever the identifier type.
var hql = "from Trade trade where trade.AccountFrom.Company.ID = :companyId or trade.AccountTo.Company.ID = :companyId";
var result = Session.CreateQuery(hql)
.SetParameter("companyId", X)
.List<Trade>();

Object mapping with LINQ and SubSonic

I'm building a small project with SubSonic 3.0.0.3 ActiveRecord and I'm running into an issue I can't seem to get past.
Here is the LINQ query:
var result = from r in Release.All()
let i = Install.All().Count(x => x.ReleaseId == r.Id)
where r.ProductId == productId
select new ReleaseInfo
{
NumberOfInstalls = i,
Release = new Release
{
Id = r.Id,
ProductId = r.ProductId,
ReleaseNumber = r.ReleaseNumber,
RevisionNumber = r.RevisionNumber,
ReleaseDate = r.ReleaseDate,
ReleasedBy = r.ReleasedBy
}
};
The ReleaseInfo object is a custom class and looks like this:
public class ReleaseInfo
{
public Release Release { get; set; }
public int NumberOfInstalls { get; set; }
}
Release and Install are classes generated by SubSonic.
When I do a watch on result, the Release property is null.
If I make this a simpler query and watch result, the value is not null.
var result = from r in Release.All()
let i = Install.All().Count(x => x.ReleaseId == r.Id)
where r.ProductId == productId
select new Release
{
Id = r.Id,
ProductId = r.ProductId,
ReleaseNumber = r.ReleaseNumber,
RevisionNumber = r.RevisionNumber,
ReleaseDate = r.ReleaseDate,
ReleasedBy = r.ReleasedBy
};
Is this an issue with my LINQ query or a limitation of SubSonic?
I think the issue might be that you're essentially duplicating the functionality of the ORM. The key thing to understand is this line:
from r in Release.All()
This line returns a list of fully-populated Release records for every item in your database. There should never be a need to new up a release anywhere else in your query - just return the ones that SubSonic has already populated for you!
Using this logic, you should be able to do the following:
var result = from r in Release.All()
select new ReleaseInfo {
Release = r,
NumberOfInstalls = Install.All().Count(x => x.ReleaseId == r.Id)
};
That being said, you should look at the Install.All() call, because that's likely to be tremendously inefficient. What that will do is pull every install from the database, hydrate those installs into objects, and then compare the id of every record in .NET to check if the record satisfies that condition. You can use the .Find method in SubSonic to only return certain records at the database tier, which should help performance significantly. Even still, inflating objects may still be expensive and you might want to consider a view or stored procedure here. But as a simple first step, the following should work:
var result = from r in Release.All()
select new ReleaseInfo {
Release = r,
NumberOfInstalls = Install.Find(x => x.ReleaseId == r.Id).Count()
};
I think I've found the actual answer to this problem. I've been rummaging around in the SubSonic source and found that there are two types of object projection that are used when mapping the datareader to objects: one for anonymous types and groupings and one for everything else:
Here is a snippet: Line 269 - 298 of SubSonic.Linq.Structure.DbQueryProvider
IEnumerable<T> result;
Type type = typeof (T);
//this is so hacky - the issue is that the Projector below uses Expression.Convert, which is a bottleneck
//it's about 10x slower than our ToEnumerable. Our ToEnumerable, however, stumbles on Anon types and groupings
//since it doesn't know how to instantiate them (I tried - not smart enough). So we do some trickery here.
if (type.Name.Contains("AnonymousType") || type.Name.StartsWith("Grouping`") || type.FullName.StartsWith("System.")) {
var reader = _provider.ExecuteReader(cmd);
result = Project(reader, query.Projector);
} else
{
using (var reader = _provider.ExecuteReader(cmd))
{
//use our reader stuff
//thanks to Pascal LaCroix for the help here...
var resultType = typeof (T);
if (resultType.IsValueType)
{
result = reader.ToEnumerableValueType<T>();
}
else
{
result = reader.ToEnumerable<T>();
}
}
}
return result;
Turns out that the SubSonic ToEnumerable tries to match the column names in the datareader to the properties in the object you're trying to project to. The SQL Query from my Linq looks like this:
SELECT [t0].[Id], [t0].[ProductId], [t0].[ReleaseDate], [t0].[ReleasedBy], [t0].[ReleaseNumber], [t0].[RevisionNumber], [t0].[c0]
FROM (
SELECT [t1].[Id], [t1].[ProductId], [t1].[ReleaseDate], [t1].[ReleasedBy], [t1].[ReleaseNumber], [t1].[RevisionNumber], (
SELECT COUNT(*)
FROM [dbo].[Install] AS t2
WHERE ([t2].[ReleaseId] = [t1].[Id])
) AS c0
FROM [dbo].[Release] AS t1
) AS t0
WHERE ([t0].[ProductId] = 2)
Notice the [t0].[c0] is not the same as my property name NumberOfInstalls. So the value of c0 never gets projected into my object.
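The mismatch can be illustrated with a much simplified name-based mapper. This is not SubSonic's code, only a hypothetical sketch of the matching rule described above: a column value is copied only when a property with exactly the same name exists, so the c0 column is silently dropped. ReleaseInfoFlat and MapByName are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class NameMappingDemo
{
    // Hypothetical flat target type with a property that has no matching column.
    public class ReleaseInfoFlat
    {
        public int Id { get; set; }
        public int NumberOfInstalls { get; set; }
    }

    // Simplified name-based mapper: copies a column only when a writable
    // property with exactly the same name exists on T.
    public static T MapByName<T>(IDictionary<string, object> row) where T : new()
    {
        var target = new T();
        foreach (var column in row)
        {
            var property = typeof(T).GetProperty(column.Key);
            if (property != null && property.CanWrite)
            {
                property.SetValue(target, column.Value);
            }
        }
        return target;
    }

    public static void Main()
    {
        // "c0" is the alias the generated SQL gives to the COUNT(*) subquery.
        var row = new Dictionary<string, object> { ["Id"] = 7, ["c0"] = 42 };
        var mapped = MapByName<ReleaseInfoFlat>(row);

        Console.WriteLine(mapped.Id);               // 7
        Console.WriteLine(mapped.NumberOfInstalls); // 0
    }
}
```

The count arrives in the reader but never reaches NumberOfInstalls, which stays at its default value.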
THE FIX:
You can simply take out the if statement and use the 10x slower projection and everything will work.
We have a bug with projections that trips on certain occasions - I think it's been patched but I need to test it more. I invite you to try the latest bits - I think we've fixed it... sorry to be so vague but a bug worked its way in between 3.0.0.1 and 3.0.0.3 and I haven't been able to find it.
Has this been fixed in 3.0.0.4? I was so peeved to find this post. After 2 days of trying to figure out why my projections were not working - except when the property names matched the query exactly - I ended up here.
I am so dependent on SS SimpleRepository that it is too late to turn back now. A bug like this is crippling. Any chance it is sorted out?
I went the 10x slower route for now so I can at least release to my client. Would much prefer the faster method to work correctly :)
