Efficient way of finding multiple dates per ID - c#

I'm trying to query my MsSQL Express database to find all CompanyIds that have multiple dates associated with them. By multiple dates, I mean dates that fall on different days.
E.g.

ID  UkDate      CompanyId
1   01/01/2015  16
2   01/01/2015  16
3   03/01/2015  18
4   05/01/2015  19
5   06/01/2015  20
6   08/01/2015  20

In the example above, only the rows with CompanyId 20 would be returned, because it occurs multiple times and those occurrences fall on different dates (CompanyId 16 also has multiple entries, but both are on the same date).
I'm not sure how to write the query for this using LINQ. My object is already an IQueryable<T>, but I'm not sure how to perform the query without executing the code and then 'finishing off' the query.
I'm not near Visual Studio but the code would be (please forgive typing errors, this is from memory)
//First, grab unique CompanyIds so each company is checked once
var uniqueIds = (from d in this._database.MyTable
                 select d.CompanyId).Distinct();
//This is the problem because on each iteration I'm re-querying the database!
foreach (var id in uniqueIds)
{
    var result = (from d in this._database.MyTable.OrderBy(a => a.UkDate)
                  where d.CompanyId == id
                  select d).ToList();
    //check for nulls
    if (result.First().UkDate.Date != result.Last().UkDate.Date)
    {
        this.AllResultsList.AddRange(result);
    }
}
Whilst it works without error, I don't feel the code is correct. It feels like a hack and inefficient, but this was my best effort. Is there a way I could reduce the number of database requests I make and achieve the same result?

It would be something along the lines of:
var results = myTable.GroupBy(x => x.CompanyID)
    .Where(g => g.GroupBy(g2 => g2.UkDate).Count() > 1)
    .Select(g => g.Key);
Live example (albeit with LinqToObjects, but the query should work against a database just fine): http://rextester.com/FPHI53553
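The grouping idea can be checked against the sample rows from the question. The following is only a minimal LINQ to Objects sketch written as C# top-level statements (C# 9 or later); the in-memory array and the tuple field names are assumptions for illustration, and it compares `UkDate.Date` so that any time-of-day part is ignored. Whether a given provider translates the same shape to SQL would need to be verified separately.

```csharp
using System;
using System.Linq;

// Sample rows from the question, held in memory (an assumption for illustration).
var rows = new[]
{
    (Id: 1, UkDate: new DateTime(2015, 1, 1), CompanyId: 16),
    (Id: 2, UkDate: new DateTime(2015, 1, 1), CompanyId: 16),
    (Id: 3, UkDate: new DateTime(2015, 1, 3), CompanyId: 18),
    (Id: 4, UkDate: new DateTime(2015, 1, 5), CompanyId: 19),
    (Id: 5, UkDate: new DateTime(2015, 1, 6), CompanyId: 20),
    (Id: 6, UkDate: new DateTime(2015, 1, 8), CompanyId: 20),
};

// Keep only companies whose rows fall on more than one calendar day.
var companyIds = rows
    .GroupBy(r => r.CompanyId)
    .Where(g => g.Select(r => r.UkDate.Date).Distinct().Count() > 1)
    .Select(g => g.Key)
    .ToList();

Console.WriteLine(string.Join(", ", companyIds)); // prints "20"
```

Company 16 is dropped because both of its rows share one date, and companies 18 and 19 are dropped because they have a single row each.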

var results = (from o in this._database.MyTable
               group o by o.CompanyId into grouped
               where (grouped.Max(s => s.UkDate) - grouped.Min(s => s.UkDate)).TotalDays > 0
               select grouped.Key);
Edit (by OP)
Final result:
var results = (from o in this._database.MyTable
               group o by o.CompanyId into grouped
               where (Convert.ToDateTime(grouped.Max(s => s.UkDate)) - Convert.ToDateTime(grouped.Min(s => s.UkDate))).TotalDays > 0
               from l in this._database.MyTable
               where l.CompanyId == grouped.Key
               select l).ToList();
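One thing worth checking with the Max/Min comparison: if UkDate carries a time of day, two entries on the same day still differ by a fraction of a day, so `TotalDays > 0` would be true. The sample data in the question shows dates only, so this may not apply there. The sketch below (in-memory values invented for illustration, C# top-level statements) shows the difference between comparing raw values and comparing the `.Date` parts.

```csharp
using System;
using System.Linq;

// Two hypothetical timestamps on the same calendar day.
var sameDay = new[]
{
    new DateTime(2015, 1, 1, 9, 0, 0),
    new DateTime(2015, 1, 1, 17, 0, 0),
};

// Raw difference: 8 hours is a third of a day, so the "> 0" test passes.
bool rawSpan = (sameDay.Max() - sameDay.Min()).TotalDays > 0;

// Difference of the date parts only: both are 1 January, so the span is zero.
bool dateSpan = (sameDay.Max().Date - sameDay.Min().Date).TotalDays > 0;

Console.WriteLine($"{rawSpan} {dateSpan}"); // prints "True False"
```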

A little different version:
var result = (from o in this._database.MyTable
              group o by o.CompanyId into grouped
              select new {
                  grouped.Key,
                  Count = grouped.Select(c => c.UkDate).Distinct().Count()
              } into filter
              where filter.Count > 1
              join a in this._database.MyTable on filter.Key equals a.CompanyID
              select new { a.CompanyID, a.UkDate }
             ).ToList();

You can also try this if you want the company id and a count of the different dates:
from c in dataTable
group c by c.CompanyId into grouped
let count = grouped.Select(x => x.UkDate).Distinct().Count()
where count > 1
select new { CompanyId = grouped.Key, Count = count };

Related

Group in Linq with Max: How do I get the full data set of the max value?

What I have so far is this:
var a = from e in tcdb.timeclockevent
        group e by e.workerId into r
        select new { workerId = r.Key, Date = r.Max(d => d.timestamp) };
This query gives me the latest "timestamp" for every workerId (note: workerId is not the primary key of tcdb.timeclockevent). So it only gives me pairs of two values, but I need the whole records.
Does anybody know how I can get the whole records of tcdb.timeclock with the maximal timestamp for every workerId?
OR
Does anybody know how I can get the Id of the record with the maximal date for each worker?
Thank you in advance :)
You can order your r grouping by timestamp in descending order and select the first one:
var a = from e in tcdb.timeclockevent
        group e by e.workerId into r
        select r.OrderByDescending(d => d.timestamp).FirstOrDefault();
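To see what the "order the group and take the first" pattern returns, here is a small in-memory sketch. The tuple shape and the sample events are invented for illustration, and it runs as LINQ to Objects with C# top-level statements, so it says nothing about the SQL a provider would generate.

```csharp
using System;
using System.Linq;

// Hypothetical clock events: (Id, WorkerId, Timestamp).
var events = new[]
{
    (Id: 1, WorkerId: 7, Timestamp: new DateTime(2020, 1, 1, 8, 0, 0)),
    (Id: 2, WorkerId: 7, Timestamp: new DateTime(2020, 1, 2, 8, 0, 0)),
    (Id: 3, WorkerId: 9, Timestamp: new DateTime(2020, 1, 1, 9, 0, 0)),
};

// For each worker, sort that worker's events newest-first and keep the first one.
var latest = events
    .GroupBy(e => e.WorkerId)
    .Select(g => g.OrderByDescending(e => e.Timestamp).First())
    .OrderBy(e => e.WorkerId)
    .ToList();

foreach (var ev in latest)
    Console.WriteLine($"{ev.WorkerId}: event {ev.Id}");
// prints "7: event 2" then "9: event 3"
```

Each result is a whole record, so the Id and every other column of the newest event are available.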
Does anybody know how I can get the whole datasets of tcdb.timeclock with the maximal timestamp for every workerId?
Well, the straightforward query would be like this:
var queryA =
    from e in tcdb.timeclockevent
    group e by e.workerId into g
    let maxDate = g.Max(e => e.timestamp)
    select new { workerId = g.Key, events = g.Where(e => e.timestamp == maxDate) };
If you don't need an IQueryable<T> result, and since no SQL construct directly returns a grouped result set, you could try the following query. It filters the records with the maximal timestamp for every workerId inside the database in a different way, and then does the grouping in memory:
var queryB = tcdb.timeclockevent
    .Where(e => !tcdb.timeclockevent.Any(e2 =>
        e2.workerId == e.workerId && e2.timestamp > e.timestamp))
    .AsEnumerable()
    .GroupBy(e => e.workerId);
You can try and see which one performs better with your data.
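The "no newer event exists" filter used in queryB can be tried out in memory. The sketch below uses invented sample data and tuple names (assumptions, not the real tcdb schema) and C# top-level statements. It also shows one property of this filter: when two events of the same worker tie for the maximal timestamp, both are kept.

```csharp
using System;
using System.Linq;

// Hypothetical clock events: (Id, WorkerId, Timestamp). Events 2 and 3 tie.
var events = new[]
{
    (Id: 1, WorkerId: 7, Timestamp: new DateTime(2020, 1, 1)),
    (Id: 2, WorkerId: 7, Timestamp: new DateTime(2020, 1, 2)),
    (Id: 3, WorkerId: 7, Timestamp: new DateTime(2020, 1, 2)),
    (Id: 4, WorkerId: 9, Timestamp: new DateTime(2020, 1, 1)),
};

// Keep an event only if no other event of the same worker is newer.
var newestIds = events
    .Where(e => !events.Any(e2 => e2.WorkerId == e.WorkerId && e2.Timestamp > e.Timestamp))
    .Select(e => e.Id)
    .OrderBy(id => id)
    .ToList();

Console.WriteLine(string.Join(", ", newestIds)); // prints "2, 3, 4"
```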

LINQ - Join 2 tables, Group by DateTime.Month , multiple Counts

I'm pretty new to C# and LINQ, and I'm trying to get a list of emails that holds the number of emails, attachments and users (the ones that sent the emails).
My current problem is that the output of my query is wrong: the number of emails is equal to the number of attachments, which is obviously wrong.
My query:
var monthQuery = from em in dbEdoka.email
                 join ema in dbEdoka.email_attachment on em.id equals ema.email_id into e
                 from e2 in e.DefaultIfEmpty()
                 group e2 by em.erstellt_am.Month into grouped
                 select new Entities.Month
                 {
                     NameOfMonth = grouped.FirstOrDefault().erstellt_am.ToString(),
                     NumberOfMails = grouped.Distinct().Count(m => m.email_id != null).ToString(),
                     NumberOfAttachments = grouped.Count(a => a.id != null).ToString(),
                     NumberOfUsers = grouped.Select(u => u.erstellt_von).Distinct().Count().ToString()
                 };
months = monthQuery.ToList();
Months = CollectionViewSource.GetDefaultView(months);
As you can see, I had to take m.email_id from dbEdoka.email_attachment instead of m.id from dbEdoka.email because it wasn't available (I don't know why...).
I still have to count "NumberOfMails", "NumberOfAttachments" and "NumberOfUsers".
Thank you!

Entity framework - select by multiple conditions in same column - referenced table

Example scenario:
Two tables: order and orderItem, relationship One to Many.
I want to select all orders that have at least one orderItem with price 100 and at least one orderItem with price 200.
I can do it like this:
var orders = (from o in kontextdbs.orders
join oi in kontextdbs.order_item on o.id equals oi.order_id
join oi2 in kontextdbs.order_item on o.id equals oi2.order_id
where oi.price == 100 && oi2.price == 200
select o).Distinct();
But what if those conditions are user generated, so that I don't know how many conditions there will be?
You need to loop through all the values, adding a Where with an Any condition for each one, like this:
List<int> values = new List<int> { 100, 200 };
var orders = from o in kontextdbs.orders
             select o;
foreach (int value in values)
{
    // Copy the loop variable so each lambda captures its own value
    int tmpValue = value;
    orders = orders.Where(x => kontextdbs.order_item.Where(oi => x.id == oi.order_id)
                                                    .Any(oi => oi.price == tmpValue));
}
orders = orders.Distinct();
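The loop above builds up one Where filter per required value. A minimal in-memory sketch of the same pattern is shown here; the order items and tuple names are invented for illustration, it runs as LINQ to Objects with C# top-level statements, and it works on order ids rather than entity objects to stay self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical order items: (OrderId, Price).
var items = new[]
{
    (OrderId: 1, Price: 100),
    (OrderId: 1, Price: 200),
    (OrderId: 2, Price: 100),
    (OrderId: 3, Price: 200),
};
var requiredPrices = new List<int> { 100, 200 };

// Start from every order id, then add one filter per required price.
IEnumerable<int> orderIds = items.Select(i => i.OrderId).Distinct();
foreach (int price in requiredPrices)
{
    int wanted = price; // explicit copy; needed before C# 5, harmless after
    orderIds = orderIds.Where(id => items.Any(i => i.OrderId == id && i.Price == wanted));
}

var result = orderIds.ToList();
Console.WriteLine(string.Join(", ", result)); // prints "1"
```

Only order 1 has both an item priced 100 and an item priced 200, so it is the only id that survives both filters.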
List<int> orderValues = new List<int> { 100, 200 };
ObjectQuery<Order> orders = kontextdbs.Orders;
foreach (int value in orderValues) {
    orders = (ObjectQuery<Order>)(from o in orders
                                  join oi in kontextdbs.order_item
                                      on o.id equals oi.order_id
                                  where oi.price == value
                                  select o);
}
orders = orders.Distinct();
This ought to work, or at least that's the general pattern: you can apply extra queries to the query object at each stage.
Note that, in my experience, generating dynamic queries like this with EF unfortunately gives terrible performance: it spends a few seconds compiling each one into SQL the first time it sees a specific pattern. If the number of order values is fairly stable, though, this particular query ought to work OK.

C# LINQ query (MYSQL EF) - Distinct and Latest Records

I have a table, let's call it Record, containing:
ID (int) | CustID (int) | Time (datetime) | Data (varchar)
I need the latest (most recent) record for each customer:
SQL
select * from record as i group by i.custid having max(id);
LINQ version 1
dgvLatestDistinctRec.DataSource = from g in ee.Records
                                  group g by g.CustID into grp
                                  select grp.LastOrDefault();
This throws an error:
System.NotSupportedException was unhandled by user code
Message=LINQ to Entities does not recognize the method 'Faizan_Kazi_Utils.Record LastOrDefault[Record](System.Collections.Generic.IEnumerable`1[Faizan_Kazi_Utils.Record])' method, and this method cannot be translated into a store expression.
Source=System.Data.Entity
LINQ version 2
var list = (from g in ee.Records
            group g by g.CustID into grp
            select grp).ToList();
Record[] list2 = (from grp in list
                  select grp.LastOrDefault()).ToArray();
dgvLatestDistinctRec.DataSource = list2;
This works, but is inefficient because it loads ALL records from the database into memory and then extracts just the last (most recent member) of each group.
Is there any LINQ solution that approaches the efficiency and readability of the mentioned SQL solution?
Update:
var results = (from rec in Record
               group rec by rec.CustID into grp
               select new
               {
                   CustID = grp.Key,
                   ID = grp.OrderByDescending(r => r.ID).Select(x => x.ID).FirstOrDefault(),
                   Data = grp.OrderByDescending(r => r.ID).Select(x => x.Data).FirstOrDefault()
               });
So I made a test table and wrote a LINQ to SQL query that should do what you need. Take a look at this and let me know what you think. One thing to keep in mind if this query is scaled up: I believe it will run a query against the DB for each CustID after the grouping, because of the subqueries in the select new. The only way to be sure is to run a SQL tracer while the query executes; for info on that, see http://www.foliotek.com/devblog/tuning-sql-server-for-programmers/
Original:
Could you do something like this?
from g in ee.Records
where g.ID == (from x in ee.Records where x.CustID == g.CustID select x.ID).Max()
select g
That's all pseudo code but hopefully you get the idea.
I'm probably too late to help with your problem, but I had a similar issue and was able to get the desired results with a query like this:
from g in ee.Records
group g by g.CustID into grp
from last in (from custRec in grp
              where custRec.Id == grp.Max(cr => cr.Id)
              select custRec)
select last
What if you replace LastOrDefault() with a simple Last()?
(Yes, you will have to check that your records table isn't empty.)
I can't see how MySQL could return a "default" group for you; this is not something that can be simply translated to SQL.
I think grp.LastOrDefault(), a C# method, is something that SQL doesn't know about. LINQ turns your query into an SQL query for your DB server to understand. You might want to try creating a stored procedure instead, or find another way to filter out what you're looking for.
The reason your second query works is that the first query returns a list, and you then run a LINQ query (to filter out what you need) on that C# list, which implements IEnumerable and so understands grp.LastOrDefault().
I had another idea:
// Get a list of all the IDs I need by
// grouping by CustID and then selecting the max ID from each group.
var distinctLatest = (from x in ee.Records
                      group x by x.CustID into grp
                      select grp.Max(g => g.id)).ToArray();
// List<Record> result = new List<Record>();
// Now we can retrieve individual records using the IDs retrieved above
// foreach (int i in distinctLatest)
// {
//     var res = from g in ee.Records where g.id == i select g;
//     var arr = res.ToArray();
//     result.Add(res.First());
// }
// alternate version of foreach
dgvLatestDistinctRec.DataSource = from g in ee.Records
                                  join i in distinctLatest
                                      on g.id equals i
                                  select g;
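The two-step idea above (collect the maximal ID per customer, then join back to fetch the full rows) can be sketched in memory as follows. The records and tuple names are invented for illustration, and this is LINQ to Objects written as C# top-level statements, so it does not show how EF or MySQL would execute it.

```csharp
using System;
using System.Linq;

// Hypothetical records: (Id, CustId, Data).
var records = new[]
{
    (Id: 1, CustId: 10, Data: "a"),
    (Id: 2, CustId: 10, Data: "b"),
    (Id: 3, CustId: 11, Data: "c"),
};

// Step 1: the highest Id within each customer's group.
var latestIds = records
    .GroupBy(r => r.CustId)
    .Select(g => g.Max(r => r.Id))
    .ToArray();

// Step 2: join back on Id to get the complete latest record per customer.
var latest = records
    .Join(latestIds, r => r.Id, id => id, (r, id) => r)
    .ToList();

foreach (var rec in latest)
    Console.WriteLine($"{rec.CustId}: {rec.Data}");
// prints "10: b" then "11: c"
```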

LINQ group by month question

I'm new to LINQ to SQL and I would like to know how to achieve something like this in LINQ:
Month  Hires  Terminations
Jan    5      7
Feb    8      8
Mar    8      5
I've got this so far, and I think there is something wrong with it but I'm not sure:
from term1 in HRSystemDB.Terminations
group term1 by new { term1.TerminationDate.Month, term1.TerminationDate.Year } into grpTerm
select new HiresVsTerminationsQuery
{
    Date = Criteria.Period,
    TerminationsCount = grpTerm.Count(term => term.TerminationDate.Month == Criteria.Period.Value.Month),
    HiresCount = (from emp in HRSystemDB.Persons.OfType<Employee>()
                  group emp by new { emp.HireDate.Month, emp.HireDate.Year } into grpEmp
                  select grpEmp).Count(e => e.Key.Month == Criteria.Period.Value.Month)
};
Thanks in advance.
I'm not quite sure where the Criteria.Period value comes from in your sample query.
However, I think you're trying to read both hires and terminations for all available months (and then you can easily filter the result). Your query could go wrong if the first table (Terminations) didn't include any records for some month (say May). Then the select clause wouldn't be called for "May" at all, and even if you had some data for that month in the second table (representing hires), you wouldn't be able to find it.
This can be elegantly solved using the Concat method (see MSDN samples). You could select all terminations and all hires (into a data structure of some type) and then group all the data by month:
var terms = from t in HRSystemDB.Terminations
            select new { Month = t.TerminationDate.Month,
                         Year = t.TerminationDate.Year,
                         IsHire = false };
var hires = from emp in HRSystemDB.Persons.OfType<Employee>()
            select new { Month = emp.HireDate.Month,
                         Year = emp.HireDate.Year,
                         IsHire = true };
// Now we can merge the two inputs into one
var summary = terms.Concat(hires);
// And group the data by year and month
var res = from s in summary
          group s by new { s.Year, s.Month } into g
          select new { Period = g.Key,
                       Hires = g.Count(info => info.IsHire),
                       Terminations = g.Count(info => !info.IsHire) };
Looking at the code now, I'm pretty sure there is some shorter way to write this. On the other hand, this code should be quite readable, which is a benefit. Also note that it doesn't matter that we split the code into a couple of sub-queries. Thanks to the lazy evaluation of LINQ to SQL, this should be executed as a single query.
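The Concat-then-group approach can be tried on a few in-memory dates. The dates below are invented for illustration and the code is a LINQ to Objects sketch using C# top-level statements and value tuples instead of the anonymous types above. Note that March has a hire but no termination and still shows up in the result, which is the case the approach is meant to handle.

```csharp
using System;
using System.Linq;

// Hypothetical termination and hire dates.
var terminations = new[] { new DateTime(2010, 1, 5), new DateTime(2010, 1, 20), new DateTime(2010, 2, 3) };
var hires = new[] { new DateTime(2010, 1, 7), new DateTime(2010, 3, 1) };

// Project both inputs to the same shape, then merge them into one sequence.
var events = terminations.Select(d => (d.Year, d.Month, IsHire: false))
    .Concat(hires.Select(d => (d.Year, d.Month, IsHire: true)));

// Group by (Year, Month) and count each kind of event per period.
var summary = events
    .GroupBy(e => (e.Year, e.Month))
    .OrderBy(g => g.Key)
    .Select(g => (g.Key.Year, g.Key.Month,
                  Hires: g.Count(e => e.IsHire),
                  Terminations: g.Count(e => !e.IsHire)))
    .ToList();

foreach (var s in summary)
    Console.WriteLine($"{s.Year}-{s.Month:D2}: hires={s.Hires}, terminations={s.Terminations}");
// prints:
// 2010-01: hires=1, terminations=2
// 2010-02: hires=0, terminations=1
// 2010-03: hires=1, terminations=0
```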
I don't know if it is shorter, but you can also try this version to see if it works better with your server. I don't know exactly how these two answers turn into SQL statements. One might be better depending on your indexes and such.
var terms =
    from t in Terminations
    group t by new { t.TerminationDate.Month, t.TerminationDate.Year } into g
    select new { g.Key, Count = g.Count() };
var hires =
    from p in Persons.OfType<Employee>()
    group p by new { p.HireDate.Month, p.HireDate.Year } into g
    select new { g.Key, Count = g.Count() };
var summary =
    from t in terms
    join h in hires on t.Key equals h.Key
    select new { t.Key.Month, t.Key.Year,
                 Hires = h.Count, Terms = t.Count };
