I have the following three tables, and need to bring in information from two dissimilar tables.
Table baTable has fields OrderNumber and Position.
Table accessTable has fields OrderNumber and ProcessSequence (among others).
Table historyTable has fields OrderNumber and Time (among others).
var progress = from ba in baTable
from ac in accessTable
where ac.OrderNumber == ba.OrderNumber
select new {
Position = ba.Position.ToString(),
Time = "",
Seq = ac.ProcessSequence.ToString()
};
progress = progress.Concat(from ba in baTable
from hs in historyTable
where hs.OrderNumber == ba.OrderNumber
select new {
Position = ba.Position.ToString(),
Time = String.Format("{0:hh:mm:ss}", hs.Time),
Seq = ""
});
int searchRecs = progress.Count();
The query compiles successfully, but when the SQL executes during the call to Count(), I get an error
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.
Clearly the two lists each have three items, one of which is a constant. Other help boards suggested that the Visual Studio 2010 C# compiler was optimizing out the constants, and I have experimented with alternatives to the constants.
The most surprising thing is that, if the Time= entry within the select new {...} is commented out in both of the sub-queries, no error occurs when the SQL executes.
I actually think the problem is that your String.Format(..) call cannot be translated to SQL.
Change your second query to:
progress = progress.Concat(from ba in baTable
from hs in historyTable
where hs.OrderNumber == ba.OrderNumber
select new {
Position = ba.Position.ToString(),
Time = hs.Time.ToString(),
Seq = ""
});
After that you can always loop through progress and format the Time to your needs.
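The "format it afterwards" step can be sketched as follows. This is a minimal illustration and a variation on the answer above, not the answer's exact code: it assumes the query returns the raw values instead of strings, that Time is a DateTime, and that the rows have already been materialized (for example with ToList()), so the formatting runs in memory. RawProgress, DisplayProgress and ProgressFormatter are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical row shape holding the raw, unformatted values from the query.
public sealed class RawProgress
{
    public int Position { get; set; }
    public DateTime? Time { get; set; }  // null for rows that came from accessTable
    public int? Seq { get; set; }        // null for rows that came from historyTable
}

// Hypothetical display shape, matching the three string columns in the question.
public sealed class DisplayProgress
{
    public string Position { get; set; }
    public string Time { get; set; }
    public string Seq { get; set; }
}

public static class ProgressFormatter
{
    // Formats rows that are already in memory, so nothing here
    // has to be translated to SQL.
    public static List<DisplayProgress> Format(IEnumerable<RawProgress> rows)
    {
        var inv = CultureInfo.InvariantCulture;
        return rows.Select(r => new DisplayProgress
        {
            Position = r.Position.ToString(inv),
            // Same "hh:mm:ss" pattern as the String.Format call in the question.
            Time = r.Time.HasValue ? r.Time.Value.ToString("hh:mm:ss", inv) : "",
            Seq = r.Seq.HasValue ? r.Seq.Value.ToString(inv) : ""
        }).ToList();
    }
}
```

Calling ProgressFormatter.Format on the materialized rows yields the three string columns from the question, with empty strings where a value is missing.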
In my service, first I generate 40,000 possible combinations of home and host countries, like so (clientLocations contains 200 records, so 200 x 200 is 40,000):
foreach (var homeLocation in clientLocations)
{
foreach (var hostLocation in clientLocations)
{
allLocationCombinations.Add(new AirShipmentRate
{
HomeCountryId = homeLocation.CountryId,
HomeCountry = homeLocation.CountryName,
HostCountryId = hostLocation.CountryId,
HostCountry = hostLocation.CountryName,
HomeLocationId = homeLocation.LocationId,
HomeLocation = homeLocation.LocationName,
HostLocationId = hostLocation.LocationId,
HostLocation = hostLocation.LocationName,
});
}
}
Then I run the following query to find existing rates for the locations above and to include empty entries for the missing rates, resulting in a complete recordset of 40,000 rows.
var allLocationRates = (from l in allLocationCombinations
join r in Db.PaymentRates_AirShipment
on new { home = l.HomeLocationId, host = l.HostLocationId }
equals new { home = r.HomeLocationId, host = (Guid?)r.HostLocationId }
into matches
from rate in matches.DefaultIfEmpty(new PaymentRates_AirShipment
{
Id = Guid.NewGuid()
})
select new AirShipmentRate
{
Id = rate.Id,
HomeCountry = l.HomeCountry,
HomeCountryId = l.HomeCountryId,
HomeLocation = l.HomeLocation,
HomeLocationId = l.HomeLocationId,
HostCountry = l.HostCountry,
HostCountryId = l.HostCountryId,
HostLocation = l.HostLocation,
HostLocationId = l.HostLocationId,
AssigneeAirShipmentPlusInsurance = rate.AssigneeAirShipmentPlusInsurance,
DependentAirShipmentPlusInsurance = rate.DependentAirShipmentPlusInsurance,
SmallContainerPlusInsurance = rate.SmallContainerPlusInsurance,
LargeContainerPlusInsurance = rate.LargeContainerPlusInsurance,
CurrencyId = rate.RateCurrencyId
});
I have tried using .AsEnumerable() and .AsNoTracking() and that has sped things up quite a bit. The following code shaves several seconds off of my query:
var allLocationRates = (from l in allLocationCombinations.AsEnumerable()
join r in Db.PaymentRates_AirShipment.AsNoTracking()
But, I am wondering: How can I speed this up even more?
Edit: I can't replicate the foreach functionality in LINQ.
allLocationCombinations = (from homeLocation in clientLocations
from hostLocation in clientLocations
select new AirShipmentRate
{
HomeCountryId = homeLocation.CountryId,
HomeCountry = homeLocation.CountryName,
HostCountryId = hostLocation.CountryId,
HostCountry = hostLocation.CountryName,
HomeLocationId = homeLocation.LocationId,
HomeLocation = homeLocation.LocationName,
HostLocationId = hostLocation.LocationId,
HostLocation = hostLocation.LocationName
});
I get an error on from hostLocation in clientLocations which says "cannot convert type IEnumerable to Generic.List."
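The error message suggests that allLocationCombinations is declared as a List, while a LINQ query expression produces an IEnumerable. A minimal sketch of the usual remedy, under that assumption, is to materialize the query with ToList(). The ClientLocation and LocationPair types below are simplified, hypothetical stand-ins for the question's classes (int ids instead of Guid, fewer properties).

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-ins for the question's types.
public sealed class ClientLocation
{
    public int LocationId { get; set; }
    public string LocationName { get; set; }
}

public sealed class LocationPair
{
    public int HomeLocationId { get; set; }
    public int HostLocationId { get; set; }
}

public static class LocationCombinations
{
    // Builds every home/host combination (n x n rows) as a List.
    public static List<LocationPair> Build(IEnumerable<ClientLocation> clientLocations)
    {
        // Two "from" clauses form the cross product, like the nested foreach.
        // ToList() turns the resulting IEnumerable into the List the variable expects.
        List<LocationPair> all =
            (from home in clientLocations
             from host in clientLocations
             select new LocationPair
             {
                 HomeLocationId = home.LocationId,
                 HostLocationId = host.LocationId
             }).ToList();
        return all;
    }
}
```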
The fastest way to query a database is to use the power of the database engine itself.
While Linq is a fantastic technology to use, it still generates a select statement out of the Linq query, and runs this query against the database.
Your best bet is to create a database View, or a stored procedure.
Views and stored procedures can easily be integrated into Linq.
Indexed views (the MS SQL counterpart of materialized views) can further speed up execution, and adding missing indexes is by far the most effective way to speed up database queries.
How can I speed this up even more?
Optimizing is a bitch.
Your code looks fine to me. Make sure to set the index on your DB schema where it's appropriate. And as already mentioned: Run your Linq against SQL to get a better idea of the performance.
Well, but how to improve performance anyway?
You may want to have a glance at the following link:
10 tips to improve LINQ to SQL Performance
To me, probably the most important points listed (in the link above):
Retrieve Only the Number of Records You Need
Turn off ObjectTrackingEnabled Property of Data Context If Not Necessary
Filter Data Down to What You Need Using DataLoadOptions.AssociateWith
Use compiled queries when it's needed (please be careful with that one...)
I have two DataTables named 'dst' and 'dst2'. They are located in the DataSet 'urenmat'.
The majority of the data is in 'dst'. It contains a column named 'werknemer', whose value corresponds to a certain row in 'dst2', where the matching column is named 'nummer'.
What I need is a way to left outer join both DataTables where dst.werknemer and dst2.nummer are linked, producing a new DataTable that contains 'dst2.naam' linked to 'dst.werknemer' along with all the other columns from 'dst'.
I have looked everywhere and still can't seem to find the right answer to my question. Several sites provide a way using LINQ in this situation. I have tried using LINQ, but I am not very skilled at it.
I tried using the 101 LINQ Samples:
http://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b
urenmat = dataset.
dst = a, b, c, d, werknemer.
dst2 = nummer, naam.
I used the following code from '101'.
var query =
from contact in dst.AsEnumerable()
join order in dst2.AsEnumerable()
on contact.Field<string>("werknemer") equals
order.Field<string>("nummer")
select new
{
a = order.Field<string>("a"),
b = order.Field<string>("b"),
c = order.Field<string>("c"),
d = order.Field<string>("d"),
naam = contact.Field<decimal>("naam")};
However, I don't know what to change 'contact' and 'order' to, and I can't find out how to save the result to a DataTable again.
I am very sorry if these are stupid questions; I have tried to solve it myself without success. Thanks for the help in advance!
PS. I am using C#, and the DataSet and DataTables are typed.
If you want to produce a projected dataset of dst left outer joined to dst2, you can use this LINQ expression (sorry, I don't really work in LINQ query syntax, so you'll have to use this lambda syntax instead).
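var query = dst.AsEnumerable()
    .GroupJoin(dst2.AsEnumerable(),
        contact => contact.Field<string>("werknemer"),
        order => order.Field<string>("nummer"),
        (contact, orders) => new { contact, orders })
    .SelectMany(x => x.orders.DefaultIfEmpty(), (x, order) => new
    {
        // a, b, c and d live in dst (contact); naam lives in dst2 (order)
        a = x.contact.Field<string>("a"),
        b = x.contact.Field<string>("b"),
        c = x.contact.Field<string>("c"),
        d = x.contact.Field<string>("d"),
        // order is null when no dst2 row matches; column type taken from the question
        naam = order == null ? null : order.Field<decimal?>("naam")
    });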
Because this is a projected dataset, you cannot simply save it back to the DataTable. If saving is desired, you would want to load the affected row, update the desired fields, then save the changes.
// untyped
var row = dst.Rows.Find(keyValue);
// typed (alternative to the line above)
var row = dst.FindBy...(keyValue);
// update the field
row.SetField("a", "some value");
// commit only this row's pending changes (in memory; this does not write to a database)
row.AcceptChanges();
// or, after all changes to the table have been made, commit the whole table
dst.AcceptChanges();
Normally if you need to perform loading and saving of (projected) data, an ORM (like entity framework, or LINQ-to-SQL) would be the best tool. However, you are using DataTable's in this case and I'm not sure if you can link an ORM to these (though it seems like it would probably be possible).
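Since the question asks for a new DataTable holding the joined data, one possible approach is to build that table by hand from the left outer join. This is a rough sketch rather than a definitive implementation. It assumes 'werknemer' and 'nummer' are string columns (as in the question's code) and that dst has no column already called 'naam'. LeftJoinToTable is a hypothetical helper name.

```csharp
using System;
using System.Data;
using System.Linq;

public static class LeftJoinToTable
{
    // Returns a new DataTable with every column of dst plus a "naam" column
    // looked up from dst2 (DBNull when no dst2 row matches).
    public static DataTable Join(DataTable dst, DataTable dst2)
    {
        var result = new DataTable();
        foreach (DataColumn column in dst.Columns)
            result.Columns.Add(column.ColumnName, column.DataType);
        // Reuse whatever type the naam column has in dst2.
        result.Columns.Add("naam", dst2.Columns["naam"].DataType);

        // GroupJoin ("into") plus DefaultIfEmpty() is the LINQ left outer join idiom.
        var query =
            from source in dst.AsEnumerable()
            join lookup in dst2.AsEnumerable()
                on source.Field<string>("werknemer") equals lookup.Field<string>("nummer")
                into matches
            from match in matches.DefaultIfEmpty()
            select new { source, match };

        foreach (var pair in query)
        {
            DataRow row = result.NewRow();
            foreach (DataColumn column in dst.Columns)
                row[column.ColumnName] = pair.source[column];
            // match is null for dst rows without a partner in dst2.
            row["naam"] = pair.match == null ? (object)DBNull.Value : pair.match["naam"];
            result.Rows.Add(row);
        }
        return result;
    }
}
```

The result is an untyped DataTable. With a typed DataSet, the same loop could fill a typed table that already declares the extra column.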
In my query I'm selecting all entities (20 of them) and iterating through the collection like this:
List<Domain.Property> data = session.Query<Domain.Property>().ToList();
PropertyViewModel viewModel;
List<PropertyViewModel> listOfViewModels = new List<PropertyViewModel>();
foreach (Domain.Property prop in data)
{
viewModel = new PropertyViewModel()
{
AdType = prop.AdType.ToString(),
CityName = prop.CityName,
ContructionYear = prop.ConstructionYear,
Photo = prop.Photos.First()
};
listOfViewModels.Add(viewModel);
}
Each property MUST have one or more photos. I need only the first one, so I'm using Photos.First().
When the line Photo = prop.Photos.First() is commented out, NHibernate Profiler reports that 20 entities are loaded, which is fine (those are from the first query).
But with Photo = prop.Photos.First() in place, the number of loaded entities increases to 65, which should be approximately the properties plus their photo collections.
Can anyone point me in the right direction?
The SQL generated by NHibernate contains
SELECT photos0_.PropertyId as PropertyId1_,
photos0_.Id as Id1_,
photos0_.Id as Id1_0_,
photos0_.ImageData as ImageData1_0_,
photos0_.ImageMimeType as ImageMim3_1_0_,
photos0_.PropertyId as PropertyId1_0_
FROM Photo photos0_
WHERE photos0_.PropertyId = 117 /* #p0 */
which is marked as SELECT N+1
For each property (single query to get the list of them) you're doing another query to get the first photo. The Photos collection isn't populated until you try to access it resulting in a second query.
Join to the photos table as part of the original query to reduce it to a single query.
I've read MANY different solutions for the separate functions of LINQ that, when put together, would solve my issue. My problem is that I'm still trying to wrap my head around how to put LINQ statements together correctly. I can't seem to get the syntax right, or it comes out as a mishmash of info and not quite what I want.
I apologize ahead of time if half of this seems like a duplicate. My question is more specific than just reading the file. I'd like it all to be in the same query.
To the point though..
I am reading in a text file with semi-colon separated columns of data.
An example would be:
US;Fort Worth;TX;Tarrant;76101
US;Fort Worth;TX;Tarrant;76103
US;Fort Worth;TX;Tarrant;76105
US;Burleson;TX;Tarrant;76097
US;Newark;TX;Tarrant;76071
US;Fort Worth;TX;Tarrant;76103
US;Fort Worth;TX;Tarrant;76105
Here is what I have so far:
var items = (from c in (from line in File.ReadAllLines(myFile)
let columns = line.Split(';')
where columns[0] == "US"
select new
{
City = columns[1].Trim(),
State = columns[2].Trim(),
County = columns[3].Trim(),
ZipCode = columns[4].Trim()
})
select c);
That works fine for reading the file. But my issue after that is I don't want the raw data. I want a summary.
Specifically I need the count of the number of occurrences of the City,State combination, and the count of how many times the ZIP code appears.
I'm eventually going to make a tree view out of it.
My goal is to have it laid out somewhat like this:
- Fort Worth,TX (5)
- 76101 (1)
- 76103 (2)
- 76105 (2)
- Burleson,TX (1)
- 76097 (1)
- Newark,TX (1)
- 76071 (1)
I can do the tree thing later because there is other processing to do.
So my question is: How do I combine the counting of the specific values in the query itself? I know of the GroupBy functions and I've seen Aggregates, but I can't get them to work correctly. How do I go about wrapping all of these functions into one query?
EDIT: I think I asked my question the wrong way. I don't mean that I HAVE to do it all in one query... I'm asking IS THERE a clear, concise, and efficient way to do this with LINQ in one query? If not I'll just go back to looping through.
If I can be pointed in the right direction it would be a huge help.
If someone has an easier idea in mind to do all this, please let me know.
I just wanted to avoid iterating through a huge array of values and using Regex.Split on every line.
Let me know if I need to clarify.
Thanks!
EDIT 6/15
I figured it out. Thanks to those who answered; it helped out, but was not quite what I needed. As a side note, I ended up changing it all up anyways. LINQ was actually slower than doing it other ways that I won't go into, as it's not relevant. As to those who made multiple comments on "It's silly to have it in one query", that's the decision of the designer. All "Best Practices" don't work in all places. They are guidelines. Believe me, I do want to keep my code clear and understandable, but I also had a very specific reason for doing it the way I did.
I do appreciate the help and direction.
Below is the prototype that I used but later abandoned.
/* Inner LINQ query Reads the Text File and gets all the Locations.
* The outer query summarizes this by getting the sum of the Zips
* and orders by City/State then ZIP */
var items = from Location in(
//Inner Query Start
(from line in File.ReadAllLines(FilePath)
let columns = line.Split(';')
where columns[0] == "US" && !string.IsNullOrEmpty(columns[4])
select new
{
City = (FM.DecodeSLIC(columns[1].Trim()) + " " + columns[2].Trim()),
County = columns[3].Trim(),
ZipCode = columns[4].Trim()
}
))
//Inner Query End
orderby Location.City, Location.ZipCode
group Location by new { Location.City, Location.ZipCode , Location.County} into grp
select new
{
City = grp.Key.City,
County = grp.Key.County,
ZipCode = grp.Key.ZipCode,
ZipCount = grp.Count()
};
The downside of using File.ReadAllLines is that you have to pull the entire file into memory before operating over it. Also, using columns[] is a bit clunky. You might want to consider my article describing using DynamicObject and streaming the file as an alternative implementation. The grouping/counting operation is secondary to that discussion.
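As a rough illustration of the streaming idea (not the DynamicObject approach from the article), File.ReadLines enumerates the file lazily instead of loading it into an array first. The sketch below combines that with nested grouping to produce the City,State and ZIP counts from the question. CitySummary, ZipSummary and ZipTree are hypothetical names, and the check for at least five columns is an added assumption to guard against short lines.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public sealed class ZipSummary
{
    public string Zip { get; set; }
    public int Count { get; set; }
}

public sealed class CitySummary
{
    public string CityState { get; set; }
    public int Count { get; set; }
    public List<ZipSummary> Zips { get; set; }
}

public static class ZipTree
{
    // Groups by "City,State", then by ZIP code within each city.
    public static List<CitySummary> Summarize(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split(';'))
            .Where(c => c.Length >= 5 && c[0] == "US")
            .GroupBy(c => c[1].Trim() + "," + c[2].Trim())
            .Select(city => new CitySummary
            {
                CityState = city.Key,
                Count = city.Count(),
                Zips = city.GroupBy(c => c[4].Trim())
                           .Select(z => new ZipSummary { Zip = z.Key, Count = z.Count() })
                           .ToList()
            })
            .ToList();
    }

    // File.ReadLines yields one line at a time rather than reading the whole file up front.
    public static List<CitySummary> SummarizeFile(string path)
    {
        return Summarize(File.ReadLines(path));
    }
}
```

GroupBy still has to hold the grouped rows in memory, so the saving is limited to not materializing the raw line array first.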
var items = (from c in
(from line in File.ReadAllLines(myFile)
let columns = line.Split(';')
where columns[0] == "US"
select new
{
City = columns[1].Trim(),
State = columns[2].Trim(),
County = columns[3].Trim(),
ZipCode = columns[4].Trim()
})
select c);
foreach (var i in items.GroupBy(an => an.City + "," + an.State))
{
Console.WriteLine("{0} ({1})",i.Key, i.Count());
foreach (var j in i.GroupBy(an => an.ZipCode))
{
Console.WriteLine(" - {0} ({1})", j.Key, j.Count());
}
}
There is no point in getting everything into one query. It's better to split the queries so that each one is meaningful. Try this on your results:
var grouped = items
    .GroupBy(a => new { a.City, a.State, a.ZipCode })
    .Select(a => new { City = a.Key.City, State = a.Key.State, ZipCode = a.Key.ZipCode, ZipCount = a.Count() })
    .ToList();
EDIT
Here is the one big long query which gives the same output
var itemsGrouped = File.ReadAllLines(myFile)
    .Select(a => a.Split(';'))
    .Where(a => a[0] == "US")
    .Select(a => new { City = a[1].Trim(), State = a[2].Trim(), County = a[3].Trim(), ZipCode = a[4].Trim() })
    .GroupBy(a => new { a.City, a.State, a.ZipCode })
    .Select(a => new { City = a.Key.City, State = a.Key.State, ZipCode = a.Key.ZipCode, ZipCount = a.Count() })
    .ToList();
I am making a calendar, and to make it easier on myself I break up appointments that span multiple weeks.
For instance, Jan 1st to Jan 31st spans about 6 weeks (my calendar is always 42 cells, 6 by 7), so I would basically have 6 rows stored in my database.
However, some things require me to put all these rows back together into one row, for instance if I want to export my calendar in iCal format.
I have a field in my database called bindingClassName; all the rows of one group of tasks get the same unique id, so I am able to get all the weeks easily.
// get all of the task rows by binding class name.
var found = plannerDb.Calendars.Where(u => u.UserId == userId && u.BindingClassName == bindingClassName)
.GroupBy(u => u.BindingClassName);
List<Calendar> allAppoingments = new List<Calendar>();
// go through each of the results and add it to a list of calendars
foreach (var group in found)
{
foreach (var row in group)
{
Calendar appointment = new Calendar();
appointment.AppointmentId = row.AppointmentId;
appointment.AllDay = row.AllDay;
appointment.BindingClassName = row.BindingClassName;
appointment.Description = row.Description;
appointment.EndDate = row.EndDate;
appointment.StartDate = row.StartDate;
appointment.Title = row.Title;
appointment.Where = row.Where;
appointment.UserId = row.UserId;
allAppoingments.Add(appointment);
}
}
// order
var test = allAppoingments.OrderBy(u => u.StartDate);
var firstAppointment = test.First();
var LastAppointment = test.Last();
Calendar newAppointment = new Calendar();
newAppointment.UserId = firstAppointment.UserId;
newAppointment.Description = firstAppointment.Description;
newAppointment.AllDay = firstAppointment.AllDay;
newAppointment.StartDate = firstAppointment.StartDate;
newAppointment.Title = firstAppointment.Title;
newAppointment.Where = firstAppointment.Where;
newAppointment.BindingClassName = firstAppointment.BindingClassName;
newAppointment.EndDate = LastAppointment.EndDate;
return newAppointment;
So basically that big blob finds all the appointments with the same binding name. Then I go through each one and make it into a Calendar object then finally once it is all made I get the first and last record to get the startDate and endDate.
I am not good with LINQ, so I am not sure whether I can just add something after the GroupBy to do what I want.
Edit.
I am trying to group all my appointments together once I get all of them for the user.
This is what I have tried so far:
var allApointments = calendar.GetAllAppointments(userId);
var group = allApointments.GroupBy(u => u.BindingClassName).Select(u => new Calendar()).ToList();
I was hoping that it would fill each group automatically, but it does not. So I am not sure whether I need GroupBy after all.
Edit @admin
Hi, thanks for explaining sorting and grouping. The way you explained it, though, it seems either one would work.
The code you have for getting the first and last date works great and does what I wanted it to.
I think grouping might have worked because, in the end, I am looking to have just one row that has the start date of the first record and the end date of the last record; all the other information would be the same.
So I don't know if it would be harder to write that instead, but like I said, your query does what I want.
However, that query is used for a single appointment: I use it only when a user clicks to view an appointment on my calendar. By clicking on the appointment I get all the information about it, and that's where I need to check whether the task spans multiple days and figure out when the appointment started and when it is going to end.
Now I need another query, and I think it would be better if I could actually group the rows, since, as I understand it from your explanation, grouping produces one row per group. The reason is that I want to export all the records in the table for that user.
If I order them into one continuous block by binding name, I am still going to need loops that go through all the records and get the first start date and the last end date. So it would be better if I could group in one go, so that the final result is just one record for each group of binding names, with the start date from the first record and the end date from the last record.
Why are you grouping the appointments if you aren't actually using the group? It looks like you're just using them individually. In any case, you're already filtering the rows on a single value for BindingClassName in the Where clause, so you would only end up with 1 (or 0) group(s) anyway.
You can rewrite that series of foreach loops into a Select and ToList() like this:
var allAppointments =
plannerDb.Calendars.Where(
row => row.UserId == userId &&
row.BindingClassName == bindingClassName).OrderBy(
row => row.StartDate).Select(
row => new Calendar()
{
AppointmentId = row.AppointmentId,
AllDay = row.AllDay,
BindingClassName = row.BindingClassName,
Description = row.Description,
EndDate = row.EndDate,
StartDate = row.StartDate,
Title = row.Title,
Where = row.Where,
UserId = row.UserId
}).ToList();
This will give you back the full list in the order you wanted. However, I'm curious why you're retrieving the whole list when it looks like you're only interested in the first and last appointment. You could instead do this:
var baseQuery =
plannerDb.Calendars.Where(
row => row.UserId == userId &&
row.BindingClassName == bindingClassName);
var first = baseQuery.OrderBy(row => row.StartDate).First();
var last = baseQuery.OrderByDescending(row => row.StartDate).Select(
row => row.EndDate).First();
return new Calendar()
{
AppointmentId = first.AppointmentId,
AllDay = first.AllDay,
BindingClassName = first.BindingClassName,
Description = first.Description,
EndDate = last,
StartDate = first.StartDate,
Title = first.Title,
Where = first.Where,
UserId = first.UserId
};
This should produce outputs that are the same as what you have now. I would question, however, if this is exactly what you want. Say you have two appointments:
Appointment 1 starts January 5 and ends on January 10
Appointment 2 starts January 6 and ends on January 7
Using this (and your) logic, you would get the end date as January 7, since Appointment 2 has the larger start date, but Appointment 1 actually ends later. I would recommend changing the second query to this:
var last = baseQuery.OrderByDescending(row => row.EndDate).Select(
row => row.EndDate).First();
This will give you the largest end date, which I think is what you're actually after.
EDIT
I think you're making the (very common) mistake of confusing grouping with sorting. When you say you want to "group the appointments by the binding name", it sounds like you want a full, complete list of appointments, and you want those appointments arranged in such a way that all appointments with a particular binding name form a contiguous block. If that's the case, you want to order the list by the binding name, not group them. Grouping takes the whole list and produces one row per grouping clause and allows you to perform aggregation functions on the remaining columns. For example, let's say I group the appointments on the binding name. This means that my result set will contain one row per binding name, and I can then do things like find the maximum start or end date or something like that; more formally, you can specify aggregation operations, which are operations that take a set of data (e.g. a list of start dates) and return a single piece of data (e.g. the maximum start date).
Unless I'm misunderstanding, it sounds like you still want to retrieve all of the individual appointments, you just want them arranged by binding name. If this is the case, just OrderBy(row => row.BindingClassName) and it will do the trick. In addition, you may want to avoid using the word "group", as people will think you mean the sort of grouping that I described above.
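For the export case, where one row per binding name is wanted, the grouping described above can be sketched like this. It is an in-memory (LINQ to Objects) illustration under simplifying assumptions: Appointment is a hypothetical, trimmed-down stand-in for the Calendar entity, and the rows are assumed to be loaded already, because the statement-bodied lambda used here cannot be translated to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for the Calendar entity.
public sealed class Appointment
{
    public string BindingClassName { get; set; }
    public string Title { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
}

public static class AppointmentMerger
{
    // Produces one appointment per binding name, spanning from the
    // earliest start date to the latest end date in that group.
    public static List<Appointment> Merge(IEnumerable<Appointment> rows)
    {
        return rows
            .GroupBy(r => r.BindingClassName)
            .Select(g =>
            {
                // The earliest row supplies the shared fields (title and so on).
                var first = g.OrderBy(r => r.StartDate).First();
                return new Appointment
                {
                    BindingClassName = g.Key,
                    Title = first.Title,
                    StartDate = first.StartDate,
                    // Max over EndDate, not the end date of the last-starting row.
                    EndDate = g.Max(r => r.EndDate)
                };
            })
            .ToList();
    }
}
```

Taking the maximum EndDate follows the recommendation above, so an appointment that starts early but ends late still determines the merged end date.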
Just as a side point not concerning the linq, have you looked at AutoMapper? I am currently using this for populating data objects from linq and I've found it really useful for getting rid of the large sections of code where you just map to dtos. It wouldn't make the query parts of your code any shorter but would reduce:
return new Calendar()
{
AppointmentId = first.AppointmentId,
AllDay = first.AllDay,
BindingClassName = first.BindingClassName,
Description = first.Description,
EndDate = last,
StartDate = first.StartDate,
Title = first.Title,
Where = first.Where,
UserId = first.UserId
};
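to something like this (EndDate is set after mapping, because mapping onto an existing object would overwrite it with first.EndDate unless the mapping is configured to ignore that member):
var appointment = Mapper.Map<Calendar>(first);
appointment.EndDate = last;
return appointment;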