I can't figure out why this isn't intersecting all of the items in the loop, just the last 2. I think it has something to do with IQueryable.
var outerquery = db.Employees.Where(x => x.Name == "Smith").Select(x => x.EmployeeID);
foreach (var name in nameList)
{
    var innerQuery = db.Employees.Where(x => x.Name == name).Select(x => x.EmployeeID);
    outerquery = outerquery.Intersect(innerQuery);
}
return outerquery.ToList();
EDIT -
A more concrete example. The table has approx 35 million records.
The table has ID, ConceptID, Word. Words can have multiple ConceptIDs & there is 1 word per record. I want to intersect a search string 'shoulder pain chronic' and get all the ConceptIDs that share those 3 words. It should return:
Concept1234 - shoulder
Concept1234 - pain
Concept1234 - chronic
What I am getting (just the last 2):
Concept1234 - pain
Concept1234 - chronic
Doing an OR on 35 million records is rough even with this monster server I have & an intersect is the only way to do it in less than a second.
What I am trying to generate with LINQ to SQL (Entity Framework) is this -
SELECT ConceptID FROM WordTable WHERE Word = 'shoulder'
INTERSECT
SELECT ConceptID FROM WordTable WHERE Word = 'pain'
INTERSECT
SELECT ConceptID FROM WordTable WHERE Word = 'chronic'
You have outerquery inside the foreach loop which gets replaced in each iteration of the loop and you lose previous data.
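For reference, the INTERSECT chain from the question can be sketched against in-memory data. This is only a minimal sketch, not the asker's code: the `ConceptSearch` class, the `ConceptsWithAllWords` method, and the tuple rows standing in for `WordTable` are hypothetical names. It folds one filtered query per word with `Aggregate`, so no loop variable is captured by the lambdas. (Before C# 5, a `foreach` variable was shared across iterations, so deferred queries built in a loop could all end up seeing the last value; whether that applies here depends on the compiler in use.)

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConceptSearch
{
    // Hypothetical helper: returns the ConceptIDs that have a row for every word.
    // `rows` stands in for WordTable; with a real IQueryable source the same
    // shape is assumed (not verified here) to translate to chained INTERSECTs.
    public static List<int> ConceptsWithAllWords(
        IEnumerable<(int ConceptID, string Word)> rows,
        IEnumerable<string> words)
    {
        return words
            // One deferred query per word; `w` is a fresh lambda parameter each time.
            .Select(w => rows.Where(r => r.Word == w).Select(r => r.ConceptID))
            // Fold the per-word queries together with Intersect.
            // Aggregate throws InvalidOperationException if `words` is empty.
            .Aggregate((acc, next) => acc.Intersect(next))
            .OrderBy(id => id)
            .ToList();
    }
}
```

Calling `ConceptSearch.ConceptsWithAllWords(rows, new[] { "shoulder", "pain", "chronic" })` returns only the concepts that appear with all three words.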
I have a situation where I have to match up multiple customers numbers from one system with a single customer number in another system.
So for instance customer number 225, 228 and 223 in system A will all map to customer number 110022 in system B.
Easy enough, I have a matrix setup to do that.
I pull the matrix data in like this:
var dt_th_matrix = (from m in aDb.Matrix_Datatrac_TopHat select m).ToArray();
So the records would be something like:
CustomerA: 3 CustomerB: 1001
CustomerA: 4 CustomerB: 1001
CustomerA: 5 CustomerB: 1002
Then I do a big data pull and step through all the items. For each of the items I go grab the matching customer number from the matrix like this:
foreach (var dt_stop in mainPull)
{
    int? th_customerId = (from d in dt_th_matrix
                          where d.datatrac_customer_no == dt_stop.Customer_No.ToString()
                          select d.tophat_customer_detail_Id).First();
    // ... rest of the loop body
}
What I would rather do is embed the code that grabs the customer number from the matrix directly in my data pull. The part marked "Query goes here somehow" will be some type of lambda, I assume. Any help?
I have tried something like this:
th_customerId = (dt_th_matrix.First().tophat_customer_detail_Id.Equals c.Customer_No)
But that is not it (obviously)
var mainPull = (from c in cDb.DistributionStopInformations
join rh in cDb.DistributionRouteHeaders on c.Route_Code equals rh.Route_Code
where c.Company_No == 1 &&
(accountNumbers.Contains(c.Customer_No)) &&
(brancheSearchList.Contains(c.Branch_Id) && brancheSearchList.Contains(rh.Branch_Id)) &&
c.Shipment_Type == "D" &&
(c.Datetime_Created > dateToSearch || c.Datetime_Updated > dateToSearch) &&
rh.Company_No == 1 &&
((rh.Route_Date == routeDateToSearch && c.Route_Date == routeDateToSearch) ||
(rh.Route_Date == routeDateToSearch.AddDays(1) && c.Route_Date == routeDateToSearch.AddDays(1)))
orderby c.Unique_Id_No
select new
{
c.Datetime_Updated,
th_customerId = ("Query goes here somehow"),
c.Datetime_Created,
c.Unique_Id_No,
c.Original_Unique_Id_No,
c.Unique_Id_Of_New_Stop,
c.Branch_Id,
c.Route_Date,
c.Route_Code,
c.Sequence_Code,
c.Customer_No,
c.Customer_Reference,
c.Shipment_Type,
c.Stop_Name,
c.Stop_Address,
c.Stop_City,
c.Stop_State,
c.Stop_Zip_Postal_Code,
c.Stop_Phone_No,
c.Stop_Arrival_Time,
c.Stop_Departure_Time,
c.Address_Point,
c.Stop_Special_Instruction1,
c.Stop_Special_Instruction2,
c.Stop_Expected_Pieces,
c.Stop_Expected_Weight,
c.Stop_Signature,
c.Actual_Arrival_Time,
c.Actual_Depart_Time,
c.Actual_Service_Date,
c.Stop_Actual_Pieces,
c.Stop_Exception_Code,
c.Created_By,
rh_Route_Date = rh.Route_Date,
routeHeaderRouteCode = rh.Route_Code,
rh.Actual_Driver,
rh.Assigned_Driver,
rh_routeDate = rh.Route_Date
}).ToArray();
I will try and clarify the above.
What I need is for the Linq query to say :
For each record that I pull, go to the array named dt_th_matrix, get the record that matches this line, and use it.
The data in the matrix looks exactly like this:
Record 1: datatrac_customer_no: 227, tophat_customer_detail_Id 1
Record 2: datatrac_customer_no: 228, tophat_customer_detail_Id: 1
Record 3: datatrac_customer_no: 910, tophat_customer_detail_Id: 5
Then for the first record pulled in the mainPull, the field c.customer_no == 228, so I need the query in the select new statement to replace th_customerId with 1 (from Record 2 in the matrix).
Then say the next record pulled in the mainPull the field c.customer_no = 910 the th_customerId would be 5.
That is what the first line of my foreach statement is currently doing. I want to move that logic to inside my LINQ query.
If I understand you correctly, using a dictionary with a key of datatrac_customer_no and a value of tophat_customer_detail_Id would be a good idea here:
var dt_th_matrix = (from m in aDb.Matrix_Datatrac_TopHat select m).ToDictionary(m=>m.datatrac_customer_no,m=>m.tophat_customer_detail_Id);
With this you should be able to replace your "Query goes here somehow" with
dt_th_matrix[c.Customer_No]
Using LINQ would be possible as well, but I don't think it's worth the performance overhead and reduction in readability.
If you still want to use LINQ for this with your original matrix, this should work as your query:
dt_th_matrix.Single(m => m.datatrac_customer_no == c.Customer_No).tophat_customer_detail_Id
Both expressions will throw an exception if the key is not found or exists multiple times - but if I understand your structure correctly this should not be possible. Otherwise you need to check for this.
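If a missing key is possible after all, the check can be sketched with `TryGetValue`. This is a minimal sketch under assumptions: the `CustomerMatrix` class and `FindTopHatId` method are hypothetical, and the key is assumed to be a string (the question compares against `Customer_No.ToString()`) with an `int` id as the value. Note also that `ToDictionary` itself throws an `ArgumentException` when the source contains a duplicate key, so duplicates would surface when the dictionary is built rather than at lookup time.

```csharp
using System;
using System.Collections.Generic;

public static class CustomerMatrix
{
    // Hypothetical helper: looks up the TopHat id for a Datatrac customer number.
    // Returns null instead of throwing when the key is not in the matrix.
    public static int? FindTopHatId(
        IReadOnlyDictionary<string, int> matrix,
        string datatracCustomerNo)
    {
        return matrix.TryGetValue(datatracCustomerNo, out int id) ? id : (int?)null;
    }
}
```

The caller can then decide what a `null` result means, for example skipping the record or logging it.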
We have a table in our SQL database with historical raw data I need to create charts from. We access the DB via Entity Framework and LINQ.
For smaller datetime intervals, I can simply read the data and generate the charts:
var mydata = entity.DataLogSet.Where(dt => dt.DateTime > dateLimit);
But we want to implement a feature where you can quickly "zoom out" from the charts to include larger date intervals (last 5 days, last month, last 6 months, last 10 years and so on and so forth.)
We don't want to chart every single data point for this. We want to use a sample of the data, by which I mean something like this --
Last 5 days: chart every data point in the table
Last month: chart every 10th data point in the table
Last 6 months: chart every 100th data point
The number of data points and chart names are only examples. What I need is a way to pick only the "nth" row from the database.
You can use the Select overload that includes the item index of enumerations. Something like this should do the trick --
var data = myDataLogEnumeration.
Select((dt,i) => new { DataLog = dt, Index = i }).
Where(x => x.Index % nth == 0).
Select(x => x.DataLog);
If you need to limit the query with a Where or sort with OrderBy, you must do it before the first Select, otherwise the indexes will be all wrong --
var data = myDataLogEnumeration.
Where(dt => dt.DateTime > dateLimit).
OrderBy(dt => dt.SomeField).
Select((dt,i) => new { DataLog = dt, Index = i }).
Where(x => x.Index % nth == 0).
Select(x => x.DataLog);
Unfortunately, as juharr commented, this overload is not supported in Entity Framework. One way to deal with this is to do something like this --
var data = entity.DataLogSet.
Where(dt => dt.DateTime > dateLimit).
OrderBy(dt => dt.SomeField).
ToArray().
Select((dt,i) => new { DataLog = dt, Index = i }).
Where(x => x.Index % nth == 0).
Select(x => x.DataLog);
Note the addition of a ToArray(). This isn't ideal though as it will force loading all the data that matches the initial query before selecting only every nth row.
There is a trick that EF can translate to SQL and that might work for this:
if (step != 0)
query = query.Where(_ => Convert.ToInt32(_.Time.ToString().Substring(14, 2)) % step == 0);
This code converts the date to a string, cuts out the minutes, converts them to an int, and keeps only the rows whose minute is a multiple of step. For example, if the variable step is 5, you get every 5th minute.
For Postgresql this converts to:
WHERE ((substring(c.time::text, 15, 2)::INT % @__step_1) = 0)
This works best with fixed measuring points, such as once a minute.
However, you can use the same method to group rows by cutting the string at the hour, at the minute, or at the first digit of the minute (10-minute groups), and then apply aggregate functions such as max, average, or sum, which may be even more desirable.
For example, this groups by hour and takes the max of most columns but the average of the CPU load:
using var ef = new DbCdr.Context();
IQueryable<DbCdr.CallStatsLog> query;
query = from calls in ef.Set<DbCdr.CallStatsLog>()
group calls by calls.Time.ToString().Substring(0, 13)
into g
orderby g.Max(_ => _.Time) descending
select new DbCdr.CallStatsLog()
{
Time = g.Min(_ => _.Time),
ConcurrentCalls = g.Max(_ => _.ConcurrentCalls),
CpuUsage = (short)g.Average(_ => _.CpuUsage),
ServerId = 0
};
var res = query.ToList();
This translates to:
SELECT MAX(c.time) AS "Time",
MAX(c.concurrent_calls) AS "ConcurrentCalls",
AVG(c.cpu_usage::INT::double precision)::smallint AS "CpuUsage",
0 AS "ServerId"
FROM call_stats_log AS c
GROUP BY substring(c.time::text, 1, 13)
ORDER BY MAX(c.time) DESC
Note: the examples work with PostgreSQL and the ISO datestyle.
I am doing a search on a database table using a wildcard (the Contains extension). Here is the code:
// This gets a list of the primary key IDs in a table that has 5000+ records in it
List<int> ids = context.Where(m => m.name.ToLower().Contains(searchTerm.ToLower())).Select(m => m.Id).ToList();
// Loop through all the ids and get the ones that match in a different table (So basically the FKs..)
foreach (int id in ids)
{
total.AddRange(context2.Where(x => x.NameID == id).Select(m => m.Id).ToList());
}
Is there anything I could change in the LINQ that would result in getting the IDs faster?
Thanks
I have not tested it, but you could do something along these lines:
var total =
from obj2 in context2
join obj1 in context1 on obj2.NameID equals obj1.Id
where obj1.name.ToLower().Contains(searchTerm.ToLower())
select obj2.Id;
It joins the two tables, performing a cartesian product first and then limiting it to the pairs where the NameIds match (see this tutorial on the join clause). The where line does the actual filtering.
It should be faster because the whole matching is done in the database, and only the correct ids are returned.
If the context2 item class had a Name property that holds a reference to the context1 item, you could write it more readably:
var total = context2
.Where(x => x.Name.name.ToLower().Contains(searchTerm.ToLower()))
.Select(x => x.ID);
In this case, Linq to SQL would do the join automatically for you.
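If the two tables cannot be joined in one query (for example, because they live in separate contexts), a related option is to replace the per-id loop with a single filter that uses `Contains` on the id list. The following is a minimal in-memory sketch; `IdLookup`, `ChildIdsFor`, and the tuple rows are hypothetical names. Against a database provider, `Contains` on a local collection is typically translated to an `IN (...)` clause, but very long lists may run into provider limits, so treat that as an assumption to verify for your setup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdLookup
{
    // Hypothetical helper: `children` stands in for the second table as
    // (Id, NameID) pairs; `nameIds` is the id list from the first query.
    public static List<int> ChildIdsFor(
        IEnumerable<(int Id, int NameID)> children,
        IEnumerable<int> nameIds)
    {
        // A HashSet keeps the in-memory membership test cheap per row.
        var wanted = new HashSet<int>(nameIds);
        return children
            .Where(c => wanted.Contains(c.NameID))
            .Select(c => c.Id)
            .ToList();
    }
}
```

This makes one pass over the second table instead of one query per id.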
In terms of performance, you can see tests showing that searching for 1 string in 1,000,000 entries takes around 100 ms.
Here is the link with tests and implementation.
for (int y = 0; y < sf.Length; y++)
{
c[y] += ss.Where(o => o.Contains(sf[y])).Count();
}
mvc beginner
I have a table of lots that contain a property Num_of_steps representing the number of completed steps toward building a house.
I currently use this to retrieve the lot information and am sorting by the lot number.
var ViewModel = new Sub_lot_VM();
ViewModel.Subdivisions = db.Subdivisions
.Include(i => i.Lots)
.ToList();
if (ViewModel.Subdivisions !=null) // if data sort by lot number
{
foreach (var item in ViewModel.Subdivisions)
item.Lots = item.Lots.OrderBy(i => i.LotName).ToList();
}
return View(ViewModel);
}
Now I want to display this information in 3 groups:
first where the count is between 1 and 114 (active),
second where the count is above 115 (or >= 115?) (finished), then ordered by lot name, and
third where the count = 0 (not started), also ordered by lot name.
I've been trying to think of how to add .Where and .GroupBy lambda expressions to my method without luck, such as .Where(i => i.Lot.Num_of_steps == 0).
I also see that I needed a foreach where some LINQ examples did not need the foreach. Still confused on that.
Get the lots first and then use groupby with ranges to get the groups
from x in
(
db.Subdivisions.SelectMany(sd => sd.Lots)
)
group x by x.Num_of_steps == 0 ? 3 : x.Num_of_steps < 115 ? 1 : 2 into g
orderby g.Key
select g.OrderBy(g1 => g1.LotName)
You can give the groups meaningful names instead of 1, 2 and 3, but you can also postpone that until display time. The numbers make correct sorting easy.
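The range grouping can be tried on in-memory data, with named groups in place of the bare numbers. This is a minimal sketch: the `LotGrouping` class, the `LotStatus` enum, and the tuple rows are hypothetical, and step counts are assumed to be non-negative. The enum values carry the same 1, 2, 3 ordering as the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LotGrouping
{
    // Hypothetical status labels; the numeric values drive the sort order.
    public enum LotStatus { Active = 1, Finished = 2, NotStarted = 3 }

    // Same thresholds as the query above: 0 = not started,
    // 1..114 = active, 115 and up = finished (assumes non-negative counts).
    public static LotStatus Classify(int numOfSteps) =>
        numOfSteps == 0 ? LotStatus.NotStarted
        : numOfSteps < 115 ? LotStatus.Active
        : LotStatus.Finished;

    public static List<(LotStatus Status, List<string> LotNames)> GroupLots(
        IEnumerable<(string LotName, int NumOfSteps)> lots)
    {
        return lots
            .GroupBy(l => Classify(l.NumOfSteps))
            .OrderBy(g => g.Key)
            // Within each group, sort the lots by name.
            .Select(g => (Status: g.Key,
                          LotNames: g.Select(l => l.LotName).OrderBy(n => n).ToList()))
            .ToList();
    }
}
```

Each element of the result pairs a status with its lot names, already sorted, so a view can loop over the groups in order.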
This has its roots in another question I asked, which solved part of this problem -
Convert this code to LINQ?
Now I am trying to write the entire logic in the following manner -
var divisions = (
from DataRow row in results.Rows
let section = row["SECTION"].ToString()
orderby section ascending
select section).Distinct();
string result = String.Empty;
foreach (string div in divisions)
{
result = String.Concat(result, div, Environment.NewLine);
var query =
from DataRow row in results.Rows
let remarks = row["REMARKS"].ToString()
let exam = row["EXAM_NAME"].ToString()
let rollno = row["ROLL_NO"].ToString()
let section = row["SECTION"].ToString()
where (remarks == "Passes" || remarks == "Promoted") &&
exam == "TOTAL" && section == div
orderby rollno
select rollno;
result = String.Concat(result,string.Join(" ", query.ToArray()),
Environment.NewLine);
}
Basically, the original DataTable has a bunch of rows with various information, including the division. I want to create a single string in which every division appears on a new line, and below it the roll numbers for that division are shown separated by spaces. The next division goes on the next line, and so on. (Here, section and division are interchangeable terms.)
Is there any elegant way to write this with one linq query, instead of having to loop through the results of the first query?
EDIT:
Data (not mentioning the other columns that are used in filter conditions)
Roll_no Section.. other cols
001 A
002 A
001 B
003 A
004 B
006 B
This is what the output will look like - (roll no is unique only within a division, but that should not affect the logic in any way)
A
001 002 003
B
001 004 006
This will be like 'A\r\n001 002 003\r\nB\r\n001 004 006' when the string is in raw format.
Note, the above code works. I am just looking for a better approach.
There are two separate requirements you want to implement, and you should not try to merge them into a single thing. You 1. want to group the results together and 2. have specific needs for presentation.
Here is how you can do this:
var query =
from DataRow row in results.Rows
// here the query stuff you already had
select new { rollno, section, exam, remarks };
// 1. Grouping
var groups =
from item in query
group item by item.section into g
select new
{
Section = g.Key,
Rollnos = g.Select(i => i.rollno).ToArray(),
};
// 2. Presentation
foreach (var group in groups)
{
Console.WriteLine(group.Section);
Console.WriteLine(string.Join(" ", group.Rollnos));
}
It is possible to write one single query that also does part of the presentation for you, but this query would become very nasty and unreadable.
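If the goal is the single string from the question rather than console output, the presentation step can still stay separate from the grouping. Here is a minimal sketch on in-memory data; `SectionReport`, `Build`, and the tuple rows are hypothetical names, and the rows are assumed to have already passed the remarks/exam filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SectionReport
{
    // Hypothetical helper: `rows` are (Section, RollNo) pairs that already
    // passed the remarks/exam filter from the question.
    public static string Build(IEnumerable<(string Section, string RollNo)> rows)
    {
        var lines = rows
            // 1. Grouping: one group per section, sections in ascending order.
            .GroupBy(r => r.Section)
            .OrderBy(g => g.Key)
            // 2. Presentation: a header line, then the sorted roll numbers.
            .SelectMany(g => new[]
            {
                g.Key,
                string.Join(" ", g.Select(r => r.RollNo).OrderBy(n => n))
            });
        return string.Join(Environment.NewLine, lines);
    }
}
```

With the sample data from the question, this yields the section name on one line and its roll numbers on the next, repeated per section.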