I have the following code to extract records from a dbcontext randomly using Guid class:
var CategoryList = new List<int> { 1, 5 };
var generatedQues = new List<Question>();
//Algorithm 1 :)
if (ColNum > 0)
{
generatedQues = db.Questions
.Where(q => CategoryList.Contains(q.CategoryId))
.OrderBy(q => Guid.NewGuid()).Take(ColNum).ToList();
}
First, I have a list of CategoryId stored in CategoryList as a condition to be fulfilled when getting records from the db. However, I would like to achieve an even distribution among the questions based on the CategoryId.
For example:
If the ColNum is 10 and the CategoryIds obtained are {1,5}, I would like to get 5 records with CategoryId = 1 and another 5 records with CategoryId = 5. If the ColNum is an odd number like 11, I would still like the distribution to be as even as possible, for example 5 records from CategoryId 1 and 6 records from CategoryId 5.
How do I do this?
This is a two-step process:
1. Determine how many you want for each category
2. Select that many items from each category in a random order
For the first part, define a class to represent the category and how many items are required:
public class CategoryLookup
{
public CategoryLookup(int catId)
{
this.CategoryId = catId;
}
public int CategoryId
{
get; private set;
}
public int RequiredAmount
{
get; private set;
}
public void Increment()
{
this.RequiredAmount++;
}
}
And then, given your inputs of the required categories and the total number of items required, work out how many are required for each category
var categoryList = new []{1,5};
var colNum = 7;
var categoryLookup = categoryList.Select(x => new CategoryLookup(x)).ToArray();
for(var i = 0;i<colNum;i++){
categoryLookup[i%categoryList.Length].Increment();
}
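As a cross-check on the loop above, the same per-category amounts can be computed directly with integer division and a remainder. This is only a sketch: the names QuotaDemo and Split are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;

public static class QuotaDemo
{
    // Hypothetical helper: splits `total` over `categoryCount` slots as evenly
    // as possible. The first (total % categoryCount) slots get one extra item,
    // which matches what the round-robin Increment() loop produces.
    public static int[] Split(int total, int categoryCount)
    {
        if (categoryCount <= 0) throw new ArgumentOutOfRangeException(nameof(categoryCount));
        if (total < 0) throw new ArgumentOutOfRangeException(nameof(total));

        int baseAmount = total / categoryCount;
        int remainder = total % categoryCount;
        return Enumerable.Range(0, categoryCount)
            .Select(i => baseAmount + (i < remainder ? 1 : 0))
            .ToArray();
    }

    public static void Main()
    {
        // 7 items over the 2 categories {1,5}
        Console.WriteLine(string.Join(",", Split(7, 2))); // prints 4,3
    }
}
```

Either approach gives the same numbers; the loop is arguably easier to read, while the division form avoids iterating colNum times.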
The second part is really easy: just use SelectMany to get the list of questions. (I've tested this with plain LINQ to Objects; it should work fine for a database query. questions in my code would just be db.Questions in yours.)
var result = categoryLookup.SelectMany(
c => questions.Where(q => q.CategoryId == c.CategoryId)
.OrderBy(x => Guid.NewGuid())
.Take(c.RequiredAmount)
);
Live example: http://rextester.com/RHF33878
You could try something like this:
var CategoryList = new List<int> { 1, 5 };
var generatedQues = new List<Question>();
//Algorithm 1 :)
if (ColNum > 0 && CategoryList.Count > 0)
{
    // Every category gets the base amount; the first `extra` categories get one more
    var baseTake = ColNum / CategoryList.Count;
    var extra = ColNum % CategoryList.Count;
    IQueryable<Question> query = null;
    for (int i = 0; i < CategoryList.Count; i++)
    {
        // Copy to locals so the deferred query does not capture the loop variable
        var categoryId = CategoryList[i];
        var take = baseTake + (i < extra ? 1 : 0);
        var part = db.Questions
            .Where(q => q.CategoryId == categoryId)
            .OrderBy(q => Guid.NewGuid()).Take(take);
        // Union the questions for that category to query
        query = query == null ? part : query.Union(part);
    }
    // Randomize again and execute query
    generatedQues = query.OrderBy(q => Guid.NewGuid()).ToList();
}
The idea is to just get a random list for each category and add them all together. Then you randomize that again and create your list. I do not know if it will do all this on the database or in memory, but it should be database I think. The resulting SQL will look horrible though.
Related
I am trying to add a feature to my website where the teacher can see a summary of students who have completed goals that week.
This is my controller method
public IActionResult WeeklyDetails()
{
var user = svc.GetUser(GetSignedInUserId());
var goals = svc.GetGoalsForTeacher(user.Id);
var mostRecentMonday = DateTime.Now.StartOfWeek(DayOfWeek.Monday);//get week start of most recent Monday morning
var weekEnd = mostRecentMonday.AddDays(7).AddSeconds(-1); //will return the end of the day on Sunday
var results = goals.Where(g => g.AchievedOn >= mostRecentMonday && g.AchievedOn <= weekEnd).ToList();
for (int i = 0; i < results.Count; i++)
{
//Get count of current element to before:
int count = results.Take(i + 1)
.Count(r => r.Student.Name == results[i].Student.Name);
results[i].Count = count;
}
var result = results.GroupBy(x => x.Id)
.Select(group => group.First()).ToList();
return View(result);
}
In my cshtml view page I call the details like this
@foreach(var item in Model)
{
<p>@item.Student.Name @item.Count</p>
}
However, I achieve this result
Emma 1
Emma 2
Sarah 1
This is because Emma has two completed goals in the list. However, I would prefer "Emma 2" to be the only result shown for her. Is there a way to take the max count and then the first entry per student? My apologies if this is unclear.
I don't know the definition of result, but I think you are grouping by the primary key of the goal (x.Id), so every group contains exactly one goal and nothing is collapsed. You also use the original result list as the model of your view, even though you want to show aggregated data. I would create a clean type for that (it can be a nested class inside your controller class):
public class GoalSummary
{
public int StudentId {get;set;}
public string Firstname {get;set;}
public string Name {get;set;}
public int Goals {get;set;}
}
Then you can use grouping and projecting (select) to create these results:
var summary = goals
.Where(g => g.AchievedOn >= mostRecentMonday && g.AchievedOn <= weekEnd)
.GroupBy(g => new {g.Student.Id, g.Student.Firstname, g.Student.Name})
.Select(g => new GoalSummary
{
StudentId = g.Key.Id,
Firstname = g.Key.Firstname,
Name = g.Key.Name,
Goals = g.Count()
}).ToList();
return View(summary);
If you are familiar with SQL: We want StudentId, Firstname, Name and COUNT(*). So we have to group by Id, Firstname and Name.
In your View you can use your typed summary:
@model List<YourNamespace.Controllers.YourController.GoalSummary>
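To see the grouping idea in isolation, here is a small in-memory sketch using the Emma/Sarah data from the question. It only groups by name for brevity; GoalSummaryDemo and CountPerStudent are illustrative names, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GoalSummaryDemo
{
    // Hypothetical helper: takes one student name per completed goal and
    // returns one (name, count) pair per student.
    public static List<KeyValuePair<string, int>> CountPerStudent(IEnumerable<string> studentNames)
    {
        return studentNames
            .GroupBy(name => name)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .ToList();
    }

    public static void Main()
    {
        // Emma completed two goals this week, Sarah one
        foreach (var row in CountPerStudent(new[] { "Emma", "Emma", "Sarah" }))
        {
            Console.WriteLine($"{row.Key} {row.Value}");
        }
        // prints:
        // Emma 2
        // Sarah 1
    }
}
```

In real code you would group by the student Id as shown above, since two students can share a name.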
I have written a code like below:
foreach (var itemA in itm)
{
foreach (var itemB in filteredList)
{
if (itemA.ItemID != itemB.ItemID)
{
missingList.Add(itemB);
ListToUpdate.Add(itemB);
}
else
{
if (itemA.QuantitySold != itemB.QuantitySold)
{
ListToUpdate.Add(itemB);
}
}
}
}
So as you can see, I have two lists here which are identical in structure:
List #1 is the "itm" list, which contains old records from the DB
List #2 is "filteredList", which has all items from the DB plus new ones
I'm trying to add items to missingList and ListToUpdate based on the following criteria:
All items that are "new" in filteredList, meaning their ItemID doesn't exist in the "itm" list, should be added to missingList.
Those same new items should also be added to ListToUpdate.
Finally, items that exist in both lists but whose QuantitySold differs from the value in the "itm" list should be added to ListToUpdate.
The code above gives me completely wrong results; I end up with more than 50000 extra items in both lists...
I'd like to change this code in a manner that it works like I wrote above and to possibly use parallel loops or PLINQ to speed things up...
Can someone help me out ?
Let's use Parallel.ForEach, which is available since .NET Framework 4.0:
var sync = new object(); // List<T>.Add is not thread-safe, so guard the shared lists
Parallel.ForEach(filteredList, f =>
{
    var conditionMatchCount = itm.AsParallel()
        .Select(i =>
            // One point if ID matches
            ((i.ItemID == f.ItemID) ? 1 : 0) +
            // One more point if QuantitySold also matches
            ((i.ItemID == f.ItemID && i.QuantitySold == f.QuantitySold) ? 1 : 0))
        .DefaultIfEmpty(0) // an empty itm list means every item is missing
        .Max();
    lock (sync)
    {
        // Item is missing
        if (conditionMatchCount == 0)
        {
            listToUpdate.Add(f);
            missingList.Add(f);
        }
        // Item quantity is different
        else if (conditionMatchCount == 1)
        {
            listToUpdate.Add(f);
        }
    }
});
The above code uses two nested parallelised list iterators.
Following is an example to compare two lists which will give you list of new IDs.
Class I used to hold the data
public class ItemList
{
public int ID { get; set; }
}
Function to get new IDs
private static void GetNewIdList()
{
List<ItemList> lstItm = new List<ItemList>();
List<ItemList> lstFiltered = new List<ItemList>();
ItemList oItemList = new ItemList();
oItemList.ID = 1;
lstItm.Add(oItemList);
lstFiltered.Add(oItemList);
oItemList = new ItemList();
oItemList.ID = 2;
lstItm.Add(oItemList);
lstFiltered.Add(oItemList);
oItemList = new ItemList();
oItemList.ID = 3;
lstFiltered.Add(oItemList);
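// Except compares by reference here, which works because both lists hold the same ItemList instances
var lstListToUpdate = lstFiltered.Except(lstItm);
foreach (var item in lstListToUpdate)
{
Console.WriteLine(item.ID);
}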
}
For getting the list of common IDs use following
var CommonList = from p in lstItm
join q in lstFiltered
on p.ID equals q.ID
select p;
UPDATE 2
For getting the list of new IDs from filtered list based on ID
var lstListToUpdate2 = lstFiltered.Where(a => !lstItm.Select(b => b.ID).Contains(a.ID));
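For the original question (missingList plus ListToUpdate with a QuantitySold check), one possible single-pass approach is to index the old list by ItemID first. This is a sketch under assumptions: the Item class below is a stand-in for the asker's real type, ItemID is assumed to be unique within itm, and ListDiffDemo.Compare is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the asker's item type (assumed shape)
public class Item
{
    public int ItemID { get; set; }
    public int QuantitySold { get; set; }
}

public static class ListDiffDemo
{
    // Hypothetical helper: classifies every item of filteredList against itm.
    public static void Compare(
        List<Item> itm,
        List<Item> filteredList,
        out List<Item> missingList,
        out List<Item> listToUpdate)
    {
        // Assumes ItemID is unique in itm; ToDictionary throws on duplicates
        Dictionary<int, Item> oldById = itm.ToDictionary(i => i.ItemID);
        missingList = new List<Item>();
        listToUpdate = new List<Item>();

        foreach (var f in filteredList)
        {
            if (!oldById.TryGetValue(f.ItemID, out Item old))
            {
                // New item: goes into both lists
                missingList.Add(f);
                listToUpdate.Add(f);
            }
            else if (old.QuantitySold != f.QuantitySold)
            {
                // Known item whose quantity changed
                listToUpdate.Add(f);
            }
        }
    }

    public static void Main()
    {
        var itm = new List<Item>
        {
            new Item { ItemID = 1, QuantitySold = 10 },
            new Item { ItemID = 2, QuantitySold = 5 }
        };
        var filteredList = new List<Item>
        {
            new Item { ItemID = 1, QuantitySold = 10 },
            new Item { ItemID = 2, QuantitySold = 7 },
            new Item { ItemID = 3, QuantitySold = 1 }
        };

        Compare(itm, filteredList, out var missingList, out var listToUpdate);
        Console.WriteLine($"missing: {missingList.Count}, to update: {listToUpdate.Count}");
    }
}
```

Because each lookup is a dictionary hit, every item of filteredList is visited once instead of once per item in itm, which also avoids the duplicate additions produced by the nested loops in the question.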
I need to do a query in c# to get the position of a specific id, in a table order by a date.
My table structure
IdAirport bigint
IdUser int
AddedDate datetime
Data:
2 5126 2014-10-23 14:54:32.677
2 5127 2014-10-23 14:55:32.677
1 5128 2014-10-23 14:56:32.677
2 5129 2014-10-23 14:57:32.677
For example, I need to know the position of IdUser=5129 within IdAirport=2, ordered by AddedDate ascending. (The result in this case would be 3.)
Edit:
I'm using IQueryables like this:
AirPort airport = (from a in context.Airport select a).FirstOrDefault();
Thanks for your time!
Using LINQ: If you want to find the index of an element within an arbitrary order you can use OrderBy(), TakeWhile() and Count().
db.records.Where(x => x.IdAirport == airportId)
.OrderBy(x => x.AddedDate)
.TakeWhile(x => x.IdUser != userId)
.Count() + 1;
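One caveat: TakeWhile may not be translated to SQL by some LINQ providers, so against a database the query above might fail or have to run in memory. A possible alternative that uses only Where and Count is to look up the target's AddedDate and count the rows before it. This is a sketch: AirportUser and PositionOf are hypothetical names, and rows with identical AddedDate values would share a position.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of a row, based on the table in the question
public class AirportUser
{
    public long IdAirport { get; set; }
    public int IdUser { get; set; }
    public DateTime AddedDate { get; set; }
}

public static class PositionDemo
{
    // Hypothetical helper: returns the 1-based position of userId within
    // airportId ordered by AddedDate, or 0 when the user is not found.
    public static int PositionOf(IQueryable<AirportUser> records, long airportId, int userId)
    {
        DateTime? target = records
            .Where(r => r.IdAirport == airportId && r.IdUser == userId)
            .Select(r => (DateTime?)r.AddedDate)
            .FirstOrDefault();
        if (target == null) return 0;

        DateTime addedDate = target.Value;
        // Rows of the same airport that were added earlier come before the target
        return records.Count(r => r.IdAirport == airportId && r.AddedDate < addedDate) + 1;
    }

    public static void Main()
    {
        var rows = new List<AirportUser>
        {
            new AirportUser { IdAirport = 2, IdUser = 5126, AddedDate = new DateTime(2014, 10, 23, 14, 54, 32, 677) },
            new AirportUser { IdAirport = 2, IdUser = 5127, AddedDate = new DateTime(2014, 10, 23, 14, 55, 32, 677) },
            new AirportUser { IdAirport = 1, IdUser = 5128, AddedDate = new DateTime(2014, 10, 23, 14, 56, 32, 677) },
            new AirportUser { IdAirport = 2, IdUser = 5129, AddedDate = new DateTime(2014, 10, 23, 14, 57, 32, 677) }
        }.AsQueryable();

        Console.WriteLine(PositionOf(rows, 2, 5129)); // prints 3
    }
}
```

The demo runs against an in-memory list via AsQueryable; with a real context you would pass the table's IQueryable instead, at the cost of two round trips.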
Here's a quick one :
public class test
{
public int IdAirport;
public int IdUser;
public DateTime AddedDate;
public test(int IdAirport, int IdUser, DateTime AddedDate)
{
this.IdAirport = IdAirport;
this.IdUser = IdUser;
this.AddedDate = AddedDate;
}
}
void Main()
{
List<test> tests = new List<test>()
{
new test(2, 5126, DateTime.Parse("2014-10-23 14:54:32.677")),
new test(2, 5127, DateTime.Parse("2014-10-23 14:55:32.677")),
new test(1 , 5128 , DateTime.Parse("2014-10-23 14:56:32.677")),
new test(2 , 5129 , DateTime.Parse("2014-10-23 14:57:32.677"))
};
var r = tests
.Where(t => t.IdAirport == 2)
.OrderBy(t => t.AddedDate)
.TakeWhile(t => t.IdUser != 5129)
.Count() + 1;
Console.WriteLine(r);
}
It keeps the exact order of your own list. You can modify Where/OrderBy if you wish; the interesting part is the TakeWhile/Count combination.
It should work fine, but it is probably not very efficient for long lists.
Edit: this seems to be the same as Ian Mercer's answer. The "+ 1" is needed because TakeWhile/Count gives the number of items before the match, not the position of the match itself. Or perhaps I misunderstood the question.
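This should do what you need (it gives the 1-based position, or 0 if the user is not found):
var position = dataTable.AsEnumerable()
    .Where(x => x.Field<long>("IdAirportColumn") == 2) // "IdAirportColumn" is an assumed column name
    .OrderBy(x => x.Field<DateTime>("AddedDateColumn"))
    .ToList()
    .FindIndex(x => x.Field<int>("IdUserColumn") == 5129) + 1;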
I have a table, containing weekly sales data from multiple years for a few hundred products.
Simplified, I have 3 columns: ProductID, Quantity, [and Date (week/year), not relevant for the question]
In order to process the data, I want to fetch everything using LINQ. In the next step I would like to create a List of objects for the sales data, where each object consists of the ProductId and an array of the corresponding sales data.
EDIT: directly after, I will process all the retrieved data product-by-product in my program by passing the sales as an array to a statistics software (R with R dot NET) in order to get predictions.
Is there a simple (built in) way to accomplish this?
If not, in order to process the sales product by product,
should I just create the mentioned List using a loop?
Or should I, in terms of performance, avoid that all together and:
Fetch the sales data product-by-product from the database as I need it?
Or should I make one big List (with query.ToList()) from the result set and get my sales data product-by-product from there?
Erm, something like:
var groupedByProductId = query.GroupBy(p => p.ProductId).Select(g => new
{
ProductId = g.Key,
Quantity = g.Sum(p => p.Quantity)
});
Or perhaps, if you don't want to sum and instead need the quantities as an array of int ordered by Date:
var groupedByProductId = query.GroupBy(p => p.ProductId).Select(g => new
{
ProductId = g.Key,
Quantities = g.OrderBy(p => p.Date).Select(p => p.Quantity).ToArray()
});
Or maybe you need to pass the data around and an anonymous type is inappropriate; then you could make an IDictionary<int, int[]>:
var salesData = query.GroupBy(p => p.ProductId).ToDictionary(
g => g.Key,
g => g.OrderBy(p => p.Date).Select(p => p.Quantity).ToArray());
so later,
int productId = ...
int[] orderedQuantities = salesData[productId];
would be valid code (less the ellipsis.)
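To try the dictionary variant without a database, here is a small in-memory sketch. The Sale class is an assumed stand-in for the asker's row type (with Date simplified to a DateTime), and SalesGroupingDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of one weekly sales row
public class Sale
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
    public DateTime Date { get; set; }
}

public static class SalesGroupingDemo
{
    // Groups the rows per product and keeps the quantities in date order
    public static Dictionary<int, int[]> QuantitiesByProduct(IEnumerable<Sale> sales)
    {
        return sales
            .GroupBy(s => s.ProductId)
            .ToDictionary(
                g => g.Key,
                g => g.OrderBy(s => s.Date).Select(s => s.Quantity).ToArray());
    }

    public static void Main()
    {
        var sales = new List<Sale>
        {
            new Sale { ProductId = 1, Quantity = 5, Date = new DateTime(2014, 1, 8) },
            new Sale { ProductId = 1, Quantity = 3, Date = new DateTime(2014, 1, 1) },
            new Sale { ProductId = 2, Quantity = 7, Date = new DateTime(2014, 1, 1) }
        };

        var salesData = QuantitiesByProduct(sales);
        Console.WriteLine(string.Join(",", salesData[1])); // prints 3,5
    }
}
```

Each int[] can then be handed to the statistics step product by product.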
You may create a Product class with id and list of int data. Something as below:
public class Product
{
    public List<int> list = new List<int>();
    public int Id;
    public Product(int id, params int[] data)
    {
        Id = id;
        // Copy the sales figures into the list
        list.AddRange(data);
    }
}
Then use:
query.Select(x => new Product(x.ProductId, x.datum1, x.datum2, x.datum3));
This is a question about SPEED - there are a LOT of records to be accessed.
Basic Information About The Problem
As an example, we will have three tables in a Database.
Relations:
Order-ProductInOrder is One-To-Many (an order can have many products in the order)
ProductInOrder- Product is One-To-One (a product in the order is represented by one product)
public class Order {
public bool Processed { get; set; }
// this determines whether the order has been processed
// - orders that have been processed do not go through this again
public int OrderID { get; set; } //PK
public decimal TotalCost{ get; set; }
public List<ProductInOrder> ProductsInOrder;
// from one-to-many relationship with ProductInOrder
// the rest is irrelevant and will not be included here
}
//represents an product in an order - an order can have many products
public class ProductInOrder {
public int PIOD { get; set; } //PK
public int Quantity{ get; set; }
public int OrderID { get; set; }//FK
public Order TheOrder { get; set; }
// from one-to-many relationship with Order
public int ProductID { get; set; } //FK
public Product TheProduct{ get; set; }
//from one-to-one relationship with Product
}
//information about a product goes here
public class Product {
public int ProductID { get; set; } //PK
public decimal UnitPrice { get; set; } //the cost per item
// the rest is irrelevant to this question
}
Suppose we receive a batch of orders to which we need to apply discounts and for which we need to find the total price. This could apply to anywhere from 10,000 to over 100,000 orders. The way this works is that if an order contains a product with a quantity of 5 or more and a unit price of at least $100, we give a 10% discount on the total price.
What I Have Tried
I have tried the following:
//this part gets the products in the order with a quantity of 5 or more
List<Order> discountedOrders = orderRepo
.Where(p => p.Processed == false)
.ToList();
List<ProductInOrder> discountedProducts = discountedOrders
.SelectMany(p => p.ProductsInOrder)
.Where(q => q.Quantity >= 5)
.ToList();
discountedProducts = discountedProducts
.Where(p => p.TheProduct.UnitPrice >= 100.00m)
.ToList();
discountedOrders = discountedOrders
.Where(p => discountedProducts.Any(q => q.OrderID == p.OrderID))
.ToList();
This is very slow and takes forever to run, and when I run integration tests on it, the test seems to time out. I was wondering if there is a faster way to do this.
Try to not call ToList after every query.
When you call ToList on a query it is executed and the objects are loaded from the database in memory. Any subsequent query based on the results from the first query is performed in memory on the list instead of performing it directly in the database. What you want to do here is to execute the whole query on the database and return only those results which verify all your conditions.
var discountedOrders = orderRepo
.Where(p => p.Processed == false);
var discountedProducts = discountedOrders
.SelectMany(p => p.ProductsInOrder)
.Where(q => q.Quantity >= 5);
discountedProducts = discountedProducts
.Where(p => p.TheProduct.UnitPrice >= 100.00m);
// Nothing has run yet; the query executes once, when the result is enumerated
discountedOrders = discountedOrders
.Where(p => discountedProducts.Any(q => q.OrderID == p.OrderID));
Well, for one thing, combining those calls will speed it up some. Try this:
var discountedOrders = orderRepo.Where(p => p.Processed == false && p.ProductsInOrder.Any(r => r.Quantity >= 5 && r.TheProduct.UnitPrice >= 100.00m)).ToList();
Note that this isn't tested. I hope I got the logic right-- I think I did, but let me know if I didn't.
Similar to @PhillipSchmidt, you could rationalize your LINQ:
var discountEligibleOrders =
allOrders
.Where(order => !order.Processed
&& order
.ProductsInOrder
.Any(pio => pio.TheProduct.UnitPrice >= 100M
&& pio.Quantity >= 5));
Removing all those nasty ToList statements is a great start because you're pulling potentially significantly larger sets from the db to your app than you need to. Let the database do the work.
To get each order and its price (assuming a discounted price of 0.9*listed price):
var ordersAndPrices =
allOrders
.Where(order => !order.Processed)
.Select(order => new {
order,
isDiscounted = order
.ProductsInOrder
.Any(pio => pio.TheProduct.UnitPrice >= 100M
&& pio.Quantity >= 5)
})
.Select(x => new {
order = x.order,
price = x.order
.ProductsInOrder
.Sum(p=> p.Quantity
* p.TheProduct.UnitPrice
* (x.isDiscounted ? 0.9M : 1M))});
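The pricing rule in that query can be checked in memory with a few lines. This is a sketch with trimmed-down stand-ins for the Order, ProductInOrder and Product classes from the question; DiscountDemo, IsDiscounted and Price are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-ins for the classes in the question
public class Product
{
    public decimal UnitPrice { get; set; }
}

public class ProductInOrder
{
    public int Quantity { get; set; }
    public Product TheProduct { get; set; }
}

public class Order
{
    public bool Processed { get; set; }
    public List<ProductInOrder> ProductsInOrder { get; set; } = new List<ProductInOrder>();
}

public static class DiscountDemo
{
    // An order qualifies when any line has quantity >= 5 and unit price >= 100
    public static bool IsDiscounted(Order order)
    {
        return order.ProductsInOrder
            .Any(pio => pio.TheProduct.UnitPrice >= 100M && pio.Quantity >= 5);
    }

    // Total price, with 10% off the whole order when it qualifies
    public static decimal Price(Order order)
    {
        decimal factor = IsDiscounted(order) ? 0.9M : 1M;
        return order.ProductsInOrder.Sum(p => p.Quantity * p.TheProduct.UnitPrice) * factor;
    }

    public static void Main()
    {
        var order = new Order();
        order.ProductsInOrder.Add(new ProductInOrder { Quantity = 5, TheProduct = new Product { UnitPrice = 100M } });
        order.ProductsInOrder.Add(new ProductInOrder { Quantity = 1, TheProduct = new Product { UnitPrice = 20M } });

        // 5 * 100 + 1 * 20 = 520, minus 10%
        Console.WriteLine(Price(order));
    }
}
```

Applying the factor once to the sum gives the same total as multiplying each line by it, as the query above does.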
I know you have an accepted answer, but for added speed you could try PLINQ (Parallel LINQ). It works on in-memory lists: given a list of 4000 orders and 4 cores, it filters roughly 1000 on each core and then collates the results.
List<Order> orders = new List<Order>(); // the orders already loaded into memory
var parallelQuery = (from o in orders.AsParallel()
                     where !o.Processed
                        && o.ProductsInOrder.Any(x => x.Quantity >= 5 &&
                                                      x.TheProduct.UnitPrice >= 100.00m)
                     select o).ToList();
Please see here:
In many scenarios, PLINQ can significantly increase the speed of LINQ to Objects queries by using all available cores on the host computer more efficiently. This increased performance brings high performance computing power onto the desktop
http://msdn.microsoft.com/en-us/library/dd460688.aspx
Move that into one query. But really, you should move this into an SSIS package or a SQL job. You could easily make this a stored procedure that runs in less than a second.