My own OrderBy function - c#

I am writing code that orders a list of photos by rating. Each photo is stored in the database along with its number of positive and negative votes. I want to order the photos by the percentage of positive votes, so that the photo with the highest percentage comes first.
For that I used the standard IComparer interface and wrote my own comparer, which compares two photos. The problem is that to do this I first have to download the list of all photos from the database. That seems like a lot of unnecessary effort which I would like to avoid. So I am wondering whether it is possible to create my own SQL function that does the comparison on the database side and returns just the photos I want. Would that be more efficient than comparing all the photos on the server side?
The code for my comparer:
public class PictureComparer : IComparer<Picture>
{
    public int Compare(Picture p1, Picture p2)
    {
        double firstPictureScore = (((double)p1.PositiveVotes / (double)(p1.PositiveVotes + p1.NegativeVotes)) * 100);
        double secondPictureScore = (((double)p2.PositiveVotes / (double)(p2.PositiveVotes + p2.NegativeVotes)) * 100);
        if (firstPictureScore < secondPictureScore) return 1;
        if (firstPictureScore > secondPictureScore) return -1;
        return 0;
    }
}
And the code which uses the comparer:
var pictures = db.Pictures.Include(q => q.Tags).Include(q => q.User).ToList();
pictures = pictures.OrderBy(q => q, new PictureComparer()).Skip(0 * 10).Take(10).ToList();

Remove the first call to ToList and use a lambda expression instead of defining a comparer:
var result = db.Pictures
    .Include(q => q.Tags)
    .Include(q => q.User)
    .OrderByDescending(q =>
        q.PositiveVotes + q.NegativeVotes == 0
            ? -1
            : q.PositiveVotes / (double)(q.PositiveVotes + q.NegativeVotes))
    .Skip(n * 10)
    .Take(10)
    .ToList();
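To see the ordering rule in isolation, here is a minimal in-memory sketch of the same key selector. The sample data and the tuple shape are made up for illustration; against a real context the lambda would be translated by the query provider instead of running locally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: (Id, PositiveVotes, NegativeVotes).
var pictures = new List<(int Id, int PositiveVotes, int NegativeVotes)>
{
    (1, 3, 1),   // 75% positive
    (2, 0, 0),   // no votes yet
    (3, 9, 1),   // 90% positive
    (4, 1, 3),   // 25% positive
};

// Same key as in the query above: pictures without votes get -1,
// so they sort after every picture that has at least one vote.
var ordered = pictures
    .OrderByDescending(p =>
        p.PositiveVotes + p.NegativeVotes == 0
            ? -1
            : p.PositiveVotes / (double)(p.PositiveVotes + p.NegativeVotes))
    .Select(p => p.Id)
    .ToList();

Console.WriteLine(string.Join(", ", ordered)); // 3, 1, 4, 2
```

The guard for zero votes also avoids the division by zero that the original comparer would run into for a photo nobody has voted on.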

The calculations in your comparer code are independent (i.e. the comparison just depends on ordering a value that can be calculated without reference to the item you are comparing to). Therefore you should calculate your positive percentage number first and just use the calculated value in your comparer.
This should certainly be done in the database if possible (i.e. if you have access to make changes to the database). Databases are suited to this kind of calculation, and you could probably do it on the fly without caching the calculated values: have a view that works out the percentage for you rather than precalculating and storing the value every time there is a positive or negative vote. This removes the need to download all the photos to compare them, as you can just order by the positive percentage. Below is some sample SQL that will do the job (note it is only a sample; you might want to store the vote as a bit or something more efficient). The votes table contains a list of all votes for a particular picture and who voted for it.
declare @votes table(
    pictureId int,
    voterId int,
    vote int)
insert into @votes select 1,1,1
insert into @votes select 1,2,-1
insert into @votes select 1,3,1
insert into @votes select 1,4,1
insert into @votes select 2,1,-1
insert into @votes select 2,2,-1
insert into @votes select 2,3,1
insert into @votes select 2,4,1
declare @votesView table(
    pictureId int,
    positiveVotes int,
    negativeVotes int)
insert into @votesView
select pictureId,
    sum(case when vote > 0 then 1 else 0 end) as PositiveVotes,
    sum(case when vote < 0 then 1 else 0 end) as NegativeVotes
from @votes
group by pictureId
select pictureId, convert(decimal(6,2), positiveVotes) / convert(decimal(6,2), (positiveVotes + negativeVotes)) as rating from @votesView

Related

Calculate a Running Total plus minus with merge column

I have a table with five columns: description, opening balance, sale, sale return and receipt.
I want to merge opening balance and sale as "Debit", and sale return and receipt as "Credit".
How can I calculate a running total in a column named "balance", where debit amounts are added and credit amounts are subtracted?
My attempt is:
SELECT Description, (InvoiceAmount + OpeningBalance) as 'Dabit', (DrAmount + SaleReturn + BadDebtAmount) as 'credit', SUM (sale+ OpeningBalance-SaleReturn-recipt) over (ORDER BY id) AS RunningAgeTotal FROM tablename
You seem to be describing coalesce() and a window function:
select description,
coalesce(opening, sale) as debit,
coalesce(return, receipt) as credit,
sum(coalesce(opening, sale, 0) - coalesce(return, receipt, 0)) over (order by (case description when 'opening balance' then 1 when 'sale' then 2 when 'sale return' then 3 else 4 end)) as balance
from t
order by (case description when 'opening balance' then 1 when 'sale' then 2 when 'sale return' then 3 else 4 end);
At the expense of creating a temporary list, a LINQ version would be as follows.
Assuming your original source is a SQL database, you first need to bring the data into memory, e.g.
var records = OrderDetails
    .OrderBy(a => a.Date)
    .Select(a => new
    {
        a.Description,
        Debit = a.OpeningBalance + a.Sale,
        // Credit merges sale return and receipt, as described in the question.
        Credit = a.SaleReturn + a.Receipt
    })
    .ToList();
Note the query needs to be sorted to ensure the data is returned in the correct order. You haven't mentioned any other fields, so I have just assumed there is a field called Date that can be used.
Once you have the data in memory, you can then add the Balance column, i.e.
decimal balance = 0;
var list = records.Select(a => new
{
    a.Description,
    a.Debit,
    a.Credit,
    Balance = balance += (a.Debit - a.Credit),
}).ToList();
Because you are introducing a local variable and initialising it outside the LINQ statement, it is important that the query is not enumerated twice unless balance has been reset to zero. You can avoid this by materialising the result once with .ToList() or .ToArray().
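If the side effect inside Select feels fragile, a plain loop gives the same running balance without any risk of re-enumeration. The following is only a sketch with made-up rows; the tuple fields stand in for whatever the real projection contains.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rows, already sorted: (Description, Debit, Credit).
var records = new List<(string Description, decimal Debit, decimal Credit)>
{
    ("Opening balance", 1000m, 0m),
    ("Sale", 250m, 0m),
    ("Sale return", 0m, 50m),
    ("Receipt", 0m, 400m),
};

// Accumulate the balance row by row: debits add, credits subtract.
var list = new List<(string Description, decimal Debit, decimal Credit, decimal Balance)>();
decimal balance = 0;
foreach (var r in records)
{
    balance += r.Debit - r.Credit;
    list.Add((r.Description, r.Debit, r.Credit, balance));
}

foreach (var row in list)
    Console.WriteLine($"{row.Description}: {row.Balance}");
```

With these sample rows the balance column runs 1000, 1250, 1200, 800.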

How to Move Zero Qty Products to the end of List when i have a PagedList<Products>?

I have a method in my controller that returns a PagedList to my category-page view. It contains products (based on the current page number and page size the user has selected) loaded from a SQL Server stored procedure, like below:
var products = _dbContext.EntityFromSql<Product>("ProductLoad",
pCategoryIds,
pManufacturerId,
pOrderBy,
pageIndex,
pageSize).ToList(); // returning products based selected Ordering by.
var totalRecords = pTotalRecords.Value != DBNull.Value ? Convert.ToInt32(pTotalRecords.Value) : 0;
var allPrd= new PagedList<Product>(products, pageIndex, pageSize, totalRecords);
An example of the parameters sent to the stored procedure:
("ProductLoad",
[1,10,13],
[653],
"Lowest Price",
2,
64) // so it returns second 64 products with those category-ids and Brand-ids sorting by lowest to highest price
It works fine, but what I am trying to do is always send products with 0 quantity to the end of the list.
For example:
if I had 10k products, of which 2k have 0 quantity, I need to show the 8k available products first and then the 2k unavailable products at the end of the list.
What I have tried so far is to load all products first, without page size and page index, then move the zero-quantity products to the end of the list, and finally build the PagedList with a fixed page size:
("ProductLoad",
[1,10,13],
[653],
"Lowest Price",
0,
10000) // fixed page size means loading all products
var zeroQty= from p in products
where p.StockQuantity==0
select p;
var zeroQtyList= zeroQty.ToList();
products = products.Except(zeroQtyList).ToList();
products.AddRange(zeroQtyList);
var totalRecords = pTotalRecords.Value != DBNull.Value ? Convert.ToInt32(pTotalRecords.Value) : 0;
var allPrd= new PagedList<Product>(products, pageIndex, 64, totalRecords);
This moves all zero-quantity products to the end of the list.
But it always loads all products, which is not a good idea and certainly not an optimized approach; sometimes users get a page-loading timeout.
Because the category page shows 64 products per page, every time a user opens a page on the website all products are loaded, which delays the page.
Is there any way to solve this problem (have a PagedList that contains all products with quantity greater than zero first and the 0-quantity products second) without changing the stored procedure, and so fix the page-loading delays?
P.S.: The reason I avoid changing the stored procedure is that it already has too many joins, temp tables, unions and ORDER BYs.
Any help would be appreciated.
You will need to use the ROW_NUMBER function in your stored procedure.
This is an example of how I have done this before. Hopefully you will be able to adapt it to your SP.
-- temp table to hold the message IDs we are looking for
CREATE TABLE #ids (
    MessageId UNIQUEIDENTIFIER
    ,RowNum INT PRIMARY KEY
);
-- insert all message IDs that match the search criteria, with row number
INSERT INTO #ids
SELECT m.[MessageId]
    ,ROW_NUMBER() OVER (ORDER BY CreatedUTC DESC)
FROM [dbo].[Message] m WITH (NOLOCK)
WHERE ....
DECLARE @total INT;
SELECT @total = COUNT(1) FROM #ids;
-- determine which records we want to select
-- note: @skip and @take are parameters of the procedure
IF @take IS NULL
    SET @take = @total;
DECLARE @startRow INT, @endRow INT;
SET @startRow = @skip + 1;
SET @endRow = @skip + @take;
-- select the messages within the requested range, in row-number order
SELECT m.* FROM [dbo].[Message] m WITH (NOLOCK)
INNER JOIN #ids i ON m.MessageId = i.MessageId
WHERE i.RowNum BETWEEN @startRow AND @endRow
ORDER BY i.RowNum;
OrderByDescending could be useful to fix it, like below:
List<Product> SortedList = products.OrderByDescending(o => o.StockQuantity).ToList();
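Note that sorting by StockQuantity itself replaces whatever order the list already had. If the goal is only to push out-of-stock items to the end while keeping the existing price order, one option is to sort on a boolean key instead, relying on LINQ to Objects' OrderBy being a stable sort. This is a sketch with made-up data, and it only reorders the rows that were actually loaded, so on its own it does not solve paging across the full result set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical products, already sorted by price ascending.
var products = new List<(string Name, decimal Price, int StockQuantity)>
{
    ("A", 5m, 0),
    ("B", 7m, 12),
    ("C", 9m, 0),
    ("D", 11m, 3),
};

// false sorts before true, so in-stock items come first; because the sort
// is stable, each group keeps its existing price order.
var sorted = products.OrderBy(p => p.StockQuantity == 0).ToList();

Console.WriteLine(string.Join(", ", sorted.Select(p => p.Name))); // B, D, A, C
```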

Alias columns in the WHERE/GROUP BY clauses with Entity Framework Core?

I would like to produce a query similar to the following in Entity Framework Core:
SELECT ID, Name, ..., (SQUARE(Color.STX - 0.1) + SQUARE(Color.STY - 0.2) + SQUARE(Color.Z - 0.3)) AS ColorVarianceSquared
FROM Point
WHERE ColorVarianceSquared < 10000.0 AND ...
ORDER BY ColorVarianceSquared
The schema for the table:
CREATE TABLE Point (
[ID] UNIQUEIDENTIFIER NOT NULL,
[Name] NVARCHAR (64) NULL,
...
[Color] [sys].[geometry] NULL,
);
(the Color column is initialized from C#/EF as an instance of NetTopologySuite.Geometries.Point)
The use case for my question is ordering the results by how close the colors match (CIELAB color space is used).
Is there a way to create an alias column such that it can be used in the ORDER BY clause, or do I have to repeat the distance formula calculation? I understand that even bare SQL cannot use an alias column in the WHERE clause, so a derived table may have to come into play somewhere, or in the worst case I write the distance calculation three times: in the SELECT, WHERE, and ORDER BY. From EF, perhaps I could Select() into another object, cache the distance value there, and then query/order against that? Something like:
NetTopologySuite.Geometries.Point ReferenceColor = ...
double MaximumVariance = ...
context.Point.Select(m => new {
ID = m.ID,
Name = m.Name,
...
ColorVariance = m.Color.Distance(ReferenceColor),
})
.Where(m => m.ColorVariance < MaximumVariance && ...)
.OrderBy(m => m.ColorVariance)
.ThenBy(...)
To complicate things, the color matching and other WHERE conditions are added dynamically using Expressions based on parameters bound in a Controller method, so the ideal solution would involve perhaps Expressions and reflection to only add this functionality to the query if requested. While not strictly part of my question, I also wonder if adding all that wouldn't be less efficient than simply repeating the distance calculation and if it's not requested by the user, it won't get added to the WHERE clause (again, built dynamically based on user input). In other words, selecting it but not using it in the WHERE or ORDER BY may cause less of a slowdown in the database server than adding this complication to the web server.
Move your query into a subquery, then filter the results:
SELECT t1.ColorVarianceSquared FROM (
SELECT ID, Name, ..., (SQUARE(Color.STX - 0.1) + SQUARE(Color.STY - 0.2) + SQUARE(Color.Z - 0.3)) AS ColorVarianceSquared
FROM Point) as t1
WHERE t1.ColorVarianceSquared < 10000.0
ORDER BY t1.ColorVarianceSquared
For a faster query, filter inside the subquery:
SELECT t1.ColorVarianceSquared FROM (
SELECT ID, Name, ..., (SQUARE(Color.STX - 0.1) + SQUARE(Color.STY - 0.2) + SQUARE(Color.Z - 0.3)) AS ColorVarianceSquared
FROM Point
WHERE (SQUARE(Color.STX - 0.1) + SQUARE(Color.STY - 0.2) + SQUARE(Color.Z - 0.3)) < 10000.0) as t1
ORDER BY t1.ColorVarianceSquared

Select Count(Id) in linq

Is there any way to write a linq query to result in :
select Count(Id) from tbl1
because
tbl1.Select(q=>q.Id).Count()
doesn't translate to the result that I want
update :
it returns :
select count(*) from tbl1
Update after answer :
I tested the scenario with more than 21,000,000
Is there any way to write a linq query to result in.
No. The first thing is to understand what you need. For example, in T-SQL you can use:
COUNT(*) counts the rows in your table.
COUNT(column) counts the entries in a column, ignoring null values.
If you need to count how many rows you have, just use
var total = tbl1.Count();
If you need to see how many entities you have where a specific column is not null, then use the filtering overload of the Count method.
var total = tbl1.Count(x => x.Id != null);
No, it is not possible. There is no performance difference between Count(*) and Count(Id), even more so if your Id is the primary key.
I did an experiment on a table with more than one million tuples. See the execution plans of both queries. The first is select count(*) and the second is select count(id). The id is the primary key (sorry, the results are in Brazilian Portuguese):
Using count(field) in sql counts all non-null values. In linq, you can say:
tbl1.Where(q => q.Id != null).Count();
or simply:
tbl1.Count(q => q.Id != null);
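The difference between the two counts is easy to check on an in-memory sequence. In this sketch the nullable array is a made-up stand-in for a column that allows NULL; a primary key would never contain nulls, in which case both counts agree.

```csharp
using System;
using System.Linq;

// Hypothetical column values; null stands for a SQL NULL.
int?[] ids = { 1, 2, null, 4, null };

int allRows = ids.Count();                  // analogous to COUNT(*)
int nonNull = ids.Count(id => id != null);  // analogous to COUNT(Id)

Console.WriteLine($"{allRows} {nonNull}"); // 5 3
```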
A possibility to get
select Count(Id) from tbl1
would be
tbl1.Where(q => q.Id != null).Select(x => x.Id).Distinct().Count();
The above Where is there to avoid null values. If you want them to also be counted, the Where needs to be eliminated and the Select adjusted to deal with null entries.
Additionally if you don't want to count just distinct values then the Select and Distinct parts can be disregarded.

LINQ to SQL complex query problem

I have 3 tables: Principal (Principal_ID, Scale), Frequency (Frequency_ID, Value) and Visit (Visit_ID, Principal_ID, Frequency_ID).
I need a query which returns all principals (in the Principal table), and for each record, query the capacity required for that principal, calculated as below:
Capacity = (Principal.Scale == 0 ? 0 : (Frequency.Value == 1 ? 1 : Frequency.Value * 1.8) / Principal.Scale)
I'm using LINQ to SQL, so here is the query:
from Principal p in ShopManagerDataContext.Instance.Principals
let cap =
(
from Visit v in p.Visits
let fqv = v.Frequency.Value
select (p.Scale != 0 ? ((fqv == 1.0f ? fqv : fqv * 1.8f) / p.Scale) : 0)
).Sum()
select new
{
p,
Capacity = cap
};
The generated TSQL:
SELECT [t0].[Principal_ID], [t0].[Name], [t0].[Scale], (
    SELECT SUM(
        (CASE
            WHEN [t0].[Scale] <> @p0 THEN (
                (CASE
                    WHEN [t2].[Value] = @p1 THEN [t2].[Value]
                    ELSE [t2].[Value] * @p2
                END)) / (CONVERT(Real,[t0].[Scale]))
            ELSE @p3
        END))
    FROM [Visit] AS [t1]
    INNER JOIN [Frequency] AS [t2] ON [t2].[Frequency_ID] = [t1].[Frequency_ID]
    WHERE [t1].[Principal_ID] = [t0].[Principal_ID]
    ) AS [Capacity]
FROM [Principal] AS [t0]
And the error I get:
SqlException: Multiple columns are specified in an aggregated expression containing an outer reference. If an expression being aggregated contains an outer reference, then that outer reference must be the only column referenced in the expression.
Any ideas how to solve this, if possible, in one query?
Thank you very much in advance!
Here are 2 ways to do this by changing up your approach:
Create a user-defined aggregate function using the SQL CLR. This may not be the right solution for you, but it's a perfect fit for the problem as stated. For one thing, this would move all of the logic into the data layer, so LINQ would be of limited value. With this approach you get efficiency, but there's a big impact on your architecture.
Load the Visit and Frequency tables into a typed DataSet and use LINQ to DataSet. This will probably work using your existing code, but I haven't tried it. With this approach your architecture is more or less preserved, but you could take a big efficiency hit if Visit and Frequency are large.
Based on the comment, I have an alternative suggestion. Since your error is coming from SQL, and you aren't using the new column as a filter, you can move your calculation to the client. For this to work, you'll need to pull all the relevant records (using DataLoadOptions.LoadWith<> on your context).
To further your desire for use with binding to a DataGrid, it would probably be easiest to bury the complexity in a property of Principal.
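partial class Principal
{
    // double rather than decimal: the expression mixes floating-point values
    // (the 1.8 literal), which do not convert implicitly to decimal.
    public double Capacity
    {
        get
        {
            return this.Scale == 0 ? 0 : this.Visits.Select(v =>
                (v.Frequency.Value == 1 ? 1 : v.Frequency.Value * 1.8) / this.Scale).Sum();
        }
    }
}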
Then your retrieval gets really simple:
using (ShopManagerDataContext context = new ShopManagerDataContext())
{
    DataLoadOptions options = new DataLoadOptions();
    options.LoadWith<Principal>(p => p.Visits);
    options.LoadWith<Visit>(v => v.Frequency);
    context.LoadOptions = options;
    return (from p in context.Principals
            select p).ToList();
}
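As a quick sanity check of the formula once the data is on the client, here is a standalone sketch of the same calculation. The scale and frequency values are made up, and the local function is a hypothetical stand-in for the Capacity property, not part of the generated LINQ to SQL classes.

```csharp
using System;
using System.Linq;

// Hypothetical data: one principal with Scale = 2 and three visits whose
// frequency values are 1, 2 and 0.5.
int scale = 2;
double[] frequencyValues = { 1.0, 2.0, 0.5 };

// Mirrors the property: a frequency of exactly 1 counts as 1, any other
// value is weighted by 1.8; each term is divided by the scale and summed.
// A scale of 0 yields 0 instead of dividing by zero.
double Capacity(int s, double[] values) =>
    s == 0 ? 0 : values.Select(v => (v == 1 ? 1 : v * 1.8) / s).Sum();

Console.WriteLine(Capacity(scale, frequencyValues));
Console.WriteLine(Capacity(0, frequencyValues));
```

For the sample values the terms are 1/2, 3.6/2 and 0.9/2, so the first result is approximately 2.75.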
