Entity Framework and self-referencing table - c#

I need to have a database that starts with a table called "User" that references itself and will have a very deep graph of related objects. It will need to look like the left side of the image below (disregard the right side).
I will also need to traverse this graph both upwards and downwards in order to calculate percentages, totals, etc. In other words, I'll need to traverse the entire graph in some cases.
Is this possible and/or how is it done? Can traversing be done right in the LINQ statement? Examples?
EDIT:
I'm basically trying to create a network marketing scenario and need to calculate each person's earnings.
Examples:
To be able to calculate the total sales for each user under a specific user (so each user would have some sort of revenue coming in).
Calculate the commission at a certain level of the tree (e.g. if the top person had 3 people below them, each selling a product for $1, and the commission was 50%, then there would be $1.50).
If I queried the image above (on the left) for "B" I should get "B,H,I,J,N,O"
Hopefully that helps :S

You can't traverse the whole tree using just LINQ in a way that would translate to a single SQL query (or a constant number of them). You can do it either with one query per level, or with a single query limited to a specific number of levels (but such a query would get really big with many levels).
In T-SQL (I assume you're using MS SQL Server), you can do this using a recursive common table expression. It should be possible to put that into a stored procedure that you can call from LINQ to get the information you actually want.
To sum up, your options are:
Don't use LINQ, just SQL with a recursive CTE
Use a recursive CTE in a stored procedure (or raw SQL) called from LINQ (see the sketch after this list)
Use LINQ, creating one query per level
Use an ugly LINQ query limited to just a few levels
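To make the second option concrete, here is a minimal, hedged sketch. The entity shape (Id, Sales, ParentId), the Users table name, and the context/rootUserId names are illustrative assumptions, since the original image/schema isn't available, and it presumes EF6's Database.SqlQuery:

// Hypothetical self-referencing entity (property names are illustrative).
public class User
{
    public int Id { get; set; }
    public decimal Sales { get; set; }
    public int? ParentId { get; set; }              // null at the top of the tree
    public virtual User Parent { get; set; }
    public virtual ICollection<User> Children { get; set; }
}

// Let a recursive CTE walk the tree on the server and materialize the rows
// through EF; results from SqlQuery<T> are not change-tracked.
var descendants = context.Database.SqlQuery<User>(@"
    with Tree as (
        select * from Users where Id = @p0
        union all
        select u.* from Users u join Tree t on u.ParentId = t.Id
    )
    select * from Tree;", rootUserId).ToList();

// e.g. total sales under a specific user, per the question's example
var totalSales = descendants.Sum(u => u.Sales);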

I know this is late, but if you look at directed graph algorithms, you can bypass the recursion issues. Check out these 2 articles:
http://www.sitepoint.com/hierarchical-data-database/
http://www.codeproject.com/Articles/22824/A-Model-to-Represent-Directed-Acyclic-Graphs-DAG-o

Related

Entity Framework Complex query with inner joins

I want to execute a complex query using Entity Framework (Code First).
I know it can be done with LINQ, but what is the best way to write a LINQ query for complex data with inner joins?
I want to execute the following query:
var v = from m in WebAppDbContext.UserSessionTokens
        from c in WebAppDbContext.Companies.Include(a => a.SecurityGroups)
        from n in WebAppDbContext.SecurityGroups.Include(x => x.Members)
        where m.TokenString == userTokenString &&
              n.Members.Contains(m.User) &&
              c.SecurityGroups.Contains(n)
        select c;
Is this the best way to do this?
Does this query pull an entire list of records from the db and then execute the filtering on that list? (it might be a huge list)
And most important: Does it query the database several times?
In my opinion, and based on my own experience, when talking about performance, especially joining data sets, it's faster when you write it in SQL. But since you used the code-first approach, that's not an option. To answer your questions: your query will not hit the database several times (you can verify this by debugging and watching the Events log in VS). EF will transform your query into a single SQL statement and execute it.
TL;DR: don't micromanage the robots. Let them do their thing and 99% of the time you'll be fine. LINQ doesn't really expose methods to micromanage the underlying data query anyway; its whole purpose is to be an abstraction layer.
In my experience, the LINQ provider is pretty smart about converting valid LINQ syntax into "pretty good" SQL. It looks like your query is all inner joins, and those should all be on the primary indexes/foreign keys of each table, so it's going to come up with something pretty good.
If that's not enough to soothe your worries, I'd suggest:
Put on a SQL trace to see the actual query that's being presented to the database. I bet it's not as simple as Companies join SecurityGroups join Members join UserTokens, but it's probably equivalent to it.
Only worry about optimizing if this becomes an actual performance problem.
If it does become a performance problem, consider refactoring your problem space. It looks like you're trying to get a Company record from a UserToken, by going through three other tables. Is it possible to just relate Users and Companies directly, rather than through the security model? (please excuse my total ignorance of your problem space, I'm just saying "maybe look at the problem from another angle?")
In short, it sounds like you're burning brain cycles attempting to optimize something prematurely. Now, it's reasonable to think if there is a performance problem, this section of the code could be responsible, but that would only be noticeable if you're doing this query a lot. Based on coming from a security token, I'd guess you're doing this once per session to get the current user's contextual information. In that case, the problem isn't usually with Linq, but with your approach to solving the problem for finding Company information. Can you cache the result?
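For what it's worth, here is a hedged sketch of the same lookup leaning on the navigation properties the question's Includes imply (SecurityGroups on Company, Members on SecurityGroup); it is not the asker's code, but it still translates to one round trip and lets the provider pick the joins:

// Illustrative rewrite under assumed navigation properties.
var company = (from m in WebAppDbContext.UserSessionTokens
               where m.TokenString == userTokenString
               from c in WebAppDbContext.Companies
               where c.SecurityGroups.Any(g => g.Members.Contains(m.User))
               select c).FirstOrDefault();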

Which query is optimized?

I am fetching a list of products including their prices. I want to get just the enabled prices.
I wrote two type of queries:
context.Products.Include("Prices").Where(p => p.Prices.Where(pr => pr.Enable == true).Count() > 0).ToList();
And the other one is:
context.Products.Include("Prices").ToList().RemoveAll(p => p.Prices.Where(pr => pr.Enable == true).ToList().Count == 0);
Which one is more optimized?
Assuming you are using an Entity Framework context, the first one is way better.
This is because the LINQ provider will translate the statement into a single SQL statement; the Where clauses result in a corresponding SQL WHERE, so only the necessary subset of the elements is retrieved.
The second statement retrieves all Products and Prices and then removes the unwanted elements in memory.
This assumes that you have a remote database. If your database is running locally, or you already have all Products and Prices in memory, it's not so easy to tell (you would have to use the profiler for that).
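As a side note, a hedged sketch of the first query written with Any(), which expresses the same filter and typically translates to an SQL EXISTS rather than a COUNT:

// Same filter as the first statement, just more idiomatic.
var products = context.Products
    .Include("Prices")
    .Where(p => p.Prices.Any(pr => pr.Enable))
    .ToList();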
This kind of question really depends on a lot of things, so it is not so easy to say which is better.
But from the code, the first one does the where clause on the SQL side, whereas the second gets all the data out of SQL and does the filtering in the application.
So it will depend on the SQL server, the application hardware, and the amount of data.

Natural sort for SQL Server

Similar Questions
Similar questions have been asked before, but they always involve specific qualities of the data that allow a more targeted "split it up and just sort by this part" approach, which does not work when you don't know the structure of the data in the column - or even the column, frankly. In other words, they are not a generic, "natural" sort order - something roughly equivalent to SELECT * FROM [parts] ORDER BY [part_category] DESC, [part_number] NATURAL DESC
My Situation
I have a DataView in C# that has a Sort parameter for specifying the ORDER BY that would be used by ADO, and a requirement to sort by a column using a 'natural' sort algorithm. I could in theory do just about anything from creating a different column to sort by (based on the column I'd like to have 'sorted naturally') to not sorting in SQL, but rather sorting the result set in code afterwards. I'm looking for the best balance of flexibility, efficiency, preparation effort, and maintainability. I would benefit somewhat from being able to sort such data after retrieval (in C#) or completely within a stored procedure.
In my mind, and according to customer statements so far, 'natural' sort order will mean treating upper and lower case letters equivalently, and considering the magnitude of a number rather than the ASCII values of its digits (that is, x90 comes before x100). Jeff Atwood had a pretty decent discussion of this, but it didn't address SQL sorting. That said, these are my thoughts:
Incorporating the magnitude awareness while also retaining the ability to sort alpha characters ASCII-betically may also come in handy
Non-alphanumeric characters would probably have to be sorted ASCII-betically regardless
Decimal point awareness might be more effort than it's worth, since most of the time periods and commas in alphanumeric fields are treated as merely punctuation/separators, and only denote fractional portions when they're representing a float field
My Question
What is a reasonably flexible, reasonably generic, reasonably efficient, approach to implementing a natural sort algorithm for SQL? Weighing the pros and cons, which is the best approach? Is there another option?
Is there a native SQL way to ORDER BY [field] NATURAL DESC or something?
PURE SQL function to create a 'sort equivalent' - Could be used to create some sort of second, possibly indexed, 'sort value' column, or called from a stored procedure, or specified in an 'ORDER BY' clause - but how to write it efficiently? (loops? is there a set based solution at all??)
CLR SQL Function - usability benefits of pure SQL function, but using procedural language, like C# (algorithm should be no problem, but can it be made to go faster than a pure SQL sort [set based??] implementation?) Also, could be referenced and utilized in C# if efficient enough.
Avoid SQL Server - since parsing an arbitrary number of numbers amid all sorts of other characters is really best suited to looping or recursion, and T-SQL is not well suited to looping or recursion (though TECHNICALLY supported, all I see is 'DON'T USE LOOPS!!!' and 'CTEs are even worse!!!')
Some sort of comparator in SQL(??) - doesn't seem like SQL lends itself to that sort of sorting and I don't see a way to specify a comparator to use - so I guess this won't work...
I have values at least as varied as the following:
100s455t
200s400
d399487
S0000005.2
d400400
d99222
cg9876
D550-9-1
CL2009-3-27
f2g099
f2g100
f2g1000
f2g999
cg 8837
99s1000f
These should be sorted as follows:
99s1000f
100s455t
200s400
cg9876
cg 8837
CL2009-3-27
D550-9-1
d99222
d399487
d400400
f2g099
f2g100
f2g999
f2g1000
S0000005.2
Create a sort column. That way you can keep in place all the usual mechanisms you use today to sort; you can index that column, for example.
Split the string into parts. You need to pad number parts with zeroes to the maximum possible number length.
For example, CL2009-3 would become CL|000002009|-|000000003.
This way the usual case-insensitive SQL Server collation sort behavior will create the right order.
Doing a natural sort dynamically prevents indexing, requires the entire data set to move into the app for each query, and is resource intensive.
Instead, simply update the sort column whenever you update the base column.
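A minimal sketch of building such a sort key on the application side before storing it (the helper name and the pad width of 9 are assumptions; the pipes in the example above just mark the split points, so the parts are simply concatenated here):

using System.Text;
using System.Text.RegularExpressions;

static string NaturalSortKey(string value, int padWidth = 9)
{
    var sb = new StringBuilder();
    // Split into alternating runs of digits and non-digits.
    foreach (Match m in Regex.Matches(value, @"\d+|\D+"))
    {
        sb.Append(char.IsDigit(m.Value[0])
            ? m.Value.PadLeft(padWidth, '0')   // "2009" -> "000002009"
            : m.Value);                        // text runs pass through
    }
    return sb.ToString();
}

// NaturalSortKey("CL2009-3") == "CL000002009-000000003"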
OK, here is something that is almost what you are looking for. The only case it can't deal with is when there are some characters, then a space, and then numbers (cg 8837 vs cg9876). It would be good if in the future you could post the DDL and sample data so we can work with it.
with Something (SomeValue) as (
    select '100s455t' union all
    select '200s400' union all
    select 'd399487' union all
    select 'S0000005.2' union all
    select 'd400400' union all
    select 'd99222' union all
    select 'cg9876' union all
    select 'D550-9-1' union all
    select 'CL2009-3-27' union all
    select 'f2g099' union all
    select 'f2g100' union all
    select 'f2g1000' union all
    select 'f2g999' union all
    select 'cg 8837' union all
    select '99s1000f'
)
select *
from Something
order by
    cast(
        case when patindex('%[A-Za-z]%', SomeValue) = 1 then '99999999999'
             when patindex('%[A-Za-z]%', SomeValue) = 0 then SomeValue
             else substring(SomeValue, 1, patindex('%[A-Za-z]%', SomeValue) - 1)
        end as bigint),
    SomeValue
I would recommend that you "stay away from SQL Server". While technically you can implement everything using a T-SQL or CLR function, SQL Server remains a single, non-scalable unit of the infrastructure. Using its CPU resources for heavy computing will inevitably impact the performance of the system in general. And in the end, SQL Server will perform the sort using almost exactly the same algorithm that you would use to sort the array on the application side, i.e. looking at each item in the array and comparing it to the others until it finds the appropriate position.
Of course, I am assuming that, if you try to implement this type of sort on the SQL Server side, you will copy the data into a temporary table before performing the sort, to avoid data locks etc.
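If you do sort on the application side, here is a hedged sketch of a natural-order comparer (not from the answer above; like the CTE query, it will not reproduce the asker's cg9876-before-"cg 8837" ordering, because a space still compares before a digit):

using System;
using System.Collections.Generic;

// Compares digit runs by numeric value and other characters case-insensitively.
sealed class NaturalComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            if (char.IsDigit(x[i]) && char.IsDigit(y[j]))
            {
                // Walk past both digit runs and compare them as numbers.
                int si = i, sj = j;
                while (i < x.Length && char.IsDigit(x[i])) i++;
                while (j < y.Length && char.IsDigit(y[j])) j++;
                string nx = x.Substring(si, i - si).TrimStart('0');
                string ny = y.Substring(sj, j - sj).TrimStart('0');
                if (nx.Length != ny.Length) return nx.Length - ny.Length;
                int c = string.CompareOrdinal(nx, ny);
                if (c != 0) return c;
            }
            else
            {
                int c = char.ToUpperInvariant(x[i]).CompareTo(char.ToUpperInvariant(y[j]));
                if (c != 0) return c;
                i++; j++;
            }
        }
        return (x.Length - i) - (y.Length - j);   // shorter string first on ties
    }
}

// Usage: rows.Sort(new NaturalComparer());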

Speed up linq query without where clauses

Quick LINQ performance question.
I have a database with many many records and it's used for a webshop.
All query logic and paging is done with LINQ, and it performs quite well.
This is because the usual search for products contains one or more where clauses, and that shortens my result set to a couple of hundred results at max.
But.. there is an option to list all products (when no search criteria are provided), and that query is slow.. real slow. Even though I'm just asking for a single page with .Skip(20).Take(10), it's still slow because the total result is something like 140000 products. Is there a way to limit this (or every) query, so that the speed of the whole thing stays okay?
I don't want to force my customers to provide one or more criteria.. but on the other hand I have no problem with telling them that they can never find more than 2000 products.
Thanks for helping!
Tys
Why don't you limit the number of records on the SQL side, as described in this post:
http://www.sqlservercurry.com/2009/06/skip-and-take-n-number-of-records-in.html
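If capping at 2000 is acceptable (as you suggest), one hedged sketch is to push a hard Take into the query before paging, so the server never has to rank all 140000 rows (the OrderBy key and the pageIndex/pageSize variables are illustrative assumptions):

// Illustrative only: cap first, then page inside the capped window.
var page = context.Products
    .OrderBy(p => p.Id)   // a deterministic order is required for Skip/Take
    .Take(2000)           // hard cap, translated into the SQL itself
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();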
Watch out for any "premature" enumerations when you pass down queries/results in your code!
There are also several LINQ visualizers available, which can help you see what the LINQ expressions actually translate to. Or you can play around with expressions in LINQPad before integrating them in your code…
What you can do is have LINQ use a stored procedure from the database.
In that case it will be faster, because it is the database engine that will do the work and return the result to LINQ; the database engine is made for that, and it is closer to the data than LINQ.
I suggest you give it a try and give us feedback.
You can check what indexes the table has and what the PK is. It could be that the table has no index at all, so records are compared by field values. Also, you can catch the query in SQL Profiler, run it separately, and analyse its query plan.

NHibernate - Log items that appear in a search result

I am using NHibernate in an MVC 2.0 application. Essentially I want to keep track of the number of times each product shows up in a search result. For example, when somebody searches for a widget, the product named WidgetA will show up on the first page of the search results. At this point I will increment a field in the database to reflect that it appeared as part of a search result.
While this is straightforward, I am concerned that the inserts themselves will greatly slow down the search results. I would like to batch my statements together, but it seems that coupling my inserts with my select may be counterproductive. Has anyone tried to accomplish this in NHibernate and, if so, are there any standard patterns for completing this kind of operation?
Interesting question!
Here's a possible solution:
var searchResults = session.CreateCriteria<Product>()
    // your query parameters here
    .List<Product>();

session.CreateQuery(@"update Product set SearchCount = SearchCount + 1
                      where Id in (:productIds)")
       .SetParameterList("productIds", searchResults.Select(p => p.Id).ToList())
       .ExecuteUpdate();
Of course you can do the search with Criteria, HQL, SQL, LINQ, etc.
The update query is a single round trip for all the objects, so the performance impact should be minimal.
