Reducing Objects to Categories based on arbitrary logic

Reducing Objects to Categories based on arbitrary logic - c#

This question could be subjective. I'm not sure if it belongs here or on Programmers
Say I have a data type, X (think of business objects constructed from a relational database). My end goal is represent many instances of this type in a table for a report, with each instance under one of a few different headings.
The heading to display the object under is selected based on arbitrary logic handed down from management, which anybody who's done corporate software development will be familiar with:
Examples:
If the instance has a FooID of 6 and a BarFactor of < 0.5, place it under the "Borked" heading.
or
If the Weight is > 0, and the CreatedDate is before Midnight but after 3PM, and today is not the 3rd Wednesday of the month, the object should be categorised as "Fluffy".
My question: Is there a common idiom for taking an instance of X, applying a potentially headache-inducing amount of logic to the state of the instance and getting a category/string/arbitrary value from the result of that logic?
My ideas so far have been:
A function which takes an X and returns a String. I could easily see this turning into a Megamoth, and a b***** to maintain as requirements are constantly modified.
Define a Heading abstract type, and a factory function which gives me an instance of Heading with the ToString method correctly overloaded. I think this technique will be plagued by the same issues as idea no. 1.
A hierarchy of functions, each splitting the problem up a little further until we arrive at the correct heading.
For example:
public String GetHeading(X x)
{
if (x.Weight > 0)
return WeightGreaterThanZero(x);
else if (x.Weight < 0)
return WeightLessThanZero(x);
else
return WeightIsZero(x);
}
The three "Weight" functions would test further conditions until we arrive at a value. The problem I see here is that we need to track which function called which. The FooIDIs6 function needs to know whether it was called by WeightIsZero or some other function, otherwise any preceding decisions are potentially meaningless. We end up with something like WeightIsGreaterThanZero_FooIDIs6_GrandmotherIsOlderThan100 etc. etc.

I'm not sure that this completely applies to what you are trying to accomplish, but whenever we encounter this type of "fun", we end up providing the business users a user interface to define their rules, then just write enough code to apply the rules to the specific objects.
For example, in one of our applications, we allow the user to specify a set of conditions that, when evaluated to true, will produce a user-defined output.
To do this, we store one record in the database for each portion of the IF statement. Each record specifies the property name in the class, the comparison action (<, =, <>, etc), the comparison value, whether or not the statement begins or ends a group (i.e. parens), and how the current statement is joined with the next (AND or OR).
So you would have
Heading Record (Parent record, which defines the heading to be used)
Heading Selector (Child records which define the if statement)
We can also support user-defined priorities in the parent record so that if a given record has multiple matches, it is the user's responsibility to determine the best match through the proper application of priorities.
At run time, you just build a statement to evaluate the conditions as entered by the user. We use reflection so that we don't have to hard-code property names.
In addition, the same configuration code can be used to generate SQL statements in the cases where it is more appropriate to perform the selection logic in SQL.
We have used this mechanism extensively and have found it to be a very powerful approach to meeting the frequent and varied demands of our customers.

What about a set of category classes that accepts an X and indicates whether it is the correct category for that X? Something like:
var categories = new Category[] { new FooCategory(), new BarCategory(), new FluffyCategory() };
foreach( var x in myListOfXs ) {
var cat = categories.FirstOrDefault(c => c.Matches(x));
if( cat != null ) {
x.Category = cat.Name;
}
}

Related

Creating a join Linq query to optimize looping together two lists

I'm not good at databases and T-sql queries so I got a little stumped over how to do something similar in C# using Linq.
Thing is I have this structure that is pretty much the same as a relational database table, with which I have to do some kind of join selection.
In effect I get a list of composite key addresses. These are actually classes that hold a few int values (byte or short perhaps but not relevant). Now I have to search through my structure for matches of these lists and call a method there.
This is probably a simple join (can't remember what join does what) but I need some help because I wan't this to be as cheap as I can easily get away with so I don't need to search through every line for every address.
public class TheLocationThing
{
int ColumnID;
int ColumnGroupId;
int RowID;
}
public class TheCellThing
{
TheLocationThing thing;
public void MethodINeedToCallIfInList()
{
//here something happens
}
}
public class TheRowThing
{
int RowId;
List<TheCellThing> CellsInThisRow;
}
public class TableThing
{
List<TheRowThing> RowsInThisTable;
}
So I have this tablething type class that has rows and they have cells. Notice that ColumnGroup thing, it's a composite key with ColumnId so the same columnid can come again but only once for each ColumnGroup.
The thing to keep in mind though is that inn TheTable there will only ever be one GroupColumnId, but the list given could have multiple, so we can filter them away.
public void DoThisThing()
{
List<TheLocationThing> TheAddressesINeedToFind = GetTheseAddresses(); //actualy is a TheLocationThing[] if that matters
var filterList = TheAddressesINeedToFind.Where(a => a.ColumnGroupId == this.CurrentActiveGroup);
//Here I have to do the join with this.TableInstance
}
Now, I should of course only loop through the addresses with the same row id in that row and all that.
Also is managing thing as IQueryable something that would help me out here, especially in the initial filter out, should i get it as Queryable?

I'm going to give different example, because I'm not quite following yours, and use it to explain the basics of joining, hopefully hitting what you need to learn.
Let's imagine two slightly more meaninfully-named classes than LocationThing etc. (which has me lost).
public class Language
{
string Code{get;set;}
string EnglishName{get;set;}
string NativeName{get;set;}
}
public class Document
{
public int ID{get; private set;}//no public set as it corresponds to an automatically-set column
public string LanguageCode{get;set;}
public string Title{get;set;}
public string Text{get;set;}
}
Now, let's also imagine we have methods GetLanguages() and GetDocuments() that return all languages and documents respectively. There's a few different ways that could work, and I'll get to that later.
An example of a join being useful, is if we e.g. wanted all the titles and all the English names of the languages they were in. For that in SQL we would use:
SELECT documents.title, languages.englishName
FROM languages JOIN documents
ON languages.code = documents.languageCode
Or leaving out table names where doing so doesn't make column-names ambiguous:
SELECT title, englishName
FROM languages JOIN documents
ON code = languageCode
Each of these will, for each row in documents, match them up with the corresponding row in languages, and return the title and English name of the combined row (if there's a document with no matching language, it doesn't get returned, if there's two languages with the same code - should be prevented by the db in this case though - corresponding documents get mentioned once for each).
The LINQ equivalent is:
from l in GetLanguages()
join d in GetDocuments()
on l.Code equals d.LanguageCode //note l must come before d
select new{d.Title, l.EnglishName}
This will similarly match each document with its corresponding language and return an IQueryable<T> or IEnumerable<T> (depending on the source enumerations/queryables) where T is an anonymous object with Title and EnglishName properties.
Now, as to the expense of this. This depends primarily on the nature of GetLanguages() and GetDocuments().
No matter what the source, this is inherently a matter of searching through every one of the results of those two methods - that's just the nature of the operation. However, the most efficient way of doing this, is still something that varies according to what we know about the source data. Let's consider a Linq2Objects form first. There's lots of ways that this could be done, but lets imagine they're returning Lists that were pre-computed:
public List<Document> GetDocuments()
{
return _precomputedDocs;
}
public List<Language> GetLanguages()
{
return _precomputedLangs;
}
Let's pretend Linq's join doesn't exist for a moment, and imagine how we'd write something functionally equivalent to the code above. We might arrive at something like:
var langLookup = GetLanguages().ToLookup(l => l.Code);
foreach(var doc in GetDocuments())
foreach(var lang in langLookup[doc.LanguageCode])
yield return new{doc.Title, lang.EnglishName};
This is a reasonable general case. We can go one step further, and reduce storage, since we know that all we finally care about with each language is the English name:
var langLookup = GetLanguages().ToLookup(l => l.Code, l => l.EnglishName);
foreach(var doc in GetDocuments())
foreach(var englishName in langLookup[doc.LanguageCode])
yield return new{doc.Title, EnglishName = englishName};
That's about as much as we can do without special knowledge of the set of data.
If we did have special knowledge, we could go further. For example, if we knew there was only one language per code, then the following would be faster:
var langLookup = GetLanguages().ToDictionary(l => l.Code, l => l.EnglishName);
string englishName;
foreach(var doc in GetDocuments())
if(langLookup.TryGetValue(doc.LanguageCode, out englishName))
yield return new{doc.Title, EnglishName = englishName};
If we knew the two sources were both sorted by language code, we could go further still and spin through them both at the same time, yielding matches, and throwing away languages once we've dealt with them, as we're never going to need it again for the rest of the enumeration.
But, Linq does not have that special knowledge when just looking at two lists. For all it knows every single language and every single document all have the same codes. It really has to examine the lot to find out. For that, it's pretty efficient in how it does it (a bit better than my example above suggests, due to some optimisation).
Let's consider a Linq2SQL case, and note that Entity Framework and other ways of using Linq directly on databases would be comparable. Let's say all of this is happening in the context of a class that has a _ctx member that's a DataContext. Then our source methods could be:
public Table<Document> GetDocuments()
{
return _ctx.GetTable<Document>();
}
public Table<Language> GetLanguages()
{
return _ctx.GetTable<Languages>();
}
Table<T> implements IQueryable<T> along with some other methods. Here, instead of joining things in memory, it'll execute the following (bar some aliases) SQL:
SELECT documents.title, languages.englishName
FROM languages JOIN documents
ON languages.code = documents.languageCode
Look familiar? It's the same SQL we mentioned at the beginning.
First great thing about this, is that it's not bringing back anything from the database that we won't use.
Second great thing, is that the database's query engine (what turns this into executable code that it then runs) does have knowledge of the nature of the data. If for example we've set up the Languages table to have a unique key or constraint on the code column, the engine knows there can't be two languages with the same code, so it can perform the equivalent of the optimisation we mentioned above where we used a Dictionary instead of a ILookup.
Third great thing, is that if we have indices on languages.code and documents.languageCode then the query engine will use these for even faster retrieval and matching, perhaps getting all it needs from the index without hitting the table, making a call as to which table to hit first to avoid testing irrelevant rows in the second, and so on.
Fourth great thing, is that RDBMSs have benefited from several decades of research into how to make this sort of retrieval as fast as possible, so we've stuff going on that I don't know about and don't need to know about to benefit from.
In all then, we want to run our queries against the datasource directly, not against sources in memory. There are exceptions, particularly some forms of grouping (hitting the DB directly with some group-by operations can mean hitting it repeatedly) and if we reuse the same results over and over in quick succession (in which case we're better off hitting it once for those results, and then storing them).

Is it possible to use a LINQ Query in Windows Workflow rule condition?

I am in the process of prototyping an implementation of a rules engine to help us with our ordering portals. For example giving discounts on items or requiring approval if certain items are ordered. We would also like to be able to add rules for dollar amounts, user hierarchy positions, and be apply it one client or more.
I feel that WWF is a good answer to this need.
All of that said however I am having a little difficulty figuring out how best to set up some of the more complex rules. I have a "condition" that I feel is best described in a LINQ query, like so:
var y = from ol in currentOrder.OrderLines where ol.ItemCode == "MYITEMCODE" select ol;
I am not against using a different framework for a rules engine or adding additional properties/methods to our objects (ex: OrderHasItem(ItemCode), etc) to make these rules more simplified but I would like to avoid having to do that. It feels self-defeating in that it forces us down the road of potentially requiring code changes for new rules.

Yes, you can use Linq queries with Workflow. In WF what you are referring to as a rule is an expression that is evaluated at runtime. Your query is selecting a subset of the orderliness based on a criteria.
For example, if I have a collection of names and I want to see only names that begin with 'R'. I could write the following code.
private static void ShowQueryWithCode(IEnumerable<string> names)
{
Console.WriteLine("LINQ Query in Code - show names that start with 'R'");
// Assuming there are no null entries in the names collection
var query = from name in names where name.StartsWith("R") select name;
// This is the same thing as
// var query = names.Where(name => name.StartsWith("R"));
foreach (var name in query)
{
Console.WriteLine(name);
}
Console.WriteLine();
}
To do the same with Workflow
Step 1: Create a Workflow with an In Argument of type IEnumerable
Here you can see that I've added the in argument
Step 2: Add a Variable for the query of type IEnumerable
Before you can add a variable you need to include an activity which has variables. In this workflow I've added a sequence.
Step 3: Assign the query the LINQ expression you want to use
You can use a method chain or query syntax.
Step 4: Iterate over the collection
In the completed workflow I used a ForEach activity to iterate the list of names and write them to the console.
This example uses C# in .NET 4.5 but the same technique can be used with Visual Basic.
You can download the sample code here

WF is a workflow engine. It's used to run factories and banks. The rule engine is just a small part of it. To make sense, any normal WF project requires a dedicated team of professionals to build it. Seems like an overkill for your specific purpose. You are very likely to bury yourself and your project in a typical fight between requirements and real skills of your team.
The use of any available .NET engine would be more justified in your situation. Keep in mind that building a custom rule engine is not an easy tasks, no matter how simple it seems from the beginning. Setting property values of a class (typically called a "fact" or "source" object) or executing actions (invoking class' methods) is what rules engines do best. And it seems that it's exactly what you need. Check out some available .NET engines. They are inexpensive, if not free, and reliable.

Generate and compile name to index translation/mapping for faster reusability

Suppose I get data from a service (that I can't control) as:
public class Data
{
// an array of column names
public string[] ColumnNames { get; set; }
// an array of rows that contain arrays of strings as column values
public string[][] Rows { get; get; }
}
and on the middle tier I would like to map/translate this to an IEnumerable<Entity> where column names in Data may be represented as properties in my Entity class. I said may because I may not need all the data returned by the service but just some of it.
Transformation
This is an abstraction of an algorithm that would do the translation:
create an IDictionary<string, int> of ColumnNames so I can easily map individual column names to array indices in individual rows.
use reflection to examine my Entity properties' names so I'm able to match them with column names
iterate through Data.Rows and create my Entity objects and populate properties according to mapping done in #1. Likely using reflection and SetValue on properties to set them.
Optimisation
Upper algorithm would of course work, but I think that because it uses reflection it should do some caching and possibly some on the fly compilation, that could speed things up considerably.
When steps 1 and 2 are done, we could actually generate a method that takes an array of strings and instantiates my entities using indices directly and compile it and cache it for future reuse.
I'm usually getting a page of results, so subsequent requests would reuse the same compiled method.
Additional fact
This is not imperative to the question (and answers) but I also created two attributes that help with column-to-property mapping when these don't match in names. I created the most obvious MapNameAttribute (that takes a string and optionally also enable case sensitivity) and IgnoreMappingAttribute for properties on my Entity that shouldn't map to any data. But these attributes are read in step 2 of the upper algorithm so property names are collected and renamed according to this declarative metadata so they match column names.
Question
What is the best and easiest way to generate and compile such a method? Lambda expressions? CSharpCodeProvider class?
Do you maybe have an example of generated and compiled code that does a similar thing? I guess that mappings are a rather common scenario.
Note: In the meantime I will be examining PetaPoco (and maybe also Massive) because afaik they both do compilation and caching on the fly exactly for mapping purposes.

Suggestion: obtain FastMember from NuGet
Then just use:
var accessor = TypeAccessor.Create(typeof(Entity));
Then just in your loop, when you have found the memberName and newValue for the current iteration:
accessor[obj, memberName] = newValue;
This is designed to do what you are asking; internally, it maintains a set of types if has seen before. When a new type is seen, it creates a new subclass of TypeAccessor on-the-fly (via TypeBuilder) and caches it. Each unique TypeAccessor is aware of the properties for that type, and basically just acts like a:
switch(memberName) {
case "Foo": obj.Foo = (int)newValue;
case "Bar": obj.Bar = (string)newValue;
// etc
}
Because this is cached, you only pay any cost (and not really a big cost) the first time it ever sees your type; the rest of the time, it is free. Because it uses ILGenerator directly, it also avoids any unnecessary abstraction, for example via Expression or CodeDom, so it is about as fast as it can be.
(I should also clarify that for dynamic types, i.e. types that implement IDynamicMetaObjectProvider, it can use a single instance to support every object).
Additional:
What you could do is: take the existing FastMember code, and edit it to process MapNameAttribute and IgnoreMappingAttribute during WriteGetter and WriteSetter; then all the voodoo happens on your data names, rather than the member names.
This would mean changing the lines:
il.Emit(OpCodes.Ldstr, prop.Name);
and
il.Emit(OpCodes.Ldstr, field.Name);
in both WriteGetter and WriteSetter, and doing a continue at the start of the foreach loops if it should be ignored.

LINQ-like or SQL-like DSL for end-users to run queries to select (not modify) data?

For a utility I'm working on, the client would like to be able to generate graphic reports on the data that has been collected. I can already generate a couple canned graphs (using ZedGraph, which is a very nice library); however, the utility would be much more flexible if the graphs were more programmable or configurable by the end-user.
TLDR version
I want users to be able to use something like SQL to safely extract and select data from a List of objects that I provide and can describe. What free tools or libraries will help me accomplish this?
Full version
I've given thought to using IronPython, IronRuby, and LuaInterface, but frankly they're all a bit overpowered for what I want to do. My classes are fairly simple, along the lines of:
class Person:
string Name;
int HeightInCm;
DateTime BirthDate;
Weight[] WeighIns;
class Weight:
int WeightInKg;
DateTime Date;
Person Owner;
(exact classes have been changed to protect the innocent).
To come up with the data for the graph, the user will choose whether it's a bar graph, scatter plot, etc., and then to actually obtain the data, I would like to obtain some kind of List from the user simply entering something SQL-ish along the lines of
SELECT Name, AVG(WeighIns) FROM People
SELECT WeightInKg, Owner.HeightInCm FROM Weights
And as a bonus, it would be nice if you could actually do operations as well:
SELECT WeightInKg, (Date - Owner.BirthDate) AS Age FROM Weights
The DSL doesn't have to be compliant SQL in any way; it doesn't even have to resemble SQL, but I can't think of a more efficient descriptive language for the task.
I'm fine filling in blanks; I don't expect a library to do everything for me. What I would expect to exist (but haven't been able to find in any way, shape, or form) is something like Fluent NHibernate (which I am already using in the project) where I can declare a mapping, something like
var personRequest = Request<Person>();
personRequest.Item("Name", (p => p.Name));
personRequest.Item("HeightInCm", (p => p.HeightInCm));
personRequest.Item("HeightInInches", (p => p.HeightInCm * CM_TO_INCHES));
// ...
var weightRequest = Request<Weight>();
weightRequest.Item("Owner", (w => w.Owner), personRequest); // Indicate a chain to personRequest
// ...
var people = Table<Person>("People", GetPeopleFromDatabase());
var weights = Table<Weight>("Weights", GetWeightsFromDatabase());
// ...
TryRunQuery(userInputQuery);
LINQ is so close to what I want to do, but AFAIK there's no way to sandbox it. I don't want to expose any unnecessary functionality to the end user; meaning I don't want the user to be able to send in and process:
from p in people select (p => { System.IO.File.Delete("C:\\something\\important"); return p.Name })
So does anyone know of any free .NET libraries that allow something like what I've described above? Or is there some way to sandbox LINQ? cs-script is close too, but it doesn't seem to offer sandboxing yet either. I'd be hesitant to expose the NHibernate interface either, as the user should have a read-only view of the data at this point in the usage.
I'm using C# 3.5, and pure .NET solutions would be preferred.
The bottom line is that I'm really trying to avoid writing my own parser for a subset of SQL that would only apply to this single project.

There is a way to sandbox LINQ or even C#: A sandboxed appdomain. I would recommend you look into accepting and compiling LINQ in a locked-down domain.
Regarding NHibernate, perhaps you can pass the objects into the domain without exposing NHibernate at all (I don't know how NHibernate works). If this is not possible, perhaps the connection to the database used within the sandbox can be logged in as a user who is granted only SELECT permissions.

Maybe the expressions will come handy for You.
You could provide simple entry places for:
a) what to select - user is expected to enter an expression only _ probably member and arithmetic expressions - those are subclasses of the expression class
b) how to filter the things = again only expressions are expected
c) ordering
d) joining?
Expressions don't let You do File.Delete because You operate only on precise domain objects (which probably don't have this functionality). The only thing You have to check is whether the parameters of the said expressions are of Your domain types. and Return types of said expressions are of domain types (or generic types in case of IEnumerable<> or IQuerable<>
this might prove helpful
I.E. expressions don't let You write multi-line statements.
Then You build your method chain in code
and voila.
There comes the data

I ended up using a little bit of a different approach. Instead of letting users pick arbitrary fields and make arbitrary graphs, I'm still presenting canned graphs, but I'm using Flee to let the user filter out exactly what data is used in the source of the graph. This works out nicely, because I ended up making a set of mappings from variable names to "accessors", and then using those mappings to inject variables into the user-entered filters. It ended up something like:
List<Mapping<Person>> mappings;
// ...
mappings.Add(new Mapping("Weight", p => p.Weight, "The person's weight (in pounds)"));
// ...
foreach (var m in mappings)
{
context.Variables[m.Name] = m.Accessor(p);
}
// ...
And you can even give an expression context an "owner" (think Ruby's instance_eval, where the context is executed with score of the specified object as this); then the user can even enter a filter like Weight > InputNum("The minimum weight to see"), and then they will be prompted thusly when the filter is executed, because I've defined a method InputNum in the owning class.
I feel like it was a good balance between effort involved and end result. I would recommend Flee to anyone who has a need to parse simple statements, especially if you need to extend those statements with your own variables and functions as well.

Common problem for me in C#, is my solution good, stupid, reasonable? (Advanced Beginner)

Ok, understand that I come from Cold Fusion so I tend to think of things in a CF sort of way, and C# and CF are as different as can be in general approach.
So the problem is: I want to pull a "table" (thats how I think of it) of data from a SQL database via LINQ and then I want to do some computations on it in memory. This "table" contains 6 or 7 values of a couple different types.
Right now, my solution is that I do the LINQ query using a Generic List of a custom Type. So my example is the RelevanceTable. I pull some data out that I want to do some evaluation of the data, which first start with .Contains. It appears that .Contains wants to act on the whole list or nothing. So I can use it if I have List<string>, but if I have List<ReferenceTableEntry> where ReferenceTableEntry is my custom type, I would need to override the IEquatable and tell the compiler what exactly "Equals" means.
While this doesn't seem unreasonable, it does seem like a long way to go for a simple problem so I have this sneaking suspicion that my approach is flawed from the get go.
If I want to use LINQ and .Contains, is overriding the Interface the only way? It seems like if there way just a way to say which field to operate on. Is there another collection type besides LIST that maybe has this ability. I have started using List a lot for this and while I have looked and looked, a see some other but not necessarily superior approaches.
I'm not looking for some fine point of performance or compactness or readability, just wondering if I am using a Phillips head screwdriver in a Hex screw. If my approach is a "decent" one, but not the best of course I'd like to know a better, but just knowing that its in the ballpark would give me little "Yeah! I'm not stupid!" and I would finish at least what I am doing completely before switch to another method.
Hope I explained that well enough. Thanks for you help.

What exactly is it you want to do with the table? It isn't clear. However, the standard LINQ (-to-Objects) methods will be available on any typed collection (including List<T>), allowing any range of Where, First, Any, All, etc.
So: what is you are trying to do? If you had the table, what value(s) do you want?
As a guess (based on the Contains stuff) - do you just want:
bool x= table.Any(x=>x.Foo == foo); // or someObj.Foo
?

There are overloads for some of the methods in the List class that takes a delegate (optionally in the form of a lambda expression), that you can use to specify what field to look for.
For example, to look for the item where the Id property is 42:
ReferenceTableEntry found = theList.Find(r => r.Id == 42);
The found variable will have a reference to the first item that matches, or null if no item matched.
There are also some LINQ extensions that takes a delegate or an expression. This will do the same as the Find method:
ReferenceTableEntry found = theList.FirstOrDefault(r => r.Id == 42);

Ok, so if I'm reading this correctly you want to use the contains method. When using this with collections of objects (such as ReferenceTableEntry) you need to be careful because what you're saying is you're checking to see if the collection contains an object that IS the same as the object you're comparing against.
If you use the .Find() or .FindAll() method you can specify the criteria that you want to match on using an anonymous method.
So for example if you want to find all ReferenceTableEntry records in your list that have an Id greater than 1 you could do something like this
List<ReferenceTableEntry> listToSearch = //populate list here
var matches = listToSearch.FindAll(x => x.Id > 1);
matches will be a list of ReferenceTableEntry records that have an ID greater than 1.
having said all that, it's not completely clear that this is what you're trying to do.

Here is the LINQ query involved that creates the object I am talking about, and the problem line is:
.Where (searchWord => queryTerms.Contains(searchWord.Word))
List<queryTerm> queryTerms = MakeQueryTermList();
public static List<RelevanceTableEntry> CreateRelevanceTable(List<queryTerm> queryTerms)
{
SearchDataContext myContext = new SearchDataContext();
var productRelevance = (from pwords in myContext.SearchWordOccuranceProducts
where (myContext.SearchUniqueWords
.Where (searchWord => queryTerms.Contains(searchWord.Word))
.Select (searchWord => searchWord.Id)).Contains(pwords.WordId)
orderby pwords.WordId
select new {pwords.WordId, pwords.Weight, pwords.Position, pwords.ProductId});
}
This query returns a list of WordId's that match the submitted search string (when it was List and it was just the word, that works fine, because as an answerer mentioned before, they were the same type of objects). My custom type here is queryTerms, a List that contains WordId, ProductId, Position, and Weight. From there I go about calculating the relevance by doing various operations on the created object. Sum "Weight" by product, use position matches to bump up Weights, etc. My point for keeping this separate was that the rules for doing those operations will change, but the basic factors involved will not. I would have even rather it be MORE separate (I'm still learning, I don't want to get fancy) but the rules for local and interpreted LINQ queries seems to trip me up when I do.
Since CF has supported queries of queries forever, that's how I tend to lean. Pull the data you need from the db, then do your operations (which includes queries with Aggregate functions) on the in-memory table.
I hope that makes it more clear.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.