LINQ to SQL join when there aren't results - c#

Given the following database structure
[Image: diagram of the database tables, http://dl.dropbox.com/u/26791/tables.png]
I'm trying to write a LINQ query that will return images grouped by the tags they are associated with. So far I've got this:
var images = from img in db.Images
join imgTags in db.ImageTags on img.idImage equals imgTags.idImage
join t in db.Tags on imgTags.idTag equals t.idTag
where img.OCRData.Contains(searchText.Text)
group img by new { t.TagName } into aGroup
select new
{
GroupName = aGroup.Key.TagName,
Items = from x in aGroup
select new ImageFragment()
{
ImageID = x.idImage,
ScanDate = x.ScanTime
}
};
Which works great. However, I also want to return Images that do not have any tags associated with them in a group of "(Untagged)" or something. I can't wrap my head around how I would do this without inserting a default tag for every image and that seems like generally not a very good solution.

If you want image records when there are no corresponding tag records, you need to perform an outer join on the image tags table.

It's a little tricky, but you can do it in one big query if you have the ability to instantiate new ImageTag and Tag instances for linq to work with. Essentially, when you're doing an outer join, you have to use the into keyword with the DefaultIfEmpty(...) method to deal with the "outer join gaps" (e.g., when the right side of the joined key is null in a typical SQL left outer join).
var images = from img in db.Images
join imgTags in db.ImageTags on img.idImage equals imgTags.idImage
into outerImageRef
from outerIR in outerImageRef.DefaultIfEmpty(new ImageTag() { idImage = img.idImage, idTag = -1 })
join t in db.Tags on outerIR.idTag equals t.idTag
into outerRefTags
from outerRT in outerRefTags.DefaultIfEmpty(new Tag(){ idTag=-1, TagName ="untagged"})
group img by outerRT.TagName into aGroup
select new {
GroupName = aGroup.Key,
Items = from x in aGroup
select new ImageFragment() {
ImageID = x.idImage,
ScanDate = x.ScanTime
}
};
Hopefully the above compiles. I don't have your exact environment, so I built my solution using my own data types and then converted it to match your question's description. The key parts are the extra into and DefaultIfEmpty lines, which add the extra "rows" to the big in-memory joined table, if you think about it in the traditional SQL sense.
However, there's a more readable solution that doesn't require the in memory instantiation of linq entities (you'll have to convert this one yourself to your environment):
//this first query will return a collection of anonymous types with TagName and ImageId,
// essentially a relation from joining your ImageTags x-ref table and Tags so that
// each row is the tag and image id (as Robert Harvey mentioned in his comment to your Q)
var tagNamesWithImageIds = from tag in Tags
join refer in ImageTags on tag.IdTag equals refer.IdTag
select new {
TagName = tag.Name,
ImageId = refer.IdImage
};
//Now we can get your solution by outer joining the images to the above relation
// and filling in the "outer join gaps" with the anonymous type again of "untagged"
// and then joining that with the Images table one last time to get your grouping and projection.
var images = from img in Images
join t in tagNamesWithImageIds on img.IdImage equals t.ImageId
into outerJoin
from o in outerJoin.DefaultIfEmpty(new { TagName = "untagged", ImageId = img.IdImage })
join img2 in Images on o.ImageId equals img2.IdImage
group img2 by o.TagName into aGroup
select new {
TagName = aGroup.Key,
Images = aGroup.Select(i => i.Data).ToList() //you'll definitely need to replace this with your code's logic. I just had a much simpler data type in my workspace.
};
Hope that makes sense.
Of course, you can always just set your application to tag everything by default w/ "untagged" or do some much simpler LINQ queries to create a list of image id's that are not present in your ImageTag table, and then union or something.

Here's what I ended up doing. I haven't actually checked what kind of SQL this generates yet; I'm guessing it's probably not exactly pretty. I think I'd be better off doing a couple of queries and aggregating the results myself, but in any case this works:
var images = from img in db.Images
join imgTags in db.ImageTags on img.idImage equals imgTags.idImage into g
from imgTags in g.DefaultIfEmpty()
join t in db.Tags on imgTags.idTag equals t.idTag into g1
from t in g1.DefaultIfEmpty()
where img.OCRData.Contains(searchText.Text)
group img by t == null ? "(No Tags)" : t.TagName into aGroup
select new
{
GroupName = aGroup.Key,
Items = from x in aGroup
select new ImageFragment()
{
ImageID = x.idImage,
ScanDate = x.ScanTime
}
};
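The same left-outer-join shape can be tried in memory with LINQ to Objects. What follows is only a sketch: the sample arrays and their contents are made up, the property names mirror the question, and it is written as top-level statements (C# 9 or later). One difference from LINQ to SQL is assumed here: in memory, the gap row really is null, so the second join needs an explicit null check on the key.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for db.Images, db.ImageTags and db.Tags.
var images = new[]
{
    new { idImage = 1, ScanTime = new DateTime(2010, 1, 1) },
    new { idImage = 2, ScanTime = new DateTime(2010, 1, 2) },
    new { idImage = 3, ScanTime = new DateTime(2010, 1, 3) },
};
var imageTags = new[]
{
    new { idImage = 1, idTag = 10 },
    new { idImage = 2, idTag = 10 },
    new { idImage = 2, idTag = 20 },
};
var tags = new[]
{
    new { idTag = 10, TagName = "Invoices" },
    new { idTag = 20, TagName = "Receipts" },
};

var grouped =
    (from img in images
     join it in imageTags on img.idImage equals it.idImage into g
     from it in g.DefaultIfEmpty()
     // 'it' is null for untagged images, so map it to a key that matches no tag.
     join t in tags on (it == null ? -1 : it.idTag) equals t.idTag into g1
     from t in g1.DefaultIfEmpty()
     group img.idImage by (t == null ? "(No Tags)" : t.TagName) into aGroup
     select new { GroupName = aGroup.Key, ImageIds = aGroup.ToList() })
    .ToList();

foreach (var grp in grouped)
    Console.WriteLine($"{grp.GroupName}: {string.Join(", ", grp.ImageIds)}");
```

Image 3 has no row in the cross-reference data, so it ends up alone in the "(No Tags)" group, while image 2 appears under both of its tags.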

Related

Left join with where clause in linq

I am trying to do a left join with a where clause in linq.
I have a leadsQuery table with 2500 rows. I want to join the LeadCons table to it. A lead can have multiple entries in the LeadCons table, so I want to join only when the Status matches. Otherwise I want the fields to be NULL.
var data = from lead in leadsQuery
join lcs in context.LeadCons on lead.ID equals lcs.LeadId into leadsWithCons
from lcs in leadsWithCons.DefaultIfEmpty()
where lead.Status == lcs.Status
select new
{
LeadId = lead.ID,
Source = lead.Source.ToString(),
};
This query gives me ~1500 rows and leadsQuery has 2500. What am I doing wrong here?
A late answer, hoping it is still helpful:
First, you aren't selecting any values from LeadCons, so what is the purpose of a join?
I shall assume maybe you want to extend your select, so let us say you want to select the property foo, so my next question: Why do you need a left join in your case? You can simply do a select:
var data = from lead in leadsQuery
select new
{
Foo = context.LeadCons.Where(lcs => lead.Status == lcs.Status).SingleOrDefault().foo,
LeadId = lead.ID,
Source = lead.Source.ToString(),
};
This way you have the same number of items and for each item the desired foo value.
Have you tried changing your join to a join with multiple conditions, and then removing the where 'status equal status' clause?
from lead in leadsQuery
join lcs in context.LeadCons on new {
p1 = lead.ID,
p2 = lead.Status
}
equals
new {
p1 = lcs.LeadId,
p2 = lcs.Status
}
You can have a look at this nice article:
https://smehrozalam.wordpress.com/2010/04/13/linq-how-to-write-queries-with-complex-join-conditions/
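To see why moving the status comparison into the join keeps all the leads, here is a small in-memory sketch. The sample data and the Note property are invented for illustration, and it runs against LINQ to Objects rather than a database. The where clause in the question is applied after DefaultIfEmpty has produced the NULL rows, so it filters those rows out again; a composite join key avoids that.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for leadsQuery and context.LeadCons.
var leads = new[]
{
    new { ID = 1, Status = "Open" },
    new { ID = 2, Status = "Closed" },
    new { ID = 3, Status = "Open" },
};
var leadCons = new[]
{
    new { LeadId = 1, Status = "Open", Note = "first call" },
    new { LeadId = 2, Status = "Open", Note = "stale entry" }, // status differs from lead 2
};

// Matching on both columns inside the join keeps every lead;
// leads without a matching LeadCons row get Note = null.
var rows =
    (from lead in leads
     join lc in leadCons
         on new { Id = lead.ID, lead.Status }
         equals new { Id = lc.LeadId, lc.Status } into matches
     from lc in matches.DefaultIfEmpty()
     select new { LeadId = lead.ID, Note = lc == null ? null : lc.Note })
    .ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.LeadId}: {row.Note ?? "(null)"}");
```

Both anonymous key objects must use the same property names and types (here Id and Status), otherwise the join does not compile.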

"Invalid Where condition" when I try to search for a value that is accented ignoring accents in CRM 2011 with LINQ

I'm trying to search for a contact. For example, the value "Café" is stored in the name field, but when I search for "cafe" no record is returned.
I tried the following:
using (ServiceContext svcContext = new ServiceContext(_serviceProxy))
{
var query_where3 = from c in svcContext.ContactSet
join a in svcContext.AccountSet
on c.ContactId equals a.PrimaryContactId.Id
where c.FullName.Normalize(NormalizationForm.FormD).Contains("Café")
select new
{
account_name = a.Name,
contact_name = c.LastName
};
}
This throws an exception with the message "Invalid 'where' condition. An entity member is invoking an invalid property or method".
You can't use those functions in LINQ to CRM. The correct way to write the condition is:
c.FullName == "someString" or c.FullName.Equals("someString").
This is because you can't use functions or transformations on the left-hand side of the condition. You must use the attribute itself.
Your query will look like:
using (ServiceContext svcContext = new ServiceContext(_serviceProxy))
{
var query_where3 = from c in svcContext.ContactSet
join a in svcContext.AccountSet
on c.ContactId equals a.PrimaryContactId.Id
where c.FullName == "Café" || c.FullName == "Cafe"
select new
{
account_name = a.Name,
contact_name = c.LastName
};
}
You can't really deal with accents in LINQ to SQL in general, and you are even more limited in what you can do with LINQ to CRM. You can't modify the database unless you don't care about staying supported. In that case you could do what MAD suggested and alter the column's collation:
ALTER TABLE Name ALTER COLUMN Name [varchar](100) COLLATE SQL_Latin1_General_CP1_CI_AI
I personally would not recommend that.
The best that I can come up with is getting the data as close as you can and filtering it from there inside a list or something similar.
I have to do it all the time and it is a pain (and adds more overhead) but there is not really another workaround that I have found.
//declare a dictionary
Dictionary<string, string> someDictionary = new Dictionary<string, string>();
using (ServiceContext svcContext = new ServiceContext(_serviceProxy))
{
var query_where3 = from c in svcContext.ContactSet
join a in svcContext.AccountSet
on c.ContactId equals a.PrimaryContactId.Id
where c.FullName.Contains("Caf")
select new
{
account_name = a.Name,
contact_name = c.LastName
};
//then, while the query variable is still in scope, copy the rows into the dictionary
foreach (var q in query_where3)
{
if (!string.IsNullOrEmpty(q.account_name) && !string.IsNullOrEmpty(q.contact_name))
{
someDictionary.Add(q.account_name, q.contact_name);
}
}
}
//then you can apply .Normalize(NormalizationForm.FormD) to the dictionary entries
Hope that helped.
It's all about
.Normalize(NormalizationForm.FormD)
The LINQ provider probably does not know how to handle this method. Remove it and test with just
c.FullName.Contains("Café")
------------------------------------------------- Added in 2015-01-30 --------------------------------------------------
The only solution I can think of is to materialize the list before you apply the where condition. That way you can use Normalize, because the filter is then handled by LINQ to Objects. Try:
(from c in svcContext.ContactSet
join a in svcContext.AccountSet
on c.ContactId equals a.PrimaryContactId.Id
select new { a = a, c = c }).ToList()
.Where(x => x.c.FullName.Normalize(NormalizationForm.FormD).Contains("Café"))
.Select(x => new {
account_name = x.a.Name,
contact_name = x.c.LastName
});
But this can cause some overhead, given that LINQ to Objects runs in the application server's memory, not on the database server.
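For the in-memory filtering step, normalizing to FormD only splits an accented letter into a base letter plus a combining mark; the marks still have to be removed before "Café" and "cafe" compare equal. The sketch below shows one common way to do that with standard .NET calls. RemoveDiacritics is a hypothetical helper name and the sample names are invented; it assumes the contact names have already been fetched from CRM into a list.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper: decompose the string, then drop the combining marks.
static string RemoveDiacritics(string text)
{
    var decomposed = text.Normalize(NormalizationForm.FormD);
    var kept = decomposed.Where(ch =>
        CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark);
    return new string(kept.ToArray()).Normalize(NormalizationForm.FormC);
}

// Invented names, as they might look after the server-side query has been materialized.
var names = new[] { "Café Central", "Cafe Roma", "Bakery" };

// Accent-insensitive and case-insensitive "contains" check, done in memory.
var hits = names
    .Where(n => RemoveDiacritics(n).IndexOf("cafe", StringComparison.OrdinalIgnoreCase) >= 0)
    .ToList();

foreach (var hit in hits)
    Console.WriteLine(hit);
```

The server-side query can still narrow the result with something like Contains("Caf"), so that only a small set of rows is filtered this way.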
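CRM's LINQ translator cannot handle the .Equals() method.
on c.ContactId equals a.PrimaryContactId.Id
Change the above line to the line below.
on c.ContactId == a.PrimaryContactId.Id
Note, however, that the C# join clause only accepts the equals keyword, so the second form will not compile in query syntax as written.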

Which Linq join will work for my scenario, confused

I have a web api project. In database I have two tables of comments and pictures. I want to use join to merge these two tables in such a way that every picture should have all the comments related to it. Both tables have picture id. Which join should I use? I need to use linq. Can someone tell me the linq query I should use?
I have tried cross join, in this way
var combo = from p in db.picturedetails
from c in db.comments
select new CommentAndPictureDetails
{
IdUser = p.iduser,
IdPictures = p.idpictures,
Likes = p.likes,
NudityLevel = p.nuditylevel,
PicTitle = p.picTitle,
PicTime = p.pictime,
FakesLevel = p.fakeslevel,
Comment1 = c.comment1,
CTime = c.ctime,
IdComments = c.idcomments,
SpamLevel = c.spamlevel,
TargetPictureId = c.targetpictureid
};
But I am getting all the pictures combined with all the comments, so the JSON is very big. Which join should I use?
What you're looking for is a group join:
var query = from p in db.picturedetails
join c in db.comments
on p.PictureId equals c.PictureId into comments
select new
{
ID = p.PictureId,
Comments = comments,
//...
};
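As an in-memory illustration of the group join, the sketch below uses invented sample data with the column names from the question (idpictures, targetpictureid, comment1). It is not tied to any particular database provider; it only shows the shape of the result, one element per picture with a nested list of comments.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for db.picturedetails and db.comments.
var pictures = new[]
{
    new { idpictures = 1, picTitle = "Sunset" },
    new { idpictures = 2, picTitle = "Harbour" },
};
var comments = new[]
{
    new { idcomments = 100, targetpictureid = 1, comment1 = "Nice colours" },
    new { idcomments = 101, targetpictureid = 1, comment1 = "Where is this?" },
};

// 'into pictureComments' turns the join into a group join:
// each picture is paired with the (possibly empty) sequence of its comments.
var result =
    (from p in pictures
     join c in comments on p.idpictures equals c.targetpictureid into pictureComments
     select new
     {
         p.idpictures,
         p.picTitle,
         Comments = pictureComments.Select(pc => pc.comment1).ToList()
     })
    .ToList();

foreach (var item in result)
    Console.WriteLine($"{item.picTitle}: {item.Comments.Count} comment(s)");
```

A picture without comments is still returned, with an empty Comments list, which keeps the serialized JSON proportional to the real data instead of the cross product.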
My understanding of LINQ is limited at best, but I will try to answer what I assume you are asking.
As far as I understand your question, you want the following to work:
Table1:
PictureID
PictureName
Table2:
PictureID
Comments
And the result to have 1 picture with multiple comments. Is this a correct assumption?
If so, I do not believe that is possible with a single query, as the query will return a picture for every comment. The best way would be to find the one picture object, then find the multiple comment objects separately, and return them in some fashion that makes them appear as one object.

Left join in Linq?

There are a lot of questions on SO already about Left joins in Linq, and the ones I've looked at all use the join keyword to achieve the desired end.
This does not make sense to me. Let's say I have the tables Customer and Invoice, linked by a foreign key CustomerID on Invoice. Now I want to run a report containing customer info, plus any invoices. SQL would be:
select c.*, i.*
from Customer c
left join Invoice i on c.ID = i.CustomerID
From what I've seen of the answers on SO, people are mostly suggesting:
var q = from c in Customers
join i in Invoices.DefaultIfEmpty() on c.ID equals i.CustomerID
select new { c, i };
I really don't understand how this can be the only way. The relationship between Customer and Invoice is already defined by the LinqToSQL classes; why should I have to repeat it for the join clause? If I wanted an inner join it would simply have been:
var q = from c in Customers
from i in c.Invoices
select new { c, i };
without specifying the joined fields!
I tried:
var q = from c in Customers
from i in c.Invoices.DefaultIfEmpty()
select new { c, i };
but that just gave me the same result as if it were an inner join.
Is there not a better way of doing this?
While the relationship is already defined (both in the database and in the .dbml markup) the runtime cannot automatically determine if it should use that relationship.
What if there are two relationships in the object model (Person has Parents and Children, both relationships to other Person instances). While cases could be special cased, this would make the system more complex (so more bugs). Remember in SQL you would repeat the specification of the relationship.
Remember indexes and keys are an implementation detail and not part of the relational algebra that underlies the relation model.
If you want a LEFT OUTER JOIN then you need to use "into":
from c in Customers
join i in Invoices on c.CustomerId equals i.CustomerId into inv
...
and inv will have type IEnumerable<Invoice>, possibly with no instances.
What are you talking about? That from i in c.Invoices.DefaultIfEmpty() is exactly a left join.
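A complete version of that pattern can be sketched in memory as follows. The customers and invoices here are invented sample data, and the query runs against LINQ to Objects; the point is only to show how into plus DefaultIfEmpty flattens the group back into one row per customer and invoice, with a null invoice where there is none.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the Customers and Invoices tables.
var customers = new[]
{
    new { CustomerId = 1, Name = "Acme" },
    new { CustomerId = 2, Name = "Globex" },
};
var invoices = new[]
{
    new { InvoiceId = 500, CustomerId = 1, Total = 120.00m },
    new { InvoiceId = 501, CustomerId = 1, Total = 80.50m },
};

var report =
    (from c in customers
     // The outer range variable (c) goes on the left of 'equals', the inner one (i) on the right.
     join i in invoices on c.CustomerId equals i.CustomerId into inv
     from i in inv.DefaultIfEmpty()   // yields a single null when inv is empty
     select new { c.Name, InvoiceId = i == null ? (int?)null : i.InvoiceId })
    .ToList();

foreach (var row in report)
    Console.WriteLine($"{row.Name}: {(row.InvoiceId.HasValue ? row.InvoiceId.ToString() : "no invoice")}");
```

Globex has no invoices but still appears once in the result, which is the behaviour of a SQL LEFT JOIN.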
List<string> strings = new List<string>() { "Foo", "" };
var q = from s in strings
from c in s.DefaultIfEmpty()
select new { s, c };
foreach (var x in q)
{
Console.WriteLine("ValueOfStringIs|{0}| ValueOfCharIs|{1}|",
x.s,
(int)x.c);
}
This test produces:
ValueOfStringIs|Foo| ValueOfCharIs|70|
ValueOfStringIs|Foo| ValueOfCharIs|111|
ValueOfStringIs|Foo| ValueOfCharIs|111|
ValueOfStringIs|| ValueOfCharIs|0|
You probably want to use the 'into' keyword.

Can take be used in a query expression in c# linq instead of using .Take(x)?

I'm trying to write some LINQ To SQL code that would generate SQL like
SELECT t.Name, g.Name
FROM Theme t
INNER JOIN (
SELECT TOP 5 * FROM [Group] ORDER BY TotalMembers
) as g ON t.K = g.ThemeK
So far I have
var q = from t in dc.Themes
join g in dc.Groups on t.K equals g.ThemeK into groups
select new {
t.Name, Groups = (from z in groups orderby z.TotalMembers select z.Name )
};
but I need to do a top/take on the ordered groups subquery. According to http://blogs.msdn.com/vbteam/archive/2008/01/08/converting-sql-to-linq-part-7-union-top-subqueries-bill-horst.aspx in VB I could just add TAKE 5 on the end, but I can't get this syntax to work in c#. How do you use the take syntax in c#?
edit: PS adding .Take(5) at the end causes it to run loads of individual queries
edit 2: I made a slight mistake with the intent of the SQL above, but the question still stands. The problem is that if you use extension methods in the query like .Take(5), LinqToSql runs lots of SQL queries instead of a single query.
Second answer, now I've reread the original question.
Are you sure the SQL you've shown is actually correct? It won't give the top 5 groups within each theme - it'll match each theme just against the top 5 groups overall.
In short, I suspect you'll get your original SQL if you use:
var q = from t in dc.Themes
join g in dc.Groups.OrderBy(z => z.TotalMembers).Take(5)
on t.K equals g.ThemeK into groups
select new { t.Name, Groups = groups };
But I don't think that's what you actually want...
Just bracket your query expression and call Take on it:
var q = from t in dc.Themes
join g in dc.Groups on t.K equals g.ThemeK into groups
select new { t.Name, Groups =
(from z in groups orderby z.TotalMembers select z.Name).Take(5) };
In fact, the query expression isn't really making things any simpler for you - you might as well call OrderBy directly:
var q = from t in dc.Themes
join g in dc.Groups on t.K equals g.ThemeK into groups
select new { t.Name, Groups = groups.OrderBy(z => z.TotalMembers).Take(5) };
Here's a faithful translation of the original query. This should not generate repeated roundtrips.
var subquery =
dc.Groups
.OrderBy(g => g.TotalMembers)
.Take(5);
var query =
dc.Themes
.Join(subquery, t => t.K, g => g.ThemeK, (t, g) => new
{
ThemeName = t.Name, GroupName = g.Name
}
);
The roundtrips in the question are caused by the group join (join into). Groups in LINQ have a hierarchical shape. Groups in SQL have a row/column shape (grouped keys + aggregates). In order for LinqToSql to fill its hierarchy from row/column results, it must query the child nodes separately using the group's keys. It only does this if the children are used outside of an aggregate.
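The difference between "top N groups overall" and "top N groups per theme" discussed in the answers above can be seen in a small in-memory sketch. The sample data is invented and N is reduced to 2 and 1 to keep it short. It runs against LINQ to Objects, so it says nothing about the SQL or the number of roundtrips LinqToSql would produce; it only shows the two result shapes.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for dc.Themes and dc.Groups.
var themes = new[]
{
    new { K = 1, Name = "Music" },
    new { K = 2, Name = "Film" },
};
var groups = new[]
{
    new { Name = "Jazz",   ThemeK = 1, TotalMembers = 5 },
    new { Name = "Rock",   ThemeK = 1, TotalMembers = 50 },
    new { Name = "Noir",   ThemeK = 2, TotalMembers = 70 },
    new { Name = "Horror", ThemeK = 2, TotalMembers = 90 },
};

// Flat join against the top 2 groups overall (ordered like the original ORDER BY).
var topOverall = groups.OrderBy(x => x.TotalMembers).Take(2);
var flat = themes
    .Join(topOverall, t => t.K, x => x.ThemeK,
          (t, x) => new { ThemeName = t.Name, GroupName = x.Name })
    .ToList();

// Group join with the Take inside the projection: top 1 group within each theme.
var perTheme =
    (from t in themes
     join g in groups on t.K equals g.ThemeK into themeGroups
     select new
     {
         t.Name,
         Groups = themeGroups.OrderBy(x => x.TotalMembers).Take(1).Select(x => x.Name).ToList()
     })
    .ToList();

Console.WriteLine($"flat rows: {flat.Count}, themes with nested groups: {perTheme.Count}");
```

With this data the flat join returns rows only for "Music", because both of the top two groups overall belong to it, while the per-theme query returns one nested group for every theme.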
