Search based on a set of keywords - c#

I need to run a search based on a set of keywords that returns all the Ads related to those keywords. The result should be a list of Categories with the Ad count for each Category.
The search is made against a KeywordSearch table:
public class KeywordSearch
{
public int Id { get; set; }
public string Name { get; set; }
public Keyword Keyword { get; set; }
}
Where the Keyword Table is:
public class Keyword
{
public int Id { get; set; }
public string Name { get; set; }
}
The Ads are related with the Keywords using the following Table:
public class KeywordAdCategory
{
[Key]
[Column("Keyword_Id", Order = 0)]
public int Keyword_Id { get; set; }
[Key]
[Column("Ad_Id", Order = 1)]
public int Ad_Id { get; set; }
[Key]
[Column("Category_Id", Order = 2)]
public int Category_Id { get; set; }
}
Finally, the Category table:
public class Category
{
public int Id { get; set; }
public string Name { get; set; }
}
Example:
Keywords: "Mercedes-Benz" and "GLK"
KeywordSearch: "Mercedes" and "Benz" for the Keyword "Mercedes-Benz"
"GLK" for the Keyword "GLK"
Category: "Cars" and "Trucks"
Ads: Car - Mercedes-Benz GLK
Truck - Mercedes-Benz Citan
If I search "Mercedes-Benz" I get:
Cars: 1
Trucks: 1
If I search "Mercedes-Benz GLK" I get:
Cars: 1
If I search "Mercedes Citan" I get:
Trucks: 1
What I have so far:
var keywordIds = from k in keywordSearchQuery
where splitKeywords.Contains(k.Name)
select k.Keyword.Id;
var matchingKac = from kac in keywordAdCategoryQuery
where keywordIds.Distinct().Contains(kac.Keyword_Id)
select kac;
var addIDs = from kac in matchingKac
group kac by kac.Ad_Id into d
where d.Count() == splitKeywords.Count()
select d.Key;
var groupedKac = from kac in keywordAdCategoryQuery
where addIDs.Contains(kac.Ad_Id) // <--- EDIT2
group kac by new { kac.Category_Id, kac.Ad_Id };
var result = from grp in groupedKac
group grp by grp.Key.Category_Id into final
join c in categoryQuery on final.Key equals c.Id
select new CategoryGetAllBySearchDto
{
Id = final.Key,
Name = c.Name,
ListController = c.ListController,
ListAction = c.ListAction,
SearchCount = final.Count()
};
The problem is that I can't get only the Ads that match all Keywords.
EDIT:
When a keyword is made of 2 or more KeywordSearches, like "Mercedes-Benz", the line "where d.Count() == splitKeywords.Count()" fails, because d.Count() = 1 and splitKeywords.Count() = 2 for "Mercedes-Benz".
Any help?

This may not be a direct answer, but in such "multiple parameter search" situations I forget about anything clever and do the simple thing. For example, to search by Manufacturer, CategoryId, MillageMax and Price:
var searchResults = from c in carDb.Cars
where (Manufacturer == null || c.Manufacturer.Contains(Manufacturer)) &&
(CategoryId == null || c.CategoryId == CategoryId) &&
(MillageMax == null || c.Millage <= MillageMax) &&
(Price == null || c.Price <= Price)
select c;
If any of the parameters is null, the whole bracketed expression for it evaluates to true, so that line no longer takes part in the search. The null check comes first so that the comparison is never evaluated against a null parameter.
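The optional-filter pattern described above can be tried against an in-memory list. The following is only a sketch: the Car class, the CarSearch.Search method and its parameter names are made up for illustration, and nullable parameters stand in for "filter not supplied".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Car type for illustration; only the filtered fields are modelled.
public class Car
{
    public string Manufacturer { get; set; }
    public int CategoryId { get; set; }
    public int Millage { get; set; }
    public decimal Price { get; set; }
}

public static class CarSearch
{
    // Each null argument switches its filter off; non-null arguments are ANDed together.
    public static List<Car> Search(
        IEnumerable<Car> cars,
        string manufacturer = null,
        int? categoryId = null,
        int? millageMax = null,
        decimal? priceMax = null)
    {
        return cars
            .Where(c =>
                (manufacturer == null || c.Manufacturer.Contains(manufacturer)) &&
                (categoryId == null || c.CategoryId == categoryId) &&
                (millageMax == null || c.Millage <= millageMax) &&
                (priceMax == null || c.Price <= priceMax))
            .ToList();
    }
}
```

Calling CarSearch.Search(cars) with no filters returns every car, while CarSearch.Search(cars, manufacturer: "Mercedes", categoryId: 1) applies only those two conditions.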

If you try to build your own search engine, you will probably fail. Why not try Lucene?
Here's a link: http://lucenenet.apache.org/.
Cheers

I think I have a solution now. This is based on your previous question and a few assumptions:
Keywords are complete names like "Mercedes-Benz GLK", "Mercedes-Benz Citan".
KeywordSearchs are "Mercedes", "Benz" and "GLK" for "Mercedes-Benz GLK" and "Mercedes", "Benz" and "Citan" for "Mercedes-Benz Citan"
"Mercedes-Benz GLK" is a "Car", "Mercedes-Benz Citan" is a "Truck"
With those three assumptions in mind I can say that
var keywordIds = from k in keywordSearchQuery
where splitKeywords.Contains(k.Name)
select k.Keyword.Id;
is the culprit, and all the queries below it rely on it. This query finds every keyword that contains any of the words in your search string.
Example: the search string "Mercedes-Benz GLK" is split into "Mercedes", "Benz" and "GLK". Your query then finds "Mercedes" and "Benz" in both "Mercedes-Benz GLK" and "Mercedes-Benz Citan".
I think it's obvious that you don't want "Mercedes-Benz GLK" to match "Mercedes-Benz Citan".
The solution is to tell the query to match every entry in splitKeywords against some KeywordSearch and return the corresponding Keyword:
var keywordIds = keywordSearchQuery
.GroupBy(k => k.Keyword.Id)
.Where(g => splitKeywords.All(w =>
g.Any(k => k.Name.Contains(w))))
.Select(g => g.Key);
As for addIDs, changing it to var addIDs = matchingKac.Select(ad => ad.Ad_Id).Distinct(); should do the trick. Or, if matchingKac is only needed in addIDs, you could change it to
var matchingKac = (from kac in keywordAdCategoryQuery
where keywordIds.Distinct().Contains(kac.Keyword_Id)
select kac.Ad_Id).Distinct();
and remove addIds.
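To see the "every word must match" rule end to end, here is a small in-memory sketch under the three assumptions listed above. The classes are trimmed copies of the ones in the question, and KeywordMatch.CountAdsByCategory is a made-up name. One deliberate difference from the query above: words are compared with a case-insensitive Equals instead of Contains.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Keyword { public int Id { get; set; } public string Name { get; set; } }
public class KeywordSearch { public int Id { get; set; } public string Name { get; set; } public Keyword Keyword { get; set; } }
public class Category { public int Id { get; set; } public string Name { get; set; } }
public class KeywordAdCategory
{
    public int Keyword_Id { get; set; }
    public int Ad_Id { get; set; }
    public int Category_Id { get; set; }
}

public static class KeywordMatch
{
    // Returns category name -> number of distinct ads whose keyword matches every search word.
    public static Dictionary<string, int> CountAdsByCategory(
        string[] splitKeywords,
        List<KeywordSearch> keywordSearches,
        List<KeywordAdCategory> keywordAdCategories,
        List<Category> categories)
    {
        // Keep a keyword only if each search word equals one of its KeywordSearch parts.
        var keywordIds = keywordSearches
            .GroupBy(k => k.Keyword.Id)
            .Where(g => splitKeywords.All(w =>
                g.Any(k => k.Name.Equals(w, StringComparison.OrdinalIgnoreCase))))
            .Select(g => g.Key)
            .ToList();

        // Count distinct ads per category for the surviving keywords.
        return keywordAdCategories
            .Where(kac => keywordIds.Contains(kac.Keyword_Id))
            .GroupBy(kac => kac.Category_Id)
            .Join(categories, g => g.Key, c => c.Id,
                (g, c) => new { c.Name, Count = g.Select(kac => kac.Ad_Id).Distinct().Count() })
            .ToDictionary(x => x.Name, x => x.Count);
    }
}
```

With the example data from the question, searching for "Mercedes" and "Benz" should count one ad in each category, while adding "GLK" should leave only the car.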

I haven't compile-checked this or anything, so it may require some tweaking, but you're looking for something along these lines.
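var matchingKac = keywordIds.Distinct().ToList()
.Aggregate(
keywordAdCategoryQuery.AsQueryable(),
(q, id) => q.Where(kac => keywordAdCategoryQuery.Any(other =>
other.Ad_Id == kac.Ad_Id && other.Keyword_Id == id)));
You're effectively saying, "Start with keywordAdCategoryQuery, and for each keyword add a .Where() condition saying that the row's ad must also have that keyword." A plain kac.Keyword_Id == id condition per keyword would return nothing for two or more keywords, because each row carries a single Keyword_Id. You could do the same thing with a for loop if you find Aggregate difficult to read.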
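A loop version of the chained .Where() idea could look like the sketch below. It runs over in-memory lists rather than the question's IQueryable, and it assumes each added condition should mean "the ad this row belongs to is also tagged with that keyword". The AllKeywordsFilter class and its method name are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class KeywordAdCategory
{
    public int Keyword_Id { get; set; }
    public int Ad_Id { get; set; }
    public int Category_Id { get; set; }
}

public static class AllKeywordsFilter
{
    // Keeps only the rows whose ad is tagged with every one of the given keyword ids.
    public static List<KeywordAdCategory> RowsForAdsWithAllKeywords(
        IEnumerable<KeywordAdCategory> rows,
        IEnumerable<int> keywordIds)
    {
        var all = rows.ToList();
        IEnumerable<KeywordAdCategory> query = all;

        foreach (var id in keywordIds.Distinct())
        {
            var keywordId = id; // local copy, so each lambda captures its own value
            query = query.Where(kac =>
                all.Any(other => other.Ad_Id == kac.Ad_Id && other.Keyword_Id == keywordId));
        }

        return query.ToList();
    }
}
```

Each pass through the loop narrows the result further, so an empty list of keyword ids leaves the rows unfiltered.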

I suggest using a regex to strip the special characters first, and then using LINQ on the result.
So "Mercedes-Benz" becomes "Mercedes" and "Benz".
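A minimal sketch of that splitting step, assuming that any run of characters that are neither letters nor digits counts as a separator. The KeywordSplitter name is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordSplitter
{
    // Splits on any run of characters that are not letters or decimal digits
    // and drops the empty entries produced by leading or trailing separators.
    public static string[] Split(string input)
    {
        return Regex.Split(input ?? string.Empty, @"[^\p{L}\p{Nd}]+")
            .Where(part => part.Length > 0)
            .ToArray();
    }
}
```

The resulting array can be used directly as splitKeywords in the queries from the question.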

I recommend NOT attaching keywords to objects that way, because a search will either find too many objects or possibly nothing at all, and you will always waste time searching. Classify your objects so that the user's focus is on FINDING, not on searching.

I have posted my answer to: https://github.com/n074v41l4bl34u/StackOverflow19796132
Feel free to review it.
Here is the most important snippet.
with:
internal class SearchDomain
{
public List<Keyword> Keywords { get; set; }
public List<Category> Categories { get; set; }
public List<KeywordAdCategory> KeywordAdCategories { get; set; }
}
then:
private static char[] keywordPartsSplitter = new char[] { ' ', '-' };
internal static Dictionary<Category, Dictionary<int, List<KeywordAdCategory>>> FromStringInput(string searchPhrase, SearchDomain searchDomain)
{
var identifiedKeywords = searchPhrase
.Split(keywordPartsSplitter);
var knownKeywordParts = identifiedKeywords
.Where
(ik =>
searchDomain
.Keywords
.SelectMany(x => x.GetKeywordParts())
.Any(kp => kp.Equals(ik, StringComparison.InvariantCultureIgnoreCase))
);
var keywordkSearches = knownKeywordParts
.Select((kkp, n) => new KeywordSearch()
{
Id = n,
Name = kkp,
Keyword = searchDomain
.Keywords
.Single
(k =>
k.GetKeywordParts()
.Any(kp => kp.Equals(kkp, StringComparison.InvariantCultureIgnoreCase))
)
});
var relevantKeywords = keywordkSearches
.Select(ks => ks.Keyword)
.Distinct();
var keywordAdCategoriesByCategory = searchDomain.Categories
.GroupJoin
(
searchDomain.KeywordAdCategories,
c => c.Id,
kac => kac.Category_Id,
(c, kac) => new { Category = c, AdKeywordsForCategory = kac }
);
var relevantKeywordAdCategories = keywordAdCategoriesByCategory
.Where
(kacbk =>
relevantKeywords
.All
(rk =>
kacbk
.AdKeywordsForCategory
.Any(kac => kac.Keyword_Id == rk.Id)
)
);
var foundAdsInCategories = relevantKeywordAdCategories
.ToDictionary
(rkac =>
rkac.Category,
rkac => rkac.AdKeywordsForCategory
.GroupBy(g => g.Ad_Id)
.ToDictionary(x => x.Key, x => x.ToList())
);
return foundAdsInCategories;
}
It does exactly what you want; however, I find something fishy about keywords being divisible into sub-keywords. Then again, maybe it is just the naming.

Related

Sort items of list by field of another table LINQ ASP.NET MVC

So, I have my Products table in SSMS with these properties:
public class Product
{
public int Id {get; set;}
public string Title { get; set; }
public decimal Price { get; set; }
}
and my Reports table:
public class Report
{
public int Id { get; set; }
public int ProductId { get; set; }
public ReportType ReportType { get; set; }
}
I want to return a List<Product> to my View that is sorted based on how many reports each Product has, but I can't figure out how to do it with LINQ. Any help/tip would be appreciated.
If you add navigation properties, this would be:
context.Products.Include(p => p.Reports).OrderBy(p => p.Reports.Count());
But as you have no nav props, perhaps something like:
context.Products.OrderBy(p => context.Reports.Count(r => r.ProductId == p.Id));
The query ends up looking something like this for the latter:
SELECT *
FROM Products p
ORDER BY (SELECT COUNT(*) FROM Reports r WHERE r.ProductId = p.Id)
and similar, but with a left join, for the former.
You could also do it on the client side:
var dict = context.Reports.GroupBy(r => r.ProductId, (k, g) => new { ProductId = k, Count = g.Count() })
.ToDictionary(at => at.ProductId, at => at.Count);
Then:
//or OrderByDescending if you want the most reported products first
//products without any reports are missing from dict, so treat them as 0
var ret = context.Products.ToList().OrderBy(p => dict.TryGetValue(p.Id, out var count) ? count : 0);
If you have some limited list of products:
var prods = context.Products.Where(...).ToList();
var prodIds = prods.Select(p => p.Id).ToArray();
var dict = context.Reports
.Where(r => prodIds.Contains(r.ProductId))
.GroupBy(r => r.ProductId, (k, g) => new { ProductId = k, Count = g.Count() })
.ToDictionary(at => at.ProductId, at => at.Count);
//products without any reports are missing from dict, so treat them as 0
var ret = prods.OrderBy(p => dict.TryGetValue(p.Id, out var count) ? count : 0);
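For the client-side variant, a GroupJoin avoids the dictionary entirely and naturally gives products without reports a count of zero. This is a sketch over in-memory lists; the ProductSorting class is made up for illustration and the Report class omits ReportType.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int Id { get; set; }
    public string Title { get; set; }
    public decimal Price { get; set; }
}

public class Report
{
    public int Id { get; set; }
    public int ProductId { get; set; }
}

public static class ProductSorting
{
    // Pairs each product with the number of reports that reference it, then sorts by that count.
    public static List<Product> ByReportCount(
        IEnumerable<Product> products,
        IEnumerable<Report> reports,
        bool descending = false)
    {
        var counted = products.GroupJoin(
            reports,
            p => p.Id,
            r => r.ProductId,
            (p, rs) => new { Product = p, Count = rs.Count() });

        var ordered = descending
            ? counted.OrderByDescending(x => x.Count)
            : counted.OrderBy(x => x.Count);

        return ordered.Select(x => x.Product).ToList();
    }
}
```

Passing descending: true puts the most reported products first.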

C# LINQ. Searching for object by object name property or name part

I am trying to find all invoices to buyers, searching by buyer name (contains and equals filter). Looking for the cleanest way to do it.
I have a list of Buyers.
List <Buyer> AllBuyers;
And a Buyer is:
public class Buyer
{
public string BuyerIdentifier{ get; set; }
public string Name { get; set; }
}
I have a list of Invoices to buyers.
List <Invoice> AllInvoices;
And an Invoice is
public class Invoice
{
public string InvoiceID { get; set; }
public string BuyerID { get; set; }
public string Amount{ get; set; }
}
What I am doing currently:
List<string> BuyerIDs = new List<string> { };
foreach (Invoice inv in AllInvoices)
{
if (!(BuyerIDs.Contains(inv.BuyerID)))
{
// add BuyerID to list if it's not already there. Getting id's that are present on invoices and whose Buyer names match using contains or equals
BuyerIDs.Add(AllBuyers.First(b => b.BuyerIdentifier == inv.BuyerID
&& (b.Name.IndexOf(SearchValue, StringComparison.OrdinalIgnoreCase) >= 0)).BuyerIdentifier);
}
}
Invoices = AllInvoices.FindAll(i=> BuyerIDs.Contains(i.BuyerID));
LINQ query syntax is a little easier for me to understand than LINQ methods to join. So after replies below I am now doing this:
Invoices = (from buyer in AllBuyers
join invoice in AllInvoices on buyer.BuyerIdentifier equals invoice.BuyerID
where buyer.Name.IndexOf(SearchValue, StringComparison.OrdinalIgnoreCase) >= 0
select invoice).ToList();
If all you need are the invoices, you could join your two collections, filter, and select the invoices
AllBuyers.Join(AllInvoices,
a => a.BuyerIdentifier,
a => a.BuyerID,
(b, i) => new { Buyer = b, Invoice = i })
.Where(a => a.Buyer.Name.Contains("name"))
.Select(a => a.Invoice).ToList();
If you want the buyers as well, just leave out the .Select(a => a.Invoice).
The Contains method of a string will match an equals as well.
Here is a suggestion where I create a dictionary with BuyerIdentifier as keys and a List of Invoices as values:
var dict = AllBuyers.ToDictionary(k => k.BuyerIdentifier,
v => AllInvoices.Where(i => i.BuyerID == v.BuyerIdentifier).ToList());
Then you can access a list of Invoices for a specific buyer like so:
List<Invoice> buyerInvoices = dict[buyerId];
This should work for you:
var InvoiceGrouping = AllInvoices.GroupBy(invoice => invoice.BuyerID)
.Where(grouping => AllBuyers.Any(buyer => buyer.BuyerIdentifier == grouping.Key && buyer.Name.IndexOf(SearchValue, StringComparison.OrdinalIgnoreCase) >= 0));
What you end up with is a grouping which has a buyer's ID as the key and all their invoices as the value.
If you want just a flat list of invoices, you can do it like so:
var Invoices = AllInvoices.GroupBy(invoice => invoice.BuyerID)
.Where(grouping => AllBuyers.Any(buyer => buyer.BuyerIdentifier == grouping.Key && buyer.Name.IndexOf(SearchValue, StringComparison.OrdinalIgnoreCase) >= 0))
.SelectMany(grouping => grouping);
Note the added SelectMany at the end which, since IGrouping implements IEnumerable, flattens the groupings into a single enumeration of values.
As an ILookup fanboy, this would be my approach:
var buyerMap = AllBuyers
.Where(b => b.Name.IndexOf(SearchValue, StringComparison.OrdinalIgnoreCase) >= 0)
.ToDictionary(b => b.BuyerIdentifier);
var invoiceLookup = AllInvoices
.Where(i => buyerMap.ContainsKey(i.BuyerID))
.ToLookup(x => x.BuyerID);
foreach (var invoiceGroup in invoiceLookup)
{
var buyerId = invoiceGroup.Key;
var buyer = buyerMap[buyerId];
var invoicesForBuyer = invoiceGroup.ToList();
// Do your stuff with buyer and invoicesForBuyer
}

QueryOver multiple tables with NHibernate

I'm trying to left join multiple tables, project some of the columns that result from this join onto a new entity, and then take a few records from this result from my database. I've taken a look at a few similar questions here on SF, but I'm not managing to assemble all of those parts into a piece of code that works.
Here is the query I'm trying to generate with NHibernate:
select * from
( select LOC_Key.KeyName, LOC_Translation.TranslationString, LOC_Translation.Comments
from LOC_Key
left join LOC_Translation
on LOC_Key.ID = LOC_Translation.KeyID and LOC_Translation.LanguageID = 6
order by LOC_Key.KeyName
) as keyTable
limit 0,100
I have three entities here, Key, Translation and Language. A Key is a unique string identifier for different translations of a same word in different languages. I want to show the first n keys in alphabetical order for a language, but I want all keys listed, not only the ones that are translated (that's why I'm using a left join).
I took a look at QueryOver<>, the Select() method and the List<object[]>() method, but I can't even get code that compiles in the first place.
I could use C# linq after getting all records from the tables Key and Translation, having something like this:
IEnumerable<string> k = RepositoryKey.GetLimits( offset, size ).Select( x => x.KeyName );
IEnumerable<TranslationDescriptor> t = RepositoryTranslation.GetAllWhere( x => x.LanguageID.LanguageCode == language && k.Contains ( x.KeyID.KeyName ) ).ToList().ConvertAll( new Converter<Translation, TranslationDescriptor>( ( x ) => { return new TranslationDescriptor { LanguageCode = x.LanguageID.LanguageCode, KeyName = x.KeyID.KeyName, Comments = x.Comments, TranslationString = x.TranslationString }; } ) );
var q = from key in k
join trl in t on key equals trl.KeyName into temp
from tr in temp.DefaultIfEmpty()
select new TranslationDescriptor { KeyName = key, LanguageCode = language, Comments = ( tr == null ) ? string.Empty : tr.Comments, TranslationString = ( tr == null ) ? string.Empty : tr.TranslationString };
However, that's very slow. By the way, my implementation for GetLimits and GetAllWhere is:
public IEnumerable<T> GetAllWhere(Func<T, bool> func)
{
var products = Session.Query<T>().Where(func);
return products;
}
public IEnumerable<T> GetLimits(int offset, int size)
{
return Session.CreateCriteria(typeof(T)).SetFirstResult(offset).SetMaxResults(size).List<T>();
}
Thank you for your help!
Bruno
I'm guessing a little bit at your entities and mappings, but the following might help you get ideas. It joins Key to Translation with a left outer join, then projects the results to a new DTO object.
[Test]
public void LeftOuterProjection()
{
using (var s = OpenSession())
using (var t = s.BeginTransaction())
{
// Set up aliases to use in the queryover.
KeyDTO dtoAlias = null;
Key keyAlias = null;
Translation translationAlias = null;
var results = s.QueryOver<Key>(() => keyAlias)
.JoinAlias(k => k.Translations, () => translationAlias, JoinType.LeftOuterJoin)
.Where(() => translationAlias.LanguageId == 6)
.OrderBy(() => keyAlias.KeyName).Asc
.Select(Projections.Property(() => keyAlias.KeyName).WithAlias(() => dtoAlias.KeyName),
Projections.Property(() => translationAlias.TranslationString).WithAlias(() => dtoAlias.TranslationString),
Projections.Property(() => translationAlias.Comments).WithAlias(() => dtoAlias.Comments))
.TransformUsing(Transformers.AliasToBean<KeyDTO>())
.List<KeyDTO>();
}
}
public class KeyDTO
{
public string KeyName { get; set; }
public string TranslationString { get; set; }
public string Comments { get; set; }
}
public class Key
{
public int Id { get; set; }
public string KeyName { get; set; }
public IList<Translation> Translations { get; set; }
}
public class Translation
{
public Key Key { get; set; }
public int LanguageId { get; set; }
public string TranslationString { get; set; }
public string Comments { get; set; }
}
The modifications I made to my code following ngm's suggestion (thanks!):
Language l = RepositoryLanguage.GetSingleOrDefault(x => x.LanguageCode == language);
KeyTranslationDTO dtoAlias = null;
Key keyAlias = null;
Translation translationAlias = null;
var results = RepositoryKey.GetSession()
.QueryOver<Key>(() => keyAlias)
.OrderBy(() => keyAlias.KeyName).Asc
.JoinQueryOver<Translation>( k => k.Translations, () => translationAlias, JoinType.LeftOuterJoin, Restrictions.Where( () => translationAlias.LanguageID == l ) )
.Select(Projections.Property(() => keyAlias.KeyName).WithAlias(() => dtoAlias.KeyName),
Projections.Property(() => translationAlias.TranslationString).WithAlias(() => dtoAlias.TranslationString),
Projections.Property(() => translationAlias.Comments).WithAlias(() => dtoAlias.Comments))
.TransformUsing(Transformers.AliasToBean<KeyTranslationDTO>())
.Skip(index).Take(amount)
.List<KeyTranslationDTO>();

Select distinct records from list of keywords

I need to generate an autocomplete suggestions list of Keywords based on a search, where each keyword has a set of KeywordSearch:
Keyword class:
public class Keyword
{
public int Id { get; set; }
public string Name { get; set; }
}
public class KeywordSearch
{
// Primary properties
public int Id { get; set; }
public string Name { get; set; }
public Keyword Keyword { get; set; }
}
So if I have a keyword like "Company Name", I will have the KeywordSearch entries "Company" and "Name".
The function I have now, which is not working well, is:
public IList<KeywordDto> GetAllBySearch(string keywords, int numberOfRecords)
{
var splitKeywords = keywords.Split(new Char[] { ' ' });
var keywordQuery = _keywordRepository.Query.Where(p => p.IsActive == true);
var keywordSearchQuery = _keywordSearchRepository.Query;
var keywordIds = keywordSearchQuery
.GroupBy(k => k.Keyword.Id)
.Where(g => splitKeywords.All(w => g.Any(k => w.Contains(k.Name))))
.Select(g => g.Key);
IList<KeywordDto> keywordList = (from kw in keywordQuery
join kwids in keywordIds on kw.Id equals kwids
select new KeywordDto { Id = kw.Id, Name = kw.Name })
.Take(numberOfRecords)
.Distinct()
.OrderBy(p => p.Name).ToList();
return keywordList;
}
I need to build a KeywordList based on the keywords string, so if keywords = "Compa" I return "Company Name" with the part "Compa" in bold, or if keywords = "Compa Nam" I return "Company Name" with "Compa Nam" in bold, etc.
What is happening now is that it's not able to find the part "Compa" in the KeywordSearch.
Any Suggestion?
Thanks
If I'm not mistaken w.Contains(k.Name) is the key part.
w is "Compa", and k.Name is one of your KeywordSearch entries, "Company" or "Name". So you're asking whether "Compa" contains "Company" or "Name", which is false.
k.Name.Contains(w) (or k.Name.StartsWith(w, StringComparison.CurrentCultureIgnoreCase) if you don't want it to be case sensitive) should return the correct result.
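The corrected direction of the test can be checked with a small in-memory sketch. It assumes a dictionary from keyword name to its KeywordSearch parts in place of the repositories from the question, and the Autocomplete.Suggest name is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Autocomplete
{
    // keywordParts maps a keyword name to its KeywordSearch parts,
    // e.g. "Company Name" -> { "Company", "Name" }.
    // A keyword is suggested when every typed word is a prefix of one of its parts.
    // Note: an empty input has no words to fail on, so it suggests every keyword.
    public static List<string> Suggest(string input, IDictionary<string, string[]> keywordParts)
    {
        var words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        return keywordParts
            .Where(kv => words.All(w =>
                kv.Value.Any(part => part.StartsWith(w, StringComparison.CurrentCultureIgnoreCase))))
            .Select(kv => kv.Key)
            .OrderBy(name => name)
            .ToList();
    }
}
```

The typed word is always the argument of StartsWith and the stored part is the receiver, which is the opposite of the w.Contains(k.Name) test in the question.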

Linq distinct record containing keywords

I need to return a distinct list of records based on a car keywords search like: "Alfa 147"
The problem is that, as I have 3 "Alfa" cars, it returns 1 + 3 records (it seems 1 for the Alfa and 147 result, and 3 for the Alfa result)
EDIT:
The SQL Server query looks something like this:
SELECT DISTINCT c.Id, c.Name /*, COUNT(Number of Ads in the KeywordAdCategories table with those 2 keywords) */
FROM Categories AS c
INNER JOIN KeywordAdCategories AS kac ON kac.Category_Id = c.Id
INNER JOIN KeywordAdCategories AS kac1 ON kac.Ad_Id = kac1.Ad_Id AND kac1.Keyword_Id = (SELECT Id FROM Keywords WHERE Name = 'ALFA')
INNER JOIN KeywordAdCategories AS kac2 ON kac1.Ad_Id = kac2.Ad_Id AND kac2.Keyword_Id = (SELECT Id FROM Keywords WHERE Name = '147')
My LINQ query is:
var query = from k in keywordQuery where splitKeywords.Contains(k.Name)
join kac in keywordAdCategoryQuery on k.Id equals kac.Keyword_Id
join c in categoryQuery on kac.Category_Id equals c.Id
join a in adQuery on kac.Ad_Id equals a.Id
select new CategoryListByKeywordsDetailDto
{
Id = c.Id,
Name = c.Name,
SearchCount = keywordAdCategoryQuery.Where(s => s.Category_Id == c.Id).Where(s => s.Keyword_Id == k.Id).Distinct().Count(),
ListController = c.ListController,
ListAction = c.ListAction
};
var searchResults = new CategoryListByBeywordsListDto();
searchResults.CategoryListByKeywordsDetails = query.Distinct().ToList();
The entities are:
public class Keyword
{
// Primary properties
public int Id { get; set; }
public string Name { get; set; }
}
// Keyword Sample Data:
// 1356 ALFA
// 1357 ROMEO
// 1358 145
// 1373 147
public class Category
{
// Primary properties
public int Id { get; set; }
public string Name { get; set; }
}
// Category Sample Data
// 1 NULL 1 Carros
// 2 NULL 1 Motos
// 3 NULL 2 Oficinas
// 4 NULL 2 Stands
// 5 NULL 1 Comerciais
// 8 NULL 1 Barcos
// 9 NULL 1 Máquinas
// 10 NULL 1 Caravanas e Autocaravanas
// 11 NULL 1 Peças e Acessórios
// 12 1 1 Citadino
// 13 1 1 Utilitário
// 14 1 1 Monovolume
public class KeywordAdCategory
{
[Key]
[Column("Keyword_Id", Order = 0)]
public int Keyword_Id { get; set; }
[Key]
[Column("Ad_Id", Order = 1)]
public int Ad_Id { get; set; }
[Key]
[Column("Category_Id", Order = 2)]
public int Category_Id { get; set; }
}
// KeywordAdCategory Sample Data
// 1356 1017 1
// 1356 1018 1
// 1356 1019 1
// 1357 1017 1
// 1357 1018 1
// 1357 1019 1
// 1358 1017 1
// 1373 1019 1
public class Ad
{
// Primary properties
public int Id { get; set; }
public string Title { get; set; }
public string TitleStandard { get; set; }
public string Version { get; set; }
public int Year { get; set; }
public decimal Price { get; set; }
// Navigation properties
public Member Member { get; set; }
public Category Category { get; set; }
public IList<Feature> Features { get; set; }
public IList<Picture> Pictures { get; set; }
public IList<Operation> Operations { get; set; }
}
public class AdCar : Ad
{
public int Kms { get; set; }
public Make Make { get; set; }
public Model Model { get; set; }
public Fuel Fuel { get; set; }
public Color Color { get; set; }
}
// AdCar Sample Data
// 1017 Alfa Romeo 145 1.6TDI 2013 ALFA ROMEO 145 1.6TDI 2013 12 2 1.6TDI 1000 1 2013 1 20000,0000 2052 AdCar
// 1018 Alfa Romeo 146 1.6TDI 2013 ALFA ROMEO 146 1.6TDI 2013 12 2 5 1.6TDI 1000 2 2013 1 20000,0000 2052 AdCar
// 1019 Alfa Romeo 147 1.6TDI 2013 ALFA ROMEO 147 1.6TDI 2013 12 2 6 1.6TDI 1000 3 2013 1 20000,0000 2052 AdCar
The result I expect for the search of "ALFA" is "Cars: 3" and for the search of "ALFA 147" is "Cars: 1" and actually the result I get is "Cars: 1 \n Cars: 3"
The kac join is not filtering on any keyword, so joining kac, kac1 and kac2 returns 3 rows, because that is the number of keywords for this ad.
You should remove it.
Try this:
SELECT DISTINCT
c.Id, c.Name /*, COUNT(Number of Ads in the KeywordAdCategories table with those 2 keywords) */
FROM
Categories AS c
INNER JOIN
KeywordAdCategories AS kac1 ON kac1.Keyword_Id = (SELECT Id
FROM Keywords
WHERE Name = 'ALFA')
AND kac1.Category_Id = c.Id
INNER JOIN
KeywordAdCategories AS kac2 ON kac1.Ad_Id = kac2.Ad_Id
AND kac2.Keyword_Id = (SELECT Id
FROM Keywords
WHERE Name = '147')
AND kac2.Category_Id = c.Id
I did a test...
Setting up the environment as:
declare @Keywords table(id int,name varchar(max))
insert into @Keywords(id,name)
values (1356,'ALFA')
,(1357,'ROMEO')
,(1358,'145')
,(1373,'147')
declare @Categories table(id int, name varchar(max))
insert into @Categories(id,name)
values (1,'Carros')
,(2,'Motos')
declare @KeywordAdCategories table(Keyword_Id int, ad_Id int,Category_Id int)
insert into @KeywordAdCategories (Keyword_Id , ad_Id,Category_Id)
values (1356, 1017,1)
,(1356, 1018,1)
,(1356, 1019,1)
,(1357, 1017,1)
,(1357, 1018,1)
,(1357, 1019,1)
,(1358, 1017,1)
,(1373, 1019,1)
I run these two queries:
--query 1
SELECT
c.Id, c.Name,COUNT(*) as [count]
FROM
@Categories AS c
INNER JOIN
@KeywordAdCategories AS kac1 ON kac1.Keyword_Id = (SELECT Id
FROM @Keywords
WHERE Name = 'ALFA')
AND kac1.Category_Id = c.Id
GROUP BY
c.Id, c.Name
I get this result set:
Id Name count
----------- ---------- -----------
1 Carros 3
and the second query for two words...
--query 2
SELECT
c.Id, c.Name,COUNT(*) as [count]
FROM
@Categories AS c
INNER JOIN
@KeywordAdCategories AS kac1 ON kac1.Keyword_Id = (SELECT Id
FROM @Keywords
WHERE Name = 'ALFA')
AND kac1.Category_Id = c.Id
INNER JOIN
@KeywordAdCategories AS kac2 ON kac1.Ad_Id = kac2.Ad_Id
AND kac2.Keyword_Id = (SELECT Id
FROM @Keywords
WHERE Name = '147')
AND kac2.Category_Id = c.Id
GROUP BY
c.Id, c.Name
Result set is:
Id Name count
----------- ---------- -----------
1 Carros 1
Is this what you want?
You can use the Distinct() method.
var query = ...
query = query.Distinct();
See "This code returns distinct values. However, what I want is to return a strongly typed collection as opposed to an anonymous type" for more details.
Split the query string into an array and iterate through querying the database for each keyword and joining the result sets using unions. The resultant set will be every distinct record that matches any of the given keywords.
Maybe this is close? At least the subqueries open it up a little for you to work with.
var query =
from c in categoryQuery
let keywords =
(
from k in keywordQuery where splitKeywords.Contains(k.Name)
join kac in keywordAdCategoryQuery on k.Id equals kac.Keyword_Id
where kac.Category_Id == c.Id
join a in adQuery on kac.Ad_Id equals a.Id
select k.Id
).Distinct()
where keywords.Any()
select new CategoryListByKeywordsDetailDto
{
Id = c.Id,
Name = c.Name,
SearchCount =
(
from kac in keywordAdCategoryQuery
where kac.Category_Id == c.Id
join kId in keywords on kac.Keyword_Id equals kId
select kac.Ad_Id
).Distinct().Count(),
ListController = c.ListController,
ListAction = c.ListAction
};
One of the beautiful features of linq is that you can build up complicated queries in smaller and simpler steps and let linq figure out how to join them all together.
The following is one way to get this information. I'm not sure whether this is the best and you would need to check it performs well when multiple keywords are selected.
Assuming keywords is defined something like
var keywords = "Alfa 147";
var splitKeywords = keywords.Split(new char[] {' '});
Stage 1
Get a list of keywords grouped by Ad and Category
var subQuery = (from kac in keywordAdCategoryQuery
join k in keywordQuery on kac.Keyword_Id equals k.Id
select new
{
kac.Ad_Id,
kac.Category_Id,
KeyWord = k.Name,
});
var grouped = (from r in subQuery
group r by new { r.Ad_Id, r.Category_Id} into results
select new
{
results.Key.Ad_Id ,
results.Key.Category_Id ,
keywords = (from r in results select r.KeyWord)
});
Note, the classes you posted would suggest that your database does not have foreign key relationships defined between the tables. If they did then this stage would be slightly simpler to write.
Stage 2
Filter out any groups that do not have each of the keywords
foreach(var keyword in splitKeywords)
{
var copyOfKeyword = keyword ; // Take a copy of keyword to avoid closing over the loop variable
grouped = (from r in grouped where r.keywords.Contains(copyOfKeyword) select r) ;
}
Stage 3
Group by Category and count the results per category
var groupedByCategories = (from r in grouped
group r by r.Category_Id into results
join c in categoryQuery on results.Key equals c.Id
select new
{
c.Id ,
c.Name ,
Count = results.Count()
});
Stage 4
Now retrieve the information from sql. This should be done all in one query.
var finalResults = groupedByCategories.ToList();
So, if I understand the need correctly, you want all of the subset of words to be matched in the text and not the OR matching you are getting right now? I see at least two options, the first of which may not translate the split to SQL:
var query = from k in keywordQuery where !splitKeywords.Except(k.Name.Split(' ')).Any() select k;
This makes the following assumptions:
Your words in the Keywords are space delimited.
You are looking for exact matches and not partial matches. (I.e. Test will not match TestTest).
The other option being to dynamically generate a predicate using predicate builder (haven't done this in a while, my implementation might need tweaking - but this is the more likely (and better in my mind) solution):
var predicate = PredicateBuilder.True<Keyword>();
foreach (string s in splitKeywords) {
string temp = s; // copy to avoid closing over the loop variable
predicate = predicate.And(k => k.Name.Contains(temp));
}
var query = keywordQuery.Where(predicate);
If someone can comment if some of my syntax is off I would appreciate it. EDIT: Including link to a good reference on predicate builder: http://www.albahari.com/nutshell/predicatebuilder.aspx
UPDATE
Predicate builder across multiple tables, if anyone gets here looking for how to do that.
Can PredicateBuilder generate predicates that span multiple tables?
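For readers without the PredicateBuilder source at hand, the core of the idea fits in a few lines. The sketch below is not the library itself: PredicateSketch, its And extension and NameContainsAll are made-up names, and the combination through Expression.Invoke follows the technique described in the linked article. It is exercised here against compiled, in-memory predicates only; some query providers may not translate invoked expressions without extra help.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Keyword
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class PredicateSketch
{
    // Combines two predicates with a logical AND by invoking the second one
    // with the parameters of the first.
    public static Expression<Func<T, bool>> And<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        var invokedRight = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, invokedRight), left.Parameters);
    }

    // Builds "Name contains word1 AND Name contains word2 AND ..." starting from "true".
    public static Expression<Func<Keyword, bool>> NameContainsAll(IEnumerable<string> words)
    {
        Expression<Func<Keyword, bool>> predicate = k => true;
        foreach (var word in words)
        {
            var w = word; // local copy, so each lambda captures its own value
            predicate = predicate.And(k => k.Name.Contains(w));
        }
        return predicate;
    }
}
```

The resulting expression can be passed to Queryable.Where, or compiled with Compile() and used on an in-memory list.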
Should be possible to query for each keyword then union the result sets. The duplicate values will be removed from the union and you can work out the required aggregations.
Try selecting an anonymous type instead of the class:
var query = (from k in keywordQuery where splitKeywords.Contains(k.Name)
join kac in keywordAdCategoryQuery on k.Id equals kac.Keyword_Id
join c in categoryQuery on kac.Category_Id equals c.Id
join a in adQuery on kac.Ad_Id equals a.Id
select new
{
Id = c.Id,
Name = c.Name,
SearchCount = keywordAdCategoryQuery.Where(s => s.Category_Id == c.Id).Where(s => s.Keyword_Id == k.Id).Distinct().Count(),
ListController = c.ListController,
ListAction = c.ListAction
}).Distinct().ToList();
var searchResults = new CategoryListByBeywordsListDto();
searchResults.CategoryListByKeywordsDetails = (from q in query select new CategoryListByKeywordsDetailDto
{
Id = q.Id,
Name = q.Name,
SearchCount = q.SearchCount,
ListController = q.ListController,
ListAction = q.ListAction
}).ToList();
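You are doing a select distinct on a list of CategoryListByKeywordsDetailDto. On in-memory sequences, Distinct compares items with the default equality comparer: anonymous types compare by value, but a class that does not override Equals and GetHashCode compares by reference, so two DTOs with identical values still count as different. In your case you need to implement an IEqualityComparer<CategoryListByKeywordsDetailDto> for the select distinct to work.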
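A comparer for this case might look like the sketch below. CategoryDto is a cut-down stand-in for CategoryListByKeywordsDetailDto with only three of its properties, and the comparer name is made up for illustration. It applies to in-memory sequences; a database query provider will generally not be able to translate a custom comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the DTO in the question.
public class CategoryDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int SearchCount { get; set; }
}

public sealed class CategoryDtoComparer : IEqualityComparer<CategoryDto>
{
    // Two DTOs are equal when all compared properties are equal.
    public bool Equals(CategoryDto x, CategoryDto y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id && x.Name == y.Name && x.SearchCount == y.SearchCount;
    }

    // Must agree with Equals: equal objects produce equal hash codes.
    public int GetHashCode(CategoryDto obj)
    {
        if (obj == null) return 0;
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + obj.Id;
            hash = hash * 31 + (obj.Name == null ? 0 : obj.Name.GetHashCode());
            hash = hash * 31 + obj.SearchCount;
            return hash;
        }
    }
}
```

Usage would be results.Distinct(new CategoryDtoComparer()) on a materialized list.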
I tried this using LINQ directly against in-memory collections (as in, not through SQL), and it seems to work for me. I think the main point is that you want to search for Ads that apply to ALL the keywords specified, not ANY, correct?
Anyway, some sample code below (a little comment-ish and not necessarily the most efficient, but hopefully illustrates the point...)
Working with the following "data sets":
private List<AdCar> AdCars = new List<AdCar>();
private List<KeywordAdCategory> KeywordAdCategories = new List<KeywordAdCategory>();
private List<Category> Categories = new List<Category>();
private List<Keyword> Keywords = new List<Keyword>();
which are populated in a test method using the data you provided...
Search method looks a little like this:
var splitKeywords = keywords.Split(' ');
var validKeywords = Keywords.Join(splitKeywords, kwd => kwd.Name.ToLower(), spl => spl.ToLower(), (kwd, spl) => kwd.Id).ToList();
var groupedAdIds = KeywordAdCategories
.GroupBy(kac => kac.Ad_Id)
.Where(grp => validKeywords.Except(grp.Select(kac => kac.Keyword_Id)).Any() == false)
.Select(grp => grp.Key)
.ToList();
var foundKacs = KeywordAdCategories
.Where(kac => groupedAdIds.Contains(kac.Ad_Id))
.GroupBy(kbc => kbc.Category_Id, kac => kac.Ad_Id);
//Results count by category
var catCounts = Categories
.Join(foundKacs, cat => cat.Id, kacGrp => kacGrp.Key, (cat, kacGrp) => new { CategoryName = cat.Name, AdCount = kacGrp.Distinct().Count() })
.ToList();
//Actual results set
var ads = AdCars.Join(groupedAdIds, ad => ad.Id, grpAdId => grpAdId, (ad, grpAdId) => ad);
As I said, this is more to illustrate the idea. Please don't look too closely at the use of Join, GroupBy, etc. (I'm not sure it's exactly "optimal").
So, using the above, if I search for "Alfa", I get 3 Ad results, and if I search for "Alfa 147" I get just 1 result.
EDIT: I've changed the code to represent two possible outcomes (as I wasn't sure which was needed by your question)
ads will give you the actual Ads returned by the search
catCounts will give a list of anonymous types each representing the find results as a count of Ads by category
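For anyone who wants to try the "ALL keywords, not ANY" idea without the full model, here is a compact sketch of the same Except-based check. The rows and keyword ids are hypothetical sample data, not the data from the question, and returning no ads when no word is recognised is a choice made for this sketch (the code above would match every ad in that case).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllKeywordsDemo
{
    // Hypothetical sample rows: (Keyword_Id, Ad_Id, Category_Id).
    static readonly (int KeywordId, int AdId, int CategoryId)[] Rows =
    {
        (1, 100, 10), // "alfa" -> ad 100
        (1, 101, 10), // "alfa" -> ad 101
        (1, 102, 11), // "alfa" -> ad 102
        (2, 102, 11)  // "147"  -> ad 102
    };

    // Hypothetical keyword table, looked up case-insensitively.
    static readonly Dictionary<string, int> KeywordIds =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase) { ["alfa"] = 1, ["147"] = 2 };

    // Returns the ids of ads linked to every recognised keyword in the search text.
    public static List<int> MatchingAdIds(string search)
    {
        var wanted = search.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => KeywordIds.ContainsKey(word))
            .Select(word => KeywordIds[word])
            .Distinct()
            .ToList();

        // Assumption of this sketch: no recognised keyword means no results.
        if (wanted.Count == 0) return new List<int>();

        return Rows
            .GroupBy(r => r.AdId)
            // Keep the ad only if no wanted keyword is missing from its group.
            .Where(g => !wanted.Except(g.Select(r => r.KeywordId)).Any())
            .Select(g => g.Key)
            .OrderBy(id => id)
            .ToList();
    }

    // Counts distinct matching ads per category id.
    public static Dictionary<int, int> CountsByCategory(string search)
    {
        var adIds = new HashSet<int>(MatchingAdIds(search));
        return Rows
            .Where(r => adIds.Contains(r.AdId))
            .GroupBy(r => r.CategoryId)
            .ToDictionary(g => g.Key, g => g.Select(r => r.AdId).Distinct().Count());
    }
}
```

With this sample data, MatchingAdIds("Alfa") returns three ads and MatchingAdIds("Alfa 147") returns only ad 102, which mirrors the behaviour described above.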
Does this help?
Hi, if I understand your problem correctly:
"The problem is that, as I have 3 "Alfa" cars, it returns 1 + 3
records (it seems 1 for the Alfa and 147 result, and 3 for the Alfa
result)"
and LINQ isn't strictly required, then I may have what you need. Just test it in a new project:
public Linqfilter()
{
// Note: I modified a few of your classes because I don't have your Member, Operation, Make, ... classes
#region declaration
var originalAdCarList = new List<AdCar>()
{
new AdCar(){Id=1017, Title= "Alfa Romeo 145 1.6TDI 2013", Category= new Category(){Id =12}} ,
new AdCar(){Id=1018, Title= "Alfa Romeo 146 1.6TDI 2013", Category= new Category(){Id =11}} ,
new AdCar(){Id=1019, Title= "Alfa Romeo 147 1.6TDI 2013", Category= new Category(){Id =12}}
};
var originalKeywordAdCategoryList = new List<KeywordAdCategory>()
{
new KeywordAdCategory() { Keyword_Id=1356, Ad_Id=1017,Category_Id=1},
new KeywordAdCategory() { Keyword_Id=1356, Ad_Id=1018,Category_Id=1},
new KeywordAdCategory() { Keyword_Id=1356, Ad_Id=1019,Category_Id=1},
new KeywordAdCategory() { Keyword_Id=1357, Ad_Id=1017,Category_Id=1},
new KeywordAdCategory() { Keyword_Id=1357, Ad_Id=1018,Category_Id=1},
new KeywordAdCategory() { Keyword_Id=1357, Ad_Id=1019,Category_Id=1},
new KeywordAdCategory() { Keyword_Id=1358, Ad_Id=1017,Category_Id=1},
new KeywordAdCategory() { Keyword_Id=1373, Ad_Id=1019,Category_Id=1}
};
var originalCategoryList = new List<Category>()
{
new Category(){Id=1, Name="NULL 1 Carros"},
new Category(){Id=2, Name="NULL 1 Motos"},
new Category(){Id=3, Name="NULL 2 Oficinas"},
new Category(){Id=4 , Name="NULL 2 Stands"},
new Category(){Id=5 , Name="NULL 1 Comerciais"},
new Category(){Id=8, Name="NULL 1 Barcos"},
new Category(){Id=9 , Name="NULL 1 Máquinas"},
new Category(){Id=10 , Name="NULL 1 Caravanas e Autocaravanas"},
new Category(){Id=11 , Name="NULL 1 Peças e Acessórios"},
new Category(){Id=12 , Name="1 1 Citadino"},
new Category(){Id=13 , Name="1 1 Utilitário"},
new Category(){Id=14 , Name="1 1 Monovolume"}
};
var originalKeywordList = new List<Keyword>()
{
new Keyword(){Id=1356 ,Name="ALFA"},
new Keyword(){Id=1357 ,Name="ROMEO"},
new Keyword(){Id=1358 ,Name="145"},
new Keyword(){Id=1373 ,Name="147"}
};
#endregion declaration
string searchText = "ALFA";
// split the string searchText in an Array of substrings
var splitSearch = searchText.Split(' ');
var searchKeyList =new List<Keyword>();
// generate a list of Keyword based on splitSearch
foreach (string part in splitSearch)
if(originalKeywordList.Any(key => key.Name == part))
searchKeyList.Add(originalKeywordList.First(key => key.Name == part));
// generate a list of KeywordAdCategory based on searchKeyList
var searchKACList = new List<KeywordAdCategory>();
foreach(Keyword key in searchKeyList)
foreach (KeywordAdCategory kAC in originalKeywordAdCategoryList.Where(kac => kac.Keyword_Id == key.Id))
searchKACList.Add(kAC);
var groupedsearchKAClist = from kac in searchKACList group kac by kac.Keyword_Id;
var listFiltered = new List<AdCar>(originalAdCarList);
//here starts the real search part
foreach (IGrouping<int, KeywordAdCategory> kacGroup in groupedsearchKAClist)
{
var listSingleFiltered = new List<AdCar>();
// generate a list of AdCar that matched the current KeywordAdCategory filter
foreach (KeywordAdCategory kac in kacGroup)
foreach (AdCar aCar in originalAdCarList.Where(car => car.Id == kac.Ad_Id))
listSingleFiltered.Add(aCar);
var tempList = new List<AdCar>(listFiltered);
// iterate over a temporary copy of listFiltered and remove items that are not in the current listSingleFiltered
foreach (AdCar aC in tempList)
if (!listSingleFiltered.Any(car => car.Id == aC.Id))
listFiltered.Remove(aC);
}
var AdCarCount = listFiltered.Count; // number of AdCar items that match every keyword
var CatDic =new Dictionary<Category, int>(); // will contain the count for each Category that has at least one match
foreach(AdCar aCar in listFiltered)
if(originalCategoryList.Any(cat => cat.Id ==aCar.Category.Id))
{
var selectedCat = originalCategoryList.First(cat => cat.Id == aCar.Category.Id);
if (!CatDic.ContainsKey(selectedCat))
{
CatDic.Add(selectedCat, 1);//new Category Countvalue
}
else
{
CatDic[selectedCat]++; //Category Countvalue +1
}
}
}
}
public class Keyword
{
// Primary properties
public int Id { get; set; }
public string Name { get; set; }
}
public class Category
{
// Primary properties
public int Id { get; set; }
public string Name { get; set; }
}
public class KeywordAdCategory
{
//[Key]
//[Column("Keyword_Id", Order = 0)]
public int Keyword_Id { get; set; }
//[Key]
//[Column("Ad_Id", Order = 1)]
public int Ad_Id { get; set; }
//[Key]
//[Column("Category_Id", Order = 2)]
public int Category_Id { get; set; }
}
public class Ad
{
// Primary properties
public int Id { get; set; }
public string Title { get; set; }
public string TitleStandard { get; set; }
public string Version { get; set; }
public int Year { get; set; }
public decimal Price { get; set; }
// Navigation properties
public string Member { get; set; }
public Category Category { get; set; }
public IList<string> Features { get; set; }
public IList<int> Pictures { get; set; }
public IList<string> Operations { get; set; }
}
public class AdCar : Ad
{
public int Kms { get; set; }
public string Make { get; set; }
public int Model { get; set; }
public int Fuel { get; set; }
public int Color { get; set; }
}
Hopefully it will help you or someone else.
Edit:
I extended my method Linqfilter() to answer the request.
Edit2:
I think this should be exactly what you are looking for:
var selectedKWLinq = from kw in originalKeywordList
where splitSearch.Contains(kw.Name)
select kw;
var selectedKACLinq = from kac in originalKeywordAdCategoryList
where selectedKWLinq.Any<Keyword>(item => item.Id == kac.Keyword_Id)
group kac by kac.Keyword_Id into selectedKAC
select selectedKAC;
var selectedAdCar = from adC in originalAdCarList
where (from skAC in selectedKACLinq
where skAC.Any(kac => kac.Ad_Id == adC.Id)
select skAC).Count() == selectedKACLinq.Count()
select adC;
var selectedCategorys = from cat in originalCategoryList
join item in selectedAdCar
on cat.Id equals item.Category.Id
group cat by cat.Id into g
select g;
//result part
var AdCarCount = selectedAdCar.Count();
List<IGrouping<int, Category>> list = selectedCategorys.ToList();
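// Note: indexing like this assumes at least two categories matched; check list.Count first in real code.
var firstCategoryCount = list[0].Count();
var secondCategoryCount = list[1].Count();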
Phew, this was a brain-wrecker. I split the query into several pieces, but it is executed as a whole at the end (var result). I also return an anonymous type instead of your DTO, but the intention should be clear.
Here is the solution:
var keywordIds = from k in keywordQuery
where splitKeywords.Contains(k.Name)
select k.Id;
var matchingKac = from kac in keywordAdCategoryQuery
where keywordIds.Contains(kac.Keyword_Id)
select kac;
var addIDs = from kac in matchingKac
group kac by kac.Ad_Id into d
where d.Count() == splitKeywords.Length
select d.Key;
var groupedKac = from kac in keywordAdCategoryQuery
where addIDs.Contains(kac.Ad_Id)
group kac by new { kac.Category_Id, kac.Ad_Id };
var result = from grp in groupedKac
group grp by grp.Key.Category_Id into final
join c in categoryQuery on final.Key equals c.Id
select new
{
Id = final.Key,
Name = c.Name,
SearchCount = final.Count()
};
// here goes result.ToList() or similar
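One caveat worth checking, sketched here under assumptions: the comparison d.Count() == splitKeywords.Length presumes that every search word maps to its own keyword. If several search terms map to the same keyword (as "Mercedes" and "Benz" both do for "Mercedes-Benz" in the question's KeywordSearch table), comparing against the number of distinct matched keyword ids may be safer. The in-memory example below uses hypothetical data and names and is not a drop-in replacement for the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeywordCountDemo
{
    // Hypothetical KeywordSearch rows: search term -> Keyword id.
    static readonly (string Term, int KeywordId)[] SearchTerms =
    {
        ("Mercedes", 1), ("Benz", 1), ("GLK", 2)
    };

    // Hypothetical KeywordAdCategory rows: (Keyword_Id, Ad_Id, Category_Id).
    static readonly (int KeywordId, int AdId, int CategoryId)[] Kac =
    {
        (1, 100, 10), (2, 100, 10), // car ad: Mercedes-Benz + GLK
        (1, 200, 20)                // truck ad: Mercedes-Benz
    };

    // Returns category id -> number of distinct ads matching all recognised keywords.
    public static Dictionary<int, int> CountByCategory(string search)
    {
        var words = search.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Distinct keyword ids: "Mercedes" and "Benz" collapse into a single id.
        var keywordIds = SearchTerms
            .Where(t => words.Contains(t.Term, StringComparer.OrdinalIgnoreCase))
            .Select(t => t.KeywordId)
            .Distinct()
            .ToList();

        // An ad qualifies when it carries as many distinct matched keywords as were found.
        var adIds = Kac
            .Where(k => keywordIds.Contains(k.KeywordId))
            .GroupBy(k => k.AdId)
            .Where(g => g.Select(k => k.KeywordId).Distinct().Count() == keywordIds.Count)
            .Select(g => g.Key)
            .ToList();

        return Kac
            .Where(k => adIds.Contains(k.AdId))
            .GroupBy(k => k.CategoryId)
            .ToDictionary(g => g.Key, g => g.Select(k => k.AdId).Distinct().Count());
    }
}
```

Here CountByCategory("Mercedes Benz") yields one ad in each of the two categories even though the search has two words, and CountByCategory("Mercedes Benz GLK") yields only the car category.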
