Equal query in Elasticsearch by Nest client - c#

public class User
{
public string Email { get; set; }
}
client.Index(new User { Email ="test#test.te" });
For example, this LINQ query in C#:
rep.Where(user=>user.Email=="test#test.te");
works correctly.
I use the same query in NEST:
client.Search<Post>(q => q
.Query(qu => qu
.Term(te=>te.OnField("email").Value("test#test.te"))));
The document result count is zero!
But:
client.Search<Post>(q => q
.Query(qu => qu
.Term(te=>te.OnField("email").Value("test"))));
The document result count is 1. Why?
How can I make an equality query in Elasticsearch?

This is because of analyzers. Your document is broken into terms at index time, so Elasticsearch stores something like an array of strings ["test", "test", "te"] alongside your document. The exact decomposition depends on which analyzer is configured (I guess it's the standard one). A term query, on the other hand, is not analyzed. That's why the first request returns nothing: there is no string "test#test.te" among the indexed terms ["test", "test", "te"]. There is a "test" string, though, so the second request gets a hit.
In your case you should use a query_string query, but be aware that such queries are analyzed too. If you index two documents, say {"Email":"test#test.te"} and {"Email":"test#gmail.com"}, then without any flags the query {"query":{"query_string":{"default_field":"Email","query":"test#test.te"}}} will return both documents, because both contain the term "test" in the index. To avoid this, use something like {"query":{"query_string":{"default_field":"Email","query":"test#test.te","default_operator":"AND"}}}; the default operator is "OR".
In NEST, use a query like this:
client.Search<Post>(q => q
.Query(qu => qu
.QueryString(qs=>qs
.OnField(x=>x.Email).Query("test#test.te").Operator(Operator.and)
)
)
);
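To make the difference between analyzed and non-analyzed matching concrete, here is a toy model in plain C#. It is only a sketch and not how Elasticsearch is implemented: ToyIndex, Analyze, TermMatches and AnalyzedMatches are made-up names, and the analyzer is assumed to lowercase the text and split on every non-alphanumeric character, which reproduces the ["test", "test", "te"] terms mentioned in the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of an inverted index lookup; NOT Elasticsearch's real analyzer.
public static class ToyIndex
{
    // Assumed analysis: lowercase, then split on anything that is not a letter or digit.
    public static string[] Analyze(string text)
    {
        var cleaned = new string(text.ToLowerInvariant()
            .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
            .ToArray());
        return cleaned.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // A term query compares the raw, unanalyzed input against the stored terms.
    public static bool TermMatches(string storedValue, string term)
    {
        return Analyze(storedValue).Contains(term);
    }

    // A query_string-style match analyzes the input first,
    // then combines the resulting terms with OR (any) or AND (all).
    public static bool AnalyzedMatches(string storedValue, string query, bool requireAll)
    {
        var stored = new HashSet<string>(Analyze(storedValue));
        var wanted = Analyze(query);
        return requireAll
            ? wanted.All(t => stored.Contains(t))
            : wanted.Any(t => stored.Contains(t));
    }
}
```

In this model TermMatches("test#test.te", "test#test.te") is false while TermMatches("test#test.te", "test") is true, which mirrors the two results in the question. AnalyzedMatches("test#gmail.com", "test#test.te", false) is true (the OR case), and passing true for requireAll makes it false.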

Related

EF Like operator usage

I tried to use DbFunctions.Like with EF 6.2 and got run-time error:
LINQ to Entities does not recognize the method 'Boolean
Like(System.String, System.String)' method, and this method cannot be
translated into a store expression.
Code:
list = list.Where(p => DbFunctions.Like(p.Master_Bill, "somestring%"));
where list is an IQueryable<SomeView>.
It compiles OK. I thought it could be used with EF 6.2. I know there is also EF Core, but I have not looked at it.
Any ideas?
Thanks
What about
list = list.Where( p => p.Master_Bill.StartsWith(someString));
?
If the user can enter wildcards in their search string, you can evaluate the string for legal combinations, i.e. "?somestring", "somestring?" or "?somestring?", then choose the appropriate where condition.
var wildcardResult = evaluateWildcard(someString);
switch(wildcardResult.Result)
{
case WildcardResult.NoWildcard:
list = list.Where( p => p.Master_Bill == wildcardResult.SearchString);
break;
case WildcardResult.StartsWith:
list = list.Where( p => p.Master_Bill.StartsWith(wildcardResult.SearchString));
break;
case WildcardResult.EndsWith:
list = list.Where( p => p.Master_Bill.EndsWith(wildcardResult.SearchString));
break;
case WildcardResult.Contains:
list = list.Where( p => p.Master_Bill.Contains(wildcardResult.SearchString));
break;
}
Where the result class contains an enum for the detected search expression pattern, and the search expression with the wildcard characters stripped to use as the SearchString.
It would also be advisable to evaluate the length of the search string for a minimum viable length when using wildcards. Users could trigger rather expensive queries by using expressions like "?" or "?e?".
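A minimal sketch of what such an evaluation could look like. The enum name and its members come from the switch statement above; everything else (WildcardEvaluation, WildcardParser, the PascalCase method name, and the minimum length of 3) is an assumption for illustration. A leading "?" means the value must end with the text, a trailing "?" means it must start with it.

```csharp
using System;

// Pattern names taken from the switch statement in the answer.
public enum WildcardResult { NoWildcard, StartsWith, EndsWith, Contains }

// Hypothetical result type: detected pattern plus the text with wildcards stripped.
public sealed class WildcardEvaluation
{
    public WildcardResult Result { get; private set; }
    public string SearchString { get; private set; }

    public WildcardEvaluation(WildcardResult result, string searchString)
    {
        Result = result;
        SearchString = searchString;
    }
}

public static class WildcardParser
{
    private const char Wildcard = '?';   // wildcard character used in the answer's examples
    public const int MinimumLength = 3;  // assumed threshold; tune it for your data

    public static WildcardEvaluation EvaluateWildcard(string input)
    {
        if (input == null) throw new ArgumentNullException("input");

        bool leading = input.Length > 0 && input[0] == Wildcard;
        bool trailing = input.Length > 1 && input[input.Length - 1] == Wildcard;

        // Only leading and trailing wildcards are stripped; interior ones stay literal.
        string stripped = input.Trim(Wildcard);

        WildcardResult kind =
            leading && trailing ? WildcardResult.Contains :
            leading ? WildcardResult.EndsWith :
            trailing ? WildcardResult.StartsWith :
            WildcardResult.NoWildcard;

        // Reject expressions such as "?" or "?e?" that would trigger expensive scans.
        if (kind != WildcardResult.NoWildcard && stripped.Length < MinimumLength)
            throw new ArgumentException("Wildcard searches need more literal characters.", "input");

        return new WildcardEvaluation(kind, stripped);
    }
}
```

The result can then drive the switch statement shown earlier, with the exception surfaced to the user as a validation message.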
Edit: DbFunctions.Like does work with SQL Server as well. Any error you are getting is likely due to an assumption about the IQueryable you are running or the field you are comparing (e.g. not a mapped column, or a particular data type).
For instance, something like this works just fine:
var data = _context.People.Where(p => DbFunctions.Like(p.Name, "s%")).ToList();
This returns all People with a Name starting with "s" (case-insensitively, assuming the column uses a case-insensitive collation).
I'd look at what your entire IQueryable looks like, and check that Master_Bill is both a mapped column and a regular NVARCHAR/VARCHAR column.

Group by Linq setting properties

I'm working on a groupby query using Linq, but I want to set the value for a new property in combination with another list. This is my code:
var result = list1.GroupBy(f => f.Name)
.ToList()
.Select(b => new Obj
{
ClientName = b.Name,
Status = (AnotherClass.List().Where(a=>a.state_id=b.????).First()).Status
})
I know I'm using a group by, but I'm not sure how to access the value inside my b collection to compare it with a.state_id.
This snippet:
Status = (AnotherClass.List().Where(a=>a.state_id=b.????).First()).Status
I've done this before, but that was months ago and I don't remember the syntax. When I put a dot after b I only have access to Key and the LINQ methods. What should the syntax be?
The issue in your code is here:
a=>a.state_id=b.????
Why?
Check the type of b: it is IGrouping<TKey,TElement>, because GroupBy on an IEnumerable returns an IEnumerable<IGrouping<TKey,TElement>>.
What does that mean?
Think of a grouping operation in a database: when you GROUP BY a given key, the remaining selected columns need an aggregation, since there can be more than one record per key and that has to be represented somehow.
How it is represented in your code:
Let's assume list1 contains objects of type T.
You grouped the data by the Name property, which is part of type T.
There's no projection, so for a given key the remaining data is collected as an IEnumerable<T> of grouped values.
The result has the form IEnumerable<IGrouping<TKey,TElement>>, where TKey is the type of Name and TElement is T; each grouping is itself an IEnumerable<T>.
Let's break your original code into parts:
var result = list1.GroupBy(f => f.Name) - result is of type IEnumerable<IGrouping<string,T>>, where list1 is IEnumerable<T>
In result.Select(b => ...), b is of type IGrouping<string,T>
You can run further LINQ queries on b:
b.Key gives access to the Name key (so ClientName should be b.Key, not b.Name). There is no b.Value; instead, use one of the following or any other relevant LINQ operation:
a=>b.Any(x => a.state_id == x.state_id) // true if any element in the group has a matching id
a=>a.state_id == b.Select(x => x.state_id).FirstOrDefault() // compares against the first element's id (or the default value)
That way you can build the final result from the IGrouping<string,T> as your use case requires.
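Putting it together, here is a small runnable sketch using LINQ to Objects. Item and State are hypothetical stand-ins for the question's element type and for whatever AnotherClass.List() returns, and taking the state_id of the first element in each group is just one possible choice.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes standing in for the question's types.
public class Item { public string Name { get; set; } public int state_id { get; set; } }
public class State { public int state_id { get; set; } public string Status { get; set; } }
public class Obj { public string ClientName { get; set; } public string Status { get; set; } }

public static class GroupingSketch
{
    public static List<Obj> Summarize(IEnumerable<Item> list1, IEnumerable<State> states)
    {
        // Build the id -> status map once instead of scanning the state list per group.
        var statusById = states.ToDictionary(s => s.state_id, s => s.Status);

        return list1
            .GroupBy(f => f.Name)
            .Select(b => new Obj
            {
                ClientName = b.Key,                       // the grouping key (Name)
                // Groups are never empty, so First() is safe here.
                // The indexer throws if the id has no matching state.
                Status = statusById[b.First().state_id]
            })
            .ToList();
    }
}
```

If a group can contain several different state_id values, you would need to decide which one wins, for example with b.Max(x => x.state_id) instead of b.First().state_id.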

EF filtering/searching with multiple words

I have a simple custom table with a search/filter field. I leave the implementation of the search up to each use of the table.
So let's say I have users in my table and I want to search for them. I want to search both in users firstname, lastname and also any role they are in.
This would probably do the trick
searchString = searchString.ToLower();
query = query.Where(
x =>
x.FirstName.ToLower().Contains(searchString)
||
x.LastName.ToLower().Contains(searchString)
||
x.Roles.Any(
role =>
role.Name.ToLower().Contains(searchString)
)
);
But now I want to search/filter on multiple words. First I get an array of all separate words.
var searchStrings = searchString.ToLower().Split(null);
I tried the following, but it does not fulfill the requirements listed further down, because it returns any user where any word matches in any field. I need all words to match (but possibly in different fields). See below for more details.
query = query.Where(
x =>
searchStrings.Any(word => x.FirstName.ToLower().Contains(word))
||
searchStrings.Any(word => x.LastName.ToLower().Contains(word))
//snipped away roles search for brevity
);
First let me produce some data
Users (data)
Billy-James Carter is admin and manager
James Carter is manager
Billy Carter has no role
Cases
If my search string is "billy car" I want Billy-James and Billy returned but not James Carter (so all words must match but not on same field).
If my search string is "bil jam" or even "bil jam car" I only want Billy-James returned as he is the only one matching all terms/words. So in this the words bil and jam were both found in the FirstName field while the car term was found in the LastName field. Only getting the "car" part correct is not enough and James is not returned.
If I search for "car man", Billy-James and James are both managers (man) and named Carter, so both should show up. Should I search for "car man admi", only Billy-James should show up.
I am happy to abandon my current approach if better is suggested.
I cannot think of a way to wrap what you're looking for up into a single LINQ statement. There may be a way, but I know with EF the options are more limited than LINQ on an object collection. With that said, why not grab a result set from the database with the first word in the split, then filter the resulting collection further?
var searchWords = searchString.ToLower().Split(' ');
// Copy array elements into locals: the lambdas run later, so they must not capture the loop index.
var firstWord = searchWords[0];
var results = dataBase.Where(i => i.FirstName.ToLower().Contains(firstWord)
|| i.LastName.ToLower().Contains(firstWord)
|| i.Role.ToLower().Contains(firstWord));
for(int x = 1; x < searchWords.Length; x++) {
var word = searchWords[x];
results = results.Where(i => i.FirstName.ToLower().Contains(word)
|| i.LastName.ToLower().Contains(word)
|| i.Role.ToLower().Contains(word));
}
Your final content of the results collection will be what you're looking for.
Disclaimer: I didn't have a setup at the ready to test this, so there may be something like a .ToList() needed to make this work, but it's basically functional.
Update: More information about EF and deferred execution, and string collection search
Given we have the schema:
Employee:
FirstName - String
LastName - String
Roles - One to Many
Role:
Name - String
The following will build a query for everything you want to find
var searchTerms = SearchString.ToLower().Split(null);
var term = searchTerms[0];
var results = from e in entities.Employees
where (e.FirstName.Contains(term)
|| e.LastName.Contains(term)
|| e.Roles.Select(r => r.Name).Any(n => n.Contains(term)))
select e;
if (searchTerms.Length > 1)
{
for (int i = 1; i < searchTerms.Length; i++)
{
var tempTerm = searchTerms[i];
results = from e in results
where (e.FirstName.Contains(tempTerm)
|| e.LastName.Contains(tempTerm)
|| e.Roles.Select(r => r.Name).Any(n => n.Contains(tempTerm)))
select e;
}
}
At this point the query still has not been executed. As you filter the result set in the loop, this is actually adding additional AND clauses to the search criteria. The query doesn't execute until you run a command that does something with the result set like ToList(), iterating over the collection, etc. Put a break point after everything that builds the query and take a look at it. LINQ to SQL is both interesting and powerful.
More on deferred execution
One thing that needs explanation is the variable tempTerm. The lambda is not evaluated until the query executes, so it needs a variable scoped inside the loop body; otherwise every clause would read searchTerms[i] after the loop has finished instead of the term that was current when the clause was added.
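The effect is easy to reproduce without a database. The sketch below uses LINQ to Objects, so the exact failure differs from what an EF provider would report; ClosureSketch and both method names are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureSketch
{
    // Captures the loop counter itself: every lambda reads i when the query finally runs,
    // and by then i == terms.Length.
    public static IEnumerable<string> FilterCapturingCounter(IEnumerable<string> source, string[] terms)
    {
        var query = source;
        for (int i = 0; i < terms.Length; i++)
        {
            query = query.Where(s => s.Contains(terms[i]));
        }
        return query; // nothing has been evaluated yet
    }

    // Copies the current term into a variable declared inside the loop body,
    // so each lambda keeps its own value.
    public static IEnumerable<string> FilterCapturingCopy(IEnumerable<string> source, string[] terms)
    {
        var query = source;
        for (int i = 0; i < terms.Length; i++)
        {
            var tempTerm = terms[i];
            query = query.Where(s => s.Contains(tempTerm));
        }
        return query;
    }
}
```

Enumerating the result of FilterCapturingCounter with a non-empty source and at least one term throws an IndexOutOfRangeException, because terms[i] is evaluated with the final value of i. FilterCapturingCopy applies each term as an additional AND filter, as intended.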
I simplified it a bit
//we want to search/filter
if (!string.IsNullOrEmpty(request.SearchText))
{
var searchTerms = request.SearchText.ToLower().Split(null);
foreach (var term in searchTerms)
{
string tmpTerm = term;
query = query.Where(
x =>
x.Name.ToLower().Contains(tmpTerm)
);
}
}
I build a much bigger query where searching is just a part, starting like this
var query = _context.RentSpaces.Where(x => x.Property.PropertyId == request.PropertyId).AsQueryable();
The search above only uses one field, but it should work just as well with more complex conditions, like in my user example.
I usually take the approach of queuing up the queries. They are all executed in one step at the database, as you can see with the diagnostic tools:
IQueryable<YourEntity> entityQuery = context.YourEntity.AsQueryable();
foreach (string term in searchTerms)
{
entityQuery = entityQuery.Where(a => a.YourProperty.Contains(term));
}
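As a check against the cases in the question, here is the chained-Where approach run over the three sample users. This is a sketch on in-memory data via AsQueryable, not a test against a real EF provider, and it simplifies roles to a list of strings; the User class, UserSearch and SampleUsers are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's user entity (roles reduced to their names).
public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<string> Roles { get; set; }
}

public static class UserSearch
{
    public static IQueryable<User> Search(IQueryable<User> query, string searchString)
    {
        // A null separator splits on whitespace; empty entries are dropped.
        var words = searchString.ToLower()
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        // Each Where adds one AND clause: every word must match in at least one field.
        foreach (var word in words)
        {
            query = query.Where(u =>
                u.FirstName.ToLower().Contains(word) ||
                u.LastName.ToLower().Contains(word) ||
                u.Roles.Any(r => r.ToLower().Contains(word)));
        }
        return query;
    }

    // The three users from the question.
    public static List<User> SampleUsers()
    {
        return new List<User>
        {
            new User { FirstName = "Billy-James", LastName = "Carter", Roles = new List<string> { "admin", "manager" } },
            new User { FirstName = "James", LastName = "Carter", Roles = new List<string> { "manager" } },
            new User { FirstName = "Billy", LastName = "Carter", Roles = new List<string>() }
        };
    }
}
```

Calling UserSearch.Search(UserSearch.SampleUsers().AsQueryable(), "bil jam car") yields only Billy-James, while "billy car" yields Billy-James and Billy, matching the expectations listed in the question.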

Linq Extension method for Join

I am in the process of learning LINQ, ASP.NET, EF, and MVC via online video tutorials. I would love some help understanding Joins in LINQ extension method syntax.
For simplification, I have two tables (these map to a SQL DB):
User Table:
public int userID{get;set;}
public string firstName{get;set;}
...
Address
public int ownerID{get;set;}
public int value{get;set;}
public string Nickname{get;set;}
public string street{get;set;}
public string zip{get;set;}
...
Let's say I want to find all the property that a particular user owns. I believe I can do something like this:
var test = db.User
.Join(db.Address, user => user.userID, add => add.ownerID, (user, add) => new { user, add });
Source: http://byatool.com/c/linq-join-method-and-how-to-use-it/
This should be equivalent to
SELECT * FROM User a JOIN Address b on a.userID = b.ownerID
Please confirm that this is correct.
Now, what if I wanted to find all property that a particular user owns that has a value greater than x. Let's take it a step further and say x is a result from another LINQ query. How do I force execution of x inside of a second query? Do I even have to consider this, or will LINQ know what to do in this case?
Thanks
EDIT:
When I try to use the result of a query as a parameter in another, I am required to use a greedy operator to force execution. Many people like to use .Count() or .ToList(). I only expect x (from example above) to return 1 string by using .Take(1). If I append ToList() to the end of my first query, I am required to use x[0] in my second query. This seems like a messy way to do things. Is there a better way to force execution of a query when you know you will only have 1 result?
If I understand your question, you're trying to do a conditional on a joined model?
var query = db.Users.Where(x => x.Addresses.Any(y => y.Value > yourValue));
That will return all users who have at least one property with a value greater than yourValue. If you need to load the addresses along with the users, add Include to your query. Note that Include loads all of a user's addresses, not only the ones that matched. For example:
query = query.Include(x => x.Addresses);
Given a navigation property such as Addresses, you don't need to write the Join from your example by hand.
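On the EDIT in the question: when a query is expected to produce exactly one value, First() or Single() both execute it immediately and return the element itself, so there is no need for Take(1).ToList() followed by x[0]. Below is a sketch with LINQ to Objects using the explicit Join from the question; JoinSketch, ExpensiveProperties and the idea of taking the threshold from a reference address are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Property names follow the question's tables.
public class User { public int userID { get; set; } public string firstName { get; set; } }
public class Address
{
    public int ownerID { get; set; }
    public int value { get; set; }
    public string Nickname { get; set; }
}

public static class JoinSketch
{
    public static List<string> ExpensiveProperties(
        IEnumerable<User> users, IEnumerable<Address> addresses,
        int userId, string referenceNickname)
    {
        // First() runs the query right away and returns a plain int
        // (it throws if no address has that nickname).
        int x = addresses
            .Where(a => a.Nickname == referenceNickname)
            .Select(a => a.value)
            .First();

        // Join the selected user to their addresses worth more than x.
        return users
            .Where(u => u.userID == userId)
            .Join(addresses.Where(a => a.value > x),
                  u => u.userID,
                  a => a.ownerID,
                  (u, a) => u.firstName + ": " + a.Nickname)
            .ToList();
    }
}
```

Use Single() instead of First() if more than one match should be treated as an error, or FirstOrDefault() if a missing value is acceptable.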

Detecting "near duplicates" using a LINQ/C# query

I'm using the following queries to detect duplicates in a database.
Using a LINQ join doesn't work very well because Company X may also be listed as CompanyX, therefore I'd like to amend this to detect "near duplicates".
var results = result
.GroupBy(c => new {c.CompanyName})
.Select(g => new CompanyGridViewModel
{
LeadId = g.First().LeadId,
Qty = g.Count(),
CompanyName = g.Key.CompanyName,
}).ToList();
Could anybody suggest a way in which I have better control over the comparison? Perhaps via an IEqualityComparer (although I'm not exactly sure how that would work in this situation)
My main goals are:
To list the first record with a subset of all duplicates (or "near duplicates")
To have some flexibility over the fields and text comparisons I use for my duplicates.
For your explicit "ignoring spaces" case, you can simply call
var results = result.GroupBy(c => c.Name.Replace(" ", ""))...
However, in the general case where you want flexibility, I'd build up a library of IEqualityComparer<Company> classes to use in your groupings. For example, this should do the same in your "ignore space" case:
public class CompanyNameIgnoringSpaces : IEqualityComparer<Company>
{
public bool Equals(Company x, Company y)
{
return x.Name.Replace(" ", "") == y.Name.Replace(" ", "");
}
public int GetHashCode(Company obj)
{
return obj.Name.Replace(" ", "").GetHashCode();
}
}
which you could use as
var results = result.GroupBy(c => c, new CompanyNameIgnoringSpaces())...
It's pretty straightforward to do similar things containing multiple fields, or other definitions of similarity, etc.
Just note that your definition of "similar" must be transitive, e.g. if you're looking at integers you can't define "similar" as "within 5", because then you'd have "0 is similar to 5" and "5 is similar to 10" but not "0 is similar to 10". (It must also be reflexive and symmetric, but that's more straightforward.)
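One way to get transitivity for free is to group by a normalized key rather than by a pairwise comparison: two names are "similar" exactly when their keys are equal. A minimal sketch with LINQ to Objects follows; the Company shape, NearDuplicates, NormalizeName and the normalization rule (keep only letters and digits, ignore case) are assumptions, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Company { public int LeadId { get; set; } public string Name { get; set; } }

public static class NearDuplicates
{
    // Assumed normalization: keep only letters and digits, compare case-insensitively.
    public static string NormalizeName(string name)
    {
        return new string(name.Where(c => char.IsLetterOrDigit(c)).ToArray())
            .ToUpperInvariant();
    }

    // Returns (first LeadId, first spelling, count) for every key that occurs more than once.
    public static List<Tuple<int, string, int>> Find(IEnumerable<Company> companies)
    {
        return companies
            .GroupBy(c => NormalizeName(c.Name))
            .Where(g => g.Count() > 1)
            .Select(g => Tuple.Create(g.First().LeadId, g.First().Name, g.Count()))
            .ToList();
    }
}
```

With "Company X", "CompanyX" and "company-x" in the input, all three land in one group, and the first record's LeadId and spelling represent it.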
Okay, so since you're looking for different permutations you could do something like this:
Bear in mind this was written in the answer so it may not fully compile, but you get the idea.
var results = result
.Where(c => CompanyNamePermutations(c.CompanyName).Contains(c.CompanyName))
.GroupBy(c => new {c.CompanyName})
.Select(g => new CompanyGridViewModel
{
LeadId = g.First().LeadId,
Qty = g.Count(),
CompanyName = g.Key.CompanyName,
}).ToList();
private static List<string> CompanyNamePermutations(string companyName)
{
// build your permutations here
// so to build the one in your example
return new List<string>
{
companyName,
string.Join("", companyName.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries))
};
}
In this case you need to define where the work is going to take place i.e. fully on the server, in local memory or a mixture of both.
In local memory:
In this case we have two routes, to pull back all the data and just do the logic in local memory, or to stream the data and apply the logic piecewise. To pull all the data just ToList() or ToArray() the base table. To stream the data would suggest using ToLookup() with custom IEqualityComparer, e.g.
public class CustomEqualityComparer: IEqualityComparer<String>
{
public bool Equals(String str1, String str2)
{
//custom logic
}
public int GetHashCode(String str)
{
// custom logic
}
}
//result
var results = result.ToLookup(r => r.Name,
new CustomEqualityComparer())
.Select(r => ....)
Fully on the server:
Depends on your provider and what it can successfully map. E.g. if we define a near duplicate as one with an alternative delimiter one could do something like this:
private char[] delimiters = new char[]{' ','-','*'};
var results = result.GroupBy(r => delimiters.Aggregate(r.Name, (name, d) => name.Replace(d.ToString(), "")))...
Mixture:
In this case we are splitting the work between the two. Unless you come up with a nice scheme this route is most likely to be inefficient. E.g. if we keep the logic on the local side, build groupings as a mapping from a name into a key and just query the resulting groupings we can do something like this:
var groupings = result.Select(r => r.Name)
//pull into local memory
.ToArray()
//do local grouping logic...
//Query results
var results = result.GroupBy(r => groupings[r]).....
Personally I usually go with the first option, pulling all the data for small data sets and streaming large data sets (empirically I found streaming with logic between each pull takes a lot longer than pulling all the data then doing all the logic)
Notes: Depending on the provider, ToLookup() usually executes immediately and applies its logic piecewise while it is being built.
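For the local-memory route, one possible way to fill in the "custom logic" of such a comparer is sketched below. It treats names as equal when they match after removing the delimiters ' ', '-' and '*' and ignoring case; the class name DelimiterInsensitiveComparer and that particular rule are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class DelimiterInsensitiveComparer : IEqualityComparer<string>
{
    private static readonly char[] Delimiters = { ' ', '-', '*' };

    // Removes every delimiter by splitting on them and gluing the pieces back together.
    private static string Strip(string s)
    {
        return string.Concat(s.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries));
    }

    public bool Equals(string x, string y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(Strip(x), Strip(y), StringComparison.OrdinalIgnoreCase);
    }

    // Must agree with Equals: equal strings have to produce equal hash codes.
    public int GetHashCode(string s)
    {
        return s == null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(Strip(s));
    }
}
```

It can be passed to ToLookup, for example names.ToLookup(n => n, new DelimiterInsensitiveComparer()), after which lookup["COMPANY*X"] returns every spelling that normalizes to the same text.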
