Why do escaped backslashes appear in an Elasticsearch NEST query? - c#

I am trying to write a C# method that gets the query string of a controller and converts it into an Elasticsearch query, like below:
public QueryContainerDescriptor<T> Convert<T> (IQueryCollection query) where T: class
{
var selector = new QueryContainerDescriptor<T>();
List<QueryContainer> Must = new List<QueryContainer>();
foreach(var key in query.Keys)
{
string value = query[key];
var match = new MatchQuery { Field = $"{key}.keyword", Query = value };
Must.Add(match);
}
selector.Bool(q => q.Must(Must.ToArray()));
return selector;
}
It works as expected, but if I pass a queryString value with a backslash, like:
http://localhost:5000/api/indexData?user=ESKA\\USER
It should be converted into this query:
{ "bool": { "must": [ { "match" : { "user.keyword": "ESKA\\USER" } } ] } }
But Elasticsearch returns nothing, because the value in the query is ESKA\\\\USER, with 4 backslashes:
{ "bool": { "must": [ { "match" : { "user.keyword": "ESKA\\\\USER" } } ] } }
How can I solve this issue?

I don't think that Nest is performing any escaping of backslashes. Here's an example that writes out the requests (and responses, if using an IConnection that sends the request)
private static void Main()
{
var pool = new SingleNodeConnectionPool(new Uri($"http://localhost:9200"));
var settings = new ConnectionSettings(pool, new InMemoryConnection())
.DefaultIndex("default_index")
.DisableDirectStreaming()
.PrettyJson()
.OnRequestCompleted(callDetails =>
{
if (callDetails.RequestBodyInBytes != null)
{
var json = JObject.Parse(Encoding.UTF8.GetString(callDetails.RequestBodyInBytes));
Console.WriteLine(
$"{callDetails.HttpMethod} {callDetails.Uri} \n" +
$"{json.ToString(Newtonsoft.Json.Formatting.Indented)}");
}
else
{
Console.WriteLine($"{callDetails.HttpMethod} {callDetails.Uri}");
}
Console.WriteLine();
if (callDetails.ResponseBodyInBytes != null)
{
Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
$"{Encoding.UTF8.GetString(callDetails.ResponseBodyInBytes)}\n" +
$"{new string('-', 30)}\n");
}
else
{
Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
$"{new string('-', 30)}\n");
}
});
var client = new ElasticClient(settings);
var collection = new QueryCollection(new Dictionary<string, StringValues>
{
{ "user", "ESKA\\USER" }
});
var response = client.Search<object>(s => s
.Query(q => Convert<object>(q, collection))
);
}
public static QueryContainerDescriptor<T> Convert<T>(QueryContainerDescriptor<T> selector, IQueryCollection query) where T : class
{
List<QueryContainer> Must = new List<QueryContainer>();
foreach (var key in query.Keys)
{
string value = query[key];
var match = new MatchQuery { Field = $"{key}.keyword", Query = value };
Must.Add(match);
}
selector.Bool(q => q.Must(Must.ToArray()));
return selector;
}
The resulting query is
POST http://localhost:9200/default_index/_search?pretty=true&typed_keys=true
{
"query": {
"bool": {
"must": [
{
"match": {
"user.keyword": {
"query": "ESKA\\USER"
}
}
}
]
}
}
}
If the user value were to be the verbatim string literal @"ESKA\\USER", then the resulting query would be
"user.keyword": {
"query": "ESKA\\\\USER"
}
because each \ in the verbatim string literal is an actual backslash character, and every backslash character has to be escaped as \\ in JSON.
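To see the two levels of escaping separately, here is a small sketch that uses only the base class library. It assumes .NET Core 3.0 or later for System.Text.Json. NEST uses its own serializer, so this only illustrates the JSON escaping rule, not NEST internals. The class name EscapeDemo is made up for illustration.

```csharp
using System;
using System.Text.Json;

public static class EscapeDemo
{
    // Regular literal: the compiler turns \\ into ONE backslash character.
    public const string Regular = "ESKA\\USER";

    // Verbatim literal: nothing is unescaped, so the string holds TWO backslashes.
    public const string Verbatim = @"ESKA\\USER";

    // JSON encoding writes every backslash character in the value as \\.
    public static string ToJson(string value) => JsonSerializer.Serialize(value);

    public static void Main()
    {
        Console.WriteLine(Regular.Length);   // 9
        Console.WriteLine(Verbatim.Length);  // 10
        Console.WriteLine(ToJson(Regular));  // "ESKA\\USER"
        Console.WriteLine(ToJson(Verbatim)); // "ESKA\\\\USER"
    }
}
```

Applied to the question: a query string value of ESKA\\USER already contains two backslash characters, so four backslashes in the request JSON is the correct encoding of that value. If the indexed value contains a single backslash, the URL should carry a single backslash as well (percent-encoded as %5C).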

Related

How to search and get only sub document using C# mongoDB

I have data in provinces collection like this:
{
"_id": {
"$oid": "63dc7ff82e7e5e91c0f1cd87"
},
"province": "province1",
"districts": [
{
"district": "district1",
"sub_districts": [
{
"sub_district": "sub_district1",
"zip_codes": [
"zip_code1"
]
},
{
"sub_district": "sub_district2",
"zip_codes": [
"zip_code2"
]
}
]
}
]
}
This is how I get a list of sub_district for now:
- I search for the province using Builders.Filter.
- I use foreach to walk the districts array (in the provinces collection) and an if statement to check whether the district equals searchDistrict.
- I get the sub_districts array of that district.
Source code:
public static List<string> MongoDbSelectSubDistrict(string searchProvince, string searchDistrict)
{
List<string> subDistrictList = new List<string>();
try
{
var provincesCollection = _db.GetCollection<BsonDocument>("provinces");
var builder = Builders<BsonDocument>.Filter;
var filter = builder.Empty;
if (searchProvince != "")
{
var provinceFilter = Builders<BsonDocument>.Filter.Eq("province", searchProvince);
filter &= provinceFilter;
}
/*
//***Need to be revised***
if (searchDistrict != "")
{
var districtFilter = Builders<BsonDocument>.Filter.Eq("provinces.district", searchDistrict);
filter &= districtFilter;
}
*/
var queryProvinces = provincesCollection.Find(filter).ToList();
foreach (BsonDocument queryProvince in queryProvinces)
{
BsonArray districtArray = queryProvince.GetValue("districts").AsBsonArray;
foreach (BsonDocument districtDocument in districtArray)
{
string district = districtDocument.GetValue("district").ToString();
if (district == searchDistrict) //***Need to be revised***
{
BsonArray subDistrictArray = districtDocument.GetValue("sub_districts").AsBsonArray;
foreach (BsonDocument subDistrictDocument in subDistrictArray)
{
string subDistrict = subDistrictDocument.GetValue("sub_district").ToString();
subDistrictList.Add(subDistrict);
}
}
}
}
}
catch (TimeoutException ex)
{
}
return subDistrictList;
}
Is there a more efficient way to get this?
This is what I want:
[
{
"sub_district": "sub_district1",
"zip_codes": [
"zip_code1"
]
},
{
"sub_district": "sub_district2",
"zip_codes": [
"zip_code2"
]
}
]
And one more question: if I want to search for sub_district in the collection, how do I get this without looping in sub_districts array?

How to serialize sql data without the column names?

I am serializing data from a SQL database to JSON. How can I serialize just the values without the column names, or is there a function to trim the serialized JSON before deserializing it?
I read about ScriptIgnoreAttribute but didn't see how it relates to what I want to do.
Original JSON
[
{
"CODE": "AF",
"TOTALVALUE": "$23,554,857.27"
},
{
"CODE": "AS",
"TOTALVALUE": "$38,379,964.65"
},
{
"CODE": "SG",
"TOTALVALUE": "$24,134,283.47"
}
]
Desired JSON
[
{
"AF": "$23,554,857.27"
},
{
"AS": "$38,379,964.65"
},
{
"SG": "$24,134,283.47"
}
]
SQL View structure
My SQL query to return the data
SELECT [CODE],[TOTALVALUE] FROM [dbo].[vw_BuyersByCountryValue]
Code for Serializing in ASP.NET
[WebMethod]
public void GetBuyersByCountryValue()
{
using (PMMCEntities ctx = new PMMCEntities())
{
ctx.Configuration.ProxyCreationEnabled = false;
var qry = ctx.vw_BuyersByCountryValue.ToList();
var js = new JavaScriptSerializer();
string strResponse = js.Serialize(qry);
Context.Response.Clear();
Context.Response.ContentType = "application/json";
Context.Response.AddHeader("content-length", strResponse.Length.ToString(CultureInfo.InvariantCulture));
Context.Response.Flush();
Context.Response.Write(strResponse);
HttpContext.Current.ApplicationInstance.CompleteRequest();
}
}
It is very simple
// data from the query
// SELECT CODE, TOTALVALUE FROM vw_BuyersByCountryValue
var sqldata = new []
{
new { Code = "AF", TotalValue = "$23,554,857.27" },
new { Code = "AS", TotalValue = "$38,379,964.65" },
new { Code = "SG", TotalValue = "$24,134,283.47" },
};
var mappeddata = sqldata.Select( r =>
{
var dict = new Dictionary<string,string>();
dict[r.Code] = r.TotalValue;
return dict;
});
var json = JsonConvert.SerializeObject(mappeddata,Formatting.Indented);
content of json
[
{
"AF": "$23,554,857.27"
},
{
"AS": "$38,379,964.65"
},
{
"SG": "$24,134,283.47"
}
]
You can even populate it as
{
"AF": "$23,554,857.27",
"AS": "$38,379,964.65",
"SG": "$24,134,283.47"
}
with
var sqldata = new []
{
new { Code = "AF", TotalValue = "$23,554,857.27" },
new { Code = "AS", TotalValue = "$38,379,964.65" },
new { Code = "SG", TotalValue = "$24,134,283.47" },
};
var mappeddata = sqldata.ToDictionary(r => r.Code, r => r.TotalValue);
var json = JsonConvert.SerializeObject(mappeddata,Formatting.Indented);
Update
[WebMethod]
public void GetBuyersByCountryValue()
{
using (PMMCEntities ctx = new PMMCEntities())
{
ctx.Configuration.ProxyCreationEnabled = false;
var qry = ctx.vw_BuyersByCountryValue.ToList();
var mapped = qry.Select(r =>
{
var dict = new Dictionary<string,string>();
dict[r.CODE] = r.TOTALVALUE;
return dict;
});
string strResponse = Newtonsoft.Json.JsonConvert.SerializeObject(mapped);
Context.Response.Clear();
Context.Response.ContentType = "application/json";
Context.Response.AddHeader("content-length", strResponse.Length.ToString(CultureInfo.InvariantCulture));
Context.Response.Flush();
Context.Response.Write(strResponse);
HttpContext.Current.ApplicationInstance.CompleteRequest();
}
}
You need the NuGet package Newtonsoft.Json
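If adding Newtonsoft.Json is not an option, the same mapping can be sketched with System.Text.Json, which ships with .NET Core 3.0 and later (on .NET Framework it is a separate NuGet package). This swaps the serializer only; the mapping idea is the one shown above. The BuyersJson class and the Row type are hypothetical stand-ins for the entity rows of vw_BuyersByCountryValue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class BuyersJson
{
    // Hypothetical stand-in for one row of vw_BuyersByCountryValue.
    public sealed class Row
    {
        public string Code { get; set; }
        public string TotalValue { get; set; }
    }

    // One single-entry object per row: [{"AF":"..."},{"AS":"..."}]
    public static string AsArrayOfPairs(IEnumerable<Row> rows) =>
        JsonSerializer.Serialize(
            rows.Select(r => new Dictionary<string, string> { [r.Code] = r.TotalValue })
                .ToList());

    // One object keyed by code: {"AF":"...","AS":"..."}
    // ToDictionary throws if two rows share the same code.
    public static string AsSingleObject(IEnumerable<Row> rows) =>
        JsonSerializer.Serialize(rows.ToDictionary(r => r.Code, r => r.TotalValue));
}
```

In the web method, the list returned by ctx.vw_BuyersByCountryValue.ToList() would be projected into whatever shape these helpers expect before serializing.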

NEST compound queries which must all be satisfied

var availableToField = Infer.Field<Project>(f => f.Availablity.AvailableTo);
var availableFromField = Infer.Field<Project>(f => f.Availablity.AvailableFrom);
var nameField = Infer.Field<Project>(f => f.Contact.Name);
var active_date_to = new DateRangeQuery(){
Name = "toDate",
Boost = 1.1,
Field = "availablity.availableTo",
GreaterThan = DateTime.Now,
TimeZone = "+01:00",
Format = "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy"
};
var active_date_from = new DateRangeQuery(){
Name = "from",
Boost = 1.1,
Field = "availablity.availableFrom",
LessThanOrEqualTo = DateTime.Now,
TimeZone = "+01:00",
Format = "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy"
};
public ISearchResult<Project> Search(SearchCriteria criteria)
{var ret = _client.Search<Project>(s =>
s.Query(q =>
active_date_from &&
active_date_to &&
q.Match(d => d.Query(criteria.FreeText))
).From(criteria.CurrentPage).Size(criteria.Take)
.From(criteria.CurrentPage)
.Take(criteria.Take)
);
result.Total = ret.Total;
result.Page = criteria.CurrentPage;
result.PerPage = criteria.Take;
result.Results = ret.Documents;
return result;
}
What I'm trying to do is get the results that match the free text and also fall within the date range.
Somehow, though, what I get is an "Invalid NEST response built from an unsuccessful low level call on POST", and as a consequence an empty query.
There are no compile errors.
Does anyone have an idea where I could have gone wrong or what I'm missing?
The other thing I tried was:
var mustClauses = new List<QueryContainer>();
mustClauses.Add(active_date_from);
mustClauses.Add(active_date_to);
mustClauses.Add(new TermQuery
{
Field = "contact.name",
Value = criteria.FreeText
});
var searchRequest = new SearchRequest<Project>()
{
Size = 10,
From = 0,
Query = new BoolQuery
{
Must = mustClauses
}
};
var ret = _client.Search<Project>(searchRequest);
result.Total = ret.Total;
result.Page = criteria.CurrentPage;
result.PerPage = criteria.Take;
result.Results = ret.Documents;
which got me pretty much the same results (read: none).
Is there something I'm missing?
Edit:
However, this:
var ret = _client.Search<Project>(s => s.Query(q => q.Match(m => m.Field(f => f.DisplayName).Query(criteria.FreeText))));
gives me exactly what I want (without the validation of the dates, of course, and only looking at one field).
In your first example, the match query is missing a field property which is needed for the query. Because of NEST's conditionless query behaviour, the query is not serialized as part of the request. The two date range queries are serialized however.
Here's a simple example that you may find useful to get the correct query you're looking for
void Main()
{
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var defaultIndex = "projects";
var connectionSettings = new ConnectionSettings(pool, new InMemoryConnection())
.DefaultIndex(defaultIndex )
.PrettyJson()
.DisableDirectStreaming()
.OnRequestCompleted(response =>
{
if (response.RequestBodyInBytes != null)
{
Console.WriteLine(
$"{response.HttpMethod} {response.Uri} \n" +
$"{Encoding.UTF8.GetString(response.RequestBodyInBytes)}");
}
else
{
Console.WriteLine($"{response.HttpMethod} {response.Uri}");
}
Console.WriteLine();
if (response.ResponseBodyInBytes != null)
{
Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
$"{Encoding.UTF8.GetString(response.ResponseBodyInBytes)}\n" +
$"{new string('-', 30)}\n");
}
else
{
Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
$"{new string('-', 30)}\n");
}
});
var client = new ElasticClient(connectionSettings);
var availableToField = Infer.Field<Project>(f => f.Availablity.AvailableTo);
var availableFromField = Infer.Field<Project>(f => f.Availablity.AvailableFrom);
var nameField = Infer.Field<Project>(f => f.Contact.Name);
var active_date_to = new DateRangeQuery
{
Name = "toDate",
Boost = 1.1,
Field = availableToField,
GreaterThan = DateTime.Now,
TimeZone = "+01:00",
Format = "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy"
};
var active_date_from = new DateRangeQuery
{
Name = "from",
Boost = 1.1,
Field = availableFromField,
LessThanOrEqualTo = DateTime.Now,
TimeZone = "+01:00",
Format = "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy"
};
var ret = client.Search<Project>(s => s
.Query(q =>
active_date_from &&
active_date_to && q
.Match(d => d
.Query("free text")
)
)
.From(0)
.Size(10)
);
}
public class Project
{
public Availibility Availablity { get; set; }
public Contact Contact { get; set; }
}
public class Contact
{
public string Name { get; set; }
}
public class Availibility
{
public DateTime AvailableFrom { get; set; }
public DateTime AvailableTo { get; set; }
}
Your current query generates
POST http://localhost:9200/projects/project/_search?pretty=true
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"range": {
"availablity.availableFrom": {
"lte": "2017-07-21T10:01:01.456794+10:00",
"time_zone": "+01:00",
"format": "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy",
"_name": "from",
"boost": 1.1
}
}
},
{
"range": {
"availablity.availableTo": {
"gt": "2017-07-21T10:01:01.456794+10:00",
"time_zone": "+01:00",
"format": "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy",
"_name": "toDate",
"boost": 1.1
}
}
}
]
}
}
}
If a nameField is added as the field for the match query you get
POST http://localhost:9200/projects/project/_search?pretty=true
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"range": {
"availablity.availableFrom": {
"lte": "2017-07-21T10:02:23.896385+10:00",
"time_zone": "+01:00",
"format": "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy",
"_name": "from",
"boost": 1.1
}
}
},
{
"range": {
"availablity.availableTo": {
"gt": "2017-07-21T10:02:23.896385+10:00",
"time_zone": "+01:00",
"format": "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy",
"_name": "toDate",
"boost": 1.1
}
}
},
{
"match": {
"contact.name": {
"query": "free text"
}
}
}
]
}
}
}
Remove InMemoryConnection from ConnectionSettings if you actually want to execute the query against Elasticsearch and see the results.
The range query is a structured query where a document either matches or doesn't match the predicate. Because of this, it can be wrapped in a bool query filter clause which will forgo calculating a score for it and perform better. Because no scoring occurs, boost is not needed.
Putting this together
var availableToField = Infer.Field<Project>(f => f.Availablity.AvailableTo);
var availableFromField = Infer.Field<Project>(f => f.Availablity.AvailableFrom);
var nameField = Infer.Field<Project>(f => f.Contact.Name);
var active_date_to = new DateRangeQuery
{
Name = "toDate",
Field = availableToField,
GreaterThan = DateTime.Now,
TimeZone = "+01:00",
Format = "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy"
};
var active_date_from = new DateRangeQuery
{
Name = "from",
Field = availableFromField,
LessThanOrEqualTo = DateTime.Now,
TimeZone = "+01:00",
Format = "yyyy-MM-ddTHH:mm:SS||dd.MM.yyyy"
};
var ret = client.Search<Project>(s => s
.Query(q =>
+active_date_from &&
+active_date_to && q
.Match(d => d
.Field(nameField)
.Query("free text")
)
)
.From(0)
.Size(10)
);
You may also want to explore modelling available from and to as a date_range type

NEST: Add Term Query Under Condition

In my application I pass a boolean parameter to a function that searches certain documents in my elastic index via a HasChildQuery.
If this boolean is set to false I want to exclude documents with a specific field set, when the boolean is set to true I do not want this second condition.
This is my approach so far:
Query = new HasChildQuery
{
// ...
Query = new CommonTermsQuery
{
// This Query always needs to be there
Field = Nest.Infer.Field<FaqQuestion>(q => q.Content),
Query = content
}
&& (includeAutoLearnedData ? null : +new TermQuery
{
// I only want this Query if includeAutoLearnedData is false
Field = Nest.Infer.Field<FaqQuestion>(q => q.AutoLearned),
Value = false
})
}
My idea behind this is to always generate a request like this
has_child
|
|__ ...
|
|__ common_terms
and expand this to
has_child
|
|__ ...
|
|__ bool
|
|__must
| |
| |__common_terms
|
|__filter
|
|__term
if includeAutoLearnedData is false.
But the query for the case when it is true seems to not work.
I hoped that && (includeAutoLearnedData ? null : +new TermQuery will add the filter only when the boolean is false and leave the query unmodified when it is true
So what is the correct way of including an additional filter query under a certain condition in NEST?
EDIT:
I set a breakpoint when I get the result from my ElasticClient and expected it to have something like
Valid NEST response built from a successful low level call on POST: /faq/_search
# Audit trail of this API call:
- [1] HealthyResponse: Node: http://localhost:9200/ Took: 00:00:00.0770000
# Request:
{
"query": {
"has_child": {
"bool": {
"must": [{
"common_terms": { ... }
}],
"filter": [{
"term": { ... }
}]
}
}
}
}
but actual result had:
# Request:
{}
What you have is correct and your approach is sound. The reason you're seeing {} in the output is conditionless queries in NEST: if a query does not have certain properties set (or they are assigned null or an empty string), then the query is considered conditionless and is not serialized as part of the request. For example, a term query is considered conditionless if
- the field has an empty string assigned to it, or a null string, expression or property
- the value is null or an empty string
You can change this behaviour using verbatim and strict.
Verbatim
Individual queries can be marked as verbatim meaning that the query should be sent to Elasticsearch as is, even if it is conditionless.
Strict
Individual queries can be marked as strict meaning that if they are conditionless, an exception is thrown. This is useful for when a query must have an input value.
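As a rough sketch of how the two modes can be switched on, assuming the NEST NuGet package is referenced: the Doc class and the ConditionlessSketch helper are hypothetical names used only for illustration, and the exact exception type thrown in strict mode should be treated as an assumption to verify against your NEST version.

```csharp
using Nest;

// Hypothetical document type, for illustration only.
public class Doc
{
    public string Content { get; set; }
}

public static class ConditionlessSketch
{
    // Verbatim: the term query is kept in the request even though
    // its value is an empty string (which would otherwise make it conditionless).
    public static QueryContainer VerbatimTerm() =>
        new TermQuery
        {
            Field = Infer.Field<Doc>(d => d.Content),
            Value = "",
            IsVerbatim = true
        };

    // Strict: if value is null or empty, building the query is expected
    // to throw instead of silently dropping it from the request.
    public static QueryContainer StrictTerm(string value) =>
        new QueryContainerDescriptor<Doc>()
            .Term(t => t.Strict().Field(d => d.Content).Value(value));
}
```

Verbatim suits queries where an empty value is meaningful; strict suits queries where a missing input should be treated as a programming error.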
To demonstrate that your approach works
void Main()
{
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var defaultIndex = "default-index";
var connectionSettings = new ConnectionSettings(pool, new InMemoryConnection())
.DefaultIndex(defaultIndex)
.PrettyJson()
.DisableDirectStreaming()
.OnRequestCompleted(response =>
{
if (response.RequestBodyInBytes != null)
{
Console.WriteLine(
$"{response.HttpMethod} {response.Uri} \n" +
$"{Encoding.UTF8.GetString(response.RequestBodyInBytes)}");
}
else
{
Console.WriteLine($"{response.HttpMethod} {response.Uri}");
}
Console.WriteLine();
if (response.ResponseBodyInBytes != null)
{
Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
$"{Encoding.UTF8.GetString(response.ResponseBodyInBytes)}\n" +
$"{new string('-', 30)}\n");
}
else
{
Console.WriteLine($"Status: {response.HttpStatusCode}\n" +
$"{new string('-', 30)}\n");
}
});
var client = new ElasticClient(connectionSettings);
var includeAutoLearnedData = false;
var request = new SearchRequest<Message>
{
Query = new HasChildQuery
{
Type = "child",
Query = new CommonTermsQuery
{
Field = Infer.Field<Message>(m => m.Content),
Query = "commonterms"
}
&& (includeAutoLearnedData ? null : +new TermQuery
{
Field = Infer.Field<Message>(m => m.Content),
Value = "term"
})
}
};
client.Search<Message>(request);
}
public class Message
{
public string Content { get; set; }
}
produces the following query when includeAutoLearnedData is false
{
"query": {
"has_child": {
"type": "child",
"query": {
"bool": {
"must": [
{
"common": {
"content": {
"query": "commonterms"
}
}
}
],
"filter": [
{
"term": {
"content": {
"value": "term"
}
}
}
]
}
}
}
}
}
and when it's true
{
"query": {
"has_child": {
"type": "child",
"query": {
"common": {
"content": {
"query": "commonterms"
}
}
}
}
}
}
(I noticed that we are missing a section on conditionless queries in the latest documentation. Will add one!)

How to limit a query to one type using NEST (Elasticsearch)

In NEST 2.x, I wrote code to query data like below:
var query = new QueryContainer();
query = query && new TermQuery { Field = "catId", Value = catId };
query = query && new NumericRangeQuery { Field ="price", GreaterThan = 10 };
var request =new SearchRequest<Project>
{
From = 0,
Size = 100,
Query = query,
Sort = new List<ISort>
{
new SortField { Field = "field", Order = SortOrder.Descending },
...
},
Type?? //problem comes here, how to specify type??
}
var response = _client.Search<Project>(request);
There is more than one type in my index, and I want to query the data of just one type (like querying a single table in a database). I hoped the SearchRequest object initializer would have a "Type" parameter.
You can specify the indices and types in the constructor for SearchRequest<T>():
var catId = 1;
var query = new QueryContainer(new TermQuery { Field = "catId", Value = catId });
query = query && new NumericRangeQuery { Field = "price", GreaterThan = 10 };
var request = new SearchRequest<Project>("index-name", Types.Type(typeof(Project), typeof(AnotherProject)))
{
From = 0,
Size = 100,
Query = query,
Sort = new List<ISort>
{
new SortField { Field = "field", Order = Nest.SortOrder.Descending },
}
};
var response = client.Search<Project>(request);
This would generate the following query:
POST http://localhost:9200/index-name/project%2Canotherproject/_search?pretty=true
{
"from": 0,
"size": 100,
"sort": [
{
"field": {
"order": "desc"
}
}
],
"query": {
"bool": {
"must": [
{
"term": {
"catId": {
"value": 1
}
}
},
{
"range": {
"price": {
"gt": 10.0
}
}
}
]
}
}
}
