To contextualize:
A batch file has BatchItems.
Each BatchItem is a row and has a line number.
Each row is processed in order.
I'm new to NoSQL and MongoDB, and I'd like to know how to query the last processing step executed (the most recent EventType value) for each BatchItem (line number), filtered by BatchId.
For example, for BatchId "102030" it should return one result per line number, each with its most recent EventType.
I believe I can achieve this using the aggregation framework's $group stage, but I don't know how.
Thanks.
You can do it as below:
db.batch.aggregate([
  // keep only the documents of the requested batch
  { $match: { "BatchId": 102030 } },
  // newest events first, so the first element pushed per group is the latest
  { $sort: { "Date": -1 } },
  {
    $group: {
      _id: "$BatchItemId",
      "doc": { "$push": { lastEventName: "$EventType" } }
    }
  },
  // promote the newest event to the document root
  {
    $replaceRoot: {
      newRoot: { $arrayElemAt: ["$doc", 0] }
    }
  }
])
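If you later need the same thing from the C# driver (as some of the threads below do), here is a minimal sketch of an equivalent pipeline; note it swaps the $push / $replaceRoot pair for $first, which after the descending sort yields the same newest event per line number. The batch variable and the BsonDocument typing are assumptions:
// A sketch, assuming an IMongoCollection<BsonDocument> named batch.
var results = batch.Aggregate()
    .Match(new BsonDocument("BatchId", 102030))
    .Sort(new BsonDocument("Date", -1))      // newest event first
    .Group(new BsonDocument
    {
        { "_id", "$BatchItemId" },           // one group per line number
        { "lastEventName", new BsonDocument("$first", "$EventType") }
    })
    .ToList();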
My API returns the JSON below; it's the same employee, present twice in my database.
I can't delete it from the database, so how can I print only one of them when I make my GET in Angular?
Or, when I use *ngFor="let e of employee", how can I print only the first result?
getEmployee(id: string) {
  return this.http.get<any>(this.localUrlAPI + "/employee/getEmployee/" + id)
    .pipe(map((res: any) => {
      return res.Data;
    }))
}
{
  "Data": [
    {
      "IdEmployee": "1",
      "Name": "Jacob"
    },
    {
      "IdEmployee": "1",
      "Name": "Jacob"
    }
  ]
}
So you want to remove duplicates. There are many approaches; this is one:
.pipe(map((res: any) => {
  // key by IdEmployee (the field your payload actually has);
  // a later duplicate simply overwrites the earlier one
  const uniqueData = new Map<string, any>();
  for (const data of res.Data) {
    uniqueData.set(data.IdEmployee, data);
  }
  return Array.from(uniqueData.values());
}))
If you specifically want the first occurrence rather than the last, only call set when uniqueData.has(data.IdEmployee) is false.
I'm trying to roll up some of my "other" results using Elasticsearch. Ideally, I'd like my query to return the top N hits and then roll the rest of the data up into an (N+1)th hit titled "Other".
So for example, if I'm trying to aggregate "Institutions by Total Value", I'd get back 10 Institutions with the most value and then the total aggregated value of the other institutions as another record. The purpose is that I'd like to see the total value aggregated across all institutions but not have to list thousands.
An example search I've been using is:
GET my_index/institution/_search?pretty=true
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            ... terms queries ...
          ]
        }
      }
    }
  },
  "aggs": {
    "dimension_type_name_agg": {
      "terms": {
        "field": "institution_name",
        "order": {
          "metric_sum_total_value_agg": "desc"
        },
        "size": 0
      },
      "aggs": {
        "metric_sum_total_value_agg": {
          "sum": {
            "field": "total_value"
          }
        },
        "metric_count_account_id_agg": {
          "value_count": {
            "field": "institution_id"
          }
        }
      }
    }
  }
}
I'm curious whether this can be done by modifying a query like the one above. I'm using C# and NEST/Elasticsearch.NET, so any tips on how this translates to that side are appreciated as well.
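One approach, sketched rather than confirmed: a terms aggregation has no built-in "Other" bucket for a summed metric (its sum_other_doc_count only covers document counts), so you can request the top 10 buckets plus an overall sum and derive the "Other" row client-side by subtraction. In NEST that might look like the following; the NEST 6.x-style API and names such as by_institution and overall_value are assumptions, and the Order syntax varies between NEST versions:
// A sketch, assuming an IElasticClient named client (uses System.Linq).
var response = client.Search<object>(s => s
    .Index("my_index")
    .Size(0)                                    // no hits needed, only aggs
    .Aggregations(a => a
        .Terms("by_institution", t => t
            .Field("institution_name")
            .Size(10)                           // the top N buckets
            .Order(o => o.Descending("value_sum"))
            .Aggregations(ta => ta
                .Sum("value_sum", v => v.Field("total_value"))))
        .Sum("overall_value", v => v.Field("total_value"))));

var buckets = response.Aggregations.Terms("by_institution").Buckets;
var topTotal = buckets.Sum(b => b.Sum("value_sum").Value ?? 0);
var overall = response.Aggregations.Sum("overall_value").Value ?? 0;
var otherTotal = overall - topTotal;            // the aggregated "Other" row
The only part Elasticsearch can't hand you directly is the "Other" sum, hence the subtraction.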
I am trying to write a search query on an Elasticsearch index that will return results matching any part of the field value.
I have a Path field that contains values like C:\temp\ab-cd\abc.doc
I want the ability to send a query that will return any document matching the part of the value I typed.
QueryContainer currentQuery = new QueryStringQuery
{
    DefaultField = "Path",
    Query = string.Format("*{0}*", "abc"),
};
The above will return results, this will not:
QueryContainer currentQuery = new QueryStringQuery
{
    DefaultField = "Path",
    Query = string.Format("*{0}*", "ab-cd"),
};
The same goes for any other special character like #, $, %, ^, &, * and so on.
Is there some generic way to send a query and find exactly what I searched for?
Each of my fields is a multi-field, so I could use the *.raw sub-fields, but I don't know exactly how, or whether I should.
Use nGrams to split the text into smaller chunks and use a term query against them. Pro: it should be faster. Con: the index will be larger on disk because the nGram filter generates many more terms.
PUT /test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ngram_analyzer": {
          "tokenizer": "keyword",
          "filter": [
            "substring"
          ]
        }
      },
      "filter": {
        "substring": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 50
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "Path": {
          "type": "string",
          "index_analyzer": "my_ngram_analyzer",
          "search_analyzer": "keyword"
        }
      }
    }
  }
}
And the query:
GET /test/test/_search
{
  "query": {
    "term": {
      "Path": {
        "value": "\\temp"
      }
    }
  }
}
If you wish, you can use the config above as a sub-field for whatever mapping you already have.
If you want to use query_string there is one thing you need to be aware of: you need to escape special characters, for example -, \ and : (complete list here). Also, when indexing, the \ character needs escaping, otherwise it will raise an error. This is what I tested, especially with query_string: https://gist.github.com/astefan/a52fa4989bf5298102d1
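Since your queries are already built with NEST, the term query above might translate along these lines; a sketch in the same object-initializer style as your QueryStringQuery, with the verbatim string keeping the backslash literal:
// A sketch: term queries are not analyzed, so the value must exactly
// match one of the indexed nGram terms.
QueryContainer currentQuery = new TermQuery
{
    Field = "Path",
    Value = @"\temp"
};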
I have the following working aggregate statement in MongoDB:
db.Forms.aggregate([
    { $match: { Id: { $nin: db.Forms.distinct("Id", { "Status": "DELETE" }, { Id: 1 }) } } },
    { $group: { _id: "$Id", lastModifiedId: { $last: "$_id" } } }
]).result
The statement selects all Forms that do not have a DELETE record in the Forms collection and returns their last modified rows.
What I would like to know is: how do I write the aggregate statement above using the MongoDB C# driver? This is what I have so far:
var aggregate = _forms.Aggregate()
    //.Match(new BsonDocument { { "Id", new BsonDocument("$nin", "db.Forms.distinct(\"Id\", {\"Status\": \"DELETE\"}, {Id: 1})") } })
    .Group(new BsonDocument { { "_id", "$Id" }, { "lastModifiedId", new BsonDocument("$last", "$_id") } });
var aggregateResults = aggregate.ToListAsync().Result.ToList();
I just need to get the Match statement sorted out.
I've looked at a few articles about aggregation and I understand the concept (I think), but the "db.Forms.distinct" is where I'm having issues and none of the articles that I found handled this scenario.
If there's a better way of doing this, please let me know.
Thank you!
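For what it's worth, here is a sketch of one way to finish that Match (assuming _forms is an IMongoCollection<BsonDocument>). The shell evaluates db.Forms.distinct(...) client-side before sending the pipeline, so running Distinct first from C# and passing the resulting list to $nin is equivalent:
// Sketch: evaluate the distinct list first (as the shell does),
// then embed it in the $match stage via $nin.
var deletedIds = _forms
    .Distinct<BsonValue>("Id", new BsonDocument("Status", "DELETE"))
    .ToList();

var aggregate = _forms.Aggregate()
    .Match(new BsonDocument("Id", new BsonDocument("$nin", new BsonArray(deletedIds))))
    .Group(new BsonDocument { { "_id", "$Id" }, { "lastModifiedId", new BsonDocument("$last", "$_id") } });
var aggregateResults = aggregate.ToList();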
I am using the C# driver, but I would be happy about pointers in any language.
My documents have the following structure:
class Document
{
    List<Comment> comments;
}
Or in Json:
[{
"comments" : [{"comment" : "text1"}, {"comment" : "text2"}, ...]
},
{
"comments" : [{"comment" : "text1"}, {"comment" : "text2"}, ...]
}, ...]
As you can see, each document contains a list of comments.
My goal is to run a periodic task that truncates the list of comments of each document to a specific number of elements (e.g. 10).
The obvious way that comes to my mind is to:
Fetch each document
Get the comments that should be removed
Update the document by its id, pulling the ids of the comments that should be removed
Is there a possibility to do this with a bulk Update?
I couldn't think of a condition for the update that would allow me to truncate the number of comments without fetching them first.
You can slice the elements of the comments array to the last n elements (-10 in the example below). Try this in the shell:
db.coll.update(
    { },
    { $push: { comments: { $each: [], $slice: -10 } } },
    { multi: true }
)
Since MongoDB 2.6 you can also use a positive n to update the array to contain only the first n elements.
In case you have a field you want to sort on before applying the slice operation:
db.coll.update(
    { },
    {
        $push: {
            comments: {
                $each: [],
                $sort: { <field_to_sort_on>: 1 },
                $slice: -10
            }
        }
    },
    { multi: true }
)
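Since the question mentions the C# driver: the same update can be expressed with PushEach, which exposes the slice option. A minimal sketch, assuming an IMongoCollection<BsonDocument> named coll:
// An empty $each pushes nothing; slice: -10 trims the comments
// array to its last 10 elements on every matched document.
var update = Builders<BsonDocument>.Update.PushEach(
    "comments",
    new BsonDocument[0],   // $each: []
    slice: -10);

await coll.UpdateManyAsync(Builders<BsonDocument>.Filter.Empty, update);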