Get Elasticsearch result from a NEST C# nested aggregation

I have this Elasticsearch NEST query:
var res = elastic.Search<SegmentRecord>(s => s.Index(esIndex).Aggregations(a => a.Terms("agg", x => x.Field(o => o.InstrumentName).Aggregations(a1 => a1.Terms("agg2", f => f.Field(y => y.GroupId))))));
How can I cycle through all the InstrumentName buckets and, for each of those, cycle through all the GroupId buckets?

On NEST 5.4.0:
foreach (var bucket in res.Aggs.Terms("agg").Buckets)
{
foreach (var innerBucket in bucket.Terms("agg2").Buckets)
{
System.Console.WriteLine($"agg:{bucket.Key}, agg2:{innerBucket.Key} - {innerBucket.DocCount}");
}
}

This is how I accessed my child buckets with nested aggregations:
var yourAgg = result.Aggregations.Terms("YourParentField");
foreach (var child in yourAgg.Buckets)
{
var aggs = child.Terms("YourChildField").Buckets;
foreach (var item in aggs)
{
perDealerAggItems.Add(new AggregateItem()
{
Count = item.DocCount ?? 0,
Key = item.Key,
ParentList = field
});
}
}

Related

How do I pick out values between a duplicate value in a collection?

I have a method that returns a collection that has a duplicate value.
static List<string> GenerateItems()
{
var _items = new List<string>();
_items.Add("Tase");
_items.Add("Ray");
_items.Add("Jay");
_items.Add("Bay");
_items.Add("Tase");
_items.Add("Man");
_items.Add("Ran");
_items.Add("Ban");
return _items;
}
I want to search through that collection and find the first place that duplicate value is located and start collecting all the values from the first appearance of the duplicate value to its next appearance. I want to put this in a collection but I only want the duplicate value to appear once in that collection.
This is what I have so far:
static void Main(string[] args)
{
string key = "Tase";
var collection = GenerateItems();
int index = collection.FindIndex(a => a == key);
var matchFound = false;
var itemsBetweenKey = new List<string>();
foreach (var item in collection)
{
if (item == key)
{
matchFound = !matchFound;
}
if (matchFound)
{
itemsBetweenKey.Add(item);
}
}
foreach (var item in itemsBetweenKey)
{
Console.WriteLine(item);
}
Console.ReadLine();
}
There must be an easier way of doing this. Perhaps with Indexing or a LINQ query?
You can do something like this:
string key = "Tase";
var collection = GenerateItems();
int indexStart = collection.FindIndex(a => a == key);
int indexEnd = collection.FindIndex(indexStart+1, a => a == key);
var result = collection.GetRange(indexStart, indexEnd-indexStart);
You can use LINQ Select and GroupBy to find the first and last index of every duplicate. (Keep in mind that if a value is in the list more than twice, the occurrences in the middle are ignored.)
But I personally think the LINQ for this is overcomplicated. I would stick with simple for loops and if statements (just turn it into a method so it reads better).
Here is a solution with LINQ that gets every duplicated value and all the values between its occurrences, including the duplicate itself once, as you mentioned:
var collection = GenerateItems();
var Duplicates = collection.Select((x, index) => new { index, value = x })
    .GroupBy(x => x.value) // group by the strings
    .Where(x => x.Count() > 1) // only take duplicates
    .Select(x => new {
        Value = x.Key,
        FirstIndex = x.Min(y => y.index), // take first occurrence
        LastIndex = x.Max(y => y.index) // take last occurrence
    }).ToList();
var resultlist = new List<List<string>>();
foreach (var duplicaterange in Duplicates)
    resultlist.Add(collection.GetRange(duplicaterange.FirstIndex, duplicaterange.LastIndex - duplicaterange.FirstIndex));
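Try this function:
public List<string> PickOut(List<string> collection, string key)
{
    // Find the first occurrence of the key
    var index = collection.IndexOf(key);
    if (index < 0)
        return null;
    // Keep the key once, then take the items up to (not including) its next occurrence
    var result = new List<string> { key };
    result.AddRange(collection.Skip(index + 1).TakeWhile(x => x != key));
    return result;
}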
First find the duplicated key, then find its first and second occurrences, and take the items between them:
var firstduplicate = collection.GroupBy(x => x)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key).First();
// Index of the first occurrence and of the one that follows it
var firstIndex = collection.IndexOf(firstduplicate);
var secondIndex = collection.IndexOf(firstduplicate, firstIndex + 1);
// Includes the first occurrence, excludes the second
var result = collection.GetRange(firstIndex, secondIndex - firstIndex);
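The index-based answers above can be wrapped into one small helper. This is only a sketch: the `DuplicateRange` class and `ItemsBetween` name are made up for illustration, and it assumes that returning an empty list is acceptable when the key does not occur at least twice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answers.
public static class DuplicateRange
{
    // Returns the items from the first occurrence of key up to, but not
    // including, its next occurrence. Returns an empty list when the key
    // does not occur at least twice.
    public static List<string> ItemsBetween(List<string> items, string key)
    {
        int first = items.IndexOf(key);
        if (first < 0)
            return new List<string>();

        // Start searching just after the first occurrence
        int second = items.IndexOf(key, first + 1);
        if (second < 0)
            return new List<string>();

        return items.GetRange(first, second - first);
    }
}
```

With the list from GenerateItems(), DuplicateRange.ItemsBetween(collection, "Tase") would return Tase, Ray, Jay, Bay.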

Group list of strings with common prefixes

Suppose I have a list of strings [city01, city01002, state02, state03, city04, statebg, countryqw, countrypo]
How do I group them in a dictionary of <string, List<string>> like
city - [city01, city04, city01002]
state - [state02, state03, statebg]
country - [countryqw, countrypo]
If not code, can anyone please help with how to approach or proceed?
As shown in other answers, you can use the GroupBy method from LINQ to create this grouping based on any condition you want. Before you can group your strings, you need to know the condition that decides which group a string belongs to. It could be that it starts with one of a set of predefined prefixes, that strings are grouped by what comes before the first digit, or any other condition you can describe in code.
In my code example, GroupBy calls another method for every string in your list. In that method you can place whatever logic you need, returning the key to group the given string under. You can test this example online with dotnetfiddle: https://dotnetfiddle.net/UHNXvZ
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
List<string> ungroupedList = new List<string>() {"city01", "city01002", "state02", "state03", "city04", "statebg", "countryqw", "countrypo", "theFirstTown"};
var groupedStrings = ungroupedList.GroupBy(x => groupingCondition(x));
foreach (var a in groupedStrings) {
Console.WriteLine("key: " + a.Key);
foreach (var b in a) {
Console.WriteLine("value: " + b);
}
}
}
public static string groupingCondition(String s) {
if(s.StartsWith("city") || s.EndsWith("Town"))
return "city";
if(s.StartsWith("country"))
return "country";
if(s.StartsWith("state"))
return "state";
return "unknown";
}
}
You can use LINQ:
var input = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
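// The key is the leading non-digit characters, cut to 4 characters,
// so the resulting keys are "city", "stat" and "coun"
var output = input.GroupBy(c => string.Join("", c.TakeWhile(d => !char.IsDigit(d))
    .Take(4))).ToDictionary(c => c.Key, c => c.ToList());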
I suppose you have a list of prefixes that you want to look for in the list:
var list = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
var tofound = new List<string>() { "city", "state", "country" }; // prefixes to look for
var result = new Dictionary<string, List<string>>();
foreach (var f in tofound)
{
result.Add(f, list.FindAll(x => x.StartsWith(f)));
}
The result is the dictionary you want. If no values are found for a prefix, the value for that key is an empty list.
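A single-pass variant of the same idea, as a sketch: it assumes the prefixes are known up front (as in the answer above) and, unlike the FindAll loop, keeps strings that match no prefix under an "unknown" key. The `PrefixGrouping` class and `GroupByPrefix` name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class PrefixGrouping
{
    // Groups each item under the first known prefix it starts with.
    // Items matching no prefix are collected under "unknown".
    public static Dictionary<string, List<string>> GroupByPrefix(
        IEnumerable<string> items, IReadOnlyList<string> prefixes)
    {
        return items
            .GroupBy(item => prefixes.FirstOrDefault(
                p => item.StartsWith(p, StringComparison.Ordinal)) ?? "unknown")
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

Called as PrefixGrouping.GroupByPrefix(list, tofound), this keeps the items of each group in their original order. If one prefix is itself a prefix of another, the order of the prefix list decides which one wins.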
Warning: This answer has a combinatorial expansion and will fail if your original string set is large. For 65 words I gave up after running for a couple of hours.
Using some IEnumerable extension methods to find Distinct sets and to find all possible combinations of sets, you can generate a group of prefixes and then group the original strings by these.
public static class IEnumerableExt {
public static bool IsDistinct<T>(this IEnumerable<T> items) {
var hs = new HashSet<T>();
foreach (var item in items)
if (!hs.Add(item))
return false;
return true;
}
public static bool IsEmpty<T>(this IEnumerable<T> items) => !items.Any();
public static IEnumerable<IEnumerable<T>> AllCombinations<T>(this IEnumerable<T> start) {
IEnumerable<IEnumerable<T>> HelperCombinations(IEnumerable<T> items) {
if (items.IsEmpty())
yield return items;
else {
var head = items.First();
var tail = items.Skip(1);
foreach (var sequence in HelperCombinations(tail)) {
yield return sequence; // Without first
yield return sequence.Prepend(head);
}
}
}
return HelperCombinations(start).Skip(1); // don't return the empty set
}
}
var keys = Enumerable.Range(0, src.Count - 1)
.SelectMany(n1 => Enumerable.Range(n1 + 1, src.Count - n1 - 1).Select(n2 => new { n1, n2 }))
.Select(n1n2 => new { s1 = src[n1n2.n1], s2 = src[n1n2.n2], Dist = src[n1n2.n1].TakeWhile((ch, n) => n < src[n1n2.n2].Length && ch == src[n1n2.n2][n]).Count() })
.SelectMany(s1s2d => new[] { new { s = s1s2d.s1, s1s2d.Dist }, new { s = s1s2d.s2, s1s2d.Dist } })
.Where(sd => sd.Dist > 0)
.GroupBy(sd => sd.s.Substring(0, sd.Dist))
.Select(sdg => sdg.Distinct())
.AllCombinations()
.Where(sdgc => sdgc.Sum(sdg => sdg.Count()) == src.Count)
.Where(sdgc => sdgc.SelectMany(sdg => sdg.Select(sd => sd.s)).IsDistinct())
.OrderByDescending(sdgc => sdgc.Sum(sdg => sdg.First().Dist)).First()
.Select(sdg => sdg.First())
.Select(sd => sd.s.Substring(0, sd.Dist))
.ToList();
var groups = src.GroupBy(s => keys.First(k => s.StartsWith(k)));

How to intersect results after GroupBy

To illustrate my problem I have created this simple snippet. I have a class Item
public class Item
{
public int GroupID { get; set; }
public int StrategyID { get; set; }
public List<Item> SeedData()
{
return new List<Item>
{
new Item {GroupID = 1, StrategyID = 1 },
new Item {GroupID = 2, StrategyID = 1 },
new Item {GroupID = 3, StrategyID = 2 },
new Item {GroupID = 4, StrategyID = 2 },
new Item {GroupID = 5, StrategyID = 3 },
new Item {GroupID = 1, StrategyID = 3 },
};
}
}
And what I want to check is that this SeedData method is not returning any duplicated GroupID/StrategyID pairs.
So in my Main method I have this:
Item item = new Item();
var data = item.SeedData();
var groupByStrategyIdData = data.GroupBy(g => g.StrategyID).Select(v => v.Select(gr => gr.GroupID)).ToList();
for (var i = 0; i < groupByStrategyIdData.Count; i++)
{
for (var j = i + 1; j < groupByStrategyIdData.Count; j++)
{
Console.WriteLine(groupByStrategyIdData[i].Intersect(groupByStrategyIdData[j]).Any());
}
}
This works, but I have lost the StrategyID, so in my real-case scenario I won't be able to say for which StrategyID/GroupID pair I have a duplication. Is it possible to cut the LINQ query off here:
var groupByStrategyIdData = data.GroupBy(g => g.StrategyID)
and somehow perform the check on this result?
One of the very easy ways would be to do grouping using some identity for your Item. You can override Equals/GetHashCode for your Item or instead write something like:
Item item = new Item();
var data = item.SeedData();
var duplicates = data.GroupBy(x => string.Format("{0}-{1}", x.GroupID, x.StrategyID))
.Where(group => group.Count() > 1)
.Select(group => group.Key)
.ToList();
Please note that using a string as the identity inside GroupBy is probably not the best way to do grouping.
As for your question about "cutting" the query, you should also be able to do the following:
var groupQuery = data.GroupBy(g => g.StrategyID);
var groupList = groupQuery.Select(grp => grp.ToList()).ToList();
var groupByStrategyIdData = groupQuery.Select(v => v.Select(gr => gr.GroupID)).ToList();
You may be able to do it another way, as follows:
// Check for duplicates
if (data != null)
{
var grp =
data.GroupBy(
g =>
new
{
g.GroupID,
g.StrategyID
},
(key, group) => new
{
GroupID = key.GroupID,
StrategyId = key.StrategyID,
Count = group.Count()
});
if (grp.Any(c => c.Count > 1))
{
Console.WriteLine("Duplicate exists");
// inside the grp object, you can find which GroupID/StrategyID combo have a count > 1
}
}
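The answers above look for exact GroupID/StrategyID pairs that occur more than once. The asker's Intersect loop can also be read as asking which GroupID values appear under more than one StrategyID; under that reading, a sketch that keeps the StrategyID values could look like this. The `Item` class here is a trimmed copy of the one in the question, and `DuplicateCheck` and `GroupsSharedAcrossStrategies` are made-up names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed copy of the question's Item class (SeedData omitted).
public class Item
{
    public int GroupID { get; set; }
    public int StrategyID { get; set; }
}

// Hypothetical helper, not part of the original answers.
public static class DuplicateCheck
{
    // Maps each GroupID that occurs under more than one StrategyID
    // to the list of distinct StrategyIDs it occurs under.
    public static Dictionary<int, List<int>> GroupsSharedAcrossStrategies(IEnumerable<Item> data)
    {
        return data
            .GroupBy(i => i.GroupID, i => i.StrategyID)
            .Select(g => new { GroupID = g.Key, Strategies = g.Distinct().ToList() })
            .Where(x => x.Strategies.Count > 1)
            .ToDictionary(x => x.GroupID, x => x.Strategies);
    }
}
```

For the seed data in the question, the result has a single entry: GroupID 1, found under StrategyID 1 and 3.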

c# loop on group by key

I have made a group by statement on a datatable like this:
var finalResult = (from r in result.AsEnumerable()
group r by new
{
r.Agent,
r.Reason
} into grp
select new
{
Agent = grp.Key.Agent,
Reason = grp.Key.Reason,
Count = grp.Count()
}).ToList();
The finalResult will be like this:
agent1 reason1 4
agent1 reason2 7
agent2 reason1 8
agent2 reason2 3
..
...
...
agentn reason1 3
agentn reason2 11
I want to loop over the agent names in order to get the reasons, and the count for each reason, for each agent.
Can you please tell me how to loop over the agent names from the finalResult variable?
You need one more GroupBy and you are done:
var solution =
finalResult
.GroupBy(x => x.Agent);
foreach (var group in solution)
{
// group.Key is the agent
// All items in group are a sequence of reasons and counts for this agent
foreach (var item in group)
{
// Item has <Agent, Reason, Count> and belongs to the agent from group.Key
}
}
The outer loop goes over all the agents (Agent1, Agent2, etc.), while the inner loop goes through all the reasons for the current agent.
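If the grouped result is needed for lookups rather than just a loop, the same second GroupBy can feed a nested dictionary. This is a sketch under assumptions: the rows are modelled as value tuples instead of the anonymous type from the question, each agent/reason pair is assumed to occur only once (which holds after the first group by), and `AgentReport` and `ByAgent` are made-up names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class AgentReport
{
    // Builds agent -> (reason -> count). ToDictionary throws if the same
    // reason appears twice for one agent, so the input must already be
    // grouped by agent and reason.
    public static Dictionary<string, Dictionary<string, int>> ByAgent(
        IEnumerable<(string Agent, string Reason, int Count)> rows)
    {
        return rows
            .GroupBy(r => r.Agent)
            .ToDictionary(
                g => g.Key,
                g => g.ToDictionary(r => r.Reason, r => r.Count));
    }
}
```

With the sample rows from the question, ByAgent(rows)["agent1"]["reason2"] would be 7.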
You might want to try GroupBy in LINQ.
Perhaps:
var agentGroups = finalResult
.GroupBy(x => x.Agent)
.Select(ag => new
{
Agent = ag.Key,
ReasonCounts = ag.GroupBy(x => x.Reason)
.Select(g => new
{
Agent = ag.Key,
Reason = g.Key,
Count = g.Sum(x => x.Count)
}).ToList(),
Total_Count = ag.Sum(x => x.Count)
});
foreach (var agentGroup in agentGroups)
{
string agent = agentGroup.Agent;
int totalCount = agentGroup.Total_Count;
foreach (var reasonCount in agentGroup.ReasonCounts)
{
string reason = reasonCount.Reason;
int count = reasonCount.Count;
}
}

Getting the sum of a column based on column name

I need to get the sum of columns based on the name of the column. Currently, I'm using an IF ELSE block to take care of it, but I'm hoping there is a more automatic method for getting this sort of thing done.
What works:
foreach (var day in bydates)
{
var bymile_bydate = bymile.Where(x => x.Date == day).ToList();
foreach (var r in results)
{
var name = r.name;
if (name.Equals("TotalIssues"))
{
r.data.Add(bymile_bydate.Sum(x => x.TotalIssues).Value);
}
else if (name.Equals("TotalCritical"))
{
r.data.Add(bymile_bydate.Sum(x => x.TotalCritical).Value);
}
}
}
How I'd like to get it working:
foreach (var day in bydates)
{
var bymile_bydate = bymile.Where(x => x.Date == day).ToList();
foreach (var r in results)
{
r.data.Add(bymile_bydate.Sum(x=> x.(r.name)).Value);
}
}
Is there any way of doing this?
C# doesn't have good support for accessing members whose names aren't known at compile time. What I'll often do in this situation is have a dictionary of names to delegates that return the properties:
// assuming your objects are of type ClassX and your properties are decimals
static Dictionary<string, Func<ClassX, decimal>> PropertyLookup =
new Dictionary<string, Func<ClassX, decimal>>
{ { "TotalIssues", x => x.TotalIssues },
{ "TotalCritical", x => x.TotalCritical },
};
foreach (var day in bydates)
{
var bymile_bydate = bymile.Where(x => x.Date == day).ToList();
foreach (var r in results)
{
var name = r.name;
// With non-nullable decimal properties, Sum returns a decimal, so no .Value is needed
r.data.Add(bymile_bydate.Sum(PropertyLookup[name]));
}
}
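The question's code calls .Value on the sum, which suggests the properties are nullable. A variant of the dictionary-of-delegates idea for that case might look like the sketch below. `MileRow` is a hypothetical stand-in for the row type (the post never shows the real class), and `ColumnSums` and `SumColumn` are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; the original post does not show the real class.
public class MileRow
{
    public DateTime Date { get; set; }
    public int? TotalIssues { get; set; }
    public int? TotalCritical { get; set; }
}

// Hypothetical helper, not part of the original answers.
public static class ColumnSums
{
    // Maps a column name to a delegate that reads that property.
    private static readonly Dictionary<string, Func<MileRow, int?>> Selectors =
        new Dictionary<string, Func<MileRow, int?>>
        {
            { "TotalIssues", x => x.TotalIssues },
            { "TotalCritical", x => x.TotalCritical },
        };

    public static int SumColumn(IEnumerable<MileRow> rows, string columnName)
    {
        if (!Selectors.TryGetValue(columnName, out var selector))
            throw new ArgumentException("Unknown column: " + columnName, nameof(columnName));

        // Sum over int? skips null values; ?? 0 converts the int? result to int
        return rows.Sum(selector) ?? 0;
    }
}
```

Inside the inner loop this would replace the if/else chain with something like r.data.Add(ColumnSums.SumColumn(bymile_bydate, r.name)). An unknown column name fails loudly instead of being skipped silently.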
If you don't want to have to define the property names ahead of time, you can use reflection to get the delegates. Here's an implementation that caches the delegates in a dictionary like the previous solution:
// assuming your objects are of type ClassX and your properties are decimals
static Dictionary<string, Func<ClassX, decimal>> PropertyLookup =
new Dictionary<string, Func<ClassX, decimal>>();
foreach (var day in bydates)
{
var bymile_bydate = bymile.Where(x => x.Date == day).ToList();
foreach (var r in results)
{
var name = r.name;
if (!PropertyLookup.ContainsKey(name))
PropertyLookup[name] = (Func<ClassX, decimal>)Delegate.CreateDelegate(
typeof(Func<ClassX, decimal>),
typeof(ClassX).GetProperty(name).GetGetMethod());
// With non-nullable decimal properties, Sum returns a decimal, so no .Value is needed
r.data.Add(bymile_bydate.Sum(PropertyLookup[name]));
}
}
Without reflection, you can't refer to the property name by a string value. And reflection would be a poor choice here. I'd recommend creating a property on the class that contains your totals that does the following:
public int? RelevantTotal
{
get
{
switch (this.name)
{
case "TotalIssues":
return this.TotalIssues;
case "TotalCritical":
return this.TotalCritical;
default:
return 0;
}
}
}
Then, when you loop through the results, call that property instead:
foreach (var day in bydates)
{
var bymile_bydate = bymile.Where(x => x.Date == day).ToList();
foreach (var r in results)
{
r.data.Add(bymile_bydate.Sum(x => r.RelevantTotal).Value);
}
}
