Facet search using Nest (Elasticsearch) - c#

I am using NEST (ver 0.12) with Elasticsearch (ver 1.0) and I'm facing a problem with facets.
Basically, I'm expecting results similar to this:
Between 18 and 25 (10)
Between 26 and 35 (80)
Greater then 35 (10)
But what I'm actually getting is this
between (99)
and (99)
35 (99)
26 (99)
This is my code
namespace ElasticSerachTest
{
class Program
{
static void Main(string[] args)
{
var setting = new ConnectionSettings(new Uri("http://localhost:9200/"));
setting.SetDefaultIndex("customertest");
var client = new ElasticClient(setting);
var createIndexResult = client.CreateIndex("customertest", new IndexSettings
{
});
// put documents to the index using bulk
var customers = new List<BulkParameters<Customer>>();
for (int i = 1; i < 100; i++)
{
customers.Add(new BulkParameters<Customer>(new Customer
{
Id = i,
AgeBand = GetAgeBand(),
Name = string.Format("Customer {0}", i)
}));
}
var bp = new SimpleBulkParameters()
{
Replication = Replication.Async,
Refresh = true
};
//first delete
client.DeleteMany(customers, bp);
//now bulk insert
client.IndexMany(customers, bp);
// query with facet on nested field property genres.genreTitle
var queryResults = client.Search<Customer>(x => x
.From(0)
.Size(10)
.MatchAll()
.FacetTerm(t => t
.OnField(f => f.AgeBand)
.Size(30))
);
var yearFacetItems = queryResults.FacetItems<FacetItem>(p => p.AgeBand);
foreach (var item in yearFacetItems)
{
var termItem = item as TermItem;
Console.WriteLine(string.Format("{0} ({1})", termItem.Term, termItem.Count));
}
Console.ReadLine();
}
public static string GetAgeBand()
{
Random rnd = new Random();
var age = rnd.Next(18, 50);
if (Between(age, 18, 25))
{
return "Between 18 and 25";
}
else if (Between(age, 26, 35))
{
return "Between 26 and 35";
}
return "Greater then 35";
}
public static bool Between(int num, int lower, int upper)
{
return lower <= num && num <= upper;
}
[ElasticType(Name = "Customer", IdProperty = "id")]
public class Customer
{
public int Id
{
get;
set;
}
public string Name
{
get;
set;
}
[ElasticProperty(Index = FieldIndexOption.not_analyzed)]
public string AgeBand
{
get;
set;
}
}
}
}
Thanks

Based on the output you are seeing, I do not think your FieldIndexOption.not_analyzed is being applied to the AgeBand field, as those facet results look like analyzed values. You need to apply the mapping during index creation, as part of your index settings. Please try the following index creation code:
var createIndexResult = client.CreateIndex("customertest", s => s
.AddMapping<Customer>(m => m
.MapFromAttributes()
)
);
Please see the Nest Documentation on Mapping for some other ways to add the mapping to your index as well.
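To see why an analyzed field produces facet terms such as "between" and "26", here is a minimal sketch that imitates the effect in plain C#. It does not call NEST or Elasticsearch; the Analyze helper is a hypothetical stand-in that only lowercases and splits on spaces, which is an assumption and not the real analyzer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FacetSketch
{
    // Hypothetical stand-in for an analyzed string field: lowercase, split on spaces.
    // This only approximates what a real analyzer does.
    public static IEnumerable<string> Analyze(string value)
    {
        return value.ToLowerInvariant()
                    .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Counts documents per term, similar in spirit to a terms facet.
    // analyzed == false keeps each value as a single term (like not_analyzed).
    public static Dictionary<string, int> TermCounts(IEnumerable<string> values, bool analyzed)
    {
        return values
            .SelectMany(v => analyzed ? Analyze(v).Distinct() : new[] { v })
            .GroupBy(term => term)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var ageBands = new[] { "Between 18 and 25", "Between 26 and 35", "Between 26 and 35" };

        foreach (var pair in TermCounts(ageBands, analyzed: true))
            Console.WriteLine("analyzed: {0} ({1})", pair.Key, pair.Value);

        foreach (var pair in TermCounts(ageBands, analyzed: false))
            Console.WriteLine("not_analyzed: {0} ({1})", pair.Key, pair.Value);
    }
}
```

With the analyzed variant every word becomes its own term, so "between" and "and" are counted once per document; with the unanalyzed variant each whole age band is a single term, which is the shape the question expects.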

Related

How to dynamically group a list and select values using linq query in C#?

I have an input list test
class Tracker
{
public string Toolid {get;set;}
public string Description {get;set;}
public int length {get;set;}
public int breadth {get;set;}
public int height {get;set;}
}
List<Tracker> test = new List<Tracker>();
test.Add( new Tracker {Toolid="A.1",Description ="ABC",length = 10, breadth =10,height = 50});
test.Add( new Tracker {Toolid="A.1",Description ="ABC", length = 10, breadth =10,height = 50});
test.Add( new Tracker {Toolid="C.1",Description ="ABCD", length = 10, breadth =10,height = 50});
test.Add( new Tracker {Toolid="D.1",Description ="Admin123", length = 10, breadth =10,height = 50});
This list contains more values, such as weight, colour, etc.
For clarity, I have included only 5 member variables in the class Tracker.
I need to group the list test based on the values of another list (grpList).
This list (grpList) is dynamic, so the number of parameters and values in it may change.
So I need a dynamic GroupBy on the list using a LINQ query.
Case 1: sometimes the list grpList contains 2 values.
List <string> grpList = new List<string>();
grpList.Add(ToolId);
grpList.Add(Description);
If so, I have to group the list test by ToolId and Description.
Case 2: if grpList contains N values, I have to group the list test by those N values.
The number of values in grpList varies, and I have to group the main list test using the values in grpList. If grpList contains 2 values, group the test list by 2 values; if grpList contains 5 values, group the test list by 5 values.
NB: I need to group the list test (the main list).
The grpList values are used only for grouping.
Try reflection:
List<string> grpList = new List<string>();
grpList.Add("Toolid");
grpList.Add("Description");
var groups = new Dictionary<string, IEnumerable>();
var all_properties = typeof(Tracker).GetProperties();
foreach ( var prop_name in grpList )
{
var prop = all_properties.First( x => x.Name == prop_name);
var group = test.GroupBy( x => prop.GetValue( x ) );
groups.Add( prop_name, group );
}
If you want SQL-like nested grouping, apply GroupBy to the resulting groups:
var groups = new List<List<Tracker>>() { test };
foreach ( var prop_name in grpList )
{
var prop = all_properties.First( x => x.Name == prop_name);
var newgroups = new List<List<Tracker>>();
foreach ( var group in groups)
{
var subgroups = group.GroupBy( x => prop.GetValue( x ) );
newgroups.AddRange( subgroups.Select(g => g.ToList()).ToList() );
}
groups = newgroups;
}
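Another way to get the same nested grouping in a single pass is to build one composite key per item from the named properties. The sketch below is only an illustration: the GroupByProperties helper is hypothetical, the Tracker class is trimmed to three properties, and the '|' separator assumes property values never contain that character.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Tracker
{
    public string Toolid { get; set; }
    public string Description { get; set; }
    public int length { get; set; }
}

public static class DynamicGrouping
{
    // Groups items by the combined values of the named properties.
    // The key is the property values joined with '|' (assumed not to occur in the data).
    public static List<IGrouping<string, T>> GroupByProperties<T>(
        IEnumerable<T> items, IEnumerable<string> propertyNames)
    {
        var props = propertyNames
            .Select(name => typeof(T).GetProperty(name))
            .ToList();
        if (props.Any(p => p == null))
            throw new ArgumentException("Unknown property name in grouping list.");

        return items
            .GroupBy(item => string.Join("|",
                props.Select(p => Convert.ToString(p.GetValue(item, null)))))
            .ToList();
    }

    public static void Main()
    {
        var test = new List<Tracker>
        {
            new Tracker { Toolid = "A.1", Description = "ABC", length = 10 },
            new Tracker { Toolid = "A.1", Description = "ABC", length = 10 },
            new Tracker { Toolid = "C.1", Description = "ABCD", length = 10 },
            new Tracker { Toolid = "D.1", Description = "Admin123", length = 10 }
        };
        var grpList = new List<string> { "Toolid", "Description" };

        foreach (var group in GroupByProperties(test, grpList))
            Console.WriteLine("{0}: {1} item(s)", group.Key, group.Count());
    }
}
```

Because the key is rebuilt through reflection for every item, this is convenient but not fast; a compiled expression avoids the per-item reflection cost.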
I used the key selector of the Enumerable.GroupBy method.
Here's how I generated the predicate and the solution seems to work.
public class Tracker
{
public string Toolid { get; set; }
public string Description { get; set; }
public int length { get; set; }
public int breadth { get; set; }
public int height { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<Tracker> test = new List<Tracker>();
test.Add(new Tracker { Toolid = "A.1", Description = "ABC", length = 50, breadth = 10, height = 50 });
test.Add(new Tracker { Toolid = "A.1", Description = "ABC", length = 20, breadth = 10, height = 50 });
test.Add(new Tracker { Toolid = "C.1", Description = "LMN", length = 10, breadth = 10, height = 50 });
test.Add(new Tracker { Toolid = "D.1", Description = "Admin123", length = 7, breadth = 10, height = 50 });
List<string> grpList = new List<string>();
grpList.Add("length");
grpList.Add("Description");
var sourceParm = Expression.Parameter(typeof(Tracker), "x");
List<Expression> propertyExpressions = new List<Expression>();
foreach (var f in grpList.ToArray())
{
Expression conv = Expression.Convert(Expression.Property(sourceParm, f), typeof(object));
propertyExpressions.Add(conv);
}
var concatMethod = typeof(string).GetMethod(
"Concat",
new[] { typeof(object), typeof(object), typeof(object) });
Expression body = propertyExpressions.Aggregate((x, y) => Expression.Call(concatMethod,
x,
Expression.Constant(","),
y));
var groupSelector = Expression.Lambda<Func<Tracker, string>>(body, sourceParm);
var j = test.GroupBy(groupSelector.Compile());
}
}

Get the index of item in list based on value

The scenario is for a football league table. I can order the list by match win percentage and then by goals scored to determine their position in the league. I then use this ordering to get teams position in the league table using the IndexOf function.
this.results = this.results.OrderByDescending(x => x.WinPercentage).ThenByDescending(x => x.Goals);
this.results.Foreach(x => x.Position = this.results.IndexOf(x));
The problem arises when two teams (should be joint #1) have the same match win percentage and goals scored but when getting the index one team will be assigned #1 and the other #2.
Is there a way to get the correct position?
var position = 1;
var last = results.First();
foreach(var team in results)
{
if (team.WinPercentage != last.WinPercentage || team.Goals != last.Goals)
++position;
team.Position = position;
last = team;
}
What you could do is group the items based on the win percentage and goals (if both are the same, the teams will be in the same group), then apply the same position number to every element in the same group:
this.results = this.results.OrderByDescending(x => x.WinPercentage).ThenByDescending(x => x.Goals);
var positionGroups = this.results.GroupBy(x => new { WinPercentage = x.WinPercentage, Goals = x.Goals });
int position = 1;
foreach (var positionGroup in positionGroups)
{
foreach (var team in positionGroup)
{
team.Position = position;
}
position++;
}
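The grouping approach above gives tied teams the same position and the next group the following number (1, 1, 2). If the league should instead skip positions after a tie (1, 1, 3), which is an assumption about the desired table and not something the question states, the counter can advance by the size of each group. A minimal sketch, with a hypothetical Team class standing in for the question's result type:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's result type.
public class Team
{
    public string Name { get; set; }
    public double WinPercentage { get; set; }
    public int Goals { get; set; }
    public int Position { get; set; }
}

public static class LeagueRanking
{
    // Tied teams share a position; the next position skips ahead
    // by the number of tied teams (1, 1, 3, ...).
    public static List<Team> Rank(IEnumerable<Team> teams)
    {
        var ordered = teams
            .OrderByDescending(t => t.WinPercentage)
            .ThenByDescending(t => t.Goals)
            .ToList();

        int position = 1;
        foreach (var group in ordered.GroupBy(t => new { t.WinPercentage, t.Goals }))
        {
            int size = 0;
            foreach (var team in group)
            {
                team.Position = position;
                size++;
            }
            position += size;
        }
        return ordered;
    }

    public static void Main()
    {
        var teams = new List<Team>
        {
            new Team { Name = "A", WinPercentage = 80, Goals = 30 },
            new Team { Name = "B", WinPercentage = 80, Goals = 30 },
            new Team { Name = "C", WinPercentage = 60, Goals = 25 }
        };

        foreach (var team in Rank(teams))
            Console.WriteLine("{0}: #{1}", team.Name, team.Position);
    }
}
```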
The code below will work for you:
this.results = this.results.OrderByDescending(x => x.WinPercentage).ThenByDescending(x => x.Goals);
this.results.Foreach(x =>
{
int index = this.results.FindIndex(y => y.Goals == x.Goals && y.WinPercentage == x.WinPercentage);
x.Position = index > 0 ? this.results[index - 1].Position + 1 : 0;
});
Here's my solution
Define a class:
public class ABC
{
public int A { get; set; }
public int B { get; set; }
public int R { get; set; }
}
Construct the sample data:
List<ABC> list = new List<ABC>();
for (var i = 0; i < 100; i++)
{
list.Add(new ABC()
{
A = i,
B = i > 50 && i < 70 ? i + 20 : i + 1
});
}
Rank and print the values:
var result = list.OrderByDescending(d => d.B)
.GroupBy(d => d.B)
.SelectMany((g, i) => g.Select(e => new ABC()
{
A = e.A,
B = e.B,
R = i + 1
})).ToList();
foreach (var t in result)
{
Console.WriteLine(JsonConvert.SerializeObject(t));
}
Console.ReadLine();

LINQ OrderBy - Custom

I have some data like
ID Sequence customIndex
1 1 0
2 2 0
3 3 2
4 4 1
5 5 0
I need to order by Sequence when customIndex is zero; otherwise, use customIndex.
So the result should be the IDs in the order 1, 2, 4, 3, 5.
I need a LINQ implementation using lambdas. I tried some solutions but could not get them to work.
Posting duplicate and deleting previous one, because of wrong formatting the meaning of question got changed and I received bunch of negative votes.
Added code at dotnet fiddle:
https://stable.dotnetfiddle.net/fChl40
This answer is based on the assumption that CustomIndex is greater than or equal to zero:
var result =
data.OrderBy(x => x.CustomIndex==0 ? x.Sequence :
data.Where(y => y.CustomIndex==0 && y.Sequence < x.Sequence)
.Max(y => (int?)y.Sequence))
.ThenBy(x => x.CustomIndex);
This works for the provided test data:
list.OrderBy(a => a.customIndex != 0 ?
list.Where(b => b.Sequence < a.Sequence && b.customIndex == 0)
.OrderByDescending(c => c.Sequence)
.FirstOrDefault()
.Sequence : a.Sequence)
.ThenBy(c=>c.customIndex )
.ToList();
The idea is to order rows with a non-zero customIndex by the Sequence of the nearest preceding zero-valued row, and then by the customIndex itself.
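The ordering idea can be checked against the question's sample data with a small self-contained program. This is a sketch only: the Row class and OrderedIds helper are hypothetical names, and the query follows the shape of the first answer (a nullable sort key taken from the preceding zero-valued rows).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type matching the columns in the question.
public class Row
{
    public int ID { get; set; }
    public int Sequence { get; set; }
    public int CustomIndex { get; set; }
}

public static class CustomOrder
{
    // Rows with CustomIndex == 0 sort by their own Sequence.
    // Other rows sort by the largest Sequence among earlier zero rows
    // (null if there is none), then by CustomIndex.
    public static List<int> OrderedIds(List<Row> data)
    {
        return data
            .OrderBy(x => x.CustomIndex == 0
                ? x.Sequence
                : data.Where(y => y.CustomIndex == 0 && y.Sequence < x.Sequence)
                      .Max(y => (int?)y.Sequence))
            .ThenBy(x => x.CustomIndex)
            .Select(x => x.ID)
            .ToList();
    }

    public static void Main()
    {
        var data = new List<Row>
        {
            new Row { ID = 1, Sequence = 1, CustomIndex = 0 },
            new Row { ID = 2, Sequence = 2, CustomIndex = 0 },
            new Row { ID = 3, Sequence = 3, CustomIndex = 2 },
            new Row { ID = 4, Sequence = 4, CustomIndex = 1 },
            new Row { ID = 5, Sequence = 5, CustomIndex = 0 }
        };

        Console.WriteLine(string.Join(",", OrderedIds(data))); // prints 1,2,4,3,5
    }
}
```

The inner Where runs once per row, so the query is quadratic in the number of rows; that is fine for small lists but worth replacing with a single pass for large ones.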
This is something I wanted:
public static void Main()
{
List<Data> data = new List<Data>();
data.Add(new Data{ Id=1, Sequence=1, CustomIndex=0});
data.Add(new Data{ Id=5, Sequence=5, CustomIndex=0});
data.Add(new Data{ Id=6, Sequence=6, CustomIndex=2});
data.Add(new Data{ Id=2, Sequence=2, CustomIndex=0});
data.Add(new Data{ Id=3, Sequence=3, CustomIndex=2});
data.Add(new Data{ Id=4, Sequence=4, CustomIndex=1});
data.Add(new Data{ Id=7, Sequence=7, CustomIndex=1});
int o = 0;
var result = data
.OrderBy(x=>x.Sequence).ToList()
.OrderBy((x)=> myCustomSort(x, ref o) )
;
result.ToList().ForEach(x=> Console.WriteLine(x.Id));
}
public static float myCustomSort(Data x, ref int o){
if(x.CustomIndex==0){
o = x.Sequence;
return x.Sequence ;
}
else
return float.Parse(o + "."+ x.CustomIndex);
}
Sample code: https://stable.dotnetfiddle.net/fChl40
I will refine it further
Based on your question and reply to my comment, I understand you need to clusterize the items' collection, then consider Sequence and CustomIndex on all items of each cluster.
Once clustered (split into blocks depending on a specific criterion) you can merge them back into a unique collection, but while doing that you can manipulate each cluster independently the way you need.
public static class extCluster
{
public static IEnumerable<KeyValuePair<bool, T[]>> Clusterize<T>(this IEnumerable<T> self, Func<T, bool> clusterizer)
{
// Prepare temporary data
var bLastCluster = false;
var cluster = new List<T>();
// loop all items
foreach (var item in self)
{
// Compute cluster kind
var bItemCluster = clusterizer(item);
// If last cluster kind is different from current
if (bItemCluster != bLastCluster)
{
// If previous cluster was not empty, return its items
if (cluster.Count > 0)
yield return new KeyValuePair<bool, T[]>(bLastCluster, cluster.ToArray());
// Store new cluster kind and reset items
bLastCluster = bItemCluster;
cluster.Clear();
}
// Add current item to cluster
cluster.Add(item);
}
// If previous cluster was not empty, return its items
if (cluster.Count > 0)
yield return new KeyValuePair<bool, T[]>(bLastCluster, cluster.ToArray());
}
}
// sample
static class Program
{
public class Item
{
public Item(int id, int sequence, int _customIndex)
{
ID = id; Sequence = sequence; customIndex = _customIndex;
}
public int ID, Sequence, customIndex;
}
[STAThread]
static void Main(string[] args)
{
var aItems = new[]
{
new Item(1, 1, 0),
new Item(2, 2, 0),
new Item(3, 3, 2),
new Item(4, 4, 1),
new Item(5, 5, 0)
};
// Split items into clusters
var aClusters = aItems.Clusterize(item => item.customIndex != 0);
// Explode clusters and sort their items
var result = aClusters
.SelectMany(cluster => cluster.Key
? cluster.Value.OrderBy(item => item.customIndex)
: cluster.Value.OrderBy(item => item.Sequence));
}
}
It ain't pretty, but it exemplifies what you were asking for, I think:
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
List<Data> data = new List<Data>();
data.Add(new Data { Id = 1, Sequence = 1, CustomIndex = 0 });
data.Add(new Data { Id = 2, Sequence = 2, CustomIndex = 0 });
data.Add(new Data { Id = 3, Sequence = 3, CustomIndex = 2 });
data.Add(new Data { Id = 4, Sequence = 4, CustomIndex = 1 });
data.Add(new Data { Id = 5, Sequence = 5, CustomIndex = 0 });
//List of items where the sequence is what counts
var itemsToPlaceBySequence = data.Where(x => x.CustomIndex == 0).OrderBy(x => x.Sequence).ToList();
//List of items where the custom index counts
var itemsToPlaceByCustomIndex = data.Where(x => x.CustomIndex != 0).OrderBy(x => x.CustomIndex).ToList();
//Array to hold items
var dataSlots = new Data[data.Count];
//Place items by sequence
foreach(var dataBySequence in itemsToPlaceBySequence) {
dataSlots[dataBySequence.Sequence - 1] = dataBySequence ;
}
//Find empty data slots and place remaining items in CustomIndex order
foreach (var dataByCustom in itemsToPlaceByCustomIndex) {
var index = dataSlots.ToList().IndexOf(null);
dataSlots[index] = dataByCustom ;
}
var result = dataSlots.ToList();
result.ForEach(x => Console.WriteLine(x.Id));
var discard = Console.ReadKey();
}
public class Data
{
public int Id { get; set; }
public int Sequence { get; set; }
public int CustomIndex { get; set; }
}
}
The ordering you want to do (order partly on CustomIndex and partly on Sequence) doesn't work like that. But this should be close to what you want. Order first by CustomIndex, and then by Sequence.
var result = data.OrderBy(x => x.CustomIndex).ThenBy(x => x.Sequence);

Find Average of a node in Multiple JSON strings

I have searched through many topics, find some relevant answers too, but I am still not able to reach to a solution, hence I am posting this question.
Problem Description
EmployeeResponse1 = [{"Ques":"1","Rating":"7"},{"Ques":"2","Rating":"1"},{"Ques":"3","Rating":"6"},{"Ques":"4","Rating":"1"},{"Ques":"5","Rating":"1"},{"Ques":"6","Rating":"1"},{"Ques":"7","Rating":"7"},{"Ques":"8","Rating":"1"},{"Ques":"9","Rating":"1"},{"Ques":"10","Rating":"1"},{"Ques":"11","Rating":"1"},{"Ques":"12","Rating":"1"},{"Ques":"13","Rating":"1"},{"Ques":"14","Rating":"1"},{"Ques":"15","Rating":"1"},{"Ques":"16","Rating":"10"}]
EmployeeResponse2 = [{"Ques":"1","Rating":"5"},{"Ques":"2","Rating":"4"},{"Ques":"3","Rating":"7"},{"Ques":"4","Rating":"8"},{"Ques":"5","Rating":"5"},{"Ques":"6","Rating":"9"},{"Ques":"7","Rating":"10"},{"Ques":"8","Rating":"4"},{"Ques":"9","Rating":"9"},{"Ques":"10","Rating":"6"},{"Ques":"11","Rating":"6"},{"Ques":"12","Rating":"6"},{"Ques":"13","Rating":"7"},{"Ques":"14","Rating":"7"},{"Ques":"15","Rating":"9"},{"Ques":"16","Rating":"8"}]
I have these two JSON strings in C# (there can be more). Now I want to build a final JSON string like this:
EmployeeResponseAvg = [{"Ques":"1","Rating":"6"},{"Ques":"2","Rating":"2.5"},{"Ques":"3","Rating":"6.5"},{"Ques":"4","Rating":"4.5"},{"Ques":"5","Rating":"3"},{"Ques":"6","Rating":"5"},{"Ques":"7","Rating":"8.5"},{"Ques":"8","Rating":"2.5"},....,{"Ques":"16", "Rating": "9"}]
That is, the Rating for Ques = 1 should be the average of the Rating for Ques = 1 in string 1 and the Rating for Ques = 1 in string 2, and similarly for the other questions,
i.e. something like FINAL =[{ QUES = 1, RATING = (Emp1(Rating.WHERE(QUES = 1), Emp2(Rating.WHERE(QUES = 1),).AVERAGE),....................}]
Work So Far
MODEL -> SurveyResponse.cs
public class SurveyResponse
{
public string Ques { get; set; }
public string Rating { get; set; }
}
public class ResponseDataCalls
{
public static SurveyResponse PutData(string t, string v)
{
SurveyResponse s = new SurveyResponse();
s.Ques = t;
s.Rating = v;
return s;
}
}
WebAPI RevGroupChartController.cs
public class RevGroupChartController : ApiController
{
private hr_toolEntities _db = new hr_toolEntities();
public object Get(int cid, int gid)
{
spiderChart obj = new spiderChart();
var group_employees = (from ge in _db.hrt_group_employee
where ge.fk_group_id == gid
select ge.fk_employee_id).ToList();
List<string> EMP = new List<string>();
List<string> SUP = new List<string>();
List<SurveyResponse> EmpResponse = new List<SurveyResponse>();
List<SurveyResponse> SupResponse = new List<SurveyResponse>();
List<List<SurveyResponse>> tmpEMP = new List<List<SurveyResponse>>();
List<List<SurveyResponse>> tmpSUP = new List<List<SurveyResponse>>();
foreach(var emp in group_employees)
{
int eid = Convert.ToInt32(emp);
var Data = (from d in _db.hrt_cycle_response
join g in _db.hrt_cycle_groups on d.hrt_cycle.pk_cycle_id equals g.fk_cycle_id
where d.fk_cycle_id == cid && g.fk_group_id == gid && d.fk_employee_id == eid
select new
{
d.response_employee_answers,
d.response_supervisor_answers
}).First();
EMP.Add(Data.response_employee_answers);
SUP.Add(Data.response_supervisor_answers);
}
foreach(var e in EMP)
{
//tmpEMP = new JavaScriptSerializer().Deserialize<TEMP>(e);
var s = new JavaScriptSerializer();
List<SurveyResponse> em = s.Deserialize<List<SurveyResponse>>(e);
tmpEMP.Add(em);
}
foreach (var s in SUP)
{
//tmpSUP = new JavaScriptSerializer().Deserialize<TEMP>(s);
var e = new JavaScriptSerializer();
List<SurveyResponse> sp = e.Deserialize<List<SurveyResponse>>(s);
tmpSUP.Add(sp);
}
var empl = _db.hrt_questions.Select(x => new { x.question_name }).ToList();
List<int[]> Emprating = new List<int[]>();
//int avgRating;
int cnt = 0;
foreach(var item in tmpSUP)
{
int noofQ = item.Count;
int[] i = new int[noofQ];
for (int y = 0; y > tmpSUP.Count; y++)
{
i[y] = Convert.ToInt32(item[cnt].Rating);
}
Emprating.Add(i);
cnt++;
}
//obj.Employee = Data.response_employee_answers;
//obj.Supervisor = Data.response_supervisor_answers;
obj.ques = new List<object>();
for (int i = 0; i < empl.Count; i++)
{
obj.ques.Add(empl[i].question_name);
}
return obj;
}
public class TEMP
{
public List<SurveyResponse> data { get; set; }
}
}
Explanation of Code
I pass a cycle ID and a group ID...
Each group has more than 1 employee and each employee has a supervisor
so if say group ID 1023 has 2 employees.
Now we have 2 employees and 2 supervisors
we have a json record for each of them
LIKE DB TABLE RESPONSE {fk_emp_id, fk_sup_id, cycle_id, emp_reponse(json), supervisor_response(json)}
so I need to make ONE JSON string for employees (which contains the average of all ratings)
and ONE JSON string for SUPERVISOR (again, average of both the JSONs)
there could be any number of employees, depending on the group size
and each employee will always have a supervisor
In short, I want a string like:
FinalEmployeeResponse = [{'Ques': '1', 'Rating': 'R1'}, {'Ques': '2', 'Rating': 'R2'}, {'Ques': '3', 'Rating': 'R3'}, {'Ques': '4', 'Rating': 'R4'}, ........, {'Ques': '16', 'Rating': 'R16'}]
Here, R1 = AVERAGE(Emp1json.Rating.WHERE('Ques' = 1), Emp2json.Rating.WHERE('Ques' = 1), .....)
and
R2 = AVERAGE(Emp1json.Rating.WHERE('Ques' = 2), Emp2json.Rating.WHERE('Ques' = 2), .....)
... and so on....
Looking forward to your responses.
I am new on stack overflow, please ask for more details if I have missed something.
The correct way to do this is to parse this as JSON. The quick and dirty way is:
static void Main(string[] args)
{
string json1 = @"[{""Ques"":""1"",""Rating"":""7""},{""Ques"":""2"",""Rating"":""1""},{""Ques"":""3"",""Rating"":""6""},{""Ques"":""4"",""Rating"":""1""},{""Ques"":""5"",""Rating"":""1""},{""Ques"":""6"",""Rating"":""1""},{""Ques"":""7"",""Rating"":""7""},{""Ques"":""8"",""Rating"":""1""},{""Ques"":""9"",""Rating"":""1""},{""Ques"":""10"",""Rating"":""1""},{""Ques"":""11"",""Rating"":""1""},{""Ques"":""12"",""Rating"":""1""},{""Ques"":""13"",""Rating"":""1""},{""Ques"":""14"",""Rating"":""1""},{""Ques"":""15"",""Rating"":""1""},{""Ques"":""16"",""Rating"":""10""}]";
string json2 = @"[{""Ques"":""1"",""Rating"":""5""},{""Ques"":""2"",""Rating"":""4""},{""Ques"":""3"",""Rating"":""7""},{""Ques"":""4"",""Rating"":""8""},{""Ques"":""5"",""Rating"":""5""},{""Ques"":""6"",""Rating"":""9""},{""Ques"":""7"",""Rating"":""10""},{""Ques"":""8"",""Rating"":""4""},{""Ques"":""9"",""Rating"":""9""},{""Ques"":""10"",""Rating"":""6""},{""Ques"":""11"",""Rating"":""6""},{""Ques"":""12"",""Rating"":""6""},{""Ques"":""13"",""Rating"":""7""},{""Ques"":""14"",""Rating"":""7""},{""Ques"":""15"",""Rating"":""9""},{""Ques"":""16"",""Rating"":""8""}]";
string averages = AverageNodes(json1, json2);
Console.WriteLine(averages);
Console.ReadKey();
}
private static string AverageNodes(params string[] json)
{
var regex = new Regex(@"(""Ques"":""(?<question>\d+)"",""Rating"":""(?<rating>\d+)"")", RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase);
var ANUs = regex.Matches(string.Join("", json))
.Cast<Match>()
.Select(m => new { Question = m.Groups["question"].Value, Rating = int.Parse(m.Groups["rating"].Value) })
.GroupBy(a => a.Question, a => a.Rating)
.Select(a => string.Format("{{\"Ques\":\"{0}\",\"Rating\":\"{1}\"}}", a.Key, a.Average()));
return "[" + string.Join(",", ANUs) + "]";
}
I found a one-line answer to this using LINQ (here i is the zero-based index of the question within each response list):
double _avg1 = tmpEMP.Select(x => Convert.ToInt32(x.ElementAt(i).Rating)).Average();
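For completeness, here is a sketch of the "parse it as JSON" route that produces the whole averaged list at once by grouping on Ques. It swaps in System.Text.Json for the JavaScriptSerializer used in the question, which assumes a runtime where that library is available; the RatingAverager class and its Average method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.Json;

public class SurveyResponse
{
    public string Ques { get; set; }
    public string Rating { get; set; }
}

public static class RatingAverager
{
    // Parses every JSON array, groups the entries by question,
    // and averages the ratings of each group.
    public static List<SurveyResponse> Average(params string[] jsonStrings)
    {
        return jsonStrings
            .SelectMany(json => JsonSerializer.Deserialize<List<SurveyResponse>>(json))
            .GroupBy(r => r.Ques)
            .Select(g => new SurveyResponse
            {
                Ques = g.Key,
                Rating = g.Average(r => double.Parse(r.Rating, CultureInfo.InvariantCulture))
                          .ToString(CultureInfo.InvariantCulture)
            })
            .ToList();
    }

    public static void Main()
    {
        // Shortened versions of the two strings from the question.
        string emp1 = "[{\"Ques\":\"1\",\"Rating\":\"7\"},{\"Ques\":\"2\",\"Rating\":\"1\"}]";
        string emp2 = "[{\"Ques\":\"1\",\"Rating\":\"5\"},{\"Ques\":\"2\",\"Rating\":\"4\"}]";

        Console.WriteLine(JsonSerializer.Serialize(Average(emp1, emp2)));
    }
}
```

Grouping by Ques rather than by position means the responses do not have to list the questions in the same order, and any number of JSON strings can be passed in.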

How to build multiple integer key index (fast look up object) for using between operator (val >= & val <=)

Ok let me explain clearly what i want to achieve
It will be an object which will contain the below data - like an sql server table
BigInt parameter1
BigInt parameter2
string parameter3
these parameter1 and parameter2 both will compose the index (like primary key in sql-server table)
So this object will have like 500000 records like the above
And i will make fast look ups from this object like
return parameter3 where parameter1 <= value and value <= parameter2
What can be used for this ?
So far i tried these and they are slow
DataView.RowFilter = super slow
static Dictionary<Int64, KeyValuePair<Int64, string>> = slower than database query
Database query = where parameter1 & parameter2 composes primary key = slow since i need to make over 500000 query.
I also searched many questions at stackoverflow and none of them targeting between operator at integer keys. They are all multiple string key.
C# 4.0
Quick and dirty sketch:
public class GeoIp
{
private class GeoIpRecord
{
public long StartIp;
public long EndIp;
public string Iso;
}
private class GeoIpRecordComparer: IComparer<GeoIpRecord>
{
public int Compare(GeoIpRecord x, GeoIpRecord y)
{
return x.StartIp.CompareTo(y.StartIp);
}
}
private List<GeoIpRecord> geoIp;
private IComparer<GeoIpRecord> comparer;
public GeoIp()
{
this.geoIp = new List<GeoIpRecord>(500000)
{
new GeoIpRecord { StartIp = 1, EndIp = 2, Iso = "One" },
new GeoIpRecord { StartIp = 3, EndIp = 5, Iso = "Three" },
new GeoIpRecord { StartIp = 6, EndIp = 6, Iso = "Six" },
new GeoIpRecord { StartIp = 7, EndIp = 10, Iso = "Seven" },
new GeoIpRecord { StartIp = 15, EndIp = 16, Iso = "Fifteen" },
};
this.comparer = new GeoIpRecordComparer();
}
public string GetIso(long ipValue)
{
int index = this.geoIp.BinarySearch(new GeoIpRecord() { StartIp = ipValue }, this.comparer);
if (index < 0)
{
index = ~index - 1;
if (index < 0)
{
return string.Empty;
}
}
GeoIpRecord record = this.geoIp[index];
if (record.EndIp >= ipValue)
{
return record.Iso;
}
else
{
return string.Empty;
}
}
}
And the code that confirms the solution:
GeoIp geoIp = new GeoIp();
var iso1 = geoIp.GetIso(1); // One
var iso2 = geoIp.GetIso(2); // One
var iso3 = geoIp.GetIso(3); // Three
var iso4 = geoIp.GetIso(4); // Three
var iso5 = geoIp.GetIso(5); // Three
var iso6 = geoIp.GetIso(6); // Six
var iso7 = geoIp.GetIso(7); // Seven
var iso11 = geoIp.GetIso(11); //
var iso15 = geoIp.GetIso(15); // Fifteen
var iso17 = geoIp.GetIso(17); //
The list has to be filled with data ordered by StartIp.
See: List<T>.BinarySearch(T, IComparer<T>) method.
I don't think [that] ranges overlap.
This simplifies the problem a great deal: rather than performing a two-dimensional search, you can sort your list, and perform a one-dimensional binary search, like this:
var data = new List<Tuple<long,long,string>>(TotalCount);
var cmp = new TupleComparer();
data.Sort(cmp);
long item = ... // The item to be searched
// Use long.MaxValue as Item2 so the search key sorts after every range that starts at or before item
// (assuming no stored range ends at long.MaxValue)
var pos = data.BinarySearch(Tuple.Create(item, long.MaxValue, String.Empty), cmp);
// With no exact match, BinarySearch returns the bitwise complement of the insertion index,
// so the candidate range is the element just before that index
if (pos < 0) pos = ~pos - 1;
if (pos >= 0 && data[pos].Item1 <= item && data[pos].Item2 >= item) {
Console.WriteLine("Found: '{0}'", data[pos].Item3);
} else {
Console.WriteLine("Not found");
}
Here is the TupleComparer class:
class TupleComparer : IComparer<Tuple<long,long,string>> {
public int Compare(Tuple<long,long,string> x, Tuple<long,long,string> y) {
var res = x.Item1.CompareTo(y.Item1);
if (res != 0) return res;
res = x.Item2.CompareTo(y.Item2);
return (res != 0) ? res : String.CompareOrdinal(x.Item3, y.Item3);
}
}
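A more compact variant of the same one-dimensional search keeps the range starts in a plain sorted long[] and uses Array.BinarySearch, so no comparer or dummy record is needed. This is a sketch under the same assumption as above (ranges sorted by start and not overlapping); the RangeLookup class is a hypothetical name.

```csharp
using System;

public class RangeLookup
{
    private readonly long[] starts;
    private readonly long[] ends;
    private readonly string[] values;

    // The three arrays are parallel: range i is [starts[i], ends[i]] with values[i].
    // They must be sorted by start, and ranges are assumed not to overlap.
    public RangeLookup(long[] starts, long[] ends, string[] values)
    {
        this.starts = starts;
        this.ends = ends;
        this.values = values;
    }

    // Returns the value whose range contains the argument, or null if none does.
    public string Find(long value)
    {
        int index = Array.BinarySearch(starts, value);
        if (index < 0)
        {
            // No exact start match: step back to the last range starting before value.
            index = ~index - 1;
        }
        if (index < 0 || ends[index] < value)
        {
            return null;
        }
        return values[index];
    }

    public static void Main()
    {
        var lookup = new RangeLookup(
            new long[] { 1, 3, 6, 7, 15 },
            new long[] { 2, 5, 6, 10, 16 },
            new[] { "One", "Three", "Six", "Seven", "Fifteen" });

        Console.WriteLine(lookup.Find(4) ?? "(none)");  // Three
        Console.WriteLine(lookup.Find(11) ?? "(none)"); // (none)
    }
}
```

Each lookup is O(log n), so 500000 records need roughly 19 comparisons per query.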
