Newtonsoft JArray LINQ - Group By array with identical values - c#

I'm new to LINQ and having a problem I can't seem to solve. I have a JSON array/object like this:
[
{
"items": [
"pepperoni"
]
},
{
"items": [
"sausage"
]
},
{
"items": [
"sausage"
]
},
{
"items": [
"pepperoni",
"mushrooms",
"olives"
]
},
{
"items": [
"peppers",
"spinach"
]
},
{
"items": [
"peppers",
"spinach"
]
},
{
"items": [
"peppers",
"spinach"
]
}
]
I need to GROUP BY the items combinations and produce results like this:
peppers,spinach - 3
sausage - 2
pepperoni - 1
pepperoni,mushrooms,olives - 1
This is the Linq query I have (clearly doesn't work).
JArray jsonData= JsonConvert.DeserializeObject<JArray>(jsonString);
var queryResult =
from c in jsonData.Select(i => i["items"]).Values<string>()
group c by c
into g
orderby g.Count() descending
select new { Items = g.Key, Count = g.Count() };
I find examples for every scenario except this one.

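You need to join each item list into a single string and then group by that string (sort the values before joining if the order of the items should not matter):
var queryResult = from c in jsonData.Select(i => String.Join(",", i["items"].Values<string>()))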
group c by c
into g
orderby g.Count() descending
select new { Items = g.Key, Count = g.Count() };
It will give you the desired output:
peppers,spinach-3
sausage-2
pepperoni-1
pepperoni,mushrooms,olives-1
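As a rough end-to-end sketch of the join-then-group idea, the snippet below parses the JSON with JArray.Parse and counts the combinations. The ComboCounter class, the CountCombos method and the ignoreOrder flag are names made up for this illustration; they are not part of Json.NET.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class ComboCounter
{
    // Returns "items - count" lines, most frequent combination first.
    public static string[] CountCombos(string json, bool ignoreOrder)
    {
        return JArray.Parse(json)
            .Select(o => o["items"].Values<string>())
            // Optionally sort so that [a, b] and [b, a] produce the same key.
            .Select(items => ignoreOrder ? items.OrderBy(s => s, StringComparer.Ordinal) : items)
            .Select(items => string.Join(",", items))
            .GroupBy(key => key)
            .OrderByDescending(g => g.Count())
            .Select(g => $"{g.Key} - {g.Count()}")
            .ToArray();
    }

    public static void Main()
    {
        const string json = @"[
            { ""items"": [""spinach"", ""peppers""] },
            { ""items"": [""peppers"", ""spinach""] },
            { ""items"": [""sausage""] }
        ]";
        foreach (var line in CountCombos(json, ignoreOrder: true))
            Console.WriteLine(line);
    }
}
```

With ignoreOrder set to false the first two entries would end up in separate groups, which is the behaviour of the query above.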

There's another solution that uses SequenceEqual and IEqualityComparer.
This solution is longer but more complete, since it uses a proper custom equality comparer.
First we need a class implementing IEqualityComparer:
public class ItemEqualityComparer : IEqualityComparer<IEnumerable<string>>
{
public bool Equals(IEnumerable<string> x, IEnumerable<string> y)
{
if (x == null && y == null)
return true;
else if (x == null || y == null)
return false;
return x.SequenceEqual(y);
}
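public int GetHashCode(IEnumerable<string> obj)
{
// Enumerable.Sum is checked and throws OverflowException once the hash codes add up past int.MaxValue,
// so combine them in an unchecked context instead.
return obj.Aggregate(17, (hash, s) => unchecked(hash * 31 + (s?.GetHashCode() ?? 0)));
}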
}
then we can use it to group the items correctly:
var itemsGroups = jsonData.Select(i => i["items"].Values<string>()).GroupBy(l => l, l => l, (key, values) => new
{
Key = key,
Count = values.Count()
}, new ItemEqualityComparer());
This solution keeps the items inside a list, so you don't have to split them out of the joined string again.
The drawback is that it is slower, because it has to enumerate the items of the lists it compares to check whether they are equal.
I hope I was helpful.
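If the order of the items inside a list should not matter, one possible variation is to group the lists as sets. The sketch below uses HashSet<string>.CreateSetComparer() from the base class library; the SetGrouping class and the CountAsSets method are hypothetical names, and the approach also ignores duplicate items within a list, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetGrouping
{
    // Groups item lists as sets, so ordering and duplicates inside a list are ignored.
    public static Dictionary<string, int> CountAsSets(IEnumerable<IEnumerable<string>> lists)
    {
        return lists
            .Select(l => new HashSet<string>(l))
            .GroupBy(set => set, HashSet<string>.CreateSetComparer())
            .ToDictionary(
                // Sort only for display, so the key text is stable.
                g => string.Join(",", g.Key.OrderBy(s => s, StringComparer.Ordinal)),
                g => g.Count());
    }

    public static void Main()
    {
        var lists = new[]
        {
            new[] { "peppers", "spinach" },
            new[] { "spinach", "peppers" },
            new[] { "sausage" },
        };
        foreach (var pair in CountAsSets(lists))
            Console.WriteLine($"{pair.Key} - {pair.Value}");
    }
}
```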

Related

EF Core LINQ groups results of the query incorrectly

I have an entity like this:
public class Vehicle
{
public long Id { get; set; }
public string RegistrationNumber { get; set; }
public string Model { get; set; }
public string Code { get; set; }
//other properties
}
Which has a unique constraint on { RegistrationNumber, Model, Code, /*two other properties*/ }
I'm trying to query the database to get an object that's structured like this:
[
{
"name": "Model1",
"codes": [
{
"name": "AAA",
"registrationNumbers": ["2", "3"]
},
{
"name":"BBB",
"registrationNumbers": ["3", "4"]
}
]
},
{
"name": "Model2",
"codes": [
{
"name": "BBB",
"registrationNumbers": ["4", "5"]
}
]
}
]
I.e. the list of Models, where each model has a list of Codes that can co-appear with it, and each code has a list of Registration Numbers that can appear with that Model and that Code.
I'm doing a LINQ like this:
var vehicles = _context.Vehicles.Where(/*some additional filters*/);
return await vehicles.Select(v => v.Model).Distinct().Select(m => new ModelFilterDTO()
{
Name = m,
Codes = vehicles.Where(v => v.Model== m).Select(v => v.Code).Distinct().Select(c => new CodeFilterDTO()
{
Name = c,
RegistrationNumbers = vehicles.Where(v => v.Model == m && v.Code == c).Select(v => v.RegistrationNumber).Distinct()
})
}).ToListAsync();
Which gets translated into this SQL query:
SELECT [t].[Model], [t2].[Code], [t2].[RegistrationNumber], [t2].[Id]
FROM (
SELECT DISTINCT [v].[Model]
FROM [Vehicles] AS [v]
WHERE --additional filtering
) AS [t]
OUTER APPLY (
SELECT [t0].[Code], [t1].[RegistrationNumber], [t1].[Id]
FROM (
SELECT DISTINCT [v0].[Code]
FROM [Vehicles] AS [v0]
WHERE /* additional filtering */ AND ([v0].[Model] = [t].[Model])
) AS [t0]
LEFT JOIN (
SELECT DISTINCT [v1].[RegistrationNumber], [v1].[Id], [v1].[Code]
FROM [Vehicles] AS [v1]
WHERE /* additional filtering */ AND ([v1].[Model] = [t].[Model])
) AS [t1] ON [t0].[Code] = [t1].[Code]
) AS [t2]
ORDER BY [t2].[Id]
Running this query in the SQL Server gets me correct sets of values. But when I perform the LINQ, I get an object like this:
[
{
"name": "Model1",
"codes": [
{
"name": "AAA",
"registrationNumbers": [/* every single registration number that is present among the records that passed the filters*/]
}
]
}
]
What might the problem be, and how can I fix it?
Edit: After playing with it for a bit, I'm even more confused than I was
This LINQ:
var vehicles = _context.Vehicles.Where(/*some additional filters*/);
return await vehicles.Select(v => v.Model).Distinct().Select(m => new ModelFilterDTO()
{
Name = m
}).ToListAsync();
Gives the expected result:
[
{
"name": "Model1"
},
{
"name": "Model2"
},
...
]
However, this LINQ:
var vehicles = _context.Vehicles.Where(/*some additional filters*/);
return await vehicles.Select(v => v.Model).Distinct().Select(m => new ModelFilterDTO()
{
Name = m,
Codes = vehicles.Select(v=>v.Code).Distinct().Select(c => new CodeFilterDTO()
{
Name = c
})
}).ToListAsync();
Gives result like this:
[
{
"name": "Model1",
"codes": [
{
"name": "AAA"
}
]
}
]
Take a look at the GroupBy operator. With two levels of grouping you can achieve the desired result:
var rawData = await _context.Vehicles
.Where(/*some additional filters*/)
.Select(v => new
{
v.Model,
v.RegistrationNumber,
v.Code
})
.ToListAsync(); // materialize minimum data
// perform grouping on the client side
var result = rawData
.GroupBy(v => v.Model)
.Select(gm => new ModelFilterDTO
{
Name = gm.Key,
Codes = gm
.GroupBy(x => x.Code)
.Select(gc => new CodeFilterDTO
{
Name = gc.Key,
RegistrationNumbers = gc.Select(x => x.RegistrationNumber).Distinct().ToList()
}).ToList()
})
.ToList();
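To see the double grouping in isolation, here is a small in-memory sketch that runs without a database. VehicleRow, CodeFilter and ModelFilter are hypothetical stand-ins for the materialized rows and for the DTOs from the question, whose exact definitions are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the projected row and the two DTOs.
public record VehicleRow(string Model, string Code, string RegistrationNumber);
public record CodeFilter(string Name, List<string> RegistrationNumbers);
public record ModelFilter(string Name, List<CodeFilter> Codes);

public static class VehicleGrouping
{
    // Groups by model first, then by code inside each model.
    public static List<ModelFilter> Build(IEnumerable<VehicleRow> rows) =>
        rows.GroupBy(r => r.Model)
            .Select(gm => new ModelFilter(
                gm.Key,
                gm.GroupBy(r => r.Code)
                  .Select(gc => new CodeFilter(
                      gc.Key,
                      gc.Select(r => r.RegistrationNumber).Distinct().ToList()))
                  .ToList()))
            .ToList();

    public static void Main()
    {
        var rows = new[]
        {
            new VehicleRow("Model1", "AAA", "2"),
            new VehicleRow("Model1", "AAA", "3"),
            new VehicleRow("Model1", "BBB", "3"),
            new VehicleRow("Model1", "BBB", "4"),
            new VehicleRow("Model2", "BBB", "4"),
            new VehicleRow("Model2", "BBB", "5"),
            new VehicleRow("Model2", "BBB", "5"), // duplicate, removed by Distinct()
        };
        foreach (var m in Build(rows))
            foreach (var c in m.Codes)
                Console.WriteLine($"{m.Name} / {c.Name}: {string.Join(",", c.RegistrationNumbers)}");
    }
}
```

The sample rows mirror the target structure from the question, so the output should list codes AAA and BBB under Model1 and only BBB under Model2.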

Filter JSON Array with dynamic conditions

I have many JSON arrays with different types of nodes in them.
Sample Json 1:
[
{
"EmpID": "23",
"EmpName": "Jhon",
"Age": "23"
},
{
"EmpID": "29",
"EmpName": "Paul",
"Age": "25"
},
{
"EmpID": "123",
"EmpName": "Jack",
"Age": "29"
},
{
"EmpID": "129",
"EmpName": "Apr",
"Age": "29"
}
]
Sample Json 2
[
{
"DepID": "2",
"Name": "Sales"
},
{
"DepID": "5",
"Name": "Marketing"
},
{
"DepID": "12",
"Name": "IT"
}
]
I want to filter them based on different conditions such as
1)EmpID=29
This should return
[
{
"EmpID": "29",
"EmpName": "Paul",
"Age": "25",
}
]
2)Age=23 and EmpName=Jhon
This should return
[
{
"EmpID": "23",
"EmpName": "Jhon",
"Age": "23"
}
]
3)Age=29
This should return
[
{
"EmpID": "123",
"EmpName": "Jack",
"Age": "29"
},
{
"EmpID": "129",
"EmpName": "Apr",
"Age": "29"
}
]
So I need a generic approach to do any number of filters on the JSON array. I am planning to get all the filters using some comma separated string like Age="23",EmpName="Jhon" and this can be converted to any format in the code.
I have tried creating a dynamic filter using JSONPath, such as $.[?(@.Age == '23' && @.EmpName == 'Jhon')].
Also I tried using LINQ like
var result = JsonConvert.DeserializeObject(jsonString);
var res = (result as Newtonsoft.Json.Linq.JArray).Where(x =>
x["Age"].ToString() =="23" && x["EmpName"].ToString()=="Jhon").ToList();
But how can I generate the Where conditions dynamically, based on however many conditions I receive?
There is also a plan to include date filters, in case there are datetime nodes in the JSON, such as BirthDate>12051995.
I am not sure how I can dynamically filter using any number of input filter conditions.
To get this working in a traditional way, you'll need to perform 3 steps:
1) define a class to contain the data
2) deserialize the JSON into a list of objects
3) use LINQ to query your selection
You can do the same thing for the departments.
If you need to join them in any way, use .Join. If the JSON is mixed, you can create a single class containing all the properties and use that to query.
So for the simple case, first define a class to represent your object:
public class Employee
{
public int EmpID {get;set;}
public string EmpName {get;set;}
public int Age {get;set;}
}
Then deserialize and query:
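Put this at the top:
using System.Text.Json;
using System.Text.Json.Serialization;
public void Main()
{
//the sample JSON stores numbers as strings ("23"), so allow reading them into int properties (.NET 5 or later)
var options = new JsonSerializerOptions { NumberHandling = JsonNumberHandling.AllowReadingFromString };
//deserialize into a list
List<Employee> employees =
JsonSerializer.Deserialize<List<Employee>>(yourJsonString, options);
//query
var result = employees.Where(c => c.Age == 23 && c.EmpName == "Jhon");
//show results
foreach (var employee in result)
Console.WriteLine(employee.EmpID);
}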
As per your update:
Depending on your use case you have a couple of options:
a fixed number of dynamic properties
a truly dynamic query
A fixed number of dynamic properties
You can achieve a more dynamic setup with the following:
//define the filterable properties
//note they are nullable
int? age = null;
int? id = null;
string name = null;
//apply them in a query
//
//note: if one of the filter properties is not set,
// that side of the && expression evaluates to "true"
var result = employees.Where(c => (age == null ? true : c.Age == age) &&
(id == null ? true : c.EmpID == id) &&
(name == null ? true : c.EmpName == name));
A truly dynamic query
Now here things start to get tricky. One possible option is to generate a string-based query with the help of a library like Dynamic LINQ.
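If pulling in a library feels like too much, a lighter alternative is a lookup table from filter names to typed checks, applied with All. This is only a sketch under assumptions: the EmployeeFilter class, its Checks table and the Apply method are invented for this example, the Employee class is repeated so the snippet compiles on its own, and an unknown filter name simply throws KeyNotFoundException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int EmpID { get; set; }
    public string EmpName { get; set; }
    public int Age { get; set; }
}

public static class EmployeeFilter
{
    // Hypothetical lookup: filter name -> check against the raw filter value.
    private static readonly Dictionary<string, Func<Employee, string, bool>> Checks =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["EmpID"] = (e, v) => e.EmpID == int.Parse(v),
            ["Age"] = (e, v) => e.Age == int.Parse(v),
            ["EmpName"] = (e, v) => string.Equals(e.EmpName, v, StringComparison.OrdinalIgnoreCase),
        };

    // Keeps the employees that satisfy every (field, value) pair.
    public static List<Employee> Apply(IEnumerable<Employee> source,
                                       IReadOnlyDictionary<string, string> filters) =>
        source.Where(e => filters.All(f => Checks[f.Key](e, f.Value))).ToList();

    public static void Main()
    {
        var employees = new List<Employee>
        {
            new() { EmpID = 23, EmpName = "Jhon", Age = 23 },
            new() { EmpID = 29, EmpName = "Paul", Age = 25 },
            new() { EmpID = 123, EmpName = "Jack", Age = 29 },
            new() { EmpID = 129, EmpName = "Apr", Age = 29 },
        };
        var filters = new Dictionary<string, string> { ["Age"] = "23", ["EmpName"] = "Jhon" };
        foreach (var e in Apply(employees, filters))
            Console.WriteLine($"{e.EmpID} {e.EmpName}");
    }
}
```

An empty filter dictionary returns every employee, because All is true for an empty sequence.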
You have almost nailed it. :)
Instead of using DeserializeObject and then casting the result to JArray, prefer JArray.Parse:
var json = File.ReadAllText("sample.json");
var semiParsedJson = JArray.Parse(json);
Instead of using ToList after Where, prefer the JArray constructor, which works well with an IEnumerable<JToken>:
const string IdField = "EmpID", NameField = "EmpName", AgeField = "Age";
const StringComparison caseIgnorant = StringComparison.OrdinalIgnoreCase;
var idEq29 = semiParsedJson.Children()
.Where(token => string.Equals(token[IdField].Value<string>(),"29", caseIgnorant));
Console.WriteLine(new JArray(idEq29).ToString());
The other queries can be implemented in the very same way
var ageEq23AndNameJhon = semiParsedJson.Children()
.Where(token => string.Equals(token[AgeField].Value<string>(), "23", caseIgnorant)
&& string.Equals(token[NameField].Value<string>(), "Jhon", caseIgnorant));
Console.WriteLine(new JArray(ageEq23AndNameJhon).ToString());
var ageEq29 = semiParsedJson.Children()
.Where(token => string.Equals(token[AgeField].Value<string>(), "29", caseIgnorant));
Console.WriteLine(new JArray(ageEq29).ToString());
UPDATE #1: Enhance proposed solution
With the following extension method
public static class JArrayExtensions
{
public static JArray Filter(this JArray array, Func<JToken, bool> predicate)
=> new JArray(array.Children().Where(predicate));
}
you can greatly simplify the filtering
var idEq29 = semiParsedJson
.Filter(token => string.Equals(token[IdField].Value<string>(),"29", caseIgnorant));
var ageEq23AndNameJhon = semiParsedJson
.Filter(token => string.Equals(token[AgeField].Value<string>(), "23", caseIgnorant))
.Filter(token => string.Equals(token[NameField].Value<string>(), "Jhon", caseIgnorant));
var ageEq29 = semiParsedJson
.Filter(token => string.Equals(token[AgeField].Value<string>(), "29", caseIgnorant));
Console.WriteLine(idEq29);
Console.WriteLine();
Console.WriteLine(ageEq23AndNameJhon);
Console.WriteLine();
Console.WriteLine(ageEq29);
Or you can push it even further. If all the fields store string values then you can define the extension method like this:
public static class JArrayExtensions
{
public static JArray Filter(this JArray array, string field, string value)
=> new JArray(array.Children().Where(GenerateFilter(field, value)));
private static Func<JToken, bool> GenerateFilter(string field, string value)
=> (JToken token) => string.Equals(token[field].Value<string>(), value, StringComparison.OrdinalIgnoreCase);
}
Then the filter queries are super simple :D
var idEq29 = semiParsedJson
.Filter(IdField,"29");
var ageEq23AndNameJhon = semiParsedJson
.Filter(AgeField, "23")
.Filter(NameField, "Jhon");
var ageEq29 = semiParsedJson
.Filter(AgeField, "29");
Console.WriteLine(ageEq23AndNameJhon);
Console.WriteLine();
Console.WriteLine(idEq29);
Console.WriteLine();
Console.WriteLine(ageEq29);
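To connect this with the comma-separated filter string mentioned in the question, a sketch like the following could parse the string and apply every condition in one pass. FilterStringDemo, ParseFilters and Apply are hypothetical names; the parser assumes plain field=value pairs, so values containing commas and operators such as > are not handled.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class FilterStringDemo
{
    // Parses text like Age="23",EmpName="Jhon" into (field, value) pairs; quotes are optional.
    public static (string Field, string Value)[] ParseFilters(string filterText) =>
        filterText
            .Split(',', StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part.Split('=', 2))
            .Where(kv => kv.Length == 2)
            .Select(kv => (kv[0].Trim(), kv[1].Trim().Trim('"')))
            .ToArray();

    // Keeps the elements whose fields match every parsed condition (case-insensitive).
    public static JArray Apply(JArray array, string filterText)
    {
        var filters = ParseFilters(filterText);
        return new JArray(array.Children().Where(token =>
            filters.All(f => string.Equals(
                token[f.Field]?.Value<string>(), f.Value, StringComparison.OrdinalIgnoreCase))));
    }

    public static void Main()
    {
        var array = JArray.Parse(@"[
            { ""EmpID"": ""23"", ""EmpName"": ""Jhon"", ""Age"": ""23"" },
            { ""EmpID"": ""123"", ""EmpName"": ""Jack"", ""Age"": ""29"" },
            { ""EmpID"": ""129"", ""EmpName"": ""Apr"", ""Age"": ""29"" }
        ]");
        Console.WriteLine(Apply(array, "Age=\"23\",EmpName=\"Jhon\""));
        Console.WriteLine(Apply(array, "Age=29"));
    }
}
```

A field that is missing from an element never matches, because the null-conditional lookup yields null for it.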

C# MongoDB Filter returns the whole object

I'm trying to create a MongoDB filter in C#.
For example, I have a JSON object like this:
{
"Username": "Tinwen",
"Foods": [
{
"Fruit": "Apple",
"Amount": 1
},
{
"Fruit": "Banana",
"Amount": 2
},
{
"Fruit": "Mango",
"Amount": 3
},
{
"Fruit": "Strawberry",
"Amount": 3
}
]
}
And I want to create a filter that returns only the objects in the array with Amount == 2 || Amount == 3:
{
"Username": "Tinwen",
"Foods": [
{
"Fruit": "Banana",
"Amount": 2
},
{
"Fruit": "Mango",
"Amount": 3
},
{
"Fruit": "Strawberry",
"Amount": 3
}
]
}
I've already tried filter like this :
var amountFilter= Builders<MyObject>.Filter.ElemMatch(
m => m.Foods,
f => f.Amount == 2 || f.Amount == 3);
And this one :
var expected = new List<int>();
expected.Add(2);
expected.Add(3);
var amountFilter = Builders<MyObject>.Filter.And(Builders<MyObject>.Filter.ElemMatch(
x => x.Foods, Builders<Foods>.Filter.And(
Builders<Foods>.Filter.In(y => y.Amount, expected))));
But every time it returns the whole object (with the full array).
For now I'm using LINQ like this:
List<MyObject> res = _messageCollection.Find(amountFilter).ToEnumerable().ToList();
foreach (var msg in res)
{
//iterate backwards so RemoveAt does not skip the element that shifts into the freed slot
for (int j = msg.Foods.Count - 1; j >= 0; j--)
{
if (!expected.Contains(msg.Foods[j].Amount))
{
msg.Foods.RemoveAt(j);
}
}
}
for (int i = res.Count - 1; i >= 0; i--)
{
if (res[i].Foods.Count == 0)
{
res.RemoveAt(i);
}
}
But I'm pretty sure it can be done with a MongoDB filter (and also because the LINQ version is pretty bad). So if anyone has an answer, that would help me a lot!
You can do it with LINQ as follows, but you have to make the Foods property an IEnumerable<Food>:
public class User
{
public string Username { get; set; }
public IEnumerable<Food> Foods { get; set; }
}
The following query will do a $filter projection on the Foods array/list:
var result = await collection
.AsQueryable()
.Where(x => x.Foods.Any(f => f.Amount == 2 || f.Amount == 3))
.Select(x => new User
{
Username = x.Username,
Foods = x.Foods.Where(f => f.Amount == 2 || f.Amount == 3)
})
.ToListAsync();
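To make the shape of that result concrete without a running MongoDB server, here is an in-memory analogue written with LINQ to Objects. It is not driver code: the Food and User records and the FoodProjection.KeepAmounts helper are assumptions made for this sketch, and it only illustrates what the Where plus Select combination is meant to return.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory models; the real classes map to MongoDB documents.
public record Food(string Fruit, int Amount);
public record User(string Username, List<Food> Foods);

public static class FoodProjection
{
    // Keeps users that have at least one matching food, and only their matching foods.
    public static List<User> KeepAmounts(IEnumerable<User> users, params int[] amounts)
    {
        var wanted = new HashSet<int>(amounts);
        return users
            .Where(u => u.Foods.Any(f => wanted.Contains(f.Amount)))
            .Select(u => u with { Foods = u.Foods.Where(f => wanted.Contains(f.Amount)).ToList() })
            .ToList();
    }

    public static void Main()
    {
        var users = new List<User>
        {
            new("Tinwen", new List<Food>
            {
                new("Apple", 1), new("Banana", 2), new("Mango", 3), new("Strawberry", 3),
            }),
            new("Other", new List<Food> { new("Apple", 1) }),
        };
        foreach (var u in KeepAmounts(users, 2, 3))
            Console.WriteLine($"{u.Username}: {string.Join(", ", u.Foods.Select(f => f.Fruit))}");
    }
}
```

The same helper could also replace the manual removal loops from the question when the filtering has to stay on the client.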

Linq query returns duplicate results

The following query returns duplicate results, in the second select query.
Country has 0..1 to * relationship with leagues.
Leagues have 1 to * relationship with userLeagues.
return from ul in userLeagues
select new Map.Country
{
id = ul.Country.CountryId,
name = ul.Country.Common_Name,
leagues = userLeagues.Where(x => x.CountryId.Value == ul.CountryId.Value)
.Select(x => new Map.League
{
id = x.LeagueID,
name = x.leagueNameEN,
})
};
I tried using Distinct with no luck.
It seems that I have to use either Distinct or a GroupBy on CountryId.
The output looks like this:
[
{
"id": 1,
"name": "Europe",
"leagues": [
{
"id": 2,
"name": "Champions League"
},
{
"id": 3,
"name": "Europa league"
}
]
},
{
"id": 1,
"name": "Europe",
"leagues": [
{
"id": 2,
"name": "Champions League"
},
{
"id": 3,
"name": "Europa league"
}
]
}
]
You need to group it by CountryId and Common_Name to get the expected results:
var result = from ul in userLeagues
group ul by new { ul.Country.CountryId, ul.Country.Common_Name } into g
select new Map.Country
{
id = g.Key.CountryId,
name = g.Key.Common_Name,
leagues = g.Select(x => new Map.League
{
id = x.LeagueID,
name = x.leagueNameEN,
})
};
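The effect of grouping by a composite key can be tried out on plain in-memory data. In the sketch below, UserLeague, LeagueDto and CountryDto are simplified, hypothetical stand-ins for the entity and the Map classes from the question; anonymous types compare by value, which is why the two Europe rows land in one group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened row and result types.
public record UserLeague(int CountryId, string CountryName, int LeagueId, string LeagueName);
public record LeagueDto(int Id, string Name);
public record CountryDto(int Id, string Name, List<LeagueDto> Leagues);

public static class CountryGrouping
{
    // One CountryDto per distinct (CountryId, CountryName) pair.
    public static List<CountryDto> Build(IEnumerable<UserLeague> userLeagues) =>
        userLeagues
            .GroupBy(ul => new { ul.CountryId, ul.CountryName })
            .Select(g => new CountryDto(
                g.Key.CountryId,
                g.Key.CountryName,
                g.Select(ul => new LeagueDto(ul.LeagueId, ul.LeagueName)).ToList()))
            .ToList();

    public static void Main()
    {
        var rows = new[]
        {
            new UserLeague(1, "Europe", 2, "Champions League"),
            new UserLeague(1, "Europe", 3, "Europa league"),
        };
        foreach (var c in Build(rows))
            Console.WriteLine($"{c.Id} {c.Name}: {string.Join(", ", c.Leagues.Select(l => l.Name))}");
    }
}
```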
Think about what you're doing: For each league in userLeagues, you're creating a Map.Country for the country that league belongs to. If three leagues are in France, that's three Frances. France is a wonderful country, but let's not go overboard.
Instead, you want to start with a distinct list of countries. For each one, create one Map.Country, and give that Map.Country a list of the leagues that should belong to it.
First, let's make Country implement IEquatable<Country> for Distinct purposes:
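public class Country : IEquatable<Country>
{
public bool Equals(Country other)
{
return other != null && other.CountryId == CountryId;
}
// Distinct() looks at hash codes first, so they must be consistent with Equals.
public override bool Equals(object obj) => Equals(obj as Country);
public override int GetHashCode() => CountryId.GetHashCode();
}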
Second, you want to start with a distinct list of countries, and then populate them with leagues.
var q =
from ctry in userLeagues.Select(ul => ul.Country).Distinct()
select new
{
id = ctry.CountryId,
name = ctry.Common_Name,
leagues = userLeagues.Where(x => ctry.Equals(x.Country))
.Select(x => new
{
id = x.LeagueID,
name = x.leagueNameEN
}).ToList()
};
I didn't recreate your Map.League and Map.Country classes, I just used anonymous objects, and I left it that way because this code definitely works just as it is. But filling in your class names is trivial.
If it's not practical to make Country implement IEquatable<T>, just write a quick equality comparer and use that:
public class CountryComparer : IEqualityComparer<Country>
{
public bool Equals(Country x, Country y)
{
return x.CountryId == y.CountryId;
}
public int GetHashCode(Country obj)
{
return obj.CountryId.GetHashCode();
}
}
...like so:
var cc = new CountryComparer();
var q =
from ctry in userLeagues.Select(ul => ul.Country).Distinct(cc)
select new
{
id = ctry.CountryId,
name = ctry.Common_Name,
leagues = userLeagues.Where(x => cc.Equals(x.Country, ctry))
.Select(x => new
{
id = x.LeagueID,
name = x.leagueNameEN
}).ToList()
};
This is logically equivalent to a GroupBy, which is probably a more respectable way to do it. But somebody else thought of that before I did, so he earned the glory.
I would say that you need to reverse your query: instead of starting with userLeagues, start with the countries and include their child leagues.

Find pattern in json with json.net and linq

I'm searching a json file, with the following structure:
{
"objects": [
{
"name": "obj1",
"state": {
"type": 4,
"childs": [
{
"state": {
"type": 5,
...
The state can contain a state as a child, nested to any number of levels. Now I'm trying to find all objects containing a certain pattern of states, e.g. state 4 with child state 5 with child state 2.
My code so far is this.
JObject o = JObject.Parse(System.IO.File.ReadAllText(@"j.json"));
var oObjects=
from p in o["objects"]
where (string)p["state"]["type"] == "4"
select (string)p["name"];
How can I expand the code to find all objects containing the search pattern on any Level?
To make it work for an arbitrary number of levels, you will need a recursive method like the following:
void Main()
{
var str = @"{
""objects"": [
{
""name"": ""obj1"",
""state"": {
""type"": 4,
""childs"": [
{
""state"": {
""type"": 5
}
}
]
}
}
]
}";
var obj = JObject.Parse(str);
var names = GetValidObjects(obj, new string[] { "4", "5" }); // names of the objects that match the pattern
}
And the helper methods defined like:
public IEnumerable<string> GetValidObjects(JObject obj, IEnumerable<string> values)
{
return obj["objects"]
.Where(i => (string)i["state"]["type"] == values.First() && ContainsState((JArray)i["state"]["childs"], values.Skip(1)))
.Select(i => (string)i["name"]);
}
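public bool ContainsState(JArray childs, IEnumerable<string> values)
{
// every value of the pattern has been matched on the way down
if (!values.Any())
{
return true;
}
// values remain, but there are no child states left to match them
if (childs == null)
{
return false;
}
return childs.Any(i => (string)i["state"]["type"] == values.First() && ContainsState((JArray)i["state"]["childs"], values.Skip(1)));
}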
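The helpers above anchor the pattern at the top-level state of each object. If the pattern may start at any depth, as the question seems to ask, one possible extension is to retry the match at every nested state. The sketch below assumes well-formed input where every childs entry wraps a state object; StatePatternSearch, Find and the sample JSON are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public static class StatePatternSearch
{
    public const string SampleJson = @"{ ""objects"": [
      { ""name"": ""obj1"", ""state"": { ""type"": 1, ""childs"": [
          { ""state"": { ""type"": 4, ""childs"": [
              { ""state"": { ""type"": 5, ""childs"": [
                  { ""state"": { ""type"": 2 } } ] } } ] } } ] } },
      { ""name"": ""obj2"", ""state"": { ""type"": 4, ""childs"": [
          { ""state"": { ""type"": 5 } } ] } }
    ] }";

    // The nested states directly below the given state (empty when there is no childs array).
    private static IEnumerable<JToken> Children(JToken state) =>
        (state["childs"] as JArray ?? new JArray()).Select(c => c["state"]).Where(s => s != null);

    // True when pattern[index..] matches a chain that starts at this state.
    private static bool MatchesFrom(JToken state, IReadOnlyList<int> pattern, int index)
    {
        if (index == pattern.Count) return true; // only reached for an empty pattern
        if (state.Value<int?>("type") != pattern[index]) return false;
        if (index == pattern.Count - 1) return true; // last pattern entry matched
        return Children(state).Any(child => MatchesFrom(child, pattern, index + 1));
    }

    // Tries the pattern at this state, then at every state nested below it.
    private static bool ContainsPattern(JToken state, IReadOnlyList<int> pattern) =>
        state != null &&
        (MatchesFrom(state, pattern, 0) || Children(state).Any(c => ContainsPattern(c, pattern)));

    // Names of the objects whose state tree contains the pattern at any level.
    public static List<string> Find(string json, params int[] pattern) =>
        JObject.Parse(json)["objects"]
            .Where(o => ContainsPattern(o["state"], pattern))
            .Select(o => (string)o["name"])
            .ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Find(SampleJson, 4, 5, 2)));
        Console.WriteLine(string.Join(",", Find(SampleJson, 4, 5)));
    }
}
```

In the sample data, obj1 only contains the 4, 5, 2 chain below a top-level state of type 1, so a search anchored at the top level would miss it.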
Another option could be to convert the JSON to XML and then use an XPath query to obtain the list of nodes.
string json = System.IO.File.ReadAllText(@"j.json");
XmlDocument document = (XmlDocument)JsonConvert.DeserializeXmlNode(json);
XmlNodeList nodes = document.SelectNodes("//name[../state[type[.=4] and childs/state[type[.=5] and childs/state[type[.=2]]]]]");
You can use SelectTokens for this:
var objects = o.SelectTokens("$.objects[?(@.state.type == 4 && @.state.childs[*].state.type == 5)].name")
.Select(s => (string)s)
.ToList();
