XPath 2 states that the nodes of a selection should be returned in document order.
This does not seem to be the case when you call SelectTokens(JSONPath) in Json.NET.
When I process the following document:
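string json = @"
{
  ""Files"": {
    ""dir1"": {
      ""Files"": {
        ""file1.1.txt"": { ""size"": 100 },
        ""file1.2.txt"": { ""size"": 100 }
      }
    },
    ""dir2"": {
      ""Files"": {
        ""file2.1.txt"": { ""size"": 100 },
        ""file2.2.txt"": { ""size"": 100 }
      }
    },
    ""file3.txt"": { ""size"": 100 }
  }
}";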
The order is the following when using JSON.net SelectTokens("$..files.*")
When I expected the following order (as Xpath //files/*)
How should I write my query so that I get a list in XPath (document) order?
Short of modifying the Json.Net source code, there is not a way that I can see to directly control what order SelectTokens() returns its results. It appears to be using breadth-first ordering.
Instead of using SelectTokens(), you could use a LINQ-to-JSON query with the Descendants() method. This will return tokens in depth-first order. However, you would need to filter out the property names you are not interested in, like "Files" and "size".
string json = @"
{
  ""Files"": {
    ""dir1"": {
      ""Files"": {
        ""file1.1.txt"": { ""size"": 100 },
        ""file1.2.txt"": { ""size"": 100 }
      }
    },
    ""dir2"": {
      ""Files"": {
        ""file2.1.txt"": { ""size"": 100 },
        ""file2.2.txt"": { ""size"": 100 }
      }
    },
    ""file3.txt"": { ""size"": 100 }
  }
}";
JObject jo = JObject.Parse(json);
var files = jo.Descendants()
              .OfType<JProperty>()   // only properties have a Name
              .Select(p => p.Name)
              .Where(n => n != "Files" && n != "size")
              .ToList();
Console.WriteLine(string.Join("\n", files));
Fiddle: https://dotnetfiddle.net/yRAev4
If you don't like that idea, another possible solution is to use a custom IComparer<T> to sort the selected properties back into their original document order after the fact:
class JPropertyDocumentOrderComparer : IComparer<JProperty>
{
    public int Compare(JProperty x, JProperty y)
    {
        var xa = GetAncestors(x);
        var ya = GetAncestors(y);
        // Walk down from the root until the two ancestor chains diverge.
        for (int i = 0; i < xa.Count && i < ya.Count; i++)
        {
            if (!ReferenceEquals(xa[i], ya[i]))
            {
                return IndexInParent(xa[i]) - IndexInParent(ya[i]);
            }
        }
        // One property is an ancestor of the other: the shallower one comes first.
        return xa.Count - ya.Count;
    }

    private List<JProperty> GetAncestors(JProperty prop)
    {
        return prop.AncestorsAndSelf().OfType<JProperty>().Reverse().ToList();
    }

    private int IndexInParent(JProperty prop)
    {
        int i = 0;
        var parent = (JObject)prop.Parent;
        foreach (JProperty p in parent.Properties())
        {
            if (ReferenceEquals(p, prop)) return i;
            i++;
        }
        return -1;
    }
}
Use the comparer like this:
JObject jo = JObject.Parse(json);
var files = jo.SelectTokens("$..Files")
              .OfType<JObject>()   // Properties() is defined on JObject, not JToken
              .SelectMany(j => j.Properties())
              .OrderBy(p => p, new JPropertyDocumentOrderComparer())
              .Select(p => p.Name)
              .ToList();
Console.WriteLine(string.Join("\n", files));
Fiddle: https://dotnetfiddle.net/xhx7Kk
I have a List<string> that contains values such as ["1m", "1cm", "4km", "2cm"] (centimeters, meters and kilometers).
When I sort this list, I get the wrong order. I use OrderBy:
List<string> data = new List<string> { "1m", "1cm", "4km", "2cm" };
var result = data.OrderBy(x => x).ToList();
The result is:
{ "1cm", "1m", "2cm", "4km" }
But I want this order: { "1cm", "2cm", "1m", "4km" }
You have sorted the data alphabetically: the first characters are compared, then the second characters, and so on.
You need to normalize the data to a single unit (cm or m) and then sort by that value.
List<string> data = new List<string> { "1m", "1cm", "4km", "2cm" };
var result = data.OrderBy(x => LengthCm(x)).ToList();

// Converts a length string such as "4km" to centimeters.
public int LengthCm(string lengthStr)
{
    if (lengthStr.Contains("cm"))
    {
        string num = lengthStr.Split("cm")[0];
        return int.Parse(num);
    }
    else if (lengthStr.Contains("km"))
    {
        string num = lengthStr.Split("km")[0];
        return int.Parse(num) * 100 * 1000;
    }
    else if (lengthStr.Contains("m"))
    {
        string num = lengthStr.Split('m')[0];
        return int.Parse(num) * 100;
    }
    return 0;
}
then the result:
{ "1cm", "2cm", "1m", "4km"}
private string[] NormalizeArray(string[] inputArray)
{
    // Convert every entry to a plain centimeter count.
    for (int i = 0; i < inputArray.Length; i++)
    {
        if (inputArray[i].Contains("km"))
        {
            inputArray[i] = (float.Parse(inputArray[i].Split('k')[0]) * 100 * 1000).ToString();
        }
        else if (inputArray[i].Contains("cm"))
        {
            inputArray[i] = inputArray[i].Replace("cm", "");
        }
        else if (inputArray[i].Contains("m"))
        {
            inputArray[i] = (float.Parse(inputArray[i].Split('m')[0]) * 100).ToString();
        }
    }
    inputArray = inputArray.OrderBy(x => int.Parse(x)).ToArray();
    // Convert the sorted centimeter counts back to the largest fitting unit.
    for (int i = 0; i < inputArray.Length; i++)
    {
        if (int.Parse(inputArray[i]) >= 100 * 1000)
            inputArray[i] = (float.Parse(inputArray[i]) / (100 * 1000)).ToString() + "km";
        else if (int.Parse(inputArray[i]) >= 100)
            inputArray[i] = (float.Parse(inputArray[i]) / 100).ToString() + "m";
        else
            inputArray[i] = inputArray[i] + "cm";
    }
    return inputArray;
}
If you can, parse the strings first:
enum Unit { cm, m, km }

record Measurement(int Length, Unit Unit)
{
    public override string ToString() => $"{Length}{Enum.GetName(typeof(Unit), Unit)}";

    // Length expressed in meters.
    public double NormalizedLength => Unit switch
    {
        Unit.cm => Length * 0.01,
        Unit.m => Length * 1.0,
        Unit.km => Length * 1000.0,
        _ => throw new NotImplementedException()
    };

    public static Measurement Parse(string source)
    {
        var digits = source.TakeWhile(char.IsDigit).Count();
        var length = int.Parse(source.AsSpan(0, digits));
        // switches with source.AsSpan(digits) in preview
        var measure = source[digits..] switch
        {
            "cm" => Unit.cm,
            "m" => Unit.m,
            "km" => Unit.km,
            _ => throw new NotImplementedException(),
        };
        return new Measurement(length, measure);
    }
}

var result = data.Select(Measurement.Parse).OrderBy(x => x.NormalizedLength).ToList();
This lets you sort your measurements by NormalizedLength, and ToString gets back the original string. It should be very fast and simple to extend with new units, and you can make it fault-tolerant if you turn Parse into the TryParse pattern.
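A minimal sketch of what the TryParse pattern mentioned above might look like. The names LengthUnit, LengthParser and this TryParse signature are hypothetical and not part of the answer's code; the sketch only handles the three units from the question and returns false instead of throwing on malformed input.

```csharp
using System;
using System.Linq;

public enum LengthUnit { cm, m, km }

public static class LengthParser
{
    // Hypothetical helper: returns false instead of throwing when the input is malformed.
    public static bool TryParse(string source, out int length, out LengthUnit unit)
    {
        length = 0;
        unit = default;
        if (string.IsNullOrEmpty(source)) return false;

        // Count the leading digits, then parse them as the numeric part.
        int digits = source.TakeWhile(char.IsDigit).Count();
        if (digits == 0 || !int.TryParse(source.Substring(0, digits), out length))
            return false;

        // Match the unit suffix explicitly; anything else is rejected.
        switch (source.Substring(digits))
        {
            case "cm": unit = LengthUnit.cm; return true;
            case "m": unit = LengthUnit.m; return true;
            case "km": unit = LengthUnit.km; return true;
            default: return false;
        }
    }
}
```

Callers could then skip or report entries for which TryParse returns false before sorting the rest.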
There's a NuGet package to manage parsing and manipulating SI units called UnitsNet.
If you install that package (via Add | NuGet Package, search for and select UnitsNet and install it), then you can write the following code:
(You'll need to add using UnitsNet; at the top of the code file first)
This also works with nm etc.
List<string> data = new List<string> { "1m", "1cm", "4km", "2cm" };
var result = data.OrderBy(Length.Parse).ToList();
Console.WriteLine(string.Join(", ", result));
This will output "1cm, 2cm, 1m, 4km"
You need a custom sort using IComparable<T>:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication49
{
    class Program
    {
        static void Main(string[] args)
        {
            List<string> data = new List<string> { "1m", "1cm", "4km", "2cm" };
            List<string> results = data.Select(x => new SortDistance(x)).OrderBy(x => x).Select(x => x.value).ToList();
        }
    }

    public class SortDistance : IComparable<SortDistance>
    {
        const string pattern = @"(?'number'\d+)(?'multiplier'.*)";
        List<string> distanceOrder = new List<string>() { "cm", "m", "km" };
        public string value { get; set; }
        public int distance { get; set; }
        public string multiplier { get; set; }

        public SortDistance(string value)
        {
            this.value = value;
            Match match = Regex.Match(value, pattern);
            this.distance = int.Parse(match.Groups["number"].Value);
            this.multiplier = match.Groups["multiplier"].Value;
        }

        public int CompareTo(SortDistance other)
        {
            // Same unit: compare the numbers. Different units: compare by unit rank only.
            if (this.multiplier == other.multiplier)
                return this.distance.CompareTo(other.distance);
            return distanceOrder.IndexOf(this.multiplier).CompareTo(distanceOrder.IndexOf(other.multiplier));
        }
    }
}
You cannot sort these strings correctly with a plain OrderBy on the string values.
First define the conversion from every unit to the smallest unit, for example m to cm and km to cm,
so 1 m equals 100 cm.
Then iterate through your list, check each item's unit, and compute its equivalent in the smallest unit.
Create another list.
You can implement an insertion sort: keep inserting each item into the new list at the position determined by comparing the converted values.
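The steps above could be sketched roughly as follows. The class name LengthSorter, the CmPerUnit table and both method names are assumptions made for illustration; the sketch expects well-formed input such as "4km" and only knows the three units from the question.

```csharp
using System;
using System.Collections.Generic;

public static class LengthSorter
{
    // Conversion factors to centimeters, the smallest unit in the question.
    private static readonly Dictionary<string, long> CmPerUnit = new Dictionary<string, long>
    {
        { "cm", 1 }, { "m", 100 }, { "km", 100000 }
    };

    // Splits "4km" into the number 4 and the unit "km", then converts to centimeters.
    public static long ToCentimeters(string value)
    {
        int i = 0;
        while (i < value.Length && char.IsDigit(value[i])) i++;
        long number = long.Parse(value.Substring(0, i));
        return number * CmPerUnit[value.Substring(i)];
    }

    // Insertion sort into a new list, comparing the converted values.
    public static List<string> InsertionSort(IEnumerable<string> items)
    {
        var sorted = new List<string>();
        foreach (var item in items)
        {
            long key = ToCentimeters(item);
            int pos = sorted.Count;
            // Move left past every element that is larger than the new item.
            while (pos > 0 && ToCentimeters(sorted[pos - 1]) > key) pos--;
            sorted.Insert(pos, item);
        }
        return sorted;
    }
}
```

Insertion sort is quadratic, so for long lists an OrderBy with ToCentimeters as the key selector is likely the simpler choice.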
Suppose I have a list of strings [city01, city01002, state02, state03, city04, statebg, countryqw, countrypo]
How do I group them in a dictionary of <string, List<string>> like
city - [city01, city04, city01002]
state - [state02, state03, statebg]
country - [countryqw, countrypo]
If not code, can anyone please help with how to approach or proceed?
As shown in other answers, you can use LINQ's GroupBy method to create this grouping based on any condition you want. Before you can group your strings, you need to know the conditions under which a string belongs to a group. It could be that it starts with one of a set of predefined prefixes, that it is grouped by what comes before the first digit, or any other condition you can describe in code. In my code example, the GroupBy method calls another method for every string in your list; in that method you can place the code you need, returning the key under which the given string should be grouped. You can test this example online with dotnetfiddle: https://dotnetfiddle.net/UHNXvZ
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
    public static void Main()
    {
        List<string> ungroupedList = new List<string>() {"city01", "city01002", "state02", "state03", "city04", "statebg", "countryqw", "countrypo", "theFirstTown"};
        var groupedStrings = ungroupedList.GroupBy(x => groupingCondition(x));
        foreach (var a in groupedStrings) {
            Console.WriteLine("key: " + a.Key);
            foreach (var b in a) {
                Console.WriteLine("value: " + b);
            }
        }
    }

    // Returns the key under which the given string is grouped.
    public static string groupingCondition(String s) {
        if (s.StartsWith("city") || s.EndsWith("Town"))
            return "city";
        if (s.StartsWith("country"))
            return "country";
        if (s.StartsWith("state"))
            return "state";
        return "unknown";
    }
}
You can use LINQ:
var input = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
var output = input.GroupBy(c => string.Join("", c.TakeWhile(d => !char.IsDigit(d))
.Take(4))).ToDictionary(c => c.Key, c => c.ToList());
I suppose you have a list of reference prefixes that you are searching for in the list:
var list = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
var tofound = new List<string>() { "city", "state", "country" }; //references to found
var result = new Dictionary<string, List<string>>();
foreach (var f in tofound)
result.Add(f, list.FindAll(x => x.StartsWith(f)));
The result is the dictionary you wanted. If no values are found for a reference key, the value for that key is an empty list (FindAll never returns null).
Warning: This answer has a combinatorial expansion and will fail if your original string set is large. For 65 words I gave up after running for a couple of hours.
Using some IEnumerable extension methods to find Distinct sets and to find all possible combinations of sets, you can generate a group of prefixes and then group the original strings by these.
public static class IEnumerableExt {
    public static bool IsDistinct<T>(this IEnumerable<T> items) {
        var hs = new HashSet<T>();
        foreach (var item in items)
            if (!hs.Add(item))
                return false;
        return true;
    }

    public static bool IsEmpty<T>(this IEnumerable<T> items) => !items.Any();

    public static IEnumerable<IEnumerable<T>> AllCombinations<T>(this IEnumerable<T> start) {
        IEnumerable<IEnumerable<T>> HelperCombinations(IEnumerable<T> items) {
            if (items.IsEmpty())
                yield return items;
            else {
                var head = items.First();
                var tail = items.Skip(1);
                foreach (var sequence in HelperCombinations(tail)) {
                    yield return sequence; // Without first
                    yield return sequence.Prepend(head);
                }
            }
        }
        return HelperCombinations(start).Skip(1); // don't return the empty set
    }
}
// src is the original List<string> of words to group.
var keys = Enumerable.Range(0, src.Count - 1)
    .SelectMany(n1 => Enumerable.Range(n1 + 1, src.Count - n1 - 1).Select(n2 => new { n1, n2 }))
    .Select(n1n2 => new { s1 = src[n1n2.n1], s2 = src[n1n2.n2], Dist = src[n1n2.n1].TakeWhile((ch, n) => n < src[n1n2.n2].Length && ch == src[n1n2.n2][n]).Count() })
    .SelectMany(s1s2d => new[] { new { s = s1s2d.s1, s1s2d.Dist }, new { s = s1s2d.s2, s1s2d.Dist } })
    .Where(sd => sd.Dist > 0)
    .GroupBy(sd => sd.s.Substring(0, sd.Dist))
    .Select(sdg => sdg.Distinct())
    .AllCombinations() // the combinatorial step: every possible set of prefix groups
    .Where(sdgc => sdgc.Sum(sdg => sdg.Count()) == src.Count)
    .Where(sdgc => sdgc.SelectMany(sdg => sdg.Select(sd => sd.s)).IsDistinct())
    .OrderByDescending(sdgc => sdgc.Sum(sdg => sdg.First().Dist)).First()
    .Select(sdg => sdg.First())
    .Select(sd => sd.s.Substring(0, sd.Dist))
    .ToList();
var groups = src.GroupBy(s => keys.First(k => s.StartsWith(k)));
I have a flat JSON like below (I don't know what to call it, I hope that flat is the right word)
"address.street.name":"Narrow Street",
"books": [
"title":"The long story",
"title":"Money and morality",
Please notice that the fields are not in sorted order.
I want to convert it into a nested JSON like below:
"name":"Narrow Street"
"books": [
"title":"The long story",
"author": {
"title":"Money and morality",
What is a good algorithm to convert it?
I am a C# person, I intend to use Newtonsoft.Json to parse the input JSON to a JObject, then iterate through all fields to check their keys and create nested JObjects. For arrays, I repeat the same process for every array item.
Do you have any better idea?
This is my solution for those who are interested.
public static string ConvertFlatJson(string input)
{
    var token = JToken.Parse(input);
    if (token is JObject obj)
    {
        return ConvertJObject(obj).ToString();
    }
    if (token is JArray array)
    {
        return ConvertArray(array).ToString();
    }
    return input;
}

private static JObject ConvertJObject(JObject input)
{
    var enumerable = ((IEnumerable<KeyValuePair<string, JToken>>)input).OrderBy(kvp => kvp.Key);
    var result = new JObject();
    foreach (var outerField in enumerable)
    {
        var key = outerField.Key;
        var value = outerField.Value;
        if (value is JArray array)
        {
            value = ConvertArray(array);
        }
        // Walk the dotted path, creating intermediate objects as needed.
        var fieldNames = key.Split('.');
        var currentObj = result;
        for (var fieldNameIndex = 0; fieldNameIndex < fieldNames.Length; fieldNameIndex++)
        {
            var fieldName = fieldNames[fieldNameIndex];
            if (fieldNameIndex == fieldNames.Length - 1)
            {
                currentObj[fieldName] = value;
                continue;
            }
            if (currentObj.ContainsKey(fieldName))
            {
                currentObj = (JObject)currentObj[fieldName];
                continue;
            }
            var newObj = new JObject();
            currentObj[fieldName] = newObj;
            currentObj = newObj;
        }
    }
    return result;
}

private static JArray ConvertArray(JArray array)
{
    var resultArray = new JArray();
    foreach (var arrayItem in array)
    {
        if (!(arrayItem is JObject))
        {
            // Non-object items (strings, numbers, ...) are copied unchanged.
            resultArray.Add(arrayItem);
            continue;
        }
        var itemObj = (JObject)arrayItem;
        resultArray.Add(ConvertJObject(itemObj));
    }
    return resultArray;
}
I am trying to find the duplicate elements in a list of XElement and write a generic function to remove them. Something like:
public List<XElement> RemoveDuplicatesFromXml(List<XElement> xele)
{ // pass the XElement list as the argument and get the list back after deleting the duplicate entries.
    return xele;
}
the xml is as follows:
<Execute ID="7300" Attrib1="xyz" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
<Execute ID="7301" Attrib1="xyz" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
<Execute ID="7302" Attrib1="xyz1" Attrib2="abc" Attrib3="mno" Attrib4="pqr" Attrib5="BCD" />
I want to find duplicates based on every attribute except ID, and then delete the one with the lesser ID.
You can implement custom IEqualityComparer for this task
class XComparer : IEqualityComparer<XElement>
{
    public IList<string> _exceptions;

    public XComparer(params string[] exceptions)
    {
        _exceptions = new List<string>(exceptions);
    }

    public bool Equals(XElement a, XElement b)
    {
        var attA = a.Attributes().ToList();
        var attB = b.Attributes().ToList();
        var setA = AttributeNames(attA);
        var setB = AttributeNames(attB);
        if (!setA.SetEquals(setB))
        {
            return false;
        }
        foreach (var e in setA)
        {
            var xa = attA.First(x => x.Name.LocalName == e);
            var xb = attB.First(x => x.Name.LocalName == e);
            if (xa.Value == null && xb.Value == null)
                continue;
            if (xa.Value == null || xb.Value == null)
                return false;
            if (!xa.Value.Equals(xb.Value))
                return false;
        }
        return true;
    }

    // Local names of all attributes except the excluded ones.
    private HashSet<string> AttributeNames(IList<XAttribute> e)
    {
        return new HashSet<string>(e.Select(x => x.Name.LocalName).Except(_exceptions));
    }

    public int GetHashCode(XElement e)
    {
        var h = 0;
        var atts = e.Attributes().ToList();
        var names = AttributeNames(atts);
        foreach (var a in names)
        {
            var xa = atts.First(x => x.Name.LocalName == a);
            if (xa.Value != null)
            {
                h = h ^ xa.Value.GetHashCode();
            }
        }
        return h;
    }
}
var comp = new XComparer("ID");
var distXEle = xele.Distinct(comp);
Please note that the IEqualityComparer implementation in this answer only compares LocalName and doesn't take the namespace into consideration. If an element has several attributes with the same local name (in different namespaces), this implementation will take the first one.
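If namespaces matter, one possible variation is to compare on the full XName instead of the local name. The following is only a sketch under that assumption; AttributeKeySketch and BuildKey are hypothetical names, not part of the answer above, and the "|" separator assumes attribute values do not contain that character.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class AttributeKeySketch
{
    // Hypothetical helper: builds a comparison key from all attributes except
    // the excluded ones, using the namespace-qualified XName of each attribute.
    public static string BuildKey(XElement element, params XName[] excluded)
    {
        var parts = element.Attributes()
            .Where(a => !a.IsNamespaceDeclaration && !excluded.Contains(a.Name))
            .OrderBy(a => a.Name.ToString(), StringComparer.Ordinal)
            .Select(a => a.Name + "=" + a.Value);
        return string.Join("|", parts);
    }
}
```

Two elements could then be treated as duplicates when BuildKey(a, "ID") equals BuildKey(b, "ID"), for example as the key selector of a GroupBy.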
You can see the demo here : https://dotnetfiddle.net/w2DteS
If you want to "delete the one having lesser ID", that means you want to keep the element with the largest ID. In that case you can chain a .Select call with the .Distinct call:
var comp = new XComparer("ID");
var distXEle = xele
    .Select(z => xele
        .Where(a => comp.Equals(z, a))
        .OrderByDescending(a => int.Parse(a.Attribute("ID").Value))
        .First())
    .Distinct(comp);
This guarantees that you get the element with the largest ID.
Use Linq GroupBy
var doc = XDocument.Parse(yourXmlString);
var groups = doc.Root
    .Elements("Execute") // assumes the Execute elements are direct children of the root
    .GroupBy(element => new
    {
        Attrib1 = element.Attribute("Attrib1").Value,
        Attrib2 = element.Attribute("Attrib2").Value,
        Attrib3 = element.Attribute("Attrib3").Value,
        Attrib4 = element.Attribute("Attrib4").Value,
        Attrib5 = element.Attribute("Attrib5").Value
    });
var duplicates = groups.SelectMany(group =>
{
    if (group.Count() == 1) // remove this if you want only duplicates
    {
        return group;
    }
    int minId = group.Min(element => int.Parse(element.Attribute("ID").Value));
    return group.Where(element => int.Parse(element.Attribute("ID").Value) > minId);
});
The solution above removes, from each group of elements with identical attributes, the element with the lowest ID.
If you want to return only the elements that have duplicates, remove the if branch from the last lambda.
I have the below class:
public class FactoryOrder
{
    public string Text { get; set; }
    public int OrderNo { get; set; }
}
and collection holding the list of FactoryOrders
here is the sample data
My requirement is to merge the Text of FactoryOrders where orderNo are in sequence and retain the lower orderNo for the merged FactoryOrder
- so the resulting output will be
FactoryOrder("Apple Orange",20) //Merged Apple and Orange and retained Lower OrderNo 20
FactoryOrder("Grapes mango Cherry",71)//Merged Grapes,Mango,cherry and retained Lower OrderNo 71
I am new to Linq so not sure how to go about this. Any help or pointers would be appreciated
As commented, if your logic depends so heavily on consecutive items, LINQ is not the easiest approach. Use a simple loop.
You could order them first with LINQ: orders.OrderBy(x => x.OrderNo)
var consecutiveOrdernoGroups = new List<List<FactoryOrder>> { new List<FactoryOrder>() };
FactoryOrder lastOrder = null;
foreach (FactoryOrder order in orders.OrderBy(o => o.OrderNo))
{
    if (lastOrder == null || lastOrder.OrderNo == order.OrderNo - 1)
        consecutiveOrdernoGroups.Last().Add(order); // continues the current run
    else
        consecutiveOrdernoGroups.Add(new List<FactoryOrder> { order }); // starts a new run
    lastOrder = order;
}
Now you just need to build the list of FactoryOrder with the joined names for every group. This is where LINQ and String.Join can come in handy:
orders = consecutiveOrdernoGroups
    .Select(list => new FactoryOrder
    {
        Text = String.Join(" ", list.Select(o => o.Text)),
        OrderNo = list.First().OrderNo // is the minimum number
    })
    .ToList();
Result with your sample:
I'm not sure this can be done using a single comprehensible LINQ expression. What would work is a simple enumeration:
private static IEnumerable<FactoryOrder> Merge(IEnumerable<FactoryOrder> orders)
{
    var enumerator = orders.OrderBy(x => x.OrderNo).GetEnumerator();
    FactoryOrder previousOrder = null;
    FactoryOrder mergedOrder = null;
    while (enumerator.MoveNext())
    {
        var current = enumerator.Current;
        if (mergedOrder == null)
        {
            mergedOrder = new FactoryOrder(current.Text, current.OrderNo);
        }
        else if (current.OrderNo == previousOrder.OrderNo + 1)
        {
            mergedOrder.Text += " " + current.Text;
        }
        else
        {
            // The run of consecutive numbers ended: emit it and start a new one.
            yield return mergedOrder;
            mergedOrder = new FactoryOrder(current.Text, current.OrderNo);
        }
        previousOrder = current;
    }
    if (mergedOrder != null)
        yield return mergedOrder;
}
This assumes FactoryOrder has a constructor accepting Text and OrderNo.
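For illustration, the class from the question could be extended along these lines to provide such a constructor. This is a sketch, not the asker's actual class; the parameterless constructor is an added assumption so that object-initializer syntax keeps working.

```csharp
public class FactoryOrder
{
    public string Text { get; set; }
    public int OrderNo { get; set; }

    // Keeps "new FactoryOrder { Text = ..., OrderNo = ... }" compiling.
    public FactoryOrder() { }

    // Constructor assumed by the Merge method: text first, then the order number.
    public FactoryOrder(string text, int orderNo)
    {
        Text = text;
        OrderNo = orderNo;
    }
}
```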
Linq implementation using side effects:
var groupId = 0;
var previous = Int32.MinValue;
var grouped = GetItems()
    .OrderBy(x => x.OrderNo)
    .Select(x =>
    {
        // Start a new group whenever the number does not follow the previous one.
        var @group = x.OrderNo != previous + 1 ? (groupId = x.OrderNo) : groupId;
        previous = x.OrderNo;
        return new
        {
            GroupId = @group,
            Item = x
        };
    })
    .GroupBy(x => x.GroupId)
    .Select(x => new FactoryOrder(
        String.Join(" ", x.Select(y => y.Item.Text).ToArray()),
        x.Key)) // the group key is the lowest OrderNo of the run
    .ToArray();
foreach (var item in grouped)
{
    Console.WriteLine(item.Text + "\t" + item.OrderNo);
}
Apple Orange 20
WaterMelon 42
JackFruit 51
Grapes mango Cherry 71
Or, eliminate the side effects by using a generator extension method
public static class IEnumerableExtensions
{
    public static IEnumerable<IList<T>> MakeSets<T>(this IEnumerable<T> items, Func<T, T, bool> areInSameGroup)
    {
        var result = new List<T>();
        foreach (var item in items)
        {
            if (!result.Any() || areInSameGroup(result[result.Count - 1], item))
            {
                result.Add(item);
                continue;
            }
            yield return result;
            result = new List<T> { item };
        }
        if (result.Any())
        {
            yield return result;
        }
    }
}
and your implementation becomes
var grouped = GetItems()
    .OrderBy(x => x.OrderNo)
    .MakeSets((prev, next) => next.OrderNo == prev.OrderNo + 1)
    .Select(x => new FactoryOrder(
        String.Join(" ", x.Select(y => y.Text).ToArray()),
        x.First().OrderNo)); // first item of a sorted run has the lowest OrderNo
foreach (var item in grouped)
{
    Console.WriteLine(item.Text + "\t" + item.OrderNo);
}
The output is the same but the code is easier to follow and maintain.
LINQ + sequential processing = Aggregate.
That does not mean that using Aggregate is always the best option. Sequential processing in a for(each) loop usually makes for more readable code (see Tim's answer). Anyway, here's a pure LINQ solution.
It loops through the orders and first collects them in a dictionary having the first Id of consecutive orders as Key, and a collection of orders as Value. Then it produces a result using string.Join:
class FactoryOrder
{
    public FactoryOrder(int id, string name)
    {
        this.Id = id;
        this.Name = name;
    }
    public int Id { get; set; }
    public string Name { get; set; }
}
The program:
IEnumerable<FactoryOrder> orders =
    new[]
    {
        new FactoryOrder(20, "Apple"),
        new FactoryOrder(21, "Orange"),
        new FactoryOrder(22, "Pear"),
        new FactoryOrder(42, "WaterMelon"),
        new FactoryOrder(51, "JackFruit"),
        new FactoryOrder(71, "Grapes"),
        new FactoryOrder(72, "Mango"),
        new FactoryOrder(73, "Cherry"),
    };
var result = orders.OrderBy(t => t.Id).Aggregate(new Dictionary<int, List<FactoryOrder>>(),
    (dir, curr) =>
    {
        // Highest Id collected so far, or -1 when the dictionary is still empty.
        var prevId = dir.SelectMany(d => d.Value.Select(v => v.Id))
                        .OrderBy(i => i).DefaultIfEmpty(-1)
                        .LastOrDefault();
        var newKey = dir.Select(d => d.Key).OrderBy(i => i).LastOrDefault();
        if (prevId == -1 || curr.Id - prevId > 1)
        {
            newKey = curr.Id;
        }
        if (!dir.ContainsKey(newKey))
        {
            dir[newKey] = new List<FactoryOrder>();
        }
        dir[newKey].Add(curr);
        return dir;
    }, c => c)
    .Select(t => new
    {
        t.Key,
        Items = string.Join(" ", t.Value.Select(v => v.Name))
    }).ToList();
As you see, it's not really straightforward what happens here, and chances are that it performs badly when there are "many" items, because the growing dictionary is accessed over and over again.
Which is a long-winded way to say: don't use Aggregate.
I just coded a method; it is compact and performs reasonably well. It assumes the list is already sorted by OrderNo:
static List<FactoryOrder> MergeValues(List<FactoryOrder> dirtyList)
{
    FactoryOrder[] temp1 = dirtyList.ToArray();
    int index = -1;
    for (int i = 1; i < temp1.Length; i++)
    {
        if (temp1[i].OrderNo - temp1[i - 1].OrderNo != 1) { index = -1; continue; }
        if (index == -1) index = dirtyList.IndexOf(temp1[i - 1]);
        dirtyList[index].Text += " " + temp1[i].Text;
        dirtyList.Remove(temp1[i]); // drop the order that was merged into its predecessor
    }
    return dirtyList;
}