Group list of strings with common prefixes

Group list of strings with common prefixes - c#

Suppose I have a list of strings: [city01, city01002, state02, state03, city04, statebg, countryqw, countrypo].
How do I group them into a Dictionary<string, List<string>> like this?
city - [city01, city04, city01002]
state - [state02, state03, statebg]
country - [countryqw, countrypo]
If not code, can anyone please help with how to approach this?

As shown in other answers, you can use LINQ's GroupBy method to create this grouping based on any condition you want. Before you can group your strings, you need to decide the rule that assigns a string to a group. It could be that the string starts with one of a set of predefined prefixes, that the key is whatever comes before the first digit, or any other condition you can describe in code. In my example, GroupBy calls another method for every string in the list; that method returns the key under which the string is grouped, so you can put whatever logic you need there. You can test this example online with dotnetfiddle: https://dotnetfiddle.net/UHNXvZ
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
List<string> ungroupedList = new List<string>() {"city01", "city01002", "state02", "state03", "city04", "statebg", "countryqw", "countrypo", "theFirstTown"};
var groupedStrings = ungroupedList.GroupBy(x => groupingCondition(x));
foreach (var a in groupedStrings) {
Console.WriteLine("key: " + a.Key);
foreach (var b in a) {
Console.WriteLine("value: " + b);
}
}
}
public static string groupingCondition(String s) {
if(s.StartsWith("city") || s.EndsWith("Town"))
return "city";
if(s.StartsWith("country"))
return "country";
if(s.StartsWith("state"))
return "state";
return "unknown";
}
}

You can use LINQ:
var input = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
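// Key = the leading non-digit characters, capped at 4 so that "statebg" lands with "state02".
// Note that the cap also truncates the keys: this yields "city", "stat" and "coun".
var output = input.GroupBy(c => string.Join("", c.TakeWhile(d => !char.IsDigit(d))
.Take(4))).ToDictionary(c => c.Key, c => c.ToList());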

I suppose you have a list of known prefixes that you want to look for in the list:
var list = new List<string>()
{ "city01", "city01002", "state02",
"state03", "city04", "statebg", "countryqw", "countrypo" };
var tofound = new List<string>() { "city", "state", "country" }; // prefixes to look for
var result = new Dictionary<string, List<string>>();
foreach (var f in tofound)
{
result.Add(f, list.FindAll(x => x.StartsWith(f)));
}
The result is the dictionary you want. If no strings match a prefix, the value for that key is an empty list (FindAll returns an empty list, not null).
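As a variation on the known-prefix idea, here is a minimal sketch that builds the dictionary with a single GroupBy. It assumes C# 9 or later (top-level statements). The names prefixes and ordered and the "unknown" fallback key are illustrative choices, not part of the answer above. Unlike the loop above, a prefix with no matches does not appear as a key, and strings that match no prefix are collected under "unknown".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var input = new List<string>
{
    "city01", "city01002", "state02", "state03",
    "city04", "statebg", "countryqw", "countrypo"
};

// Assumed set of known prefixes (hypothetical; adjust to your data).
var prefixes = new[] { "city", "state", "country" };

// Try longer prefixes first so that overlapping prefixes resolve to the most specific one.
var ordered = prefixes.OrderByDescending(p => p.Length).ToList();

// The key is the first matching prefix, or "unknown" when nothing matches.
Dictionary<string, List<string>> grouped = input
    .GroupBy(s => ordered.FirstOrDefault(p => s.StartsWith(p, StringComparison.Ordinal)) ?? "unknown")
    .ToDictionary(g => g.Key, g => g.ToList());

foreach (var pair in grouped)
    Console.WriteLine(pair.Key + " - [" + string.Join(", ", pair.Value) + "]");
```

Within each group the strings keep the order they had in the input list.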

Warning: This answer has a combinatorial expansion and will fail if your original string set is large. For 65 words I gave up after running for a couple of hours.
Using some IEnumerable extension methods to find Distinct sets and to find all possible combinations of sets, you can generate a group of prefixes and then group the original strings by these.
public static class IEnumerableExt {
public static bool IsDistinct<T>(this IEnumerable<T> items) {
var hs = new HashSet<T>();
foreach (var item in items)
if (!hs.Add(item))
return false;
return true;
}
public static bool IsEmpty<T>(this IEnumerable<T> items) => !items.Any();
public static IEnumerable<IEnumerable<T>> AllCombinations<T>(this IEnumerable<T> start) {
IEnumerable<IEnumerable<T>> HelperCombinations(IEnumerable<T> items) {
if (items.IsEmpty())
yield return items;
else {
var head = items.First();
var tail = items.Skip(1);
foreach (var sequence in HelperCombinations(tail)) {
yield return sequence; // Without first
yield return sequence.Prepend(head);
}
}
}
return HelperCombinations(start).Skip(1); // don't return the empty set
}
}
// src is the original List<string> of input strings
var keys = Enumerable.Range(0, src.Count - 1)
.SelectMany(n1 => Enumerable.Range(n1 + 1, src.Count - n1 - 1).Select(n2 => new { n1, n2 }))
.Select(n1n2 => new { s1 = src[n1n2.n1], s2 = src[n1n2.n2], Dist = src[n1n2.n1].TakeWhile((ch, n) => n < src[n1n2.n2].Length && ch == src[n1n2.n2][n]).Count() })
.SelectMany(s1s2d => new[] { new { s = s1s2d.s1, s1s2d.Dist }, new { s = s1s2d.s2, s1s2d.Dist } })
.Where(sd => sd.Dist > 0)
.GroupBy(sd => sd.s.Substring(0, sd.Dist))
.Select(sdg => sdg.Distinct())
.AllCombinations()
.Where(sdgc => sdgc.Sum(sdg => sdg.Count()) == src.Count)
.Where(sdgc => sdgc.SelectMany(sdg => sdg.Select(sd => sd.s)).IsDistinct())
.OrderByDescending(sdgc => sdgc.Sum(sdg => sdg.First().Dist)).First()
.Select(sdg => sdg.First())
.Select(sd => sd.s.Substring(0, sd.Dist))
.ToList();
var groups = src.GroupBy(s => keys.First(k => s.StartsWith(k)));

Related

How do I pick out values between a duplicate value in a collection?

I have a method that returns a collection that has a duplicate value.
static List<string> GenerateItems()
{
var _items = new List<string>();
_items.Add("Tase");
_items.Add("Ray");
_items.Add("Jay");
_items.Add("Bay");
_items.Add("Tase");
_items.Add("Man");
_items.Add("Ran");
_items.Add("Ban");
return _items;
}
I want to search through that collection and find the first place that duplicate value is located and start collecting all the values from the first appearance of the duplicate value to its next appearance. I want to put this in a collection but I only want the duplicate value to appear once in that collection.
This is what I have so far:
static void Main(string[] args)
{
string key = "Tase";
var collection = GenerateItems();
int index = collection.FindIndex(a => a == key);
var matchFound = false;
var itemsBetweenKey = new List<string>();
foreach (var item in collection)
{
if (item == key)
{
matchFound = !matchFound;
}
if (matchFound)
{
itemsBetweenKey.Add(item);
}
}
foreach (var item in itemsBetweenKey)
{
Console.WriteLine(item);
}
Console.ReadLine();
}
There must be an easier way of doing this. Perhaps with Indexing or a LINQ query?

You can do something like this:
string key = "Tase";
var collection = GenerateItems();
int indexStart = collection.FindIndex(a => a == key);
int indexEnd = collection.FindIndex(indexStart+1, a => a == key);
var result = collection.GetRange(indexStart, indexEnd-indexStart);
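The snippet above assumes the key occurs at least twice; FindIndex returns -1 when there is no match, and GetRange would then throw. A minimal sketch that guards both lookups might look like this. It assumes C# 9 or later (top-level statements), and ItemsBetween is a hypothetical helper name, not something from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Returns the items from the first occurrence of key up to (not including)
// its second occurrence, or an empty list when key does not occur twice.
static List<string> ItemsBetween(List<string> items, string key)
{
    int start = items.FindIndex(a => a == key);
    if (start < 0) return new List<string>();

    int end = items.FindIndex(start + 1, a => a == key);
    if (end < 0) return new List<string>();

    return items.GetRange(start, end - start);
}

var collection = new List<string> { "Tase", "Ray", "Jay", "Bay", "Tase", "Man", "Ran", "Ban" };
var result = ItemsBetween(collection, "Tase");
Console.WriteLine(string.Join(", ", result)); // Tase, Ray, Jay, Bay
```

Returning an empty list instead of null is a design choice here; it lets callers enumerate the result without a null check.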

You can use LINQ's Select and GroupBy to find the first and last index of every duplicate. (Keep in mind that if a value appears in the list more than twice, the middle occurrences are ignored.)
Personally, though, I think the LINQ for this is overcomplicated. I would stick with simple for loops and if statements (just turn it into a method so it reads better).
Here is a LINQ solution that finds every duplicated value and, for each one, collects the values from its first occurrence up to (but not including) its last occurrence, so the duplicate itself appears once, as you mentioned.
var collection = GenerateItems();
var Duplicates = collection.Select((x, index) => new { index, value = x })
.GroupBy(x => x.value) // group by the strings
.Where(x => x.Count() > 1) // only take duplicates
.Select(x => new {
Value = x.Key,
FirstIndex = x.Min(y => y.index), // first occurrence
LastIndex = x.Max(y => y.index) // last occurrence
}).ToList();
var resultlist = new List<List<string>>();
foreach (var duplicaterange in Duplicates)
resultlist.Add(collection.GetRange(duplicaterange.FirstIndex, duplicaterange.LastIndex - duplicaterange.FirstIndex));

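Try this function:
public List<string> PickOut(List<string> collection, string key)
{
    // Find the first occurrence of the key.
    var index = collection.IndexOf(key);
    if (index < 0)
    {
        return null;
    }
    // Keep the key once, then take everything up to its next occurrence.
    // (Starting TakeWhile on the key itself would stop immediately.)
    var result = new List<string> { key };
    result.AddRange(collection.Skip(index + 1).TakeWhile(x => x != key));
    return result;
}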

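First find the duplicated key, then find the index of its second occurrence, and take everything before it. Note that Take starts from the beginning of the list, so this assumes the first duplicate is also the first element, as in the sample data.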
var firstduplicate = collection.GroupBy(x => x)
.Where(g => g.Count() > 1)
.Select(g => g.Key).First();
var indices = collection.Select((b, i) => b == firstduplicate ? i : -1).Where(i => i != -1).Skip(1).FirstOrDefault();
if (indices>0)
{
var result = collection.Take(indices).ToList();
}

How to get distinct values with corresponding data from IEnumerable

I need to be able to return only the records that have a unique AccessionNumber, each with its corresponding LoginId, so that at the end the data looks something like:
A1,L1
A2,L1
A3,L2
However, my issue is with this line of code, because Distinct() returns an IEnumerable of string and not an IEnumerable of string[]. Therefore, the compiler complains that string does not contain a definition for AccessionNumber or LoginId.
yield return new[] { record.AccessionNumber, record.LoginId };
This is the code that I am trying to execute:
internal static IEnumerable<string[]> GetTestDataForSpecificItemType(ItemTypes itemTypeCode)
{
IEnumerable<StudentAssessmentTestData> data = DataGetter.GetTestData("MyTestData");
data = data.Where(x => x.ItemTypeCode.Trim() == itemTypeCode.ToString());
var z = data.Select(x => x.AccessionNumber).Distinct();
foreach (var record in z)
{
yield return new[] { record.AccessionNumber, record.LoginId };
}
}

That's because you are selecting only the AccessionNumber property here:
var z = data.Select(x => x.AccessionNumber).Distinct();
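You probably want to select the entire StudentAssessmentTestData record instead. Note that Distinct() on whole records uses the type's own equality, so for a class it only removes duplicates if Equals and GetHashCode are overridden: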
data = data.Where(x => x.ItemTypeCode.Trim() == itemTypeCode.ToString()).Distinct();
foreach (var record in data)
{
yield return new[] { record.AccessionNumber, record.LoginId };
}

Instead of using Distinct, use GroupBy. This:
var z = data.Select(x => x.AccessionNumber).Distinct();
foreach (var record in z)
{
yield return new[] { record.AccessionNumber, record.LoginId };
}
should be something like this:
return data.GroupBy(x => x.AccessionNumber)
.Select(r => new[] { r.Key, r.First().LoginId });
The GroupBy() call ensures there is only one entry per AccessionNumber, and First() ensures that only the first LoginId with that AccessionNumber is returned. Returning string arrays (rather than an anonymous type) matches the method's IEnumerable<string[]> return type.
This assumes that your data is sorted in such a way that, if there are multiple logins with the same AccessionNumber, the first login is the correct one.

If you want to choose distinct values based on a certain property you can do it in several ways.
If it is always the same property you wish to use for comparison, you can override the Equals and GetHashCode methods in the StudentAssessmentTestData class, which lets Distinct recognize how instances differ from each other; an example can be found in this question.
However, you can also implement a custom IEqualityComparer<T>, for example the following version:
// Custom comparer taking generic input parameter and a delegate function to do matching
public class CustomComparer<T> : IEqualityComparer<T> {
private readonly Func<T, object> _match;
public CustomComparer(Func<T, object> match) {
_match = match;
}
// compares the values that the delegate returns for the two arguments
public bool Equals(T data1, T data2) {
return object.Equals(_match(data1), _match(data2));
}
// overly simplistic implementation
public int GetHashCode(T data) {
var matchValue = _match(data);
if (matchValue == null) {
return 42.GetHashCode();
}
return matchValue.GetHashCode();
}
}
This class can then be used as an argument for the Distinct function, for example in this way
// compare by accession number
var accessComparer = new CustomComparer<StudentAssessmentTestData>(d => d.AccessionNumber);
// compare by login id
var loginComparer = new CustomComparer<StudentAssessmentTestData>(d => d.LoginId);
foreach (var d in data.Distinct( accessComparer )) {
Console.WriteLine( "{0}, {1}", d.AccessionNumber, d.LoginId);
}
foreach (var d in data.Distinct( loginComparer )) {
Console.WriteLine( "{0}, {1}", d.AccessionNumber, d.LoginId);
}
You can find a full example in this dotnetfiddle.

Add a LINQ extension method, DistinctBy, as below:
public static class LinqExtensions
{
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
HashSet<TKey> seenKeys = new HashSet<TKey>();
foreach (TSource element in source)
{
if (seenKeys.Add(keySelector(element)))
{
yield return element;
}
}
}
}
Use it in your code like this:
var z = data.DistinctBy(x => x.AccessionNumber);
internal static IEnumerable<string[]> GetTestDataForSpecificItemType(ItemTypes itemTypeCode)
{
IEnumerable<StudentAssessmentTestData> data = DataGetter.GetTestData("MyTestData");
data = data.Where(x => x.ItemTypeCode.Trim() == itemTypeCode.ToString());
var z = data.DistinctBy(x => x.AccessionNumber);
foreach (var record in z)
{
yield return new[] { record.AccessionNumber, record.LoginId };
}
}
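On .NET 6 and later, Enumerable.DistinctBy is part of System.Linq, so a hand-written helper with the same name is only needed on older frameworks (and would have to be dropped or renamed on newer ones to avoid an ambiguous call). The following is a minimal sketch under that assumption, using C# 9 top-level statements and hypothetical tuple rows in place of StudentAssessmentTestData.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample rows standing in for StudentAssessmentTestData.
var data = new List<(string AccessionNumber, string LoginId)>
{
    ("A1", "L1"), ("A1", "L2"), ("A2", "L1"), ("A3", "L2"), ("A3", "L3"),
};

// Built-in Enumerable.DistinctBy (.NET 6+) keeps the first element seen for each key.
List<string[]> rows = data
    .DistinctBy(r => r.AccessionNumber)
    .Select(r => new[] { r.AccessionNumber, r.LoginId })
    .ToList();

foreach (var row in rows)
    Console.WriteLine(string.Join(",", row));
// A1,L1
// A2,L1
// A3,L2
```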

This is the code that finally worked:
internal static IEnumerable<string[]> GetTestDataForSpecificItemType(ItemTypes itemTypeCode)
{
var data = DataGetter.GetTestData("MyTestData");
data = data.Where(x => x.ItemTypeCode.Trim() == itemTypeCode.ToString());
var z = data.GroupBy(x => new{x.AccessionNumber})
.Select(x => new StudentAssessmentTestData(){ AccessionNumber = x.Key.AccessionNumber, LoginId = x.FirstOrDefault().LoginId});
foreach (var record in z)
{
yield return new[] { record.AccessionNumber, record.LoginId };
}
}
This returns a sequence that looks similar to this:
Acc1, Login1
Acc2, Login1
Acc3, Login2
Acc4, Login1
Acc5, Login3

You can try this. It works for me.
IEnumerable<StudentAssessmentTestData> data = DataGetter.GetTestData("MyTestData");
data = data.Where(x => x.ItemTypeCode.Trim() == itemTypeCode.ToString());
var z = data.GroupBy(x => x.AccessionNumber).SelectMany(y => y.Take(1));
foreach (var record in z)
{
yield return new[] { record.AccessionNumber, record.LoginId };
}

I'm not 100% sure what you're asking. You want either (1) only records with a unique AccessionNumber, so that if two or more records share an AccessionNumber none of them is returned, or (2) only the first record for each AccessionNumber.
Here are both options:
(1)
internal static IEnumerable<string[]> GetTestDataForSpecificItemType(ItemTypes itemTypeCode)
{
return
DataGetter
.GetTestData("MyTestData")
.Where(x => x.ItemTypeCode.Trim() == itemTypeCode.ToString())
.GroupBy(x => x.AccessionNumber)
.Where(x => !x.Skip(1).Any())
.SelectMany(x => x)
.Select(x => new [] { x.AccessionNumber, x.LoginId });
}
(2)
internal static IEnumerable<string[]> GetTestDataForSpecificItemType(ItemTypes itemTypeCode)
{
return
DataGetter
.GetTestData("MyTestData")
.Where(x => x.ItemTypeCode.Trim() == itemTypeCode.ToString())
.GroupBy(x => x.AccessionNumber)
.SelectMany(x => x.Take(1))
.Select(x => new [] { x.AccessionNumber, x.LoginId });
}
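To make the difference between the two options concrete, here is a small sketch that runs both queries over the same data. It assumes C# 9 or later (top-level statements); the tuple rows and the names uniqueOnly and firstPerKey are hypothetical stand-ins for the real test data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample rows: A1 appears twice, A2 and A3 once each.
var data = new List<(string AccessionNumber, string LoginId)>
{
    ("A1", "L1"), ("A1", "L2"), ("A2", "L1"), ("A3", "L2"),
};

// (1) Only accession numbers that occur exactly once.
var uniqueOnly = data
    .GroupBy(x => x.AccessionNumber)
    .Where(g => !g.Skip(1).Any())
    .SelectMany(g => g)
    .Select(x => x.AccessionNumber + "," + x.LoginId)
    .ToList();

// (2) The first record for every accession number.
var firstPerKey = data
    .GroupBy(x => x.AccessionNumber)
    .SelectMany(g => g.Take(1))
    .Select(x => x.AccessionNumber + "," + x.LoginId)
    .ToList();

Console.WriteLine(string.Join(" | ", uniqueOnly));  // A2,L1 | A3,L2
Console.WriteLine(string.Join(" | ", firstPerKey)); // A1,L1 | A2,L1 | A3,L2
```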

Merge two or more T in List<T> based on condition

I have the below class:
public class FactoryOrder
{
public string Text { get; set; }
public int OrderNo { get; set; }
}
and collection holding the list of FactoryOrders
List<FactoryOrder>()
here is the sample data
FactoryOrder("Apple",20)
FactoryOrder("Orange",21)
FactoryOrder("WaterMelon",42)
FactoryOrder("JackFruit",51)
FactoryOrder("Grapes",71)
FactoryOrder("mango",72)
FactoryOrder("Cherry",73)
My requirement is to merge the Text of FactoryOrders whose OrderNo values are consecutive, and to retain the lowest OrderNo for the merged FactoryOrder, so the resulting output will be:
FactoryOrder("Apple Orange",20) //Merged Apple and Orange and retained Lower OrderNo 20
FactoryOrder("WaterMelon",42)
FactoryOrder("JackFruit",51)
FactoryOrder("Grapes mango Cherry",71)//Merged Grapes,Mango,cherry and retained Lower OrderNo 71
I am new to Linq so not sure how to go about this. Any help or pointers would be appreciated

As commented, if your logic depends so heavily on consecutive items, LINQ is not the easiest approach. Use a simple loop.
You could order them first with LINQ, using orders.OrderBy(x => x.OrderNo):
var consecutiveOrdernoGroups = new List<List<FactoryOrder>> { new List<FactoryOrder>() };
FactoryOrder lastOrder = null;
foreach (FactoryOrder order in orders.OrderBy(o => o.OrderNo))
{
if (lastOrder == null || lastOrder.OrderNo == order.OrderNo - 1)
consecutiveOrdernoGroups.Last().Add(order);
else
consecutiveOrdernoGroups.Add(new List<FactoryOrder> { order });
lastOrder = order;
}
Now you just need to build the list of FactoryOrder with the joined names for every group. This is where LINQ and String.Join can come in handy:
orders = consecutiveOrdernoGroups
.Select(list => new FactoryOrder
{
Text = String.Join(" ", list.Select(o => o.Text)),
OrderNo = list.First().OrderNo // is the minimum number
})
.ToList();
Result with your sample:
Apple Orange, 20
WaterMelon, 42
JackFruit, 51
Grapes mango Cherry, 71

I'm not sure this can be done using a single comprehensible LINQ expression. What would work is a simple enumeration:
private static IEnumerable<FactoryOrder> Merge(IEnumerable<FactoryOrder> orders)
{
var enumerator = orders.OrderBy(x => x.OrderNo).GetEnumerator();
FactoryOrder previousOrder = null;
FactoryOrder mergedOrder = null;
while (enumerator.MoveNext())
{
var current = enumerator.Current;
if (mergedOrder == null)
{
mergedOrder = new FactoryOrder(current.Text, current.OrderNo);
}
else
{
if (current.OrderNo == previousOrder.OrderNo + 1)
{
mergedOrder.Text += " " + current.Text; // separate merged names with a space
}
else
{
yield return mergedOrder;
mergedOrder = new FactoryOrder(current.Text, current.OrderNo);
}
}
previousOrder = current;
}
if (mergedOrder != null)
yield return mergedOrder;
}
This assumes FactoryOrder has a constructor accepting Text and OrderNo.

Linq implementation using side effects:
var groupId = 0;
var previous = Int32.MinValue;
var grouped = GetItems()
.OrderBy(x => x.OrderNo)
.Select(x =>
{
var @group = x.OrderNo != previous + 1 ? (groupId = x.OrderNo) : groupId;
previous = x.OrderNo;
return new
{
GroupId = @group,
Item = x
};
})
.GroupBy(x => x.GroupId)
.Select(x => new FactoryOrder(
String.Join(" ", x.Select(y => y.Item.Text).ToArray()),
x.Key))
.ToArray();
foreach (var item in grouped)
{
Console.WriteLine(item.Text + "\t" + item.OrderNo);
}
output:
Apple Orange 20
WaterMelon 42
JackFruit 51
Grapes mango Cherry 71
Or, eliminate the side effects by using a generator extension method
public static class IEnumerableExtensions
{
public static IEnumerable<IList<T>> MakeSets<T>(this IEnumerable<T> items, Func<T, T, bool> areInSameGroup)
{
var result = new List<T>();
foreach (var item in items)
{
if (!result.Any() || areInSameGroup(result[result.Count - 1], item))
{
result.Add(item);
continue;
}
yield return result;
result = new List<T> { item };
}
if (result.Any())
{
yield return result;
}
}
}
and your implementation becomes
var grouped = GetItems()
.OrderBy(x => x.OrderNo)
.MakeSets((prev, next) => next.OrderNo == prev.OrderNo + 1)
.Select(x => new FactoryOrder(
String.Join(" ", x.Select(y => y.Text).ToArray()),
x.First().OrderNo))
.ToList();
foreach (var item in grouped)
{
Console.WriteLine(item.Text + "\t" + item.OrderNo);
}
The output is the same but the code is easier to follow and maintain.

LINQ + sequential processing = Aggregate.
That said, Aggregate is not always the best option. Sequential processing in a for(each) loop usually makes for more readable code (see Tim's answer). Anyway, here's a pure LINQ solution.
It loops through the orders and first collects them in a dictionary having the first Id of consecutive orders as Key, and a collection of orders as Value. Then it produces a result using string.Join:
Class:
class FactoryOrder
{
public FactoryOrder(int id, string name)
{
this.Id = id;
this.Name = name;
}
public int Id { get; set; }
public string Name { get; set; }
}
The program:
IEnumerable<FactoryOrder> orders =
new[]
{
new FactoryOrder(20, "Apple"),
new FactoryOrder(21, "Orange"),
new FactoryOrder(22, "Pear"),
new FactoryOrder(42, "WaterMelon"),
new FactoryOrder(51, "JackFruit"),
new FactoryOrder(71, "Grapes"),
new FactoryOrder(72, "Mango"),
new FactoryOrder(73, "Cherry"),
};
var result = orders.OrderBy(t => t.Id).Aggregate(new Dictionary<int, List<FactoryOrder>>(),
(dir, curr) =>
{
var prevId = dir.SelectMany(d => d.Value.Select(v => v.Id))
.OrderBy(i => i).DefaultIfEmpty(-1)
.LastOrDefault();
var newKey = dir.Select(d => d.Key).OrderBy(i => i).LastOrDefault();
if (prevId == -1 || curr.Id - prevId > 1)
{
newKey = curr.Id;
}
if (!dir.ContainsKey(newKey))
{
dir[newKey] = new List<FactoryOrder>();
}
dir[newKey].Add(curr);
return dir;
}, c => c)
.Select(t => new
{
t.Key,
Items = string.Join(" ", t.Value.Select(v => v.Name))
}).ToList();
As you see, it's not really straightforward what happens here, and chances are that it performs badly when there are "many" items, because the growing dictionary is accessed over and over again.
Which is a long-winded way to say: don't use Aggregate.

I just coded this method. It's compact and performs reasonably well; note that it assumes the list is already sorted by OrderNo, and that it modifies the list passed in:
static List<FactoryOrder> MergeValues(List<FactoryOrder> dirtyList)
{
FactoryOrder[] temp1 = dirtyList.ToArray();
int index = -1;
for (int i = 1; i < temp1.Length; i++)
{
if (temp1[i].OrderNo - temp1[i - 1].OrderNo != 1) { index = -1; continue; }
if(index == -1 ) index = dirtyList.IndexOf(temp1[i - 1]);
dirtyList[index].Text += " " + temp1[i].Text;
dirtyList.Remove(temp1[i]);
}
return dirtyList;
}
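If the input list should stay untouched, the same single pass can build a new list instead. The following is a minimal sketch, assuming C# 9 or later (top-level statements) and using (Text, OrderNo) tuples as a stand-in for FactoryOrder; MergeConsecutive is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Merges the Text of entries whose OrderNo values are consecutive,
// keeping the lowest OrderNo of each run. The input is not modified.
static List<(string Text, int OrderNo)> MergeConsecutive(IEnumerable<(string Text, int OrderNo)> orders)
{
    var merged = new List<(string Text, int OrderNo)>();
    int previousNo = 0;
    foreach (var order in orders.OrderBy(o => o.OrderNo))
    {
        if (merged.Count > 0 && order.OrderNo == previousNo + 1)
        {
            // Extend the current run: append the text to the last merged entry.
            var last = merged[merged.Count - 1];
            merged[merged.Count - 1] = (last.Text + " " + order.Text, last.OrderNo);
        }
        else
        {
            // Start a new run.
            merged.Add(order);
        }
        previousNo = order.OrderNo;
    }
    return merged;
}

var orders = new List<(string Text, int OrderNo)>
{
    ("Apple", 20), ("Orange", 21), ("WaterMelon", 42), ("JackFruit", 51),
    ("Grapes", 71), ("mango", 72), ("Cherry", 73),
};

var result = MergeConsecutive(orders);
foreach (var item in result)
    Console.WriteLine(item.Text + "\t" + item.OrderNo);
```

With the sample data from the question this produces four entries: "Apple Orange" (20), "WaterMelon" (42), "JackFruit" (51) and "Grapes mango Cherry" (71).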

How can I use LINQ to calculate the longest streak?

Currently, this is just something I am curious about, I don't have any code I am working on but I am wondering how this could be achieved...
Let's say, for example, that I have an application that tracks the results of all the football teams in the world. What I want to be able to do is identify the longest "win" streak for any given team.
I imagine I would most likely have some sort of data table like so:
MatchDate datetime
TeamA string
TeamB string
TeamAGoals int
TeamBGoals int
So what I would want to do for example is find the longest win streak where TeamA = "My Team" and obviously this would mean TeamAGoals must be greater than TeamBGoals.
As I have said, this is all just for example. It may be better for a different DB design for something like this. But the root question is how to calculate the longest streak/run of matching results.

This is an old question now, but I just had to solve the same problem myself, and thought people might be interested in a fully LINQ implementation of Rawling's LongestStreak extension method. This uses Aggregate with a seed and result selector to run through the list.
public static int LongestStreak<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
return source.Aggregate(
new {Longest = 0, Current = 0},
(agg, element) => predicate(element) ?
new {Longest = Math.Max(agg.Longest, agg.Current + 1), Current = agg.Current + 1} :
new {agg.Longest, Current = 0},
agg => agg.Longest);
}

There's no out-of-the-box LINQ method to count streaks, so you'll need a custom LINQy method such as
public static int LongestStreak<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
int longestStreak = 0;
int currentStreak = 0;
foreach (TSource s in source)
{
if (predicate(s))
currentStreak++;
else
{
if (currentStreak > longestStreak) longestStreak = currentStreak;
currentStreak = 0;
}
}
if (currentStreak > longestStreak) longestStreak = currentStreak;
return longestStreak;
}
Then, to use this, first turn each "match result" into a pair of "team results".
var teamResults = matches.SelectMany(m => new[] {
new {
MatchDate = m.MatchDate,
Team = m.TeamA,
Won = m.TeamAGoals > m.TeamBGoals },
new {
MatchDate = m.MatchDate,
Team = m.TeamB,
Won = m.TeamBGoals > m.TeamAGoals }
});
Group these by team.
var groupedResults = teamResults.GroupBy(r => r.Team);
Then calculate the streaks.
var streaks = groupedResults.Select(g => new
{
Team = g.Key,
StreakLength = g
// unnecessary if the matches were ordered originally
.OrderBy(r => r.MatchDate)
.LongestStreak(r => r.Won)
});
If you want the longest streak only, use MoreLinq's MaxBy; if you want them all ordered, you can use OrderByDescending(s => s.StreakLength).
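The steps above can be sketched end to end on a small data set. This assumes C# 9 or later (top-level statements); the match tuples, team names and dates are made up for illustration, and LongestStreak is written as a plain local function rather than an extension method so the example stays in one file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Length of the longest run of consecutive elements that satisfy the predicate.
static int LongestStreak<T>(IEnumerable<T> source, Func<T, bool> predicate)
{
    int longest = 0, current = 0;
    foreach (var item in source)
    {
        current = predicate(item) ? current + 1 : 0;
        longest = Math.Max(longest, current);
    }
    return longest;
}

// Hypothetical matches: (date, home team, away team, home goals, away goals).
var matches = new[]
{
    (Date: new DateTime(2024, 1, 1), TeamA: "A", TeamB: "B", GoalsA: 2, GoalsB: 1),
    (Date: new DateTime(2024, 1, 2), TeamA: "A", TeamB: "C", GoalsA: 3, GoalsB: 0),
    (Date: new DateTime(2024, 1, 3), TeamA: "B", TeamB: "A", GoalsA: 1, GoalsB: 0),
    (Date: new DateTime(2024, 1, 4), TeamA: "A", TeamB: "C", GoalsA: 1, GoalsB: 0),
    (Date: new DateTime(2024, 1, 5), TeamA: "C", TeamB: "B", GoalsA: 2, GoalsB: 0),
};

// One result per team per match, grouped by team, then the longest winning run per team.
Dictionary<string, int> streaks = matches
    .SelectMany(m => new[]
    {
        (m.Date, Team: m.TeamA, Won: m.GoalsA > m.GoalsB),
        (m.Date, Team: m.TeamB, Won: m.GoalsB > m.GoalsA),
    })
    .GroupBy(r => r.Team)
    .ToDictionary(
        g => g.Key,
        g => LongestStreak(g.OrderBy(r => r.Date), r => r.Won));

foreach (var pair in streaks.OrderByDescending(p => p.Value))
    Console.WriteLine(pair.Key + ": " + pair.Value);
```

Here team A wins its first two matches, loses the third and wins the fourth, so its longest streak is 2; teams B and C each have a longest streak of 1.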
Alternatively, if you want to do this in one pass, and assuming matches is already ordered, using the following class
class StreakAggregator<TKey>
{
public Dictionary<TKey, int> Best = new Dictionary<TKey, int>();
public Dictionary<TKey, int> Current = new Dictionary<TKey, int>();
public StreakAggregator<TKey> UpdateWith(TKey key, bool success)
{
int c = 0;
Current.TryGetValue(key, out c);
if (success)
{
Current[key] = c + 1;
}
else
{
int b = 0;
Best.TryGetValue(key, out b);
if (c > b)
{
Best[key] = c;
}
Current[key] = 0;
}
return this;
}
public StreakAggregator<TKey> Finalise()
{
foreach (TKey k in Current.Keys.ToArray())
{
UpdateWith(k, false);
}
return this;
}
}
you can then do
var streaks = teamResults.Aggregate(
new StreakAggregator<string>(),
(a, r) => a.UpdateWith(r.Team, r.Won),
(a) => a.Finalise().Best.Select(kvp =>
new { Team = kvp.Key, StreakLength = kvp.Value }));
and OrderBy or whatever as before.

You can get all of a team's results with a single query:
var results = from m in Matches
let homeMatch = m.TeamA == teamName
let awayMatch = m.TeamB == teamName
let hasWon = (homeMatch && m.TeamAGoals > m.TeamBGoals) ||
(awayMatch && m.TeamBGoals > m.TeamAGoals)
where homeMatch || awayMatch
orderby m.MatchDate
select hasWon;
Then just do simple calculation of longest streak:
int longestStreak = 0;
int currentStreak = 0;
foreach (var hasWon in results)
{
if (hasWon)
{
currentStreak++;
if (currentStreak > longestStreak)
longestStreak = currentStreak;
continue;
}
currentStreak = 0;
}
You can use it as is, extract to method, or create IEnumerable extension for calculating longest sequence in results.

You could make use of string.Split. Something like this:
int longestStreak =
string.Concat(results.Select(r => (r.ours > r.theirs) ? "1" : "0"))
.Split(new[] { '0' })
.Max(s => s.Length);
Or, better, create a Split extension method for IEnumerable<T> to avoid the need to go via a string, like this:
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, Predicate<T> p)
{
while (true)
{
items = items.SkipWhile(i => !p(i));
var trueItems = items.TakeWhile (i => p(i)).ToList();
if (trueItems.Count > 0)
{
yield return trueItems;
items = items.Skip(trueItems.Count);
}
else
{
break;
}
}
}
You can then simply do this:
int longestStreak = results.Split(r => r.ours > r.theirs).Max(g => g.Count());

how to ensure a List<String> contains each string in a sequence exactly once

Suppose I have a list of strings, like this:
var candidates = new List<String> { "Peter", "Chris", "Maggie", "Virginia" };
Now I'd like to verify that another List<String>, let's call it list1, contains each of those candidates exactly once.
How can I do that, succinctly? I think I can use Intersect(). I also want to get the missing candidates.
private bool ContainsAllCandidatesOnce(List<String> list1)
{
????
}
private IEnumerable<String> MissingCandidates(List<String> list1)
{
????
}
Order doesn't matter.

This may not be optimal in terms of speed, but both queries are short enough to fit on a single line, and are easy to understand:
private bool ContainsAllCandidatesOnce(List<String> list1)
{
return candidates.All(c => list1.Count(v => v == c) == 1);
}
private IEnumerable<String> MissingCandidates(List<String> list1)
{
return candidates.Where(c => list1.Count(v => v == c) != 1);
}
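Both queries above rescan list1 once per candidate. If that ever matters, one pass over list1 with a count per candidate gives the same answers. This is a minimal sketch, assuming C# 9 or later (top-level statements) and that the candidates themselves are distinct; CountCandidates and the sample list1 are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Counts how often each candidate occurs in list; items that are not candidates are ignored.
// Assumes the candidates are distinct (ToDictionary throws on duplicate keys).
static Dictionary<string, int> CountCandidates(IEnumerable<string> candidates, IEnumerable<string> list)
{
    var counts = candidates.ToDictionary(c => c, _ => 0);
    foreach (var item in list)
    {
        if (counts.ContainsKey(item))
            counts[item]++;
    }
    return counts;
}

var candidates = new List<string> { "Peter", "Chris", "Maggie", "Virginia" };
var list1 = new List<string> { "Chris", "Peter", "Peter", "Maggie", "Bob" };

var counts = CountCandidates(candidates, list1);

bool containsAllOnce = counts.Values.All(n => n == 1);
var missing = candidates.Where(c => counts[c] == 0).ToList();
var repeated = candidates.Where(c => counts[c] > 1).ToList();

Console.WriteLine(containsAllOnce);             // False
Console.WriteLine(string.Join(", ", missing));  // Virginia
Console.WriteLine(string.Join(", ", repeated)); // Peter
```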

Here we are talking about Except, Intersect and Distinct. I could have used a lambda expression, but it would have to loop over each and every item; that functionality is already available in these predefined methods.
For your first method:
private readonly List<String> candidates = new List<String> { "Peter", "Chris", "Maggie", "Virginia" };
private bool ContainsAllCandidatesOnce(List<String> list1)
{
return list1.Intersect(candidates).Distinct().Any();
}
This returns true if list1 has at least one element in common with candidates; it does not verify that every candidate appears exactly once. You can also write it the other way around:
candidates.Intersect(list1).Distinct().Any();
For your second method:
private IEnumerable<String> MissingCandidates(List<String> list1)
{
return list1.Except(candidates).AsEnumerable();
}
This removes from list1 all elements that are in candidates, leaving only the extra items. To get the candidates that are missing from list1, do it the other way around:
candidates.Except(list1).AsEnumerable();

This should be quite efficient:
IEnumerable<string> strings = ...
var uniqueStrings = from str in strings
group str by str into g
where g.Count() == 1
select g.Key;
var missingCandidates = candidates.Except(uniqueStrings).ToList();
bool isValid = !missingCandidates.Any();
Filter out repeats.
Ensure that all the candidates occur in the filtered-out-set.

GroupJoin is the right tool for the job. From msdn:
GroupJoin produces hierarchical results, which means that elements
from outer are paired with collections of matching elements from
inner. GroupJoin enables you to base your results on a whole set of
matches for each element of outer.
If there are no correlated elements in inner for a given element of outer, the sequence of matches for that element will be empty but
will still appear in the results.
So, GroupJoin will find any matches from the target, for each item in the source. Items in the source are not filtered if no matches are found in the target. Instead they are matched to an empty group.
Dictionary<string, int> counts = candidates
.GroupJoin(
list1,
c => c,
s => s,
(c, g) => new { Key = c, Count = g.Count() }
)
.ToDictionary(x => x.Key, x => x.Count);
List<string> missing = counts.Keys
.Where(key => counts[key] == 0)
.ToList();
List<string> tooMany = counts.Keys
.Where(key => 1 < counts[key])
.ToList();

private bool ContainsAllCandidatesOnce(List<String> list1)
{
// Caveat: this compares total match counts, so a repeated candidate can mask a missing one.
return list1.Where(s => candidates.Contains(s)).Count() == candidates.Count();
}
private IEnumerable<String> MissingCandidates(List<String> list1)
{
return candidates.Where(s => list1.Count(c => c == s) != 1);
}

How about using a HashSet instead of List?

private static bool ContainsAllCandidatesOnce(List<string> lotsOfCandidates)
{
foreach (string candidate in allCandidates)
{
if (lotsOfCandidates.Count(t => t.Equals(candidate)) != 1)
{
return false;
}
}
return true;
}
private static IEnumerable<string> MissingCandidates(List<string> lotsOfCandidates)
{
List<string> missingCandidates = new List<string>();
foreach (string candidate in allCandidates)
{
if (lotsOfCandidates.Count(t => t.Equals(candidate)) != 1)
{
missingCandidates.Add(candidate);
}
}
return missingCandidates;
}
