I have a scenario where I have to add up the numbers inside a collection. An example will illustrate this best.
I have these values in my database:
| Foo |
| 1 |
| 5 |
| 8 |
| 4 |
Result:
| Foo | Result |
| 1 | 1 |
| 5 | 6 |
| 8 | 14 |
| 4 | 18 |
As you can see, it has a somewhat Fibonacci-like effect, but the twist here is that the numbers are given.
I can achieve this result with the help of a for loop, but is it possible to do this in LINQ? Something like querying the database and then getting a result like the above?
Any help would be much appreciated. Thanks!
I'm not sure how exactly you're touching the database, but here's a solution that can probably be improved upon:
var numbers = new List<int> { 1, 5, 8, 4 };
var result = numbers.Select((n, i) => numbers.Where((nn, ii) => ii <= i).Sum());
This overload of Select and Where takes the object (each number) and the index of that object. For each index, I used numbers.Where to Sum all the items with a lower or equal index.
For example, when the Select gets to the number 8 (index 2), numbers.Where grabs items with index 0-2 and sums them.
MoreLINQ has a Scan method that allows you to aggregate the values in a sequence while yielding each intermediate value, rather than just the final value, which is exactly what you're trying to do.
With that you can write:
var query = data.Scan((sum, next) => sum + next);
The one overload that you need here is copied below; see the MoreLINQ documentation for details and additional overloads:
public static IEnumerable<TSource> Scan<TSource>(this IEnumerable<TSource> source,
    Func<TSource, TSource, TSource> transformation)
{
    if (source == null) throw new ArgumentNullException("source");
    if (transformation == null) throw new ArgumentNullException("transformation");
    return ScanImpl(source, transformation);
}

private static IEnumerable<T> ScanImpl<T>(IEnumerable<T> source, Func<T, T, T> f)
{
    using (var i = source.GetEnumerator())
    {
        if (!i.MoveNext())
            throw new InvalidOperationException("Sequence contains no elements.");
        var aggregator = i.Current;
        while (i.MoveNext())
        {
            yield return aggregator;
            aggregator = f(aggregator, i.Current);
        }
        yield return aggregator;
    }
}
I think you can achieve this with:
var acc = 0;
var result = numbers.Select(i =>
{
acc += i;
return acc;
}).ToList();
You need the ToList to make sure the query runs only once (otherwise acc will keep growing every time the result is enumerated).
Also I'm not sure it can be converted to a query (and performed server side).
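To illustrate that pitfall (assuming the same numbers list as above), without ToList every enumeration of the lazy query keeps mutating the captured acc:
var acc = 0;
var lazy = numbers.Select(i => acc += i);
Console.WriteLine(string.Join(", ", lazy)); // 1, 6, 14, 18
Console.WriteLine(string.Join(", ", lazy)); // 19, 24, 32, 36 - acc kept growing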
Thomas Levesque posted a response to a similar question where he provides a SelectAggregate method that yields the intermediate values of an aggregate computation.
It looks like this feature is not present in LINQ by default, so you probably won't be able to perform the computation server-side using LINQ.
You could do the following
int runningTotal = 0;
var runningTotals = numbers.Select(n => new
{
Number = n,
RunningTotal = (runningTotal += n)
});
This will give you the number and the running total.
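With the question's data (1, 5, 8, 4), enumerating runningTotals produces, for example:
{ Number = 1, RunningTotal = 1 }
{ Number = 5, RunningTotal = 6 }
{ Number = 8, RunningTotal = 14 }
{ Number = 4, RunningTotal = 18 }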
It's just an adaptation of #Jonesy's answer:
int[] ints = new[] {1, 5, 8, 4};
var result = ints.Select((x, y) => x + ints.Take(y).Sum());
I have many tasks; each task is defined by the first day I can start working on it and the last day it is still valid to do. Each task takes exactly one day, and I can do only one task per day.
The tasks and their validity ranges are described in the table below.
| task | valid from | valid until |
|------|------------|-------------|
| t01 | 1 | 3 |
| t02 | 2 | 2 |
| t03 | 1 | 1 |
| t04 | 2 | 3 |
| t05 | 2 | 3 |
The number of tasks may be huge.
I want to know which algorithm I can use to solve this problem to maximize the number of tasks that I can do.
Update
Based on the comments I wrote the code below. It works, but it still doesn't perform well with a huge number of tasks.
public static int countTodoTasks(int[] validFrom, int[] validUntil)
{
    var tasks = new List<TaskTodo>();
    for (int i = 0; i < validFrom.Length; i++)
    {
        tasks.Add(new TaskTodo { ValidFrom = validFrom[i], ValidUntil = validUntil[i] });
    }
    tasks = tasks.OrderBy(x => x.ValidUntil).ToList();
    var lDay = 0;
    var schedule = new Dictionary<int, TaskTodo>();
    while (tasks.Count > 0)
    {
        lDay = findBiggestMinimumOf(lDay, tasks[0].ValidFrom, tasks[0].ValidUntil);
        if (lDay != -1)
        {
            schedule[lDay] = tasks[0];
        }
        tasks.RemoveAt(0);
        tasks.RemoveAll(x => lDay >= x.ValidUntil);
    }
    return schedule.Count;
}

static int findBiggestMinimumOf(int x, int start, int end)
{
    if (start > x)
    {
        return start;
    }
    if ((x == start && start == end) || x == end || x > end)
    {
        return -1;
    }
    return x + 1;
}
If the tasks have the same duration, then use a greedy algorithm as described above.
If it's too slow, use indexes (= hashing) and incremental calculation to speed it up if you need to scale out.
Indexing: during setup, iterate through all tasks to create a map (= dictionary) from each due date to the list of tasks due that day. Better yet, use a NavigableMap (TreeMap) so you can ask for a tail iterator (all tasks starting from a specific due date, in order). The greedy algorithm can then use that to scale better (think a better big-O complexity).
Incremental calculation: only calculate the deltas for each task you're considering.
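A minimal C# sketch of that index, assuming the asker's TaskTodo class (ValidFrom/ValidUntil); SortedDictionary plays the role of Java's TreeMap here:
var byDueDate = new SortedDictionary<int, List<TaskTodo>>();
foreach (var t in tasks)
{
    if (!byDueDate.TryGetValue(t.ValidUntil, out var dueList))
        byDueDate[t.ValidUntil] = dueList = new List<TaskTodo>();
    dueList.Add(t);
}
// byDueDate now enumerates the task groups in ascending due-date order,
// so the greedy pass can walk it instead of re-scanning the whole task list each day.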
If the tasks have different durations, a greedy algorithm (aka construction heuristic) won't give you the optimal solution; then it's NP-hard. After the construction heuristic (= greedy algorithm), run a Local Search (such as Tabu Search). Libraries such as OptaPlanner (Java, not C# unfortunately - look for alternatives there) can do both for you.
Also note there are multiple greedy algorithms (First Fit, First Fit Decreasing, ...).
I suppose you can apply a greedy algorithm for your purpose in this way (a C# sketch follows the steps):
1. Select the minimal "valid from", minday.
2. Add to Xcandidates all candidates with "valid from" = minday.
3. If there are no Xcandidates, go to 1.
4. Select the interval x from Xcandidates with the earliest "valid until".
5. Remove x, inserting it into your schedule.
6. Remove all Xcandidates with "valid until" = minday.
7. Increment minday and go to 2.
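A rough C# sketch of those steps, assuming the asker's TaskTodo class and .NET 6+ for PriorityQueue (keyed by ValidUntil so the earliest "valid until" is cheap to find):
static int CountTodoTasks(List<TaskTodo> tasks)
{
    var byStart = tasks.OrderBy(t => t.ValidFrom).ToList();
    var candidates = new PriorityQueue<TaskTodo, int>();          // ordered by ValidUntil
    int done = 0, next = 0;
    int day = byStart.Count > 0 ? byStart[0].ValidFrom : 0;       // step 1
    while (next < byStart.Count || candidates.Count > 0)
    {
        while (next < byStart.Count && byStart[next].ValidFrom <= day) // step 2
        {
            candidates.Enqueue(byStart[next], byStart[next].ValidUntil);
            next++;
        }
        while (candidates.Count > 0 && candidates.Peek().ValidUntil < day) // step 6
            candidates.Dequeue();
        if (candidates.Count == 0)
        {
            if (next >= byStart.Count) break;
            day = byStart[next].ValidFrom;                        // step 3: jump to the next "valid from"
            continue;
        }
        candidates.Dequeue();                                     // steps 4-5: do the task with the earliest deadline
        done++;
        day++;                                                    // step 7
    }
    return done;
}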
I have a table in the database called Control:
Table structure:
Id | Name | MinValue (decimal) | MaxValue(decimal)
I have some restrictions on that table; one of its restrictions is: no intersections.
Example : if the table has some values as follows:
row1 : 1 | Test1 | 1.3 | 2.5 //valid
row2 : 2 | Test2 | 3.3 | 4.5 // valid
row3 : 3 | Test3 | 5 | 6 // valid
Now if I want to add a new record, it must not intersect with any other row
Example:
row4 : 4 | Test4 | 5.1 | 10 //not valid since slot from 5 to 6 is reserved
row5 : 5 | Test5 | 1.0 | 1.4 // not valid since slot from 1.3 to 2.5 is reserved
I'm using the code below and it works perfectly, but I wonder if there is a better, more efficient solution:
var allRows = db.Control.ToList();
var minValue = control.MinimumValue;
var maxValue = control.MaximumValue;
bool flag = true;
foreach (var row in allRows)
{
    for (var i = minValue; i <= maxValue && flag; i = decimal.Add(i, (decimal)0.01))
    {
        if (i >= row.MinimumValue && i <= row.MaximumValue)
        {
            flag = false;
            min = row.MinimumValue;
            max = row.MaximumValue;
            break;
        }
    }
}
if (flag)
{
    //add
}
else
{
    //intersection
}
Any suggestions?
I think this is an O(log N) issue...
Keep the segments ordered by their start value.
In a valid list, s[i].end < s[i+1].start for every i.
When inserting a new segment, binary-search for its insertion position i (the index of the first existing segment whose start is greater than the new segment's start). Then:
if ((seg[i-1].end < new.start) && (seg[i].start > new.end))
    // OK to insert
else
    // intersect
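A minimal sketch of that check, assuming the existing rows are kept in a List<(decimal Start, decimal End)> sorted by Start; List.BinarySearch provides the O(log N) position lookup:
static bool CanInsert(List<(decimal Start, decimal End)> segments, decimal newStart, decimal newEnd)
{
    int i = segments.BinarySearch((newStart, newEnd),
        Comparer<(decimal Start, decimal End)>.Create((a, b) => a.Start.CompareTo(b.Start)));
    if (i >= 0) return false;            // some segment starts exactly at newStart
    i = ~i;                              // index of the first segment starting after newStart
    bool leftOk  = i == 0 || segments[i - 1].End < newStart;
    bool rightOk = i == segments.Count || segments[i].Start > newEnd;
    return leftOk && rightOk;
}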
Let's assume this is the object you're trying to add:
var control = new Control()
{
    Name = "name",
    MinValue = 5,
    MaxValue = 6
};
You can do the following:
var intersects = db.Control.Any(x => x.MinValue <= control.MaxValue
                                  && x.MaxValue >= control.MinValue);
if (!intersects)
{
    db.Control.Add(control); // or whatever your add operation is
}
By doing so you:
do NOT load the whole table into memory -> performance gain in terms of time, traffic and memory
let the database use its data structures/algorithms to verify that the item can be added (the db should be able to optimize this request) -> another performance gain (cpu + time)
have clearer backend code
have less code to test
Edit: I suppose you could also ask the database to sort your table by min/max value and then do the validation yourself (one or two ifs), but the first approach is better, imo.
I have the following problem: I need to create a table which is a combination of values coming from several sets. The number of elements in each set is unknown and may vary from set to set; the domain of the values is also unknown and may vary from set to set. The elements are non-negative, and there are at least two elements per set.
Here follows an example:
SET_A = { 0, 1, 2 }
SET_B = { 0, 1 }
SET_C = { 0, 1 }
The result should contain the following rows (order is not a constraint):
TABLE:
| 0 0 0 |
| 0 0 1 |
| 0 1 0 |
| 0 1 1 |
| 1 0 0 |
| 1 0 1 |
| 1 1 0 |
| 1 1 1 |
| 2 0 0 |
| 2 0 1 |
| 2 1 0 |
| 2 1 1 |
Does anybody know what mathematics lies behind this problem? I tried looking at multiset problems, logic tables, and combinatorics. Many of the definitions I found have similarities to my problem, but I can't isolate anything in the literature I have accessed so far. Once I have a reference definition I can think about coding it, but right now I'm just lost in recursive functions and terrible array-index games. Thanks.
EDIT: Question was proposed already at:
C# Permutation of an array of arraylists?
Edit: Sorry, had to run last evening. For arbitrary dimensionality you probably have to use recursion. There's probably a way to do it without, but recursion is the most straightforward. The below is untested but should be about right.
IEnumerable<int[]> getRows(int[][] possibleColumnValues, int[] rowPrefix) {
    if (!possibleColumnValues.Any()) {
        yield return rowPrefix; // no columns left: the prefix is a complete row
        yield break;
    }
    var remainingColumns = possibleColumnValues.Skip(1).ToArray();
    foreach (var val in possibleColumnValues.First()) {
        var rowSoFar = rowPrefix.Concat(new[] { val }).ToArray();
        foreach (var row in getRows(remainingColumns, rowSoFar))
            yield return row;
    }
}
Usage:
getRows(new[] {
    new[] { 0, 1, 2 },
    new[] { 0, 1 },
    new[] { 0, 1 },
}, new int[0]);
The thing you are looking for is combinatorics. Also, it doesn't really matter what the domain of the elements in each set is; as long as you can enumerate them, the problem is the same as for numbers from 0 to the set cardinality.
To enumerate all options, keep a vector of indices and after each iteration increment the first index. If it overflows, set it to 0 and increment the second index, and so on (see the sketch below).
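A small sketch of that index-vector idea (the CartesianRows name and int[][] input are just for illustration):
static IEnumerable<int[]> CartesianRows(int[][] sets)
{
    var indices = new int[sets.Length];          // one index per set, all starting at 0
    while (true)
    {
        yield return indices.Select((idx, col) => sets[col][idx]).ToArray();
        int pos = 0;
        while (pos < sets.Length && ++indices[pos] == sets[pos].Length)
        {
            indices[pos] = 0;                    // overflow: reset and carry to the next index
            pos++;
        }
        if (pos == sets.Length) yield break;     // every index overflowed: we're done
    }
}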
The task is to print permutations. You seem to be digging deeper than needed; it has nothing to do with the nature of the elements.
The following is not written for efficiency (neither in space nor speed). The idea is to just get the basic algorithm across. I'll leave making this more space and time efficient up to you.
The basic idea is to recognize that all the combinations of n lists are just all the combinations of n-1 lists with each element of the first list tacked on. It's a pretty straightforward recursive function at that point.
public static IEnumerable<int[]> Permute( params IEnumerable<int>[] sets )
{
    if( sets.Length == 0 ) yield break;
    if( sets.Length == 1 )
    {
        foreach( var element in sets[0] ) yield return new[] { element };
        yield break;
    }

    var first = sets.First();
    var rest = Permute( sets.Skip( 1 ).ToArray() );
    var elements = first.ToArray();

    foreach( var permutation in rest )
    {
        foreach( var element in elements )
        {
            var result = new int[permutation.Length + 1];
            result[0] = element;
            Array.Copy( permutation, 0, result, 1, permutation.Length );
            yield return result;
        }
    }
}
I have a general question about dictionaries in C#.
Say I read in a text file, split it up into keys and values and store them in a dictionary.
Would it be more useful to put them all into a single dictionary or split it up into smaller ones?
It probably wouldn't make a huge difference with small text files, but some of them have more than 100,000 lines.
What would you recommend?
The first rule is always to benchmark before trying to optimize. That being said, some people might have done the benchmarking for you. Check those results here
From the article (Just in case it disappears from the net)
The smaller Dictionary (with half the number of keys) was much faster.
In this case, the behavior of both Dictionaries on the input was
identical. This means that having unneeded keys in the Dictionary
makes it slower.
My perspective is that you should use separate Dictionaries for
separate purposes. If you have two sets of keys, do not store them in
the same Dictionary. If you can divide them up, you can enhance lookup
performance.
Credit: dotnetperls.com
Also from the article :
Full Dictionary: 791 ms
Half-size Dictionary: 591 ms [faster]
Maybe you can live with much less code and 200 ms more; it really depends on your application.
I believe the original article is either inaccurate or outdated. In any case, the statements regarding "dictionary size" have since been removed. Now, to answer the question:
Targeting .NET 6 x64 gives BETTER performance for a SINGLE dictionary. In fact, performance gets worse the more dictionaries you use:
| Method | Mean | Error | StdDev | Median |
|-------------- |----------:|---------:|----------:|----------:|
| Dictionary_1 | 91.54 us | 1.815 us | 3.318 us | 89.88 us |
| Dictionary_2 | 122.55 us | 1.067 us | 0.998 us | 122.19 us |
| Dictionary_10 | 390.77 us | 7.757 us | 18.882 us | 382.55 us |
The results should come as no surprise. For an N-dictionary lookup you will calculate the hash code up to N times for every item you look up, instead of doing it just once. Also, you have to loop through the list of dictionaries, which introduces a minuscule performance hit. All in all, it just makes sense.
Now, under some bizarre conditions, it might be possible to gain some speed using N dictionaries, e.g. with a tiny CPU cache, thrashing, hash code collisions, etc. I have yet to encounter such a scenario, though...
Benchmark code
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace MyBenchmarks;

public class DictionaryBenchmark
{
    private const int N = 1000000;
    private readonly string[] data;
    private readonly Dictionary<string, string> dictionary;
    private readonly List<Dictionary<string, string>> dictionaries2;
    private readonly List<Dictionary<string, string>> dictionaries10;

    public DictionaryBenchmark()
    {
        data = Enumerable.Range(0, N).Select(n => Guid.NewGuid().ToString()).ToArray();
        dictionary = data.ToDictionary(x => x);
        dictionaries2 = CreateDictionaries(2);
        dictionaries10 = CreateDictionaries(10);
    }

    private List<Dictionary<string, string>> CreateDictionaries(int count)
    {
        int chunkSize = N / count;
        return data.Select((item, index) => (Item: item, Index: index))
                   .GroupBy(x => x.Index / chunkSize)
                   .Select(g => g.Select(x => x.Item).ToDictionary(x => x))
                   .ToList();
    }

    [Benchmark]
    public void Dictionary_1()
    {
        for (int i = 0; i < N; i += 1000)
        {
            dictionary.ContainsKey(data[i]);
        }
    }

    [Benchmark]
    public void Dictionary_2()
    {
        for (int i = 0; i < N; i += 1000)
        {
            foreach (var d in dictionaries2)
            {
                if (d.ContainsKey(data[i]))
                {
                    break;
                }
            }
        }
    }

    [Benchmark]
    public void Dictionary_10()
    {
        for (int i = 0; i < N; i += 1000)
        {
            foreach (var d in dictionaries10)
            {
                if (d.ContainsKey(data[i]))
                {
                    break;
                }
            }
        }
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<DictionaryBenchmark>();
}
I have a list of system users that are awaiting assignment of accounts.
The assignment algorithm is very simple; assignment should be as fair as possible, which means that if I have 40 accounts and 20 system users I need to assign 2 accounts per system user.
If I have 41 accounts and 20 system users I need to assign 2 accounts per system user and then split the remaining accounts between the system users again (in this case, one system user will be assigned one extra account).
I am trying to figure out how to do this while using a LINQ query.
So far I figured that grouping should be involved and my query is the following:
from account in accounts
let accountsPerSystemUser = accounts.Count / systemUsers.Count
let leftover = accounts.Count % systemUsers.Count
from systemUser in systemUsers
group account by systemUser into accountsGroup
select accountsGroup
However I am uncertain how to proceed from here.
I am positive that I am missing a where clause here that will prevent grouping if you reached the maximum amount of accounts to be assigned to a system user.
How do I implement the query correctly so that the grouping will know how much to assign?
Here is a simple implementation that works if you can restrict yourself to an IList<T> for the system users (you can always use ToList, though).
public static IEnumerable<IGrouping<TBucket, TSource>> DistributeBy<TSource, TBucket>(
this IEnumerable<TSource> source, IList<TBucket> buckets)
{
var tagged = source.Select((item,i) => new {item, tag = i % buckets.Count});
var grouped = from t in tagged
group t.item by buckets[t.tag];
return grouped;
}
// ...
var accountsGrouped = accounts.DistributeBy(systemUsers);
Basically this grabs each account's index and "tags" each with the remainder of integer division of that index by the number of system users. These tags are the indices of the system users they will belong to. Then it just groups them by the system user at that index.
This ensures your fairness requirement because the remainder will cycle between zero and one minus the number of system users.
0 % 20 = 0
1 % 20 = 1
2 % 20 = 2
...
19 % 20 = 19
20 % 20 = 0
21 % 20 = 1
22 % 20 = 2
...
39 % 20 = 19
40 % 20 = 0
You can't do this using "pure LINQ" (i.e. using query comprehension syntax), and to be honest LINQ probably isn't the best approach here. Nonetheless, here's an example of how you might do it:
var listB = new List<string>() { "a", "b", "c", "d", "e" };
var listA = new List<string>() { "1", "2", "3" };
var groupings = (from b in listB.Select((b, i) => new
{
Index = i,
Element = b
})
group b.Element by b.Index % listA.Count).Zip(listA, (bs, a) => new
{
A = a,
Bs = bs
});
foreach (var item in groupings)
{
Console.WriteLine("{0}: {1}", item.A, string.Join(",", item.Bs));
}
This outputs:
1: a,d
2: b,e
3: c
I don't think "pure" LINQ is really suited to solving this problem. Nevertheless, here is a solution that only requires two IEnumerables:
var users = new[] { "A", "B", "C" };
var accounts = new[] { 1, 2, 3, 4, 5, 6, 7, 8 };
var accountsPerUser = accounts.Count()/users.Count();
var leftover = accounts.Count()%users.Count();
var assignments = users
.Select((u, i) => new {
User = u,
AccountsToAssign = accountsPerUser + (i < leftover ? 1 : 0),
AccountsAlreadyAssigned =
(accountsPerUser + 1)*(i < leftover ? i : leftover)
+ accountsPerUser*(i < leftover ? 0 : i - leftover)
})
.Select(x => new {
x.User,
Accounts = accounts
.Skip(x.AccountsAlreadyAssigned)
.Take(x.AccountsToAssign)
});
To cut down on the text I use the term User instead of SystemUser.
The idea is quite simple. The first leftover users are assigned accountsPerUser + 1 from accounts. The remaining users are only assigned accountsPerUser.
The first Select uses the overload that provides an index to compute these values:
User | Index | AccountsAlreadyAssigned | AccountsToAssign
-----+-------+-------------------------+-----------------
A | 0 | 0 | 3
B | 1 | 3 | 3
C | 2 | 6 | 2
The second Select uses these values to Skip and Take the correct numbers from accounts.
If you want to you can "merge" the two Select statements and replace the AccountsAlreadyAssigned and AccountsToAssign with the expressions used to compute them. However, that will make the query really hard to understand.
Here is a "non-LINQ" alternative. It is based on IList but could easily be converted to IEnumerable. Or instead of returning the assignments as tuples it could perform the assignments inside the loop.
IEnumerable<Tuple<T, IList<U>>> AssignEvenly<T, U>(IList<T> targetItems, IList<U> sourceItems) {
var fraction = sourceItems.Count/targetItems.Count;
var remainder = sourceItems.Count%targetItems.Count;
var sourceIndex = 0;
for (var targetIndex = 0; targetIndex < targetItems.Count; ++targetIndex) {
var itemsToAssign = fraction + (targetIndex < remainder ? 1 : 0);
yield return Tuple.Create(
targetItems[targetIndex],
(IList<U>) sourceItems.Skip(sourceIndex).Take(itemsToAssign).ToList()
);
sourceIndex += itemsToAssign;
}
}