Languages such as Nemerle support the idea of chords. I'd like to know what their practical use is.
The construct also seems to exist in the Cω language (as well as Polyphonic C#), at least according to [Wikipedia](http://en.wikipedia.org/wiki/Chord_(concurrency)).
The primary usage of chords appears to involve database programming (more specifically, join calculus), which is unsurprising given that it is a concurrency construct. More than that, I'm afraid I don't know.
A chord is used for concurrency. The definition is available here.
The bit you are looking for:
In most languages, including C#, methods in the signature of a class are in bijective correspondence with the code of their implementations -- for each method which is declared, there is a single, distinct definition of what happens when that method is called. In Cω, however, a body may be associated with a set of (synchronous and/or asynchronous) methods. We call such a definition a chord, and a particular method may appear in the header of several chords. The body of a chord can only execute once all the methods in its header have been called. Thus, when a method is called there may be zero, one, or more chords which are enabled:
If no chord is enabled then the method invocation is queued up. If the method is asynchronous, then this simply involves adding the arguments (the contents of the message) to a queue. If the method is synchronous, then the calling thread is blocked.
If there is a single enabled chord, then the arguments of the calls involved in the match are de-queued, any blocked thread involved in the match is awakened, and the body runs. When a chord which involves only asynchronous methods runs, then it does so in a new thread.
If there are several chords which are enabled then an unspecified one of them is chosen to run. Similarly, if there are multiple calls to a particular method queued up, we do not specify which call will be de-queued when there is a match.
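To make the quoted rules a little more concrete, here is a minimal sketch of the classic one-chord buffer (a synchronous Get joined with an asynchronous Put), emulated in plain C#. This is not Cω or Polyphonic C# syntax; the names ChordBuffer and ChordDemo are made up for the illustration, and BlockingCollection stands in for the queue of pending asynchronous calls.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical emulation of a single chord "Get() & async Put(s)".
public sealed class ChordBuffer
{
    // Arguments of Put calls that have not been matched with a Get yet.
    private readonly BlockingCollection<string> pending =
        new BlockingCollection<string>();

    // Asynchronous half: never blocks the caller, just queues the message.
    public void Put(string s) => pending.Add(s);

    // Synchronous half: blocks the caller until a queued Put is available,
    // then runs the "chord body" (here it simply returns the message).
    public string Get() => pending.Take();
}

public static class ChordDemo
{
    public static string Run()
    {
        var buffer = new ChordBuffer();
        // Get runs on another thread; it may start before or after the Put.
        Task<string> getter = Task.Run(() => buffer.Get());
        buffer.Put("hello");   // completes the chord for the waiting Get
        return getter.Result;  // "hello"
    }
}
```

Whichever call arrives first waits (or is queued) until its partner shows up, which is the behaviour the first two rules describe for a single chord.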
Try Nemerle Computation Expressions:
https://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/ComputationExpressions/
Some examples:
def upTo (n : int)
{
comp enumerable
{
mutable i = 0;
while (i < n)
{
i ++;
yield i
}
}
}
def manyTimes : IEnumerable [int] =
comp enumerable
{
yieldcomp upTo(2); // 1 2
yield 100; // 100
yieldcomp upTo(3); // 1 2 3
yield 100; // 100
yieldcomp upTo(10); // 1 2 3 .. 10
}
def fn(n)
{
comp async
{
if (n < 20)
returncomp fn(n + 1);
else
return n;
}
}
def f(n1, n2)
{
comp async
{
defcomp n1 = fn(n1);
defcomp n2 = fn(n2);
return $"$n1 $n2";
}
}
private HttpGet(url : string) : Async[string]
{
comp async
{
def req = WebRequest.Create(url);
using (defcomp resp = req.AsyncGetResponse())
using (stream = resp.GetResponseStream())
using (reader = StreamReader(stream))
return reader.ReadToEnd();
}
}
Some more examples here (the article is in Russian, but the code is in English :) ): http://habrahabr.ru/blogs/programming/108184/
I have something like the following code:
public class MainAppClass : BaseClass
{
public IList<Token> TokenList
{
get;
set;
}
// This is executed before any thread is created
public override void OnStart()
{
MyDataBaseContext dbcontext = new MyDataBaseContext();
this.TokenList = dbcontext.GetTokenList();
}
// After this the application will create a list of many items to be iterated
// and will create as many threads as are defined in the configuration (5 at the moment),
// then it will distribute those items among the threads for parallel processing.
// The OnProcessItem will be executed for every item and could be running on different threads
protected override void OnProcessItem(AppItem processingItem)
{
string expression = getExpressionFromItem();
expression = Utils.ReplaceTokens(processingItem, expression, this);
}
}
public class Utils
{
    public static string ReplaceTokens(AppItem currentProcessingItem, string expression, MainAppClass mainAppClass)
    {
        Regex tokenMatchExpression = new Regex(@"\[[^+~][^$*]+?\]", RegexOptions.IgnoreCase);
        Match tokenMatch = tokenMatchExpression.Match(expression);
        if (tokenMatch.Success == false)
        {
            return expression;
        }
        string tokenName = tokenMatch.Value;
        // This line is my principal suspect of messing in some way with the multiple threads
        Token tokenDefinition = mainAppClass.TokenList.Where(x => x.Name == tokenName).First();
        Regex tokenElementExpression = new Regex(tokenDefinition.Value);
        MyRegexSearchResult evaluationResult = Utils.GetRegexMatches(currentProcessingItem, tokenElementExpression).FirstOrDefault();
        string tokenValue = string.Empty;
        if (evaluationResult != null && evaluationResult.match.Groups.Count > 1)
        {
            tokenValue = evaluationResult.match.Groups[1].Value;
        }
        else if (evaluationResult != null && evaluationResult.match.Groups.Count == 1)
        {
            tokenValue = evaluationResult.match.Groups[0].Value;
        }
        expression = expression.Replace("[" + tokenName + "]", tokenValue);
        return expression;
    }
}
The problem I have right now is that, for some reason, the value of the token replaced in the expression gets confused with one from another thread, resulting in an incorrect replacement. For example:
Expression: Hello [Name]
Expected result for item 1: Hello Nick
Expected result for item 2: Hello Sally
Actual result for item 1: Hello Nick
Actual result for item 2: Hello Nick
The actual result is not always the same: sometimes it is the expected one, sometimes both expressions are replaced with the value expected for item 1, and sometimes both are replaced with the value expected for item 2.
I can't find what's wrong with the code. I expected all the variables within the static method to be in their own scope for every thread, but that doesn't seem to be the case.
Any help will be much appreciated!
Yeah, static objects only have one instance throughout the program - creating new threads doesn't create separate instances of those objects.
You've got a couple different ways of dealing with this.
Door #1. If the threads need to operate on different instances, you'll need to un-static the appropriate places. Give each thread its own instance of the object you need it to modify.
Door #2. Thread-safe objects (as mentioned by Fildor). I'll admit, I'm a bit less familiar with this door, but it's probably the right approach if you can get it to work (less complexity in code is awesome).
Door #3. Lock on the object directly. One option, when modifying the global static, is to put the modification inside a lock(myObject) { } block. Locks are pretty simple and straightforward (so much simpler than the old C/C++ days), and they make sure that simultaneous modifications don't screw the object up.
Door #4. Padlock the encapsulated class. Don't allow outside callers to modify the static variable at all. Instead, they have to call global getters/setters. Then, have a private object inside the class that serves simply as a lockable object - and have the getters/setters lock that lockable object whenever they're reading/writing it.
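A minimal sketch of Door #4, assuming the shared state is a simple list. The names TokenRegistry, Gate and LockDemo are hypothetical and not taken from the question's code; the point is only that the static field is private and every read or write takes the same private lock.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical shared registry: callers can never touch the list directly.
public static class TokenRegistry
{
    // Private object used only for locking.
    private static readonly object Gate = new object();
    private static readonly List<string> Tokens = new List<string>();

    public static void Add(string token)
    {
        lock (Gate) { Tokens.Add(token); }
    }

    public static int Count
    {
        get { lock (Gate) { return Tokens.Count; } }
    }
}

public static class LockDemo
{
    public static int Run()
    {
        // Many concurrent writers; the lock keeps List<string> consistent.
        Parallel.For(0, 1000, i => TokenRegistry.Add("t" + i));
        return TokenRegistry.Count; // 1000 on the first call in a process
    }
}
```

Without the lock, concurrent calls to List<string>.Add are not safe and items can be lost or the list corrupted.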
The tokenValue that you're replacing the token with is coming from evaluationResult.
evaluationResult is based on Utils.GetRegexMatches(currentProcessingItem, tokenElementExpression).
You might want to check GetRegexMatches to see if it's using any static resources, but my best guess is that it's being passed the same currentProcessingItem value in multiple threads.
Look at the code that splits up the AppItems. You may have an "access to modified closure" in there. For example:
for(int i = 0; i < appItems.Length; i++)
{
    var thread = new Thread(() => {
        // Since the variable `i` is shared across all of the
        // iterations of this loop, `appItems[i]` is going to be
        // based on the value of `i` at the time that this line
        // of code is run, not at the time when the thread is created.
        var appItem = appItems[i];
        ...
    });
    ...
}
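If that turns out to be the cause, the usual fix is to copy the loop variable into a local declared inside the loop body, so each lambda captures its own variable. A small sketch, with string items standing in for AppItem and ClosureDemo as a made-up name:

```csharp
using System.Collections.Generic;
using System.Threading;

public static class ClosureDemo
{
    // Computes the length of each item on its own thread.
    public static int[] Run(string[] appItems)
    {
        var lengths = new int[appItems.Length];
        var threads = new List<Thread>();
        for (int i = 0; i < appItems.Length; i++)
        {
            // A fresh variable per iteration: the lambda captures `index`,
            // which never changes, instead of the shared loop variable `i`.
            int index = i;
            var thread = new Thread(() =>
            {
                lengths[index] = appItems[index].Length;
            });
            threads.Add(thread);
            thread.Start();
        }
        // Wait for every worker before reading the results.
        foreach (var t in threads) t.Join();
        return lengths;
    }
}
```

Calling ClosureDemo.Run(new[] { "a", "bb", "ccc" }) gives 1, 2, 3 regardless of when each thread actually starts.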
I'm trying to understand how Subject<T>, ReplaySubject<T> and the others work. Here is an example:
(A subject is both an observable and an observer.)
public IObservable<int> CreateObservable()
{
Subject<int> subj = new Subject<int>(); // case 1
ReplaySubject<int> subj = new ReplaySubject<int>(); // case 2
Random rnd = new Random();
int maxValue = rnd.Next(20);
Trace.TraceInformation("Max value is: " + maxValue.ToString());
subj.OnNext(-1); // specific value
for(int iCounter = 0; iCounter < maxValue; iCounter++)
{
Trace.TraceInformation("Value: " + iCounter.ToString() + " is about to publish");
subj.OnNext(iCounter);
}
Trace.TraceInformation("Publish complete");
subj.OnCompleted();
return subj;
}
public void Main()
{
//
// First subscription
CreateObservable()
.Subscribe(
onNext: (x)=>{
Trace.TraceInformation("X is: " + x.ToString());
});
//
// Second subscribe
CreateObservable()
.Subscribe(
onNext: (x2)=>{
Trace.TraceInformation("X2 is: " + x2.ToString());
});
}
Case 1: The strange situation is that when I use Subject<T>, the subscription never receives anything (???). I never see the "X is: " text; I only see the "Value: " and "Max value is: " traces. Why doesn't Subject<T> push values to the subscription?
Case 2: If I use ReplaySubject<T>, I do see the values in the subscription, but I could not apply the Defer option to anything, neither to the subject nor to the observable. So every subscription will receive different values because the CreateObservable function is a cold observable. Where is Defer?
Whenever you need to create an observable out of thin air, Observable.Create should be the first thing to think of. Subjects enter the picture in two cases:
You need some kind of "addressable endpoint" to feed data to in order for all subscribers to receive it. Compare this to a .NET event which has both an invocation side (through delegate invocation) and a subscription side (through delegate combine with += and -= syntax). You'll find in a lot of cases, you can achieve the same effect using Observable.Create.
You need multicasting of messages in a query pipeline, effectively sharing an observable sequence by many forks in your query logic, without triggering multiple subscriptions. (Think of subscribing to your favorite magazine once for your dorm and putting a photo copier right behind the letter box. You still pay one subscription, though all of your friends can read the magazine delivered through OnNext on the letter box.)
Also, in a lot of cases, there's already a built-in primitive in Rx that does exactly what you need. For example, there's From* factory methods to bridge with existing concepts (such as events, tasks, asynchronous methods, enumerable sequence), some of which using a subject under the covers. For the second case of multicasting logic, there's the Publish, Replay, etc. family of operators.
You need to be mindful of when code is executed.
In "Case 1", when you use a Subject<T>, you'll notice that all of the calls to OnNext and OnCompleted finish before the observable is returned by the CreateObservable method. Since a Subject<T> does not replay past values, any subsequent subscription will have missed all of them, so you should expect to get what you got: nothing.
You have to delay the operations on the subject until the observer has subscribed. To do that, use the Create method. Here's how:
public IObservable<int> CreateObservable()
{
return Observable.Create<int>(o =>
{
var subj = new Subject<int>();
var disposable = subj.Subscribe(o);
var rnd = new Random();
var maxValue = rnd.Next(20);
subj.OnNext(-1);
for(int iCounter = 0; iCounter < maxValue; iCounter++)
{
subj.OnNext(iCounter);
}
subj.OnCompleted();
return disposable;
});
}
I've removed all the trace code for succinctness.
So now, for every subscriber, you get a new execution of the code inside the Create method and you would now get the values from the internal Subject<T>.
The use of the Create method is generally the correct way to create observables that you return from methods.
Alternatively you could use a ReplaySubject<T> and avoid the use of the Create method. However, this is unattractive for a number of reasons. It forces the computation of the entire sequence at creation time. This gives you a cold observable that you could have produced more efficiently without using a replay subject.
Now, as an aside, you should try to avoid using subjects at all. The general rule is that if you're using a subject then you're doing something wrong. The CreateObservable method would be better written as this:
public IObservable<int> CreateObservable()
{
return Observable.Create<int>(o =>
{
var rnd = new Random();
var maxValue = rnd.Next(20);
return Observable.Range(-1, maxValue + 1).Subscribe(o);
});
}
No need for a subject at all.
Let me know if this helps.
Can someone please explain what I am missing here? Based on my basic understanding, a LINQ result is calculated when the result is used, and I can see that in the following code.
static void Main(string[] args)
{
Action<IEnumerable<int>> print = (x) =>
{
foreach (int i in x)
{
Console.WriteLine(i);
}
};
int[] arr = { 1, 2, 3, 4, 5 };
int cutoff = 1;
IEnumerable<int> result = arr.Where(x => x < cutoff);
Console.WriteLine("First Print");
cutoff = 3;
print(result);
Console.WriteLine("Second Print");
cutoff = 4;
print(result);
Console.Read();
}
Output:
First Print
1
2
Second Print
1
2
3
Now I changed the
arr.Where(x => x < cutoff);
to
IEnumerable<int> result = arr.Take(cutoff);
and the output is as follows.
First Print
1
Second Print
1
Why doesn't Take use the current value of the variable?
The behavior you're seeing comes from the different ways in which the arguments to the LINQ functions are evaluated. The Where method receives a lambda which captures the variable cutoff by reference. It is evaluated on demand and hence sees the value of cutoff at that time.
The Take method (and similar methods like Skip) takes an int parameter, and hence cutoff is passed by value. The value used is the value of cutoff at the moment the Take method is called, not when the query is evaluated.
Note: The term late binding here is a bit incorrect. Late binding generally refers to the process where the members an expression binds to are determined at runtime vs. compile time. In C# you'd accomplish this with dynamic or reflection. The behavior of LINQ to evaluate its parts on demand is known as delayed execution.
There's a few different things getting confused here.
Late-binding: This is where the meaning of code is determined after it was compiled. For example, x.DoStuff() is early-bound if the compiler checks that objects of x's type have a DoStuff() method (considering extension methods and default arguments too) and then produces the call to it in the code it outputs, or fails with a compiler error otherwise. It is late-bound if the search for the DoStuff() method is done at run-time and throws a run-time exception if there was no DoStuff() method. There are pros and cons to each, and C# is normally early-bound but has support for late-binding (most simply through dynamic but the more convoluted approaches involving reflection also count).
Delayed execution: Strictly speaking, all Linq methods immediately produce a result. However, that result is an object which stores a reference to an enumerable object (often the result of the previous Linq method) which it will process in an appropriate manner when it is itself enumerated. For example, we can write our own Take method as:
private static IEnumerable<T> TakeHelper<T>(IEnumerable<T> source, int number)
{
foreach(T item in source)
{
yield return item;
if(--number == 0)
yield break;
}
}
public static IEnumerable<T> Take<T>(this IEnumerable<T> source, int number)
{
if(source == null)
throw new ArgumentNullException();
if(number < 0)
throw new ArgumentOutOfRangeException();
if(number == 0)
return Enumerable.Empty<T>();
return TakeHelper(source, number);
}
Now, when we use it:
var taken4 = someEnumerable.Take(4);//taken4 has a value, so we've already done
//something. If it was going to throw
//an argument exception it would have done so
//by now.
var firstTaken = taken4.First();//only now does the object in taken4
//do the further processing that iterates
//through someEnumerable.
Captured variables: Normally when we make use of a variable, we make use of its current state:
int i = 2;
string s = "abc";
Console.WriteLine(i);
Console.WriteLine(s);
i = 3;
s = "xyz";
It's pretty intuitive that this prints 2 and abc and not 3 and xyz. In anonymous functions and lambda expressions though, when we make use of a variable we are "capturing" it as a variable, and so we will end up using the value it has when the delegate is invoked:
int i = 2;
string s = "abc";
Action λ = () =>
{
Console.WriteLine(i);
Console.WriteLine(s);
};
i = 3;
s = "xyz";
λ();
Creating the λ doesn't use the values of i and s, but creates a set of instructions as to what to do with i and s when λ is invoked. Only when that happens are the values of i and s used.
Putting it all together: In none of your cases do you have any late-binding. That is irrelevant to your question.
In both you have delayed execution. Both the call to Take and the call to Where return enumerable objects which will act upon arr when they are enumerated.
In only one do you have a captured variable. The call to Take passes an integer directly to Take and Take makes use of that value. The call to Where passes a Func<int, bool> created from a lambda expression, and that lambda expression captures an int variable. Where knows nothing of this capture, but the Func does.
That's the reason the two behave so differently in how they treat cutoff.
Take doesn't take a lambda, but an integer, as such it can't change when you change the original variable.
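If you do want Take to pick up the value the variable has when the query is enumerated, one option is to pass the count as a delegate and read it lazily. The sketch below assumes a helper named TakeDeferred, which is not part of LINQ and is made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredTake
{
    // Hypothetical helper: because this is an iterator method, its body
    // (including the call to count()) does not run until enumeration starts.
    public static IEnumerable<T> TakeDeferred<T>(
        this IEnumerable<T> source, Func<int> count)
    {
        foreach (var item in source.Take(count()))
            yield return item;
    }
}

public static class DeferredTakeDemo
{
    public static int[] Run()
    {
        int[] arr = { 1, 2, 3, 4, 5 };
        int cutoff = 1;
        // The lambda captures the variable cutoff, not its current value.
        IEnumerable<int> result = arr.TakeDeferred(() => cutoff);
        cutoff = 3;
        return result.ToArray(); // 1, 2, 3
    }
}
```

Here the captured variable does the same job it does in the Where version: the delegate is invoked at enumeration time and sees cutoff as 3.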
Lambdas are nice, as they offer brevity and locality and an extra form of encapsulation. Instead of having to write functions which are only used once you can use a lambda.
While wondering how they worked, I intuitively figured they are probably only created once. This inspired me to create a solution which allows to restrict the scope of a class member beyond private to one particular scope by using the lambda as an identifier of the scope it was created in.
This implementation works, although perhaps overkill (still researching it), proving my assumption to be correct.
A smaller example:
class SomeClass
{
public void Bleh()
{
Action action = () => {};
}
public void CallBleh()
{
Bleh(); // `action` == {Method = {Void <SomeClass>b__0()}}
Bleh(); // `action` still == {Method = {Void <SomeClass>b__0()}}
}
}
Would the lambda ever return a new instance, or is it guaranteed to always be the same?
It's not guaranteed either way.
From what I remember of the current MS implementation:
A lambda expression which doesn't capture any variables is cached statically
A lambda expression which only captures "this" could be cached on a per-instance basis, but isn't
A lambda expression which captures a local variable can't be cached
Two lambda expressions which have the exact same program text aren't aliased; in some cases they could be, but working out the situations in which they can be would be very complicated
EDIT: As Eric points out in the comments, you also need to consider type arguments being captured for generic methods.
EDIT: The relevant text of the C# 4 spec is in section 6.5.1:
Conversions of semantically identical anonymous functions with the same (possibly empty) set of captured outer variable instances to the same delegate types are permitted (but not required) to return the same delegate instance. The term semantically identical is used here to mean that execution of the anonymous functions will, in all cases, produce the same effects given the same arguments.
Based on your question here and your comment to Jon's answer I think you are confusing multiple things. To make sure it is clear:
The method that backs the delegate for a given lambda is always the same.
The method that backs the delegate for "the same" lambda that appears lexically twice is permitted to be the same, but in practice is not the same in our implementation.
The delegate instance that is created for a given lambda might or might not always be the same, depending on how smart the compiler is about caching it.
So if you have something like:
for(i = 0; i < 10; ++i)
M( ()=>{} )
then every time M is called, you get the same instance of the delegate because the compiler is smart and generates
static void MyAction() {}
static Action DelegateCache = null;
...
for(i = 0; i < 10; ++i)
{
if (C.DelegateCache == null) C.DelegateCache = new Action ( C.MyAction )
M(C.DelegateCache);
}
If you have
for(i = 0; i < 10; ++i)
M( ()=>{this.Bar();} )
then the compiler generates
void MyAction() { this.Bar(); }
...
for(i = 0; i < 10; ++i)
{
M(new Action(this.MyAction));
}
You get a new delegate every time, with the same method.
The compiler is permitted to (but in fact does not at this time) generate
void MyAction() { this.Bar(); }
Action DelegateCache = null;
...
for(i = 0; i < 10; ++i)
{
if (this.DelegateCache == null) this.DelegateCache = new Action ( this.MyAction )
M(this.DelegateCache);
}
In that case you would always get the same delegate instance if possible, and every delegate would be backed by the same method.
If you have
Action a1 = ()=>{};
Action a2 = ()=>{};
Then in practice the compiler generates this as
static void MyAction1() {}
static void MyAction2() {}
static Action ActionCache1 = null;
static Action ActionCache2 = null;
...
if (ActionCache1 == null) ActionCache1 = new Action(MyAction1);
Action a1 = ActionCache1;
if (ActionCache2 == null) ActionCache2 = new Action(MyAction2);
Action a2 = ActionCache2;
However the compiler is permitted to detect that the two lambdas are identical and generate
static void MyAction1() {}
static Action ActionCache1 = null;
...
if (ActionCache1 == null) ActionCache1 = new Action(MyAction1);
Action a1 = ActionCache1;
Action a2 = ActionCache1;
Is that now clear?
No guarantees.
A quick demo:
Action GetAction()
{
return () => Console.WriteLine("foo");
}
Call this twice, do a ReferenceEquals(a,b), and you'll get true
Action GetAction()
{
var foo = "foo";
return () => Console.WriteLine(foo);
}
Call this twice, do a ReferenceEquals(a,b), and you'll get false
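For anyone who wants to try this, here is a self-contained version of the two snippets above. The class and method names are made up for the sketch. What it reports depends on the compiler: the results described above are what the Microsoft compiler does in practice, and the language specification guarantees neither.

```csharp
using System;

public static class DelegateIdentityDemo
{
    // Lambda that captures nothing.
    static Action NonCapturing() => () => Console.WriteLine("foo");

    // Lambda that captures a local variable.
    static Action Capturing()
    {
        var foo = "foo";
        return () => Console.WriteLine(foo);
    }

    // Compares the delegate instances returned by two calls of each method.
    public static (bool nonCapturingSame, bool capturingSame) Run()
    {
        return (object.ReferenceEquals(NonCapturing(), NonCapturing()),
                object.ReferenceEquals(Capturing(), Capturing()));
    }
}
```

Treat the outcome as an observation about one implementation, not as something to rely on in code.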
I see Skeet jumped in while I was answering, so I won't belabor that point. One thing I would suggest, to better understand how you are using things, is to get familiar with reverse engineering tools and IL. Take the code sample(s) in question and reverse engineer to IL. It will give you a great amount of information on how the code is working.
Good question. I don't have an "academic answer," more of a practical answer: I could see a compiler optimizing the binary to use the same instance, but I wouldn't ever write code that assumes it's "guaranteed" to be the same instance.
I upvoted you at least, so hopefully someone can give you the academic answer you're looking for.
What is the most efficient way to find a sequence within an IEnumerable<T> using LINQ?
I want to be able to create an extension method which allows the following call:
int startIndex = largeSequence.FindSequence(subSequence)
The match must be adjacent and in order.
Here's an implementation of an algorithm that finds a subsequence in a sequence. I called the method IndexOfSequence, because it makes the intent more explicit and is similar to the existing IndexOf method:
public static class ExtensionMethods
{
public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence)
{
return source.IndexOfSequence(sequence, EqualityComparer<T>.Default);
}
public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence, IEqualityComparer<T> comparer)
{
var seq = sequence.ToArray();
int p = 0; // current position in source sequence
int i = 0; // current position in searched sequence
var prospects = new List<int>(); // list of prospective matches
foreach (var item in source)
{
// Remove bad prospective matches
prospects.RemoveAll(k => !comparer.Equals(item, seq[p - k]));
// Is it the start of a prospective match ?
if (comparer.Equals(item, seq[0]))
{
prospects.Add(p);
}
// Does current character continues partial match ?
if (comparer.Equals(item, seq[i]))
{
i++;
// Do we have a complete match ?
if (i == seq.Length)
{
// Bingo !
return p - seq.Length + 1;
}
}
else // Mismatch
{
// Do we have prospective matches to fall back to ?
if (prospects.Count > 0)
{
// Yes, use the first one
int k = prospects[0];
i = p - k + 1;
}
else
{
// No, start from beginning of searched sequence
i = 0;
}
}
p++;
}
// No match
return -1;
}
}
I didn't fully test it, so it might still contain bugs. I just did a few tests on well-known corner cases to make sure I wasn't falling into obvious traps. Seems to work fine so far...
I think the complexity is close to O(n), but I'm not an expert in Big O notation so I could be wrong... at least it only enumerates the source sequence once, without ever going back, so it should be reasonably efficient.
The code you say you want to be able to use isn't LINQ, so I don't see why it need be implemented with LINQ.
This is essentially the same problem as substring searching (indeed, an enumeration where order is significant is a generalisation of "string").
Computer science has considered this problem frequently for a long time, so you get to stand on the shoulders of giants.
Some reasonable starting points are:
http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
http://en.wikipedia.org/wiki/Rabin-karp
Even just the pseudocode in the wikipedia articles is enough to port to C# quite easily. Look at the descriptions of performance in different cases and decide which cases are most likely to be encountered by your code.
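As a rough idea of what such a port can look like, here is a sketch of Knuth-Morris-Pratt adapted to IEnumerable<T>. The method name IndexOfSequenceKmp and the optional comparer parameter are my own choices for the example. It enumerates the source once and buffers only the pattern.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KmpExtensions
{
    // Returns the index in source where pattern first occurs
    // contiguously and in order, or -1 if it never does.
    public static int IndexOfSequenceKmp<T>(
        this IEnumerable<T> source,
        IEnumerable<T> pattern,
        IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        T[] pat = pattern.ToArray();
        if (pat.Length == 0) return 0; // empty pattern matches at the start

        // failure[i] = length of the longest proper prefix of pat[0..i]
        // that is also a suffix of pat[0..i].
        var failure = new int[pat.Length];
        for (int i = 1, k = 0; i < pat.Length; i++)
        {
            while (k > 0 && !comparer.Equals(pat[i], pat[k])) k = failure[k - 1];
            if (comparer.Equals(pat[i], pat[k])) k++;
            failure[i] = k;
        }

        int matched = 0;  // number of pattern items matched so far
        int position = 0; // index of the current source item
        foreach (T item in source)
        {
            // On a mismatch, fall back to the longest shorter partial match.
            while (matched > 0 && !comparer.Equals(item, pat[matched]))
                matched = failure[matched - 1];
            if (comparer.Equals(item, pat[matched])) matched++;
            if (matched == pat.Length) return position - pat.Length + 1;
            position++;
        }
        return -1;
    }
}
```

For example, new[] { 1, 2, 1, 2, 3 }.IndexOfSequenceKmp(new[] { 1, 2, 3 }) returns 2, and searching for { 4 } returns -1.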
I understand this is an old question, but I needed this exact method and I wrote it up like so:
public static int ContainsSubsequence<T>(this IEnumerable<T> elements, IEnumerable<T> subSequence) where T: IEquatable<T>
{
    return ContainsSubsequence(elements, 0, subSequence);
}
private static int ContainsSubsequence<T>(IEnumerable<T> elements, int index, IEnumerable<T> subSequence) where T: IEquatable<T>
{
    // Do we have any elements left?
    bool elementsLeft = elements.Any();
    // Do we have any of the sub-sequence left?
    bool sequenceLeft = subSequence.Any();
    // No elements but sub-sequence not fully matched
    if (!elementsLeft && sequenceLeft)
        return -1; // Nope, didn't match
    // No elements of sub-sequence, which means even if there are
    // more elements, we matched the sub-sequence fully
    if (!sequenceLeft)
        return index - subSequence.Count(); // Matched!
    // If we didn't reach a terminal condition,
    // check the first element of the sub-sequence against the first element
    if (subSequence.First().Equals(elements.First()))
        // Yes, it matched - move onto the next. Consume (skip) one element in each
        return ContainsSubsequence(elements.Skip(1), index + 1, subSequence.Skip(1));
    else
        // No, it didn't match. Try the next element, without consuming an element
        // from the sub-sequence
        return ContainsSubsequence(elements.Skip(1), index + 1, subSequence);
}
Updated to not just return if the sub-sequence matched, but where it started in the original sequence.
This is an extension method on IEnumerable, fully lazy, terminates early and is far more linq-ified than the currently up-voted answer. Be warned, however (as @wai-ha-lee points out), that it is recursive and creates a lot of enumerators. Use it where applicable (performance/memory). This was fine for my needs, but YMMV.
You can use this library called Sequences to do that (disclaimer: I'm the author).
It has an IndexOfSlice method that does exactly what you need - it's an implementation of the Knuth-Morris-Pratt algorithm.
int startIndex = largeSequence.AsSequence().IndexOfSlice(subSequence);
UPDATE:
Given the clarification of the question my response below isn't as applicable. Leaving it for historical purposes.
You probably want to use mySequence.Where(). Then the key is to optimize the predicate to work well in your environment. This can vary quite a bit depending on your requirements and typical usage patterns.
It is quite possible that what works well for small collections doesn't scale well for much larger collections depending on what type T is.
Of course, if the 90% use is for small collections then optimizing for the outlier large collection seems a bit YAGNI.