The Accord.NET Project Home (http://code.google.com/p/accord/) contains examples of creating, training, and evaluating Hidden Markov Models based on sequences of one-variable observations. I'd like to do the same, but with sequences of many variables. I'm envisioning a multiple regression structure with a dependent variable and multiple independent variables. I want to be able to estimate an HMM where the output includes estimated intercepts and coefficients for each state, along with a transition probability matrix. An example is time-varying betas for stock returns, e.g. ret(IBM) = intercept + b1*ret(Index) + b2*ret(SectorETF) + error, but where intercept, b1, and b2 are state-dependent.
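To make the target concrete, here is a tiny sketch of the state-dependent regression structure I have in mind (Python for brevity; the parameter values and the transition matrix are entirely made up for illustration):

```python
import numpy as np

# Hypothetical per-state parameters (2 states): intercept, b1, b2
params = np.array([[0.001, 0.9, 0.2],    # state 0: low-beta regime
                   [-0.002, 1.4, 0.6]])  # state 1: high-beta regime

# Illustrative transition probability matrix: P[i, j] = P(next = j | current = i)
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])

def expected_return(state, ret_index, ret_sector):
    """ret(IBM) = intercept + b1*ret(Index) + b2*ret(SectorETF), state-dependent."""
    a, b1, b2 = params[state]
    return a + b1 * ret_index + b2 * ret_sector

# The same 1% index move contributes less in state 0 than in state 1.
r0 = expected_return(0, 0.01, 0.005)
r1 = expected_return(1, 0.01, 0.005)
print(r0, r1)
```

Estimating `params` and `P` jointly from data is exactly what I'd like the HMM learner to do.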
Marcelo Perlin offers exactly this functionality in his MS_Regress package for Matlab. However, I want this functionality in C#. So, any help would be greatly appreciated on either (1) using Accord.NET libraries to estimate a multiple regression HMM model, (2) translating Marcelo Perlin's package into C#, or (3) other ideas on how to achieve my goal.
Thank you!
The Accord.NET Framework supports multidimensional features as well. You can specify any probability distribution to use in the states by using generics, and there is also an example available in the documentation.
If you have, for example, two-dimensional observation vectors and choose to assume Gaussian emission densities, you could use:
// Assume a Normal distribution for two-dimensional samples.
var density = new MultivariateNormalDistribution(dimension: 2);
// Create a continuous hidden Markov Model with two states organized in a forward
// topology and an underlying multivariate Normal distribution as emission density.
var model = new HiddenMarkovModel<MultivariateNormalDistribution>(new Forward(2), density);
and then you can learn the model using the generic versions of the usual Baum-Welch, Viterbi or Maximum Likelihood learners.
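Under the hood, each state simply evaluates a multivariate Normal density for the observation vector. As a language-neutral sketch of what that emission density computes (plain NumPy here, not Accord code):

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Log-density of a multivariate Normal emission for observation vector x."""
    d = len(mean)
    diff = x - mean
    sign, logdet = np.linalg.slogdet(cov)          # log-determinant of covariance
    maha = diff @ np.linalg.solve(cov, diff)       # squared Mahalanobis distance
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

# Two-dimensional observation at the mean of a standard Normal emission density
lp = mvn_logpdf(np.array([0.0, 0.0]), np.zeros(2), np.eye(2))
print(lp)  # -log(2*pi), about -1.8379
```

Baum-Welch then reweights the observations by state responsibilities and refits each state's mean and covariance on every iteration.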
However, what the framework unfortunately still doesn't support is the exact regression form you mentioned. But it looks very interesting; perhaps it could be added to the framework sometime in the future. If you wish, please leave it as a suggestion, together with some references and papers, in the Issue Tracker of the project. It seems like it would be a very useful addition.
I am trying to use a MCS (Multi classifier system) to do some better work on limited data i.e become more accurate.
I am using K-means clustering at the moment, but may choose to go with FCM (fuzzy c-means). With that, the data is clustered into groups (clusters); the data could represent anything, colours for example. I first cluster the data after pre-processing and normalization, and get some distinct clusters with a lot in between. I then use the clusters as the data for a Bayes classifier: each cluster represents a distinct colour, and the data from the clusters is put through separate Bayes classifiers, each trained on only one colour. For example, take the colour spectrum: 3-10 is blue and 13-20 is red; in between, 0-1.5 is white, turning gradually to blue through 1.5-3, and similarly for the transition from blue to red.
What I would like to know is what kind of aggregation method (if that is what you would use) could be applied so that the Bayes classifiers become stronger, and how does it work? Does the aggregation method already know the answer, or would it be human interaction that corrects the outputs, with those answers then going back into the Bayes training data? Or a combination of both? Looking at bootstrap aggregating, it involves having each model in the ensemble vote with equal weight, so I'm not quite sure that in this particular instance I would use bagging as my aggregation method. Boosting, however, involves incrementally building an ensemble by training each new model instance to emphasize the training instances that previous models misclassified; I'm not sure if this would be a better alternative to bagging, as I'm unsure how it incrementally builds upon new instances. And the last one would be Bayesian model averaging, an ensemble technique that seeks to approximate the Bayes optimal classifier by sampling hypotheses from the hypothesis space and combining them using Bayes' law; however, I'm completely unsure how you would sample hypotheses from the search space.
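To make the two aggregation styles concrete, here is a toy sketch (the votes and weights are made up) of equal-weight voting, as in bagging, versus reliability-weighted voting, as boosting effectively produces:

```python
from collections import Counter

def majority_vote(predictions):
    """Bagging-style aggregation: each classifier votes with equal weight."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, weights):
    """Boosting-style aggregation: votes weighted by classifier reliability."""
    scores = {}
    for label, w in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

votes = ["blue", "blue", "red"]
print(majority_vote(votes))                    # blue
print(weighted_vote(votes, [0.2, 0.2, 0.9]))   # red - one strong classifier outweighs two weak ones
```

Neither scheme "knows the answer"; the weights are typically learned from each classifier's error rate on held-out or training data, which is where human-labelled corrections would feed back in.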
I know that usually you would use a competitive approach to bounce between the two classification algorithms - one says yes, one says maybe - and a weighting could be applied so that if it's correct you get the best of both classifiers, but for the sake of this question I don't want a competitive approach.
Another question: would using these two methods together in such a way be beneficial? I know the example I provided is very primitive and may not apply there, but can it be beneficial on more complex data?
I have some issues about the method you are following:
K-means assigns to each cluster the points that are nearest to its centroid, and you then train a classifier on the output data. I think the classifier may outperform the clustering's implicit classification, but only by taking into account the number of samples in each cluster. For example, if after clustering your training data contains typeA (60%), typeB (20%), typeC (20%), your classifier will prefer to assign ambiguous samples to typeA in order to obtain a smaller classification error.
K-means depends on what "coordinates"/"features" you take from the objects. If you use features in which the objects of different types are mixed, K-means performance will decrease. Deleting these kinds of features from the feature vector may improve your results.
The "features"/"coordinates" that represent the objects you want to classify may be measured in different units. This fact can affect your clustering algorithm, since you are implicitly setting a unit conversion between them through the clustering error function. The final set of clusters is selected over multiple clustering trials (obtained from different cluster initializations), using an error function. Thus, an implicit comparison is made across the different coordinates of your feature vector (potentially introducing an implicit conversion factor).
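A common way to remove that implicit conversion factor is to standardize each feature before clustering. A minimal sketch, with made-up values standing in for two features measured in very different units:

```python
import numpy as np

# Two features in wildly different units (say, metres vs. millimetres)
X = np.array([[1.0, 1000.0],
              [2.0, 3000.0],
              [3.0, 2000.0]])

# Z-score standardization: each column gets mean 0 and unit variance, so the
# Euclidean distance used by K-means no longer hides a unit conversion.
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
print(Xz.mean(axis=0), Xz.std(axis=0))
```

After this, every feature contributes to the clustering error function on the same scale.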
Taking into account these three points, you will probably increase the overall performance of your algorithm by adding preprocessing stages. For example, in object recognition for computer vision applications, most of the information taken from the images comes only from borders in the image; all the color information and part of the texture information are not used. The borders are extracted by processing the image to obtain the Histogram of Oriented Gradients (HOG) descriptors. This descriptor gives back "features"/"coordinates" that separate the objects better, thus increasing classification (object recognition) performance. Theoretically, descriptors throw away information contained in the image. However, they present two main advantages: (a) the classifier will deal with lower-dimensionality data, and (b) descriptors calculated from test data can be more easily matched with training data.
In your case, I suggest that you try to improve your accuracy taking a similar approach:
Give richer features to your clustering algorithm
Take advantage of prior knowledge in the field to decide what features you should add and delete from your feature vector
Always consider the possibility of obtaining labeled data, so that supervised learning algorithms can be applied
I hope this helps...
What are the best clustering algorithms to use in order to cluster data with more than 100 dimensions (sometimes even 1000)? I would appreciate it if you know of any implementation in C, C++ or, especially, C#.
It depends heavily on your data. See the curse of dimensionality for common problems. Recent research (Houle et al.) showed that you can't really go by the numbers: there may be thousands of dimensions and the data may still cluster well, and of course there is one-dimensional data that just doesn't cluster. It's mostly a matter of signal-to-noise ratio.
This is why for example clustering of TF-IDF vectors works rather well, in particular with cosine distance.
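A minimal sketch of why cosine distance behaves well here (the "TF-IDF" weights below are made up): it compares only the direction of the vectors, so two documents about the same topic are close even in a high-dimensional, sparse space.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity: ignores vector magnitude, only direction matters."""
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy "TF-IDF" vectors for three documents over a 4-term vocabulary
doc1 = np.array([0.9, 0.1, 0.0, 0.0])
doc2 = np.array([0.8, 0.2, 0.0, 0.0])   # similar topic to doc1
doc3 = np.array([0.0, 0.0, 0.7, 0.7])   # disjoint vocabulary

print(cosine_distance(doc1, doc2), cosine_distance(doc1, doc3))  # small (~0.01) vs. exactly 1.0
```

Documents sharing no terms are at the maximum distance of 1, regardless of how many thousands of zero dimensions they share.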
But the key point is that you first need to understand the nature of your data. You then can pick appropriate distance functions, weights, parameters and ... algorithms.
In particular, you also need to know what constitutes a cluster for you. There are many definitions, in particular for high-dimensional data. They may be in subspaces, they may or may not be arbitrarily rotated, they may overlap or not (k-means for example, doesn't allow overlaps or subspaces).
Well, I know of something called vector quantization; it's a nice algorithm for clustering stuff with many dimensions.
I've used k-means on data with hundreds of dimensions. It is very common, so I'm sure there's an implementation in any language; worst-case scenario, it is very easy to implement yourself.
It might also be worth trying some dimensionality-reduction techniques like Principal Component Analysis or an auto-associative neural net before you try to cluster it. It can turn a huge problem into a much smaller one.
After that, go with k-means or a mixture of Gaussians.
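A minimal PCA-via-SVD sketch of that reduction step (the data here is synthetic, with the signal deliberately placed in a 2-D subspace of a 100-D space):

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 samples in 100 dimensions, but the signal lives in a 2-D subspace
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 100))
X = latent @ basis + 0.01 * rng.normal(size=(200, 100))

# PCA via SVD of the centered data: project onto the top-k principal components
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
X_reduced = Xc @ Vt[:k].T
print(X_reduced.shape)  # (200, 2)

# The first two components should capture almost all of the variance
explained = (S[:k] ** 2).sum() / (S ** 2).sum()
print(round(explained, 3))
```

The reduced 2-D data can then be fed to k-means or a Gaussian mixture at a fraction of the original cost.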
The EM-tree and K-tree algorithms in the LMW-tree project can cluster high dimensional problems like this. It is implemented in C++ and supports many different representations.
We have novel algorithms for clustering binary vectors created by LSH / random projections, or anything else that emits binary vectors that can be compared via Hamming distance for similarity.
I need a little push in the right direction.
I want to code a framework in C# that allows me to create graphs that process (mostly numerical) data. I've been looking for the right nomenclature, and for other projects with the same goal, but found no satisfactory results. I'm pretty sure code like this already exists, and I don't want to completely reinvent the wheel. Also, more experienced programmers will probably use techniques (templates, interfaces, ...) that I would love to learn by examining their code.
The framework should process data much like the DirectShow framework handles video. Some components produce data (e.g. a file reader or a sensor), some components manipulate data (e.g. add, average) and some components render data (e.g. a file writer or a chart-drawing control). The components/nodes are connected using edges/lines.
Nodes can have multiple inputs (sinks) and outputs (sources). The framework should encompass the base classes that allow filter graphs to be constructed. Applications using the framework must subclass to implement the actual source, transform and render components.
An example: a GPS device produces latitude and longitude values (2 output pins). A calculator transforms these values into Cartesian coordinates. The next component takes two consecutive coordinates and calculates the distance.
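Here is roughly the shape I have in mind, as a toy push-based sketch (Python for brevity, though my target is C#; the class names are made up, each node has a single implicit input pin, and a flat-plane distance stands in for the real coordinate math):

```python
import math

class Node:
    """Minimal push-based dataflow node: outputs are wired to downstream nodes."""
    def __init__(self):
        self.outputs = []          # downstream nodes connected to this node's output
    def connect(self, node):
        self.outputs.append(node)
    def emit(self, value):
        for node in self.outputs:
            node.receive(value)

class GpsSource(Node):
    """Source: produces (latitude, longitude) pairs."""
    def push(self, lat, lon):
        self.emit((lat, lon))

class DistanceTransform(Node):
    """Transform: distance between two consecutive coordinates (flat-plane approximation)."""
    def __init__(self):
        super().__init__()
        self.prev = None
    def receive(self, coord):
        if self.prev is not None:
            dx = coord[0] - self.prev[0]
            dy = coord[1] - self.prev[1]
            self.emit(math.hypot(dx, dy))
        self.prev = coord

class ListSink(Node):
    """Render/sink: collects results (stands in for a file writer or chart control)."""
    def __init__(self):
        super().__init__()
        self.values = []
    def receive(self, value):
        self.values.append(value)

# Wire up source -> transform -> sink, then push two fixes through the graph
source, dist, sink = GpsSource(), DistanceTransform(), ListSink()
source.connect(dist)
dist.connect(sink)
source.push(0.0, 0.0)
source.push(3.0, 4.0)
print(sink.values)  # [5.0]
```

In a real C# framework the `receive`/`emit` pair would become typed input/output pin interfaces, so that multiple pins per node and compile-time type checking of connections become possible.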
I am looking for tips, references and example code that will enable me to code the framework. Thanks!
UPDATE: Pipes.NET looks promising.
UPDATE: Dataflow is a relevant term.
I suppose you could use QuickGraph for all your graph needs. If none of the built-in algorithms are of use, it's always possible to simply iterate over the tree and invoke whatever custom logic you want along the way.
I am looking for some kind of intelligent (I was thinking AI or Neural network) library that I can feed a list of historical data and this will predict the next sequence of outputs.
As an example I would like to feed the library the following figures 1,2,3,4,5
and based on this, it should predict that the next sequence is 6, 7, 8, 9, 10, etc.
The inputs will be a lot more complex and contain much more information.
This will be used in a C# application.
If you have any recommendations or warning that will be great.
Thanks
EDIT
What I am trying to do is, using historical sales data, predict what amount a specific client is most likely going to spend in the next period.
I do understand that there are dozens of external factors that can influence a client's purchases, but for now I need to base it merely on the sales history, and then plot a graph showing past sales and predicted sales.
If you're looking for a .NET API, then I would recommend you try AForge.NET http://code.google.com/p/aforge/
If you just want to try various machine learning algorithms on a data set that you have at your disposal, then I would recommend that you play around with Weka; it's (relatively) easy to use and it implements a lot of ML/AI algorithms. Run multiple runs with different settings for each algorithm and try as many algorithms as you can. Most of them will have some predictive power and if you combine the right ones, then you might really get something useful.
If I understand your question correctly, you want to approximate and extrapolate an unknown function. In your example, you know the function values
f(0) = 1
f(1) = 2
f(2) = 3
f(3) = 4
f(4) = 5
A good approximation for these points would be f(x) = x+1, and that would yield f(5) = 6... as expected. The problem is, you can't solve this without knowledge about the function you want to extrapolate: Is it linear? Is it a polynomial? Is it smooth? Is it (approximately or exactly) cyclic? What is the range and domain of the function? The more you know about the function you want to extrapolate, the better your predictions will be.
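A minimal sketch of that fit under the linearity assumption (using a least-squares polynomial fit; Python for brevity, though the asker's target is C#):

```python
import numpy as np

# The known samples: f(0..4) = 1..5
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# A least-squares fit of a first-degree polynomial recovers f(x) = x + 1 ...
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)       # 1.0 1.0

# ... and extrapolates f(5) = 6, but ONLY because we assumed the model is linear.
print(slope * 5 + intercept)  # 6.0
```

Swap the assumption (say, fit a degree-4 polynomial, or a periodic function) and the same five points extrapolate to a completely different value, which is exactly the point above.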
I just have a warning, sorry. =)
Mathematically, there is no reason for your sequence above to be followed by a "6". I can easily give you a simple function whose next value is any value you like. It's just that humans like simple rules, and therefore tend to see a connection in these sequences that in reality is not there. Therefore, this is an impossible task for a computer if you do not feed it additional information.
Edit:
In the case that you suspect your data to have a known functional dependence, and there are uncontrollable outside factors, maybe regression analysis will have good results. To start easy, look at linear regression first.
If you cannot assume linear dependence, there is a nice application that looks for functions fitting your historical data... I'll update this post with its name as soon as I remember. =)
We have some examples of pictures.
And we have an input set of pictures. Every input picture is one of the examples after a combination of the following:
1) Rotating
2) Scaling
3) Cutting part of it
4) Adding noise
5) Using filter of some color
It is guaranteed that a human can recognize the picture easily.
I need a simple but effective algorithm to recognize which of the base examples an input picture was derived from.
I am writing in C# and Java.
I don't think there is a single simple algorithm which will enable you to recognise images under all the conditions you mention.
One technique which might cover most of them is to Fourier transform the image, but this can't be described as simple by any stretch of the imagination, and it will involve some pretty heavy mathematical concepts.
You might find it useful to search in the field of Digital Signal Processing which includes image processing since they're just two dimensional signals.
EDIT: Apparently the problem is limited to recognising MONEY (notes and coins) so the first problem of searching becomes avoiding websites which mention money as the result of using their image-recognition product, rather than as the source of the images.
Anyway, I found more useful hits by searching for 'Currency Image Recognition', including some which mention Hidden Markov Models (whatever that means). That may be the algorithm you're searching for.
The problem is simplified by having a small set of target images, but complicated by the need to detect counterfeits.
I still don't think there's a 'simple algorithm' for this job. Good luck in your searching.
There is some good research going on in the field of computer vision. One of the problems being solved is the identification of an object irrespective of scale changes, noise additions, and skews introduced because a photo has been taken from a different view. I did a little assignment on this two years back as part of a computer vision course. There is a transformation called the scale-invariant feature transform (SIFT) by which you can extract various features for the corner points. Corner points are those which are different from all their neighboring pixels. As you can observe, if a photo has been taken from two different views, some edges may disappear or appear to be something else, but corners remain almost the same. This transformation explains how a feature vector of size 128 can be extracted for each corner point, and tells you how to use these feature vectors to find the similarity between two images. In your case:
You can extract those features for each of the base currency notes you have, and check for the existence of those corner points in the test image you are supposed to test.
As this transformation is robust to rotation, scaling, cropping, noise addition and color filtering, I guess this is the best I can suggest to you. You can check this demo to get a better picture of what I explained.
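A minimal sketch of the matching step (toy random 128-dimensional vectors stand in for real SIFT descriptors): nearest-neighbour matching with Lowe's ratio test, which accepts a match only when the best candidate is clearly closer than the second best.

```python
import numpy as np

rng = np.random.default_rng(1)

def match_descriptors(query, reference, ratio=0.8):
    """Nearest-neighbour descriptor matching with Lowe's ratio test: keep a
    match only if the best distance is clearly smaller than the second best."""
    matches = []
    for i, q in enumerate(query):
        d = np.linalg.norm(reference - q, axis=1)   # distances to all reference descriptors
        nearest = np.argsort(d)[:2]
        if d[nearest[0]] < ratio * d[nearest[1]]:
            matches.append((i, int(nearest[0])))
    return matches

# Toy 128-d "descriptors": the reference note's corner features ...
reference = rng.normal(size=(10, 128))
# ... and a query image whose first 3 descriptors are noisy copies of references 0..2
query = reference[:3] + 0.05 * rng.normal(size=(3, 128))

print(match_descriptors(query, reference))  # [(0, 0), (1, 1), (2, 2)]
```

Counting how many ratio-test matches each base note gets against the test image then gives a simple similarity score for deciding which note it is.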
OpenCV has lots of algorithms and features; I guess it should be suitable for your problem. However, you'll have to play with P/Invoke to consume it from C# (it's a C library) - doable, but requires some work.
You would need to build a set of functions that compute the probability of a particular transform between two images f(A,B). A number of transforms have previously been suggested as answers, e.g. Fourier. You would probably not be able to compute the probability of multiple transforms in one go fgh(A,B) with any reliability. So, you would compute the probability that each transform was independently applied f(A,B) g(A,B) h(A,B) and those with P above a threshold are the solution.
If the order is important, i.e you need to know that f(A,B) then g(f,B) then h(g,B) was performed, then you would need to adopt a state based probability framework such as Hidden Markov Models or a Bayesian Network (well, this is a generalization of HMMs) to model the likelihood of moving between states. See the BNT toolbox for Matlab (http://people.cs.ubc.ca/~murphyk/Software/BNT/bnt.html) for more details on these or any good modern AI book.