I would like to know if there is a way that I can determine whether or not two images are the same (there are lots of posts on that topic, I know), but it's also possible that one picture is a compressed version of the other image...
This post also asks for a C# library that does image processing and comparison, but I'm not really sure what kind of functions a library would need to provide for this specific match. This post, on the other hand, is way too abstract.
I've read about OpenCV (or this .NET wrapper), but I have no experience with it, and I'm not sure it will do what I want without applying the abstractions from that post, which I didn't understand myself. I mean, OpenCV is probably capable of doing the required computations, and it seems like a very powerful tool, but it seems a bit complex for the seemingly simple requirement. Or is this actually more complex, and is something like OpenCV the way to go? (And if so, how?)
So, how would I go about achieving this?
One simple path you could try is the AForge.NET library. It is a pure .NET implementation, so there are no worries about environment setup, and it has the following functionality that might fit your case:
ExhaustiveTemplateMatching Class
"The class implements exhaustive template matching algorithm, which performs complete scan of source image, comparing each pixel with corresponding pixel of template. The class also can be used to get similarity level between two images of the same size, which can be useful to get information about how different/similar are images"
http://www.aforgenet.com/framework/docs/html/17494328-ef0c-dc83-1bc3-907b7b75039f.htm
// create template matching algorithm's instance
// use zero similarity to make sure algorithm will provide anything
ExhaustiveTemplateMatching tm = new ExhaustiveTemplateMatching(0);
// compare two images
TemplateMatch[] matchings = tm.ProcessImage(image1, image2);
// check similarity level
if (matchings[0].Similarity > 0.95f)
{
    // do something with quite similar images
}
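To see what a "similarity level" between two same-sized images can mean when one of them may be a recompressed copy, here is a minimal library-free sketch. It assumes both images have already been decoded into byte arrays of equal length (for example via LockBits). The ImageSimilarity class and its 0..1 score are hypothetical names for illustration, not AForge's actual formula.

```csharp
using System;

public static class ImageSimilarity
{
    // Returns 1.0 for identical buffers and 0.0 for maximally different ones.
    // Small compression artifacts only lower the score slightly.
    public static double Compare(byte[] a, byte[] b)
    {
        if (a == null || b == null)
            throw new ArgumentNullException();
        if (a.Length != b.Length)
            throw new ArgumentException("Buffers must have the same length.");
        if (a.Length == 0)
            return 1.0;

        long totalDifference = 0;
        for (int i = 0; i < a.Length; i++)
            totalDifference += Math.Abs(a[i] - b[i]);

        // Normalize by the largest possible total difference.
        return 1.0 - (double)totalDifference / (255.0 * a.Length);
    }
}
```

A threshold such as the 0.95 used above would then be applied to the returned score; the right value depends on how aggressive the compression is.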
I love electronic music and I am interested in how it all ticks.
I've found lots of helpful questions on Stack Overflow on libraries that can be used to play with audio, filters etc. But what I am really curious about is what is actually happening: how is the data being passed between effects and oscillators? I have done research into the mathematical side of DSP and I've got that end of the problem sussed, but I am unsure what buffering system to use etc. The final goal is to have a simple object hierarchy of effects and oscillators that pass the data between each other (maybe using multithreading if I don't end up pulling out all my hair trying to implement it). It's not going to be the next Propellerhead Reason, but I am interested in how it all works and this is more of an exercise than something that will yield an end product.
At the moment I use .NET and C# and I have recently learnt F# (which may or may not lead to some interesting ways of handling the data), but if these are not suitable for the job I can learn another system if necessary.
The question is: what is the best way to get the large amounts of signal data through the program using buffers? For instance, would I be better off using a Queue, Array, Linked List etc.? Should I make the samples immutable and create a new set of data each time I apply an effect to the system, or just edit the values in the buffer? Should I have a dispatcher/thread pool style object that organises passing data, or should the effect functions pass data directly between each other?
Thanks.
EDIT: another related question is how would I then use the Windows API to play this array? I don't really want to use DirectShow because Microsoft has pretty much left it to die now.
EDIT2: thanks for all the answers. After looking at all the technologies I will either use XNA 4 (I spent a while trawling the internet and found this site which explains how to do it) or NAudio to output the music... not sure which one yet, depends on how advanced the system ends up being. When C# 5.0 comes out I will use its async capabilities to create an effects architecture on top of that. I've pretty much used everybody's answer equally so now I have a conundrum of who to give the bounty to...
Have you looked at VST.NET (http://vstnet.codeplex.com/)? It's a library for writing VST plugins in C#, and it has some examples. You could also consider writing a VST plugin so that your code can be used from any host application (but even if you don't want to, looking at their code can be useful).
Signal data is usually big and requires a lot of processing. Do not use a linked list! Most libraries I know simply use an array to hold all the audio data (after all, that's what the sound card expects).
From a VST.NET sample:
public override void Process(VstAudioBuffer[] inChannels, VstAudioBuffer[] outChannels)
{
    VstAudioBuffer audioChannel = outChannels[0];
    for (int n = 0; n < audioChannel.SampleCount; n++)
    {
        audioChannel[n] = Delay.ProcessSample(inChannels[0][n]);
    }
}
The audioChannel is a wrapper around an unmanaged float* buffer.
You would probably store your samples in an immutable array. Then, when you want to play them, you copy the data into the output buffer (changing the frequency if you want) and perform effects in this buffer. Note that you can use several output buffers (or channels) and sum them at the end.
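The "several buffers, summed at the end" idea can be sketched as follows. This is a minimal illustration in plain C#, not VST.NET code; the Mixer class, the Mix method, and the single shared gain parameter are assumptions made for the example.

```csharp
using System;

public static class Mixer
{
    // Sums several equally long channels into one freshly allocated output
    // buffer, scaling each sample by the same gain. The source arrays are
    // only read, so they can be treated as immutable.
    public static float[] Mix(float[][] channels, float gain)
    {
        if (channels == null || channels.Length == 0)
            throw new ArgumentException("At least one channel is required.");

        int length = channels[0].Length;
        float[] output = new float[length];

        foreach (float[] channel in channels)
        {
            if (channel.Length != length)
                throw new ArgumentException("All channels must have the same length.");
            for (int n = 0; n < length; n++)
                output[n] += channel[n] * gain;
        }
        return output;
    }
}
```

In a real-time path you would reuse one output buffer instead of allocating a new one per call; the allocation here just keeps the sketch short.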
Edit
I know two low-level ways to play your array: DirectSound and WaveOut from the Windows API. C# Example using DirectSound. C# example with WaveOut. However, you might prefer to use an external higher-level library, like NAudio. NAudio is convenient for .NET audio manipulation - see this blog post for sending a sine wave to the audio card. You can see they are also using an array of float, which is what I recommend (if you do your computations using bytes, you'll end up with a lot of quantization noise in the sound).
F# is probably a good choice here, as it's well suited to manipulating functions. Functions are probably good building blocks for signal creation and processing.
F# is also good at manipulating collections in general, and arrays in particular, thanks to the higher-order functions in the Array module.
These qualities make F# popular in the finance sector and are also useful for signal processing, I would guess.
Visual F# 2010 for Technical Computing has a section dedicated to Fourier Transform, which could be relevant to what you want to do. I guess there is plenty of free information about the transform on the net, though.
Finally, to play samples, you can use XNA. I think the latest version of the API (4.0) also allows recording, but I have never used that. There is a famous music editing app for the Xbox called ezmuse+ Hamst3r Edition that uses XNA, so it's definitely possible.
With respect to buffering and asynchrony/threading/synchronization issues, I suggest you take a look at the new TPL Dataflow library. With its block primitives, concurrent data structures, data flow networks, async message processing, and TPL's Task-based abstraction (which can be used with the async/await C# 5 features), it's a very good fit for this type of application.
I don't know if this is really what you're looking for, but this was one of my personal projects while in college. I didn't truly understand how sound and DSP worked until I implemented it myself. I was trying to get as close to the speaker as possible, so I did it using only libsndfile, to handle the file format intricacies for me.
Basically, my first project was to create a large array of doubles, fill it with a sine wave, then use sf_writef_double() to write that array to a file to create something that I could play, and see the result in a waveform editor.
Next, I added another function in between the sine call and the write call to add an effect.
This way you start playing with very low-level oscillators and effects, and you can see the results immediately. Plus, it's very little code to get something like this working.
Personally, I would start with the simplest possible solution you can, then slowly add on. Try just writing out to a file and using your audio player to play it, so you don't have to deal with the audio APIs. Just use a single array to start, and modify it in place. Definitely start off single-threaded. As your project grows, you can start moving to other solutions, like pipes instead of the array, multi-threading it, or working with the audio API.
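Translated to C#, the "fill an array with a sine wave, then run an effect over it" starting point could look roughly like this. It is a sketch only: SineDemo, Sine and HardClip are hypothetical names, the hard clipper is just a stand-in for "an effect", and writing the array to a file (the libsndfile step above) is left out.

```csharp
using System;

public static class SineDemo
{
    // Oscillator: fills a new array with a sine wave in the range -1..1.
    public static double[] Sine(double frequencyHz, int sampleRate, int sampleCount)
    {
        double[] samples = new double[sampleCount];
        for (int n = 0; n < sampleCount; n++)
            samples[n] = Math.Sin(2.0 * Math.PI * frequencyHz * n / sampleRate);
        return samples;
    }

    // Effect: clamps every sample to [-limit, +limit], modifying the array in place.
    public static void HardClip(double[] samples, double limit)
    {
        for (int n = 0; n < samples.Length; n++)
            samples[n] = Math.Max(-limit, Math.Min(limit, samples[n]));
    }
}
```

Usage would be something like generating one second at 44100 Hz with SineDemo.Sine(440.0, 44100, 44100), calling SineDemo.HardClip on the result, and then handing the array to whatever file writer or audio API you choose.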
If you're wanting to create a project you can ship, depending on exactly what it is, you'll probably have to move to more complex libraries, like some real-time audio processing. But the basics you learn by doing the simple way above will definitely help when you get to this point.
Good luck!
I've done quite a bit of real-time DSP, although not with audio. While either of your ideas (immutable buffer) vs (mutable buffer modified in place) could work, what I prefer to do is create a single permanent buffer for each link in the signal path. Most effects don't lend themselves well to modification in place, since each input sample affects multiple output samples. The buffer-for-each-link technique works especially well when you have resampling stages.
Here, when samples arrive, the first buffer is overwritten. Then the first filter reads the new data from its input buffer (the first buffer) and writes to its output (the second buffer). Then it invokes the second stage to read from the second buffer and write into the third.
This pattern completely eliminates dynamic allocation, allows each stage to keep a variable amount of history (since effects need some memory), and is very flexible as far as enabling rearranging the filters in the path.
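A minimal sketch of this buffer-per-link layout, under a few assumptions: all stages use the same block size, and a small Chain driver walks the stages rather than each stage invoking the next. Stage, Gain, TwoPointSum and Chain are hypothetical names for illustration.

```csharp
using System;

// One link in the signal path. Each stage owns a permanent output buffer
// that is allocated once and overwritten on every block.
public abstract class Stage
{
    public float[] Output { get; }
    protected Stage(int blockSize) { Output = new float[blockSize]; }
    public abstract void Process(float[] input);
}

public sealed class Gain : Stage
{
    private readonly float factor;
    public Gain(int blockSize, float factor) : base(blockSize) { this.factor = factor; }

    public override void Process(float[] input)
    {
        for (int n = 0; n < Output.Length; n++)
            Output[n] = input[n] * factor;
    }
}

// y[n] = x[n] + x[n-1]; keeps one sample of history across blocks.
public sealed class TwoPointSum : Stage
{
    private float previous;
    public TwoPointSum(int blockSize) : base(blockSize) { }

    public override void Process(float[] input)
    {
        for (int n = 0; n < Output.Length; n++)
        {
            Output[n] = input[n] + previous;
            previous = input[n];
        }
    }
}

public sealed class Chain
{
    private readonly Stage[] stages;
    public Chain(params Stage[] stages) { this.stages = stages; }

    // Each stage reads the previous stage's buffer and writes its own.
    public float[] Process(float[] firstBuffer)
    {
        float[] current = firstBuffer;
        foreach (Stage stage in stages)
        {
            stage.Process(current);
            current = stage.Output;
        }
        return current;
    }
}
```

Because the buffers live as long as the stages, processing a block allocates nothing, and reordering the effects only means passing the stages to Chain in a different order.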
Alright, I'll have a stab at the bounty as well then :)
I'm actually in a very similar situation. I've been making electronic music for ages, but only over the past couple of years I've started exploring actual audio processing.
You mention that you have researched the maths. I think that's crucial. I'm currently fighting my way through Ken Steiglitz' A Digital Signal Processing Primer - With Applications to Digital Audio and Computer Music. If you don't know your complex numbers and phasors it's going to be very difficult.
I'm a Linux guy so I've started writing LADSPA plugins in C. I think it's good to start at that basic level, to really understand what's going on. If I was on Windows I'd download the VST SDK from Steinberg and write a quick proof of concept plugin that just adds noise or whatever.
Another benefit of choosing a framework like VST or LADSPA is that you can immediately use your plugins in your normal audio suite. The satisfaction of applying your first home-built plugin to an audio track is unbeatable. Plus, you will be able to share your plugins with other musicians.
There are probably ways to do this in C#/F#, but I would recommend C++ if you plan to write VST plugins, just to avoid any unnecessary overhead. That seems to be the industry standard.
In terms of buffering, I've been using circular buffers (a good article here: http://www.dspguide.com/ch28/2.htm). A good exercise is to implement a finite impulse response (FIR) filter (what Steiglitz refers to as a feedforward filter) - these rely on buffering and are quite fun to play around with.
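As a rough idea of what that exercise looks like in C# (the question's language, rather than the C used for LADSPA), here is a small feedforward filter whose input history lives in a circular buffer. FirFilter and ProcessSample are names chosen for this sketch only.

```csharp
using System;

public sealed class FirFilter
{
    private readonly double[] coefficients;
    private readonly double[] history; // circular buffer of the most recent inputs
    private int newest = -1;           // index of the latest sample in history

    public FirFilter(double[] coefficients)
    {
        if (coefficients == null || coefficients.Length == 0)
            throw new ArgumentException("At least one coefficient is required.");
        this.coefficients = (double[])coefficients.Clone();
        history = new double[coefficients.Length];
    }

    // Computes y[n] = sum over k of coefficients[k] * x[n - k].
    public double ProcessSample(double input)
    {
        // Advance the write position, wrapping around, and overwrite the oldest sample.
        newest = (newest + 1) % history.Length;
        history[newest] = input;

        double output = 0.0;
        int index = newest;
        for (int k = 0; k < coefficients.Length; k++)
        {
            output += coefficients[k] * history[index];
            // Step backwards in time, wrapping to the end of the buffer.
            index = (index == 0) ? history.Length - 1 : index - 1;
        }
        return output;
    }
}
```

With coefficients { 0.5, 0.5 } this is a two-point moving average, which acts as a simple low-pass filter; nothing is ever shifted in memory, only the index moves.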
I've got a repo on Github with a few very basic LADSPA plugins. The architectural difference aside, they could potentially be useful for someone writing VST plugins as well. https://github.com/andreasjansson/my_ladspa_plugins
Another good source of example code is the CSound project. There's tonnes of DSP code in there, and the software is aimed primarily at musicians.
Start with reading this and this.
This will give you an idea of WHAT you have to do.
Then, learn the DirectShow architecture - and learn HOW not to do it, but try to create your own simplified version of it.
You could have a look at BYOND. It is an environment for programmatic audio / MIDI instrument and effect creation in C#. It is available as a standalone application and as a VST instrument and effect.
FULL DISCLOSURE I am the developer of BYOND.
What would be the best library choice for finding similar parts in images and similarity matching?
Thank you.
It sounds like the Scale Invariant Feature Transform (SIFT) is probably the algorithm you're really looking for. Offhand, I don't know of any general-purpose image processing library that includes it, but there are definitely standalone implementations to be found (and knowing the name should make Googling for it relatively easy).
ImageJ is the fastest image processing library in Java.
OpenCV is certainly a solid choice as always.
That said, VLFeat is also very good. It includes many popular feature detectors (including SIFT, MSER, Harris, etc.) as well as algorithms such as kd-trees and quickshift. You can piece together something like a bag-of-words classifier using that very quickly.
While SIFT is certainly a solid general purpose solution, it actually is a pipeline composed of a feature detector (which points are interesting in the image), a feature descriptor (for each interesting point in the image, what's a good representation), and a feature matcher (given a descriptor and a database of descriptors, how do I determine what is the best match).
Depending upon your application, you may want to break apart this pipeline and swap in different components. VLFeat's SIFT implementation is very modular and lets you experiment with doing so easily.
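To make the last stage of that pipeline concrete, here is a minimal sketch of a feature matcher: brute-force nearest neighbour over descriptor vectors, rejecting a match when the second-best candidate is almost as close (a ratio test). This is plain C# for illustration, not VLFeat's or OpenCV's API; DescriptorMatcher, Match and maxRatio are hypothetical names, and the descriptors are assumed to be equal-length float arrays produced by whatever detector and descriptor you chose.

```csharp
using System;

public static class DescriptorMatcher
{
    private static double SquaredDistance(float[] a, float[] b)
    {
        double sum = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the index of the closest database descriptor, or -1 when the
    // best match is not clearly better than the second best.
    public static int Match(float[] query, float[][] database, double maxRatio)
    {
        int best = -1;
        double bestDistance = double.MaxValue;
        double secondDistance = double.MaxValue;

        for (int i = 0; i < database.Length; i++)
        {
            double distance = SquaredDistance(query, database[i]);
            if (distance < bestDistance)
            {
                secondDistance = bestDistance;
                bestDistance = distance;
                best = i;
            }
            else if (distance < secondDistance)
            {
                secondDistance = distance;
            }
        }

        // Distances are squared, so the ratio threshold is squared as well.
        if (best >= 0 && bestDistance > maxRatio * maxRatio * secondDistance)
            return -1;
        return best;
    }
}
```

A brute-force scan like this is fine for small databases; for larger ones, this is the component you would swap for a kd-tree or similar index.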
I've never done image processing, but I've heard from friends that OpenCV is quite good; they usually use it from C++.
I'm writing a performance critical class in C# for image manipulation. I'm using LockBits to gain access to the actual data directly and all is working well but I'd like to get more info on the memory signature of certain PixelFormats, most notably Imaging.PixelFormat.Format32bppPArgb.
Anyone know a reliable website somewhere which lists these?
You will find the necessary information about image manipulation using LockBits here:
https://web.archive.org/web/20141229164101/http://bobpowell.net/lockingbits.aspx
I have two bitmaps, produced by different variations of an algorithm. I'd like to create a third bitmap by subtracting one from the other to show the differences.
How can this be done in .NET? I've looked over the Graphics class and all its options, including the ImageAttributes class, and I have a hunch it involves the color matrix or remap tables functionality.
Does anyone have a link to some example code, or can point me in the right direction? A google search doesn't reveal much, unless my google-fu is failing me today.
The real question is, what differences do you want to show? If you just need to operate on RGB color values, the best bet in my opinion is to just scan through both bitmaps and compare the Color values using GetPixel, and use SetPixel to generate your 'difference' bitmap. Perhaps you simply want to subtract the values and use those as the new Color value for the third bitmap. Or perhaps you want to calculate out the luminosity and use that. Even better, if you have three metrics for comparison, assign each one to the R G and B components of the color. I've used this approach for fractal colorization before.
There are other approaches, but with this one you are limited only to your imagination. It may not be the fastest approach, but it does not sound like performance is necessary for this scenario.
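As a sketch of the "subtract the values" variant: the per-pixel arithmetic can be written against packed 0xAARRGGBB integers, which is the layout that Color.ToArgb() returns and Color.FromArgb(int) accepts, so it plugs into a GetPixel/SetPixel loop. The PixelDifference class and its method names are made up for this example, and taking the absolute difference is only one possible definition of "subtract".

```csharp
using System;

public static class PixelDifference
{
    // Absolute per-channel difference of two 0xAARRGGBB colours.
    // The result is fully opaque so the difference stays visible.
    public static int Subtract(int argbA, int argbB)
    {
        int r = Math.Abs(((argbA >> 16) & 0xFF) - ((argbB >> 16) & 0xFF));
        int g = Math.Abs(((argbA >> 8) & 0xFF) - ((argbB >> 8) & 0xFF));
        int b = Math.Abs((argbA & 0xFF) - (argbB & 0xFF));
        return unchecked((int)0xFF000000) | (r << 16) | (g << 8) | b;
    }

    // Applies Subtract to every pixel of two equally sized images.
    public static int[] Difference(int[] imageA, int[] imageB)
    {
        if (imageA.Length != imageB.Length)
            throw new ArgumentException("Images must have the same number of pixels.");

        int[] result = new int[imageA.Length];
        for (int i = 0; i < result.Length; i++)
            result[i] = Subtract(imageA[i], imageB[i]);
        return result;
    }
}
```

Identical pixels come out black, and the brighter a pixel in the result, the larger the difference at that position.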
Check out this project. It is a motion detector made by Andrew Kirillov. He implements a couple of filters to get the differences between two pictures and uses that to calculate movements. It is really nicely done and it's easy to modify and use in your own application.
http://www.codeproject.com/KB/audio-video/Motion_Detection.aspx
This can be done by PInvoking the BitBlt API function. Here is some sample code:
http://www.codeproject.com/KB/GDI-plus/Bitblt_wrapper_class.aspx
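The sample uses the SRCCOPY raster op code; to get the differences between two bitmaps, you'd instead want to use SRCPAINT or something similar (Google should give the list of codes). SRCINVERT, which XORs source and destination, is another candidate, since identical pixels come out black.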
GetPixel and SetPixel (on the Bitmap class) are unbelievably slow. Using LockBits will be much faster, but you'll still have to write your own code.
Update: this is a better link:
http://www.pinvoke.net/default.aspx/gdi32.BitBlt
and includes all the possible ternary raster operations (SRCPAINT or SRCAND are probably what you're looking for).
First, define subtract ;-p What do you want the answer to look like?
The most performant way to do this is probably LockBits - it should be much quicker than lots of GetPixel calls, but you'll need to decode the bytes yourself. Easy if it is just something like 32bpp ARGB, but tricky for some more complex cases.
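For the easy 32bpp ARGB case, "decoding the bytes yourself" mostly means knowing that each pixel is stored as blue, green, red, alpha, and that each row starts at a multiple of the stride, which may be larger than width * 4. The sketch below works on plain byte arrays, assuming you have already copied the pixel data out of BitmapData.Scan0 (for example with Marshal.Copy) and that the stride is positive. Argb32 and its members are hypothetical names for illustration.

```csharp
using System;

public static class Argb32
{
    public const int BytesPerPixel = 4;

    // Byte offset of pixel (x, y) in a buffer whose rows are 'stride' bytes apart.
    public static int Offset(int x, int y, int stride) => y * stride + x * BytesPerPixel;

    // Per-channel absolute difference of two 32bpp ARGB buffers with the same
    // dimensions and stride. Padding bytes at the end of each row are left at 0.
    public static byte[] Difference(byte[] a, byte[] b, int width, int height, int stride)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Buffers must have the same length.");

        byte[] result = new byte[a.Length];
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int i = Offset(x, y, stride);
                result[i] = (byte)Math.Abs(a[i] - b[i]);             // blue
                result[i + 1] = (byte)Math.Abs(a[i + 1] - b[i + 1]); // green
                result[i + 2] = (byte)Math.Abs(a[i + 2] - b[i + 2]); // red
                result[i + 3] = 255;                                 // alpha: opaque
            }
        }
        return result;
    }
}
```

The result can be copied back into a locked bitmap of the same size and format to display the difference image.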
I've read somewhere that the language used in Adobe Pixel Bender is inspired by something that Microsoft once did. Don't remember where I read it. My thinking is that maybe that Microsoft "something" is wrapped into something that a .Net project can use. Overkill for just subtracting two images, but anyway.