I'm writing a little app that's pretty much a sequencer (8-bit synths). I have a formula that converts a note to its corresponding frequency:
private float returnFrequency(Note note)
{
    return (float)(440 * Math.Pow(TwoToTheTwelfthRoot, (note.SemitonesFromC0 - 57)));
}
Basically, what I'm trying to do is play a generated tone (sine, square, saw, etc) with this frequency, so it's audible through the speakers. Does XNA have any support for this? Or would I have to use an additional library?
I do not want to import 80+ samples of a sine wave at different frequencies through the Content Pipeline just so I could play tones with different frequencies.
For those of you who asked for the link, and for anyone in the future who might need it:
http://www.david-gouveia.com/creating-a-basic-synth-in-xna-part-i/
He first goes through the dynamic sound instance, then goes to another level by showing you how to create voices (allowing a sort of 'play piano with your keyboard' type thing).
Funny thing is, David Gouveia has a StackExchange account, so I wouldn't be surprised if I got a notification from him, or if some people recognized him.
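As a rough illustration of the dynamic-sound approach the linked article describes, here is a minimal sketch that fills a 16-bit mono PCM buffer with a sine wave at a given frequency. The `ToneGenerator` class and its parameters are hypothetical names made up for this example; the XNA calls in the trailing comment (`DynamicSoundEffectInstance`, `SubmitBuffer`, `Play`) are the ones the article builds on, but the exact wiring is an assumption and is not taken from the article.

```csharp
using System;

// Hypothetical helper, not part of XNA: builds raw PCM data for a sine tone.
public static class ToneGenerator
{
    // Returns 16-bit little-endian mono PCM for a sine wave.
    public static byte[] SineWave(float frequency, int sampleRate, double seconds, double amplitude = 0.5)
    {
        int sampleCount = (int)(sampleRate * seconds);
        byte[] buffer = new byte[sampleCount * 2]; // 2 bytes per 16-bit sample

        for (int i = 0; i < sampleCount; i++)
        {
            double t = (double)i / sampleRate;
            double value = amplitude * Math.Sin(2.0 * Math.PI * frequency * t);
            short sample = (short)(value * short.MaxValue);

            // Store the sample low byte first (little-endian).
            buffer[2 * i] = (byte)(sample & 0xFF);
            buffer[2 * i + 1] = (byte)((sample >> 8) & 0xFF);
        }
        return buffer;
    }
}

// Assumed usage inside an XNA project (not compiled here):
//   var instance = new DynamicSoundEffectInstance(44100, AudioChannels.Mono);
//   instance.SubmitBuffer(ToneGenerator.SineWave(440f, 44100, 0.5));
//   instance.Play();
```

A square or saw wave only needs a different expression for `value`; the buffer layout stays the same.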
Related
I have to use a pure C# solution for resampling audio, one that produces exactly the same results as FFmpeg's audio resampling.
FFmpeg first builds some kind of polyphase filter bank, and then uses that for the sampling process (sorry for the vague phrasing, but I'm not too familiar with this topic). According to this brief documentation, the initialization can be customized this way:
AVResampleContext* av_resample_init(
    int out_rate,
    int in_rate,
    int filter_length,
    int log2_phase_count,
    int linear,
    double cutoff
)
The parameters are:
out_rate: output sample rate
in_rate: input sample rate
filter_length: length of each FIR filter in the filterbank relative to the cutoff freq
log2_phase_count: log2 of the number of entries in the polyphase filterbank
linear: if 1 then the used FIR filter will be linearly interpolated between the 2 closest, if 0 the closest will be used
cutoff: cutoff frequency, 1.0 corresponds to half the output sampling rate
I'd need to use a C# library that is configurable in the same depth. I've been trying to use NAudio (more specifically, its WaveFormatConversionStream class), but there, I could only set the input and output sample rates, so I didn't get the expected results.
So, is there a C# lib that could resample with the same settings as FFmpeg can? Or one that has almost all of these settings or similar ones? Note: I need a C# solution, not a wrapper!
In addition to WaveFormatConversionStream (which uses ACM codecs), NAudio includes another resampler that can be accessed as a DirectX Media Object (DMO), or (in the latest prerelease of NAudio 1.7) as a Media Foundation Transform. These can be used in Windows Vista and above. Sadly, I think they are not available in XP (but it has been a while since I tried).
The DMO version is found in the Resampler class (there is also a ResamplerDmoStream), and the Media Foundation version is in MediaFoundationResampler. They actually both create the same underlying object, but on the MFT version I have added a property called ResamplerQuality which allows you to choose anywhere between 1 (linear interpolation) and 60 (max quality). In this article I include a spectrogram of a resampled sine wave sweep and you will see that the quality is very good.
You could easily make the same change to the Resampler class if you want to go the DMO route, since it has access to IWMResamplerProps, which allows you to set the half filter length (which is the same value between 1 and 60).
I'm trying to get input from a plugged-in guitar, get the frequency from it, and check whether the user is playing the right note or not. Something like a guitar tuner (I'll need to do a guitar tuner as well).
My first question is: how can I get the frequency of the guitar input in real time?
And is it possible to do something like:
if (frequency == noteCFrequency)
{
    //print This is a C note!!
}
I'm now able to get input from the soundcard, record and playback the input sound already.
For an implementation of FFT in C# you can have a look at this.
While I think that you do not need to fully understand the FFT to use it, you should know about some basic limitations:
You always need a sample window. You may have a sliding window, but the essence of being fast here is to take a chunk of signal and accept some error.
You have "buckets" of frequencies, not exact ones. The result is something like "In the range 420Hz - 440Hz you have 30% of the signal". (The "width" of the buckets should be adjustable.)
The window must contain a number of samples that is a power of 2 (at least for the common radix-2 implementations).
The window must span at least two periods of the lowest frequency (longest wavelength) you want to detect.
The highest detectable frequency is half the sampling rate (the Nyquist limit). (You don't need to worry about this so much.)
The more finely you want to separate frequencies, the longer your window must be.
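The limitations in the list above can be put into numbers. The following is a worked example under assumed values (a 44100 Hz sampling rate and a 4096-sample window, neither of which comes from the question); here f_s is the sampling rate, N the window size in samples, and Δf the width of one frequency "bucket".

```latex
\begin{align*}
\Delta f &= \frac{f_s}{N} = \frac{44100}{4096} \approx 10.8\ \text{Hz}
  && \text{(width of one bucket)} \\
f_{\max} &= \frac{f_s}{2} = 22050\ \text{Hz}
  && \text{(highest detectable frequency)} \\
T_{\text{window}} &= \frac{N}{f_s} = \frac{4096}{44100} \approx 92.9\ \text{ms}
  && \text{(duration of one window)}
\end{align*}
```

For comparison, the low E string of a guitar (E2, about 82.4 Hz) and the F one semitone above it (about 87.3 Hz) are less than 5 Hz apart, so buckets of this width cannot separate them without a longer window or some form of interpolation between buckets.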
The other answers don't really explain how to do this, they just kinda wave their arms. For example, you would have no idea from reading those answers that the output of an FFT is a set of complex numbers, and you wouldn't have any clue how to interpret them.
Moreover FFT is not even the best available method, although for your purposes it works fine and most people find it to be the most intuitive. Anyway, this question has been asked to death, so I'll just refer you to other questions on SO. Not all apply to C#, but you are going to need to understand the (non-trivial) concepts first. You can find an answer to your question by reading answers to these questions and following the links.
frequency / pitch detection for dummies
Get the frequency of an audio file in every 1/4 seconds in android
How to detect sound frequency / pitch on an iPhone?
How to calculate sound frequency in android?
how to retrieve the different original frequency in each FFT caculating and without any frequency leakage in java
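Once a frequency estimate is available, an exact comparison such as `frequency == noteCFrequency` will almost never be true, because the estimate is a floating-point value with some error. One possible approach, sketched here with hypothetical names (`NoteMatcher`, `toleranceCents`) and assuming equal temperament with A4 = 440 Hz, is to find the nearest note and accept it within a tolerance measured in cents.

```csharp
using System;

// Hypothetical helper for comparing a measured frequency against note pitches.
// Assumes equal temperament, A4 = 440 Hz = MIDI note 69, and frequency > 0.
public static class NoteMatcher
{
    static readonly string[] Names =
        { "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B" };

    // MIDI note number closest to the given frequency.
    public static int NearestMidiNote(double frequency)
    {
        return (int)Math.Round(69 + 12 * Math.Log(frequency / 440.0, 2));
    }

    // Signed distance from the note's ideal frequency, in cents (100 cents = 1 semitone).
    public static double CentsOff(double frequency, int midiNote)
    {
        double target = 440.0 * Math.Pow(2, (midiNote - 69) / 12.0);
        return 1200 * Math.Log(frequency / target, 2);
    }

    // Name plus octave, e.g. MIDI 60 -> "C4". Valid for non-negative note numbers.
    public static string NoteName(int midiNote)
    {
        return Names[midiNote % 12] + (midiNote / 12 - 1);
    }

    // True if the frequency is within the tolerance of the given note.
    public static bool IsNote(double frequency, int midiNote, double toleranceCents = 20)
    {
        return Math.Abs(CentsOff(frequency, midiNote)) <= toleranceCents;
    }
}
```

For a tuner, the sign of `CentsOff` tells the player whether the string is flat or sharp; the 20-cent default is an arbitrary choice for illustration.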
You must compute the FFT (Fast Fourier Transform) of a piece of the signal and look for a peak. For the kinds of FFT, window type, window size and so on, you should read some documentation on signal processing. Anyway, a 25 ms window is OK, and you can use a Hamming window, for example. There is a lot of code on the net for computing the FFT. Good luck!
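The "compute the FFT and look for a peak" recipe can be sketched as follows. This is a minimal illustration, not a tuned implementation: `PeakDetector` and its methods are hypothetical names, the FFT is a plain radix-2 version that requires a power-of-two window, and the result is only accurate to within one frequency bucket (sampleRate / window size).

```csharp
using System;
using System.Numerics;

// Hypothetical helper: Hamming window + radix-2 FFT + highest-magnitude bin.
public static class PeakDetector
{
    // In-place iterative radix-2 FFT; data.Length must be a power of two.
    public static void Fft(Complex[] data)
    {
        int n = data.Length;

        // Reorder the input into bit-reversed index order.
        for (int i = 1, j = 0; i < n; i++)
        {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) { Complex tmp = data[i]; data[i] = data[j]; data[j] = tmp; }
        }

        // Combine sub-transforms of length 2, 4, 8, ... (butterflies).
        for (int len = 2; len <= n; len <<= 1)
        {
            double angle = -2.0 * Math.PI / len;
            Complex wLen = new Complex(Math.Cos(angle), Math.Sin(angle));
            for (int start = 0; start < n; start += len)
            {
                Complex w = Complex.One;
                for (int k = 0; k < len / 2; k++)
                {
                    Complex even = data[start + k];
                    Complex odd = data[start + k + len / 2] * w;
                    data[start + k] = even + odd;
                    data[start + k + len / 2] = even - odd;
                    w *= wLen;
                }
            }
        }
    }

    // Returns the centre frequency (Hz) of the strongest bin below the Nyquist limit.
    public static double PeakFrequency(float[] samples, int sampleRate)
    {
        int n = samples.Length;
        var spectrum = new Complex[n];
        for (int i = 0; i < n; i++)
        {
            // The Hamming window reduces leakage from the abrupt window edges.
            double hamming = 0.54 - 0.46 * Math.Cos(2.0 * Math.PI * i / (n - 1));
            spectrum[i] = new Complex(samples[i] * hamming, 0.0);
        }

        Fft(spectrum);

        // Skip bin 0 (DC) and compare magnitudes of the complex outputs.
        int peakBin = 1;
        for (int bin = 2; bin < n / 2; bin++)
            if (spectrum[bin].Magnitude > spectrum[peakBin].Magnitude) peakBin = bin;

        return (double)peakBin * sampleRate / n;
    }
}
```

Note that the strongest bin is not always the perceived pitch of an instrument, since a harmonic can be louder than the fundamental.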
I'm trying to play back a sound at a given pitch using (C#) XNA audio framework's SoundEffect class. So far, I have this (very basic)
public void playSynth(SoundEffect se, int midiNote) {
    float pitch = (float)functions.getMIDIFreq(midiNote) / ((float)functions.getMIDIFreq(127)/2);
    pitch-=1F;
    Debug.WriteLine("Note: {0}. Float: {1}",midiNote,pitch);
    synth = se.CreateInstance();
    synth.IsLooped = false;
    synth.Pitch = pitch;
    synth.Play();
}
Currently, the pitch played back is very off-key, because the math is wrong. The way this function works is I'm sending a MIDI note (0 through 127) to the function, using a function I made called getMIDIFreq to convert that note to a frequency - which works fine.
To call this function, I'm using this:
SoundEffect sound = SoundEffect.FromStream(TitleContainer.OpenStream(@"synth.wav"));
playSynth(sound,(int)midiNote); //where midiNote is 0...127 number
where synth.wav is a simple C note I created in a DAW and exported. The whole point of this program is to play back the MIDI note given in that synth sound, but I'd gladly settle for a sine wave, or anything really. I can't use Console.Beep because it's extremely slow and not for playing entire songs with notes in rapid succession.
So my question is, how could I fix this code so it plays the sample at the right pitch? I realize I only have 2 octaves to work with here, so if there's a solution that involves generating a tone at a given frequency and is very fast, that would be even better.
Thanks!
EDIT: I'm making this a WinForms application, not an XNA game, but I have the framework downloaded and referenced in my project.
You can't apply an arbitrary frequency. You can only lower the pitch by up to an octave (half the frequency) or raise it by up to an octave (double the frequency). So, to calculate the pitch bend value, you first need to know the initial pitch of the sample.
Suppose your sample is a 440 Hz A, and you want the A an octave down (220 Hz). The value you need is -1. The ratio yourPitch / initialPitch can range from 0.5 to 2.0, and you need to map it onto the scale of -1 to +1. I can't tell you exactly how, because the documentation isn't clear on whether the scale is logarithmic or not. You will have to experiment, but this should get you started.
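One mapping worth trying first, under the assumption that the -1 to +1 scale is measured in octaves (which fits the two endpoints but, as noted above, is not confirmed by the documentation), is the base-2 logarithm of the frequency ratio. The names `PitchMath`, `MidiToFrequency` and `PitchFor` are hypothetical, and the MIDI formula assumes A4 = 440 Hz = note 69.

```csharp
using System;

// Hypothetical helper: converts a target frequency into a pitch value,
// ASSUMING the pitch scale is logarithmic (1.0 = one octave up).
public static class PitchMath
{
    // Equal-temperament frequency of a MIDI note (A4 = 440 Hz = note 69).
    public static double MidiToFrequency(int midiNote)
    {
        return 440.0 * Math.Pow(2.0, (midiNote - 69) / 12.0);
    }

    // Pitch value for playing a sample recorded at sampleHz so it sounds at targetHz.
    // Values outside one octave either way are clamped to the allowed range.
    public static float PitchFor(double targetHz, double sampleHz)
    {
        double octaves = Math.Log(targetHz / sampleHz, 2.0);
        return (float)Math.Max(-1.0, Math.Min(1.0, octaves));
    }
}
```

With a sample recorded at middle C (MIDI note 60), the value for a given note would be `PitchMath.PitchFor(PitchMath.MidiToFrequency(midiNote), PitchMath.MidiToFrequency(60))`, which under this assumption works out to (midiNote - 60) / 12 inside the two-octave range. If the result sounds off-key, the scale is probably not logarithmic and a linear mapping is the next thing to test.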
Probably not a very descriptive title, but I'm doing my best. It's my first time posting on StackOverflow, and I'm relatively new to programming in C# (first started around a year ago using Unity, and decided a few days ago to upgrade to XNA). That being said, please be kind to me.
I'm planning out the mechanics of a 2D game that I'm designing, and while most of it seems straightforward after playing around in XNA, there's one issue that I keep coming back to that I have yet to come up with a satisfactory answer for. The issue involves the layering of sprites into composite / complex sprites. For example, a character in the game might be wielding one or two of any number of weapons. I did do a bit of research on the topic, and found some people recommending to use the RenderTarget class to draw a series of sprites as one, and some recommending simply drawing the sprites on top of one another during Draw(). These topics, however, were mostly focused on the relatively simple case of having a single character in the game.
In my case, the game will have a number of sprite-based characters who have totally different postures / animations. There are around 10 right now, and there will probably be more added later in development. There will likewise be a largish number of weapons (probably around 20 to start) that will be composited onto the characters. That much I'm comfortable with. However, the problem is that each of the characters would require the weapon sprites to be drawn in different locations and with different rotations during each frame of a character's animation.
I've considered a couple approaches to how to pull this off, but they all have pretty massive drawbacks.
The first was simply drawing a spritesheet of each weapon for each character, that would be the same size as the appropriate character. The benefit to this approach would be the ease of just adding the call to draw the additional sprite on top of the base character without having to do any calculations. The downside would be that that creates an inordinate amount of extra sprite sheets (200 extra sheets for 10 characters x 20 weapons).
The second was creating a class to handle the weapon sprite. The WeaponSprite class would be attached to a single texture for each weapon, and would then store information about the offset / rotation to use when drawn, based on the character that it is attached to. The problem with this is that organizing the offsets / rotations on a per-frame basis would be incredibly tedious, and I can't think of any easy way to pull the information based on the frame required. (I had the idea of making an AnimationFrame class to keep track of the animation name, directional facing and frame number of each character, and then using a dictionary in the weapon class to load the proper data based on the name of the current frame, but something about the idea seemed really ill-conceived). This method also has the drawback of requiring a relatively large amount of memory to pull off (assuming a Vector2 is 8 bytes and a float is 4, having 10 characters and 20 weapons would require 192KB of memory given the current number of frames being used, which would only get larger as more weapons were added). I had an offshoot of that idea (which I sort of stole from another post on here about the same topic) of using a reserved alpha value pixel to link the offset and the 'origin' of each weapon, calculating the position at runtime and then only having to store the rotational float in the aforementioned dictionary.
Since I'm new to XNA (and still pretty green on C#), I figured I'd post and let the experts chime in. Am I on the right track with my methods, or am I missing something really simple? Thank you very much in advance for your help, and please let me know if you need any additional information.
Wow, big question. I can't really tell you exactly how to implement this. But I can give you some helpful nuggets of advice:
Advice #0: Whenever any kind of compositing problem comes up, people come out of the woodwork recommending "render targets" as some kind of compositing panacea. They are usually wrong. Avoid using render targets if you can. You only need them if you are doing effects on the final, composite image (blends, blurs, etc). Otherwise just draw your sprites over the top of each other directly to the backbuffer.
Advice #1: You want to pack all your sprites onto a single sprite sheet, if possible. If you exceed the texture size limit, you'll have to be clever about how you partition your sprites across sheets. The reason is performance - you want to limit the number of texture swaps - see this answer for details.
You may be able to use an existing sprite-packer for XNA. If you can find a suitable one, I recommend you use it. A good one will allow you to treat a packed sprite just as you would treat a texture when calling SpriteBatch.Draw.
Advice #2: Do not worry about how much space your positioning data takes at runtime. 192 KB is almost nothing - the size of a small texture.
The upshot of this, and #1, is to store as much as possible in your positioning meta-data, and avoid duplicate textures.
How you store your meta-data almost doesn't matter.
Advice #3: You can change both your storage requirements and content creation story from an n × m problem to an n + m problem (n characters and m weapons). Simply store weapons with only an "origin", and store characters with an "origin" and a "hand position & rotation". Simply render such that the origin of the weapon lines up with the hand of the character (the maths is very simple).
Then you can add characters without worrying about what weapons exist, and add weapons without worrying about what characters exist.
Here's an example of how much space this needs: 10 characters × 20 bytes + 20 weapons × 8 bytes = 360 bytes. Nice and small! (Although you'll probably want many more attachment points - different kinds for different weapons, hats, whatever. Edit: oops I didn't include animation frames - but it's still a relatively small amount of data.)
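The "very simple maths" from Advice #3 might look roughly like the sketch below. It uses `System.Numerics.Vector2` so that it compiles outside XNA; in an XNA project the equivalent would be XNA's own `Vector2` and a rotation matrix. `AttachmentPoint` and `WeaponPlacement` are hypothetical names, and storing one attachment point per animation frame is an assumption about how the data would be organised.

```csharp
using System;
using System.Numerics;

// Hypothetical per-frame data stored with a character: where the hand is.
public struct AttachmentPoint
{
    public Vector2 Offset;   // hand position relative to the character's origin, in pixels
    public float Rotation;   // hand rotation relative to the character, in radians
}

public static class WeaponPlacement
{
    // Computes where to draw a weapon so that its origin sits on the character's hand.
    public static void Place(
        Vector2 characterPosition, float characterRotation, AttachmentPoint hand,
        out Vector2 weaponPosition, out float weaponRotation)
    {
        // Rotate the hand offset along with the character, then move it into world space.
        Vector2 rotatedOffset = Vector2.Transform(
            hand.Offset, Matrix3x2.CreateRotation(characterRotation));

        weaponPosition = characterPosition + rotatedOffset;
        weaponRotation = characterRotation + hand.Rotation;
    }
}
```

The resulting position and rotation would then be passed to the `SpriteBatch.Draw` overload that takes a rotation and an origin, with the weapon's stored "origin" as the origin argument. Each character only needs an array of `AttachmentPoint` values indexed by animation frame, and each weapon only needs its origin.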
Advice #4: The trickiest part, as you seem to be hinting at in your post, is content creation.
As you hint at, ideally you would want to be able to edit the attachment points directly in your image editor. This is a compelling idea. Special alpha values are only appropriate if your sprites have no anti-aliasing. You could theoretically do something with layers and different colours. The hardest part is figuring out how to encode rotation.
You could use an XNA content pipeline processor to extract data from the image at build-time. However this gets very expensive to implement (especially if you've not done it before - the content pipeline is badly under-documented). Unless your art requirements are truly enormous, it is almost certainly not worth the extra development time required to make the content pipeline extension. By the time you're done, you could have hand-coded the positioning data several times over.
My recommendation, then, is to store the extra data in an easy-to-edit XML file. I recommend using XNA's XML Content Importer. It can be tricky to get the hang of the formatting at first, and you have to remember to include the appropriate assembly referencing. But once you know how to use it, it's the easiest way to get structured data into XNA quickly.
I want to make a program that detects the note that is being played in front of the microphone. I am testing the FFT function of NAudio, but from the tests that I did in Audacity it seems that the FFT does not detect the pitch correctly. I played a C5, but the highest peak was at E7.
I changed the first dropdown box in the frequency analysis window to "Enhanced Autocorrelation", and after that the highest peak was at C5.
I googled "enhanced autocorrelation" and had no luck.
You are likely getting thrown off by harmonics. Have you tried testing with a sine wave to see if NAudio's FFT is in the ballpark?
See these references:
http://cnx.org/content/m11714/latest/
http://www.gamedev.net/community/forums/topic.asp?topic_id=506592&whichpage=1
Line 48 in Spectrum.cpp in the Audacity source code seems to be close to what you want. They also reference an IEEE paper by Tolonen and Karjalainen.
The highest peak in an audio spectrum is not necessarily the musical pitch as a human would perceive it, especially in a sound with strong overtones. That's because pitch is a human psycho-perceptual phenomenon: the brain will often deduce frequencies that aren't even present in a waveform.
Auto-correlation methods of frequency or pitch estimation (roughly, finding how far apart even a funny-looking and/or non-sinusoidal waveform repeats in time) are usually a better match for what a human would call pitch. The reason for various enhancements to the autocorrelation algorithm is that simple autocorrelation will find a near-infinite number of repeating wavelengths (e.g. if it repeats every 1 second it also repeats every 2 seconds, every 3 seconds, etc.). So the trick is to weight the correlation to somehow statistically better match what a human would guess about the same waveform.
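To make the idea concrete, here is a minimal sketch of plain (not "enhanced") autocorrelation pitch estimation. It is not Audacity's algorithm; `AutocorrelationPitch` and its parameters are hypothetical names. The sketch sidesteps the repeated-wavelength problem described above by only searching lags inside a caller-supplied frequency range, which is a simplification rather than a proper weighting scheme.

```csharp
using System;

// Hypothetical helper: estimates pitch by finding the lag at which the
// signal best matches a shifted copy of itself.
public static class AutocorrelationPitch
{
    // Returns the estimated frequency in Hz, or 0 if no positive correlation is found.
    // minHz/maxHz bound the search; keep the range under one octave-and-a-bit wide
    // to avoid locking onto a multiple of the true period.
    public static double Estimate(float[] samples, int sampleRate, double minHz, double maxHz)
    {
        int minLag = (int)(sampleRate / maxHz);
        int maxLag = Math.Min((int)(sampleRate / minHz), samples.Length - 1);

        int bestLag = -1;
        double bestScore = 0.0;

        for (int lag = minLag; lag <= maxLag; lag++)
        {
            // Correlate the signal with itself shifted by 'lag' samples.
            double sum = 0.0;
            for (int i = 0; i + lag < samples.Length; i++)
                sum += samples[i] * samples[i + lag];

            // Normalise by the overlap length so long lags are not penalised.
            double score = sum / (samples.Length - lag);
            if (score > bestScore) { bestScore = score; bestLag = lag; }
        }

        return bestLag > 0 ? (double)sampleRate / bestLag : 0.0;
    }
}
```

Because the lag is a whole number of samples, the estimate is quantised (at 44100 Hz, neighbouring lags near C5 are several Hz apart); interpolating around the best lag is a common refinement.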
Well, if you can live with GPLv2, why not take a peek at the Audacity source code?
http://audacity.sourceforge.net/download/beta_source