Combine existing audio files and generate new audio file from them - c#

I have 2 sample WAV files.
I would like to combine them into one output WAV like this:
Play the first WAV, wait x seconds, play the 2nd WAV, and save the result as a new WAV file.
I'm not particularly attached to the WAV format, so I'm happy to use another if necessary.
From my research it looks like I would need to convert the WAVs to PCM, create a new output buffer, and write the first file to the output buffer. Then somehow create a space for the x seconds, and then write the second PCM to the buffer after it.
How would I go about doing this?

First of all you'll need to understand what you are talking about:
WAV is a RIFF-based container format that usually stores the sound wave as PCM.
Essentially PCM means that discrete values of the wave are stored at a certain sample rate (typically 44.1 kHz).
Each sample may contain information about one or more channels (typically 2).
The values of each sample are stored as a fixed-size integer or float (typically a 16-bit integer).
These attributes are stored in the WAV header.
To combine two separate WAV files you need to read the header of both files. If you are lucky they will have the same format: the same sample rate, channel count and bits per sample, and therefore the same ByteRate ( == sample rate * channel count * bits per sample / 8). Then you simply need to append the second file, minus its header, to the end of the first, and add the length of the second file's data to the size fields of the first (the RIFF chunk size and the data chunk size).
In any other case I advise you to use a library that does re-encoding of some sort.
If you have the time and inclination, you could do the re-encoding yourself.
If you don't want to bother with this stuff at all, try using a complete program (e.g. SoX) that does what you need.
Btw.: Silence is 0 if the samples are signed, and half of the maximum value if they are unsigned (typically only found in 8-bit audio).
So to get 4 seconds of silence you need n = 4 * sample rate * channel count * bits per sample / 8 bytes, each holding the silence value.
Trivia: You could use any constant value instead of 0 for silence.
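A minimal sketch of the steps above in C#, assuming both inputs have already been reduced to raw PCM bytes in the same format and that a canonical 44-byte PCM header is acceptable for the output. The class and method names (WavJoiner, SilenceByteCount, JoinWithSilence, ToWav) are made up for illustration and are not part of any library.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavJoiner
{
    // Bytes needed for `seconds` of audio: seconds * sample rate * channels * bits / 8.
    public static int SilenceByteCount(double seconds, int sampleRate, int channels, int bitsPerSample)
    {
        int blockAlign = channels * bitsPerSample / 8;       // bytes per sample frame
        int frames = (int)Math.Round(seconds * sampleRate);  // whole frames only
        return frames * blockAlign;
    }

    // Joins two raw PCM buffers (headers already stripped) with a gap of silence between them.
    public static byte[] JoinWithSilence(byte[] first, byte[] second, double gapSeconds,
                                         int sampleRate, int channels, int bitsPerSample)
    {
        int gap = SilenceByteCount(gapSeconds, sampleRate, channels, bitsPerSample);
        byte[] result = new byte[first.Length + gap + second.Length];
        Buffer.BlockCopy(first, 0, result, 0, first.Length);
        // 8-bit WAV data is unsigned, so its silence is 128. Signed formats use 0,
        // which a freshly allocated array already contains.
        if (bitsPerSample == 8)
        {
            for (int i = first.Length; i < first.Length + gap; i++) result[i] = 128;
        }
        Buffer.BlockCopy(second, 0, result, first.Length + gap, second.Length);
        return result;
    }

    // Wraps PCM data in a canonical 44-byte WAV header (assumption: plain PCM, no extra chunks).
    public static byte[] ToWav(byte[] pcm, int sampleRate, int channels, int bitsPerSample)
    {
        int blockAlign = channels * bitsPerSample / 8;
        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            w.Write(Encoding.ASCII.GetBytes("RIFF"));
            w.Write(36 + pcm.Length);            // RIFF chunk size
            w.Write(Encoding.ASCII.GetBytes("WAVE"));
            w.Write(Encoding.ASCII.GetBytes("fmt "));
            w.Write(16);                         // fmt chunk size for PCM
            w.Write((short)1);                   // audio format 1 = PCM
            w.Write((short)channels);
            w.Write(sampleRate);
            w.Write(sampleRate * blockAlign);    // ByteRate
            w.Write((short)blockAlign);
            w.Write((short)bitsPerSample);
            w.Write(Encoding.ASCII.GetBytes("data"));
            w.Write(pcm.Length);                 // data chunk size
            w.Write(pcm);
            w.Flush();
            return ms.ToArray();
        }
    }
}
```

Real files may carry extra chunks before the data chunk, so stripping exactly 44 bytes from an input is not always safe; a library that parses the header for you avoids that guesswork.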

Related

NAudio Normalize Audio

I am trying to normalize MP3 files with NAudio but I don't know how to do so.
The first thing I did was convert the MP3 file to PCM:
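// Requires: using NAudio.Wave;
// Decode the MP3 and write the resulting PCM stream to a WAV file.
using (Mp3FileReader fr = new Mp3FileReader(mp3.getPathWithFilename())) {
    using (WaveStream pcm = WaveFormatConversionStream.CreatePcmStream(fr)) {
        WaveFileWriter.CreateWaveFile("test.wav", pcm);
    }
}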
But what is the next step? Unfortunately I didn't find anything on the net.
Thanks for your help
I'm new to NAudio, so I don't exactly know how to code this, but I do know that normalization of an audio file requires two passes through the data. The first pass determines the peak: scan every sample (both channels if the file is stereo) and record the maximum and minimum values. Take whichever of the two has the larger absolute value and compare it with full scale, the largest value the sample format can hold (for 16-bit audio, 32767 or -32768). The second pass multiplies every sample by the ratio of full scale to that peak.
So for example, if the scanning pass found that the highest value in a 16-bit mono file was 29000, you would scale every sample by 32767 / 29000 ≈ 1.12990, that is, to about 112.99 percent of its original value. The maximum sample rises from 29000 to 32767 and all other samples are increased in proportion.
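The two passes could be sketched like this in plain C#, assuming the audio has already been decoded into an array of 16-bit samples (interleaved if stereo). PeakNormalizer, FindGain and Normalize are hypothetical names used only for this example; they are not NAudio APIs.

```csharp
using System;

public static class PeakNormalizer
{
    // Pass 1: find the largest absolute sample and derive the gain that maps it to full scale.
    public static double FindGain(short[] samples)
    {
        int peak = 0;
        foreach (short s in samples)
        {
            int abs = Math.Abs((int)s); // widen first: Math.Abs on short.MinValue would overflow
            if (abs > peak) peak = abs;
        }
        return peak == 0 ? 1.0 : 32767.0 / peak; // leave pure silence untouched
    }

    // Pass 2: scale every sample by that gain, clamping to the 16-bit range.
    public static void Normalize(short[] samples)
    {
        double gain = FindGain(samples);
        for (int i = 0; i < samples.Length; i++)
        {
            double scaled = Math.Round(samples[i] * gain);
            if (scaled > short.MaxValue) scaled = short.MaxValue;
            if (scaled < short.MinValue) scaled = short.MinValue;
            samples[i] = (short)scaled;
        }
    }
}
```

Because one gain is applied to the whole interleaved array, the balance between the left and right channels is preserved.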

Cutting random bytes off of file byte array in C#

So I've been working on this project for a while now, involving LSB steganography. Really fun stuff. Anyways, I just finished writing the code for embedding and extracting files from an image (instead of just plaintext), and I'm running into this problem. I can recognize the MIME type and extension of the bytes, but because the embedded file doesn't usually take up all of the LSBs of the image, there's a lot of garbage data. So I have the extracted file plus some garbage in the byte array right after it. I need to figure out how to cut the garbage off, so that the file being exported has the correct, smaller size.
TLDR: I have a byte array with a recognized file in it, with some additional random bytes. How do I find out where the file ends and the random bytes begin?
Remember this is all in C#.
Any advice is appreciated.
Link to my project for reference: https://github.com/nicosogangstar/Steg
Generally you have two options.
End of stream marker
This is the more direct approach of the two, but it may lack some versatility depending on what data you want to hide. After you embed your data, continue by embedding a unique sequence of bits/bytes that you know cannot occur earlier in the data. As you extract the bits, you can stop reading once you encounter this sequence. If you expect to hide only readable text, i.e. bytes with ASCII codes between 32 and 127, your marker can be as short as eight 0s or eight 1s. However, if you intend to hide any sort of binary data, where every byte value has a chance of appearing, you may accidentally encounter the marker while extracting legitimate data and thus halt the process prematurely.
Header information
You can add a header preceding the data, e.g. another 16-24 bits (or any other amount), which can be translated to a number that tells you how many bits/bytes/pixels to read before stopping. For example, if you want to hide a byte array of size 1000, first embed 2 bytes holding the length of the secret and then follow them with the actual data. More specifically, split the length into 2 bytes, where the first byte has the 8th to 15th bits and the second byte has the 0th to 7th bits of the number 1000 in binary.
00000011  11101000    1000 in binary
       3       -24    byte values (signed; 232 as an unsigned C# byte)
You can embed all sorts of information in a header, such as whether the data is encrypted or compressed with some algorithm, the original filename of the data, how many LSBs to read for extracting the information, etc.
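The header approach might look roughly like this in C#, following the 2-byte length example above (so the payload is limited to 65535 bytes in this sketch). LengthHeader, AddHeader and ExtractPayload are illustrative names, not taken from the linked project.

```csharp
using System;

public static class LengthHeader
{
    // Prepends a 2-byte big-endian length so the extractor knows where the payload ends.
    public static byte[] AddHeader(byte[] payload)
    {
        if (payload.Length > ushort.MaxValue)
            throw new ArgumentException("Payload too large for a 2-byte length header.");
        byte[] framed = new byte[payload.Length + 2];
        framed[0] = (byte)(payload.Length >> 8);    // bits 8-15 of the length
        framed[1] = (byte)(payload.Length & 0xFF);  // bits 0-7 of the length
        Buffer.BlockCopy(payload, 0, framed, 2, payload.Length);
        return framed;
    }

    // Reads the length back and returns only the payload, dropping any trailing garbage.
    public static byte[] ExtractPayload(byte[] extracted)
    {
        if (extracted.Length < 2)
            throw new ArgumentException("Buffer is too short to contain a header.");
        int length = (extracted[0] << 8) | extracted[1];
        if (length > extracted.Length - 2)
            throw new ArgumentException("Header length exceeds the available data.");
        byte[] payload = new byte[length];
        Buffer.BlockCopy(extracted, 2, payload, 0, length);
        return payload;
    }
}
```

A wider header (for example 4 bytes) would lift the size limit at the cost of a few more embedded bits.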

Kinect Audio PCM Values

I'm using Kinect to extract audio and classify its features, but I have a question. On http://msdn.microsoft.com/en-us/library/hh855698.aspx it says the audio.start method "opens an audio data stream (16-bit PCM format, sampled at 16 kHz) and starts capturing audio data streamed out of a sensor". The problem is that I don't know how PCM data is represented, and I don't know whether the method returns true PCM values or not, because using the SDK examples I get values like 200, 56, 17, and I think audio values are more like -3*10^-5.
So does anyone know how I get the true PCM values? Or am I doing something wrong?
Thanks
I wouldn't expect any particular values. 16-bit PCM means it's a series of 16-bit integers, so -3*10^-5 (-0.00003) isn't representable.
I would guess it's encoded with 16-bit signed integers (like a WAV file) which have a range of -32768 to 32767. If you're being very quiet the values will probably be close to 0. If you make a lot of noise you will see some higher values too.
Check out this diagram (from Wikipedia's article on PCM) which shows a sine wave encoded as PCM using 4-bit unsigned integers, which have a range of 0 to 15.
See how that 4-bit sine wave oscillates around 7? That's its equilibrium. If it were a signed 4-bit integer (which has a range of -8 to 7) it would have the same shape, but the values would be shifted by -8, so it would oscillate around 0.
You can measure the distance from the equilibrium to the highest or lowest points of the sine wave to get its amplitude, or broadly, its volume (which is why, if you're quiet, you will mostly see values near 0 in your signed 16-bit data). This is probably the easiest sort of feature detection you can do. You can find plenty of good explanations on the web about this, for example http://scienceaid.co.uk/physics/waves/sound.html.
You could save it to a file and play it back with something like Audacity if you're not sure. Fiddle with the input settings and you'll soon figure out the format.
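If the values you see (200, 56, 17) are the individual bytes of the stream's buffer, which is only a guess, then every two bytes have to be combined into one sample before the numbers mean anything. A minimal sketch, assuming little-endian signed 16-bit PCM as guessed above; PcmDecoder, ToSamples and PeakAmplitude are made-up names, not Kinect SDK calls.

```csharp
using System;

public static class PcmDecoder
{
    // Combines each pair of bytes (low byte first) into one signed 16-bit sample.
    public static short[] ToSamples(byte[] buffer, int byteCount)
    {
        short[] samples = new short[byteCount / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            int raw = buffer[2 * i] | (buffer[2 * i + 1] << 8);
            samples[i] = unchecked((short)raw); // reinterpret 0..65535 as -32768..32767
        }
        return samples;
    }

    // Largest distance from the equilibrium (0 for signed data): a crude loudness measure.
    public static int PeakAmplitude(short[] samples)
    {
        int peak = 0;
        foreach (short s in samples)
        {
            int abs = Math.Abs((int)s);
            if (abs > peak) peak = abs;
        }
        return peak;
    }
}
```

The byteCount parameter is there because a stream read usually reports how many bytes it actually filled, which can be less than the buffer length.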

ASCII file parsing speed

I have two types of file. One of them is an ASCII file, and its data is stored like this:
X Y Value
0 0 5154,4
1 0 5545455;
. . ...
. . ...
The other one is a binary file.
I parse the first one with StreamReader and the ReadLine() method, then put the values into a double[,] array using Split(' ').
I parse the second one with BinaryReader.
Parsing the binary file is 3-4 times faster than parsing the ASCII one.
Question 1: Reading the ASCII file is slower than reading the binary one. Is that normal?
Question 2: Can you suggest another way of parsing the ASCII file?
It's not so much that reading ASCII is slower, but how you do it.
It's parsing: looking for new lines and separators, then converting bits of text to other formats. BinaryReader is basically a straight memory copy.
It's like the difference between fixed length and CSV, or CSV and XML. The more metadata you add, the more you can get out of it, but the more it costs.
Reading an ASCII file character by character might work out faster than ReadLine and Split, in that you could optimise it for your specific file structure. It's a lot of work though, and very fragile, making it a dubious prospect. Moving the loading to a separate thread, perhaps even processing the lines in parallel, might be more rewarding, and would definitely be more satisfying and reusable.
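As a rough illustration of tailoring the parser to the file structure, the sketch below handles one "X Y Value" line with IndexOf instead of Split, so no intermediate string array is allocated. It assumes exactly three space-separated fields and a decimal comma, as in the sample row "0 0 5154,4"; AsciiLineParser and ParseLine are hypothetical names, and whether this is measurably faster would need to be checked against your own data.

```csharp
using System;
using System.Globalization;

public static class AsciiLineParser
{
    // Assumption: values use a decimal comma, as in the sample row "0 0 5154,4".
    private static readonly NumberFormatInfo CommaDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = ",", NumberGroupSeparator = "." };

    // Parses one "X Y Value" line by locating the two separators directly.
    public static void ParseLine(string line, out int x, out int y, out double value)
    {
        int first = line.IndexOf(' ');
        int second = line.IndexOf(' ', first + 1);
        if (first < 0 || second < 0)
            throw new FormatException("Expected three space-separated fields.");

        x = int.Parse(line.Substring(0, first), CultureInfo.InvariantCulture);
        y = int.Parse(line.Substring(first + 1, second - first - 1), CultureInfo.InvariantCulture);
        value = double.Parse(line.Substring(second + 1), NumberStyles.Float, CommaDecimal);
    }
}
```

Passing an explicit format also keeps the result independent of the machine's current culture, which matters when the file uses a decimal comma.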
Reading an ASCII file is not inherently different from reading a binary one; the difference is in the parsing. After reading a line of the ASCII file you still have to parse each string into a double, and that takes processing time. In the binary file the bytes you read are already the binary representation of the double, so no parsing is needed.
Once a month we receive a 350 MB CSV file with 3.5 million rows. We used to read it one line at a time and build some indexes, which took approx. 60 seconds every time the service was restarted. I made a program that boiled it down to 1.7 million rows and converted it to a binary format of approx. 24 MB. This data was read directly into memory in 7 ms, the indexes were generated when needed and the data was converted when used. The memory consumption declined from 400 MB to 90 MB. The point is that you should choose an appropriate format for your data if performance is an issue. Also note that this solution is only possible because the data is fairly static and is not retrieved more than a few million times in 24 hours. I believe that the new service actually answers a little faster now than it used to.
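Converting the parsed data to a binary file once, and loading that on later runs, could be sketched as follows. The layout used here (row count, column count, then the raw doubles) is an assumption chosen for the example, and GridBinaryFormat is a hypothetical name.

```csharp
using System;
using System.IO;

public static class GridBinaryFormat
{
    // Writes the dimensions followed by every value as a raw 8-byte double.
    public static void Write(Stream output, double[,] grid)
    {
        var writer = new BinaryWriter(output);
        int rows = grid.GetLength(0);
        int cols = grid.GetLength(1);
        writer.Write(rows);
        writer.Write(cols);
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                writer.Write(grid[r, c]);
        writer.Flush(); // the caller owns the stream, so it is flushed but not closed
    }

    // Reads the grid back; no text parsing is involved.
    public static double[,] Read(Stream input)
    {
        var reader = new BinaryReader(input);
        int rows = reader.ReadInt32();
        int cols = reader.ReadInt32();
        var grid = new double[rows, cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                grid[r, c] = reader.ReadDouble();
        return grid;
    }
}
```

The ASCII file then only has to be parsed when it changes, and every later start-up reads the binary copy.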

Audio Beat Detection in C#

Using the System.IO BinaryReader object found in the .NET mscorlib assembly, I ran a loop that dumped each value from a .wav file into an Excel spreadsheet. For simplicity's sake, I recorded a two-second 4 kHz signal from a signal generator into a software sequencer and saved it as a monaural wave file. The software I sequence music with shows a resolution of 1 ms, which is 44.1 samples (assuming a 44.1 kHz sample rate). What I find curious is that the data extracted via the ReadInt16() method (starting at position 44 in the .wav file) shows varied numbers, with integers switching signs seemingly at random, whereas the visual sine wave within the sequencer is completely uniform with respect to amplitude and frequency. With 16-bit resolution, I determined that for each sample the first byte was frequency resolution and the second amplitude. Is that correct?
Question: How can I intelligently interpret the integers pulled from wave file for the ultimate purpose of determining rhythmic beats?
Many thanks...........Mickey
For a WAV file with 16 bits per sample, it is not the case that the first byte of the sample is frequency resolution and the second byte is amplitude. Both bytes together indicate the sample's amplitude at that specific point in time. The two bytes are interpreted as a 2-byte integer, so the values will range from -32768 to +32767.
I do not know how your sequencer works or what it is displaying. From your description, it sounds as if your sequencer is using FFT to convert the audio from time-domain (which is what a WAV file is) to frequency-domain (which is a graph with frequency along the x-axis and frequency amplitude along the y-axis). A WAV file does not contain frequency information.
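To work with the samples as amplitudes, one possible first step toward finding beats is to measure the energy of the signal over short windows and look for sudden rises. The sketch below is only that first step, not a beat detector, and it is not something the answer above prescribes. It assumes a mono 16-bit file with a canonical 44-byte header and a seekable stream; WindowEnergy, ReadSamples and RmsPerWindow are hypothetical names.

```csharp
using System;
using System.IO;

public static class WindowEnergy
{
    // Reads every 16-bit sample after the header (assumed to be 44 bytes long).
    public static short[] ReadSamples(Stream wav, int headerBytes = 44)
    {
        var reader = new BinaryReader(wav);
        wav.Seek(headerBytes, SeekOrigin.Begin);
        int count = (int)Math.Max(0, (wav.Length - headerBytes) / 2);
        short[] samples = new short[count];
        for (int i = 0; i < count; i++)
            samples[i] = reader.ReadInt16(); // both bytes together form one amplitude
        return samples;
    }

    // Root-mean-square amplitude of each consecutive window of `windowSize` samples.
    public static double[] RmsPerWindow(short[] samples, int windowSize)
    {
        int windows = samples.Length / windowSize; // a trailing partial window is ignored
        double[] rms = new double[windows];
        for (int w = 0; w < windows; w++)
        {
            double sum = 0;
            for (int i = 0; i < windowSize; i++)
            {
                double s = samples[w * windowSize + i];
                sum += s * s;
            }
            rms[w] = Math.Sqrt(sum / windowSize);
        }
        return rms;
    }
}
```

For a steady sine wave every window has roughly the same RMS value, even though the individual samples keep changing sign, which matches the uniform picture the sequencer shows.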
