I'm streaming music from Spotify with the C# wrapper ohLibSpotify and playing it with NAudio. Now I'm trying to create a spectrum visualization for the data I receive.
When I get data from libspotify, the following callback is called:
public void MusicDeliveryCallback(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
    //handle received music data from spotify for streaming
    //format: audio format for streaming
    //frames: pointer to the byte-data in storage
    var size = num_frames * format.channels * 2; //2 bytes per sample (16-bit PCM)
    if (size != 0)
    {
        _copiedFrames = new byte[size];
        Marshal.Copy(frames, _copiedFrames, 0, size); //copy the unmanaged bytes into _copiedFrames
        _bufferedWaveProvider.AddSamples(_copiedFrames, 0, size); //add the bytes from _copiedFrames as samples
    }
}
Is it possible to analyze the data I pass to the BufferedWaveProvider to create a real-time visualization? And can somebody explain how?
The standard tool for transforming a time-domain signal, such as audio samples, into frequency-domain information is the Fourier transform.
Grab the fast Fourier transform library of your choice and throw it at your data; you will get a decomposition of the signal into its constituent frequencies. You can then take that data and visualize it however you like. Spectrograms are particularly easy: you just plot the magnitude of each frequency component against frequency and time.
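For example, NAudio itself ships a small FFT in the NAudio.Dsp namespace that you can run on the bytes you already copy in MusicDeliveryCallback. The sketch below is only one possible approach, assuming 16-bit PCM as delivered by libspotify; GetSpectrum is a name I made up, not part of either library:
using System;
using NAudio.Dsp; // Complex, FastFourierTransform

public static float[] GetSpectrum(byte[] pcm, int channels)
{
    const int fftLength = 1024; // must be a power of two; needs at least fftLength * channels * 2 bytes
    var complex = new Complex[fftLength];

    // Convert the first fftLength 16-bit samples (left channel) to floats
    // and apply a Hann window to reduce spectral leakage.
    for (int i = 0; i < fftLength; i++)
    {
        int byteIndex = i * channels * 2; // 2 bytes per 16-bit sample
        short sample = (short)(pcm[byteIndex] | (pcm[byteIndex + 1] << 8));
        complex[i].X = (sample / 32768f) * (float)FastFourierTransform.HannWindow(i, fftLength);
        complex[i].Y = 0;
    }

    // In-place FFT; the second argument is log2 of the FFT length.
    FastFourierTransform.FFT(true, (int)Math.Log(fftLength, 2.0), complex);

    // Magnitudes of the first half (up to the Nyquist frequency) drive the visualization.
    var magnitudes = new float[fftLength / 2];
    for (int i = 0; i < magnitudes.Length; i++)
        magnitudes[i] = (float)Math.Sqrt(complex[i].X * complex[i].X + complex[i].Y * complex[i].Y);
    return magnitudes;
}
You could call this from MusicDeliveryCallback on _copiedFrames (or on blocks read back from the BufferedWaveProvider) and hand the magnitudes to whatever drawing code you use.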
Requirement:
I am trying to capture audio/video of the Windows screen with the SharpAvi example, together with a loopback audio stream from the NAudio example.
I am using C# and WPF to achieve this.
A couple of NuGet packages:
SharpAvi - for video capturing
NAudio - for audio capturing
What has been achieved:
I have successfully integrated the provided samples, and I am trying to capture the audio through NAudio alongside the SharpAvi video stream so that the recording contains both video and audio.
Issue:
Whatever audio I write into the SharpAvi audio stream, the output file is recorded with video only; the audio is empty.
Checking audio alone to make sure:
But when I capture the audio to a separate file called "output.wav", it is recorded as expected and I can hear the audio. So for now I'm concluding that the issue is only in the integration with video via SharpAvi.
writterx = new WaveFileWriter("Out.wav", audioSource.WaveFormat);
Full code to reproduce the issue:
https://drive.google.com/open?id=1H7Ziy_yrs37hdpYriWRF-nuRmmFbsfe-
Code glimpse from Recorder.cs
NAudio Initialization:
audioSource = new WasapiLoopbackCapture();
audioStream = CreateAudioStream(audioSource.WaveFormat, encodeAudio, audioBitRate);
audioSource.DataAvailable += audioSource_DataAvailable;
Capturing audio bytes and writing them to the SharpAvi audio stream:
private void audioSource_DataAvailable(object sender, WaveInEventArgs e)
{
    // Wait until the video thread has written a frame (or the recorder is stopping)
    var signalled = WaitHandle.WaitAny(new WaitHandle[] { videoFrameWritten, stopThread });
    if (signalled == 0)
    {
        audioStream.WriteBlock(e.Buffer, 0, e.BytesRecorded);
        audioBlockWritten.Set();
        Debug.WriteLine("Bytes: " + e.BytesRecorded);
    }
}
Can you please help me out with this? Any other way to meet my requirement is also welcome.
Let me know if any further details are needed.
Obviously the author doesn't need it any more, but since I ran into the same problem, others might.
In my case the problem was that I was getting audio every 0.1 seconds and attempted to write both the new video and the new audio at the same time. Getting the new video data (taking the screenshot) took too long, so each frame was added every 0.3 seconds instead of 0.1. That put the audio stream out of sync with the video, and the result was not played properly by video players. After optimizing the code a little to stay within 0.1 seconds, the problem was gone. A quick way to check whether your frame capture fits the budget is shown below.
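A minimal sketch of that check, assuming 10 fps; CaptureAndWriteVideoFrame is a placeholder for your own SharpAvi screenshot/WriteFrame call:
using System;
using System.Diagnostics;

var frameInterval = TimeSpan.FromSeconds(1.0 / 10); // 10 fps => 0.1 s budget per frame
var sw = Stopwatch.StartNew();

CaptureAndWriteVideoFrame(); // your screenshot + video frame write

sw.Stop();
if (sw.Elapsed > frameInterval)
    Debug.WriteLine($"Frame took {sw.ElapsedMilliseconds} ms - audio will drift out of sync");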
Context
I have been creating a system where a Raspberry Pi sends images to a remote client in real time.
The Raspberry Pi captures the images using a Raspberry Pi camera. A captured image is available as a 3-dimensional array of all the pixels (rows, columns and RGB). By sending and displaying the images fast enough, they appear as video to the user.
My goal is to send these images in real time with the image resolution being as high as possible. An acceptable frame rate is around 30 fps. I selected UDP rather than TCP, because data can be transferred much faster with UDP due to less overhead. Retransmission of individual packets is not necessary because losing some pixels is acceptable in my case. The Raspberry Pi and the client are located in the same network, so not many packets will be dropped anyway.
Taking into account that the maximum transmission unit (MTU) on the Ethernet layer is 1500 bytes, and that the UDP packets should not be fragmented or dropped, I selected a maximum payload length of 1450 bytes, of which 1447 bytes are data and 3 bytes are application-layer overhead. The remaining 50 bytes are reserved for the headers that the lower layers (IP and UDP) add automatically.
I mentioned that captured images are available as an array. Assuming the size of this array is, for example, 1,036,800 bytes (e.g. width=720 * height=480 * numberOfColors=3), then 717 (1,036,800 / 1447) UDP packets are needed to send the entire array. The C++ application on the Raspberry Pi does this by splitting the array into fragments of 1447 bytes and adding a fragment index number, which is between 1 and 717, as overhead to the packet. We also add an image number, to distinguish it from a previously sent image/array. The packet looks like this:
[Image: UDP packet layout – 1 byte image number, 2 bytes fragment index, up to 1447 bytes of image data]
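For clarity, this is roughly how that 3-byte header maps onto reassembly on the C# side; it mirrors the parsing in ProcessThread further down, and imageArray is a placeholder for the full image buffer:
const int DataBytesPerPacket = 1447;

byte imageNumber  = data[0];                  // 1 byte image number
int fragmentIndex = (data[1] << 8) | data[2]; // 2 bytes, big-endian, 1-based (1-717)
int offsetInImage = (fragmentIndex - 1) * DataBytesPerPacket;

// The remaining bytes (data.Length - 3, at most 1447) belong at offsetInImage.
Buffer.BlockCopy(data, 3, imageArray, offsetInImage, data.Length - 3);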
Problem
On the client side, I developed a C# application that receives all the packets and reassembles the array using the included index numbers. Using the EmguCV library, the received array is converted to an image and drawn in a GUI. However, some of the received images are drawn with black lines/chunks. When debugging, I discovered that this problem is not caused by drawing the image; the black chunks are actually missing array fragments that never arrived. Because the byte values in an array are initialized to 0 by default, the missing fragments show up as black chunks.
Debugging
Using Wireshark on the client side, I searched for the index of such a missing fragment and was surprised to find it, intact. This would mean that the data is received correctly on the transport layer (and observed by Wireshark), but never read on the application layer.
This image shows that a chunk of a received array is missing, at index 174,000. Because there are 1447 data bytes in a packet, the index of this missing data corresponds to a UDP packet with fragment index 121 (174,000 / 1447). The hexadecimal equivalent of 121 is 0x79. The following image shows the corresponding UDP packet in Wireshark, proving the data was still intact on the transport layer.
What I have tried so far
When I lower the frame rate there are fewer black chunks, and they are often smaller. At a frame rate of 3 fps there is no black at all. However, this frame rate is not desired. That is a data rate of around 3 fps * 720x480x3 = 3,110,400 bytes per second (roughly 3 MB/s, or about 25 Mbit/s). A normal computer should be capable of reading more than this. And as I explained, the packets DID arrive in Wireshark; they are just not read at the application layer.
I have also tried changing the UDP payload length from 1447 to 500. This only made it worse (see image).
I implemented multithreading so that data is read and processed in different threads.
I tried a TCP implementation. The images were received intact, but it was not fast enough to transfer the images in real time.
It is notable that a 'black chunk' does not represent a single missing fragment of 1447 bytes, but many consecutive fragments. So at some point while reading data, a number of packets are not read. Also, not every image has this problem; some arrive intact.
I am wondering what is wrong with my implementation that results in this unwanted effect, so I will post some of my code below.
Please note that the exception SocketException is never actually thrown and the Console.WriteLine for 'invalid overhead' is never printed either. _client.Receive always receives 1450 bytes, except for the last fragment of an array, which is smaller.
Also
Besides solving this bug, if anyone has alternative suggestions for transmitting these arrays in a more efficient way (requiring less bandwidth but without quality loss), I would gladly hear it. As long as the solution has the array as input/output on both endpoints.
Most importantly: NOTE that the missing packets were never returned by the UdpClient.Receive() method.
I did not post the code for the C++ application running on the Raspberry Pi, because the data did arrive (in Wireshark) as I have already shown. So the transmission is working fine, but receiving is not.
private const int ClientPort = 50000;
private UdpClient _client;
private Thread _receiveThread;
private Thread _processThread;
private volatile bool _started;
private ConcurrentQueue<byte[]> _receivedPackets = new ConcurrentQueue<byte[]>();
private IPEndPoint _remoteEP = new IPEndPoint(IPAddress.Parse("192.168.4.1"), 2371);
public void Start()
{
    if (_started)
    {
        throw new InvalidOperationException("Already started");
    }
    _started = true;
    _client = new UdpClient(ClientPort);
    _receiveThread = new Thread(new ThreadStart(ReceiveThread));
    _processThread = new Thread(new ThreadStart(ProcessThread));
    _receiveThread.Start();
    _processThread.Start();
}
public void Stop()
{
    if (!_started)
    {
        return;
    }
    _started = false;
    _receiveThread.Join();
    _receiveThread = null;
    _processThread.Join();
    _processThread = null;
    _client.Close();
}
public void ReceiveThread()
{
    // Time out every 100 ms so the loop can check _started and exit cleanly
    _client.Client.ReceiveTimeout = 100;
    while (_started)
    {
        try
        {
            byte[] data = _client.Receive(ref _remoteEP);
            _receivedPackets.Enqueue(data);
        }
        catch (SocketException ex)
        {
            Console.WriteLine(ex.Message);
            continue;
        }
    }
}
private void ProcessThread()
{
    while (_started)
    {
        byte[] data;
        bool dequeued = _receivedPackets.TryDequeue(out data);
        if (!dequeued)
        {
            continue; // nothing buffered yet
        }
        int imgNr = data[0];
        int fragmentIndex = (data[1] << 8) | data[2];
        if (imgNr <= 0 || fragmentIndex <= 0)
        {
            Console.WriteLine("Received data with invalid overhead");
            continue; // skip this packet instead of ending the thread
        }
        // I omitted the code for this method because it does not touch the
        // socket and is therefore not really relevant to the issue I described
        ProcessReceivedData(imgNr, fragmentIndex, data);
    }
}
I am trying to increase the amplitude of the sound wave in my code. I have a buffer containing all the bytes needed to make the wave.
Here is my code for the audio Playback:
public void AddSamples(byte[] buffer)
{
    //somehow adjust the buffer to make the sound louder
    bufferedWaveProvider.AddSamples(buffer, 0, buffer.Length);
    WaveOut waveout = new WaveOut();
    waveout.Init(bufferedWaveProvider);
    waveout.Play();
    //to make the program more memory efficient
    bufferedWaveProvider.ClearBuffer();
}
You could convert to an ISampleProvider and then try to amplify the signal by passing it through a VolumeSampleProvider with a gain > 1.0. However, you could end up with hard clipping if any samples go above full scale (0 dBFS).
WaveOut waveout = new WaveOut();
var volumeSampleProvider = new VolumeSampleProvider(bufferedWaveProvider.ToSampleProvider());
volumeSampleProvider.Volume = 2.0f; // double the amplitude of every sample - may go above 0dB
waveout.Init(volumeSampleProvider);
waveout.Play();
A better solution would be to use a dynamic range compressor effect, but NAudio does not come with one out of the box. A very rough substitute is sketched below.
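As a minimal sketch of that idea (this class is not part of NAudio; the name and the tanh soft-clipping curve are my own choices), you can wrap any ISampleProvider and saturate peaks smoothly instead of hard-clipping:
using System;
using NAudio.Wave;

public class LimitingSampleProvider : ISampleProvider
{
    private readonly ISampleProvider source;
    private readonly float gain;

    public LimitingSampleProvider(ISampleProvider source, float gain)
    {
        this.source = source;
        this.gain = gain;
    }

    public WaveFormat WaveFormat => source.WaveFormat;

    public int Read(float[] buffer, int offset, int count)
    {
        int read = source.Read(buffer, offset, count);
        for (int i = 0; i < read; i++)
        {
            // Apply gain, then soft-clip with tanh so peaks saturate smoothly
            // instead of wrapping or hard-clipping at +/- 1.0.
            buffer[offset + i] = (float)Math.Tanh(buffer[offset + i] * gain);
        }
        return read;
    }
}
Usage would look like waveout.Init(new LimitingSampleProvider(bufferedWaveProvider.ToSampleProvider(), 2.0f));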
I had a similar problem too, but I was able to solve it with the help of this link:
http://mark-dot-net.blogspot.hu/2009/10/playback-of-sine-wave-in-naudio.html
Knowing all the details about your sound (sample rate, bit depth, number of channels) would be helpful, I think. If it is 16-bit PCM, you can also scale the raw bytes directly, as sketched below.
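A rough sketch of that direct approach, assuming little-endian 16-bit PCM and using plain hard clipping (the method name is my own):
public static void Amplify16BitPcm(byte[] buffer, float gain)
{
    for (int i = 0; i < buffer.Length - 1; i += 2)
    {
        short sample = (short)(buffer[i] | (buffer[i + 1] << 8));
        int amplified = (int)(sample * gain);

        // Clamp to the 16-bit range to avoid integer wrap-around (this is hard clipping).
        if (amplified > short.MaxValue) amplified = short.MaxValue;
        if (amplified < short.MinValue) amplified = short.MinValue;

        buffer[i] = (byte)(amplified & 0xFF);
        buffer[i + 1] = (byte)((amplified >> 8) & 0xFF);
    }
}
You would call this on the buffer before passing it to bufferedWaveProvider.AddSamples.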
I have been looking around for a way to convert live frames into video. I found the NReco.VideoConverter FFmpeg wrapper library to convert live frames to video, but the problem is that writing each frame to the ConvertLiveMediaTask (asynchronous live media conversion) takes too long.
I have an event that provides raw frames (1920x1080, 25 fps) from an IP camera. Whenever I get a frame I do the following:
//Image available event fired
//...
//...
//Record video is true
if (record)
{
    //////////////############# Time taking part ##############//////////////////////
    var bd = frameBmp.LockBits(new Rectangle(0, 0, frameBmp.Width, frameBmp.Height), ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
    var buf = new byte[bd.Stride * frameBmp.Height];
    Marshal.Copy(bd.Scan0, buf, 0, buf.Length);
    // write to ConvertLiveMediaTask
    convertLiveMediaTask.Write(buf, 0, buf.Length); // ffMpegTask
    frameBmp.UnlockBits(bd);
    //////////////////////////////////////////////////////////////////////////////////
}
Since the above part takes too much time, I am losing frames.
//Stop recording
convertLiveMediaTask.Stop(); //ffMpegTask
For stopping the recording I have used a BackgroundWorker, because saving the media to a file takes too much time.
My question is: how can I write the frames to the ConvertLiveMediaTask in a faster way? Is there any possibility of writing them in the background?
Please give me suggestions.
I'm sure that most of the time is taken by FFmpeg encoding and compressing the raw bitmaps (if you encode them with H.264 or something like that), because of the Full HD resolution (NReco.VideoConverter is a wrapper around FFmpeg). You must know that real-time encoding of Full HD is a VERY CPU-consuming task; if your computer is not able to do that, you may try to play with the FFmpeg encoding parameters (decrease video quality / compression ratio etc.) or use an encoder that requires less CPU resources.
If you need to record a limited-time live stream, you can split video capturing and compressing/saving into two threads.
Use, for example, a ConcurrentQueue to buffer live frames (enqueue) on one thread without delay, while the other thread saves those frames at whatever pace it can (dequeue). This way you will not lose frames; a sketch follows below.
Obviously you will have a strain on RAM, and after stopping the live video there will be a delay while the saving thread finishes.
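A minimal sketch of that two-thread approach; frameQueue, recording and WriterLoop are names I made up, and convertLiveMediaTask is the task from the question:
using System.Collections.Concurrent;
using System.Threading;

private readonly ConcurrentQueue<byte[]> frameQueue = new ConcurrentQueue<byte[]>();
private volatile bool recording;

// Producer: called from the camera event; only enqueues, no encoding here.
private void OnFrameAvailable(byte[] frameBytes)
{
    if (recording)
        frameQueue.Enqueue(frameBytes);
}

// Consumer: runs on its own thread and feeds FFmpeg at whatever pace it can.
private void WriterLoop()
{
    while (recording || !frameQueue.IsEmpty)
    {
        byte[] frame;
        if (frameQueue.TryDequeue(out frame))
            convertLiveMediaTask.Write(frame, 0, frame.Length);
        else
            Thread.Sleep(1); // nothing buffered yet
    }
    convertLiveMediaTask.Stop();
}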
I am trying to perform an FFT on a signal produced by a .wav file that has 1 channel and 64064 samples (approximately 4 seconds long at 16 kHz). I am using Accord.NET and the following code to attempt to create a ComplexSignal object, which is required to perform an FFT.
string fileName = "mu1.wav"; //the name of my wave file
WaveDecoder sourceDecoder = new WaveDecoder(fileName); //Accord.Audio.Formats.WaveDecoder
Signal s = sourceDecoder.Decode(); //SampleFormat says Format32bitIeeeFloat
ComplexSignal c = s.ToComplex(); //This throws the following exception:
//InvalidSignalPropertiesException
//Signals length should be a power of 2.
Reading the source code of Signal, this should only be thrown if the Signal.SampleFormat isn't Format32bitIeeeFloat, which it is.
I'm really surprised it isn't easier to manipulate the audio features (specifically the frequencies) of a wav file in C#.
You need to create a Hamming (or other) window with a size that is a power of 2 (I chose 1024 here). Then split the signal with that window, convert each window to a ComplexSignal, and perform the forward Fourier transform.
string fileName = "mu1.wav";
WaveDecoder sourceDecoder = new WaveDecoder(fileName);
Signal sourceSignal = sourceDecoder.Decode();
//Create Hamming window so that signal will fit into power of 2:
RaisedCosineWindow window = RaisedCosineWindow.Hamming(1024);
// Splits the source signal by walking each 512 samples, then creating
// a 1024 sample window. Note that this will result in overlapped windows.
Signal[] windows = sourceSignal.Split(window, 512);
// You might need to import Accord.Math in order to call this:
ComplexSignal[] complex = windows.Apply(ComplexSignal.FromSignal);
// Forward to the Fourier domain
complex.ForwardFourierTransform(); //Complete!
//Recommend building a histogram to see the results
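To build that histogram, one option is to pull the power spectrum out of one of the windows with Accord's helpers; a small sketch (double-check Tools.GetPowerSpectrum and Tools.GetFrequencyVector against your Accord.NET version):
// Power spectrum and matching frequency axis for the first window (channel 0)
double[] power = Accord.Audio.Tools.GetPowerSpectrum(complex[0].GetChannel(0));
double[] freq = Accord.Audio.Tools.GetFrequencyVector(complex[0].Length, complex[0].SampleRate);

for (int i = 0; i < power.Length; i++)
{
    // e.g. add (freq[i], power[i]) as a bar in your chart/histogram
}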