I am making a VOIP program for fun, and I got it mostly working. Since my last question, another issue has come up. When there are two or more voices being played through the client using a MixingWaveProvider, there are strange stutters, clicks, snaps, and static in the final mixed audio. Most of the time, it sounds like a portion of someone's voice plays, pauses, and lets another person's voice play for a short while. This continues for as long as both are talking (Each voice seems to "take turns" outputting to the waveMixer).
I won't bother posting the Speex encoding/decoding code, as this issue happens with or without it being used. I get the input through a WaveInEvent, which feeds its data into a UDP network stream. The UDP stream sends the sound data to the other clients.
Here is the code that I use to initialize the WaveOut and MixingWaveProvider32:
waveOut = new DirectSoundOut(settings.GetOutputDevice(), 50);
waveMixer = new MixingWaveProvider32();
waveOut.Init(waveMixer);
waveOut.Play();
When a client connects, I input the received packet data into the user's BufferedWaveProvider:
provider = new BufferedWaveProvider(format) { DiscardOnBufferOverflow = true };
wave16ToFloat = new Wave16ToFloatProvider(provider);
After that, I use this code to add the above 32bit provider to the MixingWaveProvider32:
waveMixer.AddInputStream(wave16ToFloat);
It seems that the issue is less severe with streams added before MixingWaveProvider32 is passed to WaveOut, so adding them afterwards may be what causes this. However, I really need to be able to add them dynamically.
This may have something to do with my network implementation, so I will look into that if something else isn't found here. Could it be possible that each voice data packet is blocking the next one from being read, thus causing the back and forth kind of sound? If so, how could I buffer the data on the server longer or wait to send in larger chunks on the client?
Edit:
I am almost sure that this is caused by the BufferedWaveProviders draining completely several times a second. The packets are not filling them fast enough, and they drain leaving nothing left to transmit. As I asked above, is there any way that I can send them from the client in large chunks? Or can I make the buffers drain slower somehow?
Edit 2:
I have now implemented an auto-pausing buffer that makes sure it stays filled. The buffer unpauses when its internal buffer goes above 1 second of sound, and it pauses when the data drops below 0.5 seconds. The buffer now hovers around 1 second of sound, and I have checked that it is not running out or pausing the sound mid-stream. Though this should be a good thing, the sound distortion still exists, and it is just as bad as before. It seems to be something wrong with the mixer or my setup.
Sounds like you have already diagnosed the problem. If the BufferedWaveProviders aren't filling up then you will get silence. You need to implement some kind of auto-pause that delays playback until there is enough buffered audio. A cheating way to do this is to start off each buffer with five seconds of silence, allowing hopefully another five seconds of audio to be received while this buffer plays out.
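The auto-pause idea can be sketched roughly as below. This is a minimal, library-free illustration, not NAudio code: the class name PrebufferedAudioQueue and its members are hypothetical, and it assumes signed 16-bit PCM, where zero bytes are silence. It holds playback back until a resume threshold is reached and returns silence while it refills.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of NAudio): holds playback back until enough
// audio is queued, and hands out silence while it is refilling.
public sealed class PrebufferedAudioQueue
{
    private readonly Queue<byte> _bytes = new Queue<byte>();
    private readonly object _gate = new object();
    private readonly int _resumeThreshold; // start playing at or above this many bytes
    private readonly int _pauseThreshold;  // stop playing once below this many bytes
    private bool _playing;

    public PrebufferedAudioQueue(int bytesPerSecond, double resumeSeconds, double pauseSeconds)
    {
        _resumeThreshold = (int)(bytesPerSecond * resumeSeconds);
        _pauseThreshold = (int)(bytesPerSecond * pauseSeconds);
    }

    public bool IsPlaying { get { lock (_gate) return _playing; } }

    // Called from the network thread for every received packet.
    public void Write(byte[] data, int offset, int count)
    {
        lock (_gate)
        {
            for (int i = 0; i < count; i++) _bytes.Enqueue(data[offset + i]);
            if (!_playing && _bytes.Count >= _resumeThreshold) _playing = true;
        }
    }

    // Called from the audio thread. Always fills 'count' bytes so the consumer
    // never sees a short read; missing audio is replaced with zero bytes.
    public int Read(byte[] buffer, int offset, int count)
    {
        lock (_gate)
        {
            if (_playing && _bytes.Count < _pauseThreshold) _playing = false;
            for (int i = 0; i < count; i++)
                buffer[offset + i] = (_playing && _bytes.Count > 0) ? _bytes.Dequeue() : (byte)0;
            return count;
        }
    }
}
```

The two thresholds give hysteresis, so a buffer hovering near one level does not flip between playing and paused on every packet. One consequence of this simple version is that the last half second of a talk spurt stays queued until more data arrives.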
Very similar to this question, I have a networked microcontroller which collects PCM audio (8-bit, 8 kHz) and streams the data as raw bytes over a TCP network socket. I am able to connect to the device, open a NetworkStream, and create a RIFF / wave file out of the collected data.
I would like to take it a step further and enable live playback of the measurements. My approach so far has been to buffer the incoming data into multiple MemoryStreams, each with an appropriate RIFF header, and, when each chunk is complete, to use System.Media.SoundPlayer to play the wave file segment. To avoid high latency, each segment is only 0.5 seconds long.
My major issue with this approach is that often there is a distinctive popping sound between segments (since each chunk is not necessarily zero-centered or zero-ended).
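For reference, the per-segment wrapping described above might look something like the sketch below. WavSegment and Wrap are hypothetical names; the layout is the canonical 44-byte PCM header, and mono audio is assumed since the question does not state the channel count. Note that 8-bit PCM in a wave file is unsigned, so silence is the value 128 rather than 0.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: wraps raw 8-bit unsigned mono PCM in a RIFF/WAVE header.
public static class WavSegment
{
    public static byte[] Wrap(byte[] pcm, int sampleRate = 8000)
    {
        const short channels = 1;        // assumption: mono
        const short bitsPerSample = 8;
        short blockAlign = (short)(channels * bitsPerSample / 8);
        int byteRate = sampleRate * blockAlign;

        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms, Encoding.ASCII))
        {
            w.Write(Encoding.ASCII.GetBytes("RIFF"));
            w.Write(36 + pcm.Length);    // size of everything after this field
            w.Write(Encoding.ASCII.GetBytes("WAVE"));
            w.Write(Encoding.ASCII.GetBytes("fmt "));
            w.Write(16);                 // fmt chunk size for plain PCM
            w.Write((short)1);           // format tag 1 = PCM
            w.Write(channels);
            w.Write(sampleRate);
            w.Write(byteRate);
            w.Write(blockAlign);
            w.Write(bitsPerSample);
            w.Write(Encoding.ASCII.GetBytes("data"));
            w.Write(pcm.Length);         // size of the sample data
            w.Write(pcm);
            w.Flush();
            return ms.ToArray();
        }
    }
}
```

A segment built this way can be handed to SoundPlayer through a MemoryStream. The sketch does not add the pad byte that RIFF expects after an odd-length data chunk, so it assumes even-sized segments.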
Questions:
Is there a more suitable or direct method to playback live streaming PCM audio in C#?
If not, are there additional steps I can take to make the multiple playbacks run more smoothly?
I don't think you can manage it without popping sounds with the SoundPlayer, because there shouldn't be any delay in pushing the buffers. Normally you should always have one extra buffer queued, but the SoundPlayer only holds one buffer. Even when the SoundPlayer raises an event that it is ready, you're already too late to play a new sound.
I advise you to check this link: Recording and Playing Sound with the Waveform Audio Interface http://msdn.microsoft.com/en-us/library/aa446573.aspx
There are some examples of the SoundPlayer (skip those), but also how to use the WaveOut. Look at the section Playing with WaveOut.
The SoundPlayer is normally used for notification sounds.
I have not seen anyone else trying to do this. It is completely possible I am approaching this the wrong way. Basically, I have a computer with a DVI input. If nothing is attached to the DVI input, then a program on the computer loads some images on screen. If a video source is connected to the DVI port, then my program should stop writing images and use the DVI video feed instead.
What mechanisms exist to determine if a DVI input exists, and if there is currently a valid video signal present? How can I read the video stream?
Or am I going about this the completely wrong way?
At a hardware level most video input subsystems, analog or digital, are capable of detecting the presence of an input signal, or at least something that has a lot of the characteristics of one.
For a digital standard, you have actual clocking data either on its own wire, or encoded in a serial data stream. Whether there appears to be a clock, and whether its frequency is regular and reasonable, would be a first test (though for some standards, reasonable can cover a huge range of frequencies).
Next, video (not just digital, even analog) has a repeating structure of lines and fields, so there should be two identifiable submultiples of the pixel clock, one corresponding to the start or end of each line, and the other to the start or end of each field (screen). Again, these might have their own wires, might have unique means of encoding (special voltages in the analog case), or might represent time gaps in the pixel data. Even if there were no sync and no retrace times, statistical analysis of the pixel data would probably give clues to the X and Y dimensions as many features in the picture would repeat.
Actual video input subsystems (think flatpanel monitor) can have even more complicated detection and auto-adapting circuits - they may for example resample the input in time to change the dots-per-line resolution, or they may even put it in a frame buffer and scale it in both X and Y.
What details of the inner workings of the video capture circuit are exposed to consumer, or even driver level software would depend a lot on the specifics of the chipset used - hopefully a data sheet is available. It's pretty likely though that somewhere there is a readable register bit that indicates if the input is capturing something that the circuit "thinks" is a video signal. You might even be able to read out parameters such as the X and Y resolution and scanning rates or pixel clock rate.
Similarly, the ability to get data out of the port would be chipset dependent, but if the port is going to be useful for anything, there is presumably an operating system driver for it which provides some sort of useful API to video consuming applications.
I have an Arduino mega communicating over Bluetooth (bluesmirf gold device) to a C# application that I wrote. The Arduino is constantly sending a serial signal of 32 characters, the first always being an "S" and the last an "E". Using putty I can confirm that this signal is being sent correctly 99% of the time.
Now I want to read this signal with my C# application, which I am doing with the following code:
public string receiveCommandHC()
{
    string messageHC = "";
    if (serialHC.IsOpen)
    {
        while (serialHC.ReadChar() != 'S')
        {
        }
        messageHC = serialHC.ReadTo("E");
        serialHC.DiscardInBuffer();
    }
    return messageHC;
}
serialHC is of the serial class.
Sometimes this works perfectly, but other times I have problems, and I cannot find out why it works sometimes but not others.
The problem seems to be that sometimes I get a rather large lag in the data that I am reading from the Arduino. I notice this because I am sending button states and they only change a few seconds after I actually press or release the button on the Arduino. I used the standard baud rate of the Bluetooth device, which is 115200, and was wondering if changing that to a much lower rate could yield better results. What advantage, if any, would that have? I do not need high communication rates; even updating the state 4-5 times a second would be acceptable for my application.
Is it possible the lag is coming from my code? I think it may be from the while loop that is waiting for the incoming "S" but then I don't see why it should hang there since there are new signals always coming in at a high rate.
I'm using the DiscardInBuffer() because I do not care about outdated data and just want to skip over that. It is much more important that I am reading current up to date data and acting on that fresh data.
Thank you for your help!
Best regards,
Bender
Update:
Just found out a bit more information while debugging. The problem only seems to appear:
When connected over Bluetooth (over USB cable there is absolutely NO Lag)
When a second Bluetooth connection is established from the PC to another device (different COM port and different baud rate)
Does anybody have any experience running two different devices off the same Bluetooth dongle on the PC? I can manage to connect to both no problem but still having the lag issue mentioned before.
Thanks for any help
You are not really using a physical serial port here. The BlueTooth driver merely emulates one. This is common, the Windows API has a well defined set of api functions to talk to a serial port. Emulating one makes the interface to the driver simple, the vendor doesn't have to supply an interface DLL or document a complicated DeviceIoControl() protocol.
Which means for one thing that the actual communication settings don't matter. Baudrate is meaningless in this scenario, it is the BlueTooth radio signaling that sets the transfer rate. The driver will accept whatever you select but will otherwise ignore it. Handshake signals might be interpreted, it's up to the driver to implement them. Communication error reporting is very rarely implemented, BlueTooth has an error correcting protocol, unlike a real serial port.
No, the loss of data here is entirely self induced. Clearly the driver does implement DiscardInBuffer(). Which accomplishes nothing but throw away any data that the driver received. This goes wrong if your code runs a bit late or gets interrupted by a thread context switch.
Delete the DiscardInBuffer() call.
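If only the freshest reading matters, one alternative to discarding the buffer is to parse everything that arrives and keep just the most recent complete frame. The sketch below is an illustration under assumptions, not the asker's code: LatestFrameParser is a hypothetical class, and it assumes, as the original ReadTo("E") call does, that the payload between "S" and "E" never contains those two characters.

```csharp
using System;
using System.Text;

// Hypothetical helper: accumulates incoming characters and remembers only the
// most recent complete "S...E" frame, without throwing away a partial frame.
public sealed class LatestFrameParser
{
    private readonly StringBuilder _current = new StringBuilder();
    private bool _inFrame;

    // Payload of the newest complete frame, or null if none has arrived yet.
    public string Latest { get; private set; }

    public void Feed(string chunk)
    {
        foreach (char c in chunk)
        {
            if (c == 'S')
            {
                // Start (or restart) a frame; this also resynchronizes after noise.
                _inFrame = true;
                _current.Clear();
            }
            else if (c == 'E' && _inFrame)
            {
                Latest = _current.ToString();
                _inFrame = false;
            }
            else if (_inFrame)
            {
                _current.Append(c);
            }
        }
    }
}
```

The text returned by SerialPort.ReadExisting() could be passed to Feed each time data arrives; stale frames are overwritten, while a frame that is only half received is completed by the next call instead of being cut off.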
I want to make a similar application to Skype, and the main problem is working with video and audio. The first problem is how to get a bytes array of the video (to be specific, I need to get bytes which represent the video, so that I can send them over the internet), and same with audio. The second problem is to play bytes that come from the other computer.
I've been thinking of doing that in WPF. I'm new to WPF (I have practiced a little bit, and made a couple of programs, among them a basic chat program). I'm doing this for practice, and I want to code as much as I can by myself: server, client, transmission of data, and so on...
I've been searching over the internet, and only one solution seems to me to be good, or better to say feasible, is to use DirectShow.
Just to add, I know that camera and microphone is supported in Silverlight, and I've tried that (actually, I've tried to host an HTML page with silverlight project in WPF project in which were webbrowser control, and I've succeeded to show video from my webcam), but I don't know how to get bytes which represent video.
Is that possible to do with WPF or silverlight?
I'll be very grateful for suggesting any solution, advice, or useful links.
Using DirectShow filter graphs, you'll have a direct access to image and audio buffers from input devices (such as cameras and microphones) as bytes array, sample by sample. You'll be able to directly manipulate the data, to chose a coding or compression format (using specific filters), and to control the data rate and synchronization.
However :
if you're entirely new to this environment, it will be hard. Also, I know it works nicely with C++, but I've never coded any DirectShow application in C#. (You may want to look this way : CodeProject Tutorials, MSDN DirectShow topics, and tests using graphedit)
streaming media across a network and receiving it with DirectShow is not trivial and can be quite a pain. Network renderers and network source filters are available all around, but are always difficult to use in my opinion. And depending on your video format (H264, MPEG, MJPEG...) and network protocol (RTSP, plain old simple UDP...) choices, you might end up having to write your own stream/source filters, which is hard and time consuming.
Nevertheless, it IS feasible, and if your main objective is practice with coding, then why not !
(Never used WPF, maybe it's actually way simpler !)
I can't speak to WPF or Silverlight, but I've done this in DirectShow, and it's a pain in the ass.
If you want to use .NET, there's an open source wrapper called DirectShow.NET that helps a lot, and it's still a pain in the ass.
Microsoft did a good job with DirectShow and the whole Filter-Graph thing, but then they sort of dropped the ball a while ago and haven't updated it in years.
I'd recommend looking for a different technology (although it probably sits on DirectShow), and I'd be interested to hear what you find.
To all who are interested in this subject,
After spending hours and hours searching the internet, I managed to find a solution that should work. With Silverlight I take captures, resize them to 160x120 (or less), and then convert them with imagetools. One thread is responsible for taking pictures: it starts a capture, and when the capture is finished (capturing is asynchronous, so you need semaphores) it sleeps for 200 ms, which gives at most about 5 frames per second. I'm doing all of this because I have slow upload bandwidth, about 16 kilobytes per second, so I have to compress each frame as much as I can. The result is a low-detail picture, but if you use a 100x100 rectangle for viewing it, it isn't too bad. I haven't tried it over the internet yet, but, as I said, it should work. I've also tried using compression methods to shrink the picture a little more, but I don't know how to use that class (something is not working well), so I left that for another time. For now I just want to make it work, and later I'll try to improve performance.
Oh, one more thing: I also have to solve the problem of audio transmission, and that needs a lot of work.
So, more later.
I have a C# game program that I'm developing. It uses sound samples and Winsock.
When I test run the game, most of the audio works fine, but from time to time, if multiple samples are played sequentially, the application form shakes a little bit and then goes back to its old position.
How do I go about debugging this, or present it to you folks in a manageable manner? I'm sure no one is going to want the whole app code, for fear of virus attacks.
Please guide me.
EDIT: I have not been able to pin down any code section that produces this result. It just does and I cannot explain it.
EDIT: No, the x/y position is not changing. The window shakes around a few pixels and then goes back to the position where it was before the shake.
if (audio)
{
    Stream stream;
    SoundPlayer player;

    stream = Properties.Resources.ResourceManager.GetStream("_home");
    player = new System.Media.SoundPlayer(stream);
    player.PlaySync();
    player.Dispose();

    string ShipID = fireResult.DestroyedShipType.ToString();
    stream = Properties.Resources.ResourceManager.GetStream("_" + ShipID);
    player = new System.Media.SoundPlayer(stream);
    player.PlaySync();
    player.Dispose();

    stream = Properties.Resources.ResourceManager.GetStream("_destroyed");
    player = new System.Media.SoundPlayer(stream);
    player.PlaySync();
    player.Dispose();
}
Can you see anything in the above code that would produce this shake?
EDIT: Yes, the code is being executed within a: this.Invoke(new Action(delegate(){ ....})); Could this be it? How do I resolve this?
EDIT:
stream = Properties.Resources.ResourceManager.GetStream("_destroyed");
player = new System.Media.SoundPlayer(stream);
player.PlaySync();
player.Dispose();
stream.Dispose();
If I take out the above code, then it works fine! Any ideas?
EDIT: I changed the file name in the line:
stream = Properties.Resources.ResourceManager.GetStream("_destroyed");
to a different file, but the problem is still there, so at least it is not a corrupt audio file.
EDIT: You know MSN when someone sends a nudge? It is a bit like that, but only happens 2 or 3 times.
EDIT: Are you using any 3rd party libraries? - No, I am not using any 3rd party libs.
EDIT: It seems that no matter what file, the 3rd sample always causes this.
EDIT: It happens everywhere I use sound samples. If I play 3 samples, the situation happens.
EDIT: #nobugz: Yes, I think you are right. The problem is holding up the UI thread for too long. I tried using a single merged audio file and the problem is still there, given that it has the same total duration.
EDIT: I solved this issue by putting Application.DoEvents(); after each sample play command. No shakes :)
EDIT: The above solution did not really work. As the number of played samples grew, the application GUI got stuck again. A solution using QueueUserWorkItem has been employed instead. It still remains to be proven as a satisfactory solution, as cross-threading occurs, i.e. a new thread of samples can be started while an old one is still playing.
I will update this as more knowledge comes to light.
Calling PlaySync on the UI thread isn't so great. It will make your main window unresponsive: while your UI thread is busy waiting for the sound to finish, it doesn't get around to pumping messages like it should. If that takes long enough, Windows steps in and overlaps the window with a "ghost", which usually says "Not Responding" in the title bar (if it has one). This ghost window might not quite match your own window, and that could explain the "shaking".
Using Play() instead will solve that problem, but it gets you a new one: sequencing sounds becomes difficult. Making the calls from a thread can solve both. Check out NAudio for better control over sound.
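One way to make the calls from a thread while still keeping the sounds in order is a single background worker that drains a queue. The sketch below is only an illustration: SequentialWorker is a hypothetical class, and each queued Action stands in for whatever plays one sample, for example a PlaySync call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: runs queued jobs one at a time on a background thread,
// so blocking calls never run on the UI thread and never overlap.
public sealed class SequentialWorker : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SequentialWorker()
    {
        _thread = new Thread(() =>
        {
            // Blocks while the queue is empty; ends after CompleteAdding().
            foreach (Action job in _queue.GetConsumingEnumerable())
            {
                try { job(); }
                catch (Exception) { /* one failed sound should not stop the worker */ }
            }
        });
        _thread.IsBackground = true;
        _thread.Start();
    }

    // Safe to call from any thread, including the UI thread.
    public void Enqueue(Action job) { _queue.Add(job); }

    // Waits for the jobs already queued to finish, then stops the thread.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
        _queue.Dispose();
    }
}
```

Because there is exactly one consumer thread, a new batch of samples queued while an older batch is still playing simply waits its turn, which avoids the overlap that separate thread pool work items can produce. Dispose blocks until the queue is drained, so it should not be called from the UI thread while long sounds are pending.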
Make a copy of your program. Delete as many game elements from the copy as possible. Remove modules, chop out game logic, shift functions between classes to reduce abstraction (so that you can delete classes), and generally hack up the game.
Each time you do so, check if the bug still exists. Initially you'll be deleting bigger chunks of the program but over time the amount of deletion will reduce.
If you find something which, when deleted, fixes the bug, there are two possibilities: Either you found the bug, or there is some sort of synergy with the rest of the program to cause the bug. In the latter case, continue deleting more of the program.
Eventually, you will end up with a minimal program that has the bug. Post that here (or in a pastebin if it's too big).
This is one of the last-resort strategies I use when I encounter a bug that I am unable to locate at all.