Seeking in a dynamic GStreamer pipeline - c#

I have a dynamic pipeline that switches between sources (File/HTTP/RMTP/SRT) on set intervals, and outputs to an RTMP sink. This is working fine, until I attempt to perform a seek.
The seek itself works, but the next time I try to switch sources, the pipeline hangs.
I've been able to stop it from hanging by recreating the muxer after performing a seek, but I still end up with audio/video synchronization issues - the audio plays, but the video falls behind.
For what it's worth, I'm testing the outgoing RTMP stream in VLC, and if I stop/play the stream is back in sync and continues to play without issue.
Here is my pipeline graph:
The application is written in C#.
I'm performing the seek like this:
var bus = pipeline.Bus;
do
{
var msg = bus.TimedPopFiltered(Constants.SECOND, MessageType.Error | MessageType.Eos | MessageType.StateChanged);
// Parse message
if (msg != null)
{
...
}
else
{
if (pipeline.CurrentState == State.Playing && !seekDone)
{
seekDone = true;
pipeline.SetState(State.Paused);
var seekTo = 50 * Gst.Constant.SECOND;
var seeked = mux.SeekSimple(Format.Time, SeekFlags.Flush | SeekFlags.KeyUnit, seekTo);
if (!seeked)
{
Console.WriteLine("Seek failed");
}
else
{
Console.WriteLine("Performing seek");
}
pipeline.SetState(State.Playing);
}
}
}
while(!terminate)
The source switching is handled by a Timer on a separate thread while the pipeline is in the Paused state.
Any help would be greatly appreciated.
EDIT: I placed pad probes on the video and audio queues and can see that the video timestamps are falling behind the audio by a few seconds.
Considering that this only happens after a seek, I am wondering if there is something in the video processing that is taking too long and for the player to handle.
The VLC logs show that the player is dropping a lot of late video frames.

Related

Screen recording as video, audio on SharpAvi - Audio not recording

Requirement:
I am trying to capture Audio/Video of windows screen with SharpAPI Example with Loopback audio stream of NAudio Example.
I am using C#, wpf to achieve the same.
Couple of nuget packages.
SharpAvi - forVideo capturing
NAudio - for Audio capturing
What has been achieved:
I have successfully integrated that with the sample provided and I'm trying to capture the audio through NAudio with SharpAPI video stream for the video to record along with audio implementation.
Issue:
Whatever I write the audio stream in SharpAvi video. On output, It was recorded only with video and audio is empty.
Checking audio alone to make sure:
But When I try capture the audio as separate file called "output.wav" and It was recorded with audio as expected and can able to hear the recorded audio. So, I'm concluding for now that the issue is only on integration with video via SharpApi
writterx = new WaveFileWriter("Out.wav", audioSource.WaveFormat);
Full code to reproduce the issue:
https://drive.google.com/open?id=1H7Ziy_yrs37hdpYriWRF-nuRmmFbsfe-
Code glimpse from Recorder.cs
NAudio Initialization:
audioSource = new WasapiLoopbackCapture();
audioStream = CreateAudioStream(audioSource.WaveFormat, encodeAudio, audioBitRate);
audioSource.DataAvailable += audioSource_DataAvailable;
Capturing audio bytes and write it on SharpAvi Audio Stream:
private void audioSource_DataAvailable(object sender, WaveInEventArgs e)
{
var signalled = WaitHandle.WaitAny(new WaitHandle[] { videoFrameWritten, stopThread });
if (signalled == 0)
{
audioStream.WriteBlock(e.Buffer, 0, e.BytesRecorded);
audioBlockWritten.Set();
Debug.WriteLine("Bytes: " + e.BytesRecorded);
}
}
Can you please help me out on this. Any other way to reach my requirement also welcome.
Let me know if any further details needed.
Obviously, author doesn't need it, but since I run to the same problem others might need it.
Problem in my case was that I was getting audio every 0.1 seconds and attempted to write both new video and audio at the same time. And getting new video data (taking screenshot) took me too long. Causing each frame was added every 0.3 seconds instead of 0.1. And that caused some problems with audio stream being not sync with video and not being played properly by video players (or whatever it was). And after optimizing code a little bit to be within 0.1 second, the problem is gone

axWindowsMediaPlayer crashes when run for longer periods

I am not an expert on c# but just trying to churn out a quick solution.
Gist of the solution:
I have a sensor that sends a signal via serial port, and based on the signal the video played must be changed. It switches between two videos
It works as it stands now using axWindowsMediaPlayer. But unfortunately the app crashes after, lets say, 2 hours or so.
Here is the code that i use to change the video url when signal arrives
if (axWindowsMediaPlayer1.playState == WMPPlayState.wmppsPlaying) axWindowsMediaPlayer1.Ctlcontrols.stop();
axWindowsMediaPlayer1.uiMode = "none";
axWindowsMediaPlayer1.URL = mainFile;
axWindowsMediaPlayer1.Ctlcontrols.play();
Initializing the player with this
axWindowsMediaPlayer1.uiMode = "none";
axWindowsMediaPlayer1.Dock = System.Windows.Forms.DockStyle.Fill;
axWindowsMediaPlayer1.settings.setMode("loop", true);
axWindowsMediaPlayer1.settings.volume = 0;
I assume this his some adverse effect on the memory or something; just not expert enough to figure it out. ANy suggestions, is this the right way to change video url?
Thanks

CPU-greedy loop when streaming music

To give some context, I'm working on an opensource alternative desktop Spotify client, with accessibility at it's core. You'll also see some NAudio in here.
I'm noticing pretty intense CPU usage as soon as playback starts. Even when paused, the CPU is high.
I ran Visual Studio's inbuilt profiler to try and shed some light on any resource hogs that might be occuring. As I suspected, the problem wasin my playback manager's streaming loop.
The code that the profiler flags as one of the most sample-rich is as follows:
const int secondsToBuffer = 3;
private void GetStreaming(object state)
{
this.fullyDownloaded = false;
// secondsToBuffer is an integer to represent how many seconds we should buffer up at once to prevent choppy playback on slow connections
try
{
do
{
if (bufferedWaveProvider == null)
{
this.bufferedWaveProvider = new BufferedWaveProvider(new WaveFormat(44100, 2));
this.bufferedWaveProvider.BufferDuration = TimeSpan.FromSeconds(20); // allow us to get well ahead of ourselves
Logger.WriteDebug("Creating buffered wave provider");
this.gatekeeper.MinimumSampleSize = bufferedWaveProvider.WaveFormat.AverageBytesPerSecond * secondsToBuffer;
}
// this bit in particular seems to be the hot point
if (bufferedWaveProvider != null && bufferedWaveProvider.BufferLength - bufferedWaveProvider.BufferedBytes < bufferedWaveProvider.WaveFormat.AverageBytesPerSecond / 4)
{
Logger.WriteDebug("Buffer getting full, taking a break");
Thread.Sleep(500);
}
// do we have at least double the buffered sample's size in free space, just in case
else if (bufferedWaveProvider.BufferLength - bufferedWaveProvider.BufferedBytes > bufferedWaveProvider.WaveFormat.AverageBytesPerSecond * (secondsToBuffer * 2))
{
var sample = gatekeeper.Read();
if (sample != null)
{
bufferedWaveProvider.AddSamples(sample, 0, sample.Length);
}
}
} while (playbackState != StreamingPlaybackState.Stopped);
Logger.WriteDebug("Playback stopped");
}
finally
{
// no post-processing work here, right?
}
}
An NAudio sample was the inspiration for my way of handling streaming in this method. To find the full file's source code, you can view it here: http://blindspot.codeplex.com/SourceControl/latest#Blindspot.Playback/PlaybackManager.cs
I'm a newbie to profiling and I'm not a year on year expert on streaming either (both might be obvious).
Is there any way I can make this loop less resource intensive. Would increasing the sleep amount in the if block where the buffer is full help? Or am I barking up the wrong tree here. It seems like it would, but I'd have thought half a second would be sufficient.
Any help gratefully received.
Basically, you've created an infinite loop until the buffer gets full. The section you've marked with
// this bit in particular seems to be the hot point
probably appears to be as the calculations in the if statement are just being repeated over and over again; can any of them be moved outside of the loop?
I'd put a Thread.Sleep(50) before the while statement to prevent thrashing and see if that makes a difference (I suspect it will).

Kinect speech recognition and skeleton tracking don't work together

I'm writing an application that can take several different external inputs (keyboard presses, motion gestures, speech) and produce similar outputs (for instance, pressing "T" on the keyboard will do the same thing as saying the word "Travel" out loud). Because of that, I don't want any of the input managers to know about each other. Specifically, I don't want the Kinect manager (as much as possible) to know about the Speech manager and vice versa, even though I'm using the Kinect's built-in microphone (the Speech manager should work with ANY microphone). I'm using System.Speech in the Speech manager as opposed to Microsoft.Speech.
I'm having a problem where as soon as the Kinect motion recognition module is enabled, the speech module stops receiving input. I've tried a whole bunch of things like inverting the skeleton stream and audio stream, capturing the audio stream in different ways, etc. I finally narrowed down the problem: something about how I'm initializing my modules does not play nicely with how my application deals with events.
The application works great until motion capture starts. If I completely exclude the Kinect module, this is how my main method looks:
// Main.cs
public static void Main()
{
// Create input managers
KeyboardMouseManager keymanager = new KeyboardMouseManager();
SpeechManager speechmanager = new SpeechManager();
// Start listening for keyboard input
keymanager.start();
// Start listening for speech input
speechmanager.start()
try
{
Application.Run();
}
catch (Exception ex)
{
MessageBox.Show(ex.StackTrace);
}
}
I'm using Application.Run() because my GUI is handled by an outside program. This C# application's only job is to receive input events and run external scripts based on that input.
Both the keyboard and speech modules receive events sporadically. The Kinect, on the other hand, generates events constantly. If my gestures happened just as infrequently, a polling loop might be the answer with a wait time between each poll. However, I'm using the Kinect to control mouse movement... I can't afford to wait between skeleton event captures, because then the mouse would be very laggy; my skeleton capture loop needs to be as constant as possible. This presented a big problem, because now I can't have my Kinect manager on the same thread (or message pump? I'm a little hazy on the difference, hence why I think the problem lies here): from the way I understand it, being on the same thread would not allow keyboard or speech events to consistently get through. Instead, I kind of hacked together a solution where I made my Kinect manager inherit from System.Windows.Forms, so that it would work with Application.Run().
Now, my main method looks like this:
// Main.cs
public static void Main()
{
// Create input managers
KeyboardMouseManager keymanager = new KeyboardMouseManager();
KinectManager kinectManager = new KinectManager();
SpeechManager speechmanager = new SpeechManager();
// Start listening for keyboard input
keymanager.start();
// Attempt to launch the kinect sensor
bool kinectLoaded = kinectManager.start();
// Use the default microphone (if applicable) if kinect isn't hooked up
// Use the kinect microphone array if the kinect is working
if (kinectLoaded)
{
speechmanager.start(kinectManager);
}
else
{
speechmanager.start();
}
try
{
// THIS IS THE PLACE I THINK I'M DOING SOMETHING WRONG
Application.Run(kinectManager);
}
catch (Exception ex)
{
MessageBox.Show(ex.StackTrace);
}
For some reason, the Kinect microphone loses its "default-ness" as soon as the Kinect sensor is started (if this observation is incorrect, or there is a workaround, PLEASE let me know). Because of that, I was required to make a special start() method in the Speech manager, which looks like this:
// SpeechManager.cs
/** For use with the Kinect Microphone **/
public void start(KinectManager kinect)
{
// Get the speech recognizer information
RecognizerInfo recogInfo = SpeechRecognitionEngine.InstalledRecognizers().FirstOrDefault();
if (null == recogInfo)
{
Console.WriteLine("Error: No recognizer information found on Kinect");
return;
}
SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(recogInfo.Id);
// Loads all of the grammars into the recognizer engine
loadSpeechBindings(recognizer);
// Set speech event handler
recognizer.SpeechRecognized += speechRecognized;
using (var s = kinect.getAudioSource().Start() )
{
// Set the input to the Kinect audio stream
recognizer.SetInputToAudioStream(s, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
// Recognize asycronous speech events
recognizer.RecognizeAsync(RecognizeMode.Multiple);
}
}
For reference, the start() method in the Kinect manager looks like this:
// KinectManager.cs
public bool start()
{
// Code from Microsoft Sample
kinect = (from sensorToCheck in KinectSensor.KinectSensors where sensorToCheck.Status == KinectStatus.Connected select sensorToCheck).FirstOrDefault();
// Fail elegantly if no kinect is detected
if (kinect == null)
{
connected = false;
Console.WriteLine("Couldn't find a Kinect");
return false;
}
// Start listening
kinect.Start();
// Enable listening for all skeletons
kinect.SkeletonStream.Enable();
// Obtain the KinectAudioSource to do audio capture
source = kinect.AudioSource;
source.EchoCancellationMode = EchoCancellationMode.None; // No AEC for this sample
source.AutomaticGainControlEnabled = false; // Important to turn this off for speech recognition
kinect.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(allFramesReady);
connected = true;
return true;
}
So when I disable motion capture (by having my main() look similar to the first code segment), speech recognition works fine. When I enable motion capture, motion works great but no speech gets recognized. In both cases, keyboard events always work. There are no errors, and through tracing I found out that all the data in the speech manager is initialized correctly... it seems like the speech recognition events just disappear. How can I reorganize this code so that the input modules can work independently? Do I use threading, or just Application.Run() in a different way?
The Microsoft Kinect SDK have several known issues, one of them being that audio is not processed if you begin tracking the skeleton after starting the audio processor. From the known issues:
Audio is not processed if skeleton stream is enabled after starting audio capture
Due to a bug, enabling or disabling the SkeletonStream will stop the AudioSource
stream returned by the Kinect sensor. The following sequence of instructions will
stop the audio stream:
kinectSensor.Start();
kinectSensor.AudioSource.Start(); // --> this will create an audio stream
kinectSensor.SkeletonStream.Enable(); // --> this will stop the audio stream as an undesired side effect
The workaround is to invert the order of the calls or to restart the AudioSource after changing SkeletonStream status.
Workaround #1 (start audio after skeleton):
kinectSensor.Start();
kinectSensor.SkeletonStream.Enable();
kinectSensor.AudioSource.Start();
Workaround #2 (restart audio after skeleton):
kinectSensor.Start();
kinectSensor.AudioSource.Start(); // --> this will create an audio stream
kinectSensor.SkeletonStream.Enable(); // --> this will stop the audio stream as an undesired side effect
kinectSensor.AudioSource.Start(); // --> this will create another audio stream
Resetting the SkeletonStream engine status is an expensive call. It should be made at application startup only, unless the app has specific needs that require turning Skeleton on and off.
I also hope that when you say you're using "version 1" of the SDK, you mean "version 1.6". If you are using anything but 1.5 or 1.6, you are only hurting yourself due to the many changes that were made in 1.5.

Audio feedback using RIL Audio on SmartPhone

We're working on a SIP softphone and we get audio feedback when we call from one phone to the other. However, when we call from a normal SIP Phone (software or hardware) to our app, then it all works fine - it's only when calling from one phone using the app to another one. Here is the code we use to initialize RIL Audio:
public static void InitRILAudio()
{
IntPtr res;
RILRESULTCALLBACK result = new RILRESULTCALLBACK(f_result);
RILNOTIFYCALLBACK notify = new RILNOTIFYCALLBACK(f_notify);
res = RIL_Initialize(1, result, notify, (0x00010000 | 0x00020000 | 0x00080000), 0, out hRil);
if (res != IntPtr.Zero)
return;
RILAUDIODEVICEINFO audioDeviceInfo = new RILAUDIODEVICEINFO();
audioDeviceInfo.cbSize = 16;
audioDeviceInfo.dwParams = 0x00000003; //RIL_PARAM_ADI_ALL;
audioDeviceInfo.dwRxDevice = 0x00000001; //RIL_AUDIO_HANDSET;
audioDeviceInfo.dwTxDevice = 0x00000001; //RIL_AUDIO_HANDSET;
res = RIL_SetAudioDevices(hRil, audioDeviceInfo);
}
We are using SipEk (http://voipengine.googlepages.com/sipeksdk) for the SIP stack. Basically we just use a callback delegate from the SDK for the audio stuff. Has anyone else experienced problems with Audio feedback loops like this? Either with RIL Audio or SipEk? Any suggestions?
Thanks in advance!
Feedback means that you're not using echo cancellation (line and/or acoustic, depending on whether it's working as a speakerphone or not), or if you are, the delay in your system (jitter buffers, network, encode/decode, etc) is greater than the echo canceller can handle. Excessive gain/clipping in the wrong places can also defeat any echo canceller (they don't like non-linear effects).
Sounds like you're just dumping the audio off to some other layers. SipEk is just a wrapper for pjsip, but I assume you're doing audio via the Microsoft RIL/etc stuff, not via pjmedia. You need to have a good understanding of your audio paths - where stuff gets sampled, if/how it's acoustic/line echo-cancelled, what the echo tail is, how it gets encoded and packetized, how it's received, jitter-buffered, loss-concealed, and decoded and played back.

Categories

Resources