C# - Get byte position of MediaElement

Basically, I want to get the current byte position of the MediaElement at its current playback position. For example, when playback is at 5 seconds, the byte position might be 1024 kB. I don't want to multiply the bitrate by the current time, as that is not accurate.
All I need is the byte position at certain points in the playback.
So is there any way I could get this? I'm open to other options. (Does FFProbe support this?)

I've tried everything, and there is no way to do this directly with MediaElement.
The only workaround is to compute the frame number at the position you want by multiplying the framerate by that position's timecode.
Then use a program like BmffViewer, which analyzes the moov atom of the video header: go to the stco (chunk offset) entries of the track you want to analyze and look up the chunk offset corresponding to the frame you calculated earlier.
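For illustration, here is a minimal sketch of reading those stco (chunk offset) entries programmatically instead of with BmffViewer. It assumes a plain MP4/MOV with 32-bit box sizes and 32-bit stco offsets (no co64 boxes), and it skips the stsc frame-to-chunk mapping, so treat it as a starting point; the class name StcoDump is made up.

using System;
using System.IO;

// Walks the MP4 box tree, descending into container boxes, and prints
// the chunk offsets from every "stco" box it finds (one per track).
class StcoDump
{
    static readonly string[] Containers = { "moov", "trak", "mdia", "minf", "stbl" };

    static uint ReadUInt32BE(BinaryReader r)
    {
        byte[] b = r.ReadBytes(4);
        return (uint)(b[0] << 24 | b[1] << 16 | b[2] << 8 | b[3]);
    }

    static void Walk(BinaryReader r, long end)
    {
        while (r.BaseStream.Position + 8 <= end)
        {
            long boxStart = r.BaseStream.Position;
            uint size = ReadUInt32BE(r);              // box size, header included
            string type = new string(r.ReadChars(4)); // four-character box type
            if (size < 8) return;                     // 0/1 mean special sizes, not handled here

            if (type == "stco")
            {
                r.ReadBytes(4);                       // version + flags, unused here
                uint count = ReadUInt32BE(r);
                for (uint i = 0; i < count; i++)
                    Console.WriteLine($"chunk {i + 1}: byte offset {ReadUInt32BE(r)}");
            }
            else if (Array.IndexOf(Containers, type) >= 0)
            {
                Walk(r, boxStart + size);             // recurse into container boxes
            }
            r.BaseStream.Seek(boxStart + size, SeekOrigin.Begin);
        }
    }

    static void Main(string[] args)
    {
        using (var r = new BinaryReader(File.OpenRead(args[0])))
            Walk(r, r.BaseStream.Length);
    }
}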

Related

C# extract frames from part of a video file

Using the AForge FFmpeg wrapper, you can extract frames from a video with the VideoFileReader class and save them as bitmaps.
See this for an example:
Extracting frames of a .avi file
My problem with that is that you cannot specify where to start reading frames; it always starts from the beginning of the video file.
But what if I wanted to extract frames from the middle of a two-hour video file? Using that class, you'd have to decode the whole first hour just to get to those frames.
Does anyone know a way to achieve that?
If you know where in the video you want to start reading, just skip the appropriate number of frames; there's no need to process any of them.
This assumes, of course, that you know the exact frame number you want to start reading from, which you can calculate by multiplying the framerate by the time at which you want to perform the extraction. In your example, if the video is two hours long and you want to extract frames from the middle...
VideoFileReader reader = new VideoFileReader();
reader.Open("file.avi");

// Jump to 1 hour into the video (1 hour = 3600 seconds)
int framesToSkip = reader.FrameRate * 3600;
for (int i = 0; i < framesToSkip; i++)
{
    reader.ReadVideoFrame().Dispose(); // decode and immediately discard the Bitmap
}

// The next call to ReadVideoFrame() returns the frame at the 1-hour mark
This assumes that the .FrameRate property returns the value in frames per second. Unfortunately the documentation doesn't say, so I'm not sure how it handles video files with non-integral framerates (e.g. 29.97 is a common framerate).

How to timestamp two streams and synchronize them at receiver?

I have two streams of data, one with audio and the other with video.
The sound I record with DirectSound is put into a 100 ms buffer, and a DirectShow ISampleGrabber grabs frames for me at 30 frames per second (one frame every 33.33 ms).
What does timestamping mean? Should I attach a DateTime field to the video/audio and check at the receiver which audio packets have the same timestamp as a given video frame?
I know this is a really hard subject, but can you please give me an idea?
It means each video/audio element carries a time offset that says when it must be played relative to when the video/audio started. The receiving end orders received elements by their timestamps and plays them in order; it also waits when video or audio elements are missing.
You should not add a DateTime attribute to every element. Instead, the video/audio header should indicate at what framerate or frequency the media must be played, and therefore how many elements the receiver will get every second. So a simple autonumber would do. It's the player's responsibility to order the received elements and to check whether the point up to which it has received all elements is far enough in the future that it can keep playing.
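To make the autonumber idea concrete, here is a minimal sketch of a receiver-side buffer keyed by sequence number; the names MediaPacket and JitterBuffer are made up, not from any library. The player would call TryTake once per element interval (every 33.33 ms for the 30 fps video, every 100 ms for the audio buffers), so with these rates audio packet n lines up with video frames 3n to 3n+2.

using System;
using System.Collections.Generic;

// One element of a stream, tagged with the sender's autonumber.
class MediaPacket
{
    public uint SequenceNumber;   // assigned by the sender, starts at 0
    public byte[] Payload;        // one audio buffer or one video frame
}

// Orders incoming packets and hands them out in sequence.
class JitterBuffer
{
    private readonly SortedDictionary<uint, MediaPacket> _pending =
        new SortedDictionary<uint, MediaPacket>();
    private uint _nextToPlay;

    public void Add(MediaPacket p) => _pending[p.SequenceNumber] = p;

    // Returns the next packet in order, or null if it hasn't arrived yet,
    // in which case the player waits (or skips after a timeout).
    public MediaPacket TryTake()
    {
        if (_pending.TryGetValue(_nextToPlay, out var p))
        {
            _pending.Remove(_nextToPlay);
            _nextToPlay++;
            return p;
        }
        return null;
    }
}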

How to get sound amplitude of my wav file with respect to time?

I have a wav file, and all I need is to run a function when a notably loud sound plays.
For example: if a sound of intensity level 10 (say) is playing, then whenever the intensity rises above 10, an event should be triggered to tell me there is a remarkable sound.
I tried to Google this and found that if we read the bytes of the wav file past the header (after the 44th byte) we get the data chunk, i.e. the actual sound data. But when I analyze this data I get confused, because similar values appear even where there is no sound.
I hope my question is clear.
Please share your suggestions, ideas, and references.
You don't need an FFT for this; you can just compute the short-term RMS power, and when it exceeds a predetermined threshold you have a "loud" sound:
power_RMS = sqrt(sum(x^2) / N)
where x is the sample value and N is the number of samples over which you want to compute the RMS power. I would suggest using a period of, say, 10 ms, which gives N = 441 samples at a 44.1 kHz sample rate.
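A rough sketch of that in C#, assuming 16-bit mono PCM at 44.1 kHz that you have already read past the 44-byte WAV header into a short[] array; the event name and the threshold value are placeholders you would tune for your recordings.

using System;

class LoudSoundDetector
{
    public event Action<double> LoudSound;  // reports the time (in seconds) of a loud window

    private const int SampleRate = 44100;
    private const int WindowSize = 441;     // 10 ms at 44.1 kHz
    private const double Threshold = 0.1;   // on a 0..1 scale; tune empirically

    public void Scan(short[] samples)
    {
        // Slide over the data in 10 ms windows and compute RMS power per window.
        for (int start = 0; start + WindowSize <= samples.Length; start += WindowSize)
        {
            double sumSquares = 0;
            for (int i = 0; i < WindowSize; i++)
            {
                double x = samples[start + i] / 32768.0;  // normalize to -1..1
                sumSquares += x * x;
            }
            double rms = Math.Sqrt(sumSquares / WindowSize);
            if (rms > Threshold)
                LoudSound?.Invoke((double)start / SampleRate);
        }
    }
}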

Getting frame dimension from raw mpeg4 stream?

Does anyone know how I can retrieve the frame dimensions of an MPEG-4 video (non-H.264, i.e. MPEG-4 Part 2) from the raw video bitstream?
I'm currently writing a custom media source for Windows Media Foundation; I have to provide a media type, which needs the frame size, and it doesn't work without it.
Any ideas?
Thanks
I'm not sure I follow you. Are you trying to find the width and height of the video being streamed? If so (and I guess that is the "dimension" you are looking for), here's how:
Parse the stream for the integer 000001B0 (hex); it's always the first thing you get streamed. If not, check the SDP of the stream (if you have one) and look for the config= field; there it is, only now as a Base16 string.
Read all the bytes until you reach the integer 000001B6 (hex).
You should get something like this (hex): 000001B0F5000001B5891300000100000001200086C40FA28 A021E0A2
This is the "stream configuration header", exact name Video Object Sequence. It holds all the info a decoder needs to decode the video stream.
Read the last 4 bytes (in my example they are set off by a space -- A021E0A2).
Now treat these bytes as one 32-bit unsigned integer...
To get the width, read the first 8 bits, then multiply the result by 4.
Skip the next 7 bits.
To get the height, read the next 9 bits.
In pseudo code:
WIDTH = readBitsUnsigned(array, 8) * 4;
readBitsUnsigned(array, 7);
HEIGHT = readBitsUnsigned(array, 9);
There you go... width and height. (:
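Here is that recipe as a small C# sketch, hard-coding the example's last four bytes (A0 21 E0 A2). It implements only the simplified 8/7/9-bit scheme from this answer; a full MPEG-4 Part 2 VOL parser has more fields and marker bits.

using System;

class VolDimensions
{
    static int _bitPos; // current read position, in bits from the start

    // Reads 'count' bits MSB-first and returns them as an unsigned value.
    static int ReadBits(byte[] data, int count)
    {
        int value = 0;
        for (int i = 0; i < count; i++, _bitPos++)
        {
            int bit = (data[_bitPos / 8] >> (7 - _bitPos % 8)) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    static void Main()
    {
        byte[] last4 = { 0xA0, 0x21, 0xE0, 0xA2 };
        int width = ReadBits(last4, 8) * 4;     // 0xA0 = 160, times 4 = 640
        ReadBits(last4, 7);                     // skip 7 bits
        int height = ReadBits(last4, 9);        // = 480
        Console.WriteLine($"{width}x{height}"); // prints 640x480
    }
}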

Timing in C# real time audio analysis

I'm trying to determine the "beats per minute" from real-time audio in C#. It's not music that I'm detecting, though, just a constant tapping sound. My problem is determining the time between those taps so I can compute "taps per minute". I have tried using the WaveIn.cs class out there, but I don't really understand how it samples. I'm not getting a set number of samples per second to analyze; I guess I just don't know how to read in an exact number of samples per second to know the time between two samples.
Any help to get me in the right direction would be greatly appreciated.
I'm not sure which WaveIn.cs class you're using, but usually with code that records audio, you either A) tell the code to start recording, and then at some later point you tell the code to stop, and you get back an array (usually of type short[]) that comprises the data recorded during this time period; or B) tell the code to start recording with a given buffer size, and as each buffer is filled, the code makes a callback to a method you've defined with a reference to the filled buffer, and this process continues until you tell it to stop recording.
Let's assume that your recording format is 16 bits (aka 2 bytes) per sample, 44100 samples per second, and mono (1 channel). In the case of (A), let's say you start recording and then stop recording exactly 10 seconds later. You will end up with a short[] array that is 441,000 (44,100 x 10) elements in length. I don't know what algorithm you're using to detect "taps", but let's say that you detect taps in this array at element 0, element 22,050, element 44,100, element 66,150 etc. This means you're finding taps every .5 seconds (because 22,050 is half of 44,100 samples per second), which means you have 2 taps per second and thus 120 BPM.
In the case of (B) let's say you start recording with a fixed buffer size of 44,100 samples (aka 1 second). As each buffer comes in, you find taps at element 0 and at element 22,050. By the same logic as above, you'll calculate 120 BPM.
Hope this helps. With beat detection in general, it's best to record for a relatively long time and count the beats through a large array of data. Trying to estimate the "instantaneous" tempo is more difficult and prone to error, just as estimating the pitch of a recording is harder to do in real time than from a recording of a full note.
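As a concrete sketch of case (A): the tap detection below is just a naive threshold crossing (the threshold and hold-off values are placeholders), but it shows the sample-index arithmetic. For the example above, taps at samples 0, 22,050, 44,100 and 66,150 give an average spacing of 22,050 samples = 0.5 s, hence 120 BPM.

using System;
using System.Collections.Generic;

class TapTempo
{
    const int SampleRate = 44100;
    const short Threshold = 8000;         // amplitude that counts as a tap; tune this
    const int Holdoff = SampleRate / 10;  // ignore re-triggers within 100 ms of a tap

    public static double BeatsPerMinute(short[] samples)
    {
        var taps = new List<int>();
        int lastTap = -Holdoff;
        for (int i = 0; i < samples.Length; i++)
        {
            if (Math.Abs((int)samples[i]) > Threshold && i - lastTap >= Holdoff)
            {
                taps.Add(i);
                lastTap = i;
            }
        }
        if (taps.Count < 2) return 0;

        // Average spacing between taps, in samples, then convert to taps/minute.
        double avgSamples = (double)(taps[taps.Count - 1] - taps[0]) / (taps.Count - 1);
        double secondsPerTap = avgSamples / SampleRate;
        return 60.0 / secondsPerTap;
    }
}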
I think you might be confusing samples with "taps."
A sample is a number representing the height of the sound wave at a given moment in time. A typical wave file might be sampled 44,100 times a second, so if you have two channels for stereo, you have 88,200 sixteen-bit numbers (samples) per second.
If you take all of these numbers and graph them, you get a waveform plot like the one that used to be shown here (source: vbaccelerator.com). The tap is the sharp, isolated peak in that waveform.
Assuming we're talking about the same WaveIn.cs, the constructor of WaveLib.WaveInRecorder takes a WaveLib.WaveFormat object as a parameter. This allows you to set the audio format, i.e. sample rate, bit depth, etc. Just scan the audio samples for peaks (or however you're detecting "taps") and record the average distance in samples between peaks.
Since you know the sample rate of the audio stream (e.g. 44,100 samples/second), divide your average peak distance (in samples) by the sample rate to get the time in seconds between taps, divide that by 60 to get the time in minutes between taps, and take the reciprocal to get taps per minute.
Hope that helps
