I would like to grab the preview frames that are displayed inside my CaptureElement XAML element. The Source of my CaptureElement is set to a MediaCapture object, and I use the StartPreview() method to start displaying the camera. I would like to access the frames that are being shown without saving them to an image or video file. The goal is to capture 10 fps from the preview and send each frame to another class that accepts a byte[].
I tried using the CapturePhotoToStorageFileAsync method; however, this is not a feasible option as I do not want to take 10 actual images per second. I also do not want to use ScreenCapture, as it stores what is captured in a video file. Ideally, I do not want to store any media files on the phone, even temporarily. After looking at the MSDN documentation for MediaCapture, I noticed there's a method called GetPreviewFrameAsync(); however, this method does not exist in Windows Phone 8.1. I also stumbled on this example; however, I do not completely understand how it works.
Any suggestions on how to approach this are greatly appreciated.
There is a sample on the Microsoft github page that is relevant, although they target Windows 10. You may be interested in migrating your project to get this functionality.
GetPreviewFrame: This sample will capture preview frames as opposed to full-blown photos. Once it has a preview frame, it can edit the pixels on it.
Here is the relevant part:
private async Task GetPreviewFrameAsSoftwareBitmapAsync()
{
    // Get information about the preview
    var previewProperties = _mediaCapture.VideoDeviceController.GetMediaStreamProperties(MediaStreamType.VideoPreview) as VideoEncodingProperties;

    // Create the video frame to request a SoftwareBitmap preview frame
    var videoFrame = new VideoFrame(BitmapPixelFormat.Bgra8, (int)previewProperties.Width, (int)previewProperties.Height);

    // Capture the preview frame
    using (var currentFrame = await _mediaCapture.GetPreviewFrameAsync(videoFrame))
    {
        // Collect the resulting frame
        SoftwareBitmap previewFrame = currentFrame.SoftwareBitmap;

        // Add a simple green filter effect to the SoftwareBitmap
        EditPixels(previewFrame);
    }
}
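Since the goal is a byte[] to hand to another class, you can copy the SoftwareBitmap's pixels out instead of (or after) editing them. A minimal sketch, assuming the Bgra8 frame requested above (Buffer is Windows.Storage.Streams.Buffer; ToArray() is the extension method from System.Runtime.InteropServices.WindowsRuntime):
private byte[] GetFrameBytes(SoftwareBitmap bitmap)
{
    // Bgra8: 4 bytes per pixel
    var buffer = new Windows.Storage.Streams.Buffer((uint)(4 * bitmap.PixelWidth * bitmap.PixelHeight));
    bitmap.CopyToBuffer(buffer);
    return buffer.ToArray();
}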
private unsafe void EditPixels(SoftwareBitmap bitmap)
{
    // Effect is hard-coded to operate on BGRA8 format only
    if (bitmap.BitmapPixelFormat == BitmapPixelFormat.Bgra8)
    {
        // In BGRA8 format, each pixel is defined by 4 bytes
        const int BYTES_PER_PIXEL = 4;

        using (var buffer = bitmap.LockBuffer(BitmapBufferAccessMode.ReadWrite))
        using (var reference = buffer.CreateReference())
        {
            // Get a pointer to the pixel buffer
            byte* data;
            uint capacity;
            ((IMemoryBufferByteAccess)reference).GetBuffer(out data, out capacity);

            // Get information about the BitmapBuffer
            var desc = buffer.GetPlaneDescription(0);

            // Iterate over all pixels
            for (uint row = 0; row < desc.Height; row++)
            {
                for (uint col = 0; col < desc.Width; col++)
                {
                    // Index of the current pixel in the buffer (defined by the next 4 bytes, BGRA8)
                    var currPixel = desc.StartIndex + desc.Stride * row + BYTES_PER_PIXEL * col;

                    // Read the current pixel information into b,g,r channels (leave out alpha channel)
                    var b = data[currPixel + 0]; // Blue
                    var g = data[currPixel + 1]; // Green
                    var r = data[currPixel + 2]; // Red

                    // Boost the green channel, leave the other two untouched
                    data[currPixel + 0] = b;
                    data[currPixel + 1] = (byte)Math.Min(g + 80, 255);
                    data[currPixel + 2] = r;
                }
            }
        }
    }
}
And declare this outside your class:
[ComImport]
[Guid("5b0d3235-4dba-4d44-865e-8f1d0e4fd04d")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
unsafe interface IMemoryBufferByteAccess
{
    void GetBuffer(out byte* buffer, out uint capacity);
}
And of course, your project will have to allow unsafe code for all of this to work.
Have a closer look at the sample to see how to get all the details. Or, for a walkthrough, you can watch the camera session from the recent //build/ conference, which includes a brief tour of some of the camera samples.
Related
I'm using SharpDx to capture the screen (1 to 60fps). Some frames are all transparent and end up getting processed and saved by the code.
Is there any simple/fast way to detect these frames drops without having to open the generated bitmap and look for the alpha values?
Here's what I'm using (saves capture as image):
try
{
    //Try to get duplicated frame within given time.
    _duplicatedOutput.AcquireNextFrame(MinimumDelay, out var duplicateFrameInformation, out var screenResource);

    //Copy resource into memory that can be accessed by the CPU.
    using (var screenTexture2D = screenResource.QueryInterface<Texture2D>())
        _device.ImmediateContext.CopySubresourceRegion(screenTexture2D, 0, new ResourceRegion(Left, Top, 0, Left + Width, Top + Height, 1), _screenTexture, 0);

    //Get the desktop capture texture.
    var mapSource = _device.ImmediateContext.MapSubresource(_screenTexture, 0, MapMode.Read, MapFlags.None); //, out var stream);

    #region Get image data

    var bitmap = new System.Drawing.Bitmap(Width, Height, PixelFormat.Format32bppArgb);
    var boundsRect = new System.Drawing.Rectangle(0, 0, Width, Height);

    //Copy pixels from screen capture Texture to GDI bitmap.
    var mapDest = bitmap.LockBits(boundsRect, ImageLockMode.WriteOnly, bitmap.PixelFormat);
    var sourcePtr = mapSource.DataPointer;
    var destPtr = mapDest.Scan0;

    for (var y = 0; y < Height; y++)
    {
        //Copy a single line
        Utilities.CopyMemory(destPtr, sourcePtr, Width * 4);

        //Advance pointers
        sourcePtr = IntPtr.Add(sourcePtr, mapSource.RowPitch);
        destPtr = IntPtr.Add(destPtr, mapDest.Stride);
    }

    //Release source and dest locks
    bitmap.UnlockBits(mapDest);

    //Bitmap is saved in here!!!

    #endregion

    _device.ImmediateContext.UnmapSubresource(_screenTexture, 0);

    screenResource.Dispose();
    _duplicatedOutput.ReleaseFrame();
}
catch (SharpDXException e)
{
    if (e.ResultCode.Code != SharpDX.DXGI.ResultCode.WaitTimeout.Result.Code)
        throw;
}
It's a modified version from this one.
I also have this version (saves capture as pixel array):
//Get the desktop capture texture.
var data = _device.ImmediateContext.MapSubresource(_screenTexture, 0, MapMode.Read, MapFlags.None, out var stream);
var bytes = new byte[stream.Length];

//BGRA32 is 4 bytes per pixel.
for (var height = 0; height < Height; height++)
{
    stream.Position = height * data.RowPitch;
    Marshal.Copy(new IntPtr(stream.DataPointer.ToInt64() + height * data.RowPitch), bytes, height * Width * 4, Width * 4);
}
I'm not sure if it's the best way of saving the screen capture as image and/or pixel array, but it's somewhat working.
Anyway, the problem is that some frames captured are fully transparent and they are useless to me. I need to somehow avoid saving them at all.
When capturing as a pixel array, I can simply check the bytes array to see whether the 4th item (the alpha byte) is 255 or 0. When saving as an image, I could use bitmap.GetPixel(0,0).A to know if the image has content or not.
But either way I need to finish the capture and read the full image content before I can tell whether the frame was dropped.
Is there any way to know if the frame was correctly captured?
Your problem boils down to trying to do this on a timer. There is no way to guarantee a minimum execution time for every single tick/frame, and if the user picks too high a rate, you get effects like this. At worst, ticks queue up in the event queue until you run into an exception because the queue is overfilled.
What you need to do is limit the rate to a maximum. If it does not work as fast as the user wants, that is the reality. I wrote some simple rate limiting code just for such a case:
int interval = 20;
DateTime dueTime = DateTime.Now.AddMilliseconds(interval);

while (true)
{
    if (DateTime.Now >= dueTime)
    {
        //insert code here

        //Update next dueTime
        dueTime = DateTime.Now.AddMilliseconds(interval);
    }
    else
    {
        //Just yield to not tax out the CPU
        Thread.Sleep(1);
    }
}
Just two notes:
this was designed to run in a separate thread. You have to run it as such, or adapt the Thread.Sleep() to your multitasking option of choice.
DateTime.Now is not suited for such small timeframes (two-digit milliseconds). Usually the returned value only updates every 18 ms or so, and that granularity actually varies over time; 60 FPS puts you around 16 ms per frame. You should be using Stopwatch or something similar, as in the sketch below.
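A minimal sketch of the same loop rewritten around Stopwatch (System.Diagnostics); the 17 ms interval is just an illustrative ~60 FPS value:
int interval = 17; // ~60 FPS
var sw = Stopwatch.StartNew();
long dueTime = interval;

while (true)
{
    if (sw.ElapsedMilliseconds >= dueTime)
    {
        //insert code here

        //Update the next dueTime
        dueTime = sw.ElapsedMilliseconds + interval;
    }
    else
    {
        //Just yield to not tax out the CPU
        Thread.Sleep(1);
    }
}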
In order to ignore frame drops, I'm using this code (so far it's working as expected):
//Try to get the duplicated frame within the given time.
_duplicatedOutput.AcquireNextFrame(1000, out var duplicateFrameInformation, out var screenResource);

//Somehow, it was not possible to retrieve the resource.
if (screenResource == null || duplicateFrameInformation.AccumulatedFrames == 0)
{
    //Mark the frame as dropped.
    frame.WasDropped = true;
    FrameList.Add(frame);

    screenResource?.Dispose();
    _duplicatedOutput.ReleaseFrame();
    return;
}
I'm simply checking if the screenResource is not null and if there are frames accumulated.
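If you additionally want to skip frames where the desktop image itself did not change (as opposed to failed captures), my reading of the DXGI documentation is that LastPresentTime is zero when an update carries no new desktop image, only mouse/metadata changes; verify this against your SharpDX version:
//Skip updates that carry no new desktop image (mouse/metadata only).
if (duplicateFrameInformation.LastPresentTime == 0)
{
    screenResource?.Dispose();
    _duplicatedOutput.ReleaseFrame();
    return;
}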
I have a quite simple question today: how do you create a bitmap and "draw" it, changing every single pixel in it, in a UWP app?
I have read many things here on StackOverflow, but now I am a little bit confused, because there are so many different types (WriteableBitmap, SoftwareBitmap, BitmapImage, BitmapSource... now, in FCU, they even added BitmapIconSource...) and so many ways... but they mostly start with a given image file or source, and that's not my case.
Let's say, e.g., that I want to create a 20x20 bitmap and want to assign a different ARGB value to every pixel... and then assign it to a BitmapSource property.
What would be the best and efficient way, in a UWP?
Thank you for your patience and your attention.
Best regards
You can use WriteableBitmap and modify its PixelBuffer directly:
var wb = new WriteableBitmap(100, 100);
byte[] imageArray = new byte[100 * 100 * 4];

for (int i = 0; i < imageArray.Length; i += 4)
{
    //BGRA format
    imageArray[i] = 0;       // Blue
    imageArray[i + 1] = 0;   // Green
    imageArray[i + 2] = 255; // Red
    imageArray[i + 3] = 255; // Alpha
}

using (Stream stream = wb.PixelBuffer.AsStream())
{
    //write to bitmap
    await stream.WriteAsync(imageArray, 0, imageArray.Length);
}

TargetImage.Source = wb;
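If you want per-pixel control over that array, a small hypothetical helper keeps the BGRA indexing in one place (note that WriteableBitmap expects premultiplied alpha, which this sketch ignores):
// Hypothetical helper: set one pixel in a BGRA byte array.
static void SetPixel(byte[] pixels, int width, int x, int y, byte r, byte g, byte b, byte a = 255)
{
    int i = (y * width + x) * 4;
    pixels[i] = b;
    pixels[i + 1] = g;
    pixels[i + 2] = r;
    pixels[i + 3] = a;
}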
If you want more abstraction, look into WriteableBitmapEx which adds very useful and easy to use extension methods and helpers that make working with WriteableBitmap a breeze.
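For example (member names as I recall them from the WriteableBitmapEx readme; verify against the version you install):
var wb = BitmapFactory.New(20, 20);
using (wb.GetBitmapContext())
{
    wb.SetPixel(5, 5, Colors.Red);          // single pixel
    wb.DrawLine(0, 0, 19, 19, Colors.Lime); // shape helpers
}
TargetImage.Source = wb;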
That's how I wrote your beautiful code (with some simple changes to make it easier for me to understand):
private void Form1_Load(object sender, EventArgs e)
{
    prev = GetDesktopImage(); //get a screenshot of the desktop
    cur = GetDesktopImage();  //get a screenshot of the desktop

    var locked1 = cur.LockBits(new Rectangle(0, 0, cur.Width, cur.Height),
                               ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
    var locked2 = prev.LockBits(new Rectangle(0, 0, prev.Width, prev.Height),
                                ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);

    ApplyXor(locked1, locked2);

    compressionBuffer = new byte[1920 * 1080 * 4];

    // Compressed buffer -- where the data goes that we'll send.
    int backbufSize = LZ4.LZ4Codec.MaximumOutputLength(this.compressionBuffer.Length) + 4;
    backbuf = new CompressedCaptureScreen(backbufSize);

    MessageBox.Show(compressionBuffer.Length.ToString());

    int length = Compress();

    MessageBox.Show(backbuf.Data.Length.ToString()); //prints the new buffer size
}
The compression buffer length is, for example, 8294400, and backbuf.Data.Length is 8326947. (Note that the second MessageBox prints the allocated capacity of backbuf.Data, which is MaximumOutputLength(8294400) + 4, not the compressed size that Compress() returned in length.)
I didn't like the compression suggestions, so here's what I would do.
You don't want to compress a video stream (so MPEG, AVI, etc are out of the question -- these don't have to be real-time) and you don't want to compress individual pictures (since that's just stupid).
Basically what you want to do is detect if things change and send the differences. You're on the right track with that; most video compressors do that. You also want a fast compression/decompression algorithm; especially if you go to more FPS that will become more relevant.
Differences. First off, eliminate all branches in your code, and make sure memory access is sequential (e.g. iterate x in the inner loop). The latter will give you cache locality. As for the differences, I'd probably use a 64-bit XOR; it's easy, branchless and fast.
If you want performance, it's probably better to do this in C++: The current C# implementation doesn't vectorize your code, and that will help you a great deal here.
Do something like this (I'm assuming 32bit pixel format):
for (int y = 0; y < height; ++y) // change to PFor if you like
{
    ulong* row1 = (ulong*)(image1BasePtr + image1Stride * y);
    ulong* row2 = (ulong*)(image2BasePtr + image2Stride * y);

    // Each ulong covers two 32-bit pixels, so iterate width/2 ulongs per row.
    for (int x = 0; x < width / 2; ++x)
        row2[x] ^= row1[x];
}
Fast compression and decompression usually means simpler compression algorithms. https://code.google.com/p/lz4/ is such an algorithm, and there's a proper .NET port available for that as well. You might want to read on how it works too; there is a streaming feature in LZ4 and if you can make it handle 2 images instead of 1 that will probably give you a nice compression boost.
All in all, if you're trying to compress white noise, it simply won't work and your frame rate will drop. One way to solve this is to reduce the colors if you have too much 'randomness' in a frame. A measure for randomness is entropy, and there are several ways to get a measure of the entropy of a picture ( https://en.wikipedia.org/wiki/Entropy_(information_theory) ). I'd stick with a very simple one: check the size of the compressed picture -- if it's above a certain limit, reduce the number of bits; if below, increase the number of bits.
Note that increasing and decreasing bits is not done with shifting in this case; you don't need your bits to be removed, you simply need your compression to work better. It's probably just as good to use a simple 'AND' with a bitmask. For example, if you want to drop 2 bits, you can do it like this:
for (int y = 0; y < height; ++y) // change to PFor if you like
{
    ulong* row1 = (ulong*)(image1BasePtr + image1Stride * y);
    ulong* row2 = (ulong*)(image2BasePtr + image2Stride * y);

    ulong mask = 0xFFFCFCFCFFFCFCFC; // drops the low 2 bits of each color channel, keeps alpha

    // Again: each ulong covers two 32-bit pixels.
    for (int x = 0; x < width / 2; ++x)
        row2[x] = (row2[x] ^ row1[x]) & mask;
}
PS: I'm not sure what I would do with the alpha component, I'll leave that up to your experimentation.
Good luck!
The long answer
I had some time to spare, so I just tested this approach. Here's some code to support it all.
This code normally runs at over 130 FPS with nice, constant memory pressure on my laptop, so the bottleneck shouldn't be here anymore. Note that you need LZ4 to get this working and that LZ4 is aimed at high speed, not high compression ratios. A bit more on that later.
First we need something that we can use to hold all the data we're going to send. I'm not implementing the sockets stuff itself here (although that should be pretty simple using this as a start), I mainly focused on getting the data you need to send something over.
// The thing you send over a socket
public class CompressedCaptureScreen
{
    public CompressedCaptureScreen(int size)
    {
        this.Data = new byte[size];
        this.Size = 4;
    }

    public int Size;
    public byte[] Data;
}
We also need a class that will hold all the magic:
public class CompressScreenCapture
{
Next, if I'm running high performance code, I make it a habit to preallocate all the buffers first. That'll save you time during the actual algorithmic stuff. 4 buffers of 1080p is about 33 MB, which is fine - so let's allocate that.
public CompressScreenCapture()
{
    // Initialize with black screen; get bounds from screen.
    this.screenBounds = Screen.PrimaryScreen.Bounds;

    // Initialize 2 buffers - 1 for the current and 1 for the previous image
    prev = new Bitmap(screenBounds.Width, screenBounds.Height, PixelFormat.Format32bppArgb);
    cur = new Bitmap(screenBounds.Width, screenBounds.Height, PixelFormat.Format32bppArgb);

    // Clear the 'prev' buffer - this is the initial state
    using (Graphics g = Graphics.FromImage(prev))
    {
        g.Clear(Color.Black);
    }

    // Compression buffer -- we don't really need this but I'm lazy today.
    compressionBuffer = new byte[screenBounds.Width * screenBounds.Height * 4];

    // Compressed buffer -- where the data goes that we'll send.
    int backbufSize = LZ4.LZ4Codec.MaximumOutputLength(this.compressionBuffer.Length) + 4;
    backbuf = new CompressedCaptureScreen(backbufSize);
}

private Rectangle screenBounds;
private Bitmap prev;
private Bitmap cur;
private byte[] compressionBuffer;

private int backbufSize;
private CompressedCaptureScreen backbuf;

private int n = 0;
First thing to do is capture the screen. This is the easy part: simply fill the bitmap of the current screen:
private void Capture()
{
    // Fill 'cur' with a screenshot
    using (var gfxScreenshot = Graphics.FromImage(cur))
    {
        gfxScreenshot.CopyFromScreen(screenBounds.X, screenBounds.Y, 0, 0, screenBounds.Size, CopyPixelOperation.SourceCopy);
    }
}
As I said, I don't want to compress 'raw' pixels. Instead, I'd much rather compress XOR masks of previous and the current image. Most of the times this will give you a whole lot of 0's, which is easy to compress:
private unsafe void ApplyXor(BitmapData previous, BitmapData current)
{
    byte* prev0 = (byte*)previous.Scan0.ToPointer();
    byte* cur0 = (byte*)current.Scan0.ToPointer();

    int height = previous.Height;
    int width = previous.Width;
    int halfwidth = width / 2;

    fixed (byte* target = this.compressionBuffer)
    {
        ulong* dst = (ulong*)target;

        for (int y = 0; y < height; ++y)
        {
            ulong* prevRow = (ulong*)(prev0 + previous.Stride * y);
            ulong* curRow = (ulong*)(cur0 + current.Stride * y);

            for (int x = 0; x < halfwidth; ++x)
            {
                *(dst++) = curRow[x] ^ prevRow[x];
            }
        }
    }
}
For the compression algorithm I simply pass the buffer to LZ4 and let it do its magic.
private int Compress()
{
    // Grab the backbuf in an attempt to update it with new data
    var backbuf = this.backbuf;

    backbuf.Size = LZ4.LZ4Codec.Encode(
        this.compressionBuffer, 0, this.compressionBuffer.Length,
        backbuf.Data, 4, backbuf.Data.Length - 4);

    Buffer.BlockCopy(BitConverter.GetBytes(backbuf.Size), 0, backbuf.Data, 0, 4);

    return backbuf.Size;
}
One thing to note here is that I make it a habit to put everything in my buffer that I need to send over the TCP/IP socket. I don't want to move data around if I can easily avoid it, so I'm simply putting everything that I need on the other side there.
As for the sockets themselves, you can use async TCP sockets here (I would), but if you do, you will need to add an extra buffer.
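For completeness, a hedged sketch of the receiving end (assuming the same lz4net package and the 4-byte size prefix written by Compress above): decode the payload, then xor it onto the previous frame to reconstruct the current one.
void ApplyIncomingFrame(byte[] packet, byte[] screen, byte[] xorBuffer)
{
    // First 4 bytes: compressed payload size, as written by Compress().
    int size = BitConverter.ToInt32(packet, 0);

    // Decompress the xor mask into a preallocated buffer.
    LZ4.LZ4Codec.Decode(packet, 4, size, xorBuffer, 0, xorBuffer.Length, true);

    // Undo the xor against the previous frame to recover the current one.
    for (int i = 0; i < screen.Length; i++)
        screen[i] ^= xorBuffer[i];
}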
The only thing that remains is to glue everything together and put some statistics on the screen:
public void Iterate()
{
    Stopwatch sw = Stopwatch.StartNew();

    // Capture a screen:
    Capture();
    TimeSpan timeToCapture = sw.Elapsed;

    // Lock both images:
    var locked1 = cur.LockBits(new Rectangle(0, 0, cur.Width, cur.Height),
                               ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
    var locked2 = prev.LockBits(new Rectangle(0, 0, prev.Width, prev.Height),
                                ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
    try
    {
        // Xor screen:
        ApplyXor(locked2, locked1);
        TimeSpan timeToXor = sw.Elapsed;

        // Compress screen:
        int length = Compress();
        TimeSpan timeToCompress = sw.Elapsed;

        if ((++n) % 50 == 0)
        {
            Console.Write("Iteration: {0:0.00}s, {1:0.00}s, {2:0.00}s " +
                          "{3} Kb => {4:0.0} FPS \r",
                          timeToCapture.TotalSeconds, timeToXor.TotalSeconds,
                          timeToCompress.TotalSeconds, length / 1024,
                          1.0 / sw.Elapsed.TotalSeconds);
        }

        // Swap buffers:
        var tmp = cur;
        cur = prev;
        prev = tmp;
    }
    finally
    {
        cur.UnlockBits(locked1);
        prev.UnlockBits(locked2);
    }
}
Note that I reduce Console output to ensure that's not the bottleneck. :-)
Simple improvements
It's a bit wasteful to compress all those 0's, right? While you xor, it's pretty easy to track whether the frame contains any data at all (or the min and max y positions that do) with a simple boolean:
ulong tmp = curRow[x] ^ prevRow[x];
*(dst++) = tmp;
hasdata |= tmp != 0;
You also probably don't want to call Compress if you don't have to.
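Something along these lines (sketch, reusing the names from the code above):
// Only compress and send when the xor pass actually saw changes.
if (hasdata)
{
    int length = Compress();
    // hand backbuf to the socket code here
}
// otherwise: nothing changed, skip the frame entirely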
After adding this feature you'll get something like this on your screen:
Iteration: 0.00s, 0.01s, 0.01s 1 Kb => 152.0 FPS
Using another compression algorithm might also help. I stuck to LZ4 because it's simple to use, it's blazing fast and compresses pretty well -- still, there are other options that might work better. See http://fastcompression.blogspot.nl/ for a comparison.
If you have a bad connection or if you're streaming video over a remote connection, all this won't work. Best to reduce the pixel values here. That's quite simple: apply a simple 64-bit mask during the xor to both the previous and current picture... You can also try using indexed colors - anyhow, there's a ton of different things you can try here; I just kept it simple because that's probably good enough.
You can also use Parallel.For for the xor loop; personally I didn't really care about that.
A bit more challenging
If you have 1 server that is serving multiple clients, things will get a bit more challenging, as they will refresh at different rates. We want the fastest refreshing client to determine the server speed - not slowest. :-)
To implement this, the relation between the prev and cur has to change. If we simply 'xor' away like here, we'll end up with a completely garbled picture at the slower clients.
To solve that, we don't want to swap prev anymore, as it should hold key frames (that you'll refresh when the compressed data becomes too big) and cur will hold incremental data from the 'xor' results. This means you can basically grab an arbitrary 'xor'red frame and send it over the line - as long as the prev bitmap is recent.
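A minimal sketch of that refresh policy, reusing `length` from Iterate above; the threshold is an assumption you would tune:
// Hypothetical key-frame refresh: when the compressed delta grows too big,
// promote the current capture to the new key frame and send it in full.
if (length > keyFrameThreshold) // keyFrameThreshold is an assumption
{
    using (Graphics g = Graphics.FromImage(prev))
    {
        g.DrawImageUnscaled(cur, 0, 0); // 'prev' now holds the new key frame
    }
    // send the full key frame here instead of the xor delta
}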
H264 or Equivalent Codec Streaming
There are various compressed streaming formats available that already do almost everything you could do yourself to optimize screen sharing over a network. There are many open source and commercial libraries for streaming them.
Screen transfer in Blocks
H264 already does this, but if you want to do it yourself, you have to divide your screen into smaller blocks of, say, 100x100 pixels, compare those blocks with the previous version, and send only the blocks that changed over the network.
Window Render Information
Microsoft RDP does a lot better: it does not send the screen as a raster image; instead it analyzes the screen and creates blocks based on the windows on the screen. It then analyzes the contents of the screen and sends an image only if needed; if there is a text box with some text in it, RDP sends the information to render a text box with that text, the font information, and so on. So instead of sending an image, it sends instructions on what to render.
You can combine all techniques and make a mixed protocol to send screen blocks with image and other rendering information.
Instead of handling data as an array of bytes, you can handle it as an array of integers.
int* p = (int*)((byte*)scan0.ToPointer() + y * stride);
int* p2 = (int*)((byte*)scan02.ToPointer() + y * stride2);

for (int x = 0; x < nWidth; x++)
{
    //always get the complete pixel when differences are found
    if (*p2 != 0)
        *p = *p2;

    ++p;
    ++p2;
}
I have written face recognition code in C++ using OpenCV.
I use my webcam for live video and output the recognized faces in the video on the debugging screen.
Now I want to create an app in C# using Visual Studio, take the output of the C++ OpenCV code, and put it in a window in the C# app.
So my problem:
1) How to use the OpenCV C++ code in C#
2) How to put the output of my code in a window in the C# app
My code:
#include<opencv2\opencv.hpp> //For opencv functions
#include<opencv2\highgui\highgui.hpp> //For window based functions
#include<fstream> //For dealing with I/O operations on file
using namespace std;
using namespace cv;
static void read_data(vector<Mat>& images, vector<int>& labels, char separator = ' ')
{
    ifstream file("images.txt"); //images.txt contains paths and labels separated by a space
    string line;
    string a[2];

    while (getline(file, line)) // read images.txt line by line
    {
        int i = 0;
        stringstream iss(line);
        while (iss.good() && i < 2)
        {
            iss >> a[i];
            ++i;
        }
        images.push_back(imread(a[0], CV_LOAD_IMAGE_GRAYSCALE)); // a[0] = "path of image"
        labels.push_back(atoi(a[1].c_str()));                    // a[1] = "label"
    }
    file.close();
}
int main()
{
    vector<Mat> images; //stores the sample images
    vector<int> labels; //stores the corresponding labels

    //function call to read_data
    read_data(images, labels);

    //take the size of the sample images
    int im_width = images[0].cols;
    int im_height = images[0].rows;

    //threshold is the minimum value of the magnitude of the EigenFaces vector
    double threshold = 10.0;

    //create an instance of EigenFaceRecognizer
    Ptr<FaceRecognizer> model = createEigenFaceRecognizer(10, threshold);
    double current_threshold = model->getDouble("threshold");

    //set a threshold value for face prediction
    model->set("threshold", 5000.0);

    //train the face recognizer using the sample images
    model->train(images, labels);

    //create face_cascade to detect people
    CascadeClassifier face_cascade;
    if (!face_cascade.load("c:\\haar\\haarcascade_frontalface_default.xml"))
    {
        cout << "ERROR Loading cascade file";
        return 1;
    }

    //capture the video input from the webcam
    VideoCapture capture(CV_CAP_ANY);
    capture.set(CV_CAP_PROP_FRAME_WIDTH, 320);
    capture.set(CV_CAP_PROP_FRAME_HEIGHT, 240);
    Size frameSize(static_cast<int>(320), static_cast<int>(240));

    //initialize the VideoWriter object
    VideoWriter oVideoWriter("MyVideo.avi", CV_FOURCC('P','I','M','1'), 20, frameSize, true);

    if (!capture.isOpened())
    {
        cout << "Error in camera";
        return 1;
    }

    Mat cap_img, gray_img;
    //store the detected faces
    vector<Rect> faces;

    while (1)
    {
        //capture frame by frame into cap_img
        capture >> cap_img;
        waitKey(10);

        //image conversion: color to gray
        cvtColor(cap_img, gray_img, CV_BGR2GRAY);

        //histogram equalization to increase contrast by stretching intensity ranges
        equalizeHist(gray_img, gray_img);

        //detect faces in the frame
        //CV_HAAR_SCALE_IMAGE to scale the size of the detected face
        //CV_HAAR_DO_CANNY_PRUNING to increase speed, as it skips image regions that are unlikely to contain a face
        face_cascade.detectMultiScale(gray_img, faces, 1.1, 10, CV_HAAR_SCALE_IMAGE | CV_HAAR_DO_CANNY_PRUNING, Size(0,0), Size(300,300));

        //loop over the detected faces
        for (int i = 0; i < faces.size(); i++)
        {
            Rect face_i = faces[i];
            Mat face = gray_img(face_i);
            Mat face_resized;

            //resize the detected face to the size of the sample images
            resize(face, face_resized, Size(im_width, im_height), 1.0, 1.0, INTER_CUBIC);

            //predict the person the face belongs to; returns a label
            int predicted_label = -1;
            predicted_label = model->predict(face_resized);

            //draw a rectangle around the face
            rectangle(cap_img, face_i, CV_RGB(0,255,0), 1);

            //text to be put with the face; "new" by default for new faces
            string box_text = format("new");

            //change the text based on the label
            if (predicted_label > -1)
                switch (predicted_label)
                {
                case 0: box_text = format("keanu"); break;
                case 1: box_text = format("selena"); break;
                }

            //calculate the coordinates to put the text, based on the position of the face
            int pos_x = max(face_i.tl().x - 10, 0);
            int pos_y = max(face_i.tl().y - 10, 0);

            //put the text on the output screen
            putText(cap_img, box_text, Point(pos_x, pos_y), FONT_HERSHEY_PLAIN, 0.8, CV_RGB(0,255,0), 1, CV_AA);

            //only record frames containing unrecognized faces
            if (box_text != "keanu" && box_text != "selena")
                oVideoWriter.write(cap_img); //write the frame into the file
        }

        //show the frame in the result window
        imshow("Result", cap_img);
        waitKey(3);
        char c = waitKey(3);
        if (c == 27)
            break;
    }
    return 0;
}
I'm working on an application which will stream the color, depth, and IR video data from the Kinect V2 sensor. Right now I'm just putting together the color video part of the app. I've read through some tutorials and actually got some video data coming into my app - the problem seems to be that the bytes are in the wrong order, which gives me an oddly discolored image.
So, let me explain how I got here. In my code, I first open the sensor and also instantiate a new multi source frame reader. After I've created the reader, I create an event handler called Reader_MultiSourceFrameArrived:
void Reader_MultiSourceFrameArrived(object sender, MultiSourceFrameArrivedEventArgs e)
{
    if (processing || gotframe) return;

    // Get a reference to the multi-frame
    var reference = e.FrameReference.AcquireFrame();

    // Open color frame
    using (ColorFrame frame = reference.ColorFrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            processing = true;

            var description = frame.ColorFrameSource.FrameDescription;
            bw2 = description.Width / 2;
            bh2 = description.Height / 2;
            bpp = (int)description.BytesPerPixel;

            if (imgBuffer == null)
            {
                imgBuffer = new byte[description.BytesPerPixel * description.Width * description.Height];
            }

            frame.CopyRawFrameDataToArray(imgBuffer);
            gotframe = true;
            processing = false;
        }
    }
}
Now, every time a frame is received (and not processing) it should copy the frame data into an array called imgBuffer. When my application is ready I then call this routine to convert the array into a Windows Bitmap that I can display on my screen.
if (gotframe)
{
    if (theBitmap.Rx != bw2 || theBitmap.Ry != bh2) theBitmap.SetSize(bw2, bh2);

    int kk = 0;
    for (int j = 0; j < bh2; ++j)
    {
        for (int i = 0; i < bw2; ++i)
        {
            // Sample every other pixel of every other row (half resolution)
            kk = (j * bw2 * 2 + i) * 2 * bpp;
            theBitmap.pixels[i, bh2 - j - 1].B = imgBuffer[kk];
            theBitmap.pixels[i, bh2 - j - 1].G = imgBuffer[kk + 1];
            theBitmap.pixels[i, bh2 - j - 1].R = imgBuffer[kk + 2];
            theBitmap.pixels[i, bh2 - j - 1].A = 255;
        }
    }

    theBitmap.needupdate = true;
    gotframe = false;
}
So, after this runs, theBitmap now contains the image information needed to draw the image on the screen... however, as described above, it looks quite strange. The most obvious change is to simply swap the order of the B,G,R values when they get assigned to the bitmap in the double for loop (which I tried)... however, that simply results in other strange color images, and none provide an accurate color image. Any thoughts on where I might be going wrong?
Is this Bgra?
The normal "RGB" in Kinect v2 for C# is BGRA.
Using the Kinect SDK 2.0, you don't need all of those "for" cycles.
The function used to allocate the pixels in the bitmap is this one:
colorFrame.CopyConvertedFrameDataToIntPtr(
    this.colorBitmap.BackBuffer,
    (uint)(colorFrameDescription.Width * colorFrameDescription.Height * 4),
    ColorImageFormat.Bgra);
1) Get the frame from the Kinect, using Reader_ColorFrameArrived (see ColorSamples - WPF);
2) Create the colorFrameDescription from the ColorFrameSource using the Bgra format;
3) Create the bitmap to display, as in the sketch below;
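A hedged sketch of those three steps, modeled on the SDK's ColorBasics-WPF sample (verify member names against your SDK version):
// 2) Ask the color source for a BGRA frame description.
FrameDescription colorFrameDescription =
    kinectSensor.ColorFrameSource.CreateFrameDescription(ColorImageFormat.Bgra);

// 3) Create the bitmap the UI will display.
colorBitmap = new WriteableBitmap(
    colorFrameDescription.Width, colorFrameDescription.Height, 96.0, 96.0, PixelFormats.Bgra32, null);

// 1) In Reader_ColorFrameArrived, copy the converted pixels into the bitmap.
colorBitmap.Lock();
colorFrame.CopyConvertedFrameDataToIntPtr(
    colorBitmap.BackBuffer,
    (uint)(colorFrameDescription.Width * colorFrameDescription.Height * 4),
    ColorImageFormat.Bgra);
colorBitmap.AddDirtyRect(new Int32Rect(0, 0, colorBitmap.PixelWidth, colorBitmap.PixelHeight));
colorBitmap.Unlock();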
If you have any problems, please say so. But if you follow the sample, it's actually pretty clear how to do it.
I was stuck on this problem forever. The problem is that almost all examples you find are WPF examples, but for Windows Forms it's a different story.
frame.CopyRawFrameDataToArray(imgBuffer);
gets you the raw data, which is
ColorImageFormat.Yuy2
By converting it to RGB you should be able to fix your color problem. The transformation from YUY2 to RGB is very expensive; you might want to use a Parallel.For loop to maintain your frame rate, as in the sketch below.
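For reference, a minimal managed sketch of that conversion (standard BT.601 integer math; the per-row split is just there so Parallel.For applies cleanly). Alternatively, the SDK can convert for you via frame.CopyConvertedFrameDataToArray(imgBuffer, ColorImageFormat.Bgra).
// Hedged sketch: convert a YUY2 frame to BGRA, one row per task.
// Every 4 bytes of YUY2 (Y0 U Y1 V) describe two pixels.
static void Yuy2ToBgra(byte[] yuy2, byte[] bgra, int width, int height)
{
    System.Threading.Tasks.Parallel.For(0, height, y =>
    {
        int src = y * width * 2; // YUY2: 2 bytes per pixel
        int dst = y * width * 4; // BGRA: 4 bytes per pixel
        for (int x = 0; x < width; x += 2, src += 4, dst += 8)
        {
            int y0 = yuy2[src] - 16, u = yuy2[src + 1] - 128;
            int y1 = yuy2[src + 2] - 16, v = yuy2[src + 3] - 128;
            WritePixel(bgra, dst, y0, u, v);
            WritePixel(bgra, dst + 4, y1, u, v);
        }
    });
}

static void WritePixel(byte[] bgra, int o, int c, int d, int e)
{
    bgra[o]     = Clamp((298 * c + 516 * d + 128) >> 8);           // B
    bgra[o + 1] = Clamp((298 * c - 100 * d - 208 * e + 128) >> 8); // G
    bgra[o + 2] = Clamp((298 * c + 409 * e + 128) >> 8);           // R
    bgra[o + 3] = 255;                                             // A
}

static byte Clamp(int v) => (byte)(v < 0 ? 0 : v > 255 ? 255 : v);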