How does MediaCapture.StartPreviewToCustomSinkAsync method work? - c#

Why is Windows documentation so lacking ?? It seems impossible to find an example of how this method is supposed to work StartPreviewToCustomSinkAsync
What I am trying to do is get a preview image from a video source (via MediaCapture) but can't understand how this method works (especially what the second parameter, IMediaExtension, is supposed to be/do).
Any chance any of you can help me with this ?

If all you need is to get a preview frame every now and then, there is a sample on the Microsoft github page that is relevant, although they target Windows 10. You may be interested in migrating your project to get this functionality.
GetPreviewFrame: This sample will capture preview frames as opposed to full-blown photos. Once it has a preview frame, it can edit the pixels on it.
Here is the relevant part:
private async Task GetPreviewFrameAsSoftwareBitmapAsync()
{
// Get information about the preview
var previewProperties = _mediaCapture.VideoDeviceController.GetMediaStreamProperties(MediaStreamType.VideoPreview) as VideoEncodingProperties;
// Create the video frame to request a SoftwareBitmap preview frame
var videoFrame = new VideoFrame(BitmapPixelFormat.Bgra8, (int)previewProperties.Width, (int)previewProperties.Height);
// Capture the preview frame
using (var currentFrame = await _mediaCapture.GetPreviewFrameAsync(videoFrame))
{
// Collect the resulting frame
SoftwareBitmap previewFrame = currentFrame.SoftwareBitmap;
// Add a simple green filter effect to the SoftwareBitmap
EditPixels(previewFrame);
}
}
private unsafe void EditPixels(SoftwareBitmap bitmap)
{
// Effect is hard-coded to operate on BGRA8 format only
if (bitmap.BitmapPixelFormat == BitmapPixelFormat.Bgra8)
{
// In BGRA8 format, each pixel is defined by 4 bytes
const int BYTES_PER_PIXEL = 4;
using (var buffer = bitmap.LockBuffer(BitmapBufferAccessMode.ReadWrite))
using (var reference = buffer.CreateReference())
{
// Get a pointer to the pixel buffer
byte* data;
uint capacity;
((IMemoryBufferByteAccess)reference).GetBuffer(out data, out capacity);
// Get information about the BitmapBuffer
var desc = buffer.GetPlaneDescription(0);
// Iterate over all pixels
for (uint row = 0; row < desc.Height; row++)
{
for (uint col = 0; col < desc.Width; col++)
{
// Index of the current pixel in the buffer (defined by the next 4 bytes, BGRA8)
var currPixel = desc.StartIndex + desc.Stride * row + BYTES_PER_PIXEL * col;
// Read the current pixel information into b,g,r channels (leave out alpha channel)
var b = data[currPixel + 0]; // Blue
var g = data[currPixel + 1]; // Green
var r = data[currPixel + 2]; // Red
// Boost the green channel, leave the other two untouched
data[currPixel + 0] = b;
data[currPixel + 1] = (byte)Math.Min(g + 80, 255);
data[currPixel + 2] = r;
}
}
}
}
}
And declare this outside your class:
[ComImport]
[Guid("5b0d3235-4dba-4d44-865e-8f1d0e4fd04d")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
unsafe interface IMemoryBufferByteAccess
{
void GetBuffer(out byte* buffer, out uint capacity);
}
And of course, your project will have to allow unsafe code for all of this to work.
Have a closer look at the sample to see how to get all the details. Or, to have a walkthrough, you can watch the camera session from a recent //build/ conference, which includes a little bit of a walkthrough through some camera samples.
Alternatively, if you're bound to 8.1, you can look into the Lumia Imaging SDK, which can notify you when there's a new preview frame available.

There are much of examples on GitHub. If you are developing for Windows Phone 8.1 - samples are here
According this example recording looking like this:
private StspMediaSinkProxy mediaSink;
mediaSink = new StspMediaSinkProxy();
MediaEncodingProfile encodingProfile = MediaEncodingProfile.CreateMp4(VideoEncodingQuality.Qvga);
var mfExtension = await mediaSink.InitializeAsync(encodingProfile.Audio, encodingProfile.Video);
await mediaCapture.StartRecordToCustomSinkAsync(encodingProfile, mfExtension);
So, you can understand how to get IMediaExtension from MediaEncodingProfile from this example.
You haven't post any code, but making Preview should be similar to code I have provide

Related

How to use the ARCore camera image in OpenCV in an Unity Android app?

I am trying to use OpenCV for hand gesture recognition in my Unity ARCore game. However, with the deprecation of TextureReaderAPI, the only way to capture the image from the camera is by using Frame.CameraImage.AcquireCameraImageBytes(). The problem with that is not only that the image is in 640x480 resolution (this cannot be changed AFAIK), but it is also in YUV_420_888 format.
As if that were not enough, OpenCV does not have free C#/Unity packages, so if I do not want to cash out 20$ for a paid package, I need to use available C++ or python versions. How do I move the YUV image to OpenCV, convert it to an RGB (or HSV) color space, and then either do some processing on it or return it back to Unity?
In this example, I will use C++ OpenCV libraries and Visual Studio 2017 and I will try to capture ARCore camera image, move it to OpenCV (as efficiently as possible), convert it to RGB color space, then move it back to Unity C# code and save it in the phone's memory.
Firstly, we have to create a C++ dynamic library project to use with OpenCV. For this, I highly recommend to follow both Pierre Baret's and Ninjaman494's answers on this question: OpenCV + Android + Unity. The process is rather straightforward, and if you will not deviate from their answers too much (i.e. you can safely download a newer than 3.3.1 version of OpenCV, but be careful when compiling for ARM64 instead of ARM, etc.), you should be able to call a C++ function from C#.
In my experience, I had to solve two problems - firstly, if you made the project part of your C# solution instead of creating a new solution, Visual Studio will keep messing with your configuration, like trying to compile a x86 version instead of an ARM version. To save yourself the hassle, create a completely separate solution. The other problem is that some functions failed to link for me, thus throwing a undefined reference linker error (undefined reference to 'cv::error(int, std::string const&, char const*, char const*, int, to be exact). If this happens and the problem is with a function that you do not really need, just recreate the function in your code - for instance if you have problems with cv::error, add this code in the end of your .cpp file:
namespace cv {
__noreturn void error(int a, const String & b, const char * c, const char * d, int e) {
throw std::string(b);
}
}
Sure, this is ugly and dirty way to do things, so if you know how to fix the linker error, please do so and let me know.
Now, you should have a working C++ code that compiles and can be run from a Unity Android application. However, what we want is for OpenCV to not return a number, but to convert an image. So change your code to this:
.h file
extern "C" {
namespace YOUR_OWN_NAMESPACE
{
int ConvertYUV2RGBA(unsigned char *, unsigned char *, int, int);
}
}
.cpp file
extern "C" {
int YOUR_OWN_NAMESPACE::ConvertYUV2RGBA(unsigned char * inputPtr, unsigned char * outputPtr, int width, int height) {
// Create Mat objects for the YUV and RGB images. For YUV, we need a
// height*1.5 x width image, that has one 8-bit channel. We can also tell
// OpenCV to have this Mat object "encapsulate" an existing array,
// which is inputPtr.
// For RGB image, we need a height x width image, that has three 8-bit
// channels. Again, we tell OpenCV to encapsulate the outputPtr array.
// Thanks to specifying existing arrays as data sources, no copying
// or memory allocation has to be done, and the process is highly
// effective.
cv::Mat input_image(height + height / 2, width, CV_8UC1, inputPtr);
cv::Mat output_image(height, width, CV_8UC3, outputPtr);
// If any of the images has not loaded, return 1 to signal an error.
if (input_image.empty() || output_image.empty()) {
return 1;
}
// Convert the image. Now you might have seen people telling you to use
// NV21 or 420sp instead of NV12, and BGR instead of RGB. I do not
// understand why, but this was the correct conversion for me.
// If you have any problems with the color in the output image,
// they are probably caused by incorrect conversion. In that case,
// I can only recommend you the trial and error method.
cv::cvtColor(input_image, output_image, cv::COLOR_YUV2RGB_NV12);
// Now that the result is safely saved in outputPtr, we can return 0.
return 0;
}
}
Now, rebuild the solution (Ctrl + Shift + B) and copy the libProjectName.so file to Unity's Plugins/Android folder, as in the linked answer.
The next thing is to save the image from ARCore, move it to C++ code, and get it back. Let us add this inside the class in our C# script:
[DllImport("YOUR_OWN_NAMESPACE")]
public static extern int ConvertYUV2RGBA(IntPtr input, IntPtr output, int width, int height);
You will be prompted by Visual Studio to add System.Runtime.InteropServices using clause - do so.
This allows us to use the C++ function in our C# code. Now, let's add this function to our C# component:
public Texture2D CameraToTexture()
{
// Create the object for the result - this has to be done before the
// using {} clause.
Texture2D result;
// Use using to make sure that C# disposes of the CameraImageBytes afterwards
using (CameraImageBytes camBytes = Frame.CameraImage.AcquireCameraImageBytes())
{
// If acquiring failed, return null
if (!camBytes.IsAvailable)
{
Debug.LogWarning("camBytes not available");
return null;
}
// To save a YUV_420_888 image, you need 1.5*pixelCount bytes.
// I will explain later, why.
byte[] YUVimage = new byte[(int)(camBytes.Width * camBytes.Height * 1.5f)];
// As CameraImageBytes keep the Y, U and V data in three separate
// arrays, we need to put them in a single array. This is done using
// native pointers, which are considered unsafe in C#.
unsafe
{
for (int i = 0; i < camBytes.Width * camBytes.Height; i++)
{
YUVimage[i] = *((byte*)camBytes.Y.ToPointer() + (i * sizeof(byte)));
}
for (int i = 0; i < camBytes.Width * camBytes.Height / 4; i++)
{
YUVimage[(camBytes.Width * camBytes.Height) + 2 * i] = *((byte*)camBytes.U.ToPointer() + (i * camBytes.UVPixelStride * sizeof(byte)));
YUVimage[(camBytes.Width * camBytes.Height) + 2 * i + 1] = *((byte*)camBytes.V.ToPointer() + (i * camBytes.UVPixelStride * sizeof(byte)));
}
}
// Create the output byte array. RGB is three channels, therefore
// we need 3 times the pixel count
byte[] RGBimage = new byte[camBytes.Width * camBytes.Height * 3];
// GCHandles help us "pin" the arrays in the memory, so that we can
// pass them to the C++ code.
GCHandle YUVhandle = GCHandle.Alloc(YUVimage, GCHandleType.Pinned);
GCHandle RGBhandle = GCHandle.Alloc(RGBimage, GCHandleType.Pinned);
// Call the C++ function that we created.
int k = ConvertYUV2RGBA(YUVhandle.AddrOfPinnedObject(), RGBhandle.AddrOfPinnedObject(), camBytes.Width, camBytes.Height);
// If OpenCV conversion failed, return null
if (k != 0)
{
Debug.LogWarning("Color conversion - k != 0");
return null;
}
// Create a new texture object
result = new Texture2D(camBytes.Width, camBytes.Height, TextureFormat.RGB24, false);
// Load the RGB array to the texture, send it to GPU
result.LoadRawTextureData(RGBimage);
result.Apply();
// Save the texture as an PNG file. End the using {} clause to
// dispose of the CameraImageBytes.
File.WriteAllBytes(Application.persistentDataPath + "/tex.png", result.EncodeToPNG());
}
// Return the texture.
return result;
}
To be able to run unsafe code, you also need to allow it in Unity. Go to Player Settings (Edit > Project Settings > Player Settings and check the Allow unsafe code checkbox.)
Now, you can call the CameraToTexture() function, let's say, every 5 seconds from Update(), and the camera image should be saved as /Android/data/YOUR_APPLICATION_PACKAGE/files/tex.png. The image will probably be landscape oriented, even if you held the phone in portrait mode, but this is not that hard to fix anymore. Also, you might notice a freeze everytime the image is saved - I recommend calling this function in a separate thread because of this. Also, the most demanding operation here is saving the image as an PNG file, so if you need it for any other reason, you should be fine (still use the separate thread, though).
If you want to undestand the YUV_420_888 format, why you need a 1.5*pixelCount array, and why we modified the arrays the way we did, read https://wiki.videolan.org/YUV/#NV12. Other websites seem to have incorrect information about how this format works.
Also, feel free to comment me with any issues you might have, and I will try to help with them, as well as any feedback for both the code and answer.
APPENDIX 1: According to https://docs.unity3d.com/ScriptReference/Texture2D.LoadRawTextureData.html, you should use GetRawTextureData instead of LoadRawTextureData, to prevent copying. To do this, just pin the array returned by GetRawTextureData instead of the RGBimage array (which you can remove). Also, do not forget to call result.Apply(); afterwards.
APPENDIX 2: Do not forget to call Free() on both GCHandles when you are done using them.
I figured out how to get the full resolution CPU image in Arcore 1.8.
I can now get the full camera resolution with cameraimagebytes.
put this in your class variables:
private ARCoreSession.OnChooseCameraConfigurationDelegate m_OnChoseCameraConfiguration = null;
put this in Start()
m_OnChoseCameraConfiguration = _ChooseCameraConfiguration; ARSessionManager.RegisterChooseCameraConfigurationCallback(m_OnChoseCameraConfiguration); ARSessionManager.enabled = false; ARSessionManager.enabled = true;
Add this callback to the class:
private int _ChooseCameraConfiguration(List<CameraConfig> supportedConfigurations) { return supportedConfigurations.Count - 1; }
Once you add those, you should have cameraimagebytes returning the full resolution of the camera.
For everyone who want to try this with OpencvForUnity:
public Mat getCameraImage()
{
// Use using to make sure that C# disposes of the CameraImageBytes afterwards
using (CameraImageBytes camBytes = Frame.CameraImage.AcquireCameraImageBytes())
{
// If acquiring failed, return null
if (!camBytes.IsAvailable)
{
Debug.LogWarning("camBytes not available");
return null;
}
// To save a YUV_420_888 image, you need 1.5*pixelCount bytes.
// I will explain later, why.
byte[] YUVimage = new byte[(int)(camBytes.Width * camBytes.Height * 1.5f)];
// As CameraImageBytes keep the Y, U and V data in three separate
// arrays, we need to put them in a single array. This is done using
// native pointers, which are considered unsafe in C#.
unsafe
{
for (int i = 0; i < camBytes.Width * camBytes.Height; i++)
{
YUVimage[i] = *((byte*)camBytes.Y.ToPointer() + (i * sizeof(byte)));
}
for (int i = 0; i < camBytes.Width * camBytes.Height / 4; i++)
{
YUVimage[(camBytes.Width * camBytes.Height) + 2 * i] = *((byte*)camBytes.U.ToPointer() + (i * camBytes.UVPixelStride * sizeof(byte)));
YUVimage[(camBytes.Width * camBytes.Height) + 2 * i + 1] = *((byte*)camBytes.V.ToPointer() + (i * camBytes.UVPixelStride * sizeof(byte)));
}
}
// Create the output byte array. RGB is three channels, therefore
// we need 3 times the pixel count
byte[] RGBimage = new byte[camBytes.Width * camBytes.Height * 3];
// GCHandles help us "pin" the arrays in the memory, so that we can
// pass them to the C++ code.
GCHandle pinnedArray = GCHandle.Alloc(YUVimage, GCHandleType.Pinned);
IntPtr pointer = pinnedArray.AddrOfPinnedObject();
Mat input = new Mat(camBytes.Height + camBytes.Height / 2, camBytes.Width, CvType.CV_8UC1);
Mat output = new Mat(camBytes.Height, camBytes.Width, CvType.CV_8UC3);
Utils.copyToMat(pointer, input);
Imgproc.cvtColor(input, output, Imgproc.COLOR_YUV2RGB_NV12);
pinnedArray.Free();
return output;
}
}
Here is an implementation of this which just uses the free plugin OpenCV Plus Unity. Very simple to set up and great documentation if you are familiar with OpenCV.
This implementation rotates the image properly using OpenCV, stores them into memory and upon exiting the app, saves them to file. I have tried to strip all Unity aspects from the code so that the function GetCameraImage() can be run on a separate thread.
I can confirm it works on Andoird (GS7), I presume it will work pretty universally.
using System;
using System.Collections.Generic;
using GoogleARCore;
using UnityEngine;
using OpenCvSharp;
using System.Runtime.InteropServices;
public class CamImage : MonoBehaviour
{
public static List<Mat> AllData = new List<Mat>();
public static void GetCameraImage()
{
// Use using to make sure that C# disposes of the CameraImageBytes afterwards
using (CameraImageBytes camBytes = Frame.CameraImage.AcquireCameraImageBytes())
{
// If acquiring failed, return null
if (!camBytes.IsAvailable)
{
return;
}
// To save a YUV_420_888 image, you need 1.5*pixelCount bytes.
// I will explain later, why.
byte[] YUVimage = new byte[(int)(camBytes.Width * camBytes.Height * 1.5f)];
// As CameraImageBytes keep the Y, U and V data in three separate
// arrays, we need to put them in a single array. This is done using
// native pointers, which are considered unsafe in C#.
unsafe
{
for (int i = 0; i < camBytes.Width * camBytes.Height; i++)
{
YUVimage[i] = *((byte*)camBytes.Y.ToPointer() + (i * sizeof(byte)));
}
for (int i = 0; i < camBytes.Width * camBytes.Height / 4; i++)
{
YUVimage[(camBytes.Width * camBytes.Height) + 2 * i] = *((byte*)camBytes.U.ToPointer() + (i * camBytes.UVPixelStride * sizeof(byte)));
YUVimage[(camBytes.Width * camBytes.Height) + 2 * i + 1] = *((byte*)camBytes.V.ToPointer() + (i * camBytes.UVPixelStride * sizeof(byte)));
}
}
// GCHandles help us "pin" the arrays in the memory, so that we can
// pass them to the C++ code.
GCHandle pinnedArray = GCHandle.Alloc(YUVimage, GCHandleType.Pinned);
IntPtr pointerYUV = pinnedArray.AddrOfPinnedObject();
Mat input = new Mat(camBytes.Height + camBytes.Height / 2, camBytes.Width, MatType.CV_8UC1, pointerYUV);
Mat output = new Mat(camBytes.Height, camBytes.Width, MatType.CV_8UC3);
Cv2.CvtColor(input, output, ColorConversionCodes.YUV2BGR_NV12);// YUV2RGB_NV12);
// FLIP AND TRANPOSE TO VERTICAL
Cv2.Transpose(output, output);
Cv2.Flip(output, output, FlipMode.Y);
AllData.Add(output);
pinnedArray.Free();
}
}
}
I then call ExportImages() when exiting the program to save to file.
private void ExportImages()
{
/// Write Camera intrinsics to text file
var path = Application.persistentDataPath;
StreamWriter sr = new StreamWriter(path + #"/intrinsics.txt");
sr.WriteLine(CameraIntrinsicsOutput.text);
Debug.Log(CameraIntrinsicsOutput.text);
sr.Close();
// Loop through Mat List, Add to Texture and Save.
for (var i = 0; i < CamImage.AllData.Count; i++)
{
Mat imOut = CamImage.AllData[i];
Texture2D result = Unity.MatToTexture(imOut);
result.Apply();
byte[] im = result.EncodeToJPG(100);
string fileName = "/IMG" + i + ".jpg";
File.WriteAllBytes(path + fileName, im);
string messge = "Succesfully Saved Image To " + path + "\n";
Debug.Log(messge);
Destroy(result);
}
}
Seems you already fix this.
But for anyone who want to combine AR with hand gesture recognition and tracking, try Manomotion: https://www.manomotion.com/
free SDk and work prefect in 12/2020.
Use the SDK Community Edition and Download ARFoundation version

Generating photo quality scans with WIA in C#

So I have been working on updates to a fairly simple and task-specific image processing app that I rolled out some time ago. Due to a tendency of the imaging techs who use this software to mess with their scanner settings in ways that require the program to accommodate unnecessary changes, sometimes intentionally and sometimes accidentally, I wanted to add a Scan button in the update that will standardize things like image resolution and color settings in order to enforce uniformity while also reducing the number of programs employees have to have open and switch between. Initially I tried to accomplish this with a Powershell script that was called by the original python program. That was a nightmare, is not what I am doing now, and this is not a duplicate of the question I posted with regards to that problem. So, on to the problem:
Rather than sticking with python and Powershell, I wrote the upgraded app in C#, using WIA to handle the scanner and Aforge.Net to perform image post-processing tasks. I have code that works reasonably well, finds the scanner, and scans the image in color with the appropriate size, resolution, and compression. The issue is that this is still not really a "photo-quality" image. We are scanning comic books and things like smudges and creases in the cover have to be visible on all scans, even very dark ones. The Epson scan manager accomplishes this pretty well, though it washes out the images a bit in the process, but I can't figure out what settings I should change in order to achieve a similar end. As an example, here is a test image scanned with the scan button on my app:
And here is the same image scanned using the Epson Scan Manager:
I basically want to know how I get the top image to look more like the bottom image. It doesn't have to be exactly the same, but I need to be able to see all those smudges and imperfections, or at least as many of them as possible. I can pretty easily imitate the general look of the bottom image with image filters, but I can't use that to get information that the scanner didn't get. Post-processing won't necessarily get me those smudges back. I need to adjust how the image is taken. In theory, I know I should be able to play with something like exposure times and the like but I can't even find appropriate constants for that sort of thing as the documentation is somewhat opaque. Here is the code I currently have for accessing the scanner and getting the scan:
private static void AdjustScannerSettings(IItem scannerItem, int scanResolutionDPI, int scanStartLeftPixel, int scanStartTopPixel, int scanWidthPixels, int scanHeightPixels, int brightnessPercents, int contrastPercents, int colorMode)
{
const string WIA_SCAN_COLOR_MODE = "6146";
const string WIA_HORIZONTAL_SCAN_RESOLUTION_DPI = "6147";
const string WIA_VERTICAL_SCAN_RESOLUTION_DPI = "6148";
const string WIA_HORIZONTAL_SCAN_START_PIXEL = "6149";
const string WIA_VERTICAL_SCAN_START_PIXEL = "6150";
const string WIA_HORIZONTAL_SCAN_SIZE_PIXELS = "6151";
const string WIA_VERTICAL_SCAN_SIZE_PIXELS = "6152";
const string WIA_SCAN_BRIGHTNESS_PERCENTS = "6154";
const string WIA_SCAN_CONTRAST_PERCENTS = "6155";
SetWIAProperty(scannerItem.Properties, "4104", 24);
SetWIAProperty(scannerItem.Properties, WIA_HORIZONTAL_SCAN_RESOLUTION_DPI, scanResolutionDPI);
SetWIAProperty(scannerItem.Properties, WIA_VERTICAL_SCAN_RESOLUTION_DPI, scanResolutionDPI);
SetWIAProperty(scannerItem.Properties, WIA_HORIZONTAL_SCAN_START_PIXEL, scanStartLeftPixel);
SetWIAProperty(scannerItem.Properties, WIA_VERTICAL_SCAN_START_PIXEL, scanStartTopPixel);
SetWIAProperty(scannerItem.Properties, WIA_HORIZONTAL_SCAN_SIZE_PIXELS, scanWidthPixels);
SetWIAProperty(scannerItem.Properties, WIA_VERTICAL_SCAN_SIZE_PIXELS, scanHeightPixels);
SetWIAProperty(scannerItem.Properties, WIA_SCAN_BRIGHTNESS_PERCENTS, brightnessPercents);
SetWIAProperty(scannerItem.Properties, WIA_SCAN_CONTRAST_PERCENTS, contrastPercents);
SetWIAProperty(scannerItem.Properties, WIA_SCAN_COLOR_MODE, colorMode);
}
private static void SetWIAProperty(IProperties properties, object propName, object propValue)
{
Property prop = properties.get_Item(ref propName);
prop.set_Value(ref propValue);
}
private void buttonScan_Click(object sender, EventArgs e)
{
var deviceManager = new DeviceManager();
DeviceInfo firstScannerAvailable = null;
for (int i = 1; i <= deviceManager.DeviceInfos.Count; i++)
{
if (deviceManager.DeviceInfos[i].Type != WiaDeviceType.ScannerDeviceType)
{
continue;
}
firstScannerAvailable = deviceManager.DeviceInfos[i];
break;
}
var device = firstScannerAvailable.Connect();
var scannerItem = device.Items[1];
int resolution = 300;
int width_pixel = 3510;
int height_pixel = 5100;
int color_mode = 1;
AdjustScannerSettings(scannerItem, resolution, 0, 0, width_pixel, height_pixel, 0, 0, color_mode);
var imageFile = (ImageFile)scannerItem.Transfer("{B96B3CAE-0728-11D3-9D7B-0000F81EF32E}");
var pathbase = Path.Combine(pictures, basedaemonpath);
string filebase = DateTime.Now.ToString("dd-MM-yyyy-hh-mm-ss-fffffff") + ".jpg";
var path = Path.Combine(pathbase, filebase);
WIA.ImageProcess myip = new WIA.ImageProcess(); // use to compress jpeg.
myip.Filters.Add(myip.FilterInfos["Convert"].FilterID);
myip.Filters[1].Properties["FormatID"].set_Value("{B96B3CAE-0728-11D3-9D7B-0000F81EF32E}");
myip.Filters[1].Properties["Quality"].set_Value(84);
ImageFile image = myip.Apply(imageFile);
image.SaveFile(path);
}
I can include the post-processing code as well if it is needed but there is a lot of it (It is the primary function of the app after all) and all it really does is get a bunch of information about the content of the image and then rotate and crop it. It shouldn't have an effect on actual look of the image with the exception of the rotation and crop, so I'm leaving this part out for now. If snippets of this code are necessary let me know and I will post them. Thanks for any help you might be able to provide!
You need to decrease the contrast and increase the brightness to get the expected results.
According to this Microsoft WIA page. The valid ranges are from -1000 to 1000.
Make the following adjustments to the buttonScan_Click method:
// ...
int width_pixel = 3510;
int height_pixel = 5100;
int color_mode = 1;
// Add the following two lines
int brightness = 500;
int contrast = -500;
// Change the 0, 0 to brightness, contrast in the next line.
AdjustScannerSettings(scannerItem, resolution, 0, 0, width_pixel, height_pixel, brightness, contrast, color_mode);
You will have to adjust the values according to the results.

XAudio2 - Cracking output when using a dynamic buffer

To provide a little bit of context. I am trying to output live audio from a camera in my c# application. After doing some research it seems pretty obvious to do it in a c++ managed dll. I chose the XAudio2 api because it should be pretty easy to implement and use with dynamic audio content.
So the idea is to create the XAudio device in c++ with an empty buffer and push in the audio from the c# code side. The audio chunks are pushed every 50ms because I want to keep the latency as small as possible.
// SampleRate = 44100; Channels = 2; BitPerSample = 16;
var blockAlign = (Channels * BitsPerSample) / 8;
var avgBytesPerSecond = SampleRate * blockAlign;
var avgBytesPerMillisecond = avgBytesPerSecond / 1000;
var bufferSize = avgBytesPerMillisecond * Time;
_sampleBuffer = new byte[bufferSize];
Everytime the timer runs it gets the pointer of the audio buffer, reads the data from the audio, copies the data to the pointer and calls the PushAudio method.
I am also using a stopwatch to check how long the processing took and calculate the interval again for the timer to include the processing time.
private void PushAudioChunk(object sender, ElapsedEventArgs e)
{
unsafe
{
_pushAudioStopWatch.Reset();
_pushAudioStopWatch.Start();
var audioBufferPtr = Output.AudioCapturerBuffer();
FillBuffer(_sampleBuffer);
Marshal.Copy(_sampleBuffer, 0, (IntPtr)audioBufferPtr, _sampleBuffer.Length);
Output.PushAudio();
_pushTimer.Interval = Time - _pushAudioStopWatch.ElapsedMilliseconds;
_pushAudioStopWatch.Stop();
DIX.Log.WriteLine("Push audio took: {0}ms", _pushAudioStopWatch.ElapsedMilliseconds);
}
}
This is the implementation of the c++ part.
Regarding to the documentation on msdn I created a XAudio2 device and added the MasterVoice and SourceVoice. The buffer is empty at first because the c# part is responsible to push in the audio data.
namespace Audio
{
using namespace System;
template <class T> void SafeRelease(T **ppT)
{
if (*ppT)
{
(*ppT)->Release();
*ppT = NULL;
}
}
WAVEFORMATEXTENSIBLE wFormat;
XAUDIO2_BUFFER buffer = { 0 };
IXAudio2* pXAudio2 = NULL;
IXAudio2MasteringVoice* pMasterVoice = NULL;
IXAudio2SourceVoice* pSourceVoice = NULL;
WaveOut::WaveOut(int bufferSize)
{
audioBuffer = new Byte[bufferSize];
wFormat.Format.wFormatTag = WAVE_FORMAT_PCM;
wFormat.Format.nChannels = 2;
wFormat.Format.nSamplesPerSec = 44100;
wFormat.Format.wBitsPerSample = 16;
wFormat.Format.nBlockAlign = (wFormat.Format.nChannels * wFormat.Format.wBitsPerSample) / 8;
wFormat.Format.nAvgBytesPerSec = wFormat.Format.nSamplesPerSec * wFormat.Format.nBlockAlign;
wFormat.Format.cbSize = 0;
wFormat.SubFormat = KSDATAFORMAT_SUBTYPE_PCM;
HRESULT hr = XAudio2Create(&pXAudio2, 0, XAUDIO2_DEFAULT_PROCESSOR);
if (SUCCEEDED(hr))
{
hr = pXAudio2->CreateMasteringVoice(&pMasterVoice);
}
if (SUCCEEDED(hr))
{
hr = pXAudio2->CreateSourceVoice(&pSourceVoice, (WAVEFORMATEX*)&wFormat,
0, XAUDIO2_DEFAULT_FREQ_RATIO, NULL, NULL, NULL);
}
buffer.pAudioData = (BYTE*)audioBuffer;
buffer.AudioBytes = bufferSize;
buffer.Flags = 0;
if (SUCCEEDED(hr))
{
hr = pSourceVoice->Start(0);
}
}
WaveOut::~WaveOut()
{
}
WaveOut^ WaveOut::CreateWaveOut(int bufferSize)
{
return gcnew WaveOut(bufferSize);
}
uint8_t* WaveOut::AudioCapturerBuffer()
{
if (!audioBuffer)
{
throw gcnew Exception("Audio buffer is not initialized. Did you forget to set up the audio container?");
}
return (BYTE*)audioBuffer;
}
int WaveOut::PushAudio()
{
HRESULT hr = pSourceVoice->SubmitSourceBuffer(&buffer);
if (FAILED(hr))
{
return -1;
}
return 0;
}
}
The problem I am facing is that I always have some cracking in the output. I tried to increase the interval of the timer or increased the buffer size a bit. Everytime the same result.
What am I doing wrong?
Update:
I created 3 buffers the XAudio engine can go through. The cracking got away. The missing part now is to fill the buffers at the right time from the c# part to avoid buffers with the same data.
void Render(void* param)
{
std::vector<byte> audioBuffers[BUFFER_COUNT];
size_t currentBuffer = 0;
// Get the current state of the source voice
while (BackgroundThreadRunning && pSourceVoice)
{
if (pSourceVoice)
{
pSourceVoice->GetState(&state);
}
while (state.BuffersQueued < BUFFER_COUNT)
{
std::vector<byte> resultData;
resultData.resize(DATA_SIZE);
CopyMemory(&resultData[0], pAudioBuffer, DATA_SIZE);
// Retreive the next buffer to stream from MF Music Streamer
audioBuffers[currentBuffer] = resultData;
// Submit the new buffer
XAUDIO2_BUFFER buf = { 0 };
buf.AudioBytes = static_cast<UINT32>(audioBuffers[currentBuffer].size());
buf.pAudioData = &audioBuffers[currentBuffer][0];
pSourceVoice->SubmitSourceBuffer(&buf);
// Advance the buffer index
currentBuffer = ++currentBuffer % BUFFER_COUNT;
// Get the updated state
pSourceVoice->GetState(&state);
}
Sleep(30);
}
}
XAudio2 does not copy the source data buffer at the time you submit it via SubmitSourceBuffer. You must keep that data (which is in your application memory) valid, and the buffer allocated for the entire time that XAudio2 will need to read out of it to process the data. This is done for efficiency to avoid the need for an extra copy, but puts the multi-threaded burden of keeping the memory available until it's done playing on you. That also means you can't modify the playing buffer.
Your current code is just reusing the same buffer which is causing the popping as you change the data while it's play. You can solve this with having 2 or three buffers you rotate between. A XAudio2 Source Voice has status information you can use to determine when it's done playing a buffer, or you can register for explicit callbacks which tell you when the buffer is no longer being used.
See DirectX Tool Kit for Audio and classic XAudio2 samples for examples of using XAudio2.

Video Capture output always in 320x240 despite changing resolution

Ok I have been at this for 2 days and need help with this last part.
I have a Microsoft LifeCam Cinema camera and I use the .NET DirectShowLib to capture the video stream. Well actually I use WPFMediaKit, but I am in the source code of that dealing directly with the direct show library now.
What I have working is:
- View the video output of the camera
- Record the video output of the camera in ASF or AVI (the only 2 MediaType's supported with ICaptureGraphBuilder2)
The problem is: I can save it as a .avi. This works fine and at a resolution of 1280x720 but it saves the file in RAW output. Meaning it is about 50-60MB per second. Way too high.
Or I can switch it to .asf and it outputs a WMV, but when I do this the capture and the output both go to resolution 320x240.
In WPFMediaKit there is a function I changed because apparently with Microsoft LifeCam Cinema cameras a lot of people have this problem. So instead of creating or changing the AMMediaType you iterate through and then use that to call SetFormat.
///* Make the VIDEOINFOHEADER 'readable' */
var videoInfo = new VideoInfoHeader();
int iCount = 0, iSize = 0;
videoStreamConfig.GetNumberOfCapabilities(out iCount, out iSize);
IntPtr TaskMemPointer = Marshal.AllocCoTaskMem(iSize);
AMMediaType pmtConfig = null;
for (int iFormat = 0; iFormat < iCount; iFormat++)
{
IntPtr ptr = IntPtr.Zero;
videoStreamConfig.GetStreamCaps(iFormat, out pmtConfig, TaskMemPointer);
videoInfo = (VideoInfoHeader)Marshal.PtrToStructure(pmtConfig.formatPtr, typeof(VideoInfoHeader));
if (videoInfo.BmiHeader.Width == DesiredWidth && videoInfo.BmiHeader.Height == DesiredHeight)
{
///* Setup the VIDEOINFOHEADER with the parameters we want */
videoInfo.AvgTimePerFrame = DSHOW_ONE_SECOND_UNIT / FPS;
if (mediaSubType != Guid.Empty)
{
int fourCC = 0;
byte[] b = mediaSubType.ToByteArray();
fourCC = b[0];
fourCC |= b[1] << 8;
fourCC |= b[2] << 16;
fourCC |= b[3] << 24;
videoInfo.BmiHeader.Compression = fourCC;
// pmtConfig.subType = mediaSubType;
}
/* Copy the data back to unmanaged memory */
Marshal.StructureToPtr(videoInfo, pmtConfig.formatPtr, true);
hr = videoStreamConfig.SetFormat(pmtConfig);
break;
}
}
/* Free memory */
Marshal.FreeCoTaskMem(TaskMemPointer);
DsUtils.FreeAMMediaType(pmtConfig);
if (hr < 0)
return false;
return true;
When that was implemented I could finally view the captured video as 1280x720 as long as I set the SetOutputFilename to a MediaType.Avi.
If I set it to a MediaType.Asf it goes to 320x240 and the output is the same.
Or the AVI works and outputs in the correct format but does so in RAW video, hence a very large file size. I have attempted to add a compressor to the graph but with no luck, this is far out of my experience.
I am looking for 1 of 2 answers.
Recording the ASF at 1280x720
Adding a compressor to the graph so that the filesize of my outputted AVI is small.
I figured this out. So I am posting it here for any other poor soul who passes by wondering why it doesn't work.
Download the source of the WPFMediaKit, you are going to need to change some code.
Go to Folder DirectShow > MediaPlayers and open up VideoCapturePlayer.cs
Find the function SetVideoCaptureParameters and replace it with this:
/// <summary>
/// Sets the capture parameters for the video capture device
/// </summary>
private bool SetVideoCaptureParameters(ICaptureGraphBuilder2 capGraph, IBaseFilter captureFilter, Guid mediaSubType)
{
/* The stream config interface */
object streamConfig;
/* Get the stream's configuration interface */
int hr = capGraph.FindInterface(PinCategory.Capture,
MediaType.Video,
captureFilter,
typeof(IAMStreamConfig).GUID,
out streamConfig);
DsError.ThrowExceptionForHR(hr);
var videoStreamConfig = streamConfig as IAMStreamConfig;
/* If QueryInterface fails... */
if (videoStreamConfig == null)
{
throw new Exception("Failed to get IAMStreamConfig");
}
///* Make the VIDEOINFOHEADER 'readable' */
var videoInfo = new VideoInfoHeader();
int iCount = 0, iSize = 0;
videoStreamConfig.GetNumberOfCapabilities(out iCount, out iSize);
IntPtr TaskMemPointer = Marshal.AllocCoTaskMem(iSize);
AMMediaType pmtConfig = null;
for (int iFormat = 0; iFormat < iCount; iFormat++)
{
IntPtr ptr = IntPtr.Zero;
videoStreamConfig.GetStreamCaps(iFormat, out pmtConfig, TaskMemPointer);
videoInfo = (VideoInfoHeader)Marshal.PtrToStructure(pmtConfig.formatPtr, typeof(VideoInfoHeader));
if (videoInfo.BmiHeader.Width == DesiredWidth && videoInfo.BmiHeader.Height == DesiredHeight)
{
///* Setup the VIDEOINFOHEADER with the parameters we want */
videoInfo.AvgTimePerFrame = DSHOW_ONE_SECOND_UNIT / FPS;
if (mediaSubType != Guid.Empty)
{
int fourCC = 0;
byte[] b = mediaSubType.ToByteArray();
fourCC = b[0];
fourCC |= b[1] << 8;
fourCC |= b[2] << 16;
fourCC |= b[3] << 24;
videoInfo.BmiHeader.Compression = fourCC;
// pmtConfig.subType = mediaSubType;
}
/* Copy the data back to unmanaged memory */
Marshal.StructureToPtr(videoInfo, pmtConfig.formatPtr, true);
hr = videoStreamConfig.SetFormat(pmtConfig);
break;
}
}
/* Free memory */
Marshal.FreeCoTaskMem(TaskMemPointer);
DsUtils.FreeAMMediaType(pmtConfig);
if (hr < 0)
return false;
return true;
}
Now that will sort out your screen display at what ever desired resolution you want, provided that your camera supports it.
Next you will soon figure out that this new correct capture you have isnt applied when writing the video to disk.
Since the ICaptureBuilder2 method only supports Avi and Asf (which is wmv) you need to set your mediatype to one of them.
hr = graphBuilder.SetOutputFileName(MediaSubType.Asf, this.m_fileName, out mux, out sink);
You will find that line in the SetupGraph function.
Asf will only output in 320x240, yet the Avi will output in the desired resolution, but uncompressed (meaning 50-60MB per second for a 1280x720 video feed), which is too high.
So that leaves you with 2 options
Figure out how to add a encoder (compression filter) to the Avi output
Figure out how to change the WMV profile
I tried 1, with no success. Mainly due to the fact this is my first time working with DirectShow and only just grasp the meaning of graphs.
But I was successful with #2 and here is how I did it.
Special thanks to (http://www.codeproject.com/KB/audio-video/videosav.aspx) I pulled out the needed code from here.
Create a new class in the same folder called WMLib.cs and place the following in it
Download the demo project from http://www.codeproject.com/KB/audio-video/videosav.aspx and copy and paste the WMLib.cs into your project (change the namespace as necessary)
Create a function in the VideoCapturePlayer.cs class
/// <summary>
/// Configure profile from file to Asf file writer
/// </summary>
/// <param name="asfWriter"></param>
/// <param name="filename"></param>
/// <returns></returns>
public bool ConfigProfileFromFile(IBaseFilter asfWriter, string filename)
{
int hr;
//string profilePath = "test.prx";
// Set the profile to be used for conversion
if ((filename != null) && (File.Exists(filename)))
{
// Load the profile XML contents
string profileData;
using (StreamReader reader = new StreamReader(File.OpenRead(filename)))
{
profileData = reader.ReadToEnd();
}
// Create an appropriate IWMProfile from the data
// Open the profile manager
IWMProfileManager profileManager;
IWMProfile wmProfile = null;
hr = WMLib.WMCreateProfileManager(out profileManager);
if (hr >= 0)
{
// error message: The profile is invalid (0xC00D0BC6)
// E.g. no <prx> tags
hr = profileManager.LoadProfileByData(profileData, out wmProfile);
}
if (profileManager != null)
{
Marshal.ReleaseComObject(profileManager);
profileManager = null;
}
// Config only if there is a profile retrieved
if (hr >= 0)
{
// Set the profile on the writer
IConfigAsfWriter configWriter = (IConfigAsfWriter)asfWriter;
hr = configWriter.ConfigureFilterUsingProfile(wmProfile);
if (hr >= 0)
{
return true;
}
}
}
return false;
}
In the SetupGraph function find the SetOutputFileName and below it put
ConfigProfileFromFile(mux, "c:\wmv.prx");
Now create a file called wmv.prx on your c: drive and place the relevant information in it.
You can see a sample of a PRX file from the demo project here: http://www.codeproject.com/KB/audio-video/videosav.aspx (Pal90.prx)
And now enjoy your .wmv file outputted at the right size.
Yes I know the code I placed in was rather scrappy but I will leave it up to you to polish it up.
Lifecam is known for unobvious behavior with setting capture format (more specifically, falling back to other formats). See prevoius discussions which are likely to suggest you a solution:
Can't make IAMStreamConfig.SetFormat() to work with LifeCam Studio
IAMStreamConfig settings are ignored by stream - Microsoft LifeCam Cinema records only on 640x480
Can Microsoft LifeCam record at a frame rates lower than 30 fps?

Audio generation software or .NET library

I need to be able to play certain tones in a c# application. I don't care if it generates them on the fly or if it plays them from a file, but I just need SOME way to generate tones that have not only variable volume and frequency, but variable timbre. It would be especially helpful if whatever I used to generate these tones would have many timbre pre-sets, and it would be even more awesome if these timbres didn't all sound midi-ish (meaning some of them sounded like the might have been recordings of actual instruments).
Any suggestions?
You might like to take a look at my question Creating sine or square wave in C#
Using NAudio in particular was a great choice
This article helped me with something similar:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/18fe83f0-5658-4bcf-bafc-2e02e187eb80/beep-beep
The part in particular is the Beep Class:
public class Beep
{
public static void BeepBeep(int Amplitude, int Frequency, int Duration)
{
double A = ((Amplitude * (System.Math.Pow(2, 15))) / 1000) - 1;
double DeltaFT = 2 * Math.PI * Frequency / 44100.0;
int Samples = 441 * Duration / 10;
int Bytes = Samples * 4;
int[] Hdr = {0X46464952, 36 + Bytes, 0X45564157, 0X20746D66, 16, 0X20001, 44100, 176400, 0X100004, 0X61746164, Bytes};
using (MemoryStream MS = new MemoryStream(44 + Bytes))
{
using (BinaryWriter BW = new BinaryWriter(MS))
{
for (int I = 0; I < Hdr.Length; I++)
{
BW.Write(Hdr[I]);
}
for (int T = 0; T < Samples; T++)
{
short Sample = System.Convert.ToInt16(A * Math.Sin(DeltaFT * T));
BW.Write(Sample);
BW.Write(Sample);
}
BW.Flush();
MS.Seek(0, SeekOrigin.Begin);
using (SoundPlayer SP = new SoundPlayer(MS))
{
SP.PlaySync();
}
}
}
}
}
It can be used as follows
Beep.BeepBeep(100, 1000, 1000); /* 10% volume */
There's a popular article on CodeProject along these lines:
http://www.codeproject.com/KB/audio-video/CS_ToneGenerator.aspx
You might also check out this thread:
http://episteme.arstechnica.com/eve/forums/a/tpc/f/6330927813/m/197000149731
In order for your generated tones not to sound 'midi-ish', you'll have to use real-life samples and play them back. Try to find some good real instrument sample bank, like http://www.sampleswap.org/filebrowser-new.php?d=INSTRUMENTS+single+samples%2F
Then, when you want to compose melody from them, just alternate playback frequency relative to the original sample frequency.
Please drop me a line if you find this answer usefull.

Categories

Resources