The job: Create an application which can auto-crop the black borders off images in batch runs. Images vary in quality from 100-300 dpi and 1-24 bpp, and a batch can vary from 10 to 10,000 images.
The plan: Convert the image to 1 bpp (bitonal, black/white) if it isn't already, and after "cleaning up" white spots/dirt/noise, find where the black ends and the white begins; these are the new coordinates for the image crop. Apply them to a clone of the original image, delete the old image, and save the new one.
The progress: All of the above is done, and works, but...
The problem: When converting to 1 bpp I have no control over a "threshold" value. I need this. A lot of dark images get cropped too much.
The tries: I've tried
Bitmap imgBitonal = imgOriginal.Clone(new Rectangle(0, 0, imgOriginal.Width, imgOriginal.Height), PixelFormat.Format1bppIndexed)
And also this. Both work, but neither seems to give me the possibility to manually set a threshold value. I need the user to be able to set this value, amongst others, and use my "preview" function before running the batch so as to see if the settings are any good.
The cry: I'm at a loss here. I don't know what to do or how to do it. Please help a fellow coder out. Point me in a direction, show me where in the code found in the link a threshold value appears (I haven't found one, or don't know where to look), or just give me some code that works. Any help is appreciated.
Try this very fast 1bpp convert, duplicated from here: Convert 24bpp Bitmap to 1bpp
private static unsafe void Convert(Bitmap src, Bitmap conv)
{
// NOTE: assumes a 32bpp source (4 bytes per pixel) and a 1bpp destination
// Lock source and destination in memory for unsafe access
var bmbo = src.LockBits(new Rectangle(0, 0, src.Width, src.Height), ImageLockMode.ReadOnly,
src.PixelFormat);
var bmdn = conv.LockBits(new Rectangle(0, 0, conv.Width, conv.Height), ImageLockMode.ReadWrite,
conv.PixelFormat);
var srcStride = bmbo.Stride;
var convStride = bmdn.Stride;
byte* sourcePixels = (byte*)(void*)bmbo.Scan0;
byte* destPixels = (byte*)(void*)bmdn.Scan0;
for (int y = 0; y < src.Height; y++)
{
// start-of-line indexes for source/destination
var srcLineIdx = y * srcStride;
var convLineIdx = y * convStride;
for (int x = 0; x < src.Width; x++)
{
// index for source pixel (32bpp, bytes in b, g, r, a order)
var srcIdx = srcLineIdx + x * 4;
// any pixel that is not pure black becomes white in the destination
if (!(sourcePixels[srcIdx] == 0 && sourcePixels[srcIdx + 1] == 0 && sourcePixels[srcIdx + 2] == 0))
{
// destination byte for pixel (1bpp, i.e. 8 pixels per byte)
var idx = convLineIdx + (x >> 3);
// set the pixel's bit in the destination byte
destPixels[idx] |= (byte)(0x80 >> (x & 0x7));
}
}
}
src.UnlockBits(bmbo);
conv.UnlockBits(bmdn);
}
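Note that this routine still has no threshold either: any non-black pixel becomes white. As a minimal sketch of the thresholded variant the question asks for (my own adaptation, not part of the linked answer; threshold is the user-supplied 0-255 value, and the 299/587/114 weights are the usual Rec. 601 luma coefficients):
private static unsafe void ConvertWithThreshold(Bitmap src, Bitmap conv, byte threshold)
{
var bmbo = src.LockBits(new Rectangle(0, 0, src.Width, src.Height), ImageLockMode.ReadOnly, src.PixelFormat);
var bmdn = conv.LockBits(new Rectangle(0, 0, conv.Width, conv.Height), ImageLockMode.ReadWrite, conv.PixelFormat);
byte* srcPixels = (byte*)(void*)bmbo.Scan0;
byte* dstPixels = (byte*)(void*)bmdn.Scan0;
for (int y = 0; y < src.Height; y++)
{
for (int x = 0; x < src.Width; x++)
{
// 32bpp source assumed; bytes are in b, g, r, a order
byte* p = srcPixels + y * bmbo.Stride + x * 4;
int luma = (299 * p[2] + 587 * p[1] + 114 * p[0]) / 1000;
// pixels at or above the threshold become white (bit set)
if (luma >= threshold)
dstPixels[y * bmdn.Stride + (x >> 3)] |= (byte)(0x80 >> (x & 0x7));
}
}
src.UnlockBits(bmbo);
conv.UnlockBits(bmdn);
}
Wiring threshold up to your preview function then gives the user the knob the batch run needs.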
I simply want to convert a previously loaded BMP file into a byte[][]. It works pretty well for my own test images (just some black spots on a white background), which are all in the 8 bits per pixel format.
Now I tried the same code on some bitmaps somebody gave me (also black squares and rectangles on a white background), but it's not working as I expected:
I expected a white pixel to be represented by a value of 255 (and black just by 0) in the resulting array, but I found different values there. In one case, pixels that are supposed to be white end up with a value of 1 in the array.
Again, all these files are of 8-bit color depth.
Also, I noticed that when I open the images in Paint and save them again as a 256-color bitmap, it works.
So my questions are:
What is causing this problem? (Do color palettes maybe play a role?)
And how can I make it work?
Here's my amateurish code:
public byte[][] ConvertImageToArray(BitmapSource Image)
{
byte[][] Result = null;
if (Image != null)
{
int Index = 0;
// total byte count: width * height * bytes per pixel
int size = Image.PixelWidth * Image.PixelHeight * Image.Format.BitsPerPixel / 8;
byte[] RawImg = new byte[size];
BitmapPalette test = Image.Palette;
// bytes per row; for 8bpp this is just the pixel width
int stride = (Image.PixelWidth * Image.Format.BitsPerPixel) / 8;
Image.CopyPixels(RawImg, stride, 0);
Result = new byte[Image.PixelHeight][];
int Width = Image.PixelWidth;
// copy the flat pixel buffer into a jagged row-major array
for (int i = 0; i < Result.Length; i++)
{
Result[i] = new byte[Width];
for (int k = 0; k < Result[i].Length; k++)
{
Result[i][k] = RawImg[Index++];
}
}
}
return Result;
}
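For what it's worth, color palettes do play a role here: an 8bpp indexed bitmap stores palette indices, not gray values, so a "white" pixel can legitimately come out as 1 if palette entry 1 happens to be white; re-saving from Paint just writes a standard palette where index 255 is white. A minimal sketch of mapping indices back to gray levels through Image.Palette (the helper name is my own, for illustration):
// Hypothetical helper: translate a palette index into a gray level (0-255),
// assuming the BitmapSource really is indexed and has a palette.
private static byte IndexToGray(BitmapSource image, byte index)
{
System.Windows.Media.Color c = image.Palette.Colors[index];
return (byte)((c.R + c.G + c.B) / 3); // simple average as the gray value
}
Running each byte of the jagged array through a lookup like this should give the 0/255 values the code above expects.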
I am using this article to solve captchas. It works by removing the background from the image using AForge, and then applying Tesseract OCR to the resulting cleaned image.
The problem is, it currently relies on the letters being black, and since each captcha has a different text color, I need to either pass the color to the image cleaner, or change the color of the letters to black. To do either one, I need to know what the existing color of the letters is.
How might I go about identifying the color of the letters?
Using the answer by @Robert Harvey, I went and developed the same code using LockBits and unsafe methods to improve its speed. You must compile with the "Allow unsafe code" flag on. Note that the order of pixels returned from the image is bgr, not rgb, and I am locking the bitmap using Format24bppRgb to force it to use 3 bytes per pixel.
public unsafe Color GetTextColour(Bitmap bitmap)
{
BitmapData bitmapData = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height), ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
try
{
const int bytesPerPixel = 3;
const int red = 2;
const int green = 1;
int halfHeight = bitmap.Height / 2;
byte* row = (byte*)bitmapData.Scan0 + (halfHeight * bitmapData.Stride);
Color startingColour = Color.FromArgb(row[red], row[green], row[0]);
for (int wi = bytesPerPixel, wc = bitmapData.Width * bytesPerPixel; wi < wc; wi += bytesPerPixel)
{
Color thisColour = Color.FromArgb(row[wi + red], row[wi + green], row[wi]);
if (thisColour != startingColour)
{
return thisColour;
}
}
return Color.Empty; //Or some other default value
}
finally
{
bitmap.UnlockBits(bitmapData);
}
}
The solution to this particular problem turned out to be relatively simple. All I had to do was get the color of the edge pixel halfway down the left side of the image, scan pixels to the right until the color changes, and that's the color of the first letter.
public Color GetTextColor(Bitmap bitmap)
{
var y = bitmap.Height/2;
var startingColor = bitmap.GetPixel(0, y);
for (int x = 1; x < bitmap.Width; x++)
{
var thisColor = bitmap.GetPixel(x, y);
if (thisColor != startingColor)
return thisColor;
}
return Color.Empty; // Color is a value type, so it cannot be null
}
I need a way to convert 1000+ 8-bit bitmaps into eight 1-bit bitmaps.
Currently I am running two loops which read each pixel from the main image and assign it to a 1bpp image. It takes a very long time to accomplish this; is there a better way to do it?
Here is an example of my code (it separates into only two images):
Bitmap rawBMP = new Bitmap(path);
Bitmap supportRAW = new Bitmap(rawBMP.Width, rawBMP.Height);
Bitmap modelRAW = new Bitmap(rawBMP.Width, rawBMP.Height);
Color color = new Color();
for (int x = 0; x < rawBMP.Width; x++)
{
for (int y = 0; y < rawBMP.Height; y++)
{
color = rawBMP.GetPixel(x, y);
if (color.R == 166) //model
{
modelRAW.SetPixel(x, y, Color.White);
}
if (color.R == 249) //Support
{
supportRAW.SetPixel(x, y, Color.White);
}
}
}
var supportBMP = supportRAW.Clone(new Rectangle(0, 0, rawBMP.Width, rawBMP.Height), System.Drawing.Imaging.PixelFormat.Format1bppIndexed);
var modelBMP = modelRAW.Clone(new Rectangle(0, 0, rawBMP.Width, rawBMP.Height), System.Drawing.Imaging.PixelFormat.Format1bppIndexed);
If you have to check every pixel then you are going to have to loop through them all at least once; however, as TaW suggests, there are more efficient ways to access pixels.
SetPixel and GetPixel are much slower than accessing the data directly. Look at the use of unsafe code to get direct access to the data, or marshaling to copy the data back and forth
(see https://stackoverflow.com/a/1563170 for a more detailed answer written by notJim).
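As a minimal sketch of the marshaling approach for the splitting job above (my own illustration, not notJim's code): it assumes the source really is Format8bppIndexed with a grayscale palette, so the raw bytes are the 166/249 values being tested, and it packs bits straight into a Format1bppIndexed destination, whose default palette is black/white:
static Bitmap SplitLayer(Bitmap src, byte match)
{
var rect = new Rectangle(0, 0, src.Width, src.Height);
var dst = new Bitmap(src.Width, src.Height, PixelFormat.Format1bppIndexed);
var srcData = src.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format8bppIndexed);
var dstData = dst.LockBits(rect, ImageLockMode.WriteOnly, PixelFormat.Format1bppIndexed);
var srcRow = new byte[srcData.Stride];
var dstRow = new byte[dstData.Stride];
for (int y = 0; y < src.Height; y++)
{
// copy one row out, build the packed 1bpp row, copy it back in
Marshal.Copy(IntPtr.Add(srcData.Scan0, y * srcData.Stride), srcRow, 0, srcRow.Length);
Array.Clear(dstRow, 0, dstRow.Length);
for (int x = 0; x < src.Width; x++)
if (srcRow[x] == match)
dstRow[x >> 3] |= (byte)(0x80 >> (x & 7)); // set the bit -> white pixel
Marshal.Copy(dstRow, 0, IntPtr.Add(dstData.Scan0, y * dstData.Stride), dstRow.Length);
}
src.UnlockBits(srcData);
dst.UnlockBits(dstData);
return dst;
}
Called as SplitLayer(rawBMP, 166) and SplitLayer(rawBMP, 249), this touches each row once per layer and avoids GetPixel/SetPixel entirely.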
I know how to do this in WPF, but I am having problems capturing depth in a WinForms application.
I found some code, shown below:
private void Kinect_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
{
if (depthFrame != null)
{
Bitmap DepthBitmap = new Bitmap(depthFrame.Width, depthFrame.Height, PixelFormat.Format32bppRgb);
if (_depthPixels.Length != depthFrame.PixelDataLength)
{
_depthPixels = new DepthImagePixel[depthFrame.PixelDataLength];
_mappedDepthLocations = new ColorImagePoint[depthFrame.PixelDataLength];
}
//Copy the depth frame data onto the bitmap
var _pixelData = new short[depthFrame.PixelDataLength];
depthFrame.CopyPixelDataTo(_pixelData);
BitmapData bmapdata = DepthBitmap.LockBits(new Rectangle(0, 0, depthFrame.Width,
depthFrame.Height), ImageLockMode.WriteOnly, DepthBitmap.PixelFormat);
IntPtr ptr = bmapdata.Scan0;
Marshal.Copy(_pixelData, 0, ptr, depthFrame.Width * depthFrame.Height);
DepthBitmap.UnlockBits(bmapdata);
pictureBox2.Image = DepthBitmap;
}
}
}
but this is not giving me the greyscale depth; the image comes out purple. Any improvement or help?
I found the solution myself, with a function that converts the depth frame:
void Kinect_DepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
{
if (depthFrame != null)
{
this.depthFrame32 = new byte[depthFrame.Width * depthFrame.Height * 4];
//Update the image to the new format
this.depthPixelData = new short[depthFrame.PixelDataLength];
depthFrame.CopyPixelDataTo(this.depthPixelData);
byte[] convertedDepthBits = this.ConvertDepthFrame(this.depthPixelData, ((KinectSensor)sender).DepthStream);
Bitmap bmap = new Bitmap(depthFrame.Width, depthFrame.Height, PixelFormat.Format32bppRgb);
BitmapData bmapdata = bmap.LockBits(new Rectangle(0, 0, depthFrame.Width, depthFrame.Height), ImageLockMode.WriteOnly, bmap.PixelFormat);
IntPtr ptr = bmapdata.Scan0;
Marshal.Copy(convertedDepthBits, 0, ptr, 4 * depthFrame.PixelDataLength);
bmap.UnlockBits(bmapdata);
pictureBox2.Image = bmap;
}
}
}
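// Note: the fields and index constants used below are assumed to be declared
// elsewhere in the class, for example:
// private byte[] depthFrame32;
// private short[] depthPixelData;
// private const int RedIndex = 2, GreenIndex = 1, BlueIndex = 0;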
private byte[] ConvertDepthFrame(short[] depthFrame, DepthImageStream depthStream)
{
//Run through the depth frame making the correlation between the two arrays
for (int i16 = 0, i32 = 0; i16 < depthFrame.Length && i32 < this.depthFrame32.Length; i16++, i32 += 4)
{
// Console.WriteLine(i16 + "," + i32);
//We don't care about the player's information here, so we are just going to rule it out by shifting the value.
int realDepth = depthFrame[i16] >> DepthImageFrame.PlayerIndexBitmaskWidth;
//We are left with 13 bits of depth information that we need to convert into an 8 bit number for each pixel.
//There are hundreds of ways to do this. This is just the simplest one.
//Lets create a byte variable called Distance.
//We will assign this variable a number that will come from the conversion of those 13 bits.
byte Distance = 0;
//XBox Kinects (default) are limited between 800mm and 4096mm.
int MinimumDistance = 800;
int MaximumDistance = 4096;
//XBox Kinects (default) are not reliable closer to 800mm, so let’s take those useless measurements out.
//If the distance on this pixel is bigger than 800mm, we will paint it in its equivalent gray
if (realDepth > MinimumDistance)
{
//Convert the realDepth into the 0 to 255 range for our actual distance.
//Use only one of the following Distance assignments
//White = Far
//Black = Close
//Distance = (byte)(((realDepth - MinimumDistance) * 255 / (MaximumDistance - MinimumDistance)));
//White = Close
//Black = Far
Distance = (byte)(255 - ((realDepth - MinimumDistance) * 255 / (MaximumDistance - MinimumDistance)));
//Use the distance to paint each layer (R, G & B) of the current pixel.
//Painting R, G and B with the same color will make it go from black to gray
this.depthFrame32[i32 + RedIndex] = (byte)(Distance);
this.depthFrame32[i32 + GreenIndex] = (byte)(Distance);
this.depthFrame32[i32 + BlueIndex] = (byte)(Distance);
}
//If we are closer than 800mm, just paint the pixel black so we know it is not giving a good value
else
{
this.depthFrame32[i32 + RedIndex] = 0;
this.depthFrame32[i32 + GreenIndex] = 0;
this.depthFrame32[i32 + BlueIndex] = 0;
}
}
return this.depthFrame32;
}
So I presume the RGB frame is working for you; in that case:
First, to enable the depth camera you need to call:
sensor->NuiInitialize(NUI_INITIALIZE_FLAG_USES_DEPTH | /* all the other stuff you use */);
Second, to start streaming you need to call:
if (int(streams&_Kinect_zed)) ret=sensor->NuiImageStreamOpen(
NUI_IMAGE_TYPE_DEPTH, // Depth camera or rgb camera?
NUI_IMAGE_RESOLUTION_640x480, // Image resolution
NUI_IMAGE_STREAM_FLAG_DISTINCT_OVERFLOW_DEPTH_VALUES, // Image stream flags // NUI_IMAGE_STREAM_FLAG_ENABLE_NEAR_MODE does not work !!!
2, // Number of frames to buffer
NULL, // Event handle
&stream_hzed); else stream_hzed=NULL;
Beware: not all resolution/flag combinations work on all models of Kinect!!!
The one above is safe even for the older models like mine.
This is how I capture a frame (called repeatedly from a timer or thread loop):
ret=sensor->NuiImageStreamGetNextFrame(stream_hzed,0,&imageFrame); if (ret>=0)
{
// copy data from frame
imageFrame.pFrameTexture->LockRect(0, &LockedRect, NULL, 0);
if (LockedRect.Pitch!=0)
{
const BYTE* curr = (const BYTE*) LockedRect.pBits;
union _col { BYTE u8[2]; WORD u16; } col;
col.u16=0;
pnt3d p;
long ax,ay;
float mxs=float(xs)/(62.0*deg),mys=float(ys)/(48.6*deg);
for(int x=0,y=0;;)
{
col.u8[0]=*curr; curr++;
col.u8[1]=*curr; curr++;
p.raw=col.u16;
p.rgb=&rgb_default;
if (p.raw==0x0000) p.z=0.0; // p.z is the perpendicular distance from the sensor (the Kinect corrects for this itself)
else if (p.raw>=0x8000) p.z=4.0;
else p.z=0.8+(float(p.raw-6576)*0.00012115165336374002280501710376283);
// depth FOV correction
p.x=zx[x]*p.z;
p.y=zy[y]*p.z;
// color FOV correction zed 58.5° x 45.6° | rgb 62.0° x 48.6° | 25mm distance
if (p.z>0.0)
{
ax=(((x+10-xs2)*241)>>8)+xs2; // cameras x-offset and different FOV
ay=(((y+30-ys2)*240)>>8)+ys2; // cameras y-offset??? and different FOV
if ((ax>=0)&&(ax<xs))
if ((ay>=0)&&(ay<ys)) p.rgb=&rgb[ay][ax];
}
xyz[y][x]=p;
x++; if (x>=xs) { x=0; y++; if (y>=ys) break; }
}
}
// release frame
imageFrame.pFrameTexture->UnlockRect(0);
ret=sensor->NuiImageStreamReleaseFrame(stream_hzed, &imageFrame);
stream_changed|=_Kinect_zed;
}
Sorry for the incomplete source code ...
- it is all copy-pasted from my Kinect class (BDS2006 Turbo C++)
- so you need to check your code in case you forgot something
- and if so, then transform my code to C# (I am not a C# user)
- most likely you forgot to NuiInitialize with the depth flag
- or you set an invalid resolution/flags/precision or framerate for your HW
If nothing works at all, then you need to initialize the sensor in the first place:
int sensors;
INuiSensor *sensor;
if ((NUIGetSensorCount(&sensors)<0)||(sensors<1)) return false;
if (NUICreateSensorByIndex(0,&sensor)<0) return false;
If you link to the DLL on your own, then link only these functions:
typedef HRESULT(__stdcall *_NuiGetSensorCount )(int * pCount); _NuiGetSensorCount NUIGetSensorCount =NULL;
typedef HRESULT(__stdcall *_NuiCreateSensorByIndex)(int index,INuiSensor **ppNuiSensor); _NuiCreateSensorByIndex NUICreateSensorByIndex=NULL;
Every other function must be obtained via COM from inside the SDK headers!!!
If you link to and use them on your own, then you will not be connected to your physical Kinect!!!
Basically, the Kinect SDK is developed for WPF applications. In Windows Forms you have to convert the short array of depth data to a Bitmap to display it in a PictureBox. Based on my experiments, WPF is better for programming with the Kinect.
Below is the function that I used to convert depth frame to Bitmap for showing in picture box.
private Bitmap ImageToBitmap(DepthImageFrame Image)
{
short[] pixeldata = new short[Image.PixelDataLength];
int stride = Image.Width * 2;
Image.CopyPixelDataTo(pixeldata);
Bitmap bmap = new Bitmap(Image.Width, Image.Height, PixelFormat.Format16bppRgb555);
BitmapData bmapdata = bmap.LockBits(new Rectangle(0, 0, Image.Width, Image.Height), ImageLockMode.WriteOnly, bmap.PixelFormat);
IntPtr ptr = bmapdata.Scan0;
Marshal.Copy(pixeldata, 0, ptr, Image.PixelDataLength);
bmap.UnlockBits(bmapdata);
return bmap;
}
You may call it like this:
DepthImageFrame VFrame = e.OpenDepthImageFrame();
if (VFrame == null) return;
Bitmap bmap = ImageToBitmap(VFrame);
I have an image that looks like this:
and I want to find the edges of the dark part so like this (the red lines are what I am looking for):
I have tried a few approaches and none have worked so I am hoping there is an emgu guru out there willing to help me...
Approach 1
Convert the image to grayscale
Remove noise and invert
Remove anything that is not really bright
Get the canny and the polygons
Code for this (I know that I should be disposing of things properly but I am keeping the code short):
var orig = new Image<Bgr, byte>(inFile);
var contours = orig
.Convert<Gray, byte>()
.PyrDown()
.PyrUp()
.Not()
.InRange(new Gray(190), new Gray(255))
.Canny(new Gray(190), new Gray(255))
.FindContours(CHAIN_APPROX_METHOD.CV_CHAIN_APPROX_SIMPLE,
RETR_TYPE.CV_RETR_TREE);
var output = new Image<Gray, byte>(orig.Size);
for (; contours != null; contours = contours.HNext)
{
var poly = contours.ApproxPoly(contours.Perimeter*0.05,
contours.Storage);
output.Draw(poly, new Gray(255), 1);
}
output.Save(outFile);
This is the result:
Approach 2
Convert the image to grayscale
Remove noise and invert
Remove anything that is not really bright
Get the canny and then lines
Code for this:
var orig = new Image<Bgr, byte>(inFile);
var linesegs = orig
.Convert<Gray, byte>()
.PyrDown()
.PyrUp()
.Not()
.InRange(new Gray(190), new Gray(255))
.Canny(new Gray(190), new Gray(255))
.HoughLinesBinary(
1,
Math.PI/45.0,
20,
30,
10
)[0];
var output = new Image<Gray, byte>(orig.Size);
foreach (var l in linesegs)
{
output.Draw(l, new Gray(255), 1);
}
output.Save(outFile);
This is the result:
Notes
I have tried adjusting all the parameters on those two approaches and adding smoothing, but I can never get the simple edges that I need because, I suppose, the darker region is not a solid colour.
I have also tried dilating and eroding but the parameters I have to put in for those are so high to get a single colour that I end up including some of the grey stuff on the right and lose accuracy.
Yes, it's possible, and here is how you could do it:
Change the contrast of the image to make the lighter part disappear:
Then, convert it to HSV to perform a threshold operation on the Saturation channel:
And execute erode & dilate operations to get rid of the noise:
At this point you'll have the result you were looking for. For testing purposes, at the end I execute the bounding box technique to show how to detect the beginning and the end of the area of interest:
I didn't have time to tweak the parameters and make a perfect detection, but I'm sure you can figure it out. This answer provides a roadmap for achieving that!
This is the C++ code I came up with, I trust you are capable of converting it to C#:
#include <cv.h>
#include <highgui.h>
int main(int argc, char* argv[])
{
cv::Mat image = cv::imread(argv[1]);
cv::Mat new_image = cv::Mat::zeros(image.size(), image.type());
/* Change contrast: new_image(i,j) = alpha*image(i,j) + beta */
double alpha = 1.8; // [1.0-3.0]
int beta = 100; // [0-100]
for (int y = 0; y < image.rows; y++)
{
for (int x = 0; x < image.cols; x++)
{
for (int c = 0; c < 3; c++)
{
new_image.at<cv::Vec3b>(y,x)[c] =
cv::saturate_cast<uchar>(alpha * (image.at<cv::Vec3b>(y,x)[c]) + beta);
}
}
}
cv::imshow("contrast", new_image);
/* Convert RGB Mat into HSV color space */
cv::Mat hsv;
cv::cvtColor(new_image, hsv, CV_BGR2HSV);
std::vector<cv::Mat> v;
cv::split(hsv,v);
// Perform threshold on the S channel of hSv
int thres = 15;
cv::threshold(v[1], v[1], thres, 255, cv::THRESH_BINARY_INV);
cv::imshow("saturation", v[1]);
/* Erode & Dilate */
int erosion_size = 6;
cv::Mat element = cv::getStructuringElement(cv::MORPH_CROSS,
cv::Size(2 * erosion_size + 1, 2 * erosion_size + 1),
cv::Point(erosion_size, erosion_size) );
cv::erode(v[1], v[1], element);
cv::dilate(v[1], v[1], element);
cv::imshow("binary", v[1]);
/* Bounding box */
// Invert colors
cv::bitwise_not(v[1], v[1]);
// Store the set of points in the image before assembling the bounding box
std::vector<cv::Point> points;
cv::Mat_<uchar>::iterator it = v[1].begin<uchar>();
cv::Mat_<uchar>::iterator end = v[1].end<uchar>();
for (; it != end; ++it)
{
if (*it) points.push_back(it.pos());
}
// Compute minimal bounding box
cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));
// Draw bounding box in the original image (debug purposes)
cv::Point2f vertices[4];
box.points(vertices);
for (int i = 0; i < 4; ++i)
{
cv::line(image, vertices[i], vertices[(i + 1) % 4], cv::Scalar(0, 255, 0), 2, CV_AA);
}
cv::imshow("box", image);
cvWaitKey(0);
return 0;
}
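Since the question uses EmguCV, here is a rough sketch of the same contrast/HSV-threshold/morphology pipeline in the question's Image<,> style (my translation, assuming EmguCV 2.x method names; the parameters mirror the C++ above and will likely need the same tweaking):
var orig = new Image<Bgr, byte>(inFile);
// contrast change: new = alpha * old + beta
var contrasted = orig.Mul(1.8).Add(new Bgr(100, 100, 100));
// convert to HSV and threshold the Saturation channel
var channels = contrasted.Convert<Hsv, byte>().Split();
var mask = channels[1].ThresholdBinaryInv(new Gray(15), new Gray(255));
// erode & dilate to get rid of the noise
mask = mask.Erode(6).Dilate(6);
mask.Save(outFile);
The bounding box step at the end has an EmguCV counterpart as well (collect the white points and fit a MinAreaRect), which should drop in the same way.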