I have a source image like the left picture and a set of elements like the right picture: [source image and elements omitted]
...and I need to generate a mosaic picture like this: [example mosaic omitted]
But until now I have not worked with images, and I do not know where I should start.
I have worked with C# for several years, but you can give examples in other, similar languages.
The result image you gave is apparently a ministeck pattern - in 2011 they had downloadable software that seemed to do what you want. (It is no longer available from ministeck directly, but it seems that pfci.de still provides a download.)
So, if you're just looking to generate the patterns for ministeck out of a given image, use their software. If you're after an algorithm to achieve something different, this won't help.
EDIT
Ok, if you're after analyzing your image, you need to load it into an object like this:
using (Bitmap b = new Bitmap(yourFileName))
{
    MessageBox.Show(string.Format("image size {0} by {1} pixels", b.Width, b.Height));
    MessageBox.Show(string.Format("color of pixel (100,100) is {0}", b.GetPixel(100, 100).ToString()));
}
The Bitmap object has several properties and methods that will help you to analyze the image content. Try this to get started with analyzing your image, and don't forget to either dispose your bitmap afterwards or wrap it into a using statement as shown above ...
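Beyond loading the bitmap, here is a minimal sketch of one common mosaic approach (not ministeck's algorithm, just an illustration): split the source into fixed-size tiles, average each tile's color, and fill the tile with the closest color from the element palette. The palette, tile size and file names below are assumptions.

using System;
using System.Drawing;

// Hypothetical element palette; replace with the real colors of your elements.
Color[] palette = { Color.Black, Color.White, Color.Red, Color.Green, Color.Blue, Color.Yellow };

Color ClosestElement(Color average)
{
    Color best = palette[0];
    int bestDistance = int.MaxValue;
    foreach (Color c in palette)
    {
        int dr = c.R - average.R, dg = c.G - average.G, db = c.B - average.B;
        int distance = dr * dr + dg * dg + db * db;
        if (distance < bestDistance) { bestDistance = distance; best = c; }
    }
    return best;
}

const int tileSize = 8; // one element covers 8x8 source pixels (assumption)
using (Bitmap source = new Bitmap("source.jpg"))   // hypothetical file name
using (Bitmap mosaic = new Bitmap(source.Width, source.Height))
using (Graphics g = Graphics.FromImage(mosaic))
{
    for (int y = 0; y + tileSize <= source.Height; y += tileSize)
    for (int x = 0; x + tileSize <= source.Width; x += tileSize)
    {
        // Average the tile's pixels (GetPixel is slow; LockBits is the faster route).
        long r = 0, gSum = 0, b = 0;
        for (int ty = 0; ty < tileSize; ty++)
        for (int tx = 0; tx < tileSize; tx++)
        {
            Color p = source.GetPixel(x + tx, y + ty);
            r += p.R; gSum += p.G; b += p.B;
        }
        int n = tileSize * tileSize;
        Color avg = Color.FromArgb((int)(r / n), (int)(gSum / n), (int)(b / n));

        // Fill the tile with the nearest element color.
        using (Brush brush = new SolidBrush(ClosestElement(avg)))
            g.FillRectangle(brush, x, y, tileSize, tileSize);
    }
    mosaic.Save("mosaic.png");
}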
Related
I would like to ask for some insight/assistance on how I might improve my OCR accuracy. My target images are low resolution (screenshots) and I would very much prefer not to upscale them, as my program needs to perform fast.
I have two images. I see no apparent difference between them; however, Tesseract is having trouble with one of them.
[image 1 and image 2 omitted]
The first image is the issue, the result I am getting is: 251\n41\n31\n\n11\n11\n\n11\n
As you can see, there is something wrong with how it's handling the spacing. There are 2x new lines when things start to go wrong.
Meanwhile, in the second image I get the expected result: 300\n60\n40\n\n1\n15\n15\n10\n6\n15\n
These images were created through the following preprocessing steps:
image.Alpha(AlphaOption.Remove);
image.BlackThreshold(new Percentage(27));
image.Negate(); // Original image has white text on black background
I have limited Tesseract's charset to only digits (01234567890-).
I have tried various segmentation modes (SparseText, SingleColumn, SingleBlock). I am running Tesseract 4.1. Do you guys have any pointers?
Or maybe you could tell me what resize algorithm is fast and good for OCR?
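The question doesn't show how Tesseract itself is invoked, so here is only a rough sketch of where the digit whitelist and segmentation mode would be set, assuming the charlesw/tesseract .NET wrapper (an assumption; the bindings used aren't named here):

using System;
using Tesseract;

// Assumes ./tessdata contains eng.traineddata for Tesseract 4.1.
using (var engine = new TesseractEngine(@"./tessdata", "eng", EngineMode.Default))
{
    // Restrict recognition to digits and '-', as described above.
    engine.SetVariable("tessedit_char_whitelist", "0123456789-");

    using (var pix = Pix.LoadFromFile("preprocessed.png"))   // hypothetical file name
    using (var page = engine.Process(pix, PageSegMode.SingleColumn))
    {
        Console.WriteLine(page.GetText());
        Console.WriteLine("mean confidence: " + page.GetMeanConfidence());
    }
}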
If you are having issues with Tesseract and are considering a more robust library without the need for training, you can try a commercial library such as Leadtools. With the Leadtools OCR toolkit, I was able to get perfect results for both images with only the basic image processing built into the OCR demo. There are, however, more sophisticated image processing functions that you can use for more complex tasks if need be. Besides the OCR demo, I was also able to get the same results, in JSON form, without any preprocessing, from one of the tutorials linked below. As a disclaimer, I work for this vendor.
https://www.leadtools.com/help/sdk/v21/tutorials/dotnet-console-export-ocr-results-to-json.html
Here's some simplified source code that would achieve the same task for one image and print out the raw text:
// Create a new OCR engine with the default settings
IOcrEngine ocrEngine = OcrEngineManager.CreateEngine(OcrEngineType.LEAD);
ocrEngine.Startup(null, null, null, null);

// Create an OCR document to hold everything
using (IOcrDocument ocrDocument = ocrEngine.DocumentManager.CreateDocument())
{
    // Add the input image as a new page
    using (IOcrPage page = ocrDocument.Pages.AddPage(inputFilename, null))
    {
        // Perform OCR on just the one page
        page.Recognize(null);

        // Build a string from the recognized characters
        string text = page.GetText(0);

        // Show output
        Console.WriteLine($"text: '{text}'");
    }
}
The results I got for the two images were "251\r\n41\r\n31\r\n1\r\n11\r\n11\r\n7\r\n4\r\n11\r\n" and "300\r\n60\r\n40\r\n1\r\n15\r\n15\r\n10\r\n6\r\n15\r\n".
I'm using a C# wrapper for the Tesseract library (3.02 if I'm not mistaken) (https://github.com/charlesw/tesseract). I've got it running and giving output, but that output is essentially garbage: often it gives nothing, and when it does give something it's often a mess. I know it's theoretically working because I've tried it on some really clean images and it works. I'm wondering if someone can help me diagnose the issues and suggest some ways I can improve Tesseract's accuracy. I've already converted all the images to black and white, and the resolution is set at 300x300. I don't do any line straightening programmatically, but as you can see below they're pretty straight.
This image works perfectly: [image omitted]
This one does not work at all, producing either gibberish or nothing at all: [image omitted]
I tried flipping the colors on some examples, thinking that it might give greater contrast (since most text is black on a white background, whereas the working ones were white text on a black background). But:
[image omitted] does not work at all, whereas
[image omitted] again works perfectly.
I suspect this has something to do with the additional spacing between the letters in "INVOICE." But there must be some way to get decent results with a tighter font. Any suggestions are welcome, I'm a relative noob here.
If possible, you should consider using pictures with a higher resolution. The other problem with the Payments image is probably that the gap between the letters is too small. Tesseract cannot detect single letters if they are (almost) connected to the next letter of the word.
I would suggest an image processing library like OpenCV to improve your results.
You could try erosion/dilation. This will separate the letters if the right parameters are used for the kernel. Use different kernels to see what works best for you.
// erosion_type can be MORPH_RECT, MORPH_CROSS or MORPH_ELLIPSE
int erosion_type = MORPH_RECT;
int erosion_size = 1; // kernel "radius"; tune this for your font
Mat element = getStructuringElement(erosion_type,
                                    Size(2 * erosion_size + 1, 2 * erosion_size + 1),
                                    Point(erosion_size, erosion_size));
erode(src, erosion_dst, element);
What helped me a lot when I was working on my project was using an adaptive threshold. I found this to be way more effective than just turning the image into a grayscale or binary image.
Note: Java Code, should be very similar in C though.
Imgproc.adaptiveThreshold(cropedIm, cropedIm, 255, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, 29, 10);
This is what I get after selecting one of your images in Pixtern, an Android project of mine (source code on GitHub). I was using the adaptive threshold but no dilation/erosion, and the result is already quite good.
[broken links removed]
For the Payments image and similar ones:
Try using a normal threshold and inverting the image(black font, white background). Again, dilation/erosion can be used afterwards. Java Code:
//results in binary image
Imgproc.threshold(cropedIm, cropedIm, 127, 255, Imgproc.THRESH_BINARY);
//Inverting image
Core.bitwise_not(cropedIm, cropedIm);
Tesseract expects whole pages, or rather, it was trained on those.
If you give it only one or two characters or words, it won't work well.
I assume you have more of these images. Stitch them together as lines of text, so that each image becomes a line of text below the previous one (a rough sketch of this follows below); it should work much better.
Furthermore, make sure you set the psm-parameter right when using tesseract. More on this: https://www.pyimagesearch.com/2021/11/15/tesseract-page-segmentation-modes-psms-explained-how-to-improve-your-ocr-accuracy/
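As an illustration of the stitching idea above (not from the original answer), the small crops could be stacked vertically with System.Drawing before being handed to Tesseract; the method name and sizes here are made up:

// Requires: using System; using System.Drawing; using System.Drawing.Imaging;
// Stack several small text crops vertically so Tesseract sees something
// closer to a normal block of text.
static Bitmap StackVertically(params Bitmap[] crops)
{
    int width = 0, height = 0;
    foreach (Bitmap crop in crops)
    {
        width = Math.Max(width, crop.Width);
        height += crop.Height;
    }

    Bitmap combined = new Bitmap(width, height, PixelFormat.Format24bppRgb);
    using (Graphics g = Graphics.FromImage(combined))
    {
        g.Clear(Color.White); // match the (inverted) white background of the crops
        int y = 0;
        foreach (Bitmap crop in crops)
        {
            g.DrawImage(crop, 0, y, crop.Width, crop.Height);
            y += crop.Height;
        }
    }
    return combined;
}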
I want to create tiles out of an equirectangular image. So I want the image to be split into 4 lateral faces plus up and bottom. Does anyone know any library that I can import into my C# project and which is able to do something like this?
Depending on your original image format, System.Drawing.Bitmap.Clone(Rectangle, PixelFormat) should do the trick.
More information here.
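A rough sketch (my own illustration, not from the original answer) of what Clone-based slicing could look like; note that this is only a rectangular crop of the panorama into four lateral strips, not a true cube-face reprojection (for that, see the per-pixel sketch after the edit below):

using System.Drawing;
using System.Drawing.Imaging;

// Slice an equirectangular panorama into four equal lateral strips.
using (Bitmap pano = new Bitmap("equirect.jpg"))   // hypothetical file name
{
    int faceWidth = pano.Width / 4;
    for (int i = 0; i < 4; i++)
    {
        Rectangle region = new Rectangle(i * faceWidth, 0, faceWidth, pano.Height);
        using (Bitmap face = pano.Clone(region, PixelFormat.Format24bppRgb))
        {
            face.Save("lateral_" + i + ".png", ImageFormat.Png);
        }
    }
}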
EDIT:
First, let me say that this is not going to answer your question either (not even close), you're seeking a library that already exists for this purpose and I don't know of one personally.
Equirectangular projection is the same as plate carrée (wow, that's humbling), so it's a very simple projection to work with in code.
Here is an example of its use in a GIS application. I don't know what your purposes are, but the math is the same.
One way to do it is to deproject each pixel then draw it on a new image, but understand that to do this you will still require some sort of projection because you're changing from 3 dimensions to 2 dimensions.
I wasn't able to find a good example, but an easier or faster way might be to first use a matrix transform (again, to change projections), then cut the image into the regions you need.
Like I said, this isn't an answer, but if nothing else it will give you more keywords to google for.
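To make the per-pixel deprojection idea concrete, here is a hedged sketch (my own illustration, not the answerer's code) that extracts one lateral cube face from an equirectangular image; the other five faces use the same mapping with different direction vectors. GetPixel/SetPixel are slow, so LockBits would be the faster route.

// Requires: using System; using System.Drawing;
// Extract the "front" cube face (the face at x = +1) from an equirectangular image.
static Bitmap ExtractFrontFace(Bitmap equirect, int faceSize)
{
    Bitmap face = new Bitmap(faceSize, faceSize);
    for (int j = 0; j < faceSize; j++)
    {
        for (int i = 0; i < faceSize; i++)
        {
            // Point on the cube face, in [-1, 1] x [-1, 1].
            double a = 2.0 * (i + 0.5) / faceSize - 1.0;
            double b = 2.0 * (j + 0.5) / faceSize - 1.0;
            double x = 1.0, y = a, z = -b;

            // Direction vector -> longitude/latitude.
            double lon = Math.Atan2(y, x);                        // [-pi, pi]
            double lat = Math.Atan2(z, Math.Sqrt(x * x + y * y)); // [-pi/2, pi/2]

            // Longitude/latitude -> source pixel (plate carrée mapping).
            int u = (int)((lon / (2.0 * Math.PI) + 0.5) * (equirect.Width - 1));
            int v = (int)((0.5 - lat / Math.PI) * (equirect.Height - 1));

            face.SetPixel(i, j, equirect.GetPixel(u, v));
        }
    }
    return face;
}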
I want to create a large image in C#. (I have some photos with a large size (4800 * 4800), and I want to merge these photos.)
I use Bitmap, but it doesn't support this (Error: Invalid Parameter).
Some code would be useful ...
... However, to hazard an educated guess, I suspect you're trying to create an instance of Bitmap with either Width or Height (or both) greater than 2^15.
Essentially, you can't - the .NET bitmap classes have a limitation on how big an image they can handle. Your original 4800 pixel square images won't be a problem, but going over 32,767 pixels will be.
You might like to check AForge.NET; it has a large image processor built in, and it's open source.
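For the merging itself, assuming the combined size stays under the limit described above, a minimal sketch with Graphics.DrawImage (the file names and the 2x2 layout are made up):

using System.Drawing;
using System.Drawing.Imaging;

// Merge a 2x2 grid of 4800x4800 photos into one 9600x9600 bitmap.
string[,] files =
{
    { "tile_0_0.jpg", "tile_0_1.jpg" },
    { "tile_1_0.jpg", "tile_1_1.jpg" }
};
const int tile = 4800;

using (Bitmap merged = new Bitmap(2 * tile, 2 * tile, PixelFormat.Format24bppRgb))
using (Graphics g = Graphics.FromImage(merged))
{
    for (int row = 0; row < 2; row++)
    {
        for (int col = 0; col < 2; col++)
        {
            using (Image photo = Image.FromFile(files[row, col]))
            {
                // Draw each photo at its grid position.
                g.DrawImage(photo, col * tile, row * tile, tile, tile);
            }
        }
    }
    merged.Save("merged.jpg", ImageFormat.Jpeg);
}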
We have an application that shows a large image file (a satellite image) from a local network resource.
To speed up the image rendering, we divide the image into smaller patches (e.g. 6x6 cm) and the app tiles them appropriately.
But each time the satellite image is updated, the dividing pre-process has to be run again, which is time-consuming.
I wonder how we can load the patches directly from the original file?
PS 1: I found the LeadTools library, but we need an open-source solution.
PS 2: The app is in .NET C#
Edit 1:
The format is not an issue for us, but currently it's JPG.
Changing the format to another one could be considered, but the BMP format is hardly acceptable because of its large size.
I wrote a beautiful attempt at an answer to your question, but my browser ate it... :(
Basically what I tried to say was:
1. Since JPEG (and most compression formats) uses sequential compression, you'll always need to decode all the bits that come before the ones you need.
2. The solution I propose needs to be done for each format you need to support.
3. There are a lot of open-source JPEG decoders that you could modify. JPEG decoders need to decode blocks of bits (of variable size) that convert into pixel blocks of size 8x8. What you could do is modify the code to keep in memory only the blocks you need and discard all the others as soon as they aren't needed any more (basically as soon as they are decoded). With those saved blocks, create the image you need.
4. Since JPEG works with blocks of 8x8, your work could be easier if you use patches whose sizes are multiples of 8 pixels.
5. The modification to the JPEG decoder could be used to replace the preprocessing of the images you are doing, if you save each patch and discard the blocks as soon as you complete them. It would be really fast and less memory-consuming.
I know it needs a lot of work and there are a lot of details to be taken into consideration (especially if you work with color images), but if you need performance I believe you will always end up fighting, or playing (however you want to see it), with the bytes.
Hope it helps.
I'm not 100% sure what you're after but if you're looking for a way to go from string imagePath, Rectangle desiredPortion to a System.Drawing.Image object then perhaps something like this:
public System.Drawing.Image LoadImagePiece(string imagePath, Rectangle desiredPortion)
{
    using (Image img = Image.FromFile(imagePath))
    {
        Bitmap result = new Bitmap(desiredPortion.Width, desiredPortion.Height, PixelFormat.Format24bppRgb);
        using (Graphics g = Graphics.FromImage((Image)result))
        {
            g.InterpolationMode = System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic;
            g.SmoothingMode = System.Drawing.Drawing2D.SmoothingMode.HighQuality;
            g.PixelOffsetMode = System.Drawing.Drawing2D.PixelOffsetMode.HighQuality;
            g.CompositingQuality = System.Drawing.Drawing2D.CompositingQuality.HighQuality;

            // Copy just the requested rectangle out of the source image.
            g.DrawImage(img, 0, 0, desiredPortion, GraphicsUnit.Pixel);
        }
        return result;
    }
}
Note that for performance reasons you may want to consider building multiple output images at once rather than calling this multiple times - perhaps passing it an array of rectangles and getting back an array of images or similar.
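Not part of the original answer, but a hedged sketch of that batched idea: load the source once and cut several rectangles from it in one pass (the method name is made up):

public System.Drawing.Image[] LoadImagePieces(string imagePath, Rectangle[] desiredPortions)
{
    var results = new System.Drawing.Image[desiredPortions.Length];
    using (Image img = Image.FromFile(imagePath))
    {
        for (int i = 0; i < desiredPortions.Length; i++)
        {
            Rectangle portion = desiredPortions[i];
            Bitmap piece = new Bitmap(portion.Width, portion.Height, PixelFormat.Format24bppRgb);
            using (Graphics g = Graphics.FromImage(piece))
            {
                // Copy just this rectangle out of the already-loaded image.
                g.DrawImage(img, 0, 0, portion, GraphicsUnit.Pixel);
            }
            results[i] = piece;
        }
    }
    return results;
}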
If that's not what you're after can you clarify what you're actually looking for?