Modifying JPEG headers in C#

Modifying JPEG headers in C# - c#

I'm trying to open a file to bytes, convert it to a string, modify some data (think Steganography) and convert the file back to bytes and save it as a jpeg. So far, everything I've tried has corrupted the file in converting it to a string. I've tried converting it to a 64-bit string, but of course that's a bit hard to modify the data in :P
Any suggestions on how I can do this properly, without corrupting my file?

I don't have this in C# but in PHP, but you can take a look and adaptate to C#.
http://www.havenard.110mb.com/fotomagica/
This is my site where there's a tool to modify the EXIF data of a JPEG and build "magic pictures" which display something in the thumbnail which is not the real picture.
It opens the JPEG, slice its sectors, and build it back ignoring irrelevant sectors and placing my custom made EXIF header.
And this is the source of the PHP classes:
http://www.havenard.110mb.com/fotomagica/class.JpegMapper.php.txt (ExifMapper is incomplete)
http://www.havenard.110mb.com/fotomagica/class.DataMapper.php.txt
You can study it and rebuild in C#, it is really simple to slice a JPEG as you will see.
Usage of this PHP class (only JpegMapper):
$jpg = new JpegMapper('picture.jpg');
$jpg->save_filtered('filtere picture.jpg'); // save removing irrelevant sectors
It is great to get any JPEG even smaller (sometimes a lot smaller).

Related

How to write big image directly into bmp file (without creating bitmap object in memory) in c#?

Hello good smart programmers,
I need to merge small images into one big image which will have dimension about 7600 x 7600 px. When I create it in memory it takes too much memory I can't afford that.
I think good way to do this is make buffer (for every small picture which i want to put in big Image) and write directly into file (excatly blob - on azure). Somebody know how to do that (any free library?) I've searched google but no answers (maybe wrong question - my english is poor).

If you are talking about a "bmp" file, you can do this by directly writing data to a file stream in the Bitmap file format. It's pretty simple, actually the "bmp" is the simplest image format, so I doubt you will have any difficulties. Here are 2 useful articles that explain the bitmap file format in details:
BMP file format in Wikipedia
Microsoft Windows Bitmap

I'm not aware of any image libraries that will do the encoding into BMP format on the fly so I'm afraid you'll have to implement your own.
Fortunately the BMP format is very simple when no compression is used and not very difficult with RLE compression. It's basically a header followed by the raw bytes of the image pixels, line after line.
This means you're going to have to load all the images in a line (if your target image is say 30 by 40 images, you'll need to load 30.) Unless your input images are also in BMP format and you don't mind creating a custom reader.
You can get the BMP file format by typing "BMP Format" in Google (wikipedia has it as well.)

is there a way to compress a paged .tiff file using C#?

At the end of my process, I need to upload several paged .tiff file images to a website. The files need to be very small, 500kb or less when i upload them.
The problem is, even with me resizing them a lot but at the same time being able to read a few lines of text that are in some of them, they are around 1mb each or so.
I first resize all images going into the tiff files but it's not enough. I need a way to change the quality of them to decrease their size as well.
Can C# do this or would I need a third party software to do it?
The files being uploaded MUST be .tiff.

You don't provide much detail about your data, so can only make some guesses as to what you might need to look at.
First, can you loose some resolution? Can you make the images smaller?
Second, can you loose some color depth? Are you saving the files in a color format when bilevel or greyscale images would suffice?
Third, how clean are these images? Are they photos, scanned documents, what? If they are scanned documents of text or drawings, then some pre-processing to remove noise can make a significant difference in size.
Lastly, what compression method are you saving the file with? Only a lossy format is going to give you the highest degree of compression is most circumstances.
Based on your follow-up:
1) If you can make smaller, this of course saves significant storage space. Determine what is the minimum acceptable resolution that they need to be and standardize on that.
2) If you need to persist color, then this step might not be as effective, since you would have to algorithmically decrease the dynamic range of colors used in the image to an acceptable level before compressing. If you are not sure what this means, then you would probably best skip considering this completely unless you can spend time learning more about image processing and/or using a image processing library that will simplify this for you.
3) I don't think you addressed this in your comments. If you want more precise help, you should update your original question and add much more detail about what you are trying to accomplish. Provide some explanations of what/why you need to do in order to help determine what tradeoffs make sense.
4) Yes, JPG is a lossy format, but I think you may be confusing a few different things (or I may not be understanding your intent from your description). If you are first resizing your original images down into a new JPG file (an intermediate image file), then you are building a TIFF file and inserting the resized JPG as a source image into a multi-page TIFF and saving that, then you need to realize that the process of how the files are compressed in the intermediate files do not necessarily have any correlation with the compression format used in the TIFF file. Depending on what you are using to build and create the TIFF file, the compression format used in the TIFF is done separately and you probably need to specify those parameters when you save that file. If this is what you are doing, then the intermediary process of saving the JPG files may be increasing the size a bit.

Edit TIFF image without changing header data

I have a program that takes in stacks of TIFF images and is very particular about the header data (it expects all headers to be the same), however I want to edit a couple of the images in the stack before sending it to this program.
Every program I've tried so far (Paint.net, MS Paint, ImageJ) has altered the header file or outright corrupted it when it saves the new images. I have access to C# and LibTiff.Net but even after reading the documents I'm confused as to how to simply replace the image data without changing the header information at all.
Currently the idea is simply to replace an image with a solid colour, so it isn't too complicated.

Here's how I would go about this.
Check TIFF documentation to find out where the actual bitmap data is stored. (I believe it is a structured format so it won't be at the same place every time, although it could be if all your headers are the same.)
Once you've identified the data, you can extract it or replace it with raw bitmap data of the same dimensions and format.
For example, you could extract the bitmap data from the TIFF file with the changed headers and overwrite the data in a file with a good header.
It's pretty low level, but it should work.
Alternatively, you could read-in the edited files and write back out a TIFF file with your own headers in the correct format. Might be more or less difficult.

File extension from System.Drawing.Image

I'm writing a method that needs to save a System.Drawing.Image to a file. Without knowing the original file the Image was created from, is there anyway to determine what file extension it should have?
The best solution I've come up with is to use a Switch/Case statement with the value of Image.RawFormat.
Does it even matter that I save the Image in it's original format? Is an Image generated from a PNG any different from say one generated from a JPEG? Or is the data stored in an Image object completely generic?

While Steve Danner is correct in that an image created from a JPG will look different to an image created from a PNG once it's loaded into memory it's an uncompressed data stream.
This means that you can save it out to any file format you want.
However, if you load a JPG image and then save it as another JPG you are throwing away more information due to the compression algorithm. If you do this repeatedly you will eventually lose the image.
If you can I'd recommend always saving as PNG.

Image.RawFormat has cooties, stay away from it. I've seen several reports of it having no legal value for no apparent reason. Undiagnosed as yet.
You are quite right, it doesn't matter what format you save it to. After you loaded the file, the internal format is the same for any bitmap (not vector) with the same pixel format. Generally avoid recompressing jpeg files, they tend to get bigger and acquire more artifacts. Steve mentions multi-frame files, they need to be saved a different way.

Yes, it definitely matters because different fileformats support different features such as compression, multiple frames, etc.
I've always used a switch statement like you have, perhaps baked into an extension method or something.

To answer your question 'Does it even matter that I save the Image in it's original format?' explicitly: Yes, it does, but in a negative way.
When you load the image, it is uncompressed internally to a bitmap (or as ChrisF calls it, an uncompressed data stream). So if the original image used a lossy compression (for example jpeg), saving it in the same format will again result in loss of information (i.e. more artifacts, less detail, lower quality). Especially if you have repeated actions of read - modify - save, this is something to avoid.
(Note that it is also something to avoid if you are not modifying the picture. Just the repeated decompress - compress cycles will degrade the image quality).
So if disk space is not an issue here (and it usually isn't in the age of hard disks that are big enough for HD video), always store any intermediate pictures in lossless compression formats, or uncompressed. You may consider saving the finall output in a compressed format, depending on what you use it for. (If you want to present those final pictures on the web, jpeg or png would be good choices).

Error while loading JPEG: "A generic error occurred in GDI+."

I have some JPEG files that I can't seem to load into my C# application. They load fine into other applications, like the GIMP. This is the line of code I'm using to load the image:
System.Drawing.Image img = System.Drawing.Image.FromFile(#"C:\Image.jpg");
The exception I get is: "A generic error occurred in GDI+.", which really isn't very helpful. Has anyone else run into this, or know a way around it?
Note: If you would like to test the problem you can download a test image that doesn't work in C#.

There's an exact answer to this problem. We ran into this at work today, and I was able to prove conclusively what's going on here.
The JPEG standard defines a metadata format, a file that consists of a series of "chunks" of data (which they call "segments"). Each chunk starts with FF marker, followed by another marker byte to identify what kind of chunk it is, followed by a pair of bytes that describe the length of the chunk (a 16-bit little-endian value). Some chunks (like FFD8, "Start of Image") are critical to the file's usage, and some (like FFFE, "Comment") are utterly meaningless.
When the JPEG standard was defined, they also included the so-called "APP markers" --- types FFE0 through FFEF --- that were supposed to be used for "application-specific data." These are abused in various ways by various programs, but for the most part, they're meaningless, and can be safely ignored, with the exception of APP0 (FFE0), which is used for JFIF data: JFIF extends the JPEG standard slightly to include additional useful information like the DPI of the image.
The problem with your image is that it contains an FFE1 marker, with a size-zero chunk following that marker. It's otherwise unremarkable image data (a remarkable image, but unremarkable data) save for that weird little useless APP1 chunk. GDI+ is wrongly attempting to interpret that APP1 chunk, probably attempting to decode it as EXIF data, and it's blowing up. (My guess is that GDI+ is dying because it's attempting to actually process a size-zero array.) GDI+, if it was written correctly, would ignore any APPn chunks that it doesn't understand, but instead, it tries to make sense of data that is by definition nonstandard, and it bursts into flames.
So the solution is to write a little routine that will read your file into memory, strip out the unneeded APPn chunks (markers FFE1 through FFEF), and then feed the resulting "clean" image data into GDI+, which it will then process correctly.
We currently have a contest underway here at work to see who can write the JPEG-cleaning routine the fastest, with fun prizes :-)
For the naysayers: That image is not "slightly nonstandard." The image uses APP1 for its own purposes, and GDI+ is very wrong to try to process that data. Other applications have no trouble reading the image because they rightly ignore the APP chunks like they're supposed to.

.Net isn't handling the format of that particular image, potentially because the jpeg data format is slightly broken or non-standard. If you load the image into GIMP and save to a new file you can then load it with the Image class. Presumably GIMP is a bit more forgiving of file format problems.

This thread from MSDN Forums may be useful.
The error may mean the data is corrupt
or there is some underlying stream
that has been close too early.

The error can be a permission problem. Especially if your application is an ASP.NET application. Try moving the file to the same directory as your executable (if Win forms) or the root directory of your web application (if asp.net).

I have the same problem.
The only difference i've noticed is the compression. It works fine with "JPEG" but when the compression is "Progressive JPEG" i get the exception (A generic error occurred in GDI+).
At first i thought it could be a memory problem because the images i mentioned were kind of big (about 5MB in disk and maybe ~80MB in memory), but then i'v noticed the difference in the compression type.
When i open/save the image file in other program like IrfanView or GIMP, the result is ok, but that's not the idea.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.