Compressing images, is there a point? - c#

I'm working on a web site which will host thousands of user-uploaded images in the formats .png, .jpeg, and .gif.
With that many images, saving just a few KB per file will add up to a lot of total storage.
My first thought was to enable Windows folder compression on the folder where the files are stored (on a Windows / IIS server). On a total of 1 GB of data, this saved about 200 KB.
That seems like a poor result to me. I checked whether Windows folder compression could be tweaked, but according to this post it can't be: NTFS compressed folders
My next thought was to use a library such as Seven Zip Sharp to compress the files individually as I save them. Before doing that, I tested a few different compression programs on a few images.
The results on a 7 MB .gif were:
7z, compress to .7z = 1 KB saved
7z, compress to .zip = 2 KB INCREASE
Windows, native zip = 4 KB saved
So this leaves me with two thoughts: either the zipping programs I'm using aren't very good, or images are already compressed about as far as they can be (and I'm surprised that the Windows built-in compression is better than 7z).
So my question is, is there any way to decrease the file size of an image archive consisting of the image formats listed above?

the zipping programs I'm using suck, or images are pretty much already compressed as far as they can be
Most common image formats (PNG, JPEG, etc.) are already compressed. Compressing a file a second time will almost never shrink it further; more likely it will slightly increase the file size.
So my question is, is there any way to decrease the filesize of an image archive consisting of the image formats listed above?
No, not likely. Compressed files might have a little more to give at most, but to get it you have to use techniques specific to images rather than a general-purpose compression algorithm. Some good options are available in the post of Robert Levy. A tool I used to strip out metadata is PNGOUT.
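The "compressing twice" point above can be checked with a small sketch. It uses Python's standard zlib module (DEFLATE, the same family of algorithm used by zip and PNG); the synthetic `raw` buffer is only an assumed stand-in for uncompressed pixel data, not a real image.

```python
import random
import zlib

rng = random.Random(0)

# Assumed stand-in for uncompressed pixel data: only 16 distinct byte
# values occur, so a general-purpose compressor has redundancy to remove.
raw = bytes(rng.randrange(16) * 16 for _ in range(200_000))

once = zlib.compress(raw, 9)    # first pass removes the redundancy
twice = zlib.compress(once, 9)  # second pass works on high-entropy output

print(f"raw:              {len(raw)} bytes")
print(f"compressed once:  {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

The second pass has essentially nothing left to remove, which matches what the zip and 7z tests in the question showed on files that were already compressed.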

Most users will likely be uploading files that have a basic level of compression already done on them so that's why you aren't seeing a ton of benefit. Some users may be uploading uncompressed files though in which case your attempts would make a difference.
That said, image compression should be thought of as a field separate from normal file compression. Normal file compression techniques are "lossless", ensuring that every bit of the file is restored when the file is uncompressed - images (and other media) can be compressed in "lossy" ways without degrading the file to an unacceptable level.
There are specialized tools which you can use to do things like strip out metadata, apply a slight blur, perform sampling, reduce quality, reduce dimensions, etc. Have a look at the answer here for a good example: Recommendation for compressing JPG files with ImageMagick. The top answer took the example file from 264 KB to 170 KB.
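To illustrate the lossless versus lossy distinction above, here is a minimal sketch using only Python's standard library. The `pixels` buffer is a made-up 8-bit grayscale signal (a ramp plus noise), and dropping the low four bits is just one crude, assumed example of a lossy step; real image codecs such as JPEG are far more sophisticated.

```python
import random
import zlib

rng = random.Random(1)

# Hypothetical 8-bit grayscale data: a smooth ramp plus sensor-like noise.
pixels = bytes(
    min(255, max(0, i // 400 + rng.randint(-8, 8))) for i in range(100_000)
)

# Lossless: every byte comes back exactly as it went in.
lossless = zlib.compress(pixels, 9)
restored = zlib.decompress(lossless)

# Lossy: discard the low 4 bits of each pixel before compressing.
# The noise disappears, so the data compresses much better, but the
# discarded detail cannot be recovered afterwards.
quantized = bytes(p & 0xF0 for p in pixels)
lossy = zlib.compress(quantized, 9)

max_error = max(abs(a - b) for a, b in zip(pixels, quantized))
print(f"lossless: {len(lossless)} bytes, exact round trip: {restored == pixels}")
print(f"lossy:    {len(lossy)} bytes, max pixel error: {max_error}")
```

The trade-off is the one described above: the lossy version is smaller, at the cost of a bounded error per pixel that may or may not be acceptable for a given image.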

Related

iTextSharp, Why when creating a PDF file size is 2 times larger than the original folder with images?

I need the finished PDF file to be 30% smaller than the original image folder.
There is a folder with images in TIFF format. I add one image per page (Doc = new document (), etc.), and at that point the resulting document is about the same size as the image folder. But after calling doc.Close() the PDF file is twice that size (so I compress the PDF afterwards, and then the file is approximately equal to the folder). However, I need the finished PDF file to be 30% smaller than the original image folder.
Most image formats are already compressed, so they won't compress any more. PDFs usually compress because they're mostly text, but one that is mostly images won't.
Also, compression routines usually assume that the data is suitable for compression. If you give them pre-compressed data, the result can be a larger file. It's hard to tell exactly what happened without seeing your files, but I guess that's the reason.
If you want a smaller file, you'll have to reduce the amount of information in your images. Crop them, reduce the colour depth, increase the compression or reduce the number of images.

Zip folder to SVN?

This may sound a silly question but I just wanted to clear something up. I've zipped a folder up and added it to my SVN repository. Is doing this all ok? or should I upload the unzipped folder instead?
I just need to be sure!
If you are going to change the contents of the directory, then you should store it unzipped. Keeping it in a zip file will exhaust storage on the server much faster, as if you were storing every version of your zip as a separate file.
The zip format has one cool property: every file inside the archive occupies its own segment of bytes and is compressed/decompressed independently of all the other files. As a result, if you have a 100 MB zip and modify two files inside it, each 1 MB in size, then the new zip will have at most 2 MB of new data, and the remaining 98 MB will most likely be byte-exact copies of pieces of the old zip. So it is in theory possible to represent small in-zip changes as small deltas. But there are many problems in practice.
First of all, you must be sure that you don't recompress the unchanged files. If you make both the first zip and the second zip from scratch using different programs, program versions, compression settings, etc., you can get slightly different compression on the unchanged files. As a result, the actual bytes in the zip file will differ greatly, and any hope for a small delta will be lost. The better approach is to take the first zip and add/remove files in it.
The main problem, however, is how SVN stores deltas. As far as I know, SVN uses the xdelta algorithm for computing deltas. This algorithm is perfectly capable of detecting equal blocks inside a zip file, if given unlimited memory. The problem is that SVN uses a memory-limited version with a window size of 100 KB. Even if you simply remove a segment longer than 100 KB from a file, SVN's delta computation will break on it, and the rest of the file will simply be copied into the delta. Most likely, the delta will take as much space as the whole file.
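The claim above that zip members are compressed independently can be checked with a short sketch using Python's standard zipfile module. The file names and contents are made up for illustration, and `raw_member_bytes` is a hypothetical helper that reads the compressed bytes of one member directly out of the archive. Both archives are built by the same code with the same settings, which is exactly the condition the answer says is needed.

```python
import io
import struct
import zipfile


def build_zip(files):
    """Build an in-memory zip archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def raw_member_bytes(archive, name):
    """Return the still-compressed bytes stored for one archive member."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)
    # The local file header is 30 fixed bytes; the name and extra-field
    # lengths are two little-endian 16-bit values at offsets 26 and 28.
    name_len, extra_len = struct.unpack_from("<HH", archive, info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return archive[start:start + info.compress_size]


old = build_zip({"a.txt": b"alpha " * 1000, "b.txt": b"bravo " * 1000})
new = build_zip({"a.txt": b"alpha " * 1000, "b.txt": b"BRAVO! " * 1000})

# a.txt was not changed, so its compressed segment is byte-identical.
print(raw_member_bytes(old, "a.txt") == raw_member_bytes(new, "a.txt"))
# b.txt was changed, so its segment differs.
print(raw_member_bytes(old, "b.txt") == raw_member_bytes(new, "b.txt"))
```

Whether a version control system can exploit those identical segments is a separate question, as the rest of the answer explains.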

is there a way to compress a paged .tiff file using C#?

At the end of my process, I need to upload several paged .tiff image files to a website. The files need to be very small, 500 KB or less, when I upload them.
The problem is that even after resizing them as far as I can while keeping the few lines of text in some of them readable, they are still around 1 MB each.
I first resize all images going into the tiff files, but it's not enough. I need a way to change their quality to decrease their size as well.
Can C# do this or would I need a third party software to do it?
The files being uploaded MUST be .tiff.
You don't provide much detail about your data, so I can only make some guesses as to what you might need to look at.
First, can you lose some resolution? Can you make the images smaller?
Second, can you lose some color depth? Are you saving the files in a color format when bilevel or greyscale images would suffice?
Third, how clean are these images? Are they photos, scanned documents, what? If they are scanned documents of text or drawings, then some pre-processing to remove noise can make a significant difference in size.
Lastly, what compression method are you saving the file with? In most circumstances, only a lossy format is going to give you the highest degree of compression.
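To put a number on the color depth point above: an 8-bit greyscale page needs one byte per pixel before compression, while a bilevel page needs one bit per pixel, an eighth of that. The sketch below shows the arithmetic; `to_bilevel`, the fixed threshold of 128, and the most-significant-bit-first packing are assumptions made for illustration, not what any particular TIFF library does.

```python
def to_bilevel(gray, width, height, threshold=128):
    """Pack 8-bit greyscale pixels into 1 bit per pixel, MSB first.

    Each row is padded to a whole number of bytes. Pixels at or above
    the threshold become 1 bits, the rest become 0 bits.
    """
    row_bytes = (width + 7) // 8
    out = bytearray(row_bytes * height)
    for y in range(height):
        for x in range(width):
            if gray[y * width + x] >= threshold:
                out[y * row_bytes + x // 8] |= 0x80 >> (x % 8)
    return bytes(out)


# Hypothetical 16x2 image: one row of alternating dark/light pixels,
# one row of all light pixels.
gray = bytes([0, 255] * 8 + [255] * 16)
packed = to_bilevel(gray, 16, 2)
print(len(gray), "bytes as greyscale ->", len(packed), "bytes as bilevel")
```

For scanned text this reduction happens before any compression is applied, and bilevel data typically compresses well on top of that.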
Based on your follow-up:
1) If you can make them smaller, this of course saves significant storage space. Determine the minimum acceptable resolution and standardize on that.
2) If you need to persist color, then this step might not be as effective, since you would have to algorithmically decrease the dynamic range of colors used in the image to an acceptable level before compressing. If you are not sure what this means, then you would probably do best to skip this completely unless you can spend time learning more about image processing and/or using an image processing library that will simplify this for you.
3) I don't think you addressed this in your comments. If you want more precise help, you should update your original question and add much more detail about what you are trying to accomplish. Provide some explanations of what/why you need to do in order to help determine what tradeoffs make sense.
4) Yes, JPG is a lossy format, but I think you may be confusing a few different things (or I may not be understanding your intent from your description). If you are first resizing your original images down into new JPG files (intermediate image files), and then building a multi-page TIFF file with the resized JPGs as source images and saving that, then you need to realize that how the intermediate files are compressed does not necessarily have any bearing on the compression format used in the TIFF file. Depending on what you are using to build the TIFF file, the compression used in the TIFF is applied separately, and you probably need to specify those parameters when you save that file. If this is what you are doing, then the intermediate step of saving the JPG files may even be increasing the size a bit.

How to compress image?

I have a problem with image compression. I need to compress a lot of files (700-900 KB) down to 70-80 KB without loss of quality (or with only a small loss). I found the menu item "Save for Web & Devices ..." in Photoshop. It works great. But I don't want to drive Photoshop programmatically. Maybe someone knows how to solve this problem with other third-party components or frameworks?
Thanks for any ideas!
.NET has a number of image decoding/encoding libraries, often tied to a particular GUI framework (e.g. in Windows Forms you have System.Drawing.Image, and for WPF, see the Imaging Overview chapter on MSDN).
There are also third-party libraries specialized in image conversion/compression that you can find online (both free and non-free).
Generally though, the amount of saving you get from compressing an image depends heavily on the original format. If you already have JPEG photos with normal compression (quality of 85%), then there is not much you can do to make them smaller except resizing them. If you have raw bitmaps (e.g. BMP, uncompressed/low-compression TIFF etc.), then you can expect quite large savings with most compressed formats.
When choosing image format, consider this:
Photos and similar: JPEG will often do fine. Good savings with reasonable quality loss
Screenshots and similar: PNG will generally give best results (PNG is lossless). JPEG will often create highly visible artifacts on screenshots
Compressing an already compressed image (i.e. PNG, JPEG etc.) with a general purpose compression algorithm like ZIP or RAR will in practice not give you any savings. You may actually end up with a bigger file.
You can have a look at the FreeImage project. It has a C# wrapper that you can use.
ImageMagick allows you to batch-process files and offers everything you could possibly ask for when it comes to handling images.
E.g. to resize every image in a folder (destroying the originals) to fit within QVGA while preserving the aspect ratio, do
mogrify -resize 320x240 *.jpg
To force exactly 320x240 and ignore the aspect ratio, append ! to the geometry (quoted so the shell does not interpret it)
mogrify -resize '320x240!' *.jpg
If you need to traverse a directory structure, this is how you can do it on *nix based systems (also destroying the originals)
find . -type f -name '*.jpg' -exec mogrify -resize 800x800 {} \;
There is also a -quality switch available, see here
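If the batch job has to be started from a program rather than a shell, one option is to build the argument list explicitly, which also sidesteps the shell quoting issues with `*` and `!`. The following is only a sketch: `mogrify_command` and `shrink_tree` are hypothetical helper names, and it assumes an ImageMagick `mogrify` executable is on the PATH. Like the commands above, it overwrites the originals.

```python
import subprocess
from pathlib import Path


def mogrify_command(paths, size="800x800", quality=None):
    """Build the argument list for an in-place ImageMagick resize."""
    cmd = ["mogrify", "-resize", size]
    if quality is not None:
        # -quality sets the JPEG/PNG compression level on output.
        cmd += ["-quality", str(quality)]
    return cmd + [str(p) for p in paths]


def shrink_tree(root, size="800x800", quality=None):
    """Resize every .jpg below root in place (originals are overwritten)."""
    files = sorted(Path(root).rglob("*.jpg"))
    if files:
        # Arguments are passed as a list, so no shell expansion happens.
        subprocess.run(mogrify_command(files, size, quality), check=True)


print(mogrify_command(["photo.jpg"], "320x240", 85))
```

Working on a copy of the folder first is a sensible precaution, since the resize cannot be undone.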

File extension from System.Drawing.Image

I'm writing a method that needs to save a System.Drawing.Image to a file. Without knowing the original file the Image was created from, is there any way to determine what file extension it should have?
The best solution I've come up with is to use a Switch/Case statement with the value of Image.RawFormat.
Does it even matter that I save the Image in its original format? Is an Image generated from a PNG any different from, say, one generated from a JPEG? Or is the data stored in an Image object completely generic?
While Steve Danner is correct that an image created from a JPG will look different from an image created from a PNG, once it's loaded into memory it's an uncompressed data stream.
This means that you can save it out to any file format you want.
However, if you load a JPG image and then save it as another JPG you are throwing away more information due to the compression algorithm. If you do this repeatedly you will eventually lose the image.
If you can I'd recommend always saving as PNG.
Image.RawFormat has cooties, stay away from it. I've seen several reports of it having no legal value for no apparent reason. Undiagnosed as yet.
You are quite right, it doesn't matter what format you save it to. After you loaded the file, the internal format is the same for any bitmap (not vector) with the same pixel format. Generally avoid recompressing jpeg files, they tend to get bigger and acquire more artifacts. Steve mentions multi-frame files, they need to be saved a different way.
Yes, it definitely matters, because different file formats support different features such as compression, multiple frames, etc.
I've always used a switch statement like you have, perhaps baked into an extension method or something.
To answer your question 'Does it even matter that I save the Image in its original format?' explicitly: Yes, it does, but in a negative way.
When you load the image, it is uncompressed internally to a bitmap (or as ChrisF calls it, an uncompressed data stream). So if the original image used a lossy compression (for example jpeg), saving it in the same format will again result in loss of information (i.e. more artifacts, less detail, lower quality). Especially if you have repeated actions of read - modify - save, this is something to avoid.
(Note that it is also something to avoid if you are not modifying the picture. Just the repeated decompress - compress cycles will degrade the image quality).
So if disk space is not an issue here (and it usually isn't in the age of hard disks that are big enough for HD video), always store any intermediate pictures in lossless compression formats, or uncompressed. You may consider saving the final output in a compressed format, depending on what you use it for. (If you want to present those final pictures on the web, jpeg or png would be good choices).
