Recently, I was trying to answer another SO question about loading the frames (Bitmap and duration) of animated GIFs. The code can be found on Pastebin.
While doing additional tests on this code before moving it into my dev library, I noticed that there is a problem with this line of code:
//Get the times stored in the gif
//PropertyTagFrameDelay ((PROPID) 0x5100) comes from gdiplusimaging.h
//More info on http://msdn.microsoft.com/en-us/library/windows/desktop/ms534416(v=vs.85).aspx
var times = img.GetPropertyItem(0x5100).Value;
When running this on Windows .NET using this example GIF, the array holds 4 bytes for each frame in the animated GIF and is filled with the durations of the frames. In this case it is a byte[20], which converts (via BitConverter.ToInt32()) to 5 durations:
[75,0,0,0,125,0,0,0,125,0,0,0,125,0,0,0,250,0,0,0]
On MonoMac however, this line of code for the same example GIF returns a byte[4] which converts to only one duration (the first):
[75,0,0,0]
I tested this for 10 different GIFs and the result is always the same. On Windows all durations are in the byte[], while MonoMac only lists the first duration:
[x,0,0,0]
[75,0,0,0]
[50,0,0,0]
[125,0,0,0]
Looking at the Mono System.Drawing.Image source code, the length seems to be set in this method, which is a GDI wrapper:
status = GDIPlus.GdipGetPropertyItemSize (nativeObject, propid,out propSize);
However, I don't really see any problems, neither in the source nor in my implementation. Am I missing something or is this a bug?
I don't see anything wrong in the Mono source either. It would have been helpful if you had posted one of the sample images you tried. One quirk about the GIF image format is that the Graphics Control Extension block that contains the frame time is optional and may be omitted before an image descriptor. So there are non-zero odds that you have GIF files with just one GCE that applies to all the frames; you are supposed to apply the same frame time to every frame.
Do note that you didn't get 4 values, the frame time is encoded as a 32-bit value and you are seeing the little endian encoding for it in a byte[]. You should use BitConverter.ToInt32(), as you correctly did in your sample code.
I therefore think you should probably use this instead:
//convert 4-byte value to integer
var duration = BitConverter.ToInt32(times, 4*i % times.Length);
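The modulo trick above can be wrapped in a small helper. This is only a sketch: DecodeFrameDelays is a hypothetical name, not part of System.Drawing, and it assumes a little-endian machine (which is what BitConverter uses on typical desktop hardware) and that the caller already knows the frame count. As far as I know, GDI+ reports these delays in hundredths of a second.

```csharp
using System;

// Hypothetical helper: decodes the raw PropertyTagFrameDelay bytes into one
// delay per frame. If the array holds fewer entries than there are frames
// (e.g. a single 4-byte value), the index wraps around so the available
// delays are reused, as suggested in the answer above.
static int[] DecodeFrameDelays(byte[] times, int frameCount)
{
    if (times == null || times.Length < 4 || times.Length % 4 != 0)
        throw new ArgumentException("Expected a non-empty array of 4-byte values.", nameof(times));

    var delays = new int[frameCount];
    for (int i = 0; i < frameCount; i++)
    {
        // Each delay is a 32-bit integer stored in 4 consecutive bytes.
        delays[i] = BitConverter.ToInt32(times, (4 * i) % times.Length);
    }
    return delays;
}
```

With the byte[20] from the question and 5 frames this yields the five durations; with the byte[4] that MonoMac returns, every frame gets the first duration.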
Do note that there's another nasty implementation detail about GIF frames: frames #2 and up do not have to be the same size as frame #1. Each frame also has a metadata field that describes what should be done with the previous frame to merge it with the next one. There are no property IDs that I know of to obtain the frame offset, size and undraw method for each frame. I think you need to render each frame into a bitmap yourself to get a proper sequence of images. Very ugly details, GIF needs to die.
If you look into libgdiplus you will see that the properties are always read from the active bitmap:
if (gdip_bitmapdata_property_find_id(image->active_bitmap, propID, &index) != Ok) {
You can set the active bitmap by calling Image.SelectActiveFrame, and then Mono will return the correct durations, one by one. Since this is an incompatibility with Windows, I'd call it a Mono bug. As a simple workaround, you can of course just check the array length and handle both cases. This will be better than a check for Mono, because if Mono gets fixed this will continue to work.
TL;DR below
Hi,
I'm trying to make an in-game screen, whose purpose is to show the contents of another window running on the computer (under Windows). For that matter, I had to get my hands dirty and play with that good ol' Windows legacy API, the one with external references and everything. After a couple days of struggle, I was able to make it work! The only problem is that Unity doesn't seem to support it...
I already asked this on Unity Answers, but got no answers.
I have a WindowOperation class which wraps all of the API calls into static methods, and also performs a few basic tasks. One of them consists in capturing the screen pixels and saving them in a System.Drawing.Bitmap:
public static Bitmap CaptureApplication(string procName)
{
// ...
var bmp = new Bitmap(width, height, PixelFormat.Format24bppRgb);
// ...
return bmp;
}
Unity fails declaring bmp and outputs the following message in the console:
ArgumentException: Parameter is not valid.
System.Drawing.Bitmap..ctor (Int32 width, Int32 height, PixelFormat format)
(wrapper remoting-invoke-with-check) System.Drawing.Bitmap:.ctor (int,int,System.Drawing.Imaging.PixelFormat)
[More info about the stack trace, which leads to the declaration of bmp in the above code]
According to MSDN and other threads on SO, such exceptions are thrown when .NET is unable to allocate a single block of contiguous memory as big as requested.
Since I didn't have such problems when I tested my code outside of Unity, my best guess is that (correct me if I'm wrong) Unity sandboxes the execution of the script and somehow doesn't allow the allocation of the memory bmp needs.
My screen dimension is 1600*900, and pixel depth is 3 bytes ( PixelFormat.Format24bppRgb ), so the amount of memory needed is 4,320,000 bytes, or just a bit more than 4MB, which doesn't seem excessive to me.
To capture the screen, I'm using .NET's Graphics.CopyFromScreen() function. I don't know how it works, but let's say it uses the same amount of memory as bmp, so the biggest amount of memory being used at one point of time in this script according to a pessimistic estimation rounds up to 10MB. However, I don't think this changes the root of the problem, and bmp is declared before any usage of CopyFromScreen() anyway.
TL;DR, is there a way I can tell Unity that it should let me allocate tons of memory in my scripts if I want to? Or maybe I'm completely off track and the problem lies somewhere else?
All answers are more than welcome!
Thank you in advance. :)
I am trying to develop a basic screen sharing and collaboration app in C#. I am currently working on capturing the screen, finding areas of the screen that have changed and subsequently need to be transmitted to the end client.
I am having a problem in that the overall frame rate of the screen capture is too low. I have a fairly good algorithm for finding areas of the screen that have changed. Given a byte array of pixels on the screen it calculates areas that have changed in 2-4ms, however the overall frame rate I am getting is 15-18 fps (i.e. taking somewhere around 60ms per frame). The bottleneck is capturing the data on the screen as a byte array which is taking around 35-50ms. I have tried a couple of different techniques and can't push the fps past 20.
At first I tried something like this:
var _bmp = new Bitmap(_screenSectionToMonitor.Width, _screenSectionToMonitor.Height);
var _gfx = Graphics.FromImage(_bmp);
_gfx.CopyFromScreen(_screenSectionToMonitor.X, _screenSectionToMonitor.Y, 0, 0, new Size(_screenSectionToMonitor.Width, _screenSectionToMonitor.Height), CopyPixelOperation.SourceCopy);
var data = _bmp.LockBits(new Rectangle(0, 0, _screenSectionToMonitor.Width, _screenSectionToMonitor.Height), ImageLockMode.ReadOnly, _bmp.PixelFormat);
var ptr = data.Scan0;
Marshal.Copy(ptr, _screenshot, 0, _screenSectionToMonitor.Height * _screenSectionToMonitor.Width * _bytesPerPixel);
_bmp.UnlockBits(data);
This is too slow, taking around 45ms just to run the code above for a single 1080p screen. This makes the overall frame rate too slow to be smooth, so I then tried using DirectX as per the example here:
http://www.codeproject.com/Articles/274461/Very-fast-screen-capture-using-DirectX-in-Csharp
However, this didn't really net any results. It marginally increased the speed of the screen capture, but it was still much too slow (taking around 25-40ms), and the small increase wasn't worth the overhead of the extra DLLs, code, etc.
After googling around a bit I couldn't really find any better solutions, so my question is what is the best way to capture the pixels currently displaying on the screen? An ideal solution would:
Capture the screen as an array of bytes as RGBA
Work on older windows platforms (e.g. Windows XP and above)
Work with multiple displays
Use existing system libraries rather than 3rd party DLLs
All these points are negotiable for a solution that returns a decent overall framerate, in the region of 5-10ms for the actual capturing so the framerate can be 40-60fps.
Alternatively, if there is no solution that matches the above, am I taking the wrong path to calculate screen changes? Is there a better way to calculate areas of the screen that have changed?
Perhaps you can access the screen buffers at a lower level of code and hook directly into the layers and regions Windows uses as part of its screen updates. It sounds like you are after the raw display changes and Windows already has to keep track of this data. Just offering a direction for you to pursue while you find someone more knowledgeable.
Not sure if what I'm trying to do will work out, or is even possible. Basically I'm creating a remote desktop type app which captures the screen as a jpeg image and sends it to the client app for displaying.
I want to reduce the amount of data sent each time by comparing the image to the older one and only sending the differences. For example:
var bitmap = new Bitmap(1024, 720);
string oldBase = "";
using (var stream = new MemoryStream())
using (var graphics = Graphics.FromImage(bitmap))
{
graphics.CopyFromScreen(bounds.X, bounds.Y, 0, 0, bounds.Size);
bitmap.Save(stream, ImageFormat.Jpeg);
string newBase = Convert.ToBase64String(stream.ToArray());
// ! Do compare/replace stuff here with newBase and oldBase !
// Store the old image as a base64 string.
oldBase = newBase;
}
Using something like this I could compare both base64 strings and replace any matches. The matched text could be replaced with something like:
[number of characters replaced]
That way, on the client side I know where to replace the old data and add the new. Again, I'm not sure if this would even work, so anyone's thoughts on this would be very appreciated. :) If it is possible, could you point me in the right direction? Thanks.
You can do this by comparing the bitmap bits directly. Look into Bitmap.LockBits, which will give you a BitmapData pointer from which you can get the pixel data. You can then compare the pixels for each scan line and encode them into whatever format you want to use for transport.
Note that a scan line's length in bytes is always a multiple of 4. So unless you're using 32-bit color, you have to take into account the padding that might be at the end of the scan line. That's what the Stride property is for in the BitmapData structure.
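To make the padding concrete, here is a minimal sketch of the usual 4-byte alignment arithmetic. ComputeStride and ComputePadding are hypothetical helper names; in real code you would read BitmapData.Stride rather than compute it, since the actual value can also be negative for bottom-up bitmaps.

```csharp
using System;

// Hypothetical helper: smallest scan line length in bytes that is a
// multiple of 4 and still holds widthInPixels pixels.
static int ComputeStride(int widthInPixels, int bitsPerPixel)
{
    // Round the bit count up to the next multiple of 32, then convert to bytes.
    return checked((widthInPixels * bitsPerPixel + 31) / 32 * 4);
}

// Hypothetical helper: number of unused padding bytes at the end of each scan line.
static int ComputePadding(int widthInPixels, int bitsPerPixel)
{
    int usedBytes = (widthInPixels * bitsPerPixel + 7) / 8;
    return ComputeStride(widthInPixels, bitsPerPixel) - usedBytes;
}
```

For example, a 1023-pixel-wide 24-bit line uses 3069 bytes of pixel data, so the stride is 3072 and the last 3 bytes of each line should be skipped when comparing.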
Doing things on a per-scanline basis is easier, but potentially not as efficient (in terms of reducing the amount of data sent) as treating the bitmap as one contiguous block of data. Your transport format should look something like:
<start marker>
// for each scan line
<scan line marker><scan line number>
<pixel position><number of pixels><pixel data>
<pixel position><number of pixels><pixel data>
...
// next scan line
<scan line marker><scan line number>
...
<end marker>
Each <pixel position><number of pixels><pixel data> entry is a run of changed pixels. If a scan line has no changed pixels, you can choose not to send it. Or you can just send the scan line marker and number, followed immediately by the next scan line.
Two bytes will be enough for the <pixel position> field and for the <number of pixels> field. So you have an overhead of four bytes for each block. An optimization you might be interested in, after you have the simplest version working, would be to combine blocks of changed/unchanged pixels if there are small runs. For example, if you have uucucuc, where u is an unchanged pixel and c is a changed pixel, you'll probably want to encode the cucuc as one run of five changed pixels. That will reduce the amount of data you have to transmit.
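The run detection and the merging of small gaps described above could be sketched roughly like this. FindChangedRuns and the maxGap parameter are made up for illustration; the function works on two byte arrays holding the pixel bytes of one scan line (padding excluded) and leaves the actual encoding to the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns (start pixel, pixel count) for each run of
// changed pixels in a scan line. Changed pixels separated by at most maxGap
// unchanged pixels are merged into one run to save per-block overhead.
static List<(int Start, int Length)> FindChangedRuns(
    byte[] oldLine, byte[] newLine, int bytesPerPixel, int maxGap)
{
    var runs = new List<(int Start, int Length)>();
    int pixelCount = Math.Min(oldLine.Length, newLine.Length) / bytesPerPixel;
    int runStart = -1;     // first pixel of the run being built, -1 if none
    int lastChanged = -1;  // last changed pixel seen in that run

    for (int p = 0; p < pixelCount; p++)
    {
        bool changed = false;
        for (int b = 0; b < bytesPerPixel && !changed; b++)
            changed = oldLine[p * bytesPerPixel + b] != newLine[p * bytesPerPixel + b];
        if (!changed) continue;

        if (runStart >= 0 && p - lastChanged - 1 <= maxGap)
        {
            lastChanged = p; // gap is small enough: extend the current run
        }
        else
        {
            if (runStart >= 0) runs.Add((runStart, lastChanged - runStart + 1));
            runStart = p;
            lastChanged = p;
        }
    }
    if (runStart >= 0) runs.Add((runStart, lastChanged - runStart + 1));
    return runs;
}
```

For the uucucuc example, a maxGap of 1 gives a single run starting at pixel 2 with length 5, while a maxGap of 0 gives three one-pixel runs.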
Note that this isn't the best way to do things, but it's simple, effective, and relatively easy to implement.
In any case, once you've encoded things, you can run the data through the built-in GZip compressor (although doing so might not help much) and then push it down the pipe to the client, which would decompress it and interpret the result.
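For the compression step, a minimal sketch using GZipStream from System.IO.Compression might look like the following. The Compress and Decompress names are just for illustration, and whether compression pays off depends on how repetitive the encoded runs are.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Compresses an encoded frame before sending it to the client.
static byte[] Compress(byte[] data)
{
    using var output = new MemoryStream();
    using (var gzip = new GZipStream(output, CompressionMode.Compress))
    {
        gzip.Write(data, 0, data.Length);
    } // disposing the GZipStream flushes the remaining compressed bytes
    return output.ToArray();
}

// Reverses Compress on the receiving side.
static byte[] Decompress(byte[] compressed)
{
    using var input = new MemoryStream(compressed);
    using var gzip = new GZipStream(input, CompressionMode.Decompress);
    using var output = new MemoryStream();
    gzip.CopyTo(output);
    return output.ToArray();
}
```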
It would be easiest to build this on a single machine, using two windows to verify the results. Once that's working, you can hook up the network transport piece. Debugging the initial cut by having that transport step in the middle could prove very frustrating.
We're currently working on something very similar. Basically, what you're trying to implement is a video codec (a very simple Motion JPEG). There are some simple approaches and some very complicated ones.
The simplest approach is to compare consecutive frames and send only the differences. You may try to compare color differences between the frames in RGB space or YCbCr space and send only the pixels that changed with some metadata.
The more complicated solution is to compare the pictures after DCT transformation but before entropy coding. That would give you better comparisons and remove some ugly artifacts.
Check more info on JPEG, Motion JPEG, H.264 - you may use some methods these codecs are using or simply use the existing codec if possible.
This won't work for a JPEG. You need to use BMP, or possibly uncompressed TIFF.
I think if it were me I'd use BMP, scan the pixels for changes and construct a PNG where everything except the changes was transparent.
First, this would reduce your transmission size because the PNG compression is quite good, especially for repeating pixels.
Second, it makes display on the receiving end very easy, since you can simply paint the new image over the top of the old image.
I'm working on a university project and got stuck with a memory issue.
I load a bitmap which takes about 1.5GB on the HDD with the code below:
Bitmap bmp = new Bitmap(pathToFile);
The issue is that the newly created Bitmap object uses about 3.5GB of RAM, which is something I can't understand (that's a really BIG wrapper :E). I need to get to the pixel array, and the Bitmap class is really helpful (I use the LockBits() method later, and process the array byte by byte), but in this case it's a total blocker. So here is my question:
Is there any easy way to extract the pixel array without allocating an additional 2GB?
I'm using C# just to extract the needed array, which is later processed in C++. Maybe I can extract all the needed data in C++ (but a conversion issue appears here, as I'm concentrating on the 24-bit BGR format)?
PS: I need to keep the whole bitmap in memory so splitting it into parts is no solution.
PS2: Just to clarify some issues: I know the difference between a file extension and a file format. The loaded file is an uncompressed bitmap, 3 bytes per pixel, of size ~1.42GB (16k x 32k pixels), so why is the Bitmap object more than two times bigger? No decompression or conversion into another format is taking place.
Consider using Memory Mapped Files to access your HUGE data :).
An example focused on what you need can be found here: http://visualstudiomagazine.com/articles/2010/06/23/memory-mapped-files.aspx
It's in managed code but you might as well use it from equivalent native code.
Let me know if you need more details.
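As a rough illustration of the memory-mapped approach, the sketch below maps a file read-only and copies out a small range of bytes. ReadMappedBytes and ReadPixelDataOffset are hypothetical helper names; the second one relies on the BMP file header storing the offset of the pixel data as a 32-bit little-endian value at byte 10, and assumes a little-endian machine.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical helper: maps the file read-only and copies count bytes
// starting at offset. Only the requested view is touched, so the whole
// file does not have to be loaded into managed memory.
static byte[] ReadMappedBytes(string path, long offset, int count)
{
    using var mmf = MemoryMappedFile.CreateFromFile(
        path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read);
    using var view = mmf.CreateViewAccessor(offset, count, MemoryMappedFileAccess.Read);
    var buffer = new byte[count];
    view.ReadArray(0, buffer, 0, count);
    return buffer;
}

// Hypothetical helper: reads where the pixel array starts in a BMP file.
static int ReadPixelDataOffset(string path)
{
    return BitConverter.ToInt32(ReadMappedBytes(path, 10, 4), 0);
}
```

For a file as large as the one in the question you would probably create one view over the pixel data and process it in place (or hand the mapping to the C++ side) instead of copying it into a byte array.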
You can use this solution, "Work with bitmaps faster in C#":
http://www.codeproject.com/Tips/240428/Work-with-bitmap-faster-with-Csharp
Or you can use memory mapped files
http://visualstudiomagazine.com/articles/2010/06/23/memory-mapped-files.aspx
You can stop memory caching.
Instead of
Bitmap bmp = new Bitmap(pathToFile);
Use
var bmp = (Bitmap)Image.FromStream(sourceFileStream, false, false);
see https://stackoverflow.com/a/47424918/887092
I am currently in the process of moving my C# application over to Qt / C++. I'm running into problems with lengths from TagLib. I find it odd that TagLib# returns audio durations in milliseconds, while TagLib returns its (incorrect) durations in seconds. TagLib just returns zero for the length values, while TagLib# remains correct.
Here is my source in C# / TagLib#...
TagLib.File tagfile = TagLib.File.Create(path);
uint milliseconds = (uint)tagfile.Properties.Duration.TotalMilliseconds;
And here is what should be nearly equivalent in C++ / TagLib. I've even forced it to read accurately. No success.
TagLib::FileName fn(path);
TagLib::FileRef fr(fn, true, TagLib::AudioProperties::Accurate);
uint length = fr.audioProperties()->length();
It works as expected for a good majority of my media files. However, a select few audio files fail to return any audio properties (the rest of the tag information reads fine!). The exact same audio properties are returned with no issues on TagLib#.
Any ideas are appreciated. Thanks.
Does anyone have any more ideas before the bounty ends?
Hi, there is a patch to TagLib that calculates the length in milliseconds. This guy added a method (lengthMilliseconds()) that returns the length in milliseconds; maybe that could be useful for you:
http://web.archiveorange.com/archive/v/sF3Pjr01lSQjsqjrAC7L
A lot has changed in TagLib#'s parsing of audio files since it was originally ported, so it's hard to say where exactly the difference would occur. You may check your C++ program for debug messages.
My guess is that the difference is in how the two libraries react to invalid headers. It appears that if the first frame header it finds is invalid, TagLib won't calculate any audio property values. TagLib#, on the other hand, looks for the first valid header in the first 16KiB of the audio part of the file. If the first header it encounters is corrupt, it will scan for the next one. If I remember correctly, an incorrectly saved ID3v2 tag could result in 0xFF FF FF FF appearing in the beginning of the audio section of the file. This would trigger the type of failure described above.
The problem is at line 166 of taglib/mpeg/mpegproperties.cpp. This could be solved using the same approach as lines 171 to 191, but you would want to update the code to give up after a point in case it really isn't an MP3 file.
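To illustrate the "skip the corrupt header and keep scanning, but give up after a limit" idea, here is a simplified sketch. It is not the actual TagLib or TagLib# code; the function names are made up, and the validity test only checks the frame sync and the reserved or invalid field values of an MPEG audio frame header.

```csharp
using System;

// Hypothetical check: does this look like the start of a valid MPEG audio frame?
static bool LooksLikeMpegFrameHeader(byte b0, byte b1, byte b2)
{
    if (b0 != 0xFF || (b1 & 0xE0) != 0xE0) return false; // 11-bit frame sync
    if (((b1 >> 3) & 0x03) == 0x01) return false;        // reserved MPEG version
    if (((b1 >> 1) & 0x03) == 0x00) return false;        // reserved layer
    if (((b2 >> 4) & 0x0F) == 0x0F) return false;        // invalid bitrate index
    if (((b2 >> 2) & 0x03) == 0x03) return false;        // reserved sample rate
    return true;
}

// Hypothetical scan: returns the offset of the first plausible 4-byte frame
// header within the first scanLimit bytes, or -1 if none is found.
static int FindFirstPlausibleHeader(byte[] audio, int scanLimit)
{
    int end = Math.Min(audio.Length, scanLimit) - 3;
    for (int i = 0; i < end; i++)
    {
        if (LooksLikeMpegFrameHeader(audio[i], audio[i + 1], audio[i + 2]))
            return i;
    }
    return -1;
}
```

A run of 0xFF FF FF FF fails this check because the bitrate index and sample rate fields are all ones, so the scan moves on to the next candidate instead of giving up at the first header.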
As of this writing, TagLib 1.11 BETA 2 natively supports getting the length of audio in milliseconds. You can do so with the following code:
TagLib::FileRef f(path);
int lengthInMillis = f.audioProperties()->lengthInMilliseconds();