I have a PDF with four pages: two images on the first page, one on the second, and one on the third. When I retrieve the value for the image on the second or fourth page, I get a negative height. I tried taking the absolute value as a quick fix, but the Y position of the image was still slightly off. The height and positioning on page three were fine.
Update: So far, this only seems to be a problem with PDFs created in Google Docs.
My code to extract the PDF images was taken from this thread: "Using iText 7, what's the proper way to export a Flate encoded image?".
This is how I access the height
var currentPDFImageInfo = extractedImages[i];
var currentPDFImageMatrix = currentPDFImageInfo.RenderInfo.GetImageCtm();
float pdfImageWidth = currentPDFImageMatrix.Get(iText.Kernel.Geom.Matrix.I11);
How I retrieve the PDF image data
public static List<PDFImageInfo> ExtractImagesFromPDF(string filePath)
{
Reader = new PdfReader(filePath);
Document = new PdfDocument(Reader);
var strategy = new ImageRenderListener();
PdfCanvasProcessor parser = new PdfCanvasProcessor(strategy);
for (int pageNumber = 1; pageNumber <= Document.GetNumberOfPages(); pageNumber++)
{
strategy.CurrentPageNumber = pageNumber;
parser.ProcessPageContent(Document.GetPage(pageNumber));
}
return strategy.ImageInfoList;
}
And of course the Strategy class
public class ImageRenderListener : IEventListener
{
public void EventOccurred(IEventData data, EventType type)
{
if (data is ImageRenderInfo imageData)
{
try
{
if (imageData.GetImage() == null)
{
Console.WriteLine("Image could not be read.");
}
else
{
var pdfImageInfo = new PDFImageInfo(CurrentPageNumber, imageData);
ImageInfoList.Add(pdfImageInfo);
}
}
catch (Exception ex)
{
Console.WriteLine("Image could not be read: {0}.", ex.Message);
}
}
}
public ICollection<EventType> GetSupportedEvents()
{
return null;
}
public int CurrentPageNumber { get; set; }
public List<PDFImageInfo> ImageInfoList { get; set; } = new List<PDFImageInfo>();
}
This is how I access the height
var currentPDFImageInfo = extractedImages[i];
var currentPDFImageMatrix = currentPDFImageInfo.RenderInfo.GetImageCtm();
float pdfImageWidth = currentPDFImageMatrix.Get(iText.Kernel.Geom.Matrix.I11);
This value is the height only under certain circumstances.
Some background: The contents of a PDF page are drawn by a sequence of instructions in some content stream. Some of these instructions can manipulate the so-called current transformation matrix (CTM), which represents an affine transformation, i.e. some combination of rotation, translation, scaling, mirroring, and skewing. Everything other instructions draw is transformed by the CTM value at the time that instruction is executed.
When a bitmap image is drawn, it is conceptually first reduced to a 1×1 square which then is transformed by the CTM to the final form of the image on the page.
If the image is displayed upright, with no rotation or anything else involved, then indeed the I11 value is the width of the displayed image and the I22 value is the height. The I12 and I21 values are 0 then.
But often bitmaps are displayed at 90° clockwise or counterclockwise (e.g. because someone held the camera at a 90° angle while shooting). In these cases I11 and I22 are 0 while I12 and I21 are the height and width respectively, with one or the other having a negative sign depending on the direction of the rotation.
If the bitmap is rotated by 180°, I11 and I22 again contain width and height, but both with a negative sign. If it's mirrored along the x axis or the y axis, one of them is negative.
And if the transformation is something else, e.g. a rotation by an angle that's not a multiple of 90°, finding the height and width becomes more complicated.
Actually then it is not even clear what height and width of the skewed, rotated, and mirrored form shall mean.
Thus, as a start please define which values you exactly are after; based on that you can try and determine them from arbitrary transformation matrices.
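As a rough sketch of two possible definitions, the following computes (a) the lengths of the two sides of the transformed unit square and (b) the size of its axis-aligned bounding box on the page. The class and method names are made up for illustration and are not part of iText; a, b, c, d are assumed to be the values read from the image CTM via Matrix.I11, I12, I21 and I22.

```csharp
using System;

public static class CtmSize
{
    // Lengths of the images of the unit vectors (1,0) and (0,1):
    // the image's own width and height as drawn, regardless of rotation or mirroring.
    public static (double Width, double Height) SideLengths(double a, double b, double c, double d)
    {
        double width = Math.Sqrt(a * a + b * b);
        double height = Math.Sqrt(c * c + d * d);
        return (width, height);
    }

    // Extent of the transformed unit square along the page's x and y axes.
    public static (double Width, double Height) BoundingBox(double a, double b, double c, double d)
    {
        return (Math.Abs(a) + Math.Abs(c), Math.Abs(b) + Math.Abs(d));
    }
}
```

For an upright or mirrored image both definitions agree and simply drop the sign; for a 90° rotation they swap, which is why you need to decide first which one you want.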
Another possible cause for unexplainable coordinate data for pages after the first one is that your code re-uses the PdfCanvasProcessor for each page without resetting:
var strategy = new ImageRenderListener();
PdfCanvasProcessor parser = new PdfCanvasProcessor(strategy);
for (int pageNumber = 1; pageNumber <= Document.GetNumberOfPages(); pageNumber++)
{
strategy.CurrentPageNumber = pageNumber;
parser.ProcessPageContent(Document.GetPage(pageNumber));
}
This causes the graphics state at the end of one page incorrectly to be used as starting graphics state of the next one. Instead you should either use a new PdfCanvasProcessor instance for each page or call parser.Reset() at the start of each page.
Sorry, my English is not very good.
I've tried to make a real-time candlestick chart and it is almost complete.
But there are a few problems.
When new data comes in, my program works, but it only shows half the width of the last candle.
I can then move the scroll bar to the end to see the finished bar. I don't know what to change.
When the chart area is zoomed, the length of the x-axis changes slightly when data is received.
I want to fix the size of the area where the candle is displayed.
The following is part of the source:
private void RealChart(Chart chart)
{
int a = chtMain.Series["BaseCandle"].Points.Count;
double yMinValue = double.MaxValue;
double yMaxValue = double.MinValue;
for (int i = 0; i < a; i++)
{
Series s = chtMain.Series["BaseCandle"];
if (i < s.Points.Count)
{
yMaxValue = Math.Max(yMaxValue, s.Points[i].YValues[0]);
yMinValue = Math.Min(yMinValue, s.Points[i].YValues[1]);
}
}
chtMain.ChartAreas["ChartArea1"].AxisY.Maximum = yMaxValue + Math.Abs(yMaxValue * 0.05);
chtMain.ChartAreas["ChartArea1"].AxisY.Minimum = yMinValue - Math.Abs(yMinValue * 0.05);
chtMain.ChartAreas["ChartArea1"].AxisX.ScaleView.Scroll(a);
}
First time doing this. I am currently building a bot using C# and want my bot to be able to move the mouse to a given point in a way that looks human. By this I am referring to the dragging of the mouse when a human moves the cursor to a point they are trying to click on. Currently my C# bot moves the mouse instantly to the location which doesn't look human.
private static Point[] FindColor(Color color)
{
int searchValue = color.ToArgb();
List<Point> result = new List<Point>();
using (Bitmap bmp = GetScreenShot())
{
for (int x = 0; x < bmp.Width; x++)
{
for (int y = 0; y < bmp.Height; y++)
{
if (searchValue.Equals(bmp.GetPixel(x, y).ToArgb()))
result.Add(new Point(x, y));
}
}
}
return result.ToArray();
}
// FUNCTIONS OCCUR BELOW
// Error message if program could not find bitmap within screenshot show error message
Color myRgbColor = new Color(); // Creates new colour called myRgbColor
myRgbColor = Color.FromArgb(51, 90, 9); // This colour equals the RGB value
Point[] points = FindColor(myRgbColor); // Create an array called points which list all the points found in the screen where the RgB value matches.
if (points.Length > 0)
{
Cursor.Position = points[2]; // Move mouse cursor to first point (Point 0)
Thread.Sleep(0200);
MouseClick();
}
if (points.Length == 0)
{
MessageBox.Show("No matches!"); // Return error
goto checkore;
}
You're going to want to use some kind of Timer with a callback, to move the mouse incrementally, step by step. As for the movement itself, you have a world of possibilities, but it's all maths.
So, let's decompose the problem.
What is a natural mouse movement?
Position change rate
It doesn't necessarily look like it, but when you move your mouse, you're simply setting its position multiple times per second.
The number of times the position changes per second is equivalent to the polling rate of your mouse. The default polling rate for USB mice is 125Hz (or 125 position changes per second, if you will). This is the value we'll use for our Timer: its callback will be called 125 times per second.
var timer = new Timer(1000 / 125d);
timer.Elapsed += MoveMouse;
void MoveMouse(object sender, ElapsedEventArgs e) { }
Speed and acceleration
When you move your mouse, the distance between two cursor positions is not constant, because you're fast when you start moving your mouse, but you slow down when you get close to the item you want your cursor to be on.
There are also two ways I personally usually move my mouse depending on the context/mood:
One fast uniform movement to get close to the destination, then one slow to correct and get on it (I'll usually go past the destination during the first move)
One medium-slow movement with a small deceleration, followed by a stronger deceleration at the end
The overall speed of the movement also depends on three factors:
The distance between your cursor and the destination
The size of the destination area
Your personal speed
I have absolutely NO IDEA how to work out the formula based on these factors, that's gonna be a work of trial and error for yourself.
This one is purely math and observation based, and will be tricky to get perfectly right, if ever; every person moves their mouse a different way.
The solution I can offer you is to simply forget about deceleration, correction and so on, and just divide your movement into equal steps. That has the merit of being simple.
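using System;
using System.Drawing;
using System.Threading;
using System.Timers;
using System.Windows.Forms;
using Timer = System.Timers.Timer; // disambiguate from System.Windows.Forms.Timer
public class Program
{
    static volatile int stepCount = 0;
    static int numberOfSteps = 0;
    static float stepDistanceX = 0;
    static float stepDistanceY = 0;
    static PointF startPoint;
    static PointF destinationPoint;
    static Timer timer;
    public static void Main()
    {
        int timerStepDurationMs = 1000 / 125;
        startPoint = Cursor.Position; // Point converts implicitly to PointF
        destinationPoint = new PointF(2000, 1800); // or however you select your point
        int movementDurationMs = new Random().Next(900, 1100); // roughly 1 second
        // Assign the static field; declaring a local here would leave the field at 0.
        numberOfSteps = movementDurationMs / timerStepDurationMs;
        stepDistanceX = (destinationPoint.X - startPoint.X) / numberOfSteps;
        stepDistanceY = (destinationPoint.Y - startPoint.Y) / numberOfSteps;
        timer = new Timer(timerStepDurationMs);
        timer.Elapsed += MoveMouse;
        timer.Start();
        // Wait until the movement is finished without spinning at full speed.
        while (stepCount < numberOfSteps) { Thread.Sleep(1); }
    }
    static void MoveMouse(object sender, ElapsedEventArgs e)
    {
        int step = ++stepCount;
        if (step >= numberOfSteps)
        {
            timer.Stop();
            Cursor.Position = Point.Round(destinationPoint);
            return;
        }
        // Cursor.Position returns a copy of a struct, so its X and Y cannot be
        // modified in place; assign a whole new Point computed from the start.
        Cursor.Position = Point.Round(new PointF(
            startPoint.X + stepDistanceX * step,
            startPoint.Y + stepDistanceY * step));
    }
}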
Note that I haven't tested with "Cursor", but with some PointF variable instead. It seems to work fine here: dotnetfiddle.
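If you later want to experiment with the deceleration described earlier, one common option (not something the code above does) is to run the step fraction through an easing function before interpolating. The sketch below uses an ease-out cubic curve; the class name, method names and the choice of curve are assumptions made for illustration, and the positions are plain tuples so the example does not depend on Cursor.

```csharp
using System;
using System.Collections.Generic;

public static class EasedPath
{
    // Ease-out cubic: large steps at the start, small steps near the end. t is in [0, 1].
    public static double EaseOutCubic(double t)
    {
        double u = 1 - t;
        return 1 - u * u * u;
    }

    // Returns the positions to visit, one per timer tick, ending exactly on 'end'.
    public static List<(double X, double Y)> Build(
        (double X, double Y) start, (double X, double Y) end, int steps)
    {
        var path = new List<(double X, double Y)>(steps);
        for (int i = 1; i <= steps; i++)
        {
            double eased = EaseOutCubic((double)i / steps);
            path.Add((start.X + (end.X - start.X) * eased,
                      start.Y + (end.Y - start.Y) * eased));
        }
        return path;
    }
}
```

In the timer callback you would then set the cursor to the next entry of the list instead of adding a constant step. Which curve looks most natural is still a matter of trial and error.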
That's how I wrote your beautiful code (with some simple changes to make it easier for me to understand):
private void Form1_Load(object sender, EventArgs e)
{
prev = GetDesktopImage();//get a screenshot of the desktop;
cur = GetDesktopImage();//get a screenshot of the desktop;
var locked1 = cur.LockBits(new Rectangle(0, 0, cur.Width, cur.Height),
ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
var locked2 = prev.LockBits(new Rectangle(0, 0, prev.Width, prev.Height),
ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
ApplyXor(locked1, locked2);
compressionBuffer = new byte[1920* 1080 * 4];
// Compressed buffer -- where the data goes that we'll send.
int backbufSize = LZ4.LZ4Codec.MaximumOutputLength(this.compressionBuffer.Length) + 4;
backbuf = new CompressedCaptureScreen(backbufSize);
MessageBox.Show(compressionBuffer.Length.ToString());
int length = Compress();
MessageBox.Show(backbuf.Data.Length.ToString());//prints the new buffer size
}
The compression buffer length is, for example, 8294400,
and backbuf.Data.Length is 8326947.
I didn't like the compression suggestions, so here's what I would do.
You don't want to compress a video stream (so MPEG, AVI, etc are out of the question -- these don't have to be real-time) and you don't want to compress individual pictures (since that's just stupid).
Basically what you want to do is detect if things change and send the differences. You're on the right track with that; most video compressors do that. You also want a fast compression/decompression algorithm; especially if you go to more FPS that will become more relevant.
Differences. First off, eliminate all branches in your code, and make sure memory access is sequential (e.g. iterate x in the inner loop). The latter will give you cache locality. As for the differences, I'd probably use a 64-bit XOR; it's easy, branchless and fast.
If you want performance, it's probably better to do this in C++: The current C# implementation doesn't vectorize your code, and that will help you a great deal here.
Do something like this (I'm assuming 32bit pixel format):
for (int y=0; y<height; ++y) // change to PFor if you like
{
ulong* row1 = (ulong*)(image1BasePtr + image1Stride * y);
ulong* row2 = (ulong*)(image2BasePtr + image2Stride * y);
for (int x=0; x<width/2; ++x) // each ulong holds two 32-bit pixels
row2[x] ^= row1[x];
}
Fast compression and decompression usually means simpler compression algorithms. https://code.google.com/p/lz4/ is such an algorithm, and there's a proper .NET port available for that as well. You might want to read on how it works too; there is a streaming feature in LZ4 and if you can make it handle 2 images instead of 1 that will probably give you a nice compression boost.
All in all, if you're trying to compress white noise, it simply won't work and your frame rate will drop. One way to solve this is to reduce the colors if you have too much 'randomness' in a frame. A measure for randomness is entropy, and there are several ways to get a measure of the entropy of a picture ( https://en.wikipedia.org/wiki/Entropy_(information_theory) ). I'd stick with a very simple one: check the size of the compressed picture -- if it's above a certain limit, reduce the number of bits; if below, increase the number of bits.
Note that increasing and decreasing bits is not done with shifting in this case; you don't need your bits to be removed, you simply need your compression to work better. It's probably just as good to use a simple 'AND' with a bitmask. For example, if you want to drop 2 bits, you can do it like this:
for (int y=0; y<height; ++y) // change to PFor if you like
{
ulong* row1 = (ulong*)(image1BasePtr + image1Stride * y);
ulong* row2 = (ulong*)(image2BasePtr + image2Stride * y);
ulong mask = 0xFFFCFCFCFFFCFCFC;
for (int x=0; x<width/2; ++x) // each ulong holds two 32-bit pixels
row2[x] = (row2[x] ^ row1[x]) & mask;
}
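To make the "increase or decrease the number of bits" idea a bit more concrete, here is a small sketch that builds such a 64-bit mask for a given number of dropped bits per color channel and adjusts that number based on the last compressed size. The class, the method names and the two size limits are hypothetical; like the mask above, it leaves the alpha byte untouched.

```csharp
using System;

public static class ColorReduction
{
    // Builds a mask covering two 32-bit ARGB pixels; the low 'droppedBits'
    // bits of R, G and B are cleared, alpha is kept as-is.
    public static ulong BuildMask(int droppedBits)
    {
        if (droppedBits < 0 || droppedBits > 7)
            throw new ArgumentOutOfRangeException(nameof(droppedBits));
        uint channel = (uint)((0xFF << droppedBits) & 0xFF);
        uint pixel = 0xFF000000u | (channel << 16) | (channel << 8) | channel;
        return ((ulong)pixel << 32) | pixel;
    }

    // Drop one more bit when the frame compressed badly, restore one when it
    // compressed well. The limits are values you would have to tune yourself.
    public static int AdjustDroppedBits(int droppedBits, int compressedSize,
                                        int upperLimit, int lowerLimit)
    {
        if (compressedSize > upperLimit && droppedBits < 7) return droppedBits + 1;
        if (compressedSize < lowerLimit && droppedBits > 0) return droppedBits - 1;
        return droppedBits;
    }
}
```

BuildMask(2) gives the same value as the hard-coded mask in the loop above.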
PS: I'm not sure what I would do with the alpha component, I'll leave that up to your experimentation.
Good luck!
The long answer
I had some time to spare, so I just tested this approach. Here's some code to support it all.
This code normally runs at over 130 FPS with a nice constant memory pressure on my laptop, so the bottleneck shouldn't be here anymore. Note that you need LZ4 to get this working and that LZ4 is aimed at high speed, not high compression ratios. A bit more on that later.
First we need something that we can use to hold all the data we're going to send. I'm not implementing the sockets stuff itself here (although that should be pretty simple using this as a start), I mainly focused on getting the data you need to send something over.
// The thing you send over a socket
public class CompressedCaptureScreen
{
public CompressedCaptureScreen(int size)
{
this.Data = new byte[size];
this.Size = 4;
}
public int Size;
public byte[] Data;
}
We also need a class that will hold all the magic:
public class CompressScreenCapture
{
Next, if I'm running high performance code, I make it a habit to preallocate all the buffers first. That'll save you time during the actual algorithmic stuff. 4 buffers of 1080p is about 33 MB, which is fine - so let's allocate that.
public CompressScreenCapture()
{
// Initialize with black screen; get bounds from screen.
this.screenBounds = Screen.PrimaryScreen.Bounds;
// Initialize 2 buffers - 1 for the current and 1 for the previous image
prev = new Bitmap(screenBounds.Width, screenBounds.Height, PixelFormat.Format32bppArgb);
cur = new Bitmap(screenBounds.Width, screenBounds.Height, PixelFormat.Format32bppArgb);
// Clear the 'prev' buffer - this is the initial state
using (Graphics g = Graphics.FromImage(prev))
{
g.Clear(Color.Black);
}
// Compression buffer -- we don't really need this but I'm lazy today.
compressionBuffer = new byte[screenBounds.Width * screenBounds.Height * 4];
// Compressed buffer -- where the data goes that we'll send.
// Assign the field rather than declaring a local that shadows it.
this.backbufSize = LZ4.LZ4Codec.MaximumOutputLength(this.compressionBuffer.Length) + 4;
backbuf = new CompressedCaptureScreen(backbufSize);
}
private Rectangle screenBounds;
private Bitmap prev;
private Bitmap cur;
private byte[] compressionBuffer;
private int backbufSize;
private CompressedCaptureScreen backbuf;
private int n = 0;
First thing to do is capture the screen. This is the easy part: simply fill the bitmap of the current screen:
private void Capture()
{
// Fill 'cur' with a screenshot
using (var gfxScreenshot = Graphics.FromImage(cur))
{
gfxScreenshot.CopyFromScreen(screenBounds.X, screenBounds.Y, 0, 0, screenBounds.Size, CopyPixelOperation.SourceCopy);
}
}
As I said, I don't want to compress 'raw' pixels. Instead, I'd much rather compress XOR masks of previous and the current image. Most of the times this will give you a whole lot of 0's, which is easy to compress:
private unsafe void ApplyXor(BitmapData previous, BitmapData current)
{
byte* prev0 = (byte*)previous.Scan0.ToPointer();
byte* cur0 = (byte*)current.Scan0.ToPointer();
int height = previous.Height;
int width = previous.Width;
int halfwidth = width / 2;
fixed (byte* target = this.compressionBuffer)
{
ulong* dst = (ulong*)target;
for (int y = 0; y < height; ++y)
{
ulong* prevRow = (ulong*)(prev0 + previous.Stride * y);
ulong* curRow = (ulong*)(cur0 + current.Stride * y);
for (int x = 0; x < halfwidth; ++x)
{
*(dst++) = curRow[x] ^ prevRow[x];
}
}
}
}
For the compression algorithm I simply pass the buffer to LZ4 and let it do its magic.
private int Compress()
{
// Grab the backbuf in an attempt to update it with new data
var backbuf = this.backbuf;
backbuf.Size = LZ4.LZ4Codec.Encode(
this.compressionBuffer, 0, this.compressionBuffer.Length,
backbuf.Data, 4, backbuf.Data.Length-4);
Buffer.BlockCopy(BitConverter.GetBytes(backbuf.Size), 0, backbuf.Data, 0, 4);
return backbuf.Size;
}
One thing to note here is that I make it a habit to put everything in my buffer that I need to send over the TCP/IP socket. I don't want to move data around if I can easily avoid it, so I'm simply putting everything that I need on the other side there.
As for the sockets themselves, you can use async TCP sockets here (I would), but if you do, you will need to add an extra buffer.
The only thing that remains is to glue everything together and put some statistics on the screen:
public void Iterate()
{
Stopwatch sw = Stopwatch.StartNew();
// Capture a screen:
Capture();
TimeSpan timeToCapture = sw.Elapsed;
// Lock both images:
var locked1 = cur.LockBits(new Rectangle(0, 0, cur.Width, cur.Height),
ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
var locked2 = prev.LockBits(new Rectangle(0, 0, prev.Width, prev.Height),
ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);
try
{
// Xor screen:
ApplyXor(locked2, locked1);
TimeSpan timeToXor = sw.Elapsed;
// Compress screen:
int length = Compress();
TimeSpan timeToCompress = sw.Elapsed;
if ((++n) % 50 == 0)
{
Console.Write("Iteration: {0:0.00}s, {1:0.00}s, {2:0.00}s " +
"{3} Kb => {4:0.0} FPS \r",
timeToCapture.TotalSeconds, timeToXor.TotalSeconds,
timeToCompress.TotalSeconds, length / 1024,
1.0 / sw.Elapsed.TotalSeconds);
}
}
finally
{
// Unlock before swapping, so each BitmapData is released on the bitmap it was locked from.
cur.UnlockBits(locked1);
prev.UnlockBits(locked2);
}
// Swap buffers:
var tmp = cur;
cur = prev;
prev = tmp;
}
Note that I reduce Console output to ensure that's not the bottleneck. :-)
Simple improvements
It's a bit wasteful to compress all those 0's, right? It's pretty easy to track the min and max y position that has data using a simple boolean.
ulong tmp = curRow[x] ^ prevRow[x];
*(dst++) = tmp;
hasdata |= tmp != 0;
You also probably don't want to call Compress if you don't have to.
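A minimal sketch of that idea, written against plain uint arrays instead of locked bitmaps so it can be run on its own; the class and method names are made up for illustration. It ORs the differences of each row together, so the inner loop stays branchless, and records the first and last row that contain data.

```csharp
using System;

public static class DirtyRows
{
    // Writes cur XOR prev into dst. Returns false when nothing changed;
    // otherwise minY and maxY are the first and last rows that changed.
    public static bool XorAndTrack(uint[] prev, uint[] cur, uint[] dst,
                                   int width, int height, out int minY, out int maxY)
    {
        minY = int.MaxValue;
        maxY = -1;
        for (int y = 0; y < height; y++)
        {
            int rowStart = y * width;
            uint rowBits = 0;
            for (int x = 0; x < width; x++)
            {
                uint diff = cur[rowStart + x] ^ prev[rowStart + x];
                dst[rowStart + x] = diff;
                rowBits |= diff; // accumulate without branching
            }
            if (rowBits != 0)
            {
                if (y < minY) minY = y;
                maxY = y;
            }
        }
        return maxY >= 0;
    }
}
```

When the method returns false you can skip Compress entirely; otherwise you only need to compress the rows from minY to maxY.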
After adding this feature you'll get something like this on your screen:
Iteration: 0.00s, 0.01s, 0.01s 1 Kb => 152.0 FPS
Using another compression algorithm might also help. I stuck to LZ4 because it's simple to use, it's blazing fast and compresses pretty well -- still, there are other options that might work better. See http://fastcompression.blogspot.nl/ for a comparison.
If you have a bad connection or if you're streaming video over a remote connection, all this won't work. Best to reduce the pixel values here. That's quite simple: apply a simple 64-bit mask during the xor to both the previous and current picture... You can also try using indexed colors - anyhow, there's a ton of different things you can try here; I just kept it simple because that's probably good enough.
You can also use Parallel.For for the xor loop; personally I didn't really care about that.
A bit more challenging
If you have 1 server that is serving multiple clients, things will get a bit more challenging, as they will refresh at different rates. We want the fastest refreshing client to determine the server speed, not the slowest. :-)
To implement this, the relation between the prev and cur has to change. If we simply 'xor' away like here, we'll end up with a completely garbled picture at the slower clients.
To solve that, we don't want to swap prev anymore, as it should hold key frames (that you'll refresh when the compressed data becomes too big) and cur will hold incremental data from the 'xor' results. This means you can basically grab an arbitrary 'xor'red frame and send it over the line - as long as the prev bitmap is recent.
H264 or Equivalent Codec Streaming
Various compressed streaming codecs are available that do almost everything you can do to optimize screen sharing over a network. There are many open source and commercial libraries for streaming.
Screen transfer in Blocks
H264 already does this, but if you want to do it yourself, you have to divide your screen into smaller blocks of 100x100 pixels, compare these blocks with the previous version, and send only the changed blocks over the network.
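A rough sketch of such a block comparison, assuming both frames are available as flat arrays of 32-bit pixels with the same dimensions; the names are hypothetical and the block size is a parameter rather than a fixed 100. It returns the top-left corner of every block that differs from the previous frame.

```csharp
using System;
using System.Collections.Generic;

public static class BlockDiff
{
    public static List<(int X, int Y)> ChangedBlocks(
        uint[] prev, uint[] cur, int width, int height, int blockSize)
    {
        var changed = new List<(int X, int Y)>();
        for (int by = 0; by < height; by += blockSize)
        {
            for (int bx = 0; bx < width; bx += blockSize)
            {
                // Blocks at the right and bottom edges may be smaller.
                int xEnd = Math.Min(bx + blockSize, width);
                int yEnd = Math.Min(by + blockSize, height);
                if (BlockDiffers(prev, cur, width, bx, by, xEnd, yEnd))
                    changed.Add((bx, by));
            }
        }
        return changed;
    }

    private static bool BlockDiffers(uint[] prev, uint[] cur, int width,
                                     int x0, int y0, int xEnd, int yEnd)
    {
        for (int y = y0; y < yEnd; y++)
            for (int x = x0; x < xEnd; x++)
                if (prev[y * width + x] != cur[y * width + x])
                    return true; // one differing pixel is enough
        return false;
    }
}
```

Each returned coordinate identifies a block whose pixels you would then compress and send together with its position.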
Window Render Information
Microsoft RDP does a lot better: it does not send the screen as a raster image; instead it analyzes the screen and creates screen blocks based on the windows on the screen. It then analyzes the contents of the screen and sends an image only if needed. If it is a text box with some text in it, RDP sends the information needed to render the text box, including the text, font information and other details. So instead of sending an image, it sends information on what to render.
You can combine all techniques and make a mixed protocol to send screen blocks with image and other rendering information.
Instead of handling data as an array of bytes, you can handle it as an array of integers.
int* p = (int*)((byte*)scan0.ToPointer() + y * stride);
int* p2 = (int*)((byte*)scan02.ToPointer() + y * stride2);
for (int x = 0; x < nWidth; x++)
{
//always get the complete pixel when differences are found
if (*p2 != 0)
*p = *p2;
++p;
++p2;
}
I am using mschart to display 10 lines with up to 60,000 points of data each. There is a single ChartArea and an individual Series for each line, set as type FastLine.
Initial performance is very good, with the chart loading almost instantly. The problems start when any sort of interaction is required. In my case this means CursorX position / selection is changed. The GUI thread goes to ~100% usage (a whole core) until the cursor stops moving. During this time the graph updates sporadically. No additional code or functions are being called.
After profiling the application to see where all of the CPU time is being used, it would appear that every time the cursor is moved the whole chart has to be redrawn: all 10 * 60,000 points of data. While this is reasonable with just a few thousand points of data to draw, it doesn't scale very well at all. Changing the cursor's Interval value doesn't seem to make any difference.
Are there any changes I can make to fix / avoid this performance issue? If not, can you recommend any other charting libraries?
//EDIT//
As requested, here is some test code that displays the same issues as mentioned. All that is required is a chart called chart1 exists. Setting CursorX.IsUserEnabled and CursorX.IsUserSelectionEnabled to true allows for the (problematic) interactions to take place:
public MainForm()
{
//
// The InitializeComponent() call is required for Windows Forms designer support.
//
InitializeComponent();
//Set up chart and add values
ChartArea ca = chart1.ChartAreas.Add("Data");
ca.AxisX.IsMarginVisible = false;
ca.CursorX.Interval = 0.001;
ca.CursorX.IsUserEnabled = true;
ca.CursorX.IsUserSelectionEnabled = true;
ca.AxisX.ScaleView.Zoomable = false;
for (int i = 0; i < 10; i++)
{
Series s = new Series("Series_" + i.ToString());
s.ChartArea = ca.Name;
s.ChartType = SeriesChartType.FastLine;
for (int p = 0; p < _maxPoints; p++)
{
double x = p / 100.0; //(10ms steps)
double y = p * (1 + i);
s.Points.AddXY(x, y);
}
chart1.Series.Add(s);
}
}
The larger the value for _maxPoints, the worse the problem becomes.