Magick.net check if two images are identical - c#

I'm trying to compare two screenshots from a webpage using Magick.NET, a C# library from ImageMagick. My code looks like this:
//Allow some color fuzz, otherwise the whole image is reported as different
newScreenshot.ColorFuzz = new Percentage(15);
//Get the difference: 1 = perfectly the same, less than 1 = different.
double diff = newScreenshot.Compare(benchmarkScreenshot, new ErrorMetric(), imgDiff);
//Output the result image for comparison
imgDiff.Write(compareResultPath);
if (diff < 0.998)
{
//Do something
}
In this case I would get values lower than 1, where I imagined 1 would mean "identical" and anything less than 1 would not. I was wrong... So the only way I could think of to check whether they are as identical as possible is to lower the tolerance by lowering the threshold in the if-statement.
So if I take a screenshot of a website and modify it, I get the following values for the "diff" variable:
Identical image: 0.99842343024053205
Removing a sentence: 0.99776453647987487
Removing one letter from any word on the page: 0.99698398328761506
I'm very worried that removing an entire sentence gives a higher value than removing just a single letter.
I also tried ErrorMetric.Absolute rather than new ErrorMetric(); the values that I got for the "diff" variable were:
Identical image: 1949
Removing a sentence: 766
Removing one letter from any word on the page: 75
Is there a better, more accurate way than what I'm doing to check whether there's an actual change or not?

Related

Word-wrap line of string overflow in C# and also give out the number of lines that have been wordwrapped

Following these two questions on SO (1 and 2, the second being my own), I've tried to solve this but I'm running into some problems.
Take a look at where I am stuck!
I have successfully word-wrapped this string "Damodarmarg, Kusunti, Inside Ringroad, Lalitpur, Bagmati, Nepal" using a code (see my SECOND CODE)
But I DO NOT KNOW how to get the number of lines that have been wrapped. (because it wraps automatically).
This is the screenshot of my PRINT PREVIEW:
I want to put some space between Residential Address and Permanent Address. To do that I need to know how many lines are being word-wrapped.
That's my problem. The word-wrap is happening but NOT the line spacing! I just want to know how I can calculate the number of lines that have been word-wrapped so that I can run a function to do appropriate line spacing between two fields!
I want to know the number of lines that have been word-wrapped (which, as you can observe in the screenshot above, is clearly 2 here).
Why do I need this number? If x is the number of lines word-wrapped, I want to execute the function newline() x times. It's my own function, whose purpose is to correct line spacing among different fields.
Example:
For a string "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
IF the page margin only allows 10 characters per line then
The output should be
ABCDEFGHIJ
KLMNOPQRST
UVWXYZ
and the number of lines used, stored in a variable (say, linesFilled), should be = 3
Obviously, this is just a simplified example. In real practice, I would like to have no character limit per line. Instead I want MeasureString to automatically know how many words fit in a line and then word-wrap the rest that do not fit to the next consecutive line(s). For your information: I have already done this much. You see, I seek your help only to know how to get the number of lines that have word-wrapped. That has been really tricky to work around.
What I have tried so far:
FIRST CODE:
My code looks like this (for the line counting, which is the part I need help with):
int charactersFitted;
int linesFilled;
SizeF stringSize = new SizeF();
stringSize = e.Graphics.MeasureString("Residential Address: " +
RAddressTextBox.Text, stringFont, layoutSize, newStringFormat,
out charactersFitted, out linesFilled);
textBox1.Text = Convert.ToString(stringSize);
textBox2.Text = Convert.ToString(stringSize.Width);
So, this first code is supposed to give me the number of lines that were wrapped around the print margin. Currently it just gives the width of the part of the string that occupied the whole line, as opposed to the number of lines the string has occupied. (Is there a method to get the number of lines?)
SECOND CODE:
Graphics RAddress = e.Graphics;
SizeF RAddressLength = RAddress.MeasureString("Residential Address: "+
RAddressTextBox.Text, stringFont,700);
RAddress.DrawString("Residential Address: " + RAddressTextBox.Text,
stringFont, Brushes.Black, new RectangleF(new Point(pagemarginX,newline()),
RAddressLength), StringFormat.GenericTypographic);
This second code helps me actually wrap the string when it does not fit in the page margin. (This second code works perfectly at the moment. It automatically word-wraps text NOT fitting in a line to the next consecutive line(s), but it DOES NOT tell me how many lines have been word-wrapped. THAT'S my problem.)
Note: newline() is my own function which leaves one line when called, and pagemarginX sets the appropriate margin. That's all. Do not be confused. As for why I haven't used DrawString in my FIRST CODE: I have been using both codes, this one to display the string and the FIRST CODE to count the lines in the string. I haven't been able to count the number of lines with this one. Sorry for the confusion.
SAMPLE OUTPUT(s) FOR YOUR INFORMATION: Currently, the output of stringSize.Width is 114.226. As suggested in some of the comments, I tried outputting linesFilled instead of stringSize.Width and the output was 5. Another suggestion was to try int numLines = Convert.ToInt32(Math.Ceiling(layoutSize.Width / stringSize.Width)); which gave me the output 7. As shown in the screenshot of my PRINT PREVIEW above, I obviously need the output to be 2 for my string. Please, somebody help me!
I welcome:
Solutions that augment and enhance my FIRST CODE to produce correct lines of word-wrap as output for the string: "Damodarmarg, Kusunti, Inside Ringroad, Lalitpur, Bagmati, Nepal" (which should be 2)
Solutions that can modify my SECOND CODE to also produce the number of lines that have been wordwrapped (This type of solution would be awesome!!)
Creative solution: A solution that takes input as string and wordwraps that string to a margin and also produces the number of lines word-wrapped. (OR, whatever you think that solves this problem also works!)
My specifications are NOT rigid. You can solve this problem any way you are comfortable with!
I have edited my question to include as much detail as possible. If you'd like the whole module, you can simply ask me too. I am hoping for a solution. Thanks!
Not sure if this is what you're looking for but when using a print dialog you can do something like this that will give you the number of characters on the page and how many lines per page the string takes up:
private void printDocument1_PrintPage(object sender, System.Drawing.Printing.PrintPageEventArgs e)
{
    int charactersOnPage = 0;
    int linesPerPage = 0;
    // Sets charactersOnPage to the number of characters of stringToPrint
    // that fit within the bounds of the page, and linesPerPage to the
    // number of lines they occupy.
    e.Graphics.MeasureString(stringToPrint, this.Font,
        e.MarginBounds.Size, StringFormat.GenericTypographic,
        out charactersOnPage, out linesPerPage);
}
e.MarginBounds.Size is what will do the trick for you, I believe. Then you can just take the "charactersOnPage" value and divide it by "linesPerPage" to get the (approximate, since this is integer division) number of characters that fit on one line:
var charactersPerLine = charactersOnPage / linesPerPage;
Once you have "charactersPerLine" you can accomplish the rest of what you're trying to do.
I think you meant
textBox2.Text = Convert.ToString(linesFilled);
instead of
textBox2.Text = Convert.ToString(stringSize.Width);
Edit: Try this:
int numLines = Convert.ToInt32(Math.Ceiling(layoutSize.Width / stringSize.Width));
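As an alternative to the suggestions above, here is a minimal sketch of counting wrapped lines by hand with a greedy word-wrap. The names WrapCounter, CountWrappedLines and the measure callback are hypothetical and not part of System.Drawing. In a PrintPage handler, measure could be something like s => e.Graphics.MeasureString(s, stringFont).Width, and maxWidth the layout width (700 in the SECOND CODE). GDI+ may break lines slightly differently from this greedy rule, so treat the result as an estimate.

```csharp
using System;

public static class WrapCounter
{
    // Counts the lines produced by a greedy word-wrap of `text`.
    // `measure` is a hypothetical callback returning the rendered width of a string.
    public static int CountWrappedLines(string text, float maxWidth, Func<string, float> measure)
    {
        if (string.IsNullOrWhiteSpace(text)) return 0;

        string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int lines = 1;
        string current = "";

        foreach (string word in words)
        {
            string candidate = current.Length == 0 ? word : current + " " + word;
            if (current.Length > 0 && measure(candidate) > maxWidth)
            {
                // The word does not fit on the current line: start a new one.
                lines++;
                current = word;
            }
            else
            {
                // The word fits (or is the first word on the line, which is
                // kept even if it is wider than maxWidth).
                current = candidate;
            }
        }
        return lines;
    }
}
```

With a stand-in measure that counts one unit per character (s => s.Length) and maxWidth = 40, the address string from the question wraps onto 2 lines. The returned count could then drive the loop that calls newline().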

remove ILArray<> elements

Using ILNumerics, I am trying to take the first n number of columns of an ILArray<> in the most efficient way possible.
using (ILScope.Enter(inframe))
{
    ILArray<complex> frame = ILMath.check(inframe);
    int[] dims = frame.Size.ToIntArray(); //frame is a 2d ILArray
    frame.SetRange(complex.Zero, dims[0] - 1, (dims[1] * 2 - 1)); //doubles the size of the array with zeros.
    //TODO- various computations.
    frame.SetRange(null, dims[0], dims[1] - 1); //EXCEPTION: why doesn't this work?
}
In this example I am trying to take only the first half of the frame, but I am unable to size it back to the original dimensions. I have tried various permutations based on http://ilnumerics.net/ArrayAlter.html but have been unsuccessful.
The documentation for shrinking of ILNumerics arrays says:
The range definition must address the full dimension - for all dimensions except the one, which is to be removed.
You want to remove the last half from the 2nd dimension. So you must define full ranges for all other dimensions involved. Here, since frame is a matrix, there are only 2 dimensions. Hence, the first must get fully addressed.
It is easier with the C# indexer. The following example assumes your code is in a class derived from ILMath. Otherwise, add ILMath. before all the full, r, and end functions / properties:
A[full, r(end / 2, end)] = null;
Watch out for "off by one" errors when addressing with "end". You may want to use end / 2 + 1 instead.
Since you want the most efficient way, performance seems to be important to you. In that case, you should avoid expanding and shrinking arrays! It is better to work with two arrays of different sizes: a large one and the original one. Copy the data accordingly. Expanding and shrinking copies the data anyway, so this is not a disadvantage. Furthermore, frame.Size.ToIntArray() is not needed here. Simply use frame.S[0] and frame.S[1] for the lengths of dimensions 0 and 1.

How do I count modified lines of code?

I have a program which counts lines of code (excluding comments, braces, whitespace, etc.) of two programs then compares them. It puts all the lines from one program in one List and the lines from the other program in another List. It then removes all lines that are identical between the two. One List is then all the lines added to program 1 to get program 2 and the other List is all the lines removed from program 1 to get program 2.
Now I need a way to detect how many lines of code from program 1 have been MODIFIED to get program 2. I found an algorithm for the Levenshtein Distance, and it seems like that will work. I just need to compare the distance with the length of the strings to get a percentage changed, and I'll need to come up with a good value for the threshold.
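The distance-to-length comparison described above can be sketched as follows. This is a minimal illustration and not the asker's actual code: LineSimilarity, Levenshtein and ChangedFraction are hypothetical names, and dividing by the longer string's length is only one possible normalization.

```csharp
using System;

public static class LineSimilarity
{
    // Classic dynamic-programming Levenshtein distance, keeping only two rows.
    public static int Levenshtein(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(
                    Math.Min(curr[j - 1] + 1,   // insertion
                             prev[j] + 1),      // deletion
                    prev[j - 1] + cost);        // substitution or match
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Fraction of the longer line that changed: 0.0 = identical, 1.0 = entirely different.
    public static double ChangedFraction(string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        return longest == 0 ? 0.0 : (double)Levenshtein(a, b) / longest;
    }
}
```

A removed line and an added line could then be counted as one "modified" line when ChangedFraction falls below a chosen threshold. The threshold itself still has to be tuned by hand.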
However my problem is this: how do I know which two strings to compare for the Levenshtein Distance? My best guess is to have a nested for loop and loop through one program once for every line in the other program to compare every line with every other line looking for a Distance that meets my difference threshold. However, that seems very inefficient. Are there any other ways of doing this?
I should add this is for a software engineering class. It's technically homework, but we're allowed to use any resource we need. While I'm just looking for an algorithm, I'll let you know I'm using C#.
If you allow lines to be shuffled, how do you count the changes? Not all shuffled lines might result in identical functionality, even if you compare all lines and find exact matches.
If you compare
var random = new Random();
for (int i = 0; i < 9; i++) {
int randomNumber = random.Next(1, 50);
}
to
for (int i = 0; i < 9; i++) {
var random = new Random();
int randomNumber = random.Next(1, 50);
}
you have four unchanged lines of code, but the second version is likely to produce different results. There is definitely a change in the code, and yet line-by-line comparison will not detect it if you allow shuffling.
This is a good reason to disallow shuffling and actually mark line 1 in the first code as deleted, and line 2 in the second code as added, even though the deleted line and the added line are exactly the same.
Once you decide that lines cannot be shuffled, I think you can figure out quite easily how to match your lines for comparison.
To step through both sources and compare the lines, you might want to look up the balance line algorithm (e.g. http://www.isqa.unomaha.edu/haworth/isqa3300/fs006.htm )
If you assume that lines of code can be shuffled (their order can be changed), then you need to compare all lines from the 1st program to all lines from the 2nd program, excluding unchanged lines.
You can simplify your task by assuming that lines cannot be shuffled. They can only be inserted, removed or unchanged. From my experience, most programs that compare text files work this way.

Get start and end of string comparison

I'm trying to create some sort of string-based diff algorithm on my own.
What I'm doing is: I'm iterating through every paragraph in my text document and comparing the two versions of it.
Now what I'm struggling with is the comparison start and end of both strings.
Consider having the two strings:
This is a test-text.
This is a very long test-text.
This means there's a change of 10 characters (9 text, 1 whitespace) in the second line ('very long ').
These characters should be highlighted accordingly. I've already come up with the solution of finding the start of the string-differences (say: index n is where the differences start):
int diffIndexStart = localText.Zip(serverText, (c1, c2) => c1 == c2).TakeWhile(b => b).Count();
Now how can I detect when the strings match again, so I can stop highlighting there instead of highlighting the rest of the row (starting with diffIndexStart)?
There's also another issue: what about multiple changes within one line? Let's say:
This is a test-text.
This, apparently, is a very long test-text.
Now I've got two changes: , apparently, and very long.
You're looking at the classic Longest Common Subsequence (LCS) problem. There are numerous papers on it (the Wikipedia page gives some links as a start), and several common approaches are already described on Wikipedia.
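A minimal sketch of the LCS idea applied to the two strings from the question: build the LCS table, then walk both strings and flag every character of the new string that is not part of the common subsequence. LcsDiff and MarkInserted are hypothetical names. This version only marks insertions in the second string, and its O(n*m) table is fine for paragraph-sized inputs but not for whole documents.

```csharp
using System;

public static class LcsDiff
{
    // Returns one flag per character of `after`: true when that character is
    // not part of a longest common subsequence with `before` (i.e. inserted).
    public static bool[] MarkInserted(string before, string after)
    {
        int n = before.Length, m = after.Length;

        // lcs[i, j] = LCS length of before[i..] and after[j..].
        int[,] lcs = new int[n + 1, m + 1];
        for (int i = n - 1; i >= 0; i--)
            for (int j = m - 1; j >= 0; j--)
                lcs[i, j] = before[i] == after[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        bool[] inserted = new bool[m];
        int a = 0, b = 0;
        while (a < n && b < m)
        {
            if (before[a] == after[b]) { a++; b++; }              // common character
            else if (lcs[a + 1, b] >= lcs[a, b + 1]) { a++; }     // character removed from `before`
            else { inserted[b] = true; b++; }                     // character added in `after`
        }
        while (b < m) inserted[b++] = true;                       // trailing additions
        return inserted;
    }
}
```

For "This is a test-text." and "This is a very long test-text." the flags are set for indices 10 through 19, which is exactly 'very long '. Runs of consecutive true flags give the start and end of each highlighted range, so multiple changes within one line come out as separate runs.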

string(";P") is bigger or string("-_-") is bigger?

I ran into something very confusing when sorting a text file. Different algorithms/applications produce different results, for example when comparing the two strings str1=";P" and str2="-_-".
Just for your reference, here are the ASCII codes for each char in those strings:
char(';') = 59; char('P') = 80;
char('-') = 45; char('_') = 95;
So I've tried different methods to determine which string is bigger, here is my result:
In Microsoft Office Excel Sorting command:
";P" < "-_-"
C++ std::string::compare(string &str2), i.e. str1.compare(str2)
";P" > "-_-"
C# string.CompareTo(), i.e. str1.CompareTo(str2)
";P" < "-_-"
C# string.CompareOrdinal(), i.e. CompareOrdinal(w1, w2)
";P" > "-_-"
As shown, the results vary! Intuitively, I would expect the result of Methods 2 and 4, since ASCII(';') = 59, which is larger than ASCII('-') = 45.
So I have no idea why Excel and C# string.CompareTo() give the opposite answer. Note that in C# the second comparison function is named string.CompareOrdinal(). Does this imply that the default C# string.CompareTo() function is not "ordinal"?
Could anyone explain this inconsistency?
And could anyone explain why, with CultureInfo = {en-US}, it says ;P > -_-? What is the underlying motivation or principle? I have also heard that double multiplication can behave differently under different CultureInfo settings. It's rather a culture shock!
std::string::compare: "the result of a character comparison depends only on its character code". It's simply ordinal.
String.CompareTo: "performs a word (case-sensitive and culture-sensitive) comparison using the current culture". So this is not ordinal, since typical users don't expect things to be sorted in raw character-code order.
String::CompareOrdinal: Per the name, "performs a case-sensitive comparison using ordinal sort rules".
EDIT: CompareOptions has a hint: "For example, the hyphen ("-") might have a very small weight assigned to it so that "coop" and "co-op" appear next to each other in a sorted list."
Excel 2003 (and earlier) does a sort ignoring hyphens and apostrophes, so your sort really compares ; to _, which gives the result that you have. Here's a Microsoft Support link about it. Pretty sparse, but enough to get the point across.
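To see the two .NET behaviors side by side, here is a small sketch. CompareDemo and its method names are hypothetical wrappers, while string.CompareOrdinal and string.Compare with a CultureInfo are the framework calls. The sign of the culture-sensitive result depends on the culture data of the runtime and platform, so only the ordinal result is predictable from the character codes.

```csharp
using System;
using System.Globalization;

public static class CompareDemo
{
    // Compares by raw UTF-16 code units: ';' (59) vs '-' (45).
    public static int Ordinal(string a, string b)
    {
        return string.CompareOrdinal(a, b);
    }

    // Compares using the linguistic sort rules of the named culture,
    // where punctuation such as '-' may carry a very small weight.
    public static int Cultural(string a, string b, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return string.Compare(a, b, culture, CompareOptions.None);
    }
}
```

CompareDemo.Ordinal(";P", "-_-") is positive because 59 > 45, matching the std::string::compare result. CompareDemo.Cultural(";P", "-_-", "en-US") is the call that corresponds to CompareTo under an en-US current culture.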
