How to get the deleted and added text between two strings - C#

I have two strings that are generated dynamically. I want to compare them, strike through the text that was deleted or modified, and highlight the text that was newly added. One string is the old version and the other is the new version; depending on user input, the two are sometimes identical. I tried, but I can't get the output I need. Below is the C# code I tried.
Ex: string s1 = "Hello dear Alice and dear Bob."; string s2 = " Hello Alice and dear Bob Have a nice day.";
Needed output: Hello dear Alice and dear Bob Have a nice day.
Here "dear" is struck through and "Have a nice day" is highlighted. Please help me.
My code:
if(String.Equals(my_NString,my_String,StringComparison.OrdinalIgnoreCase))
{
sb.AppendLine("<div><p style='text-align:justify;'>"+my_NString+" </p></div>");
sb.AppendLine("<br/>");
}
else
{
sb.AppendLine("<p style='text-align:justify;border:3px;border-color:#FF0000;padding:1em;color:red;'>"+my_NString+" </p>");
sb.AppendLine("<br/>");
}
}

This is not so easy. There is no single way to compare strings like that; there are different strategies, and each has its ups and downs. It is a complicated task, so your best bet is to use an existing implementation of a diff algorithm, like this one:
https://github.com/kpdecker/jsdiff (sorry, it's JS)
PS (edited):
The example really depends on which library/engine you'd like to use. The one I'm most familiar with (and have used most often) would look like this:
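using System;
using System.Collections.Generic;
using DiffMatchPatch; // namespace used by the C# port of diff-match-patch
class Difference {
public static void Main(string[] args) {
diff_match_patch match = new diff_match_patch();
// Compute the character-level differences between the two strings.
List<Diff> diff = match.diff_main("Hello World.", "Goodbye World.");
for (int i = 0; i < diff.Count; i++) Console.WriteLine(diff[i]);
}
}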
The result would be:
[(-1, "Hell"), (1, "G"), (0, "o"), (1, "odbye"), (0, " World.")]
You could also use match.diff_cleanupSemantic(diff); before displaying, and then the result would be:
[(-1, "Hello"), (1, "Goodbye"), (0, " World.")]
So basically, use diff_cleanupSemantic to coarsen the differences from letter-level fragments into larger, more readable chunks (here, whole words).
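If pulling in a library is not an option, a word-level comparison can be sketched by hand. The following is only a minimal sketch, not the algorithm diff_match_patch uses: it splits both strings on spaces, computes a longest common subsequence of the words, and wraps removed words in <del> and added words in <ins>. The names WordDiff and ToHtml are made up for this example, and the choice of <del>/<ins> tags is an assumption about how you want to render the result.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class WordDiff
{
    // Returns HTML where words only in oldText are wrapped in <del>
    // and words only in newText are wrapped in <ins>.
    public static string ToHtml(string oldText, string newText)
    {
        string[] a = oldText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        string[] b = newText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // lcs[i, j] = length of the longest common subsequence of a[i..] and b[j..].
        int[,] lcs = new int[a.Length + 1, b.Length + 1];
        for (int i = a.Length - 1; i >= 0; i--)
            for (int j = b.Length - 1; j >= 0; j--)
                lcs[i, j] = a[i] == b[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        // Walk both word lists, emitting kept, deleted and inserted words.
        var parts = new List<string>();
        int x = 0, y = 0;
        while (x < a.Length || y < b.Length)
        {
            if (x < a.Length && y < b.Length && a[x] == b[y])
            {
                parts.Add(WebUtility.HtmlEncode(a[x]));
                x++; y++;
            }
            else if (y == b.Length || (x < a.Length && lcs[x + 1, y] >= lcs[x, y + 1]))
            {
                parts.Add("<del>" + WebUtility.HtmlEncode(a[x]) + "</del>");
                x++;
            }
            else
            {
                parts.Add("<ins>" + WebUtility.HtmlEncode(b[y]) + "</ins>");
                y++;
            }
        }
        return string.Join(" ", parts);
    }
}
```

For example, WordDiff.ToHtml("a b c", "a c d") gives "a <del>b</del> c <ins>d</ins>". Note that punctuation is part of the word here, so "Bob." and "Bob" count as different words.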

Related

Finding multiple semi predictable patterns in a string

Alright, so I'm writing an application that needs to be able to extract a VAT number from an invoice (https://en.wikipedia.org/wiki/VAT_identification_number).
The biggest challenge here is that, as is apparent from the Wikipedia article I linked to, each country uses its own format for these VAT numbers (the Netherlands uses a 14-character number while Germany uses an 11-character number).
In order to extract these numbers, I throw every line from the invoice into an array of strings, and for each string I test if it has a length that is equal to one of the VAT formats, and if that checks out, I check if said string also contains a country code ("NL", "DE", etc).
string[] ProcessedFile = Reader.ProcessFile(Input);
foreach(string S in ProcessedFile)
{
RtBEditor.AppendText(S + "\n");
}
foreach(string X in ProcessedFile)
{
string S = X.Replace(" ", string.Empty);
if (S.Length == 7)
{
if (S.Contains("GBGD"))
{
MessageBox.Show("Country = Great Britain (Government)");
}
}
/*
repeat for all other lengths and country codes.
*/
This code has two problems.
1st: if a string happens to have the same length as one of the VAT formats and has a country code embedded in it, the code will incorrectly conclude that it has found the VAT number.
2nd: in some cases the VAT number is written as "VAT-number: [VAT-number]". The text that precedes the actual number then adds to the string's length, so the program cannot detect the actual VAT number.
I assume the best fix is to somehow isolate the VAT number from the strings altogether, but I have yet to find a way to do this.
Does anyone know of a potential solution?
Many thanks in advance!
EDIT:
Added a dummy invoice to clarify what kind of data is contained within the invoices.
As someone in the comments had pointed out, the best way to fix this is by using Regex. After trying around a bit I came to the following solution:
public Regex FilterNormaal = new Regex(@"[A-Z]{2}(\d)+B?\d*");
private void BtnUitlezen_Click(object sender, EventArgs e)
{
RtBEditor.Clear();
/*
Temp dummy vatcodes for initial testing.
*/
Form1.Dummy1.VAT = "NL855291886B01";
Form1.Dummy2.VAT = "DE483270846";
Form1.Dummy3.VAT = "SE482167803501";
OCR Reader = new OCR();
/*
Grab and process image
*/
if(openFileDialog1.ShowDialog() == DialogResult.OK)
{
try
{
Input = new Bitmap(openFileDialog1.FileName);
}
catch
{
MessageBox.Show("Please open an image file.");
}
}
string[] ProcessedFile = Reader.ProcessFile(Input);
foreach(string S in ProcessedFile)
{
string X = S.Replace(" ", string.Empty);
RtBEditor.AppendText(X + "\n");
}
foreach (Match M in FilterNormaal.Matches(RtBEditor.Text))
{
MessageBox.Show(M.Value);
}
}
At first, I attempted to iterate through my array of strings to find a match, but for reasons unknown, this did not yield any results. When applying the regex to the entire textbox, it did output the results I needed.
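For what it's worth, matching line by line can also work if the pattern is anchored on word boundaries instead of relying on the length of the whole line. Below is a rough sketch under stated assumptions: the class VatExtractor and its pattern are made up for illustration, and the pattern only covers formats shaped like the three dummy numbers above (two capital letters, 8 to 12 digits, and an optional Dutch-style "B" plus two digits). Real VAT formats vary more than this, so treat it as a starting point only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VatExtractor
{
    // Simplified pattern (an assumption, not a complete VAT grammar):
    // two capital letters, 8-12 digits, optionally 'B' and two more digits.
    private static readonly Regex VatPattern =
        new Regex(@"\b[A-Z]{2}\d{8,12}(?:B\d{2})?\b");

    // Scans every line and collects each substring that looks like a VAT number,
    // regardless of any text before or after it on the same line.
    public static List<string> Extract(IEnumerable<string> lines)
    {
        var found = new List<string>();
        foreach (string line in lines)
            foreach (Match m in VatPattern.Matches(line))
                found.Add(m.Value);
        return found;
    }
}
```

With the lines "Invoice 2024-001", "VAT-number: NL855291886B01" and "DE483270846", Extract returns the two VAT numbers and ignores the rest, so the "VAT-number:" prefix no longer gets in the way.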

equivalent number to find exact match between two sets

I am writing a program where I have a set of digits, 123456789, and a set of letters, ABCDEFGHI. When a user enters any number, the equivalent letters should show up in the result. Can someone guide me on how to approach this?
For example, a user entry of 1352 should result in ACEB.
Welcome. Your question is a bit too 'easy' to make a good question, and at the least you should show what you have tried.
But I will give it a shot.
I have written a simple method to solve your problem.
Sandbox to run this online
//Your code goes here
Console.WriteLine("Hello, world!");
//predefine your sets
var inputSet = new List<char> {'1','2','3','4','5','6','7','8','9','0'};
var outputSet = new List<char>{'A','B','C','D','E','F','G','H','I','J'};
//map each digit to the letter at the same position
Console.WriteLine(new string("1352".Select(x=>outputSet[inputSet.IndexOf(x)]).ToArray()));
Console.WriteLine(new string("199466856".Select(x=>outputSet[inputSet.IndexOf(x)]).ToArray()));
Console.WriteLine(new string("111222333444".Select(x=>outputSet[inputSet.IndexOf(x)]).ToArray()));
Result:
Hello, world!
ACEB
AIIDFFHEF
AAABBBCCCDDD
Edit:
How it works:
"1352".Select(x => ...) visits the characters of the string one by one, each bound to x.
inputSet.IndexOf(x) finds the position of x in inputSet.
outputSet[int] gets the value in outputSet at the position just found in inputSet.
new string(char array) instantiates a new string from the given char array.
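Since the letters A to I are consecutive, the lookup lists are not strictly needed; the same mapping can be done with character arithmetic. Here is a small sketch of that alternative. The class DigitLetters and method Map are names invented for this example, and rejecting anything outside 1-9 is my assumption, since the question only defines those nine digits.

```csharp
using System;
using System.Text;

public static class DigitLetters
{
    // Maps '1'..'9' to 'A'..'I' by offsetting from 'A'.
    public static string Map(string digits)
    {
        var sb = new StringBuilder(digits.Length);
        foreach (char c in digits)
        {
            // Assumption: anything outside 1-9 is an input error.
            if (c < '1' || c > '9')
                throw new ArgumentException("Only digits 1-9 are supported, got '" + c + "'.");
            sb.Append((char)('A' + (c - '1')));
        }
        return sb.ToString();
    }
}
```

DigitLetters.Map("1352") returns "ACEB", the same as the list-based version, but invalid input now fails loudly instead of throwing an index error from outputSet[-1].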

How to get second value via c# StartsWith() method

string text = "Today is a good day for help. **David Diaz He went to school. **David Diaz like apple. ";
How to get how many times the text **David Diaz occurs in the string text?
UPDATED MY QUESTION
By using StartsWith you can check whether the string starts with **; if it does, take the first two words of the string, which represent the name.
string text = "**David Diaz He went to school.";
if (text.StartsWith("**"))
{
var names = text.Split(' ')
.Take(2)
.ToArray();
var fullName = names[0] + " " + names[1];
}
UPDATE
As you said in the comment, you want to count how many times David Diaz occurs in one string; you can use a regex for that.
string text = "Today is a good day for help. **David Diaz He went to school. **David Diaz like apple. ";
int matches = Regex.Matches(
text,
@"(?:\S+\s)?\S*David Diaz\S*(?:\s\S+)?",
RegexOptions.IgnoreCase
).Count;
var text = "Today is a good day for help. **David Diaz He went to school. **David Diaz like apple. ";
var pos = 0;
var num = 0;
var search = "**David Diaz";
while ((pos = text.IndexOf(search, pos)) > -1)
{
num ++;
pos += search.Length;
}
Console.WriteLine(num);
You can try this out in dotnetfiddle.
Updated Answer:
It sounds like you want to find the number of times a substring exists in your text. For that, you'll want to use Regex.Matches, as explained in this answer: https://stackoverflow.com/a/3016577/682840
or LINQ, as explained in this answer: https://stackoverflow.com/a/541994/682840
Original Answer:
.StartsWith returns true or false depending on whether the string begins with the search string you provide. If you want to know where a substring exists within your text, you'll need to use .IndexOf, or a regular expression for more advanced scenarios.
IndexOf will return the location in the text where your provided search string starts (or -1 if it isn't found).
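One thing to watch with the Regex.Matches route: the search text here starts with **, and * is a regex metacharacter, so the literal has to be escaped first. A minimal sketch of that (SubstringCounter and Count are names made up for this example; returning 0 for an empty search string is my own choice):

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstringCounter
{
    // Counts non-overlapping occurrences of a literal search string.
    public static int Count(string text, string search)
    {
        // Assumption: an empty search string counts as zero occurrences.
        if (string.IsNullOrEmpty(search)) return 0;

        // Regex.Escape turns "**David Diaz" into a pattern that matches it literally.
        return Regex.Matches(text, Regex.Escape(search)).Count;
    }
}
```

For the text in the question, SubstringCounter.Count(text, "**David Diaz") gives 2.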

how to get text after a certain comma on C#?

OK guys, I've got an issue that is driving me nuts. Let's say I've got a string like "aaa,bbb,ccc,ddd,eee,fff,ggg" (without the double quotes), and all I want is a substring of it, something like "ddd,eee,fff,ggg".
I should also say that there's a lot of data and not all the strings look the same, so I kind of need something generic.
Thank you!
One way, using Split with a limit:
string str = "aaa,bbb,ccc,ddd,eee,fff,ggg";
int skip = 3;
string result = str.Split(new[] { ',' }, skip + 1)[skip];
// = "ddd,eee,fff,ggg"
I would use stringToSplit.Split(',')
Update:
var startComma = 3;
// Keep the tokens from index startComma onward, i.e. everything after the third comma.
var value = string.Join(",", stringToSplit.Split(',').Where((token, index) => index >= startComma));
I'm not sure whether every item between the commas is 3 characters long. If they are, I would use choice 2; if the lengths vary, choice 1. A third choice would be like choice 2, but calling .IndexOf(",") several times to find the start position.
Two choices:
string yourString="aaa,bbb,ccc,ddd,eee,fff,ggg";
string[] partsOfString=yourString.Split(','); //Gives you an array where partsOfString[0] is "aaa" and partsOfString[1] is "bbb"
string trimmed=partsOfString[3]+","+partsOfString[4]+","+partsOfString[5]+","+partsOfString[6];
OR
//Gives "ddd,eee,fff,ggg"
string trimmed=yourString.Substring(12); //Starts at index 12 (the first 'd') and takes the rest of the string.
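The "third choice" mentioned above (calling IndexOf several times) might look roughly like this. It is just a sketch: CommaTail and AfterNthComma are hypothetical names, and returning an empty string when there are fewer than n commas is an assumption about the behaviour you want.

```csharp
using System;

public static class CommaTail
{
    // Returns everything after the n-th comma of s.
    public static string AfterNthComma(string s, int n)
    {
        int pos = -1;
        for (int i = 0; i < n; i++)
        {
            // Look for the next comma after the previous one.
            pos = s.IndexOf(',', pos + 1);
            // Assumption: fewer than n commas yields an empty result.
            if (pos < 0) return string.Empty;
        }
        return s.Substring(pos + 1);
    }
}
```

CommaTail.AfterNthComma("aaa,bbb,ccc,ddd,eee,fff,ggg", 3) returns "ddd,eee,fff,ggg", and it keeps working when the items have different lengths because no positions are hard-coded.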

Using wildcards or "tags" for user input

I've recently begun learning C# with no prior programming experience. I've been going through tutorials and I'm learning about "if" statements. I am trying to create a simple user feedback application that asks questions and responds to user input.
What I'd like to do is use some kind of keyword or tag system or a wildcard-type system (like the * in search queries) to allow a response to input that isn't exactly specific.
For example, in the following code, where I use the If and Else statements, is there a way to set the userValue equal to not just "good" or "bad" but to any number of variations on both those words, (e.g. "I'm quite good.") or have the If statement refer to a list of keywords or tags, (perhaps listed elsewhere, or in an external file) so that the user can input however they feel, and the predetermined program will pick up on the phrase.
I'm not looking to create some form of AI, only to make a fun responsive program. Would it be possible/sensible to use an array, and somehow invoke it as an If statement? Or is there another, more effective method, to provide feedback to user input? I'm sorry if this is too newbie-ish for this site. I've searched the internet, but the problem is, I don't exactly know what I'm searching for!
The code I have so far is as follows:
Console.WriteLine("Hello. How are you?");
string userValue = Console.ReadLine();
string message = "";
if (userValue == "good")
message = "That's good to hear. Do you want a cup of tea?";
else if (userValue == "bad")
message = "I'm sorry to hear that. Shall I make you a coffee?";
else
message = "I don't understand. Do you want a cuppa?";
Console.WriteLine(message);
string userValueTwo = Console.ReadLine();
string messageTwo = "";
if (userValueTwo == "yes")
messageTwo = "I'll get right onto it!";
else if (userValueTwo == "no")
messageTwo = "Right-o. Shutting down...";
Console.WriteLine(messageTwo);
Console.ReadLine();
You can use regular expressions here:
using System.Text.RegularExpressions;
...
// If the phrase starts with, ends with, or contains the word "good"
if (Regex.IsMatch(userValue, @"(^|\s)good(\s|$)", RegexOptions.IgnoreCase)) {
message = "That's good to hear. Do you want a cup of tea?";
}
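The same idea extends to a whole list of keywords. Here is a sketch (KeywordMatcher and ContainsAnyWord are invented names) that builds one pattern out of the list and uses \b word boundaries rather than whitespace, which is my substitution so that input like "I'm quite good." with trailing punctuation still matches, while "badminton" does not count as "bad".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordMatcher
{
    // Returns true if input contains any of the words as a whole word, ignoring case.
    public static bool ContainsAnyWord(string input, params string[] words)
    {
        // Builds a pattern like \b(?:good|well|sweet)\b from the keyword list.
        string alternatives = string.Join("|", words.Select(w => Regex.Escape(w)));
        string pattern = @"\b(?:" + alternatives + @")\b";
        return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
    }
}
```

Usage would be something like: if (KeywordMatcher.ContainsAnyWord(userValue, "good", "well", "sweet")) message = "That's good to hear. Do you want a cup of tea?";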
I hesitated to post this answer, since it uses LINQ which may confuse you given that you're only just learning. It's simpler than scary regular expressions though! You could do this using your own loops, but LINQ just saves you some code and makes it (arguably) more readable:
Console.WriteLine("Hello. How are you?");
string userValue = Console.ReadLine();
string message = "";
string[] goodWords = new string[] { "good", "well", "sweet", "brilliant"};
string[] badWords = new string[] { "terrible", "awful", "bad", "sucks"};
if (goodWords.Any(word => userValue.Contains(word)))
message = "That's good to hear. Do you want a cup of tea?";
else if (badWords.Any(word => userValue.Contains(word)))
message = "I'm sorry to hear that. Shall I make you a coffee?";
else
message = "I don't understand. Do you want a cuppa?";
Basically, the Any() function checks whether any word in the list matches some criterion. The criterion we use is whether the userValue string Contains() that word.
The funny looking => syntax is a lambda expression, just a quick way of writing an anonymous function. Again, something that may be a bit too confusing right now.
Here's a non-LINQ version which you may find easier to understand:
static void Main()
{
Console.WriteLine("Hello. How are you?");
string userValue = Console.ReadLine();
string message = "";
string[] goodWords = new string[] { "good", "well", "sweet", "brilliant"};
string[] badWords = new string[] { "terrible", "awful", "bad", "sucks"};
if (DoesStringContainAnyOf(userValue, goodWords))
message = "That's good to hear. Do you want a cup of tea?";
else if (DoesStringContainAnyOf(userValue, badWords))
message = "I'm sorry to hear that. Shall I make you a coffee?";
else
message = "I don't understand. Do you want a cuppa?";
Console.WriteLine(message);
}
// Returns true if searchIn contains at least one of wordsToFind.
static bool DoesStringContainAnyOf(string searchIn, string[] wordsToFind)
{
foreach(string word in wordsToFind)
if (searchIn.Contains(word))
return true;
return false;
}
Try using the "Contains" method. http://msdn.microsoft.com/en-us/library/dy85x1sa(v=vs.110).aspx
Example:
if (userValue.ToLower().Contains("good"))
message = "That's good to hear. Do you want a cup of tea?";
How about a simple Contains check?
if (userValue.ToLower().Contains("good"))
I also added ToLower() case conversion so that it would work regardless of the case.
If you wanted to implement a list of keywords and programs (functions), I would do it something like this:
var keywords = new Dictionary<string, System.Func<string>>() {
{ "good", () => "That's good to hear. Do you want a cup of tea?" },
{ "bad", () => "I'm sorry to hear that. Shall I make you a coffee?" }
};
foreach(var keyword in keywords)
{
if (userValue.ToLower().Contains(keyword.Key))
{
message = keyword.Value();
break;
}
}
In this case, I use C# lambda expressions to store the "list of programs" you wanted; it's a pretty powerful C# feature that lets you use code as data. Right now these functions just return constant string values, so they're a bit of overkill for your scenario.
All previous answers are valid and contain many things well worth learning. I want to draw special attention to the ToLower method, which will help you recognize both 'Good' and 'good'.
Also, to make the list more complete, you should know about the good old IndexOf method, which returns the position of a substring within a string, or -1 if it isn't contained there.
But I have a hunch that your question also targets the problem of how to write the Eliza code (which of course is the name of the game) in a more effective way. You surely want more than 2 questions..?
It is most fun if the cue words and the responses can easily be expanded without changing and compiling the program again.
The easiest way to do that is to put all the data in a text file; there are many ways to do that, but the easiest maintenance comes with a line-by-line format like this:
//The text file eliza.txt:
good=That's good to hear. Do you want a cup of tea?
bad=I'm sorry to hear that. Shall I make you a coffee?
father=Tell me more about your family!
mother=What would you do Daddy?
..
bye=Goodbye. See you soon..
This is read in with the File.ReadAllLines method. Add a using System.IO; to the top of your program if it isn't there already.
From there I suggest using a Dictionary<string, string>:
Dictionary<string, string> eliza = new Dictionary<string, string>();
var lines = File.ReadAllLines("D:\\eliza.txt");
foreach (string line in lines)
{
var parts = line.Split('=');
if (parts.Length==2) eliza.Add(parts[0].ToLower(), parts[1]);
}
Now you could create a loop until the user says 'bye' or nothing:
string userValue = Console.ReadLine().ToLower(); // lower-case the first input too, since the cues are stored in lower case
do
{
foreach (string cue in eliza.Keys)
if (userValue.IndexOf(cue) >= 0)
{ Console.WriteLine(eliza[cue]); break; }
userValue = Console.ReadLine().ToLower();
}
while (userValue != "" && userValue.IndexOf("bye") < 0);
Fun things to expand are lists of cue words, lists of responses, removing some responses after they have been used, or throwing in a few random responses.
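To make the parsing and lookup easier to test without a file or console, the two steps can be pulled into small functions. This is only a sketch of that idea: the class Eliza and the methods Parse and Respond are names I made up, the fallback parameter is an addition of mine, and splitting on the first '=' only (so that a response may itself contain '=') is a small change from the code above.

```csharp
using System;
using System.Collections.Generic;

public static class Eliza
{
    // Turns "cue=response" lines into a case-insensitive lookup table.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var rules = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            // Split on the first '=' only; lines without '=' are skipped.
            string[] parts = line.Split(new[] { '=' }, 2);
            if (parts.Length == 2 && parts[0].Trim().Length > 0)
                rules[parts[0].Trim()] = parts[1].Trim();
        }
        return rules;
    }

    // Returns the response of the first cue found in input, or fallback if none matches.
    public static string Respond(Dictionary<string, string> rules, string input, string fallback)
    {
        foreach (var rule in rules)
            if (input.IndexOf(rule.Key, StringComparison.OrdinalIgnoreCase) >= 0)
                return rule.Value;
        return fallback;
    }
}
```

In the real program you would call Eliza.Parse(File.ReadAllLines("D:\\eliza.txt")) once at startup and then Eliza.Respond(rules, userValue, "I don't understand.") inside the loop.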
