String replacement for Azure Translation notranslate - c#

I have been trying to get Azure Translator to translate text stored in a database column. Here are a few examples of how the text is currently stored:
eg1. "Add %%objectives%% from predefined sets of %%objectives%%"
eg2. %%Risk%%
eg3. some text here %%model%%. Please refresh the page.
My goal is to translate everything except the data between the %% markers. The problem is that Azure Translator only skips text that is wrapped in <div class="notranslate">...</div>, which means I have to replace every %% pair with that markup. I got this working when the string contains only one placeholder, but everything else seemed to go down a rabbit hole. Here is my code:
english = "Add %%objectives%% from predefined sets of %%objectives%%";
if (english.Contains("%%"))
{
Dictionary<int, int> positions = new Dictionary<int, int>(); // this is to hold the locations of where delims are in string
ArrayList l = new ArrayList();
char[] letters = english.ToCharArray();
// get the first location of %
for (int i = 0; i < english.Length; i++)
{
if (letters[i] == '%')
{
l.Add(i);
}
}
string temp = "";
// only works if there is exactly one %%placeholder%% in the string
if (l.Count == 4)
{
int loc = english.IndexOf('%'); //%%Model%% = 0
int lastloc = english.LastIndexOf('%');
temp = " <div class=\"notranslate\">" + english.Substring(loc + 2, (lastloc - 3) - loc) + "</div>";
var lang = Translate(convert(english, temp), "en", "it");
// need to convert back to %%
Console.WriteLine(lang);
dataNode.SelectSingleNode("value").InnerText = lang;
}
else if (l.Count > 4) // this means there is more than one delimited placeholder
{
foreach(int i in l) // 4 , 5 , 16 ,17, 43, 44
// % % text % % text % %
{
}
Any help is appreciated!

Do not replace the '%%'. Insert <div class="notranslate"> before every odd-numbered '%%' (the 1st, 3rd, and so on) and insert </div> after every even-numbered '%%'. This way, the translation retains the %% markup.
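A minimal sketch of that idea, assuming every placeholder is a non-empty, non-nested %%name%% token and that the translator returns the div wrappers unchanged. The names `NoTranslate`, `Protect` and `Unprotect` are made up for this example and are not part of any Azure API:

```csharp
using System;
using System.Text.RegularExpressions;

static class NoTranslate
{
    // Hypothetical helper: wraps every %%name%% token in a notranslate div
    // while leaving the %% markers themselves in place.
    public static string Protect(string text) =>
        Regex.Replace(text, "%%.+?%%",
            m => "<div class=\"notranslate\">" + m.Value + "</div>");

    // Hypothetical inverse: removes the wrappers again after translation,
    // keeping only what was inside them.
    public static string Unprotect(string text) =>
        Regex.Replace(text, "<div class=\"notranslate\">(.*?)</div>", "$1");
}
```

With the question's code this would presumably be used as `NoTranslate.Unprotect(Translate(NoTranslate.Protect(english), "en", "it"))`, where `Translate` is the asker's own method.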

If you have the option to translate the string after the variables have been replaced with real, human-readable values, you will get a better translation out of it.

Related

How can I add a space between every 3 characters counting from the right to the left in a string in C#?

I want to add space between every 3 characters in a string in C#, but count from right to left.
For example :
11222333 -> 11 222 333
Answer by @Jimi from comments (will delete if they post their own)
var YourString = "11222333";
var sb = new StringBuilder(YourString);
for (int i = sb.Length - 3; i > 0; i -= 3) // i > 0 avoids a leading space when the length is a multiple of 3
sb.Insert(i, ' ');
return sb.ToString();
The benefit of this algorithm appears to be that, because you work backwards through the string, each Insert only shifts the characters to the right of the insertion point rather than the whole string.
If you are trying to format a string as a number according to some locale conventions, you can use the NumberFormatInfo class to specify how the number should be formatted as a string.
So for example
string input = "11222333";
NumberFormatInfo currentFormat = new NumberFormatInfo();
currentFormat.NumberGroupSeparator = " ";
if(Int32.TryParse(input, NumberStyles.None, currentFormat, out int result))
{
string output = result.ToString("N0", currentFormat);
Console.WriteLine(output); // 11 222 333
}
The following recursive function would do the job:
string space3(string s)
{
int len3 = s.Length - 3;
return (len3 <= 0) ? s
: (space3(s.Substring(0, len3)) + " " + s.Substring(len3));
}
C# 8.0 introduced indices and ranges, which also work on strings. Ranges allow for a more compact form:
string space3(string s)
{
return (s.Length <= 3) ? s
: (space3(s[..^3]) + " " + s[^3..]);
}
Using Regex.Replace:
string input = "11222333";
string result = Regex.Replace( input, @"\d{3}", @" $0", RegexOptions.RightToLeft );
Demo and detailed explanation of RegEx pattern at regex101.
tl;dr: Match groups of 3 digits from right to left and replace them by space + the 3 digits.
The most efficient algorithm I can come up with is the following:
var sb = new StringBuilder(YourString.Length + YourString.Length / 3 + 1);
var head = YourString.Length % 3; // length of the short leading group (0, 1 or 2)
if (head > 0)
{
sb.Append(YourString, 0, head);
}
for (var i = head; i < YourString.Length; i += 3)
{
if (sb.Length > 0)
sb.Append(' '); // separator goes before each group except the first, so there is no trailing space
sb.Append(YourString, i, 3);
}
return sb.ToString();
We first allocate a StringBuilder with enough capacity for the result.
Then we check whether we need to append the first one or two characters. Then we loop over the rest in groups of three.
dotnetfiddle

Missing characters after string concatenate

I am having a problem where the character at a given position (e.g. 39) is replaced by the text I want to insert. What I want is to insert the text at position 39 without replacing that character. Could anyone please guide me on this?
string description = variables[1]["value"].ToString();// where I get the text
int nInterval = 39;// for every 39 characters in the text I would have a newline
string res = String.Concat(description.Select((c, z) => z > 0 && (z % nInterval) == 0 ? Environment.NewLine +"Hello"+(((z/ nInterval)*18)+83).ToString()+"world": c.ToString()));
file_lines = file_lines.Replace("<<<terms_conditions>>>",resterms); //file_lines is where I read the text file
Original text
Present this redemption slip to receive: One
After String.Concat
Present this redemption slip to receive\r\n\u001bHello101world
One //: is gone
I also have an issue where I want to insert a new line wherever the text contains *. If anybody is able to help, that would be great.
Edit:
What I want to achieve is something like this
Input
*Item is considered paid if unsealed.*No replacement or compensation will be given for any expired coupons.
So I need to insert a newline after every 39 characters and also before every *, so that it becomes:
Output
*Item is considered paid if unsealed.
*No replacement or compensation will be
given for any expired coupons.
Try String.Insert(Int32, String) Method
Insert \n where you need new line.
If I understood your question properly, you want a newline after every 39 characters. You can use string.Insert(Int32, String) method for that.
And use String.Replace(String, String) for your * problem.
The code snippet below does that using a simple for loop (it needs using System.Linq).
string sampleStr = "Lorem Ipsum* is simply..";
// step past each inserted newline so that every line keeps 39 characters
for (int i = 39; i < sampleStr.Length; i += 39 + Environment.NewLine.Length){
sampleStr = sampleStr.Insert(i, Environment.NewLine);
}
//sampleStr = sampleStr.Replace("*", Environment.NewLine);
int[] indexes = Enumerable.Range(0, sampleStr.Length).Where(x => sampleStr[x] == '*').ToArray();
// insert from the last index to the first so that the earlier positions stay valid
for (int i = indexes.Length - 1; i >= 0; i--)
{
int position = indexes[i];
if (position > 0) sampleStr = sampleStr.Insert(position, Environment.NewLine);
}
If you want to do both together
int[] indexes = Enumerable.Range(0, sampleStr.Length).Where(x => sampleStr[x] == '*' || x % 39 == 0).ToArray();
int j = 0;
foreach (var position in indexes)
{
if (position > 0)
{
sampleStr = sampleStr.Insert(position + j, Environment.NewLine);
j += Environment.NewLine.Length; // shift later positions by the length of the inserted newline
}
}
Without debating the method chosen to achieve the desired result, the problem with the code is that at the 39th character it adds some text, but the character itself has been forgotten.
Changing the following line should give the expected output.
string res = String.Concat(description.Select((c, z) => z > 0 && (z % nInterval) == 0 ? Environment.NewLine + "Hello" + (((z / nInterval) * 18) + 83).ToString() + "world" + c.ToString() : c.ToString()));
<== UPDATED ANSWER BASED ON CLARIFICATION IN QUESTION ==>
This will do what you want, I believe. See comments in line.
var description = "*Item is considered paid if unsealed.*No replacement or compensation will be given for any expired coupons.";
var nInterval = 39; // for every 39 characters in the text I would have a newline
var newline = "\r\n"; // for clarity in the Linq statement. Can be set to Environment.NewLine if desired.
var z = 0; // we'll handle the count manually.
var res = string.Concat(
description.Select(
(c) => (++z == nInterval || c == '*') // increment z and check if we've hit the boundary OR if we've hit a *
&& ((z = 0)==0) // resetting the count - this only happens if the first condition was true
? newline + (c == ' ' ? string.Empty : c.ToString()) // if the first character of a newline is a space, we don't need it
: c.ToString()
));
Output:
*Item is considered paid if unsealed.
*No replacement or compensation will be
given for any expired coupons.

Adding 'space' in C# textbox

Hi guys, so I need to add a 'space' between each character in my displayed text box.
I am giving the user a masked word like this He__o for him to guess and I want to convert this to H e _ _ o
I am using the following code to randomly replace characters with '_'
char[] partialWord = word.ToCharArray();
int numberOfCharsToHide = word.Length / 2; //divide word length by 2 to get chars to hide
Random randomNumberGenerator = new Random(); //generate rand number
HashSet<int> maskedIndices = new HashSet<int>(); //This is to make sure that I select unique indices to hide. Hashset helps in achieving this
for (int i = 0; i < numberOfCharsToHide; i++) //counter until it reaches words to hide
{
int rIndex = randomNumberGenerator.Next(0, word.Length); //init rindex
while (!maskedIndices.Add(rIndex))
{
rIndex = randomNumberGenerator.Next(0, word.Length); //This is to make sure that I select unique indices to hide. Hashset helps in achieving this
}
partialWord[rIndex] = '_'; //replace with _
}
return new string(partialWord);
I have tried partialWord[rIndex] = '_ '; however, this gives the error "Too many characters in literal".
I have tried partialWord[rIndex] = "_ "; however, this returns the error "Cannot convert type string to char".
Any idea how I can proceed to achieve a space between each character?
Thanks
The following code should do what you ask. I think it is fairly self-explanatory, but feel free to ask if anything about the why or how of the code is unclear.
// char[] partialWord is used from question code
char[] result = new char[(partialWord.Length * 2) - 1];
for(int i = 0; i < result.Length; i++)
{
result[i] = i % 2 == 0 ? partialWord[i / 2] : ' ';
}
return new string(result);
Since the resulting string is longer than the original string, you can't use only one char array because its length is constant.
Here's a solution with StringBuilder:
var builder = new StringBuilder(word);
for (int i = 0 ; i < word.Length ; i++) {
builder.Insert(i * 2, " ");
}
return builder.ToString().TrimStart(' '); // TrimStart is called here to remove the leading whitespace. If you want to keep it, delete the call.
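Another option, shown here only as a sketch, is to let string.Join place the separator between the characters. `Spacing` and `SpaceOut` are hypothetical names chosen for this example; the input is assumed to be the already masked word from the question:

```csharp
using System;
using System.Linq;

static class Spacing
{
    // Hypothetical helper: puts exactly one space between neighbouring
    // characters, with no leading or trailing space.
    public static string SpaceOut(string masked) =>
        string.Join(" ", masked.Select(c => c.ToString()));
}
```

In the question's method this would replace the last line with something like `return Spacing.SpaceOut(new string(partialWord));`.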

Need help finding n amount of Excel Ranges

So I have this situation:
At work I need to make an Excel AddIn which can collect some data from user surveys and show them in a neat little Excel Report. I have the format down however I have trouble figuring out how I find the Excel Ranges needed to showcase the questions that were asked in the survey.
Every question needs to take up three cells each since there are three stats associated with each and that's fine until you reach Z and have to start over with AA, AB, AC, etc. I can't quite wrap my head around it and I feel my current solution is being needlessly complicated. I know that right now there are 13 questions. That's 39 cells I need for the questions total but that could change in the future, or I might have to find smaller reports than all of the 13 questions. I need to make sure my algorithm can take care of both scenarios.
Currently I have this:
const String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int alphabetCounter = 0;
int alphabetIndex = 1;
for (int i = 0; i < dict["questions"].Length; i++)
{
String start = "";
String end = "";
if ((alphabetIndex + 1) > ALPHABET.Length)
{
alphabetCounter++;
alphabetIndex = 0;
start += ALPHABET[alphabetCounter - 1] + ALPHABET[alphabetIndex];
}
else
{
start += ALPHABET[alphabetIndex];
alphabetIndex++;
}
if ((alphabetIndex + 1) > ALPHABET.Length)
{
alphabetCounter++;
alphabetIndex = 0;
end += ALPHABET[alphabetIndex];
}
else
{
alphabetIndex++;
end += ALPHABET[alphabetIndex];
}
Excel.Range range = sheet.get_Range(start + "7", end + "7");
questionRanges.Add(range);
}
It's not finished because I ran into a wall here. So just to explain:
ALPHABET is just that. The alphabet. I use that to get the cell letters.
AlphabetCounter is how many times I have gone through the alphabet so in the event that I need to add an extra letter in front of my cells letter (Like the A in AB) I can get that from the ALPHABET string
AlphabetIndex is where in the alphabet I currently am.
I hope you can help me.
How would I go about getting all the ranges I need to accompany the n amount of questions I can get details about?
The trivial solution would be to change
const string ALPHABET = "ABC..."
to
static readonly string[] ColumnNames = { "A", "B", "C", ..., "Z", "AA".. } // an array cannot be declared const in C#
But this doesn't scale well. Think about what happens when you need to add a column. You'd have to add another item in the array, and eventually you'd have 26^2 array entries. Certainly not ideal.
A better solution would be to treat the zero-based column index as a kind of base 26 number (0 is A, 25 is Z, 26 is AA) and convert it using a function like the following:
string GetColumnName(int index)
{
List<char> chars = new List<char>();
while (index >= 0)
{
int current = index % 26;
chars.Add((char)('A' + current));
index = (int)((index - current) / 26) - 1;
}
chars.Reverse();
return new string(chars.ToArray());
}
The function here converts the base by repeatedly calculating the remainder (also known as modulus or %).
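As a rough sketch of how this conversion could produce the ranges for n questions: the code below repeats the same index-to-name logic and builds one address per question. `QuestionColumns` and `GetQuestionAddresses` are hypothetical names, and the defaults (row 7, three columns per question, first question starting in column B) are assumptions taken from the question's code:

```csharp
using System;
using System.Collections.Generic;

static class QuestionColumns
{
    // Same conversion as above: zero-based index to column name (0 -> A, 26 -> AA).
    public static string GetColumnName(int index)
    {
        string name = "";
        while (index >= 0)
        {
            name = (char)('A' + index % 26) + name; // prepend the current letter
            index = index / 26 - 1;                 // move on to the next, more significant letter
        }
        return name;
    }

    // Hypothetical helper: one "start:end" address per question.
    // firstColumn = 1 assumes the questions start in column B.
    public static List<string> GetQuestionAddresses(
        int questionCount, int row = 7, int firstColumn = 1, int width = 3)
    {
        var addresses = new List<string>();
        for (int q = 0; q < questionCount; q++)
        {
            int start = firstColumn + q * width;
            int end = start + width - 1;
            addresses.Add(GetColumnName(start) + row + ":" + GetColumnName(end) + row);
        }
        return addresses;
    }
}
```

The two halves of each address correspond to the start and end arguments of the sheet.get_Range call in the question, so the same loop works for 13 questions or any other count.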
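Just another idea for an implementation, which may be useful (it needs using System.Linq):
...
List<char> start;
List<char> end = new List<char>(); // letters are stored least significant first
start = new List<char>(Increment(end)); // copy, so the next two increments of end leave start unchanged
Increment(end);
Increment(end);
// reverse before building the name, because position 0 holds the last letter
Excel.Range range = sheet.get_Range(new String(Enumerable.Reverse(start).ToArray()) + "7",
new String(Enumerable.Reverse(end).ToArray()) + "7");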
}
private List<char> Increment(List<char> listColumn, int position=0)
{
if (listColumn.Count > position)
{
listColumn[position]++;
if (listColumn[position] == '[') // '[' is the character right after 'Z', so this letter has overflowed
{
listColumn[position] = 'A';
Increment(listColumn, ++position);
}
}
else
{
listColumn.Add('A');
}
return listColumn;
}

What is the most efficient way to detect if a string contains a number of consecutive duplicate characters in C#?

For example, a user entered "I love this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
the consecutive duplicate exclamation mark "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" should be detected.
The following regular expression would detect repeating chars. You could up the number or limit this to specific characters to make it more robust.
int threshold = 3;
string stringToMatch = "thisstringrepeatsss";
string pattern = "(.)\\1{" + (threshold - 1) + ",}"; // any character followed by at least threshold - 1 copies of itself
Regex r = new Regex(pattern);
Match m = r.Match(stringToMatch);
while(m.Success)
{
Console.WriteLine("character passes threshold " + m.ToString());
m = m.NextMatch();
}
Here's an example of a function that searches for a sequence of consecutive chars of a specified length and also ignores white space characters:
public static bool HasConsecutiveChars(string source, int sequenceLength)
{
if (string.IsNullOrEmpty(source))
return false;
if (source.Length == 1)
return false;
int charCount = 1;
for (int i = 0; i < source.Length - 1; i++)
{
char c = source[i];
if (Char.IsWhiteSpace(c))
continue;
if (c == source[i+1])
{
charCount++;
if (charCount >= sequenceLength)
return true;
}
else
charCount = 1;
}
return false;
}
Edit fixed range bug :/
Can be done in O(n) easily: for each character, if the previous character is the same as the current, increment a temporary count. If it's different, reset your temporary count. At each step, update your global if needed.
For abbccc you get:
a => temp = 1, global = 1
b => temp = 1, global = 1
b => temp = 2, global = 2
c => temp = 1, global = 2
c => temp = 2, global = 2
c => temp = 3, global = 3
=> c appears three times. Extend it to get the position, then you should be able to print the "ccc" substring.
You can extend this to give you the starting position fairly easily, I'll leave that to you.
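One way the extension with the starting position might look, as a sketch only. `Runs` and `LongestRun` are hypothetical names; the method returns (-1, 0) for a null or empty string, which is a choice made for this example:

```csharp
using System;

static class Runs
{
    // Hypothetical helper: start index and length of the longest run of
    // identical consecutive characters, found in a single O(n) pass.
    public static (int Start, int Length) LongestRun(string s)
    {
        if (string.IsNullOrEmpty(s)) return (-1, 0);

        int bestStart = 0, bestLength = 1; // the "global" values
        int currentStart = 0;              // where the current run began
        for (int i = 1; i < s.Length; i++)
        {
            if (s[i] != s[i - 1]) currentStart = i; // run broken: reset
            int currentLength = i - currentStart + 1; // the "temp" count
            if (currentLength > bestLength)
            {
                bestLength = currentLength;
                bestStart = currentStart;
            }
        }
        return (bestStart, bestLength);
    }
}
```

For "abbccc" this gives a run starting at index 3 with length 3, so `s.Substring(run.Start, run.Length)` yields the "ccc" substring. Comparing `run.Length` against a threshold then answers the original question.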
Here is a quick solution I crafted with some extra duplicates thrown in for good measure. As others pointed out in the comments, some duplicates are going to be completely legitimate, so you may want to narrow your criteria to punctuation instead of mere characters.
string input = "I loove this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!aa";
int index = -1;
int count =1;
List<string> dupes = new List<string>();
for (int i = 0; i < input.Length-1; i++)
{
if (input[i] == input[i + 1])
{
if (index == -1)
index = i;
count++;
}
else if (index > -1)
{
dupes.Add(input.Substring(index, count));
index = -1;
count = 1;
}
}
if (index > -1)
{
dupes.Add(input.Substring(index, count));
}
A better way, in my opinion, is to create an array in which each element counts one pair of identical adjacent characters: the first element for 'aa', the second for 'bb', then 'cc', 'dd', and so on. Every element starts at 0.
To solve the problem, loop over the string once and update the array values.
You can then analyze this array for whatever you want.
Example: for the string bbaaaccccdab, the result array would start with { 2, 1, 3 } (all remaining elements stay 0), because 'aa' occurs 2 times, 'bb' occurs once (at the start of the string), and 'cc' occurs three times.
Why is 'cc' counted three times? Because overlapping pairs are counted: 'cc'cc, c'cc'c and cc'cc'.
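A small sketch of this counting idea. Note that it swaps the fixed array for a Dictionary keyed by character, so it is not limited to the letters a to z; `PairCounts` and `Count` are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

static class PairCounts
{
    // Hypothetical helper: for each character, counts how many times it is
    // immediately followed by itself (overlapping pairs are counted).
    public static Dictionary<char, int> Count(string s)
    {
        var counts = new Dictionary<char, int>();
        for (int i = 1; i < s.Length; i++)
        {
            if (s[i] == s[i - 1])
            {
                counts.TryGetValue(s[i], out int n); // n is 0 if the key is missing
                counts[s[i]] = n + 1;
            }
        }
        return counts;
    }
}
```

For bbaaaccccdab the dictionary contains a = 2, b = 1 and c = 3, matching the example above; characters that never repeat, such as d, simply have no entry.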
Use LINQ! (For everything, not just this)
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1)));
// returns the characters 'a' and 'b': each of these items has the same letter directly before it
OR
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1))).Any();
// returns true
