What's the difference between Microsoft.JScript.GlobalObject.escape and Uri.EscapeUriString - C#

The strings my service receives from Uri.EscapeUriString and Microsoft.JScript.GlobalObject.escape are different; when I use Microsoft.JScript.GlobalObject.escape to encode the URL, it works.
What is the difference between Microsoft.JScript.GlobalObject.escape and Uri.EscapeUriString in C#?

Although Uri.EscapeUriString is available in C# out of the box, it does not convert all characters the same way the JavaScript escape function does.
For example, let's say the original string is: "Some String's /Hello".
Uri.EscapeUriString("Some String's /Hello")
output:
"Some%20String's%20/Hello"
Microsoft.JScript.GlobalObject.escape("Some String's /Hello")
output:
"Some%20String%27s%20/Hello"
Note that Uri.EscapeUriString did not escape the '.
That being said, let's look at a more extreme example. Suppose we have this string "& / \ # , + ( ) $ ~ % .. ' " : * ? < > { }". Let's see what escaping this with both methods gives us.
Microsoft.JScript.GlobalObject.escape("& / \\ # , + ( ) $ ~ % .. ' \" : * ? < > { }")
output: "%26%20/%20%5C%20%23%20%2C%20+%20%28%20%29%20%24%20%7E%20%25%20..%20%27%20%22%20%3A%20*%20%3F%20%3C%20%3E%20%7B%20%7D"
Uri.EscapeUriString("& / \\ # , + ( ) $ ~ % .. ' \" : * ? < > { }")
output: "&%20/%20%5C%20#%20,%20+%20(%20)%20$%20~%20%25%20..%20'%20%22%20:%20*%20?%20%3C%20%3E%20%7B%20%7D"
Notice that Microsoft.JScript.GlobalObject.escape escaped every character except +, /, * and ., even those that are valid in a URI. For example, ? and & were escaped even though they are valid in a query string.
So it all depends on where and when you escape your URI and what type of URI you are creating.
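If referencing Microsoft.JScript is not an option, here is a minimal sketch of the legacy JavaScript escape algorithm as the ECMAScript specification describes it: letters, digits and @*_+-./ are kept, other code units below 256 become %XX, and everything else becomes %uXXXX. The names JsEscapeSketch and Escape are made up for illustration, and it is an assumption that GlobalObject.escape follows exactly these rules in every case; the sketch does reproduce the outputs shown above for both sample strings.

```csharp
using System;
using System.Text;

public static class JsEscapeSketch
{
    // Characters the legacy JavaScript escape() leaves untouched.
    private const string Unescaped =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./";

    // Hypothetical helper: mimics escape() one UTF-16 code unit at a time.
    public static string Escape(string input)
    {
        var sb = new StringBuilder(input.Length * 3);
        foreach (char c in input)
        {
            if (Unescaped.IndexOf(c) >= 0)
                sb.Append(c);                                     // kept as-is
            else if (c < 256)
                sb.Append('%').Append(((int)c).ToString("X2"));   // %XX
            else
                sb.Append("%u").Append(((int)c).ToString("X4"));  // %uXXXX
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Escape("Some String's /Hello"));
        // prints Some%20String%27s%20/Hello
    }
}
```

Because the %uXXXX form is not valid percent-encoding, this is only worth using when the receiving side expects JavaScript-style escaping.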

Related

Regex to replace a symbol & within quotes C#

I'm trying to replace '&' inside quotes.
Input
"I & my friends are stuck here", & we can't resolve
Output
"I and my friends are stuck here", & we can't resolve
Replace '&' by 'and' and only inside quotes, could you please help?
By far the quickest way is to use the \G construct and do it with a single regex.
C# code
var str =
    "\"I & my friends are stuck here & we can't get up\", & we can't resolve\n" +
    "=> \"I and my friends are stuck here and we can't get up\", & we can't resolve\n";
var rx = @"((?:""(?=[^""]*"")|(?<!""|^)\G)[^""&]*)(?:(&)|(""))";
var res = Regex.Replace(str, rx, m =>
    // Replace the ampersands inside double quotes with 'and'
    m.Groups[1].Value + (m.Groups[2].Value.Length > 0 ? "and" : m.Groups[3].Value));
Console.WriteLine(res);
Output
"I and my friends are stuck here and we can't get up", & we can't resolve
=> "I and my friends are stuck here and we can't get up", & we can't resolve
Regex ((?:"(?=[^"]*")|(?<!"|^)\G)[^"&]*)(?:(&)|("))
https://regex101.com/r/db8VkQ/1
Explained
(                          # (1 start), Preamble
   (?:                     # Block
      "                    # Begin of quote
      (?= [^"]* " )        # One-time check for close quote
    |                      # or,
      (?<! " | ^ )         # If not a quote behind or BOS
      \G                   # Start match where last left off
   )
   [^"&]*                  # Many non-quote, non-ampersand
)                          # (1 end)
(?:                        # Body
   ( & )                   # (2), Ampersand, replace with 'and'
 |                         # or,
   ( " )                   # (3), End of quote, just put back "
)
Benchmark
Regex1: ((?:"(?=[^"]*")|(?<!"|^)\G)[^"&]*)(?:(&)|("))
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 10
Elapsed Time: 2.21 s, 2209.03 ms, 2209035 µs
Matches per sec: 226,343
Use
Regex.Replace(s, "\"[^\"]*\"", m => Regex.Replace(m.Value, @"\B&\B", "and"))
See the C# demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var s = "\"I & my friends are stuck here\", & we can't resolve";
        Console.WriteLine(
            // Match each quoted span, then replace the standalone & inside it
            Regex.Replace(s, "\"[^\"]*\"", m => Regex.Replace(m.Value, @"\B&\B", "and"))
        );
    }
}
Output: "I and my friends are stuck here", & we can't resolve

Matching balanced parentheses before a character

I need to match a string within balanced parentheses before a literal period in c#. My regex with balanced groups works except when there are extra open parens in the string. According to my understanding, this requires a conditional fail pattern to ensure the stack is empty on match, yet something is not quite right.
Original regex:
@"(?<Par>[(]).+(?<-Par>[)])\."
With fail-pattern:
@"(?<Par>[(]).+(?<-Par>[)])(?(Par)(?!))\."
Test-code (last 2 fail):
// Pairs of (input, expected match)
string[] tests = {
    "a.c", "",
    "a).c", "",
    "(a.c", "",
    "a(a).c", "(a).",
    "a(a b).c", "(a b).",
    "a((a b)).c", "((a b)).",
    "a(((a b))).c", "(((a b))).",
    "a((a) (b)).c", "((a) (b)).",
    "a((a)(b)).c", "((a)(b)).",
    "a((ab)).c", "((ab)).",
    "a)((ab)).(c", "((ab)).",
    "a(((a b)).c", "((a b)).",
    "a(((a b)).)c", "((a b))."
};
Regex re = new Regex(@"(?<Par>[(]).+(?<-Par>[)])(?(Par)(?!))\.");
for (int i = 0; i < tests.Length; i += 2)
{
    var result = re.Match(tests[i]).Groups[0].Value;
    if (result != tests[i + 1]) throw new Exception
        ("Expecting: " + tests[i + 1] + ", got " + result);
}
You may use a well-known regex to match balanced parentheses and just append a \. to it:
\((?>[^()]+|(?<o>)\(|(?<-o>)\))*(?(o)(?!))\)\.
|---------- balanced parens part ----------|.|
See the regex demo.
Details
\( - a (
(?> - start of an atomic group
[^()]+ - 1 or more chars other than ( and )
| - or
(?<o>)\( - an opening ( is pushed on to the Group o stack
| - or
(?<-o>)\) - a closing ) pops one entry off the Group o stack
)* - 0 or more repetitions of the atomic group
(?(o)(?!)) - fail the match if Group o stack is not empty
\) - a )
\. - a dot.
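To check the pattern above against some of the inputs from the question, a small harness along these lines can be used. This is only a sketch: the class name BalancedParensSketch and the helper FirstMatch are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedParensSketch
{
    // Balanced-parentheses pattern from the answer, followed by a literal dot.
    private static readonly Regex Balanced = new Regex(
        @"\((?>[^()]+|(?<o>)\(|(?<-o>)\))*(?(o)(?!))\)\.");

    // Hypothetical helper: returns the first match, or "" when nothing matches.
    public static string FirstMatch(string input) => Balanced.Match(input).Value;

    public static void Main()
    {
        string[] inputs = { "a.c", "a(a).c", "a((a) (b)).c", "a(((a b)).c", "a(((a b)).)c" };
        foreach (var input in inputs)
            Console.WriteLine("{0} -> \"{1}\"", input, FirstMatch(input));
    }
}
```

The last two inputs are the ones reported as failing in the question; here the engine skips the unmatched leading ( and starts the match at the first ( that can be balanced.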

How to match string that contains ^ in regular expression?

I tried to build a regular expression using an online tool but did not succeed. Here is the string I need to match:
27R4FF^27R4FF Text until end
always starts with alphanumeric (case-insensitive)
then always caret sign ^ (no space before & after)
then alphanumeric string
then always one white space
then string until end.
Here is the regular expression that is not working for me:-
((?:[a-z][a-z]*[0-9]+[a-z0-9]*))(\^)((?:[a-z][a-z]*[0-9]+[a-z0-9]*)).*?((?:[a-z][a-z]+))
c# code:-
string txt = "784SFS^784SFS Value is here";
var regs = #"((?:[a-z][a-z]*[0-9]+[a-z0-9]*))(\^)((?:[a-z][a-z]*[0-9]+[a-z0-9]*)).*?((?:[a-z][a-z]+))";
Regex r = new Regex(regs, RegexOptions.IgnoreCase | RegexOptions.Singleline);
Match m = r.Match(txt);
Console.Write(m.Success ? "matched" : "didn't match");
Console.ReadLine();
Help appreciated. Thanks
Verbatim ^[^\W_]+\^[^\W_]+[ ].*$
^ # BOS
[^\W_]+ # Alphanum
\^ # Caret
[^\W_]+ # Alphanum
[ ] # Space
.* # Anything
$ # EOS
Output
** Grp 0 - ( pos 0 , len 28 )
27R4FF^27R4FF Text until end
I wasn't sure whether the text 'until end' should be matched as well.
This works for
27R4FF^27R4FF Text
^\w+\^\w+\s\w+$
if you have some spaces at the end, try with
^\w+\^\w+\s[\w\s]+$
Try this: https://regex101.com/r/hD0hV0/2
^[\da-z]+\^[\da-z]+\s.*$
...or commented (assumes RegexOptions.IgnorePatternWhitespace if you're using the format in code):
^ # always starts...
[\da-z]+ # ...with alphanumeric (case-insensitive)
\^ # then always caret sign ^ (no space before & after)
[\da-z]+ # then alphanumeric string
\s # then always one white space
.* # then string...
$ # ...until end.
The other answers don't actually match what you describe (at the time of this writing) because \w matches underscore and you didn't mention any limitations on "the string at the end".
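A quick way to see the difference is to run both kinds of pattern against a few inputs. The following is only a sketch; the class and method names (CaretMatchSketch, Strict, Loose) are made up for illustration, and the underscore input is an invented test value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaretMatchSketch
{
    // Pattern from this answer: alphanumerics only, then anything after one whitespace.
    public static bool Strict(string s) =>
        Regex.IsMatch(s, @"^[\da-z]+\^[\da-z]+\s.*$", RegexOptions.IgnoreCase);

    // \w-based pattern from the other answer: also accepts underscores,
    // and allows only a single word after the whitespace.
    public static bool Loose(string s) =>
        Regex.IsMatch(s, @"^\w+\^\w+\s\w+$");

    public static void Main()
    {
        string[] inputs =
        {
            "784SFS^784SFS Value is here",
            "27R4FF^27R4FF Text until end",
            "27_R4^27R4FF Text"   // invented input with an underscore
        };
        foreach (var s in inputs)
            Console.WriteLine("{0}: strict={1}, loose={2}", s, Strict(s), Loose(s));
    }
}
```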

Regular expressions: How to remove all "R.G(*******)" from a string

There are several strings, and I want to remove all "R.G(**)" from these strings. For example:
1. Original string:
Push("Command", string.Format(R.G("#{0} this is a string"), accID));
Result:
Push("Command", string.Format("#{0} this is a string", accID));
2. Original string:
Select(Case(T["AccDirect"]).WhenThen(1, R.G("input")).Else(R.G("output")).As("Direct"));
Result:
Select(Case(T["AccDirect"]).WhenThen(1, "input").Else("output").As("Direct"));
3. Original string:
R.G("this is a \"string\"")
Result:
"this is a \"string\""
4. Original string:
R.G("this is a (string)")
Result:
"this is a (string)"
5. Original string:
AppendLine(string.Format(R.G("[{0}] Error:"), str) + R.G("Contains one of these symbols: \\ / : ; * ? \" \' < > | & +"));
Result:
AppendLine(string.Format("[{0}] Error:", str) + "Contains one of these symbols: \\ / : ; * ? \" \' < > | & +");
6. Original string:
R.G(@"this is the ""1st"" string.
this is the (2nd) string.")
Result:
@"this is the ""1st"" string.
this is the (2nd) string."
Please Help.
Use this; capture group 0 is your target and group 1 is the replacement.
Fiddle
R[.]G[(]"(.*?[^\\])"[)]
Example acting on your #2 and #4 string and a new edge case R.G("this is a (\"string\")")
var pattern = @"R[.]G[(]\""(.*?[^\\])\""[)]";
var str = "Select(Case(T[\"AccDirect\"]).WhenThen(1, R.G(\"input\")).Else(R.G(\"output\")).As(\"Direct\"));";
var str2 = "R.G(\"this is a (string)\")";
var str3 = "R.G(\"this is a (\\\"string\\\")\")";
var res = Regex.Replace(str,pattern, "\"$1\"");
var res2 = Regex.Replace(str2,pattern, "\"$1\"");
var res3 = Regex.Replace(str3,pattern, "\"$1\"");
Try this:
var result = Regex.Replace(input, @"(.*)R\.G\(([^)]*)\)(.*)", "$1$2$3");
explanation:
(.*) # capture any characters
R\.G\( # then match 'R.G('
([^)]*) # then capture anything that isn't ')'
\) # match end parenthesis
(.*) # and capture any characters after
The $1$2$3 replaces your entire match with capture groups 1, 2, and 3, which effectively removes everything that isn't part of those groups, namely the "R.G(" and ")" around the argument.
Note that you will run into problems if the quoted string itself contains 'R.G' or a right parenthesis. Also, because the first (.*) is greedy and the match consumes the whole line, only the last R.G(...) on each line is removed per call. Depending on your input data, this may still do the trick well enough.
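A different approach, shown here only as a sketch, is to match the string literal itself so that parentheses and escaped quotes inside it do not matter. The pattern below is not from either answer; it assumes the argument of R.G is always a single regular or verbatim C# string literal, and the names StripRGSketch and Strip are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripRGSketch
{
    // R.G( followed by either a verbatim literal (@"..." with "" as an escaped quote)
    // or a regular literal ("..." with backslash escapes), followed by ).
    private static readonly Regex Wrapper = new Regex(
        @"R\.G\((@""(?:[^""]|"""")*""|""(?:[^""\\]|\\.)*"")\)");

    // Hypothetical helper: keeps the literal (group 1) and drops the R.G( ) wrapper.
    public static string Strip(string source) => Wrapper.Replace(source, "$1");

    public static void Main()
    {
        Console.WriteLine(Strip("WhenThen(1, R.G(\"input\")).Else(R.G(\"output\"))"));
        Console.WriteLine(Strip("R.G(\"this is a (string)\")"));
        Console.WriteLine(Strip("R.G(\"this is a \\\"string\\\"\")"));
    }
}
```

Since the regex consumes one literal at a time instead of a whole line, several R.G(...) calls on the same line are all unwrapped.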

C# simple regex for replacing "\ / : * ? " < > |"

Hey there!
I'm not really into regular expressions, but I need a simple regex for replacing all of the following chars:
\ / : * ? " < > |
thanks^^
There you have it:
[\\/:*?"<>|]
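In C# the character class can be passed to Regex.Replace as a verbatim string, doubling the quote inside it. A minimal sketch follows; the question does not say what the characters should be replaced with, so the underscore is an assumption, and the names FileNameSketch and Sanitize are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNameSketch
{
    // Replaces each of \ / : * ? " < > | with an underscore (assumed replacement).
    public static string Sanitize(string name) =>
        Regex.Replace(name, @"[\\/:*?""<>|]", "_");

    public static void Main()
    {
        Console.WriteLine(Sanitize("report: Q1/Q2 <draft>?.txt"));
        // prints report_ Q1_Q2 _draft__.txt
    }
}
```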
Here is an example which may be helpful:
CAP*!(ITAL)!('CINE)!(CAP [A-Z])
It finds CAP, but not when followed by ITAL or 'CINE, or when followed by a space and a word beginning with a capital letter.
