I am trying to compare two strings, but one of them contains a white space at the end. I used Trim() before comparing, but that didn't work because the white space is being converted to %20, and I think Trim() does not remove that. It is something like "abc" and "abc%20". What can I do in this situation to compare the strings while ignoring case too?
%20 is the url-encoded version of space.
You can't strip it off directly using Trim(), but you can use HttpUtility.UrlDecode() to decode the %20 back to a space, then trim and compare exactly as you would otherwise:
using System.Web;
//...
var test1 = "HELLO%20";
var test2 = "hello";
Console.WriteLine(HttpUtility.UrlDecode(test1).Trim().
Equals(HttpUtility.UrlDecode(test2).Trim(),
StringComparison.InvariantCultureIgnoreCase));
> True
Use HttpUtility.UrlDecode to decode the strings:
string s1 = "abc ";
string s2 = "abc%20";
if (System.Web.HttpUtility.UrlDecode(s1).Equals(System.Web.HttpUtility.UrlDecode(s2)))
{
//equals...
}
In a WinForms, console, or any other non-ASP.NET project, you will have to add a reference to the System.Web assembly in your project.
Something like:
if (System.Uri.UnescapeDataString("abc%20").ToLower() == myString.ToLower()) {}
The "%20" is the url encoded version of the ' ' (space) character. Are you comparing an encoded URL parameter? If so, you can use:
string str = "abc%20";
string decoded = HttpUtility.UrlDecode(str); // decoded becomes "abc "
If you need to trim any white spaces, you should do this for the decoded string. The Trim method does not understand or recognize the encoded whitespace characters.
decoded = decoded.Trim();
Now you can compare with the decoded variable using:
decoded.Equals(otherValue, StringComparison.OrdinalIgnoreCase);
StringComparison.OrdinalIgnoreCase is probably the fastest option for a case-insensitive comparison between strings.
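The decode, trim, and compare steps above can be folded into one helper. This is a minimal sketch, not a definitive implementation: the class and method names (EncodedStringComparer, EqualsDecoded) are hypothetical, and it assumes System.Net.WebUtility.UrlDecode is an acceptable substitute for HttpUtility.UrlDecode so that no System.Web reference is needed.

```csharp
using System;
using System.Net;

public static class EncodedStringComparer
{
    // Hypothetical helper: decodes percent-encoded input, trims surrounding
    // whitespace, then compares the results ignoring case.
    public static bool EqualsDecoded(string left, string right)
    {
        if (left == null || right == null)
        {
            return left == right; // equal only when both are null
        }

        string a = WebUtility.UrlDecode(left).Trim();
        string b = WebUtility.UrlDecode(right).Trim();
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, EncodedStringComparer.EqualsDecoded("abc%20", "ABC") should return true, because "abc%20" decodes to "abc " and the trailing space is trimmed before the comparison.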
Did you try this?
string before = "abc%20";
string after = before.Replace("%20", "").ToLower();
You can use String.Replace and, since you mentioned case insensitivity, String.ToLower, like this:
var str1 = "abc";
var str2 = "Abc%20";
bool equal = str1.Replace("%20", "").ToLower() == str2.Replace("%20", "").ToLower();
// equal will be true
The root problem seems to be how the URL is being encoded. If you use the appropriate character encoding, you should not get %20. The default encoding used by HttpUtility.UrlEncode is UTF-8. Here is the usage:
System.Web.HttpUtility.UrlEncode("ãName Marcos", System.Text.Encoding.GetEncoding("iso-8859-1"))
You can read more about character encoding on the Microsoft website.
If you encode properly, you can avoid the rest of the work.
And here is what you asked for:
The second case: if you have to compare two strings as described, you need to decode them first with HttpUtility.UrlDecode(test):
bool result = HttpUtility.UrlDecode(stringOne).Equals(HttpUtility.UrlDecode(stringTwo));
The result variable then tells you whether they are equal:
Console.WriteLine("Result is {0}", result ? "equal." : "not equal.");
Hope it helps
Related
How would I go about in replacing every character in a string which are not the following:
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_.#-
with -
So for example, the name Danny D'vito would become DannyD-vito
My initial thought was to convert the string to a char[], loop through and check each character, then convert back to a string. But my hunch tells me there must be an easier way to do this.
Regex.Replace() approach
string input = "Danny D'vito";
string result = new Regex("[^a-zA-Z0-9_.#-]").Replace(input, "-");
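As a small sketch of the Regex.Replace() approach above: with that character class, the space is also outside the allowed set, so "Danny D'vito" becomes "Danny-D-vito" rather than the "DannyD-vito" shown in the question. The helper below is hypothetical (NameSanitizer, Sanitize, and the dropSpaces flag are assumptions, not part of the original answer) and optionally removes spaces first to reproduce the question's example.

```csharp
using System.Text.RegularExpressions;

public static class NameSanitizer
{
    // Matches any single character that is NOT in the allowed set.
    private static readonly Regex Disallowed = new Regex("[^a-zA-Z0-9_.#-]");

    // Hypothetical helper: replaces each disallowed character with '-'.
    // When dropSpaces is true, spaces are removed before the replacement.
    public static string Sanitize(string input, bool dropSpaces = false)
    {
        string source = dropSpaces ? input.Replace(" ", "") : input;
        return Disallowed.Replace(source, "-");
    }
}
```

NameSanitizer.Sanitize("Danny D'vito") gives "Danny-D-vito", while NameSanitizer.Sanitize("Danny D'vito", dropSpaces: true) gives "DannyD-vito".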
I have a string (text) that I would like to convert using a JSON parser so that it is javascript friendly.
In my view page I have some javascript that looks like:
var site = {
strings: {
addToCart: @someValue,
So @someValue should be JavaScript safe: wrapped in double quotes, with characters escaped if needed, etc.
That value @someValue is a string, but it has to be JavaScript friendly, so I want to parse it using JSON.
Does the new System.Text.Json have something?
I tried this:
return System.Text.Json.JsonDocument.Parse(input).ToString();
But this doesn't work because my text is just a string, not a JSON string.
Is there another way to parse something?
The rules for escaping strings to make them JSON safe are as follows:
Backspace is replaced with \b
Form feed is replaced with \f
Newline is replaced with \n
Carriage return is replaced with \r
Tab is replaced with \t
Double quote is replaced with \"
Backslash is replaced with \\
And while it's not strictly necessary, any non-web-safe character (i.e. any non-ASCII character) can be converted to its escaped Unicode equivalent to avoid potential encoding issues.
From this, it's pretty straightforward to create your own conversion method:
public static string MakeJsonSafe(String s)
{
var jsonEscaped = s.Replace("\\", "\\\\")
.Replace("\"", "\\\"")
.Replace("\b", "\\b")
.Replace("\f", "\\f")
.Replace("\n", "\\n")
.Replace("\r", "\\r")
.Replace("\t", "\\t");
var nonAsciiEscaped = jsonEscaped.Select((c) => c >= 127 ? "\\u" + ((int)c).ToString("X").PadLeft(4, '0') : c.ToString());
return string.Join("", nonAsciiEscaped);
}
(Like I said, the nonAsciiEscaped stage can be omitted as it's not strictly necessary.)
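Since the question asks whether System.Text.Json offers something, here is a minimal sketch using JsonSerializer.Serialize, which turns a plain .NET string into a JSON string literal, surrounding double quotes included. The wrapper names (JsLiteral, ToJsonString) are hypothetical. Note that the default encoder is fairly conservative and may emit \uXXXX escapes for more characters than the hand-written rules above do, so the exact text can differ from MakeJsonSafe while still being valid JSON.

```csharp
using System.Text.Json;

public static class JsLiteral
{
    // Hypothetical wrapper: serializes a plain string to a JSON string
    // literal (quoted and escaped) that can be embedded in JavaScript.
    public static string ToJsonString(string text)
    {
        return JsonSerializer.Serialize(text);
    }
}
```

Because the result already contains the surrounding quotes, it would be emitted into the script as-is, without adding another pair of quotes around it.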
I have this code, which should erase all the numbers after a certain _:
var fileNameOnly1 = Regex.Replace(fileNameOnly, @"[_\d]", string.Empty);
I.e.
Input
4a_32
abcdef43252_43242
Current Output
4a2
abcdef432523242
Expected output
4a
abcdef43252
I also tried using @"[_\d]".
Is there any way to erase the numbers after the _ and also erase the '_' itself?
You don't specifically mention that you need to use regex, and in most cases I would advise against it, as regex is rather slow (compared to other methods) and cumbersome (difficult to read and write).
I would think that it would be better to do this using string manipulation instead.
var fileNameOnly1 = fileNameOnly.Split('_')[0];
The above code will find the first '_' and take all characters before it (returned as a string).
Try this
Pattern
_\d+
Example
var fileNameOnly = "asdads_234asd";
var result = Regex.Replace(fileNameOnly, @"_\d+", string.Empty);
Console.WriteLine(result);
Output
asdadsasd
Simply use this regex:
_\d+
Regex.Replace(fileNameOnly, @"_\d+", string.Empty);
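The _\d+ pattern removes an underscore followed by digits wherever it occurs. If the intent is to strip only a numeric suffix at the very end of the name, as in the question's sample inputs, one option is to anchor the pattern with $. This is a sketch under that assumption; FileNameCleaner and StripNumericSuffix are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class FileNameCleaner
{
    // Hypothetical helper: removes a trailing "_<digits>" suffix, if present.
    // The $ anchor restricts the match to the end of the string.
    public static string StripNumericSuffix(string fileNameOnly)
    {
        return Regex.Replace(fileNameOnly, @"_\d+$", string.Empty);
    }
}
```

With the inputs from the question, StripNumericSuffix("4a_32") gives "4a" and StripNumericSuffix("abcdef43252_43242") gives "abcdef43252". A name such as "asdads_234asd" is left unchanged, because the digits are not at the end.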
I want to filter some string which has some wrong letters (non-ASCII). It looks different in Notepad, Visual Studio 2010 and MySQL.
How can I check if a string has non-ASCII letters and how I can remove them?
You could use a regular expression to filter non ASCII characters:
string input = "AB £ CD";
string result = Regex.Replace(input, "[^\x0d\x0a\x20-\x7e\t]", "");
You could use Regular Expressions.
Regex.Replace(input, "[^a-zA-Z0-9]+", "")
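You could also use \W+ as the pattern to remove any non-word character. Note that in .NET, \w matches Unicode letters by default, so \W+ on its own does not strip non-ASCII letters.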
This has been a God-send:
Regex.Replace(input, #"[^\u0000-\u007F]", "");
I think I originally got it elsewhere, but the same answer is given in this question:
How can you strip non-ASCII characters from a string? (in C#)
string testString = Regex.Replace(OldString, #"[\u0000-\u0008\u000A-\u001F\u0100-\uFFFF]", "");
First, you need to determine what you mean by a "word". If non-ascii, this probably implies non-english?
Personally, I'd ask why you need to do this and what fundamental assumption has your application got that conflicts with your data? Depending on the situation, I suggest you either re-encode the text from the source encoding, although this will be a lossy conversion, or alternatively, address that fundamental assumption so that your application handles data correctly.
I think something as simple as this would probably work, wouldn't it?
public static string AsciiOnly(this string input, bool includeExtendedAscii)
{
int upperLimit = includeExtendedAscii ? 255 : 127;
char[] asciiChars = input.Where(c => (int)c <= upperLimit).ToArray();
return new string(asciiChars);
}
Example usage:
string input = "AB£ȼCD";
string asciiOnly = input.AsciiOnly(false); // returns "ABCD"
string extendedAsciiOnly = input.AsciiOnly(true); // returns "AB£CD"
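The answers above cover removal; the "how can I check" half of the question can be sketched the same way. This is a minimal example, assuming "non-ASCII" means any char with a code above 127; AsciiCheck and ContainsNonAscii are hypothetical names.

```csharp
using System.Linq;

public static class AsciiCheck
{
    // Hypothetical helper: returns true if any character lies outside
    // the 7-bit ASCII range (code points 0 through 127).
    public static bool ContainsNonAscii(string input)
    {
        return input.Any(c => c > 127);
    }
}
```

For instance, AsciiCheck.ContainsNonAscii("AB £ CD") returns true because '£' is U+00A3, while AsciiCheck.ContainsNonAscii("ABCD") returns false.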
I need to strip unknown characters from the end of a string returned from an SQL database. I also need to log when a special character occurs in the string.
What's the best way to do this?
You can use the Trim() method to trim blanks or specific characters from the end of a string. If you need to trim a certain number of characters, you can use the Substring() method. You can use regular expressions (System.Text.RegularExpressions namespace) to match patterns in a string and detect when they occur. See MSDN for more info.
If you need more help you'll need to provide a bit more info on what exactly you're trying to do.
First, define what the unknown characters are (anything other than 0-9, a to z, and A to Z?) and put them in an array.
Loop through the characters of the string and check whether each one occurs in that array; if so, remove it.
You can also call String.Replace with the unknown character as the first parameter and an empty string as the replacement.
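The loop described above can be sketched as follows, restricted to the end of the string as the question asks. This assumes the allowed characters are ASCII letters and digits only; TrailingCharStripper, IsAllowed, and StripTrailing are hypothetical names, and the removed tail is returned so the caller can log it.

```csharp
public static class TrailingCharStripper
{
    // Assumption: only 0-9, a-z and A-Z count as "known" characters.
    private static bool IsAllowed(char c) =>
        (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');

    // Hypothetical helper: walks backwards from the end of the string,
    // returns the cleaned text, and reports the stripped tail via 'removed'.
    public static string StripTrailing(string input, out string removed)
    {
        int end = input.Length;
        while (end > 0 && !IsAllowed(input[end - 1]))
        {
            end--;
        }

        removed = input.Substring(end);   // empty when nothing was stripped
        return input.Substring(0, end);
    }
}
```

Calling StripTrailing("HELLO WORLD!!!!", out var removed) returns "HELLO WORLD" and sets removed to "!!!!". The space in the middle is kept because only the end of the string is examined. A non-empty removed value is the signal to write a log entry.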
Since you've specified that the legal characters are only alphanumeric, you could do something like this:
Match m = Regex.Match(original, "^([0-9A-Za-z]*)(.*)$");
string good = m.Groups[1].Value;
string bad = m.Groups[2].Value;
if (bad.Length > 0)
{
// log bad characters
}
Console.WriteLine(good);
Your definition of the problem is not precise, but here is a quick trick to do this:
string input;
...
var trimmed = input.TrimEnd(new[] {'#','$',...} /* array of unwanted characters */);
if (trimmed != input)
myLogger.Log(input.Substring(trimmed.Length)); // log only the characters removed from the end
Check out the Regex.Replace methods; there are lots of overloads. You can use the Match methods for logging, to identify all matches.
String badString = "HELLO WORLD!!!!";
Regex regex = new Regex("!{1,}$" );
String newString = regex.Replace(badString, String.Empty);