Unicode to human-readable string in C# .NET

This is probably a very basic question, but I would really appreciate your help with this:
I want to convert a string that contains escape sequences like \u000d\u000a\u000d\u000a to a human-readable string. However, I don't want to use the .Replace method, since the string might contain many more Unicode escapes than the ones I tell the software to check for and replace.
string text = "Test \u000d\u000a\u000d\u000aTesting with new line. \u000d\u000a\u000d\u000aone more new line";
I receive this string as a JSON object from my server.

Do you even need that?
For example, the following code prints abc, which is the decoded value, because the compiler already resolves \uXXXX escapes inside a regular string literal:
var unicodeString = "\u0061\u0062\u0063";
Console.WriteLine(unicodeString);
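Since the string in the question arrives as JSON, a JSON parser should already decode the \uXXXX escapes while reading it. The following is a minimal sketch, assuming System.Text.Json is available (.NET Core 3.0 or later); the class and method names are made up for illustration. Regex.Unescape is shown as a possible alternative for text that carries the escapes but is not wrapped in JSON:

```csharp
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

public static class UnescapeSketch
{
    // Hypothetical helper: decodes a complete JSON string literal, quotes included.
    public static string FromJsonLiteral(string jsonLiteral)
        => JsonSerializer.Deserialize<string>(jsonLiteral);

    // Hypothetical helper: decodes backslash escapes in plain text (no JSON quotes).
    public static string FromEscapedText(string escaped)
        => Regex.Unescape(escaped);

    public static void Main()
    {
        // Verbatim literals, so the backslashes really are part of the string,
        // as they would be in the raw server response.
        string json = @"""Test \u000d\u000aTesting with new line.""";
        Console.WriteLine(FromJsonLiteral(json));
        Console.WriteLine(FromEscapedText(@"one more \u000d\u000anew line"));
    }
}
```

Regex.Unescape also interprets other regular-expression escapes and throws on sequences it does not recognise, so it is probably only a fit when the input is known to contain simple escapes.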

Related

Is there a method to get escaped unicode characters from a string?

I need to create a method that validates whether a string contains an escaped Unicode character. The first thing that came to mind was to convert the string to a char[] so I could go through each element and extract the "\uxxxx" part from there. It didn't work. Here is a small example:
public static string Quoted(string text)
    => $"\"{text}\"";
string inputData = Quoted(@"a \u26Be b");
Console.WriteLine(inputData); // ==> output will be: "a \u26Be b"
So there is obviously a large Unicode character in there. I feel those quotes make my work harder. Also, I'm a beginner who has just finished the algorithms part of learning; now I'm learning and working with JSON, GitHub, Git Bash, etc. This task is a unit test for validating a JSON file. Any help or hints would be appreciated. Thanks in advance!
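One possible approach is a regular expression rather than a char[] walk. This is only a sketch, assuming "escaped Unicode character" means a literal backslash, the letter u, and exactly four hexadecimal digits; the class and method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDetector
{
    // Backslash, 'u', then exactly four hexadecimal digits.
    private static readonly Regex UnicodeEscape = new Regex(@"\\u[0-9a-fA-F]{4}");

    public static bool ContainsUnicodeEscape(string text)
        => UnicodeEscape.IsMatch(text);

    public static void Main()
    {
        Console.WriteLine(ContainsUnicodeEscape(@"""a \u26Be b""")); // True
        Console.WriteLine(ContainsUnicodeEscape("\"a b\""));         // False
    }
}
```

The sketch does not distinguish \u26Be from \\u26Be (an escaped backslash followed by plain text), which a stricter JSON validator would need to handle.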

Base64EncodedString does not include NewLines

I'm using a .NET Core 3.0 project on Windows 10. I'm trying to encode a string to Base64 with the code below:
var stringvalue = "Row1" + Environment.NewLine + "\n\n" + "Row2";
var encodedString = Convert.ToBase64String(Encoding.UTF8.GetBytes(stringvalue));
encodedString has then below result:
Um93MQ0KCgpSb3cy
stringvalue is:
Row1\r\n\n\nRow2
However, if I pass the same value to this site (https://www.base64encode.org/), I get a different result:
Um93MVxyXG5cblxuUm93Mg==
In Visual Studio, I tried to resave the file with Unix line endings, but without any luck.
I want the string to be encoded the way it's done on https://www.base64encode.org. Any ideas how to get this done?
From the screenshot, I can see that you have entered a different string from the string you used in your C# code. The string you used in https://www.base64encode.org is represented as a C# string literal like this:
"Row1\\r\\n\\n\\nRow2"
// or
@"Row1\r\n\n\nRow2"
So to answer your question:
I want the string to be encoded the way it's done on https://www.base64encode.org. Any ideas how to get this done?
You should do:
var encodedString = Convert.ToBase64String(Encoding.UTF8.GetBytes("Row1\\r\\n\\n\\nRow2"));
But that's probably not what you actually want. Your first attempt at the C# code is more likely what you want, because that string actually contains a carriage return character followed by 3 newline characters. The string you entered in https://www.base64encode.org contains no line breaks at all: each \r or \n there is simply a backslash character followed by the letter r or n.
You can't really make the output on https://www.base64encode.org match the C# output, because you can only choose one kind of line separator on there. You can only encode either Row1\r\n\r\n\r\nRow2 or Row1\n\n\nRow2. Nevertheless, you can check that the C# result is correct by decoding the output using https://www.base64decode.org.
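To see the difference side by side, here is a small sketch that encodes both variants. The class and method names are made up for illustration, and "\r\n" is written out explicitly instead of Environment.NewLine so the result does not depend on the platform:

```csharp
using System;
using System.Text;

public static class Base64Sketch
{
    public static string Encode(string value)
        => Convert.ToBase64String(Encoding.UTF8.GetBytes(value));

    public static void Main()
    {
        // Real control characters: CR, LF, LF, LF.
        Console.WriteLine(Encode("Row1\r\n\n\nRow2"));  // Um93MQ0KCgpSb3cy

        // Verbatim literal: backslashes followed by letters, as typed into the website.
        Console.WriteLine(Encode(@"Row1\r\n\n\nRow2")); // Um93MVxyXG5cblxuUm93Mg==
    }
}
```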
The \r\n you typed is encoded literally by the website: it is not a newline but four characters (\, r, \, n). The site has a newline separator option where you can select the Windows style, which is what applies when you convert your real-world input value:
Row1
Row2
I guess your \r\n\n\n is just a mistake; the website is only prepared to convert it to \r\n\r\n.

Is there a way to trim leading characters using String.Format()?

I need to know if there is a way to use String.Format to remove leading characters from a string. I have a limitation in some existing code: I can only pass in a string and a format string for it.
So can you do something like
String.Format("Test output: {0:#}","001")
and produce the output
"Test output: 1"
I think the answer is 'No' but I wanted to make sure.
EDIT: To clarify, the format string will be put in a configuration file and the string to be formatted is a value coming out of a database. I can't execute any code on it. Has to be through the format string.
You could do it on the argument you are passing:
String.Format("Test output: {0:#}", "001".TrimStart('0'))
Alternatively, you could probably do a find-and-replace with a regular expression on the resulting string.
Another alternative is to define and pass in your own formatter using a custom implementation of IFormatProvider. I am not sure whether this is allowed, based on your last edit.
However, based on the restrictions listed, there is no way to do it with just the format string input.
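For completeness, the custom IFormatProvider idea could look roughly like the sketch below. It only helps if the calling code can be changed to pass a provider to String.Format, which the question's restrictions may rule out. The class names are hypothetical, and treating "#" as "trim leading zeros" for string arguments is an assumption of this sketch, not built-in behaviour:

```csharp
using System;
using System.Globalization;

public sealed class TrimZerosFormatter : IFormatProvider, ICustomFormatter
{
    // String.Format asks the provider for an ICustomFormatter; return this instance.
    public object GetFormat(Type formatType)
        => formatType == typeof(ICustomFormatter) ? this : null;

    public string Format(string format, object arg, IFormatProvider formatProvider)
    {
        // Assumption: "#" applied to a string argument means "strip leading zeros".
        if (format == "#" && arg is string s)
        {
            string trimmed = s.TrimStart('0');
            return trimmed.Length == 0 ? "0" : trimmed;
        }

        // Everything else falls back to the argument's normal formatting.
        if (arg is IFormattable formattable)
            return formattable.ToString(format, CultureInfo.CurrentCulture);
        return arg?.ToString() ?? string.Empty;
    }
}

public static class FormatterDemo
{
    public static void Main()
    {
        string result = string.Format(new TrimZerosFormatter(), "Test output: {0:#}", "001");
        Console.WriteLine(result); // Test output: 1
    }
}
```

Returning "0" for an all-zero input is a design choice of the sketch; returning an empty string would be equally defensible.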

Is it possible to enter a New Line in a string without Escape Sequences?

I want a String to have a New Line in it, but I cannot use escape sequences because the interface I am sending my string to does not recognize them. As far as I know, C# does not actually store a New Line in the String, but rather it stores the escape sequence, causing the literal contents to be passed, rather than what they actually mean.
My best guess is that I would have to somehow parse the number 10 (the decimal value of a New Line according to the ASCII table) into ASCII. But I'm not sure how to do that, because C# parses numbers directly to String if attempting this:
"hello" + 10 + "world"
Any suggestions?
If you say "hello\nworld", the actual string will contain:
hello
world
There will be an actual new-line character in the string. At no point are the characters \ and n stored in the string.
There are a few ways to get the exact same result, but a simple \n in the string is a common way.
A simple cast should also do the same:
"hello" + (char)10 + "world"
This is likely slightly slower because of the string concatenation. I say "likely" because the compiler could probably optimize it away, and a real example using \n may well involve string concatenation too, taking roughly the same amount of time.
Test.
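A quick way to check the claim that both forms yield the same string, with a real line feed (code 10) and no backslash stored anywhere. The class and method names are invented for this sketch:

```csharp
using System;

public static class NewLineSketch
{
    // Builds the string with a cast instead of an escape sequence.
    public static string JoinWithLineFeed(string first, string second)
        => first + (char)10 + second;

    public static void Main()
    {
        string escaped = "hello\nworld";
        string cast = JoinWithLineFeed("hello", "world");

        Console.WriteLine(escaped == cast);          // True
        Console.WriteLine(escaped.Length);           // 11
        Console.WriteLine((int)escaped[5]);          // 10
        Console.WriteLine(escaped.Contains("\\"));   // False
    }
}
```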
The preferred new line character is Environment.NewLine for its cross-platform capability.
You could use XML for communication, if your receiver can handle this.

Convert string to char

I get a string from another class that must be converted to a char. It usually contains only one character, and that's not a problem. But I receive control characters like '\\n' or '\\t'.
Are there standard methods to convert these to a newline or tab character, or do I need to parse them myself?
edit:
Sorry, the parser ate one slash. I receive '\\t'
I assume that you mean that the class that sends you the data is sending you a string like "\n". In that case you have to parse this yourself using:
Char.Parse(returnedChar)
Otherwise, if you already have a char, you can just convert it to a string like this:
returnedChar.ToString()
New line:
string escapedNewline = @"\\n";
string cleanupNewLine = escapedNewline.Replace(@"\\n", Environment.NewLine);
OR
string cleanupNewLine = escapedNewline.Replace(@"\\n", "\n");
Tab:
string escapedTab = @"\\t";
string cleanupTab = escapedTab.Replace(@"\\t", "\t");
Note that the replacement value is not a verbatim string literal (i.e. I did not use @"\t", because that would not represent a tab).
Alternatively you could consider Regular Expressions if you need to replace a range of different string patterns.
You should probably write a utility function to encapsulate the common behaviour above for all the possible Escape Sequences
Then you'd write some Unit Tests to cover each of the cases you can think of.
As you encounter any bugs you add more unit tests to cover those cases.
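Such a utility function might look like the following sketch. It assumes the incoming string is either a single character or one backslash followed by a letter; if the sender really delivers two backslashes, as in the snippets above, the table keys would need to change accordingly. The names EscapeParser and ToChar are hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class EscapeParser
{
    // Known two-character escape sequences and the characters they stand for.
    private static readonly Dictionary<string, char> Known = new Dictionary<string, char>
    {
        { @"\n", '\n' },
        { @"\t", '\t' },
        { @"\r", '\r' },
        { @"\0", '\0' },
        { @"\\", '\\' },
    };

    public static char ToChar(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (text.Length == 1) return text[0];                // ordinary single character
        if (Known.TryGetValue(text, out char c)) return c;   // recognised escape sequence
        throw new FormatException("Not a single character or known escape: " + text);
    }

    public static void Main()
    {
        Console.WriteLine((int)ToChar(@"\n")); // 10
        Console.WriteLine((int)ToChar(@"\t")); // 9
        Console.WriteLine(ToChar("x"));        // x
    }
}
```

Each entry in the table is a natural candidate for its own unit test, along with one test for the FormatException path.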
UPDATE
You could represent a tab in the XML with a special character sequence:
see this article
This article applies to SQL Server but may well be relevant to C# also?
To be absolutely sure, you could try generating a string with a tab in it and putting it into some XML (programmatically) and using XmlSerializer to serialize that to a file to see what the output is, then you can be sure that this will faithfully 'round-trip' the string with the tab still in it.
How about using string.ToCharArray()?
You can then add the appropriate logic to process whatever was in the string.
char.Parse(string) is used to convert a string to a char, and you can go the other way with
char.ToString()
100% solved
