Reliably checking if a string is base64 encoded in .Net - c#

Before I start: Yes, I have checked the other questions and answers on this topic both here and elsewhere.
I have found an example string that .NET will Base64-decode even though it isn't actually Base64-encoded. Here is the example:
Rhinocort Aqueous 64mcg/dose Nasal Spray
The .Net method Convert.FromBase64String does not throw an exception when decoding this string so my IsBase64Encoded method happily returns true for this string.
Interestingly, if I use the cygwin base64 -d command using this string as input, it fails with the message invalid input.
Even more interestingly, the source that I thought belonged to this executable (http://libb64.sourceforge.net/) "decodes" this same string with the same result that I get from the .NET Convert.FromBase64String. I will keep looking, hoping to find a clue elsewhere, but right now I'm stumped.
Any ideas?

There's a slightly better solution which also checks the input string length.
I recommend you do a check at the beginning. If the input is null or empty then return false.
http://www.codeproject.com/Questions/177808/How-to-determine-if-a-string-is-Base-decoded-or

When strings do pass Base64 decoding and the decoded data has special characters, then perhaps we can conclude that it was not valid Base64 (this depends on the encoding). Also, sometimes we're expecting the data being passed to be Base64, but sometimes it may not be properly padded with '='. Therefore, one method uses "strict" rules for Base64 and the other is "forgiving".
[TestMethod]
public void CheckForBase64()
{
Assert.IsFalse(IsBase64DataStrict("eyJhIjoiMSIsImIiOiI2N2NiZjA5MC00ZGRiLTQ3OTktOTlmZi1hMjhhYmUyNzQwYjEiLCJmIjoiMSIsImciOiIxIn0"));
Assert.IsTrue(IsBase64DataForgiving("eyJhIjoiMSIsImIiOiI2N2NiZjA5MC00ZGRiLTQ3OTktOTlmZi1hMjhhYmUyNzQwYjEiLCJmIjoiMSIsImciOiIxIn0"));
Assert.IsFalse(IsBase64DataForgiving("testing123"));
Assert.IsFalse(IsBase64DataStrict("ABBA"));
Assert.IsFalse(IsBase64DataForgiving("6AC648C9-C08F-4F9D-A0A5-3904CF15ED3E"));
}
public bool IsBase64DataStrict(string data)
{
if (string.IsNullOrWhiteSpace(data)) return false;
if ((new Regex(@"[^A-Z0-9+\/=]", RegexOptions.IgnoreCase)).IsMatch(data)) return false;
if (data.Length % 4 != 0) return false;
var e = data.IndexOf('=');
var l = data.Length;
if (!(e == -1 || e == l - 1 || (e == l - 2 && data[l - 1] == '='))) return false;
var decoded = string.Empty;
try
{
byte[] decodedData = Convert.FromBase64String(data);
decoded = Encoding.UTF8.GetString(decodedData);
}
catch(Exception)
{
return false;
}
//check for special chars that you know should not be there
char current;
for (int i = 0; i < decoded.Length; i++)
{
current = decoded[i];
if (current == 65533) return false;
if (!((current == 0x9 || current == 0xA || current == 0xD) ||
((current >= 0x20) && (current <= 0xD7FF)) ||
((current >= 0xE000) && (current <= 0xFFFD)) ||
((current >= 0x10000) && (current <= 0x10FFFF))))
{
return false;
}
}
return true;
}
public bool IsBase64DataForgiving(string data)
{
if (string.IsNullOrWhiteSpace(data)) return false;
//it could be made more forgiving by replacing any spaces with '+' here
if ((new Regex(@"[^A-Z0-9+\/=]", RegexOptions.IgnoreCase)).IsMatch(data)) return false;
//this is the forgiving part
if (data.Length % 4 > 0)
data = data.PadRight(data.Length + 4 - data.Length % 4, '=');
var e = data.IndexOf('=');
var l = data.Length;
if (!(e == -1 || e == l - 1 || (e == l - 2 && data[l - 1] == '='))) return false;
var decoded = string.Empty;
try
{
byte[] decodedData = Convert.FromBase64String(data);
decoded = Encoding.UTF8.GetString(decodedData);
}
catch (Exception)
{
return false;
}
//check for special chars that you know should not be there
char current;
for (int i = 0; i < decoded.Length; i++)
{
current = decoded[i];
if (current == 65533) return false;
if (!((current == 0x9 || current == 0xA || current == 0xD) ||
((current >= 0x20) && (current <= 0xD7FF)) ||
((current >= 0xE000) && (current <= 0xFFFD)) ||
((current >= 0x10000) && (current <= 0x10FFFF))))
{
return false;
}
}
return true;
}
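The false positive in the question can be made visible with a short sketch. Convert.FromBase64String ignores white-space characters, and once the spaces are removed the example string is 36 characters long (a multiple of 4), all drawn from the Base64 alphabet. The class and method names below are hypothetical and only serve the illustration.

```csharp
using System;
using System.Linq;

public static class Base64FalsePositiveDemo
{
    // Hypothetical helper: removes all white space, which approximates
    // what Convert.FromBase64String effectively looks at for this input.
    public static string StripWhiteSpace(string s)
    {
        return new string(s.Where(c => !char.IsWhiteSpace(c)).ToArray());
    }

    // Returns the number of decoded bytes, or -1 if decoding throws.
    public static int DecodedLength(string s)
    {
        try
        {
            return Convert.FromBase64String(s).Length;
        }
        catch (FormatException)
        {
            return -1;
        }
    }
}
```

For "Rhinocort Aqueous 64mcg/dose Nasal Spray", StripWhiteSpace returns 36 characters and DecodedLength returns 27. A decoder that rejects embedded spaces would refuse the same input, which may be why the cygwin tool reports invalid input. If embedded white space should count as invalid, checking for it before calling Convert.FromBase64String is one option.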

Related

Parse WwwAuthenticate challenge string

I am working on a client for a RESTful service, using .NET Core 2.0. The remote service returns challenges like this:
WwwAuthenticate: Bearer realm="https://somesite/auth",service="some site",scope="some scope"
Which need to get turned into token requests like:
GET https://somesite/auth?service=some%20site&scope=some%20scope
Parsing the header to get a scheme and parameter is easy with AuthenticationHeaderValue, but that just gets me the realm="https://somesite/auth",service="some site",scope="some scope" string. How can I easily and reliably parse this to the individual realm, service, and scope components? It's not quite JSON, so deserializing it with NewtonSoft JsonConvert won't work. I could regex it into something that looks like XML or JSON, but that seems incredibly hacky (not to mention unreliable).
Surely there's a better way?
Since I don't see a non-hacky way, maybe this hacky way will help:
string input = @"WwwAuthenticate: Bearer realm=""https://somesite/auth"",service=""some site"",scope=""some, scope""";
var dict = Regex.Matches(input, @"[\W]+(\w+)=""(.+?)""").Cast<Match>()
.ToDictionary(x => x.Groups[1].Value, x => x.Groups[2].Value);
var url = dict["realm"] + "?" + string.Join("&", dict.Where(x => x.Key != "realm").Select(x => x.Key + "=" + WebUtility.UrlEncode(x.Value)));
OUTPUT
url => https://somesite/auth?service=some+site&scope=some%2C+scope
BTW: I added a , in "scope"
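The output above encodes spaces as +, while the request in the question uses %20. A minimal sketch of the URL-building step that produces %20 instead, assuming the parameters have already been parsed into key/value pairs, could use Uri.EscapeDataString. TokenUrlBuilder and Build are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenUrlBuilder
{
    // Hypothetical helper: "realm" becomes the base URL and every other
    // parameter becomes a percent-encoded query-string pair.
    public static string Build(IEnumerable<KeyValuePair<string, string>> parameters)
    {
        var list = parameters.ToList();
        string realm = list.First(p => p.Key == "realm").Value;
        var query = list
            .Where(p => p.Key != "realm")
            .Select(p => Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value));
        return realm + "?" + string.Join("&", query);
    }
}
```

Passing the dictionary from the snippet above (or any sequence of pairs) yields a URL in which "some site" appears as some%20site.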
Possible duplicate of How to parse values from Www-Authenticate
Using the schema defined in RFC6750 and RFC2616, a slightly more precise parser implementation is included below. This parser takes into account the possibility that strings might contain =, ,, and/or escaped ".
internal class AuthParamParser
{
private string _buffer;
private int _i;
private AuthParamParser(string param)
{
_buffer = param;
_i = 0;
}
public static Dictionary<string, string> Parse(string param)
{
var state = new AuthParamParser(param);
var result = new Dictionary<string, string>();
var token = state.ReadToken();
while (!string.IsNullOrEmpty(token))
{
if (!state.ReadDelim('='))
return result;
result.Add(token, state.ReadString());
if (!state.ReadDelim(','))
return result;
token = state.ReadToken();
}
return result;
}
private string ReadToken()
{
var start = _i;
while (_i < _buffer.Length && ValidTokenChar(_buffer[_i]))
_i++;
return _buffer.Substring(start, _i - start);
}
private bool ReadDelim(char ch)
{
while (_i < _buffer.Length && char.IsWhiteSpace(_buffer[_i]))
_i++;
if (_i >= _buffer.Length || _buffer[_i] != ch)
return false;
_i++;
while (_i < _buffer.Length && char.IsWhiteSpace(_buffer[_i]))
_i++;
return true;
}
private string ReadString()
{
if (_i < _buffer.Length && _buffer[_i] == '"')
{
var buffer = new StringBuilder();
_i++;
while (_i < _buffer.Length)
{
if (_buffer[_i] == '\\' && (_i + 1) < _buffer.Length)
{
_i++;
buffer.Append(_buffer[_i]);
_i++;
}
else if (_buffer[_i] == '"')
{
_i++;
return buffer.ToString();
}
else
{
buffer.Append(_buffer[_i]);
_i++;
}
}
return buffer.ToString();
}
else
{
return ReadToken();
}
}
private bool ValidTokenChar(char ch)
{
if (ch < 32)
return false;
if (ch == '(' || ch == ')' || ch == '<' || ch == '>' || ch == '@'
|| ch == ',' || ch == ';' || ch == ':' || ch == '\\' || ch == '"'
|| ch == '/' || ch == '[' || ch == ']' || ch == '?' || ch == '='
|| ch == '{' || ch == '}' || ch == 127 || ch == ' ' || ch == '\t')
return false;
return true;
}
}

Converting string expression to boolean logic - C#

I want to convert a string expression to a real boolean expression.
The expression below will be an input (string):
"(!A && B && C) || (A && !B && C) || (A && B && !C) || (A && B && C)"
The variables A, B and C will each have a boolean value (true or false).
How can I transform the string expression, substitute the logic values, and evaluate it using C#?
If you don't want to use one of the available libraries to parse that string, you need to walk through the characters yourself and implement the logic by comparison. For example, given "a || b", we can loop through each character and decide the appropriate operation based on char == '|'. For more complex situations I'd use a stack to keep track of each result, like this one, which can handle && and || without parentheses (evaluating strictly left to right):
public bool ConvertToBool(string op, bool a, bool b)
{
var st = new Stack<bool>();
var opArray = op.ToCharArray();
var orFlag = false;
var andFlag = false;
for (var i = 0; i < opArray.Length; i++)
{
bool top;
switch (opArray[i])
{
case '|':
i++;
orFlag = true;
break;
case '&':
i++;
andFlag = true;
break;
case 'a':
if (orFlag)
{
top = st.Pop();
st.Push(top || a);
orFlag = false;
continue; // result already pushed; skip the plain push below
}
else if (andFlag)
{
top = st.Pop();
st.Push(top && a);
andFlag = false;
continue;
}
st.Push(a);
break;
case 'b':
if (orFlag)
{
top = st.Pop();
st.Push(top || b); // OR with the previous result
orFlag = false;
continue; // result already pushed; skip the plain push below
}
else if (andFlag)
{
top = st.Pop();
st.Push(top && b);
andFlag = false;
continue;
}
st.Push(b);
break;
}
}
return st.Pop();
}
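The stack approach above does not cover !, parentheses, or the precedence of && over ||, all of which appear in the question's expression. One common alternative is a small recursive-descent evaluator. The following is only a sketch under assumptions: variables are plain identifiers looked up in a dictionary, and BoolExpressionEvaluator is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Grammar, lowest to highest precedence:
//   or    := and ( "||" and )*
//   and   := unary ( "&&" unary )*
//   unary := "!" unary | "(" or ")" | variable
public sealed class BoolExpressionEvaluator
{
    private readonly string _text;
    private readonly IDictionary<string, bool> _vars;
    private int _pos;

    private BoolExpressionEvaluator(string text, IDictionary<string, bool> vars)
    {
        _text = text;
        _vars = vars;
    }

    public static bool Evaluate(string text, IDictionary<string, bool> vars)
    {
        var parser = new BoolExpressionEvaluator(text, vars);
        bool result = parser.ParseOr();
        parser.SkipSpaces();
        if (parser._pos != text.Length)
            throw new FormatException("Unexpected input at position " + parser._pos);
        return result;
    }

    private bool ParseOr()
    {
        bool value = ParseAnd();
        while (TryConsume("||"))
        {
            bool right = ParseAnd(); // always parse the right side so the position advances
            value = value || right;
        }
        return value;
    }

    private bool ParseAnd()
    {
        bool value = ParseUnary();
        while (TryConsume("&&"))
        {
            bool right = ParseUnary();
            value = value && right;
        }
        return value;
    }

    private bool ParseUnary()
    {
        if (TryConsume("!")) return !ParseUnary();
        if (TryConsume("("))
        {
            bool inner = ParseOr();
            if (!TryConsume(")")) throw new FormatException("Missing ')' at position " + _pos);
            return inner;
        }
        SkipSpaces();
        int start = _pos;
        while (_pos < _text.Length && char.IsLetterOrDigit(_text[_pos])) _pos++;
        string name = _text.Substring(start, _pos - start);
        if (name.Length == 0) throw new FormatException("Expected a variable at position " + start);
        bool found;
        if (!_vars.TryGetValue(name, out found))
            throw new FormatException("Unknown variable '" + name + "'");
        return found;
    }

    private void SkipSpaces()
    {
        while (_pos < _text.Length && char.IsWhiteSpace(_text[_pos])) _pos++;
    }

    // Consumes the symbol if it is next in the input (after optional spaces).
    private bool TryConsume(string symbol)
    {
        SkipSpaces();
        if (_pos + symbol.Length > _text.Length) return false;
        if (string.CompareOrdinal(_text, _pos, symbol, 0, symbol.Length) != 0) return false;
        _pos += symbol.Length;
        return true;
    }
}
```

Calling Evaluate with the expression from the question and a dictionary such as A = true, B = true, C = false returns true, because the third group (A && B && !C) holds.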

C# for case in string(easy)

I have this code and need to write a for loop that checks every character in the string and verifies that all of them are valid (digits from 0 to 7). I tried something, but it didn't work. Examples: the user enters 77 and the code works; the user enters 99 and it doesn't; the user enters 5. and it doesn't; and so on.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace NALOGA1
{
class Program
{
static string decToOct(int stevilo) // in my example, 7
{
string izhod = "";
// 7 > 0 HOLDS
while (stevilo > 0)
{
// the remainder of division by 8 is converted to a string and prepended to the output
izhod = (stevilo % 8) + izhod;
// 7 / 8
stevilo /= 8;
}
return izhod;
}
static int Octtodesetisko(string stevilo)
{
double vsota = 0;
for (int i = stevilo.Length - 1; i >= 0; i--)
{
int stevka = stevilo[i] - '0';
vsota += (stevka * Math.Pow(8, i));
}
return (int)vsota;
}
static void Main(string[] args)
{
// 3rd subroutine - in progress
string prvastevilka = Console.ReadLine();
int prvasprememba = Int32.Parse(prvastevilka);
if (prvasprememba > 0)
{
Console.WriteLine(decToOct(prvasprememba));
}
else
{
Console.WriteLine("Napaka");
}
string drugastevilka = Console.ReadLine();
int drugasprememba = Octtodesetisko(drugastevilka);
foreach (char znak in drugastevilka)
{
if(znak!=1 || znak!=2 || znak!=3 || znak!=4 || znak!=5 || znak!=6 || znak!=7)
{
Console.WriteLine("Napaka");
}
else
{
Console.WriteLine("dela :D");
}
}
Console.ReadKey();
}
}
}
Personally, I would take advantage of the LINQ Enumerable.All method to express this in a very concise and readable way:
if (str.Any() && str.All(c => c >= '0' && c <= '7'))
{
Console.WriteLine("good");
}
else
{
Console.WriteLine("bad");
}
EDIT: No LINQ
It's not hard to translate what the LINQ Enumerable.All method does to a normal loop. It's just more verbose:
bool isValid = true;
foreach (char c in str)
{
if (c < '0' || c > '7')
{
isValid = false;
break;
}
}
if (str.Length != 0 && isValid)
{
Console.WriteLine("good");
}
else
{
Console.WriteLine("bad");
}
Firstly, there seems to be a mistake in the line
if(znak!=1 || znak!=2 || znak!=3 || znak!=4 || znak!=5 || znak!=6 || znak!=7)
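I guess it was meant to compare against character literals:
if(znak!='1' || znak!='2' || znak!='3' || znak!='4' || znak!='5' || znak!='6' || znak!='7')
but that condition is still always true, because any character differs from at least six of the seven literals. Testing for a valid digit instead, and swapping the two branches, gives
if (znak >= '0' && znak <= '7')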
You can use linq instead of the for loop here like this:
if (drugastevilka.All(c => c >= '0' && c <= '7'))
Console.WriteLine("dela :D");
else
Console.WriteLine("Napaka");
But the best solution is probably to use a regular expression:
Regex regex = new Regex("^[0-7]+$");
if (regex.IsMatch(drugastevilka))
Console.WriteLine("dela :D");
else
Console.WriteLine("Napaka");
Edit: the linq solution shown accepts empty strings, the regex (as shown) needs at least 1 character. Exchange the + with a * and it will accept empty strings, too. But I don't think you want to accept empty strings.
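Once the input is known to contain only octal digits, the conversion itself can be left to the framework. The sketch below combines the range check with Convert.ToInt32(string, 8); OctalInput and TryParseOctal are hypothetical names chosen for the illustration.

```csharp
using System;
using System.Linq;

public static class OctalInput
{
    // Hypothetical helper: returns true and the decimal value when the
    // input is a non-empty string of octal digits, otherwise false.
    public static bool TryParseOctal(string input, out int value)
    {
        value = 0;
        if (string.IsNullOrEmpty(input) || !input.All(c => c >= '0' && c <= '7'))
            return false;
        try
        {
            // Base-8 parsing is built in. Note that 11-digit inputs which set
            // the top bit of a 32-bit value come back as negative numbers.
            value = Convert.ToInt32(input, 8);
            return true;
        }
        catch (OverflowException)
        {
            return false; // the value does not fit in 32 bits
        }
    }
}
```

With this helper, "77" yields 63, while "99", "5." and the empty string are rejected.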
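The string version in the question already works, because adding an int to a string concatenates. If you would rather build the result numerically, each remainder has to be weighted by its decimal place; simply adding the remainders gives the digit sum instead:
static string decToOct(int stevilo) // in my example, 7
{
long izhod = 0;
long mesto = 1; // decimal place of the next octal digit
// 7 > 0 holds
while (stevilo > 0)
{
// add the remainder of division by 8 at the current decimal place
izhod += (stevilo % 8) * mesto;
mesto *= 10;
// 7 / 8
stevilo /= 8;
}
return izhod.ToString();
}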
What about something like this?
class Program
{
static void Main(string[] args)
{
string someString = "1234567";
string someOtherString = "1287631";
string anotherString = "123A6F2";
Console.WriteLine(IsValidString(someString));
Console.WriteLine(IsValidString(someOtherString));
Console.WriteLine(IsValidString(anotherString));
Console.ReadLine();
}
public static bool IsValidString(string str)
{
bool isValid = true;
char[] splitString = str.ToCharArray(); //get an array of each character
for (int i = 0; i < splitString.Length; i++)
{
double number = Char.GetNumericValue(splitString[i]); //returns the character's numeric value as a double, or -1 if the character is not numeric
if (number < 0 || number > 7) //rejects non-numeric characters (-1) and digits above 7
{
isValid = false; //if the character is invalid, we're done.
break;
}
}
return isValid;
}
}
IsValidString() takes a string, converts it to a Char array, then checks each value as such:
Get the numeric value
Check if the value is between 0-7
GetNumericValue does not throw on a non-numeric character; it returns -1, so the number < 0 test sets isValid = false and we break. Note that it also accepts numeric characters outside ASCII, such as digits from other scripts.
If we get a valid number, and it's not between 0-7 we also know that isValid = false, so we break.
If we make it all the way through the list, the string is valid.
The sample given above returns:
IsValidString(someString) == true
IsValidString(someOtherString) == false
IsValidString(anotherString) == false

How to check for special characters in a string using a loop

Since my prof won't let me use RegEx, I'm stuck with using loops to check each character in a string.
Does anyone have a sample code/algorithm?
public void setAddress(string strAddress)
{
do
{
foreach (char c in Name)
{
if ( /*check for characters*/ == false)
{
Address = strAddress;
}
}
if ( /*check for characters*/ == true)
{
Console.Write("Invalid!");
}
} while ( /*check for characters*/ == true)
}
public int getAddress()
{
return Address;
}
I need to only include letters and numbers. Characters such as !@#$%^& are not allowed.
I'm not allowed to use RegEx because he hasn't taught that to us yet... well I couldn't attend class on the day he taught these loops and character checking, so now he won't tell me more. ANYWAY, if there's a more efficient way without using RegEx, that'd be helpful.
string s = @"$KUH% I*$)OFNlkfn$";
var withoutSpecial = new string(s.Where(c => Char.IsLetterOrDigit(c)
|| Char.IsWhiteSpace(c)).ToArray());
if (s != withoutSpecial)
{
Console.WriteLine("String contains special chars");
}
You can do it without loops at all :)
Source: https://stackoverflow.com/a/4503614/1714342
EDIT:
if (s.Any(c => !Char.IsLetterOrDigit(c) && !Char.IsWhiteSpace(c)))
{
Console.WriteLine("String contains special chars");
}
You may not need loops at all, just character checking will do:
if (Name.IndexOfAny("!@#$%^&*()".ToCharArray()) == -1)
Console.WriteLine("Valid Address");
else
Console.WriteLine("Invalid Address");
See http://msdn.microsoft.com/en-us/library/system.string.indexofany.aspx
A solution with explicit loop could be
String s = @"$KUH% I*$)OFNlkfn$";
foreach (Char c in s)
if (!(Char.IsLetterOrDigit(c) || Char.IsWhiteSpace(c))) {
Console.WriteLine("String contains special chars");
break;
}
bool check_for_characters(char c)
{
return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
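To connect the per-character check above with the loop the question asks for, here is a minimal sketch that applies it to a whole string. It assumes that only ASCII letters and digits are allowed and that an empty string is invalid; AddressValidator and its method names are hypothetical.

```csharp
using System;

public static class AddressValidator
{
    // Same test as check_for_characters above: ASCII letters and digits only.
    public static bool IsLetterOrDigitAscii(char c)
    {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // Returns true only when every character passes the check.
    public static bool IsValid(string text)
    {
        if (string.IsNullOrEmpty(text)) return false; // assumption: empty input is invalid
        foreach (char c in text)
        {
            if (!IsLetterOrDigitAscii(c)) return false; // stop at the first bad character
        }
        return true;
    }
}
```

A setter such as setAddress could call IsValid(strAddress) once and assign the value only when it returns true, which avoids the nested do/while in the question.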

WebUtility.HtmlDecode vs HttpUtility.HtmlDecode

I was using WebUtility.HtmlDecode to decode HTML. It turns out that it doesn't decode properly: for example, – is supposed to decode to a "–" character, but WebUtility.HtmlDecode does not decode it. HttpUtility.HtmlDecode, however, does.
Debug.WriteLine(WebUtility.HtmlDecode("–"));
Debug.WriteLine(HttpUtility.HtmlDecode("–"));
> –
> –
The documentation for both of these is the same:
Converts a string that has been HTML-encoded for HTTP transmission into a decoded string.
Why are they different, which one should I be using, and what will change if I switch to HttpUtility.HtmlDecode to get "–" to decode correctly?
The implementations of the two methods are indeed different on Windows Phone.
WebUtility.HtmlDecode:
public static void HtmlDecode(string value, TextWriter output)
{
if (value != null)
{
if (output == null)
{
throw new ArgumentNullException("output");
}
if (!StringRequiresHtmlDecoding(value))
{
output.Write(value);
}
else
{
int length = value.Length;
for (int i = 0; i < length; i++)
{
bool flag;
uint num4;
char ch = value[i];
if (ch != '&')
{
goto Label_01B6;
}
int num3 = value.IndexOfAny(_htmlEntityEndingChars, i + 1);
if ((num3 <= 0) || (value[num3] != ';'))
{
goto Label_01B6;
}
string entity = value.Substring(i + 1, (num3 - i) - 1);
if ((entity.Length <= 1) || (entity[0] != '#'))
{
goto Label_0188;
}
if ((entity[1] == 'x') || (entity[1] == 'X'))
{
flag = uint.TryParse(entity.Substring(2), NumberStyles.AllowHexSpecifier, NumberFormatInfo.InvariantInfo, out num4);
}
else
{
flag = uint.TryParse(entity.Substring(1), NumberStyles.Integer, NumberFormatInfo.InvariantInfo, out num4);
}
if (flag)
{
switch (_htmlDecodeConformance)
{
case UnicodeDecodingConformance.Strict:
flag = (num4 < 0xd800) || ((0xdfff < num4) && (num4 <= 0x10ffff));
goto Label_0151;
case UnicodeDecodingConformance.Compat:
flag = (0 < num4) && (num4 <= 0xffff);
goto Label_0151;
case UnicodeDecodingConformance.Loose:
flag = num4 <= 0x10ffff;
goto Label_0151;
}
flag = false;
}
Label_0151:
if (!flag)
{
goto Label_01B6;
}
if (num4 <= 0xffff)
{
output.Write((char) num4);
}
else
{
char ch2;
char ch3;
ConvertSmpToUtf16(num4, out ch2, out ch3);
output.Write(ch2);
output.Write(ch3);
}
i = num3;
goto Label_01BD;
Label_0188:
i = num3;
char ch4 = HtmlEntities.Lookup(entity);
if (ch4 != '\0')
{
ch = ch4;
}
else
{
output.Write('&');
output.Write(entity);
output.Write(';');
goto Label_01BD;
}
Label_01B6:
output.Write(ch);
Label_01BD:;
}
}
}
}
HttpUtility.HtmlDecode:
public static string HtmlDecode(string html)
{
if (html == null)
{
return null;
}
if (html.IndexOf('&') < 0)
{
return html;
}
StringBuilder sb = new StringBuilder();
StringWriter writer = new StringWriter(sb, CultureInfo.InvariantCulture);
int length = html.Length;
for (int i = 0; i < length; i++)
{
char ch = html[i];
if (ch == '&')
{
int num3 = html.IndexOfAny(s_entityEndingChars, i + 1);
if ((num3 > 0) && (html[num3] == ';'))
{
string entity = html.Substring(i + 1, (num3 - i) - 1);
if ((entity.Length > 1) && (entity[0] == '#'))
{
try
{
if ((entity[1] == 'x') || (entity[1] == 'X'))
{
ch = (char) int.Parse(entity.Substring(2), NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
}
else
{
ch = (char) int.Parse(entity.Substring(1), CultureInfo.InvariantCulture);
}
i = num3;
}
catch (FormatException)
{
i++;
}
catch (ArgumentException)
{
i++;
}
}
else
{
i = num3;
char ch2 = HtmlEntities.Lookup(entity);
if (ch2 != '\0')
{
ch = ch2;
}
else
{
writer.Write('&');
writer.Write(entity);
writer.Write(';');
continue;
}
}
}
}
writer.Write(ch);
}
return sb.ToString();
}
Interestingly, WebUtility doesn't exist on WP7. Also, the WP8 implementation of WebUtility is identical to the desktop one. The desktop implementation of HttpUtility.HtmlDecode is just a wrapper around WebUtility.HtmlDecode. Last but not least, Silverlight 5 has the same implementation of HttpUtility.HtmlDecode as Windows Phone, and does not implement WebUtility.
From there, I can venture a guess: since the Windows Phone 7 runtime is based on Silverlight, WP7 inherited the Silverlight version of HttpUtility.HtmlDecode, and WebUtility wasn't present. Then came WP8, whose runtime is based on WinRT. WinRT brought WebUtility, and the old version of HttpUtility.HtmlDecode was kept to ensure compatibility with legacy WP7 apps.
As for which one you should use: if you want to target WP7, you have no choice but to use HttpUtility.HtmlDecode. If you're targeting WP8, just pick the method whose behavior best suits your needs. WebUtility is probably the future-proof choice, in case Microsoft decides to ditch the Silverlight runtime in an upcoming version of Windows Phone. But I'd go with the practical choice of picking HttpUtility, so you don't have to worry about manually supporting the example you've put in your question.
The methods do exactly the same thing. Moreover, if you decompile them, the implementations look as if one was copied from the other.
The difference is only intended use. HttpUtility is contained in the System.Web assembly and is expected to be used in ASP.net applications which are built over this assembly. WebUtility is contained in the System assembly referenced by nearly all applications and is provided for more general purpose or client use.
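Since the behavior depends on the runtime, it can help to check it directly on the platform you target. The sketch below is a small probe around WebUtility.HtmlDecode. It assumes the entity in question is an en-dash reference such as &#8211; or &ndash; (an assumption, since the question's original entity is not recoverable here), and HtmlDecodeProbe is a hypothetical name.

```csharp
using System;
using System.Net;

public static class HtmlDecodeProbe
{
    // Returns true when WebUtility.HtmlDecode turns the encoded text
    // into exactly the expected string on the current runtime.
    public static bool DecodesTo(string encoded, string expected)
    {
        return string.Equals(WebUtility.HtmlDecode(encoded), expected, StringComparison.Ordinal);
    }
}
```

On desktop .NET and .NET Core, DecodesTo("&#8211;", "\u2013") and DecodesTo("&ndash;", "\u2013") both return true; results on Windows Phone may differ, as the first answer describes.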
A note for others who find this through search: use either function mentioned in the question, but never use Windows.Data.Html.HtmlUtilities.ConvertToText(string input). It's 70 times slower than WebUtility.HtmlDecode and produces crashes! The crash shows up as mshtml!IEPeekMessage in the DevCenter. It looks like this function calls Internet Explorer to convert the string. Just avoid it.
