Getting an unexpected "?" at the end of a Registry GetValue - c#

I use the Registry class to manage values in the Registry on Windows 7 in C#:
Registry.GetValue(...);
But I'm facing a curious behavior:
The returned value is always the correct one, but sometimes it is followed by an unexpected "?".
When I check the Registry (regedit), the "?" doesn't exist.
I really don't understand where this question mark comes from.
Info :
C#
3.5 framework
Windows 7 64-bit (and I want my application to work on both 32-bit and 64-bit systems)

That didn't quite work for me. I was also getting a random ? at the end of a registry value that held a file path. It only appeared every now and again, which looks like a bug.
I use a two-pass method: first check whether the directory exists and, if it doesn't, strip out the suspect characters and check again. I was getting Unicode character 1792 at the end. Note that the c < 128 test probably won't work for some languages.
string configPath = val.ToString();
bool dirExists = false;
if (Directory.Exists(configPath))
{
dirExists = true;
}
else
{
_logger.Warn("The path for service {0} doesn't exist: {1}", serviceName, configPath);
StringBuilder configPathBuilder = new StringBuilder(configPath.Length);
// Do this to remove any dodgy characters in the path like a ? at end
char[] inValidChars = Path.GetInvalidPathChars();
foreach (Char c in configPath.ToCharArray())
{
if (inValidChars.Contains(c) == false && c < 128)
{
configPathBuilder.Append(c);
}
else
{
_logger.Warn("An invalid path character was found in the path: {0} {1}", c, (int)c);
}
}
configPath = configPathBuilder.ToString();
if (Directory.Exists(configPath))
{
dirExists = true;
}
}

So my question is, "who set the value"?
Perhaps whoever did the setting put in an unprintable character at the end of the string. It is probably not actually a question mark. This may be a result of a bug in the program which did the setting, not anything to do with your code, per se.
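One way to check what the trailing character really is would be to print the code point and Unicode category of every character in the returned value. The following is only a sketch: the class name, the method name and the sample value (a path followed by U+0700, i.e. character 1792) are made up for illustration and do not come from the original posts.

```csharp
using System;
using System.Text;

public static class RegistryValueInspector
{
    // Describes every character as "U+XXXX(Category)", separated by spaces,
    // so an invisible or unprintable trailing character becomes visible.
    public static string DescribeChars(string value)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in value)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.AppendFormat("U+{0:X4}({1})", (int)c, char.GetUnicodeCategory(c));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Hypothetical value standing in for the result of Registry.GetValue(...).
        string value = "C:\\Temp\u0700";
        Console.WriteLine(DescribeChars(value));
    }
}
```

A console or log viewer that cannot render the character will typically show it as "?", which is why dumping the numeric code point is more reliable than looking at the printed string.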

I found a way to remove the unexpected char, thanks to all your comments ;)
String value = null;
try
{
foreach (Char item in Registry.GetValue(registryKey, key, "").ToString().ToCharArray())
{
if (Char.GetUnicodeCategory(item) != System.Globalization.UnicodeCategory.OtherLetter && Char.GetUnicodeCategory(item) != System.Globalization.UnicodeCategory.OtherNotAssigned)
{
value += item;
}
}
}
catch (Exception ex)
{
LOG.Error("Unable to get value of " + key + ex, ex);
}
return value;
I ran some tests to find out what kind of char appears from time to time. It was, just like you said, Larry, a Unicode problem.
I still don't understand why this char appears only sometimes.

foreach(string token in text.Split(";")) scans only for one value

I am trying to get better at C#, so I thought I'd make a simple programming language that can run basic commands like println("Hi"); and print("Hi");
However, the code below only handles the first value in the foreach loop. My C# code:
string text = File.ReadAllText(inputFile);
string[] fileText = text.Split(";");
foreach (string token in fileText) {
if (token.StartsWith("println(") && token.EndsWith(")")) {
if (token.Split("\"").Length == 3) {
Console.WriteLine(token.Split("\"")[1]);
} else {
throw new Exception($"println(string); takes exactly 3 arguments of which {token.Split("\"").Length} were given.");
}
} else if (token.StartsWith("println(")) {
throw new Exception("Syntax error");
} else if (token.StartsWith("print(") && token.EndsWith(")")) {
if (token.Split("\"").Length == 3) {
Console.Write(token.Split("\"")[1]);
} else {
throw new Exception(($"print(string); takes exactly 3 arguments of which {token.Split("\"").Length} were given."));
}
} else if (token.StartsWith("print(")) {
throw new Exception("Syntax error");
}
}
My testing file:
print("This ");
println("is a test.");
I only get This as output.
You have stated in a comment (now in the question proper) that text is populated with:
string text = File.ReadAllText(inputFile);
That means a split based on semi-colons will give you the following strings (though the third depends on what comes after the final semi-colon):
print("This ")
<newline>println("is a test.")
<possibly-newline-but-irrelevant-for-this-question>
In other words, that second one does not start with println(, which is why your code is not picking it up. You may want to rethink how you're handling that, especially if you want to handle indented code as well.
I'd (at least initially) opt for stripping all white-space from the beginning and end of token before doing any comparisons.
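The trimming suggested above could look roughly like the sketch below. The class and method names are made up for illustration, and it keeps the question's approach of splitting on ';', which would still break on a semicolon inside a string literal.

```csharp
using System;
using System.Collections.Generic;

public static class TokenSplitter
{
    // Splits on ';', trims whitespace (including line breaks) from each piece,
    // and drops pieces that are empty after trimming.
    public static List<string> SplitStatements(string text)
    {
        List<string> tokens = new List<string>();
        foreach (string raw in text.Split(';'))
        {
            string token = raw.Trim();
            if (token.Length > 0)
            {
                tokens.Add(token);
            }
        }
        return tokens;
    }

    public static void Main()
    {
        // Same content as the test file in the question, with Windows line endings.
        string text = "print(\"This \");\r\nprintln(\"is a test.\");\r\n";
        foreach (string token in SplitStatements(text))
        {
            Console.WriteLine("[" + token + "]");
        }
    }
}
```

With the line break removed, the second token starts with println( and the existing StartsWith checks can match it.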
What @paxdiablo said, but the following will also work for the input shown:
var lines = File.ReadAllLines(inputFile); // ReadAllLines, not ReadAllText
foreach (string line in lines) {
var token = line.TrimEnd(new []{';'});
// your code here
}

How to compare whether two strings are identical?

So I'm doing this year's Advent of Code and I'm stuck on the second day, part 2.
You are given inputs which look like this:
"1-3 c: caaasa"
You have to count how many passwords are valid according to the policy. In the example above, the letter c has to be in position 1 OR 3 in the string caaasa. If it is, the password is valid.
I've broken that string down into sections, and now I'm trying to compare the string "znak", which contains the given letter, to the characters at positions zakresmin and zakresmax in the string "passdiv".
Yet every time it returns False, so the password count never increases.
I tried using Equals() and CompareTo(), but they don't seem to work.
How can I modify my code so it returns the proper values?
var iloschasel = 0;
using (StreamReader sr = new StreamReader(@"C:\Users\Wurf\Desktop\text.txt"))
{
string line;
while ((line = sr.ReadLine()) != null)
{
string[] linia = line.Split(" ");
string zakres = linia[0];
string[] zakresy = zakres.Split("-");
int zakresmin = Convert.ToInt32(zakresy[0]);
int zakresmax = Convert.ToInt32(zakresy[1]);
string znak = (linia[1].Replace(":", "")).Trim();
var suma = Regex.Matches(linia[2], znak);
string passdiv = linia[2];
if(passdiv[zakresmin].Equals(znak) || passdiv[zakresmax - 1].Equals(znak))
{
iloschasel += 1;
}
}
}
Console.WriteLine(iloschasel);
As mentioned, when you call Equals on two different types, the result depends on how those types implement it. In this case you lose: a string and a char are never equal, by value or by reference.
I believe the compiler or ReSharper would warn you that no type derives from both string and char.
However, I was bored enough to give an alternate solution:
public static bool IsValid(string input)
{
var match = Regex.Match(input, @"(\d+)-(\d+) (\S): (.*)"); // \d+ so multi-digit positions such as 10-12 also match
if(!match.Success)
throw new ArgumentException( $"Invalid format : {input}",nameof(input));
var first = int.Parse(match.Groups[1].Value);
var second = int.Parse(match.Groups[2].Value);
var c = char.Parse(match.Groups[3].Value);
var password = match.Groups[4].Value;
// Exactly one of the two positions must hold the character (XOR).
return (password[first-1] == c) != (password[second-1] == c);
}
Test
Console.WriteLine($"Is Valid = {IsValid("1-3 c: caaasa")}");
Console.WriteLine($"Is Valid = {IsValid("1-3 c: cacaasa")}");
Output
Is Valid = True
Is Valid = False
Note : this is not meant to be a complete bullet-proof solution. Just a novel elegant way to solve your problem
Your problem is that you are comparing a string to a char
var match = "c" == 'c';
Will give a compile error because they are different data types
var match = "c".Equals('c');
will let you compile, but will always return false because a char will never equal a string. You have to turn the char into a string or vice versa for the check to work:
var match = "c"[0] == 'c';
So in your if statement, if you fix the check to compare strings with strings or chars with chars, you should get some positive results. Also fix your indexing issue: decide whether zakresmin and zakresmax are 0-based or 1-based indexes.
Also as a side note, it can be helpful to step through your code line by line in debug mode, to find out which line isn't behaving like you expect it to. In your case debugging would have helped you zero in on the if statement as a starting point to fixing things.
So it turns out (if I understand that correctly) that a compared element of the string passdiv was a char which I tried to compare to znak which was a string. I added ToString() to my code and it works well. Also fixed the range of zakresmin by subtracting 1 so it works properly.
if((passdiv[zakresmin - 1].ToString() == znak && passdiv[zakresmax - 1].ToString() != znak) || (passdiv[zakresmin - 1].ToString() != znak && passdiv[zakresmax - 1].ToString() == znak))
{
iloschasel += 1;
}

False positive using string.length in if statement

I'm trying to format phone numbers. Perhaps my approach is not the best but it works with the exception of some unexpected behavior. I'm using string.length in an if statement to see if the phone number's length (stored as a string) is greater than 9. I've also tried >= 10 instead of > 9 with the same results. All works fine with 18001234567 or 7041234567. I get (800) 123-4567 or (704) 123-4567. But with 828464047 I get (82) 846-4047 rather than the number just being returned as is.
try
{
if (ANI.Length > 9)
{
char[] Number1 = { '1' };
ANI = ANI.TrimStart(Number1);
return String.Format("{0:(###) ###-####}", Convert.ToDouble(ANI));
}
else if (ANI == "")
{
return "Private";
}
else
{
return ANI;
}
}
catch (Exception ex)
{
return ex.Message;
}
Any ideas? Is there a better way to approach this?
Thanks.
If I change the code that formats the phone number to use substrings, things break, as expected.
return "(" + ANI.Substring(0, 3) + ") " + ANI.Substring(3, 3) + "-" + ANI.Substring(6, 4);
An exception is caught and "Index and length must refer to a location within the string. Parameter name: length" is returned.
I put it into a unit test method and it works. You're obviously getting an extra character added onto the string 828464047. You can debug and place a breakpoint at the IF statement and see what is actually in ANI.
A few things as well,
Don't name a variable something ambiguous like "ANI".
rename Number1 to something like "firstNumber"
A try/Catch is not needed for this statement, if you're getting an exception you're doing something that can be solved by better coding.
I can see ANI.TrimStart() in your code which leads me to suspect that you have some leading whitespace. You can probably best solve the problem by moving the trimming to outside the if.
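A sketch of that reordering, with the trimming done before the length check, might look like this. The class and method names are hypothetical, and treating null like an empty string is an assumption that the original code does not make.

```csharp
using System;

public static class PhoneFormatter
{
    // Trims whitespace and any leading '1' first, then checks the length,
    // so a 9-digit number with a leading '1' is no longer formatted by mistake.
    public static string Format(string ani)
    {
        if (string.IsNullOrEmpty(ani))
        {
            return "Private"; // assumption: null is handled like ""
        }

        string digits = ani.Trim().TrimStart('1');
        if (digits.Length != 10)
        {
            return ani; // not a 10-digit number: return the input unchanged
        }

        return "(" + digits.Substring(0, 3) + ") "
             + digits.Substring(3, 3) + "-"
             + digits.Substring(6, 4);
    }

    public static void Main()
    {
        Console.WriteLine(Format("18001234567")); // (800) 123-4567
        Console.WriteLine(Format("7041234567"));  // (704) 123-4567
        Console.WriteLine(Format("1828464047"));  // 1828464047
    }
}
```

The sketch does not verify that the remaining characters are digits; that check would have to be added if the input can contain anything else.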
It's pretty safe to assume that something as fundamental as String.Length works correctly. When it says your string is a certain length, your string really will be that length.
I'd check your inputs for whitespace or, perhaps you transcribed your input wrong here. The following tests pass against your code, copied and pasted:
[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Ten_Digit_800_Number()
{
var myPad = new NumberFormatter();
Assert.AreEqual<string>("(800) 123-4567", myPad.FormatNumber("18001234567"));
}
[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void Ten_Digit_800_Number_With_Trailing_Space()
{
var myPad = new NumberFormatter();
Assert.AreEqual<string>("(800) 123-4567", myPad.FormatNumber("18001234567 "));
}
[TestMethod, Owner("ebd"), TestCategory("Proven"), TestCategory("Unit")]
public void TroubleString()
{
var myPad = new NumberFormatter();
Assert.AreEqual<string>("828464047", myPad.FormatNumber("828464047"));
}
The problem was stripping the leading '1' after having evaluated the length of the string. Stripping the '1' bef

How determine if a string has been encoded programmatically in C#?

Take this string, for example:
&lt;p&gt;test&lt;/p&gt;
I would like my logic to recognize that this value has been encoded.
Any ideas? Thanks
You can use HttpUtility.HtmlDecode() to decode the string, then compare the result with the original string. If they're different, the original string was probably encoded (at least, the routine found something to decode inside):
public bool IsHtmlEncoded(string text)
{
return (HttpUtility.HtmlDecode(text) != text);
}
Strictly speaking that's not possible. What the string contains might actually be the intended text, and the encoded version of that would be &amp;lt;p&amp;gt;test&amp;lt;/p&amp;gt;.
You could look for HTML entities in the string, and decode it until there are no left, but it's risky to decode data that way, as it's assuming things that might not be true.
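The ambiguity can be seen in a small sketch. It uses System.Net.WebUtility.HtmlDecode (available from .NET 4.0) rather than HttpUtility so that it runs without a reference to System.Web; the class name, method name and sample strings are made up for illustration.

```csharp
using System;
using System.Net;

public static class EncodingAmbiguity
{
    // Heuristic only: reports true when decoding changes the text.
    public static bool LooksEncoded(string text)
    {
        return WebUtility.HtmlDecode(text) != text;
    }

    public static void Main()
    {
        // Encoded markup: decoding changes it, so the heuristic says "encoded".
        Console.WriteLine(LooksEncoded("&lt;p&gt;test&lt;/p&gt;"));

        // A sentence from an HTML tutorial that is meant literally is flagged too.
        Console.WriteLine(LooksEncoded("Write &amp; to get an ampersand"));

        // Text that still needs encoding contains no entity, so nothing changes.
        Console.WriteLine(LooksEncoded("R&D <b>"));
    }
}
```

The second call is the risky case: the heuristic cannot tell text that merely mentions an entity from text that was encoded, so decoding it would change what the author wrote.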
this is my take on it... if the user passes in partially encoded text, this'll catch it.
private bool EncodeText(string val)
{
string decodedText = HttpUtility.HtmlDecode(val);
string encodedText = HttpUtility.HtmlEncode(decodedText);
return encodedText.Equals(val, StringComparison.OrdinalIgnoreCase);
}
I use the NeedsEncoding() method below to determine whether a string needs encoding.
Results
-----------------------------------------------------
b --> NeedsEncoding = True
<b> --> NeedsEncoding = True
<b> --> NeedsEncoding = True
&lt;b&lt; --> NeedsEncoding = False
&quot; --> NeedsEncoding = False
Here are the helper methods, I split it into two methods for clarity. Like Guffa says it is risky and hard to produce a bullet proof method.
public static bool IsEncoded(string text)
{
// below fixes false positive <<>
// you could add a complete blacklist,
// but these are the ones that cause HTML injection issues
if (text.Contains("<")) return false;
if (text.Contains(">")) return false;
if (text.Contains("\"")) return false;
if (text.Contains("'")) return false;
if (text.Contains("script")) return false;
// if the decoded string differs from the original, the text was already encoded
return (System.Web.HttpUtility.HtmlDecode(text) != text);
}
public static bool NeedsEncoding(string text)
{
return !IsEncoded(text);
}
A simple way of detecting this would be to check for characters that are not allowed in an encoded string, such as < and >.
All I can suggest is that you replace known encoded sections with the decoded string.
replace("&lt;", "<")
I'm doing .NET Core 2.0 development and I'm using System.Net.WebUtility.HtmlDecode, but I have a situation where strings being processed in a microservice might have an indeterminate number of encodings performed on some strings. So I put together a little recursive method to handle this:
public string HtmlDecodeText(string value, int decodingCount = 0)
{
// If decoded text equals the original text, then we know decoding is done;
// Don't go past 4 levels of decoding to prevent possible stack overflow,
// and because we don't have a valid use case for that level of multi-decoding.
if (decodingCount < 0)
{
decodingCount = 1;
}
if (decodingCount >= 4)
{
return value;
}
var decodedText = WebUtility.HtmlDecode(value);
if (decodedText.Equals(value, StringComparison.OrdinalIgnoreCase))
{
return value;
}
return HtmlDecodeText(decodedText, ++decodingCount);
}
And here I called the method on each item in a list where strings were encoded:
result.FavoritesData.folderMap.ToList().ForEach(x => x.Name = HtmlDecodeText(x.Name));
Try this answer: Determine a string's encoding in C#
Another code project might be of help..
http://www.codeproject.com/KB/recipes/DetectEncoding.aspx
You could also use regex to match on the string content...

Work-around for C# CodeDom causing stack-overflow (CS1647) in csc.exe?

I've got a situation where I need to generate a class with a large string const. Code outside of my control causes my generated CodeDom tree to be emitted to C# source and then later compiled as part of a larger Assembly.
Unfortunately, I've run into a situation whereby if the length of this string exceeds 335440 chars in Win2K8 x64 (926240 in Win2K3 x86), the C# compiler exits with a fatal error:
fatal error CS1647: An expression is too long or complex to compile near 'int'
MSDN says CS1647 is "a stack overflow in the compiler" (no pun intended!). Looking more closely, I've determined that the CodeDom "nicely" wraps my string const at 80 chars. This causes the compiler to concatenate over 4193 string chunks, which apparently is the stack depth of the C# compiler in x64 NetFx. CSC.exe must internally recursively evaluate this expression to "rehydrate" my single string.
My initial question is this: "does anyone know of a work-around to change how the code generator emits strings?" I cannot control the fact that the external system uses C# source as an intermediate and I want this to be a constant (rather than a runtime concatenation of strings).
Alternatively, how can I formulate this expression such that after a certain number of chars, I am still able to create a constant but it is composed of multiple large chunks?
Full repro is here:
// this string breaks CSC: 335440 is Win2K8 x64 max, 926240 is Win2K3 x86 max
string HugeString = new String('X', 926300);
CodeDomProvider provider = CodeDomProvider.CreateProvider("C#");
CodeCompileUnit code = new CodeCompileUnit();
// namespace Foo {}
CodeNamespace ns = new CodeNamespace("Foo");
code.Namespaces.Add(ns);
// public class Bar {}
CodeTypeDeclaration type = new CodeTypeDeclaration();
type.IsClass = true;
type.Name = "Bar";
type.Attributes = MemberAttributes.Public;
ns.Types.Add(type);
// public const string HugeString = "XXXX...";
CodeMemberField field = new CodeMemberField();
field.Name = "HugeString";
field.Type = new CodeTypeReference(typeof(String));
field.Attributes = MemberAttributes.Public|MemberAttributes.Const;
field.InitExpression = new CodePrimitiveExpression(HugeString);
type.Members.Add(field);
// generate class file
using (TextWriter writer = File.CreateText("FooBar.cs"))
{
provider.GenerateCodeFromCompileUnit(code, writer, new CodeGeneratorOptions());
}
// compile class file
CompilerResults results = provider.CompileAssemblyFromFile(new CompilerParameters(), "FooBar.cs");
// output results
foreach (string msg in results.Output)
{
Console.WriteLine(msg);
}
// output errors
foreach (CompilerError error in results.Errors)
{
Console.WriteLine(error);
}
Using a CodeSnippetExpression and a manually quoted string, I was able to emit the source that I would have liked to have seen from Microsoft.CSharp.CSharpCodeGenerator.
So to answer the question above, replace this line:
field.InitExpression = new CodePrimitiveExpression(HugeString);
with this:
field.InitExpression = new CodeSnippetExpression(QuoteSnippetStringCStyle(HugeString));
And finally modify the private string quoting Microsoft.CSharp.CSharpCodeGenerator.QuoteSnippetStringCStyle method to not wrap after 80 chars:
private static string QuoteSnippetStringCStyle(string value)
{
// CS1647: An expression is too long or complex to compile near '...'
// happens if number of line wraps is too many (335440 is max for x64, 926240 is max for x86)
// CS1034: Compiler limit exceeded: Line cannot exceed 16777214 characters
// theoretically every character could be escaped unicode (6 chars), plus quotes, etc.
const int LineWrapWidth = (16777214/6) - 4;
StringBuilder b = new StringBuilder(value.Length+5);
b.Append("\r\n\"");
for (int i=0; i<value.Length; i++)
{
switch (value[i])
{
case '\u2028':
case '\u2029':
{
int ch = (int)value[i];
b.Append(@"\u");
b.Append(ch.ToString("X4", CultureInfo.InvariantCulture));
break;
}
case '\\':
{
b.Append(@"\\");
break;
}
case '\'':
{
b.Append(@"\'");
break;
}
case '\t':
{
b.Append(@"\t");
break;
}
case '\n':
{
b.Append(@"\n");
break;
}
case '\r':
{
b.Append(@"\r");
break;
}
case '"':
{
b.Append("\\\"");
break;
}
case '\0':
{
b.Append(@"\0");
break;
}
default:
{
b.Append(value[i]);
break;
}
}
if ((i > 0) && ((i % LineWrapWidth) == 0))
{
if ((Char.IsHighSurrogate(value[i]) && (i < (value.Length - 1))) && Char.IsLowSurrogate(value[i + 1]))
{
b.Append(value[++i]);
}
b.Append("\"+\r\n");
b.Append('"');
}
}
b.Append("\"");
return b.ToString();
}
So am I right in saying you've got the C# source file with something like:
public const string HugeString = "xxxxxxxxxxxx...." +
"yyyyy....." +
"zzzzz.....";
and you then try to compile it?
If so, I would try to edit the text file (in code, of course) before compiling. That should be relatively straightforward to do, as presumably they'll follow a rigidly-defined pattern (compared with human-generated source code). Convert it to have a single massive line for each constant. Let me know if you'd like some sample code to try this.
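A minimal sketch of that post-processing step follows. It assumes the generated file wraps the constant exactly as closing quote, '+', line break, indentation, opening quote, and that the literals are regular C-style strings rather than verbatim ones; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralJoiner
{
    // Matches: closing quote, optional blanks, '+', optional blanks,
    // a line break, indentation, and the opening quote of the next chunk.
    private static readonly Regex WrappedConcat =
        new Regex("\"[ \\t]*\\+[ \\t]*\\r?\\n[ \\t]*\"");

    // Removes every wrap point so adjacent chunks merge into one literal.
    public static string JoinWrappedLiterals(string source)
    {
        return WrappedConcat.Replace(source, string.Empty);
    }

    public static void Main()
    {
        string generated =
            "public const string HugeString = \"xxx\" +\r\n" +
            "        \"yyy\" +\r\n" +
            "        \"zzz\";";
        Console.WriteLine(JoinWrappedLiterals(generated));
    }
}
```

The joined literal ends up on a single line, so it remains subject to the compiler's line-length limit (CS1034) mentioned in the accepted workaround.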
By the way, your repro succeeds with no errors on my box - which version of the framework are you using? (My box has the beta of 4.0 on, which may affect things.)
EDIT: How about changing it to not be a string constant? You'd need to break it up yourself, and emit it as a public static readonly field like this:
public static readonly string HugeString = "xxxxxxxxxxxxxxxx" + string.Empty +
"yyyyyyyyyyyyyyyyyyy" + string.Empty +
"zzzzzzzzzzzzzzzzzzz";
Crucially, string.Empty is a public static readonly field, not a constant. That means the C# compiler will just emit a call to string.Concat which may well be okay. It'll only happen once at execution time of course - slower than doing it at compile-time, but it may be an easier workaround than anything else.
Note that if you declare the string as const, it will be copied in each assembly that uses this string in its code.
You may be better off with static readonly.
Another way would be to declare a readonly property that returns the string.
I have no idea how to change the behavior of the code generator, but you can change the stack size that the compiler uses with the /stack option of EditBin.EXE.
Example:
editbin /stack:100000,1000 csc.exe <options>
Following is an example of its use:
class App
{
private static long _Depth = 0;
// recursive function to blow stack
private static void GoDeep()
{
if ((++_Depth % 10000) == 0) System.Console.WriteLine("Depth is " +
_Depth.ToString());
GoDeep();
return;
}
public static void Main() {
try
{
GoDeep();
}
finally
{
}
return;
}
}
editbin /stack:100000,1000 q.exe
Depth is 10000
Depth is 20000
Unhandled Exception: StackOverflowException.
editbin /stack:1000000,1000 q.exe
Depth is 10000
Depth is 20000
Depth is 30000
Depth is 40000
Depth is 50000
Depth is 60000
Depth is 70000
Depth is 80000
Unhandled Exception: StackOverflowException.
Make sure the application pools in IIS have 32-bit applications enabled. That's all it took for me to cure this problem trying to compile a 32-bit app in Win7 64-bit. Oddly (or not), Microsoft could not supply this answer. After a full day of searching, I found this link to the fix on an Iron Speed Designer forum:
http://darrell.mozingo.net/2009/01/17/running-iis-7-in-32-bit-mode/
