I have a C#/.NET library, which works fine in my own environment, but not in a customer's for some reason. In this specific case, .NET Framework 4.8 is used, but they have tried .NET 6 as well with the same results.
I am converting a double with value 0.25 to a string like this:
doubleValue.ToString("E16", CultureInfo.InvariantCulture);
In my environment, I get the expected string
2.5000000000000000E-001
with a hyphen-minus sign (U+002D) in the scientific notation. As seen in the code, I am using InvariantCulture in order to avoid any confusion regarding decimal point signs and minus signs.
In the customer's environment, with the same code they get the string
2.5000000000000000E−001
with a mathematical minus sign (U+2212) in the scientific notation.
We are both running Windows, with the en-SV culture active. I am printing out the details of InvariantCulture and CurrentCulture in a test program, and in both environments, the negative sign for both cultures is hyphen-minus. Not that current culture should affect anything, since I'm explicitly using the InvariantCulture for the conversion.
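A test program of this kind could look roughly like the following. This is only a minimal sketch: the class name MinusSignProbe and the helper DescribeCodePoints are hypothetical names chosen for illustration, not part of the library in question.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class MinusSignProbe
{
    // Hypothetical helper: renders every UTF-16 code unit of a string as U+XXXX,
    // so that U+002D (hyphen-minus) and U+2212 (minus sign) can be told apart.
    public static string DescribeCodePoints(string s) =>
        string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4", CultureInfo.InvariantCulture)));

    public static void Main()
    {
        NumberFormatInfo invariant = CultureInfo.InvariantCulture.NumberFormat;
        NumberFormatInfo current = CultureInfo.CurrentCulture.NumberFormat;

        // Negative sign reported by each culture.
        Console.WriteLine("Invariant NegativeSign: " + DescribeCodePoints(invariant.NegativeSign));
        Console.WriteLine("Current NegativeSign:   " + DescribeCodePoints(current.NegativeSign));

        // The actual conversion, dumped code unit by code unit.
        string formatted = 0.25.ToString("E16", CultureInfo.InvariantCulture);
        Console.WriteLine(formatted + " -> " + DescribeCodePoints(formatted));
    }
}
```

Dumping the formatted string itself, not just the culture properties, shows which character actually ends up in the output.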
The customer has tried setting the environment variable DOTNET_SYSTEM_GLOBALIZATION_USENLS to true, in case there were issues with ICU, but it didn't help. Not that it was likely, since ICU isn't used in .NET Framework. I just couldn't find anything else to try.
What else could affect .NET's choice of minus sign in ToString, apart from culture and NLS/ICU?
EDIT: Additional information: This was not an issue in the previous release of my library. I just released a new version where this became a problem. Since the previous release, I have not touched this conversion code at all. I have added support for .NET 6 (new code that the customer is not running above), and migrated my code from VS2019 to VS2022.
EDIT: Clarified the unicode characters used.
I have the following line of code:
String.Equals("strasse", "straße", StringComparison.InvariantCultureIgnoreCase)
In .net 4.7.2, this returns true.
In .net 5 (and .net 6), this returns false.
Why?
I'm currently learning how string comparison works in C#/.NET and have come across an unexpected result that I do not fully understand.
When using the overloaded method String.Equals(string, string, StringComparison) to compare the strings "strasse" and "straße" with the following StringComparison values:
Console.WriteLine(String.Equals("strasse", "straße", StringComparison.OrdinalIgnoreCase));
Console.WriteLine(String.Equals("strasse", "straße", StringComparison.CurrentCultureIgnoreCase));
Console.WriteLine(String.Equals("strasse", "straße", StringComparison.InvariantCultureIgnoreCase));
I get the following result :
False
False
False
I expected the first one to return false but both the second and third line to return true.
I first thought maybe my CurrentCulture was the issue, so to be sure I set both the CurrentCulture and CurrentUICulture to:
CultureInfo.DefaultThreadCurrentCulture = CultureInfo.CreateSpecificCulture("de-DE");
CultureInfo.DefaultThreadCurrentUICulture = CultureInfo.CreateSpecificCulture("de-DE");
Did I incorrectly understand String comparison ? or am I missing something obvious here ?
Thanks in advance for anyone willing to help me understand
When you target .NET Framework 4.x you are implicitly targeting the Windows platform. Windows handles Unicode and cultures in its own way, a direct byproduct of the platform's evolution over the last 30 years. The standard on Windows is to use the NLS APIs.
However, this is not the case for other platforms.
Over the past years the ICU project has worked to provide a unified, de facto standard for handling Unicode characters (and more). Since .NET Core runs by design on Windows, Linux, Mac and (possibly) Android devices, you expect your application to behave consistently regardless of the platform it runs on.
For this reason, .NET 5 switched to the ICU libraries as a breaking change. You can find more information in "Globalization APIs use ICU libraries on Windows". ICU libraries are available for many languages and are interoperable, thus allowing better integration.
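To see which globalization backend a given process is using, something like the sketch below may help. The IsIcuMode check follows a heuristic that, as far as I know, comes from Microsoft's globalization documentation; the class name IcuProbe is made up for this example. No expected output is given for the culture-sensitive comparisons, because that is exactly what differs between NLS and ICU.

```csharp
using System;
using System.Globalization;

public static class IcuProbe
{
    // Heuristic (assumed from Microsoft's globalization docs): under ICU the sort
    // version's SortId encodes the same value as FullVersion; under NLS it does not.
    public static bool IsIcuMode()
    {
        SortVersion sortVersion = CultureInfo.InvariantCulture.CompareInfo.Version;
        byte[] bytes = sortVersion.SortId.ToByteArray();
        int version = bytes[3] << 24 | bytes[2] << 16 | bytes[1] << 8 | bytes[0];
        return version != 0 && version == sortVersion.FullVersion;
    }

    public static void Main()
    {
        Console.WriteLine("ICU mode: " + IsIcuMode());

        // Ordinal comparison only folds case, so it never treats "ß" as "ss".
        Console.WriteLine(String.Equals("strasse", "stra\u00DFe", StringComparison.OrdinalIgnoreCase));

        // Culture-sensitive comparison: the result depends on the backend in use.
        Console.WriteLine(String.Equals("strasse", "stra\u00DFe", StringComparison.InvariantCultureIgnoreCase));
    }
}
```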
On my machine when I run and output the following
string locale = "nb-NO";
CultureInfo culture = CultureInfo.CreateSpecificCulture(locale);
string shortDateFormatString = culture.DateTimeFormat.ShortDatePattern;
string shortTimeFormatString = culture.DateTimeFormat.ShortTimePattern;
I got the following output
shortDateFormatString "dd.MM.yyyy"
ShortTimePattern "HH:mm"
But on dotnetfiddle.net I got the following
shortDateFormatString "dd.MM.yyyy"
ShortTimePattern "HH.mm"
I suppose C# uses CLDR, so according to
https://github.com/unicode-cldr/cldr-dates-full/blob/1af902b749bef761f07281f80241250053b4313d/main/nb/ca-gregorian.json#L323
Both short time pattern should be valid.
And on dotnetfiddle it is possible to parse nb-NO datetime looking as following
06.12.2017 12:34
06.12.2017 12.34
However in VS2019 on my machine it is only possible to parse
06.12.2017 12:34
How is it possible that it is different? Both are using .NET 4.7.2.
You can check my fiddle here https://dotnetfiddle.net/68DDYz
How is it possible it is different?
Because culture information is loaded from the operating system, and changes over time. Unless two machines are on the exact same version of Windows (same set of updates, hotfixes etc), it's entirely possible for them to have different values for things like short time patterns. (And yes, that's annoying, but it's part of life.)
Jon is quite right (duh!)
Culture settings are tricky. Since they are stored in the Windows registry, they can differ or change across .NET Framework versions, operating system versions, Windows updates, hotfixes etc. That means that even if both servers use the same .NET Framework version, the culture settings will not necessarily be the same on both.
I can show you the it-IT culture as an example.
See: .NET (3.5) formats times using dots instead of colons as TimeSeparator for it-IT culture?
In .NET Framework 3.5, the it-IT culture has . as its TimeSeparator, but in .NET Framework 4.0 it changed to :, which is what Wikipedia states.
These changes are not the end of the world, of course, but they were not pleasant either.
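If the input format is known in advance, one way to stop depending on whatever culture data the machine happens to have is to spell the accepted formats out explicitly. The following is a minimal sketch under that assumption; the class name PinnedDateParsing, the method TryParseNorwegian, and the choice of accepted formats are all illustrative.

```csharp
using System;
using System.Globalization;

public static class PinnedDateParsing
{
    // Both time separators seen in the question are accepted explicitly,
    // instead of relying on the OS-supplied nb-NO ShortTimePattern.
    private static readonly string[] AcceptedFormats =
    {
        "dd.MM.yyyy HH:mm",
        "dd.MM.yyyy HH.mm",
    };

    public static bool TryParseNorwegian(string text, out DateTime value) =>
        DateTime.TryParseExact(
            text,
            AcceptedFormats,
            CultureInfo.InvariantCulture, // ":" resolves to the invariant time separator
            DateTimeStyles.None,
            out value);

    public static void Main()
    {
        foreach (string input in new[] { "06.12.2017 12:34", "06.12.2017 12.34" })
        {
            DateTime parsed;
            bool ok = TryParseNorwegian(input, out parsed);
            Console.WriteLine(input + " parsed: " + ok);
        }
    }
}
```

With the formats pinned this way, both machines should accept the same inputs regardless of which Windows updates they have installed.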
If you're using ClickOnce to manage your deployments and updates, it may be configured to actively query a URL/manifest for the latest version of your project and then compare its current version against that to determine whether an update is needed. Does anyone know what the numerical limits of the comparison routine are? Because I have an automated process doing the builds, we're dropping a timestamp into the fourth component of the version (e.g. 1.0.0.x; it's just digits without any symbols). However, I'm concerned that an eight-digit number in this spot might crash the comparison. Microsoft no no do so good with unexpected requirements.
Does anyone have experience with this?
Thanks.
Let's walk the trail. If you start plugging in larger numbers, eventually setup.exe will poll for the latest version and then fail with "Cannot continue. The application is improperly formatted. Contact the application vendor for assistance."
If you look at the details, you'll see a log which may say the following:
+ The 'version' attribute is invalid - The value '1.0.0.161739' is invalid according to its datatype 'urn:schemas-microsoft-com:asm.v1:fourPartVersionType' - The Pattern constraint failed.
+ The Pattern constraint failed.
If you Google for "fourPartVersionType", you'll find yourself at FourPartVersionType Simple Type, which provides the following regular expression:
([0-9]{1,4}|[0-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])(\.([0-9]{1,4}|[0-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])){3}
This limits each component to between one and five digits and, in the five-digit case, to a value no greater than 65535.
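To check a version string against that pattern before publishing, a small validator along these lines could be used. It is a sketch only: the class name FourPartVersion and the method IsValid are hypothetical, and the anchors (^ and \z) are added here so that the whole string has to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FourPartVersion
{
    // One version component, copied from the fourPartVersionType pattern: 0 to 65535.
    private const string Component =
        "([0-9]{1,4}|[0-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])";

    // Four components separated by dots; anchored so the entire input must match.
    private static readonly Regex Pattern =
        new Regex("^" + Component + @"(\." + Component + @"){3}\z", RegexOptions.CultureInvariant);

    public static bool IsValid(string version) => Pattern.IsMatch(version);

    public static void Main()
    {
        Console.WriteLine(IsValid("1.0.0.65535"));    // True
        Console.WriteLine(IsValid("1.0.0.161739"));   // False: the value from the log above
        Console.WriteLine(IsValid("1.0.0.20240101")); // False: an eight-digit timestamp
    }
}
```

An eight-digit timestamp therefore cannot go into a single component; it would have to be split or shortened so that every part stays at or below 65535.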
Hi guys, my problem is this. I wrote software in C# that can read and edit DXF files. I have to give this software to an American company, but I have discovered the following problem:
Where I live we use ',' to separate the integer part of a number from the decimal part (for example: 2,3658), but in the USA they use '.', so they write 2.3658.
When I try to read the string "2.3658" and convert it into a double with Double.Parse("2.3658"), the double I get is 23658, as if Parse() didn't recognise the decimal part.
I have found the following solution:
UpdatedCoorx = double.Parse(shiftedE[w + 1], NumberStyles.Number, CultureInfo.CreateSpecificCulture("en-US"));
Using CultureInfo.CreateSpecificCulture("en-US"), C# can read the numbers correctly.
My question is: is there a way to make C# automatically recognise the "Culture" of the PC where it is installed, so that it can read the number correctly?
is there a way that make c# automatically recognised the "Culture" of the pc where is installed so that it can read the number correctly?
That's what it's doing by default - and why you're having a problem, because the culture used to create the value you're parsing ("2.3658") isn't the culture on your local machine.
For any particular value, you should really know which culture produced it. For machine-to-machine communication, it's best to use the invariant culture (CultureInfo.InvariantCulture) which is mostly similar to the US. Ideally, you shouldn't store values in a culture-specific format at all; either store them in a binary representation instead of a string, or if you must store a string, treat that as effectively machine-to-machine communication.
If you're in the unfortunate position of receiving data where you know it's been formatted according to some human culture, but you don't know which one, you should probably use some heuristics to detect what the culture is. That can easily fail though - for example, is "1,234" meant to be a 1 followed by a "grouping" separator, followed by 234 - meaning one thousand, two hundred and thirty-four... or is it meant to be a 1 followed by a decimal separator, making the value just a bit more than 1? Both interpretations are valid, depending on the culture you use...
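As a rough illustration of both points, the sketch below parses the DXF-style value with the invariant culture and then shows how "1,234" reads differently depending on the culture. The class name CultureParsing and the helper ParseInvariant are invented for this example, and de-DE is just one culture that uses a decimal comma.

```csharp
using System;
using System.Globalization;

public static class CultureParsing
{
    // Machine-to-machine values: always parse with the invariant culture.
    public static double ParseInvariant(string text) =>
        double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);

    public static void Main()
    {
        // "." is the invariant decimal separator, whatever the machine's culture is.
        double coordinate = ParseInvariant("2.3658");
        Console.WriteLine(coordinate.ToString(CultureInfo.InvariantCulture));

        NumberStyles styles = NumberStyles.Float | NumberStyles.AllowThousands;

        // Invariant culture: "," is a grouping separator, so this is 1234.
        double grouped = double.Parse("1,234", styles, CultureInfo.InvariantCulture);

        // de-DE: "," is the decimal separator, so this is 1.234.
        double fractional = double.Parse("1,234", styles, CultureInfo.GetCultureInfo("de-DE"));

        Console.WriteLine(grouped.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(fractional.ToString(CultureInfo.InvariantCulture));
    }
}
```

The same text yields two valid but different numbers, which is why guessing the culture from the data alone is fragile.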
If you want to detect the Culture, you should be able to do so with this
As explained in the link CultureInfo.CurrentCulture returns the Culture from the Windows GetUserDefaultLocaleName function.
Jon's answer is pretty spot on (as usual). I just wanted to add that if you are developing an application that will be used by people in another country, it may be helpful for you to use CultureInfo.DefaultThreadCurrentUICulture while you are in development mode: just assign your users' culture to this property when the application starts, and you will have exactly the same usage experience (culture-related-things-wise) that your users will have. (Just remember to remove this when you ship your application! You could for example use an #if DEBUG block.)
After installing VS2012 Premium on a dev machine a unit test failed, so the developer fixed the issue. When the changes were pushed to TeamCity the unit test failed. The project has not changed other than the solution file being upgraded to be compatible with VS2012. It still targets .net framework 4.0
I've isolated the problem to an issue with unicode characters being escaped when calling Uri.ToString. The following code replicates the behavior.
Imports NUnit.Framework

<TestFixture()>
Public Class UriTest
    <Test()>
    Public Sub UriToStringUrlDecodes()
        Dim uri = New Uri("http://www.example.org/test?helloworld=foo%B6bar")
        Assert.AreEqual("http://www.example.org/test?helloworld=foo¶bar", uri.ToString())
    End Sub
End Class
Running this in VS2010 on a machine that does not have VS2012 installed succeeds, running this in VS2010 on a machine with VS2012 installed fails. Both using the latest version of NCrunch and NUnit from NuGet.
The messages from the failed assert are
Expected string length 46 but was 48. Strings differ at index 42.
Expected: "http://www.example.org/test?helloworld=foo¶bar"
But was: "http://www.example.org/test?helloworld=foo%B6bar"
-----------------------------------------------------^
The documentation on MSDN for both .NET 4 and .NET 4.5 shows that ToString should not encode this character, meaning that the old behavior should be the correct one.
A String instance that contains the unescaped canonical representation of the Uri instance. All characters are unescaped except #, ?, and %.
After installing VS2012, that unicode character is being escaped.
The file version of System.dll on the machine with VS2012 is 4.0.30319.17929
The file version of System.dll on the build server is 4.0.30319.236
Ignoring the merits of why we are using uri.ToString(), what we are testing and any potential work around. Can anyone explain why this behavior seems to have changed, or is this a bug?
Edit, here is the C# version
using System;
using NUnit.Framework;

namespace SystemUriCSharp
{
    [TestFixture]
    public class UriTest
    {
        [Test]
        public void UriToStringDoesNotEscapeUnicodeCharacters()
        {
            var uri = new Uri(@"http://www.example.org/test?helloworld=foo%B6bar");
            Assert.AreEqual(@"http://www.example.org/test?helloworld=foo¶bar", uri.ToString());
        }
    }
}
A bit of further investigation, if I target .NET 4.0 or .NET 4.5 the tests fail, if I switch it to .NET 3.5 then it succeeds.
There are some changes introduced in .NET Framework 4.5, which is installed along with VS2012, and which is also (to the best of my knowledge) a so called "in place upgrade". This means that it actually upgrades .NET Framework 4.
Furthermore, there are breaking changes documented in System.Uri. One of them says Unicode normalization form C (NFC) will no longer be performed on non-host portions of URIs. I am not sure whether this is applicable to your case, but it could serve as a good starting point in your investigation of the error.
The change is related to problems with earlier .NET versions, which have now been changed to be more compliant with the standards. %B6 is the UTF-16 code unit of ¶ (U+00B6), but according to the standards UTF-8 should be used in the Uri, meaning that it should be %C2%B6. So, as %B6 is not valid UTF-8, it is now correctly left alone and not decoded.
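The UTF-8 form can be checked with a few lines of code. This is a minimal sketch; PilcrowEncoding and PercentEncodeUtf8 are hypothetical names, and the character is written as \u00B6 to avoid any source-file encoding issues.

```csharp
using System;
using System.Linq;
using System.Text;

public static class PilcrowEncoding
{
    // Percent-encodes every UTF-8 byte of the input, e.g. C2 B6 -> "%C2%B6".
    public static string PercentEncodeUtf8(string s) =>
        string.Concat(Encoding.UTF8.GetBytes(s).Select(b => "%" + b.ToString("X2")));

    public static void Main()
    {
        const string pilcrow = "\u00B6"; // the ¶ character

        Console.WriteLine(PercentEncodeUtf8(pilcrow));                   // %C2%B6
        Console.WriteLine(Uri.EscapeDataString(pilcrow));                // %C2%B6
        Console.WriteLine(Uri.UnescapeDataString("%C2%B6") == pilcrow);  // True
    }
}
```

The single byte B6 on its own is a UTF-8 continuation byte without a lead byte, which is why it cannot be decoded as UTF-8.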
More details from the Connect report, quoted verbatim below.
.NET 4.5 has enhanced and more compatible application of RFC 3987
which supports IRI parsing rules for URI's. IRIs are International
Resource Identifiers. This allows for non-ASCII characters to be in a
URI/IRI string to be parsed.
Prior to .NET 4.5, we had some inconsistent handling of IRIs. We had
an app.config entry with a default of false that you could turn on:
which did some IRI handling/parsing. However, it had some problems. In
particular it allowed for incorrect percent encoding handling.
Percent-encoded items in a URI/IRI string are supposed to be
percent-encoded UTF-8 octets according to RFC 3987. They are not
interpreted as percent-encoded UTF-16. So, handling “%B6” is incorrect
according to UTF-8 and no decoding will occur. The correct UTF-8
encoding for ¶ is actually “%C2%B6”.
If your string was this instead:
string strUri = @"http://www.example.com/test?helloworld=foo%C2%B6bar";
Then it will get normalized in the ToString() method and the
percent-encoding decoded and removed.
Can you provide more information about your application needs and the
use of ToString() method? Usually, we recommend the AbsoluteUri
property of the Uri object for most normalization needs.
If this issue is blocking your application development and business
needs then please let us know via the "netfx45compat at Microsoft dot
com" email address.
Thx,
Networking Team
In that situation you can't do it like that.
The main issue is the character "¶".
.NET has a problem with the ¶ character.
You can research that further.
Take the URI's parts one by one.
Concatenate them and compare the result.
Maybe you can use a method to create or replace the "¶" character.
For example:
Dim uri = New Uri("http://www.example.org/test?helloworld=foo%B6bar")
Assert.AreEqual("http://www.example.org/test?helloworld=foo¶bar", uri.Host+uri.AbsolutePath+"?"+uri.Query)
that'll work
uri.AbsolutePath: /test
uri.Host: http://www.example.org
uri.Query: helloworld=foo¶bar