Issues Copying a single file using SHFileOperation() FO_COPY - c#

Just trying to copy a single font file to the C:\Windows\Fonts folder using this specific method (I do already know about the other various methods of copying files).
Here's my code, issues are mentioned below the code:
Declarations:
Private Structure SHFILEOPSTRUCT
Dim hwnd As Integer
Dim wFunc As Integer
Dim pFrom As String
Dim pTo As String
Dim fFlags As Short
Dim fAnyOperationsAborted As Boolean
Dim hNameMappings As Integer
Dim lpszProgressTitle As String
End Structure
<DllImport("shell32.dll", EntryPoint:="SHFileOperation", CharSet:=CharSet.Auto, SetLastError:=True, ThrowOnUnmappableChar:=True)>
Private Function SHFileOperation(ByRef lpFileOp As SHFILEOPSTRUCT) As Integer
End Function
Private Const FO_COPY As Integer = &H2S
Private Const FOF_NOCONFIRMATION As Integer = &H10S
Private Const FOF_SILENT As Integer = &H4S
Function:
Dim shf As SHFILEOPSTRUCT
Dim strWinFontFolder As String = Environment.ExpandEnvironmentVariables("%WINDIR%" & "\Fonts")
With shf
.wFunc = FO_COPY
.pFrom = String.Format("{0}{1}{1}", strFontPath, vbNullChar)
.pTo = String.Format("{0}{1}{1}", strWinFontFolder, vbNullChar)
.fFlags = FOF_NOCONFIRMATION Or FOF_SILENT
.lpszProgressTitle = String.Format("Sending {0} to the Font Folder", strFontPath)
End With
Try
SHFileOperation(shf)
Catch ex As Exception
Debug.WriteLine(String.Format("SHFILEOPSTRUCT: {0}", ex.Message))
End Try
Remark from MS Docs:
Important You must ensure that the source and destination paths are double-null terminated. A normal string ends in just a single null character. If you pass that value in either the source or destination members, the function will not realize when it has reached the end of the string and will continue to read on in memory until it comes to a random double null value. This can at least lead to a buffer overrun, and possibly the unintended deletion of unrelated data.
Issues:
When I double null the endings of both the .pTo and .pFrom lines as in the remark, nothing at all seems to happen and the file isn't copied. No errors, nothing at all. Crickets.
When I accidentally only single null terminated the endings, I got this error (on my English system, no idea why it shows Asian characters?)
I should also note that before this function is even called, I do a:
If File.Exists(strFontPath) = True Then (yada yada yada)
and the file does indeed exist.
Anyone know why it won't copy?

According to the SHFileOperationA documentation:
This function has been replaced in Windows Vista by IFileOperation.
If you do not check fAnyOperationsAborted as well as the return value,
you cannot know that the function accomplished the full task you asked
of it and you might proceed under incorrect assumptions.
Do not use GetLastError with the return values of this function.
To examine the nonzero values for troubleshooting purposes, they
largely map to those defined in Winerror.h. However, several of its
possible return values are based on pre-Win32 error codes, which in
some cases overlap the later Winerror.h values without matching their
meaning. Those particular values are detailed here, and for these
specific values only these meanings should be accepted over the
Winerror.h codes. However, these values are provided with these
warnings:
These are pre-Win32 error codes and are no longer supported or defined
in any public header file. To use them, you must either define them
yourself or compare against the numerical value. These error codes are
subject to change and have historically done so. These values are
provided only as an aid in debugging. They should not be regarded as
definitive.
SHFileOperation returns a value. If an error occurs it will be in the return value.
Dim retVal As Integer = SHFileOperation(shf)
Debug.WriteLine($"retVal: {retVal.ToString("X2")}")
Note: %windir%\Fonts is a protected operating system folder and requires administrator privileges.
The OP declares Dim fFlags As Short, which is incorrect: the native field is an unsigned 16-bit type, so UShort is the matching VB type.
If one looks at shellapi.h (%ProgramFiles(x86)%\Windows Kits\10\Include\10.0.19041.0\um\shellapi.h), one sees:
typedef WORD FILEOP_FLAGS;
Looking at Windows Data Types one sees:
WORD: A 16-bit unsigned integer. The range is 0 through 65535 decimal.
and Data Type Summary (Visual Basic)
Integer: -2,147,483,648 through 2,147,483,647
Short: -32,768 through 32,767
UShort: 0 through 65,535
For installing a font, see How to install a Font programmatically (C#).
Additional Resources:
Windows Data Types
Data Type Summary (Visual Basic)
Native interoperability best practices

Related

Generate BIOS Manufacturer HWID (a GUID from SHA-1) exactly as Microsoft does

SCENARIO
I would like to learn, in C# or VB.NET, how to generate a Hardware ID using the same methodology Microsoft's developers use.
Before continuing, I should point out that this is a more specific question derived from this other one: Get Hardware IDs like Microsoft does
The basics of how a Microsoft HWID is created, which is based on the SMBIOS table, are described in the answers to the question linked above, and in more depth here: Specifying Hardware IDs for a Computer
My reason for trying to reproduce their methodology is simply to follow a professional standard for doing things the right way: Microsoft's way.
PROBLEM
Getting the SMBIOS table from managed .NET seems impossible or at least not implemented, I can't find any demonstrative usage example of the Win32 function GetSystemFirmwareTable, and parsing the fields of an SMBIOS table looks like a real nightmare that is too much work for a single developer, because I suspect a working parser would need to be updated from time to time: SMBIOS version history
Given all that, I chose to use the WMI classes to reproduce as much as I can of Microsoft's HWID implementation, with limitations (some SMBIOS fields are not exposed by WMI or are always null there).
To start reproducing what Microsoft does, I tried to generate the HWID that takes just one field: the SMBIOS manufacturer.
My problem is that I'm doing something wrong and I don't know exactly what it could be...
So, the problems I noticed with my code:
My resulting SHA-1 string differs from the GUID that the ComputerHardwareIds.exe tool (included in the WDK/SDK) builds from its SHA-1: ComputerHardwareIds specifications
(again, the basics of HWID creation are explained in the URLs linked in the 'SCENARIO' section of this question)
My resulting SHA-1 string is 40 characters long. The System.Guid structure doesn't expect a string of this length, nor does it accept the raw byte array of my computed SHA-1.
@Richard Hughes said here:
You then need to use a type 5 (SHA-1) UUID generation scheme with
70ffd812-4c7f-4c7d-0000-000000000000 as the namespace
...But I admit I don't understand what he means by using a namespace for a cipher. I'm probably missing that detail when building my SHA-1 string. I tried to analyze the members of the SHA1CryptoServiceProvider class looking for a property where a namespace could be specified, but found nothing.
This is the code I'm using, where I hardcoded the manufacturer string retrieval just to simplify things:
C# sample:
string manufacturer = "American Megatrends Inc.";
byte[] charBuff = Encoding.Unicode.GetBytes(manufacturer); // UTF-16
byte[] hashBuff = null;
string hashStr = null;
using (SHA1CryptoServiceProvider cypher = new SHA1CryptoServiceProvider()) {
hashBuff = cypher.ComputeHash(charBuff);
hashStr = BitConverter.ToString(hashBuff).Replace("-", ""); // Same string conversion methodology employed in a MSDN article.
}
Debug.WriteLine("SHA-1=\"{0}\"", hashStr); // SHA-1="0E74E534EE9F1985AE173C640302F58121190593"
Guid guid = new Guid(hashBuff); // System.ArgumentException: 'Byte array for GUID must be exactly 16 bytes long.'
VB.NET sample:
Dim manufacturer As String = "American Megatrends Inc."
Dim charBuff As Byte() = Encoding.Unicode.GetBytes(manufacturer) ' UTF-16
Dim hashBuff As Byte()
Dim hashStr As String
Using cypher As New SHA1CryptoServiceProvider
hashBuff = cypher.ComputeHash(charBuff)
hashStr = BitConverter.ToString(hashBuff).Replace("-", "") ' Same string conversion methodology employed in a MSDN article.
End Using
Debug.WriteLine("SHA-1=""{0}""", hashStr) ' SHA-1="0E74E534EE9F1985AE173C640302F58121190593"
Dim guid As New Guid(hashBuff) ' System.ArgumentException: 'Byte array for GUID must be exactly 16 bytes long.'
The expected resulting GUID would be the same GUID that the ComputerHardwareIds.exe tool generates:
{035a20a6-fccf-5040-bc3e-b8b794c57f52} <- Manufacturer
QUESTION
So, what do I need, and how can I fix my code to get the expected result?

How to marshal a structure in .NET to be used by native code?

I'm trying to write a DLL in .NET that can be called from a C++ executable. The executable expects a specific DLL to exist in its folder and expects a specific function name to be exported for it to consume. I'm using info from this Unmanaged Exports page to do it.
I have the following struct in C++ which I have to accept when the .NET function is called:
#pragma pack(4)
typedef struct sFMSelectorData
{
// sizeof(sFMSelectorData)
int nStructSize;
// game version string as returned by AppName() (ie. in the form "Thief 2 Final 1.19")
const char *sGameVersion;
// supplied initial FM root path (the FM Selector may change this)
char *sRootPath;
int nMaxRootLen;
// buffer to copy the selected FM name
char *sName;
int nMaxNameLen;
// set to non-zero when selector is invoked after game exit (if requested during game start)
int bExitedGame;
// FM selector should set this to non-zero if it wants to be invoked after game exits (only done for FMs)
int bRunAfterGame;
// optional list of paths to exclude from mod_path/uber_mod_path in + separated format and like the config
// vars, or if "*" all mod paths are excluded (leave buffer empty for no excludes)
// the specified exclude paths work as if they had a "*\" wildcard prefix
char *sModExcludePaths;
int nMaxModExcludeLen;
} sFMSelectorData;
But I haven't the slightest clue how to marshal everything. Here's my structure currently. You can see I've been trying to experiment. If I remove all the marshalling attributes, I get the MessageBox when the C++ code calls the function (below), but the data in the variables of the structure are not what's expected. When I attempt to add marshalling attributes like this example, the C++ code crashes and terminates. I was trying to match the #pragma pack(4) from the C++ structure layout, but not sure how to fiddle with the strings to make them compatible with what I guess are pointers in the C++ struct. Also, I'm guessing that the <FieldOffset(0)> attributes refers to the byte index of that variable within the struct. I had to stop there and decided to post this question.
<StructLayout(LayoutKind.Sequential, Pack:=4)>
Public Structure sFMSelectorData
' sizeof(sFMSelectorData)
Dim nStructSize As Integer
' game version string as returned by AppName() (ie. in the form "Thief 2 Final 1.19")
<MarshalAs(UnmanagedType.LPStr)>
Dim sGameVersion As String
' supplied initial FM root path (the FM Selector may change this)
<MarshalAs(UnmanagedType.LPStr)>
Dim sRootPath As String
Dim nMaxRootLen As Integer
' buffer to copy the selected FM name
<MarshalAs(UnmanagedType.LPStr)>
Dim sName As String
Dim nMaxNameLen As Integer
' set to non-zero when selector Is invoked after game exit (if requested during game start)
Dim bExitedGame As Integer
' FM selector should set this to non-zero if it wants to be invoked after game exits (only done for FMs)
Dim bRunAfterGame As Integer
' optional list of paths to exclude from mod_path/uber_mod_path in + separated format And Like the config
' vars, Or if "*" all Mod paths are excluded (leave buffer empty for no excludes)
' the specified exclude paths work as if they had a "*\" wildcard prefix
<MarshalAs(UnmanagedType.LPStr)>
Dim sModExcludePaths As String
Dim nMaxModExcludeLen As Integer
End Structure
So the C++ code is calling this .NET function:
<DllExport(CallingConvention:=CallingConvention.Cdecl, ExportName:="SelectFM")>
Public Function SelectFM(<MarshalAs(UnmanagedType.Struct)> ByRef data As sFMSelectorData) As Int32
Select Case MsgBox("Start the game?", MsgBoxStyle.Question Or MsgBoxStyle.YesNo, data.nStructSize)
Case MsgBoxResult.Yes : Return eFMSelReturn.kSelFMRet_Cancel
Case MsgBoxResult.No : Return eFMSelReturn.kSelFMRet_ExitGame
End Select
Return eFMSelReturn.kSelFMRet_Cancel
End Function
It does what I want. When I click Yes in the MessageBox, the game starts. When I click No, the game closes. But I need to use the data that's supposed to be populated into the structure. I'm not there yet.
Here's what the documentation says
An FM selector is a separate library (DLL) containing a utility, usually a UI based application, that lists the available FMs and lets the user pick which one to run. A selector could range from a simple list box with the FM names to a full blown manager with extended info, last played timestamps, sorting/filtering etc.
The default name for the selector is "FMSEL.DLL", but can be configured with the "fm_selector" cam_mod.ini var.
Exports
The DLL only needs to have a single symbol exported "SelectFM", which
is a function in the form of:
int __cdecl SelectFM(sFMSelectorData *data);
The following return values are defined:
0 = 'data->sName' is expected to contain the selected FM name, if string is empty it means no FM
1 = cancel and exit game
Any other value is interpreted as cancel-and-continue, the game will
start using the cam_mod.ini based active FM if defined, otherwise it
will run without any FM.
Data types
#pragma pack(4)
typedef struct sFMSelectorData {
// sizeof(sFMSelectorData)
int structSize;
// game version string as returned by AppName() (ie. in the form "Thief 2 Final 1.19")
const char *sGameVersion;
// supplied initial FM root path (the FM selector may change this)
char *sRootPath;
int nMaxRootLen;
// buffer to copy the selected FM name
char *sName;
int nMaxNameLen;
// set to non-zero when selector is invoked after game exit (if requested during game start)
int bExitedGame;
// FM selector should set this to non-zero if it wants to be invoked after game exits (only done for FMs)
int bRunAfterGame;
// optional list of paths to exclude from mod_path/uber_mod_path in + separated format and like the config
// vars, or if "*" all mod paths are excluded (leave buffer empty for no excludes)
// the specified exclude paths work as if they had a "*\" wildcard prefix
char *sModExcludePaths;
int nMaxModExcludeLen;
// language setting for FM (set by the FM selector when an FM is selected), may be empty if FM has no
// language specific resources
// when 'bForceLanguage' is 0 this is used to ensure an FM runs correctly even if it doesn't support
// the game's current language setting (set by the "language" config var)
// when 'bForceLanguage' is 1 this is used to force a language (that must be supported by the FM) other
// than the game's current language
char *sLanguage;
int nLanguageLen;
int bForceLanguage;
} sFMSelectorData;
#pragma pack()
typedef enum eFMSelReturn {
kSelFMRet_OK = 0, // run selected FM 'data->sName' (0-len string to run without an FM)
kSelFMRet_Cancel = -1, // cancel FM selection and start game as-is (no FM or if defined in cam_mod.ini use that)
kSelFMRet_ExitGame = 1 // abort and quit game
} eFMSelReturn;
typedef int (__cdecl *FMSelectorFunc)(sFMSelectorData*);
Hoping some bilingual C++/.NET guru can help me out.
With DllImport this structure would be
<StructLayout(LayoutKind.Sequential, Pack:=4)>
Public Structure sFMSelectorData
Public nStructSize As Integer
<MarshalAs(UnmanagedType.LPStr)>
Public sGameVersion As String
' supplied initial FM root path (the FM Selector may change this)
<MarshalAs(UnmanagedType.LPStr)>
Public sRootPath As String
Public nMaxRootLen As Integer
' buffer to copy the selected FM name
<MarshalAs(UnmanagedType.LPStr)>
Public sName As String
Public nMaxNameLen As Integer
' set to non-zero when selector Is invoked after game exit (if requested during game start)
Public bExitedGame As Integer
' FM selector should set this to non-zero if it wants to be invoked after game exits (only done for FMs)
Public bRunAfterGame As Integer
' optional list of paths to exclude from mod_path/uber_mod_path in + separated format And Like the config
' vars, Or if "*" all Mod paths are excluded (leave buffer empty for no excludes)
' the specified exclude paths work as if they had a "*\" wildcard prefix
<MarshalAs(UnmanagedType.LPStr)>
Public sModExcludePaths As String
Public nMaxModExcludeLen As Integer
End Structure
Whether that works depends on exactly what that DllExport does.
For .Net applications, you can use C++/CLI and you may find more information on MSDN. I have used C++/CLI in the past with great success for calling C++ objects in C#; although, later we used SWIG for this purpose, as we needed for Java, Python, and R too.

VB5 dll, how can I invoke the function from C# ( .NET 4.5 )

My Question is simple
VB.dll (VB5.0 I guess) includes these methods
Private Declare Function ffr_device_find Lib ".\ffr_32.dll" () As Boolean
Private Declare Function ffr_data_transceive_ex Lib ".\ffr_32.dll" (ByVal sp_sdata As String, ByVal sp_rdata As String) As Boolean
in C#.... ( .NET 4.5 )
[DllImport("FFR_32.dll", CallingConvention = CallingConvention.Cdecl)]
extern public static Boolean ffr_device_find();
[DllImport("FFR_32.dll", CallingConvention = CallingConvention.Cdecl)]
extern public static void ffr_data_transceive_ex([Out] string sp_sdata, [Out] string sp_rdata);
// FYI, I tried [Out], out, and ref but to no avail.
The first one works great,
but the second one throws this error:
A call to PInvoke function 'ffr_data_transceive_ex' has unbalanced the stack.
This is likely because the managed PInvoke signature does not match
the unmanaged target signature. Check that the calling convention and
parameters of the PInvoke signature match the target unmanaged signature.
FYI
This is a working code from VB... ( NOT INNER DLL SOURCES )
Dim st As String
Dim rData As String * 40
st = "4401" & "20202020202020202020202020202020"
Text1.Text = st
Cal_BCC
Call ffr_data_transceive_ex(Text1.Text, rData)
Text2.Text = rData
I don't even understand what Dim rData As String * 40 is about... will it become 0 when rData is 0? and become 40 when rData has 1? ...
What's wrong with my DllImport methods in C#???
[DllImport("FFR_32.dll", CallingConvention = CallingConvention.Cdecl)]
It is not Cdecl. Visual Basic assumes the default, StdCall. You got away with it on the first function because it does not take any arguments. Not on the second since it imbalanced the stack, both the caller and the callee popped the arguments off the stack, forcing the MDA to step in and tell you about the drastic mishap.
Simply remove the property completely so you get the correct default in your C# program as well, CallingConvention.StdCall.
void ffr_data_transceive_ex([Out] string sp_sdata, [Out] string sp_rdata)
You cannot use string, strings are immutable in .NET. Use StringBuilder instead. Do make sure that its Capacity is large enough to be able to store the received data that the function writes. Guessing too low causes heap corruption, a very nasty bug to troubleshoot.
Also odds that it should be byte[], your question doesn't document the type of the returned data well enough. VB5 did not have a Byte type yet so a fixed string (like String * 40) was the next best choice. Doesn't work in .NET either, not all possible byte values have a corresponding Unicode codepoint. Use StringBuilder only if you know for a fact that the function only returns ASCII codes.
I chose the above answer by Hans Passant.
But for those who want an exact solution for this case, here is some additional information and some code snippets.
First
As Hans Passant said, the CallingConvention property should be removed.
Second
As Hans Passant said, string shouldn't be passed into the function. More precisely, the first parameter can be passed as a string, but the second one should be a char[] with an explicit length.
I tested this, and it throws another error when the length is different.
Working code
extern public static void ffr_data_transceive_ex([Out] string sp_sdata, [Out] char[] sp_rdata);
Necessary Ingredient
// To pass the second parameter.
// Because the dll returns 40 length characters, this should be specified as length '40'
char[] ForCardHex = new char[40];
// Command is an already defined protocol format, like "200080581028000001"
// , which can be taken as string.
ffr_data_transceive_ex(Command, ForCardHex);
This works well and returns the expected value in ForCardHex. You should use char[]. I couldn't get byte[] passed into the function.
Thanks to all the commenters and the answerer.

ReadInt32 vs. ReadUInt32

I was tinkering with IP packet 'parsers' when I noticed something odd.
When it came to parsing the IP addresses, in C#
private uint srcAddress;
// stuff
srcAddress = (uint)(binaryReader.ReadInt32());
does the trick, so you'd think this VB.NET equivalent
Private srcAddress As UInteger
'' stuff
srcAddress = CUInt(binaryReader.ReadInt32())
would do the trick too. It doesn't. This:
srcAddress = binaryReader.ReadUInt32()
however will.
It took some time to discover, but what have I discovered, if anything? Why is this?
VB.NET, by default, does something that C# doesn't do by default: it always checks for numerical overflow. That check triggers in your code, because an IP address whose highest bit is 1 instead of 0 produces a negative number when read as an Int32, and that cannot be converted to UInteger, a data type that can only store non-negative 32-bit numbers.
C# has this option too, you'd have to explicitly use the checked keyword in your code. Or use the same option that VB.NET projects have turned on by default: Project + Properties, Build tab, Advanced, tick the "Check for arithmetic overflow/underflow" checkbox. The same option in VB.NET project is named "Remove integer overflow checks", off by default.
Do note how these defaults affected the syntax of the languages as well. In C# you have to write a cast to convert a value to an incompatible value type. That is not necessary in VB.NET, where the runtime check keeps you out of trouble. It is a very bad kind of trouble to have, since overflow can produce drastically bad results. Not in your case, as it happens: an IP address really is an unsigned number.
Do keep the other quirk about IP-addresses in mind, sockets were first invented on Unix machines that were powered by LSD and big-endian processors. You must generally use IPAddress.NetworkToHostOrder() to get the address in the proper order. Which only has overloads that take a signed integer type as the argument. So using ReadInt32() is actually correct, assuming it is an IPv4 address, you pass that directly to NetworkToHostOrder(). No fear of overflow.

Comparing Unicode Strings in C Returns Different Values Than C#

So I am attempting to write a compare function in C which can take a UTF-8 encoded Unicode string and use the Windows CompareStringEx() function and I am expecting it to work just like .NET's CultureInfo.CompareInfo.Compare().
Now the function I have written in C works some of the time, but not in all cases and I'm trying to figure out why. Here is a case that fails (passes in C#, not in C):
CultureInfo cultureInfo = new CultureInfo("en-US");
CompareOptions compareOptions = CompareOptions.IgnoreCase | CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth;
string stringA = "คนอ้วน ๆ";
string stringB = "はじめまして";
//Result is -1 which is expected
int result = cultureInfo.CompareInfo.Compare(stringA, stringB);
And here is what I have written in C. Keep in mind this is supposed to take a UTF-8 encoded string and use the Windows CompareStringEx() function so conversion is necessary.
// Compare flags for the string comparison
#define COMPARE_STRING_FLAGS (NORM_IGNORECASE | NORM_IGNOREKANATYPE | NORM_IGNOREWIDTH)
int CompareStrings(int lenA, const void *strA, int lenB, const void *strB)
{
LCID ENGLISH_LCID = MAKELCID(MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US), SORT_DEFAULT);
int compareValue = -1;
// Get the size of the strings as UTF-16 encoded Unicode strings.
// Note: Passing 0 as the last parameter forces the MultiByteToWideChar function
// to give us the required buffer size to convert the given string to UTF-16
int strAWStrBufferSize = MultiByteToWideChar(CP_UTF8, 0, (LPCSTR)strA, lenA, NULL, 0);
int strBWStrBufferSize = MultiByteToWideChar(CP_UTF8, 0, (LPCSTR)strB, lenB, NULL, 0);
// Malloc the strings to store the converted UTF-16 values
LPWSTR utf16StrA = (LPWSTR) GlobalAlloc(GMEM_FIXED, strAWStrBufferSize * sizeof(WCHAR));
LPWSTR utf16StrB = (LPWSTR) GlobalAlloc(GMEM_FIXED, strBWStrBufferSize * sizeof(WCHAR));
// Convert the UTF-8 strings (SQLite will pass them as UTF-8 to us) to standard
// windows WCHAR (UTF-16\UCS-2) encoding for Unicode so they can be used in the
// Windows CompareStringEx() function.
if(strAWStrBufferSize != 0)
{
MultiByteToWideChar(CP_UTF8, 0, (LPCSTR)strA, lenA, utf16StrA, strAWStrBufferSize);
}
if(strBWStrBufferSize != 0)
{
MultiByteToWideChar(CP_UTF8, 0, (LPCSTR)strB, lenB, utf16StrB, strBWStrBufferSize);
}
// Compare the strings using the windows compare function.
// Note: We subtract 1 from the size since we don't want to include the null termination character
if(NULL != utf16StrA && NULL != utf16StrB)
{
compareValue = CompareStringEx(L"en-US", COMPARE_STRING_FLAGS, utf16StrA, strAWStrBufferSize - 1, utf16StrB, strBWStrBufferSize - 1, NULL, NULL, 0);
}
// In the Windows CompareStringEx() function, 0 indicates an error, 1 indicates less than,
// 2 indicates equal to, 3 indicates greater than so subtract 2 to maintain C convention
if(compareValue > 0)
{
compareValue -= 2;
}
return compareValue;
}
Now if I run the following code, I expect the result to be -1 based on the .NET implementation (see above), but I get 1, indicating that the first string is greater than the second:
char strA[50] = "คนอ้วน ๆ";
char strB[50] = "はじめまして";
// Will be 1 when we expect it to be -1
int result = CompareStrings(strlen(strA), strA, strlen(strB), strB);
Any ideas on why the results I'm getting are different? I'm using the same LCID/cultureInfo and compareOptions in both implementations and the conversions are successful as far as I can tell.
FYI: This function will be used as a custom collation in SQLite. Not relevant to the question but in case anyone is wondering why the function signature is the way it is.
UPDATE: I also determined that when running the same code in .NET 4 I would see the behavior I saw in the native code. As a result there was now a discrepancy between .NET versions. See my answer below for the reasons behind this.
Well, your code performs several steps here - it's not clear whether it's the compare step which is failing or not.
As a first step, I would write out - in both the .NET code and the C code - the exact UTF-16 code units which you've got in utf16StrA, utf16StrB, stringA and stringB. I wouldn't be at all surprised to find that there's a problem in the input data you're using in the C code.
What you are hoping for here is that your text editor will save the source code file in utf-8 format. And that the compiler will then somehow not interpret the source code as utf-8. That's too much to hope for, at least on my compiler:
warning C4566: character represented by universal-character-name '\u0E04' cannot be represented in the current code page (1252)
Fix:
const wchar_t* strA = L"คนอ้วน ๆ";
const wchar_t* strB = L"はじめまして";
And remove the conversion code.
So I ended up figuring out the issue after contacting Microsoft support. Here is what they had to say about the issue:
The reason for the issue you are seeing, namely, running CompareInfo.Compare against the same string with the same compare options but getting different return values when run under different versions of the .NET Framework, is that the sorting rules are tied to the Unicode spec, which evolves over time. Historically .NET has snapped data for side by side releases to correspond to the newest version of Windows and the corresponding version of Unicode implemented at that time so 2.0, 3.0 and 3.5 correspond to the version for Windows XP or Server 2003, whereas v4.0 matched the Vista sorting rules. As a result the sorting rules for the various versions of the .NET Framework have changed over time.
This also means that when I ran the native code I was calling the sort methods that adhered to the Vista sorting rules, and when I ran in .NET 3.5 I was running sort methods that used the Windows XP sorting rules. Seems odd to me that the Unicode spec would change in such a manner as to cause such a dramatic difference, but apparently that's the case here. Seems to me that changing the Unicode spec in such a dramatic way is a fantastic way to break backwards compatibility.
