C# datagridview unicode character encoding

I cannot sort out the following problem:
I use a DataGridView column to tell the user whether the item in that row has already been processed. A little Unicode icon should suffice, I thought, so I went for U+2714 (check mark) and U+2715 (cross) to achieve what I wanted. For the DataTable...
row["Done"] = (listProcessed.Contains(file.FullName)) ? "\u2714" : "\u2715";
It works well in debug and release mode on my development machine, but it fails on a Windows XP virtual machine. On that one, only narrow squares are shown, just as if it didn't know the characters.
I read somewhere that it might be due to line endings, so I tried to apply TrimEnd(null) to the strings, but that did not help.
Is there a way to make this work on Windows XP? What exactly is going wrong?
thx i.a.

That means that the Windows XP machine is using a font that does not contain those characters.
Use charmap to see if you can find a font which does. (try Arial Unicode MS)
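
A minimal sketch of acting on that advice: pick the first font from a preference list that is actually installed, then assign it to the column. The helper name PickFont and the candidate font names are illustrative assumptions, not part of the question; check each candidate in charmap for U+2714 and U+2715 before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GlyphFontPicker
{
    // Returns the first preferred font name that appears in the installed list
    // (case-insensitive), or null if none of them is installed.
    public static string PickFont(IEnumerable<string> installed, params string[] preferred)
    {
        var available = new HashSet<string>(installed, StringComparer.OrdinalIgnoreCase);
        return preferred.FirstOrDefault(name => available.Contains(name));
    }
}

// Hypothetical WinForms usage (dataGridView1 and the "Done" column are assumed):
//   var names = new System.Drawing.Text.InstalledFontCollection().Families.Select(f => f.Name);
//   string font = GlyphFontPicker.PickFont(names, "Segoe UI Symbol", "Arial Unicode MS");
//   if (font != null)
//       dataGridView1.Columns["Done"].DefaultCellStyle.Font = new System.Drawing.Font(font, 9f);
```

If no candidate is installed, a plain-text fallback such as "yes"/"no" avoids the empty squares altogether.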

Related

What does "Beta: Use Unicode UTF-8 for worldwide language support" actually do?

In some Windows 10 builds (insiders starting April 2018 and also "normal" 1903) there is a new option called "Beta: Use Unicode UTF-8 for worldwide language support".
You can see this option by going to Settings and then:
All Settings -> Time & Language -> Language -> "Administrative Language Settings"
This is what it looks like:
When this checkbox is checked I observe some irregularities (below) and I would like to know what exactly this checkbox does and why the below happens.
Create a brand new Windows Forms application in Visual Studio 2019. On the main form, specify the Paint event handler as follows:
private void Form1_Paint(object sender, PaintEventArgs e)
{
    // Draw the two Webdings glyphs that are mapped to the characters '0' and 'r'.
    using (Font buttonFont = new Font("Webdings", 9.25f))
    {
        TextRenderer.DrawText(e.Graphics, "0r", buttonFont, new Point(), Color.Black);
    }
}
Run the program, here is what you will see if the checkbox is NOT checked:
However, if you check the checkbox (and reboot as asked) this changes to:
You can look up Webdings font on Wikipedia. According to character table given, the codes for these two characters are "\U0001F5D5\U0001F5D9". If I use them instead of "0r" it works with the checkbox checked but without the checkbox checked it now looks like this:
I would like to find a solution that always works, regardless of whether the box is checked or unchecked.
Can this be done?

You can see it in ProcMon.
It seems to set the REG_SZ values ACP, MACCP, and OEMCP in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage
to 65001.
I'm not entirely sure but it might be related to the variable gAnsiCodePage in KernelBase.dll, which GetACP reads. If you really want to, you might be able to change it dynamically for your program regardless of the system setting by dynamically disassembling GetACP to find the instruction sequence that reads gAnsiCodePage and obtaining a pointer to it, then updating the variable directly.
(Actually, I see references to an undocumented function named SetCPGlobal that would've done the job, but I can't find that function on my system. Not sure if it still exists.)

Most Windows C APIs come in two different variants:
"A" variant that uses 8-bit strings in whatever the system's configured encoding is. This varies depending on the configured country/language.
(Microsoft calls the configured encoding the "ANSI Code Page", but it doesn't really have anything to do with ANSI.)
"W" variant that uses 16-bit strings in a fixed almost-UTF-16 encoding. (The "almost" is because "unpaired surrogates" are allowed; if you don't know what those are then don't worry about them.)
The official Microsoft advice is not to use the "A" versions, but to ensure your code always uses the "W" variants. That way you're supposed to get consistent behaviour no matter what the user's country/language is configured as.
However, it looks like that checkbox is doing more than one thing. It's clear it's supposed to change the "ANSI Code Page" to 65001, which means UTF-8. It looks like it's also changing font rendering to be more Unicody.
I suggest you detect if GetACP() == 65001, then draw the Unicode version of your strings, otherwise draw the old "0r" version. I'm not sure how you do that from .NET...
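
One way this check might look from .NET is a P/Invoke call to GetACP, sketched below. GetACP is a documented kernel32 function; the class and method names are made up for illustration, and the idea that the glyph choice should follow the ANSI code page is this answer's guess rather than confirmed behaviour.

```csharp
using System;
using System.Runtime.InteropServices;

static class AnsiCodePage
{
    public const uint Utf8 = 65001;

    // Documented kernel32 export: returns the active ANSI code page (Windows only).
    [DllImport("kernel32.dll")]
    private static extern uint GetACP();

    public static uint Current() => GetACP();

    // Picks the string to draw with the Webdings font for a given code page:
    // the Unicode code points when the code page is UTF-8, the legacy "0r" otherwise.
    public static string WebdingsText(uint codePage) =>
        codePage == Utf8 ? "\U0001F5D5\U0001F5D9" : "0r";
}
```

In the Paint handler the text argument would then be AnsiCodePage.WebdingsText(AnsiCodePage.Current()). Keeping the choice in a function that takes the code page as a parameter lets it be exercised without touching the system setting.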

Please look at this question to see what it solves when it is enabled: How to save to file non-ascii output of program in Powershell?
Also I found explanation written by Ghisler helpful (source):
If you check this option, Windows will use codepage 65001 (Unicode
UTF-8) instead of the local codepage like 1252 (Western Latin1) for
all plain text files. The advantage is that text files created in e.g.
Russian locale can also be read in other locale like Western or
Central Europe. The downside is that ANSI-Only programs (most older
programs) will show garbage instead of accented characters.
I leave here two ways to enable it, I think they will be helpful for many users:
Win+R -> intl.cpl
Administrative tab
Click the Change system locale button.
Enable Beta: Use Unicode UTF-8 for worldwide language support
Reboot
or alternatively via reg file:
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage]
"ACP"="65001"
"OEMCP"="65001"
"MACCP"="65001"

On my Windows machine, when I checked "Beta: Use Unicode UTF-8 for worldwide language support", the following registry values under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage changed:
ACP: 936 -> 65001
MACCP: 10008 -> 65001
OEMCP : 936 -> 65001
If I leave it unchecked, compilation in Visual Studio fails with "Exception: Bad UTF-8 encoding (U+FFFD; REPLACEMENT CHARACTER) found while decoding string: ...". If I check it, compilation succeeds, but the OS is full of unreadable text.

How to type a grave accent/ back tick in a Visual Studio Console?

I need to type a back tick character in a Visual Studio 2010 Console but I can't seem to make it happen. I know it is the Unicode character U+0060, and I tried the Alt+ method but that didn't work; after some research I added this line to my C# application but it still doesn't let me type it: Console.OutputEncoding = System.Text.Encoding.UTF8
Is there a simple way to make it appear? I am using the Lucida Console font.
Thanks!

ALT-096 (96 decimal = 60 in hex) should normally work. I have a vague memory from back in the DOS days that it had to be the Alt-Gr key and that the numbers had to be typed on the number pad, but that is certainly not the case on my setup here, unless it is something keyboard-specific.
An alternative technique is to run the charmap.exe Windows accessory (you may have to install it from Add Programs/Programs and Features if it wasn't selected at the original install, but it is available in every Windows installation that I've come across since the Win 3.x days). From that you can easily copy characters to the clipboard and paste them into whatever you need.
Charmap.exe is especially useful for dealing with symbol/Wingdings-type fonts.
The final approach I know of is simply to cast the code to a char, '(char) 96'. Have you tried:
Console.Write((char) 96);

Printing a line instead of "--------"

For a Windows CE project in which we print slips, we have a new request asking whether it is possible to print a line instead of printing "-----------" all the way across.
Is this possible without printing an image?
c# / .net 3.5
Thank you

On your desktop run charmap.exe. Tick "Advanced view" and type "box" in the Search box. You'll get the Unicode codepoints that you can use to draw lines and boxes. Copy and paste them into your code. Whether they actually show up properly on your device depends on the font support. Odds are decent since they've been around since the first IBM PC. You'll have to try.
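
As a rough sketch of what that could look like in code, the helper below builds a rule from U+2500 (BOX DRAWINGS LIGHT HORIZONTAL) and falls back to the plain hyphen. The class name, method name, and the fontHasBoxDrawing flag are assumptions for illustration; whether the printer font actually has the glyph still has to be tested on the device.

```csharp
using System;

static class SlipRuler
{
    // Builds a horizontal rule of the given width.
    // U+2500 is used when the font is known to support it; '-' otherwise.
    public static string HorizontalRule(int width, bool fontHasBoxDrawing)
    {
        char c = fontHasBoxDrawing ? '\u2500' : '-';
        return new string(c, width);
    }
}
```

A call such as SlipRuler.HorizontalRule(32, true) would replace the hard-coded run of hyphens in the slip layout.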

There is an extended ASCII value for this (196), but it really depends on the printer.
Or, as quppa comments, use _, but that will not be adequate if you want to box in a title or similar.

Wikipedia has an article on box-drawing characters.
Since ─ (U+2500) didn't work for you, it's unlikely ━ (U+2501) will work either, but it's perhaps worth a shot. There is also no guarantee that there won't be spaces between these characters, given that spaces appear between underscores.
The issue is not Windows CE supporting Unicode but finding a font that you can use that has the box-drawing characters. Given the likely size limitations (fonts with lots of characters are tens of megabytes big), this might be a challenge.

C# Unicode (Japanese Characters)

I have a Japanese final coming up soon, so I made a program to help me study. But I can't seem to get VS2008 to display any Unicode in the Console. This is a sample I used to see if I could display Unicode:
string diancai = new string(new char[]{ '\u70B9','\u83DC' });
Console.Write(diancai[0] + " " + diancai[1]);
Output is:
? ?
Please help! Thank you!

Go to your command prompt and try the command "chcp".
It should look like this:
C:\> chcp
現在のコード ページ: 932
932 is Japanese. If the code page is not correct, or if your Windows does not support it, the console cannot display the characters.
Your code runs on my machine, which is a Japanese Windows, and displays the following characters:
点 菜
So, for your case, I recommend trying a GUI program instead of the console.

There are two conditions that must be satisfied in order for this to work:
The console's output encoding must be able to represent Japanese characters
The console's font must be able to render them
Condition 1 should be fairly simple to deal with; just set System.Console.OutputEncoding to an appropriate Encoding, such as a UTF8Encoding. (Of course, this won't work on Windows 9x, since that doesn't really support encodings or Unicode. But you aren't using that, now, are you?)
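
A small sketch of condition 1, using the sample characters from the question. The RoundTrips helper is a made-up name; it only illustrates why an encoding that cannot represent the characters ends up printing "?", with ASCII standing in for an unsuitable console code page.

```csharp
using System;
using System.Text;

static class ConsoleEncodingDemo
{
    // True if the encoding can represent the text without loss.
    // Unrepresentable characters are replaced (typically by '?') during encoding.
    public static bool RoundTrips(Encoding encoding, string text)
    {
        return encoding.GetString(encoding.GetBytes(text)) == text;
    }

    public static void WriteSample()
    {
        // Switch the console to UTF-8 before writing non-ASCII text.
        Console.OutputEncoding = Encoding.UTF8;
        string diancai = "\u70B9\u83DC";
        Console.WriteLine(diancai[0] + " " + diancai[1]);
    }
}
```

Setting the encoding only addresses condition 1; the output will still show boxes or question marks if the console font lacks the glyphs.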
Satisfying condition 2 is a bit more involved:
First, an appropriate font must be installed on the user's system. If there aren't any installed yet, the user will have to install some, perhaps by:
Opening intl.cpl ("Regional and Language Options" in the Control Panel on Windows XP in English)
Going to the "Languages" tab
Enabling "Install files for East Asian languages"
Clicking "OK"
Actually getting the console to use such a font seems to be fairly hairy; see the question: How to display japanese Kanji inside a cmd window under windows? for more about that.

I use the English version of Windows XP, but I set up my OS so it can show Japanese characters.
For Windows XP these are the steps:
1. Control Panel -> Regional and Language Options -> Advanced
2. Choose Japanese.
3. Choose the code page conversion tables for the language you use.
4. Push the OK button.
5. Restart your computer.
Afterwards I tried the "chcp" command in the command prompt.
It displays: Active code page 932

Windows - Can console output inadvertently cause a system beep?

I have a C# console application that logs a lot to the console (using Trace). Some of the stuff it logs is the compressed representation of a network message (so a lot of that is rendered as funky non-alphabetic characters).
I'm getting system beeps every so often while the application is running. Is it possible that some "text" I am writing to the console is causing them?
(By system beep, I mean from the low-tech speaker inside the PC case, not any kind of Windows sound scheme WAV)
If so, is there any way to disable it for my application? I want to be able to output any possible text without it being interpreted as a sound request.

That's usually caused by outputting character code 7, CTRL-G, which is the BEL (bell) character.
The first thing I normally do when buying a new computer or motherboard is to ensure the wire from the motherboard to the speaker is not connected. I haven't used the speaker since the days of Commander Keen (and removing that wire is the best OS-agnostic way of stopping the sound :-).

Under HKEY_CURRENT_USER\Control Panel\Sound,
set the "Beep" value to "no".

Absolutely. If you output the ASCII control code "Bell" (0x7) to a console, it beeps.

If you don't want it to beep, you'll either have to replace the 0x7 character before outputting it, or disable the "Beep" device driver, which you'll find in the Non-Plug and Play Drivers section, visible if you turn on the Show Hidden Devices option. Or take the speaker out.

Even if you check the input for BELL characters, it may still beep. This is due to font settings and unicode conversion. The character in question is U+2022, Bullet.
Raymond Chen explains:
In the OEM code page, the bullet character is being converted to a
beep. But why is that?
What you're seeing is MB_USEGLYPHCHARS in reverse. Michael Kaplan
discussed MB_USEGLYPHCHARS a while ago. It determines whether certain
characters should be treated as control characters or as printable
characters when converting to Unicode. For example, it controls
whether the ASCII bell character 0x07 should be converted to the
Unicode bell character U+0007 or to the Unicode bullet U+2022. You
need the MB_USEGLYPHCHARS flag to decide which way to go when
converting to Unicode, but there is no corresponding ambiguity when
converting from Unicode. When converting from Unicode, both U+0007 and
U+2022 map to the ASCII bell character.
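
Putting the answers above together, a minimal sketch of a filter that replaces both U+0007 and U+2022 before the text reaches the console might look like this. The class name, method name, and the '.' placeholder are assumptions; replacing the bullet does alter the logged text, which may or may not be acceptable for a given log.

```csharp
using System;
using System.Text;

static class ConsoleSanitizer
{
    // Replaces the bell control character (U+0007) and the bullet (U+2022),
    // which can be converted to a bell in the OEM code page, with a placeholder.
    public static string StripBell(string text, char replacement = '.')
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            sb.Append(c == '\a' || c == '\u2022' ? replacement : c);
        }
        return sb.ToString();
    }
}
```

The application would call ConsoleSanitizer.StripBell on each message before passing it to Trace or Console.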

\a in the output string will cause a beep, unless beeps are disabled at the OS level. (In C#, \b is backspace, 0x08; the bell character 0x07 is \a.)
