I have a problem outputting emoji in the console.
A string written with a Unicode escape ("\u…") works well, like "\u263A".
However, if I simply copy and paste an emoji into a string, like "🎁", it does not work.
Test code below:
using System;
using System.Text;

namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.OutputEncoding = Encoding.UTF8;
            string s1 = "🎁";
            string s1_uni = "\ud83c\udf81"; // Unicode escape for s1
            string s2 = "☺";
            string s2_uni = "\u263A"; // Unicode escape for s2
            Console.WriteLine(s1);
            Console.WriteLine(s1_uni);
            Console.WriteLine(s2);
            Console.WriteLine(s2_uni);
            Console.ReadLine();
        }
    }
}
s2 and s2_uni are output successfully, while s1 and s1_uni fail.
I want to know how to fix this problem.
By the way, the console font is 'Consolas', which works perfectly in Visual Studio.
Update:
Please note that I did some searching on Stack Overflow before posting this question. The most common advice is to set the console encoding to UTF-8, which is done in the first line of Main.
That approach (Console.OutputEncoding = Encoding.UTF8) does not fully cover the situation I presented.
Also, the reason I mentioned the console font is to make clear that the Consolas font shows emoji perfectly in VS but fails in the console: the first emoji is not displayed.
Please do not close this question. Thanks.
Update 2:
This emoji can be shown in the VS terminal.
Update 3:
Thanks to Peter Duniho for the help; you are right.
While we were discussing, I looked through the Microsoft documentation on Unicode support for the console:
"Display of characters outside the Basic Multilingual Plane (that is, of surrogate pairs) is not supported, even if they are defined in a linked font file."
The code point of the emoji that cannot be shown is outside the BMP, and the console does not support displaying code points outside the BMP. That is why this emoji is not shown.
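The distinction can be checked in code: char.IsSurrogatePair and char.ConvertToUtf32 reveal whether a character lies outside the BMP. A minimal sketch:

```csharp
using System;

class BmpCheck
{
    static void Main()
    {
        // "🎁" (U+1F381) lies outside the BMP, so C# stores it as a surrogate pair.
        string gift = "\ud83c\udf81";
        // "☺" (U+263A) lies inside the BMP and fits in a single char.
        string smiley = "\u263A";

        Console.WriteLine(gift.Length);                   // 2 (two UTF-16 code units)
        Console.WriteLine(smiley.Length);                 // 1
        Console.WriteLine(char.IsSurrogatePair(gift, 0)); // True
        // The real code point, beyond U+FFFF:
        Console.WriteLine("U+" + char.ConvertToUtf32(gift, 0).ToString("X")); // U+1F381
    }
}
```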
To find a running context that might support this emoji, I did some experiments.
CMD: (screenshot)
PowerShell: (screenshot)
Windows Terminal: (screenshot)
As you can see, Windows Terminal supports it.
Strictly speaking, the problem I met is not a duplicate of other Stack Overflow questions, because my code already does everything that can be done in code. The problem is the running context, not the code.
Thanks to Peter Duniho for the help.
The current Windows command-line console, cmd.exe, still uses GDI+ to render text, and the GDI+ API it uses does not correctly handle combining/surrogate-pair characters like the emoji you want to display.
This is true even when using a font that includes the glyph for the character you want, and even when you have correctly set the output encoding for the Console class to a Unicode encoding (both of which you've done in your example).
Microsoft appears to be working on improvements to the command prompt code, to upgrade it to use the DirectWrite API instead of GDI+. If and when these improvements are released, the console window should be able to display your emoji correctly. See GitHub issue UTF-8 rendering woes #75.
In the meantime, you can run your program in a context that is able to render these characters correctly, such as Windows Terminal or PowerShell.
Additional details regarding the limitations of the GDI+ font rendering can be found in GitHub issues Add emoji support to Windows Console #190 and emoji/unicode support mostly broken in windows #2693 (the latter isn't about a Windows component per se, but still relates to this problem).
Related
I've written a C# application (using VS) and in general it works fine on a Raspberry Pi using Mono to run the exe. Now I need a Chinese version, but Mono will not display code points like U+34B5.
I'm quite sure Mono causes the problem, because running everything on Windows displays the right symbols, and when I open the language CSV on the Raspberry Pi in a simple text editor, I see the correct symbols too.
I'm using the same font in my app as in the text editor. I tried Label, TextBox as well as RichTextBox; no success.
string str = "\u34B4 \u34B5 \u34B6";
TextBox1.Text = str;
I'd expect those nice little symbols to be displayed as on Windows, but on the Raspberry Pi I only see small empty boxes.
I am having big problems trying to print a PDF file on Windows using Ghostscript. The 'on Windows' part comes from the fact that I am trying to use the MS Windows default driver via '-sDEVICE=mswinpr2'. I need support for all Windows printers/drivers. I also cannot use the PDF-to-images-then-print kind of solution, and I cannot use the gswin64c.exe file; the job must be done without any popups (no form of any kind). All I can do is send some parameters to gsdll32.dll and have it create a print job.
I am using this C# wrapper:
https://github.com/mephraim/ghostscriptsharp/tree/master
I am sending the following parameters: "-dBATCH -dNOPAUSE -dNOPROMPT -dDEVICEWIDTHPOINTS=612 -dDEVICEHEIGHTPOINTS=792 -dFIXEDMEDIA -dPDFFitPage -sDEVICE=mswinpr2 -dQUIET -sOutputFile=\"%printer%Epson Stylus Pro 4900\" D:\1.pdf"
And every time, the printer selection dialog keeps popping up. I understand that the order of the parameters matters, because when I changed it I got different results.
Actual question:
What parameters do I have to send to the Ghostscript DLL so that I can print a PDF file using the default MS Windows printing driver?
Have you tried this using the command-line version of GS instead of the DLL or the C# wrapper? I'd suggest you concentrate on getting that to work first.
What is the name of the printer (as it appears in Windows)?
What version of Ghostscript are you using?
Try the command line without '-dBATCH', '-dNOPAUSE', '-dNOPROMPT' and '-dQUIET'; that way, if Ghostscript tries to tell you something, you won't just ignore it or miss it.
If the command line works: I see you've escaped the " characters, but not the '%'. You might want to escape those, or double them up; depending on how this wrapper of yours works, they might be getting read as format specifiers.
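To illustrate the escaping point, here is a hedged sketch of how the argument list might be built in C# before being handed to a wrapper. The printer name and path are the ones from the question; how a given wrapper forwards the strings to gsdll32.dll is an assumption.

```csharp
using System;

class GsArgs
{
    static void Main()
    {
        // Hypothetical argument list for a Ghostscript wrapper call.
        // Verbatim strings (@"...") sidestep C# backslash-escape problems in paths,
        // and "%" may need doubling to "%%" if the wrapper runs the string
        // through a printf-style formatter.
        string[] gsArgs =
        {
            "-dBATCH",
            "-dNOPAUSE",
            "-sDEVICE=mswinpr2",
            @"-sOutputFile=%printer%Epson Stylus Pro 4900",
            @"D:\1.pdf"
        };
        Console.WriteLine(string.Join(" ", gsArgs));
    }
}
```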
The parameters have been verified first in the command-line version gswin64c.exe (64-bit operating system), and they work fine.
I am using GS version 9.10 (the latest version).
I've tried different combinations of parameters, with or without some of them, with the same result: a -100 exit code (a general fault with no indication of the error that caused it).
It doesn't seem to be a problem with the % character... I'll try some more things.
Thanks, Ken, for the help.
As far as the printer dialog popup is concerned, if you replace "mswinpr2" with a compatible device name such as ljet4, the prompts go away. My guess is that your computer has more than one printer installed, and hence Windows prompts you to choose one from the list.
I need to type a backtick character in a Visual Studio 2010 console, but I can't seem to make it happen. I know it is Unicode character U+0060, and I tried the Alt+ method, but that didn't work. After some research I added this line to my C# application, but it still doesn't let me type it: Console.OutputEncoding = System.Text.Encoding.UTF8;
Is there a simple way to make it appear? I am using the Lucida Console font.
Thanks!
Alt+096 (96 decimal = 0060 in hex) should normally work. I have a vague memory from back in the DOS days of it having to be the AltGr key, and of the numbers having to be typed on the number pad, but that's certainly not the case on my setup here, unless that's something keyboard-specific.
An alternative technique is to run the charmap.exe Windows accessory (you may have to install it from Programs and Features if it wasn't selected at the original install, but it is available in every Windows installation I've come across since the Win 3.x days). From there you can easily copy characters to the clipboard and paste them wherever you need.
Charmap.exe is especially useful for dealing with symbol/Wingdings-type fonts.
The final approach I know of is to simply use a char cast, '(char) 96'. Have you tried:
Console.Write((char) 96);
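For completeness, the cast, the Unicode escape, and the literal are all the same character; a minimal sketch:

```csharp
using System;

class Backtick
{
    static void Main()
    {
        // Three equivalent ways to write the backtick (U+0060):
        Console.Write((char)96); // numeric cast, 96 decimal
        Console.Write('\u0060'); // Unicode escape
        Console.Write("`");      // literal character in the source
        Console.WriteLine();     // prints: ```
    }
}
```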
I have a Japanese final coming up soon, so to help me study I made a program. But I can't seem to get VS2008 to display any Unicode in the console. This is a sample I used to see if I could display Unicode:
string diancai = new string(new char[] { '\u70B9', '\u83DC' });
Console.Write(diancai[0] + " " + diancai[1]);
The output is:
? ?
Please help! Thank you!
Go to your command prompt and try the command "chcp".
It should look like this:
C:\> chcp
現在のコード ページ: 932 ("Active code page: 932")
932 is Japanese. If the code page is not correct, or if your Windows does not support it, the console can't display the characters.
I can run your code on my machine (a Japanese Windows), and it displays the following characters:
点 菜
So, for your case, I recommend trying a GUI program instead of the console.
There are two conditions that must be satisfied in order for this to work:
The console's output encoding must be able to represent Japanese characters
The console's font must be able to render them
Condition 1 should be fairly simple to deal with; just set System.Console.OutputEncoding to an appropriate Encoding, such as a UTF8Encoding. (Of course, this won't work on Windows 9x, since that doesn't really support encodings or Unicode. But you aren't using that, now, are you?)
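Condition 1 amounts to a single line before any output; a minimal sketch:

```csharp
using System;
using System.Text;

class JapaneseConsole
{
    static void Main()
    {
        // Condition 1: give the console an encoding that can represent the characters.
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine("\u70B9 \u83DC"); // 点 菜 (still needs a capable font: condition 2)
    }
}
```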
Satisfying condition 2 is a bit more involved:
First, an appropriate font must be installed on the user's system. If there aren't any installed yet, the user will have to install some, perhaps by:
Opening intl.cpl ("Regional and Language Options" in the Control Panel on Windows XP in English)
Going to the "Languages" tab
Enabling "Install files for East Asian languages"
Clicking "OK"
Actually getting the console to use such a font seems to be fairly hairy; see the question: How to display japanese Kanji inside a cmd window under windows? for more about that.
I use the Windows XP English version, but I set up my OS so it can show Japanese characters.
For Windows XP, these are the steps:
1. Control Panel -> Regional and Language Options -> Advanced
2. Choose Japanese.
3. Choose the code page conversion tables for the language you use.
4. Push the OK button.
5. Restart your computer.
I tried the "chcp" command on the command prompt.
It displays: Active code page: 932
A screenshot of the console output explained it all (the original ImageShack image link is dead).
The variable textInput comes from File.ReadAllText(path); and characters like é and è do not display. When I run my unit test, everything is fine; I see them. Why?
The .NET classes (System.IO.StreamReader and the like) take UTF-8 as the default encoding. If you want to read a different encoding, you have to pass it explicitly to the appropriate constructor overload.
Also note that there is no single encoding called “ANSI”. You're probably referring to Windows code page 1252, aka “Western European”. Notice that this is different from the Windows default encoding in other countries. This is relevant when you try to use System.Text.Encoding.Default, because it actually differs from system to system.
/EDIT: It seems you misunderstood both my answer and my comment:
The problem in your code is that you need to tell .NET what encoding you're using.
The other remark, saying that “ANSI” may refer to different encodings, didn't have anything to do with your problem. It was just a “by the way” remark to prevent misunderstandings (well, that one backfired).
So, finally: The solution to your problem should be the following code:
string text = System.IO.File.ReadAllText("path", Encoding.GetEncoding(1252));
The important part here is the usage of an appropriate System.Text.Encoding instance.
However, this assumes that your encoding is indeed Windows-1252 (but I believe that's what Notepad++ means by “ANSI”). I have no idea why your text gets displayed correctly when read by NUnit. I suppose that NUnit either has some kind of autodiscovery for text encodings or that NUnit uses some weird defaults (i.e. not UTF-8).
Oh, and by the way: “ANSI” really refers to the “American National Standards Institute”. There are a lot of completely different standards that have “ANSI” as part of their names. For example, C++ is (among others) also an ANSI standard.
Only in some contexts is it (imprecisely) used to refer to the Windows encodings. But even there, as I've tried to explain, it usually doesn't refer to a specific encoding but rather to a class of encodings that Windows uses as defaults in different countries. One of these is Windows-1252.
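To see concretely why reading a Windows-1252 file as if it were UTF-8 mangles accented characters, here is a small sketch. It uses ISO-8859-1 as a stand-in for Windows-1252 (the two agree on é and è, and ISO-8859-1 is available without registering extra code-page providers on newer .NET):

```csharp
using System;
using System.Text;

class MojibakeDemo
{
    static void Main()
    {
        // "é" as a single byte (0xE9) in ISO-8859-1 (same byte in Windows-1252):
        byte[] ansiBytes = Encoding.GetEncoding("ISO-8859-1").GetBytes("é");

        // Decoding that byte as UTF-8 fails: 0xE9 starts a multi-byte sequence
        // that never completes, so the decoder substitutes U+FFFD.
        string misread = Encoding.UTF8.GetString(ansiBytes);
        Console.WriteLine(misread == "\uFFFD"); // True

        // Decoding with the matching encoding recovers the character:
        string correct = Encoding.GetEncoding("ISO-8859-1").GetString(ansiBytes);
        Console.WriteLine(correct); // é
    }
}
```

Note that on .NET Core / .NET 5+, Encoding.GetEncoding(1252) itself additionally requires Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) from the System.Text.Encoding.CodePages package.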
Try setting your console session's output code page using the chcp command. The code pages supported by Windows are here, here, and here. Remember, fundamentally the console is pretty simple: it displays Unicode or DBCS characters by using a code page to determine the glyph that will be displayed.
I do not know why it works with NUnit, but I opened the file with Notepad++ and saw "ANSI" as the format. I converted it to UTF-8 and now it works.
I am still wondering why it worked with NUnit and not in the console, but at least it works now.
Update
I don't get why I was downvoted on the question and on this answer, because the question is still a good one: why can't I read an ANSI file in a console app when NUnit can?