Windows - Can console output inadvertently cause a system beep? - c#

I have a C# console application that logs a lot to the console (using Trace). Some of the stuff it logs is the compressed representation of a network message (so a lot of that is rendered as funky non-alphabetic characters).
I'm getting system beeps every so often while the application is running. Is it possible that some "text" I am writing to the console is causing them?
(By system beep, I mean from the low-tech speaker inside the PC case, not any kind of Windows sound scheme WAV)
If so, is there any way to disable it for my application? I want to be able to output any possible text without it being interpreted as a sound request.

That's usually caused by outputting character code 7, CTRL-G, which is the BEL (bell) character.
The first thing I normally do when buying a new computer or motherboard is to ensure the wire from the motherboard to the speaker is not connected. I haven't used the speaker since the days of Commander Keen (and removing that wire is the best OS-agnostic way of stopping the sound :-).

Under HKEY_CURRENT_USER\Control Panel\Sound, set the "Beep" value to "no".

Absolutely. If you output the ASCII control code BEL (0x07) to a console, it beeps.

If you don't want it to beep, you'll either have to replace the 0x07 character before outputting it, or disable the "Beep" device driver, which you'll find in the Non-Plug and Play Drivers section, visible if you turn on the Show Hidden Devices option. Or take the speaker out.

Even if you check the input for BEL characters, it may still beep. This is due to font settings and Unicode conversion. The character in question is U+2022, Bullet.
Raymond Chen explains:
In the OEM code page, the bullet character is being converted to a
beep. But why is that?
What you're seeing is MB_USEGLYPHCHARS in reverse. Michael Kaplan
discussed MB_USEGLYPHCHARS a while ago. It determines whether certain
characters should be treated as control characters or as printable
characters when converting to Unicode. For example, it controls
whether the ASCII bell character 0x07 should be converted to the
Unicode bell character U+0007 or to the Unicode bullet U+2022. You
need the MB_USEGLYPHCHARS flag to decide which way to go when
converting to Unicode, but there is no corresponding ambiguity when
converting from Unicode. When converting from Unicode, both U+0007 and
U+2022 map to the ASCII bell character.
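To make the "replace it before outputting" advice concrete, here is a minimal C# sketch. It assumes you are happy to see control characters as visible \xNN escapes and the bullet as an asterisk. The class and method names (ConsoleSanitizer, MakeSilent) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text;

static class ConsoleSanitizer
{
    // Returns a copy of 'text' that should not ring the console bell:
    // C0 control characters (except tab, CR, LF) become visible \xNN escapes,
    // and U+2022 (bullet) becomes '*', because the bullet can be mapped to
    // the bell character when converted to the OEM code page.
    public static string MakeSilent(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (c == '\u2022')
                sb.Append('*');
            else if (c < ' ' && c != '\t' && c != '\r' && c != '\n')
                sb.Append("\\x").Append(((int)c).ToString("X2"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Typical use would be Console.WriteLine(ConsoleSanitizer.MakeSilent(message)). If the output goes through Trace, the same call could be applied to the message before it is passed to Trace.WriteLine.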

\a (the alert character, 0x07) in an output string will cause a beep, if not disabled at the OS level. Note that \b is backspace (0x08), not the bell.

Related

How to handle directory separator character in japanese and korean? [duplicate]

tl;dr: How do I ask Windows what the current directory separator character on the system is?
Different versions of Windows seem to behave differently (e.g. \ and / both work on the English versions, ¥ is apparently used on the Japanese version, ₩ is apparently used on the Korean version, etc.).
Is there any way to avoid hard-coding this, and instead ask Windows at run time?
Note:
Ideally, the solution should not depend on a high-level DLL like ShlWAPI.dll, because lower-level libraries also depend on this. So it should really depend on kernel32.dll or ntdll.dll or the like... although I'm having trouble finding anything at all, whether at a high level or at a low level.
Edit:
A little experimentation told me that it's the Win32 subsystem (i.e. kernel32.dll... or is it perhaps RtlDosPathNameToNtPathName_U in ntdll.dll? not sure, didn't test...) which converts forward slashes to backslashes, not the kernel. (Prefixing \\?\ makes it impossible to use forward slashes later in the path -- and the NT native user-mode API also fails with forward slashes.)
So apparently it's not quite "built into" Windows, but rather just a compatibility feature -- which means you can't just blindly substitute slashes instead of backslashes, because any program which randomly prefixes \\?\ to paths will automatically break on forward slashes.
I have mixed feelings on what conclusions to make regarding this, but I just thought I'd mention it.
(I tagged this as "path separator" even though that's technically incorrect because the path separator is used for separating paths, not directories (; vs. \). Hopefully people get what I meant.)
While the ₩ and ¥ characters are shown as directory separator symbols in the respective Korean and Japanese Windows versions, they are only how those versions of Windows render the same Unicode code point U+005C as a glyph. The underlying code point for backslash is still the same across English Windows and the Japanese and Korean Windows versions.
Extra confirmation for this can be found on this page: http://msdn.microsoft.com/en-us/library/dd374047(v=vs.85).aspx
Security Considerations for Character Sets in File Names
Windows code page and OEM character sets used on Japanese-language systems contain the Yen symbol (¥) instead of a backslash (\). Thus, the Yen character is a prohibited character for NTFS and FAT file systems. When mapping Unicode to a Japanese-language code page, conversion functions map both backslash (U+005C) and the normal Unicode Yen symbol (U+00A5) to this same character. For security reasons, your applications should not typically allow the character U+00A5 in a Unicode string that might be converted for use as a FAT file name.
Also, I don't know of any Windows API function that gets you the system's path separator, but you can rely on it being \ in all circumstances.
http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx#naming_conventions
The following fundamental rules enable applications to create and process valid names for files and directories, regardless of the file system:
...
Use a backslash (\) to separate the components of a path. The backslash divides the file name from the path to it, and one directory name from another directory name in a path. You cannot use a backslash in the name for the actual file or directory because it is a reserved character that separates the names into components.
...
About /
Windows should support the use of / as a directory separator in the API functions, though not necessarily in the command prompt (command.com).
Note: File I/O functions in the Windows API convert "/" to "\" as part of converting the name to an NT-style name, except when using the "\\?\" prefix as detailed in the following sections.
It's 'tough' to figure out the truth of all this, but this might be a really helpful link about / in Windows paths: http://bytes.com/topic/python/answers/23123-when-did-windows-start-accepting-forward-slash-path-separator
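As a rough illustration of the points above, the following C# sketch treats U+005C (and, for ordinary Win32 paths, U+002F) as separators and shows that the Unicode yen and won signs are different code points. The helper names are hypothetical. In managed code, System.IO.Path.DirectorySeparatorChar and Path.AltDirectorySeparatorChar expose the platform's separators if you prefer not to hard-code them.

```csharp
using System;

static class PathSeparators
{
    // True for the characters the Win32 file APIs accept between path
    // components: backslash (U+005C) and forward slash (U+002F).
    // Forward slash is NOT converted for paths that start with \\?\.
    public static bool IsWin32Separator(char c)
    {
        return c == '\\' || c == '/';
    }

    // Rewrites forward slashes as backslashes, e.g. before adding a \\?\ prefix.
    public static string ToBackslashes(string path)
    {
        return path.Replace('/', '\\');
    }
}
```

With this, IsWin32Separator('\u00A5') (yen sign) and IsWin32Separator('\u20A9') (won sign) are both false, while the character typed as ¥ or ₩ on a Japanese or Korean keyboard is really U+005C and is accepted.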
The original poster added the phrase "kernel-mode" in a comment to someone else's answer.
If the original question intended to ask about kernel mode, then it probably isn't a good idea to depend on / being a path separator. Different file systems allow different character sets on disk. Different file system drivers in Windows can also allow different character sets, which normally cannot include characters that the underlying file systems don't accept on disk, but sometimes they can behave strangely. For example, Posix mode allows a component name in an NTFS partition to contain some characters that NTFS ordinarily doesn't allow. (But obviously / isn't one of them, in Posix.)
In kernel mode in Unicode, U+005C is always a backslash and it is always the path separator. Unicode code points for yen and won are not U+005C and are not path separators.
In kernel mode in ANSI, complications arise depending on the ANSI code page. In code pages that are sufficiently similar to ASCII, 0x5C is a backslash and it is the path separator. In ANSI code pages 932 and 949, 0x5C is not a backslash, but it might be a path separator depending on where it occurs. If 0x5C is a character by itself (a single-byte character), then it's a yen sign or won sign and it is a path separator. If 0x5C is the second byte of a multibyte character, then it's not a character by itself, so it's not a yen sign or won sign and it's not a path separator. You have to start parsing from the beginning of the string to figure out whether a particular char is actually a whole character or not. Also, in Chinese and UTF-8, multibyte characters can be longer than two chars.
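The "parse from the beginning" rule can be sketched in C# as follows. This assumes code page 932 with lead-byte ranges 0x81-0x9F and 0xE0-0xFC (check these against your code page tables before relying on them). Cp932Paths and FindSeparators are illustrative names, not an existing API.

```csharp
using System;
using System.Collections.Generic;

static class Cp932Paths
{
    // Assumed lead-byte ranges for code page 932 (Shift-JIS as used by Windows).
    static bool IsLeadByte(byte b)
    {
        return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
    }

    // Returns the indexes of 0x5C bytes that are whole characters (separators),
    // skipping any 0x5C that is the trail byte of a double-byte character.
    public static List<int> FindSeparators(byte[] path)
    {
        var result = new List<int>();
        for (int i = 0; i < path.Length; i++)
        {
            if (IsLeadByte(path[i]))
                i++;                 // skip the trail byte, whatever its value
            else if (path[i] == 0x5C)
                result.Add(i);
        }
        return result;
    }
}
```

For example, the bytes 43 3A 5C 95 5C 5C 61 contain three 0x5C bytes, but the one following the lead byte 0x95 is part of a double-byte character, so only the 0x5C bytes at indexes 2 and 5 are separators.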
The standard forward slash (/) has always worked in all versions of DOS and Windows. If you use it, you don't have to worry about issues with how the backslash is displayed on Japanese and Korean versions of Windows, and you also don't have to special-case the path separator for Windows as opposed to POSIX (including Mac). Just use forward slash everywhere.

µ and é in namespace

We have developed a C# program. The program is distributed in Europe without problems on various hardware configurations. Some of the namespaces in our program contain a 'µ' or an 'é' character. When deploying our program on 'non-European' systems, i.e. in China or on some US systems, a problem occurs: somewhere in the process the 'µ' is changed into 'Âµ', causing lots of problems. What is causing this problem, and how can we work around it (preferably without changing the name of the namespace)?
edit 2015.08.07
Thanks all for your comments, but to clarify: the source files are not distributed as such. The program is compiled to an exe and then distributed using NSIS. Source control is done using SVN. How can I verify the presence of the BOM in my source files?
Either you or the recipient or both are using a character encoding other than UTF-8.
People shouldn't do that, but they do.
Some tools will default to a legacy encoding unless you include a BOM at the start of each file, so include a BOM at the start of each file.
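One way to verify whether a source file starts with a UTF-8 BOM is to look at its first three bytes (EF BB BF). The C# sketch below does that. BomCheck and its method names are invented for this example.

```csharp
using System;
using System.IO;

static class BomCheck
{
    // True if the buffer starts with the UTF-8 byte order mark EF BB BF.
    public static bool StartsWithUtf8Bom(byte[] bytes)
    {
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }

    // Reads only the first three bytes of the file and checks them.
    public static bool FileHasUtf8Bom(string path)
    {
        var head = new byte[3];
        using (var stream = File.OpenRead(path))
        {
            int read = stream.Read(head, 0, head.Length);
            if (read < 3) return false;   // file too short to hold a BOM
        }
        return StartsWithUtf8Bom(head);
    }
}
```

Running FileHasUtf8Bom over every .cs file in the working copy (for instance with Directory.GetFiles and a "*.cs" pattern) would list the files that a tool might misread as a legacy encoding.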
You are hitting a difference in the character sets used by the different systems. Your software was probably running on systems assuming ISO-8859, most often used in European languages, while the Chinese and US systems you are encountering are probably using the Universal Character Set (ISO/IEC 10646). The mapping between the two is not a simple 1-to-1, so you run into the problems you are having. W3.org has a good article on this topic at http://www.w3.org/International/articles/definitions-characters/
Pay special attention to the sections on "Character sets, coded character sets, and encodings", and "The Document Character Set". If this is a web app, "The HTTP Header" might be particularly useful.
This isn't an answer. However I want to point out that this might be an encoding problem but this can happen on the same system and show up only when running code that (guess here) reads a byte at a time, as opposed to explicitly reading text of a particular encoding.
I have a C program (32 bit if that matters) which reads in a file using fgetc and saves characters to be used as "illegal" characters in names. It isn't fancy, just to prevent a few ASCII characters from coming in accidentally, like an ' (apostrophe) in the name of a thing/object/label. Someone asked me to test µ (mu, appears as a single character in this interface to stackoverflow). I generated this (without examining the underlying encoding in MS Word) using Insert-Symbol in MS Word. I cut it from MS Word and inserted it into a text file using Notepad++. In Notepad++ and MS Word, it seems to be the same symbol. BUT fgetc (taking one int or char at a time, however you like to think of it) sees, in my debug output for a test case:
About to check for illegal characters in =>NameOfItemµ<=
Illegal character =>Â<= was found. Illegal characters are: '`µ
Illegal character =>µ<= was found. Illegal characters are: '`µ
I am compiling with Visual C++ Express 2013.
I'm happy it catches the illegal characters, and hope this isn't just noise to readers of this topic.
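The two-character result in that debug output is what you get when UTF-8 bytes are interpreted one byte at a time as a single-byte encoding. A small C# sketch reproduces the effect, assuming the file was saved as UTF-8 and read as ISO-8859-1. MojibakeDemo is a made-up name.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Encodes 'text' as UTF-8, then (incorrectly) decodes the bytes as
    // ISO-8859-1, which is effectively what a byte-at-a-time reader or a
    // tool assuming a legacy single-byte encoding does.
    public static string Utf8ReadAsLatin1(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8);
    }
}
```

U+00B5 (µ) is encoded in UTF-8 as the two bytes C2 B5, so Utf8ReadAsLatin1("\u00B5") yields the two characters U+00C2 U+00B5, that is "Âµ". The same mechanism turns "é" (C3 A9) into "Ã©".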

C# datagridview unicode character encoding

I cannot sort out the following problem:
I use a datagridview column to tell the user if the item of that row has already been processed. A little Unicode icon should suffice, I thought, so I went for U+2714 (check mark) and U+2715 (cross) to achieve what I wanted. For the datatable...
row["Done"] = (listProcessed.Contains(file.FullName)) ? "\u2714" : "\u2715";
It works well in debug and release mode on my development machine, but it fails on a Windows XP virtual machine. On that one, only narrow squares are shown, just as if it didn't know the characters.
I read somewhere that it might be due to line endings, so I tried to apply TrimEnd(null) to the strings, but that did not help.
Is there a way to make this work on Windows XP? What exactly is going wrong?
Thanks in advance.
That means that the Windows XP machine is using a font that does not contain those characters.
Use charmap to see if you can find a font which does. (try Arial Unicode MS)

Printing a line instead of "--------"

For a Windows CE project in which we print slips, we have a new request asking whether it is possible to print a line instead of printing "-----------" all the way across.
Is this possible without printing an image?
c# / .net 3.5
Thank you
On your desktop run charmap.exe. Tick "Advanced view" and type "box" in the Search box. You'll get the Unicode codepoints that you can use to draw lines and boxes. Copy and paste them into your code. Whether they actually show up properly on your device depends on the font support. Odds are decent since they've been around since the first IBM PC. You'll have to try.
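Instead of pasting the glyphs, the code points can also be written as escapes. The following C# sketch builds a horizontal rule and a boxed title from the box-drawing block (U+2500 and neighbours). SlipFormatting and its methods are hypothetical names, and whether the output prints correctly still depends on the printer's font.

```csharp
using System;

static class SlipFormatting
{
    // Builds a horizontal rule of 'width' box-drawing characters (U+2500).
    public static string HorizontalRule(int width)
    {
        return new string('\u2500', width);
    }

    // Returns three lines that draw a box around 'title',
    // with one space of padding on each side.
    public static string[] BoxTitle(string title)
    {
        string bar = HorizontalRule(title.Length + 2);
        return new[]
        {
            "\u250C" + bar + "\u2510",       // top-left corner, rule, top-right corner
            "\u2502 " + title + " \u2502",   // vertical bars around the title
            "\u2514" + bar + "\u2518",       // bottom-left corner, rule, bottom-right corner
        };
    }
}
```

HorizontalRule(32) would replace a 32-character run of dashes on the slip.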
There are extended ASCII values to do this (196), but it really depends on the printer.
Or as quppa comments use _ but it will not be adequate if you want to box in a title or so.
Wikipedia has an article on box-drawing characters.
Since ─ (U+2500) didn't work for you, it's unlikely ━ (U+2501) will work either, but it's perhaps worth a shot. There is also no guarantee that there won't be spaces between these characters, given that spaces appear between underscores.
The issue is not Windows CE supporting Unicode but finding a font that you can use that has the box-drawing characters. Given the likely size limitations (fonts with lots of characters are tens of megabytes big), this might be a challenge.

Localization: How to map culture info to a script name or Unicode character range?

I need some information about localization. I am using .net 2.0 with C# 2.0 which takes care of most of the localization related issues. However, I need to manually draw the alphabets corresponding to the current culture on the screen in one particular screen.
This would be similar to the Contacts screen in Microsoft Outlook (Address Cards view or Detailed Address Cards View under Contacts), and so it needs a column of buttons at the right end, one for each alphabet.
I am trying to emulate that, but I don't want to ask the user to choose the script. If the current culture is say, Chinese, I want to draw Chinese alphabets. When the user changes the culture info to English (and when he restarts the application) I want to draw English alphabets instead. Hope you understand where I am going with this query.
I can determine the culture of the current user (Application.CurrentCulture or System.Globalization.CultureInfo.CurrentCulture will give the culture related information). I also have all the scripts to render the alphabets. However, the problem is that I don't know how to map the culture info to the name of a script.
In other words, is there a way to determine the script name corresponding to a culture? Or is it possible to determine the range of Unicode character values corresponding to a culture? Either of them would allow me to render the alphabets on the button properly.
Any suggestions or guidance regarding this is truly appreciated. If there is something fundamentally wrong with my approach (or with what I am trying to achieve), please point out that as well. Thanks for your time.
PS: I know the easiest solution is to either configure the script name as part of user preferences or display a list of languages for the user to choose from (a la Contact in Outlook 2007). But I am just trying to see whether I can render the alphabets corresponding to the culture without the user having to do anything.
In native code there's LOCALE_SSCRIPTS for GetLocaleInfoEx() (Vista & above) that shows you what scripts are expected for a locale. There isn't a similar concept for .Net at this time.
Chinese has thousands of characters, so it might not be feasible to show all the characters in their character set. There's no native concept of 'alphabet' in Chinese, and I don't think Chinese has a syllabary like Japanese does.
Pinyin (Chinese written in roman alphabet) can be used to represent the Chinese characters, and that might help you index them. I know this doesn't answer your question, but I hope it's helpful.
I fully agree with mikiemacman. In addition, a given language doesn't necessarily use all the letters of a script.
Anyway, the closest I can think of is CultureInfo.TextInfo.ANSICodePage. There are only a handful of ANSI code pages. You could create a table (or a switch() statement, whatever) that lists the script for each ANSI code page.
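A table of that kind might look like the C# sketch below. The script labels are illustrative choices of mine, not values returned by .NET; the code page numbers are the usual Windows ANSI code pages. The collection initializer needs C# 3 or later; on C# 2.0 the entries would be added with Add calls instead.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class ScriptLookup
{
    // Windows ANSI code page -> script label (labels are illustrative).
    static readonly Dictionary<int, string> ScriptByAnsiCodePage = new Dictionary<int, string>
    {
        { 874,  "Thai" },
        { 932,  "Japanese" },
        { 936,  "Chinese (Simplified)" },
        { 949,  "Korean" },
        { 950,  "Chinese (Traditional)" },
        { 1250, "Latin" },
        { 1251, "Cyrillic" },
        { 1252, "Latin" },
        { 1253, "Greek" },
        { 1254, "Latin" },
        { 1255, "Hebrew" },
        { 1256, "Arabic" },
        { 1257, "Latin" },
        { 1258, "Latin" },
    };

    // Returns the script label for a code page, or "Unknown" if it is not listed.
    public static string ScriptForCodePage(int ansiCodePage)
    {
        string script;
        return ScriptByAnsiCodePage.TryGetValue(ansiCodePage, out script) ? script : "Unknown";
    }

    // Convenience wrapper around CultureInfo.TextInfo.ANSICodePage.
    public static string ScriptForCulture(CultureInfo culture)
    {
        return ScriptForCodePage(culture.TextInfo.ANSICodePage);
    }
}
```

ScriptForCulture(CultureInfo.CurrentCulture) would then drive the choice of letters. Cultures whose code page is not in the table fall through to "Unknown", so a fallback is still needed.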
Proto, wait! There's a much more accurate solution. It's an unmanaged one, hence you may have to P/Invoke.
GetLocaleInfoW(MAKELCID(wLangId, SORT_DEFAULT), LOCALE_FONTSIGNATURE, wcBuf, MAXWCBUF);
This gives you a LOCALESIGNATURE structure. The answer is in the lsUsb field: Unicode subsets bitfield. Rats! The MS page for this structure is empty. But look it up in your MSDN copy. It's fully documented there: a whole set of flags that describe which scripts are supported. And yes, there's a flag for Tamil ;-)
HTH.
EDIT: Oops! Hadn't seen Shawne's answer. Wow! Answer from an in-house expert! ;-) Anyway, you may still be interested in a Pre-Vista compatible answer.
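Once the lsUsb words have been obtained through P/Invoke (not shown here), decoding the bitfield is plain bit testing. The C# sketch below assumes lsUsb arrives as four 32-bit words and lists only a handful of bits. The bit numbers are my reading of the Unicode subset bitfield table and should be checked against the documentation; UnicodeSubsets and Decode are made-up names.

```csharp
using System;
using System.Collections.Generic;

static class UnicodeSubsets
{
    // A few Unicode subset bit numbers (assumed; verify against the
    // Unicode subset bitfield documentation before relying on them).
    static readonly KeyValuePair<int, string>[] KnownBits =
    {
        new KeyValuePair<int, string>(0,  "Basic Latin"),
        new KeyValuePair<int, string>(7,  "Greek"),
        new KeyValuePair<int, string>(9,  "Cyrillic"),
        new KeyValuePair<int, string>(11, "Hebrew"),
        new KeyValuePair<int, string>(13, "Arabic"),
        new KeyValuePair<int, string>(15, "Devanagari"),
        new KeyValuePair<int, string>(20, "Tamil"),
        new KeyValuePair<int, string>(24, "Thai"),
    };

    // lsUsb is the 128-bit field as four 32-bit words; bit n lives in
    // word n / 32 at position n % 32.
    public static List<string> Decode(uint[] lsUsb)
    {
        var names = new List<string>();
        foreach (var entry in KnownBits)
        {
            int word = entry.Key / 32;
            int bit = entry.Key % 32;
            if ((lsUsb[word] & (1u << bit)) != 0)
                names.Add(entry.Value);
        }
        return names;
    }
}
```

A locale whose signature has bits 0 and 20 set would decode to "Basic Latin" and "Tamil".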
Fascinating topic. While it might not answer your question, Omniglot is a good resource.
The correct answer is likely to be complex and to depend on the exact problem you're solving. Assuming your goal is to show only the letters used in a particular language to separate phonebook sections (as in Outlook), a few of the issues are:
People who have contact names spanning several scripts/languages.
2-glyph letters (e.g. 'Lj' in Serbian). It is one phoneme, always treated as a single letter although it has 2 Unicode symbols. It would have its own section in the phonebook (separate from 'L').
Too many glyphs to list (e.g. Chinese)
Unorthodox ordering (e.g. Thai -- a phone book would be separated by consonants only, ignoring the vowels).
Uppercase / lowercase distinction (presumably you'd only want one case for languages that support it, which breaks down in minor ways, e.g. with the Turkish 'i').
