Why does the Notepad++ [NULL] character not paste? - c#

I am new to this site, and I don't know if I am providing enough info - I'll do my best =)
If you use Notepad++, then you will know what I am talking about -- When a user loads a .exe into Notepad++, the NUL / \x0 character is replaced by NULL, which has a black background, and white text. I tried pasting it into Visual Studio, hoping to obtain the same output, but it just pasted some spaces...
Does anyone know if this is a certain key-combination, or something? I would like to put the NULL character in replacement of \x0, just like Notepad++ =)

Notepad++ is a rich text editor unlike your regular notepad. It can display custom graphics so common in all modern text editors. While reading a file whenever notepad++ encounters the ASCII code of a null character then instead of displaying nothing it adds the string "NULL" to the UI setting the text background colour to black and text colour to white which is what you are seeing. You can show any custom style in your rich text editor too.
NOTE: This is by no means an efficient solution. I'm clearly traversing a read string 2 times just to take benefit of already present methods. This can be done manually in a single pass. It is just to give a hint about how you can do it. Also I wrote the code carefully but haven't ran it because I don't have the tools at the moment. I apologise for any mistakes let me know I'll update it
Step 1 : Read a text file by line (line ends at '\n') and replace all instances of null character of that line with the string "NUL" using the String.Replace(). Finally append the modified text to your RichTextBox.
Step 2 : Re traverse your read line using String.IndexOf() finding start indexes of each "NUL" word. Using these indexed you select text from RichTextBox and then style that selected text using RichTextBox.SelectionColor and RichTextBox.SelectionBackColor
richTextBoxCursor basically just represents the start index of each line in RichTextBox
StreamReader sr = new StreamReader(#"c:\test.txt" , Encoding.UTF8);
int richTextBoxCursor = 0;
while (!sr.EndOfStream){
richTextBoxCursor = richTextBox.TextLength;
string line = sr.ReadLine();
line = line.Replace(Convert.ToChar(0x0).ToString(), "NUL");
richTextBox.AppendText(line);
i = 0;
while(true){
i = line.IndexOf("NUL", i) ;
if(i == -1) break;
// This specific select function select text start from a certain start index to certain specified character range passed as second parameter
// i is the start index of each found "NUL" word in our read line
// 3 is the character range because "NUL" word has three characters
richTextBox.Select(richTextBoxCursor + i , 3);
richTextBox.SelectionColor = Color.White;
richTextBox.SelectionBackColor = Color.Black;
i++;
}
}

Notepad++ may use custom or special fonts to show these particular characters. This behavior also may not appropriate for all text editors. So, they don't show them.
If you want to write a text editor that visualize these characters, you probably need to implement this behavior programmatically. Seeing notepad++ source can be helpful If you want.

Text editor
As far as I know in order to make Visual Studio display non printable characters you need to install an extension from the marketplace at https://marketplace.visualstudio.com.
One such extension, which I have neither tried nor recomend - I just did a quick search and this is the first result - is
Invisible Character Visualizer.
Having said that, copy-pasting binaries is a risky business.
You may try Edit > Advanced > View White Space first.
Binary editor
To really see what's going on you could use the VS' binary editor: File->Open->(Open with... option)->Binary Editor -> OK

To answer your question.
It's a symbolic representation of 00H double byte.
You're copying and pasting the values. Notepad++ is showing you symbols that replace the representation of those values (because you configured it to do so in that IDE).

Related

RichTextBox not showing formfeed character

I have a text file that, when I open it in Notepad, shows the form feed character (byte 12). I want to show this character in my richtextbox but no matter which encoding I use when I read the text file it won't show. When I enter the character myself it shows. When I do myRTB.Text = "♀" it shows, but when I do
myRTB.Text = File.ReadAllText(myFileName.txt);
it doesn't show. I've also tried using the readers in the Encoding class to no avail.
How can I show the form feed character in my rtb?
Firstly, a line feed has a value of 13. If you have characters with the value 12 in there then they are not line feeds.
As for your issue, ReadAllLines reads the lines of a file into a String array, thus stripping out all the line breaks. You might do as Damith suggests and call ReadAllText, which reads the file contents as a single String, and assign the result to the Text property or else call ReadAllLines and assign the result to the Lines property. Better to call LoadFile on the RichTextBox itself though.
try with ReadAllText
myRTB.Text = File.ReadAllText(myFileName.txt, Encoding.Unicode);
Thanks for the help #jmcilhinney and #Damith. I ended up cheating the system by doing a dirty. I saw that myRTB was replacing the form feed char with \page in the RTF, but when I typed the form feed char myself it put \u9792. Therefore I went with the hack:
myRTB.Rtf = myRTB.Rtf.Replace("\\page", "\\u9792");
If you have something less hackish that I can get working please let me know.

Text de-justification to get correct substring

I have a text file which is made up with justify (all tabs aligned - different size).
Therefore I can't get the desired value at a certain column (substring).
Since this is a migration I can't change the format of the file.
How can I "de-justify" the text to spaces while preserving the spacing length, any scripts out there?
I need the upper value. replacing \t with a fixed value doesn't help.
*EDIT: files seems to be formatted with fmt
*EDIT2: Solution found it seems; when I use fmt on windows (coreutils) it stays the same.
However on my mac I get the desired result (maybe something in the win command not setup right).
fmt original_file >> new.txt
Grts
There's probably lots of other ways to do this, but sublime text has built in tab to space conversion.
http://css-tricks.com/changing-spaces-tabs-sublime-text/

Outputting Programmatically to MSword; sensing end of line

I'm trying to use the MSWord Interop Library to write a C# application that outputs specially formated text (isolated arabic letters) to a file. The problem I'm running into is determining how many characters remain before the text wraps onto a new line. I need the words to be on the same line, without wrapping, which is the default behavior. I'm finding this difficult because when I have the Arabic letters of the word isolated with spaces, they are treated as individual characters and therefore behave differently then connected words.
Any help is appreciated. Thanks.
Add each character to your range and then check the number of lines in the range
LineCount = range.ComputeStatistics(Word.WdStatistic.wdStatisticLines);
When the line count changes, you know it has been wrapped, and can remove the last character or reformat accordingly
Actually I don't know how this behaves today, but I've written something for the MSWork API when I was facing a somewhat weird fact. Actually you can't find that out. In MSWord, text in a document is always in paragraphs.
If you input text to your document, you won't get it in a page only, but this page will at least contain a paragraph for the text you wrote into it.
Unfortunately I can't figure this out again, because I don't have a license for MS Word these day.
Give it a try and look at the problem again in this way.
Hope this helps, and if not, please provide the code that generates the input and the exact version of MSWord.
Greetings,
Kjellski
I'm not sure what "Arabic letters of the word isolated with spaces" means exactly, but I assume that non breaking space is what you need.
Here's more details.

Delete Lines From Beginning of Multiline Textbox in C#

Is there a graceful way in C# to delete multiple lines of text from the beginning of a multiline textbox? I am using Microsoft Visual C# 2008 Express Edition.
EDIT - Additional Details
The multiline textbox in my application is disabled (i.e. it is only editable by the application itself), and every line is terminated with a "\r\n".
This is an incomplete question. So assuming you are using either TextBox or RichTextBox you can use the Lines property found inTextBoxBase.
//get all the lines out as an arry
string[] lines = this.textBox.Lines;
You can then work with this array and set it back.
this.textBox.Lines= newLinesArray;
This might not be the most elegant way, but it will remove the first line.
EDIT: you don't need select, just using skip will be fine
//number of lines to remove from the beginning
int numOfLines = 30;
var lines = this.textBox1.Lines;
var newLines = lines.Skip(numOfLines);
this.textBox1.Lines = newLines.ToArray();
This solution works for me in WPF:
while (LogTextBox.LineCount > Constants.LogMaximumLines)
{
LogTextBox.Text = LogTextBox.Text.Remove(0, LogTextBox.GetLineLength(0));
}
You can replace LogTextBox with the name of your text box, and Constants.LogMaximumLines with the maximum number of lines you would like your text box to have.
Unfortunately, no, there is no "elegant" way to delete lines from the text of a multiline TextBox, regardless of whether you are using ASP.NET, WinForms, or WPF/Silverlight. In every case, you build a string that does not contain the lines you don't want and set the Text property.
WinForms will help you a little bit by pre-splitting the Text value into lines, using the Lines property, but it's not very helpful because it's a string array, and it's not exactly easy to delete an element of an array.
Generally, this algorithm will work for all possible versions of the TextBox class:
var lines = (from item in myTextBox.Text.Split('\n') select item.Trim());
lines = lines.Skip(numLinesToSkip);
myTextBox.Text = string.Join(Environment.Newline, lines.ToArray());
Note: I'm using Environment.Newline specifically for the case of Silverlight on a Unix platform. For all other cases, you're perfectly fine using "\r\n" in the string.Join call.
Also, I do not consider this an elegant solution, even though it's only 3 lines. What it does is the following:
splits the single string into an array of strings
iterates over that array and builds a second array that does not include the lines skipped
joins the array back into a single string.
I do not consider it elegant because it essentially builds two separate arrays, then builds a string from the second array. A more elegant solution would not do this.
One thing to keep in mind is that the Lines collection of the TextBox does not accurately reflect what the user sees as lines. The Lines collection basically works off of carriage returns, whereas the user could see lines wrapping from one line to the next without a carriage return. This may or may not be the behavior you want.
For example, the user would see the below as three lines, but the Lines collection will show 2 (since there are only 2 carriage returns):
This is line number
one.
This is line 2.
Also, if the form, and the text control are resizable the visible lines in the text will change as the control grows or shrinks.
I wrote a blog post several years ago on how to determine the number of lines in the textbox as the user sees them and get the index of a given line (like to get the line at index: http://ryanfarley.com/blog/archive/2004/04/07/511.aspx, perhaps this post will help.
if (txtLog.Lines.Length > maxNumberLines)
{
txtLog.Lines = txtLog.Lines.Skip(txtLog.Lines.Length - maxNumberLines).ToArray();
}

Reading line by line

I have a program that generates a plain text file. The structure (layout) is always the same. Example:
Text File:
LinkLabel
"Hello, this text will appear in a LinkLabel once it has been
added to the form. This text may not always cover more than one line. But will always be surrounded by quotation marks."
240, 780
So, to explain what is going on in that file:
Control
Text
Location
And when a button on the Form is clicked, and the user opens one of these files from the OpenFileDialog dialog, I need to be able to Read each line. Starting from the top, I want to check to see what control it is, then starting on the second line I need to be able to get all text inside the quotation marks (regardless of whether is is one line of text or more), and on the next line (after the closing quotation mark), I need to extract the location (240, 780)... I have thought of a few ways of going about this but when I go to write it down and put it to practice, it doesn't make much sense and end up figuring out ways that it won't work.
Has anybody ever done this before? Would anybody be able to provide any help, suggestions or advice on how I'd go about doing this?
I have looked up CSV files but that seems too complicated for something that seems so simple.
Thanks
jase
You could use a regular expression to get the lines from the text:
MatchCollection lines = Regex.Matches(File.ReadAllText(fileName), #"(.+?)\r\n""([^""]+)""\r\n(\d+), (\d+)\r\n");
foreach (Match match in lines) {
string control = match.Groups[1].Value;
string text = match.Groups[2].Value;
int x = Int32.Parse(match.Groups[3].Value);
int y = Int32.Parse(match.Groups[4].Value);
Console.WriteLine("{0}, \"{1}\", {2}, {3}", control, text, x, y);
}
I'll try and write down the algorithm, the way I solve these problems (in comments):
// while not at end of file
// read control
// read line of text
// while last char in line is not "
// read line of text
// read location
Try and write code that does what each comment says and you should be able to figure it out.
HTH.
You are trying to implement a parser and the best strategy for that is to divide the problem into smaller pieces. And you need a TextReader class that enables you to read lines.
You should separate your ReadControl method into three methods: ReadControlType, ReadText, ReadLocation. Each method is responsible for reading only the item it should read and leave the TextReader in a position where the next method can pick up. Something like this.
public Control ReadControl(TextReader reader)
{
string controlType = ReadControlType(reader);
string text = ReadText(reader);
Point location = ReadLocation(reader);
... return the control ...
}
Of course, ReadText is the most interesting one, since it spans multiple lines. In fact it's a loop that calls TextReader.ReadLine until the line ends with a quotation mark:
private string ReadText(TextReader reader)
{
string text;
string line = reader.ReadLine();
text = line.Substring(1); // Strip first quotation mark.
while (!text.EndsWith("\"")) {
line = reader.ReadLine();
text += line;
}
return text.Substring(0, text.Length - 1); // Strip last quotation mark.
}
This kind of stuff gets irritating, it's conceptually simple, but you can end up with gnarly code. You've got a comparatively simple case:one record per file, it gets much harder if you have lots of records, and you want to deal nicely with badly formed records (consider writing a parser for a language such as C#.
For large scale problems one might use a grammar driven parser such as this: link text
Much of your complexity comes from the lack of regularity in the file. The first field is terminated by nwline, the second by delimited by quotes, the third terminated by comma ...
My first recomendation would be to adjust the format of the file so that it's really easy to parse. You write the file so you're in control. For example, just don't have new lines in the text, and each item is on its own line. Then you can just read four lines, job done.

Categories

Resources