I'm currently beating my head against a wall trying to figure this out. Long story short, I'd like to convert the text between two '\u0002' characters to bold formatting. This is for an IRC client that I'm working on, so I've been running into these quite a bit. I've tried regex and found that matching \'02 in the RTF works to catch it, but I'm not sure how to match the last character and change it to \bclear or whatever the RTF formatting close is.
I can't exactly paste the text I'm trying to parse because the characters get filtered out of the post. But when looking at the char value, it's an int of 2.
Here's an attempt to paste the offending text:
[02:34] test test
You could use either
rtb.Rtf = Regex.Replace(rtb.Rtf, @"\\'02\s*(.*?)\s*\\'02", @"\b $1 \b0");
or
rtb.Rtf = Regex.Replace(rtb.Rtf, @"\\'02\s*(.*?)\s*\\'02", @"\'02 \b $1 \b0 \'02");
depending on whether you want to keep the \u0002s in there.
The \b and \b0 turn the bold on and off in RTF.
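The same idea can be applied to the raw IRC line before it reaches the control. This is only a minimal sketch, assuming each pair of \u0002 characters brackets one bold run; the names IrcRtf and BoldToRtf are made up for illustration, and the sketch does not escape other RTF-special characters such as \, { and }.

```csharp
using System;
using System.Text.RegularExpressions;

public static class IrcRtf
{
    // Hypothetical helper: wraps the text between two \u0002 characters
    // in the RTF bold-on (\b) and bold-off (\b0) control words.
    public static string BoldToRtf(string line)
    {
        // "\u0002" is a C# escape, so the regex sees the raw control character.
        // The space after \b and \b0 is the RTF control-word delimiter.
        return Regex.Replace(line, "\u0002(.*?)\u0002", @"\b $1\b0 ");
    }
}
```

The result could then be wrapped in an RTF header, for example @"{\rtf1\ansi " + IrcRtf.BoldToRtf(line) + "}", before assigning it to rtb.Rtf.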
I don't have a test case, but you could also probably use the Clipboard class's GetText method with the Unicode TextDataFormat. Basically, I think you could place the input in the clipboard and get it out in a different format (works for RTF and the like). Here's MS's demo code (not applicable directly, but demonstrates the API):
// Demonstrates SetText, ContainsText, and GetText.
public String SwapClipboardHtmlText(String replacementHtmlText)
{
    String returnHtmlText = null;
    if (Clipboard.ContainsText(TextDataFormat.Html))
    {
        returnHtmlText = Clipboard.GetText(TextDataFormat.Html);
        Clipboard.SetText(replacementHtmlText, TextDataFormat.Html);
    }
    return returnHtmlText;
}
Of course, if you do that, you probably want to save and restore what was in the clipboard, or else you may upset your users!
OK, so I need to design a regex to insert dashes. I'm tasked with building a web API function that returns a specifically formatted string based upon input parameters. For some reason that hasn't been made clear to me, the source data isn't properly formatted, and I need to reformat the data with dashes in the correct place.
Depending on the first two characters and the string length, there is an optional third dash. Fortunately I'm not concerned with what those characters are. This system is a passthrough, so garbage in, garbage out. However, I do need to make sure the dashes are spaced appropriately based on length.
Structure Types
XX-9999999999-XX AB
XX-9999999999-99 CD, EF
XX-9999999999-XXX-99 GH
XX-9999999999-XX-99 IJ, KL
For Example:
AB123456789044 should be AB-01234567890-44 and
GH1234567890YYY99 becomes GH-01234567890-YYY-99.
Thus far I've gotten to this point.
^(\w\w)(\d{10})(\w{2,3})(\d\d)?$
Which leads to my Question(s)
1) I'm attempting to replace with $1-$2-$3-$4. However, whenever there is a fourth section of digits, as is the case with IJ, it's hard to distinguish between that and AB in the replace.
I've gotten GH-01234567890-YY-99 and GH-01234567890-YY-.
How do I reference a conditional capture group in a replace string such that the dash relating to it only shows up if the grouping exists?
The problem is that you need conditional replacements, and C# doesn't support those. So you've got to do the replacements programmatically. Something like:
string resultString = null;
try {
    Regex regexObj = new Regex(@"([A-Z]{2})-?(\d{10})-?(?:([A-Z]{2,3})|(\d{2}))-?(\d{2})?", RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);
    resultString = regexObj.Replace(subjectString, new MatchEvaluator(ComputeReplacement));
} catch (ArgumentException ex) {
    // Error handling
}
public String ComputeReplacement(Match m) {
    // Vary the replacement text in C# as needed. This return value is a placeholder:
    // a MatchEvaluator's result is inserted literally, so "$1" is not expanded here.
    // Build the string from m.Groups instead.
    return "$1-$2-$3-$4-$5";
}
I haven't paid too much attention to the actual RegEx here, as it seems like you know what you're doing with it. I just included some conditional hyphens in case the data are quite dirty (partially formatted). Obviously you have to edit the "return" part of this, using conditionals in case any of the captures are blank. I haven't worked out that logic for you, as C# isn't my strength.
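One way the "return" part might be filled in is sketched below. It reuses the pattern from the question rather than the one above, and joins only the groups that actually matched, so the dash for the optional group disappears with it. The names DashFormatter and Format are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DashFormatter
{
    // Pattern taken from the question; the last group is optional.
    private static readonly Regex Pattern =
        new Regex(@"^(\w\w)(\d{10})(\w{2,3})(\d\d)?$");

    // Hypothetical helper: returns the input unchanged when it does not match.
    public static string Format(string input)
    {
        return Pattern.Replace(input, m =>
            string.Join("-", m.Groups.Cast<Group>()
                .Skip(1)                 // group 0 is the whole match
                .Where(g => g.Success)   // drop the optional group when absent
                .Select(g => g.Value)));
    }
}
```

With this sketch, DashFormatter.Format("GH1234567890YYY99") gives GH-1234567890-YYY-99, and an input without the trailing two digits gets no trailing dash.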
I use this function for updating a RichTextBox in cross thread situations
public void AddRtf(string text)
{
    // cross thread allowed
    if (rtb.InvokeRequired)
    {
        rtb.Invoke((MethodInvoker)delegate()
        {
            AddRtf(text);
        });
    }
    else
    {
        rtb.Rtf = @"{\rtf1\ansi This is in \b bold\b0.}"; // this works
        rtb.Rtf = @"{\rtf1\ansi This "+text+"is in \b bold\b0.}"; // this not
    }
}
However, it is not working: I can't see the RTF formatting when passing the "text" argument.
What could the problem be?
In fact, I need a simple solution to update a RichTextBox with COLOR, BOLD, UNDERLINE and some URLs inside a text. I wrote some functions for that, such as rtb.AddLink(), .AddBold() and so on, including a nice extension for adding URLs, but it seems more logical to pass RTF and let the control update the formatting. But this will force me to break the text at each point where I need something in BOLD or whatever.
I think that HTML would be more convenient, but I need a simple parser, at least simpler than HTMLAgilitypack.
So I could simply write in one line:
log.write("<font color="red">This is error</font> and this is the link... etc")
Does anyone have a simple solution for this?
You need to escape the \ in the second part of the string:
@"{\rtf1\ansi This "+text+"is in \\b bold\\b0.}"
                                 ^^      ^^
or use an @ again
@"{\rtf1\ansi This "+text+@"is in \b bold\b0.}"
                          ^
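To see why the original line fails, the two variants can be compared side by side. This is a small sketch with made-up names (RtfStrings, Build, Broken): in a regular string literal "\b" is the backspace character (U+0008), while in a verbatim literal it stays a backslash followed by 'b'.

```csharp
using System;

public static class RtfStrings
{
    // Both literals are verbatim, so \b reaches the control as an RTF control word.
    public static string Build(string text)
    {
        return @"{\rtf1\ansi This " + text + @"is in \b bold\b0.}";
    }

    // Second literal is a regular string: "\b" is compiled to a backspace character.
    public static string Broken(string text)
    {
        return @"{\rtf1\ansi This " + text + "is in \b bold\b0.}";
    }
}
```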
I know this is kind of an easy question, but I can't seem to find it anywhere. Does anyone out there know how to create a soft return inside a piece of text using C#.NET?
I need to print a soft return to a text file/XML file. This text file will be generated using C#.NET. You can verify whether the answer is correct if you use Notepad++ and enable the option "View > Show Symbol > Show End of Line"; then you will see a symbol like this:
Thanks in advance :)
Not sure what you mean by a soft return. A quick Google search says it's a non-stored line break typically due to word wrapping in which case you wouldn't actually put this in a string, it would only be relevant when the string was rendered for display.
To put a carriage return and/or line feed in the string you would use:
string s = "line one\r\nline two";
And for further reference, here are the other escape codes that you can use.
Link (MSDN Blogs)
In response to your edit
The LF that you see can be represented with \n in a string. Obviously you have a specific line ending sequence that you need to represent. If you were to use Environment.NewLine that is going to give you different results on different platforms.
var message = $"Tom{Convert.ToChar(10)}Harry";
Results in:
Tom
Harry
With just a line feed between.
As already mentioned, you can use Environment.NewLine, but I am not sure if that is what you want or if you are actually trying to append an ASCII 141 to your string, as mentioned in the comments.
You can add characters to your string by their character code like this.
var myString = new StringBuilder("Foo");
myString.Append((char)141);
I need an "idea" on how to read text file data between quotes. For example:
line 1: "read a title"
line 2: "read a descr"
line 1: "read a title"
line 2: "read a descr"
I want to do a foreach type of thing, and I want to read all Line 1's, and Line 2's as a pair, but between the ".
In my program I am going to output (foreach of course):
readTerminatedNull(file1);
readTerminatedNull(file2);
I would read line by line, but some of the text could be:
line 1: "read a super long
title that goes off"
line 2: "read a descr"
So that's why I want to read between the ".
Sorry if that is too complicated, and it's a little hard to explain.
Edit:
Thanks for all the feedback guys, but I'm not sure you are getting what I am trying to do :p Not your fault, I wrote this kinda weird.
I will have a text file full of references and text, like so.
text inside:
Refren: "myrefrence_1"
String: "This is a string of a refrence"
Refren: "myrefrence_2"
String: "hello world"
Refren: "myrefrence_3"
String: "I like cookies."
I want it to read myrefrence_1 in the quotes of the first line, and then read the string in the next line between the ".
I will then feed these into my program, which matches the reference with the string.
But sometimes the text will be more than one line.
Refren: "this is text that goes and then
return keys on some parts."
and I still want it to read through the ".
(not tested, but you'll get the idea)
// Read all text from file
string sData = File.ReadAllText(@"c:/file.txt");
// Match strings between " " (any characters except a quote, including line breaks)
MatchCollection matches = Regex.Matches(sData, "\"([^\"]*)\"");
// Group 1 holds the text with the surrounding " already stripped
foreach (Match match in matches) {
    string sResult = match.Groups[1].Value;
    // Do whatever with sResult
}
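Building on the snippet above, the reference/string pairs from the edited question could be collected in one pass. This is a sketch only: it assumes the "Refren:" / "String:" layout shown in the question and that quoted values never contain a double quote; ReferenceFile and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReferenceFile
{
    // One match per Refren/String pair; group 1 is the reference, group 2 the text.
    private static readonly Regex Pair = new Regex(
        "Refren:\\s*\"([^\"]*)\"\\s*String:\\s*\"([^\"]*)\"");

    public static Dictionary<string, string> Parse(string content)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Pair.Matches(content))
        {
            // [^"]* also matches line breaks, so multi-line values stay whole.
            result[m.Groups[1].Value] = m.Groups[2].Value;
        }
        return result;
    }
}
```

The content argument would typically come from File.ReadAllText, so that values spanning several lines are seen as a single string.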
You could learn some new tricks by looking into state machines. Basically: read one character at a time and figure out what state you are in now. First, code this as a big while loop with a big switch statement inside. Then, go and read up on the state pattern for how to do this in an object-oriented way. Then, ditch that and use delegates, because C# makes this stuff so easy to do.
Then, scrap it all, write some crappy Regular Expression with a multiline flag and slurp it the Perl way. Meditate on why this is the same as your original state machine solution.
Then, get really stuck in and learn about parser generators (lex/yacc or some .NET variant) and write a simple BNF grammar for your problem. Take special note of how the trivial grammars used in the tutorials are all way more complicated than the one you need to write. Why is that so? Check out what Noam Chomsky had to say about that.
Eventually, you'll burn out. We all do. But you'll have so much fun digging into what makes programming the coolest activity on the planet. Burn-out is just the realization that that's a pipe dream ;)
When you're done, go outside. Meet people. Talk. Smile a lot. Be friendly. You're now a zen infused developer with a wicked grin. Yay for you! You rock!
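For the first step of that journey, the loop-plus-switch version could look roughly like the sketch below. It assumes just two states (outside or inside a quoted string) and no escaped quotes; QuoteScanner and ReadQuoted are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuoteScanner
{
    private enum State { Outside, Inside }

    // Returns every run of text found between a pair of double quotes.
    public static List<string> ReadQuoted(string content)
    {
        var results = new List<string>();
        var current = new StringBuilder();
        var state = State.Outside;

        foreach (char c in content)
        {
            switch (state)
            {
                case State.Outside:
                    // An opening quote starts a new value.
                    if (c == '"') { current.Clear(); state = State.Inside; }
                    break;
                case State.Inside:
                    // A closing quote ends the value; anything else, including
                    // line breaks, is part of it.
                    if (c == '"') { results.Add(current.ToString()); state = State.Outside; }
                    else { current.Append(c); }
                    break;
            }
        }
        return results;
    }
}
```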
What you're describing sounds like a single-column CSV file. The easiest way to access that is probably to use the Microsoft.VisualBasic.FileIO.TextFieldParser class, something like:
using (var csvParser = new TextFieldParser(new StringReader(content))
{
    Delimiters = new[] {","},
    HasFieldsEnclosedInQuotes = true
})
{
    while (!csvParser.EndOfData)
    {
        var fields = csvParser.ReadFields();
        Console.WriteLine(fields[0]); //do something with the first (in your case only) field found.
    }
}
Probably the easiest way to determine whether this approach makes sense, is to think about what happens if the string you're reading actually contains a double quote. Would it end up as "He said ""this is quoted"", but I wasn't listening" (doubling up the quotes), or is this situation impossible?
If the quotes would be doubled up in this way, then a standard CSV reader like this built-in framework one is probably your best bet.
To read all of the lines of the file you can use:
File.ReadAllLines(pathToFile);
To strip the surrounding "" from the text, you can use the Substring method of string: http://msdn.microsoft.com/en-us/library/aka44szs.aspx
You can do it like this:
string strippedString = original.Substring(1, original.Length - 2);
Try this one
// Read the whole file as one string so it can be split
var text = File.ReadAllText(pathToFile);
// Splitting on " leaves the quoted values at the odd indexes
var lines = text.Split('"')
    .Where((s, i) => i % 2 != 0);
First of all you need to read in the file using:
File.ReadAllLines(filePath);
Then you could split all the lines using the string.Split function.
Splitting on the closing bracket would be your best bet.
As I understand your question, you want to read and write a text file with some specific settings. Is that right?
I would suggest INI files, which are plain text files themselves and provide the kind of settings configuration you want. Here are some links that could help you.
http://www.codeproject.com/Articles/1966/An-INI-file-handling-class-using-C
http://jachman.wordpress.com/2006/09/11/how-to-access-ini-files-in-c-net/
Since this is my first question here on Stack Overflow, I hope my question is correctly asked.
Basically, I have a normal .txt file which contains any text like:
car accident
people died
cat without owner
<!-- Text added at 6/29/2011 9:20:38 AM -->
Some addintional Text
other Text added
add Text
I have a write/append function which allows the user to append some text and set a little timestamp.
So my problem is: With another function, you can search and replace text in the textfile, but as you can guess if someone wants to replace the word "Text" it will be replaced in the xml-stylish comment(timestamp) as well.
My result until now is
content = Regex.Replace(content,"[^<+.*"+input+".*>+]*", replace);
//content = content of the .txt file, input = search term, replace = string to replace
But this fails miserably, as some regex pros will see without executing it.
Now I hope that some regex pro could help me out here and provide me a search pattern which replaces the normal text but ignores the timestamp.
I'm not really familiar with the logic of regex yet; nevertheless I understand the individual expressions, so this would be a hook for me to understand regex more properly.
Thanks in advance.
If I understand your question correctly, you want to replace every instance of "Text" except for the one(s) inside the comment.
The easiest way is to use a negative lookbehind (fantastic description here) as below:
content = Regex.Replace(content, @"(?<!<!--.*?)" + input, replace);
What you're doing is attempting to replace a repetition of any length of a character that is NOT <+.*> or a character contained in input with the value in replace.
If you're going to be working a lot with Regex, I would HIGHLY recommend giving the website above a good read. It's hands down the best intro to Regex that I've found, the time spent now will save you lots of headaches later!
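A self-contained sketch of the lookbehind approach follows. Two things in it are my own additions rather than part of the answer: the search term is passed through Regex.Escape so that user input is treated literally, and the replacement is supplied through a delegate so that a "$" in it is not read as a group reference. The names CommentSafeReplace and Replace are hypothetical. Note that anything after "<!--" on the same line is skipped, including text following the closing "-->".

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentSafeReplace
{
    public static string Replace(string content, string input, string replace)
    {
        // (?<!<!--.*?) rejects a match when "<!--" appears earlier on the same
        // line; '.' does not cross line breaks without RegexOptions.Singleline.
        string pattern = @"(?<!<!--.*?)" + Regex.Escape(input);
        return Regex.Replace(content, pattern, m => replace);
    }
}
```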
Edit
Updated to add flexibility thanks to @stema