Inserting a character inside the body of a paragraph

If someone enters a very long title/sentence, the text will stretch across the web page.
Is there a way to break the text so it continues on to the next line?
Using overflow: hidden will just hide the text.
I think I should be using the wbr tag.
Should I use the Insert() method for this?
For example:
string myText = "111111111111111111111111111111111111111111111111111111111111111111111111";
myText = myText.Insert(80, "<wbr/>");
I'm also not sure how well the wbr tag is supported across browsers!

Strictly speaking, you should use the zero width space (U+200B, written &#8203; in HTML) for this rather than <wbr>. However, Internet Explorer 6 and earlier are known not to support this (they show an ugly box). So <wbr> is probably the safest choice. Except... Internet Explorer 8 in standards mode is known not to support <wbr>, so you've got yourself a wonderful conundrum here.
You can read more at quirksmode.org.
Do note HBoss' comment in that it's hard to predict where to break, unless you're using a fixed width font like Courier. You should probably heed his advice and break more often than just every 80 characters. (And don't get me started on combining characters.)
As far as ASP.NET is concerned, you can indeed use the Insert method for this, but beware when you need to insert more than one: you'll need to do some bookkeeping, because every insertion shifts the positions that follow it (and a StringBuilder would also be advised).
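The bookkeeping mentioned above can be avoided by building the output in one pass instead of calling Insert repeatedly. The following is only a minimal sketch: the class name BreakInserter and the method InsertEvery are made up for illustration, and the marker can be "<wbr/>" or "\u200B" depending on which browsers you need to support. It counts UTF-16 code units, so it may split surrogate pairs or combining sequences.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of any framework.
public static class BreakInserter
{
    // Returns a copy of text with marker placed after every `interval`
    // characters, except after the final chunk.
    public static string InsertEvery(string text, int interval, string marker)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (marker == null) throw new ArgumentNullException(nameof(marker));
        if (interval <= 0) throw new ArgumentOutOfRangeException(nameof(interval));

        var sb = new StringBuilder(text.Length + (text.Length / interval) * marker.Length);
        for (int i = 0; i < text.Length; i += interval)
        {
            int len = Math.Min(interval, text.Length - i);
            sb.Append(text, i, len);          // copy the next chunk
            if (i + len < text.Length)
            {
                sb.Append(marker);            // more text follows, so add a break opportunity
            }
        }
        return sb.ToString();
    }
}
```

For example, BreakInserter.InsertEvery(myText, 20, "<wbr/>") would offer the browser a break point every 20 characters. If the text comes from users, HTML-encode each chunk before adding markup.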

You could use a regex to find words surrounded by whitespace/special chars, and surround it with a div/span that has different overflow properties.
If you do use <wbr>, be sure to surround the word with <nobr>.

I'm not sure you can solve this by splitting the content with breaks since you aren't guaranteed that the break will fit uniformly across browsers. You have variations of font-size, widths, etc.
Normally when I see content that extends too far, it simply overlaps the rest of the page, or the designer sets the overflow so that the content can be scrolled. There could potentially be some CSS tricks you could use, but I'm not aware of any.
As an alternative approach, instead of simply inserting a line break every x number of characters, you might just insert a space after certain characters, for example, punctuation. This will make sure that the content wraps at some point or another.
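A rough sketch of that punctuation idea in C#, assuming a regex is acceptable here. The class and method names are invented for illustration. It adds a space after a run of punctuation only when the run is directly followed by a character that is neither whitespace nor punctuation. Note that it will also split things like decimal numbers ("3.14" becomes "3. 14"), so the character class would need tuning for real data.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any framework.
public static class PunctuationSpacer
{
    // A run of punctuation followed by a non-space, non-punctuation character.
    private static readonly Regex RunBeforeText =
        new Regex(@"(\p{P}+)(?=[^\s\p{P}])");

    public static string AddSpaces(string text)
    {
        return RunBeforeText.Replace(text, "$1 ");
    }
}
```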

Related

Detect non-displayable characters in string C#

Hopefully someone can help me with this, because I haven't found any solution online so far.
I am processing strings with special characters and I want to detect whether any character in a string can't be displayed properly by, for instance, a web browser or even Visual Studio itself. The following string contains such characters. It comes from the Text Visualizer in VS2019:
TargetsforReduceCO
I've checked similar questions, but the answers were mostly limited to checking if the character code exceeds 255. However, there are lots of characters that can still be displayed, like Greek and Cyrillic symbols.
I also found this website that has an overview of all Unicode characters and shows how they are displayed in the browser, but there doesn't seem to be any logical relationship between a character's code and whether it can be displayed.
I can imagine that VS doesn't know which characters can't be displayed in various browsers, but I'm hoping that there is at least a way of checking if VS can display them.
Thanks in advance for your help!
Edit:
Right now I'm using
input.Any(c => !char.IsLetterOrDigit(c) && c > 255);
because the input shouldn't normally contain symbols other than those you usually find in a text, but I'm sure it will also be triggered by symbols that can actually be displayed by VS or a web browser.

Type char has a number of static member methods like IsPunctuation() that should help you "categorize" character by character. See example on this page System.Char reference. Each of those methods' documentation explains what characters it applies to. As commenters have mentioned, your "displayable" criterion is more a font-presentation problem than a character value problem but you'll be able to narrow down what your system can work with using these methods. Look out for other methods like GetUnicodeCategory().
It may be that something as simple as !char.IsControl(c) will do the trick.
See similar Q&A here C# Printable Characters
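As a sketch of the category-based approach described above: the check below flags characters whose Unicode category suggests they have no visible glyph of their own. Which categories count as "suspect" is an assumption made for this example, and the names DisplayCheck, IsSuspect and HasSuspectChars are invented. It cannot tell whether a particular font actually has a glyph for a character. It also treats tab, carriage return and line feed as suspect (they are control characters), and it flags every surrogate, so text with valid characters outside the Basic Multilingual Plane would need extra handling.

```csharp
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of any framework.
public static class DisplayCheck
{
    // True for characters in categories that normally render as nothing
    // or as a placeholder box.
    public static bool IsSuspect(char c)
    {
        switch (char.GetUnicodeCategory(c))
        {
            case UnicodeCategory.Control:          // C0/C1 controls, e.g. U+0082
            case UnicodeCategory.Format:           // e.g. zero width space U+200B
            case UnicodeCategory.Surrogate:        // half of a UTF-16 pair
            case UnicodeCategory.PrivateUse:
            case UnicodeCategory.OtherNotAssigned:
                return true;
            default:
                return false;
        }
    }

    public static bool HasSuspectChars(string input)
    {
        return input.Any(IsSuspect);
    }
}
```

Unlike the c > 255 test, this does not trip on Greek or Cyrillic letters, because those fall into the letter categories.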

.NET Regular Expression (perl-like) for detecting text that was pasted twice in a row

I've got a ton of json files that, due to a UI bug with the program that made them, often have text that was accidentally pasted twice in a row (no space separating them).
Example: {FolderLoc = "C:\testC:\test"}
I'm wondering if it's possible for a regular expression to match this. It would be per-line. If I can do this, I can use FNR, which is a batch text processing tool that supports .NET RegEx, to get rid of the accidental duplicates.
I regret not having an example of one of my attempts to show, but this is a very unique problem and I wasn't able to find anything on search engines resembling it to even start to base a solution off of.
Any help would be appreciated.

You can collect text along the string (.+ style), followed by a lookahead that checks for what has been captured up to that point, which would be a repetition of it, like
/(.+)(?=\1)/; # but need more restrictions
However, this gets tripped even just on double leTTers, so it needs at least a little more. For example, our pattern can require the text which gets repeated to be at least two words long.
Here is a basic and raw example. Please also see the note on regex at the end.
use warnings;
use strict;
use feature 'say';
my @lines = (
    q(It just wasn't able just wasn't able no matter how hard it tried.),
    q(This has no repetitions.),
    q({FolderLoc = "C:\testC:\test"}),
);
my $re_rep = qr/(\w+\W+\w+.+)(?=\1)/; # at least two words, and then some
for (@lines) {
    if (/$re_rep/) {
        # Other conditions/filtering on $1 (the capture) ?
        say $1
    }
}
This matches at least two words: word (\w+) + non-word-chars + word + anything. That'll still get some legitimate data, but it's a start that can now be customized to your data. We can tweak the regex and/or further scrutinize our catch inside that if branch.
The pattern doesn't allow for any intervening text (the repetition must follow immediately), which is easily changed if needed; the question is whether some legitimate repetitions could then get flagged.
The program above prints
just wasn't able
C:\test
Note on regex: This quest, to find repeated text, is much too generic as it stands and it will surely pick on someone's good data. It is enough to note that I had to require at least two words (with only one word, a phrase like "that that" is flagged), which is arbitrary and still insufficient. For one, repeated numbers realistically found in data files (3,3,3,3,3) will be matched as well.
So this needs further specialization, for which we need to know more about the data.
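Since the question asks for a .NET regex, here is a hedged C# sketch that uses the same pattern as the Perl example above. The class and method names are made up. Because the second copy is only checked by the lookahead and not consumed, replacing the match with an empty string drops the first copy and keeps the second. All the caveats from the note above still apply.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any framework.
public static class DuplicatePaste
{
    // At least two words plus something more, immediately followed by a copy of itself.
    private static readonly Regex Repeated =
        new Regex(@"(\w+\W+\w+.+)(?=\1)");

    public static string RemoveDuplicates(string line)
    {
        // The lookahead leaves the second copy in place; only the first is removed.
        return Repeated.Replace(line, "");
    }
}
```

In a tool that only takes a pattern and a replacement string, the equivalent would be the pattern (\w+\W+\w+.+)(?=\1) with an empty replacement.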

Outputting Programmatically to MSword; sensing end of line

I'm trying to use the MS Word Interop library to write a C# application that outputs specially formatted text (isolated Arabic letters) to a file. The problem I'm running into is determining how many characters remain before the text wraps onto a new line. I need each word to stay on one line, without wrapping, which is the default behavior. I'm finding this difficult because when I have the Arabic letters of a word isolated with spaces, they are treated as individual characters and therefore behave differently than connected words.
Any help is appreciated. Thanks.

Add each character to your range and then check the number of lines in the range
LineCount = range.ComputeStatistics(Word.WdStatistic.wdStatisticLines);
When the line count changes, you know it has been wrapped, and can remove the last character or reformat accordingly

I don't know how this behaves today, but I ran into a somewhat odd fact when I wrote something against the MS Word API: you can't actually find that out. In MS Word, text in a document is always in paragraphs.
If you add text to your document, it doesn't simply land on a page; the page will contain at least one paragraph that holds the text you wrote.
Unfortunately I can't check this again, because I don't have a license for MS Word these days.
Give it a try and look at the problem again in this way.
Hope this helps, and if not, please provide the code that generates the input and the exact version of MSWord.
Greetings,
Kjellski

I'm not sure what "Arabic letters of the word isolated with spaces" means exactly, but I assume that a non-breaking space (U+00A0) is what you need.
Here's more details.

Text macros - replace them with function result

I need to introduce some text macros, for example:
"Some text here, some text here #from_file[a.txt,2,N] and here and here"
The #from_file[a.txt,2,N] macro should get 2 random lines from a.txt and join them with a newline character. Another one, #from_file[a.txt,5,S], should take 5 random lines and join them with spaces.
I will of course need other macros as well: #random[0-9] for a random number, #random[A-B,5] for a random string of 5 characters.
Macros could also be in another format, e.g. {from_file:a.txt,2,N}
My first idea was to use regular expressions, but maybe there is another solution to my problem?

It sounds like you want to create some sort of "general purpose" text-macro system, and while I'm sure this can be done with regexps, it basically boils down to what you want it to be capable of, and how extensive and flexible it needs to be.
You basically need to define your grammar and constraints. Can the file-name contain the macro-block terminator-character '}' ? If so, does it need to be escaped? Should escaping be supported? Are spaces within a macro-block allowed?
Basically find out how you want things to work, preferably as constrained as possible, as this means you can implement a simpler solution, and there might not be any need for a full blown parser and similar ilk.
Maybe a regex-based solution will be sufficient (although most certainly not very good). But before you can tell that, you need to spec better ;)
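To make the trade-off concrete, here is a minimal sketch of the regex-based route under a deliberately tight grammar: a macro is #name[args], arguments are comma-separated, and neither escaping nor a ']' inside the arguments is supported. Everything here (MacroExpander, Expand, the handler dictionary) is hypothetical and only illustrates the shape such a solution could take.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any framework.
public static class MacroExpander
{
    // Matches #name[arg1,arg2,...]; arguments may not contain ']'.
    private static readonly Regex MacroPattern =
        new Regex(@"#(?<name>\w+)\[(?<args>[^\]]*)\]");

    // Replaces each macro with the result of its handler.
    // Macros without a registered handler are left untouched.
    public static string Expand(
        string text,
        IDictionary<string, Func<string[], string>> handlers)
    {
        return MacroPattern.Replace(text, m =>
        {
            string name = m.Groups["name"].Value;
            string[] args = m.Groups["args"].Value.Split(',');
            Func<string[], string> handler;
            return handlers.TryGetValue(name, out handler)
                ? handler(args)
                : m.Value;
        });
    }
}
```

A from_file handler would read the file named in args[0], pick the requested number of random lines and join them; a random handler would parse its range from args[0]. As soon as file names may contain ']' or ',' this grammar stops being enough, which is exactly the point about specifying the constraints first.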

How do I use a regular expression to add linefeeds?

I have a really long string. I would like to add a linefeed every 80 characters. Is there a regular expression replacement pattern I can use to insert "\r\n" every 80 characters? I am using C# if that matters.
I would like to avoid using a loop.
I don't need to worry about being in the middle of a word. I just want to insert a linefeed exactly every 80 characters.

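In C# the call is Regex.Replace from System.Text.RegularExpressions (string.Replace only does literal replacement, not patterns), so it should be something like
Regex.Replace(str, "(.{80})", "$1\r\n");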
The idea is to grab 80 characters and save it in a group, then put it back in (I think "$1" is the right syntax) along with the "\r\n".
(Edit: The original regex had a + in it, which you definitely don't want. That would completely eliminate everything except the last line and any leftover pieces--a decidedly suboptimal result.)
Note that this way, you will most likely split inside words, so it might look pretty ugly.
You should be looking more into word wrapping if this is indeed supposed to be readable text. A little googling turned up a couple of functions; or if this is a text box, you can just turn on the WordWrap property.
Also, check out the .Net page at regular-expressions.info. It's by far the best reference site for regexes that I know of. (Jan Goyvaerts is on SO, but nobody told me to say that.)
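For completeness, a small self-contained sketch of the approach in C#. The class and method names are invented. Two behaviors are worth knowing: without RegexOptions.Singleline the dot does not match "\n", so existing line breaks would restart the count, and a string whose length is an exact multiple of the width ends up with a trailing "\r\n".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any framework.
public static class LinefeedInserter
{
    // Inserts "\r\n" after every `width` characters, regardless of word boundaries.
    public static string InsertLinefeeds(string text, int width)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));

        string pattern = "(.{" + width + "})";
        // Singleline makes '.' match newlines too, so every character is counted.
        return Regex.Replace(text, pattern, "$1\r\n", RegexOptions.Singleline);
    }
}
```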
