How to use Span<T> for string replacement operations - c#

I have an HTML string containing many placeholders, and I want to perform replace operations on it. Currently I am using StringBuilder, but I want to try Span<T> to see whether it gives any performance benefit.
My HTML parser can run 15 times for a single HTML template.
I see that Span<T> gives us a chunk of stack memory where we can store chars, but is there any way I can perform String.Replace-style operations using Span<T>?
string html = "<html>[Fname] [Lname]</html>";
foreach (var obj in objects) // objects: the collection being looped over
{
    var newStr = html.Replace("[Fname]", obj.fname);
}
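For illustration only, here is a minimal sketch of what a span-based replace could look like. It assumes a runtime where StringBuilder.Append accepts a ReadOnlySpan<char> (.NET Core 2.1 or later); the class and method names are hypothetical. A ReadOnlySpan<char> can view an existing string without copying it, so the unchanged parts of the template are sliced rather than turned into substrings. Whether this beats String.Replace for a given template is something to measure, not assume.

```csharp
using System;
using System.Text;

static class SpanReplace
{
    // Hypothetical helper: replaces every occurrence of token in template,
    // slicing the template as a span instead of allocating substrings.
    public static string Replace(string template, string token, string value)
    {
        if (string.IsNullOrEmpty(token))
            throw new ArgumentException("Token must not be empty.", nameof(token));

        ReadOnlySpan<char> rest = template.AsSpan();
        var sb = new StringBuilder(template.Length);

        while (true)
        {
            int index = rest.IndexOf(token.AsSpan(), StringComparison.Ordinal);
            if (index < 0) break;

            // Copy the text before the token, then the replacement value.
            sb.Append(rest.Slice(0, index)).Append(value);
            rest = rest.Slice(index + token.Length);
        }

        // Copy whatever follows the last token.
        sb.Append(rest);
        return sb.ToString();
    }
}
```

Usage would mirror the loop above, for example SpanReplace.Replace(html, "[Fname]", obj.fname).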

Related

Which of string interpolation and string.Format performs better?

Consider this code:
var url = "www.example.com";
String.Format:
var targetUrl = string.Format("URL: {0}", url);
String Interpolation:
var targetUrl=$"URL: {url}";
Which of string interpolation and string.Format performs better?
Also, which scenarios is each best suited to?
According to the C# docs on string interpolation, maybe there is no difference at all?
... it's typically transformed into a String.Format method call
Which of string interpolation and string.Format performs better?
Neither is better, since they are typically equivalent at run time. String interpolation is typically rewritten by the compiler to a string.Format call, so both statements usually end up the same at run time. (Newer compilers may instead emit a string.Concat call or, from C# 10, an interpolated string handler.)
Also, which scenarios is each best suited to?
Use string interpolation if you have variables in scope which you want to use in your string formatting. String interpolation is safer since it has compile-time checks on the validity of your string. Use string.Format if you load the text from an external source, like a resource file or database.
Under most conditions they're the same, but not if you're using them in a logging statement that accepts a format string (e.g. log.Debug("URL: {0}", url)).
If you use the {0} format, the argument isn't converted to a string until after the logging framework has decided whether to emit the message, whereas with the {myVar} interpolated form the string is built before the function is entered.
Hence, in a log.Debug() call where you're only outputting Info and above, the {myVar} form always pays the binary-to-text tax, whereas the {0} form pays it only when the message is actually emitted.
In the case of passing a string it doesn't matter, but for binary data and floating point data it can make quite a difference.
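The difference described above can be made visible with a small sketch. TinyLogger and Expensive are hypothetical types written for this illustration, not the API of any real logging framework; the counter simply records how often the argument is converted to text while Debug output is switched off.

```csharp
using System;

// Hypothetical minimal logger, standing in for a real logging framework.
sealed class TinyLogger
{
    public bool DebugEnabled { get; set; }

    public void Debug(string format, params object[] args)
    {
        if (!DebugEnabled) return;                      // arguments are never formatted
        Console.WriteLine(string.Format(format, args));
    }
}

// Counts how many times it is converted to a string.
sealed class Expensive
{
    public int ToStringCalls { get; private set; }

    public override string ToString()
    {
        ToStringCalls++;
        return "expensive";
    }
}

static class LoggingDemo
{
    // Returns the ToString call counts for the format-string and interpolated forms.
    public static int[] Run()
    {
        var log = new TinyLogger { DebugEnabled = false };
        var deferred = new Expensive();
        var eager = new Expensive();

        log.Debug("Value: {0}", deferred);  // formatting is left to the logger
        log.Debug($"Value: {eager}");       // string is built before the call

        return new[] { deferred.ToStringCalls, eager.ToStringCalls };
    }
}
```

With Debug output disabled, the first argument is never converted, while the interpolated form has already been converted once by the time the logger sees it.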

Fastest way to find/replace in a large string

I have a function that does a lot of finding and replacing on strings, using Regex and other string processing functions. Essentially I'm looping through a string, and adding the resulting data into a StringBuilder so it's faster than modifying the string itself. Is there a faster way?
Essentially I'm looping through a string, and adding the resulting data into a StringBuilder so it's faster than modifying the string itself. Is there a faster way?
The StringBuilder class is faster when you want to concatenate strings in a loop.
If you're concatenating an array, String.Concat() is faster because it has overloads that accept arrays.
Otherwise, simply use the + operator if you have to do something like: string s = "text1" + "text2" + "text3"; or use String.Concat("text1", "text2", "text3");.
For more info look here: Concatenate String Efficiently.
EDIT:
The + operator compiles to a call to String.Concat(), as usr said in his comment.
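As a rough side-by-side of the three options mentioned in this answer (the class and method names are made up for the example), all of these produce the same string; they differ only in how the work is done.

```csharp
using System.Text;

static class ConcatDemo
{
    // Appending in a loop: the case where StringBuilder is recommended.
    public static string WithBuilder(string[] parts)
    {
        var sb = new StringBuilder();
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }

    // Concatenating an array in one call via the string[] overload.
    public static string WithConcat(string[] parts) => string.Concat(parts);

    // A fixed, small number of operands joined with +.
    public static string WithPlus(string a, string b, string c) => a + b + c;
}
```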

overwrite stringbuilder

I am reading text from a file into a StringBuilder. I make a quick replacement using .Replace(), then I need to run two regexes against the StringBuilder and completely overwrite its contents with the result. What is the best way to do this?
I used Append to initially load the StringBuilder from the StreamReader, then used .Replace() for the simple replacement. Now I need to remove the beginning and end of each line based on two different regexes.
You will have to convert the StringBuilder into a string (by calling its ToString() method) and perform the Regex operations on the string.
Also, if you are just interested in reading all the text from a file, you don't need a stream and a StringBuilder; instead just use File.ReadAllText(someFile), which returns a string, or File.ReadAllLines(someFile), which returns a string array.
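A minimal sketch of that round trip follows. The two patterns are placeholders invented for the example (a leading "number:" prefix and a trailing semicolon), since the question does not say what the real regexes are.

```csharp
using System.Text;
using System.Text.RegularExpressions;

static class LineTrimmer
{
    // Hypothetical patterns; RegexOptions.Multiline makes ^ and $ match at each line.
    static readonly Regex LineStart =
        new Regex(@"^[ \t]*\d+:[ \t]*", RegexOptions.Multiline);
    static readonly Regex LineEnd =
        new Regex(@"[ \t]*;[ \t]*(?=\r?$)", RegexOptions.Multiline);

    // Runs both regexes on the builder's text and overwrites the builder with the result.
    public static void Clean(StringBuilder sb)
    {
        string text = sb.ToString();
        text = LineStart.Replace(text, "");
        text = LineEnd.Replace(text, "");
        sb.Clear().Append(text);
    }
}
```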

string manipulations

I have a string variable declared globally. I have to append a substring to this string dynamically, based on user input. To do this I use str=str+substring;
In this case the final string in str is not a meaningful sentence, i.e. there are no spaces between the words. To fix that I used the following statement instead:
str=str+" "+substring; or str=str+substring+" ";
Here I have to add an extra space to the substring every time before appending it to the main string, which requires additional string processing.
Can anybody help me do this more efficiently?
It depends on how often you are doing it. If this is intermittent (or in fact pretty-much anything except a tight loop), then forget it; what you have is fine. Sure an extra string is generated occasionally (the combined substring/space), but it will be collected at generation 0; very cheap.
If you are doing this aggressively (in a loop etc), then use a StringBuilder instead:
// declaration
StringBuilder sb = new StringBuilder();
...
// composition
sb.Append(' ').Append(substring);
...
// obtaining the string
string s = sb.ToString();
A final (unrelated) point - re "globally" - if you mean static, you might want to synchronize access if you have multiple threads.
What do you want to achieve exactly? You could store the words in a list
List<string> words = new List<string>();
...
words.Add(str);
And then delay the string manipulation (i.e. adding the spaces between words) until the very end. This way, your on-the-fly operation is just an add to a list, and you can do all the complex processing (whatever it may be) at the end.
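One way to sketch the list-based idea, assuming the only processing needed at the end is putting a single space between words (SentenceBuilder is a hypothetical wrapper, not part of the original answer):

```csharp
using System.Collections.Generic;

sealed class SentenceBuilder
{
    private readonly List<string> words = new List<string>();

    // The on-the-fly operation is just a list add.
    public void Add(string word) => words.Add(word);

    // The separators are inserted once, at the very end.
    public string Build() => string.Join(" ", words);
}
```

Because string.Join only places the separator between elements, there is no leading or trailing space to trim afterwards.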
If you are doing it rarely, you could slightly pretty up the code by doing:
str += " " + substring;
Otherwise, I'd go with Nanda's solution.
@Nanda: in your case you should use StringBuilder.
StringBuilder data = new StringBuilder();
data.AppendFormat(" {0}", substring);

Best way to replace tokens in a large text template

I have a large text template which needs tokenized sections replaced by other text. The tokens look something like this: ##USERNAME##. My first instinct is just to use String.Replace(), but is there a better, more efficient way or is Replace() already optimized for this?
System.Text.RegularExpressions.Regex.Replace() is what you seek - IF your tokens are odd enough that you need a regex to find them.
Some kind soul did some performance testing, and between Regex.Replace(), String.Replace(), and StringBuilder.Replace(), String.Replace() actually came out on top.
The only situation in which I've had to do this is sending a templated e-mail. In .NET this is provided out of the box by the MailDefinition class. So this is how you create a templated message:
MailDefinition md = new MailDefinition();
md.BodyFileName = pathToTemplate;
md.From = "test@somedomain.com";
ListDictionary replacements = new ListDictionary();
replacements.Add("<%To%>", someValue);
// continue adding replacements
MailMessage msg = md.CreateMailMessage("test@someotherdomain.com", replacements, this);
After this, msg.Body would be created by substituting the values in the template. I guess you can take a look at MailDefinition.CreateMailMessage() with Reflector :). Sorry for being a little off-topic, but if this is your scenario I think it's the easiest way.
Well, depending on how many variables you have in your template, how many templates you have, etc., this might be a job for a full template processor. The only one I've ever used for .NET is NVelocity, but I'm sure there must be scores of others out there, most of them linked to some web framework or another.
string.Replace is fine. I'd prefer using a Regex, but I'm *** for regular expressions.
The thing to keep in mind is how big these templates are. If a template is really big, and memory is an issue, you might want to create a custom tokenizer that acts on a stream. That way you only hold a small part of the file in memory while you manipulate it.
But for the naive implementation, string.Replace should be fine.
If you are doing multiple replaces on large strings then it might be better to use StringBuilder.Replace(), as the usual performance issues with strings will appear.
Regular expressions would be the quickest solution to code up, but if you have many different tokens it will get slower. If performance is not an issue then use this option.
A better approach would be to define a token marker, like your "##", that you can scan for in the text. Then look up what to replace it with in a hash table, using the text that follows the marker as the key.
If this is part of a build script then nAnt has a great feature for doing this called Filter Chains. The code for that is open source, so you could look at how it's done for a fast implementation.
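The scan-and-look-up approach described in this answer could be sketched roughly as follows. This is an assumption-laden illustration, not the answerer's code: TokenScanner is a hypothetical name, tokens are taken to be delimited by "##" on both sides, and unknown tokens are left in the output unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class TokenScanner
{
    const string Delimiter = "##";

    // Single pass over the template; keys are token names without the ## delimiters.
    public static string Fill(string template, IReadOnlyDictionary<string, string> values)
    {
        var sb = new StringBuilder(template.Length);
        int pos = 0;

        while (pos < template.Length)
        {
            int open = template.IndexOf(Delimiter, pos, StringComparison.Ordinal);
            if (open < 0) break;
            int nameStart = open + Delimiter.Length;
            int close = template.IndexOf(Delimiter, nameStart, StringComparison.Ordinal);
            if (close < 0) break;

            string key = template.Substring(nameStart, close - nameStart);
            if (values.TryGetValue(key, out string value))
            {
                // Known token: copy the text before it, then the replacement.
                sb.Append(template, pos, open - pos).Append(value);
                pos = close + Delimiter.Length;
            }
            else
            {
                // Unknown token: keep the opening delimiter and rescan after it.
                sb.Append(template, pos, nameStart - pos);
                pos = nameStart;
            }
        }

        // Copy the remaining text after the last token.
        sb.Append(template, pos, template.Length - pos);
        return sb.ToString();
    }
}
```

Each character of the template is copied once, so the cost does not grow with the number of distinct tokens the way repeated Replace calls do.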
Had to do something similar recently. What I did was:
Make a method that takes a dictionary (key = token name, value = the text you need to insert).
Get all matches to your token format (##.+?## in your case, I guess; I'm not that good at regular expressions :P) using Regex.Matches(input, regular expression).
Foreach over the results, using the dictionary to find the insert value for your token.
Return the result.
Done ;-)
If you want to test your regexes I can suggest the regulator.
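The steps above might look something like this in code. Note one substitution: instead of Regex.Matches followed by a manual foreach, this sketch uses Regex.Replace with a match evaluator, which visits each match and builds the result in one call. RegexTemplate is a hypothetical name, and leaving unknown tokens untouched is an assumption.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexTemplate
{
    // Group 1 captures the token name between the ## delimiters.
    static readonly Regex Token = new Regex("##(.+?)##");

    // Dictionary keys are token names, values are the text to insert.
    public static string Fill(string input, IDictionary<string, string> values)
    {
        return Token.Replace(input, match =>
            values.TryGetValue(match.Groups[1].Value, out string replacement)
                ? replacement
                : match.Value); // unknown token: leave it as-is
    }
}
```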
FastReplacer implements token replacement in O(n*log(n) + m) time and uses 3x the memory of the original string.
FastReplacer is good for executing many Replace operations on a large string when performance is important.
The main idea is to avoid modifying existing text or allocating new memory every time a string is replaced.
We have designed FastReplacer to help us on a project where we had to generate a large text with a large number of append and replace operations. The first version of the application took 20 seconds to generate the text using StringBuilder. The second improved version that used the String class took 10 seconds. Then we implemented FastReplacer and the duration dropped to 0.1 seconds.
If your template is large and you have lots of tokens, you probably don't want to walk it and replace the tokens one by one, as that would result in an O(N * M) operation, where N is the size of the template and M is the number of tokens to replace.
The following method accepts a template and a dictionary of the key/value pairs you wish to replace. By initializing the StringBuilder to slightly larger than the size of the template, it should result in an O(N) operation (i.e. it shouldn't have to grow itself log N times).
Finally, you can move the building of the tokens into a Singleton as it only needs to be generated once.
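// Requires: using System.Collections.Generic; using System.Text; using System.Text.RegularExpressions;
static string SimpleTemplate(string template, Dictionary<string, string> replacements)
{
    // Split the template into literal text and tokens;
    // the capturing group keeps the tokens in the resulting array
    Regex regex = new Regex("(##[^#]+##)");
    string[] tokens = regex.Split(template);
    // Build the new message, pre-sizing the buffer slightly larger than the template
    var sb = new StringBuilder((int)(template.Length * 1.1));
    foreach (string token in tokens)
    {
        // Keys must include the ## delimiters, e.g. "##USERNAME##";
        // anything not found in the dictionary is copied unchanged
        sb.Append(replacements.TryGetValue(token, out string value) ? value : token);
    }
    return sb.ToString();
}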
This is an ideal use of Regular Expressions. Check out this helpful website, the .Net Regular Expressions class, and this very helpful book Mastering Regular Expressions.
