Is there an easy method to insert spaces between the characters of a string? I'm using the below code which takes a string (for example ( UI$.EmployeeHours * UI.DailySalary ) / ( Month ) ) . As this information is getting from an excel sheet, i need to insert [] for each columnname. The issue occurs if user avoids giving spaces after each paranthesis as well as an operator. AnyOne to help?
text = e.Expression.Split(Splitter);
string expressionString = null;
for (int temp = 0; temp < text.Length; temp++)
{
string str = null;
str = text[temp];
if (str.Length != 1 && str != "")
{
expressionString = expressionString + "[" + text[temp].TrimEnd() + "]";
}
else
expressionString = expressionString + str;
}
User might be inputing something like (UI$.SlNo-UI+UI$.Task)-(UI$.Responsible_Person*UI$.StartDate) while my desired output is ( [UI$.SlNo-UI] + [UI$.Task] ) - ([UI$.Responsible_Person] * [UI$.StartDate] )
Here is a short way to insert spaces after every single character in a string (which I know isn't exactly what you were asking for):
var withSpaces = withoutSpaces.Aggregate(string.Empty, (c, i) => c + i + ' ');
This generates a string the same as the first, except with a space after each character (including the last character).
You can do that with regular expressions:
using System.Text.RegularExpressions;
class Program {
static void Main() {
string expression = "(UI$.SlNo-UI+UI$.Task)-(UI$.Responsible_Person*UI$.StartDate) ";
string replaced = Regex.Replace(expression, #"([\w\$\.]+)", " [ $1 ] ");
}
}
If you are not familiar with regular expressions this might look rather cryptic, but they are a powerful tool, and worth learning. In case, you may check how regular expressions work, and use a tool like Expresso to test your regular expressions.
Hope this helps...
Here is an algorithm that does not use regular expressions.
//Applies dobule spacing between characters
public static string DoubleSpace(string s)
{
if (string.IsNullOrEmpty(s))
{
return string.Empty;
}
char[] a = s.ToCharArray();
char[] b = new char[ (a.Length * 2) - 1];
int bIndex = 0;
for(int i = 0; i < a.Length; i++)
{
b[bIndex++] = a[i];
//Insert a white space after the char
if(i < (a.Length - 1))
{
b[bIndex++] = ' ';
}
}
return new string(b);
}
Well, you can do this by using Regular expressions, search for specific paterns and add brackets where needed. You could also simply Replace every Paranthesis with the same Paranthesis but with spaces on each end.
I would also advice you to use StringBuilder aswell instead of appending to an existing string (this creates a new string for each manipulation, StringBuilder has a smaller memory footprint when doing this kind of manipulation)
Related
I have a string of text and want to ensure that it contains at most one single occurrence of a specific character (,). Therefore I want to keep the first one, but simply remove all further occurrences of that character.
How could I do this the most elegant way using C#?
This works, but not the most elegant for sure :-)
string a = "12,34,56,789";
int pos = 1 + a.IndexOf(',');
return a.Substring(0, pos) + a.Substring(pos).Replace(",", string.Empty);
You could use a counter variable and a StringBuilder to create the new string efficiently:
var sb = new StringBuilder(text.Length);
int maxCount = 1;
int currentCount = 0;
char specialChar = ',';
foreach(char c in text)
if(c != specialChar || ++currentCount <= maxCount)
sb.Append(c);
text = sb.ToString();
This approach is not the shortest but it's efficient and you can specify the char-count to keep.
Here's a more "elegant" way using LINQ:
int commasFound = 0; int maxCommas = 1;
text = new string(text.Where(c => c != ',' || ++commasFound <= maxCommas).ToArray());
I don't like it because it requires to modify a variable from a query, so it's causing a side-effect.
Regular expressions are elegant, right?
Regex.Replace("Eats, shoots, and leaves.", #"(?<=,.*),", "");
This replaces every comma, as long as there is a comma before it, with nothing.
(Actually, it's probably not elegant - it may only be one line of code, but it may also be O(n^2)...)
If you don't deal with large strings and you reaaaaaaly like Linq oneliners:
public static string KeepFirstOccurence (this string #string, char #char)
{
var index = #string.IndexOf(#char);
return String.Concat(String.Concat(#string.TakeWhile(x => #string.IndexOf(x) < index + 1)), String.Concat(#string.SkipWhile(x=>#string.IndexOf(x) < index)).Replace(#char.ToString(), ""));
}
You could write a function like the following one that would split the string into two sections based on the location of what you were searching (via the String.Split() method) for and it would only remove matches from the second section (using String.Replace()) :
public static string RemoveAllButFirst(string s, string stuffToRemove)
{
// Check if the stuff to replace exists and if not, return the original string
var locationOfStuff = s.IndexOf(stuffToRemove);
if (locationOfStuff < 0)
{
return s;
}
// Calculate where to pull the first string from and then replace the rest of the string
var splitLocation = locationOfStuff + stuffToRemove.Length;
return s.Substring(0, splitLocation) + (s.Substring(splitLocation)).Replace(stuffToRemove,"");
}
You could simply call it by using :
var output = RemoveAllButFirst(input,",");
A prettier approach might actually involve building an extension method that handled this a bit more cleanly :
public static class StringExtensions
{
public static string RemoveAllButFirst(this string s, string stuffToRemove)
{
// Check if the stuff to replace exists and if not, return the
// original string
var locationOfStuff = s.IndexOf(stuffToRemove);
if (locationOfStuff < 0)
{
return s;
}
// Calculate where to pull the first string from and then replace the rest of the string
var splitLocation = locationOfStuff + stuffToRemove.Length;
return s.Substring(0, splitLocation) + (s.Substring(splitLocation)).Replace(stuffToRemove,"");
}
}
which would be called via :
var output = input.RemoveAllButFirst(",");
You can see a working example of it here.
static string KeepFirstOccurance(this string str, char c)
{
int charposition = str.IndexOf(c);
return str.Substring(0, charposition + 1) +
str.Substring(charposition, str.Length - charposition)
.Replace(c, ' ').Trim();
}
Pretty short with Linq; split string into chars, keep distinct set and join back to a string.
text = string.Join("", text.Select(c => c).Distinct());
I'm trying to remove single vowels from a string, but not if a vowel is double same.
For example string
"I am keeping a foobar"
should print out as
"m keepng foobr"
I have tried everything but didn't come up with a solution so far.
Try:
Regex.Replace(input, #"([aeiou])\1", "");
Though for I am keeping a foobar, it will give you m keepng foobr, which is different to your required m keepng foobr, as you're stripped spaces out of your required result, too.
If you want to remove the extraneous spaces, then it's a three step operation: remove vowels; remove proceeding/trailing spaces; remove double spaces.
var raw = Regex.Replace(input, #"([aeiou])\1", "");
var trimmed = raw.Trim();
var final = trimmed.Replace(" ", " ");
You could try this logic:
loop trough string and check two by two characters
if (isBothVowelsAndEqual()) do nothing; else removeFirstChar();
EDIT:
public List<char> vowels = "AEIOUaeiou".ToList();
public bool isBothVowelsAndEqual(char first, char second)
{
return (first == second && vowels.Contains(first));
}
const string s = "I am keeeping a foobar";
string output=String.Empty;
for (int i = 0; i < s.Length-1; i++)
{
if (isBothVowelsAndEqual(s[i], s[i + 1]))
{
output = output + s[i] + s[i+1];
i++;
}
else
{
if (!vowels.Contains(s[i])) {
output += s[i];
}
}
}
Console.WriteLine(output.Trim());
I use Visual Studio 2010 ver.
I have array strings [] = { "eat and go"};
I display it with foreach
I wanna convert strings like this : EAT and GO
Here my code:
Console.Write( myString.First().ToString().ToUpper() + String.Join("",myString].Skip(1)).ToLower()+ "\n");
But the output is : Eat and go . :D lol
Could you help me? I would appreciate it. Thanks
While .ToUpper() will convert a string to its upper case equivalent, calling .First() on a string object actually returns the first element of the string (since it's effectively a char[] under the hood). First() is actually exposed as a LINQ extension method and works on any collection type.
As with many string handling functions, there are a number of ways to handle it, and this is my approach. Obviously you'll need to validate value to ensure it's being given a long enough string.
using System.Text;
public string CapitalizeFirstAndLast(string value)
{
string[] words = value.Split(' '); // break into individual words
StringBuilder result = new StringBuilder();
// Add the first word capitalized
result.Append(words[0].ToUpper());
// Add everything else
for (int i = 1; i < words.Length - 1; i++)
result.Append(words[i]);
// Add the last word capitalized
result.Append(words[words.Length - 1].ToUpper());
return result.ToString();
}
If it's always gonna be a 3 words string, the you can simply do it like this:
string[] mystring = {"eat and go", "fast and slow"};
foreach (var s in mystring)
{
string[] toUpperLower = s.Split(' ');
Console.Write(toUpperLower.First().ToUpper() + " " + toUpperLower[1].ToLower() +" " + toUpperLower.Last().ToUpper());
}
If you want to continuously alternate, you can do the following:
private static string alternateCase( string phrase )
{
String[] words = phrase.split(" ");
StringBuilder builder = new StringBuilder();
//create a flag that keeps track of the case change
book upperToggle = true;
//loops through the words
for(into i = 0; i < words.length; i++)
{
if(upperToggle)
//converts to upper if flag is true
words[i] = words[i].ToUpper();
else
//converts to lower if flag is false
words[i] = words[i].ToLower();
upperToggle = !upperToggle;
//adds the words to the string builder
builder.append(words[i]);
}
//returns the new string
return builder.ToString();
}
Quickie using ScriptCS:
scriptcs (ctrl-c to exit)
> var input = "Eat and go";
> var words = input.Split(' ');
> var result = string.Join(" ", words.Select((s, i) => i % 2 == 0 ? s.ToUpperInvariant() : s.ToLowerInvariant()));
> result
"EAT and GO"
I need to remove all additional spaces in a string.
I use regex for matching strings and matched strings i replace with some others.
For better understanding please see examples below:
3 input strings:
Hello, how are you?
Hello , how are you?
Hello , how are you ?
This are 3 strings that should match by one pattern-regex.
It looks something like this:
Hello\s*,\s+how\s+are\s+you\s*?
It works fine but there is a perfomance problem.
If I have a lot of patterns (~20k) and try to execute each pattern it runs very slow (3-5 minutes).
Maybe there is better way for doing this?
for example use some 3d-party libs?
UPD: Folks, this question is not about how to do this. It's about how to do this with best perfomance. :)
Let me explain more detailed. The main goal is tokenize text. (replace some token with special symbols)
For example I have a token "nice try".
Then I input text "this is nice try".
result: "this is #tokenizedtext#" where #tokenizedtext# some special symbols. It doesen't matter in this case.
Next I have string "Mike said it was a nice try".
result should be "Mike said it was a #tokenizedtext#".
I think the main idea is clear.
So I can have a lot of tokens. When I process it I convert my token from "nice try" to pattern "nice\s+try". and try to replace with this pattern input text.
It works fine. But if in tokens there is more spaces and there is also punctuation then my regexes became bigger and works very slow.
Do you have some suggestions (technical or logic) for solving this problem?
I can suggest a few solutions.
First of all, avoid the static Regex method. Create an instance of it (and store it, don't call the constructor for each replacement!) and, if possible, use RegexOptions.Compiled. It should improve your performance.
Second, you can try to review your pattern. I'll do some profiling, but I'm currently undecisive between:
#"(?<=\s)\s+"
With replacement being an empty string or:
#"\s+"
With a space as a replacement. You can try this code, in the meanwhile:
var s = "Hello , how are you?";
var pattern = #"\s+";
var regex = new Regex(pattern, RegexOptions.Compiled);
var replaced = regex.Replace(s, " ");
EDIT: After having done some measurement, the second pattern seems to be faster. I'm editing my sample to adapt it.
EDIT 2: I've written an unsafe method. It's much faster than the other ones presented here, including the Regex ones, but, as the word itself says, it's unsafe. I don't think that there's any problem with the code I've written but I may be wrong -- So please, check it again and again in case there's a bug in the method.
static unsafe string TrimInternal(string input)
{
var length = input.Length;
var array = stackalloc char[length];
fixed (char* fix = input)
{
var ptr = fix;
var counter = 0;
var lastWasSpace = false;
while (*ptr != '\x0')
{
//Current char is a space?
var isSpace = *ptr == ' ';
//If it's a space but the last one wasn't
//Or if it's not a space
if (isSpace && !lastWasSpace || !isSpace)
//Write into the result array
array[counter++] = *ptr;
//The last character (before the next loop) was a space
lastWasSpace = isSpace;
//Increase the pointer
ptr++;
}
return new string(array, 0, counter);
}
}
Usage (compile with /unsafe):
var s = TrimInternal("Hello , how are you?");
Profiling made in Release build, optimizations on, 1000000 iterations:
My above solution with Regex: 00:00:03.2130121
The unsafe solution: 00:00:00.2063467
This might work for you. It should be pretty fast. Note that it also removes spaces at the end of the string; that might not be what you want...
using System;
namespace Demo
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine(">{0}<", RemoveExtraSpaces("Hello, how are you?"));
Console.WriteLine(">{0}<", RemoveExtraSpaces("Hello , how are you?"));
Console.WriteLine(">{0}<", RemoveExtraSpaces("Hello , how are you ?"));
}
public static string RemoveExtraSpaces(string text)
{
var buffer = new char[text.Length];
bool isSpaced = false;
int n = 0;
foreach (char c in text)
{
if (c == ' ')
{
isSpaced = true;
}
else
{
if (isSpaced)
{
if ((c != ',') && (c != '?'))
{
buffer[n++] = ' ';
}
isSpaced = false;
}
buffer[n++] = c;
}
}
return new string(buffer, 0, n);
}
}
}
Something of my own :
find all the position of WhiteSpacechar in string;
private static IEnumerable<int> GetWhiteSpacePos(string input)
{
int iPos = -1;
while ((iPos = input.IndexOf(" ", iPos + 1, StringComparison.Ordinal)) > -1)
{
yield return iPos;
}
}
Remove all whitespace that are in in sequence Returned from GetWhiteSpacePos
string original_string = "Hello , how are you ?";
var poss = GetWhiteSpacePos(original_string).ToList();
int startPos;
int endPos;
StringBuilder builder = new StringBuilder(original_string);
for (int i = poss.Count -1; i > 1; i--)
{
endPos = poss[i];
while ((poss[i] == poss[i - 1] + 1) && i > 1)
{
i--;
}
startPos = poss[i];
if (endPos - startPos > 1)
{
builder.Remove(startPos, endPos - startPos);
}
}
string new_string = builder.ToString();
You are using a very complex regex..simplify the regex and that would definitely increasre the performance
Use \s+ and replace it with a single space
Well, these kind of problems really trouble us. Use this code, and I'm sure you're getting the result for what you've asked. This command removes any extra white space between any string.
cleanString= Regex.Replace(originalString, #"\s", " ");
Hope thar works for you. Thanks.
And since this is a single Instruction. It will utilize less CPU resource and hence less CPU time, which ultimately increases your performance. Therefore A/C to me this method works the best when compared in terms of performance.
if its just a matter of SPACE;
try this
Source : http://www.codeproject.com/Articles/10890/Fastest-C-Case-Insenstive-String-Replace
private static string ReplaceEx(string original,
string pattern, string replacement)
{
int count, position0, position1;
count = position0 = position1 = 0;
string upperString = original.ToUpper();
string upperPattern = pattern.ToUpper();
int inc = (original.Length / pattern.Length) *
(replacement.Length - pattern.Length);
char[] chars = new char[original.Length + Math.Max(0, inc)];
while ((position1 = upperString.IndexOf(upperPattern,
position0)) != -1)
{
for (int i = position0; i < position1; ++i)
chars[count++] = original[i];
for (int i = 0; i < replacement.Length; ++i)
chars[count++] = replacement[i];
position0 = position1 + pattern.Length;
}
if (position0 == 0) return original;
for (int i = position0; i < original.Length; ++i)
chars[count++] = original[i];
return new string(chars, 0, count);
}
Usage:
string original_string = "Hello , how are you ?";
while (original_string.Contains(" "))
{
original_string = ReplaceEx(original_string, " ", " ");
}
Replacing the regex way:
string resultString = null;
try {
resultString = Regex.Replace(subjectString, #"\s+", " ", RegexOption.Compiled);
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}
I'm trying to fetch multiple email addresses seperated by "," within string from database table, but it's also returning me whitespaces, and I want to remove the whitespace quickly.
The following code does remove whitespace, but it also becomes slow whenever I try to fetch large number email addresses in a string like to 30000, and then try to remove whitespace between them. It takes more than four to five minutes to remove those spaces.
Regex Spaces =
new Regex(#"\s+", RegexOptions.Compiled);
txtEmailID.Text = MultipleSpaces.Replace(emailaddress),"");
Could anyone please tell me how can I remove the whitespace within a second even for large number of email address?
I would build a custom extension method using StringBuilder, like:
public static string ExceptChars(this string str, IEnumerable<char> toExclude)
{
StringBuilder sb = new StringBuilder(str.Length);
for (int i = 0; i < str.Length; i++)
{
char c = str[i];
if (!toExclude.Contains(c))
sb.Append(c);
}
return sb.ToString();
}
Usage:
var str = s.ExceptChars(new[] { ' ', '\t', '\n', '\r' });
or to be even faster:
var str = s.ExceptChars(new HashSet<char>(new[] { ' ', '\t', '\n', '\r' }));
With the hashset version, a string of 11 millions of chars takes less than 700 ms (and I'm in debug mode)
EDIT :
Previous code is generic and allows to exclude any char, but if you want to remove just blanks in the fastest possible way you can use:
public static string ExceptBlanks(this string str)
{
StringBuilder sb = new StringBuilder(str.Length);
for (int i = 0; i < str.Length; i++)
{
char c = str[i];
switch (c)
{
case '\r':
case '\n':
case '\t':
case ' ':
continue;
default:
sb.Append(c);
break;
}
}
return sb.ToString();
}
EDIT 2 :
as correctly pointed out in the comments, the correct way to remove all the blanks is using char.IsWhiteSpace method :
public static string ExceptBlanks(this string str)
{
StringBuilder sb = new StringBuilder(str.Length);
for (int i = 0; i < str.Length; i++)
{
char c = str[i];
if(!char.IsWhiteSpace(c))
sb.Append(c);
}
return sb.ToString();
}
Given the implementation of string.Replaceis written in C++ and part of the CLR runtime I'm willing to bet
email.Replace(" ","").Replace("\t","").Replace("\n","").Replace("\r","");
will be the fastest implementation. If you need every type of whitespace, you can supply the hex value the of unicode equivalent.
With linq you can do it simply:
emailaddress = new String(emailaddress
.Where(x=>x!=' ' && x!='\r' && x!='\n')
.ToArray());
I didn't compare it with stringbuilder approaches, but is much more faster than string based approaches.
Because it does not create many copy of strings (string is immutable and using it directly causes to dramatically memory and speed problems), so it's not going to use very big memory and not going to slow down the speed (except one extra pass through the string at first).
You should try String.Trim(). It will trim all spaces from start to end of a string
Or you can try this method from linked topic: [link]
public static unsafe string StripTabsAndNewlines(string s)
{
int len = s.Length;
char* newChars = stackalloc char[len];
char* currentChar = newChars;
for (int i = 0; i < len; ++i)
{
char c = s[i];
switch (c)
{
case '\r':
case '\n':
case '\t':
continue;
default:
*currentChar++ = c;
break;
}
}
return new string(newChars, 0, (int)(currentChar - newChars));
}
emailaddress.Replace(" ", string.Empty);
There are many diffrent ways, some faster then others:
public static string StripTabsAndNewlines(this string str) {
//string builder (fast)
StringBuilder sb = new StringBuilder(str.Length);
for (int i = 0; i < str.Length; i++) {
if ( ! Char.IsWhiteSpace(s[i])) {
sb.Append();
}
}
return sb.tostring();
//linq (faster ?)
return new string(str.ToCharArray().Where(c => !Char.IsWhiteSpace(c)).ToArray());
//regex (slow)
return Regex.Replace(str, #"\s+", "")
}
Please use the TrimEnd() method of the String class. You can find a great example here.
You should consider replacing spaces on the record-set within your stored procedure or query using the REPLACE( ) function if possible & even better fix your DB records since a space in an email address is invalid anyways.
As mentioned by others you would need to profile the different approaches. If you are using Regex you should minimally make it a class-level static variable:
public static Regex MultipleSpaces = new Regex(#"\s+", RegexOptions.Compiled);
emailAddress.Where(x=>{ return x != ' ';}).ToString( ) is likely to have function overhead although it could be optimized to inline by Microsoft -- again profiling will give you the answer.
The most efficient method would be to allocate a buffer and copy character by character to a new buffer and skip the spaces as you do that. C# does support pointers so you could use unsafe code, allocate a raw buffer and use pointer arithmetic to copy just like in C and that is as fast as this can possibly be done. The REPLACE( ) in SQL will handle it like that for you.
string str = "Hi!! this is a bunch of text with spaces";
MessageBox.Show(new String(str.Where(c => c != ' ').ToArray()));
I haven't done performance testing on this, but it's simpler than most of the other answers.
var s1 = "\tstring \r with \t\t \nwhitespace\r\n";
var s2 = string.Join("", s1.Split());
The result is
stringwithwhitespace
string input =Yourinputstring;
string[] strings = input.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
foreach (string value in strings)
{
string newv= value.Trim();
if (newv.Length > 0)
newline += value + "\r\n";
}
string s = " Your Text ";
string new = s.Replace(" ", string.empty);
// Output:
// "YourText"
Fastest and general way to do this (line terminators, tabs will be processed as well). Regex powerful facilities don't really needed to solve this problem, but Regex can decrease performance.
new string
(stringToRemoveWhiteSpaces
.Where
(
c => !char.IsWhiteSpace(c)
)
.ToArray<char>()
)