I have written an extension method for string manipulation, but I'm not sure what to name it. It will become part of the base library that the front-end developers on the team will use. Here's the profile of the class member.
Info: Utility extension method for String types. Overloads of this method may do the same thing for characters other than space [using whatever is supplied in the argument].
Purpose: Trims down all intermediate or in-between spaces to single space.
Ex:
string Input = "Hello   Token1  Token2    Token3 World!  ";
string Output = Input.TrimSpacesInBetween();
//Output will be: "Hello Token1 Token2 Token3 World!"
I have read [in fact I'm reading] the Framework Design guidelines but this seems to be bothering me.
Some options I think..
TrimIntermediate();
TrimInbetween();
Here's the code on Request:
It's recursive..
public static class StringExtensions
{
    public static string Collapse(this string str)
    {
        return str.Collapse(' ');
    }

    public static string Collapse(this string str, char delimiter)
    {
        str = str.Trim(delimiter);
        int indexOfFirstDelimiter = str.IndexOf(delimiter);
        // Base case: no delimiter left (this also covers the empty string).
        if (indexOfFirstDelimiter == -1)
            return str;

        // Skip over the run of delimiters that follows the first one.
        int indexTracker = indexOfFirstDelimiter + 1;
        while (str[indexTracker] == delimiter)
            indexTracker++;

        // Remove the extra delimiters, keeping only the first.
        str = str.Remove(indexOfFirstDelimiter + 1, indexTracker - indexOfFirstDelimiter - 1);
        string prevStr = str.Substring(0, indexOfFirstDelimiter + 1);
        // Recurse on the remainder of the string.
        string nextPart = str.Substring(indexOfFirstDelimiter + 1).Collapse(delimiter);
        return prevStr + nextPart;
    }
}
What about CollapseSpaces?
CollapseSpaces is good for just spaces, but to allow for the overloads you might want CollapseDelimiters or CollapseWhitespace if it's really just going to be for various whitespace characters.
Not really an answer, more a comment on your posted code...
You could make the method a lot shorter and more understandable by using a regular expression. (My guess is that it would probably perform better than the recursive string manipulations too, but you would need to benchmark to find out for sure.)
using System.Text.RegularExpressions;

public static class StringExtensions
{
    public static string Collapse(this string str)
    {
        return str.Collapse(' ');
    }

    public static string Collapse(this string str, char delimiter)
    {
        str = str.Trim(delimiter);
        string delim = delimiter.ToString();
        // Replace every run of two or more delimiters with a single one.
        return Regex.Replace(str, Regex.Escape(delim) + "{2,}", delim);
    }
}
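For comparison, here is a minimal sketch of a third option: a single pass with a StringBuilder, using neither recursion nor a regular expression. The names CollapseSketch and CollapseRuns are made up for illustration and are not part of the code posted above. Whether it is faster than the regex version would still have to be benchmarked.

```csharp
using System.Text;

public static class CollapseSketch
{
    // Hypothetical helper: collapses runs of `delimiter` to a single occurrence
    // and drops leading and trailing delimiters.
    public static string CollapseRuns(string input, char delimiter)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        StringBuilder result = new StringBuilder(input.Length);
        // Starts as true so that leading delimiters are skipped.
        bool previousWasDelimiter = true;
        foreach (char c in input)
        {
            if (c == delimiter)
            {
                // Keep only the first delimiter of each run.
                if (!previousWasDelimiter)
                    result.Append(c);
                previousWasDelimiter = true;
            }
            else
            {
                result.Append(c);
                previousWasDelimiter = false;
            }
        }

        // The loop may have kept one trailing delimiter; remove it.
        if (result.Length > 0 && result[result.Length - 1] == delimiter)
            result.Length--;

        return result.ToString();
    }
}
```

It is written as a plain static method here; adding `this` to the first parameter would turn it into an extension method like the ones above.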
In ruby I believe they call this squeeze
NormalizeWhitespace ?
This way is more clear that there will be a usable value left after processing.
As others have stated earlier, 'Collapse' sounds somewhat rigorous and might even suggest that it can return an empty string.
Try this, it works for me and seems to be a lot less complicated than a recursive solution...
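public static class StringExtensions
{
    public static string NormalizeWhitespace(this string input, char delim)
    {
        string d = delim.ToString();
        // Regex.Escape keeps delimiters such as '^' or '\\' from being read as regex syntax.
        return System.Text.RegularExpressions.Regex.Replace(input.Trim(delim), System.Text.RegularExpressions.Regex.Escape(d) + "{2,}", d);
    }
}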
It can be called as such:
Console.WriteLine(input.NormalizeWhitespace(' '));
CollapseExtraWhitespace
PaulaIsBrilliant of course!
How about MakeCompact?
I have the following string
string a = @"\\server\MainDirectory\SubDirectoryA\SubdirectoryB\SubdirectoryC\Test.jpg";
I'm trying to remove part of the string so in the end I want to be left with
string a = @"\\server\MainDirectory\SubDirectoryA\SubdirectoryB";
So currently I'm doing
string b = a.Remove(a.LastIndexOf('\\'));
string c = b.Remove(b.LastIndexOf('\\'));
Console.WriteLine(c);
which gives me the correct result. Is there a better way of doing this? I have to do it in a fair few places.
Note: the length of SubdirectoryC is unknown, as it is made up of the numbers/letters a user inputs.
There is Path.GetDirectoryName
string a = @"\\server\MainDirectory\SubDirectoryA\SubdirectoryB\SubdirectoryC\Test.jpg";
string b = Path.GetDirectoryName(Path.GetDirectoryName(a));
As explained on MSDN, it also works if you pass a directory:
"... passing the returned path back into the GetDirectoryName method will result in the truncation of one folder level per subsequent call on the result string"
Of course, this is only safe if the path has at least two directory levels.
Heyho,
if you just want to get rid of the last part, you can use:
var parentDirectory = Directory.GetParent(Path.GetDirectoryName(path));
https://msdn.microsoft.com/de-de/library/system.io.directory.getparent(v=vs.110).aspx
An alternative answer using Linq:
var b = string.Join("\\", a.Split(new string[] { "\\" }, StringSplitOptions.None)
.Reverse().Skip(2).Reverse());
Some alternatives
string a = @"\\server\MainDirectory\SubDirectoryA\SubdirectoryB\SubdirectoryC\Test.jpg";
var b = Path.GetFullPath(a + @"\..\..");
var c = a.Remove(a.LastIndexOf('\\', a.LastIndexOf('\\') - 1));
but I do find this kind of string extension generally useful:
static string beforeLast(this string str, string delimiter)
{
    int i = str.LastIndexOf(delimiter);
    if (i < 0) return str;
    return str.Remove(i);
}
For such repeated tasks, a good solution is often to write an extension method, e.g.
public static class Extensions
{
public static string ChopPath(this string path)
{
// chopping code here
}
}
Which you then can use anywhere you need it:
var chopped = a.ChopPath();
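As a rough sketch of what the "chopping code" might look like, the version below removes a given number of trailing segments using only LastIndexOf, so it does not depend on how System.IO.Path treats separators on a given platform. The class name, the `levels` and `separator` parameters, and the decision to stop early when the path has fewer segments than requested are all assumptions made for illustration.

```csharp
using System;

public static class PathStringSketch
{
    // Hypothetical helper: removes the last `levels` segments from `path`,
    // where segments are delimited by `separator`.
    public static string ChopPath(this string path, int levels, char separator)
    {
        if (path == null)
            throw new ArgumentNullException("path");

        int end = path.Length;
        for (int i = 0; i < levels && end > 0; i++)
        {
            // Search backwards from just before the current cut point.
            int index = path.LastIndexOf(separator, end - 1);
            if (index < 0)
                break; // fewer segments than requested: keep what is left
            end = index;
        }
        return path.Substring(0, end);
    }
}
```

With the path from the question, `a.ChopPath(2, '\\')` would drop both `Test.jpg` and `SubdirectoryC`.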
I have a filepath that follows the following pattern:
Some\File\Path\Base\yyyy\MM\dd\HH\mm\Random8.3
I want to extract everything from 2012 and beyond, but the problem is that while the right side is standard the base directory can be different for each record.
Here are two examples:
C:\Temp\X\2012\08\27\18\35\wy32dm1q.qyt
Returns: 2012\08\27\18\35\wy32dm1q.qyt
D:\Temp\X\Y\2012\08\27\18\36\tx84uwvr.puq
Returns: 2012\08\27\18\36\tx84uwvr.puq
Right now I'm grabbing the LastIndexOf(Path.DirectorySeparatorChar) N number of times to get the index of the string right before 2012, then getting the substring from that index on. But, I have a feeling that maybe there is a better way?
static void Main(string[] args)
{
    Console.WriteLine(GetLastParts(@"D:\Temp\X\Y\2012\08\27\18\36\tx84uwvr.puq", @"\", 6));
    Console.ReadLine();
}
static string GetLastParts(string text, string separator, int count)
{
    string[] parts = text.Split(new string[] { separator }, StringSplitOptions.None);
    return string.Join(separator, parts.Skip(parts.Count() - count).Take(count).ToArray());
}
Here's a solution that uses regular expressions, assuming the format you're looking for always contains \yyyy\MM\dd\HH\mm.
class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine(ExtractPath(@"C:\Temp\X\2012\08\27\18\35\wy32dm1q.qyt"));
        Console.WriteLine(ExtractPath(@"D:\Temp\X\Y\2012\08\27\18\36\tx84uwvr.puq"));
    }
    static string ExtractPath(string fullPath)
    {
        // \u005C is the backslash; the dot before the extension is escaped so it only matches a literal '.'.
        string regexconvention = String.Format(@"\d{{4}}\u{0:X4}(\d{{2}}\u{0:X4}){{4}}\w{{8}}\.\w{{3}}", Convert.ToInt32(Path.DirectorySeparatorChar, CultureInfo.InvariantCulture));
        return Regex.Match(fullPath, regexconvention).Value;
    }
}
A C# solution would be
string str = @"C:\Temp\X\2012\08\27\18\35\wy32dm1q.qyt";
string[] arr = str.Substring(str.IndexOf("2012")).Split(new char[] { '\\' });
I don't think there's anything wrong w/ your current approach. It's likely the best for the job.
public string GetFilepath(int nth, string needle, string haystack) {
int lastindex = haystack.Length;
for (int i=nth; i>=0; i--)
lastindex = haystack.LastIndexOf(needle, lastindex-1);
return haystack.Substring(lastindex);
}
I'd keep it simple (KISS). Easier to debug/maintain and probably twice as fast as regex variant.
I am a beginner programmer in C# who just got started. I have a task at hand where a program needs to read a string and perform some string manipulation. The UI provides a TextBox and all the options below as CheckBoxes. User can select any or all.
Remove any spaces.
Remove any special chars like ',' etc.
Remove any numbers.
Convert to camelCase.
There can be more options as part of the string cleanup. I have written the string processing in a method that has a chasm of if ... else ifs ...
I am sure there is a way around.
Appreciate any help.
Thanks for all the solutions, but I think my point was not put across correctly.
The string processing will be done in a particular order depending on the checkbox value.
User might select just one or every option provided. In case there is more than one selected, it should be like
if (RemoveSpaces.Checked)
{
    inputString = RemoveSpaces(inputString);
    // After removing spaces do the other operations
}
else if (RemoveSpecialChars.Checked)
{
    inputString = RemoveSpecialChars(inputString);
    // Do other processing
}
For easy string manipulation, use String.Replace.
This code example might also help:
string start = "a b 3 4 5.7";
string noSpace = start.Replace(" ", "");
string noDot = noSpace.Replace(".", "");
string noNumbers = Regex.Replace(noDot, "[0-9]", "");
Console.WriteLine(start);
Console.WriteLine(noSpace);
Console.WriteLine(noDot);
Console.WriteLine(noNumbers);
The output will then be as follows
"a b 3 4 5.7" // start
"ab345.7" // noSpace
"ab3457" // noDot
"ab" // noNumbers
You can create a class with four functions inside, for example:
public static class StringOperations
{
public static string RemoveSpaces(string sourceString)
{
string convertedString = "";
//some operations
return convertedString;
}
public static string RemoveCharacters(string sourceString, params char[] charactersToRemove)
{
string convertedString = "";
//some operations
return convertedString;
}
public static string RemoveAnyNumbers(string sourceString)
{
string convertedString = "";
//some operations
return convertedString;
}
public static string ConvertToCamelCase(string sourceString)
{
string convertedString = "";
//some operations
return convertedString;
}
}
In your UI, you just call whichever of the functions you need.
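To avoid a long chain of if ... else if branches when several options are ticked, one possible approach is to collect the selected operations in a list and apply them one after another in a fixed order. The sketch below assumes the checkbox states are passed in as plain booleans; the class name CleanupPipelineSketch, the Clean method and the two sample operations are hypothetical and only illustrate the pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CleanupPipelineSketch
{
    public static string RemoveSpaces(string s)
    {
        return s.Replace(" ", "");
    }

    public static string RemoveDigits(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (!char.IsDigit(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    // The bool parameters stand in for the CheckBox.Checked values (an assumption).
    public static string Clean(string input, bool removeSpaces, bool removeDigits)
    {
        // The order of the Add calls fixes the processing order.
        List<Func<string, string>> steps = new List<Func<string, string>>();
        if (removeSpaces) steps.Add(RemoveSpaces);
        if (removeDigits) steps.Add(RemoveDigits);

        // Each selected step receives the output of the previous one.
        string result = input;
        foreach (Func<string, string> step in steps)
            result = step(result);
        return result;
    }
}
```

Adding another cleanup option then means writing one more method and one more `steps.Add(...)` line, rather than another branch.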
I am using C# 2.0 and I have strings of the following types:
string id = "tcm:481-191820"; or "tcm:481-191820-32"; or "tcm:481-191820-8"; or "tcm:481-191820-128";
The last part of the string (-32, -8, -128) doesn't matter; whatever the string is, it should produce the result below.
Now I need to write a function that takes one of the above strings as input and outputs "tcm:0-481-1", something like this:
public static string GetPublicationID(string id)
{
    // this function will return the output below
    return "tcm:0-481-1";
}
Please suggest!!
If final "-1" is static you could use:
public static string GetPublicationID(string id)
{
int a = 1 + id.IndexOf(':');
string first = id.Substring(0, a);
string second = id.Substring(a, id.IndexOf('-') - a);
return String.Format("{0}0-{1}-1", first, second);
}
or if "-1" is first part of next token, try this
public static string GetPublicationID(string id)
{
int a = 1 + id.IndexOf(':');
string first = id.Substring(0, a);
string second = id.Substring(a, id.IndexOf('-') - a + 2);
return String.Format("{0}0-{1}", first, second);
}
This syntax works even for different length patterns, assuming that your string is
first_part:second_part-anything_else
All you need is:
string.Format("{0}0-{1}", id.Substring(0,4), id.Substring(4,5));
This just uses substring to get the first four characters and then the next five and put them into the format with the 0- in there.
This does assume that your format is a fixed number of characters in each position (which it is in your example). If the string might be abcd:4812... then you will have to modify it slightly to pick up the right length of strings. See Marco's answer for that technique. I'd advise using his if you need the variable length and mine if the lengths stay the same.
Also as an additional note your original function of returning a static string does work for all of those examples you provided. I have assumed there are other numbers visible but if it is only the suffix that changes then you could happily use a static string (at which point declaring a constant or something rather than using a method would probably work better).
Obligatory Regular Expression Answer:
using System.Text.RegularExpressions;

public static string GetPublicationID(string id)
{
    // Capture the first number plus the first digit of the next token.
    Match m = Regex.Match(id, @"tcm:(\d+-\d)");
    if (m.Success)
        return string.Format("tcm:0-{0}", m.Groups[1].Value);
    else
        return string.Empty;
}
Regex regxMatch = new Regex("(?<prefix>tcm:)(?<id>\\d+-\\d)(?<suffix>.)*",RegexOptions.Singleline|RegexOptions.Compiled);
string regxReplace = "${prefix}0-${id}";
string GetPublicationID(string input) {
return regxMatch.Replace(input, regxReplace);
}
string test = "tcm:481-191820-128";
string result = GetPublicationID(test);
//result: tcm:0-481-1
Let's say I have a string "hello world". I would like to end up with " dehllloorw". As I couldn't find any ready-made solution, I thought I could split the string into a character array, sort it and convert it back to a string.
In Perl I can do s//, but in .NET I'd have to do a .Split(), and there's no overload with no parameters. If I do .Split(null) it seems to split by whitespace, and .Split('') won't compile.
How do I do this (I hate to run a loop!)?
Array.Sort("hello world".ToCharArray());
Below is a quick demo console app
class Program
{
static void Main(string[] args)
{
var array = "hello world".ToCharArray();
Array.Sort(array);
Console.WriteLine(new String(array));
Console.ReadLine();
}
}
The characters in a string can be used directly: the string class exposes them as an IEnumerable<char>. Combine that with LINQ's OrderBy and you have a one-liner to create the ordered output string:
string myString = "hello world";
string output = new string(myString.OrderBy(x => x).ToArray()); // " dehllloorw" (the space sorts first)
You could always do this:
private static string SortStringCharacters(string value)
{
    if (value == null)
        return null;
    // Sort a copy of the characters in place, then build the result.
    char[] chars = value.ToCharArray();
    Array.Sort(chars);
    return new string(chars);
}