How to separate string after whitespace in C#

I'm using C# and have a string like x="12 $Math A Level$" that could also be x="12 Math A Level".
How can I separate this string in order to have a variable year=12 and subject=Math A Level?
I was using something like:
char[] whitespace = new char[] { ' ', '\t' };
var x = item.Split(whitespace);
but then I didn't know what to do after or if there's a better way to do this.

You could use the overload of Split that takes a count:
var examples = new []{"2 $Math A Level$", "<some_num> <some text>"} ;
foreach(var s in examples)
{
var parts = s.Split(' ', count: 2, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
Console.WriteLine($"'{parts[0]}', '{parts[1]}'");
}
This prints:
'2', '$Math A Level$'
'<some_num>', '<some text>'
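StringSplitOptions.TrimEntries only exists from .NET 5 onward, so on older targets the call above will not compile. A rough equivalent, assuming the same "number, then text" input, might look like the sketch below. LegacySplit and SplitInTwo are names made up for this example.

```csharp
using System;
using System.Linq;

public static class LegacySplit
{
    // Hypothetical helper: splits at the first run of whitespace and trims both parts.
    public static string[] SplitInTwo(string s)
    {
        return s.Trim()
                .Split(new[] { ' ', '\t' }, 2, StringSplitOptions.RemoveEmptyEntries)
                .Select(part => part.Trim()) // stands in for TrimEntries
                .ToArray();
    }
}
```

Calling LegacySplit.SplitInTwo("12 $Math A Level$") should give the two parts "12" and "$Math A Level$".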

You could do
var item = "12 Math A Level";
var index = item.IndexOf(' ');
var year = item.Substring(0, index);
var subject = item.Substring(index + 1, item.Length - index-1).Trim('$');
This assumes that the year is the first word, and the subject is everything else. It also assumes you are not interested in any '$' signs. You might also want to add a check that the index was actually found, in case there are no spaces in the string.
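The check mentioned above could be sketched like this. It is only one way to do it, and YearSubjectParser and TrySplit are hypothetical names, not part of the framework.

```csharp
using System;

public static class YearSubjectParser
{
    // Hypothetical helper: splits "12 $Math A Level$" into "12" and "Math A Level".
    // Returns false when there is no space to split on.
    public static bool TrySplit(string item, out string year, out string subject)
    {
        year = null;
        subject = null;
        if (string.IsNullOrWhiteSpace(item)) return false;

        item = item.Trim();
        int index = item.IndexOf(' ');
        if (index < 0) return false; // IndexOf returns -1 when no space is found

        year = item.Substring(0, index);
        subject = item.Substring(index + 1).Trim().Trim('$');
        return true;
    }
}
```

The Try pattern lets the caller decide what to do with malformed input instead of getting an ArgumentOutOfRangeException from Substring.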

To add a Regex-based answer:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static readonly Regex regex = new Regex(@"(?<ID>[0-9]+)\s+[$]?(?<Text>[^$]*)[$]?", RegexOptions.Compiled);
public static void Main()
{
MatchCollection matches = regex.Matches("12 $Math A Level$");
foreach( Match m in matches )
{
Console.WriteLine($"{(m.Groups["ID"].Value)} | {(m.Groups["Text"].Value)}");
}
matches = regex.Matches("13 Math B Level");
foreach( Match m in matches )
{
Console.WriteLine($"{(m.Groups["ID"].Value)} | {(m.Groups["Text"].Value)}");
}
}
}
In action: https://dotnetfiddle.net/6XEQw8
Output:
12 | Math A Level
13 | Math B Level
To explain the expression:
(?<ID>[0-9]+)\s+[$]?(?<Text>[^$]*)[$]?
(?<ID>[0-9]+) - Named Capture-Group "ID"
[0-9] - Match literal chars '0' to '9'
+ - ^^ One or more times
\s+ - Match whitespace one or more times
[$]? - Match literal '$' one or zero times
(?<Text>[^$]*) - Named Capture-Group "Text"
[^$] - Match anything that is _not_ literal '$'
* - ^^ Zero or more times
[$]? - Match literal '$' one or zero times
See also https://regex101.com/r/WV366l/1
Mind: I personally would benchmark this solution against a (or several) non-regex solutions and then make a choice.

var x = "12 $Math A Level$".Split('$', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
string year = x[0];
string subject = x[1];
Console.WriteLine(year);
Console.WriteLine(subject);

If you can rely on the string format specified ("12 $Math A Level$"), you could split at $ like this:
using System;
public class Program
{
public static void Main()
{
var sample = "12 $Math A Level$";
var rec = Parse(sample);
Console.WriteLine($"Year={rec.Year}\nSubject={rec.Subject}");
}
private static Record Parse(string value)
{
var delimiter = new char[] { '$' };
var parts = value.Split(delimiter, StringSplitOptions.RemoveEmptyEntries);
return new Record { Year = Convert.ToInt32(parts[0]), Subject = parts[1] };
}
public class Record
{
public int Year { get; set; }
public string Subject { get; set; }
}
}
Output:
Year=12
Subject=Math A Level
▶️ Try it out here: https://dotnetfiddle.net/DAFLjA

Related

Want to save objects with specific chars only, removing the chars which are not found in char list [duplicate]

How might I remove characters from a string? For example: "My name #is ,Wan.;'; Wan".
I would like to remove the characters '#', ',', '.', ';', '\'' from that string so that it becomes "My name is Wan Wan"
var str = "My name #is ,Wan.;'; Wan";
var charsToRemove = new string[] { "#", ",", ".", ";", "'" };
foreach (var c in charsToRemove)
{
str = str.Replace(c, string.Empty);
}
But I may suggest another approach if you want to remove all non letter characters
var str = "My name #is ,Wan.;'; Wan";
str = new string((from c in str
where char.IsWhiteSpace(c) || char.IsLetterOrDigit(c)
select c
).ToArray());
Simple:
String.Join("", "My name #is ,Wan.;'; Wan".Split('#', ',' ,'.' ,';', '\''));
Sounds like an ideal application for RegEx -- an engine designed for fast text manipulation. In this case:
Regex.Replace("He\"ll,o Wo'r.ld", "[#,\\.\";'\\\\]", string.Empty)
Comparing various suggestions (as well as comparing in the context of single-character replacements with various sizes and positions of the target).
In this particular case, splitting on the targets and joining on the replacements (in this case, empty string) is the fastest by at least a factor of 3. Ultimately, performance is different depending on the number of replacements, where the replacements are in the source, and the size of the source. #ymmv
Results
(full results here)
| Test | Compare | Elapsed |
|---------------------------|---------|--------------------------------------------------------------------|
| SplitJoin | 1.00x | 29023 ticks elapsed (2.9023 ms) [in 10K reps, 0.00029023 ms per] |
| Replace | 2.77x | 80295 ticks elapsed (8.0295 ms) [in 10K reps, 0.00080295 ms per] |
| RegexCompiled | 5.27x | 152869 ticks elapsed (15.2869 ms) [in 10K reps, 0.00152869 ms per] |
| LinqSplit | 5.43x | 157580 ticks elapsed (15.758 ms) [in 10K reps, 0.0015758 ms per] |
| Regex, Uncompiled | 5.85x | 169667 ticks elapsed (16.9667 ms) [in 10K reps, 0.00169667 ms per] |
| Regex | 6.81x | 197551 ticks elapsed (19.7551 ms) [in 10K reps, 0.00197551 ms per] |
| RegexCompiled Insensitive | 7.33x | 212789 ticks elapsed (21.2789 ms) [in 10K reps, 0.00212789 ms per] |
| Regex Insensitive | 7.52x | 218164 ticks elapsed (21.8164 ms) [in 10K reps, 0.00218164 ms per] |
Test Harness (LinqPad)
(note: the Perf and Vs are timing extensions I wrote)
void test(string title, string sample, string target, string replacement) {
var targets = target.ToCharArray();
var tox = "[" + target + "]";
var x = new Regex(tox);
var xc = new Regex(tox, RegexOptions.Compiled);
var xci = new Regex(tox, RegexOptions.Compiled | RegexOptions.IgnoreCase);
// no, don't dump the results
var p = new Perf/*<string>*/();
p.Add(string.Join(" ", title, "Replace"), n => targets.Aggregate(sample, (res, curr) => res.Replace(new string(curr, 1), replacement)));
p.Add(string.Join(" ", title, "SplitJoin"), n => String.Join(replacement, sample.Split(targets)));
p.Add(string.Join(" ", title, "LinqSplit"), n => String.Concat(sample.Select(c => targets.Contains(c) ? replacement : new string(c, 1))));
p.Add(string.Join(" ", title, "Regex"), n => Regex.Replace(sample, tox, replacement));
p.Add(string.Join(" ", title, "Regex Insensitive"), n => Regex.Replace(sample, tox, replacement, RegexOptions.IgnoreCase));
p.Add(string.Join(" ", title, "Regex, Uncompiled"), n => x.Replace(sample, replacement));
p.Add(string.Join(" ", title, "RegexCompiled"), n => xc.Replace(sample, replacement));
p.Add(string.Join(" ", title, "RegexCompiled Insensitive"), n => xci.Replace(sample, replacement));
var trunc = 40;
var header = sample.Length > trunc ? sample.Substring(0, trunc) + "..." : sample;
p.Vs(header);
}
void Main()
{
// also see https://stackoverflow.com/questions/7411438/remove-characters-from-c-sharp-string
"Control".Perf(n => { var s = "*"; });
var text = "My name #is ,Wan.;'; Wan";
var clean = new[] { '#', ',', '.', ';', '\'' };
test("stackoverflow", text, string.Concat(clean), string.Empty);
var target = "o";
var f = "x";
var replacement = "1";
var fillers = new Dictionary<string, string> {
{ "short", new String(f[0], 10) },
{ "med", new String(f[0], 300) },
{ "long", new String(f[0], 1000) },
{ "huge", new String(f[0], 10000) }
};
var formats = new Dictionary<string, string> {
{ "start", "{0}{1}{1}" },
{ "middle", "{1}{0}{1}" },
{ "end", "{1}{1}{0}" }
};
foreach(var filler in fillers)
foreach(var format in formats) {
var title = string.Join("-", filler.Key, format.Key);
var sample = string.Format(format.Value, target, filler.Value);
test(title, sample, target, replacement);
}
}
Less specific to your question, it is possible to remove ALL punctuation from a string (except space) by white listing the acceptable characters in a regular expression:
string dirty = "My name #is ,Wan.;'; Wan";
// only space, capital A-Z, lowercase a-z, and digits 0-9 are allowed in the string
string clean = Regex.Replace(dirty, "[^A-Za-z0-9 ]", "");
Note there is a space after that 9 so as not to remove spaces from your sentence. The third argument is an empty string which serves to replace any substring that does not belong in the regular expression.
string x = "My name #is ,Wan.;'; Wan";
string modifiedString = x.Replace("#", "").Replace(",", "").Replace(".", "").Replace(";", "").Replace("'", "");
The simplest way would be to use String.Replace. It is an instance method, so call it on the string you want to change:
s = s.Replace("StringToReplace", "NewString");
Here's a method I wrote that takes a slightly different approach. Rather than specifying the characters to remove, I tell my method which characters I want to keep -- it will remove all other characters.
In the OP's example, he only wants to keep alphabetical characters and spaces. Here's what a call to my method would look like (C# demo):
var str = "My name #is ,Wan.;'; Wan";
// "My name is Wan Wan"
var result = RemoveExcept(str, alphas: true, spaces: true);
Here's my method:
/// <summary>
/// Returns a copy of the original string containing only the set of whitelisted characters.
/// </summary>
/// <param name="value">The string that will be copied and scrubbed.</param>
/// <param name="alphas">If true, all alphabetical characters (a-zA-Z) will be preserved; otherwise, they will be removed.</param>
/// <param name="numerics">If true, all numeric characters (0-9) will be preserved; otherwise, they will be removed.</param>
/// <param name="dashes">If true, all dash characters (-) will be preserved; otherwise, they will be removed.</param>
/// <param name="underlines">If true, all underscore characters (_) will be preserved; otherwise, they will be removed.</param>
/// <param name="spaces">If true, all whitespace (e.g. spaces, tabs) will be preserved; otherwise, they will be removed.</param>
/// <param name="periods">If true, all dot characters (".") will be preserved; otherwise, they will be removed.</param>
public static string RemoveExcept(string value, bool alphas = false, bool numerics = false, bool dashes = false, bool underlines = false, bool spaces = false, bool periods = false) {
if (string.IsNullOrWhiteSpace(value)) return value;
if (new[] { alphas, numerics, dashes, underlines, spaces, periods }.All(x => x == false)) return value;
var whitelistChars = new HashSet<char>(string.Concat(
alphas ? "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" : "",
numerics ? "0123456789" : "",
dashes ? "-" : "",
underlines ? "_" : "",
periods ? "." : "",
spaces ? " " : ""
).ToCharArray());
var scrubbedValue = value.Aggregate(new StringBuilder(), (sb, @char) => {
if (whitelistChars.Contains(@char)) sb.Append(@char);
return sb;
}).ToString();
return scrubbedValue;
}
Another simple solution:
var forbiddenChars = @"#,.;'".ToCharArray();
var dirty = "My name #is ,Wan.;'; Wan";
var clean = new string(dirty.Where(c => !forbiddenChars.Contains(c)).ToArray());
new List<string> { "#", ",", ".", ";", "'" }.ForEach(m => str = str.Replace(m, ""));
Taking the performance figures from @drzaus, here is an extension method that uses the fastest algorithm.
public static class StringEx
{
public static string RemoveCharacters(this string s, params char[] unwantedCharacters)
=> s == null ? null : string.Join(string.Empty, s.Split(unwantedCharacters));
}
Usage
var name = "edward woodward!";
var removeDs = name.RemoveCharacters('d', '!');
Assert.Equal("ewar woowar", removeDs); // old joke
A string is just a character array so use Linq to do the replace (similar to Albin above except uses a linq contains statement to do the replace):
var resultString = new string(
(from ch in "My name #is ,Wan.;'; Wan"
where !@"#,.;\'".Contains(ch)
select ch).ToArray());
The first string is the string to replace chars in and the
second is a simple string containing the chars
I might as well throw this out here.
Make an extension to remove chars from a string:
public static string RemoveChars(this string input, params char[] chars)
{
var sb = new StringBuilder();
for (int i = 0; i < input.Length; i++)
{
if (!chars.Contains(input[i]))
sb.Append(input[i]);
}
return sb.ToString();
}
And it's usable like this:
string str = "My name #is ,Wan.;'; Wan";
string cleanedUpString = str.RemoveChars('#', ',', '.', ';', '\'');
Or just like this:
string str = "My name #is ,Wan.;'; Wan".RemoveChars('#', ',', '.', ';', '\'');
It seems that the shortest way is to combine LINQ and string.Concat:
var input = @"My name #is ,Wan.;'; Wan";
var chrs = new[] {'#', ',', '.', ';', '\''};
var result = string.Concat(input.Where(c => !chrs.Contains(c)));
// => result = "My name is Wan Wan"
See the C# demo. Note that string.Concat is a shortcut to string.Join("", ...).
Note that using a regex to remove individual known chars is still possible to build dynamically, although it is believed that regex is slower. However, here is a way to build such a dynamic regex (where all you need is a character class):
var pattern = $"[{Regex.Escape(new string(chrs))}]+";
var result = Regex.Replace(input, pattern, string.Empty);
See another C# demo. The regex will look like [#,\.;']+ (matching one or more (+) consecutive occurrences of #, ,, ., ; or ' chars) where the dot does not have to be escaped, but Regex.Escape will be necessary to escape other chars that must be escaped, like \, ^, ] or - whose position inside the character class you cannot predict.
Here is a nice way to remove the invalid characters in a Filename:
string.Join(string.Empty, filename.Split(System.IO.Path.GetInvalidFileNameChars()));
Lots of good answers here. Here's my addition, along with several unit tests that can be used to help test correctness. My solution is similar to @Rianne's above but uses an ISet to provide O(1) lookup time on the replacement characters (it is also similar to @Albin Sunnanbo's LINQ solution).
using System;
using System.Collections.Generic;
using System.Linq;
/// <summary>
/// Returns a string with the specified characters removed.
/// </summary>
/// <param name="source">The string to filter.</param>
/// <param name="removeCharacters">The characters to remove.</param>
/// <returns>A new <see cref="System.String"/> with the specified characters removed.</returns>
public static string Remove(this string source, IEnumerable<char> removeCharacters)
{
if (source == null)
{
throw new ArgumentNullException("source");
}
if (removeCharacters == null)
{
throw new ArgumentNullException("removeCharacters");
}
// First see if we were given a collection that supports ISet
ISet<char> replaceChars = removeCharacters as ISet<char>;
if (replaceChars == null)
{
replaceChars = new HashSet<char>(removeCharacters);
}
IEnumerable<char> filtered = source.Where(currentChar => !replaceChars.Contains(currentChar));
return new string(filtered.ToArray());
}
NUnit (2.6+) tests here
using System;
using System.Collections;
using System.Collections.Generic;
using NUnit.Framework;
[TestFixture]
public class StringExtensionMethodsTests
{
[TestCaseSource(typeof(StringExtensionMethodsTests_Remove_Tests))]
public void Remove(string targetString, IEnumerable<char> removeCharacters, string expected)
{
string actual = StringExtensionMethods.Remove(targetString, removeCharacters);
Assert.That(actual, Is.EqualTo(expected));
}
[TestCaseSource(typeof(StringExtensionMethodsTests_Remove_ParameterValidation_Tests))]
public void Remove_ParameterValidation(string targetString, IEnumerable<char> removeCharacters)
{
Assert.Throws<ArgumentNullException>(() => StringExtensionMethods.Remove(targetString, removeCharacters));
}
}
internal class StringExtensionMethodsTests_Remove_Tests : IEnumerable
{
public IEnumerator GetEnumerator()
{
yield return new TestCaseData("My name #is ,Wan.;'; Wan", new char[] { '#', ',', '.', ';', '\'' }, "My name is Wan Wan").SetName("StringUsingCharArray");
yield return new TestCaseData("My name #is ,Wan.;'; Wan", new HashSet<char> { '#', ',', '.', ';', '\'' }, "My name is Wan Wan").SetName("StringUsingISetCollection");
yield return new TestCaseData(string.Empty, new char[1], string.Empty).SetName("EmptyStringNoReplacementCharactersYieldsEmptyString");
yield return new TestCaseData(string.Empty, new char[] { 'A', 'B', 'C' }, string.Empty).SetName("EmptyStringReplacementCharsYieldsEmptyString");
yield return new TestCaseData("No replacement characters", new char[1], "No replacement characters").SetName("StringNoReplacementCharactersYieldsString");
yield return new TestCaseData("No characters will be replaced", new char[] { 'Z' }, "No characters will be replaced").SetName("StringNonExistantReplacementCharactersYieldsString");
yield return new TestCaseData("AaBbCc", new char[] { 'a', 'C' }, "ABbc").SetName("CaseSensitivityReplacements");
yield return new TestCaseData("ABC", new char[] { 'A', 'B', 'C' }, string.Empty).SetName("AllCharactersRemoved");
yield return new TestCaseData("AABBBBBBCC", new char[] { 'A', 'B', 'C' }, string.Empty).SetName("AllCharactersRemovedMultiple");
yield return new TestCaseData("Test That They Didn't Attempt To Use .Except() which returns distinct characters", new char[] { '(', ')' }, "Test That They Didn't Attempt To Use .Except which returns distinct characters").SetName("ValidateTheStringIsNotJustDistinctCharacters");
}
}
internal class StringExtensionMethodsTests_Remove_ParameterValidation_Tests : IEnumerable
{
public IEnumerator GetEnumerator()
{
yield return new TestCaseData(null, null);
yield return new TestCaseData("valid string", null);
yield return new TestCaseData(null, new char[1]);
}
}
It's a powerful method I usually use in the same case:
private string Normalize(string text)
{
return string.Join("",
from ch in text
where char.IsLetterOrDigit(ch) || char.IsWhiteSpace(ch)
select ch);
}
Enjoy...
Old School in place copy/stomp:
private static string RemoveDirtyCharsFromString(string in_string)
{
int index = 0;
int removed = 0;
byte[] in_array = Encoding.UTF8.GetBytes(in_string);
foreach (byte element in in_array)
{
if ((element == ' ') ||
(element == '-') ||
(element == ':'))
{
removed++;
}
else
{
in_array[index] = element;
index++;
}
}
Array.Resize<byte>(ref in_array, (in_array.Length - removed));
return(System.Text.Encoding.UTF8.GetString(in_array, 0, in_array.Length));
}
Not sure about the efficiency w.r.t. other methods (i.e. the overhead of all the function calls and instantiations that happen as a side effect in C# execution).
I made it an extension method that takes a string array. I think string[] is more useful than char[] because a char can also be passed as a string:
public static class Helper
{
public static string RemoveStrs(this string str, string[] removeStrs)
{
foreach (var removeStr in removeStrs)
str = str.Replace(removeStr, "");
return str;
}
}
Then you can use it anywhere:
string myname = "My name #is ,Wan.;'; Wan";
string result = myname.RemoveStrs(new[]{ "#", ",", ".", ";", "'"});
I needed to remove special characters from an XML file. Here's how I did it. char.ToString() is the hero in this code.
string item = "<item type=\"line\" />";
char DC4 = (char)0x14;
// "fixed" is a reserved keyword in C#, so the result needs a different name
string cleaned = item.Replace(DC4.ToString(), string.Empty);
new[] { ',', '.', ';', '\'', '#' }
.Aggregate("My name #is ,Wan.;'; Wan", (s, c) => s.Replace(c.ToString(), string.Empty));
If you want to remove all the spaces and special characters
var input = Console.ReadLine();
foreach (var item in input)
{
var limit = ((int)item);
if (limit>=65 && limit<=90 || limit>=97 && limit<= 122)
{
Console.Write(item);
}
}

How to find 1 in my string but ignore -1 C#

I have a string
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
I want to find all the 1's in my string but not the -1's. So in my string there is only one 1. I use string.Contains("1") but this will find two 1's. So how do I do this?
You can use regular expression:
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
// if at least one "1", but not "-1"
if (Regex.IsMatch(test1, "(?<!-)1")) {
...
}
The pattern matches a 1 that is not preceded by -. To find all the 1s:
var matches = Regex
.Matches(test1, "(?<!-)1")
.OfType<Match>()
.ToArray(); // if you want an array
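One caveat: (?<!-)1 also matches the 1 inside values such as 12 or 21. If the values are whole numbers on separate lines (an assumption about the input, not something the question states), a stricter pattern could look like this sketch. OnesFinder and FindIndexes are made-up names.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class OnesFinder
{
    // A "1" that is not preceded by '-' or a digit, and not followed by a digit.
    private static readonly Regex StandaloneOne = new Regex(@"(?<![-\d])1(?!\d)");

    // Returns the position of every standalone "1" in the input.
    public static int[] FindIndexes(string input)
    {
        return StandaloneOne.Matches(input)
                            .Cast<Match>()
                            .Select(m => m.Index)
                            .ToArray();
    }
}
```

For the string in the question this should find a single match, the last value.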
Try this simple solution:
Note: you can easily convert this to an extension method.
static List<int> FindIndexSpecial(string search, char find, char ignoreIfPreceededBy)
{
// Map each Character with its Index in the String
var characterIndexMapping = search.Select((x, y) => new { character = x, index = y }).ToList();
// Check the Indexes of the excluded Character
var excludeIndexes = characterIndexMapping.Where(x => x.character == ignoreIfPreceededBy).Select(x => x.index).ToList();
// Return only Indexes who match the 'find' and are not preceeded by the excluded character
return (from t in characterIndexMapping
where t.character == find && !excludeIndexes.Contains(t.index - 1)
select t.index).ToList();
}
Usage :
static void Main(string[] args)
{
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
var matches = FindIndexSpecial(test1, '1', '-');
foreach (int index in matches)
{
Console.WriteLine(index);
}
Console.ReadKey();
}
You could use String.Split and Enumerable.Contains or Enumerable.Where:
string[] lines = test1.Split(new[] {Environment.NewLine, "\r"}, StringSplitOptions.RemoveEmptyEntries);
bool contains1 = lines.Contains("1");
string[] allOnes = lines.Where(l => l == "1").ToArray();
String.Contains searches for sub-strings in a given string instance. Enumerable.Contains looks if there's at least one string in the string[] which equals it.

How to ignore the punctuation c#

I want to ignore the punctuation. I'm trying to make a program that counts all the appearances of every word in my text without taking the punctuation marks into consideration.
So my program is:
static void Main(string[] args)
{
string text = "This my world. World, world,THIS WORLD ! Is this - the world .";
IDictionary<string, int> wordsCount =
new SortedDictionary<string, int>();
text=text.ToLower();
text = text.replaceAll("[^0-9a-zA-Z\text]", "X");
string[] words = text.Split(' ',',','-','!','.');
foreach (string word in words)
{
int count = 1;
if (wordsCount.ContainsKey(word))
count = wordsCount[word] + 1;
wordsCount[word] = count;
}
var items = from pair in wordsCount
orderby pair.Value ascending
select pair;
foreach (var p in items)
{
Console.WriteLine("{0} -> {1}", p.Key, p.Value);
}
}
The output is:
is->1
my->1
the->1
this->3
world->5
(here is nothing) -> 8
How can I remove the punctuation here?
You should try specifying StringSplitOptions.RemoveEmptyEntries:
string[] words = text.Split(" ,-!.".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Note that instead of manually creating a char[] with all the punctuation characters, you may create a string and call ToCharArray() to get the array of characters.
I find it easier to read and to modify later on.
string[] words = text.Split(new char[]{' ',',','-','!','.'}, StringSplitOptions.RemoveEmptyEntries);
It is simple - first step is to remove undesired punctuation with function Replace and then continue with splitting as you have it.
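A minimal sketch of that two-step idea, assuming the punctuation set from the question's Split call. WordCounter and Count are hypothetical names. Note that it replaces each mark with a space rather than an empty string, so that "world,THIS" does not collapse into one word.

```csharp
using System;
using System.Collections.Generic;

public static class WordCounter
{
    public static SortedDictionary<string, int> Count(string text)
    {
        // Step 1: turn every punctuation mark into a space.
        foreach (char mark in new[] { ',', '-', '!', '.' })
        {
            text = text.Replace(mark, ' ');
        }

        // Step 2: split on spaces, dropping the empty entries between them.
        var words = text.ToLower().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        var counts = new SortedDictionary<string, int>();
        foreach (string word in words)
        {
            int count;
            counts.TryGetValue(word, out count); // count stays 0 for a new word
            counts[word] = count + 1;
        }
        return counts;
    }
}
```

Without RemoveEmptyEntries the empty strings between consecutive separators would still be counted, which is what produced the blank key in the question's output.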
... you can go with the making people cry version ...
"This my world. World, world,THIS WORLD ! Is this - the world ."
.ToLower()
.Split(" ,-!.".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
.GroupBy(i => i)
.Select(i=>new{Word=i.Key, Count = i.Count()})
.OrderBy(k => k.Count)
.ToList()
.ForEach(Console.WriteLine);
.. output
{ Word = my, Count = 1 }
{ Word = is, Count = 1 }
{ Word = the, Count = 1 }
{ Word = this, Count = 3 }
{ Word = world, Count = 5 }

C# string operation. get file name substring

myfinename_slice_1.tif
myfilename_slice_2.tif
...
...
myfilename_slice_15.tif
...
...
myfilename_slice_210.tif
In C#, how can I get file index, like "1", "2", "15", "210" using string operations?
You have some options:
Regular expressions with the Regex class;
String.Split.
Most important is what are the assumptions you can make about the format of the file name.
For example if it's always at the end of the file name, without counting the extension, and after an underscore you can do:
var id = Path.GetFileNameWithoutExtension("myfinename_slice_1.tif")
.Split('_')
.Last();
Console.WriteLine(id);
If for example you can assume that the identifier is guaranteed to appear in the filename and the characters [0-9] are only allowed to appear in the filename as part of the identifier, you can just do:
var id = Regex.Match("myfinename_slice_1.tif", @"\d+").Value;
Console.WriteLine(id);
There are probably more ways to do this, but the most important thing is to assert which assumptions you can make and then code an implementation based on them.
This looks like a job for regular expressions. First define the pattern as a regular expression:
.*?_(?<index>\d+)\.tif
Then get a match against your string. The group named index will contain the digits:
var idx = Regex.Match(filename, @".*?_(?<index>\d+)\.tif").Groups["index"].Value;
You can use the regex "(?<digits>\d+)\.[^\.]+$", and if it's a match the string you're looking for is in the group named "digits"
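As a rough illustration of that pattern in use (FileIndex and Get are names invented for this sketch):

```csharp
using System.Text.RegularExpressions;

public static class FileIndex
{
    // Digits immediately before the final extension, e.g. "210" in "x_210.tif".
    private static readonly Regex TrailingDigits = new Regex(@"(?<digits>\d+)\.[^\.]+$");

    // Returns the digits as a string, or null when the name does not match.
    public static string Get(string fileName)
    {
        Match match = TrailingDigits.Match(fileName);
        return match.Success ? match.Groups["digits"].Value : null;
    }
}
```

Checking match.Success first avoids silently getting an empty string for names that carry no index.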
Here is the method which will handle that:
public int GetFileIndex(string argFilename)
{
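int start = argFilename.LastIndexOf("_") + 1;
// Substring takes a length as its second argument, not an end index
int length = argFilename.LastIndexOf(".") - start;
return Int32.Parse(argFilename.Substring(start, length));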
}
Enjoy
filename.Split('_')[2].Split('.')[0]
public class UnitTest1
{
[TestMethod]
public void TestMethod1()
{
var s1 = "myfinename_slice_1.tif";
var s2 = "myfilename_slice_2.tif";
var s3 = "myfilename_slice_15.tif";
var s4 = "myfilename_slice_210.tif";
var s5 = "myfilena44me_slice_210.tif";
var s6 = "7myfilena44me_slice_210.tif";
var s7 = "tif999";
Assert.AreEqual(1, EnumerateNumbers(s1).First());
Assert.AreEqual(2, EnumerateNumbers(s2).First());
Assert.AreEqual(15, EnumerateNumbers(s3).First());
Assert.AreEqual(210, EnumerateNumbers(s4).First());
Assert.AreEqual(210, EnumerateNumbers(s5).Skip(1).First());
Assert.AreEqual(210, EnumerateNumbers(s6).Skip(2).First());
Assert.AreEqual(44, EnumerateNumbers(s6).Skip(1).First());
Assert.AreEqual(999, EnumerateNumbers(s7).First());
}
static IEnumerable<int> EnumerateNumbers(string input)
{
var digits = new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
string result = string.Empty;
foreach (var c in input.ToCharArray())
{
if (!digits.Contains(c))
{
if (!string.IsNullOrEmpty(result))
{
yield return int.Parse(result);
result = string.Empty;
}
}
else
{
result += c;
}
}
if (result.Length > 0)
yield return int.Parse(result);
}
}

How can I split the string only once using C#

Example : a - b - c must be split as
a and b - c, instead of 3 substrings
Specify the maximum number of items that you want:
string[] splitted = text.Split(new string[]{" - "}, 2, StringSplitOptions.None);
string s = "a - b - c";
string[] parts = s.Split(new char[] { '-' }, 2);
// note, you'll still need to trim off any whitespace
"a-b-c".Split( new char[] { '-' }, 2 );
You could use IndexOf() to find the first instance of the character you want to split on, then Substring() to get the two parts. For example...
int pos = myString.IndexOf('-');
string first = myString.Substring(0, pos);
string second = myString.Substring(pos);
This is a rough example - you'll need to play with it if you don't want the separator character in there - but you should get the idea from this.
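If the separator should not end up in the second part, one possible variation is sketched below. SplitOnce and AtFirst are hypothetical names, and trimming both halves is an assumption based on the "a - b - c" example.

```csharp
using System;

public static class SplitOnce
{
    // Hypothetical helper: splits at the first separator and trims both halves.
    public static string[] AtFirst(string text, char separator)
    {
        int pos = text.IndexOf(separator);
        if (pos < 0) return new[] { text }; // separator not present

        string first = text.Substring(0, pos).Trim();
        string second = text.Substring(pos + 1).Trim(); // pos + 1 skips the separator
        return new[] { first, second };
    }
}
```

SplitOnce.AtFirst("a - b - c", '-') should return "a" and "b - c".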
string[] splitted = "a - b - c".Split(new char[]{' ', '-'}, 2, StringSplitOptions.RemoveEmptyEntries);
var str = "a-b-c";
int splitPos = str.IndexOf('-');
string[] split = { str.Remove(splitPos), str.Substring(splitPos + 1) };
I joined late, and many of the answers above already say the same thing: string has its own Split method.
You can use it to solve your problem. Here is an example based on your case:
using System;
public class Program
{
public static void Main()
{
var PrimaryString = "a - b - c";
var strPrimary = PrimaryString.Split( new char[] { '-' }, 2 );
Console.WriteLine("First:{0}, Second:{1}",strPrimary[0],strPrimary[1]);
}
}
Output:
First:a , Second: b - c
