C# string operation. get file name substring - c#

myfinename_slice_1.tif
myfilename_slice_2.tif
...
...
myfilename_slice_15.tif
...
...
myfilename_slice_210.tif
In C#, how can I get file index, like "1", "2", "15", "210" using string operations?

You have some options:
Regular expressions with the Regex class;
String.Split.
What matters most is which assumptions you can make about the format of the file name.
For example, if the index is always at the end of the file name (not counting the extension) and comes right after an underscore, you can do:
var id = Path.GetFileNameWithoutExtension("myfinename_slice_1.tif")
.Split('_')
.Last();
Console.WriteLine(id);
If for example you can assume that the identifier is guaranteed to appear in the filename and the characters [0-9] are only allowed to appear in the filename as part of the identifier, you can just do:
var id = Regex.Match("myfinename_slice_1.tif", @"\d+").Value;
Console.WriteLine(id);
There are probably more ways to do this, but the most important thing is to establish which assumptions you can make and then code an implementation based on them.

This looks like a job for regular expressions. First define the pattern as a regular expression:
.*?_(?<index>\d+)\.tif
Then get a match against your string. The group named index will contain the digits:
var idx = Regex.Match(filename, @".*?_(?<index>\d+)\.tif").Groups["index"].Value;

You can use the regex "(?<digits>\d+)\.[^\.]+$", and if it's a match the string you're looking for is in the group named "digits"
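A minimal sketch of how that pattern might be used, assuming the index is the run of digits immediately before the final extension. The names FileIndexParser and TryGetIndex are hypothetical and only here for illustration; the Match.Success check covers file names that carry no index at all.

```csharp
using System.Text.RegularExpressions;

public static class FileIndexParser
{
    // Pattern from the answer above: digits immediately before the final extension.
    private static readonly Regex IndexPattern = new Regex(@"(?<digits>\d+)\.[^\.]+$");

    // Returns true and sets index when the file name ends in "<digits>.<ext>".
    public static bool TryGetIndex(string fileName, out int index)
    {
        index = 0;
        Match match = IndexPattern.Match(fileName);
        // int.TryParse also rejects digit runs that do not fit in an int.
        return match.Success && int.TryParse(match.Groups["digits"].Value, out index);
    }
}
```

Calling FileIndexParser.TryGetIndex("myfilename_slice_210.tif", out var i) should return true with i set to 210, while a name such as "no_index.tif" returns false.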

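Here is a method that handles that:
public int GetFileIndex(string argFilename)
{
// Substring takes (startIndex, length), so compute the length up to the last dot.
int start = argFilename.LastIndexOf("_") + 1;
int length = argFilename.LastIndexOf(".") - start;
return Int32.Parse(argFilename.Substring(start, length));
}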
Enjoy

filename.Split('_')[2].Split('.')[0]

public class UnitTest1
{
[TestMethod]
public void TestMethod1()
{
var s1 = "myfinename_slice_1.tif";
var s2 = "myfilename_slice_2.tif";
var s3 = "myfilename_slice_15.tif";
var s4 = "myfilename_slice_210.tif";
var s5 = "myfilena44me_slice_210.tif";
var s6 = "7myfilena44me_slice_210.tif";
var s7 = "tif999";
Assert.AreEqual(1, EnumerateNumbers(s1).First());
Assert.AreEqual(2, EnumerateNumbers(s2).First());
Assert.AreEqual(15, EnumerateNumbers(s3).First());
Assert.AreEqual(210, EnumerateNumbers(s4).First());
Assert.AreEqual(210, EnumerateNumbers(s5).Skip(1).First());
Assert.AreEqual(210, EnumerateNumbers(s6).Skip(2).First());
Assert.AreEqual(44, EnumerateNumbers(s6).Skip(1).First());
Assert.AreEqual(999, EnumerateNumbers(s7).First());
}
static IEnumerable<int> EnumerateNumbers(string input)
{
var digits = new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
string result = string.Empty;
foreach (var c in input.ToCharArray())
{
if (!digits.Contains(c))
{
if (!string.IsNullOrEmpty(result))
{
yield return int.Parse(result);
result = string.Empty;
}
}
else
{
result += c;
}
}
if (result.Length > 0)
yield return int.Parse(result);
}
}

Related

How to separate string after whitespace in c#

I'm using C# and have a string like x="12 $Math A Level$" that could also be x="12 Math A Level".
How can I separate this string in order to have a variable year=12 and subject=Math A Level?
I was using something like:
char[] whitespace = new char[] { ' ', '\t' };
var x = item.Split(whitespace);
but then I didn't know what to do after or if there's a better way to do this.
You could use the overload of Split that takes a count:
var examples = new []{"2 $Math A Level$", "<some_num> <some text>"} ;
foreach(var s in examples)
{
var parts = s.Split(' ', count: 2, StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
Console.WriteLine($"'{parts[0]}', '{parts[1]}'");
}
This prints:
'2', '$Math A Level$'
'<some_num>', '<some text>'
You could do
var item = "12 Math A Level";
var index = item.IndexOf(' ');
var year = item.Substring(0, index);
var subject = item.Substring(index + 1, item.Length - index-1).Trim('$');
This assumes that the year is the first word, and the subject is everything else. It also assumes you are not interested in any '$' signs. You might also want to add a check that the index was actually found, in case there are no spaces in the string.
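The check mentioned above could look something like this. It is only a sketch under the same assumptions (the year is the first word, the separator is a single space character, '$' signs are unwanted); YearSubjectParser and TrySplit are made-up names.

```csharp
public static class YearSubjectParser
{
    // Splits "12 $Math A Level$" or "12 Math A Level" at the first space.
    // Returns false when there is no space to split on.
    public static bool TrySplit(string item, out string year, out string subject)
    {
        year = string.Empty;
        subject = string.Empty;
        if (string.IsNullOrWhiteSpace(item)) return false;

        string trimmed = item.Trim();
        int index = trimmed.IndexOf(' ');
        if (index < 0) return false; // IndexOf returns -1 when nothing is found

        year = trimmed.Substring(0, index);
        // Drop surrounding whitespace first, then any enclosing '$' signs.
        subject = trimmed.Substring(index + 1).Trim().Trim('$');
        return true;
    }
}
```

With this, an input like "12" simply yields false instead of throwing from Substring.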
To add a Regex-based answer:
using System;
using System.Text.RegularExpressions;
public class Program
{
public static readonly Regex regex = new Regex(@"(?<ID>[0-9]+)\s+[$]?(?<Text>[^$]*)[$]?", RegexOptions.Compiled);
public static void Main()
{
MatchCollection matches = regex.Matches("12 $Math A Level$");
foreach( Match m in matches )
{
Console.WriteLine($"{(m.Groups["ID"].Value)} | {(m.Groups["Text"].Value)}");
}
matches = regex.Matches("13 Math B Level");
foreach( Match m in matches )
{
Console.WriteLine($"{(m.Groups["ID"].Value)} | {(m.Groups["Text"].Value)}");
}
}
}
In action: https://dotnetfiddle.net/6XEQw8
Output:
12 | Math A Level
13 | Math B Level
To explain the expression:
(?<ID>[0-9]+)\s+[$]?(?<Text>[^$]*)[$]?
(?<ID>[0-9]+)  - Named capture group "ID"
  [0-9]        - Match literal chars '0' to '9'
  +            - ^^ One or more times
\s+            - Match whitespace one or more times
[$]?           - Match literal '$' one or zero times
(?<Text>[^$]*) - Named capture group "Text"
  [^$]         - Match anything that is _not_ literal '$'
  *            - ^^ Zero or more times
[$]?           - Match literal '$' one or zero times
See also https://regex101.com/r/WV366l/1
Mind: I personally would benchmark this solution against a (or several) non-regex solutions and then make a choice.
var x = "12 $Math A Level$".Split('$', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
string year = x[0];
string subject = x[1];
Console.WriteLine(year);
Console.WriteLine(subject);
If you can rely on the string format specified ("12 $Math A Level$"), you could split at $ like this:
using System;
public class Program
{
public static void Main()
{
var sample = "12 $Math A Level$";
var rec = Parse(sample);
Console.WriteLine($"Year={rec.Year}\nSubject={rec.Subject}");
}
private static Record Parse(string value)
{
var delimiter = new char[] { '$' };
var parts = value.Split(delimiter, StringSplitOptions.RemoveEmptyEntries);
return new Record { Year = Convert.ToInt32(parts[0]), Subject = parts[1] };
}
public class Record
{
public int Year { get; set; }
public string Subject { get; set; }
}
}
Output:
Year=12
Subject=Math A Level
▶️ Try it out here: https://dotnetfiddle.net/DAFLjA

C# Split text into Substrings

What I'm actually trying to do is split the result of StreamReader.ReadLine(), such as "1 A & B 2 C & D", into the substrings "1", "A & B", "2" and "C & D". Does anybody have an idea for a simple algorithm to implement this splitting?
Something like this (using a tiny bit of LINQ)?
static private List<string> Parse(string s)
{
var result = new List<string>();
string[] rawTextParts = s.Split(new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' });
var textParts = rawTextParts.Where(t => !string.IsNullOrWhiteSpace(t)).Select(t => t.Trim());
foreach (string textPart in textParts)
{
string numberstring = s.Substring(0, s.IndexOf(textPart)).Trim();
s = s.Substring(s.IndexOf(textPart) + textPart.Length);
result.Add(numberstring);
result.Add(textPart);
}
return result;
}
Regex is made for pattern matching. There are two patterns here: a letter, a non-word character and another letter (such as "A & B"), or a run of digits. Here is a regex that matches either one:
var input = "1 A & B 2 C & D";
var pattern = @"[a-zA-Z]\s+\W\s+[a-zA-Z]|\d+";
var resultItems =
Regex.Matches(input, pattern)
.OfType<Match>()
.Select(m => m.Value)
.ToList();
The result is the four items "1", "A & B", "2" and "C & D".
The \s+ was not mentioned above: it matches one or more whitespace characters, which covers the spaces in something like "A & B". If you believe there may be no spaces, as in "A&B", use \s* instead, which matches zero or more whitespace characters.
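For illustration, here is a sketch of the \s* variant wrapped in a helper. PairSplitter and Split are hypothetical names. One thing to keep in mind is that \W also matches whitespace, so with \s* an input such as "A B" would be treated as a pair too.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PairSplitter
{
    // \s* instead of \s+ lets "A&B" match as well as "A & B".
    private const string Pattern = @"[a-zA-Z]\s*\W\s*[a-zA-Z]|\d+";

    // Returns every number and every letter pair, in the order they appear.
    public static List<string> Split(string input)
    {
        return Regex.Matches(input, Pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }
}
```

For "1 A&B 2 C & D" this should give "1", "A&B", "2" and "C & D".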
It's hard to infer precise requirements from your question, but going by your example I'd come up with something like:
void Main()
{
var input = "1 A & B 2 C & D";
var result = Parse(input);
Console.WriteLine(String.Join("\n", result));
}
static IEnumerable<string> Parse(string input)
{
var words = input.Split();
var builder = new StringBuilder();
foreach (var word in words)
{
if (int.TryParse(word, out var value))
{
if (builder.Length > 0)
{
yield return builder.ToString();
builder.Clear();
}
yield return word;
}
else
{
if (builder.Length > 0)
{
builder.Append(' ');
}
builder.Append(word);
}
}
if (builder.Length > 0) // leftovers
{
yield return builder.ToString();
}
}
The output of the above code will be:
1
A & B
2
C & D

How to find 1 in my string but ignore -1 C#

I have a string
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
I want to find all the 1's in my string but not the -1's, so in my string there is only one 1. I use string.Contains("1"), but this finds two 1's. How do I do this?
You can use regular expression:
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
// if at least one "1", but not "-1"
if (Regex.IsMatch(test1, "(?<!-)1")) {
...
}
The pattern matches exactly a 1 that is not preceded by -. To find all the 1s:
var matches = Regex
.Matches(test1, "(?<!-)1")
.OfType<Match>()
.ToArray(); // if you want an array
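Note that (?<!-)1 also matches a 1 that is part of a longer number, such as the 1 in 21 or 10. If only the standalone value 1 is wanted, one possible variation is to add a digit check on both sides. This is a sketch rather than part of the original answer, and StandaloneOneFinder and FindIndexes are hypothetical names.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class StandaloneOneFinder
{
    // (?<![-\d]) : not preceded by '-' or a digit
    // (?!\d)     : not followed by a digit
    private static readonly Regex One = new Regex(@"(?<![-\d])1(?!\d)");

    // Returns the start index of every standalone "1" in the text.
    public static int[] FindIndexes(string text)
    {
        return One.Matches(text).Cast<Match>().Select(m => m.Index).ToArray();
    }
}
```

For the test1 string from the question this finds a single match, the final 1.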
Try this simple solution:
Note: you can easily convert this to an extension method.
static List<int> FindIndexSpecial(string search, char find, char ignoreIfPrecededBy)
{
// Pair each character with its index in the string
var characterIndexMapping = search.Select((x, y) => new { character = x, index = y }).ToList();
// Collect the indexes of the excluded character
var excludeIndexes = characterIndexMapping.Where(x => x.character == ignoreIfPrecededBy).Select(x => x.index).ToList();
// Return only the indexes that match 'find' and are not preceded by the excluded character
return (from t in characterIndexMapping
where t.character == find && !excludeIndexes.Contains(t.index - 1)
select t.index).ToList();
}
Usage :
static void Main(string[] args)
{
string test1 = "255\r\n\r\n0\r\n\r\n-1\r\n\r\n255\r\n\r\n1\r";
var matches = FindIndexSpecial(test1, '1', '-');
foreach (int index in matches)
{
Console.WriteLine(index);
}
Console.ReadKey();
}
You could use String.Split and Enumerable.Contains or Enumerable.Where:
string[] lines = test1.Split(new[] {Environment.NewLine, "\r"}, StringSplitOptions.RemoveEmptyEntries);
bool contains1 = lines.Contains("1");
string[] allOnes = lines.Where(l => l == "1").ToArray();
String.Contains searches for substrings within a given string instance. Enumerable.Contains checks whether at least one string in the string[] equals the given value.

How to check a String for characters NOT to be included in C#

So I have a String like:
String myString = "AAAaAAA";
I want to check the String if it contains ANY characters that are not "A"
How can I do this? my previous code is:
Regex myChecker = new Regex("[^A.$]$");
if (myChecker.IsMatch(myString))
{
//Do some Stuff
}
Is there any other way to do it? The code above does not detect the small a. But when I use a different String with only characters that are not "A" it works. Thank you!
String myString = "AAAaAAA";
if(myString.Any(x => x != 'A')) {
// Yep, contains some non-'A' character
}
Try something like this:
var allowedChars = new List<char>() { 'a', 'b', 'c' };
var myString = "abcA";
var result = myString.Any(c => !allowedChars.Contains(c));
if (result) {
// myString contains something not in allowed chars
}
or even like this:
if (myString.Except(allowedChars).Any()) {
// ...
}
allowedChars can be any IEnumerable<char>.
"I want to check the String if it contains ANY characters that are not "A""
You can use Enumerable.Any like this:
string myString = "AAAaAAA";
bool b = myString.Any(s => !s.Equals('A')); // True
You can use Linq:
String myString = "AAAaAAA";
var result = myString.Where(x=>x != 'A'); // return all character that are not A
if(result.Count() > 0)
{
Console.WriteLine("Characters other than A exist");
}
If you want to accept both cases:
String myString = "AAAaAAA";
var result = myString.Where(x => x != 'A' && x != 'a'); // with || the condition would be true for every character
or Use String.Equals():
var result = myString.Where(x => !String.Equals(x.ToString(), "A", StringComparison.OrdinalIgnoreCase));
Your regular expression is only trying to match the last character. This should work:
var myString = "AAaA";
bool anyNotAs = Regex.IsMatch(myString, "[^A]", RegexOptions.None);
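To spell out why the original pattern fails: in "[^A.$]$" the character class excludes 'A', '.' and '$' as literal characters, and the trailing $ anchors the match to the end of the string, so only the final character is ever tested. A small sketch of the corrected check, plus a variant that also accepts lowercase 'a' (OnlyAChecker and its method names are hypothetical):

```csharp
using System.Text.RegularExpressions;

public static class OnlyAChecker
{
    // True when the text contains any character other than 'A'.
    public static bool HasNonA(string text)
    {
        return Regex.IsMatch(text, "[^A]");
    }

    // Same check, but treats 'a' as acceptable too.
    public static bool HasNonAIgnoringCase(string text)
    {
        return Regex.IsMatch(text, "[^Aa]");
    }
}
```

For "AAAaAAA", HasNonA returns true because of the lowercase a, and HasNonAIgnoringCase returns false.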

Separating numbers from other signs in a string

I got a string that contains:
"(" ")" "&&" "||"
and numbers (0 to 99999).
I want to get a string and return a list like this:
get:
"(54&&1)||15"
return new List<string>(){
"(",
"54",
"&&",
"1",
")",
"||",
"15"}
I suspect a regex would do the trick here. Something like:
string text = "(54&&1)||15";
Regex pattern = new Regex(@"\(|\)|&&|\|\||\d+");
Match match = pattern.Match(text);
while (match.Success)
{
Console.WriteLine(match.Value);
match = match.NextMatch();
}
The tricky bit in the above is that a lot of stuff needs escaping. The | is the alternation operator, so this is "open bracket or close bracket or && or || or at least one digit".
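Since the question asks for a List<string>, here is a sketch that collects the matches into a list instead of printing them. It also adds a check that is not in the answer above: it throws if the input contains anything the pattern does not cover. ExpressionTokenizer and Tokenize are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ExpressionTokenizer
{
    // Same alternation as above: ( or ) or && or || or a run of digits.
    private static readonly Regex Token = new Regex(@"\(|\)|&&|\|\||\d+");

    // Returns the tokens in order; throws if any character is not part of a token.
    public static List<string> Tokenize(string text)
    {
        List<string> tokens = Token.Matches(text)
                                   .Cast<Match>()
                                   .Select(m => m.Value)
                                   .ToList();

        // Matches never overlap, so if the token lengths add up to the
        // input length, every character was consumed.
        if (tokens.Sum(t => t.Length) != text.Length)
            throw new FormatException("Unexpected characters in: " + text);

        return tokens;
    }
}
```

Tokenize("(54&&1)||15") should return the seven items listed in the question, while something like "(5&1)" throws because the single & is not a known token.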
If you only want to extract the numbers from your string, you can use a regex.
But if you want to parse the string as a formula and calculate its result, you should look at a math expression parser,
for example Math Parser.
Here's the LINQ/Lambda way to do it:
var operators = new [] { "(", ")", "&&", "||", };
Func<string, IEnumerable<string>> operatorSplit = t =>
{
Func<string, string, IEnumerable<string>> inner = null;
inner = (p, x) =>
{
if (x.Length == 0)
{
return new [] { p, };
}
else
{
var op = operators.FirstOrDefault(o => x.StartsWith(o));
if (op != null)
{
return (new [] { p, op }).Concat(inner("", x.Substring(op.Length)));
}
else
{
return inner(p + x.Substring(0, 1), x.Substring(1));
}
}
};
return inner("", t).Where(x => !String.IsNullOrEmpty(x));
};
Now you just call this:
var list = operatorSplit("(54&&1)||15").ToList();
Enjoy!
