Filter text between two special characters - c#

For example I have those strings:
"qwe/qwe/qwe/qwe//qwe/somethinghere_blabla.exe"
"qwe/qwe/q//we/qwe//qwe/somethingother_here_blabla.exe"
"qwe/qwe/qwe/qwe//qwe/some_numbers_here_blabla.exe"
Now I want to get the text between the last '/' and the last '_'.
So the outcome would be:
"somethinghere"
"somethingother_here"
"some_numbers_here"
What is the easiest and clearest way to do this?
I have no idea how to do this, should I split them in '/' and '_', so do this apart? I couldn't think of any way how to do it.
Maybe scan the string from the end till it reaches its first '/' and '_'? Or is there an easier and faster way? Because it has to scan ~10.000 strings.
string[] words = line.Split('/', '_'); //maybe use this? probably not
Thanks in advance!

string s = "qwe/qwe/q//we/qwe//qwe/somethingother_here_blabla.exe";
int last_ = s.LastIndexOf('_');
if (last_ < 0) // _ not found, take the tail of string
last_ = s.Length;
int lastSlash = s.LastIndexOf('/');
string part = s.Substring(lastSlash + 1, last_ - lastSlash - 1);

Well, there is string.LastIndexOf:
var start = line.LastIndexOf('/') + 1;
var end = line.LastIndexOf('_');
var result = line.Substring(start, end - start);

The LINQ way:
var str = "qwe/qwe/qwe/qwe//qwe/somethinghere_blabla.exe";
var newStr = new string(str.Reverse().SkipWhile(c => c != '_').Skip(1).TakeWhile(c => c != '/').Reverse().ToArray());

var string = "qwe/qwe/qwe/qwe//qwe/some_numbers_here_blabla.exe";
var start = string.lastIndexOf("/");
var end = string.lastIndexOf("_");
var result = string.substring(start + 1, end);
Note: this snippet is JavaScript, where substring takes an end index rather than a length. It does not handle a string that has no '/' or has no '_' after the last slash.
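For reference, the LastIndexOf answers above can be folded into one guarded C# helper. This is only a sketch: the names PathText and BetweenLastSlashAndLastUnderscore are made up for illustration, and returning the whole tail when no '_' follows the last '/' is an assumption about the desired behaviour, not something the question specifies.

```csharp
using System;

public static class PathText
{
    // Returns the text between the last '/' and the last '_' that follows it.
    // Assumption: if no '_' follows the last '/', the rest of the string is returned.
    public static string BetweenLastSlashAndLastUnderscore(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        int start = s.LastIndexOf('/') + 1;   // 0 when there is no '/'
        int end = s.LastIndexOf('_');         // -1 when there is no '_'
        if (end < start) end = s.Length;      // no '_' after the last '/'
        return s.Substring(start, end - start);
    }
}
```

Both lookups are single backward scans, so running this over roughly 10,000 strings should not be a concern.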

Related

remove all characters after X character

Please look at the value of the variable "mystr", which has a "-" sign between two numbers. I want to find the "-" and remove every character after it; then I want to find the same "-" and remove every character from the start up to and including it. I know it's simple, but I'm new to C# and can't work out the exact solution.
public void test()
{
string mystr = "1.30-50.50";
//first output I want is- "1.30"
//second output I want is- "50.50"
}
Use string.Split method:
var mystr = "1.30-50.50";
var result = mystr.Split('-');
var a = result[0]; //"1.30"
var b = result[1]; //"50.50"
You can also use the String.IndexOf method:
string mystr = "1.30-50.50";
int indexOfDash = mystr.IndexOf('-');
string firstResult = mystr.Substring(0, indexOfDash);
string secondResult = mystr.Substring(indexOfDash + 1, mystr.Length - indexOfDash - 1);
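Neither answer above says what happens when the string has no "-" (Split then returns a single element, and Substring throws on index -1). A minimal sketch of a guarded version follows; the names DashSplit and AtFirstDash are hypothetical, and returning an empty second part when no dash is present is an assumption.

```csharp
using System;

public static class DashSplit
{
    // Splits at the first '-' only.
    // Assumption: when there is no '-', the whole input becomes the first part
    // and the second part is empty.
    public static (string Before, string After) AtFirstDash(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        int i = s.IndexOf('-');
        if (i < 0) return (s, string.Empty);
        return (s.Substring(0, i), s.Substring(i + 1));
    }
}
```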

C# Removing all extra occurrences BEYOND the FIRST in string

So, I have some code that works the way I want it to, but I am wondering if there is a better way to do this with a regex? I have played with a few regex but with no luck(And I know I need to get better with regex stuff).
This code is purely designed to remove any extra spaces or non email valid characters. Then it goes through and removes extra @ symbols beyond the first.
List<string> second_pass = new List<string>();
string final_pass = "";
if (email_input.Text.Length > 0)
{
string first_pass = Regex.Replace(email_input.Text, @"[^\w\.@-]", "");
if (first_pass.Contains("@"))
{
second_pass = first_pass.Split('@').Select(sValue => sValue.Trim()).ToList();
string third_pass = second_pass[0] + "@" + second_pass[1];
second_pass.Remove(second_pass[0]);
second_pass.Remove(second_pass[1]);
if (second_pass.Count > 0)
{
final_pass = third_pass + string.Join("", second_pass.ToArray());
}
}
email_output.Text = final_pass;
}
If you can get by by replacing only the captured groups, then this should be able to work.
([^\w\.\@\-])|(?<=\@).*?(\@)
Going by your description and not the code:
var final_pass = email_input.Text;
var atPos = final_pass.IndexOf('@');
if (atPos++ >= 0)
final_pass = final_pass.Substring(0, atPos) + Regex.Replace(final_pass.Substring(atPos), "[@ ]", "");
For an (almost) pure regex solution, using a state cheat, this seems to work:
var first = 0;
final_pass = Regex.Replace(final_pass, "(^.+?@)?([^ @]+?)?[@ ]", m => (first++ == 0) ? m.Groups[1].Value+m.Groups[2].Value : m.Groups[2].Value);
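The "keep the first occurrence, drop every later one" step described in this question can also be done without a regex. The sketch below is one possible way, not the asker's method: FirstOccurrence and KeepFirst are hypothetical names, and the symbol to deduplicate is passed in as a parameter rather than hard-coded.

```csharp
using System;
using System.Text;

public static class FirstOccurrence
{
    // Keeps the first occurrence of 'symbol' and removes every later one.
    // All other characters are copied through unchanged.
    public static string KeepFirst(string input, char symbol)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        var sb = new StringBuilder(input.Length);
        bool seen = false;
        foreach (char c in input)
        {
            if (c == symbol)
            {
                if (seen) continue;   // skip repeats after the first
                seen = true;
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Stripping invalid characters would still be a separate first pass, as in the question's own Regex.Replace call.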

C# split string of first character occurrence

I thought this was simple but this is just kicking my butt.
I have the string "21. A.Person" and simply want to get "A.Person" out of it.
I tried the following, but I only get 21:
string[] pName = values[i, j].ToString().Split(new char[] { '.' }, 2);
pName[1] ???
values[i, j].ToString() = 21. A.Person and yes I've verified this.
Try this:
var substr = "";
var index = yourString.IndexOf('.');
if (index > -1)
substr = yourString.Substring(index + 1); // start just after the first '.'
substr = substr.Trim();
For the string "21. A.Person" this should return "A.Person".
Everyone is giving you alternate solutions when yours should work.
The problem is that values[i, j] must not equal 21. A.Person
I plugged it into a simple test..
[Test]
public void junk()
{
string[] pName = "21. A.Person".Split(new char[] { '.' }, 2);
Console.WriteLine(pName[1]);
}
What does it print?
A.Person
(With the space in the front, because you didn't trim the space)
I would use Substring() with the position just after the first '.' as your start point, then trim the leading space:
var name = sourceString.Substring(sourceString.IndexOf('.') + 1).Trim();
string pName = values[i, j].ToString().Substring(values[i, j].ToString().IndexOf('.')+1);
Try something like that:
var str = "21. A.Person";
var index = str.IndexOf('.') +1;
var substr = str.Substring(index, str.Length - index);
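Several of the snippets above assume the string always contains a '.', and most leave the leading space in place. A small sketch that handles both follows; AfterDot and AfterFirstDot are hypothetical names, and returning the trimmed input when there is no '.' is an assumption.

```csharp
using System;

public static class AfterDot
{
    // Returns everything after the first '.', with surrounding whitespace trimmed.
    // Assumption: when there is no '.', the trimmed input is returned unchanged.
    public static string AfterFirstDot(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        int i = s.IndexOf('.');
        return i < 0 ? s.Trim() : s.Substring(i + 1).Trim();
    }
}
```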

Format string with regex in c#

I would like to format a string that looks like this
BPT4SH9R0XJ6
Into something that looks like this
BPT4-SH9R-0XJ6
The string will always be a mix of 12 letters and numbers
Any advice will be highly appreciated, thanks
Try Regex.Replace(input, @"(\w{4})(\w{4})(\w{4})", @"$1-$2-$3");
Regex is often derided, but is a pretty neat way of doing what you need. Can be extended to more complex requirements that are difficult to meet using string methods.
You can use "(.{4})(.{4})(.{4})" as your expression and "$1-$2-$3" as your replacement. This is, however, hardly a good use for regexp: you can do it much easier with Substring.
var res = s.Substring(0,4)+"-"+s.Substring(4,4)+"-"+s.Substring(8);
If the rule is to always split into three blocks of four characters, there is no need for a regex:
str.Substring(0,4) + "-" + str.Substring(4,4) + "-" + str.Substring(8,4)
It would seem that a combination of String.Concat and string.Substring should take care of everything that you need.
var str = "BPT4SH9R0XJ6";
var newStr = str.Substring(0, 4) + "-" + str.Substring(4, 4) + "-" + str.Substring(8, 4);
Any reason you want to do a regex? you could just insert hyphens:
string s = "BPT4SH9R0XJ6";
for(int i = 4; i < s.Length; i = i+5)
s = s.Insert(i, "-");
This would keep adding hyphens every 4 characters, would not error out if string was too short/long/etc.
return original_string.Substring(0,4)+"-"+original_string.Substring(4,4)+"-"+original_string.Substring(8,4);
string str = @"BPT4SH9R0XJ6";
string formattedString = string.Format("{0}-{1}-{2}", str.Substring(0, 4), str.Substring(4,4), str.Substring(8,4));
This works with any length of string:
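// Compute the hyphen count once; the string grows inside the loop, so
// re-evaluating the bound each iteration would append a trailing hyphen.
int hyphens = (int)Math.Floor((myString.Length - 1) / 4d);
for (int i = 0; i < hyphens; i++)
{
myString = myString.Insert((i + 1) * 4 + i, "-");
}
Ended up using this: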
var original = "BPT4SH9R0XJ6".ToCharArray();
var first = new string(original, 0, 4);
var second = new string(original, 4, 4);
var third = new string(original, 8, 4);
var mystring = string.Concat(first, "-", second, "-", third);
Thanks
If you are guaranteed the text you're operating on is the 12 character code then why don't you just use substring? Why do you need the Regex?
String theString = "AB12CD34EF56";
String theNewString = theString.Substring(0, 4) + "-" + theString.Substring(4, 4) + "-" + theString.Substring(8, 4);
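For inputs that are not guaranteed to be exactly 12 characters, the grouping can be built in a single pass instead of repeated Insert calls. This is a sketch under assumptions: Grouping and InsertHyphens are hypothetical names, and the group size defaults to 4 to match the question.

```csharp
using System;
using System.Text;

public static class Grouping
{
    // Inserts '-' between consecutive groups of 'groupSize' characters.
    // A final shorter group is kept as-is; no trailing hyphen is added.
    public static string InsertHyphens(string s, int groupSize = 4)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (groupSize <= 0) throw new ArgumentOutOfRangeException(nameof(groupSize));

        var sb = new StringBuilder(s.Length + s.Length / groupSize);
        for (int i = 0; i < s.Length; i++)
        {
            if (i > 0 && i % groupSize == 0) sb.Append('-');
            sb.Append(s[i]);
        }
        return sb.ToString();
    }
}
```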

strip out digits or letters at the most right of a string

I have a file name: kjrjh20111103-BATCH2242_20111113-091337.txt
I only need 091337, not the "txt" or the "-". How can I achieve that? It does not have to be 6 digits; it could be more or fewer, but it will always come after the last "-" and before the ".doc" or ".txt" extension.
You can either do this with a regex, or with simple string operations. For the latter:
int lastDash = text.LastIndexOf('-');
string afterDash = text.Substring(lastDash + 1);
int dot = afterDash.IndexOf('.');
string data = dot == -1 ? afterDash : afterDash.Substring(0, dot);
Personally I find this easier to understand and verify than a regular expression, but your mileage may vary.
String fileName = "kjrjh20111103-BATCH2242_20111113-091337.txt";
String[] splitString = fileName.Split ( new char[] { '-', '.' } );
String Number = splitString[2];
Regex: .*-(?<num>[0-9]*). should do the job. num capture group contains your string.
The Regex would be:
string fileName = "kjrjh20111103-BATCH2242_20111113-091337.txt";
string fileMatch = Regex.Match(fileName, @"(?<=-)\d+", RegexOptions.IgnoreCase).Value;
String fileName = "kjrjh20111103-BATCH2242_20111113-091337.txt";
var startIndex = fileName.LastIndexOf('-') + 1;
var length = fileName.LastIndexOf('.') - startIndex;
var output = fileName.Substring(startIndex, length);
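The last snippet throws if the name has no '.' after the final '-'. A guarded variant is sketched below; FileNameParts and AfterLastDash are hypothetical names, and falling back to the end of the string when there is no extension is an assumption.

```csharp
using System;

public static class FileNameParts
{
    // Returns the text between the last '-' and the last '.' that follows it.
    // Assumption: if no '.' follows the last '-', the rest of the string is returned.
    public static string AfterLastDash(string fileName)
    {
        if (fileName == null) throw new ArgumentNullException(nameof(fileName));

        int start = fileName.LastIndexOf('-') + 1;   // 0 when there is no '-'
        int dot = fileName.LastIndexOf('.');         // -1 when there is no '.'
        if (dot < start) dot = fileName.Length;      // no extension after the dash
        return fileName.Substring(start, dot - start);
    }
}
```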
