Get a part of a string using regex - c#

I need to get //table[@data-account='test'] from the string //table[@data-account='test']//span[contains(.,'FB')] using regex.
I am new to regex and not able to use the existing samples for my purpose.
Thanks

You don't need regex for that. You can use the String.Split method, which, per the documentation:
Returns a string array that contains the substrings in this string
that are delimited by elements of a specified string array.
string s = @"//table[@data-account='test']//span[contains(.,'FB')]";
string[] stringarray = s.Split(new string[] { "//" }, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine("//" + stringarray[0]);
The output will be:
//table[@data-account='test']

using System;
using System.Text.RegularExpressions;
class P
{
    static void Main()
    {
        // Capture everything from the start of the string up to and including the first ']'.
        Console.WriteLine(
            Regex.Match("//table[@data-account='test']//span[contains(.,'FB')]", @"^([^\]]+\])").Groups[1].Value);
    }
}
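A regex-free alternative can be sketched with IndexOf, assuming the first step always starts with // and that no // appears inside a predicate of that first step. The class and method names here are hypothetical, chosen only for illustration.

```csharp
using System;

public static class FirstStep
{
    // Hypothetical helper: returns the part of an XPath-like string
    // that precedes the second occurrence of "//".
    public static string Extract(string path)
    {
        // Start searching at index 2 to skip the leading "//".
        int next = path.Length > 2 ? path.IndexOf("//", 2, StringComparison.Ordinal) : -1;
        return next < 0 ? path : path.Substring(0, next);
    }

    public static void Main()
    {
        Console.WriteLine(Extract("//table[@data-account='test']//span[contains(.,'FB')]"));
        // prints //table[@data-account='test']
    }
}
```

If the first step could itself contain // (for example inside a quoted attribute value), this sketch would cut the string too early, and a real XPath parser would be the safer choice.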

Related

Get Removed characters from string

I am using Regex to remove unwanted characters from string like below:
str = System.Text.RegularExpressions.Regex.Replace(str, @"[^\u0020-\u007E]", "");
How can I efficiently retrieve the distinct characters that will be removed?
EDIT:
Sample input : str = "This☺ contains Åüsome æspecialæ characters"
Sample output : str = "This contains some special characters"
removedchar = "☺,Å,ü,æ"
// The negated class matches the characters that the Replace call above removes.
string pattern = @"[^\u0020-\u007E]";
Regex rgx = new Regex(pattern);
List<string> matches = new List<string>();
foreach (Match match in rgx.Matches(str))
{
    if (!matches.Contains(match.Value))
    {
        matches.Add(match.Value);
    }
}
Here is an example of how to do it with a callback method passed to the Regex.Replace overload that takes an evaluator:
evaluator
Type: System.Text.RegularExpressions.MatchEvaluator
A custom method that examines each match and returns either the original matched string or a replacement string.
C# demo:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Test
{
    public static List<string> characters = new List<string>();
    public static void Main()
    {
        var str = Regex.Replace("§My string 123”˝", "[^\u0020-\u007E]", Repl);
        Console.WriteLine(str); // => My string 123
        Console.WriteLine(string.Join(", ", characters)); // => §, ”, ˝
    }
    public static string Repl(Match m)
    {
        characters.Add(m.Value);
        return string.Empty;
    }
}
In short, declare and initialize a "global" variable (here a list of strings, characters). Add the Repl method to handle the replacement; each time Regex.Replace calls that method, it adds the matched value to the characters list. Note that the list records every match, so a character that is removed more than once appears more than once.
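Since the question asks for distinct characters, the same evaluator idea can be combined with a HashSet. The following is only a sketch: the class name RemovedChars and the method Strip are hypothetical, and it assumes each removed character should be reported once, in order of first appearance.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RemovedChars
{
    // Hypothetical helper: strips everything outside printable ASCII and
    // reports each removed character once, in order of first appearance.
    public static string Strip(string input, out List<string> removed)
    {
        var seen = new HashSet<string>();
        var ordered = new List<string>();
        string cleaned = Regex.Replace(input, @"[^\u0020-\u007E]", m =>
        {
            // HashSet.Add returns false when the value is already present.
            if (seen.Add(m.Value))
                ordered.Add(m.Value);
            return string.Empty;
        });
        removed = ordered;
        return cleaned;
    }

    public static void Main()
    {
        string cleaned = Strip("This☺ contains Åüsome æspecialæ characters", out List<string> removed);
        Console.WriteLine(cleaned);                   // This contains some special characters
        Console.WriteLine(string.Join(",", removed)); // ☺,Å,ü,æ
    }
}
```

The HashSet lookup avoids the linear List.Contains scan used in the question, which matters only when many different characters are removed.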

How do I combine a string which contains multiple delimiters into a single string separated by commas

I want to join the parts of my string with commas when it contains multiple delimiters.
For example: abc,pqr lmn,rty qqq
Input:
SearchKeyword=abc,pqr lmn,rty qqq.ttt
My attempt:
string output=searchKeyword.Join(",",searchKeyword.Split(new Char [] {',' ,null))
I want my input to be joined by commas into a single string variable, output.
Output: abc,pqr,lmn,qqq,ttt
How do I do this?
The code below converts a string with commas, whitespace and semicolons into a string containing only commas. If necessary, just extend the collection passed to the Split method.
var searchKeyword = "abc,pqr lmn,rty qqq";
var split = searchKeyword.Split(new[] {',', ' ', ';'});
var res = String.Join(",", split);
EDIT
And a oneliner version:
var res = String.Join(",", searchKeyword.Split(new[] { ',', ' ' }));
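One caveat with the snippets above: two adjacent delimiters (for example a comma followed by a space) produce an empty entry, which shows up as a doubled comma in the result. A minimal sketch that avoids this, assuming the delimiter set is comma, space, semicolon and period as in the sample input (the class and method names are hypothetical):

```csharp
using System;

public static class KeywordJoiner
{
    // Hypothetical helper; the delimiter set is an assumption based on the sample input.
    public static string Normalize(string searchKeyword)
    {
        char[] delimiters = { ',', ' ', ';', '.' };
        // RemoveEmptyEntries drops the empty strings produced by adjacent delimiters.
        string[] parts = searchKeyword.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(",", parts);
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("abc,pqr lmn,rty qqq.ttt")); // abc,pqr,lmn,rty,qqq,ttt
        Console.WriteLine(Normalize("abc, pqr"));                // abc,pqr
    }
}
```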
I hope this is a better way to do that. Using this, you can replace any special character with a comma:
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string input = "abc,pqr lmn,rty qqq.ttt";
        // \W matches any single character that is not a letter, digit or underscore.
        string output = Regex.Replace(input, @"\W", ",");
        Console.WriteLine(input);
        Console.WriteLine(output);
    }
}
https://dotnetfiddle.net/PtOPVA

Split string that contains array to get array of values C#

I have a string that contains an array
string str = "array[0]=[1,a,3,4,asdf54,6];array[1]=[1aaa,2,4,k=6,2,8];array[2]=[...]";
I'd like to split it to get an array like this:
str[0] = "[1,a,3,4,asdf54,6]";
str[1] = "[1aaa,2,4,k=6,2,8]";
str[2] = ....
I've tried to use Regex.Split(str, @"\[\D+\]") but it didn't work.
Any suggestions?
Thanks
SOLUTION:
After seeing your answers I used
var arr = Regex.Split(str, @"\];array\[[\d, -]+\]=\[");
This works just fine, thanks all!
var t = str.Split(';').Select(s => s.Split(new char[]{'='}, 2)).Select(s => s.Last()).ToArray();
In regex, \d matches any digit, whilst \D matches anything that is not a digit. I assume your use of the latter is erroneous. Additionally, you should allow your regex to also match minus signs, commas, and spaces, using the character class [\d\-, ]. You can also include a lookbehind for the = character, written as (?<=\=), in order to avoid matching the [0], [1], [2], ...
string str = "array[0]=[1,2,3,4,5,6];array[1]=[1,2,4,6,2,8];array[2]=[...]";
string[] results = Regex.Matches(str, @"(?<=\=)\[[\d\-, ]+\]")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();
Try this. It uses a regular expression lookbehind to grab the relevant parts of your string.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
namespace RegexSplit
{
    class Program
    {
        static void Main(string[] args)
        {
            string str = "array[0]=[1,2,3,4,5,6];array[1]=[1,2,4,6,2,8];array[2]=[...]";
            // Match each bracketed value that is immediately preceded by "]=".
            Regex r = new Regex(@"(?<=\]=)(\[.+?\])");
            string[] results = r.Matches(str).Cast<Match>().Select(p => p.Groups[1].Value).ToArray();
        }
    }
}
BONUS - convert to int[][] if you are fancy.
int[][] ints = results
    .Select(p => p.Split(new[] { '[', ',', ']' }, StringSplitOptions.RemoveEmptyEntries)
        // Omits the '...' match from your sample. You could instead change the regex
        // pattern to catch only digits, but I don't think that is what you wanted.
        .Where(s => { int temp; return int.TryParse(s, out temp); })
        .Select(s => int.Parse(s)).ToArray())
    .ToArray();
Regex would be an option, but it would be a bit complicated. Assuming you don't have a parser for your input, you can try the following:
- Split the string by ; characters to get a string array (string[]):
"array[0]=[1,2,3,4,5,6]", "array[1]=[1,2,4,6,2,8]", "array[2]=[...]". Let's call it list.
Then, for each element of that array (assuming the input is in order), do this:
- Find the index of ]= in the element; let that be x.
- Take the substring of the element starting at index x + 2. Let's call it sub.
- Assign sub back to the current element. For example, if you are iterating with a regular for loop and your index variable is i, as in for(int i = 0; i < len; i++){...}:
list[i] = sub.
I know it is a dirty and error-prone solution: if the input is array[0] =[1,2... instead of array[0]=[1,2,..., it won't work because of the extra space. But if your input matches that exact pattern (no extra spaces, no newlines, etc.), it will do the job.
UPDATE: cosset's answer seems to be the most practical and easiest way to achieve your result, especially if you are familiar with LINQ.
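The manual steps described above can be sketched roughly as follows. The class and method names are hypothetical, and the sketch assumes the exact input shape from the question: no spaces around ]=, and no ; inside the bracketed values.

```csharp
using System;

public static class ArraySplitter
{
    // Hypothetical helper implementing the split-then-substring steps.
    public static string[] ExtractValues(string str)
    {
        // Step 1: split into "array[i]=[...]" elements.
        string[] list = str.Split(';');
        for (int i = 0; i < list.Length; i++)
        {
            // Step 2: find "]=" in the current element.
            int x = list[i].IndexOf("]=", StringComparison.Ordinal);
            // Step 3: keep only what follows "]="; leave the element alone if it is missing.
            if (x >= 0)
                list[i] = list[i].Substring(x + 2);
        }
        return list;
    }

    public static void Main()
    {
        string str = "array[0]=[1,a,3,4,asdf54,6];array[1]=[1aaa,2,4,k=6,2,8];array[2]=[...]";
        foreach (string value in ExtractValues(str))
            Console.WriteLine(value);
        // prints:
        // [1,a,3,4,asdf54,6]
        // [1aaa,2,4,k=6,2,8]
        // [...]
    }
}
```

Searching for ]= rather than a bare = keeps values such as k=6 intact.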
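string[] output = Regex.Matches(input, @"(?<!array)\[.*?\]")
    .Cast<Match>()
    .Select(x => x.Value)
    .ToArray();
OR
// The brackets around the index must be escaped; unescaped, [\d+] is a character class.
// The first element of the result is an empty string, because the input starts with a delimiter.
string[] output = Regex.Split(input, @";?array\[\d+\]=");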
string str = "array[0]=[1,a,3,4,asdf54,6];array[1]=[1aaa,2,4,k=6,2,8];array[2]=[2,3,2,3,2,3=3k3k]";
string[] m1 = str.Split(';');
List<string> m3 = new List<string>();
foreach (string ms in m1)
{
    // Split on the first '=' only, so values such as k=6 stay intact.
    string[] m2 = ms.Split(new[] { '=' }, 2);
    m3.Add(m2[1]);
}

C# string splitting

If I have the string str1|str2|str3|str4 and parse it with | as the delimiter, my output is str1 str2 str3 str4.
But if I have the string str1||str3|str4, the output is str1 str3 str4. What I want the output to be is str1 null/blank str3 str4.
I hope this makes sense.
string createText = "str1||str3|str4";
string[] txt = createText.Split(new[] { '|', ',' },
    StringSplitOptions.RemoveEmptyEntries);
if (File.Exists(path))
{
    //Console.WriteLine("{0} already exists.", path);
    File.Delete(path);
    // write to file.
    using (StreamWriter sw = new StreamWriter(path, true, Encoding.Unicode))
    {
        sw.WriteLine("str1:{0}",txt[0]);
        sw.WriteLine("str2:{0}",txt[1]);
        sw.WriteLine("str3:{0}",txt[2]);
        sw.WriteLine("str4:{0}",txt[3]);
    }
}
Output
str1:str1
str2:str3
str3:str4
str4:"blank"
That's not what I'm looking for. This is the output I would like:
str1:str1
str2:"blank"
str3:str3
str4:str4
Try this one:
str.Split('|')
Without StringSplitOptions.RemoveEmptyEntries passed, it'll work as you want.
this should do the trick...
string s = "str1||str3|str4";
string[] parts = s.Split('|');
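To illustrate the two answers above: Split without StringSplitOptions.RemoveEmptyEntries keeps an empty string where two delimiters are adjacent, so each value stays at its original position. In the sketch below, the class name, method name and the "blank" placeholder are hypothetical, chosen only to mirror the desired output in the question.

```csharp
using System;
using System.Linq;

public static class PipeSplitter
{
    // Hypothetical helper: splits on '|' and substitutes a placeholder for empty fields.
    public static string[] SplitKeepingBlanks(string s, string placeholder)
    {
        return s.Split('|')
                .Select(part => part.Length == 0 ? placeholder : part)
                .ToArray();
    }

    public static void Main()
    {
        string[] parts = SplitKeepingBlanks("str1||str3|str4", "blank");
        for (int i = 0; i < parts.Length; i++)
            Console.WriteLine("str{0}:{1}", i + 1, parts[i]);
        // prints:
        // str1:str1
        // str2:blank
        // str3:str3
        // str4:str4
    }
}
```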
The simplest way is to use quantification:
using System.Text.RegularExpressions;
...
String[] parts = new Regex("[|]+").Split("str1|str2|str3|str4");
The "+" makes a run of consecutive delimiters count as a single delimiter, so the empty entry is dropped.
From Wikipedia:
"+" The plus sign indicates that there is one or more of the preceding element. For example, ab+c matches "abc", "abbc", "abbbc", and so on, but not "ac".
From MSDN: The Regex.Split methods are similar to the String.Split method, except Split splits the string at a delimiter determined by a regular expression instead of a set of characters. The input string is split as many times as possible. If pattern is not found in the input string, the return value contains one element whose value is the original input string.
The additional wish (a placeholder for the blank entry) can be handled with:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1 {
    class Program {
        static void Main(string[] args) {
            // Insert a "blank" placeholder between adjacent delimiters, then split.
            String[] parts = "str1||str2|str3".Replace("||", "|\"blank\"|").Split('|');
            foreach (string s in parts)
                Console.WriteLine(s);
        }
    }
}
Try something like this:
string result = "str1||str3|str4";
List<string> parsedResult = result.Split('|').Select(x => string.IsNullOrEmpty(x) ? "null" : x).ToList();
When using Split(), the resulting string in the array will be empty (not null). In this example I have tested for that and replaced it with the literal word null, so you can see how to substitute another value.

Separating a string into substrings

I want to separate a string consisting of one or more two-letter codes separated by commas into two-letter substrings and put them in a string array or other suitable data structure. The result is at one point to be databound to a combo box so this needs to be taken into account.
The string I want to manipulate can be empty, consist of two letters only, or be made up of multiple two-letter codes separated by commas (and possibly a space).
I was thinking of using a simple string array but I'm not sure if this is the best way to go.
So... what data structure would you recommend that I use and how would you implement it?
Definitely at least start with a string array, because it's the return type of string.Split():
string MyCodes = "AB,BC,CD";
char[] delimiters = new char[] {',', ' '};
string[] codes = MyCodes.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
Update: added space to the delimiters. That will have the effect of trimming spaces from your result strings.
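As a small sketch of how this behaves for the three input shapes in the question (empty, a single code, several codes), assuming a List<string> is wanted for later data binding. The class and method names are hypothetical, and treating null like an empty string is an added assumption.

```csharp
using System;
using System.Collections.Generic;

public static class CodeParser
{
    // Hypothetical helper: splits a comma/space separated list of codes.
    public static List<string> Parse(string codes)
    {
        char[] delimiters = { ',', ' ' };
        // Treat null like an empty string; an empty input yields an empty list.
        string[] parts = (codes ?? string.Empty)
            .Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        return new List<string>(parts);
    }

    public static void Main()
    {
        Console.WriteLine(Parse("").Count);                        // 0
        Console.WriteLine(string.Join("|", Parse("AB")));          // AB
        Console.WriteLine(string.Join("|", Parse("AB, BC,CD")));   // AB|BC|CD
    }
}
```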
Would something like this work?
var list = theString.Split(", ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries).ToList();
My answer is "right", but I suggest Joel Coehoorn's answer.
public static string[] splitItems(string inp)
{
    if (inp.Length == 0)
        return new string[0];
    else
        return inp.Split(',');
}
If you are simply going to bind to the structure then a String[] ought to be fine - if you need to work with the data before you use it as a data source then a List<String> is probably a better choice.
Here is an example:
using System;
using System.Collections.Generic;
class Program
{
    static void Main()
    {
        String s = "ab,cd,ef";
        // either a String[]
        String[] array = s.Split(new Char[] {','});
        // or a List<String>
        List<String> list = new List<String>(s.Split(new Char[] { ',' }));
    }
}
