I have the following string:
string test = "viv-ek is a good boy.Mah - esh is Cra - zy.";
I want to get the words {"Vivek","Mahesh","Crazy"} from that string.
Some of the words are split by "-" and some by " - ".
You can find the words with the following regex:
\b\w+(?:\s-\s|-)\w+\b
Then, in each matched string, replace whatever (?:\s-\s|-) matches with an empty string ''.
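The two steps above (match the broken words, then strip the separator) can be sketched in C# as follows. This is only a minimal illustration; the variable name joined is an assumption and does not come from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string test = "viv-ek is a good boy.Mah - esh is Cra - zy.";

// Step 1: find every word that is broken by "-" or " - ".
// Step 2: strip the separator from each match to rejoin the two halves.
string[] joined = Regex.Matches(test, @"\b\w+(?:\s-\s|-)\w+\b")
    .Cast<Match>()
    .Select(m => Regex.Replace(m.Value, @"\s-\s|-", string.Empty))
    .ToArray();

Console.WriteLine(string.Join(", ", joined)); // vivek, Mahesh, Crazy
```

The original casing is preserved, so the first word comes out as "vivek" rather than "Vivek".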
\b\w+\s*-\s*\w+\b
You can try this. See the demo:
https://regex101.com/r/cZ0sD2/14
This might do the trick for you:
string test = "viv-ek is a good boy.Mah - esh is Cra - zy.";
test = test.Replace(" -", "-").Replace("- ", "-").Replace(".", ". ");
//Or
//test = test.Replace(" - ", "-").Replace(".", ". ");
string[] allwords = test.Split(' ');
List<string> extractedWords = new List<string>();
foreach (string wrd in allwords)
{
    if (wrd.Contains("-"))
    {
        extractedWords.Add(wrd.Replace("-", ""));
    }
}
If you only want to select those words, use this:
string test = "viv-ek is a good boy.Mah - esh is Cra - zy.";
var words =
    Regex
        .Matches(test, @"(?<part>\w+)(\s*-\s*(?<part>\w+))+\b")
        .Cast<Match>()
        .Select(
            x => string.Join(
                string.Empty,
                x.Groups["part"].Captures.Cast<Capture>().Select(capture => capture.Value)))
        .ToList();
words is a list containing "vivek", "Mahesh", "Crazy".
Replacing words works the same way:
var replacingValues = new Dictionary<string, string> { { "Crazy", "XXX" } };
var test = "viv-ek is a good boy.Mah - esh is Cra - zy.";
var replacedTest =
    Regex.Replace(
        test,
        @"\b(?<part>\w+)(\s*-\s*(?<part>\w+))+\b",
        match =>
        {
            var word = string.Join(string.Empty, match.Groups["part"].Captures.Cast<Capture>().Select(capture => capture.Value));
            string replacingValue;
            return replacingValues.TryGetValue(word, out replacingValue) ? replacingValue : match.Value;
        });
replacedTest contains "viv-ek is a good boy.Mah - esh is XXX."
Related
I have a string filled with something like this:
".... . -.--   .--- ..- -.. ."
I need to split it into substrings, but the space in the middle must be written to the string array too.
public static string Decode(string morseCode)
{
string[] words = morseCode.Split(new char[] { ' ' });
...
}
I expect :
words[0] = "....";
words[1] = ".";
words[2] = "-.--";
words[3] = " "; // <- Space in the middle should be preserved
words[4] = ".---";
...
You can try regular expressions in order to match the required chunks:
using System.Linq;
using System.Text.RegularExpressions;
public static string Decode(string morseCode) {
    string[] words = Regex.Matches(morseCode, @"(?<=^|\s).+?(?=$|\s)")
        .Cast<Match>()
        .Select(match => match.Value.All(c => char.IsWhiteSpace(c))
            ? match.Value
            : match.Value.Trim())
        .ToArray();
    //Relevant code here
}
Demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;
...
string morseCode = ".... . -.--   .--- ..- -.. .";
string[] words = Regex.Matches(morseCode, @"(?<=^|\s).+?(?=$|\s)")
    .Cast<Match>()
    .Select(match => match.Value.All(c => char.IsWhiteSpace(c))
        ? match.Value
        : match.Value.Trim())
    .ToArray();
string report = string.Join(Environment.NewLine, words
    .Select((word, i) => $"words[{i}] = \"{word}\""));
Console.Write(report);
Outcome:
words[0] = "...."
words[1] = "."
words[2] = "-.--"
words[3] = " "
words[4] = ".---"
words[5] = "..-"
words[6] = "-.."
words[7] = "."
Try the approach below as well. It also uses Regex.
string code = ".... . -.--   .--- ..- -.. .";
code = Regex.Replace(code, @"(\s{2})", " ");
string[] codes = code.Split(' ');
for (int i = 0; i < codes.Length; i++) {
    Console.WriteLine(i + " - " + codes[i]);
}
The output is as below:
0 - ....
1 - .
2 - -.--
3 -
4 - .---
5 - ..-
6 - -..
7 - .
I replaced every pair of consecutive whitespace characters with a single space and then split the string, so the gap between words leaves an empty entry (index 3 above). Hope this helps.
I want to split a long string (that contains only numbers) into a string array of numbers, each with 8 digits after the decimal point.
For example:
Input:
string str = "45.00019821162.206580920.032150970.03215097244.0031982274.245303020.014716900.046867870.000198351974.613444580.391664580.438532450.00020199 3499.19734739 0.706802871.145335320.000202002543.362378010.513759201.659094520.000202102.391733720.000483371.65957789"
output:
string[] Arr=
"
45.00019821 162.20658092 234.03215097 123123.03215097
255.00019822 74.24530302 23422.01471690 1.04686787
12.00019835 1974.61344458 234.39166458 123212.43853245
532.00020199 3499.19734739 878.70680287 1.14533532
1234.00020200 2543.36237801 23.51375920 1.65909452
12221.00020210 2.39173372 0.00048337 1.65957789"
EDIT:
I tried using
String.Format("{0:0.00000000}", str);
or some SubString such as:
public static string GetSubstring(string input, int count, char delimiter)
{
return string.Join(delimiter.ToString(), input.Split(delimiter).Take(count));
}
with no success.
You can split the string using Regex:
var strRegex = @"(?<num>\d+\.\d{8})";
var myRegex = new Regex(strRegex, RegexOptions.None);
foreach (Match myMatch in myRegex.Matches(str))
{
    var part = myMatch.Groups["num"].Value;
    // convert 'part' to double and store it wherever you want...
}
More compact version:
var myRegex = new Regex(@"(?<num>\d*\.\d{8})", RegexOptions.None);
var myNumbers = myRegex.Matches(str).Cast<Match>()
    .Select(m => m.Groups["num"].Value)
    .Select(v => Convert.ToDouble(v, CultureInfo.InvariantCulture));
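Why this pattern works on a run-together string: every number has exactly eight decimals, so the integer digits of the next number start right after the eighth decimal. Here is a minimal sketch, assuming a shortened sample input (the first three numbers of the question's string); the names sample, parts and values are illustrative only.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened sample (assumption): the first three numbers of the input, run together.
string sample = "45.00019821162.206580920.03215097";

// \d+ takes the integer part, \.\d{8} takes exactly eight decimals;
// the next match then begins at the following digit.
string[] parts = Regex.Matches(sample, @"\d+\.\d{8}")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(" ", parts)); // 45.00019821 162.20658092 0.03215097

// Parse with the invariant culture so "." is always the decimal separator.
double[] values = parts
    .Select(p => double.Parse(p, CultureInfo.InvariantCulture))
    .ToArray();
```

This relies on every number having exactly eight decimals; an input that breaks this rule would be split in the wrong places.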
The input string str can be converted to the desired output as follows.
static IEnumerable<string> NumberParts(string iString)
{
    IEnumerable<char> iSeq = iString;
    while (iSeq.Count() > 0)
    {
        var Result = new String(iSeq.TakeWhile(Char.IsDigit).ToArray());
        iSeq = iSeq.SkipWhile(Char.IsDigit);
        Result += new String(iSeq.Take(1).ToArray());
        iSeq = iSeq.Skip(1);
        Result += new String(iSeq.Take(8).ToArray());
        iSeq = iSeq.Skip(8);
        yield return Result;
    }
}
The parsing method above can be called as follows.
var Parts = NumberParts(str).ToArray();
var Result = String.Join(" ", Parts);
This would be the classic for-loop version of it (no magic involved):
// split by separator
string[] allparts = str.Split('.');
// Container for the resulting numbers
List<string> numbers = new List<string>();
// Handle the first number separately
string start = allparts[0];
string decimalPart = "";
for (int i = 1; i < allparts.Length; i++)
{
    decimalPart = allparts[i].Substring(0, 8);
    numbers.Add(start + "." + decimalPart);
    // overwrite the start with the integer part of the next number
    start = allparts[i].Substring(8, allparts[i].Length - 8);
}
EDIT:
Here would be a LINQ Version yielding the same result:
// split by separator
string[] allparts = str.Split('.');
IEnumerable<string> allInteger = allparts.Select(x => x.Length > 8 ? x.Substring(8, x.Length - 8) : x);
IEnumerable<string> allDecimals = allparts.Skip(1).Select(x => x.Substring(0,8));
string [] allWholeNumbers = allInteger.Zip(allDecimals, (i, d) => i + "." + d).ToArray();
The shortest way without regex:
var splitted = ("00000000" + str.Replace(" ", "")).Split('.');
var result = splitted
    .Zip(splitted.Skip(1), (f, s) =>
        string.Concat(f.Skip(8).Concat(".").Concat(s.Take(8))))
    .ToList();
I'm trying to get the table name from a string that is in the format:
[schemaname].[tablename]
I think this can be done with split but not sure how to handle the trailing ] character.
A simple approach is using String.Split and String.Trim in this little LINQ query:
string input = "[schemaname].[tablename]";
string[] schemaAndTable = input.Split('.')
.Select(t => t.Trim('[', ']'))
.ToArray();
string schema = schemaAndTable[0];
string table = schemaAndTable[1];
Another one using IndexOf and Substring:
int pointIndex = input.IndexOf('.');
if(pointIndex >= 0)
{
string schema = input.Substring(0, pointIndex).Trim('[', ']');
string table = input.Substring(pointIndex + 1).Trim('[', ']');
}
//find the separator
var pos = str.IndexOf("].[");
if (pos == -1)
    return null; //sorry, can't be found.
//copy everything after the found position, skipping the ].[ itself
// and also ignoring the last ]
var tableName = str.Substring(pos + 3, str.Length - pos - 4);
Just to be different, here is another version with regex:
var result = Regex.Match(s, @"(?<=\.\[)\w+").Value;
Split by the three characters [, . and ] with the option RemoveEmptyEntries, which is pretty self-explanatory.
var result = input.Split(new [] {'[','.',']'}, StringSplitOptions.RemoveEmptyEntries);
Try this:
var tableAndSchema = "[schemaname].[tablename]";
var tableName = tableAndSchema
.Split('.')[1]
.TrimStart('[')
.TrimEnd(']');
Split will split the string on the . character and turn it into an array of two strings:
[0] = "[schemaname]"
[1] = "[tablename]"
The second (index 1) element is the one you want. TrimStart and TrimEnd will remove the starting and ending brackets.
Another way to do this is with regular expressions:
var tableAndSchema = "[schemaname].[tablename]";
var regex = new Regex(@"\[.*\]\.\[(.*)\]");
var tableName = regex.Match(tableAndSchema).Groups[1].Value;
The regex pattern \[.*\]\.\[(.*)\] creates a capture group for the characters within the second pair of brackets and lets you easily pull them out.
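As a variation on the capture-group idea, named groups can pull out both parts at once. The following is only a sketch: the group names schema and table and the variable names are assumptions, and [^\]]+ is swapped in for .* so that each group cannot run past its own closing bracket.

```csharp
using System;
using System.Text.RegularExpressions;

string qualifiedName = "[schemaname].[tablename]";

// [^\]]+ matches everything up to the next closing bracket,
// so each named group stays inside its own pair of brackets.
Match nameMatch = Regex.Match(
    qualifiedName,
    @"^\[(?<schema>[^\]]+)\]\.\[(?<table>[^\]]+)\]$");

// Fall back to an empty string when the input does not have the expected shape.
string schemaName = nameMatch.Success ? nameMatch.Groups["schema"].Value : string.Empty;
string tableNameOnly = nameMatch.Success ? nameMatch.Groups["table"].Value : string.Empty;

Console.WriteLine($"{schemaName} / {tableNameOnly}"); // schemaname / tablename
```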
var res = input.Split('.')[1].Trim('[', ']');
Another LINQ solution:
var tableName = String.Join("", input.SkipWhile(c => c != '.').Skip(1)
.Where(c => Char.IsLetter(c)));
Say we have a list of strings L and a given string S. We have a regexp like (\w+)\-(\w+), and we want to get all elements of L for which S equals the text captured by group $1. How can this be done?
You can do this:
// sample data
string[] L = new string[] { "bar foo", "foo bar-zoo", "bar-", "zoo bar-foo" };
string S = "bar";
Regex regex = new Regex(@"(\w+)\-(\w+)");
string[] res = L.Where(l => {
    Match m = regex.Match(l);
    if (m.Success) return m.Groups[1].Value == S;
    else return false;
}).ToArray();
and get
foo bar-zoo
zoo bar-foo
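An easier way that probably works for you too is to include S in the regex:
// Regex.Escape protects against regex metacharacters in S; \b stops S from matching the tail of a longer word
Regex regex = new Regex(@"\b" + Regex.Escape(S) + @"\-(\w+)");
string[] res = L.Where(l => regex.Match(l).Success).ToArray();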
I'd like to split a string using the Split function in the Regex class. The problem is that it removes the delimiters and I'd like to keep them. Preferably as separate elements in the splitee.
According to other discussions that I've found, there are only inconvenient ways to achieve that.
Any suggestions?
Just put the pattern into a capture-group, and the matches will also be included in the result.
string[] result = Regex.Split("123.456.789", @"(\.)");
Result:
{ "123", ".", "456", ".", "789" }
This also works for many other languages:
JavaScript: "123.456.789".split(/(\.)/g)
Python: re.split(r"(\.)", "123.456.789")
Perl: split(/(\.)/g, "123.456.789")
(Not Java though)
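The same capture-group trick extends to several delimiters at once. Here is a minimal C# sketch that compares splitting with and without the group; the sample input and variable names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// The parentheses make the delimiter a capture group,
// so Regex.Split returns it between the surrounding pieces.
string[] pieces = Regex.Split("asdf,qwer;zxcv", @"([,;])");

Console.WriteLine(string.Join(" | ", pieces)); // asdf | , | qwer | ; | zxcv

// Without the group the delimiters are dropped.
string[] withoutDelimiters = Regex.Split("asdf,qwer;zxcv", @"[,;]");

Console.WriteLine(string.Join(" | ", withoutDelimiters)); // asdf | qwer | zxcv
```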
Use Matches to find the separators in the string, then get the values and the separators.
Example:
string input = "asdf,asdf;asdf.asdf,asdf,asdf";
var values = new List<string>();
int pos = 0;
foreach (Match m in Regex.Matches(input, "[,.;]")) {
    values.Add(input.Substring(pos, m.Index - pos));
    values.Add(m.Value);
    pos = m.Index + m.Length;
}
values.Add(input.Substring(pos));
Say that input is "abc1defg2hi3jkl" and regex is to pick out digits.
String input = "abc1defg2hi3jkl";
var parts = Regex.Matches(input, @"\d+|\D+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();
Parts would be: abc 1 defg 2 hi 3 jkl
For Java:
Arrays.stream("123.456.789".split("(?<=\\.)|(?=\\.)+"))
.forEach((p) -> {
System.out.println(p);
});
outputs:
123
.
456
.
789
Inspired by this post (How to split string but keep delimiters in java?)
Add them back:
string[] Parts = "A,B,C,D,E".Split(',');
string[] Parts2 = new string[Parts.Length * 2 - 1];
for (int i = 0; i < Parts.Length; i++)
{
    Parts2[i * 2] = Parts[i];
    if (i < Parts.Length - 1)
        Parts2[i * 2 + 1] = ",";
}
For C#:
Split a paragraph into sentences, keeping the delimiters.
A sentence ends with . or ? or ! followed by one space (without the space requirement, an email address inside a sentence would be split too).
string data="first. second! third? ";
Regex delimiter = new Regex("(?<=[.?!] )"); //there is a space between ] and )
string[] afterRegex=delimiter.Split(data);
Result
first.
second!
third?
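To illustrate the point about email addresses, here is a small sketch using the same lookbehind on a hypothetical paragraph. The sample text, the Trim call and the empty-entry filter are additions for illustration and are not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample text containing an email address.
string paragraph = "Mail a.b@example.com now. Done! Really? ";

// Zero-width lookbehind: split after ". ", "! " or "? " without consuming anything,
// so the dots inside the address (not followed by a space) are left alone.
string[] sentences = Regex.Split(paragraph, @"(?<=[.?!] )")
    .Select(s => s.Trim())      // drop the trailing space kept with each sentence
    .Where(s => s.Length > 0)   // drop an empty entry if the text ends with a delimiter
    .ToArray();

foreach (string sentence in sentences)
    Console.WriteLine(sentence);
```

Each sentence keeps its own punctuation mark because the lookbehind only marks the split position and removes nothing.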