finding middle character in string using regex only - c#

How can I find the middle character of a string with regex only?
For example, this shows the expected output:
Hello -> l
world -> r
merged -> rg (see this for an even number of characters)
hi -> hi
I -> I
I tried:
(?<=\w+).(?=\w+)

Regular expressions cannot count in the way that you are looking for. This looks like something regular expressions cannot accomplish. I suggest writing code to solve this.

String str = "Hello";
String mid = "";
int len = str.length();
if (len % 2 == 1)
    // odd length: a single middle character
    mid = Character.toString(str.charAt(len / 2));
else
    // even length: the two middle characters, in their original order
    mid = Character.toString(str.charAt((len / 2) - 1)) + Character.toString(str.charAt(len / 2));
This should probably work.

public static void main(String[] args) {
    String s = "jogijogi";
    int size = s.length() / 2;
    String temp = "";
    if (s.length() % 2 == 0) {
        // even length: take the two middle characters
        temp = s.substring(size - 1, size + 1);
    } else {
        // odd length: take the single middle character
        temp = s.substring(size, size + 1);
    }
    System.out.println(temp);
}
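Both snippets above are Java, while the question is tagged C#. Here is a minimal C# sketch of the same non-regex idea. The names MiddleSketch and Middle are made up for illustration, and returning empty or null input unchanged is an assumption, since the question does not say what should happen there.

```csharp
using System;

public static class MiddleSketch
{
    // Returns the middle character, or the two middle characters
    // when the string has an even length.
    public static string Middle(string s)
    {
        if (string.IsNullOrEmpty(s))
            return s; // assumption: nothing to extract, hand the input back

        int start = (s.Length - 1) / 2;         // index of the (first) middle character
        int count = s.Length % 2 == 0 ? 2 : 1;  // two characters for even lengths
        return s.Substring(start, count);
    }
}
```

With the inputs from the question, Middle("Hello") gives "l" and Middle("merged") gives "rg".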

Related: How to match the middle character in a string with regex?
The following regex is based on @jaytea's approach and works well with e.g. PCRE, Java or C#.
^(?:.(?=.+?(.\1?$)))*?(^..?$|..?(?=\1$))
Here is the demo at regex101 and a .NET demo at RegexPlanet (click the green ".NET" button)
Middle character(s) will be found in the second capturing group. The goal is to capture two middle characters if there is an even number of characters, else one. It works by growing a capture towards the end (first group) while lazily going through the string until it ends with the captured substring, which grows with each repetition. ^..?$ is used to match strings that are one or two characters long.
This "growing" works with capturing inside a repeated lookahead by placing an optional reference to the same group together with a freshly captured character into that group (further reading here).
A PCRE-variant with \K to reset and full matches: ^(?:.(?=.+?(.\1?$)))+?\K..?(?=\1$)|^..?
Curious about the "easy solution using balancing groups" that @Qtax mentions in his question.
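To try the first pattern from C#, something like the following sketch should do. The pattern is copied unchanged from this answer; the wrapper names (MiddleRegexSketch, Middle) are hypothetical, and returning an empty string when nothing matches is an assumption.

```csharp
using System.Text.RegularExpressions;

public static class MiddleRegexSketch
{
    // The pattern quoted in the answer, written as a verbatim string
    // so the backslashes do not need doubling.
    private static readonly Regex MiddlePattern =
        new Regex(@"^(?:.(?=.+?(.\1?$)))*?(^..?$|..?(?=\1$))");

    public static string Middle(string s)
    {
        Match m = MiddlePattern.Match(s);
        // The middle character(s) are in the second capturing group.
        return m.Success ? m.Groups[2].Value : string.Empty;
    }
}
```

Tracing the pattern by hand, "Hello" ends with group 1 holding "lo" and group 2 holding "l", and "merged" ends with group 2 holding "rg". Note that . does not match a newline by default, so this sketch assumes single-line input.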

Related

Moving the first char in a string to the end of the string using a method. C#

I know there are a lot of similar questions asked, and I've looked over those, but I still can't figure out my solution.
I'm trying to write a method that takes the first character of an inputted string and moves it to the back, then I can add additional characters if needed.
Basically if the input is Hello the output would be elloH + "whatever." I hope that makes sense.
As proof that I'm just not being lazy, here is the rest of the source code for the other parts of what I am working on. It all works, I just don't know where to begin with the last part.
Thanks for looking and thanks for the help!
private string CaseSwap(string str)//method for swaping cases
{
string result = ""; //create blank var
foreach (var c in str)
if (char.IsUpper(c)) //find uppers
result += char.ToLower(c); //change to lower
else
result += char.ToUpper(c); //all other lowers changed to upper
str = result; //assign var to str
return str; //return string to method
}
private string Reverse(string str)//method for reversing string
{
char[] revArray = str.ToCharArray(); //copy into an array
Array.Reverse(revArray); //reverse the array
return new string(revArray); //return the new string
}
private string Latin(string str)//method for latin
{
}
}
}

If you want to move first character to the end of string, then you can try below
public string MoveFirstCharToEnd(string str, string whateverStr="")
{
if(string.IsNullOrEmpty(str))
return str;
string result = str.Substring(1) + str[0] + whateverStr;
return result;
}
Note: I added whateverStr as an optional parameter, so that it can support only moving first character to the end and also it supports concatenating extra string to the result.
String.Substring(Int32):
Retrieves a substring from this instance. The substring starts at a
specified character position and continues to the end of the string.

Why not just take the 1st char and combine it with the rest of the string? E.g.
Hello
^^ ^
|| |
|Substring(1) - rest of the string (substring starting from 1)
|
value[0] - first character
Code:
public static string Rotate(string value) => string.IsNullOrEmpty(value)
? value
: $"{value.Substring(1)}{value[0]}";
Generalized implementation for arbitrary rotation (either positive or negative):
public static string Rotate(string value, int count = 1) {
if (string.IsNullOrWhiteSpace(value))
return value;
return string.Concat(Enumerable
.Range(0, value.Length)
.Select(i => value[(i + count % value.Length + value.Length) % value.Length]));
}
You can simplify your current implementation with the help of LINQ:
using System.Linq;
...
private static string CaseSwap(string value) =>
string.Concat(value.Select(c => char.IsUpper(c)
? char.ToLower(c)
: char.ToUpper(c)));
private static string Reverse(string value) =>
string.Concat(value.Reverse());

You can get the first character of a string with the String.Substring(int startPosition, int length) method. The single-argument overload gives you the rest of the text starting from position 1 (skipping the first character). Once you have these two pieces, you can concatenate them.
Don't forget to check for empty strings, this can be done with the String.IsNullOrEmpty(string text) method.
public static string RemoveAndConcatFirstChar(string text){
if (string.IsNullOrEmpty(text)) return "";
return text.Substring(1) + text.Substring(0,1);
}

Appending multiple characters to a string is inefficient due to the number of string objects allocated, which is not just memory intensive it's also slow. There's a reason we have StringBuilder and other such options available to us, like working with char[]s.
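Here's a fairly quick method for rotating a string left one character (moving the first character to the end):
string RotateLeft(string source)
{
    // Nothing to rotate; also avoids indexing into an empty array below
    if (string.IsNullOrEmpty(source))
        return source;
    var chars = source.ToCharArray();
    var initial = chars[0];
    // Shift everything after the first character one slot to the left
    Array.Copy(chars, 1, chars, 0, chars.Length - 1);
    chars[^1] = initial; // index-from-end syntax needs C# 8 or later
    return new String(chars);
}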
Sadly we can't do that in-place in the string itself since they're immutable, so there's no avoiding the temporary array and string construction at the end.
Based on the fact that you called the method Latin(...) and the bit of the question where you said: "Basically if the input is Hello the output would be elloH + "whatever."... I'm assuming that you're writing a Pig Latin translation. If that's the case, you're going to need a bit more.
Pig Latin is a slightly tricky problem because it's based on the sound of the word, not the letters. For example, onto becomes ontohay (or variants thereof) while one becomes unway because the word is pronounced the same as won (with a u to capture the vowel pronunciation correctly). Phonetic operations on English are quite annoying because of all the variations with silent and implied initial letters. And don't even get me started on pseudo-vowels like y.
Special cases aside, the most common rules of Pig Latin translation code appear to be as follows:
Words starting with a single consonant followed by a vowel: move the consonant to the end and append ay.
Words starting with a pair of consonants followed by a vowel: move the consonant pair to the end and append ay.
Words that start with a vowel: append hay, yay, tay, etc.
That third one is a bit difficult since choosing the right suffix is a matter of what makes the result easiest to say... which code can't really decide all that easily. Just pick one and go with that.
Of course there are plenty of words that don't fit those rules. Anything starting with a consonant triplet for example (Christmas being the first that came to mind, followed shortly by strip... and others). Pseudo-vowels like y mess things up (cry for instance). And of course the ever-present problem of correctly representing the initial vowel sounds when you've stripped context: won is converted to un-way vocally, so rendering it as on-way in text is a little bit wrong. Same with word, whose Pig Latin version is pronounced erd-way.
For a simple first pass though... just follow the rules, treating y as a consonant if it's the first letter and as a vowel in the second or third spots.
And since this is so often a homework problem, I'm going to stop here and let you play with it for a bit. Just in case :P
(Oh, and don't forget to preserve the case of your first character just in case you're working on a capitalized word. Latin should become Atinlay, not atinLay. Just saying.)
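For reference, a rough first pass along those lines might look like the sketch below. It is not a full Pig Latin translator: it ignores pronunciation entirely, it moves the whole leading consonant cluster however long it is (a slight generalization of the one- and two-consonant rules), it treats y as a vowel anywhere except the first position, and it always uses hay for vowel-initial words. The names PigLatinSketch, Translate and IsVowel are made up for illustration.

```csharp
using System;

public static class PigLatinSketch
{
    // y counts as a consonant in first position and as a vowel elsewhere (assumption).
    private static bool IsVowel(string word, int i)
    {
        char c = char.ToLowerInvariant(word[i]);
        if ("aeiou".IndexOf(c) >= 0)
            return true;
        return c == 'y' && i > 0;
    }

    public static string Translate(string word)
    {
        if (string.IsNullOrEmpty(word))
            return word;

        // Find where the leading consonant cluster ends.
        int split = 0;
        while (split < word.Length && !IsVowel(word, split))
            split++;

        if (split == 0)
            return word + "hay";   // starts with a vowel: just append a suffix
        if (split == word.Length)
            return word + "ay";    // no vowel at all: leave the letters in place

        string head = word.Substring(0, split);
        string rest = word.Substring(split);

        // Preserve capitalization: "Latin" should become "Atinlay", not "atinLay".
        if (char.IsUpper(word[0]))
        {
            head = char.ToLowerInvariant(head[0]) + head.Substring(1);
            rest = char.ToUpperInvariant(rest[0]) + rest.Substring(1);
        }
        return rest + head + "ay";
    }
}
```

Under these assumptions Translate("Latin") gives "Atinlay" and Translate("onto") gives "ontohay"; words like one or word still come out phonetically wrong, as described above.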

Detect Two Consecutive Single Quotes Inside Single Quotes

I'm struggling to get this regex pattern exactly right, and am open to other options outside of regex if someone has a better alternative.
The situation:
I'm basically looking to parse a T-SQL "in" clause against a text column in C#. So, I need to take a string value like this:
"'don''t', 'do', 'anything', 'stupid'"
And interpret that as a list of values (I'll take care of the double single quotes later):
"don''t"
"do"
"anything"
"stupid"
I have a regex that works for most cases, but I'm struggling to generalize it to the point where it will accept any character OR a doubled-up single quote inside my group: (?:')([a-z0-9\s(?:'(?='))]+)(?:')[,\w]*
I'm fairly experienced with regexes, but have rarely, if ever, found a need for look-arounds (so downgrade my assessment of my regex experience accordingly).
So, to put this another way, I'm wanting to take a string of comma-delimited values, each enclosed in single quotes but can contain doubled single quotes, and output each such value.
EDIT
Here's a non-working example with my current regex (my problem is I need to handle all characters in my grouping and stop when I encounter a single quote not followed by a second single quote):
"'don''t', 'do?', 'anything!', '#stupid$'"

If you still think about a regex-based solution, you can use the following regex:
'(?:''|[^'])*'
Or an "un-rolled" version suggested by @sln:
'[^']*(?:''[^']*)*'
See demo
It is fairly simple, it captures double single quotation marks OR anything that is not a single quotation mark. No need using any look-behinds or look-aheads. It does not take care of any escaped entities, but I do not see this requirement in your question.
Moreover, this regex will return matches that are easy to access and deal with:
var text = "'don''t', 'do', 'anything', 'stupid'";
var re = new Regex(@"'[^']*(?:''[^']*)*'"); // Updated thanks to @sln, previous (@"'(?:''|[^'])*'");
var match_values = re.Matches(text).Cast<Match>().Select(p => p.Value).ToList();
Output (each match keeps its enclosing quotes):
'don''t'
'do'
'anything'
'stupid'

If you want to use the Capture Collection feature, you can grab them all in a
single pass.
# @"""\s*(?:'([^']*(?:''[^']*)*)'\s*(?:,\s*|(?="")))+"""
"
\s*
(?:
'
( # (1 start)
[^']*
(?:
'' [^']*
)*
) # (1 end)
'
\s*
(?:
, \s*
| (?= " )
)
)+
"
C# code:
string strSrc = "\"'don''t', 'do', 'anything', 'stupid'\"";
Regex rx = new Regex(@"""\s*(?:'([^']*(?:''[^']*)*)'\s*(?:,\s*|(?="")))+""");
Match srcMatch = rx.Match(strSrc);
if (srcMatch.Success)
{
CaptureCollection cc = srcMatch.Groups[1].Captures;
for (int i = 0; i < cc.Count; i++)
Console.WriteLine("{0} = '{1}'", i, cc[i].Value);
}
Output:
0 = 'don''t'
1 = 'do'
2 = 'anything'
3 = 'stupid'
Press any key to continue . . .

Why don't you split on ', ':
Regex regex = new Regex(@"'\s*,\s*'");
string[] substrings = regex.Split(str);
And then take care of the extra single quotes by Trimming

Looks to me like you're over-thinking the problem. A quoted string with an escaped quote looks just like two strings without escaped quotes, one right after the other (not even spaces between them).
(?:'[^']*')+
Of course, you'll have to remove the enclosing quotes, but you probably had to do some post-processing anyway, to unescape the escaped quotes.
Also note that I'm not trying to validate the input or work around possible errors; for example, I don't bother matching the commas between the strings. If the input is well formed, this regex should be all you need.

In the interest of maintainability, I decided against a regex and followed the advice of using a state machine. Here's the crux of my implementation:
string currentTerm = string.Empty;
State currentState = State.BetweenTerms;
int index = 0; // position of c within valueToParse, incremented at the end of each iteration
foreach (char c in valueToParse)
{
switch (currentState)
{
// if between terms, only need to do something if we encounter a single quote, signalling to start a new term
// encloser is client-specified char to look for (e.g. ')
case State.BetweenTerms:
if (c == encloser)
{
currentState = State.InTerm;
}
break;
case State.InTerm:
if (c == encloser)
{
if (valueToParse.Length > index + 1 && valueToParse[index + 1] == encloser && valueToParse.Length > index + 2)
{
// if next character is also encloser then add it and move on
currentTerm += c;
}
else if (currentTerm.Length > 0 && currentTerm[currentTerm.Length - 1] != encloser)
{
// on an encloser and didn't just add encloser, so we are done
// converterFunc is a client-specified Func<string,T> to return terms in the specified type (to allow for converting to int, for example)
yield return converterFunc(currentTerm);
currentTerm = string.Empty;
currentState = State.BetweenTerms;
}
}
else
{
currentTerm += c;
}
break;
}
index++;
}
if (currentTerm.Length > 0)
{
yield return converterFunc(currentTerm);
}
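For readers who want something they can paste and run, here is a compact, self-contained sketch of the same state-machine idea. It is not the poster's implementation: it returns plain strings instead of calling a converterFunc, it collapses each doubled encloser into a single one as it goes, and it silently drops a term that is never closed. The names QuotedListParser and Parse are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuotedListParser
{
    // Splits input such as "'don''t', 'do'" into its terms.
    public static List<string> Parse(string input, char encloser = '\'')
    {
        var terms = new List<string>();
        var current = new StringBuilder();
        bool inTerm = false;

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (!inTerm)
            {
                // Between terms: ignore everything until an encloser opens a term.
                if (c == encloser)
                    inTerm = true;
            }
            else if (c != encloser)
            {
                current.Append(c);
            }
            else if (i + 1 < input.Length && input[i + 1] == encloser)
            {
                // Doubled encloser inside a term: keep one and skip the other.
                current.Append(encloser);
                i++;
            }
            else
            {
                // Lone encloser: the term is finished.
                terms.Add(current.ToString());
                current.Clear();
                inTerm = false;
            }
        }
        return terms; // assumption: an unterminated trailing term is discarded
    }
}
```

Called on the non-working example from the question, "'don''t', 'do?', 'anything!', '#stupid$'", this yields don't, do?, anything! and #stupid$.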

Cut the string to be <= 80 characters AND must keep the words without cutting them

I am new to C#, but I have a requirement to cut the strings to be <= 80 characters AND they must keep the words integrity (without cutting them)
Examples
Before: I have a requirenment to cut the strings to be <= 80 characters AND must keep the words without cutting them (length=108)
After: I have a requirenment to cut the strings to be <= 80 characters AND must keep (length=77)
Before: requirenment to cut the strings to be <= 80 characters AND must keep the words without cutting them (length=99)
After: requirenment to cut the strings to be <= 80 characters AND must keep the words (length=78)
Before: I have a requirenment the strings to be <= 80 characters AND must keep the words without cutting them (length=101)
After: I have a requirenment the strings to be <= 80 characters AND must keep the words (length=80)
I want to use a regex, but I don't know anything about regular expressions. It would be a hassle to do all the else-ifs for this.
I would appreciate if you could point me to the right article which I could use to create this expression.
this is my function that I want to cut to one line:
public String cutTitleto80(String s){
String[] words = Regex.Split(s, "\\s+");
String finalResult = "";
foreach (String word in words)
{
String tmp = finalResult + " " + word;
if (tmp.Length > 80)
{
return finalResult;
}
finalResult = tmp;
}
return finalResult;
}

Try
^(.{0,80})(?: |$)
This is a capturing greedy match which must be followed by a space or end of string. You could also use a zero-width lookahead assertion, as in
^.{0,80}(?= |$)
If you use a live test tool like http://regexhero.net/tester/ it's pretty cool, you can actually see it jump back to the word boundary as you type beyond 80 characters.
And here's one which will simply truncate at the 80th character if there are no word boundaries (spaces) to be found:
^(.{1,80}(?: |$)|.{80})
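A minimal sketch of how the lookahead variant could be wrapped in C#, assuming single-line input. The names TruncateSketch and TruncateAtWord and the maxLength parameter are made up for illustration, and the hard cut when no space is found is one possible fallback, in the spirit of the last pattern above.

```csharp
using System.Text.RegularExpressions;

public static class TruncateSketch
{
    public static string TruncateAtWord(string text, int maxLength = 80)
    {
        // Short enough already: nothing to cut.
        if (string.IsNullOrEmpty(text) || text.Length <= maxLength)
            return text;

        // Same shape as ^.{0,80}(?= |$), with the limit substituted in.
        string pattern = "^.{0," + maxLength + "}(?= |$)";
        Match m = Regex.Match(text, pattern);

        // No space within the limit: fall back to a hard cut (assumption).
        return m.Success ? m.Value : text.Substring(0, maxLength);
    }
}
```

The greedy .{0,80} first grabs 80 characters and then backs off one character at a time until the next character is a space, which is what keeps the last word whole.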

Here's an approach without using Regex: just split the string (however you'd like) into whatever you consider "words" to be. Then, just start concatenating them together using a StringBuilder, checking for your desired length, until you can't add the next "word". Then, just return the string that you have built up so far.
(Untested code ahead)
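public string TruncateWithPreservation(string s, int len)
{
    string[] parts = s.Split(' ');
    StringBuilder sb = new StringBuilder();
    foreach (string part in parts)
    {
        // Every word except the first needs one extra character for the separating space
        int separator = sb.Length > 0 ? 1 : 0;
        if (sb.Length + separator + part.Length > len)
            break;
        if (separator == 1)
            sb.Append(' ');
        sb.Append(part);
    }
    return sb.ToString();
}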

string truncatedText = text.Length > 80 ? text.Substring(0, 80) : text; // truncate to 80 characters; shorter strings are left alone
int lastSpace = truncatedText.LastIndexOf(' ');
if (text.Length > 80 && text[80] != ' ' && lastSpace >= 0) // don't remove the last word if a space follows it in the original string (in other words, the last word is already complete)
    truncatedText = truncatedText.Substring(0, lastSpace); // remove any cut-off word
Updated to fix issue from comments where last word could get cut off even if it is complete.

This isn't using regex but this is how I would do it:
Use String.LastIndexOf to get the last space before the 81st char.
If the 81st char is a space, then take everything up to the 80th.
If it returns a number > -1, cut it off there.
If it's -1, you-have-a-really-long-word-or-someone-messing-with-the-system, so you do whatever you like.

C# stripping out the string needed

Ok so i have these strings
location = "C:\\Users\\John\\Desktop\\399";
location = "C:\\Users\\John\\Desktop\\399\\DISK1";
location = "C:\\Users\\John\\Desktop\\399\\DISK2";
location = "\\somewhere\\on\\Network\\399\\DISK2";
how do i strip out the 399 from all these situations ....FYI the number might be 2 digits like 42 so i cant grab the last 3 in the first case....i was thinking of some regex that would take out the DISKn if it exists and grab the number till the \ before the number but i dont know how to do that in C#...any ideas

Here is how to do this with Regex against your example input:
Regex rgx = new Regex(@"\\\d+");
string result = rgx.Replace(input, string.Empty);
The regular expression will match on a \ followed by at least one digit and replace them. You need to be careful though, as it will not preserve the string if you have this pattern elsewhere in the string.
If your inputs are exactly as you have described, using string.Split can be much more efficient (assuming the portion you need to remove is always the last or second-to-last part).
Update:
The regex I provided will work only if a single part of the path starts with digits; it breaks if several parts do, or if a part begins with digits but does not consist only of them.
The information you have provided is not enough to build a regular expression that will do as you wish - how do you distinguish between numeric paths that do need to be stripped out and those that do not, for example?

var parts = location.Split('\\');
// If the last part is a DISKn folder, the number is the part before it
var number = parts.Last().StartsWith("DISK") ? parts[parts.Length - 2] : parts[parts.Length - 1];
strip number out:
var index = parts.Last().StartsWith("DISK") ? parts.Length - 2 : parts.Length - 1;
var newParts = parts.Take(index).Concat(parts.Skip(index + 1)).ToArray();
var newLocation = string.Join("\\", newParts);

Take a look at the Split() method for breaking the string up around separators. Then you can use techniques such as checking for the last part starting with DISK, or checking for a part that is purely integer (possibly risky, in case higher subdirectories are pure numbers - unless you work from the back!).
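The "work from the back" idea can be sketched as follows. This is only one reading of the suggestion: it returns the last path part that consists purely of digits, and returns null when there is none. The names PathNumberSketch and FindNumericPart are made up for illustration.

```csharp
using System;
using System.Linq;

public static class PathNumberSketch
{
    // Scans the path parts from the end and returns the first all-digit part.
    public static string FindNumericPart(string location)
    {
        string[] parts = location.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = parts.Length - 1; i >= 0; i--)
        {
            if (parts[i].All(char.IsDigit))
                return parts[i];
        }
        return null; // assumption: no numeric part means "not found"
    }
}
```

Because it searches from the end, purely numeric directories higher up in the path do not get in the way, and the trailing DISKn folder is skipped without a special case.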

int i = int.Parse(location.Split(new string[] { "\\" }, StringSplitOptions.RemoveEmptyEntries)[4]);

C# program to find no of words that can be generated from a word by adding ' . ' in between characters?

I asked this question in the chat room. but no answer so i am posting the question here.
The question is, for example take the word abcd
it has 4 characters. by adding the ' . ' in between the characters you can write it as a.b.c.d
rules
can use only 1 dot between characters
can use multiple dots in the word
Edit: there can be characters without ' . ' in between them. eg (ab or abcd)
cannot use dot at the beginning or end of the word ie .abcd or abcd. are false
some of the answers
a.b.c.d
a.bcd
ab.cd
abc.d
a.b.cd
a.bc.d
ab.c.d
abc.d
how many words are possible to make? how to write a program to find it in c#?
Edit
how to display each possible word ?

You don't really need to write a program for this.
For a word of n characters, there are n-1 positions where there can be a dot (i.e. between each pair of characters). Each position either has a dot or doesn't.
There are therefore 2^(n-1) possible words.
If you really want to write a C# program to display this:
using System;
class Test
{
static void Main(string[] args)
{
// Argument validation left as an exercise for the reader
string word = args[0];
Console.WriteLine("Word {0} has {1} possibilities",
word, Math.Pow(2, word.Length - 1));
}
}
EDIT: Note that this assumes that the original word (with no dots) still counts. If you don't want it to count, subtract one from the result.
EDIT: I've changed the computation to use Math.Pow so that:
It copes with words with more than 33 letters (up to another limit, of course)
It's clearer
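The same "each position either has a dot or doesn't" argument also gives a direct way to list the words: treat every number from 0 to 2^(n-1) - 1 as a bitmask over the gaps. The sketch below does that; the names DotWordsSketch and Enumerate are hypothetical, and it assumes the word has at most 31 characters so the mask fits in an int.

```csharp
using System.Collections.Generic;
using System.Text;

public static class DotWordsSketch
{
    // Lists every way of placing at most one dot between adjacent characters.
    public static List<string> Enumerate(string word)
    {
        var results = new List<string>();
        if (string.IsNullOrEmpty(word))
            return results;

        int gaps = word.Length - 1; // positions where a dot may go
        for (int mask = 0; mask < (1 << gaps); mask++)
        {
            var sb = new StringBuilder();
            for (int i = 0; i < word.Length; i++)
            {
                sb.Append(word[i]);
                // Bit i of the mask decides whether a dot follows character i.
                if (i < gaps && (mask & (1 << i)) != 0)
                    sb.Append('.');
            }
            results.Add(sb.ToString());
        }
        return results;
    }
}
```

For "abcd" this produces 8 entries, starting with the undotted word (mask 0) and ending with a.b.c.d (all bits set).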

You can do it recursively.
All possible combinations of (abcd) are:
a + . + all combinations of (bcd)
ab + . + all combinations of (cd)
abc + . + all combinations of (d)
abcd
Code:
public static IEnumerable<string> GetCombinations(string str) {
for (int i = 1; i < str.Length; i++) {
foreach (string s in GetCombinations(str.Substring(i))) {
yield return str.Substring(0, i) + "." + s;
}
}
yield return str;
}
Usage:
foreach (string s in GetCombinations("abcd")) Console.WriteLine(s);

Number of combinations:
string s = "abcd";
int len = s.Length;
int combinations = 1 << (len - 1);
EDIT: as Paul notes in the comments,
int combinations = (1 << (len - 1)) - 1; // parentheses are required: - binds tighter than << in C#
to remove the word that contains no dots if that's not a valid combination.

Why do you need a program?
if the string is length n, then there are n-1 places you can put a .
In any place, there can either be a . or not, that is, two options.
So the answer is 2**(n-1) - 1 (the -1 being for the answer that has no dots, i.e. the original word)
