I am new to regex. I have this string
new.TITLE['kinds.of'].food
or
new.TITLE['deep thought'].food
I want to retrieve these tokens:
new, TITLE, kinds.of, food.
or (2nd example)
new, TITLE, deep thought, food.
I can't simply split on '.'; I need a regex match to get the values.
How is it done?
When working with tokens, a parser (an FSM, Finite State Machine, in this case) should do:
private static IEnumerable<string> ParseIt(string value) {
int lastIndex = 0;
bool inApostroph = false;
for (int i = 0; i < value.Length; ++i) {
char ch = value[i];
if (ch == '\'') {
inApostroph = !inApostroph;
continue;
}
if (inApostroph)
continue;
if (ch == '.' || ch == ']' || ch == '[') {
if (i - lastIndex > 0) {
if (value[lastIndex] != '\'')
yield return value.Substring(lastIndex, i - lastIndex);
else {
string result = value.Substring(lastIndex, i - lastIndex).Replace("''", "'");
yield return result.Substring(1, result.Length - 2);
}
}
lastIndex = i + 1;
}
}
if (lastIndex < value.Length)
yield return value.Substring(lastIndex);
}
Tests:
string test1 = @"new.TITLE['kinds.of'].food";
string test2 = @"new.TITLE['deep thought'].food";
string[] result1 = ParseIt(test1).ToArray();
string[] result2 = ParseIt(test2).ToArray();
Console.WriteLine(string.Join(Environment.NewLine, result1));
Console.WriteLine(string.Join(Environment.NewLine, result2));
Outcome:
new
TITLE
kinds.of
food
new
TITLE
deep thought
food
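Since the question asks for a regex, here is a minimal sketch of one, offered as an alternative to the parser above rather than a replacement. The class and method names (`TokenRegex`, `Tokenize`) are made up for illustration. It assumes quoted parts never contain an escaped apostrophe, which the parser above does handle.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenRegex
{
    // Either a quoted run (group 1, quotes excluded) or a run of characters
    // that are not '.', '[', ']' or an apostrophe (group 2).
    private static readonly Regex Token = new Regex(@"'([^']*)'|([^.\[\]']+)");

    public static List<string> Tokenize(string value)
    {
        var tokens = new List<string>();
        foreach (Match m in Token.Matches(value))
        {
            // Prefer the quoted content when the quoted alternative matched.
            tokens.Add(m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value);
        }
        return tokens;
    }
}
```

Calling `TokenRegex.Tokenize(test1)` on the first test string should give the same four tokens as the parser.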
What I want to do is split a string and put each character I split on into its own element,
i.e. string text = "1*5+89-43&99" should become string[] textsplit = ["1","*","5","+","89","-","43","&","99"] (the elements must be strings),
and I will supply the characters that should be left in separate elements.
You can do this using string.IndexOfAny.
Simply keep looking for the next index of any of the separators. When you find a separator, add the text between it and the last separator to your results, then look for the next separator.
string input = "1*1*5+89-43&33";
var separators = new[] { '+', '-', '*', '/', '&' };
var result = new List<string>();
int index;
int lastIndex = 0;
while ((index = input.IndexOfAny(separators, lastIndex)) != -1)
{
// Add the text before the separator, if there is any
if (index - lastIndex > 0)
{
result.Add(input.Substring(lastIndex, index - lastIndex));
}
// Add the separator itself
result.Add(input[index].ToString());
lastIndex = index + 1;
}
// Add any text after the last separator
if (lastIndex < input.Length)
{
result.Add(input.Substring(lastIndex));
}
Try with the following code snippet:
string text = "1*1*5+89-43&33";
List<string> textsplit = new List<string>();
foreach (Match match in Regex.Matches(text, @"([*+/\-&)(])|([0-9]+)"))
{
textsplit.Add(match.ToString());
}
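Because the question says the separator characters are supplied by the caller, the pattern can also be built from them. The sketch below relies on the documented behaviour that `Regex.Split` includes captured text in its result. The names `KeepSeparators` and `Split` are hypothetical, and the separator array is assumed to be non-empty.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeepSeparators
{
    public static string[] Split(string text, char[] separators)
    {
        // Escape each separator on its own and join with '|'. This avoids
        // accidental ranges such as "+-*" inside a character class.
        string pattern = "(" + string.Join("|",
            separators.Select(c => Regex.Escape(c.ToString()))) + ")";

        // The capturing group makes Regex.Split return the separators too.
        // Empty pieces (adjacent or leading separators) are dropped.
        return Regex.Split(text, pattern)
                    .Where(part => part.Length > 0)
                    .ToArray();
    }
}
```

For example, `KeepSeparators.Split("1*5+89-43&99", new[] { '+', '-', '*', '/', '&' })` keeps each operator as its own element.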
Here's a basic and naive implementation that I believe will do what you want:
public static List<string> SplitExpression(string expression)
{
var parts = new List<string>();
bool isNumber(char c) => c == '.' || (c >= '0' && c <= '9');
bool isOperator(char c) => !isNumber(c);
int index = 0;
while (index < expression.Length)
{
char c = expression[index];
index++;
if (isNumber(c))
{
int numberIndex = index - 1;
while (index < expression.Length && isNumber(expression[index]))
index++;
parts.Add(expression.Substring(numberIndex, index - numberIndex));
}
else
parts.Add(c.ToString());
}
// move unary signs into following number
index = 0;
while (index < parts.Count - 1)
{
bool isSign = parts[index] == "-" || parts[index] == "+";
bool isFirstOrFollowingOperator = index == 0 || isOperator(parts[index - 1][0]);
bool isPriorToNumber = isNumber(parts[index + 1][0]);
if (isSign && isFirstOrFollowingOperator && isPriorToNumber)
{
parts[index + 1] = parts[index] + parts[index + 1];
parts.RemoveAt(index);
}
else
index++;
}
return parts;
}
Example input: "-1+-2*-10.1*.1", and output:
-1
+
-2
*
-10.1
*
.1
I want to calculate a summary of a string in terms of the number of letters, digits and special characters in C#. For example:
The string abc123$% should have the summary A3D3S2 (3 letters, 3 digits and 2 special characters)
a34=$# should have the summary A1D2S3 (1 letter, 2 digits and 3 special characters)
a3b$s should have the summary A1D1A1S1A1 (1 letter, 1 digit, 1 letter, 1 special character, 1 letter)
Can anyone guide me on how to write an algorithm that performs this task quickly? I think that searching the string character by character will take a considerable amount of time, and I have a large dataset of strings.
This works:
static string GetSummary(string input)
{
var sb = new StringBuilder();
string prevMode = "";
string curMode = "";
int sameModeCount = 0;
for (int i = 0; i <= input.Length; ++i)
{
if (i < input.Length)
{
char c = input[i];
if ('a' <= c && c <= 'z' || 'A' <= c && c <= 'Z')
{
curMode = "A";
}
else if ('0' <= c && c <= '9')
{
curMode = "D";
}
else
{
curMode = "S";
}
}
else
{
curMode = "";
}
if (curMode != prevMode && prevMode != "")
{
sb.Append(prevMode);
sb.Append(sameModeCount);
sameModeCount = 0;
}
prevMode = curMode;
++sameModeCount;
}
return sb.ToString();
}
Test:
public static void Main()
{
Console.WriteLine(GetSummary("abc123$%"));
Console.WriteLine(GetSummary("a34=$#"));
Console.WriteLine(GetSummary("a3b$s"));
}
Results:
A3D3S2
A1D2S3
A1D1A1S1A1
With Linq, you can do it like this (note that this counts totals over the whole string rather than consecutive runs):
string myinput = "abc123$%";
int letter =0 , digit = 0, specialCharacter = 0;
myinput.ToCharArray().ToList().ForEach(x =>
{
letter = Char.IsLetter(x) ? ++letter : letter;
digit = Char.IsDigit(x) ? ++digit : digit;
specialCharacter = !Char.IsLetterOrDigit(x) ?
++specialCharacter : specialCharacter;
});
string formattedVal = String.Format("A{0}D{1}S{2}", letter, digit,
specialCharacter);
You can also call Array.ForEach on the character array directly, without converting it to a list:
Array.ForEach(myinput.ToCharArray(), x =>
{
letter = Char.IsLetter(x) ? ++letter : letter;
digit = Char.IsDigit(x) ? ++digit : digit;
specialCharacter = !Char.IsLetterOrDigit(x) ? ++specialCharacter : specialCharacter;
});
string formattedVal = String.Format("A{0}D{1}S{2}", letter, digit, specialCharacter);
This should work:
string s = "a3b$s";
char etype = 'X'; //current character's type
char etypeinit = 'X'; //tracker variable - holds type of last character
string str = "";
int count = 1;
foreach(char c in s)
{
//Use this block of conditionals to assign type for current character
if(char.IsLetter(c))
{
etype = 'A';
}
else if(char.IsDigit(c))
{
etype = 'D';
}
else
{
etype = 'S';
}
//The type changed, so the previous run is complete
if(etypeinit != etype)
{
if(etypeinit != 'X')
str += string.Format("{0}{1}",etypeinit,count); //Append the finished run, not the current character
count = 1; //Start counting the new run
}
else
{
count++; //Increment because this is the same type as previous one
}
etypeinit = etype; //Set tracker variable to type of current character
}
if(etypeinit != 'X')
str += string.Format("{0}{1}",etypeinit,count); //Append the last run
Console.WriteLine(str);
A little late and a little complex, but this produces the expected output for all the inputs given in the question. Please take a look:
string inputString = "abc123$%ab12";
var results = inputString.Select(x => char.IsLetter(x) ? 'A' :
char.IsDigit(x) ? 'D' : 'S');
StringBuilder outPutBuilder = new StringBuilder();
char previousChar = results.First();
int charCount = 0;
foreach (var item in results)
{
switch (item)
{
case 'A':
if (previousChar == 'A')
{
charCount++;
}
else
{
outPutBuilder.Append(previousChar.ToString() + charCount);
charCount = 1;
}
break;
case 'D':
if (previousChar == 'D')
charCount++;
else
{
outPutBuilder.Append(previousChar.ToString() + charCount);
charCount = 1;
}
break;
default:
if (previousChar == 'S')
charCount++;
else
{
outPutBuilder.Append(previousChar.ToString() + charCount);
charCount = 1;
}
break;
}
previousChar = item;
}
outPutBuilder.Append(previousChar.ToString() + charCount);
Use a for loop to go through each character. If the character is in the range a-z or A-Z, it is a letter; if it is in the range 0-9, it is a digit; otherwise it is a special character.
Code
string inputStr = "a3b$s";
string outputStr = string.Empty;
char firstChar = Convert.ToChar(inputStr.Substring(0, 1));
outputStr = char.IsLetter(firstChar) ? "A1" : char.IsDigit(firstChar) ? "D1" : "S1";
for (int i = 1; i < inputStr.Length; i++)
{
char nextChar = char.IsLetter(inputStr[i]) ? 'A' :
char.IsDigit(inputStr[i]) ? 'D' : 'S';
char prevChar = Convert.ToChar(outputStr.Substring(outputStr.Length - 2, 1));
if (nextChar == prevChar)
{
int lastDig = Convert.ToInt32(outputStr.Substring(outputStr.Length - 1, 1));
outputStr = outputStr.Substring(0, outputStr.Length - 1) +
(lastDig + 1).ToString();
}
else
outputStr += nextChar.ToString() + "1";
}
Console.WriteLine(outputStr.ToString());
Output
A1D1A1S1A1
The string has to be split into 4 pairwise different non-empty parts. For example,
"happynewyear" could become ["happy", "new", "ye" and "ar"]
No deletion or reordering of characters is permitted.
This question was part of an online competition, which is now over. I have written the following C# code which works for the test cases which I have run but it failed in 3 test cases after submission. I am not sure what cases I might be missing, can anyone help?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Hackerearth___India_Hacks
{
class Program
{
static void Main(string[] args)
{
var line1 = System.Console.ReadLine().Trim();
var N = Int32.Parse(line1);
string[] s = new string[N];
string result = "";
for (var i = 0; i < N; i++)
{
s[i] = System.Console.ReadLine().Trim();
result = result + "\n" + check(s[i]);
}
System.Console.Write(result);
Console.ReadKey();
}
static string check(string s)
{
if (s.Length > 3)
{
string[] s1 = new string[4];
int k = 0;
string c = "";
foreach (char ch in s)
{
c = c + ch.ToString();
// Console.WriteLine("C :" +c);
if (k == 0)
{
s1[k] = c;
c = "";
k = 1;
}
else
for (int i = 0; i < k; i++)
{
int f = 0;
for (int j = 0; j < k; j++)
{
if (s1[j].Equals(c) || c == "")
f=1;
}
if (f == 1)
break;
s1[k] = c;
c = "";
if (k == 3 && s1[k] != null)
return "YES";
k++;
// Console.WriteLine("K :"+s[k]);
}
}
return "NO";
}
else
{
return "NO";
}
}
}
}
This is an example that would not work with your algorithm: "aababa". A valid answer is ["aa", "b", "a", "ba"], but your algorithm always assumes that the first character on its own is the first string of the solution. That assumption is false. With "a" as the first string, your algorithm builds ["a", "ab", "aba"] and then fails, because there are no characters left for a fourth string.
A recursive solution makes sense to me... here's some code that I think would work.
EDIT: it does work... here's a dotnetfiddle
public static List<string> FindStrings(string s, int n) {
if (n == 0) {
if (string.IsNullOrEmpty(s)) {
return new List<string>{ };
}
return null; // null means invalid
}
for (var i=s.Length-1; i>=0; i--){
var startOfString = s.Substring(0, i);
var endOfString = s.Substring(i);
var list = FindStrings(startOfString, n-1);
// invalid... gotta continue to next try
if (list == null) continue;
// make sure there are no matches so far
if (list.Contains(endOfString)) continue;
// bingo!
if (list.Count == n-1) {
list.Add(endOfString);
return list;
}
}
return null; // null means invalid
}
One way to tackle this problem is to generate every possible way of splitting the string into 4 substrings, then go through those splits and check whether any of them has all-distinct parts.
private static void Main(string[] args)
{
var N = int.Parse(Console.ReadLine());
for (var i = 0; i < N; i++)
{
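Console.WriteLine(IsPairwiseUnique(Console.ReadLine(), 4) ? "YES" : "NO");
}
}
public static bool IsPairwiseUnique(string s, int count)
{
// Too short to split into `count` non-empty parts (AllSubstrings would throw).
if (s.Length < count) return false;
return s.AllSubstrings(count).Any(subs => subs.Count == subs.Distinct().Count());
}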
public static IEnumerable<List<string>> AllSubstrings(this string str, int count)
{
if(str.Length < count)
throw new ArgumentException("Not enough characters");
if(count <= 0)
throw new ArgumentException("Must be greater than 0", nameof(count));
// Base case of only one substring, just return the original string.
if (count == 1)
{
yield return new List<string> { str };
yield break;
}
// break the string down by making a substring of all possible lengths from the first n
// then recursively call to get the possible substrings for the rest of the string.
for (int i = 1; i <= str.Length - count + 1; i++)
{
foreach (var subsubstrings in str.Substring(i).AllSubstrings(count - 1))
{
subsubstrings.Insert(0, str.Substring(0, i));
yield return subsubstrings;
}
}
}
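For the fixed case of 4 parts, the same exhaustive search can be sketched without recursion by looping over the three cut positions. The names below (`FourParts`, `Split`) are hypothetical. The sketch also uses one observation: a string of length 10 or more can always be split into parts of lengths 1, 2, 3 and the rest (at least 4), and parts of different lengths cannot be equal.

```csharp
using System;

public static class FourParts
{
    // Returns four pairwise different non-empty parts, or null if no split exists.
    public static string[] Split(string s)
    {
        int n = s.Length;

        // Lengths 1, 2, 3 and (n - 6) >= 4 are all different, so the parts are too.
        if (n >= 10)
            return new[] { s.Substring(0, 1), s.Substring(1, 2), s.Substring(3, 3), s.Substring(6) };

        // i, j, k are the cut positions: s[0..i), s[i..j), s[j..k), s[k..n)
        for (int i = 1; i <= n - 3; i++)
            for (int j = i + 1; j <= n - 2; j++)
                for (int k = j + 1; k <= n - 1; k++)
                {
                    string a = s.Substring(0, i), b = s.Substring(i, j - i),
                           c = s.Substring(j, k - j), d = s.Substring(k);
                    if (a != b && a != c && a != d && b != c && b != d && c != d)
                        return new[] { a, b, c, d };
                }
        return null;
    }
}
```

With this sketch, `FourParts.Split("aababa")` finds a valid split, while `FourParts.Split("aaaa")` returns null.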
My initial code is 'A0AA', and I need C# code (a function) that will increment it until it reaches 'Z9ZZ'.
For example:
first code is 'D9ZZ'
the next code should be 'E0AA'
Sorry, maybe my example is confusing. Here's another one:
first code is 'D9AZ'
the next code should be 'D9BA'
string start = "A9ZZ";
int add = 1;
string next = String.Concat(start.Reverse().Select((x,i) =>
{
char first = i == 2 ? '0' : 'A';
char last = i == 2 ? '9' : 'Z';
if ((x += (char)add) > last)
{
return first;
}
else
{
add = 0;
return x;
}
})
.Reverse());
This should fix it.
private static IEnumerable<string> Increment(string value)
{
if (value.Length != 4)
throw new ArgumentException();
char[] next = value.ToCharArray();
while (new string(next) != "Z9ZZ")
{
next[3]++;
if (next[3] > 'Z')
{
next[3] = 'A';
next[2]++;
}
if (next[2] > 'Z')
{
next[2] = 'A';
next[1]++;
}
if (next[1] > '9')
{
next[1] = '0';
next[0]++;
}
yield return new string(next);
}
}
Example of calling this code:
IList<string> values = Increment("A0AA").Take(100).ToList();
foreach (var value in values)
{
Console.Write(value + " ");
}
Here's a pretty clean solution that checks every character starting at the end:
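public void SomeMethod()
{
var next = Increment("A2CZ"); // A2DA
}
public string Increment(string code)
{
var arr = code.ToCharArray();
for (var i = arr.Length - 1; i >= 0; i--)
{
var c = arr[i];
if (c == 'Z' || c == '9')
{
// This position is at its maximum: wrap it and carry into the position to its left
arr[i] = c == 'Z' ? 'A' : '0';
continue;
}
arr[i]++;
return new string(arr);
}
return code; // already at the maximum (e.g. "Z9ZZ"), so return it unchanged
}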
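Another way to look at the problem is to treat the code as a mixed-radix number: letter, digit, letter, letter, with bases 26, 10, 26, 26. The sketch below converts a code to an integer and back. All names (`CodeCounter`, `ToNumber`, `FromNumber`, `Next`) are made up for illustration, and it assumes the input is always a valid 4-character code. Unlike the answers above, it wraps 'Z9ZZ' around to 'A0AA'.

```csharp
using System;

public static class CodeCounter
{
    // Base and lowest character of each position in a code like "A0AA".
    private static readonly int[] Radix = { 26, 10, 26, 26 };
    private static readonly char[] Zero = { 'A', '0', 'A', 'A' };

    public static int ToNumber(string code)
    {
        int n = 0;
        for (int i = 0; i < 4; i++)
            n = n * Radix[i] + (code[i] - Zero[i]);
        return n;
    }

    public static string FromNumber(int n)
    {
        var chars = new char[4];
        // Fill from the rightmost (least significant) position.
        for (int i = 3; i >= 0; i--)
        {
            chars[i] = (char)(Zero[i] + n % Radix[i]);
            n /= Radix[i];
        }
        return new string(chars);
    }

    public static string Next(string code) => FromNumber(ToNumber(code) + 1);
}
```

A side benefit of this representation is that adding any step, not only 1, is a single integer addition.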
I need to develop an efficient algorithm for determining the unique (repeated) string given a string with repeating content (and only repeating content)...
For example:
"AbcdAbcdAbcdAbcd" => "Abcd"
"Hello" => "Hello"
I'm having some trouble coming up with an algorithm that is fairly efficient; any input would be appreciated.
Clarification: I want the shortest string that, when repeated enough times, is equal to the total string.
private static string FindShortestRepeatingString(string value)
{
if (value == null) throw new ArgumentNullException("value", "The value parameter is null.");
for (int substringLength = 1; substringLength <= value.Length / 2; substringLength++)
if (IsRepeatingStringOfLength(value, substringLength))
return value.Substring(0, substringLength);
return value;
}
private static bool IsRepeatingStringOfLength(string value, int substringLength)
{
if (value.Length % substringLength != 0)
return false;
int instanceCount = value.Length / substringLength;
for (int characterCounter = 0; characterCounter < substringLength; characterCounter++)
{
char currentChar = value[characterCounter];
for (int instanceCounter = 1; instanceCounter < instanceCount; instanceCounter++)
if (value[instanceCounter * substringLength + characterCounter] != currentChar)
return false;
}
return true;
}
Maybe this can work:
static string FindShortestSubstringPeriod(string input)
{
if (string.IsNullOrEmpty(input))
return input;
for (int length = 1; length <= input.Length / 2; ++length)
{
int remainder;
int repetitions = Math.DivRem(input.Length, length, out remainder);
if (remainder != 0)
continue;
string candidate = input.Remove(length);
if (String.Concat(Enumerable.Repeat(candidate, repetitions)) == input)
return candidate;
}
return input;
}
Something like this:
public string ShortestRepeating(string str)
{
for(int len = 1; len <= str.Length/2; len++)
{
if (str.Length % len == 0)
{
string sub = str.Substring(0, len);
StringBuilder builder = new StringBuilder(str.Length);
while(builder.Length < str.Length)
builder.Append(sub);
if(str == builder.ToString())
return sub;
}
}
return str;
}
This just takes substrings starting at the beginning and repeats them to see if the result matches. It skips any length that doesn't evenly divide the original string's length, and it only goes up to length / 2, since anything longer cannot be a candidate for repeating.
I'd go with something like this:
private static string FindRepeat(string str)
{
var lengths = Enumerable.Range(1, str.Length - 1)
.Where(len => str.Length % len == 0);
foreach (int len in lengths)
{
bool matched = true;
for (int index = 0; matched && index < str.Length; index += len)
{
for (int i = index; i < index + len; ++i)
{
if (str[i - index] != str[i])
{
matched = false;
break;
}
}
}
if (matched)
return str.Substring(0, len);
}
return str;
}
Try this regular expression:
^(\w*?)\1*$
It captures as few characters as possible where the captured sequence (and only the captured sequence) repeat 0 or more times. You can get the text of the shortest match from the capture afterwards, as per Jacob's answer.
You could use a regular expression with back-references.
// `input` is the string to examine
Match match = Regex.Match(input, @"^(.*?)\1*$");
String smallestRepeat = match.Groups[1].Value;
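One more option, not taken from any of the answers above, is the doubled-string trick: the first index after 0 at which the string reappears inside itself concatenated with itself is the length of its shortest repeating unit. The sketch below uses hypothetical names (`RepeatFinder`, `ShortestUnit`) and an ordinal comparison.

```csharp
using System;

public static class RepeatFinder
{
    public static string ShortestUnit(string value)
    {
        if (string.IsNullOrEmpty(value))
            return value;

        // Searching from index 1 skips the trivial match at index 0.
        // A match always exists, at index value.Length at the latest.
        int period = (value + value).IndexOf(value, 1, StringComparison.Ordinal);
        return value.Substring(0, period);
    }
}
```

For a string with no repetition the search only succeeds at index `value.Length`, so the whole string is returned.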