Change Variable back to original value after Regex matching

Change Variable back to original value after Regex matching - c#

I just "finished" expanding my Palindrome Tester, made in C#. To allow for phrases I added a simple regex match for all non-alphanumeric characters. At the end of the program it states " is(n't) a palindrome." But now with the regex it prints the no spaces/punctuation version of it.
I would like to be able to print the original user input. How do I do that?
Here is my program: http://gist.github.com/384565

Calling ToLower() doesn't do anything by itself. Strings are immutable, meaning that it's impossible to modify an instance of string. The ToLower() function returns a new string, so you have to store that value in a variable (either the same or a new one).
To return the value passed into the function, just create a new string variable.
Like this:
public static string Tester(string input)
{
string pattern = "\\W";
string data = Regex.Replace(input.ToLower(), pattern, String.Empty);
if (data == StringHelper.ReverseString(data))
{
Console.Write(input); Console.Write(" is a Palindrome.");
}
else
{
Console.Write(input); Console.Write(" isn't a Palindrome.");
}
return input;
}

Related

How to perform mutliple Replace calls at once

I have a bit of a weird question here at hands. I have a text that's encoded in such a way that each character is replaced by another character and I'm creating an application that will replace each character with a correct one. But I've come across a problem that I have trouble solving. Let me show with an example:
Original text: This is a line.
Encoded text: (.T#*T#*%*=T50;
Now, as I said, each character represents another character, '(' is 'T', '.' is actually a 'h' and so on.
Now I could just go with
string decoded = encoded.Replace('(','T'); //T.T#*T#*%*=T50;
And that will solve one problem, but when I reach character 'T' that is actually encoded character 'i' I will have to replace all 'T' with 'i', which means that all previously decoded letter 'T's (that were once '(') will also change along with the encoded 'T'.
//T.T#*T#*%*=T50; -> i.i#*i#*%*=i50;
in this situation it's obvious that I should've just went the other way around, first change 'T' to 'i' and then '(' to 'T', but in the text I'm changing that kind of analysis is not an option.
What's the alternative here that I could do to perform the task correctly?
Thank you!

One possible solution is do not use replace string method at all.
Instead you can create method which for every encoded character will output decoded one, and then go through your string as through array of char and for every character in this array use "decryption" method to get decoded character - thus you'll receive decoded string.
For example (using StringBulder to create new string):
private static char Decode(char source)
{
if (source == '(')
return 'T';
else if (source == '.')
return 'h';
//.... and so on
}
string source = "ABC";
var builder = new StringBuilder();
foreach (var c in source)
builder.Append(Decode(c));
var result = builder.ToString();

Using .Replace() probably isn't the way to go in the first place, since as you're finding it covers the whole string every time. And once you've modified the whole string once, the encoding is lost.
Instead, loop over the string one time and replace characters individually.
Create a function that accepts a char and returns the replaced char. For simplicity, I'll just show the signature:
private char Decode(char c);
Then just loop over the string and call that function on each character. LINQ can make short work of that:
var decodedString = new string(encodedString.Select(c => Decode(c)).ToArray());
(This is freehand and untested, you may or may not need that .ToArray() for the string constructor to be happy, I'm not certain. But you get the idea.)
If it's easier to read you can also just loop manually over the string and perhaps use a StringBuilder with each successive char to build the final decoded result.

Without knowledge of your encryption algorithm, this answer assumes that it's a simple character translation akin to the Caesar Cipher.
Pass in your encrypted string, the method loops over each character, adjusting it by the value of shiftDelta and returns the resulting string.
private string Decrypt(string input)
{
const int shiftDelta = 10;
var inputChars = input.ToCharArray();
var outputChars = new char[inputChars.Length];
for (var i = 0; i < outputChars.Length; i++)
{
// Perform character translation here
outputChars[i] = (char)(inputChars[i] + shiftDelta);
}
return outputChars.ToString();
}

grouping adjacent similar substrings

I am writing a program in which I want to group the adjacent substrings, e.g ABCABCBC can be compressed as 2ABC1BC or 1ABCA2BC.
Among all the possible options I want to find the resultant string with the minimum length.
Here is code what i have written so far but not doing job. Kindly help me in this regard.
using System;
using System.Collections.Generic;
using System.Linq;
namespace EightPrgram
{
class Program
{
static void Main(string[] args)
{
string input;
Console.WriteLine("Please enter the set of operations: ");
input = Console.ReadLine();
char[] array = input.ToCharArray();
List<string> list = new List<string>();
string temp = "";
string firstTemp = "";
foreach (var x in array)
{
if (temp.Contains(x))
{
firstTemp = temp;
if (list.Contains(firstTemp))
{
list.Add(firstTemp);
}
temp = "";
list.Add(firstTemp);
}
else
{
temp += x;
}
}
/*foreach (var item in list)
{
Console.WriteLine(item);
}*/
Console.ReadLine();
}
}
}

You can do this with recursion. I cannot give you a C# solution, since I do not have a C# compiler here, but the general idea together with a python solution should do the trick, too.
So you have an input string ABCABCBC. And you want to transform this into an advanced variant of run length encoding (let's called it advanced RLE).
My idea consists of a general first idea onto which I then apply recursion:
The overall target is to find the shortest representation of the string using advanced RLE, let's create a function shortest_repr(string).
You can divide the string into a prefix and a suffix and then check if the prefix can be found at the beginning of the suffix. For your input example this would be:
(A, BCABCBC)
(AB, CABCBC)
(ABC, ABCBC)
(ABCA, BCBC)
...
This input can be put into a function shorten_prefix, which checks how often the suffix starts with the prefix (e.g. for the prefix ABC and the suffix ABCBC, the prefix is only one time at the beginning of the suffix, making a total of 2 ABC following each other. So, we can compact this prefix / suffix combination to the output (2ABC, BC).
This function shorten_prefix will be used on each of the above tuples in a loop.
After using the function shorten_prefix one time, there still is a suffix for most of the string combinations. E.g. in the output (2ABC, BC), there still is the string BC as suffix. So, need to find the shortest representation for this remaining suffix. Wooo, we still have a function for this called shortest_repr, so let's just call this onto the remaining suffix.
This image displays how this recursion works (I only expanded one of the node after the 3rd level, but in fact all of the orange circles would go through recursion):
We start at the top with a call of shortest_repr to the string ABABB (I selected a shorter sample for the image). Then, we split this string at all possible split positions and get a list of prefix / suffix pairs in the second row. On each of the elements of this list we first call the prefix/suffix optimization (shorten_prefix) and retrieve a shortened prefix/suffix combination, which already has the run-length numbers in the prefix (third row). Now, on each of the suffix, we call our recursion function shortest_repr.
I did not display the upward-direction of the recursion. When a suffix is the empty string, we pass an empty string into shortest_repr. Of course, the shortest representation of the empty string is the empty string, so we can return the empty string immediately.
When the result of the call to shortest_repr was received inside our loop, we just select the shortest string inside the loop and return this.
This is some quickly hacked code that does the trick:
def shorten_beginning(beginning, ending):
count = 1
while ending.startswith(beginning):
count += 1
ending = ending[len(beginning):]
return str(count) + beginning, ending
def find_shortest_repr(string):
possible_variants = []
if not string:
return ''
for i in range(1, len(string) + 1):
beginning = string[:i]
ending = string[i:]
shortened, new_ending = shorten_beginning(beginning, ending)
shortest_ending = find_shortest_repr(new_ending)
possible_variants.append(shortened + shortest_ending)
return min([(len(x), x) for x in possible_variants])[1]
print(find_shortest_repr('ABCABCBC'))
print(find_shortest_repr('ABCABCABCABCBC'))
print(find_shortest_repr('ABCABCBCBCBCBCBC'))
Open issues
I think this approach has the same problem as the recursive levenshtein distance calculation. It calculates the same suffices multiple times. So, it would be a nice exercise to try to implement this with dynamic programming.

If this is not a school assignment or performance critical part of the code, RegEx might be enough:
string input = "ABCABCBC";
var re = new Regex(#"(.+)\1+|(.+)", RegexOptions.Compiled); // RegexOptions.Compiled is optional if you use it more than once
string output = re.Replace(input,
m => (m.Length / m.Result("$1$2").Length) + m.Result("$1$2")); // "2ABC1BC" (case sensitive by default)

Casting HexNumber as character to string

I need to process a numeral as a string.
My value is 0x28 and this is the ascii code for '('.
I need to assign this to a string.
The following lines do this.
char c = (char)0x28;
string s = c.ToString();
string s2 = ((char)0x28).ToString();
My usecase is a function that only accepts strings.
My call ends up looking cluttered:
someCall( ((char)0x28).ToString() );
Is there a way of simplifying this and make it more readable without writing '(' ?
The Hexnumber in the code is always paired with a Variable that contains that hex value in its name, so "translating" it would destroy that visible connection.
Edit:
A List of tuples is initialised with this where the first item has the character in its name and the second item results from a call with that character.
One of the answers below is exactly what i am looking for so i incorporated it here now.
{ existingStaticVar0x28, someCall("\u0028") }
The reader can now instinctively see the connection between item1 and item2 and is less likely to run into a trap when this gets refactored.

You can use Unicode character escape sequence in place of a hex to avoid casting:
string s2 = '\u28'.ToString();
or
someCall("\u28");

Well supposing that you have not a fixed input then you could write an extension method
namespace MyExtensions
{
public static class MyStringExtensions
{
public static string ConvertFromHex(this string hexData)
{
int c = Convert.ToInt32(hexCode, 16);
return new string(new char[] {(char)c});
}
}
}
Now you could call it in your code wjth
string hexNumber = "0x28"; // or whatever hexcode you need to convert
string result = hexNumber.ConvertFromHex();
A bit of error handling should be added to the above conversion.

Insert spaces into string using string.format

I've been using C# String.Format for formatting numbers before like this (in this example I simply want to insert a space):
String.Format("{0:### ###}", 123456);
output:
"123 456"
In this particular case, the number is a string. My first thought was to simply parse it to a number, but it makes no sense in the context, and there must be a prettier way.
Following does not work, as ## looks for numbers
String.Format("{0:### ###}", "123456");
output:
"123456"
What is the string equivalent to # when formatting? The awesomeness of String.Format is still fairly new to me.

You have to parse the string to a number first.
int number = int.Parse("123456");
String.Format("{0:### ###}", number);
of course you could also use string methods but that's not as reliable and less safe:
string strNumber = "123456";
String.Format("{0} {1}", strNumber.Remove(3), strNumber.Substring(3));

As Heinzi pointed out, you can not have format specifier for string arguments.
So, instead of String.Format, you may use following:
string myNum="123456";
myNum=myNum.Insert(3," ");

Not very beautiful, and the extra work might outweigh the gains, but if the input is a string on that format, you could do:
var str = "123456";
var result = String.Format("{0} {1}", str.Substring(0,3), str.Substring(3));

string is not a IFormattable
Console.WriteLine("123456" is IFormattable); // False
Console.WriteLine(21321 is IFormattable); // True
No point to supply a format if the argument is not IFormattable only way is to convert your string to int or long

We're doing string manipulation, so we could always use a regex.
Adapted slightly from here:
class MyClass
{
static void Main(string[] args)
{
string sInput, sRegex;
// The string to search.
sInput = "123456789";
// The regular expression.
sRegex = "[0-9][0-9][0-9]";
Regex r = new Regex(sRegex);
MyClass c = new MyClass();
// Assign the replace method to the MatchEvaluator delegate.
MatchEvaluator myEvaluator = new MatchEvaluator(c.ReplaceNums);
// Replace matched characters using the delegate method.
sInput = r.Replace(sInput, myEvaluator);
// Write out the modified string.
Console.WriteLine(sInput);
}
public string ReplaceNums(Match m)
// Replace each Regex match with match + " "
{
return m.ToString()+" ";
}
}
How's that?
It's been ages since I used C# and I can't test, but this may work as a one-liner which may be "neater" if you only need it once:
sInput = Regex("[0-9][0-9][0-9]").Replace(sInput,MatchEvaluator(Match m => m.ToString()+" "));

There is no way to do what you want unless you parse the string first.
Based on your comments, you only really need a simple formatting so you are better off just implementing a small helper method and thats it. (IMHO it's not really a good idea to parse the string if it isn't logically a number; you can't really be sure that in the future the input string might not be a number at all.
I'd go for something similar to:
public static string Group(this string s, int groupSize = 3, char groupSeparator = ' ')
{
var formattedIdentifierBuilder = new StringBuilder();
for (int i = 0; i < s.Length; i++)
{
if (i != 0 && (s.Length - i) % groupSize == 0)
{
formattedIdentifierBuilder.Append(groupSeparator);
}
formattedIdentifierBuilder.Append(s[i]);
}
return formattedIdentifierBuilder.ToString();
}
EDIT: Generalized to generic grouping size and group separator.

The problem is that # is a Digit placeholder and it is specific to numeric formatting only. Hence, you can't use this on strings.
Either parse the string to a numeric, so the formatting rules apply, or use other methods to split the string in two.
string.Format("{0:### ###}", int.Parse("123456"));

Check for special characters (/*-+_#&$#%) in a string?

How do I check a string to make sure it contains numbers, letters, or space only?

In C# this is simple:
private bool HasSpecialChars(string yourString)
{
return yourString.Any(ch => ! char.IsLetterOrDigit(ch));
}

The easiest way it to use a regular expression:
Regular Expression for alphanumeric and underscores
Using regular expressions in .net:
http://www.regular-expressions.info/dotnet.html
MSDN Regular Expression
Regex.IsMatch
var regexItem = new Regex("^[a-zA-Z0-9 ]*$");
if(regexItem.IsMatch(YOUR_STRING)){..}

string s = #"$KUH% I*$)OFNlkfn$";
var withoutSpecial = new string(s.Where(c => Char.IsLetterOrDigit(c)
|| Char.IsWhiteSpace(c)).ToArray());
if (s != withoutSpecial)
{
Console.WriteLine("String contains special chars");
}

Try this way.
public static bool hasSpecialChar(string input)
{
string specialChar = #"\|!#$%&/()=?»«#£§€{}.-;'<>_,";
foreach (var item in specialChar)
{
if (input.Contains(item)) return true;
}
return false;
}

String test_string = "tesintg#$234524##";
if (System.Text.RegularExpressions.Regex.IsMatch(test_string, "^[a-zA-Z0-9\x20]+$"))
{
// Good-to-go
}
An example can be found here: http://ideone.com/B1HxA

If the list of acceptable characters is pretty small, you can use a regular expression like this:
Regex.IsMatch(items, "[a-z0-9 ]+", RegexOptions.IgnoreCase);
The regular expression used here looks for any character from a-z and 0-9 including a space (what's inside the square brackets []), that there is one or more of these characters (the + sign--you can use a * for 0 or more). The final option tells the regex parser to ignore case.
This will fail on anything that is not a letter, number, or space. To add more characters to the blessed list, add it inside the square brackets.

Use the regular Expression below in to validate a string to make sure it contains numbers, letters, or space only:
[a-zA-Z0-9 ]

You could do it with a bool. I've been learning recently and found I could do it this way. In this example, I'm checking a user's input to the console:
using System;
using System.Linq;
namespace CheckStringContent
{
class Program
{
static void Main(string[] args)
{
//Get a password to check
Console.WriteLine("Please input a Password: ");
string userPassword = Console.ReadLine();
//Check the string
bool symbolCheck = userPassword.Any(p => !char.IsLetterOrDigit(p));
//Write results to console
Console.WriteLine($"Symbols are present: {symbolCheck}");
}
}
}
This returns 'True' if special chars (symbolCheck) are present in the string, and 'False' if not present.

A great way using C# and Linq here:
public static bool HasSpecialCharacter(this string s)
{
foreach (var c in s)
{
if(!char.IsLetterOrDigit(c))
{
return true;
}
}
return false;
}
And access it like this:
myString.HasSpecialCharacter();

private bool isMatch(string strValue,string specialChars)
{
return specialChars.Where(x => strValue.Contains(x)).Any();
}

Create a method and call it hasSpecialChar with one parameter
and use foreach to check every single character in the textbox, add as many characters as you want in the array, in my case i just used ) and ( to prevent sql injection .
public void hasSpecialChar(string input)
{
char[] specialChar = {'(',')'};
foreach (char item in specialChar)
{
if (input.Contains(item)) MessageBox.Show("it contains");
}
}
in your button click evenement or you click btn double time like that :
private void button1_Click(object sender, EventArgs e)
{
hasSpecialChar(textbox1.Text);
}

While there are many ways to skin this cat, I prefer to wrap such code into reusable extension methods that make it trivial to do going forward. When using extension methods, you can also avoid RegEx as it is slower than a direct character check. I like using the extensions in the Extensions.cs NuGet package. It makes this check as simple as:
Add the [https://www.nuget.org/packages/Extensions.cs][1] package to your project.
Add "using Extensions;" to the top of your code.
"smith23#".IsAlphaNumeric() will return False whereas "smith23".IsAlphaNumeric() will return True. By default the .IsAlphaNumeric() method ignores spaces, but it can also be overridden such that "smith 23".IsAlphaNumeric(false) will return False since the space is not considered part of the alphabet.
Every other check in the rest of the code is simply MyString.IsAlphaNumeric().

Based on #prmph's answer, it can be even more simplified (omitting the variable, using overload resolution):
yourString.Any(char.IsLetterOrDigit);

No special characters or empty string except hyphen
^[a-zA-Z0-9-]+$

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Change Variable back to original value after Regex matching - c#

Related

How to perform mutliple Replace calls at once

grouping adjacent similar substrings

Casting HexNumber as character to string

Insert spaces into string using string.format

Check for special characters (/*-+_#&$#%) in a string?

Categories

Resources