How to get a sequence of letters between two letters - c#

This is how I get the sequence of English letters between two letters, but it only works for English. Does anybody know how I can do the same thing for the Russian alphabet? Should I somehow use Unicode representations? If you have done something similar, please let me know.
public static int aMatrixDim = 10;
public static byte aFirstChar = (byte) 'a';
public static byte aLastChar = (byte) 'z';
public static int aCharsCount = aLastChar - aFirstChar + 1;
public PatternsCollection CreateTrainingPatterns(Font font)
{
var result = new PatternsCollection(aCharsCount, aMatrixDim*aMatrixDim, aCharsCount);
for (var i = 0; i < aCharsCount; i++)
{
var aBitMatrix = CharToBitArray(Convert.ToChar(aFirstChar + i), font, aMatrixDim, 0);
for (var j = 0; j < aMatrixDim*aMatrixDim; j++)
result[i].Input[j] = aBitMatrix[j];
result[i].Output[i] = 1;
}
return result;
}

To fetch the Cyrillic capital letters (range U+0410 to U+042F) into a List<char>:
char CYRILLIC_CAPITAL_START = '\u0410';
char CYRILLIC_CAPITAL_END = '\u042F';
List<char> cyrillicCapitalCharacters = new List<char>();
for (char c = CYRILLIC_CAPITAL_START; c <= CYRILLIC_CAPITAL_END; c++)
{
cyrillicCapitalCharacters.Add(c);
}
Or alternatively using LINQ:
cyrillicCapitalCharacters = Enumerable.Range('\u0410', '\u042F' - '\u0410' + 1)
.Select(x => (char)x).ToList();
To do the same for small letters, use the range U+0430 to U+044F.
Russian Unicode Source: https://en.wikipedia.org/wiki/Cyrillic_script_in_Unicode

Unicode defines 32 out of 33 Russian alphabet letters as consecutive ranges
from 0x0410 to 0x042F (for capital letters) and from 0x0430 to 0x044F (for small letters). The missing letter Ё/ё has the codes 0x0401/0x0451.
So to build a list of Russian letters, you can iterate through those ranges and add the missing Ё/ё. An additional sort operation is required if you need the letters in alphabetical order:
var russianSmall = Enumerable.Range(0x0430, 32)
.Concat(new[] { 0x0451 })
.Select(i => Convert.ToChar(i))
.ToList();
var russianSmallOrdered = russianSmall
.OrderBy(c => c.ToString(), StringComparer.Create(new CultureInfo("ru-RU"), false))
.ToList();
var russianCapital = Enumerable.Range(0x410, 32)
.Concat(new[] { 0x0401 })
.Select(i => Convert.ToChar(i))
.ToList();
var russianCapitalOrdered = russianCapital
.OrderBy(c => c.ToString(), StringComparer.Create(new CultureInfo("ru-RU"), false))
.ToList();
Demo: https://dotnetfiddle.net/NrcAUy
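If all you need is the run of letters between two given Russian letters, a simpler route is to slice a hard-coded alphabet string, which keeps ё in its dictionary position without a culture-aware sort. This is only a sketch: the class name RussianLetters and the method Between are made up for illustration, and it handles lowercase letters only.

```csharp
using System;

public static class RussianLetters
{
    // Lowercase Russian alphabet in dictionary order (33 letters);
    // ё (U+0451) sits between е and ж even though its code point is out of range.
    private const string Lower = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя";

    // Hypothetical helper: returns the letters from first to last, inclusive.
    public static string Between(char first, char last)
    {
        int start = Lower.IndexOf(first);
        int end = Lower.IndexOf(last);
        if (start < 0 || end < 0)
            throw new ArgumentException("Both arguments must be lowercase Russian letters.");
        if (end < start)
            throw new ArgumentException("last must not come before first in the alphabet.");
        return Lower.Substring(start, end - start + 1);
    }
}
```

For example, RussianLetters.Between('д', 'з') gives "деёжз", which a plain code point loop from U+0434 to U+0437 would not, because it skips ё.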

Related

Regular expression to split a string (which contains newline characters) into equal length chunks

I have got a string let's say
Test Subject\r\nTest Comments...
I want to write a regular expression that splits the string into chunks of n characters, say n=6, and the split should not be affected by newline characters (\r\n).
The code I have come up with is
string pattern = ".{1," + 6 + "}";
string noteDetails = "Test Subject\r\nTest Comments...";
List<string> noteComments = Regex.Matches(noteDetails, pattern).Cast<Match>().Select(x => x.Value).ToList();
But the output I am getting is
Test S
ubject
Test C
omment
s...
The desired output is
Test S
ubject
\r\nTe
st Com
ments.
..
If \r\n is not present then the code works fine. The bottom line is \r\n should also be considered as normal characters.
Thanks in advance
You do not need regex. Use string methods:
string input = "Test Subject\nTest Comment";
string[] results = input.ToCharArray()
.Where(x => x != '\n')
.Select((x, i) => new { chr = x, index = i })
.GroupBy(x => x.index / 6)
.Select(x => string.Join("", x.Select(y => y.chr)))
.ToArray();
A second more traditional approach, because Regex is rarely the best choice:
var stringToSplit = @"Test Subject\r\nTest Comments...";
var length = stringToSplit.Length;
var lineLength = 6;
var lastIndex = 0;
for(int i = 0; i < length - lineLength ; i+= lineLength)
{
lastIndex = i;
Console.WriteLine(stringToSplit.Substring(i, lineLength));
}
if (lastIndex < length)
{
Console.WriteLine(stringToSplit.Substring(lastIndex + lineLength, (length - (lastIndex + lineLength))));
}
And the output:
Test S
ubject
\r\nTe
st Com
ments.
..
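The same loop can be written so that it also copes with strings shorter than one chunk and returns the pieces instead of printing them. A minimal sketch, with TextChunker and Split as illustrative names rather than anything from the answers above:

```csharp
using System;
using System.Collections.Generic;

public static class TextChunker
{
    // Splits text into consecutive pieces of at most `size` characters.
    // Every character counts, including '\r' and '\n'.
    public static List<string> Split(string text, int size)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var chunks = new List<string>();
        for (int i = 0; i < text.Length; i += size)
        {
            // The last piece may be shorter than size.
            chunks.Add(text.Substring(i, Math.Min(size, text.Length - i)));
        }
        return chunks;
    }
}
```

Note that with a regular (non-verbatim) string literal, \r\n is two real characters, so the third piece of "Test Subject\r\nTest Comments..." is CR, LF, "Test" rather than the four visible characters \r\n followed by "Te".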

how to solve this logic

Write a function which takes a string input and removes all the characters which appear more than or equal to the given number.
RemoveCharacters("Spanish", 2) should return "panih"
RemoveCharacters("Spanish", 3) should return "Spanish"
string text = "Spanish";
var sb = new StringBuilder(text.Length);
int maxCount = 2;
int currentCount = 2;
char specialChar = 'S';
foreach (char c in text)
if (c != specialChar || ++currentCount <= maxCount)
sb.Append(c);
text = sb.ToString();
int commasFound = 0;
int maxCommas = 1;
text = new string(text.Where(c => c != 'S' || ++commasFound <= maxCommas).ToArray());
Console.WriteLine(text);
Let's process the string in two steps:
Find the characters to remove (those which appear count times or more).
Remove those characters from the string.
Implementation:
private static String RemoveCharacters(string value, int count) {
if (string.IsNullOrEmpty(value))
return value;
else if (count <= 1)
return "";
HashSet<char> toRemove = new HashSet<char>(value
.GroupBy(c => char.ToUpper(c))
.Where(chunk => chunk.Count() >= count)
.Select(chunk => chunk.Key));
return string.Concat(value.Where(c => !toRemove.Contains(char.ToUpper(c))));
}
Some tests:
string[] tests = new string[] {
"Spanish",
"bla-bla-bla",
"Abracadabra",
};
string report = string.Join(Environment.NewLine, tests
.Select(test => $"{test,-15} => '{RemoveCharacters(test, 2)}'"));
Console.Write(report);
Outcome:
Spanish => 'panih' // S is removed
bla-bla-bla => '' // all characters are removed
Abracadabra => 'cd' // A, b, r are removed
I'm not writing your homework for you, but consider using a dictionary. Iterate over the characters in your word. If a character already exists in your dictionary, increment its element. Otherwise insert it into your dictionary with a value of 1. Then iterate over your dictionary keys and note which ones meet or exceed your target. Finally, write out your string excluding those characters.
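For reference, the dictionary approach described above might look roughly like this. It is a sketch under one assumption taken from the example in the question: "Spanish" with 2 returns "panih", so S and s are counted as the same character. The class name CharacterFilter is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CharacterFilter
{
    public static string RemoveCharacters(string value, int count)
    {
        if (string.IsNullOrEmpty(value)) return value;

        // Pass 1: count occurrences, ignoring case (assumption based on the example).
        var counts = new Dictionary<char, int>();
        foreach (char c in value)
        {
            char key = char.ToUpperInvariant(c);
            counts.TryGetValue(key, out int seen); // seen is 0 when the key is missing
            counts[key] = seen + 1;
        }

        // Pass 2: keep only characters that occur fewer than count times.
        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (counts[char.ToUpperInvariant(c)] < count)
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```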

C# counting characters

I am trying to complete a brain teaser which has a bug, and I can't find it. Just wondering if anyone knows the answer. My goal is to return the character that appears most often.
public string solution(string S)
{
int[] occurrences = new int[26];
foreach (char ch in S)
{
occurrences[ch - 'a']++;
}
char best_char = 'a';
int best_res = 0;
for (int i = 1; i < 26; i++)
{
if (occurrences[i] >= best_res)
{
best_char = (char)('a' + i);
best_res = occurrences[i];
}
}
return best_char.ToString();
}
You have a small mistake. Your index should start from 0, not 1:
for (int i = 0; i < 26; i++)
{
if (occurrences[i] >= best_res)
{
best_char = (char)('a' + i);
best_res = occurrences[i];
}
}
Another, safer version is this:
public string Solution(string text)
{
string strResponse = string.Empty;
if (!string.IsNullOrEmpty(text))
{
List<KeyValuePair<char, int>> occurance = text.GroupBy(ch => ch)
.Where(grp => char.IsLetter(grp.Key))
.Select(grp => new KeyValuePair<char, int>(grp.Key, grp.Count()))
.OrderByDescending(c => c.Value)
.ToList();
if (occurance.Any())
strResponse = occurance.First().Key.ToString();
}
return strResponse;
}
There could actually be more than one character with the maximum number of occurrences, so:
private static Char[] GetMostFrequentChars(String text)
{
Dictionary<Char,Int32> rank = new Dictionary<Char,Int32>();
foreach (Char c in text.Where(c => !char.IsWhiteSpace(c)))
{
if (rank.ContainsKey(c))
rank[c]++;
else
rank.Add(c, 1);
}
return rank.Where(r => r.Value == rank.Values.Max()).Select(x => x.Key).ToArray();
}
If you don't care about special characters (like spaces), you could do this with LINQ:
public static string GetMostFrequentCharacter(string value)
{
return value
.GroupBy(o => o)
.OrderByDescending(o => o.Count())
.First()
.Key
.ToString();
}
There are at least 2 problems:
as @Adem Çatamak says, the for loop should start at index 0
ch - 'a' will throw an exception if the string contains any character other than lowercase a-z
public static string solution(string S)
{
var charDict = new Dictionary<char, int>();
foreach (char c in S.Where(c => !char.IsWhiteSpace(c)))
{
// TryGetValue leaves count at 0 when the key is missing.
charDict.TryGetValue(c, out int count);
charDict[c] = count + 1;
}
return charDict.OrderByDescending(kvp => kvp.Value).First().Key.ToString();
}
Using a dictionary and LINQ is going to be better, I think. Don't just copy this code and paste it into whatever homework or class this is for; use it to learn, otherwise it's a waste of my time and yours.

Add separator to string at every N characters?

I have a string which contains binary digits. How can I separate the string after every 8 digits?
Suppose the string is:
string x = "111111110000000011111111000000001111111100000000";
I want to add a separator such as , (comma) after every 8 characters.
The output should be:
"11111111,00000000,11111111,00000000,11111111,00000000,"
Then I want to send it to a List<>: the last 8 characters first, then the previous 8 characters (excluding the comma), and so on.
How can I do this?
Regex.Replace(myString, ".{8}", "$0,");
If you want an array of eight-character strings, then the following is probably easier:
Regex.Split(myString, "(?<=^(?:.{8})+)");
which splits the string only at points that are preceded by a multiple of eight characters. The group is non-capturing because Regex.Split adds the text of any capturing group to the result array.
Try this:
var s = "111111110000000011111111000000001111111100000000";
var list = Enumerable
.Range(0, s.Length/8)
.Select(i => s.Substring(i*8, 8));
var res = string.Join(",", list);
There's another Regex approach:
var str = "111111110000000011111111000000001111111100000000";
// for .NET 4
var res = String.Join(",", Regex.Matches(str, @"\d{8}").Cast<Match>());
// for .NET 3.5
var res = String.Join(",", Regex.Matches(str, @"\d{8}")
.OfType<Match>()
.Select(m => m.Value).ToArray());
...or old school:
public static List<string> splitter(string input, out string csv)
{
// "in" is a reserved word in C#, so the parameter is named input.
if (input.Length % 8 != 0) throw new ArgumentException("Length must be a multiple of 8.", nameof(input));
var lst = new List<string>(input.Length / 8);
for (int i = 0; i < input.Length / 8; i++) lst.Add(input.Substring(i * 8, 8));
csv = string.Join(",", lst); // This we want in input order (I believe)
lst.Reverse(); // As we want the list in reverse order (I believe)
return lst;
}
Ugly but less garbage:
private string InsertStrings(string s, int insertEvery, char insert)
{
char[] ins = s.ToCharArray();
int length = s.Length + (s.Length / insertEvery);
if (ins.Length % insertEvery == 0)
{
length--;
}
var outs = new char[length];
long di = 0;
long si = 0;
while (si < s.Length - insertEvery)
{
Array.Copy(ins, si, outs, di, insertEvery);
si += insertEvery;
di += insertEvery;
outs[di] = insert;
di ++;
}
Array.Copy(ins, si, outs, di, ins.Length - si);
return new string(outs);
}
String overload:
private string InsertStrings(string s, int insertEvery, string insert)
{
char[] ins = s.ToCharArray();
char[] inserts = insert.ToCharArray();
int insertLength = inserts.Length;
int length = s.Length + (s.Length / insertEvery) * insert.Length;
if (ins.Length % insertEvery == 0)
{
length -= insert.Length;
}
var outs = new char[length];
long di = 0;
long si = 0;
while (si < s.Length - insertEvery)
{
Array.Copy(ins, si, outs, di, insertEvery);
si += insertEvery;
di += insertEvery;
Array.Copy(inserts, 0, outs, di, insertLength);
di += insertLength;
}
Array.Copy(ins, si, outs, di, ins.Length - si);
return new string(outs);
}
If I understand your last requirement correctly (it's not clear to me if you need the intermediate comma-delimited string or not), you could do this:
var enumerable = "111111110000000011111111000000001111111100000000".Batch(8).Reverse();
This uses the Batch method from the MoreLINQ library.
Here are my two cents too: an implementation using StringBuilder:
public static string AddChunkSeparator (string str, int chunk_len, char separator)
{
if (str == null || str.Length < chunk_len) {
return str;
}
StringBuilder builder = new StringBuilder();
for (var index = 0; index < str.Length; index += chunk_len) {
// The last chunk may be shorter than chunk_len.
builder.Append(str, index, Math.Min(chunk_len, str.Length - index));
builder.Append(separator);
}
return builder.ToString();
}
You can call it like this:
string data = "111111110000000011111111000000001111111100000000";
string output = AddChunkSeparator(data, 8, ',');
One way using LINQ:
string data = "111111110000000011111111000000001111111100000000";
const int separateOnLength = 8;
string separated = new string(
data.Select((x,i) => i > 0 && i % separateOnLength == 0 ? new [] { ',', x } : new [] { x })
.SelectMany(x => x)
.ToArray()
);
I did it using Pattern and Matcher (this is Kotlin on the JVM, not C#) in the following way:
fun addAnyCharacter(input: String, insertion: String, interval: Int): String {
val pattern = Pattern.compile("(.{$interval})", Pattern.DOTALL)
val matcher = pattern.matcher(input)
return matcher.replaceAll("$1$insertion")
}
Where:
input is the input string.
insertion is the string to insert between groups of characters, for example a comma (,), star (*), or hash (#).
interval is the number of characters between insertions.
In the results below, a space is inserted after every 4th character.
Results:
I/P: 1234XXXXXXXX5678 O/P: 1234 XXXX XXXX 5678
I/P: 1234567812345678 O/P: 1234 5678 1234 5678
I/P: ABCDEFGHIJKLMNOP O/P: ABCD EFGH IJKL MNOP
Hope this helps.
As of .NET 6, you can simply use the Enumerable.Chunk method (which splits the elements of a sequence into chunks) and then rejoin the chunks using String.Join.
var text = "...";
string.Join(',', text.Chunk(size: 6).Select(x => new string(x)));
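Chunk also covers the second half of the question, the list with the last 8 characters first. A possible sketch, assuming .NET 6 or later; BinaryGroups and LastGroupFirst are illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BinaryGroups
{
    // Splits digits into groups of groupSize characters and returns them
    // in reverse order, so the last group of the input comes first.
    public static List<string> LastGroupFirst(string digits, int groupSize)
    {
        return digits
            .Chunk(groupSize)                     // IEnumerable<char[]>, requires .NET 6+
            .Select(chars => new string(chars))   // char[] -> string
            .Reverse()
            .ToList();
    }
}
```

Because the groups are produced directly from the original string, there is no need to build the comma-separated string first and strip the commas again.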
This is much faster because it avoids the Array.Copy calls (this version inserts a space every 3 digits, but you can adjust it to your needs):
public string GetString(double valueField)
{
char[] ins = valueField.ToString().ToCharArray();
int length = ins.Length + (ins.Length / 3);
if (ins.Length % 3 == 0)
{
length--;
}
char[] outs = new char[length];
int i = length - 1;
int j = ins.Length - 1;
int k = 0;
do
{
if (k == 3)
{
outs[i--] = ' ';
k = 0;
}
else
{
outs[i--] = ins[j--];
k++;
}
}
while (i >= 0);
return new string(outs);
}
For every 1 character, you could do this one-liner:
string.Join(".", "1234".ToArray()) //result: 1.2.3.4
If you intend to create your own function to achieve this without using regex or pattern matching methods, you can create a simple function like this (the example is in Java):
String formatString(String key, String seperator, int afterEvery){
String formattedKey = "";
for(int i=0; i<key.length(); i++){
formattedKey += key.substring(i,i+1);
if((i+1)%afterEvery==0)
formattedKey += seperator;
}
if(formattedKey.endsWith(seperator))
formattedKey = formattedKey.substring(0,formattedKey.length()-seperator.length());
return formattedKey;
}
Calling the method like this
formatString("ABCDEFGHIJKLMNOPQRST", "-", 4)
would return this string:
ABCD-EFGH-IJKL-MNOP-QRST
A little late to the party, but here's a simplified LINQ expression to break an input string x into groups of n separated by another string sep:
string sep = ",";
int n = 8;
string result = String.Join(sep, x.InSetsOf(n).Select(g => new String(g.ToArray())));
A quick rundown of what's happening here:
x is being treated as an IEnumerable<char>, which is where the InSetsOf extension method comes in.
InSetsOf(n) groups characters into an IEnumerable of IEnumerable -- each entry in the outer grouping contains an inner group of n characters.
Inside the Select method, each group of n characters is turned back into a string by using the String() constructor that takes an array of chars.
The result of Select is now an IEnumerable<string>, which is passed into String.Join to interleave the sep string, just like any other example.
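InSetsOf is not part of the .NET base class library, so the snippet above only compiles if such an extension method is defined somewhere. The answer does not show its implementation; the version below is a guess at what a minimal one could look like (the class name SequenceExtensions is hypothetical).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceExtensions
{
    // Hypothetical implementation: yields consecutive groups of at most
    // `size` items; the final group may be shorter.
    public static IEnumerable<IEnumerable<T>> InSetsOf<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        return InSetsOfIterator(source, size);
    }

    private static IEnumerable<IEnumerable<T>> InSetsOfIterator<T>(IEnumerable<T> source, int size)
    {
        var group = new List<T>(size);
        foreach (T item in source)
        {
            group.Add(item);
            if (group.Count == size)
            {
                yield return group;
                group = new List<T>(size); // start a fresh group; never reuse the yielded list
            }
        }
        if (group.Count > 0)
            yield return group; // trailing partial group
    }
}
```

On .NET 6 and later, the built-in Enumerable.Chunk method does the same job and returns arrays.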
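I am more than late with my answer, but you can use this one:
static string PutLineBreak(string str, int split)
{
// Walk backwards from the last multiple of split, so that each insertion
// leaves the remaining (earlier) insert positions unchanged.
for (int a = str.Length - str.Length % split; a > 0; a -= split)
{
str = str.Insert(a, "\n");
}
return str;
}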

How to get ASCII for non-numeric characters in a given string

I have a string as follows:
String S="AB-1233-444";
From this I would like to separate AB and find the ASCII codes of those two letters.
You should be able to use LINQ to take care of that (testing the syntax now):
var asciiCodes = S.Where(c => char.IsLetter(c)).Select(c => (int)c);
Or if you don't want to use the LINQ-y version:
var characterCodes = new List<int>();
foreach(var c in S)
{
if(char.IsLetter(c))
{
characterCodes.Add((int)c);
}
}
You can convert a character to a codepoint using this: (int)'a'.
To separate the parts (if you know that the string is split on -), you can use string.Split.
To get the ASCII representation of 'A', for example, use the following code:
int asciivalue = (int)'A';
So complete example might be
Dictionary<char,int> asciilist = new Dictionary<char,int>();
string s = "AB-1233-444";
string[] splitstrings = s.Split('-');
foreach( char c in splitstrings[0]){
asciilist.Add( c, (int)c );
}
var result = (from c in S.ToCharArray() where
((int)c >= (int)'a' &&
(int)c <= (int)'z') ||
((int)c >= (int)'A' &&
(int)c <= (int)'Z') select c).ToArray();
Non-linq version is as follows:
List<char> result = new List<char>();
foreach(char c in S)
{
if(((int)c >= (int)'a' &&
(int)c <= (int)'z') ||
((int)c >= (int)'A' &&
(int)c <= (int)'Z'))
{
result.Add(c);
}
}
You can use Substring to extract the letters on their own, then use a for loop to store each letter's code in an array and print the codes one by one.
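A rough sketch of that Substring-and-loop idea, assuming the letters always form a prefix of the string as in "AB-1233-444"; LetterCodes and LeadingLetterCodes are names invented for this example:

```csharp
using System;

public static class LetterCodes
{
    // Returns the character codes of the letters at the start of s.
    public static int[] LeadingLetterCodes(string s)
    {
        // Find where the leading run of letters ends.
        int end = 0;
        while (end < s.Length && char.IsLetter(s[end]))
            end++;

        string letters = s.Substring(0, end);

        // Store each letter's code; a char converts implicitly to int.
        var codes = new int[letters.Length];
        for (int i = 0; i < letters.Length; i++)
            codes[i] = letters[i];
        return codes;
    }
}
```

For "AB-1233-444" this returns 65 and 66, which can then be printed with a foreach loop and Console.WriteLine.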
