Split string after variable amount of characters

I want to split my string, called invoerstring, after a variable number of characters (n is the number of characters after which the string needs to be split).
If the string is shorter than n, spaces need to be added until its length equals n. The result needs to be shown in a text field called uitvoer.
This is what I have so far:
string invoerstring = invoer.Text;
if (invoerstring.Length < n)
{
invoerstring += "";
char[] myStrChars = invoerstring.ToCharArray();
}
if (invoerstring.Length == n)
{
string [] blok = invoerstring.Split();
foreach (string word in blok)
{
uitvoer.Text = word;
}
}
EDIT:
The solutions given above aren't completely doing the job for me, so maybe it helps if I post the exercise:
|| crypt n m d text
|| text is padded with spaces until its length is a multiple of n
|| the characters in text are circularly shifted in the alphabet by the displacement d
|| example: if d = 1 then 'a' -> 'b' , 'b' -> 'c' .... etc... 'z' -> 'a'
|| text is divided in blocks of length n characters
|| inside every block of n the characters are circularly shifted m times to the left
|| the shifted groups are concatenated
I have already solved m and d; I only have to solve n.

Here's the code you want, no need to go through a character array:
// Extension methods must be declared in a static class.
public static string EnsureExactLength(this string s, int length)
{
    if (s == null)
        throw new ArgumentNullException(nameof(s));
    // Pad short strings with spaces, then cut long strings down to length.
    return s.PadRight(length).Substring(0, length);
}
You call it like this:
string s1 = "Test string".EnsureExactLength(4); // will return "Test"
string s2 = "Te".EnsureExactLength(4); // will return "Te  " (padded with two spaces)
You can find an example LINQPad program here.

Okay, I'm honestly not sure what the code you have above is doing because I see calls like Split() without any parameters, and stuff. But to meet the requirements, this one line should do:
string invoerstring = invoer.Text.PadRight(n, ' ').Substring(0, n);
PadRight makes sure the string is at least n characters long, and Substring then returns the first n characters.
If you then want that string as a character array, because I see you have one at the end, you could do this:
char[] chars = invoerstring.ToCharArray();

Here is a LinqPad script:
void Main()
{
const string TEXT = "12345ABCDE12345ABCDE1234";
const int LENGTH = 5;
const char PAD = '#';
Enumerable.Range(0, TEXT.Length / LENGTH)
.Union(TEXT.Length < LENGTH ? new int[] { 1 } : new int[] {})
.Select((index, item) => TEXT.Length < LENGTH
? TEXT.PadRight(LENGTH, PAD)
: TEXT.Substring(index * LENGTH, LENGTH))
.Concat(TEXT.Length % LENGTH != 0
? new string[] { TEXT.Substring(TEXT.Length - (TEXT.Length % LENGTH)).PadRight(LENGTH, PAD) }
: new string[] { })
.Dump();
}
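The exercise in the question pads the text to a multiple of n and then divides it into blocks of n characters. A minimal loop-based sketch of that step is shown below; the class name BlockSplitter and the method name SplitIntoBlocks are made up for illustration, and padding with spaces follows the exercise text rather than the '#' used in the script above.

```csharp
using System;
using System.Collections.Generic;

public static class BlockSplitter
{
    // Pads text with spaces to a multiple of n, then cuts it into blocks of n characters.
    public static List<string> SplitIntoBlocks(string text, int n)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));

        // Round the length up to the next multiple of n (an empty string stays empty).
        int paddedLength = (text.Length + n - 1) / n * n;
        string padded = text.PadRight(paddedLength, ' ');

        var blocks = new List<string>();
        for (int i = 0; i < padded.Length; i += n)
        {
            blocks.Add(padded.Substring(i, n));
        }
        return blocks;
    }
}
```

For example, SplitIntoBlocks("abcdefg", 3) gives the three blocks "abc", "def" and "g  " (the last one padded with two spaces). The m and d steps of the exercise can then be applied to each block before concatenating them again.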

Related

How to exchange numbers to alphabet and alphabet to numbers in a string?

How do I convert the digits in a string to their equivalent letters of the alphabet, and the letters to their numeric values (except 0, which should stay 0 for obvious reasons)?
So basically, if there is a string
string content="D93AK0F5I";
how can I convert it to this?
string new_content="4IC11106E9";

I'm assuming you're aware this is not reversible, and that you're only using upper case and digits. Here you go...
private string Transpose(string input)
{
StringBuilder result = new StringBuilder();
foreach (var character in input)
{
if (character == '0')
{
result.Append(character);
}
else if (character >= '1' && character <= '9')
{
int offset = character - '1';
char replacement = (char)('A' + offset);
result.Append(replacement);
}
else if (character >= 'A' && character <= 'Z') // I'm assuming upper case only; feel free to duplicate for lower case
{
int offset = character - 'A' + 1;
result.Append(offset);
}
else
{
throw new ApplicationException($"Unexpected character: {character}");
}
}
return result.ToString();
}

Well, if you only need a one-way translation, here is quite a simple way to do it, using LINQ:
string convert(string input)
{
    // Index 0 holds '0' so that the digit 0 maps to itself and 'A' sits at index 1.
    // Upper case is used so the output matches the example in the question.
    var chars = "0ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    return string.Join("",
        input.Select(
            c => char.IsDigit(c) ?
                chars[int.Parse(c.ToString())].ToString() :
                chars.IndexOf(char.ToUpperInvariant(c)).ToString())
    );
}
You can see a live demo on rextester.

You can use an ArrayList of the letters of the alphabet. For example:
ArrayList alphabet = new ArrayList();
alphabet.Add("A");
alphabet.Add("B");
and so on.
Now parse your string character by character:
string s = "1BC34D";
char[] characters = s.ToCharArray();
for (int i = 0; i < characters.Length; i++)
{
    if (Char.IsDigit(characters[i]))
    {
        // '1' maps to index 0 ("A"); '0' has no letter and needs separate handling.
        int index = characters[i] - '1';
        var letter = alphabet[index];
    }
    else
    {
        // The list holds strings, so convert the char before searching; add 1 so that A = 1.
        int number = alphabet.IndexOf(characters[i].ToString()) + 1;
    }
}
This way you can get the letter that corresponds to a number and the number that corresponds to a letter.

c# read file content code optimization

I have a large string that was read from a text file (e.g. a 1 MB text file), and I want to process it. Processing the string takes nearly 10 minutes.
Basically, the string is read character by character and the counter for each character is incremented by one. Some characters (space, comma, colon and semicolon) are all counted as a space, so they increment the space counter; all remaining characters are simply ignored.
Code:
string fileContent = "....." // a large string
int min = 0;
int max = fileContent.Length;
Dictionary<char, int> occurrence // example c=>0, m=>4, r=>8 etc....
// Note: occurrence has only the letters a-z and a space. Comma, colon and semicolon are counted as space and all other characters are ignored.
for (int i = min; i <= max; i++) // run loop to end
{
try // increment counter for alphabets and space
{
occurrence[fileContent[i]] += 1;
}
catch (Exception e) // comma, colon and semi-colon are spaces
{
if (fileContent[i] == ' ' || fileContent[i] == ',' || fileContent[i] == ':' || fileContent[i] == ';')
{
occurrence[' '] += 1;
//new_file_content += ' ';
}
else continue;
}
totalFrequency++; // increment total frequency
}

Try this:
string input = "test string here";
Dictionary<char, int> charDict = new Dictionary<char, int>();
foreach (char c in input.ToLower()) {
    if (c < 'a' || c > 'z') {
        // Space, comma, colon and semicolon are all counted under the space key.
        if (c == ' ' || c == ',' || c == ':' || c == ';') {
            charDict[' '] = charDict.ContainsKey(' ') ? charDict[' '] + 1 : 1;
        }
    } else {
        // Start at 1 for the first occurrence, otherwise add 1 to the stored count.
        charDict[c] = charDict.ContainsKey(c) ? charDict[c] + 1 : 1;
    }
}

Given your loop is iterating through a large number you want to minimize the checks inside the loop and remove the catch which is pointed out in the comments. There should never be a reason to control flow logic with a try catch block. I would assume you initialize the dictionary first to set the occurrence cases to 0 otherwise you have to add to the dictionary if the character is not there. In the loop you can test the character with something like char.IsLetter() or other checks as D. Stewart is suggesting. I would not do a toLower on the large string if you are going to iterate every character anyway (this would do the iteration twice). You can do that conversion in the loop if needed.
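Try something like the code below. You could also initialize the dictionary with all 256 single-byte character values and remove the if statement completely; after the counting is complete, drop the entries you don't care about and add the counts of the four space-like characters to the space entry. Note that this only works if the text contains no characters above 255, because incrementing a missing key throws a KeyNotFoundException.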
foreach (char c in fileContent)
{
    // Assumes occurrence already has an entry for every letter that can occur.
    if (char.IsLetter(c))
    {
        occurrence[c] += 1;
    }
    else if (c == ' ' || c == ',' || c == ':' || c == ';')
    {
        occurrence[' '] += 1;
    }
}
You could initialize the entire dictionary in advance like this also:
for (int i = 0; i < 256; i++)
{
occurrence.Add((char)i, 0);
}

There are several issues with that code snippet (i <= max, accessing a dictionary entry without it being initialized, etc.), but of course the performance bottleneck is relying on exceptions, since throwing and catching exceptions is extremely slow (especially when done in an inner loop).
I would start by putting the counts into a separate array.
Then I would either prepare a map from char to count index and use it inside the loop without any ifs:
var indexMap = new Dictionary<char, int>();
int charCount = 0;
// Map the valid characters to be counted
for (var ch = 'a'; ch <= 'z'; ch++)
indexMap.Add(ch, charCount++);
// Map the "space" characters to be counted
foreach (var ch in new[] { ' ', ',', ':', ';' })
indexMap.Add(ch, charCount);
charCount++;
// Allocate count array
var occurences = new int[charCount];
// Process the string
foreach (var ch in fileContent)
{
int index;
if (indexMap.TryGetValue(ch, out index))
occurences[index]++;
}
// Not sure about this, but including it for consistency
totalFrequency = occurences.Sum();
or not use dictionary at all:
// Allocate array for char counts
var occurences = new int['z' - 'a' + 1];
// Separate count for "space" chars
int spaceOccurences = 0;
// Process the string
foreach (var ch in fileContent)
{
if ('a' <= ch && ch <= 'z')
occurences[ch - 'a']++;
else if (ch == ' ' || ch == ',' || ch == ':' || ch == ';')
spaceOccurences++;
}
// Not sure about this, but including it for consistency
totalFrequency = spaceOccurences + occurences.Sum();
The former is more flexible (you can add more mappings); the latter is a bit faster. But both are fast enough (they complete in milliseconds for a 1 MB string).

OK, it's a little late, but it should be the fastest solution:
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApplication99
{
class Program
{
static void Main(string[] args)
{
string fileContent = "....."; // a large string
// --- high perf section to count all chars ---
var charCounter = new int[char.MaxValue + 1];
for (int i = 0; i < fileContent.Length; i++)
{
charCounter[fileContent[i]]++;
}
// --- combine results with linq (all actions consume less than 1 ms) ---
var allResults = charCounter.Select((count, index) => new { count, charValue = (char)index }).Where(c => c.count > 0).ToArray();
var spaceChars = new HashSet<char>(" ,:;");
int countSpaces = allResults.Where(c => spaceChars.Contains(c.charValue)).Sum(c => c.count);
var usefulChars = new HashSet<char>("abcdefghijklmnopqrstuvwxyz");
int countLetters = allResults.Where(c => usefulChars.Contains(c.charValue)).Sum(c => c.count);
}
}
}
For very large text files, it's better to use a StreamReader...
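The StreamReader suggestion is not spelled out in the answer, so the following is only a sketch of one way it might look: the same counting array, but filled from a TextReader in fixed-size chunks so the whole file never has to sit in one string. The names StreamCounter and CountChars and the 4096-character buffer size are assumptions made for this example.

```csharp
using System;
using System.IO;

public static class StreamCounter
{
    // Counts every character read from the reader; the array index is the character code.
    public static int[] CountChars(TextReader reader)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));

        var counts = new int[char.MaxValue + 1];
        var buffer = new char[4096]; // assumed chunk size, tune as needed
        int read;
        // Read returns 0 once the end of the input is reached.
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                counts[buffer[i]]++;
            }
        }
        return counts;
    }
}
```

With a file it could be called as using (var reader = new StreamReader("input.txt")) { var counts = StreamCounter.CountChars(reader); }, where "input.txt" is a placeholder path. The space-like characters and the letters can then be summed from the array exactly as in the LINQ section above.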

Compare ASCII Values in C#

I'm trying to compare the ascii value of each character in the input and then I want to shift it with a certain distance and reconvert it to valid character. (using Caesar Ciphering Algorithm)
public void Caesar_Cipher_Optimal(string input, int shift)
{
res = "";
int indx;
byte[] asciiInput = Encoding.ASCII.GetBytes(input);
foreach (byte element in asciiInput)
{
//compare if the current char is between[A-Z]
if(asciiInput[element] >= 65 && asciiInput[element] <= 90)
{
//convert the current value of element to int and add the shift value then mod 90
indx=((Convert.ToInt32(asciiInput[element])) + shift) % 90;
res += Convert.ToChar(indx).ToString();
}
}
}
When I'm testing the code, it's giving me an OutOfRange exception, is it the right way to compare the current ASCII value with what I want?

It's your array access using the value from the foreach that gives the out of range exception, just as SLaks showed.
You don't need to convert the characters to bytes, as you are only dealing with characters that are in the range A to Z. Characters are 16 bit values, and convert easily into their character codes as integers.
You would use modulo 26 rather than modulo 90; otherwise you would end up with character codes anywhere from 0 to 89, most of which are not letters. You can calculate 26 as 'Z' - 'A' + 1 to avoid the magic number.
Doing calculations on the character code directly means that you have to do a check afterwards to clean up out of range values. Instead convert the 65-90 range to 0-25, do the calculation, and convert back. Subtract 65 ('A') from the character code, add the shift, apply the modulo, and add 65 back.
public static string Caesar_Cipher_Optimal(string input, int shift) {
return new String(
input.Where(c => c >= 'A' && c <= 'Z')
.Select(c => (char)((c - 'A' + shift) % ('Z' - 'A' + 1) + 'A'))
.ToArray()
);
}

Here, I fixed some errors in your code so it works now!
public static string CaesarCipherOptimal(string input, int shift)
{
var res = "";
byte[] asciiInput = Encoding.ASCII.GetBytes(input);
// Loop for every character in the string, set the value to the element variable
foreach (byte element in asciiInput)
{
if (element >= 65 && element <= 90)
{
var indx = (element + shift - 65) % 26 + 65;
res += (char)indx;
}
}
return res;
}
Here's how you can use it: (probably in your static void Main())
CaesarCipherOptimal("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 10);
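Both answers assume a non-negative shift and drop every character outside A-Z. In C#, the % operator keeps the sign of the left operand, so a negative shift can produce a negative remainder and therefore a non-letter. The sketch below normalizes the shift first; the names Caesar and Shift are hypothetical, and copying non-letters through unchanged is an assumption that differs from the answers above.

```csharp
using System;
using System.Text;

public static class Caesar
{
    public static string Shift(string input, int shift)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        const int AlphabetSize = 'Z' - 'A' + 1; // 26

        // Map any shift, including negative ones, into the range 0..25.
        int normalized = ((shift % AlphabetSize) + AlphabetSize) % AlphabetSize;

        var result = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c >= 'A' && c <= 'Z')
            {
                // Convert to 0..25, shift, wrap around, convert back to a letter.
                result.Append((char)((c - 'A' + normalized) % AlphabetSize + 'A'));
            }
            else
            {
                // Assumption: characters outside A-Z are kept as they are.
                result.Append(c);
            }
        }
        return result.ToString();
    }
}
```

Caesar.Shift("XYZ", 3) returns "ABC", and Caesar.Shift("ABC", -3) returns "XYZ", so a negative shift undoes the matching positive one.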

Convert a word into character array

How do I convert a word into a character array?
Let's say I have the word "Pneumonoultramicroscopicsilicovolcanoconiosis" (yes, this is a word!). I would like to take this word and assign a numerical value to it.
a = 1
b = 2
... z = 26
int alpha = 1;
int Bravo = 2;
basic code
if (testvalue == "a")
{
Debug.WriteLine("TRUE A was found in the string"); // true
FinalNumber = Alpha + FinalNumber;
Debug.WriteLine(FinalNumber);
}
if (testvalue == "b")
{
Debug.WriteLine("TRUE B was found in the string"); // true
FinalNumber = Bravo + FinalNumber;
Debug.WriteLine(FinalNumber);
}
My question is: how do I get the word "Pneumonoultramicroscopicsilicovolcanoconiosis" into a char array so that I can loop over the letters one by one?
Thanks in advance.

what about
char[] myArray = myString.ToCharArray();
But you don't actually need to do this if you want to iterate the string. You can simply do
for( int i = 0; i < myString.Length; i++ ){
if( myString[i] ... ){
//do what you want here
}
}
This works because the string class implements its own indexer.

string word = "Pneumonoultramicroscopicsilicovolcanoconiosis";
char[] characters = word.ToCharArray();
Voilá!

You can use a simple for loop.
string word = "Pneumonoultramicroscopicsilicovolcanoconiosis";
int wordCount = word.Length;
for(int wordIndex=0;wordIndex<wordCount; wordIndex++)
{
char c = word[wordIndex];
// your code
}

You can use the Linq Aggregate function to do this:
"wordsto".ToLower().Aggregate(0, (running, c) => running + c - 97);
(This particular example assumes you want to treat upper- and lower-case identically.)
The subtraction of 97 translates the ASCII value of the letters such that 'a' is zero. (Obviously subtract 96 if you want 'a' to be 1.)
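Applied to the question's a = 1 ... z = 26 scheme, the same idea can be written as a small helper. This is a sketch only: the names WordValue and Sum are invented here, and skipping every character outside a-z is an assumption about how digits or punctuation should be treated.

```csharp
using System;
using System.Linq;

public static class WordValue
{
    // Sums letter values with a = 1 ... z = 26, ignoring case and anything outside a-z.
    public static int Sum(string word)
    {
        if (word == null) throw new ArgumentNullException(nameof(word));

        return word.ToLowerInvariant()
                   .Where(c => c >= 'a' && c <= 'z')
                   .Sum(c => c - 'a' + 1);
    }
}
```

For example, WordValue.Sum("abc") is 1 + 2 + 3 = 6, and upper-case input such as "ABC" gives the same result.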

You can use the ToCharArray() method of the string class:
string strWord = "Pneumonoultramicroscopicsilicovolcanoconiosis";
char[] characters = strWord.ToCharArray();

What is the most efficient way to detect if a string contains a number of consecutive duplicate characters in C#?

For example, a user entered "I love this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
the consecutive duplicate exclamation mark "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" should be detected.

The following regular expression would detect repeating chars. You could up the number or limit this to specific characters to make it more robust.
int threshold = 3;
string stringToMatch = "thisstringrepeatsss";
// Matches any character followed by at least (threshold - 1) repeats of itself.
string pattern = "(.)\\1{" + (threshold - 1) + ",}";
Regex r = new Regex(pattern);
Match m = r.Match(stringToMatch);
while(m.Success)
{
Console.WriteLine("character passes threshold " + m.ToString());
m = m.NextMatch();
}

Here's an example of a function that searches for a sequence of consecutive chars of a specified length and also ignores white space characters:
public static bool HasConsecutiveChars(string source, int sequenceLength)
{
if (string.IsNullOrEmpty(source))
return false;
if (source.Length == 1)
return false;
int charCount = 1;
for (int i = 0; i < source.Length - 1; i++)
{
char c = source[i];
if (Char.IsWhiteSpace(c))
continue;
if (c == source[i+1])
{
charCount++;
if (charCount >= sequenceLength)
return true;
}
else
charCount = 1;
}
return false;
}
Edit fixed range bug :/

Can be done in O(n) easily: for each character, if the previous character is the same as the current, increment a temporary count. If it's different, reset your temporary count. At each step, update your global if needed.
For abbccc you get:
a => temp = 1, global = 1
b => temp = 1, global = 1
b => temp = 2, global = 2
c => temp = 1, global = 2
c => temp = 2, global = 2
c => temp = 3, global = 3
=> c appears three times. You can extend this to track the starting position fairly easily, after which you can print the "ccc" substring; I'll leave that to you.
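One possible way to do the extension that this answer leaves open is sketched below: it tracks where the current run starts, remembers the longest run seen so far, and returns it as a substring. The names RunFinder and LongestRun are made up for this example, and returning the first of several equally long runs is an arbitrary choice.

```csharp
using System;

public static class RunFinder
{
    // Returns the longest run of one repeated character, or "" for a null or empty string.
    public static string LongestRun(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;

        int bestStart = 0, bestLength = 1;
        int runStart = 0;
        for (int i = 1; i < text.Length; i++)
        {
            if (text[i] != text[i - 1])
            {
                runStart = i; // a different character starts a new run
            }

            int runLength = i - runStart + 1;
            if (runLength > bestLength)
            {
                bestLength = runLength;
                bestStart = runStart;
            }
        }
        return text.Substring(bestStart, bestLength);
    }
}
```

RunFinder.LongestRun("abbccc") returns "ccc". To decide whether a post contains too many repeated characters, compare the length of the returned run with a threshold of your choosing.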

Here is a quick solution I crafted with some extra duplicates thrown in for good measure. As others pointed out in the comments, some duplicates are going to be completely legitimate, so you may want to narrow your criteria to punctuation instead of mere characters.
string input = "I loove this post!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!aa";
int index = -1;
int count =1;
List<string> dupes = new List<string>();
for (int i = 0; i < input.Length-1; i++)
{
if (input[i] == input[i + 1])
{
if (index == -1)
index = i;
count++;
}
else if (index > -1)
{
dupes.Add(input.Substring(index, count));
index = -1;
count = 1;
}
}
if (index > -1)
{
dupes.Add(input.Substring(index, count));
}

A better way, in my opinion, is to create an array in which each element is responsible for one pair of identical adjacent characters, e.g. the first element for aa, then bb, cc, dd and so on. The array is initialized with 0 in every element.
To solve the problem, loop over the string once and update the array values.
You can then analyze this array for whatever you want.
Example: for the string bbaaaccccdab, the resulting array would start with { 2, 1, 3 }, because 'aa' can be found two times, 'bb' one time (at the start of the string), and 'cc' three times.
Why 'cc' three times? Because the matches may overlap: 'cc'cc, c'cc'c and cc'cc'.
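A minimal sketch of this counting array, assuming only the lower-case letters a-z are of interest; the names PairCounter and CountDoubledLetters are hypothetical.

```csharp
using System;

public static class PairCounter
{
    // counts[0] is the number of "aa" pairs, counts[1] the number of "bb" pairs, and so on.
    // Overlapping pairs are counted, so "ccc" contains two "cc" pairs.
    public static int[] CountDoubledLetters(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        var counts = new int['z' - 'a' + 1];
        for (int i = 1; i < text.Length; i++)
        {
            char c = text[i];
            if (c == text[i - 1] && c >= 'a' && c <= 'z')
            {
                counts[c - 'a']++;
            }
        }
        return counts;
    }
}
```

For the string bbaaaccccdab the first three elements are 2, 1 and 3, matching the example, and all other elements stay 0.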

Use LINQ! (For everything, not just this)
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1)));
// returns 'a' and 'b': each character that is equal to the character before it
OR
string test = "aabb";
return test.Where((item, index) => index > 0 && item.Equals(test.ElementAt(index - 1))).Any();
// returns true
