String concatenate/shorten algorithm - c#

I want to publish server-messages on Twitter, for our clients.
Unfortunately, Twitter only allows posting 140 Chars or less. This is a shame.
Now, I have to write an algorithm that concatenates the different messages from the server together, but shortens them to a max of 140 characters.
It's pretty tricky.
CODE
static string concatinateStringsWithLength(string[] strings, int length, string separator) {
// This is the maximum number of chars for the strings
// We have to subtract the separators
int maxLengthOfAllStrings = length - ((strings.Length - 1) * separator.Length);
// Here we save all shortenedStrings
string[] cutStrings = new string[strings.Length];
// This is the average length of all the strings
int averageStringLenght = maxLengthOfAllStrings / strings.Length;
// Now we check how many strings are longer than the average string
int longerStrings = 0;
foreach (string singleString in strings)
{
if (singleString.Length > averageStringLenght)
{
longerStrings++;
}
}
// If a string is smaller than the average string, we can more characters to the longer strings
int maxStringLength = averageStringLenght;
foreach (string singleString in strings)
{
if (averageStringLenght > singleString.Length)
{
maxStringLength += (int)((averageStringLenght - singleString.Length) * (1.0 / longerStrings));
}
}
// Finally we shorten the strings and save them to the array
int i = 0;
foreach (string singleString in strings)
{
string shortenedString = singleString;
if (singleString.Length > maxStringLength)
{
shortenedString = singleString.Remove(maxStringLength);
}
cutStrings[i] = shortenedString;
i++;
}
return String.Join(separator, cutStrings);
}
Problem with this
This algorithm works, but it's not very optimized.
It uses less characters than it actually could.
The main problem with this is that the variable longerStrings is relative to the maxStringLength, and backwards.
This means if I change longerStrings, maxStringLength gets changed, and so on and so on.
I'd have to make a while loop and do this until there are no changes, but I don't think that's necessary for such a simple case.
Can you give me a clue on how to continue?
Or maybe there already exists something similar?
Thanks!
EDIT
The messages I get from the server look like this:
Message
Subject
Date
Body
Message
Subject
Date
Body
And so on.
What I want is to concatenate the strings with a separator, in this case a semi-colon.
There should be a max length. The long strings should be shortened first.
Example
This is a subject
This is the body and is a bit lon...
25.02.2013
This is a s...
This is the...
25.02.2013
I think you get the idea ;)

Five times slower than yours (in our simple example) but should use maximum avaliable space (no critical values checking):
static string Concatenate(string[] strings, int maxLength, string separator)
{
var totalLength = strings.Sum(s => s.Length);
var requiredLength = totalLength - (strings.Length - 1)*separator.Length;
// Return if there is enough place.
if (requiredLength <= maxLength)
return String.Concat(strings.Take(strings.Length - 1).Select(s => s + separator).Concat(new[] {strings.Last()}));
// The problem...
var helpers = new ConcatenateInternal[strings.Length];
for (var i = 0; i < helpers.Length; i++)
helpers[i] = new ConcatenateInternal(strings[i].Length);
var avaliableLength = maxLength - (strings.Length - 1)*separator.Length;
var charsInserted = 0;
var currentIndex = 0;
while (charsInserted != avaliableLength)
{
for (var i = 0; i < strings.Length; i++)
{
if (charsInserted == avaliableLength)
break;
if (currentIndex >= strings[i].Length)
{
helpers[i].Finished = true;
continue;
}
helpers[i].StringBuilder.Append(strings[i][currentIndex]);
charsInserted++;
}
currentIndex++;
}
var unified = new StringBuilder(avaliableLength);
for (var i = 0; i < strings.Length; i++)
{
if (!helpers[i].Finished)
{
unified.Append(helpers[i].StringBuilder.ToString(0, helpers[i].StringBuilder.Length - 3));
unified.Append("...");
}
else
{
unified.Append(helpers[i].StringBuilder.ToString());
}
if (i < strings.Length - 1)
{
unified.Append(separator);
}
}
return unified.ToString();
}
And ConcatenateInternal:
class ConcatenateInternal
{
public StringBuilder StringBuilder { get; private set; }
public bool Finished { get; set; }
public ConcatenateInternal(int capacity)
{
StringBuilder = new StringBuilder(capacity);
}
}

Related

c# How to run a application faster

I am creating a word list of possible uppercase letters to prove how insecure 8 digit passwords are this code will write aaaaaaaa to aaaaaaab to aaaaaaac etc. until zzzzzzzz using this code:
class Program
{
static string path;
static int file = 0;
static void Main(string[] args)
{
new_file();
var alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var q = alphabet.Select(x => x.ToString());
int size = 3;
int counter = 0;
for (int i = 0; i < size - 1; i++)
{
q = q.SelectMany(x => alphabet, (x, y) => x + y);
}
foreach (var item in q)
{
if (counter >= 20000000)
{
new_file();
counter = 0;
}
if (File.Exists(path))
{
using (StreamWriter sw = File.AppendText(path))
{
sw.WriteLine(item);
Console.WriteLine(item);
/*if (!(Regex.IsMatch(item, #"(.)\1")))
{
sw.WriteLine(item);
counter++;
}
else
{
Console.WriteLine(item);
}*/
}
}
else
{
new_file();
}
}
}
static void new_file()
{
path = #"C:\" + "list" + file + ".txt";
if (!File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
}
}
file++;
}
}
The Code is working fine but it takes Weeks to run it. Does anyone know a way to speed it up or do I have to wait? If anyone has a idea please tell me.
Performance:
size 3: 0.02s
size 4: 1.61s
size 5: 144.76s
Hints:
removed LINQ for combination generation
removed Console.WriteLine for each password
removed StreamWriter
large buffer (128k) for file writing
const string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var byteAlphabet = alphabet.Select(ch => (byte)ch).ToArray();
var alphabetLength = alphabet.Length;
var newLine = new[] { (byte)'\r', (byte)'\n' };
const int size = 4;
var number = new byte[size];
var password = Enumerable.Range(0, size).Select(i => byteAlphabet[0]).Concat(newLine).ToArray();
var watcher = new System.Diagnostics.Stopwatch();
watcher.Start();
var isRunning = true;
for (var counter = 0; isRunning; counter++)
{
Console.Write("{0}: ", counter);
Console.Write(password.Select(b => (char)b).ToArray());
using (var file = System.IO.File.Create(string.Format(#"list.{0:D5}.txt", counter), 2 << 16))
{
for (var i = 0; i < 2000000; ++i)
{
file.Write(password, 0, password.Length);
var j = size - 1;
for (; j >= 0; j--)
{
if (number[j] < alphabetLength - 1)
{
password[j] = byteAlphabet[++number[j]];
break;
}
else
{
number[j] = 0;
password[j] = byteAlphabet[0];
}
}
if (j < 0)
{
isRunning = false;
break;
}
}
}
}
watcher.Stop();
Console.WriteLine(watcher.Elapsed);
}
Try the following modified code. In LINQPad it runs in < 1 second. With your original code I gave up after 40 seconds. It removes the overhead of opening and closing the file for every WriteLine operation. You'll need to test and ensure it gives the same results because I'm not willing to run your original code for 24 hours to ensure the output is the same.
class Program
{
static string path;
static int file = 0;
static void Main(string[] args)
{
new_file();
var alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var q = alphabet.Select(x => x.ToString());
int size = 3;
int counter = 0;
for (int i = 0; i < size - 1; i++)
{
q = q.SelectMany(x => alphabet, (x, y) => x + y);
}
StreamWriter sw = File.AppendText(path);
try
{
foreach (var item in q)
{
if (counter >= 20000000)
{
sw.Dispose();
new_file();
counter = 0;
}
sw.WriteLine(item);
Console.WriteLine(item);
}
}
finally
{
if(sw != null)
{
sw.Dispose();
}
}
}
static void new_file()
{
path = #"C:\temp\list" + file + ".txt";
if (!File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
}
}
file++;
}
}
your alphabet is missing 0
With that fixed there would be 89 chars in your set. Let's call it 100 for simplicity. The set you are looking for is all the 8 character length strings drawn from that set. There are 100^8 of these, i.e. 10,000,000,000,000,000.
The disk space they will take up depends on how you encode them, lets be generous - assume you use some 8 bit char set that contains the these characters, and you don't put in carriage returns, so one byte per char, so 10,000,000,000,000,000 bytes =~ 10 peta byes?
Do you have 10 petabytes of disk? (10000 TB)?
[EDIT] In response to 'this is not an answer':
The original motivation is to create the list? The shows how large the list would be. Its hard to see what could be DONE with the list if it was actualised, i.e. it would always be quicker to reproduce it than to load it. Surely whatever point could be made by producing the list can also be made by simply knowing it's size, which the above shows how to work it out.
There are LOTS of inefficiencies in you code, but if your questions is 'how can i quickly produce this list and write it to disk' the answer is 'you literally cannot'.
[/EDIT]

Cutting a string into array of <"x" chars each [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
Improve this question
I have a situation where my string can't go past a certain point, so what I want to do is cut it into smaller strings of "x" characters, and then print them one by one on top of each other. They don't all need to be equal, if x is 5, and I have an 11 character string, printing 3 lines with 5, 5, and 1 characters is fine. Is there an easy way to do this in C#?
Example:
string Test = "This is a test string";
stringarray = Cutup(Test, 5);
//Result:
//"This "
//"is a "
//"test "
//"strin"
//"g"
try something like this:
public string[] Cutcup(string s, int l)
{
List<string> result = new List<string>();
for (int i = 0; i < s.Length; i += l)
{
result.Add(s.Substring(i, Math.Min(5, s.Substring(i).Length)));
}
return result.ToArray();
}
You could cut up the strings then do a test.lastIndexOf(' '); If that helps
You can use the String Manipultion function Substring() and a for loop to accomplish this.
here is an example
int maxChars = 5;
String myStr = "This is some text used in testing this method of splitting a string and just a few more chars and the string is complete";
List<String> mySubStrings = new List<String>();
while (myStr.Length > maxChars)
{
mySubStrings.Add(myStr.Substring(0,maxChars));
myStr = myStr.Substring(maxChars);
}
mySubStrings.ToArray();
List<string> result = new List<string>();
string testString = "This is a test string";
string chunkBuilder = "";
int chunkSize = 5;
for (int i = 0; i <= testString.Length-1; i++)
{
chunkBuilder += testString[i];
if (chunkBuilder.Length == chunkSize || i == testString.Length - 1)
{
result.Add(chunkBuilder);
chunkBuilder = "";
}
}
Another try, with less string concatenations
string Test = "This is a test string";
List<string> parts = new List<string>();
int i = 0;
do
{
parts.Add(Test.Substring(i,System.Math.Min(5, Test.Substring(i).Length)));
i+= 5;
} while (i < Test.Length);
Here are a couple more ways. Cutup2 below is more efficient but less pretty. Both pass the test case given.
private static IEnumerable<string> Cutup(string given, int chunkSize)
{
var skip = 0;
var iterations = 0;
while (iterations * chunkSize < given.Length)
{
iterations++;
yield return new string(given.Skip(skip).Take(chunkSize).ToArray());
skip += chunkSize;
}
}
private static unsafe IEnumerable<string> Cutup2(string given, int chunkSize)
{
var pieces = new List<string>();
var consumed = 0;
while (consumed < given.Length)
{
fixed (char* g = given)
{
var toTake = consumed + chunkSize > given.Length
? given.Length - consumed
: chunkSize;
pieces.Add(new string(g, consumed, toTake));
}
consumed += chunkSize;
}
return pieces;
}
All in one line
var size = 5;
var results = Enumerable.Range(0, (int)Math.Ceiling(test.Length / (double)size))
.Select(i => test.Substring(i * size, Math.Min(size, test.Length - i * size)));
I once made an extensionmethod that can be used for this:
public static IEnumerable<IEnumerable<T>> Subsequencise<T>(this IEnumerable<T> input, int subsequenceLength)
{
var enumerator = input.GetEnumerator();
SubsequenciseParameter parameter = new SubsequenciseParameter { Next = enumerator.MoveNext() };
while (parameter.Next)
yield return getSubSequence(enumerator, subsequenceLength, parameter);
}
private static IEnumerable<T> getSubSequence<T>(IEnumerator<T> enumerator, int subsequenceLength, SubsequenciseParameter parameter)
{
do
{
yield return enumerator.Current;
} while ((parameter.Next = enumerator.MoveNext()) && --subsequenceLength > 0);
}
// Needed to let the Subsequencisemethod know when to stop, since you cant use out or ref parameters in an yield-return method.
class SubsequenciseParameter
{
public bool Next { get; set; }
}
then you can do this:
string Test = "This is a test string";
stringarray = Test.Subsequencise(5).Select(subsequence => new String(subsequence.Toarray())).Toarray();
Here's a rather LINQy one-liner:
static IEnumerable<string> SliceAndDice1( string s , int n )
{
if ( s == null ) throw new ArgumentNullException("s");
if ( n < 1 ) throw new ArgumentOutOfRangeException("n");
int i = 0 ;
return s.GroupBy( c => i++ / n ).Select( g => g.Aggregate(new StringBuilder() , (sb,c)=>sb.Append(c)).ToString() ) ;
}
If that gives you a headache, try the more straightforward
static IEnumerable<string> SliceAndDice2( string s , int n )
{
if ( s == null ) throw new ArgumentNullException("s") ;
if ( n < 1 ) throw new ArgumentOutOfRangeException("n") ;
int i = 0 ;
for ( i = 0 ; i < s.Length-n ; i+=n )
{
yield return s.Substring(i,n) ;
}
yield return s.Substring(i) ;
}

Sorting objects using IComparable

I am trying to implement the IComparable interface in my custom object so that List.Sort() can sort them alphabetically.
My object has a field called _name which is a string type, and I want it to sort based on that. Here is the method I implemented:
public int CompareTo(object obj)
{
//Int reference table:
//1 or greater means the current instance occurs after obj
//0 means both elements occur in the same position
//-1 or less means the current instance occurs before obj
if (obj == null)
return 1;
Upgrade otherUpgrade = obj as Upgrade;
if (otherUpgrade != null)
return _name.CompareTo(otherUpgrade.Name);
else
throw new ArgumentException("Passed object is not an Upgrade.");
}
Not sure if I did something wrong or if it's just the way the string CompareTo works, but basically my List was sorted like this:
Test Upgrade
Test Upgrade 10
Test Upgrade 11
Test Upgrade 12
Test Upgrade 13
Test Upgrade 14
Test Upgrade 15
Test Upgrade 2
Test Upgrade 3
Test Upgrade 4
Test Upgrade 5
I want them to be sorted like this:
Test Upgrade
Test Upgrade 2
Test Upgrade 3
...etc
You want "natural order" -- the collation that a human who is familiar with the conventions of English would choose -- as opposed to what you've got, which is "lexicographic" collation: assign every letter a strict ordering, and then sort by each letter in turn.
Jeff has a good article on some of the ins and outs here, with links to different algorithms that try to solve the problem:
http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html
and Raymond discussed how Windows dealt with it here:
http://technet.microsoft.com/en-us/magazine/hh475812.aspx
Basically the problem is: natural order collation requires solving an artificial intelligence problem; you're trying to emulate what a human would do, and that can be surprisingly tricky. Good luck!
Strings are sorted in lexicographic order. You'll have to either format all your numbers to have the same length (eg: Test Upgrade 02) or parse the number in your comparer and incorporate it in your comparison logic.
The reason this happens is that you are doing string comparison, which has no explicit knowledge of numbers. It orders each string by the respective character codes of each character.
To get the effect you want will require a bit more work. See this question: Sort on a string that may contain a number
AlphaNumeric Sorting
public class AlphanumComparatorFast : IComparer
{
public int Compare(object x, object y)
{
string s1 = x as string;
if (s1 == null)
{
return 0;
}
string s2 = y as string;
if (s2 == null)
{
return 0;
}
int len1 = s1.Length;
int len2 = s2.Length;
int marker1 = 0;
int marker2 = 0;
// Walk through two the strings with two markers.
while (marker1 < len1 && marker2 < len2)
{
char ch1 = s1[marker1];
char ch2 = s2[marker2];
// Some buffers we can build up characters in for each chunk.
char[] space1 = new char[len1];
int loc1 = 0;
char[] space2 = new char[len2];
int loc2 = 0;
// Walk through all following characters that are digits or
// characters in BOTH strings starting at the appropriate marker.
// Collect char arrays.
do
{
space1[loc1++] = ch1;
marker1++;
if (marker1 < len1)
{
ch1 = s1[marker1];
}
else
{
break;
}
} while (char.IsDigit(ch1) == char.IsDigit(space1[0]));
do
{
space2[loc2++] = ch2;
marker2++;
if (marker2 < len2)
{
ch2 = s2[marker2];
}
else
{
break;
}
} while (char.IsDigit(ch2) == char.IsDigit(space2[0]));
// If we have collected numbers, compare them numerically.
// Otherwise, if we have strings, compare them alphabetically.
string str1 = new string(space1);
string str2 = new string(space2);
int result;
if (char.IsDigit(space1[0]) && char.IsDigit(space2[0]))
{
int thisNumericChunk = int.Parse(str1);
int thatNumericChunk = int.Parse(str2);
result = thisNumericChunk.CompareTo(thatNumericChunk);
}
else
{
result = str1.CompareTo(str2);
}
if (result != 0)
{
return result;
}
}
return len1 - len2;
}
}
Usage :
using System;
using System.Collections;
class Program
{
static void Main()
{
string[] highways = new string[]
{
"100F",
"50F",
"SR100",
"SR9"
};
//
// We want to sort a string array called highways in an
// alphanumeric way. Call the static Array.Sort method.
//
Array.Sort(highways, new AlphanumComparatorFast());
//
// Display the results
//
foreach (string h in highways)
{
Console.WriteLine(h);
}
}
}
Output
50F 100F SR9 SR100
Thanks for all the replies guys. I did my own method and it seems to work fine. It doesn't work for all cases but it works for my scenario. Here is the code for anyone who is interested:
/// <summary>
/// Compares the upgrade to another upgrade
/// </summary>
/// <param name="obj"></param>
/// <returns></returns>
public int CompareTo(object obj)
{
//Int reference table:
//1 or greater means the current instance occurs after obj
//0 means both elements occur in the same position
//-1 or less means the current instance occurs before obj
if (obj == null)
return 1;
Upgrade otherUpgrade = obj as Upgrade;
if (otherUpgrade != null)
{
//Split strings into arrays
string[] splitStringOne = _name.Split(new char[] { ' ' });
string[] splitStringTwo = otherUpgrade.Name.Split(new char[] { ' ' });
//Will hold checks to see which comparer will be used
bool sameWords = false, sameLength = false, bothInt = false;
//Will hold the last part of the string if it is an int
int intOne = 0, intTwo = 0;
//Check if they have the same length
sameLength = (splitStringOne.Length == splitStringTwo.Length);
if (sameLength)
{
//Check to see if they both end in an int
bothInt = (int.TryParse(splitStringOne[splitStringOne.Length - 1], out intOne) && int.TryParse(splitStringTwo[splitStringTwo.Length - 1], out intTwo));
if (bothInt)
{
//Check to see if the previous parts of the string are equal
for (int i = 0; i < splitStringOne.Length - 2; i++)
{
sameWords = (splitStringOne[i].ToLower().Equals(splitStringTwo[i].ToLower()));
if (!sameWords)
break;
}
}
}
//If all criteria is met, use the customk comparer
if (sameWords && sameLength && bothInt)
{
if (intOne < intTwo)
return -1;
else if (intOne > intTwo)
return 1;
else //Both equal
return 0;
}
//Else use the default string comparer
else
return _name.CompareTo(otherUpgrade.Name);
}
else
throw new ArgumentException("Passed object is not an Upgrade.");
}
Will work for strings spaced out using a " " character, like so:
Test data:
Hello 11
Hello 2
Hello 13
Result
Hello 2
Hello 11
Hello 13
Will not work for data such as Hello11 and Hello2 since it cannot split them. Not case sensitive.

c# string split and combine

i have a string that conatin 5 numbers like
'1,4,14,32,47'
i want to make from this string 5 strings of 4 number in each one
like :
'1,4,14,32'
'1,4,14,47'
'1,4,32,47'
'1,14,32,47'
'4,14,32,47'
what is the simple/faster way to do it
is the way of convert this to array unset every time diffrent entery and combine
them back to string ?
is there a simple way to do it ?
thanks
How about something like
string s = "1,4,14,32,47";
string r = String.Join(",", s.Split(',').Where((x, index) => index != 1).ToArray());
Using string.Split() you can create a string array. Loop through it, so that in each loop iteration you indicate which element should be skipped (on the first pass, ignore the first element, on the second pass, ignore the second element).
Inside that loop, create a new array that contains all elements but the one you want to skip, then use string.Join() to create each result.
Have a look at:
http://msdn.microsoft.com/en-us/magazine/ee310028.aspx Here you'll find an example in F# who will give the correct background in combinations and permutations (that's how is called what you need). There is code too, I think it's easy to translate it in C#
Example of Code in C# (text in Italian but code in English)
For those who need a more generic algorithm, here is one which gives n length subsets of m items:
private void GetPermutations()
{
int permutationLength = 4;
string s = "1,4,14,32,47";
string[] subS = s.Split(',');
int[] indexS = new int[permutationLength];
List<string> result = new List<string>();
IterateNextPerm(permutationLength, indexS, subS, result);
// Result will hold all your genberated data.
}
private void IterateNextPerm(int permutationLength, int[] pIndexes, string[] subS, List<string> result)
{
int maxIndexValue = subS.Count() - 1;
bool isCorrect = true;
for (int index = 0; index < permutationLength - 1; index++)
{
if (pIndexes[index] >= pIndexes[index + 1])
{
isCorrect = false;
break;
}
}
// Print result if correct
if (isCorrect)
{
string strNewPermutation = string.Empty;
for (int index = 0; index < permutationLength; index++)
{
strNewPermutation += subS[pIndexes[index]] + ",";
}
result.Add(strNewPermutation.TrimEnd(','));
}
// Increase last index position
pIndexes[permutationLength - 1]++;
// Check and fix if it's out of bounds
if (pIndexes[permutationLength - 1] > maxIndexValue)
{
int? lastIndexIncreased = null;
// Step backwards and increase indexes
for (int index = permutationLength - 1; index > 0; index--)
{
if (pIndexes[index] > maxIndexValue)
{
pIndexes[index - 1]++;
lastIndexIncreased = index - 1;
}
}
// Normalize indexes array, to prevent unnecessary steps
if (lastIndexIncreased != null)
{
for (int index = (int)lastIndexIncreased + 1; index <= permutationLength - 1; index++)
{
if (pIndexes[index - 1] + 1 <= maxIndexValue)
{
pIndexes[index] = pIndexes[index - 1] + 1;
}
else
{
pIndexes[index] = maxIndexValue;
}
}
}
}
if (pIndexes[0] < maxIndexValue)
{
IterateNextPerm(permutationLength, pIndexes, subS, result);
}
}
I know that it's not the nicest coding, but I've written it right now in the last half an hour so I'm sure there are things to tweek there.
Have fun coding!
var elements = string.Split(',');
var result =
Enumerable.Range(0,elements.Length)
.Reverse()
.Select(
i=>
string.Join(","
Enumerable.Range(0,i).Concat(Enumerable.Range(i+1,elements.Length - i - 1))
.Select(j=>elements[j]).ToArray() // This .ToArray() is not needed in NET 4
)
).ToArray();
your wording is pretty confusing...but the example is clear enough. just split it on commas, then remove one index, then use string.Join(",", list); to put it back together..

What's an Efficient Inversion of the Convert-to-Arbitrary-Base C# Function?

I need to convert an integer into a base64-character representation. I'm using OxA3's answer on this thread: Quickest way to convert a base 10 number to any base in .NET?
How do I inverse this to get my original integer back, given a string?
Joel Mueller's answer should guide you the base-64 case.
In response to the preliminary code you've provided in your own answer, you can definitely improve its efficiency by changing the code to accomplish what your for loop is doing (effectively an O(N) IndexOf) to use a hash lookup (which should make it O(1)).
I am basing this on the assumption that baseChars is a field that you initialize in your class's constructor. If this is correct, make the following adjustment:
private Dictionary<char, int> baseChars;
// I don't know what your class is called.
public MultipleBaseNumberFormatter(IEnumerable<char> baseCharacters)
{
// check for baseCharacters != null and Count > 0
baseChars = baseCharacters
.Select((c, i) => new { Value = c, Index = i })
.ToDictionary(x => x.Value, x => x.Index);
}
Then in your StringToInt method:
char next = encodedString[currentChar];
// No enumerating -- we've gone from O(N) to O(1)!
if (!characterIndices.TryGetValue(next, out nextCharIndex))
{
throw new ArgumentException("Input includes illegal characters.");
}
I have a first-pass of a working version here, albeit I'm not sure how efficient it is.
public static int StringToInt(string encodedString, char[] baseChars)
{
int result = 0;
int sourceBase = baseChars.Length;
int nextCharIndex = 0;
for (int currentChar = encodedString.Length - 1; currentChar >= 0; currentChar--)
{
char next = encodedString[currentChar];
// For loop gets us: baseChar.IndexOf(char) => int
for (nextCharIndex = 0; nextCharIndex < baseChars.Length; nextCharIndex++)
{
if (baseChars[nextCharIndex] == next)
{
break;
}
}
// For character N (from the end of the string), we multiply our value
// by 64^N. eg. if we have "CE" in hex, F = 16 * 13.
result += (int)Math.Pow(baseChars.Length, encodedString.Length - 1 - currentChar) * nextCharIndex;
}
return result;
}
Here is a version using Linq functionality and .NET Framework 4.0 Zip extension to perform the calculation.
public static int StringToInt(string encodedString, char[] baseChars) {
int sourceBase = baseChars.Length;
var dict = baseChars
.Select((c, i) => new { Value = c, Index = i })
.ToDictionary(x => x.Value, x => x.Index);
return encodedString.ToCharArray()
// Get a list of positional weights in descending order, calcuate value of weighted position
.Zip(Enumerable.Range(0,encodedString.Length).Reverse(), (f,s) => dict[f] * (int)Math.Pow(sourceBase,s))
.Sum();
}
FYI, calculating the dictionary once outside the function would be more efficient for a large batch of transformations.
Here's a complete solution that converts a base10 number to baseK and back:
public class Program
{
public static void Main()
{
int i = 100;
Console.WriteLine("Int: " + i);
// Default base definition. By moving chars around in this string, we can further prevent
// users from guessing identifiers.
var baseDefinition = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
//var baseDefinition = "WBUR17GHO8FLZIA059M4TESD2VCNQKXPJ63Y"; // scrambled to minimize guessability
// Convert base10 to baseK
var newId = ConvertToBaseK(i, baseDefinition);
Console.WriteLine(string.Format("To base{0} (short): {1}", baseDefinition.Length, newId));
// Convert baseK to base10
var convertedInt2 = ConvertToBase10(newId, baseDefinition);
Console.WriteLine(string.Format("Converted back: {0}", convertedInt2));
}
public static string ConvertToBaseK(int val, string baseDef)
{
string result = string.Empty;
int targetBase = baseDef.Length;
do
{
result = baseDef[val % targetBase] + result;
val = val / targetBase;
}
while (val > 0);
return result;
}
public static int ConvertToBase10(string str, string baseDef)
{
double result = 0;
for (int idx = 0; idx < str.Length; idx++)
{
var idxOfChar = baseDef.IndexOf(str[idx]);
result += idxOfChar * System.Math.Pow(baseDef.Length, (str.Length-1) - idx);
}
return (int)result;
}
}
If base-64 is really what you need rather than "any base" then everything you need is already built in to the framework:
int orig = 1337;
byte[] origBytes = BitConverter.GetBytes(orig);
string encoded = Convert.ToBase64String(origBytes);
byte[] decoded = Convert.FromBase64String(encoded);
int converted = BitConverter.ToInt32(decoded, 0);
System.Diagnostics.Debug.Assert(orig == converted);

Categories

Resources