StackOverflowException thrown only when multi-threading - c#

I'm writing a program to help me gather statistics for a research report on password security. To speed things up, I decided to run the application on multiple threads when attempting to brute-force an MD5-hashed password. The application runs fine on a single thread, but the moment 2 threads are running, a StackOverflowException is thrown at "using (MD5 md5Hash = MD5.Create())" in the TryPass function.
// Microsoft's GetMd5Hash function.
static string GetMd5Hash(MD5 md5Hash, string input)
{
// Convert the input string to a byte array and compute the hash.
byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input));
// Create a new Stringbuilder to collect the bytes
// and create a string.
StringBuilder sBuilder = new StringBuilder();
// Loop through each byte of the hashed data
// and format each one as a hexadecimal string.
for (int i = 0; i < data.Length; i++)
{
sBuilder.Append(data[i].ToString("x2"));
}
// Return the hexadecimal string.
return sBuilder.ToString();
}
static bool TryPass(string attempt, string password)
{
using (MD5 md5Hash = MD5.Create())
{
if (GetMd5Hash(md5Hash, attempt) == password)
return true;
else
return false;
}
}
static bool BruteForce(BruteOptions bruteOptions)
{
if (bruteOptions.prefix.Length == 1 && TryPass(bruteOptions.prefix, bruteOptions.password)) // If it's the first in a series, try it.
return true;
for (int i = 0; i < bruteOptions.chars.Length; i++)
{
if (TryPass(bruteOptions.prefix + bruteOptions.chars[i], bruteOptions.password))
{
Console.WriteLine("The password is: " + bruteOptions.prefix + bruteOptions.chars[i]);
return true;
}
if (bruteOptions.prefix.Length + 1 < bruteOptions.maxLength)
if (BruteForce(bruteOptions))
return true;
//Console.WriteLine(prefix + chars[i]);
}
return false;
}
public struct BruteOptions
{
public string password, prefix;
public char[] chars;
public int maxLength;
}
static void OptionBruteForce()
{
Console.WriteLine("-----------------------");
Console.WriteLine("----- Brute-Force -----");
Console.WriteLine("-----------------------");
BruteOptions bruteOptions = new BruteOptions();
bruteOptions.password = ReadString("Enter the MD5 password hash to brute-force: ");
bruteOptions.chars = ReadString("Enter the characters to use: ").ToCharArray();
bruteOptions.maxLength = ReadIntegerRange("Max length of password: ", 1, 16);
bruteOptions.prefix = "";
Stopwatch myStopWatch = Stopwatch.StartNew();
int NUM_THREADS = bruteOptions.chars.Length;
Thread[] workers = new Thread[NUM_THREADS]; // Run a thread for each char.
var countdownEvent = new CountdownEvent(NUM_THREADS);
bool result = false;
// Start workers.
for (int i = 0; i < NUM_THREADS; i++)
{
int index = i;
BruteOptions newBruteOptions = bruteOptions;
newBruteOptions.prefix = bruteOptions.chars[index].ToString();
workers[index] = new Thread(delegate()
{
// Also check single char.
if (BruteForce(bruteOptions))
{
result = true;
// End all other threads.
for (int ii = 0; ii < NUM_THREADS; ii++)
{
if (workers[ii].ThreadState == System.Threading.ThreadState.Running && index != ii) // Ensures we don't prematurely abort this thread.
{
workers[ii].Abort();
countdownEvent.Signal(); // Signal so we can zero it and continue on the UI thread.
}
}
}
// Signal the CountdownEvent.
countdownEvent.Signal();
});
workers[index].Start();
}
// Wait for workers.
countdownEvent.Wait();
if (!result)
Console.WriteLine("No Match.");
Console.WriteLine("Took " + myStopWatch.ElapsedMilliseconds + " Milliseconds");
}
That's all the relevant code. Any insight into why this is happening would be greatly appreciated! I'm completely stumped. I tried specifying a larger stack size when initialising each thread, to no avail.
Thanks in advance!

Your static bool BruteForce(BruteOptions bruteOptions) is infinitely recursive: it calls itself if the length allows, with the same parameters:
if (bruteOptions.prefix.Length + 1 < bruteOptions.maxLength)
if (BruteForce(bruteOptions))
bruteOptions is still exactly what it was on entry to the function, so prefix never grows and the recursion never ends.
As a solution you may use this code:
if (bruteOptions.prefix.Length + 1 < bruteOptions.maxLength)
{
BruteOptions newOptions = bruteOptions;
newOptions.prefix += bruteOptions.chars[i];
if (BruteForce(newOptions))
return true;
}
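The same idea can be sketched in isolation: each call extends the prefix by one character, so the recursion depth is bounded by maxLength. This is only an illustrative sketch; the class name PrefixSearch, the method Find and the isMatch callback are made up here and stand in for BruteForce and TryPass.

```csharp
using System;

public static class PrefixSearch
{
    // Returns the first candidate accepted by isMatch, or null if none is found.
    // The prefix grows by one character per call, so depth never exceeds maxLength.
    public static string Find(string prefix, char[] chars, int maxLength, Func<string, bool> isMatch)
    {
        foreach (char c in chars)
        {
            string candidate = prefix + c;
            if (isMatch(candidate))
            {
                return candidate;
            }
            // Recurse only while there is room for at least one more character.
            if (candidate.Length < maxLength)
            {
                string found = Find(candidate, chars, maxLength, isMatch);
                if (found != null)
                {
                    return found;
                }
            }
        }
        return null;
    }
}
```

Calling PrefixSearch.Find("", chars, maxLength, s => TryPass(s, hash)) would be one way to plug in the MD5 check from the question.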
Plus, you pass bruteOptions, not newBruteOptions, to the delegate in OptionBruteForce. Your multithreading is therefore not really used: all threads test the same passwords. Change it to newBruteOptions.
Additionally, don't assume anything about thread execution order. You assume that every element of workers has been assigned by the time one thread finds the password, which may not be true. You would then get a NullReferenceException on this line:
if (workers[ii].ThreadState == System.Threading.ThreadState.Running && index != ii) // Ensures we don't prematurely abort this thread.
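One way to avoid both problems is to create every thread before starting any of them, give each thread its own captured copy of its work item, and stop the others cooperatively with a CancellationTokenSource instead of Thread.Abort (which is not supported on .NET Core and later). The sketch below is an assumption-laden illustration, not the asker's program: the CooperativeStop class and its number-scanning workers are hypothetical stand-ins for the password search.

```csharp
using System;
using System.Threading;

public static class CooperativeStop
{
    // Hypothetical workload: each worker scans its own slice of integers for target.
    // Returns the value found, or -1 if no worker found it.
    public static int Search(int target, int workerCount, int perWorker)
    {
        using (var cts = new CancellationTokenSource())
        {
            int found = -1;
            var workers = new Thread[workerCount];
            for (int w = 0; w < workerCount; w++)
            {
                int start = w * perWorker; // fresh local per iteration, captured by this lambda only
                workers[w] = new Thread(() =>
                {
                    for (int n = start; n < start + perWorker; n++)
                    {
                        // Stop quietly once another worker has succeeded.
                        if (cts.Token.IsCancellationRequested) return;
                        if (n == target)
                        {
                            Interlocked.Exchange(ref found, n);
                            cts.Cancel();
                            return;
                        }
                    }
                });
            }
            // The array is fully populated before any thread runs.
            foreach (var t in workers) t.Start();
            foreach (var t in workers) t.Join();
            return found;
        }
    }
}
```

Because the workers only ever check a flag, no thread touches the workers array, and Join replaces the CountdownEvent bookkeeping.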

I'm guessing your single threaded code looked a little different.
I would lose the recursion.

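Just for kicks, here's an alternate implementation without the recursion that makes a bit better use of the language constructs and frameworks available as part of .NET:
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
class Program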
{
private static string StringFromIndexPermutation(char[] characters, int[] indexes)
{
var buffer = new char[indexes.Length];
for (var i = 0; i < buffer.Length; ++i)
{
buffer[i] = characters[indexes[i]];
}
return new string(buffer);
}
/// <summary>
/// Increments a set of "digits" in a base "numberBase" number with the MSD at position 0
/// </summary>
/// <param name="buffer">The buffer of integers representing the numeric string</param>
/// <param name="numberBase">The base to treat the digits of the number as</param>
/// <returns>false if the number in the buffer has just overflowed, true otherwise</returns>
private static bool Increment(int[] buffer, int numberBase)
{
for (var i = buffer.Length - 1; i >= 0; --i)
{
if ((buffer[i] + 1) < numberBase)
{
++buffer[i];
return true;
}
buffer[i] = 0;
}
return false;
}
/// <summary>
/// Calculate all the permutations of some set of characters in a string from length 1 to maxLength
/// </summary>
/// <param name="characters">The characters to permute</param>
/// <param name="maxLength">The maximum length of the permuted string</param>
/// <returns>The set of all permutations</returns>
public static IEnumerable<string> Permute(char[] characters, int maxLength)
{
for (var i = 0; i < maxLength; ++i)
{
var permutation = new int[i + 1];
yield return StringFromIndexPermutation(characters, permutation);
while (Increment(permutation, characters.Length))
{
yield return StringFromIndexPermutation(characters, permutation);
}
}
}
static string ReadString(string message)
{
Console.Write(message);
return Console.ReadLine();
}
private static int ReadIntegerRange(string message, int min, int max)
{
Console.Write(message + "({0} - {1}): ", min, max);
while (true)
{
var test = Console.ReadLine();
int value;
// Accept only integers inside the requested range; otherwise keep asking.
if (int.TryParse(test, out value) && value >= min && value <= max)
{
return value;
}
}
}
static void OptionBruteForce()
{
Console.WriteLine("-----------------------");
Console.WriteLine("----- Brute-Force -----");
Console.WriteLine("-----------------------");
var password = ReadString("Enter the MD5 password hash to brute-force: ");
var chars = ReadString("Enter the characters to use: ").Distinct().ToArray();
var maxLength = ReadIntegerRange("Max length of password: ", 1, 16);
var myStopWatch = Stopwatch.StartNew();
var result = false;
string match = null;
var cancellationTokenSource = new CancellationTokenSource();
var passwordBytes = Encoding.Default.GetBytes(password);
var originalMd5 = MD5.Create();
var passwordHash = originalMd5.ComputeHash(passwordBytes);
var hashAlgGetter = new ConcurrentDictionary<Thread, MD5>();
try
{
Parallel.ForEach(Permute(chars, maxLength), new ParallelOptions
{
CancellationToken = cancellationTokenSource.Token
}, test =>
{
var md5 = hashAlgGetter.GetOrAdd(Thread.CurrentThread, t => MD5.Create());
var data = Encoding.Default.GetBytes(test);
var hash = md5.ComputeHash(data);
if (hash.SequenceEqual(passwordHash))
{
myStopWatch.Stop();
match = test;
result = true;
cancellationTokenSource.Cancel();
}
});
}
catch (OperationCanceledException)
{
}
if (!result)
{
Console.WriteLine("No Match.");
}
else
{
Console.WriteLine("Password is: {0}", match);
}
Console.WriteLine("Took " + myStopWatch.ElapsedMilliseconds + " Milliseconds");
}
static void Main()
{
OptionBruteForce();
Console.ReadLine();
}
}
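To see the order in which this odometer-style enumeration produces candidates, here is a condensed, self-contained sketch of the same idea. The Odometer class and its Enumerate method are illustrative names, not part of the answer's code, and the sketch assumes the alphabet contains no duplicates.

```csharp
using System;
using System.Collections.Generic;

public static class Odometer
{
    // Yields every string of length 1..maxLength over alphabet, shortest first,
    // by treating the index array as a number in base alphabet.Length.
    public static IEnumerable<string> Enumerate(char[] alphabet, int maxLength)
    {
        if (alphabet.Length == 0) yield break;
        for (int length = 1; length <= maxLength; length++)
        {
            var digits = new int[length];
            var buffer = new char[length];
            while (true)
            {
                for (int i = 0; i < length; i++) buffer[i] = alphabet[digits[i]];
                yield return new string(buffer);

                // Increment from the least significant digit, carrying as needed.
                int pos = length - 1;
                while (pos >= 0 && digits[pos] == alphabet.Length - 1)
                {
                    digits[pos] = 0;
                    pos--;
                }
                if (pos < 0) break; // overflow: all strings of this length are done
                digits[pos]++;
            }
        }
    }
}
```

For the alphabet { 'a', 'b' } and maxLength 2 this yields a, b, aa, ab, ba, bb.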

Related

I need to process multiple data buffers in the order they are received in the background using delegates synchronously

I need to process data received on a serial port in the background so that the decoded data can be used/displayed on the user form as it is received. Serial data message blocks may sometimes be split across multiple buffers, so the buffers must be processed in the order they are received to ensure that all message blocks are complete.
I am trying to process multiple serial data buffers in the order they are received using delegates in c#.
The method "public static void ReceiveSerialData(byte[] byteBuffer, int length)" is called from my ClassSerialPort when data is received.
I get the exception "System.NullReferenceException: 'Object reference not set to an instance of an object.'" on the line "caller.EndInvoke(out threadID, result);"
I have two questions:
How do I correct the System.NullReferenceException exception?
Will using delegates this way process multiple serial data buffers in the order they are received?
My code:
using ClassSerialPort;
using TheUserForm;
using System;
using System.Threading;
using System.Diagnostics;
using System.Text;
namespace TheUserForm
{
class ClsReceiveSerialData
{
// SerialPort property
private ClsSerialPort serialPort;
public ClsSerialPort SerialPort
{
get
{
return serialPort;
}
set
{
serialPort = value;
}
}
public static void ReceiveSerialData(byte[] byteBuffer, int length)
{
int i;
int threadID = 0;
#if (TEST)
StringBuilder sb = new StringBuilder(byteBuffer.Length * 2);
i = 0;
foreach (byte b in byteBuffer)
{
//convert a byte to a hex string
sb.AppendFormat("{0:x2}", b);
i++;
}
//write the debug string to the output window
Debug.WriteLine("byteBuffer[" + (i - 1) + "] = " + sb.ToString());
#endif
// create an instance of a delegate class
ClsProcessData ad = new ClsProcessData();
// Create the delegate.
AsyncProcessData caller =
new AsyncProcessData(ad.BeginProcessingData);
// Initiate the asynchronous call.
IAsyncResult result =
caller.BeginInvoke(byteBuffer, length, out threadID, null, null);
// Call EndInvoke to wait for the asynchronous call to complete,
// and to retrieve the results.
caller.EndInvoke(out threadID, result);
}
}
class ClsProcessData
{
bool gotFirstFEcode = false;
bool gotSecondFEcode = false;
bool gotCtrlrCIV = false;
bool okToSaveCommandBytes = false;
bool gotEndOfMessageCode = true;
byte[] commandData;
int commandIndex = 0;
public void BeginProcessingData(byte[] data, int byteCount,
out int threadId)
{
lock (this)
{
int i;
threadId = Thread.CurrentThread.ManagedThreadId;
#if (TEST)
Debug.WriteLine(
String.Concat("BeginProcessingData threadID = ",
Convert.ToString(threadId)));
#endif
//find a preamble
i = 0;
while (i < byteCount)
{
// have we completed processing the current message?
if (gotEndOfMessageCode)
{
//reset the preamble detection booleans
gotFirstFEcode = false;
gotSecondFEcode = false;
gotCtrlrCIV = false;
okToSaveCommandBytes = false;
gotEndOfMessageCode = false;
} // If
//can we save a command byte now?
if (okToSaveCommandBytes)
{
//go save a command byte
i = ExtractCommand(data, i, byteCount);
//reset the preamble detection booleans
gotFirstFEcode = false;
gotSecondFEcode = false;
gotCtrlrCIV = false;
okToSaveCommandBytes = false;
gotEndOfMessageCode = false;
} // If
//have we found the radio's civ address?
if (gotCtrlrCIV && (data[i] ==
ClsConstants.CTRLR_DEFAULT_CIV_ADDR))
{
//we are ok to start saving command bytes
okToSaveCommandBytes = true;
} // If
//do we have both FE codes of the preamble and not
//another extraneous FE code in the buffer
if (gotSecondFEcode && (data[i] ==
ClsConstants.CTRLR_DEFAULT_CIV_ADDR))
{
gotCtrlrCIV = true;
} // If
if (gotFirstFEcode && (data[i] ==
ClsConstants.PREAMBLE_CODE))
{
gotSecondFEcode = true;
} // If
//do we have the first preamble code?
if ((data[i] == ClsConstants.PREAMBLE_CODE) &&
(!gotSecondFEcode))
{
gotFirstFEcode = true;
//reset end of message boolean
gotEndOfMessageCode = false;
} // If
//increment the array index
i++;
} // while
}
}
// ExtractCommand returns an updated index value
private int ExtractCommand(byte[] data, int index, int byteCount)
{
int i = index;
while ((i < byteCount) && (!gotEndOfMessageCode))
{
if (data[i] == ClsConstants.END_OF_MESSAGE_CODE)
{
//set the end of message flag
gotEndOfMessageCode = true;
//Process the command
ProcessCommand(commandData, commandIndex);
}
else
{
//save a command byte
commandData[commandIndex] = data[i];
//increment to the next command index
commandIndex++;
} // If
//increment the data array index
i++;
} // while
return i;
} // void
}

c# How to run a application faster

I am creating a word list of all possible character combinations to prove how insecure 8-character passwords are. The code writes aaaaaaaa, then aaaaaaab, then aaaaaaac, and so on until zzzzzzzz:
class Program
{
static string path;
static int file = 0;
static void Main(string[] args)
{
new_file();
var alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var q = alphabet.Select(x => x.ToString());
int size = 3;
int counter = 0;
for (int i = 0; i < size - 1; i++)
{
q = q.SelectMany(x => alphabet, (x, y) => x + y);
}
foreach (var item in q)
{
if (counter >= 20000000)
{
new_file();
counter = 0;
}
if (File.Exists(path))
{
using (StreamWriter sw = File.AppendText(path))
{
sw.WriteLine(item);
Console.WriteLine(item);
/*if (!(Regex.IsMatch(item, @"(.)\1")))
{
sw.WriteLine(item);
counter++;
}
else
{
Console.WriteLine(item);
}*/
}
}
else
{
new_file();
}
}
}
static void new_file()
{
path = @"C:\" + "list" + file + ".txt";
if (!File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
}
}
file++;
}
}
The code works fine, but it would take weeks to run. Does anyone know a way to speed it up, or do I just have to wait? If anyone has an idea, please tell me.
Performance:
size 3: 0.02s
size 4: 1.61s
size 5: 144.76s
Hints:
removed LINQ for combination generation
removed Console.WriteLine for each password
removed StreamWriter
large buffer (128k) for file writing
const string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var byteAlphabet = alphabet.Select(ch => (byte)ch).ToArray();
var alphabetLength = alphabet.Length;
var newLine = new[] { (byte)'\r', (byte)'\n' };
const int size = 4;
var number = new byte[size];
var password = Enumerable.Range(0, size).Select(i => byteAlphabet[0]).Concat(newLine).ToArray();
var watcher = new System.Diagnostics.Stopwatch();
watcher.Start();
var isRunning = true;
for (var counter = 0; isRunning; counter++)
{
Console.Write("{0}: ", counter);
Console.Write(password.Select(b => (char)b).ToArray());
using (var file = System.IO.File.Create(string.Format(@"list.{0:D5}.txt", counter), 2 << 16))
{
for (var i = 0; i < 2000000; ++i)
{
file.Write(password, 0, password.Length);
var j = size - 1;
for (; j >= 0; j--)
{
if (number[j] < alphabetLength - 1)
{
password[j] = byteAlphabet[++number[j]];
break;
}
else
{
number[j] = 0;
password[j] = byteAlphabet[0];
}
}
if (j < 0)
{
isRunning = false;
break;
}
}
}
}
watcher.Stop();
Console.WriteLine(watcher.Elapsed);
}
Try the following modified code. In LINQPad it runs in < 1 second. With your original code I gave up after 40 seconds. It removes the overhead of opening and closing the file for every WriteLine operation. You'll need to test and ensure it gives the same results because I'm not willing to run your original code for 24 hours to ensure the output is the same.
class Program
{
static string path;
static int file = 0;
static void Main(string[] args)
{
new_file();
var alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ123456789+-*_!$£^=<>§°ÖÄÜöäü.;:,?{}[]";
var q = alphabet.Select(x => x.ToString());
int size = 3;
int counter = 0;
for (int i = 0; i < size - 1; i++)
{
q = q.SelectMany(x => alphabet, (x, y) => x + y);
}
StreamWriter sw = File.AppendText(path);
try
{
foreach (var item in q)
{
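if (counter >= 20000000)
{
sw.Dispose();
new_file();
// Reopen the writer on the newly created file.
sw = File.AppendText(path);
counter = 0;
}
sw.WriteLine(item);
// Count written lines so the file rollover above can trigger.
counter++;
Console.WriteLine(item);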
}
}
finally
{
if(sw != null)
{
sw.Dispose();
}
}
}
static void new_file()
{
path = @"C:\temp\list" + file + ".txt";
if (!File.Exists(path))
{
using (StreamWriter sw = File.CreateText(path))
{
}
}
file++;
}
}
Your alphabet is missing 0.
With that fixed there would be 89 chars in your set. Let's call it 100 for simplicity. The set you are looking for is all the 8-character strings drawn from that set. There are 100^8 of these, i.e. 10,000,000,000,000,000.
The disk space they will take up depends on how you encode them. Let's be generous: assume you use some 8-bit char set that contains these characters, and you don't put in carriage returns. At one byte per char, each string takes 8 bytes, so the list needs 8 × 10,000,000,000,000,000 bytes, about 80 petabytes.
Do you have 80 petabytes of disk (80,000 TB)?
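The estimate can be checked without rounding the alphabet up to 100. The sketch below is only an illustration (KeyspaceEstimate and its methods are made-up names); it counts the strings exactly and multiplies by the string length, assuming one byte per character and no line terminators.

```csharp
using System;

public static class KeyspaceEstimate
{
    // Number of distinct strings of exactly `length` characters
    // drawn from an alphabet of `alphabetSize` symbols.
    public static ulong Count(int alphabetSize, int length)
    {
        ulong total = 1;
        for (int i = 0; i < length; i++)
        {
            // checked: throw OverflowException instead of silently wrapping.
            total = checked(total * (ulong)alphabetSize);
        }
        return total;
    }

    // Bytes needed to store all of them at one byte per character.
    public static ulong Bytes(int alphabetSize, int length)
    {
        return checked(Count(alphabetSize, length) * (ulong)length);
    }
}
```

With the 89-character figure quoted above and length 8, Count gives 3,936,588,805,702,081 strings and Bytes gives 31,492,710,445,616,648 bytes, which is still tens of petabytes.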
[EDIT] In response to 'this is not an answer':
The original motivation is to create the list? This shows how large the list would be. It's hard to see what could be DONE with the list if it were actually produced, i.e. it would always be quicker to regenerate it than to load it. Surely whatever point could be made by producing the list can also be made by simply knowing its size, and the above shows how to work that out.
There are LOTS of inefficiencies in your code, but if your question is 'how can I quickly produce this list and write it to disk', the answer is 'you literally cannot'.
[/EDIT]

Detecting Filename Patterns for Creating RegEx

When using a string to define a RegEx, I'd like to know if there is a way to get my code to recognize a pattern in the files contained within a directory.
The goal is to rename these files using our naming conventions, so I'm writing something to try to create the expression to use in RegEx.
I've started something here, but I don't think it is the best, and I'm not sure how to fill in the "{0}" portion of my RegEx expression.
private Regex m_regex;
public string DirPattern(string path, string[] extensions) {
string result = null;
int endPos = 0;
int resLen = 0;
int startLen = 0;
var dir = new DirectoryInfo(path);
foreach (var file in dir.GetFiles()) {
if (extensions.Contains(file.Extension)) {
if (!String.IsNullOrEmpty(result)) {
int sL = 0;
int fileLen = file.Name.Length;
string one = null;
for (int i = 0; i < resLen && i < fileLen; i++) {
if (result[i] == file.Name[i]) {
sL = i + 1;
if (String.IsNullOrEmpty(one)) {
one = file.Name;
} else {
break;
}
}
}
if (!String.IsNullOrEmpty(one)) {
int eP = 0;
int oneLen = one.Length;
for (int i = fileLen - 1; -1 < i; i--) {
if (result[i] == file.Name[i]) {
eP = i - 1;
} else {
break;
}
}
if ((0 < endPos) && (eP == endPos)) {
if ((0 < startLen) && (sL == startLen)) {
result = one.Substring(0, startLen) + "{0}" + one.Substring(endPos);
} else if (0 < sL) {
startLen = sL;
}
} else if (0 < sL) {
startLen = sL;
}
}
} else {
result = file.Name;
resLen = result.Length;
}
}
}
return result;
}
public bool GenerateRexEx(string path, string[] extensions) {
var pattern = DirPattern(path, extensions);
if (!String.IsNullOrEmpty(pattern)) {
m_regex = new Regex(pattern);
return true;
}
return false;
}
Here is an example of a list of files that would be most like our company files (which I am not allowed to post):
UPDATE:
The goal is to take files with names like this:
FOLDER_PATTERN_1 + MixedContent + FOLDER_PATTERN_2
and rename them using our format:
OUR_PATTERN_1 + MixedContent + OUR_PATTERN_2
That way, our software will be able to search the files more efficiently.
I think that in your case you just need to find the number of characters in the common prefix and the common postfix. Then you can simply replace that many characters with your own pattern. I wrote some simple code, which I tested and which works; you can use the same method as a starting point. There is room for improvement, but I hope it is enough to answer your question.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
static class Program
{
static void Main()
{
var inputFilenames = new string[]
{
"mtn_flint501-muxed",
"mtn_flint502-muxed",
"mtn_flint503-muxed",
"mtn_flint504-muxed",
"mtn_flint505-muxed",
"mtn_flint506-muxed",
"mtn_flint507-muxed",
"mtn_flint508-muxed",
"mtn_flint509-muxed",
"mtn_flint510-muxed",
"mtn_flint511-muxed",
"mtn_flint512-muxed",
};
var replacedFilenames = ReplaceFileNames(inputFilenames);
for (int i = 0; i < inputFilenames.Length; i++)
{
Console.WriteLine("{0} >> {1}", inputFilenames[i], replacedFilenames[i]);
}
Console.ReadKey();
}
private const string OurPrefixPattern = "Prefix_";
private const string OurPostfixPattern = "_Postfix";
/// <summary>
/// Method which will find the filename's pattern and replace it with our pattern
/// </summary>
/// <param name="fileNames"></param>
/// <returns></returns>
public static string[] ReplaceFileNames(params string[] fileNames)
{
//At first, we will find count of characters, which are same for
//all filenames as prefix and store it to prefixCount variable and
//we will find count of characters which are same for all filenames
//as postfix and store it to postfixCount variable
var prefixCount = int.MaxValue;
var postfixCount = int.MaxValue;
//We will use first filename as the reference one (we will be comparing)
//all filenames with this one
var referenceFilename = fileNames[0];
var reversedReferenceFilename = referenceFilename.ReverseString();
//Lets find the prefixCount and postfixCount
foreach (var filename in fileNames)
{
if (filename == referenceFilename)
{
continue;
}
//Check for prefix count
var firstDifferenceIndex = referenceFilename.GetFirstDifferentIndexWith(filename);
if (firstDifferenceIndex < prefixCount)
{
prefixCount = firstDifferenceIndex;
}
//For postfix count we will do the same, but with reversed strings
firstDifferenceIndex = reversedReferenceFilename.GetFirstDifferentIndexWith(filename.ReverseString());
if (firstDifferenceIndex < postfixCount)
{
postfixCount = firstDifferenceIndex;
}
}
//So now replace given filenames with our prefix and postfix.
//Our regex determines only how many characters should be replaced
var prefixRegexToReplace = string.Format("^.{{{0}}}", prefixCount);
var postfixRegexToReplace = string.Format(".{{{0}}}$", postfixCount);
var result = new string[fileNames.Length];
for (int i = 0; i < fileNames.Length; i++)
{
//Replace the prefix
result[i] = Regex.Replace(fileNames[i], prefixRegexToReplace, OurPrefixPattern);
//Replace the postfix
result[i] = Regex.Replace(result[i], postfixRegexToReplace, OurPostfixPattern);
}
return result;
}
/// <summary>
/// Gets the first index at which the two strings differ
/// </summary>
/// <param name="value"></param>
/// <param name="stringToCompare"></param>
/// <returns></returns>
private static int GetFirstDifferentIndexWith(this string value, string stringToCompare)
{
return value.Zip(stringToCompare, (c1, c2) => c1 == c2).TakeWhile(b => b).Count();
}
/// <summary>
/// Reverses the given string
/// </summary>
/// <param name="value">String which should be reversed</param>
/// <returns>Reversed string</returns>
private static string ReverseString(this string value)
{
char[] charArray = value.ToCharArray();
Array.Reverse(charArray);
return new string(charArray);
}
}
}
The console output looks like this
mtn_flint501-muxed >> Prefix_01_Postfix
mtn_flint502-muxed >> Prefix_02_Postfix
mtn_flint503-muxed >> Prefix_03_Postfix
mtn_flint504-muxed >> Prefix_04_Postfix
mtn_flint505-muxed >> Prefix_05_Postfix
mtn_flint506-muxed >> Prefix_06_Postfix
mtn_flint507-muxed >> Prefix_07_Postfix
mtn_flint508-muxed >> Prefix_08_Postfix
mtn_flint509-muxed >> Prefix_09_Postfix
mtn_flint510-muxed >> Prefix_10_Postfix
mtn_flint511-muxed >> Prefix_11_Postfix
mtn_flint512-muxed >> Prefix_12_Postfix

String concatenate/shorten algorithm

I want to publish server-messages on Twitter, for our clients.
Unfortunately, Twitter only allows posting 140 Chars or less. This is a shame.
Now, I have to write an algorithm that concatenates the different messages from the server together, but shortens them to a max of 140 characters.
It's pretty tricky.
CODE
static string concatinateStringsWithLength(string[] strings, int length, string separator) {
// This is the maximum number of chars for the strings
// We have to subtract the separators
int maxLengthOfAllStrings = length - ((strings.Length - 1) * separator.Length);
// Here we save all shortenedStrings
string[] cutStrings = new string[strings.Length];
// This is the average length of all the strings
int averageStringLenght = maxLengthOfAllStrings / strings.Length;
// Now we check how many strings are longer than the average string
int longerStrings = 0;
foreach (string singleString in strings)
{
if (singleString.Length > averageStringLenght)
{
longerStrings++;
}
}
// If a string is shorter than the average, we can give more characters to the longer strings
int maxStringLength = averageStringLenght;
foreach (string singleString in strings)
{
if (averageStringLenght > singleString.Length)
{
maxStringLength += (int)((averageStringLenght - singleString.Length) * (1.0 / longerStrings));
}
}
// Finally we shorten the strings and save them to the array
int i = 0;
foreach (string singleString in strings)
{
string shortenedString = singleString;
if (singleString.Length > maxStringLength)
{
shortenedString = singleString.Remove(maxStringLength);
}
cutStrings[i] = shortenedString;
i++;
}
return String.Join(separator, cutStrings);
}
Problem with this
This algorithm works, but it's not very optimized.
It uses fewer characters than it actually could.
The main problem is that longerStrings depends on maxStringLength, and vice versa.
This means if I change longerStrings, maxStringLength gets changed, and so on and so on.
I'd have to make a while loop and do this until there are no changes, but I don't think that's necessary for such a simple case.
Can you give me a clue on how to continue?
Or maybe there already exists something similar?
Thanks!
EDIT
The messages I get from the server look like this:
Message
Subject
Date
Body
Message
Subject
Date
Body
And so on.
What I want is to concatenate the strings with a separator, in this case a semi-colon.
There should be a max length. The long strings should be shortened first.
Example
This is a subject
This is the body and is a bit lon...
25.02.2013
This is a s...
This is the...
25.02.2013
I think you get the idea ;)
Five times slower than yours (in our simple example), but it should use the maximum available space (no checking of edge cases):
static string Concatenate(string[] strings, int maxLength, string separator)
{
var totalLength = strings.Sum(s => s.Length);
// Separators add to the required length, they do not reduce it.
var requiredLength = totalLength + (strings.Length - 1)*separator.Length;
// Return if there is enough place.
if (requiredLength <= maxLength)
return String.Concat(strings.Take(strings.Length - 1).Select(s => s + separator).Concat(new[] {strings.Last()}));
// The problem...
var helpers = new ConcatenateInternal[strings.Length];
for (var i = 0; i < helpers.Length; i++)
helpers[i] = new ConcatenateInternal(strings[i].Length);
var avaliableLength = maxLength - (strings.Length - 1)*separator.Length;
var charsInserted = 0;
var currentIndex = 0;
while (charsInserted != avaliableLength)
{
for (var i = 0; i < strings.Length; i++)
{
if (charsInserted == avaliableLength)
break;
if (currentIndex >= strings[i].Length)
{
helpers[i].Finished = true;
continue;
}
helpers[i].StringBuilder.Append(strings[i][currentIndex]);
charsInserted++;
}
currentIndex++;
}
var unified = new StringBuilder(avaliableLength);
for (var i = 0; i < strings.Length; i++)
{
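// Truncated if fewer characters were copied than the source string contains.
if (helpers[i].StringBuilder.Length < strings[i].Length)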
{
unified.Append(helpers[i].StringBuilder.ToString(0, helpers[i].StringBuilder.Length - 3));
unified.Append("...");
}
else
{
unified.Append(helpers[i].StringBuilder.ToString());
}
if (i < strings.Length - 1)
{
unified.Append(separator);
}
}
return unified.ToString();
}
And ConcatenateInternal:
class ConcatenateInternal
{
public StringBuilder StringBuilder { get; private set; }
public bool Finished { get; set; }
public ConcatenateInternal(int capacity)
{
StringBuilder = new StringBuilder(capacity);
}
}
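A different technique from the round-robin copy above, and one that avoids the circular dependency described in the question, is a "fair share" allocation: visit the strings from shortest to longest and give each one the smaller of its own length and an equal share of the budget that is still left. Short strings then keep all their characters, and what they do not use flows to the longer ones. The sketch below is a hypothetical illustration (FairShare, Allocate and Join are made-up names), and it leaves out the "..." marker for brevity.

```csharp
using System;
using System.Linq;

public static class FairShare
{
    // Splits `budget` characters across strings of the given lengths.
    // Shorter strings are served first, so their unused share goes to longer ones.
    public static int[] Allocate(int[] lengths, int budget)
    {
        int[] order = Enumerable.Range(0, lengths.Length).OrderBy(i => lengths[i]).ToArray();
        var quota = new int[lengths.Length];
        int remaining = Math.Max(budget, 0);
        for (int k = 0; k < order.Length; k++)
        {
            int share = remaining / (order.Length - k); // equal split of what is left
            int take = Math.Min(lengths[order[k]], share);
            quota[order[k]] = take;
            remaining -= take;
        }
        return quota;
    }

    // Truncates each string to its quota and joins them with the separator.
    public static string Join(string[] strings, int maxLength, string separator)
    {
        int budget = maxLength - (strings.Length - 1) * separator.Length;
        int[] quota = Allocate(strings.Select(s => s.Length).ToArray(), budget);
        return string.Join(separator, strings.Select((s, i) => s.Substring(0, quota[i])));
    }
}
```

A single pass after sorting is enough, so no loop "until there are no changes" is needed.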

Is it required to check before replacing a string in StringBuilder (using functions like "Contains" or "IndexOf")?

Is there an IndexOf or Contains method for StringBuilder in C#? Below is my code:
var sb = new StringBuilder(mystring);
sb.Replace("abc", "a");
string dateFormatString = sb.ToString();
if (sb.ToString().Contains("def"))
{
sb.Replace("def", "aa");
}
if (sb.ToString().Contains("ghi"))
{
sb.Replace("ghi", "assd");
}
As you might have noticed, I am calling ToString() again and again, which I want to avoid because it creates a new string every time. How can I avoid it?
If the StringBuilder doesn't contain "def" then performing the replacement won't cause any problems, so just use:
var sb = new StringBuilder(mystring);
sb.Replace("abc", "a");
sb.Replace("def", "aa");
sb.Replace("ghi", "assd");
There's no such method in StringBuilder but you don't need the Contains tests. You can simply write it like this:
sb.Replace("abc", "a");
sb.Replace("def", "aa");
sb.Replace("ghi", "assd");
If the string in the first parameter to Replace is not found then the call to Replace is a null operation—exactly what you want.
The documentation states:
Replaces all occurrences of a specified string in this instance with another specified string.
The way you read this is that when there are no occurrences, nothing is done.
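A small sketch makes this easy to verify: StringBuilder.Replace returns the same builder, so the calls can be chained, and an input without any of the search strings comes back unchanged. The ReplaceDemo class below is only an illustrative wrapper around the three replacements from the question.

```csharp
using System;
using System.Text;

public static class ReplaceDemo
{
    // Applies the three replacements from the question without any Contains checks.
    public static string Apply(string input)
    {
        var sb = new StringBuilder(input);
        // Each Replace is a no-op when its search string does not occur.
        sb.Replace("abc", "a").Replace("def", "aa").Replace("ghi", "assd");
        return sb.ToString();
    }
}
```

Only one ToString() call remains, at the very end.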
You can write a class that adds extension methods to StringBuilder. Here, I have added IndexOf, Substring, and other methods to the StringBuilder class. Just put this class in your project.
using System;
using System.Text;
namespace Helpers
{
/// <summary>
/// Adds IndexOf, IsStringAt, AreEqual, and Substring to all StringBuilder objects.
/// </summary>
public static class StringBuilderExtension
{
// Adds IndexOf, Substring, AreEqual to the StringBuilder class.
public static int IndexOf(this StringBuilder theStringBuilder,string value)
{
const int NOT_FOUND = -1;
if (theStringBuilder == null)
{
return NOT_FOUND;
}
if (String.IsNullOrEmpty(value))
{
return NOT_FOUND;
}
int count = theStringBuilder.Length;
int len = value.Length;
if (count < len)
{
return NOT_FOUND;
}
int loopEnd = count - len + 1;
for (int loop = 0; loop < loopEnd; loop++)
{
bool found = true;
for (int innerLoop = 0; innerLoop < len; innerLoop++)
{
if (theStringBuilder[loop + innerLoop] != value[innerLoop])
{
found = false;
break;
}
}
if (found)
{
return loop;
}
}
return NOT_FOUND;
}
public static int IndexOf(this StringBuilder theStringBuilder, string value,int startPosition)
{
const int NOT_FOUND = -1;
if (theStringBuilder == null)
{
return NOT_FOUND;
}
if (String.IsNullOrEmpty(value))
{
return NOT_FOUND;
}
int count = theStringBuilder.Length;
int len = value.Length;
if (count < len)
{
return NOT_FOUND;
}
int loopEnd = count - len + 1;
if (startPosition >= loopEnd)
{
return NOT_FOUND;
}
for (int loop = startPosition; loop < loopEnd; loop++)
{
bool found = true;
for (int innerLoop = 0; innerLoop < len; innerLoop++)
{
if (theStringBuilder[loop + innerLoop] != value[innerLoop])
{
found = false;
break;
}
}
if (found)
{
return loop;
}
}
return NOT_FOUND;
}
public static string Substring(this StringBuilder theStringBuilder, int startIndex, int length)
{
return theStringBuilder == null ? null : theStringBuilder.ToString(startIndex, length);
}
public static bool AreEqual(this StringBuilder theStringBuilder, string compareString)
{
if (theStringBuilder == null)
{
return compareString == null;
}
if (compareString == null)
{
return false;
}
int len = theStringBuilder.Length;
if (len != compareString.Length)
{
return false;
}
for (int loop = 0; loop < len; loop++)
{
if (theStringBuilder[loop] != compareString[loop])
{
return false;
}
}
return true;
}
/// <summary>
/// Compares one string to part of another string.
/// </summary>
/// <param name="haystack"></param>
/// <param name="needle">Needle to look for</param>
/// <param name="position">Looks to see if the needle is at position in haystack</param>
/// <returns>Substring(haystack, position, needle.Length) == needle</returns>
public static bool IsStringAt(this StringBuilder haystack, string needle,int position)
{
if (haystack == null)
{
return needle == null;
}
if (needle == null)
{
return false;
}
int len = haystack.Length;
int compareLen = needle.Length;
if (len < compareLen + position)
{
return false;
}
for (int loop = 0; loop < compareLen; loop++)
{
if (haystack[loop+position] != needle[loop])
{
return false;
}
}
return true;
}
}
}
IMHO you don't need a StringBuilder in this case. StringBuilder is more useful inside a loop. As Microsoft says in this article:
"The String object is immutable. Every time you use one of the methods in the System.String class, you create a new string object in memory, which requires a new allocation of space for that new object. In situations where you need to perform repeated modifications to a string, the overhead associated with creating a new String object can be costly. The System.Text.StringBuilder class can be used when you want to modify a string without creating a new object. For example, using the StringBuilder class can boost performance when concatenating many strings together in a loop."
So you can simply use String and avoid calling ToString().
