I am trying to add a string to an array. I have done a lot of research and came up with two options, but neither works: I get a pop-up, and the details make it sound like my array index is out of bounds. Both methods are inside my addSpam function. Any ideas on how to fix either method?
namespace HW8_DR
{
class Tester : Spam_Scanner
{
private string[] spam = {"$$$", "Affordable", "Bargain", "Beneficiary", "Best price", "Big bucks",
"Cash", "Cash bonus", "Cashcashcash", "Cents on the dollar", "Cheap", "Check",
"Claims", "Collect", "Compare rates", "Cost", "Credit", "Credit bureaus",
"Discount", "Earn", "Easy terms", "F r e e", "Fast cash", "For just $XXX",
"Hidden assets", "hidden charges", "Income", "Incredible deal", "Insurance",
"Investment", "Loans", "Lowest price", "Million dollars", "Money", "Money back",
"Mortgage", "Mortgage rates", "No cost", "No fees", "One hundred percent free",
"Only $", "Pennies a day", "Price", "Profits", "Pure profit", "Quote", "Refinance",
"Save $", "Save big money", "Save up to", "Serious cash", "Subject to credit",
"They keep your money – no refund!", "Unsecured credit", "Unsecured debt",
"US dollars", "Why pay more?"};
public static double countSpam = 0;
public static double wordCount = 0;
public static string posSpam = "";
public void tester(string email)
{
for(int i = 0; i < spam.Length-1; i++)
if(email.Contains(spam[i]))
{
countSpam++;
posSpam = string.Concat(posSpam, spam[i], "\r\n\r\n");
}
wordCount = email.Split(' ').Length;
}
public void addSpam(string spamFlag)
{
//attempt 1 to add string to spam array
Array.Resize(ref spam, spam.Length + 1);
spam[spam.Length] = spamFlag;
//attempt 2 to add string to spam array
string[] temp = new string[spam.Length + 1];
Array.Copy(spam, temp, spam.Length);
temp.SetValue(spamFlag, spam.Length);
Array.Copy(temp, spam, temp.Length);
}
}
}
Simple solution: don't use an array! List<T> is much better suited for this.
using System.Collections.Generic;
...
private List<string> spam = new List<string> {"$$$", "Affordable", "Bargain", "Beneficiary", ... };
...
public void addSpam(string spamFlag)
{
spam.Add(spamFlag);
}
DLeh's answer is best - this is what a List<T> is for, so that's your solution.
But the reason things are failing for you is that you're attempting to access an index that is one higher than the max index of the array. The highest index is always one less than the length, because arrays are zero-based.
int[] arr = new[] { 1, 2, 3 };
Console.WriteLine(arr.Length); // 3
Console.WriteLine(arr[0]); // 1
Console.WriteLine(arr[1]); // 2
Console.WriteLine(arr[2]); // 3
Console.WriteLine(arr[3]); // throws IndexOutOfRangeException
To access the last item in an array, you need to use either:
var lastItem = arr[arr.Length - 1];
// or
var lastItem = arr[arr.GetUpperBound(0)];
Array.Resize(ref spam, spam.Length + 1);
spam[spam.Length] = spamFlag;
Here you're trying to write to index 58 (spam.Length after the re-size) of a 58-element zero-indexed array; that is, it goes from 0 to 57.
You should use:
Array.Resize(ref spam, spam.Length + 1);
spam[spam.Length - 1] = spamFlag;
That said, you should really use List<string> instead. Among other things it does the resizing of the internal array it uses in batches rather than on every Add(), which makes things much more efficient, as well as being easier.
If you really need an array for some reason, then use List<string> for most of the work, and then call ToArray() at the end.
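A minimal sketch of both fixes side by side. The SpamList class and its method names are hypothetical stand-ins for the asker's Tester class, and the word list is shortened for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the asker's Tester class; names are illustrative.
public class SpamList
{
    private string[] spamArray = { "$$$", "Affordable", "Bargain" };
    private readonly List<string> spamWords =
        new List<string> { "$$$", "Affordable", "Bargain" };

    // Array version: grow by one, then write to the new last index (Length - 1).
    public void AddToArray(string spamFlag)
    {
        Array.Resize(ref spamArray, spamArray.Length + 1);
        spamArray[spamArray.Length - 1] = spamFlag;
    }

    // List version: the list grows its internal buffer as needed.
    public void AddToList(string spamFlag)
    {
        spamWords.Add(spamFlag);
    }

    // Returns a copy of the array so callers cannot modify the original.
    public string[] ArraySnapshot() => (string[])spamArray.Clone();

    // Convert to an array only when a caller really needs one.
    public string[] ListSnapshot() => spamWords.ToArray();
}
```

Calling AddToArray("Free") and AddToList("Free") on a new SpamList should leave "Free" at index 3 in both snapshots.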
While DLeh raises legitimate points about why a List that grows dynamically is the better choice, and Joe's answer provides a great explanation, I want to add a few more things.
Firstly, to answer your question about fixing either method: in attempt one you probably wanted spam[spam.Length - 1] = spamFlag instead of spam[spam.Length] = spamFlag. Indexes start at 0, so the last index within bounds is Length - 1 (as Joe pointed out).
Your second attempt will not work because Array.Copy throws an exception if either array is too short, and here the destination (spam) is one element shorter than temp.Length. See this Dot Net Perls link for the explanation. Since temp already holds everything, assigning spam = temp; instead of the final Array.Copy would fix it. More generally, Array.Copy is awkward to use because you have to make sure the types and lengths are compatible yourself.
To elaborate on Array.Resize(): the name is something of a misnomer, because C# doesn't actually resize the array. Rather, it creates a new array and copies the contents over. It does this on every call that changes the size, so from a performance point of view growing by one element at a time is discouraged. (You could keep a variable that tracks how full your array is and double the capacity whenever it fills up, which avoids creating a new array on every add.)
Hash tables use the same idea: when the table gets too full, we grow it and rehash everything back in (it's a really fast lookup table/data structure).
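The doubling idea mentioned above could be sketched roughly like this. GrowableStringArray is a hypothetical class invented for illustration, and the starting capacity of 4 is an arbitrary assumption:

```csharp
using System;

// Hypothetical growable buffer: tracks how many slots are in use and
// doubles the backing array only when it is full.
public class GrowableStringArray
{
    private string[] items = new string[4]; // arbitrary starting capacity

    public int Count { get; private set; }
    public int Capacity => items.Length;

    public void Add(string value)
    {
        if (Count == items.Length)
        {
            // One allocation and copy per doubling, not one per Add.
            Array.Resize(ref items, items.Length * 2);
        }
        items[Count] = value;
        Count++;
    }

    public string this[int index]
    {
        get
        {
            // Slots at or beyond Count are unused, so treat them as out of range.
            if (index < 0 || index >= Count) throw new IndexOutOfRangeException();
            return items[index];
        }
    }
}
```

This is essentially what List<string> already does internally, so in real code the list is still the simpler choice.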
Read this DotNetPerl tutorial on Array Resize
That said, a List is better most of the time, but of course there may be a reason why you don't want to or can't use one. I know of a few classes that explicitly forbid built-in data structures so that students learn how arrays work.
I have a file consisting of a list of text which looks as follows:
ABC Abbey something
ABD Aasdasd
This is the text file
The first string will always have a length of 3. So I want to loop through the file content and store those first 3 letters as the key and the remainder as the value. I am removing the whitespace between them and using Substring as follows to store them. The key works out fine, but the line where I store the value throws an ArgumentOutOfRangeException.
This is the exact code causing the problem.
line.Substring(4, line.Length)
If I call Substring with 0 and line.Length it works fine. As soon as I call it with a start of 1 or more and line.Length, I get the error. I honestly don't get it and have been at it for hours. Some assistance, please.
class Program {
static string line;
static Dictionary<string, string> stations = new Dictionary<string, string>();
static void Main(string[] args) {
var lines = File.ReadLines("C:\\Users\\username\\Desktop\\a.txt");
foreach (var l in lines) {
line = l.Replace("\t", "");
stations.Add(line.Substring(0, 3), line.Substring(4, line.Length));//error caused by this line
}
foreach(KeyValuePair<string, string> item in stations) {
//Console.WriteLine(item.Key);
Console.WriteLine(item.Value);
}
Console.ReadLine();
}
}
This is because the documentation specifies it will throw an ArgumentOutOfRangeException if:
startIndex plus length indicates a position not within this instance.
With the signature:
public string Substring(int startIndex, int length)
Since you use line.Length, you know that startIndex plus length will be 4+line.Length which is definitely not a position of this instance.
I recommend using the one parameter version:
public string Substring(int startIndex)
Thus line.Substring(3) (credit to @adv12 for spotting that), since here you only need to provide the startIndex. Of course you can use line.Substring(3, line.Length - 3), but as always, it is better to let the library do the arithmetic, since libraries are made to make programs fool-proof (this is not intended as offensive; it simply reduces the number of brain cycles spent on this task). Mind, however, that it can still throw an error if:
startIndex is less than zero or greater than the length of this instance.
So it is better to check that 3 is less than or equal to line.Length...
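A minimal sketch of that check, assuming a 3-character key as in the question. StationLine and TrySplit are hypothetical names, and trimming the value is my own assumption about what the asker wants:

```csharp
using System;

// Hypothetical helper; assumes the key is always the first 3 characters.
public static class StationLine
{
    // Splits "ABC Abbey something" into key "ABC" and value "Abbey something".
    // Returns false when the line is too short to hold a 3-character key.
    public static bool TrySplit(string line, out string key, out string value)
    {
        key = null;
        value = null;
        if (line == null || line.Length < 3) return false;

        key = line.Substring(0, 3);
        // One-parameter overload: everything from index 3 to the end.
        value = line.Substring(3).Trim();
        return true;
    }
}
```

With this, a short or empty line in the file is skipped by the caller instead of throwing an ArgumentOutOfRangeException.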
Additional advice
Perhaps you should take a look at regex capturing. Right now each key in your file contains three characters, but it is possible that in the (near) future four characters will be allowed. Using regex capture, you can specify a pattern so that errors are less likely to occur during parsing.
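One way the regex capture idea might look. The pattern is an assumption on my part (a key with no whitespace, then whitespace, then the rest of the line), and StationParser is a hypothetical name:

```csharp
using System.Text.RegularExpressions;

// Hypothetical parser; the pattern is an assumed line format, not the asker's spec.
public static class StationParser
{
    // Group 1: key (one or more non-whitespace characters).
    // Group 2: value (the rest of the line after the separating whitespace).
    private static readonly Regex LinePattern = new Regex(@"^(\S+)\s+(.+)$");

    public static bool TryParse(string line, out string key, out string value)
    {
        Match m = LinePattern.Match(line);
        key = m.Success ? m.Groups[1].Value : null;
        value = m.Success ? m.Groups[2].Value.Trim() : null;
        return m.Success;
    }
}
```

Because the key is "anything up to the first whitespace", a four-character key parses the same way as a three-character one, and a line without a value simply fails to match.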
You need to actually get less than the length of total line:
line.Substring(4, line.Length - 4) //subtract the chars which you're skipping
Your string:
ABC Abbey something
Length = 19
Start = 4
Remaining chars = 19 - 4 = 15 //and you are expecting 19, that is the error
I know this is a late answer that doesn't address what's wrong with your code, but I feel that has already been done by other people. Instead, I have a different way to make the dictionary that doesn't involve Substring at all, so it's a little more robust, IMHO.
As long as you can guarantee that the two values are always separated by a tab, this will work even if there are more or fewer characters in the key. It uses LINQ, which is available from .NET 3.5 onward.
// LINQ
using System.Linq;
// Creates a string[][] array with the list of keys in the first array position
// and the values in the second
var lines = File.ReadAllLines(@"path/to/file.txt")
    .Select(s => s.Split('\t'))
    .ToArray();
// Your dictionary
Dictionary<string, string> stations = new Dictionary<string, string>();
// Loop through the array and add the key/value pairs to the dictionary
for (int i = 0; i < lines.Length; i++)
{
// For example lines[i][0] = ABW, lines[i][1] = Abbey Wood
stations[lines[i][0]] = lines[i][1];
}
// Prove it works
foreach (KeyValuePair<string, string> entry in stations)
{
MessageBox.Show(entry.Key + " - " + entry.Value);
}
Hope this makes sense and gives you an alternate to consider ;-)
I'm sure this has been asked a million times, but when I searched all the examples didn't quite fit, so I thought I should ask it anyway.
I have two arrays which will always contain 6 items each. For example:
string[] Colors=
new string[] { "red", "orange", "yellow", "green", "blue", "purple" };
string[] Foods=
new string[] { "fruit", "grain", "dairy", "meat", "sweet", "vegetable" };
Between these two arrays, there are 36 possible combinations (e.g. "red fruit", "red grain").
Now I need to further group these into sets of six unique values.
For example:
meal[0]=
new Pair[] {
new Pair { One="red", Two="fruit" },
new Pair { One="orange", Two="grain" },
new Pair { One="yellow", Two="dairy" },
new Pair { One="green", Two="meat" },
new Pair { One="blue", Two="sweet" },
new Pair { One="purple", Two="vegetable" }
};
where meal is
Pair[][] meal;
No element can be repeated in my list of "meals". So there is only ever a single "Red" item, and a single "meat" item, etc.
I can easily create the pairs based on the first two arrays, but I am drawing a blank on how best to then group them into unique combinations.
OK, you want a sequence containing all 720 possible sequences. This is a bit trickier but it can be done.
The basic idea is the same as in my previous answer. In that answer we:
generated a permutation at random
zipped the permuted second array with the unpermuted first array
produced an array from the query
Now we'll do the same thing except instead of producing a permutation at random, we'll produce all the permutations.
Start by getting this library:
http://www.codeproject.com/Articles/26050/Permutations-Combinations-and-Variations-using-C-G
OK, we need to make all the permutations of six items:
Permutations<string> permutations = new Permutations<string>(foods);
What do we want to do with each permutation? We already know that. We want to first zip it with the colors array, turning it into a sequence of pairs, which we then turn into an array. Instead, let's turn it into a List<Pair> because, well, trust me, it will be easier.
IEnumerable<List<Pair>> query =
from permutation in permutations
select colors.Zip(permutation, (color, food)=>new Pair(color, food)).ToList();
And now we can turn that query into a list of results;
List<List<Pair>> results = query.ToList();
And we're done. We have a list with 720 items in it. Each item is a list with 6 pairs in it.
The heavy lifting is done by the library code, obviously; the query laid on top of it is straightforward.
(I've been meaning to write a blog article for some time on ways to generate permutations in LINQ; I might use this as an example!)
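If pulling in the library is not an option, the same result can be sketched with a small hand-written recursive permutation generator. This is not the library's API: MealPlanner, Permute and AllMeals are hypothetical names, and Tuple<string, string> stands in for the Pair class to keep the sketch self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical replacement for the library call; simple, not optimized.
public static class MealPlanner
{
    // Recursively yields every ordering of the given items.
    public static IEnumerable<List<T>> Permute<T>(IList<T> items)
    {
        if (items.Count == 0)
        {
            yield return new List<T>();
            yield break;
        }
        for (int i = 0; i < items.Count; i++)
        {
            // Take out the i-th item, permute the rest, then put it in front.
            var rest = items.Where((_, index) => index != i).ToList();
            foreach (var tail in Permute(rest))
            {
                tail.Insert(0, items[i]);
                yield return tail;
            }
        }
    }

    // Zips the fixed colors with every permutation of the foods.
    public static List<List<Tuple<string, string>>> AllMeals(string[] colors, string[] foods)
    {
        return Permute(foods)
            .Select(p => colors.Zip(p, (c, f) => Tuple.Create(c, f)).ToList())
            .ToList();
    }
}
```

For six colors and six foods, AllMeals returns 6! = 720 lists of 6 pairs each, matching the count above.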
There are 720 possible combinations that meet your needs. It is not clear from your question whether you want to enumerate all 720 or choose one at random or what. I'm going to assume the latter.
UPDATE: Based on comments, this assumption was incorrect. I'll start a new answer.
First, produce a permutation of the second array. You can do it in-place with the Fisher-Yates-Knuth shuffle; there are many examples of how to do so on Stack Overflow. Alternatively, you could produce a permutation with LINQ by sorting with a random key.
The former technique is fast even if the number of items is large, but mutates an existing array. The second technique is slower, particularly if the number of items is extremely large, which it isn't.
The most common mistake people make with the second technique is sorting on a guid. Guids are guaranteed to be unique, not guaranteed to be random.
Anyway, produce a query which, when executed, permutes the second array:
Random random = new Random();
IEnumerable<string> shuffled = from food in foods
orderby random.NextDouble()
select food;
A few other caveats:
Remember, the result of a query expression is a query, not a set of results. The permutation doesn't happen until you actually turn the thing into an array at the other end.
If you make two instances of Random within the same millisecond, you get the same sequence out of them both.
Random is pseudo-random, not truly random.
Random is not threadsafe.
Now you can zip-join your permuted sequence to the first array:
IEnumerable<Pair> results = colors.Zip(shuffled, (color, food)=>new Pair(color, food));
Again, this is still a query representing the action of zipping the two sequences together. Nothing has happened yet except building some queries.
Finally, turn it into an array. This actually executes the queries.
Pair[] finalResults = results.ToArray();
Easy peasy.
Upon request, I will be specific about how I view the problem with regard to shuffling. I know that since C# is a higher-level language there are tons of quick and easy libraries and objects that can be used to reduce this to minimal code. This answer actually attempts to solve the question by implementing the shuffling logic.
When initially reading this question I was reminded of shuffling a deck of cards. The two arrays are very similar to an array for suit and an array for face value. Since one way to solve a shuffle is to randomize the arrays and then pick a card combined of both, you could apply the same logic here.
Shuffling as a possible solution
The Fisher-Yates shuffle essentially loops through all the indices of the array, swapping the current index with a random index that has not yet been fixed. This creates a fairly efficient shuffling method. So how does this apply to the problem at hand? One possible implementation could be...
static Random rdm = new Random();
public string[] Shuffle(string[] c)
{
var random = rdm;
for (int i = c.Length; i > 1; i--)
{
int iRdm = rdm.Next(i);
string cTemp = c[iRdm];
c[iRdm] = c[i - 1];
c[i - 1] = cTemp;
}
return c;
}
Source: Fisher-Yates Shuffle
The code above randomizes the positions of the values within the string array. If you pass the Colors and Foods arrays into this function, you get unique pairings for your Pairs by referencing the same index in both.
Since the arrays are shuffled, the pairings of the two arrays at index 0, 1, 2, etc. are unique. The problem, however, asks for Pairs to be created. A Pair class should therefore be created that takes in the value at a specific index for both Colors and Foods, i.e. Colors[3] and Foods[3].
public class Pair
{
public string One;
public string Two;
public Pair(string m1, string m2)
{
One = m1;
Two = m2;
}
}
Since we have shuffled arrays and a class to contain the unique pairings, we simply create the meal array and populate it with Pairs.
If we wanted to create a new pair we would have...
Pair temp = new Pair(Colors[0],Foods[0]);
With this information we can finally populate the meal array.
Pair[] meal = new Pair[Colors.Length];
for (int i = 0; i < Colors.Length; i++)
{
meal[i] = new Pair(Colors[i],Foods[i]);
}
This section of code creates the meal array and sizes it by the length of Colors. The code then loops through all the Colors values, creating new pair combos and dropping them into meal. This method assumes the two arrays have identical lengths; a check could easily be added to use the length of the shorter array.
Full Code
private void Form1_Load(object sender, EventArgs e)
{
string[] Colors = new string[] { "red", "orange", "yellow", "green", "blue", "purple" };
string[] Foods = new string[] { "fruit", "grain", "dairy", "meat", "sweet", "vegetable" };
Colors = Shuffle(Colors);
Foods = Shuffle(Foods);
Pair[] meal = new Pair[Colors.Length];
for (int i = 0; i < Colors.Length; i++)
{
meal[i] = new Pair(Colors[i],Foods[i]);
}
}
static Random rdm = new Random();
public string[] Shuffle(string[] c)
{
var random = rdm;
for (int i = c.Length; i > 1; i--)
{
int iRdm = rdm.Next(i);
string cTemp = c[iRdm];
c[iRdm] = c[i - 1];
c[i - 1] = cTemp;
}
return c;
}
}
public class Pair
{
public string One;
public string Two;
public Pair(string m1, string m2)
{
One = m1;
Two = m2;
}
}
-Original Post-
You can simply shuffle the array. This will allow for the same method to populate meal, but with different results. There is a post on Fisher-Yates shuffle Here
I am fairly new to C# programming and I am stuck on my little ASP.NET project.
My website currently examines Twitter statuses for URLs and then adds those URLs to an array, all via a regular expression pattern matching procedure. Clearly more than one person will post an update with a specific URL, so I do not want to list duplicates, and I want to count the number of times a particular URL is mentioned in, say, 100 tweets.
Now I have a List<String> which I can sort so that all duplicate URLs are next to each other. I was under the impression that I could compare list[i] with list[i+1] and if they match, for a counter to be added to (count++), and if they don't match, then for the URL and the count value to be added to a new array, assuming that this is the end of the duplicates.
This would remove duplicates and give me a count of the number of occurrences for each URL. At the moment, what I have is not working, and I do not know why (like I say, I am not very experienced with it all).
With the code below, assume that a JSON feed has been searched for using a keyword into srchResponse.results. The results with URLs in them get added to sList, a string List type, which contains only the URLs, not the message as a whole.
I want to put one of each URL (no duplicates), a count integer (to string) for the number of occurrences of a URL, and the username, message, and user image URL all into my jagged array called 'urls[100][]'. I have made the array 100 rows long to make sure everything can fit but generally, this is too big. Each 'row' will have 5 elements in them.
The debugger gets stuck on the line: if (sList[i] == sList[i + 1]) which is the crux of my idea, so clearly the logic is not working. Any suggestions or anything will be seriously appreciated!
Here is sample code:
var sList = new ArrayList();
string[][] urls = new string[100][];
int ctr = 0;
int j = 1;
foreach (Result res in srchResponse.results)
{
string content = res.text;
string pattern = @"((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)";
MatchCollection matches = Regex.Matches(content, pattern);
foreach (Match match in matches)
{
GroupCollection groups = match.Groups;
sList.Add(groups[0].Value.ToString());
}
}
sList.Sort();
foreach (Result res in srchResponse.results)
{
for (int i = 0; i < 100; i++)
{
if (sList[i] == sList[i + 1])
{
j++;
}
else
{
urls[ctr][0] = sList[i].ToString();
urls[ctr][1] = j.ToString();
urls[ctr][2] = res.text;
urls[ctr][3] = res.from_user;
urls[ctr][4] = res.profile_image_url;
ctr++;
j = 1;
}
}
}
The code then goes on to add each result into a StringBuilder method with the HTML.
The description of your algorithm seems fine. I don't know what's wrong with the implementation; I haven't read it that carefully. (The fact that you are using an ArrayList is an immediate red flag; why aren't you using a more strongly typed generic collection?)
However, I have a suggestion. This is exactly the sort of problem that LINQ was intended to solve. Instead of writing all that error-prone code yourself, just describe the transformation you're interested in, and let the compiler work it out for you.
Suppose you have a list of strings and you wish to determine the number of occurrences of each:
var notes = new []{ "Do", "Fa", "La", "So", "Mi", "Do", "Re" };
var counts = from note in notes
group note by note into g
select new { Note = g.Key, Count = g.Count() };
foreach(var count in counts)
Console.WriteLine("Note {0} occurs {1} times.", count.Note, count.Count);
Which I hope you agree is much easier to read than all that array logic you wrote. And of course, now you have your sequence of unique items; you have a sequence of counts, and each count contains a unique Note.
I'd recommend using a more sophisticated data structure than an array. A Set will guarantee that you have no duplicates.
The older C# collections don't include a Set, but .NET 3.5 added HashSet<T> in System.Collections.Generic, and there are also third-party implementations available.
Your loop fails because when i == 99, (i + 1) == 100 which is outside the bounds of your array.
But as others have pointed out, .NET 3.5 has ways of doing what you want more elegantly.
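If you do want to keep the sort-and-compare-neighbors approach from the question, a minimal sketch with the bounds check in place might look like this. UrlCounter and CountSorted are hypothetical names, and it only counts URLs; attaching the tweet text and user fields is left out:

```csharp
using System.Collections.Generic;

// Hypothetical helper showing the neighbor comparison with a safe upper bound.
public static class UrlCounter
{
    // Sorts a copy of the input, then counts runs of equal neighbors.
    public static List<KeyValuePair<string, int>> CountSorted(IEnumerable<string> urls)
    {
        var sorted = new List<string>(urls);
        sorted.Sort(string.CompareOrdinal);

        var result = new List<KeyValuePair<string, int>>();
        int run = 1;
        for (int i = 0; i < sorted.Count; i++)
        {
            // Only look at i + 1 when it is a valid index.
            bool sameAsNext = i + 1 < sorted.Count && sorted[i] == sorted[i + 1];
            if (sameAsNext)
            {
                run++;
            }
            else
            {
                // End of a run: record the URL and how often it occurred.
                result.Add(new KeyValuePair<string, int>(sorted[i], run));
                run = 1;
            }
        }
        return result;
    }
}
```

The loop runs to sorted.Count rather than a fixed 100, so it also works when fewer URLs were found than expected.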
If you don't need to know how many duplicates a specific entry has you could do the following:
LINQ Extension Methods
.Count()
.Distinct()
.Count()
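My reading of the list above is that Count() and Distinct() are meant to be combined; under that assumption, a sketch could be (DuplicateStats and its method names are hypothetical):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; one possible way to combine Count() and Distinct().
public static class DuplicateStats
{
    // Number of different values in the sequence.
    public static int CountDistinct(IEnumerable<string> items)
    {
        return items.Distinct().Count();
    }

    // Number of entries that repeat a value already seen.
    public static int CountDuplicates(ICollection<string> items)
    {
        return items.Count - items.Distinct().Count();
    }
}
```

This tells you how many duplicates exist overall, but not which URL they belong to; for per-URL counts the group-by query is the better fit.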
I understand the benefits of StringBuilder.
But if I want to concatenate 2 strings, then I assume that it is better (faster) to do it without StringBuilder. Is this correct?
At what point (number of strings) does it become better to use StringBuilder?
I warmly suggest you read The Sad Tragedy of Micro-Optimization Theater, by Jeff Atwood.
It compares simple concatenation, StringBuilder, and other methods.
Now, if you want to see some numbers and graphs, follow the link ;)
But if I want to concatenate 2 strings, then I assume that it is better (faster) to do it without StringBuilder. Is this correct?
That is indeed correct; exactly why is explained very well in:
Article about strings and StringBuilder
Summed up: if you can concatenate strings in one go like
var result = a + " " + b + " " + c + ..
you are better off without StringBuilder, because only one copy is made (the length of the resulting string is calculated beforehand).
For a structure like
var result = a;
result += " ";
result += b;
result += " ";
result += c;
..
new objects are created each time, so there you should consider StringBuilder.
At the end the article sums up these rules of thumb :
Rules Of Thumb
So, when should you use StringBuilder, and when should you use the string concatenation operators?
Definitely use StringBuilder when you're concatenating in a non-trivial loop - especially if you don't know for sure (at compile time) how many iterations you'll make through the loop. For example, reading a file a character at a time, building up a string as you go using the += operator is potentially performance suicide.
Definitely use the concatenation operator when you can (readably) specify everything which needs to be concatenated in one statement. (If you have an array of things to concatenate, consider calling String.Concat explicitly - or String.Join if you need a delimiter.)
Don't be afraid to break literals up into several concatenated bits - the result will be the same. You can aid readability by breaking a long literal into several lines, for instance, with no harm to performance.
If you need the intermediate results of the concatenation for something other than feeding the next iteration of concatenation, StringBuilder isn't going to help you. For instance, if you build up a full name from a first name and a last name, and then add a third piece of information (the nickname, maybe) to the end, you'll only benefit from using StringBuilder if you don't need the (first name + last name) string for other purpose (as we do in the example which creates a Person object).
If you just have a few concatenations to do, and you really want to do them in separate statements, it doesn't really matter which way you go. Which way is more efficient will depend on the number of concatenations the sizes of string involved, and what order they're concatenated in. If you really believe that piece of code to be a performance bottleneck, profile or benchmark it both ways.
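To make those rules of thumb concrete, here is a small sketch of the three situations. ConcatExamples and its methods are hypothetical names chosen for illustration:

```csharp
using System;
using System.Text;

// Hypothetical examples, one per rule of thumb.
public static class ConcatExamples
{
    // Everything is known up front: a single expression with the + operator.
    public static string FullName(string first, string last)
    {
        return first + " " + last;
    }

    // An array of pieces with a delimiter: String.Join.
    public static string Csv(string[] parts)
    {
        return string.Join(",", parts);
    }

    // A loop whose iteration count is only known at run time: StringBuilder.
    public static string Digits(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(i % 10);
        }
        return sb.ToString();
    }
}
```

Only the last case creates intermediate results that are thrown away when written with +=, which is why it is the one that benefits from StringBuilder.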
System.String is an immutable object, which means that whenever you modify its content a new string is allocated, and this takes time and memory.
Using StringBuilder you modify the actual content of the object without allocating a new one.
So use StringBuilder when you need to do many modifications on the string.
Not really... you should use StringBuilder if you concatenate large strings or have many concatenations, as in a loop.
If you concatenate strings in a loop, you should consider using StringBuilder instead of a regular String.
In the case of a single concatenation, you may not see any difference in execution time at all.
Here is a simple test app to prove the point:
static void Main(string[] args)
{
//warm-up rounds:
Test(500);
Test(500);
//test rounds:
Test(500);
Test(1000);
Test(10000);
Test(50000);
Test(100000);
Console.ReadLine();
}
private static void Test(int iterations)
{
int testLength = iterations;
Console.WriteLine($"----{iterations}----");
//TEST 1 - String
var startTime = DateTime.Now;
var resultString = "test string";
for (var i = 0; i < testLength; i++)
{
resultString += i.ToString();
}
Console.WriteLine($"STR: {(DateTime.Now - startTime).TotalMilliseconds}");
//TEST 2 - StringBuilder
startTime = DateTime.Now;
var stringBuilder = new StringBuilder("test string");
for (var i = 0; i < testLength; i++)
{
stringBuilder.Append(i.ToString());
}
string resultString2 = stringBuilder.ToString();
Console.WriteLine($"StringBuilder: {(DateTime.Now - startTime).TotalMilliseconds}");
Console.WriteLine("---------------");
Console.WriteLine("");
}
Results (in milliseconds):
----500----
STR: 0.1254
StringBuilder: 0
---------------
----1000----
STR: 2.0232
StringBuilder: 0
---------------
----10000----
STR: 28.9963
StringBuilder: 0.9986
---------------
----50000----
STR: 1019.2592
StringBuilder: 4.0079
---------------
----100000----
STR: 11442.9467
StringBuilder: 10.0363
---------------
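Several of the StringBuilder timings above read as 0, which is probably a resolution limit of DateTime.Now rather than a real zero. A hedged variant of the same test using System.Diagnostics.Stopwatch could look like this (ConcatTimer and its method names are hypothetical, and no timings are claimed here):

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical re-packaging of the test above with a finer-grained timer.
public static class ConcatTimer
{
    public static string BuildWithString(int iterations)
    {
        var s = "test string";
        for (int i = 0; i < iterations; i++)
        {
            s += i.ToString(); // allocates a new string on every pass
        }
        return s;
    }

    public static string BuildWithBuilder(int iterations)
    {
        var sb = new StringBuilder("test string");
        for (int i = 0; i < iterations; i++)
        {
            sb.Append(i); // appends into the builder's internal buffer
        }
        return sb.ToString();
    }

    // Runs the action once and returns the elapsed time in milliseconds.
    public static double TimeMs(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

Usage would be something like ConcatTimer.TimeMs(() => ConcatTimer.BuildWithString(10000)), repeated a few times after a warm-up round as in the original test.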
There's no definitive answer, only rules-of-thumb. My own personal rules go something like this:
If concatenating in a loop, always use a StringBuilder.
If the strings are large, always use a StringBuilder.
If the concatenation code is tidy and readable on the screen then it's probably ok.
If it isn't, use a StringBuilder.
To paraphrase
Then shalt thou count to three, no more, no less. Three shall be the number thou shalt count, and the number of the counting shall be three. Four shalt thou not count, neither count thou two, excepting that thou then proceed to three. Once the number three, being the third number, be reached, then lobbest thou thy Holy Hand Grenade of Antioch
I generally use StringBuilder for any block of code which would result in the concatenation of three or more strings.
Since it's difficult to find an explanation for this that's not either influenced by opinions or followed by a battle of prides I thought to write a bit of code on LINQpad to test this myself.
I found that using small sized strings rather than using i.ToString() changes response times (visible in small loops).
The test uses different sequences of iterations to keep time measurements in sensibly comparable ranges.
I'll copy the code at the end so you can try it yourself (results.Charts...Dump() won't work outside LINQPad).
Output (three line charts, not reproduced here; X-axis: number of iterations tested, Y-axis: time taken in ticks), one chart per sequence:
Iterations sequence: 2, 3, 4, 5, 6, 7, 8, 9, 10
Iterations sequence: 10, 20, 30, 40, 50, 60, 70, 80
Iterations sequence: 100, 200, 300, 400, 500
Code (Written using LINQPad 5):
void Main()
{
Test(2, 3, 4, 5, 6, 7, 8, 9, 10);
Test(10, 20, 30, 40, 50, 60, 70, 80);
Test(100, 200, 300, 400, 500);
}
void Test(params int[] iterationsCounts)
{
$"Iterations sequence: {string.Join(", ", iterationsCounts)}".Dump();
int testStringLength = 10;
RandomStringGenerator.Setup(testStringLength);
var sw = new System.Diagnostics.Stopwatch();
var results = new Dictionary<int, TimeSpan[]>();
// This call before starting to measure time removes initial overhead from first measurement
RandomStringGenerator.GetRandomString();
foreach (var iterationsCount in iterationsCounts)
{
TimeSpan elapsedForString, elapsedForSb;
// string
sw.Restart();
var str = string.Empty;
for (int i = 0; i < iterationsCount; i++)
{
str += RandomStringGenerator.GetRandomString();
}
sw.Stop();
elapsedForString = sw.Elapsed;
// string builder
sw.Restart();
var sb = new StringBuilder(string.Empty);
for (int i = 0; i < iterationsCount; i++)
{
sb.Append(RandomStringGenerator.GetRandomString());
}
sw.Stop();
elapsedForSb = sw.Elapsed;
results.Add(iterationsCount, new TimeSpan[] { elapsedForString, elapsedForSb });
}
// Results
results.Chart(r => r.Key)
.AddYSeries(r => r.Value[0].Ticks, LINQPad.Util.SeriesType.Line, "String")
.AddYSeries(r => r.Value[1].Ticks, LINQPad.Util.SeriesType.Line, "String Builder")
.DumpInline();
}
static class RandomStringGenerator
{
static Random r;
static string[] strings;
public static void Setup(int testStringLength)
{
r = new Random(DateTime.Now.Millisecond);
strings = new string[10];
for (int i = 0; i < strings.Length; i++)
{
strings[i] = Guid.NewGuid().ToString().Substring(0, testStringLength);
}
}
public static string GetRandomString()
{
var indx = r.Next(0, strings.Length);
return strings[indx];
}
}
But if I want to concatenate 2 strings, then I assume that it's better and faster to do so without StringBuilder. Is this correct?
Yes. But more importantly, it is vastly more readable to use a vanilla String in such situations. Using it in a loop, on the other hand, makes sense and can also be as readable as concatenation.
I’d be wary of rules of thumb that cite specific numbers of concatenation as a threshold. Using it in loops (and loops only) is probably just as useful, easier to remember and makes more sense.
As long as you can physically type the number of concatenations (a + b + c ...) it shouldn't make a big difference. N squared (at N = 10) is a 100X slowdown, which shouldn't be too bad.
The big problem is when you are concatenating hundreds of strings. At N = 100, you get a 10000X slowdown, which is pretty bad.
A single concatenation is not worth using a StringBuilder. I've typically used 5 concatenations as a rule of thumb.
I don't think there's a fine line between when to use it and when not to, unless of course someone has performed extensive testing to come up with the golden conditions.
For me, I will not use StringBuilder if I am just concatenating 2 huge strings. If there's a loop with a non-deterministic count, I'm likely to, even if the loop count might be small.
I have a two dimensional array that I need to load data into. I know the width of the data (22 values) but I do not know the height (estimated around 4000 records, but variable).
I have it declared as follows:
float[,] _calibrationSet;
....
int calibrationRow = 0;
while (recordsToRead)
{
for (int i = 0; i < SensorCount; i++)
{
_calibrationSet[calibrationRow, i] = calibrationArrayView.ReadFloat();
}
calibrationRow++;
}
This causes a NullReferenceException, so when I try to initialize it like this:
_calibrationSet = new float[,];
I get an "Array creation must have array size or array initializer."
Thank you,
Keith
You can't use an array.
Or rather, you would need to pick a size, and if you ended up needing more then you would have to allocate a new, larger, array, copy the data from the old one into the new one, and continue on as before (until you exceed the size of the new one...)
Generally, you would go with one of the collection classes - ArrayList, List<>, LinkedList<>, etc. - which one depends a lot on what you're looking for; List<> will give you the closest thing to what I described initially, while LinkedList<> will avoid the problem of frequent re-allocations (at the cost of slower access and greater memory usage).
Example:
List<float[]> _calibrationSet = new List<float[]>();
// ...
while (recordsToRead)
{
float[] record = new float[SensorCount];
for (int i = 0; i < SensorCount; i++)
{
record[i] = calibrationArrayView.ReadFloat();
}
_calibrationSet.Add(record);
}
// access later: _calibrationSet[record][sensor]
Oh, and it's worth noting (as Grauenwolf did) that what I'm doing here doesn't give you the same memory structure as a single, multi-dimensional array would: under the hood, it's an array of references to other arrays that actually hold the data. This speeds up building the array a good deal by making reallocation cheaper, but can have an impact on access speed (and, of course, memory usage). Whether this is an issue for you depends a lot on what you'll be doing with the data after it's loaded... and whether there are two hundred records or two million records.
You can't create an array in .NET (as opposed to declaring a reference to it, which is what you did in your example) without specifying its dimensions, either explicitly, or implicitly by specifying a set of literal values when you initialize it. (e.g. int[,] array4 = { { 1, 2 }, { 3, 4 }, { 5, 6 }, { 7, 8 } };)
You need to use a variable-size data structure first (a generic list of 22-element 1-d arrays would be the simplest) and then allocate your array and copy your data into it after your read is finished and you know how many rows you need.
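The final copy step described above might be sketched as follows. CalibrationLoader and ToRectangular are hypothetical names, and the sketch assumes every row has at least width elements:

```csharp
using System.Collections.Generic;

// Hypothetical helper for the "read into a list, then copy" approach.
public static class CalibrationLoader
{
    // Copies a list of equal-length rows into a rectangular float[,].
    // Assumes each row holds at least 'width' values.
    public static float[,] ToRectangular(List<float[]> rows, int width)
    {
        // Only now is the row count known, so the array can be allocated.
        var result = new float[rows.Count, width];
        for (int r = 0; r < rows.Count; r++)
        {
            for (int c = 0; c < width; c++)
            {
                result[r, c] = rows[r][c];
            }
        }
        return result;
    }
}
```

In the asker's case the rows would be the 22-element records collected while reading, and width would be SensorCount.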
I would just use a list, then convert that list into an array.
You will notice here that I used a jagged array (float[][]) instead of a square array (float [,]). Besides being the "standard" way of doing things, it should be much faster. When converting the data from a list to an array you only have to copy [calibrationRow] pointers. Using a square array, you would have to copy [calibrationRow] x [SensorCount] floats.
var tempCalibrationSet = new List<float[]>();
const int SensorCount = 22;
while (recordsToRead())
{
    // Fill one row, then append it; indexing into an empty list would throw.
    var record = new float[SensorCount];
    for (int i = 0; i < SensorCount; i++)
    {
        record[i] = calibrationArrayView.ReadFloat();
    }
    tempCalibrationSet.Add(record);
}
float[][] _calibrationSet = tempCalibrationSet.ToArray();
I generally use the nicer collections for this sort of work (List, ArrayList etc.) and then (if really necessary) copy into a T[,] when I'm done.
You would either need to preallocate the array to a maximum size (float[999,22]), or use a different data structure.
I guess you could copy/resize on the fly (but I don't think you'd want to).
I think the List sounds reasonable.
You could also use a two-dimensional ArrayList (from System.Collections) -- you create an ArrayList, then put another ArrayList inside it. This will give you the dynamic resizing you need, but at the expense of a bit of overhead.