Is it possible to save a multidimensional ArrayList to Properties.Settings? I get errors when I try.
If not, is there any other way I can save some sort of multidimensional structure to Properties.Settings?
Ah, I remember fighting with Properties.Settings... never a fun battle :(. The quick and dirty option would be to use different separator characters to do a string "serialization" (in quotes because a real serialization would be much better about handling edge cases etc.). Something like this:
int[][] myArray = GetArrayFromElsewhere();
string stringVersion = string.Join(";", myArray.Select(subArray => string.Join(",", subArray)));
Properties.Settings.Default.StringVersion = stringVersion; // assumes a string setting named StringVersion
Properties.Settings.Default.Save();
You could also use esoteric Unicode characters instead of ; and ,, so as to avoid accidentally splitting a string, and you could use a loop (or probably recursion) to generalize this to any number of dimensions.
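Reading the value back is the mirror image: split on the outer separator, then on the inner one. Here is a minimal sketch, assuming the same ; and , separators as above. The class and method names are made up for illustration, and Decode will throw on an empty sub-array because int.Parse("") fails:

```csharp
using System;
using System.Linq;

static class JaggedArrayText
{
    // Same "serialization" as above: rows joined by ';', items by ','.
    public static string Encode(int[][] rows)
    {
        return string.Join(";", rows.Select(row => string.Join(",", row)));
    }

    // Reverse direction: split rows on ';', items on ',', parse each item.
    public static int[][] Decode(string text)
    {
        return text.Split(';')
                   .Select(row => row.Split(',').Select(item => int.Parse(item)).ToArray())
                   .ToArray();
    }
}
```

The string returned by Encode is what you would store in the setting, and Decode is what you would call after loading it.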
But this is, of course, a quick and dirty workaround. The real solution would be to do some sort of serialization of the multidimensional array. You might be able to get away with some simple XmlSerializer or BinaryFormatter or even JavaScriptSerializer code (actually, I think the last of those might work really well), but if you need to get more complicated, there is a question that discusses a similar solution for a hash table, with lots of gory details.
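As a rough sketch of the XmlSerializer route, assuming the data is a jagged int[][] (XmlSerializer does not handle rectangular int[,] arrays) and that the result goes into a string setting; the helper class name is hypothetical:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

static class JaggedArrayXml
{
    // Serializes the jagged array to an XML string suitable for a string setting.
    public static string ToXml(int[][] value)
    {
        var serializer = new XmlSerializer(typeof(int[][]));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Rebuilds the jagged array from the XML produced by ToXml.
    public static int[][] FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(int[][]));
        using (var reader = new StringReader(xml))
        {
            return (int[][])serializer.Deserialize(reader);
        }
    }
}
```

Unlike the separator trick, this keeps working when the element type changes to something whose text form could contain the separator characters.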
I am very new to C# and am trying to feel it out. Slow going so far! What I am trying to achieve should be relatively simple; I want to read a row from a CSV file with a search. I.e. if I search for username "Toby" it would fetch the entire row, preferably as an array.
Here is my users.csv file:
Id,Name,Password
1,flugs,password
2,toby,foo
I could post the code that I've tried, but I haven't even come close in previous attempts. It's a bit easier to do such a thing in Python, it may be easy in C# too but I'm far too new to know!
Does anyone have any ideas as to how I should approach/code this? Many thanks.
It's easy to do in C# too:
var lineAsArray = File.ReadLines("path").First(s => s.Contains(",toby,")).Split(',');
If you want a case-insensitive match, use e.g. Contains(",toby,", StringComparison.OrdinalIgnoreCase). On .NET Framework, which lacks that overload, use IndexOf(",toby,", StringComparison.OrdinalIgnoreCase) >= 0 instead.
If your user is going to type in "Toby", you can either add a comma to the start and end of it and keep this simplistic search (which will find ,Toby, anywhere on the line, not just in the Name column), or you can split the line first and check whether the second element is Toby:
var lineAsArray = File.ReadLines("path").Select(s => s.Split(',')).First(a => a[1].Equals("toby"));
To make this one case-insensitive, pass a suitable StringComparison argument to Equals, using the same approach as above.
The sky's the limit with how involved you want to get; using a library that parses CSV into objects representing your lines, with named, typed properties, is probably where I'd stop. Take a look at CsvHelper by Josh Close or ServiceStack.Text, though there is no shortage of CSV parser libraries. It's been done to death!
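Putting the "split first, then compare the second element" idea into a small helper might look like this. It is a sketch under a few assumptions: the Name column is at index 1 as in the sample file, the first line is a header, and no field contains a quoted comma (a real CSV library handles that case). CsvLookup and FindByName are illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CsvLookup
{
    // Returns the fields of the first row whose Name column (index 1)
    // matches, ignoring case, or null when no row matches.
    public static string[] FindByName(IEnumerable<string> lines, string name)
    {
        return lines
            .Skip(1)                               // skip the "Id,Name,Password" header
            .Select(line => line.Split(','))
            .FirstOrDefault(fields => fields.Length > 1 &&
                fields[1].Equals(name, StringComparison.OrdinalIgnoreCase));
    }
}
```

You would call it as CsvLookup.FindByName(File.ReadLines("users.csv"), "Toby"). Using FirstOrDefault rather than First means a missing user gives you null instead of an exception.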
We have a requirement to transform a string containing a date in dd/mm/yyyy format to ddmmyyyy format (In case you want to know why I am storing dates in a string, my software processes bulk transactions files, which is a line based textual file format used by a bank).
And I am currently doing this:
string oldFormat = "01/01/2014";
string newFormat = oldFormat.Replace("/", "");
Sure enough, this converts "01/01/2014" to "01012014". But my question is, does the replace happen in one step, or does it create an intermediate string (e.g.: "0101/2014" or "01/012014")?
Here's the reason why I am asking this:
I am processing transaction files ranging in size from few kilobytes to hundreds of megabytes. So far I have not had a performance/memory problem, because I am still testing with very small files. But when it comes to megabytes I am not sure if I will have problems with these additional strings. I suspect that would be the case because strings are immutable. With millions of records this additional memory consumption will build up considerably.
I am already using StringBuilders for output file creation. And I also know that the discarded strings will be garbage collected (at some point before the end of time). I was wondering if there is a better, more efficient way of replacing all occurrences of a specific character/substring in a string that does not create an additional string.
Sure enough, this converts "01/01/2014" to "01012014". But my question
is, does the replace happen in one step, or does it create an
intermediate string (e.g.: "0101/2014" or "01/012014")?
No, it doesn't create intermediate strings for each replacement. But it does create a new string, because, as you already know, strings are immutable.
Why?
There is no reason to create a new string on each replacement. It's very simple to avoid, and avoiding it gives a huge performance boost.
If you are very interested, referencesource.microsoft.com and the SSCLI 2.0 source code demonstrate this (see the question "How to see code of method which marked as MethodImplOptions.InternalCall"):
FCIMPL3(Object*, COMString::ReplaceString, StringObject* thisRefUNSAFE,
        StringObject* oldValueUNSAFE, StringObject* newValueUNSAFE)
{
    // unnecessary code omitted
    while (((index=COMStringBuffer::LocalIndexOfString(thisBuffer,oldBuffer,
             thisLength,oldLength,index))>-1) && (index<=endIndex-oldLength))
    {
        replaceIndex[replaceCount++] = index;
        index+=oldLength;
    }
    if (replaceCount != 0)
    {
        //Calculate the new length of the string and ensure that we have
        // sufficient room.
        INT64 retValBuffLength = thisLength -
            ((oldLength - newLength) * (INT64)replaceCount);
        gc.retValString = COMString::NewString((INT32)retValBuffLength);
        // unnecessary code omitted
    }
}
As you can see, retValBuffLength is calculated from replaceCount, so the result buffer is sized once for all replacements. The real implementation may be a bit different in .NET 4.0 (SSCLI 4.0 has not been released), but I assure you it's not doing anything silly :-).
I was wondering if there is a better, more efficient way of replacing
all occurrences of a specific character/substring in a string, that
does not additionally create an string.
Yes: a reusable StringBuilder with a capacity of ~2000 characters, which avoids memory allocation. This is only true if the replacement lengths are equal, and it can get you a nice performance gain if you're in a tight loop.
Before writing anything, run benchmarks with big files, and see if the performance is enough for you. If performance is enough - don't do anything.
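One way to act on the StringBuilder advice, given that the output file is already being built with one, is to copy the characters you want to keep straight into the output builder, so the cleaned date never exists as a separate string. This is only a sketch: AppendWithout is a hypothetical helper, not a framework method, and whether it actually beats String.Replace for your files is something only a benchmark can tell you.

```csharp
using System.Text;

static class DateFieldWriter
{
    // Appends value to output, skipping every occurrence of skip.
    // No intermediate string is created for the cleaned value.
    public static void AppendWithout(StringBuilder output, string value, char skip)
    {
        foreach (char c in value)
        {
            if (c != skip)
            {
                output.Append(c);
            }
        }
    }
}
```

With a builder created once as new StringBuilder(2000) and reused per record, calling DateFieldWriter.AppendWithout(builder, "01/01/2014", '/') appends 01012014 to it.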
Well, I'm not a .NET development team member (unfortunately), but I'll try to answer your question.
Microsoft has a great site of .NET Reference Source code, and according to it, String.Replace calls an external method that does the job. I wouldn't argue about how it is implemented, but there's a small comment to this method that may answer your question:
// This method contains the same functionality as StringBuilder Replace. The only difference is that
// a new String has to be allocated since Strings are immutable
Now, if we follow the StringBuilder.Replace implementation, we'll see what it actually does inside.
A little more on string objects:
Although String is immutable in .NET, this is not some kind of limitation, it's a contract. String is actually a reference type, and what it includes is the length of the actual string + the buffer of characters. You can actually get an unsafe pointer to this buffer and change it "on the fly", but I wouldn't recommend doing this.
Now, the StringBuilder class also holds a character array, and when you pass the string to its constructor it actually copies the string's buffer to its own (see Reference Source). What it doesn't have, though, is the contract of immutability, so when you modify a string using StringBuilder you are actually working with the char array. Note that when you call ToString() on a StringBuilder, it creates a new "immutable" string and copies its buffer there.
So, if you need a fast and memory-efficient way to make changes in a string, StringBuilder is definitely your choice, especially given that Microsoft explicitly recommends using StringBuilder if you "perform repeated modifications to a string".
I haven't found any sources, but I strongly doubt that the implementation creates a new string for every single replacement. I'd implement it with a StringBuilder internally too. So String.Replace is absolutely fine if you want to run one replacement over a huge string. But if you have to call Replace many times, you should consider StringBuilder.Replace, because every call to String.Replace creates a new string.
So you can use StringBuilder.Replace since you're already using a StringBuilder.
Is StringBuilder.Replace() more efficient than String.Replace?
String.Replace() vs. StringBuilder.Replace()
There is no string method for that. You are on your own. But you can try something like this:
string oldFormat = "dd/mm/yyyy";
string[] dt = oldFormat.Split('/');
string newFormat = string.Format("{0}{1}{2}", dt[0], dt[1], dt[2]); // "ddmmyyyy"
or
StringBuilder sb = new StringBuilder(dt[0]);
sb.AppendFormat("{0}{1}", dt[1], dt[2]);
At the moment I maintain a quirky codebase, and I have come across the same pattern more than 100 times:
string NotMySqlQuery = ""; //why initialize the string with "", only to overwrite it on the next line?
NotMySqlQuery = "The query to be executed";
Since I came across this so often, I now doubt my own good judgement.
Is this a trick to help the compiler optimize, or does it bring any other advantages?
It reminds me a bit of the old times when I did write some code in C++, but it still doesn't look like proper dealing with strings to me.
Why would someone write code like that?
There is no performance advantage to that syntax. If anything it is slightly worse than initializing the variable directly: the first assignment is a dead store that does nothing useful. Both values are string literals, which are interned, so the cost is tiny, but it is never a gain.
For your simple case, it is better to merge the two lines into one. There is no point in assigning an empty string and then immediately assigning another value.
string NotMySqlQuery = "The query to be executed";
This is clearer.
I need a list of strings and a way to quickly determine if a string is contained within that list.
To enhance lookup speed, I considered SortedList and Dictionary; however, both work with KeyValuePairs when all I need is a single string.
I know I could use a KeyValuePair and simply ignore the Value portion. But I do prefer to be efficient and am just wondering if there is a collection better suited to my requirements.
If you're on .NET 3.5 or higher, use HashSet<String>.
Failing that, a Dictionary<string, byte> (or whatever type you want for the TValue type parameter) would be faster than a SortedList if you have a lot of entries - the latter will use a binary search, so it'll be O(log n) lookup, instead of O(1).
If you just want to know if a string is in the set use HashSet<string>
This sounds like a job for
var keys = new HashSet<string>();
Per MSDN: The Contains function has O(1) complexity.
But you should be aware that it does not throw an error for duplicates when adding; Add simply returns false.
HashSet<string> is like a Dictionary, but with only keys.
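A short sketch of the HashSet<string> behaviour described in these answers. The case-insensitive comparer is an optional extra chosen for the example, not something the question asked for, and the class name is made up:

```csharp
using System;
using System.Collections.Generic;

static class KeySetDemo
{
    public static HashSet<string> Build()
    {
        // Passing a comparer is optional; the default comparison is ordinal.
        var keys = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        keys.Add("alpha");   // returns true: newly added
        keys.Add("beta");    // returns true
        keys.Add("ALPHA");   // returns false: already present, no exception

        return keys;
    }
}
```

After Build(), keys.Contains("Beta") is true and keys.Count is 2, because the duplicate was silently ignored.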
If you feel like rolling your own data structure, use a Trie.
http://en.wikipedia.org/wiki/Trie
The worst case is when the string is present: O(length of string).
I know this answer is a bit late to the party, but I was running into an issue where our systems were running slow. After profiling, we found out there were a LOT of string lookups happening because of the way we had structured our data.
So we did some research, came across these benchmarks, did our own tests, and have switched over to using SortedList now.
if (sortedlist.ContainsKey(thekey))
{
//found it.
}
Even though a Dictionary proved to be faster, it was less code we had to refactor, and the performance increase was good enough for us.
Anyway, wanted to share the website in case other people are running into similar issues. They do comparisons between data structures where the string you're looking for is a "key" (like HashTable, Dictionary, etc) or in a "value" (List, Array, or in a Dictionary, etc) which is where ours are stored.
I know the question is old as hell, but I just had to solve the same problem, only for a very small set of strings (between 2 and 4).
In my case, I actually used a manual lookup over an array of strings, which turned out to be much faster than HashSet<string> (I benchmarked it).
for (int i = 0; i < this.propertiesToIgnore.Length; i++)
{
if (this.propertiesToIgnore[i].Equals(propertyName))
{
return true;
}
}
Note that it is better than a hash set only for tiny arrays!
EDIT: this works only with a manual for loop; do not use LINQ (details in the comments).
I have a Dictionary with string keys and string values. The keys must never change or be deleted, but I keep appending lines and lines to the values, creating new lines with \r\n or \r. I'm wondering what the easiest way would be to retain just the last 50 lines and delete anything over 50 lines. I'm doing this because, when I return a value, I have to put it through a char array and go through each character, and this can be slow if there is too much data. Any suggestions?
Guffa's general idea is right - your data structure should reflect what you actually want, which is a list of strings rather than a single string. The concept of "the last 50 lines" is pretty obviously to do with a collection rather than a single string, even if you've originally read it that way.
However, I'd suggest using a LinkedList<T> rather than a List<T>: every time you remove the first element of a List<T>, everything else has to shuffle up. List<T> is great for giving random access and not too bad at adding to the end, but sucks for removing from the start. LinkedList<T> is great at giving you iterator access, adding to / removing from the start, and adding to / removing from the end. It's a better fit. (If you really wanted to go to town you could even write your own fixed-size circular buffer type which encapsulated the logic for you; this would give the best of both worlds, in the situation where you don't want to be able to expand beyond a certain size.)
Regarding your comments to Guffa's answer: it's pretty common to convert input into a form which is more appropriate for processing, then convert it back to the original format for output. The reason why you do it is precisely the "more appropriate" bit. You don't want to have to parse the string for line breaks as part of the "updating the dictionary" action, IMO. In particular, it sounds like you're currently introducing the idea of "lines" where the original text is just being read in as strings. You're effectively creating your own "collection" class backed by a string, by delimiting strings with line breaks. That's inefficient, error-prone, and much harder to manage than using the built-in collections. It's easy to perform the conversion to a line-break-delimited string at the end if you want it, but it sounds like you're doing it way too early.
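A minimal sketch of the LinkedList<T> approach, with the conversion back to a line-break-delimited string done only at the end. RecentLines and its members are illustrative names, and the sketch assumes a capacity of at least 1:

```csharp
using System.Collections.Generic;

class RecentLines
{
    private readonly int _capacity;   // assumed to be >= 1
    private readonly Dictionary<string, LinkedList<string>> _lines =
        new Dictionary<string, LinkedList<string>>();

    public RecentLines(int capacity)
    {
        _capacity = capacity;
    }

    // Adds a line under key, dropping the oldest line once the limit is reached.
    public void Add(string key, string line)
    {
        LinkedList<string> list;
        if (!_lines.TryGetValue(key, out list))
        {
            list = new LinkedList<string>();
            _lines.Add(key, list);
        }
        if (list.Count == _capacity)
        {
            list.RemoveFirst();   // O(1); nothing has to shuffle up
        }
        list.AddLast(line);
    }

    // Converts back to the original line-break-delimited format for output.
    public string GetText(string key)
    {
        LinkedList<string> list;
        return _lines.TryGetValue(key, out list)
            ? string.Join("\r\n", list)
            : string.Empty;
    }
}
```

For the case in the question you would create it as new RecentLines(50), call Add for each incoming line, and call GetText only when the combined string is actually needed.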
Instead of concatenating the lines, use a Dictionary<string, List<string>>. When you are about to add a string to the list you can check the count and remove the first string if the list already has 50 strings:
List<string> list;
if (!theDictionary.TryGetValue(key, out list)) {
theDictionary.Add(key, list = new List<string>());
}
if (list.Count == 50) {
list.RemoveAt(0);
}
list.Add(line);