Is there any side effect of passing and extra argument to string.Format function in C#? I was looking at the string.Format function documentation at MSDN ( http://msdn.microsoft.com/en-us/library/b1csw23d.aspx) but unable to find an answer.
Eg:-
string str = string.Format("Hello_{0}", 255, 555);
Now, as you can see that according to format string, we are suppose to pass only one argument after it but I have passed two.
EDIT:
I have tried it on my end and everything looks fine to me. Since I am new to C# and from C background, I just want to make sure that it will not cause any problem in later run.
Looking in Reflector, it will allocate a little more memory for building the string, but there's no massive repercussion for passing in an extra object.
There's also the "side effect" that, if you accidentally included a {n} in your format string where n was too large, and then added some spare arguments, you'd no longer get an exception but get a string with unexpected items in.
If you look at the exception section of the link you provide for string.Format
"The index of a format item is less than zero, or greater than or equal to the length of the args array."
Microsoft doesn't indicate that it can throw if you have too much arguments, so it won't. The effect is a small loss of memory due to an useless parameter
Related
We have a requirement to transform a string containing a date in dd/mm/yyyy format to ddmmyyyy format (In case you want to know why I am storing dates in a string, my software processes bulk transactions files, which is a line based textual file format used by a bank).
And I am currently doing this:
string oldFormat = "01/01/2014";
string newFormat = oldFormat.Replace("/", "");
Sure enough, this converts "01/01/2014" to "01012014". But my question is, does the replace happen in one step, or does it create an intermediate string (e.g.: "0101/2014" or "01/012014")?
Here's the reason why I am asking this:
I am processing transaction files ranging in size from few kilobytes to hundreds of megabytes. So far I have not had a performance/memory problem, because I am still testing with very small files. But when it comes to megabytes I am not sure if I will have problems with these additional strings. I suspect that would be the case because strings are immutable. With millions of records this additional memory consumption will build up considerably.
I am already using StringBuilders for output file creation. And I also know that the discarded strings will be garbage collected (at some point before the end of the time). I was wondering if there is a better, more efficient way of replacing all occurrences of a specific character/substring in a string, that does not additionally create an string.
Sure enough, this converts "01/01/2014" to "01012014". But my question
is, does the replace happen in one step, or does it create an
intermediate string (e.g.: "0101/2014" or "01/012014")?
No, it doesn't create intermediate strings for each replacement. But it does create new string, because, as you already know, strings are immutable.
Why?
There is no reason to a create new string on each replacement - it's very simple to avoid it, and it will give huge performance boost.
If you are very interested, referencesource.microsoft.com and SSCLI2.0 source code will demonstrate this(how-to-see-code-of-method-which-marked-as-methodimploptions-internalcall):
FCIMPL3(Object*, COMString::ReplaceString, StringObject* thisRefUNSAFE,
StringObject* oldValueUNSAFE, StringObject* newValueUNSAFE)
{
// unnecessary code ommited
while (((index=COMStringBuffer::LocalIndexOfString(thisBuffer,oldBuffer,
thisLength,oldLength,index))>-1) && (index<=endIndex-oldLength))
{
replaceIndex[replaceCount++] = index;
index+=oldLength;
}
if (replaceCount != 0)
{
//Calculate the new length of the string and ensure that we have
// sufficent room.
INT64 retValBuffLength = thisLength -
((oldLength - newLength) * (INT64)replaceCount);
gc.retValString = COMString::NewString((INT32)retValBuffLength);
// unnecessary code ommited
}
}
as you can see, retValBuffLength is calculated, which knows the amount of replaceCount's. The real implementation can be a bit different for .NET 4.0(SSCLI 4.0 is not released), but I assure you it's not doing anything silly :-).
I was wondering if there is a better, more efficient way of replacing
all occurrences of a specific character/substring in a string, that
does not additionally create an string.
Yes. Reusable StringBuilder that has capacity of ~2000 characters. Avoid any memory allocation. This is only true if the the replacement lengths are equal, and can get you a nice performance gain if you're in tight loop.
Before writing anything, run benchmarks with big files, and see if the performance is enough for you. If performance is enough - don't do anything.
Well, I'm not a .NET development team member (unfortunately), but I'll try to answer your question.
Microsoft has a great site of .NET Reference Source code, and according to it, String.Replace calls an external method that does the job. I wouldn't argue about how it is implemented, but there's a small comment to this method that may answer your question:
// This method contains the same functionality as StringBuilder Replace. The only difference is that
// a new String has to be allocated since Strings are immutable
Now, if we'll follow to StringBuilder.Replace implementation, we'll see what it actually does inside.
A little more on a string objects:
Although String is immutable in .NET, this is not some kind of limitation, it's a contract. String is actually a reference type, and what it includes is the length of the actual string + the buffer of characters. You can actually get an unsafe pointer to this buffer and change it "on the fly", but I wouldn't recommend doing this.
Now, the StringBuilder class also holds a character array, and when you pass the string to its constructor it actually copies the string's buffer to his own (see Reference Source). What it doesn't have, though, is the contract of immutability, so when you modify a string using StringBuilder you are actually working with the char array. Note that when you call ToString() on a StringBuilder, it creates a new "immutable" string any copies his buffer there.
So, if you need a fast and memory efficient way to make changes in a string, StringBuilder is definitely your choice. Especially regarding that Microsoft explicitly recommends to use StringBuilder if you "perform repeated modifications to a string".
I haven't found any sources but i strongly doubt that the implementation creates always new strings. I'd implement it also with a StringBuilder internally. Then String.Replace is absolutely fine if you want to replace once a huge string. But if you have to replace it many times you should consider to use StringBuilder.Replace because every call of Replace creates a new string.
So you can use StringBuilder.Replace since you're already using a StringBuilder.
Is StringBuilder.Replace() more efficient than String.Replace?
String.Replace() vs. StringBuilder.Replace()
There is no string method for that. You are own your own. But you can try something like this:
oldFormat="dd/mm/yyyy";
string[] dt = oldFormat.Split('/');
string newFormat = string.Format("{0}{1}/{2}", dt[0], dt[1], dt[2]);
or
StringBuilder sb = new StringBuilder(dt[0]);
sb.AppendFormat("{0}/{1}", dt[1], dt[2]);
according to my understanding, a base64 encoded string (ie the output of encode) must always be a multiple of 4.
the c# Convert.FromBase64String says that its input must be a multiple of 4
However if I give it a 25 character string it doesnt complain
[convert]::FromBase64String("ei5gsIELIki+GpnPGyPVBA==")
[convert]::FromBase64String("1ei5gsIELIki+GpnPGyPVBA==")
both work. (The first one is 24 , second is 25)
[convert]::FromBase64String("11ei5gsIELIki+GpnPGyPVBA==")
fails with Invalid length exception
I assume this is a bug in the c# library but I just want to make sure - I am writing code that is sniffing strings to see if they are valid base64 strings and I want to be sure that I understand what a valid one looks like (one possible implementation was to give the string to system.convert and see if it threw - why reinvent perfectly good code)
Yes, this is a flaw (aka bug). It got started due to a perf optimization in an internal helper function named FromBase64_ComputeResultLength() which calculates the length of the byte[] result. It has this comment (edited to fit):
// For legal input, we can assume that 0 <= padding < 3. But it may be
// more for illegal input.
// We will notice it at decode when we see a '=' at the wrong place.
The "we will notice" remark is not entirely accurate, the decoder does flag an '=' if one isn't expected but it fails to check if there's one too many. Which is the case for the 25-char string.
You can report the problem at connect.microsoft.com, I don't see an existing report that resembles it. Do note that it is fairly unlikely that Microsoft can actually fix it any time soon since the change is going to break existing programs that now successfully parse bad base64 strings. It normally requires a major .NET release update to get rid of such problems, like it was done for .NET 4.0, there isn't one on the horizon afaik.
But yes, the simple workaround for you is to check if the string length is divisible by 4, use the % operator.
I have a C# function that persists data to SQL-server through a stored procedure with a little more than 100 parameters. (Yes, I know, that's probably not a good design, but legacy code and lets not get into that here.) ;-)
For some reason this code now comes up with this error:
System.Data.SqlClient.SqlException was unhandled
Message=Error converting data type int to tinyint.
Now, I've taken a first look down the parameter- and variable-pairs, and there is no obvious culprit. (The parameters are of many datatypes, not just tinyint)
So what is the fastest way to determine the rogue variable? And does anybody know of a good way to either handle this proactively, by making the c#-side verify the variable against the parameter at run-time - or somehow extending the error message to tell exactly which parameter is failing?
Implementing an ORM is not an option at this stage.
Edit: I've started writing a function CheckParameters that will initially loop through the parameters and simply list the parameters and the corresponding value. I'm thinking it could be extended with actual knowledge of the different datatypes, but for now I just want it to aid in finding the bad variable.
Structure is as follows:
try
{
cmdStat.ExecuteNonQuery();
}
catch(SqlException)
{
CheckParameters(cmdStat.Parameters);
//re-throw
}
private void CheckParameters(SqlParameterCollection parameterCollection)
{
foreach (SqlParameter parameter in parameterCollection)
{
Trace.Write("{0}, {1}, {2}, {3}", parameter.ParameterName, parameter.DbType, parameter.SqlDbType, parameter.Value);
}
}
What I would do is replace the proc with a stub.
Then I'd binary chop half the parameters to be long enough to cope with int. If the error went away I'd binary chop only half, and check again. If the error didn't go away then I'd do the other half.
I suspect it would take at max less than 10 attempts to find it.
Today I read an article where it's written that we should always use TryParse(string, out MMM) for conversion rather than Convert.ToMMM().
I agree with article but after that I got stuck in one scenario.
When there will always be some valid value for the string and hence we can also use Convert.ToMMM() because we don't get any exception from Convert.ToMMM().
What I would like to know here is: Is there any performance impact when we use TryParse because when I know that the out parameter is always going to be valid then we can use Convert.ToMMM() rather TryParse(string, out MMM).
What do you think?
If you know the value can be converted, just use Parse(). If you 'know' that it can be converted, and it can't, then an exception being thrown is a good thing.
EDIT: Note, this is in comparison to using TryParse or Convert without error checking. If you use either of the other methods with proper error checking then the point is moot. I'm just worried about your assumption that you know the value can be converted. If you want to skip the error checking, use Parse and die immediately on failure rather than possibly continuing and corrupting data.
When the input to TryParse/Convert.ToXXX comes from user input, I'd always use TryParse. In case of database values, I'd check why you get strings from the database (maybe bad design?). If string values can be entered in the database columns, I'd also use TryParse as you can never be sure that nobody modifies data manually.
EDIT
Reading Matthew's reply: If you are unsure and would wrap the conversion in a try-catch-block anyway, you might consider using TryParse as it is said to be way faster than doing a try-catch in that case.
There is significant difference regarding the developing approach you use.
Convert: Converting one "primitive" data in to another type and corresponding format using multiple options
Case and point - converting an integer number in to its bit by bit representation. Or hexadecimal number (as string) in to integer, etc...
Error Messages : Conversion Specific Error Message - for problems in multiple cases and at multiple stages of the conversion process.
TryParse: Error-less transfer from one data format to another. Enabling T/F control of possible or not.
Error Messages: NONE
NB: Even after passing the data in to a variable - the data passed is the default of the type we try to parse in to.
Parse: in essence taking some data in one format and transfer it in to another. No representations and nothing fancy.
Error Messages: Format-oriented
P.S. Correct me if I missed something or did not explain it well enough.
Im playing around with TryParse()
But lets say the parsing fails, then returns false, and ... nothing..
Is there a way to get info back about what failed the parsing?
I saw something like that at codeproject, but i didnt really understand it.
Thanks :)
No, there's no way of getting that information from the normal .NET routines. You could check for a few things manually:
Try parsing the number as a decimal. If that works, but parsing as an integer doesn't, then it's either out of the range of the integer, or it's not an integer.
Look for non-decimal, non +/-, non-decimal-point characters
Check whether it's an empty string
You haven't said what you're trying to parse (integer, double etc) or what options you want (allow hex, thousands separators etc) which makes it harder to give a good list of things to check.
The TryParse() method is there when you want to be shielded from any exceptions.
If you want to see the exceptions then why not use the standard Parse() method in a try/catch block which will allow you to view any FormatExceptions etc thrown?
As expected, with exception handling, this could impact performance however if the Parse() is expected to succeed then this should be tolerable.
Why not just use the regular Parse method instead?