I have a few scenarios where I need to store an unlimited value (or maximum, whatever you like to call it), which represents "no limit" from the business point of view.
A few options I considered:
Make the field nullable and use DB NULL to represent this case. The problem is that I have to check for it everywhere I do a comparison or display the value.
Use the actual maximum value of the given type (for an integer, for example, I can use the largest Int32 value). This needs some tweaks at the DB level, though: I have to write a constraint on the field to limit the maximum value (since the column could be a fixed-length decimal or integer DB type), and the value may have no meaning to the business either.
Use a predefined big value (one that might make sense to the business) to represent it and store it at the DB level. Again, I have to write a constraint on the DB field.
I have used all of them before for different scenarios, and all are not too bad, but you know, it's a pain to handle some specific cases.
My question is a bit broad: what do you guys suggest for this? What good/best practices are available?
Any help/suggestions are appreciated.
I would think that storing it as a separate column, IsXyzUnlimited, may be a good alternate practice.
Since it doesn't mean null, it may not be best to represent it as null. As you mentioned, there is also the problem of checking it before you invoke it.
Also, as you mentioned, the other 2 values could have business meaning. If you want the data to be self-revealing about the business, explicitly say "hey business, this thing is unlimited when this box is checked". No magic values.
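A minimal C# sketch of the separate-flag idea. The DownloadQuota type and its members are made up for illustration; IsUnlimited and MaxItems stand for the two columns, and nothing here comes from the original schema.

```csharp
using System;

// Hypothetical entity: IsUnlimited and MaxItems map to two separate columns.
public sealed class DownloadQuota
{
    public bool IsUnlimited { get; }
    public int MaxItems { get; }   // only meaningful when IsUnlimited is false

    private DownloadQuota(bool isUnlimited, int maxItems)
    {
        IsUnlimited = isUnlimited;
        MaxItems = maxItems;
    }

    public static DownloadQuota Unlimited() => new DownloadQuota(true, 0);

    public static DownloadQuota LimitedTo(int maxItems)
    {
        if (maxItems < 0) throw new ArgumentOutOfRangeException(nameof(maxItems));
        return new DownloadQuota(false, maxItems);
    }

    // The one place that knows how "unlimited" affects a comparison.
    public bool Allows(int requested) => IsUnlimited || requested <= MaxItems;

    // The one place that knows how "unlimited" is displayed.
    public override string ToString() => IsUnlimited ? "Unlimited" : MaxItems.ToString();
}
```

Keeping the comparison and the display logic inside the type means callers do not have to repeat the "is it unlimited?" check everywhere.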
Let’s say I have double length that can be either a real length or not ready yet since we got no length yet in the server and there is nothing to send to the client. We need to pass this length from the server to the client as part of a fixed data protocol. The client currently uses the length only once, but might use it more than that in the future.
1. Pass double length and bool isLengthValid, and check isLengthValid in every place you use length.
- Clean design without mixing data types, but the user has to remember to check.
2. Pass double? length, and check length == null in every place you use length.
- The design is clear (since it's a nullable), but only if you look at the type. Also, there will be an exception if someone uses it without checking (good or bad, depending on how you look at it).
3. Make a class Length instead of double. The class will have a clear interface, such as GetLengthIfYouCheckedIt or something.
- Very readable and hard to make mistakes, but the design is a little overdone.
What is your solution?
I say option 2:
What you want is precisely why nullables were introduced.
Instead of adding a method to check whether it's a valid number or not, you'd use the built-in Nullable<double>.HasValue, just as it was meant to be used.
Making a class for Length makes it doubly closed: it's only for LENGTH and it holds a Double. Think of how many of such classes you'll have to make and maintain for TIME/DateTime, MONEY/Decimal etc. It will never end.
Option 1 is just your own hand-rolled Nullable<T>, rewrapped under another name.
In other words, enforce the DRY principle, and use Nullable<T> ;)
HTH,
Bab.
I'd pass a double?. That's essentially a double plus a bool indicating whether it's valid, so option 1 would just be reinventing nullable. I think that option 3 is overkill.
My advice would be to use a nullable, like this: public Double? Length;
You will get members like Length.HasValue and Length.Value. This makes the code easy to read and quicker for you to write (I mean there is no need to write a new class, etc.).
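For what it's worth, here is a small sketch of how the nullable version might be consumed. LengthFormatter, Describe and OrZero are hypothetical names used only for illustration; HasValue, Value and GetValueOrDefault are the built-in members of Nullable<double>.

```csharp
using System;
using System.Globalization;

public static class LengthFormatter   // hypothetical helper
{
    // 'length' is null while the server has no value to send yet.
    public static string Describe(double? length)
    {
        return length.HasValue
            ? "Length: " + length.Value.ToString("0.##", CultureInfo.InvariantCulture)
            : "Length not ready yet";
    }

    // GetValueOrDefault supplies a fallback instead of throwing on null.
    public static double OrZero(double? length) => length.GetValueOrDefault(0.0);
}
```

Reading length.Value while it is null throws an InvalidOperationException, which is the "exception if someone uses it without checking" mentioned in the question.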
Why not just keep it as a length parameter but return -1?
If possible, I would suggest making the request async, so that you do not return anything to the client until the data is actually ready.
If that is not possible, go with the second option.
I need to know whether two references from completely different parts of the program refer to the same object.
I cannot compare the references programmatically because they are from different contexts (one reference is not visible from another and vice versa).
So I want to print a unique identifier for each object using Console.WriteLine(). But the ToString() method doesn't return a "unique" identifier; it just returns "classname".
Is it possible to print a unique identifier in C# (like in Java)?
The closest you can easily get (which won't be affected by the GC moving objects around etc) is probably RuntimeHelpers.GetHashCode(Object). This gives the hash code which would be returned by calling Object.GetHashCode() non-virtually on the object. This is still not a unique identifier though. It's probably good enough for diagnostic purposes, but you shouldn't rely on it for production comparisons.
EDIT: If this is just for diagnostics, you could add a sort of "canonicalizing ID generator" which was just a List<object>... when you ask for an object's "ID" you'd check whether it already existed in the list (by comparing references) and then add it to the end if it didn't. The ID would be the index into the list. Of course, doing this without introducing a memory leak would involve weak references etc, but as a simple hack this might work for you.
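A rough sketch of that list-based hack, for diagnostics only. ReferenceIdGenerator, GetId and IdentityHash are names invented here, and the list holds strong references, so this version does leak every object it has seen.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Diagnostic-only sketch: registered objects are kept alive by the list.
public sealed class ReferenceIdGenerator
{
    private readonly List<object> _seen = new List<object>();

    // Returns the same index for the same instance, compared by reference.
    public int GetId(object instance)
    {
        if (instance == null) throw new ArgumentNullException(nameof(instance));
        for (int i = 0; i < _seen.Count; i++)
        {
            if (ReferenceEquals(_seen[i], instance)) return i;
        }
        _seen.Add(instance);
        return _seen.Count - 1;
    }

    // Identity-based hash that ignores GetHashCode overrides; not guaranteed unique.
    public static int IdentityHash(object instance) => RuntimeHelpers.GetHashCode(instance);
}
```

The linear scan is O(n) per lookup, which is acceptable for a handful of objects while debugging but not for anything long-running.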
one reference is not visible from another and vice versa
I don't buy that. If you couldn't even get the handles, how would you get their IDs?
In C# you can always get handles to objects, and you can always compare them. Even if you have to use reflection to do it.
If you need to know whether two references point to the same object, I'll just quote this:
"By default, the operator == tests for reference equality. This is done by determining if two references indicate the same object. Therefore reference types do not need to implement operator == in order to gain this functionality."
So the == operator will do the trick without the ID workaround.
I presume you're calling ToString on your object reference, but I'm not entirely clear on this or on the situation you explained, TBH, so just bear with me.
Does the type expose an ID property? If so, try this:
var idAsString = yourObjectInstance.ID.ToString();
Or, print directly:
Console.WriteLine(yourObjectInstance.ID);
EDIT:
I see Jon saw right through this problem, which makes my answer look rather naive. Regardless, I'm leaving it in, if for nothing else than to emphasise the lack of clarity of the question. And also, maybe, to provide an avenue to go down based on Jon's statement that 'This [GetHashCode] is still not a unique identifier', should you decide to expose your own uniqueness by way of an identifier.
The example below may not be problematic as is, but it should be enough to illustrate a point. Imagine that there is a lot more work than trimming going on.
public string Thingy
{
    set
    {
        // I guess we can throw a null reference exception here on null.
        value = value.Trim(); // Well, imagine that there is so much processing to do
        this.thingy = value; // That this.thingy = value.Trim() would not fit on one line
        ...
So, if the assignment has to take two lines, then I either have to reuse (or abuse) the parameter, or create a temporary variable. I am not a big fan of temporary variables. On the other hand, I am not a fan of convoluted code. I did not include an example where a function is involved, but I am sure you can imagine it. One concern I have is if a function accepted a string and the parameter was "abused", and then someone changed the signature to ref in both places - this ought to mess things up, but ... who would knowingly make such a change if it already worked without a ref? Seems like it is their responsibility in this case. If I mess with the value of value, am I doing something non-trivial under the hood? If you think that both approaches are acceptable, then which do you prefer and why?
Thanks.
Edit: Here is what I mean when I say I am not a fan of temp variables. I do not like code like this:
string userName = userBox.Text;
if (userName.Length < 5) {
    MessageBox.Show("The user name " + userName + " that you entered is too short.");
    ....
Again, this may not be the best way to communicate a problem to the user, but it is just an illustration. The variable userName is unnecessary in my strong opinion in this case. I am not always against temporary variables, but when their use is very limited and they do not save that much typing, I strongly prefer not to use them.
First off, it's not a big deal.
But I would introduce a temp variable here. It costs nothing and is less prone to errors. Imagine someone has to maintain the code later. Better if value only has 1 meaning and purpose.
And don't call it temp, call it cleanedValue or something.
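Something along these lines, as a sketch only. The Widget class and the getter are added here so the snippet compiles; the elided processing from the question stays elided.

```csharp
using System;

public class Widget   // hypothetical containing class
{
    private string thingy;

    public string Thingy
    {
        get { return thingy; }
        set
        {
            // 'value' keeps its single meaning: exactly what the caller passed in.
            // A null still fails on Trim(), as in the question's snippet.
            string cleanedValue = value.Trim();
            // ...any further processing would operate on cleanedValue...
            this.thingy = cleanedValue;
        }
    }
}
```

For example, after `widget.Thingy = "  abc "` the getter returns "abc", and anyone stepping through the setter can still see the untouched input in value.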
It is a good practice not to change the values of incoming parameters, even if you technically can. Don't touch the value.
I am not a big fan of temporary variables.
Well, programming is largely about creating temporary variables all over the place, reading and assigning values. You'd better start to love them. :)
One more remark regarding properties. Although you could technically put a lot of logic there, it is recommended to keep properties simple and to avoid code that could throw exceptions. A need to call other functions may indicate that this property would be better as a method, or that some initialization code is needed elsewhere. Just rethink what you're doing and whether it really looks like a property.
Both in SQL and C#, I've never really liked output parameters. I never passed parameters ByRef in VB6, either. Something about counting on side effects to get something done just bothers me.
I know they're a way around not being able to return multiple results from a function, but a rowset in SQL or a complex datatype in C# and VB work just as well, and seem more self-documenting to me.
Is there something wrong with my thinking, or are there resources from authoritative sources that back me up? What's your personal take on this and why? What can I say to colleagues that want to design with output parameters that might convince them to use different structures?
EDIT: interesting turn- the output parameter I was asking this question about was used in place of a return value. When the return value is "ERROR", the caller is supposed to handle it as an exception. I was doing that but not pleased with the idea. A coworker wasn't informed of the need to handle this condition and as a result, a great deal of money was lost as the procedure failed silently!
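To illustrate the failure mode described in that edit, here is a small sketch. PaymentProcessor, TryCharge and Charge are made-up names, the real procedure is not shown in the question, and the actual work is elided.

```csharp
using System;

public static class PaymentProcessor   // hypothetical name
{
    // Status-style API: nothing forces the caller to inspect 'status'.
    public static void TryCharge(decimal amount, out string status)
    {
        status = amount > 0 ? "OK" : "ERROR";
    }

    // Exception-style API: a failure cannot be ignored by accident.
    public static void Charge(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");
        // ...perform the charge...
    }
}
```

A caller who never reads status after TryCharge carries on as if nothing happened, whereas an unhandled exception from Charge stops the flow.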
Output parameters can be a code smell indicating that your method is doing too much. If you need to return more than one value, the method is likely doing more than one thing. If the data is tightly related, then it would probably benefit from a class that holds both values.
Of course, this is not ALWAYS the case, but I have found that it is usually the case.
In other words, I think you are right to avoid them.
They have their place. Int32.TryParse method is a good example of an effective use of an out parameter.
bool result = Int32.TryParse(value, out number);
if (result)
{
    Console.WriteLine("Converted '{0}' to {1}.", value, number);
}
Bob Martin wrote about this in Clean Code. Output params break the fundamental idea of a function.
output = someMethod(input)
I think they're useful for getting IDs of newly-inserted rows in the same SQL command, but I don't think I've used them for much else.
I too see very little use of out/ref parameters, although in SQL it sometimes is easier to pass a value back by a parameter than by a resultset (which would then require the use of a DataReader, etc.)
Though, as luck would have it, I just created one such rare function in C# today. It validated a table-like data structure and returned the number of rows and columns in it (which was tricky to calculate because the table could have rowspans/colspans like in HTML). In this case the calculation of both values was done at the same time. Separating it into two functions would have resulted in double the code, memory and CPU time requirements. Creating a custom type just for this one function to return also seems like an overkill to me.
So - there are times when they are the best thing, but mostly you can do just fine without them.
The OUTPUT clause in SQL Server 2005 onwards is a great step forward for getting any field values for rows affected by your DML statements. I think that there are a lot of situations where this does away with output parameters.
In VB6, ByRef parameters are good for passing ADO objects around.
Other than those two specific cases that come to mind, I tend to avoid using them.
In SQL only...
Stored procedure output parameters are useful.
Say you need one value back. Do you "create #table, insert... exec, select #var = ". Or use an output parameter?
For client calls, an output parameter is far quicker than processing a recordset.
RETURN values are limited to signed integers.
Easier to re-use (eg a security check helper procedure)
When using both: recordsets = data, output parameters = status/messages/rowcount etc
Stored procedures recordset output can not be strongly typed like UDFs or client code
You can't always use a UDF (eg logging during security check above)
However, as long as you don't generally use the same parameter for input and output, then until SQL changes completely your options are limited. That said, I have one case where I use a parameter for both in and out values, but I have a good reason.
My Two Cents:
I agree that output parameters are a concerning practice. VBA is often maintained by people very new to programming and if someone maintaining your code fails to notice that a parameter is ByRef they could introduce some serious logical errors. Also it does tend to break the Property/Function/Sub paradigm.
Another reason that using out parameters is bad practice is that if you really do need to be returning more than one value, chances are that you should have those values in a data structure such as a class or a User Defined Type.
They can however solve some problems. VB5 (and therefore VBA for Office 97) did not allow for a function to return an array. This meant anything returning or altering an array would have to do so via an "out" parameter. In VB6 this ability has been added, but VB6 still forces array parameters to be by reference (to prevent excessive copying in memory). Now you can return a value from a function that alters an array. But it will be just a hair slow (due to the acrobatics going on behind the scenes); it can also confuse newbies into thinking that the array input will not be altered (which will only be true if someone specifically structured it that way). So I find that if I have a function that alters an array it reduces confusion to just use a sub instead of a function (and it will be a tiny bit faster too).
Another possible scenario would be if you are maintaining code and you want to add an out value without breaking the interface you can add an optional out parameter and be confident you won't be breaking any old code. It's not good practice, but if someone wants something fixed right now and you don't have time to do it the "right way" and restructure everything, this can be a handy addition to your tool box.
However if you are developing things from the ground up and you need to return multiple values you should consider:
1. Breaking up the function.
2. Returning a UDT.
3. Returning a Class.
I generally never use them, I think they are confusing and too easy to abuse. We do occasionally use ref parameters but that has more to do with passing in structures vs. getting them back.
Your opinion sounds reasonable to me.
Another drawback of output parameters is the extra code needed to pass results from one function to another. You have to declare the variable(s), call the function to get their values, and then pass the values to another function. You can't just nest function calls. This makes code read very imperatively, rather than declaratively.
C++0x is getting tuples, an anonymous struct-like thing, whose members you access by index. C++ programmers will be able to pack multiple values into one of those and return it. Does C# have something like that? Can it return an array, perhaps, instead? But yeah output parameters are a bit awkward and unclear.
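On the "does C# have something like that?" question: later versions do. System.Tuple arrived with .NET 4, and C# 7 added value tuples with named elements. A sketch, assuming C# 7 or later; TableMetrics and Measure are hypothetical names, and the jagged array is only a stand-in for the table structure mentioned in an earlier answer.

```csharp
using System;

public static class TableMetrics   // hypothetical helper
{
    // Measures a jagged table in one pass and returns both results together.
    public static (int Rows, int Columns) Measure(string[][] table)
    {
        if (table == null) throw new ArgumentNullException(nameof(table));
        int columns = 0;
        foreach (string[] row in table)
        {
            // Column count is the length of the widest row.
            if (row != null && row.Length > columns) columns = row.Length;
        }
        return (table.Length, columns);
    }
}
```

The caller can write `var size = TableMetrics.Measure(table);` and read size.Rows and size.Columns, so the call nests inside other expressions in a way an out-parameter call cannot.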
I'm designing a WCF service that will return a list of objects that are describing a person in the system.
The record count is really big, and each record has some properties such as the person's sex.
Is it better to create a new enum (it's a complex type and consumes more bandwidth) named Sex with two values (Male and Female), or to use a primitive type for this, like bool IsMale?
Very little point switching to bool; which is bigger:
<gender>male</gender> (or an attribute gender="male")
or
<isMale>true</isMale> (or an attribute isMale="true")
Not much in it ;-p
The record count is really big...
If bandwidth becomes an issue, and you control both ends of the service, then rather than change your entities you could look at some other options:
pre-compress the data as (for example) gzip, and pass (instead) a byte[] or Stream, remembering to enable MTOM on the service
(or) switch serializer; protobuf-net has WCF hooks, and can achieve significant bandwidth improvements over the default DataContractSerializer (again: enable MTOM). In a test based on Northwind data it reduced 736,574 bytes to 133,010, and reduced the CPU required to process it (win:win). For info, it reduces enums to integers, typically requiring only 1 byte for the enum value and 1 byte to identify the field; contrast to <gender>Male</gender>, which under UTF8 is 21 bytes (more for most other encodings), or gender="male" at 14 bytes.
However, either change will break your service if you have external callers who are expecting regular SOAP...
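A sketch of the pre-compression option using GZipStream from System.IO.Compression. PayloadCompressor is a hypothetical helper, not part of WCF, and the service contract that would carry the resulting byte[] is not shown.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class PayloadCompressor   // hypothetical helper
{
    public static byte[] Compress(string xml)
    {
        byte[] raw = Encoding.UTF8.GetBytes(xml);
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream before reading the buffer so the gzip footer is written.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Repetitive XML such as thousands of person records tends to compress well, but both ends have to agree on the format, which is the breaking change mentioned above.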
The reason to not use an enum is that XML Schema does not have a concept equivalent to an enum. That is, it does not have a concept of named values. The result is that enums don't always translate between platforms.
Use a bool, or a single-character field instead.
I'd suggest you model it in whatever way seems most natural until you run into a specific issue or encounter a requirement for a change.
WCF is designed to abstract the underlying details, and if bandwidth is a concern then I think a bool, int or enum will all probably be 4 bytes. You could optimize by using a bitmask or a single byte.
Again, the ease of use of the API and maintainability is probably more important, which do you prefer?
if( user[i].Sex == Sexes.Male )
if( user[i].IsMale == true ) // Could also expose .IsFemale
if( user[i].Sex == 'M' )
etc. Of course you could expose multiple.
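A sketch of exposing more than one view over the same value. The Sexes enum name follows the snippet above; the Person class and its members are assumptions, and the WCF contract attributes are left out.

```csharp
public enum Sexes { Male, Female }

public class Person   // hypothetical entity; contract attributes omitted
{
    public Sexes Sex { get; set; }

    // Convenience views derived from the single stored value.
    public bool IsMale => Sex == Sexes.Male;
    public bool IsFemale => Sex == Sexes.Female;
}
```

Only Sex is stored, so the boolean views cannot drift out of sync with it. Whether the derived properties should also go over the wire is a separate decision.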