How to use StepArgumentTransformation properly in Specflow(BDD)?

How to use StepArgumentTransformation properly in Specflow(BDD)? - c#

I encountered a scenario where i have to send an array of integers as parameter from specflow feature file. I could have used tables which i don't want to do as i have send as row[] or col[]. If i pass parameter as a string
eg: Given set the value as '470,471,472,472'
and receive it and do split in step definition file. How different is StepArgumentTransformation from the above scenario? Is any other benefit in using step argument transformation. I understand we can convert XML,Date or any object. Why do we have to use stepargumenttransformation???

I hope I understood the question correctly.
Specflow supports some automatic transformation out of the box, so things like converting to Date, Double, int etc etc, and it does these by default as there is no ambiguity about them. You can easily convert a string to a double or a Date as you know the locale being used.
Why isn't converting to arrays supported? I suppose it could be, but there is some ambiguity. What should the list separator be? a comma? What about locales that use that as a separator between the whole and fractional part of a number?
So providing a default implementation of something which converted a list to int[] or IEnumerable<int> could be possible, but its just likely to get some people asking why it doesn't work for them when they have used ☃ as a list separator.
It's better to leave the things with ambiguity to individuals to implement, rather than guess at the best implementation.
The StepArgumentTransformation you want is very simple to write and could be included in an external step assembly if you wanted to share it amongst many projects.
So to answer your many questions:
It's not really any different, it just encapsulates it in a single place, which is good practise, which is a benefit.
Yes you can convert any object.
You don't have to use StepArgumentTransformation, many people don't, but IMHO they make your life much easier

Related

Design better text file format for reading mixed type data of variable length

I am designing a text file format to be read in C#. I have a need to store types: int, double and string on a single line. I'm planning to use a .CSV format so the file can be manually opened and read. A particular record may have say 8 known types, then a variable number of "indicator" combinations of either (string, int, double) or (string, int, double, double), and some lines may include no "indicators". Thus, each record is may be of variable length.
In VB6 I would just input the data, split the data, into a variant array, then determine the number of elements on that line in the array, and use the ***VarType function to determine if the final "indicator" variables are string, int, or double and parse the field accordingly.
There may be a better way to design a text file and that may be the best solution. If so I'm interested in hearing ideas. I have searched but found no questions that specifically talk about reading variable length lines of text with mixed type into C#.
If a better format is not forthcoming, is there a way to duplicate the VB6 VarType function within C# as described two paragraphs above***? I can handle the text file reading and line splitting easily in C#.

you could use either json or xml as they are well supported in .NET and have automatic serialization capabilities

First I agree with Keith's suggestion to use Xml or JSON. You are reinventing a wheel here. This page has an introductory example of how to serialize objects to a file and some links to more info.
If you need to stick with your own file format and custom serialization/deserialization however take a look at the Convert class, as well as the various TryParse methods which hang off of the intrinsic value types like int and double.

String likeness algorithms

I have two strings (they're going to be descriptions in a simple database eventually), let's say they're
String A: "Apple orange coconut lime jimmy buffet"
String B: "Car
bicycle skateboard"
What I'm looking for is this. I want a function that will have the input "cocnut", and have the output be "String A"
We could have differences in capitalization, and the spelling won't always be spot on. The goal is a 'quick and dirty' search if you will.
Are there any .net (or third party), or recommend 'likeness algorithms' for strings, so I could check that the input has a 'pretty close fragment' and return it? My database is going to have liek 50 entries, tops.

What you’re searching for is known as the edit distance between two strings. There exist plenty of implementations – here’s one from Stack Overflow itself.
Since you’re searching for only part of a string what you want is a locally optimal match rather than a global match as computed by this method.
This is known as the local alignment problem and once again it’s easily solvable by an almost identical algorithm – the only thing that changes is the initialisation (we don’t penalise whatever comes before the search string) and the selection of the optimum value (we don’t penalise whatever comes after the search string).

Methodologies or algorithms for filling in missing data

I am dealing with datasets with missing data and need to be able to fill forward, backward, and gaps. So, for example, if I have data from Jan 1, 2000 to Dec 31, 2010, and some days are missing, when a user requests a timespan that begins before, ends after, or encompasses the missing data points, I need to "fill in" these missing values.
Is there a proper term to refer to this concept of filling in data? Imputation is one term, don't know if it is "the" term for it though.
I presume there are multiple algorithms & methodologies for filling in missing data (use last measured, using median/average/moving average, etc between 2 known numbers, etc.
Anyone know the proper term for this problem, any online resources on this topic, or ideally links to open source implementations of some algorithms (C# preferably, but any language would be useful)

The term you're looking for is interpolation. (obligatory wiki link)
You're asking for a C# solution with datasets but you should also consider doing this at the database level like this.
An simple, brute-force approach in C# could be to build an array of consecutive dates with your beginning and ending values as the min/max values. Then use that array to merge "interpolated" date values into your data set by inserting rows where there is no matching date for your date array in the dataset.
Here is an SO post that gets close to what you need: interpolating missing dates with C#. There is no accepted solution but reading the question and attempts at answers may give you an idea of what you need to do next. E.g. Use the DateTime data in terms of Ticks (long value type) and then use an interpolation scheme on that data. The convert the interpolated long values to DateTime values.

The algorithm you use will depend a lot on the data itself, the size of the gaps compared to the available data, and its predictability based on existing data. It could also incorporate other information you might know about what's missing, as is common in statistics, when your actual data may not reflect the same distribution as the universe across certain categories.
Linear and cubic interpolation are typical algortihms that are not difficult to implement, try googling those.
Here's a good primer with some code:
http://paulbourke.net/miscellaneous/interpolation/
The context of the discussion in that link is graphics but the concepts are universally applicable.

For the purpose of feeding statistical tests, a good search term is imputation - e.g. http://en.wikipedia.org/wiki/Imputation_%28statistics%29

c#: tryparse vs convert

Today I read an article where it's written that we should always use TryParse(string, out MMM) for conversion rather than Convert.ToMMM().
I agree with article but after that I got stuck in one scenario.
When there will always be some valid value for the string and hence we can also use Convert.ToMMM() because we don't get any exception from Convert.ToMMM().
What I would like to know here is: Is there any performance impact when we use TryParse because when I know that the out parameter is always going to be valid then we can use Convert.ToMMM() rather TryParse(string, out MMM).
What do you think?

If you know the value can be converted, just use Parse(). If you 'know' that it can be converted, and it can't, then an exception being thrown is a good thing.
EDIT: Note, this is in comparison to using TryParse or Convert without error checking. If you use either of the other methods with proper error checking then the point is moot. I'm just worried about your assumption that you know the value can be converted. If you want to skip the error checking, use Parse and die immediately on failure rather than possibly continuing and corrupting data.

When the input to TryParse/Convert.ToXXX comes from user input, I'd always use TryParse. In case of database values, I'd check why you get strings from the database (maybe bad design?). If string values can be entered in the database columns, I'd also use TryParse as you can never be sure that nobody modifies data manually.
EDIT
Reading Matthew's reply: If you are unsure and would wrap the conversion in a try-catch-block anyway, you might consider using TryParse as it is said to be way faster than doing a try-catch in that case.

There is significant difference regarding the developing approach you use.
Convert: Converting one "primitive" data in to another type and corresponding format using multiple options
Case and point - converting an integer number in to its bit by bit representation. Or hexadecimal number (as string) in to integer, etc...
Error Messages : Conversion Specific Error Message - for problems in multiple cases and at multiple stages of the conversion process.
TryParse: Error-less transfer from one data format to another. Enabling T/F control of possible or not.
Error Messages: NONE
NB: Even after passing the data in to a variable - the data passed is the default of the type we try to parse in to.
Parse: in essence taking some data in one format and transfer it in to another. No representations and nothing fancy.
Error Messages: Format-oriented
P.S. Correct me if I missed something or did not explain it well enough.

Should we store format strings in resources?

For the project that I'm currently on, I have to deliver specially formatted strings to a 3rd party service for processing. And so I'm building up the strings like so:
string someString = string.Format("{0}{1}{2}: Some message. Some percentage: {3}%", token1, token2, token3, number);
Rather then hardcode the string, I was thinking of moving it into the project resources:
string someString = string.Format(Properties.Resources.SomeString, token1, token2, token3, number);
The second option is in my opinion, not as readable as the first one i.e. the person reading the code would have to pull up the string resources to work out what the final result should look like.
How do I get around this? Is the hardcoded format string a necessary evil in this case?

I do think this is a necessary evil, one I've used frequently. Something smelly that I do, is:
// "{0}{1}{2}: Some message. Some percentage: {3}%"
string someString = string.Format(Properties.Resources.SomeString
,token1, token2, token3, number);
..at least until the code is stable enough that I might be embarrassed having that seen by others.

There are several reasons that you would want to do this, but the only great reason is if you are going to localize your application into another language.
If you are using resource strings there are a couple of things to keep in mind.
Include format strings whenever possible in the set of resource strings you want localized. This will allow the translator to reorder the position of the formatted items to make them fit better in the context of the translated text.
Avoid having strings in your format tokens that are in your language. It is better to use
these for numbers. For instance, the message:
"The value you specified must be between {0} and {1}"
is great if {0} and {1} are numbers like 5 and 10. If you are formatting in strings like "five" and "ten" this is going to make localization difficult.
You can get arround the readability problem you are talking about by simply naming your resources well.
string someString = string.Format(Properties.Resources.IntegerRangeError, minValue, maxValue );
Evaluate if you are generating user visible strings at the right abstraction level in your code. In general I tend to group all the user visible strings in the code closest to the user interface as possible. If some low level file I/O code needs to provide errors, it should be doing this with exceptions which you handle in you application and consistent error messages for. This will also consolidate all of your strings that require localization instead of having them peppered throughout your code.

One thing you can do to help add hard coded strings or even speed up adding strings to a resource file is to use CodeRush Xpress which you can download for free here: http://www.devexpress.com/Products/Visual_Studio_Add-in/CodeRushX/
Once you write your string you can access the CodeRush menu and extract to a resource file in a single step. Very nice.
Resharper has similar functionality.

I don't see why including the format string in the program is a bad thing. Unlike traditional undocumented magic numbers, it is quite obvious what it does at first glance. Of course, if you are using the format string in multiple places it should definitely be stored in an appropriate read-only variable to avoid redundancy.
I agree that keeping it in the resources is unnecessary indirection here. A possible exception would be if your program needs to be localized, and you are localizing through resource files.

yes you can
new lets see how
String.Format(Resource_en.PhoneNumberForEmployeeAlreadyExist,letterForm.EmployeeName[i])
this will gave me dynamic message every time
by the way I'm useing ResXManager

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.