Should we store format strings in resources?


For the project that I'm currently on, I have to deliver specially formatted strings to a 3rd party service for processing. And so I'm building up the strings like so:
string someString = string.Format("{0}{1}{2}: Some message. Some percentage: {3}%", token1, token2, token3, number);
Rather than hardcode the string, I was thinking of moving it into the project resources:
string someString = string.Format(Properties.Resources.SomeString, token1, token2, token3, number);
The second option is, in my opinion, not as readable as the first: the person reading the code would have to pull up the string resources to work out what the final result should look like.
How do I get around this? Is the hardcoded format string a necessary evil in this case?

I do think this is a necessary evil, one I've used frequently. Something smelly that I do is:
// "{0}{1}{2}: Some message. Some percentage: {3}%"
string someString = string.Format(Properties.Resources.SomeString,
    token1, token2, token3, number);
...at least until the code is stable enough that I might be embarrassed to have others see that.

There are several reasons that you would want to do this, but the only great reason is if you are going to localize your application into another language.
If you are using resource strings there are a couple of things to keep in mind.
Include format strings whenever possible in the set of resource strings you want localized. This will allow the translator to reorder the position of the formatted items to make them fit better in the context of the translated text.
Avoid having strings in your format tokens that are in your language. It is better to use these for numbers. For instance, the message:
"The value you specified must be between {0} and {1}"
is great if {0} and {1} are numbers like 5 and 10. If you are formatting in strings like "five" and "ten" this is going to make localization difficult.
You can get around the readability problem you are talking about by simply naming your resources well.
string someString = string.Format(Properties.Resources.IntegerRangeError, minValue, maxValue );
Evaluate whether you are generating user-visible strings at the right abstraction level in your code. In general, I tend to group all the user-visible strings in the code as close to the user interface as possible. If some low-level file I/O code needs to report errors, it should do so with exceptions, which you handle in your application and provide consistent error messages for. This will also consolidate all of your strings that require localization instead of having them peppered throughout your code.
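To illustrate the point about letting a translator reorder format items, here is a minimal sketch. The class name RangeMessages and both constants are hypothetical stand-ins for what Properties.Resources.IntegerRangeError might return in two different locales; the second text is a made-up "translation" that simply mentions the upper bound first.

```csharp
using System;

public static class RangeMessages
{
    // Hypothetical value of the resource in the default language.
    public const string Default = "The value you specified must be between {0} and {1}";

    // Hypothetical translated value: the translator moved {1} in front of {0}.
    public const string Reordered = "No more than {1} and no less than {0} is allowed";

    // The calling code passes the arguments in the same order for every locale;
    // only the format string decides where each one ends up.
    public static string Format(string template, int minValue, int maxValue)
    {
        return string.Format(template, minValue, maxValue);
    }
}
```

Calling RangeMessages.Format with either constant and the arguments 5 and 10 needs no change at the call site, which is the main thing you gain by keeping the whole format string (and not just fragments of it) in the resource file.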

One thing that can help with hard-coded strings, or even speed up adding strings to a resource file, is CodeRush Xpress, which you can download for free here: http://www.devexpress.com/Products/Visual_Studio_Add-in/CodeRushX/
Once you write your string you can access the CodeRush menu and extract to a resource file in a single step. Very nice.
ReSharper has similar functionality.

I don't see why including the format string in the program is a bad thing. Unlike traditional undocumented magic numbers, it is quite obvious what it does at first glance. Of course, if you are using the format string in multiple places it should definitely be stored in an appropriate read-only variable to avoid redundancy.
I agree that keeping it in the resources is unnecessary indirection here. A possible exception would be if your program needs to be localized, and you are localizing through resource files.

Yes, you can.
Now let's see how:
String.Format(Resource_en.PhoneNumberForEmployeeAlreadyExist,letterForm.EmployeeName[i])
This gives me a dynamic message every time.
By the way, I'm using ResXManager.

Related

How to use StepArgumentTransformation properly in Specflow(BDD)?

I encountered a scenario where I have to send an array of integers as a parameter from a SpecFlow feature file. I could have used tables, which I don't want to do because I would have to send the values as row[] or col[]. Suppose I pass the parameter as a string,
e.g.: Given set the value as '470,471,472,472'
and then receive it and split it in the step definition file. How is StepArgumentTransformation different from that? Is there any other benefit to using step argument transformation? I understand we can convert XML, a Date, or any object. Why do we have to use StepArgumentTransformation?

I hope I understood the question correctly.
SpecFlow supports some automatic transformations out of the box, such as converting to Date, Double, int, etc., and it does these by default as there is no ambiguity about them. You can easily convert a string to a double or a Date as you know the locale being used.
Why isn't converting to arrays supported? I suppose it could be, but there is some ambiguity. What should the list separator be? a comma? What about locales that use that as a separator between the whole and fractional part of a number?
So providing a default implementation of something which converted a list to int[] or IEnumerable<int> could be possible, but it's just likely to get some people asking why it doesn't work for them when they have used ☃ as a list separator.
It's better to leave the things with ambiguity to individuals to implement, rather than guess at the best implementation.
The StepArgumentTransformation you want is very simple to write and could be included in an external step assembly if you wanted to share it amongst many projects.
So to answer your many questions:
It's not really any different, it just encapsulates it in a single place, which is good practise, which is a benefit.
Yes you can convert any object.
You don't have to use StepArgumentTransformation, and many people don't, but IMHO they make your life much easier.
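For what it's worth, the transformation described above might look roughly like this. It is a sketch under assumptions: it needs the SpecFlow package for the TechTalk.SpecFlow attributes, the class names IntArrayTransforms and ValueSteps are made up, and it assumes a comma is the separator you want (which is exactly the ambiguity mentioned earlier).

```csharp
using System;
using System.Globalization;
using System.Linq;
using TechTalk.SpecFlow;

[Binding]
public class IntArrayTransforms
{
    // Turns a step argument such as "470,471,472,472" into an int[].
    // The comma separator is an assumption; pick whatever suits your feature files.
    [StepArgumentTransformation]
    public int[] ToIntArray(string commaSeparated)
    {
        return commaSeparated
            .Split(',')
            .Select(part => int.Parse(part.Trim(), CultureInfo.InvariantCulture))
            .ToArray();
    }
}

[Binding]
public class ValueSteps
{
    // The step definition asks for int[] directly; the split-and-parse
    // logic lives in one place instead of being repeated in every step.
    [Given(@"set the value as '(.*)'")]
    public void GivenSetTheValueAs(int[] values)
    {
        // use values here
    }
}
```

Functionally this is the same as splitting the string inside the step; the gain is only that every step taking an int[] gets the conversion without repeating it.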

String likeness algorithms

I have two strings (they're going to be descriptions in a simple database eventually), let's say they're
String A: "Apple orange coconut lime jimmy buffet"
String B: "Car bicycle skateboard"
What I'm looking for is this. I want a function that will have the input "cocnut", and have the output be "String A"
We could have differences in capitalization, and the spelling won't always be spot on. The goal is a 'quick and dirty' search if you will.
Are there any .NET (or third-party) libraries, or recommended 'likeness algorithms' for strings, so I could check that the input has a 'pretty close fragment' and return it? My database is going to have like 50 entries, tops.

What you’re searching for is known as the edit distance between two strings. There exist plenty of implementations – here’s one from Stack Overflow itself.
Since you’re searching for only part of a string what you want is a locally optimal match rather than a global match as computed by this method.
This is known as the local alignment problem and once again it’s easily solvable by an almost identical algorithm – the only thing that changes is the initialisation (we don’t penalise whatever comes before the search string) and the selection of the optimum value (we don’t penalise whatever comes after the search string).
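A minimal sketch of that modified edit-distance computation, for illustration only. The names FuzzySearch, BestSubstringDistance and IndexOfClosest are made up, and it uses unit costs for insert, delete and substitute, which is an assumption rather than anything the question requires. The two changes described above show up as current[0] = 0 (whatever precedes the match is free) and as taking the minimum over every end position (whatever follows the match is free).

```csharp
using System;

public static class FuzzySearch
{
    // Smallest edit distance between pattern and ANY substring of text,
    // ignoring case. 0 means the pattern occurs verbatim somewhere in text.
    public static int BestSubstringDistance(string pattern, string text)
    {
        pattern = pattern.ToLowerInvariant();
        text = text.ToLowerInvariant();

        // previous[i]: cost of matching the first i pattern characters against
        // some substring of text that ends at the previous text position.
        int[] previous = new int[pattern.Length + 1];
        int[] current = new int[pattern.Length + 1];
        for (int i = 0; i <= pattern.Length; i++) previous[i] = i;

        int best = previous[pattern.Length];
        for (int j = 1; j <= text.Length; j++)
        {
            current[0] = 0; // a match may start anywhere in text
            for (int i = 1; i <= pattern.Length; i++)
            {
                int substitute = previous[i - 1] + (pattern[i - 1] == text[j - 1] ? 0 : 1);
                int extraTextChar = previous[i] + 1;       // text has a char the pattern lacks
                int missingTextChar = current[i - 1] + 1;  // pattern has a char the text lacks
                current[i] = Math.Min(substitute, Math.Min(extraTextChar, missingTextChar));
            }
            best = Math.Min(best, current[pattern.Length]); // a match may end anywhere
            (previous, current) = (current, previous);
        }
        return best;
    }

    // Index of the candidate containing the closest fragment, or -1 if there are none.
    public static int IndexOfClosest(string pattern, string[] candidates)
    {
        int bestIndex = -1;
        int bestDistance = int.MaxValue;
        for (int k = 0; k < candidates.Length; k++)
        {
            int distance = BestSubstringDistance(pattern, candidates[k]);
            if (distance < bestDistance)
            {
                bestDistance = distance;
                bestIndex = k;
            }
        }
        return bestIndex;
    }
}
```

With the two strings from the question, "cocnut" is one edit away from the "coconut" fragment of String A, so IndexOfClosest picks String A. For about 50 entries a linear scan like this should be plenty fast; you would probably also want a cut-off distance so that a hopeless input returns nothing.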

Designing Tests for Culture/Globalisation problems

I'm concerned about the predictability of my application in handling string input in different cultures. It has been a problem in older software and I don't want it to be a problem in the new.
I generally have two sources of input: strings entered into a WPF application, and streams, loaded from files, containing text. These cultured strings are generally entered into a model before being used:
public struct MyModel
{
    public String Name;
}
I want to design a meaningful test to ensure some logic can actually handle Result DoSomething(MyModel model); when it contains text inputted on a different machine.
But how can I show a case where the difference matters?
For example the following fails.
var inNativeCulture= "[Something12345678.9:1] {YeS/nO}";
var inChineseCulture = inNativeCulture.ToString(new CultureInfo("zh-CN"));
Assert.That(inChineseCulture, Is.Not.EqualTo(inNativeCulture));
[Question]
How can I test DoSomething such that the test is able to fail if the strings are not converted to InvariantCulture?
Should I even bother? That is, will the string Something entered on a French keyboard always equal Something entered on a Chinese keyboard?
What can I test for that will mitigate Globalization problems?

The ToString method taking an IFormatProvider on a string is essentially a no-op. The documentation states "Returns this instance of String; no actual conversion is performed."
Since you are concerned about avoiding issues, here's some general advice. First, it is very helpful to have a clear distinction in your mind between frontend (user-facing) strings and backend (database, wire, file, etc.) strings. Frontend strings should be generated/accepted according to the user's culture / application language. These strings should not be persisted (with few exceptions, like when you are generating a document that will be read only by people and not by machine). Backend strings should always use standard formats that will not change over time. If you accept the fact that the data used to generate/parse globalized strings changes, then you will insulate yourself from the effects by ensuring that you do not persist user-facing strings.
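As a rough illustration of the frontend/backend split, here is a sketch where culture only matters once a value is formatted or parsed, which is also where a test can actually fail. The class and method names (BackendFormat, ToBackend, FromBackend, ToDisplay) are hypothetical, and the example assumes the runtime has real culture data for "de-DE" (that is, it is not running in invariant-globalization mode).

```csharp
using System;
using System.Globalization;

public static class BackendFormat
{
    // Backend: always the invariant culture, so the persisted text is the
    // same regardless of the machine that wrote it.
    public static string ToBackend(double value)
    {
        return value.ToString("R", CultureInfo.InvariantCulture);
    }

    public static double FromBackend(string text)
    {
        return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    // Frontend: formatted for whoever is looking at it; never persisted.
    public static string ToDisplay(double value, CultureInfo culture)
    {
        return value.ToString(culture);
    }
}
```

A meaningful test then runs the logic under a culture whose number format differs from your own (German uses a comma as the decimal separator) and checks that what gets persisted still round-trips through ToBackend and FromBackend. Code that accidentally parses a backend string with the user's culture tends to be caught this way, because "12345678.9" is generally not read as the same number under de-DE rules.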

C# string translation

Does C# offer a way to translate strings on-the-fly or something similar?
I'm now working on some legacy code, which has some parts like this:
section.AddParagraph(String.Format("Premise: {0}", currentReport.Tenant.Code));
section.AddParagraph(String.Format("Description: {0}", currentReport.Tenant.Name));
section.AddParagraph();
section.AddParagraph(String.Format("Issued: #{0:D5}", currentReport.Id));
section.AddParagraph(String.Format("Date: {0}", currentReport.Timestamp.ToString(
"dd MMM yyyy", CultureInfo.InvariantCulture)));
section.AddParagraph(String.Format("Time: {0:HH:mm}", currentReport.Timestamp));
So, I want to implement the translation of these strings on-the-fly based on some substitution table (for example, as Qt does).
Is this possible (perhaps using something C# already has, or using some post-processing, maybe with PostSharp)?
Does some generic internationalization approach for applications built with C# (from scratch) exist?

Does some generic internationalization approach for applications built with C# (from scratch) exist?
Yes, using resource files. And here's another article on MSDN.

In the C# project I currently work on, we wrote a helper function that works like this:
section.AddParagraph(I18n.Translate("Premise: {0}", currentReport.Tenant.Code));
section.AddParagraph(I18n.Translate("That's all"));
At build time, a script searches for all I18n.Translate invocations, as well as all UI controls, and populates a table with all English phrases. This gets translated.
At runtime, the English text is looked up in a dictionary and replaced with the translated text.
Something similar happens to our WinForms dialog resources: they are constructed in English and then translated using the same dictionary.
The biggest strength of this scheme is also its biggest weakness: if you use the same string in two places, it gets translated the same way. This shortens the file you send to the translator, which helps to reduce cost. If you ever need to force a different translation of the same English word, you need to work around that. In the 4ish years we have had the system, we have never needed to. There are other benefits too: you read the English UI text inline with the source (so it is not hidden behind an identifier you need to name), and if you delete code, it's automatically removed from the translated resources as well.
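The runtime half of such a helper could be as small as the sketch below. This is a guess at the shape, not the actual code from that project: the Load method, the in-memory table, and the fallback to the English text when no translation exists are all assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class I18n
{
    // English phrase -> translated phrase. Assumed to be filled at startup
    // from whatever file the translators deliver.
    private static readonly Dictionary<string, string> Table =
        new Dictionary<string, string>(StringComparer.Ordinal);

    public static void Load(IDictionary<string, string> translations)
    {
        Table.Clear();
        foreach (KeyValuePair<string, string> pair in translations)
        {
            Table[pair.Key] = pair.Value;
        }
    }

    // Looks up the English text; falls back to it when no translation exists,
    // so untranslated phrases still show something readable.
    public static string Translate(string english, params object[] args)
    {
        string template = Table.TryGetValue(english, out string translated)
            ? translated
            : english;

        return args == null || args.Length == 0
            ? template
            : string.Format(template, args);
    }
}
```

Because the English text itself is the key, the format placeholders ({0}, {1}, ...) must survive translation unchanged, although the translator remains free to move them around within the sentence.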

Count distinct strings in C# code

I need to estimate the localization effort required for a legacy project. I'm looking for a tool that I could point at a directory, and it would:
Parse all *.cs files in the directory structure
Extract all C# string literals from the code
Count total number of occurrences of the strings
Do you know any tool that could do that? Writing it would be simple, but if some time can be saved, then why not save it?

Use ILDASM to decompile your .DLL / .EXE.
I just use options to dump all, and you get an .il file with a section "User String":
User Strings
-------------------------------------------------------
70000001 : (14) L"Starting up..."
7000001f : (12) L"progressBar1"
70000039 : (21) L"$this.BackgroundImage"
70000065 : (10) L"$this.Icon"
7000007b : ( 6) L"Splash"
Now, if you want to know how many times a certain string is used, search for an "ldstr" like this:
IL_003c: /* 72 | (70)000001 */ ldstr "Starting up..." /* 70000001 */
I think this will be a lot easier to parse than C#.
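Counting the ldstr lines in such a dump can be sketched as follows. The class name LdstrCounter is made up, and the regular expression only handles literals written in plain double quotes on a single line, as in the sample above; strings that the disassembler emits in some other form (for example as a bytearray) would be missed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LdstrCounter
{
    // Matches: ldstr "some text"   (allowing backslash-escaped characters inside)
    private static readonly Regex Ldstr =
        new Regex(@"\bldstr\s+""((?:[^""\\]|\\.)*)""", RegexOptions.Compiled);

    // Returns each string literal with the number of ldstr instructions loading it.
    public static Dictionary<string, int> Count(IEnumerable<string> ilLines)
    {
        var counts = new Dictionary<string, int>(StringComparer.Ordinal);
        foreach (string line in ilLines)
        {
            Match match = Ldstr.Match(line);
            if (!match.Success) continue;

            string literal = match.Groups[1].Value;
            counts.TryGetValue(literal, out int seen); // seen is 0 when absent
            counts[literal] = seen + 1;
        }
        return counts;
    }
}
```

You would feed it something like System.IO.File.ReadLines applied to the .il file you dumped; the number of keys gives the distinct strings and the sum of the values gives the total occurrences.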

Doing a quick search, I found the following tool that may or may not be useful to you.
http://www.devincook.com/goldparser/
I also found another SO user who was trying to do something similar.
Regex to parse C# source code to find all strings

Well, if you have hardcoded strings, you need to know what your i18n effort is first (unhardcoding them could be quite painful). Another issue: you need to count translatable words, not distinct strings, since that is the input for translation providers. And even though a string might seem duplicated, it could be translated differently depending on the context, so you don't need to care about "distinct"; you just have to count all the words. That's how localization works, in my experience.

In most common development, you should keep your strings external to your program source code. In your case, could you spare the effort to extract the strings into a resource file?
If so, then you can make use of the default localization solution in .NET, i.e.
resource.resx,
resource.fr.resx,
resource.es.resx
stores strings for different locales.
Updated :
The actual implementation depends on your project architecture/technology, resource files ain't the best way to do this, but it is the easiest, and the recommended way in .NET.
Like in this article
A few more tutorials
