Related
I'm in need to estimate localization effort needed for a legacy project. I'm looking for a tool that I could point at a directory, and it would:
Parse all *.cs files in the directory structure
Extract all C# string literals from the code
Count total number of occurrences of the strings
Do you know any tool that could do that? Writing it would be simple, but if some time can be saved, then why not save it?
Use ILDASM to decompile your .DLL / .EXE.
I just use options to dump all, and you get an .il file with a section "User String":
User Strings
-------------------------------------------------------
70000001 : (14) L"Starting up..."
7000001f : (12) L"progressBar1"
70000039 : (21) L"$this.BackgroundImage"
70000065 : (10) L"$this.Icon"
7000007b : ( 6) L"Splash"
Now if you want to know how many time a certain string is used. Search for a "ldstr" like this:
IL_003c: /* 72 | (70)000001 */ ldstr "Starting up..." /* 70000001 */
I think this will be a lot easier to parse as C#.
Doing a quick search, I found the following tool that may or may not be useful to you.
http://www.devincook.com/goldparser/
I also found another SO user who was trying to do something similar.
Regex to parse C# source code to find all strings
Well, if you have hardcoded strings, you need to know what is your i18n effort first (unhardcoding them could be quite painful). Another issue: you need to count translatable words not distinct strings, that is the input for translation providers. And even though string might seem duplicated, it could be translated in a different way depending on the context, so you don't need to care about "distninct", you just have to count all words... That's how Localization works per my experience.
In most common development, you should keep your strings external to your program source code. In your case, could you spare the effort to extract the strings into a resource file?
If so, then you can make use of the default localization solution in .NET, i.e.
resource.resx,
resource.fr.resx,
resources.es.resx
stores strings for different locales.
Updated :
The actual implementation depends on your project architecture/technology, resource files ain't the best way to do this, but it is the easiest, and the recommended way in .NET.
Like in this article
A few more tutorials
A few more tutorials
I have developed a large business portal. I just realized I need my website in another language. I have researched the solutions available like
Used third party control on my website. (Does fit in my design. Not useful regarding SEO point of view. Dont want to show third party brand names.)
Create Resource files for each language.( A lot of work required to restructure pages to use text from resource files. What about the data entered by the user like Business Description. )
Are there any Other options available.
I was thinking of a solution like a when a page is created on server side then I could translate it before sending back to client. Is there any way I can do that?(to translate everything including data added from databases or through a code. And without effecting design. )
If you really need to translate your application, it's going to take a lot of hard, tedious work. There is no magic bullet.
The first thing you need to do is convert your plain text in your markup to asp:Localize controls. By using the Localize control, you can leave your existing <span> tags in place and just replace the text inside of them. There's really no way around this. Visual Studio's search and replace supports regular expression matching that may help you with this, or you can use Resharper (see below).
The first approach would be to download the open source shopping application nopCommerce and see how they handle their localization. They store their strings in a database and have a UI for editing languages. A similar approach may work well for you.
Alternatively, if you want to use Resource Files, there are two tools that I would recommend using in addition to Visual Studio: Resharper 5 (Localization Features screencast) and Zeta Resource Editor. These are the steps I would take to accomplish it using this method:
Use the "Generate Local Resource" tool in visual studio for each page
Use Resharper's "Move HTML to resource" on the text in your markup to make them into Localize controls.
Use Resharper to search out any localizable strings in your code behind and move them to the resource file as well.
Use the Globalization Rules of Code Analysis / FXCop to help find any additional problems you might face formatting numbers, dates, etc.
Once all text is in the resx files, use Zeta Resource Editor to load up all of your resx files, add new languages, and export for translation (or auto translate if you're brave enough).
I've used this approach on a site translated into 8 languages (and growing) with dozens of pages (and growing). However, this is not a user-editable site; the pages are solely controlled by the programmers.
a large switch case? use a dictionary/hashtable (seperate instance for each a language), it is much, much more effective and fast.
To Convert The Page To Arabic Language Or Other Language .
Go to :
1-page design
2-Tools
3-Generate Local Resource
4-obtain "App_LocalResources" include "filename.aspx.resx"
5-copy the file and change the name to "filename.aspx.ar.resx" to convert the page to arabic language or other .
hope to helpful :)
I found a good solution, see in http://www.nopcommerce.com/p/1784/nopcommerce-translator.aspx
this project is open source and source repository is here: https://github.com/Marjani/NopCommerce-Translator
good luck
Without installing any 3rd party tool, APIs, or dll objects, I am able to utilize the App_LocalResources. Although I still use Google Translate for the words and sentences to be translated and copy and paste it to the file as you can see in one of the screenshots below (or you can have a person translator and type manually to add). In your Project folder (using MS Visual Studio as editor), add an App_LocalResources folder and create the English and other language (resx file). In my case, it's Spanish (es-ES) translation. See screenshot below.
Next, on your aspx, add the meta tags (meta:resourcekey) that will match in the App_LocalResources. One for English and another to the Spanish file. See screenshots below:
Spanish: (filename.aspx.es-ES.resx)
English: (filename.aspx.resx)
.
Then create a link on your masterpage file with a querystring that will switch the page translation and will be available on all pages:
<%--ENGLISH/SPANISH VERSION BUTTON--%>
<asp:HyperLink ID="eng_ver" runat="server" Text="English" Font-Underline="false"></asp:HyperLink> |
<asp:HyperLink ID="spa_ver" runat="server" Text="Español" Font-Underline="false"></asp:HyperLink>
<%--ENGLISH/SPANISH VERSION BUTTON--%>
.
On your masterpage code behind, create a dynamic link to the Hyperlink tags:
////LOCALIZATION
string thispage = Request.Url.AbsolutePath;
eng_ver.NavigateUrl = thispage;
spa_ver.NavigateUrl = thispage + "?ver=es-ES";
////LOCALIZATION
.
Now, on your page files' code behind, you can set a session variable to make all links or redirections to stick to the desired translation by always adding a querystring to urls.
On PageLoad:
///'LOCALIZATION
//dynamic querystring; add this to urls ---> ?" + Session["add2url"]
{
if (Session["version"] != null)
{
Session["add2url"] = "?ver=" + Session["version"]; //SPANISH version
}
else
{
Session["add2url"] = ""; // ENGLISH as default
}
}
///'LOCALIZATION
.
On Click Events sample:
protected void btnBack_Click(object sender, EventArgs e)
{
Session["FileName.aspx"] = null;
Response.Redirect("FileName.aspx" + Session["add2url"]);
}
I hope my descriptions were easy enough.
If you don't want to code more and if its feasible with google translator then You can try with Google Translator API. you can check below code.
<script src="http://translate.google.com/translate_a/element.js?cb=googleTranslateElementInit"></script>
<script>
function googleTranslateElementInit() {
$.when(
new google.translate.TranslateElement({pageLanguage: 'en', includedLanguages: 'en',
layout: google.translate.TranslateElement.FloatPosition.TOP_LEFT}, 'google_translate_element')
).done(function(){
var select = document.getElementsByClassName('goog-te-combo')[0];
select.selectedIndex = 1;
select.addEventListener('click', function () {
select.dispatchEvent(new Event('change'));
});
select.click();
});
}
$(window).on('load', function() {
var select = document.getElementsByClassName('goog-te-combo')[0];
select.click();
var selected = document.getElementsByClassName('goog-te-gadget')[0];
selected.hidden = true;
});
</script>
Also, Find below code for <body> tag
<div id="google_translate_element"></div>
It will certainly be more work to create resource files for each language - but this is the option I would opt for, as it gives you the opportunity to be more accurate. If you do it this way you can have the text translated, manually, by someone that speaks the language (there are many companies out there that offer this kind of service).
Automatic translation systems are often good for giving a general impression of what something in another language means, but I would never use them when trying to portray a professional image, as often what they output just doesn't make sense. Nothing screams 'unprofessional!' like text that just doesn't make sense because it's been automatically translated.
I would take the resource file route over the translation option because the meaning of words in a language can be very contextual and even one mistake could undermine your site's credibility.
As you suggest Visual Studio can generate the meta resource file keys for most controls containing text but may leave you having to do the rest manually but I don't see an easier, more reliable solution.
I don't think localisation is an easy-to-automate thing anyway as text held in the database often results in schema changes to allow for multiple languages, and web HTML often need restructuring to deal with truncated or wrapped label and button text because, for example, you've translated into German or something.
Other considerations:
Culture settings - financial delimitors, date formats.
Right-to-left - some languages like arabic are written right to left meaning that the pages require rethinking as to control positioning like images etc.
Good luck whatever you go with.
I ended up doing it the hard way:
I wrote an extension method on the string class called TranslateInto
On the Page's PreRender method I grab all controls recursively based on their type (the types that would have text)
Foreach through them and text.TranslateInto(SupportedLanguages.CurrentLanguage)
In my TranslateInto method I have a ridiculously large switch statement with every string displayed to the user and its associated translation.
Its not very pretty, but it worked.
We work with a Translation CAT tool (Computer Assisted Translation) called MemoQ that allows us to translate the text while leaving all the tags and coding in place. This is very helpful when the order of words change when you translate from one language to another.
It is also very useful because it allows us to work with translators from around the world, without the need for them to have any technical expertise. It also allows us to have the translation proof read by a second translator.
We use this translation environment to translate html, xml, InDesign, Word, etc.
I think you should try Google Translate.
http://translate.google.com/translate_tools
Very easy and very very effective.
HTH
For the project that I'm currently on, I have to deliver specially formatted strings to a 3rd party service for processing. And so I'm building up the strings like so:
string someString = string.Format("{0}{1}{2}: Some message. Some percentage: {3}%", token1, token2, token3, number);
Rather then hardcode the string, I was thinking of moving it into the project resources:
string someString = string.Format(Properties.Resources.SomeString, token1, token2, token3, number);
The second option is in my opinion, not as readable as the first one i.e. the person reading the code would have to pull up the string resources to work out what the final result should look like.
How do I get around this? Is the hardcoded format string a necessary evil in this case?
I do think this is a necessary evil, one I've used frequently. Something smelly that I do, is:
// "{0}{1}{2}: Some message. Some percentage: {3}%"
string someString = string.Format(Properties.Resources.SomeString
,token1, token2, token3, number);
..at least until the code is stable enough that I might be embarrassed having that seen by others.
There are several reasons that you would want to do this, but the only great reason is if you are going to localize your application into another language.
If you are using resource strings there are a couple of things to keep in mind.
Include format strings whenever possible in the set of resource strings you want localized. This will allow the translator to reorder the position of the formatted items to make them fit better in the context of the translated text.
Avoid having strings in your format tokens that are in your language. It is better to use
these for numbers. For instance, the message:
"The value you specified must be between {0} and {1}"
is great if {0} and {1} are numbers like 5 and 10. If you are formatting in strings like "five" and "ten" this is going to make localization difficult.
You can get arround the readability problem you are talking about by simply naming your resources well.
string someString = string.Format(Properties.Resources.IntegerRangeError, minValue, maxValue );
Evaluate if you are generating user visible strings at the right abstraction level in your code. In general I tend to group all the user visible strings in the code closest to the user interface as possible. If some low level file I/O code needs to provide errors, it should be doing this with exceptions which you handle in you application and consistent error messages for. This will also consolidate all of your strings that require localization instead of having them peppered throughout your code.
One thing you can do to help add hard coded strings or even speed up adding strings to a resource file is to use CodeRush Xpress which you can download for free here: http://www.devexpress.com/Products/Visual_Studio_Add-in/CodeRushX/
Once you write your string you can access the CodeRush menu and extract to a resource file in a single step. Very nice.
Resharper has similar functionality.
I don't see why including the format string in the program is a bad thing. Unlike traditional undocumented magic numbers, it is quite obvious what it does at first glance. Of course, if you are using the format string in multiple places it should definitely be stored in an appropriate read-only variable to avoid redundancy.
I agree that keeping it in the resources is unnecessary indirection here. A possible exception would be if your program needs to be localized, and you are localizing through resource files.
yes you can
new lets see how
String.Format(Resource_en.PhoneNumberForEmployeeAlreadyExist,letterForm.EmployeeName[i])
this will gave me dynamic message every time
by the way I'm useing ResXManager
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
I've mainly been doing C# development for the past few years but recently started to do a bit of Python (not Iron Python). But I'm not sure if I've made the mental leap to Python...I kind of feel I'm trying to do things as I would in C#.
Any advice on how I can fully take advantage of Python?
Or any tips\tricks, things to learn more about, things to watch out for?
First, check tgray's and Lundström's advice.
Then, some things you may want to know:
Python is dynamically typed, so unlike C#, you will not
check type, but behavior. You may want to google about duck
typing. It implies you do not have to deal with boxing and
unboxing.
Python is fully object oriented, but the syntax does not
enforce this paradigm. You can write Python without using
the word "class".
The GUI library featured with Python can't compare with
C#'s. Check PyQt, GTK or wxPython libraries.
Python has a lot of concepts you may not be familiar with:
list comprehensions, generators ("yield" does exist in C#,
but it is not used much), decorators, metaclasses, etc. Don't
be afraid; you can program in Python without them. They
are just smart tools, not mandatory.
Like in C#, the Python standard library is huge. Always
look at it when you encounter any problem. It is most
likely that someone solved it already.
Python use LATE binding and variable labels. It's far too
early for somebody starting with the language to worry
about it, but remember that one day you will encounter a
behavior with variables that SEEMS illogical, and you'll
have to check that. For the moment:
Just remember to never do the following:
def myfunc(my_list=[]) :
# bla
Instead:
def myfunc(my_list=()) :
my_list = list(my_list)
And you'll be good. There is a good reason for that, but
that's not the point :-)
Python is cross platform, enjoy writing on Mac, and
run on Linux, if you wish.
Python is not provided with a complex IDE (you got IDLE :-)).
If you are a Visual Studio addict, check Glade. This is
not as advanced as Visual Studio, but it's still a good RAD.
If you want to develop some web application in Python,
remember that Python is not .NET. You must add a web
framework to it if you want to compare. I like Django.
Python does not need a huge IDE to work with. SciTE,
Notepad++, IDLE, Kate, gedit...
Lightweight editors are really sufficient.
Python enforces indentation using spaces and line break,
you can't change that. You should avoid using tabs for
indenting and choose spaces instead. The equivalent of
empty bracelets {} is the keyword "pass".
Python does not enforce private variables. You can define a
private var using "__" (two underscores) at the beginning of
the variable name, but it's still bypassable in some tricky
ways. Python usually assume programmers are grown adults
that know what they do and communicate.
Python uses iteration. A lot. A lot of a lot. And so the
itertools module is you best friend.
Python has no built in delegates. The delegate module is
not what you think. For event-driven programming, use a
GUI lib (or code the pattern yourself, it's not that
difficult).
Python has an interpreter: you can test almost anything,
live. It should always be running next to your text
editor. Python basic interpreter is not much, try IPython
for something tasty.
Python is autodocumented: use docstrings in your own code
and consult other's using "help()" in the python interpreter
Module basics:
sys: manipulate system features
os: set credential, manipulate file paths, rename, recursive file walk, etc
shutil: batch file processing (such as recursive delete)
re: regexp
urllib and urllib2: HTTP¨scripting like downloading, post / get resquests, etc.
datetime: manipulate date, time AND DURATION
thread: you guess it
zlib: compression
pickle: serialization
xml: parsing / Writing XML with SAX or DOM
There are hundreds of modules. Enjoy.
Some typical ways to do things in Python:
Loops:
Python coders use massively the equivalent of the foreach C#
loop, and prefer it to any others:
Basic iterations:
for item in collection:
print str(item)
"collection" can be a string, a list, a tuple... Any
iterable: any object defining the .next() method. There are
a lot of iterables in Python. E.g: a typical Python idiom
to read files:
for line in open("/path/to/file") :
print line
A shortcut to the for loop is called "list comprehension".
It's a way to create an new iterable in one line:
Creating a filtered list with list comprehension:
my_list = [item for item in collection if condition]
Creating a new list with a list comprehension:
my_list = [int(item) * 3 for item in collection]
Creating a new generator with a list comprehension:
my_list = (int(item) * 3 for item in collection)
Same as above, but the values will be generated on the fly
at the first iteration then lost. More information about it here.
Ordinary for loop
If you want to express a usual for loop, you can use the
xrange() function. for (int i = 0; i < 5; i++) becomes:
for i in xrange(0,5) :
do while equivalent
There is no "Do While" in Python. I never missed it, but if
you have to use this logic, do the following:
while True : # Yes, this is an infinite loop. Crazy, hu?
# Do your stuff
if condition :
break
Unpacking
Swapping variables:
a, b = b, a
Multiple assignations:
The above is just a result of what we call "unpacking" (here
applied to a tuple). A simple way to explain it is that you
can assign each value of any sequence directly to an equal
number a variables, in one row:
animal1, animal2, animal3, animal4 = ["cow", "dog", "bird", "fish"]
This has a lot of implications. While iterating on a
multidimensional array, you normally get each sub sequence
one by one then use it, for example:
agenda = [("steve", "jobs"), ("linus", "torvald"), ("bill", "gates"),("jon", "skeet")]
for person in agenda:
print person[0], person[1]
But with unpacking, you can assign the values directly to
variables as well:
agenda = [("steve", "jobs"), ("linus", "torvald"), ("bill", "gates"),("jon", "skeet")]
for name, lastname in agenda:
print name, lastname
And that's why if you want to get an index while iterating,
Python coders use the following idioms (enumerate() is a
standard function):
for index, value in enumerate(sequence) :
print index, value
Unpacking in functions calls
This is advanced use, and you can skip it if it bothers you.
You can unpack values using the sign "*" to use a sequence
directly in a function call. E.g:
>>> foo(var1, var1, var3) :
print var1, var2
print var3
>>> seq = (3.14, 42, "yeah")
>>> foo(*seq)
3.14 42
yeah
There is even more than that. You can unpack a dictionary as
named variables, and write function prototypes with *,
** to accept an arbitrary number of arguments. But it not
used enough to deserve to make this post even longer :-).
String formatting:
print "This is a %s on %s about %s" % ("post", "stackoverflow", "python")
print "This is a %(subject)s on %(place)s about %(about)s" % {"subject" : "post", "place" : "stackoverflow", "about" : "python"}
Slicing an iterable:
You can get any part of an iterable using a very concise syntax:
print "blebla"[2:4] # Print "eb"
last = string[:-1] # Getting last element
even = (0,1,2,3,4,5,6,7,8,9)[::2] # Getting evens only (third argument is a step)
reversed = string[::-1] # Reversing a string
Logical checks:
You can check the way you do in C#, but there are "Pythonic"
ways (shorter, clearer :-)):
if 1 in (1, 2, 3, 4) : # Check en element is in a sequence
if var : # check is var is true. Var == false if it's False, 0, (), [], {} or None
if not var : # Contrary of above
if thing is var: # Check if "thing" and "var" label the same content.
if thing is None : # We use that one because None means nothing in Python (almost null)
Combo (print on one line all the words containing an "o" in uppercase ):
sentence = "It's a good day to write some code"
print " ".join([word.upper() for word in sentence.split() if "o" in word])
Output: "GOOD TO SOME CODE"
Easier to ask for forgiveness than permission
Python coders usually don't check if something is possible. They are a bit like Chuck Norris. They do it. Then catch the exception. Typically, you don't check if a file exists, you try to open it, and roll back if it fails:
try :
f = open(file)
except IOerror :
print "no file here !"
Of course Chuck Norris never uses excepts since he never fails.
The else clause
"Else" is a world of many uses in Python. You will find
"else" after "if", but after "except" and "for" as well.
for stuff in bunch :
# Do things
else :
# This always happens unless you hit "break" in the loop
This works for "while" loop too, even if we do not use this
loop as much.
try :
# A crazy stuff
except ToCrazyError :
# This happens if the crazy stuff raises a ToCrazyError Exception
else :
# This will happen if there is no error so you can put only one line after the "try" clause
finally :
# The same as in C#
If you are curious, here is a bunch of advanced quick and
dirty (but nice) Python snippets.
Refrain from using classes. Use dictionaries, sets, list and tuples.
Setters and getters are forbidden.
Don't have exception handlers unless you really need to - let it crash in style.
Pylint can be your friend for more pythonish coding style.
When you're ready - check out list comprehensions, generators and lambda functions.
If you are not new to programming, I would recommend the book "Dive into Python" by Mark Pilgrim. It explains Python in a way that makes it easy to understand how Python techniques and idioms can be applied to build practical applications.
Start by reading The Zen of Python
You can read it at the link above, or just type import this at the Python prompt. =)
Take advantage of Python features not offered* by C#
Such as duck-typing, metaclasses, list comprehension, etc.*
Write simple programs just to test these features. You'll get used (if not addicted) to them in no time.
Look at the Python Standard Library
So you don't reinvent the wheel. Don't try to read the whole thing, even a quick look at the TOC could save you a lot of time.
* I know C# already has some of these features, but from what I can see they're either pretty new or not commonly used by C# developers. Please correct me if I'm wrong.
In case you haven't heard about it yet, Dive Into Python is a great place to start for anyone learning Python. It also has a bunch of Tips & Tricks.
If you are someone who is better learning a new language by taking small incremental steps then I would recommend using IronPython. Otherwise use regular CPython and don't do any more C# coding until you feel like you have a grasp of Python.
I would suggest getting a good editor so that you don't get bitten by whitespace. For simplicity, I just use ActivePython's packages Link, which include an editor and all of the win32api libraries. They are pretty fun to get into if you have been using C#. The win32api in Python can be a little bit simpler. You don't need to do the whole DDLImport thing. Download ActivePython (which comes with CPython), open it up, and start entering some stuff at the console. You will pick it up fairly easy after using C#. For some more interesting Python tidbits, try ActiveState code, which has all sorts of recipes, which can allow you to very simply see different things that you can do with Python.
I'm pretty much in your shoes too, still using C# for most of my work, but using Python more and more for other projects.
#e-satis probably knows Python inside-out and all his advice is top-notch. From my point of view what made the biggest difference to me was the following:
Get back into functional. not necessarily spaghetti code, but learning that not everything has to be in an object, nor should it be.
The interpreter. It's like the immediate window except 10^10 better. Because of how Python works you don't need all the baggage and crap C# makes you put in before you can run things; you can just whack in a few lines and see how things work.
I've normally got an IDLE instance up where I just throw around snippets as I'm working out how the various bits in the language works while I'm editing my files... e.g. busy working out how to do a map call on a list, but I'm not 100% on the lambda I should use... whack in a few lines into IDLE, see how it works and what it does.
And finally, loving into the verbosity of Python, and I don't mean that in the long winded meaning of verbosity, but as e-satis pointed out, using verbs like "in", "is", "for", etc.
If you did a lot of reflection work in C# you'll feel like crying when you see how simple the same stuff is in Python.
Good luck with it.
If you have programming experience and don't feel like spending money I'd recommend How to Think Like a Computer Scientist in Python.
And then something you can benefit from:
IPython shell: Auto completion in the shell. It does batch operations, adds a ton of features, logging and such. >>> Play with the shell - always!
easy_install / pip: So nice and an easy way to install a 3rd party Python application.
I have two strings and would like to display the difference between them. For example, if I have the strings "I am from Mars" and "I am from Venus", the output could be "I am from Venus". (Typically used to show what changed in an audit log, etc.)
Is there a simple algorithm for this? I am using C# but I guess a generic algorithm could be adapted from any programming language.
Or is there a framework class/third-party library that will do this sort of thing?
Check this out: http://en.wikipedia.org/wiki/Diff#Algorithm
Also: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
There is also an implementation described here: http://www.codeproject.com/KB/recipes/DiffAlgorithmCS.aspx