Author Topic: Localization tips and advice for software developers  (Read 4934 times)

spiros

  • Administrator
  • Hero Member
  • *****
  • Posts: 293795
  • Gender: Male
  • point d’amour
    • spiros.doikas
    • greektranslator
    • doikas
    • 102094522373850556729
    • lavagraph
    • Greek translator CV
Localization tips and advice for software developers
« on: 12 Nov, 2005, 22:32:09 »
It is quite sad that even today, in Greece, there are many programmers who develop very popular software applications which are so damn hard to localize just because they have not taken into consideration localization issues at the outset of the project.

One of the many cases is Ellinomatheia whose gravest fault is the existence of hardcoded strings all over the place.


In computer programming or text markup, to hardcode (less frequently, hard code) is to use an explicit value rather than a symbolic name for something that is likely to change at a later time. Such coding is sometimes known as hardcode (noun), and it is more difficult to change if that later becomes necessary. In most programming languages, it is possible to equate a symbol with a particular value (a name or a number). If the value changes, the symbol stays the same and only the line of code that defines it needs to be changed. When the program is recompiled, the new value is picked up wherever the symbol occurs in the code. Although there are search-and-replace tools that can change all occurrences of a given value, program code is very unforgiving if a small error is introduced, and it is safer to have a single place in which such a change can be made. For this reason, hardcoding is usually a practice to be avoided.
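
As a rough illustration of the definition above, here is a minimal sketch in standard C++. The names kMaxLoginAttempts, isLockedOutHardcoded and isLockedOut are hypothetical and only serve to contrast an explicit literal with a symbolic name.

```cpp
// Symbolic name: the limit is defined in exactly one place.
constexpr int kMaxLoginAttempts = 3;

// Hard-coded variant: the literal 3 is repeated wherever the limit is
// needed, so every occurrence must be found and edited if it changes.
bool isLockedOutHardcoded(int failedAttempts) {
    return failedAttempts >= 3;
}

// Symbolic variant: changing kMaxLoginAttempts updates every use at once.
bool isLockedOut(int failedAttempts) {
    return failedAttempts >= kMaxLoginAttempts;
}
```

Both functions behave the same today; the difference only shows when the value has to change. The same reasoning applies to user-visible strings, which is why they belong in resources rather than in code.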

Below are some tips and links for developing products that can be easily localized:

Consider differences in length for text strings   
Languages need different amounts of space to convey the same meaning, so when you design your application you should allow for the different lengths of the text strings in each target language. As an average, allow 30% more space for any text. Depending on the language and the phrase, the localized string might even require twice as much space.
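
The rule of thumb above can be turned into a small calculation. This is only a sketch: the helper name reservedWidth and its parameters are made up for illustration, and it assumes a layout where widths are budgeted by hand.

```cpp
// Hypothetical helper: space to reserve for a text that needs
// `sourceWidth` units in the source language.
// extraPercent = 30 reflects the "30% more space" average;
// extraPercent = 100 covers strings that double in length.
int reservedWidth(int sourceWidth, int extraPercent = 30) {
    // Integer arithmetic, rounding the extra space up.
    return sourceWidth + (sourceWidth * extraPercent + 99) / 100;
}
```

For example, reservedWidth(100) gives 130 and reservedWidth(100, 100) gives 200.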

Use generic images and icons   
Bitmaps that are culturally neutral and that do not contain anything that must be localized are ideal.

Use text that can wrap   
With labels for such controls as radio buttons and check boxes, it is important that you set the appropriate property that makes it possible for the text to wrap. Localization vendors often cannot change the properties of radio button and check box labels. If you do not make it possible for the text to wrap and the translation does not fit on one line, the localizer's only alternative is to omit some of the text.

Add one extra line per variable   
When variables are used in UI strings, localization usually requires extra space. As a rule of thumb, one line per variable should be added to the text box so that there is enough space for the localized text. For example, the translation of a sentence that does not contain variables might require the entire first line of a text box. Where there is a variable inside of the sentence, the inserted text for the variable would extend the sentence into the second line. Allocate enough space for this spillover so that it will still be readable.

Avoid hiding or overlapping controls   
UI controls such as buttons or drop-down lists should not be placed on top of other controls. Sizing and hotkey issues with hidden controls are usually found only through testing, which might not be done during localization. When controls overlap, the UI is effectively not localizable, because a button cannot be extended to the length required for the translation without rearranging the button positions. Rearranging button positions can be costly and makes the UI inconsistent among languages.

Avoid hard coding localizable string elements   
Because localization vendors often don't work with the source code, hard coded strings are usually only found through testing. For example, there might be a string in a resource file without a period at the end of a sentence because the period is added in the code. Because they don't see the code, translators might assume that this is a mistake and end the translation with a period. The result would be localized versions that have two periods at the end of the sentence. It is best to avoid hard coding all localizable string elements to save time for localization, build and test teams.
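
The double-period problem described above can be reproduced in a few lines. This is a sketch under stated assumptions: StringTable is a hypothetical in-memory stand-in for a resource file, and both function names are invented for the example.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a resource file: key -> localized text.
using StringTable = std::map<std::string, std::string>;

// Problematic: the period is added in code, where the translator cannot
// see it. If the translation already ends with a period, users see two.
std::string messageWithHardcodedPeriod(const StringTable& table,
                                       const std::string& key) {
    return table.at(key) + ".";
}

// Preferred: the resource holds the complete sentence, punctuation
// included, and the code returns it unchanged.
std::string message(const StringTable& table, const std::string& key) {
    return table.at(key);
}
```

With a table entry "saved" translated as "Datei gespeichert.", the first function returns "Datei gespeichert.." while the second returns the sentence exactly as the translator wrote it.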

Avoid using controls within a sentence   
You might want to place a UI control within a sentence. For example, you might want to give users a drop-down menu to make a choice within a sentence. This practice is not recommended, because to localize a sentence that includes UI controls, the localizer often has to either change the position of the controls (if possible) or be content with an improper sentence structure. Also, the UI controls are often drop-down combo boxes that are composed of multiple controls. Moving and aligning these can be error-prone.

Avoid placing button text into a string variable   
Text on a button should never be dynamically linked onto the button from a string variable but should be placed on the button itself as a property of the button. Button sizes usually have to be adjusted to fit the length of the translation on it. Localizers have no way of telling which strings in the string table end up on which button at run time.

Remove unused strings   
Unused strings and dialog boxes should be removed from the source, so localizers do not waste time localizing them.
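
One possible way to find removal candidates is to compare the keys in the string table with the keys the code actually references. The sketch below assumes both sets have already been collected by some other means (for example, by scanning the sources); the function name unusedKeys is hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the keys present in the string table that are never referenced.
// The result is in sorted order, because std::set iterates in sorted order.
std::vector<std::string> unusedKeys(const std::set<std::string>& tableKeys,
                                    const std::set<std::string>& referencedKeys) {
    std::vector<std::string> unused;
    for (const auto& key : tableKeys) {
        if (referencedKeys.count(key) == 0) {
            unused.push_back(key);
        }
    }
    return unused;
}
```

A key reported here is only a candidate: strings looked up through computed keys would not appear in a simple scan, so the list is best reviewed before anything is deleted.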


« Last Edit: 22 Jan, 2013, 18:46:11 by spiros »


spiros

Re: Localization tips and advice for software developers
« Reply #2 on: 10 Apr, 2012, 15:39:56 »
The attached PDF contains a checklist for localization-friendly software design.


International software design checklist: 30 software engineering considerations

1. Design team considers translation and localisation from the beginning of the project.      
2. All international editions are compiled from one set of source files.      
3. Localisable items are stored externally in resource files, or resource bundles.      
4. Code supports Unicode or conversion between Unicode and local codepages.      
5. String buffers are large enough to handle translated words and phrases.      
6. No assumptions are made that one character storage element represents one linguistic character.      
7. Validate databases to ensure that schemas, datatypes and table design are ready for a multi-locale environment.      
8. All language editions can deal with one another's data.      
9. Program takes advantage of generic text layout functions when available.      
10. International laws affecting design and operation are considered.      
11. Code uses generic datatypes and generic function prototypes if available.      
12. Program handles input of international data.      
13. Program contains support for locale-specific hardware if required.      
14. The product runs properly in its base language in all locales.      
15. Program depends on operating or runtime system functions for sorting, character typing and string mapping.      
16. Third-party components and software used in the product are examined for I18N support.      
17. Strings are not assembled by concatenation of fragments.      
18. Source code does not contain hard-coded character constants, numeric constants, screen positions, filenames or pathnames that assume a particular language.      
19. Code is generic enough to handle the required range of character sets.      
20. Code properly handles all characters in the program's character set.      
21. Code processes all character sets correctly regardless of character widths.      
22. Application works correctly on localised editions of the target operating system.      
23. Program meets international testing standards.      
24. Icons, cursors and bitmaps are generic, are culturally independent and do not contain text whenever possible.      
25. Code does not use embedded font names or make assumptions about particular fonts being available.      
26. Displayed and printed text uses appropriate fonts.      
27. Menu and dialog-box keyboard assignments are unique.      
28. If ethnocentric graphics, colours or fonts are used, they can be replaced dynamically using locale-sensitive switch statements.      
29. Sorting and case conversion are culturally correct.      
30. Program handles user keyboard layout changes.      
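
Item 6 in the list above (one storage element is not one linguistic character) is easy to demonstrate with UTF-8. The following is a minimal sketch in standard C++; the function name codePointCount is hypothetical, and the code assumes its input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (bit pattern 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t codePointCount(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or single-byte character
        }
    }
    return count;
}
```

For the two Greek letters alpha and beta, encoded as "\xCE\xB1\xCE\xB2", size() reports 4 storage elements while codePointCount reports 2. Even code points are not always what a user perceives as one character (combining marks are separate code points), so buffer sizes and cursor movement should not rely on either number alone.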

International software design checklist: 12 international usability, UI and human factors

1. Consistent terminology is used in messages.      
2. UI language strings are reviewed for meaning and spelling to reduce user confusion and lessen translation errors.      
3. Menus, dialogs and Web layouts can tolerate text expansion.      
4. Strings are documented using comments to provide context for translators.      
5. Users can type all supported characters into documents, dialog boxes and filenames.      
6. Shortcut-key combinations are accessible on all international keyboards.      
7. Program responds to changes in the user's choice of international settings (for example, UI language can be changed through a straightforward menu option).      
8. Translated text meets requirements of end users who are native speakers.      
9. Dialogs and forms are resized and UI text is aesthetically presented.      
10. Translated dialogs, toolbars, status bars and menus fit on the screen at different resolutions.      
11. User can successfully cut, paste, save and print text regardless of language.      
12. Established test plans and tools exist for the source product and can be applied to localised versions.      
« Last Edit: 29 Dec, 2012, 15:28:30 by spiros »


spiros

Re: Localization tips and advice for software developers
« Reply #3 on: 25 Oct, 2012, 01:10:12 »
Pitfall #1: Pixel Based Layouts
English text is often very compact compared to other languages where the translated text is often substantially longer. Therefore the interface must be able to adjust size to accommodate the length of translations provided at runtime. If it can't do this, then messages will end up misaligned and truncated.

The answer is to use layout managers. Qt provides a number of such layout managers pre-made for you. They include QHBoxLayout, QVBoxLayout, QGridLayout and QStackedLayout, all of which are subclasses of QLayout. You may also create your own QLayout based classes, but this is generally not needed.

These layout classes manage the pixel positioning of widgets for you at runtime, so your interface adjusts properly whatever the size of the translated strings. For more information, see the documentation for QLayout.

Pitfall #2: Word Puzzles
Do not concatenate pieces of sentences like this:

// Bad: the sentence is split into fragments that are translated separately.
QString msg = i18n("Do you want to replace ") +
              oldFile + i18n(" with ") +
              newFile + "?";

Such "word puzzles" are very hard or even impossible to translate. This is because the structure of the sentence will often be completely different in another language and thus must be controlled by the translator. When the order of words and phrases is hard-coded as in the above example, the translator cannot create a proper translation.

Adding to this problem, a translator will only see parts of the sentence while translating and will have to guess at what belongs together.

The solution, thankfully, is quite simple: use %number placeholder substitution. Translators then see the entire sentence while translating, so they can produce a good translation, and they can also change the order of the arguments freely. The arguments themselves are passed as extra parameters to i18n().

The above example written properly would then look like this:

QString msg = i18n("Do you want to replace %1 with %2?",
                   oldFile, newFile);

It would be even better if you wrote it more explicitly:

QString msg = i18n("Do you want to replace file %1 with file %2?",
                   oldFile, newFile);

Another option is to write a comment just before the string, like this:

/* TRANSLATORS: Replacing an old file (%1) with a new one (%2). */
QString msg = i18n("Do you want to replace %1 with %2?",
                   oldFile, newFile);
 
Note
Avoid inserting anything other than numbers or nouns with this method, since in some languages the translation depends on the inserted words. It is therefore best to write strings that are as close to complete sentences as possible.

A related mistake is leaving rich-text markup tags out of the translatable string. Not all languages use such markup in the same way as English, so the translator needs to be able to "translate" the markup accordingly as well.

Similarly, messages that contain a version string or other often changing parts should be inserted by placeholders into the message. This prevents unnecessary changes that cause the translators to have to change the translated messages as well.

Since KDE is translated into more than 65 languages, a single string change causes at least 65 people to open the file, find the changed message, check carefully whether this is the only thing that has changed, change the translation, save the file again and commit the changed file into the code repository. All in all, such a small change might create hours of work which could easily be avoided.
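
To see why numbered placeholders let translators reorder arguments, here is a minimal stand-in written in standard C++. It is not how i18n() is implemented; the function name substitute and its behaviour (only %1 to %9, unknown placeholders left untouched) are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replaces %1..%9 in `pattern` with the matching entry of `args`.
// Placeholders without a matching argument are copied through unchanged.
std::string substitute(const std::string& pattern,
                       const std::vector<std::string>& args) {
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const bool isPlaceholder = pattern[i] == '%' &&
                                   i + 1 < pattern.size() &&
                                   pattern[i + 1] >= '1' &&
                                   pattern[i + 1] <= '9';
        if (isPlaceholder) {
            const std::size_t index =
                static_cast<std::size_t>(pattern[i + 1] - '1');
            if (index < args.size()) {
                out += args[index];
                ++i;  // skip the digit that follows '%'
                continue;
            }
        }
        out += pattern[i];
    }
    return out;
}
```

The calling code always passes the arguments in the same order. A translation whose grammar needs the new file first can simply use a pattern such as "%2 instead of %1?", and the program does not have to change.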

Pitfall #3: Lack of Unicode Support
Whenever source code handles strings using a datatype (such as char) or class (such as std::string) that cannot handle Unicode, translations will break.

To avoid this, never call QString::latin1() or QString::ascii() on translated strings (in Qt 4 and later, the corresponding conversions are named QString::toLatin1() and QString::toAscii()). This also applies to information resulting from user input such as passwords, URLs and filenames. If you really need a plain char* representation of a string, it is better to use QString::utf8() (QString::toUtf8() in Qt 4 and later).
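
The reason a Latin-1 conversion is dangerous can be shown without Qt. The sketch below, in standard C++, checks whether a UTF-8 string could be represented in Latin-1 at all; the function name fitsInLatin1 is hypothetical and the input is assumed to be valid UTF-8.

```cpp
#include <string>

// Returns true if every code point in the UTF-8 input is U+00FF or below,
// that is, if the text could be stored in Latin-1 without loss.
// In valid UTF-8, code points up to U+00FF use either one byte or a
// two-byte sequence led by 0xC2 or 0xC3; lead bytes from 0xC4 upwards
// always start a code point beyond U+00FF.
bool fitsInLatin1(const std::string& utf8) {
    for (unsigned char byte : utf8) {
        if (byte >= 0xC4) {
            return false;
        }
    }
    return true;
}
```

A French string such as "caf\xC3\xA9" passes the check, but a Greek string such as "\xCE\xB1" does not, so any Latin-1 conversion of a Greek translation has to drop or replace characters.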

 
Note
For more information on character sets and Unicode, see the Unicode tutorial.

KIO slaves may also provide paths and file names encoded using UTF-8. It is up to the programmer, however, to pass properly encoded filenames to the KIO method in question. The correct way to do this is not to guess at the user's filesystem encoding but to use QFile::encodeName() and QFile::decodeName() instead.

 
Tip
You can turn KIO's UTF-8 file name support on for testing by exporting the KDE_UTF8_FILENAMES environment variable in your shell's startup file (e.g. ~/.bashrc).

Pitfall #4: Complex Text Flow
When designing an application that needs non-standard text flow, don't assume that the same rules apply to all languages. Take vertical writing as an example: East Asian languages that use Chinese characters have a long history of vertical writing, even longer than that of horizontal writing. Strings are not rotated by 90 degrees; instead, single characters are placed one under another. Different scripts may behave differently, so expect to need specialised implementations.
http://techbase.kde.org/Development/Tutorials/Localization/i18n_Mistakes

