Internationalization and
localization are terms used
to describe the effort to make WordPress (and other such projects)
available in languages other than English, for people from different
locales, who use different dialects and local preferences.
The process of localizing a program has two steps. The first step
is when the program's developers provide a mechanism and method for the
eventual translation of the program and its interface to suit local
preferences and languages for users worldwide. WordPress developers have
done this, so in theory, WordPress can be used in any language.
The second step is the actual
localization, the process by
which the text on the page and other settings are translated and
adapted to another language and culture, using the framework prescribed
by the developers of the software. WordPress has already been localized
into many other languages (see
WordPress in Your Language for more information).
This article explains how translators (bi- or multi-lingual
WordPress users) can go about localizing WordPress to more languages.
Translating WordPress
Before you start translating WordPress, check
WordPress in Your Language
(and resources cited there) to see if a translation of WordPress into
your language already exists. It is also possible that someone (or a
team) is already working on translating WordPress into your language,
but they haven't finished yet. To find out, subscribe to the
polyglots' blog, introduce yourself, and ask if there's anyone translating into your language. There is also a list of
localization teams and
localization teams currently forming, which you can check to see if a translation is in progress.
Qualifications
If no WordPress translation into your language exists and no one is
already working on one, you may want to volunteer to create a public
translation of WordPress into your language. If so, here are the
qualifications you will need:
- You need to be truly bilingual -- fluent in both written
English and the language(s) you will be translating into. Casual
knowledge of either one will make translating difficult for you, or make
the localization you create confusing to native speakers.
- You need to be familiar with PHP, as you will sometimes need
to read through the WordPress code to figure out the best way to
translate messages.
- You should be familiar with human language constructs: nouns,
verbs, articles, etc., different types of each, and be able to identify
variations of their contexts in English.
About Locales
A
locale is a combination of language and regional dialect.
Usually locales correspond to countries, as is the case with Portuguese
(Portugal) and Portuguese (Brazil).
You can do a translation for any locale you wish, even other
English locales such as Canadian English or Australian English, to
adjust for regional spelling and idioms.
The default locale of WordPress is U.S. English.
Localization Technology
WordPress's developers chose to use the
GNU gettext
localization framework to provide localization infrastructure to
WordPress. gettext is a mature, widely used framework for modular
translation of software, and is the
de facto standard for localization in the open source/free software realm.
gettext uses
message-level translation — that is, every
"message" displayed to users is translated individually, whether it be a
paragraph or a single word. In WordPress, such "messages" are
generated, translated, and used by the WordPress PHP files via two PHP
functions.
__() is used when the message is passed as an argument to another function;
_e() is used to write the message directly to the page. More detail on these two functions:
- __('message')
- Searches the localization module for the translation of 'message' and returns it to the calling code. If no translation is found for 'message', it returns 'message' unchanged.
- _e('message')
- Searches the localization module for the translation of 'message' and echoes it directly to the page. If no translation is found for 'message', it echoes 'message' unchanged.
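The lookup-and-fallback behavior of these two functions can be sketched in a few lines. This is a minimal Python illustration, not WordPress's PHP implementation: the CATALOG dictionary and the names translate and echo_translation are hypothetical stand-ins for the loaded translation file, __() and _e().

```python
# Hypothetical in-memory catalog standing in for a loaded translation file.
CATALOG = {"Register": "रजिस्टर"}


def translate(message):
    """Like __(): return the translation, or the message itself if none exists."""
    return CATALOG.get(message, message)


def echo_translation(message):
    """Like _e(): write the translated message straight to the output."""
    print(translate(message), end="")
```

Because an untranslated message falls back to the original English text, a partially translated catalog still produces a usable interface.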
Note that if you are internationalizing a Theme or Plugin, you should use a "Text Domain". See
Writing a Plugin for more information on how to do this for a plugin; themes are similar.
The gettext framework takes care of most of WordPress. However,
there are a few places in the WordPress distribution where gettext
cannot be used -- see
Files For Direct Translation for more information on how to translate these spots.
gettext files
There are three types of files used in the gettext translation
framework. These files are used and/or generated by translation tools
during the translation process, as follows:
- POT (Portable Object Template) files
- The first step
in the localization process is that a program is used to search through
the WordPress source code and pick out every message passed into a __() or _e()
function. This list of English-language messages is put into a
specially-formatted template file (POT file) that forms the basis of all
translations. Generally, you can download a POT file for WordPress, so
you shouldn't have to generate your own. Separate POT files can also be
made for themes and plugins, if the theme/plugin developer has enclosed
all text in __() or _e() functions.
- PO (Portable Object) files
- The second step in the
localization process is that the translator translates all the messages
from the POT file into the target language, and saves both English and
translated messages in a PO file.
- MO (Machine Object) files
- The final step in the
localization process is that the PO file is run through a program that
turns it into an optimized machine-readable binary file (MO file).
Compiling the translations into this binary format makes the localized program
much faster at retrieving the translations while it is running.
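To make the PO-to-MO step less abstract, here is a rough Python sketch of what a compiler such as msgfmt produces, followed by a lookup through Python's standard gettext module, which reads the same GNU MO format. The function name compile_mo is hypothetical, and the sketch ignores plural forms, message contexts, and the optional hash table; in practice you would use msgfmt or Poedit rather than code like this.

```python
import gettext
import io
import struct


def compile_mo(messages):
    """Build GNU MO bytes from a {msgid: msgstr} dict (no plurals or contexts)."""
    # The empty msgid carries the header; its charset tells readers how to decode.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n", **messages}
    # MO files store their entries sorted by msgid.
    pairs = [(k.encode("utf-8"), v.encode("utf-8")) for k, v in sorted(entries.items())]
    n = len(pairs)
    ids_table = 7 * 4                      # index of msgids follows the 28-byte header
    strs_table = ids_table + 8 * n         # index of msgstrs follows the msgid index
    id_off = strs_table + 8 * n            # msgid bytes follow both indexes
    str_off = id_off + sum(len(k) + 1 for k, _ in pairs)
    id_index, str_index = [], []
    for k, v in pairs:
        id_index += [len(k), id_off]       # (length, offset) pair per msgid
        str_index += [len(v), str_off]     # (length, offset) pair per msgstr
        id_off += len(k) + 1               # +1 for the NUL terminator
        str_off += len(v) + 1
    header = struct.pack("<7I", 0x950412DE, 0, n, ids_table, strs_table, 0, 0)
    index = struct.pack("<%dI" % (4 * n), *id_index, *str_index)
    ids = b"".join(k + b"\0" for k, _ in pairs)
    strs = b"".join(v + b"\0" for _, v in pairs)
    return header + index + ids + strs


mo_bytes = compile_mo({"Post": "Artikkeli", "May": "Květen"})
catalog = gettext.GNUTranslations(io.BytesIO(mo_bytes))
```

Here catalog.gettext("Post") looks the message up in the compiled data, and a message that is not in the file comes back unchanged.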
Translation Tools
There are various tools available to aid in translating. You may use whichever you prefer.
- GlotPress
- GlotPress
will let you, or an entire team, translate your favourite software. It
is web-based and open-source. It is also completely in sync with the
main repositories and the preferred method for translating WordPress
into your language.
- Launchpad
- The
Ubuntu Linux project has a web site that allows you to translate
messages without even looking at a PO or POT file, and export directly
to a MO.
- Note: many translators have found Rosetta (the translation tool on
Launchpad) to be a good starting point, but once it comes time to
proofread the entire list of translations, many have opted to switch to
hand-editing the PO file or using a program like Poedit or KBabel, since
the Rosetta UI lacks a search feature and other things that become
essential when proofreading and editing.
- Pootle
- An
open source web-based translation system. The server hosted at
Locamotion.org currently has WordPress translation enabled on it.
- Poedit
- An
open source program for Windows, Mac OS X and UNIX/Linux which provides
an easy-to-use GUI for editing PO files and generating MO files.
- KBabel
- Another open source PO editing program, for the KDE desktop environment on Linux.
- GNU Gettext
- The
official Gettext tools package contains command-line tools for creating
POTs, manipulating POs, and generating MOs. For those comfortable with a
command shell.
Translating With GlotPress
Instructions for translating with GlotPress are in the
Getting Started guide. If you don't see your language listed, please request its inclusion on the
WP Polyglots blog.
Translating With Launchpad
We have a separate page with
instructions for translating WordPress with Launchpad.
Translating With Pootle (at Locamotion.org)
- Register an account at the Pootle server, and send an e-mail to one of the admins to add your language
- Before trying to translate anything, remember to log in
to Pootle. Content can sometimes be viewed and suggestions can
sometimes be entered even if a visitor is not logged in, but one can
only translate if logged in.
- Visit the WordPress page for your language. For example, the Afrikaans page is at pootle.locamotion.org/af/wordpress/ (remember the trailing slash).
- Click "Show Editing Functions".
- Click "Quick Translate" to edit only untranslated and fuzzy strings, or click "Translate All" to edit all strings.
For the purpose of translating WordPress at locamotion.org, the
single wordpress.pot file has been split up into smaller logical units.
The readme.html file is also available there, and so is a file
containing all the strings that one would normally add to the PHP files
manually.
Also take a look at the Decathlon wiki page for WordPress,
here and
here.
Merging your translations into wordpress.pot
Normally, with a Pootle server, a translator can download the PO file
for their chosen software at any time and submit it to the project.
However, because the original source file has been split into smaller
units at pootle.locamotion.org, translators must manually merge their
translations back into the wordpress.pot file before submitting it to
WordPress.
- Download the official WordPress POT file.
- Download the WordPress Continent POT file (Optional)
- Download and install the Translate Toolkit on your computer.
- Download the translated or partially translated PO files from
the Pootle server. You can download them one by one or you can download
them in a ZIP file (see options on the web site). Normally you don't
need to be logged in at Pootle to download the PO files that translators
in your language have translated.
- First, combine the PO files into a single translation memory
(because it is easier to do subsequent steps with a single file than
with several files), and execute the following from the command line: po2tmx -l xx -i pofiles -o xx.tmx where "xx" is your target language code. This will create a TMX translation memory file called xx.tmx.
- Second, pre-translate the WordPress POT file using the
translation memory. To do this, execute the following from the command
line: pot2po --tm=xx.tmx -i wordpress.pot -o wordpress_xx.po. This will create a PO file for your language called wordpress_xx.po.
- Lastly, do a word/string count on your PO file to see how much
of it is translated, fuzzy and untranslated, using the following from
the command line: pocount wordpress_xx.po
If all the PO files were 100% translated, the final wordpress_xx.po
file will also be 100% translated. If some strings were not translated
in the PO files, the pot2po command might cause some fuzzy translations
in wordpress_xx.po (this is not a bad thing).
Translating With Poedit
- Download and install Poedit
- Download the official WordPress POT file
- Open the file in Poedit.
- (See Image) The box labeled (1) is the original message
(in English) from the POT file. The box labeled (2) is where you add
your translation. Boxes labeled (3) and (4) are used for adding comments
about the messages. These come in handy if you are working with a team
of translators and would like to pass around ideas through the PO file.
- Go to File → Save as… to save your translations in a PO file.
- When you are finished translating, go to File → Save as… again to generate the MO file.
- Alternatively, set Poedit to compile an MO file on every save: click File → Preferences and, on the Editor tab, check the "Automatically compile .mo file on save" box.
Translating With KBabel
This section is incomplete.
- Download the official WordPress POT file
- Open the file in KBabel
Translating With Gettext Tools
- Download the official WordPress POT file
- Open the file in your favorite text editor
- Update the header information
- Translate the messages
- Save the file with a .po file extension
- Issue msgfmt -o filename.mo filename.po
The PO File Header
At the beginning of the PO file is something called the
header.
This gives information about what package and version the translation
is for, who the translator was, and when it was created. Certain
portions of this header should be universal for all WordPress
translations:
# LANGUAGE (LOCALE) translation for WordPress.
# Copyright (C) YEAR WordPress contributors.
# This file is distributed under the same license as the WordPress package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: WordPress VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2005-02-27 17:11-0600\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"
Fill in the rest of the capitalized text with the appropriate values.
Message Format
The remainder of the file will be in a format as follows:
#: wp-comments-post.php:13
msgid "Sorry, comments are closed for this item."
msgstr ""
#: wp-comments-post.php:29
msgid "Sorry, you must be logged in to post a comment."
msgstr ""
#: wp-comments-post.php:35
msgid "Error: please fill the required fields (name, email)."
msgstr ""
The first line of each message contains the location of the message
in the WordPress code. In the case of these messages, they're all
located in wp-comments-post.php, on lines 13, 29, and 35, respectively.
Occasionally you will come across a message for which you will need to
check its context; look at the appropriate line or lines in the
WordPress core, and you should be able to figure out when and where the
message is displayed, and even reproduce it yourself using your web
browser. Some messages will also appear with the same text in multiple
locations; in that case, there may be more than one line giving a file
and line location.
The next line,
msgid, is the
source message. This is the string that WordPress passes to its
__() or
_e() functions, and the message you will need to translate.
The final line,
msgstr, is a blank string where you will fill in your translation.
Here's how the same few lines would look after being translated, using the
French (France) locale as an example:
#: wp-comments-post.php:13
msgid "Sorry, comments are closed for this item."
msgstr "L'ajout de commentaire n'est pas ou plus possible pour cet article."
#: wp-comments-post.php:29
msgid "Sorry, you must be logged in to post a comment."
msgstr "Vous devez être connecté pour rédiger un commentaire."
#: wp-comments-post.php:35
msgid "Error: please fill the required fields (name, email)."
msgstr "Erreur : veuillez remplir les champs obligatoires vides (nom, e-mail)."
- Note: see Character encodings and HTML character entities below for notes on when to use HTML character entities in translation.
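The entry layout described above (location comments, msgid, msgstr) is simple enough to read with a short script. The following Python sketch parses PO text into entries and counts how many are translated, roughly in the spirit of pocount. The function name parse_po and the sample text are illustrative assumptions; the parser leaves escape sequences as-is and ignores plural forms, so it is not a replacement for the real gettext tools.

```python
def parse_po(text):
    """Return a list of {'refs', 'msgid', 'msgstr'} dicts from PO source text."""
    entries, current, field = [], None, None

    def start_entry():
        entry = {"refs": [], "msgid": "", "msgstr": ""}
        entries.append(entry)
        return entry

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#:"):
            # A location comment after a finished entry begins a new one.
            if current is None or field == "msgstr":
                current, field = start_entry(), None
            current["refs"] += line[2:].split()
        elif line.startswith("msgid "):
            if current is None or field == "msgstr":
                current = start_entry()
            field = "msgid"
            current["msgid"] = line[6:].strip()[1:-1]   # drop surrounding quotes
        elif line.startswith("msgstr ") and current is not None:
            field = "msgstr"
            current["msgstr"] = line[7:].strip()[1:-1]
        elif line.startswith('"') and field:
            current[field] += line[1:-1]                # continuation line
    return entries


SAMPLE = '''
#: wp-comments-post.php:13
msgid "Sorry, comments are closed for this item."
msgstr "L'ajout de commentaire n'est pas ou plus possible pour cet article."

#: wp-comments-post.php:29
msgid "Sorry, you must be logged in to post a comment."
msgstr ""
'''

entries = parse_po(SAMPLE)
translated = sum(1 for e in entries if e["msgid"] and e["msgstr"])
untranslated = sum(1 for e in entries if e["msgid"] and not e["msgstr"])
```

A count like this is a quick way to see how much of a file still needs work before compiling it.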
Types of messages
Labels
Labels are often used in the context of HTML
<label>,
<legend>,
<a>, or
<select>
tags. They are short and precise descriptors of the purpose of a UI
element. These can be very difficult to translate at times, especially
if they are single words, and if the word used in English can be
interpreted as either a noun or imperative verb. With most
labels
you will need to do some searching through the code to find the context
of its use before coming up with an appropriate translation.
Because so many of the messages are part of the WordPress administration interface,
labels are probably the most frequent type of message to translate.
Examples
msgid "Post"
msgstr "Artikkeli"
"Post" could be interpreted as an imperative verb, but in this
context it's a noun. The noun form of "post" in English can be difficult
to translate, and the most appropriate translation has been difficult
for some teams to decide upon. Many translations use their language's
equivalent to the English "Article," as this one does. (From the Finnish (Finland) translation.)
#: wp-login.php:79 wp-login.php:233 wp-register.php:166
#: wp-includes/template-functions-general.php:46
msgid "Register"
msgstr "रजिस्टर"
From the Hindi translation.
#: wp-admin/admin-functions.php:357
msgid "- Select -"
msgstr " - Dewis -"
Items like the surrounding dashes in this example can be
eliminated or replaced if they might be confusing to users in your
target locale, or if there are different established conventions for
your locale. From the Welsh translation.
Informational Messages
Another frequent type of message, the
informational message is
usually composed of full sentences, and conveys information or requests
an action of the user. Since these tend to be longer than
labels,
they tend to be slightly easier to translate. However, with the longer
messages comes more variation in the level of formality (or
informality), which is something translators need to be aware of.
Examples
#: wp-login.php:146
msgid "Your new password is in the mail."
msgstr "Вашата нова парола е в електронната ви поща."
This particular message contains a modified English formulaic
expression ("the check/cheque is in the mail"), which contributes to its
informality. (From the Bulgarian (Bulgaria) translation.)
#: wp-includes/functions.php:1636
msgid "<strong>Error</strong>: Incorrect password."
msgstr "<strong>FEL</strong>: Felaktigt lösenord."
Error messages tend to be more formal, simply because they're short and concise. (From the Swedish (Sweden) translation.)
#: wp-includes/functions-post.php:467
msgid "Sorry, you can only post a new comment once every 15 seconds. Slow down cowboy."
msgstr "Leider kannst du nur alle 15 Sekunden einen neuen Kommentar eingeben. Immer locker bleiben."
Of course, not all error messages are formal. (From the German (Germany) translation.)
Strings with description
If a string contains a vertical bar |, the part on the right of | is a
description. Its purpose is to help you translate the string, placing
it in certain context or providing additional information.
Examples
#: wp-includes/locale.php:186
msgid ""
"number_format_decimal_point|$dec_point argument for http://php.net/number_format, default is ."
msgstr ","
The description suggests that you consult a web page in order to translate the string.
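Since the description is only there for translators, code that displays such a string needs some way to separate the message from the description. One plausible approach, shown here as a Python sketch, is to cut the text at the last vertical bar. The function name strip_description is hypothetical, and whether WordPress does it exactly this way is an assumption.

```python
def strip_description(text):
    """Drop everything from the last vertical bar onward, if there is one."""
    head, bar, _description = text.rpartition("|")
    return head if bar else text


untranslated = "number_format_decimal_point|$dec_point argument for http://php.net/number_format, default is ."
message_only = strip_description(untranslated)
```

A translation such as "," contains no vertical bar, so it passes through unchanged.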
Date and Time Locale Settings
Rather than using PHP's built-in locale switching features, which are
not configured for very many languages on most hosts, WordPress uses the
gettext translation module to accomplish date and time translations and
formatting.
WordPress translates the following:
Month names
#: wp-includes/locale.php:42 wp-includes/locale.php:57
msgid "May"
msgstr "Květen"
(From the Czech (Czech Republic) translation.)
Month abbreviations
#: wp-includes/locale.php:57
msgid "May_May_abbreviation"
msgstr "Mag"
Note the unusual msgid. These messages should NOT
be translated literally: they are a hack to get around the fact that in
English, the full name and abbreviation for May are the same, which
Gettext would erroneously combine into one entry. (From the Italian (Italy) translation.)
Weekday Names
#: wp-includes/locale.php:7
#: wp-includes/locale.php:18
#: wp-includes/locale.php:31
msgid "Tuesday"
msgstr "火曜日"
(From the Japanese (Japan) translation.)
Weekday Abbreviations
#: wp-includes/locale.php:31
msgid "Tue"
msgstr "Уто"
(From the Serbian (Serbia) translation.)
Weekday Initials
#: wp-includes/locale.php:18
msgid "T_Tuesday_initial"
msgstr "ti"
The weekday initials are for WordPress's calendar feature, and use
the same hack as the month abbreviations to get around the fact that in
English Tuesday and Thursday share the same first letter. Not all
locales use single-letter abbreviations for all days: in this example,
Norwegian Bokmål uses an extra letter to distinguish tirsdag
(Tuesday) and torsdag
(Thursday). (From the Norwegian Bokmål (Norway) translation.)
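The reason for these unusual msgids is easy to see if you think of the catalog as a lookup table keyed by msgid. The Python sketch below is only an illustration of that idea; the dictionaries are hypothetical, and "Maggio" (the full Italian month name) is supplied here for the example rather than taken from a translation file.

```python
# With identical msgids, the second entry silently replaces the first.
naive = {}
naive["May"] = "Maggio"   # full month name
naive["May"] = "Mag"      # abbreviation overwrites the full name

# Distinct msgids let both translations coexist.
catalog = {
    "May": "Maggio",
    "May_May_abbreviation": "Mag",
}
```

The same reasoning applies to the weekday initials, where Tuesday and Thursday would otherwise share the msgid "T".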
Date Formatting Strings
These are
PHP date() formatting strings, and they allow you to change the formatting of the date and time for your locale.
WordPress uses the translations elsewhere in the localization
file for month names, weekday names, etc. This special string is for the
selection of which elements to include in the date & time, as well
as the order in which they're presented.
Take this
msgid from the
theme.pot file:
#: archive.php:40 search.php:19 single.php:22
msgid "l, F jS, Y"
msgstr ""
In English, this gets formatted as:
Sunday, February 27th, 2005
However, different locales format their dates differently. In Danish, for example, dates are written:
søndag, 27. februar 2005
To accomplish this, the
msgid above would be translated to:
#: archive.php:40 search.php:19 single.php:22
msgid "l, F jS, Y"
msgstr "l, j. F Y"
To use another example, one way to format dates in Chinese and Japanese is as follows:
2005年2月27日
This would be accomplished in the translation like this:
#: archive.php:40 search.php:19 single.php:22
msgid "l, F jS, Y"
msgstr "Y年n月j日"
Lastly, if you need to include literal alphabetic characters in your
date format, as sometimes occurs in Spanish, you can backslash them:
#: archive.php:40 search.php:19 single.php:22
msgid "l, F jS, Y"
msgstr "l j \d\e F \d\e Y "
This would output:
domingo 27 de febrero de 2005
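To see how a translated format string changes the output, here is a small Python sketch that imitates a handful of PHP date() characters (l, F, j, n, S, Y, and the backslash escape). It is an approximation for illustration only: the function name format_date is hypothetical, the name lists are English, and real PHP date() treats many more letters as format characters (which is why the literal "de" must be backslashed there), whereas this sketch passes unknown characters through.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def ordinal_suffix(day):
    """English suffix for a day of the month: 1st, 2nd, 3rd, 4th, 11th, ..."""
    if 11 <= day % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def format_date(fmt, d, weekdays=WEEKDAYS, months=MONTHS):
    """Expand a small subset of PHP date() format characters."""
    out = []
    chars = iter(fmt)
    for ch in chars:
        if ch == "\\":
            out.append(next(chars, ""))          # backslash: next char is literal
        elif ch == "l":
            out.append(weekdays[d.weekday()])    # full weekday name
        elif ch == "F":
            out.append(months[d.month - 1])      # full month name
        elif ch == "j":
            out.append(str(d.day))               # day without leading zero
        elif ch == "n":
            out.append(str(d.month))             # month number without leading zero
        elif ch == "S":
            out.append(ordinal_suffix(d.day))    # English ordinal suffix
        elif ch == "Y":
            out.append(str(d.year))              # four-digit year
        else:
            out.append(ch)                       # anything else passes through
    return "".join(out)


sample = date(2005, 2, 27)
english = format_date("l, F jS, Y", sample)
danish_order = format_date("l, j. F Y", sample)
cjk_order = format_date("Y年n月j日", sample)
```

Passing translated weekday and month lists in place of the English defaults would give output like the Danish and Spanish examples above; only the order and punctuation come from the format string.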
Translation via WordPress-PHP
To translate a date, e.g. inside your plugin, use
mysql2date() or
date_i18n(). Your date will be returned in localized format, based on the timestamp.
Messages With Placeholders
Many messages contain special PHP formatting placeholders, which
allow the insertion of untranslatable dynamic content into the message
after it is translated. The PHP placeholders come in two different
formats:
- %s
- When only one placeholder is present, this marker is used.
- %1$s, %2$s, %3$s, …
- Numbered placeholders, which allow translations to rearrange the order
of the placeholders in the string while preserving which value each
is replaced with.
Examples
#: wp-login.php:116
msgid "The e-mail was sent successfully to %s's e-mail address."
msgstr "El e-mail fue enviado satisfactoriamente a la dirección e-mail de %s"
This message inserts the username of the user to whom an email has been sent. (From the Spanish (Spain) translation.)
#: wp-admin/upload.php:96
#, php-format
msgid "File %1$s of type %2$s is not allowed."
msgstr "类型为%2$s的文件%1$s不允许被上传。"
This message reverses the order in which the file name and type are used in the translation. (From the Chinese (China) translation.)
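The effect of numbered placeholders can be demonstrated with a small Python sketch that imitates this part of PHP's sprintf(). The function name php_sprintf is hypothetical, it handles only %s and %N$s (not %d, %%, or padding), and the file name and type passed in are made-up sample values.

```python
import re


def php_sprintf(fmt, *args):
    """Substitute %s (in order) and %N$s (by position) with the given values."""
    sequential = iter(args)

    def substitute(match):
        if match.group(1):                       # numbered form, e.g. %2$s
            return str(args[int(match.group(1)) - 1])
        return str(next(sequential))             # plain %s takes the next value

    return re.sub(r"%(?:(\d+)\$)?s", substitute, fmt)


english = php_sprintf("File %1$s of type %2$s is not allowed.", "notes.exe", "exe")
chinese = php_sprintf("类型为%2$s的文件%1$s不允许被上传。", "notes.exe", "exe")
```

The calling code passes the values in the same order for every language; the translated string alone decides where each one appears.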
Tips for Good Translations
- Don't translate literally, translate organically
- Being
bi- or multi-lingual, you undoubtedly know that the languages you speak
have different structures, rhythms, tones, and inflections. Translated
messages don't need to be structured the same way as the English ones:
take the ideas that are presented and come up with a message that
expresses the same thing in a natural way for the target language. It's
the difference between creating an equal message and an equivalent
message: don't replicate, replace. Even with more structural items in
messages, you have creative license to adapt and change if you feel it
will be more logical for, or better adapted to, your target audience.
- Try to keep the same level of formality (or informality)
- Each
message has a different level of formality or informality. Exactly what
level of formality or informality to use for each message in your
target language is something you'll have to figure out on your own (or
with your team), but WordPress messages (informational messages
in particular) tend to have a politely informal tone in English. Try to
accomplish the equivalent in the target language, within your cultural
context.
- Don't use slang or audience-specific terms
- Some amount
of terminology is to be expected in a blog, but refrain from using
colloquialisms that only the "in" crowd will get. If the uninitiated
blogger were to install WordPress in your language, would they know what
the term means? Words like pingback, trackback, and feed
are exceptions to this rule; they are terms that are typically
difficult to translate, and many translators choose to leave them in English.
- Read other software's localizations in your language
- If
you get stuck or need direction, try reading through the translations
of other popular software tools to get a feel for what terms are
commonly used, how formality is addressed, etc. Of course, WordPress has
its own tone and feel, so keep that in mind when you're reading other
localizations, but feel free to dig up UI terms and the like to maintain
consistency with other software in your language.
WordPress Localization Repository
The
WordPress Localization Repository at
http://i18n.svn.wordpress.org/ is a
Subversion repository
where official WordPress translations are maintained. Various teams
collaborate on translations for their native language, and team
maintainers commit updates and changes to the repository.
Participating
Participation in the repository is open to anyone. Simply visit the
WP Polyglots Blog,
introduce yourself, and let everyone know what translation you'd like
to work on. If there is already a team for your language and locale,
they'll let you know and you can join them. If not, you can either
volunteer to be a maintainer for your language and locale, or simply
contribute your localization and the repository maintainers will add it.
Guidelines and requirements
Note: these guidelines are subject to change as the system
evolves; repository maintainers will be happy to assist you in updating
the files you maintain in the repository should these guidelines change.
Character Encodings
All localizations should have at least a UTF-8 version, but may
optionally add versions in other character encodings popular for that
locale.
Current PHP versions don't handle byte order marks (BOMs), so
be sure the UTF-8 encoded files you contribute do not have them.
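If your editor adds a BOM without telling you, a quick check before contributing a file can help. The Python sketch below shows one way to detect and strip a leading UTF-8 BOM from raw bytes; the function name strip_utf8_bom is hypothetical, and many text editors offer a "UTF-8 without BOM" save option that makes this unnecessary.

```python
import codecs


def strip_utf8_bom(data):
    """Return the bytes without a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = codecs.BOM_UTF8 + 'msgid "Post"'.encode("utf-8")
cleaned = strip_utf8_bom(with_bom)
```

Reading a file in binary mode, passing its contents through a function like this, and writing the result back leaves the rest of the file untouched.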
HTML Character Entities
With a few exceptions (noted below), all translations should be
written literally, rather than escaping accented and special characters
with HTML character entities.
Some characters must be escaped to avoid conflict with XHTML markup: angle brackets (&lt; and &gt;) and ampersands (&amp;). In addition, a few other characters are better written as entities, such as non-breaking spaces (&nbsp;), angle quotes (&laquo; and &raquo;), curly apostrophes (&rsquo;), and curly quotes.
For more information about the W3C's best practices involving
character encodings and character entities, see the following
references:
Repository File Structure
The repository contains directories for each locale, which are named as follows:
Within each locale's directory are the regular Subversion versioning directories: branches/, tags/, and trunk/.
Inside the appropriate versioning directory are the following subdirectories:
messages/
This directory contains the Gettext MO and PO files for the locale. Message files are named after the locale name.
In the
kubrick
folder you should put the translation (using exactly the same PO/MO filename as above) of the i18n-ed
default theme, residing at the wordpress-i18n svn repository. There is also another way of translating the default theme:
dist/
This directory contains all
files in the WordPress distribution that cannot be Gettexted, which have been translated into the target locale.
If the locale has only a UTF-8 translation of the files, the
dist/ directory may be populated with them directly, and the structure
within dist should mirror the structure of the WordPress root directory:
- dist/
- license.txt
- readme.html
- wp-config-sample.php
- …
- wp-admin/
theme/
It is better to translate the i18n-ed kubrick (see the messages/ part above), instead of using theme/.
Similarly to the dist/ dir, theme/ contains hard-translated theme
files. If only a UTF-8 translation is present, the directory can be
populated with subdirectories for each theme translated. These
subdirectories contain all of the same files as the original theme
(except that they're translated), and are named the same as the original
theme:
- theme/
- default/
- 404.php
- index.php
- sidebar.php
- …
- images/
Just as with the dist/ directory, if there are multiple character
encodings represented, theme/ should contain a subdirectory for each
character encoding, which in turn would contain subdirectories for each
theme translated.
Using Localizations
In order to localize your installation of WordPress, create a directory named
languages inside of
wp-includes, if it does not already exist. Then grab the appropriate localization files from the
Subversion Repository as described above. The main .mo file and the continent .mo file for the language should go inside the
languages directory. Set WPLANG inside of
wp-config.php to your chosen language. For example, if you wanted to use French, you would have:
define ('WPLANG', 'fr_FR');
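Putting the pieces together, the WPLANG value and the languages directory determine where the MO file is expected to be. The Python sketch below just spells out that path arithmetic, assuming the layout described in this section and that the message file is named after the locale; the function name mo_file_for and the installation path are hypothetical.

```python
from pathlib import PurePosixPath


def mo_file_for(locale, wp_root):
    """Expected location of the MO file for a locale, per the layout above."""
    return PurePosixPath(wp_root) / "wp-includes" / "languages" / (locale + ".mo")


expected = mo_file_for("fr_FR", "/var/www/blog")
```

If the interface stays in English after setting WPLANG, checking that a file exists at this location, with exactly this name, is a reasonable first step.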
Troubleshooting
- Rosetta won't export my translation as an MO file. It just says, "A system error occurred."
- There
is a syntax error in your translation that is preventing it from
compiling to an MO. Download the PO instead and try compiling it
manually with msgfmt. This will tell you which lines the errors
are on so you can correct them by hand. If you don't have the GNU
Gettext package installed, you can try opening the PO file in Poedit or
KBabel to see if they will help you correct the errors, or you can email
the wp-polyglots mailing list and ask for someone to debug it for you.
Source: http://codex.wordpress.org