
In this article we will learn about PHP internationalization and localization (i18n and l10n) using the gettext extension. With gettext we can localize websites and web apps in a standard, well-supported way.
Overview
Internationalization and localization are common tasks in web applications. PHP developers implement localization in many different ways, such as storing translations in PHP arrays, but PHP also has built-in support for localization through the gettext extension.
gettext is a powerful internationalization and localization (i18n and l10n) system commonly used for writing multilingual programs on Unix-like operating systems. It has been adopted by many systems and programming languages, and PHP supports it through the gettext extension.
How Does gettext Work?
- gettext uses three kinds of translation files: POT (Portable Object Template) *.pot, PO (Portable Object) *.po and MO (Machine Object) *.mo.
- POT files serve as a template for all the app translations for all languages.
- PO files are text files with a special structure. They can be opened and edited with a text editor or with dedicated software such as Poedit, and processed with the gettext command-line tools.
- PO files contain all the original text and the corresponding translations.
- PO files can be compiled to the binary format MO (Machine Object) files.
- In contrast with PO files, which are human readable, MO files are binary and cannot be edited in a text editor.
- Once the translations in the PO files are complete, we compile them into MO files, which is the format gettext reads at runtime.
What Is a gettext Domain?
A domain, or text domain, in gettext is like a module in a web application. For example, an ecommerce website contains many modules: a module for categories, a module for products, a module for users, and so on. You can create a domain for each of these modules, although some applications use a single domain only. If you have worked with a common CMS like WordPress, you will have seen that each WordPress plugin has its own text domain, which identifies that plugin's translations and prevents conflicts with other plugins.
What Are Singular and Plural Strings?
When localizing an application we usually need different translations for the same original string, depending on a count (singular or plural). For instance, suppose we need to report the number of items in a shopping cart, so we may have these translations:
"Cart Items":"The cart has one item"
"Cart Items":"The cart has 5 items"
"Cart Items":"The cart has more than 5 items"
When retrieving translations we need to get the right one for the number of items in the shopping cart. This is called pluralization of strings, because one string has several translated versions.
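With gettext this choice is made by ngettext(), which takes the singular form, the plural form and the count. The following is a minimal sketch rather than a definitive implementation: cart_summary() and its message text are hypothetical, and the fallback branch only mirrors what ngettext() does when no translation is loaded (singular for exactly 1, plural otherwise).

```php
<?php
// Hypothetical helper, not part of gettext: builds the cart message for $count.
function cart_summary(int $count): string
{
    if (function_exists('ngettext')) {
        // Literal strings are passed so that xgettext can extract both forms.
        $template = ngettext('The cart has %d item', 'The cart has %d items', $count);
    } else {
        // Assumed fallback when the extension is missing: plain English rule.
        $template = $count === 1 ? 'The cart has %d item' : 'The cart has %d items';
    }

    return sprintf($template, $count);
}

echo cart_summary(1), PHP_EOL;
echo cart_summary(5), PHP_EOL;
```

With no translation loaded, this prints "The cart has 1 item" and "The cart has 5 items". Once a translated catalog is in place, the same call returns whichever form that language's plural rules select.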
Installing gettext In PHP
First check whether your PHP installation has gettext enabled by looking for a gettext section in the phpinfo() output, as in this figure:
You can also check gettext support with this code block:
<?php
if (function_exists('gettext')) {
    echo 'gettext installed';
} else {
    echo 'gettext not installed';
}
If gettext does not appear as in the figure, you need to install the gettext extension first.
Installing On Windows:
- Open php.ini and enable the gettext extension (PHP 7.2 and later also accept the short form extension=gettext):
extension=php_gettext.dll
Installing On Ubuntu:
sudo apt-get update -y
sudo apt-get install -y php-gettext
Also make sure the mbstring extension is installed on the server.
Common gettext functions:
- textdomain(?string $domain): Sets the default text domain.
- bindtextdomain(string $domain, ?string $directory): Binds a domain to the directory that holds its translation files.
- gettext(string $message): Looks up a message in the current domain set by textdomain() and returns the translated message if found, otherwise the original message.
- ngettext(string $singular, string $plural, int $count): Like gettext() but for strings with singular and plural versions. If your string has plural forms, use this function instead of gettext().
- dgettext(string $domain, string $message): Looks up a message in the given domain, overriding the current domain for that call only.
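As a rough illustration of how these functions fit together, the sketch below binds two domains and looks up one message in each. It assumes the gettext extension is loaded and that a locale directory sits next to the script; the "products" domain and the message strings are made up for the example.

```php
<?php
// Assumption: the gettext extension is required for everything below.
if (!extension_loaded('gettext')) {
    exit("gettext extension is not loaded\n");
}

// Both domains read their .mo files from <dir>/<locale>/LC_MESSAGES/<domain>.mo.
// "products" is a hypothetical second domain.
bindtextdomain('messages', __DIR__ . '/locale');
bindtextdomain('products', __DIR__ . '/locale');

// gettext() and _() always use the default domain.
textdomain('messages');
echo gettext('Welcome'), PHP_EOL;

// dgettext() reads from another domain for this one call only.
echo dgettext('products', 'Add to cart'), PHP_EOL;

// Passing null to textdomain() reports the default domain without changing it.
echo textdomain(null), PHP_EOL;
```

With no compiled .mo files in place, each lookup returns the original English string, and the last line prints "messages", which shows that dgettext() did not change the default domain.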
Implementation of PHP gettext:
- Create a directory in your application, for example named “locale”. This directory will hold all the app translations. Inside it, create a sub-directory for each language that your app will support.
- Each language sub-directory contains another directory with the special name “LC_MESSAGES”.
- Each language sub-directory should be named using the language code and country code together, like “en_US”.
- The next step is to generate the .pot file, the template for translations, from the PHP files. On Linux use the xgettext utility. On Windows you can download and install Poedit instead.
- Before using the xgettext command you need PHP code that calls gettext() or _(), because xgettext scans the PHP files and generates no .pot file if it finds no calls to these functions.
- So create a sample PHP file with this code:
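index.php
<?php
$locale = 'ar_EG.UTF-8';                // locale name as known to the operating system
putenv("LC_ALL=$locale");               // expose the locale through the environment
setlocale(LC_ALL, $locale);             // activate the locale for this script
bindtextdomain('messages', './locale'); // directory holding the "messages" translations
textdomain('messages');                 // make "messages" the default domain

echo gettext('This is first time using gettext');
echo "<br/>";
echo _('Welcome');
echo "<br/>";
echo _('Application starting');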
In this code I set the initial locale to “ar_EG.UTF-8”. Note that this is the locale name as it exists on the OS. The “.UTF-8” suffix names the character encoding and is not included in the name of the language sub-directory. It matters most for languages with non-ASCII characters, such as Arabic. If the locale does not exist on the OS, the translation won’t appear. On Linux, the command locale -a lists the installed locales.
The putenv() function stores the locale in the environment. The setlocale() function sets the locale information and accepts a category and the locale name. Locale names are built from ISO 639 language codes and ISO 3166 country codes, but different systems have different naming schemes for locales.
Then I bind the text domain to the locale directory with bindtextdomain(), assuming we have only one domain, ‘messages’. Finally I make ‘messages’ the default domain with textdomain().
Now gettext() will search for translations in the ‘messages’ domain. The _() function is an alias of gettext().
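Because a translation silently fails to appear when the OS does not know the locale, it can help to check the return value of setlocale(). The sketch below is one possible way to do that, not a required step: activate_locale() is a hypothetical helper, and the candidate spellings are assumptions about how a system might name the Arabic locale.

```php
<?php
// Hypothetical helper: tries each candidate locale name in order and returns
// the one the operating system accepted, or null if none is installed.
function activate_locale(array $candidates): ?string
{
    // setlocale() accepts an array and returns the first name that worked,
    // or false when none of them could be set.
    $active = setlocale(LC_ALL, $candidates);
    if ($active === false) {
        return null;
    }

    putenv("LC_ALL=$active"); // keep the environment in step with setlocale()
    return $active;
}

$active = activate_locale(['ar_EG.UTF-8', 'ar_EG.utf8', 'ar_EG']);
echo $active === null
    ? "Arabic locale is not installed, strings will stay untranslated\n"
    : "Active locale: $active\n";
```

Which branch runs depends on the locales installed on the machine, so a check like this is mainly useful for logging or for falling back to a default language.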
If you run this script in the browser now you will see only the original English strings, because there are no translations yet and gettext() returns the original message when it finds none. To show actual translations, let’s start with the xgettext tool.
Run the xgettext command in the project root to generate the .pot file.
- The xgettext command scans all the PHP files in the current directory and generates the .pot file with all the strings:
xgettext -n --from-code=UTF-8 -o messages.pot *.php
This command generates the file ‘messages.pot’ in the root of the project, as specified by the -o option. If you open messages.pot you will see something like this:
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2023-11-22 08:22+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"

#: index.php:8
msgid "This is first time using gettext"
msgstr ""

#: index.php:10
msgid "Welcome"
msgstr ""

#: index.php:12
msgid "Application starting"
msgstr ""
This template file contains the original strings and the corresponding translations structured as follows:
msgid "This is first time using gettext"
msgstr ""
The “msgid” key is the original string and “msgstr” is the translation.
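To make the structure concrete, here is a small sketch that reads such msgid/msgstr pairs into a PHP array. It is not how gettext itself loads translations, since gettext reads compiled .mo files. read_po_pairs() is a hypothetical helper that deliberately handles only single-line entries and ignores plural forms, contexts, multi-line strings and escape sequences.

```php
<?php
// Hypothetical helper: collects single-line msgid/msgstr pairs from PO text.
function read_po_pairs(string $po): array
{
    $pairs = [];
    $msgid = null;

    foreach (preg_split('/\r\n|\n|\r/', $po) as $line) {
        $line = trim($line);
        if (preg_match('/^msgid "(.*)"$/u', $line, $m)) {
            $msgid = $m[1];
        } elseif ($msgid !== null && preg_match('/^msgstr "(.*)"$/u', $line, $m)) {
            // The entry with an empty msgid is the file header, so skip it.
            if ($msgid !== '') {
                $pairs[$msgid] = $m[1];
            }
            $msgid = null;
        }
    }

    return $pairs;
}

$sample = <<<'PO'
#: index.php:10
msgid "Welcome"
msgstr "مرحبا"

#: index.php:12
msgid "Application starting"
msgstr ""
PO;

print_r(read_po_pairs($sample));
```

The result maps each original string to its translation, with an empty string for entries that have not been translated yet.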
This is a template file, so don’t edit the POT file directly: xgettext overwrites it each time you run it. Instead we will create PO files from this template.
There are several methods to convert .POT to .PO:
- The first method is using Poedit software.
- The second method is using the msginit command from the gettext tools:
msginit --locale ar_EG --input messages.pot --output locale/ar_EG/LC_MESSAGES/messages.po
msginit --locale en_US --input messages.pot --output locale/en_US/LC_MESSAGES/messages.po
The msginit command accepts the target locale, the input POT file and the destination path. As shown, I ran the command once for each language.
After completion of these commands you will see the generated PO files in the language sub-directories.
- The third method is by using the cp command:
cp messages.pot locale/en_US/LC_MESSAGES/messages.po
cp messages.pot locale/ar_EG/LC_MESSAGES/messages.po
An important note: for languages that contain non-ASCII characters, like Arabic, you have to set the charset in the Content-Type header to UTF-8, so open locale/ar_EG/LC_MESSAGES/messages.po and update the header line like so:
"Content-Type: text/plain; charset=UTF-8\n"
Updating Translations
After generating the .PO files we need to fill in the missing translations in each PO file, using a text editor or Poedit, as in this figure:
Then save the file as .PO in the same location.
If you update the file manually, it will look like this snippet:
#: index.php:8
msgid "This is first time using gettext"
msgstr "هذه هي المرة الأولى التي تستخدم فيها gettext"

#: index.php:10
msgid "Welcome"
msgstr "مرحبا"

#: index.php:12
msgid "Application starting"
msgstr "بدء التطبيق"
Do this step for both locales and then save.
At this point, the only remaining step before the translations appear in the browser is to compile the .PO files to .MO files.
Compiling to .MO
To compile the .PO translation files to .MO files you can use Poedit or the msgfmt command:
msgfmt locale/ar_EG/LC_MESSAGES/messages.po --output-file=locale/ar_EG/LC_MESSAGES/messages.mo
msgfmt locale/en_US/LC_MESSAGES/messages.po --output-file=locale/en_US/LC_MESSAGES/messages.mo
Now the two .mo files are generated in the same sub-directories, and if you reload the page in the browser you will see the Arabic translation. Make sure the web server can read the locale/ directory so that the translations appear.
Tip: Each time you update the .PO you have to regenerate the .MO counterpart.
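One way to catch a forgotten recompile is to compare file modification times. The following sketch assumes the locale/<language>/LC_MESSAGES layout used in this article; find_stale_catalogs() is a hypothetical helper, and it only reports files rather than running msgfmt itself.

```php
<?php
// Hypothetical helper: lists .po files under $localeDir whose compiled .mo
// file is missing or older than the .po source.
function find_stale_catalogs(string $localeDir): array
{
    clearstatcache(); // make sure modification times are read fresh
    $stale = [];

    foreach (glob($localeDir . '/*/LC_MESSAGES/*.po') ?: [] as $po) {
        $mo = substr($po, 0, -3) . '.mo'; // messages.po -> messages.mo
        if (!is_file($mo) || filemtime($mo) < filemtime($po)) {
            $stale[] = $po;
        }
    }

    return $stale;
}

foreach (find_stale_catalogs(__DIR__ . '/locale') as $po) {
    echo "Needs recompiling: $po\n";
}
```

Running this before a deployment gives a quick list of catalogs that still need msgfmt.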