Index: Doc/library/gettext.rst =================================================================== --- Doc/library/gettext.rst (revision 86753) +++ Doc/library/gettext.rst (working copy) @@ -8,8 +8,8 @@ The :mod:`gettext` module provides internationalization (I18N) and localization -(L10N) services for your Python modules and applications. It supports both the -GNU ``gettext`` message catalog API and a higher level, class-based API that may +(L10N) services for your Python modules and applications. It supports both the +GNU :program:`gettext` message catalog API and a higher level, class-based API that may be more appropriate for Python files. The interface described below allows you to write your module and application messages in one natural language, and provide a catalog of translated messages for running under different natural @@ -34,7 +34,7 @@ Bind the *domain* to the locale directory *localedir*. More concretely, :mod:`gettext` will look for binary :file:`.mo` files for the given domain using - the path (on Unix): :file:`localedir/language/LC_MESSAGES/domain.mo`, where + the path (on Unix): :file:`{localedir}/{language}/LC_MESSAGES/{domain}.mo`, where *languages* is searched for in the environment variables :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG` respectively. @@ -84,14 +84,14 @@ .. function:: ngettext(singular, plural, n) - Like :func:`gettext`, but consider plural forms. If a translation is found, + Like :func:`gettext`, but consider plural forms. If a translation is found, apply the plural formula to *n*, and return the resulting message (some - languages have more than two plural forms). If no translation is found, return + languages have more than two plural forms). If no translation is found, return *singular* if *n* is 1; return *plural* otherwise. - The Plural formula is taken from the catalog header. It is a C or Python + The plural formula is taken from the catalog header. 
It is a C or Python expression that has a free variable *n*; the expression evaluates to the index - of the plural in the catalog. See the GNU gettext documentation for the precise + of the plural in the catalog. See the GNU gettext documentation for the precise syntax to be used in :file:`.po` files and the formulas for a variety of languages. @@ -134,8 +134,8 @@ The class-based API of the :mod:`gettext` module gives you more flexibility and greater convenience than the GNU :program:`gettext` API. It is the recommended way of localizing your Python applications and modules. :mod:`gettext` defines -a "translations" class which implements the parsing of GNU :file:`.mo` format -files, and has methods for returning strings. Instances of this "translations" +a :class:`GNUTranslations` class which implements the parsing of GNU :file:`.mo` format +files, and has methods for returning strings. Instances of this class can also install themselves in the built-in namespace as the function :func:`_`. @@ -144,14 +144,14 @@ This function implements the standard :file:`.mo` file search algorithm. It takes a *domain*, identical to what :func:`textdomain` takes. Optional - *localedir* is as in :func:`bindtextdomain` Optional *languages* is a list of + *localedir* is as in :func:`bindtextdomain`. Optional *languages* is a list of strings, where each string is a language code. If *localedir* is not given, then the default system locale directory is used. [#]_ If *languages* is not given, then the following environment variables are searched: :envvar:`LANGUAGE`, :envvar:`LC_ALL`, :envvar:`LC_MESSAGES`, and :envvar:`LANG`. The first one returning a non-empty value is used for the - *languages* variable. The environment variables should contain a colon separated + *languages* variable. The environment variables should contain a colon separated list of languages, which will be split on the colon to produce the expected list of language code strings. 
@@ -160,18 +160,18 @@ :file:`{localedir}/{language}/LC_MESSAGES/{domain}.mo` - The first such file name that exists is returned by :func:`find`. If no such - file is found, then ``None`` is returned. If *all* is given, it returns a list + The first such file name that exists is returned by :func:`find`. If no such + file is found, then ``None`` is returned. If *all* is given, it returns a list of all file names, in the order in which they appear in the languages list or the environment variables. .. function:: translation(domain, localedir=None, languages=None, class_=None, fallback=False, codeset=None) - Return a :class:`Translations` instance based on the *domain*, *localedir*, + Return a :class:`*Translations` instance based on the *domain*, *localedir*, and *languages*, which are first passed to :func:`find` to get a list of the associated :file:`.mo` file paths. Instances with identical :file:`.mo` file - names are cached. The actual class instantiated is either *class_* if + names are cached. The actual class instantiated is *class_* if provided, otherwise :class:`GNUTranslations`. The class's constructor must take a single :term:`file object` argument. If provided, *codeset* will change the charset used to encode translated strings in the :meth:`lgettext` and @@ -211,7 +211,7 @@ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Translation classes are what actually implement the translation of original -source file message strings to translated message strings. The base class used +source file message strings to translated message strings. The base class used by all translation classes is :class:`NullTranslations`; this provides the basic interface you can use to write your own specialized translation classes. Here are the methods of :class:`NullTranslations`: @@ -227,7 +227,7 @@ .. 
method:: _parse(fp) - No-op'd in the base class, this method takes file object *fp*, and reads + No-op in the base class, this method takes file object *fp*, and reads the data from the file, initializing its message catalog. If you have an unsupported message catalog file format, you should override this method to parse your format. @@ -260,13 +260,14 @@ .. method:: lngettext(singular, plural, n) - If a fallback has been set, forward :meth:`ngettext` to the fallback. + If a fallback has been set, forward :meth:`lngettext` to the fallback. Otherwise, return the translated message. Overridden in derived classes. .. method:: info() - Return the "protected" :attr:`_info` variable. + Return the "protected" :attr:`_info` variable, a dictionary containing + the metadata found in the message catalog file. .. method:: charset() @@ -277,14 +278,14 @@ .. method:: output_charset() - Return the "protected" :attr:`_output_charset` variable, which defines the + Return the "protected" :attr:`_output_charset` variable, which is the encoding used to return translated messages in :meth:`lgettext` and :meth:`lngettext`. .. method:: set_output_charset(charset) - Change the "protected" :attr:`_output_charset` variable, which defines the + Change the "protected" :attr:`_output_charset` variable, which is the encoding used to return translated messages. @@ -321,15 +322,15 @@ :meth:`_parse` to enable reading GNU :program:`gettext` format :file:`.mo` files in both big-endian and little-endian format. -:class:`GNUTranslations` parses optional meta-data out of the translation -catalog. It is convention with GNU :program:`gettext` to include meta-data as -the translation for the empty string. This meta-data is in :rfc:`822`\ -style +:class:`GNUTranslations` parses optional metadata out of the translation +catalog. It is convention with GNU :program:`gettext` to include metadata as +the translation for the empty string. 
This metadata is in :rfc:`822`\ -style ``key: value`` pairs, and should contain the ``Project-Id-Version`` key. If the key ``Content-Type`` is found, then the ``charset`` property is used to initialize the "protected" :attr:`_charset` instance variable, defaulting to ``None`` if not found. If the charset encoding is specified, then all message ids and message strings read from the catalog are converted to Unicode using -this encoding, else ASCII encoding is assumed. +this encoding, else ASCII is assumed. Since message ids are read as Unicode strings too, all :meth:`*gettext` methods will assume message ids as Unicode strings, not byte strings. @@ -344,48 +345,50 @@ The following methods are overridden from the base class implementation: -.. method:: GNUTranslations.gettext(message) +.. class:: GNUTranslations - Look up the *message* id in the catalog and return the corresponding message - string, as a Unicode string. If there is no entry in the catalog for the - *message* id, and a fallback has been set, the look up is forwarded to the - fallback's :meth:`gettext` method. Otherwise, the *message* id is returned. + .. method:: gettext(message) + Look up the *message* id in the catalog and return the corresponding message + string, as a Unicode string. If there is no entry in the catalog for the + *message* id, and a fallback has been set, the look up is forwarded to the + fallback's :meth:`gettext` method. Otherwise, the *message* id is returned. -.. method:: GNUTranslations.lgettext(message) - Equivalent to :meth:`gettext`, but the translation is returned as a - bytestring encoded in the selected output charset, or in the preferred system - encoding if no encoding was explicitly set with :meth:`set_output_charset`. + .. 
method:: lgettext(message) + Equivalent to :meth:`gettext`, but the translation is returned as a + bytestring encoded in the selected output charset, or in the preferred system + encoding if no encoding was explicitly set with :meth:`set_output_charset`. -.. method:: GNUTranslations.ngettext(singular, plural, n) - Do a plural-forms lookup of a message id. *singular* is used as the message id - for purposes of lookup in the catalog, while *n* is used to determine which - plural form to use. The returned message string is a Unicode string. + .. method:: ngettext(singular, plural, n) - If the message id is not found in the catalog, and a fallback is specified, the - request is forwarded to the fallback's :meth:`ngettext` method. Otherwise, when - *n* is 1 *singular* is returned, and *plural* is returned in all other cases. + Do a plural-forms lookup of a message id. *singular* is used as the message id + for purposes of lookup in the catalog, while *n* is used to determine which + plural form to use. The returned message string is a Unicode string. - Here is an example:: + If the message id is not found in the catalog, and a fallback is specified, the + request is forwarded to the fallback's :meth:`ngettext` method. Otherwise, when + *n* is 1 *singular* is returned, and *plural* is returned in all other cases. - n = len(os.listdir('.')) - cat = GNUTranslations(somefile) - message = cat.ngettext( - 'There is %(num)d file in this directory', - 'There are %(num)d files in this directory', - n) % {'num': n} + Here is an example:: + n = len(os.listdir('.')) + cat = GNUTranslations(somefile) + message = cat.ngettext( + 'There is %(num)d file in this directory', + 'There are %(num)d files in this directory', + n) % {'num': n} -.. 
method:: GNUTranslations.lngettext(singular, plural, n) - Equivalent to :meth:`gettext`, but the translation is returned as a - bytestring encoded in the selected output charset, or in the preferred system - encoding if no encoding was explicitly set with :meth:`set_output_charset`. + .. method:: lngettext(singular, plural, n) + Equivalent to :meth:`ngettext`, but the translation is returned as a + bytestring encoded in the selected output charset, or in the preferred system + encoding if no encoding was explicitly set with :meth:`set_output_charset`. + Solaris message catalog support ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -428,7 +431,7 @@ #. run a suite of tools over your marked files to generate raw messages catalogs -#. create language specific translations of the message catalogs +#. create language-specific translations of the message catalogs #. use the :mod:`gettext` module so that message strings are properly translated @@ -438,9 +441,8 @@ filename = 'mylog.txt' message = _('writing a log message') - fp = open(filename, 'w') - fp.write(message) - fp.close() + with open(filename, 'w') as fp: + fp.write(message) In this example, the string ``'writing a log message'`` is marked as a candidate for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not. @@ -454,14 +456,14 @@ for the strings you previously marked as translatable. It is similar to the GNU :program:`gettext` program except that it understands all the intricacies of Python source code, but knows nothing about C or C++ source code. You don't -need GNU ``gettext`` unless you're also going to be translating C code (such as +need GNU :program:`gettext` unless you're also going to be translating C code (such as C extension modules).
:program:`pygettext` generates textual Uniforum-style human readable message catalog :file:`.pot` files, essentially structured human readable files which contain every marked string in the source code, along with a placeholder for the -translation strings. :program:`pygettext` is a command line script that supports -a similar command line interface as :program:`xgettext`; for details on its use, +translation strings. :program:`pygettext` is a command-line script that supports +a similar command-line interface as :program:`xgettext`; for details on its use, run:: pygettext.py --help @@ -484,7 +486,7 @@ ^^^^^^^^^^^^^^^^^^^^^^ If you are localizing your module, you must take care not to make global -changes, e.g. to the built-in namespace. You should not use the GNU ``gettext`` +changes, e.g. to the built-in namespace. You should not use the GNU :program:`gettext` API but instead the class-based API. Let's say your module is called "spam" and the module's various natural language @@ -579,7 +581,7 @@ This works because the dummy definition of :func:`_` simply returns the string unchanged. And this dummy definition will temporarily override any definition -of :func:`_` in the built-in namespace (until the :keyword:`del` command). Take +of :func:`_` in the built-in namespace (until the :keyword:`del` command). Take care, though if you have a previous definition of :func:`_` in the local namespace. @@ -603,8 +605,8 @@ In this case, you are marking translatable strings with the function :func:`N_`, [#]_ which won't conflict with any definition of :func:`_`. However, you will need to teach your message extraction program to look for translatable strings -marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support -this through the use of command line switches. +marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support +this through the use of command-line switches. Acknowledgements @@ -634,15 +636,15 @@ .. 
[#] The default locale directory is system dependent; for example, on RedHat Linux it is :file:`/usr/share/locale`, but on Solaris it is :file:`/usr/lib/locale`. The :mod:`gettext` module does not try to support these system dependent - defaults; instead its default is :file:`sys.prefix/share/locale`. For this + defaults; instead its default is :file:`{sys.prefix}/share/locale`. For this reason, it is always best to call :func:`bindtextdomain` with an explicit absolute path at the start of your application. .. [#] See the footnote for :func:`bindtextdomain` above. .. [#] François Pinard has written a program called :program:`xpot` which does a - similar job. It is available as part of his :program:`po-utils` package at http - ://po-utils.progiciels-bpi.ca/. + similar job. It is available as part of his `po-utils package + <http://po-utils.progiciels-bpi.ca/>`_. .. [#] :program:`msgfmt.py` is binary compatible with GNU :program:`msgfmt` except that it provides a simpler, all-Python implementation. With this and