ISO 639-3

ISO 639-3, or more formally known as ISO 639-3:2007 Codes for the representation of names of languages — Part 3: Alpha-3 code for comprehensive coverage of languages, is an international standard for language codes in the ISO 639 series. The standard describes three‐letter codes for identifying languages. It extends the ISO 639-2 alpha-3 codes with an aim to cover all known natural languages. The standard was published by ISO on 2007-02-05.

It is intended for use in a wide range of applications, in particular computer systems where many languages need to be supported. It provides an enumeration of languages as complete as possible, including living and extinct, ancient and constructed, major and minor, written and unwritten. However, it does not include reconstructed languages such as Proto-Indo-European.

Scope
It is a superset of ISO 639-1 and of the individual languages in ISO 639-2. ISO 639-1 and ISO 639-2 focused on major languages, most frequently represented in the total body of the world’s literature. Since ISO 639-2 also includes language collections and Part 3 does not, ISO 639-3 is not a superset of ISO 639-2. Where B and T codes exist in ISO 639-2, ISO 639-3 uses the T-codes.
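The preference for T-codes can be illustrated with a small normalisation step. The following is a minimal sketch, not part of the standard: the table holds only a hand-picked subset of the ISO 639-2 B/T pairs, and the names B_TO_T and to_iso639_3 are hypothetical.

```python
# Illustrative subset of ISO 639-2 bibliographic (B) codes and their
# terminological (T) equivalents. The real standard defines more pairs.
B_TO_T = {
    "ger": "deu",  # German
    "fre": "fra",  # French
    "chi": "zho",  # Chinese
    "dut": "nld",  # Dutch
    "gre": "ell",  # Modern Greek
    "cze": "ces",  # Czech
}


def to_iso639_3(code: str) -> str:
    """Return the T-code for a known B-code; pass other codes through unchanged."""
    code = code.lower()
    return B_TO_T.get(code, code)


print(to_iso639_3("ger"))  # deu
print(to_iso639_3("eng"))  # eng
```

Codes that have no separate B form, such as eng, pass through unchanged because ISO 639-2 and ISO 639-3 already share them.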

The standard contains 7776 entries. The inventory of languages is based on a number of sources, including the individual languages contained in ISO 639-2, modern languages from the Ethnologue, historic varieties, ancient languages and artificial languages from Anthony Aristar at the Linguist List, and languages recommended within the annual public commenting period.

A transition from ISO 639-1 to ISO 639-3 could be done using the data contained in the list of ISO 639-1 codes.
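Such a transition amounts to a table lookup. The sketch below assumes a tiny hand-written excerpt of the two-letter to three-letter mapping; a real migration would load the full list of ISO 639-1 codes. The names ISO639_1_TO_3 and alpha2_to_alpha3 are hypothetical.

```python
# Small excerpt of the ISO 639-1 to ISO 639-3 mapping, for illustration only.
ISO639_1_TO_3 = {
    "en": "eng",  # English
    "de": "deu",  # German
    "fr": "fra",  # French
    "es": "spa",  # Spanish
    "zh": "zho",  # Chinese (a macrolanguage in ISO 639-3)
    "ar": "ara",  # Arabic (a macrolanguage in ISO 639-3)
    "no": "nor",  # Norwegian (a macrolanguage in ISO 639-3)
}


def alpha2_to_alpha3(code: str) -> str:
    """Translate a two-letter ISO 639-1 code to its three-letter ISO 639-3 code."""
    try:
        return ISO639_1_TO_3[code.lower()]
    except KeyError:
        # The excerpt is incomplete, so unknown codes are reported, not guessed.
        raise ValueError(f"no mapping known for {code!r}") from None


print(alpha2_to_alpha3("de"))  # deu
```

Several two-letter codes map to macrolanguages, so a migrated data set may still need a finer-grained individual-language code chosen by hand.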

Code space
Since the code is three-letter alphabetic, one upper bound for the number of languages that can be represented is 26 × 26 × 26 = 17576. Since ISO 639-2 defines special codes (4), a reserved range (520) and B-only codes (23), 547 codes cannot be used in Part 3. Therefore a tighter upper bound is 17576 − 547 = 17029.

The upper bound gets even lower if one subtracts the language collections defined in ISO 639-2 and any collections yet to be defined.
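The arithmetic can be checked by enumerating the code space directly. This is a minimal sketch: the reserved range is taken to be qaa–qtz, the counts of special and B-only codes are the figures quoted in the text rather than derived from the standard, and all variable names are illustrative.

```python
from itertools import product
from string import ascii_lowercase

# Every possible three-letter lowercase code: 26 ** 3 of them.
all_codes = ["".join(letters) for letters in product(ascii_lowercase, repeat=3)]
total = len(all_codes)

# The range reserved for local use, qaa through qtz (20 x 26 codes).
local_use = [code for code in all_codes if "qaa" <= code <= "qtz"]

SPECIAL_CODES = 4  # figure quoted in the text
B_ONLY_CODES = 23  # figure quoted in the text

unavailable = SPECIAL_CODES + len(local_use) + B_ONLY_CODES
upper_bound = total - unavailable

print(total, len(local_use), unavailable, upper_bound)
# 17576 520 547 17029
```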

Macrolanguages
There are 56 languages in ISO 639-2 which are considered, for the purposes of the standard, to be “macrolanguages” in ISO 639-3.

Some of these macrolanguages had no individual language as defined by ISO 639-3 in the code set of ISO 639-2, e.g. ‘ara’ (Generic Arabic). Others, like ‘nor’ (Norwegian), had their two individual parts (‘nno’ (Nynorsk) and ‘nob’ (Bokmål)) already in ISO 639-2.

That means some languages (e.g. ‘arb’, Standard Arabic) that were considered by ISO 639-2 to be dialects of one language (‘ara’) are now in ISO 639-3 in certain contexts considered to be individual languages themselves.

This is an attempt to deal with varieties that may be linguistically distinct from each other, but are treated by their speakers as two forms of the same language, e.g. in cases of diglossia.

For example:
 * Documentation for ISO 639 identifier: ara (Generic Arabic, 639-2)
 * Documentation for ISO 639 identifier: arb (Standard Arabic, 639-3)

See Macrolanguage Mappings for the complete list.
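The relationship between a macrolanguage and its individual languages can be modelled as a simple mapping. The sketch below uses a deliberately partial table (a few members of ‘ara’, ‘nor’ and ‘zho’), and the names MACROLANGUAGE_MEMBERS, macrolanguage_of and same_macrolanguage are hypothetical.

```python
# Partial, illustrative macrolanguage table; the official mappings are far longer.
MACROLANGUAGE_MEMBERS = {
    "ara": {"arb", "arz"},  # Standard Arabic, Egyptian Arabic
    "nor": {"nno", "nob"},  # Nynorsk, Bokmål
    "zho": {"cmn", "yue"},  # Mandarin Chinese, Yue Chinese
}

# Reverse index: individual language code -> macrolanguage code.
MEMBER_TO_MACRO = {
    member: macro
    for macro, members in MACROLANGUAGE_MEMBERS.items()
    for member in members
}


def macrolanguage_of(code):
    """Return the macrolanguage containing `code`, or None if none is known."""
    return MEMBER_TO_MACRO.get(code)


def same_macrolanguage(a, b):
    """True if both codes resolve to the same macrolanguage (or are equal)."""
    return MEMBER_TO_MACRO.get(a, a) == MEMBER_TO_MACRO.get(b, b)


print(macrolanguage_of("arb"))           # ara
print(same_macrolanguage("nno", "nob"))  # True
```

An application that only needs to know whether two texts are "the same language" in the broader sense could compare at the macrolanguage level, while one that needs linguistic precision would keep the individual codes.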

Collective languages
“A collective language code element is an identifier that represents a group of individual languages that are not deemed to be one language in any usage context.” These codes do not precisely represent a particular language or macrolanguage.

While ISO 639-2 includes three-letter identifiers for collective languages, these codes are excluded from ISO 639-3. Hence ISO 639-3 is not a superset of ISO 639-2.

ISO 639-5 defines 3-letter collective codes for language families and groups.

Usage of ISO 639-3

 * Ethnologue, Linguist List
 * IETF language tag
 * Lexical Markup Framework, ISO specification for representation of machine-readable dictionaries
 * Proposed as language TLD (lcTLD)

Generic codes
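Four codes are set aside for “languages” without specific identification:
 * mis (uncoded languages)
 * mul (multiple languages)
 * und (undetermined)
 * zxx (no linguistic content)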

In addition, codes qaa–qtz are ‘reserved for local use’, for example for extinct languages at Linguist List.
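A program that accepts three-letter codes may want to tell these cases apart from ordinary language identifiers. The following is a minimal sketch, assuming the four special codes are mis, mul, und and zxx; the function name classify and its return labels are hypothetical, and "regular" here means only that the code is neither special nor local-use, not that it is actually assigned.

```python
# The four special-purpose codes and their usual descriptions.
SPECIAL_CODES = {
    "mis": "uncoded languages",
    "mul": "multiple languages",
    "und": "undetermined",
    "zxx": "no linguistic content",
}


def classify(code: str) -> str:
    """Classify a three-letter code as 'special', 'local-use' or 'regular'."""
    code = code.lower()
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a three-letter alphabetic code: {code!r}")
    if code in SPECIAL_CODES:
        return "special"
    # Lexicographic comparison covers exactly the range qaa through qtz.
    if "qaa" <= code <= "qtz":
        return "local-use"
    return "regular"


print(classify("und"))  # special
print(classify("qab"))  # local-use
print(classify("eng"))  # regular
```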

Resources