LangDex uses Glottolog as the authoritative source for language identification and classification. Glottolog supplies a taxonomy of roughly 7,000 languages, together with the families above them and the dialects below them.

Language Hierarchy

Glottolog organizes languages in a hierarchical tree structure:
Language Family
└── Genus (sub-family)
    └── Language
        └── Dialect
Each node in this tree has a unique Glottocode (e.g., nucl1643 for Japanese).
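
A Glottocode has a fixed shape: four lowercase alphanumeric characters followed by four digits. As a minimal sketch, a format check in Python could look like the following. The helper name `is_glottocode` is illustrative and not part of LangDex, and passing the check only confirms the shape, not that the code exists in Glottolog.

```python
import re

# Glottocode shape: four lowercase letters or digits, then four digits.
_GLOTTOCODE_RE = re.compile(r"[a-z0-9]{4}[0-9]{4}")


def is_glottocode(value: str) -> bool:
    """Return True if value has the shape of a Glottocode (format check only)."""
    return _GLOTTOCODE_RE.fullmatch(value) is not None
```

A check like this can reject obviously malformed input, such as an ISO 639-3 code passed by mistake, before a request is sent.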

Language Object

{
  "id": 1234,
  "glottocode": "nucl1643",
  "iso639_3": "jpn",
  "name": "Japanese",
  "level": "language",
  "parent_glottocode": "japo1237",
  "family": {
    "glottocode": "japo1237",
    "name": "Japonic"
  },
  "scripts": ["Jpan", "Hira", "Kana", "Latn"],
  "speakers": 125000000,
  "latitude": 35.68,
  "longitude": 139.77
}

Fields

| Field | Type | Description |
| --- | --- | --- |
| id | integer | Internal LangDex ID |
| glottocode | string | Unique Glottolog identifier (8 chars) |
| iso639_3 | string | ISO 639-3 code (3 chars, if exists) |
| name | string | Primary English name |
| level | enum | family, language, or dialect |
| parent_glottocode | string | Parent node in hierarchy |
| family | object | Top-level language family |
| scripts | array | ISO 15924 script codes |
| speakers | integer | Estimated speaker count |
| latitude / longitude | float | Geographic center |
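
The fields above map naturally onto a typed record on the client side. The sketch below is one possible Python model, not an official client. The names `Language` and `parse_language` are hypothetical, and treating every field except `id`, `glottocode`, `name`, and `level` as optional is an assumption: the field list only marks `iso639_3` as possibly absent.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Allowed values of the "level" field, as listed in the Fields section.
LEVELS = {"family", "language", "dialect"}


@dataclass
class Language:
    id: int
    glottocode: str
    name: str
    level: str
    iso639_3: Optional[str] = None
    parent_glottocode: Optional[str] = None
    family_glottocode: Optional[str] = None
    scripts: list[str] = field(default_factory=list)
    speakers: Optional[int] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def parse_language(data: dict[str, Any]) -> Language:
    """Build a Language from a decoded Language Object (a plain dict)."""
    if data["level"] not in LEVELS:
        raise ValueError(f"unexpected level: {data['level']!r}")
    family = data.get("family") or {}  # assumed to be missing or null for some nodes
    return Language(
        id=data["id"],
        glottocode=data["glottocode"],
        name=data["name"],
        level=data["level"],
        iso639_3=data.get("iso639_3"),
        parent_glottocode=data.get("parent_glottocode"),
        family_glottocode=family.get("glottocode"),
        scripts=list(data.get("scripts") or []),
        speakers=data.get("speakers"),
        latitude=data.get("latitude"),
        longitude=data.get("longitude"),
    )
```

Flattening `family` to its Glottocode keeps the record simple. A client that needs the family name could keep the nested object instead.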

Language Varieties (Langvars)

LangDex also includes PanLex language varieties (langvar), which provide finer-grained distinctions than Glottolog. These are used for precise translation mapping.
{
  "id": 5678,
  "panlex_uid": 187,
  "language_id": 1234,
  "name": "Japanese (Modern)",
  "script_code": "Jpan",
  "region_code": "JP"
}
Langvars map to Glottolog languages via language_id, enabling translations to specify exactly which variety of a language is being used.
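
Because each langvar carries a `language_id`, a client can group varieties under their parent language. The following Python sketch shows one way to do that over already-decoded langvar records. The function name `group_langvars` is illustrative, and the input is assumed to be a list of dicts shaped like the example above.

```python
from collections import defaultdict
from typing import Any


def group_langvars(langvars: list[dict[str, Any]]) -> dict[int, list[dict[str, Any]]]:
    """Map each language_id to the langvar records that point at it."""
    by_language: dict[int, list[dict[str, Any]]] = defaultdict(list)
    for langvar in langvars:
        by_language[langvar["language_id"]].append(langvar)
    # Return a plain dict so missing keys raise KeyError instead of silently
    # creating empty entries.
    return dict(by_language)
```

Looking up `group_langvars(records)[1234]` would then return every variety recorded for the language with internal ID 1234.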

Language Names

Language names are stored in multiple locales via CLDR data:
{
  "language_id": 1234,
  "locale": "ja",
  "name": "日本語",
  "is_native": true
}
This enables displaying language names in the user’s preferred language.
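
As a sketch of how a client might choose a display name, the Python function below looks for a record in the user's locale, then in the base language of that locale, and finally falls back to a caller-supplied default such as the primary English name. The function name `display_name` and the fallback order are assumptions for illustration. LangDex may resolve locales differently.

```python
from typing import Any


def display_name(
    names: list[dict[str, Any]], language_id: int, locale: str, fallback: str
) -> str:
    """Pick a localized name for language_id, falling back when none matches."""
    # Try the exact locale first, then its base language subtag ("ja-JP" -> "ja").
    base = locale.replace("_", "-").split("-")[0]
    for candidate in (locale, base):
        for record in names:
            if record["language_id"] == language_id and record["locale"] == candidate:
                return record["name"]
    return fallback
```

With the record shown above, a user whose locale is `ja` or `ja-JP` would see 日本語, while a user in a locale with no matching record would see the fallback.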

Scripts

Languages can use multiple writing systems. LangDex tracks this via the language_script table:
| Language | Scripts |
| --- | --- |
| Japanese | Jpan (Japanese), Hira (Hiragana), Kana (Katakana), Latn (Latin) |
| Chinese | Hans (Simplified), Hant (Traditional), Latn (Pinyin) |
| Serbian | Cyrl (Cyrillic), Latn (Latin) |
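
A many-to-many table like this can be read in either direction. The Python sketch below inverts the rows listed above to answer "which languages use a given script?". The row format is simplified for illustration: the real `language_script` table presumably stores language IDs rather than names, and its exact columns are not documented here.

```python
from collections import defaultdict

# (language, ISO 15924 script code) pairs mirroring the examples above.
LANGUAGE_SCRIPT = [
    ("Japanese", "Jpan"), ("Japanese", "Hira"), ("Japanese", "Kana"), ("Japanese", "Latn"),
    ("Chinese", "Hans"), ("Chinese", "Hant"), ("Chinese", "Latn"),
    ("Serbian", "Cyrl"), ("Serbian", "Latn"),
]


def languages_by_script(rows: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Invert (language, script) rows into script -> sorted language names."""
    index: dict[str, set[str]] = defaultdict(set)
    for language, script in rows:
        index[script].add(language)
    return {script: sorted(languages) for script, languages in index.items()}
```

For the rows above, the entry for `Latn` contains all three languages, while `Cyrl` contains only Serbian.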

Common Queries

List all languages in a family

curl "https://api.langdex.co/v1/languages?family=indo1319&limit=50" \
  -H "Authorization: Bearer YOUR_API_KEY"

Search languages by name

curl "https://api.langdex.co/v1/languages/search?q=mand&limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"

Get language with full metadata

curl "https://api.langdex.co/v1/languages/nucl1643?include=scripts,names,family" \
  -H "Authorization: Bearer YOUR_API_KEY"
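
The same requests can be prepared from Python with only the standard library. The sketch below builds a request object for any of the endpoints above without sending it. The helper name `build_request` is hypothetical, and the response format is not covered here, so decoding is left to the caller.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.langdex.co/v1"


def build_request(path: str, api_key: str, **params: object) -> Request:
    """Prepare an authenticated GET request for a LangDex endpoint."""
    # safe="," keeps comma-separated values such as include=scripts,names,family
    # readable instead of percent-encoding the commas.
    query = urlencode(params, safe=",")
    url = f"{BASE_URL}{path}" + (f"?{query}" if query else "")
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})
```

A call such as `build_request("/languages/nucl1643", "YOUR_API_KEY", include="scripts,names,family")` produces the same URL and header as the last curl example. Passing the result to `urllib.request.urlopen` would send it.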

Data Coverage

| Level | Count |
| --- | --- |
| Language families | ~450 |
| Languages | ~7,000 |
| Dialects | ~20,000 |
| Language names | ~5.4M (200+ locales) |

  • Glottolog - Source data for language taxonomy
  • ISO 639-3 - Language code standard
  • CLDR - Localized language names