Lexeme Object
Core Fields
| Field | Type | Description |
|---|---|---|
| id | integer | Internal LangDex ID |
| lemma | string | Dictionary form / headword |
| language_id | integer | Foreign key to language |
| language | string | ISO 639-3 code |
| pos | string | Part of speech |
| reading | string | Phonetic reading (for logographic scripts) |
| romanization | string | Latin transliteration |
| source | string | Data source (kaikki, jmdict, etc.) |
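
As an illustration of how a client might hold these fields, here is a minimal sketch in Python. The `Lexeme` class and `lexeme_from_dict` helper are hypothetical names, not part of LangDex. Treating `reading` and `romanization` as optional is an assumption, and the sample values are made up for the example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Lexeme:
    """Client-side mirror of the core fields table (hypothetical, not a LangDex class)."""
    id: int                              # Internal LangDex ID
    lemma: str                           # Dictionary form / headword
    language_id: int                     # Foreign key to language
    language: str                        # ISO 639-3 code
    pos: str                             # Part of speech
    source: str                          # Data source, e.g. "kaikki" or "jmdict"
    reading: Optional[str] = None        # Assumed optional: only for logographic scripts
    romanization: Optional[str] = None   # Assumed optional: Latin transliteration


def lexeme_from_dict(record: dict) -> Lexeme:
    """Build a Lexeme from a dict keyed by the documented field names.

    Keys that are not in the core fields table are ignored.
    """
    known = {f.name for f in fields(Lexeme)}
    return Lexeme(**{k: v for k, v in record.items() if k in known})


# Illustrative record; the values are invented for this example.
example = lexeme_from_dict({
    "id": 1, "lemma": "猫", "language_id": 10, "language": "jpn",
    "pos": "noun", "reading": "ねこ", "romanization": "neko",
    "source": "jmdict", "unexpected_key": True,
})
```

Ignoring unknown keys keeps the sketch tolerant of extra fields that a real response might carry.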
Senses
Each lexeme has one or more senses - distinct meanings of the word.
Sense Fields
| Field | Type | Description |
|---|---|---|
| sense_order | integer | Position within the lexeme |
| gloss | string | Short definition/translation |
| meaning_id | integer | Link to cross-lingual meaning |
| pos | string | Part of speech (can differ from lexeme) |
| register | string | formal, informal, slang, etc. |
| domain | string | Subject area (medicine, law, etc.) |
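
The sketch below shows one way a client could work with the sense fields. It sorts senses by `sense_order` and labels each gloss with a part of speech. The `Sense` class and `ordered_glosses` function are hypothetical. Falling back to the lexeme's `pos` when a sense has none is an assumption, suggested only by the note that a sense's `pos` can differ from the lexeme's.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sense:
    """Client-side mirror of the sense fields table (hypothetical class)."""
    sense_order: int                 # Position within the lexeme
    gloss: str                       # Short definition/translation
    meaning_id: Optional[int] = None # Link to cross-lingual meaning
    pos: Optional[str] = None        # May differ from the lexeme's pos
    register: Optional[str] = None   # formal, informal, slang, etc.
    domain: Optional[str] = None     # Subject area


def ordered_glosses(senses: List[Sense], lexeme_pos: str) -> List[str]:
    """Return 'pos: gloss' strings sorted by sense_order.

    Assumption: a sense without its own pos inherits the lexeme's pos.
    """
    ranked = sorted(senses, key=lambda s: s.sense_order)
    return [f"{s.pos or lexeme_pos}: {s.gloss}" for s in ranked]


# Illustrative senses; the values are invented for this example.
senses = [
    Sense(sense_order=2, gloss="to operate, manage", pos="verb"),
    Sense(sense_order=1, gloss="a period of running"),
]
labels = ordered_glosses(senses, lexeme_pos="noun")
```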
Word Forms
Lexemes can have multiple word forms representing inflected variants.
Pronunciations
- IPA-Dict (1M+ entries)
- Kaikki/Wiktionary pronunciations
- Wiktionary audio files
Etymology
Frequency
Word frequency data comes from the OpenSubtitles and Leipzig corpora.
Proficiency Levels
For languages with standardized proficiency tests:
- JLPT (Japanese) - N5 to N1
- HSK (Chinese) - 1 to 9
- CEFR (European) - A1 to C2
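
The three scales above run from easiest to hardest, which a client can encode as ordered lists. This is a minimal sketch. The names `PROFICIENCY_SCALES`, `level_rank`, and `at_or_below` are hypothetical, and how LangDex itself stores or spells level values is not specified here.

```python
# Each scale is listed easiest first, matching the ranges given above.
PROFICIENCY_SCALES = {
    "JLPT": ["N5", "N4", "N3", "N2", "N1"],
    "HSK": [str(n) for n in range(1, 10)],   # "1" through "9"
    "CEFR": ["A1", "A2", "B1", "B2", "C1", "C2"],
}


def level_rank(scheme: str, level: str) -> int:
    """Return the 0-based position of a level in its scale (0 = easiest).

    Raises ValueError for an unknown scheme or level.
    """
    scale = PROFICIENCY_SCALES.get(scheme.upper())
    if scale is None:
        raise ValueError(f"unknown scheme: {scheme!r}")
    try:
        return scale.index(level.upper())
    except ValueError:
        raise ValueError(f"unknown {scheme} level: {level!r}") from None


def at_or_below(scheme: str, level: str) -> list:
    """Return every level in the scale up to and including the given one."""
    scale = PROFICIENCY_SCALES[scheme.upper()]
    return scale[: level_rank(scheme, level) + 1]
```

For example, `at_or_below("JLPT", "N4")` yields the two easiest JLPT levels, which could be used to select vocabulary a learner at N4 is expected to know.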
API Examples
Get a lexeme with full details
Search lexemes
Get word forms
Get lexemes by proficiency level
Data Sources
| Source | Languages | Lexemes |
|---|---|---|
| Kaikki (Wiktionary) | 200+ | ~10M |
| JMDict | Japanese | ~200K |
| Regional dictionaries | Various | ~100K |
Related Concepts
- Meanings - Cross-lingual semantic hub
- Translations - How lexemes translate