A lexeme is a dictionary entry: a word or phrase in a specific language, together with its associated linguistic information. LangDex aggregates lexemes from multiple sources, including Wiktionary (via Kaikki), JMDict, and regional dictionaries.

Lexeme Object

{
  "id": 56789,
  "lemma": "水",
  "language_id": 1234,
  "language": "jpn",
  "pos": "noun",
  "reading": "みず",
  "romanization": "mizu",
  "source": "kaikki",
  "senses": [
    {
      "id": 111,
      "sense_order": 1,
      "gloss": "water",
      "meaning_id": 98765,
      "definitions": [
        {
          "language": "eng",
          "text": "The liquid form of H2O"
        }
      ],
      "examples": [
        {
          "text": "水を飲む",
          "translation": "to drink water"
        }
      ]
    }
  ],
  "pronunciations": [
    {
      "ipa": "mizɯ",
      "audio_url": "https://cdn.langdex.co/audio/jpn/mizu.mp3"
    }
  ],
  "etymology": {
    "text": "From Old Japanese 水 (mi₁du)"
  },
  "frequency": {
    "rank": 342,
    "corpus": "opensubtitles"
  }
}

Core Fields

| Field | Type | Description |
|---|---|---|
| id | integer | Internal LangDex ID |
| lemma | string | Dictionary form / headword |
| language_id | integer | Foreign key to language |
| language | string | ISO 639-3 code |
| pos | string | Part of speech |
| reading | string | Phonetic reading (for logographic scripts) |
| romanization | string | Latin transliteration |
| source | string | Data source (kaikki, jmdict, etc.) |
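
The core fields map onto a small typed record. The following is a minimal Python sketch, assuming the payload has already been decoded from JSON. The `Lexeme` class and `parse_lexeme` helper are hypothetical names for illustration, not part of any LangDex client library, and treating `reading` and `romanization` as optional is an assumption based on their descriptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lexeme:
    """Core lexeme fields; nested data (senses, pronunciations, ...) is omitted."""
    id: int
    lemma: str
    language_id: int
    language: str                        # ISO 639-3 code, e.g. "jpn"
    pos: str
    source: str
    reading: Optional[str] = None        # assumed optional (logographic scripts only)
    romanization: Optional[str] = None   # assumed optional


def parse_lexeme(payload: dict) -> Lexeme:
    """Build a Lexeme from an already-decoded JSON object."""
    return Lexeme(
        id=payload["id"],
        lemma=payload["lemma"],
        language_id=payload["language_id"],
        language=payload["language"],
        pos=payload["pos"],
        source=payload["source"],
        reading=payload.get("reading"),
        romanization=payload.get("romanization"),
    )


# Sample values copied from the lexeme object shown earlier.
sample = {
    "id": 56789, "lemma": "水", "language_id": 1234, "language": "jpn",
    "pos": "noun", "reading": "みず", "romanization": "mizu", "source": "kaikki",
}
lexeme = parse_lexeme(sample)
print(lexeme.lemma, lexeme.reading, lexeme.romanization)
```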

Senses

Each lexeme has one or more senses, each representing a distinct meaning of the word:
{
  "id": 111,
  "lexeme_id": 56789,
  "sense_order": 1,
  "gloss": "water",
  "meaning_id": 98765,
  "pos": "noun",
  "register": "neutral",
  "domain": "nature",
  "definitions": [...],
  "examples": [...]
}

Sense Fields

| Field | Type | Description |
|---|---|---|
| sense_order | integer | Position within the lexeme |
| gloss | string | Short definition/translation |
| meaning_id | integer | Link to cross-lingual meaning |
| pos | string | Part of speech (can differ from lexeme) |
| register | string | formal, informal, slang, etc. |
| domain | string | Subject area (medicine, law, etc.) |
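
Because `sense_order` gives each sense its position within the lexeme, a client can sort on it instead of relying on array order. The sketch below assumes a decoded lexeme dictionary. `ordered_senses` and `primary_gloss` are hypothetical helper names, and the second sense in the sample is placeholder data, not real dictionary content.

```python
from typing import Optional


def ordered_senses(lexeme: dict) -> list:
    """Return the lexeme's senses sorted by their sense_order field."""
    return sorted(lexeme.get("senses", []), key=lambda s: s["sense_order"])


def primary_gloss(lexeme: dict) -> Optional[str]:
    """Return the gloss of the first sense, or None if there are no senses."""
    senses = ordered_senses(lexeme)
    return senses[0]["gloss"] if senses else None


lexeme = {
    "lemma": "水",
    "senses": [
        # Placeholder entry, deliberately listed out of order.
        {"sense_order": 2, "gloss": "placeholder second sense"},
        {"sense_order": 1, "gloss": "water"},
    ],
}
print(primary_gloss(lexeme))
```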

Word Forms

Lexemes can have multiple word forms representing inflected variants:
{
  "lexeme_id": 56789,
  "word_forms": [
    {
      "form": "waters",
      "features": ["noun", "plural"]
    },
    {
      "form": "watered",
      "features": ["verb", "past"]
    },
    {
      "form": "watering",
      "features": ["verb", "present-participle"]
    }
  ]
}
Morphological data comes from UniMorph, covering 150+ languages.
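
Since each word form carries a list of feature tags, selecting inflections reduces to a subset test. This is a minimal sketch over the sample payload above. `forms_with` is a hypothetical helper, and the tag names are only those that appear in the example.

```python
def forms_with(word_forms: list, *features: str) -> list:
    """Return the forms whose feature list contains every requested feature."""
    wanted = set(features)
    return [wf["form"] for wf in word_forms if wanted <= set(wf["features"])]


# Word forms copied from the example payload.
word_forms = [
    {"form": "waters", "features": ["noun", "plural"]},
    {"form": "watered", "features": ["verb", "past"]},
    {"form": "watering", "features": ["verb", "present-participle"]},
]

print(forms_with(word_forms, "verb"))
print(forms_with(word_forms, "noun", "plural"))
```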

Pronunciations

{
  "pronunciations": [
    {
      "ipa": "/ˈwɔːtə/",
      "variety": "Received Pronunciation",
      "audio_url": "https://cdn.langdex.co/audio/eng/water-rp.mp3"
    },
    {
      "ipa": "/ˈwɑːtɚ/",
      "variety": "General American",
      "audio_url": "https://cdn.langdex.co/audio/eng/water-ga.mp3"
    }
  ]
}
Pronunciation data comes from:
  • IPA-Dict (1M+ entries)
  • Kaikki/Wiktionary pronunciations
  • Wiktionary audio files
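
When a lexeme has several pronunciations, a client will usually want the one that matches a preferred variety. The sketch below shows one way to do that, assuming `variety` may be absent (the Japanese example earlier has none). `pick_pronunciation` is a hypothetical helper, and falling back to the first entry is a design choice of this sketch, not documented API behavior.

```python
from typing import Optional


def pick_pronunciation(pronunciations: list, variety: str) -> Optional[dict]:
    """Return the entry for the given variety, else the first entry, else None."""
    for entry in pronunciations:
        if entry.get("variety") == variety:
            return entry
    return pronunciations[0] if pronunciations else None


# Varieties and audio URLs copied from the example payload (IPA omitted for brevity).
pronunciations = [
    {"variety": "Received Pronunciation",
     "audio_url": "https://cdn.langdex.co/audio/eng/water-rp.mp3"},
    {"variety": "General American",
     "audio_url": "https://cdn.langdex.co/audio/eng/water-ga.mp3"},
]

chosen = pick_pronunciation(pronunciations, "General American")
print(chosen["audio_url"])
```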

Etymology

{
  "etymology": {
    "text": "From Middle English water, from Old English wæter, from Proto-Germanic *watōr",
    "cognates": [
      {"language": "deu", "lemma": "Wasser"},
      {"language": "nld", "lemma": "water"},
      {"language": "swe", "lemma": "vatten"}
    ]
  }
}
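
The `cognates` list can be flattened into a lookup keyed by language code. This is a small sketch, assuming `cognates` may be missing, as it is in the Japanese example earlier, where the etymology has only `text`. `cognate_map` is a hypothetical helper name.

```python
def cognate_map(etymology: dict) -> dict:
    """Map each cognate's ISO 639-3 language code to its lemma."""
    return {c["language"]: c["lemma"] for c in etymology.get("cognates", [])}


# Cognates copied from the example payload.
etymology = {
    "cognates": [
        {"language": "deu", "lemma": "Wasser"},
        {"language": "nld", "lemma": "water"},
        {"language": "swe", "lemma": "vatten"},
    ],
}

print(cognate_map(etymology))
```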

Frequency

Word frequency data comes from the OpenSubtitles and Leipzig corpora:
{
  "frequency": {
    "rank": 342,
    "count": 1847293,
    "per_million": 3421.5,
    "corpus": "opensubtitles-en"
  }
}
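
The relationship between `count` and `per_million` is not spelled out here. A common convention, assumed in this sketch, is that `per_million` is the raw count normalized by the total number of tokens in the corpus. Under that assumption the two fields together imply the corpus size; the helper names are hypothetical.

```python
def per_million(count: int, corpus_tokens: float) -> float:
    """Occurrences per million tokens (assumed normalization)."""
    return count / corpus_tokens * 1_000_000


def implied_corpus_tokens(count: int, per_million_value: float) -> float:
    """Invert the normalization to estimate the corpus size in tokens."""
    return count / per_million_value * 1_000_000


# Values copied from the example payload.
frequency = {"rank": 342, "count": 1847293, "per_million": 3421.5,
             "corpus": "opensubtitles-en"}

tokens = implied_corpus_tokens(frequency["count"], frequency["per_million"])
print(f"implied corpus size: about {tokens / 1e6:.0f} million tokens")
```

Because `per_million` is independent of corpus size, it is the field to use when comparing frequencies across corpora. `count` and `rank` are only meaningful within the named corpus.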

Proficiency Levels

Lexemes in languages with standardized proficiency tests include a proficiency level:
{
  "proficiency": {
    "standard": "JLPT",
    "level": "N5",
    "is_core_vocabulary": true
  }
}
Supported standards:
  • JLPT (Japanese) - N5 to N1
  • HSK (Chinese) - 1 to 9
  • CEFR (European) - A1 to C2
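
The supported standards and their level ranges can be encoded as a lookup table for client-side validation. The sketch below parses the `STANDARD:LEVEL` form seen in the `proficiency=JLPT:N5` query parameter. `LEVELS` and `parse_proficiency` are hypothetical names, and the HSK levels are assumed to be plain digit strings.

```python
# Levels listed from easiest to hardest, following the ranges above.
LEVELS = {
    "JLPT": ["N5", "N4", "N3", "N2", "N1"],
    "HSK": [str(n) for n in range(1, 10)],   # "1" .. "9" (string form is assumed)
    "CEFR": ["A1", "A2", "B1", "B2", "C1", "C2"],
}


def parse_proficiency(value: str) -> tuple:
    """Split 'STANDARD:LEVEL' and check it against the known standards."""
    standard, sep, level = value.partition(":")
    if not sep or standard not in LEVELS or level not in LEVELS[standard]:
        raise ValueError(f"unsupported proficiency filter: {value!r}")
    return standard, level


print(parse_proficiency("JLPT:N5"))
```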

API Examples

Get a lexeme with full details

curl "https://api.langdex.co/v1/lexemes/56789?include=senses,pronunciations,etymology,frequency" \
  -H "Authorization: Bearer YOUR_API_KEY"

Search lexemes

curl "https://api.langdex.co/v1/lexemes/search?q=water&lang=eng&pos=noun" \
  -H "Authorization: Bearer YOUR_API_KEY"

Get word forms

curl "https://api.langdex.co/v1/lexemes/56789/forms" \
  -H "Authorization: Bearer YOUR_API_KEY"

Get lexemes by proficiency level

curl "https://api.langdex.co/v1/lexemes?lang=jpn&proficiency=JLPT:N5&limit=100" \
  -H "Authorization: Bearer YOUR_API_KEY"
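
The same requests can be built from Python with only the standard library. This sketch constructs the request objects but does not send them. `build_request` is a hypothetical helper, and the endpoints and query parameters are exactly those in the curl examples above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.langdex.co/v1"


def build_request(path: str, api_key: str, **params: str) -> Request:
    """Build an authenticated GET request for a LangDex endpoint."""
    # Keep ':' and ',' unescaped so URLs match the curl examples verbatim.
    query = urlencode(params, safe=":,")
    url = f"{BASE_URL}{path}" + (f"?{query}" if query else "")
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})


search = build_request("/lexemes/search", "YOUR_API_KEY",
                       q="water", lang="eng", pos="noun")
by_level = build_request("/lexemes", "YOUR_API_KEY",
                         lang="jpn", proficiency="JLPT:N5", limit="100")

print(search.full_url)
print(by_level.full_url)
```

Passing a request to `urllib.request.urlopen` would send it. Handling the response is left out because the response envelope is not described in this section.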

Data Sources

| Source | Languages | Lexemes |
|---|---|---|
| Kaikki (Wiktionary) | 200+ | ~10M |
| JMDict | Japanese | ~200K |
| Regional dictionaries | Various | ~100K |