E.K. Kurup corpus
The E.K. Kurup corpus is a free and open English–Malayalam dictionary, thesaurus, and synset dataset with over 900,000 synonym entries across English and Malayalam synonym groups (synsets). It was compiled by E.K. Kurup over a period of 20 years. Read story.
Data
Each entry has an English head word, under which there may be one or more part-of-speech groups, each divided into English and Malayalam. Each of these is a list of lists, where the outer list holds the synsets, ordered by popular usage in the English language. For example:
- head: happy
  senses:
    - pos: adjective
      en: [[joyful, cheerful, cheery, merry, jovial, jolly, ...]]
      ml: [[സന്തോഷമുള്ള, സന്തുഷ്ടചിത്തമായ, സാനന്ദ, 'അദു:ഖ', ആനന്ദിത, ...]]
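As a rough illustration of how this structure might be consumed, the sketch below hand-writes the example entry as a Python dictionary and reads synsets out of it. The key names (head, senses, pos, en, ml) come from the example above; the helper functions synsets and top_synonyms are hypothetical names introduced here, and the word lists stay truncated exactly as they are in the example. How the downloaded archive is actually laid out on disk is not described here, so no file loading is shown.

```python
# Hypothetical in-memory form of one entry, mirroring the example above.
# The word lists are truncated here, just as they are in the example.
entry = {
    "head": "happy",
    "senses": [
        {
            "pos": "adjective",
            "en": [["joyful", "cheerful", "cheery", "merry", "jovial", "jolly"]],
            "ml": [["സന്തോഷമുള്ള", "സന്തുഷ്ടചിത്തമായ", "സാനന്ദ", "അദു:ഖ", "ആനന്ദിത"]],
        }
    ],
}


def synsets(entry, pos, lang):
    """Return the synsets (a list of word lists) for one part of speech and language."""
    for sense in entry["senses"]:
        if sense["pos"] == pos:
            return sense.get(lang, [])
    return []  # no group with that part of speech


def top_synonyms(entry, pos, lang, n=3):
    """Return the first n words of the first synset, i.e. the most common usage."""
    groups = synsets(entry, pos, lang)
    return groups[0][:n] if groups else []


print(top_synonyms(entry, "adjective", "en"))
print(top_synonyms(entry, "adjective", "ml"))
```

Because the outer list is ordered by popular usage, taking the first inner list is one simple way to get the most common sense of a word; other uses may want to walk every synset instead.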
Download
ekkurup.tar.gz (~8.6 MB). Licensed under the Creative Commons CC BY-SA 4.0 license.