CEDICT

Updated on Apr 25, 2026

The CEDICT project was started by Paul Denisowski in 1997 and is now maintained by MDBG under the name CC-CEDICT. It aims to provide a complete Chinese-to-English dictionary with pinyin pronunciations for the Chinese characters.

Content

CEDICT is a plain text file, so a separate program is needed to search and display it; a text editor such as Notepad or a search tool such as egrep is enough. The project is considered a standard Chinese-English reference on the Internet and is used by several other Chinese-English projects. The Unihan Database uses CEDICT data for most of its information about character compounds, but this information is auxiliary and is explicitly not part of the main Unicode database [1].

Features:

Traditional Chinese and Simplified Chinese

Pinyin (several pronunciations)

American English (several equivalents)

As of 14 February 2016, it had 114,087 entries [2] in UTF-8.

The basic format of a CEDICT entry is:

Traditional Simplified [pin1 yin1] /American English equivalent 1/equivalent 2/

漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/
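The entry layout above can be split into its fields with a single regular expression. The following Python sketch is illustrative only: the names `Entry`, `ENTRY_RE` and `parse_entry` are hypothetical and not part of CEDICT or any official tool, and it assumes that each entry occupies one line and that lines beginning with `#` are comments to be skipped.

```python
import re
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    """One dictionary entry (hypothetical container, not an official API)."""
    traditional: str
    simplified: str
    pinyin: str
    definitions: List[str]


# Mirrors the layout: Traditional Simplified [pin1 yin1] /def 1/def 2/
ENTRY_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None for blank, comment or malformed lines."""
    line = line.strip()
    # Assumption: lines starting with '#' are comments, not entries.
    if not line or line.startswith("#"):
        return None
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, definitions = match.groups()
    # The English equivalents are separated by slashes.
    return Entry(traditional, simplified, pinyin, definitions.split("/"))
```

Applied to the sample entry for 漢字, this sketch would yield the traditional form, the simplified form, the pinyin string "han4 zi4" and a list of two slash-separated items, the second being the classifier note.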

Example of a simple egrep search:

$ egrep -i 有勇無謀 cedict.txt

有勇無謀 有勇无谋 [you3 yong3 wu2 mou2] /bold but not very astute/
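A similar lookup can be done without egrep. The Python sketch below is a rough equivalent under stated assumptions: the function name `search_lines` is hypothetical, it performs a case-insensitive plain substring match rather than the regular-expression match that egrep performs, and it assumes that lines beginning with `#` are comments.

```python
from typing import Iterable, Iterator


def search_lines(lines: Iterable[str], term: str) -> Iterator[str]:
    """Yield every non-comment line that contains term, ignoring case."""
    needle = term.casefold()
    for line in lines:
        # Assumption: lines starting with '#' are comments, not entries.
        if line.startswith("#"):
            continue
        if needle in line.casefold():
            yield line.rstrip("\n")


# Possible usage, assuming the dictionary is saved as cedict.txt in UTF-8:
#
#     with open("cedict.txt", encoding="utf-8") as handle:
#         for hit in search_lines(handle, "有勇無謀"):
#             print(hit)
```

Because the function accepts any iterable of strings, it works on an open file as well as on a list of lines held in memory.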

Related projects

CEDICT has paved the way for several other projects:

HanDeDict, a free Chinese-German dictionary (127,000 Chinese entries)

CFDICT, a Chinese-French dictionary (200,000 entries)

Some older CEDICT data is also found in the Adsotrans dictionary.

ChE-DICC, a free Spanish-Chinese dictionary, started in February 2012 (currently in beta)

CC-Canto is Pleco Software's addition of Cantonese language readings in Jyutping transcription to CC-CEDICT

Cantonese CEDICT features Cantonese language readings in Yale transcription and has Cantonese-specific words, many of which were taken from "A Dictionary of Cantonese Slang" (possible copyright infringement)

References

CEDICT Wikipedia

(Text) CC BY-SA
