Chinese Dictionaries

 

There are several important consequences that follow from the way Chinese is written.  One such consequence concerns the use of dictionaries.  In many ways, using a Chinese dictionary poses the opposite challenges from using an English one.

In English, it is easy to look up a new word, no matter how bizarrely it is spelt.  As long as you know the alphabetical sequence of the letters, you can find the word in a dictionary.  However, checking the spelling of a word you already know may not be as easy.  Because of the many irregularities of English spelling, it may take a while to find the word.

In Chinese, it is exactly the opposite.  If you know how to say a word and how to spell it in a phonetic representation, it will not be hard to look up the character in an alphabetically ordered dictionary (most modern ones are).  The only snag is that you may have to go through the list of homophones to find the character with the right meaning.  But the spelling itself is far more regular than English spelling.  On the other hand, if you encounter an unfamiliar character, the whole ‘radical-identifying, stroke-counting’ procedure has to be employed.  It is not only time-consuming; in some cases (especially with the simpler characters), you may not be able to find what you want at all.

            Why is looking up an unfamiliar character such a chore?  There are three main sources of difficulty:

            Multiple indices: the entries of most Chinese dictionaries are listed alphabetically.  But to look up an unfamiliar character whose pronunciation is not known, characters also have to be sorted graphically, i.e., by the number of strokes.  The sheer number of characters with the same stroke count dictates that there has to be more than one such graphically based index.  Suppose we listed the characters by the number of strokes and grouped all characters with 5 strokes together.  There would be too many characters with 5 strokes, and finding a character would be like looking for a needle in a haystack!  Therefore, the characters have to be categorized first into different groups.  The grouping is typically done by bushou 部首, or radicals.  So in addition to the many character lists, each headed by a bushou, there has to be a radical list as well.  Looking up a character is then at least a two-stage process: one stage to find the radical, another to find the character among the many characters under that same radical.

            Indeterminacy of radical:  The first step in looking up a character is to decide what its radical is.  But deciding on the radical is by no means easy.  In compound characters, a radical can be on the left, on the right, on top, at the bottom, inside an enclosure or outside one.  In the case of simple characters, it is even harder, as there may not be an obvious radical.

            Counting number of strokes: both for finding the radical on a radical index (the second step) and for finding a character on a character index (the third step), one needs to count strokes.  Counting strokes is not as easy as it looks either.  One needs to know what constitutes a single stroke even when the geometric pattern seems to suggest otherwise.  For example, a square box shape has three strokes despite its four sides, and another shape may have two (or one?) strokes instead of four.
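The multi-stage lookup described above can be pictured as two nested indices. The following Python sketch is illustrative only and does not reflect how any particular dictionary is organized: the tiny tables, the function names, and the convention of counting "residual" strokes (the strokes left once the radical is set aside) are all assumptions made for the example.

```python
# Stage 1 index: radicals grouped by their own stroke count.
# (A tiny, hand-picked sample; a real radical list is far longer.)
RADICALS_BY_STROKES = {
    3: ["口", "氵"],
    4: ["木"],
}

# Stage 2 index: under each radical, characters grouped by the number
# of strokes that remain once the radical is set aside.
CHARS_BY_RADICAL = {
    "木": {3: ["村"], 4: ["林", "杯", "松"]},
    "氵": {3: ["江"], 5: ["河"]},
    "口": {2: ["叫"], 3: ["吃", "吗"]},
}


def candidate_radicals(radical_strokes):
    """Return the radicals listed under a given stroke count."""
    return RADICALS_BY_STROKES.get(radical_strokes, [])


def candidate_characters(radical, residual_strokes):
    """Return the characters under a radical with a given residual stroke count."""
    return CHARS_BY_RADICAL.get(radical, {}).get(residual_strokes, [])


def look_up(radical_strokes, radical, residual_strokes):
    """Walk both indices; a wrong count at either stage finds nothing."""
    if radical not in candidate_radicals(radical_strokes):
        return []
    return candidate_characters(radical, residual_strokes)
```

In this sketch, `look_up(4, "木", 4)` returns a short list that still has to be scanned by eye, while a miscounted radical such as `look_up(3, "木", 4)` returns nothing at all, which mirrors why choosing the radical and counting strokes correctly both matter.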

            Given the difficulty of looking up characters by shape, it may pay off to guess the pronunciation of an unknown character and then look it up by sound.  Because semantic-phonetic characters predominate, we can use the phonetic component to guess at the sound of the character.  The problem is that, owing to sound changes, many phonetic components no longer have the same sound as the character.  Be that as it may, guessing the pronunciation is still worthwhile.  If one is familiar with the typical sound changes, it is even possible to guess at the changed sound of a character.  Looking up a character by shape really is the last resort and should be avoided as much as possible.
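The guess-then-look-up strategy can be sketched as follows. This is a minimal illustration under stated assumptions: the two small tables, the toneless pinyin spellings used as keys, and the function name are invented for the example and are not taken from any real dictionary.

```python
# Assumed present-day reading of each phonetic component (toneless pinyin).
PHONETIC_READINGS = {"青": "qing", "工": "gong"}

# A tiny sound-ordered index: each syllable lists its homophones.
SOUND_INDEX = {
    "gong": ["工", "功", "攻"],
    "jiang": ["江", "讲"],
    "qing": ["青", "清", "情", "晴", "请"],
}


def guess_and_look_up(character, phonetic_component):
    """Guess a syllable from the phonetic component, then scan its homophones.

    Returns the syllable if the character is listed under the guess,
    or None if the guess fails (for instance because of sound change).
    """
    guess = PHONETIC_READINGS.get(phonetic_component)
    if guess is None:
        return None
    if character in SOUND_INDEX.get(guess, []):
        return guess
    return None
```

Here 晴 is found under the guess taken from 青, whereas 江 is not found under the guess taken from 工, because its sound has drifted from that of its phonetic component. In that case the reader would either apply a known sound change and try again, or fall back on lookup by shape.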

 

Chinese Typewriters

 

Before the advent of computers, typewriters were used to produce formal-looking documents.  Instead of the 26 letter keys of an English typewriter, Chinese typewriters needed to hold at least a few thousand characters.  There could not be that many keys, of course.  So a typical setup had a movable pickup that picked up a lead character and then struck it against the paper, very much like the way the hammer of a piano strikes the strings.

The main difficulty in using a Chinese typewriter was finding the location of the desired character among the whole set of a few thousand characters.  This was not something ordinary people could do.  Chinese typists were all professionals.  They also arranged the characters in their own ways to make hunting for characters easier.

 

Chinese Telegraphy

 

Before the advent of modern telecommunication, telegraphy was used for both civilian and military purposes.  One example was the use of Morse code.  In English, different combinations of dits and dahs were used to represent the 26 letters of the English alphabet.  Take a hypothetical example: the letter ‘a’ could be represented as ‘dit dah dit’, and so on.  Communication could be slow, but all you needed to learn were the combinations for 26 letters.  Chinese telegraphy is an entirely different matter.

Accuracy is of paramount importance in telegraphy.  Therefore, each code that is sent must be uniquely associated with the intended character.  No code should have ambiguous interpretations.  This rules out phonetic encoding, as homophony is so extensive in Chinese.  Dialect differences further complicate any possible use of phonetic encoding.

As a result, a brute-force method was adopted as a last resort to represent characters in telegraphy, i.e., four-digit numeric codes.  Let’s take a hypothetical example.  The character for ‘people’ could be 2038, for example, and that for ‘day’ could be 8530.  The problem with such an encoding method, of course, is its arbitrariness, which makes the codes for characters extremely difficult to remember.  Needless to say, one needs a long period of intensive training to become proficient in sending codes.
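The requirement that every code map to exactly one character can be illustrated with a small Python sketch. It reuses the two hypothetical codes from the example above, which are not real telegraph codes; the glyphs 人 (people) and 日 (day) are assumed for illustration, and the function names are invented.

```python
# Hypothetical code table built from the two made-up codes in the text.
# The glyphs 人 (people) and 日 (day) are assumed for illustration.
CODE_TABLE = {"人": "2038", "日": "8530"}


def build_decoder(table):
    """Invert the table, rejecting malformed or ambiguous codes."""
    decoder = {}
    for char, code in table.items():
        if len(code) != 4 or not code.isdecimal():
            raise ValueError(f"not a four-digit code: {code!r}")
        if code in decoder:
            raise ValueError(f"code {code} would be ambiguous")
        decoder[code] = char
    return decoder


def encode(text, table):
    """Replace each character with its four-digit code, separated by spaces."""
    return " ".join(table[ch] for ch in text)


def decode(message, decoder):
    """Turn space-separated four-digit groups back into characters."""
    return "".join(decoder[group] for group in message.split())
```

Because the table is a plain one-to-one mapping with no pattern behind it, decoding is unambiguous, but nothing about a code hints at its character. That is the arbitrariness that made the codes so hard to memorize.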