Because of the serious need for an adequate English-Tamil dictionary, this writer has been compiling, since 1972, a corpus intended for eventual publication as an ``English Dictionary of the Tamil Verb''. This task was at first a by-product of another project, whose corpus was a computerized file of Tamil verbs taken from Fabricius' Tamil-English dictionary (1933 edition). Since the corpus was provided with class and transitivity specification, as well as synonyms, it was easy to generate an output with English entries first, in English alphabetical order; this output would then constitute the nucleus of an English-Tamil dictionary. Admittedly this was perhaps not the ideal way to begin compiling a dictionary, but since the data were available, and contained information not given elsewhere, it seemed a useful point of departure.
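The reversal step described above can be pictured with a minimal sketch. It assumes a simple record layout (Tamil verb, class, transitivity, English glosses) that is purely illustrative; the field names, the function `invert`, and the sample records are hypothetical and do not reproduce the project's actual file format or software.

```python
from collections import defaultdict

# Hypothetical Tamil-first records: (Tamil verb, class, transitivity, glosses).
# Only the two senses of mudi and the gloss `do' for pannu come from the text;
# None marks specifications that the text does not state.
RECORDS = [
    ("mudi", "2b", "tr", ["end", "finish"]),
    ("mudi", None, "dative-stative", ["be able"]),
    ("pannu", None, None, ["do"]),
]


def invert(records):
    """Turn Tamil-first records into English-first entries, alphabetized."""
    index = defaultdict(list)
    for tamil, verb_class, transitivity, glosses in records:
        for gloss in glosses:
            # Each English gloss becomes a headword that points back
            # to the Tamil verb and its grammatical specification.
            index[gloss].append((tamil, verb_class, transitivity))
    # English alphabetical order, ignoring case.
    return sorted(index.items(), key=lambda item: item[0].lower())


if __name__ == "__main__":
    for headword, equivalents in invert(RECORDS):
        print(headword, equivalents)
```

A reversal of this kind only reorders what the source dictionary already contains, so any English headword with no counterpart in the source is simply absent from the output.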
It was decided to concentrate on verbs for two reasons (other than the fact that there were only verbs in the corpus). One is that the verb phrase in Tamil or any Dravidian language is typically the most complex and the most difficult for non-Tamils to master. Most of the syntactic and semantic complexity in the language hinges on the verb phrase, but because of the way dictionaries have been written in the past, the information about class and transitivity, as already mentioned, is always lacking. Instead, lexicographers have left this task to the grammarian, who is supposed to deal with it in grammars of Tamil. Of course there is a limit to what can be included in a dictionary, and many complexities must indeed be explained in the grammar rather than in the dictionary; on the other hand, there is information that must be supplied in the dictionary, and this is the kind that has always been lacking. For example, information as to whether a verb is `dative/stative' (i.e., takes a dative subject rather than a nominative one) ought to be part of the lexicon, since without this information a user is likely to generate ungrammatical sentences. This is especially so because the language often has homonyms, one of which is a stative verb requiring a dative subject and the other a non-stative transitive verb requiring a nominative subject, such as mudi:
In this entry, the verb mudi `be able' is provided with the information that it occurs only in the third person (3pn) with the dative case. When it means `end, finish s.t.' it is not a dative-stative verb, but is of class 2b and is transitive. The classes are based on those of Dr. Graul given in Arden (1942) and other grammars. `b' means that in Spoken Tamil its form is palatalizing, i.e. the past form is mudinj-adu. The policy of this project is to include more of this kind of information than has been given in the past, without cluttering up the lexicon with general knowledge that ought to be part of the user's competence in the language. There is no dictionary that can be used adequately by a person without some command of a language (despite the existence of phrase books for tourists, sold in book stalls in international airports), and it seems desirable to give enough information in the dictionary to allow a competent user to frame adequate sentences. We therefore do not provide `case-frames' for all the verbs of the language, only for those with idiosyncrasies, such as the `dative-stative' verbs mentioned, and even then we use example sentences that illustrate case-relationships where pertinent, rather than abstract `case-frames.'
For example, the fact that a transitive verb takes an object in the accusative is general information that should be part of the user's knowledge of the language. If English speakers regularly choose the Tamil dative case as the equivalent of English verb-plus-particle items like `look for', this is a result of misconceptions about the English language, not the Tamil language, and should be cleared up in the language classroom, not in the dictionary. English speakers typically make this mistake in all languages, e.g. they translate `look for' in French as chercher pour, in German as suchen für, etc. Moreover, the amount of space required to provide case-frames for all the potentially problematical items would be double or triple what we are currently contemplating, and is not a realistic consideration.
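The policy just described, in which case information is recorded only for verbs that depart from the general pattern, might be modelled along the following lines. This is a sketch under stated assumptions, not the project's actual record format: the class `VerbEntry`, its field names, and the rendering layout are all hypothetical, and the only linguistic facts used are those given above for mudi.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerbEntry:
    """Hypothetical record for one English headword and its Tamil equivalent."""
    english: str
    tamil: str
    verb_class: Optional[str] = None          # e.g. "2b"; None if unspecified
    transitive: Optional[bool] = None         # None if unspecified
    subject_case: str = "nominative"          # the general, unmarked pattern
    person_restriction: Optional[str] = None  # e.g. "3pn"

    def is_idiosyncratic(self) -> bool:
        # Only departures from the general pattern need a note in the entry.
        return (self.subject_case != "nominative"
                or self.person_restriction is not None)

    def render(self) -> str:
        parts = [f"{self.english}: {self.tamil}"]
        if self.verb_class is not None:
            parts.append(f"class {self.verb_class}")
        if self.transitive is not None:
            parts.append("tr" if self.transitive else "intr")
        if self.is_idiosyncratic():
            note = f"{self.subject_case} subject"
            if self.person_restriction is not None:
                note += f", {self.person_restriction} only"
            parts.append(note)
        return "; ".join(parts)


# The two homonymous senses of mudi discussed in the text.
be_able = VerbEntry("be able", "mudi", subject_case="dative",
                    person_restriction="3pn")
finish = VerbEntry("finish", "mudi", verb_class="2b", transitive=True)

if __name__ == "__main__":
    print(be_able.render())
    print(finish.render())
```

Defaulting `subject_case` to the nominative keeps ordinary transitive verbs free of case annotation, which mirrors the decision to leave general knowledge to the user's competence and the grammar.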
Secondly, verbs in the Tamil language are a finite set. Tamil does not at present borrow verbs from other languages, nor does it invent new ones except by compounding existing items or by the use of a loan word plus a Tamil verb (e.g. English `drive' may be rendered by draiv plus pannu `do', i.e. draiv pannu, although this process is not permitted in LT). Tamil also has recourse to `borrowing' verbs from older stages of the language and occasionally from its own dialects, but for the most part any listing of the verbs of Tamil now in use would remain fairly constant and not become obsolescent for some time. Nouns, the other major `part of speech' in the Tamil lexicon, are on the other hand an infinite set and therefore somewhat unmanageable given the present circumstances. New nouns have been and are constantly being invented, coined, and neologized by various Tamil-Nadu and Sri Lanka government agencies struggling to produce technical terminology to replace English or Sanskrit terms now in use in the domains of government, agriculture, science, medicine, etc. (Cf. Dhamotharan 1978, especially section III, pp. 117-158). Eventually, of course, we can expect the work of coining new terminology to slow down, at which point a more complete English-Tamil dictionary incorporating this present project plus all the new terminology could be produced. This ENGLISH DICTIONARY OF THE TAMIL VERB can be considered a first step in that direction. Because of the enormity of such an undertaking, however, we do not take a stand, for or against, on this issue at this time. Because the Tamil verb is at once the most complex, urgent, and yet manageable part of the lexicon of the language, it lends itself admirably to becoming the first part of the important task of compiling a complete modern English-Tamil dictionary.
Once the technology (in particular, the experience with managing a vast computerized data base for such a larger project) has been developed by this `pilot program', it should be more feasible to add noun data to the already existing data base of verbs.
Because the original corpus was produced by simply reversing the corpus of Fabricius' Tamil-English dictionary, many entries were inappropriate or archaic, and many others were simply missing. Archaisms abounded because any Tamil entry that might be found in a text required an English equivalent, but that Tamil entry might not be the most appropriate equivalent of a standard English verb. For example, the English verb `smite' certainly ought to appear in an English-French dictionary with its French equivalent tuer, but a French speaker looking up tuer for an appropriate English equivalent ought to be given `kill' as a primary entry, not archaic `smite'. In addition to such problems, the original corpus simply lacked many modern English verbs (since it was originally prepared by German missionaries 200 years ago) as well as their modern Tamil equivalents. Therefore the original computer-generated data base, when Tamil archaisms were thoroughly expunged, lacked many modern English and Tamil forms. My collaborator S. Paranisamy has been working off and on since 1976 to provide new material, and in 1978 under the AIIS grant we made substantial progress (the letters A to M are now essentially complete and have been entered in the data base). In 1984 we received funding from the National Endowment for the Humanities for work in North America, and from the Smithsonian Institution for work in India.
Computerization has allowed us to incorporate all available material from previous dictionaries, and to edit and phototypeset in English and Tamil. Computerization also allows us to update the data base periodically and to reprint revised editions at low cost. Eventually it will also allow us to add nouns, once the aforementioned process of devising new terminology has slowed.