Parallel corpora, metalanguage generation, linguistic pollution, hybridization, semantic extractions, chunks, neologisms.
A member of the Benue-Congo branch of the Niger-Congo language family, the Igbo language is spoken by the Ndi Igbo, who live mainly in south-eastern Nigeria, with an estimated population of 20 million (Microsoft Encarta 2003). In spite of the large population of native speakers, the Igbo language could today be described, without fear of contradiction, as being under threat from numerous factors. Prominent among them are (i) the erroneous view held by many native speakers that their mother tongue is lexically too poor and, therefore, inadequate for discoursing about or documenting certain subjects in the humanities and sciences as well as for translating technical and scientific texts into the language, (ii) abusive borrowing by educated Igbo native speakers, and (iii) the linguistic inferiority complex exhibited by most of the native speakers of the language with regard to the use of their language. However, this experience is not peculiar to Igbo.
In the 16th century, the French language suffered a fate similar to that of the Igbo language today. Latin, regarded then as a 'universal language,' displaced French to the extent that French intellectuals and philosophers found it difficult to express their intellectual and philosophical ideas in their own mother tongue. The French blamed their ancestors, who preferred Latin to their mother tongue, for the poverty of the language: "La langue française est pauvre, parce que nos ancêtres ont plus pratiqué 'le bien faire que le bien dire'" (Lagarde et Michard, 1965: 95), that is, "The French language is poor because our ancestors practised 'doing well' more than 'speaking well'" (translation mine). Later, a number of French intellectuals and philosophers, led by Ronsard and du Bellay, took up the challenge to promote, enrich and restore the linguistic glory of French, their mother tongue, under the umbrella of an association called La Pléiade.
Indeed, the great success recorded by these French intellectuals and philosophers is my major source of inspiration to carry out this research with a view to contributing towards generating the Igbo metalanguage in the area of computer science.
1.0 Linguistic pollution resulting from abusive borrowing
It is said that charity begins at home. But ironically, 'linguistic charity', for virtually every educated Igbo native speaker, begins abroad, especially in London. How else can one explain the general aberrant linguistic behavior of the Ndi Igbo, who can hardly make a single sentence in their own mother tongue without polluting it with (good or bad) English? Consider the following bizarre Igbo utterances in which the igbonized English words and phrases are presented in italics.
- Maị fádà mere travụlụ. O jere vileji.
(My father travelled. He went to the village.)
- A laịkịrị m di taịpụ ọfụ fọrịn gawụnụ na hatị nwunye gị yi bịara m wedin yestadee.
(I like the type of foreign gown and hat which your wife wore to my wedding yesterday.)
- Di soyin dị wọndafụlụ.
(The sewing was wonderful.)
- Bọtụ shuu ya dị veri ịgbotiki. Ọ kọrọ di entaya ataya ọnụ.
(But her shoes were very "igbotic" (inferior). They degraded the entire attire.)
- Sọmwọm gwara m na Ị kwụbeghị yụọ haụzụ renti eri Machị dis yie andị na Ị na-apụlanị ije ovas?
(Someone told me that you had not paid your house rent since March this year and that you are planning to travel overseas.)
- E tekiri m tabụleti abụọ dụrọgụ m n'ụtụtụ afụta brekifasịtị. A ga m eteki di rimeenin araụndụ sikisi mgbede.
(I took two tablets of my drug in the morning after breakfast. I will take the remaining ones around 6 p.m.)
In my linguistic judgment, the above corpus is a typical example of what I call 'linguistic pollution', resulting from 'abusive borrowing', which means the preference for foreign expressions even where Igbo equivalents exist. Note that I am not against borrowing in any way, because no language can do without borrowing from others. In translating selected items from my corpus compiled in the field of computer science, I borrowed quite a number of words, either from English or French, e.g. kọmputa, Intaneetị, mega baịtị, weebụ, etc., which should form part of the Igbo metalanguage.
2.0 What is metalanguage?
Metalanguage "is a technical term referring to a body of coinages in local languages to express contemporary concepts in technology, arts and sciences." (Echerue 2005: 40) In a broader sense, it is "the supra-language required for talking about a language, a people's culture and its entire civilization in that language. It is the sum total of all technical or specialized terms needed for discussing anything and everything in that language." (Emenanjo 2005: 5) One can deduce from the above definitions that metalanguage is a corpus of neologisms constantly reinforced by globalization and other dynamic forces such as research, inventions, scientific discoveries, etc. To this extent no single person can claim authorship of the metalanguage of any particular language. In the case of Igbo, the generation of its metalanguage should involve all stakeholders of the language such as "touts, teachers, students, authors, media people (journalists), traditionalists and modernists, charlatans, professionals and amateurs, madmen and specialists." (Emenanjo 2005: 17) Hence, I am hereby involved.
3.0 The Need for Igbo Metalanguage
As I pointed out in the introduction of this article, most Igbo native speakers tenaciously hold the view that it is impossible to discourse about or document certain subjects in Igbo, especially in the areas of science and technology. For instance, they do not believe that it is possible to teach computer appreciation or computer programming in Igbo. At the surface level, one might tend to justify their fears due to the apparent lack of immediate Igbo equivalents of some of the highly technical terms and expressions. For this reason, the tendency is for one to go all out to borrow and igbonize every technical term. If this happens, one might end up producing a language which could hardly be recognized as Igbo. (See the corpus in section 1.0 above.)
As I stated in the introduction of this article, the French language passed through the same turbulent experience. However, through composition, derivation, borrowing, etc., French intellectuals and philosophers took the bull by the horns by taking up the challenge to provide a metalanguage for their mother tongue, in order to enable it to cope with what they described as "les mots techniques du langage des métiers ... non seulement (pour) les savants, mais aussi toutes sortes d'ouvriers et de gens mécaniques, comme mariniers, fondeurs, peintres, engraveurs et autres" (Lagarde et Michard, 1965: 92), that is, "technical words for professional language ... for not only scientists but also all types of workers and technicians, such as sailors, metalworkers, painters, engravers and others" (translation mine). Indeed, their work paved the way for what I call the "modern French technical lexicon."
Going by the above historical episode, one sees that the Igbo language is in dire need of a functional metalanguage, especially in this computer age, where "transdisciplinarity" globalization is gaining ground (Adedipe 2006). Hence, I would like to acknowledge and appreciate the work already done by different Igbo scholars, research organizations, and educational institutions in their bid to generate a rich and functional Igbo metalanguage. I wish to thank particularly SPILC for the Recommendations of the Igbo Standardization Committee, NERDC for publishing the Quadrilingual Glossary of Legislative Terms (1999), NLC for the Primary Science Terminology, NERC for the Metalanguage Project, etc. All these bodies have so far "brought no fewer than 20,000 words into the modern lexicon of Standard Igbo." (Emenanjo 2005: 9) However, in generating a metalanguage for Igbo, we should not brush aside the following advice:
"If we engage in the naming game in order to expand our knowledge, if we think the names we give to concepts are the best possible aids to a better understanding of those concepts and in teaching, then we must be careful how we choose our terms." (Echerue 2005: 41)
Indeed, I agree with Echerue that "we have to be careful how we choose our terms", probably to avoid a situation where our language would be overloaded with confusing and conflicting terms. But then, first, we should remember that between "un signifiant" and its "signifié" (de Saussure), the rapport is arbitrary. Second, a language has a natural way of shedding unacceptable word creations. Third, we have competent regulatory bodies like the Igbo Studies Association (ISA), the Association for Promoting Nigerian Languages and Culture (APNILAC) and their Standardization Committees, which meet from time to time to consider and regulate some of the neologisms. In my humble opinion, therefore, our young intellectuals and, indeed, every stakeholder in the business of the use of Igbo should be encouraged to continue with their research in the area of "the enlargement of the Igbo lexicon through an aggressive mass coining of new terms and words for the language" (Echerue 2005: 39), especially in this computer/globalization age.
Against this background, this paper aims at contributing its widow's mite, through the translation of selected text samples from my corpus from the field of computer science, towards the generation of Igbo metalanguage. My choice of this corpus is due to the fact that today a thorough knowledge of this discipline serves as a master key to many other human endeavors. I shall adopt the hybrid approach in the metalanguage generation process.
4.0 What is the hybrid approach to metalanguage generation?
Microsoft Encarta Premium Suite 2004 uses the term "hybrid" to describe "a plant produced from a cross between two plants with different genetic constituents" (Botany) or "an animal that results from the mating of parents from two distinct species or subspecies" (Zoology) or "an electronic circuit that consists of two or more components not ordinarily combined with one another" (Electronic Engineering). But in the present circumstance and context, I am going to adopt the concept of "hybrid" as propounded by Nirenburg et al (1993) and quoted by Beal (1996) in his monograph on Example-Based MT (EBMT).
"A very different route to hybridization has been suggested by Nirenburg and his co-workers (Nirenburg, 1993; Nirenburg and Frederking, submitted) in which multiple diverse MT engines co-exist in a single system configuration and contribute their best partial outputs to the overall output of the system."
By hybridization, Nirenburg and his co-researchers simply mean an arrangement where a number of machine translation engines are configured in such a way that they can work together as a single system, so that each engine contributes its own quota of expertise to the translation process. It is like saying, "Two heads (of course, good ones) are better than one" or "Igwe bụ ike" in Igbo. Similarly, an Igbo technical neologism could best be coined from a bilingual, rather than a monolingual, corpus. In other words, it is quite logical to aver that more semantic resources are available in a parallel corpus (a source language text with its translation) than in a single language. Therefore, hybridization means coining a new word or phrase, which I shall refer to as a "chunk" (apud Beal 1996), in a local language (LL) from the meaning of two foreign chunks, (FC1) and (FC2), belonging to two different foreign languages, (FL1) and (FL2) respectively, where FC1 and FC2 denote the same referent. By implication, therefore, hybridization can only take place if the researcher has a good knowledge of two or more foreign languages, in addition to his mother tongue.
The algorithm involves four steps:
- choosing a parallel corpus of a foreign language-pair, that is, a source language text and its translation, depending on the researcher's area of interest;
- studying carefully the nuances between FC1, FC2, etc.;
- executing what I may call "semantic extraction" from each of the foreign chunks and finally;
- coupling the semantic extracts to form a hybrid chunk, that is, a new word or term in the local language, which must be aligned to the phonological framework of the local language.
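The four steps above can be sketched in code. What follows is only an illustrative sketch, not a tool used in this research: the names `SemanticExtract` and `couple` are hypothetical, the data are the "beep" items from Example One below, and plain concatenation is an assumption that stands in for the researcher's judgment about fitting the new word into the phonological framework of the local language.

```python
from dataclasses import dataclass


@dataclass
class SemanticExtract:
    """One meaning component taken from a foreign chunk (step 3)."""
    source_chunk: str  # the foreign chunk (FC1 or FC2) it was extracted from
    gloss: str         # the extracted idea, stated in English
    local_form: str    # the local-language (LL) rendering chosen by the researcher


def couple(extracts):
    """Step 4 (simplified): join the local forms into one hybrid chunk.

    Assumption: simple concatenation in the given order. No phonological
    adjustment is attempted here.
    """
    return "".join(e.local_form for e in extracts)


# Steps 1 and 2 (choosing the parallel corpus and studying the nuances)
# are human judgments, so they appear only as hand-entered data.
beep_extracts = [
    SemanticExtract("beep", "noise", "ụzụ"),
    SemanticExtract("un signal sonore", "signal", "mgbaama"),
]

hybrid_chunk = couple(beep_extracts)
print(hybrid_chunk)
```

The order of the extracts matters in this sketch, since the local form that comes first becomes the head of the coined word.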
5.0 Hybridization
The parallel corpora in examples one, two and three below are in English (FL1) and French (FL2) language-pair, whereas the local language (LL) is Igbo. Note that the coined hybrid chunks (HC) in (LL) are put in italics.
5.1.0 Example One
Foreign Language (FL1): Press and hold any key. If the system beeps, then your keyboard is operating correctly. (COMPAQ 2002: 2-2)
Foreign Language (FL2): Appuyez sur une touche quelconque et maintenez-la enfoncée. Si le système émet un signal sonore, le clavier fonctionne correctement. (COMPAQ 2002: 2-2)
Local Language (LL): Pịgide aka n'elu mpịaka ọbụla sọrọ gị. Ọ bụrụ na nhanhiwe ahụ e mee ụzụmgbaama, nke ahụ na-egosi na ọyọrọmpịaka ahụ na-arụ nke ọma.
Hybrid chunks (italicized in the original): pịgide, mpịaka, nhanhiwe, ụzụmgbaama, ọyọrọmpịaka.
5.1.1 Analysis
In FL1, the idea of "press and hold" (FC1) is rendered in FL2 as "appuyez sur...et maintenez-la enfoncée" (FC2). Based on the nuances between FC1 and FC2, I coined the hybrid chunk (HC), "pịgide aka", with the verbal root, "pị" translating the idea of "press" or "appuyez" and the bound morpheme "-gide" rendering the time lag, "and hold" or "maintenez-la enfoncée".
The FC1 "beep" is rendered into FL2 as "un signal sonore" (FC2). According to Microsoft Encarta Dictionary (2005), beep is "a short high-pitched noise emitted as a signal by a piece of electronic equipment or the horn of a vehicle". After a careful study of FC1 and FC2, I extracted "noise" (ụzụ) from FC1 and "signal" (mgbaama) (NERDC 1991: 257) from FC2. Thereafter, I coupled the two semantic extracts, to obtain "ụzụmgbaama" as the hybrid chunk (HC).
Also, a critical study of the words "key" (FC1) and "touche" (FC2) shows that the referent of the two words relates more to the French word, "touche", than the English equivalent, "key", because the operation involves pressing with a finger. In other words, the "keys" or "touches" are primarily meant to be touched (touché) and pressed (appuyé) and not to be locked or unlocked as with a key, hence, the semantic extraction of the idea of "touche". Thus, I coined the hybrid chunk (HC) "mpịaka" based on the above considerations. Note that "mpịaka" aligns perfectly well into the Igbo phonological framework because its vowels obey the Igbo vowel harmony.
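The vowel harmony claim can be checked mechanically. The sketch below is a minimal illustration under stated assumptions: the two vowel sets (a, ị, ọ, ụ versus e, i, o, u) follow the commonly described harmony groups of Standard Igbo and are not taken from this paper, and the function names are hypothetical. Tone marks are stripped before the check, since they do not affect set membership.

```python
import unicodedata

# Assumed harmony sets (not from this paper): a, ị, ọ, ụ versus e, i, o, u.
DOTTED_SET = set("a\u1ecb\u1ecd\u1ee5")  # a, ị, ọ, ụ as single code points
UNDOTTED_SET = set("eiou")
TONE_MARKS = {"\u0300", "\u0301", "\u0304"}  # combining grave, acute, macron


def normalize(word):
    """Lowercase, drop tone marks, and recompose dotted vowels (NFC)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    stripped = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", stripped)


def vowels(word):
    """Return the vowels of the word in order, ignoring consonants."""
    all_vowels = DOTTED_SET | UNDOTTED_SET
    return [ch for ch in normalize(word) if ch in all_vowels]


def obeys_harmony(word):
    """True if every vowel in the word comes from a single harmony set."""
    found = set(vowels(word))
    return found <= DOTTED_SET or found <= UNDOTTED_SET


print(obeys_harmony("mpịaka"))
```

This check treats the whole string as one unit. A coined compound whose parts come from different harmony sets would fail it, so a fuller version might test each component separately; that refinement is left out here.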
Another typical example of hybridization is found in the translation of the word "keyboard" (FC1) or "clavier" (FC2) into Igbo. Whereas FC1 is both a collective noun and a compound word made up of "key" and "board", FC2 is only a collective noun. However, both denote the same referent, that is, "a row or several rows of keys on a musical instrument like a piano or a machine like a computer" (Gadsby 2003) or "ensemble des touches d'un instrument de musique ... d'une machine" (Dubois 1971). I have already chosen the French word "touche", that is, "mpịaka", instead of the English equivalent "key", that is, "igodo", for reasons given in the above paragraph. But a careful study of the nuances in their meanings shows that "a collection of keys" corresponds to the word "bunch" in the phrase "bunch of keys" in English; hence my choice and translation of "bunch" into Igbo to have "ọyọrọ". Then, the semantic extracts "ọyọrọ" and "mpịaka" could be coupled to produce "ọyọrọmpịaka", which is a typical example of a hybrid chunk (HC), aligned into the Igbo phonological framework.
5.2.0 Example Two
Foreign Language (FL1): Google is a search engine.
Foreign Language (FL2): Le Google est un moteur de recherche.
Local Language (LL): Gọgụlụ bụ igwenchọ.
5.2.1 Analysis
The FC1, "search engine", is translated into FL2 as "moteur de recherche" (FC2). Based on the nuances between FC1 and FC2, I extracted the chunk "igwe" from "engine" and "moteur". I also extracted "nchọ" from "search" and "recherche". Then I coupled "igwe" and "nchọ" to obtain the hybrid chunk (HC), "igwenchọ."
5.3.0 Example Three
Foreign Language (FL1): The mouse is a plug-and-play hardware used for moving the cursor from one part of the screen to another.
Foreign Language (FL2): La souris est un matériel de saisie qui sert à déplacer le curseur sur l'écran.
Local Language (LL): Oke bụ ngwaike fanyegwue nke e ji akpụgharị tịkọm si n'otu akụkụ onyonyo gaa n'akụkụ ọzọ.
Hybrid chunks (italicized in the original): oke, ngwaike fanyegwue, tịkọm, onyonyo.
5.3.1 Analysis
After considering the nuances between the foreign chunks "mouse" and "souris," I chose the local chunk "oke" because the physical shape of the referent looks more like a mouse (oke) than other rodents. A careful study of the nuances between the foreign chunks "hardware" and "matériel" on the one hand, and "software" and "logiciel" on the other hand, prompted the semantic extraction of "ngwa" from "matériel" (FC2) and "ike" from "hardware" (FC1). Thus, I can now couple the semantic extracts to obtain "ngwaike," which is a hybrid chunk (HC). In the same vein, I extracted "ngwa" from "matériel" (FC2) and "nro" from "software" (FC1). When they are coupled, I obtain "ngwanro," which is also a hybrid chunk (HC).
In conclusion, this paper has tried to approach metalanguage generation from the point of view of corpus-based hybridization, that is, choosing a technical text (computer science) in a foreign language (English) and translating it into another foreign language (French); considering the nuances between the technical terms in the foreign-language sentences, extracting the meaning of each of the terms in the two languages; coining a word in the local language (Igbo) based on the meanings extracted from the two different foreign languages. The result obtained from this approach so far has been found to be quite encouraging.
Suggestion for further research
I therefore suggest that the same method be attempted with other language pairs such as German-French, French-Italian, German-Spanish, Japanese-English, Russian-Portuguese, etc. Indeed, researchers can even work in smaller groups of, say, two or three, so that people who are competent in different language pairs can pool their linguistic resources together to form wider and richer semantic fields for semantic extractions, especially in scientific and technical areas. This will go a long way in salvaging some of the world's endangered languages from extinction.
References
Adedipe (2006), "Inceptional Interaction with Proposed Universities," being the title of a lecture delivered by Prof. Adedipe, Chairman, Special Committee on Private Universities (SCOPU) on Wednesday September 20, 2006, at NUC, Abuja
Beal, S. (1996) "Inter-Language Matching," http://crl.nmsu.edu/user/sb/papers/embt/col94/col94.html
Dubois (1971), Dictionnaire du Français Contemporain, Paris, Larousse
Echerue (2005), "Igbo Metalanguage in Literary Discourse" in Ikekonwu and Nwadike (eds), Igbo Language Development: the Metalanguage Perspective, Enugu, CIDJAP Printing Press
Emenanjo (2005), "Beyond Ọkaasụsụ Igbo : Igbo Metalanguage, Past, Present and Future" in Ikekonwu and Nwadike (eds), Ibid
Gadsby (2003) Longman Dictionary of Contemporary English, Edinburgh, Pearson Educational Limited
Lagarde et Michard, (1965)
Microsoft Encarta Reference Library (2005), Microsoft Corporation.
NERDC (1999), Quadrilingual Glossary of Legislative Terms
Saussure, F. (1916), Cours de Linguistique Générale, Genève, quoted in Microsoft
by Enoch Ajunwa
Department of Modern European Languages
Nnamdi Azikiwe University, Awka, Nigeria