Extracting Significant Words in Engineering Texts for Specialised Language Descriptions
Noorli Khamis1, Imran-Ho Abdullah2

1Noorli Khamis*, Pusat Bahasa dan Pembangunan Insan (PBPI), Centre for Technopreneurship Development (CTeD), Universiti Teknikal Malaysia Melaka (UTeM), Melaka, Malaysia.
2Imran-Ho Abdullah, School of Language Studies and Linguistics, Universiti Kebangsaan Malaysia (UKM), Bangi, Malaysia.
Manuscript received on October 11, 2019. | Revised Manuscript received on October 26, 2019. | Manuscript published on November 10, 2019. | PP: 5862-5868 | Volume-8 Issue-12, October 2019. | Retrieval Number: L25171081219/2019©BEIESP | DOI: 10.35940/ijitee.L2517.1081219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The academic discourse of a specialised language is characterised by specialised and technical vocabulary and lexicogrammar. Studies on language description suggest the need to explore and determine the specific characteristics of the academic discourse of each specialised language in order to serve the language needs of learners. This study explores this discipline specificity by examining the nouns used in one specialised language, Engineering English. It integrates a multivariate technique, Correspondence Analysis (CA), as a tool for extracting significant nouns in a specialised language for further scrutiny of language use. CA allows visual representation of word interrelationships across different genres in a specialised language. To exemplify this, an Engineering English Corpus (E2C) was created. E2C comprises two sub-corpora (genres): Engineering reference books (RBC) and online journal articles (EJC). The British National Corpus (BNC) was used as the reference corpus. Thirty key-key-nouns were identified from the E2C, and the frequency lists of these words were retrieved from all the corpora to run the CA. The CA maps of the nouns display how the corpora differ from each other, and which words distinguish not only E2C from a general corpus (BNC) but also the different genres within E2C. Thus, CA shows potential as a tool for displaying the words that characterise a specialised corpus against a general corpus, as well as the different genres within that specialised corpus. The study suggests that identifying specific and significant vocabulary in this way allows more informed descriptions of a specialised language to be made for academic discourse investigations.
Keywords: academic discourse, Correspondence Analysis, ESP, nouns, specialised corpus
Scope of the Article: Specialised language
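
The CA step described in the abstract, in which noun frequency lists from RBC, EJC and BNC are mapped onto shared axes, can be illustrated with a minimal sketch. It is not the authors' procedure or software: it assumes plain simple CA computed with a singular value decomposition in NumPy. The four nouns and all counts are invented for illustration only and are not taken from E2C or the BNC. The function name `correspondence_analysis` is likewise hypothetical.

```python
import numpy as np


def correspondence_analysis(counts):
    """Simple CA of a words-by-corpora frequency table (illustrative sketch)."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                       # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                     # row masses (one per word)
    c = P.sum(axis=0)                     # column masses (one per corpus)
    expected = np.outer(r, c)             # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)  # standardised residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    k = min(N.shape) - 1                  # number of non-trivial dimensions
    # Principal coordinates: words and corpora placed on the same axes
    row_coords = (U[:, :k] * sv[:k]) / np.sqrt(r)[:, None]
    col_coords = (Vt.T[:, :k] * sv[:k]) / np.sqrt(c)[:, None]
    inertia = sv[:k] ** 2                 # inertia carried by each axis
    return row_coords, col_coords, inertia


# Hypothetical nouns and made-up frequencies (NOT the study's data)
nouns = ["system", "process", "method", "result"]
corpora = ["RBC", "EJC", "BNC"]
counts = np.array([
    [120,  90, 30],
    [ 80,  60, 25],
    [ 40, 110, 20],
    [ 30,  95, 35],
])

row_coords, col_coords, inertia = correspondence_analysis(counts)

for noun, xy in zip(nouns, row_coords):
    print(noun, np.round(xy, 3))
for corpus, xy in zip(corpora, col_coords):
    print(corpus, np.round(xy, 3))
print("share of inertia per axis:", np.round(inertia / inertia.sum(), 3))
```

Plotting `row_coords` and `col_coords` on the same two axes would give a map of the kind the abstract refers to. Under the usual reading of such maps, a noun lying in the same direction from the origin as a corpus point is relatively over-represented in that corpus. With three corpora there are at most two non-trivial axes, so a two-dimensional map captures all of the inertia in a table of this shape.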