The construction of a corpus of aviation scientific articles: an interdisciplinary study

Authors

  • Fernanda Beatriz Caricari de Morais, Divisão de Ensino, Academia da Força Aérea, AFA, Pirassununga, SP, Brasil
  • João Paulo Martins dos Santos, Divisão de Ensino, Academia da Força Aérea, AFA, Pirassununga, SP, Brasil

DOI:

https://doi.org/10.22480/revunifa.2024.37.617

Keywords:

Corpus, Corpus Linguistics, Systemic-Functional Linguistics

Abstract

This article presents the experience of building a corpus of scientific articles written in English in the field of aviation, and the linguistic-computational treatment given to it by Corpus Linguistics. Data collection was performed using computer programming techniques for data scraping, which allowed the collection of articles from two electronic journals: Air & Space Power Journal and Journal of Aviation/Aerospace Education and Research. The corpus is used for linguistic research based on Systemic-Functional Linguistics (Halliday, 1994; Halliday & Matthiessen, 2004, 2014), which sees language as a system of potential meanings in which the concept of choice is essential. This view allows the study of lexical regularities and has implications for both language description and language teaching. With the computational tools of Corpus Linguistics (Berber-Sardinha, 2000, 2004), it is possible to work with a large number of texts and obtain quantitative data that support the qualitative analysis of these regularities. As a result, we have a study corpus that can be considered medium-large (Berber-Sardinha, 2004), with more than three million words. It is expected that the construction of this corpus will encourage new linguistic and statistical research in aviation, especially involving cadets who participate in scientific initiation programs and who are writing their course completion papers.
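The two computational steps named in the abstract, collecting article links by scraping and measuring corpus size in words, can be illustrated with a minimal sketch. It is not the authors' code: it uses only the Python standard library, whereas the reference list cites BeautifulSoup and pdfminer.six. The class `PdfLinkCollector`, the function `count_tokens`, the tokenization rule, the sample HTML, and the `example.org` URLs are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkCollector(HTMLParser):
    """Collect absolute URLs of <a> links that point at PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only links whose target ends in .pdf (case-insensitive).
        if href and href.lower().endswith(".pdf"):
            self.pdf_links.append(urljoin(self.base_url, href))


def count_tokens(text):
    """Count word tokens with a simple regex (an assumed tokenization rule)."""
    return len(re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text))


# Hypothetical issue page; a real run would download the HTML first.
sample_html = """
<ul>
  <li><a href="/articles/v1/a1.pdf">Article 1</a></li>
  <li><a href="about.html">About</a></li>
  <li><a href="https://example.org/files/A2.PDF">Article 2</a></li>
</ul>
"""

collector = PdfLinkCollector("https://example.org/journal/")
collector.feed(sample_html)
print(collector.pdf_links)
print(count_tokens("The pilot's pre-flight checklist was completed."))
```

In a full pipeline, each collected link would be downloaded, its text extracted from the PDF, and the token counts summed over all articles to obtain the corpus size.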

Author Biographies

Fernanda Beatriz Caricari de Morais, Divisão de Ensino, Academia da Força Aérea, AFA, Pirassununga, SP, Brasil

She is an Adjunct Professor III at the Air Force Academy. She holds a PhD in Applied Linguistics and Language Studies (PUC-SP), with a period in the Department of English Studies of the University of Lisbon, and completed postdoctoral work at UFU (PNPD/CAPES) and PUC-SP (PDJ/CNPq). She has taught in the Professional Master's Degree in Bilingual Education at INES/MEC-RJ since 2014. She is a member of the SAL (Systemics Across Languages) international research program and also works in dialogue with the Center for Interdisciplinary Studies in Aerospace Sciences (NEICA/UNIFA). Her research interests are related to the use of Systemic-Functional Linguistics and Corpus Linguistics for the analysis of various aspects of language use. Currently, she analyzes the lexico-grammatical characteristics of academic articles in the field of aviation published in American journals.

João Paulo Martins dos Santos, Divisão de Ensino, Academia da Força Aérea, AFA, Pirassununga, SP, Brasil

He holds a degree in Mathematics from the São Paulo State University Júlio de Mesquita Filho (2006), a master's degree in Mathematics from the same university (2009), and a PhD in Sciences from the São Carlos School of Engineering (EESC-USP). He is an Adjunct Professor at the Air Force Academy in Pirassununga/SP. He has experience in nonlinear and non-ideal dynamical systems, perturbation methods, numerical methods for solving linear systems, and the finite element method. He also has experience in teaching and in Mathematics, with interests in numerical methods for solving ordinary and partial differential equations, residual error estimators for the pollutant transport equation, the Python programming language, scientific computing in Python, and mathematics teaching.

References

BERBER SARDINHA, T. Computador, corpus e concordância no ensino de léxico-gramática de língua estrangeira. In: LEFFA, V. (org.). As palavras e sua companhia: o léxico na aprendizagem. Pelotas: EDUCAT, UCP, 2000. p. 45-72.

BERBER SARDINHA, T. Linguística de Corpus. Barueri-SP: Manole, 2008.

BIBER, D. Representativeness in Corpus Design. Literary and Linguistic Computing, v. 8, p. 243-257, 1993.

BIRD, Steven; LOPER, Edward; KLEIN, Ewan. Natural Language Processing with Python. O’Reilly Media Inc., 2009. Available at: https://www.nltk.org/book/. Accessed on: 24 Jul. 2023.

BISONG, E. Google Colaboratory. In: Building Machine Learning and Deep Learning Models on Google Cloud Platform. Berkeley, CA: Apress, 2019. Chapter 7. Available at: https://doi.org/10.1007/978-1-4842-4470-8_7.

CRYSTAL, D. English as a Global Language. Cambridge: Cambridge University Press, 1997.

EGGINS, S. An introduction to Systemic Functional Linguistics. London: Pinter Publishers, 1994.

GOUVEIA, C. Texto e gramática: uma introdução à linguística sistêmico-funcional. Matraga, Rio de Janeiro, v. 16, n. 24, p. 13-47, 2009.

GROSS, A. The rhetoric of science. Cambridge, MA: Harvard University Press, 1996.

HALLIDAY, M. A. K. An introduction to Functional Grammar. London: Edward Arnold, 1994.

HALLIDAY, M. A. K.; MATTHIESSEN, C. M. I. M. An introduction to Functional Grammar. London: Edward Arnold. Third Edition, 2004.

HALLIDAY, M. A. K.; MATTHIESSEN, C. M. I. M. An introduction to Functional Grammar. London: Edward Arnold. Third Edition, 2014.

HARRIS, Charles R. et al. Array programming with NumPy. Nature, v. 585, n. 7825, p. 357-362, Sep. 2020. DOI: 10.1038/s41586-020-2649-2. Available at: https://doi.org/10.1038/s41586-020-2649-2.

HUNTER, J. D. Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering, v. 9, n. 3, p. 90-95, 2007.

PDFMINER.SIX. pdfminer.six (Version 20221105). [PDF text extraction software]. 2023. Available at: https://pypi.org/project/pdfminer.six/. GitHub repository: https://github.com/pdfminer/pdfminer.six.

RICHARDSON, Leonard. BeautifulSoup (Version 4.11.2). [Python package for parsing HTML and XML documents]. Available at: https://pypi.org/project/beautifulsoup4/. GitHub repository: https://github.com/wention/BeautifulSoup4.

MARTIN, J. R. English Text: System and Structure. Amsterdam: Benjamins, 1992.

McENERY, T.; WILSON, A. Corpus Linguistics. Edinburgh: Edinburgh University Press.

MOITA LOPES, L. P. (Org.). Por uma Linguística Aplicada Indisciplinar. São Paulo: Parábola Editorial, 2006.

AUTOR1. Entre alhos e bugalhos – os usos do clítico SE na escrita acadêmica. Doctoral thesis. PUC-SP, 2013.

___________. Os dizentes nos artigos científicos de Linguística - um estudo baseado na Linguística Sistêmico-Funcional e com o auxílio da Linguística de Corpus. Letras & Letras, v. 30, p. 46-63, 2014.

___________. O uso do processo existencial ‘haver’ na escrita acadêmica: um estudo com base em um corpus de artigos científicos de diversas áreas do conhecimento. Revista (Con) Textos Linguísticos (UFES), v. 9, p. 142-160, 2015.

___________. O gênero resenha na sala de aula de Língua Portuguesa como L2. Anais do IV Encontro Mundial de Ensino de Língua Portuguesa. Washington: Georgetown University, 2016.

MOREIRA FILHO, J. L. Python para Linguística de Corpus: guia prático. 1. ed. São Paulo: Ed. do Autor, 2021.

SANCHEZ, A. Definicion e historia de los corpus. In: SANCHEZ, A et al (Org.) CUMBRE – corpus linguistico de espanol contemporaneo. Madrid: SGEL, 1995.

SCOTT, M. R. Wordsmith Tools v. 8. Software for text analysis. Oxford University Press, 2018.

THOMPSON, G. Introducing Functional Grammar. New York: Routledge, 1996.

TRASK, R. L. Dicionário de Linguagem e Linguística. São Paulo: Contexto, 2004.

VIRTANEN, Pauli et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods, v. 17, p. 261-272, 2020. DOI: 10.1038/s41592-019-0686-2.

WIDDOWSON, H. ELF and the pragmatics of language variation. Journal of English as a Lingua Franca, v. 4, n. 2, p. 359-372, 2015.

Published

2024-03-18

How to Cite

MORAIS, F. B. C. de; SANTOS, J. P. M. dos. The construction of a corpus of aviation scientific articles: an interdisciplinary study. The Journal of the University of the Air Force, Rio de Janeiro, v. 37, p. 1–21, 2024. DOI: 10.22480/revunifa.2024.37.617. Available at: https://revistaeletronica.fab.mil.br/index.php/reunifa/article/view/617. Accessed on: 13 May 2024.

Section

Original Articles