Machine Translation at South Tyrolean Institutions (pilot study)
- Project duration: May 2021 - December 2022
- Project status: Approval by the Scientific Committee
In this project, we aim to conduct a preliminary study on machine translation (MT) at South Tyrolean institutions, e.g. the Free University of Bolzano and the Province of Bolzano. None of them yet uses customized machine translation software, but they have high translation demands, texts that are to some extent repetitive and structured, and specific needs related to the local variety of German. Customized MT could therefore improve their translation process. Customizing an MT system requires a large number of parallel segments, ideally several hundred thousand up to a million.
In particular, in this project we look into the possibilities of building aligned parallel corpora from their institutional documents. To build the corpora, we either seek texts that are already aligned (i.e. translation memories) or collect texts and align them afterwards. The second option would also require an automated collection process (e.g. a web crawler) and therefore the collaboration of IT personnel. In both cases, we contact the institutions and pursue a collaboration. In addition, we evaluate which software is most suitable for our customization among those available in the scientific and commercial landscape, and we get in contact with foreign institutions that have already conducted similar experiments in order to initiate scientific exchange.
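As an illustration of the first option, the sketch below shows how parallel segments might be pulled out of a translation memory exported in the TMX exchange format. It is a minimal example under stated assumptions, not the project's actual pipeline: the function name `tmx_to_pairs`, the language codes `it` and `de`, and the sample document are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_pairs(tmx_text, src_lang="it", tgt_lang="de"):
    """Return (source, target) segment pairs from a TMX document string.

    Hypothetical helper: the default language codes are assumptions.
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):  # one translation unit per aligned segment
        segs = {}
        for tuv in tu.findall("tuv"):
            # Newer TMX versions use xml:lang, older ones a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                # Region subtags are dropped, so "de-IT" is stored as "de".
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        src, tgt = segs.get(src_lang), segs.get(tgt_lang)
        if src and tgt:  # skip units that lack one of the two languages
            pairs.append((src, tgt))
    return pairs


# Invented, simplified sample; real TMX headers carry more attributes.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="it"/>
  <body>
    <tu>
      <tuv xml:lang="it"><seg>Delibera della Giunta provinciale</seg></tuv>
      <tuv xml:lang="de"><seg>Beschluss der Landesregierung</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="it"><seg>Testo senza traduzione</seg></tuv>
    </tu>
  </body>
</tmx>"""

if __name__ == "__main__":
    for src, tgt in tmx_to_pairs(SAMPLE_TMX):
        print(src, "|", tgt)
```

Units without a counterpart in the other language are skipped, so the output contains only usable segment pairs. Texts that are collected without an existing alignment would additionally need a sentence alignment step, which this sketch does not cover.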
The results will consist of one or more parallel corpora to feed into a selected MT system, with which we will be able to start a customization. A further goal of this project is to define a larger research project on machine translation for South Tyrolean institutions, which we aim to submit to international funding programs (e.g. Interreg).
De Camillis F, Contarino AG (2021)
Conference: PaCor 2021 | Vitoria-Gasteiz | 23.6.2021 - 25.6.2021
Chiocchetti E (2019)
Conference: Workshop NMT-Diagnose | Hildesheim | 12.12.2019 - 12.12.2019
Coherence in academic Italian | Duration: September 2020 - October 2021
Scientific support for terminology issues | Duration: May 2019 - October 2021
Zeit.shift – On a digital journey into yesterday's future: preserving Tyrol's cultural text heritage ... | Duration: September 2020 - October 2021
Learner Corpus Infrastructure | Duration: December 2015 - October 2021
Translation and terminology work in the domain of occupational health and safety | Duration: September 2013 - October 2021
European Network for Combining Language Learning with Crowdsourcing Techniques | Duration: March 2017 - October 2021
One School, Many Languages 2.0 | Duration: December 2018 - October 2021
Machine Translation at South Tyrolean Institutions (pilot study) | Duration: May 2021 - October 2021
Common Language Resources and Technology Infrastructure for Italy at Eurac Research | Duration: March 2017 - October 2021