Machine Translation at South Tyrolean Institutions (pilot study)
- Project duration: May 2021 - May 2022
- Project status: Ongoing
In this project, we conduct a preliminary study on machine translation (MT) at South Tyrolean institutions such as the Free University of Bolzano and the Province of Bolzano. None of them uses customized MT software yet, but they have high translation demands, texts that are to some extent repetitive and structured, and specific needs related to the local variety of German. Customized MT could therefore improve their translation process. Customizing an MT system requires a large number of parallel segments, ideally several hundred thousand up to a million.
In particular, we look into the possibilities of building aligned parallel corpora from the institutions' documents. To build the corpora, we either obtain texts that are already aligned (i.e. translation memories) or collect texts and align them afterwards. The second option also requires an automated collection process (e.g. a web crawler) and therefore the collaboration of IT personnel. In both cases, we contact the institutions and seek a collaboration. In addition, we evaluate which of the MT software available in the scientific and commercial landscape is most suitable for our customization, and we get in contact with foreign institutions that have already conducted similar experiments in order to initiate scientific exchange.
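As an illustration of the first option, the sketch below reads sentence pairs out of a translation memory exported in TMX format. It is a minimal example under stated assumptions, not the project's actual pipeline: the function names `read_tmx_pairs` and `deduplicate`, the German-Italian language pair, and the sample segments are all hypothetical and chosen only for demonstration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample data (assumption): a tiny TMX document with three translation units.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="de-IT" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example" creationtoolversion="1"
          o-tmf="example"/>
  <body>
    <tu>
      <tuv xml:lang="de-IT"><seg>Guten Morgen</seg></tuv>
      <tuv xml:lang="it-IT"><seg>Buongiorno</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="de-IT"><seg>Guten Morgen</seg></tuv>
      <tuv xml:lang="it-IT"><seg>Buongiorno</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="de-IT"><seg>Vielen Dank</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx_pairs(tmx_text, src_lang="de", tgt_lang="it"):
    """Return (source, target) segment pairs found in a TMX string.

    Language codes are compared on their primary subtag only,
    so "de-IT" and "de" both match src_lang="de".
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        src, tgt = segments.get(src_lang), segments.get(tgt_lang)
        if src and tgt:  # skip units that lack one of the two languages
            pairs.append((src, tgt))
    return pairs


def deduplicate(pairs):
    """Drop exact duplicate pairs while keeping the original order."""
    seen = set()
    unique = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


if __name__ == "__main__":
    for src, tgt in deduplicate(read_tmx_pairs(SAMPLE_TMX)):
        print(f"{src}\t{tgt}")
```

Counting the pairs that remain after deduplication gives a first estimate of how far an existing translation memory is from the several hundred thousand segments that customization would ideally need.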
The results will consist of one or more parallel corpora to feed into the selected MT software, with which we will be able to start the customization. A further goal of this project is to define a larger research project on machine translation for South Tyrolean institutions, which we aim to submit to international funding programs (e.g. Interreg).
Chiocchetti E (2019)
Conference: Workshop NMT-Diagnose | Hildesheim | 12.12.2019 | http://hdl.handle.net/10863/12453
De Camillis F, Contarino AG (2021)
Conference: PaCor 2021 | Vitoria-Gasteiz | 23.6.2021 - 25.6.2021 | https://hdl.handle.net/10863/17781
Coherence in academic Italian | Duration: September 2020 - July 2021
Scientific support for terminology issues | Duration: May 2019 - July 2021
Zeit.shift – On a digital journey into yesterday's future: preserving Tyrol's cultural text heritage ... | Duration: September 2020 - July 2021
Learner Corpus Infrastructure | Duration: December 2015 - July 2021
Translation and terminology work in the domain of occupational health and safety | Duration: September 2013 - July 2021
European Network for Combining Language Learning with Crowdsourcing Techniques | Duration: March 2017 - July 2021
One School, Many Languages 2.0 | Duration: December 2018 - July 2021
Machine Translation at South Tyrolean Institutions (pilot study) | Duration: May 2021 - July 2021
Common Language Resources and Technology Infrastructure for Italy at Eurac Research | Duration: March 2017 - July 2021