Are Eurac Research linguists really gaming during work hours?!

Ötzit! game screen - © Image licensed under CC BY 4.0.

Eurac linguists are exploring gamified data crowdsourcing methods to engage Tyrolean citizens in linguistic and cultural heritage research. We present to you Ötzit!, our first game with a purpose (GWAP) designed to collect transcription data while teaching players how to read German Fraktur.

Yes, it’s all true! You can play video games and be a scientist!

Besides researching contemporary language use, the Institute of Applied Linguistics is increasingly turning to libraries and other memory institutions to leverage their historical collections and investigate the evolution of language in South Tyrol. The more books, dictionaries, newspapers and magazines are digitized, the more opportunities linguists have to study language.

However, not all that glitters is gold! In fact, digitization of textual heritage is not a smooth process and the quality of the digitized text depends on many factors:

  • the quality of the original print text (for example, typeface, age, page layout, state of decay);
  • the care applied during digitization (for instance, pages scanned while being turned, blurry scans, or the scanner operator's hands in the frame);
  • the performance of Optical Character Recognition (OCR), the technology used to transform print text into machine-readable text.

Any of these factors can degrade the result and lead to noisy, error-heavy digitized text (Figure 1).

As a result, researchers need to remove errors from digitized data before they can even begin to process and study it. A frequently cited estimate holds that researchers spend 80% of their valuable time cleaning up and organizing data and only 20% on the actual analysis! 😲

Figure 1: An image scan of a newspaper advert in German Fraktur (first image) compared with the noisy machine-readable version produced by OCR software (second image); mistakes are highlighted in red. Advert taken from Tiroler Land-Zeitung, 21 December 1918, p. 8. © Image licensed under CC BY 4.0.
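
For readers who like to see the noise as a number: one common way to quantify how far OCR output strays from the correct text is the character error rate (CER), the edit distance between the two strings divided by the length of the correct one. The sketch below is only an illustration and not part of the Zeit.shift tooling; the function names and the sample word pair are invented for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalised by the length of the correct text."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)


# Invented example (not taken from the newspaper in Figure 1):
# the OCR output confuses "t" with "l".
reference = "Weihnachten"
ocr_output = "Weihnachlen"
print(f"CER: {character_error_rate(reference, ocr_output):.3f}")
```

A CER of 0 means the OCR output matches the correct text exactly; the higher the value, the more cleaning is needed before analysis.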

So, what can we do to break this 80/20 pattern and help researchers spend more time on analysis rather than data cleaning? Well, in our Zeit.shift project, we've developed the Ötzit! web game, which crowdsources manual corrections of digitized historical Tyrolean newspapers from local citizens while teaching them how to read the German Fraktur typeface! By gamifying an otherwise tedious process, we aim to create a mutually beneficial cooperation between citizens and researchers in the name of cultural heritage awareness and research.

Ötzit! takes its name from Ötzi the Iceman. In the game, alpine animals walk towards Ötzi looking to harm him, while Fraktur words automatically extracted from Tyrolean newspapers appear on the screen; players must type the words correctly and as fast as possible to fend off the animals and thus preserve Ötzi's health (Figure 2). The transcriptions typed by players are collected to test the efficacy of the game as a manual OCR post-correction tool.
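
How might several players' transcriptions of the same word be turned into a single correction? The post does not describe the method used in Ötzit!, so the following is only a minimal sketch of one simple option, majority voting. The function name, the `min_votes` threshold and the sample inputs are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional


def consensus_transcription(transcriptions: list[str],
                            min_votes: int = 2) -> Optional[str]:
    """Return the transcription typed by a strict majority of players,
    or None if no answer is both frequent enough and a clear winner.

    Hypothetical helper: Ötzit! may aggregate player input differently.
    """
    # Ignore empty submissions and surrounding whitespace.
    cleaned = [t.strip() for t in transcriptions if t.strip()]
    if not cleaned:
        return None
    text, votes = Counter(cleaned).most_common(1)[0]
    # Require at least min_votes and more than half of all answers,
    # so that ties never produce a winner.
    if votes >= min_votes and votes > len(cleaned) / 2:
        return text
    return None


# Invented example: three players type the same Fraktur word.
print(consensus_transcription(["Zeitung", "Zeitung", "Zeitnng"]))
```

Requiring a strict majority is a deliberately cautious choice: a word on which players disagree is left uncorrected rather than replaced with a guess.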

Figure 2: A researcher at the Institute of Applied Linguistics playing Ötzit! © Eurac Research - Greta Franzini

Everyone is invited to play, so give it a go and spread the word! Knowledge of German is not required but is certainly an advantage. Head over to register and play (from desktop or mobile devices).

Greta Franzini

Greta H. Franzini is an Anglo-Italian postdoctoral researcher at the Institute of Applied Linguistics at Eurac Research. A Classicist by training, she holds a Ph.D. in Information Studies and Digital Humanities from University College London (UCL) and works across cultural heritage and natural language technology research. In her free time, Greta particularly enjoys sports and driving around in her vintage Fiat Nuova 500 D.

Citation

Franzini, G. Are Eurac Research linguists really gaming during work hours?! https://doi.org/10.57708/b143281732
