Language technologies

The Language Technologies (LT) group researches and uses computational approaches to advance different subfields of linguistics. The group’s activities concern:

  • Creation and collection: citizen science, transcription or crowdsourcing initiatives

  • Augmentation: corpus annotation

  • Management: databases, infrastructures and standards

  • Analysis, visualization and prediction: researching non-standard language varieties using, for instance, statistical methods, machine learning and neural networks

  • Application: developing technologies to support research at the Institute

Originally, LT was founded to support other research groups within the Institute with linguistic data management and processing. However, as the number of LT experts at the Institute grew, so did the group’s interest in developing its own research agenda. As a result, LT today actively pursues computational linguistics research on the creation, curation, adaptation, and long-term preservation of language resources and tools, while continuing to provide expertise and technical support to the research efforts of the entire Institute. In doing so, LT enhances traditional research design and procedures with the latest developments from computational linguistics and Natural Language Processing (NLP), and in turn gains a better understanding of the realities and needs of linguistic research stakeholders, which it takes into account in its own work. Its research focuses in particular on non-standard language varieties such as learner language, computer-mediated communication and specialized language.

The mutually beneficial cooperation between the research groups is one of the strengths of the Institute for Applied Linguistics and has led to its international recognition in several areas of applied interdisciplinary research. Ongoing projects focus on resource creation and management (in particular for learner corpora and computer-mediated communication), language research infrastructures, crowdsourcing for computer-assisted language learning, machine learning for data annotation, visualization of linguistic information, Digital Humanities and digital cultural heritage, citizen science, and, to a limited extent, technologies for terminology research and e-lexicography.