DAMICO

Data Mining in Corpus Linguistics

In her PhD project, “Data-Mining in Corpus Linguistics”, Jennifer-Carmen Frey aims to bridge computer science and linguistics by exploring recent data-mining methods and their value for corpus linguistic research. In an exploratory study, state-of-the-art machine-learning approaches to data analysis are examined for their applicability to corpus linguistics and evaluated through prototypical implementations applied to existing corpus research. The central questions, namely whether data-mining methods can a) reproduce (and thereby verify) existing research results and b) lead the linguist to further linguistically interesting patterns emerging from the data, are addressed in several case studies on available non-standard corpora. The results of the work, an evaluation and discussion of the potential and limitations of corpus-driven data-mining approaches together with the adapted implementations provided as ready-to-use plug-ins for widely used corpus software, will show whether and how data-mining techniques can serve general corpus linguistic research.

Publications
Lexikalische Komplexität im Kontext holistischer Textbewertungen
Frey JC (2020)
Presentation/Speech

Conference: Mehrsprachigkeit und Lernerkorpora | Bolzano | 13.2.2020 - 13.2.2020

https://hdl.handle.net/10863/14953

Using Data Mining to Repurpose German Language Corpora. An evaluation of data-driven analysis methods for corpus linguistics
Frey J (2020)
PhD thesis

https://hdl.handle.net/10863/17321

Comparison of Automatic vs. Manual Language Identification in Multilingual Social Media Texts
Frey JC, Stemle E, Doğruöz AS (2019)
Contribution in book
Building computer-mediated communication corpora for socio-linguistic analysis

https://hdl.handle.net/10863/10130

The myth of the Digital Native? Analysing language use of different generations in Facebook
Frey JC, Glaznieks A (2018)
Conference proceedings article

Was wir bewerten, wenn wir Schülertexte bewerten: Menschliche Bewertungen und digitale Zugänge zu ihren empirischen Spuren
Frey JC (2018)
Presentation/Speech

Conference: Expertenworkshop MIT.Qualität | Mannheim | 18.6.2018 - 19.6.2018

The myth of the Digital Native: Analysing language use of different generations on Facebook
Frey JC, Glaznieks A (2018)
Presentation/Speech

Conference: 6th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora18) | Antwerp | 17.9.2018 - 18.9.2018

Sociolinguistic research using the DiDi corpus of South Tyrolean CMC: From corpus-based research designs to computational linguistic challenges
Frey JC, Stemle EW, Glaznieks A (2018)
Presentation/Speech

Conference: 44. Österreichische Linguistiktagung 2018 (ÖLT2018) | Innsbruck | 26.10.2018 - 28.10.2018

Measuring Text Quality in the Digital Age: The Project “MIT.Qualität”
Glaznieks A, Linthe M, Frey JC (2018)
Presentation/Speech

Conference: 1st Literary Summit | Porto | 1.11.2018 - 3.11.2018

The Myth of the Digital Native: Analysing language use of different generations on Facebook
Frey JC, Glaznieks A (2018)
Conference proceedings article

Conference: 6th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora18) | Antwerp | 17.9.2018 - 18.9.2018

More information: https://www.uantwerpen.be/images/uantwerpen/container49896/f ...

https://hdl.handle.net/10863/8093

A data mining approach to digital age
Frey J (2017)
Forlì
Presentation/Speech

Conference: DIT Postgraduate Research Workshop | Forlì | 6.7.2016 - 6.7.2016

DiDi: A multilingual corpus of non-public South Tyrolean computer-mediated communication
Frey J (2016)
Lancaster
Presentation/Speech

Conference: UCREL Summer School in corpus-based NLP | Lancaster | 10.7.2016 - 15.7.2016

Our partners
  • University of Bologna, Department of Interpretation and Translation in Forlì
