DIDI

Digital Natives - Digital Immigrants. Writing on social network sites: a corpus-based observation of current linguistic phenomena in South Tyrol, with particular consideration of the writers' age.

  • Project duration: May 2013 - December 2019
  • Project status: finished
  • Funding:
    Provincial P.-L.P. 14. Research projects (Province BZ funding /Project)
  • Total project budget: 200.392,20 €

The research focused on writing in private contexts. Using texts written and published on social network sites (SNS), the project analysed how South Tyrolean users employ the German language, in its standard and dialectal varieties, in written form for communicative purposes. The aim was to identify any particular features that arise from the use of language in new media. Special attention was also paid to the generational aspect, that is, to establishing whether age influences the use of written German.

The DiDi Corpus

The DiDi corpus has an overall size of around 650,000 tokens gathered from 136 South Tyrolean Facebook users who participated in the DiDi project. It consists of 11,102 Facebook wall posts, 6,507 wall comments and 22,218 private messages. All messages were written by the participants throughout 2013. Please read the full description of the corpus for further details, and see also the description of the data collection method and the full description of the DiDi project and its research questions.

Because every participant could contribute private messages, wall texts or both, the corpus comprises wall posts and wall comments from 130 profiles and private messages from 56 profiles; 50 participants granted access to both types of data. The wall posts and comments are freely accessible. For privacy reasons, access to the private messages is restricted and can be granted for scientific research only, after signing a non-disclosure agreement. If you are interested in the data for scientific purposes, please contact the research team.
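The participant and text counts quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are made up for this example, and the figures are copied from the corpus description rather than read from the corpus itself.

```python
# Figures as quoted in the corpus description (not computed from the data).
wall_profiles = 130      # profiles contributing wall posts and comments
message_profiles = 56    # profiles contributing private messages
both = 50                # participants who granted access to both data types

# Inclusion-exclusion: participants in both groups are counted only once.
total_participants = wall_profiles + message_profiles - both
wall_only = wall_profiles - both
messages_only = message_profiles - both

wall_posts, wall_comments, private_messages = 11_102, 6_507, 22_218
total_texts = wall_posts + wall_comments + private_messages
freely_accessible_texts = wall_posts + wall_comments  # private messages are restricted

print(total_participants)        # 136
print(wall_only, messages_only)  # 80 6
print(total_texts)               # 39827
print(freely_accessible_texts)   # 17609
```

The result of 136 participants matches the number of Facebook users stated in the description, so the profile counts are mutually consistent.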

All texts were anonymised to guarantee that the participants' identity cannot be inferred from the texts. The anonymisation covered person names, group names, geographical names and adjectival references, institution names, hyperlinks, mail addresses, phone numbers, bank account numbers, servers, postal codes and other private information. Please read the anonymisation document for the anonymisation keys.

The corpus offers a wide range of research opportunities for linguists who are interested in CMC in general and, more specifically, in multilingual language use, the use of regional varieties, and code-switching, code-shifting and code-mixing phenomena.

Access to the DiDi corpus via ANNIS: https://commul.eurac.edu/annis/didi

Corpus download via Eurac Research Clarin Centre: https://clarin.eurac.edu/

Publications
Das DiDi‐Korpus: Internetbasierte Kommunikation aus Südtirol
Glaznieks A, Frey JC (2020)
Book chapter
Deutsch in Sozialen Medien

https://doi.org/10.1515/9783110679885-019

https://hdl.handle.net/10863/15720

Using Data Mining to Repurpose German Language Corpora. An evaluation of data-driven analysis methods for corpus linguistics
Frey J (2020)
Doctoral thesis (PhD)

https://hdl.handle.net/10863/17321

DIDI - The DiDi Corpus of South Tyrolean CMC 1.0.0
Frey JC, Glaznieks A, Stemle EW (2019)
Database

Further information: http://hdl.handle.net/20.500.12124/7

How FAIR are CMC Corpora?
König A, Frey JC, Stemle EW (2019)
Presentation

Conference: 7th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora19) | Cergy-Pontoise | 9.9.2019 - 10.9.2019

https://hdl.handle.net/10863/11295

Comparison of Automatic vs. Manual Language Identification in Multilingual Social Media Texts
Frey JC, Stemle E, Doğruöz AS (2019)
Book chapter
Building computer-mediated communication corpora for socio-linguistic analysis

https://hdl.handle.net/10863/10130

How FAIR are CMC corpora?
Frey JC, König A, Stemle E (2019)
Contribution in conference proceedings

Conference: 7th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora19) | Cergy-Pontoise | 9.9.2019 - 10.9.2019

Further information: https://cmccorpora19.sciencesconf.org/data/pages/proceedings ...

https://hdl.handle.net/10863/11294

Das DiDi-Korpus: internetbasierte Kommunikation aus Südtirol
Frey J, Glaznieks A (2019)
Presentation

Conference: 55. Jahrestagung des Instituts für Deutsche Sprache | Mannheim | 12.3.2019 - 14.3.2019

https://hdl.handle.net/10863/13382

The myth of the Digital Native? Analysing language use of different generations in Facebook
Frey JC, Glaznieks A (2018)
Contribution in conference proceedings
Der plurilinguale Sprecher in Facebook. Neue Medien und Pluriliteracy in Südtirol
Frey JC (2018)
Presentation

Conference: 4th LRI Workshop for young academics "Language Policy - Language Use - Language Standard" | Meran | 7.6.2018 - 8.6.2018

Becoming a multilingual speaker. New Media and pluriliteracy in South Tyrol
Frey JC (2018)
Presentation

Conference: Round table "Social Net(work)s in Education and Language Sciences" | Heidelberg | 15.6.2018 - 15.6.2018

Pluriliteracy on Social Media. The Multilingual Practices of South Tyroleans on Facebook
Frey JC (2018)
Presentation

Conference: Language, Identity and Education in Multilingual Contexts | Dublin | 2.2.2018 - 4.2.2018

The myth of the Digital Native: Analysing language use of different generations on Facebook
Frey JC, Glaznieks A (2018)
Presentation

Conference: 6th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora18) | Antwerp | 17.9.2018 - 18.9.2018

Sociolinguistic research using the DiDi corpus of South Tyrolean CMC: From corpus-based research designs to computational linguistic challenges
Frey CF, Stemle EW, Glaznieks A (2018)
Presentation

Conference: 44. Österreichische Linguistiktagung 2018 (ÖLT2018) | Innsbruck | 26.10.2018 - 28.10.2018

Experteninterview: Wie viel "Emojion" verträgt unsere Sprache?
Abel A, Frey JC (2018)
Newspaper
Zett: Die Zeitung am Sonntag
Dialekt als Norm? Zum Sprachgebrauch Südtiroler Jugendlicher auf Facebook
Glaznieks A, Frey JC (2018)
Book chapter
Jugendsprachen/Youth Languages: Aktuelle Perspektiven internationaler Forschung/Current Perspectives of International Research

https://doi.org/10.1515/9783110472226-038

https://hdl.handle.net/10863/7699

The Myth of the Digital Native: Analysing language use of different generations on Facebook
Frey JC, Glaznieks A (2018)
Contribution in conference proceedings

Conference: 6th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora18) | Antwerp | 17.9.2018 - 18.9.2018

Further information: https://www.uantwerpen.be/images/uantwerpen/container49896/f ...

https://hdl.handle.net/10863/8093

Connecting Resources: Which Issues have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages?
Beißwenger M, Wigham CR, Etienne C, Fišer D, Suárez HG, Herzberg L, Hinrichs E, Horsmann T, Karlova-Bourbonus N, Lemnitzer L, Longhi J, Lüngen H, Ho-Dac L, Parisse C, Poudat C, Schmidt T, Stemle EW, Storrer A, Zesch T (2017)
Think Global, Write Local – Patterns of Writing Dialect on SNS
Glaznieks A (2017)
Presentation
Geschriebener Dialekt in Südtiroler Facebooktexten
Glück A, Glaznieks A (2017)
Presentation
A data mining approach to digital age
Frey J (2017)
Forlì
Presentation

Conference: DIT Postgraduate Research Workshop | Forlì | 6.7.2016 - 6.7.2016

Think Global, Write Local: Patterns of Writing Dialect on SNS
Glaznieks A (2017)
Contribution in conference proceedings

https://doi.org/10.5281/zenodo.1041851

https://hdl.handle.net/10863/7939

Proceedings of the 5th Conference on CMC and Social Media Corpora for the Humanities
Stemle E, Wigham C (2017)
Bolzano: Eurac Research
Monograph (editor)

Further information: https://zenodo.org/record/1040875

https://doi.org/10.5281/zenodo.1040875

https://hdl.handle.net/10863/6510

Connecting Resources: Which Issues have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages?
Beißwenger M, Wigham CR, Etienne C, Fišer D, Suárez HG, Herzberg L, Hinrichs E, Horsmann T, Karlova-Bourbonus N, Lemnitzer L, Longhi J, Lüngen H, Ho-Dac L, Parisse C, Poudat C, Schmidt T, Stemle E, Storrer A, Zesch T (2017)
Bolzano, Italy
Contribution in conference proceedings
Proceedings of the 5th Conference on CMC and Social Media Corpora for the Humanities

Further information: https://zenodo.org/record/1041877

https://doi.org/10.5281/zenodo.1041877

https://hdl.handle.net/10863/7942

DiDi Corpus
Stemle EW (2017)
Duisburg, Germany
Presentation

Conference: Integrating a new type of language resource into the Digital Humanities landscape: French-German colloquium on standards for corpora of computer-mediated communication | Duisburg | 19.6.2017 - 20.6.2017

Further information: https://sites.google.com/view/dhcmc2017/

https://hdl.handle.net/10863/9186

Mehrsprachigkeit auf Südtirols Social-Media-Profilen
Frey J (2016)
Bozen/Bolzano
Presentation

Conference: Work in Progress Linguistics Colloquium Eurac Research/Free University of Bolzano | Bozen | 11.6.2015 - 11.6.2015

The DiDi Corpus of South Tyrolean CMC Data: A multilingual corpus of Facebook texts
Frey J, Glaznieks A, Stemle EW (2016)
Naples
Presentation

Conference: Third Italian Conference on Computational Linguistics (CliC-it 2016) | Naples | 5.12.2016 - 6.12.2016

DiDi: A multilingual corpus of non-public South Tyrolean computer-mediated communication
Frey J (2016)
Lancaster
Presentation

Conference: UCREL Summer School in corpus-based NLP | Lancaster | 10.7.2016 - 15.7.2016

The DiDi Corpus of South Tyrolean CMC Data: A multilingual corpus of Facebook texts
Frey J, Glaznieks A, Stemle EW (2016)
Naples
Contribution in conference proceedings

Conference: Third Italian Conference on Computational Linguistics (CliC-it 2016) | Naples | 5.12.2016 - 6.12.2016

Further information: http://ceur-ws.org/Vol-1749/paper27.pdf

https://hdl.handle.net/10863/8949

"Bitte deutsch schreiben!" Multilingual and diglossic - a linguistic description of South Tyrolean Facebook users
Glaznieks A, Frey JC (2015)
Presentation

Conference: Multilingualism in the Digital Age | Reading | 19.6.2015 - 19.6.2015

The DiDi Corpus of South Tyrolean CMC Data
Frey J, Glaznieks A, Stemle EW (2015)
Essen
Presentation

Conference: 2nd Workshop of the Natural Language Processing for Computer-Mediated Communication / Social Media: NLP4CMC at GSCL 2015 | Essen | 28.9.2015 - 29.9.2015

The DiDi Project: Collecting, Annotating, and Analysing South Tyrolean Data of Computer-mediated Communication.
Stemle EW (2015)
Rennes
Presentation

Conference: ird-cmc-rennes | International Research Days: Social Media and CMC Corpora for the eHumanities | Rennes | 23.10.2015 - 24.10.2015

Further information: http://ird-cmc-rennes.sciencesconf.org/

https://hdl.handle.net/10863/9187

The DiDi Corpus of South Tyrolean CMC Data
Frey J, Glaznieks A, Stemle EW (2015)
Essen
Contribution in conference proceedings

Conference: 2nd Workshop of the Natural Language Processing for Computer-Mediated Communication / Social Media: NLP4CMC at GSCL 2015 | Essen | 28.9.2015 - 29.9.2015

Further information: http://ceur-ws.org/Vol-1749/paper27.pdf

https://hdl.handle.net/10863/8928

Zum Projekt DiDi - Digital Natives - Digital Immigrants
Frey J (2014)
Bozen/Bolzano
Radio-TV
Wie schreibt Südtirol auf Facebook?
Frey JC (2014)
Presentation

Conference: 1. LRI Workshop "Sprache - Region - Identität in der computervermittelten Kommunikation" | Meran | 13.6.2014 - 14.6.2014

Code-Switching on Facebook Wall Posts of Bilingual German-speaking South Tyroleans
Stuckey N, Frey J (2014)
Vienna
Presentation

Conference: 41. Österreichische Linguistiktagung (ÖLT 2014), Universität Wien | Vienna | 6.12.2014 - 8.12.2014

Collecting language data of non-public social media profiles
Frey J, Glaznieks A, Stemle EW (2014)
Hildesheim
Presentation

Conference: Workshop “NLP 4 CMC: Natural Language Processing for Computer-Mediated Communication / Social Media” at the 12th edition of KONVENS | Hildesheim | 8.10.2014 - 10.10.2014

Collecting language data of non-public social media profiles
Frey J, Stemle EW, Glaznieks A (2014)
Hildesheim: Universitätsverlag Hildesheim, Germany
Contribution in conference proceedings

Conference: Workshop “NLP 4 CMC: Natural Language Processing for Computer-Mediated Communication / Social Media” at the 12th edition of KONVENS | Hildesheim | 8.10.2014 - 10.10.2014

Further information: http://www.uni-hildesheim.de/konvens2014/data/konvens2014-wo ...

https://hdl.handle.net/10863/8891

The Project DIDI. Writing on Social Network Sites – A Corpus-based Observation of the Current Language Use in South Tyrol, with Particular Consideration of the Writers' Age
Glaznieks A, Stemle EW (2013)
Dortmund
Presentation
The Project DIDI. Writing on Social Network Sites – A Corpus-based Observation of the Current Language Use in South Tyrol, with Particular Consideration of the Writers’ Age. Talk at the international workshop "Building Corpora of Computer-Mediated Communi
Glaznieks A, Stemle EW (2013)
Dortmund
Presentation

Conference: International Workshop "Building Corpora of Computer-Mediated Communication: Issues, Challenges, and Perspectives" | Dortmund | 14.2.2013 - 15.2.2013

Herausforderungen bei der automatischen Verarbeitung von dialektalen IBK-Daten
Glaznieks A, Stemle EW (2013)
Darmstadt
Presentation

Further information: https://www.researchgate.net/publication/259344920_Herausfor ...

Our partners

Südtiroler Kulturinstitut

Project Team
Nicole Stuckey

Team Member
