Optimized Dictionaries: A Semi-Automated Workflow of Concept Identification in Text-Data
[working paper]

dc.contributor.author: Röth, Leonce (de)
dc.contributor.author: Kaftan, Lea (de)
dc.contributor.author: Saldivia Gonzatti, Daniel (de)
dc.date.accessioned: 2024-01-30T08:27:20Z
dc.date.available: 2024-01-30T08:27:20Z
dc.date.issued: 2024 (de)
dc.identifier.uri: https://www.ssoar.info/ssoar/handle/document/91666
dc.description.abstract: Identifying social science concepts and measuring their prevalence and framing in text data has long been a key task of scientists. Whereas debates about text classification typically contrast different approaches with each other, we propose a workflow that generates optimized dictionaries based on the complementary use of expert dictionaries, machine learning, and topic modeling. We demonstrate our case by identifying the concept of "territorial politics" in leading newspapers vis-à-vis parliamentary speeches in Spain (1976-2018) and the UK (1900-2018). We show that our optimized dictionaries outperform singular text-identification techniques, with F1-scores around 0.9 for unseen data, even if the unseen data comes from a different political domain (media vs. parliaments). Optimized dictionaries have increasing returns and should be developed as a common good for researchers, overcoming costly particularism. (de)
dc.language: en (de)
dc.subject.ddc: Sozialwissenschaften, Soziologie (de)
dc.subject.ddc: Social sciences, sociology, anthropology (en)
dc.subject.other: text-as-data; agenda-setting; salience (de)
dc.title: Optimized Dictionaries: A Semi-Automated Workflow of Concept Identification in Text-Data (de)
dc.title.alternative: Optimierte Wörterbücher: Ein teilautomatisierter Arbeitsablauf zur Identifizierung von Konzepten in Textdaten (de)
dc.description.review: nicht begutachtet (de)
dc.description.review: not reviewed (en)
dc.publisher.country: DEU (de)
dc.subject.classoz: Forschungsarten der Sozialforschung (de)
dc.subject.classoz: Research Design (en)
dc.subject.thesoz: Textanalyse (de)
dc.subject.thesoz: text analysis (en)
dc.subject.thesoz: Massenmedien (de)
dc.subject.thesoz: mass media (en)
dc.subject.thesoz: Parlamentsdebatte (de)
dc.subject.thesoz: parliamentary debate (en)
dc.subject.thesoz: Wörterbuch (de)
dc.subject.thesoz: dictionary (en)
dc.subject.thesoz: Aufmerksamkeit (de)
dc.subject.thesoz: attention (en)
dc.subject.thesoz: politische Agenda (de)
dc.subject.thesoz: political agenda (en)
dc.identifier.urn: urn:nbn:de:0168-ssoar-91666-3
dc.rights.licence: Creative Commons - Namensnennung, Nicht-kommerz. 4.0 (de)
dc.rights.licence: Creative Commons - Attribution-NonCommercial 4.0 (en)
ssoar.contributor.institution: GESIS (de)
internal.status: formal und inhaltlich fertig erschlossen (de)
internal.identifier.thesoz: 10035477
internal.identifier.thesoz: 10037618
internal.identifier.thesoz: 10054083
internal.identifier.thesoz: 10051186
internal.identifier.thesoz: 10036983
internal.identifier.thesoz: 10063283
dc.type.stock: monograph (de)
dc.type.document: Arbeitspapier (de)
dc.type.document: working paper (en)
dc.source.pageinfo: 24, 18 (de)
internal.identifier.classoz: 10104
internal.identifier.document: 3
internal.identifier.ddc: 300
dc.description.pubstatus: Preprint (de)
dc.description.pubstatus: Preprint (en)
internal.identifier.licence: 32
internal.identifier.pubstatus: 3
internal.identifier.review: 3
dc.subject.classhort: 10500 (de)
ssoar.wgl.collection: true (de)
internal.pdf.valid: false
internal.pdf.wellformed: true
internal.pdf.encrypted: false

