
[working paper]

dc.contributor.author: Katzenbach, Christian [de]
dc.contributor.author: Kopps, Adrian [de]
dc.contributor.author: Magalhães, João Carlos [de]
dc.contributor.author: Redeker, Dennis [de]
dc.contributor.author: Sühr, Tom [de]
dc.contributor.author: Wunderlich, Larissa [de]
dc.date.accessioned: 2023-09-19T06:52:41Z
dc.date.available: 2023-09-19T06:52:41Z
dc.date.issued: 2023 [de]
dc.identifier.uri: https://www.ssoar.info/ssoar/handle/document/89155
dc.description.abstract: Platform policies contain the spelled-out rules about what is allowed and prohibited on a service. As such, they constitute both a normative framework and a means of public communication by platforms. Studying the evolution of the increasingly complex web of policies that platforms have developed can hence allow us to trace the emergence of a specific normative order, i.e. the ways in which platforms are governing user activities and public speech and communication dynamics, as well as identify how they have reacted to public controversies, political debates and legal regulation. A major difficulty for studies on the historical evolution of platform policies, however, is the limited availability of past policies, which are often needed for a thorough analysis, as the policies change quite frequently and even their names and locations often differ from the current version. Although platforms have become increasingly transparent about how and when they are changing their rules and have begun to offer public archives of the different historical versions of their policies, these archives often do not contain all of the past versions of a policy, and relying on them entails trusting the platforms to provide complete information. Thus, it remains hard to systematically study how the rules and norms of platforms have changed over time. Our Platform Governance Archive (PGA) aims to address this need by providing a comprehensive and uniformly collected dataset of all of the historical versions of platform policies which does not rely on the platforms’ own public records.
While we are working on extending the scope of the archive to include more platforms and policies, the current dataset described in this paper contains all of the historical versions of three types of policy documents (Terms of Service, Community Guidelines, Privacy Policies) by four major platforms (Facebook, YouTube, Twitter and Instagram) in the time period from the inception of each policy until the end of 2021. Our paper gives a comprehensive overview of the conceptual layout of the Platform Governance Archive and details the automated and manual processes of data collection and data cleaning, as well as our practical and theoretical challenges. Starting with how we define a relevant change to a platform policy, we lay out how we used the Internet Archive’s Wayback Machine to identify past versions of platform policies, collect them, and then automatically and manually check for changes. Specifically, we explain how we mapped the URLs of the selected policies and how they have changed over time, putting together a puzzle of how they were renamed and relocated. We then detail the automated scraping of these URLs from the Wayback Machine as well as the automated diff-checking which we employed. The last step of the data cleaning consisted of a manual revision of the automatically identified versions based on our definition of a relevant change, which was necessary because a significant amount of data noise remained. The paper furthermore describes how the platforms' ways of displaying their policies have changed over time by increasingly turning them into interactive pages and multi-page documents, as well as how we addressed the data collection challenges that arose from this. It also provides an overview of the resulting v1 corpus of the Platform Governance Archive, a dataset consisting of 354 policy documents with a total of 6,036 pages.
By detailing the structure of our data repository on GitHub, we offer a guide on how to access and work with the data. We furthermore describe the characteristics and details of each platform and policy type to account for the fact that each of them has undergone a specific historical development. Lastly, our paper also presents a structural analysis of some of the general trends and patterns which are visible in the dataset over a time period of up to almost two decades on the document level. Using a quantitative analysis, we analyse how the change frequency and the character count of each platform policy have developed over time. A comparative visualisation of these findings allows us to show how the extent of the policies has grown over time, to identify periods of high growth and frequent changes and to draw comparative conclusions about the four different platforms. The Platform Governance Archive aims to be a resource for researchers, journalists, policy-makers, platform operators, activists, and other stakeholders as well as the general public. By offering both a comprehensive dataset and an accessible interface, we aim to offer and continue to develop this resource to enable research and public debate on the historical evolution of platform policies in order to trace changes, to identify characteristic periods of isomorphic policies, to measure influencing factors, and to understand how specific debates, events, and legislation have influenced and manifested in platform policies. [de]
dc.language: en [de]
dc.subject.ddc: Publizistische Medien, Journalismus, Verlagswesen [de]
dc.subject.ddc: News media, journalism, publishing [en]
dc.subject.other: Platform Governance; Platform Policies; Terms of Service; Community Guidelines; Privacy Policies; Platform Law; Content Moderation; Dataset; Digital Constitutionalism; Platform History; Archive [de]
dc.title: The Platform Governance Archive v1: A longitudinal dataset to study the governance of communication and interactions by platforms and the historical evolution of platform policies (Data Paper) [de]
dc.description.review: begutachtet [de]
dc.description.review: reviewed [en]
dc.publisher.country: DEU [de]
dc.publisher.city: Bremen [de]
dc.subject.classoz: interaktive, elektronische Medien [de]
dc.subject.classoz: Interactive, electronic Media [en]
dc.subject.thesoz: Digitale Medien [de]
dc.subject.thesoz: digital media [en]
dc.subject.thesoz: Online-Medien [de]
dc.subject.thesoz: online media [en]
dc.subject.thesoz: Rahmenbedingung [de]
dc.subject.thesoz: general conditions [en]
dc.subject.thesoz: Governance [de]
dc.subject.thesoz: governance [en]
dc.subject.thesoz: Datenschutz [de]
dc.subject.thesoz: data protection [en]
dc.subject.thesoz: Moderator [de]
dc.subject.thesoz: moderator [en]
dc.rights.licence: Creative Commons - Namensnennung 4.0 [de]
dc.rights.licence: Creative Commons - Attribution 4.0 [en]
internal.status: formal und inhaltlich fertig erschlossen [de]
internal.identifier.thesoz: 10083753
internal.identifier.thesoz: 10064820
internal.identifier.thesoz: 10055889
internal.identifier.thesoz: 10054891
internal.identifier.thesoz: 10040560
internal.identifier.thesoz: 10052608
dc.type.stock: monograph [de]
dc.type.document: Arbeitspapier [de]
dc.type.document: working paper [en]
dc.source.pageinfo: 34 [de]
internal.identifier.classoz: 1080404
internal.identifier.document: 3
dc.contributor.corporateeditor: Universität Bremen, Zentrum für Medien-, Kommunikations- und Informationsforschung / Centre for Media, Communication and Information Research (ZeMKI)
internal.identifier.corporateeditor: 1371
internal.identifier.ddc: 070
dc.identifier.doi: https://doi.org/10.26092/elib/2331 [de]
dc.description.pubstatus: Veröffentlichungsversion [de]
dc.description.pubstatus: Published Version [en]
internal.identifier.licence: 16
internal.identifier.pubstatus: 1
internal.identifier.review: 2
dc.subject.classhort: 10800 [de]
internal.pdf.valid: true
internal.pdf.wellformed: true
internal.pdf.encrypted: false
ssoar.urn.registration: false [de]

