Recent headlines point to the widespread use of data and social media tools for a wide range of purposes, including marketing, political micro-targeting, and even credit scoring. These activities fall under the ecosystem of digital marketing. To understand this ecosystem, the first question to ask is: how did this whole infrastructure emerge? And how can it be studied?
This is a challenge being taken on by Dr Anne Helmond and Fernando van der Vlist of the University of Amsterdam, who are trying to historicize the development of this infrastructure: to understand how it emerged and how it has changed in recent years. There is no doubt that advertising and marketing have become ingrained in the core of the ad-supported web, particularly on social media platforms like Facebook. As their role continues to grow, understanding that growth is key.
To understand these developments, the first step is to look at how the web's history is recorded. The first web archiving initiative was the Internet Archive, launched in 1996, which records snapshots of web pages at different points in time; anyone with a URL can look it up to see what records exist for that address. Its archive now holds over 327 billion web pages.
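This URL-based lookup can be sketched in code. The example below is a hedged sketch (the helper names are my own) of querying the Internet Archive's public Wayback Machine availability endpoint, which reports the archived snapshot closest to a given date for a URL; to keep it self-contained, it only builds the query and parses an example JSON response offline, without making a network request:

```python
import json
from urllib.parse import urlencode

# The Wayback Machine's public "availability" endpoint.
WAYBACK_API = "https://archive.org/wayback/available"

def availability_query(url, timestamp=None):
    """Build a query URL asking which snapshot (if any) exists for `url`.
    `timestamp` is an optional YYYYMMDD[HHMMSS] string meaning
    "the capture closest to this date"."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_API}?{urlencode(params)}"

def closest_snapshot(response_json):
    """Pull the closest archived snapshot's URL out of the JSON response,
    or return None if the page was never captured."""
    data = json.loads(response_json)
    snap = data.get("archived_snapshots", {}).get("closest")
    return snap["url"] if snap and snap.get("available") else None

# Abridged example of the response shape the endpoint returns:
sample = ('{"archived_snapshots": {"closest": {"available": true, '
          '"timestamp": "20050429000000", '
          '"url": "http://web.archive.org/web/20050429000000/http://youtube.com/"}}}')

print(availability_query("youtube.com", "20050429"))
print(closest_snapshot(sample))
```

In a real script, the query URL would be fetched with an HTTP client and the response body passed to `closest_snapshot`.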
This way of archiving was sufficient in the early days, when the web consisted of static, self-contained pages with textual content and some images. The Internet Archive records the chronological evolution of websites; one can even see what YouTube looked like in its early days.
Over time, more institutions began trying to archive the web. National libraries now also take part: the Library of Congress, for example, archives special collections of websites around historical events like ‘9/11’ and the Iraq War. Some institutions take a more local approach, archiving only their country’s part of the web; the UK Web Archive, for instance, archives only UK-based websites.
Different archiving strategies have emerged as well. A group called ‘Archiveteam’ focuses on archiving parts of the web that are in danger of deletion. For example, when Yahoo! took over GeoCities and decided to shut it down, putting millions of early web pages at risk of deletion, ‘Archiveteam’ stepped in and archived the site.
However, as the web has continued to grow and become more dynamic, the task of archiving has become more and more challenging. Many websites and platforms are now increasingly personalised and updated frequently, and many of them, particularly social media platforms, sit behind a login wall. When an archiving crawler visits these pages, all it can record is the login page, and that’s it. This makes it hard to archive these platforms and to study how they have evolved.
The Internet Archive found a creative workaround: it created a Facebook profile called ‘Charlie Archivist’ to pass through the login wall and attempt to archive at least the public pages and profiles. But is this a proper way to understand Facebook in 10 years’ time?
To tackle this challenge, Twitter donated all of its tweets to the Library of Congress for archiving. Tweets are valuable data that provide insight into, and evidence of, social trends and how they evolve. This, however, became an infrastructural challenge: the library could not find a way to store and handle such a large and constantly growing amount of data. The archive never became available to the public, and the Library of Congress decided to archive only curated tweets.
There have been calls for more longitudinal studies of platforms and their ecosystems, but how do you do a longitudinal study of an ecosystem? One could start now and chronicle the next 5 years, which would require time and dedication.
Dr Anne Helmond and Fernando van der Vlist have found one approach: using publicly available boundary resources to historicize the development of platforms. Resources such as a platform’s technical developer documentation describe how third parties integrate with the platform, whether through APIs or through tools provided for debugging. Because these pages are public, they are already captured by traditional web archives. By looking at them, one can see what features, tools, and access were available to developers at a given moment, and how these changed over time, and thereby trace the evolution of the platform and its ecosystem.
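In practice, enumerating the archived snapshots of a documentation page can be done through the Internet Archive's CDX server, which lists every capture of a URL. The sketch below is hedged (function names are my own, and the HTTP request itself is left out): it builds such a query and parses the CDX server's JSON output format offline:

```python
import json
from urllib.parse import urlencode

# The Internet Archive's CDX server, which returns one row per
# archived capture of a URL.
CDX_API = "https://web.archive.org/cdx/search/cdx"

def captures_query(url, from_year, to_year):
    """Build a CDX query listing captures of `url`, thinned to at most
    one capture per year via the `collapse` parameter."""
    params = {
        "url": url,
        "from": from_year,
        "to": to_year,
        "output": "json",
        "collapse": "timestamp:4",  # collapse on the first 4 digits (the year)
    }
    return f"{CDX_API}?{urlencode(params)}"

def capture_timestamps(cdx_json):
    """Return the capture timestamps from a CDX JSON response.
    The first row is a header; each later row describes one capture."""
    rows = json.loads(cdx_json)
    if not rows:
        return []
    header, *captures = rows
    col = header.index("timestamp")
    return [row[col] for row in captures]

# Abridged example of the CDX server's JSON output shape:
sample = json.dumps([
    ["urlkey", "timestamp", "original"],
    ["com,facebook)/developers", "20080115093012", "http://developers.facebook.com/"],
    ["com,facebook)/developers", "20120301120000", "http://developers.facebook.com/"],
])

print(capture_timestamps(sample))
```

Each returned timestamp can then be turned into a Wayback Machine URL for that capture, giving a year-by-year series of snapshots of the same documentation page.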
Looking at Facebook’s partner programs and their ‘partner badges’, Anne and Fernando found the evolution clear to trace. What started in 2011 as a single badge representing multiple platform-specific specialities had, by 2015, evolved into different badges using broader terminology that applied to the whole industry. This meant moving from partner expertise like ‘Facebook CMS’ and ‘Facebook Plugins’ to broader expertise like ‘Small Business Solutions’, ‘Content Marketing’ and ‘Community Management’.
Anne and Fernando also looked at the partners Facebook has had over time. The number of partners increased in the early years, then decreased (partly due to mergers and acquisitions of smaller partners), and eventually stabilised as the platform grew and it became harder for a new agency to become a partner.
As the digital ecosystem and social platforms continue to expand, archiving them becomes increasingly difficult. The technique and model proposed by Anne and Fernando offer a promising approach that can be replicated on other platforms to study their historical progression. Together, such studies can give insight into the whole digital ecosystem of social platforms and make it easier for future research to understand the evolution of the web.