CityLIS Writes: Ask the Crowd: Ethnographic Methods and Community Engagement Strategies in Digital Libraries by Irene Tortorella

***This essay was written by CityLIS student Irene Tortorella in Spring 2016. It is reproduced here with the author’s permission as part of our CityLIS Writes initiative.***

Keywords: #digitallibraries #userstudies #libraryoutreach #socialweb #socialtheory #opendata #semanticweb #crowdsourcing


The statement “digital libraries are social systems” underpins the topic of the social significance of digital libraries. Digital libraries are social systems because of their value to society, the community, the crowd[1]. Calhoun (2014) described the social roles assumed by digital libraries, asserting that they “support the free flow of ideas; empower individuals; support teaching, learning and the advancement of knowledge; provide economic benefits; and preserve intellectual and cultural assets for future generations” (Calhoun, 2014). In addition, digital libraries allow individuals to interact with each other, making use of data and information resources (Borgman et al., 1995). Most of the social roles performed by both digital and physical libraries relate to their contribution to democratic access to information. In fact, digital libraries are designed and built for a community of individuals in order to meet their information needs and uses (Borgman et al., 1995). Beyond the social value of digital libraries in contemporary society, Buttenfield et al. (2003) are interested in understanding the social aspects of digital libraries, “the web of social and material relations in which digital libraries are embedded” (Buttenfield et al., 2003). This web of social interactions arises in part from digital libraries’ social roles themselves, and concerns community involvement, engagement and participation in digital libraries. In this essay, in order to uncover the importance of the network of social interactions for digital libraries, I will begin by framing my discussion in terms of social theory, establishing and explaining key concepts from Luhmann’s social systems approach, such as communication, interaction and event.
Then, I am going to give an account of two social aspects of digital libraries: 1) the application of ethnographic methods in the construction of digital library systems; 2) the provision of linked open data[2] for community engagement projects, such as digital scholarship and crowdsourcing initiatives. I have chosen these two aspects because I believe they represent the moments in which digital libraries’ interactions with society, politics, values and the community are most crucial in shaping and improving digital libraries’ information architecture, access and discoverability. In analysing their connections with the crowd, I claim that digital libraries are social systems not only because of their value to the community, but also because the community is of value to digital libraries themselves in many ways.

  1. A theoretical framework: digital libraries as social systems

Luhmann’s theory defines social systems as based on communication and, more precisely, on communicative events (Luhmann, 1995). Events are moments with a start and an end, which together create the sense of the passing of time (Luhmann, 1995). They are communicative because they are meaningful, and they convey meaning because they are filled with information. Interaction is a fundamental concept of the theory: there is no such thing as interaction-free communication. In fact, “society is not possible without interaction, nor interaction without society” (Luhmann, 1995). For Luhmann, technologies cannot be social systems because they are not “communicative events” (Grundmann, 1999). When he first set out his theory, technologies could enable communication, but they lacked the interactive aspect which characterises social systems. Recent interpretations of Luhmann’s theory have been applied to the social study of technology: modern systems theory, built upon Luhmann’s older approaches, considers digital technologies sufficiently interactive to count as socio-technological systems. Human-machine interaction resembles human-human interaction, and from this theoretical perspective digital libraries can also be studied and observed as social systems.

  2. Ethnographic methods in designing digital library systems

“Ethnography” is the study of the distinctive practices of particular human groupings through observation of and immersion in those practices, and also the representation of those people based on such study (Hakken, 1999). Ethnographic fieldwork involves observing the practices of interest directly and meaningfully, and includes finding ways to participate actively in them; in this sense the notion of “embodied understanding” comes into play: a person who actually does something understands it better than one who merely observes it or hears about it from another person (Hakken, 1999). The application of socially grounded user studies in digital library development is part of a broader field related to ethnographic methods, also concerned with human information behaviour. “The librarian as ethnographer” (Kline, 2013) is a type of LIS researcher able to inform the design of digital libraries by observing users’ behaviour through ethnographic methods. Borrowing this kind of ethnographic analysis from social science can convey the perspective of the target user community (Dobreva et al., 2012). Observational and participatory research helps with understanding digital libraries’ users, the way they try to retrieve and use information, and the challenges they face during information-seeking processes (Dent Goodman, 2011).

The investigation of digital library users’ behaviour has its roots in the application of ethnographic research to business and computer science. Hakken, an ethnographer, observed that “cyberspace is the notional social arena we enter when using computers to communicate” (Hakken, 1999). Cyberspace is also a “knowledge society” (Hakken, 1999) that can be analysed by applying an ethnographic approach; in the same way digital libraries are knowledge societies, because they represent socio-technical systems interacting with the production of knowledge (Buttenfield et al., 2003). Digital libraries are designed to provide access – which can be, simultaneously, technical, cognitive and social (Buttenfield et al., 2003, my italics) – to that knowledge. Ethnographic approaches are relevant to the design of information systems because they give insights into the best way to provide those kinds of access to information, which will eventually lead to knowledge. Examples of community analysis, which has a lot in common with ethnography, date back to the end of the 19th century, when gathering information about the community in order to evaluate and improve services was seen as crucial to librarianship (Dent Goodman, 2011). When it comes to designing digital libraries, librarians are ethnographers because they ask the crowd, observe the community, talk to users about what they observe, and ask questions in order to look for patterns in users’ answers. The ideal ethnographic method seeks to gather information through the active observation of users’ behaviour over a period of time; it also involves the direct participation of LIS professionals, in order to provide the necessary context for the investigation (Dent Goodman, 2011). It is fundamental to choose a sample of users which is representative of the entire community, and which will actually make use of a specific digital library.
Complex communities are hard to frame, and this is the reason why Shumar (2005) uses Anderson’s imagined communities concept to choose the target community for his ethnographic research on digital libraries. Communities are products of social imagination, and they must be defined symbolically (Shumar, 2005); therefore choosing a focus group which is “virtually” prototypical of a well-defined community is essential. Nardi and O’Day (2003) argue that it is important to look closer at the people in the moment when they interact with tools, generating practices. “A technological innovation may look good when considered in isolation and yet turn out to be problematic or incomplete in actual settings of use” (Nardi and O’Day, 2003). This is specifically the reason why digital librarians cannot work in isolation, and should instead put the community – the crowd – at the centre of the design process, involving it in various stages of the digital libraries’ implementation and further evaluation.

  3. Community engagement: (linked) open data for crowdsourcing and digital scholarship

Currently, the trend in digital libraries is the move to a linked data environment, inspired by the semantic web model, in which data are released from libraries and their digital catalogues and made freely available on the web. Linked open data improves the relevance of faceted search results; moreover, the online availability of semantically linked collections of data is an opportunity for libraries and information providers to engage with the public. Over the last decade digital libraries have started to make their data available to the community, with the intent of reaching new audiences beyond the library community (Deliot, 2014). Many institutions are expected to reach out to the community, and ask for its feedback or specialist knowledge in various ways (Van Hooland and Verborgh, 2014). There are many methods of seeking assistance from the community: partnerships with volunteers, social networking, and applications such as wikis, which allow for communal input and editing (Budzise-Weaver, Chen and Mitchell, 2012). Then there is crowdsourcing, one of the most important social web phenomena, with many potential applications in the digital libraries world (Calhoun, 2014). Estellés-Arolas and González-Ladrón-de-Guevara (2012) analysed various definitions of crowdsourcing in order to extract the elements common to crowdsourcing initiatives. From their analysis, they derived an integrated definition of crowdsourcing:

“Crowdsourcing is a type of participative online activity in which an individual, an institution, a non-profit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task” (Estellés-Arolas and González-Ladrón-de-Guevara, 2012).

The participative activity of crowdsourcing is social because it is voluntary, and done for the benefit of the community. It is also social because it provides an occasion for the crowd to apply its skills and share its knowledge, in a collaborative effort to make digital libraries better. In fact, the authors cited above continue:

“The undertaking of the task, of variable complexity and modularity, and in which the crowd should participate bringing their work, money, knowledge and/or experience, always entails mutual benefit. The user will receive the satisfaction of a given type of need, be it economic, social recognition, self-esteem, or the development of individual skills, while the crowdsourcer will obtain and utilize to their advantage that what the user has brought to the venture, whose form will depend on the type of activity undertaken.” (Estellés-Arolas and González-Ladrón-de-Guevara, 2012).

Digital libraries which have been involved in crowdsourcing strategies are social systems, because they use social engagement techniques to ask a group of people to achieve a shared, significant, and large goal (Holley, 2010). Individuals from the crowd – the community – work together as a group, and “rather than belonging to a specified group of employees or contractors, people who work on crowdsourced projects are either volunteers or part-time freelancers who generally work online and from home” (Bartlett, 2014). The crowd is a pool of content editors, translators, transcriptionists and annotators (Grassi, Morbidoni, Nucci, 2012), coming into play when, for example, methods such as Optical Character Recognition (OCR) are efficient but not fully reliable in detecting printed text because of the poor condition of the original image (Bartlett, 2014), not to mention the problems with identifying handwritten text. A large number of digital libraries get assistance from their online patrons, not only in transcribing texts, but also in identifying images and content and in tagging elements in digitised documents (Bartlett, 2014). The crowdsourcing strategy integrates the community into the process of developing access to collections. Such digital library projects are, in fact, all about enhancing access, which in turn enables better interoperability and discoverability. For example, the LibCrowds platform, launched by the British Library and developed by British Library Labs, hosts various crowdsourcing projects, including Convert-a-card (British Library Labs, 2015). The latter focuses on the retro-conversion of the printed card catalogue of the British Library’s Asian and African Studies collection, and it asks three volunteers to match the image of a catalogue card against a record from the WorldCat database.

Figure 1: Convert-a-Card Process Copyright © The British Library Board

“By asking three people to complete the same task, and looking for cases where at least two volunteers have selected the same record, we can provide a level of risk mitigation and be confident that the records being retrieved are correct” (Mendes, 2015). This example of collaborative effort shows how individuals come together in interactive, communicative events to help improve the findability of the British Library’s Asian and African Studies collection through interaction with a machine or computer, in line with Luhmann’s social systems approach.
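The agreement rule described in the quotation, accepting a match only when at least two of the three volunteers select the same record, can be sketched in a few lines of Python. This is only an illustration of the idea: the function name, the record identifiers and the use of `None` for "no match selected" are assumptions of mine, not details of how LibCrowds is actually implemented.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_record(
    selections: Sequence[Optional[str]], min_agreement: int = 2
) -> Optional[str]:
    """Return the record chosen by at least `min_agreement` volunteers.

    `selections` holds one (hypothetical) record identifier per volunteer,
    or None when a volunteer found no matching record. If no record reaches
    the agreement threshold, None is returned so the card can be reviewed
    by other means.
    """
    # Count only real selections; a "no match" answer is not a vote for a record.
    votes = Counter(choice for choice in selections if choice is not None)
    if not votes:
        return None
    record_id, count = votes.most_common(1)[0]
    return record_id if count >= min_agreement else None


# Two of three volunteers agree, so the shared record is accepted.
print(consensus_record(["record-A", "record-A", "record-B"]))
# All three volunteers disagree, so no record is accepted.
print(consensus_record(["record-A", "record-B", "record-C"]))
```

With three volunteers and a threshold of two, at most one record can reach the threshold, so no tie-breaking rule is needed.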

So far the contributions to LibCrowds have been applauded as invaluable. The LibCrowds community is so keen on enhancing access to the British Library’s collections that its members have also voluntarily decided to participate in a forum-like online discussion of the project. In this space the crowd meet, discuss, ask questions, and share information and ideas, while curators and librarians from the British Library actively promote the use of BL open datasets, available from the LibCrowds website[3]. Linked open data are useful not only for crowdsourced data and metadata enrichment projects such as Convert-a-card. Putting data to use opens up many possibilities, and from metadata to research data, engaging with the community has never been so easy.

When it comes to research data, the community that digital libraries engage with is generally a scholarly one. As I have already mentioned, linked open data bring digital libraries very close to the semantic web. “The semantic web and linked data are important to the social web because they produce open, reusable bits of data that facilitate machine-to-machine interactions, in turn enabling better integration and interoperability of digital library information in other contexts.” (Calhoun, 2014). The semantic web allows computers to automatically match, retrieve, and link related resources across the internet. From a scholarly point of view, applying the same concept to digital libraries offers significant opportunities for the community of users, in terms of publishing, referencing, researching and re-using digital research outcomes. Linked data repositories are themselves heavily used digital libraries, and make open data accessible everywhere, in real time, with immediate impact for research findings (Griffin, 2015). Linked open data allow scholars to research in a non-traditional way: for example, big datasets are crucial for Digital Humanities research, which applies tools such as data mining or text analysis in order to find meaningful patterns (Hearst, 1999). Linked open data ask the scholarly crowd to change old paradigms of research, in order to move towards a data-driven approach in both the humanities and the sciences. “The emerging paradigm of social machines provides a lens onto future developments in scholarship and scholarly collaboration, as we live and study in a hybrid physical-digital sociotechnical system of enormous and growing scale” (De Roure, 2014).
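To make the idea of machines matching and linking related resources more concrete, here is a minimal sketch in plain Python. Linked data is usually expressed as subject-predicate-object statements ("triples"); the sketch joins a library's statements with an external dataset by following an `owl:sameAs` link. All identifiers beginning with `lib:` and `ext:` are hypothetical and invented for illustration, while `dc:title`, `dc:creator`, `foaf:name` and `owl:sameAs` are abbreviations of widely used vocabulary terms. A real project would more likely use an RDF library and a SPARQL endpoint than hand-written tuples.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements published by a digital library.
LIBRARY_TRIPLES: List[Triple] = [
    ("lib:book/42", "dc:title", "Social systems"),
    ("lib:book/42", "dc:creator", "lib:person/7"),
    ("lib:person/7", "owl:sameAs", "ext:author/luhmann"),
]

# Hypothetical statements published by another organisation.
EXTERNAL_TRIPLES: List[Triple] = [
    ("ext:author/luhmann", "foaf:name", "Niklas Luhmann"),
]


def build_index(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (predicate, object) pairs by subject for quick lookup."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def describe(resource: str, index: Dict[str, List[Tuple[str, str]]]) -> Dict[str, List[str]]:
    """Collect the properties of a resource, following owl:sameAs links."""
    facts: Dict[str, List[str]] = {}
    seen = set()
    to_visit = [resource]
    while to_visit:
        current = to_visit.pop()
        if current in seen:
            continue  # guard against circular sameAs links
        seen.add(current)
        for predicate, obj in index.get(current, []):
            if predicate == "owl:sameAs":
                to_visit.append(obj)  # same entity under another identifier
            else:
                facts.setdefault(predicate, []).append(obj)
    return facts


index = build_index(LIBRARY_TRIPLES + EXTERNAL_TRIPLES)
# The library's own record for the person carries no name, but the linked
# external dataset does, so the merged description includes it.
print(describe("lib:person/7", index))
```

The point of the sketch is that neither dataset needs to know about the other in advance: a single published link is enough for a machine to combine them.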

Figure 2: De Roure, D. (2014).
Creative Commons 2016.

Figure 2 represents De Roure’s (2014) model of the hybrid physical-digital sociotechnical system we live in, which I find applicable to digital libraries and their efforts to engage with the scholarly community through the provision of linked open data. The machines axis indicates that computational capacity increases with the growth of data and electronic devices. The “Internet of things” integrates the physical world into IT systems, and so do the “digital libraries of things”. The “library without walls” brings seamless “anytime, anywhere” access to information (Marshall, 2003). Simultaneously, the people axis represents the rapid progress of social interactions derived from technological innovations. The top right quadrant represents the result of the crowd meeting the digital world (De Roure, 2014). All the interactions between the crowd and digital data render computers – and digital libraries – social machines or social systems (see Luhmann, 1995). In conclusion, digital scholarship and crowdsourcing through library systems are examples of digital interactions, “in-the-wild experiments in the co-production of social machines” (De Roure, 2014).


Social theory, applied to technologies such as digital libraries, can provide new interpretations of the human-machine relationship and the human-human relationship in the digital information society. It can also help build new research questions, which will lead to understanding new aspects of digital libraries, their effects on society, and society’s effects on digital libraries. My essay started by outlining the theoretical framework behind the statement “digital libraries are social systems”: the modern reading of Luhmann’s social systems approach provided a context for the subsequent analysis of digital libraries as human interactions in a digital environment. The application of ethnographic user studies in the design of digital libraries, and the strategic use of open data for outreach in libraries, exemplify the plurality of communicative events induced by interactions in the digital libraries world (see Luhmann, 1995). In analysing these two social aspects of digital libraries, I claimed that digital libraries are social systems because the crowd – the community of end-users – has the vital role of shaping design, contents and discoverability, creating communicative events filled with information and enabled by interactions. As De Roure (2014) brilliantly summarises: “we all are participants, authors and readers alike, and many of us are designers too”.


  • Borgman, C.L., Bates, M.J., Bates, M.V., Efthimiadis, E.N. et al. (1995), “Social aspects of digital libraries” in Background paper for UCLA – National science foundation workshop. Available from [Retrieved 26/04/2016]
  • Budzise-Weaver, T., Chen, J., Mitchell, M. (2012). “Collaboration and crowdsourcing” in The Electronic Library, Vol. 30 Iss 2 pp. 220-232. Available from [Retrieved 21/03/2016]
  • Buttenfield, B.P., Peterson Bishop, A., Van House N.A. (2003), “Introduction: Digital Libraries as sociotechnical systems”, in Buttenfield, B.P., Peterson Bishop, A., Van House N.A. (eds), Digital library use: social practice in design and evaluation, Cambridge, MA: The MIT Press.
  • Calhoun, K. (2014), Exploring digital libraries: foundations, practice, prospects. London: Facet.
  • Dent Goodman, V. (2011). “Applying ethnographic research methods in library and information settings”, in Libri, vol. 61, pp. 1-11. Available from [Retrieved 23/03/2016]
  • Dobreva, M., O’Dwyer, A., Feliciati, P., (2012). “Introduction: user studies for digital library development” in Dobreva, M., O’Dwyer, A., Feliciati, P., (eds), User studies for digital library development, London: Facet.
  • Griffin, S.M. (2015). “Libraries in the digital age: technologies, innovation, shared resources and new responsibilities” in Cantoni, L., and Danowski, J.A., (eds), Communication and technology, Berlin: De Gruyter Mouton, pp. 527-552. Available from [Retrieved 26/03/2016]
  • Grundmann, R. (1999). “On control and shifting boundaries: modern society in the web of systems and networks” in Coutard, O. (ed.). The governance of large technical systems. London & New York: Routledge.
  • Hakken, D. (1999). Cyborg@Cyberspace: an ethnographer looks to the future, New York: Routledge.
  • Kline, S. (2013). “The librarian as ethnographer: an interview with David Green”, in College & Research Libraries News, 74 n. 9, 488-491. Available from [Retrieved 23/03/2016]
  • Luhmann, N. (1995). Social systems. Stanford: Stanford University Press.
  • Marshall, C.C. (2003). “Finding the boundaries of the library without walls” in Buttenfield, B.P., Peterson Bishop, A., Van House N.A. (eds), Digital Library use: social practice in design and evaluation, Cambridge, MA: The MIT Press.
  • Nardi, B.A., O’Day, V.L., (2003). “An ecological perspective on digital libraries”, in Buttenfield, B.P., Peterson Bishop, A., Van House N.A. (eds), Digital library use: social practice in design and evaluation, Cambridge, MA: The MIT Press. Available from [Retrieved 21/04/2016]
  • Van Hooland, S., Verborgh, R. (2014). Linked data for libraries, archives and museums. London: Facet.

[1] In the title and throughout the essay, I use the term “crowd” in the sense of “community”. See Merriam-Webster entry for “crowdsourcing”: [Retrieved 26/04/2016]

[2] Linked data is a method to publish data in a structured way, so it can be linked to other data. Linked open data is a set of linked data which is openly available on the web. It is ideal for digital libraries that want to improve their access and interoperability (Bojārs, Lopes, Schneider, 2013).

[3] See BL Labs’ open datasets:


You can follow Irene on Twitter @irenetortorell4
