SOCIAM: The Theory and Practice of Social Machines

Lead Research Organisation: University of Southampton
Department Name: Electronics and Computer Science

Abstract

SOCIAM - Social Machines - will research pioneering methods of supporting purposeful human interaction on the World Wide Web, of the kind exemplified by phenomena such as Wikipedia and Galaxy Zoo. These collaborations are empowering, as communities identify and solve their own problems, harnessing their commitment, local knowledge and embedded skills, without having to rely on remote experts or governments.

Such interaction is characterised by a new kind of emergent, collective problem solving, in which we see (i) problems solved by very large scale human participation via the Web, (ii) access to, or the ability to generate, large amounts of relevant data using open data standards, (iii) confidence in the quality of the data and (iv) intuitive interfaces.

"Machines" used to be programmed by programmers and used by users. The Web, and the massive participation in it, has dissolved this boundary: we now see configurations of people interacting with content and each other, typified by social web sites. Rather than dividing between the human and machine parts of the collaboration (as computer science has traditionally done), we should draw a line around them and treat each such assembly as a machine in its own right comprising digital and human components - a Social Machine. This crucial transition in thinking acknowledges the reality of today's sociotechnical systems. This view is of an ecosystem not of humans and computers but of co-evolving Social Machines.

The ambition of SOCIAM is to enable us to build social machines that solve the routine tasks of daily life as well as the emergencies. Its aim is to develop the theory and practice so that we can create the next generation of decentralised, data intensive, social machines. Understanding the attributes of the current generation of successful social machines will help us build the next.

The research undertakes four necessary tasks. First, we need to discover how social computing can emerge given that society has to undertake much of the burden of identifying problems, designing solutions and dealing with the complexity of the problem solving. Online scaleable algorithms need to be put to the service of the users. This leads us to the second task, providing seamless access to a Web of Data including user generated data. Third, we need to understand how to make social machines accountable and to build the trust essential to their operation. Fourth, we need to design the interactions between all elements of social machines: between machine and human, between humans mediated by machines, and between machines, humans and the data they use and generate. SOCIAM's work will be empirically grounded by a Social Machines Observatory to track, monitor and classify existing social machines and new ones as they evolve, and act as an early warning facility for disruptive new social machines.

These lines of interlinked research will initially be tested and evaluated in the context of real-world applications in health, transport, policing and the drive towards open data cities (where all public data across an urban area is linked together) in collaboration with SOCIAM's partners. Putting research ideas into the field to encounter unvarnished reality provides a check as to their utility and durability. For example the Open City application will seek to harness citywide participation in shared problems (e.g. with health, transport and policing) exploiting common open data resources.

SOCIAM will undertake a breadth of integrated research, engaging with real application contexts, including the use of our observatory for longitudinal studies, to provide cutting edge theory and practice for social computation and social machines. It will support fundamental research, the creation of a multidisciplinary team, and collaboration with industry and government in realising the research; it will promote growth and innovation and, most importantly, have impact in changing the direction of ICT.

Planned Impact

The proposed programme will have beneficial impact on a wide range of stakeholders. Via technology transfer, companies will gain access to new technologies, and also gain the understanding that will allow them to develop new products for communities organising themselves in social machines. Those companies that partner us or support our research will of course have the ability to feed ideas into the research, and frame the problems we are trying to solve; we consider it essential that fundamental research feeds into, and back from, real-world applications.

Smaller-scale entrepreneurs will have new outlets for innovation, and new opportunities to develop radical business models. The public sector and third sector will have available new tools and methods for achieving policy ends. Communities using social machines will also benefit, of course, by the ability to identify and define their own problems, and develop their own solutions. These benefits, in social cohesion and cooperation, will often outlive the immediate issue which drove the development of the social machine.

We should not forget the benefits to the wider academic community of the proposed research. Of course, the development of a community of multi-disciplinary researchers in social machines will benefit the computer science field, but via the observatory and the strong social relevance of the research, we would expect a wide academic community in science and social science to benefit from the deepening of expertise in this area, and from the large quantity of data. The 5-year programme would allow a strong multi-disciplinary cohort of researchers to emerge, able to influence a range of fields, spreading expertise in these relatively novel methods of social collaboration. Dissemination will also take place via our programme of Town Meetings, sandpits, hackathons, disruptive skills workshops, etc. Groups associated with the consortium, such as the Web Science Trust, will be able to ensure that SOCIAM's work is widely disseminated, and one of our Partners is the world's largest Technical PR Agency.

The impacts will be both economic and non-economic. The economic impacts will be the benefits that come from innovation and cooperation, and from bottom-up solutions to problems. These will include both lowering costs of social problems (e.g. via community policing lowering the costs of crime), and creating opportunities for innovation and commercial exploitation of innovation (as for example with the development of new services based on creative uses of available data). Some of these benefits will fall to entrepreneurs, while others will spill over into the wider community.

Furthermore, the research will enable value to be extracted from the ever-growing quantities of data we see. The social return on investment in data acquisition, particularly public open data, will be dramatically improved as more tools and methods are created for using the data to drive services.

There will be several non-economic impacts too. In policy terms, the impacts will be high, particularly as local solutions for problems - inherently more efficient than centralised problem-solving which cannot always take account of local conditions - will emerge from collaboration in social machines in small communities. Communities will become empowered and self-reliant. The result will be a suite of tools and methods which can be put to work in social contexts by a range of actors - government, to achieve policy goals, groups of people, to achieve social goals, or entrepreneurs, to achieve commercial goals. Indeed, one would expect a social machine to encompass all of these at different times.

Publications


Akhlaq A (2017) Defining Health Information Exchange: Scoping Review of Published Definitions. in Journal of innovation in health informatics
Berners-Lee T (2013) The read-write Linked Data Web. in Philosophical transactions. Series A, Mathematical, physical, and engineering sciences
Brown I (2015) DNA
Buneman P (2016) Why data citation is a computational problem in Communications of the ACM
Buneman P (2015) Databases and Programming

Related Projects

Project Reference  Relationship  Related To    Start       End         Award Value
EP/J017728/1                                   01/06/2012  31/07/2015  £6,219,059
EP/J017728/2       Transfer      EP/J017728/1  01/08/2015  31/05/2018  £2,667,741
 
Description Significant achievements to date include:

1. We have defined a formal semantics of our operational specification language, LSC (created in the previous year), and built a theorem prover (using HOL-Lite, a generic theorem proving system) in which to prove properties of LSC specifications. This gives precision on the semantics of our language and provides new opportunities to prove (e.g.) security properties of LSC specifications, which is likely to be useful as we move into healthcare applications. Also in security (via a PhD studentship additional to SOCIAM) we have developed an extension of the LSC interpreter that permits security levels to be associated with data transmitted during a social interaction and for "information leakage" between security levels to be detected.
With colleagues in Vienna University of Technology, we have produced a model and techniques to allow coordination of human workers in web-scale collaborative human computation systems. The techniques augment the existing Social Compute Unit (SCU) concept - a general framework for management of ad-hoc human worker teams - with versatile coordination protocols expressed using LSC. This approach allows us to combine coordination and quality constraints with dynamic assessments of user desires, while choosing workflow patterns appropriate to the current situation. This extends earlier work on the use of social machines to construct social machines.
Our system for choreographing social computations using Twitter has been further developed. This has been connected to the Southampton personal data store, allowing joint engineering across process and linked data scenarios. It has also led to the concept of a "shadow institution", where actions on existing social networks are mapped onto formal interaction protocols, allowing participants access to computational intelligence in a seamless, zero-cost manner to carry out computation and store information. We have merged this with our work on provenance representation (continuing our focus on PROV as a representation language), integrating it with our LSC-based Twitter system. This allows us to derive a restricted range of annotations automatically during LSC-supported interactions and it also allows us to propagate annotations with data introduced through interactions (although it does not by itself solve deeper issues of appropriate provenance merging).
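
The security-level extension described in item 1 can be illustrated with a minimal sketch. This is not the LSC interpreter or its syntax: the level names, the `Message` class, the `clearance` table and the `find_leaks` function are all hypothetical, chosen only to show what detecting "information leakage" between security levels might look like when each transmitted datum carries a level.

```python
from dataclasses import dataclass

# Hypothetical ordering of security levels; a higher number is more sensitive.
LEVELS = {"public": 0, "restricted": 1, "confidential": 2}


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    data_level: str  # security level attached to the transmitted data


def find_leaks(messages, clearance):
    """Return the messages whose data level exceeds the receiver's clearance."""
    leaks = []
    for m in messages:
        # Assumption: participants with no declared clearance are "public".
        receiver_level = clearance.get(m.receiver, "public")
        if LEVELS[m.data_level] > LEVELS[receiver_level]:
            leaks.append(m)
    return leaks


# Illustrative interaction: three messages exchanged between three roles.
clearance = {"gp": "confidential", "patient": "confidential", "forum": "public"}
interaction = [
    Message("patient", "gp", "confidential"),
    Message("gp", "forum", "public"),
    Message("gp", "forum", "confidential"),
]

leaks = find_leaks(interaction, clearance)
for m in leaks:
    print(f"leak: {m.sender} -> {m.receiver} ({m.data_level})")
```

Only the third message is reported, because confidential data reaches a participant cleared for public data only. A real checker would also need to track data derived from sensitive inputs, which this sketch does not attempt.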

2. Over the past year, our work on decentralised architectures and personal data stores for the Web has focused on maintaining end-user control over personal data, identity, and privacy. Working with the W3C, we achieved interoperability between our INDX platform and their decentralised Linked Data Platform via a fully-functional heterogeneous microblogging application.

A second major focus of this work is end-user management of personal health data, sourced from both clinical (e.g. EHR) and non-clinical (e.g. wearable sensor) sources. We have built a mobile INDX application for atrial fibrillation patients to demonstrate consolidated patient-centred analytics and interventions. Additionally, we are investigating enforcing user-specified Data Terms of Use constraints for expressing and detecting data re-sharing and reuse violations in decentralised settings. This framework will be demonstrated in a public health decentralised digital epidemiology application centred on the INDX personal data store in 2015.
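
As a rough illustration of detecting re-sharing and reuse violations against user-specified Data Terms of Use, the sketch below checks a log of sharing events against per-item terms. It does not reflect the INDX API or the project's actual constraint language; `TermsOfUse`, `ShareEvent`, `violations` and all field names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TermsOfUse:
    allowed_purposes: set      # purposes the data owner permits
    may_reshare: bool = False  # whether recipients may pass the data on


@dataclass
class ShareEvent:
    item: str       # identifier of the data item
    holder: str     # party performing the share
    recipient: str
    purpose: str


def violations(events, terms, owner_of):
    """Return (event, reason) pairs for events that break the item's terms."""
    found = []
    for e in events:
        t = terms[e.item]
        if e.purpose not in t.allowed_purposes:
            found.append((e, "purpose not permitted"))
        elif e.holder != owner_of[e.item] and not t.may_reshare:
            found.append((e, "re-sharing not permitted"))
    return found


# Illustrative data: one item owned by "alice", usable for clinical care only.
terms = {"heart_rate": TermsOfUse({"clinical_care"}, may_reshare=False)}
owner_of = {"heart_rate": "alice"}
events = [
    ShareEvent("heart_rate", "alice", "clinic", "clinical_care"),   # allowed
    ShareEvent("heart_rate", "clinic", "insurer", "clinical_care"), # re-share
    ShareEvent("heart_rate", "alice", "advertiser", "marketing"),   # purpose
]

found = violations(events, terms, owner_of)
for event, reason in found:
    print(f"{event.holder} -> {event.recipient}: {reason}")
```

In a decentralised setting the event log would itself be distributed and possibly incomplete, so this centralised check should be read as a statement of the rule being enforced, not of how enforcement is achieved.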

3. The Web Observatory seeks to develop the necessary technical and social infrastructure in order to observe, analyse, and predict behaviour of social machines on the Web. In SOCIAM we have built the necessary computing infrastructure required for real-time and big data analytics, and are actively pursuing new analytical methods for monitoring the state of a social machine. Central to the Web Observatory is the analysis of both historic and real-time sources of data streams in order to understand the characteristics and functionality of a social machine. We are also developing applications and visualisations to monitor and interact with a social machine's signal, and working on techniques to cross-examine activity.

Working with a number of partner labs, we are leading a network of Web Observatories in order to provide a distributed infrastructure to store, describe, query, and analyse heterogeneous sources of data. As part of this we are actively developing the necessary layers of data description and access control protocols in order to provide data owners control over their data.
Exploitation Route Our work on Twitter could be translated (with further development effort) into a tool for supporting process-based interaction with the Twitter platform. Our HOL-Lite system could be of relevance to those with an interest in proving properties of interaction protocols (e.g. in our healthcare strand of activity). Our collaborative design system architecture is of relevance to software engineering using distributed teams of engineers.

Our work on INDX has been released as an open source software project on GitHub, and as the platform matures, we will work with Web development communities to foster its adoption and use. Additionally, over the next 2 years we expect to trial the health data management platform with NHS patients, as well as work with the private sector, in particular Personal Data Store companies such as CTRLio, and PInCH Medical Systems, to facilitate the creation of similar capabilities in their offerings.

The Web Observatory provides two major strands of contributions, infrastructure and analytics. The platform is helping shape the way research on the Web is conducted, with the goal of providing an environment where data and tools can be shared. Our efforts in big data will provide the further analytical methods and tools for those researching social machines and more generally, the Web.
Sectors Communities and Social Services/Policy,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Energy,Financial Services, and Management Consultancy,Healthcare,Leisure Activities, including Sports, Recreation and Tourism,Government, Democracy and Justice,Pharmaceuticals and Medical Biotechnology,Retail,Transport
URL http://sociam.org/
 
Description The 2 most significant interactions are:

1. We have engaged with Hampshire County Council (and through them, Hampshire Constabulary), the Hampshire Police and Crime Commissioner, and the College of Policing, collaborating on use of the Web Observatory to monitor social media activity and to specify use cases for a nuisance reporting app. Potential domains for exploitation include understanding victim experience, witness experience, anti-social behaviour and intelligence gathering. The collaboration with Hampshire County Council has produced a paper, presented at a policing conference, about the role that social machines technology could play in policing and crime management. Two further papers have been published on measuring the reliability of open crime data, which advanced the state of the art in a highly contested area.

2. Online citizen science is a prominent example of Social Machines. It is a blueprint of a hybrid computing approach, coupling state-of-the-art artificial intelligence with human-based computation, to tackle problems that cannot be solved in a purely computational fashion. The Zooniverse, the world's largest multi-project citizen science platform, with over one million volunteers contributing to projects from domains as varied as astrophysics and digital humanities, is developed and maintained at the University of Oxford. SOCIAM researchers from Southampton and Oxford are examining in depth the socio-technical processes behind Zooniverse projects and other citizen science platforms such as EyeWire. This is an important cornerstone for the development of observational methods and tools that allow the contributions of individual participants in collective online activities such as citizen science to be measured. A fundamental change is underway in the research landscape. Principles from citizen science will be heavily involved in new digital and democratic approaches to pursuing science. These will accelerate the creative output of mankind but also change how scientific impact is measured and how funding is managed. SOCIAM acts at the forefront of these developments.
First Year Of Impact 2013
Sector Communities and Social Services/Policy,Digital/Communication/Information Technologies (including Software),Government, Democracy and Justice,Culture, Heritage, Museums and Collections
Impact Types Cultural,Societal,Policy & public services
 
Title ProvToolbox 
Description Provenance is a record that describes the people, institutions, entities, and activities involved in producing, influencing, or delivering a piece of data or a thing. In particular, the provenance of information is crucial in deciding whether information is to be trusted, how it should be integrated with other diverse information sources, and how to give credit to its originators when reusing it. In an open and inclusive environment such as the Web, where users find information that is often contradictory or questionable, provenance can help those users to make trust judgements. PROV is a set of W3C specifications defining a model, corresponding serializations and other supporting definitions to enable the inter-operable interchange of provenance information in heterogeneous environments such as the Web. ProvToolbox is a Java library to create Java representations of the PROV data model (PROV-DM), and convert them between RDF, XML (in PROV-XML format), text (in PROV-N format), and JSON (in PROV-JSON format). 
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact ProvToolbox is the basis of community services for provenance translation and validation at https://provenance.ecs.soton.ac.uk. ProvToolbox was used in the interoperability phase of the W3C Provenance Working Group (https://www.w3.org/TR/prov-implementations/). 2016 contribution: templating system.
URL http://lucmoreau.github.io/ProvToolbox/
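
To make the idea of converting between PROV serializations concrete, here is a minimal sketch in Python. It does not use ProvToolbox (which is a Java library) and does not reproduce its API. It builds a small document in the PROV-JSON layout and renders a simplified subset of it as PROV-N text; the `to_provn` function, the `ex:` identifiers and the `http://example.org/` namespace are illustrative assumptions.

```python
import json

# A tiny provenance record in the PROV-JSON layout: one entity, one activity,
# one agent, and two relations linking them. All identifiers are made up.
doc = {
    "prefix": {"ex": "http://example.org/"},
    "entity": {"ex:report": {}},
    "activity": {"ex:compile": {}},
    "agent": {"ex:alice": {}},
    "wasGeneratedBy": {
        "_:g1": {"prov:entity": "ex:report", "prov:activity": "ex:compile"}
    },
    "wasAttributedTo": {
        "_:t1": {"prov:entity": "ex:report", "prov:agent": "ex:alice"}
    },
}


def to_provn(d):
    """Render a simplified subset of PROV-JSON as PROV-N text.

    Attributes, timestamps and all other relation types are ignored.
    """
    lines = ["document"]
    for prefix, uri in d.get("prefix", {}).items():
        lines.append(f"  prefix {prefix} <{uri}>")
    for kind in ("entity", "activity", "agent"):
        for ident in d.get(kind, {}):
            lines.append(f"  {kind}({ident})")
    for rel in d.get("wasGeneratedBy", {}).values():
        # "-" marks the optional generation time as absent.
        lines.append(
            f"  wasGeneratedBy({rel['prov:entity']}, {rel['prov:activity']}, -)"
        )
    for rel in d.get("wasAttributedTo", {}).values():
        lines.append(
            f"  wasAttributedTo({rel['prov:entity']}, {rel['prov:agent']})"
        )
    lines.append("endDocument")
    return "\n".join(lines)


provn = to_provn(doc)
print(json.dumps(doc, indent=2))  # the JSON form
print(provn)                      # the PROV-N form
```

A full converter has to handle every PROV-DM term, attributes and bundles, which is the ground a library such as ProvToolbox covers; the sketch only shows that the serializations carry the same underlying statements.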
 
Description Enabling Provenance on the Web: Standardization and Research Questions (Keynote at International Conference on WWW/INTERNET 2015) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Enabling Provenance on the Web: Standardization and Research Questions

Provenance is a record that describes the people, institutions,
entities, and activities, involved in producing, influencing, or
delivering a piece of data or a thing in the world.

Some 10 years after beginning research on the topic of provenance, I
co-chaired the provenance working group at the World Wide Web
Consortium. The working group published the PROV standard for
provenance in 2013.

In this talk, I will present some use cases for provenance, the PROV
standard and some flagship examples of adoption. I will then move
onto our current research area in exploiting provenance, in the
context of the SOCIAM, SmartSociety and ORCHID projects. Doing so, I will
present techniques to deal with large scale provenance, to build
predictive models based on provenance, and to analyse provenance.
Year(s) Of Engagement Activity 2015
 
Description IPAW 2006-2016: Retrospect and Prospect of Provenance (Keynote at International Provenance and Annotation Workshop ipaw'16) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact IPAW 2006-2016: Retrospect and Prospect of Provenance

IPAW, the biennial International Provenance and Annotation Workshop
series, was launched in 2006. We celebrate its 10th anniversary in
2016. During those 10 years, the field of provenance has seen a
tremendous amount of development. Among the 30 events I identified, I
will highlight some successes, such as the Provenance Challenge and a
standardisation activity at the World Wide Web Consortium. What is
the next step for the provenance community? By reviewing existing
applications of provenance and tooling, and by discussing some
research activities, I will attempt to map future directions for the
provenance community.
Year(s) Of Engagement Activity 2016
URL http://www2.mitre.org/public/provenance2016/ipaw.html
 
Description Presentation at JP Morgan TechFest, Bournemouth, Enabling Provenance on the Web: Standardization and Research Questions 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Enabling Provenance on the Web: Standardization and Research Questions

Year(s) Of Engagement Activity 2015