What Does It Mean to Measure a Smile? Assigning numerical values to emotions

This article looks at the implications of emotion recognition, zooming in on the specific case of the care robot Pepper introduced at a hospital in Toronto. Here, emotion recognition comes with the promise of equipping robots with a less tangible, more emotive set of skills – from companionship to encouragement. Through close analysis of a variety of materials related to emotion detection software – iMotions – we look into two aspects of the technology. First, we investigate the how of emotion detection: what does it mean to detect emotions in practice? Second, we reflect on the question of whose emotions are measured, and what the use of care robots can say about the norms and values shaping care practices today. We argue that care robots and emotion detection can be understood as part of a fragmentation of care work: a process in which care is increasingly being understood as a series of discrete tasks rather than as holistic practice. Finally, we draw attention to the multitude of actors whose needs are addressed by Pepper, even while it is being imagined as a care provider for patients.


Introduction
In a children's care ward in a Toronto hospital, Pepper – a white, plastic, child-sized robot with blue-ringed eyes and a tiny smiling mouth – has been introduced as part of the staff. A news story covering this development shows montages of Pepper and children dancing together in the hospital lobby, people taking smiling selfies with Pepper in a hallway, Pepper letting children navigate the iPad-like interface attached to its chest, Pepper making a sick child in his hospital bed laugh out loud. All the while, the narrator, together with hospital administrators and parents, is talking about how useful Pepper is for calming children down, alleviating stress and anxiety, and simply making a stay in hospital more fun.
Pepper is presented as a robot that can speak with emotion and be a friendly conversation partner, but Pepper's smile is an unmoving, sculpted part of its robot face. Yes, Pepper smiles, but continuously and unchangingly, like Barbie, Buddha, or the Mona Lisa. However, Pepper is equipped with facial recognition software and can use the sensors in its 'eyes' to detect human emotional expression (from […] robots are effective – both in accurately reading the humans to whom they are assigned for care and in responding appropriately to these humans such that they improve their quality of life in some way. But what does it do to care interactions when they become programmable and accountable in this way?
Note: See for example: BBC News, "Pepper robot to work in Belgian hospitals", 14 June 2016. https://www.bbc.com/news/technology-36528253, accessed 3 February 2021. Do (2018), "Meet Pepper: An AI robot that will reduce wait times in hospitals", 31 October 2018. https://news.engineering.utoronto.ca/meet-pepper-an-ai-robot-thatwill-reduce-wait-times-in-hospitals/, accessed 3 February 2021. Bayern (2020), "How robots are revolutionizing healthcare". ZDNet, 1 July 2020. https://www.zdnet.com/article/how-robots-are-revolutionizing-healthcare/, accessed 3 February 2021.
Note: Here we understand "accountable" to mean both economic valuing of labour by institutions and commercial organizations, and making the development of care robots morally and ethically accountable.
This question is at the heart of the broader research project surrounding this article, where we explore different ways in which concepts such as empathy, affect, touch, and care are incorporated and shaped in the development of social robots for care work. This collective research approaches how different disciplinary understandings of what makes good social interaction between human and robot are brought into dialogue in the context of care robot development. In this article, we want to bring this kind of approach to a study of Pepper as caregiver. We want to investigate the practices involved in enabling Pepper as an emotional caregiver, and ask what happens to emotions, as well as care, when they are programmed to be delegated to Pepper. Additionally, and to address concerns about norms and values engaged by care robots, we will be thinking through how Pepper and care robots in general highlight context and emphasize the specificity of contextualized technology. Technology does different things in different places. Pepper is no different. Pepper is a robot which can be used to interact with humans in many different contexts and for many different reasons. As Pepper moves into a new context, for example, into a children's ward, this move exposes concerns that Pepper's presence is addressing in that particular place and time, with those particular people or groups. And it triggers a question about what roles social institutions, like hospitals, play in Pepper's placement.
To grasp the many agendas, discourses, hopes, and practices entangled in the introduction of Pepper as a caring companion into a hospital environment, we take inspiration from Donna Haraway's notion of the 'imploded knot'. A different kind of cyborg from the one that Haraway uses so provocatively in her manifesto, Pepper nonetheless represents the 'implosion of the technical, organic, political, economic, oneiric, and textual that is evident in the material-semiotic practices and entities in (…) technoscience' (Haraway 1997: 12; see also Dumit 2018). Haraway's attention to the multiple strands that knot together in technoscientific practices and entities is useful here in lifting the very sets of thinking and scholarship that inform our reading of Pepper. As such, we find it helpful to offer the reader a brief overview of some of the key ideas from scholarly literature.

Getting into conversation
In this section, we will introduce some central scholarly discussions that inform and inspire our discussions around Pepper as caregiver. The section is organized into two subsections. The first focuses on different literatures associated with emotions and AI and connects this to the concept of norms. This is followed by a second subsection introducing the critical scholarship around care that has informed our analysis of Pepper in this article.
Note: The authors are part of an interdisciplinary research project that brings together […]

Emotions and AI
The notion of social interaction with robots and AI has long been a theme in science fiction. The present moment, however, is marked by a turn to the real with digital assistants such as Siri or Alexa touted as bringing invaluable companionship to older adults or those who are ill. New understandings of relationships in which robotic nonhumans become part of private, intimate life are as urgently required as the ethical and legal frameworks demanded in order to keep them accountable. As such, this article is part of increasing attention to the promises and challenges of 'emotional AI'. Scholarship around emotional AI can be dated back to Rosalind Picard's work in the mid-1990s on affective computing:
I have come to the conclusion that if we want computers to be genuinely intelligent, to adapt to us, and to interact naturally with us, then they will need the ability to recognize and express emotions, to have emotions, and to have what has come to be called 'emotional intelligence'. (Picard 1997: x)
Picard's work – together with that of Cynthia Breazeal – made emotional interaction with a robot or AI impossible to ignore. Picard herself acknowledged in the introduction to her seminal 1997 volume Affective Computing that the idea of computers having emotions might sound 'outlandish'. However, as she explains, this response was grounded in the prevailing notion that rationality and emotion are two completely distinct and independent mechanisms. Instead, she makes the argument that emotions 'influence the very mechanisms of rational thinking' (1997: back cover) and proceeds from there to argue that: 'Computers do not need affective abilities for the fanciful goal of becoming humanoids; they need them for the meeker and more practical goal: to function with intelligence and sensitivity toward humans' (Picard 1997: 247).
Note: Apple, 'Siri'. https://www.apple.com/siri/, accessed 20 September 2021.
In a chapter titled 'Recognizing and Expressing Affect', Picard details the various models available for recognizing and expressing affect. Many of the models she describes have developed significantly since this book was published. For example, she details real-time processing as a major stumbling block with facial recognition – something that contemporary facial recognition software claims to have resolved. However, Picard's work laid the foundation for two key premises: (i) emotions as integral to intelligence, and (ii) emotions as tangible, measurable, and accurately reproducible. The possibility for a machine to read accurately the emotional expression of a human (thus allowing the next 'step' in terms of programming an appropriate response) is a key part of creating the conditions for the 'natural' feeling interaction of which social roboticists dream.
One of the models that Picard details is Paul Ekman's 'Facial Action Coding System', which is also the basis for one of the best-known facial-recognition software packages, Affectiva. Ekman's work codified facial expressions for a series of emotions (see Ekman 1976; Ekman and Rosenberg 2005), identifying expressions and muscle movements which are claimed to be relevant across many different cultures and contexts. This approach leans heavily on a Darwinian understanding of emotional expression as part of biological evolution, more basic and universal than local cultural expressions. Indeed, Ekman and Friesen (1978) termed these facial expressions 'basic emotions' and posited that they would be relevant for all humans. This system has been widely used by psychology researchers, computer and AI developers and, not least, animators, to read and/or reproduce emotional expression in faces.
The idea of 'basic emotions' has also been the subject of much discussion by social sciences-oriented scholars interested in the turn towards digital affect. This literature focuses on exploring issues such as the 'range of phenomena encompassed by terms such as affect, emotion, feeling and mood' (Stark 2019: 118, emphasis in original; see also Papoulias and Callard 2010), and the consequences of adopting particular models, for example Rhee's (2018) reminder that '(e)motional labor, which emerges from uneven power relations, insists on the expression of normative emotions, in many instances as evidence of humanness or citizenship' (2018: 101). Overall, this body of scholarship leans more generally towards 'the messier idea that emotions might not be fixed objects, but culturally constructed experiences and expressions defined through historical and situational circumstances' (McStay 2018: 4). We will return to these disciplinary differences in understanding emotions throughout the text.
Note: Notably, Picard is one of the founders of Affectiva. https://www.affectiva.com/who/
In this article, we are interested in exploring the role that the measuring of emotions plays in care work done by social robots. This is a significant part of our article because the how – what practices and what science are part of the measuring of emotions – is where the work happens. We are looking at the nitty-gritty of valuation practices (inspired by science and technology studies (STS) work on users, manuals, instructions and classification [Akrich 1992; Goodwin 1994; Bowker and Star 1999]). We are doing this against a background of how the technology is used, which provokes questions about why this work is being done, and also points to larger questions about the interaction of this work with other structures and norms (inspired by work practice researchers like Cockburn 1983, Orr 1996, and Suchman et al. 1999).
When we are talking about the relationship between norms and technology in the context of emotion recognition, we are beginning from an understanding of norms that takes inspiration from early sociological work (Parsons 1951; Joas and Knöbl 2009 [2004]) and examines how social norms shape what is possible and acceptable, particularly in institutional settings like care homes and hospitals. This understanding of norms is also relevant for seeing how they shape practices in medicine and science (Merton 1942, 1973; Bucchi 2015). However, our work with norms is also highly influenced by the way they have been examined in STS, both as reproduced and materialized in technology (Winner 1980); visible in the discourses and tropes used to describe technology (Haraway 1997; Johnson 2019); internalized in our responses to technology (Rose 2007); and conscientiously challenged through design (Disalvo 2012; Ehrnberger 2017; Escobar 2017).
This means that we consider not only that smiles are valued, measured, and counted, but also what normative work they are doing and what power dynamics are at play: which smiles, whose smiles, and where. From this we will try to read which norms can be articulated in the use of emotion (or at least smile) detection technology to assign values in human-robot interaction.

(Robotic) care
Since Pepper is supposed to be providing care, critical reflections that have been carried out in STS on care as a theoretical category are relevant in framing some of the observations we make. In what follows we draw particular attention to some themes within this literature, namely valuation of care, emotional labour, power and care fragmentation.
As a sociological term, care has a long history, often related to the ethics of care in professions like nursing. It often draws upon and teases out a universalizing (and naturalized) understanding of care as something 'good' which can nonetheless be dissected into parts, categorized, and then taught to those who are supposed to deliver care (Duffy 2011; Allen 2013). In more critical discussions of the term, Joan Tronto (1993) is often referenced, with her early critique of tendencies to imagine care as (universally) feminine. She defines care as: 'Everything that we do to maintain, continue and repair "our world" so that we can live in it as well as possible. That world includes our bodies, ourselves, and our environment, all that we seek to interweave in a complex, life sustaining web' (Tronto 1993: 103).
Importantly, Tronto's political argument is that care is often carried out by underprivileged groups, which serve to maintain systems of privilege for others. This line of theorizing has frequently seen care equated with devalued labour, often similar to what is called invisible labour, and which can overlap with the 'dull, dirty and dangerous' work that it is often imagined will be assigned to robots in the (near) future (Suchman 2007; Rhee 2018; DeFalco 2020). This also resonates with the emotionally subservient role assigned to Pepper as it is engaged in making sick children happy.
Another line of research into care deals with the way emotions are enacted as care practice. Here, attention is paid to emotional labour and the management of feeling, as developed initially in studies of workplace expectations, service industries, and the 'commercialization of feeling' (Hochschild 1983). It later appeared in classics like James's definition: 'Care = organization + physical labour + emotional labour' (James 1992: 2). Such studies quickly became a staple of sociological research on nursing practices (see Allen 2013), often highlighting structural issues of care provision, labour relations, and the expected 'doing' of care, physically and emotionally.
The power dynamics and politics of care have been brought into an STS discussion around the word which has blossomed in the last decade. Here, too, one is reminded that care is multifaceted and not necessarily benign or positive. Not all care is good. This conversation started with Puig de la Bellacasa's attempt to encourage an ethos of care in STS research as a response to, and extension of, Latour's suggestion that the field engage matters of concern (Puig de la Bellacasa 2011). While Latour was suggesting matters of concern vs matters of fact as a way of addressing a dreaded turn away from 'truth' or belief in science that a constructivist approach was by some thought to produce (and written before our current mayhem of alternative facts aftermaths) (Latour 2004), Puig de la Bellacasa (2011) was suggesting that a discussion of care would carry with it a critical edge, one attuned to exclusions and power dynamics in stratified worlds (2011: 86). As she points out, 'care' is a stronger word than 'concern' and can also be easily turned into a verb, 'to care'. This is important because '[u]nderstanding caring as something we do extends a vision of care as an ethically and politically charged practice, one that has been at the forefront of feminist concern with devalued labours' (Puig de la Bellacasa 2011: 90).
Taking this further, Martin et al. (2015) addressed the way that care 'is both necessary to the fabric of biological and social existence and notorious for the problems that it raises when it is defined, legislated, measured and evaluated' (2015: 625). They, too, point out that care is not always positive, it can have a darker side, lack innocence, and induce violence. It is selective – it can cherish some things and exclude others. And the power of care includes the power to define what counts as care and how it should be administered. Likewise, it can:
render a receiver powerless or otherwise limit their power. It can set up conditions of indebtedness or obligation. It can also sediment these asymmetries by putting recipients in situations where they cannot reciprocate. Care organizes, classifies, and disciplines bodies. Colonial regimes show us precisely how care can become a means of governance. It is in this sense that care makes palpable how justice for some can easily become injustice for others. (Martin et al. 2015: 627)
These aspects, too, are very relevant to our discussion of robots. One can ask what bodies are being organized, and how, by care robots like Pepper. That question easily reshapes into a question of which organizations (hospitals? nursing homes?) are tasked with caring for which bodies, and therewith what power dynamics the use of Pepper is reproducing.
We suggest that keeping these critical stances to the concept of care in mind can remind us that an analysis like ours has political implications for many people, care givers and care recipients alike, not just for the development of robots or their integration into care provision budgets. Furthermore, they make clear the necessity of critically examining newly emerging modes of caregiving (such as Pepper), with particular attention to how digitization may reconfigure understandings and practices of care. In the case of Pepper, for example, this includes tracing a line from earlier analyses of how care was dissected and categorized in order to be taught to humans, to the dissection of care work necessary for it to be programmed into a robot. One of the most tangible implications of this critical stance can be a recognition of the politics of care fragmentation.
'Care fragmentation' (Vallès-Peris and Domènech 2020) refers to a process in which the various elements of care work go from being understood as a holistic practice to being understood as a series of tasks that can be thought of, and carried out, separately. Vallès-Peris and Domènech (2020) use the term care fragmentation in an interview study investigating roboticists' imaginaries of care robots. Here, care fragmentation refers to the process in which care becomes 'conceptualized as a set of tasks that can be separated in[to] pieces made of different tasks, with some of these pieces being able to be delegated to the robot and others not' (2020: 165). More specifically, they describe how the physically demanding and strenuous tasks of care work become separated from the 'affective tasks' (2020: 165) (specified in the article for example as conversations, quality time, and creative interaction).
Similar to Vallès-Peris and Domènech, we will argue that articulation of Pepper as a caregiver depends on practices of care fragmentation. However, as we will show, care fragmentation in our case differs from that in Vallès-Peris and Domènech's study. It does so since Pepper comes with the promise precisely of managing the kind of 'affective tasks' that roboticists placed outside of care robots' range in their study. What we will show is that emotion recognition took part in another kind of care fragmentation, one which aimed at making possible affective care interactions between Pepper and patient.

Brochures, demos, and implosions
The authors of this text were introduced to Pepper thanks to collaboration with the Machine Perception and Interaction Lab at Örebro University. In this lab, Pepper has been tested as a care robot for older adults, coaching residents in a care home for older adults in exercising (Akalin et al. 2019). Pepper has also been a part of experiments exploring topics such as a sense of safety and security in human-robot interaction (Akalin et al. 2017). At one point during our collaboration, a robotics professor showed us the facial recognition software that they run on Pepper: Affectiva. The professor demonstrated how the software assigned emotional interpretation of her various facial movements while she went through a roster of smiling, frowning, looking confused.
Demonstration of the Affectiva software made us curious about the role that it played in our colleagues' research and that of others working with Pepper. The encounter therefore inspired us to go further with investigating the use of emotion detection in Pepper. Having made this decision, we started looking into media coverage depicting Pepper in use, as a way to gain insight into how this technology is demonstrated to the public. Reading news articles, such as the story that opens this article (showing how Pepper, equipped with emotion detecting software, is used in hospital environments) was one way of learning more about how this technology is introduced.
However, we also wanted to understand the workings of the software and learn how it measures emotions. To do so, we turned to Affectiva's web page. Affectiva software is offered for use in several different settings. However, when used for the purpose of emotion recognition, the web page directs you to their partner, iMotions. iMotions is one of several companies and many research groups using Affectiva. iMotions uses the Affectiva technology – facial movement tracking enabled by a vast database of faces, deep learning, and machine learning – as part of software that helps researchers with all stages of their emotion-tracking studies. iMotions software includes, besides the Affectiva technology, survey tools, data visualization, and a library function for previous studies. We have engaged with iMotions in two different ways. Initially, we analysed different materials made available through the iMotions website which describe the functioning of the software. One important such resource, which will reappear throughout the text, is an introduction brochure titled 'Facial Expression Analysis: The Complete Pocket Guide' (2017).
The guide is twenty-seven pages long, freely available as a pdf to download from the iMotions website, and divided into three main sections. The first section – entitled 'The Basics … and Beyond' – outlines the theory behind Facial Expression Analysis. It references Darwin, evolution, and Ekman's work on cross-cultural emotion recognition (Ekman and Friesen 1978). Put simply, the theoretical basis relates to the perceived causal relationship between facial movements and emotions exemplified in the paragraph above – the view of emotions as 'readable' (to use iMotions' own terminology) through facial movements. The second section – 'Getting Started with Facial Expression Analysis' – goes into the practicalities of using the iMotions software. It explains which types of technology are needed to use it (most importantly, a working web camera) and gives general instructions about how to set up the software. Finally, 'Facial Expression Analysis … Reloaded' deals with other measurements of emotions that can be used to complement facial expression analysis. For example, it can be complemented with a device that can be attached to one's index and middle fingers which measures the valence of one's emotional response (how strongly one experiences the emotion) using sweat measurements. The brochure appears to be aimed at beginner- or intermediate-level users of the software due to user-friendly terms (the fact that we, three social scientists, could understand it is a testament to that). The text is complemented throughout with pictures of demonstration faces and results charts.
Note: Called the Affdex database, which contains the world's largest data set of human […]
The second way in which we engaged with iMotions was through a meeting where the software was introduced to us. During our exploration of the website and the brochure, a chat prompted us to make direct contact with the iMotions team for a one-hour online demonstration. This seemed like a good opportunity to ask some direct questions about the software, so we decided to participate in the demonstration as potential buyers/users of the product. In this meeting, we made clear that we were researching emotion detection and wanted to learn more about the practicalities of using the iMotions software. We were walked through the interface of the software: what one sees when one is using it. During the demonstration, a company representative illustrated how to carry out a facial movement recording by using her own web camera, 'reading' her own emotions, to use iMotions' terminology.
All of the materials detailed above – news stories such as the one about Pepper in the children's ward, the material gained from Affectiva's and iMotions' websites, the iMotions Facial Expression Analysis brochure, and the field notes that we took during the demo with the iMotions team – make up the empirical material for this article. Each of the materials has been integrated differently into our analysis. The media coverage fed our initial curiosity and helped us to understand why it is so important to pay attention to this measurement software; witnessing the embodied connection between measurement algorithms, bodies, and affects as it plays out in the children's ward of a hospital. The specific story about Pepper being introduced in the Toronto hospital also inspired the case that we return to throughout the article. The brochure helped us to understand how iMotions situate and formulate both their software and emotion detection in general. Taking part in the demonstration that iMotions gave us helped both in understanding how the software works (seeing the yellow dots on the face) as well as triggering wider curiosity about the broader scientific context in which the technology is situated.
In the article, we will 'implode' (Haraway 1997; see also Dumit 2018) the various empirical materials that we introduce above. This means that we aim to disentangle the different discourses, media, and intellectual heritages that are knotted together in the materials, all contributing to the articulation of emotion detection. We will implode the intellectual heritage of the system in the form of Paul Ekman and Wallace V. Friesen's theory of 'basic emotions' (1978), the technical details of how to make the system work, the set of bodily norms required for recognition, and the commercial rhetoric designed to increase the appeal of the package offered by iMotions. In doing so, we take inspiration from STS analyses that pay close attention to the 'interdependence of technical networks and standards, on the one hand, and the real work of politics and knowledge production on the other' (Bowker and Star 1999: 34). Approaching the material in this way highlights the 'non-innocence' (Haraway 1991 [1988]: 157) of emotion recognition software by examining its heritage and connecting it to its contemporary consequences.

iMotions and a smile
As we laugh or cry we're putting our emotions on display, allowing others to glimpse into our minds as they 'read' our face based on changes in key face features such as eyes, brows, lids, nostrils, and lips. Computer-based facial expression analysis mimics our human coding skills quite impressively as it captures raw, unfiltered emotional responses towards any type of emotionally engaging content. (iMotions 2017: 2)
The quotation above is from iMotions' promotional material and offers an intriguing introduction to the phenomenon of emotion detection. Software such as that offered by iMotions is one example from a range of programs currently available to do the work of 'reading' emotional responses. The idea that facial expressions can be reliably correlated to emotional responses, and that such expressions can be accurately read and analysed by software (in real time) is a premise upon which social robotics is built (Picard 1997; Ekman and Rosenberg 2005). It is a crucial part of the claim that robots like Pepper can act with and understand emotions. The ability to measure emotions accurately is important both to those for whom the robots are tasked with caring (ensuring or promising that the robot will be able to respond to their needs and feelings), and to the programmers of such robots who are trying to develop the performance of emotional responses by robots (to facilitate a smoother bond between human and robot).
The iMotions brochure uses a variety of strategies to connect facial expressions and emotions: scientific references, a particular kind of rhetorical register, different theories, and illustrations in the brochure of scientific-looking diagrams. In the demonstration meeting with iMotions, this material was meshed with the technical specificities of the software. Through the online introduction, and from the questions we were able to ask the developers and demonstrators, it became clear that all these elements are part of enabling the emotionally sensitive care that Pepper is argued to give to children at the Toronto hospital.
Below, we will illustrate how this software works and was demonstrated to us. We will argue that emotion detection such as that developed by iMotions, when used in care robots, can be understood as part of a larger process of 'care fragmentation' (Vallès-Peris and Domènech 2020): a process in which the various elements of care work go from being construed as a holistic performance of physical as well as emotional labour, all wrapped up in a professional identity – to being seen as a series of discrete tasks that can be performed by anyone, specifically a robot in our case. We will show how the promise of affective robotic care requires this kind of fragmentation where, in order for it to be manageable by a robot such as Pepper, affective care practices become about scanning the slightest movement of the corner of the mouth in order to calculate how probable it is that the patient is smiling. And we will discuss how, as this is done, some smiles are made visible and are valued, while others are not.

Using iMotions – how it works
A robot is not necessary for using the iMotions software. This became clear to us as the software was demonstrated in the meeting with iMotions. For many users, the only equipment needed (besides the software) is a computer with a standard web camera, directed at the study participant's face. When using the software, real-time images of the face are picked up by the camera and adorned: the outline of the face is marked by a thin yellow line that follows head movements. Within this outline, small yellow dots mark the ends and middle points of the eyebrows and eyes, as well as the edges of the nostrils, lips, and chin. All in all, the software covers the face in about thirty yellow dots. These dots mark crucial 'landmarks' – facial areas that are seen as containing information about emotional states, especially when read in relation to other dots as one's muscles move. The movement of these landmarks in response to a person using their facial muscles constitutes the basic data on which iMotions relies.
Then those data, i.e. the movement of facial landmarks, are fed into a classification algorithm: a type of algorithm that is used to assign data to predefined categories. In the iMotions software, these categories are made up of different emotions. The algorithm returns a numerical score which corresponds to the likelihood that one is, for example, happy. This can be judged partly by how wide the algorithm perceives one's smile to be. This shows that there is a strong connection being made here between a smile and happiness: if one smiles, there is a strong likelihood that the software will perceive that one is happy. If the algorithm returns the number 0, this means that there is 'no expression' of happiness. If the algorithm returns the number 100, the expression of happiness is suspected to be 'fully present'. In other words, emotions are measured on a scale from zero to one hundred.
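To make concrete what it can mean to turn the movement of a few dots into a number between 0 and 100, the following is a minimal, purely illustrative sketch in Python. It is not the iMotions algorithm, which we have not seen: the landmark names, the two-frame comparison, the logistic squashing function, and the `steepness` and `midpoint` values are all our own assumptions, chosen only to show how a widening of the mouth relative to a neutral face could be converted into a 'joy' score.

```python
import math

# Hypothetical layout: each frame maps a landmark name to (x, y) pixel coordinates.
Landmarks = dict[str, tuple[float, float]]


def distance(frame: Landmarks, a: str, b: str) -> float:
    """Euclidean distance between two named landmarks in one frame."""
    (x1, y1), (x2, y2) = frame[a], frame[b]
    return math.hypot(x2 - x1, y2 - y1)


def relative_mouth_width(frame: Landmarks) -> float:
    """Mouth width divided by face width, so that distance to the camera cancels out."""
    return distance(frame, "mouth_left", "mouth_right") / distance(frame, "jaw_left", "jaw_right")


def joy_score(neutral: Landmarks, current: Landmarks,
              steepness: float = 40.0, midpoint: float = 0.05) -> int:
    """Return a score from 0 ('no expression') to 100 ('fully present').

    The score is a logistic function of how much the mouth has widened
    compared with a neutral frame. Both parameters are illustrative guesses.
    """
    widening = relative_mouth_width(current) - relative_mouth_width(neutral)
    probability = 1.0 / (1.0 + math.exp(-steepness * (widening - midpoint)))
    return round(100 * probability)


# Two made-up frames: a neutral face and the same face with the mouth corners pulled outwards.
NEUTRAL = {"mouth_left": (40.0, 80.0), "mouth_right": (60.0, 80.0),
           "jaw_left": (0.0, 60.0), "jaw_right": (100.0, 60.0)}
SMILING = {"mouth_left": (30.0, 78.0), "mouth_right": (70.0, 78.0),
           "jaw_left": (0.0, 60.0), "jaw_right": (100.0, 60.0)}

if __name__ == "__main__":
    print("neutral face:", joy_score(NEUTRAL, NEUTRAL))
    print("smiling face:", joy_score(NEUTRAL, SMILING))
```

Even in this toy version, the score depends entirely on choices made in advance: which dots count, what a 'neutral' face is, and where the threshold for a smile is placed.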
In our observations of the iMotions software, we have noticed some key processes of standardization that are involved in iMotions' emotion detection – standardization processes that are, by extension, key to enabling the promise of emotional competence in robotic care. Below, we will delve into the emotion classification systems that iMotions uses as a basis for their software, discussing both the science that enables it and the ways in which researchers and participants have to adapt in order to fit into the system. In doing so, we pay attention to the intersection of quantification, classification, and standardization (Bowker and Star 1999) in the context of emotion detection.
iMotions (and similar competing) software requires a standardized and widely accepted conceptual framework for classifying emotions. iMotions' Facial Expression Analysis software depends on earlier, analogue methods of classifying and cataloguing emotional states. One such system is Paul Ekman's earlier mentioned 'Facial Action Coding System' (FACS), which is explained by iMotions in the following terms: 'a fully standardized classification system of facial expressions for expert human coders based on anatomic features' (iMotions 2017: 16).
As is explained in the brochure, FACS forms an important foundation for the iMotions software. FACS and the basic emotions theory, on which FACS is founded, are described in detail in the brochure. For example, the brochure dedicates several pages to illustrating the seven basic emotions that Ekman and Friesen (1978) discuss: joy, anger, surprise, fear, contempt, sadness, disgust. In the brochure, these emotions are complemented by images of different (albeit all white) faces illustrating these emotions in what could only be seen as fairly exaggerated ways (for example, the man illustrating the emotion 'fear' is depicted opening his mouth wide as if screaming).
It is these seven basic emotions that the iMotions software 'finds' through measurement of the tiny movements of landmarks in a person's face – or that Pepper looks for in faces of children in the Toronto hospital. This classification system is therefore a key component of the care fragmentation described in the previous section: basic emotions theory has laid the foundation for establishing clearly distinguishable components of an emotional spectrum, thus helping robots like Pepper in assigning an emotion to people they encounter.

But can you measure an emotion?
So far, we have described and discussed the process in which iMotions measures emotions digitally. However, emotion detection has also been associated with the important methodological question of whether emotions can be measured (Davies 2017). Is there a difference between other real-time tracking technologies such as step counters and sleep trackers (see for example Lupton 2016; Salmela et al. 2019) – measuring numbers of steps walked/numbers of hours slept – and emotion detection technologies? Arguments both for and against this inevitably circulate around questions of the very nature of emotions: are emotions something that you can 'read' on a face? Are there categories of emotions that can be distinguished from each other? Are emotions something that can be represented using numbers? And if so, what is the best way to achieve this?
In his work on mood tracking apps, Davies (2017) takes up this question, arguing that there is something special to measuring emotions. According to Davies, the emphasis on catching emotions in real time that is inherent both in the mood tracking apps that Davies focuses on and in the work of iMotions implies a view of emotions as subjective and constantly shifting experiences that exist in the moment (2017: 45). Davies sees the view of emotions as in the moment, and simultaneous attempts to somehow 'capture' (2017: 39) them, as a philosophical point of conflict underlying mood tracking: 'There is a philosophical contradiction here between the privileging of immediate, unreflective experience as the essence of value and the attempt to represent it in calculable, objective form for purposes of evaluation' (Davies 2017: 39).
What Davies understands as a philosophical contradiction, between the here-and-now view of emotions in mood tracking and the simultaneous wish to 'capture' these moods, could perhaps also be framed as a disciplinary one. As we have mentioned, the approach to emotions taken by iMotions – as visible in facial movements, measurable and quantifiable – has been understood as related to a larger tendency in AI research to consider emotions as quantifiable and replicable 'discrete states' (Suchman 2007: 232-234). The view of emotions as measurable and quantifiable is, quite obviously, compatible with research in which emotions are measured and represented in a calculable and objective form. As noted earlier, this view of emotions can be contrasted with discourses in the social sciences that are more attuned to the contexts and practices of the production and expression of 'affect' (Ahmed 2004; Pellegrini and Puar 2009). Here, emotions are thought of not as discrete psychological states but rather as social and cultural practices (Ahmed 2004: 9). This view of emotions aligns less well with emotion detection practices, such as those carried out by iMotions, in which emotions are seen as residing in the physical body and possible to read through muscle movements. In other words, Davies's discussion on the philosophical conflict underlying mood tracking – between, on the one hand, viewing emotions as subjective and 'in the moment' and, on the other, wanting to 'capture' them – could be connected to wider discussions on the nature of emotions and the methodological implications for how to study them.
The approach used by iMotions and other similar emotion-tracking technologies – to trace physical signs of emotions in the body – is put forth as one way out of the struggles of capturing 'real experience' without disrupting the flow of experience. In these cases, physical data is used as a tool to 'avoid any perceivable engagement with the quality or quantity of subjective experience' (Davies 2017: 45). In other words, technologies such as iMotions could be seen as using physical data as a tool to navigate between an emphasis on emotions as a subjective experience, and a wish to quantify and calculate them. To do this they rely on discrete, observable physical changes in facial muscles, pulse rate, skin temperature, etc. As such, these technologies seem to hold the promise of providing accurate and reliable access to 'real' emotions unmediated by limits of language, using a system that guarantees quantifiable data.
Interestingly, the methodological challenge raised by Davies leads us back to the different disciplinary understandings of emotions that we noted earlier. The process of identifying emotional responses (by connecting physiological responses to predefined categories of emotions through a computer algorithm) does involve mediation by language in order for emotions to become legible to others; it is just that this happens 'out of sight'. More precisely, the promise of iMotions to capture emotions directly by tracking physiological responses is – on closer inspection – undone by the background framework of 'basic emotions' on which it depends. The algorithm that connects muscle movement with emotion functions as a kind of black box, in which particular facial expressions have already (thanks to FACS) been categorized and named.
The methodological challenge with iMotions then lies not in how to handle the 'flow of experience' but rather in recognizing the effects of the interpretative framework imposed by FACS. Fundamentally, the reliance of such facial recognition software on language as a way to mediate the experience is in tension with the idea of affect as 'visceral forces beneath, alongside, or generally other than conscious knowing, vital forces insisting beyond emotion – that can serve to drive us toward movement, toward thought and extension' (Gregg and Seigworth 2010). In this latter conceptualization, the emotional/affective responses are understood to reside outside language.
Developing this further, we suggest that it is also worth paying close attention to the experimental framework that surrounds the iMotions capture of emotions and which is necessary in order to deliver on its promise of reliable emotional feedback. This experimental framework differs depending on the type of emotion detection, but in the case of iMotions includes the materials, algorithms, and science that emotions are funnelled through in order for the software to 'read' them. This framework cannot be considered a neutral transmitter of emotions, but rather shapes them in various ways in order for them to be legible, or readable. The process of making emotions readable depends on specific types of standardization of emotions which we will investigate below.

Whose smile?
We have so far addressed how differing understandings of emotions may shed light on the limitations of facial recognition software that is premised on Ekman's work on basic emotions (1976). However, a less considered aspect – and one which turns us now towards exploring some of the norms involved in care work – is the question of which faces (and emotions) constitute 'valid' subjects. Thinking about care and norms is often connected to critical studies of care, which examine what types of care are institutionalized and how, which bodies receive institutional care, and which provide it (Duffy 2011; Allen 2013), and what the power dynamics of those norms involve (Murphy 2015; DeFalco 2020). These studies inform our concern for both the historical basis for emotion recognition as well as contemporary practices of it.
An interesting aspect of the practice of recording emotional states is who is considered a 'good and reliable subject' (i.e. a subject whose emotions are worth recording) – and who is not. This is approached by Lucy Suchman in her work on affective computing (2007). Referring to the historian of medicine Otniel Dror (2001), Suchman draws a parallel between affective computing and the practice of recording, cataloguing, and enumerating emotions in laboratory sciences – practices that Dror traces back to the late nineteenth and early twentieth centuries. The aim of such cataloguing practices was to produce clear representations of 'emotional states' such as anger, fear, or excitement, and these were used in a variety of practices.
In Suchman's analysis of Dror's work, she discusses how those who were considered good subjects for representing emotional states – and chosen to be included in the research – were those who displayed 'clearly recognizable emotions on demand' (Suchman 2007: 233). These subjects can be contrasted with those who were more 'ambiguous' in their emotional expressions and therefore difficult to classify, who were excluded from research. As we will show below, similar categorizations of 'good' and 'bad' subjects (in our case, subjects for producing detectable emotional states) come to light when looking closer at the technological limitations of the iMotions software.
Whose emotions can be detected? The iMotions system is significantly limited in terms of what can be reliably captured by the material technology itself. This is visible for example in the technical requirements listed in the iMotions brochure to ensure best results: 'For online automatic facial coding with webcams, keep the following camera specifications in mind' (iMotions 2017: 24). There follows a list of five guidelines about resolution, frame rate, lens, and so on, specifying the quality of equipment required. In addition, there is a long list of 'Respondent instructions' in which the ideal set-up for capture is achieved by positioning, illumination, visibility of face, and mobility of face. As the brochure goes into increasing detail about the experimental procedure, focus turns to ways in which the more temporary aspects of a participant's appearance may affect the measurements: 'Facial expression analysis requires the visibility of emotionally sensitive facial landmarks such as eyebrows, eyes, nose, and mouth. If any of these are occluded, the face tracking and expression analysis may lead to only partial results' (2017: 27).
This then is different from the practices of recording and cataloguing emotions (like the FACS), discussed above, in which particular types of individuals are considered to produce the 'correct' performance of certain emotions. This standardization is connected rather to the practical aspects of capturing emotions digitally. Accurate reading of an emotion requires the camera to be able to clearly 'recognize' the landmarks of the face to be measured. If these are obscured by features such as large glasses, long beards, facial jewellery, hats, or side-swept bangs, then the camera is unable to place the small yellow dots which it relies on to take measurements of facial movements.
In other words, the cheery statement, 'there are only a few things to consider before you get going with your study' (2017: 26), sits somewhat at odds with the pages of detailed instructions about how to set up the procedure for capturing emotions. There are quite a few chances for things to 'go wrong'. We could understand these lists of instructions as a way to mitigate an underlying fragility in the system. The opportunity to receive a reliable reading of 'raw unfiltered emotional responses' (iMotions 2017: 2) is revealed as actually highly precarious in technical terms and dependent on participants fulfilling certain bodily norms and the experiment fulfilling certain technical norms. The brochure guidelines are based on bodies appearing in particular ways in order to be quite literally recognizable, and also demand a particular quality of camera and lighting in order for the image to be of high enough quality to apply the yellow dots that show the necessary facial landmarks. The iMotions software could for this reason be seen as a site of normative tensions (Grosman and Reigeluth 2019) where different and at times conflicting bodily and technical normativities are enacted and handled (2019: 10).
Let us return to the story that we opened with: the story of Pepper caring for sick children in a Toronto hospital, to illustrate what Pepper's use of emotion detection software might look like in practice. Imagine Pepper moving into the room of one of the patients. If using the iMotions software, Pepper would use its camera (which is, as earlier mentioned, located on Pepper's forehead) to scan the child's face, marking its landmarks and running the information through the software's classification algorithm, which calculates which emotion(s) the child appears to be showing. Based on this information, Pepper could adapt its behaviour accordingly. If the child appears to be sad, for example, Pepper might dance for the child in the hopes of making them smile.
As is made clear in the video clip that started out this text, this aspect of Pepper's caregiving – making children smile – is seen as crucial. Inducing smiles is put forth as a key part of the care practice carried out by Pepper. And, as we attempt to detail above, Pepper uses advanced software to be able to carry out this specific component of its care work – software which, in turn, has to divide the smile (and the associated emotion, happiness) into even smaller fragments: the landmarks on the child's face and slight movements of the corners of their eyes and mouth. Throughout the steps of emotion detection, the many separate tasks that go into calculating smiles create distance between the smile, the emotion that it is associated with, the emotional competence that emotion detection is associated with, and the care work to which this emotional competence is seen as crucial. We have already argued that this slippage, from care work to software pointing out landmarks in the face, could be understood as a form of care fragmentation where care practices are divided into separate fragments that are seen, and carried out, as separate entities (Vallès-Peris and Domènech 2020), and which has significant implications for the regulation and provision of professional care (James 1992).
In Vallès-Peris and Domènech's (2020) study of roboticists' imaginaries of care robots, they highlight the role of care fragmentation in shaping these imaginaries. One example was the division of care practices into physical care tasks and affective care tasks. While physical tasks were seen by roboticists as tasks that could be delegated to a robot, it was considered crucial that human staff remain in charge of the latter, affective tasks of care work. What we find happening in the case of Pepper in the children's hospital contributes to another dimension of robotic care fragmentation, since Pepper is, indeed, being delegated the affective elements of care work. The care fragmentation that we describe rather has to do with what goes into making that delegation possible: how the affective elements of care work are spliced up into smaller and smaller components, or tasks, in order for a robot to be able to do them.
As we suggest above, reliable data on emotional responses can only be provided for a particular subset of participants, using a carefully defined suite of technologies and within a framework of emotional classification that was generated from a subset of participants who displayed emotions in particular ways; in other words, it is contingent, that is, context- and participant-specific. In this paradigm, value is afforded to particular emotional expressions by virtue of being able to recognize them and accurately measure them. There are numerous other faces and emotions which remain outside the classification process and thus outside valuation.
Perhaps the normative tension of whose emotions can be read could be seen as a side effect of robotic (affective) care. As robots are programmed to carry out the emotional labour that is arguably a significant part of care work, several steps of care fragmentation are required – breaking the emotional labour into smaller and smaller units. The emotional classification system that is used by iMotions is one part of this. The measurement of facial landmarks and algorithmic classifications another. And both of these begin with the idea that the robot is providing care when producing smiles.

Hierarchies of care?
Earlier discussions of care, especially care provision in (often poorly) paid labour relations, can help us to better understand Pepper's role in processes of valuing, evaluating, and devaluing care. Feminist sociology, for example, makes clear how the dependency hidden in the term care is mitigated by existing hierarchies of race, gender, class, and humanism (Star 1991; Tronto 1993; Haraway 1997; Puig de la Bellacasa 2011). And possibly there are dependencies, and hierarchies of care, hidden in some of the smiles we observe with robots (cf. DeFalco 2020). Perhaps it is not a coincidence that Pepper is always smiling (replicating a deferential and affective understanding of care provision), and that Pepper's software is concerned about the smiles Pepper is producing (attuned to the responses of care recipients, rather than the feelings of the care provider, thus the unchanging smile on Pepper's face).
As Puig de la Bellacasa points out, seeing the dynamics of care requires paying attention to it on the ground, in the details of practice, in the situatedness of care – something feminist STS is accustomed to doing (Puig de la Bellacasa 2011: 100). This is part of what makes us curious about the practices involved in emotion detection as well as which faces are most legible to robots. We suggest that these differences in legibility are not neutral, incidental artefacts of programming difficulties (even if programming difficulties are one layer in the onion peel of explanations surrounding them), but rather expected results from technological obduracy – the way existing technologies and their cultural genealogies impact the possibilities available for the development and deployment of new technologies. In our example, we see the legacy of Ekman's (1976) work and understanding of emotions intersecting with robotic technologies for care. The theory of basic emotions (Ekman and Friesen 1978) enables the algorithmic classification that allows iMotions software to associate a mouth's muscle movements with happiness. However, as we show above, this legacy at the same time contributes to producing categories of legible and illegible emotions, causing some smiles to count and others not. This discussion is easily connected to Suchman's analysis of technologies of the service economy, like smart assistant interfaces or, as here, robots meant to care. Such technologies embody a 'just visible enough' worker ethos, one which is '… autonomous, on the one hand, and just what we want, on the other. We want to be surprised by our machine servants, in sum, but not displeased' (Suchman 2007: 217-220; see also Kennedy and Strengers 2020). This is in relation to the standardized expressions of emotions explored above. We want care robots to be able to understand how we are feeling, but from a limited number of ways to visually express this on our faces. Remember, from above, that only some, standard, legible expressions of emotions were selected to produce the basic knowledge about how to read faces. Caregivers are tasked with recognizing and responding to standardized care recipients' emotions, which triggers questions about the politics of servile self-erasure in care work, as Tronto did for humans and as Suchman does for human/non-human relations, and which Pepper's moulded, unchanging smile embodies. Smiles do different things in different relations. They can erase a person's self by feigning compliance and pleasantries or produce an affected subject by expressing a reaction. But if only some reactions are legible/readable/recognizable, only those responding in that way become subjects.
Our initial analysis suggests that – through unpacking the development of iMotions software and its deployment through Pepper – some familiar hierarchies of care are re/produced or remediated in the emergence of robotic care provision. Pepper illuminates the previous 'sedimented arrangements of valuation and devaluation' (Murphy 2015: 722), and assists us in asking: which valuations have previously been put on the bodies which were able to recognize those smiles before Pepper arrived on the scene? Pepper also prompts new, more future-oriented questions, such as what value is being placed on making the robot capable of assessing a patient's smile? As a non-human who can be switched off, Pepper has no access to discussion around its own value, and it is unsettling to think that this development might lead to broader devaluation of care workers of all kinds. Relatedly, the care practices that Pepper provides (and this relates particularly to emotional labour) are devalued by virtue of being provided by a non-human.
The tension here – that Pepper reminds us that other bodies have previously been expected to do the same work – also highlights a potential critique of Pepper's expertise, however skilled Pepper or the next generation of care robots may soon be. Intimate care proffered by non-humans is often valued differently or less so than care proffered human-to-human, and the reason for this lies in 'an inbuilt and little discussed expectation/requirement of "authentic" intimacy: humanness' (Harrison 2019). Perhaps this is why the introduction of Pepper to a children's ward, and the smiles that Pepper generates, are considered newsworthy. In her article, DeFalco organizes her argument around deconstructing the connection between authenticity and humanity within the field of care, challenging the dominant paradigm in which 'a wilfully anthropocentric perspective (…) makes 'real' or 'authentic' care the exclusive domain of human animals' (DeFalco 2020: 3).
DeFalco's arguments are politically motivated by a desire to interrogate the category of 'human' upon which the care she describes is based: 'by claiming that good care is human care, one is tacitly assuming the transparency of the category human' (DeFalco 2020: 5). As such, her work helps to tease out what it means to be able/allowed to do care. DeFalco's discussion on care robots further illustrates how a devaluing of care is entangled with other valuation practices. Care work, in the fragmented sense we see demonstrated by the use of Pepper that we examine here, is also being assigned value in that it is connected to quantified understandings of emotion as read through facial recognition technology. As particular care practices produce a visible, legible emotional response in the care receiver, they are simultaneously fragmented (identified as a composition of discrete care tasks) and valued. The production of positive emotional response is seen as an essential and unique kind of care labour that is – as is illustrated by the triggering effect that the image of a robotic caregiver can have – very important to get right.

The value of a smile – still, whose?
In this work, we have asked how the technology – in this case a recognition program embodied by a robot – is refracting social norms and values about smiling and about care. We have paid attention to whose voices and concerns are whispering in the muddle of discourses about recognizing smiles that we see in the iMotions material but also in news reports and publicity about Pepper. We are trying to articulate which concerns are not merely whispering, but are speaking clearly and loudly. We have reflected on which institutional or structural values they are conveying, what end goals are given, which expectations and hopes are expressed, and which are silenced.
Returning to the presence of Pepper in the Toronto children's ward, these questions could ask why Pepper is there, with sick children. Who has decided this is important? So important that someone wants to make a video report about it? In the brief film about Pepper in the ward, the voices (concerns, reflections, and jubilations) of hospital administrators, parents, and children are all heard talking about how useful the robot is because it is helping the sick children have fun. And of course, if developers want to make sure that children are having fun when interacting with Pepper, it is good if Pepper can see the children smile when caring for them.
Underlying this discourse is the expression of a very strong norm (that makes certain things possible and desirable in an institutional setting, and which is contingent on some places and times; apparently found on the Toronto children's ward in 2020) that a child should have fun (play, dance, laugh) even when sick in hospital. Within practices influenced by that norm, one element of providing care also includes providing the opportunity for fun, for play, for smiles.
This article has examined software developed to 'read' emotions by tracing human facial movements and mapping them onto a pre-existing cartography of emotion. This software is used in care robots like Pepper, who are imagined to be able to read the emotional responses of humans for whom they are caring. This scenario of care, bound up with the unidirectional expression of emotion which is thought to be important for the robot to process, also awakens questions about the human/non-human relation and the imaginaries of care which are found in it. The iMotions brochure proffers an understanding of 'authentic' emotions not validated or recognized by human judgement, but rather by scientifically rigorous 'objective' measurements performed by facial recognition software. The ability to make such measurements means the possibility of being able to program a robot to recreate them and generate 'authentic' emotional reactions.[17] We followed the journey of a smile from the face of the human, through Pepper's camera eyes, as it is fed into an algorithm and transformed into a number that makes sense to Pepper, but also to those humans who own Pepper and assess its usefulness through value metrics that are expressed in numbers rather than warm and fuzzy feelings. We have used the concept of care fragmentation to make sense of this journey of compartmentalizing and translating care practices.
And it is here we end, with a reflection on care presented by STS researchers Martin, Myers and Viseu, who write: 'A critical practice of care would insist on paying attention to the privileged position of the caring subject, wary of who has the power to care, and who or what tends to get designated the proper or improper objects of care' (Martin et al. 2015: 636). Reviewing the video from the Toronto children's ward with which we started this article, we can see care providers in the background. Hospital administrators are being interviewed – and smiling – in scenes intercut with sick and happy children. The smiles of care recipients may be what Pepper is measuring, but the purpose of measuring them is to produce smiles (or at least reduce headaches and stress) in the administrators, policymakers, and care providers who have been tasked with caring for the bodies that are interacting with Pepper. And these are the people (job categories, structures) that are paying for Pepper and its continued development.

[17] There is, however, a crack in the argument that iMotions make: their correlation between facial measurements and emotions hangs on the following phrase: 'facial expressions and emotions are closely intertwined' (iMotions brochure, p. 7). Note 'closely' – not 'completely' or 'accurately', but 'closely'.
Thus, we leave with a final unsettled feeling, and suggest that it is important to consider not only how a smile is being valued, but who (and what structures and systems) have decided that that particular smile is valuable. Who cares about the smiles, and what is the privileged position of that caring person?