A.I. Among Us: Agency in a World of Cameras and Recognition Systems


This paper reports on the use and perceptions of deployed A.I. and recognition social-material assemblages in China and the USA. A kaleidoscope of “boutique” instantiations is presented to show how meanings are emerging around A.I. and recognition. A model is presented to highlight that not all recognitions are the same. We conclude by noting that A.I. and recognition systems challenge current practices for the EPIC community and the field of anthropology.


Unknown, Caucasian, male, grey hair, 80 kgs, 1.8 m, 55-60 years at entrance 2.
Unknown, Caucasian, male, grey hair, 80 kgs, 1.9 m, 55-60 years in hallway 1.
Unknown, Caucasian, male, grey hair, 78 kgs, 1.9 m, 55-60 years located in café 2.
Unknown, Caucasian, male, grey hair, 80 kgs, 1.8 m, 55-60 years located in hallway 3.
Unknown, Caucasian, male, grey hair, 80 kgs, 1.8 m, 55-60 years located in café 2.

Thousands of “observations” are logged, one about every second, during a single day on campus, ostensibly forming some sort of narrative of the researcher’s day. What kind of narrative is it? That’s the question. What the researcher understood at this stage was simply that this narrative was made possible by a set of networked cameras connected to a range of facial recognition systems dispersed across the school campus. Somewhere, or perhaps at multiple points distributed across the network, judgments and decisions were being made that scripted the actions of others and thereby gave shape, unbeknownst to him, to the actions he might or might not take.

Strangers on campus are noted by the recognition software as “unknowns.” This means that they are not students, staff, faculty, parents, administration, regular service people or even those identified as “concerns.” By the end of a day visit, one of the authors had been spotted in the #2 café at least three times, usually in the company of another “unknown” and accompanied by someone who was known. This made the author a kind of “known unknown,” which was an acceptable identity to the system, warranting no further action than to continue to register his presence. In this way, these school recognition systems demonstrated some small ability to deal with uncertainty. Looking from the camera’s point of view, the author and another researcher had become “familiar strangers” (Milgram 1972). Milgram used the concept to help explain the experience of modern city life. In this paper we are flipping it to help think about a new hybrid digital-social landscape being ushered in by A.I. and facial recognition.

BACKGROUND

Everyday life is a more mixed world experience than ever: digital/analog, machine/human, bits/atoms. Donna Haraway (1984) called out the limitations of such binaries decades ago, and today they are even more inadequate as our lives grow ever more hybrid, comprised of more-than-human multiplicities. Advances in artificial intelligence, cloud computing, wireless networking and data collection have ushered us into a new social-material era, one equally exciting and anxiety-provoking. But relationships don’t come easily, and humans and technologies are surely in a protracted period of courting one another. If the industrial age ushered in one set of expectations and accountabilities, artificial intelligence seems to change the character of this courtship—suddenly our relations are much more promiscuous. In part, these distributed and varied encounters are expressive of a shift from products to networks, and concomitantly, a shift from discrete and singular artifacts of value to value as an outcome of connectedness and multiplicity. The shift is one where digital technologies that were previously limited to particular kinds of discrete, controlled, one-to-one interactions are now engaged in constant interaction with many, sometimes multitudes of humans. However, this adjustment period is the beginning, not the end. Self-driving cars, “personalized” agents on our smartphones and household systems, and autonomous robots are just some of the images conjured when A.I. is mentioned. While these examples seem to suggest A.I. is represented by a sleek, singular futuristic technological artifact, several scholars have highlighted how contemporary instantiations of A.I. rely on a complex, distributed, interdependent network of computers, software, data warehouses and infrastructure (Dourish 2016).

This paper offers a critical and ethnographically-informed exploration into key questions surrounding the constitution of A.I. and recognition systems as they permeate the complex practices and relationships that comprise contemporary everyday life. Our focus is on recognition, A.I. and the real-time video analytics of recognition that are deployed and used in everyday contexts today. We will empirically illustrate the ways that human and non-human agents participate in building everyday life worlds and cooperate in this shared meaning-making process. We want to attend to the many agents involved, and shift the focus from the singularity of device, product, service, and brand to the heterogeneity of intersecting databases, programs, products, services, people and networks.

We are conceptualizing various collections of A.I. and recognition as polyvocal assemblages (Tsing 2015, Deleuze and Guattari 2003, Ong and Collier 2004). The concept of the assemblage is salient because these systems are not in fact singularly engineered. They are diverse, more-than-human assortments that are gathered together, sometimes by design, other times ad hoc. Even though we might experience them through discrete interactions, as coherent services, their composition is multifaceted, often entangled. Our hope is to develop a critical appreciation for how diverse materialities, cultures, agencies, and experiences blend together in these emerging assemblages.

This use of assemblages has been employed to shift the framework of research to place greater emphasis on the dynamic, changing, and opaque characteristics of these A.I. recognition assemblages, as well as to bring in non-human participants. The approach allows for the agency of objects and for the heterogeneity of assemblages. The researchers here are positioned to observe how elements are understood to cohere in existing or developing assemblages. Unlike Tsing’s mushrooms (2015) or Bennett’s (2009) green chilies, we did not have a material object to focus upon; rather, this is the groundwork for understanding how thoroughly entwined systems can mutate and develop over time [and space] and frame what is possible, desirable and expected of recognition systems. As Deleuze and Guattari (1987) note in their original writings on assemblages, they are “anticipatory” and concerned with continuing trajectories and future possibilities of what these assemblages might become, which seems particularly apt as we research A.I. and recognition technologies. The alternative of conceiving A.I. and recognition uses as discrete products or systems would imply a closed-ended and functionalist understanding that hides the series of interconnected and interdependent sets of technologies, institutions, agendas and people. What emerges here are partial directions and pressing questions related to the topic of the conference – agency: as artificial intelligence becomes an agent, what are the opportunities and challenges for shaping relationships to continue to enable agency? And what kinds of agency are possible in a world where technical things can know and do?

APPROACH

Since 2017 we have conducted four field research projects in China and two studies in the USA.1 In 2017 we elected to study these A.I.-recognition technologies because they offered attractive solutions to address many contemporary needs for identification and verification. These technologies brought together the promise of other biometric systems that tie identity to individual, distinctive features of the body, and the more familiar functionality of video surveillance systems. This latter aspect has also made them controversial, which further motivated our research to develop a deeper understanding. In the USA, there has been growing social and political concern around the use of facial recognition systems. Samplings from the press in recent months include stories in the BBC (White 2019), Wired (Newman 2019), New York Times (Teicher 2019), Washington Post (Harwell 2019), CNN (Metz 2019) and The Guardian (2019), to name a few. In contrast, China’s facial recognition systems, found in urban centers like Shanghai, Beijing and Hangzhou, were becoming ubiquitous even in 2017. In China, these recognition technologies continue to grow in sectors like civic behavior, retail, enterprise, transportation and education. Business Times (2019) reports that Alipay facial recognition payment is already deployed in 100 cities and that Alipay will spend $582 million to expand further. Tencent is adding facial recognition payments to the WeChat platform, with its 600M users. In a society that has had overt and everyday surveillance in human and institutional form for over 70 years, the emergence and deployment of recognition through cameras has been less controversial than in the USA.

We also chose to study these systems because recognition technologies, for all of their social and political controversy, allowed us to continue to talk about humans. Unlike some other A.I. systems, recognition technologies rely upon human embodiment, action, and often interaction. This is significantly different from, for example, machine learning systems that use social media as proxies for human activity. We hypothesized early that camera systems were harbingers of new interaction models with humans, and that recognition technologies, in particular, were examples of cameras literally reaching out to people, albeit awkwardly and often inaccurately. Even when deployed for surveillance, the experience of being seen at a distance in a public space equipped with CCTV was a kind of interaction, one that implicated a more complex web of human users with specific interests and motivations. These new interaction models are suggestive of notions of embodied interaction (Dourish 2001), but, because of the seamlessness of these recognition systems, they also seem to elude some of the situations of collaborative meaning-making we are accustomed to. As these systems become so commonplace that they disappear, and our interactions with them become just another everyday action (“smile to pay”), how do we—humans—participate with these dynamic, but elusive assemblages to make the worlds we want to inhabit?

In 2017 facial recognition systems were emerging in the mainstream landscape at a global scale just as companies like Intel were shifting business interests to the cloud and networks, and in the communications arena to 5G. The technologies emerging to transform the network, mobilized further by 5G’s emphasis on machine-to-machine compute, indirectly signaled that the interaction model of human and device, a hallmark of the PC ecosystem, was no longer the asset to exploit. Today’s technology industry conversations about the “edge” center on “last mile access” and how to bring compute closer to where data is produced, a challenge faced not just by silicon companies but by cloud service providers, telecommunications companies, telecom equipment manufacturers, original equipment manufacturers, and even content providers; these conversations simply do not focus on what people do with technology. In this business context, increasingly distant from end-users, facial recognition provided us with a way to continue to talk about humans at a moment when so many only wanted to talk about machines.

Finally, we were skeptical not about the fact of facial recognition becoming ubiquitous in China, but about the contrast cultivated by the USA press relative to deployments at home. The research concerns in the USA on facial recognition have centered on three points: 1) recognition systems were biased in their development (Burrell 2016; Crawford and Shultz 2013; Eubanks 2017; Noble 2016; O’Neil 2016; and Pasquale 2016); 2) the systems created new risks to privacy (Dwork and Mulligan 2016; Introna 2009); and 3) there were ethical concerns about use (Horvitz and Mulligan 2015; Stark 2019). While Eubanks (2017) has equated their development to the rise of “eugenics,” Stark (2019) equates the potential dangers of recognition to “plutonium.” But these concerns have not necessarily resulted in fewer systems being adopted. Indeed, Gartner (Blackman 2019) projects recognition to be the fastest growing Internet of Things (IoT) space in the near future. Further, we have seen deployments expand in the USA since 2017 in public city infrastructure as well as airports, private school campuses, industrial facilities, summer camps and childcare settings. In addition, the US government says facial recognition will be deployed at the top twenty US airports by 2021 for “100 percent of all international passengers,” including American citizens, according to an executive order issued by President Trump (2017). By examining deployed uses of recognition, we hoped to provide empirical evidence to fill the gap between building, speculation and future deployments.

In what follows we share a kaleidoscope of vignettes from the field to supply the raw material for a discussion about value and its complexities for A.I. and recognition. The use of kaleidoscope is intentional in that it is not the scientific instruments of telescope or microscope that we employ here, but images of instantiations of new technology with people; images left open for further interpretations. As Gibson (1999) notes, “The future is already here – it’s just not evenly distributed.” While there has been plenty of speculation on the cataclysmic possibilities of A.I., there has been a dearth of studies of tangible instantiations, something that is more “what it is” than “what might it be.” We will share snapshots of a future world of A.I. and recognition that is already here. We focus on what could be called “intimate” or “boutique” uses of recognition: not massive surveillance systems, but uses within closed institutions or communities. The snapshots don’t tell a complete story (there isn’t one to tell), nor do they provide a perfect compass for navigating the emerging new spaces unfolding before us. Instead, they are glimpses into the kinds of questions a compass can address, and the kinds of terrain it should help us navigate. From these vignettes, we raise questions about future research and practice for the EPIC community.

STORIES FROM THE FIELD

Everyday & Uneventful Facial Recognition

Popular visions of A.I. are seductive, but real-world facial recognition is amazingly boring in China. A few of the A.I. systems we experienced delivered identification for seamless access to residences, offices and schools; seamless access to subways and trains; seamless identification for hotel check-in, and seamless access inside banks and at the ATM; clerk-less convenience stores; preferential treatment in retail stores; identification for government services and criminal investigations. This list of applications is only meant to underscore that A.I. and recognition are commonplace in China, and still growing in both government and commercial sectors, to the extent those are differentiated. From the start, what is important to emphasize is how banal the use of these systems is. Perhaps there is complexity and prowess behind the scenes, but everyday interactions with these systems and services are…well…everyday.

Recognition is so ordinary and uneventful that it often goes unnoticed, both by users and by researchers who are supposed to be in the field keenly observing. As a result, there were many times in the field when we had to ask people to repeat their use of a facial recognition system so we could observe the process. We asked one of our early participants in the study if we could take her picture as she walked through the facial recognition system at her residence. She walked through, and we had to ask her to do it again. We explained she did it too fast for us; that we could not see the system in action. Could she do it again? Oops, we missed it the second time, and then we missed it again the third. Finally, we just asked her to walk very slowly, much slower than usual, and we got it. Of course, by that time a mother and her kid, an older woman, and the security guard were all looking at us like we were idiots. The guard, in particular, seemed delighted by it all. Another time, there was the look on a young man’s face when we asked to go with him to take money out of the facial recognition ATM. You could almost see him thinking, “Oh yeah, foreigners think facial recognition is interesting? Is this a scam to take my money?” We also had to ask him to log in three times to catch the process.

Such interactions with facial recognition are very different from, indeed opposite to, what we are used to with technology. Generally, with any kind of technology, whether a personal computer, phone, Alexa, Nest thermostat, car, or even Siri—we prepare to interact, and we remain aware of the interface, even with those that work almost seamlessly. Facial recognition interactions in China are stunning because they are so normative and normalized, often blending seamlessly into the environment. For example, three women walking back into work after lunch only briefly look in the direction of the facial recognition machines as they continue to walk and talk straight back into the building. Nothing to see here. No break in the conversation. Hardly a pause in their steps. They give a look that is less than the nod you might give a security guard you know very well. It is substantially less of an action than pulling out a badge and pausing to badge in. Life simply unfolds, not only as if the technology were never there, but also as if those social regimes and routines of observation that define so much of what we call society and culture had ceased to exist. But of course, they haven’t ceased to exist; they’ve just been differently delegated.

Facial recognition is not just a part of high-end office buildings or residential complexes or trendy businesses; it is becoming commonplace everywhere in China. We watched as customers at a KFC quickly ordered on a screen then smiled briefly to pay. Yes, giving up money and smiling about it! In practical terms, of course, the smile is a second form of authentication for the facial recognition system to verify that you are alive (first the system verifies you are you; smiling is a secondary measure to avoid spoofing). The “smile and pay” is also common at some grocery stores. “Sometimes you can’t help but feel a little happy about smiling [even if it is at a machine],” a woman checking out at a grocery store commented. Of course, she isn’t really smiling at a screen. She is smiling at an Alipay system (from ANT Financial) that is part of the Sesame Credit loyalty program for Alibaba. People are aware of the Alibaba loyalty program, and some of the perks of participation. Dual systems, like the ticket/person verification system at the Beijing main train station, are also popular, as lines move quickly with people being recognized, authenticated and verified by a machine, rather than waiting in line to get tickets and then waiting for a security person to check them in before boarding. These are just normal, everyday, “nothing to see here” parts of urban life.

Beyond the mundaneness of recognition systems, people were able to articulate some advantages, and while they would raise occasional issues about use, their concerns did not necessarily diminish the value of using a facial recognition system. People mentioned that it is more secure, is hassle-free because all you have to do is smile to get access, and oh yeah, it is fast. On the surface, these seem to be values of efficiency — where ease of use and enhanced productivity determine the worth of the system. While that may be partially the case, we also believe users found meaning and significance in the fact that the use of these systems obviated the unnecessary social complications often inherent in transactions. In other words, one of the (human-centered) values of these systems is the desire to avoid awkward interactions with other humans in a socio-cultural context that has weighed heavily on how those interactions should take place. While social interactions are important in China, they come at a cost. People may push more stuff at you to buy or try to make connections by attempting to leverage a transaction into a relationship. There are additional cultural factors at play here, such as those of class. Though we presume people want to interact, and that sociability is desired, that presumption may be flawed, or at least not always true or uniform. By their very personalization, recognition technologies support the capacity to elide select social encounters.

Participants in the study were expecting to see more places and more uses for facial recognition in their urban environment. Unlike in the USA, there was no moral panic; in fact, people were excited and proud about what they perceived to be a highly novel technology.2 There is a solid cultural belief in China’s middle class that technology is both a marker and a catalyst for economic growth and national success on the global stage. The recognition systems are interpreted as markers of the development of society, while at the same time they are making urban China an easier place to live, and in some respects more like the West. In a curious way, A.I. facial recognition technologies highlight the individual, a hallmark of Western culture and traditions. As one of the participants said, “If everything is connected then you can just bring your face!”

Someone Is Watching You: Interpretive Flexibility

High School X: Hall Security3

High School X, in a tier two city in China, has switched its campus security camera system over to one that uses facial recognition. The facial recognition system enables students to come and go freely on campus and is connected to the classroom attendance (check-in at the door) system. The security camera system can be accessed from any authorized desktop, e.g., security office monitors, IT office PCs, principal’s PC, etc. The school used to have a bank of twelve TV monitors rotating through the twenty cameras on campus. The campus now has over forty cameras for security. Two features of the system were demonstrated for us. The first was anomaly detection of spaces which, when possible, identifies the person in the space (or at minimum captures an image of them). Anomaly detection in this case means someone is in a space at the wrong time, e.g., in the hallway during class time. The second feature enabled a human supervisor to search by image or name in order to have all of that person’s appearances for the day aggregated on screen. Taken together, these capacities enabled the detection of more than just attendance. As the following example shows, they enabled the detection of patterns of behavior, and as a consequence, revealed relations that might otherwise go undetected.
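
To make these two features concrete, the following is a minimal, hypothetical sketch of how schedule-based anomaly detection and per-person appearance aggregation might be composed from a stream of recognition events. It is our own illustration, not the school's software (whose internals were not shared with us); the event fields, class periods and location labels are assumptions.

from dataclasses import dataclass
from datetime import datetime, time

@dataclass
class RecognitionEvent:
    # One "observation" from a camera: who (if identified), where, and when.
    person_id: str        # e.g., "june" or "unknown_017" (hypothetical labels)
    location: str         # e.g., "hallway 1", "cafe 2"
    timestamp: datetime

# Assumed class periods: during these windows students are expected in classrooms,
# so a sighting elsewhere is flagged as an anomaly.
CLASS_PERIODS = [(time(8, 0), time(9, 40)), (time(10, 0), time(11, 40)),
                 (time(13, 0), time(14, 40)), (time(15, 0), time(16, 40))]

def is_anomalous(event: RecognitionEvent) -> bool:
    # Flag anyone seen outside a classroom while a class period is running.
    t = event.timestamp.time()
    in_class_time = any(start <= t <= end for start, end in CLASS_PERIODS)
    outside_classroom = "classroom" not in event.location
    return in_class_time and outside_classroom

def appearances_for(person_id: str, events: list) -> list:
    # Aggregate every sighting of one person for the day (the "search by name" view).
    return sorted((e for e in events if e.person_id == person_id),
                  key=lambda e: e.timestamp)

What a sketch like this cannot capture, of course, is the interpretive work described below, in which the software's output is read and acted upon by security staff, IT and teachers.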

[Interview 1PM Classroom]

June (HS X Student): I’ve had cameras in my schools all of my life. They are watching us to protect us, but it is a little creepy. I mean, they know so much about us that they could know when you go to the bathroom or if you were dating, and who that is, really anything …

[Interview 3PM IT office]

Main IT guy (HS X): I think you talked to June earlier. Did she mention she was dating? Dating between students is not permitted at this school. We’ve known [with the facial recognition system] that she has been dating for over a month. We haven’t done or said anything about it. She and her boyfriend are both getting very good grades. As long as they are getting good grades and don’t disrupt the community (school body), we won’t interfere.

How did IT and the administration know June was dating? We don’t know. Those details weren’t forthcoming. We do know that the analysis of her daily patterns involved verification with a teacher, the anomaly detection, and person identification (like a game of Clue) on the school grounds. The interpretive agency in the assemblage didn’t reside solely with the software but in the interaction among security, IT, teachers and the hall monitoring software.
