Who and What Drives Algorithm Development: Ethnographic Study of AI Start-up Organizational Formation


The focus of this paper is to investigate deep learning algorithm development in an early stage start-up in which the edges of knowledge formation and organizational formation were unsettled and contested. We use a debate between anthropologists Clifford Geertz and Claude Levi-Strauss to examine these contested computational forms of knowledge through a contemporary lens. We set out to explore these epistemological edges as they shift over time and as they have practical implications for how expertise and people are valued as useful or non-useful, integrated or rejected, by the practice of deep learning algorithm R&D. We discuss the nuances of epistemic silences and acknowledgments of domain knowledge and of universalizing machine learning knowledge in an organization that was rapidly attempting to develop algorithms for diagnostic insights. We conclude with reflections on how an AI-Inflected Ethnography perspective may emerge from data science and anthropology perspectives together, and what such a perspective may imply for a future of AI organizational formation, for the people who build algorithms, and for a certain kind of research labor that AI inflection suggests.

Keywords: AI, Deep Learning, Algorithm R&D, Epistemology, Domain Knowledge


SETTING THE SCENE

We are in a 3,000 square foot office space, and from a small board room we hear “that’s domain knowledge – we’ll get subject matter experts for that!” Team members debate whether to hire radiologists and other expertise from outside algorithm development. They are puzzling over the usefulness of radiologist knowledge in developing deep learning algorithms that are to serve radiologists in their diagnostic interpretations. We have arrived at an early stage deep learning startup in Silicon Valley near a strip of park land where cyclists and joggers stream by an area that once fell into neglect and was reborn into a corridor of fusion restaurants and tech companies.

We find ourselves among decades of tech venture capital infusion set against university and residential growth. You can smell chlorine from Olympic-sized pools and spas, coconut-crusted shrimp from a high-end bistro and musty ash from recent wildfires. Flocks of feral parrots sound off in the trees high above, escapees of nearby ranch-style living rooms. We have arrived at what social network theorist Mark Granovetter has termed an “innovation cluster” (Ferrary & Granovetter 2009) of artificial intelligence (AI), or what Andrew Ng has termed “the new electricity” (Ng 2018). The term refers to algorithms that could be described as “rocket engine[s]” in which “huge amounts of data” can be processed, lifting the rocket ship of machine intelligence to new heights (Garling 2015). Jeff Dean, the head of AI at Google, assesses deep learning algorithms as keys to AutoML, machines that learn to learn, and as foundational for the early detection of a wide range of diseases (Dean 2018). Start-up companies such as these, which Peter Diamandis describes as “smaller” and “nimble,” have evolved out of a linear industrial era into an “exponential” era, a period of market disruption, unpredictability and 100x algorithmic growth (Koulopoulos 2018).

The concepts and possibilities of our case study of deep learning algorithm development and new organizational formation, based on our direct experience of such a start-up, emerge out of this exponential petri dish.

STRUCTURE OF PAPER

This paper explores original research set within deep learning algorithm development in an early stage start-up organization from 2014 to 2015. At the time I (Rodney Sappington) served as a Senior Data Scientist focused on developing algorithms for the early detection of disease. Together with my collaborator, we analyze features of deep learning algorithm development, including observations of and interactions with strategic partners, clinical leaders and radiologists. We also refer to the experiences of highly regarded team members with whom I worked and whom I had the pleasure, in part, to lead in algorithm and clinical development. Not least, we have developed our perspective out of references across the social and behavioral sciences and discussions with colleagues outside the walls of algorithm development.

The structure of this paper includes conceptual and case-study analyses. We contextualize our research by briefly introducing a passionate debate between Clifford Geertz and Claude Levi-Strauss on the fear and embrace of computational forms in ethnographic practice. This debate is not a gentle or dated conceptual tug of war between renowned anthropologists but a contemporary lens through which we can investigate algorithm development and the messy problems of building algorithms with medical diagnostic capabilities. We briefly explore the emergence of deep learning algorithms in Silicon Valley. We turn to early stage start-up features and move to the complexities of a case in which a health insurance company offers large amounts of data and a problem for algorithm development with a twist. We go inside the organization and examine hiring practices and the role of domain knowledge in a machine learning start-up, with a scene at a company retreat. We explore the problem of positionality in being both a data scientist and an anthropologist in the field of applied machine learning. We attempt to open up how the diagnostic patient is being conceptualized. In conclusion we provide observations on epistemological tensions in algorithm development, which we term AI-inflected ethnography. We are not suggesting a methodology but pointing to a way of inhabiting these tensions and silences at the core of algorithm development, an approach that opens up a view into how organizations and organizational members get constituted and sometimes unravel.

FEAR-EMBRACE OF COMPUTATIONAL FORMS

There could not have been two more different scholars studying human behavior and culture than Clifford Geertz and Claude Levi-Strauss. For Geertz, field research was immersive, forged in dust, blood and side-bets surrounding Balinese cockfights. For Levi-Strauss, field research was forged in conceptions of binary order and kinship systems. For both, a certain kind of intelligence and mind of the ethnographer was at stake, one that was almost at once human-driven and machine/system-driven:

Society is, by itself and as a whole, a very large machine for establishing communication on many different levels (Levi-Strauss 1953).

Lévi-Strauss has made for himself…an infernal culture machine. It annuls history, reduces sentiment to a shadow of the intellect (Geertz 1973).

On one side, Geertz viewed ethnography as a practice of interpreting human speech and gesture in the wild, and intentionality in everyday life, which he called “deep play.” On the other side, Levi-Strauss viewed ethnography as a scientific practice of interpreting universal codes, totems and patterns across society; it was “structural.” The gist here was a tension: a type of human perception and cognition emerging in local everyday life versus a type of human perception and cognition emerging as universal patterns across everyday life. For Levi-Strauss, information science held a central place in creating and interpreting culture, which implied both human and non-human forms. For Geertz, information science suggested an infernal (hellish) culture machine.

There was another kind of legacy that was epistemological. It drove their passion and still fuels passions today in machine learning. Anthropologists and data scientists still largely live and work in this legacy: categories of “hard” and “soft” knowledge, universality and particularity, probabilities and possibilities of quantified judgment and human intuitive judgment, structured and unstructured data, and embodied and cognitive forms of intelligence. I take the Geertz-Levi-Strauss debate as a struggle over how we build, imagine and fashion algorithms for human benefit and machine automation.

THE EMERGENCE OF DEEP LEARNING ALGORITHMS IN THE SILICON VALLEY CONTEXT

In 2012 Andrew Ng and others built high-level features using deep learning to recognize and classify cat videos at a 70% improvement over previous “state-of-the-art” networks (Quoc et al 2012). Using 16,000 computer processors, the deep learning network was presented with 10 million digital images taken from YouTube videos. This was a breakthrough and supplied proof that certain deep neural networks could learn automatically, without human hand-coding, across complex image sets. This was the beginning of the successful use of convolutional neural networks (CNNs). Two years later, overutilization of medical imaging in healthcare made radiology ripe as a testing ground to apply some of the lessons learned from 2012. New early stage organizations began to take shape to productize these findings.

The context was Silicon Valley, in which a line-up of similar innovations had set an aspirational and venture capital stage for deep learning algorithm development. Expectations were high. From Apple, Uber, Lyft, Google, Airbnb, Salesforce, Tesla and Twitter, to name a few, today nearly 30 technology companies are so-called “unicorns”1 or hold near-“unicorn” status, totaling close to $140B in value (Glasner 2018). This concentration has also brought together venture funds. Venture-funded machine learning start-ups have recently almost defined the San Francisco-Silicon Valley region in terms of company valuations. While the global economy is expected to grow by 3.5% of GDP, venture-backed AI startups have an expected growth rate of over 40% by 2020, with a projected U.S. AI market valuation of $8.3T by 2035 (Faggella 2018). Along with this growth came warnings that we “must be thoroughly prepared—intellectually, technologically, politically, ethically, socially—to address the challenges that arise as artificial intelligence becomes more integrated in our lives” (Faggella 2018).
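To make concrete the claim above that deep neural networks learn features from images rather than relying on hand-coded rules, the following minimal Python/PyTorch sketch defines a small convolutional network and a single training step. It is our own illustration, not code from the 2012 work or from the start-up described in this paper; the layer sizes, the two-class output, the 64x64 input and the random stand-in batch are assumptions made only for the sketch.

# Minimal sketch of a convolutional neural network (CNN) classifier in PyTorch.
# Illustrative assumptions: 64x64 RGB inputs, two output classes, random data.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Convolutional filters are fitted to pixel data during training;
        # no image features are hand-coded by a human expert.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# One illustrative training step on a random batch standing in for image data.
model = SmallCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
images = torch.randn(8, 3, 64, 64)     # batch of 8 RGB images, 64x64 pixels
labels = torch.randint(0, 2, (8,))     # e.g. "cat" vs. "not cat"
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()

The point of the sketch is only that the filter weights, the “features,” are outputs of optimization over data rather than inputs supplied by a domain expert, which is the shift the 2012 result described above made visible at scale.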

The new start-ups are often described fondly in the industry as “Moonshots.”2 Singularity Ventures, founded by Ray Kurzweil, also terms them “exponential startups.” These are organizations that claim to hyper-scale across person, transaction and global impact. Founders within them are typically referred to as super smart. They exemplify the view that the social good of machine learning goes beyond a mere transactional machine, consumer recommendation system or ad placement. As the perspective goes, these are exceptional people building exponential machine learning products with exceptional resourcefulness. What they know and how they apply what they know has become a global phenomenon that others try to emulate.

This smaller, nimble organizational type has been described by Ferrary and Granovetter as one “node” in “a durable” assembly of organizations that has an almost magical capacity to “anticipate, learn and innovate in order to react to major internal or external changes” (2009). This is a lot to ask of people building the new global electricity, with industry expectations that it will touch every human life on the planet.3

