Edited by Janet Pauketat and Jacy Reese Anthis. Many thanks to Tobias Baumann, Michael Johnson, Jason Schukraft, Brian Tomasik, and Daniela Waldhorn for reviewing and providing helpful comments and discussion.
The feature-based approach to assessing sentience
Evaluating features of artificial entities
Applying and weighing features
In a previous post we explained our reasons for favoring the term “artificial sentience.” However, while this term captures what we truly care about — artificial entities with the capacity for positive and/or negative experiences — it may be too vague when we try to use it to make judgments about sentience in specific artificial entities. Since our judgments about which entities to grant moral consideration depend on whether and the extent to which we consider them to be sentient, this raises a potentially serious problem.
One approach to this problem is to say that we do not currently need to know which artificial entities are sentient, only that some future artificial entities will be, and they may be excluded from society’s moral circle. We should, therefore, encourage the expansion of the moral circle to include artificial entities despite not knowing exactly where the boundary should be drawn. Working out the specifics of which artificial entities are and are not sentient can be deferred to the future. Still, further clarity on this can be useful because of the growing complexity of artificial entities, as well as the increasing attention to their treatment as a social and moral issue. In this post, we operationalize the term “artificial sentience” and outline an initial, tentative framework for assessing sentience in artificial entities.
A common approach in the literature on nonhuman animals is to identify features that are indicative of sentience, and then to test the extent to which different entities possess those features. This approach is relatively theory-neutral, relying on argument-by-analogy and inference to the best explanation, rather than starting with a specific theory of sentience and reasoning from that. Focusing on features makes sense across a range of philosophical theories of consciousness, such as eliminativism, in which the features either constitute consciousness itself or are empirical evidence of other features that constitute consciousness, or property dualism, in which the features are only empirical evidence of the dualist property of consciousness. While this post does not cover such theories, see this blog post for more detail on how Sentience Institute thinks about them.
Bateson (1991) proposed eight features that, taken together, would suggest that an animal is sentient, including the presence of nociceptors, structures analogous to the human cerebral cortex, and pathways connecting the nociceptors to higher brain structures. Varner (2012) proposed a similar set of eight features. Sneddon et al. (2014) proposed 17 features, split into two categories: (1) whole-animal responses to noxious stimuli, such as physiological responses, that differ from responses to innocuous stimuli, and (2) motivational and behavioral change, such as willingness to pay a cost to avoid noxious stimuli. They also encourage considering the features as a whole rather than as indicators in isolation. In the context of phenomenal consciousness, Muehlhauser (2017) proposed 42 features, arguing that because the scientific and philosophical community does not currently have much confidence in any one theory of consciousness, we should consider a wide range of features that capture a range of plausible theories. Anthis (2018) proposed just three features constituting sentience: reinforcement learning, moods, and integration.
Others have applied this third-person feature-based approach to understand whether animals in specific taxa are sentient. The contemporary debate largely concerns invertebrates and fish. Rethink Priorities carried out in-depth research on invertebrate sentience, identifying 53 anatomical, physiological, cognitive, and behavioral features associated with sentience and evaluating the extent to which different entities possess those features. Animal Ethics outlined neuroscientific features associated with invertebrate sentience, such as the number of neurons, the presence of specific brain structures such as the neocortex or midbrain (or their functional equivalents), and the degree of centralization, as well as third-person behavioral features ranging across flexibility, emotion, and sociability.
Braithwaite (2010) concluded that fish are sentient on the basis of third-person features such as the presence of nociceptors, the production of and responses to pain-relieving drugs in response to noxious stimuli, and the activation of certain brain regions taken to be associated with conscious rather than reflexive processing of sensory inputs. The third-person approach is also used to argue against the sentience of animals; Key (2016), for example, argued that the lack of neocortical brain structures or their analogues in fish implies a lack of sentience. This raises the argument that, in addition to considering features indicative of sentience, we should also consider “defeaters” that may invalidate or significantly downgrade our judgments of sentience (Tye, 2017).
Can the feature-based approach be applied to help us understand and make judgments about sentience in artificial entities? A couple of studies have made some attempt to do so: Gamez (2005) proposed a list of eight features for assessing consciousness in artificial entities, where the features were chosen so that artificial entities with more human-like capacities are more likely to be considered conscious, and Pennartz et al. (2019) considered how features indicative of consciousness in nonhuman animals can be applied to artificial entities. Overall, therefore, we think this approach can be usefully applied to the case of artificial entities, but there are several differences from the application to nonhuman animals that need to be considered.
First, when it comes to identifying features associated with sentience in nonhuman animals it is typical to focus on biology. For example, the studies described in the previous section propose neuroanatomical features, such as the absence or presence of nociceptive cells or certain brain structures, and physiological features, such as the production of and response to analgesics. Clearly, these features are not appropriate in the case of artificial entities. Instead, we should focus on analogous features at the functional and algorithmic levels. For example, rather than the presence of physical nociceptors, we should consider the ability to detect harmful stimuli. This shift to focusing on functional and algorithmic rather than biological features requires us to assume that these higher levels are appropriate levels of analysis for understanding sentience, and that sentience can be realized in multiple substrates, not just biological ones. It may also be important to take into account the physical structure of artificial entities, which is important on some theories of consciousness.
Another common strategy to assess sentience in nonhuman animals is to refer to its evolved function. For example, it is argued that pain serves the function of encouraging animals to avoid harmful stimuli, which in turn increases their chances of survival and reproduction, and therefore animals likely experience pain. This type of reasoning will generally be less reliable for artificial entities insofar as evolutionary selection is less influential in their development. At the extreme, artificial entities could be designed in an entirely ad hoc way, for example, with the capacity to experience pain that serves no function, and without the capacity to avoid that pain. We should also extrapolate from evolved behaviors, such as vocalizing pain, with caution, since they may not be relevant to artificial entities. 
A third consideration is that the presence of higher-order capacities in artificial entities, such as self-recognition, is not as strong evidence of sentience as it is in animals, where such capacities are sometimes considered sufficient conditions. This is because artificial entities can be designed to have very strong capabilities on some dimensions but very limited capabilities on others, whereas in animals higher-order capacities tend to be built upon more basic ones. Relatedly, artificial entities can be designed to superficially meet some features while still being unlikely to be sentient. For example, while verbal reports are typically considered strong evidence of consciousness, engineers could easily design a robot that plays a recording of the phrase “I’m in pain!” when it is dropped. On its own, we would not want to take this “verbal report” as evidence that the robot is sentient. As Tomasik (2013) argues, to be judged as sentient it may be that we require an entity to satisfy a range of criteria in a non-superficial way, and the types of algorithms the entity runs to produce its behavior may be important in addition to the behavior itself.
Taking into account the considerations described in the previous section, we can propose features that may be associated with sentience in artificial entities. Given the current lack of scientific consensus on how sentience works, especially its experiential aspect, we do not think it is possible to have a great deal of confidence in a core set of features. We instead propose, in line with Muehlhauser (2017) and Rethink Priorities (2019), that a relatively large number of features, consistent with a range of plausible theories, be considered.
For example, we propose features in the table that cover both theories that emphasize attention and theories that emphasize integration of information. That is not to say we endorse either of these theories, or any other specific theory referred to in the table of features. Nor do we think an entity needs to satisfy all (or even most) of the criteria in the table to be considered sentient. Rather, we think that an entity that satisfies more of the criteria more strongly warrants the attribution of sentience than an entity that satisfies fewer. In the next section we outline a method for making judgments about sentience based on the features in Table 1.
The features we propose fall into two general categories. First are features more closely related to valenced states, such as the detection of harmful stimuli. Second are features related to an entity’s general capacity for experience, such as centralized information processing. Some features, such as attention directed at the source of harmful stimuli, capture both an entity’s capacity for experience and its capacity to be in a valenced state, and therefore fall under both categories.
Note that the features listed in Table 1 are not exhaustive, precise, or distinct. We have considered several of the most prominent theories relevant to sentience, but there are many theories that we have not considered. We have also focused at this point on negatively valenced states, but we note that positively valenced states may also be relevant. More research is needed to refine and potentially expand on this list.
Table 1: Incomplete list of features indicative of sentience in artificial entities
Feature
Reason for inclusion
Category: Valence, Experience, Both
Example sources referring to feature
Reinforcement learning
Reinforcement learning refers to learning via trial and error with one’s environment and reward signals to achieve goals. It is sometimes argued that the reward signal is associated with valenced experience.
Focused/selective attention on source of harmful stimuli
Attention is widely considered to be an important aspect of experiential states. Assuming it is, attention to harmful stimuli implies sentience.
Detection of harmful stimuli
While nociception is reflexive, the capacity to detect harmful stimuli is plausibly a necessary condition for sentience.
Aversive behavioral responses to harmful stimuli
While we may not expect the same behavioral responses as we find in animals, some kind of aversive behavioral responses would be indicative of sentience in artificial entities.
Aversive memory associations with harmful stimuli
This is an important aspect of pain in animals, though it may not be necessary for an artificial entity to form memories in response to painful experiences.
Mood-like states caused by harmful stimuli
If harmful stimuli affect an entity’s overall mood, e.g., by making the entity more pessimistic, this implies both that the entity has valenced states (since mood is an aspect of valence) and that the system is integrated, since multiple parts of the system are affected.
Goal-directed behavior
Goal-directed behavior is the capacity to evaluate the consequences of different actions and to choose actions based on the value of those outcomes. Pennartz et al. (2019) tie this process to consciousness in animals, though they argue it could be done without consciousness in artificial entities.
Verbal reports of experiential states
Some artificial entities will have language capacities, and thus will be able to report on their own internal states. Verbal reports are often considered the strongest evidence of experience in humans. In artificial entities, verbal reports will be more reliable where speech systems draw on information from other cognitive systems (that are processing information in the appropriate ways), rather than responding purely on the basis of statistical models, as current AI systems do.
Centralized information processing
Centralized information processing is widely considered to be important as it allows for disparate information from various cognitive subsystems to be integrated into a unified model of the world. This is analogous to a central nervous system in animals.
Causal interconnections within cognitive system
Some theories hold that interconnections within a cognitive system, particularly feedback connections, are what give rise to experiential states.
Global broadcasting of stimuli made available to multiple cognitive systems
According to Global Workspace Theory, a leading theory of consciousness, we experience stimuli when they are enhanced in a “global workspace” and made available for use by multiple cognitive systems, such as action selection, planning, and decision making.
Self-model
On some views a self-model is necessary for there to be a subject who has experiential states. In its simplest form, this may involve an entity being able to distinguish between itself and its environment.
Reggia, Katz & Huang (2016); Metzinger (2007); Godfrey-Smith (2021)
Flexible cognition and behavior
Flexible cognition and behavior typically require deliberate (and therefore plausibly conscious) processing of information, rather than responding to stimuli in a reflexive way.
Total processing power
Luke Muehlhauser argues, “a brain with more total processing power is (all else equal) more likely to be performing a greater variety of computations (some of which might be conscious), and is also more likely to be conscious if consciousness depends on a brain passing some threshold of repeated, recursive, or “integrated” computations.”
Higher-order representation of mental states
On higher-order theories, higher-order representation of one’s mental states is necessary for sentience. This feature may be measured by capacities such as metacognition.
Functional similarity with humans
Muehlhauser (2017) notes that since we are most confident that humans are conscious, but do not have a strong understanding of how consciousness works beyond the fact that it involves information processing in the brain, it seems reasonable to think that, all else equal, animals whose brains are more neuroanatomically similar to human brains are more likely to be conscious. For similar reasons, we may expect functional similarity with humans to be an indicator of sentience in artificial entities.
Physical structure similarity to humans
On some theories of consciousness, the physical structure of a system is what matters for its mental states, rather than, say, the higher-level functions and algorithms the physical system implements. For example, Koch (2019) argues that causal interconnectivity at the hardware level rather than just the higher algorithmic level is required for artificial sentience. Using similar reasoning as the previous feature, we may expect physical structure similarity to humans to be an indicator of sentience in artificial entities.
Embodiment
Several authors have considered the importance of a body for sentience. For example, Damasio (1999) considers that feelings arise from an interaction between brain and body, and Harnad (1992) extends the Turing Test to include bodily capacities as a way of gaining additional certainty that an entity is sentient.
Susceptibility to cognitive illusions
Being susceptible to cognitive illusions, such as the Müller-Lyer illusion, would indicate that things appear a certain way to an entity, implying that they may have an experience. Note that susceptibility to illusions has been found in some nonhuman animals and existing AI.
To assess sentience based on the features in the table, we propose taking a graded approach. For each additional feature in the table an entity satisfies, and for every increase in the extent to which an entity possesses one of the features, we should increase the degree of sentience we attribute to that entity.
An important consideration is how much weight to give to the different features in our judgments. As Varner (2012) notes, doing so requires a guiding theory, but there is no consensus on the right theory underlying sentience. Following Muehlhauser (2017), we therefore propose that decision-makers take a Bayesian approach reflecting their degree of confidence in different theories. For example, if an individual has a lot of confidence in theories that posit that higher-order cognitive capacities are required for sentience, then higher-order features in the table should be given more weight. We expect that as the philosophical and empirical literature on this topic continues to grow, the empirical uncertainties entailed in this weighting will be reduced, and perhaps the normative disagreements as well, leaving less room for subjectivity.
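As an illustration, this Bayesian weighting can be sketched numerically. All feature labels, theory labels, credences, and weights below are hypothetical placeholders chosen purely for illustration, not values we propose:

```python
# Sketch of theory-weighted feature scoring (all numbers hypothetical).
# Each theory assigns importance weights to features; a decision-maker's
# credences over theories determine the final weights as a simple mixture.

# Credence in each broad theory of sentience (hypothetical).
theory_credence = {"higher_order": 0.3, "global_workspace": 0.5, "integration": 0.2}

# How much each theory weights each feature, on a 0-1 scale (hypothetical).
feature_weights = {
    "higher_order_representation": {"higher_order": 0.9, "global_workspace": 0.2, "integration": 0.1},
    "global_broadcasting":         {"higher_order": 0.2, "global_workspace": 0.9, "integration": 0.4},
    "causal_interconnections":     {"higher_order": 0.1, "global_workspace": 0.3, "integration": 0.9},
}

# Degree to which a given entity exhibits each feature, 0-1 (hypothetical).
entity_scores = {
    "higher_order_representation": 0.2,
    "global_broadcasting": 0.7,
    "causal_interconnections": 0.5,
}

def weighted_sentience_score(scores, weights, credences):
    """Mix each feature's theory weights by theory credence, then combine
    with the entity's feature scores into one graded score in [0, 1]."""
    total, norm = 0.0, 0.0
    for feature, score in scores.items():
        w = sum(credences[t] * weights[feature][t] for t in credences)
        total += w * score
        norm += w
    return total / norm if norm else 0.0

print(round(weighted_sentience_score(entity_scores, feature_weights, theory_credence), 3))
```

An individual more confident in higher-order theories would simply raise the corresponding credence, automatically shifting weight onto higher-order features.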
As noted above, we should also consider defeaters to the conclusion that an artificial entity is sentient. One consideration in this context is the type of algorithms an artificial entity runs to produce its behavior. For example, if an artificial entity’s behavior were entirely determined by a giant lookup table, we might cap our judgment of its sentience no matter how complex or otherwise convincing its behavior.
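A minimal sketch of how a defeater might enter a graded judgment; the function name, cap value, and scores are hypothetical illustrations, not proposals:

```python
def capped_sentience_judgment(feature_score, defeater_present, cap=0.05):
    """Return a graded sentience judgment, capped when a defeater applies.

    feature_score: graded score in [0, 1] from weighing features.
    defeater_present: e.g., behavior produced entirely by a lookup table.
    cap: hypothetical ceiling imposed when a defeater is present.
    """
    return min(feature_score, cap) if defeater_present else feature_score

# A highly convincing behavioral profile is heavily discounted if the
# underlying algorithm is, say, a giant lookup table.
print(capped_sentience_judgment(0.9, defeater_present=True))
print(capped_sentience_judgment(0.9, defeater_present=False))
```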
A typical approach to allocating resources, particularly for individuals and organizations involved in effective altruism, is to estimate, either explicitly or implicitly, the expected value of different actions. This is because the outcomes of our actions, which exist in the future, are inherently uncertain. This uncertainty can be modelled. Similarly, we propose that individuals should use estimates of degrees of sentience as an input when estimating the expected value of interventions affecting artificial entities. This type of approach has been applied to estimate the expected value of working on interventions to benefit nonhuman animals.
Rather than assessing interventions that affect artificial entities today, this method is likely better applied to address the question of interventions that will affect artificial entities in the future, who are more likely to possess features in Table 1. This will involve considering the likelihood that artificial entities with the features described in the table will exist in the future, and the resulting degree of sentience of those entities. Note that, even for a skeptic whose overall judgment about the degree of sentience in artificial entities is very small, given the number of artificial entities that may come into existence in the future, and their potential for exclusion from the moral circle, the expected value of addressing issues related to their wellbeing may still be high.
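As a rough illustration of this expected-value reasoning, the calculation can be sketched with placeholder numbers. Every quantity below is a hypothetical assumption for illustration, not an estimate we endorse:

```python
# Back-of-the-envelope expected value of an intervention affecting future
# artificial entities. All inputs are hypothetical placeholders.

p_entities_exist = 0.5          # P(entities with Table 1 features come to exist)
expected_population = 1e12      # expected number of such entities
degree_of_sentience = 0.01      # graded judgment (even a skeptic's low value)
welfare_gain_per_entity = 1e-6  # expected welfare improvement per entity

expected_value = (p_entities_exist * expected_population
                  * degree_of_sentience * welfare_gain_per_entity)
print(expected_value)
```

The point of the sketch is structural: because the expected population term can be enormous, the product can remain large even when the degree-of-sentience term is set very low.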
However, multiple uncertainties relevant to assessing the expected value of working on artificial sentience remain: What are the different types of entities that will likely exist in the future? What will be their moral statuses and capacities for welfare? How many of these entities will exist? Under what scenarios will they have positive and negative welfare, and how likely are these scenarios to be realized? Are there any interventions we can carry out that will improve their welfare? We intend to explore these questions in future posts.
 Arguably, this problem is compounded by the problem of other minds, and applies to all entities, not just artificial ones. Under this view, consciousness has a first-person nature — it is not possible to get inside an artificial entity and see if they have positive or negative experiences; we can only observe them from the third person.
 See Tye (2017) and Rethink Priorities (2019). Another approach found in the literature takes as its start point specific theories of sentience and makes inferences about which entities are likely to be sentient in the context of those specific theories. See, for example, Barron and Klein (2016) and Carruthers (2020).
 Given that sentience can be considered as a specific kind of consciousness, this research is still highly relevant.
 There is some evidence that these analogues do exist in some species of fish. Moreover, whether neocortex-like structures are required for sentience is itself a contested issue (Merker, 2007).
 This list includes features such as how close the processing speed and size of the artificial system are to human levels.
 The principle of substrate independence and the view that implementing the right types of algorithms would be sufficient for mental states are important in the fields of cognitive science and artificial intelligence, though they are not universally accepted.
 By “physical structure,” we mean roughly the lowest level on David Marr’s levels of analysis. As an example of emphasizing the importance of this level, Koch (2019) argues that present-day computers cannot be conscious no matter what higher-level algorithms they implement because their physical hardware does not have anywhere near the right degree of causal interconnectivity. He notes, however, that future computers could be designed with the right physical structure for consciousness.
 It may be useful, however, in cases where artificial entities are modelled on biological brains, such as with whole brain emulations, or in some cases where evolutionary algorithms are used.
 Vocalizing pain may be useful for some animals to alert others of their situation. Even if it is useful for an artificial entity to vocalize their pain in a similar way, they may not be designed with the capacity to do so.
 This reasoning also applies when extrapolating from other, non-evolved behaviors in animals to artificial entities.
 Many theories of consciousness focus on this general capacity for experience (for example, visual experience) rather than valenced experience per se. These features are also plausibly evidence of sentience.
 Alternatively, if one thinks of sentience as an all-or-nothing capacity rather than a graded capacity, one could apply a similar procedure to make judgments about the probability that an artificial entity is sentient. They may still wish to consider the degree of sentience as an additional factor (e.g., a probability distribution over different degrees).