A Robot Model of OC-Spectrum Disorders: Design Framework, Implementation, and First Experiments

Computational psychiatry is increasingly establishing itself as a valuable discipline for understanding human mental disorders. However, robot models and their potential for investigating embodied and contextual aspects of mental health have been, to date, largely unexplored. In this article, we present an initial robot model of obsessive-compulsive (OC) spectrum disorders based on an embodied motivation-based control architecture for decision-making in autonomous robots. The OC family of conditions is chiefly characterized by obsessions (recurrent, invasive thoughts) and/or compulsions (an urge to carry out certain repetitive or ritualized behaviors). The design of our robot model follows and illustrates a general design framework that we have proposed to ground research in robot models of mental disorders and to link it with existing methodologies in psychiatry, notably in the design of animal models. To test and validate our model, we present and discuss initial experiments, results, and quantitative and qualitative analyses regarding the compulsive and obsessive elements of OC-spectrum disorders. While this initial stage of development only models basic elements of such disorders, our results already shed light on aspects of the underlying theoretical model that are not obvious simply from consideration of the model.


INTRODUCTION
The growing field of computational psychiatry (Adams, Huys, & Roiser, 2016; Corlett & Fletcher, 2014; Huys, Maia, & Frank, 2016; Montague, Dolan, Friston, & Dayan, 2012; Stephan & Mathys, 2014; Wang & Krystal, 2014) includes within its aims the application of computational techniques to understand, and better treat, human mental disorders. This includes the use of simulations to explore and compare theorized mechanisms for diseases. However, to date, simulations of mental disorders have largely been "disembodied," that is, the simulation has been divorced from sensorimotor interaction with the environment (see Yamashita and Tani, 2012, and Idei et al., 2018, for rare counterexamples). To address this gap, we advocate the use of autonomous robots in modeling mental disorders (Lewis & Cañamero, 2017) to augment existing computational psychiatry techniques.
Robot models complement existing biological and computational models in a number of ways. Compared to purely computational models, robots, like animals, allow us to model complete systems, including a closed-loop interaction with the real environment. Compared to animal models, commonly used in psychiatric research, robot models allow precise operationalization of theoretical models through implementation, increased replicability and control of experiments, and the ability to make controlled manipulations that may not be possible in animals for ethical, methodological, or practical reasons. Such controlled manipulations include, for example, changing the values of specific parameters in the robot model; in relation to OCD, this could mean testing different values of a threshold controlling the tolerance to perceptual errors. Another example would be introducing errors analogous to brain lesions, but in a highly precise and reproducible manner. A third example would be introducing communication errors between components of the controller, for example, the addition of noise to signals in a neural network corresponding to errors in top-down/bottom-up communication used by Yamashita and Tani (2012) in a robot model of schizophrenia. In this and other cases, the analogous controlled manipulations in animals may be inaccessible because we either do not know which specific elements or connections to manipulate, or we do not have a technique for manipulating them consistently, or we cannot manipulate them without causing side effects in other parts of the system (e.g., if a drug used also binds in another part of the body). However, when carrying out such manipulations in robots, we should have a clear hypothesis regarding the existence of analogous dynamics or systems linked to the condition that we are seeking to understand in human patients (construct validity, see section "Evaluation of the Robot Model (Stage 7)").
Since the use of such robot models is a new area of research, we seek to establish a design framework (methodology, guidelines, and evaluation criteria) to guide research (Lewis & Cañamero, 2017). Our motivation in choosing these guidelines is to ensure that we learn from the extensive experience of researchers using animal models, to ground the research in theoretical models, and to guide research toward applications. In this article, we present the initial development and first experiments for a robot model of obsessive-compulsive disorders following this framework.
Obsessive-compulsive disorder (OCD) is a disabling mental health disorder characterized by obsessions (recurrent, invasive, often unpleasant thoughts) and/or compulsions (a strong urge to carry out certain repetitive or ritualized behaviors, such as handwashing or excessive checking). OCD is considered part of the obsessive-compulsive (OC) spectrum of disorders, which also includes conditions such as trichotillomania (TTM, pathological hair pulling), pathological skin picking (PSP), body dysmorphic disorder (BDD), and tic disorders such as Tourette's syndrome (American Psychiatric Association, 2013). A cardinal feature of these disorders is the performance of compulsions, which can be defined as repetitive stereotyped behaviors, performed according to rigid rules and designed to reduce or avoid unpleasant consequences but which, as a consequence of the repetition, become a source of distress and functional disability (Fineberg et al., 2018). The behaviorally similar condition of obsessive-compulsive personality disorder (OCPD) is characterized by excessive perfectionism and desire for "orderliness" (e.g., a needless desire for symmetry) and control. The main difference between OCD and OCPD is that OCPD is part of the person's personality and therefore perceived by the person as normal rather than unwanted. Whether OCPD should be considered within the OC spectrum is an open question (Fineberg, Reghunandanan, Kolli, & Atmaca, 2014).
We present and test experimentally a cybernetics- and ethology-inspired autonomous robot control architecture for decision-making that can display both adaptive (functional) behavior and nonfunctional decision-making that presents similarities with compulsions and obsessions in OCD.
In the remainder of the article, we first review different types of models of mental disorder, then give an overview of our design process, before illustrating it with our development of a robot model for OCD. We then describe our initial experiments and discuss our experimental results.

Types of Models of Mental Disorders
In previous work (Lewis & Cañamero, 2017), we described four types of models commonly used in research into mental disorders, which we recap here:
1. A conceptual model of a mental disorder is a theoretical construct that links underlying causes (etiology), either proposed or observed, with observed symptoms and correlates.
A conceptual model serves as a framework for understanding and should have explanatory and predictive power with respect to the condition being modeled. There is not necessarily one "true" model, since different models may be complementary, having different scope, emphasis, level of abstraction, or uses.
A conceptual model may be associated with one or more specific implementations (in the sense of the three types listed in this section). However, this is not always the case; for example, Pitman's cybernetic model of OCD (Pitman, 1987) had been so far, to our knowledge, a purely conceptual model. A conceptual model without an implementation can nevertheless have applications to guide research into potential treatments (notably, the initial conception of Exposure and Response Prevention as a treatment for OCD was based on a theoretical formulation; Meyer, 1966), to provide an explanatory framework for observations, or as a theoretical basis for future research, for example, the Cognitive-Energetic Model of attention deficit hyperactivity disorder (Sergeant, 2000).
2. An animal model of a mental disorder is a nonhuman animal used to study brain-behavior relations with the goal of gaining insight into, and to enable predictions about, these relations in humans (van der Staay, 2006). Animal models may be induced by genetic manipulation, drugs, or environmental manipulation. Alternatively, they may be naturally occurring. They have the advantage that they model a complete system (organism and environment) and use a real animal. However, there are limits to how closely a nonhuman animal can be used to model human mental disorders (Geyer & Marcou, 2002).
3. A computational model of a mental disorder is a realization, or partial realization, of a theoretical model in a computer. The field of computational psychiatry includes within its scope the development of computational models of psychiatric disorders (Huys et al., 2016). These models have the advantage of being highly specified, and so any results should be replicable and can be analyzed in detail. However, owing to the complexity of implementing such a model, they are typically only partial implementations (e.g., of a neurological subsystem, as in the model of OCD in Maia and Cano-Colino, 2015), or they work at a relatively high level of abstraction (such as reinforcement and Bayesian learning models; for an overview, see Huys et al., 2016). In addition, they do not necessarily include any behavioral element, a true closed-loop interaction with the environment, or the effects of contextual and environmental elements.
4. A robot model of a mental disorder embeds a computational and hardware realization of a conceptual model in an embodied, interacting robot and its environment. Like an animal model, it models a complete system (agent and environment), but using an artificial agent rather than an animal.
While conceptual, animal, and computer models are widely used in research, there has thus far been relatively little use of robot models (some of the few examples are Idei et al., 2018; Yamashita & Tani, 2012). However, robot models share the advantages of computational models in terms of specificity and controllability, while, like animal models, taking into account the agent-environment interaction. We thus advocate the development of robot models of mental disorders, to complement existing models by offering more controllable agents in a complete system, in which theoretical models can be more precisely implemented (Lewis & Cañamero, 2017).
In reality, the different categories of model are not clear cut. For example, the signal attenuation model for OCD (Joel, 2006), outlined later, combines both a theoretical and an animal model.

Design Framework for Robot Models
We have followed the iterative design process shown in Figure 1 for the development of our robot model, and we describe each stage as we present that development. This process is based on a design process for animal models of behavioral disorders (van der Staay, 2006; van der Staay et al., 2009), with modifications to adapt it for use with robots; this basis provides us with a well-established evaluation framework used by the clinical research community, one that is generally relevant to the development and evaluation of embodied models. The process that we have followed covers all of its stages, from the selection of a theoretical model through experimental evaluation and refinement.
Figure 1. Flowchart for an iterative process for designing a robot model. This is a modified version of the chart from Lewis and Cañamero (2017), which is based on, and closely follows, the process described in van der Staay (2006) and van der Staay et al. (2009). Numbers in circles are to facilitate references to individual steps in the text. Note that, even after the robot model is accepted for clinical use (Stage 10), it is envisioned that robot model development might continue iteratively and that incremental improvements will be made with each loop through the process.
Computational Psychiatry
We have here refined the process that we proposed in Lewis and Cañamero (2017) by changing the "accept the model" end point into part of the flow, to allow the explicit inclusion of "us[ing] the accepted model for clinically focused studies," while also reflecting the fact that a robot model (indeed, any computational model) is always open to further development and refinement.
In this article, we illustrate this process with the development of a robot model for OC-spectrum disorders.

Theoretical Model Selection (Stage 1)
The first stage in the development process is to select a conceptual model (or multiple complementary conceptual models) to serve as the basis of the robot model. While a new conceptual model could be created at this stage, there would not then be the opportunity for the wider mental health community to review it before implementation. In practice, since the process of developing the robot model requires the computational and hardware implementation of the model, new components of the model will be created during development.
A variety of theoretical models exist for OC-spectrum disorders (see, e.g., Fineberg, Chamberlain, Hollander, Boulougouris, & Robbins, 2011, for a discussion of various models). Our choice to conceptualize OCD as a disorder of decision-making (Sachdev & Malhi, 2005) and of a specific conceptual model (based on cybernetics, as we explain later) was linked to our (LC and ML's) existing research in robotics, which has extensively investigated decision-making (action selection) in autonomous robots from a perspective that is close to cybernetics. Our previous work in decision-making (behavior selection) in autonomous robots, inspired by ethology and cybernetics, investigates the adaptive value of decision-making strategies, measured in terms of contribution to maintenance of homeostasis, in motivated and goal-oriented behavior in robots (Cañamero, 1997; Cañamero & Avila-García, 2007; Lewis & Cañamero, 2016). In that work, the overall behavior of the robot was changed in different ways through controlled manipulation of the perceptual element of the perception-action loop, namely, through modulation of perceptual properties of incentive stimuli. Alongside adaptive benefits resulting from such modulation, we observed maladaptive behaviors (in particular, excessive persistence in behavior execution) that bore a similarity with decision-making problems in OCD and other conditions, such as addictions. Given the similarities between our model of decision-making and the cybernetic (Pitman, 1987) and signal attenuation models (Joel, 2006) of OCD, we selected these as the conceptual models to be used as the basis for our robot model of OCD. As we shall discuss later in the article, in addition to the behavioral "compulsion" aspect stressed by animal models, both the cybernetic model of OCD and our specific models of motivation allow us to consider the internal "obsession" aspect of OCD, since elements of the model can be viewed as "thoughts" even when they do not result in action.
Pitman's cybernetic model (Pitman, 1987) takes the cybernetic view of behavior as an attempt to control perception (Powers, 1973). In the cybernetic framework, behavior (of natural or artificial systems) is the result of attempting to correct "perceptual errors." Such errors indicate a mismatch between an "actual" perceptual signal (such as sensed room temperature in the archetypal example of a thermostat) and an internal reference, ideal, or "target" value that the system aims to reach (the set temperature that the thermostat tries to maintain). The actual signal can be external, such as in the example of a thermostat, although it can also be an internal signal, such as perceived hunger. Pitman specifies that, in the field of control systems theory, the reference signal is an internal signal (e.g., satiety signal after satisfaction of hunger). This does not mean that the target value is fixed, since it may adapt to some extent to adjust to internal or external environmental factors. The core element of such cybernetic control systems is an internal comparator mechanism that computes the mismatch between the actual (measured) value and the target value. This difference is conceptualized as an error that provides a signal (called the error signal) for the system to trigger behavioral output that aims to correct that error (e.g., in the example of the thermostat, activating the heating mechanism). Following this model, Pitman conceptualizes OCD in terms of behavioral control of perception and proposes that "the core problem in OCD is the persistence of high error signals, or mismatch, that cannot be reduced to zero through behavioral output" (Pitman, 1987, p. 336), for example, an erroneous ever-present perception that the hands are contaminated, leading to compulsive washing that fails to make the erroneous perception disappear. Pitman argues that his model can explain features of OCD, such as perfectionism, indecision, need for control, overspecification, and obsessive thoughts, with the presence of the error signal itself being subjectively experienced as a sense of incompleteness and doubt. He further considers three possible sources for the persistent error signal: conflict between multiple control systems, comparator defect, and attentional disturbance.
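As an illustration only, the comparator loop described above can be sketched in a few lines of Python. This is not the controller used on the robot: the function names, the proportional "behavior," the gain, the tolerance, and the numeric values are all assumptions chosen for the example. The sketch contrasts a loop whose perceived target can be reached with one in which a comparator defect places the perceived target beyond what behavior can achieve, so that the error signal persists.

```python
def error_signal(actual, target):
    """Comparator: mismatch between the target value and the actual signal."""
    return target - actual


def run_control_loop(perceived_target, max_reachable=20.0, start=15.0,
                     gain=0.5, tolerance=0.01, steps=50):
    """Simulate a one-variable control loop (all parameters hypothetical).

    Returns (active_steps, final_value): the number of steps in which
    corrective behavior was produced, and the final actual value.
    """
    actual = start
    active_steps = 0
    for _ in range(steps):
        error = error_signal(actual, perceived_target)
        if error > tolerance:
            # Behavioral output proportional to the error; the environment
            # limits how far the actual signal can be pushed.
            actual = min(actual + gain * error, max_reachable)
            active_steps += 1
    return active_steps, actual


# Reachable target: the error falls below tolerance and behavior stops.
healthy_steps, healthy_value = run_control_loop(perceived_target=20.0)

# Assumed comparator defect: the perceived target exceeds what behavior can
# achieve, so the error never reaches zero and behavior never stops.
defect_steps, defect_value = run_control_loop(perceived_target=22.0)
```

In the first call the corrective behavior ceases after a few steps; in the second it is produced on every step, which is one simple reading of a "persistent high error signal that cannot be reduced to zero through behavioral output."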
The signal attenuation model for OCD is a theory-based animal model built on the proposition that "compulsive behaviors result from a deficit in the feedback associated with the performance of normal goal-directed responses" (Joel, 2006, p. 381). In the associated experimental animal model, compulsive behavior is produced by the attenuation of the informational value of an external signal (e.g., light or sound) that has been linked, by training, to the successful execution of some action (e.g., lever pressing for food). A more generalized view of this model would also include internal signals, such as interoceptive signals for satiety after eating or drinking. Indeed, internal signals are important in goal-directed and motivated behavior (Damasio, 2010; Frijda, 1986; Lehner, 1996; Panksepp, 1998; Pessoa, 2013), for example, to provide targets and "stop messages" or to monitor performance. However, internal signals are not normally accessible (for technical, practical, or ethical reasons) in studies involving animal models, which must resort to the use of external signals and their association (typically through learning) with externally observable behavior.
We therefore select the cybernetic model of Pitman and the signal attenuation model (as a theoretical model, rather than its specific implementation in animals) as the basis for our robot model. These are brought together within the framework of our motivation-based robot controller. We will implement our robot model of OCD using an internal signal deficit (faulty interoception); this is something that is much simpler to do in robots than in animals because of our more complete control of the robot's internal decision-making and sensing mechanisms. The internal signal deficit falls within the category of "comparator defect" in Pitman's possible sources for the error signal.

Consensus Stage (Stage 2)
This stage seeks conceptual clarity regarding the selected conceptual model and precision in the use of its associated notions and principles. We refine and seek consensus, typically on concepts, criteria, definitions, and assumptions underlying and associated with the conceptual model of the previous stage.
From our selected cybernetic and signal-attenuation conceptual models, the key notions with respect to OC-spectrum disorders that are most relevant for our robot model are as follows:
Compulsions. According to the classic text of Fish (Fish, Casey, & Kelly, 2008), compulsions are obsessional motor acts that may result from an obsessional impulse or thought. To attempt to clarify this, we will consider a behavior to be compulsive if it is executed repetitively and persistently, even though it might not have a clear function, or it may even be maladaptive or unwelcome to the individual. We note that some habits fall under this definition of compulsive behaviors; however, the main difference between them might reside in the context in which they are executed.
Compulsive grooming. In research using animal models, compulsive self-grooming is widely used to research OC-spectrum conditions. It is induced in mouse models by gene knockout, and its link with human OC-spectrum disorders is supported by the similar responses to pharmaceutical interventions (Camilla d'Angelo et al., 2014). Grooming is considered related to the human conditions of TTM and PSP due to the high face validity. In this case, we consider grooming behavior to be compulsive if it occurs to the extent that it either directly damages the animal or causes the animal to neglect other needs.
Obsessive thoughts. Obsessive thoughts are a defining feature of OCD. According to Fish et al. (2008), obsessions are thoughts that persist and dominate an individual's thinking, despite the individual's awareness that the thought is without purpose or no longer relevant or useful.
Stop signals. These are internal or external signals that indicate that a goal has been achieved, a need satisfied, or a behavior successfully executed. Several models of OCD, namely, the cybernetics and signal attenuation models, postulate problems with stop signals linked to compulsive behavior. In the cybernetic model (Pitman, 1987), an error signal is present that, when it becomes zero, signals that the behavior that was being executed to correct the error can stop. In the signal attenuation model, a signal indicates the successful execution of a behavior; compulsive behavior then results from an "attenuation" of that signal, which weakens the perception of the behavior's success and therefore that it can stop. The signal attenuation model is typically presented in the context of an experimental paradigm (Joel, 2006) in which animals are trained on an external signal and this signal is "attenuated" by reducing its value as a signal. However, we will consider it more generally, and in our robot experiments, the equivalent of the stop signal will be an internal signal.

Selection Stage (Stage 3)
At this stage, we select the (endo)phenotypes of interest for our model. These may be behavioral or internal phenotypes that we generate explicitly or that we believe may be generated as a consequence of how the model works. Since we were starting the development of our model, we chose to model one of the simpler conditions in the OC-spectrum: compulsive self-grooming. While compulsive self-grooming is our behavioral phenotype of interest, we are also interested in endophenotypes, in particular, obsessions, a major subjective symptom of OCD. The endophenotypes that we can study depend on the nature and interaction dynamics of the robot controller. In our case, our robot controller uses competing motivational systems that vary over time as a function of the robot's interaction with the environment and the dynamics of its embodiment. In this model, we can use these motivational states as an indicator of obsessions. Such subjective symptoms are not easy to analyze in animal models, where access to the internal state is limited. The list of phenotypes of interest may expand on subsequent iterations.

Deduction Stage (Stage 4)
At this stage, we create operational definitions of concepts to be used in this iteration of the development. In some cases, simplifications of concepts may be required.
First, we need to describe briefly how our robot model will work. We will have an autonomous mobile robot that tries to survive in an environment, making decisions about how to use the resources available to satisfy its needs. It will be endowed with some internal physiological variables (e.g., energy), the values of which will change over time as the robot interacts with the environment, and which may fall out of the range of permissible or viable values (Ashby, 1960), resulting in the robot's "death." The robot will also be able to self-groom through interaction with appropriate objects in the environment. We will analyze the robot's behavior and performance in terms of metrics to measure "viability" and "well-being" of the robot, as well as statistics about its behaviors. These metrics (to be described in sections "Metrics" and "Testing Stage: Analysis of Experimental Results (Stage 6b)") will be calculated from the aforementioned internal physiological variables that constitute the internal state of the robot.
In this context, taking the concepts from the consensus stage, we refine them (for this iteration) as follows:
Adaptive (maladaptive) behavior in the robot is behavior that positively (negatively) affects the performance of the robot, as measured by the viability and well-being metrics.
Compulsive grooming is repeated self-grooming to the extent that it is maladaptive.
Obsessions are persistent states in the robot's internal decision-making process that have no benefit, either because they cannot be acted upon or because acting on them will have no benefit in terms of the robot's viability or well-being.
The error signal of a physiological variable is the mismatch (difference) between the current value and the target (ideal, reference) value.
The perceived error signal of a physiological variable is the robot's "sensed" difference between the current value and the target value (in our case, the perceived error signal may be different from the actual error signal [section "Modeling Compulsive Behavior"], and our robot's action selection code uses the perceived value).
Signal attenuation is a decrease in the strength or salience of an internal or external cue. In the signal attenuation animal model, this cue is an external cue to indicate that a behavior has been successfully executed and the next stage in a chain of behaviors can be started. However, in our case, we use an internal variable that can have different target values under different conditions, leading to different error signals and hence different signals that a behavior (grooming) has had sufficient effect.
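The distinction between the error signal and the perceived error signal can be made concrete with a short Python sketch. It is a reading of the definitions above under stated assumptions, not the robot's action selection code: the function names, the zero tolerance, and the perceived target of 1200 are hypothetical, and the target of 1000 is used purely as an illustrative value.

```python
ILLUSTRATIVE_TARGET = 1000.0  # illustrative target value for one variable


def error_signal(value, target=ILLUSTRATIVE_TARGET):
    """Actual error: mismatch between the target and the current value."""
    return target - value


def perceived_error_signal(value, perceived_target):
    """Error as "sensed" by the agent, computed against its perceived target."""
    return perceived_target - value


def grooming_has_sufficed(value, perceived_target, tolerance=0.0):
    """Internal stop signal (hypothetical): true once no error is perceived."""
    return perceived_error_signal(value, perceived_target) <= tolerance


value = 1000.0  # the variable is actually at its target

# Matching targets: no error is perceived, so grooming can stop.
stop_when_matched = grooming_has_sufficed(value, perceived_target=1000.0)

# Assumed altered perceived target: an error of 200 is still perceived even
# though the actual error is zero, so the stop signal is never produced.
stop_when_altered = grooming_has_sufficed(value, perceived_target=1200.0)
```

Because decisions are driven by the perceived value, the second case keeps motivating a behavior that can no longer improve the actual state of the variable.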

Build Robot Model of OC-Spectrum Disorders (Stage 5)
To begin this section, let us highlight the features and advantages provided by an (embodied) robot model versus a computer simulation of an agent. In an embodied robot model, the external environment (and the way it is perceived by the robot) is as important as the internal controller in producing the robot's behavior. The environment, in addition to posing specific decision-making problems to the robot, provides the context through which the robot's behavior links back to and modifies its perceptions, closing the perception-action loop (Brooks, 1991a; Pfeifer & Scheier, 2001; Powers, 1973). Given the same behavior control software and the same internal state of the robot, the behavior of the robot might be completely different depending on factors concerning its relation with the environment, such as what is happening in the environment at a particular time, its ambiguity and unpredictability, what the robot perceives of it, imprecisions in the perception-action loop (e.g., in the case of robots, the potential "noise" coming from sensors or actuators), how the robot can act on the environment and how the robot and the environment interact, how what is happening in the environment affects the actions of the robot, the opportunities for interaction that the environment "affords" to the robot, or the place of the robot in the ecological niche. The dynamics of such highly complex interactions cannot be fully modeled in a simulator, since a simulator cannot capture the full complexity of the real world and its effects on an agent. Since we are interested here in the dynamics of the pathology, and this is something that occurs in interaction with the real world, we advocate the use of a robot model situated in and in interaction with the physical world.

Robot Hardware
For our initial model, we selected a simple robot well suited to prototyping and research of an exploratory nature: the Elisa-3 (see Figure 3). This is a small, round, two-wheeled Arduino-based robot, 5 cm in diameter and 3 cm in height. It is equipped with a ring of eight infrared (IR) "distance" sensors with a range of approximately 5 cm and four downward pointing IR "ground" sensors. These sensors provide the robot with a rudimentary (coarse and noisy) capability to detect both the proximity of objects around it and dark and light areas on the ground. It additionally has radio communications to exchange messages with a PC, which we use to log data for quantitative analysis of results. Finally, it has colored LED lights on its top, which we used to visually signal its internal activity.
Since this robot has limited capabilities for manipulation and perception of external objects, we model grooming by having it "rub" its side sensors against objects in the environment to improve its (simulated) "integument": the state of its external surface, analogous to the state of an animal's fur or feathers.

Environment
For the purposes of data collection and analysis, we have placed the robot in a small walled area containing objects appropriate to the sensorimotor capabilities of the robot. Our environment had to support both "healthy" and pathological behaviors. We therefore placed in the environment a number of resources, namely "energy sources" (light patches on the floor) and "grooming posts" (plastic pipes), that could provide the means for the robot to satisfy its survival-related needs (energy) and its other needs (grooming for maintenance of integument), but which could also provide the opportunity for pathological behavior.
An internal "integrity" variable keeps track of the (simulated) physical integrity of the robot. It decreases following collisions and other types of contact with objects (detected by the distance sensors); these include the arena walls and the grooming posts placed within the arena. If this damage causes the robot's integrity to fall to zero or below, the robot will "die" and stop in place. In the absence of collisions, the integrity would slowly increase as the robot "heals." In the architecture of the robot, the integrity variable is linked with the robot's motivation to avoid objects, including the grooming posts, providing an internal conflict with the motivation to groom. This gives both a cost to the grooming behavior and a "cue" to stop. This element of the architecture allows us to investigate the extent to which the grooming behavior is compulsive, specifically, the extent to which it continues even though it directly damages the robot (see the "Consensus Stage (Stage 2)" section, compulsive grooming). In healthy (adaptive) decision-making, the robot will alternate between grooming (or seeking a grooming resource) and feeding (or seeking an energy resource). In the compulsive behavior situation, the robot will groom to the extent that it adversely affects its survival, either because its energy level falls too low or because it damages itself through contact with the grooming post.
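The dynamics of the integrity variable described above (damage on contact, slow healing otherwise, death at zero) can be sketched as follows. This is a minimal illustration and not the robot's firmware: the class name, the per-step update, and the damage and healing rates are hypothetical, since the text does not give them; only the zero fatal limit and the upper bound of 1000 come from the article.

```python
from dataclasses import dataclass


@dataclass
class Integrity:
    """Simulated physical integrity with hypothetical update rates."""
    value: float = 1000.0
    max_value: float = 1000.0
    contact_damage: float = 25.0  # assumed loss per step in contact
    heal_rate: float = 1.0        # assumed gain per step without contact
    alive: bool = True

    def step(self, in_contact: bool) -> None:
        """Advance one time step given whether an object is being touched."""
        if not self.alive:
            return  # a "dead" robot stops in place; nothing changes
        if in_contact:
            self.value -= self.contact_damage
        else:
            self.value = min(self.max_value, self.value + self.heal_rate)
        if self.value <= 0.0:
            self.value = 0.0
            self.alive = False


# Uninterrupted contact with a grooming post eventually kills the robot.
compulsive = Integrity()
for _ in range(40):
    compulsive.step(in_contact=True)

# Brief contact followed by time away from objects lets the robot "heal."
healthy = Integrity()
for _ in range(4):
    healthy.step(in_contact=True)
for _ in range(200):
    healthy.step(in_contact=False)
```

The sketch makes the cost of grooming explicit: every step spent in contact with a grooming post trades integument maintenance against integrity.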

Robot Model of Obsessive-Compulsive Disorders
As we have seen, the theoretical cybernetic model that we have selected proposes that "the core problem in OCD is the persistence of high error signals … that cannot be reduced to zero through behavioral output." To operationalize this, our architecture will act on the basis of the perception of internal error signals, combined with (perceived) external cues. By manipulating internal parameters within the controller to create cases where the error signal remains present, we can then test and explore the theoretical model. To facilitate systematic analysis of experimental results, we test the robot in a "two-resource problem" (Spier & McFarland, 1997), used in ethology and robotics as the simplest decision-making, or action selection, scenario. As its name suggests, in this scenario, an agent (animal or robot) must autonomously decide which of the two resources available in the environment it should consume in a timely fashion to satisfy its survival-related needs successfully. To focus on the dynamics of the perception-action loops that are proposed as the "core problem" in OCD, our robot does not include other elements, such as memory, learning, or map building.

Software Behavior Control Architecture
The specific robot model and action selection architecture that we have implemented draw on our previous work on motivation-based robot behavior controllers (Cañamero, 1997; Cañamero & Avila-García, 2007; Lewis & Cañamero, 2014, 2016), while also being inspired by animal models of OC-spectrum disorders.
In the robot architecture used in this study, four competing motivations guide the behavior of the robot. These motivations are urges to action determined by a combination of the four corresponding internal homeostatically controlled "physiological" variables, which provide the robot with "needs," and by the robot's perception of the environmental resources that can be used to manage those variables. The decision-making behavior control software provides the robot with strategies to prioritize and satisfy these needs.
An overview of the behavior control (also known in the literature as action selection) mechanism is shown in Figure 2. We describe the components of the architecture (the high-level design of the software) in the following subsections.

Physiological Variables
Our robot has four homeostatically controlled physiological variables, shown in Table 1: energy, integrity, and two integument variables (one for each side). The physiological variables take values in the range [0, 1,000], with 1,000 being the ideal value in all cases. The variables change both over time and as a function of the robot's interactions with its environment, reflecting the current state of the robot. Following a model of homeostatic control, the difference between the actual value and the ideal value of each variable generates an error signal indicating the magnitude of the mismatch (in this case, deficit).
Two of the physiological variables (energy and integrity) have a fatal limit of zero (the robot dies if the value falls to zero). The two other variables, related to "integument," can be thought of as analogous to an animal's fur or feather condition: something that needs to be maintained for viability (e.g., waterproofing, efficient flight) but that does not directly cause death if it falls too low. In our robot implementation, low values of the integument variables have no physical consequences for the robot (e.g., they do not affect its physical integrity, its travel speed, or any other aspect), but they will trigger a motivation to maintain the variable within a good range of values (i.e., to correct the error between the actual and ideal values of this variable) by grooming.

Sensors, Cues, and Motivations
The robot uses its infrared distance sensors and ground sensors to detect obstacles, grooming posts, and energy resources in the environment. These correspond to environmental cues or incentive stimuli that influence the motivational states of the robot to avoid (obstacles), groom (at grooming posts), or consume (energy resources). The numerical size of the perceived cue is in the range [0, 100] and is determined by the sensed distance of the obstacle or by the color detected by the ground sensors for the energy resources and grooming posts.
Following a classical model in ethology, we use the long-standing concept of motivational states (Colgan, 1989), defined in terms of the drives set by the deficits or errors of the internal variables, combined with external environmental cues (incentive stimuli). Our robot has four different motivations, each linked to the satisfaction of a physiological variable (see Figure 2). The internal drives and the external incentive cues combined provide a level of intensity to each motivation at each point in time, which reflects its relevance to the current situation. The motivational intensity is calculated according to the formula proposed in Avila-García and Cañamero (2004) (modified from a classical formula in ethology; Tyrrell, 1993, p. 139):
$$\text{motivation}_i = \text{deficit}_i + (\text{deficit}_i \times \alpha \times \text{cue}_i) \tag{1}$$
where $\text{deficit}_i$ (the error signal) is the difference between a variable's current value and its ideal value as perceived by the robot (see section "Modeling Compulsive Behavior"), $\text{cue}_i$ is the size of the corresponding cue, and $\alpha$ is a factor that scales the size of the exteroceptive component. In our experiments, $\alpha$ equals 0.05. This value was determined empirically in pretrials so that a robot with a realistic target value (the nonpathological baseline, Condition 1 in our experiments) persists long enough in satisfying its needs to survive in the environment.
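As an illustration of how internal deficit and external cue combine, the following minimal Python sketch computes a motivational intensity. It assumes the multiplicative form of Avila-García and Cañamero (2004), motivation = deficit + deficit × α × cue; the function and variable names are ours and are not taken from the robot's software.

```python
ALPHA = 0.05  # scaling factor for the external cue, as stated in the text


def motivation_intensity(deficit: float, cue: float, alpha: float = ALPHA) -> float:
    """Assumed form: motivation_i = deficit_i + deficit_i * alpha * cue_i.

    deficit: perceived error of the physiological variable (target - value).
    cue: size of the perceived incentive stimulus, in [0, 100].
    """
    return deficit + deficit * alpha * cue


# Hypothetical example: integument at 400 with a realistic target of 1,000.
deficit = 1000 - 400
print(motivation_intensity(deficit, cue=0))    # no grooming post perceived
print(motivation_intensity(deficit, cue=100))  # grooming post fully perceived
```

Under this form, a cue amplifies an existing deficit but cannot create a motivation on its own: with a deficit of zero, the intensity is zero whatever the cue.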
As the values indicating the intensity of the motivations change over time, depending on the external and internal perceptions, at certain points, the motivation with the greatest intensity (i.e., the most pressing need) will be overtaken. This will result in a change of motivational state and hence of the behavior executed to satisfy it, and it could be viewed as an analog for a "stop signal" for the current behavior.

Behaviors
The robot has a number of discrete behaviors of different types (rounded boxes in Figure 2). The execution of some of these behaviors allows the robot to directly satisfy its motivations, and hence correct the errors of the physiological variables, while other behaviors allow the robot to "search" (move about the environment avoiding objects) for the resources required to satisfy these needs. Let us note that the robot will only move around the environment or execute a behavior if at least one of its motivations is active.
We have grouped the behaviors into four "higher level" behavioral subsystems, each one linked to a motivation: groom left group, groom right group, eat group, and avoid. These behavioral subsystems (except for avoid) are composed of smaller, simpler behaviors, which can be executed independently, simultaneously, or sequentially depending on the state of the robot and the external stimuli detected. For example, the eat behavioral group is composed of the consummatory (goal-achieving) behavior "eat," which is executed if the robot is hungry and food is detected and located at the mouth; "orient to food," which is executed if the robot is hungry and food is detected nearby; and "search (for food)," which is an appetitive (goal-seeking) behavior that will make the robot wander around the environment until food is detected. Searching behaviors (for an energy resource or a grooming post) involve the robot wandering around the environment (i.e., traveling in random directions, while avoiding objects), typically until the incentive stimulus of the motivation that triggered the search is found. Searching does not involve any knowledge of the environment on the part of the robot and does not occur in any particular direction. The only information about energy resources and grooming posts that the robot uses in this search is the ability to recognize them when it encounters them.

Action Selection
The robot controls its behavior as follows: At each time step (100 ms), the robot recalculates its four motivations and sorts them from highest to lowest. This order determines the order in which it prioritizes the satisfaction of its physiological variables in that action selection step. To satisfy the motivations, we use a slightly modified "winner-take-all" action selection policy, as follows. The robot will try to satisfy the highest ranked motivation (the "winner motivation") first. To do so, the winner motivation triggers the behavioral subsystem linked to it, and one or more of the simpler behaviors that constitute this subsystem (nested rounded boxes in Figure 2) are executed, depending on whether the preconditions for their execution (e.g., in the case of the eat behavior, presence of food detected) are met. If, while satisfying the winner motivation, a behavior that satisfies a lower ranked motivation can also be executed, then it will be executed. This means that two behaviors can sometimes be executed simultaneously. Our robot can execute two behaviors simultaneously only if the two behaviors use distinct sets of actuators. For example, the "orient to food" behavior, which allows the robot to approach and stop at an energy resource, uses the wheels, while the behavior to consume the resource uses the virtual "mouth," and thus the two behaviors can be executed concurrently. This allows the robot to consume an energy resource as it aligns itself with the resource. In certain cases, two behaviors can execute due to different motivations; for example, the eat behavior can be executed opportunistically as the robot passes over a resource while searching for a grooming post, as the mouth actuator is not otherwise occupied.
Note that, since the grooming posts are obstacles that the robot can collide with, they will also act as a cue for the avoid behavior. Whether the avoid behavior is actually executed depends on the intensity of the motivation to avoid, which depends on the values of the cue and the robot's integrity.
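The modified winner-take-all policy described above can be sketched as follows. This is a simplified illustration rather than the robot's controller: the behavior and actuator names are hypothetical, a motivation is treated as "active" when its intensity is above zero (our assumption), and the `runnable` argument is assumed to list only behaviors whose preconditions are already met.

```python
from typing import Dict, List, Set, Tuple

# A candidate behavior: (name, set of actuators it needs). Names are hypothetical.
Candidate = Tuple[str, Set[str]]


def select_behaviors(motivations: Dict[str, float],
                     runnable: Dict[str, List[Candidate]]) -> List[str]:
    """Run behaviors in order of motivation rank, skipping actuator conflicts."""
    ranked = sorted(motivations, key=lambda name: motivations[name], reverse=True)
    busy: Set[str] = set()   # actuators already claimed in this time step
    chosen: List[str] = []
    for name in ranked:
        if motivations[name] <= 0:
            continue  # inactive motivation: triggers nothing
        for behavior, actuators in runnable.get(name, []):
            if busy.isdisjoint(actuators):
                chosen.append(behavior)
                busy |= actuators
    return chosen


# The robot searches for a grooming post (wheels) while passing over food.
motivations = {"groom_left": 800.0, "eat": 500.0, "avoid": 0.0}
runnable = {
    "groom_left": [("search_for_post", {"wheels"})],
    "eat": [("orient_to_food", {"wheels"}), ("eat", {"mouth"})],
}
print(select_behaviors(motivations, runnable))
```

In this example, the winner motivation claims the wheels, so "orient to food" is skipped, but the eat behavior still runs because the mouth is free, which corresponds to the opportunistic eating described in the text.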

Modeling Damage and Grooming
Damage to the robot (e.g., through collisions) and the effect of grooming on the robot's integument are implemented using the IR distance sensors. Damage and grooming use independent mechanisms (summarized in Figure 3) designed with the goal that interaction with environmental objects can be potentially beneficial or damaging to the robot: Grooming involves a small possibility of damage, but not so much that a normal amount of grooming risks serious damage to the robot. The various constants in these calculations were determined empirically to meet these design goals.
To calculate damage, the distance sensors are checked every 50 ms and compared to the previous values. The calculated damage is subtracted from the current integrity. Two types of damage are possible: collisions and sustained rubbing.
Collisions. If the closest IR reading crosses a "touch" threshold (a value of 850, corresponding to an object approximately 3 mm from the robot), then a "collision" is deemed to have occurred, and a value for the damage is calculated depending on the size of the change in the sensor readings that have crossed the threshold. A maximum value of 100 is applied to this type of damage, to stop a single unlucky collision from killing the robot in one blow.
Sustained rubbing. When the IR sensor values remain above a threshold of 900 (indicating a very close object), a constant damage value of 2 is applied at each check (hence a maximum of 40 per second).
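The two damage mechanisms can be sketched as a single function evaluated at each 50 ms check. The thresholds, the cap of 100, and the rubbing damage of 2 come from the text; the scaling of collision damage by the change in reading (`COLLISION_SCALE`) is a hypothetical placeholder, since the text only states that the damage depends on the size of that change.

```python
TOUCH_THRESHOLD = 850        # "touch" threshold from the text
RUB_THRESHOLD = 900          # sustained-rubbing threshold from the text
MAX_COLLISION_DAMAGE = 100   # cap on damage from a single collision
RUB_DAMAGE = 2               # damage per 50 ms check while rubbing
COLLISION_SCALE = 0.5        # hypothetical: damage per unit change in reading


def damage_step(previous: int, current: int) -> float:
    """Damage for one check, given consecutive closest-IR readings."""
    if previous < TOUCH_THRESHOLD <= current:
        # The reading crossed the touch threshold: a collision.
        return min(MAX_COLLISION_DAMAGE, COLLISION_SCALE * (current - previous))
    if previous > RUB_THRESHOLD and current > RUB_THRESHOLD:
        # The reading has stayed above the rubbing threshold.
        return float(RUB_DAMAGE)
    return 0.0


integrity = 900.0
for prev, cur in [(300, 900), (950, 955), (955, 960)]:
    integrity -= damage_step(prev, cur)
print(integrity)
```

With 20 checks per second, sustained rubbing in this sketch removes at most 40 integrity points per second, matching the figure given above.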
To implement grooming, the distance sensors are checked every 100 ms and compared to previous values. Two sets of sensors are used: (IR1, IR2, IR3) for the right side and (IR5, IR6, IR7) for the left side, corresponding to the two integuments. If a sensor value indicates a close object (a value above 150) and the value has increased since the previous reading, while the value of the adjacent sensor toward the front of the robot has decreased (indicating the movement of a grooming post from front to back of the robot), then a "stroke" counter is incremented, indicating movement in the "correct" direction (front to back). Conversely, a sensor indication of a movement in the "wrong" direction (back to front) is considered an "anti-stroke" and decrements the stroke counter. The overall value of the counter indicates whether the overall movement on one side is a stroke (>0) or an anti-stroke (<0). If a stroke has occurred, then the actual value of the corresponding integument (left or right) is increased by 20 times the count value; if an anti-stroke has occurred, then the actual value of the corresponding integument is decreased by 5 times the magnitude of the (negative) count value.
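A minimal sketch of the stroke counter and the resulting integument update is given below. It rests on several assumptions of ours: the three readings for a side are ordered front to back, an anti-stroke is detected as the mirror image of a stroke, and the integument is clamped to its stated range [0, 1,000].

```python
from typing import Sequence

CLOSE_THRESHOLD = 150   # reading above this indicates a close object
STROKE_GAIN = 20        # integument gained per counted stroke
ANTI_STROKE_LOSS = 5    # integument lost per counted anti-stroke
MAX_VALUE = 1000


def stroke_count(prev: Sequence[int], curr: Sequence[int]) -> int:
    """Net strokes on one side; readings are assumed ordered front to back."""
    count = 0
    for i in range(len(curr)):
        rising = curr[i] > CLOSE_THRESHOLD and curr[i] > prev[i]
        if not rising:
            continue
        if i > 0 and curr[i - 1] < prev[i - 1]:
            count += 1   # sensor nearer the front fell: front-to-back motion
        if i + 1 < len(curr) and curr[i + 1] < prev[i + 1]:
            count -= 1   # sensor nearer the back fell: back-to-front motion
    return count


def apply_grooming(integument: float, count: int) -> float:
    """Update one integument from the net stroke count, clamped to range."""
    if count > 0:
        integument += STROKE_GAIN * count
    elif count < 0:
        integument -= ANTI_STROKE_LOSS * (-count)
    return max(0, min(MAX_VALUE, integument))


count = stroke_count(prev=[400, 200, 50], curr=[300, 350, 50])
print(count, apply_grooming(500, count))
```

The asymmetry between the gain of 20 and the loss of 5 means that a mostly correct grooming movement improves the integument even if it includes some backward motion.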

Modeling Compulsive Behavior
Following the signal attenuation model, in this article, we model compulsive behavior by manipulating the robot's perception of the internal errors linked to its physiological variables. More concretely, we manipulate the robot's perceived ideal ("target") value for the integuments. This will affect the decision-making (action selection) process in the calculation of the "error" in Figure 2 or the deficit in Equation 1.
To model "typical" or "healthy" decision-making, we consider the case where the perceived ideal value is equal to the real "perfect" value (i.e., 1,000, which is as far from the fatal limit as possible). This value can be achieved and so is a "realistic" target. However, as we have seen in previous work, a robot will typically stop attending to a need before the related physiological variable reaches its ideal value, due to competition from other needs. The point when the value of the active motivation is overtaken by another motivation (which consequently becomes the active motivation) can be thought of as the robot receiving the stop signal for that behavior (although in our model, it might be more accurately thought of as an "attend to another need and switch behavior" signal).
To model "pathological" conditions, we consider values for the perceived ideal value that are not achievable, even theoretically: in this model, values greater than 1,000, which is the maximum possible value the variable can take. Since these values are not valid values for the integument, having such a target value can be thought of as a "perceptual error" or distorted perception in the robot.
As outlined in the section "Deduction Stage (Stage 4)," if the signal attenuation model of OCD holds, we would hypothesize that this manipulation, analogous to attenuating the signal for successful grooming as we increase the target value, would result in an exaggerated (more intense) motivation to perform the selected behavior. In turn, this increased motivation would out-compete the other needs and thus produce an increase in the performance and perseverance of the grooming behavior. If this effect is sufficiently large, we would expect it to be maladaptive, and this would be measurable by at least some of our metrics.
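The effect of an unachievable target on the perceived error can be made concrete with a short sketch. The target values 1,000, 1,100, and 1,200 are those used in the experimental conditions reported in the Methods; the function name is ours.

```python
MAX_VALUE = 1000  # maximum value an integument can actually take

# Perceived ideal ("target") integument value in each experimental condition.
TARGETS = {1: 1000, 2: 1100, 3: 1200}


def perceived_deficit(value: float, target: float) -> float:
    """Error signal as perceived by the robot: perceived target minus value."""
    return target - value


for condition, target in TARGETS.items():
    # Smallest error the robot can ever perceive, reached at a full integument.
    residual = perceived_deficit(MAX_VALUE, target)
    print(f"Condition {condition}: minimum perceived deficit = {residual}")
```

In Conditions 2 and 3, the residual deficit (100 and 200, respectively) remains even when the integument is at its maximum, so the perceived error can never be reduced to zero through grooming.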

Methods
The experimental setup is shown in Figure 4. It consists of an 80 cm × 80 cm square surrounded by 45-mm-high wooden walls that can be detected with the robot's distance sensors. The floor is covered in paper, which is printed gray, except for two light areas and two dark areas, which can be detected by the robot's ground sensors and which indicate the presence of food and grooming resources, respectively. Two 35-mm-diameter white plastic pipes are fixed in place in the centers of the dark areas to be used as grooming posts.
We conducted 20 runs in each of the following conditions:
Condition 1: realistic target value (perceived ideal integument value of 1,000; the baseline condition).
Condition 2: mildly unrealistic target value (perceived ideal integument value of 1,100).
Condition 3: highly unrealistic target value (perceived ideal integument value of 1,200).
Note that Conditions 2 and 3 correspond to target values (perceived ideal values) that are not achievable, even in theory, because they lie outside the range of the variables (see section "Modeling Compulsive Behavior"). In these conditions, an error is always perceived, even in the absence of a "real error" (a mismatch between the actual value and a realistic ideal value within the range of the variable, which is 1,000). This "distorted perception" of the target value gives rise to "perceptual errors" that cannot be corrected through behavior or by interaction with the environment: because the target values lie outside the range of the variable, the urge to correct the error is always present. These conditions fall under the category of "comparator defect" in Pitman's possible sources of a persistent error signal.
The numerical values for Conditions 2 and 3 were empirically determined following some informal pretrial runs. In these, we observed that the value of 1,200 resulted in highly persistent grooming (the robot would frequently groom until it died), so it was selected as the most extreme value to test. The value of 1,100, halfway between the baseline condition and our extreme value, showed very different behavior, with the robot stopping grooming before it died.
On each run, the robot's physiological variables were initialized to the middle value of the range (500 out of 1,000) for energy and both integuments, so that the robot would need to work to maintain all these variables, which decrease over time if not actively maintained. For the integrity variable, an initial value of 900 (out of 1,000) was chosen to allow the robot to approach grooming posts and maintain its integument, rather than starting in a "half-damaged" state and being overmotivated to avoid objects. The robot was started at the center of the arena, equidistant from all four resources (see Figure 4), facing directly toward one side of the arena in one of four alternating directions (labeled "north," "south," "west," and "east"). The starting direction was alternated to reduce any bias that the initial direction might impose, since this may influence which resource a robot would encounter first. Runs lasted for 6 min each, or until the robot died. The values of its physiological variables, motivations, sensor values, and the currently executing behaviors were recorded every 250 ms and transmitted to a PC via its radio link.

Select data to be collected
During our experiments, we need to collect data to evaluate the adaptive or maladaptive value of the robot's decision-making process. In terms of OC-spectrum conditions, to compare the balance between the satisfaction of different needs, we need to record the values of the physiological variables, the values of the motivations, and the behaviors that the robot is currently executing. To allow post hoc examination of the robot's behavior, we additionally record its sensor readings and wheel speeds.

Metrics
We evaluated the performance of the robot in each condition and run using four groups of metrics:
1. Survival related. Specifically, death rates (the number of robots that died during the run) and duration of life for each run (up to 6 min).
2. Metrics relating to the regulation of physiology (well-being, physiological balance, maintenance of physiological variables away from dangerously low values, maintenance of integument). These give a measure of the success of a living robot in managing its physiological variables as a result of its decision-making, either at a specific time or over its lifetime. These help to compare the performance of robots that do not die. They are calculated from the recorded values of the physiological variables.
3. Behavior of the robot. The behaviors that the robot executed are included in our logged data, so we can use this record to compare different robots' behaviors without needing to resort to external observation.
4. Motivational balance. Since our robot controller is based on the four motivations defined in the "Sensors, Cues, and Motivations" section, we can use this internal information to analyze the robots' motivational balance, which reflects how much time they spend attending to each of their physiological needs.
We provide the mathematical definitions of these measures and use them to evaluate the results of our experiments in the following section.

Testing Stage: Analysis of Experimental Results (Stage 6b)
We now present the experimental results and analyze them in terms of the preceding metrics.

Death Rates, Duration of Life
Death rates for each condition are shown in Table 2 (first column). Let us first consider Condition 1 (realistic target values). The two Condition 1 robots to die both survived 49 s, and the runs were very similar. Both of them found a grooming post in this time, spent some time grooming, and then left due to motivation to find an energy resource. After this, they both encountered a second grooming post, and although they both groomed, they did this for only about 1 s before leaving again in search of an energy resource.
Note (Table 2). The mean well-beings and variance have been calculated by taking the means over the lifetime for each "robot" (run) and then calculating the mean of the 20 values in each condition. The percentages in the last three columns have been calculated by concatenating the lifetimes of the robots in the 20 runs in each condition and calculating what percentage of this time was spent grooming, etc.
In Condition 2 (mildly unrealistic target values), three robots died. Since we recorded internal data for the robots, including their motivations, we can examine what happened, including why they made their decisions. Let us look at what happened during the lives of the three robots that died:
The first Condition 2 robot to die survived 52 s. It had found a grooming resource soon after starting and spent 15 s grooming before leaving. It soon found another grooming post and remained grooming for approximately 9 s. It left the post with only 6 s worth of energy and failed to find an energy resource in this short time. We remark that the integument that the robot attended to was low (only 24 greater than the energy) when the robot prioritized it over the similarly low energy, meaning that the robot had two pressing competing needs.
The second Condition 2 robot to die survived 297 s. After 200 s, both of its integuments had fallen to zero, and when it found a grooming post, it stayed there grooming for approximately 1 min, increasing its integuments while its energy level fell. By the time it left the post, it had approximately 15 s worth of energy left, and it did not find an energy resource before dying. As in the first case, the low integuments meant that the robot had multiple pressing needs.
The third Condition 2 robot to die survived 124 s. After finding its first grooming post and grooming, it left with approximately 30 s worth of energy and was wandering the arena to find an energy resource. However, during this time, it found another grooming post and opportunistically groomed for 10 s. When it left the second grooming post, it had less than 10 s worth of energy with which to find an energy resource.
Of the 19 deaths in Condition 3 (with the highly unrealistic target values), 17 occurred due to the energy falling to zero while the robot was grooming. In all of these cases, the integument in question had already reached its ideal value (i.e., its maximum value of 1,000), so further grooming was not achieving anything. The remaining two deaths were also caused by the energy reaching zero. In both cases, the robot had stopped grooming within the last 5 s and was now wandering: in one case motivated to find an energy resource, in the other case motivated again to find a grooming post. The mean survival time for Condition 3 was 134 s (including the robot that survived).

Well-being
To evaluate the robot's current state, in terms of its physiological variables, we used metrics that we call well-being (Lewis & Cañamero, 2016), which provide the average level of all four physiological variables at each point in time. Intuitively, well-being gives an indication of the robot's internal "health" at a point in time, with high values indicating good health (the physiological variables are high, close to the ideal value) and low values indicating poor health (the physiological variables are low, close to the fatal limit).
We calculated two different well-being metrics by taking the arithmetic and the geometric means of the physiological variables at each sample time, giving, respectively, the arithmetic well-being and geometric well-being. Both means are unweighted, that is, in both means, the four components contribute equally to the final value. However, the value of the geometric well-being is more strongly affected by those physiological variables that have values close to the critical value of zero (which is also the fatal limit for energy and integrity) and that, for this reason, can be considered the most pressing. We also calculate the more broadly used arithmetic well-being because it gives a more familiar and intuitive way of calculating the average, since it gives the "middle" value.
Applying these metrics to our experimental data, the mean values of each well-being metric over all the runs in each condition are shown in Table 2 (second and third columns). In both the arithmetic and geometric cases, these values were calculated in three steps. First, in each run, in each of the conditions, we calculated the well-being of the robot at each "time step" (each point in time for which data were collected). Second, we calculated the mean well-being over the lifetime of the robot in each run (20 runs per condition, 60 in total). Finally, we calculated the overall mean for each condition as the mean of the 20 "lifetime means" (1 per run) from the previous step. The mean values of the geometric well-being for each run are shown in Figure 5.
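The three-step calculation described above can be sketched as follows. The data layout (a run as a list of samples, each sample holding the four physiological variables) and the function names are our assumptions for illustration; the example values are invented and are not experimental data.

```python
import math
from statistics import mean
from typing import Callable, List, Sequence

# One sample: (energy, integrity, left integument, right integument).
Sample = Sequence[float]


def arithmetic_wellbeing(sample: Sample) -> float:
    """Unweighted arithmetic mean of the physiological variables."""
    return mean(sample)


def geometric_wellbeing(sample: Sample) -> float:
    """Unweighted geometric mean; zero if any variable is zero."""
    return math.prod(sample) ** (1.0 / len(sample))


def lifetime_mean(run: List[Sample], metric: Callable[[Sample], float]) -> float:
    """Steps 1 and 2: metric at each sample, then mean over the run."""
    return mean([metric(sample) for sample in run])


def condition_mean(runs: List[List[Sample]],
                   metric: Callable[[Sample], float]) -> float:
    """Step 3: mean of the per-run lifetime means."""
    return mean([lifetime_mean(run, metric) for run in runs])


# Invented two-sample run: one neglected integument, then a balanced state.
run = [(1000, 1000, 1000, 0), (500, 500, 500, 500)]
print(lifetime_mean(run, arithmetic_wellbeing))
print(lifetime_mean(run, geometric_wellbeing))
```

In the first sample, the arithmetic well-being is 750 while the geometric well-being is zero, which illustrates how a single neglected variable dominates the geometric metric.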
There was a statistically significant difference for the arithmetic well-being between conditions (ANOVA, p = 0.020), with Tukey HSD post hoc analysis showing that Condition 3 (highly unrealistic target values) differed from Condition 2 (mildly unrealistic target) (p < 0.03), but no statistically significant differences between other pairs of conditions. Turning now to the geometric well-being, there was again a statistically significant difference between the different categories (ANOVA, p = 2.5 × 10⁻⁴). In this case, Tukey HSD post hoc analysis showed that Condition 3 (highly unrealistic target values) differed from both Condition 1 (p < 0.005) and Condition 2 (p < 0.001), while Conditions 1 and 2 did not differ significantly from each other.
The fact that the difference in the geometric well-beings was statistically more significant than the difference between the arithmetic well-beings (both in terms of the p-value for the ANOVA and there being a significant difference between Conditions 1 and 3) illustrates a desirable property of the geometric well-being metric: the larger effect of small ("critical") physiological variables on the overall value. This means that the geometric well-being is strongly affected by those variables with large errors, reflecting more accurately the significance of poorly maintained variables. Conversely, for the arithmetic well-being, one well-maintained variable can cancel out the effect of a poorly maintained variable, so in an extreme case, a robot could maintain one variable well but die quickly due to neglecting a survival-related variable and still have a high arithmetic well-being over its lifetime.

In summary, as expected, a highly unrealistic target value (Condition 3) is disadvantageous in that it results in a lower geometric well-being than either the realistic target value (Condition 1) or the mildly unrealistic target value (Condition 2). However, contrary to what we would expect, a mildly unrealistic target value (Condition 2) is not disadvantageous compared to the realistic target value (Condition 1): The trend, although not statistically significant, is that our chosen mildly unrealistic target value results in a higher mean well-being than a realistic target value and so may be advantageous, as measured by these metrics. This can be viewed as an advantage of a more cautious decision-making strategy for the management of the integument variables: For the same value of integument, the motivation to groom is higher. Even if the advantage of Condition 2 does not hold true in further experiments, our results indicate a nonlinear response of the well-being metrics to the changing perception of the target value.

Physiological Balance/Variance of the Physiological Variables
We calculated the "physiological balance" as the variance of the four physiological variables at each "point in time" (i.e., the variance of the four values at each sampling time rather than for the entire series of values for each individual variable). Intuitively, this gives a measure of whether, at that point in time, the robot has managed the physiological variables evenly (the four variables have similar values, so the balance [their variance] is low) or whether the robot has kept some variables high while allowing others to fall (the four variables have a wide range of values, so the balance [their variance] is high). A high value can be thought of as a "poorly balanced" management of the four physiological variables, although in some scenarios, it may be advantageous; for example, it might be a good strategy for the robot to increase the value of one variable when resources are abundant if it is likely that the relevant resource will be scarce in the future.
Applying this metric to our experimental data, we calculated the physiological balance in a three-step process. First, in each run in each condition, we calculated the physiological balance at each "time step" (each point in time at which data were collected). Second, we calculated the mean physiological balance over the lifetime of the robot in each run (20 runs per condition, 60 in total; these values are shown in Figure 6). Finally, we calculated the overall mean for each condition as the mean of the 20 "lifetime means" (1 per run) from the previous step. These values are shown in Table 2 (fourth column).
There was a statistically significant difference between conditions (ANOVA, p < 1 × 10⁻⁶), with Tukey HSD post hoc analysis showing that Condition 3 (highly unrealistic target values) differed from the other two conditions (p < 0.001), although there was no statistically significant difference between Conditions 1 and 2.
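The physiological balance calculation described in this section can be sketched in the same way as the well-being metrics. The text does not state whether the population or the sample variance was used; the sketch below assumes the population variance, and the example values are invented.

```python
from statistics import mean, pvariance
from typing import List, Sequence

# One sample: (energy, integrity, left integument, right integument).
Sample = Sequence[float]


def physiological_balance(sample: Sample) -> float:
    """Variance of the four variables at one sampling time (assumed population)."""
    return pvariance(sample)


def lifetime_balance(run: List[Sample]) -> float:
    """Mean physiological balance over the lifetime of one run."""
    return mean([physiological_balance(sample) for sample in run])


def condition_balance(runs: List[List[Sample]]) -> float:
    """Mean of the per-run lifetime means for one condition."""
    return mean([lifetime_balance(run) for run in runs])


even = (500, 500, 500, 500)      # evenly managed: variance 0
uneven = (1000, 1000, 0, 0)      # two variables neglected: high variance
print(physiological_balance(even), physiological_balance(uneven))
```

Lower values indicate more even management of the four variables; the uneven sample above gives a variance of 250,000 under the population-variance assumption.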
In summary, as with the well-being metric, a highly unrealistic target value is disadvantageous in terms of physiological balance. However, the chosen mildly unrealistic target value (Condition 2) is not statistically different from the realistic target value (Condition 1) for this metric.
Figure 7. Experimental results: percentage of the robot's lifetime during which the physiological variable closest to the critical limit of zero was in each of four "regions" of the physiological space: with a value of exactly 0 (the variable with a value of zero here must be an integument variable, since if it had been one of the survival-related variables, the robot would have been dead), in the range (0, 100] (intuitively "highly critical"), in the range (100, 200] ("critical"), and in the range (200, 300] ("danger"). These percentages were calculated by concatenating the lifetimes of the robots in the 20 runs for each condition and calculating the percentage of this time during which the physiological variable that was closest to the critical limit was in each region. The "equal zero" percentages correspond to the values in Table 2, last column.

Maintenance of Variables Away from Dangerous Values
To evaluate how well our robots kept their physiological variables from falling to dangerous values (near to zero, the critical limit of the variables), we calculated the percentage of the robots' lifetime during which any variable was equal to 0, at most 100, at most 200, or at most 300. These percentages are shown in Figure 7, and the specific values for a variable equal to 0 in Table 2 (last column). Here lower values (smaller percentages of time) are better, since they indicate less time spent with a variable in the "danger zone" close to the critical limit.
In summary, as with the well-being metrics, a highly unrealistic target value is disadvantageous in terms of maintaining the physiological variables above the dangerous values. However, contrary to what we would expect, the mildly unrealistic target value is not disadvantageous compared to the realistic target value condition: Given our results, it may be advantageous compared to the realistic target value, in that the physiological variables were better maintained away from the low values.
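The danger-zone percentages can be sketched as below, using the cumulative thresholds given in this section and concatenating the samples of all runs in a condition. Because data were logged at a fixed interval, the fraction of samples is taken as the fraction of lifetime; the data layout and names are our assumptions, and the example values are invented.

```python
from typing import Dict, List, Sequence

# One sample: (energy, integrity, left integument, right integument).
Sample = Sequence[float]


def danger_percentages(runs: List[List[Sample]]) -> Dict[str, float]:
    """Percentage of pooled lifetime with the lowest variable at or below each level."""
    # Concatenate the lifetimes of all runs, keeping the lowest variable per sample.
    lowest = [min(sample) for run in runs for sample in run]
    total = len(lowest)
    return {
        "= 0": 100.0 * sum(v == 0 for v in lowest) / total,
        "<= 100": 100.0 * sum(v <= 100 for v in lowest) / total,
        "<= 200": 100.0 * sum(v <= 200 for v in lowest) / total,
        "<= 300": 100.0 * sum(v <= 300 for v in lowest) / total,
    }


runs = [
    [(900, 900, 0, 700), (800, 900, 50, 700)],
    [(700, 900, 150, 600), (600, 900, 800, 900)],
]
print(danger_percentages(runs))
```

The non-overlapping regions used in Figure 7 can be obtained from these cumulative values by taking differences between successive thresholds.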

Maintenance of Integument
Focusing more closely on the maintenance of the integument variables, we calculated the percentage of the robots' lifetime during which either of the two integument variables was higher than the other two physiological variables ("best maintained") or lower than the other two physiological variables ("worst maintained"). These are shown in Figures 8 and 9. Note that since we have two integuments, one side may be the best maintained, while at the same time the other side is the worst maintained.
Figure 8, showing the percentage time as the largest physiological variable, shows a trend toward better management of at least one integument. However, this should be considered in the context of Figure 9, where there is not a clear trend as to the worst managed variable. Looking at the evolution of the physiological variables in Condition 3, the robot would often concentrate on grooming one side, at the expense of the other, and the maintenance of integument metrics reflect this fact.
In our robot model, this concentration on grooming one side is connected with the perception of the salience of the grooming post. In our implementation, when grooming is happening, due to the position of the grooming post on one side of the robot, the post cannot be perceived by the IR sensors on the other side of the robot, and therefore the post does not provide an incentive stimulus for the motivation to maintain the integument on the other side of the robot. When the robot has a more realistic (lower) target integument, the action of grooming one side is more likely to be interrupted, providing more opportunities to switch to the other side.
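As a rough illustration of the "best maintained" and "worst maintained" metrics used in this subsection, the following Python sketch counts the time steps in which at least one integument is strictly above, or strictly below, both of the other physiological variables. The log format, the variable names, and the use of strict inequalities are assumptions made for the example.

```python
from typing import Dict, Sequence, Tuple


def integument_maintenance(
    log: Sequence[Dict[str, float]],
    integument_keys: Tuple[str, ...] = ("integument_left", "integument_right"),
) -> Tuple[float, float]:
    """Return (percent_best, percent_worst) over the logged lifetime.

    percent_best: steps where some integument exceeds every other variable.
    percent_worst: steps where some integument is below every other variable.
    One step can count toward both, since there are two integuments.
    """
    if not log:
        raise ValueError("log must contain at least one time step")
    best = worst = 0
    for step in log:
        others = [v for k, v in step.items() if k not in integument_keys]
        integuments = [step[k] for k in integument_keys]
        if any(v > max(others) for v in integuments):
            best += 1
        if any(v < min(others) for v in integuments):
            worst += 1
    return 100.0 * best / len(log), 100.0 * worst / len(log)


# Illustrative data only (not experimental results).
example_log = [
    {"energy": 500, "integrity": 600, "integument_left": 900, "integument_right": 100},
    {"energy": 500, "integrity": 600, "integument_left": 550, "integument_right": 550},
    {"energy": 500, "integrity": 600, "integument_left": 700, "integument_right": 650},
    {"energy": 500, "integrity": 600, "integument_left": 400, "integument_right": 550},
]
percent_best, percent_worst = integument_maintenance(example_log)
```

The first step of the example shows the one-sided grooming pattern discussed above: one integument is the best maintained variable while the other is the worst.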

Balance of Behaviors
To evaluate how the robot divided its time among behaviors, we used the internally logged data to calculate the amount of its lifetime spent in grooming and eating, combined over all robots in each condition, shown in Table 2 (fifth and sixth columns). For its remaining lifetime, the robot was searching for a resource (this does not account for all the time spent searching, since occasionally the robot would pass over an energy resource and would eat opportunistically without stopping its search).
As we can see from Table 2, the percentage of time spent grooming increases with the increased perceived ideal value of the integument in Conditions 1 to 3. The percentage of time eating decreases as the target integument increases, but not as much as the grooming increases, and there is only a small decrease between Conditions 1 and 2. This indicates that the robots with mildly unrealistic target values are spending less time searching and more time executing consummatory behaviors. This may be an indication that there is some adaptive value in a mildly unrealistic target.

Balance of Motivations
Since we are considering OC-spectrum disorders, we wanted to check if our robot had anything analogous to obsessions. To evaluate the robots' "concerns," we calculated the percentage of the robots' lifetime during which the motivation with the highest intensity was either to feed, avoid, or groom. We additionally calculated the corresponding percentages of time during which the robot was wandering around in search of either an energy resource or a grooming post. The results are shown in Table 3.
These figures indicate that all robots spent the majority of their lifetime motivated to groom (i.e., attending to either of the two integuments, either by active grooming or by searching for a grooming post) and the smallest part of their lifetime motivated to avoid objects (since in general the robots did not experience much damage from collisions). As expected, as the perceived integument target value became more unrealistic through the three conditions, the amount of time when the highest motivation was to groom increased, and the amount of time when the other motivations were highest decreased. However, the change was not smooth across the conditions, with a larger change in motivations to feed and to groom occurring from Condition 2 to Condition 3. Although this result is preliminary, it suggests a nonlinear response of the internal motivations ("concerns") to the different perceived ideal values in the different conditions.
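The dominant-motivation percentages can be sketched in Python as follows. The log format and motivation names are hypothetical; the tie-handling rule (a motivation tied for the highest intensity is counted in every tied category, so totals may exceed 100%) follows the note that accompanies the table of results.

```python
from typing import Dict, Sequence


def dominant_motivation_percentages(
    log: Sequence[Dict[str, float]],
) -> Dict[str, float]:
    """Percentage of time steps in which each motivation has the highest intensity.

    Ties are counted in every tied category, so the percentages
    may sum to more than 100.
    """
    if not log:
        raise ValueError("log must contain at least one time step")
    counts = {name: 0 for name in log[0]}
    for step in log:
        top = max(step.values())
        for name, intensity in step.items():
            if intensity == top:
                counts[name] += 1
    return {name: 100.0 * c / len(log) for name, c in counts.items()}


# Illustrative data only (not experimental results).
example_log = [
    {"feed": 300, "avoid": 10, "groom": 300},  # tie between feed and groom
    {"feed": 100, "avoid": 10, "groom": 600},
    {"feed": 700, "avoid": 0, "groom": 200},
    {"feed": 0, "avoid": 50, "groom": 400},
]
dominance = dominant_motivation_percentages(example_log)
```

In this toy log the percentages sum to 125% because of the single tied step.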

Discussion of Experimental Results
Our clearest result is the difference in the number of deaths in the three runs (Table 2, first column). Seventeen of the deaths in Condition 3 (highly unrealistic target values) occurred when the robot was grooming, not stopping even though in each case one integument had reached its maximum value and the energy was falling to its fatal limit. In contrast, the robot in Condition 1 (realistic targets) would always stop grooming before the integument reached 1,000, enabling it to search for and find an energy resource in time to feed; the only two deaths in Condition 1 could both be partially due to "bad luck," with the robot not finding an energy resource while searching for one.
Note. Values in brackets: taken as a percentage of the time when the robot was searching. Total percentages may exceed 100%, since if two motivational values were equal largest, they were counted in both categories.
In Condition 2 (mildly unrealistic target values), with the more moderate perceptual error, in 16 of the 17 runs where the robot survived, at some point in the run, one or other integument would reach 1,000. However, in this condition, it would eventually stop grooming.
To examine why grooming continues, we consider the three reasons for the robot to stop grooming:
1. Integument improves sufficiently from grooming, therefore the robot's motivation to groom falls sufficiently that the robot acts to satisfy another motivation.
2. Energy falls so low that the motivation to feed (even in the absence of an energy resource acting as an external cue) exceeds the motivation to groom (even though this is increased by the presence of an external cue).
3. Integrity falls so low that the motivation to avoid objects prompts the robot to move away from the grooming post. This could be due to damage incurred during grooming and interaction with the post.
Our results show that in Conditions 2 and 3, the distorted perception interfered with the normal dynamics of decision-making, and in particular with the above reasons to stop grooming. In Conditions 2 and 3, we see from our data that the first reason to stop grooming was made less likely compared to Condition 1, since there were cases where an integument reached its maximum value and the robot did not stop grooming, and it was only the continuing fall in the other variables (increasing the corresponding motivation) that caused the robot to move away. Additionally, in Condition 3 (highly unrealistic targets), the second reason to stop grooming was also less likely, since there were cases where the energy fell all the way to zero, and even with maximum integument value, the motivation to groom was still higher than the motivation to feed. In our experimental setup, the rate of damage from grooming was not sufficiently high for the integrity to fall low enough to prompt the robot to move away, and therefore we cannot say if reason 3 was still a factor.

Possible Advantages of an Unattainable Target
While the highly unrealistic target values (Condition 3) led to almost inevitable death, examining Condition 2 indicates that a mildly unrealistic target value may confer an advantage to our robot. We can see this in several of our results, listed below. With this positive perspective on the unachievable targets, we could characterize them as "idealistic" rather than "unrealistic."
First, an advantage of mildly unrealistic target values can be seen in the increased arithmetic and geometric well-beings for Condition 2 compared to Condition 1 (Table 2, second and third columns), indicating better overall "health."
Second, if we look at the percentage of lifetime during which our robots had zero (worse maintained) integument (Table 2, last column), we see that in Condition 2, the percentage is smaller than for Condition 1. In a situation where maintaining integument aids survival (as is typically the case in animals), this smaller amount of time where the value of an integument was zero represents an advantage. We see the same advantage (reduced times for Condition 2) in considering the percentage of time that any of the physiological variables were below particular values (Figure 7). The high values for Condition 3 reflect that the robot would typically neglect one integument while focusing on the other. However, we are cautious about drawing conclusions from this difference in Condition 3, since it may be the shorter lifetimes in Condition 3 that make the percentage of lifetime larger.
Third, we can look at the balance between the time spent grooming and the time spent feeding (Table 2, fifth and sixth columns). Comparing Condition 2 to Condition 1, the percentage of time spent grooming increases by a moderate amount (from 34.5 to 39.4) as expected, but while the time spent feeding does decrease, it does so by only a small amount (from 21.3 to 20.6). This can be viewed as due to increased persistence of the consummatory grooming behavior making more use of the resource when it is available. Rather than the time eating, it is principally the time spent searching that is reduced by the increased grooming.
Fourth, the relatively small penalty that we have observed for increasing the target value in Condition 2 is also apparent in the small increase in variance of the essential variables (Table 2, fourth column, and Figure 6), indicating that the physiological balance is minimally affected.
It should be noted that none of these differences between Conditions 1 and 2 was found to be statistically significant, which is not unexpected, since the value of 1,100 was not chosen for the purposes of testing an improved performance in the robot compared to Condition 1 (there might be some target value either above or below 1,100 where the improved performance is more marked). However, these different lines of evidence all point to advantages of a target value greater than 1,000. This question of advantages of a mildly unrealistic value is, therefore, a hypothesis for future research. It may be the case that the optimal value is somewhere between 1,000 and 1,100, the exact value depending on the metric chosen and the environment, as well as other variables. Such potential advantages of mildly unrealistic target values also contribute to the debate about possible evolutionary origins of OCD (Glass, 2012).

Computed Threshold for Unstoppable Grooming
Finally, mathematical analysis of the algorithm used to compute the intensity of the robot's motivations allows us to calculate a theoretical threshold for the perceived target integument value, above which grooming would become unstoppable: a perceived target integument value that results in a grooming behavior that will not be stopped by the motivation to feed.The value is calculated as follows.
First, we need to deduce what the intensity of the motivation to groom is when the grooming behavior continues indefinitely. To do this, let us consider a situation with perfect integument (= 1,000). In this situation, taking Equation 1, in Condition 1 the motivation to groom is zero, in Condition 2 it ranges from 100 to 600 (depending on the size of the cue), and in Condition 3 it can range from 200 to 1,200. In this calculation, in which we are considering the general case of an arbitrary target value, our experimental data justify the use of the maximum value for the cue to groom. Examining our data, we see that, on the occasions when integument reaches its maximum value, the maximum values for motivational intensity are common; hence the cue in Equation 1 must also be at its maximum value.
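The ranges quoted above can be checked with a short Python sketch. It assumes that Equation 1 has the form intensity = deficit + α × cue × deficit, with the deficit being the perceived target minus the actual value; this form is inferred from the substitution used in this section, and the function and parameter names are hypothetical.

```python
import math

ALPHA = 0.05     # value of alpha used in this section
MAX_CUE = 100.0  # maximum cue value used in this section
PERFECT = 1000.0 # perfect (maximum) integument value


def motivation_intensity(target: float, value: float, cue: float,
                         alpha: float = ALPHA) -> float:
    """Assumed form of Equation 1: deficit plus cue-weighted deficit."""
    deficit = target - value
    return deficit + alpha * cue * deficit


# Motivation to groom at perfect integument, for the three perceived targets
# (1,000, 1,100 and 1,200), with no cue and with the maximum cue.
groom_ranges = {
    target: (
        motivation_intensity(target, PERFECT, 0.0),
        motivation_intensity(target, PERFECT, MAX_CUE),
    )
    for target in (1000.0, 1100.0, 1200.0)
}
```

Under this assumed form, the sketch gives 0 for a target of 1,000, a range of 100 to 600 for a target of 1,100, and a range of 200 to 1,200 for a target of 1,200, matching the figures stated in the text.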
In considering the competition between the motivations to groom and to feed while grooming is ongoing, we will assume that there is no energy resource detected (this assumption is realistic in our environment due to the separation of the resources and the short range of the robot's sensors). Therefore, by Equation 1, the intensity of the motivation to feed is equal to the energy deficit and thus reaches a maximum value of 1,000. Hence we seek the target value T for integument that would give a motivational intensity greater than 1,000 for a maximum cue (100) and perfect integument (1,000). Substituting from Equation 1, the condition that the motivation to groom is at least the motivation to feed becomes
$$(T - 1{,}000) + 100 \times \alpha \times (T - 1{,}000) \geq 1{,}000.$$

Solving for T with our choice of α = 0.05, we get a minimum value of T = 1,166.7, and for target integument values above this, our robot will be very unlikely to stop grooming once started. This is in agreement with our experimental results, where in Condition 3 our perceived target value (1,200) is greater than T and the robot was highly likely to die from lack of energy while grooming.
Larger values of α would result in values of T closer to 1,000; therefore, from this calculation, we can predict that, with a larger α, smaller errors in the perceived target value would be sufficient to result in pathological behavior.
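The threshold calculation can be reproduced with a few lines of Python. Rearranging the inequality above gives T ≥ 1,000 + 1,000 / (1 + 100 × α); the sketch below evaluates this for several values of α. The function and parameter names are hypothetical, and the values of α other than 0.05 are illustrative rather than values used in the experiments.

```python
import math


def unstoppable_grooming_threshold(
    alpha: float,
    max_cue: float = 100.0,
    perfect_integument: float = 1000.0,
    max_feed_motivation: float = 1000.0,
) -> float:
    """Smallest perceived target T at which the motivation to groom
    (at perfect integument and maximum cue) matches the maximum
    motivation to feed:

        (T - perfect) * (1 + alpha * max_cue) >= max_feed_motivation
    """
    return perfect_integument + max_feed_motivation / (1.0 + alpha * max_cue)


threshold = unstoppable_grooming_threshold(0.05)

# Illustrative sweep: larger alpha brings the threshold closer to 1,000.
sweep = {a: unstoppable_grooming_threshold(a) for a in (0.01, 0.05, 0.1, 0.5)}
```

With α = 0.05 this evaluates to approximately 1,166.7, which lies between the perceived targets of Condition 2 (1,100) and Condition 3 (1,200). With an illustrative α = 0.1 the threshold drops to about 1,090.9, so under that assumption a target of 1,100 would already exceed it.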

Evaluation of the Robot Model (Stage 7)
After assessing the model through analysis of the experimental results, we now expose the robot model and its underlying theoretical model to criticism in order to evaluate the quality of the model and, in subsequent stages, whether it is of clinical use and how to improve it in the next iteration of our design process.
To evaluate our robot and its interaction as a model of an OC-spectrum disorder, we consider four criteria based on their use to evaluate animal models: face validity, construct validity, predictive validity, and reliability (Geyer & Markou, 2000; Lewis & Cañamero, 2017; van der Staay, 2006).

Face Validity
Face validity refers to the descriptive similarity (van der Staay et al., 2009) or "phenomenological" similarity (Willner, 1986) between the robot model and specific features of the phenomenon that is being modeled, in this case, OC-spectrum disorders. This similarity would concern, for example, a specific symptom or a behavioral dysfunction observed in both the patient and the robot model and is not related to the experiential quality that the term phenomenological has in philosophy (in the phenomenological tradition). Therefore the robot behavior should resemble the OC-spectrum disorders being modeled by showing features of the disorders and not showing features that are not seen in the disorders.
Our results show that we achieved high face validity within the scope of our model, focused on compulsions and obsessions: The self-grooming behavior was executed for long periods in Conditions 2 and 3 and itself is related to OC-spectrum conditions TTM and PSP. The continuation of the grooming behavior beyond the point where the Condition 1 robot would have stopped can be viewed as perfectionism, a characteristic of several OC-spectrum and related disorders (OCD, TTM, body dysmorphic disorder, OCPD) (Fineberg et al., 2015; Fineberg, Sharma, Sivakumaran, Sahakian, & Chamberlain, 2007; Pélissier & O'Connor, 2004). Hence, our work provides experimental support for the theoretical claims about Pitman's model being able to generate persistent repetitive behavior, that is, that Pitman's model can generate behavior with face validity.
Our model also shows face validity with respect to the sense of "incompleteness," an inner sense of imperfection or the perception that actions or motivations have been incompletely achieved (Hellriegel, Barber, Wikramanayake, Fineberg, & Mandy, 2016; Pitman, 1987; Summerfeldt, 2004; Wahl, Salkovskis, & Cotter, 2008), which is widely viewed as a key aspect of OCD and can be linked with our persistent internally sensed error. In our motivation-based architecture, the motivational systems are goal-oriented embodied sensorimotor loops, and in the pathological case, the perceived need is never satiated, even if an outside observer would say that the goal (grooming to improve integument) has been achieved.
In other words, in the pathological case, goal-oriented behavior is never complete because the perceived need is never satiated, and therefore the corrective behavior continues even if the error of the physiological variable has actually been corrected.
In addition, considering the results from our experiments regarding maintenance of integument, the concentration of the robot on grooming one side, even to the neglect of the other side, bears a potential phenomenological similarity with PSP, in which, in some cases, a person may concentrate his or her skin picking in one place, causing skin lesions.
Our model does not yet include other characteristics of OCD, such as additional nonfunctional ritual behaviors (Amitai et al., 2017; Eilam, Zor, Fineberg, & Hermesh, 2012) or indecision aspects (Sachdev & Malhi, 2005) that can occur in OCD and TTM. At its present level of development, our model lacks some key mechanisms hypothesized to be behind such nonfunctional ritual behaviors. For example, our robot has no learning capability, and therefore if the development of nonfunctional rituals is, as some theorize (Eilam, 2017), due to a disrupted behavior learning process, there is no opportunity for the robot to develop such learned rituals. Similarly, indecision is theorized as resulting from an inability to choose between strong competing goals (Pitman, 1987). However, our current model has limited capacity for such conflict, because in the experimental setup presented here, only one resource (and hence only one cue for action) would be detectable by the robot's short-range sensors, so such competition would be unlikely.
Adding further complexity would allow us to produce such nonfunctional ritual behaviors using different mechanisms, such as those mentioned, and to compare experimentally these different hypotheses.
In summary, although our robot's behavior does not exactly match an OC-spectrum condition, it exhibits those aspects that we would expect, given the scope and complexity of our model. A simple model like ours also allows for an incremental investigation of the behavior, in which different aspects of the condition are the result of specific additions to the model.

Construct Validity
Construct validity indicates the degree of similarity between the underlying mechanisms of the model and of the condition (Epstein, Preston, Stewart, & Shaham, 2006). In the context of animal models, Joel (2006) specified that the underlying mechanisms for construct validity may be either physiological or psychological. According to van der Staay et al. (2009), construct validity reflects the "soundness of the theoretical rationale." In our robot model, we do not directly model psychological constructs, and we see them as being more related to face validity (e.g., the sense of incompleteness discussed in the previous section).
When talking about underlying mechanisms, animal models have tended to focus on specific elements, such as the involvement of specific brain areas, receptors, chemicals, or genes (Camilla d'Angelo et al., 2014, Table 1). Construct validity in this case is then based on finding specific mechanisms underlying a phenomenon (e.g., symptoms) in the animal model and the human condition. Such a view of construct validity might be critically questioned by approaches that emphasize species-specific features and differences, such as models grounded in ethology.
Such a view of construct validity also implies that the idea of robot models having construct validity might be problematic and questioned on various grounds, for example, the fact that robots and biological systems are made of different matter or that the models and algorithms implemented in robots are simplifications of biological constructs. However, what critics consider as weaknesses of these models can also be considered as strengths. The fact that robot models are simplifications allows us to capture key selected structural, functional, or dynamic elements for a focused, rigorous investigation. The adoption of a cybernetics perspective that focuses on interaction dynamics, processes, and general principles also means that we can model aspects of underlying physiological mechanisms relevant to OCD that are more general than the specific types of underlying mechanisms that animal models have focused on; for example, we can implement processes that model effects on perception that may be hypothesized to involve specific chemicals, without having to model the specific chemicals themselves. In addition, in robot models, mechanisms underlying a phenomenon can be modeled at different levels of granularity from different theoretical perspectives. These complementary constructs, models, and levels could be experimentally tested and compared, bridging gaps across levels and conceptual perspectives, which is a crucial issue in cross-disciplinary and translational research.
From a conceptual perspective, construct validity for our robot model would be linked to the construct validity of the cybernetic and signal attenuation models of OCD, since the underlying mechanisms that we use to model OCD are closely related to the mechanism they postulate. In all of the three models (the robot model and the two conceptual models), the emphasis is on the dynamics of interaction among the elements of a regulatory system rather than attempting to locate the problem in and modeling specific brain areas or specific genes. All these models share the use of cybernetics notions and conceive of OCD as a disorder in the decision-making process, in particular, the presence of a high error signal that cannot be eliminated by behavioral output. One of the causes that Pitman proposes for this persistent high error signal is an intrinsic comparator defect, and our pathological case is generated by a fault in the robot's comparator system that gives rise to an error signal that cannot be eliminated through the robot's behavior. In terms of the signal attenuation model, the robot's behavior does not result in feedback as to the success of the behavior since the error signal remains high, so the behavior continues.
Whereas the signal attenuation model has received more attention and has provided more examples of construct validity, we have found little direct investigation of Pitman's model as it applies to humans with OC-spectrum disorders; it thus remains largely theoretical. At this early stage of our research, we can therefore only claim limited and indirect construct validity for our robot model.
Robot models that we have used in previous work, based on similar motivational architectures, have included elements that could allow us to link to anxiety, perception of harm, or an excessive reliance on habits (as opposed to instrumental acts), all of which have been the basis of conceptual models of OCD. This could potentially allow us to expand the construct validity of our model in relation to other theoretical models.

Reliability
An animal or robot model is said to be reliable if the experimental outputs are reproducible (in the sense that the exact experiments can be reproduced, possibly by different experimenters, producing the same results) and extended replications can be run (e.g., conceptual replications, in which the same underlying concepts can be tested in different ways).
Because ours is a robot model with an explicitly programmed controller and a highly controlled environment, we would expect its reliability to be high (highly reproducible) compared to animal models. Indeed, with relatively few runs, we obtained statistically significant results.

Predictive Validity
Predictive validity indicates that the behavior of the robot model can be used to make reliable predictions about the condition being modeled. A particularly important aspect of this is that the model can be used to make predictions about outcomes in the human condition and which interventions will, or will not, work with some degree of accuracy. This aspect of predictive validity, highly important for clinical research purposes, is currently lacking in our model. In our experiments, we did not investigate any "treatments," and at this early stage in the development of the model, we could expect only limited predictive validity. However, this is an important point to be developed in future work, so that our experiments can inform future clinical research.

Is the Robot Model Sufficiently Advanced to Be of Clinical Use? (Decision Point 8)
We consider whether our model might be sufficiently advanced for clinical use. If the model was sufficiently advanced, then we would move to Stage 10 in our design process; otherwise, we need to ask whether the results so far indicate a potential for improvement (see Decision Point 9 later).
Strictly speaking, at this point, the answer is that our model is not yet sufficiently advanced for clinical use. However, we already have suggestions for potential avenues to clinical use. Even at the current stage of development, the robot could be used as a working model to help OCD patients understand their condition and reduce negative feelings about it. For example, seeing how compulsive behavior in the robot results from the perception of a high persistent error that is not corrected through behavior can help them understand, and feel relieved, that similar behavior in them may be the result of a processing error rather than their often held assumption that they are "morally wrong," which can be very emotionally disturbing for them. Although, to our knowledge, robots have not been used in the treatment of OC-spectrum disorders, they have been used as therapeutic tools in other areas, for example, in autism spectrum disorder (see Diehl, Schmitt, Villano, & Crowell, 2012; Pennisi et al., 2016, for reviews). However, our proposed use would differ significantly from these other robots, since they are tools to be used in therapy, mostly as stimuli for interaction, but they are not models of the condition (they do not "have" the condition), whereas our robot is a model (it "has" an OC-spectrum disorder) that we also aim to use as a tool. A closer match for our proposed use would be to our own robot Robin, which is controlled by a related software architecture and includes a model of diabetes. Robin was designed as a tool to support diabetes education and management in children with diabetes (Cañamero & Lewis, 2016), focusing particularly on affective elements of diabetes self-management (Lewis & Cañamero, 2014).

Do the Results So Far Indicate the Potential for Improvement? (Decision Point 9)
At this stage, we need to ask whether the results so far indicate a potential for improvement, particularly with respect to our evaluation criteria, and in the direction of clinical relevance. Let us thus assess potential improvements to our model in the direction of clinical applications.
One of the main treatments for OCD is exposure and response prevention (ERP), which involves habituation to the urges to perform compulsions, resulting in the compulsions being extinguished (Storch & Merlo, 2006). Currently in our model, while we could prevent the robot from grooming, there is no adaptive capability in its controller that would make this change its future behavior. Both this and our model's lack of the capability necessary to develop nonfunctional rituals point to a direction for future research: introducing adaptation in its behavior. This could be done, for example, by making reference values susceptible to change (modulation) through external environmental factors, such as exposure "treatments"; by adding receptors for the internal signals that could in turn be modulated by long-term signal strength, allowing reinforcement or habituation of behaviors (Lones, Lewis, & Cañamero, 2018); or by adding a capability to inhibit behaviors, thus separating obsessions and compulsions. Such additions would be aimed at improving the model's face validity (nonfunctional behaviors) and predictive validity (potential treatments) and hence improving its clinical relevance.
Allowing the robot to adapt and respond to treatment in this way may also provide another avenue to clinical application. Showing the working robot model to patients, as in the previous section, but in this case, also showing the patient the robot's improvement after applying therapy to the robot, might help them to understand and accept the often stressful ERP treatment, in which they are exposed to the triggers for their compulsions.

Accept the Robot Model for Use in Clinical Studies (Stage 10)
In the case that the robot model was sufficiently advanced for clinical trials (Decision Point 8) we would move to Stage 10: "Accept the (improved) robot model for use in clinical studies." This contrasts with the process so far, which has concentrated on model development, more in the domain of robotics research. The details of this stage would depend on the proposed clinical research. One possible route would be to investigate potential treatments by manipulating targeted elements of the model in different ways, either internally in the robot (e.g., by amplifying particular internal signals where problems have been hypothesized in humans) or externally, in the environment (e.g., by exposing the robot to problem situations to analyze whether and how it adapts). If adjusting an element of the robot model reduces symptoms in the robot, then the analogous adjustment in human patients could be investigated as a potential target for intervention. We expect that, initially, such applications of the robot model would result in very broad targets for intervention, but as the robot model is refined, these predictions could also be refined.
In any case, even as more clinically focused research begins, development of the robot model would continue, following the process described in this article. However, feedback from the clinical researchers could be brought to different stages of the robot model development process. For example, phenotypical targets in the selection stage (Stage 3) could be drawn from observations in human subjects in the clinical studies or from elements that are theorized as potential pharmaceutical targets. As another example, the design of robot experiments (Stage 6) could be done with the design of corresponding clinical studies in mind.

Is Further Refinement of the Robot Model Required? (Decision Point 11)
This decision point is similar to Decision Point 9, with the difference that, in Point 9, the robot model has not yet been considered sufficiently advanced for clinical research. Consequently, if further model development were not possible, then the robot model would be rejected as inadequate for clinical use, although it may still shed light on the underlying theoretical model used as the basis for the robot model. In contrast, at Decision Point 11, the model is already considered sufficiently advanced for some clinical research, and this research can potentially continue even as model development stops.

Induction Stage (Stage 12)
Having given some indication of how we can refine our model, we now reach the induction stage. Here we use the knowledge gained from the evaluation stage to refine our assumptions and definitions, both those identified at the consensus stage and any implicit assumptions that we had made and not identified. In our case, we see that we should think carefully about the different properties of what we have called variously a "target," "reference," or a perceived "ideal" value, the generation of the error signal, and what the range of adaptive values might be. Specifically, a "good" reference value for the comparator mechanism for a cybernetic model may not be one that is achievable, and an error signal that can never be reduced to zero may not be an indication of a pathology. While in our case, the induction stage has shed light on an underlying assumption of the cybernetic model, and hence is relevant to research into both conceptual and robot models, the induction stage may also reevaluate the assumptions made about the clinical aspects of the model. For example, the nature of the phenotypes might be reconsidered if the behavior of the robot deviated in some unexpected way from the clinical description, perhaps by showing additional behaviors or internal states. These unexpected observations could indicate either that the model was in error or that the clinical description of the condition was incomplete.

CONCLUSIONS
In this article, we have discussed and illustrated the use of robot models to complement existing computational and animal models in psychiatric research.We have described a design process for robot models of mental disorders stemming from animal models and illustrated this design process with the initial development of a robot model for OC-spectrum disorders, including initial experiments and results.Our model builds on our work on architectures for decisionmaking in autonomous robots and also on existing models of OCD-specifically the cybernetic model and the signal attenuation model-to link with existing research.The design process has also given directions for future work with a view to the model's clinical relevance.
Although this initial stage of development only models the most basic aspects of such disorders, and does not approach the complexity of OCD in humans, our results already serve to shed light on aspects of the theoretical model on which they are based that are not obvious simply from consideration of the model: specifically the nonlinear relationship between the perceived target value and the onset of pathological behavior, and the possible advantage of a mildly unrealistic target. This result might have implications in clinical research and treatment, for example, by helping us understand why some members of a family develop OCD while others do not. This initial development work on a robot model has also generated a hypothesis for future research: that mildly unrealistic target values may provide some advantages for our robot. Such potential advantages may also be explored in humans, in animal models, and in cybernetic systems in general.
To conclude, we would like to add some remarks on the nature of robot models and their relation to other models relevant to computational psychiatry.
As models, robots present very different features from other types of models, such as computational models or simulated environments. To characterize the main differences between computational and robot models, we find it useful to think of the distinction that Herbert Simon, one of the founders of artificial intelligence, drew between types of models in his book The Sciences of the Artificial (Simon, 1981) when trying to characterize the meaning of the terms artificial and simulation. Simon distinguished between models that simulate a system by predicting its behavior and deriving consequences from premises (e.g., a system for weather prediction) and models that are a simulation of a system by embodying a few key features of that system and being made to behave in the same environment, governed by the same laws (e.g., a satellite is not a simulation of a moon; it is a moon, the "real thing"). While computational models fall in the first category, embodied autonomous robot models, such as ours, fall in the second. According to Simon, the first type of model is appropriate for achieving understanding of systems with many parameters, for which it is difficult to predict behavior without complex or extensive calculations, whereas the second type is most useful as a source of new knowledge to understand, by synthesis, the behavior of poorly understood systems. The choice between one or the other type of model will depend on the type of research questions under investigation. Some would perhaps argue that a simulated agent in a simulated environment might also belong to the second type of model and might be preferable to a robot situated in the physical world because replicability of experiments can be higher. We do not think such models belong to the second category but to the first. The complexity (including important features, such as unpredictability and "noise") of the physical world, a physical agent, and their interactions cannot be fully simulated (Brooks, 1991b; Pfeifer & Scheier, 2001). In a simulated environment, we can only see the consequences of the features that we have included in it, even if we simulate some noise and unpredictability; however, in the real world, unexpected noise and unpredictable elements that we had not anticipated might give rise to significant behavior. This is the case in both robots and humans. As a "trade-off," these features might reduce exact replicability, although replicability is still very high when using robots, and if data are properly logged during experiments, it is often possible to analyze when unexpected behavior might be due to noise. In the other direction, this "trade-off" means that the easier replicability of experiments using a simulated agent in a simulated environment comes at the cost of an impoverished model that might leave out features that had not been anticipated by the designer but might end up being significant. Therefore, in addition to the same considerations made regarding the two different types of simulations distinguished by Simon, the choice between a physical robot situated in the physical (and social) environment and a simulated agent in a simulated environment also depends on how important features like dynamics of interaction or embodied sensorimotor loops are to address the question under investigation.

by an Early Career Research Fellowship grant from the University of Hertfordshire, awarded to LC.

Figure 2. An overview of the action selection mechanism for our robot. Rounded boxes represent individual (potentially nested) behaviors, while square-cornered boxes represent other internal components. The actions of the actuators result in changes in the environment and the robot's physiology, which are fed back to the robot controller via the robot's perceptions. Motivations are updated and new behaviors are selected on every iteration of the action selection loop (10 Hz).

Figure 3. The Elisa-3 robot. Left: an Elisa-3 robot, viewed from the front/left. Right: a diagram of the Elisa-3's infrared distance sensors (top view). Arrows indicate how the sensors are used to detect grooming and damage from collisions and sustained rubbing.

Figure 4. The 80 cm × 80 cm environment used in the experiment. Here the robot is feeding at an energy resource (white patch) while the grooming posts (white pipes) stand on the black patches.

Figure 5. Experimental results: the means of the robot's geometric well-being over the lifetime of each run. Larger values indicate better maintained physiological variables. Crosses indicate runs in which the robot died.

Figure 6. Experimental results: the means of the variance of the robot's physiological variables (which can be thought of as a measure of the robot's "physiological balance") over the lifetime of each run. Smaller values indicate better balance between the different physiological variables. Crosses indicate runs in which the robot died.

Figure 8. Experimental results: the percentage of the robot's lifetime that either of the two integument variables was the largest valued (i.e., most well maintained) essential variable. Crosses indicate runs in which the robot died.

Figure 9. Experimental results: the percentage of the robot's lifetime that either of the two integument variables was the smallest valued (i.e., least well maintained) essential variable. Crosses indicate runs in which the robot died.

Table 1. The robot's physiological variables
… | … | … | decreases over time; increases when the robot consumes from an energy resource
Integrity | 0 | 1,000 | decreases on contact with objects; increases over time as the robot "heals"
Integument L | none | 1,000 | decreases over time; increases when the robot's left side passes close to a grooming post
Integument R | none | 1,000 | decreases over time; increases when the robot's right side passes close to a grooming post

Table 2. Experimental results

Table 3. Experimental results: percentage of time during which each motivation was the highest, taken as a percentage of the robots' combined lifetime