A Reduced Self-Positive Belief Underpins Greater Sensitivity to Negative Evaluation in Socially Anxious Individuals

Positive self-beliefs are important for well-being and are influenced by how others evaluate us during social interactions. Mechanistic accounts of self-beliefs have mostly relied on associative learning models. These account for choice behaviour, but not for the explicit beliefs that trouble socially anxious patients; neither do they speak to self-schemas, which underpin vulnerability according to psychological research. Here, we compared belief-based and associative computational models of social evaluation in individuals who varied in fear of negative evaluation (FNE), a core symptom of social anxiety. We used a novel analytic approach, 'clinically informed model-fitting', to determine the influence of FNE symptom scores on model parameters. We found that high-FNE participants learn more easily from negative feedback about themselves, manifesting in greater self-negative learning rates. Crucially, we provide evidence that this bias is underpinned by an overall reduced belief about self-positive attributes. The study population could be characterized equally well by belief-based or associative models; however, large individual differences in model likelihood indicated that some individuals relied more on an associative (model-free) strategy, while others relied more on a belief-guided one. Our findings have therapeutic importance, as positive belief activation may be used to specifically modulate learning.

Author Summary

Understanding how we form and maintain positive self-beliefs is crucial to understanding how things go awry in disorders such as social anxiety. The loss of positive self-belief in social anxiety, especially in interpersonal contexts, is thought to be related to how we integrate evaluative information that we receive from others. We frame this social information integration as a learning problem and ask how people learn whether someone approves of them or not.
We thus elucidate why the decrease in positive evaluations manifests only for the self, but not for an unknown other, given the same information. We investigated the mechanics of this learning using a novel computational modelling approach, comparing models that treat the learning process as a series of stimulus-response associations with models that treat learning as the updating of beliefs about the self (or another). We show that both model classes characterise the process well, and that individuals higher in symptoms of social anxiety learn more from negative information specifically about the self. Crucially, we provide evidence that this originates from a reduction in the number of positive attributes that are activated when the individual is placed in a social evaluative context.

• the action: the valence of the attribute word that the computer persona will choose in a particular trial, a ∈ {+ve word, −ve word};
• the state or context: s ∈ {persona: like, neutral, dislike} × {self, other};
• the reinforcement value of the outcome: r ∈ {+1 = correct, −1 = incorrect} on each trial t.
The Valence model contained separate learning rates λ for positively and negatively valenced outcome words (i.e. +ve and −ve information), regardless of which action led to them (and so regardless of r_t). The Valence model made no distinction between self and other.
Q_t(a_t, s_t) = Q_{t−1}(a_t, s_t) + λ_pos (r_t − Q_{t−1}(a_t, s_t))  for a +ve word outcome  (1)

Q_t(a_t, s_t) = Q_{t−1}(a_t, s_t) + λ_neg (r_t − Q_{t−1}(a_t, s_t))  for a −ve word outcome  (2)

Learning rates λ further varied as follows, and are fully listed in Table S1.
Self/other valence model: separate λs for positive and negative information, and for self and other, resulting in 4 learning rates.
Self/other asymmetric valence model: separate λs for positive and negative information for self, but only one for other (i.e. 3 learning rates).
Self-valence model: separate λ for positive information for self only, and a general learning rate for all other information, resulting in 2 learning rates.
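As a concrete illustration, the Rescorla-Wagner update of equations 1-2 can be sketched as follows. This is a minimal sketch, not the authors' fitting code; the dictionary-based Q-table and function names are our own.

```python
# Sketch of the valence-model update (eqs. 1-2): one learning rate for trials
# ending in a positive word, another for trials ending in a negative word.

def update_q(q, action, state, reward, outcome_valence, lam_pos, lam_neg):
    """One Rescorla-Wagner step with valence-specific learning rates.

    outcome_valence: '+' if the outcome word was positive, '-' if negative.
    """
    lam = lam_pos if outcome_valence == '+' else lam_neg
    key = (action, state)
    q_old = q.get(key, 0.0)                  # fixed-uncertainty init: Q = 0
    q[key] = q_old + lam * (reward - q_old)  # prediction-error update
    return q[key]

# Example: a correct (+1) trial ending in a positive word, with lam_pos = 0.3
q = {}
v = update_q(q, 'choose_pos', ('like', 'self'), +1, '+', lam_pos=0.3, lam_neg=0.1)
# v == 0.0 + 0.3 * (1 - 0.0) == 0.3
```

The self/other model variants listed above differ only in how many distinct λ values are passed in for the different state components.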

Initial values
Models without an initial bias parameter were initialised using two different methods.

1. Fixed uncertainty. This operationalised the first trial for each persona by setting q_{t=0} = 0, expressing equal weight between the positive and negative word and capturing a state of equal uncertainty.

Initial bias free parameter.
The initial bias parameter, q_0, allowed the starting expectations to vary between −1 and 1. This was applied either to both self and other or to the self only, and impacts the first trial for every persona.
The positivity bias, ρ, was applied to each round of the expectation update as a constant term allowed to vary between −1 and 1. Again, this was applied either to both self and other or to the self only.
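A sketch of an expectation update with the positivity bias is given below. The exact placement of ρ within the authors' update is an assumption of this sketch; here it simply enters as an additive constant each round.

```python
# Sketch: one expectation update with a constant positivity bias rho in [-1, 1].
# Where rho enters the update, and the clipping of the expectation, are
# assumptions of this illustration.

def update_with_positivity_bias(q_old, reward, lam, rho):
    """One Rescorla-Wagner step plus a constant positivity bias rho."""
    q_new = q_old + lam * (reward - q_old) + rho
    # keep the expectation in the same [-1, 1] range as the bias parameters
    return max(-1.0, min(1.0, q_new))

# A positive rho nudges expectations upward even after negative feedback:
q = update_with_positivity_bias(0.0, -1.0, lam=0.2, rho=0.1)   # -0.2 + 0.1 = -0.1
```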

Model Fit
The fit of competing models estimated by MLE was compared using the Bayesian and Akaike information criteria (Akaike, 1998; Schwarz, 1978) (see eq. 3) at an individual level; these were then summed over participants.
AIC = 2k − 2 ln p(d|θ_ML),  BIC = k ln(n) − 2 ln p(d|θ_ML),  (3)

where k is the number of free parameters, n the number of trials, d the observed data, and p(d|θ_ML) the maximised likelihood of the parameters given the data.
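These criteria can be computed directly from the maximised log-likelihood; the following is a standard implementation sketch with our own variable names.

```python
import math

def aic(ll, k):
    """Akaike information criterion from the maximised log-likelihood ll."""
    return 2 * k - 2 * ll

def bic(ll, k, n):
    """Bayesian information criterion; n is the number of trials."""
    return k * math.log(n) - 2 * ll

# Example: ll = -120.0, k = 3 free parameters, n = 192 trials
a = aic(-120.0, 3)        # 246.0
b = bic(-120.0, 3, 192)   # 3 * ln(192) + 240, roughly 255.77
```

Lower values indicate better fit after penalising model complexity; BIC penalises extra parameters more heavily than AIC once n > 7 or so.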
Although AIC and BIC are the most commonly reported model-fit statistics, their raw values are not interpretable on their own. Factors such as the number of participants and the number of trials influence the scale of the measures, rendering comparison across studies difficult. The scores can be standardized to produce a pseudo-R² value, which reflects the variance explained by the model relative to the log likelihood of a pure-chance model (Camerer & Ho, 1999; Daw, O'Doherty, Dayan, Seymour, & Dolan, 2006).

Social Evaluation Modeling
We can compute the log likelihood of the chance model simply by inputting the total number of trials in our experiment, t = 192. We can then compute the pseudo-R² value from the ratio of the two values, as 1 − L/C, where L is the log likelihood of the fitted model and C is the log likelihood of the chance model for each participant.
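This computation can be sketched as follows, assuming the standard Camerer & Ho form pseudo-R² = 1 − L/C with a binary-choice chance model.

```python
import math

def pseudo_r2(ll_model, n_trials=192, n_options=2):
    """Pseudo-R^2 relative to a pure-chance model (Camerer & Ho form).

    ll_model: maximised log-likelihood of the fitted model.
    For 192 binary-choice trials, C = 192 * ln(0.5).
    """
    ll_chance = n_trials * math.log(1.0 / n_options)
    return 1.0 - ll_model / ll_chance

# Example: a model with log-likelihood -100 over 192 binary trials
r2 = pseudo_r2(-100.0)   # 1 - 100 / (192 * ln 2), roughly 0.25
```

A pseudo-R² of 0 means the model does no better than chance; values approaching 1 mean near-perfect prediction of choices.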
As explained in the main text, we fitted all models hierarchically and report the appropriate fit statistic for hierarchical models, LOO. Table S1 gives all fit statistics for all models. We used the model comparison procedure of Vehtari, Gelman, and Gabry (2017), which compares the expected log pointwise predictive density (ELPD) for each model. We deemed an ELPD difference important when it exceeded 5 times the SE of the estimate.
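The 5 × SE decision rule above can be sketched as a one-line check; the values and variable names below are illustrative, not from the study.

```python
# Sketch of the ELPD-based decision rule: a model is preferred only if the
# ELPD difference exceeds a threshold multiple of its standard error.

def elpd_difference_important(elpd_a, elpd_b, se_diff, threshold=5.0):
    """Return True if model A beats model B by more than threshold * SE."""
    return (elpd_a - elpd_b) > threshold * se_diff

# Example: a difference of 12 with SE 2 exceeds 5 * 2 and so counts as important
decision = elpd_difference_important(-500.0, -512.0, se_diff=2.0)   # True
```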

Table S1
List of associative learning models and their fit indices. Each model also had one decision variability parameter, τ.

Belief-Update Model Descriptions
Belief models contained α and β trait parameters, a memory parameter η, and one decision variability parameter, τ. A full list of belief-update (BU) models is given in Table S2.
Assuming that a participant used an effective number N of items of information, of which at least one was positive and one negative, asymptotic consistency (if only positive or only negative information were presented for very many trials) meant that the update equations for the state component of beliefs took the form:

α_t = 1 + η (α_{t−1} − 1) + o_t,  β_t = 1 + η (β_{t−1} − 1) + (1 − o_t)

We did not use additional parameters to weigh the feedback information o_t and 1 − o_t, to limit over-parameterising the model; this allowed us to test whether preferential learning from positive and negative outcomes could simply be accounted for by activated evidence about the self, α_trait and β_trait, as described in the main text.

Initial bias(1).
Models had separate initial starting beliefs applied to the first round of the experiment, formalizing the intuition that, upon entering a new environment, individuals may activate modifiable as well as trait-like components of their schemata which differ from their general traits. At trial 1 (t = 1), the policy was evaluated according to free parameters α_init and β_init and default state values (1).
At trial 2, we assumed that individuals converged to policies determined by (accumulated evidence + traits).

Initial bias(2).
Models again had separate initial beliefs, this time decaying over time.
4. Fixed uncertainty. Here, we assumed a maximally uncertain starting state, given by the beta distribution Beta(1, 1). Again, we applied this only to trial 1.

Belief-Update Link Functions
We explored different ways in which belief uncertainty might impact choice variability.

Expectation only policy.
Here, we assumed that belief uncertainty has no impact on choice variability. The individual chose solely on the basis of the expectation of each option, α/(α + β) and β/(α + β), and a fixed decision noise τ, via a standard softmax function:
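A sketch of this policy is given below. How τ enters the softmax (here as a temperature dividing the expectations) is an assumption of the sketch, not a detail confirmed in the text.

```python
import math

def choice_prob_positive(alpha, beta, tau):
    """P(choose the +ve word) under the expectation-only policy: a softmax over
    the belief expectations alpha/(alpha+beta) and beta/(alpha+beta), with
    fixed decision noise tau (temperature parameterisation assumed)."""
    e_pos = alpha / (alpha + beta)
    e_neg = beta / (alpha + beta)
    z_pos = math.exp(e_pos / tau)
    z_neg = math.exp(e_neg / tau)
    return z_pos / (z_pos + z_neg)

# Under the maximally uncertain start Beta(1, 1), the policy is indifferent:
p_uncertain = choice_prob_positive(1.0, 1.0, tau=0.5)   # 0.5
p_positive = choice_prob_positive(10.0, 2.0, tau=0.5)   # > 0.5
```

Note that only the expectations enter the policy; two belief states with the same mean but very different spread (e.g. Beta(2, 2) and Beta(20, 20)) yield identical choice probabilities.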

Strength-of-evidence based policy
Here, choice became more deterministic with the amount of evidence accumulated in favour of one or the other option, i.e. as |α − β| increased:
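The exact functional form used for this policy is not reproduced above; the following is one plausible sketch of the qualitative idea, in which the evidence gap |α − β| sharpens the softmax.

```python
import math

def strength_of_evidence_prob(alpha, beta, tau):
    """P(choose the +ve word) when choice becomes more deterministic as the
    evidence gap |alpha - beta| grows. This functional form is an assumption
    of the sketch, illustrating the qualitative idea only."""
    e_pos = alpha / (alpha + beta)
    e_neg = beta / (alpha + beta)
    gain = abs(alpha - beta) / tau   # more accumulated evidence -> less noise
    z_pos = math.exp(gain * e_pos)
    z_neg = math.exp(gain * e_neg)
    return z_pos / (z_pos + z_neg)

# Same expectation (0.6), ten times the evidence: choice is more deterministic.
p_weak = strength_of_evidence_prob(3.0, 2.0, tau=1.0)
p_strong = strength_of_evidence_prob(30.0, 20.0, tau=1.0)
```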

Belief uncertainty using sampling
Here, the policy was randomly sampled from a distribution which was that of the belief itself, albeit spread out through the decision noise τ. The Expectation-only policy, which was agnostic to belief uncertainty, fitted the data best (Equation 11, Table S2); belief uncertainty did not further increase decision noise. Given the absence of evidence for an effect at this level, we opted not to run these models hierarchically.
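One way to sketch a sampling-based policy is shown below. How τ spreads the belief distribution (here by shrinking the evidence counts toward Beta(1, 1)) is an assumption of the sketch, not the paper's exact form.

```python
import random

def sample_choice_prob(alpha, beta, tau, rng=random):
    """Draw P(choose the +ve word) from a tau-flattened Beta(alpha, beta).

    The flattening scheme (shrinking counts toward Beta(1, 1) as tau grows)
    is an illustrative assumption.
    """
    a = 1.0 + (alpha - 1.0) / (1.0 + tau)
    b = 1.0 + (beta - 1.0) / (1.0 + tau)
    return rng.betavariate(a, b)

# With a positive-leaning belief, sampled policies favour the +ve word on
# average, but vary from trial to trial (unlike the expectation-only policy).
random.seed(0)
samples = [sample_choice_prob(10.0, 2.0, tau=0.5) for _ in range(1000)]
mean_p = sum(samples) / len(samples)
```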

Hopkins, Button, Dolan & Moutoussis
Table S2
List of belief-update models. Each also contained one decision variability parameter, τ. 'n.f.' refers to models not fit hierarchically.

Self-schema: from intuition to key computational hypothesis
Inspired by the work of Koban et al. (2017), but also by the cognitive-behavioural hypothesis that activated, schema-based beliefs about the self appear to 'filter' information and facilitate congruent learning, we performed simulation experiments to establish the prima facie validity of our hypotheses. We set up a model based on the schema-based formulation and asked whether such a model would behave, from an associative learning point of view, as if it learnt faster from congruent information.
Here, we demonstrate the key simulation that led to the hypothesis tested in the main analysis.
In this demonstration, a person accumulates evidence about how much two raters like them, one 80% positive and one 80% negative. They simply add 1 to their positive tally α if they perceive a positive observation from social feedback (o_t = 1), or 1 to their negative tally β if they receive disapproving feedback (o_t = 0).
However, our person has an amount of 'active background evidence' about themselves. For example, they may already have positive information weighing as much as α_0 = 10 items of positive observations in mind, and β_0 = 2 items of negative information active about themselves. This is our proxy for the positive self-bias in the active schema; however, it could be reinterpreted as an assumption about whatever the object of learning is, depending on the experiment. Their total active evidence is therefore:

α_total = α_0 + α,  β_total = β_0 + β

Clinically, people's self-schemas seem persistent in their influence and hard to overcome with new evidence, perhaps because new information is often easily forgotten.
We wondered whether this corresponds to Rescorla-Wagner learning, where the learning rate inherently induces replacement, that is, forgetting, of old association values with prediction-error-based ones. We translated this clinical and computational intuition to the self-evaluation setting by assuming that new evidence, whether of approval or disapproval, is imperfectly remembered from trial to trial with a memory parameter 0 < m < 1, giving:

α_t = m α_{t−1} + o_t,  β_t = m β_{t−1} + (1 − o_t)

For the purposes of demonstration, we write the memory parameter in its simplest form and provide further explanation of the form actually used in the main work in equations 21-23 below. Finally, CBT-like belief strength in this simple model is simply the proportion of active evidence about the statement in question:

b_t = α_total / (α_total + β_total)

Having formulated this simple model of evidence accumulation into beliefs, how can we map it onto a value-based, associative model? We could complete the simulation by linking beliefs to behaviour and fitting this behaviour with an associative model, but greater conceptual clarity can be achieved as follows. In our task, we take the choice probabilities π to scale according to belief strength in a schema model, but with exp(action value) in a value-based model (consistent with, for example, the standard Gibbs softmax): µ ∼ π ∼ e^Q, so that, for example, Q ∼ ln(b_t). (We have omitted proportionality constants which do not materially affect this demonstration.) Next, we can solve the Rescorla-Wagner learning rule as if the learning rate were the unknown, and obtain for each belief update in the cognitive model the equivalent associative learning rate:

λ_t = (Q_t − Q_{t−1}) / (r_t − Q_{t−1})

These effective learning rates can be averaged to demonstrate whether they are indeed higher for schema-congruent information in the presence of α_0 > β_0. This led to the hypothesis that, in the real data, higher learning rates would indeed correspond to congruent activated schemas, with more positive or less negative information activated about the self, and either of the two being relatively blunted in those with high FNE.
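The machinery of this demonstration can be sketched in a few lines. Parameter values, variable names, and the reward targets are ours; the Q = log(belief strength) mapping follows the Gibbs-softmax argument in the text, and this sketch illustrates the computation rather than reproducing the paper's simulation results.

```python
import math

def simulate_effective_lr(observations, alpha0=10.0, beta0=2.0, m=0.9):
    """Per-trial effective associative learning rates for a stream of o_t in {0, 1}.

    Evidence gathered in the task decays with memory m and sits on top of the
    active background evidence (alpha0, beta0). Belief strength maps to an
    action value via Q = log(b); the Rescorla-Wagner rule is then solved for
    the learning rate, with a simplified reward target r in {+1, -1}.
    """
    alpha, beta = 0.0, 0.0
    rates = []
    b_prev = alpha0 / (alpha0 + beta0)
    for o in observations:
        alpha = m * alpha + o        # imperfectly remembered approval tally
        beta = m * beta + (1 - o)    # imperfectly remembered disapproval tally
        b = (alpha0 + alpha) / (alpha0 + alpha + beta0 + beta)
        q_prev, q = math.log(b_prev), math.log(b)
        r = 1.0 if o == 1 else -1.0  # simplified reward target
        rates.append((q - q_prev) / (r - q_prev))
        b_prev = b
    return rates

# An 80%-positive rater: mostly approving feedback with occasional disapproval
stream = [1, 1, 1, 1, 0] * 4
lrs = simulate_effective_lr(stream)
```

Averaging such per-trial rates separately for positive and negative observations, under different background schemas (α_0, β_0), is what the text describes as the key simulation.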
Note, however, that in this demonstration only self-schema parameters are considered, whereas in principle the models fitted to the real data contained several more parameters that might account for descriptive differences in behaviour, or even fail to capture these altogether. Note also that the 'equivalent associative learning rate' above does not map exactly onto the λ_pos and λ_neg in our associative models above.

Memory, or the leaky accumulation of evidence
Participants may gradually forget older evidence, as well as accumulate new evidence, about themselves.We thus took the evidence α t and β t to be subject to decay, as well as trial-dependent accumulation.
Furthermore, we wanted α_t and β_t to be no less than 1. If this condition is met, the belief distribution has a value of 0 at the extremes p = 0 and p = 1. This formalizes the commonsense assumption that people do not think that their most likely self-evaluation is perfectly positive, nor perfectly negative. U-shaped beliefs are also possible in principle, but here we adopted a more conservative framework.
We now consider how participants may update information in their working memory by partially forgetting older information. We denote the maximum size of effective memory by N = max(α_t + β_t). We also need the beliefs represented by α_t and β_t to make sense if participants do not, as yet or through memory decay, have any evidence about the current context. We assume that they then become agnostic about how they will be evaluated by others, which is achieved by α_t = β_t = 1, so that α_t, β_t ≥ 1. Happily, this also excludes nonsensical values for α_total and β_total. If n+, n− is the positive and negative evidence retained from the task itself, over and above the minimum of 1, then

α_t = 1 + n+_t,  β_t = 1 + n−_t

Consistent with traditional views of working-memory update, we take old observations to be chosen at random for replacement. This is on average equivalent to a forgetting process operating on α_{t−1}. We now show that this forgetting rate is fully determined by the requirement that α tend to N − 1 (so that β is still at least 1) if all o = 1, and to 1 if all o = 0; that is, max(α_t) = N − 1. Only the n+ part of α decays before updating, so for example in the absence of positive feedback

n+_t = η n+_{t−1}

If the participant receives positive evidence all the time, n+ tends to its fixed point, n+_max = 1/(1 − η), so that max(α_t) = 1 + 1/(1 − η) = N − 1. Substituting this into the update equation for α_t gives:

α_t = 1 + η (α_{t−1} − 1) + o_t
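A minimal numerical check of the asymptotics of this update is sketched below (the value of η is illustrative): under constant approval, α converges to 1 + 1/(1 − η), i.e. N − 1, and under constant disapproval it stays at 1.

```python
# Sketch of the leaky-accumulation update: alpha_t = 1 + eta * (alpha_{t-1} - 1) + o_t.

def run_alpha(observations, eta, alpha_init=1.0):
    """Iterate the decaying evidence update for a stream of o_t in {0, 1}."""
    alpha = alpha_init
    for o in observations:
        alpha = 1.0 + eta * (alpha - 1.0) + o
    return alpha

eta = 0.8
alpha_all_pos = run_alpha([1] * 200, eta)   # converges to 1 + 1/(1 - eta) = 6.0
alpha_all_neg = run_alpha([0] * 200, eta)   # stays at 1.0
```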

When it comes to real data, η is a free parameter to be fitted. Apart from 0 ≤ η < 1 (or equivalently 2 < N < ∞), the value of η is not constrained (though it can be subject to regularizing or empirical priors). It is directly related to the effective memory. β_t is estimated in exactly the same manner:

β_t = 1 + η (β_{t−1} − 1) + (1 − o_t)

Finally, we note that in the models fitted here o_t is either 0 or 1. In future work, agents can be considered that over-count or under-count information.

Correspondence between associative learning and belief-update models
In order to determine the stability of the λ_self-neg, α_self correlation across models, we computed the correlation for each nested model variant within the best-fitting model family. Figure S3 displays the correlations between the self-negative learning-rate parameter from variants of the self/other valence models and the α_self parameter from the best-fitting self/other belief-update models. All parameter correlations were

Figure S1. Simulated data for different parameter values of the λ_self-neg parameter from the self/other valence model. Parameter values differentiate behaviour according to FNE in self (a) conditions, but not in other (b) conditions.

Figure S2.

Figure S3. Correlations between the λ_self-neg learning-rate parameters from variants of the self/other valence models and the α_self parameter from the best-fitting belief-update model.

Figure S4. Correlations between the α_self parameters from variants of the self/other belief-update models and the λ_self-neg learning-rate parameter from the best-fitting self/other valence models.

Figure S5.

Figure S6. Generative performance for the Belief-Update Simple model (Model 1, Table S2): mean cumulative positive words chosen for actual data (black) vs. data generated from MCMC fits (cyan). Data are visualised using median-split FNE scores (lighter = lower BFNE) and shaded zones represent ± SEM. The generated data capture the asymmetries in positive vs. negative word selection, but not the key high vs. low FNE differences for the self/other distinction.

Figure S7. Generative performance for the Associative Learning S/O Valence model (Model 12, Table S1): mean cumulative positive words chosen for actual data (black) vs. data generated from MCMC fits (cyan). Data are visualised using median-split FNE scores (lighter = lower BFNE) and shaded zones represent ± SEM. The generated data capture the asymmetries in positive vs. negative word selection and the key high vs. low FNE differences, similarly to Model 18, the belief-update self/other (BU S/O) model; BU S/O models thus contained α_self/β_self and α_other/β_other.

Figure S8.

Table S4
Generative statistics for simple models. Both simple models display poor ability to reproduce many of the key statistics, including the 3-way interaction.

Table S5
Generative performance statistics from full models.

Table S6
Parameter weights on FNE from the full models, derived from clinically informed model-fitting.The key FNE parameter differences were the same as for the selected models.