We can be pretty sure that, in males, the preoptic area is involved in the control of sexual behaviour because:
1. lesions of this region permanently abolish male sexual behaviour;
2. electrical stimulation of this area can elicit copulatory activity;
3. neuronal and metabolic activity is induced in this area during copulation; and
4. small implants of the male hormone testosterone into this area restore sexual behaviour in castrated rats.

[David Buss (1953– ), a professor in the Evolutionary Psychology Research Lab, University of Texas at Austin, has pioneered the use of modern evolutionary thinking in the psychology of human behaviour and emotion. His primary research has focused on human mating strategies and conflict between the sexes. He has championed the idea that men and women have different long-term and short-term mating strategies, and that monogamous and promiscuous mating strategies may coexist. Some interesting extensions to his work include references to sexual jealousy and coercion, homicide, battery and stalking. In an effort to find empirical rather than circumstantial evidence to show that human psychological preferences have evolved and are not only learned, Buss has performed many cross-cultural studies containing up to 10,000 participants from many countries around the globe. Overall, his evolutionary psychology has highlighted the dynamic and context-sensitive nature of evolved psychological mechanisms.]

In females, the preoptic area is involved in the control of reproductive cycles, and is probably directly involved in controlling sexual behaviour too. The ventromedial nucleus of the hypothalamus (VMH) is also involved in sexual behaviour. Outputs from the VMH project to the periaqueductal gray of the midbrain, and this region is also necessary for female sexual behaviour, including lordosis (the position adopted by a female to accept a male) in rodents. This behaviour can be reinstated in ovariectomized female rats by injections of the female hormones oestradiol and progesterone into the VMH.

Can the brain help us to understand sexual arousal at the sight and smell of someone to whom we are sexually attracted? Via its inputs from the amygdala and orbitofrontal cortex, the preoptic area receives information from the inferior temporal visual cortex (including information about facial identity and expression), the superior temporal auditory association cortex, the olfactory system and the somatosensory system. It is presumably by these neural circuits that the primary rewards relevant to sexual behaviour (such as touch and perhaps smell) and the learned stimuli that act as rewards in connection with sexual behaviour (such as the sight of a partner) reach the preoptic area. And it is likely that, in the preoptic area, the reward value of these sensory stimuli is modulated by hormonal state, perhaps (in females) related to the stage of the menstrual cycle – women are more receptive to these sensory stimuli when they are at their most fertile. The neural control of sexual behaviour may therefore be organized in a similar way to the neural control of motivational behaviour for food. In both systems, external sensory stimuli provide the reward, and the extent to which they do so depends on the organism’s internal state – mediated by plasma glucose concentration for hunger, and by hormonal status for sexual behaviour.
For sexual behaviour, the internal signal that controls the motivational state and the reward value of appropriate sensory stimuli alters relatively slowly. It may change, for example, over four days in the rat oestrus cycle, or over weeks or even months in the case of many animals that only breed during certain seasons of the year.
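The shared logic of the two systems – an external reward signal scaled by a slowly varying internal state – can be sketched in a few lines. This is a toy illustration, not a model from the text: the function name, the numbers and the simple multiplicative rule are all assumptions made here for clarity.

```python
# Toy sketch (not from the text): the reward value of an external stimulus
# is scaled by a slowly changing internal state, e.g. plasma glucose for
# hunger, or hormonal status for sexual behaviour. The multiplicative rule
# and all numbers are illustrative assumptions.

def reward_value(stimulus_strength: float, internal_gain: float) -> float:
    """Reward = external sensory signal scaled by internal motivational state."""
    return stimulus_strength * internal_gain

# The same stimulus is more rewarding when the internal gain is high
# (e.g. hunger, or the fertile phase of a cycle) than when it is low.
assert reward_value(0.8, 1.0) > reward_value(0.8, 0.2)
```

The key point the sketch captures is that the stimulus itself does not change between the two cases; only the internal state does, and with it the reward value.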

The outputs of the preoptic area include connections to the tegmental area in the midbrain. This region contains neurons that are responsive during male sexual behaviour (Shimura & Shimokochi, 1990). But it is likely that only some of the outputs of the orbitofrontal cortex and amygdala that control sexual behaviour act through the preoptic area. The preoptic area route may be necessary for some aspects of sexual behaviour, such as copulation in males, but the attractive effect of sexual stimuli may survive damage to the preoptic area. Research findings suggest that, as for feeding, outputs of the amygdala and orbitofrontal cortex can also influence behaviour through the basal ganglia. Much research remains to be carried out into how the amygdala, orbitofrontal cortex, preoptic area and hypothalamus represent the motivational rewards underlying sexual behaviour. For instance, it has recently been found that the pleasantness of touch is represented in the human orbitofrontal cortex (Francis et al., 1999). Findings such as these can enhance our understanding of sexuality in a wider context.


Sperm warfare
Monogamous primates (those with a single mate) living in scattered family units, such as some baboons, have small testes. Polygamous primates (those with many mates) living in groups, such as chimpanzees, have large testes and copulate frequently. This may be related to what sociobiologists call ‘sperm warfare’. In order to pass his genes on to the next generation, a male in a polygamous society needs to increase his probability of fertilizing a female. The best way to do this is to copulate often and ejaculate a large quantity of sperm, increasing the chances that his sperm will reach the egg and fertilize it. So, in polygamous groups, the argument is that males have large testes to produce large numbers of sperm. In monogamous societies, with less competition between sperm, the assumption is that the male just picks a good partner and produces only enough sperm to fertilize an egg without the need to compete with others’ sperm. He also stays with his partner to bring up the offspring in which he has a genetic investment, and to guard them (Ridley, 1993).
What about humans? Despite widespread cultural pressure in favour of monogamy or restricted polygamy, humans are intermediate in testis size (and penis size) – bigger than might be expected for a monogamous species. But remember that although humans usually do pair, and are apparently monogamous, we also live in groups, or colonies. Perhaps we can find clues about human sexuality from other animals that are paired but also live in colonies. A problem with this type of comparison, though, is that for most primates (and indeed most mammals) it is the female who makes the main parental investment – not only by producing the egg and carrying the foetus, but also by feeding the baby until it becomes independent. In these species, the male apparently does not have to invest behaviourally in his offspring for them to have a reasonable chance of surviving. So the typical pattern in mammals is for the female to be ‘choosy’ in order to obtain a healthy male, and for the males to compete for females. But, because of its large size, the human brain is not fully developed at birth, so infants need to be looked after, fed, protected and helped for a considerable period while their brain develops and they reach independence. So in humans there is an advantage to paternal investment in helping to bring up the children, because the paternal resources (e.g. food, shelter and protection) can increase the chances of the father’s genes surviving into the next generation to reproduce again. In humans, this therefore favours more complete pair bonding between the parents.

Couples in colonies
It is perhaps more useful to compare humans with birds that live in colonies, in which the male and female pair up and both invest in bringing up the offspring – taking turns, for example, to bring back food to the nest. Interestingly, tests on swallows using DNA techniques for determining paternity have revealed that approximately one third of a pair’s young are not sired by the male of the pair (Ridley, 1993). So the female is sometimes mating with other males – what we might call committing adultery! These males will probably not be chosen at random: she may choose an ‘attractive’ male by responding to indicators of health, strength and fitness. One such indicator in birds is the gaudy tail of the male peacock. It has been suggested that, given that the tail handicaps movement, any male that can survive with such a large tail must be very healthy or fit. Another theory is that a female would choose a male with an attractive tail so that her sons would be attractive too and also chosen by females. (This is an example of the intentional stance, since clearly the peahen is incapable of any real propositional thought; but it has also been criticized as representing a somewhat circular line of argument.) Choosing a male with an attractive tail may also benefit female offspring, so the argument goes, because of the implied health/fitness of the fathering peacock. In a social system such as the swallows’, the ‘wife’ needs a reliable ‘husband’ to help provide resources for ‘their’ offspring. A nest must be built, the eggs must be incubated, and the hungry young must be well fed to help them become fit offspring (‘fit’ here means capable of ‘successfully passing on genes into the next generation’). The male must in some sense ‘believe’ that the offspring are his – and, for the system to be stable, some of them must actually belong to him. 
But the female also benefits by obtaining genes that will produce offspring of optimal fitness – and she does this by sometimes ‘cheating’ on her ‘husband’. To ensure that the male does not find out and therefore leave her and stop caring for her young, she ‘deceives’ him by ‘committing adultery’ secretly, perhaps hiding behind a bush to mate with her ‘lover’. So the ‘wife’ maximizes care for her children by ‘exploiting’ her ‘husband’, and maximizes her genetic potential by finding a ‘lover’ with better genes that are subsequently likely to make her offspring more attractive to future potential mates.
The implication is that genes may influence our motivational behaviour in ways that increase their subsequent success.

[Richard Dawkins (1941– ), Professor for the Public Understanding of Science at the University of Oxford, in his book The Selfish Gene (1976), highlighted the way in which natural selection operates at the level of genes rather than individuals or species. W.D. Hamilton, also of the University of Oxford, provided some of the theoretical foundations for this approach (described in The Narrow Roads of Gene Land, 2001). ‘Selfish gene’ theory provides potential explanations for a number of aspects of animal and human behaviour that are otherwise difficult to explain. For example, it explains how the likelihood that an individual will display altruistic behaviour towards another depends on how closely the two are related genetically. This approach has also been used to understand the phenomenon of sperm competition, and the effects that this has on sexual behaviour. This approach is now thought of as a modern version of Darwinian theory, and has set a new paradigm for many disciplines including biology, zoology, psychology and anthropology.]

Are humans like swallows?
Again, how might this relate to human behaviour? Though it is not clear how important they are, there is some evidence to suggest that such factors could play some part in human sexual behaviour. One potentially relevant piece of evidence in humans concerns the relatively large testis and penis size of men. The general argument in sociobiology is that a large penis could be adaptive in sperm competition, by ensuring that the sperm are placed as close as possible to where they have a good chance of reaching an egg, and so displacing other sperm, thereby winning the ‘fertilization race’. A second line of evidence is that studies in humans of paternity using modern DNA tests suggest that husbands are not the biological fathers of about 14 per cent of children (Baker & Bellis, 1995; see Ridley, 1993). So it is possible that the following factors have shaped human sexual behaviour in evolution: women might choose a partner likely to provide reliability, stability, provision of a home, and help with bringing up her children; women might also be attracted to men who are perhaps successful and powerful, increasing the likelihood of producing genetically fit children, especially sons who can themselves potentially have many children; men might engage in (and be selected for) behaviours such as guarding the partner from the attentions of other men, to increase the likelihood that the children in which he invests are his; and men might be attracted to other women for their childbearing potential, especially younger women. Much of the research on the sociobiological background of human sexual behaviour is quite new and speculative, and many of the hypotheses have still to be fully tested and accepted. But this research does have interesting implications for understanding some of the factors that may influence human sexual behaviour.


Just as we need to eat to keep ourselves alive, working to obtain rewards such as food, so we need to have sex and reproduce in order to keep our genes alive. In this part of the chapter, we will look at the following two questions: How can a socio-biological approach (that is, an approach which seeks to reconcile our biological heritage as a species with our highly social organization) help us to understand the different mating and child-rearing practices of particular animal species? How does the human brain control sexual behaviour?


In drinking caused by, for example, water deprivation, both the cellular and extracellular thirst systems are activated. Experiments show that, in many species, it is the depletion of the cellular, rather than the extracellular, thirst system that accounts for the greater part of the drinking, typically around 75 per cent. It is important to note that we continue to drink fluids every day, even when our bodies aren’t deprived of water. In such everyday drinking, the changes in these thirst signals are smaller, partly because drinking has become conditioned to events such as eating foods that deplete body fluids, and also because humans have a wide range of palatable drinks, which stimulate the desire to drink even when we are not thirsty.
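The approximate 75/25 division of deprivation-induced drinking between the two thirst systems can be expressed as a one-line calculation. This is a hedged sketch: the function name and the 20 ml example total are invented here, and only the 75 per cent figure comes from the text.

```python
# Sketch of the approximate split of deprivation-induced drinking between
# the cellular (~75%) and extracellular (~25%) thirst systems. The total
# volume and the function name are illustrative assumptions.

def drinking_shares(total_ml: float, cellular_fraction: float = 0.75):
    """Apportion a total water intake between the two thirst systems."""
    cellular = total_ml * cellular_fraction
    extracellular = total_ml - cellular
    return cellular, extracellular

# e.g. a water-deprived animal drinking 20 ml in total:
assert drinking_shares(20.0) == (15.0, 5.0)
```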


Although the amount of fluid in the extracellular fluid (ECF) compartment is less than that in the cells, it is vital that the ECF be conserved to avoid debilitating changes in the volume and pressure of fluid in the blood vessels. The effects of extracellular fluid loss can include fainting, caused by insufficient blood reaching the brain. The behavioural response of drinking in response to hypovolaemia, a disorder consisting of a decrease in the volume of blood circulation, ensures that plasma volume does not fall to dangerously low levels.
There are a number of ways that ECF volume can be depleted experimentally, including haemorrhage, lowering sodium content in the diet, and excessive sweating, urine production or salivation. Two main thirst-inducing systems are activated by hypovolaemia. One is the renin–angiotensin system mediated by the kidneys. When reductions in blood pressure or volume are sensed by the kidneys, the enzyme renin is released, leading to the production of the hormone angiotensin II, which stimulates copious drinking. A second thirst-inducing system activated by hypovolaemia is implemented by receptors in the heart. For example, reducing the blood flow to the heart in dogs markedly increases water intake (Ramsay et al., 1975). It is still not clear precisely where such cardiac receptors are located. But it seems likely that they are located in the venous circulation around the heart, since the compliance (i.e. the ability to change diameter) of these vessels is high, making them responsive to changes in blood volume. Information from these receptors is probably carried to the central nervous system via the vagosympathetic nerves, from where it is relayed to the brain and drinking behaviour can be regulated.


When our bodies lose too much water, or we eat foods rich in salt, we feel thirsty, apparently because of cellular dehydration, leading to cell shrinkage. For instance, if concentrated sodium chloride solution is administered, this leads to withdrawal of water from the cells of the body by osmosis, and results in drinking. Cellular dehydration is sensed centrally in the brain, rather than peripherally in the body. For instance, low doses of hypertonic sodium chloride (or sucrose) solution infused into the carotid arteries, which supply the brain, cause dogs to drink water, but similar infusions administered into peripheral regions of the body, which don’t directly supply the brain, have no effect (Wood et al., 1977).

The part of the brain that senses cellular dehydration appears to be near or in a region extending from the preoptic area through the hypothalamus.


With all of these brain functions promoting food regulation, why, then, is there such a high incidence of obesity in the world today?

Many different factors can contribute to obesity, and there is only rarely a single cause (see Garrow, 1988). Occasionally, hormonal disturbances, such as hyperinsulinemia (that is, substantially elevated levels of insulin in the bloodstream), can produce overeating and obesity. Otherwise, there are a number of possible contributory factors:

It is possible that the appetite of some obese people is more strongly stimulated by external factors such as the sight and smell of food (Schachter, 1971). The palatability of food is now much greater than it was in our evolutionary past, leading to an imbalance between the reward provided by orosensory signals and the gastrointestinal and post-absorptive satiety signals that normally control the reward value of that sensory input. In other words, the rewards from the smell, taste and texture of food are far stronger than the satiety signals can counteract.
Animals evolved to ingest a variety of foods, and therefore nutrients. So satiety is partly specific to a food just eaten, while appetite remains for foods with a different flavour. Overeating may therefore be partially explained by the tremendous variety of modern foods, encouraging us to eat more by moving from one food to another.
Modern humans take less exercise than our ancestors due to our more sedentary lifestyles, so unless regular exercise is proactively built into our daily lives, we may be inclined to gain weight.

Human meal times tend to be fixed, whereas animals normally regulate their food intake by adjusting the inter-meal interval: a long interval follows a high-energy meal, and a short interval follows a low-energy meal. Quite simple control mechanisms, such as slower gastric emptying (and therefore a feeling of fullness for longer after an energy-rich meal), may contribute to this. But the fixed meal times often preferred by humans prevent this control mechanism from operating normally. Obese people tend to eat high-energy meals and then eat again at the next scheduled mealtime, even though gastric emptying is not yet complete.
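The inter-meal-interval idea can be sketched as a toy model, assuming (purely for illustration) a linear relation between a meal's energy content and the delay before the next meal; the constant and the function name are invented for this example.

```python
# Assumed linear toy model: higher-energy meals are followed by longer
# inter-meal intervals (e.g. via slower gastric emptying). The constant
# of 0.005 hours per kcal is purely illustrative.

def inter_meal_interval_hours(meal_kcal: float,
                              hours_per_kcal: float = 0.005) -> float:
    """Estimated delay before the next meal under the toy linear model."""
    return meal_kcal * hours_per_kcal

# A high-energy meal delays the next meal longer than a light one does.
assert inter_meal_interval_hours(800) > inter_meal_interval_hours(300)
```

Fixed human meal times override this kind of regulation: eating again at a scheduled time, regardless of the interval the physiology would set, corresponds to ignoring the model's output.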
Obese people often eat relatively late in the day, when large energy intake must be converted into fat and is less easily burned off by exercise and heat loss. Regulation of heat loss is one way that animals compensate for excessive energy intake. They do this by activating brown fat metabolism, which burns fat to produce heat. Although brown fat is barely present in humans, there is nevertheless a mechanism that, when activated by the sympathetic nervous system, enables metabolism to be increased or reduced in humans, depending on energy intake (see Garrow, 1988; Trayhurn, 1986).

Obesity may be related to higher stress levels in contemporary society. Stress can regulate the sympathetic nervous system to increase energy expenditure, but at the same time it can also lead to overeating. Rats mildly stressed (e.g. with a paperclip on their tail) show overeating and obesity.
But what of water intake, and drinking?
The human body can survive without food for very much longer than it can survive without water – how does our physiological make-up help direct this vital function? Body water is contained within two main compartments, one inside the cells (intracellular) and the other outside (extracellular). Intracellular water accounts for approximately 40 per cent of total body weight, and extracellular water for about 20 per cent, divided between blood plasma (5 per cent) and interstitial fluid (15 per cent). When we are deprived of water, both the cellular and extracellular fluid compartments are significantly depleted. The depletion of the intracellular compartment is known as cellular dehydration, and the depletion of the extracellular compartment is known as hypovolaemia (meaning that the volume of the extracellular compartment has decreased).
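These percentages can be turned into a short worked example. The 70 kg body weight is an assumption introduced here for illustration; the percentages are those given in the text.

```python
# Worked example of the body-water compartments described in the text,
# for a hypothetical 70 kg adult (the body weight is an assumption).
# Percentages are fractions of total body weight; 1 kg of water ~ 1 litre.

BODY_WEIGHT_KG = 70.0

intracellular = 0.40 * BODY_WEIGHT_KG   # ~40% of body weight (~28 L)
plasma        = 0.05 * BODY_WEIGHT_KG   # extracellular: blood plasma (~3.5 L)
interstitial  = 0.15 * BODY_WEIGHT_KG   # extracellular: interstitial (~10.5 L)
extracellular = plasma + interstitial   # ~20% of body weight (~14 L)
```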

Functions of Striatum and Basal Ganglia

The striatum and other parts of the basal ganglia
We have seen that the orbitofrontal cortex and amygdala are involved in decoding the stimuli that provide the rewards for feeding, and in connecting these signals to hunger/satiety signals. How do these brain regions further connect to behavioural systems? One path is via the hypothalamus, which is involved in autonomic responses during feeding (such as the need for increased blood flow to the gut, to facilitate the assimilation of food into the body), and also in the rewarding aspects of food. Another route is via the striatum (one part of the basal ganglia, requiring dopamine to function – see chapter 3) and then on through the rest of the basal ganglia (see figure 5.5). This route is important as a behavioural output/feeding system, because disruption of striatal function results in aphagia (lack of eating) and adipsia (lack of drinking) in the context of a general akinesia (lack of voluntary movement) (Rolls, 1999; Rolls & Treves, 1998). Neurons in the ventral striatum also respond to visual stimuli of emotional or motivational significance (i.e. associated with rewards or punishments; Williams et al., 1993), and to types of reward other than food, including drugs such as amphetamine (Everitt, 1997; Everitt & Robbins, 1992).


The amygdala
Many of the amygdala’s connections are similar to those of the orbitofrontal cortex, and indeed it has many connections to the orbitofrontal cortex itself. Bilateral damage to the temporal lobes of primates, including the amygdala, leads to the Kluver–Bucy syndrome, in which, for example, monkeys place non-food as well as food items in their mouths and fail to avoid noxious stimuli (Aggleton & Passingham, 1982; Baylis & Gaffan, 1991; Jones & Mishkin, 1972; Kluver & Bucy, 1939; Murray et al., 1996). Rats with lesions in the basolateral amygdala display similar altered food selections. Given the neural connectivity between the orbitofrontal and amygdalar regions, we might relate these phenomena to the finding that lesions of the orbitofrontal region lead to a failure to correct inappropriate feeding responses. Further evidence linking the amygdala to reinforcement mechanisms is illustrated when monkeys perform physical work in exchange for electrical stimulation of the amygdala. For example, they might be prepared to press a lever for a long period of time to receive amygdalar stimulation (via an electrode which has been implanted in their brain), implying that this stimulation is significantly rewarding. In addition, single neurons in the monkey amygdala have been shown to respond to taste, olfactory and visual stimuli (Rolls, 2000a). But although the amygdala is similar in many ways to the orbitofrontal cortex, there is a difference in the speed of learning. When the pairing of two different visual stimuli with two different tastes (e.g. sweet and salt) is reversed, orbitofrontal cortex neurons can reverse the visual stimulus to which they respond in as little as one trial. In other words, neurons in the orbitofrontal cortex that previously ‘fired’ in response to a sweet taste can start responding to a salty taste, and neurons that previously ‘fired’ in response to a salty taste can start responding to a sweet taste, very quickly.
Neurons in the amygdala, on the other hand, are much slower to reverse their responses (Rolls, 2000a). To explain this in an evolutionary context, reptiles, birds and all mammals possess an amygdala, but only primates show marked orbitofrontal cortex development (along with other parts of the frontal lobe). So the orbitofrontal cortex may be performing some of the functions of the amygdala but doing it better, or in a more ‘advanced’ way, since as a cortical region it is better adapted for learning, especially rapid learning and relearning or reversal (Rolls, 1996, 1999, 2000c).

Orbitofrontal cortex for Taste and Smell

The orbitofrontal cortex
Neurons that respond to the sight of food do so by learning to associate a visual stimulus with its taste. Because the taste is a reinforcer, this process is called stimulus-reinforcement association learning. Damage to the orbitofrontal cortex impairs this type of learning by, for example, altering food preferences. We know this because monkeys with such damage select and eat substances they would normally reject, including meat and non-food objects (Baylis & Gaffan, 1991; Butter, McDonald & Snyder, 1969). The functioning of this brain region could have critical implications for survival. In an evolutionary context, without this function of the orbitofrontal cortex, other animals might have consumed large quantities of poisonous foodstuffs and failed to learn which colours and smells signify nutritious foods. The orbitofrontal cortex is therefore important not only in representing whether a taste is rewarding, and so whether eating should occur, but also in learning about which (visual and olfactory) stimuli are actually foods (Rolls, 1996, 1999, 2000c).
Because of its reward-decoding function, and because emotions can be understood as states produced by rewards and punishers, the orbitofrontal cortex plays a very important role in emotion (Rolls, 1999).


Flavour refers to a combination of taste and smell. The connections of the taste and olfactory (smell) pathways in primates suggest that the necessary convergence may also occur in the orbitofrontal cortex. Consistent with this, Rolls and Baylis (1994) showed that some neurons in the orbitofrontal cortex (10 per cent of those recorded) respond to both taste and olfactory inputs. Some of these neurons respond equally well to, for example, both a glucose taste and a fruit odour. Interestingly, others also respond to a visual stimulus representing, say, sweet fruit juice. This convergence of visual, taste and olfactory inputs produced by food could provide the neural mechanism by which the colour of food influences what we taste. For example, experimental participants have reported that a red solution containing sucrose tastes of a fruit juice such as strawberry, even when no strawberry flavour is present; the same solution coloured green might subjectively taste of lime. There is also another olfactory area in the orbitofrontal cortex. Some of these olfactory neurons respond to food only when the monkey is hungry, and so seem to represent the pleasantness or reward value of the smell of food. These neurons therefore function in a similar manner with respect to smell as the secondary taste neurons function with respect to taste. The orbitofrontal cortex also contains neurons that respond to the texture of fat in the mouth. Some of these fat-responsive neurons also respond to taste and smell inputs, and thus provide another type of convergence that is part of the representation of the flavour of food. A good example of a food that is well represented by these neurons is chocolate, which has fat texture, sweet taste and chocolate smell components.

Brain Mechanisms for Eating

Since the early twentieth century, we have known that damage to the base of the brain can influence food intake and body weight. One critical region is the ventromedial hypothalamus. Bilateral lesions of this area (i.e. two-sided, damaging both the left and right) in animals lead to hyperphagia (overeating) and obesity. By contrast, Anand and Brobeck (1951) discovered that bilateral lesions (that is, damage) of the lateral hypothalamus can lead to a reduction in feeding and body weight. Evidence of this type led, in the 1950s and 1960s, to the view that food intake is controlled by two interacting ‘centres’ – a feeding centre in the lateral hypothalamus and a satiety centre in the ventromedial hypothalamus. But problems arose with this dual centre hypothesis. Lesions of the ventromedial hypothalamus were found to act indirectly by increasing the secretion of insulin by the pancreas, which in turn reduces plasma glucose concentration, resulting in feeding. This has been demonstrated by cutting the vagus nerve, which disconnects the brain from the pancreas, preventing ventromedial hypothalamic lesions from causing hypoglycaemia, and therefore preventing the consequent overeating. So the ventromedial nucleus of the hypothalamus is now thought of as a region that can influence the secretion of insulin and, indirectly, affect body weight, but not as a satiety centre per se. On the other hand, the hypothesis that damage to the lateral hypothalamus produces a lasting decrease in food intake and body weight has been corroborated by injecting focal neurotoxins (agents that kill brain cells in a very specific manner, such as ibotenic acid) into rats. These damage the local cell bodies of neurons but not the nerve fibres passing nearby. Rats with lateral hypothalamus lesions also fail to respond to experimental interventions that normally cause eating by reducing the availability of glucose (Clark et al., 1991).

A matter of taste
How are taste signals (which provide one of the most significant rewards for eating) processed through different stages in our brains, to produce (among other effects) activation of the lateral hypothalamic neurons?

Consider some of the brain connections and pathways in the macaque monkey. The monkey is used to illustrate these pathways because neuronal activity in non-human primates is considered to be especially relevant to understanding brain function and its disorders in humans. During the first few stages of taste processing (from the rostral part [rostral towards the head or front end of an animal, as opposed to caudal (towards the tail)] of the nucleus of the solitary tract, through the thalamus, to the primary taste cortex), representations of sweet, salty, sour, bitter and protein tastes are developed (protein represents a fifth taste, also referred to as ‘umami’). The reward value or pleasantness of taste is not involved in the processing of the signal as yet, because the primary responses of these neurons are not influenced by whether the monkey is hungry or satiated. The organization of these first few stages of processing therefore allows the primate to identify tastes independently of whether or not it is hungry. In contrast, in the secondary cortical taste area (the orbitofrontal cortex), [orbitofrontal cortex above the orbits of the eyes, part of the prefrontal cortex, which is the part of the frontal lobes in front of the motor cortex and the premotor cortex] the responses of taste neurons to a food with which the monkey is fed to satiety decrease to zero (Rolls et al., 1989, 1990). In other words, there is modulation or regulation of taste responses in this taste-processing region of the brain. This modulation is also sensory-specific (see, for example, figure 5.6). So if the monkey had recently eaten a large number of bananas, then there would be a decreased response of neurons in this region of the orbitofrontal cortex to the taste of banana, but a lesser decrease in response to the taste of an orange or melon.
This decreased responding in the orbitofrontal cortex neurons would be associated with a reduced likelihood for the monkey to eat any more bananas (and, to a lesser degree, any more orange or melon) until the satiety had worn off.
So as satiety develops, neuronal activity in the secondary taste cortex appears to make food less acceptable and less pleasant – the monkey stops wanting to eat bananas. In addition, electrical stimulation in this area produces reward, which also decreases in value as satiety increases (Mora et al., 1979). It is possible that outputs from the orbitofrontal cortex subsequently influence behaviour via the connections of this region to the hypothalamus, where it may activate the feeding-related neurons described earlier.


The following descriptions of the different signals that control appetite are placed roughly in the order in which they are activated during a meal. All of these signals must be integrated by the brain.

1 Sensory-specific satiety
If we eat as much of one food as we want, the pleasantness rating of its taste and smell changes from very pleasant to neutral. But other foods may still taste and smell pleasant. So variety stimulates food intake.
For example, if you eat as much chicken as you want for a meal, the pleasantness rating of the taste of chicken decreases to roughly neutral. Bananas, on the other hand, may remain pleasant, so you might eat them as a second course even when the chicken has already ‘filled you up’, or produced satiety. This type of satiety is partly specific to the sensory qualities of the food, including its taste, smell, texture and appearance, and has therefore been named sensory-specific satiety (Rolls, 1999).
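The chicken-and-banana pattern above can be expressed as a toy simulation. This is purely illustrative: the decay values and the rating scale are invented assumptions, not empirical parameters from Rolls's work.

```python
# Toy model of sensory-specific satiety (illustrative values only).
# Eating a food to satiety drives its own pleasantness toward neutral (0),
# while other foods lose pleasantness to a much smaller degree.

def eat_to_satiety(pleasantness, eaten, specific_decay=0.9, general_decay=0.2):
    """Return updated pleasantness ratings after eating `eaten` to satiety."""
    updated = {}
    for food, rating in pleasantness.items():
        decay = specific_decay if food == eaten else general_decay
        updated[food] = rating * (1 - decay)
    return updated

ratings = {"chicken": 8.0, "banana": 7.0}   # pre-meal pleasantness (arbitrary scale)
after = eat_to_satiety(ratings, "chicken")
print(after)  # chicken drops near neutral; banana stays fairly pleasant
```

The asymmetry between `specific_decay` and `general_decay` is the whole point: satiety is partly, but not completely, specific to the food just eaten.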

2 Gastric distension
Normally gastric distension is one of the signals necessary for satiety. As we saw earlier, this is demonstrated when gastric drainage of food after a meal leads to immediate resumption of eating.

Gastric distension only builds up if the pyloric sphincter closes. [pyloric sphincter controls the release of food from the stomach to the duodenum] The pyloric sphincter controls the emptying of the stomach into the next part of the gastrointestinal tract, the duodenum. The sphincter closes only when food reaches the duodenum, stimulating chemosensors [chemosensors receptors for chemical signals such as glucose concentration] and osmosensors [osmosensors receptors for osmotic signals] to regulate the action of the sphincter, by both local neural circuits and hormones.

3 Duodenal chemosensors
The duodenum contains receptors sensitive to the chemical composition of the food draining from the stomach. One set of receptors responds to glucose and can contribute to satiety via the vagus nerve, which carries signals to the brain. The vagus is known to represent the critical pathway because cutting this nerve (vagotomy) abolishes the satiating effects of glucose infusions into the duodenum. Fats infused into the duodenum can also produce satiety, but in this case the link to the brain may be hormonal rather than neural (a hormone is a blood-borne signal), since vagotomy does not abolish the satiating effect of fat infusions into the duodenum (see Greenberg, Smith & Gibbs, 1990; Mei, 1993).

4 Glucostatic hypothesis
We eat in order to maintain glucostasis – that is, to keep our internal glucose level constant. Strictly, the crucial signal is the utilization of glucose by our body and brain, as measured by the difference between the arterial and the venous concentrations of glucose. If glucose utilization is low, indicating that the body is not able to extract much glucose from the blood stream, we feel hungry, whereas if utilization is high, we feel satiated. This is confirmed by the following findings:
Rats show a small decrease in plasma glucose concentration just before meals, suggesting that decreased glucose concentration initiates eating (Campfield & Smith, 1990). At the end of a meal, plasma glucose concentration rises, and so does insulin, which helps the glucose to be used by cells.

Injections of insulin, which reduce the concentration of glucose in the plasma (by facilitating its entry to cells and storage as fat), provoke food intake.

Infusions, or injections, of glucose and insulin (together enabling glucose to be taken up by the body’s cells) can reduce feeding.

The brain’s monitoring system for glucose availability seems to be in the part of the brain called the medulla (part of the brainstem), because infusions there of a competitive inhibitor of glucose (5-thio-glucose) also provoke feeding (Levin et al., 2000).
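The core of the glucostatic hypothesis is a feedback rule: eat when glucose utilization (the arterial minus the venous concentration) falls below some set point. A minimal sketch follows; the threshold and the concentration values are hypothetical numbers chosen for illustration, not physiological constants.

```python
# Glucostatic control sketch (hypothetical threshold and values).
# Glucose utilization = arterial concentration minus venous concentration:
# low utilization signals hunger, high utilization signals satiety.

def hunger_state(arterial, venous, threshold=0.5):
    """Classify motivational state from glucose utilization (units arbitrary)."""
    utilization = arterial - venous
    return "hungry" if utilization < threshold else "satiated"

print(hunger_state(5.0, 4.8))  # small A-V difference: low utilization, so hungry
print(hunger_state(7.0, 5.5))  # large A-V difference: high utilization, so satiated
```

Note that it is the arterial–venous *difference* that matters, not the absolute plasma level: a diabetic can have high blood glucose yet low utilization, and feel hungry.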

5 Body fat regulation and the role of leptin
The signals described so far help to regulate hunger from meal to meal, but they are not really adequate for the long-term regulation of body weight and, in particular, body fat. Scientists have therefore searched for another signal that might regulate appetite, based on, for example, the amount of fat in the body. Recent research has uncovered a hormone, leptin (also called OB protein), which performs this function (see Campfield et al., 1995).

6 Conditioned appetite and satiety
If we eat food containing lots of energy (e.g. rich in fat) for a few days, we gradually eat less of it. If we eat food with little energy, we gradually, over days, ingest more of it. This regulation involves learning to associate the sight, taste, smell and texture of the food with the energy that is released from it in the hours after it is eaten. This form of learning has been demonstrated by Booth (1985).

Two groups of participants ate sandwiches with different flavours – one flavour signalling a high energy content and the other a low energy content. On the critical test day, the participants chose to eat few of the sandwiches that tasted like the high energy ones eaten previously, but far more of the sandwiches that had the flavour of the previously consumed low energy sandwiches. And yet, on the test day, all the sandwiches consumed in fact had medium energy content. This suggests that the level of consumption of the medium energy sandwiches on the test day was strongly influenced by the energy content of the sandwiches that had been eaten previously.

Friday, December 3, 2010


To understand how the motivation to eat (and food intake) is controlled, we first need to consider the functions of peripheral factors (i.e. factors outside the brain), such as taste, smell and gastric distension, and control signals, such as the amount of glucose in the bloodstream. Then we can examine how the brain integrates these different signals, learns about which stimuli in the environment represent food, and initiates behaviour to obtain the correct variety and amount.

The functions of some peripheral factors in the control of eating can be demonstrated by the sham feeding preparation. In this preparation, the animal tastes, smells and eats the food normally, but the food drains away from the stomach. This means that, although the animal consumes the food, the stomach does not become full, since the food does not enter the stomach or intestine. Experiments have shown that rats, monkeys and humans will work for food when they are sham feeding, often continuing to eat for more than an hour. This demonstrates that it is the taste and smell of food that provide the immediate reward for food-motivated behaviour. Further evidence for this is that humans are more likely to rate the taste and smell of food as being pleasant when they are hungry. A second important aspect of sham feeding is that satiety (reduction of appetite) does not occur. From this we can conclude that taste, smell and even swallowing (i.e. oropharyngeal factors) do not of themselves make us feel satisfied, or satiated. Instead, satiety is produced by food accumulating in the stomach and entering the intestine. Gastric distension is an important satiety signal, and intestinal signals also have a part to play (Gibbs et al., 1981).

When an animal is allowed to eat to normal satiety and then has the food drained from its stomach, it starts eating again immediately. Moreover, small infusions of food into the duodenum (the first part of the intestine) decrease feeding, indicating satiety. Interestingly, however, animals have difficulty learning to perform a response that brings a reward of food if the food is delivered directly into the stomach, demonstrating that this form of feeding is not very rewarding in itself (see Rolls, 1999). We can draw important conclusions about the control systems for motivated behaviour from these findings:

- Reward and satiety are different processes.
- Reward is produced by factors such as the taste and smell of food.
- Satiety is produced by gastric, intestinal and other signals after the food is absorbed from the intestine.
- Hunger and satiety signals modulate the reward value of food (i.e. the taste and smell of food are rewarding when hunger signals are present and satiety signals are not).

To put this in more general psychological terms, in most behavioural situations the motivational state modulates or controls the reward or reinforcement value of sensory stimuli. So, for example, in certain species the female may apparently find the male of the species ‘sexually attractive’ only during certain phases of the female’s reproductive cycle.
Since reward and satiety are produced by different bodily (i.e. peripheral) signals, one function of brain (i.e. central) processes in the control of feeding is to bring together the satiety and reward signals in such a way that satiety modulates the reward value of food.


In a discrimination learning task the animal is presented with two stimuli (sometimes more) that are associated with different outcomes. For example, a pigeon might be presented with a choice between two discs, one coloured red and the other green; pecking at the green disc will produce food, but pecking at the red disc will not. The pigeon will solve this problem, coming reliably to choose the green disc after a few dozen trials. Its ability to do this task is no puzzle and can be fully explained in terms of standard conditioning processes. More intriguing is the fact that training on such a task will transfer to other similar tasks. If the pigeon is now asked to solve a similar discrimination problem, in which the choice is between blue and yellow discs, learning can occur very rapidly: we call this positive transfer. The original associations involving red and green are clearly irrelevant to this new discrimination task, so the transfer must have some other source. The pigeon appears to have acquired a fairly abstract concept in the course of acquiring the first discrimination – something along the lines of ‘differences in colour are important and should be attended to’. Studies involving primates have produced more dramatic examples of abstraction. In the learning-set procedure (first introduced by Harlow, 1949), a rhesus monkey is presented with two objects and given a small amount of food for picking up one of them. After six trials the original objects are replaced with two new ones and, again, responding to only one of the objects is rewarded. After six trials on this new problem, the objects are again changed, and so on for many, many pairs of objects. Early in training, performance is unremarkable, six trials being insufficient for the monkey to solve the problem. But as training proceeds, performance begins to improve, until finally it is as near perfect as it can be (see figure 4.14). 
After training on hundreds of these problems, the monkey is able to solve a new problem with no more than a single error, switching its choice to the other object if its first choice is wrong, but staying with its original choice if this proves correct. By experiencing many problems of a similar type, the animal appears to abstract some general rule about how to behave in this situation – a rule that allows the near-instantaneous solution of a problem that it had, in fact, never faced before. The rule that operates in this case is the win-stay, lose-shift strategy: in other words, the animal learns to persist with a choice that yields food, but shift to the other object if it does not. Associative theory can go some way towards explaining this. The occurrence of reward (or non-reward) can be regarded as a stimulus that, like any other, can enter into associations or acquire discriminative control over an instrumental action. The special feature of the learning-set procedure is that these stimuli and associations come to dominate the animal’s behaviour to the exclusion of all others. So the animal learns to focus on classes of cues that are accurate predictors of reward and to ignore others that are not. Intensive research is currently going into the nature of such higher-level learning processes that might modulate the mechanisms of simpler associative processes.
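The win-stay, lose-shift rule is simple enough to state directly as a procedure. The sketch below is illustrative: the object labels and the six-trial problem length follow the text, but the function names and reward assignment are invented for the example.

```python
import random

# Win-stay, lose-shift strategy on a two-object discrimination problem.
# After the first (guessed) choice, the animal repeats a rewarded choice
# and switches away from an unrewarded one, so it makes at most one error.

def solve_problem(rewarded, objects=("A", "B"), trials=6):
    """Simulate one learning-set problem; return the number of errors."""
    choice = random.choice(objects)   # first trial is necessarily guesswork
    errors = 0
    for _ in range(trials):
        if choice != rewarded:
            errors += 1               # lose-shift: switch to the other object
            choice = objects[0] if choice == objects[1] else objects[1]
        # win-stay: otherwise keep the same choice on the next trial
    return errors

# Across many novel problems, the strategy never makes more than one error.
error_counts = [solve_problem(rewarded=random.choice(("A", "B"))) for _ in range(100)]
print(max(error_counts))  # at most 1
```

This captures the key empirical signature: a trained monkey solves a brand-new problem with at most a single error, because the rule operates on the outcome of the last choice rather than on the identity of the objects.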


A rat is set to swim in a pool of water from which it naturally wants to escape. It can do this by reaching a small platform, which is just below the surface and not visible to the animal (because of the ‘milkiness’ of the water). Finding the platform on the first trial is a matter of chance, but with further training the rat rapidly learns to swim straight to the platform. Since the rat cannot see the platform to home in on it, how can it be performing this feat? One obvious possibility is that the rat learns to swim in the general direction of some feature of the room outside the pool, which lies on a continuation of the line between its starting point and the platform. But this cannot be the whole story, because in other trials, rats were put back in the pool at a different starting position from that used in training, and the paths that they followed were recorded. Clearly, in these trials, following a line to an extra-pool cue would not work. However, as the results show, even under these conditions the rats were still very good at finding the platform. To explain this in terms of standard conditioning processes, we must assume that the rat learns to approach not single cues, but complex configurations of cues. We know from other training procedures that rats can learn about combined (often referred to as configural) cues. But such learning tends to occur painfully slowly, whereas spatial tasks are mastered much more easily by rats. This suggests that spatial learning operates according to principles quite different from those that underlie classical and instrumental conditioning procedures. 
It is possible that exposure to an environment allows the animal to form a cognitive map of that environment – [cognitive map postulated internalized representation of the layout of the environment in which information about the relative spatial relationships of various features is preserved] some sort of internal representation of the spatial relationships among the cues it has experienced. The animal is then able to navigate because it knows its own position with respect to this internal representation. But no one has yet supplied a full account of the process by which the map is constructed, how the animal knows its own position, and so on.


Repeated presentation of a stimulus that elicits a particular UR will result in habituation – a gradual reduction in the magnitude of the response. [habituation waning of the unconditioned response with repeated presentation of the eliciting stimulus] A good instance in vertebrates is the startle response produced by a sudden loud noise, a response that reliably declines if the noise is regularly repeated. The startle response also shows the phenomenon of dishabituation, [dishabituation restoration of a habituated response by presentation of a strong extraneous stimulus] whereby the response returns when a salient extraneous stimulus (e.g. a flashing light) is presented just before a trial with the habituated noise. The observation that the response can be easily restored in this way shows that habituation is not solely a matter of sensory or motor fatigue – it is a genuine case of learning. And since habituation occurs as a consequence of the presentation of a single event, it is difficult to interpret this form of learning in terms of association formation. The most likely explanation, at least for simple instances of the phenomenon, is that changes occur in the neuronal pathway connecting the S and R that make transmission of nervous impulses less likely to occur. A series of elegant neurophysiological studies by Kandel and colleagues (e.g. Kandel, 1979) using the marine mollusc Aplysia has gone some way towards establishing which synaptic connection loses effectiveness during habituation, and the biochemical basis of this loss. (For this work Kandel was awarded the Nobel prize for medicine.) Loss of the UR is not the only effect produced by stimulus exposure. 
Consider the phenomenon of imprinting, [imprinting the development of filial responses by newly hatched birds to an object (usually the mother) experienced early in life, or more generally the early formation of social attachments in animals] in which a chick becomes attached to a conspicuous object experienced early in life. This behaviour pattern is found only in some species, but other features of the imprinting process appear to be more general. Most animals exposed to complex objects are able to learn the characteristics of the object, and subsequently to distinguish more easily the object from other similar things. This phenomenon is known as perceptual learning. The nature of the mechanism responsible for it is not fully known, but it seems likely that associative processes are involved, in that learning the characteristics of a complex object involves learning that its various features go together. This is achieved by the formation of associative links among its component parts. The perceptual learning process, [perceptual learning exposure to events, increasing subsequent ability to discriminate between them] which enables the animal to build up an accurate representation of the stimulus, probably plays a role in some instances of habituation. When animals are habituated to a complex event, the response can be restored if some element of that complex is omitted or changed. Such dishabituation occurs, it has been suggested (Sokolov, 1963), because animals are sensitive to any mismatch between incoming stimulation and the central representations of events they have already experienced.
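The synaptic account of habituation and dishabituation can be sketched as a toy model. The decay parameter and the full-strength reset are illustrative assumptions; real habituation curves and dishabituation effects are more graded than this.

```python
# Toy model of habituation (illustrative parameters): each presentation of
# the same stimulus weakens transmission in the S-R pathway, shrinking the
# response; a strong extraneous stimulus (dishabituation) restores it.

class HabituatingPathway:
    def __init__(self, decay=0.7):
        self.strength = 1.0          # current synaptic effectiveness
        self.decay = decay

    def present(self):
        """Present the stimulus; return the response magnitude it evokes."""
        response = self.strength
        self.strength *= self.decay  # transmission becomes less effective
        return response

    def dishabituate(self):
        self.strength = 1.0          # extraneous stimulus restores the response

pathway = HabituatingPathway()
responses = [pathway.present() for _ in range(5)]  # startle wanes: 1.0, 0.7, ...
pathway.dishabituate()
print(pathway.present())  # back to the full-strength response
```

The easy reset is what distinguishes this from fatigue: the pathway's capacity is intact, and only its momentary effectiveness has changed.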


Laboratory studies of learning have concentrated on conditioning procedures in which the participants experience two events (two stimuli, or a response and a stimulus) in close contiguity. It is hardly surprising, therefore, that association between events has proved so dominant in theories of learning. This approach has been justified by the assumption that the complex instances of learning shown in our everyday behaviour may well be governed by associative principles. But this should not blind us to the fact that learning can also result from procedures in which there is no intentional pairing of two events.


A further challenge to the principle of contiguity came in the 1960s when psychologists began to realize that the principle might apply only to certain pairings of events. They had long suspected that some associations might form more readily than others, but they were usually able to find reasons to dismiss their worries. For example, when attempts to replicate Watson’s demonstration of emotional conditioning in infants proved unsuccessful when an inanimate object, rather than a live animal, was used as the CS, researchers suggested that the CS was simply not salient enough to be noticed. But an important experiment by Garcia and Koelling (1966) showed selectivity in association formation that could not be easily explained away. Their study demonstrates the phenomenon of preparedness. [preparedness tendency of certain combinations of events to form associations more readily than others] The rats in this study appeared to be prepared to associate external cues with painful consequences and to associate illness with taste cues. But taste did not become readily associated with shock, nor external cues with illness. The usefulness to the rat of having a learning system that operates in this way should be clear; after all, gastric illness is more likely to be caused by something the rat ate than something it heard or saw. But to the psychologist investigating general laws of learning, the preparedness effect constitutes a problem in need of explanation. One possibility is that a principle of similarity operates in conditioning. [principle of similarity suggestion that association formation occurs particularly readily when the events are similar to one another] By this principle, not only should the events to be associated occur together, but if learning is to take place they should also be similar to one another. 
Applying this principle to the Garcia and Koelling result, a taste and an illness might be readily associated because they are similar in that both are detected by receptors (called interoceptors) concerned with the animal’s internal environment. External cues, on the other hand, have little in common with an internal state, making it difficult to associate auditory and visual events with illness. Compared with the massive amount of experimental effort that has been expended on establishing the finer points of the contiguity principle, investigation of the similarity principle has been almost totally neglected. Perhaps we will see more studies in this area before too long.

Predictive power

Blocking has been of special interest not just because it provides an example of the failure of the contiguity principle, but also because it seems to demonstrate the operation of another principle. Animals in the experimental condition learn well about an event with predictive power (the noise in the first stage of training predicts that the US will shortly occur), but they do not learn about an uninformative event (the added light in Phase 2 supplies no added information). The principle here is that conditioning occurs only to a CS that gives information about the likely occurrence of a succeeding event – i.e. what we might term a predictive CS.


The phenomenon of blocking provides an interesting and much-studied instance of failure to learn, in spite of contiguous presentations of CS and US. [blocking training an organism with one stimulus as a signal for an unconditioned stimulus to prevent the organism from learning about a second stimulus when both stimuli are subsequently presented together as signals for the same unconditioned stimulus] In a blocking experiment, animals receive training with what is termed a compound CS (Phase 2) – in this example represented by the simultaneous presentation of a noise and a light followed by a shock reinforcer. However, the experimental group has first received a phase of training in which the noise alone is conditioned (Phase 1). The performance of the control group of participants shows that training (Phase 2) with a compound CS is normally sufficient to establish associations between individual CS elements (noise, light) and the US (shock). So in this control group the light, when subsequently presented on its own, will evoke a CR. But the experimental group shows no (or very little) evidence of learning about the light in Phase 2. Although they have received light–US pairings, just as the control participants have, in Phase 2, the formation of the light–US association appears to have been blocked by initial training with the noise in Phase 1. A possible explanation of the blocking effect links directly to the asymptote phenomenon. Recall that a US representation in a secondary state of activation will not support association formation. In our blocking experiment, Phase 1 training for the experimental group establishes the noise as a CS, enabling it to activate the US representation in a secondary state of activation. 
So for these participants, during Phase 2, the presentation of the US will not be able to produce the state of primary activation, which means that the light introduced as part of the CS at this stage of testing will be unable to acquire associative strength.
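This explanation can be made concrete with the Rescorla–Wagner learning rule, in which every CS present on a trial gains strength in proportion to the surprise left unexplained, λ − ΣV. The parameter values below are invented for illustration; the point is the qualitative pattern, not the numbers.

```python
# Blocking simulated with the Rescorla-Wagner rule (illustrative parameters).
# On each trial, every CS present gains dV = alpha_beta * (lambda - V_total).
# Once the noise alone fully predicts the US, nothing is left for the light.

ALPHA_BETA = 0.3   # combined salience / learning-rate parameter
LAMBDA = 1.0       # asymptotic strength the US will support

def train(strengths, cues_present, trials):
    for _ in range(trials):
        v_total = sum(strengths[c] for c in cues_present)
        error = LAMBDA - v_total                  # surprise left to explain
        for c in cues_present:
            strengths[c] += ALPHA_BETA * error
    return strengths

# Experimental group: Phase 1 (noise alone), then Phase 2 (noise + light).
exp = {"noise": 0.0, "light": 0.0}
train(exp, ["noise"], trials=30)                  # noise reaches asymptote
train(exp, ["noise", "light"], trials=30)         # no error left: light blocked

# Control group: Phase 2 compound training only.
ctrl = {"noise": 0.0, "light": 0.0}
train(ctrl, ["noise", "light"], trials=30)

print(round(exp["light"], 3))   # near 0: the light acquired almost nothing
print(round(ctrl["light"], 3))  # near 0.5: normal compound conditioning
```

Because the pretrained noise already accounts for the US, the error term is essentially zero throughout Phase 2 for the experimental group, which is the formal counterpart of saying the CS-evoked secondary activation leaves the US unable to support new learning.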


When a CS (e.g. a light) and a US (e.g. food) occur together, an association appears to be established between their central (i.e. neural) representations. And the more often they occur together, the stronger this association becomes. This is revealed by the growing strength of the CR (e.g. light-induced salivation). But this growth does not go on forever. With repeated CS–US pairings, the increment in the strength of the CR (and also, we deduce, the underlying association) becomes progressively smaller until there is no observable increase in its strength. At this point – referred to as asymptote – contiguity between the CS (light) and US (food) is no longer producing learning. Why does this happen? The most widely accepted explanation is that, as conditioning proceeds, presentations of the US lose their effectiveness. We know from a number of research studies that, during learning, the formation of a CS (light)–US (food) association allows presentation of the CS to evoke activity in the US representation before the US occurs. To adopt the terms used by the influential theorist Wagner (e.g. 1981), the CS induces a state of secondary activation in the US representation (as opposed to the primary activation produced by the US itself). Wagner proposes that secondary activation is not capable of supporting association formation; furthermore, it stops the US (food) from evoking the primary state of activation. The result is that the US becomes less effective as learning proceeds. As the CS–US link grows stronger, Wagner proposes that the CS (light) becomes more effective at producing the secondary state of activation and the US (food) becomes less able to produce the primary state necessary for further strengthening to occur.
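The negatively accelerated growth described here has a compact formal expression in the Rescorla–Wagner tradition: each pairing adds ΔV = αβ(λ − V), so the increments shrink as the associative strength V approaches the asymptote λ. The parameter values below are illustrative only.

```python
# Asymptotic conditioning curve: each CS-US pairing increments associative
# strength in proportion to the remaining "surprise" (lambda - V), so the
# increments get progressively smaller (illustrative parameters).

ALPHA_BETA = 0.3   # combined salience / learning-rate parameter
LAMBDA = 1.0       # maximum associative strength the US will support

v = 0.0            # associative strength of the CS
increments = []
for trial in range(10):
    delta = ALPHA_BETA * (LAMBDA - v)
    increments.append(delta)
    v += delta

print([round(d, 3) for d in increments])  # 0.3, 0.21, 0.147, ... shrinking
print(round(v, 3))                        # approaching the asymptote of 1.0
```

The shrinking error term (λ − V) is the formal counterpart of Wagner's claim that the US loses effectiveness: as the CS comes to predict the US, each pairing has less left to teach.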

So, while contiguity is important for learning, its nature needs precise specification. The events that must occur together are not so much the CS and US per se as the primary states of activation of their central representations.

[A.R. Wagner (1934– ) and R.A. Rescorla (1940– ) carried out research at Yale University in the late 1960s. Their experiments showed that simple contiguity of the CS and US is not sufficient to produce conditioning, and that it is also necessary for the CS to provide information about the likely occurrence of the US. (The phenomenon of blocking, described here, is an example.) The theoretical model they devised to explain this effect (published in 1972) was able to deal with a wide range of learning phenomena and set the agenda for almost all the research that has been done on associative learning mechanisms since then. Although the details of the Rescorla–Wagner model have been much debated, its central principles have been adopted by a wide range of associative (or ‘connectionist’) theorists attempting to explain not only simple learning processes, but human cognition in general.]


Classical conditioning and instrumental learning both depend on the formation of associations. An association will be formed between a pair of events (two stimuli, or a response and a stimulus) that occur together (in contiguity). This principle of contiguity is clearly important, but it has some limitations. [principle of contiguity the proposal that events must be experienced close together in time and space for an association to be formed between them].

Thursday, December 2, 2010


As we have seen, classical conditioning allows an animal to learn about the relationship between events in the environment and so anticipate what will happen next on the basis of stimuli currently present. If there are grey clouds in the sky, then it will probably rain; if the light is presented, then food may well follow. Instrumental learning is the process by which an animal learns about the relationship between its behaviour and the consequences of that behaviour. And it serves a complementary but equally important function in allowing the animal to control (at least partially) the occurrence of environmental events – in other words, to bring about a desired event or to avoid an aversive event by responding in a particular way. Instrumentally trained responses are not entirely elicited by identifiable stimuli. Instead, they are controlled by their consequences, becoming more likely when they produce a positive result and less likely when they lead to an aversive outcome. As Skinner emphasized, this sort of control is the characteristic feature of what we call ‘voluntary’ behaviour. So the study of instrumental learning and performance is important for what it tells us about the nature of voluntary, goal-directed behaviour.

On the other hand, instrumental learning processes can also play a role in establishing and maintaining behaviour that seems, at first sight, anything but voluntary. Patients with the clinical condition known as obsessive–compulsive disorder (OCD) [obsessive–compulsive disorder (OCD) characterized by intrusive unwelcome thoughts (obsessions) and the need repeatedly to perform certain patterns of behaviour (compulsions), such as hand-washing] suffer from persistent, intrusive, unpleasant thoughts (obsessions) and feel compelled repeatedly to carry out certain acts (compulsions) that they know are senseless but which appear to provide some relief (see chapter 15). OCD can be quite disabling. One patient, who believed that contact with everyday objects contaminated her in some way, felt compelled to shower at least six times a day and to wash her hands very systematically every 20 minutes. With hands rubbed raw and half her working day taken up in these activities, her ability to lead a normal life was severely curtailed. OCD patients tend to feel a build-up of extreme anxiety prior to performing the compulsive ritual, which dissipates as the ritual is enacted. This has been measured both by patients’ own reports and by objective indices such as heart-rate (Hodgson & Rachman, 1972). A parallel can be drawn between such cases and a trained rat ‘compulsively’ responding to the presentation of a tone by jumping a hurdle, and continuing to perform this apparently senseless act for a large number of trials in the absence of any obvious reward. Although this behaviour appears senseless, it becomes understandable when the rat’s training history is known – when it becomes clear that the tone evokes fear by virtue of its initial association with shock and that the response avoids a shock that would otherwise occur. 
In the same way, the rituals performed by OCD patients may well be avoidance responses that are reinforced and maintained because they reduce the sufferer’s state of anxiety. Of course it remains to be explained why the patient has acquired such a fear of dirt, or whatever, in the first place. Nevertheless, this illustration demonstrates the relevance of the analysis of basic instrumental learning processes to an understanding of interesting and important aspects of human behaviour.

Learning and stimulus control

Although the ability of the discriminative stimulus to evoke a (conditioned) motivational state is undoubtedly important, this still does not fully explain how it controls instrumental responding. It is difficult to believe that a rat that receives food for lever-pressing in the presence of a tone is insensitive to the conditional nature of the task – in other words, that it fails to learn that the response yields food only if the tone is on. But the version of two-process theory just described proposes only that the rat will form two simple associations – stimulus–food and response–food. There is no room in this account for the learning of a conditional relationship of the form ‘only lever-pressing in the presence of the tone results in the presentation of food’. This issue has been addressed experimentally in recent years, and several researchers have demonstrated that animals are capable of conditional learning. The stimulus control of performance revealed by these experiments cannot be explained in terms of standard two-process theory, in which discriminative stimuli have their effects solely by virtue of orthodox associations with reinforcers. Instead, it shows that animals are capable of learning the conditional relationship between a stimulus and a particular response–reinforcer relationship. So, discriminative stimuli exert their effects because they are able to trigger not just the representation of the reinforcer but also the more complex, response–outcome representation produced by instrumental training. This represents the learning of a conditional relationship.

Classical conditioning and motivational control

For instance, a rat trained on an avoidance task, in which the sounding of a tone indicates that shock is likely, will, at least before the avoidance response has been fully learned, experience some pairings of the tone and the shock. As well as acquiring a response–outcome association, the rat can also be expected to form a tone–shock association. In other words, classical conditioning will occur, as a sort of by-product of the instrumental training procedure. This Pavlovian (S–S) association, it has been suggested, is responsible for energizing instrumental responding. By virtue of the S–S link, the tone will be able to activate the shock representation, producing in the animal both an expectation of shock and the set of emotional responses that we call fear. The state of fear is presumed to have motivational properties, so that the presentation of the tone could effectively boost the supply of energy that causes the animal to behave. The expectation evoked by the tone also gives value to the outcome. In avoidance learning, the outcome associated with the response is the absence of an event (the omission of shock). The absence of an event would not normally be reinforcing in itself, but it could certainly become so given the expectation that something unpleasant is likely to occur. This account of avoidance learning is a version of two-process theory, [two-process theory emphasizes the interaction of instrumental and classical conditioning processes in producing many types of behaviour] so called because it acknowledges that classical and instrumental learning processes both play a part in determining this type of behaviour. Although the theory was first elaborated in the context of avoidance learning, there is no reason to suppose that it applies only to this procedure. We have already seen how classical conditioning might contribute to the response suppression generated by the punishment procedure.
In the appetitive case, stimuli present when an animal earns food by performing an instrumental response can be expected to become associated with the food. These stimuli will then be able to evoke a positive state (an ‘expectation of food’, a ‘state of hopefulness’) that parallels the negative, fearful, state produced in aversive training procedures.

Partial reinforcement

Skinner, who completely rejected the theoretical law of effect, devoted several years of research (e.g. Ferster & Skinner, 1957) to exploring and demonstrating the power of the empirical law. He worked mostly with pigeons, trained in a Skinner box to peck a disc set in the wall for food reinforcement. Skinner investigated the effects of partial reinforcement [partial reinforcement the delivery of a reinforcer in operant conditioning is scheduled to occur after only a proportion of the responses rather than after all of them (continuous reinforcement)], in which food was presented after some responses but not all. Animals will usually respond well in these conditions, and with some schedules of reinforcement [schedules of reinforcement rules that determine which responses will be followed by a reinforcer in operant conditioning] the rate of response can be very high indeed. If, for example, the animal is required to respond a certain number of times before food is delivered (known as a fixed ratio schedule), there will usually be a pause after reinforcement, but this will be followed by a high frequency burst of responding.

Other ways of scheduling reinforcement control different but equally systematic patterns of response. There is a clear parallel here between the pigeon responding on a partial reinforcement schedule and the human gambler who works persistently at a one-armed bandit for occasional pay-outs.
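The fixed ratio schedule just described can be sketched as a simple counter: the apparatus delivers a reinforcer only after every nth response. This is a minimal illustrative sketch (the ratio of 5 is an arbitrary assumption, not a value from the text):

```python
def fixed_ratio_schedule(ratio):
    """Return a respond() function implementing a fixed ratio (FR) schedule:
    a reinforcer is delivered after every `ratio`-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == ratio:
            count = 0
            return True   # reinforcer delivered
        return False
    return respond

# On an FR-5 schedule, only every fifth response is reinforced.
fr5 = fixed_ratio_schedule(5)
reinforced = [i + 1 for i in range(12) if fr5()]
```

The one-armed bandit parallel would correspond to a variable ratio schedule, in which the fixed counter threshold is replaced by a randomly varying one.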

According to the theoretical version of the law of effect, the only function of the reinforcer is to strengthen a connection between the response (R) that produced that reinforcer and the stimulus (S) that preceded the R. It follows that an S–R learner does not actively know what the consequence of the R will be; rather, the response is simply triggered on the basis of previous contingencies. In other words, the rat in the Skinner box is compelled in a reflex-like fashion to make the R when the S is presented, and it is presumed to be as surprised at the delivery of the food pellet after the hundredth reinforced response as it was after the first. Not only is this an implausible notion, but experimental evidence disproves it. The evidence comes from studies of the effects of reinforcer revaluation on instrumental performance. In one such study, during a first stage of training, rats were allowed to press the lever in a Skinner box 100 times, each response being followed by a sugar pellet. Half the animals were then given a nausea-inducing injection after eating sugar pellets – a flavour-aversion learning procedure. As you might expect, these rats developed an aversion to the pellets, so the reinforcer was effectively devalued.

In the subsequent test phase, the rats were returned to the Skinner box and allowed access to the lever (although no pellets were now delivered). The researchers found that rats given the devaluation treatment were reluctant to press the lever, compared with the control animals. This result makes common sense – but no sense in terms of the theoretical law of effect. According to the strict interpretation of the law of effect, an S–R connection would have been established at the end of the first stage of training by virtue of the reinforcers that followed responding, before the nausea-inducing injection was administered. Subsequent changes in the value of this reinforcer (which, according to the theory, has already done its job in mediating a ‘state of satisfaction’) should have been of no consequence. These results suggest that the critical association in instrumental learning is not between stimulus and response, but between representations of (a) the response and (b) the reinforcer (or, more generally, between the behaviour and its outcome). The stronger this association, assuming that the outcome is valued, the more probable the response will be. But an association with an aversive outcome (i.e. a devalued foodstuff or a punishment) will lead to a suppression of responding. This does not mean that S–R learning can never occur. Often, after long practice, we acquire patterns of behaviour (habits) that have all the qualities of reflexes. In other words, they are automatically evoked by the stimulus situation and not guided by consideration of their consequences. A further finding from this line of research may be an experimental example of this. One group of rats was given extensive initial training in lever-pressing (500 rather than 100 reinforced trials) prior to the reinforcer-devaluation treatment. These animals continued to press the lever in the test phase. One interpretation of this result is that with extensive training, behaviour that is initially goal-directed (i.e. controlled by a response–outcome association) can be converted into an automatic S–R habit. When next you absent-mindedly take the well-worn path from your home to the college library, forgetting that on this occasion you were intending to go to the corner shop, your behaviour has been controlled by an S–R habit rather than the response–outcome relationship – just like the rats.
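The contrast between goal-directed (response–outcome) control and S–R habit can be sketched as follows: a goal-directed response tendency is weighted by the current value of the outcome, whereas a habitual tendency depends only on S–R strength and so survives devaluation. The numbers are purely illustrative assumptions, not data from the experiments described:

```python
def response_tendency(ro_strength, outcome_value, sr_strength, habitual):
    """Goal-directed responding scales with the outcome's current value;
    an S-R habit is triggered by the stimulus regardless of that value."""
    return sr_strength if habitual else ro_strength * outcome_value

# Moderate training (goal-directed): devaluing the sugar pellet abolishes pressing.
goal_before = response_tendency(1.0, outcome_value=1.0, sr_strength=1.0, habitual=False)
goal_after  = response_tendency(1.0, outcome_value=0.0, sr_strength=1.0, habitual=False)

# Extensive training (habit): pressing continues even after devaluation.
habit_after = response_tendency(1.0, outcome_value=0.0, sr_strength=1.0, habitual=True)
```

In this toy scheme, the devaluation test distinguishes the two controllers precisely because only the goal-directed route consults the outcome's current value at the moment of responding.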

If an animal has acquired an S–R habit, then we can predict that the R will occur whenever the S is presented. But what controls performance if learning is the result of a response–outcome association? A rat can be trained to press for food or jump to avoid shock only in the presence of a given stimulus (called a discriminative stimulus) which signals that food or shock is likely to occur. Presumably the response–outcome association is there all the time, so why is it effective in producing behaviour only when the stimulus is present? How does the presentation of the discriminative stimulus activate the existing instrumental association?


Thorndike’s studies of cats in the puzzle box led him to propose the following interpretation of their behaviour: ‘Of several responses made to the same situation, those which are accompanied or closely followed by a state of satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur’ (Thorndike, 1911, p. 244). This is the law of effect as applied to appetitive instrumental learning. Thorndike also put forward (and later retracted) a negative counterpart for the case of punishment, which proposed that certain effects (‘annoyers’) would weaken the connection between a response and the training situation. In modern terminology, Thorndike’s ‘satisfiers’ and ‘annoyers’ are called reinforcers and punishers.

Thorndike’s presentation of the law of effect has two major features:
1. What is learned is a stimulus–response (S–R) association.
2. The role of the effect produced by the response is to determine whether this association will be strengthened or not.
Both of these propositions are debatable and, as we shall shortly see, this theoretical version of the law of effect has not stood up well to further experimental analysis. As an empirical generalization, though, the law seems much more secure. Everyone accepts that the likelihood of an animal responding in a particular way can be powerfully controlled by the consequence of that response.


The Skinner box soon replaced Thorndike’s puzzle box in the laboratory study of instrumental learning. In the version used for the rat, the Skinner box consists of a chamber with a lever protruding from one wall and a nearby food cup into which food pellets can be delivered by remote control. Pressing the lever operates an electronic switch and automatically results in food delivery. So there is an instrumental contingency between the lever-press (the response) and the food (the effect or outcome). A rat exposed to this contingency presses the lever with increasing frequency. The Skinner box is similar to Thorndike’s puzzle box, but instead of using escape from the box as a reward, the animal stays in the box and the reward is delivered directly to it. This is an example of rewarded, or appetitive, instrumental learning, but the same general techniques can be used to study aversive instrumental learning. There are two basic aversive paradigms, punishment [punishment an aversive event as the consequence of a response to reduce the probability of the response] and avoidance [avoidance instrumental training procedure in which performing a given response brings about the omission of an aversive event that is otherwise scheduled to occur].

In the punishment procedure, the event made contingent on the response is aversive. For example, the habit of lever-pressing for food is first acquired. Subsequently, occasional lever-presses produce a brief electric shock through a grid floor fitted to the box. Unsurprisingly, the rate of responding declines. (It is worth adding that, although the effect may not be surprising, it still requires explanation. It often happens in psychology that the basic behavioural facts seem obvious; but when we try to explain them, we realize how little we really understand them.)

In avoidance training, a warning signal occurs from time to time, followed by a foot shock. If the rat presses the lever while the signal is on, the shock is cancelled. So there is an instrumental contingency between the response and the omission of a given outcome. By behaving appropriately, the animal can avoid the shocks. In fact, rats are rather poor at avoidance learning when the response required is a lever-press; they respond better when they are required to jump over a hurdle. So the apparatus usually used is a two-compartment box, with a hurdle separating the two parts. Rats readily acquire the habit of jumping the hurdle in response to the warning signal. Training procedures that inflict pain (however slight) on the animal should obviously be employed only for good reason. Studies like this are justified by the insights they have provided into the nature of human anxiety disorders and neuroses.


At around the time that Pavlov was beginning work on classical conditioning in Russia, E.L. Thorndike, in the United States, was conducting a set of studies that initiated a different tradition in the laboratory study of basic learning mechanisms. Thorndike was interested in the notion of animal intelligence. Motivated by an interest in Darwinian evolutionary theory, comparative psychologists of the late nineteenth century had investigated whether non-human animals can show similar signs of intelligence to those shown by humans. Thorndike took this endeavour into the laboratory. In his best-known experiment, a cat was confined in a ‘puzzle box’ (figure 4.3). To escape from the box, the cat had to press a latch or pull a string. The cats proved able to solve this problem, taking less and less time to do so over a series of trials. They solved it not by a flash of insight but by a gradual process of trial and error. Nevertheless, here was a clear example of learning. Its characteristic feature was that the animal’s actions were critical (instrumental) in producing a certain outcome. In this respect, instrumental learning [instrumental learning the likelihood of a response is changed because the response yields a certain outcome (a reward or punishment) (also called operant conditioning)] is fundamentally different from classical conditioning, in which the animal’s response plays no role in determining the outcome. Subsequent researchers who took up the analysis of this form of learning include the Polish physiologist Konorski (1948), who called it Type II conditioning (as distinct from Pavlov’s Type I conditioning). Another investigator interested in this type of conditioning was Skinner (1938) in the United States, who named it operant conditioning (Pavlov’s version of learning being referred to as respondent conditioning).
[respondent conditioning alternative name for classical conditioning] However termed, all agreed that its defining feature was a contingency between a preceding stimulus, a pattern of behaviour (or response) and a subsequent state of the environment (the effect or outcome).


1. Although the behavioural consequence of conditioning may appear to be merely the development of an anticipatory reflex, the underlying process is fundamental to learning about the relationship among environmental events. Sensory preconditioning tells us that when neutral stimuli co-occur, an association forms between them. Presumably, the informal equivalent of sensory preconditioning will be occurring all the time as an animal goes about its normal everyday business. Simply moving through the environment will expose the animal to sequences of events that go together, and the associations that form among them will constitute an important piece of knowledge – a ‘map’ of its world.
2. As a laboratory procedure, classical conditioning is important because it allows exploration of the nature of associative learning. The observed CR (salivation, pecking, or whatever) may not be of much interest in itself, but it provides a useful index of the otherwise unobservable formation of an association. Researchers have made extensive use of simple classical conditioning procedures as a sort of ‘test bed’ for developing theories of associative learning. Some of these will be described in a later section of this chapter.
3. As a mechanism of behavioural adaptation, classical conditioning is an important process in its own right. Although the CRs (such as salivation) studied in the laboratory may be trivial, their counterparts in the real world produce effects of major psychological significance. Here are two examples from the behaviour of our own species.


What remains to be explained, once the stimulus–stimulus association theory has been accepted, is why the CR should occur and why it should take the form that it does. Pavlov’s dogs might ‘know’, by virtue of the CS–US link, that light and food go together, but this does not necessarily mean that the animal should start to salivate in response to the light. The most obvious explanation is that activation of the US (food) centre will evoke a given response, whether that activation is produced by presentation of the US (food) itself or, via the learned CS–US (light–food) connection, by presentation of the CS (light). An implication of this interpretation is that the CR and the UR should be the same, and this is true for the case just considered: the dog salivates (as a UR) to food and also comes to salivate (as a CR) to the light that has signalled food. In other examples of conditioning, however, the CR and UR are found to differ. In the autoshaping procedure, for instance, the UR is to approach and peck inside the food tray, whereas the CR that develops with training is to approach and peck at the light. In this case, the CR appears to be a blend of the behaviour that activation of the US (food) centre tends to evoke and the behaviour evoked by the CS (the light) itself. So we cannot say that the CR and the UR are always the same. There is, however, a simple rule that describes the relationship between them for most cases of conditioning: as a result of classical conditioning, the animal generally comes to behave toward the CS (the light in these examples) as if it were the US (food). In other words, the CS (light) appears to take on some of the properties of the US (food) and to serve as an adequate substitute for it. So the unconditional response of a hungry animal is to approach food, probably salivating as it does so, and then to consume the food (by pecking, if the animal is a pigeon).
The CR consists of directing these behaviour patterns toward the CS, in so far as the physical properties of the event used as the CS will allow this. This rule is sometimes referred to as the stimulus substitution hypothesis. [stimulus substitution when the conditioned stimulus comes to acquire the same response-eliciting properties as the unconditioned stimulus]

Sensory preconditioning

[sensory preconditioning pairing of two neutral stimuli prior to one of them being used as the conditioned stimulus in a standard classical conditioning procedure, leading to the other stimulus acquiring the power to evoke the conditioned response] If this account is correct, it should be possible for an association to form between paired neutral stimuli that themselves evoke no dramatic responses. Evidence that this can occur comes from a phenomenon called sensory preconditioning, first demonstrated by Brogden (1939) and confirmed many times since. In Brogden’s experiment, the animals in the critical experimental condition received a first stage of training consisting of paired presentations of two neutral stimuli, a light and a buzzer. If our theory is correct, an association should be formed between the central representations of these stimuli. The problem is to find a way to reveal this association. Brogden’s solution was to give a second stage of training in which one of the original stimuli (say the light) was given orthodox conditioning, being paired with a US until it came to evoke a CR (in this procedure, a response of flexing the leg). A final test showed that the buzzer was also able to evoke the leg flexion response, even though the buzzer had never previously been paired with the US. This result is what might be expected on the basis of the stimulus–stimulus association theory. The light evokes the CR by virtue of its direct association with the US, whereas the buzzer is able to do so ‘by proxy’ because its association with the light allows it to activate the representation of that stimulus.
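The ‘by proxy’ activation suggested by Brogden’s result can be sketched as the chaining of two learned links: the buzzer activates the light representation, which in turn activates the US representation. The link weights below are arbitrary illustrative values, not estimates from the experiment:

```python
def cr_strength(stimulus, links):
    """Activation spreads along learned associative links; the CR evoked by a
    stimulus is proportional to the product of weights on its path to the US."""
    if stimulus == "US":
        return 1.0
    return sum(w * cr_strength(dst, links)
               for (src, dst), w in links.items() if src == stimulus)

# Stage 1 forms buzzer->light; stage 2 forms light->US.
links = {("buzzer", "light"): 0.6, ("light", "US"): 0.9}
direct   = cr_strength("light", links)    # conditioned directly with the US
by_proxy = cr_strength("buzzer", links)   # never itself paired with the US
```

The sketch captures the qualitative prediction: the buzzer evokes the CR despite never having been paired with the US, though more weakly than the directly conditioned light.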

Autoshaping and aversion learning

Autoshaping [autoshaping classical conditioning used with pigeons which results in pecking at an illuminated response key that has been regularly presented before the delivery of food, even though the delivery of the food does not depend on the pecking behaviour]

A hungry pigeon is presented with grain (US) preceded by the illumination for ten seconds of a small light (CS) fixed to the wall of the cage. After 50 to 100 trials, the bird develops the CR of pecking at the light prior to food delivery. It is as if the bird is predisposed to respond to the light even though the pecking does not influence whether or not it receives the grain.

Flavour aversion learning [flavour aversion learning classical conditioning procedure in which animals are allowed to consume a substance with a novel flavour and are then given some treatment that induces nausea, resulting in the flavour being subsequently rejected]

Rats are given a novel flavour (e.g. saccharin is added to their drinking water) as the CS. This is followed by a procedure, such as the injection of a mild poison into their body, that makes them feel sick (the US). When it is subsequently made available, the rats will no longer consume the saccharin-sweetened water; they have developed an aversion (CR) to that flavour. This is clearly a very varied set of phenomena, but what they all have in common is the presentation of two stimuli, one contingent on the other. And, despite the fact that there is nothing in these training procedures that actually requires a change in behaviour, in every case the animal’s behaviour changes as a result of its experience. In the autoshaping case, for instance, the experimenter simply ensures that the light reliably accompanies food. There is no need for the pigeon to respond to the light in any way, since food is delivered regardless of the bird’s behaviour. So why does behaviour change? Why are conditioned responses acquired? This puzzle must be dealt with by more detailed theoretical analysis.

When a dog trained by Pavlov’s procedure sees the light (CS), certain neural mechanisms are activated. Without specifying what these mechanisms are, we can refer to this pattern of activation as constituting a representation of the CS. This is often referred to as the CS ‘centre’, implying that it is localized in a specific part of the brain, although this might not necessarily be the case (for the purposes of our current behavioural analysis, this does not matter too much). Eating food (the US) will also have its own pattern of proposed neural activation, constituting the US representation or ‘centre’. One consequence of the Pavlovian conditioning procedure is that these two centres will be activated concurrently. Pavlov suggested that concurrent activation results in a connection between the two centres, which allows activation in one to be transmitted to the other. So, after Pavlovian learning has taken place, presentation of the CS becomes able to produce activity in the US centre, even when the food has not yet been presented. This theory therefore explains classical conditioning in terms of the formation of a stimulus–stimulus association between the CS centre and the US centre. (Given this framework, the fact that the presentation of the US provokes an obvious response is not strictly relevant to the learning process.)
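Pavlov’s suggestion that concurrent activation strengthens the connection between the two centres can be sketched as a simple Hebbian-style update. The learning rate and the asymptote of 1.0 are illustrative assumptions, not values from the text:

```python
def update_link(weight, cs_active, us_active, lr=0.1):
    """Strengthen the CS-US connection only when both centres are active
    at the same time, approaching an asymptote of 1.0."""
    if cs_active and us_active:
        weight += lr * (1.0 - weight)
    return weight

w = 0.0
for _ in range(20):                  # twenty paired light+food trials
    w = update_link(w, cs_active=True, us_active=True)

# After training, presenting the CS alone activates the US centre via the link.
cs_alone_activation = w
```

After training, the learned weight lets CS-centre activity be transmitted to the US centre even when no food is presented, which is the stimulus–stimulus account of why the light alone comes to produce food-related activity.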


Following Pavlov’s pioneering work, the study of classical conditioning has been taken up in many laboratories around the world. Few of these have made use of dogs as the subjects and salivation as the response, which are merely incidental features of conditioning. The defining feature is the paired presentation of two stimuli – the CS and the US. The presentation of the US is often said to be contingent on (i.e. to depend on) the presentation of the CS. Here are just a few of the wide range of training procedures that employ this contingency:
Conditioned emotional response [conditioned emotional response result of the superimposition of the pairing of a conditioned and an unconditioned stimulus on a baseline of operant or instrumental behaviour]. The experimental animal, usually a rat, is presented with a neutral cue, such as a tone sounding for one minute (the CS), paired with a mild electric shock (US) that occurs just as the tone ends. After several pairings (the exact number will depend on the intensities of tone and shock), the rat’s behaviour changes. It begins to show signs of anxiety, such as freezing and other ‘emotional responses’, when it hears the tone before the shock has occurred. This is the CR.


Pavlov spent the first half of his long scientific career working on the physiology of digestion, turning to the study of learning in about 1900. He had noticed that dogs which salivate copiously when given food also do so in response to other events – for example, at the approach of the laboratory attendant who supplied the food. This response was clearly acquired through experience. Pavlov (1927) took a version of this procedure into the laboratory, making it a model system that could be used to reveal basic principles of learning. Pavlov’s standard procedure involved a quiet, distraction-free laboratory, which gave the experimenter full control over events experienced by a lightly restrained dog. From time to time the dog was given access to food, and each presentation was accompanied (usually slightly preceded) by the occurrence of a neutral event, such as a flashing light. After several training trials (pairings of light and food), the dog would salivate at the flash of light, before any food had appeared.

Salivation at the presentation of food is called an unconditioned response (UR) [unconditioned response (UR) evoked by a stimulus before an animal has received any explicit training with that stimulus], since it occurs automatically (unconditionally). The food is an unconditioned stimulus (US)[unconditioned stimulus (US) evokes an unconditioned response]. The animal’s tendency to salivate when the light flashes is conditional on the light having been paired with food, so this is referred to as a conditioned response (CR)[conditioned response (CR) evoked by a conditioned stimulus as a result of classical conditioning] and the event that evokes it as a conditioned stimulus (CS)[conditioned stimulus (CS) evokes a conditioned response as a result of classical conditioning]. The whole training procedure was labelled conditioning. As other forms of training, introduced later, have also been described as conditioning, Pavlov’s version became known as classical conditioning.

[I.P. Pavlov (1849–1936), born the son of a priest in Ryazan (250 miles south-east of Moscow), moved in 1870 to study natural science and medicine in St Petersburg. He spent the rest of his life there conducting scientific research, first on the physiology of the digestive system (for which he was awarded a Nobel prize in 1904) and later on conditioned reflexes. Although the study of conditioned reflexes was taken up mostly by psychologists, Pavlov insisted that his approach as a physiologist was far superior to that adopted by the comparative psychologists of his day. His demonstration of the salivary conditioned reflex in dogs, for which he is widely known, was just the start of an extensive body of work, in which he analysed the conditioning process in detail, revealing phenomena and suggesting learning mechanisms that are still being actively investigated today.]



[classical conditioning learning procedure in which two stimuli are paired – one (the conditioned stimulus) usually presented shortly before the other (the unconditioned stimulus) to produce a conditioned response to the first stimulus]

The physical basis of the changes that constitute learning lies in the brain, and neuroscientists are close to discovering exactly what these changes are. At the psychological level, learning is described in terms of theoretical concepts, foremost among which is the concept of association. [association a link between two events or entities that permits one to activate the other (such as when a characteristic odour elicits an image of the place where it was once experienced)] There is a philosophical tradition, going back at least 300 years, which supposes that, when two events (ideas or states of consciousness) are experienced together, a link, connection or association forms between them, so that the subsequent occurrence of one is able to activate the other. In the twentieth century the proposal was taken up by experimental psychologists, who thought that association formation might be a basic psychological process responsible for many, if not all, instances of learning. The first to explore this possibility in any depth was the Russian I.P. Pavlov with his work on classical conditioning.


The brain can solve immensely difficult computational problems. We can judge distances, we can identify objects, we can walk through complex environments relying solely on vision to guide us. These abilities are way beyond the capacities of current computers, even though their processing elements operate very much faster than our neurons. So, how can we solve complex problems of visual geometry so rapidly? The key lies in the brain’s parallel processing capacity. In principle, different aspects of a visual stimulus are analysed by different modules in the brain. One module may deal with form, another with motion, and another with colour. By splitting up the task in this way, it is possible to solve complicated problems rapidly. There might also be an evolutionary explanation for modularity in the brain. To add a new kind of perceptual analysis to our existing perceptual systems, the simplest route would be to leave those systems unchanged and simply ‘bolt on’ the new feature. The alternative would be to rewire and reconfigure all the existing systems to add the mechanisms for the new analysis, and it is hard to imagine how this might happen without the risk of radically disrupting the pre-existing systems. A computational stratagem like this poses new problems, however. There needs to be some way of ensuring that the different aspects of a stimulus, although processed separately, are nonetheless related to each other. A cricket ball heading towards you is red, shiny, round and moving. You need to know that these separate attributes all refer to the same object. How does our brain solve this problem? It is possible that the different brain regions analysing different aspects of the same stimulus show synchronized oscillations, which act to link these structures together (e.g. Gray et al., 1989). The visual processing modules are in some senses independent, but not completely so. 
Identifying a shape or form, for example, sometimes depends on solving the problem of colour or reflectance. A uniformly coloured, curved surface, lit from above, may emit different wavelengths of light from different points on the surface. We perceive it as being a single colour partly because we are also seeing it as a curved surface. And our perception of it as being curved equally depends partly upon light intensities and/or wavelengths reaching us from different points on the surface. So form and reflectance need to be solved simultaneously, and the solutions are, to some extent, interdependent. Although parallel processing is still an appropriate way to explain how the brain solves problems, as so often seems to be the case, things are a bit more complex than that.