Learning
Classical Conditioning (Respondent Learning) Pavlov, 1920
A procedure in which a neutral stimulus is repeatedly paired with a stimulus that already triggers a reflexive response until the previously neutral stimulus alone provokes a similar response:
1. UCS (meat powder) ----------------------->UCR (salivation)
2. Neutral stimulus (tone) ------------------> orienting response
3. Neutral stimulus (tone) + UCS ----------> UCR (salivation)
4. CS (tone) -----------------------------------> CR (salivation)
· subject is passive
· responses are typically autonomic or emotional
· continued pairings of a CS with UCS strengthen conditioned responses
Essential concepts
Extinction: if the CS is repeatedly presented without the UCS, the conditioned response becomes weaker and eventually disappears
- the CS–CR link can be recovered by:
a) repeating the CS–UCS pairings (reconditioning)
b) allowing a rest period without presenting the CS (spontaneous recovery)
Spontaneous Recovery: after extinction, the conditioned response often reappears if the CS is presented again after a rest period
· the recovered response is weaker than the original CR
· the longer the interval between extinction and re-presentation of the CS, the stronger the recovered response
Reconditioning: the rapid recovery of the CR after extinction
· if the conditioned and unconditioned stimuli are paired once or twice after extinction, reconditioning occurs; that is, the CR reverts to its original strength
Stimulus Generalisation: conditioned responses occur to stimuli that are similar but not identical to the conditioned stimulus (e.g. a fear of dogs generalising to a fear of anything furry)
· usually requires a dozen trials
· the CR diminishes in proportion to how much the new stimulus differs from the original CS
Stimulus Discrimination: allows some stimuli to prompt a conditioned response but not others; limits generalisation
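The acquisition and extinction dynamics described above can be sketched as a toy simulation. The learning rule below (a simple delta rule in the spirit of Rescorla–Wagner models), the learning rate, and the trial counts are illustrative assumptions, not part of these notes:

```python
# Toy simulation of acquisition and extinction of a conditioned response.
# The delta-rule update and all parameter values are illustrative
# assumptions, not a claim about any specific experimental result.

def trial(strength, ucs_present, rate=0.3, ucs_max=1.0):
    """Update associative strength after one CS presentation."""
    # Pairing with the UCS pulls strength toward ucs_max;
    # presenting the CS alone (extinction) pulls it back toward 0.
    target = ucs_max if ucs_present else 0.0
    return strength + rate * (target - strength)

strength = 0.0
# Acquisition: CS paired with UCS on every trial.
for _ in range(10):
    strength = trial(strength, ucs_present=True)
acquired = strength

# Extinction: CS presented alone.
for _ in range(10):
    strength = trial(strength, ucs_present=False)
extinguished = strength

print(f"after acquisition: {acquired:.2f}")    # climbs toward 1.0
print(f"after extinction:  {extinguished:.2f}")  # decays back toward 0
```

Continued pairings strengthen the response with diminishing returns, and extinction weakens it without erasing it to exactly zero, which is loosely consistent with the partial recovery described above.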
The Signaling of Significant Events
· organisms acquire conditioned responses when one event reliably signals another
· classical conditioning works best when the CS precedes the UCS
· this is known as forward conditioning
· backward and simultaneous conditioning are slow
· a CR develops best if the interval between CS and UCS is no more than about one second (Ross & Ross, 1971)
· the strength of a CR and the speed of conditioning increase as the intensity of the UCS increases
· second-order conditioning occurs when a CS becomes powerful enough to make CSs out of stimuli associated with it.
· stimuli (such as the appearance of a white coat) that precede the UCS (which may be a painful injection) can become a CS for the fear response
· possible model for acquisition of phobias
· organisms seem to be biologically prepared to learn certain associations e.g. taste aversions
· this is known as biopreparedness
· taste aversions violate the usual timing of classical conditioning
· to account for real-life findings in humans, the theory must include the concepts of:
· incubation: increase in strength of emotional CR as consequence of repeated brief exposure to CS
· preparedness: some stimuli are more likely to become CS than others
Delayed conditioning
· the onset of the CS precedes that of the UCS, and the CS continues at least until the UCS is presented
Simultaneous conditioning
· onset of both stimuli is simultaneous
· less successful than delayed conditioning
Trace conditioning
· CS ends before the onset of the UCS, and the conditioning becomes less effective as the delay between the two increases
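The three timing arrangements can be summarised by comparing CS and UCS onset and offset times. The function below is a hypothetical sketch (the parameter names and time units are my own, not standard notation):

```python
# Classify a conditioning paradigm from CS and UCS timing.
# Times are in arbitrary units; this function is an illustrative
# sketch, not standard notation.

def paradigm(cs_on, cs_off, ucs_on):
    if cs_on == ucs_on:
        return "simultaneous"   # both stimuli start together
    if cs_on < ucs_on and cs_off >= ucs_on:
        return "delayed"        # CS starts first and is still on at UCS onset
    if cs_on < ucs_on and cs_off < ucs_on:
        return "trace"          # CS ends before the UCS begins
    return "backward"           # UCS precedes the CS

print(paradigm(0, 5, 3))  # delayed
print(paradigm(0, 2, 5))  # trace
print(paradigm(0, 3, 0))  # simultaneous
```

In a fuller simulation one would also make trace conditioning weaken as `ucs_on - cs_off` grows, per the note above.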
Some applications of classical conditioning
1. Learned immune responses (Ader & Cohen, 1993)
2. Phobias
3. Systematic desensitization (Joseph Wolpe)
4. Predator Control
Little Albert (Watson and Rayner, 1920)
· experimental induction of phobia using classical conditioning
· used an 11-month-old boy
· white rat + loud noise resulted in the eventual fear of the rat without the noise
· this fear generalised to any furry animal
Opponent-Process Theory - Solomon (1980)
Habituation is the result of a relatively automatic, involuntary A-process (essentially an unconditioned response, e.g. a drug effect) and a conditioned B-process that follows and counteracts the A-process. This theory may explain drug tolerance and some cases of drug overdose
Instrumental and Operant Conditioning
The Law of Effect: (Edward Thorndike)
It holds that any response that produces satisfaction becomes more likely to recur, and any response that produces discomfort becomes less likely. Thorndike called this type of learning instrumental conditioning – responses are strengthened when they are instrumental in producing rewards, and the repetition of a behaviour increases the likelihood of its recurrence (habit strength).
Operant Conditioning: (B. F. Skinner, 1938)
The organism is free to respond at any time, and conditioning is measured by the rate of responding – the organism learns a response by operating on the environment
· subject is active
· likely to be using consciously controlled behaviours
· stimulus generalization, discrimination, extinction and spontaneous recovery also occur in operant conditioning
Basic Components of Operant Conditioning
· an operant is a response that has some effect on the world
· a reinforcer increases the probability that the operant preceding it will occur again
· positive reinforcers strengthen a response if they are experienced after that response occurs – equivalent to rewards
· negative reinforcers strengthen a response if they are removed after it occurs – e.g. pain, or threats of punishment
· both escape conditioning and avoidance conditioning are the result of negative reinforcement
· escape conditioning results when behaviour terminates a negative reinforcer (e.g. a dog in a shuttle box escaping an electric shock)
· it learns to make a response to an aversive stimulus
· very resistant to extinction
· avoidance conditioning results when behaviour avoids a negative reinforcer; it reflects both classical and operant conditioning
· the organism learns to respond to a signal (e.g. light) that avoids the aversive stimulus
· examples include stopping at a red light, or going to work when we don’t really want to
· behaviours learned through avoidance conditioning are very resistant to extinction – they are often reinforced by fear reduction
· discriminative stimuli indicate whether reinforcement is available to a particular behaviour
Forming and Strengthening Operant Behaviour
· shaping involves reinforcing successive approximations of the desired response
· utilizes operant conditioning
· e.g. training circus animals
· primary reinforcers are inherently rewarding (e.g. food, sex)
· secondary reinforcers are rewards that people or animals learn to like because of their association with primary reinforcers (e.g. money - its reinforcing power lies in its association with the rewards it can bring, or smiles and encouragement)
· they are effectively conditioned reinforcers
· the speed of conditioning is proportional to the size of reinforcer
· reinforcement may be delivered on the following schedules:
1. continuous reinforcement schedule: a reinforcer is delivered every time a particular response occurs
2. partial/intermittent reinforcement schedule: reinforcement is delivered only some of the time:
Fixed-ratio (FR) schedules: reinforcement follows a fixed number of responses
· responding typically shows a ‘post-reinforcement pause’ after each reward
Variable-ratio (VR) schedules: reinforcement follows a number of responses that varies unpredictably from one reinforcement to the next, e.g. a gambling machine that pays off after an unpredictable number of lever pulls, averaging one in twenty (a VR 20 schedule)
· more likely to produce emotional outbursts during the learning phase
· less likely to produce emotional outbursts during the extinction phase
Fixed-interval (FI) schedules: provide reinforcement for the first response that occurs after some fixed time has passed since the last reward, regardless of how many responses have been made during that interval (e.g. a competition you cannot win more than twice in one day)
Variable-interval (VI) schedules: reinforce the first response after some period of time, but the amount of time varies (e.g. police stopping drivers at random and awarding prizes to those wearing their seat-belts)
· in general, the rate of responding is higher under ratio schedules than under interval schedules
· the unpredictable timing of rewards under variable-interval schedules generates slow but steady responding
· the cumulative record of responding over time is smooth for variable-interval and variable-ratio schedules, and scallop-shaped for fixed-interval schedules
· in fixed-interval schedules, responding increases as the time for reinforcement draws near, and drops just after reinforcement
· behaviour learned through partial reinforcement, particularly through variable schedules, is very resistant to extinction; this is called the partial reinforcement extinction effect
· partial reinforcement is involved in superstitious behaviour, which results when a response is coincidentally followed by a reinforcer – this is an example of accidental reinforcement (e.g. lucky shirt)
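The ratio schedules above can be sketched as simple reward dispensers: FR-n pays off deterministically on every n-th response, while VR-n pays off with probability 1/n per response, so rewards arrive after an unpredictable number of responses averaging n. The interface below (one call per response, returning True when reinforcement is delivered) is an illustrative assumption:

```python
import random

# Toy dispensers for the FR and VR schedules described above.

def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean, rng):
    """VR-mean: each response pays off with probability 1/mean,
    so rewards follow an unpredictable number of responses averaging `mean`."""
    return lambda: rng.random() < 1.0 / mean

rng = random.Random(42)        # fixed seed so the run is repeatable
fr20 = fixed_ratio(20)
vr20 = variable_ratio(20, rng)

fr_rewards = sum(fr20() for _ in range(10_000))
vr_rewards = sum(vr20() for _ in range(10_000))
print(fr_rewards)  # exactly 500: one reward per 20 responses
print(vr_rewards)  # close to 500 on average, but unpredictably spaced
```

Both dispensers deliver roughly one reward per twenty responses; the behavioural difference lies in their predictability, which is what drives the post-reinforcement pause under FR and the steady responding and extinction resistance under VR.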
Punishment and Learning
· Punishment decreases the frequency of a behaviour by following it either with an unpleasant stimulus or with the removal of a pleasant one (the latter is known as a penalty). It has several drawbacks:
1. it only suppresses behaviour (e.g. children will repeat punished acts if they think they can avoid detection)
2. fear of punishment may generalize to the person doing the punishing
3. it is ineffective when delayed. If a child confesses to wrongdoing and is then punished, the punishment may discourage honesty rather than eliminate undesirable behaviour
4. it can be physically harmful
5. it may teach aggressiveness
6. it teaches only what not to do, not what should be done to obtain reinforcement
Reinforcement strengthens behaviour; punishment weakens it. For example, if a shock is turned off when a rat presses a lever, that is negative reinforcement; if a shock is turned on when the rat presses the lever, that is punishment, and the rat will be less likely to press the lever again
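The distinction in the rat example reduces to a two-by-two table: whether the stimulus is pleasant or aversive, crossed with whether it is presented or removed after the response. A minimal sketch (the function name and string labels are my own, not standard API):

```python
# Classify an operant contingency from two facts: is the stimulus
# pleasant or aversive, and is it presented or removed after the
# response? Names and labels are illustrative.

def classify(stimulus, change):
    """stimulus: 'pleasant' or 'aversive'; change: 'presented' or 'removed'."""
    table = {
        ("pleasant", "presented"): "positive reinforcement",  # behaviour strengthened
        ("aversive", "removed"):   "negative reinforcement",  # behaviour strengthened
        ("aversive", "presented"): "punishment",              # behaviour weakened
        ("pleasant", "removed"):   "penalty",                 # behaviour weakened
    }
    return table[(stimulus, change)]

# The rat example from the notes:
print(classify("aversive", "removed"))    # shock turned off after lever press
print(classify("aversive", "presented"))  # shock turned on after lever press
```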
· punishment is most effective when:
1. it is immediate
2. it is of sufficient intensity to suppress the response on the first occasion, rather than starting at low intensity
3. it is specified why punishment is being given and that the behaviour, not the person, is being punished
4. more appropriate responses are identified and positively reinforced (Differential Reinforcement of Other behaviour: DRO)
Clinical relevance
· shaping: reinforcement of successive approximations to desired/ effective behaviour
· occurs when complete response is complex
· used in teaching and is accompanied by instruction, prompting, and encouragement
· used in learning disability
· chaining: breaking complex behaviour into sequence of steps
· the first act in a series is reinforced until it can be performed reliably, then the contingencies are altered so that the previous steps have to be performed before reinforcement is given, and so on
· in backward chaining, the satisfaction of achieving the desired final links in the chain provides additional reinforcement for the learning of successively earlier links (e.g. toilet training)
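The difference between forward and backward chaining can be illustrated by listing the sub-chain trained at each stage. The step list and function below are hypothetical illustrations, not a clinical protocol:

```python
# Forward vs backward chaining over a behaviour broken into steps.
# The step list is a made-up illustration (cf. toilet training).

steps = ["approach toilet", "lower trousers", "sit", "flush"]

def chaining_order(steps, backward=False):
    """Return the sub-chain trained at each stage.

    Forward chaining trains step 1 alone, then steps 1-2, and so on.
    Backward chaining trains the final step first, then the last two,
    so every stage ends on the already-mastered, reinforcing final step.
    """
    if backward:
        return [steps[-k:] for k in range(1, len(steps) + 1)]
    return [steps[:k] for k in range(1, len(steps) + 1)]

for stage in chaining_order(steps, backward=True):
    print(" -> ".join(stage))
```

The backward order prints "flush" first and grows the chain leftward, so the learner always finishes on the step whose completion is already reinforcing.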
Cognitive Processes in Learning:
Learned Helplessness (Seligman & Maier, 1967) appears to result when people believe that their behaviour has no effect on the world. People, like animals, tend to make less effort to control their environment when prior experience leads them to expect those efforts to be in vain. People can develop effort-reducing expectations either through personal experience or through being told they are powerless
· the original experiments used dogs
· Both humans and animals display latent learning – learning that is not evident when it first occurs
· they form cognitive maps of their environment, which develop naturally through experience even in the absence of any overt response or reinforcement – demonstrated by Tolman's maze experiments with rats
· Köhler’s experiments on insight suggest that cognitive processes play a role in learning, even in animals. Insight may result from a ‘mental trial and error’ process.
Observational Learning
Learning by watching others - observational learning, or social learning - is efficient and adaptive
Children are particularly influenced by the adults and peers who act as models for appropriate behaviour in various situations (cf. Albert Bandura's experiments with nursery-school children who witnessed varying levels of aggression towards a doll, and modified their subsequent behaviour accordingly)
Children who saw adults rewarded for aggression showed the most aggressive acts in play; they had received vicarious conditioning, a kind of observational learning in which one is influenced by seeing or hearing about the consequences of others’ behaviour
5 functions in observational learning:
1. attention to relevant aspects of model’s behaviour
2. visual image of model
3. remembering/ rehearsal of behaviour
4. refinement by reproduction of learned behaviour
5. anticipation of consequences
Optimum conditions:
1. subject sees the behaviour being reinforced
2. perceived similarity – subject believes they can emit the response necessary to obtain reinforcement
Active Learning
Active-learning methods take various forms and encourage people to think deeply about and apply new information instead of just memorizing isolated facts, e.g. small-group problem-solving tasks, discussion of mini-essays, and MCQs that give students feedback on the previous 15 minutes of teaching
Skill Learning
· observational learning, practice (the repeated performance of a skill), and corrective feedback play important roles in the learning of skills
· practice should continue past the point of correct performance until it is automatic
Sign learning theory
- proposed to explain how familiarity with a maze helps an animal learn to run it
- involves the formation of cognitive maps, which are expectations about what will happen next
Insight learning
- rapid restructuring of the perceptual field or concept to derive sudden insight into a problem
- learning of a cognitive relationship between means and end
Social learning theory
Based on work by Albert Bandura
Originally applied to attempts to integrate psychoanalysis and learning theory, it tends to focus on nurture rather than biological factors. It conceptualizes people as active, thinking problem-solvers who learn by a variety of mechanisms, and whose learning is affected by factors such as cognitive appraisal, inference, goal seeking, affiliation, and striving for meaning. It includes desensitization