- Antecedents of operant conditioning
- Basic concepts of operant conditioning
  - Reinforcement
    - Positive reinforcement
    - Negative reinforcement
    - Primary reinforcers
    - Secondary reinforcers
  - Three-term contingency
  - Punishment
    - Positive punishment
    - Negative punishment
  - Extinction
  - Generalization
  - Discrimination
- Reinforcement schedules
  - Continuous reinforcement schedules
  - Intermittent reinforcement schedules
    - Fixed ratio schedules
    - Variable ratio schedules
    - Fixed interval schedules
    - Variable interval schedules
- Behavioral change
  - Successive approximations or shaping
  - Chaining
Operant conditioning, or instrumental conditioning, is a type of learning in which behavior is controlled by its consequences. It is based on the idea that behaviors that are reinforced tend to occur more often, while behaviors that are punished tend to occur less often.
What is the difference between operant conditioning and classical conditioning? In operant conditioning, a voluntary response is followed by a reinforcer. As a result, that voluntary response (for example, studying for an exam) is more likely to occur in the future.
In contrast, in classical conditioning, a stimulus automatically triggers an involuntary response. For example, the sight of food causes a dog to salivate.
Operant conditioning can be described as a process that attempts to modify behavior through the use of positive and negative reinforcement. Through operant conditioning, an individual makes an association between a particular behavior and a consequence. Examples:
- Parents reward a child's good grades with candy or some other reward.
- A teacher rewards those students who are calm and polite. Students find that by behaving like this they receive more points.
- Food is given to an animal each time it presses a lever.
B. F. Skinner (1938) coined the term operant conditioning. Skinner identified three types of responses, or operants, that can follow a behavior:
- Neutral operants: responses from the environment that neither increase nor decrease the probability that a behavior will be repeated.
- Reinforcers: responses from the environment that increase the probability of repeating a behavior. Reinforcers can be positive or negative.
- Punishers: responses from the environment that decrease the probability that a behavior will be repeated. Punishment weakens behavior.
Antecedents of operant conditioning
Edward Thorndike was the first to recognize that conditioning involves more than a response and a reinforcer. The response occurs in the presence of certain stimuli, so three events must be considered: the stimulus, the response, and the consequence of the response (the reinforcer).
This structure facilitates the association between stimulus and response. In his law of effect, Thorndike stated that responses followed by reinforcing consequences are more likely to occur when the stimulus reappears.
Conversely, responses followed by negative consequences are less likely to occur when the stimulus reappears. The law of effect is the antecedent of operant conditioning, or instrumental conditioning, as Thorndike called it.
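The law of effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the response names, the numeric "strengths", and the size of the update step are hypothetical and do not come from Thorndike's work.

```python
import random


def simulate_law_of_effect(trials=200, step=0.05, seed=0):
    """Toy model: two responses compete in the presence of the same stimulus.

    Assumption (not from the text): each response has a numeric strength,
    and the chance of emitting it is proportional to that strength.
    Returns the final probability of the reinforced response.
    """
    rng = random.Random(seed)
    strength = {"press_lever": 1.0, "scratch_wall": 1.0}  # hypothetical responses

    for _ in range(trials):
        total = strength["press_lever"] + strength["scratch_wall"]
        pressed = rng.random() * total < strength["press_lever"]
        if pressed:
            # Followed by a reinforcing consequence: the response is strengthened.
            strength["press_lever"] += step
        else:
            # Followed by a negative consequence: the response is weakened.
            strength["scratch_wall"] = max(0.1, strength["scratch_wall"] - step)

    total = strength["press_lever"] + strength["scratch_wall"]
    return strength["press_lever"] / total


probability_of_pressing = simulate_law_of_effect()
```

Both responses start out equally likely. Because every trial either strengthens the reinforced response or weakens the other one, the reinforced response ends up more probable whenever the stimulus reappears.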
For Skinner, a behavioral psychologist, conditioning was the strengthening of behaviors according to the consequences they had previously produced.
Along these lines, there are two forms of conditioning:
- Classical or Pavlovian conditioning: it is based on the association of unconditioned and conditioned stimuli, with responses controlled by the antecedent stimuli.
- Operant conditioning: consequent or reinforcing stimuli cause a certain behavior to be emitted. Skinner explains that if a behavior is followed by a positive reinforcer, the probability of emitting that behavior in the future increases. Conversely, if a response is not followed by a reinforcer, or is followed by an aversive consequence, the probability of emitting that behavior in the future will be lower.
Basic concepts of operant conditioning
- Reinforcement
Reinforcement governs the emission of responses, that is, how likely they are to occur in the future. A reinforcer is a consequent stimulus, since it is presented once the response has occurred.
It is impossible to know whether a particular stimulus acts as a reinforcer until it is made contingent on a response and the behavior is shown to change as a consequence.
There are two types of reinforcement: positive and negative. Both serve the same purpose of increasing the probability that the response will be emitted in future situations. In addition, for Skinner, reinforcers are defined by their observable and measurable effects on behavior.
Positive reinforcement
Positive reinforcement strengthens a behavior by providing a consequence that the individual finds rewarding. For example, feeding a dog after it sits down reinforces the sitting behavior.
Negative reinforcement
Removing an unpleasant stimulus can also strengthen a behavior. This is known as negative reinforcement because the behavior is conditioned by the removal of a stimulus that is aversive to the animal or person.
In other words, negative reinforcement strengthens a behavior by stopping or eliminating an unpleasant experience.
For example, if a child is abused at home but not when he goes out into the street, the behavior of going outside is reinforced.
Primary reinforcers
Primary reinforcers are basic reinforcers that need no history of prior conditioning to function as such. Examples include water, food, and sex.
Secondary reinforcers
Secondary reinforcers depend on a prior history of conditioning, acquiring their value through association with unconditioned stimuli. Examples include money and grades.
- Three-term contingency
It is the basic model of operant conditioning and is made up of three components: the discriminative stimulus, the response, and the reinforcing stimulus.
A discriminative stimulus signals to the subject that the reinforcer is available: if the subject carries out a certain behavior, it will be able to obtain that reinforcer. In contrast, a delta stimulus signals that the behavior will not lead to any reinforcer.
The response is the behavior that the subject performs, whose execution may or may not lead to the reinforcing stimulus.
The reinforcing stimulus governs the emission of the behavior, since its appearance increases or decreases the probability that the response will be emitted in the future.
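The three components can be written down as a small data structure. The sketch below is a minimal illustration, not a standard model: the class name, the `consequence` method, and the lamp-and-lever example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThreeTermContingency:
    discriminative_stimulus: str
    response: str
    reinforcer: str

    def consequence(self, stimulus: str, response: str) -> Optional[str]:
        """Return the reinforcer only when the response is emitted in the
        presence of the discriminative stimulus; otherwise return None."""
        if stimulus == self.discriminative_stimulus and response == self.response:
            return self.reinforcer
        return None


# Hypothetical example: a lit lamp signals that lever presses produce food.
contingency = ThreeTermContingency("light on", "press lever", "food pellet")
with_signal = contingency.consequence("light on", "press lever")
with_delta = contingency.consequence("light off", "press lever")
```

Here "light off" plays the role of a delta stimulus: the same response produces no reinforcer, so `with_delta` is `None` while `with_signal` is the food pellet.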
- Punishment
Punishment is also measured by its effects on the subject's behavior. Unlike reinforcement, however, its aim is to reduce or suppress a certain behavior.
Punishment reduces the probability that a behavior will be emitted in subsequent situations. However, it does not eliminate the response: if the threat of punishment decreases, the behavior may reappear.
As with reinforcement, there are two types or procedures: positive punishment and negative punishment.
Positive punishment
Positive punishment involves presenting an aversive stimulus after a certain behavior. The stimulus is delivered contingent on the subject's response.
For example, a bad-tasting liquid is applied to a child's nails to prevent nail biting (onychophagia). The child tastes the liquid (positive punishment), and the likelihood that he will bite his nails again is reduced.
Negative punishment
Negative punishment consists of withdrawing a positive stimulus as a consequence of a certain behavior.
For example, a child loses access to the game console after failing an exam.
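The two kinds of reinforcement and the two kinds of punishment differ along two dimensions: whether a stimulus is presented or removed after the behavior, and whether that stimulus is pleasant or aversive for the subject. The lookup below is a minimal sketch of that classification; the function name and the string labels are illustrative assumptions.

```python
def classify_procedure(stimulus_action: str, stimulus_type: str) -> str:
    """Name the operant procedure from what happens to the stimulus.

    stimulus_action: "presented" or "removed" after the behavior.
    stimulus_type: "pleasant" or "aversive" for the subject.
    """
    procedures = {
        ("presented", "pleasant"): "positive reinforcement",
        ("removed", "aversive"): "negative reinforcement",
        ("presented", "aversive"): "positive punishment",
        ("removed", "pleasant"): "negative punishment",
    }
    key = (stimulus_action, stimulus_type)
    if key not in procedures:
        raise ValueError(f"unknown combination: {key!r}")
    return procedures[key]


# The examples from the text, expressed with these labels.
dog_fed_after_sitting = classify_procedure("presented", "pleasant")
bad_taste_on_nails = classify_procedure("presented", "aversive")
console_taken_away = classify_procedure("removed", "pleasant")
```

The two reinforcement procedures make the behavior more likely, and the two punishment procedures make it less likely. "Positive" and "negative" refer only to adding or removing a stimulus.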
- Extinction
In extinction, a response stops being emitted because the reinforcer no longer appears. The procedure consists of withholding the reinforcer that the subject expects and that has maintained the behavior over time.
When a response is extinguished, the discriminative stimulus becomes the extinction stimulus. This process should not be confused with forgetting, which occurs when the strength of a behavior decreases because it has not been emitted for a period of time.
For example, if a child is never given money despite constant complaining, the complaining behavior will eventually be extinguished.
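Extinction can be sketched numerically as a response strength that rises while the response is reinforced and falls once the reinforcer is withheld. The linear update rule and the learning rate below are assumptions chosen for illustration; the text does not specify any particular model.

```python
def update_strength(strength, reinforced, learning_rate=0.2):
    """Move the response strength toward 1.0 when reinforced, toward 0.0 when not."""
    target = 1.0 if reinforced else 0.0
    return strength + learning_rate * (target - strength)


def acquisition_then_extinction(acquisition_trials=20, extinction_trials=20):
    """Return the response strength after every trial of both phases."""
    strength = 0.0
    history = []
    for _ in range(acquisition_trials):
        strength = update_strength(strength, reinforced=True)
        history.append(strength)
    for _ in range(extinction_trials):
        # The response is still emitted, but the reinforcer no longer follows it.
        strength = update_strength(strength, reinforced=False)
        history.append(strength)
    return history


history = acquisition_then_extinction()
strength_after_acquisition = history[19]
strength_after_extinction = history[-1]
```

In this sketch the decline happens only on trials where the response is emitted without reinforcement, which is what separates extinction from forgetting.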
- Generalization
In generalization, a response that has been conditioned to a given stimulus or situation also appears in the presence of other, similar stimuli or situations.
- Discrimination
This process is the opposite of generalization: the subject responds differently depending on the stimulus and the context.
Reinforcement schedules
Through his research, Skinner also established various reinforcement schedules (sometimes translated as reinforcement programs), including continuous and intermittent reinforcement schedules.
Continuous reinforcement schedules
These are based on reinforcing the response every time it occurs: each time the subject performs the desired behavior, it obtains the reinforcer.
Intermittent reinforcement schedules
Here, by contrast, the subject does not obtain the reinforcer every time it performs the desired behavior. These schedules are defined by either the number of responses given or the time elapsed, which leads to different procedures.
Fixed ratio schedules
In these schedules the reinforcer is provided after the subject has made a fixed, constant number of responses. For example, in a fixed ratio 10 schedule the subject obtains the reinforcer after making ten responses when the stimulus is presented.
Variable ratio schedules
These are built in the same way as fixed ratio schedules, but the number of responses that the subject must give to obtain the reinforcer varies.
The reinforcer still depends on the number of responses emitted, but because the ratio is variable, the subject cannot predict when the reinforcer will be obtained.
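The two ratio schedules can be sketched as response counters. This is an illustration under assumptions: the class and method names are hypothetical, and drawing the variable requirement uniformly between 1 and twice the mean minus 1 is just one simple way to make it unpredictable.

```python
import random


class FixedRatio:
    """Reinforce every `ratio`-th response."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0

    def respond(self):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a varying number of responses averaging `mean_ratio`."""

    def __init__(self, mean_ratio, seed=None):
        self.mean_ratio = mean_ratio
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: requirement drawn uniformly from 1 .. 2 * mean - 1.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


fr10 = FixedRatio(10)
fr10_outcomes = [fr10.respond() for _ in range(30)]

vr10 = VariableRatio(10, seed=1)
vr10_reinforcers = sum(vr10.respond() for _ in range(1000))
```

With these definitions, `FixedRatio(1)` behaves like a continuous reinforcement schedule, since every response is reinforced.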
Fixed interval schedules
In interval schedules, obtaining the reinforcer does not depend on the number of responses that the subject gives but on the time elapsed. The first response produced after a certain period of time has passed is reinforced.
In fixed interval schedules, the time between one reinforcer and the next is always the same.
Variable interval schedules
In these schedules the reinforcer is also obtained after a period of time, but the length of that period is different for each reinforcer.
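Interval schedules can be sketched in the same style, with a clock instead of a response counter. The names below are hypothetical, time is in arbitrary units, and drawing the variable interval uniformly between 0 and twice the mean is an assumption made for simplicity.

```python
import random


class FixedInterval:
    """Reinforce the first response made once `interval` time units have passed."""

    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        """Register a response at time t; return True if it is reinforced."""
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but each waiting period has a different length."""

    def __init__(self, mean_interval, seed=None):
        self.mean_interval = mean_interval
        self.rng = random.Random(seed)
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: waiting period drawn uniformly from 0 .. 2 * mean.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


fi60 = FixedInterval(60)
response_times = [10, 30, 60, 61, 119, 120, 125]
fi60_outcomes = [fi60.respond(t) for t in response_times]

vi10 = VariableInterval(10, seed=1)
vi10_reinforcers = sum(vi10.respond(t) for t in range(1000))
```

In the fixed interval run only the responses at times 60 and 120 are reinforced. Responding faster in between earns nothing extra, which is the key difference from ratio schedules.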
Behavioral change
Successive approximations or shaping
Shaping consists of changing behavior through the differential reinforcement of successive approximations.
A series of steps is followed to shape a specific behavior. First, the initial behavior to be shaped is identified, along with the final behavior to be reached.
Next, the possible reinforcers are selected and the path to the final behavior is divided into steps or stages. Each successive stage or approximation is reinforced until the last one is reached.
With this dynamic procedure, both the behaviors and their consequences are transformed: successive approximations toward a target behavior are reinforced.
However, shaping must start from a behavior that the subject already performs, so that its behaviors can be gradually reinforced until they reach the goal.
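The shaping procedure can be sketched as a reinforcement criterion that moves one step closer to the goal each time it is met. The numeric scale, the normally distributed variation in responses, and every parameter value below are assumptions made for illustration.

```python
import random


def shape(target=10.0, start=1.0, step=1.0, spread=1.5, max_trials=10_000, seed=0):
    """Reinforce successive approximations until the typical response reaches target.

    Returns (final typical response, trials used, number of reinforced responses).
    """
    rng = random.Random(seed)
    typical = start  # a behavior the subject already performs
    trials = 0
    reinforced = 0
    while typical < target and trials < max_trials:
        trials += 1
        # Only responses at least one step closer to the goal are reinforced.
        criterion = min(typical + step, target)
        response = rng.gauss(typical, spread)
        if response >= criterion:
            # The reinforced approximation becomes the new typical behavior.
            typical = min(response, target)
            reinforced += 1
    return typical, trials, reinforced


final_behavior, trials_used, reinforced_count = shape()
```

Responses that fall short of the current criterion go unreinforced and leave the typical behavior unchanged, which is the differential part of differential reinforcement.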
Chaining
In chaining, a new behavior is formed by breaking it down into simpler steps or sequences. The response given at each step is reinforced, which establishes a more complex response in the subject's behavioral repertoire.
Long chains of responses can be built using conditioned reinforcers. The chain comes to function as a single unit, and establishing it leads to the acquisition of a particular skill.
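One way to picture chaining is as a training plan in which the practiced sequence grows by one step at a time. The sketch below uses forward chaining, which is only one possible ordering and is an assumption here, as are the function name and the hand-washing steps.

```python
def forward_chaining_plan(steps, trials_per_link=2):
    """Build a training plan where each stage adds the next step of the chain.

    Each entry is the sequence the subject performs on that trial; the
    reinforcer is delivered after the last step of the sequence.
    """
    plan = []
    for link_count in range(1, len(steps) + 1):
        sequence = tuple(steps[:link_count])
        plan.extend([sequence] * trials_per_link)
    return plan


# Hypothetical skill broken down into simpler steps.
hand_washing = ["turn on tap", "wet hands", "apply soap", "rinse", "dry hands"]
plan = forward_chaining_plan(hand_washing)
```

The first trials in `plan` contain only the first step, and the last trials contain the whole chain, performed as one unit.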