Reversal Learning

Provides facilities for simple or serial reversal learning, with either two stimuli (A+B- → A-B+) or three (A+B-C- → A-B+C-). You could use one of the Visual Discrimination tasks to accomplish basic between-session reversal learning; this task provides more sophisticated options.

There is an option to use three objects. In this task, a subject is trained with A+B-C- and then reversed to A-B+C-; perseveration can then be measured directly as the degree to which subjects respond to A more than to C. For examples of this task in the recent neurobiological literature, see Arnsten et al. (1997; Neurobiology of Aging 18: 21-28) or Jentsch et al. (2002; Neuropsychopharmacology 26: 183-190). I'm sure this form of the task has a much longer history, but I don't have my copy of Mackintosh (1974; "The Psychology of Animal Learning") to hand!

•	Trial initiation. Specify the initiation method (spontaneous, requiring a lever response, or requiring a magazine response - in which case you can have the magazine light illuminated to indicate the need for a response) and the initiation limited hold time (after which failure to respond causes the trial to be abandoned; use 0 for no limit). See also Use with Dogs.

•	Stop after the within-session reversal criterion has been attained this many times. It's easiest to give an example. If you enable within-session reversals (see below) and enter "2" here, then the subject will be started on one configuration (e.g. A+B-), allowed to proceed until it has passed its first reversal criterion, tested on the new task (A-B+), allowed to proceed until it has passed the second reversal criterion, and then stopped. (Within-session reversal criteria are explained below. You may specify 0 for no such limit. If you do not enable within-session reversals, this option will have no effect.)

•	Time between trials. Specify a minimum and a maximum intertrial time (they may be the same). The actual time is chosen with a rectangular probability distribution within these values. The time between trials starts after any reward or punishment from the previous trial has finished.
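As an illustration only, a rectangular (uniform) intertrial time could be drawn as in the following Python sketch. The function name and the use of seconds are assumptions made for the example; they are not taken from the program.

```python
import random

def intertrial_time(min_s: float, max_s: float, rng: random.Random) -> float:
    """Draw an intertrial time from a rectangular distribution on [min_s, max_s].

    Hypothetical helper: the name and units (seconds) are illustrative.
    """
    if min_s > max_s:
        raise ValueError("minimum must not exceed maximum")
    # If the two bounds are equal, the same fixed time is returned every trial.
    return rng.uniform(min_s, max_s)

rng = random.Random(42)
times = [intertrial_time(5.0, 10.0, rng) for _ in range(3)]
```

Every value in `times` lies between 5 and 10 seconds, and any value in that range is equally likely.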

•	Leave correct stimulus on during reward? (etc.) When the subject responds, the correct stimulus can be left on the screen during reward, and/or the incorrect stimulus can be left on during punishment. These stimuli can either be left on for the duration of the reward/punishment (as specified in the General Parameters), or you can specify how long to leave them on the screen for.

•	Reverse within a session... when subject performs X of the last Y trials correctly. Fairly obvious, I hope. Set the values of X and Y in the boxes. (When a reversal occurs, the requirement is reset, so trials from before the reversal do not count towards the next criterion.) You can specify X and Y separately for (a) the first discrimination, and (b) subsequent reversal discriminations.
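The "X of the last Y" rule can be sketched in Python as a sliding window over trial outcomes. This is a sketch under stated assumptions, not the program's actual code: the class name is hypothetical, and it assumes the criterion may be met before Y trials have accumulated, provided X of them are already correct.

```python
from collections import deque

class ReversalCriterion:
    """Hypothetical tracker for 'X of the last Y trials correct'."""

    def __init__(self, x: int, y: int):
        if not 0 < x <= y:
            raise ValueError("need 0 < X <= Y")
        self.x = x
        # Only the most recent Y outcomes are kept; older ones drop off.
        self.window = deque(maxlen=y)

    def record(self, correct: bool) -> bool:
        """Record one trial; return True if a reversal should occur now."""
        self.window.append(correct)
        if sum(self.window) >= self.x:
            # The requirement is reset when a reversal occurs.
            self.window.clear()
            return True
        return False

criterion = ReversalCriterion(x=3, y=4)
outcomes = [True, False, True, False, True, True]
reversals = [criterion.record(c) for c in outcomes]
```

Here only the final trial triggers a reversal, because it is the first point at which three of the last four trials are correct. To use different values for the first discrimination and for later reversals, one could simply create a new tracker with the second pair of values after the first reversal.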

•	Locations. Choose the location sets used by the task. One location set (the "2-task" set) is used for the two-stimulus reversal task. If you use three stimuli, two location sets are used ("3-task A" and "3-task B") - one of these is chosen at random on each trial (and whichever is used is recorded in the output). (Why? The intention is for tasks involving a central feeder; one of the default 3-task location sets has two stimuli on the left and one on the right, and the other has one on the left and two on the right. This allows the centre to be avoided without introducing a side bias. You can, of course, configure the A and B location sets to be identical.)

•	Probability of reward given a correct/incorrect response. A conventional reversal procedure has p(reward | correct) = 1 and p(reward | incorrect) = 0. However, if you would like a fully probabilistic reversal task, untick this box. You may then specify p(reward | correct) and p(reward | incorrect) directly. For example, if you specify 0.8 and 0.2, then correct responses would be rewarded 80% of the time, while incorrect responses would be rewarded 20% of the time.
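To make the two probabilities concrete, here is a minimal Python sketch of the random (non-pseudorandom) case, in which a biased coin is flipped on every trial. The function and parameter names are hypothetical.

```python
import random

def is_rewarded(correct: bool,
                p_reward_correct: float,
                p_reward_incorrect: float,
                rng: random.Random) -> bool:
    """Decide whether this trial is rewarded (illustrative helper)."""
    p = p_reward_correct if correct else p_reward_incorrect
    # rng.random() lies in [0, 1), so p = 1 always rewards and p = 0 never does.
    return rng.random() < p

rng = random.Random(0)
n = 10_000
rate_correct = sum(is_rewarded(True, 0.8, 0.2, rng) for _ in range(n)) / n
rate_incorrect = sum(is_rewarded(False, 0.8, 0.2, rng) for _ in range(n)) / n
```

Over many trials `rate_correct` comes out close to 0.8 and `rate_incorrect` close to 0.2, but no particular run of trials is guaranteed to match those proportions exactly.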

•	If you reward some "incorrect" responses, and you have chosen the option "Leave correct stimulus on during reward", the program will leave the chosen stimulus on (i.e. one that is notionally "incorrect", but is being rewarded on this trial). This seems the only consistent thing to do. Essentially, a probabilistic task blurs the definition of "correct" and "incorrect", so the option is best described as "Leave chosen stimulus on if it's rewarded"!

•	Pseudorandom false feedback, not random. By default, the false feedback system gives feedback on a random basis (i.e. it flips a biased coin with the bias you specified above on each trial). You may want false feedback to be pseudorandom instead, e.g. to guarantee that if you specify 20% false feedback, two out of every 10 trials have false feedback (whereas in a random system with p = 0.2 for false feedback, you are not guaranteed two out of every 10). This pseudorandom option allows you to specify blocks of X correct trials and Y incorrect trials, such that in every consecutive X correct trials and Y incorrect trials, a certain proportion (specified above) give true or false feedback. As always, for more explanation of this topic, see randomness, pseudorandomness, and drawing without replacement.

•	If you use the pseudorandom option, specify the block size for correct and for incorrect trials. This is the number of consecutive trials (correct or incorrect - they are calculated separately) over which the system calculates. Be careful how you specify this; errors are possible. If you specify p(reward | correct) = 0.8, then you're saying that you want 80% of your correct trials to be rewarded - so you might do well to pick a number that is a multiple of 5 here (e.g. 10 for 8/10 correct trials to be rewarded, or 20 for 16/20 correct trials to be rewarded). If you pick 0.8 and then specify a sequence of 9 trials, the program will not behave as you want (it'll calculate 0.8 * 9 = 7.2, then add 0.5 to implement correct rounding, giving 7.7, and truncate this to 7 for the number of "truthful" trials, then take 9 - 7 = 2 for the "untruthful" trials, so the probability of truth will not be exactly what you requested).
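The rounding arithmetic described above can be reproduced in a few lines of Python. This is a sketch of the calculation as described in the text, with hypothetical function names; shuffling the block is an assumption about how the order within a block might be randomized.

```python
import random

def truthful_count(p_truthful: float, block_size: int) -> int:
    """Number of 'truthful' trials in a block: add 0.5, then truncate."""
    return int(p_truthful * block_size + 0.5)

def feedback_block(p_truthful: float, block_size: int, rng: random.Random) -> list:
    """Build one block of feedback flags (True = truthful) in random order."""
    n_true = truthful_count(p_truthful, block_size)
    block = [True] * n_true + [False] * (block_size - n_true)
    rng.shuffle(block)  # assumed: order within the block is random
    return block

n9 = truthful_count(0.8, 9)     # 7.2 + 0.5 = 7.7, truncated to 7
n10 = truthful_count(0.8, 10)   # 8.0 + 0.5 = 8.5, truncated to 8
effective_p = n9 / 9            # 7/9, roughly 0.778 rather than 0.8
```

With a block size of 9 the achieved proportion is 7/9 rather than the requested 0.8, whereas block sizes of 10 or 20 give exactly 8/10 and 16/20.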

•	You can also enforce that no two consecutive correct or incorrect trials (taken separately) give false feedback. For example, if your subject responds correctly but receives false feedback (punishment), then responds incorrectly on the next two trials, and then responds correctly again, this option would ensure that this last correct trial is not punished (because it's the second of two "consecutive" correct trials). Note that this option, with small block sizes, can lead to predictable trial sequences (because the constraints leave the program little or no choice).

•	If this box is not ticked, the location of the correct stimulus is chosen at random for each trial (and, in the three-stimulus task, the location of the "incorrect but correct in the past" and the "never correct" stimuli are similarly chosen at random).

•	If this box is ticked, then the locations are randomized in groups where the group size is n times the number of stimuli. You specify this value of n in the box labelled "... by drawing without replacement from a list of size n x the number of stimuli...".

•	Suppose that you specify n = 1; then the locations will be randomized in pairs (for the two-stimulus task), meaning that in every pair of trials, the correct stimulus is on the left on one trial and on the right in the other, but the order of those two trials within the pair is random. For the three-stimulus task, there are six possible spatial combinations (ABC, ACB, BAC, BCA, CAB, CBA) and in every six trials each of these combinations will be used once, with the order within the group of six being random.

•	Put another way, then if n = 1, for the two-stimulus task, each block of two trials will contain one "left correct" (L) trial and one "right correct" (R) trial. (Therefore, in this case, it's impossible to get more than two trials in a row with the same side correct.)

•	If you specify n = 2, then for the two-stimulus task, each block of four trials will contain two L trials and two R trials. (In this example, it's impossible to see more than four trials in a row with the same side correct.)
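The two-stimulus case of drawing without replacement can be sketched in Python as follows. The function names are hypothetical and the code only illustrates the principle; it is not the program's implementation.

```python
import random
from itertools import groupby

def location_block(n: int, rng: random.Random, sides=("L", "R")) -> list:
    """One block of 'correct side' assignments: n copies of each side, shuffled."""
    block = list(sides) * n
    rng.shuffle(block)
    return block

def longest_run(trials: list) -> int:
    """Length of the longest run of identical consecutive entries."""
    return max(len(list(group)) for _, group in groupby(trials))

rng = random.Random(0)
# 500 consecutive blocks with n = 2, i.e. 2000 trials in groups of four.
trials = [side for _ in range(500) for side in location_block(2, rng)]
run_n2 = longest_run(trials)
```

Each block of four contains exactly two L and two R trials, so `run_n2` can never exceed 4: the longest possible run arises when one block ends with two trials on one side and the next block begins with two on the same side.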

•	Correction procedure. Choose the type of correction procedure (CP) you wish to use. Correction procedures are used to try to prevent the subject responding to one side (spatial location), rather than one stimulus. For example, if two stimuli are presented and one is correct and the other is wrong, but the correct stimulus is randomly presented on the left or right, the subject could win on 50% of trials simply by responding to the left. Now, a good analysis of the data (the best being an analysis based on the principles of signal detection theory) will immediately show that the subject is not discriminating the two stimuli. However, some experimenters wish to discourage the use of a spatial strategy further. A common way of doing so is a correction procedure. For example, if the animal keeps responding to one side, the correction procedure could present the correct stimulus on the other side until the subject breaks its positional habit and responds to the other side. The meaning of the types of correction procedure available in this task is explained in the dialogue box, and as follows.

•	Antibias. When a certain number of consecutive responses have been made to one side (left or right), the correction procedure begins. The number of consecutive responses required is known as parameter A. Having begun, the correction procedure presents the correct stimulus on the other side, overriding the usual mechanism (discussed above) for deciding which side the correct stimulus is shown on. The correction procedure continues until the subject has made a certain number of correct responses - this number is called parameter B. The correct responses do not have to be in sequence (so if B is 5, then the subject might get a series of correction trials correct - wrong - wrong - correct - correct - wrong - correct - wrong - wrong - correct, and then the correction procedure would stop). Once this target number of successful "correction" trials has been achieved, the correction procedure stops, and the usual sequence of trials resumes afterwards. (If a maximum number of trials has been set for the session, both "standard" and "correction" trials count towards this limit.)
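The Antibias logic for the two-stimulus task can be summarized as a small state machine. The Python sketch below is illustrative only: the class and attribute names are hypothetical, and it assumes that any response to a side (correct or not) counts towards the run of A consecutive responses.

```python
class AntibiasCorrection:
    """Hypothetical model of the Antibias correction procedure (two sides)."""

    def __init__(self, a: int, b: int):
        self.a = a                # consecutive same-side responses that trigger correction
        self.b = b                # correct responses needed to end correction
        self.last_side = None
        self.run = 0
        self.forced_side = None   # side the correct stimulus is forced to, if active
        self.correct_needed = 0

    @property
    def active(self) -> bool:
        return self.forced_side is not None

    def record(self, side: str, correct: bool) -> None:
        """Update the state after one response to 'L' or 'R'."""
        if self.active:
            # Correct correction trials need not be consecutive.
            if correct:
                self.correct_needed -= 1
                if self.correct_needed <= 0:
                    self.forced_side = None
                    self.last_side, self.run = None, 0
            return
        if side == self.last_side:
            self.run += 1
        else:
            self.last_side, self.run = side, 1
        if self.run >= self.a:
            # Present the correct stimulus on the side opposite the bias.
            self.forced_side = "R" if side == "L" else "L"
            self.correct_needed = self.b

cp = AntibiasCorrection(a=3, b=2)
for _ in range(3):
    cp.record("L", False)
started_on = cp.forced_side
```

After three consecutive left responses the procedure is active with the correct stimulus forced to the right, and it stays active until two correction trials (not necessarily consecutive) have been answered correctly.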

•	When using the Antibias correction system, it is also possible to begin the session with the correction procedure. This allows the experimenter to "force" the correct stimulus to one side until the subject gets enough correct trials to terminate the correction procedure. You might use it if, for example, your subject began a correction procedure at the end of its previous session, and then ran out of time/trials, so you would like it to resume. Suppose you like your correction procedures to run until the subject has got 10 trials right (B = 10), but in the last session your subject got to 4 and then ran out of time. You'd like to finish off the correction procedure from last time (requiring the subject to get 6 more "correction" trials right at the start of the session before normal business resumes) but then have your usual B = 10 if the subject requires another correction procedure. No problem: specify parameter C = 6 (and B = 10 as usual) to achieve this.

•	In a two-stimulus task, stimuli can either be on the left or the right, and our correction procedure is simple. What happens if we use a three-stimulus task, in which the stimuli can be on the left, on the right, or in the middle? Well, if the "antibias" correction procedure is employed with a three-stimulus task and the subject perseverates in the middle, then the correct stimulus is randomly assigned to the left or the right location for the correction procedure. If it perseverates on the left, then the correct stimulus is assigned to the right-hand side; if it perseverates on the right, the correct stimulus is assigned to the left-hand side. In all cases, once the correction procedure has determined where the correct stimulus is to be, it chooses the location of the "incorrect but once correct" stimulus and the "never correct" stimulus (stimulus C, as in A+B-C-/A-B+C-) at random.
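The three-stimulus assignment rule can be written out as a short Python sketch. The function name, location labels, and dictionary keys are hypothetical; only the assignment rule itself comes from the description above.

```python
import random

LOCATIONS = ("left", "middle", "right")

def correction_locations(perseverated: str, rng: random.Random) -> dict:
    """Assign stimulus locations for a three-stimulus Antibias correction trial."""
    if perseverated == "middle":
        correct_loc = rng.choice(["left", "right"])
    elif perseverated == "left":
        correct_loc = "right"
    elif perseverated == "right":
        correct_loc = "left"
    else:
        raise ValueError("unknown location: " + perseverated)
    # The other two stimuli are placed at random in the remaining locations.
    remaining = [loc for loc in LOCATIONS if loc != correct_loc]
    rng.shuffle(remaining)
    return {
        "correct": correct_loc,
        "previously_correct": remaining[0],
        "never_correct": remaining[1],
    }

layout = correction_locations("left", random.Random(1))
```

For a subject perseverating on the left, `layout["correct"]` is always `"right"`, while the other two stimuli share the left and middle locations in random order.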

•	Harsh. In this system, whenever the subject gets a trial wrong, the correction procedure starts. This type of correction procedure simply repeats the trial that the subject got wrong, until it gets it right, or until a limiting number of trials (parameter A) is reached. When either of these conditions is met, the correction procedure stops. (If a maximum number of trials has been set for the session, only "standard" [non-correction] trials count towards this limit when using the Harsh system.)
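A rough Python sketch of the Harsh stopping rule follows. The function name is hypothetical, and it assumes that parameter A counts the repeated (correction) trials only, not the original trial that the subject got wrong; the text does not say which is intended.

```python
def harsh_correction_length(outcomes, a: int) -> int:
    """Count how many correction trials are run after an error.

    'outcomes' yields True/False for each repeat of the failed trial.
    Correction ends at the first correct repeat, or after 'a' repeats.
    """
    n = 0
    for correct in outcomes:
        n += 1
        if correct or n >= a:
            break
    return n

ended_by_success = harsh_correction_length([False, False, True, True], a=5)
ended_by_limit = harsh_correction_length([False] * 10, a=4)
```

In the first case the procedure stops after three repeats because the third is correct; in the second it stops after four repeats because the limit A is reached.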

•	Whatever the type of correction procedure, note that if you allow within-session reversals to happen and your subject achieves the criteria for reversing, the reversal takes priority over the correction procedure: any ongoing correction procedure is cancelled, and all correction procedure counts are reset (i.e. the whole correction system starts again from scratch).

•	SPECIAL OPTION: make this a side (LEFT/RIGHT) rather than a stimulus discrimination. This is a special option that stops the program running a stimulus discrimination/reversal task, and makes it a SIDE or LOCATION discrimination. That is, either "left" or "right" is correct (and the program can, if you wish, reverse between these), but the A and B stimuli are displayed at random, so their visual appearance is irrelevant. For example, for one trial the A stimulus might be on the left and the B stimulus on the right (with the left-hand stimulus being correct) and for the next trial B might be on the left and A on the right (with the left-hand stimulus again being correct).

•	If the special option is ticked, a few other options become irrelevant, and the meaning of the "spatial randomization" option changes (rather than "how should the L/R location of the correct stimulus be picked?", it now means "how should the A/B stimulus to be displayed at the correct location be picked?") and a warning message to this effect pops up:

In this case, the subject responded correctly and correct stimuli are being left up during reward: