Mind Hacks™: Tips & Tools for Using Your Brain
Authors: Tom Stafford, Matt Webb
Some of the constraints on how fast we can task-switch or observe
simultaneously aren’t fixed. They can be trained by playing first-person action video
games.
Our visual processing abilities are by no means hardwired and fixed from birth. There
are limits, but the brain’s nothing if not plastic. With practice, the attentional
mechanisms that sort and edit visual information can be improved. One activity that requires
you to practice lots of the skills involved in visual attention is playing video
games.
So, what effect does playing lots of video games have? Shawn Green and Daphne Bavelier from the University of Rochester, New York, have researched precisely this question; their results were published in the paper “Action Video Game Modifies Visual Attention,”[1] available online at http://www.bcs.rochester.edu/people/daphne/visual.html#video.
Two of the effects they looked at we’ve talked about elsewhere in this book. The attentional blink [Avoid Holes in Attention] is that half-second recovery time required to spot a second target in a rapid-fire sequence. And subitizing is that alternative to counting for very low numbers (4 and below), the almost instantaneous mechanism we have for telling how many items we can see [Count Faster with Subitizing]. Training can both increase the subitization limit and shorten the attentional blink, meaning we’re able to simultaneously spot more of what we want to spot, and do it faster too.
Comparing the attentional blink of people who have played video games for 4 days a
week over 6 months against people who have barely played games at all finds that the games
players have a shorter attentional blink.
The attentional blink comes about in trying to spot important items in a fast-changing
sequence of random items. Essentially, it’s a recovery time. Let’s pretend there’s a video
game in which, when someone pops up, you have to figure out whether it’s a good guy or a
bad guy and respond appropriately. Most of the characters that pop up are good guys, it’s
happening as fast as you can manage, and you’re responding almost automatically — then
suddenly a bad guy comes up. Having been working automatically, you now have to lift this bad guy into conscious awareness so you can dispatch him. What the attentional blink says is
that the action of raising to awareness creates a half-second gap during which you’re less
likely to notice another bad guy coming along.
Now obviously the attentional blink — this recovery time — is going to have an
impact on your score if the second of two bad guys in quick succession is able to slip
through your defenses and get a shot in. That’s a great incentive to somehow shorten your
recovery time and return from “shoot bad guy” mode to “monitor for bad guys” mode as soon
as possible.
Subitizing — the measure of how many objects you can quantify without having to count
them — is a good way of gauging the capacity of visual attention. Whereas counting requires
looking at each item individually and checking it off, subitizing takes in all items
simultaneously. It requires being able to give a number of objects attention at the same
time, and it’s not easy; that’s why the maximum is usually about four, although the exact
cap measured in any particular experiment varies slightly depending on the setup and
experimenter.
Green and Bavelier found the average maximum number of items their nongame-playing
subjects could subitize before they had to start counting was 3.3. The number was
significantly higher for games players: an average of 4.9 — nearly 50% more.
Again, you can see the benefits of having a greater capacity for visual attention if
you’re playing fast-moving video games. You need to be able to keep on top of whatever’s happening on the screen, even when (especially when) your attention is getting stretched.
Given these differences in certain mental abilities between gamers and nongamers, we
might suspect the involvement of other factors. Perhaps gamers are just people who have
naturally higher attention capacities (not attention as in concentration, remember, but
the ability to keep track of a larger number of objects on the screen) and have gravitated
toward video games.
No, this isn’t the case. Green and Bavelier’s final experiment was to take two groups
of people and have them play video games for an hour each day for 10 days.
The group that played the classic puzzle game Tetris had no improvement on subitizing
and no shortened attentional blink. Despite the rapid motor control required and the
spatial awareness implicit in Tetris, playing the game didn’t result in any
improvement.
On the other hand, the group that played Medal of Honor: Allied Assault (Electronic
Arts, 2002), an intense first-person shooter, could subitize to a
higher number and recovered from the attentional blink faster. They had
trained and improved both their visual attention capacity and processing time in only 10
days.
Green and Bavelier’s results are significant because processes like subitizing [Count Faster with Subitizing] are
used continuously in the way we perceive the world. Even before perception reaches
conscious attention, our attention is flickering about the world around us, assimilating
information. It’s mundane, but when you look to see how many potatoes are in the cupboard, you’ll “just know” how many there are if the quantity falls under your subitization limit, and you’ll have to count them — using conscious awareness — if not.
Consider the attentional blink, which is usually half a second (for the elderly, this
can double). A lot can happen in that time, especially in this information-dense world:
are we missing a friend walking by on the street or cars on the road? These are the
continuous perceptions we have of the world, perceptions that guide our actions. And the
limits on these widely used abilities aren’t locked but are trainable by doing tasks that
stretch those abilities: fast-paced computer games.
I’m reminded of Douglas Engelbart’s classic paper “Augmenting Human Intellect”[2] on his belief in the power of computers. He wrote this in 1962, way before
the PC, and argued that it’s better to improve and facilitate the tiny things we do every
day rather than attempt to replace entire human jobs with monolithic machines. A
novel-writing machine, if one were invented, just automates the process of writing novels,
and it’s limited to novels. But making a small improvement to a pencil, for example, has a
broad impact: any task that involves pencils is improved, whether it’s writing novels,
newspapers, or sticky notes. The broad improvement brought about by this hypothetical
better pencil is in our basic capabilities, not just in writing novels. Engelbart’s
efforts were true to this: the computer mouse (his invention) heightened our capability to
work with computers in a small, but pervasive, fashion.
Subitizing is like a pencil of conscious experience. It isn’t just responsible for our ability at a single task (like novel writing); it’s involved in our
capabilities across the board, whenever we have to apply visual attention to more than a
single item simultaneously. That we can improve such a fundamental capability, even just a
little, is significant, especially since the way we make that improvement is by playing
first-person shooter video games. Building a better pencil is a big deal.
Your ears are not simply “eyes for sound.” Sound contains quite different
information about the world than does light. Light tends to be ongoing, whereas sound occurs
when things change: when they vibrate, collide, move, break, explode! Audition is the sense of
events rather than scenes. The auditory system thus processes auditory information quite
differently from how the visual system processes visual information: whereas the dominant role
of sight is telling where things are, the dominant role of hearing is telling when things
happen [Detect Timing with Your Ears].
Hearing is the first sense we develop in the womb. The regions of the brain that deal with
hearing are the first to finish the developmental process called myelination, in which the connecting “wires” of neurons are finished off with fatty sheaths that insulate the neurons, speeding up their electrical signals. In
contrast, the visual system doesn’t complete this last step of myelination until a few months
after birth.
Hearing is the last sense to go as we lose consciousness (when you’re dropping off to
sleep, your other senses drop away and sounds seem to swell up) and the first to return when
we make it back to consciousness.
We’re visual creatures, but we constantly use sound to keep a 360° check on the world
around us. It’s a sense that supplements our visual experience — a movie without a music score
is strangely dull, but we hardly notice the sound track normally. We’ll look at how we hear
some features of that sound track, stereo sound [Detect Sound Direction], and pitch [Discover Pitch].
And of course, audition is the sense of language. Hacks in this chapter show how we don’t just hear physical sounds but can hear the meanings they convey [Speech Is Broadband Input to Your Head], even on the threshold of perception [Detect Sounds on the Margins of Certainty]. Just as with vision,
what we experience isn’t quite what is physically there. Instead, we experience a useful aural
construction put together by our brains.
We’ll finish up by investigating three aspects of understanding language: of the hidden sound symbolism in words [Give Big-Sounding Words to Big Concepts], of how we break sentences into phrases [Stop Memory-Buffer Overrun While Reading], and of how you know excalty waht tehse wrdos maen [Robust Processing Using Parallelism].
Audition is a specialized sense for gathering information from the fourth
dimension.
If vision lets you see where something is, hearing tells you when it is. The time
resolution of audition is way above that of vision. A cinema screen of 24 images a second
looks like a constant display, rather than 24 brief images. A sequence of 24 clicks a second sounds like a bunch of clicks — they don’t blur into a constant tone.
Listen to these three sound files:
At a frequency of 24 frames per second, film blurs into a continuous image. At 24
clicks per second, you perceive the sound as separate clicks. At four times that rate, you
still hear the sound as discontinuous. You may not be able to count the clicks, but you
know that the sound is made up of lots of little clicks, not one continuous hum. Auditory “flicker” persists up to higher frequencies than visual flicker before it is integrated into a continuous percept.
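The three sound files referred to above aren’t reproduced here, but you can synthesize comparable click trains yourself. The following is a minimal sketch, not from the book: it assumes Python with NumPy installed, and the file names, click length, and duration are arbitrary choices.

    # A minimal sketch (not from the book) that generates click trains like the
    # ones described above. Assumes NumPy; file names and click parameters are
    # illustrative choices.
    import wave
    import numpy as np

    SAMPLE_RATE = 44100  # samples per second

    def click_train(clicks_per_second, duration_s=2.0, click_samples=30):
        """Return a mono waveform containing brief clicks at the given rate."""
        signal = np.zeros(int(SAMPLE_RATE * duration_s))
        period = int(SAMPLE_RATE / clicks_per_second)  # samples between click onsets
        for start in range(0, len(signal) - click_samples, period):
            signal[start:start + click_samples] = 1.0  # a short rectangular click
        return signal

    def write_wav(filename, signal):
        """Write a waveform in [-1, 1] as 16-bit mono PCM."""
        samples = (signal * 32767).astype(np.int16)
        with wave.open(filename, "wb") as f:
            f.setnchannels(1)
            f.setsampwidth(2)  # 16-bit samples
            f.setframerate(SAMPLE_RATE)
            f.writeframes(samples.tobytes())

    # 24 clicks a second sounds like separate clicks, and so does 96,
    # even though 24 images a second already looks like continuous motion.
    write_wav("clicks_24.wav", click_train(24))
    write_wav("clicks_96.wav", click_train(96))

Play the two files back to back: even at 96 clicks a second the sound is still clearly made of clicks, a rough illustration of how much higher auditory flicker fusion sits than visual flicker fusion.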
Specialization for timing is evident in many parts of the auditory system. However, it
is the design of the sound receptor device (the ears) that is most crucial. In the eye,
light is converted to neural impulses by a slow chemical process in the receptor cells. In the ear, by contrast, sound is converted to neural impulses by a fast mechanical system.
Sound vibrations travel down the ear canal and are transmitted by the tiny ear bones (ossicles) to the snail-shaped cochlea, a piece of precision engineering in the inner ear. The cochlea performs a frequency analysis
of incoming sound, not with neural circuitry, but mechanically. It contains a curled
wedge, called the basilar membrane, which, due to its tapering thickness, vibrates to
different frequencies at different points along its length. It is here, at the basilar
membrane, that sound information is converted into neural signals, and even that is done mechanically rather than chemically. Along the basilar membrane are receptors, called
hair cells. These are covered in tiny hairs, which are in turn linked by tiny filaments.
When the hairs are pushed by a motion of the basilar membrane, the tiny filaments are
stretched, and like ropes pulling open doors, the filaments open many minute channels on
the hairs. Charged atoms in the surrounding fluid rush into the hair cells, and thus sound
becomes electricity, the native language of the brain. Even movements as small as those on
the atomic scale are enough to trigger a response. And for low frequency sounds (up to
1500 cycles per second), each cycle of the sound can trigger a separate group of
electrical pulses. For higher frequencies, individual cycles are not coded, just the
average intensity of the cycles. The cells that receive auditory timing input in the brain
can fire at a faster rate than any other neurons, up to 500 times a second.
This arrangement means that the auditory system is finely attuned to frequency and
timing information in sound waves. Sounds as low as 20 Hz (1 Hz is one cycle per second) and as high as 20,000 Hz can be represented. The timing sensitivity is exquisite; we can detect periods of silence in sounds as short as 1 millisecond (a thousandth of a second). Compare this with your visual system, which requires exposure to an image for
around 30 milliseconds to report an input to consciousness. Furthermore, thanks to the
specialized systems in the ear and in the brain, timing between the ears is even more exquisite. If sound arrives at one ear as little as 20 microseconds (millionths of a second) before arriving at the other, this tiny difference can be detected [Detect Sound Direction].
For perspective, an eye blink is in the order of 100,000 microseconds, 5000 times
slower.
Although vision dominates many other senses in situations of conflicting information [Put Timing Information into Sound and Location Information into Light], given the sensitivity of our ears, it is not surprising that audition dominates over vision for determining the timing of events.
We use this sensitivity to timing in many ways — notably in enjoying music and using the
onset of new sounds to warn us that something has changed somewhere.
Our ears let us know approximately which direction sounds are coming from.
Some sounds, like echoes, are not always informative, and there is a mechanism for
filtering them out.
A major purpose of audition is telling where things are. There’s an analogy used by
auditory neuroscientists that gives a good impression of just how hard a job this is. The
information bottleneck for the visual system is the ganglion cells that connect the eyes to
the brain [Understand Visual Processing]. There are about a million in each eye, so, in your vision, there are about
two million channels of information available to determine where something is. In contrast,
the bottleneck in hearing involves just two channels: one eardrum in each ear. Trying to
locate sounds using the vibrations reaching the ears is like trying to say how many boats
are out on a lake and where they are, just by looking at the ripples in two channels cut out
from the edge of the lake. It’s pretty difficult stuff.
Your brain uses a number of cues to solve this problem. A sound will reach the near ear
before the far ear, the time difference depending on the position of the sound’s source.
This cue is known as the interaural (between the ears) time difference. A sound will also be more intense at the near ear than the far ear. This cue is known as the interaural level difference.
Both these cues are used to locate sounds on the horizontal plane: the time difference
(delay) for low-frequency sounds and the level difference (intensity) for high-frequency
sounds (this is known as the Duplex Theory of sound localization). To locate sounds on the
vertical plane, other cues in the spectrum of the sound (spectral cues) are used. The
direction a sound comes from affects the way it is reflected by the outer ear (the ears we
all see and think of as ears, but which auditory neuroscientists call pinnae). Depending on
the sound’s direction, different frequencies in the sound are amplified or attenuated.
Spectral cues are further enhanced by the fact that our two ears are slightly different shapes and thus distort the sound vibrations differently.
The main cue is the interaural time difference. This cue dominates the others if they
conflict. The spectral cues, providing elevation (up-down) information, aren’t as accurate
and are often misleading.
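To get a feel for the sizes involved, here is a rough sketch, not from the book, of the interaural time difference under a deliberately simplified model: two ears roughly 0.21 meters apart, sound at about 343 m/s, a distant source, and no account taken of how sound bends around the head.

    # A back-of-the-envelope sketch (not from the book) of interaural time
    # difference. Ear separation and speed of sound are assumed round figures.
    import math

    EAR_SEPARATION_M = 0.21     # assumed distance between the ears
    SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air

    def interaural_time_difference_us(azimuth_degrees):
        """Approximate ITD in microseconds for a distant source at this azimuth.

        Azimuth is measured from straight ahead: 0 = dead ahead, 90 = directly
        to one side. The far ear hears the sound later by roughly the extra
        path length (ear separation * sin(azimuth)) divided by the speed of sound.
        """
        extra_path_m = EAR_SEPARATION_M * math.sin(math.radians(azimuth_degrees))
        return abs(extra_path_m) / SPEED_OF_SOUND_M_S * 1_000_000

    for azimuth in (0, 5, 33, 90):
        print(f"{azimuth:3d} degrees off center: "
              f"~{interaural_time_difference_us(azimuth):.0f} microseconds")

Even this crude estimate puts a source well off to one side at several hundred microseconds of delay, comfortably above the roughly 20-microsecond difference mentioned earlier. Notice too that a source behind you at the mirror-image angle gives exactly the same number, which is the ambiguity described below as the cone of confusion.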
Echoes are a further misleading factor, and seeing how we cope with them is a good way
to really feel the complexity of the job of sound localization. Most environments — not just
cavernous halls but the rooms in your house too — produce echoes. It’s hard enough to work
out where a single sound is coming from, let alone having to distinguish between original
sounds and
their reverberations, all of which come at you from different directions. The
distraction of these anomalous locations is mitigated by a special mechanism in the
auditory system.
Those echoes that arrive at your ears within a very short interval are grouped
together with the original sound, which arrives earliest. The brain takes only the first
part of the sound to place the whole group. This is noticeable in a phenomenon known as the Haas Effect, also called the principle of first arrival or the precedence effect.
The Haas Effect operates below a threshold of about 30–50 milliseconds between one
sound and the next. Now, if the sounds are far enough apart, above the threshold, then
you’ll hear them as two sounds from two locations, just as you should. That’s what we
traditionally call echoes. By making echoes yourself and moving from an above-threshold
delay to beneath it, you can hear the mechanism that deals with echoes come into
play.
You can demonstrate the Haas Effect by clapping at a large wall.[1]
Stand about 10 meters from the wall and clap your hands. At this distance,
the echo of your hand clap will reach your ears more than 50 milliseconds after the
original sound of your clap. You hear two sounds.
Now try walking toward the wall, while still clapping every pace. At about 5
meters — where the echo reaches your ears less than 50 ms after the original sound of the
clap — you stop hearing sound coming from two locations. The location of the echo has merged
with that of the original sound; both now appear to come as one sound from the direction
of your original clap. This is the precedence effect in action, just one of many
mechanisms that exist to help you make sense of the location of sounds.
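If you want to check those distances, the arithmetic is just the round trip to the wall and back at the speed of sound. A quick sketch, not from the book, assuming sound travels at roughly 343 m/s:

    # A quick sanity check (not from the book) on the clap-echo numbers above.
    # Assumes the clap happens at your position, so the echo's extra journey is
    # out to the wall and back.
    SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air

    def echo_delay_ms(distance_to_wall_m):
        """Milliseconds between your clap and its echo off the wall."""
        round_trip_m = 2 * distance_to_wall_m  # out to the wall and back
        return round_trip_m / SPEED_OF_SOUND_M_S * 1000

    for distance in (10, 7, 5, 3):
        print(f"{distance:2d} m from the wall: "
              f"echo arrives ~{echo_delay_ms(distance):.0f} ms after the clap")
    # 10 m gives ~58 ms (heard as a separate echo); 5 m gives ~29 ms, inside
    # the 30-50 ms window where the precedence effect fuses echo and clap.

Because the fusion threshold is a range of roughly 30–50 milliseconds rather than a sharp cutoff, the exact distance at which the echo merges with the clap will vary a little from listener to listener and room to room.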
The initial computations used in locating sounds occur in the brainstem, in a
peculiarly named region called the superior olive. Because the business of localization
begins in the brainstem, surprising sounds are able to quickly produce a turn of the head
or body to allow us to bring our highest resolution sense, vision, to bear in figuring out
what is going on. The rapidity of this response wouldn’t be possible if the information
from the two ears were integrated only later in processing.
The classic model for how interaural time differences are processed is called the
Jeffress Model, and it works as shown in Figure 4-1. Cells in the midbrain indicate a
sound’s position by increasing their firing in response to sound, and each cell takes
sound input from both ears. The cell that fires most is the one that receives a signal
from both ears simultaneously. Because
these cells are most active when the inputs from both sides are synchronized,
they’re known as coincidence-detector neurons.
Now imagine if a sound comes from your left, reaching your right ear only after a tiny
delay. If a cell is going to receive both signals simultaneously, it must be because the
left-ear signal has been slowed down, somewhere in the brain, to exactly compensate for
the delay. The Jeffress Model posits that the brain contains an array of
coincidence-detector cells, each with particular delays for the signals from either side.
By this means, each possible location can be represented with the activity of neurons with
the appropriate delays built in.
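To make the delay-line idea concrete, here is a toy sketch, not from the book and far simpler than real neural circuitry. Cross-correlation stands in for the array of coincidence detectors: each candidate detector pairs the right-ear signal with a differently delayed copy of the left-ear signal, and the detector whose built-in delay matches the real interaural delay responds most strongly. The sample rate, burst length, and delays are arbitrary choices for the demonstration.

    # A toy sketch (not from the book) of the Jeffress delay-line idea.
    # Assumes NumPy; all numbers here are illustrative.
    import numpy as np

    SAMPLE_RATE = 100_000  # 10-microsecond resolution, chosen for the demo
    rng = np.random.default_rng(0)

    # A brief noise burst reaches the left ear first; the right ear hears the
    # same burst 30 samples (300 microseconds) later.
    burst = rng.normal(size=2000)
    true_delay = 30
    left = np.concatenate([burst, np.zeros(true_delay)])
    right = np.concatenate([np.zeros(true_delay), burst])

    # Each "coincidence detector" delays the left input by a candidate amount
    # and measures how well it lines up with the right input.
    candidate_delays = range(0, 61)
    responses = [np.dot(np.roll(left, d), right) for d in candidate_delays]

    best = max(candidate_delays, key=lambda d: responses[d])
    print(f"Most active detector has a built-in delay of {best} samples "
          f"(~{best * 1_000_000 // SAMPLE_RATE} microseconds)")  # expect about 30

The most active “detector” is the one whose built-in 30-sample delay exactly cancels the 30-sample head start the left ear received, which is how, in the Jeffress scheme, activity in a particular cell stands for a particular location.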
The Jeffress Model may not be entirely correct. Most of the neurobiological evidence
for it comes from work with barn owls, which can strike prey in complete darkness.
Evidence from small mammals suggests other mechanisms also operate.[2]
An ambiguity of localization comes with using interaural time difference, because
sounds need not be on the horizontal plane — they could be in front, behind, above, or
below. A sound that comes in from your front right, on the same level as you, at an angle
of 33° will sound, in terms of the interaural differences in timing and intensity, just
like the same sound coming from your back right at an angle of 33° or from above right at
an angle of 33°. Thus, there is a “cone of confusion” as to where you place a sound, and
that is what is shown in Figure 4-2.
Normally you can use the other cues, such as the distortion introduced by your ears (spectral cues), to reduce the ambiguity.
The more information in a sound, the easier it is to localize: noise containing many different frequencies is easier to place than a pure tone. This is why white noise, which contains all frequencies in equal proportions, is now added to the sirens of emergency vehicles,[3] unlike the pure tones that have historically been used.
If you are wearing headphones, you don’t get spectral cues from the pinnae, so you
can’t localize sounds on the up-down dimension. You also don’t have the information to
decide if a sound is coming from in front or from behind.
Even without headphones, our localization in the up-down dimension is pretty poor.
This is why a sound from down and behind you (if you are on a balcony, for example) can
sound right behind you. By default, we localize sounds to either the center left or center
right of our ears — this broad conclusion is enough to tell us which way to turn our heads,
despite the ambiguity that prevents more precise localization.
This ambiguity in hearing is the reason we cock our heads when listening. By taking
multiple readings of a sound source, we overlap the ambiguities and build up a composite,
interpolated set of evidence on where a sound might be coming from. (And if you watch a
bird cocking its head, looking at the ground, it’s listening to a worm and doing the same.[4])
Still, hearing provides a rough, quick idea of where a sound is coming from. It’s
enough to allow us to turn toward the sound or to process sounds differently depending on
where they come from [Don’t Divide Attention Across Locations].