Mind Hacks™: Tips & Tools for Using Your Brain
Tom Stafford and Matt Webb
This chapter looks at how we integrate our perceptions — images (Chapter 2), sounds (Chapter 4), our own mechanisms of attention (Chapter 3), and our other senses [Build Your Own Sensory Homunculus] — into a unified perceptual experience.
For instance, how do we use our eyes and ears together? (We prefer to use our ears for timing and eyes for determining location [Put Timing Information into Sound and Location Information into Light].) And what are the benefits of doing so? (We feel experiences that happen in two senses simultaneously as more intense [Combine Modalities to Increase Intensity and Watch Yourself to Feel More].)
Sometimes, we overintegrate. The Stroop Effect [Confuse Color Identification with Mixed Signals], demonstrated in a classic experiment, shows that if we try to respond linguistically, irrelevant linguistic input interferes. In its eagerness to assimilate as much associated information, as much context, as possible, the brain makes it very hard to ignore even what we consciously know is unimportant.
We’ll also look at one side effect and one curious limitation of the way we integrate sense information. The first goes to show that even the brain’s errors can be useful and that we can actually use a mistaken conclusion about a sound’s origin to better listen to it [Pay Attention to Thrown Voices]. The second answers the question: do we really need language to perform what should be a basic task, of making a simple deduction from color and geometry? In some cases, it would appear so [Talk to Yourself].
The timing of an event will be dominated by the sound it makes, the location by where
it looks as if it is happening — this is precisely why ventriloquism works.
Hearing is good for timing [Detect Timing with Your Ears] but not so good for locating things
in space. On the flip side, vision has two million channels for detecting location in space
but isn’t as fast as hearing.
What happens when you combine the two? What you’d expect from a well-designed
bit of kit: vision dominates for determining location, audition dominates for determining
timing. The senses have specialized for detecting different kinds of information, and when
they merge, that is taken into account.
You can see each of the two senses take control in the location and timing domains. In
the first part, what you see overrules the conflicting location information in what you
hear; in the second part, it’s the other way around.
Go to the theater, watch a film, or play a movie on your PC, listening to it on headphones. You see people talking and the sound matches their lip movement [Hear with Your Eyes: The McGurk Effect].
It feels as if the sound is coming from the same direction as the images you are
watching. It’s not, of course; instead, it’s coming at you from the sides, from the
cinema speakers, or through your headphones.
The effect is strongest at public lectures. You watch the lecturer on stage talking
and don’t notice that the sound is coming at you from a completely different direction,
through speakers at the sides or even back of the hall. Only if you close your eyes can
you hear that the sounds aren’t coming from the stage. The visual correspondence with
the sounds you are hearing causes your brain to absorb the sound information into the
same event as the image, taking on the location of the image. This is yet another
example (for another, see Watch Yourself to Feel More) of how our most
important sense, vision, dominates the other senses.
Incidentally, this is how ventriloquism works. The ventriloquist knows that if the
timings of the dummy’s lip movements are close enough to the sounds you hear, you will
preconsciously locate the sounds as coming from the dummy. Every time we go to the
cinema we are experiencing a ventriloquism effect, but it is so finessed that we don’t
even notice that it is part of the show.
— T.S.
Vision doesn’t always dominate. Watch Ladan Shams’s “Sound-induced Illusory Flashing” movies at Caltech (http://shamslab.psych.ucla.edu/demos/; QuickTime).1
They show a black dot flashing very briefly on a white background. The
only difference between the movie on the left and the movie on the right is the sound
played along with
the flash of the dot. With one set you hear a beep as the dot appears; with
another set you hear two beeps.
Notice how the sound affects what you see. Two beeps cause the dot not to flash but
to appear to flicker. Our visual system isn’t so sure it is seeing just one event, and
the evidence from hearing is allowed to distort the visual impression that our brain
delivers for conscious experience.
When the experiment was originally run, people were played up to four beeps with a
single flash. For anything more than one beep, people consistently experienced more than
one flash.
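You don’t need the movies to try this. The following sketch is a rough, homemade version of the setup rather than Shams’s actual stimuli: it uses Python’s standard tkinter toolkit to flash a dot once on a white background while sounding either one or two system beeps, and the timing values are guesses chosen only to make the effect visible.

    # A minimal homemade version of the sound-induced flash setup (illustrative only).
    # One brief flash of a black dot on a white background, paired with one or two
    # system beeps. Timing values are guesses, not those used in the original study.
    import tkinter as tk

    FLASH_MS = 30      # how long the dot stays on screen
    BEEP_GAP_MS = 60   # gap between beeps in the two-beep condition

    def run_trial(beeps):
        root = tk.Tk()
        root.title(f"{beeps} beep(s), 1 flash")
        canvas = tk.Canvas(root, width=400, height=400, bg="white")
        canvas.pack()

        def flash():
            dot = canvas.create_oval(180, 180, 220, 220, fill="black", outline="")
            root.after(FLASH_MS, lambda: canvas.delete(dot))

        def beep(times):
            root.bell()  # the system bell stands in for the tone; it must be enabled
            if times > 1:
                root.after(BEEP_GAP_MS, lambda: beep(times - 1))

        # Start the flash and the first beep together, then close the window.
        root.after(500, flash)
        root.after(500, lambda: beep(beeps))
        root.after(2000, root.destroy)
        root.mainloop()

    if __name__ == "__main__":
        run_trial(1)   # one beep: the dot reads as a single flash
        run_trial(2)   # two beeps: many people see the dot flicker twice

Run it a few times and compare how the single flash looks in the one-beep and two-beep conditions.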
Aschersleben and Bertelson2 demonstrated that the same principle applied when people produced timed
movements by tapping. People tapping in time with visual signals were distracted by
mistimed sound signals, whereas people tapping in time with sound signals weren’t as
distracted by mistimed visual signals.
This kind of dominance is really a bias. When the visual information about timing is
ambiguous enough, it can be distorted in our experience by the auditory information. And
vice versa — when auditory information about location is ambiguous enough, it is biased in
the direction of the information provided by visual information. Sometimes that distortion
is enough to make it seem as if one sense completely dominates the other.
Information from the nondominant sense (vision for timing, audition for location) does
influence what result the other sense delivers up to consciousness but not nearly so much.
The exact circumstances of the visual-auditory event can affect the size of the bias too.
For example, when judging location, the weighting you give to visual information is
proportional to the brightness of the light and inversely proportional to the loudness of
the sound.3 Nevertheless, the bias is always weighted toward using vision for location
and toward audition for timing.
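One way to picture this bias is as a weighted average of where each sense, on its own, would place the event. The sketch below is only a toy illustration of that idea: the weighting function and the numbers in it are invented, not taken from the study cited above.

    # A toy model of the location bias: the perceived location of an audiovisual event
    # is a weighted average of the visual and auditory position estimates. The weights
    # here are made up; they just capture the claim that brighter lights pull the
    # estimate toward vision and louder sounds pull it toward audition.

    def perceived_location(visual_pos, auditory_pos, brightness, loudness):
        """Positions are in any common unit (say, degrees); only the ratio of
        brightness to loudness matters in this toy model."""
        w_visual = brightness / (brightness + loudness)
        w_auditory = 1.0 - w_visual
        return w_visual * visual_pos + w_auditory * auditory_pos

    # A bright light 10 degrees left of a quiet sound: the event seems to come from
    # almost exactly where the light is (the ventriloquism effect).
    print(perceived_location(visual_pos=-10.0, auditory_pos=0.0,
                             brightness=9.0, loudness=1.0))   # -9.0

    # Dim the light and turn up the sound, and the estimate drifts toward the sound.
    print(perceived_location(visual_pos=-10.0, auditory_pos=0.0,
                             brightness=1.0, loudness=3.0))   # -2.5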
The weighting our brain gives to information from these two senses is a result of the
design of our senses, so you can’t change around the order of dominance by making sounds
easier to localize or by making lights harder to locate. Even if you make the sound
location-perfect, people watching are still going to prefer to experience what they see as
where they see it, and they’ll disregard your carefully localized sounds.
Attention isn’t separate for different senses. Where you place your attention in
visual space affects what you hear in auditory space. Attention exists as a central,
spatially allocated resource.
Where you direct attention is not independent across the senses. Where you pay attention in space with one sense affects the other senses.1
If you want people to pay attention to information across two modalities (a modality is a sense mode, like vision or audition), they will find this easiest if the information comes from the same place in space. Alternatively, if you
want people to ignore something, don’t make it come from the same place as something they
are attending to. These are lessons drawn from work by Dr. Charles Spence of the Oxford University crossmodal research group (http://psyweb.psy.ox.ac.uk/xmodal/default.htm). One experiment that everyone will be able to empathize with involves listening to speech while driving a car.2 Listening to a radio or mobile phone on a speaker from the back of a car makes it harder to spot things happening in front of you.
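To make the first of those lessons concrete, here is a small sketch of one way an interface might co-locate an audio cue with a visual alert. It is my own illustration rather than anything from Spence's work: the idea is simply to derive the stereo pan of the cue from the alert's horizontal position on screen, so sound and image seem to come from roughly the same place.

    # Map a visual alert's horizontal screen position to a stereo pan value, so an
    # accompanying audio cue can be played from roughly the same apparent location.
    # The function and its names are illustrative, not part of any particular toolkit.

    def pan_for_position(x: float, screen_width: float) -> float:
        """Return a pan value in [-1.0, 1.0]: -1.0 is hard left, +1.0 is hard right."""
        if screen_width <= 0:
            raise ValueError("screen_width must be positive")
        return max(-1.0, min(1.0, 2.0 * x / screen_width - 1.0))

    # An alert near the right edge of a 1920-pixel-wide display:
    print(pan_for_position(1700, 1920))   # about 0.77, mostly in the right channel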
Obviously showing this in real life is difficult. It’s a complex situation with lots
of variables, and one of these is whether you crash your car — not the sort of data
psychologists want to be responsible for creating. So Dr. Spence created the next best
thing in his lab — an advanced driving simulator, which he sat people in and gave them the
job of simultaneously negotiating the traffic and bends while repeating sets of words
played over a speaker (a task called shadowing). The speakers were
placed either on the dashboard in front or to the side.
Drivers who listened to sounds coming from the sides made more errors in the shadowing
task, drove slower, and took longer to decide what to do at junctions.
You can see a coping strategy in action if you sit with a driver. Notice how he’s happy
to talk while driving on easy and known roads, but falls quiet and pops the radio off when
having to make difficult navigation decisions.
This experiment — and any experience you may have had with trying to drive with
screaming kids in the backseat of a car — shows that attention is allocated in physical
space, not just to particular things arbitrarily and not independently across modalities.
This is unsurprising, given that we know how interconnected cortical processing is [Bring Stuff to the Front of Your Mind] and that it is often organized in maps that use spatial coordinate frames [Build Your Own Sensory Homunculus]. The spatial constraints on attention may reflect the physical limits of
modulating the activity in cortical processing structures that are themselves organized to
mirror physical space.
Other studies of complex real-world tasks like this have shown that people are actually very good at coordinating their mental resources. The experiments that motivated this one proved that attention is allocated in space and that
dividing it in space, even across modalities, causes difficulties. But these experiments
tested subjects who weren’t given any choice about what they did — the experimental setup
took them to the limits of their attentional capacities to test where they broke
down.
Whether these same factors had an effect in a real-world task like driving was
another question. When people aren’t at the limit of their abilities, they can switch
between tasks, rather than doing both at once — allocating attention dynamically, shifting
it between tasks as the demands of the task change. People momentarily stop talking when driving at sections of the road that are
nonroutine, like junctions, in order to free up attention, avoiding getting trapped in the
equivalent of Spence’s shadowing task.
The driving experiment shows that, despite our multitasking abilities, the spatial
demands of crossmodal attention do influence driving ability. The effect might be small,
but when you’re travelling at 80 mph toward something else that is travelling at 80 mph
toward you, a small effect could mean a big difference.
One of the early conclusions drawn from research into crossmodal attention3 was that it was possible to divide attention between modalities without
clashing, so if you wanted users to simultaneously pay attention to two different streams
of information, they should appear in two different modalities. That is, if a person needs
to keep on top of two rapidly updating streams, you’re better off making one operate in
vision and one in sound rather than having both appear on a screen, for example. The
results discussed in this hack suggest two important amendments to this rule of thumb:
4