Sounds from the same spatial location are harder to separate, but not if you
use vision to fool your brain into “placing” one of the sounds somewhere else.
Sense information is mixed together in the brain and sorted by location [Don’t Divide Attention Across Locations], and we use this organization in choosing what to pay attention to (and therefore tune into). If
you’re listening to two different conversations simultaneously, it’s pretty easy if they’re
taking place on either side of your head — you can voluntarily tune in to whichever one you
want. But let’s say those conversations were occurring in the same place, on the radio: it’s
suddenly much harder to make out just one.
Which is why we can talk over each other in a bar and still understand what’s being
said, but not on the radio. On the radio, we don’t have any other information to
disambiguate who says what and the sounds get confused with each other.
— T.S.
Hang on...how do we decide on the spatial location of a sense like hearing? For sound
alone, we use clues implicit in what we hear, but if we can see where the sound originates, this visual information dominates [Put Timing Information into Sound and Location Information into Light].
Even if it’s incorrect.
Jon Driver from University College London¹ took advantage of our experience with syncing language sounds with lip movements to do a little hacking. He showed people a person talking on a television screen, but instead of the speech coming from the television, it was played through a separate amplifier, mixed with a distracting and completely separate voice. The television screen was either right next to the amplifier or some distance away. Subjects were asked to repeat the words spoken by the talking head on the television.
Subjects made more errors when they watched the talking head on a screen right next to the amplifier than when they watched it on a screen placed some distance from the sound. Even though both audio streams came from the single amplifier in both cases, moving the video image considerably changed the listener’s ability to tune in to one voice.
This experiment is a prime candidate for trying at home. An easy way is with a laptop hooked up to portable speakers, plus a radio. Have the laptop play a video with lots of speech in which you can see lip movements; a news broadcast, full of talking heads, is ideal. Now put the radio, tuned to a talk station, and the laptop’s speakers in the same location. That’s the single amplifier in Driver’s experiment. The two cases in the experiment correspond to the laptop screen being right next to the speakers or a few feet away. You should find that you understand what the talking heads on the video are saying more easily when the laptop is farther away. Give it a go.
It’s easier to understand what’s going on here if we think about it as two separate setups. Let’s call them “hard,” for the case in which you’re looking at the television right by the amplifier, and “easy,” for when you’re looking at a screen placed a little farther away.
In the hard case, there’s a video of a talking head on the television screen and two different voices, all coming from the same location. It’s hard because it’s easier to tune out of one information stream and into another if they’re in different locations (which is what [Don’t Divide Attention Across Locations] is all about). The fact that there’s a video of a talking head in this case isn’t really important.
The easy setup has one audio stream tucked off to the side somewhere, while a talking
head and its corresponding audio play on the television. It’s plain to see that tuning
into the audio on the television is a fairly simple task — I do it whenever I watch TV while
ignoring the noise of people talking in the other room.
But hang on, you say. In Driver’s experiment, the easy condition didn’t correspond to having one audio stream neatly out of the way and the one you’re listening to aligned with the television screen. Both audio streams were coming from the same place, from the amplifier, right?
Yes, right, but also no. Strictly speaking, both audio streams do still come from the
same place, but remember that we’re not very good at telling where sounds come from. We’re
so poor at it, we prefer to use what we see to figure out the origin of sounds instead [Put Timing Information into Sound and Location Information into Light]. When you look at the screen, the lip movements of the talking head are
so synchronized with one of the audio streams that your brain convinces itself that the
audio stream must be coming from the position of the screen too.
It’s whether the video is in the same place as the amplifier that counts in
this experiment. When the screen is in a different place from the amplifier, your brain
makes a mistake and mislocates one of the audio streams, so the streams are divided between locations and you can tune in to one and out of the other.
Never mind that the conversations can be tuned in to separately only because of a localization mistake; it still works. It doesn’t matter that this localization was an
illusion — the illusion could still be used by the brain to separate the information before
processing it. All our impressions are a construction, so an objectively wrong
construction can have as much validity in the brain as an objectively correct
construction.
Language isn’t just for talking to other people; it may play a vital role in helping
your brain combine information from different modules.
Language might be an astoundingly efficient way of getting information into your head
from the outside
[Speech Is Broadband Input to Your Head], but that’s not its only
job. It also helps you think. Far from being a sign of madness, talking to yourself is
something at the essence of being human.
Rather than dwell on the evolution of language and its role in rewiring the brain into its modern form,¹ let’s look at one way language may be used by our brains to do cognitive work. Specifically, we’re talking about the ability of language to combine information in ordered structures: in a word, syntax.
Peter Carruthers, at the University of Maryland,² has proposed that language syntax is used to combine, simultaneously, information from different cognitive modules. By “modules,” he means specialized processes into which we have no insight,³ such as color perception or instant number judgments [Count Faster with Subitizing]. You don’t know how you know that something is red or that there are two coffee cups; you just know. Without language syntax, the claim is, we can’t combine
this information.
The theory seems pretty bold, or maybe even wrong, but we’ll go through the evidence Carruthers uses and the details of what exactly he means, and you can make up your own mind. If he’s right, the implications are profound, clarifying exactly how deeply language is entwined with thought. At the very least, we hope to convince you that something interesting is going on in these experiments.
The experiment described here was done in the lab of Elizabeth Spelke.⁴ You could potentially do it in your own home, but be prepared to build some large props and to get dizzy.
Imagine a room like the one in Figure 5-4. The room is made up of four curtains, used to create four walls in a rectangle, defined by two types of information: geometric (two short walls and two long walls) and color (one red wall).
Now, think about the corners. If you are using only geometric information, pairs of
corners are identical. There are two corners with a short wall on the left and a long wall
on the right and two corners the other way around. If you are using only color
information, there are also two pairs of identical corners: corners next to a red wall and
corners not next to a red wall.
Using just one kind of information, geometry or color, lets you identify corners with
only 50% accuracy. But using both kinds of information in combination lets you identify
any of the four corners with 100% accuracy, because although both kinds of information are
ambiguous, they are not ambiguous in the same way.
So, here’s a test to see whether people can use both kinds of information in combination.⁵ Show a person something he’d like, such as some food, and let him see you hide it behind the curtains in one corner of the room. Now disorient him by spinning him around and ask him to find the food. If he can combine the geometric and the color information, he’ll have no problem: he’ll be able to tell unambiguously which corner it was hidden in. If he doesn’t combine information across modules, his first guess will be right only 50% of the time, and the rest of the time he’ll need a second guess to find the food.
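If you’d like to check the logic without building the room, here’s a minimal sketch in Python (our illustration, not anything from Spelke’s lab; the corner names and cue labels are invented for the example, and we assume the red wall is one of the short walls). It describes each corner by its geometric and color cues and works out the expected first-guess success rate for a searcher who remembers only some of those cues and otherwise guesses at random among the corners that match.

# Corner-finding logic for the rectangular room (illustrative sketch only).
# Diagonally opposite corners share the same geometric description; the red
# wall is assumed to be one of the short walls, so it touches two corners.
from fractions import Fraction

CORNERS = {
    "NE": {"geometry": "long wall on left",  "color": "plain"},
    "SW": {"geometry": "long wall on left",  "color": "red"},
    "NW": {"geometry": "long wall on right", "color": "red"},
    "SE": {"geometry": "long wall on right", "color": "plain"},
}

def first_guess_accuracy(cues):
    # Expected chance of picking the right corner on the first try, if the
    # searcher remembers only these cues about the target corner and guesses
    # at random among the corners that look identical under them.
    total = Fraction(0)
    for target in CORNERS:
        signature = tuple(CORNERS[target][cue] for cue in cues)
        matches = [name for name, desc in CORNERS.items()
                   if tuple(desc[cue] for cue in cues) == signature]
        total += Fraction(1, len(matches))
    return total / len(CORNERS)

print(first_guess_accuracy(["geometry"]))           # 1/2
print(first_guess_accuracy(["color"]))              # 1/2
print(first_guess_accuracy(["geometry", "color"]))  # 1

Either cue alone gives 1/2 and both cues together give 1: exactly the 50% versus 100% split that separates a searcher who combines the two kinds of information from one who doesn’t.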
Where does language come into it? Well, language seems to define the kinds of
subjects who can do this task at better than 50% accuracy. Rats can’t do it. Children who
don’t have language yet can’t do it. Postlinguistic children and adults can do it.
Convinced? Here’s the rub: if you tie up an adult’s language ability, her performance drops to close to 50%. This is what Linda Hermer-Vazquez, Elizabeth Spelke, and Alla Katsnelson showed.⁶ They had subjects do the experiment while continuously repeating the text of newspaper articles that were played to them over loudspeakers. This “verbal shadowing” task completely engaged their language ability, removing their inner monologue.
The same subjects could orient themselves and find the correct corner just fine when they weren’t shadowing speech. They could do it when they were doing an equivalently difficult
task that didn’t tie up their language ability (copying a sequence of rhythms by
clapping). But they couldn’t do it with their language resources engaged in something
else. There’s something special about language that is essential for reorienting yourself
using both kinds of information available in the room.
Peter Carruthers thinks that you get this effect because language is essential for
conjoining information from different modules. Specifically, he thinks that it is needed at the interface between beliefs, desires, and planning. Combining across modalities is possible without language for simple actions (see the other crossmodal hacks, [Combine Modalities to Increase Intensity] through [Hear with Your Eyes: The McGurk Effect], in this book for examples), but there’s something about planning, and that includes reorientation, that requires language.
This would explain why people sometimes begin to talk to themselves — to instruct
themselves out loud — during especially difficult tasks. Children use self-instruction as a
normal part of their development to help them carry out things they find difficult.⁷ Telling them to keep quiet is unfair and probably makes it harder for them to finish what they are doing.
If Carruthers is right, it means two things. First, if you are asking people to engage
in goal-oriented reasoning, particularly if it uses information of different sorts, you
shouldn’t ask them to do something else that is verbal, either listening or
speaking.
I’ve just realized that this could be another part of the reason, along with [Don’t Divide Attention Across Locations], that people can drive with the radio on but need to turn it off as soon as they don’t know where they are going and need to think about which direction to take. It also explains why you should keep quiet when the driver is trying to figure out where to go next.
— T.S.
Second, if you do want to get people to do complex multisequence tasks, they might
find it easier if the tasks can be done using only one kind of information, so that
language isn’t required to combine across modules.