Authors: Charles Sheffield
Tags: #High Tech, #General, #Science Fiction, #Mathematicians, #Adventure, #Life on Other Planets, #Space Colonies, #Fiction
Examining the mixture of facts, conjecture, and statistics on the displays, it was no surprise to find that the Argus, Odin, and Puzzle Network groups were all following roughly the same initial agenda. The signal had arrived as one immense and unstructured linear digit string. Without discovering and imposing order, you had no chance of deciphering message content. Therefore, you looked for rational ways to subdivide the whole into smaller sections.
You could try dozens of different ways. For example, you might examine the statistics of the whole string locally, where a “local” region contained anything from a thousand to a million digits. All the tools of signal processing were available for that analysis. In one common technique, regions of abnormally low entropy—where the next digit could be predicted with some confidence from the group of digits immediately preceding it—could be sought and marked. These might be “start message” and “end message” markers, since it seemed highly unlikely that the whole SETI signal held only one message. You had to remember how much information could be contained in twenty-one billion binary digits. It was like five thousand densely-written books.
But perhaps regions of low entropy were merely a hint to some other kind of information. An entropy analysis had already been performed, but whoever had done it made no assumptions as to its significance. Milly saw a whole library of possible maps, showing the signal divided into pieces and available for her inspection or continued analysis.
Of course, examining the statistical behavior of signal sections was not the only way to seek structure, and it might not be the best way. As a valid but quite different approach, you could scan the entire signal for test sequences that repeated over and over throughout its length. Naturally, a test sequence had to be long enough for its occurrence to provide information. If the whole signal were totally random, then a short sequence such as 1-0-0-1 could be found in it more than a billion times by sheer chance. On the other hand, if you chose a thirty-digit test sequence you would expect to find it only a score of times in a random string of twenty-one billion digits. Occurrence of that thirty-digit string fifty or sixty separate times was so unlikely that you would know you were on to something.
It was easy to say, “Examine the signal for test sequences long enough to be significant,” but the actual task was a monster. A billion different sequences existed with thirty binary digits. You needed to screen for all of them. That work was still going on.
And when you had found a particular sequence too often to believe that it was the result of chance, what came next? That was another and more difficult question. Perhaps each occurrence of a thirty-digit string indicated a starting point or an end point for an actual message. Then between each thirty-digit string that you found there were sure to be shorter repeated sequences of, say, six to twelve digits. These, particularly if groups of them came in close proximity, ought to form the message itself. In human terms, six binary digits were enough to encode letters of an alphabet, while twelve digits would suffice for most words. Even though there was surely no hope of finding the letters of any human language, it made sense to look for the universals of mathematics. The integers themselves should be easiest of all. Once you knew where each one started and stopped, the numerical value of a binary string was a unique number to within a reflection (should you read the number from left to right, or right to left?). Then you could begin to look for the symbols that stood for equality, less than, greater than, exponentiation, and other common arithmetical operations.
But this brought the interpretation teams face to face with the most vexing question of all: To what extent could you or should you assume that human thinking, human behavior, and human science applied in any way to a SETI message?
How alien was alien? This was the question that gave Milly nightmares. Even within the limited group of workers on Argus Station, she had found two schools of thought. One set—call them the optimists—assumed that any aliens advanced enough to send signals across space must be ahead of humans in every field of science. Moreover, the optimists were convinced that the aliens would have done their absolute best to make their messages easy to read. They would employ no tricks, such as run-length encoding, to reduce the volume of data transmitted and received.
The pessimists said, but wait a moment. These are aliens. Technical and scientific discoveries throughout human history didn’t come in the most convenient or logical order. Archimedes was unlucky. He had the integral calculus within his grasp, and if Arabic numeral notation had been available to him he would have beaten Newton and Leibniz by almost two millennia. Kepler, on the other hand, had been fortunate. The Greeks, from Euclid to Apollonius, had established hundreds of theorems concerning conic sections. When Kepler needed them in order to replace the old systems of epicycles with his own laws, those theorems sat waiting.
Aliens are likely to know different things, because there is no fixed order for discovery. Maybe we have as much to offer them as they have to offer us. Suppose they never invented the alphabet, or positional notation in mathematics? Then their messages could be all ideographs, their numbers Roman numerals. But far more likely they would use something less familiar and comprehensible than either.
Milly had long ago made her own decision as to where she stood. You could not afford to be either an extreme pessimist or an extreme optimist. On the side of pessimism, surely any aliens would be physically and mentally nothing like humans. They were, after all, aliens. Their languages, notations, and order of evolution of ideas would be vastly different. On the other hand, on the side of optimism, surely any alien thought processes must follow the universal laws of logic. Also, anyone who bothered to send messages far across space would want their messages to be not only received, but comprehended.
Once you accepted those two assumptions, you had certain guarantees. To take one simple example, no sensible alien would ever send as part of a message 2 ? 2 = 4 unless there was other independent evidence as to how the symbol ? was to be interpreted. The message was too ambiguous. The receiver could not determine whether ? stood for plus (2 + 2 = 4), times (2 × 2 = 4), or raise to a power (2^2 = 4).
If it were up to Milly, she knew exactly how she would build and send a SETI message. First, you defined special symbols that provided start and stop instructions; then you displayed the positive integers, with enough examples, such as sequences of primes, to make sure the receiver could be absolutely sure there was no misinterpretation.
After that came the symbols of common arithmetic, with examples showing how to add, subtract, multiply, and divide. From there it was a short step to negative numbers, fractions, powers, and irrationals. Imaginaries you would introduce using fractional powers of negative numbers. Then on to series of powers, and the elementary transcendental functions such as sines, cosines, logarithms, and exponents. In every case you would give enough examples to be sure there was no confusion. After providing series expressions for the universal transcendentals such as π or e, you would provide a check that all was being interpreted correctly by quoting one of mathematics’ enduring wonders, a formula that mysteriously links transcendental and imaginary numbers with the basic numerical building blocks of 1 and 0:
e^(iπ) + 1 = 0.
Mathematics was easy, the obvious way to start. After that, Milly would proceed to astronomy, physics, chemistry, and, finally and most difficult, language.
The trouble, of course, was that it was not up to Milly. She was not sending a message. She was receiving one. The difference, in terms of self-confidence, was the difference between being a doctor and being a patient.
The good news was that she was not working alone. People as smart as she, and probably a whole lot smarter, were her allies. The displays in front of her provided an overview of the whole signal in schematic form, subdivided into twenty-seven regions.
Using her console to control the rate at which she advanced, Milly set out to scan the entire length of the signal. The Puzzle Network team had worked cooperatively to attach their analyses to the appropriate regions. The result was like a gigantic snake, of which the string of digits of the signal itself formed a narrow backbone. Here and there, in places where something particularly interesting and significant had been discovered, the snake bulged out like a python that had swallowed a pig.
Milly backed the scan to study Section 7, the fourth bulge, which at first sight was bigger than all the others. The comments were offered in ordered bunches:
* * *
Attoboy:
The structure here is odd. High entropy sequences of average length 10^6 digits are regularly interspersed with low entropy regions each of constant length 3.3554 × 10^7 digits. Any thoughts?
Sneak Attack:
Yes. We could be seeing sections of “text” (variable but roughly equal lengths) that introduce or describe a “picture” (something in image format, with a constant array size). Maybe square arrays of black and white images, each about 6,000 × 6,000 elements?
Claudius:
More likely, a gray scale image 4,096 × 4,096 (2^12 × 2^12—that supports the notion of binary representations), with 2 bits (4 levels) for each pixel. That fits with the exact size of the low entropy regions, 33,554,432 bits.
Sneak Attack:
Could just as well be 2,048 × 2,048, with 256 (8-bit) gray levels.
Claudius:
Should be easy enough to find out which. If we assume a particular line length and do cross-correlations of successive lines, the correct line length should jump right out at us when we get to it, because the correlation will be a lot higher. Let me take a look.
* * *
That was all for that cluster of messages. Presumably Claudius did not yet have an answer, or at least not one that “jumped right out.” Milly moved on.
The seventh bulge along the signal’s spine, Section 12, contained remarks similar to the previous one, except for three added comments:
* * *
Megachirops:
In this case the low entropy regions have a constant length of 4,194,304 bits, exactly one-eighth as long as in Section 7. Does anyone else find this somewhat surprising?
Ghost Boy:
We would probably make them all the same size. The difference may be part of the message, trying to tell us something.
Claudius:
Or could these be line drawings?—binary images, black and white with no shades of gray.
* * *
The ninth bulge supported a hypothesis offered early in the history of SETI:
* * *
The Joker:
My frequency analysis of this section suggests that we are dealing with base 4 arithmetic, rather than the base 2 binary we have seen elsewhere. The temptation to interpret this as a biological description in terms of strings of four nucleotides is strong.
Attoboy:
Beware of anthropomorphism. But I agree, the temptation is strong. I will try to correlate this section with everything in the genome library.
* * *
Not surprisingly, Attoboy had not yet reported results from that effort. The task was a monstrous one. The library to be examined held complete genomes for more than two million species, everything from humans and oak trees and mushrooms to the smallest and simplest viruses. No one, no matter how optimistic, would hope for an exact match. It would be a miracle (and enormously relevant to the universal nature of life) if anything correlated at all with a living creature from Earth. But Attoboy was right, you couldn’t afford not to look.
Milly worked her way on through the signal, section by section. The exercise was giving her a strong inferiority complex. The analyses that she was seeing had been performed so quickly, and offered such powerful evidence of ingenuity—what could she possibly contribute? The team had already established the existence of unique start and stop sequences, each fourteen bits long. Numerical base and reading order were known beyond doubt: integers were base 2 and base 4, with the most significant digit to the right. Sequences of primes and squares and cubes had been discovered, more than long enough to be unambiguous.
When she came to the very end of the signal, with its termination as a repeated pattern of the fourteen-bit stop sequence, Milly at once went back to the beginning and started over. The easy part, following the trail that others had already marked out, was over. Now she had to do something to justify her own presence in the group.
Sit, observe, learn, keep quiet; that was all very well—for the first half-day. After that, Milly hoped to bring her own special knowledge and experience to bear. She went to a section near the middle of the signal, where analysis and comments by members of the Puzzle Network were meager and felt tentative. This was a place with special significance for Milly. It was here that she had first noticed the oddity that had evolved into the Wu-Beston anomaly, and she had studied it extensively.
Something she had brought with her from Argus Station, more important than clothing or personal effects, was her own suite of processing programs. She had no illusions that they were better than anyone else’s; what she was sure of was that they were different. Also, they were hers, and she knew them inside-out.
She began her analysis. It was similar to what she had attempted months ago, with one crucial difference: she could now build on everything established or conjectured by the Puzzle Network group. The start-stop coding sequence was known. She was sure of the integers. Perhaps most important of all, she knew that what she was dealing with was a signal. Puzzles always become easier when you know that a solution exists.
The section that she clipped out for inspection was only a small part of the whole, roughly a hundred million digits out of twenty-one billion. You could eat that up very quickly with images, but she had deliberately avoided low-entropy data runs. What she hoped to find was “text”—whatever that term might mean to an alien mind. It was too early in the game to hope for keys to an actual language.