But How Do It Know? The Basic Principles of Computers for Everyone
Authors: J Clark Scott
Eight Is Enough
To represent something more than simple yes/no matters, we are going to stack up eight bits in a single package and use them as a single unit. Here is a diagram of how it is done. We have taken eight of our memory bits. Each one still has its own data input ‘i’ and its own output ‘o,’ but we have wired all eight of the set inputs ‘s’ together. Thus when the single ‘s’ gets turned on and then off again, all eight of these ‘M’s will capture the states of their corresponding ‘i’s at the same time. The picture on the left shows all eight ‘M’s; the one on the right is the same thing, just a little simpler.
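For readers who like to see such things acted out, here is a minimal sketch of that wiring in Python. The class name `MemoryByte` and the method `update` are made up for this illustration and are not part of the diagram. It only assumes what the text says: eight stored bits, each with its own input, all sharing one ‘s’ wire.

```python
class MemoryByte:
    """Eight one-bit memories that share a single 's' (set) wire."""

    def __init__(self):
        # The eight outputs 'o', top bit first. They start out all off.
        self.bits = [0] * 8

    def update(self, inputs, s):
        # While 's' is on, every bit captures its own input 'i'.
        # While 's' is off, the stored bits ignore the inputs.
        if s:
            self.bits = list(inputs)
        return self.bits


byte = MemoryByte()
byte.update([0, 1, 0, 0, 0, 1, 0, 1], s=1)   # 's' on: all eight bits are captured
byte.update([1, 1, 1, 1, 1, 1, 1, 1], s=0)   # 's' off: the inputs are ignored
print(byte.bits)  # [0, 1, 0, 0, 0, 1, 0, 1]
```

The point of the sketch is only that one ‘s’ controls all eight bits at once.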
This assembly has a name; it is called a byte, thus the “B” in the diagram. There are several conflicting explanations of exactly where this word came from, but since it sounds just like the word “bite,” you can just think of it as a whole mouthful compared with a smaller unit, a bit. Just to show you that computer designers do have a sense of humor, when they use four bits as a unit, they call it a nibble. So you can eat a tiny bit of cherry pie, or have a nibble or take a whole byte.
When we had a bit, we would just say that its state was either 0 or 1. Now that we have a byte, we will write the contents of the byte like this: 0000 0000, and you can see why we switched from using off/on to 0/1. That shows the contents of each of the eight bits, in this case they are all zeros. The space in the middle is just there to make it a little easier to read. The left hand 0 or 1 would correspond to the top bit in our byte, and the rightmost 0 or 1 would represent the bottom bit.
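As a small illustration of this notation, the following Python sketch writes out eight bits the same way, top bit first, with a space in the middle. The function name `show_byte` is invented for this example.

```python
def show_byte(bits):
    """Write eight bits as text, top bit first, with a space in the middle."""
    text = "".join(str(bit) for bit in bits)
    return text[:4] + " " + text[4:]


print(show_byte([0, 0, 0, 0, 0, 0, 0, 0]))  # 0000 0000
print(show_byte([0, 1, 0, 0, 0, 1, 0, 1]))  # 0100 0101
```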
As you had better know by now, a bit has two possible states that it can be in — on or off. If you have two bits, there are four possible states that those two bits can be in. Do you remember the chart we drew for the inputs of the NAND gate? There were four lines on the chart, one for each possible combination of the two input bits to the gate, 0-0, 0-1, 1-0 and 1-1.
Notice that the order of the bits does matter. That is, if you look at two bits and only ask how many bits are on, there are only three possibilities: no bits on, one bit on or two bits on. That would be calling the 1-0 and 0-1 combinations the same thing. For the purpose of using multiple bits to implement a code, we definitely care about the order of the bits in a byte. When there are two bits, we want to use all four possibilities, so we have to keep the bits in order.
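The difference between keeping the bits in order and merely counting how many are on can be checked with a few lines of Python. This is just a sketch; the variable names are arbitrary.

```python
from itertools import product

# Every ordered combination of two bits.
ordered = list(product([0, 1], repeat=2))
print(ordered)  # [(0, 0), (0, 1), (1, 0), (1, 1)]

# If we only ask how many bits are on, 0-1 and 1-0 collapse into one answer.
counts = sorted({sum(pair) for pair in ordered})
print(counts)  # [0, 1, 2]
```

Four ordered combinations, but only three different counts.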
How many different possibilities are there when you use eight bits? If all you have is one bit, it can be in one of two states. If you add a second bit, the pair has twice as many states as before because the old bit has its two states while the new bit is one way, and then the old bit has its two states while the new bit is the other way. So two bits have four states. When you add a third bit, the first two have four states with the new bit off and four states with the new bit on, for a total of eight states. Every time you add a bit, you just double the number of possible states. Four bits have 16 states, five have 32, six have 64, seven have 128, eight have 256, nine have 512 states, and so on.
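The doubling can be followed step by step in a short Python sketch. The function name `count_states` is invented here. The last line cross-checks the answer for eight bits by actually listing every ordered pattern.

```python
from itertools import product


def count_states(bits):
    """Count the possible states by doubling once for every bit added."""
    states = 1
    for _ in range(bits):
        states *= 2
    return states


for bits in range(1, 10):
    print(bits, "bits:", count_states(bits), "states")

# Cross-check: list every ordered pattern of eight bits and count them.
print(len(list(product([0, 1], repeat=8))))  # 256
```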
We are going to take eight bits, and call it a byte. Since a bit is a thing that has a location in space, that can be in one of two states, then a byte is a thing that has eight separate locations in space, each of which can be on or off, that are kept in the same order. The byte, taken as a whole, is a location in space that can be in any one of 256 states at any given time, and may be made to change its state over time.
Codes
A bit could only represent yes/no types of things, but now that we have 256 possibilities, we can look for things in our lives that are slightly more complicated.
One of the first things that might fit the bill is written language. If you look in a book and see all of the different types of symbols that are used to print the book, you will see all 26 letters of the alphabet in uppercase as well as lowercase. Then there are the numbers 0 through 9, and there are punctuation marks like periods, commas, quotes, question marks, parentheses and several others. Then there are special symbols like the ‘at’ sign (@), the currency sign ($), and more. If you add these up, 52 letters, 10 numbers, a few dozen for punctuation and symbols, you get something like 100 different symbols that may appear printed on the pages of the average book.
From here on out, we will use the word ‘character’ to mean one of this sort of thing, one of the letters, numbers, or other symbols that are used in written language. A character can be either a letter, a number, a punctuation mark or any other type of symbol.
So we have written language with about 100 different characters, and our byte with 256 possibilities. Maybe we can represent language with bytes. Let’s see, how do you put an ‘A’ into a byte? There is nothing inherent in a byte that would associate it with a character, and there is nothing inherent in a character that has anything to do with bits or bytes. The byte doesn’t hold shapes or pictures. Dividing a character into eight parts does not find any bits.
The answer, as before, is to use a code to associate one of the possible states of the byte with something that exists in the real world. The letter ‘A’ will be represented by a particular pattern of 1s and 0s in the bits of a byte. The byte has 256 different possible states, so someone needs to sit down with pencil and paper and list out all 256 of those combinations, and next to each one, put one of the characters that he wants that pattern to represent. Of course, by the time he gets to the 101st line or so, he’ll run out of characters, so he can add every type of rarely used symbol he can think of, or he can just say that the rest of the combinations will have no meaning as far as written language is concerned.
And so, in the early days of computers, each manufacturer sat down and invented a code to represent written language. At some point, the different companies realized that it would be beneficial if they all used the same code, in case they ever wanted their company’s computers to be able to communicate with another brand. So they formed committees, held meetings and did whatever else they needed to do to come up with a code that they could all agree on.
There are several versions of this code designed for different purposes, and they still hold meetings today to work out agreements on various esoteric details of things. But we don’t need to concern ourselves with all that to see how a computer works. The basic code they came up with is still in use today, and I don’t know of any reason why it would ever need to be changed.
The code has a name: the American Standard Code for Information Interchange. This is usually abbreviated to ASCII, pronounced “aass-key.” We don’t need to print the whole code here, but here’s a sample. These are 20 of the codes that they came up with, the first 10 letters of the alphabet in uppercase and lowercase:
PART OF ASCII CODE TABLE
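| A | 0100 0001 | a | 0110 0001 |
| B | 0100 0010 | b | 0110 0010 |
| C | 0100 0011 | c | 0110 0011 |
| D | 0100 0100 | d | 0110 0100 |
| E | 0100 0101 | e | 0110 0101 |
| F | 0100 0110 | f | 0110 0110 |
| G | 0100 0111 | g | 0110 0111 |
| H | 0100 1000 | h | 0110 1000 |
| I | 0100 1001 | i | 0110 1001 |
| J | 0100 1010 | j | 0110 1010 |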
Each code is unique. It’s interesting to note the way that they arranged the codes so that the codes for uppercase and lowercase of the same letter use the same code except for one bit. The third bit from the left is off for all uppercase letters, and on for all lowercase letters.
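The one-bit difference between uppercase and lowercase is easy to check in Python, whose built-in `ord` and `chr` convert between a character and its code. The names `CASE_BIT` and `flip_case` are invented for this sketch, and the trick is only meant for the 26 letters, not for digits or punctuation.

```python
# The third bit from the left: off for uppercase letters, on for lowercase.
CASE_BIT = 0b0010_0000


def flip_case(letter):
    """Switch a letter between uppercase and lowercase by flipping one bit."""
    return chr(ord(letter) ^ CASE_BIT)


# Print the codes for the first 10 letters in both cases.
for upper in "ABCDEFGHIJ":
    lower = flip_case(upper)
    print(upper, format(ord(upper), "08b"), lower, format(ord(lower), "08b"))
```

The first line printed is `A 01000001 a 01100001`, and every other line differs between its two codes in that same single bit.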
If you wanted to put a message on your computer screen that said “Hello Joe” you would need nine bytes. The first byte would have the code for uppercase “H”, the second byte would have the code for lowercase “e”, the third and fourth bytes would have the code for lowercase “l”, the fifth byte would have the code for lowercase “o”, the sixth byte would have the code for a blank space, and bytes seven, eight and nine would contain the codes for “J”, “o” and “e.”
Notice that there is even a code for a blank space (it is 0010 0000, by the way). You may wonder why there needs to be a code for a blank space, but that just goes to show you how dumb computers are. They don’t really contain sentences or words. There are just a number of bytes set with the codes from the ASCII code table that represent the individual symbols that we use in written language. And one of those “symbols” is the lack of any symbol, called a space, that we use to separate words. That space tells us, the reader, that this is the end of one word and the beginning of another. The computer only has bytes, each of which can be in one of its 256 states. Which state a byte is currently in means nothing to the computer.
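Here is a short Python sketch that lists the nine bytes for “Hello Joe,” including the one for the blank space. It relies on Python’s built-in ASCII encoding. The variable names are arbitrary.

```python
message = "Hello Joe"
codes = message.encode("ascii")  # one byte per character

print(len(codes))  # 9

# Show each character next to the bit pattern stored in its byte.
for character, code in zip(message, codes):
    print(repr(character), format(code, "08b"))
```

The sixth line of that listing is the space, shown as `' ' 00100000`.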
So let us take a memory byte, and set the bits to 0100 0101. That means that we have put the letter E into the byte, right? Well… not really. We have set the pattern that appears next to the letter E in the ASCII code table, but there is nothing inherent in the byte that has to do with an ‘E.’ If Thomas Edison had been testing eight of his new experimental light bulbs, and had them sitting in a row on a shelf, and the first, third, fourth, fifth and seventh light bulbs had burned out, the remaining light bulbs would be a byte with this pattern. But there wasn’t a single person on the face of the Earth who would have looked at that row of bulbs and thought of the letter ‘E,’ because ASCII had not yet been invented. The letter is represented by the code. The only thing in the byte is the code.
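The same point can be made in a few lines of Python. The pattern below is just eight bits. Reading it as a letter or as a row of light bulbs is something we do to it from the outside. This is only an illustrative sketch, and the variable names are made up.

```python
pattern = "01000101"  # the eight bits, first bit on the left

# Reading the pattern through the ASCII code table gives a letter.
letter = chr(int(pattern, 2))
print(letter)  # E

# Reading the very same pattern as Edison's shelf gives a row of bulbs.
bulbs = ["lit" if bit == "1" else "burned out" for bit in pattern]
burned_out = [position for position, bulb in enumerate(bulbs, start=1)
              if bulb == "burned out"]
print(burned_out)  # [1, 3, 4, 5, 7]
```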
There you have the subject of codes. A computer code is something that allows you to associate each of the 256 possible patterns in a byte with something else.
Another language note here: sometimes the word code refers to the whole list of patterns and what they represent, as in “This message was written with a secret code.” Sometimes code just refers to one of the patterns, as in “What code is in that byte?” It will be pretty obvious from the context which way it is being used.
Back to the Byte
Do you remember the memory byte we drew a few chapters ago? It was eight memory bits with their ‘s’ wires all connected together. Almost every time that we need to remember a byte inside a computer, we also need an additional part that gets connected to the byte’s output. This extra part consists of eight AND gates.
These eight AND gates, together, are called an “Enabler.” The drawing on the left shows all of the parts, the drawing on the right is a simpler way to draw it.
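A minimal sketch of the Enabler in Python follows. The text here only says that it is eight AND gates on the byte’s output. The sketch assumes that the second input of every gate is tied to one shared wire, called `e` below. The names `enabler` and `e` are assumptions made for this illustration.

```python
def enabler(inputs, e):
    """Eight AND gates: each one ANDs a single input bit with the shared bit 'e'."""
    return [bit & e for bit in inputs]


stored = [0, 1, 0, 0, 0, 1, 0, 1]

print(enabler(stored, 1))  # [0, 1, 0, 0, 0, 1, 0, 1]  the byte passes through
print(enabler(stored, 0))  # [0, 0, 0, 0, 0, 0, 0, 0]  every output is off
```

With `e` on, the outputs copy the byte. With `e` off, all eight outputs are off no matter what the byte contains.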