HTML The Definitive Guide

Authors: Chuck Musciano Bill Kennedy

Chapter 1

HTML and the World Wide Web

1.2 Talking the Internet Talk

Every computer connected to the Internet (even a beat-up old Apple II) has a unique address: a number whose format is defined by the Internet Protocol (IP), the standard that defines how messages are passed from one machine to another on the Net. An IP address is made up of four numbers, each less than 256, joined together by periods, such as 192.12.248.73 or 131.58.97.254.
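
To make the address format concrete, here is a minimal Python sketch that checks whether a string has the four-numbers-below-256 shape described above, and packs those four numbers into a single integer. The function names `is_dotted_quad` and `to_integer` are our own, chosen for illustration; they are not part of any standard.

```python
def is_dotted_quad(text: str) -> bool:
    """Return True if text is four decimal numbers, each below 256, joined by periods."""
    parts = text.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Each part must consist only of ASCII digits and stay below 256.
        if not (part.isascii() and part.isdigit()) or int(part) > 255:
            return False
    return True


def to_integer(text: str) -> int:
    """Pack the four numbers of a valid address into one 32-bit integer."""
    if not is_dotted_quad(text):
        raise ValueError(f"not a dotted-quad address: {text!r}")
    value = 0
    for part in text.split("."):
        # Shift the running value left by one byte and add the next number.
        value = value * 256 + int(part)
    return value
```

Under these assumptions, `is_dotted_quad("192.12.248.73")` is true, while `is_dotted_quad("192.12.248.256")` is false because the last number is not less than 256.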

While computers deal only with numbers, people prefer names. For this reason, each computer on the Internet also has a name bestowed upon it by its owner. There are several million machines on the Net, so it would be very difficult to come up with that many unique names, let alone keep track of them all. Recall, though, that the Internet is a network of networks. It is divided into groups known as domains, which are further divided into one or more subdomains. So, while you might choose a very common name for your computer, it becomes unique when you append, like surnames, all of the machine's domain names as a period-separated suffix, creating a fully qualified domain name.

This naming stuff is easier than it sounds. For example, the fully qualified domain name www.oreilly.com translates to a machine named "www" that's part of the domain known as "oreilly," which, in turn, is part of the commercial (com) branch of the Internet. Other branches of the Internet include educational (edu) institutions, nonprofit organizations (org), U.S. government (gov), and Internet service providers (net). Computers and networks outside the United States have a two-letter abbreviation at the end of their names: for example, "ca" for Canada, "jp" for Japan, and "uk" for the United Kingdom.
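
The way a fully qualified domain name breaks apart can be sketched in a few lines of Python. This is only an illustration: the `BRANCHES` table holds just the suffixes mentioned above, and the function name `describe_host` is hypothetical.

```python
# Only the branches named in the text; a real list would be much longer.
BRANCHES = {
    "com": "commercial",
    "edu": "educational institutions",
    "org": "nonprofit organizations",
    "gov": "U.S. government",
    "net": "Internet service providers",
    "ca": "Canada",
    "jp": "Japan",
    "uk": "United Kingdom",
}


def describe_host(fqdn: str) -> dict:
    """Split a fully qualified domain name into machine, domains, and branch."""
    # Names are case-insensitive; a trailing period, if present, is dropped.
    labels = fqdn.lower().rstrip(".").split(".")
    branch = labels[-1]
    return {
        "machine": labels[0],        # the leftmost label names the machine
        "domains": labels[1:-1],     # everything between machine and branch
        "branch": branch,            # the rightmost label is the branch
        "branch_meaning": BRANCHES.get(branch, "unknown"),
    }
```

For www.oreilly.com, this sketch reports the machine as "www", the domains as a one-item list containing "oreilly", and the branch as "com".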

Special computers, known as name servers, keep tables of machine names and their associated unique numerical IP addresses, and translate one into the other for us and for our machines. Domain names must be registered and sometimes paid for through the nonprofit organization InterNIC. Once registered, the owner of the domain name broadcasts it and its address to other domain name servers around the world. Each domain and subdomain has an associated name server, so ultimately every machine is known uniquely by both a name and an IP address.
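
A name server's table can be pictured as a pair of lookups, one in each direction. The toy class below is a sketch of that idea only; it is not how real name servers are implemented. The machine names are hypothetical, and pairing them with the sample addresses from earlier in this section is our own invention.

```python
class ToyNameServer:
    """An in-memory table that translates names to addresses and back."""

    def __init__(self):
        self._by_name = {}
        self._by_address = {}

    def register(self, name: str, address: str) -> None:
        # Store the pair under both keys so either one can be looked up.
        name = name.lower()
        self._by_name[name] = address
        self._by_address[address] = name

    def address_of(self, name: str):
        """Return the address registered for name, or None if unknown."""
        return self._by_name.get(name.lower())

    def name_of(self, address: str):
        """Return the name registered for address, or None if unknown."""
        return self._by_address.get(address)


# Hypothetical registrations, for illustration only.
server = ToyNameServer()
server.register("apple2.example.com", "192.12.248.73")
server.register("www.example.com", "131.58.97.254")
```

On a machine that is actually connected to a network, Python's standard `socket.gethostbyname` function asks the real name service the same kind of question.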

1.2.1 Clients, Servers, and Browsers

The Internet connects two kinds of computers: servers, which serve up documents; and clients, which retrieve and display documents for us humans. Things that happen on the server machine are said to be on the server side, while activities on the client machine occur on the client side.

To access and display HTML documents, we run programs called browsers on our client computers. These browser clients talk to special web servers over the Internet to access and retrieve electronic documents.

Several web browsers are available (most are free), each offering a different set of features. For example, browsers like Lynx run on character-based clients and display documents only as text. Others run on clients with graphical displays and render documents using proportional fonts and color graphics on a 1024 × 768, 24-bit-per-pixel display. Still others (Netscape Navigator, Microsoft's Internet Explorer, NCSA Mosaic, Netcom's WebCruiser, and InterCon's NetShark, to name a few) have special features that allow you to retrieve and display a variety of electronic documents over the Internet, including audio and video multimedia.

1.2.2 The Flow of Information

All web activity begins on the client side, when a user starts his or her browser. The browser begins by loading a home page HTML document, either from local storage or from a server over some network, such as the Internet, a corporate intranet, or a town extranet. In the latter case, the client browser first consults a domain name system (DNS) server to translate the home page document server's name, such as www.oreilly.com, into an IP address, before sending a request to that server over the network. This request (and the server's reply) is formatted according to the dictates of the HyperText Transfer Protocol (HTTP) standard.
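
To give a feel for what "formatted according to HTTP" means, the sketch below builds the text of a simple document request and picks apart the first line of a reply. It is a simplified illustration, not a complete HTTP implementation; the function names `build_get_request` and `parse_status_line` are our own, and the choice of HTTP/1.0 with a single Host header is an assumption made to keep the example short.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Format a minimal request for the document at path on host."""
    lines = [
        f"GET {path} HTTP/1.0",  # request line: method, document, protocol version
        f"Host: {host}",         # the server name the browser looked up
        "",                      # an empty line ends the headers
        "",
    ]
    # HTTP separates lines with a carriage return followed by a line feed.
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: str) -> tuple:
    """Split the first line of a reply into (version, status code, reason)."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason
```

A browser would send these bytes to the IP address it obtained from the name server, then read the reply, whose first line looks something like `HTTP/1.0 200 OK`.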

A server spends most of its time listening to the network, waiting for document requests with the server's unique address stamped on them. Upon receipt of a request, the server verifies that the requesting browser is allowed to retrieve documents from the server and, if so, checks for the requested document. If found, the server sends (downloads) the document to the browser. The server usually logs the request: the client computer's name, the document requested, and the time.
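
The server's side of the exchange (verify, look up, send, log) can be sketched as a single function. Everything here is a stand-in chosen for illustration: the `DOCUMENTS` and `ALLOWED_CLIENTS` tables, the client names, and the use of numeric codes 200, 403, and 404 to report the outcome. A real web server does considerably more.

```python
from datetime import datetime, timezone

# Hypothetical document store and access list.
DOCUMENTS = {"/index.html": "<html><body>Welcome</body></html>"}
ALLOWED_CLIENTS = {"client.example.com"}

# Each entry records the time, client name, document, and outcome.
access_log = []


def handle_request(client: str, path: str, now=None):
    """Return (status, body) for one document request and log it."""
    now = now or datetime.now(timezone.utc)
    if client not in ALLOWED_CLIENTS:
        status, body = 403, ""               # client may not retrieve documents
    elif path not in DOCUMENTS:
        status, body = 404, ""               # allowed, but no such document
    else:
        status, body = 200, DOCUMENTS[path]  # found: send the document
    access_log.append((now.isoformat(), client, path, status))
    return status, body
```

Calling `handle_request("client.example.com", "/index.html")` returns the document with code 200 and appends one line to `access_log`; a request from an unlisted client is refused but still logged.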

Back on the browser, the document arrives. If it's a plain-vanilla ASCII text file, most browsers display it in a common, plain-vanilla way. Document directories, too, are treated like plain documents, although most graphical browsers will display folder icons, which the user can select with the mouse to download the contents of subdirectories.

Browsers also retrieve binary files from a server. Unless assisted by a helper program or specially enabled by plug-in software or applets, which display an image or video file or play an audio file, the browser usually stores downloaded binary files directly on a local disk for later attention by the user.

For the most part, however, the browser retrieves a special document that appears to be a plain text file, but contains both text and special markup codes called tags. The browser processes these HTML documents, formatting the text based upon the tags and downloading special accessory files, such as images.
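
As a rough picture of that processing step, the sketch below uses Python's standard `html.parser` module to walk through a document, noting each tag, the plain text between tags, and any image files the browser would go on to download. The class name `TagCollector` and the helper `scan` are our own; a real browser does far more than this.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the tags, text, and image files found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = []    # tag names, in the order they open
        self.text = []    # runs of text between tags
        self.images = []  # accessory files named by <img src="...">

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_data(self, data):
        # Ignore runs that are only whitespace.
        if data.strip():
            self.text.append(data.strip())


def scan(document: str) -> TagCollector:
    """Feed a whole document through the collector and return it."""
    collector = TagCollector()
    collector.feed(document)
    collector.close()
    return collector
```

Scanning `<h1>Hello</h1><p>See <img src="logo.gif"> here</p>` yields the tags h1, p, and img, and a single accessory file, logo.gif.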

The user reads the document, selects a hyperlink to another document, and the entire process starts over.

1.2.3 Beneath the World Wide Web

We should point out again that browsers and HTTP servers need not be part of the Internet's World Wide Web to function. In fact, you never need to be connected to the Internet, an intranet or extranet, or to any network, for that matter, to write HTML documents and operate a browser. You can load up and display on your client browser locally stored HTML documents and accessory files directly. This isolation is good: it gives you the opportunity to finish, in the editorial sense of the word, a document collection for later distribution. Diligent HTML authors work locally to write and proof their documents before releasing them for general distribution, thereby sparing readers the agonies of broken image files and bogus hyperlinks.[2]

[2] Vigorous testing of the HTML documents once they are made available on the Web is, of course, also highly recommended and necessary to rid them of various linking bugs.

Organizations, too, can be connected to the Internet and the World Wide Web, but also maintain private webs and HTML document collections for distribution to clients on their local network, or intranet. In fact, private webs are fast becoming the technology of choice for the paperless offices we've heard so much about these last few years. With HTML document collections, businesses and other enterprises can maintain personnel databases, complete with employee photographs and online handbooks, collections of blueprints, parts, and assembly manuals, and so on - all readily and easily accessed electronically by authorized users and displayed on a local computer.
