CIS 3 LM The Internet
Sources Used:
Kaare Christian, The UNIX Operating System, (NY: John Wiley & Sons, 1983)
Douglas E. Comer, The Internet Book, (NJ: Prentice-Hall, 1997)
John L. Hennessy & David A. Patterson, Computer Organization and Design, (San Francisco: Morgan Kaufmann, 1998)
Wendy G. Lehnert, Internet 101, (Reading: AWL, 1998)
This article aims to define the Internet and introduce you to its history and the advances in Computer Science that led to its existence. The headings of this article are very much a brief summary of what the Internet is. This means that just by reading the headings in the Summary Index you can sum up the basic characteristics of the Internet as a networking system.
At the end of this article, I have appended a list of definitions of the technical terms used. You can refer to it as you are reading the material by using the hyperlinks provided.
Before we start discussing the unique history of the Internet, you should note three major trends that were occurring in the late 1960s that bear a direct influence on advances in computer networking:
To help you appreciate this fact, consider the first successful commercial computer, UNIVAC I (Universal Automatic Computer). This computer, built by Remington-Rand in 1951, sold for about $1 million (adjusted for 1996, the price would be close to $5 million). This computer had 4 Kilobytes of memory and about kHz in CPU speed.
The next exciting thing in commercial computers was the IBM System/360 line of computers, which was introduced to the market in 1964. The cheapest of these, model 40, with a CPU speed of 1.6 MHz and 32KB-256KB of memory, cost $225,000. The most expensive, model 75, with 5.1 MHz and 256KB-1MB, sold for $1,900,000.
In 1965, Digital Equipment Corporation (since acquired by Compaq) released the first commercial minicomputer, the PDP-8, for just under $20,000 (some 13,135 times better in price-performance).
This sharp decrease in price and increase in performance enabled corporations to buy more than just one computer. As computers got cheaper, businesses bought more of them. As you may well know, a computer without peripherals is an isolated (almost useless) machine. It can perform complex jobs, but the real benefit comes with the ability to share its results with all those who need them. Subsequently, the need to allow these computers to communicate gave rise to research into means of interconnecting them. Many public and private organizations (such as American Telephone and Telegraph (AT&T), International Business Machines (IBM) and the US Department of Defense (DOD)) devoted funds and research to developing networking technologies. One of the agencies funded by DOD ultimately developed the software and hardware protocols that made the Internet possible.
Most computer Operating Systems can be called single-user Operating Systems. In the late 1960s, AT&T's Bell Labs became involved in an Operating System called Multics. Multics was a multi-user interactive system that ran on a GE mainframe computer. Bell Labs withdrew from this project in 1969.
At about that same time, a software developer named Ken Thompson began dabbling with a reject PDP-7 mini from DEC. He sought to coordinate the efforts of programmers in a programming research environment and to provide a document preparation tool for the Bell Labs patent organization. Thompson's initial work culminated in a PDP-7 assembler and several assembly utilities. In 1971, an early version of UNIX was delivered to the Bell Labs patent organization.
UNIX incorporates two disciplines that seem divergent: programming and document preparation. UNIX also underlined the importance of text management tools for many disciplines, including that of programming. By degrees, UNIX began to be used internally throughout the Bell System. Academic institutions also began to get interested.
In 1973, Dennis M. Ritchie (the creator of the C programming language and a co-author of its standard expository book, The C Programming Language) rewrote UNIX in C. That is, for the first time the operating system was written in a high level language. This meant it could be ported to, and run on, nearly any brand of computer.
UNIX was widely distributed across the country's Computer Science departments, where scientists modified and added to it. One of the many releases of UNIX, called BSD, developed by the students and faculty of the University of California at Berkeley in the 1970s, helped disseminate the Internet by including in the Operating System the networking software protocols that form the foundation of Internetworking.
In the late 1960s, the US Department of Defense (DOD) started funding the Advanced Research Projects Agency (ARPA) with the intent to develop more efficient, reliable and robust networks.
ARPA consisted of a collective research effort that included researchers from both the computing and telecommunications industry, as well as academia.
ARPA directed its research towards solutions to the problem of interconnecting the many isolated networks that the military and other organizations used.
One of the important factors of ARPA's success was its approach (which I will summarize later). ARPA was geared not towards theoretical research but towards the practical implementation of ideas. It emphasized:
As a result, ARPA succeeded in developing networks that used satellite and radio transmission for communication and data transfer. The agency also gave birth to ARPANET, which was a nation-wide Wide Area Network that was used between 1969 and 1990 as the backbone that tied the researchers together, and was the platform of their tests and ideas.
Researchers used ARPANET:
Indeed ARPANET was the first backbone of what we now call the Internet.
ARPA built on the idea of an inter-network that would make communication between different networks possible.
To achieve this, ARPA researchers developed a collection of programs (a software suite) that, once installed on the different computers in the network(s), worked together to enable smooth transfer of data between different computer platforms and network technologies.
(The Internet, with an uppercase 'I', refers to that huge network which ARPA built. An internet (with a lowercase 'i') refers to a private internetwork that uses the same principles and fundamental programs that the Internet uses. It can also be called an Intranet.)
One of the results of the cooperation among researchers is that ARPA made all its findings and subsequent specifications public and available to all. This was contrary to corporate practice, which regarded innovations as secrets that must be closely guarded.
As work on the Internet project progressed, the scientists who used ARPANET for the exchange of technical information decided to keep all technical documents online (that is, accessible electronically over ARPANET.)
Reports issued by scientists were released in two steps. When a report was initially written, its author(s) would make it available to others over ARPANET, for comment. Such reports were called Requests For Comments (RFCs.) After other researchers sent the author(s) their comments and suggestions, the report was re-written and polished, and re-issued as an Internet Engineering Note (IEN.)
Later on, some RFCs were found to be sufficient, without need of further refinement, while some others had to be re-written completely and re-posted still as RFCs. Eventually, the IEN series was dropped, and the official records of the project were just called RFCs.
By the time the country and many others were getting connected to the ARPANET backbone, the research results, programs and standards had been made public to everybody.
This had a most beneficial effect on the rapid progress and universalization of the Internet and the software upon which it depended.
The Internet is said to grow by 20% per month. Here's some of the history.
During the 1970s, an Operating System called UNIX (written around 1969) was modified in a way that made it usable on just about any type or brand of computer. It also became freely distributed. At UC Berkeley, faculty and students were writing applications and modifications to it. In the late 1970s, they started distributing what became known as BSD UNIX on tape. ARPA saw in BSD a means of distributing TCP/IP. They signed a research contract with the researchers at Berkeley, which enabled the Berkeley folks to incorporate TCP/IP into their release of the OS. Students and researchers around the country were thus given the opportunity to use and study TCP/IP along with UNIX. They also contributed to it.
In the 1970s, the National Science Foundation had begun funding a project to build the Computer Science Network (CSNet). CSNet played a role in providing Internet connections to Computer Science departments through the mid-1980s.
By 1980, TCP/IP was written for different brands of computers. The Internet was a functional and promising network. A number of industrial and academic sites were testing and using TCP/IP regularly.
During the mid 1980s, some 2000 computers were connected to the Internet. By 1990, about 3 million computers were online. In 1996, over 9 million computers were connected to the Internet. By 1997, this number had almost doubled. Today, there are over 30 million computers, ranging from supercomputers to handheld computers and embedded systems, that have access to the Internet.
In 1982, it became apparent that this software was reliable and robust enough to be used in critical environments such as the military. DOD (which had funded the project from the beginning) started to use TCP/IP on its networks. The switch to TCP/IP was complete in a year, and both ARPA and DOD switched all their connections to TCP/IP in 1983.
In 1984, the Internet doubled in size for the second time. Furthermore, other government agencies such as the National Aeronautics and Space Administration (NASA) used TCP/IP on some of their networks.
In 1985, the NSF developed NSFNet, a Wide Area Network that connected its five supercomputing sites around the country. It also provided grants to groups who wished to build and operate a new high-speed WAN which would replace parts of ARPANET as well as NSFNet. It also offered grants to groups who worked towards interconnecting computers within regions, and connecting them to the larger WAN. These smaller networked regions were referred to as the NSF Regional Networks, or NSF Mid-Level Networks. NSF also allocated funds to help universities pay for long distance fees in order to connect their LANs to the Internet.
Between 1986 and 1990, the Internet ballooned to be the world's largest computer network.
In 1987, NSF accepted a joint proposal from IBM, MCI and MERIT to build a WAN backbone, later called the NSFNet backbone.
In 1988, NSFNet became the Internet backbone. As traffic increased, NSF approved tripling the capacity of the backbone.
By 1991, as NSFNet was nearing its capacity and expansions were beyond the Federal budget, responsibility for maintaining the Internet was partially handed over to IBM, MCI, and MERIT, which formed the nonprofit Advanced Network and Services (ANS).
In 1991, several European countries, which had experience with earlier non-TCP/IP networking efforts such as BITNET, X.25, and EARN (funded by IBM), began cooperating to install a European Top Level Backbone. It came to be known as EBONE. As with the US backbones, each region in the European backbone has a second level network that interconnects the local sites (inside a single country) and connects them to the backbone. At a more localized level (level 3), those individual sites can have Local or Metropolitan Area Networks that connect the different ISPs and LANs.
In 1992, ANS built a new WAN backbone, ANSNET, that had 30 times the capacity of the NSFNet it replaced.
In 1995, MCI replaced ANSNET with a new very high-speed backbone, vBNS (very high speed Backbone Network Service). This informal transfer of ownership to a private company was a turning point at which the Internet became commercial.
Formally, vBNS is sponsored by the NSF and implemented by MCI.
In 1998, vBNS was supporting streaming data traffic (voice, video, mission critical data) at 2.5 Gigabits per second.
There once was what Craig Hunt, author of TCP/IP Network Administration, called "the protocol wars." (See the definition of protocol below.)
There are many different kinds of networks, and many different protocols or programs designed to interconnect them, such as IPX, UUCP, etc.
As he put it, TCP/IP won that war. Today it is the software that is universally used in the administration of many kinds of individual networks, as well as the Internet, which is also called a TCP/IP network.
TCP/IP refers to a set of communication protocols that regulate and standardize communication between networks. They are the set of rules that define a common language between networks. TCP and IP, developed by ARPA researchers in the seventies and completed around 1980, are the most important and innovative of these protocols. They are the bedrock of the Internet and all other private internets.
The entire set of Internet communication protocols is informally named after these two protocols: TCP/IP.
One great thing about these protocols is that their writers decided to make them public. That is, to be distributed and shared for free. They published detailed specifications on how they should be used or modified.
ARPA donated the TCP/IP software suite to academic institutions and encouraged them to use and develop it.
At the University of California at Berkeley, students and faculty who were developing their homegrown distribution of the UNIX Operating System were offered a research contract with ARPA. They incorporated TCP/IP software into the Berkeley Software Distribution (BSD) UNIX. Both UNIX and TCP/IP were written in such a way as to make them operational on different brands of computers. As a result, most Computer Science departments were able to obtain free copies of the Operating System as well as the networking suite. They used both to run their Local Area Networks.
Open Source Software and Open Systems (such as the Internet,) are important for computing because organizations buy a multitude of different computers. Only an Open Network System like the Internet, or other networks modeled after it, enables those diverse computer platforms to communicate.
Thus, all the specifications needed to build, install and use this software were made available to everyone by ARPA. This was also part of their effort to make their results universally accepted and used.
Here is where you can take a look at these specifications as defined in the RFCs that I had mentioned earlier:
http://www.cis.ohio-state.edu/hypertext/information/rfc.html
This is a page where you can find an index for all the Requests For Comments.
http://www.cis.ohio-state.edu/htbin/rfc/rfc1025.html
This is an RFC written by Dr. Jonathan Postel about stages in the development of TCP/IP.
The Internet Protocol (IP, for short) provides the basic tools to convert data to a format recognizable by different networks. It defines the size of the data packets that are sent over the Internet and prepares them for sending. All computers on the networks must conform to the format that IP specifies. They all must have a copy of the IP software.
1. It specifies how the data to be transmitted over the network is converted into packets (called IP datagrams.)
2. It regulates how routers choose the paths in the network that will lead the datagram towards its destination.
3. It defines an addressing scheme where each computer is assigned a unique IP address.
Just out of curiosity, a typical formal definition of IP would look like this:
The Internet protocol that defines the unit of information passed between systems providing a basis packet delivery service within the transmission control protocol/Internet protocol (TCP/IP). IP is used in gateways to link networks at an open systems interconnection (OSI) network Level 3 and above. IP is a standard that describes how packets of data are transported across the Internet and recognized as an incoming message. (© 1999 MCI WorldCom)
TCP provides the mechanisms that guarantee that a data packet sent through one or more networks reaches its destination.
1. TCP establishes the communication session between the sending and the receiving ends of the network.
2. It ensures that data reaches its destination. It checks to see if data is lost. If the data is lost, TCP resends it.
TCP does so by setting up a timer. If the receiving destination does not send back an acknowledgment of receipt before the time-out value of the timer is up, TCP assumes the data is lost and resends it. The value of the timer is set automatically according to network conditions.
3. Before data is sent across the Internet, it is chopped up into little pieces called packets. This is done so that large chunks of data do not slow down the networks, and to allow many messages to share the same transmission lines. Because each data packet travels across the Internet independently, the packets do not necessarily reach their destination in the order in which they were sent. That's half the reason why the Internet is called a Packet Switched Network. So another nifty thing that TCP does is to reassemble the data packets at the destination site in the same order in which they were sent out.
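The timer behaviour described in item 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real TCP implementation is written: `send_reliably`, `lossy_channel` and `update_timeout` are hypothetical names, the network is replaced by a function that simply reports whether an acknowledgment arrived in time, and the smoothing formula with its default constants follows the example given in the original TCP specification (RFC 793).

```python
def update_timeout(srtt, measured_rtt, alpha=0.875, beta=2.0,
                   lower=1.0, upper=60.0):
    """Adapt the time-out value to network conditions.

    srtt is the smoothed round-trip time so far and measured_rtt is the
    newest measurement, both in seconds. Returns (new_srtt, new_timeout).
    """
    srtt = alpha * srtt + (1 - alpha) * measured_rtt
    timeout = min(upper, max(lower, beta * srtt))
    return srtt, timeout


def send_reliably(segment, channel, max_tries=5):
    """Resend segment until the channel reports an acknowledgment.

    channel(segment, attempt) is a hypothetical stand-in for the network:
    it returns True if an acknowledgment came back before the timer ran out.
    Returns the number of transmissions that were needed.
    """
    for attempt in range(1, max_tries + 1):
        if channel(segment, attempt):
            return attempt
        # No acknowledgment before the time-out: assume the data was lost
        # and go around the loop to send it again.
    raise TimeoutError(f"no acknowledgment after {max_tries} attempts")


def lossy_channel(segment, attempt):
    # Pretend the first two transmissions are lost and the third arrives.
    return attempt >= 3


attempts_needed = send_reliably(b"hello", lossy_channel)
new_srtt, new_timeout = update_timeout(srtt=1.0, measured_rtt=3.0)
```

In this run the data gets through on the third transmission, and one slow round trip of 3 seconds raises the time-out from its starting point, which is the sense in which the timer follows network conditions.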
All data that is transmitted over the Internet (and any other packet switched network) has to be broken up into units called packets. This enables all the users of the network (and, in the case of the Internet, all the users around the globe) to share the transmission lines that interconnect them without having to suffer delays caused by large data files.
When a large file is broken up into packets, each packet is transmitted independently over the Net. Packets that belong to the same file may take different routes to reach their destination. Not all the packets arrive at the same time, although to our senses they arrive almost instantaneously. TCP software ensures that all the packets of one file reach their destination and are reassembled in the correct order.
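As a rough illustration of breaking data into packets and putting it back together, here is a small Python sketch. It is a simplification: the function names `packetize` and `reassemble` are made up for this example, each packet simply carries a sequence number, and a random shuffle stands in for packets arriving out of order. (Real TCP numbers the bytes rather than the packets.)

```python
import random


def packetize(data: bytes, size: int):
    """Split data into (sequence_number, chunk) packets of at most size bytes."""
    return [(seq, data[start:start + size])
            for seq, start in enumerate(range(0, len(data), size))]


def reassemble(packets):
    """Put the chunks back in sending order, whatever order they arrived in."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"The Internet is a packet switched network."
packets = packetize(message, 8)

# Simulate independent travel: the packets arrive in some other order.
arrived = packets[:]
random.Random(0).shuffle(arrived)

restored = reassemble(arrived)
```

Because every packet carries its own sequence number, the receiving side does not need the packets to arrive in order to rebuild the original message.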
The Internet Protocol ensures that each data packet contains a label called an IP header which has information (readable by any computer) on the source and destination of the packet. That is, the IP addresses of these computers.
The reason these headers or labels are recognizable by all computers is that, by now, each computer has a copy of the IP software.
A packet that conforms to the IP specification is called an IP datagram.
Not all packets are the same size.
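To make the idea of an IP header more concrete, the following Python sketch builds a datagram with a 20-byte header and reads the two addresses back out of it. The field layout is that of an IPv4 header without options, which this article does not spell out, so treat it as background rather than as part of the text above. The function names and the two example addresses are made up for illustration, and the checksum is left at zero to keep the sketch short.

```python
import ipaddress
import struct

# version/length, type of service, total length, identification,
# flags/fragment offset, time to live, protocol, checksum, source, destination
HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes in network byte order
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def build_datagram(src, dst, payload, ttl=64, protocol=6):
    """Prefix payload with a simplified IPv4 header (checksum not computed)."""
    header = struct.pack(
        HEADER_FORMAT,
        (4 << 4) | (HEADER_SIZE // 4),  # version 4, header length in 32-bit words
        0,                              # type of service
        HEADER_SIZE + len(payload),     # total length of the datagram
        0, 0,                           # identification, flags/fragment offset
        ttl, protocol,                  # protocol 6 means the payload is TCP
        0,                              # checksum, left at 0 in this sketch
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )
    return header + payload


def read_addresses(datagram):
    """Return the (source, destination) addresses stored in the header."""
    fields = struct.unpack(HEADER_FORMAT, datagram[:HEADER_SIZE])
    return str(ipaddress.IPv4Address(fields[8])), str(ipaddress.IPv4Address(fields[9]))


datagram = build_datagram("192.0.2.1", "198.51.100.7", b"hello")
source, destination = read_addresses(datagram)
```

Any computer that knows this layout can find the source and destination in the same place in every datagram, which is what makes the label readable everywhere.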
Sending data in packets means that all users can share the transmission lines by taking turns. For example, along one transmission line, a packet from computer A to computer B passes, followed by another packet from computer C to computer E, followed by a packet from computer D to G, and so on.
Each sender or receiver receives a fair share of the network routes.
The sharing of transmission lines is handled by the hardware. That is, the machine waits for its turn to send the datagram.
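The turn-taking described above can be pictured with a short Python sketch. It assumes a simple round-robin rule (each sender with something waiting puts one packet on the line per turn). Real network hardware uses a variety of access methods, so this is only one possible illustration, and the name `share_line` is made up for it.

```python
from collections import deque


def share_line(queues):
    """Interleave packets from several senders, one packet per turn.

    queues maps a sender name to the list of packets it wants to send.
    Returns the order in which packets travel along the shared line.
    """
    pending = deque((sender, deque(packets))
                    for sender, packets in queues.items() if packets)
    line = []
    while pending:
        sender, packets = pending.popleft()
        line.append((sender, packets.popleft()))
        if packets:
            # This sender still has data: it goes to the back of the line
            # and waits for its next turn.
            pending.append((sender, packets))
    return line


order = share_line({
    "A": ["A1", "A2", "A3"],  # a large message, split into three packets
    "C": ["C1"],              # a short message
    "D": ["D1", "D2"],
})
sent = [packet for _, packet in order]
```

Here the packets go out in the order A1, C1, D1, A2, D2, A3, so the short message from C is not held up behind the longer one from A.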
The process of passing packets along different routes is called routing. It is usually handled by dedicated computers called routers.
For a network to be reliable, it needs to maintain its functionality independent of any of the computers that are connected to it. Indeed, one of the goals that DOD wanted to achieve was a network that would sustain communication between remote sites in the case of nuclear attack.
Another fact to note is that in a large network, there is more than one path from one point to another. This means that a data packet can reach its destination following one of many paths. The decision on which path to follow is made by routing software located at the routers.
For the two above reasons, networks make use of something called dynamic routing. This means that data being transferred do not always take the same route, depending on which routes are the closest to the destination, and which routes are the least congested. Those decisions are carried out by the routers at transmission time, and are not planned or programmed beforehand.
There is no central router that controls all the routing in the network. This would defeat the purpose of a network that can withstand failure in some of its points.
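The following Python sketch illustrates the idea of dynamic routing on a tiny made-up network of four routers (A, B, C and D). It is an illustration under assumptions, not a description of any actual routing protocol: the link costs are invented numbers standing for distance or congestion, and the path is chosen with Dijkstra's shortest-path algorithm, which is one common way to make such a choice. In keeping with the point above that there is no central router, think of this as the calculation a single router could run on its own current picture of the network.

```python
import heapq


def best_path(links, source, destination):
    """Cheapest path by Dijkstra's algorithm over {node: {neighbor: cost}}.

    Returns (total_cost, [nodes on the path]) or None if there is no path.
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in links.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical network: lower cost means a closer or less congested link.
links = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "D": 1},
    "C": {"A": 4, "D": 1},
    "D": {"B": 1, "C": 1},
}

quiet = best_path(links, "A", "D")      # goes through B

# The link between B and D becomes congested, so its cost goes up.
links["B"]["D"] = links["D"]["B"] = 10
congested = best_path(links, "A", "D")  # now goes through C
```

The same pair of endpoints gets a different route as soon as conditions change, and nothing had to be planned beforehand.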
1. The Internet implemented revolutionary ideas such as the concept of connecting networks together using routers.
2. The TCP/IP protocols were developed, thoroughly tested and fine-tuned over many years, before they were adopted as a standard that was universally implemented.
3. The Internet technology (such as the concepts, the software standards, etc.) was made freely available, and was developed as a collaborative, synergistic effort.
4. Researchers were allowed and encouraged to experiment and apply their ideas to real computers and real networks.
5. The Internet research was very well documented through the use of RFCs, which enabled researchers to review and build upon previous work.