Web Information

Basic Information about WorldWideWeb Services on the Internet

This page is meant to clarify the most basic terms and concepts related to using websites and the WWW services on the Internet. It will include links to other sources of information as well. Please make suggestions toward improving it.

What is the Internet?

A worldwide network of computer networks, carried over high-speed telephone and cable lines, that routes information requests and information replies from one computer to another almost anyplace in the world.

What is the WorldWideWeb (WWW)?

The "Web" is often used as a synonym for the internet as a whole, but technically it means WorldWideWeb services over the internet. These services rely on a specialized protocol (HTTP, HyperText Transfer Protocol) for encoding and routing both requests for information and delivery of information. That information may combine text and graphic images, integrated into files that appear on the computer screen as something like a full-color magazine page layout. Extensions of the original system now allow audio and video to be transmitted as well as computer files of all types.

What is a Website?

A computer which responds to HTTP requests for information is called an HTTP Server or Web Server. It stores information in files and organizes these files into directories or folders in its memory system. A website normally consists of all the files in a particular directory and its sub-directories. A sub-directory can also hold a separate or subsidiary website. Ultimately, a website is just a set of interlinked files stored on a Web Server computer.

How does the WWW work?

One computer communicates with another via the internet using the HTTP procedures. The Server computer holds files that someone has put there. Another computer, called the Client computer, uses a program called a Browser (e.g. Netscape Navigator, Internet Explorer, Mosaic, Lynx) to send a request for a particular file to the Server computer. The Server sends back the file, and the Browser displays it on the screen.
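The request and reply described above are short pieces of text. The following Python sketch shows roughly what a Browser sends and how it might read the first line of the Server's reply. The function names are hypothetical, and the sketch assumes the simple HTTP/1.0 message layout with a Host line added; real Browsers send many more header lines.

```python
def build_get_request(host, path):
    """Return the text of a minimal GET request for one file.

    Assumption: HTTP/1.0 layout, with a Host line naming the Server.
    Each line ends with CR LF, and an empty line ends the request.
    """
    lines = [
        "GET " + path + " HTTP/1.0",
        "Host: " + host,
        "",
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response_text):
    """Split the first line of a Server reply into version, code, reason."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    request = build_get_request("academic.brooklyn.cuny.edu",
                                "/education/index.htm")
    print(request)
    # A made-up reply, only to show the shape of what comes back.
    reply = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
    print(parse_status_line(reply))
```

Nothing here touches the network; the sketch only builds and reads the text that would travel between the two computers.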

Every website file has an address, or URL (Uniform Resource Locator). It consists of the address of its Server computer, followed by its local address (or an alias) in the memory of that computer:
http://academic.brooklyn.cuny.edu/education/index.htm
Protocol -- Host Server computer address -- Website Directory -- Filename
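The four parts of the example URL can be pulled apart with Python's standard urllib.parse module. This is a minimal sketch; the function name split_url and the dictionary keys are chosen here for illustration and simply mirror the labels above.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into protocol, host, directory, and filename."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    # Make sure the directory part always ends with a single slash.
    if not directory.endswith("/"):
        directory += "/"
    return {
        "protocol": parts.scheme,
        "host": parts.netloc,
        "directory": directory,
        "filename": filename,
    }


if __name__ == "__main__":
    url = "http://academic.brooklyn.cuny.edu/education/index.htm"
    for label, value in split_url(url).items():
        print(label, "=", value)
```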

Most website files are HTML files and usually end in the extension ".html" (common on UNIX computers) or ".htm" (common on Windows PCs). Web servers can also deliver other kinds of files as accessories to the main HTML files, including graphic images (*.jpeg or *.jpg, *.gif, etc.), audio (*.au, *.ra, *.wav, etc.), video, and other kinds of files. These other files are integrated into the webpage display by the Browser. Normally one webpage consists of one HTML file plus its auxiliary files.

Many websites also have a default HTML file, one that is displayed if no specific file is requested. This is often called the Homepage file and is usually the starting point for finding information on a website. The URL then consists of just the host Server computer address plus the directory address of the whole website:
http://academic.brooklyn.cuny.edu/education/
Protocol -- Host Server computer address -- Website main directory
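One way a Server could pick the default file is sketched below in Python. The name resolve_request_path is hypothetical, and the default filename "index.htm" is an assumption based on the example URLs in this section; each real Server is configured with its own default name.

```python
DEFAULT_FILE = "index.htm"  # assumption: actual Servers configure this name


def resolve_request_path(path, default_file=DEFAULT_FILE):
    """Return the file a Server would look for when given a request path.

    A path ending in "/" names a directory, so the default (Homepage)
    file in that directory is used. Any other path is left unchanged.
    """
    if path.endswith("/"):
        return path + default_file
    return path


if __name__ == "__main__":
    print(resolve_request_path("/education/"))
    print(resolve_request_path("/education/index.htm"))
```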

What are LINKS? What is Hypertext?

The most amazing and useful feature of HTML and HTTP is that they support "hypertext". The principle of hypertext is that one text or webpage can cite, refer to, or point to, another webpage, either on the same website or on another website somewhere else in the world. Thus it is possible for webpages to LINK to other webpages on the same computer or on different computers, anywhere. This is the real power of the Web.

A Hypertext LINK is a hidden URL embedded in a webpage. What shows on the screen is a colored and underlined (normally) stretch of text, or a graphic image where the shape of the computer's cursor changes (usually to a pointing hand). If you place the cursor on one of these "hotspots" on the webpage display, and click the mouse, the Browser program automatically sends a request to the invisible URL attached to the text or image you clicked on, and the responding Server computer sends a new webpage which replaces the existing page on your screen. Browsers have a "Back" function (button) that then lets you return to the previously displayed page.

A link can point to another place on the same webpage, or to a different webpage on the same website, or to a webpage on a distant computer around the world.
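The three kinds of link can be seen by collecting the hidden URLs from a small piece of HTML. The Python sketch below uses the standard html.parser and urllib.parse modules. The class name LinkCollector, the sample page, and the file "faculty.htm" are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the URL behind every <a href="..."> link on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Turn a short (relative) address into a full URL,
                    # the way a Browser does before sending a request.
                    self.links.append(urljoin(self.base_url, value))


def collect_links(html_text, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links


if __name__ == "__main__":
    page_url = "http://academic.brooklyn.cuny.edu/education/index.htm"
    sample = (
        '<a href="#top">Same page</a> '
        '<a href="faculty.htm">Same website</a> '
        '<a href="http://www.example.org/">Another computer</a>'
    )
    for link in collect_links(sample, page_url):
        print(link)
```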

How do I find information on the Web?

There is no easy answer to this question. In principle it is possible to use HTTP to get to any file on any computer attached to the internet. In practice only those files that are indexed by a Web Server program on the computer are "visible" and accessible. There are, however, hundreds of millions of such files. They cover information on every subject imaginable, and much of the information is out-of-date, inaccurate, or unreliable. A lot of it is also extremely valuable. Most of it today is free, but increasingly you must register or pay a fee for access to proprietary or "value-added" information.

There are three basic ways to find information on the Web:

You can learn to use a Search Engine. Search Engines are, mostly, free services offered by companies that are seeking to build a reputation (and sell a little advertising space) on the Web. A Search Engine looks just like an ordinary webpage. It has an ordinary URL. You enter a set of keywords in a Form space on the webpage, wait less than a minute normally, and are sent a new webpage consisting of the names and URL addresses (clickable) of from zero to hundreds of possibly relevant webpages and websites. The most commonly used Search Engine sites are: Altavista, Yahoo, and Hotbot.

You can "surf", which means explore the Web freely by going from one webpage to another just by clicking on the links on a page. Serendipity will often lead you to something interesting and useful, but the results are highly unpredictable.

You can find and use a "launch site", which is the new term for webpages and websites that specialize in providing links to information on a particular topic. These are also known as Links Pages, or Hotlinks pages. They are of great value, and finding the right ones for your own specialized interests is the best first step in learning to use the Web. Many academic libraries are now developing links pages that will identify high-quality specialized sites in various subject areas. So are many government agencies, university departments, individual faculty members, companies, etc. You can use a Search Engine to find possible "launch sites" for various topics, and then use the Bookmark (or Favorites, or Hotlist) function of your Browser to store the URLs of good sites in a special file on your own computer. You can then return to these sites with a single click, or even put them up on your own webpage to make them available to students, colleagues, etc. The Yahoo search engine site also has an extensive list of topic-based menus that is a bit like a super launch site, but it offers no evaluation of quality.

For research purposes, you can also use the Web to consult indexes and abstracts of the professional (print) literature, for example through the ERIC database.

What are Interactive Web Pages?

All webpages with LINKS are to some extent "interactive" in that when you click somewhere on a page, something happens: the Web/Browser responds to you.

There are more advanced forms of interactivity, as well. Many pages have Buttons that you click on to start an action (e.g. a Search request), and Forms that transmit information you type onto the page to the Server for processing (e.g. passwords, information requests, even email messages). This capability can be used to run a discussion group via a webpage. It can also support a "chat room", which is a discussion group in real time (vs. the normal method which is by email to a bulletin board which others read and respond to over the next day or so).
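When you fill in a Form and click its Button, the Browser packs what you typed into a single line of text before sending it to the Server. The Python sketch below shows that packing and unpacking with the standard urllib.parse module. The function names and the field names "keywords" and "max" are invented for this example; every Form defines its own fields.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(fields):
    """Pack form fields into the text a Browser sends to the Server."""
    return urlencode(fields)


def decode_form(body):
    """Unpack that text on the Server side, keeping one value per field."""
    return {name: values[0] for name, values in parse_qs(body).items()}


if __name__ == "__main__":
    typed = {"keywords": "science education", "max": "10"}
    body = encode_form(typed)
    print(body)
    print(decode_form(body))
```

Spaces and punctuation are replaced by safe codes on the way out and restored on the way in, so the Server sees exactly what was typed.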

Finally there are webpages that have computer programs built into them. These can run animations, play videos, and interact with you in any way that any computer program could do (play chess with you, for example). These programs run either on the Server machine or on your own computer. Java programs, for example, run on the client or receiving computer. You can stop them by setting your Browser to not allow them to run. Java programs are generally safe, but other kinds of programs, such as ActiveX programs, can in principle act like viruses and harm your system, or steal information from your computer. Most Browsers ask you whether you want to have information sent to the other computer and whether you will permit a program to run. Some Browsers also allow encrypted information (e.g. passwords, credit card numbers) to be sent between computers over the Web by a modification of the HTTP protocol called HTTPS (secure HTTP). These transmissions may or may not actually be "secure". Be cautious or even skeptical, but not paranoid, when interacting on the Web.

How does the Web relate to Email, Telnet, Chat, etc.?

The Web uses one kind of transmission protocol over the internet. Many others are possible. Email uses its own separate protocols (SMTP for sending, POP for retrieving). Telnet has its own, and so does FTP. Telnet is used to connect to a remote computer and use it as if it were on your desktop. You need permission from the remote computer of course (usually an account name and password, but some accept "guests"). FTP is used to transfer files of any kind between computers (again with permission, but Anonymous FTP is often used to download publicly available files; use your email address as the password). Chat is also a protocol (IRC), used to allow several people to type messages to one another as if they were in the same room together. Related but more complex are MUDs and MOOs, in which imaginary rooms and objects as well as communicating people are connected to one another.

In principle all these other protocols can be integrated into the interactive functions of websites and webpages, sometimes by embedding another program inside a webpage (as with Java, above). Many Browsers today are not just HTTP client programs; they also include the ability to run the other protocols, and to the user it looks like one integrated system. This is why today "the Web" is becoming synonymous with "the internet". But HTTP has many limitations and researchers are already developing successor protocols that will probably become common in another five to ten years.

A "protocol" basically tells computers what kind of file is being sent and where it is supposed to be routed to. It lets them know how to distinguish the destination address, return address, message, attachments, auxiliary files, etc. and what to do with them. It can include, for example, a signal that a transmission is to be given priority over other transmissions along the same route, or that the information to follow is encrypted.

There will be successors also to HTML, which is already quickly evolving into new versions with new features. SGML (Standard Generalized Markup Language) is the more powerful "grandmother" of all HTML, and might eventually replace it. There are plans for a new Internet2, which will have faster communications and more intelligent routing stations along the way.

What are Plug-Ins and Browser Helpers?

What does a Browser program do if the file it receives from the Server computer is NOT an HTML file? It consults a list of file extensions (the last 3-4 letters of the filename) to see if it knows how to display the file. If the file is an image file, it will usually just put it up on the screen. This can look very odd if it was not what the file was meant for. But if the file is an audio file (sounds, voice) or video file (animation, movie) or even many kinds of text-display file (an email file, a formatted print page), it looks to see if the file extension (file type) is associated in its list with another program, called a plug-in or helper program, which is better able to display or play that kind of file.

Plug-ins are designed to run as if they were part of the browser, in the same window, and these just show up on the webpage (e.g. a video window in the middle of an ordinary webpage) or play sounds as you view the page. Helpers are full programs on their own, which open a new window and display the file or do what they do. As the Web evolves, more and more helpers are becoming plug-ins. New Browsers tend to come with many plug-ins already installed, or as options when you install the Browser.
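The lookup a Browser performs can be pictured as a small table keyed by file extension. The Python sketch below is only an illustration: the table contents, the handler descriptions, and the name choose_handler are assumptions, and a real Browser keeps its own, much longer list.

```python
import posixpath

# Hypothetical association table: file extension -> what displays the file.
HANDLERS = {
    ".htm": "browser itself",
    ".html": "browser itself",
    ".gif": "browser itself",
    ".jpg": "browser itself",
    ".jpeg": "browser itself",
    ".ra": "plug-in (audio player)",
    ".ram": "plug-in (audio player)",
    ".pdf": "helper (document viewer)",
}


def choose_handler(filename):
    """Pick a handler from the extension, ignoring upper/lower case."""
    extension = posixpath.splitext(filename)[1].lower()
    return HANDLERS.get(extension, "unknown type: ask the user what to do")


if __name__ == "__main__":
    for name in ["index.htm", "logo.GIF", "lecture.ra", "paper.pdf", "data.xyz"]:
        print(name, "->", choose_handler(name))
```

Whether a given file type goes to a plug-in or to a helper depends on what is installed on your computer, which is why the same webpage can behave differently on two machines.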

The most important kinds of plug-ins you may not have, but could need, are:
Real Audio, for *.ra or *.ram files (sound effects)
Adobe Acrobat viewer, for *.pdf files (scanned text and images, very common on the Web today)
Video viewers (there are many of these, with competing formats)
VRML viewers (for 3-D effects, slow but very impressive)
Shockwave (for animations, comes in several versions)

If and when file formats are standardized on the Web, all these will be plug-ins and they will all come standard with your Browser. Meanwhile, most of them are at least free and can be downloaded from the company website. The company makes its money not from the viewer's decoding client software, but from selling people very expensive server encoding software to create these files in the first place.

Downloading Software and Files

Every time you display a webpage you are downloading an HTML file and probably a few auxiliary files as well. In principle you can save these files to your hard drive if you do so before you click on the next link. (In Windows 95 use a right-click of the mouse, or generally see the File menu of the Browser.) This includes images. Many webpages use Frames, which means different rectangles on the screen are actually displaying different HTML files. Always click in the area you want to save before saving to disk.

Most Browsers today include enough of an FTP protocol capability, in addition to HTTP, to let you download public files from the Web. This can include program files, or software. Many companies offer regular updates or upgrades, sometimes free, on their company websites for their commercial programs. Some offer free trials, or demos, or even stripped down versions of a program for free. In addition many programmers create useful programs that they offer for free on the internet. A few of these can damage your computer. Some just don't work. It is always safer to buy a program. In between freeware and commercial purchase is "shareware", which is free to get, but you are morally (and sometimes technologically) obligated to pay a small registration fee, usually much cheaper than comparable commercial products. Don't download and run programs from the Web without asking someone with experience in these matters. But if you know what you are doing and how to protect yourself, you can get some great bargains this way. Shareware is usually safe.

Your Browser may also allow you to download other kinds of files, but if you don't have the right program to run them or display them, they will be useless (and harmless). Remember: for basic purposes, every file is either a program or not-a-program. Most files are non-programs, and these all need a program to make them work or even be visible as anything other than nonsense symbols. The last few letters of the filename (the extension) are the key to which program works for that kind of file. Somewhere there must be a master list of all these "program associations" but I've never seen it. Every computer, and every Browser, of course, has a little list like this, and sometimes you can view this list (read your documentation). But that doesn't help when you encounter a new type of file and want to know what to do with it. (When in doubt, don't do anything with it. It could be a program or carry a virus. A virus is a mini-program with harmful effects.)

It is almost always safe to download files that end in *.txt, *.htm or *.html, *.jpg or *.jpeg, and *.gif from the Web, and very safe if your Browser also has ActiveX turned off (since some htm/html pages include potentially harmful ActiveX programs). Of course this means safe for your hardware ... YOU might not always like what you see!

Millions of people surf the web daily with no harm except to their preconceptions. Join in!