The Internet and HTTP

Ross Shaull, Brandeis University

In this lecture we take a whirlwind tour of the Internet and a fundamental building block of the web, HTTP.

The Internet as Fancy Picture

The following image was generated by Matt Britt for the Wikipedia article on the Internet. It depicts a partial map of the hosts reachable on the Internet, using data found at opte.org/maps.

Partial map of the Internet, generated by Matt Britt

Partial map of the Internet. Source: Wikipedia.

The Internet as Cartoon

A better way for us to start visualizing the Internet is as an opaque cloud to which hosts connect.

Internet as a cloud

The Internet and two hosts, a client and a server.

We will spend this lecture looking more closely at this picture...

Core Internet Idea(l)s

Switching: Connecting One Host to Another

Old-style telephone switchboard

Telephone Switchboard

Source: Posted to Wikipedia courtesy of Joseph A. Carr

Packet Switching

Ethernet

Addressing for Ethernet: IP

Internet Address Book: DNS

DNS stands for Domain Name System.

Domains and Names

TLDs

The US has a country-code TLD (.us) too, but it is rarely used; some state and local government sites use it.

Some countries license their TLDs to companies, as Tuvalu does (who knows what its TLD is?).

Some DNS Details

DNS hierarchy

Source: comptechdoc.org

TCP/IP

The Protocol Stack

Messages in the Protocol Stack

When you send a message, each layer in the protocol stack adds its own header (and possibly footer) information. Each lower layer treats the entire message plus the headers added above it as its payload, so lower layers do not have to know how higher layers work.

Example of HTTP -> TCP -> IP -> Ethernet:
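A minimal sketch of that nesting in Python. The bracketed headers are simplified placeholders, not the real binary TCP, IP, or Ethernet header layouts, and the host name and addresses are reserved documentation values chosen for illustration. The point is only that each layer wraps whatever it is handed without looking inside.

```python
# Simplified illustration of encapsulation. The bracketed strings are
# placeholders, not real TCP/IP/Ethernet header formats.

def wrap(header: str, payload: str, footer: str = "") -> str:
    """Treat payload as opaque and surround it with this layer's framing."""
    return header + payload + footer

# Application layer: the HTTP message itself.
http_message = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

# Each lower layer wraps the output of the layer above it.
tcp_segment = wrap("[TCP dst_port=80]", http_message)
ip_packet = wrap("[IP dst=192.0.2.1]", tcp_segment)
ethernet_frame = wrap("[ETH dst=aa:bb:cc:dd:ee:ff]", ip_packet, "[ETH checksum]")

print(ethernet_frame)
```

Reading the printed frame from the outside in shows the same order as the arrow chain above: Ethernet, then IP, then TCP, then the HTTP text.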

Ports

HTTP

Request and Response

Here is an abbreviated example of the HTTP exchange between your computer and the facebook.com web server:

  1. You type "facebook.com" into the location bar of your web browser
  2. Your web browser consults DNS to find the IP address for facebook.com
  3. Your web browser connects to facebook.com's IP address and sends an HTTP GET request for "/", the standard root for websites.
  4. The facebook.com web server processes the request and sends an HTTP response message containing the HTML of the facebook.com homepage
  5. Your browser parses the HTML in the response and displays the facebook.com homepage
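Steps 1 and 2 above can be sketched in Python. The function names parse_location and resolve are made up for this illustration, and the sketch assumes plain HTTP on port 80 when no scheme is typed; a real browser does considerably more, and today would usually prefer HTTPS.

```python
import socket
from urllib.parse import urlsplit

def parse_location(typed: str):
    """Step 1: turn location-bar text into (host, port, path)."""
    if "://" not in typed:
        typed = "http://" + typed  # assumption: default to plain HTTP
    parts = urlsplit(typed)
    # An empty path means the site root, "/".
    return parts.hostname, parts.port or 80, parts.path or "/"

def resolve(host: str, port: int):
    """Step 2: ask DNS (through the OS resolver) for the host's IP addresses."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

host, port, path = parse_location("facebook.com")
print(host, port, path)

# resolve(host, port) needs network access, so it is not called here.
```

The tuple returned by parse_location is exactly what step 3 needs: an address to connect to (once resolved) and a path to put in the GET request.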

HTTP Requests

GET can send data too
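For instance, a GET request can carry data in the query string of the URL. A small sketch, assuming a hypothetical /search path and made-up field names:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical form fields to send along with the request.
params = {"q": "packet switching", "page": 2}

# Encode the fields and append them to the path after a "?".
query = urlencode(params)
request_line = f"GET /search?{query} HTTP/1.1"
print(request_line)  # GET /search?q=packet+switching&page=2 HTTP/1.1

# The server recovers the data by parsing the query string back out.
target = request_line.split(" ")[1]
received = parse_qs(urlsplit(target).query)
print(received)  # {'q': ['packet switching'], 'page': ['2']}
```

Note that everything arrives at the server as text: the number 2 comes back as the string '2'.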

HTTP Responses

A response always begins with a status line containing a status code, followed by headers, then a blank line, then the content. Here is a request and a response:

Request

GET /~rshaull/ HTTP/1.1
Host: www.cs.brandeis.edu

Response

HTTP/1.1 200 OK
Date: Thu, 05 Jun 2008 13:00:00 GMT
Content-Length: 8681
Content-Type: text/html; charset=UTF-8

... content here ...
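To make the three parts concrete, here is a rough Python sketch that splits a response like the one above into status line, headers, and content. The function name parse_response is invented for this example, and it ignores details a real client must handle (repeated headers, chunked bodies, and so on).

```python
# An abbreviated response, written out with the CRLF line endings HTTP uses.
# The Content-Length is copied from the example and does not match this
# placeholder body.
raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 8681\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "\r\n"
    "... content here ..."
)

def parse_response(text: str):
    """Split a response into (status code, reason, headers, body)."""
    # The first blank line separates the head from the content.
    head, _, body = text.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return int(code), reason, headers, body

code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"])
```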

Another response status: redirect with 302 Found

Sometimes when you visit a site, the URL in your location bar changes. This happens because the site sent your browser a redirect. For example, facebook.com redirects to www.facebook.com.
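A redirect response names the new address in a Location header, and the browser simply makes a second request to that address. A sketch of the decision the browser makes, where next_url is a made-up helper and the exact status code and Location value facebook.com sends are assumed for illustration:

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)

def next_url(current_url: str, status: int, headers: dict):
    """Return the URL to fetch next, or None if this is not a redirect."""
    if status in REDIRECT_CODES and "location" in headers:
        # Location may be relative, so resolve it against the current URL.
        return urljoin(current_url, headers["location"])
    return None

# Assumed response: 302 Found pointing at the www host.
print(next_url("http://facebook.com/", 302,
               {"location": "http://www.facebook.com/"}))
```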

Another response status: caching with 304 Not Modified

Web browsers cache content locally. This is why hitting the back button is fast.

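If you refresh a page, your browser can tell the web server which version it already has (for example with an If-Modified-Since request header, omitted from the abbreviated request here). If the content hasn't changed, the server may reply with 304 Not Modified and no content, and your browser will use its local cache instead of downloading the same content again.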

Request

GET /~rshaull/ HTTP/1.1
Host: www.cs.brandeis.edu

Response

HTTP/1.1 304 Not Modified
Date: Thu, 05 Jun 2008 13:05:00 GMT
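The server's side of this exchange can be sketched as follows. This is an illustration only: respond is a hypothetical handler, the modification time is invented, and it assumes the browser sends the standard If-Modified-Since request header with the timestamp of its cached copy.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical time at which the page last changed on the server.
LAST_MODIFIED = datetime(2008, 6, 1, 9, 0, 0, tzinfo=timezone.utc)

def respond(request_headers: dict):
    """Return (status, headers, body) for a GET of the page."""
    since = request_headers.get("if-modified-since")
    if since is not None and parsedate_to_datetime(since) >= LAST_MODIFIED:
        # The cached copy is still current: send no content at all.
        return 304, {}, b""
    headers = {"last-modified": format_datetime(LAST_MODIFIED, usegmt=True)}
    return 200, headers, b"<html>...</html>"

# First visit: nothing cached, so the full page is sent.
first = respond({})
# Refresh: the browser echoes back the Last-Modified value it was given.
second = respond({"if-modified-since": first[1]["last-modified"]})
print(first[0], second[0])  # 200 304
```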

Many Requests per Page

Even for a very basic web page, your browser will make many requests to the server! This is why parts of a web page sometimes seem to load more slowly than others, and also why you can start reading text before images or movies show up.

A snippet of HTML and the page it is copied from.

For example, here are some of the additional HTTP GET requests that your browser makes when it starts rendering the facebook.com homepage:

GET /rsrc.php/98481/css/welcome.css HTTP/1.1
GET /rsrc.php/101731/css/dialogpro.css HTTP/1.1
GET /images/welcome/welcome_3.gif HTTP/1.1
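Where do these extra requests come from? The browser finds them while parsing the HTML. A rough sketch using Python's built-in HTML parser; the page snippet and its /css/welcome.css path are invented for the example, and a real browser follows many more kinds of references than the three tags checked here.

```python
from html.parser import HTMLParser

class ResourceFinder(HTMLParser):
    """Collect URLs of stylesheets, images, and scripts referenced by a page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(attrs["href"])

# Hypothetical page content, not the real facebook.com HTML.
page = """<html><head>
<link rel="stylesheet" href="/css/welcome.css">
</head><body>
<p>Welcome!</p>
<img src="/images/welcome/welcome_3.gif">
</body></html>"""

finder = ResourceFinder()
finder.feed(page)
for url in finder.urls:
    print(f"GET {url} HTTP/1.1")  # one extra request per referenced resource
```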

What is a Web Browser?

A web browser is:

Another Tool Break

Let's use a program called telnet to act like a web browser.
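With telnet you connect to port 80 of the server and type the request by hand. The same thing can be approximated in Python with a plain socket; build_request and fetch_raw are names made up for this sketch, and the Connection: close header is added so the server hangs up when it is done.

```python
import socket

def build_request(host: str, path: str = "/") -> bytes:
    """The exact text you would type into telnet, with CRLF line endings."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"  # blank line marks the end of the request headers
    ).encode("ascii")

def fetch_raw(host: str, path: str = "/", port: int = 80, timeout: float = 5.0) -> bytes:
    """Send the request over TCP and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)

print(build_request("www.cs.brandeis.edu", "/~rshaull/").decode("ascii"))
```

Calling fetch_raw("www.cs.brandeis.edu", "/~rshaull/") requires network access and returns the raw bytes of the response, status line and headers included, which is what telnet would print to your terminal.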

Practical Principles

The end-to-end principle says that the usefulness of the network is at its edges. HTTP and TCP/IP together form the Web, where the edges are your web browser and a web server.

The openness principle means that much of the code that makes the Internet possible is there for you to look at. A good way to learn HTML and CSS (and JavaScript) is to view source.

What's Next?

Next, we will put our knowledge of HTML, CSS, and HTTP to work creating forms and processing input data.