Chapter 1. HTTP

Table of Contents

1.1. Interactive telnet session with a Web server
1.2. A tiny Web server to play with
1.3. A tiny Web client to play with

This practical session will let you explore important concepts of the Web through the sections listed above.

1.1. Interactive telnet session with a Web server

We will first explore the HTTP protocol with a good old program called telnet, which can connect to a service on any port when you give the port number as a command-line argument. This lets us see what a standard Web browser normally hides: the actual requests and the server's responses, before a graphical browser processes them.

Exercise 1.1. Fetch a document

Let us fetch a document from the Pasteur Institute Web server:

% telnet www.pasteur.fr 80
Trying 157.99.64.12...
Connected to www.pasteur.fr.
Escape character is '^]'.
GET /formation/infobio/web/cours/data/page1.html HTTP/1.0

HTTP/1.1 200 OK
Date: Tue, 24 Feb 2004 18:01:05 GMT
Server: Apache/1.3.26 (Unix) mod_perl/1.24_01 mod_ssl/2.8.10 OpenSSL/0.9.5a
Last-Modified: Tue, 18 Feb 2003 13:40:14 GMT
ETag: "101e6a1-cd-3e5237be"
Accept-Ranges: bytes
Content-Length: 205
Connection: close
Content-Type: text/html; charset=iso-8859-1

<html>
  <head>
    <title>A sample Web page</title>
  </head>

  <body>
    <h1>A first header</h1>
    And some text...

  </body>
</html>
Connection closed by foreign host.
%
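
The same exchange can be scripted. The following is a minimal sketch in Python, assuming only the standard socket module; the function names build_request and send_raw are illustrative and not part of the original exercise. It builds the request text that was typed into telnet above and sends it over a plain TCP connection.

```python
import socket


def build_request(method, path, headers=None, version="HTTP/1.0"):
    """Return the raw bytes of an HTTP request without a body."""
    lines = [f"{method} {path} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # An empty line marks the end of the request headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_raw(host, port, request, timeout=10.0):
    """Send raw request bytes and read until the server closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

To reproduce the session above, one could call send_raw("www.pasteur.fr", 80, build_request("GET", "/formation/infobio/web/cours/data/page1.html")) and print the decoded result. What comes back depends on how the server is configured when you run it.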
	  

You can also ask the Web server for information about a document without fetching the document itself, by using the HEAD method instead of GET:

% telnet www.pasteur.fr 80
Trying 157.99.64.12...
Connected to www.pasteur.fr.
Escape character is '^]'.
HEAD /formation/infobio/web/cours/data/page1.html HTTP/1.0

HTTP/1.1 200 OK
Date: Tue, 24 Feb 2004 18:01:05 GMT
Server: Apache/1.3.26 (Unix) mod_perl/1.24_01 mod_ssl/2.8.10 OpenSSL/0.9.5a
Last-Modified: Tue, 18 Feb 2003 14:38:31 GMT
ETag: "101e6a1-cd-3e5237be"
Accept-Ranges: bytes
Content-Length: 205
Connection: close
Content-Type: text/html; charset=iso-8859-1

Connection closed by foreign host.
%
	  
What could this kind of information be useful for?
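
To work with such a response in a program, the status line and headers have to be separated from the body. Here is a simplified sketch, assuming the whole response is already available as bytes; parse_response is a hypothetical helper name, and it ignores details such as repeated or folded headers.

```python
def parse_response(raw):
    """Split a raw HTTP response into (status, reason, headers, body)."""
    # Headers and body are separated by the first empty line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line, for example: HTTP/1.1 200 OK
    parts = lines[0].split(" ", 2)
    status = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return status, reason, headers, body
```

For a HEAD response the returned body is empty, while headers["content-length"] still gives the size the document would have.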

Exercise 1.2. HTTP headers

Let us use the date information sent by the server. For instance, say you want a file only if it has been modified since a given date (for example, since you last fetched it): the If-Modified-Since request header serves this purpose.

% telnet www.pasteur.fr 80
Trying 157.99.64.12...
Connected to www.pasteur.fr.
Escape character is '^]'.
GET /formation/infobio/web/cours/data/page1.html HTTP/1.0
If-Modified-Since: Tue, 18 Feb 2003 14:38:31 GMT

HTTP/1.1 304 Not Modified
Date: Tue, 24 Feb 2004 18:01:05 GMT
Server: Apache/1.3.26 (Unix) mod_perl/1.24_01 mod_ssl/2.8.10 OpenSSL/0.9.5a
Last-Modified: Tue, 18 Feb 2003 14:38:31 GMT
Connection: close
ETag: "101e6a1-cd-3e5237be"

Connection closed by foreign host.
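
The date comparison behind this exchange can be sketched in Python with the standard email.utils module. The helper names http_date and expected_status are illustrative assumptions, and the comparison is a simplification of what a real server does.

```python
from email.utils import formatdate, parsedate_to_datetime


def http_date(timestamp):
    """Format a Unix timestamp as an HTTP date such as
    'Tue, 18 Feb 2003 14:38:31 GMT'."""
    return formatdate(timestamp, usegmt=True)


def expected_status(last_modified, if_modified_since):
    """Return 200 if the document changed after the given date, else 304."""
    modified = parsedate_to_datetime(last_modified)
    since = parsedate_to_datetime(if_modified_since)
    return 200 if modified > since else 304
```

With both dates equal, as in the session above, expected_status returns 304, which matches the server's "Not Modified" answer.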
	  

Another useful header lets you chain several requests over the same connection. Try:

% telnet www.pasteur.fr 80
Trying 157.99.64.12...
Connected to www.pasteur.fr.
Escape character is '^]'.
GET /formation/infobio/web/cours/data/page1.html HTTP/1.0
Connection: keep-alive

	  
What can you do after the server's response?

Exercise 1.3. HTTP status code

HTTP status codes tell you whether a request has been successful. Explain what happens in the following:

% telnet bioweb.pasteur.fr 80
Trying 157.99.64.11...
Connected to rosalind.sis.pasteur.fr.
Escape character is '^]'.
GET /formation/infobio/web/cours/data/page1.html HTTP/1.0

HTTP/1.1 404 Not Found
Date: Tue, 24 Feb 2004 18:01:05 GMT
Server: Apache/1.3.20 (Unix)
Last-Modified: Tue, 18 Feb 2003 14:38:31 GMT
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>404 Not Found</TITLE>
</HEAD><BODY>
<H1>Not Found</H1>
The requested URL /formation/infobio/web/cours/data/page1.html was not found on this server.<P>
<HR>
<ADDRESS>Apache/1.3.20 Server at bioweb.pasteur.fr Port 80</ADDRESS>
</BODY></HTML>
Connection closed by foreign host.
%
	  

And here:

% telnet www.pasteur.fr 80
Trying 157.99.64.12...
Connected to www.pasteur.fr.
Escape character is '^]'.
HEAD /formation/infobio/web/cours/data/page2.html HTTP/1.0

HTTP/1.1 403 Forbidden
Date: Tue, 24 Feb 2004 18:01:05 GMT
Server: Apache/1.3.26 (Unix) mod_perl/1.24_01 mod_ssl/2.8.10 OpenSSL/0.9.5a
Last-Modified: Tue, 26 Nov 2002 15:21:19 GMT
ETag: "ee325-c4c-3de3916f"
Accept-Ranges: bytes
Content-Length: 3148
Connection: close
Content-Type: text/html; charset=iso-8859-1

Connection closed by foreign host.
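
When reading status codes in a script, the first digit gives the general class of the answer. The sketch below uses Python's standard http.HTTPStatus enumeration; describe_status is a hypothetical helper name and the class labels are informal summaries.

```python
from http import HTTPStatus

# Informal labels for the first digit of a status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return a short description such as '404 Not Found (client error)'."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one of the values known to the standard library.
        phrase = "Unknown"
    status_class = STATUS_CLASSES.get(code // 100, "unknown class")
    return f"{code} {phrase} ({status_class})"
```

For example, describe_status(403) places the last response above in the client error class.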
	

Exercise 1.4. HTTP GET: dynamic content

A Web server does not only serve static documents. You can also issue requests for dynamically computed documents. At the HTTP protocol level, there are several ways to achieve this. First, try:

% telnet www.pasteur.fr 80
Trying 157.99.64.12...
Connected to www.pasteur.fr.
Escape character is '^]'.
GET /cgi-bin/biology/bnb_s.pl?query=biopython HTTP/1.0


	  
What do you get?

Try another one (put your own email in place of YOUR_EMAIL):

% telnet bioweb.pasteur.fr 80
Trying 157.99.64.11...
Connected to rosalind.sis.pasteur.fr.
Escape character is '^]'.
GET /cgi-bin/seqanal/pdbsearch.pl?email=YOUR_EMAIL&query=1crn HTTP/1.0
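
In a GET request the parameters travel in the URL, after the question mark, and special characters have to be percent-encoded. A small sketch with Python's standard urllib.parse module follows; build_get_target is an illustrative name and user@example.org is a placeholder address, not a value from the exercise.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_get_target(script_path, params):
    """Return the request target for a GET request carrying form parameters."""
    # urlencode joins name=value pairs with '&' and percent-encodes
    # characters such as '@'.
    return f"{script_path}?{urlencode(params)}"


target = build_get_target(
    "/cgi-bin/seqanal/pdbsearch.pl",
    {"email": "user@example.org", "query": "1crn"},
)

# Decoding the query string gives back the original parameters.
decoded = parse_qs(urlsplit(target).query)
```

The resulting target is what goes between GET and HTTP/1.0 on the request line.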
	  

Exercise 1.5. HTTP POST

Describe the difference between Exercise 1.4 and the following (do not forget to set the Content-length header to the exact length, in bytes, of the request body, that is, the parameter string sent after the empty line):

% telnet bioweb.pasteur.fr 80
Trying 157.99.64.11...
Connected to rosalind.sis.pasteur.fr.
Escape character is '^]'.
POST /cgi-bin/seqanal/pdbsearch.pl HTTP/1.0
Content-type: application/x-www-form-urlencoded
Content-length: 35

email=YOUR_EMAIL&query=1crn
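
Counting the body length by hand is error-prone, so it can help to compute it. The following sketch builds the POST request shown above with the Content-length derived from the encoded body; build_post_request is a hypothetical helper name. Note that the placeholder string exactly as printed is 27 characters long, so the value 35 in the transcript presumably corresponds to a longer real address.

```python
from urllib.parse import urlencode


def build_post_request(path, params):
    """Return the raw bytes of a form-encoded HTTP/1.0 POST request."""
    body = urlencode(params).encode("ascii")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        "Content-type: application/x-www-form-urlencoded\r\n"
        # The length must match the body exactly, counted in bytes.
        f"Content-length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


request = build_post_request(
    "/cgi-bin/seqanal/pdbsearch.pl",
    {"email": "YOUR_EMAIL", "query": "1crn"},
)
```

Unlike the GET version, the parameters no longer appear on the request line: they are sent as the body, after the empty line that ends the headers.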