CC425 -- Network Technologies (TCP/IP Suite) OR Advanced Computer Networks

Semester Assignments

The idea (and most of the text) for both of the following assignments is taken from the courses CS 244a at Stanford University and CPSC 441 at the University of Calgary, Canada. You are encouraged to consult the original text on those course websites; a copy is also available locally. [proxy-1] [proxy-2] [tcp]

A Simple Web Proxy

The purpose of this assignment is to learn about Web proxies and the HyperText Transfer Protocol (HTTP). Along the way, you will also learn a bit about TCP/IP and socket programming.

A Web proxy is a software entity that functions as an intermediary between a Web client (browser) and a Web server. The Web proxy intercepts Web requests from clients and reformulates the requests for transmission to a Web server. When a response is received from the Web server, the proxy sends the response back to the client. While the presence of the proxy as an intermediary in the request-response interaction adds some overhead, one advantage of a proxy is that it conceals the identity of the client from the Web server. That is, from the server's point of view, the proxy is the client. Similarly, from the client's point of view, the proxy is the server. A Web proxy thus provides a single point of control to regulate Internet access between clients and servers.

In some deployments of Web proxies, the proxy is augmented to have a local storage capability called a cache. If one or more clients access the same Web content repeatedly, then a Web proxy offers a natural point to store a "local" cached copy of that Web content. By storing a copy locally, the proxy can respond to some requests immediately without contacting the origin Web server. This reduces the response time for the Web user, reduces traffic on the core Internet, and relieves the server of processing repeated requests. These are the main reasons why Web proxy caches are popular.

In this assignment, you will implement and test a simple Web proxy. This Web proxy performs the first role (proxying) but not the second role (caching). The goals of the assignment are to build a properly functioning Web proxy for simple Web pages, and then to extend the functionality in certain ways to offer some novel Web proxy features. You do not need to implement caching in this Web proxy.

The most important HTTP command for your Web proxy to handle is the "GET" request, which specifies the URL for an object to be retrieved. In the basic operation of your proxy, it should be able to parse, understand, and forward to the Web server a (possibly modified) version of the client request. Similarly, the proxy should be able to parse, understand, and return to the client a (possibly modified) version of the response that the Web server provided to the proxy. Your proxy should be able to handle response codes such as 200 (OK) and 404 (Not Found) correctly, notifying the client as appropriate. Reasonable handling of Conditional GET requests and 304 (Not Modified) responses is also desirable. Adding support for HTTP request redirection (302) is optional; such requests can be handled "recursively" by your proxy if you want to implement this. (Tricky!)
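As an illustration of the "parse and forward a possibly modified request" step, here is a minimal Python sketch. It assumes the browser sends the absolute-form request line that browsers use when configured for a proxy (e.g., "GET http://host/path HTTP/1.0"). The function name rewrite_request and the choice to drop Proxy-Connection/Connection headers and force "Connection: close" are illustrative assumptions, not requirements of the assignment.

```python
from urllib.parse import urlsplit

def rewrite_request(raw: bytes):
    """Return (host, port, request_bytes_for_origin_server).

    Assumes `raw` holds one complete request whose target is an absolute URL,
    as sent by a browser that has been configured to use a proxy.
    """
    head, _sep, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)

    url = urlsplit(target)
    host = url.hostname
    port = url.port or 80            # default HTTP port when none is given
    path = url.path or "/"
    if url.query:
        path += "?" + url.query

    # Drop hop-by-hop connection headers; the proxy sets its own below.
    dropped = ("proxy-connection:", "connection:")
    headers = [h for h in lines[1:] if not h.lower().startswith(dropped)]
    if not any(h.lower().startswith("host:") for h in headers):
        headers.insert(0, f"Host: {url.netloc}")
    headers.append("Connection: close")   # simplification: one request per connection

    new_head = "\r\n".join([f"{method} {path} {version}"] + headers)
    return host, port, new_head.encode("iso-8859-1") + b"\r\n\r\n" + body
```

Asking the server to close the connection after each response keeps the sketch simple: the proxy can read until end-of-stream instead of tracking message lengths.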

You will need at least one TCP (stream) socket for client-proxy communication, and at least one additional TCP (stream) socket for proxy-server communication. If you want your proxy to support multiple concurrent HTTP transactions (highly recommended), you will need to fork child processes for request handling as well. Each child process will use its own socket instances for its communications with the client and with the server.
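The socket-and-fork structure described above might be sketched as follows in Python on a Unix-like system. This is only one possible layout: the names relay and serve, the backlog of 16, and the handle callback (a hypothetical function that reads the client's request, opens the proxy-server socket, and calls relay) are assumptions made for illustration.

```python
import os
import signal
import socket

def relay(client: socket.socket, server: socket.socket, request: bytes) -> None:
    """Send one request to the server and stream the response back to the client."""
    server.sendall(request)
    while True:
        chunk = server.recv(4096)
        if not chunk:               # server closed its side: response is complete
            break
        client.sendall(chunk)

def serve(listen_port: int, handle) -> None:
    """Accept clients forever, forking one child process per connection."""
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)   # avoid zombie children
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("", listen_port))
    listener.listen(16)
    while True:
        client, _addr = listener.accept()
        if os.fork() == 0:          # child: handles exactly this client
            listener.close()
            try:
                handle(client)
            finally:
                client.close()
            os._exit(0)
        client.close()              # parent: the child owns the client socket now
```

The parent closes its copy of the client socket right after forking so that the connection is fully closed once the child finishes.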

You should be able to compile and run your Web proxy on any machine, including your home machine. You should be able to use your proxy from any Web browser (e.g., Netscape Navigator, Internet Explorer, Mozilla Firefox), and from any machine (either on campus or at home). To test the proxy, you will have to configure your Web browser to use your specific Web proxy (e.g., look for menu selections like Edit, Preferences, Advanced, Proxies).

As you design and build your Web proxy, give careful consideration to how you will debug and test it. For example, you may want to print out information about requests and responses received and processed. Once you become confident with the basic operation of your Web proxy, you can toggle off the verbose debugging output. If you are testing on your home network, you can also use tools like Ethereal (now Wireshark) or tcpdump to collect network packet traces. By studying the HTTP/TCP packets going to and from your proxy, you can convince yourself that it is working properly.

In your testing of the proxy, you may want to go through incremental steps similar to the following:

The primary test of correctness for your proxy is a simple visual test. That is, the content displayed by your Web browser should look the same regardless of whether you are using your Web proxy or retrieving content directly from the Web server. This mode of operation can be called "invisible" mode, since the presence of the proxy is invisible to the user.

In addition to invisible mode, please implement a "visible" mode for your proxy, wherein your proxy inserts an additional tag line such as "This page retrieved by dexter's proxy" to be displayed at the bottom of the Web page. This feature involves modifying both the HTML byte stream and the HTTP response header (for example, a Content-Length header must reflect the new body size). You should be able to toggle between visible mode and invisible mode either in your source code, or using a command-line option when you start your proxy. Visible mode can be helpful while debugging your proxy (e.g., if you forget whether your browser is configured to use the proxy or not).
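One way to approach visible mode is sketched below in Python. It assumes the whole response has been buffered, and that the body is uncompressed and not chunk-encoded; the function name add_tag_line and the use of a <p> element are illustrative choices only.

```python
def add_tag_line(response: bytes, tag: str) -> bytes:
    """Insert `tag` at the bottom of an HTML response and fix Content-Length.

    Assumes a fully buffered, uncompressed, non-chunked response.
    Non-HTML responses (images, etc.) are returned unchanged.
    """
    head, sep, body = response.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    is_html = any(
        l.lower().startswith(b"content-type:") and b"text/html" in l.lower()
        for l in lines
    )
    if not sep or not is_html:
        return response

    note = b"<p>" + tag.encode("utf-8") + b"</p>"
    idx = body.lower().rfind(b"</body>")
    if idx == -1:
        body = body + note                      # no closing tag: append at the end
    else:
        body = body[:idx] + note + body[idx:]   # place just before </body>

    # The body grew, so any Content-Length header must be rewritten.
    lines = [
        b"Content-Length: %d" % len(body)
        if l.lower().startswith(b"content-length:") else l
        for l in lines
    ]
    return b"\r\n".join(lines) + sep + body
```

A command-line flag could decide whether the proxy passes responses through this function (visible mode) or forwards them untouched (invisible mode).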

The suggested grading scheme for the assignment is as follows:

TCP Traffic Analysis

The purpose of this assignment is to learn about the Transmission Control Protocol (TCP). In particular, you will write a program to analyze a specially formatted network traffic trace file, in order to assess and understand the TCP/IP protocol, including its handshaking behaviour and its protocol states.

The file 441trace.dat (270 KB ASCII data file) shows some TCP/IP packet traffic collected using a network traffic analyzer on a research network at the University of Calgary. (You may instead use a different trace file captured with Wireshark/Ethereal.) This trace contains 2,808 TCP/IP packets, and lasts about 3.6 minutes. During the period traced, a single Web client was downloading Web pages from different Web sites on the Internet. This trace is to be used for your TCP traffic analysis, and for answering the questions given below.

Each line of data in the trace file represents one TCP/IP packet. There are multiple columns of data on each line, separated by spaces. The columns, from left to right, represent:

An example of a line in this trace format is:
7.974098 -> 44 TCP 1104 80 533868 : 533868 0 win: 32768 S
This packet traveled from IP source address (port 1104) to IP destination address (port 80) at time 7.974098 sec. It was a SYN packet of size 44 bytes (including TCP/IP protocol headers). The proposed starting TCP sequence number was 533868. This packet carried no actual TCP data bytes. The acknowledgement field was invalid, and initialized to 0. The flow control window size advertised was 32 KB.
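To make the format concrete, the following Python sketch parses the example line exactly as printed above. The field names (seq_start, seq_end, and so on) are assumptions based on the description of the example. Because the example shows no addresses on either side of "->", the parser treats them as optional and stores an empty string when they are absent.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    time: float
    src: str        # empty if the trace line carries no source address
    dst: str        # empty if the trace line carries no destination address
    size: int       # packet size in bytes, including TCP/IP headers
    proto: str
    sport: int
    dport: int
    seq_start: int
    seq_end: int
    ack: int
    window: int
    flags: str

def parse_line(line: str) -> Packet:
    tok = line.split()
    arrow = tok.index("->")
    src = tok[arrow - 1] if arrow > 1 else ""
    rest = tok[arrow + 1:]
    dst = ""
    if not rest[0].isdigit():       # an address, unlike the size, is not a plain integer
        dst, rest = rest[0], rest[1:]
    (size, proto, sport, dport,
     seq_start, _colon, seq_end, ack, _win, window, *flags) = rest
    return Packet(float(tok[0]), src, dst, int(size), proto, int(sport),
                  int(dport), int(seq_start), int(seq_end), int(ack),
                  int(window), "".join(flags))

pkt = parse_line("7.974098 -> 44 TCP 1104 80 533868 : 533868 0 win: 32768 S")
print(pkt.sport, pkt.dport, pkt.flags)
```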

You need to write a program (20 marks) for parsing and processing trace files in this format (or in a format compatible with Wireshark), and tracking TCP state information. In particular, the program processes the trace file and computes summary information about TCP connections. Note that a TCP connection is identified by a 4-tuple (IP source address, source port, IP destination address, destination port), and packets can flow in both directions on a connection (i.e., from host A to host B, and from host B to host A). Also note that the packets from different connections can be arbitrarily interleaved with each other in time, so your program will need to extract packets and associate them with the correct connection.
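Since packets flow in both directions, one common approach is to normalize the 4-tuple so that A-to-B and B-to-A packets map to the same key. The Python sketch below shows this idea; the dictionary-based packet representation and the addresses used in the example are hypothetical.

```python
from collections import defaultdict

def connection_key(src, sport, dst, dport):
    """Return a direction-independent key for a TCP connection."""
    a, b = (src, sport), (dst, dport)
    return (a, b) if a <= b else (b, a)   # order the two endpoints consistently

def group_by_connection(packets):
    """Group interleaved packets (dicts with src/sport/dst/dport) by connection."""
    conns = defaultdict(list)
    for p in packets:
        key = connection_key(p["src"], p["sport"], p["dst"], p["dport"])
        conns[key].append(p)
    return conns

packets = [
    {"src": "10.0.0.1", "sport": 1104, "dst": "10.0.0.9", "dport": 80, "flags": "S"},
    {"src": "10.0.0.1", "sport": 1105, "dst": "10.0.0.9", "dport": 80, "flags": "S"},
    {"src": "10.0.0.9", "sport": 80, "dst": "10.0.0.1", "dport": 1104, "flags": "SA"},
]
for key, pkts in group_by_connection(packets).items():
    print(key, [p["flags"] for p in pkts])
```

Per-connection state (handshake progress, resets, byte counts) can then be kept in whatever structure is stored under each key.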

The summary information to be computed for each TCP connection includes:

For testing your program, here are some small example traces. The trace example1.dat (17 TCP packets) contains a single complete TCP connection. The trace example2.dat (12 TCP packets) contains a single TCP connection that is reset. The trace example3.dat (100 TCP packets) contains 8 TCP connections (5 complete, 2 reset, and 1 still in progress when the trace ended). When you have your program working properly, you can run it on the real trace file for this assignment.

Use your program, and the 441trace.dat file, to answer the following questions.

Last updated: 23rd of April, 2007.