Question

Posted on Sep 25, 2024

There is a very useful program called "wget". It's a command-line tool that you can use to download a web page, like this:

wget http://www.gnu.org/software/make/manual/make.html

which will download the make manual page, make.html, and save it in the current directory. wget can do much more (downloading a whole web site, for example); see man wget for more info.

Your job is to write a limited version of wget, which we will call http-client.c, that can download a single file. You use it like this:

./http-client www.gnu.org 80 /software/make/manual/make.html

So you give the components of the URL separately in the command line: the host, the port number, and the file path. The program will download the given file and save it in the current directory. So in the case above, it should produce make.html in the current directory. It should overwrite an existing file.

Hints:

- The program should open a socket connection to the host and port number specified in the command line, and then request the given file using HTTP 1.0 protocol. (See http://www.jmarshall.com/easy/http/ for HTTP 1.0 protocol.) An HTTP GET request looks like this:

GET /path/file.html HTTP/1.0
[zero or more headers ...]
[a blank line]

- Include the following header in your request:

Host: the.host.name.you.are.connecting.to:port_number

Some web sites require it.

- Use "\r\n" rather than "\n" as the newline when you send your request. It's required by the HTTP protocol.
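The request hints above can be sketched as a single formatting step. This is only an illustration: build_request is a hypothetical helper name, and the host:port form of the Host header is an assumption based on the header shown above.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the assignment text): formats an
 * HTTP/1.0 GET request into buf. Every line ends with "\r\n", and the
 * empty line at the end marks the end of the headers.
 * Returns the request length, or -1 if buf is too small.
 */
int build_request(char *buf, size_t size,
                  const char *host, const char *port, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.0\r\n"
                     "Host: %s:%s\r\n"
                     "\r\n",
                     path, host, port);
    if (n < 0 || (size_t)n >= size)
        return -1;   /* formatting error or truncated request */
    return n;
}
```

The resulting buffer can then be written to the socket with send(), passing the returned length.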

- Then the program reads the response from the web server which looks like this:

HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354

Happy New Millennium!

(more file contents) . . .

Just like in part 1, you can use fdopen() to wrap the socket with a FILE*, which will make reading the lines much easier.

- The "200" in the 1st line indicates that the request was successful. If it's not 200, the program should print the 1st line and exit.
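One possible way to check the 1st line is to pull the numeric status code out with sscanf(). The helper name parse_status is made up for this sketch, and it assumes the line has already been read into a string (for example with fgets()).

```c
#include <stdio.h>

/*
 * Hypothetical helper: extracts the status code from a status line
 * such as "HTTP/1.0 200 OK". The two version numbers are matched but
 * discarded (%*d). Returns the code, or -1 if the line does not look
 * like an HTTP status line.
 */
int parse_status(const char *line)
{
    int code;
    if (sscanf(line, "HTTP/%*d.%*d %d", &code) != 1)
        return -1;
    return code;
}
```

If the result is anything other than 200, the program would print the line it read and exit.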

- After the 1st line, a bunch of headers will come, then comes a blank line, and then the actual file content starts. Your program should skip over all headers and just receive the file content.
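Skipping the headers could look roughly like the loop below, assuming the socket has been wrapped in a FILE* and the 1st line has already been consumed. skip_headers is a hypothetical name, and the 4096-byte buffer is an arbitrary choice.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: reads and discards header lines until the blank
 * line that separates headers from the body. Returns 0 once the blank
 * line has been consumed, or -1 if the stream ends first.
 */
int skip_headers(FILE *in)
{
    char buf[4096];
    int at_line_start = 1;   /* false while in the middle of an over-long line */

    while (fgets(buf, sizeof(buf), in) != NULL) {
        if (at_line_start &&
            (strcmp(buf, "\r\n") == 0 || strcmp(buf, "\n") == 0))
            return 0;
        size_t len = strlen(buf);
        at_line_start = (len > 0 && buf[len - 1] == '\n');
    }
    return -1;
}
```

After it returns 0, the next byte read from the stream is the first byte of the file content.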

- Note that the program should be able to download any type of file, not just HTML files.

- The server will terminate the socket connection when it's done sending the file.
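Because the content may be binary and the server signals the end by closing the connection, one reasonable approach is to copy raw bytes with fread()/fwrite() until end-of-file instead of reading lines. The sketch below uses a hypothetical helper name, copy_stream, and an arbitrary buffer size.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copies everything left on `in` to `out` in
 * fixed-size chunks. It never treats the data as a string, so NUL
 * bytes and other binary content pass through unchanged.
 * Returns the number of bytes copied, or -1 on a read or write error.
 */
long copy_stream(FILE *in, FILE *out)
{
    char buf[4096];
    size_t n;
    long total = 0;

    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    if (ferror(in))
        return -1;
    return total;
}
```

Opening the output file with fopen(name, "wb") truncates any existing file of the same name, which matches the overwrite requirement.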

- You will need to pick out the file name part of a file path (make.html from /software/make/manual/make.html for example). Check out strrchr().
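A minimal sketch of the strrchr() idea follows; file_name_from_path is a hypothetical helper name. Note that a path ending in "/" yields an empty name, which the caller would have to handle separately.

```c
#include <string.h>

/*
 * Hypothetical helper: returns a pointer to the part of `path` after
 * the last '/', or `path` itself when it contains no '/'. The result
 * points into the original string; nothing is copied.
 */
const char *file_name_from_path(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}
```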

- You will need to convert a host name into an IP address. Here is one way to convert a host name into an IP address in dotted-quad notation:

struct hostent *he;
char *serverName = argv[1];

// get server ip from server name
if ((he = gethostbyname(serverName)) == NULL) {
    die("gethostbyname failed");
}
char *serverIP = inet_ntoa(*(struct in_addr *)he->h_addr);

The man pages of the functions will tell you which header files need to be included.
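Once serverIP holds the dotted-quad address, opening the connection might look like the sketch below. connect_to_ip is a hypothetical helper, it handles IPv4 only, and it assumes the port from the command line has already been converted to a number (for example with atoi(argv[2])).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: opens a TCP connection to a dotted-quad IPv4
 * address. Returns the connected socket descriptor, or -1 if the
 * address is malformed or the connection cannot be made.
 */
int connect_to_ip(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);            /* network byte order */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                          /* not a valid dotted quad */

    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;
    if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

The returned descriptor is what would be passed to send() for the request and to fdopen() for reading the response.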