A Question regarding wget

Tag: unix Author: k594612041 Date: 2010-06-19

When I type wget http://yahoo.com:80 in a Unix shell, can someone explain what exactly happens between entering the command and reaching the Yahoo server? Thank you very much in advance.

You want an explanation of the HTTP protocol? Or of the network stack?
@Felix I am more interested in the network stack, but I am also interested in the HTTP protocol.

Best Answer

The RFCs provide all the details you need and are not tied to a particular tool or OS.

In your case wget uses HTTP, which is built on TCP, which in turn runs over IP. What sits below IP depends on your link layer; most of the time you will encounter Ethernet frames.
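To make the layering concrete, here is a rough sketch of how the size of a request grows as each layer wraps it. It assumes minimum header sizes with no options (20 bytes for TCP, 20 bytes for IPv4, 14 bytes for the Ethernet header, with preamble and frame check sequence not counted). The request text is a simplified stand-in, not the exact bytes wget sends, and the name encapsulate is made up for illustration.

```python
# Simplified request; a real wget request carries more headers.
HTTP_REQUEST = b"GET / HTTP/1.0\r\nHost: yahoo.com\r\n\r\n"

# Minimum header sizes in bytes, assuming IPv4 and no TCP/IP options.
LAYERS = [("TCP", 20), ("IPv4", 20), ("Ethernet", 14)]


def encapsulate(payload_len):
    """Return (layer, cumulative size) pairs as the payload moves down the stack."""
    sizes = [("HTTP", payload_len)]
    total = payload_len
    for name, header_len in LAYERS:
        total += header_len  # each layer prepends its own header
        sizes.append((name, total))
    return sizes


if __name__ == "__main__":
    for name, size in encapsulate(len(HTTP_REQUEST)):
        print(f"{name:<8} {size} bytes")
```

Wireshark shows the same nesting for real traffic, with the actual header sizes, which are often larger because of TCP options.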

To understand what happens, I urge you to install Wireshark and look at the dissected frames; you will get an overview of which data belongs to which network layer. That is the easiest way to visualize and learn what happens. Besides this, if you really like (irony) funny documents (/irony), have a look at the corresponding RFCs (RFC 2616 for HTTP, for example); for the other protocols, follow the external links at the bottom of the Wikipedia articles.

Other Answer1

  1. The program uses DNS to resolve the host name to an IP address. The classic API call is gethostbyname, although newer programs should use getaddrinfo to be IPv6 compatible.
  2. Since you specify the port, the program can skip looking up the default port for HTTP. If you hadn't, it would call getservbyname to look up the default port (then again, wget may just hard-code port 80).
  3. The program uses the network API to connect to the remote host. This is done with socket and connect.
  4. The program writes an HTTP request to the connection with a call to write.
  5. The program reads the HTTP response with one or more calls to read.
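The steps above can be sketched in Python, whose socket module is a thin wrapper over the same system calls. This is only an illustration under some assumptions: the function names resolve_port and http_get are made up, the request is a bare HTTP/1.0 GET rather than what wget really sends, only the first address returned by the resolver is tried, and sendall/recv stand in for write/read.

```python
import socket


def resolve_port(port=None):
    """Step 2: use the explicit port if given, else look up the 'http' service."""
    if port is not None:
        return port
    try:
        return socket.getservbyname("http", "tcp")
    except OSError:
        return 80  # fall back to a built-in default if the lookup fails


def http_get(host, port=None, path="/"):
    """Fetch a path over plain HTTP and return the raw response bytes."""
    port = resolve_port(port)

    # Step 1: resolve the host name; getaddrinfo covers both IPv4 and IPv6.
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    family, socktype, proto, _canonname, addr = infos[0]

    # Step 3: create a socket and connect to the remote host.
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(10)
        sock.connect(addr)

        # Step 4: write the HTTP request to the connection.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))

        # Step 5: read until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling http_get("yahoo.com", 80) would return the status line, headers and body as one byte string. Unlike wget, this sketch does not parse the response or follow redirects.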