I’ve asked myself this question too many times. Today, let’s just pick one place, one time, one situation, and start there. Let’s pick a situation that everybody does every day, and go from there. What follows is a dialogue and a monologue. Things in “quotes” are things I imagine you say. Everything else is me answering. Note that I don’t know a lot of how the Internet works. And sometimes I’ll say so, in the text. So if you’re an Internet Expert™, grep for these ‘I don’t know’ phrases and help me out!
“Okay, so, I have this website from my voting papers and I’m putting it in my browser and pressing enter, and pretty soon I’ve got that page. What just happened? How does this work?”
In order, roughly, what happened is this:
- You entered this address,
http://www.elections.govt.nz, into your browser.
- Your browser parsed this address, and figured out you were asking for the
website at the server
- Your browser went and used the
httpprotocol to ask the
www.elections.govt.nzserver about that page you wanted.
- The server replied with the information you wanted.
- The browser displayed that information.
“Wait wait waaait. What does ‘parsed’ mean?”
Well, it means… you know what, let’s just google it:
Parsing or syntactic analysis is the process of analysing a string of symbols, either in natural language or in computer languages, according to the rules of a formal grammar. The term parsing comes from Latin pars, meaning part. (Wikipedia)
Well, that was helpful. Before you ask me what a ‘string’ or a ‘formal grammar’ are, let’s just say that ‘parsing’ is about taking a piece of data, a piece of text, figuring out if it fits a certain pattern, and if it does, cutting it up in its components. So in english, for example, the sentence ‘How are you?’ fits the pattern of, well, an english sentence, while ‘Bruf aouf whsbt.!..42’ doesn’t. What’s more, ‘How are you?’ can be cut up into the words ‘How’, ‘are’, ‘you’, and the punctuation ‘?’. Further, ‘are’ is the verb ‘to be’ at the second person singular or plural, and so on, so it can be seen that the sentence further fits the pattern of a simple question of the form ‘adverb verb subject?’.
The browser does the same thing. From the text
it wants a pattern known as a URL. So it checks it out, figures that yes, this
piece of text is indeed a URL, and cuts it up in its components:
http is the
www.elections.govt.nz is the address of the server.
“Right. What’s a protocol?”
A protocol is… let’s ask google again:
In telecommunications, a communications protocol is a system of digital rules for data exchange within or between computers. Communicating systems use well-defined formats for exchanging messages. (Wikipedia)
Much better than the previous one! In somewhat plainer english, a protocol is a set of rules that computer programs use to speak to each other. Much like someone from New Zealand and someone from, say, Brasil will not speak the same language, if two programs don’t understand the same protocols, they won’t be able to understand each other. Unlike most spoken languages, though, protocols are very rigidly defined. Someone from England may have some difficulties understanding someone from Australia, but they’ll manage. But a program that understands protocol A will absolutely not understand a program that speaks protocol B, no matter if the only thing that changes between the two is a single word. Computers are fussy like that.
In this case, the protocol that browser is going to use is called ‘HTTP’. That stands for ‘HyperText Transfer Protocol’. ‘Hypertext’ is a fancy word to refer to webpages, and the rest is self-explanatory. ‘HTTP’ is a set of rules that computer programs use to transfer webpages between themselves.
“What are those rules?”
HTTP is actually quite simple. When your browser wants a page, it requests it from the server, which responds to that request with, well, a response. Both the request and response are composed of three parts: the first line has basic information about what you want or what the server is saying, the lines below that contain metadata about the request or response, and then, separated from the metadata by a blank line, is the actual content.
When your browser asks
www.elections.govt.nz for a page, here’s what it sends
to the server:
GET / HTTP/1.1 Host: www.elections.org.nz User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:35.0) Gecko/20100101 Firefox/35.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate Connection: keep-alive Pragma: no-cache Cache-Control: no-cache
The first line,
GET / HTTP/1.1, is the ‘basic information’ line. It says ‘GET
me the page
/ is the default page), and it says ‘oh by the way I’m
going to speak to you in the version 1.1 of the HTTP protocol’. There’s several
versions, the ones most important right now are 1.1, which is the current, most
common one, and version 2.0, which is the next generation one. I don’t know
much about HTTP/2, so we’ll just ignore that for now.
Then comes the headers, the metadata of the request, in
Name: value pairs.
Host header says ‘I’m requesting this using the
address’; this is important because the server might not know about that, but
we’ll see more details later. The
User-Agent says ‘The browser speaking to
you is Firefox, version 35.0, using the Gecko engine, made by Mozilla, running
on Linux, on a 64-bit computer.’ although most servers don’t really care about
all that. The
Accept says ‘here are the list of formats I understand’, which
the server can use to make sure it responds with the proper thing. The
Accept-Language says ‘My human has indicated peh would prefer things written
in English’. The
Accept-Encoding says ‘If you want, you can compress the
response so it goes faster over the wire, and I’ll understand it just fine as
long as you use either the
deflate programs to do the compressing’.
Connection specifies that the server should keep the line open so the
browser doesn’t have to go through all the trouble it did the first time around
if it wants something else from the server. The
pairs are about caching, which I’ll tell you about later if you ask nicely.
As you can see, there’s a whole lot of information being expressed there. There is no ‘body’, no contents below that, but that’s okay, because with all this information, the server can surely comply and give us an answer.
And it does:
HTTP/1.1 200 OK Server: nginx Date: Fri, 12 Sep 2014 10:35:16 GMT Content-Type: text/html; charset=utf-8 Transfer-Encoding: chunked Connection: keep-alive Keep-Alive: timeout=10 Vary: Accept-Encoding Etag: "1410517815-1" Content-Language: en X-UA-Compatible: IE=edge,chrome=1 Link: <http://www.elections.org.nz/>; rel="canonical",<http://www.elections.org.nz/>; rel="shortlink" Last-Modified: Fri, 12 Sep 2014 10:30:15 +0000 Vary: Accept-Encoding X-Frame-Options: sameorigin X-Varnish: 1757163577 1757156649 Age: 299 Via: 1.1 varnish X-Hits: 33 Cache-Control: no-cache, must-revalidate, max-age=3600 X-Frame-Options: sameorigin < !DOCTYPE html> <html>...
Again, the first line is ‘basic information’. The server is telling us ‘HTTP
version 1.1 is fine’ and ‘Your request was OK, so here’s the response’. The
200 here is because instead of saying
OK, the server really says
That’s what it understands. But it’s also being nice to humans who might read
this, and it includes the meaning of that
200 next to it:
Then the metadata.
Server says ‘The name of the program I use to talk to you
Date is just that, the current date.
Content-Type is what the
format of the contents is; if you look back to the request, you’ll notice it
was in the
Accept list we gave the server, so all is well.
Transfer-Encoding says ‘I’m going to give this to you in small bits.’
Connection is an acknowledgement of what we asked, and a confirmation that
it’s going to hold the line.
Keep-Alive is related to that: if we don’t say
anything for 10 seconds, the server will hang up.
Vary is for proxies, which
I’ll explain later if you ask.
Etag is a small piece of text which identifies
the version of the content; if your browser asks if the page has changed, it
Etag along, and the server can know precisely which version your
browser has, and if it has, indeed, changed (if it hasn’t, it just says so
instead of sending a copy of the content, which is much faster).
Content-Language says ‘this content is written in English’. Anything that
X- is a special header, and might not mean the same thing to
everybody, or might just be for extra information that only some browsers or
X-UA-Compatible is used only by the Internet Explorer browser.
Link says ‘This link,
http://www.elections.org.nz, is both the canonical
link (the one that points to the source of the content) and the short link
(the one that should be used if you want to save space).’
ETag, but using a date. The server repeats the
Vary header, which is
probably a bug.
X-Frame-Options says ‘This page can only be put in a frame if
the parent page is on the same address as I am.’
X-Varnish is related in some
way to the program Varnish.
Age is the age of the contents in seconds, in
this case the content is just under 5 minutes old.
Via says ‘This response
came to you via Varnish’; Varnish is a special program that big websites use to
X-Hits… I’m not sure; it could be that this is the number of times
your browser has asked that server for something today, or something else… I
Cache-Control is related to caching, see above. And they’ve
X-Frame-Options header, too; must really be a bug.
And then below that there’s a blank line, and below that blank line is the
contents of the response, which start with
!DOCTYPE html and
I’ve omitted the rest because there’s, like, ten pages of the stuff.
So that’s how HTTP works. You ask the server a question, and give it plenty of information, some of it it won’t need, and it replies with plenty of information, some of it you won’t need, plus the actual contents of what you asked for. So far so good.
“So, that explains a lot, but now I have something like fifty questions.”
Ooh boy. Okay, to make things simpler, could you just tell me which questions you have and I’ll see what I have to answer?
“Yeah, sure. So…
- Why might the server not know about the address we’ve used to ask it a
question, i.e. why is that
- What’s the Gecko engine? What’s an engine? How is it different from a browser?
- What’s the format of the
Acceptheader? What are those
q=0.somethingthings? What’s the difference between
- What’s the format of the
Accept-Languageheader? Why is it both
en? Where do I change that?
- What are
deflateand why were those two picked instead of, say, WinRAR? WinRAR does compression, too, doesn’t it? My friend said it was better than Zip.
- Can you tell me more about the caching business? Pretty please?
- Why is the response giving us the current date? We’ve got a computer, we know what today is…
- What if
Content-Typeisn’t in the
Acceptlist we sent through?
Tranfer-Encoding: chunkedthing is weird. Why would the server give us the response in small bits? Can’t it just give it whole?
- The Internet doesn’t run on phones anymore, there’s nothing to ‘hang up’ and ‘holding the line’ isn’t literal anymore, so what are you actually talking about there?
- So can you explain that
Varybusiness and proxies?
X-UA-Compatiblemight only be for Internet Explorer, but what does it mean?
- What’s a frame?
- How does Varnish make big website faster?
- Earlier you said that when the browser got the response back, it displayed it to me. All that I see in the response is text, and not even all of it, as you have omitted a bunch of it. Where does the browser get the nice layout and pretty pictures from just a lot of text?
- In the
Content-Typeof the response, there’s something that says
charset=utf-8. UTF-8… I feel like I’ve heard that before? What is it?”
Firstly, that’s not even fifty questions. More like half that.
“You’re a pedant.”
Sometimes. As for answers…