Ever wondered what happens after you compose an email to your friend and then click the “send” button?
How does the email end up at your friend’s mailbox?
In this article, I will walk you step by step through the full journey of an email message, from the moment you click the “send” button until it lands in your friend’s mailbox.
To understand how email works, you need to understand an internet protocol called SMTP, the Simple Mail Transfer Protocol.
Let’s get started!
What is SMTP?
If you feel adventurous and want to read the specification itself, SMTP was originally specified in RFC 821 in 1982. It has been revised since then, most recently in RFC 5321 (2008).
I have to warn you though, these RFCs are very dry and, well, kind of boring to read.
Now instead of me going through the RFC, I’d rather explain how email and SMTP work by walking through a practical example.
A motivating example
For the rest of this article, I am going to explain the SMTP protocol and how email works by following the journey of an email message from one person, Bob, to another, Alice.
Let’s assume Bob has an email account at Gmail, Bob@gmail.com, and Alice has an email account at Yahoo, Alice@yahoo.com.
Bob wants to send an email to Alice. He composes his message in an application running on his Mac (Apple’s Mail app) and he is ready to click “send”.
I am now going to track this message all the way from Bob’s laptop until it reaches Alice’s laptop.
First, let’s lay down all the participating players in the process:
1- Bob’s user agent
This is the application running on Bob’s laptop that he uses to compose, reply to, and read his email messages.
Bob uses Apple’s Mail app on his Mac as his user agent.
If Bob wants to read his email messages, his user agent fetches them from Bob’s mail server (I’ll explain what that is next). If Bob wants to send an email message, he composes the message on his user agent, and then pushes it to his mail server to be delivered to the right recipient.
2- Bob’s Mail Server
Bob has an email account on Gmail.
This means there is a remote machine under the gmail.com domain that stores all the email messages sent to Bob. This machine is also in charge of delivering the messages Bob sends to users on other mail servers.
This remote machine (or more accurately, the application running on this remote machine) is what we call Bob’s mail server.
3- Alice’s Mail Server
This is similar to Bob’s mail server, except that it is a Yahoo machine instead of a Gmail machine because, as I mentioned earlier, Alice has a Yahoo email account.
4- Alice’s user agent
Again, this is similar to Bob’s user agent: it is the application running on Alice’s laptop that allows her to fetch emails from her mail server and read them. It also allows her to compose messages on her laptop and push them to her mail server to be delivered later to the proper recipient. Alice has a PC and she uses Microsoft Outlook as her user agent.
The email journey
Now back to our scenario, let’s follow the email message as it travels from Bob to Alice at a high level.
1- Bob opens his Mail app, provides Alice’s email address (Alice@yahoo.com), writes his message, and clicks the “send” button.
2- The Mail app starts communicating with Bob’s mail server and eventually pushes the email that Bob composed to Bob’s mail server, where it is stored to be delivered later to Alice@yahoo.com.
3- Bob’s mail server sees that there is a message pending delivery to Alice@yahoo.com. It opens a connection to the yahoo.com mail server in order to deliver it. This is where the SMTP protocol comes into play: SMTP governs the communication between these two mail servers. In our particular scenario, Bob’s mail server plays the role of the SMTP client while Alice’s mail server plays the role of the SMTP server.
4- After some initial SMTP handshaking between the gmail and yahoo mail servers, the SMTP client sends Bob’s message to Alice’s mail server.
5- Alice’s mail server receives the message and stores it in her mailbox so that she can read it later.
6- At some point, Alice uses her Microsoft Outlook to fetch messages from her mailbox and eventually reads Bob’s message.
I will discuss how the email messages are delivered from Bob’s user agent to his mail server (and from Alice’s mail server to her user agent) later.
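To make the six steps above concrete, here is a toy model of the journey. It is only a sketch: the `Message` and `MailServer` classes and their method names are made up for illustration, and the server-to-server hop is a plain method call instead of a real SMTP session.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    recipient: str
    body: str


@dataclass
class MailServer:
    """Toy mail server: one mailbox per local user plus an outbound queue."""
    domain: str
    mailboxes: dict = field(default_factory=dict)
    outbound: list = field(default_factory=list)

    def accept_from_user_agent(self, msg: Message) -> None:
        # Step 2: the user agent pushes the message; the server queues it.
        self.outbound.append(msg)

    def store(self, msg: Message) -> None:
        # Step 5: the receiving server files the message in the mailbox.
        self.mailboxes.setdefault(msg.recipient, []).append(msg)

    def deliver_pending(self, servers: dict) -> None:
        # Steps 3-4: hand each queued message to the recipient's server.
        # In reality this hop is an SMTP session; here it is a method call.
        while self.outbound:
            msg = self.outbound.pop(0)
            domain = msg.recipient.split("@", 1)[1]
            servers[domain].store(msg)

    def fetch(self, address: str) -> list:
        # Step 6: the recipient's user agent reads the mailbox.
        return list(self.mailboxes.get(address, []))


gmail = MailServer("gmail.com")
yahoo = MailServer("yahoo.com")
servers = {s.domain: s for s in (gmail, yahoo)}

# Step 1: Bob composes his message and clicks "send".
gmail.accept_from_user_agent(Message("Bob@gmail.com", "Alice@yahoo.com", "Hi Alice!"))
gmail.deliver_pending(servers)
inbox = yahoo.fetch("Alice@yahoo.com")
```

After this runs, `inbox` holds Bob’s message and Gmail’s outbound queue is empty.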
The SMTP protocol
For now let’s focus on the communication that happens between Bob’s mail server (running on the gmail.com machine) and Alice’s mail server (running on the yahoo.com machine).
Remember that Bob’s mail server had to start a communication channel with Alice’s mail server to deliver Bob’s email to Alice.
Also remember that SMTP is the protocol that governs this communication.
Here is a sequence diagram of all the events that happen when everything works correctly.
The SMTP protocol is a text-based protocol that is composed of commands and replies.
The SMTP client (Bob’s mail server in our case) sends SMTP commands whereas the SMTP server (Alice’s mail server) responds to these commands with numerical codes.
Some examples of the commands that are used in the SMTP protocol are EHLO, MAIL FROM, RCPT TO, DATA, and QUIT.
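Since every reply starts with a three-digit code, a client can decide what to do next by looking at the number alone. The helpers below are a small sketch of that idea; `parse_reply`, `reply_class`, and `EXPECTED_CODES` are names I made up for illustration, and the table only lists the codes that show up in the happy path described in this article.

```python
# Reply code expected after each step when everything works correctly.
EXPECTED_CODES = {
    "(connect)": 220,
    "EHLO": 250,
    "MAIL FROM": 250,
    "RCPT TO": 250,
    "DATA": 354,
    ".": 250,
    "QUIT": 221,
}


def parse_reply(line: str) -> tuple[int, str]:
    """Split a single-line SMTP reply into its numeric code and its text."""
    line = line.rstrip("\r\n")
    return int(line[:3]), line[4:]


def reply_class(code: int) -> str:
    """Classify a reply by its first digit."""
    classes = {
        2: "success",
        3: "intermediate",        # the server is waiting for more input
        4: "transient failure",   # the client may try again later
        5: "permanent failure",
    }
    return classes[code // 100]
```

For example, `parse_reply("250 OK\r\n")` gives `(250, "OK")`, and `reply_class(354)` gives `"intermediate"`.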
There are essentially three phases in the SMTP protocol:
First: The SMTP handshake
First, Bob’s mail server (the SMTP client) establishes a TCP connection to Alice’s mail server (the SMTP server), conventionally on port 25. The SMTP server responds with code 220 to signal that it is ready. (This step is not shown in the sequence diagram.)
After the SMTP client receives the 220 reply, the handshaking starts.
The general purpose of the handshaking stage is for the client and the server to identify themselves and the services they can provide, and to establish who the sender and the recipient of the email are.
It starts with Bob’s mail server sending an EHLO command to Alice’s mail server, followed by its own domain. For example, Bob’s mail server would send “EHLO gmail.com”.
Think of the EHLO command as a “hello” message that the SMTP client sends to the SMTP server. In fact, the original RFC defined this command as HELO; EHLO (“extended hello”) was introduced later so that the server can advertise the extensions it supports.
The SMTP server at yahoo acknowledges the EHLO message by responding with code “250” along with the services that the SMTP server can support. It’s important for the client and server to agree on the services and features they can support before the message transfer starts.
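The list of supported services comes back as a multi-line reply: every line carries the 250 code, with a “-” after the code on all lines except the last. Here is a rough sketch of how a client might read it. The sample reply and the `parse_ehlo_reply` name are invented for illustration (SIZE, 8BITMIME, and STARTTLS are real extension keywords, but a real server will advertise its own list).

```python
def parse_ehlo_reply(lines: list[str]) -> dict[str, str]:
    """Map each advertised extension keyword to its parameters (may be empty).

    The first line is the server's greeting, so it is skipped.
    """
    extensions = {}
    for line in lines[1:]:
        if not line.startswith("250"):
            raise ValueError(f"unexpected reply line: {line!r}")
        # line[3] is "-" (more lines follow) or " " (last line).
        keyword, _, params = line[4:].strip().partition(" ")
        extensions[keyword.upper()] = params
    return extensions


# Hypothetical reply to "EHLO gmail.com".
sample_reply = [
    "250-mta.yahoo.example Hello gmail.com",
    "250-SIZE 41943040",
    "250-8BITMIME",
    "250 STARTTLS",
]
extensions = parse_ehlo_reply(sample_reply)
```

With this sample, `extensions` tells the client that the server accepts messages up to the advertised SIZE and that STARTTLS is available.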
Now that the greeting is done, it’s time for the client to send the information of the sender and the recipient of the email.
The SMTP client continues by sending a “MAIL FROM” command along with the sender’s address. In our scenario, it would be something like “MAIL FROM:<Bob@gmail.com>”.
When the SMTP server receives this command, it responds again with code 250 to indicate that it has no problem accepting messages from this user, Bob.
Afterwards, the client sends a “RCPT TO” command along with the email address of the recipient: “RCPT TO:<Alice@yahoo.com>”.
Among other things, the SMTP server checks if the user “Alice” exists and if yes, it sends back a 250 acknowledgement indicating that it’s OK with accepting messages from Bob to be delivered to Alice.
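Put together, the client’s side of the handshake is just three lines of text. The sketch below builds them; `envelope_commands` is a made-up helper name. Note that on the wire each command ends with a carriage return and line feed, the domain after EHLO is sent without angle brackets, and the address follows the colon directly.

```python
def envelope_commands(client_domain: str, sender: str, recipient: str) -> list[str]:
    """Return the handshake commands in the order the client sends them."""
    return [
        f"EHLO {client_domain}\r\n",
        f"MAIL FROM:<{sender}>\r\n",
        f"RCPT TO:<{recipient}>\r\n",
    ]


commands = envelope_commands("gmail.com", "Bob@gmail.com", "Alice@yahoo.com")
```

The client sends one command, waits for the 250 reply, and only then sends the next one.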
This concludes the handshaking stage. Now let’s move on to the meaty details. How does the actual email message get transferred from the SMTP client to the SMTP server?
Second: The message transfer
Before starting the actual message transfer, the SMTP client sends one more command called “DATA” to the server just to make sure that the server side is ready.
Alice’s mail server responds with code “354” indicating that it’s ready to receive the message.
After receiving this code from the server, the client is now ready to send the email message.
Believe it or not, the actual email message is sent line by line. The server does not acknowledge each individual line, though. It just waits for the special “End of Mail” line, which is a line that contains nothing but a “.” (period or full stop).
When the client sends a “.” to the server, this indicates that the client is done with sending the email message. This also tells the server that it can start processing the message now.
After Alice’s mail server receives the “.”, it acknowledges receiving the whole message by sending a 250 code back to the client.
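Here is a minimal sketch of what the client puts on the wire after it receives the 354 reply. The `data_payload` name is made up, and the sketch assumes a plain ASCII message in which no line starts with a “.” (a real client needs one extra rule for that case).

```python
def data_payload(message: str) -> bytes:
    """Frame a message for the DATA phase.

    Each line is terminated with CRLF, and a final line holding only "."
    marks the end of the mail.
    Assumption: no line of `message` begins with ".".
    """
    lines = message.splitlines()
    framed = "".join(line + "\r\n" for line in lines) + ".\r\n"
    return framed.encode("ascii")


payload = data_payload("Hi Alice,\nSee you soon.")
```

Here `payload` is `b"Hi Alice,\r\nSee you soon.\r\n.\r\n"`. The server stays silent until it sees the last three bytes, then answers with 250.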
And that’s it. This is how the email message that Bob composed on his laptop ends up on a Yahoo machine, waiting for Alice to fetch and read it. But one thing is still missing: closing the connection between the SMTP client and the SMTP server.
Third: Closing the connection
This is very simple and straightforward.
Bob’s mail server sends a “QUIT” command to Alice’s mail server to indicate its intention to close the connection, and Alice’s mail server responds with a “221” code.
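To tie the three phases together, here is a toy, in-memory stand-in for Alice’s mail server that answers each client line with the codes described above. It is a sketch under heavy assumptions, not a real SMTP server: the `ToySmtpServer` class is invented for illustration, there is no networking, addresses are compared case-insensitively for simplicity, and only the commands from this article are understood.

```python
class ToySmtpServer:
    """In-memory stand-in for Alice's mail server (illustration only)."""

    def __init__(self, domain, users):
        self.domain = domain
        self.users = {u.lower() for u in users}
        self.in_data = False
        self.sender = None
        self.recipients = []
        self.buffer = []
        self.delivered = []  # list of (sender, recipients, body)

    def greeting(self):
        # Sent once, right after the TCP connection is established.
        return f"220 {self.domain} ready"

    def handle(self, line):
        """Return the reply to one client line, or None while inside DATA."""
        if self.in_data:
            if line == ".":
                body = "\n".join(self.buffer)
                self.delivered.append((self.sender, list(self.recipients), body))
                self.in_data, self.sender = False, None
                self.recipients, self.buffer = [], []
                return "250 OK: message accepted"
            self.buffer.append(line)  # message lines are not acknowledged
            return None
        verb = line.upper()
        if verb.startswith("EHLO"):
            return f"250 {self.domain}"
        if verb.startswith("MAIL FROM:"):
            self.sender = line[10:].strip().strip("<>")
            return "250 OK"
        if verb.startswith("RCPT TO:"):
            address = line[8:].strip().strip("<>")
            if address.lower() not in self.users:
                return "550 No such user"
            self.recipients.append(address)
            return "250 OK"
        if verb == "DATA":
            if not self.recipients:
                return "503 Bad sequence of commands"
            self.in_data = True
            return "354 End data with <CRLF>.<CRLF>"
        if verb == "QUIT":
            return "221 Bye"
        return "500 Unrecognized command"


server = ToySmtpServer("yahoo.com", ["Alice@yahoo.com"])
client_lines = [
    "EHLO gmail.com",
    "MAIL FROM:<Bob@gmail.com>",
    "RCPT TO:<Alice@yahoo.com>",
    "DATA",
    "Hi Alice,",
    "Lunch tomorrow?",
    ".",
    "QUIT",
]
transcript = [server.greeting()]
for line in client_lines:
    reply = server.handle(line)
    if reply is not None:
        transcript.append(reply)
codes = [int(reply[:3]) for reply in transcript]
```

Running the session leaves `codes` equal to `[220, 250, 250, 250, 354, 250, 221]`, which is the same sequence of replies walked through in the three phases.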
A word about user agents
Let’s talk a little about user agents.
In our scenario, we know that Bob used his user agent to push his email message to his mail server. We also know that Alice used hers to fetch and read Bob’s email but I never really talked about the mechanics of that.
Let’s first talk about Bob’s side of the story, pushing his email message to his mail server at gmail.
It turns out, Bob’s user agent can also use the SMTP protocol to send Bob’s message to his mail server.
Think about it.
It is exactly the same process, but with Bob’s user agent being the SMTP client and Bob’s mail server being the SMTP server.
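If you want to play the role of Bob’s user agent yourself, Python’s standard `smtplib` module can act as the SMTP client. The following is a sketch, not a recipe for any particular provider: `build_message` and `submit` are my own helper names, and the host, port, and credentials are placeholders you would have to replace with whatever your mail provider documents. Unlike the server-to-server hop, a mail server normally asks its own users to log in before accepting a message from them, which is why the sketch upgrades the connection to TLS and authenticates first.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Compose the message, as Bob does in his Mail app."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(msg: EmailMessage, host: str, port: int, username: str, password: str) -> None:
    """Push the message to the user's own mail server over SMTP.

    host, port, username, and password are placeholders (assumptions).
    """
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()                  # encrypt the connection
        smtp.login(username, password)   # prove that we really are Bob
        smtp.send_message(msg)           # MAIL FROM, RCPT TO, DATA under the hood


msg = build_message("Bob@gmail.com", "Alice@yahoo.com", "Hello", "Hi Alice!")
# submit(msg, "smtp.example.com", 587, "Bob@gmail.com", "app-password")  # not run here
```

The `submit` call is commented out because it needs a real server and real credentials.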
For Alice, though, it’s different. Alice does not want to push an email to her mail server. She wants to fetch and read messages already stored in her Yahoo mailbox. For that, there are two popular protocols that her user agent could use. You have probably heard of them before: POP and IMAP.
Needless to say, these are not the only ways for a user agent to interact with a mail server.
In fact, nowadays our user agents are oftentimes our browsers (we go to yahoo.com or gmail.com to send and read our emails). Browsers send and receive HTTP messages, so there is no SMTP or POP/IMAP involved between the browser and the mail provider. However, the communication between the Gmail mail server and the Yahoo mail server is still governed by the SMTP protocol, as I explained earlier.
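For the case where Alice’s user agent does speak IMAP, here is a rough sketch of the fetching side using Python’s standard `imaplib` module. The helper names `summarize` and `fetch_unseen` are mine, and the host and credentials are placeholders; a real provider will document its own server name and login requirements.

```python
import email
import imaplib
from email import policy


def summarize(raw: bytes) -> str:
    """Turn the raw bytes of one message into a 'sender: subject' line."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    return f"{msg['From']}: {msg['Subject']}"


def fetch_unseen(host: str, username: str, password: str) -> list[str]:
    """Fetch unread messages from the inbox and summarize them.

    host, username, and password are placeholders (assumptions).
    """
    summaries = []
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(username, password)
        imap.select("INBOX", readonly=True)       # do not mark messages as read
        _, data = imap.search(None, "UNSEEN")
        for num in data[0].split():
            _, parts = imap.fetch(num, "(RFC822)")
            summaries.append(summarize(parts[0][1]))  # parts[0][1] is the raw message
    return summaries


# What the raw bytes of Bob's message might look like once stored (hypothetical).
raw = b"From: Bob@gmail.com\r\nSubject: Lunch\r\n\r\nHi Alice!\r\n"
summary = summarize(raw)
```

Only `summarize` is exercised here; `fetch_unseen` needs a real mailbox to talk to.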
Now my question to you is
I mentioned earlier that the SMTP client sends a “.” on a line by itself to indicate that it has transferred all the email message data.
What do you think would happen if the email message that Bob composed had a “.” on a line by itself? 🙂
To find out how the SMTP protocol handles such irresponsible behavior on Bob’s part, I encourage you to take a look at the RFC.
Don’t worry 🙂 I will make it easier for you. Here is where you should look.
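Spoiler alert, for readers who would like to check their answer in code: RFC 5321 (section 4.5.2, on transparency) has the client add an extra “.” in front of every message line that already starts with one, and has the server strip that leading “.” again on arrival. A minimal sketch, with `dot_stuff` and `dot_unstuff` as made-up helper names that work on lines without their line endings:

```python
def dot_stuff(lines: list[str]) -> list[str]:
    """Client side: protect lines that start with '.' before sending."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines: list[str]) -> list[str]:
    """Server side: undo the protection on the received message lines.

    Assumes the end-of-mail line (a lone '.') was already removed.
    """
    return [line[1:] if line.startswith(".") else line for line in lines]


original = ["Hi Alice,", ".", "..and one more thing"]
on_the_wire = dot_stuff(original)
restored = dot_unstuff(on_the_wire)
```

Bob’s lone “.” travels as “..”, so the server never mistakes it for the end of the mail, and Alice still sees exactly what Bob wrote.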