Socket 101
Just finished the first part of Beej's Guide to Network Programming and the first chapter of UNIX Network Programming, and here is what I have learned.
Both books are great, but as far as I can tell, in the first chapter the UNP book gives concrete examples without explaining enough, while the BGNET booklet goes into a lot of detail but gives no concrete examples so far. So why not combine the examples from UNP with Beej's explanations?
Sockets
Everything is a file in Unix, and so is a socket. A socket is a file connected to another socket, almost like a dokodemo door. Every time you want to communicate with someone on the Internet, you just open a socket and read from (or write to) it.
int sockfd, n; // fd for file descriptor
char recvline[MAXLINE + 1]; // store content

if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    // report error
Here AF_INET specifies that we are using the IPv4 Internet protocols, and SOCK_STREAM specifies the type of communication we need, that is (I'm quoting the man page here) sequenced, reliable, two-way, connection-based byte streams. The third argument, 0, asks for the default protocol for that combination, which is TCP.
Read From a Socket
Since a socket is just a file, you can use read() and write() with it as with any other file, though there are more specialized tools like send() and recv() (we are not using them here).
while ((n = read(sockfd, recvline, MAXLINE)) > 0) {
    recvline[n] = 0; // null-terminate what we just read
    // print content out with fputs()
}
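To make the loop above concrete, here is a minimal sketch of it as a standalone function. The name print_until_eof and the MAXLINE value are my own assumptions (UNP defines its own MAXLINE in its header), and it works on any file descriptor, not just a socket.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

#define MAXLINE 4096 /* assumed buffer size; UNP defines its own MAXLINE */

/* Hypothetical helper: read from fd until EOF and print each chunk to
 * out. Returns the total number of bytes read, or -1 on error. */
long print_until_eof(int fd, FILE *out)
{
    char recvline[MAXLINE + 1]; /* +1 leaves room for the terminator */
    long total = 0;
    ssize_t n;

    assert(out != NULL);
    while ((n = read(fd, recvline, MAXLINE)) > 0) {
        recvline[n] = '\0'; /* read() does not null-terminate for us */
        if (fputs(recvline, out) == EOF)
            return -1;
        total += n;
    }
    return n < 0 ? -1 : total; /* n == 0 means the other side closed */
}
```

Calling print_until_eof(sockfd, stdout) after a successful connect() would play the role of read_from_socket_and_print(). The loop matters because a single read() may return only part of what the server sent.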
OK, but how do we connect our socket with another socket to begin with? Well, we use connect(), which is a bit like fopen() and takes three arguments: a socket file descriptor, a pointer to a socket address structure (struct sockaddr), and finally the size of that structure. (On the other hand, if you are setting up a server that listens for requests, you use bind(), followed by listen() and accept(), instead.)
if (connect(sockfd, (SA *)&servaddr, sizeof(servaddr)) < 0) // SA is UNP's shorthand for struct sockaddr
    // report error

--------------------------------------------
// by the way, our program should look like this now
// so you don't get lost easily
open_socket();
prepare_address_structure(); // <-- going to talk about this
connect_to_another_socket();
read_from_socket_and_print();
exit(0);
Addresses
But why do we need a structure to represent an address? Isn't it just a four-byte number (255.255.255.255)? Well, 255.255.255.255 on its own says neither that it is an IPv4 address nor which port we want: we need to tell connect() explicitly that we are using IPv4, and we must also specify a port number, so we need a structure. A socket address structure looks like this
struct sockaddr {
    unsigned short sa_family; // address family, AF_xxx
    char sa_data[14]; // 14 bytes of protocol address
};
So where are we going to put the address and port? You can see this is a "generic" structure that just sets aside some space for whatever protocol address you give it, and sa_data is more like a placeholder. (An IPv6 address does not actually fit in those 14 bytes, which is one reason connect() also takes the size of the real structure.) Obviously, when we are preparing socket addresses we need to be more specific, so we have struct sockaddr_in for IPv4 and struct sockaddr_in6 for IPv6 (in for Internet). An Internet socket address structure looks like this
struct sockaddr_in {
    short int sin_family; // Address family, AF_INET
    unsigned short int sin_port; // Port number
    struct in_addr sin_addr; // Internet address
    unsigned char sin_zero[8]; // Same size as struct sockaddr
};
Address: Family and Port
So now everything is simple, and we just need to fill in the blanks. First comes the socket address family, which for IPv4 is AF_INET (a constant already defined for us in <sys/socket.h>).
servaddr.sin_family = AF_INET;
Then we have the port number. The book uses port 13 for this daytime client example, so we just give it 13.
servaddr.sin_port = htons(13);
No, wait, what is this htons()? Well, it's a byte-order conversion function. We have all kinds of computers around the world: some store the most significant byte of a number first, like 00 2a ff 33, and we call these big-endian computers, while others store the bytes in the opposite order, like 33 ff 2a 00, and we call these little-endian computers. We never want to send a stream of big-endian bytes to a little-endian machine unconverted, but how do we know which format our computer is using? We don't need to know: we just convert the number into a common format before sending, and the other machine converts it back when it arrives. The common format is called Network Byte Order, which is in fact big-endian. So htons stands for Host TO Network conversion for Short numbers (such as ports), and there are also htonl, ntohl, ntohs, etc.
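To see what htons() actually does to the bytes, here is a tiny sketch. The helper name port_to_wire is hypothetical; it just copies the converted port into a byte array so the layout can be inspected.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a port number into wire[] exactly as it
 * would sit in sin_port, so the byte layout can be inspected. */
void port_to_wire(uint16_t host_port, unsigned char wire[2])
{
    uint16_t net_port = htons(host_port); /* host -> network byte order */

    assert(wire != NULL);
    memcpy(wire, &net_port, sizeof net_port);
}
```

For port 13 the two bytes come out as 00 0d on any machine, because network byte order always puts the most significant byte first. On a big-endian host htons() simply has nothing to do.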
Address Itself
Now that we have our port number ready, we can prepare our address. But why do we need another structure, struct in_addr, to store an address? It's mostly for historical reasons: in_addr used to be implemented as a union that let you get at the same four bytes in several ways (which I think makes more sense), and the struct wrapper has stayed around ever since. For IPv4, the Internet address structure looks like this (for IPv6, it's in6_addr)
struct in_addr {
    uint32_t s_addr; // that's a 32-bit int (4 bytes)
};
So we need to put the address of the destination (the server) into servaddr.sin_addr.s_addr. To do this we need another function
if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0)
    // report error
Then what's this inet_pton? Well, we know computers don't read numbers like 1.1.1.1 (they don't even read base 10), so we need to convert this string to a real four-byte number so that our computers can understand the address. This conversion is called Presentation TO Network conversion, hence pton, and there is also a Network TO Presentation conversion, a.k.a. ntop, which you will need when you want to, for example, print addresses.
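Here is a short sketch of both directions in one place. The helper ipv4_round_trip is a made-up name; inet_pton() and inet_ntop() are the real functions described above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: text -> four bytes -> text again.
 * Returns 0 on success, -1 if text is not a valid dotted-quad address
 * or out is too small to hold the result. */
int ipv4_round_trip(const char *text, char *out, size_t outlen)
{
    struct in_addr addr;

    assert(text != NULL && out != NULL);
    /* presentation -> network: returns 1 only for a valid address */
    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;
    /* network -> presentation: returns NULL on failure */
    if (inet_ntop(AF_INET, &addr, out, (socklen_t)outlen) == NULL)
        return -1;
    return 0;
}
```

A buffer of INET_ADDRSTRLEN bytes is always large enough for an IPv4 address in text form. Note that inet_pton() returns 0 (not a negative number) when the string is simply not a valid address, which is why the UNP code checks for <= 0.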
Summary
The program currently looks like this
open_socket();
prepare_address_structure();
connect_to_another_socket();
read_from_socket_and_print();
And the prepare_address_structure() part is the most complicated. In summary, it involves the following pieces and steps
// eventually we will pass connect() a generic sockaddr
// that can be used to cover all protocols
struct sockaddr {
    type sa_family;
    type sa_data[14];
};

// but for each protocol, we use a parallel structure
// before eventually casting back to the generic one
// for Internet IPv4 it's
struct sockaddr_in {
    type sin_family; // use predefined constants
    type sin_port;   // use htons
    type sin_addr;   // a struct for historical reasons
                     // use inet_pton
    type sin_zero;   // for padding
};
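Put together, the steps above could look something like this. This is only a sketch: fill_ipv4_address is my own stand-in for prepare_address_structure(), and taking the address and port as parameters is an assumption for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper standing in for prepare_address_structure().
 * Returns 0 on success, -1 if ip is not a valid IPv4 address string. */
int fill_ipv4_address(struct sockaddr_in *servaddr, const char *ip, uint16_t port)
{
    assert(servaddr != NULL && ip != NULL);
    memset(servaddr, 0, sizeof *servaddr); /* also clears sin_zero */
    servaddr->sin_family = AF_INET;        /* predefined constant */
    servaddr->sin_port = htons(port);      /* network byte order */
    if (inet_pton(AF_INET, ip, &servaddr->sin_addr) != 1)
        return -1;
    return 0;
}
```

For the daytime example this would be called as fill_ipv4_address(&servaddr, argv[1], 13). Zeroing the whole structure first is the usual habit, so the padding bytes never contain garbage.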
Conclusion
Now that we have filled in all the blanks of the sockaddr_in structure, we can cast a pointer to it into a pointer to sockaddr and pass that to the connect function. connect will then handle everything correctly, and so will our read loop.
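For reference, here is a rough sketch of the four steps in one function. It is not the UNP listing: the name fetch_and_print, the MAXLINE value and the plain -1 error returns are my assumptions, and I check the address string first so a bad one fails before any socket is opened.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAXLINE 4096 /* assumed buffer size */

/* Hypothetical all-in-one version of the four steps.
 * Returns 0 on success, -1 on any failure. */
int fetch_and_print(const char *ip, uint16_t port, FILE *out)
{
    struct sockaddr_in servaddr;
    char recvline[MAXLINE + 1];
    ssize_t n;
    int sockfd;

    assert(ip != NULL && out != NULL);

    /* prepare_address_structure() */
    memset(&servaddr, 0, sizeof servaddr);
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;

    /* open_socket() */
    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;

    /* connect_to_another_socket(): cast to the generic structure here */
    if (connect(sockfd, (struct sockaddr *)&servaddr, sizeof servaddr) < 0) {
        close(sockfd);
        return -1;
    }

    /* read_from_socket_and_print() */
    while ((n = read(sockfd, recvline, MAXLINE)) > 0) {
        recvline[n] = '\0';
        if (fputs(recvline, out) == EOF) {
            close(sockfd);
            return -1;
        }
    }
    close(sockfd);
    return n < 0 ? -1 : 0;
}
```

A main() for the daytime client would then just call fetch_and_print(argv[1], 13, stdout), which of course only prints something if a daytime server is actually listening at that address.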
(BGNET mentions an addrinfo structure and a getaddrinfo() function to work with it, but they are not used in this example and honestly I don't really understand this part yet, so let's just skip it for now.)