I'm not convinced I like Steffen Ullrich's wording on that topic. Sockets are naturally complex because they are a generalized interface which can be used for a number of very different protocols (See note 1).
To generalize between different protocols, the sockets interface identifies common features of different protocols:
- All protocols must have some address mechanism. (See note 1)
- All data is sent from an address to an address.
- Two types of protocol are useful to discuss here:
  - Some protocols represent a connection: SOCK_STREAM
  - Some protocols represent a series of messages: SOCK_DGRAM
SOCK_STREAM protocols are generally very different from SOCK_DGRAM protocols. But within each of these groups, protocols don't differ so much.
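To make the "generalized interface" point concrete, here is a minimal Python sketch (Python's socket module is a thin wrapper over the C calls). It assumes a Unix-like system where AF_UNIX is available; the variable names are mine, purely for illustration.

```python
import socket

# The same socket() call covers every combination; only the arguments change.
combos = [
    (socket.AF_INET, socket.SOCK_STREAM),   # TCP
    (socket.AF_INET, socket.SOCK_DGRAM),    # UDP
    (socket.AF_UNIX, socket.SOCK_STREAM),   # Unix domain, connection style
    (socket.AF_UNIX, socket.SOCK_DGRAM),    # Unix domain, message style
]

created = []
for family, kind in combos:
    with socket.socket(family, kind) as s:
        # Record what the kernel actually gave us back.
        created.append((s.family, s.type))

print(created)
```

The address family picks the addressing scheme, and the socket type picks which of the two groups the protocol falls into.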
SOCK_STREAM
- Example: TCP
- Data arrives in a long stream without any breaks.
- Data always arrives in the order it was sent
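As a rough illustration of the "no breaks" property, the sketch below sends two separate writes down a stream socket and reads them back in arbitrary 3-byte pieces. It assumes socket.socketpair() gives a connected SOCK_STREAM pair (the default on Unix-like systems); read_all is a helper I made up for the example.

```python
import socket

# Two already-connected stream sockets.
a, b = socket.socketpair()

a.sendall(b"hello")
a.sendall(b"world")
a.close()  # closing the sender lets the reader see end-of-stream

def read_all(sock, chunk_size):
    """Read until end-of-stream, at most chunk_size bytes per recv() call."""
    chunks = []
    while True:
        data = sock.recv(chunk_size)
        if not data:  # empty bytes: the peer closed the connection
            break
        chunks.append(data)
    return chunks

chunks = read_all(b, 3)
b.close()

stream = b"".join(chunks)
print(chunks)   # pieces need not line up with the two sendall() calls
print(stream)
```

The bytes come out in order, but where one write ended and the next began is not preserved.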
Connection-oriented protocols are always(?) a pair of sockets with data sent between them in a single connection. Connection protocols usually support some form of "listening" socket as well, whose sole purpose is to wait for new connection requests.
Think of connection-oriented sockets as two telephones with a line between them.
Calling connect() on one side and accept() on the other results in a new connection with two bound sockets (one on each side of the connection).
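Here is a small sketch of that connect()/accept() dance over TCP on the loopback interface. The port is chosen by the kernel (port 0) and all names are illustrative assumptions, not anything from the question.

```python
import socket

# Listening socket: its only job is to wait for connection requests.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
listener.listen()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())

# accept() hands back a NEW socket dedicated to this one connection.
conn, peer_addr = listener.accept()

client.sendall(b"ping")
request = conn.recv(4)
conn.sendall(b"pong")
reply = client.recv(4)

# Each end's peer address is the other end's own address.
same_line = (conn.getpeername() == client.getsockname()
             and client.getpeername() == conn.getsockname())
print(request, reply, same_line)

for s in (client, conn, listener):
    s.close()
```

Note that the listening socket never carries data itself; it stays around to accept further connections.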
SOCK_DGRAM
- Example: UDP
- Data arrives in the same blocks (messages) it was sent, not mashed into a continuous stream
- Messages might not arrive in order.
Datagram-oriented protocols are very different. Sockets can be configured to act a little like mail boxes, receiving messages from anywhere. There's no requirement to have a connection. In the case of UDP, any packet sent to the right IP and port from any IP and port will be picked up by the same socket. So you can hold conversations with several different computers through the same socket.
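The mail box behaviour can be sketched like this: one UDP socket receives from two unrelated senders and answers each of them through that same socket. Everything runs on loopback with kernel-chosen ports; the names (mailbox, sender_a, sender_b) are just for illustration.

```python
import socket

def udp_socket():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))   # kernel picks a free port
    s.settimeout(2.0)          # do not hang forever if a datagram is lost
    return s

mailbox, sender_a, sender_b = udp_socket(), udp_socket(), udp_socket()
mailbox_addr = mailbox.getsockname()
addr_a, addr_b = sender_a.getsockname(), sender_b.getsockname()

# No connection is set up; each message carries its destination.
sender_a.sendto(b"from a", mailbox_addr)
sender_b.sendto(b"from b", mailbox_addr)

received = {}
for _ in range(2):
    data, addr = mailbox.recvfrom(1024)   # addr tells us who sent it
    received[addr] = data

# Reply to every correspondent through the one mailbox socket.
for addr in received:
    mailbox.sendto(b"ack", addr)

reply_a, reply_a_from = sender_a.recvfrom(1024)
reply_b, reply_b_from = sender_b.recvfrom(1024)
print(received)

for s in (mailbox, sender_a, sender_b):
    s.close()
```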
There is a special meaning for connect() on SOCK_DGRAM sockets. From the connect(2) man page:
> If the socket sockfd is of type SOCK_DGRAM, then addr is the address to which datagrams are sent by default, and the only address from which datagrams are received.
This does not create a new unique connection. It just limits which messages this one socket will receive; it places NO restriction on the socket on the other side.
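A sketch of that filtering effect, assuming UDP on loopback and the behaviour described in the man page quote. The names receiver, friend and stranger are hypothetical, and the short timeout is only there so the example can show that nothing else arrives.

```python
import socket

def udp_socket():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(0.5)
    return s

receiver, friend, stranger = udp_socket(), udp_socket(), udp_socket()
receiver_addr = receiver.getsockname()

# No handshake happens here; the kernel just records a default peer.
receiver.connect(friend.getsockname())

stranger.sendto(b"from stranger", receiver_addr)
friend.sendto(b"from friend", receiver_addr)

first = receiver.recv(1024)
try:
    second = receiver.recv(1024)
except socket.timeout:
    second = None   # the stranger's datagram was filtered out

# send() now works without an address: it goes to the default peer.
receiver.send(b"hi")
got, got_from = friend.recvfrom(1024)

# The other side is not restricted: friend still hears from anyone.
stranger.sendto(b"hello friend", friend.getsockname())
also_got, _ = friend.recvfrom(1024)
print(first, second, got, also_got)

for s in (receiver, friend, stranger):
    s.close()
```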
The sendto() function allows the program to send a message and specify the address to send it to.
Unix domain sockets
These come in both flavors (SOCK_STREAM and SOCK_DGRAM) just as internet sockets do (TCP and UDP). So for SOCK_STREAM unix sockets, yes they just have one connection between a pair of sockets. But SOCK_DGRAM unix sockets are different (just as UDP is different from TCP).
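To show the SOCK_DGRAM case for Unix sockets, here is a sketch where one Unix datagram socket receives from two others. It assumes a Unix-like system; the socket paths live in a temporary directory and the names mailbox, alice and bob are made up for the example.

```python
import os
import socket
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    paths = {name: os.path.join(tmp, name) for name in ("mailbox", "alice", "bob")}
    socks = {}
    for name, path in paths.items():
        s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        s.bind(path)        # the filesystem path is this socket's address
        s.settimeout(2.0)
        socks[name] = s

    # Two different senders, one receiving socket, no connection anywhere.
    socks["alice"].sendto(b"hi from alice", paths["mailbox"])
    socks["bob"].sendto(b"hi from bob", paths["mailbox"])

    inbox = {}
    for _ in range(2):
        data, sender_path = socks["mailbox"].recvfrom(1024)
        inbox[os.path.basename(sender_path)] = data

    for s in socks.values():
        s.close()

print(inbox)
```

The senders are bound to their own paths here only so that recvfrom() has a sender address to report.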
Note 1: Unix sockets are funky because there is no underlying protocol as such; they are a construct of the kernel. The kernel is free to use its own socket inodes as the address mechanism. As mosvy points out, this can lead to odd behaviour when you try to determine the address of a Unix socket.
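One way to see that odd behaviour for yourself is to ask a socketpair() for its addresses. This is a sketch assuming Linux, where such sockets are unnamed and I would expect Python to report an empty address; other systems may report something different.

```python
import socket

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Neither end was ever bound to a path, so there is no name to report.
names = [a.getsockname(), a.getpeername(), b.getsockname(), b.getpeername()]
print(names)

# The pair is still fully usable despite having no visible address.
a.sendall(b"x")
works = b.recv(1) == b"x"

a.close()
b.close()
```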
1) "Two types of protocol are useful to discuss here:" I didn't say there were only 2. 2) Yes they do. You must have an identifier for the end point. I read somewhere that Unix sockets use inodes as addresses. Hunting for a reference. 3) A listening socket is implied by protocols such as TCP (it's not just API). There must be something waiting for the initial SYN packet, ready to send a SYN-ACK packet and receive the final ACK packet before the new socket (connection) is open.

[...] SOCK_DIAG/UNIX_DIAG_VFS by tools like ss). But let's cut it here, mkay? Instead of telling me that I'm wrong you could've just called getpeername() and getsockname() on any of the sockets returned by socketpair(). – Mar 04 '19 at 09:54