
I'm debugging Hadoop DataNodes that won't start. The machines also run SaltStack and Elasticsearch.

The Hadoop DataNode error is pretty clear:

java.net.BindException: Problem binding to [0.0.0.0:50020]    
java.net.BindException: Address already in use; 
      For more details see:  http://wiki.apache.org/hadoop/BindException

[...]

Caused by: java.net.BindException: Address already in use

[...]

(ExitUtil.java:terminate(124)) - Exiting with status 1

lsof -i -n shows that port 50020 is already in use, but only as the source port of outgoing connections, not as a destination port:

salt-mini 1733          root   25u  IPv4  17452      0t0  TCP xx.xx.132.72:50020->xx.xx.132.20:4505 (ESTABLISHED)
java      2789 elasticsearch 2127u  IPv6   9808      0t0  TCP xx.xx.132.72:50020->xx.xx.132.55:9300 (ESTABLISHED)

However binding on 0.0.0.0 does not seem to work:

root@host:~# nc -l 50020
nc: Address already in use

Is this intentional? Is binding to 0.0.0.0 disallowed when the port is already used as a source port? Nothing is listening on that port, so I don't really know why it shouldn't work.

Ubuntu 14.04:

root@host:~# uname -a
Linux host 4.2.0-19-generic #23~14.04.1-Ubuntu SMP Thu Nov 12 12:33:30 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Braiam
  • 35,991
  • This is normal behavior. see http://serverfault.com/questions/326819/does-the-tcp-source-port-have-to-be-unique-per-host – VenkatC Dec 14 '15 at 13:49

3 Answers

12

It does not matter if 50020 is a source or destination port: If it is claimed, it is claimed.

I would consider it a bug for a service to require a particular port in the range 49152-65535, as these are the ephemeral ports as defined by IANA. Many Linux distributions treat ports from 32768 upward as ephemeral. You can check the current ephemeral port range with:

cat /proc/sys/net/ipv4/ip_local_port_range

Any application may just use a port from the ephemeral range, so there is no guarantee that a particular port always will be free. Better to pick an unused port between 1024 and 32767.
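As a rough illustration of the check described above, the following Python sketch parses the two numbers from /proc/sys/net/ipv4/ip_local_port_range and reports whether a given port falls inside that range. The function names are made up for this example, and the /proc path only exists on Linux.

```python
from pathlib import Path

# Linux-only path; the file holds two whitespace-separated integers,
# the first and last port of the ephemeral range (both inclusive).
RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text):
    """Parse the contents of ip_local_port_range into (low, high)."""
    low, high = (int(field) for field in text.split())
    return low, high


def is_ephemeral(port, port_range):
    """Return True if port lies inside the inclusive (low, high) range."""
    low, high = port_range
    return low <= port <= high


if __name__ == "__main__" and RANGE_FILE.exists():
    current = parse_port_range(RANGE_FILE.read_text())
    verdict = "inside" if is_ephemeral(50020, current) else "outside"
    print(f"port 50020 is {verdict} the ephemeral range {current}")
```

If the port turns out to be inside the range, any outgoing connection on the machine may end up holding it at the moment the service tries to start.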

See some intro on ephemeral ports.

If you want to change the ephemeral range to accommodate the Hadoop DataNode requirement, you can do so by editing /etc/sysctl.conf and adding a line like the following:

net.ipv4.ip_local_port_range=56000 65000

Edit: Thanks to @mr.spuratic, who indirectly pointed out that with a recent enough kernel (the change was committed in May 2010), one can make exceptions to the range. This is the recommended approach, as changing the range itself is quite drastic.

sysctl -w net.ipv4.ip_local_reserved_ports=50020,50021

Quoting from Documentation/networking/ip-sysctl.txt

ip_local_reserved_ports - list of comma separated ranges

Specify the ports which are reserved for known third-party
applications. These ports will not be used by automatic port
assignments (e.g. when calling connect() or bind() with port
number 0). Explicit port allocation behavior is unchanged.

The format used for both input and output is a comma separated
list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and
10). Writing to the file will clear all previously reserved
ports and update the current list with the one given in the
input.

Note that ip_local_port_range and ip_local_reserved_ports
settings are independent and both are considered by the kernel
when determining which ports are available for automatic port
assignments.

You can reserve ports which are not in the current
ip_local_port_range, e.g.:

$ cat /proc/sys/net/ipv4/ip_local_port_range
32000   60999
$ cat /proc/sys/net/ipv4/ip_local_reserved_ports
8080,9148

although this is redundant. However such a setting is useful
if later the port range is changed to a value that will
include the reserved ports.

Default: Empty
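
The list format quoted above can be expanded mechanically. The following Python sketch turns the contents of ip_local_reserved_ports into a plain list of port numbers, which makes it easy to confirm that a port such as 50020 is actually covered. The function name is invented for this example, and the parsing follows only the format described in the quoted documentation.

```python
def parse_reserved_ports(text):
    """Expand a list such as "1,2-4,10-10" into a sorted list of ports."""
    ports = set()
    for item in text.strip().split(","):
        item = item.strip()
        if not item:
            continue  # an empty file means nothing is reserved
        first, _, last = item.partition("-")
        start = int(first)
        end = int(last) if last else start  # a single port is a range of one
        ports.update(range(start, end + 1))
    return sorted(ports)


if __name__ == "__main__":
    # The example string comes from the kernel documentation quoted above.
    print(parse_reserved_ports("1,2-4,10-10"))
```

On a Linux machine the input would be read from /proc/sys/net/ipv4/ip_local_reserved_ports.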
joepd
  • 2,397
1

I think that is normal behavior. If a port is in use, it's in use. Whether it is the source or the destination port does not matter.

0.0.0.0 means that you are trying to listen on that port on all of the machine's network addresses. So if you have two IP addresses, say 192.168.1.20 and 10.4.2.1, you can use the port twice, provided you specify the IP address each time.

1

However binding on 0.0.0.0 does not seem to work:

root@host:~# nc -l 50020
nc: Address already in use

Is this intentional? Is binding to 0.0.0.0 disallowed when the port is already used as a source port? Nothing is listening on that port, so I don't really know why it shouldn't work.

This is entirely normal. The special IP address 0.0.0.0 means "any" internet protocol address this machine answers to, which means it binds to every IP address on the system. Every TCP connection is a two-way stateful connection, and source/destination really only has meaning for the initial handshake. All that really matters to you is the port number on your side of the connection.

Consider what you are asking from the point of view of the IP stack. It has an existing TCP connection on xx.xx.132.72:50020 and you are trying to bind a listen socket to 0.0.0.0:50020. This special address expands to include xx.xx.132.72:50020, so the bind fails because the address is in use. If it didn't fail, how would the stack decide whether an IP packet inbound to that address should be delivered to your listen socket or to the pre-existing connection? Sure, you could envision a scheme to allow multiple sockets to share a port, but then you've re-engineered exactly the problem that ports solve in the first place.
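
The situation can be reproduced with a short Python sketch: it opens an outgoing loopback connection, notes the source port the kernel picked, and then tries to bind a new socket to 0.0.0.0 on that same port. This is only an illustration under a few assumptions: the function name is invented, it targets Linux, and the sockets are left at their defaults (in particular, SO_REUSEADDR is not set, which would change the bind rules).

```python
import errno
import socket


def wildcard_bind_conflicts():
    """Return True if binding 0.0.0.0:<port> fails with EADDRINUSE while
    <port> is the source port of an established outgoing connection."""
    # A throwaway listener on loopback so the client has something to reach.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        with socket.create_connection(server.getsockname()) as client:
            # The kernel picked this source port automatically.
            source_port = client.getsockname()[1]
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
                try:
                    probe.bind(("0.0.0.0", source_port))
                except OSError as exc:
                    return exc.errno == errno.EADDRINUSE
                return False


if __name__ == "__main__":
    print("wildcard bind refused:", wildcard_bind_conflicts())
```

On Linux the bind is expected to be refused, matching the nc -l 50020 failure in the question.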

Your listen socket probably has the better claim to the port number, since it needs to be reachable at a reliable location, so you need to change the other application's port. If that is not configurable, just stop the other application, start your server, and then restart the application, which will then use a different, available source port for its outgoing connection and no longer conflict with your server.

casey
  • 14,754