I've been having a spot of bother on my Ubuntu 22.04 box.

This has raised some questions about my understanding of how the internet connection works. As I understand it, for a working internet connection I need:

  1. a connection to my router acting as the gateway
  2. an IP address allocated by the router via DHCP
  3. DNS nameserver addresses, also provided by the router via DHCP

This is necessary but it seems not sufficient. What else is required?
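
To make item 1 concrete: for every outgoing packet the kernel picks the most specific matching route and sends the packet either directly or via the gateway. The following is a minimal Python sketch of that longest-prefix match, assuming the routes from the `netstat -rn` output quoted further down in this question. The `ROUTES` list and the `lookup` helper are hypothetical names for illustration, and the sketch ignores route metrics.

```python
import ipaddress

# Routes copied from the "not working" netstat -rn output in this question.
# Each entry: (destination network, gateway or None for on-link, interface).
ROUTES = [
    ("0.0.0.0/0", "192.168.0.1", "enp0s31f6"),
    ("169.254.0.0/16", None, "virbr0"),
    ("172.17.0.0/16", None, "docker0"),
    ("192.168.0.0/24", None, "enp0s31f6"),
    ("192.168.123.0/24", None, "virbr0"),
]


def lookup(dest, routes=ROUTES):
    """Return (next_hop, interface) for dest using longest-prefix match."""
    addr = ipaddress.ip_address(dest)
    matches = []
    for network, gateway, device in routes:
        net = ipaddress.ip_network(network)
        if addr in net:
            matches.append((net.prefixlen, gateway, device))
    # The default route (prefix length 0) always matches, so matches is non-empty.
    _, gateway, device = max(matches, key=lambda m: m[0])
    # An on-link route has no gateway: the packet goes straight to dest.
    return (gateway or dest, device)


print(lookup("8.8.8.8"))      # ('192.168.0.1', 'enp0s31f6')
print(lookup("192.168.0.1"))  # ('192.168.0.1', 'enp0s31f6')
```

With this table, a packet for 8.8.8.8 is handed to the router at 192.168.0.1, so anything beyond that point depends on the router forwarding it.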

Under systemd, NetworkManager "activates a network connection". This covers points 1-3 at least.

I have seen a situation where:

  • the connection is "activated"
  • The IP address allocated is correct
  • The DNS servers allocated are correct

but I cannot talk to the internet (unless I use recovery mode or a different PC on the same LAN):

  • ping times out
  • nslookup times out
  • There is a question mark overlaid on the LAN icon.

I am so used to this "just working" that I need to ask:

Q What needs to happen beyond "activating the network connection" for the network to actually work?

Q What does a question mark over the network icon mean? (following a system update this no longer appears)

Q What does "Activation of network connection" actually mean here?

This is a deliberately general question. For a specific example problem see the question I linked to on the first line.


One critical difference in the working case is this line from route:

default         _gateway        0.0.0.0         UG    100    0        0 enp0s31f6

See below for more details


Possibly irrelevant details below

For ping strace shows:

socket()
connect()  - "8.8.8.8"
setsockopt()
newfsstatat()
ioctl(TCGETS)
ioctl(TIOCGWINSZ)
sendto() "8.8.8.8"
recvmsg()

and recvmsg() fails with EAGAIN until it times out. So in some sense the kernel has allowed the connect() call to succeed but the connection doesn't seem to allow data to flow.

But for nslookup strace shows:

connect("127.0.0.53")

which is NOT the correct name server. This might be relevant. Something appears to be listening on 127.0.0.53 according to ss.

>tracepath -n -p 55 8.8.8.8.8
1?: [LOCALHOST]   pmtu 1500
1: no reply
...

and it's recvmsg() timing out again (after repeated EAGAINs)


More details of the specific problem:

Not working case:

> route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         192.168.0.1     0.0.0.0         UG    100    0        0 enp0s31f6
link-local      0.0.0.0         255.255.0.0     U     1000   0        0 virbr0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.0.0     0.0.0.0         255.255.255.0   U     100    0        0 enp0s31f6
192.168.123.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0
> ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s31f6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 70:4d:7b:64:62:35 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.17/24 brd 192.168.0.255 scope global dynamic noprefixroute enp0s31f6
       valid_lft 854001sec preferred_lft 854001sec
    inet6 fe80::ba46:11c7:b4d1:16b3/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: wlx503eaa945dc9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 50:3e:aa:94:5d:c9 brd ff:ff:ff:ff:ff:ff
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:ed:8b:39 brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.1/24 brd 192.168.123.255 scope global virbr0
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:fe:8d:1b:aa brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

> inxi -N
Network: Device-1: Intel Ethernet I219-V driver: e1000e
         Device-2: Realtek RTL8188EUS 802.11n Wireless Network Adapter type: USB driver: r8188eu

> netstat -i
Kernel Interface table
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
docker0   1500        0      0      0      0        0      0      0      0 BMU
enp0s31f  1500    27075      0      0      0     1608      0      0      0 BMRU
lo       65536     5441      0      0      0     5441      0      0      0 LRU
virbr0    1500        0      0      0      0        0      0      0      0 BMU

> netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.0.1     0.0.0.0         UG        0 0          0 enp0s31f6
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 virbr0
172.17.0.0      0.0.0.0         255.255.0.0     U         0 0          0 docker0
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 enp0s31f6
192.168.123.0   0.0.0.0         255.255.255.0   U         0 0          0 virbr0

Working case after:

  • boot in recovery mode
  • enable networking
  • resume normal boot

New output:

$ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.0.1     0.0.0.0         UG        0 0          0 enp0s31f6
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 enp0s31f6
172.17.0.0      0.0.0.0         255.255.0.0     U         0 0          0 docker0
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 enp0s31f6
192.168.123.0   0.0.0.0         255.255.255.0   U         0 0          0 virbr0

$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         _gateway        0.0.0.0         UG    100    0        0 enp0s31f6
link-local      0.0.0.0         255.255.0.0     U     1000   0        0 enp0s31f6
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.0.0     0.0.0.0         255.255.255.0   U     100    0        0 enp0s31f6
192.168.123.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0

$ netstat -i
Kernel Interface table
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
docker0   1500        0      0      0      0        0      0      0      0 BMU
enp0s31f  1500    85232      0      1      0    46876      0      0      0 BMRU
lo       65536     4250      0      0      0     4250      0      0      0 LRU
virbr0    1500        0      0      0      0        0      0      0      0 BMU

$ inxi -N
Network: Device-1: Intel Ethernet I219-V driver: e1000e
         Device-2: Realtek RTL8188EUS 802.11n Wireless Network Adapter type: USB driver: r8188eu

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s31f6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 70:4d:7b:64:62:35 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.17/24 brd 192.168.0.255 scope global dynamic noprefixroute enp0s31f6
       valid_lft 862734sec preferred_lft 862734sec
    inet6 fe80::ba46:11c7:b4d1:16b3/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: wlx503eaa945dc9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 50:3e:aa:94:5d:c9 brd ff:ff:ff:ff:ff:ff
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:ed:8b:39 brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.1/24 brd 192.168.123.255 scope global virbr0
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:05:f9:90:04 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

1 Answer

Your router can provide all three of the things you listed even if the router has no working upstream connection to your Internet Service Provider. This allows you to access the router's configuration interface (assuming it has one) with a web browser even if the router's Internet connection is not correctly configured yet.

So you should try and get some diagnostic information from the router itself: if it has indicator lights, how are they behaving? If you can access the router's web interface, does its status page indicate a working connection Internet-side?

There is also the possibility that the router might be "child-proofed" or otherwise restricted to allow internet connection to particular client system(s) only. For example, overly strict firewall rules in the router could cause your symptoms.

On modern operating systems, "activation of a network connection" often includes a test: the OS attempts to connect to some OS-vendor-maintained cloud service to verify that the connection actually works. If it doesn't, the system won't even try downloading updates and usually indicates a missing internet connection (e.g. with a question mark over the connection icon).

In NetworkManager, such a connection test is configurable and optional, but Ubuntu seems to have it enabled by default.
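
As a rough illustration of what such a test decides, here is a minimal Python sketch. It is a simplified model and not NetworkManager's actual code: the `classify_connectivity` helper and its `reply` argument are hypothetical, and the expected body text is an assumption. The state names "full", "portal" and "limited" are the ones `nmcli networking connectivity` reports.

```python
def classify_connectivity(reply, expected_body="NetworkManager is online"):
    """Classify the outcome of one HTTP probe to a vendor-run check URL.

    reply is None when the probe got no answer at all, otherwise a
    (status_code, body_text) tuple. This is a simplified model only.
    """
    if reply is None:
        # Link and addresses may be fine, but nothing came back.
        return "limited"
    _status, body = reply
    if body.startswith(expected_body):
        # The real check service answered with the expected text.
        return "full"
    # Something answered, but not the check service (e.g. a login page).
    return "portal"


print(classify_connectivity(None))
print(classify_connectivity((200, "NetworkManager is online\n")))
```

In this model the symptoms in the question (address and DNS assigned, but every probe times out) fall into the "limited" case.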


socket()
connect()  - "8.8.8.8"
setsockopt()
newfsstatat()
ioctl(TCGETS)
ioctl(TIOCGWINSZ)
sendto() "8.8.8.8"
recvmsg()

So if connect() and sendto() apparently succeeded, then as far as the kernel knows, the ping packet was sent out.

According to man 2 recvmsg, the EAGAIN error means:

EAGAIN or EWOULDBLOCK

The socket is marked nonblocking and the receive operation would block, or a receive timeout had been set and the timeout expired before data was received. POSIX.1 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities.

In other words, no answer was received. If there was no accompanying ICMP error packet describing the reason, then no further explanation is available from the system. Generally this means that some fault or configured restriction (e.g. a firewall rule) caused either your outgoing ping packet or the incoming ping response to be dropped, either at your router or at some point upstream of it.
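
The same pattern can be reproduced locally. The sketch below is a UDP analogue, not what ping itself does (ping uses ICMP): it sends a datagram to a socket that never replies and waits with a receive timeout. The `send_and_wait` helper and the silent listener on 127.0.0.1 are assumptions made for illustration. Python reports the expired timeout as `socket.timeout`, which plays the role of the repeated EAGAIN results seen in strace.

```python
import socket


def send_and_wait(addr, payload=b"ping", timeout=0.2):
    """Send one UDP datagram to addr and wait briefly for a reply.

    Returns (bytes_sent, reply), where reply is None if nothing arrived
    before the timeout.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.connect(addr)         # for UDP this only records the destination
        sent = s.send(payload)  # succeeds once the kernel accepts the datagram
        try:
            return sent, s.recv(1024)
        except socket.timeout:
            # No answer and no error report: all we know is that nothing came back.
            return sent, None


# A listener that receives the datagram but never answers it.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as silent:
    silent.bind(("127.0.0.1", 0))
    sent, reply = send_and_wait(silent.getsockname())
    print(sent, reply)
```

Both `connect()` and `send()` succeed here even though no reply ever arrives, which is why their success says nothing about whether the path to the destination works.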

In such a situation, one possible test is traceroute. Since firewalls are ubiquitous in modern networks, I've found it useful to use specifically either TCP- or UDP-based traceroute, directed to the specific port I expected to be working. So in this case:

sudo traceroute -n -T -p 53 8.8.8.8

attempts a TCP-based traceroute to port 53 of Google's well-known DNS server. Since the DNS protocol uses both TCP and UDP, and the TCP port provides a more robust response for traceroute, this should be able to tell you how far your outgoing packets can get before you stop receiving responses back.

If the last IP address you see in the traceroute output with packet travel times is your router's IP address, then that indicates the problem is in your router or between it and your ISP's router.

If you see any further IP addresses after your router's IP address, then the problem is somewhere within your ISP's equipment (more likely) or in the internet backbone connections between the ISPs and other organizations (less likely, as these are often multiply redundant). In both cases, you cannot fix it yourself: the best you can do is report the problem to your ISP and wait for them to fix it or reroute traffic around the break.
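
The reading of the traceroute output described above can be written down as a small decision function. This is only a sketch: `locate_fault` is a hypothetical helper, the hop list stands in for parsed traceroute output, and the "local" case (no hop answers at all, as in the tracepath output from the question) is an added assumption rather than something stated above.

```python
def locate_fault(hops, router_ip, target_ip):
    """Guess where packets stop, from the responding address at each TTL.

    hops is a list with one entry per TTL: the IP address that answered,
    or None where traceroute printed no reply.
    """
    answered = [hop for hop in hops if hop is not None]
    if not answered:
        # Not even the router answered: look at this host or the local link.
        return "local"
    if answered[-1] == target_ip:
        return "reachable"
    if answered[-1] == router_ip:
        # The router is the last hop seen: the router or its uplink.
        return "router-or-uplink"
    # Hops beyond the router answered: ISP equipment or further upstream.
    return "upstream"


print(locate_fault(["192.168.0.1", None, None], "192.168.0.1", "8.8.8.8"))
print(locate_fault([None, None, None], "192.168.0.1", "8.8.8.8"))
```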

But for nslookup strace shows:

connect("127.0.0.53")

which is NOT the correct name server. This might be relevant.

Didn't you read the answers to the question you linked? Your system is using systemd-resolved. Most applications use glibc's hostname resolution API, and the hosts: ...resolve... keyword in your /etc/nsswitch.conf directs it to communicate with systemd-resolved directly. For the sake of some old applications and DNS-specific tools like nslookup, systemd-resolved maintains a backwards-compatibility /etc/resolv.conf and a DNS resolver proxy on port 53 of 127.0.0.53 (i.e. an alternative IP address for localhost). You should run resolvectl to see the actual DNS settings of your system, and ignore the legacy /etc/resolv.conf.
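
A quick way to recognise this setup is to check whether every nameserver in /etc/resolv.conf is a loopback address. The Python sketch below does that on a text sample. The helper names are hypothetical and the sample contents are an assumption about what a systemd-resolved stub file typically looks like, not output taken from your machine.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the addresses listed on 'nameserver' lines of a resolv.conf."""
    servers = []
    for line in resolv_conf_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(ipaddress.ip_address(parts[1]))
    return servers


def uses_local_stub(resolv_conf_text):
    """True if all listed nameservers are loopback addresses (a local proxy)."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and all(addr.is_loopback for addr in servers)


# Illustrative sample, assumed to resemble a systemd-resolved stub file.
SAMPLE = "nameserver 127.0.0.53\noptions edns0 trust-ad\n"
print(uses_local_stub(SAMPLE))                     # True
print(uses_local_stub("nameserver 192.168.0.1\n"))  # False
```

When this returns True, the address in /etc/resolv.conf says nothing about the upstream servers received via DHCP; those are shown by resolvectl.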

But one problem at a time: since you apparently cannot even ping 8.8.8.8 by IP, you have an IP connectivity issue you must fix before you can even start worrying about DNS resolution.


One new thing that might cause your frustration is that the newest OSs may, by default, randomize your MAC address on every boot for privacy reasons. If your router has been configured with a specific IP reservation for your system or other host-specific settings, such randomization might throw a spanner in the works.

Your NetworkManager settings should include an option for disabling MAC address randomization. It might be that the randomization is skipped in recovery mode, as that might use a simplified procedure for starting up networking.
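
One way to check for this from the `ip addr` output: randomized MAC addresses are normally generated with the "locally administered" bit set, which is the second-lowest bit of the first octet. The following Python sketch tests that bit; `is_locally_administered` is a hypothetical helper name, and a cleared bit is a strong hint rather than proof that the address is the hardware one.

```python
def is_locally_administered(mac):
    """True if the MAC has the locally administered bit (0x02) set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


# MAC of enp0s31f6 from the question (identical in both ip addr outputs).
print(is_locally_administered("70:4d:7b:64:62:35"))  # False
# MAC of docker0, a software-created interface, for comparison.
print(is_locally_administered("02:42:fe:8d:1b:aa"))  # True
```

For the addresses quoted in the question, the Ethernet interface reports the same address with the bit cleared in both the working and non-working case, so it is worth confirming this on your system before changing the randomization setting.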

telcoM
  • For this question I am looking for a lower level explanation of how the connection should work. I know that my internet connection works. In fact if I enable the network in recovery-mode and then continue booting normally it works. What I don't understand is when it isn't working outside of recovery mode. Something else is missing but what? – Bruce Adams Oct 29 '22 at 16:08
  • For tracepath -n -p 55 8.8.8.8.8 I get:

     1?: [LOCALHOST] pmtu 1500
     1: no reply

    and it's recvmsg() timing out again (after repeated EAGAINs)

    – Bruce Adams Oct 29 '22 at 20:23