3

I have a strange situation: on one server I get the following results:

vagrant@shopping:/vagrant/deployer-example$ uname -a
Linux shopping 4.19.0-0.bpo.9-amd64 #1 SMP Debian 4.19.118-2+deb10u1~bpo9+1 (2020-06-09) x86_64 GNU/Linux

vagrant@shopping:/vagrant/deployer-example$ bin/php --version
PHP 8.0.3 (cli) (built: Mar  5 2021 08:36:11) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.3, Copyright (c) Zend Technologies
    with Zend OPcache v8.0.3, Copyright (c), by Zend Technologies
vagrant@shopping:/vagrant/deployer-example$ sudo ldd --version
ldd (Debian GLIBC 2.24-11+deb9u4) 2.24
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
vagrant@shopping:/vagrant/deployer-example$ php -r 'var_dump(glob("/vagrant/deployer-example/config/{routes}/*.yaml", GLOB_BRACE));'
array(4) {
  [0]=>
  string(56) "/vagrant/deployer-example/config/routes/annotations.yaml"
  [1]=>
  string(55) "/vagrant/deployer-example/config/routes/easy_admin.yaml"
  [2]=>
  string(52) "/vagrant/deployer-example/config/routes/monitor.yaml"
  [3]=>
  string(59) "/vagrant/deployer-example/config/routes/nelmio_api_doc.yaml"
}
vagrant@shopping:/vagrant/deployer-example$ php -r 'var_dump(glob("/vagrant/deployer-example/config/{routes}/*.yaml/", GLOB_BRACE));'
array(4) {
  [0]=>
  string(56) "/vagrant/deployer-example/config/routes/annotations.yaml"
  [1]=>
  string(55) "/vagrant/deployer-example/config/routes/easy_admin.yaml"
  [2]=>
  string(52) "/vagrant/deployer-example/config/routes/monitor.yaml"
  [3]=>
  string(59) "/vagrant/deployer-example/config/routes/nelmio_api_doc.yaml"
}

But on the second system:

deployer-example@s2-stg-s01:~/deployer/current$ uname -a
Linux s2-stg-s01 4.19.0-0.bpo.9-amd64 #1 SMP Debian 4.19.118-2+deb10u1~bpo9+1 (2020-06-09) x86_64 GNU/Linux
deployer-example@s2-stg-s01:~/deployer/current$ bin/php --version
PHP 8.0.3 (cli) (built: Mar  5 2021 08:36:11) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.3, Copyright (c) Zend Technologies
    with Zend OPcache v8.0.3, Copyright (c), by Zend Technologies
deployer-example@s2-stg-s01:~/deployer/current$ ldd --version
ldd (Debian GLIBC 2.24-11+deb9u4) 2.24
Copyright © 2016 Free Software Foundation, Inc.
Dies ist freie Software; in den Quellen befinden sich die Lizenzbedingungen.
Es gibt KEINERLEI Garantie; nicht einmal für die TAUGLICHKEIT oder
VERWENDBARKEIT FÜR EINEN ANGEGEBENEN ZWECK.
Implementiert von Roland McGrath und Ulrich Drepper.
deployer-example@s2-stg-s01:~/deployer/current$ php -r 'var_dump(glob("/home/deployer-example/deployer/releases/437/config/{routes}/*.yaml", GLOB_BRACE));'
array(4) {
  [0]=>
  string(75) "/home/deployer-example/deployer/releases/437/config/routes/annotations.yaml"
  [1]=>
  string(74) "/home/deployer-example/deployer/releases/437/config/routes/easy_admin.yaml"
  [2]=>
  string(71) "/home/deployer-example/deployer/releases/437/config/routes/monitor.yaml"
  [3]=>
  string(78) "/home/deployer-example/deployer/releases/437/config/routes/nelmio_api_doc.yaml"
}
deployer-example@s2-stg-s01:~/deployer/current$ php -r 'var_dump(glob("/home/deployer-example/deployer/releases/437/config/{routes}/*.yaml/", GLOB_BRACE));'
array(0) {
}

I debugged this with strace and found that on the first system it reports:

open("/vagrant/deployer-example/config/routes", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_dev=makedev(0, 44), st_ino=86530891, st_mode=S_IFDIR|0755, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=8, st_size=224, st_atime=2021-03-12T15:24:14+0000, st_mtime=2021-03-12T15:19:23+0000, st_ctime=2021-03-12T15:19:23+0000}) = 0
brk(0x562b5bb63000)                     = 0x562b5bb63000
getdents(3, [{d_ino=1031795, d_off=1, d_reclen=24, d_name=".", d_type=DT_UNKNOWN}, {d_ino=1031796, d_off=2, d_reclen=24, d_name="..", d_type=DT_UNKNOWN}, {d_ino=1031797, d_off=3, d_reclen=40, d_name="annotations.yaml", d_type=DT_UNKNOWN}, {d_ino=1031798, d_off=4, d_reclen=24, d_name="dev", d_type=DT_UNKNOWN}, {d_ino=1031799, d_off=5, d_reclen=40, d_name="easy_admin.yaml", d_type=DT_UNKNOWN}, {d_ino=1031800, d_off=6, d_reclen=32, d_name="monitor.yaml", d_type=DT_UNKNOWN}, {d_ino=1031801, d_off=7, d_reclen=40, d_name="nelmio_api_doc.yaml", d_type=DT_UNKNOWN}], 32768) = 224
newfstatat(3, "annotations.yaml", {st_dev=makedev(0, 44), st_ino=87279600, st_mode=S_IFREG|0644, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=8, st_size=135, st_atime=2021-03-12T15:19:24+0000, st_mtime=2021-03-12T15:19:23+0000, st_ctime=2021-03-12T15:19:23+0000}, 0) = 0
newfstatat(3, "easy_admin.yaml", {st_dev=makedev(0, 44), st_ino=87279603, st_mode=S_IFREG|0644, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=8, st_size=127, st_atime=2021-03-05T16:00:41+0000, st_mtime=2021-03-05T16:00:41+0000, st_ctime=2021-03-05T16:00:41+0000}, 0) = 0
newfstatat(3, "monitor.yaml", {st_dev=makedev(0, 44), st_ino=87279604, st_mode=S_IFREG|0644, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=8, st_size=137, st_atime=2021-03-05T16:00:41+0000, st_mtime=2021-03-05T16:00:41+0000, st_ctime=2021-03-05T16:00:41+0000}, 0) = 0
newfstatat(3, "nelmio_api_doc.yaml", {st_dev=makedev(0, 44), st_ino=87279605, st_mode=S_IFREG|0644, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=8, st_size=367, st_atime=2021-03-05T16:00:41+0000, st_mtime=2021-03-05T16:00:41+0000, st_ctime=2021-03-05T16:00:41+0000}, 0) = 0
getdents(3, [], 32768)                  = 0
close(3)                                = 0

but on the second one:

open("/home/deployer-example/deployer/releases/437/config/routes", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 5
fstat(5, {st_dev=makedev(8, 17), st_ino=123601048, st_mode=S_IFDIR|0755, st_nlink=3, st_uid=1084, st_gid=1084, st_blksize=4096, st_blocks=8, st_size=4096, st_atime=2021-03-12T16:18:27+0100.512902323, st_mtime=2021-03-12T16:17:55+0100.984164843, st_ctime=2021-03-12T16:17:55+0100.984164843}) = 0
brk(0x557145c2a000)                     = 0x557145c2a000
getdents(5, [{d_ino=123601055, d_off=2396884471359371616, d_reclen=40, d_name="nelmio_api_doc.yaml", d_type=DT_REG}, {d_ino=123601053, d_off=2713394375622026215, d_reclen=32, d_name="monitor.yaml", d_type=DT_REG}, {d_ino=123601049, d_off=3087430337415459848, d_reclen=40, d_name="easy_admin.yaml", d_type=DT_REG}, {d_ino=123601050, d_off=7440370618885146051, d_reclen=24, d_name="dev", d_type=DT_DIR}, {d_ino=123600993, d_off=8681975314043990400, d_reclen=24, d_name="..", d_type=DT_DIR}, {d_ino=123601479, d_off=9220757504948275036, d_reclen=40, d_name="annotations.yaml", d_type=DT_REG}, {d_ino=123601048, d_off=9223372036854775807, d_reclen=24, d_name=".", d_type=DT_DIR}], 32768) = 224
getdents(5, [], 32768)                  = 0
close(5)                                = 0

Both systems run the same Debian version, use the ext4 filesystem, and are provisioned with the same Ansible scripts, so they should also have the same core packages.

How can I narrow this down to the culprit? What could cause these two servers to behave differently?

gadelat
  • 505
  • 1
    Both straces show getdents() behaving exactly the same, returning 7 entries with a total size of 224. I guess you're mainly concerned about the PHP glob() function behaving differently. There's a slight difference in the PHP versions used, perhaps some PHP configuration variables might be different as well. – TooTea Mar 12 '21 at 19:56
  • 1
    glob("*/") returning things that don't end in / looks like a bug in that glob() to me. – Stéphane Chazelas Mar 12 '21 at 20:06
  • But a bug in what? I want to patch this so that the systems behave the same and people don't run into this issue in production again when it works fine in the VM. – gadelat Mar 12 '21 at 20:07
  • Looks like the PHP 8.0.2 on the first server is misbehaving by returning stuff that should not be there. Perhaps you should start by making sure you are using the same PHP version in both testing and production. – TooTea Mar 12 '21 at 20:11
  • Done, I upgraded both to PHP 8.0.3, but the result is the same. – gadelat Mar 12 '21 at 20:23
  • 2
    Those DT_UNKNOWN and st_nlink=1 on the directory suggest the first one is not ext4. Can you confirm with df -T /vagrant/deployer-example/config/routes for instance? – Stéphane Chazelas Mar 12 '21 at 20:27
  • Indeed, looks like you are right

    vagrant@shopping:/vagrant/deployer-example$ df -T /vagrant/deployer-example/config/routes
    Filesystem     Type   1K-blocks      Used Available Use% Mounted on
    vagrant        prl_fs 488245288 426727712  61517576  88% /vagrant

    So perhaps it is a Parallels bug that it works with / at the end.

    – gadelat Mar 12 '21 at 20:29
  • Sorry, just noticed that the glob() in PHP is just a thin wrapper around the same function in libc. The implementation in glibc is what is getting confused by the first system. – TooTea Mar 12 '21 at 20:32
  • But the glibc version is included in my post, and according to that output it's the same on both systems :/ – gadelat Mar 12 '21 at 20:35
  • 1
    I can reproduce a similar bug in that php -r 'var_dump(glob("./*/"));' (but not php -r 'var_dump(glob("*/"));' !?) returns broken symlinks. I can reproduce your problem on a fs formatted as a minix fs as well (where directories don't store file types either and d_type is DT_UNKNOWN) – Stéphane Chazelas Mar 12 '21 at 20:44
  • Just to get this correctly (I'm not much of a systems programmer or kernel guru), glob() bug is in glibc so we should probably report this bug there? – gadelat Mar 12 '21 at 20:45
  • 2
    Yeah, even though both systems use the same libc, the first system triggers a corner case (weird filesystem that does not return entry type in getdents data) in the glibc code. The second system just does what 99% filesystems do today and returns the data immediately, triggering a more efficient code path (and one that is the most tested nowadays). – TooTea Mar 12 '21 at 20:48
  • Yes, it seems to be in glibc, so we'd need to check the doc to see if it's not expected behaviour, and on the latest version to see if it hasn't been fixed already. – Stéphane Chazelas Mar 12 '21 at 20:49
  • 1
    I can reproduce with glibc 2.33. I see the code turns on the GLOB_ONLYDIR (GNU specific) flag when the pattern ends in / and the documentation for that flag says: If the information about the type of the file is easily available non-directories will be rejected but no extra work will be done to determine the information for each file. I.e., the caller must still be able to filter directories out. Still, I'd call it a bug as it breaks POSIX compliance (and you can see glob() still does a stat() on each of those files so it has the information). – Stéphane Chazelas Mar 13 '21 at 07:27
  • It's very cool that this issue caught your interest so much and that you went the extra mile to confirm it is still present in the latest glibc version :) Could you then also file a bug report for glibc (and link it here)? It seems you are much more capable of writing a correct description of the issue than I am! – gadelat Mar 13 '21 at 08:09
  • 1
    Oh actually after checking glibc tracker, looks like somebody reported this bug yesterday already: https://sourceware.org/bugzilla/show_bug.cgi?id=25659 If that was you, much thanks! – gadelat Mar 13 '21 at 08:21
  • No, wasn't me, but you can see that guy had already added a workaround for that glibc bug to GNU make a few years back so had probably looked into that issue back then and was reminded of it from this discussion. – Stéphane Chazelas Mar 13 '21 at 08:41
  • Wow. Actually, he reported it almost exactly a year ago. And also used "minix" FS as a filesystem example that doesn't supply d_type. What are the odds of that? – Stéphane Chazelas Mar 13 '21 at 09:08

1 Answer

5

As the strace output with verbose mode enabled (-v) shows, the getdents calls returned d_type=DT_UNKNOWN on the first system, but d_type=DT_REG (or DT_DIR) on the second one. The reason is that the second system uses ext4, while the first one uses prl_fs (the Parallels filesystem mounted at /vagrant). prl_fs apparently does not return a known d_type.
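To check which filesystem backs a given directory without running df -T by hand, something like the following Python sketch could be used. It assumes a mount table in Linux /proc/mounts format; the function name filesystem_type and the sample table are made up for illustration, and only the "vagrant ... prl_fs ... /vagrant" entry comes from the df -T output in the comments above.

```python
import os

# Illustrative mount table in /proc/mounts format:
# device, mount point, filesystem type, options, dump, pass.
# The root entry and all mount options are hypothetical.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
vagrant /vagrant prl_fs rw,relatime 0 0
"""


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    Symlinks are not resolved; `path` is only made absolute and normalized.
    Returns None if no mount point matches.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # A mount contains the path if it is the path itself or one of its parents.
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # The longest matching mount point is the innermost mount;
            # on a tie, the later entry (mounted on top) wins.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

On a real machine the second argument would be the content of /proc/mounts, for example filesystem_type(path, open("/proc/mounts").read()).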

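This triggers an edge case in the glob() function in glibc, which PHP's glob() wraps. As noted in the comments, when the pattern ends in /, glibc turns on GLOB_ONLYDIR, which rejects non-directories only if the file type is readily available; with DT_UNKNOWN it is not, so regular files slip through. A glibc bug has been reported for this at https://sourceware.org/bugzilla/show_bug.cgi?id=25659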
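Until the library behaves consistently, the caller can filter the results itself, which is what the GLOB_ONLYDIR documentation quoted in the comments asks for anyway. Below is a minimal sketch of that filtering step in Python. The function name filter_glob_matches is hypothetical, and Python's own glob module is a separate implementation that does not go through glibc, so this only illustrates the idea; in PHP the equivalent would presumably be an is_dir() check on each result.

```python
import os


def filter_glob_matches(pattern, matches):
    """Drop non-directories from `matches` when `pattern` ends in "/".

    A pattern with a trailing slash should only match directories, so any
    regular file that slipped through is removed. Patterns without a
    trailing slash are returned unchanged.
    """
    if not pattern.endswith("/"):
        return list(matches)
    # os.path.isdir follows symlinks and returns False for missing paths.
    return [m for m in matches if os.path.isdir(m)]
```

Applied to the output of the first system, the four .yaml paths would be dropped and the result would be an empty list, matching what the second system returns.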

I also contacted Parallels support and asked them to stop returning DT_UNKNOWN as the d_type. This has been fixed as of Parallels 16.5.0-49183 :)

gadelat
  • 505
  • 1
    Note that the type of a file is something that belongs with the file. Some filesystems (most these days) also duplicate that information in the entries of all directories the file is linked to (for performance improvement purposes) but don't have to. I'd expect that prl_fs is some sort of "network" file system between guest and host. If the host OS doesn't make that information readily available without extra effort, it would be reasonable for the prl_fs implementation not to include it in the directory entries. That would be a perfectly valid thing to do. – Stéphane Chazelas Mar 13 '21 at 09:17