When you run ls without arguments, it simply opens the directory, reads all the entries, sorts them and prints them out.
When you run ls *, the shell first expands *, which is effectively the same work the simple ls did, builds an argument vector with all the files in the current directory and calls ls. ls then has to process that argument vector and, for each argument, calls access(2)¹ on the file to check its existence. Then it prints the same output as the first (simple) ls. Both the shell's processing of the large argument vector and ls's will likely involve a lot of memory allocation of small blocks, which can take some time. However, since there was little sys and user time but a lot of real time, most of the time would have been spent waiting for disk rather than using CPU for memory allocation.
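To make the "shell expands *, then calls ls" step concrete, here is a minimal sketch. The helper count_args and the scratch directory are made up for illustration; count_args simply stands in for whatever ls would receive in its argument vector.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received,
# standing in for what ls would see in its argument vector.
count_args() {
    echo "$#"
}

# Scratch directory with three regular files and one dotfile.
demo_dir=$(mktemp -d)
touch "$demo_dir/b" "$demo_dir/a" "$demo_dir/c" "$demo_dir/.hidden"
cd "$demo_dir" || exit 1

# The shell replaces * with the sorted, non-hidden names before the
# command is started; the command itself never sees the "*".
no_glob=$(count_args)        # like plain `ls`: empty argument list
with_glob=$(count_args *)    # like `ls *`: one argument per matching name
expanded=$(echo *)           # the names in the order the shell passes them

echo "no glob:   $no_glob argument(s)"
echo "with glob: $with_glob argument(s): $expanded"
```

With a directory holding many thousands of files, the second call receives one argument per file, and that is the list ls then has to check name by name.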
Each call to access(2) will need to read the file's inode to get the permission information. That means a lot more disk reads and seeks than simply reading a directory. I do not know how expensive these operations are on your GPFS, but the comparison you've shown with ls -l, which has a similar run time to the wildcard case, suggests that the time needed to retrieve the inode information dominates. If GPFS has a slightly higher latency than your local filesystem on each read operation, we would expect the difference to be more pronounced in these cases.
The 50% difference between the wildcard case and ls -l could be explained by the ordering of inodes on the disk. If the inodes were laid out successively in the same order as the filenames in the directory, and ls -l stat(2)ed the files in directory order before sorting, ls -l would possibly read most of the inodes in a single sweep. With the wildcard, the shell will sort the filenames before passing them to ls, so ls will likely read the inodes in a different order, adding more disk head movement.
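The two orderings mentioned above can be compared directly. This is only a sketch on a scratch directory with made-up file names: ls -f lists entries in the order the directory returns them (unsorted, including . and ..), while the glob gives the sorted order the shell would pass to ls. What the directory order actually looks like depends on the filesystem.

```shell
#!/bin/sh
# Scratch directory; files are created in a deliberately non-alphabetical order.
demo_dir=$(mktemp -d)
for name in m z a q c; do
    touch "$demo_dir/$name"
done
cd "$demo_dir" || exit 1

# Directory order: what reading the directory returns, with . and .. removed.
dir_order=$(ls -f | grep -v -x -F -e . -e .. | tr '\n' ' ')

# Glob order: the sorted list the shell hands to ls when you type `ls *`.
glob_order=$(printf '%s ' *)

echo "directory order: $dir_order"
echo "glob order:      $glob_order"
```

If the two lines differ, an ls that stats its arguments in the order given will visit the inodes in a different sequence than one that walks the directory as stored.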
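It should be noted that if time was run as an external command (for example /usr/bin/time) rather than as the shell's built-in keyword, its output will not include the time taken by the shell to expand the wildcard.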
If you really want to see what's going on, use strace(1):
strace -o /tmp/ls-star.trace ls *
strace -o /tmp/ls-l-star.trace ls -l *
and have a look at which system calls are being performed in each case.
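The trace files can get long for a big directory, so a per-call tally is easier to compare than the raw output. The function below is a rough sketch, not part of strace; count_syscalls is a made-up name, and it assumes the default line format that strace -o writes without -f, that is name(arguments) = result.

```shell
#!/bin/sh
# Hypothetical helper: tally system calls by name in an `strace -o` file.
# Assumes each call is logged as "name(args...) = result"; lines such as
# "+++ exited with 0 +++" do not match and are skipped.
count_syscalls() {
    sed -n 's/^\([a-z_0-9][a-z_0-9]*\)(.*/\1/p' "$1" | sort | uniq -c | sort -rn
}

# Summarize the two traces from the commands above, if they exist.
for trace in /tmp/ls-star.trace /tmp/ls-l-star.trace; do
    if [ -r "$trace" ]; then
        echo "== $trace"
        count_syscalls "$trace"
    fi
done
```

strace can also produce a similar summary itself with its -c option, which reports a count and time per system call instead of the individual calls.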
¹ I don't know if access(2) is actually used, or something else such as stat(2). But both probably require an inode lookup (I'm not sure if access(file, 0) would bypass an inode lookup).
With ls it can just ask the file system "what are the children of the inode for pwd", whereas with ls * it has to ask "what are the children (and what is the file) of the inode a", followed by b, c, d, etc. One query vs many. – N J May 05 '11 at 07:18
ls -l as well (still about 30 seconds less than ls *) – Sebastian May 05 '11 at 07:45
ls -l will take longer than ls as it has to stat(2) each file to get information about timestamps/owner information/permissions, etc. – N J May 05 '11 at 08:31
* globs to all entries in the current directory that don't start with a period, including the names of subdirectories, which will then be ls'ed. – Shadur-don't-feed-the-AI May 05 '11 at 09:00
ls < ls -l < ls -l * < ls * (I always ran it three times). With your explanation, I don't understand why ls -l * is faster than ls * – Sebastian May 05 '11 at 11:55
ls * and ls -l *, unless some cache effects are in play. I've updated my answer with how you can see what system calls are being run. – camh May 05 '11 at 12:15
strace reveals that ls -l * is using lstat64 and getxattr, whereas ls * is just using stat64 and lstat64. – Sebastian May 05 '11 at 14:13
ls uses getdents64 and not (l)stat64 (as ls * is doing). getdents64 doesn't have a file name argument. On GPFS, ls -l uses another function called readlink, which may be faster than stat64. Thanks a lot! – Sebastian May 05 '11 at 14:21
*-expansion part will be performed by the enclosing bash, and therefore will not appear in the strace output (and it is probably the most expensive part). An alternative would be to strace -f bash -c 'ls' and strace -f bash -c 'ls *' (so the "*" is quoted and therefore will only be expanded within the sub-bash, instead of being done by the enclosing shell prior to (and "outside of") calling strace) and compare the two. (Note also the -f to trace child processes, which is needed as the "bash -c" will fork an "ls" process.) – Olivier Dulac Sep 27 '13 at 07:50