129

The Windows dir directory listing command has a line at the end showing the total amount of space taken up by the files listed. For example, dir *.exe shows all the .exe files in the current directory, their sizes, and the sum total of their sizes. I'd love to have similar functionality with my dir alias in bash, but I'm not sure exactly how to go about it.

Currently, I have alias dir='ls -FaGl' in my .bash_profile, showing

drwxr-x---+  24 mattdmo  4096 Mar 14 16:35 ./
drwxr-x--x. 256 root    12288 Apr  8 21:29 ../
-rw-------    1 mattdmo 13795 Apr  4 17:52 .bash_history
-rw-r--r--    1 mattdmo    18 May 10  2012 .bash_logout
-rw-r--r--    1 mattdmo   395 Dec  9 17:33 .bash_profile
-rw-r--r--    1 mattdmo   176 May 10  2012 .bash_profile~
-rw-r--r--    1 mattdmo   411 Dec  9 17:33 .bashrc
-rw-r--r--    1 mattdmo   124 May 10  2012 .bashrc~
drwx------    2 mattdmo  4096 Mar 24 20:03 bin/
drwxrwxr-x    2 mattdmo  4096 Mar 11 16:29 download/

for example. Using the answers to this question, I came up with:

dir | awk '{ total += $4 }; END { print total }'

which gives me the total, but doesn't print the directory listing itself. Is there a way to alter this into a one-liner or shell script so I can pass any ls arguments I want to dir and get a full listing plus sum total? For example, I'd like to run dir -R *.jpg *.tif to get the listing and total size of those file types in all subdirectories. Ideally, it would be great if I could get the size of each subdirectory, but this isn't essential.

MattDMo
  • 2,384
  • 5
    Why doesn't ls -lh help you? It prints a total at the top. You can also run du -sh *.exe to get disk space usage information in human-readable form. – bagavadhar Apr 16 '13 at 19:40
  • 1
    @ashwin I don't know what the 'total' ls -lh is printing, but it's not always related to what the awk scripts below calculate, or what I add up by hand. Sometimes it's close to the number of KB of files in the directory, but it doesn't seem to take the allocated sizes of subdirectories into effect. I'd be grateful if you could point me toward an explanation of what exactly that number is... – MattDMo Apr 17 '13 at 17:05
  • see if my answer below works for you – bagavadhar Apr 17 '13 at 20:17
  • 3
    ls -lh does not show the total size of a directory calculated from its contents – aequalsb Feb 09 '17 at 14:35
  • 5
    One liner: du -ach *.exe – tuga Jun 06 '20 at 12:31

13 Answers

243

There's already a UNIX command for this: du

Just do:

du -bch 

As per convention you can add one or more file or directory paths at the end of the command. -h is an extension to convert the size into a human-friendly format, -b gives you the file size instead of disk usage, and -c gives a total at the end.
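
As a rough sketch of how -b and -c combine, the grand total can be read off the last line of the output. This assumes GNU du (the -b option is not available everywhere), and the function name total_bytes is made up for illustration:

```shell
# Hypothetical helper: print the summed apparent size, in bytes, of the given paths.
# Assumes GNU du; -b is not in POSIX and is missing on macOS.
total_bytes() {
    # -b: apparent size in bytes, -c: append a "total" line; keep only that line's number
    du -bc -- "$@" | tail -n 1 | cut -f1
}
```

For instance, total_bytes *.exe prints a single number; dropping the tail and cut stages shows the per-file sizes as well.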

ajay4q
  • 103
Pete Cornell
  • 2,625
  • 1
    Does du work with file filters like .exe, .jpg etcetera – vfclists Aug 30 '13 at 15:36
  • 16
    Yes, du works fine. You can use the -c option (same as --total) to get a total at the end of the list. – MikeB May 21 '14 at 16:28
  • 17
    Note that du gives the disk usage, not the sum of file sizes. – Stéphane Chazelas Feb 09 '15 at 10:41
  • The reason I haven't chosen this as the answer (although du is a very useful command) is because I want to maintain the basic functionality of ls -l - listing the contents of a directory, only recursively if I ask, and showing the size of each file, which du does not do. Thank you for the answer, though! – MattDMo Feb 12 '15 at 17:59
  • 9
    du -h doesn't sum the sizes of the files passed to it. du -h *.so shows the size of each file, but not the sum. I think what you're wanting here is du -hc *.so (or even du -hc *.so | tail -1). But of course, he wants the directory listing, too. – lmat - Reinstate Monica Jan 15 '16 at 17:10
  • 1
    This is the correct answer. ls will show size of directory as file, not the total amount of files in that directory. du is recursive, thus allows showing total size of everything inside a directory – Sergiy Kolodyazhnyy Jan 22 '16 at 14:46
  • 1
    This command works only with short list of files. See what happens when you have 850000 files in a directory!!! – hamidfzm May 20 '16 at 10:07
  • 11
    ! -a means --all. Consider rather using --apparent-size – Arnauld VM Nov 09 '18 at 13:31
  • 1
    The -b switch is not in POSIX. macOS doesn't understand it. – ericek111 Nov 24 '21 at 10:22
  • 1
    @ericek111 And that's a macOS problem. It was only removed because they didn't want to support other block sizes, and filesystem compression and deduplication were not even considered. "The -b option was added to an early proposal to provide a resolution to the situation where System V and BSD systems give figures for file sizes in blocks, which is an implementation-defined concept. (In common usage, the block size is 512 bytes for System V and 1024 bytes for BSD systems.) However, -b was later deleted, since the default was eventually decided as 512-byte units." – Behrooz Jul 13 '22 at 10:44
44

You can use du -h -c directory|tail -1

This will generate a single line with the total disk usage.

muru
  • 72,889
34

The following function does most of what you're asking for:

dir () { ls -FaGl "${@}" | awk '{ total += $4; print }; END { print total }'; }

... but it won't give you what you're asking for from dir -R *.jpg *.tif, because that's not how ls -R works. You might want to play around with the find utility for that.
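
The reason is that the shell expands *.jpg and *.tif against the current directory before ls runs, so ls -R never sees the patterns. A possible find-based sketch is shown here; it assumes GNU find (for -printf), and the name rsum and its argument order are hypothetical:

```shell
# Hypothetical helper: rsum DIR PATTERN...
# Lists every regular file under DIR whose name matches a pattern, then a total.
# Assumes GNU find (-printf). Patterns must be quoted so the shell leaves them alone.
rsum() (
    root=$1
    shift
    for pat in "$@"; do
        # %s = size in bytes, %p = path; one "size<TAB>path" line per file
        find "$root" -type f -name "$pat" -printf '%s\t%p\n'
    done | awk -F'\t' '{ print; total += $1 } END { printf "total: %.0f\n", total }'
)
```

A call such as rsum . '*.jpg' '*.tif' prints one line per matching file and the byte total last. A file that matches two of the patterns would be counted twice, and file names containing newlines are not handled.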

  • 1
    This is true if you're looking for the content size of each file, NOT the size the file consumes on disk. This distinction is more pronounced for very small files. On my distro, each file is allocated space on disk in 4 KB chunks (so a 300-byte file still uses 4K on disk, as reported by the du command). If what the OP was looking for is "how much space each file takes up", then du is the way to do it. – Jon V Jan 10 '17 at 20:20
  • 4
    dir is already the name of a popular GNU coreutil, I'd rather not name a function like that. – dessert Nov 03 '17 at 08:18
  • 1
    Please fix so it works on both Linux and OS X; also, your "-a" includes "." and "..", which is not good : ) Here's the fixed command: dir () { ls -FAl "${@}" | awk '{ total += $5; print }; END { print "total:"total }'; } – Dmitry Shevkoplyas Sep 11 '19 at 14:39
  • @DmitryShevkoplyas running Cygwin bash on Windows, ls -a reports the size of . and .. as 0 bytes, so it doesn't affect the total. However, your point is valid on Linux and OSX, so I've changed my function definition of dir accordingly. Thanks! – MattDMo Jun 06 '20 at 17:21
  • du -ch **/*.ext|tail -1 works pretty fine – Marcelo Idemax Aug 23 '21 at 18:08
  • I may be mistaken, but this outputs the total of everything ls lists... so this includes . and .. in the total. That skews the data as it includes totals of files in other directories. – Dave May 17 '22 at 17:42
  • @Dave in my function I use ls -FAGl, which does not include . and ... – MattDMo Nov 29 '22 at 14:24
10

with perl:

perl -le 'map { $sum += -s } @ARGV; print $sum' -- *.pdf

Size of all non-hidden PDF files in current directory.
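
A shell counterpart to the one-liner above could look like the following sketch. It assumes GNU stat (the -c format option), and the name sum_sizes is invented for illustration:

```shell
# Hypothetical helper: print the summed size in bytes of the files given as arguments.
# Assumes GNU stat; -L follows symlinks, as perl's -s operator does.
sum_sizes() {
    stat -L -c '%s' -- "$@" | awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

For example, sum_sizes *.pdf reports the same kind of total as the perl command, based on file sizes rather than disk usage.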

ingopingo
  • 807
9

Simply print the current line that you are summing the total of:

dir | awk '{ print; total += $4 }; END { print "total size: ",total }'
7

For totaling the size of files in a directory that match a mask, I tend to follow this approach:

For bytes

du -ac --bytes  | grep "zip$" | awk '{ print; total += $1 }; END { print "total lobsters: ", total, " Bytes" }'

For Kibibytes

du -ac --bytes  | grep "zip$" | awk '{ print; total += $1 }; END { print "total lobsters: ", total/1024, " KiB" }'

For Mebibytes

du -ac --bytes  | grep "zip$" | awk '{ print; total += $1 }; END { print "total lobsters: " total/1024/1024 " MiB" }'

You get the idea.

The breakdown is simple:

  • du - disk usage
  • -a - all files
  • -c - print a grand total on the last line
  • --bytes - print apparent sizes in bytes rather than disk usage in blocks [a GNU du option, equivalent to --apparent-size --block-size=1; not available in every du implementation]
  • grep - global regular expression print [prints output matching patterns]
  • "zip$" - pattern to match. 'zip' is the string, and the '$' anchors it to the end of the line, so this matches lines that END with 'zip' (use "\.zip$" to require a literal '.zip' extension). Conversely, placing '^' at the start of the pattern anchors it to the start of the line [ie: "^start" will match lines beginning with the word 'start']. Wrapping a pattern in ^ and $ therefore matches only whole lines: "^hello people$" will match lines saying exactly 'hello people'. "^hello.*people$" will match 'hello french people' and 'hello coding people', but not 'hello coding people with no lives'
  • awk - a scripting language programmed by Aho, Weinberger, and Kernighan. Not a very original name, but a very powerful language that is excellent for text processing and data extraction.
  • { print; total += $1 }
    • print - print the line currently being iterated over
    • total += $1 - add the first field to the variable total (awk creates it, starting at 0, on first use). Fields are split on the field separator, by default any run of spaces or tabs; du puts a tab between the size and the path. The separator can be changed with the -F flag.
    • ; - line / statement terminator. you can put multiple awk statements on a single line using this, similar to terminating linux command line statements. Otherwise you could have them in a multi line thing still surrounded by { ... }
    • END - a special pattern: the action that follows it runs once, after the last line of input has been read.
  • { print "total lobsters: " total " Bytes" }
    • print "total lobsters: " - first part of string being output
    • total - the variable containing the sum of the first field over all lines read
    • " Bytes" - final part of printed string, tacked on to the end of the prior two statements
    • obviously, these three statements are encapsulated in { } like the first part.
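
Put together, the grep and awk stages described in this breakdown can be wrapped in a small filter. This is only a sketch: the name sum_zip is hypothetical, and it uses the stricter "\.zip$" pattern so that only names with a literal .zip extension are kept:

```shell
# Hypothetical filter: reads du-style "size<TAB>path" lines on stdin
# (for example from: du -ac --bytes), keeps the .zip entries, and appends a total.
sum_zip() {
    grep '\.zip$' |
        awk '{ print; total += $1 } END { printf "total: %.0f bytes\n", total }'
}
```

It would be used as du -ac --bytes | sum_zip. The "total" line that du -c appends does not end in .zip, so grep drops it and it is not counted twice.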

So, stepping through an example in a case where we want to add up the total size of the zip files in a directory:

du -ac --bytes

836544  ./wp-content/themes/astra.1.8.1.zip
934364  ./wp-content/themes/astra.2.0.1.zip
400033  ./wp-content/uploads/2019/09/premium-addons-for-elementor-3.2.9-WJdFQT1mLd3GA81lQEAo.zip
117351218       ./wp-content/uploads/backwpup-fc5928-temp/2019-05-30_00-47-01_TX6FSKC601.zip
1192275 ./wp-content/plugins/essential-addons-elementor-master.zip
170     ./wp-content/plugins/gravityforms/images/doctypes/icon_zip.gif
1969651 ./wp-content/plugins/acf.zip
4284    ./wp-content/plugins/types/application/controllers/api/handler/import_from_zip_file.php

Two components of output, col 1 represented by the numeric values: 836544, 934364... etc, and col 2 being the path of the file.

However, since there are two lines that do not match what we want - icon_zip.gif and import_from_zip_file.php - we want to exclude these. Since du does not provide a way to filter recursively by extension (that I know of), we filter using grep

grep "zip$"

This effectively has the output from du piped to it, and filters the lines that end in zip, eliminating the two records we don't want:

836544  ./wp-content/themes/astra.1.8.1.zip
934364  ./wp-content/themes/astra.2.0.1.zip
400033  ./wp-content/uploads/2019/09/premium-addons-for-elementor-3.2.9-WJdFQT1mLd3GA81lQEAo.zip
117351218       ./wp-content/uploads/backwpup-fc5928-temp/2019-05-30_00-47-01_TX6FSKC601.zip
1192275 ./wp-content/plugins/essential-addons-elementor-master.zip
1969651 ./wp-content/plugins/acf.zip

Then awk parses each line, with the numerics in col 1 being stored in $1

We get this:

836544  ./wp-content/themes/astra.1.8.1.zip
934364  ./wp-content/themes/astra.2.0.1.zip
400033  ./wp-content/uploads/2019/09/premium-addons-for-elementor-3.2.9-WJdFQT1mLd3GA81lQEAo.zip
117351218       ./wp-content/uploads/backwpup-fc5928-temp/2019-05-30_00-47-01_TX6FSKC601.zip
1192275 ./wp-content/plugins/essential-addons-elementor-master.zip
1969651 ./wp-content/plugins/acf.zip
total lobsters:  128.339  MiB
  • 1
    This is a very thorough answer and explanation. I now have some if/else if statements in the awk part of the function that check the size of total and prints it with B, KB, MB, or GB, as appropriate. Thanks! – MattDMo May 14 '21 at 18:17
3

Adding the following to .bash_profile or .bashrc works for me.

dir ()
{
# Pass a single quoted pattern, e.g. dir '*.mp3', so that find (not the shell) expands it.
find . -iname "$1" -exec ls -lh {} \;
find . -iname "$1" -print0|xargs -r0 du -csh|tail -n 1;
}

Now when I run dir '*.mp3' it searches recursively and prints the total at the end. The pattern has to be quoted; otherwise the shell expands it against the current directory before find sees it. You can control the depth by adding a -maxdepth option to find. I know running find twice is not very efficient, but I couldn't think of a better way. At least it gets the job done.

2

Using du and an awk statement like the one mentioned above will provide what you are looking for.

Example: du /home/abc/Downloads/*.jpg | awk '{ print; total += $1 }; END { print "total size: ",total }'

This will list all files ending in .jpg in the Downloads folder of user abc and print the sum of their disk usage at the end of the listing. Without -b, GNU du reports usage in 1 KiB blocks by default, so the total is in those units rather than bytes.

1

To get both the dir output and the size calculation, without using any of the other proposed options, you can use tee(1) and process substitution...

dir | tee >( awk '{ total += $4 }; END { print total }' )
Archemar
  • 31,554
Janis
  • 14,222
1

tee will solve the problem of output disappearing on screen when it is piped to another command like awk.

So the command:

ls -FaGl | printf "%'d\n" $(awk '{SUM+=$4}END{print SUM}')

which only prints:

63,519,676,015

Is replaced with the command:

ls -FaGl | tee /dev/stderr | printf "%'d\n" $(awk '{SUM+=$4}END{print SUM}')

and now the full file listing appears with total:

total 62031069
drwxrwxrwx 1 rick      20480 Oct  9 15:47 ./
drwxrwxrwx 1 rick      12288 Jul 20  2020 ../
drwxrwxrwx 1 rick          0 Oct 15  2017 Captures/
-rwxrwxrwx 1 rick        504 Jun 29  2020 desktop.ini*
drwxrwxrwx 1 rick       4096 Nov 18  2017 Mass Effect Andromeda/
-rwxrwxrwx 1 rick  210355992 Nov  8  2019 Screencapture 2019-11-08 at 13.07.14.mp4*
-rwxrwxrwx 1 rick  127445089 Nov  8  2019 Screencapture 2019-11-08 at 13.43.55.mp4*
-rwxrwxrwx 1 rick  997439911 Nov 11  2019 simplescreenrecorder-2019-11-11_21.42.51.mkv*
   ( Long listing snipped... )
-rwxrwxrwx 1 rick 1546689758 Sep  6 22:35 simplescreenrecorder-2021-09-06_21.18.29*
-rwxrwxrwx 1 rick  422607080 Sep 18 19:13 simplescreenrecorder-2021-09-18_18.57.00*
63,519,676,015

TL;DR

Insert | tee /dev/stderr into your pipeline.
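
One consequence of this approach is that the listing travels on stderr while the total stays on stdout, so the total can still be captured on its own. A minimal sketch, in which the function name list_with_total and its column argument are hypothetical:

```shell
# Hypothetical filter: reads "ls -l"-style lines on stdin.
# $1 is the 1-based number of the size column (4 for ls -FaGl, 5 for plain ls -l).
list_with_total() {
    # tee copies the listing to stderr for display; awk sees the same lines on stdin
    tee /dev/stderr | awk -v col="${1:-5}" '{ total += $col } END { printf "%.0f\n", total }'
}
```

With this, total=$(ls -FaGl | list_with_total 4) still shows the listing on the terminal but stores only the number in the variable.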

Total in human readable format

In my own ~/.bashrc is this function:

$ grep 'BytesToHuman(' -A20 ~/.bashrc

function BytesToHuman() {

# https://unix.stackexchange.com/questions/44040/a-standard-tool-to-convert-a-byte-count-into-human-kib-mib-etc-like-du-ls1/259254#259254

read -r StdIn
if ! [[ $StdIn =~ ^-?[0-9]+$ ]] ; then
    echo "$StdIn"       # Simply pass back what was passed to us
    return 1            # Floats or strings not allowed. Only integers.
fi

# Unit suffixes in increasing order: kibi, mebi, gibi, tebi, pebi, exbi, zebi, yobi
b=${StdIn:-0}; d=''; s=0; S=(Bytes {K,M,G,T,P,E,Z,Y}iB)
while ((b > 1024)); do
    d="$(printf ".%02d" $((b % 1024 * 100 / 1024)))"
    b=$((b / 1024))
    let s++
done

echo "$b$d ${S[$s]}"
return 0                # Success! (return, not exit, so an interactive shell is never closed)

} # BytesToHuman ()

So simply add | BytesToHuman to the end of the pipeline. Also remove the printf builtin that was used previously:

ls -FaGl | tee /dev/stderr | awk '{SUM+=$4}END{print SUM}' | BytesToHuman

This will now display the total as:

59.15 GiB

If you would prefer to see 63.51 GB then the BytesToHuman() function needs to be changed from:

b=${StdIn:-0}; d=''; s=0; S=(Bytes {K,M,G,T,P,E,Z,Y}iB)
while ((b > 1024)); do
    d="$(printf ".%02d" $((b % 1024 * 100 / 1024)))"
    b=$((b / 1024))

To:

b=${StdIn:-0}; d=''; s=0; S=(Bytes {K,M,G,T,P,E,Z,Y}B)
while ((b > 1000)); do
    d="$(printf ".%02d" $((b % 1000 * 100 / 1000)))"
    b=$((b / 1000))
0
du path_to_your_files/*.jpg | awk '{ total += $1 }; END { print total }'
chaos
  • 48,171
  • 3
    No. First, just giving a command is not an answer. Second, if you'd bothered to read the whole question, the other answers, and my comments, you'd have seen this is NOT what I want. – MattDMo Aug 26 '15 at 18:36
  • To be fair to the poster, this is the first search result for the Google search linux sum human readable sizes and is what I'm looking for. – Sridhar Sarnobat Aug 18 '19 at 07:25
0

Just for the sake of clarity and completeness, here is my current iteration of the dir() function, as well as answers to some common comments (including the most highly-voted answer here).

function dir() {
    ls -FAGl --color=always "${@}" | awk '{
                print;
                total += $4
            }; END {
                if (total < 1024)
                    print "\t\ttotal: ",total;
                else if (total < (1024 * 1024))
                    print "\t\ttotal: ",total/1024,"KB";
                else if (total < (1024 * 1024 * 1024))
                    print "\t\ttotal: ",total/(1024*1024),"MB";
                else if (total < (1024 * 1024 * 1024 * 1024))
                    print "\t\ttotal: ",total/(1024*1024*1024),"GB";
                else    # 1 TB and above, so that very large totals are not silently dropped
                    print "\t\ttotal: ",total/(1024*1024*1024*1024),"TB";
            }'
}

Here's how it works:

-F appends an indicator (one of */=>@|) to entries

-A does not list the implied . and .. entities

-G omits group names (in systems that include them by default)

-l prints a long listing, which is what we need in order to get file sizes

--color=always should be self-explanatory

"${@}" passes any other arguments to ls

The awk script, which I cobbled together from several sources, including the accepted answer here, prints each line of the listing as it arrives while adding up all the numbers in column 4. That total is then tested to see if it's smaller than 1KB, 1MB, 1GB, and 1TB, respectively. It then prints the total with the appropriate unit, indented by 2 tabs (16 spaces), which approximately aligns with the size column on most systems. This can be customized to your own needs.


Why do you call it dir? That's a GNU coreutil!

I named the function dir because I grew up using DOS, and when I started playing around with Linux (kernel 1.2!) a friend suggested I alias dir as ls -al as I had gotten so used to typing it. It's just muscle memory now. Eventually I found out about /usr/bin/dir, but I much prefer having long directory listings, so I never bothered changing my alias, and now function. If you use the dir program and/or don't want to shadow it, feel free to name the function whatever you want.

Why don't you use du?

I do, when all I'm interested in is how much space a directory takes up. I wanted to have this functionality when I'm doing directory listings as well (which I do much more frequently), so this function was born.

MattDMo
  • 2,384
-2
du * | awk -v sum=0 '{print sum+=$1}' | tail -1
chaos
  • 48,171
jason
  • 1