
I sometimes wish the human-readable du -h option were more fine-grained while still being human-readable.

Instead of showing:

 14G

it would show something like:

14G 236M 788k 110b

Is there an easy / straightforward / standard way to get this?

iago-lito
  • What is o in 110o? – Arkadiusz Drabczyk Dec 27 '17 at 16:06
  • Just get the source code and modify to your needs. – ridgy Dec 27 '17 at 16:08
  • @ArkadiuszDrabczyk probably 'octet', french for 'byte' – Carpette Dec 27 '17 at 16:11
  • @Carpette: it can be, thx – Arkadiusz Drabczyk Dec 27 '17 at 16:13
  • @ridgy Of course, even though I do not consider this "an easy way" yet ;) – iago-lito Dec 27 '17 at 16:16
  • This can help you: https://unix.stackexchange.com/questions/44040/a-standard-tool-to-convert-a-byte-count-into-human-kib-mib-etc-like-du-ls1?rq=1 – Carpette Dec 27 '17 at 16:20
  • @Carpette Hm. This is not "an easy way" either, but I could write my own version of this awk function and pipe du output to it.. why not. Thanks :) – iago-lito Dec 27 '17 at 16:29
  • Following your last comment: you may also consider wrapping the piped du and awk in a shell function called du – Weijun Zhou Dec 27 '17 at 16:37
  • @WeijunZhou for sure. We're still getting closer to ridgy's suggestion "get into the source code".. maybe it's the way to go :) – iago-lito Dec 27 '17 at 16:43
  • Of course you can do it the "hard way" with bash scripting, awk, ... But be aware you have to consider all the possible different options and outputs of du: du -hs, du -hc, du --si, etc. This is why I think modifying the source might be easier. – ridgy Dec 27 '17 at 17:06
  • @ridgy And I do think you're right. Would I need to recompile all coreutils then? – iago-lito Dec 27 '17 at 17:17
  • I think so. Depending on your OS/distribution (which you did not mention), get the full sources of coreutils, then modify ./lib/human.c (and probably ./lib/human.h), as this is the function library used, and then run configure and make. All the coreutils built this way will use the human-readable format you defined, but as long as you do not run make install, that does no harm. – ridgy Dec 27 '17 at 20:02
  • Looked a bit deeper in the sources. You'd probably have to modify human.h for the new buffer length LONGEST_HUMAN_READABLE, and then either modify the code after the do_grouping target or the full function human_readable. – ridgy Dec 27 '17 at 20:18
  • @ridgy Well, cheers for this insight :D I will probably not get into this soon, but it could be fun. Would this patch be good to offer for future coreutils releases? – iago-lito Dec 27 '17 at 20:23
  • If I were doing this, I would write a utility (say in awk, perl, or python) that converted a control statement parameter, and if no parameter was present, read from STDIN. The reason is that there are probably a number of situations where you might want the expanded output. So not just "du -s | new_utility", but also "ANY_command | new_utility". And the pipeline could be easily put into a script or a function. Many birds with one stone generality ... cheers, – drl Dec 28 '17 at 02:29
  • @drl I like the idea. However, we would have to deal with, say, columns adjustment not to break it while ls -la | new_utility.. and in the same pipeline, how would we discriminate between filesizes and filecounts? – iago-lito Dec 28 '17 at 09:09

1 Answer


Well, there seems to be no easy / straightforward / standard way to do this yet.

Alternative options are (credit to the comments by ridgy, Carpette, Weijun Zhou and drl :):

  • write a dedicated small converter utility in bash/awk/python/etc. so that:

    $ echo "789456" | utility  
    770K 976o
    

    Then pipe du output through it. You can take inspiration from this related question. If it works well, it could also parse the output of any command piped to it, like:

    $ du -s | utility
    $ ls -la | utility
    

    You can even define it permanently on your machine as a shell function:

    duH() { du -s "$@" | utility; }
    
    • pros: Easy to write in any language you prefer.
    • cons: difficult to adapt to every command option (du -s, du -hc, du --si). The output of any other command (ls -lah, rsync) has to be parsed to find the digit strings that represent byte counts, and these have to be transformed without breaking the layout.
  • get into the coreutils source code and add a new option suiting your needs. You'll probably have to take a look at ./lib/human.c. Once it is modified, it is a matter of ./configure, make, make install, so that the du on your machine has this option implemented.

    • pros: quite straightforward, fast and integrated. It could be offered as a future patch to standard du.
    • cons: you'll have to get into existing C code and understand it first so as not to break it. You'll have to install your own version of coreutils on every machine where you need it.
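
As a rough illustration of the first option, here is a minimal Python sketch of such a converter. It is only one possible approach, not a standard tool: the names `humanize`, `convert_line` and the file name `utility.py` are made up for this example, it assumes binary (1024-based) units with "o" for bytes as in the sample output above, and it only rewrites an integer at the very start of each line, which is where du puts the size.

```python
import re
import sys

# Binary units, largest first. "o" (octet) stands for bytes, matching the
# sample output above. This unit table is an assumption of the sketch.
UNITS = [("T", 1024**4), ("G", 1024**3), ("M", 1024**2), ("K", 1024), ("o", 1)]


def humanize(n: int) -> str:
    """Split a byte count into successive units, skipping empty ones."""
    if n == 0:
        return "0o"
    parts = []
    for suffix, size in UNITS:
        count, n = divmod(n, size)
        if count:
            parts.append(f"{count}{suffix}")
    return " ".join(parts)


def convert_line(line: str) -> str:
    """Rewrite only a leading integer field; the rest of the line is untouched."""
    return re.sub(r"^\d+", lambda m: humanize(int(m.group())), line, count=1)


if __name__ == "__main__":
    # Filter mode: read lines from standard input, write converted lines.
    for line in sys.stdin:
        sys.stdout.write(convert_line(line))
```

Saved as `utility.py` (hypothetical name), it could be used as `echo "789456" | python3 utility.py`, which prints `770K 976o`. With GNU du, the sizes have to be requested in bytes for the result to be meaningful, for instance `du -sb some_dir | python3 utility.py`; plain `du -s` reports blocks rather than bytes. Handling sizes in other columns (as in `ls -la`) is deliberately left out, since it runs into the layout and ambiguity problems listed in the cons above.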

I'm not getting into this anytime soon. Anyway, feel free to post your own partial solutions or alternative workarounds here as they come :)

iago-lito