
We have more than 4 years of data in our system, and we need to move files and directories older than 2 years into a new repository. Our requirements: 1. Know how many TB of data there is from Jan 2017 up to now. 2. Exclude the personal folder. I tried to find a command but couldn't work it out.

find . -type f -mtime +1010 ! -path "./home/01_Personal Folder*" -printf '%s\n' \
| awk '{a+=$1;} END {printf "%.1f GB\n", a/2**30;}'
Jeff Schaller
    Possibly duplicated with: https://unix.stackexchange.com/questions/41550/find-the-total-size-of-certain-files-within-a-directory-branch – Paulo Tomé Nov 20 '19 at 16:21
  • I think you should present your commands separately: the find and the awk. Possibly duplicated? Yes, but not in the sense of "already answered". I think `find -type f ... | awk` and `find -type d ... du {}` both make sense –  Nov 20 '19 at 16:43
  • Oh, from Jan 2017 to now: that would rather be `-mtime -1000`. Good trick with the 1000 and TEN! (besides the usual extra confusion and lack of info or clear language) –  Nov 20 '19 at 19:28
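Taking the comments' point about the date range: a minimal sketch that selects files modified since Jan 2017 with `-newermt` (a GNU find extension), excludes the personal folder, and reports the total in TB. The paths are taken from the question; treat the exact exclusion pattern as an assumption.

```shell
# Sketch: sum apparent sizes of files modified since 2017-01-01
# (GNU find's -newermt and -printf), excluding the personal folder,
# and report the grand total in TB.
find . ! -path "./home/01_Personal Folder*" -type f -newermt "2017-01-01" -printf '%s\n' \
  | awk '{a += $1} END {printf "%.2f TB\n", a / 2^40}'
```

Note `2^40` instead of `2**30`: `^` is the portable awk exponent operator, and dividing by 2^40 yields TiB rather than GiB, matching the question's "how many TB" requirement.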

2 Answers


Could be something similar to:

find . ! -path "./home/01_Personal Folder*" -type f -mtime +1000 -exec du -ch {} + | grep total

Answer based on: Find the total size of certain files within a directory branch

Explain shell command.

Christopher
Paulo Tomé
...
37M     total
29M     total
42M     total
43M     total
36M     total

real    0m1.271s
user    0m0.561s
sys     0m1.278s

Is what I get with:

time find ~/sda1 -type f -exec du -ch {} +|grep total

So now I need a tool to sum up the totals! (This is the `-exec ... +` argument list overflowing: find splits the file list across several `du` invocations, so each one prints its own `total` line.)
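Those repeated `total` lines can be summed with awk. A minimal sketch, assuming GNU du: `-b` makes du report plain byte counts (apparent size) instead of human-readable sizes, so the per-batch totals are numbers awk can add:

```shell
# Sum the multiple "total" lines that du -c emits when find's -exec ...+
# splits the arguments into several du runs. -b (bytes) replaces -h so
# each total is a plain number.
find ~/sda1 -type f -exec du -cb {} + \
  | awk '$2 == "total" {a += $1} END {print a " bytes"}'
```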

But with:

time find ~/sda1 -type f -printf "%s\n" | awk '{a+=$1;} END {print a;}'

10483650002

real    0m0.550s
user    0m0.251s
sys     0m0.349s

And:

]# time du ~/sda1 -sh
11G     /.../sda1

real    0m0.458s
user    0m0.116s
sys     0m0.340s

I get it nice and fast.


It seems inefficient to du each file, when find is stating them anyway, and can give the size for free. With find ... -exec du {} +, du is degraded to a calculator of -c "grand total".

There is of course some difference between file size (in bytes) and disk usage (in blocks).
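That difference is easy to demonstrate with a sparse file, on filesystems that support them (GNU `truncate`, `stat`, and `du` assumed):

```shell
# Apparent size vs disk usage: a sparse file has a large size but
# allocates (almost) no blocks on sparse-aware filesystems.
tmp=$(mktemp)
truncate -s 1G "$tmp"        # apparent size: 1 GiB
stat -c '%s bytes' "$tmp"    # reports 1073741824
du -h "$tmp"                 # disk usage: typically 0 or a few KB
rm -f "$tmp"
```

The opposite skew also exists: many tiny files each occupy at least one filesystem block, so `du` can report noticeably more than the summed `%s` sizes.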


Here, just to show that the original `find ... -printf "%s\n" | awk '{...} END {...}'` works:

]# find ~ -maxdepth 1 -printf "%s\n" | awk '{a+=$1;} END {print a;}'
1093990
]# find ~ -maxdepth 1 -printf "%s\n" | awk '{a+=$1;} END {printf "%x\n",a;}'
10b166

This is my first awk ever.

I tested this on ~ with -maxdepth 1, and the round number struck me (along with that "GB" conversion at the end of the OP's command), so I played around until I recognised that 10**6 is roughly 16x64K, i.e. 2**20.
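That round-number hunch checks out with plain shell arithmetic: 16 x 64 KiB is exactly 2^20, which prints as 100000 in hex, so the hex total above (10b166) sits just past that round mark:

```shell
# 16 * 64 KiB = 2^20 bytes = 1048576, or 0x100000 in hex.
printf '%d\n' $((16 * 64 * 1024))   # 1048576
printf '%x\n' $((16 * 64 * 1024))   # 100000
```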