18

How do I write a script for moving just the 20 oldest files from one folder to another? Is there a way to grab the oldest files in a folder?

user11598
  • 181
  • Including or excluding subdirectories? And should it be done recursively (in a directory tree)? – maxschlepzig Oct 16 '11 at 09:29
  • 2
    Many (most?) *nix filesystems don't store the creation date, so you can't determine the oldest file with certainty. The typically available attributes are atime (last access), ctime (last inode status change, e.g. a permission change), and mtime (last modified)... eg. ls -t and find's printf "%T" use mtime ... It seems, according to this link, that my ext4 partitions are capable of handling a creation date, but ls and find and stat don't have the appropriate options (yet)... – Peter.O Oct 22 '11 at 10:14
  • @Peter.O, as of coreutils 8.32 (March 2020), GNU ls now has a --time=creation/birth option. – Stéphane Chazelas Oct 23 '20 at 07:21

7 Answers

15

Parsing the output of ls is not reliable.

Instead, use find to locate the files and sort to order them by timestamp. For example:

# -mindepth 1 keeps find from listing the starting directory (.) itself
while IFS= read -r -d $'\0' line ; do
    file="${line#* }"
    # do something with $file here
done < <(find . -mindepth 1 -maxdepth 1 -printf '%T@ %p\0' \
    2>/dev/null | sort -z -n)

What is all this doing?

First, the find command locates all files and directories in the current directory (.), but not in subdirectories of the current directory (-maxdepth 1), then prints out:

  • A timestamp
  • A space
  • The relative path to the file
  • A NULL character

The timestamp is important. The %T@ format specifier for -printf breaks down into T, which indicates "Last modification time" of the file (mtime) and @, which indicates "Seconds since 1970", including fractional seconds.
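To see what %T@ actually prints, here is a minimal sketch. It assumes GNU find and GNU touch; the scratch directory and the file name demo.txt are made up for illustration.

```shell
# Hypothetical scratch directory and file name, for illustration only.
tmpdir=$(mktemp -d)
touch -d '@1000000000' "$tmpdir/demo.txt"   # set mtime to a known epoch value

# %T@ = mtime as seconds since 1970 (with a fractional part), %p = path
stamp_line=$(find "$tmpdir" -mindepth 1 -maxdepth 1 -printf '%T@ %p\n')
printf '%s\n' "$stamp_line"

rm -rf "$tmpdir"
```

The printed line should start with 1000000000, followed by a dot, the fractional seconds, a space, and the path.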

The space is merely an arbitrary delimiter. The path to the file is included so that we can refer to it later, and the NULL character serves as a terminator: it is the one character that cannot appear in a file path, so it tells us for sure where the path ends.

I have included 2>/dev/null so that files which the user does not have permission to access are excluded, but error messages about them being excluded are suppressed.

The result of the find command is a list of all files and directories in the current directory. The list is piped to sort which is instructed to:

  • -z Treat NULL as the line terminator character instead of newline.
  • -n Sort numerically

Since seconds-since-1970 always goes up we want the file whose timestamp was the smallest number. The first result from sort will be the line containing the smallest numbered timestamp. All that remains is to extract the file name.
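The sorting step can be tried on its own with made-up records. This is only a sketch: the timestamps and paths are invented, and LC_ALL=C is my addition, on the assumption that GNU sort -n takes its decimal point and thousands separator from the locale.

```shell
# Three fake "timestamp path" records, NUL-terminated, in scrambled order.
# tr turns the NULs into newlines only so the result is easy to read.
sorted=$(printf '%s\0' '1000.9 ./c' '999.5 ./a b' '1000.10 ./d' \
    | LC_ALL=C sort -z -n | tr '\0' '\n')
printf '%s\n' "$sorted"
```

The smallest timestamp comes out first, and 1000.10 sorts before 1000.9 because the comparison is numeric rather than textual.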

The output of the find | sort pipeline is passed via process substitution to while, where it is read as if it were a file on stdin. while in turn invokes read to process the input.

In the context of read we set the IFS variable to nothing, which means that whitespace won't be inappropriately interpreted as a delimiter. read is told -r, which disables escape expansion, and -d $'\0', which makes the end-of-line delimiter NULL, matching the output from our find, sort pipeline.
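A small bash sketch of the read settings described above, fed with invented records instead of find output. It uses -d '' which, as far as I know, bash treats the same as -d $'\0'.

```shell
#!/usr/bin/env bash
# Three NUL-terminated records: the second contains a newline, the third has
# leading and trailing spaces plus a backslash.
count=0
records=()
while IFS= read -r -d '' rec; do
    records+=("$rec")          # stored exactly as it was sent
    count=$((count + 1))
done < <(printf '%s\0' 'plain' $'two\nlines' '  padded\path  ')
printf 'read %d records\n' "$count"
```

With IFS left at its default the padding would be trimmed, and without -r the backslash would be dropped.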

The first chunk of data, which holds the oldest file's path preceded by its timestamp and a space, is read into the variable line. Next, the parameter expansion ${line#* } removes the shortest prefix matching the pattern "* ", that is, everything from the beginning of the string up to and including the first space. This strips off the modification timestamp, leaving only the path to the file.
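The prefix-stripping step can be checked in isolation. The sample line is invented; the path contains spaces on purpose.

```shell
line='1318671327.0000000000 ./dir/my old file.txt'

file=${line#* }      # "#"  removes the SHORTEST prefix matching "* "
greedy=${line##* }   # "##" removes the LONGEST one and cuts at the last space

printf '%s\n' "$file"     # ./dir/my old file.txt
printf '%s\n' "$greedy"   # file.txt
```

The single # is what keeps spaces inside the file name intact.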

At this point the file name is stored in $file and you can do anything you like with it. When you're finished doing something with $file the while statement will loop and the read command will be executed again, extracting the next chunk and the next file name.

Isn't there a simpler way?

No. Simpler ways are buggy.

If you use ls -t and pipe to head or tail (or anything) you'll break on files with newlines in the file names. If you mv $(anything) then files with whitespace in the name will cause breakage. If you mv "$(anything)" then files with trailing newlines in the name will cause breakage. If you read without -d $'\0' then you'll break on files with newlines in their names.
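The newline problem is easy to reproduce. This sketch assumes GNU ls and find; the directory and the file name are made up.

```shell
tmpdir=$(mktemp -d)
# One single file whose (hypothetical) name contains a newline.
name=$(printf 'old\nreport.txt')
: > "$tmpdir/$name"

# Line-based parsing of ls output sees two "files" ...
ls_count=$(ls "$tmpdir" | wc -l)
# ... while counting NUL terminators sees exactly one.
nul_count=$(find "$tmpdir" -mindepth 1 -maxdepth 1 -printf '%p\0' | tr -dc '\0' | wc -c)
printf 'ls: %s, find: %s\n' "$ls_count" "$nul_count"

rm -rf "$tmpdir"
```

Anything downstream of the ls pipeline would try to act on two names, neither of which exists.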

Perhaps in specific cases you know for sure that a simpler way is sufficient, but you should never write assumptions like that into scripts if you can avoid doing so.

Solution

#!/usr/bin/env bash

# move to the first argument
dest="$1"

# move from the second argument or .
source="${2-.}"

# move the file count in the third argument or 20
limit="${3-20}"

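# -mindepth 1 keeps find from listing the source directory itself.
# LC_ALL=C makes sort -n treat "." as the decimal point in every locale.
while IFS= read -r -d $'\0' line ; do
    file="${line#* }"
    # dry run: remove "echo" to actually move the files
    echo mv -- "$file" "$dest"
    let limit-=1
    [[ $limit -le 0 ]] && break
done < <(find "$source" -mindepth 1 -maxdepth 1 -printf '%T@ %p\0' \
    2>/dev/null | LC_ALL=C sort -z -n)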

Call like:

move-oldest /mnt/backup/ /var/log/foo/ 20

To move the oldest 20 files from /var/log/foo/ to /mnt/backup/.

Note that I am including files and directories. For files only add -type f to the find invocation.

Thanks

Thanks to enzotib and Павел Танков for improvements to this answer.

Sorpigal
  • 1,167
  • The sort should not use -n. At least in my version, it doesn't sort decimal numbers correctly. You either have to remove the dot in the date or use -printf '%TY-%Tm-%TdT%TH:%TM:%TS %p\0' | sort -rz, ISO dates, or something else. – l0b0 Apr 05 '12 at 11:45
  • @l0b0: This limitation is known to me. I presume that it is sufficient to not require that level of granularity (that is, sorting beyond the . must be irrelevant to you.) It would be clearer to say sort -z -n -t. -k1. – Sorpigal Apr 05 '12 at 13:18
  • @l0b0: Your solution exhibits the same bug, regardless: %TS also shows a "fractional part" which would be in the form 00.0000000000, so you also lose granularity. Recent GNU sort could solve this problem by using -V for a "version sort", which will handle this type of floating point as expected. – Sorpigal Apr 05 '12 at 13:34
  • No, because I do a string sort on "YYYY-MM-DDThh:mm:ss" rather than a numeric sort. String sort doesn't care about decimals, so it should work until year 10000 :) – l0b0 Apr 05 '12 at 13:35
  • @l0b0: A string sort on %T@ would also work, then, because it is zero-padded. – Sorpigal Apr 05 '12 at 13:41
  • No, it's not zero-padded (at least on Ubuntu 11.10 with find 4.4.2). Verify with touch -d '1970-01-01 00:00:00 UTC' test && find . -maxdepth 1 -name test -printf '%T@ %p\n'. – l0b0 Apr 05 '12 at 13:58
  • @l0b0: I see what you mean, the seconds part is not zero-padded (only the fractional part is), but for %TS it works correctly. How annoying! I still don't like your original solution (seems kind of messy). – Sorpigal Apr 05 '12 at 14:25
  • It is messy, because I got to it by applying everything I learned from Greg's wiki and others while doing some exhaustive unit testing of critical software, and Bash makes it exceedingly difficult to do things right. It's a bit like VBA: A billion hacks makes for neat throw-away scripts, but it's not something to build important, long-lived software with. – l0b0 Apr 05 '12 at 14:30
  • @l0b0: My proposed solution is while IFS= read -r -d '' line ; do file="${line#*.*.}" ; echo "$file" ; done < <(find test -printf '%T@.%p\0' | sort -z -n -t. -k1,2) - sort on the first two numeric . delimited fields, then strip the first two such fields to obtain the filename. What do you think? Related question: Do you have test data for the -n failure with floats? – Sorpigal Apr 05 '12 at 14:35
5

It's easiest in zsh, where you can use the Om glob qualifier to sort matches by date (oldest first) and the [1,20] qualifier to retain only the first 20 matches:

mv -- *(Om[1,20]) target/

Add the D qualifier if you want to include dot files as well. Add . if you want to match only regular files and not directories.

If you don't have zsh, here's a Perl one-liner (you can do it in less than 80 characters, but at a further expense in clarity):

perl -e '@files = sort {-M $b <=> -M $a} glob("*"); foreach (grep {defined $_} @files[0..19]) {rename $_, "target/$_" or die "$_: $!"}'

Note however that it only has a precision up to the second (the nanosecond part of the modification time, where available, is not considered).

With only POSIX tools or even bash or ksh, sorting files by date is a pain. You can do it easily with ls, but parsing the output of ls is problematic, so this only works if the file names contain only printable characters other than newlines.

ls -tr | head -n 20 | while IFS= read -r file; do mv -- "$file" target/; done
5

You can use GNU find for this:

find -maxdepth 1 -type f -printf '%T@ %p\n' \
  | sort -k1,1 -g | head -20 | sed 's/^[0-9.]\+ //' \
  | xargs echo mv -t dest_dir

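Here find prints the modification time (in seconds since 1970) and the name of each regular file in the current directory. The output is sorted on the first field, the 20 oldest entries are kept, the timestamps are stripped, and the files are moved to dest_dir. Note that xargs splits its input on whitespace by default, so this assumes file names without whitespace, quotes, or backslashes. Remove the echo once you have tested the command line.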
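For file names that may contain whitespace, the same pipeline can be kept NUL-delimited end to end. This is a sketch, not part of the answer above: it assumes GNU coreutils recent enough to have head -z and cut -z, and move_oldest is a hypothetical helper name.

```shell
# Move the N oldest regular files from src to dest (N defaults to 20).
move_oldest() {
    src=$1 dest=$2 n=${3:-20}
    find "$src" -mindepth 1 -maxdepth 1 -type f -printf '%T@ %p\0' \
        | LC_ALL=C sort -z -n \
        | head -z -n "$n" \
        | cut -z -d ' ' -f 2- \
        | xargs -0 -r mv -t "$dest" --
}
```

It would be called as move_oldest /var/log/foo /mnt/backup 20. The cut step drops only the timestamp field, so spaces inside the path survive, and xargs -r skips mv entirely when nothing matched.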

maxschlepzig
  • 57,532
4

Combine ls -t output with tail or head.

Simple example, which works only if all file names contain only printable characters other than whitespace and \[*? and none start with -:

 mv $(ls -1tr | head -20) other_folder
ktf
  • 2,717
  • +1 Nice, but what about hidden files? – ztank1013 Oct 15 '11 at 23:38
  • 1
    Add the -A option to ls: ls -1Atr – Arcege Oct 15 '11 at 23:47
  • 1
    -1, dangerous. Here let me craft an example: touch $'foo\n*'. What happens if you execute mv "$(ls)" with that file sitting there? – Sorpigal Jan 16 '12 at 07:33
  • 1
    @Sorpigal Seriously? It's kind of weak to say "Let me come up with an example you specifically said won't work. Hey look, it doesn't work" – Michael Mrozek Apr 05 '12 at 13:43
  • @MichaelMrozek: Seriously. If it's a bad idea, how about not doing it? – Sorpigal Apr 05 '12 at 13:47
  • 1
    @Sorpigal It's not a bad idea, it works in 99% of cases. The answer is "if you have files with normal names, this works. If you're an insane person who embeds newlines in their filenames, it won't". That's completely correct – Michael Mrozek Apr 05 '12 at 13:54
  • 1
    @MichaelMrozek: It is a bad idea and it's bad because it fails sometimes. If you have the option of doing what fails sometimes and what doesn't, you should take the option that doesn't (and the one that does is bad). Do whatever you like interactively, but in a script file and when giving advice do it correctly. – Sorpigal Apr 05 '12 at 14:22
2

No one has (yet) posted a bash example which caters for embedded newline chars (embedded anything) in the filename, so here's one. It moves the 3 oldest (by mtime) regular files:

move=3
find . -maxdepth 1 -type f -name '*' \
 -printf "%T@\t%p\0" |sort -znk1 | { 
  while IFS= read -d $'\0' -r file; do
      printf "%s\0" "${file#*$'\t'}"
      ((--move==0)) && break
  done } |xargs -0 mv -t dest

This is the test-data snippet

# make test files with names containing \n, \t and "  "
rm -f '('?[1-4]'  |?)'
for f in $'(\n'{1..4}$'  |\t)' ;do sleep .1; echo >"$f" ;done
touch -d "1970-01-01" $'(\n4  |\t)'
ls -ltr '('?[1-4]'  |'?')'; echo
mkdir -p dest

Here is the check-results snippet

  ls -ltr '('?[1-4]'  |'?')'
  ls -ltr   dest/*
Peter.O
  • 32,916
0

It's easiest to do with GNU find. I use it every day on my Linux DVR to delete recordings from my video surveillance system older than a day.

Here is the syntax:

find /path/to/files/* -mtime +number_of_days -exec mv {} /path/to/folder \;

Remember that find measures age in 24-hour periods counted back from the time of execution, not in calendar days. Therefore, with -mtime +0, a file last modified at 11 pm is not matched at 1 am.
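The 24-hour rounding can be observed directly. This sketch assumes GNU find and GNU touch; the scratch directory and the file names a, b and c are invented.

```shell
tmpdir=$(mktemp -d)
touch -d '25 hours ago' "$tmpdir/a"   # 1 whole 24-hour period old
touch -d '47 hours ago' "$tmpdir/b"   # still only 1 whole period
touch -d '49 hours ago' "$tmpdir/c"   # 2 whole periods

# -mtime +0: age in whole 24-hour periods is greater than 0
plus0=$(find "$tmpdir" -type f -mtime +0 -printf '%f\n' | sort | tr '\n' ' ')
# -mtime +1: age in whole 24-hour periods is greater than 1
plus1=$(find "$tmpdir" -type f -mtime +1 -printf '%f\n' | sort | tr '\n' ' ')
printf '+0: %s\n+1: %s\n' "$plus0" "$plus1"

rm -rf "$tmpdir"
```

All three files match +0, but only c matches +1, because the fractional part of the age is discarded before the comparison.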

You can even combine find with cron so that the job runs automatically. As root, run crontab -e and add the following line:

@daily /usr/bin/find /path/to/files/* -mtime +number_of_days -exec mv {} /path/to/folder \;

You can always get more information about find by consulting its manual page:

man find
Kevin
  • 40,767
Jonathan Frank
  • 321
0

As the other answers do not fit my purpose or the question's, here is a shell snippet tested on CentOS 7:

oldestDir=$(find /yourPath/* -maxdepth 0 -type d -printf '%T+ %p\n' | sort | head -n 1 | cut -d ' ' -f 2-)
echo "$oldestDir"
rm -rf "$oldestDir"
Pwnstar
  • 99