0

Had to do a wget to get files from a Wordpress site. It saved all JPGs with [,random-string-text.pagespeed.blah].

Is there a way to recursively rename these JPGs, stripping all the text after the comma?

Brent
  • 1

2 Answers2

1

Here is a Python script that should do what you want:

#!/usr/bin/env python2
# -*- coding: ascii -*-
"""remove_extensions.py

Remove meta-data filename extensions added by wget.
"""

import sys
import os
import re

# Define a regular expression to match the file-extension pattern
regex = r'^(.*),random-string-text.pagespeed.blah$'

for current_path, subdirectories, files in os.walk(sys.argv[1]):
    for filename in files:
        match = re.match(regex, filename)
        if match:
            newname = match.groups()[0]
            newpath = os.path.join(current_path, newname)
            oldpath = os.path.join(current_path, filename)
            os.rename(oldpath, newpath)

You'll have to slightly modify the regular expressions to match your actual filename patterns (or update your question to say what they are).

Here is a test illustrating the script in action.

Create some directories:

mkdir /tmp/dir1
mkdir /tmp/dir1/dir2
mkdir /tmp/dir3

Populate them with some files:

touch /tmp/dir1/file{01..03}.jpg,random-string-text.pagespeed.blah
touch /tmp/dir1/dir2/file{04..06}.jpg,random-string-text.pagespeed.blah
touch /tmp/dir3/file{07..10}.jpg,random-string-text.pagespeed.blah

Verify what we did:

user@host:~$ tree /tmp

/tmp/
├── dir1
│   ├── dir2
│   │   ├── file04.jpg,random-string-text.pagespeed.blah
│   │   ├── file05.jpg,random-string-text.pagespeed.blah
│   │   └── file06.jpg,random-string-text.pagespeed.blah
│   ├── file01.jpg,random-string-text.pagespeed.blah
│   ├── file02.jpg,random-string-text.pagespeed.blah
│   └── file03.jpg,random-string-text.pagespeed.blah
└── dir3
    ├── file07.jpg,random-string-text.pagespeed.blah
    ├── file08.jpg,random-string-text.pagespeed.blah
    ├── file09.jpg,random-string-text.pagespeed.blah
    └── file10.jpg,random-string-text.pagespeed.blah

3 directories, 10 files

Now run the script:

python remove_extensions.py /tmp/

And finally check that it had the intended effect:

user@host:~$ tree /tmp/
/tmp/
├── dir1
│   ├── dir2
│   │   ├── file04.jpg
│   │   ├── file05.jpg
│   │   └── file06.jpg
│   ├── file01.jpg
│   ├── file02.jpg
│   └── file03.jpg
└── dir3
    ├── file07.jpg
    ├── file08.jpg
    ├── file09.jpg
    └── file10.jpg

3 directories, 10 files
igal
  • 9,886
0

If you're sure that there's no other comma in the filename, you could try something like this:

for i in `find . -type f` ; do mv $i $(echo $i | sed 's/\(.*\),.*/\1/g') ; done
manifestor
  • 2,473