Had to do a wget to get files from a Wordpress site. It saved all JPGs with [,random-string-text.pagespeed.blah].
Is there a way to recursively rename these JPGs, stripping all the text after the comma?
Had to do a wget to get files from a Wordpress site. It saved all JPGs with [,random-string-text.pagespeed.blah].
Is there a way to recursively rename these JPGs, stripping all the text after the comma?
Here is a Python script that should do what you want:
#!/usr/bin/env python2
# -*- coding: ascii -*-
"""remove_extensions.py
Remove meta-data filename extensions added by wget.
"""
import sys
import os
import re
# Define a regular expression to match the file-extension pattern
regex = r'^(.*),random-string-text.pagespeed.blah$'
for current_path, subdirectories, files in os.walk(sys.argv[1]):
for filename in files:
match = re.match(regex, filename)
if match:
newname = match.groups()[0]
newpath = os.path.join(current_path, newname)
oldpath = os.path.join(current_path, filename)
os.rename(oldpath, newpath)
You'll have to slightly modify the regular expressions to match your actual filename patterns (or update your question to say what they are).
Here is a test illustrating the script in action.
Create some directories:
mkdir /tmp/dir1
mkdir /tmp/dir1/dir2
mkdir /tmp/dir3
Populate them with some files:
touch /tmp/dir1/file{01..03}.jpg,random-string-text.pagespeed.blah
touch /tmp/dir1/dir2/file{04..06}.jpg,random-string-text.pagespeed.blah
touch /tmp/dir3/file{07..10}.jpg,random-string-text.pagespeed.blah
Verify what we did:
user@host:~$ tree /tmp
/tmp/
├── dir1
│ ├── dir2
│ │ ├── file04.jpg,random-string-text.pagespeed.blah
│ │ ├── file05.jpg,random-string-text.pagespeed.blah
│ │ └── file06.jpg,random-string-text.pagespeed.blah
│ ├── file01.jpg,random-string-text.pagespeed.blah
│ ├── file02.jpg,random-string-text.pagespeed.blah
│ └── file03.jpg,random-string-text.pagespeed.blah
└── dir3
├── file07.jpg,random-string-text.pagespeed.blah
├── file08.jpg,random-string-text.pagespeed.blah
├── file09.jpg,random-string-text.pagespeed.blah
└── file10.jpg,random-string-text.pagespeed.blah
3 directories, 10 files
Now run the script:
python remove_extensions.py /tmp/
And finally check that it had the intended effect:
user@host:~$ tree /tmp/
/tmp/
├── dir1
│ ├── dir2
│ │ ├── file04.jpg
│ │ ├── file05.jpg
│ │ └── file06.jpg
│ ├── file01.jpg
│ ├── file02.jpg
│ └── file03.jpg
└── dir3
├── file07.jpg
├── file08.jpg
├── file09.jpg
└── file10.jpg
3 directories, 10 files
If you're sure that there's no other comma in the filename, you could try something like this:
for i in `find . -type f` ; do mv $i $(echo $i | sed 's/\(.*\),.*/\1/g') ; done