5

I have a 38GB folder with 800 MP4 videos in it. After re-downloading it, the file name has no spacing, and all words are joined, but it's still TitleCase.

So from TitleCase I need Title Case.

What would be the most effective way of bulk renaming these files?

I remember a rename or autorename being included a long time ago in my distro, but I don't seem to have it now.

  • 1
    Unpopular opinion: if renaming one file manually takes 10 seconds, that's about 130 minutes in total, so ten minutes per day will get the job done within two weeks. Assuming it's not urgent. – shuhalo Dec 05 '22 at 10:54
  • 6
    Can you tell us exactly which change you want to perform? E.g. all "ThreeJoinedWords" ->"Three Joined Words"? (Tongue-in-cheek: Best would be to give us this information as an sed command!) By the way, if my understanding is remotely correct: I typically perform the opposite name transformation, replacing all funny chars, including whitespace, by underlines or such. I would strongly advise against file names with spaces. The reason becomes obvious if later the need arises to change your names with spaces again. – Peter - Reinstate Monica Dec 05 '22 at 11:15
  • 6
    Please [edit] your question and i) tell us your operating system, ii) if it is a Linux, tell us which one, iii) give us some examples of your input file names and iv) the output file names you would expect from that example. Finally: please, please, please don't make the mistake of adding spaces to file names if they don't already have them. That will only make every future manipulation of the files more complicated for little benefit. You can use _ instead of spaces and they will still be easily readable while not being harder to handle. – terdon Dec 05 '22 at 12:20
  • 1
    @terdon - fwiw, if it's about renaming CamelCase to space-separated words then this is a dupe of Bulk renaming of camelcase files to include spaces – don_crissti Dec 05 '22 at 20:45
  • 1
    A little suggestion: Since the files are already on a Unix and/or Linux box, why not leave the spaces out and forget about having to escape them when addressing/accessing/performing an operation on any of the files? Just a thought. – B.Kaatz Dec 06 '22 at 03:02
  • @shuhalo And by the end of the two weeks you'll have given yourself RSI. Work smarter, not harder. – Hashim Aziz Dec 06 '22 at 20:11
  • Note that renaming the filenames will NOT affect any metadata such as movie titles. You need something like ffmpeg for that. – Jeremy Boden Dec 07 '22 at 19:33

4 Answers

12

If you want to add spaces between the words of a TitleCase mp4 filename (PascalCase to Words Separated By Spaces):

rename -n 's/\B[[:upper:]]/ $&/g' ./*.mp4

Output

rename(./FooBarBaz.mp4, ./Foo Bar Baz.mp4)

See this post, which covers Perl's rename in depth:

What about rename different versions and usage ? What is the recommended way to use the Perl version especially?

  • I wouldn't call that particular rename implementation (there are several perl-based renames these days) reliable, especially when called like that. Try for instance after touch './--e=system"reboot"#.mp4' – Stéphane Chazelas Dec 05 '22 at 17:18
  • Do you know another Perl specific version that is more reliable and not expanding commands ? – Gilles Quénot Dec 05 '22 at 18:11
  • The original implementation from the 80s which didn't have any option wouldn't be affected. This one is not so bad as it supports --. Some others (like the one on Arch Linux IIRC) don't have the problem, but don't support --, so you can't use -- portably. Portable work around is to use ./*.mp4. Or use zmv which doesn't have the problem nor most of the other problems associated with rename and also does some sanity checks prior to start. – Stéphane Chazelas Dec 05 '22 at 19:34
  • zsh is a nice shell, but some people like me prefer readability. The sed-like syntax is a killer feature. And it's a tool that can be 'hacked' (the good way) to pass Perl code like (-f $_) && – Gilles Quénot Dec 05 '22 at 19:40
  • zsh has a similar ${var//pattern/replacement} (from ksh) or $var:s/x/y/ (from csh). Also a regexp-replace which can use PCREs. – Stéphane Chazelas Dec 05 '22 at 19:41
  • Beware rename doesn't decode filenames as text, so \p{Lu} will only match on ASCII A-Z letters. It's particularly hard to work around, even harder when different path components use different charsets. Another thing zmv handles much better. – Stéphane Chazelas Dec 05 '22 at 20:21
  • Tested with È, works as a charm. U+C8 – Gilles Quénot Dec 05 '22 at 20:23
  • Oh, actually, it's worse, they're being decoded as iso8859-1. The UTF-8 encoding of È is 0xC3 0x88, 0xC3 being Ã in iso8859-1. Now try with é which also starts with Ã – Stéphane Chazelas Dec 05 '22 at 20:27
  • The one from the rename package (from https://metacpan.org/release/File-Rename) on Debian and derivatives is called file-rename and has a prename symlink to it. update-alternatives can be used for /usr/bin/rename to symlink to that. – Stéphane Chazelas Dec 21 '22 at 14:15
  • 1
    As mentioned earlier, the perl-rename on Arch is a different implementation with a different API. – Stéphane Chazelas Dec 22 '22 at 23:02
  • Basically, all the Perl versions work the same. There are more or fewer switches, but all the basics work. It's Perl regex in the form of s/// – Gilles Quénot Dec 23 '22 at 18:07
  • It's not perl regexp in the form of s///, it's any perl code, though the s/// operator is one of the most commonly used ones. The one on Arch is missing -d which you generally want to use when combined with find. – Stéphane Chazelas Dec 23 '22 at 19:00
  • 2
    @GillesQuenot posting a link to your answer on every question or answer that mentions perl rename is, at the very least, verging on spam. IMO, it's way beyond "verging" and definitely crossed over to spam territory - you've posted the exact same comment to three of my (ancient) answers so far. – cas Dec 26 '22 at 02:21
  • I took the time to cover as many distros as possible and how to install and use rename on each; it took me a long time to install and test Docker images to investigate. So yes, since the answers are often quite vague, I try here to be as exhaustive as possible. This is the only post I know of that covers so many distros and how to install on each, along with the different names of Perl's rename (rename, perl-rename, file-rename, prename) and how to test for the right version. It's a Q&A web site, I share what I discovered. – Gilles Quénot Dec 27 '22 at 15:38
  • Adding a link to rename answers for completeness is a way to build on your good answers, don't take things personally. Just sharing, take it easy. We don't make any money from this, it's not a fight ;) A struggle, maybe, but for free spirit, free beer, free minds, open source stuff. – Gilles Quénot Dec 27 '22 at 15:38
  • It would be good if your examples could at least do something valuable to the type of filenames mentioned in the question ("TitleCase"). The question is closed due to not specifying what the files should be renamed to, but at least we know what the current names of the files look like, partly. – Kusalananda Dec 28 '22 at 08:12
  • @Kusalananda It's in the examples section of my post; the first of them is: PascalCase to Words Separated By Spaces – Gilles Quénot Dec 28 '22 at 09:29
  • 1
    Please stop leaving comments linking to this just because an answer mentions rename. That really is kind of spammy. I don't know why you chose to add all this detail to an answer of a closed question which did not ask about perl rename in the first place, but if you want to link people to it, please only do so when it is actually relevant to what a question asked. Commenting on answers which just mention rename isn't really helpful. – terdon Dec 28 '22 at 09:55
  • 2
    I'm not offended and I'm not taking it personally. I just hate spam - with a deep and intense and eternal loathing. Getting 3 (so far) notifications of the exact same self-promotional comment is spam, and that's just the notifications I got - I don't know who else you've spammed, or how many. If I were to get any more, it would be even worse because it would be more spam (and since I've posted numerous answers using perl rename over the last decade or so, if you continue your spammy ways, I'll almost certainly get more). – cas Dec 28 '22 at 10:55
8

I don't know what would be the most efficient way (I think you meant efficient), but I would quickly write a for loop like:

for file in *.mp4; do
  newname="$(echo "$file" | sed 's/\(.\)\([A-Z]\)/\1 \2/g')"
  mv -- "${file}" "${newname}"
done

Explanation:

  newname="$(echo "$file" | sed 's/\(.\)\([A-Z]\)/\1 \2/g')"
#        ^-------------------- Assign to  variable "newname" value…
#         ^------------------- "$()": as output by command in parentheses;
#                              use "" to avoid word splitting

where

echo "$file" | sed 's/\(.\)\([A-Z]\)/\1 \2/g'
#    ^-------------------------------------- output old file name
#            ^------------------------------ pipe to `sed` command

sed is the name of the "stream editor"; it takes input, executes a command on it, and produces output. Here, the command is s, as in "search and replace".

s/\(.\)\([A-Z]\)/\1 \2/g
^^ ^  ^ ^      ^ ^  ^  ^
|| |^ | | ^^^  | |  |  |
\------------------------ s: search and replace
 \----------------------- /: Set the search;replace;flags separator to "/"
   || | | \|/  | \  /  |
   \--+------------------ \(…\): a "capture group" (the first one);
    |   |  |   |  ||   |         whatever matches the content will be
    |   |  |   |  \|   |         available as \1
    \-------------------- .: We match ".", which means *any* character
        |  |   |   |   |  (which precludes this from matching at start of line)
        \------+--------- \(…\): Second capture group, \2
           \------------- [A-Z]: Match any capital letter
                   |   |
                   \----- Replacement: "\1 \2" replace
                       |  "characterbeforecapitalletter""Capitalletter" with
                       |  "characterbeforecapitalletter" "Capitalletter"
                       |
                       \- g: Flag that means "global": Repeat this until
                             end of line
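
Since the loop renames immediately, it can help to preview the result first. The following is a sketch rather than part of the original answer: `preview_renames` is a hypothetical function name, `LC_ALL=C` is an added assumption so that `[A-Z]` means ASCII capitals only, and `printf` stands in for `echo` to avoid surprises with unusual names.

```shell
# Hypothetical helper: list "old -> new" for every *.mp4 in the current
# directory that the sed expression would change. Nothing is renamed.
preview_renames() {
  for file in *.mp4; do
    # Skip the literal pattern when no file matches.
    [ -e "$file" ] || continue
    newname=$(printf '%s\n' "$file" | LC_ALL=C sed 's/\(.\)\([A-Z]\)/\1 \2/g')
    # Only report names that would actually change.
    if [ "$file" != "$newname" ]; then
      printf '%s -> %s\n' "$file" "$newname"
    fi
  done
}

preview_renames
```

Because each match consumes the character before the capital, a one-letter word followed by another capital is not split: OnceUponATime.mp4 is previewed as Once Upon ATime.mp4.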
  • Is the choice of this particular regex robust against the presence of newline characters (\n) in the filename? – Hastur Dec 05 '22 at 07:42
  • 1
    Yes. Sed would operate on both lines separately and not care (aside from doing the right thing of not inserting a space at the beginning of the line), and the file and newname variables will continue to contain the newline character, @Hastur – Marcus Müller Dec 05 '22 at 08:25
  • 2
    Oh that diagram in the last code block! – jcaron Dec 05 '22 at 16:54
  • 1
    @jcaron sorry about that, I could not resist making it! – Marcus Müller Dec 05 '22 at 16:58
4

To separate out words in title-case, so that FooBarBaz.mp4 becomes Foo Bar Baz.mp4, you can insert a space before every uppercase letter except when it's the first character in the file name, which with zsh's zmv you could do (recursively) with:

$ autoload -Uz zmv
$ zmv -n '(**/)(?)(*.mp4)' '$1$2${3//(#m)[[:upper:]]/ $MATCH}'
mv -- FooBarBaz.mp4 'Foo Bar Baz.mp4'
mv -- OnceUponATime.mp4 'Once Upon A Time.mp4'

(remove the -n, which requests a dry run, once you are happy with the result).

Beware it changes Foo-Bar.mp4 to Foo- Bar.mp4.

zmv -n '(**/)(*.mp4)' '$1${2//(#b)([[:alpha:]])([[:upper:]])/$match[1] $match[2]}'

This would only insert spaces between a letter and an uppercase letter, but would not work for OnceUponATime above, as the space would be inserted between n and A, but not between A and T, as the A would have already been consumed by the previous substitution.

As zsh globs don't have the equivalent of perl's look around operators, working around that is more difficult. A simple approach in this case though is to just repeat the substitution an extra time:

$ zmv -n '(**/)(*.mp4)' '$1${${2//(#b)([[:alpha:]])([[:upper:]])/$match[1] $match[2]}//(#b)([[:alpha:]])([[:upper:]])/$match[1] $match[2]}'
mv -- ABCDEF.mp4 'A B C D E F.mp4'
mv -- AChristmasCarol.mp4 'A Christmas Carol.mp4'
mv -- FooBarBaz.mp4 'Foo Bar Baz.mp4'
mv -- LeSongeD\'UneNuitD\'Été.mp4 'Le Songe D'\''Une Nuit D'\'Été.mp4
mv -- LifeOfΠ.mp4 'Life Of Π.mp4'
mv -- OnceUponATime.mp4 'Once Upon A Time.mp4'
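
The same repeat-the-substitution workaround carries over to other tools that lack lookaround. As a sketch outside zsh, assuming POSIX `sed` and ASCII-only names (`LC_ALL=C` is set for that reason, and `split_words` is a hypothetical helper name), the expression is simply given twice:

```shell
# Hypothetical helper: insert a space between a letter and a following
# uppercase letter, applying the substitution twice so that one-letter
# words such as the "A" in "ATime" are also separated.
split_words() {
  printf '%s\n' "$1" | LC_ALL=C sed \
    -e 's/\([[:alpha:]]\)\([[:upper:]]\)/\1 \2/g' \
    -e 's/\([[:alpha:]]\)\([[:upper:]]\)/\1 \2/g'
}

split_words 'OnceUponATime.mp4'   # prints: Once Upon A Time.mp4
split_words 'ABCDEF.mp4'          # prints: A B C D E F.mp4
split_words 'Foo-Bar.mp4'         # prints: Foo-Bar.mp4
```

This only prints the new names; non-ASCII letters such as É or Π in the zmv output above are not covered by this sketch.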
0

Other answers have described tools that allow the change based on regexes and that is a fine approach. However, those answers do not address an important point (IMHO): preview capability before actually making changes on disk (that may be hard to reverse).

Enter Emacs and its dired mode, in particular wdired, documented in the manual. One would open dired for the directory containing the files, toggle wdired to make the file names editable, make changes - most likely using interactive regex search and replace, review the changes, and finally commit them to disk.

I am aware that installing Emacs for this one use case is a heavy-handed approach, but I would point out that it handles file names with difficult characters, e.g. spaces, trivially.

TAR86