3

I have over 100 files named like

x.assembled.forward.fastq.gz
x(n).unassembled.reverse.fastq.gz

The problem is that the pipelines I am working with do not accept dots in the file name, so I have to change all of them (except those in the .fastq.gz extension) to _, so that the names look like

x_assembled_forward.fastq.gz
x(n)_unassembled_reverse.fastq.gz

I thought it would be possible using the simple command:

mv *.assembled.*.fastq.gz  *_assembled_*.fastq.gz

.... apparently not! :D

How can I do that?

xyz0o
  • 31

3 Answers

8

If you have perl-rename installed (called rename on Debian, Ubuntu and other Debian-derived systems), you can do:

rename -n 's/\./_/g; s/_fastq_gz/.fastq.gz/' *fastq.gz

That will first replace all . with _ and then replace the final _fastq_gz with .fastq.gz.

The -n makes it only print the changes it would make, without actually renaming any files. Once you're sure this does what you want, remove the -n to actually rename them:

rename  's/\./_/g; s/_fastq_gz/.fastq.gz/' *fastq.gz
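
If you want to check what those two substitutions produce without having rename installed, here is a minimal sketch that runs the same idea through sed. The function name preview_name is made up for illustration, and the $ anchor on the second expression is my addition. It only prints names and renames nothing:

```shell
# Hypothetical helper (not part of rename): print the name a file would get.
preview_name() {
    # Step 1: turn every dot into an underscore.
    # Step 2: restore the trailing .fastq.gz suffix.
    printf '%s\n' "$1" | sed -e 's/\./_/g' -e 's/_fastq_gz$/.fastq.gz/'
}

preview_name 'x.assembled.forward.fastq.gz'        # x_assembled_forward.fastq.gz
preview_name 'x(n).unassembled.reverse.fastq.gz'   # x(n)_unassembled_reverse.fastq.gz
```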
terdon
  • 242,166
  • I don't have that software – xyz0o Aug 05 '19 at 16:42
  • 1
    @Masse that's why I asked you to please tell us what operating system you are using. If Linux, tell me what distribution and I can tell you how to install the tool. If you cannot install software (and if you're sure it isn't already present, it may be called perl-rename), then use Kusalananda's approach. – terdon Aug 05 '19 at 17:04
5

mv either takes a single file and moves or renames it, or it takes a number of files or directories and moves them to a directory. It can't rename several files according to a pattern in one call.

Instead:

for name in *.*.fastq.gz; do
    newname=${name%.fastq.gz}         # remove filename suffix
    newname=${newname//./_}.fastq.gz  # replace dots with underscores and add suffix

    mv -i -- "$name" "$newname"
done

This iterates over all compressed FASTQ files in the current directory whose names contain at least one dot besides those in the filename suffix. It removes the known suffix (whose dots must not be replaced by underscores), substitutes underscores for all dots in what remains, and re-attaches the suffix.

The final substitution (the ${newname//./_} form) is not part of standard POSIX sh. It works in the bash shell, but possibly not if the script runs under /bin/sh.

mv -i is then used to rename the file (will ask for confirmation if the new name already exists). The double dash (--) is used just in case any of the names start with a dash (these would potentially be taken as sets of options to mv and the double dash prevents this).
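
If the loop has to run under a plain /bin/sh, one possible workaround is to let tr do the dot replacement instead of the shell. This is only a sketch under that assumption: the function name rename_dots is made up here, and the -e test is added to skip the unexpanded pattern when nothing matches.

```shell
# Hypothetical POSIX sh variant: uses tr instead of the ${var//./_} expansion.
rename_dots() {
    for name in *.*.fastq.gz; do
        [ -e "$name" ] || continue                         # pattern matched nothing
        stem=${name%.fastq.gz}                             # remove filename suffix
        newname=$(printf '%s' "$stem" | tr . _).fastq.gz   # dots to underscores, add suffix
        mv -i -- "$name" "$newname"
    done
}
```

Call rename_dots from inside the directory that holds the files.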

Kusalananda
  • 333,661
2

Why it does not work

The * is expanded by the shell, before the command is run. It matches existing files. mv has no pattern matching abilities.
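
You can see this for yourself by putting echo in front of the command. A small sketch, assuming bash or a POSIX sh with default settings (zsh reports an error for an unmatched pattern instead). The two file names are invented and live in a throwaway directory:

```shell
# Create two sample files in a temporary directory so nothing real is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/x.assembled.forward.fastq.gz" "$demo_dir/y.assembled.reverse.fastq.gz"

# echo shows the argument list that mv would have received.
expanded=$(cd "$demo_dir" && echo mv *.assembled.*.fastq.gz *_assembled_*.fastq.gz)
printf '%s\n' "$expanded"
# mv x.assembled.forward.fastq.gz y.assembled.reverse.fastq.gz *_assembled_*.fastq.gz

rm -r "$demo_dir"
```

The first pattern is replaced by the existing names. The second matches nothing, so it is passed to mv as a literal string, which mv treats as the name of a target directory that does not exist.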

Solutions

mmv

This command works most like the way you are trying to do it. It is not as powerful as rename, but it is simpler.

e.g. mmv '*.assembled.*.fastq.gz' '#1_assembled_#2.fastq.gz'

Here #1 and #2 stand for whatever the first and second * matched.