  1. For example:

    $ ls -l
    total 344
    -r-------- 1 t t 145657 Mar 11 01:53 joeltest-slides.pdf
    -rw-rw-r-- 1 t t 166814 Mar 11 01:55 The Joel Test: 12 Steps to Better Code? by Joel Spolsky.pdf
    drwx-w--w- 2 t t   4096 Sep 19  2012 The Joel Test 12 Steps to Better Code_files
    -rw--w--w- 1 t t  31940 Feb 12  2011 The Joel Test 12 Steps to Better Code.html
    $ mv The\ Joel\ Test\:\ 12\ Steps\ to\ Better\ Code{                                                                                                                 
    \ by\ Joel\ Spolsky.pdf,.pdf}
    mv: missing destination file operand after ‘The Joel Test: 12 Steps to Better Code{’
    Try 'mv --help' for more information.
    

    What does "missing destination file operand" mean? Is it because of the ? in the filename?

  2. In Bash, as far as file operations by builtin or external commands are concerned, what characters are best avoided when naming files? For example,

    Does the above example imply ? is one of them?

    Does my previous post imply that the newline character is one of them?

    Do my previous posts imply that whitespace is one of them?

  3. Is it correct that, from Linux's point of view, there is no restriction on the characters that can be used in filenames? Nor from the filesystem's (ext4) point of view?

Tim

1 Answer


The one absolute rule is that you can't use a slash / or a null byte in a file name. The slash is the directory separator and it can't be escaped. Null bytes indicate the end of the name and can't be escaped either. Apart from that, any character is allowed on Linux (except when accessing media or network resources shared with other filesystems), but a number of characters can cause trouble. I think all modern *BSD also allow any character other than / and null bytes, but some older unices had more restrictions.
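
Because a null byte can never occur in a name, it is the one delimiter that is always safe when passing file names between programs. Here is a minimal sketch, assuming a find that supports -print0 (GNU and BSD find do); the scratch directory and the variable names are made up for illustration:

```shell
# Scratch directory so nothing real is touched.
nuldir=$(mktemp -d)

# One file whose name contains a newline, and one ordinary file.
touch "$nuldir/two
lines.txt" "$nuldir/plain.txt"

# Line-based counting is fooled by the embedded newline: it sees 3 lines.
naive=$(find "$nuldir" -type f | wc -l)

# NUL-delimited counting: keep only the NUL separators and count them (2).
exact=$(find "$nuldir" -type f -print0 | tr -cd '\0' | wc -c)

echo "naive=$naive exact=$exact"
```

The same idea applies to xargs -0 and similar options on tools that offer them.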

If you want a file name to work in common shells without quoting, you need to avoid !"#$&'()*;<=>?[\]^`{|}~ and whitespace (space, tab, newline). ~ is ok if it's at the end. In bash specifically, ^ is ok, # and ~ are ok everywhere except at the beginning, and = is ok except as a command name (because it would be interpreted as an assignment).
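
With quoting, none of these characters is a problem. A small sketch, run against a scratch directory; the file name is the one from the question, and the target name joel-test.pdf is only an illustration:

```shell
# Scratch directory so nothing real is touched.
qdir=$(mktemp -d)

# Single quotes keep every character literal, including ':', '?' and spaces.
name='The Joel Test: 12 Steps to Better Code? by Joel Spolsky.pdf'
touch "$qdir/$name"

# Double quotes around the expansion pass the whole name as one argument.
mv "$qdir/$name" "$qdir/joel-test.pdf"
ls "$qdir"
```

Leaving the quotes off "$qdir/$name" would split the name at each space and treat ? as a wildcard, which is exactly the kind of trouble the list above is about.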

Beyond that, here is some filename portability advice, in rough order of importance.

  • Don't start a file name with - (dash/hyphen). Commands may interpret it as an option.
  • Don't use an initial ~ (tilde) because that means “home directory” in many applications.
  • More generally, don't start or end with a punctuation sign, as a number of applications assign special meanings to those (e.g. |foo meaning “pipe through the program foo” rather than “write to the file |foo”, and likewise with foo| for output).
  • If you're going to exchange files with Windows users, or to put files on removable media, don't use characters that Windows doesn't support: \/?:*"><|
  • Don't use nonprintable characters (e.g. control characters), tabs or newlines. You won't even be able to type them in many interfaces.
  • Some badly written shell scripts may choke on spaces as well as \*?[] because they're wildcards. In addition, some applications that can act on multiple files at once interpret these characters as wildcards.
  • If you're going to exchange files with older computers or with people who speak a different language, especially one written in a non-Latin alphabet, they may use a different character encoding. The ASCII characters are guaranteed to be available everywhere and encoded in the same way.
  • Many applications use the file extension to figure out which files they support and how to open them. The system also uses the extension to determine what application to open the file with. So leave extensions in place. The extension is the part after the last dot, e.g. txt in myfile.txt; sometimes there are multiple extensions, e.g. myfile.txt.gz for a compressed (.gz) text (.txt) file.
  • File names beginning with . are hidden by default in the output of the ls command and in many file browsers.
  • Unix is case-sensitive: myfile is not related to Myfile. Traditionally file names are in lowercase, largely because that makes them easier to type. In the old days, systems usually sorted uppercase letters before lowercase letters, so there's a tradition of starting a file name with a capital letter to make it come first in directory listings, but modern systems often sort names case-insensitively. Sticking to lowercase avoids confusion.
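
If you do end up with a name that starts with a dash, there are two common workarounds. This is only a sketch: most utilities (GNU and BSD coreutils included) honor --, but not every program does, and the name -draft.txt is invented for the example:

```shell
# Scratch directory so nothing real is touched.
dashdir=$(mktemp -d)
(
    cd "$dashdir" || exit 1

    # The ./ prefix hides the leading dash from option parsing.
    touch ./-draft.txt

    # "--" marks the end of options, so -draft.txt is taken as a file name.
    mv -- -draft.txt draft.txt

    ls    # prints: draft.txt
)
```

The subshell keeps the cd from affecting the rest of the session.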

If you'd rather not remember all these complex cases, here are just two simple rules:

  • Maximum safety: stick to letters a-z and digits 0-9, plus - to separate words, and .extension at the end of the file name. For example: my-file.txt
  • More readable: use letters and digits in English or in your own script, plus space or - to separate words, and .extension at the end of the file name. For example: Jörgs Datei.txt
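
The "maximum safety" rule is easy to check mechanically. Below is a rough sketch; is_safe_name is a hypothetical helper, not a standard command, and rejecting empty names and a leading dash is my reading of the advice above rather than part of the rule itself. It does not insist on an extension.

```shell
# Succeeds (status 0) only if the name uses just a-z, 0-9, '-' and '.',
# is not empty, and does not start with a dash.
# The body runs in a subshell so the LC_ALL change stays local.
is_safe_name() (
    LC_ALL=C    # make the a-z range mean plain ASCII letters
    case $1 in
        '' | -* | *[!a-z0-9.-]*) exit 1 ;;
    esac
    exit 0
)

for candidate in 'my-file.txt' 'Jörgs Datei.txt' '-report.txt'; do
    if is_safe_name "$candidate"; then
        printf 'safe:   %s\n' "$candidate"
    else
        printf 'unsafe: %s\n' "$candidate"
    fi
done
```

Only my-file.txt passes; the second name fails on the capital letter, the ö and the space, and the third on the leading dash.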

Final tip: use the YYYYMMDD format (year-month-day, with 4 digits for the year and a leading zero in the month and day number) for dates, e.g. 20150622-report.txt. That way, sorting the file name gives you chronological order.
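
For instance, date can produce the prefix directly, and a plain sort then puts the names in chronological order. The report names other than the one above are invented for the example:

```shell
# Today's date as YYYYMMDD, e.g. 20150622.
stamp=$(date +%Y%m%d)
report="${stamp}-report.txt"
printf '%s\n' "$report"

# Plain lexical sorting of such names is also chronological sorting.
sorted=$(printf '%s\n' 20150622-report.txt 20141231-report.txt 20150101-report.txt | sort)
printf '%s\n' "$sorted"
# prints:
# 20141231-report.txt
# 20150101-report.txt
# 20150622-report.txt
```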

  • It's not "modern systems" which sort things ignoring case - that's about 20 years old. Call it a feature of locales. – Thomas Dickey Mar 12 '16 at 02:14
  • A guideline I've used over the years is, unless I have a very specific reason not to, try to make files that are "C identifiers". That is, as you say here, stick to alphanumerics and the underscore, and try not to make files that are the same as "reserved" words. Every young grasshopper has learned, the hard way, not to name their test file "test", for example. –  Mar 12 '16 at 02:36
  • also, empty filenames aren't allowed. –  Dec 30 '20 at 09:30