26

Is it correct to use certain special characters, such as +, &, ', . (dot) and , (comma), in filenames?

I understand that you can use - and _ with no problem, but doing some research I have been unable to find something definite about the other symbols; some say that you can, some say that you can't, and some others say that it is "not encouraged" to use them (whatever that means).

Anthon
  • 79,293
  • 1
    What programs are you using to work with these files? Only programs that interpret some characters in a special way (e.g. shells on unquoted strings) will give problems. Your average C program takes everything that is not NUL without blinking an eye. – Anthon Sep 16 '14 at 04:33
  • 9
    What do you mean by "correct"? – David Richerby Sep 16 '14 at 12:26
  • The issue with using special characters in a filename is that doing so increases the chance that some buggy piece of code will mishandle the filename. However, I don't think any of the characters you listed are particularly likely to cause any issue. You would have more issues with whitespace, which should generally be avoided. And EOL, specifically, should be avoided at all costs. –  Sep 16 '14 at 19:56
  • Windows has stricter restrictions on what can be in a file name, so if there's any chance that the files will need to be used there, that's something to pay attention to. – evilsoup Sep 17 '14 at 09:20

4 Answers

36

Is it correct to use certain special characters, such as +, &, ', . (dot) and , (comma), in filenames?

Yes.

Correct but not necessarily advisable or convenient.

You can use any characters except for null and / within a filename in modern Unix and Linux filesystems.

You can use ASCII punctuation. Some utilities use stops (dot) and commas in the names of files they create.

You can use ASCII control characters; however, this is inadvisable because they are unlikely to be displayed acceptably and are difficult to type.

You can use shell meta-characters such as ASCII ampersand and ASCII apostrophe. However this is inconvenient and requires that when constructing commands you take special care to quote or escape such characters.

You can use multi-byte characters using a variety of encodings. It is up to the shell and/or utilities to correctly interpret and display non-ASCII characters. It is advisable to restrict yourself to a popular encoding such as UTF-8 and set locale appropriately.

You will have fewest problems using ASCII printable characters, limiting the set of punctuation characters to ones that are not shell meta-characters and not starting a name with a hyphen (or a stop - unless you want to hide the file).
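
A minimal sketch of the quoting described above, assuming a POSIX shell and the mktemp utility. The file names are made up for illustration, and everything happens in a temporary directory:

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)

# Quoting keeps +, &, the comma and the apostrophe away from the shell's parser.
# Double quotes are used here because the paths contain a variable expansion.
touch "$demo_dir/a+b.txt" "$demo_dir/R&D,notes.txt" "$demo_dir/it's.txt"

# Quote every expansion that holds a file name when looping over files.
count=0
for name in "$demo_dir"/*; do
    if [ -e "$name" ]; then
        count=$((count + 1))
    fi
done
echo "$count files created"   # prints "3 files created"

# Clean up; -- marks the end of options for rm.
rm -r -- "$demo_dir"
```

Without the quotes, the shell would treat & as "run in the background" and the apostrophe as the start of a quoted string, so the commands would not do what they appear to say.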

29

As the others have stated, on modern Unix/Linux systems, file names can contain any character except for \0 (NUL) and / (slash).

In addition to that, the POSIX standard defines a portable character set for file names:

3.282 Portable Filename Character Set

The set of characters from which portable filenames are constructed.

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 . _ -

The last three characters are the <period>, <underscore>, and <hyphen> characters, respectively. See also Pathname.

The pathchk utility from GNU Coreutils checks names against this set when called with the -p option. The separate -P option reports empty file names (which are not valid but may be passed as an argument to pathchk) and file names starting with a hyphen (-).
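
As a rough illustration of the character set quoted above, the check can also be sketched in plain POSIX shell. The function name is_portable_name is hypothetical, and this sketch only tests the characters, the empty name and the leading hyphen; it does not reproduce the length checks that pathchk performs:

```shell
# Hypothetical helper: succeeds only if $1 is non-empty, does not start
# with a hyphen, and uses only the POSIX portable filename characters.
is_portable_name() {
    case $1 in
        '' | -*)
            return 1 ;;   # empty name or leading hyphen
        *[!ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-]*)
            return 1 ;;   # at least one character outside the set
    esac
    return 0
}

# Sample names are invented for illustration.
for name in 'report_2014.txt' 'R&D,notes.txt' '-ramesh.txt'; do
    if is_portable_name "$name"; then
        printf 'portable:     %s\n' "$name"
    else
        printf 'not portable: %s\n' "$name"
    fi
done
```

The characters are spelled out rather than written as ranges such as A-Z so that the result does not depend on the locale's collation order.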

8

The safest bet is to refer to the Wikipedia entry for the allowed character set for any operating system. It can be found here.

For instance, for most Unix-based systems, the allowed character set is any 8-bit set, and the reserved characters are / and the null character (NUL, '\0'). However, it is not good practice to use special characters in file names, as they cause problems when you try to remove the files.

For example, suppose I have a file named -ramesh.txt and try to remove it as below.

rm -ramesh.txt
rm: invalid option -- 'a'
Try `rm ./-ramesh.txt' to remove the file `-ramesh.txt'.
Try `rm --help' for more information.
rm "-ramesh.txt"
rm: invalid option -- 'a'
Try `rm ./-ramesh.txt' to remove the file `-ramesh.txt'.
Try `rm --help' for more information.

Instead, I need to delete the file like this:

rm -- "-ramesh.txt"
rm: remove regular empty file `-ramesh.txt'? y
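
Both the -- form shown above and the ./ form that rm itself suggests in its error message can be tried safely in a temporary directory. This is only a sketch, assuming a POSIX shell and mktemp; the names -a.txt and -b.txt are made up:

```shell
demo_dir=$(mktemp -d)

(
    cd "$demo_dir" || exit 1
    touch -- -a.txt      # -- ends option parsing, so -a.txt is taken as a name
    touch ./-b.txt       # a ./ prefix means the argument no longer starts with -
    rm -- -a.txt         # remove using the end-of-options marker
    rm ./-b.txt          # remove using the path prefix
)

# rmdir only succeeds on an empty directory, so it confirms both files are gone.
if rmdir "$demo_dir"; then
    leftover=no
else
    leftover=yes
fi
echo "leftover files: $leftover"   # prints "leftover files: no"
```

Quoting plays no part here: the shell passes the same string to rm with or without quotes, and it is rm that interprets a leading hyphen as an option.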

More details can be found in this answer as well.

In Linux and OS X, I believe only / of the printable ASCII set is prohibited. Some characters (shell metacharacters like *?!) will cause problems in command lines and will require the filename to be appropriately quoted or escaped.

Linux filesystems such as ext2, ext3 are character-set agnostic (I think they just treat it more or less as a byte stream - only nulls and / are prohibited). This means you can store filenames in UTF-8 encoding. I believe it is up to the shell or other application to know what encoding to use to properly convert the filename for display or processing.

So to conclude, the problem is not in using special characters in file names but in how to handle them.

Ramesh
  • 39,297
  • For that reason ("how to handle them"), I almost exclusively use only letters, numbers, underscores, and periods, if only to make my life easier when I later decide I need to use command line programs to do stuff to my files (which seems to always come up at least once). – phyrfox Sep 16 '14 at 07:15
  • 20
    Not to advocate filenames starting with - but just to be precise: 1) you definitely do not need the quotes around this filename, 2) instead of using the special -- argument you may do exactly what rm itself suggests: rm ./-ramesh.txt, so you do not need to do it exactly as you suggest. – Michał Politowski Sep 16 '14 at 08:22
  • @MichałPolitowski Not only do you not need the quotes, they have exactly zero effect. – ctrl-alt-delor Sep 20 '14 at 12:04
  • couldn't you also escape the - ie rm \-ramesh.txt ? – Theresa Forster Jun 06 '20 at 09:31
  • 1
    @TheresaForster: read the comment just above yours. You can escape to no effect (the same string will be passed to the command). The special sense of - is not about shell syntax, it is about conventions used in commands. – Incnis Mrsi Jan 28 '21 at 16:21
4

Your research is almost right. It's possible to use special characters in file names, but it's not advisable since these characters have special meaning. File Naming Conventions in Linux describes other restrictions on file names as well such as "File names should never begin with a hyphen."

Simple example of performing command line operations with special characters in file names.

As a personal note, I'd rather avoid special characters in file names because they require special attention whenever the files are used in any processing. Avoiding them removes the concern of dealing with special characters from the development process.

Simply_Me
  • 1,752
  • 1
    So your advice would be to use only -, _ and . (dot) in filenames? – Chris Klein Sep 16 '14 at 03:33
  • @ChrisKlein, yep, though not in the beginning of the file name. – Simply_Me Sep 16 '14 at 03:40
  • Special meaning is in the program (e.g. your shell), not the filename. Almost all programs on U&L don't care about characters at all as long as there is not a NUL in the filename. – Anthon Sep 16 '14 at 04:30
  • @Anthon, yes, my shell as described in the link. – Simply_Me Sep 16 '14 at 04:41
  • 4
    As a personal note, I'd recommend developers naming parent folder of their project something like "föλder\t☃" - so that they'd immediately notice if they make a bug that breaks on such filenames, instead of publishing broken code or binaries that others have to work around. Using it isn't a problem, as long as it's the only one that starts with 'f', tab-completion in any shell will enter the hard-to-type stuff. – Peteris Sep 16 '14 at 12:06
  • @Peteris completely agree, though it depends on the scope of the script/use case. For example(s): if the script is taking images at 600FPS then sure, though parent folder will be static anyway. If the script is part of an algo running then no point. – Simply_Me Sep 16 '14 at 18:23