51

ls -1 lists my elements like so:

foo.png
bar.png
foobar.png
...

I want it listed without the .png like so:

foo
bar
foobar
...

(the dir only contains .png files)

Can somebody tell me how to use grep in this case?

Purpose: I have a text file where all the names are listed without the extension. I want to make a script that compares the text file with the folder to see which file is missing.

Colin
  • 631
  • What if the directory contains other extensions, such as .jpg and .zip. Should those be shown or suppressed? – Otheus May 18 '16 at 12:18
  • 38
    You want to be careful with a request like this. Linux does not have filename extensions. Linux has filenames that may or may not include a . in them. Although the convention says to name your files with .png at the end, there is no reason why I can't have a png file named foo.zip or my.picture.20160518 or just mypic . – hymie May 18 '16 at 13:16
  • @Otheus In my case it has only the same ending (.png). This makes the problem much easier. – Colin May 18 '16 at 13:53
  • 2
    @hymie I know, but my elements in that folder are all named with .png at the end. – Colin May 18 '16 at 13:56
  • 14
    What's an "extension"? That's not part of Unix file naming; it's carryover from VMS/NT/Windows whatever. And you youngsters get off my lawn, too. :) – mpez0 May 18 '16 at 14:33
  • 34
    Let's not overstate this. The OS treats extensions as simply being part of the file name, but plenty of unix programs pay attention to them, from the compiler to the GUI. The concept is most certainly not foreign to unix. – alexis May 18 '16 at 15:09
  • 1
    It is usually suggested to avoid parsing the output of ls, and to avoid piping the output of ls and find, mainly because file names may contain newline or tab characters. If the filename is "The new art of working on .png\NEWLINE files and other formats", many of the proposed solutions will cause problems. – Hastur May 19 '16 at 00:34
  • @Hastur: But you'll certainly run into plenty more problems with filenames like that. Your advice is a bit like advising against shaking hands when meeting people because they might have ebola. Just get those files renamed. – reinierpost May 19 '16 at 13:51
  • @reinierpost You make it too simple. Working in teams across different countries means that you cannot always change a file name chosen by another member. Even when you can, it often results in a time-consuming multiplication of file versions. What I described really happened: someone extracted images from a PDF for future use and named them by copying and pasting the title into the name field, just adding a progressive number. One of the titles had Image.png followed by a newline, then continued with other words... So why write something you know will not work? – Hastur May 19 '16 at 15:13
  • @Hastur: You should be asking your team member, not me. – reinierpost May 19 '16 at 23:27
  • Colin, we need to know why you want to hide the extension: doing so produces names that no longer correspond to the files. In Windows the extension is often unnecessary; in Linux an extension is just part of the name, so removing it truncates the name and gives a different one (possibly another file, or most probably no file at all). So please give a bit of context, as your question could be an XY problem. Whether it is or not, the context would make this more satisfying for us, since we would then know that the answers we provide are meaningful. – Olivier Dulac May 20 '16 at 09:00
  • @OlivierDulac I have a text file where all the names are listed without the extension. I want to make a PHP script that compares the text file with the folder to see which file is missing. – Colin May 20 '16 at 09:06
  • 1
    @Colin: then you can now edit your question, including that latest comment of yours, so that the "please tell me how to use grep in that case" becomes much clearer ^^ (it wasn't). – Olivier Dulac May 20 '16 at 09:26

15 Answers

66
ls -1 | sed -e 's/\.png$//'

The sed command removes (that is, it replaces with the empty string) any string .png found at the end of a filename.

The . is escaped as \. so that it is interpreted by sed as a literal . character rather than the regexp . (which means match any character). The $ is the end-of-line anchor, so it doesn't match .png in the middle of a filename.
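
As a small sketch of why both the backslash and the $ matter, the following runs three variants of the pattern on sample names. The function names are made up for illustration and are not part of the answer.

```shell
# Hypothetical helpers; each reads file names on stdin, one per line.
strip_png()            { sed -e 's/\.png$//'; }  # literal dot, anchored at the end
strip_png_unescaped()  { sed -e 's/.png$//'; }   # "." matches any character
strip_png_unanchored() { sed -e 's/\.png//'; }   # first ".png" anywhere in the name

printf '%s\n' foo.png a.png.b.png xpng | strip_png
printf '%s\n' xpng | strip_png_unescaped
printf '%s\n' a.png.b.png | strip_png_unanchored
```

With the unescaped pattern the name xpng is reduced to an empty line, and with the unanchored pattern a.png.b.png loses its first .png rather than its last.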

cas
  • 78,579
  • 8
    I think the OP wants any extension stripped, but probably only the "last one". So maybe modify your otherwise good answer with: sed 's/\.[^.]*$//' – Otheus May 18 '16 at 11:52
  • 1
    yes, that regexp would work in that case...but if the OP wants that, they should say so instead of specifically saying they "want it listed without the .png" – cas May 18 '16 at 11:56
  • I asked OP for clarification. – Otheus May 18 '16 at 12:18
  • 4
    The -1 is not necessary, being the default here. – jlliagre May 18 '16 at 13:59
  • I prefer the pure-shell answer better, but upvoted because you actually explained all the parts of the command you used. – gardenhead May 18 '16 at 16:26
  • 3
    @jlliagre I agree with cas that the -1 should be specified. It's only the default when pipe is turned on, which is a hidden surprise for some. So making it explicit helps understanding. I also do this in my scripts so I know what kind of output I'm expecting. – Otheus May 18 '16 at 19:43
  • @cas you were right about OP's question – Otheus May 18 '16 at 19:43
  • @jlliagre yes, but i prefer to be explicit rather than implicit. – cas May 18 '16 at 20:43
  • 1
    Warning: if a filename contains the key (.png) just before a newline character, that .png will be erased too, not only the last one. It is better to avoid piping and parsing the output of ls; it often holds well-hidden surprises... (more explanation and references in the answer). – Hastur May 19 '16 at 00:45
53

You only need the shell for this job.

POSIXly:

for f in *.png; do
    printf '%s\n' "${f%.png}"
done

With zsh:

print -rl -- *.png(:r)
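
One edge case of the POSIX loop: when nothing matches, the pattern *.png is left as a literal string and the loop prints *. A minimal sketch of a guard follows; the function name and its DIR argument are assumptions for illustration.

```shell
# list_png_stems DIR: print each *.png name in DIR without its suffix.
# Hypothetical helper; runs in a subshell so the caller's directory is unchanged.
list_png_stems() (
    cd -- "${1:-.}" || exit 1
    for f in *.png; do
        # With no match the pattern stays literal; skip it instead of printing "*".
        # The -L test keeps dangling symlinks, which -e alone would skip.
        [ -e "$f" ] || [ -L "$f" ] || continue
        printf '%s\n' "${f%.png}"
    done
)
```

Calling list_png_stems with no argument lists the current directory.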
llua
  • 6,900
cuonglm
  • 153,898
21

If you just want to use bash:

for i in *; do echo "${i%.png}"; done

You should reach for grep when trying to find matches, not for removing or substituting text; for that, sed is more appropriate:

find . -maxdepth 1 -name "*.png"  | sed 's/\.png$//'

Once you decide you need to create some subdirectories to bring order in your PNG files you can easily change that to:

find . -name "*.png"  | sed 's/\.png$//'
Anthon
  • 79,293
  • ls -1 | sed 's/.png//' works great. Thank you! – Colin May 18 '16 at 10:52
  • The find piped to sed solution can cause problems if a file has the key (.png) as part of its name just before a newline character. It is better to avoid piping and parsing the output of find or ls; it often holds well-hidden surprises... (more explanation and references in the answer). – Hastur May 19 '16 at 00:51
  • Probably replace find with something like echo in the last example. Not clear what purpose find serves there and results depend on directory structure (i.e. if you have a directory files.png) –  May 19 '16 at 01:46
  • @BroSlow Updated to something more sensible. – Anthon May 19 '16 at 04:35
21

I'd go for basename (assuming the GNU implementation):

basename --suffix=.png -- *.png
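
Where the GNU --suffix option is not available, a sketch using the POSIX form of basename (one name plus an optional suffix per call) could look like this. The function name is hypothetical.

```shell
# png_stems_basename DIR: like the GNU one-liner, but one basename call per file.
# Hypothetical helper; the ./ prefix keeps names starting with "-" from
# being mistaken for options.
png_stems_basename() (
    cd -- "${1:-.}" || exit 1
    for f in ./*.png; do
        [ -e "$f" ] || continue     # nothing matched: pattern left as-is
        basename "$f" .png
    done
)
```

This forks one process per file, so it is slower than the single GNU call on large directories.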
hennr
  • 332
  • Note that if you want to use it in a pipe, you may find it helpful to use GNU basename's -z (or --zero) option to produce NUL-separated (instead of newline-separated) output. – Toby Speight May 20 '16 at 10:15
17

Another very similar answer (I'm surprised this particular variant hasn't appeared yet) is:

ls | sed -n 's/\.png$//p'
  • You don't need the -1 option to ls, since ls assumes it when the standard output isn't a terminal (it's a pipe, in this case).
  • the -n option to sed means ‘don't print the line by default’
  • the /p option at the end of the substitution means ‘...and print this line if a substitution was made’.

The net effect of that is to print out only those lines which end in .png, with the .png removed. That is, this also caters to the slight generalisation of the OP’s question, where the directory doesn't contain only .png files.

The sed -n technique is often useful in cases where you might otherwise use grep+sed.
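
To illustrate the equivalence, here is a sketch of the grep+sed pipeline next to the sed -n form. Both filter names are invented for this example and read one name per line on stdin.

```shell
# Two hypothetical filters that select lines ending in .png and strip the suffix.
png_stems_grep_sed() { grep '\.png$' | sed 's/\.png$//'; }  # select, then edit
png_stems_sed_only() { sed -n 's/\.png$//p'; }              # edit and print only on success

printf '%s\n' foo.png notes.txt bar.png | png_stems_grep_sed
printf '%s\n' foo.png notes.txt bar.png | png_stems_sed_only
```

Both calls print foo and bar and drop notes.txt.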

  • 1
    I like the care you took in writing your answer. This solution will have problems with filenames that include newlines: it will not print the first part of the name. It is worse with a nastier one that has the key (.png) before the newline character: in that case you will print that part without .png, instead of erasing only the last part. It is often suggested to avoid parsing (and piping) the output of ls because the problems can hide just where you are not looking... – Hastur May 19 '16 at 00:59
  • 3
    @Hastur You're correct, in principle, and the famous page about don't parse ls lists further problems (and solutions) when handing pathological filenames. But the best way of handling that is to avoid having pathological filenames (doh!); and if you can't, or if you must be robust against them, then either use find or – possibly better – use a more powerful language than sh to manage them (the fact that sh can do everything doesn't mean that it's the best choice in each case). The shell is designed for usability first. – Norman Gray May 19 '16 at 09:40
  • I agree, in principle, about the usability, but this variant fails when a filename has a newline inside. This can easily happen unnoticed, for example when you copy and paste a line from a PDF in a GUI. So you may only think you have avoided pathological filenames. – Hastur May 19 '16 at 11:23
  • Moreover, IMHO it's easy to start parsing ls, but it is a source of future problems. Often we write scripts that we will use later, when we have already forgotten their limits... (it's human, it's usual). I proposed a find example (with -exec and without a pipe), even though I consider cuonglm's answer better (because it is pure shell), solid and POSIX compliant. – Hastur May 19 '16 at 11:23
  • @Hastur: those future problems will arise anyway. Lots of things in the system aren't robust against files with newlines. E.g. try using locate or make on them. – reinierpost May 19 '16 at 13:56
  • 1
    This is pretty much what I'd do if, for some reason, I wanted to strip the .png suffix from a list of file names. I wouldn't put it into a script; instead I'd just type the command at the shell prompt. Doing so would be a reminder that I'm assuming "sane" file names. There are plenty of things I'll do in a one-off manual command, when I feel free to make assumptions about what's in the current directory, that I probably wouldn't do in a script that might be re-used in some other context. – Keith Thompson May 19 '16 at 20:15
10

You can use only BASH commands to do that (without any external tools).

for file in *; do echo "${file%.*}"; done 

This is useful when you have no /usr/bin available, and it works nicely for filenames like this.is.image.png and for all extensions.

8

wasn't it enough?

ls -1 | sed 's/\.png//g'

or in general, this

ls -1 | sed 's/\.[a-z]*//g'

will remove every lowercase dot-suffix anywhere in the name (so a.b.png becomes a)

7

It is not safe to parse ls or to pipe find[1,2]

It is not safe to parse (or pipe) the output of ls or find, mainly because file names can contain unusual characters such as newlines and tabs. A pure shell loop will work here [cuonglm].
The find command also works when used with the -exec option instead of a pipe:

find ./*.png  -exec  basename {} .png  \;

Updates/Notes: You can use find . to include hidden files, or find ./*.png to get only the non-hidden ones. With find *.png -exec ... you can run into problems if a file name starts with -, because find will take it as an option. You can add -maxdepth 0 to avoid descending into directories with names such as Dir_01.png, or use find ./*.png -prune -exec ... where -maxdepth is not available (thanks Stéphane). If you do not want those directories listed at all, add -type f (which also excludes other non-regular file types). See the man page for a fuller picture of the available options, and check which of them are POSIX compliant for better portability.
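
Putting these notes together, a sketch of the find variant with the ./ prefix, -prune and -type f might look as follows. The function name and the early exit for an empty directory are assumptions added for illustration.

```shell
# png_stems_find DIR: print stems of regular *.png files directly inside DIR.
# Hypothetical helper combining the options discussed in the notes.
png_stems_find() (
    cd -- "${1:-.}" || exit 1
    set -- ./*.png
    # If nothing matched, "$1" is the literal pattern; stop before find complains.
    [ -e "$1" ] || exit 0
    # -prune: do not descend into matching directories; -type f: regular files only.
    find "$@" -prune -type f -exec basename {} .png \;
)
```

A directory named Dir_01.png is neither listed nor searched, so PNG files inside it do not appear.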

Some words more

It can happen, for example, that when a title is copied from a document and pasted into a filename, one or more newlines end up in the filename itself. We can even be unlucky enough that the title contains the very key we want to strip just before a newline:

The new art of working on .png
files and other formats.

If you want to test, you can create file names like this with the commands

touch "A file with two lines"$'\n'"and This is the second.png"
touch "The new art of working on .png"$'\n'"files and other formats.png"

The simple /bin/ls *png will output ? instead of the non printable characters

A file with two lines?and This is the second.png
The new art of working on .png?files and other formats.png

Whenever you pipe the output of ls or find, the command that follows has no way to tell whether the current line starts a new file name or follows a newline character inside the previous one. A nasty name indeed, but still a legal one.
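
The ambiguity can be made visible by counting the same directory in two ways. This is only a sketch: the helper names are invented, and the second count depends on how the local ls writes unusual characters to a pipe.

```shell
# Hypothetical helpers for comparing two ways of counting *.png entries.

# count_png_glob DIR: count entries using the shell's own pathname expansion.
count_png_glob() (
    cd -- "${1:-.}" || exit 1
    n=0
    for f in *.png; do
        [ -e "$f" ] && n=$((n + 1))
    done
    printf '%s\n' "$n"
)

# count_ls_lines DIR: count the lines a pipe reader receives from ls.
count_ls_lines() (
    cd -- "${1:-.}" || exit 1
    ls | wc -l | tr -d ' '
)

demo=$(mktemp -d)
touch "$demo/The new art of working on .png
files and other formats.png"
count_png_glob "$demo"   # one file, whatever its name contains
count_ls_lines "$demo"   # larger when ls passes the newline through unchanged
rm -rf "$demo"
```

An ls that writes the name unmodified to a pipe, as GNU ls does, makes the second count exceed the first.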

A shell loop with a shell parameter expansion, ${parameter%word}, will work in both the printf and the echo variants [cuonglm],[Anthon1].

for f in *.png; do printf "%s\n" "${f%.png}" ; done

From the man page of the Shell Parameter Expansion [3]

${parameter%word}
${parameter%%word}

... the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted.
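
A short sketch of the difference between the two forms, using a made-up file name:

```shell
f='archive.tar.png'         # hypothetical name with two dots
printf '%s\n' "${f%.*}"     # shortest match of ".*" removed from the end
printf '%s\n' "${f%%.*}"    # longest match of ".*" removed from the end
printf '%s\n' "${f%.png}"   # fixed suffix: % and %% behave the same here
```

This prints archive.tar, archive and archive.tar, which is why the fixed suffix .png is the safer choice for names that contain more than one dot.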

Hastur
  • 2,355
  • Also the results of your find command are a bit variable (for example if there is a directory called files.png) –  May 19 '16 at 02:09
  • 1
    Dear @BroSlow, when I wrote the answer above I tried all 13 of the other variants present at that moment, from the command line, in a script, and launched as the argument of a shell invocation. Please do the same and tell me whether they behave the way you expect. I did my tests with bash 4.3.11, dash 0.5.7-4, and zsh (when needed) 5.0.2. Feel free to read this post, which adds something more. I agree with the note about piping the output of find; that is why I expressly suggested -exec and said so in the title. :-). – Hastur May 19 '16 at 08:58
  • Re-read the wiki again. I still think you need to pipe in your example, since that's what's being discussed here. And for the majority of modern versions of ls there is no issue whatsoever when the output is piped or redirected, but as mentioned in wiki it may not work for all. Most will only insert the ? in place of special characters when the output is sent to terminal. i.e. Do echo *.png | od -c and ls *.png | od -c. The newline issue is not an issue with ls, it's an issue with any command that doesn't null terminate across both sides of the pipe. –  May 19 '16 at 14:33
  • 1
    printf "${f%.png}\n" is wrong. The first argument is the format; you shouldn't put variable data in there. It can even be seen as a DoS vulnerability (try with a %1000000000s.png file for instance). – Stéphane Chazelas May 19 '16 at 15:25
  • You'd need find ./*.png -prune -exec... or you'd have problems with filenames starting with - (and files of type directory, note that -maxdepth is not portable) – Stéphane Chazelas May 19 '16 at 15:26
  • @StéphaneChazelas Thanks, for prune I often risk to remain closed in my little gnu world... but even more for the DoS, I didn't think about it. – Hastur May 19 '16 at 15:55
4

Use rev:

ls -1 | rev | cut -f 2- -d "." | rev

rev reverses each line; cut then keeps everything after the first '.' of the reversed line (that is, it drops what follows the last '.' of the original name), and the second rev restores the original order.

If you want to grep 'alma':

ls -1 | rev | cut -f 2- -d "." | rev | grep 'alma'
Tom Solid
  • 256
3

If I knew the directory only had files with .png as an extension, I would have just run: ls | awk -F. '{print $1}'

This will return the first "field" for anything where there is a filename.extension.

Example:

[rsingh@rule51 TESTDIR]$ ls
10.png  1.png  2.png  3.png  4.png  5.png  6.png  7.png  8.png  9.png

[rsingh@rule51 TESTDIR]$ ls | awk -F. '{print $1}'
10
1
2
3
4
5
6
7
8
9
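
For names that contain more than one dot, printing $1 keeps too little. A hedged sketch that removes only the last dot-suffix with awk's sub() is shown below; the filter name is made up, and it still assumes one file name per line.

```shell
# strip_last_ext: hypothetical filter that drops the final ".something" from each line.
strip_last_ext() { awk '{ sub(/\.[^.]*$/, ""); print }'; }

printf '%s\n' 10.png harry.the.bunny.png | strip_last_ext
```

This prints 10 and harry.the.bunny; a name without any dot passes through unchanged.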
rsingh
  • 31
  • Unfortunately it will fail on all filenames with more than one ., such as Image.1.png, and also on names containing special characters such as the newline or whatever you use as the (input) record separator, RS, in awk. It is suggested to avoid parsing the ls output because it tends to hide problems that surface when you least expect them. You can read more in references 1 or 2. BTW, nice idea to use awk... I put some examples in one answer. – Hastur May 19 '16 at 08:38
  • True, however, given the sample provided by Colin it would work just fine.

    To make it work for the case you suggested, I'd probably change it to:

    [rsingh@rule51 TESTDIR]$ ls | sed -e 's/.png$//'
    10 1 2 3 4 5 6 7 8 9 harry.the.bunny whats.a.png.filename

    Not trying to be difficult, but given Colin's need, I'm not sure what the issue would be parsing ls.

    – rsingh May 19 '16 at 14:06
  • sorry...

    I just realized I didn't show the directory with the files prior to sed modifying the output of 'ls'

    [rsingh@rule51 TESTDIR]$ ls
    10.png 2.png 4.png 6.png 8.png harry.the.bunny.png 1.png 3.png 5.png 7.png 9.png whats.a.png.filename.png

    [rsingh@rule51 TESTDIR]$ ls | sed -e 's/.png$//'
    10 1 2 3 4 5 6 7 8 9 harry.the.bunny whats.a.png.filename

    – rsingh May 19 '16 at 14:25
  • Note 1: you need to escape the . as \. in sed -e 's/\.png$//', but then it becomes an answer that has already been written. :-( Note 2: you can try awk with something like ls | awk -F. '{if ($NF=="png") {for (i=1;i<NF-1;i++) printf("%s.", $i) ; printf $(NF-1)"\n"}}'... but you will always have the problem that awk cannot know whether the incoming line follows a newline inside a file name. I tried to explain this better in my answer. – Hastur May 19 '16 at 14:53
  • Thanks Hastur, I missed that :).

    Also, I ditched the use of awk in favor of sed in this case.

    – rsingh May 19 '16 at 17:02
2

according to your comment "I have a text file where all the names are listed without the extension. I want to make a PHP script that compares the text file with the folder to see which file is missing":

for file in $(cat yourlist) ; do
  [ -f "${file}.png" ] || {
    echo "$file : listed in yourlist, but missing in the directory"
  }
done
#assumes that the names contain no whitespace or glob characters...
# otherwise use instead:
#  while IFS= read -r file ; do ...(same inner loop as above)... ; done < yourlist

and the reverse:

for file in *.png ; do
  # -x matches whole lines and -F treats the name as a fixed string, not a regex
  grep -qxF -- "${file%.png}" yourlist || {
    echo "$file: present in the directory but not listed in yourlist"
  }
done
#I assume there are no spaces/tabs/? before/after names in 'yourlist'. Change the script accordingly if there are some (or sanitize the list)
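
Both directions can also be wrapped into small functions. The following is a sketch only: the function names, the argument order (list file first, directory second) and the handling of blank lines are assumptions, not part of the answer.

```shell
# missing_files LIST DIR: print names from LIST that have no DIR/NAME.png.
missing_files() {
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue                    # ignore blank lines
        [ -e "$2/$name.png" ] || printf '%s\n' "$name"
    done < "$1"
}

# unlisted_files LIST DIR: print stems of DIR/*.png that do not appear in LIST.
unlisted_files() {
    for f in "$2"/*.png; do
        [ -e "$f" ] || continue                       # no match: pattern left literal
        stem=${f##*/}                                 # drop the directory part
        stem=${stem%.png}
        # -x: whole-line match, -F: fixed string, so dots are not regex wildcards
        grep -qxF -- "$stem" "$1" || printf '%s\n' "$stem"
    done
}
```

A call such as missing_files yourlist . prints one missing name per line, which is easy to consume from another script.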
1

ls -1 | sed 's/\.png$//'

is the most accurate method, as highlighted by @roaima. Without the escaped \. and the $ anchor (that is, with sed 's/.png//'), a file named a_png.png would be listed as a.png.

Chris Davies
  • 116,213
  • 16
  • 160
  • 287
aphorise
  • 261
1

A simple shell line (ksh, bash or zsh; not dash):

set -- *.png; printf '%s\n' "${@%.png}"

A simple function (from No Extension):

ne(){ set -- *.png; printf '%s\n' "${@%.png}"; }

Or a function that removes any given extension (png by default):

ne(){ ext=${1:-png}; set -- *."$ext"; printf '%s\n' "${@%.${ext}}"; }

Use as:

ne jpg

If the output is a single asterisk *, no file with that extension exists.

1

You can try the following: feed the output of ls to awk with "." as the field separator, and since all your files are named name.png, print the first column:
ls | awk -F"." '{print $1}'

igiannak
  • 750
-1

If you have access to sed, this is better as it will strip the last file extension, no matter what it is (png, jpg, tiff, etc...)

ls | sed -e 's/\.[^.]*$//'