linux command using trim / cut / sed to cut few data in a file

Question

I have content like the following in /tmp/myfileslist:

test1/a/sample1.xls
test2/demo.sh

I want to remove the extensions and everything before the slash, along with the slash itself. I want the output to be:

sample1
demo

I tried this command: cat /tmp/myfileslist | cut -d/ -f 2. Output: "a" and "demo.sh". – afrin, Jun 01 '23 at 11:14
What about test3/foo.tar.gz? Do you want foo or foo.tar? — Philippos, Jun 01 '23 at 11:18
Also, is there always an extension, or can there be files without extension (as in test4/d/noext)? — AdminBee, Jun 02 '23 at 08:35

αғsнιη · Accepted Answer · 2023-06-01T11:43:46.633

3

With awk (assuming there are no repeated dot suffixes in your records, such as /path/to/some.example.txt, since it would then return only the "example" part):

awk -F'[/.]' '{ print $(NF-1) }' infile

If you do have records like that, use the following instead:

awk -F'/' '{ sub(/\.[^.]*$/, ""); print $NF }' infile
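
To see where the two approaches differ, here is a small sketch that runs both on the same input. The function names are made up for illustration, and the sample paths are the ones from the question and its comments. The input is supplied with printf instead of a file.

```shell
# Hypothetical helper names, for illustration only.

# Split on both "/" and "." and print the second-to-last field.
name_by_fields() {
  awk -F'[/.]' '{ print $(NF-1) }'
}

# Strip only the last ".suffix" from the line, then print the last "/" field.
name_by_sub() {
  awk -F'/' '{ sub(/\.[^.]*$/, ""); print $NF }'
}

printf '%s\n' test1/a/sample1.xls test3/foo.tar.gz | name_by_fields
# sample1
# tar
printf '%s\n' test1/a/sample1.xls test3/foo.tar.gz | name_by_sub
# sample1
# foo.tar
```

With a single extension both give the same result; with foo.tar.gz the field-based version returns the middle part, while the sub-based version keeps everything before the last dot.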

edited Jun 01 '23 at 11:43

answered Jun 01 '23 at 11:23


score 2 · Answer 2 · answered Jun 01 '23 at 11:21

2

Your cut approach has the problem that the number of the field you need changes from line to line.
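
A quick way to see this, as a sketch using the two sample lines from the question (the helper name is made up):

```shell
# Hypothetical helper: print how many "/"-separated fields each line has.
count_slash_fields() {
  awk -F/ '{ print NF }'
}

printf '%s\n' test1/a/sample1.xls test2/demo.sh | count_slash_fields
# 3
# 2

# A fixed field number therefore picks different things on each line:
printf '%s\n' test1/a/sample1.xls test2/demo.sh | cut -d/ -f2
# a
# demo.sh
```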

Also note that "you shall not pipe cats"; instead, give the filename as an argument to your text processing command.

Do it in two steps: remove everything up to the slash (.*/) and then everything starting from the dot (\..*):

 sed 's_.*/__;s_\..*__' /tmp/myfileslist

(This assumes you want to remove all extensions and you only want the foo of foo.tar.gz.)
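
As a sketch of how the two substitutions behave, the following wraps them in a function (the name is hypothetical) and feeds sample lines with printf instead of reading the real file:

```shell
# Hypothetical wrapper around the two-step substitution.
strip_path_and_exts() {
  # 1st command: delete everything up to the last slash (".*/" is greedy).
  # 2nd command: delete everything from the first remaining dot onward.
  sed 's_.*/__;s_\..*__'
}

printf '%s\n' test1/a/sample1.xls test2/demo.sh test3/foo.tar.gz | strip_path_and_exts
# sample1
# demo
# foo
```

The underscore is used as the delimiter of the s command so that the slash in the pattern does not need escaping.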

answered Jun 01 '23 at 11:21

Philippos


I should add, that if you only want to remove the last extension, the second substitute command turns into s_\.[^.]*$__ – Philippos Jun 01 '23 at 11:32
Thank you so much this command is working awk -F'[/.]' '{ print $(NF-1) }' /tmp/mytestfiles – afrin Jun 01 '23 at 12:10

Thor · Answer 3 · 2023-06-01T12:00:30.240

1

cut

You can grab the last element with cut if you reverse each line first, e.g.:

<filelist.txt rev | cut -d/ -f1 | rev

Now you can remove the filename extension like this:

<filelist.txt rev | cut -d/ -f1 | rev | cut -d. -f1
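
The intermediate stages make the trick easier to follow. This is only a sketch: the helper name is made up, and the input comes from printf, not filelist.txt.

```shell
# Hypothetical helper: last path element via rev | cut | rev.
last_element() {
  rev | cut -d/ -f1 | rev
}

# Reversing the line puts the file name in the first "/" field.
printf '%s\n' test1/a/sample1.xls | rev
# slx.1elpmas/a/1tset

printf '%s\n' test1/a/sample1.xls | last_element
# sample1.xls

# Keeping only the first "." field drops the extension.
printf '%s\n' test1/a/sample1.xls | last_element | cut -d. -f1
# sample1
```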

bash

Besides cut and sed, you could use bash parameter expansion to remove the path and the filename extension, e.g.:

while IFS= read -r f; do
  f="${f##*/}"
  f="${f%.*}"
  printf '%s\n' "$f"
done < filelist.txt

Hint: use ${f%%.*} to remove all extensions.
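
The difference between the two expansions shows up on names with more than one dot. A minimal sketch, with hypothetical function names and sample paths taken from the comments under the question:

```shell
# Hypothetical function names; the expansions themselves are standard.
strip_last_ext() {
  f=${1##*/}                 # remove the longest prefix ending in "/" (the path)
  printf '%s\n' "${f%.*}"    # remove the shortest suffix starting with "."
}

strip_all_exts() {
  f=${1##*/}
  printf '%s\n' "${f%%.*}"   # remove the longest suffix starting with "."
}

strip_last_ext test3/foo.tar.gz   # foo.tar
strip_all_exts test3/foo.tar.gz   # foo
strip_last_ext test4/d/noext      # noext (no dot, so nothing is removed)
```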

edited Jun 01 '23 at 12:00

answered Jun 01 '23 at 11:51


1

As one might have files with dots in the name (e.g. file.123.suf), I'd suggest doing the suffix removal cutting in the reversed form: <filelist.txt rev | cut -d/ -f1 | cut -d'.' -f1 --complement | rev – FelixJN Jun 01 '23 at 14:18

jubilatious1 · Answer 4 · 2023-06-01T19:34:47.033

Using Raku (formerly known as Perl_6)

~$ raku -ne 'put .IO.extension("").basename;'  file
#OR (below handles up to 8-part extensions):
~$ raku -ne 'put .IO.extension("", :parts(^9)).basename;'  file

Sample Input:

/test1/a/sample1.xls
/test2/demo.sh
/some/file.txt
/whatever/prog.c
/something/abc.tar.bz
/something/abc.123.456.789.tar.bz
/something/abc.c
/something/abc.h
/path/to/file.10.5.2.tar.gz
/path/to/file.10.5.2.tar.gz.whatever
/path/to/file.10.5.2.tar.gz.whatever.7.pdf
/noextension

Sample Output:

sample1
demo
file
prog
abc
abc
abc
abc
file
file
file
noextension

Briefly, the file is read line by line using the -ne (non-autoprinting, linewise) flags. The code runs on each line. First, the path is interpreted as an IO object, for which an extension can be identified or modified. In the call to extension, the identified parts are replaced with "" (i.e. deleted). Adding the :parts parameter (a.k.a. "adverb") allows multi-part file extensions to be identified. Finally, basename isolates the file name, removing the rest of the path (the slash and everything above it).

Note that because Raku interprets file paths using OS-specific settings, the code above should work unmodified on Windows to extract the correct elements from Windows paths (Raku understands backslash as a path separator on Windows).

https://docs.raku.org/type/IO/Path
https://docs.raku.org/routine/basename
https://docs.raku.org/routine/extension
https://raku.org

Example Source:
https://unix.stackexchange.com/a/731665/227738
