I have content like the following in /tmp/myfileslist:
test1/a/sample1.xls
test2/demo.sh
I want to remove the extensions and everything up to and including the last slash. The output I want is:
sample1
demo
With awk (this assumes there are no repeated dot suffixes in your records, such as /path/to/some.example.txt, since it would then return only the "example" part):
awk -F'[/.]' '{ print $(NF-1) }' infile
If you do have records like that, use the following instead (note the escaped dot, so that only a literal dot starts the suffix):
awk -F'/' '{ sub(/\.[^.]*$/, ""); print $NF }' infile
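To make the difference between the two awk commands concrete, here is a minimal sketch that runs both over a small made-up sample. The variable names and the sample data are assumptions for illustration, and the second command is written with the dot escaped as `\.`.

```shell
# Hypothetical sample; the third record has two dot-separated suffixes.
sample=$(printf '%s\n' 'test1/a/sample1.xls' 'test2/demo.sh' 'path/to/some.example.txt')

# Split on both / and . and print the second-to-last field.
by_field=$(printf '%s\n' "$sample" | awk -F'[/.]' '{ print $(NF-1) }')

# Strip only the last suffix, then print the last /-separated field.
by_sub=$(printf '%s\n' "$sample" | awk -F'/' '{ sub(/\.[^.]*$/, ""); print $NF }')

printf 'field split:\n%s\n' "$by_field"
printf 'suffix strip:\n%s\n' "$by_sub"
```

For the third record the first command prints example, while the second prints some.example.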
Your cut approach has the problem that the field number changes from line to line.
Also note that "you shall not pipe cats"; instead, give the filename as an argument to your text-processing command.
Do it in two steps: remove everything up to the last slash (.*/), and then everything starting from the first remaining dot (\..*):
sed 's_.*/__;s_\..*__' /tmp/myfileslist
(This assumes you want to remove all extensions and you only want the foo of foo.tar.gz.)
To remove only the last extension, the second substitute command turns into s_\.[^.]*$__
– Philippos
Jun 01 '23 at 11:32
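As a quick illustration of the two sed variants discussed here, the sketch below wraps each in a small function and feeds it a multi-extension name. The function names are hypothetical and only serve the example.

```shell
# Hypothetical helpers wrapping the two sed scripts; both read from stdin.
strip_all_ext()  { sed 's_.*/__;s_\..*__'; }      # cut at the first dot
strip_last_ext() { sed 's_.*/__;s_\.[^.]*$__'; }  # cut at the last dot

printf '%s\n' 'test3/foo.tar.gz' | strip_all_ext   # prints foo
printf '%s\n' 'test3/foo.tar.gz' | strip_last_ext  # prints foo.tar
```

Which one is right depends on whether foo or foo.tar is wanted for an input such as test3/foo.tar.gz.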
You can grab the last element with cut if you reverse each line first, e.g.:
<filelist.txt rev | cut -d/ -f1 | rev
Now you can remove the filename extension like this:
<filelist.txt rev | cut -d/ -f1 | rev | cut -d. -f1
Besides cut and sed, you could use bash parameter expansion to remove the path and the filename extension, e.g.:
while IFS= read -r f; do
f="${f##*/}"
f="${f%.*}"
printf '%s\n' "$f"
done < filelist.txt
Hint: use ${f%%.*} to remove all extensions.
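The difference between % and %% can be seen in a small sketch. The two function names below are hypothetical helpers, not part of the loop above; they apply the same expansions to a single argument.

```shell
# Hypothetical helper: strip the directory part and the last extension.
strip_last() {
    f=${1##*/}                 # remove the longest prefix ending in /
    printf '%s\n' "${f%.*}"    # remove the shortest suffix starting with .
}

# Hypothetical helper: strip the directory part and all extensions.
strip_all() {
    f=${1##*/}
    printf '%s\n' "${f%%.*}"   # remove the longest suffix starting with .
}

strip_last 'test3/foo.tar.gz'   # prints foo.tar
strip_all  'test3/foo.tar.gz'   # prints foo
strip_last 'test4/d/noext'      # prints noext (no dot, nothing removed)
```

These expansions are handled by the shell itself, so no external process is started per line.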
If a filename may contain several dots (e.g. file.123.suf), I'd suggest doing the suffix removal in the reversed form: <filelist.txt rev | cut -d/ -f1 | cut -d'.' -f1 --complement | rev
– FelixJN
Jun 01 '23 at 14:18
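A short sketch comparing the two rev/cut pipelines on a name with several dots. It assumes GNU cut (for --complement) and a rev utility are available; the function names are hypothetical.

```shell
# Hypothetical wrappers; both read paths from stdin.
# Keeps only the text before the first dot of the filename.
cut_first_dot() { rev | cut -d/ -f1 | rev | cut -d. -f1; }

# Drops only the last suffix by cutting while the line is still reversed.
# Assumes GNU cut, since --complement is not in POSIX cut.
cut_last_suffix() { rev | cut -d/ -f1 | cut -d. -f1 --complement | rev; }

printf '%s\n' 'path/file.123.suf' | cut_first_dot    # prints file
printf '%s\n' 'path/file.123.suf' | cut_last_suffix  # prints file.123
```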
Using Raku (formerly known as Perl_6)
~$ raku -ne 'put .IO.extension("").basename;' file
#OR (below handles up to 8-part extensions):
~$ raku -ne 'put .IO.extension("", :parts(^9)).basename;' file
Sample Input:
/test1/a/sample1.xls
/test2/demo.sh
/some/file.txt
/whatever/prog.c
/something/abc.tar.bz
/something/abc.123.456.789.tar.bz
/something/abc.c
/something/abc.h
/path/to/file.10.5.2.tar.gz
/path/to/file.10.5.2.tar.gz.whatever
/path/to/file.10.5.2.tar.gz.whatever.7.pdf
/noextension
Sample Output:
sample1
demo
file
prog
abc
abc
abc
abc
file
file
file
noextension
Briefly, the file is read line by line using the -ne (non-autoprinting, linewise) flags. The code is run over each line: first the path is interpreted as an IO object, for which an extension can be identified and modified. In the call to extension, the identified parts are replaced with "" (i.e. deleted). Adding the :parts parameter (a.k.a. "adverb") allows multi-part file extensions to be identified. Finally, the basename is isolated, removing the directory part of the path up to and including the last slash.
Note: because file paths are interpreted by Raku with OS-specific settings, the code above should work unmodified on Windows to extract the correct elements from Windows paths (Raku understands backslash as a path separator on Windows).
https://docs.raku.org/type/IO/Path
https://docs.raku.org/routine/basename
https://docs.raku.org/routine/extension
https://raku.org
Example Source:
https://unix.stackexchange.com/a/731665/227738
test3/foo.tar.gz? Do you want foo or foo.tar?
– Philippos
Jun 01 '23 at 11:18
test4/d/noext)?
– AdminBee
Jun 02 '23 at 08:35