Using a * glob to remove files is not recommended: it could match more than you intend.
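One way to see the risk is to expand the pattern without handing it to rm. The following is only a sketch: the file names and the pattern *ssolve* are made up for illustration, and everything happens in a scratch directory created with mktemp.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d) || exit 1
trap 'cd / && rm -rf "$dir"' EXIT
cd "$dir" || exit 1

# A variable holding a single newline character.
nl='
'
# Hypothetical names: one unwanted file plus two that should be kept.
touch "${nl}ssolve" "ssolve.bak" "resolve-ssolve.txt"

# Expand the pattern into the positional parameters instead of passing it to rm.
set -- *ssolve*
matched=$#
printf 'the pattern would remove %d files\n' "$matched"
```

Here the glob picks up all three names, although only the first one was meant to go.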
Since you are on Debian, the ls command (from GNU) can print file names in quoted form[1]:
$ ls -Q
"\nssolve" "\n\nssolve" "y" "z"
Or, even better, list files with quoted names and inodes:
$ ls -iQ
26738692 "\nssolve" 26738737 "\n\nssolve" 26738785 "y" 26738786 "z"
Then, use find with the inode number to ensure that only the intended file is removed:
$ find . -xdev -inum 26738737 -exec rm -i {} \;
The call to find is limited to one filesystem (-xdev) to avoid matching a file on another filesystem that happens to have the same inode number.
Note also that rm is called with the -i (interactive) option, so it will prompt for confirmation before erasing each file.
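The steps above can be tried safely in a scratch directory. This is a minimal sketch, not a definitive script: the file names mirror the listing above, the inode is read from the first field of ls -i (an assumption about its output layout), and -i is dropped from rm only so the example runs unattended. Keep in mind that find -inum matches every hard link to that inode on the filesystem.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d) || exit 1
trap 'cd / && rm -rf "$dir"' EXIT
cd "$dir" || exit 1

# A variable holding a single newline character.
nl='
'
# Recreate names like the ones in the listing above.
touch "${nl}ssolve" "${nl}${nl}ssolve" y z

# The file we want to delete, and its inode number.
# ls -i prints the inode first; awk keeps the first field of the first line only.
target="${nl}${nl}ssolve"
inum=$(ls -di -- "$target" | awk '{print $1; exit}')

# Remove by inode; -xdev keeps find on the current filesystem.
# (Use rm -i instead of rm when running this by hand.)
find . -xdev -inum "$inum" -exec rm -- {} \;

# Count what is left without relying on line-based output.
set -- *
remaining=$#
printf 'files left: %d\n' "$remaining"
```

Counting with set -- * rather than ls | wc -l matters here, because names that contain newlines would be counted more than once.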
[1] Note that this does not solve the problem of visually confusing characters such as a Cyrillic а ($'\U430') and a Latin a ($'\U61'), which look exactly the same but are different characters. To get a better look at the bytes that a filename uses, we need a hex viewer or a byte dump tool such as od:
$ touch а a é $'e\U301' $'\U301'e
$ ls
a ́e é é а # what you "see" here depends on your system.
$ printf '<%s>' * | od -An -c
< a > < 314 201 e > < e 314 201 > < 303 251
> < 320 260 >
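To check one name at a time instead of the whole directory, the same idea can be wrapped in a small helper. This is a sketch under assumptions: hex_of is a hypothetical function name, the two file names are built from their UTF-8 bytes with printf, and the order in which the loop prints them depends on the locale.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d) || exit 1
trap 'cd / && rm -rf "$dir"' EXIT
cd "$dir" || exit 1

# hex_of (hypothetical helper): print the bytes of its argument as hex digits.
hex_of() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Two names that render alike: Latin a (byte 61) and Cyrillic а (bytes d0 b0 in UTF-8).
latin=$(printf '\141')
cyrillic=$(printf '\320\260')
touch "$latin" "$cyrillic"

# Show every name in the directory next to its bytes.
for f in *; do
    printf '%s -> %s\n' "$f" "$(hex_of "$f")"
done
```

The two names look identical on screen, but the hex column tells them apart.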
?, it's a non-printable character that ls renders as ?. What's the output of ls | sed -n l? – Stéphane Chazelas Aug 21 '18 at 16:39
rm "?FileName" or find . -name "\?*" -exec rm {} \; rm has not regex type – Hossein Vatani Aug 21 '18 at 16:43
rm does not parse ? or *, but the shell expands these globs before rm executes. – Kevin Kruse Aug 21 '18 at 16:45
ls -Q would work better in Debian. – Aug 21 '18 at 18:16
LC_ALL=C ls -Q as there are plenty of Unicode characters that would result in ambiguous output with GNU ls -Q. Those should be OK with GNU sed -n l, but you would also need LC_ALL=C with some other sed implementations. It's true that newline characters would be a problem with ls | sed -n l. – Stéphane Chazelas Aug 22 '18 at 19:47
touch $'\ue9' $'e\u301' $'foo\u200bbar' foobar; ls -Q and the many other "invisible" characters, or the many characters that look the same or are the same but meant to be used in different contexts (like U+00C5 vs U+212B or the mathematical letters...). – Stéphane Chazelas Aug 22 '18 at 22:58
-Q is to expose what is the ? encoding. This works reasonably well for humans, where a human could differentiated the filenames. That human language has confusing characters (like a Cyrillic а ($'\U430') and a Latin a ($'\U61')) that may "look" exactly the same is a different problem for which there is no simple solution. In any case, I find \303\251 much more difficult to process visually than é in everyday use. Programs and scripts do not have that problem. – Aug 23 '18 at 00:00
? with ls. So, ls -Q is not a solution for your problem. – Aug 23 '18 at 00:17
ls options --show-control-chars, --hide-control-chars (-q) and --escape (-b). – Volker Siegel Aug 23 '18 at 02:01
$'\ufeff\nssolveIncpUL46pK.txt' (with a UTF-8 BOM as sometimes found at the start of strings coming from the Microsoft world) which would show as ?ssolveIncpUL46pK.txt (and "\nssolveIncpUL46pK.txt" with ls -Q and "\357\273\277\nssolveIncpUL46pK.txt" with LC_ALL=C ls -q) but not match the ?ss* as there are two characters before ss. – Stéphane Chazelas Aug 23 '18 at 06:35
ls -Q (not -q as in your comment as it means something different) and (3) If both tests fail, then use something like ls | od -c or even ls | od -tx1c as a tool of last resort. I really dislike the use of LC_ALL=C for everything. – Aug 23 '18 at 18:00