3

I have a large number of files whose names contain backslashes (\) that I would like to manipulate, but whenever I try something like:

$ ls -li
2036553851 -rw-rw-r-- 1 user user 6757 May 20 00:10 Simplex_config\\B1:B3\\_1.csv
2036553766 -rw-rw-r-- 1 user user 6756 May 20 00:07 Simplex_config\\B1:B3\\_2.csv
2036554099 -rw-rw-r-- 1 user user 6785 May 20 00:20 Simplex_config\\B1:B3\\_3.csv
2036553974 -rw-rw-r-- 1 user user 6785 May 20 00:15 Simplex_config\\B1:B3\\_4.csv

$ find . -type f -name 'Simplex*.csv' | xargs cat > looksee.txt

I receive a "No such file or directory" error. I have considered renaming the files first and then manipulating them, but I am curious whether there is an easier solution using the inode.

I came up with:

#!/bin/sh

if [ -f looksee.txt ]; then
   rm -rf looksee.txt
fi

ls -i Simplex_config*.csv | awk '{ print $1 }' > inode_list.txt

while IFS= read -r inode;
do
  find . -inum $inode -exec cat {} \; >> looksee.txt
done < inode_list.txt

But this is very cumbersome and I would like to try to find a way to parse the output from ls -i Simplex_config*.csv and pipe it to another command in a one-liner -- is there such an option available?

mlegge
  • 2
    Try find . -type f -name 'Simplex*.csv' -print0 | xargs -0 cat > looksee.txt or even find . -type f -name 'Simplex*.csv' -exec cat {} + > looksee.txt – Costas May 20 '15 at 13:26
  • @Costas perfect, would you mind making that an answer and explaining why the -print0/-0 in the first and the + in the second make this possible? – mlegge May 20 '15 at 13:30
  • And if ls Simplex_config*.csv gives you enough recursion, why not just use cat Simplex_config*.csv > looksee.txt? – Costas May 20 '15 at 13:31
  • @Costas in the toy example this would be enough, but in the true situation I would have to resolve recursion. Thank you for your answers! – mlegge May 20 '15 at 14:00

3 Answers

3

1.

find . -type f -name 'Simplex*.csv' -print0 | xargs -0 cat > looksee.txt

From man xargs

--null
-0
Input items are terminated by a null character instead of by whitespace, and the quotes and backslash are not special (every character is taken literally). Disables the end of file string, which is treated like any other argument. Useful when input items might contain white space, quote marks, or backslashes. The GNU find -print0 option produces input suitable for this mode.
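
To make the difference concrete, here is a minimal sketch that recreates the problem in a scratch directory. The directory, the file contents, and the output file names (plain.txt, plain.err, null.txt) are made up for illustration, and the file names use single backslashes. It assumes a find and xargs that support -print0 and -0.

```shell
#!/bin/sh
# Work in a throwaway directory so no real data is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Two files whose names contain literal backslashes (illustrative contents).
printf 'one\n' > 'Simplex_config\B1:B3\_1.csv'
printf 'two\n' > 'Simplex_config\B1:B3\_2.csv'

# Default mode: xargs treats each backslash as an escape character and
# removes it, so cat is handed names that do not exist.
find . -type f -name 'Simplex*.csv' | xargs cat > plain.txt 2> plain.err
plain_status=$?

# NUL-delimited mode: every character of the name is passed through literally.
find . -type f -name 'Simplex*.csv' -print0 | xargs -0 cat > null.txt
null_status=$?

printf 'default xargs exit status: %s\n' "$plain_status"
printf 'xargs -0 exit status:      %s\n' "$null_status"
sort null.txt
```

The first pipeline should exit non-zero and leave the "No such file or directory" messages in plain.err, while the second concatenates both files.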

2.

find . -type f -name 'Simplex*.csv' -exec cat {} + > looksee.txt

From man find

-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ; is encountered. The string {} is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find. Both of these constructions might need to be escaped (with a \) or quoted to protect them from expansion by the shell. The specified command is run once for each matched file. The command is executed in the starting directory. There are unavoidable security problems surrounding use of the -exec action; you should use the -execdir option instead.

-exec command {} +
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of {} is allowed within the command. The command is executed in the starting directory.
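
The difference between the two forms can be seen by counting how often the command is started. The following is only a sketch in a scratch directory with made-up file contents; the small sh -c wrapper exists purely to report how many file names each invocation received.

```shell
#!/bin/sh
# Work in a throwaway directory so no real data is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Three files with backslashes in their names (illustrative contents).
for n in 1 2 3; do
  printf 'row %s\n' "$n" > "Simplex_config\\B1:B3\\_${n}.csv"
done

# -exec ... \; starts the command once per matched file.
per_file=$(find . -type f -name 'Simplex*.csv' \
  -exec sh -c 'echo "called with $# file(s)"' sh {} \; | grep -c 'called')

# -exec ... {} + appends as many names as fit onto one command line.
batched=$(find . -type f -name 'Simplex*.csv' \
  -exec sh -c 'echo "called with $# file(s)"' sh {} + | grep -c 'called')

printf 'invocations with the ; form: %s\n' "$per_file"
printf 'invocations with the + form: %s\n' "$batched"

# find passes each name to cat as its own argument, with no re-parsing,
# so the backslashes cause no trouble.
find . -type f -name 'Simplex*.csv' -exec cat {} + > looksee.txt
```

With three small files this should report three invocations for the ; form and one for the + form.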

3.

cat Simplex_config* > looksee.txt

This is enough if all the files sit in a single directory and no recursion is needed.

Costas
2

You cannot access files by inodes, because that would break access control via permissions. For example, if you don't have the permission to traverse a directory, then you can't access any of the files in that directory no matter what the permissions on the file are. If you could access a file by inode, that would bypass directory permissions.

Thus, while you can obtain a file's device and inode numbers, you need to find a path to the file in order to act on it. (A path, not the path, since there can be more than one if the file has multiple hard links.) This means that if you use inodes, you'll always have more work to do. The only reason to even look at inodes is if you want to be aware of hard links and act only once on each file with multiple hard links.
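
As a small illustration of "a path, not the path", the sketch below creates one file with two hard links and then asks find for every name that leads to its inode. The directory and file names (a/original.csv, b/alias.csv) are hypothetical.

```shell
#!/bin/sh
# Work in a throwaway directory so no real data is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

mkdir a b
printf 'data\n' > a/original.csv
ln a/original.csv b/alias.csv   # second hard link to the same inode

# Read the inode number of one of the names.
inode=$(ls -i a/original.csv | awk '{ print $1 }')

# find has to walk the tree to turn the inode number back into path names.
paths=$(find . -inum "$inode" | sort)

printf 'inode %s is reachable as:\n%s\n' "$inode" "$paths"
```

Both names are printed, which is why acting on an inode always means searching for a path first.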

Your find command is easily fixed by either using -print0 and xargs -0 (if available on your system), or using the -exec action. For more information, see Why does my shell script choke on whitespace or other special characters?

find . -type f -name 'Simplex*.csv' -exec cat {} + > looksee.txt
find . -type f -name 'Simplex*.csv' -print0 | xargs -0 cat > looksee.txt
  • Why do you say that accessing a file via its inode would bypass the permissions on that inode? I can imagine an 'open_inode' syscall that would respect the inode's mode just as 'open' does. What's impossible about that? – AnotherSmellyGeek Apr 05 '18 at 08:37
  • What do you mean by 'directory permissions'? Are you referring to the mode of an inode? Dirents don't have anything I would call 'permissions'. – AnotherSmellyGeek Apr 05 '18 at 08:39
  • 1
    @AnotherSmellyGeek By “directory permissions”, I mean the permissions on directories. For example, you can't access /dir/file if you don't have the x permission on /dir, regardless of the permissions stored in the inode for /dir/file. – Gilles 'SO- stop being evil' Apr 05 '18 at 18:31
  • Yep, I think when you say "permissions on directories", you mean the mode of an inode. Don't you? Separately, I don't understand what the ability or inability to traverse parent directories has to do with bypassing (or not bypassing) inodes' modes. Also, I'm not sure what you mean by 'access'. If /dir/file's inode were hard-linked into some other directory on which I did have traverse, and if the inode's mode permitted it, then I would be able to open(2) that inode by that other name, even if I didn't have traverse on /dir. Wouldn't you call that an "access"? – AnotherSmellyGeek Apr 17 '18 at 15:46
  • Oh, wait, do you mean the access(2) syscall? – AnotherSmellyGeek Apr 17 '18 at 15:50
  • 1
    @AnotherSmellyGeek “Permissions” and “mode” are synonymous in this context. By “access”, I mean do something with the content of the file, which under the hood means open(2). For example, if you don't have x permissions to /dir, then there's no way you can access /dir/file, no matter what the permissions on /dir/file are (unless you can traverse some other directory where the file is hard-linked). – Gilles 'SO- stop being evil' Apr 17 '18 at 18:54
0

@Costas already gave you the best answer. To reply to your second question:

But this is very cumbersome and I would like to try to find a way to parse the output from ls -i Simplex_config*.csv and pipe it to another command in a one-liner -- is there such an option available?

you can just use xargs:

ls -i Simplex_config*.csv | awk '{ print $1 }' | xargs -I '$input' find . -inum '$input' -exec cat {} \;

The -I option lets you specify a placeholder string that xargs replaces with each line it reads from stdin.
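
A minimal sketch of that substitution, run in a scratch directory with made-up file contents. The placeholder '{inode}' is an arbitrary choice for this example; any string that does not clash with find's own {} would do.

```shell
#!/bin/sh
# Work in a throwaway directory so no real data is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Two files with backslashes in their names (illustrative contents).
printf 'first\n'  > 'Simplex_config\B1:B3\_1.csv'
printf 'second\n' > 'Simplex_config\B1:B3\_2.csv'

# xargs runs find once per input line, replacing '{inode}' with that line.
# find's own {} is left alone because it is a different string.
ls -i Simplex_config*.csv | awk '{ print $1 }' |
  xargs -I '{inode}' find . -inum '{inode}' -exec cat {} \; > looksee.txt

cat looksee.txt
```

This works because inode numbers contain only digits, so nothing in the xargs input needs escaping. It still scans the whole tree once per file, which is much slower than letting find match the names directly.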

  • This is really bizarre: using ls, then cut, then find to look up the inode number. You can use find directly, as explained by others. – Franklin Piat May 21 '15 at 06:33