department.txt contains the columns ID, Department. I am using the command below to change the order to Department, ID: [screenshot of the command]

But if I further update the command to the one below: [screenshot of the updated command]

Then department.txt becomes empty, without any output. If I write to some other file instead of the file I am reading, it works.

I understand that I am reading and updating the same file, but my understanding is that | should take care of it, since internally it must be storing the output somewhere in memory and then dumping it into the file I asked for. Isn't that so? Can someone give some insight into how it works?

Note: I know there are many similar questions, but none of them really answers how | works internally, which is the fundamental question I need answered.

1 Answer

No, piping does not take care of it. The pipe doesn't run the two commands sequentially and store the output of the first one in memory. The two programs run at the same time, and they're connected using a special I/O device called a pipe. But before it starts the programs, the shell sets up all the I/O redirections, which means it opens the output file and truncates it.
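The order of events can be checked with a minimal sketch. It assumes a scratch copy of department.txt with '.'-separated ID.Department rows; the sample data and the `workdir` path are made up for illustration. No pipe is used here, so the result does not depend on timing: the shell truncates the file while setting up the redirection, and cut then reads nothing.

```shell
# Scratch directory so no real file is touched (hypothetical sample data).
workdir=$(mktemp -d)
file="$workdir/department.txt"
printf '1.Sales\n2.HR\n' > "$file"

# The redirection is set up first: "$file" is opened for writing and truncated.
# Only then does cut start, and by that time its input is already empty.
cut -d. -f2 "$file" > "$file"

# -s is true only if the file exists and is larger than zero bytes.
if [ -s "$file" ]; then
    result="still has data"
else
    result="empty"
fi
echo "department.txt is $result"
```

Putting a pipe between the reader and the redirection does not change which step truncates the file; it only makes the two sides start concurrently.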

If you have GNU awk, you can use its inplace option to overwrite the input file.

gawk -i inplace -F'\.' -v OFS=. '{print $2, $1}' department.txt
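Without GNU awk, a common workaround is to write to a temporary file and rename it over the original only after the command has succeeded, which matches the observation in the question that writing to a different file works. A sketch, assuming '.'-separated rows as in the gawk command; the sample data and file names are placeholders:

```shell
# Scratch directory with hypothetical sample data.
workdir=$(mktemp -d)
file="$workdir/department.txt"
printf '1.Sales\n2.HR\n' > "$file"

# Write the reordered columns to a temporary file next to the original,
# then replace the original only if awk exited successfully.
tmp=$(mktemp "$workdir/department.XXXXXX")
awk -F'[.]' -v OFS=. '{print $2, $1}' "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```

Here the redirection truncates only the temporary file, so the input is still intact while awk reads it.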
Barmar
  • 9,927
  • Good to hear alternatives, but first I need to clarify my understanding of the fundamentals. So | will execute the programs in parallel, but how does it synchronise execution? I am still unclear about why the file gets truncated. – LoveWithMaths Apr 07 '18 at 06:52
  • Synchronization is done by the pipe. If the reader tries to read from the pipe when nothing is available, it blocks until the writer has sent something. – Barmar Apr 07 '18 at 06:55
  • And if the writer gets too far ahead of the reader, it blocks when it tries to write. – Barmar Apr 07 '18 at 06:55
  • The pipe is like a FIFO queue. – Barmar Apr 07 '18 at 06:57
  • So in my example it's pretty much synchronised, isn't it? Because > department.txt will be blocked, as there will be nothing to write. Then why does it still write an empty file? – LoveWithMaths Apr 07 '18 at 07:00
  • The shell does redirection. It truncated department.txt before anything else happened. – Barmar Apr 07 '18 at 07:01
  • So the file is already empty when it runs cut -d. -f1 department.txt. – Barmar Apr 07 '18 at 07:02
  • Now I understand: > overwrites the file, so it truncated it. Correct? – LoveWithMaths Apr 07 '18 at 07:02
  • Yes. Isn't that what I said in the last sentence of my first paragraph? – Barmar Apr 07 '18 at 07:02
  • The statement was a bit technical for me and I did not really understand your point. Now I get it, thanks for the help. – LoveWithMaths Apr 07 '18 at 07:03
  • Hi, I was trying out some scenarios. When I executed the command below, it made the file empty. Here I have not used any pipe, so there are no async operations; so why is department.txt empty? paste <(cut -d, -f2 department.txt) <(cut -d, -f1 department.txt) >department.txt – LoveWithMaths Apr 07 '18 at 07:13
  • Because the shell truncates department.txt before running anything! My answer just explained why the pipe doesn't protect you. I didn't mean that it works without the pipe, you already said you knew why that was so. – Barmar Apr 07 '18 at 07:13
  • But I thought that without | the shell would execute everything sequentially, so my understanding was that > would be executed last. But you are saying it is executed first? How do I know what the shell executes first? Apart from >, what else does the shell execute in parallel in a sequential command? – LoveWithMaths Apr 07 '18 at 07:19
  • > is not a command, so there's nothing sequential about it. In order to redirect the output of a command, the shell has to open the file (truncating it), connect the process's stdout to that file descriptor, and then start the program. – Barmar Apr 07 '18 at 07:52
  • Just remember, output isn't saved in any temporary memory. If the process is writing to an output file, the file has to be opened and truncated before the process starts writing. – Barmar Apr 07 '18 at 07:55
  • BTW, <( is a pipe in disguise. – Barmar Apr 07 '18 at 07:55
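
Two points from this thread, that both sides of a pipe run at the same time and that the reader blocks until the writer sends something, can be checked with a small sketch. The sleep durations are arbitrary and the timing is approximate:

```shell
# Both commands of a pipeline start together: two 2-second sleeps joined by
# a pipe finish in roughly 2 seconds, not 4.
start=$(date +%s)
sleep 2 | sleep 2
elapsed=$(( $(date +%s) - start ))
echo "pipeline took about ${elapsed}s"

# The reader (cat) starts immediately but blocks until the writer sends data.
# Lines come out in the order they went in, like a FIFO queue.
output=$( { sleep 1; echo first; echo second; } | cat )
echo "$output"
```

If the shell ran the two commands one after the other and buffered the output in between, the first pipeline would take about 4 seconds instead.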