2

I have a CSV file that I often update with the output of a command, but whose header I want to retain. How do I keep Bash's > file.csv syntax from overwriting that line?

Kyle
  • 133

6 Answers

4

A couple of options using in-place editing rather than redirection:

ed -s file.csv <<'EOF'
1r !some_command
1,.wq
EOF

or similarly

some_command | sed -i.bak -e '1r /dev/stdin' -e 1q file.csv

although AFAIK the latter uses a temporary file "under the hood".
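As a quick illustration of the sed variant (file contents and the stand-in `printf` command are invented here; this relies on GNU sed on Linux, where `/dev/stdin` refers to the pipe):

```shell
# Sample file: a header plus stale rows (illustrative contents).
printf 'name,value\nold,1\nold,2\n' > file.csv

# Pipe the fresh rows in as sed's stdin; '1r /dev/stdin' queues them
# for output after line 1, and '1q' stops copying the old file after
# the header, so only the header plus the new rows survive.
printf 'new,3\nnew,4\n' | sed -i.bak -e '1r /dev/stdin' -e 1q file.csv

cat file.csv
```

Afterwards `file.csv` holds the old header followed by the new rows, and `file.csv.bak` keeps the previous contents.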

steeldriver
  • 81,074
2
some_command > csvfile
sed -i '1i mention the header here' csvfile

This overwrites the file first, then re-inserts the known header text as line 1 (GNU sed's 1i inserts before line 1), so it only works when the header is a fixed string you can hard-code.
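A minimal run-through of this idea, with an invented header and a `printf` standing in for the real command (GNU sed's one-line `1i text` form):

```shell
# Stand-in for "some_command > csvfile": the header is gone afterwards.
printf 'new,2\nnew,3\n' > csvfile

# Re-insert the known header text as line 1 (GNU sed extension).
sed -i '1i name,value' csvfile

cat csvfile
```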
1

There's no easy way to get a shell to overwrite a file starting at an offset. And you shouldn't edit files in place anyway, because that leaves a truncated file if the editing is interrupted by a crash or power failure.

Create a new file. Start by copying the first line, then write the rest of the content. When you're done, move the new file into place.

read -r first_line <file.csv
{
  printf '%s\n' "$first_line"
  … # write the rest of the content to standard output
} >file.csv.new
mv -f file.csv.new file.csv
  • If you want the edited file to keep the old inode and permissions then redirect the new file over the old (e.g. cat file.csv.new > file.csv ; rm -f file.csv.new), otherwise it will have the new file's inode (breaking hard links, if any exist) and the default owner/perms/acls from when file.csv.new was created (which may be different to what file.csv had). This often doesn't matter, but sometimes does....and is why using a scriptable editor like ed or ex or vim is sometimes better than the so-called "in-place edit" of sed or perl etc. – cas Aug 15 '19 at 03:11
  • 1
    @cas This has the downside of leaving a half-written file if something goes wrong, so you should only use this approach in cases where the permissions or hard links do matter and you have a way to tell other processes not to use the file while it's being rewritten. Move-into-place should be the default approach because it's the only safe approach, that's why it's the approach I always use if there's no sign that it wouldn't work. – Gilles 'SO- stop being evil' Aug 15 '19 at 10:18
  • the atomic nature of a mv has some benefits, but the risk with a redirection is the same as with any other file overwrite (including "save" from within your text editor or other app). i.e. not very likely. Also, fixing ownership changes (and some ACLs) can only be done by root.....and fixing broken hard links is impossible unless you took note of the inode before mv-ing the new file over it, and then used find / -inum nnnnnn or something to find them all and re-link them (which the current user may not have RW or X permission in the relevant directory/ies to create the links) – cas Aug 15 '19 at 10:36
  • IMO overwriting is preferable to mv-ing unless you know for sure that a) the inode doesn't matter, no hard links; and b) the owner, group, perms, acls etc won't change. the inode number doesn't usually matter. owner, perms, etc almost always do. – cas Aug 15 '19 at 10:37
  • 1
    @cas No. Hard links are rare, and if there's a problem, you'll find it very quickly and obviously in testing. Race conditions do happen, but you'll only find them a lot later, in production, with no idea of what could possibly have gone wrong. Text editors are used by humans who can take special action if saving fails, and even so they arrange to keep an intact copy of the file and many have a recovery mechanism if they get interrupted. With a script that's used as part of an automated pipeline, there are no such safeties, so an unsafe approach is a ticking bomb. – Gilles 'SO- stop being evil' Aug 15 '19 at 10:49
  • as mentioned, mv is also unsafe (for different reasons). choose your preferred kind of unsafety. – cas Aug 15 '19 at 10:53
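The inode-preserving variant cas describes can be sketched like this (file names and contents are invented; a `printf` stands in for the real command):

```shell
printf 'name,value\nold,1\n' > file.csv   # illustrative starting file

read -r first_line < file.csv
{
  printf '%s\n' "$first_line"
  printf 'new,2\n'                        # stand-in for the real command
} > file.csv.new

# Redirect over the old file instead of mv-ing it into place, so
# file.csv keeps its inode, owner, permissions and ACLs; the trade-off
# is a window in which file.csv is truncated or half-written.
cat file.csv.new > file.csv
rm -f file.csv.new
```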
0

If the header never changes, you can save it to another file once with head -1 file.csv > header.csv and then use the following syntax:

{ head -1 header.csv && some_command; } > file.csv

As described in the comments, Bash first truncates file.csv before running the commands inside { }, meaning you can't just run head -1 file.csv inside { }.

This { } syntax can also be used to combine output from two commands as explained here.
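Putting the pieces together (illustrative file contents, with a `printf` standing in for `some_command`):

```shell
printf 'name,value\nold,1\n' > file.csv   # illustrative file
head -1 file.csv > header.csv             # save the header once, up front

# Safe despite the redirection truncating file.csv first: the header
# is read from header.csv, not from the file being overwritten.
{ head -1 header.csv && printf 'new,2\n'; } > file.csv

cat file.csv
```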

Kyle
  • 133
  • 1
    Since the redirection is the first thing that the shell acts upon (to be able to decide where to send the output from your compound command), the file would be truncated before head has a chance to run. – Kusalananda Aug 14 '19 at 17:27
  • It seems to work for smaller files; I'm wondering if there's a buffer limit or sequence of events that allows it to work in some situations. – Jeff Schaller Aug 14 '19 at 17:38
  • Ugh, you're right. In my case this exact command does cause this problem. I'll edit my answer with a solution. – Kyle Aug 14 '19 at 18:20
  • 1
    @JeffSchaller I can't get that command to work even for small files. – Kusalananda Aug 14 '19 at 18:25
0
sed -i '2,$d' file.csv && some_command >> file.csv

sed -i '2,$d' file.csv will delete all but the first line of the file.

-i tells sed to work in-place

2,$ addresses from line 2 to the end of the file

d does the deletion.

You can check that the command does what you want by leaving out the -i.

After that && some_command >> file.csv will append to the file only if sed successfully cleared it.
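A small demonstration (invented file contents, `printf` standing in for some_command; -i as used here is the GNU sed form):

```shell
printf 'name,value\nold,1\nold,2\n' > file.csv   # illustrative file

# Delete everything after the header, then append the fresh rows only
# if the sed succeeded.
sed -i '2,$d' file.csv && printf 'new,3\n' >> file.csv

cat file.csv
```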

markgraf
  • 2,860
0

Similar to other examples here, this should work in-place without a temp file:

FILENAME="file.csv"
HEADER=$(head -n 1 "$FILENAME")
{
  echo "$HEADER"
  echo "your changes here.."  # or, some_command
} < "$FILENAME" > "$FILENAME"  # WARNING: SC2094

Disclaimer: Shellcheck raises SC2094 on this code, so use it at your own risk.

cya
  • 101