7

If I have a file with

#!/usr/bin/env foobar

what is the fastest/best way to determine if this file has a hashbang? I hear you can just read the first 2 bytes? How?

6 Answers

6

With zsh:

if LC_ALL=C read -u0 -k2 shebang < file && [ "$shebang" = '#!' ]; then
  echo has shebang
fi

Same with ksh93 or bash:

if IFS= LC_ALL=C read -rN2 shebang < file && [ "$shebang" = '#!' ]; then
  echo has shebang
fi

Bash, however, would give false positives for files that start with NULs followed by #!, and it would read through all the leading NUL bytes. For instance, it would read a one-tebibyte sparse file created with truncate -s1T file in full, 2 bytes at a time.

So with bash, it would be better to use:

IFS= LC_ALL=C read -rn2 -d '' shebang

That is, read up to 2 bytes of a NUL-delimited record.

These don't fork a process or execute any external command, as read, [ and echo are all built-ins.

POSIXly, you can do:

if IFS= read -r line < file; then
  case $line in
    ("#!"*) echo has shebang
  esac
fi

It is stricter in that it also requires a full, newline-terminated line. On Linux at least, the newline is not required for a valid shebang, though.

So you could do:

line=
IFS= read -r line < file
case $line in
  ("#!"*) echo has shebang
esac

It's slightly less efficient in that it would potentially read more bytes, with some shells one byte at a time. With our 1TiB sparse file, that would take a lot of time in most shells (and potentially use a lot of memory).

With shells other than zsh, it could also give false positives for files that start with NULs followed by #!.

With the yash shell, it would fail if the shebang contains sequences of bytes that don't form valid characters in the current locale. At least with 2.39 and older, it would even fail if the shebang contained non-ASCII characters in the C locale, even though the C locale is meant to be the one where all characters are single bytes and all byte values form valid (even if not necessarily defined) characters.
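A portable middle ground, not taken from the answer above, is to let dd read exactly two bytes, so that a huge or newline-free file is never scanned. This is only a sketch: has_shebang is a hypothetical helper name, and unlike the built-in read approaches it does run an external command.

```shell
# has_shebang FILE: succeed if FILE starts with the two bytes "#!".
# Hypothetical helper (not from the original answer).
# dd copies at most 2 bytes (bs=1 count=2), so large files are not read in full.
has_shebang() {
  [ "$(LC_ALL=C dd if="$1" bs=1 count=2 2>/dev/null)" = '#!' ]
}
```

Usage would look like: if has_shebang file; then echo has shebang; fi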

If you want to find all the files whose content starts with #!, you could do:

PERLIO=raw find . -type f -size +4c -exec perl -T -ne '
  BEGIN{$/=\2} print "$ARGV\n" if $_ eq "#!"; close ARGV' {} +

We're only considering files that are at least 5 bytes large (#!/x\n being the minimum realistic shebang).

  • with -exec perl... {} +, we pass as many file paths to perl as possible, so it runs as few times as possible.
  • -T (taint mode) works around the limitation that perl -n opens each argument with the two-argument form of open, which treats some characters in file names specially. It also means it won't work for files whose name ends in ASCII spacing characters or |.
  • PERLIO=raw causes perl to use read() system calls directly without any IO buffering layer (this affects the printing of file names as well), so it will do reads of size 2.
  • $/ = \2: when the record separator is set to a reference to a number, records are fixed-length, here 2 bytes.
  • close ARGV skips the rest of the current file after we've read the first record.
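For comparison, the same batching idea (-exec ... {} +) can be sketched without perl, using an inline sh script and dd. This is an assumption-laden alternative, not the answer's method: list_shebang_files is a hypothetical name, and it runs one dd per file, so it will likely be slower than the perl version.

```shell
# list_shebang_files DIR: print regular files under DIR, at least 5 bytes
# large, whose first two bytes are "#!".
# Hypothetical helper (not from the original answer).
list_shebang_files() {
  find "$1" -type f -size +4c -exec sh -c '
    for f do
      # Read only the first two bytes of each file.
      if [ "$(LC_ALL=C dd if="$f" bs=1 count=2 2>/dev/null)" = "#!" ]; then
        printf "%s\n" "$f"
      fi
    done' sh {} +
}
```

Because find passes many paths to each sh invocation, the shell itself starts only a few times; the per-file cost is the dd call.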
4

You can define your own "magic patterns" in /etc/magic and use file to test:

$ sudo vi /etc/magic
$ cat /etc/magic
# Magic local data for file(1) command.
# Insert here your local magic data. Format is described in magic(5).
0 leshort 0x2123 shebang is present
$ cat /tmp/hole2.sh #To prove [1] order of hex [2] 2nd line ignored
!#/bin/bash 
#!/bin/bash
$ cat /tmp/hole.sh 
#!/bin/bash
$ file /tmp/hole2.sh 
/tmp/hole2.sh: ASCII text
$ file /tmp/hole.sh 
/tmp/hole.sh: shebang is present
$ file -b /tmp/hole.sh #omit filename
shebang is present

0x2123 is the hex encoding of '#!' with its two bytes in reverse order:

$ ascii '#' | head -n1
ASCII 2/3 is decimal 035, hex 23, octal 043, bits 00100011: prints as `#'
$ ascii '!' | head -n1
ASCII 2/1 is decimal 033, hex 21, octal 041, bits 00100001: prints as `!'
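If the ascii utility is not installed, the byte order can also be checked with od, which is specified by POSIX. A small sketch (hex_bytes is a hypothetical helper name) that prints the bytes in the order they appear in the file:

```shell
# hex_bytes STRING: print the bytes of STRING in order, as two-digit hex.
# Hypothetical helper (not from the original answer).
hex_bytes() {
  printf '%s' "$1" | od -An -tx1 | tr -d ' \t\n'
  echo
}

hex_bytes '#!'   # prints 2321
```

In the file the bytes are 23 then 21, so a 16-bit value written as 0x2123 has them swapped.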

Alternatively, you can match it as a string:

0 string \#\! shebang is present

ref: man 5 magic, man 1 file, man 1posix file

林果皞
3

That should do it:

if [ "$(head -c 2 infile)" = "#!" ]; then
    echo "Hashbang present"
else
    echo "no Hashbang present"
fi
don_crissti
saga
2

Fast may or may not be best, depending on your feelings on compiling a bunch of C (or maybe some assembly, to get all that overhead of C out of the way, and all that tedious error checking, sheesh...):

#include <sys/types.h>

#include <err.h>
#include <fcntl.h>
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>
#include <sysexits.h>
#include <unistd.h>

int Flag_Quiet;                 /* -q */

void emit_help(void);

int main(int argc, char *argv[])
{
    int ch;
    char two[2];
    ssize_t amount;

    while ((ch = getopt(argc, argv, "h?q")) != -1) {
        switch (ch) {
        case 'q':
            Flag_Quiet = 1;
            break;
        case 'h':
        case '?':
        default:
            emit_help();
            /* NOTREACHED */
        }
    }
    argc -= optind;
    argv += optind;

    if (argc < 1)
        emit_help();

    if ((ch = open(*argv, O_RDONLY)) == -1)
        err(EX_IOERR, "could not open '%s'", *argv);

    amount = read(ch, two, 2);
    if (amount == -1) {
        err(EX_IOERR, "read failed on '%s'", *argv);
    } else if (amount == 0) {
        /* EOF is not an errno condition, so use errx() rather than err() */
        errx(EX_IOERR, "EOF on read of '%s'", *argv);
    } else if (amount == 2) {
        if (two[0] == '#' && two[1] == '!') {
            amount = 0;
        } else {
            amount = 1;
        }
    } else {
        errx(EX_IOERR, "could not read two bytes from '%s'", *argv);
    }

    if (!Flag_Quiet) {
        printf("%s\n", amount ? "no" : "yes");
    }

    exit(amount);
}

void emit_help(void)
{
    fprintf(stderr, "Usage: hazshebang [-q] file\n");
    exit(EX_USAGE);
}

This will require some tweaks if you want a "no" on standard out alongside one of the (many!) err exits from the above. Probably better to check the exit status word.

The slower shell way with head -c 2 file fails a quick portability test on OpenBSD:

$ head -c 2 /etc/passwd
head: unknown option -- c
usage: head [-count | -n count] [file ...]
$ 
thrig
  • fantastic answer... just compile this and a simple find will fix all files... for me we had ansible-playbook she-bangs in our yaml files, so find . -name "*.yml" -exec shebang -q {} \; -exec chmod 0755 {} \; worked – johnnyB Jun 17 '19 at 21:56
1

Use grep in a one-liner:

if head -n 1 file | grep -q '^#!'; then echo "true"; fi
tachomi
0

Using pwsh, I wanted a portable solution that wouldn't buffer the entire file (what if the file has no newlines?):

$bytes = Get-Content $path -AsByteStream -TotalCount 2
$isShebang = '#!' -eq -join [char[]]$bytes

This gets the first two bytes, casts them to char, and joins them into a string to compare against '#!'.