I have a file filled with md5 checksums and filenames. I need to perform some processing on each line, so I need to know:
- Which is the checksum
- Which is the filename
and act accordingly. That is, I need to slurp the checksum into one variable and the filename into another. The filename may contain spaces and non-ASCII characters, but I don't expect to see newlines. The file looks like this:
05c00367e8914ca1be0964821d127977 ./.fseventsd/0000000000097aa1
cd9d4291f59a43c0e3d73ff60a337bb5 ./.fseventsd/00000000000fdfec
5d1280769e741e04622cfd852f33a138 ./.fseventsd/0000000000103197
8dda3534e5bbc0be1d15db2809123c50 ./.fseventsd/000000000017c9ca
(...etc., about 100,000 lines)
Traditionally, I might do something like this for each line:
md5sum=$(echo "$line" | awk '{print $1}')
filename=$(echo "$line" | sed 's/[^ ]* //')
But how much faster would it be if I used the shell's own parameter expansion instead, with no subshells or external commands:
md5sum=${line%%" "*}
filename=${line#*" "}
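
For context, here is a minimal sketch of how the second approach might sit inside a read loop. The function names split_line and process_list are hypothetical, the printf stands in for the real per-line processing, and the input is assumed to arrive on standard input with a single space between checksum and filename:

```shell
# Split one "checksum filename" line using only parameter expansion.
# Sets the global variables md5sum and filename; no subshell or
# external process is started.
split_line() {
    md5sum=${1%%" "*}     # remove everything from the first space onward
    filename=${1#*" "}    # remove everything up to and including the first space
}

# Hypothetical driver: read entries from standard input, one per line.
# IFS= and -r keep leading whitespace and backslashes in the line intact.
process_list() {
    while IFS= read -r line; do
        split_line "$line"
        # Placeholder for the real processing.
        printf '%s|%s\n' "$md5sum" "$filename"
    done
}

# Example: feed one line from the sample data through the loop.
printf '%s\n' '05c00367e8914ca1be0964821d127977 ./.fseventsd/0000000000097aa1' | process_list
```

Because only the first space is treated as the separator, a filename that itself contains spaces should come through unchanged.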