It seems to me that the question is a little vague: is the goal to recognize gzip-compressed data? If not, what formats need to be supported?
Focussing on the gzip case:
The way I see it, possible approaches depend on the use case. For instance, if the length of a possibly compressed response
is expected to be small, one can try decompressing to test if it was compressed in the first place.
;; Get sample data for testing.
(setq response
      (with-temp-buffer
        (set-buffer-multibyte nil)
        (shell-command "curl --silent 'http://api.stackexchange.com/2.2/filter/create'" t)
        (buffer-string)))
(Now I can test the code below using the data in response.)
(defun zlib-compressed-p (string)
  "Return t if STRING is compressed with zlib, nil otherwise."
  (unless (zlib-available-p)
    (error "This function requires zlib!"))
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert string)
    ;; `zlib-decompress-region' returns t on success and nil on failure.
    (zlib-decompress-region (point-min) (point-max))))
A small modification returns human-readable data regardless of whether it was compressed:
(defun zlib-decompress-if-compressed (string)
  "Decompress STRING if it is recognized as a compressed
unibyte string by zlib, otherwise return STRING unchanged.
Requires zlib."
  (unless (zlib-available-p)
    (error "zlib-decompress-if-compressed requires zlib!"))
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert string)
    (if (zlib-decompress-region (point-min) (point-max))
        (buffer-string)
      string)))
As far as I know, zlib is the only compression library Emacs can be compiled with, so we can't handle other formats this way.
The original question states "I have the compressed data in a variable called response...", and zlib-decompress-if-compressed can process it without writing it to a file. It is easy to create versions that take a file name:
(defun zlib-file-compressed-p (filename)
  "Return t if the file FILENAME is compressed with zlib, nil otherwise."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert-file-contents-literally filename nil)
    (zlib-compressed-p (buffer-string))))

(defun zlib-decompress-file (filename)
  "Return the contents of the file FILENAME as a string,
decompressed using zlib if the file is recognized as compressed.
If the file is not compressed with zlib, return its contents
literally."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert-file-contents-literally filename nil)
    (zlib-decompress-if-compressed (buffer-string))))
For other formats, or if a response is too long to attempt decompressing, one can check if the first two bytes of a string match the magic number defined by the gzip file format specification.
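(defun gzip-check-magic (data)
  "Check if the first two bytes of a string in DATA match magic
numbers identifying the gzip file format. See
http://www.gzip.org/zlib/rfc-gzip.html for the file format
description."
  (let ((bytes (string-as-unibyte data)))
    ;; Guard against inputs shorter than two bytes, for which
    ;; `substring' would signal an error.
    (and (>= (length bytes) 2)
         ;; 31 139 is #x1f #x8b, the gzip ID1 and ID2 bytes.
         (equal (substring bytes 0 2) (unibyte-string 31 139)))))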
(defun gzip-compressed-p (filename)
  "Check if the file FILENAME is gzip-compressed by checking
magic numbers identifying the gzip file format. See
`gzip-check-magic' for details."
  (let ((first-two-bytes (with-temp-buffer
                           (set-buffer-multibyte nil)
                           (insert-file-contents-literally filename nil 0 2)
                           (buffer-string))))
    (gzip-check-magic first-two-bytes)))
This method is also limited to gzip and may give a false positive (any data that happens to start with the same two bytes will match), but it is truly cross-platform.
(It can be extended to other formats that use "magic numbers" to identify the format, for example bzip2, but this is certainly not scalable.)
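As a rough illustration of such an extension, here is a minimal sketch that keeps the signatures in an alist and returns the first format whose magic bytes prefix the data. The names my-compression-magic-alist and my-guess-compression-format are made up for this example, and the list of formats is only a sample; DATA is assumed to be a unibyte string.

```elisp
(require 'seq)

;; Hypothetical names, used only for this sketch.
(defvar my-compression-magic-alist
  `((gzip  . ,(unibyte-string #x1f #x8b))
    (bzip2 . ,(unibyte-string #x42 #x5a #x68))               ; "BZh"
    (xz    . ,(unibyte-string #xfd #x37 #x7a #x58 #x5a #x00))
    (zstd  . ,(unibyte-string #x28 #xb5 #x2f #xfd)))
  "Alist mapping a format symbol to its leading magic bytes.")

(defun my-guess-compression-format (data)
  "Return a symbol naming the compression format of unibyte string DATA.
Return nil if DATA starts with none of the known signatures."
  (car (seq-find (lambda (entry)
                   ;; `string-prefix-p' returns nil when DATA is
                   ;; shorter than the signature.
                   (string-prefix-p (cdr entry) data))
                 my-compression-magic-alist)))
```

With the sample data from above, (my-guess-compression-format response) would be expected to return gzip if the response was compressed. Every new format needs its own entry, which is the scalability problem mentioned above.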
Overall, using call-process and file or similar seems to be the most flexible approach.
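A minimal sketch of that approach, assuming a file(1) that supports the --brief and --mime-type options is on the exec-path. The function name my-file-mime-type is made up for this example, and the exact MIME string reported for gzip data differs between versions of file, so the result should be compared against more than one spelling.

```elisp
(require 'subr-x)

;; Hypothetical helper, used only for this sketch.
(defun my-file-mime-type (filename)
  "Return the MIME type that file(1) reports for FILENAME, or nil.
Return nil if FILENAME is not readable, if `file' is not found on
`exec-path', or if the process exits with a non-zero status."
  (when (and (file-readable-p filename)
             (executable-find "file"))
    (with-temp-buffer
      ;; Collect the output of file(1) in the temporary buffer.
      (when (eq 0 (call-process "file" nil t nil
                                "--brief" "--mime-type"
                                (expand-file-name filename)))
        (string-trim (buffer-string))))))
```

A gzip check could then look like (member (my-file-mime-type "/tmp/data.gz") '("application/gzip" "application/x-gzip")), where /tmp/data.gz is a placeholder path. Unlike the zlib-based functions, this requires the data to be in a file and depends on an external program.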