Sorry if it's answered somewhere, I have no idea how to look for it. I received a series of reports from a bank that I'm supposed to process and they seem to be... badly encoded?
First two lines in VIM:
1 ^M^@
2 ^@:^@2^@0^@:^@3^@0^@4^@0^@7^@1^@9^@^M^@
Same two lines in e.g. gedit:
1
2 :20:3040719
Can anyone tell me what's going on? It doesn't matter whether I open the file with `fenc=utf8` or `fenc=cp1250` (the encoding these files are supposed to use). I even tried `fenc=ucs-bom` because I thought it had something to do with endianness, but that doesn't change anything either. I know `^@` is a null byte and `^M` is the carriage return of a Windows-style line ending (CRLF), but switching between `ff=dos` and `ff=unix` doesn't make a difference either.
I have an older file from the same bank (from before some changes they introduced) and it works fine. `file` reports it as extended-ASCII, while the broken file is reported as `data`:
$ file *sta
20220411_182719.sta: Non-ISO extended-ASCII text, with CRLF line terminators
20220412_071916.sta: data
I can replace those characters in VIM and then process the file, but I need to automate this for thousands of files a day with PHP, so I can't really use VIM. Ideally I'd like to just tell the bank's support what they've messed up.
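The byte pattern above (a null after every visible character, and `^M^@` followed by a line that starts with `^@`) is what ASCII text looks like when it is stored as UTF-16LE without a byte order mark, so that is one plausible explanation worth checking. Below is a minimal diagnostic sketch in Python rather than PHP, under that assumption. The function names `looks_like_utf16le` and `to_cp1250` and the 0.9 threshold are made up for illustration, and the sample bytes are constructed from the two lines shown above, not read from a real report.

```python
def looks_like_utf16le(raw: bytes) -> bool:
    """Heuristic: mostly-ASCII text stored as UTF-16LE has a null
    in almost every second byte (the high byte of each code unit)."""
    if len(raw) < 2 or len(raw) % 2:
        return False
    high_bytes = raw[1::2]
    # 0.9 is an arbitrary cut-off chosen for this sketch.
    return high_bytes.count(0) / len(high_bytes) > 0.9


def to_cp1250(raw: bytes) -> bytes:
    """Re-encode to cp1250 when the data looks like UTF-16LE;
    otherwise return the input unchanged."""
    # Skip a UTF-16LE byte order mark if one happens to be present.
    body = raw[2:] if raw.startswith(b"\xff\xfe") else raw
    if looks_like_utf16le(body):
        # encode() raises UnicodeEncodeError for characters cp1250 cannot represent.
        return body.decode("utf-16-le").encode("cp1250")
    return raw


# Hypothetical sample: the first two lines of the report, encoded as UTF-16LE.
sample = "\r\n:20:3040719\r\n".encode("utf-16-le")
print(sample[:8])          # raw bytes, with a null after every character
print(to_cp1250(sample))   # the same text as single-byte cp1250
```

If the heuristic fires on the new reports and not on the older ones, that would suggest the bank switched its export from a single-byte encoding to UTF-16LE.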
`file` command report? For a data file called `fromthebank.dat` the command would be `file fromthebank.dat` – Chris Davies Apr 15 '22 at 12:23
`file` output. – cprn Apr 15 '22 at 12:27