VIM shows ^@ every other character and ^M^@ at the end of line

Question

Sorry if this is answered somewhere; I have no idea what to search for. I received a series of reports from a bank that I'm supposed to process and they seem to be... badly encoded?

First two lines in VIM:

     1 ^M^@
     2 ^@:^@2^@0^@:^@3^@0^@4^@0^@7^@1^@9^@^M^@

Same two lines in e.g. gedit:

     1 
     2 :20:3040719

Can anyone tell me what's going on? It doesn't matter whether I open the file with fenc=utf8 or fenc=cp1250 (the encoding these files are supposed to use). I even tried fenc=ucs-bom because I thought it might have something to do with endianness, but that doesn't change anything either. I know ^@ is a NUL byte and ^M is the carriage return from a Windows-style line ending (CRLF), but switching between ff=dos and ff=unix makes no difference either.

I have an older file from the same bank (from before some changes they introduced) and it works fine. The file command reports it as extended-ASCII, while the broken file is reported as data:

$ file *sta
20220411_182719.sta: Non-ISO extended-ASCII text, with CRLF line terminators
20220412_071916.sta: data

I can replace those characters in VIM and then process the file, but I need to automate this for thousands of files a day with PHP and can't really use VIM. Ideally, I'd like to just tell the bank's support what they've messed up.

Does this answer your question? What is `^M` and how do I get rid of it? — jesse_b, Apr 15 '22 at 12:22
What does the file command report? For a data file called fromthebank.dat the command would be file fromthebank.dat — Chris Davies, Apr 15 '22 at 12:23
@jesse_b No, it doesn't answer my question. I want to process files in PHP, so I need to know what's going on. The ^M itself can be replaced but it doesn't tell me what they've broken. — cprn, Apr 15 '22 at 12:27

score 4 · Answer 1 · answered Apr 15 '22 at 12:57

Ok, found it. It's UTF-16 Little Endian.

:e ++enc=utf16le
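
To illustrate why the display in the question matches UTF-16LE, here is a minimal Python sketch (Python is used only for illustration). It assumes the file has no byte order mark, which would be consistent with the first line showing only `^M^@`. The sample string is reconstructed from the two lines in the question, and `caret` is a hypothetical helper that imitates Vim's caret notation for control bytes.

```python
# Sample reconstructed from the question: an empty line and ":20:3040719",
# both with CRLF line endings, encoded as UTF-16LE without a BOM (assumption).
text = "\r\n:20:3040719\r\n"
raw = text.encode("utf-16-le")

# An editor that reads these bytes as a single-byte encoding splits lines
# on the 0x0A byte only, so the NUL that belongs to the newline ends up
# at the start of the following line.
lines = raw.split(b"\n")


def caret(data: bytes) -> str:
    """Hypothetical helper: render control bytes the way Vim does (^@, ^M)."""
    out = []
    for byte in data:
        if byte < 0x20:
            out.append("^" + chr(byte + 0x40))
        else:
            out.append(chr(byte))
    return "".join(out)


for line in lines[:2]:
    print(caret(line))
```

Under these assumptions, each ASCII character is followed by a NUL byte, and each line ends in `0D 00` before the `0A`. That would also explain why switching between ff=dos and ff=unix makes no difference: the CR and LF bytes are never adjacent.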

I can convert it properly now while processing.
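
For automated processing, a rough sketch of the conversion step could look like the following. It is written in Python for illustration; in PHP, `mb_convert_encoding` or `iconv` should be able to play the same role. The function names, the 0.9 threshold, and the cp1250 fallback for the older files are assumptions of this sketch, not something the bank's format guarantees.

```python
def looks_like_utf16le(raw: bytes) -> bool:
    """Heuristic (assumption): mostly-ASCII text stored as UTF-16LE has a
    NUL in almost every odd byte position."""
    if len(raw) < 2 or len(raw) % 2 != 0:
        return False
    if raw.startswith(b"\xff\xfe"):  # UTF-16LE byte order mark
        return True
    high_bytes = raw[1::2]
    return high_bytes.count(0) / len(high_bytes) > 0.9  # hypothetical threshold


def to_utf8(raw: bytes, legacy_encoding: str = "cp1250") -> bytes:
    """Decode either the new UTF-16LE files or the older single-byte files
    and return UTF-8 bytes."""
    if looks_like_utf16le(raw):
        text = raw.decode("utf-16-le")
        if text.startswith("\ufeff"):  # drop a BOM if one is present
            text = text[1:]
    else:
        # Raises UnicodeDecodeError on bytes that cp1250 leaves undefined.
        text = raw.decode(legacy_encoding)
    return text.encode("utf-8")
```

A call such as `to_utf8(open(path, "rb").read())` would then yield the same UTF-8 output for both the old and the new files, provided the heuristic holds for the real data.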

cprn
