
I have a very simple problem, but for some reason it does not work properly.

I have these .txt files in the following format:

2    250    1
4    250    1
5    250    1

I wanted to subtract 1 from the numbers in the first column, yielding:

1    250    1
3    250    1
4    250    1

I am using this code in bash:

awk '{ print $1-1,$2,$3 }' file.txt > newfile.txt

I think this code is fine. However, these text files had their extension changed from .csv to .txt, and on them the awk line doesn't seem to work well. It yields:

1    250    1
4

Are there any alternatives that work well with the text files I have?

Update: I used the pico editor to re-generate one of these files, and now the above code works perfectly, so there must be something wrong with the original text files in terms of formatting and properties. Any insights?

Gabriel
    After removing the blank likes, your example works for me. Perhaps you have files with carriage-return/line-feed endings which are interfering with the script. – Thomas Dickey Mar 10 '16 at 00:33
  • It still doesn't work for me. What are those endings that you mentioned? Is there a way to fix them? – Gabriel Mar 10 '16 at 00:39
  • Your example works fine for me too (although it changes the four-space delimiters to a single space). Does it work if you literally copy-paste your code example above? I'm not sure what @ThomasDickey meant by "blank likes" (or "blank lines" even?). Changing from .csv to .txt shouldn't make a difference; awk is agnostic. – Sparhawk Mar 10 '16 at 00:40
  • @Thomas I tried re-creating one of these files using "pico", and "awk" is working properly with the pico-created txt file. However, it is impossible for me to re-create these files one by one. – Gabriel Mar 10 '16 at 00:43
  • @Sparhawk I made my comment before jasonwryan edited the question to remove the blank lines. (I made the comment partly because the blanks might be related to the OP's question, and to get clarification.) – Thomas Dickey Mar 10 '16 at 00:47
  • @ThomasDickey Oh okay, fair point. I didn't look at the pre-edited version. – Sparhawk Mar 10 '16 at 00:49
  • Back to my point: pico forces LF endings, and the OP likely has text files with CRLF endings which are confusing awk. – Thomas Dickey Mar 10 '16 at 01:05
  • I wonder if it's some non-printing character in your file. Could you paste the raw source of a minimal failing example with hexdump -c file.txt? – Sparhawk Mar 10 '16 at 01:34

3 Answers


The problem described appears to be related to line endings. If the text files have carriage-return/line-feed (CRLF) line endings, awk will carry the carriage return along as part of the last field; when the output is displayed, the carriage return sends the cursor back to the start of the line, so text appears overwritten.

The OP fixed the problem by re-creating the file with pico, which writes line-feed line-endings (see the "Easier way to set line breaks" page). One can also use a program such as dos2unix to fix the line endings.


Thomas Dickey
  • I'm not sure about this. I created different files in vim containing the first code block, using various line-end formats. awk seems to deal perfectly fine with "dos" and "unix" endings, although it just prints one line with "mac" endings (presumably sees the whole file as one line). I never saw the partial second line from the question. I wonder if there are other non-printables at play here, possibly a corrupt file. – Sparhawk Mar 10 '16 at 01:30

What @ThomasDickey said appears to be the solution!

I used the command perl -pe 's/\r\n|\n|\r/\n/g' to convert the text files to Unix format; it does the same job as the dos2unix program. Now awk runs perfectly!

Thanks everyone

Gabriel

The dos2unix utility (shipped alongside unix2dos) would solve this problem rather easily. If you have a UNIX/Linux machine, try installing it and running:

dos2unix file

In UNIX/Linux, before processing any text file, it is reasonable to run the file utility to find out what kind of line endings it has.