322

Some compilers (especially C or C++ ones) give you warnings about:

No new line at end of file

I thought this would be a C-programmers-only problem, but github displays a message in the commit view:

\ No newline at end of file

for a PHP file.

I understand the preprocessor thing explained in this thread, but what has this to do with PHP? Is it the same include() thing or is it related to the \r\n vs \n topic?

What is the point in having a new line at the end of a file?

7 Answers

363

It's not about adding an extra newline at the end of a file, it's about not removing the newline that should be there.

A text file, under unix, consists of a series of lines, each of which ends with a newline character (\n). A file that is not empty and does not end with a newline is therefore not a text file.

Utilities that are supposed to operate on text files may not cope well with files that don't end with a newline; historical Unix utilities might ignore the text after the last newline, for example. GNU utilities have a policy of behaving decently with non-text files, and so do most other modern utilities, but you may still encounter odd behavior with files that are missing a final newline¹.

With GNU diff, if one of the files being compared ends with a newline but not the other, it is careful to note that fact. Since diff is line-oriented, it can't indicate this by storing a newline for one of the files but not for the others — the newlines are necessary to indicate where each line in the diff file starts and ends. So diff uses this special text \ No newline at end of file to differentiate a file that didn't end in a newline from a file that did.
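
A minimal sketch of that marker in action, assuming a POSIX shell with `mktemp` and a `diff` that supports unified output (`-u`). The file names are made up for the illustration.

```shell
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/with_nl"      # properly terminated
printf 'hello'   > "$tmpdir/without_nl"   # same text, no final newline

# diff exits with status 1 when the files differ, hence the "|| true".
# LC_ALL=C keeps the marker text untranslated.
diff_out=$(LC_ALL=C diff -u "$tmpdir/with_nl" "$tmpdir/without_nl" || true)
printf '%s\n' "$diff_out"
rm -rf "$tmpdir"
```

The two `hello` lines look identical in the hunk; the only thing that tells them apart is the `\ No newline at end of file` line printed after the one that lacks its terminator.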

By the way, in a C context, a source file similarly consists of a series of lines. More precisely, a translation unit is viewed in an implementation-defined way as a series of lines, each of which must end with a newline character (n1256 §5.1.1.1). On unix systems, the mapping is straightforward. On DOS and Windows, each CR LF sequence (\r\n) is mapped to a newline (\n; this is what always happens when reading a file opened as text on these OSes). There are a few OSes out there which don't have a newline character, but instead have fixed- or variable-sized records; on these systems, the mapping from files to C source introduces a \n at the end of each record. While this isn't directly relevant to unix, it does mean that if you copy a C source file that's missing its final newline to a system with record-based text files, then copy it back, you'll end up either with the incomplete last line truncated in the initial conversion, or with an extra newline tacked onto it during the reverse conversion.

¹ Example: the output of GNU sort on non-empty files always ends with a newline. So if the file foo is missing its final newline, you'll find that sort foo | wc -c reports one more byte than cat foo | wc -c. The read builtin of sh is required to return false if the end-of-file is reached before the end of the line is reached, so you'll find that loops such as while IFS= read -r line; do ...; done skip an unterminated line altogether.
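
Both effects from the footnote can be reproduced with a small sketch, assuming a POSIX shell and a `sort` that terminates its last output line (GNU sort does). The last loop uses the common `|| [ -n "$line" ]` idiom, which is not from the answer above, to pick up an unterminated final line.

```shell
tmpdir=$(mktemp -d)
# foo holds two lines; the second one lacks its terminating newline.
printf 'b\nabc' > "$tmpdir/foo"

cat_bytes=$(cat "$tmpdir/foo" | wc -c)     # bytes as stored
sort_bytes=$(sort "$tmpdir/foo" | wc -c)   # sort terminates every output line

# A plain read loop: read returns false on the unterminated last line,
# so the loop body never runs for it.
seen=0
while IFS= read -r line; do
  seen=$((seen + 1))
done < "$tmpdir/foo"

# Workaround: read still stores the partial line in $line, so also run
# the body when $line is non-empty.
seen_all=0
while IFS= read -r line || [ -n "$line" ]; do
  seen_all=$((seen_all + 1))
done < "$tmpdir/foo"

echo "cat: $cat_bytes bytes, sort: $sort_bytes bytes"
echo "lines seen: $seen (plain loop), $seen_all (with workaround)"
rm -rf "$tmpdir"
```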

  • 1
    Concerning "... series of lines, each of which must end with a newline character (n1256 §5.1.1.1)" --> On reviewing the more recent C11 draft N1570, I did not find support for that other than maybe: "A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place." §5.1.1.2 2, but that seems to be restricted to splicing specifications. – chux - Reinstate Monica Aug 04 '16 at 16:22
  • @chux That sentence is present in n1256 too. The last line must end with a newline character. Lines that are not the last line must obviously also end with a newline character to indicate that that line ends and the next line begins. Thus every line must end with a newline character. – Gilles 'SO- stop being evil' Aug 04 '16 at 16:40
  • Hmmm, to me, that line "A source file ... splicing takes place." could be limited to splicing considerations and not apply to files in general. Yet I see how one could view it otherwise. Perhaps I'll look for a post that focuses on that. – chux - Reinstate Monica Aug 04 '16 at 17:01
  • "So diff uses this special text \ No newline at end of file to differentiate a file that didn't end in a newline from a file that did."

    Git shows this text not only when it compares files, but also when a new file is added to Git. So this argument is not valid, I suppose.

    – Viktor Kruglikov Jan 12 '18 at 15:54
  • "Utilities that are supposed to operate on text files may not cope well with files that don't end with a newline"

    I don't think it's Git's business to care about low-level problems such as a \n that is missing with respect to POSIX requirements. I think that if Git shows this message, the reason should be a source-control problem.

    – Viktor Kruglikov Jan 12 '18 at 15:58
  • In PHP, what if you have a closing tag ?>\n at the end of a PHP file? The \n would lead to an additional, unnecessary newline in the server response. – Viktor Kruglikov Jan 12 '18 at 16:07
  • @ViktorKruglikov I don't think PHP cares whether its source file ends with a newline. But PHP is typically used this way to generate HTML, and a newline at the end of an HTML is not a problem. – Gilles 'SO- stop being evil' Jan 12 '18 at 19:05
  • @Gilles > "and a newline at the end of an HTML is not a problem" But PHP may generate not only HTML, but any type of file: PDF, CSV, or any custom file type. Also, if you work with HTTP headers in PHP and you send any content (including \n) to the response before handling the headers, you get errors. I still think that a newline at the end of a file is not Git's business. Or there is some reason that we're missing. – Viktor Kruglikov Jan 15 '18 at 09:31
  • @ViktorKruglikov If you're generating a binary file with embedded PHP, then your PHP source file is not a text file. It's unusual for source code not to be in text files. Git is primarily designed to handle text files, so its user interface makes text files the default and lets you know if what you have is not a text file. – Gilles 'SO- stop being evil' Jan 15 '18 at 19:03
  • 3
    Why not just update the archaic tools instead so they don't break if there's a missing new line... – Andrew May 11 '19 at 14:27
  • @ViktorKruglikov: This is why there is the ages old PSR-1 rule to not terminate any files with the closing ?> "tag" if that file is not in HTML (output) context (just leave it out, PHP is fine with that). For HTML context +1 what Gilles wrote, HTML also has whitespace normalization and this all should be compatible w/ the HTTP protocol as well, but that just as a note in the margin. – hakre Nov 25 '19 at 21:04
80

Not necessarily the reason, but a practical consequence of files not ending with a new line:

Consider what would happen if you wanted to process several files using cat. For instance, if you wanted to find the word foo at the start of the line across 3 files:

cat file1 file2 file3 | grep -e '^foo'

If the first line in file3 starts with foo, but file2 does not have a final \n after its last line, this occurrence would not be found by grep, because the last line in file2 and the first line in file3 would be seen by grep as a single line.

So, for consistency and in order to avoid surprises, I try to keep my files always ending with a new line.
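
A small reproduction of that scenario, as a sketch assuming a POSIX shell; the file contents are invented for the demonstration.

```shell
tmpdir=$(mktemp -d)
printf 'bar\n'      > "$tmpdir/file1"
printf 'baz'        > "$tmpdir/file2"   # last line is unterminated
printf 'foo\nqux\n' > "$tmpdir/file3"

# file2's "baz" and file3's "foo" fuse into the single line "bazfoo",
# so ^foo matches nothing. grep -c exits non-zero on zero matches,
# hence the "|| true".
joined=$(cat "$tmpdir/file1" "$tmpdir/file2" "$tmpdir/file3" | grep -c -e '^foo' || true)

# Re-create file2 with its final newline and try again.
printf 'baz\n' > "$tmpdir/file2"
fixed=$(cat "$tmpdir/file1" "$tmpdir/file2" "$tmpdir/file3" | grep -c -e '^foo' || true)

echo "matches without final newline: $joined"
echo "matches with final newline:    $fixed"
rm -rf "$tmpdir"
```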

  • 3
    But is it business of git to care about files concatenation? – Viktor Kruglikov Jan 15 '18 at 09:33
  • 3
    Doesn't it stand to reason that you should just put '\n''s in the cat operation... – Andrew May 11 '19 at 14:28
  • 11
    That's like saying, "Sometimes I append Strings together that have \n or whitespace at the ends, so in order to keep things consistent, I always put \n _____ at both ends of my strings." Well, no, the right thing to do there is to have your Strings trimmed and then concatenate them properly. – Andrew May 11 '19 at 14:29
  • 4
    @ViktorKruglikov It is git's business to care about your file content, yes. Git is there to help you keep your data consistent. If it were to ignore newlines it would fail in that purpose. In that same line of thought, it is cat's business to exactly reproduce the data you give it ; It must not add random characters inbetween that you did not specifically ask for. – zaTricky Sep 21 '20 at 16:35
  • @ViktorKruglikov It is the business of Git to maintain the integrity of text files. If you're checking in an invalid text file, it's Git's business to warn you. – Jeff Learman Dec 07 '21 at 14:34
18

There are two aspects:

  1. There are/were some C compilers that cannot parse the last line if it does not end with a newline. The C standard specifies that a C file should end with a newline (C11, 5.1.1.2, 2.) and that a last line without a newline yields undefined behavior (C11, J.2, 2nd item). This is perhaps for historic reasons, because some vendor of such a compiler was part of the committee when the first standard was written. Hence the warning from GCC.

  2. diff programs (such as those used by git diff, GitHub, etc.) show line-by-line differences between files. They usually print a message when only one file ends with a newline, because otherwise you would not see this difference. For example, if the only difference between two files is the presence of the final newline character, then without the hint it would look as if both files were the same, even though diff and cmp return a non-success exit code and the checksums of the files (e.g. via md5sum) don't match.
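
The second point can be checked with a short sketch, assuming a POSIX shell. It uses `cksum` rather than `md5sum` only because `cksum` is specified by POSIX; any checksum tool would show the same thing.

```shell
tmpdir=$(mktemp -d)
printf 'same text\n' > "$tmpdir/a"
printf 'same text'   > "$tmpdir/b"   # identical except for the final newline

# cmp -s prints nothing and reports the result through its exit status only.
if cmp -s "$tmpdir/a" "$tmpdir/b"; then identical=yes; else identical=no; fi

# Reading from stdin, cksum prints "<checksum> <byte count>";
# keep just the checksum field.
sum_a=$(cksum < "$tmpdir/a" | cut -d ' ' -f 1)
sum_b=$(cksum < "$tmpdir/b" | cut -d ' ' -f 1)

echo "identical according to cmp: $identical"
echo "checksums: $sum_a vs $sum_b"
rm -rf "$tmpdir"
```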

maxschlepzig
  • make sense with diff program – Thamaraiselvam May 07 '19 at 06:36
  • 1
    Sounds like diffs should just be smarter. – Andrew May 11 '19 at 14:32
  • 3
    @Andrew, no, it doesn't. diff is expected to print differences if there are any. And if one file has a newline as last character while the other hasn't then that difference must be somehow noticeable in the output. – maxschlepzig May 11 '19 at 14:42
  • Your latter statement is correct. However, the diff viewer does not have to display "newlines" (\n) to begin with, it can instead simply show "new lines". – Andrew May 11 '19 at 14:45
14

The \ No newline at end of file you get from github appears at the end of a patch (in diff format, see the note at the end of the "Unified Format" section).

Compilers don't care whether there is a newline or not at the end of a file, but git (and the diff/patch utilities) have to take it into account. There are many reasons for that. For example, adding or removing a newline at the end of a file changes its hash sum (md5sum/sha1sum). Also, files are not always programs, and a final \n might make some difference.

Note: About the warning from C compilers, I guess they insist on a final newline for backward compatibility purposes. Very old compilers might not accept the last line if it doesn't end with \n (or another system-dependent end-of-line character sequence).

  • 10
    "I guess they insist for a final newline for backward compatibility purposes" - Nope, they insist on it because the C standard mandates it. – MestreLion Aug 28 '13 at 09:25
  • 1
    @MestreLion C requires a final newline for C source code (C11 §5.1.1.2 2). Note that for text file I/O, C has "Whether the last line requires a terminating new-line character is implementation-defined." §7.21.2 2 – chux - Reinstate Monica Sep 04 '18 at 13:39
  • Who's using very old compilers? Stop using them. – Andrew May 11 '19 at 14:31
  • 2
    @MestreLion: And why do you think the C standard mandates it… – Stéphane Gimenez Aug 01 '19 at 10:11
  • @StéphaneGimenez: consistency, better compatibility and interoperability among different OSes (POSIX also defines lines ending in '\n') – MestreLion Aug 03 '19 at 03:37
  • There's very undesirable behaviour that arises from missing the trailing new-line in some languages. In C "header" files are just copied byte-for-byte in place of the #include ... line. So if the header file misses a new-line the next line will be concatenated with the last line of the header. That gives some hilarious syntax errors which are very hard to track. If I remember correctly PHP might be the same. – Philip Couling Aug 12 '21 at 15:12
  • @Andrew everyone who wants to run IBM 360 software and being able to use its compilers and or develop new sw and port new ones to mvs/vm – Stefan Skoglund Oct 22 '21 at 12:41
8

There is also the point of keeping diff history. If a file ends without a newline character, then adding anything to the end of the file will be viewed by diff utilities as changing that last line (because \n is being added to it).

This could cause unwanted results with commands such as git blame and hg annotate.
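
Plain `diff` is enough to see the effect, without involving a repository. A sketch, assuming a POSIX shell and `diff -u`; the file contents are placeholders.

```shell
tmpdir=$(mktemp -d)
printf 'first'           > "$tmpdir/old"   # no final newline
printf 'first\nsecond\n' > "$tmpdir/new"   # one line appended

# The untouched line "first" shows up as removed and re-added,
# because it gained a terminating newline.
noisy=$(LC_ALL=C diff -u "$tmpdir/old" "$tmpdir/new" || true)
printf '%s\n' "$noisy"

# Same edit, starting from a properly terminated file:
# only the appended line is reported.
printf 'first\n' > "$tmpdir/old"
clean=$(LC_ALL=C diff -u "$tmpdir/old" "$tmpdir/new" || true)
printf '%s\n' "$clean"
rm -rf "$tmpdir"
```

In the first diff the existing line is attributed to the new change, which is what line-based annotation tools would then report.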

Hosam Aly
  • Sounds like diffs just need to be smarter. – Andrew May 11 '19 at 14:32
  • 2
    The diffing tools are being smart. They notice the subtle change to the file (which is important because it will inevitably change the file's hash). And both GNU diff and git diff accept a -w option to ignore whitespace changes when outputting data for humans. – joeytwiddle Oct 16 '19 at 03:22
  • @andrew Smart enough to follow any ad-hoc nonstandard text? Where to end with that? Maybe ending text files with a newline was a bad standard, but is it really a good idea to change that now? More things than diff would be affected. – Jeff Learman Dec 07 '21 at 14:39
8

POSIX is a set of standards specified by the IEEE to maintain compatibility between operating systems.

One of its definitions is that of a "line": a sequence of zero or more non-newline characters plus a terminating newline character.

So for that last line to be recognised as an actual "line" it should have a terminating new line character.

This is important if you depend on OS tools to, say, count lines or split and parse your file. Given that PHP is a scripting language, it's entirely possible that, especially in its early days or even now, it had OS dependencies like that (I have no idea; I'm only postulating).
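
For instance, `wc -l` counts newline characters, so an unterminated last line is not counted. A minimal sketch, assuming a POSIX shell:

```shell
# Two complete lines: two newline characters.
terminated=$(printf 'one\ntwo\n' | wc -l)

# Same text without the final newline: "two" is not a complete line.
unterminated=$(printf 'one\ntwo' | wc -l)

echo "terminated: $terminated, unterminated: $unterminated"
```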

In reality, most operating systems are not fully POSIX compliant, and humans are not that machine-like, nor do they care much about terminating newlines. So in practice it's a smorgasbord: some tools care about it, some warn, and some just decide that the last bit of text is really a line and include it.

7

Here is an additional reason. Say you have a file file.txt containing a list of names, with one name per line (or consider a file such as a .gitignore file).

To add a new entry, it makes sense to call something like:

echo "John" >> file.txt

and you expect this to always work, right?

Well, it will actually only work if your file is empty or ends with a newline.

Otherwise, say your file contains:

Alice<no_newline>

when you run:

echo "John" >> file.txt

you will end up with:

AliceJohn
<new_line_here>

which is definitely not what you expected. So having all text files terminated with an EOL makes life much easier.
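
If you cannot be sure the file is properly terminated, a defensive append is possible. This is only a sketch, assuming a POSIX shell and an ordinary text file; `append_line` is a hypothetical helper name, not a standard utility.

```shell
# append_line FILE TEXT (hypothetical helper)
append_line() {
  file=$1
  text=$2
  # $(...) strips a trailing newline, so the substitution is empty when
  # the last byte is already a newline. -s skips missing or empty files.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$text" >> "$file"
}

tmpdir=$(mktemp -d)
printf 'Alice' > "$tmpdir/file.txt"      # unterminated last line
append_line "$tmpdir/file.txt" "John"

result=$(cat "$tmpdir/file.txt")
line_count=$(wc -l < "$tmpdir/file.txt")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Here `Alice` and `John` end up on separate lines, and the file itself ends with a newline afterwards.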

Chevdor