The Unix text file format is a sequence of lines (i.e., records), potentially variable in length, of electronic text. At the end of each line is the newline character. At the end of file, there is an end-of-file character.
Is this an accurate description of the contents of a file?
Up to but excluding that last bolded part, yes. But I don't know of any Unixy systems that would use an end-of-file character; they all store the length of a file down to the byte, making such markers unnecessary.
Then again, it appears there have been systems that did use an end-of-file character. At least Wikipedia claims that:
The CP/M file system only recorded the lengths of files in multiples of 128-byte "records", so by convention a Control-Z character was used to mark the end of meaningful data if it ended in the middle of a record.
Having file lengths stored only to the nearest block would require some sort of convention to encode the end of the last line within the data stream. Any programs handling binary data would of course also have to deal with the coarser file sizes somehow. With binary files it might be easier to ignore the trailing "extra" bytes, though.
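To illustrate the Control-Z convention described in the quote, here is a minimal sketch in C. The helper name `cpm_text_length` is made up for this example, and it assumes the file's contents are already in memory as whole 128-byte records; it is not taken from any real CP/M code.

```c
#include <stddef.h>
#include <string.h>

#define CTRL_Z 0x1A  /* ASCII SUB, used by convention as the end-of-text marker */

/* Hypothetical helper: given data stored in whole records, return how many
 * bytes are meaningful text, i.e. everything before the first Control-Z.
 * If there is no Control-Z, the text fills the stored length exactly. */
size_t cpm_text_length(const unsigned char *data, size_t stored_len)
{
    const unsigned char *mark = memchr(data, CTRL_Z, stored_len);
    return mark ? (size_t)(mark - data) : stored_len;
}
```

Note that this only works for text: a binary file could legitimately contain byte 0x1A anywhere, which is why such a marker can't be applied to arbitrary data.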
I think I've seen Control-Z used as an EOF marker on MS-DOS, but it wasn't necessary there either.
That quoted text seems to have a mistaken idea of text files in current systems. If we look at what the POSIX standard says, there's no mention of an end-of-file character or marker for text files, just that they contain no NUL bytes and consist of lines (ending in newlines).
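As a rough sketch of that definition, the check below tests only the two properties mentioned here: no NUL bytes, and a newline at the end of the last line. The function name `looks_like_posix_text` is my own, and the sketch deliberately ignores other details such as line length limits, so treat it as an illustration rather than a conformance test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the buffer has no NUL bytes and, when
 * non-empty, ends with a newline. No end-of-file marker is looked for,
 * because the length alone tells us where the data stops. */
bool looks_like_posix_text(const unsigned char *buf, size_t len)
{
    if (len == 0)
        return true;                     /* zero lines is still fine */
    if (memchr(buf, '\0', len) != NULL)
        return false;                    /* NUL bytes are not allowed */
    return buf[len - 1] == '\n';         /* last line must be terminated */
}
```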
See also: What's the last character in a file?
As for this part...
For the GOES-R ground system, [...] and end-of-file characters conform to the American Standard Code for Information Interchange (ASCII).
Like others have said in the comments, there's no character for end-of-file in ASCII, at least not with that name (*). Control-Z mentioned above is 26, or "substitute" (SUB), "used to indicate garbled or invalid characters". So, based on just that text, it would be hard to know what the EOF character would be, were it used.
(* There's "end of text" (ETX, code 3), "end of transmission" (EOT, code 4), "end of transmission block" (ETB, 23), "end of medium" (EM, 25) and also "file separator" (FS, 28). Close, but not exact.)
I thought that the end of file was a condition that the operating system or a library routine was returning when no more data can be read from a file (or other stream).
That's what it is, indeed. The system call read() returns zero bytes (with no error) when the end of a file is reached, while some stdio functions (e.g. getchar()) have a special return value for it, unsurprisingly called EOF.
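Both forms of the end-of-file condition can be seen in a short C sketch. The two helper names are invented for this example; the only real interfaces used are read(), getc() (the stream-argument sibling of getchar()), ferror() and the EOF macro.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: count bytes on a descriptor until read() reports
 * end of file. Returns the count, or -1 if read() fails. */
long count_bytes_fd(int fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
        } else if (n == 0) {
            return total;          /* end of file: zero bytes, no error */
        } else if (errno != EINTR) {
            return -1;             /* a real error, not end of file */
        }
    }
}

/* Hypothetical helper: the same with stdio. getc() returns an int so that
 * EOF can be distinguished from every valid byte value; storing the result
 * in a char would lose that distinction. */
long count_bytes_stdio(FILE *fp)
{
    long total = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        total++;
    return ferror(fp) ? -1 : total;
}
```

In neither case is anything read from the file itself to signal the end: the condition comes from the return value, not from a byte in the data.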
See also: Difference between EOT and EOF
EOT (^D) to signal the end of the file (or input), but that's app-specific, far from universal or required. DOS/Windows uses (or used to use, I dunno any more) ^Z. – cas Aug 28 '19 at 11:27