I've noticed that some software comes with multiple Info files. For example, tar on Fedora 21 comes with:

  • tar.info.gz
  • tar.info-1.gz
  • tar.info-2.gz

Are the tar.info-* files dependencies of some sort for the main tar.info.gz file? Is this division unique to each distro?

It seems that on the official GNU tar manual page, the Info tarball contains a single file, so I'm not sure where the -1 and -2 come from.

  • Maybe to limit the size of each file? – Svetlin Tonchev Dec 29 '15 at 07:59
  • After some experimentation, it does appear that the main file needs the -* files to function. I'm not sure that limiting the size of the info file is the goal. After all, the files are not very large. – Tianxiang Xiong Dec 29 '15 at 08:02
  • Yes, normally when you split a big archive into a few smaller ones, you need all the files in order to work with it. I don't know what the sizes are in your particular case, but usually when you see archives like this, the idea is to have a few smaller files instead of one big one. – Svetlin Tonchev Dec 29 '15 at 08:12

1 Answer

As noted, this was originally done to reduce size. It is documented in 23.1.5 Tag Files and Split Files (GNU Texinfo 6.0):

If a Texinfo file has more than 30,000 bytes, texinfo-format-buffer automatically creates a tag table for its Info file; makeinfo always creates a tag table. With a tag table, Info can jump to new nodes more quickly than it can otherwise.

In addition, if the Texinfo file contains more than about 300,000 bytes, texinfo-format-buffer and makeinfo split the large Info file into shorter indirect subfiles of about 300,000 bytes each. Big files are split into smaller files so that Emacs does not need to make a large buffer to hold the whole of a large Info file; instead, Emacs allocates just enough memory for the small, split-off file that is needed at the time. This way, Emacs avoids wasting memory when you run Info. (Before splitting was implemented, Info files were always kept short and include files were designed as a way to create a single, large printed manual out of the smaller Info files. See Include Files, for more information. Include files are still used for very large documents, such as The Emacs Lisp Reference Manual, in which each chapter is a separate file.)
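
The "indirect subfiles" in that excerpt are what tie tar.info to tar.info-1 and tar.info-2: as far as I understand the format, the main file holds little more than a table naming each subfile and the byte offset at which it starts. Here is a minimal Python sketch for listing that table. It assumes the layout makeinfo normally writes (sections separated by the ^_ character, one of them beginning with the line "Indirect:"), and the sample text and its offsets are made up for illustration.

```python
SEP = "\x1f"  # Info files separate sections with the ^_ (0x1F) character


def parse_indirect_table(main_text):
    """Return [(subfile_name, byte_offset), ...] from a split Info main file.

    Returns an empty list when no "Indirect:" section is found, which is
    what you would expect for an Info file that was not split.
    """
    for section in main_text.split(SEP):
        lines = section.strip().splitlines()
        if lines and lines[0].strip() == "Indirect:":
            entries = []
            for line in lines[1:]:
                # Each entry is assumed to look like "tar.info-1: 1234".
                name, _, offset = line.rpartition(": ")
                if name and offset.strip().isdigit():
                    entries.append((name, int(offset)))
            return entries
    return []


# Hypothetical main-file contents; real offsets will differ.
sample = (
    "This is tar.info, produced by makeinfo.\n"
    "\x1f\n"
    "Indirect:\n"
    "tar.info-1: 1234\n"
    "tar.info-2: 301234\n"
    "\x1f\n"
    "Tag Table:\n"
    "(Indirect)\n"
    "Node: Top\x7f1234\n"
    "\x1f\n"
    "End Tag Table\n"
)

for name, offset in parse_indirect_table(sample):
    print(name, offset)
```

To try it on an installed manual, the text of a compressed file such as tar.info.gz can be read with gzip.open(path, "rt") and passed to parse_indirect_table. An unsplit Info file should simply give an empty list.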

The splitting feature is very old. The Texinfo change-log first mentions it in 1993, but the feature may well have been added before the change-log began in 1988:

Tue Feb  2 08:38:06 1993  Noah Friedman  (friedman@prep.ai.mit.edu)
    * info/Makefile.in: Replace all "--nosplit" arguments to makeinfo
    with "--no-split"

Thomas Dickey
  • Ah, so it is about size after all. OK, but it seems the size limits stated in the Texinfo manual are not well followed, especially by GNU Info files! My Elisp Info file is 939 kB, the Emacs Info file is 668 kB, and the GCC Info file is 576 kB... in gzipped form! I haven't noticed any performance issues when loading them in Emacs, however. On the other hand, my GNU Common Lisp (GCL) Info file is split into a whopping 54 files, all from 1995! Is splitting Info files something of an archaic practice, given modern computing capabilities? – Tianxiang Xiong Dec 29 '15 at 17:37