
I was wondering whether hard links count as normal (regular) files, but since most modern search engines don't work well with phrases longer than about 5 words, I need some help on this one.

I was wondering this because I'm making a bash script that has to register files as certain types and make decisions accordingly. This technically isn't important to my project, but I was curious.

Also, if they are considered to be regular files, is there a way to check whether a file is hard linked without having to parse the output of ls -i? And is there a way to check whether some arbitrary file, X, is hard linked to some other arbitrary file, Y, without using find -inum?

2 Answers


In Unix-style systems, the data structure that represents a filesystem object (in other words, the data about a file) is called an "inode".

A file name is just a link to this inode, and is referred to as a "hard link". There is no difference between the first name a file is given and any subsequent link. So the answer is "yes": a hard link to a regular file is a regular file and, indeed, every name of a regular file is a hard link.

The ls -l command will show you how many hard links there are to a file.

For example:

seumasmac@comp:~$ echo Hello > /tmp/hello.txt
seumasmac@comp:~$ ls -l /tmp/hello.txt 
-rw-rw-r-- 1 seumasmac seumasmac 6 Oct  4 13:05 /tmp/hello.txt

Here we've created a file called /tmp/hello.txt. The 1 in the output from ls -l indicates that there is 1 hard link to this file. This hard link is the filename itself, /tmp/hello.txt.

If we now create another hard link to this file:

seumasmac@comp:~$ ln /tmp/hello.txt /tmp/helloagain.txt
seumasmac@comp:~$ ls -l /tmp/hello*
-rw-rw-r-- 2 seumasmac seumasmac 6 Oct  4 13:05 /tmp/helloagain.txt
-rw-rw-r-- 2 seumasmac seumasmac 6 Oct  4 13:05 /tmp/hello.txt

you can now see that both filenames indicate there are 2 hard links to the file. Neither of these is the "proper" filename, they're both equally valid. We can see that they both point to the same inode (in this case, 5374043):

seumasmac@comp:~$ ls -i /tmp/hello*
5374043 /tmp/helloagain.txt  5374043 /tmp/hello.txt
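
For the scripting side of the question, the same comparison can be made without parsing ls output. The following is only a sketch, assuming bash (or another shell whose test supports -ef) and GNU stat; the scratch file names are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: hard-link checks without parsing ls. Paths are hypothetical scratch files.
dir=$(mktemp -d)
echo Hello > "$dir/hello.txt"
ln "$dir/hello.txt" "$dir/helloagain.txt"
echo Other > "$dir/other.txt"

# -ef is true when both names resolve to the same device and inode
if [ "$dir/hello.txt" -ef "$dir/helloagain.txt" ]; then same=yes; else same=no; fi
if [ "$dir/hello.txt" -ef "$dir/other.txt" ]; then unrelated=yes; else unrelated=no; fi

# GNU stat: %h is the hard link count (BSD/macOS stat uses -f %l instead)
links=$(stat -c %h "$dir/hello.txt")

echo "same=$same unrelated=$unrelated links=$links"
rm -rf "$dir"
```

Note that -ef follows symbolic links, so a symlink pointing at Y also compares equal to Y; add a [ ! -L X ] check if that distinction matters. A link count above 1 on a regular file means it has at least one other name somewhere on the same filesystem.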

There is a common misconception that this is different for directories. I've heard people say that the number of links returned by ls for a directory is the number of subdirectories, including . and .., which is incorrect. Or, at least, while it will give you the correct number, it's right for the wrong reasons!

If we create a directory and run ls -ld on it, we get:

seumasmac@comp:~$ mkdir /tmp/testdir
seumasmac@comp:~$ ls -ld /tmp/testdir
drwxrwxr-x 2 seumasmac seumasmac 4096 Oct  4 13:20 /tmp/testdir

This shows there are 2 hard links to this directory. These are:

/tmp/testdir
/tmp/testdir/.

Note that /tmp/testdir/.. is not a link to this directory; it's a link to /tmp. And this tells you why the "number of subdirectories" thing works. When we create a new subdirectory:

seumasmac@comp:~$ mkdir /tmp/testdir/dir2
seumasmac@comp:~$ ls -ld /tmp/testdir
drwxrwxr-x 3 seumasmac seumasmac 4096 Oct  4 13:24 /tmp/testdir

you can now see there are 3 hard links to the /tmp/testdir directory. These are:

/tmp/testdir
/tmp/testdir/.
/tmp/testdir/dir2/..

So every new sub-directory will increase the link count by one, because of the .. entry it contains.
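
The same bookkeeping can be watched from a script. This is a rough sketch that assumes GNU stat and a filesystem that counts directory links in the traditional way (ext4, XFS and tmpfs do; some filesystems, Btrfs for example, always report 1 for directories). The directory names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: watch a directory's link count as entries are added beneath it.
top=$(mktemp -d)
mkdir "$top/testdir"
before=$(stat -c %h "$top/testdir")      # its name in the parent + its own "."

mkdir "$top/testdir/dir2"
after_dir=$(stat -c %h "$top/testdir")   # plus dir2/..

touch "$top/testdir/plainfile"
after_file=$(stat -c %h "$top/testdir")  # a regular file has no ".." entry

echo "before=$before after_dir=$after_dir after_file=$after_file"
rm -rf "$top"
```

On a traditional filesystem the count rises by one for the subdirectory and stays put for the regular file, which is exactly why "number of subdirectories plus 2" happens to give the right figure.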

seumasmac
  • 2,015
  • I understand how metadata, inodes, and hard linking works. I just needed it clarified if the hard linked file was counted as a regular file. This only shows me that the answer is 'yes' because of the column dedicated to this, which implicitly indicates that this is native to all files. So sorry, but I'll have to downvote this :( – Mr. Minty Fresh Oct 07 '15 at 02:22
  • That's fine, I'm sure it will be useful info for someone else. – seumasmac Oct 07 '15 at 02:28
  • Interesting edit with the dotglob hardlink system, I never knew that it does this. – Mr. Minty Fresh Oct 07 '15 at 02:34
  • I clarified the hard links == regular files paragraph. – seumasmac Oct 07 '15 at 02:40
  • Cool, upvoting now. – Mr. Minty Fresh Oct 07 '15 at 02:41
  • Particularly like the sentence: "Neither of these is the 'proper' filename, they're both equally valid." That is a crucial ingredient to an understanding of hard links. Very nicely written. – Wildcard Oct 07 '15 at 06:42
  • You really need a echo World > /tmp/helloagain.txt followed by a cat /tmp/hello.txt /tmp/helloagain.txt somewhere in there, for dramatic effect. – user Oct 07 '15 at 13:16
  • So you could rm either one of a hard-linked file and it doesn't matter? Also, is the lazy creation of so many directory hard links why/how rm ordinarily refuses to remove non-empty directories...n_links != 2 (from parent and self (.))...but what about files... hrm. – Nick T Oct 07 '15 at 19:30
  • rm only removes the link, not the file or inode. Once all hard links to the inode are removed, the inode and disk space are available for re-use (assuming the file is not open). – seumasmac Oct 08 '15 at 02:31

Do hard links count as normal files?

Hard links count as whatever they're linked to. You can link to anything on the same filesystem.

mkdir test
cd !$

>file
ln -s file sym
mknod pipe p

ln file file2
ln -P sym sym2
ln pipe pipe2

ls -al

# sockets, too:
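cat >tsock.c <<\EOD
#include <sys/socket.h>
#include <sys/un.h>
/* create a Unix-domain socket and bind it to the name "socket" in the current directory */
int main(void)
{
        struct sockaddr_un test = { .sun_family = AF_UNIX, .sun_path = "socket" };
        int testfd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
        if (testfd < 0)
                return 1;
        /* exit status is nonzero if bind() fails, e.g. because "socket" already exists */
        return bind(testfd, (struct sockaddr *)&test, sizeof test) != 0;
}
EOD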
make tsock
./tsock

ln socket socket2

ls -al

# even devices if you want:
sudo mknod mytty c 5 0
sudo ln mytty mytty2   # root owns the node; Linux's fs.protected_hardlinks stops other users from linking it
sudo chmod 666 mytty

ls -al
# notice permissions are on an object not on the links to it:
echo Hi, Kilroy! >mytty2  
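
For a script that sorts files by type, the practical upshot is that the usual test operators give the same answer through every hard link. A minimal sketch, assuming bash; classify and all the file names are hypothetical and exist only for this example:

```shell
#!/usr/bin/env bash
# Sketch: file-type tests see the object, not the particular name used to reach it.
dir=$(mktemp -d)
cd "$dir" || exit 1

: > file;      ln file file2     # regular file with two names
mkfifo pipe;   ln pipe pipe2     # named pipe with two names
ln -s file sym                   # symbolic link, for contrast

classify() {
    if   [ -L "$1" ]; then echo symlink    # test -L first: the other tests follow symlinks
    elif [ -f "$1" ]; then echo regular
    elif [ -p "$1" ]; then echo fifo
    elif [ -d "$1" ]; then echo directory
    else                   echo other
    fi
}

for name in file file2 pipe pipe2 sym; do
    printf '%s: %s\n' "$name" "$(classify "$name")"
done
# remove "$dir" when finished
```

Only the symbolic link is a distinct kind of object; file2 and pipe2 are indistinguishable from file and pipe as far as these tests are concerned.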

Every hard link to an object is equivalent, and the underlying object sticks around so long as there's any non-symbolic link at all to it (even an open file descriptor, for which I have embarrassing cause to be very grateful).
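
A short sketch of that lifetime rule, assuming bash and using made-up scratch files: the data is written through one name, read through the other, and finally read through an open descriptor after the last name is gone.

```shell
#!/usr/bin/env bash
# Sketch: the data outlives any single name, and even the last name while a descriptor is open.
dir=$(mktemp -d)
echo Hello > "$dir/hello.txt"
ln "$dir/hello.txt" "$dir/helloagain.txt"

echo World > "$dir/helloagain.txt"     # write through one name...
via_first=$(cat "$dir/hello.txt")      # ...and read it back through the other

rm "$dir/hello.txt"                    # removes one link, not the file
via_second=$(cat "$dir/helloagain.txt")

exec 3< "$dir/helloagain.txt"          # hold the file open on descriptor 3
rm "$dir/helloagain.txt"               # no names are left now
via_fd=$(cat <&3)                      # still readable through the descriptor
exec 3<&-                              # closing it lets the system reclaim the space

echo "$via_first / $via_second / $via_fd"
rmdir "$dir"
```

All three reads return the same contents, because each one reaches the same underlying object by a different route.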

The system does enforce rules on directory links: you get one named link to a directory, and the system automatically adds its embedded . link and any subdirectories' .. links (notice that . in the ls output above has two links). But that's an explicit check; on some modded systems, privileged users who promise promise promise not to make loops can add new links themselves. The filesystem doesn't care. It can represent arbitrary directory graphs just fine, but nobody wants to deal with them.

There are lots of (non-Unix) filesystems that don't work this way, including some that call what they offer as a substitute "hard links". If I recall correctly, OS X has kludged up an equivalent on HFS+ (which doesn't have them natively); I don't know how faithfully it preserves the semantics here.

jthill
  • 2,710
  • what does ./tsock actually do, anyway? – mikeserv Oct 08 '15 at 04:41
  • @mikeserv It's the make-a-socket program above, it just drops a socket link named "socket" in its current directory. – jthill Oct 08 '15 at 04:43
  • ok, but maybe i should have made it clearer about how little i know about sockets. i think i understand links well enough and so it just gives the same socket a new name, right? that doesn't have any special significance for sockets or anything, yeah? sorry about my ignorance. – mikeserv Oct 08 '15 at 04:45
  • @mikeserv A socket's a purely runtime entity. socket() creates an actual socket, bind() gives it a specific name, connect() connects a socket you made to some named socket. Different kinds of sockets use different kinds of names, e.g. Internet sockets use Internet addresses, but they all share common API (including read() and write(), it makes me sad that you can't open() a filesystem socket and have the OS or libc do socket() and connect() for you). man 7 socket has more, all the networking protocols do make for a fidgety manpage. – jthill Oct 08 '15 at 05:03
  • @mikeserv but to actually answer your question (sorry for the wot), yes, every link is just another way to get to the underlying object. – jthill Oct 08 '15 at 05:04
  • if it means any thing, you can almost do it with ttys on linux, but you do at least need to unlockpt() it after < /dev/ptmx. But just doing the latter does create a new pty. i stumbled across that the beginning of this year. – mikeserv Oct 08 '15 at 05:12
  • @mikeserv See, I can spell pty and pts, and probably even ptmx on a good day, but that's about it. :-) at least the 5,0 node works everywhere I can find, it's the controlling-tty device type. I got it with just ls -l /dev/tty, guess I got lucky there. – jthill Oct 08 '15 at 05:22
  • i like you. this is a really good answer, too. thank you. – mikeserv Oct 08 '15 at 05:42
  • I find it interesting how the hard link becomes the same type as the linked file. Thank you for your answer. – Mr. Minty Fresh Oct 14 '15 at 13:07
  • What you're thinking of as "the linked file" is actually two things, the file itself and a link to it. touch newf makes a file and a link to it, ln newf link makes another link to the same file, this one named link. The two links are indistinguishable peers. You can put them in different places in the filesystem, with different permissions on (the paths to) the directories that hold them, but the links themselves have no attributes other than their arbitrary name and the "inode number" that identifies which actual file in their filesystem is being referred to. – jthill May 06 '18 at 01:23