157

I read in textbooks that Unix/Linux doesn't allow hard links to directories but does allow soft links. Is it because, if we create hard links and cycles form, then after some time when we delete the original file, the link will point to some garbage value?

If cycles were the sole reason behind not allowing hard links, then why are soft links to directories allowed?

user3539
  • 4,378
  • 4
    Where should .. point to? Especially after removing the hard link to this directory, in the directory pointed to by ..? It needs to point somewhere. – Thorbjørn Ravn Andersen Jun 24 '13 at 22:50
  • 3
    .. doesn't need to physically exist on any drive. It's the operating system's job to keep track of the current working directory, anyway, so it should be relatively simple to also keep a list of inodes associated with each process' cwd and refer to that when it sees a use of ... Of course, that would mean symlinks would need to be created with that in mind, but you already have to be careful not to break symlinks, and I don't think that additional rule would render them useless. – Parthian Shot Feb 04 '15 at 17:28
  • I like this explanation. Concise and easy to read and/or skim. – Trevor Boyd Smith Dec 02 '16 at 14:11

7 Answers

161

This is just a bad idea, as there is no way to tell the difference between a hard link and an original name.

Allowing hard links to directories would break the directed acyclic graph structure of the filesystem, possibly creating directory loops and dangling directory subtrees, which would make fsck and any other file tree walkers error prone.

First, to understand this, let's talk about inodes. The data in the filesystem is held in blocks on the disk, and those blocks are collected together by an inode. You can think of the inode as THE file.  Inodes lack filenames, though. That's where links come in.

A link is just a pointer to an inode. A directory is an inode that holds links. Each filename in a directory is just a link to an inode. Opening a file in Unix also creates a link, but it's a different type of link (it's not a named link).

A hard link is just an extra directory entry pointing to that inode. When you ls -l, the number after the permissions is the named link count. Most regular files will have one link. Creating a new hard link to a file will make both filenames point to the same inode. Note:

% ls -l test
ls: test: No such file or directory
% touch test
% ls -l test
-rw-r--r--  1 danny  staff  0 Oct 13 17:58 test
% ln test test2
% ls -l test*
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test2
% touch test3
% ls -l test*
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test2
-rw-r--r--  1 danny  staff  0 Oct 13 17:59 test3
            ^
            ^ this is the link count

Now, you can clearly see that there is no such thing as a hard link. A hard link is the same as a regular name. In the above example, test or test2, which is the original file and which is the hard link? By the end, you can't really tell (even by timestamps) because both names point to the same contents, the same inode:

% ls -li test*  
14445750 -rw-r--r--  2 danny  staff  0 Oct 13 17:58 test
14445750 -rw-r--r--  2 danny  staff  0 Oct 13 17:58 test2
14445892 -rw-r--r--  1 danny  staff  0 Oct 13 17:59 test3

The -i flag to ls shows you inode numbers in the beginning of the line. Note how test and test2 have the same inode number, but test3 has a different one.

Now, if you were allowed to do this for directories, two different directories at different points in the filesystem could point to the same thing. In fact, a subdirectory could point back to its grandparent, creating a loop.

Why is this loop a concern? Because when you are traversing it, there is no way to detect that you are looping (without keeping track of inode numbers as you traverse). Imagine you are writing the du command, which needs to recurse through subdirectories to find out about disk usage. How would du know when it hit a loop? Detecting that is error prone and a lot of bookkeeping that du would have to do, just to pull off this simple task.
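To make that bookkeeping concrete, here is a rough, hypothetical sketch (bash, assuming GNU stat; the walk function name is made up for illustration) of what such a tree walker would have to do if directory loops could exist: remember every device:inode pair it has visited and refuse to descend into one it has already seen.

#!/bin/bash
# Hypothetical sketch: a tree walker that guards against directory loops by
# remembering every device:inode pair it has already visited.
declare -A seen

walk() {
    local dir=$1 id
    id=$(stat -c '%d:%i' "$dir")       # GNU stat; on BSD/macOS use: stat -f '%d:%i'
    if [[ -n ${seen[$id]} ]]; then
        echo "loop detected at $dir, skipping" >&2
        return
    fi
    seen[$id]=1
    echo "$dir"
    local entry
    for entry in "$dir"/*/; do         # visible subdirectories only
        [[ -d $entry && ! -L ${entry%/} ]] && walk "${entry%/}"
    done
}

walk "${1:-.}"

The point of the sketch is just that, with directory cycles possible, every tool that walks the tree would need to carry this kind of table.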

Symlinks are a whole different beast, in that they are a special type of "file" that many filesystem APIs tend to automatically follow. Note that a symlink can point to a nonexistent destination, because it points by name, and not directly to an inode. That concept doesn't make sense with hard links, because the mere existence of a "hard link" means the file exists.

So why can du deal with symlinks easily and not hard links? We were able to see above that hard links are indistinguishable from normal directory entries. Symlinks, however, are special, detectable, and skippable!  du notices that the symlink is a symlink, and skips it completely!

% ls -l 
total 4
drwxr-xr-x  3 danny  staff  102 Oct 13 18:14 test1/
lrwxr-xr-x  1 danny  staff    5 Oct 13 18:13 test2@ -> test1
% du -ah
242M    ./test1/bigfile
242M    ./test1
4.0K    ./test2
242M    .
Danny Dulai
  • 1,968
  • 12
    Allowing hard links to directories would break the directed acyclic graph structure of the filesystem. Can you please explain more about the problem with cycles using hard links? Why is it ok with symlinks – user3539 Oct 12 '11 at 01:13
  • 1
    @user3539, I have updated the answer with more explanation. – Danny Dulai Oct 13 '11 at 22:16
  • 38
    They seem to have allowed it on Macs by adding cycle detection to the link() system call, and refusing to allow you to create a directory hard link if it would create a cycle. Seems to be a reasonable solution. – psusi Oct 14 '11 at 20:08
  • 11
    @psusi mkdir -p a/b; nocheckln c a; mv c a/b; -- the nocheckln there is a theoretical ln that doesnt check for directory args, and just passes to link, and because no cycle is made, we are all good in creating 'c'. then we move 'c' into 'a/b', and a cycle is created from a/b/c -> a/ -- checking in link() is not good enough – Danny Dulai Oct 15 '11 at 02:05
  • 1
    @DannyDulai, yea, I guess rename() needs to check as well... – psusi Oct 17 '11 at 13:26
  • 2
    @psusi: for example, here is some discussion of the need to implement cycle detection for rename() (w.r.t. the planned features of reiserfs): "Cycle may consists of more graph nodes than fits into memory. Cycle detection is crucial for rename semantics, and if cycle-just-about-to-be-formed doesn't fit into memory it's not clear how to detect it, because tree has to be locked while checked for cycles, and one definitely doesn't want to keep such a lock over IO." – imz -- Ivan Zakharyaschev Nov 25 '11 at 15:44
  • +1. A corollary is that the kernel would have to deal with loops of directories that are not linked by the root. http://stackoverflow.com/a/7720649/778990 – ignis Oct 11 '12 at 05:19
  • 5
    Cycles are very bad. Windows has this problem with "junctions" which are hard link directories. If you accidentally apply permissions to your whole profile, it uncovers a series of junctions that create an infinite cycle. Recursing through the directories recurses until path length limitations stop it. – doug65536 Jun 15 '13 at 19:46
  • +1 for the good explanation, but you left me with a hole in my entire knowledge of computer programming. does it take more than one stack per mount-point to track cycles? – Behrooz Oct 16 '13 at 15:14
  • 1
    Please clarify the answer, hard link names haven't got timestamps (yes, you can infer from the directory timestamps when the links where created, if they were the last operation on the directory, but that is quite outlandish...) – vonbrand Jan 25 '14 at 15:58
  • 2
    du is not a good reason as to why hard links to dirs is not allowed. du already has to do lots of "book keeping" by keeping track of inodes, to make sure it doesn't count blocks twice. – Lqueryvg Nov 20 '14 at 15:04
  • 1
    du was just an example, I'm sure you can use your imagination to extrapolate. – Danny Dulai Dec 16 '14 at 23:44
  • @WhiteWinterWolf, I've never used macs myself but read they have the ability somewhere. Years later I now have no idea where. You might need a force switch or something. – psusi Jan 23 '16 at 00:21
  • 6
    @WhiteWinterWolf, according to this link, they specifically added support for it for time machine, but only root is allowed to do it: http://superuser.com/questions/360926/creating-a-hardlinked-directory-loop-on-a-mac – psusi Jan 23 '16 at 16:18
  • @psusi: Funny, the quoted manpage does not come from OSX but from Linux. This flag forces ln to pass the parameters to the linkat() system function in the hope that the request will not be rejected by lower layers (as shown by strace). Interesting, despite what the manpage states this flag works the same way for root and unprivileged users (tested on Fedora and Debian). – WhiteWinterWolf Jan 23 '16 at 16:49
  • And, while not an authoritative source, the last paragraph of this page is relevant for OSX, a change seems to have been made at the file system level specifically for Time Machine starting from OSX Leopard, chances are that it use some different call than the standard link() call and, in all case, this feature is not meant to be provided to the end-user and is not supported by ln. Thanks @psusi :) ! – WhiteWinterWolf Jan 23 '16 at 16:56
  • @WhiteWinterWolf, of course the utility works the same way for root as for other users: it is the kernel that checks permissions, not the utility. Also you seem to have linked a rather old Ubuntu man page that says nothing about OSX. – psusi Jan 23 '16 at 19:56
  • @psusi: Wrong URL, sorry (too sad we cannot correct the comments). Here is the link relevant regarding OSX Time Machine. – WhiteWinterWolf Jan 24 '16 at 09:38
  • 2
    @DannyDulai, you write as though du cannot cope with hard links and of the "bookkeeping" it "would have to do, just to pull off this simple task". The truth is that du already copes with hard links and has done so for many years quite easily by keeping track of the inodes it's already visited. Also, du does not skip symlinks "completely" as you say; they have an inode and a size and have to be counted. The onus should be on you to come up with a better example than du, not for the reader to "imagine" one, because what you've written misrepresents how du really works. – Lqueryvg May 14 '16 at 09:07
  • @Lqueryvg -- "imagine you are writing the du command" does not mean you are tasked with writing the modern full featured du. It is a thought exercise meant to make you think about writing a command that iterates through directories and files. As for skipping, the skipping is in the context of drilling deeper into the directory hierarchy, not absolute skipping. You are picking at details of English, and not taking the text in context. – Danny Dulai May 18 '16 at 14:46
  • 2
    @DannyDulai, rest assured I understand the context. But, "the du command" versus "a command like du"? And "skips it completely" versus "doesn't drill deeper"? Is that just picking at English ? I don't think so. I can suggest some fairly minimal edits which would clear all of this up if you want. – Lqueryvg May 18 '16 at 22:12
  • @Lqueryvg : go for it.. I welcome the edits :-) – Danny Dulai May 19 '16 at 00:46
  • @DannyDulai, well I tried but my edits were rejected; not by you. Considering the nature of my edits (minor clarification only), I'm amazed. – Lqueryvg May 24 '16 at 15:14
  • 1
    @psusi hard link on directory are not allowed on MacOS new filesystem APFS. – gagarine Jan 23 '19 at 10:33
  • 3
    Actually, the "file tree walkers" could still easily avoid loops when traversing the tree: Only recurse if the subdirectory's ".." points back to the parent. @Lqueryvg's answer seems to get more directly to the real problem: The very idea of "parent directory" would have to be redesigned from the ground up. Much easier to restrict directory hardlinks. – Matt Oct 15 '19 at 09:07
  • "Because when you are traversing, there is no way to detect you are looping (without keeping track of inode numbers as you traverse). " -- You must keep track of inode numbers because of symlinks, and commands like du and find do exactly that. – Jim Balter Dec 26 '21 at 13:46
  • 1
    "Because when you are traversing, there is no way to detect you are looping (without keeping track of inode numbers as you traverse)" You mean like this. It's really not that complex or performance intensive to solve. – Philip Couling Jun 07 '23 at 12:17
  • @PhilipCouling You are using 2023 logic for decisions made in the early 1980s. Check this out from 1986:

    https://archive.org/details/designofunixoper00bach/page/128/mode/2up

    – Danny Dulai Jun 08 '23 at 19:20
  • 1
    @DannyDulai Not sure where 2023 from. I'm using my own logic from 2016 that I figured out in the time it took to type the question. I generally regard the authors of unix as having more brains than I do. – Philip Couling Jun 08 '23 at 19:44
  • 1
    @PhilipCouling The decision not to allow links to directories was made long before the 1980s, and the logic of the decision no longer applies due to machines being much faster and having much more memory. When Bill Joy added symlinks to BSD (n the 1980s, in fact), he should have removed the hard link limitation because the issues are the same--programs that traverse directory trees must maintain a "visited" table. Everything in this answer and the author's subsequent comments is wrong. Multics from which UNIX came had symlinks and directory loops. – Jim Balter Aug 22 '23 at 03:53
19

With the exception of mount points, each directory has one and only one parent: ..

One way to do pwd is to check the device:inode for '.' and '..'. If they are the same, you have reached the root of the file system. Otherwise, find the name of the current directory in the parent, push that on a stack, and start comparing '../.' with '../..', then '../../.' with '../../..', etc. Once you've hit the root, start popping and printing the names from the stack. This algorithm relies on the fact that each directory has one and only one parent.
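As a rough illustration, that algorithm fits in a few lines of shell. This is only a sketch: it assumes GNU stat and ls, ignores mount points, and will misparse names containing whitespace.

#!/bin/bash
# Sketch of the pwd algorithm described above: climb via .. until "." and
# ".." resolve to the same device:inode, collecting each directory's name
# from its parent on the way up.
path=""
while :; do
    here=$(stat -c '%d:%i' .)          # GNU stat; on BSD/macOS use: stat -f '%d:%i'
    parent=$(stat -c '%d:%i' ..)
    [ "$here" = "$parent" ] && break   # "." and ".." match: we are at the root
    ino=${here#*:}
    # Find our own name in the parent directory by matching inode numbers.
    name=$(ls -i .. | awk -v ino="$ino" '$1 == ino { print $2; exit }')
    path="/$name$path"
    cd .. || exit 1
done
echo "${path:-/}"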

If hard links to directories were allowed, which one of the multiple parents should .. point to? That is one compelling reason why hardlinks to directories are not allowed.

Symlinks to directories don't cause that problem. If a program wants to, it could do an lstat() on each part of the pathname and detect when a symlink is encountered. The pwd algorithm will return the true absolute pathname for a target directory. The fact that there is a piece of text somewhere (the symlink) that points to the target directory is pretty much irrelevant. The existence of such a symlink does not create a loop in the graph.
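For example, a shell sketch of that per-component check might look like this (hypothetical; test -L performs the lstat() for us, and the default path is just an example):

#!/bin/bash
# Sketch: walk each component of a path and report which ones are symlinks
# (test -L performs the lstat() for us).
target=${1:-/usr/local/bin/python}     # example path only; pass your own as $1
p=""
IFS=/ read -r -a parts <<< "$target"
for part in "${parts[@]}"; do
    [ -n "$part" ] || continue         # skip the empty component before the leading /
    p="$p/$part"
    if [ -L "$p" ]; then
        echo "$p is a symlink -> $(readlink "$p")"
    fi
done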

Joe Inwap
  • 524
  • 4
  • 3
  • 4
    Not so sure about this. If we think of .. as being a sort of virtual hardlink to the parent, there is no technical reason that the target of the link can only have one other link to it. pwd would just have to use a different algorithm to resolve the path. – Benubird Mar 05 '14 at 09:44
  • .. only needs to refer to the inode of the parent. if the parent has hard links, those are just paths (names in other directories) that refer to the same inode. if .. was a path to the parent, then it would be like a symlink. – Skaperen Aug 30 '21 at 19:20
  • 1
    .. links aren't needed at all and some filesystems don't create them. You can always figure out the parent from the path (prefix the cwd for relative paths). – Jim Balter Dec 26 '21 at 13:53
  • @JimBalter No you can't, because you don't always know the path. scandirat() isn't given a full path, only a handle to the current directory and a name. This means the full path of current directory can completely change without code even knowing. This is how the shell works and explains why you can cd into a directory and then rename it without the shell getting into trouble. – Philip Couling Jun 07 '23 at 12:30
  • ^ This person is confused. Modern shells always maintain the current working directory ... that's how they can print it in the prompt. That the cwd might differ from what the .. chain yields because of renames is not relevant. – Jim Balter Jun 08 '23 at 16:14
  • P.S. If the renames make the path to the directory non-existent or inaccurate, that's life. That's exactly what happens on filesystems that have no .. links. – Jim Balter Jun 09 '23 at 05:04
  • "If hard links to directories were allowed, which one of the multiple parents should .. point to? That is one compelling reason why hardlinks to directories are not allowed." -- not really. .. would point to either the first hard link or the last one, depending on the implementation. The original reason not to have hard links to directories was to avoid loops, but loops are possible with symlinks so hard link loops could be handled the same way symlink loops are. (That wasn't feasible on early unix on tiny machines.) – Jim Balter Jun 09 '23 at 05:05
  • "Symlinks to directories don't cause that problem." -- this is confused. The .. entry doesn't point to the parent of the symlink, of course, but a directory with a hard link to it and a symlink to it can be reached by two paths, but .. only selects one of them ... exactly the same as if there were two hard links. "it could do an lstat() on each part of the pathname and detect when a symlink is encountered" -- er, no ... each part of what pathname? – Jim Balter Jun 09 '23 at 05:15
  • "The existence of such a symlink does not create a loop in the graph." -- Sure it does, which is why programs that traverse directory graphs have to watch out for them. e.g., /a/b could be a symlink to /a. – Jim Balter Jun 09 '23 at 05:21
  • @JimBalter forgive me disputing your historic knowledge. Trying to understand this comment. Maintaining a string or LL of strings as the CWD doesn't help when executing cd ... That CWD string path might no longer exist. Yet cd .. does work after a grandparent rename. We know shells hold an open dir fd. This points to shells relying on the .. link for cd, and points to the CWD not being used. Not confused, i'm just interpreting evidence. – Philip Couling Aug 22 '23 at 04:18
  • @PhilipCouling "Maintaining a string or LL of strings as the CWD doesn't help when executing cd .."-it does IF THERE ARE NO .. LINKS, which was the context of the comment-".. links aren't needed at all". "That CWD string path might no longer exist" -- so the shell would report a non-existent directory when you attempt the cd. "Yet cd .. does work after a grandparent rename." -- in POSIX systems with ".." links ... but the shell's autocomplete will be confused because it uses the saved cd path. "We know shells hold an open dir fd." -- no, they have a curdir, as all processes do. .. is relative. – Jim Balter Aug 23 '23 at 09:40
  • @PhilipCouling " Not confused" -- you're confusing the current implementation where directories have ".." links with a possible implementation where they don't. Go back and read all the comments for context. I'm retired, no longer interested in this, and won't comment further. – Jim Balter Aug 23 '23 at 09:42
  • @JimBalter calling me confused is a bit rude and in this case wholly wrong! I was demonstrating your claim that Linux doesn't need .. links leads to some features being impossible to implement,and thus proving that Linux in fact does need .. links. But since your approach to logical debate is to keep calling me confused without offering any logical counter argument then, yes, we'd better leave this one here. – Philip Couling Aug 23 '23 at 12:05
9

I'd like to add a few more points about this question. Hard links for directories are allowed in Linux, but in a restricted way.

One way we can see this is that when we list the contents of a directory, we find two special entries, "." and "..". As we know, "." points to the directory itself and ".." points to the parent directory.

So let's create a directory tree where "a" is the parent directory and "b" is its child.

 a
 `-- b

Note down the inode of directory "a". When we do ls -la inside directory "a", we can see that the "." entry also points to the same inode.

797358 drwxr-xr-x 3 mkannan mkannan 4096 Sep 17 19:13 a

And here we can see that directory "a" has three hard links. This is because inode 797358 has three links: one named "a" itself, one named "." inside directory "a", and one named ".." inside directory "b".

$ ls -ali a/
797358 drwxr-xr-x 3 mkannan mkannan 4096 Sep 17 19:13 .

$ ls -ali a/b/
797358 drwxr-xr-x 3 mkannan mkannan 4096 Sep 17 19:13 ..

So here we can understand that hard links exist for directories only to connect them with their parent and child directories. A directory without any children will therefore have only 2 hard links, which is why directory "b" has only two.
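To see the same relationship directly, here is a hypothetical session (GNU stat; %h prints the link count, and exact inode numbers will differ) showing that every new subdirectory adds exactly one ".." link to its parent:

$ mkdir -p a/b
$ stat -c '%h %n' a        # 3 links: "a", "a/." and "a/b/.."
3 a
$ mkdir a/c
$ stat -c '%h %n' a        # one more subdirectory, one more ".." link
4 a
$ stat -c '%h %n' a/b      # a leaf directory has just 2: "a/b" and "a/b/."
2 a/b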

One reason why freely hard linking directories is prevented is to avoid infinite reference loops, which would confuse programs that traverse the filesystem.

As the filesystem is organised as a tree, and a tree cannot have cyclic references, this had to be avoided.

Kannan Mohan
  • 3,231
  • 2
    Good example. It cleared my doubt. So these cases are handled in a special way to avoid infinite loops. right? – G Gill Oct 07 '14 at 01:42
  • 1
    As we have a limited way of allowing hard links for directories i.e ".." and "." we will not reach a infinite loop and so we would not require any special ways to avoid those as they will not happen :) – Kannan Mohan Oct 08 '14 at 03:20
@GGill There is in fact special handling--programs that descend directory trees check the entries of a directory for the names "." and ".." and don't process them, as that would result in an infinite loop. – Jim Balter Aug 22 '23 at 03:12
8

None of the following are the real reason for disallowing hard links to directories; each problem is fairly easy to solve:

  • cycles in the tree structure cause difficult traversal
  • multiple parents, so which is the "real" one ?
  • filesystem garbage collection

The real reason (as hinted by @Thorbjørn Ravn Andersen) comes when you delete a directory which has multiple parents, from the directory pointed to by ..:

What should .. now point to ?

If the directory is deleted from its parent but its link count is still greater than 0 then there must be something, somewhere still pointing to it. You can't leave .. pointing to nothing; lots of programs rely on .., so the system would have to traverse the entire file system until it finds the first thing that points to the deleted directory, just to update ... Either that, or the file system would have to maintain a list of all directories pointing to a hard linked directory.

Either way, this would be a performance overhead and an extra complication for the file system meta data and/or code, so the designers decided not to allow it.

Lqueryvg
  • 1,979
  • 3
    That's easy to solve as well: keep a list of parents of a child directory, which you update when you add or remove a link to the child. When you delete the canonical parent (the target of the child's ..), update .. to point to one of the other parents in the list. – jathd Jan 23 '15 at 19:29
  • 3
I agree. Not rocket science to solve. But nonetheless a performance overhead, and it would take up a little bit of extra space in the file system metadata and add complication. And so the designers went for the simple, fast approach - don't allow hard links to directories. – Lqueryvg Jan 24 '15 at 12:54
  • 1
    Sym links to dirs "violate settled semantics and behaviours", yet they are still allowed. Some commands therefore need options to control whether sym links are followed (e.g. -L in find and cp). When a program follows '..' there is further confusion, hence the difference in output from pwd and /bin/pwd after traversing a sym link.

    There are no "Unix answers"; just design decisions. This one revolves around what becomes of ".." as I stated in my answer. Unfortunately, '..' isn't even mentioned in the answer that everyone else is so sheepishly voting for.

    – Lqueryvg May 11 '16 at 21:18
  • BTW, I'm not saying I'm in favour of hard links to dirs. Not at all. I don't want my day job to be harder than it is already. – Lqueryvg May 11 '16 at 21:19
  • 2
    It's not what POSIX says, but IMO '..' should have never been a filesystem concept, rather resolved syntactically on the paths, so that a/.. would always mean .. This is how URLs work, btw. It's the browser that's resolving '..' before it even hits the server. And it works great. – ybungalobill Dec 26 '17 at 04:31
  • There's no such thing as deleting a directory (or file) that has hard links to it ... they aren't deleted until the link count goes to 0. (Removing a name from a directory doesn't delete the named thing if there are links to it. Removing a directory name doesn't remove the ".." link to it.) It doesn't matter whether there is 1 parent or more, the issues are the same, The real reason is dealing with cycles ... they didn't want to maintain "visited" tables for dump and other programs that enumerated trees. But those tables now exist due to symlinks. – Jim Balter Aug 22 '23 at 03:22
6

Hard link creation on directories would be irreversible. Suppose we have:

/dir1
├──this.txt
├──directory
│  └──subfiles
└──etc

I hardlink it to /dir2.

So /dir2 now also contains all these files and directories

What if I change my mind? I can't just rmdir /dir2 (because it is non-empty).

And if I recursively delete the contents of /dir2... they will be deleted from /dir1 too!

IMHO that is more than sufficient reason to avoid this!

Edit :

Comments suggest removing the directory by doing rm on it. But rm on a non-empty directory fails, and this behaviour must remain whether the directory is hard linked or not. So you can't just rm it to unlink it. It would require a new argument to rm, just to say "if the directory inode has a reference count > 1, then only unlink the directory".

Which, in turn, breaks the principle of least surprise: it means that removing a directory hard link I just created is not the same as removing a normal file hard link...

I will rephrase my sentence: without further development, hard link creation would be irreversible (as no current command could handle the removal without being inconsistent with current behaviour).

If we allowed further development to handle this case, the number of pitfalls such a development implies, and the risk of data loss for anyone not sufficiently aware of how the system works, are IMHO sufficient reason to restrict hard linking of directories.

  • 1
    That should not be a problem. With your case, when we create hardlink to dir2 we have to make hardlink to all the contents in dir1 and so if we rename or delete dir2 only an extra link to the inode gets deleted. And that should not affect dir1 and its content as there is atleast one link (dir1) to the inode. – Kannan Mohan Sep 17 '14 at 13:20
  • 5
    Your argument is incorrect. You would just unlink it, not do rm -rf. And if the link count reaches 0, then the system would know it can delete all the contents too. – LtWorf Jun 12 '17 at 12:55
  • 1
    That's more or less all rm does underneath anyway (unlink). See: https://unix.stackexchange.com/questions/151951/what-is-the-difference-between-rm-and-unlink This really isn't an issue, any more than it is with hardlinked files. Unlinking just removes the named reference and decrements the link count. The fact that rmdir won't delete non-empty directories is irrelevant - it wouldn't do that for dir1 either. Hardlinks aren't copies of data, they are the same actual file, hence actually "deleting" the dir2 file would erase the directory listing for dir1. You would always need to unlink. – BryKKan Jul 11 '19 at 00:49
  • 1
    You can't just unlink it like a normal file, because rm on a directory don't unlink it if it's non empty. See Edit. – Pierre-Olivier Vares Jul 12 '19 at 07:16
  • This is just wrong. If hard links to directories were allowed, then of course the rmdir system call would not remove the inode if the link count indicated that there were other links, just as with the unlink system call. "without being incoherent with current behaviour" -- current behavior is that you can't hardlink to directories. Changing that of course implies that rmdir changes accordingly. "IMHO a sufficient reason to restrict hardlinking on directories" --- I have opinions too, but that's not what the question asks for. – Jim Balter Dec 26 '21 at 14:02
  • @KannanMohan "when we create hardlink to dir2 we have to make hardlink to all the contents in dir1" -- no ... where would these hardlinks to the contents of dir1 go? Hardlinking to a directory gives you another path to its contents; there's no reason to hardlink the contents as well. – Jim Balter Dec 26 '21 at 14:08
  • @LtWorf "You would just unlink it, not do rm -rf" -- yes, except that you would use rmdir, not unlink, and the rmdir system call would check the link count and not delete the inode if the directory remained in use, so this answer poses a non-problem. – Jim Balter Dec 26 '21 at 14:11
1

This is a good explanation. Regarding "Which one of the multiple parents should .. point to?", one solution would be for a process to maintain its full working-directory path, either as inodes or as a string. Inodes would be more robust, since names can be changed. At least in the olden days, there was an in-core inode for every open file that was incremented whenever the file was opened and decremented when it was closed. When it reached zero, it and the storage it pointed to would be freed up. When the file was no longer open by anybody, it (the in-core copy) would be abandoned. This would keep the path valid if some other process moved a directory into another directory while the subdirectory was in the path of another process. It is similar to how you can delete an open file: it is simply removed from the directory, but stays open for any processes that have it open.

Hard-linking directories used to be freely allowed in Bell Labs UNIX, at least in V6 and V7; I don't know about Berkeley or later. No flag required. Could you make loops? Yes, don't do that. It is very clear what you are doing if you make a loop. Neither should you practice knot-tying around your neck while you are waiting for your turn to skydive out of a plane if you have the other end conveniently hung from a hook on the bulkhead.

What I hoped to do with it today was to hard-link lhome to home so that I could have /home/administ available whether or not /home was covered up with an automount over home, that automount having a symlink named administ to /lhome/administ. This enables me to have an administrative account that works regardless of the state of my primary home file system. This IS an experiment for Linux, but I think I learned at one time, for the UCB-based SunOS, that automounts are done at the ASCII string level. It is hard to see how they could be done otherwise as a layer on top of any arbitrary FS.

I read elsewhere that . and .. are not files in the directory any more, either. I am sure that there are good reasons for all of this, and that much of what we enjoy (such as being able to mount NTFS) is possible because of such things, but some of the elegance of UNIX was in the implementation. It is the benefits such as generality and malleability that this elegance provided that have enabled it to be so robust and to endure for four decades. As we lose the elegant implementations it will eventually become like Windows (I hope I am wrong!). Someone would then create a new OS based on elegant principles. Something to think about. Perhaps I am wrong; I am not (obviously) familiar with the current implementation. It is amazing, though, how applicable 30-year-old understanding is to Linux... most of the time!

I think, though I may be wrong, that . and .. are not hard links in the file system, for modern file systems. However, the file-system driver fakes them. It is these file systems that stop us hard linking directories. For old file systems it was possible (but dangerous). To do what you are trying, look at mount --bind, see also mount --make… and maybe containers. – ctrl-alt-delor Feb 23 '16 at 23:03
  • "Hard-linking directories used to be freely allowed in Bell Labs UNIX, at least V6 and V7" -- only for superusers, who could also unlink directories. This was extremely dangerous and was fatal when UNIX started supporting foreign filesystems. I'm pretty sure both of these were disallowed in PWB. "some of the elegance of UNIX was in the implementation" -- much of it was necessitated by the tiny amount of memory on PDP-11s. Branches in shell scripts were done via a seek--"elegant" but bog slow. – Jim Balter Dec 26 '21 at 14:15
0

From what I gather, the main reason is that it's useful to be able to change directory names without messing up running programs that use their working directory to reference other files. Suppose you were using Wine to run ~/.newwineprefix/drive_c/Program Files/Firefox/Firefox.exe, and you wanted to move the entire prefix to ~/.wine instead. If for some strange reason Firefox was accessing drive_c/windows by referring to ../../windows, renaming ~/.newwineprefix breaks implementations of .. that keep track of the parent directory as a text string instead of an inode.

Storing the inode of a single parent directory must be simpler than trying to keep track of every path as both a text string and a series of inodes.

Another reason is that misbehaving applications might be able to create loops. Behaving applications should be able to check whether the inode of the directory that's being moved is the same as the inode of any of the nested directories it's being moved into, just as you can't move a directory into itself, but this might not be enforced at the filesystem level.
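For comparison, the existing check on ordinary moves looks like this in practice (a hypothetical session with GNU coreutils mv; the exact wording of the message may vary between versions):

$ mkdir -p a/b
$ mv a a/b
mv: cannot move 'a' to a subdirectory of itself, 'a/b/a'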

Yet another reason might be that if you could hardlink directories, you would want to prevent hardlinking a directory you couldn't modify. find has security considerations because it's used to clear files created by other users from temporary directories, which can cause problems if a user switches a real directory for a symlink while find is invoking another command. Being able to hardlink important directories would force an administrator to add extra tests to find to avoid affecting them. (Ok, you already can't do this for files, so this reason is invalid.)

Yet another reason is that storing the parent directory's inode may provide extra redundancy in case of file-system corruption or damage. If you wanted .. to list all parent directories that hardlink to this one, so a different, arbitrary parent could be easily found if the current one is delinked, not only are you violating the idea that hard links are equal, you have to change how the file system stores and uses inodes. Having programs treat paths as a series (unique to each hardlink) of directory inodes would avoid this, but you wouldn't get the redundancy in case of file-system damage.

Misaki
  • 31
  • 3