
Does NTFS count other link types with the same counter it uses for hardlinks? Or does Linux include other link types when it calculates hardlink count in an NTFS filesystem?

I have an NTFS filesystem on an external hard disk that I use in both Windows and Linux (Fedora).

Some files have hardlink count = 2. Example output from 'ls' and 'stat':

$ ls -li | grep samplefile.jpg 
1002 -rwxrwxrwx. 2 charlie charlie     29496 Apr 18  2019 samplefile.jpg
$ stat samplefile.jpg 
  File: samplefile.jpg
  Size: 29496       Blocks: 64         IO Block: 4096   regular file
Device: 811h/2065d  Inode: 1002        Links: 2

But the filesystem contains no other directory entry referencing the same inode number (file id in NTFS) or the same length, evidenced by 'ls' and 'find' initiated from the filesystem root:

$ ls -Rli . | grep '1002\|29496'
1002 -rwxrwxrwx. 2 charlie charlie     29496 Apr 18  2019 samplefile.jpg
$ find . -samefile  path-to-file/samplefile.jpg 
./path-to-file/samplefile.jpg

As far as I can tell, Canon's DPP4 software (Digital Photo Professional) produces the files with hardlink count = 2 when I modify a file and then use "save as" or "export". Saving a modified file under the same name ("save") does not increase the hardlink count. So DPP4 seems to be storing a reference to the original file, and NTFS (or Linux) is counting that reference as a hardlink. But I don't understand how that ties in with the hardlink count, what NTFS mechanism DPP4 is using, whether Linux (fuse) includes other link types when it calculates the hardlink count, or whether I'm barking up completely the wrong tree.

Q: When I delete samplefile.jpg, either in Windows or Linux, will the file's data fail to delete, because the use count is non-zero? Answer: In Linux, the file's data deletes correctly. (I don't know about Windows.) Evidence: I deleted one of the directory entries (in Linux rm filename), and the used space reduced by exactly the size of the deleted file.

References: In the following question, the answers explain the meaning of the hardlink field in ls -l output, and they point out that it has more than one meaning, depending on the entry type, but no answer explicitly addresses NTFS.

what do ls output fields mean

Here is Microsoft NTFS documentation. Out of date, but the basics should be still valid:

ntfs documentation

  • Are you sure the other file with the same file id is not a hidden file, which ls -lRi is unable to find? – Johan Myréen Aug 23 '20 at 16:45
  • Thank you Johan. I hadn't thought of hidden files. I have changed the command to ls -lRia. It didn't find any other directory entry. To make sure it's working, and I haven't made a silly mistake, I created a hardlink with a hidden name, in Linux. hardlink count increased to 3, and ls -lRia found 2 directory entries. BTW, I am learning more about NTFS reparse points, junctions, symlinks, hardlinks. In a day or two, I may be able to throw more light on what DPP4 is doing. – James Watson Aug 24 '20 at 01:23

1 Answer


NTFS filesystems historically support 8.3 filenames.

When this feature is on, creating a file from Windows works as follows:

  • if a filename fits into the 8.3 pattern, there's only one filename;
  • if it doesn't (like your samplefile.jpg), it gets an additional "hidden" 8.3 name (like SAMPLE~1.JPG), effectively a hardlink.

You can reference the file by either name, on both Windows and Linux, but the 8.3 names are hidden from directory listings. Thus a plain ls doesn't show SAMPLE~1.JPG, but an explicit ls SAMPLE~1.JPG does.

This is why Linux shows the hardlink count as 2. Had the file been named sample.jpg, which fits the 8.3 pattern, it would show 1.
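One way to confirm that the hidden short name really is a second directory entry for the same file is to compare device and inode numbers for both names. The following is only a sketch: same_file is a hypothetical helper, GNU stat is assumed, and SAMPLE~1.JPG is the illustrative short name from above, not necessarily the one Windows generated for your file.

```shell
# same_file A B: succeed if both paths resolve to the same device and inode,
# i.e. they are two names (hardlinks) for one file. Assumes GNU stat.
same_file() {
    [ "$(stat -c '%d:%i' -- "$1")" = "$(stat -c '%d:%i' -- "$2")" ]
}

# Hypothetical usage on the NTFS mount (the short name is an assumption):
# same_file samplefile.jpg 'SAMPLE~1.JPG' && echo "same file, two names"
```

If the helper succeeds for the long name and its short name, the "missing" second link is accounted for.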

Interestingly, if you create a long-named file on this NTFS filesystem from Linux, it doesn't get an 8.3 name. Windows still sees it under its long name.
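You can check this yourself by reading the link count of a freshly created file. A minimal sketch, assuming GNU stat; link_count is a hypothetical helper and the file name in the usage lines is made up for illustration.

```shell
# link_count PATH: print the hardlink count that the kernel reports for PATH.
# Assumes GNU stat (%h is the number of hard links).
link_count() {
    stat -c '%h' -- "$1"
}

# Hypothetical usage on the NTFS mount:
# touch a-long-file-name-created-on-linux.txt
# link_count a-long-file-name-created-on-linux.txt
```

Based on the behaviour described above, a long-named file created from Linux should report 1, while the same kind of file created from Windows with 8.3 names enabled should report 2.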

Finally, not all NTFS filesystems have this feature on; some users prefer to disable it, even if they only use Windows.

Anton K