8

I feel like every tutorial I read on resource management in server settings starts by asserting that file handles are a scarce resource, and we should therefore aim to keep the list of open files to a minimum.

But I don't really understand why they are a scarce resource. Isn't a file descriptor just a number? How much state does Linux, say, have to track per open file? Is the number of open files limited due to memory issues or something completely different?

beta
  • 203
  • Not an answer, just a link showing how to check the open fd limit for a given process: http://stackoverflow.com/questions/1356675/check-the-open-fd-limit-for-a-given-process-in-linux – orion Mar 05 '14 at 09:54
  • In my opinion it's a matter of human understanding: the day you need to debug or audit security, how is a human supposed to make sense of 150,000 open files? If you keep the set of open files under control at all times, you can easily tell when someone is using your server in a way it should not be used – Kiwy Mar 05 '14 at 14:33
  • 1
    Related question: http://unix.stackexchange.com/questions/36841/why-is-number-of-open-files-limited-in-linux – Benubird Apr 07 '14 at 10:34

3 Answers

3

Each open file is backed by a structure in kernel memory that holds a reference to the file's in-memory inode.
That structure also records the mode the file was opened with, the current offset into the file, and cache state for the file's data.
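
To make that concrete (a minimal sketch, not the kernel's actual layout; example.txt is just a placeholder for any readable file): each open() creates its own open file description, so two descriptors for the same file keep independent offsets.

    /* Sketch: two open() calls on the same file get independent offsets,
     * because each open() allocates its own open file description in the kernel.
     * Assumes a readable file at the placeholder path "example.txt". */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        int a = open("example.txt", O_RDONLY);
        int b = open("example.txt", O_RDONLY);
        if (a < 0 || b < 0) {
            perror("open");
            return 1;
        }

        char buf[4];
        if (read(a, buf, sizeof buf) < 0)     /* advances only a's offset */
            perror("read");

        printf("offset via fd a: %ld\n", (long)lseek(a, 0, SEEK_CUR));
        printf("offset via fd b: %ld\n", (long)lseek(b, 0, SEEK_CUR)); /* still 0 */

        close(a);
        close(b);
        return 0;
    }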

AFAIK, on most UNIX/Linux systems that structure cannot be swapped out, and since storage is usually an order of magnitude or two larger than memory, it is virtually impossible to have any significant fraction of the files on storage open at the same time.
In addition, not all platforms run an active sync daemon, so written data may sometimes be kept only in the cache until a sync or close is performed.
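
As a rough illustration of the caching point: data you write() may sit only in the kernel's cache until it is flushed, and fsync() asks for it to be pushed out to the device. The path data.tmp is just a placeholder.

    /* Sketch: forcing cached writes out to storage before relying on them.
     * write() may leave the data only in the kernel's page cache; fsync()
     * requests that it be flushed to the underlying device. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>

    int main(void)
    {
        int fd = open("data.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        const char msg[] = "important record\n";
        if (write(fd, msg, strlen(msg)) < 0)
            perror("write");

        if (fsync(fd) < 0)   /* flush cached data (and metadata) for this file */
            perror("fsync");

        close(fd);
        return 0;
    }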

In addition, a lot of file handles are in use without you, as a developer, being aware of them, such as those for dynamically loaded libraries and for the program file or interpreter binary itself.
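
One way to see descriptors you never opened explicitly is to list /proc/self/fd (Linux-specific); a rough sketch:

    /* Sketch: enumerate this process's open descriptors via /proc/self/fd.
     * Even a trivial program already holds 0, 1 and 2 (stdin/stdout/stderr),
     * plus the descriptor used to read the directory itself. */
    #include <stdio.h>
    #include <dirent.h>

    int main(void)
    {
        DIR *d = opendir("/proc/self/fd");
        if (!d) {
            perror("opendir");
            return 1;
        }

        struct dirent *e;
        while ((e = readdir(d)) != NULL) {
            if (e->d_name[0] == '.')
                continue;              /* skip "." and ".." */
            printf("open fd: %s\n", e->d_name);
        }

        closedir(d);
        return 0;
    }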

Didi Kohen
  • 1,841
2

Some of it is just because, like you said, it's "just a number" -- if you ever want to use a higher number than fits into (let's say) an integer datatype, then you must use a larger datatype for all descriptors, e.g. 2 bytes per descriptor ID rather than just 1 byte -- and doubling the size of all your descriptors will soon bleed memory, memory which could be better used for applications than for the OS itself.

There is also a lot of other information associated with the descriptors, as well as the need to keep lists of which are in use and which are free -- and there are limits to how large these data structures can grow while remaining efficient.

2

According to this tutorial, you can get the maximum number of open file handles by running cat /proc/sys/fs/file-max. On my system, I get a value of 797736, which is pretty big. A quick ps -e|wc -l tells me I have about 200 processes running, which means each process could open around 4000 file handles on average before hitting the system-wide limit.
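
As a small illustration (assuming a Linux system), /proc/sys/fs/file-nr reports how many handles are currently allocated alongside that same system-wide maximum:

    /* Sketch: read the system-wide file handle statistics from
     * /proc/sys/fs/file-nr. The three fields are: allocated handles,
     * allocated-but-unused handles, and the maximum (same value as file-max). */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/sys/fs/file-nr", "r");
        if (!f) {
            perror("fopen");
            return 1;
        }

        long allocated, unused, max;
        if (fscanf(f, "%ld %ld %ld", &allocated, &unused, &max) == 3)
            printf("allocated=%ld unused=%ld max=%ld\n", allocated, unused, max);

        fclose(f);
        return 0;
    }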

However, that is a global value - you can get more specific by using ulimit. ulimit -a reports that I can have a maximum of 1024 file handles open, which is still a pretty big number, although nothing compared to the absolute max. But this can be increased if you need it to be, and so isn't really a hard limit.
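
The same soft and hard limits that ulimit shows can be read, and the soft limit raised up to the hard limit, from within a program via getrlimit()/setrlimit(); a rough sketch:

    /* Sketch: query and raise the per-process open-file limit (what ulimit -n
     * reports). An unprivileged process may raise its soft limit only as far
     * as its hard limit. */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;
        if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("soft limit: %llu, hard limit: %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);

        rl.rlim_cur = rl.rlim_max;   /* raise the soft limit to the hard limit */
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            perror("setrlimit");

        return 0;
    }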

So, my conclusion? File handles are not a scarce resource. The tutorials just want to make sure you only keep files open while you need them, because leaving file handles open can interfere with other processes that might also be trying to work with those files, particularly if you lock them when you open them.
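
To illustrate the locking point (a hedged sketch using advisory flock() locks; shared.lock is a placeholder path): a descriptor that stays open while holding a lock keeps every other process waiting on that lock.

    /* Sketch: run two copies of this against the same file. The first takes an
     * exclusive advisory lock and then "forgets" to close the descriptor; the
     * second immediately reports that the file is already locked. */
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/file.h>

    int main(void)
    {
        int fd = open("shared.lock", O_RDWR | O_CREAT, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
            printf("already locked by another process\n");
            return 1;
        }

        printf("holding the lock; sleeping...\n");
        sleep(30);                /* simulate a file handle left open */

        flock(fd, LOCK_UN);
        close(fd);
        return 0;
    }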

Benubird
  • 5,912