I have a process running on one computer that spawns simulations by writing the simulation data to directory pre/id. Worker processes then copy a simulation from pre to a local disk, which can be on a different computer. pre is in a volume mounted with NFS. This part works well.

When a simulation is done, the results are moved to the directory result/id, which is what is causing trouble. The supervising process can decide to keep such a directory or to delete it. Occasionally, when it tries to delete result/id, the move operation seems to be incomplete, and removing the directory fails.

Everything runs on a variety of Linux flavors. The workers move directories around using mv and then touch result/id/done to signal to the supervising process that the result can be read (and deleted). The supervising process uses boost::filesystem::remove_all to delete result/id.

How can I reliably wait for the move operation to complete before attempting to delete the directory?


Added: This code moves the result directory to where the supervising process waits for it:

mv "$tempDir" "$finishedCasesDir" # move case to result directory
touch "$finishedCasesDir/$caseName/done"

This is the C++ code that waits for done to appear:

if(is_regular_file(resultPath/"done"))
{
  // get relevant result data
  ...
  // remove result directory
  remove_all(resultPath);
}

And the error:

terminate called after throwing an instance of 'boost::filesystem3::filesystem_error'
what():  boost::filesystem::remove: Directory not empty: "results/711a35ed-818e-4084-ab43-47531fdd8d11"
Christoph
  • Is it a C++ problem or a scripting problem? C++ would fit better on StackOverflow. – Kiwy Feb 13 '14 at 11:00
  • It's a problem that occurs somewhere between my script moving a directory and my C++ program trying to delete that directory after it has been moved. I thought about posting at StackOverflow, but I thought it might be a good idea to do it here as I might be able to solve the problem using Linux or Unix features. – Christoph Feb 13 '14 at 11:37
  • You say the mover process creates result/id/done to signal that the move is complete. If done correctly, it should be enough for the supervising process to check for this file's existence. So why isn't this enough? – alexis Feb 13 '14 at 11:56
  • @alexis your last question is exactly the one I cannot answer. I'll add the relevant code (script and C++) and error message to my question. – Christoph Feb 13 '14 at 11:59
  • Are you absolutely sure that done does not exist in the directory prior to being moved? – alexis Feb 13 '14 at 12:13
  • Yes. Simulations are created using a template directory, and that doesn't contain a file called done. When a simulation is run, a possibly existing done file is removed before the simulation starts. – Christoph Feb 13 '14 at 12:17
  • If I understand your C++ code correctly, you're trying to delete the moved directory (that you called result/id) and not the original one (that you called pre/id). This moved directory is definitely not empty. I don't know the Boost libs, but you probably need to erase the files in the directory before deleting it, or use another Boost function that acts like 'rm -Rf'. – DavAlPi Feb 13 '14 at 13:08
  • remove_all removes recursively. Also the code only fails occasionally - I can delete my result directories 15k times without any problems, and then it suddenly fails. That's why I concluded that there's something still being written into the directory by mv. – Christoph Feb 13 '14 at 13:17
  • If the mover script really only creates done after mv exits, it is not the source of the problem. Either a different process sometimes writes in the same directory (could you have id collisions?), or remove_all can fail to find and remove all files before removing the directory. What's left in the directory when you encounter a failure? – alexis Feb 13 '14 at 13:39
  • Sounds like several CPUs/processes are involved? And you have a race condition that occurs only occasionally. Your locking system depends on touch being atomic, but it isn't. Suppose the touch was interrupted after creating the output but before closing; the supervisor comes in quickly and removes the file, then tries to remove the directory but can't, because the file is not actually removed until touch wakes up and closes the fd. Just guessing, obviously. – X Tian Feb 13 '14 at 13:48
  • Yes, several CPUs, even different machines. So I need some atomic replacement for touch done that works over nfs, or any other way of signalling to another process on another machine that it can harvest a result. – Christoph Feb 13 '14 at 13:58
  • Otherwise, code around it: accept the error, back off a bit and retry, and log the event so you can monitor the situation and develop a better approach when needed. – X Tian Feb 13 '14 at 14:31
  • I'll just remove_all in a loop now until the error disappears - a bit brutal but it should have the desired effect without causing any harm. If I leave data in the directory, that could quickly add up to a few TB. Not good. – Christoph Feb 13 '14 at 14:40
  • @XTian, "touch is not atomic?" I don't buy this scenario. On Unix, holding an open file descriptor does not lock the directory entry; you can unlink away. The inode and blocks won't be freed until the descriptor is closed, but that doesn't get in the way of unlinking the directory. – alexis Feb 14 '14 at 00:05
  • @alexis You are correct within a file system, I have added an edit to my answer to help clarify. – X Tian Feb 14 '14 at 14:11
  • @Christoph, what's left in the directory when you encounter a failure? – alexis Feb 14 '14 at 19:02
  • @XTian, sorry but your answer did not clarify. You just state that you found support for your reservations about NFS. What are these problems you read about? – alexis Feb 14 '14 at 19:04
  • @alexis I have not yet looked for .nfs files (because I was not aware of them and didn't look for hidden files), but the directories were left apparently empty (which was very puzzling at first), or with varying leftovers from my simulations. – Christoph Feb 14 '14 at 20:39

2 Answers

Have you come across the flock command?

It provides a file lock within the filesystem, so it can be used in shell scripts.

--- edit

After my initial answer above, the original post was edited further. I added comments, ending with the suggestion of a race condition across several machines using NFS, and devised a scenario. @alexis challenged this scenario, and I thought that deserved a reply.

@alexis you are correct when working within one filesystem, but the situation becomes more complicated when NFS-mounted file systems are involved.

It is unclear from the OP exactly what mix of machines, servers, clients and NFS versions is involved, but I thought it was enough to say "you need a better syncing mechanism than touch and rm". Indeed, the OP suggests it sort of works but has a 1 in 15k chance of failure. So I suggested either finding a better way to sync or coding around it.

After a little investigation on the subject, I have found some references showing "flaws" in NFS which mean that removing a file does not work as expected across NFS. Moreover, there are differences between NFS v3 and v4, specifically to address this flaw. NFS v4 could also work differently, but doesn't, or else it would break compatibility with older clients.

This NFS document summarises the situation: it describes the silly rename that was introduced to code around the problem. RFC 5661 (NFS 4.1) provides further detail.

--- edit 2

Extract of one paragraph from above references :

Because of the design of the NFS protocol, there is no way for a file to be deleted from the name space but still remain in use by an application. Thus NFS clients have to emulate this using what already exists in the protocol. If an open file is unlinked, an NFS client renames it to a special name that looks like ".nfsXXXXX". This "hides" the file while it remains in use. This is known as a "silly rename." Note that NFS servers have nothing to do with this behavior.
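
One way to "code around" the silly rename, along the lines of the back-off-and-retry idea from the comments, is to retry the removal a few times before giving up. This is only a minimal sketch, not the OP's actual code: it assumes C++17 std::filesystem as a stand-in for boost::filesystem, and the helper name remove_all_with_retry, the attempt count and the delays are all hypothetical.

```cpp
#include <chrono>
#include <filesystem>
#include <system_error>
#include <thread>

namespace fs = std::filesystem;

// Hypothetical helper: remove `dir` recursively, retrying with a growing
// delay when removal fails (for example with "Directory not empty" while
// an NFS client still keeps a .nfsXXXX placeholder in the directory).
// Returns true once remove_all reports no error, false if every attempt failed.
bool remove_all_with_retry(const fs::path& dir,
                           int max_attempts = 5,
                           std::chrono::milliseconds delay = std::chrono::milliseconds(100))
{
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        std::error_code ec;
        fs::remove_all(dir, ec);   // non-throwing overload; ec is set on failure
        if (!ec) {
            return true;           // directory is gone (or never existed)
        }
        if (attempt < max_attempts) {
            std::this_thread::sleep_for(delay);
            delay *= 2;            // back off a little more each time
        }
    }
    return false;                  // caller should log this and decide what to do
}
```

Returning false instead of throwing lets the supervising process log the event and carry on, rather than terminating on the first "Directory not empty" error.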

X Tian
  • I did but I read that it doesn't work on nfs mounts. Also, how would it guarantee that mv is actually complete before returning? – Christoph Feb 13 '14 at 11:38
  • I don't think there's any reason to put the lock file on that file system ? – X Tian Feb 13 '14 at 11:52
  • What would I use a lock file for in this case? I don't think it would help solving my problem. – Christoph Feb 13 '14 at 12:33
  • I think the discussion we had (we probably should have moved this to chat) was very useful, so how can we conclude this? I could add as an answer the code I am now using. – Christoph Feb 14 '14 at 14:45
  • You could add your code as an answer to your own question. I think others would find this useful. Thanks for the feedback. – X Tian Feb 14 '14 at 15:38
  • "removing a file does not work as expected across nfs." That's a promising teaser, but in what way? All you've been saying is "don't trust NFS." What does NFS do in this case that will interfere with a recursive rm? What exactly does the reference say that's relevant? – alexis Feb 14 '14 at 19:00

Idea #1 - alternative approach?

Rather than touching the file right away, what if you explicitly waited for the mv process to complete first?

$ mv $tempDir $finishedCasesDir & # move case to result directory
$ wait %1 && touch $finishedCasesDir/$caseName/done

This will only touch the file once the mv process has terminated successfully.

Example

Here's an example using the sleep command as a stand-in for your mv command.

start time

$ date
Thu Feb 13 21:23:33 EST 2014

start simulated "mv" command

$ sleep 10 &
[1] 28561

we then "wait" for it to finish

$ wait %1 && echo 'all done!'
[1]+  Done                    sleep 10
all done!

confirming that we were waiting for ~10 secs.

$ date
Thu Feb 13 21:23:48 EST 2014

continue

$ ...boost program can then run...

Idea #2 - NFS issue?

Until I saw the feedback from @Gilles, I hadn't realized you were working with these files over NFS. I believe Gilles is 100% correct. I too have encountered similar issues when working with files over NFS, where a process may still have access to an NFS-mounted directory that you're attempting to delete. When this happens, NFS will typically create a .nfsXXXX file in the directory, which will foil your Boost application's attempts to delete the directory, since it's effectively not empty.

NOTE: Having a shell whose CWD (Current Working Directory) is a sub-directory within this NFS mount is enough to cause this issue.

You can read more about this issue in this article, titled: What is this .nfs file and why can I not remove it?

excerpt from above article

% echo test> foo
% tail -f foo
test
^Z
Suspended
% rm foo
% ls -A
.nfsB23D
% rm .nfsB23D
% ls -A
.nfsC23D
% lsof .nfsC23D
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
tail    1257 robh    0r  VREG  176,6        5 3000753 .nfsC23D
%

Notice: you can use the tool lsof to determine what process is maintaining a file descriptor.
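
If the supervising process itself should report such leftovers when a removal fails, a scan for names starting with ".nfs" might look like the sketch below. It is only an illustration under assumptions: it uses C++17 std::filesystem rather than boost::filesystem, the function name find_nfs_placeholders is hypothetical, and it matches purely on the name prefix.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: collect every entry below `dir` whose name starts
// with ".nfs", i.e. looks like a silly-rename placeholder. Entries are
// matched by name only; nothing is opened or deleted.
std::vector<fs::path> find_nfs_placeholders(const fs::path& dir)
{
    std::vector<fs::path> found;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        dir, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;

    // Stop quietly on any iteration error (e.g. the directory vanished).
    while (!ec && it != end) {
        const std::string name = it->path().filename().string();
        if (name.rfind(".nfs", 0) == 0) {   // name begins with ".nfs"
            found.push_back(it->path());
        }
        it.increment(ec);
    }
    return found;
}
```

The returned paths could then be logged, or passed to lsof by hand, to see which process still holds the file open.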

slm
  • I'm currently waiting for mv to complete without the trailing &. Does mv & plus wait %1 really change the situation as far as the filesystem is concerned? – Christoph Feb 14 '14 at 08:25
  • @Christoph - based on your description that your Boost-based C++ app is not able to delete the directory because the mv has apparently not finished emptying the directory out beforehand, I would say try it and see. – slm Feb 14 '14 at 09:06
  • Same exception thrown by remove_all. That doesn't mean that my previous workaround really works better (that might have been luck). – Christoph Feb 14 '14 at 09:35
  • I'm pretty sure this wouldn't help, in fact it has more of a chance of failing. The original code would work if it wasn't for NFS: logically, done is only created after mv completes. I suspect that there are .nfsXXX files left behind on the server which aren't deleted yet; done is created locally so it's seen locally without a round trip to the server. – Gilles 'SO- stop being evil' Feb 14 '14 at 15:45
  • @Gilles - thanks I think you're dead on that the issue's NFS, I've added that to my A as well. – slm Feb 14 '14 at 16:12
  • I'll try a different approach when I'm back in office (monday), post my final solution (which might be different to what I have now) and see if I can mark an answer as accepted. All comments have been helpful and gave me some insight - thank you all! – Christoph Feb 14 '14 at 18:34
  • -1 Backgrounding and waiting is a roundabout way of just waiting in the first place. – alexis Feb 14 '14 at 19:01