There are a lot of specific cases in your request.
1. Files actually outside a Git-managed directory. Your `TheFile` fits this case.
2. Files inside a directory managed by Git, with some `.git` marker. `.git` is not always a directory; it can also be a file containing the path to the real `GIT_DIR`. We can break these files down further:
    1. Known files: those present in the Git index.
    2. Ignored files: those matching a pattern per `gitignore(5)`:
        - `.gitignore`
        - `$HOME/.config/git/ignore`
        - `$GIT_DIR/info/exclude`
    3. Files under an actual `$GIT_DIR` directory, but NOT part of the repo.
        - Files under `.git/hooks` are the most likely.
        - Could also be malware.
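
To make the cases concrete, here is a rough sketch that classifies a single path. The function name `classify_path` and its output labels are made up for illustration; the Git commands themselves (`rev-parse --is-inside-work-tree`, `ls-files --error-unmatch`, `check-ignore -q`) are standard. It assumes a non-bare repository and a file name without pathspec magic characters, and it lumps everything under `$GIT_DIR` into one label rather than solving case 2.3.

```shell
#!/bin/sh
# classify_path FILE
# Hypothetical helper (not part of Git). Prints one of:
#   outside    case 1: no enclosing Git work tree
#   git-dir    inside $GIT_DIR itself (where case 2.3 files live)
#   tracked    case 2.1: present in the index
#   ignored    case 2.2: matches a gitignore(5) pattern
#   untracked  in a work tree, but neither tracked nor ignored
classify_path() {
    cp_dir=$(dirname -- "$1")
    cp_base=$(basename -- "$1")

    # rev-parse fails outside any repository and prints "false"
    # when the directory is inside $GIT_DIR rather than the work tree.
    cp_inside=$(git -C "$cp_dir" rev-parse --is-inside-work-tree 2>/dev/null) || {
        echo outside
        return 0
    }
    if [ "$cp_inside" != true ]; then
        echo git-dir
        return 0
    fi

    if git -C "$cp_dir" ls-files --error-unmatch -- "$cp_base" >/dev/null 2>&1; then
        echo tracked
    elif git -C "$cp_dir" check-ignore -q -- "$cp_base" 2>/dev/null; then
        echo ignored
    else
        echo untracked
    fi
}
```

For example, `classify_path ./TheFile` should print `outside` for the file from the question. Running one Git process per file is slow, so for a whole tree the list-based approach below is the better fit.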
So the most reliable approach is to generate two lists, both relative to your given base directory `$D`: every file under `$D`, and every file Git accounts for. Then compare them (be sure to sort them and remove duplicates beforehand).
I can't think of a reliable way to generate the sub-list for 2.3 above, so I leave that as an open problem (I'd love to know about it, because I've lost hooks before).
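Shell script to list known files per 2.1 above:

    # List tracked files (case 2.1) for every repository under $D.
    find "$D" -name .git | while IFS= read -r g; do
        echo "$g"                   # also record the .git marker itself
        p=${g%/.git}                # work tree that contains the marker
        g2=$(readlink -f "$g")      # absolute path of the marker
        ( cd "$p" && GIT_DIR=$g2 git ls-files --exclude-standard --full-name ) |
            sed "s,^,${p}/,"        # prefix each path with its work tree
    done > list-2.1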
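Shell script to list ignored files per 2.2 above:

    # List ignored files (case 2.2) for every repository under $D.
    find "$D" -name .git | while IFS= read -r g; do
        p=${g%/.git}                # work tree that contains the marker
        g2=$(readlink -f "$g")      # absolute path of the marker
        ( cd "$p" && GIT_DIR=$g2 git ls-files --others -i --exclude-standard ) |
            sed "s,^,${p}/,"        # prefix each path with its work tree
    done > list-2.2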
Shell script to list files per 2.3 above:

    TODO > list-2.3
Shell command (bash, because of the process substitution) to compare the lists and print the files under `$D` that appear in none of the Git lists:

    comm -23 <(find "$D" ! -type d | sort) <(sort -u list-2.1 list-2.2 list-2.3)
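
If the `comm -23` step is unfamiliar, this small sketch shows what it does on toy data. The file names in the lists are invented for illustration, and temporary files stand in for the process substitution so that it runs in plain `sh`. `comm -23` suppresses the lines unique to the second input and the lines common to both, so only lines unique to the first input remain.

```shell
#!/bin/sh
# Toy demonstration of the comparison step; all file names are made up.
LC_ALL=C
export LC_ALL                      # same collation for sort and comm

tmp=$(mktemp -d)

# "Everything under $D", as find would report it.
printf '%s\n' ./a.txt ./notes/todo ./src/main.c ./build/main.o > "$tmp/all"
# Stand-ins for list-2.1 (tracked) and list-2.2 (ignored).
printf '%s\n' ./src/main.c ./a.txt > "$tmp/list-2.1"
printf '%s\n' ./build/main.o ./a.txt > "$tmp/list-2.2"   # ./a.txt duplicated on purpose

# comm needs sorted input; sort -u also drops the duplicate.
sort "$tmp/all" > "$tmp/all.sorted"
sort -u "$tmp/list-2.1" "$tmp/list-2.2" > "$tmp/git.sorted"

# Keep only lines that appear in the first file and not in the second.
unmanaged=$(comm -23 "$tmp/all.sorted" "$tmp/git.sorted")
echo "$unmanaged"

rm -rf "$tmp"
```

Here the only path that Git does not account for is `./notes/todo`, so that is the single line printed.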

- […] `find . | grep -v '\.git'` not efficient enough? – pfnuesel Nov 15 '16 at 14:42
- […] `.git` subdirectory? This is an easier problem than you asked - git has a `--git-dir` option so files can be tracked in repos which are not in the working tree. – icarus Nov 15 '16 at 14:50
- […] `.git` subdirectory and no parent directory has a `.git` either. Meaning the directory is not under version control and also is not ignored or uncommitted. – Raphael Ahrens Nov 15 '16 at 14:58
- […] `--git-dir` connects to my question? – Willem Nov 15 '16 at 15:41
- […] `--git-dir` means that you could have lots of files, say in `~/sub/sub/directory`, but the `.git` file for them could be in `/tmp`. When you were in `~/sub/sub/directory` you would need to give the extra argument every time to git, but it does mean there would be no sign of a `.git` file. Perhaps a slightly more realistic example might be that you have remote read-only access to a network drive hosting a project and want to track its changes in git. – icarus Nov 15 '16 at 15:48