3

TL;DR: I know a program creates and then deletes files in /tmp. How can I intercept them for examination?

Context:

There's a particular .jar file that I don't trust: for some reason its source code contains an ftm method and is capable of making connections, which is evident from the network-related syscalls in the strace output (and by connection I don't mean Unix domain sockets; it's AF_INET6). I've examined it with Wireshark and saw no outgoing TCP or UDP connections during its use.

However, I still don't quite trust it. From the strace output I can see that it creates temporary files in /tmp and then deletes them. Is there a way to intercept those files to examine their contents?

  • 3
    LD_PRELOAD intercepting unlink before calling java.... https://serverfault.com/questions/75927/blocking-rm-rf-for-application – Rui F Ribeiro Sep 10 '18 at 20:50
  • @RuiFRibeiro Thanks for the link. Tried the suggestion, made unlink.so. No difference between ls /tmp before and after running the command. I'm no expert on shared libraries, or Java, but it seems like unlink.so wasn't used by it, so just a guess, but maybe Java doesn't use unlink(). I'm hoping someone can suggest a more or less universal way, because I want this to work consistently. I don't care how the program in question is written, I just want to see its temp files. – Sergiy Kolodyazhnyy Sep 10 '18 at 21:00
  • does strace show anything being written to said files? (strace may need flags to increase how much it logs) – thrig Sep 10 '18 at 21:01
  • @thrig With strace -f -e open,write,unlink java -jar file.jar input.txt I see there are writes to particular file descriptors. There's openat(AT_FDCWD, "/tmp/imageio1355028222376675525.tmp", O_WRONLY|O_CREAT|O_EXCL, 0600) = 16 , and data written to it appears to be the header of the output png file. So it writes output file to tmp first. I also see another temp file being opened and reopened as fd 4: openat(AT_FDCWD, "/tmp/hsperfdata_xie", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 4 But I don't see any writes to fd 4. Makes no sense to create O_RDONLY file and keep it empty – Sergiy Kolodyazhnyy Sep 10 '18 at 21:12
  • @thrig Actually at closer examination, apparently /tmp/hsperfdata_xie is a directory. What gets unlinked is 14057 unlink("/tmp/hsperfdata_xie/14048") = 0. The file 14048 gets opened as fd 5, and there are writes to it, 8 bytes of \0 but nothing else. I also don't see any child processes inheriting it via dup(). Again, makes no sense to write a file with 8 null bytes – Sergiy Kolodyazhnyy Sep 10 '18 at 21:17
  • no pipes involved? – Rui F Ribeiro Sep 10 '18 at 21:23
  • @RuiFRibeiro Nope. I don't see any corresponding dup() or dup2() call that would duplicate fd 5 onto either the read or the write end of a pipe. There are pipe() calls, but none made by the subprocess that opens the file. The number used as the temp file's name appears to be the PID of the parent of the process that creates the file, though – Sergiy Kolodyazhnyy Sep 10 '18 at 21:31
  • Found a solution, kinda close to what I wanted (i.e., recovering those temp files), although not ideal. Comments? – Sergiy Kolodyazhnyy Sep 10 '18 at 22:05
  • Feel free to VTC as duplicate. Already voted myself. – Sergiy Kolodyazhnyy Sep 10 '18 at 22:25
  • 2
    As to the files rather than your Q, /tmp/hsperfdata_$user/ is 'HotSpot performance data' created automatically by the Sun/Oracle/OpenJDK JVM (which is codenamed HotSpot) with the JVM pid as filename and used by utilities like jps jstat jmap jconsole. See e.g. https://stackoverflow.com/questions/76327/how-can-i-prevent-java-from-creating-hsperfdata-files https://stackoverflow.com/questions/3806758/hsperfdata-uid-folder-not-getting-created – dave_thompson_085 Sep 11 '18 at 01:09
  • @dave_thompson_085 Thank you so much. I suppose this clarifies some of my doubts. – Sergiy Kolodyazhnyy Sep 11 '18 at 01:22

2 Answers

6

Better yet, if you want to reverse engineer a nefarious Java binary, rather than trying to intercept files, decompile the suspect .jar file.

For that, you can use CFR - another java decompiler

CFR will decompile modern Java features - up to and including much of Java 9, but is written entirely in Java 6, so will work anywhere

To use, simply run the specific version jar, with the class name(s) you want to decompile (either as a path to a class file, or as a fully qualified classname on your classpath). (--help to list arguments).

Alternately, to decompile an entire jar, simply provide the jar path, and if you want to emit files (which you probably do!) add --outputdir /tmp/putithere
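As a rough sketch of the whole-jar invocation described above: the wrapper below only assumes that a CFR release jar has been downloaded somewhere and that java is on PATH. The CFR_JAR default, the function name decompile_jar and the file names in the usage lines are placeholders, not anything prescribed by CFR.

```shell
#!/bin/sh
# Decompile a whole jar with CFR into an output directory.
# Assumption: CFR_JAR points at a downloaded CFR release jar
# (./cfr.jar is a placeholder name, not the real release file name).
CFR_JAR=${CFR_JAR:-./cfr.jar}

decompile_jar() {
    local jar="$1" outdir="$2"
    # Refuse to run if the suspect jar is missing or unreadable
    [ -r "$jar" ] || { echo "cannot read $jar" >&2; return 1; }
    mkdir -p -- "$outdir"
    # Jar path plus --outputdir, as described in the CFR notes quoted above
    java -jar "$CFR_JAR" "$jar" --outputdir "$outdir"
}

# Usage (hypothetical file names):
# decompile_jar file.jar /tmp/putithere
# grep -rnE 'java\.net|Socket|URLConnection' /tmp/putithere
```

The commented grep is one possible way to start looking for network-related code in the decompiled sources; the pattern is only a starting point.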

There is no lack of alternatives; however, the CFR project seems to be well maintained, having had an update in 2018.

Disclaimer: I have not done any reverse engineering of Java/JAR binaries since 2005

Rui F Ribeiro
1

Note: improved solution posted at duplicate question

From reading How to access temporary file straight after creation? I got the idea of using inotify and creating a hard link to the file itself. This is of course subject to a race condition, since the file could be unlinked before the hard link is created; however, I did manage to recover the data in the temporary file the application creates. Here's a short pipeline I put together in terminal tab A, with terminal tab B running the actual command:

inotifywait -m -r /tmp/hsperfdata_xie/ 2>&1 |
while IFS= read -r line; do
    # on a CREATE event, hard-link the new file so its data survives the unlink
    awk '$2 == "CREATE"{system("ln /tmp/hsperfdata_xie/"$3" /tmp/BACKUP")}' <<< "$line"
    echo "$line" # unnecessary, only if you want to see what inotifywait reports
done

The 3 disadvantages are:

  • race condition (explained above)
  • I put the awk command together very quickly for one specific file; a more general and flexible awk command that parses inotifywait output and joins the pathname in $1 with the filename in $3 would take a bit of time to parse the lines, sprintf() everything into a variable, and pass it to system(), which may lead back to the previous bullet point - by the time parsing is done, there's no file left to link.
  • it requires two terminal tabs, although one could put the whole pipeline into the background. A smarter way would be a full Python script with forked subprocesses that actually uses inotify Python modules (which may be something I'll do in the future).
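One possible way to shrink the parsing cost from the second point, offered as a sketch rather than a tested replacement for the pipeline above: let inotifywait print the full path itself via --format '%w%f' and restrict it to create events with -e create, so the loop only has to read one path per line. The function name save_links and the numbered backup names are made up for illustration, and the race condition is still there.

```shell
#!/bin/sh
# Hard-link every path read from stdin into a backup directory.
# Assumption: stdin carries one absolute path per line, e.g. from
#   inotifywait -m -r -q -e create --format '%w%f' DIR
save_links() {
    local backup_dir="$1" path n=0
    mkdir -p -- "$backup_dir"
    while IFS= read -r path; do
        n=$((n + 1))
        # Prefix a counter so files with the same name do not collide.
        # ln fails if the file is already gone: the race is not removed.
        ln -- "$path" "$backup_dir/$n.${path##*/}" 2>/dev/null ||
            echo "missed: $path" >&2
    done
}

# Usage (paths taken from the example above):
# inotifywait -m -r -q -e create --format '%w%f' /tmp/hsperfdata_xie/ |
#     save_links /tmp/BACKUP
```

The backup directory has to be on the same filesystem as the watched one for hard links to work, and it should sit outside the watched tree so that the new links do not trigger further events.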

As for the file in question, it appears to be some form of binary data, with recurring 0...sun.rt._sync_Inflations and 0...sun.rt._sync_Deflations strings (which may be related to Java multithreading). But for the purpose of this question, that's irrelevant - we already have it. The only thing I wanted was to obtain the file itself.