I want to change the path that a program actually opens on the filesystem, but only for certain paths. The reason is that I want to run a program in parallel, but it uses /tmp/somedir/ as its temporary directory, so parallel instances conflict with each other.

I found this great answer that does just that: Is it possible to fake a specific path for a process?. Sadly, while it works for cat as advertised, it does not work for my program. I thought the cause was that the program uses the C++ API.

To reproduce, I first made a very simple program that writes something in a file:

#include <fstream>
#include <string_view>
#include <iostream>

int main()
{
    std::ofstream myfile;
    myfile.open("test.log");
    std::string_view text{"hello world\n"};
    myfile.write(text.data(), text.size());
    return 0;
}

I then used strace and saw this at the end:

brk(NULL)                               = 0x558b5d5e3000
brk(0x558b5d604000)                     = 0x558b5d604000
futex(0x7f94e2e7e77c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
openat(AT_FDCWD, "test.log", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
write(3, "hello world\n", 12)           = 12
close(3)                                = 0
exit_group(0)                           = ?

So it looks like the C API is called, but the function used is openat.

I also saw this line for the C shared library, which will be relevant later:

openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3

So I implemented openat in addition to open; this is the full program. For test purposes, instead of changing a path, I just log it to a file.

/*
 * capture calls to a routine and replace with your code
 * g++ -Wall -O2 -fpic -shared -ldl -lstdc++ -o fake_open_file.so fake_open_file.cpp
 * LD_PRELOAD=/home/myname/fake_open_file.so cat
 */
#define _FCNTL_H 1 /* hack for open() prototype */
#undef _GNU_SOURCE
#define _GNU_SOURCE /* needed to get RTLD_NEXT defined in dlfcn.h */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <dlfcn.h>
#include <mutex>
#include <fstream>
#include <string_view>
#include <iostream>

// for the test, I just log anything that was opened into a new log file
struct open_reporter
{
    open_reporter() = default;

    void report_filename(std::string_view filename)
    {
        std::lock_guard<std::mutex> l{file_report_mutex};
        if (!is_open) {
            myfile.open("/home/myname/fileopen.log");
            is_open = true; // open the log only once
        }
        std::string tmp = std::string{filename} + "\n";
        myfile.write(tmp.data(), tmp.size());
    }

    std::ofstream myfile;
    std::mutex file_report_mutex;
    bool is_open = false;
};

static open_reporter reporter_;

extern "C" {

int open(const char *pathname, int flags, mode_t mode)
{
    static int (*real_open)(const char *pathname, int flags, mode_t mode) = nullptr;

    if (!real_open) {
        real_open = reinterpret_cast<decltype(real_open)>(dlsym(RTLD_NEXT, "open"));
        char *error = dlerror();
        if (error != nullptr) {
            reporter_.report_filename("ERROR OCCURED!");
            reporter_.report_filename(error);
            exit(1);
        }
    }

    reporter_.report_filename(pathname);
    return real_open(pathname, flags, mode);
}

int openat(int dirfd, const char *pathname, int flags, mode_t mode)
{
    static int (*real_openat)(int dirfd, const char *pathname, int flags, mode_t mode) = nullptr;

    if (!real_openat) {
        real_openat = reinterpret_cast<decltype(real_openat)>(dlsym(RTLD_NEXT, "openat"));
        char *error = dlerror();
        if (error != nullptr) {
            reporter_.report_filename("ERROR OCCURED!");
            reporter_.report_filename(error);
            exit(1);
        }
    }

    reporter_.report_filename(pathname);
    return real_openat(dirfd, pathname, flags, mode);
}

}
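The logging in the program above only stands in for the real goal, which is to substitute a different path. A minimal sketch of that rewrite step follows, assuming a hypothetical helper named remap_path and a per-instance directory such as /tmp/instance1 (neither is part of the program above). It is not wired into the open and openat wrappers here.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the original program): if `path` lies
// inside the directory `from`, return the same path relocated under `to`;
// otherwise return `path` unchanged.
std::string remap_path(std::string_view path,
                       std::string_view from,
                       std::string_view to)
{
    assert(!from.empty()); // an empty prefix would match every path

    // Only whole path components count as a match, so "/tmp/somedir"
    // must not match "/tmp/somedir2/file".
    const bool has_prefix = path.substr(0, from.size()) == from;
    const bool at_boundary =
        path.size() == from.size() ||
        (path.size() > from.size() && path[from.size()] == '/');

    if (has_prefix && at_boundary) {
        std::string result{to};
        result.append(path.substr(from.size()));
        return result;
    }
    return std::string{path};
}
```

A wrapper could then pass something like remap_path(pathname, "/tmp/somedir", "/tmp/instance1").c_str() to the real function instead of pathname, provided the wrapper is actually reached.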

The preloaded library still works with cat, but not with my test program. Even if I change open and openat to simply return 0, which breaks cat, it has no effect on my test program. I also checked whether the symbols are exported from my library:

$ nm -gD fake_open_file.so | grep open
0000000000001470 W _ZN13open_reporterD1Ev
0000000000001470 W _ZN13open_reporterD2Ev
0000000000001450 T open
0000000000001460 T openat

I can see that both functions are present. Looking into the C library, I see a difference, but I don't know what it means. I omitted entries that are unrelated to open or openat:

$ nm -gD /lib/x86_64-linux-gnu/libc.so.6 |grep open
0000000000114820 W openat@@GLIBC_2.4
0000000000114820 W openat64@@GLIBC_2.4
0000000000114690 W open@@GLIBC_2.2.5
0000000000114690 W open64@@GLIBC_2.2.5
0000000000114690 W __open@@GLIBC_2.2.5
0000000000114690 W __open64@@GLIBC_2.2.5
00000000001147c0 T __open64_2@@GLIBC_2.7
0000000000119b80 T __open64_nocancel@@GLIBC_PRIVATE
0000000000114660 T __open_2@@GLIBC_2.7
0000000000040800 T __open_catalog@@GLIBC_PRIVATE
0000000000119b80 T __open_nocancel@@GLIBC_PRIVATE
0000000000114950 T __openat64_2@@GLIBC_2.7
00000000001147f0 T __openat_2@@GLIBC_2.7

Apart from the @@GLIBC stuff, these are the same. I have never done this before, so this is as far as my debugging ability goes. I am asking here and not on SO because this is where I got the original answer, and also because this looks more like Linux knowledge than a programming problem; the program itself is very simple.

muru
  • 72,889
Tomáš Zato
  • 1,776
  • "Probably the cause is that the program uses C++ API." no, all C++ runtimes I'm aware of use standard libc calls underneath – Marcus Müller Jan 19 '24 at 14:39
  • @MarcusMüller yes, correct. I should've written "I thought probably...". strace proved this idea wrong, as it logs openat C function. – Tomáš Zato Jan 19 '24 at 14:41
  • @MarcusMüller Turns out my suspicion was correct in the end, check the answer. This is unfortunate, as it makes the whole task much more difficult. – Tomáš Zato Jan 19 '24 at 19:03
  • @TomášZato Have you tried limiting the calls that strace monitors to a small list, say open, open64, stat, write, close? – doneal24 Jan 19 '24 at 19:58

1 Answer

Background: C

Your strace output shows...

brk(NULL)                               = 0x558b5d5e3000
brk(0x558b5d604000)                     = 0x558b5d604000
futex(0x7f94e2e7e77c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
openat(AT_FDCWD, "test.log", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
write(3, "hello world\n", 12)           = 12
close(3)                                = 0
exit_group(0)                           = ?

...but strace traces system calls, not function calls, and function interposition via LD_PRELOAD works on function calls. In a C program, the openat system call is most likely issued from inside libc's open or open64 function. So for example, if I start with something like this:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>

int main()
{
    int fd;
    int nb;

    if (-1 == (fd = open("test.log", O_RDWR|O_CREAT, 0666))) {
        perror("open");
        exit(1);
    }

    if (-1 == (nb = write(fd, "hello world\n", 12))) {
        perror("write");
        exit(1);
    }

    printf("write %d bytes\n", nb);

    return 0;
}

I see in the strace output that this calls:

openat(AT_FDCWD, "test.log", O_RDWR|O_CREAT, 0666) = 3

But if I attempt to override openat with LD_PRELOAD, as you have, I see the same behavior: it doesn't work. However, if I intercept the open call instead:

#define _GNU_SOURCE /* for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> /* mode_t */

int open(const char *pathname, int flags, mode_t mode) {
    static int (*real_open)(const char *, int, mode_t);

    fprintf(stderr, "OPEN PATH: %s\n", pathname);

    /* look up the next definition of open (normally libc's) on first use */
    if (!real_open) {
        real_open = dlsym(RTLD_NEXT, "open");
        char *error = dlerror();
        if (error != NULL) {
            fprintf(stderr, "ERROR OCCURED! %s\n", error);
            exit(1);
        }
    }

    return real_open(pathname, flags, mode);
}

Then it works great:

$ LD_PRELOAD=./fakeopen.so ./c_example
OPEN PATH: test.log
write 12 bytes
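To make the system call versus function call distinction concrete, here is a minimal sketch (Linux only) that issues the openat system call directly instead of calling the libc open or openat functions. The helper name open_readonly_via_syscall is made up for this illustration. strace would still print an openat(...) line for it, yet an LD_PRELOAD library that only defines open and openat is never invoked.

```cpp
#include <cassert>
#include <fcntl.h>        // AT_FDCWD, O_RDONLY
#include <sys/syscall.h>  // SYS_openat
#include <unistd.h>       // syscall(), close()

// Hypothetical helper: open `path` read-only by issuing the openat system
// call through the generic syscall() function. The libc functions named
// open and openat are not called, so an interposed open() or openat()
// from an LD_PRELOAD library is bypassed.
int open_readonly_via_syscall(const char *path)
{
    assert(path != nullptr);
    // syscall() returns -1 and sets errno on failure, like open() does.
    return static_cast<int>(syscall(SYS_openat, AT_FDCWD, path, O_RDONLY));
}
```

Code inside a library that reaches the kernel without going through the exported open or openat symbols behaves the same way from the point of view of LD_PRELOAD.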

More complicated: C++

Things are a little more complicated with C++ code, because when you write...

myfile.open("test.log");

...what exactly is getting called? If we look at the output of LD_DEBUG=symbols ./your_program, we see things like:

    420239:     symbol=_ZNSt14basic_ofstreamIcSt11char_traitsIcEE4openEPKcSt13_Ios_Openmode;  lookup in file=./cc_main [0]
    420239:     symbol=_ZNSt14basic_ofstreamIcSt11char_traitsIcEE4openEPKcSt13_Ios_Openmode;  lookup in file=/lib64/libstdc++.so.6 [0]

So the actual function call is to a mangled name like _ZNSt14basic_ofstreamIcSt11char_traitsIcEE4openEPKcSt13_Ios_Openmode. We can override that just like any other function. If we create wrapper.cc with:

#include <iostream>

extern "C" {
int _ZNSt14basic_ofstreamIcSt11char_traitsIcEE4openEPKcSt13_Ios_Openmode()
{
    std::cerr << "This is the wrapped open method\n";
    return 0;
}
}

And compile this to wrapper.so:

g++ -shared -fPIC -o wrapper.so wrapper.cc

Then we can use it with your simple program:

LD_PRELOAD=./wrapper.so ./your_program

And get as output:

$ LD_PRELOAD=./wrapper.so ./your_program
This is the wrapped open method

So there, we have successfully wrapped the open method! To get it to successfully call the real method, you would need to figure out what the function signature for _ZNSt14basic_ofstreamIcSt11char_traitsIcEE4openEPKcSt13_Ios_Openmode actually looks like. I'm not a C++ person, and I can't answer that question, but hopefully this helps you make forward progress.
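As a starting point for working out that signature, the mangled name can be turned back into a readable declaration. The sketch below assumes GCC or Clang, whose cxxabi.h header provides abi::__cxa_demangle. The helper name demangle is made up for this illustration.

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <string>

// Hypothetical helper: return the human-readable form of a mangled symbol
// name, or an empty string if the name cannot be demangled.
std::string demangle(const char *mangled)
{
    assert(mangled != nullptr);
    int status = 0;
    char *readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return {};
    }
    std::string result{readable};
    std::free(readable); // the buffer was allocated with malloc
    return result;
}
```

For the symbol above, the result names the open member of std::basic_ofstream<char, std::char_traits<char>>, taking a char const* and a std::_Ios_Openmode. In standard C++ that member returns void, and because it is a non-static member function it also receives an implicit this pointer, which any C-style prototype would have to account for in an ABI-specific way.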

Misc notes

It may be possible to perform the function overriding in C++ itself, which would allow you to use regular function names rather than mangled names (and you wouldn't need to guess what the C prototype looks like for a C++ function).

That is discussed a little bit in this question on stackoverflow.

larsks
  • 34,737
  • So... in the end I was right when I suspected that the C++ API is a different call altogether. Thanks a lot for the analysis. I'll see if I can get it to work the C++ way, but if not I might just use the mangled names that I find in the real binary I need to spoof folders for. – Tomáš Zato Jan 19 '24 at 19:01