
If I buy a piece of x86 32-bit or 64-bit software but I don't receive the source code, and I need to modify the software, I'll need to convert the machine code back into a high level language or at least assembly code.

Is there a good utility to go from machine code to C?

I assume that it would attempt to identify whether the program was compiled with a C compiler as opposed to C++ or Objective C or anything else.

Thanks.

Asker
  • 3
    Careful with the licence you get for that software and the law in your country/jurisdiction. – Mat Nov 24 '12 at 10:54
  • @Mat True but I'd only modify software that itself breaks laws e.g. the privacy laws, the 4th Amendment to the US Constitution etc. – Asker Nov 24 '12 at 15:06
  • 3
    That's not a good way of looking at the legal aspect of it. Killing a murderer is a crime in a lot of places. Law enforcement should be called in if you see illegal activities. – Mat Nov 24 '12 at 15:12
  • @Mat The authorities usually side with whoever is most powerful, and it is the powerful who are often the ones mandating that spyware and other forms of malware be put into operating systems and other programs. They are the ones breaking the laws. If I were to tell them that program X from company Y has spyware in it, they would do nothing because they're doing the same thing. – Asker Nov 24 '12 at 23:48
  • 1
    Considering you're looking for a FOSS utility, I would try to find a good FOSS replacement for whatever program you suspect. – Didi Kohen Nov 25 '12 at 15:19
  • @DavidKohen Among the various pieces of software I'd be looking at, there is an ostensibly FOSS operating system, made by a company that may have links to the people who wrote Stuxnet. Needless to say, just because a company says their OS is compiled from FOSS doesn't mean they didn't secretly compile from an alternate form i.e. FOSS+spyware. – Asker Nov 30 '12 at 21:06

2 Answers

1

Wow, some project! But OK, some toys to play with:

(Pass the binary file as the argument to each of these.)

bits:

xxd -b        # `-b` (long form `-bits`) dumps binary digits instead of hex

octal:

od            # octal dump

hexadecimal:

hexdump       # these two share the hexdump(1) man page
hd            # symbolic link to hexdump
od -t x1      # `-t` for type, `x1` for hexadecimal with 1 byte per integer
xxd

strings:

strings

machine code:

objdump -D    # object dump: `-D` is `--disassemble-all`

And, last but not least, file, which reports what kind of file you are looking at.
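As a minimal sketch of how these dump tools are used, the snippet below builds a tiny sample file and dumps it with od. The 4-byte sample (the ELF magic number) and the variable names are assumptions made for illustration; with a real program you would pass its path instead.

```shell
#!/bin/sh
# Build a 4-byte sample file holding the ELF magic number (0x7f 'E' 'L' 'F').
# An octal escape is used because POSIX printf does not guarantee \x escapes.
sample=$(mktemp)
printf '\177ELF' > "$sample"

# Dump as hexadecimal, one byte per value; -An suppresses the offset column.
hex=$(od -An -t x1 "$sample")
echo "$hex"

rm -f "$sample"
```

The other tools in the list take the file the same way, for example `xxd -b "$sample"` or `objdump -D "$sample"`, provided they are installed.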

Be sure to check out this question.

Emanuel Berg
1

Going back from machine code to the source language is called decompilation. Disassembly (going from machine code to assembly language) can be done with objdump -d; objdump is part of the standard binutils suite of development tools. While a decompiler can be a useful tool in the process, decompiling the code with the intent of modifying it and recompiling it is rarely a productive way of modifying the behavior of a program. You will spend a lot of time getting back usable source code, and that source code won't be in any maintainable form.

The first step in understanding how a program works is to use debugging tools. Use strace and ltrace to see which system calls and library calls the program makes. Use a debugger such as GDB to step through the program's instructions.

If you're very lucky, the behavior you seek can be achieved with a configuration file or an environment variable. The next step is hooking into the program and replacing a few functions with your own versions, using LD_PRELOAD to load a library that defines them.
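A rough sketch of the LD_PRELOAD approach follows. It assumes Linux, a dynamically linked target, and a C compiler available as `cc`. The hook (a `geteuid` that always returns 4242), the file names, and the choice of `id -u` as the target program are all made up for illustration.

```shell
#!/bin/sh
# Sketch: replace geteuid() in an unmodified program via LD_PRELOAD.
workdir=$(mktemp -d)

# Hypothetical replacement: always report effective UID 4242.
cat > "$workdir/hook.c" <<'EOF'
#include <sys/types.h>
#include <unistd.h>

uid_t geteuid(void)
{
    return 4242;
}
EOF

if command -v cc >/dev/null 2>&1 &&
   cc -shared -fPIC -o "$workdir/libhook.so" "$workdir/hook.c"; then
    # The dynamic linker loads libhook.so first, so its geteuid() takes
    # precedence over the C library's when `id` looks up the symbol.
    hooked_uid=$(LD_PRELOAD="$workdir/libhook.so" id -u)
else
    hooked_uid="skipped (no C compiler found)"
fi
echo "$hooked_uid"

rm -rf "$workdir"
```

This only works for functions the program calls through the dynamic linker; statically linked programs and setuid binaries ignore it. To wrap the original function rather than replace it outright, the hook can look it up with `dlsym(RTLD_NEXT, ...)` and forward the call.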

Decompilation is usually useful to understand the algorithms used by a program, for example to write another program with a compatible network protocol or file format. It's often not a very useful tool when your goal is to modify the behavior of the program.

  • 1
    This is true. My thought was that if I owned a program that I suspected had spyware in it, then in the best case I'd like to analyze the system calls in it so I could find the spot where something nasty was being done. But I probably don't need to recompile it to fix the problem: if I knew precisely where it was, I could use a binary editor to overwrite the call with NOPs. It would be nice if there were a tool that combined a disassembler and a binary editor, where I could see the asm code on the left and the machine code bytes on the right. – Asker Nov 24 '12 at 15:01
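The NOP-overwrite idea from the comment above can be sketched with nothing more than printf and dd. The 7-byte stand-in file, the offset, and the length are assumptions for illustration. In a real binary, the address objdump shows is a virtual address and has to be translated to a file offset first, and you should work on a copy.

```shell
#!/bin/sh
# Stand-in "binary": push %ebp (55), a near call (e8 + 4-byte displacement), ret (c3).
target=$(mktemp)
printf '\125\350\001\002\003\004\303' > "$target"

offset=1   # hypothetical file offset of the call instruction
length=5   # a near call (e8 plus a 32-bit displacement) is 5 bytes long

# Write `length` NOP bytes (0x90 = octal 220) over the call, in place.
# conv=notrunc stops dd from truncating the rest of the file.
i=0
while [ "$i" -lt "$length" ]; do
    printf '\220'
    i=$((i + 1))
done | dd of="$target" bs=1 seek="$offset" conv=notrunc 2>/dev/null

patched=$(od -An -t x1 "$target")
echo "$patched"

rm -f "$target"
```

The patch must cover the whole instruction. Overwriting only part of it leaves stray displacement bytes that the CPU will decode as different instructions.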