How to generate Ghidra fidb files

While working on one of the crackmes, I discovered that it uses ncurses to “draw” things on the command line. To avoid digging into the library internals while reversing, the obvious move is to use the FLIRT signatures that ship with IDA, or to generate your own. That works if you have a full copy with all the bells and whistles, which is not my case, unfortunately. Even if you are ready to create the signatures yourself, the necessary utilities are not included in IDA Free. It’s sad, of course, but I still didn’t want to reverse engineer code that I didn’t need. So I had to switch to Ghidra.

Honestly, I don’t like Ghidra, but sometimes you need to use whatever tool you have if it helps. Ghidra certainly helps in this case, simply because it lets you generate signatures when needed (the feature is called Function ID there). The only problem is that there are no dedicated command line tools for this, at least none that I know of. Using a GUI for such a task is, in my opinion, overkill.

Luckily, there is an abandoned project called ghidra-fid-generator that does exactly what I wanted: it generates a fidb file from a deb package. Even though it works, there are a few things about it that I don’t like, e.g.

  • It does not work with the latest versions of Ghidra.
  • FidDB uses what are called “common symbols” when creating the database. The goal is to tune the matching algorithm so that if a match is found but the function is marked as “very common”, it does not use the parent/child relationship as a distinguisher. In other words, a “very common” function’s weight in the overall score will not be taken into account if it is a parent/child of another node (more information about this can be found in the Ghidra documentation). The problem with the original ghidra-fid-generator is that it marks all symbols from the library it processes as common, which affects the algorithm.
  • It ships its own copy of the CreateMultipleLibraries.java script. The upstream script isn’t updated very often, but the copy still has to be synced with it from time to time, just in case.
  • It doesn’t parse Debian package names correctly, which leads to wrong metadata in the fidb file.
  • It uses 7zip, which is mostly not supported on Linux.
  • It doesn’t work on macOS.

and many other little things that bother me too. So I just decided to fork the idea and completely rewrite it.
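To illustrate the package name issue from the list above, here is a minimal sketch of how a Debian package file name could be split into parts. It assumes the usual name_version_arch.deb layout; the function and field names are hypothetical and are not taken from either version of ghidra-fid-generator. How the parts end up in the fidb metadata (for example, as library name, version and variant) is also an assumption.

```python
from typing import NamedTuple
from urllib.parse import unquote


class DebName(NamedTuple):
    """Parts of a Debian package file name (hypothetical helper type)."""
    package: str
    version: str   # upstream version, without epoch and revision
    revision: str  # Debian revision, empty if there is none
    arch: str


def parse_deb_filename(filename: str) -> DebName:
    """Split 'name_version_arch.deb' into its components."""
    base = filename.rsplit("/", 1)[-1]
    if not base.endswith(".deb"):
        raise ValueError(f"not a .deb file name: {filename!r}")
    parts = base[: -len(".deb")].split("_")
    if len(parts) != 3:
        raise ValueError(f"expected name_version_arch.deb, got {filename!r}")
    package, full_version, arch = parts

    # An epoch such as '1:' is often URL-encoded as '1%3a' in file names.
    full_version = unquote(full_version)
    if ":" in full_version:
        full_version = full_version.split(":", 1)[1]

    # The Debian revision is whatever follows the last hyphen, if any.
    if "-" in full_version:
        upstream, revision = full_version.rsplit("-", 1)
    else:
        upstream, revision = full_version, ""
    return DebName(package, upstream, revision, arch)
```

For a file named libncurses6_6.2-0ubuntu2_amd64.deb this yields the package libncurses6, version 6.2, revision 0ubuntu2 and architecture amd64.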

I’ve fixed almost all the problems I mentioned, except for the “common symbols”: to use them correctly, you first need to generate fidb files for the group of standard *nix libraries you care about (glibc, libstdc++, etc.) so that these symbols can be identified. In addition, you need to prepare a separate RemoveFunctions.java script, similar to the one the Ghidra team has for the MSVC compiler. So for my tiny crackme task, I decided to skip this for now, given that everything works just fine without it.
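As a rough idea of what finding the common symbols could look like, here is a small sketch. It assumes that the function names of each library have already been extracted, and it treats a symbol as “common” when it shows up in at least a given number of libraries. That threshold heuristic is my own stand-in, not the criterion Ghidra uses, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def find_common_symbols(
    symbols_by_library: Dict[str, Iterable[str]],
    min_libraries: int = 2,
) -> List[str]:
    """Return symbols that occur in at least `min_libraries` libraries."""
    counts: Counter = Counter()
    for symbols in symbols_by_library.values():
        # Count each symbol once per library, even if it is listed twice.
        counts.update(set(symbols))
    return sorted(name for name, n in counts.items() if n >= min_libraries)
```

The resulting list could then be written to a plain text file, one symbol per line, and passed to the library creation script as the common symbols file. Check the exact format that script expects before relying on this.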

In any case, I will be happy to improve these scripts if anyone other than me needs them. It’s on GitHub, so feel free to report an issue or submit a PR here: ghidra-fid-generator.