Android Developers Blog
The latest Android and Google Play news for app and game developers.
🔍
Platform Android Studio Google Play Jetpack Kotlin Docs News

13 красавіка 2017

FORTIFY in Android


Link copied to clipboard
Posted by George Burgess, Software Engineer

FORTIFY is an important security feature that's been available in Android since mid-2012. After migrating from GCC to clang as the default C/C++ compiler early last year, we invested a lot of time and effort to ensure that FORTIFY on clang is of comparable quality. To accomplish this, we redesigned how some key FORTIFY features work, which we'll discuss below.

Before we get into some of the details of our new FORTIFY, let's go through a brief overview of what FORTIFY does, and how it's used.

What is FORTIFY?


FORTIFY is a set of extensions to the C standard library that tries to catch the incorrect use of standard functions, such as memset, sprintf, open, and others. It has three primary features:

  • If FORTIFY detects a bad call to a standard library function at compile-time, it won't allow your code to compile until the bug is fixed.
  • If FORTIFY doesn't have enough information, or if the code is definitely safe, FORTIFY compiles away into nothing. This means that FORTIFY has 0 runtime overhead when used in a context where it can't find a bug.
  • Otherwise, FORTIFY adds checks to dynamically determine if the questionable code is buggy. If it detects bugs, FORTIFY will print out some debugging information and abort the program.

Consider the following example, which is a bug that FORTIFY caught in real-world code:

struct Foo {
    int val;
    struct Foo *next;
};
void initFoo(struct Foo *f) {
    memset(&f, 0, sizeof(struct Foo));
}
FORTIFY caught that we erroneously passed &f as the first argument to memset, instead of f. Ordinarily, this kind of bug can be difficult to track down: it manifests as potentially writing 8 bytes extra of 0s into a random part of your stack, and not actually doing anything to *f. So, depending on your compiler optimization settings, how initFoo is used, and your project's testing standards, this could slip by unnoticed for quite a while. With FORTIFY, you get a compile-time error that looks like:

/path/to/file.c: call to unavailable function 'memset': memset called with size bigger than buffer
    memset(&f, 0, sizeof(struct Foo));
    ^~~~~~
For an example of how run-time checks work, consider the following function:

// 2147483648 == pow(2, 31). Use sizeof so we get the nul terminator,
// as well.
#define MAX_INT_STR_SIZE sizeof("2147483648")
struct IntAsStr {
    char asStr[MAX_INT_STR_SIZE];
    int num;
};
void initAsStr(struct IntAsStr *ias) {
    sprintf(ias->asStr, "%d", ias->num);
}
This code works fine for all positive numbers. However, when you pass in an IntAsStr with num <= -1000000, the sprintf will write MAX_INT_STR_SIZE+1 bytes to ias->asStr. Without FORTIFY, this off-by-one error (that ends up clearing one of the bytes in num) may go silently unnoticed. With it, the program prints out a stack trace, a memory map, and will abort with a core dump.

FORTIFY also performs a handful of other checks, such as ensuring calls to open have the proper arguments, but it's primarily used for catching memory-related errors like the ones mentioned above.
However, FORTIFY can't catch every memory-related bug that exists. For example, consider the following code:

__attribute__((noinline)) // Tell the compiler to never inline this function.
inline void intToStr(int i, char *asStr) { sprintf(asStr, “%d”, num); }


char *intToDupedStr(int i) {
    const int MAX_INT_STR_SIZE = sizeof(“2147483648”);
    char buf[MAX_INT_STR_SIZE];
    intToStr(i, buf);
    return strdup(buf);
}
Because FORTIFY determines the size of a buffer based on the buffer's type and—if visible—its allocation site, it can't catch this bug. In this case, FORTIFY gives up because:

  • the pointer is not a type with a pointee size we can determine with confidence because char * can point to a variable amount of bytes
  • FORTIFY can't see where the pointer was allocated, because asStr could point to anything.

If you're wondering why we have a noinline attribute there, it's because FORTIFY may be able to catch this bug if intToStr gets inlined into intToDupedStr. This is because it would let the compiler see that asStr points to the same memory as buf, which is a region of sizeof(buf) bytes of memory.

How FORTIFY works


FORTIFY works by intercepting all direct calls to standard library functions at compile-time, and redirecting those calls to special FORTIFY'ed versions of said library functions. Each library function is composed of parts that emit run-time diagnostics, and—if applicable—parts that emit compile-time diagnostics. Here is a simplified example of the run-time parts of a FORTIFY'ed memset (taken from string.h). An actual FORTIFY implementation may include a few extra optimizations or checks.

_FORTIFY_FUNCTION
inline void *memset(void *dest, int ch, size_t count) {
    size_t dest_size = __builtin_object_size(dest);
    if (dest_size == (size_t)-1)
        return __memset_real(dest, ch, count);
    return __memset_chk(dest, ch, count, dest_size);
}
In this example:

  • _FORTIFY_FUNCTION expands to a handful of compiler-specific attributes to make all direct calls to memset call this special wrapper.
  • __memset_real is used to bypass FORTIFY to call the "regular" memset function.
  • __memset_chk is the special FORTIFY'ed memset. If count > dest_size, __memset_chk aborts the program. Otherwise, it simply calls through to __memset_real.
  • __builtin_object_size is where the magic happens: it's a lot like size sizeof, but instead of telling you the size of a type, it tries to figure out how many bytes exist at the given pointer during compilation. If it fails, it hands back (size_t)-1.

The __builtin_object_size might seem sketchy. After all, how can the compiler figure out how many bytes exist at an unknown pointer? Well... It can't. :) This is why _FORTIFY_FUNCTION requires inlining for all of these functions: inlining the memset call might make an allocation that the pointer points to (e.g. a local variable, result of calling malloc, …) visible. If it does, we can often determine an accurate result for __builtin_object_size.

The compile-time diagnostic bits are heavily centered around __builtin_object_size, as well. Essentially, if your compiler has a way to emit diagnostics if an expression can be proven to be true, then you can add that to the wrapper. This is possible on both GCC and clang with compiler-specific attributes, so adding diagnostics is as simple as tacking on the correct attributes.

Why not Sanitize?


If you're familiar with C/C++ memory checking tools, you may be wondering why FORTIFY is useful when things like clang's AddressSanitizer exist. The sanitizers are excellent for catching and tracking down memory-related errors, and can catch many issues that FORTIFY can't, but we recommend FORTIFY for two reasons:

  • In addition to checking your code for bugs while it's running, FORTIFY can emit compile-time errors for code that's obviously incorrect, whereas the sanitizers only abort your program when a problem occurs. Since it's generally accepted that catching issues as early as possible is good, we'd like to give compile-time errors when we can.
  • FORTIFY is lightweight enough to enable in production. Enabling it on parts of our own code showed a maximum CPU performance degradation of ~1.5% (average 0.1%), virtually no memory overhead, and a very small increase in binary size. On the other hand, sanitizers can slow code down by well over 2x, and often eat up a lot of memory and storage space.

Because of this, we enable FORTIFY in production builds of Android to mitigate the amount of damage that some bugs can cause. In particular, FORTIFY can turn potential remote code execution bugs into bugs that simply abort the broken application. Again, sanitizers are capable of detecting more bugs than FORTIFY, so we absolutely encourage their use in development/debugging builds. But the cost of running them for binaries shipped to users is simply way too high to leave them enabled for production builds.

FORTIFY redesign


FORTIFY's initial implementation used a handful of tricks from the world of C89, with a few GCC-specific attributes and language extensions sprinkled in. Because Clang cannot emulate how GCC works to fully support the original FORTIFY implementation, we redesigned large parts of it to make it as effective as possible on clang. In particular, our clang-style FORTIFY implementation makes use of clang-specific attributes and language extensions, as well as some function overloading (clang will happily apply C++ overloading rules to your C functions if you use its overloadable attribute).

We tested hundreds of millions of lines of code with this new FORTIFY, including all of Android, all of Chrome OS (which needed its own reimplementation of FORTIFY), our internal codebase, and many popular open source projects.

This testing revealed that our approach broke existing code in a variety of exciting ways, like:
template <typename OpenFunc>
bool writeOutputFile(OpenFunc &&openFile, const char *data, size_t len) {}

bool writeOutputFile(const char *data, int len) {
    // Error: Can’t deduce type for the newly-overloaded `open` function.
    return writeOutputFile(&::open, data, len);
}
and
struct Foo { void *(*fn)(void *, const void *, size_t); }
void runFoo(struct Foo f) {
    // Error: Which overload of memcpy do we want to take the address of?
    if (f.fn == memcpy) {
        return;
    }
    // [snip]
}


There was also an open-source project that tried to parse system headers like stdio.h in order to determine what functions it has. Adding the clang FORTIFY bits greatly confused the parser, which caused its build to fail.

Despite these large changes, we saw a fairly low amount of breakage. For example, when compiling Chrome OS, fewer than 2% of our packages saw compile-time errors, all of which were trivial fixes in a couple of files. And while that may be "good enough," it is not ideal, so we refined our approach to further reduce incompatibilities. Some of these iterations even required changing how clang worked, but the clang+LLVM community was very helpful and receptive to our proposed adjustments and additions, such as:


We recently pushed it to AOSP, and starting in Android O, the Android platform will be protected by clang FORTIFY. We're still putting some finishing touches on the NDK, so developers should expect to see our upgraded FORTIFY implementation there in the near future. In addition, as we alluded to above, Chrome OS also has a similar FORTIFY implementation now, and we hope to work with the open-source community in the coming months to get a similar implementation* into glibc, the GNU C library.

* For those who are interested, this will look very different than the Chrome OS patch. Clang recently gained an attribute called diagnose_if, which ends up allowing for a much cleaner FORTIFY implementation than our original approach for glibc, and produces far prettier errors/warnings than we currently can. We expect to have a similar diagnose_if-powered implementation in a later version of Android.