I've seen lots of chatter about Fil-C recently, which pitches itself as a memory-safe implementation of C/C++. You can read the gritty details of how this is achieved, but for people coming across it for the first time, I think there is value in showing a simplified version: once you've understood the simplified version, it is a much smaller mental step to understand the production-quality one.
The real Fil-C has a compiler pass which rewrites LLVM IR, whereas the simplified model is an automated rewrite of C/C++ source code: unsafe code is transformed into safe code. The first rewrite is that within every function, every local variable of pointer type gains an accompanying local variable of AllocationRecord* type, for example:
| Original Source | After Fil-C Transform |
| --- | --- |
| `void f() {` | `void f() {` |
| `T1* p1;` | `T1* p1; AllocationRecord* p1ar = NULL;` |
| `T2* p2;` | `T2* p2; AllocationRecord* p2ar = NULL;` |
| `uint64_t x;` | `uint64_t x;` |
| `...` | `...` |
Where AllocationRecord is something like:
    struct AllocationRecord {
        char* visible_bytes;
        char* invisible_bytes;
        size_t length;
    };
Trivial operations on local variables of pointer type are rewritten to also move around the AllocationRecord*:

| Original Source | After Fil-C Transform |
| --- | --- |
| `p1 = p2;` | `p1 = p2, p1ar = p2ar;` |
| `p1 = p2 + 10;` | `p1 = p2 + 10, p1ar = p2ar;` |
| `p1 = (T1*)x;` | `p1 = (T1*)x, p1ar = NULL;` |
| `x = (uintptr_t)p1;` | `x = (uintptr_t)p1;` |
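To make the rules above concrete, here is a small hand-written sketch that applies the rewritten statements to ordinary C variables and records what happens to the accompanying record. It is only an illustration of the simplified model: `MoveDemo` and `run_move_demo` are hypothetical names invented for this example, and `int` stands in for `T1` and `T2`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same three fields as the AllocationRecord shown earlier. */
typedef struct AllocationRecord {
    char *visible_bytes;
    char *invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical result type, used only to report what the demo observed. */
typedef struct MoveDemo {
    int copy_keeps_record;       /* after p1 = p2      */
    int arithmetic_keeps_record; /* after p1 = p2 + 10 */
    int int_cast_drops_record;   /* after p1 = (T1*)x  */
} MoveDemo;

/* Hypothetical helper: applies the rewritten statements by hand. */
MoveDemo run_move_demo(void) {
    static int storage[16];
    static AllocationRecord rec;
    rec.visible_bytes = (char *)storage;
    rec.invisible_bytes = NULL; /* not needed for this demo */
    rec.length = sizeof storage;

    int *p2 = storage;  AllocationRecord *p2ar = &rec;
    int *p1;            AllocationRecord *p1ar = NULL;
    uintptr_t x;
    MoveDemo d;

    /* Plain copy: the record follows the pointer. */
    p1 = p2, p1ar = p2ar;
    d.copy_keeps_record = (p1 == storage) && (p1ar == &rec);

    /* Pointer arithmetic: the address changes, the record does not. */
    p1 = p2 + 10, p1ar = p2ar;
    d.arithmetic_keeps_record = (p1 == &storage[10]) && (p1ar == &rec);

    /* Pointer to integer: nothing extra is stored alongside x. */
    x = (uintptr_t)p1;

    /* Integer to pointer: same address, but the record is reset to NULL. */
    p1 = (int *)x, p1ar = NULL;
    d.int_cast_drops_record = (p1 == &storage[10]) && (p1ar == NULL);

    return d;
}
```

Calling `run_move_demo()` should report all three flags as non-zero. The last flag is the interesting one: in this model, a pointer that has been through an integer keeps its address but no longer has a record attached.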
When pointers are passed to or returned from functions, the code is rewritten to include the AllocationRecord* as well as the original pointer. Calls to particular standard library functions are additionally rewritten to call Fil-C versions of those functions. Putting this together, we get:
| Original Source | After Fil-C Transform |
| --- | --- |
| `p1 = malloc(x);` | `{p1, p1ar} = filc_malloc(x);` |
| `...` | `...` |
| `free(p1);` | `filc_free(p1, p1ar);` |
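The `{p1, p1ar} = ...` notation is shorthand rather than valid C. One way to picture the same idea in plain C is a small struct that carries both values across the call boundary. The sketch below rewrites a made-up function `skip` in that style; `IntPtrWithRecord`, `skip_rewritten`, and `caller_keeps_record` are hypothetical names chosen for this example, not anything Fil-C itself defines.

```c
#include <assert.h>
#include <stddef.h>

/* Same three fields as the AllocationRecord shown earlier. */
typedef struct AllocationRecord {
    char *visible_bytes;
    char *invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical pair type standing in for the {pointer, record} notation. */
typedef struct IntPtrWithRecord {
    int *ptr;
    AllocationRecord *ar;
} IntPtrWithRecord;

/* Original:  int *skip(int *p, size_t n) { return p + n; }
 * Rewritten: the pointer parameter gains a record parameter, and the
 * returned pointer is paired with the record it was derived from. */
IntPtrWithRecord skip_rewritten(int *p, AllocationRecord *par, size_t n) {
    IntPtrWithRecord result = { p + n, par };
    return result;
}

/* Hypothetical caller. Original body:  int *q = skip(p, 3); */
int caller_keeps_record(void) {
    static int storage[8];
    static AllocationRecord rec;
    rec.visible_bytes = (char *)storage;
    rec.invisible_bytes = NULL; /* not needed for this demo */
    rec.length = sizeof storage;

    int *p = storage;  AllocationRecord *par = &rec;
    int *q;            AllocationRecord *qar = NULL;

    /* Plays the role of: {q, qar} = skip(p, par, 3); */
    IntPtrWithRecord r = skip_rewritten(p, par, 3);
    q = r.ptr, qar = r.ar;

    /* Returns 1 when q points at storage[3] and still carries rec. */
    return q == &storage[3] && qar == &rec;
}
```

The point of the pairing is that the callee never has to guess where a pointer came from: whatever record the caller had for `p` arrives with it, and whatever record belongs to the result travels back.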
The (simplified) implementation of filc_malloc actually performs three distinct allocations rather than just the requested one:
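Going only by the fields of AllocationRecord, a plausible reading is that the three allocations are the record itself, the visible bytes handed back to the caller, and an equally sized block of invisible bytes. The sketch below is a guess along those lines, not Fil-C's actual code: `PtrWithRecord`, `sketch_filc_malloc`, the zero-filling of the invisible bytes, and the failure handling are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same three fields as the AllocationRecord shown earlier. */
typedef struct AllocationRecord {
    char *visible_bytes;
    char *invisible_bytes;
    size_t length;
} AllocationRecord;

/* Hypothetical pair type standing in for the {pointer, record} notation. */
typedef struct PtrWithRecord {
    void *ptr;
    AllocationRecord *ar;
} PtrWithRecord;

/* Hypothetical stand-in for filc_malloc. Returns {NULL, NULL} on failure.
 * Note: for length == 0 the underlying malloc may legally return NULL,
 * which this sketch would report as a failure. */
PtrWithRecord sketch_filc_malloc(size_t length) {
    PtrWithRecord result = { NULL, NULL };

    /* Allocation 1: the record describing the object. */
    AllocationRecord *ar = malloc(sizeof *ar);
    if (ar == NULL) {
        return result;
    }

    /* Allocation 2: the bytes the program actually asked for. */
    ar->visible_bytes = malloc(length);

    /* Allocation 3: a same-sized invisible block (assumed zero-filled). */
    ar->invisible_bytes = calloc(length, 1);
    ar->length = length;

    if (ar->visible_bytes == NULL || ar->invisible_bytes == NULL) {
        free(ar->visible_bytes);   /* free(NULL) is a no-op */
        free(ar->invisible_bytes);
        free(ar);
        return result;
    }

    result.ptr = ar->visible_bytes;
    result.ar = ar;
    return result;
}
```

With this shape, `{p1, p1ar} = filc_malloc(x)` would correspond to reading `ptr` and `ar` out of the returned pair. How the matching release should behave is not covered here, so the sketch deliberately leaves it out.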