Type-based vs Value-based Reflection
Frequently, whenever the topic of Reflection comes up, I see a lot of complaints specifically about the new syntax being added to support Reflection in C++26. I’ve always thought of that as being largely driven by unfamiliarity: this syntax is new, unfamiliar, and thus bad. I thought I’d take a different tack in this post: let’s take a problem that can only be solved with Reflection and compare what the solution would look like between:
the C++26 value-based model
the Reflection Technical Specification (TS)’s type-based model
Don’t worry if you’re not familiar with the Reflection TS, I’ll go over it in some detail shortly.
But first, today’s problem. C++20 introduced the concept of a structural type. These are the kinds of types that you can use as constant template parameters (formerly known as non-type template parameters). The definition of structural type is:
A structural type is one of the following:
a scalar type, or
an lvalue reference type, or
a literal class type with the following properties:
all base classes and non-static data members are public and non-mutable and
the types of all base classes and non-static data members are structural types or (possibly multidimensional) arrays thereof.
There is no trait for this in the standard library today. How would we write one? Without reflection, this isn’t implementable. The first two bullets are easy, but even the most clever Boost.PFR tricks don’t do anything to help with the third. Let’s see how it’s done.
The Reflection TS
The Reflection TS (whose draft you can find here) was published in March, 2020. It came from the work done by Matúš Chochlík, Axel Naumann, and David Sankel in P0194.
The design was a type-based model. It introduced a new operator, reflexpr(E) , which gave you a unique type. What I mean by unique is that reflexpr(A) and reflexpr(B) are the same type if and only if A and B are the same entity.
That is the only new part of the language, which only yields types. The library side includes a bunch of template metafunctions to use for queries. For instance, the first example in the paper is, of course, enum-to-string:
    enum E { first, second };
    using E_m = reflexpr(E);
    using namespace std::experimental::reflect;
    using first_m = get_element_t<0, get_enumerators_t<E_m>>;
    std::cout << get_name_v<first_m> << std::endl; // prints "first"
This example also demonstrates the other important concept to point out in the Reflection TS: what is get_enumerators_t ? In TS terms, that is called an object sequence (whereas E_m is just an object). An object sequence is basically a typelist of objects — except that they’re not strictly specified as such (and in the one implementation of the TS that I’m aware of, they’re not implemented as such either). Instead, the TS came with other metafunctions to manipulate them.
That’s basically the design in a nutshell:
reflexpr(E) gives you a unique type representing properties of E
the library comes with a lot of queries on reflection types
some of those queries return values (like get_name_v ), some return types — which can be reflection types (like get_element_t ), and some return object sequences (like get_enumerators_t ).
On the whole, the above should be fairly familiar. It’s regular template metaprogramming. It’s simple.
Implementing the Reflection TS
It occurred to me recently that I could actually implement the Reflection TS on top of the p2996 design. I’m not going to implement the whole thing, I will instead do just enough to solve the problem I posed at the beginning of this blog post. But I’ll walk through how to do that, which should help shine light both on how the TS works and how the p2996 design works.
To start with, we need a reflection operator which returns a unique type per entity. We can do that with a macro:
    namespace std::reflect {
        template <meta::info R>
        struct Reflection {
            static constexpr auto value = R;
        };

        #define reflexpr(E) ::std::reflect::Reflection<^^E>
    }
In the value-based reflection model, ^^E gives us a unique value for each entity, with uniqueness defined exactly how we need it. So we simply need to lift that into a type.
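"Lifting a value into a type" isn't specific to reflection. Here's a minimal sketch of the same trick in plain C++17, where `Lift` is a hypothetical stand-in for `Reflection` and function pointers stand in for reflections of entities:

```cpp
#include <type_traits>

namespace lift_demo {

// Hypothetical stand-in for Reflection: lifts a constant into a type.
template <auto V>
struct Lift {
    static constexpr auto value = V;
};

void f() {}
void g() {}

// Same entity, same type; different entity, different type.
static_assert(std::is_same_v<Lift<&f>, Lift<&f>>);
static_assert(not std::is_same_v<Lift<&f>, Lift<&g>>);

// The original value is still recoverable from the type.
static_assert(Lift<42>::value == 42);

} // namespace lift_demo
```

The uniqueness guarantee comes for free from template argument equivalence, which is exactly what the `Reflection<^^E>` macro leans on.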
The other fundamental piece we need is an object sequence, which I will just implement as a type-list (even though, as mentioned, it’s not specified as such):
    namespace std::reflect {
        template <typename... R>
        struct Sequence { };
    }
So far so good.
Next, the Reflection TS introduced a lot of concepts (it was introduced at the same time as Concepts in C++20) to help make it easier to understand the API and catch invalid uses as early as possible.
The root concept is Object , which represents a reflection object, and everything else builds on top of that. There was also ObjectSequence , for object sequences. For some reason, ObjectSequence refined Object . I’m not sure why that’s helpful, since these are distinct kinds — so for simplicity I’m not going to do that, but otherwise I’m going to try to keep changes to a minimum.
In my implementation here, an Object is just a specialization of std::reflect::Reflection and ObjectSequence is just a specialization of std::reflect::Sequence . So I’ll add a helper concept for that:
    namespace std::reflect {
        template <meta::info R>
        struct Reflection {
            static constexpr auto value = R;
        };

        template <typename... R>
        struct Sequence { };

        #define reflexpr(E) ::std::reflect::Reflection<^^E>

        template <typename T, meta::info Z>
        concept Specializes = has_template_arguments(^^T) and template_of(^^T) == Z;

        template <typename T>
        concept Object = Specializes<T, ^^Reflection>;

        template <typename T>
        concept ObjectSequence = Specializes<T, ^^Sequence>;
    }
Note here that we don’t have universal template parameters, but we can use reflection parameters as a close enough substitute. As an implementation detail, it works well enough.
The rest of the hierarchy of concepts that we’ll need is based on properties of what the Object in question represents. We have queries for all of those, so we can just use them:
    namespace std::reflect {
        template <typename T> concept Base = Object<T> and is_base(T::value);
        template <typename T> concept Named = Object<T> and has_identifier(T::value);
        template <typename T> concept Typed = Object<T> and has_type(T::value);
        template <typename T> concept Type = Object<T> and is_type(T::value);
        template <typename T> concept Record = Type<T> and is_class_type(T::value);
        template <typename T> concept Class = Record<T> and not is_union_type(T::value);
        template <typename T> concept Enum = Type<T> and is_enum_type(T::value);
        template <typename T> concept RecordMember = Object<T> and is_class_member(T::value);
    }
There are a few other concepts in this hierarchy, like Scope and ScopeMember , that I’m omitting for simplicity. Also p2996 right now does not actually expose has_type , it’s for exposition-only, and this is the first time I’ve actually needed it. We should probably expose it, on the premise that we’ve done that for other functions (like has_parent ), but until then it’s straightforward enough to implement.
Okay, concepts are fun, but we haven’t even done any queries yet. Let’s at least get that first example going. I need:
get_element_t
enumerators_t (which in the TS was renamed to get_enumerators_t )
get_name_v
get_element_t is the simplest. It takes a size_t I and an ObjectSequence T and returns the I th element of T. Boost.Mp11 fans might recognize this as mp_at_c (with the arguments flipped). I’d implement that this way:
    namespace std::reflect {
        template <typename... R>
        struct Sequence { };

        template <typename T>
        concept ObjectSequence = Specializes<T, ^^Sequence>;

        template <size_t I, ObjectSequence Seq>
        using get_element_t = [: template_arguments_of(^^Seq)[I] :];
    }
The I th element of Seq is the I th template argument of that type. An alternative approach, which might compile faster, would be to add an alias template inside of Sequence which takes advantage of the new pack indexing facility:
    namespace std::reflect {
        template <typename... R>
        struct Sequence {
            template <size_t I>
            using nth = R...[I];
        };

        template <typename T>
        concept ObjectSequence = Specializes<T, ^^Sequence>;

        template <size_t I, ObjectSequence Seq>
        using get_element_t = Seq::template nth<I>;
    }
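If you want to play with the same shape on a compiler that has neither reflection nor pack indexing, here is a C++17 sketch. It is a swapped-in technique, not either of the implementations above: it routes through `std::tuple_element_t`, and the `seq_demo` namespace and the alias `S` exist only for this example.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

namespace seq_demo {

template <class... Ts>
struct Sequence {};

// Primary template is left undefined so that non-Sequence arguments fail.
template <std::size_t I, class Seq>
struct get_element;

// Pull the pack back out of Sequence and index it through std::tuple.
template <std::size_t I, class... Ts>
struct get_element<I, Sequence<Ts...>> {
    using type = std::tuple_element_t<I, std::tuple<Ts...>>;
};

template <std::size_t I, class Seq>
using get_element_t = typename get_element<I, Seq>::type;

using S = Sequence<int, char, double>;
static_assert(std::is_same_v<get_element_t<0, S>, int>);
static_assert(std::is_same_v<get_element_t<2, S>, double>);

} // namespace seq_demo
```

Compared to either version above, this needs a helper class template and a partial specialization just to get at the pack.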
The next piece we need for the example is get_enumerators_t . This takes an Enum and yields an ObjectSequence of enumerators. In order to implement this, we need to use a function which is one of the most surprisingly useful functions in the value-based design: substitute .
substitute is actually quite simple. It takes a reflection of a template and a sequence of reflections of template arguments and gives back a reflection of the specialization. For instance, substitute(^^std::vector, {^^int}) gives you back ^^std::vector<int>. Or, substitute(^^std::array, {^^int, std::meta::reflect_constant(4)}) gives you back ^^std::array<int, 4>. The call to std::meta::reflect_constant is necessary because we need to provide reflections to substitute, so we need to take our 4 and produce a reflection of the value 4. That’s what reflect_constant does.
For C++26, we do not yet have reflections of expressions. It’s possible that a future extension would simply allow ^^(4) there. The parentheses might be necessary because unlike the reflection syntax, which isn’t that bad, C++ has plenty of actually really bad syntax. Consider ^^int() — what should that give you a reflection of? Obviously, a reflection of the type “function with no parameters that returns int .” Were you expecting something else?
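Going back to substitute for a moment: a loose, type-level analogy in C++17 may help if the function is new to you. This is not std::meta::substitute, just a hypothetical alias `substitute_t` that applies a template to a pack of types, plus a helper `make_array_t` for the case where one of the arguments is a value.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace subst_demo {

// Apply a type-only template Z to a list of type arguments.
template <template <class...> class Z, class... Args>
using substitute_t = Z<Args...>;

static_assert(std::is_same_v<substitute_t<std::vector, int>, std::vector<int>>);

// std::array mixes a type and a value, so a type-only interface needs the
// value wrapped in a type first. That wrapping plays roughly the role
// that reflect_constant plays for reflections.
template <class T, class N>
using make_array_t = std::array<T, N::value>;

static_assert(std::is_same_v<
    make_array_t<int, std::integral_constant<std::size_t, 4>>,
    std::array<int, 4>>);

} // namespace subst_demo
```

The type-level version needs a different helper for every mix of type and value parameters. With substitute, every argument is a std::meta::info, so one function covers all of them.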
For this particular metafunction, we need to start with a sequence of enumerators and use them to produce a specialization of Sequence, whose template parameters are specializations of Reflection. Put differently, if we had a pack E... of the reflections of the enumerators of T, then we need to give back the type Sequence<Reflection<E>...>.
substitute is how we get there:
    namespace std::reflect {
        template <Enum T>
        using get_enumerators_t = [: []{
            vector<meta::info> args;
            for (meta::info e : enumerators_of(T::value)) {
                args.push_back(substitute(^^Reflection, {meta::reflect_constant(e)}));
            }
            return substitute(^^Sequence, args);
        }() :];
    }
We start with enumerators_of , to get reflections of enumerators. And then we turn each one of those into a reflection of the appropriate specialization of Reflection . And that entire sequence is passed as template parameters to substitute into Sequence . That gives us a reflection of the Sequence we want, so we need to splice the result to get back to the type that we want.
It’s worth walking through this again with a short example. Let’s say we have
enum E { e1, e2, e3 };
Our sequence of steps is:
We start with ^^E.
enumerators_of gives us std::vector{^^E::e1, ^^E::e2, ^^E::e3}.
We need to turn that first into std::vector{^^Reflection<^^E::e1>, ^^Reflection<^^E::e2>, ^^Reflection<^^E::e3>}. That for loop is producing this vector of reflections.
So that we can substitute into ^^Sequence<Reflection<^^E::e1>, Reflection<^^E::e2>, Reflection<^^E::e3>>.
Finally, we have a reflection representing the type we want, so we splice it to get the type.
This pattern is going to come up a few times in the TS, so I will refactor it this way:
    namespace std::reflect {
        inline constexpr auto into_reflection = [](meta::info r){
            return substitute(^^Reflection, {meta::reflect_constant(r)});
        };

        inline constexpr auto into_seq = [](auto&& r){
            return substitute(^^Sequence, r | views::transform(into_reflection));
        };

        template <Enum T>
        using get_enumerators_t = [: into_seq(enumerators_of(T::value)) :];
    }
Note that all the sequence algorithms take any appropriate range of reflections, so the transform just works. You don’t have to turn it into vector at the end or any specific container. Earlier revisions of the design took a span , but this proved cumbersome in practice.
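For comparison, the Sequence<Reflection<E>...> shape we are building is a one-line pack expansion if the pack already exists. A C++17 sketch, with the enumerators themselves standing in for their reflections (the `into_seq_t` alias and the stand-in `Reflection` and `Sequence` here are illustrative only):

```cpp
#include <type_traits>

namespace into_seq_demo {

enum E { e1, e2, e3 };

// Stand-ins: values play the role of reflections in this sketch.
template <auto V>
struct Reflection { static constexpr auto value = V; };

template <class... Ts>
struct Sequence {};

// Wrap every element in Reflection, then collect the results in Sequence.
template <auto... Vs>
using into_seq_t = Sequence<Reflection<Vs>...>;

static_assert(std::is_same_v<
    into_seq_t<e1, e2, e3>,
    Sequence<Reflection<e1>, Reflection<e2>, Reflection<e3>>>);

} // namespace into_seq_demo
```

The catch is that the caller has to spell out e1, e2, e3 by hand. Producing that list from the enum alone is the part that needs enumerators_of, and turning a runtime-sized range back into template arguments is the part that needs substitute.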
Lastly, we need a name. The interesting thing here is that get_name<T>::value in the TS is, specifically, a char const(&)[N] that refers to a null-terminated byte string. In the value-based reflection design, identifier_of gives you a string_view (that is specified to be null-terminated). However, nothing I’m going to do relies on get_name<T>::value specifically being a reference to an array, and get_name_v<T> is a pointer anyway, so I will again simplify a bit here:
    namespace std::reflect {
        template <Named T>
        constexpr auto get_name_v = identifier_of(T::value).data();
    }
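The three representations in play (array reference, string_view, plain pointer) differ only in how much they remember about the length. A small C++17 sketch, where `storage` is a made-up stand-in for wherever the identifier lives:

```cpp
#include <string_view>

namespace name_demo {

// Stand-in for the storage behind an identifier.
inline constexpr char storage[] = "first";

// TS style: a reference to the whole array, so the length is in the type.
inline constexpr auto& as_array = storage;      // char const(&)[6]
static_assert(sizeof(as_array) == 6);           // five characters plus '\0'

// Value-based style: a string_view, whose size excludes the terminator.
inline constexpr std::string_view as_view = storage;
static_assert(as_view.size() == 5);

// The simplified get_name_v above: just the pointer.
inline constexpr char const* as_pointer = as_view.data();
static_assert(as_pointer[5] == '\0');           // still null-terminated

} // namespace name_demo
```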
We can still produce a char const(&)[N] if desired, using substitute. Have I mentioned that this is a very useful function? You can see an implementation of how to get there in P3617, which was recently approved for C++26. Using the proposed reflect_constant_string (which returns a reflection of an array), that would look like this:
    namespace std::reflect {
        template <Named T>
        struct get_name {
            static constexpr auto& value = [: meta::reflect_constant_string(identifier_of(T::value)) :];
        };
    }
And with that, we can test out our implementation to see if it works (it does).
A First Comparison
We haven’t yet implemented all the pieces we need to implement is_structural using the Reflection TS, but we have for this first example. Let’s compare what it would look like to write a function that takes an enum and returns the string name of its first enumerator. It may not be the most compelling reflection use-case, but it still requires interesting things.
    template <typename T>
    consteval auto first_enum_ts() -> std::string_view {
        using namespace std::reflect;
        return get_name_v<get_element_t<0, get_enumerators_t<reflexpr(T)>>>;
    }

    template <typename T>
    consteval auto first_enum_value() -> std::string_view {
        return identifier_of(enumerators_of(^^T)[0]);
    }
The first thing to notice is that there is a direct one-to-one correspondence for all of the operations. This shouldn’t be too surprising, since the type-based design heavily informed the value-based design:
| type-based | value-based |
| reflexpr(T) | ^^T |
| std::reflect::get_enumerators_t<R> | enumerators_of(r) |
| std::reflect::get_element_t<0, Seq> | seq[0] |
| std::reflect::get_name_v<R> | identifier_of(r) |
Now, with the type-based model, all the names have to either be qualified or brought in via using namespace . That’s not new, I frequently have a using namespace boost::mp11; when using Boost.Mp11. But in the value-based model, it’s unnecessary because we rely on argument-dependent lookup.
The other thing to notice is that we had to use a metafunction to pull out the first element in the type-based model, but in the value-based one we didn’t have to use a dedicated reflection function: we were able to just use the index operator. That’s pretty nice.
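The argument-dependent lookup point is easy to see in miniature. In this C++17 sketch, `meta::info` is a hypothetical stand-in struct (the real std::meta::info is an opaque scalar type), but the lookup rule is the same: an unqualified call finds functions in the namespace of its argument's type.

```cpp
#include <string_view>

namespace adl_demo {

namespace meta {
    // Hypothetical stand-in for std::meta::info.
    struct info {
        std::string_view name;
    };

    constexpr std::string_view identifier_of(info r) { return r.name; }
}

// We are outside of `meta` here and there is no using-directive, yet the
// unqualified call resolves because its argument has type meta::info.
constexpr meta::info r{"first"};
static_assert(identifier_of(r) == "first");

} // namespace adl_demo
```

Class template specializations like get_name_v<T> are not functions, so they get no such help. That is why the type-based code needs the qualification or the using-directive.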
get_base_classes_t and get_data_members_t
Getting back to the problem I wanted to implement, there are a few more pieces we need. The definition of structural relies on recursing through base classes and non-static data members, so we will need the ability to do so. Now that we’ve provided a nice utility for converting a reflection range into an object sequence, we can simply reuse that:
    namespace std::reflect {
        inline constexpr auto into_reflection = [](meta::info r){
            return substitute(^^Reflection, {meta::reflect_constant(r)});
        };

        inline constexpr auto into_seq = [](auto&& r){
            return substitute(^^Sequence, r | views::transform(into_reflection));
        };

        template <Enum T>
        using get_enumerators_t = [: into_seq(enumerators_of(T::value)) :];

        static constexpr auto unchecked = std::meta::access_context::unchecked();

        template <Class T>
        using get_base_classes_t = [: into_seq(
            bases_of(T::value, unchecked)
            ) :];

        template <Record T>
        using get_nonstatic_data_members_t = [: into_seq(
            nonstatic_data_members_of(T::value, unchecked)
            ) :];
    }
Easy enough.
There’s one thing I changed in the API here. In the Reflection TS, the metafunction is get_data_members. It returned all the data members, static and non-static. So if you wanted just the non-static data members (as you usually do), you would need to do a filter, something like boost::mp11::mp_remove_if<std::reflect::get_data_members_t<T>, std::reflect::is_static>. That’s pretty tedious for a common operation, and I suspect that were the TS to be standardized, somebody would have pointed this out. On the other hand, the p2996 design does not have a simple function to get all the data members. You would have to either get all the members (members_of) and filter down or merge the non-static (nonstatic_data_members_of) and static (static_data_members_of) data members. So in this case, copying the TS design would’ve meant more work to implement something less useful.
At this point let’s stop and do another quick comparison — another fairly silly little metafunction. Before, we looked at the name of the first enumerator, now let’s look at the type of the first non-static data member.
    template <typename T>
    using first_nsdm_type_ts = std::reflect::get_type_t<
        std::reflect::get_element_t<0,
            std::reflect::get_nonstatic_data_members_t<
                reflexpr(T)>>>;

    template <typename T>
    using first_nsdm_type_value = [:
        type_of(nonstatic_data_members_of(
            ^^T,
            std::meta::access_context::unchecked()
        )[0])
    :];
As with the earlier example, we have a direct 1-1 mapping of operations… almost:
| type-based | value-based |
| reflexpr(T) | ^^T |
| std::reflect::get_nonstatic_data_members_t<R> | nonstatic_data_members_of(r, std::meta::access_context::unchecked()) |
| std::reflect::get_element_t<0, Seq> | seq[0] |
| std::reflect::get_type_t<R> | type_of(r) |
| | [: r :] |
The p2996 design for getting bases and non-static data members is, unfortunately, extremely verbose. But the type-based design has its own issue with verbosity due to having to qualify all the metafunctions. If we put the type-based solution in a context where we can using namespace std::reflect , that solution becomes a lot more palatable. And likewise if we add a wrapper for nsdms() or fields_of() that returns all the non-static data members:
    template <typename T>
    using first_nsdm_type_ts =
        get_type_t<get_element_t<0, get_nonstatic_data_members_t<reflexpr(T)>>>;

    template <typename T>
    using first_nsdm_type_value = [: type_of(fields_of(^^T)[0]) :];
Now, this solution wasn’t quite what I expected. When I’d started implementing this example using the TS, I thought I would need one more metafunction on the type-based solution. To fill in that empty box in the bottom left corner:
| type-based | value-based |
| reflexpr(T) | ^^T |
| get_nonstatic_data_members_t<R> | fields_of(r) |
| get_element_t<0, Seq> | seq[0] |
| get_type_t<R> | type_of(r) |
| get_reflected_type_t<R> ?? | [: r :] |
The function std::meta::type_of(r) takes a reflection of a typed entity and produces a reflection of a type. But the metafunction std::reflect::get_type_t takes a reflection of a typed entity and produces the type directly. What I mean is:
    // let's take some variable (not constexpr, so that its type is
    // plain int rather than const int)
    int v = 42;

    // in the value-based design, this is a *reflection* of int
    static_assert(type_of(^^v) == ^^int);

    // in the TS, this is already int
    using T = std::reflect::get_type_t<reflexpr(v)>;
    static_assert(std::same_as<T, int>); // int, not reflexpr(int)
On the one hand, that saves a step, if that’s what you really want. On the other hand, it requires re-invoking reflexpr if you need to then do more reflection things with it. For instance, if I wanted the first non-static data member’s type of the first non-static data member, in the value-based model I just call fields_of again but in the TS model I’d have to call reflexpr first.
In any case, I think one of the unheralded benefits of the new syntax — at least one that I hadn’t thought about before going through this exercise — is that we have distinct syntax for going into ( ^^T ) and out of ( [: r :] ) the reflection domain. In the Reflection TS, there was only distinct syntax for going into the domain ( reflexpr was a keyword, so would have shown up clearly). But on the way out were just regular metafunctions — get_type_t , get_pointer_v , etc. I think there’s something to be said for having this stand out.
A few predicates more
Alright, lastly we just need a few predicates. We need to be able to check whether a base or data member is public, and whether a data member is mutable. The TS didn’t have a way to check for mutable, but p2996 does, so we’ll just add the equivalent:
    namespace std::reflect {
        template <typename T> requires RecordMember<T> or Base<T>
        inline constexpr bool is_public_v = is_public(T::value);

        template <RecordMember T>
        inline constexpr bool is_mutable_member_v = is_mutable_member(T::value);
    }
A type-based implementation
We have all the pieces, now let’s solve the problem.
There are basically two issues that we have to deal with in writing an is_structural type trait:
How to properly handle recursion, and
How to properly guard instantiations.
What I mean by the second one is that we can’t just write a linear branch like this:
    template <typename T>
    inline constexpr bool is_structural =
        std::is_scalar_v<T>
        or std::is_lvalue_reference_v<T>
        or std::is_class_v<T>
            and boost::mp11::mp_all_of<
                std::reflect::unpack_sequence_t<
                    boost::mp11::mp_list,
                    std::reflect::get_base_classes_t<reflexpr(T)>
                >,
                std::reflect::is_public
            >::value
        ;
This is only part of the implementation, I’m just checking that class types have all-public base classes to start. And checking this on class types does work:
    struct B { };
    struct D : B { };

    static_assert(is_structural<B>); // yes
    static_assert(is_structural<D>); // yes
It’s just that checking it on non-class types doesn’t:
    static_assert(is_structural<int>); // error
That’s because boolean expressions like this short-circuit evaluation, but they don’t short-circuit instantiation. This is still trying to instantiate get_base_classes_t with reflexpr(int) , which is invalid because that metafunction is constrained on Class (which int is not).
So we need a different strategy.
There are basically two approaches I know of to handle this. The first is specialization. We have three cases that happen to be completely disjoint (scalar, lvalue reference, and class type), so we can just handle them separately:
    template <typename T>
    inline constexpr bool is_structural = false;

    template <typename T> requires std::is_scalar_v<T>
    inline constexpr bool is_structural<T> = true;

    template <typename T> requires std::is_lvalue_reference_v<T>
    inline constexpr bool is_structural<T> = true;

    template <typename T> requires std::is_class_v<T>
    inline constexpr bool is_structural<T> =
        boost::mp11::mp_all_of<
            std::reflect::unpack_sequence_t<
                boost::mp11::mp_list,
                std::reflect::get_base_classes_t<reflexpr(T)>
            >,
            std::reflect::is_public
        >::value;
That approach works great. But I’m not a huge fan of it for this particular problem. Template specialization is a best match algorithm. Our problem, though, calls for a linear sequence of bullets. It works, but it’s not a direct match for the algorithm we want to express, which can make things harder to reason about. In particular, if our cases weren’t disjoint, we’d have to spend more time working out how to actually express them.
I tend to prefer linearity. Which, in this case, means if constexpr :
    template <typename T>
    inline constexpr bool is_structural = []{
        if constexpr (std::is_scalar_v<T>) {
            return true;
        } else if constexpr (std::is_lvalue_reference_v<T>) {
            return true;
        } else if constexpr (std::is_class_v<T>) {
            return boost::mp11::mp_all_of<
                std::reflect::unpack_sequence_t<
                    boost::mp11::mp_list,
                    std::reflect::get_base_classes_t<reflexpr(T)>
                >,
                std::reflect::is_public
            >::value;
        } else {
            return false;
        }
    }();
That works too. And sure, both this approach and the previous one can be simplified a bit by combining cases, I’m not trying to code golf here. The nice part of wrapping this in a lambda (or making is_structural a consteval function) is that we have a nice place to stick a using namespace in there, which makes the implementation much more readable:
    template <typename T>
    consteval auto is_structural() -> bool {
        if constexpr (std::is_scalar_v<T>) {
            return true;
        } else if constexpr (std::is_lvalue_reference_v<T>) {
            return true;
        } else if constexpr (std::is_class_v<T>) {
            using namespace boost::mp11;
            using namespace std::reflect;
            return mp_all_of<
                unpack_sequence_t<mp_list, get_base_classes_t<reflexpr(T)>>,
                is_public
            >::value;
        } else {
            return false;
        }
    }
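The difference between short-circuiting evaluation and short-circuiting instantiation has nothing to do with reflection, so it can be reproduced in plain C++17. In this sketch, `tag_is_int` is a made-up trait that is only valid for types with a nested `tag`, playing the role that get_base_classes_t plays above:

```cpp
#include <type_traits>

namespace guard_demo {

// Hard error when T has no nested ::tag (for example, T = int).
template <class T>
inline constexpr bool tag_is_int = std::is_same_v<typename T::tag, int>;

// Using the variable below with T = int would be an error: `and` skips
// evaluating its right-hand side, but tag_is_int<int> is still instantiated.
// template <class T>
// inline constexpr bool unguarded = std::is_class_v<T> and tag_is_int<T>;

template <class T>
constexpr bool guarded() {
    if constexpr (std::is_class_v<T>) {
        return tag_is_int<T>;   // instantiated only for class types
    } else {
        return false;
    }
}

struct Tagged { using tag = int; };
struct Other  { using tag = char; };

static_assert(guarded<Tagged>());
static_assert(not guarded<Other>());
static_assert(not guarded<int>());  // fine: the class branch is discarded

} // namespace guard_demo
```

The discarded branch of an if constexpr inside a template is never instantiated, which is the property the is_structural implementations above rely on.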
Now, for the recursion part. We need to not just check that all of the base classes are public, but also that they’re structural. We could add a helper
    template <auto F>
    struct Func {
        template <typename T>
        using fn = decltype(F.template operator()<T>());
    };
Which could drive our recursion:
    template <typename T>
    inline constexpr bool is_structural = []{
        if constexpr (std::is_scalar_v<T>) {
            return true;
        } else if constexpr (std::is_lvalue_reference_v<T>) {
            return true;
        } else if constexpr (std::is_class_v<T>) {
            using namespace std::reflect;
            using namespace boost::mp11;
            return mp_all_of_q<
                    unpack_sequence_t<mp_list, get_base_classes_t<reflexpr(T)>>,
                    Func<[]<typename B> {
                        return mp_bool<
                            is_public_v<B>
                            and is_structural<get_type_t<B>>
                        >();
                    }>
                >::value
                and mp_all_of_q<
                    unpack_sequence_t<mp_list, get_nonstatic_data_members_t<reflexpr(T)>>,
                    Func<[]<typename M>{
                        return mp_bool<
                            is_public_v<M>
                            and not is_mutable_member_v<M>
                            and is_structural<std::remove_all_extents_t<get_type_t<M>>>
                        >();
                    }>
                >::value;
        } else {
            return false;
        }
    }();
That’s a complete solution. Could even do a little bit better by having a dedicated predicate lambda (so that it can just return bool ) and handling the base classes and non-static data members at the same time:
template struct Pred { template using fn = mp_bool()>; }; template inline constexpr bool is_structural = []{ if constexpr (std::is_scalar_v) { return true; } else if constexpr (std::is_lvalue_reference_v) { return true; } else if constexpr (std::is_class_v) { using namespace std::reflect; using namespace boost::mp11; using Bases = unpack_sequence_t< mp_list, get_base_classes_t>; using Members = unpack_sequence_t< mp_list, get_nonstatic_data_members_t>; return mp_all_of_q< mp_append, Pred<[]