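What You Should Update in Your C++ Knowledge

Overview

C++ remains one of the classic languages in programming, and the C++17 standard was finalized in the first half of 2017. This issue collects years of hands-on experience from programming expert Qi Yu (祁宇), author of 《深入应用 C++ 11》 and founder of the C++ open-source community purecpp.org, and introduces in detail the new features and basic usage in the latest C++17 standard that developers should pay attention to.

In this book: Compile-Time Reflection with C++14

By Qi Yu

This article analyzes the magic_get source code to introduce the key techniques behind it and to explain in depth how reflection on POD types works.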

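Compile-time reflection for POD types

Reflection is a mechanism for obtaining a class's internal information from metadata; through metadata you can get an object's fields, methods, and so on. Reflection in both C# and Java works by reading an object's metadata. It is useful in areas that depend closely on information about the object itself, such as dependency injection, ORM (object-relational mapping), and serialization and deserialization. Java's Spring framework, for example, builds its dependency injection on reflection: it obtains type information from metadata and creates objects dynamically. ORM mapping between objects and entities is likewise implemented through reflection. Java and C# both run on an intermediate runtime that provides reflection, so reflection comes easily to them; for a language without such a runtime, implementing reflection is hard.

At CppCon 2016, Antony Polukhin gave a talk on reflection in C++ and proposed a new approach: reflection without macros, markup, or extra tools. This looks like an impossible task, because C++ has no reflection mechanism and cannot directly obtain an object's meta-information. But Antony Polukhin found that, for POD types, modern C++ template metaprogramming techniques make this kind of compile-time reflection possible. He open-sourced a compile-time reflection library for POD types, magic_get (https://github.com/apolukhin/magic_get), which is also being prepared for inclusion in Boost. Let us look at a usage example.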

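#include <boost/pfr.hpp>  // header name assumed; it is missing from the source text

struct foo {
    int some_integer;
    char c;
};

foo f {777, '!'};
auto& r1 = boost::pfr::flat_get<0>(f); // access the 1st field of foo by index
auto& r2 = boost::pfr::flat_get<1>(f); // access the 2nd field of foo by index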

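As the example shows, magic_get does access the fields of a foo object non-intrusively: no macros, no extra code, no special tools. The fields of a POD object are accessed directly at compile time with no run-time cost, which really is a bit magic.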


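Key techniques

The idea behind POD reflection is this: first convert the POD type into the corresponding tuple type, then copy the POD value into the tuple, after which the tuple's elements can be accessed by index. The two key problems are therefore how to convert a POD type into its tuple type and how to assign a POD value to that tuple.

Converting a POD type to a tuple type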

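What does the tuple type for a POD type look like? Taking foo above as an example, the corresponding tuple should be tuple<int, char>: the element types and their order match the fields of the POD type one to one.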
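To make that correspondence concrete, the short sketch below writes out by hand the tuple type that the library must eventually deduce for foo. The alias name foo_as_tuple is hypothetical and only serves this illustration; the point of the rest of the article is to compute this type automatically.

```cpp
#include <tuple>
#include <type_traits>

struct foo {
    int some_integer;
    char c;
};

// Hypothetical alias: the tuple type that mirrors foo's fields, written by hand.
using foo_as_tuple = std::tuple<int, char>;

// Element i of the tuple has the same type as field i of foo.
static_assert(std::is_same<std::tuple_element_t<0, foo_as_tuple>,
                           decltype(foo::some_integer)>::value,
              "element 0 matches the first field");
static_assert(std::is_same<std::tuple_element_t<1, foo_as_tuple>,
                           decltype(foo::c)>::value,
              "element 1 matches the second field");
static_assert(std::tuple_size<foo_as_tuple>::value == 2,
              "one tuple element per field");
```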

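The basic idea for generating a tuple from a struct is to extract the type of each field in order, store those types, and later read them back to build the tuple type. The field types differ, however, and C++ has no container that can hold different types directly, so a workaround is needed: store the extracted field types indirectly. Each type is converted to an id of type size_t, the id is stored in an array, and later the id is mapped back to the actual type to build the tuple type.

The problem to solve here is how to convert between a type and an id in both directions.

Converting between type and id at compile time

An empty class template holds the actual type, and a C++14 constexpr function returns the compile-time id for that type; together they convert a type to an id. The code is as follows: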

https://ipad-cms.csdn.net/cms/article/code/3445

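The code above encodes the types int and char at compile time, turning each type into a concrete compile-time constant; later these constants can be used to recover the corresponding types.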
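The linked listing is not reproduced in the text, so the following is only a minimal sketch of what such a type-to-id mapping could look like. It assumes that identity is an empty tag template (the name appears in later listings) and uses the ids 6 and 9, which are the values the REGISTER_TYPE table later in the article assigns to int and char.

```cpp
#include <cstddef>

// Empty tag type: carries T as a type without needing a value of T.
template <class T>
struct identity {
    typedef T type;
};

// One overload per supported type. The ids are arbitrary but non-zero,
// because zero later marks an unused slot in the id array.
constexpr std::size_t type_to_id(identity<int>) noexcept { return 6; }
constexpr std::size_t type_to_id(identity<char>) noexcept { return 9; }

// Overload resolution on the tag type selects the id at compile time.
static_assert(type_to_id(identity<int>{}) == 6, "int is encoded as 6");
static_assert(type_to_id(identity<char>{}) == 9, "char is encoded as 9");
```

Passing a tag object instead of a value of T means the mapping also works for types that are awkward to construct, and the result is usable wherever a constant expression is required.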

The code that obtains a type from an id at compile time is as follows:

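// The template arguments of integral_constant were lost in the source text;
// 6 and 9 are the ids that the REGISTER_TYPE table assigns to int and char.
constexpr auto id_to_type( std::integral_constant<std::size_t, 6> ) noexcept { int res{}; return res; }
constexpr auto id_to_type( std::integral_constant<std::size_t, 9> ) noexcept { char res{}; return res; }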

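In the code above, id_to_type returns an instance of the type that corresponds to the id; to obtain the type itself, it still has to be deduced with decltype. magic_get uses a macro to encode all the fundamental POD types, which gives the two-way compile-time conversion between type and id.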
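As a hedged illustration of the decltype step, the sketch below pairs the two directions and checks that a type survives the round trip. The alias round_trip_t is a hypothetical helper, not part of magic_get, and the ids 6 and 9 are assumed to be the ones used for int and char.

```cpp
#include <cstddef>
#include <type_traits>

template <class T>
struct identity { typedef T type; };

// type -> id
constexpr std::size_t type_to_id(identity<int>) noexcept { return 6; }
constexpr std::size_t type_to_id(identity<char>) noexcept { return 9; }

// id -> value of the encoded type (the type itself is recovered with decltype)
constexpr auto id_to_type(std::integral_constant<std::size_t, 6>) noexcept { int res{}; return res; }
constexpr auto id_to_type(std::integral_constant<std::size_t, 9>) noexcept { char res{}; return res; }

// Hypothetical helper: T -> id -> T again.
template <class T>
using round_trip_t = decltype(id_to_type(
    std::integral_constant<std::size_t, type_to_id(identity<T>{})>{}));

static_assert(std::is_same<round_trip_t<int>, int>::value, "int round-trips");
static_assert(std::is_same<round_trip_t<char>, char>::value, "char round-trips");
```

Wrapping the id in std::integral_constant turns a value into a distinct type, which is what lets ordinary overload resolution pick the right id_to_type.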

#define REGISTER_TYPE(Type, Index)                                              \
    constexpr std::size_t type_to_id(identity<Type>) noexcept { return Index; } \
    constexpr auto id_to_type( std::integral_constant<std::size_t, Index > ) noexcept { Type res{}; return res; }

// Register all base types here
REGISTER_TYPE(unsigned short     , 1)
REGISTER_TYPE(unsigned int       , 2)
REGISTER_TYPE(unsigned long long , 3)
REGISTER_TYPE(signed char        , 4)
REGISTER_TYPE(short              , 5)
REGISTER_TYPE(int                , 6)
REGISTER_TYPE(long long          , 7)
REGISTER_TYPE(unsigned char      , 8)
REGISTER_TYPE(char               , 9)
REGISTER_TYPE(wchar_t            , 10)
REGISTER_TYPE(long               , 11)
REGISTER_TYPE(unsigned long      , 12)
REGISTER_TYPE(void*              , 13)
REGISTER_TYPE(const void*        , 14)
REGISTER_TYPE(char16_t           , 15)
REGISTER_TYPE(char32_t           , 16)
REGISTER_TYPE(float              , 17)
REGISTER_TYPE(double             , 18)
REGISTER_TYPE(long double        , 19)

Once types are encoded, the next questions are where to store the ids and how to read them back. magic_get defines an array to hold the type ids of a struct's fields.

template <class T, std::size_t N>
struct array {
    typedef T type;
    T data[N];
    static constexpr std::size_t size() noexcept { return N; }
};

The fixed-length member data of array holds the ids of the field types; the array index is the position of the field within the struct.
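The sketch below shows, under assumed values, how such an array can be filled inside a constexpr function. The function make_ids and the 4-slot size are hypothetical; the ids 6 and 9 stand for int and char.

```cpp
#include <cstddef>

template <class T, std::size_t N>
struct array {
    typedef T type;
    T data[N];
    static constexpr std::size_t size() noexcept { return N; }
};

// Hypothetical example: ids for a struct with an int and a char field,
// stored in an array that has room for four fields.
constexpr array<std::size_t, 4> make_ids() noexcept {
    array<std::size_t, 4> ids{};  // value-initialized: every slot starts at 0
    ids.data[0] = 6;              // field 0 is an int
    ids.data[1] = 9;              // field 1 is a char
    return ids;
}

constexpr auto ids = make_ids();
static_assert(ids.size() == 4, "capacity is fixed at compile time");
static_assert(ids.data[0] == 6 && ids.data[1] == 9, "ids sit at the field positions");
static_assert(ids.data[2] == 0 && ids.data[3] == 0, "unused slots stay zero");
```

A hand-written aggregate is a reasonable choice here because std::array's non-const element access only became constexpr in C++17, so it cannot be written to inside a C++14 constexpr function.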

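Extracting the fields of a POD struct

The previous section showed how field types are stored and retrieved. How are the field types extracted from the POD struct in the first place? It takes three steps:

1. Define an array to hold the field type ids;
2. Convert each field type of the POD to its id and store the ids in the array in order;
3. Strip the unused tail of the array.

The implementation is as follows: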

template <class T>
constexpr auto fields_count_and_type_ids_with_zeros() noexcept {
    static_assert(std::is_trivial<T>::value, "Not applyable");
    array<std::size_t, sizeof(T)> types{};
    detect_fields_count_and_type_ids<T>(types.data, std::make_index_sequence<sizeof(T)>{});
    return types;
}

template <class T>
constexpr auto array_of_type_ids() noexcept {
    constexpr auto types = fields_count_and_type_ids_with_zeros<T>();
    constexpr std::size_t count = count_nonzeros(types);
    array<std::size_t, count> res{};
    for (std::size_t i = 0; i < count; ++i) {
        res.data[i] = types.data[i];
    }
    return res;
}

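Defining the array requires a fixed length. How long should it be? Long enough for the largest possible number of fields. Every ordinary (non-bit-field) field occupies at least one byte, so a struct has at most sizeof(T) fields, and the array length is set to sizeof(T). All elements are initialized to 0. A struct usually has fewer fields than that, so the array ends up with unused elements; these must be removed so that only valid field type ids remain. This is done by counting the non-zero elements and copying them into a new array. The function below counts the non-zero elements of the array, again as a compile-time computation using constexpr.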

template <class Array>
constexpr auto count_nonzeros(Array a) noexcept {
    std::size_t count = 0;
    for (std::size_t i = 0; i < Array::size() && a.data[i]; ++i)
        ++count;
    return count;
}

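Because the fields are stored in the array in order, the value of count at the first zero element is the number of valid elements. Next, let us look at the implementation of detect_fields_count_and_type_ids, the constexpr function that stores the struct's field type ids into the array's data.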
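The counting and trimming steps can be tried in isolation. In the sketch below the padded input and the helper trim are hypothetical; only array and count_nonzeros follow the listings above.

```cpp
#include <cstddef>

template <class T, std::size_t N>
struct array {
    T data[N];
    static constexpr std::size_t size() noexcept { return N; }
};

// Counts leading non-zero elements; stops at the first zero.
template <class Array>
constexpr std::size_t count_nonzeros(Array a) noexcept {
    std::size_t count = 0;
    for (std::size_t i = 0; i < Array::size() && a.data[i]; ++i)
        ++count;
    return count;
}

// Hypothetical helper: copy the first Count ids into a tightly sized array.
template <std::size_t Count, class Array>
constexpr array<std::size_t, Count> trim(Array a) noexcept {
    array<std::size_t, Count> res{};
    for (std::size_t i = 0; i < Count; ++i)
        res.data[i] = a.data[i];
    return res;
}

// Assumed input: two real ids followed by six unused (zero) slots.
constexpr array<std::size_t, 8> padded{{6, 9}};
constexpr std::size_t used = count_nonzeros(padded);
constexpr auto trimmed = trim<used>(padded);

static_assert(used == 2, "two valid ids");
static_assert(trimmed.size() == 2, "trimmed array has no spare slots");
static_assert(trimmed.data[0] == 6 && trimmed.data[1] == 9, "ids keep their order");
```

The count has to be a constant expression before the second array is declared, because it becomes that array's template argument.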

detect_fields_count_and_type_ids<T>(types.data, std::make_index_sequence<sizeof(T)>{});

The first argument of detect_fields_count_and_type_ids is the data member of the fixed-length array, and the second is a std::index_sequence of integers. The implementation is as follows:

template <class T, std::size_t I0, std::size_t... I>
constexpr auto detect_fields_count_and_type_ids(std::size_t* types, std::index_sequence<I0, I...>) noexcept
    -> decltype( type_to_array_of_type_ids<T, I0, I...>(types) )
{
    return type_to_array_of_type_ids<T, I0, I...>(types);
}

template <class T, std::size_t... I>
constexpr T detect_fields_count_and_type_ids(std::size_t* types, std::index_sequence<I...>) noexcept {
    return detect_fields_count_and_type_ids<T>(types, std::make_index_sequence<sizeof...(I) - 1>{});
}

template <class T>
constexpr T detect_fields_count_and_type_ids(std::size_t*, std::index_sequence<>) noexcept {
    static_assert(!!sizeof(T), "Failed for unknown reason");
    return T{};
}

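The code above starts from the index_sequence 0, 1, 2, ..., sizeof(T) - 1 and calls type_to_array_of_type_ids with it to store the struct's field type ids into the array. If T cannot be initialized from that many arguments, the trailing decltype removes the first overload from consideration and the second overload retries with a sequence one element shorter, so the sequence shrinks until its length equals the number of fields.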
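The shrinking search can be seen more easily in a reduced form that only counts fields and records nothing. The sketch below is an illustration of the same overload pattern, not the library's code: any_field and count_fields are hypothetical names, and the terminal overload is declared first so that the recursive call can find it.

```cpp
#include <cstddef>
#include <utility>

// Simplified stand-in for ubiq: converts to any type and records nothing.
template <std::size_t>
struct any_field {
    template <class Type>
    constexpr operator Type() const noexcept { return Type{}; }
};

// Terminal case, declared first so the recursive call below can see it.
template <class T>
constexpr std::size_t count_fields(std::index_sequence<>) noexcept {
    return 0;
}

// Preferred overload: viable only if T can be brace-initialized from
// sizeof...(I) + 1 arguments; otherwise the trailing decltype removes it.
template <class T, std::size_t I0, std::size_t... I>
constexpr auto count_fields(std::index_sequence<I0, I...>) noexcept
    -> decltype(T{any_field<I0>{}, any_field<I>{}...}, std::size_t{}) {
    return sizeof...(I) + 1;
}

// Fallback: drop one index and try again.
template <class T, std::size_t... I>
constexpr std::size_t count_fields(std::index_sequence<I...>) noexcept {
    return count_fields<T>(std::make_index_sequence<sizeof...(I) - 1>{});
}

struct foo { int some_integer; char c; };
struct bar { double d; int i; char c; };

static_assert(count_fields<foo>(std::make_index_sequence<sizeof(foo)>{}) == 2,
              "foo has two fields");
static_assert(count_fields<bar>(std::make_index_sequence<sizeof(bar)>{}) == 3,
              "bar has three fields");
```

The search has to run downward from the upper bound: initializing an aggregate with too few arguments is valid (the remaining fields are value-initialized), so only the largest argument count that still compiles equals the number of fields.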

Before discussing type_to_array_of_type_ids, let us look at the helper struct ubiq. It is ubiq that actually stores the ids of the POD field types. Its implementation is:

template <std::size_t I>
struct ubiq {
    std::size_t* ref_;

    template <class Type>
    constexpr operator Type() const noexcept {
        ref_[I] = type_to_id(identity<Type>{});
        return Type{};
    }
};

This struct is rather unusual, so let us simplify it first.

struct ubiq {
    template <class Type>
    constexpr operator Type() const {
        return Type{};
    }
};

What makes this struct special is that it can be converted to any POD type, such as int, char, or double.

int i = ubiq{};
double d = ubiq{};
char c = ubiq{};

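The target type of ubiq's conversion operator is deduced automatically by the compiler, which is why it can produce a value of any POD type. Once ubiq knows which type is being constructed, that type still has to be converted to an id and stored in the fixed-length array in order.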

template <std::size_t I>
struct ubiq {
    std::size_t* ref_;

    template <class Type>
    constexpr operator Type() const noexcept {
        ref_[I] = type_to_id(identity<Type>{});
        return Type{};
    }
};

The code above converts the type deduced by the compiler to an id and stores it at index I of the array.

Now back to the type_to_array_of_type_ids function.

template <class T, std::size_t... I>
constexpr auto type_to_array_of_type_ids(std::size_t* types) noexcept
    -> decltype(T{ ubiq<I>{types}... })
{
    return T{ ubiq<I>{types}... };
}

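type_to_array_of_type_ids has two template parameters: the first, T, is the POD struct type, and the second, size_t..., is an integer sequence running from 0 to at most sizeof(T) - 1. The function parameter is a size_t*, which is actually the data of the array and is used to store the ids of the POD field types.

The key line for storing the field types is T{ ubiq<I>{types}... }. It relies on aggregate initialization of the POD type with a braced initializer list: to initialize each field the compiler must deduce that field's type, and ubiq converts the deduced type to an id and stores it in the array. This is the magic in magic_get.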
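The following sketch puts the key line to work on two small structs. To stay short it assumes the field count is already known and passed as the length of the index sequence; field_ids is a hypothetical helper, bar is an assumed extra test type, and only three types are registered, with the ids 6, 9 and 18 used for int, char and double.

```cpp
#include <cstddef>
#include <utility>

template <class T>
struct identity { typedef T type; };

constexpr std::size_t type_to_id(identity<int>) noexcept { return 6; }
constexpr std::size_t type_to_id(identity<char>) noexcept { return 9; }
constexpr std::size_t type_to_id(identity<double>) noexcept { return 18; }

template <class T, std::size_t N>
struct array {
    T data[N];
    static constexpr std::size_t size() noexcept { return N; }
};

// Converting ubiq<I> to a field type writes that type's id into slot I.
template <std::size_t I>
struct ubiq {
    std::size_t* ref_;

    template <class Type>
    constexpr operator Type() const noexcept {
        ref_[I] = type_to_id(identity<Type>{});
        return Type{};
    }
};

// Hypothetical helper: the caller supplies one index per field.
template <class T, std::size_t... I>
constexpr array<std::size_t, sizeof...(I)> field_ids(std::index_sequence<I...>) noexcept {
    array<std::size_t, sizeof...(I)> ids{};
    T tmp{ ubiq<I>{ids.data}... };  // aggregate initialization triggers the conversions
    (void)tmp;                      // only the side effect on ids is of interest
    return ids;
}

struct foo { int some_integer; char c; };
struct bar { double d; int i; char c; };

constexpr auto foo_ids = field_ids<foo>(std::make_index_sequence<2>{});
constexpr auto bar_ids = field_ids<bar>(std::make_index_sequence<3>{});

static_assert(foo_ids.data[0] == 6 && foo_ids.data[1] == 9, "foo is {int, char}");
static_assert(bar_ids.data[0] == 18 && bar_ids.data[1] == 6 && bar_ids.data[2] == 9,
              "bar is {double, int, char}");
```

Writing through ref_ inside a constant expression is allowed here because the array being written to is created within the same constexpr evaluation, which is one of the relaxations C++14 introduced.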

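With the field ids of the POD struct stored in the array, the next step is to convert the list of ids in the array into a tuple.

Converting the sequence of POD field ids to a tuple

This takes two steps:

1. Put the field type ids stored in the array into an integer sequence, std::index_sequence;
2. Convert the type ids in the index_sequence to the corresponding types and form a tuple from them.

The implementation is as follows: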

template <std::size_t I, class T, std::size_t N>
constexpr const T& get(const array<T, N>& a) noexcept {
    return a.data[I];
}

template <class T, std::size_t... I>
constexpr auto array_of_type_ids_to_index_sequence(std::index_sequence<I...>) noexcept {
    constexpr auto a = array_of_type_ids<T>();
    return std::index_sequence< get<I>(a)...>{};
}

get returns the element at a given index of the array, that is, a type id. The returned ids are placed into a std::index_sequence, and the ids in that index_sequence are then converted to types to form a tuple.

template <std::size_t... I>
constexpr auto as_tuple_impl(std::index_sequence<I...>) noexcept {
    return std::tuple< decltype( id_to_type(std::integral_constant<std::size_t, I>{}) )... >{};
}

template <class T>
constexpr auto as_tuple() noexcept {
    static_assert(std::is_pod<T>::value, "Not applyable");
    constexpr auto res = as_tuple_impl(
            array_of_type_ids_to_index_sequence<T>(
                    std::make_index_sequence< decltype(array_of_type_ids<T>())::size() >()
            )
    );
    static_assert(sizeof(res) == sizeof(T), "sizes check failed");
    static_assert(
            std::alignment_of<decltype(res)>::value == std::alignment_of<T>::value,
            "alignment check failed"
    );
    return res;
}

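id_to_type returns an instance of the type that corresponds to an id, so decltype is still needed to deduce the type. With this we can obtain a tuple type from T. The next step is to assign the value of T to the tuple, after which the fields of T can be accessed by index.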
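Both steps can be checked on a fixed input. In the sketch below, foo_type_ids is a hypothetical stand-in for the trimmed id array of foo, and ids_as_sequence and sequence_as_tuple are illustrative names for the two steps; the ids 6 and 9 are assumed to encode int and char.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

template <class T, std::size_t N>
struct array {
    T data[N];
    static constexpr std::size_t size() noexcept { return N; }
};

template <std::size_t I, class T, std::size_t N>
constexpr const T& get(const array<T, N>& a) noexcept { return a.data[I]; }

constexpr auto id_to_type(std::integral_constant<std::size_t, 6>) noexcept { int res{}; return res; }
constexpr auto id_to_type(std::integral_constant<std::size_t, 9>) noexcept { char res{}; return res; }

// Hypothetical stand-in for the trimmed id array of struct foo { int; char; }.
constexpr array<std::size_t, 2> foo_type_ids() noexcept { return {{6, 9}}; }

// Step 1: array values become the values of an index_sequence.
template <std::size_t... I>
constexpr auto ids_as_sequence(std::index_sequence<I...>) noexcept {
    constexpr auto a = foo_type_ids();
    return std::index_sequence<get<I>(a)...>{};
}

// Step 2: each id in the sequence is mapped back to its type.
template <std::size_t... Id>
constexpr auto sequence_as_tuple(std::index_sequence<Id...>) noexcept {
    return std::tuple<decltype(id_to_type(std::integral_constant<std::size_t, Id>{}))...>{};
}

using foo_id_sequence = decltype(ids_as_sequence(std::make_index_sequence<2>{}));
using foo_tuple = decltype(sequence_as_tuple(foo_id_sequence{}));

static_assert(std::is_same<foo_id_sequence, std::index_sequence<6, 9>>::value,
              "positions 0, 1 were replaced by the ids 6, 9");
static_assert(std::is_same<foo_tuple, std::tuple<int, char>>::value,
              "ids 6, 9 map back to int, char");
```

Note the two different roles of index_sequence: the input sequence holds positions in the array, while the output sequence holds type ids.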

Assigning a POD to a tuple

With the clang compiler, a POD struct can be converted directly to std::tuple, so for clang the work ends here.

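template <std::size_t I, class T>
decltype(auto) get(const T& val) noexcept {
    // The cast target was lost in the source text; a const pointer to the
    // tuple type produced by as_tuple<T>() is assumed here.
    auto t = reinterpret_cast<const decltype(as_tuple<T>())*>( std::addressof(val) );
    return get<I>(*t);
}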

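With other compilers such as MSVC or GCC, however, the elements of std::tuple are not laid out contiguously in the same order as the struct's fields, so T cannot be converted to the tuple directly. The more general approach is to build a tuple whose memory is contiguous; T can then be converted directly to that tuple.

A tuple with contiguous memory

The following code implements a tuple with contiguous memory: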

template <std::size_t I, class T>
struct base_from_member {
    T value;
};

template <class I, class ...Tail>
struct tuple_base;

template <std::size_t... I, class ...Tail>
struct tuple_base< std::index_sequence<I...>, Tail... >
        : base_from_member<I, Tail>...
{
    static constexpr std::size_t size_v = sizeof...(I);

    constexpr tuple_base() noexcept = default;
    constexpr tuple_base(tuple_base&&) noexcept = default;
    constexpr tuple_base(const tuple_base&) noexcept = default;

    constexpr tuple_base(Tail... v) noexcept
            : base_from_member<I, Tail>{ v }...
    {}
};

template <>
struct tuple_base< std::index_sequence<> > {
    static constexpr std::size_t size_v = 0;
};

template <class ...Values>
struct tuple: tuple_base<
        std::make_index_sequence<sizeof...(Values)>,
        Values...>
{
    using tuple_base<
            std::make_index_sequence<sizeof...(Values)>,
            Values...
    >::tuple_base;
};
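The listing does not show how elements of this tuple are read. One common way, sketched below under stated assumptions, is to let the index select the matching base class. The accessor names get_impl and get are hypothetical, the tuple is reduced to the parts needed for the demonstration (the defaulted constructors and the empty specialization are omitted), and the size comparison with foo holds on mainstream ABIs but is not guaranteed by the standard.

```cpp
#include <cstddef>
#include <utility>

template <std::size_t I, class T>
struct base_from_member {
    T value;
};

template <class Seq, class... Ts>
struct tuple_base;

// One base class per element; bases are laid out in declaration order.
template <std::size_t... I, class... Ts>
struct tuple_base<std::index_sequence<I...>, Ts...> : base_from_member<I, Ts>... {
    constexpr tuple_base(Ts... v) noexcept : base_from_member<I, Ts>{v}... {}
};

template <class... Ts>
struct tuple : tuple_base<std::make_index_sequence<sizeof...(Ts)>, Ts...> {
    using tuple_base<std::make_index_sequence<sizeof...(Ts)>, Ts...>::tuple_base;
};

// Hypothetical accessor: with I fixed, exactly one base matches, so T is deduced.
template <std::size_t I, class T>
constexpr const T& get_impl(const base_from_member<I, T>& b) noexcept {
    return b.value;
}

template <std::size_t I, class... Ts>
constexpr decltype(auto) get(const tuple<Ts...>& t) noexcept {
    return get_impl<I>(t);
}

struct foo { int some_integer; char c; };

constexpr tuple<int, char> t{777, '!'};
static_assert(get<0>(t) == 777, "element 0 is the int");
static_assert(get<1>(t) == '!', "element 1 is the char");
// Assumption: true on mainstream ABIs, where the bases follow field order.
static_assert(sizeof(tuple<int, char>) == sizeof(foo), "same size as the struct");
```

Because every element lives in its own indexed base class, no recursion is needed to reach element I, and the elements appear in memory in the order in which they are listed.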
