zk-SNARKs的原理简介

【主要参看了 V神博客 An approximate introduction to how zk-SNARKs are possible】

1. 引言

最近十年最强大的密码学技术可能就是：通用简洁的零知识证明，又称为zk-SNARKs，全称为"zero knowledge succinct arguments of knowledge"。

所谓zk-SNARK，是指：可为生成特定输出的计算提供相应的proof证明，使得验证proof的速度远远快于执行相应计算的速度。

而ZK (zero knowledge) 是附加属性：proof 可隐藏计算的某些输入值。

如声称：我知道某个秘密数字，使得对cow+该秘密数字执行1亿次SHA256 hash运算，其输出结果以0x57d00485aa 打头。

借助zk-SNARKs为以上声明生成proof，可使Verifier验证该proof的速度远远快于其直接运行1亿次hash运算（此时Verifier需知悉该秘密数字）。而zk-SNARKs的proof可保证该秘密数字不被泄露。

在区块链背景下，有两类很强大的应用程序：【本文重点关注扩展性问题。】

扩展性：若某个区块直接验证的时间很长，可改为由一人验证并生成proof，而网络中的其它人快速验证该proof，而不再需要每个人都花很长时间来直接验证。
隐私性：在交易中，你需要能证明你拥有某种未花费的资产，但是不想暴露资产的整个来源去向，可解决比特币等区块链平台中交易透明性带来的信息泄露，如who is transacting with whom and the amount。

zk-SNARKs的实现非常复杂，在2014-2017年期间，人们常称其为"moon math"，近几年，经过科学家的努力，整个zk-SNARKs协议变得越来越简单和容易理解。

本文重点是向具有中等数学知识的人介绍zk-SNARKs的原理。

2. 为什么zk-SNARKs “应该”是很难实现的？

仍然以上节的例子为例：我知道某个秘密数字，使得对cow+该秘密数字执行1亿次SHA256 hash运算，其输出结果以0x57d00485aa 打头。

我们希望，整个实现过程具有如下属性：

succinct 简洁性：是指proof size和验证proof的时间都应该小于直接计算。若我们想要有“succinct” proof，则不能让Verifier在每轮都做hash运算，否则验证时间就与计算量成正比了。Verifier必须能对整个（1亿次）运算进行运算，而不是对每个单独的运算进行验证。

一种具有succinct特性的自然的应对技术为random sampling（随机取样）：即Verifier可随机选500个不同的位置进行取样并验证是否正确，若这500个随机取样都验证通过，则可假设剩余的 1亿-500 次运算有很大的概率是正确的。

同时，可借助Fiat-Shamir heuristic来将整个过程转为non-interactive proof： Prover computes a Merkle root of the computation，uses the Merkle root to pseudorandomly choose 500 indices, and provides the 500 corresponding Merkle branches of the data。

核心思想为：Prover does not know the hash until they have already “committed to” the data。若a malicous prover 在知悉取样位置后试图篡改数据，则其需要修改Merkle root，对应为a new set of random indices，对应需要再次篡改数据。。。如此往复，使得malicous prover进入了一个死循环。

但是随机采样验证存在一个致命缺陷：如下图所示，若malicious prover在计算的中间仅篡改了某个bit，使得输出结果截然不同，而Verifier借助随机采样验证可能永远都无法发现。在这里插入图片描述而借助zk-SNARK protocol，可解决以上问题，使得Verifier可check every single piece of the computation, without looking at each piece of the computation individually.

3. zk-SNARKs基础概念 3.1 polynomial 多项式

多项式为一类特殊的代数表达式，形如：【即为有限个形如 c x k cx^k cxk term 之和。】

x + 5 x+5 x+5
x 4 x^4 x4
x 3 + 3 x 2 + 3 x + 1 x^3+3x^2+3x+1 x3+3x2+3x+1
628 x 271 + 318 x 270 + 530 x 269 + ⋯ + 69 x + 381 628x^{271}+318x^{270}+530x^{269}+\cdots +69x+381 628x271+318x270+530x269+⋯+69x+381【其实即为表示具有816 digits 的 τ = 2 π \tau=2\pi τ=2π的多项式】

多项式有很多有趣的地方，特别的一点是： polynomials are a single mathematical object that can contain an unbounded amount of information (想象其为a list of intergers即可)。

而多项式等式代表的是an unbounded number of equations between numbers。如，若 A ( x ) + B ( x ) = C ( x ) A(x)+B(x)=C(x) A(x)+B(x)=C(x)成立，则以下皆成立：【以及其它任意点也都成立】

A ( 0 ) + B ( 0 ) = C ( 0 ) A(0)+B(0)=C(0) A(0)+B(0)=C(0)
A ( 1 ) + B ( 1 ) = C ( 1 ) A(1)+B(1)=C(1) A(1)+B(1)=C(1)
A ( 2 ) + B ( 2 ) = C ( 2 ) A(2)+B(2)=C(2) A(2)+B(2)=C(2)
A ( 3 ) + B ( 3 ) = C ( 3 ) A(3)+B(3)=C(3) A(3)+B(3)=C(3)

因此，可以 construct polynomials to deliberately represent sets of numbers so you can check many equations all at once. 如，假设你想验证以下等式是否成立：

12 + 1 = 13 12 + 1=13 12+1=13
10 + 8 = 18 10+8=18 10+8=18
15 + 8 = 23 15+8=23 15+8=23
15 + 13 = 28 15+13=28 15+13=28

可通过 Lagrange interpolation 来构建多项式 A ( x ) ， B ( x ) , C ( x ) A(x)，B(x),C(x) A(x)，B(x),C(x)，即在特定的坐标，如 ( 0 , 1 , 2 , 3 ) (0,1,2,3) (0,1,2,3)， A ( x ) A(x) A(x)的输出分别为 ( 12 , 10 , 15 , 15 ) (12,10,15,15) (12,10,15,15)， B ( x ) B(x) B(x)的输出分别为 ( 1 , 8 , 8 , 13 ) (1,8,8,13) (1,8,8,13)， C ( x ) C(x) C(x)的输出分别为 ( 13 , 18 , 23 , 28 ) (13,18,23,28) (13,18,23,28)，对应的多项式为：

A ( x ) = − 2 x 3 + 19 2 x 2 − 19 2 x + 12 A(x)=-2x^3+\frac{19}{2}x^2-\frac{19}{2}x+12 A(x)=−2x3+219x2−219x+12
B ( x ) = 2 x 3 − 19 2 x 2 + 29 2 x + 1 B(x)=2x^3-\frac{19}{2}x^2+\frac{29}{2}x+1 B(x)=2x3−219x2+229x+1
C ( x ) = 5 x + 13 C(x)=5x+13 C(x)=5x+13

仅需要验证 A ( x ) + B ( x ) = C ( x ) A(x)+B(x)=C(x) A(x)+B(x)=C(x) 多项式等式是否成立，即可一次性验证以上四个方程式。

3.2 多项式的自我比较

用于check relations between a large number of adjacent evaluations of the same polynomial using a simple polynomial equation。

如，已知多项式 F F F，满足 F ( x + 2 ) = F ( x ) + F ( x + 1 ) F(x+2)=F(x)+F(x+1) F(x+2)=F(x)+F(x+1) within the integer range { 0 , 1 , ⋯ , 98 } \{0,1,\cdots, 98\} {0,1,⋯,98}，若 F ( 0 ) = F ( 1 ) = 1 F(0)=F(1)=1 F(0)=F(1)=1，则 F ( 100 ) F(100) F(100)为第100个 Fibonacci number。

多项式 F ( x + 2 ) − F ( x + 1 ) − F ( x ) F(x+2)-F(x+1)-F(x) F(x+2)−F(x+1)−F(x) 不一定零多项式，因为其可give arbitrary answers outside the range x = { 0 , 1 , ⋯ , 98 } x=\{0,1,\cdots,98\} x={0,1,⋯,98}。而，若a polynomial P P P 为 0 across some set S = { x 1 , x 2 , ⋯ , x n } S=\{x_1,x_2,\cdots,x_n\} S={x1,x2,⋯,xn}，则可将其表示为： P ( x ) = Z ( x ) ∗ H ( x ) P(x)=Z(x)*H(x) P(x)=Z(x)∗H(x) 其中 Z ( x ) = ( x − x 1 ) ∗ ( x − x 2 ) ∗ ⋯ ∗ ( x − x n ) Z(x)=(x-x_1)*(x-x_2)*\cdots *(x-x_n) Z(x)=(x−x1)∗(x−x2)∗⋯∗(x−xn)， H ( x ) H(x) H(x) 也为一个多项式。也就是说，任意多项式that equals zero across some set is a (polynomial) multiple of the simplest (lowest-degree) polynomial that equals zero across that same set。

多项式的因式分解定理，有计算 P ( x ) / Z ( x ) P(x)/Z(x) P(x)/Z(x)，得商 Q ( x ) Q(x) Q(x)和余数 R ( x ) R(x) R(x)，满足： P ( x ) = Z ( x ) ∗ Q ( x ) + R ( x ) P(x)=Z(x)*Q(x)+R(x) P(x)=Z(x)∗Q(x)+R(x) 其中余数 R ( x ) R(x) R(x)的degree肯定比 Z ( x ) Z(x) Z(x)小。

若 P P P为0 on all of S S S，则 R R R也必须为0 on all S S S。因此，也可以借助多项式插值来计算 R ( x ) R(x) R(x)， R ( x ) R(x) R(x)的degree最大为 n − 1 n-1 n−1，其中 n n n为 S S S中元素的个数。

Interpolating a polynomial with all zeros gives the zero polynomial，因此有 R ( x ) = 0 R(x)=0 R(x)=0且 H ( x ) = Q ( x ) H(x)=Q(x) H(x)=Q(x)。

回到以上的例子中，即多项式 F F F encodes Fibonacci numbers （使得 F ( x + 2 ) = F ( x ) + F ( x + 1 ) F(x+2)=F(x)+F(x+1) F(x+2)=F(x)+F(x+1) across x = { 0 , 1 , ⋯ , 98 } x=\{0,1,\cdots,98\} x={0,1,⋯,98}），转为证明： P ( x ) = F ( x + 2 ) − F ( x + 1 ) − F ( x ) P(x)=F(x+2)-F(x+1)-F(x) P(x)=F(x+2)−F(x+1)−F(x) is zero over the range x = { 0 , 1 , ⋯ , 98 } x=\{0,1,\cdots,98\} x={0,1,⋯,98}。

若告知商： H ( x ) = F ( x + 2 ) − F ( x + 1 ) − F ( x ) Z ( x ) H(x)=\frac{F(x+2)-F(x+1)-F(x)}{Z(x)} H(x)=Z(x)F(x+2)−F(x+1)−F(x) 其中 Z ( x ) = ( x − 1 ) ∗ ( x − 2 ) ∗ ⋯ ∗ ( x − 98 ) Z(x)=(x-1)*(x-2)*\cdots *(x-98) Z(x)=(x−1)∗(x−2)∗⋯∗(x−98)

则可自己计算 Z ( x ) Z(x) Z(x)，然后验证等式，若验证通过则说明 F ( x ) F(x) F(x)符合要求。【此时需要将多项式 P ( x ) P(x) P(x)告知给Verifier，而借助后续的polynomial commitment可保证 P ( x ) P(x) P(x)为secret info，同时使得verify equations with polynomials that’s much faster than checking each coefficient。】

至此，将100-step-long computation (computing the 100th Fibonacci number) 转换为了 a single equation with polynomials。当然，证明 the N N N-th number 不是啥特别有用的任务，特别是Fibonacci numbers 具有 Closed-form expression。但是，借助类似的技术，just with some extra polynomials and some more complicated equations，可对具有任意步骤的任意计算进行相应的encode。

3.3 polynomial commitment

可将polynomial commitment看成是对多项式的特殊hash计算，该特殊的hash计算具有如下属性： can check equations between polynomials by checking equations between their hashes。

不同的polynomial commitment scheme具有不同的属性 in terms of exactly what kinds of equations you can check。

以 com ( P ) \text{com}(P) com(P)来表示“the commitment to the polynomial P P P”，常见的关系有：

加法：已知 com ( P ) , com ( Q ) , com ( R ) \text{com}(P),\text{com}(Q),\text{com}(R) com(P),com(Q),com(R)，验证 P + Q = R P+Q=R P+Q=R是否成立？
乘法：已知 com ( P ) , com ( Q ) , com ( R ) \text{com}(P),\text{com}(Q),\text{com}(R) com(P),com(Q),com(R)，验证 P ∗ Q = R P*Q=R P∗Q=R是否成立？
Evaluate at a point：已知 P , w , z P,w,z P,w,z和一个补充的proof （或“witness”） Q Q Q，验证 P ( w ) = z P(w)=z P(w)=z是否成立？

值得注意的是，这些primitive之间可以相互转换，如：

若有加法和乘法，则可实现evaluate：为了证明 P ( w ) = z P(w)=z P(w)=z，可构建 Q ( x ) = P ( x ) − z x − w Q(x)=\frac{P(x)-z}{x-w} Q(x)=x−wP(x)−z，Verifier仅需验证 Q ( x ) ∗ ( x − w ) + z = P ( x ) Q(x)*(x-w)+z=P(x) Q(x)∗(x−w)+z=P(x) 是否成立。
若有evaluate，则可做加法和乘法验证：原因在于根据 Schwartz-Zippel lemma 有：若some equation involving some polynomials holds true at a randomly selected coordinate，则其almost certainly holds true for the polynomials as a whole。因此，若有a mechanism to prove evaluations，则可借助interactive game来证明 P ( x + 2 ) − P ( x + 1 ) − P ( x ) = Z ( x ) ∗ H ( x ) P(x+2)-P(x+1)-P(x)=Z(x)*H(x) P(x+2)−P(x+1)−P(x)=Z(x)∗H(x)：之前也提及，可借助Fiat-Shamir heuristic将以上过程改为non-interactive： Prover 计算 r = h a s h ( com ( P ) , com ( H ) ) r=hash(\text{com}(P), \text{com}(H)) r=hash(com(P),com(H))，此处的hash运算为任意的密码学hash函数，不需要具有任何特殊属性。 The prover cannot “cheat” by picking P and H that “fit” at that particular r but not elsewhere, because they do not know r at the time that they are picking P and H.

目前为止的简要回顾为：

zk-SNARKs are hard because the verifier needs to somehow check millions of steps in a computation, without doing a piece of work to check each individual step directly (as that would take too long).
可通过将computation encode 为 polynomials 来实现批量验证。
单个的多项式可包含无限量的信息，且单个的多项式表达式（如 P ( x + 2 ) − P ( x + 1 ) − P ( x ) = Z ( x ) ∗ H ( x ) P(x+2)-P(x+1)-P(x)=Z(x)*H(x) P(x+2)−P(x+1)−P(x)=Z(x)∗H(x)）可代表无限多个数字之间的等式关系。
若可verify the equation with polynomials，则相当于同时隐式地verify all of the number equations（将 x x x替换为任意实际的x坐标值）。
可借助polynomial commitment (一种对多项式的特殊hash算法)，实现对equation between polynomials 快速验证，无论实际的polynomial是否非常大。【而不需要对多项式的每个系数进行比较，以判断两个多项式是否相等。】

3.3.1 polynomial commitment scheme

现有的polynomial commitment scheme主要有以下三种：

Kate polynomial commitment：具体可参见Dankrad Feist的介绍 Kate polynomial commitments
bulletproofs commitment：具体可参见curve25519-dalek团队的介绍 Module bulletproofs:: notes::inner_product_proof
FRI (Fast Reed-Solomon Interactive Oracle Proofs of Proximity)：具体可参见V神的介绍 STARKs, Part II: Thank Goodness It’s FRI-day

以上三种中，Kate polynomial commitment需要用到 elliptic curve pairing。相对来说，FRI更容易理解。

3.3.2 FRI

FRI (Fast Reed-Solomon Interactive Oracle Proofs of Proximity)。以下对FRI做个简单介绍，将忽略一些技巧和优化。

假设有个degree < n < n

zk-SNARKs的原理简介

[ 申请 ]友情链接：