I already wrote about Schnorr and BLS signatures, and I think they are really great. One of the most exciting properties of these signature schemes is key aggregation: if we want to make a 2-of-2 multisignature address, we just take our public keys and add them together. The signature is also just the sum of two signatures. This is possible because both Schnorr and BLS are linear: the sum of the verification equations is also a valid equation.
For example, to verify a Schnorr signature (R, s) = (k×G, k+hash(P,R,m)⋅pk) we need to confirm that s×G = R + hash(P,R,m)×P. This means that if we use two private keys pk1, pk2 with corresponding public keys P1=pk1×G, P2=pk2×G, we can add them up to get a multisig key P = P1+P2. And to generate a multisignature we just add our random points and partial signatures: (R, s) = (R1+R2, s1+s2). We only need to agree on R=R1+R2 in advance, as it is used in the hash of the message. If the equations above look confusing to you, check out my previous post on Schnorr signatures.
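To see this linearity in action, here is a tiny sanity check in Python. It uses the `ecdsa` package purely for secp256k1 point arithmetic, and all the keys and nonces are made-up toy numbers, so treat it as an illustration rather than a real multisignature implementation:

```python
from ecdsa import SECP256k1
from hashlib import sha256

G, n = SECP256k1.generator, SECP256k1.order

def h(P, R, m):
    # hash(P, R, m) reduced into the group order
    data = P.x().to_bytes(32, 'big') + R.x().to_bytes(32, 'big') + m
    return int.from_bytes(sha256(data).digest(), 'big') % n

m = b'hello'
pk1, pk2 = 123456789, 987654321     # toy private keys
k1, k2 = 555, 777                   # toy nonces, never reuse them in real life

P = G * pk1 + G * pk2               # aggregated public key P = P1 + P2
R = G * k1 + G * k2                 # aggregated nonce R = R1 + R2, agreed in advance
e = h(P, R, m)

s1 = (k1 + e * pk1) % n             # each party signs with the common R and P
s2 = (k2 + e * pk2) % n
s = (s1 + s2) % n                   # the multisignature is just the sum

assert G * s == R + P * e           # s×G = R + hash(P,R,m)×P still holds
print("aggregated Schnorr signature verifies")
```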
With ECDSA everything is a bit more complicated. The verification equation is not linear. To generate a signature we choose a random number k with the corresponding point R = k×G and compute s = (z+r⋅pk)/k. Here z=hash(m) is the hash of the message we are signing and r is the x-coordinate of our random point R. The annoying part here is the division by k: a simple addition of the equations doesn’t work anymore. But multiplication does! We just need to be careful with it. There is a nice trick, shown in the paper by Yehuda Lindell, that allows us to do 2-party ECDSA and generate a common signature.
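To make these equations concrete, here is plain single-party ECDSA written out by hand, again with the `ecdsa` package for point math and toy numbers; note how the division by k turns into a modular inverse:

```python
from ecdsa import SECP256k1
from hashlib import sha256

G, n = SECP256k1.generator, SECP256k1.order

pk = 123456789                          # toy private key
P = G * pk                              # public key
z = int.from_bytes(sha256(b'hello').digest(), 'big') % n

k = 998877                              # random number (toy value)
R = G * k
r = R.x() % n                           # x-coordinate of the random point
s = (z + r * pk) * pow(k, -1, n) % n    # s = (z + r⋅pk)/k, division is mod n

# verification: (z/s)×G + (r/s)×P must land on a point with x-coordinate r
s_inv = pow(s, -1, n)
assert (G * (z * s_inv % n) + P * (r * s_inv % n)).x() % n == r
print("ECDSA signature verifies")
```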
2-party ECDSA at a glance
To make key aggregation work with ECDSA we need to use multiplication instead of addition. From two private keys pk1, pk2 and corresponding public keys P1=pk1×G, P2=pk2×G we calculate an aggregated public key P=pk1×P2=pk2×P1=pk1⋅pk2×G. This is a standard Diffie-Hellman key exchange: each party takes the other party’s public key and multiplies it by his own private key. Now both parties know the common public key without revealing anything about their private keys.
To generate a valid signature for the message m with hash z=hash(m) we need to generate two random numbers k1, k2 (one for each party) and then somehow calculate the signature (r, s). Calculating r is easy: we do the same Diffie-Hellman key exchange. The parties send each other their random points R1=k1×G, R2=k2×G and calculate the common point R=k1×R2=k2×R1=k1⋅k2×G. The part with s is much more complicated: we need to compute s=(z+r⋅pk1⋅pk2)/k1/k2 in such a way that the private key and random number of one party stay unknown to the other.
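A quick sketch of this key and nonce aggregation, with toy numbers and the `ecdsa` package for point arithmetic, might look like this:

```python
from ecdsa import SECP256k1

G = SECP256k1.generator

pk1, pk2 = 123456789, 987654321
P1, P2 = G * pk1, G * pk2       # the parties exchange these public keys

# each party multiplies the other party's point by its own secret
assert P2 * pk1 == P1 * pk2     # both arrive at the same P = pk1⋅pk2×G
P = P2 * pk1

k1, k2 = 555, 777
R1, R2 = G * k1, G * k2         # the same Diffie-Hellman trick for the random point
assert R2 * k1 == R1 * k2
R = R2 * k1                     # common point R = k1⋅k2×G
print("shared public key and random point agree")
```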
For this purpose, we can use homomorphic encryption, in particular the Paillier scheme. Homomorphic encryption is a wonderful tool: with it, we can do computations on encrypted data without learning anything about the data itself. I will explain how it works a bit later; for now, try to imagine: we can send an encrypted secret to another party and he can add something to it, multiply it by some number and then return us the result without getting any information about what he just did. Sounds magical! And it really is. Modern cryptography is so exciting!
To calculate our common signature we need to do the following: the first party encrypts his private key pk1 and sends the encrypted value e(pk1) to the other party. The second party, using this encrypted key, creates a partial signature s’=(z+r⋅e(pk1)⋅pk2)/k2 and sends it back to the first party. As the private key of the first party is encrypted, the second party learns nothing about it. And as the second party uses both his private key pk2 and his random number k2, the first party learns nothing either. Now the first party can decrypt the returned value and divide it by k1. Finally, we get our signature s=(z+r⋅pk1⋅pk2)/k2/k1.
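As a sanity check of the formulas (not of the protocol itself), here is what happens if we ignore the encryption and keep both secrets in one script: the combined s really is a valid ECDSA signature under the aggregated key. The numbers are toy values and the `ecdsa` package is used only for point math.

```python
from ecdsa import SECP256k1
from hashlib import sha256

G, n = SECP256k1.generator, SECP256k1.order

pk1, pk2 = 123456789, 987654321
k1, k2 = 555, 777
P = G * (pk1 * pk2 % n)                         # aggregated key pk1⋅pk2×G
R = G * (k1 * k2 % n)                           # aggregated random point k1⋅k2×G
r = R.x() % n
z = int.from_bytes(sha256(b'hello').digest(), 'big') % n

# party 2 would compute this over the encrypted pk1; here everything is in the clear
s_partial = (z + r * pk1 * pk2) * pow(k2, -1, n) % n
s = s_partial * pow(k1, -1, n) % n              # party 1 finishes by dividing by k1

# standard ECDSA verification against the aggregated public key
s_inv = pow(s, -1, n)
assert (G * (z * s_inv % n) + P * (r * s_inv % n)).x() % n == r
print("combined signature verifies under the aggregated key")
```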
Notice that we need to get the encrypted value of the private key e(pk1) only once, during the setup phase, and then we can reuse it for every signature in the future. Even better, thanks to the homomorphic properties of the encryption scheme we can use HD wallets to generate new encrypted child keys from the encrypted master key. To derive a child key we only need to add a certain number to the parent private key, and we can easily do that homomorphically with the encrypted master key we have.
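As a small illustration of that, here is what such a derivation could look like with the python-paillier (`phe`) package. Both the library choice and the tweak value are just my assumptions for the example, not something the original scheme prescribes:

```python
from phe import paillier

# Paillier key pair of the first party; party 2 only ever sees the public key
paillier_pub, paillier_priv = paillier.generate_paillier_keypair()

parent_key = 123456789                         # party 1's private key (toy value)
enc_parent = paillier_pub.encrypt(parent_key)  # the ciphertext that party 2 stores

tweak = 42                                     # hypothetical HD derivation tweak
enc_child = enc_parent + tweak                 # homomorphic addition on the ciphertext
# (a real wallet would also reduce the result modulo the curve order)

# party 2 never decrypts anything; we decrypt here only to check the result
assert paillier_priv.decrypt(enc_child) == parent_key + tweak
print("derived the child key homomorphically")
```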
Unfortunately, this scheme works only for two parties. If we want to use an arbitrary m-of-n multisignature with bare ECDSA we can still do it, but it requires a much more complicated scheme, and it is 100 times slower. But even with only two parties we can do many amazing things: all Lightning channels can appear as normal transactions (pay-to-pubkeyhash), and we can even do scriptless scripts with ECDSA.
Homomorphic encryption
So how does this homomorphic magic work? As usual in cryptography, we use huge numbers everywhere. We start by choosing two large prime numbers p and q of the same length (if they have different lengths we need a fancier algorithm). We will use their product n=p⋅q and the number g=n+1 for encryption. These two numbers (n, g) are public and can be shared with anyone. Two more numbers, λ=lcm(p–1, q–1) (which equals (p–1)⋅(q–1)/2 when gcd(p–1, q–1)=2) and µ=λ^(–1) mod n, are used for decryption. We need to keep them secret.
This λ number is pretty interesting. If we take any number r that has no common factors with n and calculate r^λ mod n, we’ll get 1. If we calculate r^(λ⋅n) mod n² we will also get 1. This means we can compute the inverse of a number as r^(–1)=r^(λ–1) mod n. It also works for λ itself:
µ=λ^–1 = λ^(λ–1) mod n
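These properties are easy to check numerically with tiny toy primes, something along these lines:

```python
from math import lcm

p, q = 11, 17                                # tiny toy primes
n = p * q
lam = lcm(p - 1, q - 1)

r = 123                                      # any number with no common factors with n
assert pow(r, lam, n) == 1                   # r^λ mod n = 1
assert pow(r, lam * n, n * n) == 1           # r^(λ⋅n) mod n² = 1
assert r * pow(r, lam - 1, n) % n == 1       # so r^(λ–1) is the inverse of r
assert lam * pow(lam, lam - 1, n) % n == 1   # µ = λ^(λ–1) mod n inverts λ
print("the λ tricks work")
```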
Now, to encrypt a secret number x we pick a random number r and compute the encrypted value e(x) = g^x ⋅ r^n mod n². All the operations with encrypted data happen modulo n². It is a pretty large number, so homomorphic calculations can be slow and memory-consuming. If we want to add a number a to the value x using only encrypted data, we just multiply it by g^a:
e(x)⋅g^a = g^(x+a)⋅r^n mod n² = e(x+a)
To multiply x by some number b we need to exponentiate the encrypted value:
e(x)^b = g^(x⋅b)⋅(r^b)^n mod n² = e(x⋅b)
The random number changes from r to r^b, but we don’t really care. It’s just a different random number.
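Both identities are easy to verify with small toy numbers; if we fix the randomness r, the ciphertexts match exactly:

```python
p, q = 11, 17               # tiny toy primes
n = p * q
n2 = n * n
g = n + 1

def encrypt(x, r):
    # e(x) = g^x ⋅ r^n mod n²
    return pow(g, x, n2) * pow(r, n, n2) % n2

x, r = 5, 123
a, b = 7, 9

# adding a: e(x)⋅g^a is an encryption of x+a (with the same randomness r)
assert encrypt(x, r) * pow(g, a, n2) % n2 == encrypt(x + a, r)

# multiplying by b: e(x)^b is an encryption of x⋅b, with randomness r^b
assert pow(encrypt(x, r), b, n2) == encrypt(x * b, pow(r, b, n2))
print("homomorphic addition and multiplication hold")
```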
After we are done with the calculations we can decrypt the data and get the result. To extract x from the ciphertext we do the following:
x = (e(x)^λ mod n² – 1) / n ⋅ µ mod n
Looks confusing. Let’s take a closer look. First, we take our ciphertext e(x) and raise it to the power of λ:
e(x)^λ mod n² = g^(x⋅λ) ⋅ r^(n⋅λ) mod n² = g^(x⋅λ) mod n²
Here we used the fact that r^(n⋅λ) mod n² = 1, so the r part disappears.
Now, recall that g=n+1. Using the binomial theorem we can expand (1+n)^x, and after taking it modulo n² only the first two terms remain:
g^x mod n² = (1+n)^x mod n² = (1 + x⋅n + x⋅(x-1)/2 ⋅ n² + …) mod n² = 1 + x⋅n
In our case we have:
g^(λ⋅x) mod n² = (1 + λx⋅n + λx⋅(λx-1)/2 ⋅ n² + …) mod n² = 1 + λx⋅n
From here we subtract 1, divide by n, get rid of the λ by multiplying by its inverse µ, and end up with pure x.
If you don’t believe the above, try it out yourself. I wrote a tiny Jupyter notebook just to make sure it works.
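If you prefer to start from something minimal, here is a rough sketch of the whole scheme in that spirit. The primes are tiny hard-coded toy values and there is no real randomness or side-channel hardening, so it is only for playing around:

```python
from math import gcd, lcm
from secrets import randbelow

def keygen(p=293, q=433):                    # tiny toy primes of the same length
    n = p * q
    lam = lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                     # µ = λ^(–1) mod n
    return (n, n + 1), (n, lam, mu)          # public (n, g), secret (n, λ, µ)

def encrypt(pub, x):
    n, g = pub
    while True:
        r = randbelow(n - 2) + 2             # random r, must be coprime with n
        if gcd(r, n) == 1:
            return pow(g, x, n * n) * pow(r, n, n * n) % (n * n)

def decrypt(priv, c):
    n, lam, mu = priv
    # x = (c^λ mod n² – 1) / n ⋅ µ mod n
    return (pow(c, lam, n * n) - 1) // n * mu % n

pub, priv = keygen()
n, g = pub

c = encrypt(pub, 42)
c = c * pow(g, 10, n * n) % (n * n)          # homomorphically add 10
c = pow(c, 3, n * n)                         # homomorphically multiply by 3
assert decrypt(priv, c) == (42 + 10) * 3     # decrypts to 156
print("Paillier round trip works")
```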
Further reading
Here we discussed the basic idea of the algorithm, but a few things are left outside the scope of this post. In particular, the setup phase of the algorithm is a bit more complicated. The second party needs to make sure that the ciphertext he got from the first party corresponds to the public key P1, so the first party needs to prove the connection between the ciphertext e(pk1) and the public point P1. This is quite tricky and computationally heavy and can take a few seconds on a modern computer. But as soon as the setup phase is done, all the signing happens pretty quickly, as all the proofs there are fast.
There are more interesting things in the paper, and I strongly recommend reading it if you can. Also, homomorphic encryption is really wonderful, and I covered only the core principle without going into details.
And we didn’t discuss the applications of this scheme in detail, so check out these papers to find out more: