Cryptography – Specter Solutions | Your Keys, Your Bitcoin https://specter.solutions Tue, 19 Oct 2021 09:55:16 +0000 en-US hourly 1 https://wordpress.org/?v=6.4.2 https://specter.solutions/wp-content/uploads/2020/12/cropped-1200px-Bitcoin.svg-32x32.png Cryptography – Specter Solutions | Your Keys, Your Bitcoin https://specter.solutions 32 32 ECDSA is not that bad: two-party signing without Schnorr or BLS https://specter.solutions/ecdsa-is-not-that-bad-two-party-signing-without-schnorr-or-bls/ Sun, 14 Oct 2018 19:40:26 +0000 https://specter.solutions/?p=3583 ECDSA is not that bad: two-party signing without Schnorr or BLS Read More »

]]>

I already wrote about Schnorr and BLS signatures and I think they are really great. One of the most exciting properties of these signature schemes is key aggregation — if we want to make a 2-of-2 multisignature address we just take our public keys and add them together. The signature will be also just a sum of two signatures. This is possible because both Schnorr and BLS are linear — the sum of the verification equations is also a valid equation.

For example, to verify Schnorr signature (R, s) = (k×G, k+hash(P,R,m)⋅pk) we need to confirm that s×G = R + hash(P,R,m)×P. This means that if we use two private keys pk1, pk2 with corresponding public keys P1=pk1×G, P2=pk2×G we can add them up to get a multisig key P = P1+P2. And to generate a multisignature we just add our random numbers and signatures: (R, s) = (R1+R2, s1+s2). We only need to agree on R=R1+R2 in advance as it is used in the hash of the message. If the equations above look confusing to you, check out my previous post on Schnorr signatures.

With ECDSA everything is a bit more complicated. The verification equation is not linear. To generate a signature we need to choose a random number k with corresponding point R = k×G and compute s = (z+r⋅pk)/k. Here z=hash(m) is the hash of the message we are signing and r is an x-coordiante of our random point R. Annoying part here is this division by k. A simple addition of the equations doesn’t work anymore. But multiplication does! We just need to be careful with it. There is a nice trick shown in the paper by Yehuda Lindell that allows us to do 2-party ECDSA and generate a common signature.

 

2-party ECDSA at a glance

To make key aggregation work with ECDSA we need to use multiplication instead of addition. From two private keys pk1, pk2 and corresponding public keys P1=pk1×G, P2=pk2×G we calculate an aggregated public key P=pk1×P2=pk2×P1=pk1⋅pk2×G. This is a standard Diffie-Helman key exchange — every party takes the public key of another party and multiplies it by his private key. Now both parties know the common public key without exposing anything about their private keys.

To generate a valid signature for the message m with hash z=hash(m) we need to generate two random numbers k1, k2 (one for every party) and then calculate somehow the signature (r, s). Calculating r is easy — we do the same Diffie-Helman key exchange. Parties send each other their random points R1=k1×G, R2=k2×G and calculate common point R=k1×R2=k2×R1=k1⋅k2×G. The part with s is much more complicated — we need to compute s=(z+r⋅pk1⋅pk2)/k1/k2 in such a way that private key and random number of one party stay unknown to another.

For this purpose, we can use homomorphic encryption, in particular, Paillier scheme. Homomorphic encryption is a wonderful tool — with it, we can do computations on the encrypted data without getting any knowledge of the data itself. I will explain how it works a bit later and for now try to imagine: we can send an encrypted secret to another party and he can add something to it, multiply it by some number and then return us the result without getting any information about what he just did. Sounds magical! And it really is. Modern cryptography is so exciting!

To calculate our common signature we need to do the following: the first party encrypts his private key pk1 and sends the encrypted value e(pk1) to another party. The second party, using this encrypted key, creates a partial signature s’=(z+r⋅e(pk1)⋅pk2)/k2 and sends it back to the first party. As the private key of the first party is encrypted, the second party learns nothing about it. And as the second party uses both his private key pk2 and his random number k2, the first party also learns nothing. Now, the first party can decrypt the returned value and divide it by k1. Finally, we get our signature s=(z+r⋅pk1⋅pk2)/k2/k1.

Simplified diagram of 2-party ECDSA signing. Blue values are public points, red values are secret, orange text represents homomorphic encryption and decryption.

Notice that we need to get an encrypted value of the private key e(pk1)
only once during the setup phase and then we can reuse it for every
signature in the future. Even more, thanks to the homomorphic properties
of the encryption scheme we can use HD wallets
to generate new encrypted children keys from the encrypted master key.
To derive a child key we only need to add a certain number to the parent
private key — we can easily do it homomorphically with the encrypted
master key we have.

Unfortunately, this scheme works only for two parties. If we want to use arbitrary m-of-n multisignature with bare ECDSA we still can do it, but it requires a much more complicated scheme.
And it is 100 times slower. But even with only two parties we can do
many amazing things — all Lightning channels can appear as normal
transactions (pay-to-pubkeyhash), and we can even do scriptless scripts with ECDSA.

 

Homomorphic encryption

So how does this homomorphic magic work? As usual in cryptography, we use huge numbers everywhere. We start by choosing two large prime numbers p and q of the same length (if they have different lengths we need a fancier algorithm). We will use their product n=p⋅q and a number g=n+1 for encryption. These two numbers (n, g) are public and can be shared with anyone. Another pair of numbers, λ=lcm(p–1, q–1)=(p–1)⋅(q–1)/2 and µ=λ^–1 mod n are used for decryption. We need to keep them secret.

This λ number is pretty interesting. If we take any number r and calculate r^λ mod n we’ll get 1. If we calculate r^(λ⋅n) mod n² we will also get 1. This means we can compute an inverse of a number as r^(–1)=r^(λ-1) mod n. It also works for λ itself:

µ=λ^–1 = λ^(λ–1) mod n

Now, to encrypt a secret number x we pick a random number r and compute encrypted value e(x) = g^x ⋅ r^n mod n². All the operations with encrypted data are happening modulo . It is a pretty large number, so homomorphic calculations may be pretty slow and memory consuming. If we want to add a number a to the value x using only encrypted data we just multiply it by g^a:

e(x)⋅g^a = g^(x+a)⋅ r^n mod n² = e(x+a)

To multiply x by some number b we need to exponentiate the encrypted value:

e(x)^b=g^(x⋅b)⋅(r^b)^n mod n² = e(x⋅b)

The random number changes from r to r^b, but we don’t really care. It’s just a different random number.

After we are done with the calculations we can decrypt the data and get the result. To extract x from the cyphertext we do the following:

x = (e(x)^λ mod n² – 1) / n ⋅ µ mod n

Looks confusing. Let’s take a closer look. First, we take our cyphertext e(x) and exponentiate it to the power of λ:

e(x)^λ mod n² = g^(x⋅λ) ⋅ r^(n⋅λ) mod n² = g^(x⋅λ) mod n²

Here we used the fact that r^(n⋅λ) mod n² = 1 and r part disappears.

Now, recall that g=n+1. Using binomial theorem we can expand (1+n)^x and after taking it modulo only first two terms will remain:

g^x mod n² = (1+n)^x mod n² = (1 + x⋅n + x⋅(x-1)/2 ⋅ n² + …) mod n² = (1+x⋅n)

In our case we have:

g^(λ⋅x) mod n² = (1 + λx⋅n + λx⋅(λx-1)/2 ⋅ n² + …) mod n² = 1 + λx⋅n

From here we subtract 1, divide by n, get rid of the λ (multiplying by the inverse of it — µ) and end up with pure x.

If you don’t believe in the above — try it out by yourself. I wrote a tiny jupyter notebook just to make sure it works.

 

Further reading

Here we discussed the basic idea of the algorithm, but there are a few things that are left outside of the scope of this post. In particular, the setup phase of the algorithm is a bit more complicated. The second party needs to make sure that the ciphertext he got from the first party corresponds to the public key P1. So the first party needs to prove the connection between the ciphertext e(pk1) and the public point P1. This is quite tricky and computationally heavy and can take a few seconds on the modern computer. But as soon as the setup phase is done, all the signing happens pretty quickly — all the proofs there are fast.

There are more interesting things in the paper. I strongly recommend reading it if you can. Also, homomorphic encryption is really wonderful and I wrote only the core principle without going into details.

And we didn’t discuss applications of this scheme in details. So check out these papers to find out more:

]]>
BLS signatures: better than Schnorr https://specter.solutions/bls-signatures-better-than-schnorr/ Mon, 25 Jun 2018 21:30:09 +0000 https://specter.solutions/?p=3671 BLS signatures: better than Schnorr Read More »

]]>

In the previous post I wrote about Schnorr signatures and how awesome they are. This one is about Boneh-Lynn-Shacham signatures and their extremely nice features that are not possible with Schnorr.

Shortly, what we know so far:

ECDSA signatures are ok. They do their job and do it well, but nothing more. We can’t combine signatures or keys and every signature has to be verified independently. With multisig transactions, it becomes especially annoying. We have to check all the signatures and the corresponding public keys one by one, waste a lot of space in a block and pay large fees.

Schnorr signatures are awesome — if we do it right we can combine all signatures and public keys in the transaction to a single key and a signature and nobody will find out that they correspond to multiple keys. Also block validation can be faster — we can validate all signatures at once. There are a few issues though:

  • Multisig scheme requires several communication rounds. This can be very annoying with cold storage.
  • With signature aggregation we have to rely on random number generator — we can’t choose random point R deterministically like we do in ECDSA
  • m-of-n multisig scheme is tricky — we need to make a merkle tree of public keys that can get pretty large for large m and n.
  • We can‘t combine all signatures in the block to a single signature.

BLS signatures can fix all of the above. We don’t need random numbers at all, all signatures in the block can be combined to a single signature, m-of-n multisig is very simple and we don’t need several communication rounds between signers. In addition to that BLS signatures are 2 times shorter than Schnorr or ECDSA — signature is not a pair, but a single curve point. Sounds amazing! Let’s see how they work.

 

BLS signatures magic

Before we start we need two new constructions — hashing to the curve and curves pairing.

Hashing to the curve

Normally with ECDSA and Schnorr we hash the message and use this hash in the signing algorithm as a number. For BLS signatures we need a slightly modified hashing algorithm that hashes directly to the elliptic curve. The easiest way is to hash a message as usual and treat the result as an
x-coordinate of a point. Elliptic curves (like the one we are using in Bitcoin) usually have about 2²⁵⁶ points and SHA-256 hashing algorithm also gives a 256-bit result. But for every valid x-coordinate there are two points with positive and negative y-coordinate (just because if (x,y) is on the curve y²=x³+ax+b then (x,-y) is also on the curve). This means that our hash has roughly 50% probability to find two points for some x and 50% to find none.

Toy example of hashing to the elliptic curve y²=x³+7 defined over finite field modulo 23. Only half of all x-coordinates have points. Here only third attempt was successful.

To find a point for any message we can try hashing several times by appending a number to the message and incrementing it on fail. If hash(m||0) doesn’t find a point we try hash(m||1), hash(m||2) and so on until we finally get one that matches. Then we choose one of the two corresponding points, say the one with smaller y, and we are done.

Curves pairing

Another thing we need is a very special function that takes two points P and Q on a curve (or on two different curves) and maps them to a number:

e(P, Q) → n.

We also require one important property from this function. If we have some secret number x and two points P and Q we should obtain the same result regardless of which point we multiply by this number:

e(x×P, Q) = e(P, x×Q).

Basically we need to be able to swap multipliers of the points between two arguments without changing the result. More generally all these swaps should give the same result:

e(a×P, b×Q) = e(P, ab×Q) = e(ab×P, Q) = e(P, Q)^(ab)

I am not going to describe how exactly this function works. Underlying math is pretty complicated and if you want to know all the nasty details I would suggest reading this post and references in it. If you want to go deeper — this thesis is completely about pairings. For now we just accept that such functions exist and they don’t reveal any information about our secret number x.

One important note is that we can’t use any elliptic curve here (in particular, standard Bitcoin curve secp256k1 doesn’t work). To make this function efficient and secure we have to use very special curves from “pairing-friendly” family.

BLS signature scheme

Now we have everything we need to construct a BLS signature. As usual our private key is some secret number pk, our public key is P = pk×G and the message we are signing is m.

To calculate the signature we hash our message to the curve H(m) and multiply resulting point by our private key: S = pk×H(m). That’s it! Nothing else — no random numbers, no extra operations, just a hash times the private key! Our signature is just one single point on the curve that takes only 33 bytes in compressed serialisation format!

BLS signature generation. To obtain signature we multiply a hash of the message by the private key.

To verify this signature one can take our public key P and check that

e(P, H(m)) = e(G, S)

It it true because of the nice property of the pairing function described above:

e(P, H(m)) = e(pk×G, H(m)) = e(G, pk×H(m)) = e(G, S)

BLS signature verification. We just need to check that the public key and the message hash are mapped to the same number as the curve generator point and the signature.

This signature scheme is beautiful and simple, but it also has several very nice features, especially for Bitcoin.

 

Signature aggregation

Now let’s combine all signatures in the block! Imagine we have a block with 1000 transactions and every transaction contains a signature Si, a public key Pi and a message that is signed mi. Why to store all the signatures if we can combine them? After all, we only care if all signatures in the block are valid. Aggregated signature will be just a sum of all signatures in the block:

S = S1+S2+…+S1000

To verify the block we need to check that the following equality holds:

e(G, S) = e(P1, H(m1))⋅e(P2, H(m2))⋅…⋅e(P1000, H(m1000))

If you look carefully you will see that it’s indeed true:

e(G, S) = e(G, S1+S2+…+S1000) = e(G, S1)⋅e(G, S2)⋅…⋅e(G, S1000) = e(G, pk1×H(m1))⋅…⋅e(G, pk1000×H(m1000)) = e(pk1×G, H(m1))⋅…⋅e(pk1000×G, H(m1000)) = e(P1, H(m1))⋅e(P2, H(m2))⋅…⋅e(P1000, H(m1000))

We still need to know all the public keys and calculate 1001 pairing functions, but at least all the signatures in the block take only 33 bytes. Signature aggregation can be done by a miner and save a lot of space in the block.

 

Key aggregation and n-of-n multisignature

If we are using multisignature addresses, we are signing the same transaction with different keys. In this case we can do key aggregation like in Schnorr, where we combine all signatures and all keys to a single pair of a key and a signature. Let’s take a common 3-of-3 multisig scheme (it can be done similarly for any number of signers).

A simple way to combine them is to add all the signatures and all the keys together. The result will be a signature S=S1+S2+S3 and a key P=P1+P2+P3. It’s easy to see that the same verification equation still works:

e(G, S) = e(P, H(m))

e(G, S) = e(G, S1+S2+S3) = e(G, (pk1+pk2+pk3)×H(m)) = e((pk1+pk2+pk3)×G, H(m)) = e(P1+P2+P3, H(m)) = e(P, H(m))

Just like in Schnorr we need to protect ourselves from the rogue key attack. We can do it either by asking every co-signer to prove that they have private keys for their public keys (by signing their public keys), or we can add some nonlinearity to the scheme and make rogue key attack impossible. Instead of just summing up all the keys and signatures we multiply them by a certain number and then add:

S = a1×S1+a2×S2+a3×S3

P = a1×P1+a2×P2+a3×P3

Here coefficients of the signatures and keys are calculated deterministically from the public key of the signer and all other public keys:

ai = hash(Pi, {P1,P2,P3})

For example it can be just a concatenation of the signer’s public key and the whole set of public keys used for signing: ai = hash(Pi||P1||P2||P3).

The same verification equation still works. There will be additional coefficients ai in the proof, but the logic remains.

Nice thing about this scheme is that you don’t need several communication rounds between devices. You just need to know who are other signers, and you are all set. This is much simpler than 3-round multisig scheme for Schnorr signatures. It also doesn’t rely on any randomness, it is a completely deterministic signature algorithm.

 

Subgroup multisignature scheme (m-of-n multisig)

Often we don’t want to use n-of-n multisig scheme, we prefer to use m-of-n, say, 2-of-3. We don’t want to lose all our money just because we’ve lost one of our keys. It would be nice to use key aggregation in this scenario. With Schnorr signatures we were able to do so by using merkle tree of public keys. It is not the most efficient way, but it kinda works. Unfortunately, as soon as you go to high m and n values, merkle tree size blows up exponentially.

With BLS signature scheme there is another method. It’s not that simple though. We will need a normal hash function that outputs a number — hash(x), and a hash to the curve — H(x). We will also need a “setup” phase when we decide to use multisig, but after this we don’t need to communicate anymore — we only need signatures to sign any amount of transactions.

Again I will use a simple example where we want to construct 2-of-3 multisig scheme with keys stored on 3 different devices, but it can be extended to any values of m and n.

Setup phase

Each of our devices has a signer’s number i = 1,2,3 that represent its place in a set, a private key pki and a corresponding public key Pi = pki×G. We calculate an aggregated public key exactly the same way as before:

P = a1×P1+a2×P2+a3×P3, ai = hash(Pi, {P1,P2,P3})

Now every device needs to sign that number i is a member of our aggregated public key (for every i), aggregate these signatures and save the result on corresponding device:

MKi = (a1⋅pk1)×H(P, i)+(a2⋅pk2)×H(P, i)+(a3⋅pk3)×H(P, i)

This signature we will call a “membership key” and we will use it later on for signing. Each membership key is a valid n-of-n signature of the message H(P,i), it means that:

e(G, MKi)=e(P, H(P,i))

Remember this equation, we will need it later. It will be used to prove that we are valid participants of the multisignature scheme.

Signing

Now let’s say we want to sign a transaction only with keys pk1 and pk3. We generate two signatures S1 and S3:

S1 = pk1×H(P, m)+MK1, S3=pk3×H(P, m)+MK3

and add them up to obtain single signature and key:

(S’, P’) = (S1+S3, P1+P3)

I write P’ and S’ here to emphasise that this key and signature are signed only by a subset of signers and it not the same as P that is an aggregated key for all signers. To verify this 2-of-3 signature we need to check that:

e(G, S’) = e(P’, H(P, m))⋅e(P, H(P, 1)+H(P, 3))

We remember that membership keys MK1 and MK3 are valid signatures for messages H(P, 1) and H(P, 3) signed by aggregated key P, so:

e(G, S’) = e(G, S1+S3)=e(G, pk1×H(P, m)+pk3×H(P, m)+MK1+MK3) =e(G, pk1×H(P, m)+pk3×H(P, m))⋅e(G, MK1+MK3)=e(pk1×G+pk3×G, H(P, m))⋅e(P, H(P, 1)+H(P, 3))=e(P’, H(P, m))⋅e(P, H(P, 1)+H(P, 3))

That’s it. A bit more complicated than n-of-n, but still understandable.

Possible implementation

To implement this multisignature scheme we will need to send money to an address corresponding to an aggregated public key P and say that we need at least two signatures. In Bitcoin scripting language locking script could look like this:

OP_2 <P> OP_CHECK_BLS_MULTISIG

Here OP_2 tells that we require two keys to sign the message. We don’t say anywhere that there are 3 signers in total, so one can’t say if it is 2-of-3 or 2-of-100 multisig address. We also don’t reveal this information later.

To spend this output we need to provide a key P’, signature S’ and indexes of participating signers — in our case numbers 1 and 3. Unlocking script could look like this:

OP_1 OP_3 <P’> <S’>

Combining these scripts gives us all necessary information. From OP_1 and OP_3 we know that we need to calculate hashes H(P, 1) and H(P, 3), and then we have everything we need to verify the transaction.

This means that for any m and n we need only:

  • one aggregated public key P in the locking script
  • one aggregated public key of participating signers P’
  • one aggregated signature S’
  • m numbers with signer’s indexes.

It is very compact and beautiful!

There is only one thing that I don’t like here… Normally we use addresses only once — we use key derivation like BIP32 to generate new private keys and addresses. But for every new set of private keys we also need a set of new membership keys. One way to do it without running through setup phase every time is to generate a bunch of keys, like 1000 of them and get corresponding membership keys. After all, they are just 32-byte long. Then we will need to run the setup phase again only when all 1000 addresses are used.

 

Drawbacks

As it was pointed out in the comments here and on twitter, I skipped one important part and made you think that BLS signatures are perfect. They are not. And thanks a lot for bringing this up!

Pairing is hard. I completely skipped that part and we considered that our magic function e(P, Q) is efficient and secure. It is not exactly true.

Paring is not so efficient

BLS signature verification is order of magnitude harder than ECDSA. Signature aggregation for the whole block with 1000 transactions still requires to compute 1000 pairing, so verifying one tiny signature in a block may take longer than verifying 1000 separate ECDSA signatures. The only benefit we achieve here is that we can fit more transactions in the block as aggregated signature takes only ~32 bytes.

Unlike BLS, Schnorr signatures are very efficient — they can be validated all together and this process is factor of 3 more efficient than ECDSA. The question then arises: what is more important for us?

Security proof is harder

Pairing functions are complicated, and it can become our foe if we are not careful enough. On the one hand we want pairing to be efficient to verify signatures faster, on the other hand we don’t want to reveal any information about our secret key. These two requirements fight each other and we need to be extremely careful with our choice of pairing-friendly curves.

There is actually an attack on elliptic curve crypto systems, called MOV attack (named after Menezes, Okamoto, and Vanstone), that allows to reduce security of the system by using our magic pairing function. So we really need to be careful here.

 

Conclusion

BLS signatures are amazing — we can combine all signatures in a block to a single signature, we can use key aggregation and m-of-n multisig scheme without additional communication rounds, we don’t need to rely on random number generators and the scheme itself looks very nice and simple. There is still room for improvement, standardising and optimising everything will take some time. But I hope at some point this signature algorithm will become good enough to be included in the Bitcoin protocol and we will be able to use all these nice features with smaller and more aggregatable signatures.

I am very excited to see the first author Dan Boneh working on cryptocurrencies. He is a great cryptographer and his coursera crypto course is outstanding. I also recommend his crypto book even though it’s not finished yet. I am sure we will see many interesting schemes and protocol improvements from him and his team in the near future.

]]>
How Schnorr signatures may improve Bitcoin https://specter.solutions/how-schnorr-signatures-may-improve-bitcoin/ Sat, 23 Jun 2018 22:09:26 +0000 https://specter.solutions/?p=3703 How Schnorr signatures may improve Bitcoin Read More »

]]>

When I was reading the MuSig paper from Blockstream I was trying to imagine what would it mean for me as a bitcoin user. Some features of the Schnorr signatures I found really great and convenient, but others are pretty annoying. Here I want to share my thoughts with you, but first, a quick recap:

 

Elliptic Curve Digital Signature Algorithm

Currently in Bitcoin we use ECDSA. To sign a message m we hash it and treat this hash as a number: z = hash(m). We also need a random or random-looking number k. We prefer not to trust random number generators (too many failures and vulnerabilities are related to bad RGNs) so we usually use RFC6979 to calculate deterministic k based on our secret and the message we are signing.

Using a private key pk we can generate a signature for message m consisting of two numbers: r (x coordinate of the random point R = k×G) and s = (z+r⋅pk)/k. Then, using our public key P = pk×G anyone can verify our signature by checking that point (z/s)×G+(r/s)×P has x coordinate equal to r.

Visualisation of the ECDSA algorithm. Elliptic curve is plotted over real numbers for illustration purposes.

This algorithm is very common and pretty nice but it can be improved. First, signature verification includes inversion (1/s) and two points multiplications and these operations are very computationally heavy. In Bitcoin every node has to verify all the transactions. This means that when you broadcast a transaction, thousands of computers will have to verify your signature. Making verification process simpler will be very beneficial even if signing process is harder.

Second, every node has to verify every signature separately. In case of m-of-n multisig transaction node may even have to verify the same signature several times. For example, transaction with 7-of-11 multisig input will contain 7 signatures and require from 7 to 11 signature verifications on every node in the network. Also such transaction will take a huge amount of space in the block and you will have to pay large fees for that.

 

Schnorr signatures

Schnorr signatures are generated slightly differently. Instead of two scalars (r,s) we use a point R and a scalar s. Similar to ECDSA, R is a random point on elliptic curve (R = k×G). Second part of the signature is calculated slightly differently: s = k + hash(P,R,m) ⋅ pk. Here pk is your private key, P = pk×G is your public key, m is the message. Then one can verify this signature by checking that s×G = R + hash(P,R,m)×P.

Visualisation of the Schnorr signature verification.

This equation is linear, so equations can be added and subtracted with each other and still stay valid. This brings us to several nice features of Schnorr signatures that we can use.

To verify a block in Bitcoin blockchain we need to make sure that all signatures in the block are valid. If one of them is not valid we don’t care which one — we just reject the whole block and that’s it.

With ECDSA every signature has to be verified separately. Meaning that if we have 1000 signatures in the block we will need to compute 1000 inversions and 2000 point multiplications. In total ~3000 heavy operations.

With Schnorr signatures we can add up all the signature verification equations and save some computational power. In total for a block with 1000 transactions we need to verify that:

(s1+s2+…+s1000)×G=(R1+…+R1000)+(hash(P1,R1,m1)×P1+ hash(P2,R2,m2)×P2+…+hash(P1000,R1000,m1000)×P1000)

Here we have a bunch of point additions (almost free in sense of computational power) and 1001 point multiplication. This is already a factor of 3 improvement — we need to compute roughly one heavy operation per signature.

Batch validation of two signatures. As verification equation is linear the sum of several equations is valid as soon as all signatures are valid. We save some computational power as scalar and point additions are much easier than point multiplication.

2. Key aggregation

We want to keep our bitcoins safe, so we might want to use at least two different private keys to control bitcoins. One we will use on a laptop or a phone and another one — on a hardware wallet / cold wallet. So when one of them is compromised we still have control over our bitcoins.

Currently it is implemented via 2-of-2 multisig script. This requires two separate signatures to be included in the transaction.

With Schnorr signatures we can use a pair of private keys (pk1,pk2) and generate a shared signature corresponding to a shared public key P=P1+P2=pk1×G+pk2×G. To generate this signature we need to choose a random number on every device (k1,k2), generate a random point Ri=ki×G, add them up to calculate a common hash(P,R1+R2,m) and then get s1 and s2 from every device (si = ki + hash(P,R,m) ⋅ pki). Then we can add up these signatures and use a pair (R, s) = (R1+R2, s1+s2) as our signature for shared public key P. Everyone else won’t be able to say if it is an aggregated signature or not — it looks exactly the same as a normal Schnorr signature.

There are three problems with this construction. First one — from UI point of view. To make a transaction we need several communication rounds — to calculate common R, and then — to sign. With two private keys it can be done with a single access to the cold wallet: we prepare an unsigned transaction on our online wallet, choose k1 and write down R1=k1×G together with the unsigned transaction. Then we transfer this data to the cold wallet and sign. As we already have R1 we can sign transaction on the cold wallet in one run. From the cold wallet we get R2 and s2 which we transfer back to the online wallet. Online wallet signs the transaction with previously chosen (k1, R1), combines both signatures and broadcasts a signed transaction. This is pretty much similar to what we have now, but as soon as you add a third private key everything becomes more complicated. Imagine you have a fortune that is controlled by 10 private keys stored in different secure places around the world and you need to make a transaction. Currently you need to go over all these places only once, but if you are using key aggregation you need to do it twice — to assemble all Ri and then to sign. In this case it would probably be better to stay with separate signatures without aggregation — then only one round is necessary.

Post update, thanks to Manu Drijvers for bringing this up: for a provably secure multisignature scheme you need 3 communication rounds:

  • Choose a random number ki and corresponding Ri=ki×G and tell everyone only it’s hash ti=hash(Ri), so that everyone can be sure that you will not change your mind after learning other’s random numbers,
  • Gather all the numbers Ri together and calculate common R
  • Sign

Second problem is a known Rogue key attack. It is nicely described in the paper or here, so I won’t go into details. The idea is that if one of your devices is hacked (say, your online wallet) and pretends that its public key is (P1-P2) then it can control shared funds with its private key pk1. A simple solution is to require a public key to be signed with corresponding private key when we are setting up the devices.

And there is a third important problem. You can’t use deterministic k for signing. There is a simple attack that allows a hacker to get our private key if you are using deterministic k. Attack looks like this: someone hacked our laptop and has a complete control over one of two private keys (say, pk1). We feel safe as our bitcoins require an aggregated signature from both pk1 and pk2. So we are trying to make a transaction as usual, prepare an unsigned transaction and R1 value, transfer them to our hardware wallet and sign there. Then transfer back (R2, s2) and… something happened to our online wallet and it fails to sign and broadcast. We try to do it again, but our hacked computer uses another random value this time — R1′. We sign the same transaction with our hardware wallet again and bring values (R2, s2′) back to our hacked computer. Ups, we’ve lost all our bitcoins.

In this attack hacker gets a pair of valid signatures for the same transaction: (R1, s1, R2, s2) and (R1′, s1′, R2, s2′). Here R2 is the same, but R = R1+R2 and R’=R1’+R2 are different. This means that the hacker can calculate our second private key: s2-s2’=(hash(P,R1+R2,m)-hash(P,R1’+R2,m))⋅pk2 and pk2=(s2-s2′)/(hash(P,R1+R2,m)-hash(P,R1’+R2,m)). I find this the most inconvenient feature of key aggregation — we will need to use a good random number generators everywhere to use key aggregation.

3. MuSig

MuSig solves one of these problem — it makes rogue key attack impossible. The goal is to aggregate signatures and public keys from several parties/devices to a single one but without proving that you have a private key corresponding to the public key.

The aggregated signature corresponds to the aggregated public key. But instead of just adding up public keys of all co-signers we multiply them to some factor. The aggregated public key will be P=hash(L,P1)×P1+…+hash(L,Pn)×Pn. Here L=hash(P1,…,Pn) — a common number depending on all public keys. This nonlinearity prevents attacker from constructing a bad public key like in rogue key attack. Even though the attacker knows exactly what should be his hash(L,Patk)×Patk, he can’t derive Patk from it — it is the same problem as to derive the private key from the public key.

The rest is pretty similar to the previous case. To generate a signature each co-signer choses a random number ki and shares Ri=ki×G with others. Then they add up all these random points to a single R=R1+…+Rn and generate a signature si = ki + hash(P,R,m) ⋅ hash(L,Pi) ⋅ pki. The aggregated signature is (R, s)=(R1+…+Rn, s1+…+sn) and verification equation is the same as before: s×G = R + hash(P,R,m)×P.

4. Merkle Multisig

As you may have noticed, MuSig and key aggregation require all signers to sign a transaction. But what if you want to make a 2-of-3 multisig? Is it possible at all to use signature aggregation in this case, or we will have to use our usual OP_CHECKMULTISIG and separate signatures?

Well, it is possible, but with a small change in the protocol. We can develop a new op-code similar to OP_CHECKMULTISIG that checks if aggregated signature corresponds to a particular item in the Merkle tree of public keys.

For example, if we use a 2-of-3 multisig with public keys P1, P2 and P3, then we need to construct a Merkle tree of aggregated public keys for all combinations we can use: (P1, P2), (P2, P3), (P1, P3) and put the root in the locking script. To spend bitcoins we provide a signature and a proof that our public key is in the tree. For 2-of-3 multisig there are only 3 elements in the tree and the proof will consist of two hashes — the one we want to use and its neighbour. For 7-of-11 multisig there will be already 11!/7!/4!=330 possible key combinations and the proof will require 8 elements. In general the number of elements in the proof scales almost linear with the number of keys in multisig (it’s log2(n!/m!/(n-m)! ).

But with the Merkle tree of public keys we are not limited to m-of-n multisigs. We can make a tree with any public keys we want. For example, if we have a laptop, a phone, a hardware wallet and a recovery seed we can construct a structure that would allow us to spend bitcoins with a laptop and a hardware wallet, a phone and a hardware wallet or just with a recovery seed. This is currently not possible just with OP_CHECKMULTISIG — only if you construct much more complicated script with branches and stuff.

Merkle tree of aggregated public keys. More than just m-of-n multisig.

Conclusion

Schnorr signatures are great. They can save some computational power during block validation and also give us ability to use key aggregation. The last one has some inconveniences, but we aren’t forced to use them — after all, if we want we can continue using normal multisig schemes with separate, non-aggregated signatures and still gain something. I can’t wait to start using them and I hope they will be included in the Bitcoin protocol soon.

I really liked the paper, the MuSig scheme is smart and the paper itself is very easy to read. I would strongly recommend to look through it if you have time.

P.S. There is also another nice type of signatures — BLS signatures. They look even better than Schnorr in some sense. If you want to know what BLS signatures are about, take a look at my next post.

]]>