Age verification
When you verify your age in current solutions, you are oversharing. That means you reveal more information about yourself than is needed. The other part doesnt need to know your social security number, or all the contents of your drivers license. They dont even need to know your day of birth. They just need to know that your age is not below some threshold. Yes or no. It is a binary question.
You might think that you could instead authorize them to ask the authorities their yes or no question. But this creates another problem. That way the authorities would know about your age verifications. This is not needed. It also creates incentives for unethical authorities, to set up rules for age requirements and other requirements, to increase monitoring of citizens, while on paper not being a surveillance state.
But there is a technical solution, that allows you to verify your age, revealing only that one bit of information that the other part needs to know, in a way where no other parties will know about the verification. The solution is to use a so-called Zero Knowledge Proof (ZKP).
Such a gizmo normally works by starting up some heavy machinery - like constructing a circuit, turning it into polynomials, and so on. But in this case that is not necessary, because there is a simple and elegant solution. It uses hash-chains.
The solution uses only a few basic cryptographic building blocks.
Thats all! I told you its going to be simple...
Alice and Bob example
Let me start with a contrived example. There are three players:
A: The age authority Alice. She is the mother of Bob.
B: The boy Bob. He is 7 years old.
C: The Cycling Center at the holiday resort where Bob and his family are staying. They lend out bicycles, tricycles and electrical bikes.
To lend a bicycle it requires you to be at least 5 years old. Otherwise you get a tricycle.
So how can Bob prove he is at least 5, without revealing his actual age of 7, and without bothering Alice about it?
Alice selects a random number r, and then uses the public known hash function H, to iterate it 7 times on r. This gives her the commitment: c = H7r
Alice gives (r,c) to Bob.
If he likes to, Bob can check H7r ≟ c to be sure it is okay.
Bob then calculates the proof: p = H2r
Bob uses 7 - 5 = 2 iterations, because this leaves the minimum chain that the Cycling Center needs to know.
He then gives the center (p,c).
The Cycling Center checks if H5p ≟ c
Note that because the hash function is one-way, the center cannot guess what came before p, not to mention what r is. They learn nothing else about Bobs age, than it is 5 or more. They also dont need to get hold of Alice and ask her if Bob is old enough.
The center is satisfied with the proof, so they lend Bob a stylish blue bicycle with an electric two-tone horn.
That is it! Very elegant, right?
But we need a couple of modifications before we are there.
Bob could fake it
Bob is fervently longing for one of the electrical bikes. But the required age is 15 years.
He cannot make proofs for more than 7 years with the r and c he has from Alice. Unless he cheats.
Bob could iterate H some more times on the c he got, and then give this fake commitment to the center. He even could do the whole process with selecting an r and calculating c himself, just with 15 iterations of H.
One way to avoid that, is that the center starts requiring that commitments have a valid signature by authorities they trust. So Alice should sign the commitment before she gives it to Bob.
Diana could abet
Bobs older sister, Diana, is 15 years old. She also received a commitment from Alice.
Diana is not that interested in the Cycling Center, so she could give her r and her c to Bob.
One way the center could avoid that, is to require, that commitments contain an image of the owners signature, and combine that with a process where you have to sign your proof when you give it to the center.
When Alice made the commitments to Bob and Diana, she should have put images of their signatures in each of them respectively. If Bob tried to use Dianas commitment, the center would see that his signature didnt match.
Fitting the parts together
The protocol in real world terms, looks like this overall:
The number r should be selected randomly from the image of H. It will therefore, in other words, look like the output from H.
When B sends the proof to C, he also proves that he has the privB key.
The protocol can be different from the one shown here, but the core of it, with a hash-chain that is being shortened, is the same in all versions.
The earlist version I have found was published back in 1995 by Pedersen in "Electronic Payments of Small Amounts". But because it is so simple, it is not unlikely that there are earlier versions.
The technical term for the protocol is range proof.
Issues
The first issue is that you can only prove that k is greater than or equal to some threshold. It could be nice if you were able to prove less than or equal relations as well. If you have a hard upper bound, this is possible. But you often times dont, and then only greater than or equal proofs are possible.
The second issue is scalability. Issuing and checking takes time linear in the value of k or the threshold. This is prohibitively expensive for many applications. Like proofs around the balance of a bank account. You could of course count thousands, and then it might become practicable. But there is a limit to the granularity you can reasonably get with this protocol.
Variations
There are digital signature schemes that only relies on hash functions. So it should be possible to make the protocol in a way that doesnt make use of public key cryptography.
For some use cases, the granularity you need, is different in the low and the high end. There are many ways to accomodate this. One is to use logarithms.
Some people have worked on combining multiple hash-chains having different step sizes. This fixes the problem with scalability. But at the same time it compromises the privacy, as some information about k is given off.
The future
Hash-chains and protocols, like the one described here, are part of the large SSI agglomeration of technologies, that is being built these years. There is a lot of active research in this area, so we could be lucky that someone comes up with an improved version. But the protocol as it is, is still perfectly usable in a number of use cases - and it is delightfully simple.


