Vote

Theme Song: Ruler of Everything

In this assignment, you'll implement a cryptographic voting protocol based on the widely used Helios protocol. In particular, you'll explore how we can use zero-knowledge proofs and homomorphic encryption in practice.

Note that this project is the first pair project of the class; all projects from this point forward, including the final project, will be done in pairs. We recommend sticking with the same partner throughout the rest of the course. GitHub has been set up so that you can make teams, and Gradescope has been set up so that you can add up to one other classmate to a submission. Please use both of these features!


Background Knowledge

In this assignment, you'll build a cryptographically secure voting platform. There are four programs involved: Arbiters that generate the election parameters and decrypt the final result, a Registrar server that handles checking that all voters are registered to vote only once, a Tallyer server that checks that all votes are valid, and the Voter itself. All of these parties will interact to conduct an election.

We highly recommend reading Cryptographic Voting - A Gentle Introduction in full, or at least referencing it when writing your ZKPs. It will help immensely in understanding the mathematics of this assignment.

Additively Homomorphic Encryption

In standard encryption, encrypted data must be decrypted before it can be meaningfully altered. Indeed, being able to alter a ciphertext to produce a meaningful change in the corresponding plaintext is called malleability, and is usually undesirable. In particular, a malleable encryption scheme cannot be used in authenticated encryption. However, being able to compute over encrypted data is very useful, as it allows multiple parties to compute over shared data without leaking the data itself or coordinating beforehand. Encryption schemes that allow for computation over their ciphertexts are called homomorphic encryption schemes. Of those, some may only allow either addition or multiplication (called additively and multiplicatively homomorphic, respectively), while those that allow both are called fully homomorphic.

In this project, we'll explore an additively homomorphic encryption scheme. Formally, an additively homomorphic encryption scheme is an encryption scheme with an additional algorithm \(\mathsf{HomAdd}\) such that for any two messages \(m_0, m_1\), we have that \(\mathsf{Dec}(\mathsf{HomAdd}(\mathsf{Enc}(m_0), \mathsf{Enc}(m_1))) = m_0 + m_1\), where \(\mathsf{HomAdd}\) is a homomorphic addition operation. In other words, we can construct a ciphertext for \(m_0 + m_1\) using ciphertexts for \(m_0\) and \(m_1\) individually. Similar definitions exist for multiplicatively homomorphic and fully homomorphic schemes.

We have actually already seen a simple multiplicatively homomorphic encryption scheme: ElGamal encryption. To see why, consider two ciphertexts \(c_1 = (g^{r_1}, h^{r_1} \cdot m_1)\) and \(c_2 = (g^{r_2}, h^{r_2} \cdot m_2)\). Observe that we can construct a ciphertext for \(m_1 \cdot m_2\) by multiplying component-wise to obtain \(c = (g^{r_1 + r_2}, h^{r_1 + r_2} \cdot (m_1 \cdot m_2))\). We can apply the same idea to convert this encryption scheme into an additively homomorphic encryption scheme by instead encoding our messages as \(g^m\) instead of \(m\); then, the above protocol becomes combining \(c_1 = (g^{r_1}, h^{r_1} \cdot g^{m_1})\) and \(c_2 = (g^{r_2}, h^{r_2}\cdot g^{m_2})\) to get \(c = ( g^{r_1 + r_2}, h^{r_1 + r_2} \cdot g^{m_1 + m_2})\).

One glaring issue with this adaptation is that in order to decrypt and recover \(m_1 + m_2\) we need to solve the discrete logarithm problem. However, for our purposes, we will only be encrypting small values and combining them a small number of times, so a brute-force, linear-time approach is perfectly fine. Note that this doesn't compromise the security of encryption as the secret key \(sk\) and the random \(r\)'s are still expected to be very large, so a brute-force approach without knowledge of \(sk\) is still computationally infeasible.

Threshold Encryption

Let's say we will be using homomorphic encryption that allows anyone to add their vote to a publicly tracked value or set of values. As of now, a single party (arbiter) solely holds the decryption key and can check the value at any time they please. This isn't necessarily desirable; it would be nice if decryption keys could be split amongst multiple parties (arbiters) and a ciphertext can only be jointly decrypted by all the parties together. This is known as threshold encryption and is also achievable with ElGamal encryption.

In threshold ElGamal encryption, \(n\) parties will get together and each generate a keypair \((sk_i, pk_i)\) where \(pk_i = g^{sk_i}\). Each party publishes \(pk_i\) and keeps \(sk_i\) private. They will then multiply their public values together and obtain \(pk = \prod_i pk_i = g^{\sum_i sk_i}\). Encryption should use this combined public key \(pk\).

In order to jointly decrypt a ciphertext \(c=(c_1, c_2)\) that is encrypted using this public key, each party can partially decrypt the ciphertext, and then the parties can combine their partial decryptions to get a full decryption. To compute a partial decryption of \(c\), each party computes \(c_1^{sk_i}\). Then, multiplying all partial decryptions together retrieves \(\prod c_1^{sk_i} = c_1^{\sum sk_i} = c_1^{sk}\), which can then be used to decrypt the second component of the ciphertext, namely \(g^m = c_2/c_1^{sk}\).

Zero-Knowledge Proofs

Let's say that our protocol allows voters to encrypt 1 or 0 and post it to the public message board as a vote for or against a particular policy. How will we know that the voters haven't cheated and posted an encryption of 100, without decrypting every ciphertext and checking that it is, in fact, 1 or 0? Zero-knowledge proofs allow us to prove this fact, among many others, without revealing any other information. They are a powerful cryptographic primitive that allows us to build trust without unnecessarily revealing information. We'll explore three zero-knowledge proof protocols to get the hang of things.

Proving Correct Encryption

The first ZKP we'll explore is a protocol to prove that a ciphertext \(c=(c_1,c_2)\) is an ElGamal encryption of 0 under a public key \(pk\), where the witness is the randomness \(r\) used in the encryption, in particular \(c_1 = g^r\) and \(c_2 = pk^r\). The protocol is as follows.

Similarly, we can prove a ciphertext \(c=(c_1,c_2)\) is an encryption of 1 under a public key \(pk\), where the witness is the randomness \(r\) used in the encryption, in particular \(c_1 = g^r\) and \(c_2 = pk^r \cdot g\). We can re-write it as \(c_1 = g^r\) and \(c_2/g = pk^r\), and then use the above ZKP to prove that \((c_1, c_2/g)\) is an encryption of 0 (using the randomness \(r\)).

As we discussed in class, this sigma protocol satisfies completeness, is a proof of knowledge of \(r\), and is honest-verifier zero-knowledge. A more detailed explanation can be found in the lecture notes and the readings.

Proving OR Statement

The ZKP we need in the project is a proof that a ciphertext is an encryption of either 0 or 1. Proving AND statements is straightforward; simply prove both statements. However, proving OR statements is significantly more difficult since one of the statements could be false. We'll approach this ZKP in steps and build up to a protocol that works.

Consider the aforementioned ZKP that \(c=(c_1,c_2)\) is an encryption of 0. Notice that the prover can actually cheat in the ZKP if she knows \(\sigma\) before sending the first-round message. In particular, she can first randomly sample the third-round reponse \(r''\) from \((0, q-1)\), and then compute the first-round message by \(A=g^{r''}/c_1^{\sigma}\) and \(B = pk^{r''} / c_2^{\sigma}\), which will end up verifying correctly. We can use this observation to generate a ZKP for an OR statement.

The protocol is as follows. Suppose \(c=(c_1,c_2)\) is an encryption of 1 and the prover knows the randomness \(r\). (If \(c\) is an encryption of 0, the protocol follows similarly.)

This protocol is known as Disjunctive Chaum-Pedersen (DCP), or the Sigma-OR protocol. A more detailed explanation can be found in the lecture notes and the readings.

ZKP for Partial Decryption

We explore another ZKP that proves that a partial decryption of a ciphertext \(c=(c_1,c_2)\) is correct. The partial decryption with regard to a partial public key \(pk_i\) is \(d\), where the witness is the partial secret key \(sk_i\), in particular, \(pk_i = g^{sk_i}\) and \(d = c_1^{sk_i}\). The protocol is as follows.

Non-Interactive Zero-Knowledge (NIZK)

All the zero-knowledge proofs (sigma protocols) we have discussed above are only honest-verifier zero-knowledge (HVZK), namely it is zero-knowledge against an honest verifier who samples the challenge \(\sigma\) uniformly at random. The Fiat-Shamir heuristic allows us to transform these sigma protocols into non-interactive zero-knowledge (NIZK) proofs in the random oracle model. Instead of asking the verifier to sample \(\sigma\), we compute \(\sigma\) from a hash function computed on the ZK statement along with the first-round message. For example, in ZKP for partial decryption, we compute \(\sigma = H(pk_i, c, d, A, B)\), where \(H\) is a hash function modeled as a random oracle.

Some Resources

The following papers are incredibly useful for gaining a full understanding of protocols like ours:

Putting it all together

The following diagrams explain how the protocols work together from the perspective of a voter. The first two are two ways of presenting a general overview of the system. The last two show the registration and voting process, respectively. (Tip: to see the images more clearly, right-click and select "Open Image in New Tab".)

Architecture Overall

Architecture Overall 2

Architecture Registration

Architecture Voting

In short, we proceed in the following steps: setup, registration, voting, and verification.

Setup

Registration

Voting

Partial Decryption


Assignment Specification

Please note: you may NOT change any of the function headers defined in the stencil. Doing so will break the autograder; if you don't understand a function header, please ask us what it means and we'll be happy to clarify.

Functionality

You will primarily need to edit src/drivers/crypto_driver.cxx, src/pkg/arbiter.cxx, src/pkg/election.cxx, src/pkg/registrar.cxx, src/pkg/tallyer.cxx, and src/pkg/voter.cxx. The following is an overview of relevant files:

The following roadmap should help you organize concerns into a sequence:

Some tips:

Registration

Implement registration between the Voter and Registrar. Once you do so, Voters should be able to receive a verified DSA keypair (aka Certificate).

Application functions:

Vote Generation

Implement vote generation and verification, then allow Voters to vote with the Tallyer. Once you do so, votes should be put into the database.

Cryptographic functions:

Application functions:

Partial Decryption

Implement partial decryption and verification, then allow Arbiters to partially decrypt the election result

Cryptographic functions:

Application functions:

Support Code

Read the support code header files before coding so you have a sense of what functionality we provide. This isn't a networking class, nor is it a software engineering class, so we try to abstract away as many of these details as we can so you can focus on the cryptography.

The following is an overview of the functionality that each support code file provides.

Libraries: CryptoPP

You may find the following functions useful:

You may find the following wiki pages useful during this assignment:


Getting Started

To get started, get your stencil repository here and clone it into the devenv/home folder. From here you can access the code from both your computer and from the Docker container.

To prevent crypto_driver.cxx solutions to earlier assignments being leaked in later assignments, we ask that you copy your code from crypto_driver functions implemented in the last assignment into this one. The functions you copy over now will not need to be copied over in the following assignment.

Running

To build the project, cd into the build folder and run cmake ... This will generate a set of Makefiles building the whole project. From here, you can run make to generate a binary you can run, and you can run make check to run any tests you write in the test folder.

Testing

You may write tests in any of the test/**.cxx files in the Doctest format. We provide test/test_provided.cxx, feel free to write more tests in this file directly. If you want to add any new test files, make sure to add the file to the cmake variable, TESTFILES, on line 7 of test/CMakeLists.txt so that cmake can pick up on the new files. Examples have been included in the assignment stencil. To build and run the tests, run make check in the build directory. If you'd like to see if your code can interoperate with our code (which is what it will be tested against), feel free to download our binaries here - we try to keep these up to date, so if you're unsure about the functionality of our binaries, please ask us on Ed!


FAQ