Jan 22, 2024
Securing the circuit with zkScanner (Part I - Introduction)
Zero Knowledge Proof (ZKP) systems have never been more exciting than they are now. In simple terms, ZKP is a powerful privacy paradigm that can prove that your private information satisfies certain relationships without revealing the private information itself. It may sound abstract, so let’s give a very good example: you can prove your identity without exposing your password to the server! This application has already emerged through the application of ZKP.
In fact, for many years in the past, knowledge related to Zero-Knowledge Proofs (ZKPs) had predominantly remained at the theoretical level. However, starting in 2016, practical ZKP frameworks began to emerge within the academic community. Furthermore, with the development of Layer 2 solutions, ZKP frameworks experienced a significant explosion after 2019. Today, we are witnessing the first mini-boom of ZKPs, with countless zk-based schemes and applications taking center stage. From small-scale solutions like coin mixers and zero-knowledge Texas Hold’em poker to more substantial infrastructure elements in blockchain such as zkEVM and zkRollup, it can be said that even if you are not familiar with ZKPs, ZK is already shaping your understanding of the Web3 world. Moreover, it is exerting a profound influence on the future direction of the decentralized world.
ScaleBit valued this chance, among with the ZKP developers, scholars and users. Since last year, we have been sparing our efforts on ZKP, and we have achieved solid contribution to the commuity. Before delving into the outcome, we want to first share our understanding about the current state of ZKP, and find a common thread among the countless ZKP products that have emerged in recent years.
From ZKP to Circuit
In fact, from a conceptual standpoint, most existing ZKP frameworks are based on three paradigms: Groth16, Sonic, and zk-STARK.
Let’s start with the basics: compared to Sonic, Groth16 and zk-STARK do not belong to the same family, but they possess unique advantages: Groth16 proofs are concise and provably fast, much faster than other methods. However, as a tradeoff, Groth16 requires special circuit-specific settings. When it comes to zk-STARK, it overcomes many of the drawbacks of Groth16, as it does not require trusted setup and is resistant to quantum attacks. On the downside, zk-STARK proofs are more computationally expensive than Groth16.
With the comparison of Groth16 and zkSTARK, the Sonic family is quite complex, it grows a derivative of well-known methods like Halo, Marlin, and Plonk. In general, Sonic can be seen as a paradigm that falls between the other two: it only requires a one-time trusted setup for all circuits and offers a moderate proof cost.
When the methodology meets the practicality, which one should we choose? It’s indecisive for us to choose from the three schemes since each one has its advantage to lead us the way to ZKP. To address this dilemma, various front-end frameworks for ZKP have emerged. Some examples include artwork based on Groth16, Halo2 based on Plonk, Cairo based on zk-STARK, etc. Despite the diverse range of tools and languages, these frameworks all fall under the umbrella term arithmetic circuits. Basically, the circuit is just variables with assertions. The introduction of circuits allows users to develop ZKP applications smoothly, no need to delve into complex cryptographic principles. That is what developer are hoping for. As a result, these frameworks have gained significant popularity across the board. Developers also have the capability to simultaneously master multiple different solutions and make choices based on real-world circumstances.
So .. What’s the Problem
However, these front-end tools and languages have led to an obvious problem: they may introduce issues that would not exist when using low-level schemes. One important aspect that needs to be explained here is that: circuits are programs that are very different from traditional code. The goal of traditional programs is always to calculate results based on certain inputs, but circuits often assume that you already know the answers and ask the program to recalculate them to ensure the correctness of the answers. This naturally raises a question: if you are trying to provide the wrong answer, it surely cannot pass through the circuit. So, how can the circuit be wrong? In fact, circuits can indeed be wrong, and these errors can mainly be grouped into two categories: under-constrained and over-constrained. In simple terms, under-constrained means that in certain situations, your answer is wrong but the circuit doesn’t recognize it and allows it to pass. On the other hand, over-constrained means that your answer is theoretically correct, but the circuit still doesn’t allow it to pass.
These problems actually exist in almost all circuits. But why do these errors occur? To explain this, we need to go back to how circuits work. We give the circuit a set of answers, and the circuit verifies if the answers are correct. But what if we provide answers that are too long? Obviously, developers wouldn’t want to enter thousands of parameters every time when they perform zero-knowledge proof verification. Therefore, in addition to verifying the quantity relationship between each input, the circuit also needs to help developers calculate a large part of the inputs. Due to the fact that the computation process is not constrained by the circuit, it evidently introduces a significant amount of uncertainty into the circuit, thereby leading to potential vulnerabilities.
Certainly, we are very concerned about the two types of issues present in these circuits. We will delve into the principles behind these vulnerabilities and how to address them in future chapters. However, before formally addressing these problems, we need to select an appropriate frontend and begin our work based on it.
Why Circom First?
We are deeply concerned about the issues lurking in the circuits, and intend to kick off our work immediately with a front-end approach. Among the myriad of languages available, we have noticed the Circom language. Although Circom is relatively new, it has already been supporting both the Groth16 and Plonk schemes. Simultaneously, its syntax is more akin to conventional programming languages. In comparison to other complicated tools, it is easier for developers to learn, understand, and use. Therefore, we believe that detecting and verifying Circom will be more influential and significant. Let’s take a moment to briefly introduce Circom. Below is a snippet of typical Circom code:
pragma circom 2.1.6;
template Example () {
// declaration
signal input a;
signal input b;
signal c;
signal output d;
// Assignment and Constraint
c <== a * b;
// Assignment
d <-- c + a;
// Constraint
d === c + a;
}
Firstly, attention is drawn to the declaration section of Circom, where all the answers we intend to convey to the circuit are included. These answers are referred to as signals. Signals can reflect their nature through prefixes, with input signals indicating the parameters we need to communicate to the circuit. Signals and output signals can be derived from input signals through certain calculations within the circuit. While each signal can only be computed and assigned once, it can have multiple constraints.
As mentioned earlier, circuits can be used to indicate relationships between signals and compute their values. The determination of relationships are referred to as constraint, and the computation is called assignment. Circom’s constraint system primarily uses “a === b” to assert the equality of a and b. Its assignment system employs “a <– b” to signify the assignment of ‘b’ to ‘a’. Lastly, “a ==> b” performs bot constraint and assignment.
Last but not least, there is one crucial aspect to consider: all computations in Circom must be conducted within a finite field F. Essentially, this is due to the fact that ZKP scheme rely on specific equations of elliptic curves and are thus constrained by their cyclic groups. From a practical standpoint, it can be understood that any operation in Circom requires a ‘modulo p’ operation at the end to ensure that the value remains within the finite field. Moreover, due to the distinct nature of operations on elliptic curves compared to regular computations, non-quadratic expressions like a * a * a or a / b cannot be expressed as constraints in Circom (can be used in the assignment). These unique rules actually serve as the fundamental origins of many circuit vulnerabilities.
Now that we have a basic understanding of Circom, We are well-prepared to delve into the two types of vulnerabilities! In the next article, we will introduce ZKScanner and how it will discover and explore these two types of vulnerabilities.
About ScaleBit
ScaleBit, a subsidiary brand of BitsLab, is a blockchain security team that provides security solutions for Mass Adoption of Web3. With expertise in scaling technologies like blockchain interoperability and zero-knowledge proofs, we provide meticulous and cutting-edge security audits for blockchain applications. The team comprises security professionals with extensive experience in both academia and enterprise. Our mission is to provide security solutions for Web3 Mass Adoption and make security accessible for all.
- Website: https://www.scalebit.xyz/
- Twitter: https://twitter.com/scalebit_