Zero-Knowledge Machine Learning (ZKML): What Sparks Will ZK and AI Collide To Create?

2023.04.06

This article will introduce the motivation behind building ZKML, current efforts, and potential application areas.

Author: dcbuilder.eth, Worldcoin

Translated by: TechFlow

Zero-Knowledge Machine Learning (ZKML) is a research and development field recently gaining significant attention in the cryptography community. But what exactly is it, and what are its potential uses? First, let's break down this term into its two components and explain each.

What is ZK?

A zero-knowledge proof is a cryptographic protocol where one party (the prover) can prove to another party (the verifier) that a given statement is true without revealing any additional information beyond the truth of the statement itself. This is a research area making tremendous progress across all fronts—from theoretical research to protocol implementation and practical applications.
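To make the prover and verifier roles concrete, here is a minimal sketch of one classic protocol: a Schnorr proof of knowledge of a discrete logarithm, made non-interactive with the Fiat-Shamir heuristic. It is not the kind of proof system used for ZKML, and the group parameters `P`, `Q`, and `G` are toy values chosen for readability (an assumption of this sketch, and far too small to be secure). The prover convinces the verifier that it knows `x` such that `y = G^x mod P`, while `x` itself is never sent.

```python
import hashlib
import secrets

# Toy group parameters (illustrative assumption, NOT secure).
# P = 2*Q + 1 with P and Q both prime; G generates the subgroup of order Q.
P = 2039
Q = 1019
G = 4


def challenge(y: int, t: int) -> int:
    """Fiat-Shamir challenge: hash the public values to a number mod Q."""
    data = f"{P}|{Q}|{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x: int) -> tuple[int, int, int]:
    """Prover: show knowledge of x with y = G^x mod P without revealing x."""
    y = pow(G, x, P)            # public statement
    r = secrets.randbelow(Q)    # fresh secret nonce
    t = pow(G, r, P)            # commitment to the nonce
    c = challenge(y, t)
    s = (r + c * x) % Q         # response; the nonce r masks x
    return y, t, s


def verify(y: int, t: int, s: int) -> bool:
    """Verifier: check G^s == t * y^c (mod P) using only public values."""
    c = challenge(y, t)
    return pow(G, s, P) == (t * pow(y, c, P)) % P
```

Calling `prove(777)` yields a triple `(y, t, s)` that `verify` accepts, and changing `s` makes verification fail. The verifier only ever sees `y`, `t`, and `s`, which is the sense in which nothing beyond the truth of the statement is revealed.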

ZK provides two key "primitives" (or building blocks): first, the ability to generate succinct proofs of computational integrity—proofs that are much cheaper to verify than the cost of executing the computation itself. (This property is known as "succinctness.") Second, ZK allows for hiding certain parts of the computation while still maintaining verifiable correctness. (This property is known as "zero-knowledge.")
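The idea that checking a result can be much cheaper than recomputing it can be illustrated with Freivalds' algorithm for matrix products, an operation at the heart of ML inference. This is only an analogy: it is a probabilistic check rather than a succinct proof system, it hides nothing, and the verifier still needs the full inputs. The function names below are chosen for this sketch. Recomputing an n-by-n product takes on the order of n^3 multiplications, while each round of the check takes on the order of n^2.

```python
import random

Matrix = list[list[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Full recomputation: about n^3 multiplications for n x n matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(a: Matrix, v: list[int]) -> list[int]:
    """Matrix-vector product: about n^2 multiplications."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def freivalds_check(a: Matrix, b: Matrix, c: Matrix, rounds: int = 3) -> bool:
    """Check the claim a @ b == c without ever computing a @ b.

    Each round compares a @ (b @ r) with c @ r for a random vector r.
    A correct claim always passes; a wrong claim slips through a round
    only with very small probability because r is drawn from a huge range.
    """
    for _ in range(rounds):
        r = [random.randrange(1, 2**61) for _ in range(len(c[0]))]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False
    return True
```

For example, with `a = [[1, 2], [3, 4]]` and `b = [[5, 6], [7, 8]]`, the check accepts `matmul(a, b)` and rejects a copy of it with one entry changed. Succinct proof systems push the same asymmetry much further, so that the verifier does not need to see the whole computation at all.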

Generating zero-knowledge proofs is computationally expensive, costing roughly 100 times as much as the original computation. This means that in some cases, generating such proofs may be impractical due to the time required even on state-of-the-art hardware.

However, recent advances in cryptography, hardware, and distributed systems have made zero-knowledge proofs an increasingly viable option for powerful computing tasks. These developments have enabled the creation of protocols that leverage computationally intensive proofs, thereby expanding the design space for new applications.

ZK Use Cases

Zero-knowledge cryptography is one of the most popular technologies in the Web3 space because it enables developers to build scalable and/or private applications. Below are some examples of how it is being used in practice (though note that many of these projects are still under development):

1. Scaling Ethereum via ZK rollups

Starknet
Scroll
Polygon Zero, Polygon Miden, Polygon zkEVM
zkSync

2. Building privacy-preserving applications

Semaphore
MACI
Penumbra
Aztec Network

3. Identity primitives and data provenance

WorldID
Sismo
Clique
Axiom

4. Layer 1 protocols

Zcash
Mina

As ZK technology matures, we believe there will be an explosion of new applications, as the tools needed to build them will require less domain-specific expertise and become easier for developers to use.

Machine Learning

Machine learning is a research area within artificial intelligence ("AI") that enables computers to automatically learn and improve from experience without being explicitly programmed. It uses algorithms and statistical models to analyze and identify patterns in data, then makes predictions or decisions based on those patterns. The ultimate goal of machine learning is to develop intelligent systems capable of adaptive learning, operating without human intervention, and solving complex problems across domains such as healthcare, finance, and transportation.

Recently, you may have noticed advancements in large language models (such as ChatGPT and Bard) and text-to-image models (such as DALL-E 2, Midjourney, or Stable Diffusion). As these models become increasingly capable and perform a broader range of tasks, it becomes crucial to determine whether a given output was generated by a model or by a human. In the following sections, we explore this idea further.

Motivation and Current Efforts in ZKML

We live in a world where AI/ML-generated content is becoming increasingly indistinguishable from human-generated content. Zero-knowledge cryptography will allow us to make statements such as: "Given content C, it was generated by applying model M to some input X." We will be able to verify whether a given output was produced by a large language model (like ChatGPT) or a text-to-image model (like DALL-E 2), or any other model for which we have created a zero-knowledge circuit representation. The zero-knowledge property of these proofs will also allow us to hide certain parts of the input or model when desired. A compelling example is applying machine learning models to sensitive data—users can learn the outcome of model inference on their data without revealing the input to third parties (e.g., in healthcare).
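The statement "content C was generated by applying model M to input X" can be written down as a relation between public values and a private witness. The sketch below is a plain Python check of that relation, not a zero-knowledge proof: a real ZKML system would compile an equivalent check into a circuit and prove it without exposing the witness. The salted-hash commitment, the tiny integer model, and all names here are hypothetical and exist only for illustration.

```python
import hashlib
import json


def commit(weights: list[int], salt: bytes) -> str:
    """Salted hash commitment to the model weights (hypothetical scheme)."""
    payload = json.dumps(weights).encode()
    return hashlib.sha256(salt + payload).hexdigest()


def model(weights: list[int], x: list[int]) -> int:
    """Stand-in for model M: an integer dot product followed by a threshold."""
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score > 0 else 0


def relation(public: dict, witness: dict) -> bool:
    """The statement a proof would attest to.

    Public:  a commitment to the model and the claimed output C.
    Private: the weights, the commitment salt, and the input X.
    """
    same_model = (
        commit(witness["weights"], witness["salt"]) == public["model_commitment"]
    )
    same_output = model(witness["weights"], witness["x"]) == public["output"]
    return same_model and same_output
```

Which fields sit on the public side is a design choice. Moving the input or the weights from `witness` to `public` gives a proof that hides less, and a proof with nothing hidden simply attests that the computation was carried out correctly.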

Note: When discussing ZKML, we refer specifically to creating zero-knowledge proofs of the inference step of ML models, not the training process—which is already extremely computationally intensive. Currently, even state-of-the-art zero-knowledge systems combined with high-performance hardware fall several orders of magnitude short of being able to prove massive models like today’s largest language models (LLMs). However, progress has been made in generating proofs for smaller models.

We conducted research into the current state of zero-knowledge cryptography applied to proving ML models and compiled a collection of relevant studies, articles, applications, and code repositories. Resources on ZKML can be found in the awesome-zkml repository maintained by the ZKML Community on GitHub.

The Modulus Labs team recently released a paper titled "The Cost of Intelligence," which benchmarks existing ZK proof systems across multiple model sizes. Currently, using a proof system like plonky2 on a powerful AWS machine, it takes about 50 seconds to generate a proof for a model with approximately 18 million parameters. The paper includes a chart of these results, which is not reproduced here.

Another initiative aiming to advance the state of the art in ZKML is Zkonduit's ezkl library, which enables users to create ZK proofs for ML models exported in ONNX format. This empowers any ML engineer to generate ZK proofs for their model's inference steps and cryptographically prove outputs to any compliant verifier.

Several teams are working to improve ZK technology, designing optimized hardware for operations inside ZK proofs, and building optimized implementations of these protocols for specific use cases. As the technology matures, larger models will be provable in shorter times even on less powerful machines. We expect these advancements to unlock novel ZKML applications and use cases.
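As a rough way to reason about the benchmark figure quoted above (about 50 seconds for roughly 18 million parameters), the arithmetic below scales it to other model sizes. The linear-scaling assumption is ours and is almost certainly optimistic: real proving time also depends on model architecture, memory limits, and the proof system, so treat the result as a back-of-the-envelope lower bound rather than a prediction. The function name and defaults are hypothetical.

```python
def estimate_proving_seconds(
    num_parameters: int,
    ref_parameters: int = 18_000_000,  # model size from the quoted benchmark
    ref_seconds: float = 50.0,         # proving time from the quoted benchmark
) -> float:
    """Scale the quoted benchmark linearly in parameter count (an assumption)."""
    return ref_seconds * num_parameters / ref_parameters
```

Under this assumption a hypothetical model with one billion parameters would need about 2,778 seconds, or roughly 46 minutes, for a single inference proof, which helps explain why proving the largest models remains out of reach for now.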

Potential Use Cases

To assess whether ZKML is suitable for a particular application, we can consider how the properties of zero-knowledge cryptography address challenges related to machine learning. The original article illustrates this with a Venn diagram (not reproduced here); the terms it uses are defined below.

Definitions:

1. Heuristic optimization—a problem-solving approach that uses rules of thumb or "heuristics" to find good solutions to difficult problems, rather than relying on traditional optimization methods. Heuristic optimization aims to find good or "good enough" solutions in reasonable time relative to problem importance and complexity, rather than attempting to find optimal solutions.

2. FHE ML—Fully Homomorphic Encryption ML allows developers to train and evaluate models in a privacy-preserving manner; however, unlike ZK proofs, there is no way to cryptographically prove the correctness of the computation performed.

Teams like Zama.ai are actively working in this space.

3. ZK vs Validity—In industry, these terms are often used interchangeably, since validity proofs are ZK proofs that do not hide parts of the computation or its results. In the context of ZKML, most current applications leverage the validity proof aspect of ZK proofs.

4. Validity ML—ZK proofs for ML models where no part of the computation or result is kept secret. They prove the correctness of the computation.

Below are some potential ZKML use case examples:

1. Computational Integrity (Validity ML)

Modulus Labs:
On-chain verifiable ML trading bots (RockyBot)
A vision of self-improving blockchains (examples):
Enhancing the smart features of the Lyra Finance options protocol AMM
Creating a transparent AI-based reputation system for Astraly (ZK oracle)
Working on the technical breakthroughs needed for ML-based contract-level compliance tools for Aztec Protocol (a privacy-enabled zk-rollup)

Machine Learning as a Service (MLaaS) transparency

ZK anomaly/fraud detection: This use case enables the creation of ZK proofs of exploitability or fraud. Anomaly detection models can be trained on smart contract data and agreed upon by DAOs as meaningful metrics, enabling automation of security procedures such as preemptively pausing contracts. Startups are already exploring ways to use ML models for security purposes in smart contract environments, making ZK anomaly detection proofs a natural next step.

General-purpose validity proofs for ML inference: enabling easy proving and verification that an output is the product of a given model and input pair.

2. Privacy (ZKML)

Decentralized Kaggle: proving that a model achieves greater than x% accuracy on certain test data without revealing its weights.

Privacy-preserving inference: running a medical diagnostic model on private patient data and sending the sensitive inference (e.g., a cancer test result) securely to the patient.

Worldcoin:

Upgradability of IrisCode: World ID users will be able to self-custody their biometrics in encrypted storage on their mobile devices, download the ML model used to generate the IrisCode, and locally create a zero-knowledge proof that the IrisCode was generated successfully. This IrisCode can then be permissionlessly inserted into the set of registered Worldcoin users, because the receiving smart contract can verify the zero-knowledge proof and thereby validate the IrisCode's creation. This means that if Worldcoin upgrades its machine learning model in a way that breaks backward compatibility in IrisCode generation, users won't need to return to the Orb; they can instead generate the necessary zero-knowledge proof locally on their device.
Orb Security: Currently, the Orb executes several fraud and tampering detection mechanisms within its trusted environment. However, we could create a zero-knowledge proof showing that these mechanisms were active during image capture and IrisCode generation, providing stronger liveness assurance for the Worldcoin protocol, as we could be fully certain these checks ran throughout the entire IrisCode generation process.

In summary, ZKML technology holds broad application potential and is rapidly evolving. As more teams and individuals enter this field, we believe ZKML use cases will become increasingly diverse and widespread.
