opML and opp/AI
Optimistic Privacy-Preserving AI onchain
opp/ai (Optimistic Privacy-Preserving AI on Blockchain) is a novel framework that combines the efficiency of opML (Optimistic Machine Learning) with the privacy guarantees of zkML (Zero-Knowledge Machine Learning). By strategically partitioning machine learning (ML) models, opp/ai balances computational efficiency and data privacy, enabling secure and efficient AI services onchain.
The following is a comprehensive overview of opp/ai, including its architecture, technical components, and use cases.
Privacy Preservation: Protects sensitive data and proprietary models by leveraging Zero-Knowledge Proofs (ZKPs) for critical components.
Balanced Efficiency: Reduces zkML overhead by limiting its use to essential parts of the model while running the rest in opML.
Flexibility: Allows customization of the model partitioning to tailor privacy and efficiency based on application requirements.
Security: Combines cryptographic security from zkML with the economic security model of opML.
opp/ai divides the ML model into submodels based on privacy requirements:
zkML Submodels:
Components that handle sensitive data or proprietary algorithms.
Executed using Zero-Knowledge Proofs to ensure data and model confidentiality.
opML Submodels:
Components where efficiency is prioritized over privacy.
Executed using the optimistic approach of opML for maximum efficiency.
Example Partitioning:
An ML model M is partitioned into:
M_zk (zkML Submodel): Handles sensitive computations.
M_op (opML Submodel): Handles non-sensitive computations.
Zero-Knowledge Proofs for zkML Submodels:
The prover generates ZKPs for M_zk without revealing sensitive data.
Proofs are submitted to the blockchain and verified by validators.
Optimistic Execution for opML Submodels:
M_op is executed efficiently off-chain.
Results are submitted to the blockchain by decentralized service providers and are assumed correct unless challenged.
Integration Mechanism:
Outputs from M_zk are used as inputs for M_op and vice versa as necessary.
Ensures seamless integration of zkML and opML components.
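The partitioning and integration steps above can be sketched as follows. This is a hypothetical illustration, not part of any ORA API: the function names and the trivial two-stage model are stand-ins, and the actual zero-knowledge proof generation is elided.

```python
# Hypothetical sketch of opp/ai-style partitioned inference.
# m_zk stands in for the zkML submodel M_zk, m_op for the opML submodel M_op.

def m_zk(x, private_weights):
    # zkML submodel: handles the sensitive computation; in practice its
    # execution would be accompanied by a zero-knowledge proof so the
    # weights never leave the prover.
    return [xi * w for xi, w in zip(x, private_weights)]

def m_op(y):
    # opML submodel: non-sensitive computation, executed optimistically
    # off-chain and assumed correct unless challenged.
    return sum(y)

x = [1.0, 2.0, 3.0]                      # model input (public)
intermediate = m_zk(x, [0.5, 2.0, 1.0])  # proven with a ZKP
result = m_op(intermediate)              # submitted onchain optimistically
print(result)  # 7.5
```

The key integration point is that `intermediate`, the output of the zkML stage, is the input of the opML stage, so the two execution modes compose into one inference.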
opp/ai relies on the three major components of opML for opML submodels:
The FPVM is a specialized virtual machine designed for opML that can:
Execute and Verify Computation Steps: Capable of performing individual computation steps to resolve disputes.
State Management: Uses Merkle trees to represent and manage computation states efficiently.
Onchain Arbitration: Runs on smart contracts, enabling the on-chain verification of disputed steps.
The ML engine in opML is tailored for both native execution and fraud-proof scenarios:
Dual Compilation: ML models are compiled twice:
Native Execution: Optimized for speed, utilizing multi-threading and GPU acceleration.
Fraud-Proof VM Instructions: Compiled into a machine-independent code for use in FPVM during disputes.
Determinism and Consistency:
Fixed-Point Arithmetic: Ensures consistent results across different environments.
Software-Based Floating-Point Libraries: Guarantees cross-platform determinism.
An efficient protocol for resolving disputes:
Bisection Protocol:
Iteratively narrows down the computation steps to find the exact point of disagreement.
Reduces the amount of data and computation needed for onchain verification.
Onchain Arbitration:
Only the minimal necessary computation (usually a single step) is performed onchain.
Ensures disputes are resolved fairly and efficiently.
In addition to the opML components above, opp/ai incorporates the following zkML components:
The prover generates ZKPs, such as zk-SNARKs, to demonstrate the validity of the ML inference.
The verifier smart contract efficiently validates the zero-knowledge proofs provided by the prover.
To achieve model privacy, the proofs of all zkML submodels should have their model inputs set as public inputs, while the model parameters or weights should be kept private or hashed. The workflow is as follows:
Model Partitioning:
The ML model is partitioned into zkML and opML submodels.
All submodels running in opML are published as a public model available to all potential submitters and challengers.
Request Initiation:
A user (requester) submits a request for an ML inference task.
Computation and Submission:
The prover receives the model input and computes the proofs for the zkML submodels, submitting them onchain for verification while also requesting initiation of opML service tasks for the opML submodels.
The submitter performs the opML service task, taking the result from the zkML submodels as input to the opML model and commits the results onchain.
Dispute Resolution:
The challengers validate the results and start a dispute process with the submitter if they find any result inaccurate. The dispute process applies only to the specific opML submodel being challenged.
For opML submodels, disputes are resolved via the interactive dispute game using the FPVM.
For zkML submodels, the correctness is ensured by the cryptographic proofs.
Finalization:
Upon successful verification and resolution of disputes, the final result is confirmed on the blockchain.
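As a rough illustration of the arrangement above, where model inputs are public while the weights stay private or hashed: the salted hash commitment below is a simplification of what a real zkML circuit enforces, and every name in the sketch is hypothetical.

```python
import hashlib
import json

def commit_weights(weights: list, salt: bytes) -> str:
    # Publish only a salted hash of the proprietary weights.
    return hashlib.sha256(salt + json.dumps(weights).encode()).hexdigest()

def public_statement(model_input, output, commitment: str) -> dict:
    # What a zkML proof attests to without revealing the weights:
    # "there exist weights matching `commitment` such that
    #  M_zk(model_input, weights) == output".
    return {
        "input": model_input,              # public input
        "output": output,                  # public output
        "weights_commitment": commitment,  # the weights themselves stay private
    }

commitment = commit_weights([0.1, 0.2], salt=b"prover-secret")
statement = public_statement([1.0, 2.0], 0.5, commitment)
print(sorted(statement))  # ['input', 'output', 'weights_commitment']
```

Only the statement is published; the proof (not shown) convinces the verifier contract without exposing the committed weights.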
In the context of preserving model privacy, the single prover trust assumption posits that only one trusted entity, the prover, has access to the model weights. All end users interact with this prover to generate zkML proofs without direct access to the model weights themselves.
This approach is designed to protect the confidentiality of the model while still enabling users to benefit from its predictive capabilities.
Under the single prover trust assumption inherited from zkML, attackers could attempt to reconstruct the model through extensive queries.
To mitigate this, opp/ai can limit the number of allowed inferences or raise the cost per inference to make such attacks uneconomical.
To preserve user input privacy, zkML proof generation would have to run on users' local devices, which requires distributing the model weights to them; this in turn exposes the model to reverse-engineering attacks.
Given this limitation of zkML, the opp/ai framework instead focuses on model parameter privacy rather than input privacy.
How does opp/ai differ from opML and zkML individually?
opp/ai integrates both opML and zkML to provide a balance between efficiency and privacy. It uses zkML for sensitive computations and opML for the rest, whereas opML focuses on efficiency without privacy, and zkML provides privacy but with higher computational costs.
What types of models are suitable for opp/ai?
Models that can be partitioned based on privacy requirements. This includes models where certain layers or components handle sensitive data or proprietary algorithms.
How do I decide which parts of my model should run in zkML vs. opML?
Analyze the model to identify components that require privacy (e.g., layers with proprietary weights or handling sensitive data). These should run in zkML. Components where privacy is not a concern can run in opML for efficiency.
What are the hardware requirements for opp/ai?
zkML Components: May require machines with substantial resources for proof generation, especially for larger models.
opML Components: Can run on standard PCs with CPU/GPU capabilities.
Is opp/ai practical for large-scale models?
While zkML components can be resource-intensive, opp/ai reduces the overhead by only applying zkML to critical parts of the model. This makes it more practical than using zkML for the entire model.
How does opp/ai handle input privacy?
Input privacy is challenging due to potential reverse-engineering attacks. opp/ai focuses on model privacy and acknowledges limitations in input privacy with current zkML technologies.
Can opp/ai be integrated with existing blockchain applications?
Yes, opp/ai is designed to be flexible and can be integrated into existing applications that require both efficient and privacy-preserving AI capabilities.
opML Documentation
Community and Support:
Discussion Forums: https://discord.gg/ora-io
Related Papers and Research:
opML: Optimistic Machine Learning on Blockchain: https://arxiv.org/pdf/2401.17555
opp/ai: Optimistic Privacy-Preserving AI on Blockchain: https://arxiv.org/pdf/2402.15006
For further questions or support, please reach out to the ORA team through the community channels.
Scalable machine learning (ML) inference onchain
opML (Optimistic Machine Learning) is an innovative framework that enables efficient and scalable machine learning (ML) inference directly on the blockchain. By leveraging an optimistic approach inspired by optimistic rollups (e.g., Optimism and Arbitrum), opML overcomes the computational limitations of blockchains, allowing for the execution of large ML models without sacrificing decentralization or security.
opML is currently available via ORA’s AI Oracle.
The following is a comprehensive overview of opML, including its architecture, technical components, and frequently asked questions.
High Efficiency: Computations are performed off-chain in optimized environments and only minimal data is processed on-chain during disputes.
Cost-Effectiveness: Reduces computational costs by avoiding the expensive proof generation required in Zero-Knowledge Machine Learning (zkML).
Decentralization: Maintains the decentralized ethos of blockchain by enabling onchain verification without relying on centralized servers.
Scalability: Capable of handling extensive computations that are impractical with traditional onchain methods.
Accessibility: Opens up advanced AI models and capabilities to decentralized applications.
opML combines Optimistic Execution with a Fraud-Proof Mechanism.
opML operates on the principle of optimistic execution, which assumes that the ML computations submitted are correct by default. This approach allows computations to be performed offchain in optimized environments, significantly reducing the computational burden on the blockchain.
Assumption of Correctness: Results are accepted unless proven otherwise.
Efficiency: By not requiring immediate verification, the system avoids unnecessary computational overhead.
To ensure security and correctness, opML incorporates the following fraud-proof mechanism:
Submission of Results: The service provider (submitter) performs the ML computation offchain and submits the result to the blockchain.
Verification Period: Validators (or challengers) have a predefined period (challenge period) to verify the correctness of the submitted result.
Dispute Resolution:
If a validator detects an incorrect result, they initiate an Interactive Dispute Game.
The dispute game efficiently pinpoints the exact computation step where the error occurred.
On-Chain Verification: Only the disputed computation step is verified on-chain using the Fraud Proof Virtual Machine (FPVM), minimizing resource usage.
Finalization: If no disputes are raised during the challenge period, or after disputes are resolved, the result is finalized on the blockchain.
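The submit/challenge/finalize lifecycle above can be sketched as a small state machine. All names and the 100-block period are illustrative assumptions, not the actual opML contracts:

```python
from dataclasses import dataclass

@dataclass
class Submission:
    result: bytes
    submitted_at: int      # block number of submission
    challenged: bool = False
    finalized: bool = False

class OptimisticOracle:
    """Hypothetical sketch of the optimistic result lifecycle."""

    CHALLENGE_PERIOD = 100  # blocks; illustrative value

    def __init__(self):
        self.subs: dict[int, Submission] = {}
        self.next_id = 0

    def submit(self, result: bytes, block: int) -> int:
        sid = self.next_id
        self.subs[sid] = Submission(result, block)
        self.next_id += 1
        return sid

    def challenge(self, sid: int, block: int) -> None:
        s = self.subs[sid]
        assert block < s.submitted_at + self.CHALLENGE_PERIOD, "period over"
        s.challenged = True  # would kick off the interactive dispute game

    def finalize(self, sid: int, block: int) -> bytes:
        s = self.subs[sid]
        assert not s.challenged, "dispute pending"
        assert block >= s.submitted_at + self.CHALLENGE_PERIOD, "period open"
        s.finalized = True
        return s.result

oracle = OptimisticOracle()
sid = oracle.submit(b"inference-output", block=10)
print(oracle.finalize(sid, block=110))  # b'inference-output'
```

An unchallenged result finalizes once the period elapses; a challenged one is blocked until the dispute game resolves it.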
Deterministic ML: Randomness and variability in floating-point computations result in inconsistent ML outputs. To address this, opML fixes the random seed so that outputs can be verified across multiple instances of the model.
Separated Execution from Proving: opML compiles the source code twice, once for native execution and once into fraud-proof VM instructions. Computations run in optimized native environments for fast execution, while proofs during disputes are based on the machine-independent code.
Lazy Loading Design: Loads only necessary data into the FPVM during the dispute phase to overcome memory limitations of the VM.
Attention Challenge: Validators are randomly selected to verify computations. Failing to respond when selected results in penalties, motivating active participation.
Economic Security: Validators and submitters stake tokens to participate. Dishonest behavior results in penalties (loss of stake). By staking tokens, both submitters and validators have economic incentives to behave honestly.
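The staking incentive can be illustrated with a toy example; the stake amounts and slash fraction are arbitrary assumptions, not protocol parameters:

```python
# Toy illustration of the staking incentive: dishonest parties lose stake,
# and part of the slashed amount rewards the honest counterparty.
stakes = {"submitter": 100.0, "validator": 100.0}

def slash(party: str, fraction: float, reward_to: str) -> None:
    penalty = stakes[party] * fraction
    stakes[party] -= penalty
    stakes[reward_to] += penalty

# A validator wins a dispute against a dishonest submitter:
slash("submitter", 0.5, reward_to="validator")
print(stakes)  # {'submitter': 50.0, 'validator': 150.0}
```

Because losing a dispute costs real stake while honest behavior earns rewards, cheating is only rational if its expected gain exceeds the expected slash, which the protocol prices to prevent.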
The FPVM is a specialized virtual machine designed for opML that can:
Execute and Verify Computation Steps: Capable of performing individual computation steps to resolve disputes.
State Management: Uses Merkle trees to represent and manage computation states efficiently.
Onchain Arbitration: Runs on smart contracts, enabling the on-chain verification of disputed steps.
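The Merkle-tree state management above can be sketched as follows; the leaf encoding of VM state is a simplification for illustration:

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves: list) -> bytes:
    """Commit to a VM state (memory pages, registers, ...) with one hash."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Two parties that disagree on any part of the state produce different roots,
# so a single 32-byte root per step is enough to compare states onchain.
state_a = [b"pc=4", b"mem[0]=7"]
state_b = [b"pc=4", b"mem[0]=8"]
print(merkle_root(state_a) != merkle_root(state_b))  # True
```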
The ML engine in opML is tailored for both native execution and fraud-proof scenarios:
Dual Compilation: ML models are compiled twice:
Native Execution: Optimized for speed, utilizing multi-threading and GPU acceleration.
Fraud-Proof VM Instructions: Compiled into a machine-independent code for use in FPVM during disputes.
Determinism and Consistency:
Fixed-Point Arithmetic: Ensures consistent results across different environments.
Software-Based Floating-Point Libraries: Guarantees cross-platform determinism.
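The determinism point can be made concrete with a fixed-point sketch; the Q16.16 format chosen here is an illustrative assumption, not opML's actual numeric representation:

```python
# Fixed-point arithmetic sketch: identical integer results on every platform,
# unlike hardware floating point, which may vary across environments.
SCALE = 1 << 16  # Q16.16 fixed point; illustrative choice

def to_fixed(x: float) -> int:
    return int(round(x * SCALE))

def fixed_mul(a: int, b: int) -> int:
    # Pure integer math: bit-for-bit reproducible on any machine.
    return (a * b) // SCALE

a, b = to_fixed(1.5), to_fixed(2.25)
product = fixed_mul(a, b)
print(product / SCALE)  # 3.375
```

Because every intermediate value is an integer, the native execution and the FPVM re-execution of a disputed step arrive at exactly the same state.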
An efficient protocol for resolving disputes:
Bisection Protocol:
Iteratively narrows down the computation steps to find the exact point of disagreement.
Reduces the amount of data and computation needed for onchain verification.
Onchain Arbitration:
Only the minimal necessary computation (usually a single step) is performed onchain.
Ensures disputes are resolved fairly and efficiently.
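The bisection protocol above is essentially a binary search over the computation trace. A minimal sketch, where the `*_root` callables stand in for each party's onchain state-root commitments:

```python
def bisect_dispute(submitter_root, challenger_root, n_steps: int) -> int:
    """Return the first disputed step; `*_root(i)` gives each party's
    claimed state root after step i. O(log n) comparisons instead of
    re-running all n steps onchain."""
    lo, hi = 0, n_steps  # invariant: parties agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if submitter_root(mid) == challenger_root(mid):
            lo = mid
        else:
            hi = mid
    return hi  # only this single step goes to the FPVM for arbitration

# Example: parties diverge from step 1337 onward in a 1,000,000-step trace.
honest = lambda i: "ok" if i < 1337 else "diverged"
print(bisect_dispute(lambda i: "ok", honest, 1_000_000))  # 1337
```

After roughly 20 rounds of root comparisons, the onchain contract only needs to execute step 1337 in the FPVM to decide the dispute.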
Request Initiation: A user (requester) submits a request for an ML inference task.
Computation and Submission:
The submitter performs the computation off-chain using the optimized ML engine.
Results are submitted to the blockchain.
Challenge Period:
Validators verify the correctness of the result.
If no disputes are raised within the predefined period, the result is accepted.
Dispute Resolution (if necessary):
A validator initiates the dispute game if an incorrect result is detected.
The exact computation step in dispute is identified.
On-chain verification is performed using the FPVM.
Finalization: The result is finalized after successful verification or dispute resolution.
Definition: The system is secure as long as at least one validator is honest.
Implications:
Even if multiple validators are malicious, a single honest validator can ensure correctness.
Encourages decentralization and wide participation to strengthen security.
Verifier Dilemma:
Validators may choose not to verify computations to save resources, potentially allowing incorrect results.
Our Solution:
Implementing the Attention Challenge mechanism detailed above encourages validators to participate.
Economic incentives ensure that the cost of cheating outweighs potential gains.
What makes opML different from zkML?
opML focuses on efficiency and scalability by using an optimistic approach without zero-knowledge proofs. zkML provides strong privacy guarantees but is resource-intensive and less practical for large models. opp/ai combines the two, enabling opML-style execution with privacy guarantees.
How does opML ensure security without zero-knowledge proofs?
opML relies on a fraud-proof mechanism and the AnyTrust assumption. Validators can challenge incorrect results, and economic incentives discourage dishonest behavior.
What happens if no validators challenge an incorrect result?
The economic incentives and attention challenge mechanism are designed to ensure validators participate. However, if all validators are dishonest or fail to challenge, the system could accept incorrect results, highlighting the importance of validator participation.
Can opML handle private data or proprietary models?
opML assumes that data and models are not sensitive. For applications requiring privacy, consider using opp/ai, which integrates opML with zkML components for privacy preservation.
What are the hardware requirements for running opML?
opML is designed to run on standard PCs with CPU/GPU capabilities. No specialized hardware is required.
How does the dispute resolution process impact performance?
Disputes are rare and only involve verifying minimal computation steps on-chain. The impact on performance is minimal compared to the overall efficiency gains. The dispute resolution process can also be further optimized with zk fraud proofs (zkFP).
Is opML open-source?
Yes, opML is open source. You can access the repository at https://github.com/ora-io/opml. New versions will be released shortly.
opML GitHub Repository: https://github.com/ora-io/opml
opML Paper: https://arxiv.org/pdf/2401.17555
Issue Tracker: https://github.com/ora-io/opml/issues
For any further questions or support, feel free to reach out to the ORA team through the community channels.
ORA leverages opML for Onchain AI Oracle because it’s the most feasible solution on the market for running any-size AI model onchain. The comparison between opML and zkML can be viewed from the following perspectives:
Proof system: opML uses fraud proofs, while zkML uses zk proofs.
Security: opML uses crypto-economic based security, while zkML uses cryptography based security.
Finality: We can define the finalized point of zkML and opML as follows:
zkML: Zero-knowledge proof of ML inference is generated (and verified).
opML: The challenge period of the ML inference has passed. With additional mechanisms, finality can be achieved much sooner than the full challenge period.
opp/ai combines the opML and zkML approaches to achieve both scalability and privacy. Compared to pure zkML, it offers much better performance with the same privacy guarantees.