How Web3 Plays a Role Across the AI Industry Chain
AI+Web3: Towers and Squares
TL;DR
Web3 projects built around AI concepts have become magnets for capital in both the primary and secondary markets.
The opportunities for Web3 in the AI industry lie mainly in using distributed incentives to coordinate the long tail of potential supply (across data, storage, and computing), and in establishing decentralized markets for open-source models and AI Agents.
AI is mainly applied in the Web3 industry for on-chain finance (crypto payments, trading, data analysis) and for assisting development.
The utility of AI+Web3 is reflected in the complementarity of both parties: Web3 is expected to counteract the centralization of AI, while AI is expected to help Web3 break through boundaries.
![AI+Web3: Towers and Squares](https://img-cdn.gateio.im/webp-social/moments-25bce79fdc74e866d6663cf31b15ee55.webp)
Introduction
In the past two years, AI has developed rapidly, and the emergence of ChatGPT has opened a new era of generative artificial intelligence, also stirring up a wave in the Web3 field.
With the support of AI concepts, financing for Web3 projects has significantly improved. In the first half of 2024 alone, 64 Web3+AI projects completed financing, with the AI-based operating system Zyber365 achieving the highest financing of 100 million USD in Series A.
The secondary market has been even more buoyant. According to CoinGecko, the total market value of the AI sector has reached $48.5 billion, with 24-hour trading volume approaching $8.6 billion. Breakthroughs in mainstream AI technology have delivered visible boosts: after OpenAI released Sora, the average price across the AI sector rose 151%. The AI effect has also spilled over into Meme coins, one of crypto's main magnets for capital: GOAT, the first MemeCoin built on the AI Agent concept, shot to popularity and a $1.4 billion valuation, igniting the AI Meme craze.
Research and discussion around AI+Web3 are heating up. From AI+DePIN to AI Memecoins, and from AI Agents to AI DAOs, FOMO can barely keep up with the pace at which new narratives rotate.
The combination of AI and Web3, awash in hot money, hype, and visions of the future, is easily dismissed as a marriage arranged by capital. It is hard to tell whether this is a speculators' playground or the eve of an explosion at dawn.
To answer this question, the key is to ask whether each side actually gets better with the other: can each benefit from the other's model? This article attempts to examine that pattern: how Web3 can play a role at each layer of the AI technology stack, and what new vitality AI can bring to Web3.
Part.1 What opportunities does Web3 have under the AI stack?
Before we delve into this topic, we need to understand the technology stack of AI large models:
AI large models can be compared to the human brain, as in the early stages, like an infant, they need to observe and absorb a massive amount of external information to understand the world. This is the data "collection" phase. Since computers do not possess human multi-sensory capabilities, "preprocessing" is required before training to convert unannotated information into a format understandable by computers.
After data is fed in, the AI builds a model with understanding and predictive capabilities through "training", much like a baby gradually coming to understand the outside world; model parameters are like a baby's continually adjusting language ability. When the learning material is organized by subject, or feedback and corrections are gathered through communication with others, the model enters the "fine-tuning" stage.
After children grow up and learn to speak, they can understand and express themselves in new conversations, similar to the "inference" stage of AI large models, where they can perform predictive analysis on new inputs. Infants use language to express feelings, describe objects, and solve problems, similar to how AI large models apply after training to various specific tasks, such as image classification and speech recognition.
The AI Agent is closer to the next form of large models: capable of independently executing tasks to pursue complex goals, possessing thinking, memory, and planning abilities, and able to use tools to interact with the world.
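To make it easier to map the later sections back to this pipeline, here is a minimal Python sketch (purely illustrative; the project names are hypothetical) of the lifecycle stages just described:

```python
from enum import Enum, auto

class ModelStage(Enum):
    """Stages of the large-model lifecycle described above."""
    DATA_COLLECTION = auto()   # absorb raw, unannotated external information
    PREPROCESSING = auto()     # convert data into a machine-readable, annotated format
    TRAINING = auto()          # fit model parameters on the prepared dataset
    FINE_TUNING = auto()       # adjust a pre-trained model with labels or feedback
    INFERENCE = auto()         # apply the trained model to new inputs
    AGENT = auto()             # autonomous task execution with memory, planning, tools

# Illustrative mapping of (hypothetical) Web3 project types to the stage they target
project_focus = {
    "gpu-sharing-network": ModelStage.INFERENCE,
    "data-labeling-market": ModelStage.PREPROCESSING,
}
```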
In response to the pain points of the AI stack, Web3 has begun to form a multi-layered, interconnected ecosystem covering every stage of the AI model workflow.
![AI+Web3: Towers and Squares](https://img-cdn.gateio.im/webp-social/moments-cc3bf45e321f9b1d1280bf3bb827d9f4.webp)
1. Basic Layer: The Airbnb of Computing Power and Data
Computing Power
Currently, one of the highest costs of AI is the computing power and energy required for training and inference models.
Training Meta's Llama 3 reportedly takes 16,000 NVIDIA H100 GPUs running for 30 days. An 80GB H100 costs between $30,000 and $40,000, which puts the hardware investment at roughly $400-700 million. A month of training consumes about 1.6 billion kilowatt-hours of electricity, with energy costs approaching $20 million.
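A quick back-of-envelope check of the hardware figure, using only the GPU count and unit price quoted above:

```python
# Back-of-envelope check of the hardware estimate quoted above.
num_gpus = 16_000                     # H100 GPUs reportedly used for Llama 3 training
unit_price_usd = (30_000, 40_000)     # price range for an 80GB H100

low, high = (num_gpus * p for p in unit_price_usd)
print(f"Hardware investment: ${low/1e6:.0f}M - ${high/1e6:.0f}M")
# -> roughly $480M - $640M, consistent with the $400-700M range above
```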
The decentralized sharing of AI computing power is the earliest intersection of Web3 and AI: DePIN (decentralized physical infrastructure networks). DePIN Ninja has listed over 1,400 projects, with GPU computing power sharing represented by io.net, Aethir, Akash, Render Network, and others.
Main logic: The platform allows idle GPU resource owners to contribute computing power in a decentralized manner without permission, similar to an online marketplace for buyers and sellers like Uber or Airbnb, increasing the utilization rate of underutilized GPU resources. End users obtain low-cost and efficient computing resources; at the same time, the staking mechanism ensures that resource providers are penalized for violating quality control or interrupting the network.
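As a rough illustration of that logic, here is a minimal Python sketch of a permissionless GPU marketplace with a staking penalty. The class names, minimum stake, and slash fraction are all hypothetical; real networks such as io.net, Aethir, Akash, or Render each define their own listing, matching, and slashing rules:

```python
from dataclasses import dataclass, field

@dataclass
class ComputeProvider:
    address: str
    stake: float              # tokens locked as a quality / uptime bond
    gpu_hours_listed: float   # idle capacity offered to the market

@dataclass
class ComputeMarket:
    providers: dict = field(default_factory=dict)
    min_stake: float = 100.0      # illustrative threshold
    slash_fraction: float = 0.2   # illustrative penalty for a failed job

    def list_capacity(self, p: ComputeProvider) -> None:
        # Permissionless listing, gated only by the stake requirement
        if p.stake < self.min_stake:
            raise ValueError("insufficient stake to list capacity")
        self.providers[p.address] = p

    def report_failure(self, address: str) -> None:
        # Penalize providers that break quality control or drop off the network
        p = self.providers[address]
        p.stake -= p.stake * self.slash_fraction

market = ComputeMarket()
market.list_capacity(ComputeProvider("0xabc", stake=150.0, gpu_hours_listed=24.0))
```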
Features:
Gather idle GPU resources: mainly for excess computing power from third-party small and medium data centers, crypto mining farms, and PoS mining hardware such as FileCoin and ETH miners. There are also projects dedicated to launching devices with lower entry thresholds, such as exolab, which utilizes local devices like MacBook, iPhone, and iPad to establish and operate a large model inference computing network.
Targeting the long-tail market for AI computing power: a. Technical side: More suitable for inference steps. Training relies on ultra-large cluster GPUs, while inference has lower requirements for GPU computing performance, such as Aethir focusing on low-latency rendering and AI inference. b. Demand side: Small and medium computing power demanders will not train large models independently, but will only optimize and fine-tune around leading large models, which is naturally suitable for distributed idle computing power resources.
Decentralized Ownership: The significance of blockchain technology lies in the fact that resource owners always retain control over their resources, allowing for flexible adjustments while still generating profits.
Data
Data is the foundation of AI. Without data, computation is as useless as floating weeds; the relationship between data and models is like "Garbage in, Garbage out". The quantity and quality of data determine the quality of the final model output. For AI model training, data determines language ability, comprehension ability, values, and human-like performance. Currently, the main challenges in AI data demand are:
Data Hunger: AI model training relies on massive data input. OpenAI trained GPT-4 with a parameter count reaching trillions.
Data Quality: With the integration of AI into various industries, new requirements have emerged for data timeliness, diversity, professionalism, and emerging data sources such as social media sentiment analysis.
Privacy and Compliance: Companies in various countries are gradually recognizing the importance of high-quality datasets and are restricting dataset crawling.
High data processing costs: Large data volume and complex processing. AI companies spend over 30% of their R&D costs on basic data collection and processing.
Web3 solutions are reflected in four aspects:
The vision of Web3 is to let genuine contributors participate in creating data value, and to obtain more private, more valuable data at low cost through distributed networks and incentive mechanisms.
Grass: Decentralized data layer and network, users run nodes to contribute idle bandwidth to relay traffic and capture real-time data, and receive token rewards.
Vana: Introduces the concept of a data liquidity pool (DLP), where users upload private data to a specific DLP and flexibly choose whether to authorize third parties to use it.
PublicAI: Users can post on X with the #Web3 tag and @PublicAI to contribute to data collection.
Grass and OpenLayer are considering adding a data labeling phase.
Synesis proposes the concept of "Train2earn", emphasizing data quality, where users provide annotated data, comments, etc., to earn rewards.
The data labeling project Sapien gamifies labeling tasks and lets users stake points to earn more points (a stylized payout sketch follows this list).
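As flagged above, here is a stylized "train-to-earn" payout sketch. The quality score, staking boost, and caps are hypothetical and are not the actual formulas of Synesis or Sapien:

```python
def labeling_reward(base_reward: float, quality_score: float,
                    staked_points: float, boost_per_point: float = 0.001) -> float:
    """Hypothetical payout: higher-quality annotations earn more, and staking
    points adds a capped boost. Purely illustrative numbers."""
    quality_score = max(0.0, min(1.0, quality_score))   # clamp to [0, 1]
    boost = min(0.5, staked_points * boost_per_point)   # cap the staking boost at +50%
    return base_reward * quality_score * (1.0 + boost)

print(labeling_reward(base_reward=10.0, quality_score=0.9, staked_points=200))
# -> 10.8 tokens for a high-quality annotation backed by a modest stake
```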
Current Web3 Privacy Technologies:
Trusted Execution Environment (TEE), such as Super Protocol
Fully Homomorphic Encryption (FHE), such as BasedAI, Fhenix.io, Inco Network
Zero-knowledge technology (ZK), such as Reclaim Protocol, which uses zkTLS to generate zero-knowledge proofs of HTTPS traffic, letting users securely import activity, reputation, and identity data from external websites without exposing sensitive information (see the toy sketch after this list).
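To make the "prove a claim without exposing the data" idea concrete, here is a toy Python sketch. It only illustrates the data flow — the record stays off-chain while a claim about it is checked — and deliberately stubs out the actual zero-knowledge cryptography that a protocol like Reclaim's zkTLS provides; all names are hypothetical:

```python
import hashlib, json

def commitment(record: dict) -> str:
    # Hash commitment standing in for the cryptographic transcript commitment
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def prove_reputation_above(record: dict, threshold: int) -> dict:
    # A real protocol would output a succinct ZK proof here; this toy just
    # bundles the commitment with the claim being made about the hidden record.
    return {"commitment": commitment(record),
            "claim": f"reputation > {threshold}",
            "claim_holds": record["reputation"] > threshold}

def verify(proof: dict) -> bool:
    # An on-chain verifier would check the proof cryptographically; the raw
    # record (and the user's sensitive fields) never appears on-chain.
    return proof["claim_holds"]

session = {"site": "example.com", "reputation": 87, "email": "kept-private"}
print(verify(prove_reputation_above(session, 50)))   # True, without revealing `session`
```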
The field is still in its early stages and most projects are still exploring; the current bottleneck is that computation costs are very high. For example:
The zkML framework EZKL takes about 80 minutes to generate a proof for a 1M-parameter nanoGPT model.
Modulus Labs data shows that the zkML overhead is more than 1000 times higher than pure computation.
2. Middleware: Model Training and Inference
A Decentralized Marketplace for Open-Source Models
The debate between closed-source and open-source AI models continues. Open source brings collective innovation that closed source cannot match, but without a profit model, how can developer motivation be sustained? Baidu founder Robin Li asserted in April that "open-source models will fall further and further behind."
Web3 proposes the possibility of a decentralized open-source model marketplace: tokenizing the models themselves, reserving a certain proportion of tokens for the team, and directing part of the future revenue streams to token holders.
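A minimal sketch of what that revenue routing could look like, assuming a simple pro-rata split; the 20% team reserve and the holder balances are illustrative, not any specific protocol's parameters:

```python
def distribute_revenue(revenue: float, balances: dict[str, float],
                       team_reserve_share: float = 0.2) -> dict[str, float]:
    """Toy pro-rata split of a tokenized model's revenue between a team
    reserve and token holders. Illustrative only."""
    payouts = {"team_reserve": revenue * team_reserve_share}
    holder_pool = revenue - payouts["team_reserve"]
    total_supply = sum(balances.values())
    for holder, balance in balances.items():
        payouts[holder] = holder_pool * balance / total_supply
    return payouts

print(distribute_revenue(1_000.0, {"alice": 600, "bob": 400}))
# -> {'team_reserve': 200.0, 'alice': 480.0, 'bob': 320.0}
```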
The Bittensor protocol establishes a P2P market for open-source models composed of dozens of "subnets". Resource providers (compute, data collection/storage, and machine-learning talent) compete to meet the goals of specific subnet owners, and subnets can interact with and learn from one another to achieve stronger intelligence. Rewards are allocated by community voting and then distributed within each subnet according to competitive performance (a stylized reward-split sketch follows this project list).
ORA introduces the Initial Model Offering (IMO) concept, tokenizing AI models so they can be bought, sold, and developed through a decentralized network.
Sentient: a decentralized AGI platform that incentivizes people to collaborate on building, replicating, and extending AI models, and rewards contributors.
Spectral Nova: focused on the creation and application of AI and ML models.
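As flagged above, a stylized two-level reward split loosely modeled on the Bittensor description: emissions flow to subnets by weight, then to contributors within each subnet by competitive score. The real Yuma consensus is considerably more involved, and every name and number here is illustrative:

```python
def subnet_emissions(total_emission: float,
                     subnet_weights: dict[str, float],
                     miner_scores: dict[str, dict[str, float]]) -> dict[str, float]:
    """Split emissions across subnets by weight, then across miners by score."""
    total_weight = sum(subnet_weights.values())
    payouts: dict[str, float] = {}
    for subnet, weight in subnet_weights.items():
        subnet_pool = total_emission * weight / total_weight
        scores = miner_scores.get(subnet, {})
        total_score = sum(scores.values()) or 1.0
        for miner, score in scores.items():
            payouts[f"{subnet}/{miner}"] = subnet_pool * score / total_score
    return payouts

print(subnet_emissions(100.0,
                       {"text-generation": 0.7, "data-storage": 0.3},
                       {"text-generation": {"miner-1": 2.0, "miner-2": 1.0},
                        "data-storage": {"miner-3": 1.0}}))
# -> about 46.7 / 23.3 for the text-generation miners and 30.0 for the storage miner
```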
Verifiable Inference
In response to the "black box" problem of AI inference, the standard Web3 solution is to have multiple validators repeat the computation and compare results. However, the shortage of high-end Nvidia chips already makes AI inference expensive, and redundant re-execution multiplies that cost, which challenges this approach.
A more promising direction is to generate ZK proofs for off-chain AI inference, allowing AI model computations to be verified on-chain without permission. A cryptographic proof posted on-chain confirms that the off-chain computation was performed correctly (for example, that the dataset has not been tampered with) while keeping all data confidential.
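A schematic of that flow in Python, with the zero-knowledge machinery stubbed out; only the shape of the on-chain check is shown, and all names are hypothetical:

```python
import hashlib
from dataclasses import dataclass

def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

@dataclass
class InferenceClaim:
    model_hash: str    # commitment to the model weights
    input_hash: str    # commitment to the (possibly private) input
    output: str        # result the prover claims the model produced
    proof: bytes       # succinct proof that output = model(input); stubbed below

def verify_zk_proof(proof: bytes) -> bool:
    # Placeholder: a real verifier (e.g. a zkML circuit verifier) goes here.
    return len(proof) > 0

def onchain_verify(claim: InferenceClaim, registered_model_hash: str) -> bool:
    # The contract sees only hashes, the claimed output and a succinct proof;
    # the weights and the raw input never appear on-chain.
    if claim.model_hash != registered_model_hash:
        return False                       # wrong or tampered model
    return verify_zk_proof(claim.proof)

claim = InferenceClaim(sha(b"model-weights"), sha(b"user-input"), "cat", b"\x01")
print(onchain_verify(claim, sha(b"model-weights")))   # True in this toy flow
```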
Main advantages:
Scalability: Zero-knowledge proofs can quickly verify a large number of off-chain computations. Even with an increase in the number of transactions, a single proof can validate all transactions.
Privacy Protection: Data and AI model details are kept confidential, and parties can verify that the data and models have not been tampered with.
Trustless: No need to rely on centralized parties to verify computation.
Web2 Integration: Web2 is, by definition, off-chain; verifiable inference can help bring its datasets and AI computations on-chain, which helps increase Web3 adoption.
Current Web3 technologies for verifiable inference:
zkML: Combines zero-knowledge proofs with machine learning to keep data and models private, enabling verifiable computation without disclosing their underlying attributes. For example, Modulus Labs has released a zkML-based ZK prover for AI that effectively verifies whether AI providers run algorithms correctly on-chain; its clients are currently mainly on-chain DApps.
opML: Applies the optimistic-rollup principle, improving the economics of ML computation by re-verifying results only when a dispute is raised.
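A toy sketch of that optimistic flow, with the challenge window, bond, and re-execution all simplified and hypothetical:

```python
import time

class OptimisticMLResult:
    """Results are accepted by default and only re-checked if disputed within
    a challenge window; a successful challenge slashes the poster's bond."""

    def __init__(self, result: str, bond: float, challenge_window_s: int = 3600):
        self.result = result
        self.bond = bond
        self.deadline = time.time() + challenge_window_s

    def dispute(self, recomputed_result: str) -> str:
        if time.time() > self.deadline:
            return "window closed: result is final"
        if recomputed_result != self.result:
            self.bond = 0.0               # slash the poster's bond
            return "challenge upheld: result rejected"
        return "challenge failed: result stands"

claim = OptimisticMLResult(result="label=cat", bond=50.0)
print(claim.dispute("label=dog"))   # honest re-execution disagrees -> slash
```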