Artificial Intelligence (AI) has undergone groundbreaking advancements in recent years. A significant game changer has been the emergence of large language models, capable of transforming countless sectors, from technology to healthcare.

However, running such substantial models on personal devices has remained a challenge due to resource constraints. This article delves into a revolutionary solution that addresses this problem: Petals, a novel, decentralized method that allows running and fine-tuning large language models on any device.


Limitations of Large Language Models

Over the past six months, large language models have emerged as a central part of AI technology. Models such as ChatGPT have been hailed as steps towards artificial general intelligence and have quickly proved valuable to society.

However, there are significant drawbacks. ChatGPT, for example, raises concerns about privacy, security, cost, and transparency because of its centralized, closed-source nature.

Moreover, open-source alternatives such as LLaMA, Bloom, and MPT, despite being free, transparent, and easy to install, come with their own hurdle: they require modern, costly hardware to run.

Running even a mid-sized 33-billion-parameter model can require a GPU costing well over a thousand dollars to achieve acceptable inference speeds. This barrier puts such advanced models out of reach for the majority of individuals worldwide.

Decentralized AI as the Solution

To overcome these limitations, a new technology named Petals has emerged. Petals operates using decentralized methods to run and fine-tune large language models.

This approach resembles peer-to-peer technologies such as BitTorrent, which have been in use for decades. Like a torrent, Petals splits each model into small blocks and distributes them across individual, consumer-grade computers around the world.

This distributed storage method allows even small end-user devices to contribute to this network, making it possible to run the most advanced AI models without needing state-of-the-art infrastructure.
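To make this concrete, here is a purely illustrative sketch (not Petals' actual code) of the underlying idea: a model's transformer blocks are handed out to whichever peers have spare capacity, and a client then chains those peers together to run a forward pass. The peer names and block counts below are made up.

```python
# Illustrative only: a toy model of spreading transformer blocks across peers.
# Petals' real implementation adds routing, fault tolerance, quantization, etc.

NUM_BLOCKS = 80  # a 65-70B model has on the order of 80 transformer blocks

peers = {
    "laptop-in-berlin":  {"free_blocks": 8},
    "gaming-pc-in-rio":  {"free_blocks": 24},
    "colab-session-xyz": {"free_blocks": 48},
}

def assign_blocks(num_blocks: int, peers: dict) -> dict:
    """Greedily hand out consecutive block indices to peers with spare capacity."""
    assignment, next_block = {}, 0
    for name, info in peers.items():
        take = min(info["free_blocks"], num_blocks - next_block)
        assignment[name] = list(range(next_block, next_block + take))
        next_block += take
    if next_block < num_blocks:
        raise RuntimeError("not enough peers to host the whole model")
    return assignment

print(assign_blocks(NUM_BLOCKS, peers))
# A client then streams activations through the peers in block order,
# so no single machine ever has to hold the full model.
```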

Furthermore, Petals leverages the scalability of this distributed architecture: the more contributors to the network, the more powerful it becomes.

Interestingly, a Petals client can even run on the free tier of Google Colab, circumventing the need for local hardware entirely.


Overcoming the Hardware Barrier

  • Running massive AI models requires expensive GPUs, limiting access for most people globally.
  • Petals allows models to run on consumer devices by distributing processing through peer-to-peer technology.
  • Small pieces of models are stored across individual devices to form a collective network.
  • This means AI can be democratized and made available to everyone.

The Power of Open Source

  • Petals currently supports open source models like LLaMA and Bloom.
  • These models are free, transparent and easy to use.
  • By distributing processing, Petals unlocks the potential of these massive open-source models.
  • Anyone can contribute computing power and benefit from the collective network.

Enabling New Possibilities

  • A decentralized architecture could enable more complex models.
  • Mixture-of-experts models (an architecture GPT-4 is widely rumored to use) might thrive in this environment.
  • Multiple models can interact without centralized dependencies.
  • This breakthrough could unlock new horizons in AI development.

Boosting Contribution through Blockchain

  • Contributors could be rewarded for providing computing power via blockchain.
  • Tokens incentivize donation of idle GPU time to the network.
  • Priority access to the network could be given based on contribution.
  • This mutually beneficial ecosystem maximizes network potential.

Operational Efficiency of Petals

Petals has shown remarkable efficiency, achieving five to six tokens per second on the 65-billion-parameter LLaMA model. This surpasses what even the best consumer graphics cards can manage, since no single consumer card can hold such a model on its own.
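To put that in perspective, at five tokens per second a 300-token answer (a couple of hundred words) arrives in about a minute, which is slower than a hosted API but workable for interactive use.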

Users can participate in the Petals network as a client, a server, or both. As a client, you use the network to run inference on or fine-tune models. As a server, you contribute your hardware to help serve parts of the models.

The ease of use and efficiency are impressive: a few lines of Python are enough to leverage the power of Petals, whether as a client or a server. As the project develops, we can hope to see developers add point-and-click clients that remove the need for any programming at all.
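As a rough illustration, the client-side workflow looks something like the sketch below, adapted from the examples in the project's documentation. Exact class names, hosted model identifiers, and server flags vary between Petals versions, so treat this as a sketch rather than a copy-paste recipe.

```python
# Client side: use a model hosted by the public Petals swarm.
# Assumes Petals is installed (e.g. `pip install petals`, or the install
# command given in the GitHub repository).
import torch
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "bigscience/bloom"  # pick a model currently served by the swarm
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A decentralized network of GPUs can", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(inputs["input_ids"], max_new_tokens=30)
print(tokenizer.decode(outputs[0]))

# Server side: contribute your own GPU by serving a slice of the model,
# roughly `python -m petals.cli.run_server <model_name>` on the command line
# (check the repository for the flags supported by your version).
```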

Empowering All Devices Through Hardware Independence

One especially exciting possibility unlocked by Petals is the concept of truly hardware-independent AI. By distributing processing across consumer devices, Petals liberates AI from constraints imposed by specialized, expensive hardware. This hardware independence opens the door for AI to become accessible across all devices:

  • Low-powered mobile devices and edge devices can participate in AI advancement. Limited internal computing power ceases to be a barrier.
  • Instead of reliance on high-end GPUs or tensor processing units in data centers, even basic computers and smart devices can contribute meaningfully.
  • Each device gives what it can, and the collective network empowers AI progress. Processing power becomes decentralized.
  • All devices, regardless of their native capabilities, can leverage the shared collective network to benefit from state-of-the-art AI.

On the user's end, the capacity to run advanced AI models on any device, regardless of its computational power, can unlock new functionalities and improvements. Smartphones, smart appliances, and even older computers can leverage AI for tasks such as predictive text, voice recognition, or data analysis. Furthermore, smaller devices with limited processing power such as IoT sensors can utilize AI for real-time decision-making, making them smarter and more autonomous.

By removing hardware limitations, Petals brings collaborative AI to the edges - into homes, businesses, communities. Devices of all types can participate and unlock value. This hardware-agnostic paradigm promises to accelerate and democratize AI in exciting new ways.

Achieving Blazing Speeds

  • Petals achieves up to 5-6 tokens per second on a 65-billion-parameter model.
  • This outpaces even the best consumer GPUs, which cannot fit such large models on their own.
  • Processing is distributed efficiently, maximizing the capabilities of each device.
  • More contributors lead to greater overall speed and capabilities.

The Potential and Challenges of Petals

The potential of Petals is vast, particularly due to its decentralized nature. However, there are still obstacles. The distributed model and peer-to-peer computing require individuals to donate their idle resources to the broader network, which raises the question of incentives.

A potential solution could be blockchain technology, where each contributing node is rewarded for its compute power, encouraging contribution to the network. This could be structured so that rewards grant priority or greater usage of the network, and could potentially even be traded for monetary value.
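None of this exists in Petals today, but a back-of-the-envelope sketch shows how simple the accounting could be. Everything below (the ledger, the token unit, the priority rule) is hypothetical and only meant to make the idea concrete.

```python
# Hypothetical contribution ledger. Petals has no built-in token or reward
# system; this only illustrates how compute donations could earn priority.
from collections import defaultdict

TOKENS_PER_GPU_HOUR = 10  # made-up exchange rate

class ContributionLedger:
    def __init__(self):
        self.balances = defaultdict(float)

    def record_contribution(self, node_id: str, gpu_hours: float) -> None:
        """Credit a node with tokens for the compute it donated."""
        self.balances[node_id] += gpu_hours * TOKENS_PER_GPU_HOUR

    def priority(self, node_id: str) -> float:
        """A larger balance could translate into higher scheduling priority."""
        return self.balances[node_id]

ledger = ContributionLedger()
ledger.record_contribution("node-a", gpu_hours=12.5)
ledger.record_contribution("node-b", gpu_hours=3.0)
print(sorted(ledger.balances, key=ledger.priority, reverse=True))  # node-a first
```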

Currently, Petals supports the Bloom and LLaMA models, bringing this revolutionary technology to the open-source community. The code implementation is incredibly user-friendly, simplifying the process of running large models.

Towards a FOSS and Decentralized Future

Petals signifies a step towards a fully decentralized AI world, overcoming hardware limitations and accessibility barriers. While challenges remain, it offers an exciting possibility for AI and opens the door for AI enthusiasts, researchers, and developers across the globe to contribute to and benefit from the AI revolution.

As we further explore and develop this technology, the dream of running large language models on any device is turning into a reality. With more advancements on the horizon, the potential impact on global AI access and utilization is immense.

In envisioning the future of technology and computing, one can't help but consider the profound implications of merging open-source AI, peer-to-peer technology, and blockchain.

This three-way combination could form the foundation of an evolutionary leap in digital technology, changing the course of artificial intelligence, blockchain, and computing forever.

Hardware-Independent AI

As I mentioned before, removing dependencies on specialized hardware would make AI universally accessible. Models could run on any device, allowing anyone to leverage and contribute to AI advancement.

Accelerated AI Capabilities

Decentralized processing power can rapidly accelerate AI capabilities. Without hardware limitations, models can grow ever larger and more advanced.

Democratized AI Development

Open collaboration on AI via open source and decentralized networks makes AI more transparent. Anyone can participate in shaping its evolution.

Renewed Computing Paradigms

Cloud computing moved processing from local devices to centralized servers. Combining open source AI and P2P networks like Petals completes the cycle back to distributed edge computing.

Mainstream Blockchain Adoption

Integrating incentives via blockchain could finally bring blockchain into the mainstream. Utility fuels adoption far more than speculation.

Amplifying AI's Future Through Open-Source, Peer-to-peer, and Blockchain

The fusion of these three technologies can unleash the true potential of AI in unprecedented ways.

Open-source AI enables collaboration, transparency, and rapid advancements. When combined with the decentralized distribution of peer-to-peer technology, it can lead to a robust and globally accessible AI ecosystem.

Additionally, incorporating blockchain can establish a layer of trust, ensuring the security and integrity of AI algorithms and their actions.

Hardware-independent AI—that is, AI capable of operating on any device—could become a reality with this fusion. By decoupling AI from the constraints of specific hardware, such AI can reach more users, devices, and systems, thereby maximizing its impact.

This universality can significantly enhance AI's ability to learn from diverse datasets, fostering a rapid evolution of AI models and applications.

Transforming Computing and Technology Landscape

The implications of this fusion extend beyond AI and blockchain, potentially transforming the entire computing and technology landscape.

Hardware-independent AI could enable every device, regardless of its computational power, to contribute to and benefit from advanced AI models.

Moreover, the concept of decentralization, a shared principle among these technologies, could become a cornerstone of future digital systems.

Blockchain's secure and transparent nature, combined with the decentralized distribution power of peer-to-peer technology, can redefine digital trust and data management.

The integration of these technologies could lead to a self-sustaining ecosystem where AI models are developed, distributed, and improved continuously by a global community.

In this ecosystem, users are incentivized to contribute by a blockchain-based reward system, similar to Bitcoin mining.

This could cultivate a democratized, global AI network that's not only revolutionary but also self-propagating.

Incentivizing Participation Through Blockchain

The decentralized nature of Petals inherently poses a crucial challenge: how to incentivize users to donate their idle computing resources to the network.

A significant proportion of GPU time remains idle, and leveraging these untapped resources could significantly boost the Petals network.

However, users need compelling incentives to join this movement and contribute their computational power. The solution might just lie in an unlikely companion - blockchain technology.

Blockchain technology, most widely known for powering cryptocurrencies like Bitcoin, relies on a mechanism for validating transactions known as proof of work.

The proof-of-work model requires participants (or 'miners') to perform a certain amount of computational work to validate transactions and create new blocks in the blockchain. This proof-of-work process consumes computing power, which is precisely what the Petals network requires to function effectively.
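For readers unfamiliar with the mechanism, proof of work boils down to a brute-force search for a value whose hash meets a difficulty target. The toy sketch below illustrates the idea only; real blockchains use far higher difficulty and additional consensus rules.

```python
# Toy proof-of-work: find a nonce so that SHA-256(block_data + nonce)
# starts with a given number of zero hex digits.
import hashlib

def proof_of_work(block_data: str, difficulty: int = 4) -> int:
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce  # this nonce is the "proof" that work was done
        nonce += 1

print(proof_of_work("block #42: alice pays bob 1 token"))
```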

Now, imagine intertwining blockchain and Petals (or any similar protocols). By integrating a proof-of-work model into Petals, users can contribute their idle computational power to validate transactions on a blockchain network.

In exchange for this work, they can be rewarded with cryptocurrencies such as Bitcoin. This concept effectively transforms idle computing power into actual value, creating a robust incentive for users to contribute to the Petals network.

This approach merges two of the most impactful technological advances in recent years - blockchain and artificial intelligence. It takes advantage of the secure, decentralized model of blockchain to fuel the power needs of large AI models, creating an innovative, functional solution.

This exciting proposition not only incentivizes participation in the Petals network but also revolutionizes how we perceive and utilize idle computational power.

By marrying blockchain and AI in this way, Petals, or any similar protocol, could potentially lead to a new era of decentralized computing, opening up incredible opportunities for advancements in AI research and application.

This integration could also spark new developments in the way we structure and reward network participation, shaping the future of distributed computing.

Towards an Accessible and Equitable AI Future

The emergence of Petals marks a decisive leap forward in democratizing artificial intelligence. By tapping into the combined strengths of open-source software, peer-to-peer decentralized computing, and blockchain-powered incentives, Petals provides a pathway to make state-of-the-art AI accessible to all.

The limitations of existing centralized, proprietary models are left behind through this innovative approach. Hardware barriers crumble away as even basic devices can meaningfully contribute and benefit. Open collaboration accelerates advancement. Blockchain integration motivates broader involvement.

Most exciting is the promise of hardware-independent AI. The playing field is leveled when any device, from smartphones to sensors, can participate in progress. This dream is now closer than ever before.

Of course, challenges remain. Contribution must be incentivized, security assured, and models carefully crafted to be transparent and beneficial. But the door is now wide open to a more equitable AI landscape where anyone can stand at the frontier.

Petals lights the way forward to an AI ecosystem powered by the people, for the people. As we continue exploring its possibilities, one thing is certain - the future looks brighter than ever.

Further reading

  • Petals project page (decentralized inference and finetuning of large language models): https://petals.ml/
  • Petals on GitHub (run large language models at home, BitTorrent-style; fine-tuning and inference up to 10x faster than offloading): https://github.com/bigscience-workshop/petals