Introduction

Overview

Kuzco is a distributed GPU cluster built on the Solana blockchain, designed to facilitate efficient and cost-effective inference of large language models (LLMs) such as Llama2 and Mistral. By leveraging the power of idle compute resources contributed by network participants, Kuzco enables users to access and utilize these models through an OpenAI-compatible API.

⚠️

This documentation is a work in progress. Join Discord for more information.

Key Features

  • Distributed GPU Cluster: Kuzco harnesses the collective power of GPUs across the network, allowing for scalable and efficient LLM inference.
  • Solana Integration: Built on the Solana blockchain, Kuzco benefits from its high-performance, low-latency, and cost-effective infrastructure.
  • Idle Compute Utilization: Network participants can contribute their idle compute power and earn rewards for their contributions.
  • OpenAI-Compatible API: Kuzco provides an API that is compatible with OpenAI, making it easy for developers to integrate and utilize popular LLMs like Llama2 and Mistral.
  • Cost-Effective: By leveraging idle compute resources, Kuzco offers a cost-effective solution for LLM inference compared to traditional centralized approaches.

Getting Started

To get started with Kuzco, follow these steps:

  1. Installation: Install the necessary dependencies and set up your environment.
  2. Network Participation: Join the Kuzco network and configure your node to contribute idle compute power.
  3. API Integration: Integrate the Kuzco API into your application to access and utilize LLMs like Llama2 and Mistral.
  4. Inference: Use the API to perform inference tasks and leverage the power of the distributed GPU cluster.

For detailed instructions and code examples, refer to the respective sections of the documentation.