Indonesian Political, Business & Finance News

This Powerbank-Sized AI Device Boasts Intelligence Equivalent to a Doctorate

Source: KOMPAS | Translated from Indonesian | Technology
Image: KOMPAS

The device is compact enough to fit in a user's pocket. Yet its capabilities are claimed to be equivalent to doctoral, or PhD, level intelligence.

The device is named AI Pocket Lab, developed by the US startup Tiiny AI.

The AI Pocket Lab is claimed to be the world's smallest artificial intelligence (AI) supercomputer, as it is only the size of a small powerbank, measuring 14.2 × 8 × 2.53 cm.

Although small, the AI Pocket Lab is capable of running large language models (LLMs), complex AI models with 120 billion parameters, locally, without an internet connection.

Because it can handle workloads of more than 100 billion parameters, Tiiny AI classifies the device as a supercomputer rather than a standard mini-PC or workstation.

This approach is considered far more practical than general-purpose AI models, which rely on data-centre infrastructure.

So how did Tiiny AI create the AI Pocket Lab?

The AI Pocket Lab is built around a 12-core ARM processor, similar to the processors commonly used in smartphones, laptops, and tablets.

The device is equipped with 80 GB of LPDDR5X RAM, far more than typical consumer laptops, which usually carry 8 GB to 32 GB.

Of that total, 48 GB is allocated specifically to the Neural Processing Unit (NPU), the chip that accelerates AI computation.
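A rough back-of-envelope check shows why that much memory matters for a 120-billion-parameter model. The quantisation bit-widths below are common choices for on-device inference, assumed here for illustration; they are not figures published by Tiiny AI:

```python
# Approximate weight-storage cost of a 120B-parameter model at
# common quantisation levels, compared against the device's 80 GB RAM.
# Bit-widths are illustrative assumptions, not Tiiny AI specifications.

PARAMS = 120e9   # 120 billion parameters
RAM_GB = 80      # total LPDDR5X RAM on the AI Pocket Lab

for bits in (16, 8, 4):
    gb = PARAMS * bits / 8 / 1e9   # params * bits -> bytes -> GB
    verdict = "fits" if gb <= RAM_GB else "does not fit"
    print(f"{bits}-bit weights: ~{gb:.0f} GB -> {verdict} in {RAM_GB} GB RAM")
```

On these assumptions, only aggressively quantised weights (around 4 bits per parameter, roughly 60 GB) would fit alongside activations and the operating system, which suggests why such devices lean on quantisation and other optimisations.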

The AI Pocket Lab can currently run AI models like GPT-OSS 120B, an AI model with 120 billion parameters from OpenAI.

The AI Pocket Lab delivers up to 190 TOPS (trillion operations per second) of computing power from the combined CPU and NPU.
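As an illustrative sanity check (this arithmetic is not from the article or from Tiiny AI), the 190 TOPS figure puts a compute-bound ceiling on text generation speed, assuming roughly two operations per parameter per generated token for a dense 120-billion-parameter model:

```python
# Compute-bound ceiling on token throughput implied by 190 TOPS.
# The "2 ops per parameter per token" rule of thumb and the dense-model
# assumption are illustrative; real throughput is usually limited by
# memory bandwidth and lands well below this ceiling.

TOPS = 190e12          # 190 trillion operations per second
OPS_PER_PARAM = 2      # one multiply + one add per weight per token
PARAMS = 120e9         # dense 120B-parameter model (assumption)

ceiling = TOPS / (OPS_PER_PARAM * PARAMS)
print(f"compute-bound ceiling: ~{ceiling:.0f} tokens/s")
```

The point of the exercise is scale, not precision: the quoted compute budget is plausibly sufficient for interactive generation on a model of this size, which is what the local-inference claim requires.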

To run large AI models on a small device, Tiiny AI incorporates several optimisation technologies.

One of them is TurboSparse, which allows large language models to run faster on resource-limited devices. The technique works by activating only the parts of the model's parameters required for each computation step.
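The underlying idea, activation sparsity, can be sketched in a few lines. This is a simplified illustration of the principle, not the actual TurboSparse algorithm: when a ReLU activation is zero, the matching rows of the next weight matrix contribute nothing and can be skipped entirely.

```python
import numpy as np

# Toy illustration of activation sparsity: skip weight rows whose
# input activation is exactly zero after ReLU. A simplified sketch
# of the principle behind techniques like TurboSparse, not the
# real implementation.

rng = np.random.default_rng(0)
hidden = np.maximum(rng.standard_normal(4096), 0.0)  # ReLU output
W_down = rng.standard_normal((4096, 1024))           # next-layer weights

# Dense path: touch every row of W_down.
dense_out = hidden @ W_down

# Sparse path: only touch rows with a non-zero activation.
active = np.nonzero(hidden)[0]
sparse_out = hidden[active] @ W_down[active]

assert np.allclose(dense_out, sparse_out)  # identical result
print(f"rows actually needed: {len(active)}/{len(hidden)}")
```

Here roughly half the rows are skipped with no change to the output; production systems push sparsity much further by training the model to produce more zero activations.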
