A Completely Offline Use of Whisper ASR and the LLaMA-2 GPT Model
Nowadays, nobody will be surprised by running a deep learning model in the cloud. But the situation can be much more complicated in the edge or consumer device world. There are several reasons for that. First, the use of cloud APIs requires devices to always be online. This is not a problem for a web service, but it can be a dealbreaker for a device that needs to be functional without Internet access. Second, cloud APIs cost money, and customers likely won't be happy to pay yet another subscription fee. Last but not least, after several years, the project may be finished, the API endpoints may be shut down, and the expensive hardware will turn into a brick. Which is naturally not friendly to customers, the ecosystem, or the environment. That's why I'm convinced that end-user hardware should be fully functional offline, without extra costs or reliance on cloud APIs (well, those can be optional but not mandatory).
In this article, I will show how to run a LLaMA GPT model and automatic speech recognition (ASR) on a Raspberry Pi. That will allow us to ask the Raspberry Pi questions and get answers. And as promised, all of this will work fully offline.
Let's get into it!
The code presented in this article is intended to work on the Raspberry Pi. But most of the methods (except the "display" part) will also work on a Windows, OSX, or Linux laptop. So, readers who don't have a Raspberry Pi can easily test the code without any problems.
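Since the same code should run on both a Raspberry Pi and an ordinary laptop, it is handy to detect the platform at startup and enable the Pi-specific "display" part only when it is actually available. A minimal sketch of such a check (the `is_raspberry_pi` helper is my own illustration, not part of any library; on Raspberry Pi OS the board name is exposed in the device tree):

```python
import platform


def is_raspberry_pi() -> bool:
    """Heuristic check: Raspberry Pi OS exposes the board name
    in /proc/device-tree/model; on other systems the file is absent."""
    try:
        with open("/proc/device-tree/model") as f:
            return "raspberry pi" in f.read().lower()
    except (FileNotFoundError, NotADirectoryError, OSError):
        return False


# Enable the Pi-only display code conditionally, so the rest of the
# project can be developed and tested on a regular laptop.
if is_raspberry_pi():
    print("Running on a Raspberry Pi: display enabled")
else:
    print(f"Running on {platform.system()}: display disabled")
```

This way, readers without a Raspberry Pi can run everything else unchanged.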
Hardware
For this project, I will be using a Raspberry Pi 4. It's a single-board computer running Linux; it's small and requires only 5V DC power, without fans or active cooling:
A newer 2023 model, the Raspberry Pi 5, should be even better; according to benchmarks, it's almost 2x faster. But it is also almost 50% more expensive, and for our test, the model 4 is good enough.