A new open-source project called Espresso allows developers to train and run transformer models directly on Apple's Neural Engine (ANE). The tool, published on GitHub by developer Christopher Karani, aims to make the ANE available for both training and inference on Apple devices.
Apple's ANE has been underutilized for transformer architectures, which power large language models and other advanced AI systems. Espresso addresses this gap with a framework that runs transformer workloads on the specialized neural processing hardware found in iPhones, iPads, and Macs, rather than on the CPU or GPU.
As of the initial release, the project supports basic transformer operations, including attention mechanisms and feed-forward layers. The repository does not yet include detailed performance benchmarks, so the promised gains in energy efficiency and inference speed for on-device models remain unquantified.
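For readers unfamiliar with the two operations named above, the following is a minimal pure-Python sketch of scaled dot-product attention and a ReLU feed-forward layer. It is a generic illustration that runs on the CPU, not Espresso's code or API, and the function names (attention, feed_forward, matmul) are hypothetical choices made for this example.

```python
import math


def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    # Plain nested-list matrix product: (n x k) times (k x m) gives (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(q, k, v):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


def feed_forward(x, w1, b1, w2, b2):
    # Position-wise feed-forward layer: linear, ReLU, linear.
    hidden = [[max(0.0, h + b) for h, b in zip(row, b1)] for row in matmul(x, w1)]
    return [[o + b for o, b in zip(row, b2)] for row in matmul(hidden, w2)]


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 1.0], [1.0, 1.0]]
    v = [[0.0, 2.0], [4.0, 6.0]]
    print(attention(q, k, v))
```

Because both keys in this example are identical, the query attends to them equally and the output is the average of the two value rows. A framework targeting the ANE would express the same arithmetic as operations the hardware can execute, which this sketch does not attempt to show.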
The tool is currently in early development and may require manual optimization for specific model architectures. Broader adoption could spur more privacy-preserving AI applications that run locally on Apple hardware without cloud dependencies.
Some developers note that Apple's own Core ML framework already provides ANE support, though Espresso offers a more direct and flexible interface for researchers and hobbyists seeking to experiment with custom transformers.