09 July 2021

Announcing Android’s updateable, fully integrated ML inference stack

Posted by Oli Gaymond, Product Manager, Android ML

On-Device Machine Learning provides lower latency, more efficient battery usage, and features that do not require network connectivity. We have found that development teams deploying on-device ML on Android today encounter these common challenges:

Many apps are size constrained, so having to bundle and manage additional libraries just for ML can be a significant cost
Unlike server-based ML, the compute environment is highly heterogeneous, resulting in significant differences in performance, stability and accuracy
Maximising reach can lead to using older more broadly available APIs; which limits usage of the latest advances in ML.

To help solve these problems, we’ve built Android ML Platform - an updateable, fully integrated ML inference stack. With Android ML Platform, developers get:

Built in on-device inference essentials - we will provide on-device inference binaries with Android and keep them up to date; this reduces apk size
Optimal performance on all devices - we will optimize the integration with Android to automatically make performance decisions based on the device, including enabling hardware acceleration when available
A consistent API that spans Android versions - regular updates are delivered via Google Play Services and are made available outside of the Android OS release cycle

Built in on-device inference essentials - TensorFlow Lite for Android

TensorFlow Lite will be available on all devices with Google Play Services. Developers will no longer need to include the runtime in their apps, reducing app size. Moreover, TensorFlow Lite for Android will use metadata in the model to automatically enable hardware acceleration, allowing developers to get the best performance possible on each Android device.

Optimal performance on all devices - Automatic Acceleration

Automatic Acceleration is a new feature in TensorFlowLite for Android. It enables per-model testing to create allowlists for specific devices taking performance, accuracy and stability into account. These allowlists can be used at runtime to decide when to turn on hardware acceleration. In order to use accelerator allowlisting, developers will need to provide additional metadata to verify correctness. Automatic Acceleration will be available later this year.

A consistent API that spans Android versions

Besides keeping TensorFlow Lite for Android up to date via regular updates, we’re also going to be updating the Neural Networks API outside of OS releases while keeping the API specification the same across Android versions. In addition we are working with chipset vendors to provide the latest drivers for their hardware directly to devices, outside of OS updates. This will let developers dramatically reduce testing from thousands of devices to a handful of configurations. We’re excited to announce that we’ll be launching later this year with Qualcomm as our first partner.

Sign-up for our early access program

While several of these features will roll out later this year, we are providing early access to TensorFlow Lite for Android to developers who are interested in getting started sooner. You can sign-up for our early access program here.

Android Android ML Featured latest machine learning TensorFlow TensorFlow Lite

Announcing Android’s updateable, fully integrated ML inference stack

Google developers blog

Connect

Subscribe

Announcing Android’s updateable, fully integrated ML inference stack

Google developers blog

Connect

Subscribe

Feed

Newsletter