Running Machine Learning Models on Android Devices - Issue #9
Build in Python, export and run (on-device) on Android
In this issue we will discuss how to run machine learning models locally on an Android mobile device. We will explore:
Why on-device machine learning is interesting
Building Android apps with Jetpack Compose
How to build and train a simple neural network using TensorFlow
How to convert the model to the TensorFlow Lite format, which is optimized to run on resource-constrained devices (microcontrollers, microprocessors)
How to integrate the TensorFlow Lite model in your Android app and make predictions!
Let's dive in!
Note: This newsletter does not include code snippets for the tasks above. Please see the full post and links to sample code here.
Why Machine Learning On-Device?
There are several benefits to running a machine learning model locally on a device. Some of these include:
Privacy: The input data for the model does not leave the device, which enables better data privacy and security.
Latency: The prediction is executed locally and does not require a network round trip.
Access to real-time sensor data: The model can be used to build interactive experiences that take advantage of local sensor data e.g. accelerometer, gyroscope, microphone, etc.
Cost: The overall cost of deploying the model is reduced for the developer, as you do not need to host servers that serve predictions.
Efficiency: The model can be trained on a large dataset on servers, and the resulting model compressed (via quantization or pruning) to run on edge devices. Note that some use cases and models might not fit this approach.
Building An Android App
As of this writing, there are probably over a dozen ways to build Android apps. These include approaches like Android native, React Native, WebViews, PWAs, Flutter, and a host of other cross-platform app development tools. Not just that: even for native Android apps, you can choose to develop in multiple languages - Java, Kotlin, C/C++. And yes, you read that right, you can develop Android apps in C/C++ using the Android Native Development Kit (NDK).
After reviewing several options, I ended up going with Jetpack Compose for the following reasons:
Jetpack Compose is a native Android UI library written in Kotlin, the preferred language for Android development. This means it should be possible to closely tie UI interactions with low-level ML code as needed.
Compose supports state-based UI design and complex animations that can create the feel of quality. Based on the range of showcase apps I have seen built with Compose, the library appears to be both versatile and flexible.
Compose promises a declarative approach where components are represented by Composable functions that render UI based on state variables. When state is updated, the Composable function is called again with the new data. Feels like React! Win for familiarity! (See the sketch after this list.)
Compose is supported by the Material design ecosystem. Material is a design system created by Google to help teams build high-quality digital interfaces for Android, iOS, Flutter, and the web. Material Components (buttons, cards, switches, etc.) and layouts like Scaffold are available as composable functions and can be used as excellent, performant defaults in your app.
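To make that declarative, state-driven pattern concrete, here is a minimal sketch (not taken from the post's code; the HorsepowerField composable is my own, illustrative name) showing a text field whose UI re-renders whenever its state changes:

```kotlin
import androidx.compose.foundation.layout.Column
import androidx.compose.material3.Text
import androidx.compose.material3.TextField
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue

// Minimal, illustrative composable: the UI is a pure function of `horsepower`.
// Typing into the TextField updates the state, which triggers recomposition.
@Composable
fun HorsepowerField() {
    var horsepower by remember { mutableStateOf("") }

    Column {
        TextField(
            value = horsepower,
            onValueChange = { horsepower = it },
            label = { Text("Horsepower") }
        )
        Text(text = "Current value: $horsepower")
    }
}
```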
Problem Definition, Model Building
The problem I chose to address in this example focuses on predicting miles per gallon for cars using the popular cars dataset. It is based on the "Predict fuel efficiency" tutorial from tensorflow.org. The dataset contains information about cars, including horsepower, weight, acceleration, etc., along with their miles per gallon.
The model is then converted to TensorFlow Lite format using the TensorFlow Lite converter. A TensorFlow Lite model is represented in a special efficient portable format known as FlatBuffers (identified by the .tflite file extension). This provides several advantages over TensorFlow's protocol buffer model format, such as reduced size (small code footprint) and faster inference (data is directly accessed without an extra parsing/unpacking step), which enable TensorFlow Lite to execute efficiently on devices with limited compute and memory resources. TensorFlow Lite models can then be executed by the TensorFlow Lite interpreter.
TensorFlow Lite uses CPU kernels that are optimized for the ARM NEON instruction set, but it also supports faster execution on GPUs, TPUs, and DSPs via TensorFlow Lite delegates. Delegates are available on both Android and iOS.
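As a rough illustration (not from the post), registering a GPU delegate with the TensorFlow Lite interpreter on Android looks roughly like this; the function name and model file are placeholders, and the org.tensorflow:tensorflow-lite-gpu dependency is assumed:

```kotlin
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.gpu.GpuDelegate
import java.io.File

// Illustrative sketch: run supported ops on the GPU by registering a
// GpuDelegate with the interpreter options; unsupported ops fall back to
// the optimized CPU kernels.
fun buildInterpreter(modelFile: File): Interpreter {
    val gpuDelegate = GpuDelegate() // requires the tensorflow-lite-gpu dependency
    val options = Interpreter.Options().addDelegate(gpuDelegate)
    return Interpreter(modelFile, options)
}
```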
Building Android UIs in Jetpack Compose
The Android Studio IDE makes the process of creating Jetpack Compose apps easy by including project templates (it provides a Compose + Material 3 template). Once the project is created, a MainActivity.kt file is generated with a basic layout. Following this, you can start adding components to the screen and previewing changes in the design preview window.
Overall, I followed this tutorial on Jetpack Compose basics to get familiar with basic layout (rows and columns), state, state hoisting, persisting state, styling, and theming. The UI above is created by stacking components (several labels, text fields, and a button) using the Row and Column layout components in Compose.
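As an illustrative sketch of that stacked layout (the MpgInputForm composable and its parameter names are my own, not the post's code), a hoisted-state version of the input form might look roughly like this:

```kotlin
import androidx.compose.foundation.layout.Column
import androidx.compose.material3.Button
import androidx.compose.material3.Text
import androidx.compose.material3.TextField
import androidx.compose.runtime.Composable

// Illustrative sketch of the stacked input form: labels, text fields, and a
// button arranged in a Column. State is hoisted, so the caller owns the field
// values and the click handler; this composable only renders UI and reports events.
@Composable
fun MpgInputForm(
    horsepower: String,
    weight: String,
    onHorsepowerChange: (String) -> Unit,
    onWeightChange: (String) -> Unit,
    onPredictClick: () -> Unit
) {
    Column {
        Text("Horsepower")
        TextField(value = horsepower, onValueChange = onHorsepowerChange)
        Text("Weight")
        TextField(value = weight, onValueChange = onWeightChange)
        Button(onClick = onPredictClick) {
            Text("Predict MPG")
        }
    }
}
```

A parent composable (or a ViewModel-backed screen) would hold the actual state and pass it down, which keeps the form easy to preview and test.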
Importing the TensorFlow Lite Model
To import your exported model, right-click on your project in Android Studio and select New -> Other -> TensorFlow Lite Model. This will create a new folder named ml in your project and import the model along with a sample code structure for predictions. Android Studio will show the input/output signature of the loaded model (useful for verifying the model was exported properly) and sample code to get predictions from your .tflite model with input data.
This code can then be modified - mostly with details on how to convert your input data source to the byte buffers that the TensorFlow Lite interpreter expects. See the code screenshot in the full post.
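For illustration only, a modified version of that generated code might look roughly like the sketch below. The generated class name AutoMpgModel is hypothetical (it depends on your model file name), and the key step is packing the nine input floats into a direct ByteBuffer in native byte order:

```kotlin
import android.content.Context
import org.tensorflow.lite.DataType
import org.tensorflow.lite.support.tensorbuffer.TensorBuffer
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Illustrative sketch: pack the 9 input features into a direct ByteBuffer,
// feed it to the Android Studio-generated model wrapper, and read back the
// predicted miles per gallon. `AutoMpgModel` is a hypothetical generated class.
fun predictMpg(context: Context, features: FloatArray): Float {
    require(features.size == 9) { "Model expects 9 input features" }

    // 9 floats * 4 bytes each, in native byte order, as the interpreter expects.
    val inputBuffer = ByteBuffer.allocateDirect(features.size * 4)
        .order(ByteOrder.nativeOrder())
    features.forEach { inputBuffer.putFloat(it) }

    val model = AutoMpgModel.newInstance(context)
    val inputFeature = TensorBuffer.createFixedSize(intArrayOf(1, 9), DataType.FLOAT32)
    inputFeature.loadBuffer(inputBuffer)

    val output = model.process(inputFeature).outputFeature0AsTensorBuffer
    model.close()
    return output.floatArray[0]
}
```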
Final Notes
TensorFlow Lite provides a smooth experience for training models in Python and deploying them on-device for inference on edge devices (e.g., Raspberry Pi, Arduino, Edge TPUs, smartphones, etc.). It is also well supported and actively developed.
The biggest challenge in this process lies in correctly transforming your input features captured on device (e.g., audio from a microphone, images from the camera, input from the UI, etc.) to the format that the TensorFlow Lite interpreter on Android expects. In this case, the model expects a 9-dimensional array of floats, so this is straightforward. Models with text, audio, or image data will be a bit more complex (manually converting that data to float buffers).
See the full post for links to sample code here.
Outro: I hope you are all doing well! Sending good vibes!