How to Add Machine Learning to an Android App

Gursimar Singh
9 min read · Aug 26, 2023

Introduction

In today’s digital age, machine learning (ML) has become an integral part of many applications, enhancing user experience and functionality. Android, being one of the most popular mobile platforms, offers a plethora of tools and libraries to integrate ML. This guide delves deep into on-device machine learning for Android apps, its advantages, and how to implement it.

Why add machine learning to your Android app?

Machine learning can be used to add a variety of features to your Android app, such as:

  • Object detection
  • Gesture recognition
  • Face recognition
  • Speech recognition
  • Natural language processing
  • Recommendation systems
  • Fraud detection
  • Spam detection
  • Personalization

Machine learning can also be used to improve the performance of existing features in your app, such as:

  • Search
  • Recommendations
  • Translation
  • Image classification
  • Fraud detection

What is On-Device Machine Learning?

On-device machine learning is a paradigm shift from traditional cloud-based ML. It allows:

  • Direct Processing: ML tasks are performed directly on the device, such as smartphones or tablets.
  • Independence from the Cloud: It operates without the need for a constant connection to the cloud.
  • Local Data Processing: Data is processed locally, ensuring faster response times and immediate inference. Inference, in this context, refers to the process of applying a trained machine learning model to make predictions or draw conclusions from input data.

Why Opt for On-Device Machine Learning?

Here are several compelling reasons to choose on-device ML:

  • Low Latency: On-device ML can provide faster response times and a smoother user experience by eliminating the need to send data to the cloud and wait for the results. This can be especially useful for real-time applications such as gaming, augmented reality, etc.
  • Data Privacy: On-device ML can ensure that sensitive data stays on the device and is not exposed to third parties or hackers. This can be important for applications that deal with personal or confidential information, such as health, finance, etc.
  • Offline Support: On-device ML can work without relying on network availability or quality. This can be beneficial for applications that operate in remote or low-connectivity areas or in scenarios where the user wants to save data or battery.
  • Cost Saving: On-device ML can reduce cloud computing and storage expenses by minimizing the amount of data transferred and processed on the cloud. This can be advantageous for applications that generate or consume large volumes of data, such as video streaming, social media, etc.

Applications of On-Device ML

On-device ML can be applied in various domains, for example:

  • Object Detection: On-device object detection can detect multiple objects within images or videos, provide bounding-box information to highlight those objects, and classify them into predefined categories.
  • Gesture Detection: Particularly useful for gaming and interactive apps, on-device gesture detection can recognize specific finger gestures in real time, classify the type of gesture, and provide detailed fingertip points.

How do I add On-Device ML?

There are various tools and frameworks that can help us add on-device ML to our Android app. Some of the popular ones are:

  • ML Kit: A Google SDK that provides ready-to-use ML models and APIs for common tasks such as barcode scanning, text recognition, face detection, pose detection, etc. ML Kit also supports custom TensorFlow Lite models and AutoML Vision Edge models.
  • MediaPipe: A Google framework that enables building custom ML pipelines using pre-built components such as graphs, calculators, models, etc. MediaPipe also provides ready-to-use solutions for common tasks such as face mesh detection, hand tracking, object detection, etc.
  • TFLite: A Google library that allows running TensorFlow models on mobile devices with low latency and a small binary size. TFLite also supports hardware acceleration, model optimization, and metadata extraction.

ML Kit for Android

ML Kit is a mobile SDK that brings Google’s machine learning expertise to Android apps.

Core Features of ML Kit

  • Cross-Platform: ML Kit is designed for both Android and iOS, ensuring a broader reach.
  • Pre-trained Models: Developers can utilize Google’s pre-trained models for various tasks.
  • Custom Model Support: While the pre-trained models are efficient, ML Kit also supports custom TensorFlow Lite models for specialized needs.
  • On-device & Cloud-based Processing: ML Kit offers the flexibility of running models on-device for real-time, offline processing, or in the cloud for higher accuracy.

Google Code Scanner

Allows apps to read barcodes without requiring camera permission. It performs on-device ML inference, and the user interface is provided by Google Play Services.

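Here's a minimal sketch of launching the scanner, assuming the play-services-code-scanner dependency is on the classpath and context is the calling Activity:

import com.google.mlkit.vision.codescanner.GmsBarcodeScanning

// Launch the scanner; camera handling and UI come from Google Play services,
// so the app itself needs no camera permission
val scanner = GmsBarcodeScanning.getClient(context)
scanner.startScan()
    .addOnSuccessListener { barcode ->
        // rawValue holds the decoded content (may be null for binary payloads)
        val result = barcode.rawValue
    }
    .addOnCanceledListener {
        // User backed out of the scanner UI
    }
    .addOnFailureListener { e ->
        // Scanning failed
    }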

Text Recognition v2

Recognizes text in images and video frames across scripts such as Latin, Chinese, Devanagari, Japanese, and Korean; integration code appears in the Text Recognition section below.

Document Scanner

  • High-quality document scanner flow
  • Powered by Google Play Services
  • Processing happens on-device, preserving user privacy

Face Mesh Detection

Detects and maps facial features.

Here’s how you can integrate it:

// Create the face mesh detector with the default option (full FACE_MESH mode)
val defaultDetector = FaceMeshDetection.getClient(
    FaceMeshDetectorOptions.DEFAULT_OPTIONS
)

// Or create the detector for the bounding-box-only use case
val boundingBoxDetector = FaceMeshDetection.getClient(
    FaceMeshDetectorOptions.Builder()
        .setUseCase(FaceMeshDetectorOptions.BOUNDING_BOX_ONLY)
        .build()
)

// image is an InputImage built from a Bitmap, media.Image, byte buffer, or file
defaultDetector.process(image)
    .addOnSuccessListener { result ->
        // Task completed successfully: each detected FaceMesh exposes a
        // bounding box and, in FACE_MESH mode, 468 3D points
        // ...
    }
    .addOnFailureListener { e ->
        // Task failed with an exception
        // ...
    }

Pose Detection

Detects human postures in real-time, quickly identifying a full-body, 33-point skeletal match.

Here’s how you can integrate it:

// Base pose detector with streaming frames, when depending on the pose-detection sdk
val options = PoseDetectorOptions.Builder()
    .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
    .build()

// Alternatively: accurate pose detector on static images, when depending on the
// pose-detection-accurate sdk (pick one of the two options objects)
val accurateOptions = AccuratePoseDetectorOptions.Builder()
    .setDetectorMode(AccuratePoseDetectorOptions.SINGLE_IMAGE_MODE)
    .build()

val poseDetector = PoseDetection.getClient(options)

// image is an InputImage built from a Bitmap, media.Image, byte buffer, or file
poseDetector.process(image)
    .addOnSuccessListener { results ->
        // Task completed successfully: results.allPoseLandmarks holds
        // the 33 detected skeletal points
        // ...
    }
    .addOnFailureListener { e ->
        // Task failed with an exception
        // ...
    }

Text Recognition

Recognizes text in various scripts.

Here’s how you can integrate it:

// When using the Latin script library
val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

// Alternatively, when using the Devanagari script library
val devanagariRecognizer = TextRecognition.getClient(
    DevanagariTextRecognizerOptions.Builder().build()
)

// image is an InputImage built from a Bitmap, media.Image, byte buffer, or file
recognizer.process(image)
    .addOnSuccessListener { visionText ->
        // Task completed successfully: visionText.text holds the full string,
        // visionText.textBlocks the structured blocks, lines, and elements
        // ...
    }
    .addOnFailureListener { e ->
        // Task failed with an exception
        // ...
    }

MediaPipe

MediaPipe is a framework for building cross-platform applied machine learning pipelines. It is known for:

  • Ease of use
  • Innovation
  • Speed

MediaPipe Solutions

MediaPipe Solutions offers pre-built, ready-to-use components that let developers compose on-device ML applications in minutes.

Steps to Compose On-Device ML Rapidly:

  1. Identify the Requirement: Determine the specific ML functionality needed for the app. For instance, if you’re developing a photo editing app, you might need features like object detection or image segmentation.
  2. Choose the Right Tool: Based on the requirement, select an appropriate tool or library. If you need gesture recognition, MediaPipe might be the best choice. For text recognition, ML Kit would be ideal.
  3. Integrate Pre-Built Solutions: Utilize the ready-to-use solutions or APIs provided by the chosen tool. This often involves adding a few dependencies and lines of code and ensuring the app has the necessary permissions (see the dependency sketch after this list).
  4. Test and Iterate: Once integrated, test the ML functionality thoroughly. Ensure it works in various scenarios and conditions. Gather user feedback and make necessary adjustments.
  5. Optimize for Performance: While rapid composition speeds up development, it’s essential to optimize the ML feature for performance. This might involve reducing the model size, ensuring low latency, or improving accuracy.
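
For step 3, the "few lines" usually start in the module-level build.gradle.kts. A hedged sketch, with version numbers that are illustrative placeholders to check against the latest releases:

// build.gradle.kts (module level) — versions are placeholders
dependencies {
    // ML Kit on-device text recognition (Latin script)
    implementation("com.google.mlkit:text-recognition:16.0.0")
    // MediaPipe Tasks vision (gesture recognition, object detection, etc.)
    implementation("com.google.mediapipe:tasks-vision:0.10.8")
}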

Gesture Recognition

Detects hand gestures in real-time, providing landmarks of the detected hands.

Here’s how you can integrate it:

// MP_RECOGNIZER_TASK is the gesture recognizer .task model file bundled in assets
val baseOptionsBuilder = BaseOptions.builder().setModelAssetPath(MP_RECOGNIZER_TASK)

val optionsBuilder =
    GestureRecognizer.GestureRecognizerOptions.builder()
        .setBaseOptions(baseOptionsBuilder.build())
        .setMinHandDetectionConfidence(minHandDetectionConfidence)
        .setMinTrackingConfidence(minHandTrackingConfidence)
        .setMinHandPresenceConfidence(minHandPresenceConfidence)
        .setResultListener(this::returnLivestreamResult)
        .setErrorListener(this::returnLivestreamError)
        .setRunningMode(RunningMode.LIVE_STREAM)

val options = optionsBuilder.build()
val gestureRecognizer = GestureRecognizer.createFromOptions(context, options)

// Wrap the current camera frame and timestamp it for the live stream
val mpImage = BitmapImageBuilder(rotatedBitmap).build()
val frameTime = SystemClock.uptimeMillis()

// Results arrive asynchronously via the result listener registered above
gestureRecognizer.recognizeAsync(mpImage, frameTime)

MediaPipe Studio

Lets developers evaluate and customize MediaPipe solutions, including defining model input and output requirements. For deeper customization, the companion MediaPipe Model Maker tool retrains solution models on your own data.

TensorFlow Lite — Custom ML Model on Android

TensorFlow Lite (TFLite) is an open-source deep learning framework developed by Google. It’s a lightweight version of TensorFlow, specifically designed for mobile and embedded devices. TFLite enables developers to run machine learning models on-device, ensuring faster inference and enhanced user experience. It offers:

  • An Android inference engine
  • Automatic updates via Google Play services
  • Reduced binary size, since the runtime ships with Play services

Why TensorFlow Lite?

  • Lightweight: TFLite is optimized for size, making it suitable for mobile devices with limited storage and computational capabilities.
  • Versatile: Supports a wide range of ML tasks, from image classification to natural language processing.
  • Cross-Platform: Compatible with both Android and iOS, allowing for broader application reach.
  • Offline Capabilities: Since inference happens on-device, apps powered by TFLite can function without an internet connection.

Key Features of TFLite

  1. Model Conversion: TFLite provides tools to convert trained TensorFlow models into a format optimized for mobile devices.
  2. Pre-trained Models: Offers a collection of ready-to-use models for common tasks, reducing development time.
  3. Customization: While pre-trained models are handy, TFLite also supports custom models tailored to specific needs.
  4. Neural Networks API Integration: On Android devices, TFLite can leverage the Android Neural Networks API (NNAPI) for hardware-accelerated inference, as sketched below.
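
As a rough sketch of point 4, assuming modelBuffer already holds a .tflite model as a MappedByteBuffer (loading is shown in the next section):

import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate

// Route supported ops to on-device accelerators (NPU/GPU/DSP) via NNAPI;
// modelBuffer is assumed to be a MappedByteBuffer holding the .tflite model
val nnApiDelegate = NnApiDelegate()
val interpreterOptions = Interpreter.Options().addDelegate(nnApiDelegate)
val interpreter = Interpreter(modelBuffer, interpreterOptions)

// ... run inference, then release native resources
interpreter.close()
nnApiDelegate.close()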

Implementing TFLite in Android Apps

Integrating TFLite into Android apps involves a few key steps:

  1. Model Preparation: Begin by either choosing a pre-trained model or training a custom TensorFlow model. Once ready, convert this model to the TFLite format using the TFLite Converter.
  2. Integration: Add the TFLite Android library to the app project. This library provides the necessary functionalities to run TFLite models.
  3. Model Deployment: Embed the TFLite model within the app’s assets folder. This ensures the model is packaged with the app during deployment.
  4. Inference: Utilize the TFLite interpreter to run the model on input data and obtain predictions.
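
To make step 4 concrete, here's a minimal sketch assuming a classifier packaged as assets/model.tflite with a single float32 input tensor and a single float32 output tensor (the file name and shapes are illustrative):

import android.content.Context
import org.tensorflow.lite.Interpreter
import java.io.FileInputStream
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// Memory-map the model from assets. Keep the asset uncompressed
// (e.g. androidResources { noCompress += "tflite" }) or openFd() will fail.
fun loadModel(context: Context, assetName: String = "model.tflite"): MappedByteBuffer {
    val fd = context.assets.openFd(assetName)
    FileInputStream(fd.fileDescriptor).channel.use { channel ->
        return channel.map(FileChannel.MapMode.READ_ONLY, fd.startOffset, fd.declaredLength)
    }
}

// Run a single inference, assuming one float32 input and one float32 output
fun classify(context: Context, input: FloatArray, numClasses: Int): FloatArray {
    val output = Array(1) { FloatArray(numClasses) }
    Interpreter(loadModel(context)).use { interpreter ->
        interpreter.run(arrayOf(input), output)
    }
    return output[0]
}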

TensorFlow Lite in Google Play Services

Adoption since September 2022

  • 1B+ monthly users
  • 10K+ apps

ML Kit APIs

TensorFlow Lite in Google Play services powers multiple ML Kit APIs, including Barcode Scanning API, Language Identification, and Smart Reply.
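
For example, the Language Identification API exposes this stack behind a very small surface; a hedged sketch:

import com.google.mlkit.nl.languageid.LanguageIdentification

// Identify the BCP-47 language tag of a string, fully on-device
val languageIdentifier = LanguageIdentification.getClient()
languageIdentifier.identifyLanguage("Bonjour tout le monde")
    .addOnSuccessListener { languageCode ->
        if (languageCode == "und") {
            // Language could not be identified
        } else {
            // e.g. "fr"
        }
    }
    .addOnFailureListener { e ->
        // Model download or inference failed
    }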

By integrating these tools and libraries, developers can harness the power of machine learning directly on Android devices, offering users enhanced experiences and functionalities.

Integrating ML Kit APIs into Android Apps

The integration process is straightforward:

  1. Set Up Dependencies: ML Kit's on-device APIs are now standalone and no longer require a Firebase project; just add the relevant ML Kit dependencies to the Android app. (A Firebase project is only needed for extras such as custom model hosting.)
  2. Choose the API: Depending on the app’s requirement, select the appropriate ML Kit API.
  3. Implement the API: Follow the documentation to integrate the chosen API. For instance, for text recognition, capture image data, pass it to the Text Recognition API, and handle the returned text data (sketched after this list).
  4. Optimize & Test: Ensure the ML feature works seamlessly in various scenarios. Optimize for performance and accuracy.
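
A minimal sketch of steps 2–3 for text recognition, assuming bitmap and rotationDegrees come from the app's camera pipeline and handleRecognizedText is a hypothetical app-level handler:

import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

// Wrap a captured Bitmap for ML Kit; rotation comes from the camera sensor
val image = InputImage.fromBitmap(bitmap, rotationDegrees)

TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
    .process(image)
    .addOnSuccessListener { visionText ->
        // visionText.text is the full string; visionText.textBlocks
        // exposes blocks, lines, and elements with bounding boxes
        handleRecognizedText(visionText.text)  // hypothetical handler
    }
    .addOnFailureListener { e ->
        // Inference failed; log or surface the error
    }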

Conclusion

On-device ML is the future of mobile applications. With tools like ML Kit, MediaPipe, and TensorFlow Lite, Android developers are equipped to create intelligent apps that are fast, secure, and user-friendly. As ML continues to evolve, we can expect even more advanced features and easier integration methods in the future.
