20 May 2025
We are excited to announce a set of on-device GenAI APIs, as part of ML Kit, to help you integrate Gemini Nano in your Android apps.
To start, we are releasing four new APIs: Summarization, Proofreading, Rewriting, and Image Description.
GenAI APIs are high-level APIs that allow for easy integration, similar to existing ML Kit APIs. This means you can expect quality results out of the box without extra effort for prompt engineering or fine-tuning for specific use cases.
Because GenAI APIs run on-device, data is processed locally, inference works without a network connection, and there is no server-side inference cost.
To prevent misuse, we also added safety protection at multiple layers, including base model training, safety-aware LoRA fine-tuning, input and output classifiers, and safety evaluations.
There are four main components that make up each of the GenAI APIs: the Gemini Nano base model, feature-specific LoRA adapter models produced by fine-tuning, the on-device AICore runtime, and an evaluation pipeline.
Together, these components make up the high-level GenAI APIs that simplify the effort needed to integrate Gemini Nano in your Android app.
For each API, we formulate a benchmark score based on the evaluation pipeline mentioned above. This score is based on attributes specific to a task. For example, when evaluating the summarization task, one of the attributes we look at is "grounding" (i.e., factual consistency of the generated summary with the source content).
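To build intuition for what an attribute like grounding measures, a toy check might score the fraction of a summary's content words that also appear in the source. This is a hypothetical sketch for illustration only, not the actual evaluation pipeline:

```kotlin
// Hypothetical grounding score: fraction of summary content words
// that also appear in the source text. For intuition only; the real
// evaluation pipeline uses task-specific attribute raters.
fun toyGroundingScore(source: String, summary: String): Double {
    val tokenize = { s: String ->
        s.lowercase().split(Regex("\\W+")).filter { it.length > 3 }.toSet()
    }
    val sourceWords = tokenize(source)
    val summaryWords = tokenize(summary)
    if (summaryWords.isEmpty()) return 0.0
    return summaryWords.count { it in sourceWords }.toDouble() / summaryWords.size
}

fun main() {
    val source = "GenAI APIs run on-device and provide quality results out of the box."
    // Every content word of this summary appears in the source -> 1.0
    println(toyGroundingScore(source, "GenAI APIs provide quality results"))
    // No content word appears in the source -> 0.0
    println(toyGroundingScore(source, "Bananas cure everything overnight"))
}
```

A production metric would use semantic comparison rather than word overlap, but the shape is the same: a per-attribute score aggregated into the benchmark numbers below.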
To provide out-of-the-box quality for GenAI APIs, we applied feature-specific fine-tuning on top of the Gemini Nano base model. This resulted in an increase in the benchmark score of each API, as shown below:
| Use case in English | Gemini Nano Base Model | ML Kit GenAI API |
|---|---|---|
| Summarization | 77.2 | 92.1 |
| Proofreading | 84.3 | 90.2 |
| Rewriting | 79.5 | 84.1 |
| Image Description | 86.9 | 92.3 |
In addition, here is a quick reference for how the APIs perform on a Pixel 9 Pro:

| | Prefix Speed (input processing rate) | Decode Speed (output generation rate) |
|---|---|---|
| Text-to-text | 510 tokens/second | 11 tokens/second |
| Image-to-text | 510 tokens/second + 0.8 seconds for image encoding | 11 tokens/second |
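These throughput figures translate into a rough end-to-end latency estimate: summarizing a 1,000-token article into a 50-token summary would take about 1000/510 ≈ 2.0 s of input processing plus 50/11 ≈ 4.5 s of decoding, roughly 6.5 s total. The token counts in this sketch are illustrative assumptions, not measured values:

```kotlin
// Rough latency estimate from the Pixel 9 Pro throughput table above.
// Input/output token counts are illustrative assumptions.
fun estimateLatencySeconds(
    inputTokens: Int,
    outputTokens: Int,
    prefixTokensPerSec: Double = 510.0,
    decodeTokensPerSec: Double = 11.0,
    imageEncodingSec: Double = 0.0, // use 0.8 for image-to-text
): Double = inputTokens / prefixTokensPerSec +
    outputTokens / decodeTokensPerSec +
    imageEncodingSec

fun main() {
    val seconds = estimateLatencySeconds(inputTokens = 1000, outputTokens = 50)
    println("Estimated text-to-text latency: %.1f s".format(seconds)) // ~6.5 s
}
```

Note that decode speed dominates: trimming the requested output length (for example, a one-bullet summary) reduces latency far more than shortening the input.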
This is an example of implementing the GenAI Summarization API to get a one-bullet summary of an article:
```kotlin
val articleToSummarize = "We are excited to announce a set of on-device generative AI APIs..."

// Define task with desired input and output format
val summarizerOptions = SummarizerOptions.builder(context)
    .setInputType(InputType.ARTICLE)
    .setOutputType(OutputType.ONE_BULLET)
    .setLanguage(Language.ENGLISH)
    .build()
val summarizer = Summarization.getClient(summarizerOptions)

suspend fun prepareAndStartSummarization(context: Context) {
    // Check feature availability. Status will be one of the following:
    // UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE
    val featureStatus = summarizer.checkFeatureStatus().await()

    if (featureStatus == FeatureStatus.DOWNLOADABLE) {
        // Download feature if necessary.
        // If downloadFeature is not called, the first inference request
        // will also trigger the feature to be downloaded if it's not
        // already downloaded.
        summarizer.downloadFeature(object : DownloadCallback {
            override fun onDownloadStarted(bytesToDownload: Long) {}

            override fun onDownloadFailed(e: GenAiException) {}

            override fun onDownloadProgress(totalBytesDownloaded: Long) {}

            override fun onDownloadCompleted() {
                startSummarizationRequest(articleToSummarize, summarizer)
            }
        })
    } else if (featureStatus == FeatureStatus.DOWNLOADING) {
        // Inference request will automatically run once feature is
        // downloaded.
        // If Gemini Nano is already downloaded on the device, the
        // feature-specific LoRA adapter model will be downloaded very
        // quickly. However, if Gemini Nano is not already downloaded,
        // the download process may take longer.
        startSummarizationRequest(articleToSummarize, summarizer)
    } else if (featureStatus == FeatureStatus.AVAILABLE) {
        startSummarizationRequest(articleToSummarize, summarizer)
    }
}

fun startSummarizationRequest(text: String, summarizer: Summarizer) {
    // Create task request
    val summarizationRequest = SummarizationRequest.builder(text).build()

    // Start summarization request with streaming response
    summarizer.runInference(summarizationRequest) { newText ->
        // Show new text in UI
    }

    // You can also get a non-streaming response from the request
    // val summarizationResult = summarizer.runInference(summarizationRequest)
    // val summary = summarizationResult.get().summary
}

// Be sure to release the resource when no longer needed
// For example, on viewModel.onCleared() or activity.onDestroy()
summarizer.close()
```
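The availability handling above boils down to a small decision over the four feature statuses. The following stand-alone sketch models that branching with a stand-in enum so it can be unit-tested without a device; the enum and return values here are illustrative, not ML Kit types:

```kotlin
// Stand-in for ML Kit's FeatureStatus, for illustration only.
enum class Status { UNAVAILABLE, DOWNLOADABLE, DOWNLOADING, AVAILABLE }

// Mirrors the branching in prepareAndStartSummarization:
// DOWNLOADABLE -> start the download (inference runs on completion),
// DOWNLOADING / AVAILABLE -> issue the inference request right away,
// UNAVAILABLE -> the feature cannot run on this device.
fun nextAction(status: Status): String = when (status) {
    Status.DOWNLOADABLE -> "download-then-infer"
    Status.DOWNLOADING, Status.AVAILABLE -> "infer"
    Status.UNAVAILABLE -> "unsupported"
}
```

Keeping this decision in a pure function like the one above also makes it easy to surface a distinct UI state (e.g., a progress indicator while downloading) for each branch.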
For more examples of implementing the GenAI APIs, check out the official documentation and the samples on GitHub.
Here is some guidance on how to best use the current GenAI APIs:
For Summarization, consider:
For Proofreading and Rewriting APIs, consider utilizing them during the content creation process for short content below 256 tokens to help with tasks such as:
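Since Proofreading and Rewriting work best on content below 256 tokens, a lightweight client-side guard can skip the call for longer drafts. The 4-characters-per-token rule of thumb below is a common heuristic and an assumption of this sketch, not ML Kit's tokenizer:

```kotlin
// Rough token estimate using the common ~4 characters/token heuristic.
// This is an approximation for gating only, not ML Kit's tokenizer.
fun estimatedTokens(text: String): Int = (text.length + 3) / 4

fun fitsShortContentLimit(text: String, maxTokens: Int = 256): Boolean =
    estimatedTokens(text) <= maxTokens

fun main() {
    val draft = "Quick note to polish before sending."
    println(fitsShortContentLimit(draft)) // true: well under 256 tokens
}
```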
For the Image Description API, consider it for:
Envision is an app that verbalizes the visual world to help people who are blind or have low vision lead more independent lives. A common use case in the app is for users to take a picture to have a document read out loud. Utilizing the GenAI Summarization API, Envision is now able to get a concise summary of a captured document. This significantly enhances the user experience by allowing them to quickly grasp the main points of documents and determine if a more detailed reading is desired, saving them time and effort.
GenAI APIs are available on Android devices using optimized MediaTek Dimensity, Qualcomm Snapdragon, and Google Tensor platforms through AICore. For a comprehensive list of devices that support GenAI APIs, refer to our official documentation.
Start implementing GenAI APIs in your Android apps today with guidance from our official documentation and samples on GitHub: AI Catalog GenAI API Samples with Compose, ML Kit GenAI APIs Quickstart.