Posted by Michael Sipe, Product Manager
As an important framework for finding objects in photos and video, Mobile Vision operation for Android devices is restored in Google Play Services v9.2.
This new version of Google Play Services fixes a download issue in Google Play Services v.9.0 that caused a service outage. See release notes for details.
We’re also pleased to announce the Text API, a new component for Android Mobile Vision.
The Text API’s optical character recognition technology reads Latin character text (e.g. English, Spanish, German, French, etc.) in photos and returns the text as well as the organizational structure (paragraphs, lines, words). Mobile apps can now:
If you want to get started quickly, you can try our codelab which will get Android developers reading text with their apps in under an hour.
Like the Mobile Vision Face and Barcode components, the Text API runs on-device and is suitable for real-time applications. For more information, check out the Mobile Vision Developer site.
Posted by Laurence Moroney, Developer Advocate
With the release of Google Play services 7.8 we’re excited to announce that we’ve added new Mobile Vision APIs which provides the Barcode Scanner API to read and decode a myriad of different barcode types quickly, easily and locally.
Classes for detecting and parsing bar codes are available in the com.google.android.gms.vision.barcode namespace. The BarcodeDetector class is the main workhorse -- processing Frame objects to return a SparseArray<Barcode> types.
The Barcode type represents a single recognized barcode and its value. In the case of 1D barcode such as UPC codes, this will simply be the number that is encoded in the barcode. This is available in the rawValue property, with the detected encoding type set in the format field.
For 2D barcodes that contain structured data, such as QR codes, the valueFormat field is set to the detected value type, and the corresponding data field is set. So, for example, if the URL type is detected, the constant URL will be loaded into the valueFormat, and the URL property will contain the desired value. Beyond URLs, there are lots of different data types that the QR code can support -- check them out in the documentation here.
When using the API, you can read barcodes in any orientation. They don’t always need to be straight on, and oriented upwards!
Importantly, all barcode parsing is done locally, making it really fast, and in some cases, such as PDF-417, all the information you need might be contained within the barcode itself, so you don’t need any further lookups.
You can learn more about using the API by checking out the sample on GitHub. This uses the Mobile Vision APIs along with a Camera preview to detect both faces and barcodes in the same image.
The API supports both 1D and 2D bar codes, in a number of sub formats.
For 1D Bar Codes, these are:
For 2D Bar Codes, these are:
It’s easy to build applications that use bar code detection using the Barcode Scanner API, and we’ve provided lots of great resources that will allow you to do so. Check them out here:
Follow the Code Lab
Read the Mobile Vision Documentation
Explore the sample
With the release of Google Play services 7.8, we announced the addition of new Mobile Vision APIs, which includes a new Face API that finds human faces in images and video better and faster than before. This API is also smarter at distinguishing faces at different orientations and with different facial features facial expressions.
Face Detection is a leap forward from the previous Android FaceDetector.Face API. It’s designed to better detect human faces in images and video for easier editing. It’s smart enough to detect faces even at different orientations -- so if your subject’s head is turned sideways, it can detect it. Specific landmarks can also be detected on faces, such as the eyes, the nose, and the edges of the lips.
Important Note
This is not a face recognition API. Instead, the new API simply detects areas in the image or video that are human faces. It also infers from changes in the position frame to frame that faces in consecutive frames of video are the same face. If a face leaves the field of view, and re-enters, it isn’t recognized as a previously detected face.
When the API detects a human face, it is returned as a Face object. The Face object provides the spatial data for the face so you can, for example, draw bounding rectangles around a face, or, if you use landmarks on the face, you can add features to the face in the correct place, such as giving a person a new hat.
getPosition()
getWidth()
getHeight()
getId()
The Face API is smart enough to detect faces in multiple orientations. As the head is a solid object that is capable of moving and rotating around multiple axes, the view of a face in an image can vary wildly.
Here’s an example of a human face, instantly recognizable to a human, despite being oriented in greatly different ways:
The API is capable of detecting this as a face, even in the circumstances where as much as half of the facial data is missing, and the face is oriented at an angle, such as in the corners of the above image.
Here are the method calls available to a face object:
getEulerY()
getEulerZ()
A landmark is a point of interest within a face. The API provides a getLandmarks() method which returns a List , where a Landmark object returns the coordinates of the landmark, where a landmark is one of the following: Bottom of mouth, left cheek, left ear, left ear tip, left eye, left mouth, base of nose, right cheek, right ear, right ear tip, right eye or right mouth.
getLandmarks()
List
In addition to detecting the landmark, the API offers the following function calls to allow you to smartly detect various facial states:
getIsLeftEyeOpenProbability()
getIsRighteyeOpenProbability()
getIsSmilingProbability()
Thus, for example, you could write an app that only takes a photo when all of the subjects in the image are smiling.
It’s easy to build applications that use facial detection using the Face API, and we’ve provided lots of great resources that will allow you to do so. Check them out here:
Read the Documentation