Firebase ML Kit 5: Landmark Recognition

When an app can recognise all sorts of known landmarks, it can add a whole new level of experience and immersion. This isn't limited to tourism-related apps, either. Say, for example, you have a books app and you stumble across a library. You could take a picture of it, let ML Kit recognise the place, and then tell the user whether a certain book can be found in that library.

One caveat, though: this particular service is only available as a Cloud Vision API, not as a regular on-device API. That means, at the very least, you'll need to upgrade to the Blaze plan to use this feature.

This is the 5th post in the ML Kit series. If this is your first time hearing about ML Kit, check out the introduction here.

Add the Dependency

Add this dependency to your app-level build.gradle file. Unlike the other ML Kit features, you won't need to add any metadata to your AndroidManifest.xml file.
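As a rough sketch, it looks something like this; the version number is illustrative, so check the Firebase release notes for the current one:

```groovy
dependencies {
    // ML Kit Vision APIs; the version shown is an assumption, use the latest.
    implementation 'com.google.firebase:firebase-ml-vision:18.0.1'
}
```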

Configuring the Landmark Detector (Optional)

There are two settings you can change: the ModelType, which is STABLE_MODEL by default, and MaxResults, which is 10 by default.
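Configuring both settings looks roughly like this; the LATEST_MODEL and 15 values are just examples:

```kotlin
import com.google.firebase.ml.vision.cloud.FirebaseVisionCloudDetectorOptions

// Optional: request the latest model and up to 15 results
// instead of the STABLE_MODEL / 10-result defaults.
val options = FirebaseVisionCloudDetectorOptions.Builder()
        .setModelType(FirebaseVisionCloudDetectorOptions.LATEST_MODEL)
        .setMaxResults(15)
        .build()
```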

Create the FirebaseVisionImage

The first step to most ML Kit operations is to get a FirebaseVisionImage, which you can create from a Bitmap, a media.Image, a ByteBuffer, a byte[], or a file on the device.

From Bitmap

Your image must be upright for this to work. This would normally be the simplest way to get a FirebaseVisionImage.
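Something like this, assuming bitmap is an upright Bitmap you already have in hand:

```kotlin
// `bitmap` is assumed to be an upright android.graphics.Bitmap.
val image = FirebaseVisionImage.fromBitmap(bitmap)
```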

From media.Image

Such as when taking a photo with your device's camera. You'll need the angle by which the image must be rotated to be upright, given the device's orientation when the photo was taken, calculated against the device's default camera orientation (90 degrees on most devices, though it can differ on others).

It's a long method that makes all those calculations, but it's pretty copy-pastable. Then you can pass in the mediaImage and the rotation to generate your FirebaseVisionImage.
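Here's a Kotlin sketch of that calculation, following the pattern in the Firebase documentation; cameraId, activity, and mediaImage are assumed to come from your own Camera2 setup:

```kotlin
import android.app.Activity
import android.content.Context
import android.hardware.camera2.CameraAccessException
import android.hardware.camera2.CameraCharacteristics
import android.hardware.camera2.CameraManager
import android.util.SparseIntArray
import android.view.Surface
import com.google.firebase.ml.vision.common.FirebaseVisionImageMetadata

// Maps the device's display rotation to a base angle, assuming the
// usual 90-degree default camera orientation.
private val ORIENTATIONS = SparseIntArray().apply {
    append(Surface.ROTATION_0, 90)
    append(Surface.ROTATION_90, 0)
    append(Surface.ROTATION_180, 270)
    append(Surface.ROTATION_270, 180)
}

// Combines the current display rotation with the camera sensor's mounting
// orientation to work out how far the captured image must be rotated.
@Throws(CameraAccessException::class)
private fun getRotationCompensation(cameraId: String, activity: Activity): Int {
    val deviceRotation = activity.windowManager.defaultDisplay.rotation
    var rotationCompensation = ORIENTATIONS.get(deviceRotation)

    val cameraManager =
            activity.getSystemService(Context.CAMERA_SERVICE) as CameraManager
    val sensorOrientation = cameraManager
            .getCameraCharacteristics(cameraId)
            .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
    rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360

    // Translate the resulting angle into the constant ML Kit expects.
    return when (rotationCompensation) {
        0 -> FirebaseVisionImageMetadata.ROTATION_0
        90 -> FirebaseVisionImageMetadata.ROTATION_90
        180 -> FirebaseVisionImageMetadata.ROTATION_180
        270 -> FirebaseVisionImageMetadata.ROTATION_270
        else -> FirebaseVisionImageMetadata.ROTATION_0
    }
}
```

With the rotation in hand:

```kotlin
val rotation = getRotationCompensation(cameraId, activity)
val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)
```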

From ByteBuffer

You'll need the rotation method above (from media.Image) to get the rotation, then build the FirebaseVisionImage using the metadata of your image.
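A sketch, assuming buffer holds NV21 frame data; the 480x360 dimensions are placeholders for your actual capture size:

```kotlin
// The metadata tells ML Kit how to interpret the raw bytes; width, height,
// and format must match how `buffer` was actually captured.
val metadata = FirebaseVisionImageMetadata.Builder()
        .setWidth(480)   // placeholder: your capture width
        .setHeight(360)  // placeholder: your capture height
        .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
        .setRotation(rotation) // from the media.Image section above
        .build()

val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
```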

From File

Simple enough to present here in one line, though you'll want to wrap it in a try-catch block.
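Something along these lines, where uri points to the image file:

```kotlin
import java.io.IOException

// fromFilePath can throw an IOException if the file can't be opened.
val image: FirebaseVisionImage? = try {
    FirebaseVisionImage.fromFilePath(context, uri)
} catch (e: IOException) {
    e.printStackTrace()
    null
}
```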

Instantiate FirebaseVisionCloudLandmarkDetector

Leave the parameters empty if you haven’t configured any options.
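For instance:

```kotlin
// Default detector, using STABLE_MODEL and up to 10 results:
val detector = FirebaseVision.getInstance().visionCloudLandmarkDetector

// Or, if you built a FirebaseVisionCloudDetectorOptions earlier:
// val detector = FirebaseVision.getInstance().getVisionCloudLandmarkDetector(options)
```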

Call detectInImage

Use the detector, call detectInImage, and add success and failure listeners. In the success listener you have access to a list of FirebaseVisionCloudLandmarks, from which you can get information about each landmark. The code below says it all really.
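A sketch of the full call, with the listeners left as stubs:

```kotlin
detector.detectInImage(image)
        .addOnSuccessListener { landmarks ->
            // landmarks is a List<FirebaseVisionCloudLandmark>;
            // parsing it is covered in the next section.
        }
        .addOnFailureListener { e ->
            // The request failed, e.g. no network or the Cloud API isn't enabled.
            e.printStackTrace()
        }
```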

Parsing Landmark Information

In the onSuccess method, you have access to a list of landmarks. Loop through these landmarks; from each one you can get the bounding box, confidence level, entity ID, landmark name, and location (of which there may be more than one).
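Roughly like so:

```kotlin
for (landmark in landmarks) {
    val bounds = landmark.boundingBox    // where the landmark sits in the image
    val confidence = landmark.confidence // how confident the model is
    val entityId = landmark.entityId     // Knowledge Graph entity ID
    val name = landmark.landmark         // the landmark's name

    // A single landmark can carry more than one location.
    for (location in landmark.locations) {
        val latitude = location.latitude
        val longitude = location.longitude
    }
}
```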

Conclusion

I'm disappointed myself that this feature currently only works on the Blaze plan. I do hope they'll make an on-device version of this service so that we can use it on the Spark plan as well.

If you want to learn more about ML Kit operations like Text Recognition, Face Detection, Barcode Scanning, and Image Labelling, check out the rest of my ML Kit series.