
Google’s Visual Search Can Now Answer More Complex Queries


When Google Lens was introduced in 2017, the search feature did a job that not too long ago would have seemed like science fiction: point your phone’s camera at an object and Google Lens could identify it, offer some context, maybe even let you buy it. It was a new way of searching, one that didn’t involve awkwardly typing out descriptions of the things in front of you.

Lens also showed how Google plans to use its machine learning and AI tools to make sure its search engine shows up in as many places as possible. As Google continues to use its AI foundation models to generate summaries of information in response to text searches, Google Lens’s visual search is evolving too. The company now says that Lens, which powers about 20 billion searches a month, will support more ways to search, including video and multimodal searches.

Another change to Lens means that more shopping context will appear in its results. Shopping is, unsurprisingly, one of the main uses of Lens; Amazon and Pinterest also have visual search tools designed to drive more purchases. Search for a friend’s sneakers with the old Google Lens and you might have been shown a carousel of similar items. In the updated version of Lens, Google says it will show direct shopping links, customer reviews, publisher reviews, and comparison shopping tools.

Lens search is now multimodal, the buzzword in AI these days, meaning people can search with a combination of video, images, and voice inputs. Instead of pointing the smartphone camera at an object, tapping the focus point on the screen, and waiting for the Lens app to generate results, users can point the camera and speak at the same time, for example, “What kind of clouds are those?” or “What brand of sneakers are those, and where can I buy them?”

Lens will also now work on real-time video capture, taking the tool a step beyond identifying objects in still images. If you have a broken record player or see a flickering light on a malfunctioning appliance at home, you can shoot a quick video through Lens and, in the AI-generated overview, see tips on how to fix it.

First announced at I/O, the feature is considered experimental and is available only to people who have opted in to Google’s Search Labs, said Rajan Patel, an 18-year Googler and a cofounder of Lens. The other new Google Lens features, voice mode and expanded shopping, are rolling out more broadly.

The “video understanding” feature, as Google calls it, is intriguing for a few reasons. Though it currently works only with video captured in real time, if or when Google expands it to recorded videos, entire video archives—whether from a personal camera roll or a database like Google’s—could become taggable and shoppable.

The second reason is that this Lens feature shares some underlying capabilities with Google’s Project Astra, which is expected to be available later this year. Astra, like Lens, uses multimodal inputs to interpret the world around you through your phone. As part of the Astra demo this spring, the company showed off a pair of prototype smart glasses.

Separately, Meta recently created a stir with its long-term vision for our augmented reality future, which involves ordinary people wearing glasses that can intelligently interpret the world around them and show them a holographic interface. Google, of course, has already tried to realize this future with Google Glass (which relies on very different technology than Meta’s latest pitch). Are Lens’s new features, combined with Astra, a natural segue to a new kind of smart glasses?


