Ask a techspert: How does Lens turn images to text?

When I was on holiday recently, I wanted to take notes from an ebook I was reading. But instead of taking audio notes or scribbling things down in a notebook, I used Lens to select a section of the book, copy it and paste it into a document. That got me curious: How did all that just happen on my phone? How does a camera recognize words in all their fonts and languages?

I decided to get to the root of the question and speak to Ana Manasovska, a Zurich-based software engineer who is one of the Googlers on the front line of converting an image into text.

Ana, tell us about your work in Lens

I’m involved with the text aspect, so making sure that the app can discern text and copy it for a search or translate it — with no typing needed. For example, if you point your phone’s camera at a poster in a foreign language, the app can translate the text on it. And for people who are blind or have low vision, it can read the text out loud. It’s pretty impressive.

So part of what my team does is get Lens to recognize not just the text, but also the structure of the text. We humans automatically understand writing that is separated into sentences and paragraphs, or blocks and columns, and know what goes together. It’s very difficult for a machine to distinguish that, though.
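
To make the structure problem a bit more concrete, here is a minimal C++ sketch of what grouping raw recognition output into lines could look like under a simple geometric assumption: each recognized word arrives as a box with coordinates, and words whose boxes overlap vertically end up on the same line. This is an illustrative toy, not how Lens actually works (as Ana explains next, the real system relies on learned models); the Word struct, the 50% overlap threshold and the sample coordinates are all made up.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical output of a text recognizer: one word plus its bounding box.
struct Word {
  std::string text;
  float top;     // Top edge of the word's bounding box, in pixels.
  float bottom;  // Bottom edge of the bounding box.
  float left;    // Left edge, used to order words within a line.
};

// Two words share a line if their boxes overlap vertically by at least half
// the height of the shorter word (an arbitrary illustrative threshold).
bool SameLine(const Word& a, const Word& b) {
  const float overlap = std::min(a.bottom, b.bottom) - std::max(a.top, b.top);
  const float min_height = std::min(a.bottom - a.top, b.bottom - b.top);
  return overlap > 0.5f * min_height;
}

// Sorts words top to bottom, groups them into lines, then orders each line
// left to right.
std::vector<std::vector<Word>> GroupIntoLines(std::vector<Word> words) {
  std::sort(words.begin(), words.end(),
            [](const Word& a, const Word& b) { return a.top < b.top; });
  std::vector<std::vector<Word>> lines;
  for (const Word& word : words) {
    if (lines.empty() || !SameLine(lines.back().back(), word)) {
      lines.push_back({word});
    } else {
      lines.back().push_back(word);
    }
  }
  for (auto& line : lines) {
    std::sort(line.begin(), line.end(),
              [](const Word& a, const Word& b) { return a.left < b.left; });
  }
  return lines;
}

int main() {
  // Made-up coordinates for four recognized words on two lines.
  const std::vector<Word> words = {{"Hello", 10, 30, 0},
                                   {"world", 12, 32, 60},
                                   {"Second", 50, 70, 0},
                                   {"line", 52, 72, 80}};
  for (const auto& line : GroupIntoLines(words)) {
    for (const auto& word : line) std::cout << word.text << " ";
    std::cout << "\n";
  }
  return 0;
}

Real documents add further wrinkles like columns, rotated text and mixed font sizes, which is part of why learned models handle this better than fixed rules.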

Is this machine learning then?

Yes. In other words, it uses systems (we call them models) that we’ve trained to discern characters and structure in images. A traditional computing system would have only a limited ability to do this. But our machine learning model has been built to “teach itself” on enormous datasets and is learning to distinguish text structures the same way a human would.

Can the system work with different languages?

Yes, it can recognize 30 scripts, including Cyrillic, Devanagari, Chinese and Arabic. It’s most accurate in Latin-alphabet languages at the moment, but even there, the many different types of fonts present challenges. Japanese and Chinese are tricky because they have lots of nuances in the characters. What seems like a small variation to the untrained eye can completely change the meaning.

What’s the most challenging part of your job?

There’s lots of complexity and ambiguity, which are challenging, so I’ve had to learn to navigate that. And it’s very fast paced; things are moving constantly and you have to ask a lot of questions and talk to a lot of people to get the answers you need.

When it comes to actual coding, what does that involve?

Mostly I use a programming language called C++, which lets you run the processing steps needed to go from an image to a representation of the words it contains and their structure.

Hmmm, I sort of understand. What does it look like?

A screenshot of some C++ code against a white background.

This is what C++ looks like.

The code above shows the processing for extracting only the German from a section of text. So say the image showed German, French and Italian — only the German would be extracted for translation. Does that make sense?
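
For readers wondering what a step like that might look like in ordinary C++, here is a small, self-contained sketch. It is not the code in the screenshot, just an illustration of the idea: recognized text blocks arrive tagged with a detected language, and only the blocks tagged as German are kept for the translation step. The TextBlock struct, the language codes and the function name are hypothetical placeholders for the real OCR and language-identification components.

#include <iostream>
#include <string>
#include <vector>

// Hypothetical result of the recognition stage: a chunk of text plus the
// language a language-identification model assigned to it.
struct TextBlock {
  std::string text;           // Characters recognized in the image.
  std::string language_code;  // e.g. "de", "fr", "it".
};

// Keeps only the blocks whose detected language matches target_language.
std::vector<TextBlock> FilterByLanguage(const std::vector<TextBlock>& blocks,
                                        const std::string& target_language) {
  std::vector<TextBlock> filtered;
  for (const TextBlock& block : blocks) {
    if (block.language_code == target_language) {
      filtered.push_back(block);
    }
  }
  return filtered;
}

int main() {
  // Pretend the image showed a trilingual sign.
  const std::vector<TextBlock> recognized = {{"Herzlich willkommen", "de"},
                                             {"Bienvenue", "fr"},
                                             {"Benvenuti", "it"}};

  // Only the German blocks would be passed on to translation.
  for (const TextBlock& block : FilterByLanguage(recognized, "de")) {
    std::cout << block.text << "\n";
  }
  return 0;
}

In a real pipeline the language tag would come from a model rather than being hard-coded, but the filtering step itself can stay this simple.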

Kind of! Tell me what you love about your job

It boils down to my lifelong love of solving problems. But I also really like that I’m building something I can use in my everyday life. I’m based in Zurich but don’t speak German well, so I use Lens for translation into English daily.

Seniors search what they see, using a new Lens

Technology shines when it helps us get things done in our daily lives, and that’s exactly why a group of around 100 very eager seniors gathered in Odense, Denmark. All of them older than 65, and many well into their 80s, they had decided to stay on top of the latest technological tools and tricks. On this March day, the eye-opener was the often overlooked potential of searching for information with visual tools like Google Lens.

Now the seniors searched their surroundings directly: they scanned trees, plants, animals and buildings, used Translate to make sense of Turkish menus and Japanese sayings, and found product declarations by scanning barcodes.

The group was taking part in a training session set up by Faglige Seniorer, a Danish organization with 300,000 senior members. The organization first partnered with Google back in 2019 to train seniors in searching by voice, and now the time had come to search with live images.

“Often, when I go for a walk, I stumble upon an unknown flower or a tree. Now I can just take a picture to discover what kind of plant I am standing before,” Verner Madsen, one of the participants, remarked. “I don’t need to bring my encyclopedia. It is really smart and helpful.”

Seniors in a country like Denmark are generally very tech savvy, but with digitization constantly advancing — accelerating even faster during two years of COVID-19 — some seniors risk being left behind, creating gaps between generations. During worldwide lockdowns, technological tools have helped seniors stay connected with their family and friends, and smartphone features have helped improve everyday life. One key element of that is delivering accurate and useful information when needed. And for that, typed words on a smartphone keyboard can often be substituted with a visual search, using a single tap on the screen.

Being able to "search what you see" in this way was an eye-opener to many. As the day ended, another avid participant, Henrik Rasmussen, declared he was heading straight home to continue his practice.

“I thought I was up to speed on digital developments, but after today I realize that I still have a lot to learn and discover,” he said.

Search your world, any way and anywhere

People have always gathered information in a variety of ways — from talking to others, to observing the world around them, to, of course, searching online. Though typing words into a search box has become second nature for many of us, it’s far from the most natural way to express what we need. For example, if I’m walking down the street and see an interesting tree, I might point to it and ask a friend what species it is and if they know of any nearby nurseries that might sell seeds. If I were to express that question to a search engine just a few years ago… well, it would have taken a lot of queries.

But we’ve been working hard to change that. We've already started on a journey to make searching more natural. Whether you're humming the tune that's been stuck in your head, or using Google Lens to search visually (which now happens more than 8 billion times per month!), there are more ways to search and explore information than ever before.

Today, we're redefining Google Search yet again, combining our understanding of all types of information — text, voice, visual and more — so you can find helpful information about whatever you see, hear and experience, in whichever ways are most intuitive to you. We envision a future where you can search your whole world, any way and anywhere.

Find local information with multisearch

The recent launch of multisearch, one of our most significant updates to Search in several years, is a milestone on this path. In the Google app, you can search with images and text at the same time — similar to how you might point at something and ask a friend about it.

Now we’re adding a way to find local information with multisearch, so you can uncover what you need from the millions of local businesses on Google. You’ll be able to use a picture or screenshot and add “near me” to see options for local restaurants or retailers that have the apparel, home goods and food you’re looking for.

An animation of a phone showing a search. A photo is taken of Korean cuisine, then Search scans it for restaurants near the user that serve it.

Later this year, you’ll be able to find local information with multisearch.

For example, say you see a colorful dish online you’d like to try – but you don’t know what’s in it, or what it’s called. When you use multisearch to find it near you, Google scans millions of images and reviews posted on web pages, and from our community of Maps contributors, to find results about nearby spots that offer the dish so you can go enjoy it for yourself.

Local information in multisearch will be available globally later this year in English, and will expand to more languages over time.

Get a more complete picture with scene exploration

Today, when you search visually with Google, we’re able to recognize objects captured in a single frame. But sometimes, you might want information about a whole scene in front of you.

In the future, with an advancement called “scene exploration,” you’ll be able to use multisearch to pan your camera and instantly glean insights about multiple objects in a wider scene.

In the future, “scene exploration” will help you uncover insights across multiple objects in a scene at the same time.

Imagine you’re trying to pick out the perfect candy bar for your friend who’s a bit of a chocolate connoisseur. You know they love dark chocolate but dislike nuts, and you want to get them something of quality. With scene exploration, you’ll be able to scan the entire shelf with your phone’s camera and see helpful insights overlaid in front of you. Scene exploration is a powerful breakthrough in our devices’ ability to understand the world the way we do — so you can easily find what you’re looking for — and we look forward to bringing it to multisearch in the future.

These are some of the latest steps we’re taking to help you search any way and anywhere. But there’s more we’re doing, beyond Search. AI advancements are helping bridge the physical and digital worlds in Google Maps, and making it possible to interact with the Google Assistant more naturally and intuitively. To ensure information is truly useful for people from all communities, it’s also critical for people to see themselves represented in the results they find. Underpinning all these efforts is our commitment to helping you search safely, with new ways to control your online presence and information.

Go beyond the search box: Introducing multisearch

How many times have you tried to find the perfect piece of clothing, a tutorial to recreate nail art or even instructions on how to take care of a plant someone gifted you — but you didn’t have all the words to describe what you were looking for?

At Google, we’re always dreaming up new ways to help you uncover the information you’re looking for — no matter how tricky it might be to express what you need. That’s why today, we’re introducing an entirely new way to search: using text and images at the same time. With multisearch in Lens, you can go beyond the search box and ask questions about what you see.

Let’s take a look at how you can use multisearch to help with your visual needs, including style and home decor questions. To get started, simply open up the Google app on Android or iOS, tap the Lens camera icon and either search one of your screenshots or snap a photo of the world around you, like the stylish wallpaper pattern at your local coffee shop. Then, swipe up and tap the "+ Add to your search" button to add text.

Multisearch allows people to search with both images and text at the same time.

With multisearch, you can ask a question about an object in front of you or refine your search by color, brand or a visual attribute. Give it a go yourself by using Lens to:

  • Screenshot a stylish orange dress and add the query “green” to find it in another color
  • Snap a photo of your dining set and add the query “coffee table” to find a matching table
  • Take a picture of your rosemary plant and add the query “care instructions”

All this is made possible by our latest advancements in artificial intelligence, which make it easier to understand the world around you in more natural and intuitive ways. We’re also exploring ways in which this feature might be enhanced by MUM — our latest AI model in Search — to improve results for all the questions you could imagine asking.

This is available as a beta feature in English in the U.S., with the best results for shopping searches. Try out multisearch today in the Google app, the best way to search with your camera, voice and now text and images at the same time.

Here’s how online shoppers are finding inspiration

People shop across Google more than a billion times a day — and we have a pretty good sense of what they’re browsing for. For instance, our Search data shows that the early 2000s are having a moment. We’re seeing increased search interest in “Y2K fashion” and products like bucket hats and ankle bracelets. Also popular? The iconic Clinique “Happy” perfume, Prada crochet bags and linen pants.

While we know what’s trending, we also wanted to understand how people find inspiration when they’re shopping for lifestyle products. So we surveyed 2,000 U.S. shoppers of apparel, beauty and home decor for our first Inspired Shopping Report. Read on to find out what we learned.

Shopping isn’t always a checklist

According to our findings, most fashion, beauty and home shoppers spend up to two weeks researching products before they buy them. Many, though, are shopping online just for fun — 65% say they often or sometimes shop or browse online when they’re not looking for anything in particular. To help make online shopping even easier and more entertaining, we recently added more browsable search results for fashion and apparel shopping queries. So when you search for chunky loafers, a lime green dress or a raffia bag on Google, you’ll scroll through a visual feed with various colors and styles — alongside other helpful information like local shops, style guides and videos.

Phone screens show animations of a Google search for various clothing items with visual scrolling results

Apparel queries on Search show a more visual display of products

Inspiration can strike anywhere

We know shopping inspiration can strike at any moment. In fact, 60% of shoppers say they often or sometimes get inspired or prompted to buy something even when they aren’t actively shopping. That can come from spotting great street style: 39% of shoppers say they often or sometimes look for a specific outfit online after they see someone wearing it. Or it can come from browsing online: 48% of shoppers have taken a screenshot of a piece of clothing, accessory or home decor item they liked (and 70% of them say they’ve searched for or bought it afterwards). Google Lens can help you shop for looks as soon as you spot them. Just snap a photo or screenshot and you’ll find exact or similar results to shop from.

Sometimes words aren’t enough

We know it can be hard to find what you’re looking for using words alone, even when you do have an image — like that multi-colored, metallic floral wallpaper you took a photo of that would go perfectly with your living room rug. Half of shoppers say they often or sometimes have failed to find a specific piece of clothing or furniture online after trying to describe it with just words. And 66% of shoppers wished they could find an item in a different color or print.

To help you track down those super specific pieces, we’re introducing an entirely new way to search — using text and images at the same time. With multisearch on Lens, you can better uncover the products you’re looking for even when you don’t have all the words to describe them. For example, you might be on the lookout for a scarf in the same pattern as one of your handbags. Just snap a photo of the patterned handbag on Lens and add the query “scarf” to complete your look. Or take a photo of your favorite heels and add the query “flats” to find a more comfortable version.

Phone screen shows the ability to search for a flat version of a pair of yellow high heels, using text and images at the same time.

With multisearch on Lens, you can search with both images and text at the same time

Trying before you buy matters

It’s not always possible to make it to the store and try something on before you buy it — but it matters. Among online beauty shoppers, more than 60% have decided not to purchase a beauty or cosmetic item online because they didn’t know what color or shade to choose, and 41% have decided to return an item because it was the wrong shade. With AR Beauty experiences, you can virtually discover and “try on” thousands of products from brands like Maybelline New York, M.A.C. and Charlotte Tilbury — helping you make more informed decisions. And now, shoppers can try on cosmetics from a variety of brands carried at Ulta Beauty right in Google Search. Just search for a product, like the Morphe Matte Liquid Lipstick or Kylie Cosmetics High Gloss, and find the best shade for you.

Phone screens show animations of models virtually trying on various lipstick and eyeshadow shades.

Google’s AR Beauty experience features products from Ulta Beauty

No matter where you find your shopping inspiration, we hope these features and tools help you discover new products, compare different options and ultimately make the perfect purchase.

5 tips to finish your holiday shopping with Chrome

We’re coming down to the wire with holiday shopping, and many of us are frantically searching online for last-minute stocking stuffers. Luckily, a few new features are coming to Chrome that will make these final rounds of shopping easier — helping you keep track of what you want to buy and finally hit "order."

Here are five ways to use Chrome for a stress-free shopping experience.

1. Keep track of price drops: Are you waiting for a good deal on that pair of headphones, but don’t have time to constantly refresh the page? A new mobile feature, available this week on Chrome for Android in the U.S., will show an item’s updated price right in your open tabs grid so you can easily see if and when the price has dropped. This same feature will launch on iOS in the coming weeks.

Screenshot showing a grid of four tabs in Chrome. Two tabs are product pages and show a price drop on top of the tab preview, highlighted in green.

2. Search with a snapshot from the address bar: If something catches your eye while you’re out window shopping, you can now search your surroundings with Google Lens in Chrome for Android. From the address bar, tap the Lens icon and start searching with your camera.

Coming soon, you’ll also be able to use Lens while you’re browsing in Chrome on your desktop. If you come across a product in an image and want to find out what it is, just right-click and select the “Search images with Google Lens” option.

3. Rediscover what’s in your shopping cart: You know you have items in your shopping cart, but you can't remember where exactly. No need to search all over again. Starting with Chrome on Windows and Mac in the U.S., you can now open up a new tab and scroll to the “Your carts” card to quickly see any site where you’ve added items to a shopping cart. Some retailers, like Zazzle, iHerb, Electronic Express and Homesquare, might even offer a discount when you come back to check out.

4. Get passwords off your plate: Don’t worry about setting up and remembering your account details for your favorite shopping sites. Chrome can help create unique, secure passwords and save your login details for future visits.

5. Simplify the checkout process: By saving your address and payment information with Autofill, Chrome can automatically fill out your billing and shipping details. And when you enter info into a new form, Chrome will ask if you’d like to save it.

How AI is making information more useful

Today, there’s more information accessible at people’s fingertips than at any point in human history. And advances in artificial intelligence will radically transform the way we use that information, with the ability to uncover new insights that can help us both in our daily lives and in the ways we are able to tackle complex global challenges.


At our Search On livestream event today, we shared how we’re bringing the latest in AI to Google’s products, giving people new ways to search and explore information in more natural and intuitive ways.


Making multimodal search possible with MUM

Earlier this year at Google I/O, we announced we’ve reached a critical milestone for understanding information with Multitask Unified Model, or MUM for short.


We’ve been experimenting with using MUM’s capabilities to make our products more helpful and enable entirely new ways to search. Today, we’re sharing an early look at what will be possible with MUM. 


In the coming months, we’ll introduce a new way to search visually, with the ability to ask questions about what you see. Here are a couple of examples of what will be possible with MUM.


With this new capability, you can tap on the Lens icon when you’re looking at a picture of a shirt, and ask Google to find you the same pattern — but on another article of clothing, like socks. This helps when you’re looking for something that might be difficult to describe accurately with words alone. You could type “white floral Victorian socks,” but you might not find the exact pattern you’re looking for. By combining images and text into a single query, we’re making it easier to search visually and express your questions in more natural ways. 


Some questions are even trickier: Your bike has a broken thingamajig, and you need some guidance on how to fix it. Instead of poring over catalogs of parts and then looking for a tutorial, the point-and-ask mode of searching will make it easier to find the exact moment in a video that can help.


Helping you explore with a redesigned Search page

We’re also announcing how we’re applying AI advances like MUM to redesign Google Search. These new features are the latest steps we’re taking to make searching more natural and intuitive.


First, we’re making it easier to explore and understand new topics with “Things to know.” Let’s say you want to decorate your apartment, and you’re interested in learning more about creating acrylic paintings.


If you search for “acrylic painting,” Google understands how people typically explore this topic, and shows the aspects people are likely to look at first. For example, we can identify more than 350 topics related to acrylic painting, and help you find the right path to take.


We’ll be launching this feature in the coming months. In the future, MUM will unlock deeper insights you might not have known to search for — like “how to make acrylic paintings with household items” — and connect you with content on the web that you wouldn’t have otherwise found.

Second, to help you further explore ideas, we’re making it easy to zoom in and out of a topic with new features to refine and broaden searches. 


In this case, you can learn more about specific techniques, like puddle pouring, or art classes you can take. You can also broaden your search to see other related topics, like other painting methods and famous painters. These features will launch in the coming months.


Third, we’re making it easier to find visual inspiration with a newly designed, browsable results page. If puddle pouring caught your eye, just search for “pour painting ideas” to see a visually rich page full of ideas from across the web, with articles, images, videos and more that you can easily scroll through.

This new visual results page is designed for searches that are looking for inspiration, like “Halloween decorating ideas” or “indoor vertical garden ideas,” and you can try it today.

Get more from videos

We already use advanced AI systems to identify key moments in videos, like the winning shot in a basketball game, or steps in a recipe. Today, we’re taking this a step further, introducing a new experience that identifies related topics in a video, with links to easily dig deeper and learn more. 


Using MUM, we can even show related topics that aren’t explicitly mentioned in the video, based on our advanced understanding of information in the video. In this example, while the video doesn’t say the words “macaroni penguin’s life story,” our systems understand that the topics covered in the video, like how macaroni penguins find their family members and navigate predators, relate to it. The first version of this feature will roll out in the coming weeks, and we’ll add more visual enhancements in the coming months.


Across all these MUM experiences, we look forward to helping people discover more web pages, videos, images and ideas that they may not have come across or otherwise searched for. 


A more helpful Google

The updates we’re announcing today don’t end with MUM, though. We’re also making it easier to shop from the widest range of merchants, big and small, no matter what you’re looking for. And we’re helping people better evaluate the credibility of information they find online. Plus, for the moments that matter most, we’re finding new ways to help people get access to information and insights. 


All this work not only helps people around the world, but creators, publishers and businesses as well.  Every day, we send visitors to well over 100 million different websites, and every month, Google connects people with more than 120 million businesses that don't have websites, by enabling phone calls, driving directions and local foot traffic.


As we continue to build more useful products and push the boundaries of what it means to search, we look forward to helping people find the answers they’re looking for, and inspiring more questions along the way.


Posted by Prabhakar Raghavan, Senior Vice President