Veronica With Four Eyes

Recognizing Images With Seeing AI

The other day, my friend sent me a screenshot of a text conversation and asked if they should type out all of the text in the image so that I would be able to read what was going on. While I’m grateful that my friend was willing to do that, I told them that they wouldn’t have to, as I could read the information by recognizing images with Seeing AI. Here are my tips for recognizing images with Seeing AI for users with visual impairments, and how I use different features within the app.

What is Seeing AI?

Seeing AI is a free app developed by Microsoft that uses artificial intelligence (the “AI” in Seeing AI) to give people with visual impairments real-time information about the world around them. Seeing AI requires an internet connection but does not require a Microsoft Account. Seeing AI is currently available for iPhone and iPad, and supports several languages, including English, Spanish, Chinese, French, German, and Vietnamese.

How to recognize images with Seeing AI

Users can recognize an existing image with Seeing AI by doing the following:

  1. Download the Seeing AI app if it is not already installed on the device
  2. Choose the image that you want to recognize with Seeing AI. This can be in the gallery, in an application, in the web browser, etc. Images accessed through a web browser may need to be saved to the device.
  3. Open the Share menu for the image (this looks like a box with an arrow pointing upward, or three dots next to an image)
  4. From the Actions menu, select Recognize With Seeing AI
  5. The finished description will appear at the bottom of the image, with information from relevant categories displayed automatically

Users can also identify images without saving them to their camera roll by opening the Seeing AI app, choosing a function within the app, and taking a picture. The app will read information out loud, so no screen reader is necessary.

Recognizing text

For images with text, Seeing AI will write out all visible text in the image verbatim, though some text formatting and spacing may be ignored. The text follows the device's Dynamic Type settings, so users can easily read it in large print or with a screen reader. I’ve found that Seeing AI typically ignores blurred-out or irrelevant background text, such as the text on a stop sign.

Examples of images that I use text recognition with include:

  • Screenshots of Tumblr posts and tweets
  • Images that include lots of text
  • Screenshots of text conversations
  • Pictures with captions
  • Diagrams with labels

Scene descriptions

What’s going on in that photo? Seeing AI can provide scene descriptions that describe the relevant parts of an image. This is a wonderful way to generate alt text and image descriptions, though unfortunately users can’t copy and paste text from the image recognition feature at this time. That said, the scene description feature still has plenty of other uses and works especially well with photos.

Examples of images I use scene description with include:

  • Photos of animals
  • Basic information about selfies that my friends send me
  • Short descriptions of photos, typically one sentence or less

Descriptions of people

For photos with people looking at the camera, Seeing AI can provide descriptions of what each person in the photo looks like, including the following information:

  • Approximate age
  • Gender
  • Hair color
  • Any identifying features, e.g. glasses
  • Their facial expression, such as happy or neutral

In my experience, Seeing AI does not provide descriptions of clothing, and faces that are imported into the app’s facial recognition feature aren’t identified when using the image recognition feature in the photo gallery.

Exploring photos/images

Users can explore photos by touch by selecting the “Explore Image” option at the bottom of the screen. This will display the image in full resolution, and the user can move their finger across the screen to get information about different elements in the picture. This can include where people and objects are located, any text in the image, facial recognition information, and similar details.

Examples of images that I use Explore with include:

  • Layouts of rooms to find obstacles
  • Pictures where there are a lot of visual elements
  • Scrolling through text conversations to figure out who said what
  • Exploring menus that are saved as images

Summary of recognizing images with Seeing AI

  • Seeing AI is a free app for iPhone/iPad that provides visual information for users with vision loss
  • The Seeing AI image recognition tool can analyze images in the camera roll or on the device and share key details
  • Seeing AI can recognize handwritten and typed text from content such as notes, screenshots, and text posts
  • Scene descriptions in Seeing AI provide information about objects in an image, such as animals, selfies, or short general descriptions about images
  • For images of people, Seeing AI provides details such as the subject’s age, gender, hair color, and facial expression, as well as other distinctive details such as glasses
  • The Seeing AI image exploration tool allows users to navigate an image by touch and explore elements individually
  • Image recognition with Seeing AI should not be treated as a replacement for quality alt text or image descriptions in web content
