Veronica With Four Eyes

Creating Audio Description For Primary Source Videos With YouDescribe

In the last few days, there has been a surge in the number of primary source videos shared on social media and by news outlets. Many of these videos have only a few spoken words, yet tell a powerful story about what is happening in the world right now and what may happen in the future. As a person with low vision, I find many of these videos inaccessible because I cannot see what is going on when there is a lot of movement or poor lighting, and I also have a medical condition that is aggravated by flashing/strobe lights (especially red and blue flashing lights). To help make information about current events more accessible for people who are blind or have low vision, here are my tips for creating audio description for primary source videos with the free YouDescribe platform.


Audio description, sometimes referred to as descriptive audio or described video, is an additional narrator track that provides visual information for people who otherwise would not be able to see it. Audio description is provided during natural pauses in dialogue so it does not distract from the video. Occasionally, describers may pause the video themselves and add description if there are no natural pauses available.

For most online videos, open audio description is used, meaning that the audio description automatically plays and does not require a special device to be used.



YouDescribe is a free website and iOS app that allows viewers to watch YouTube videos with audio description. The audio description tracks are written and recorded by sighted volunteers so that people with blindness and low vision can watch YouTube videos and receive visual information. YouDescribe is a project of the Smith-Kettlewell Eye Research Institute in San Francisco, California.

At this time, audio description tracks can only be created in a web browser, not in the iOS app. Viewing videos does not require an account, but requesting videos and creating descriptions requires connecting a Google account.



Some of the many reasons to create audio description for primary source videos include:

  • Important visual information may not have corresponding audio information- I wouldn’t notice there was broken glass on the ground unless someone told me
  • Audio information can be very quiet or otherwise go unnoticed, such as someone whispering
  • Videos with lots of flashing lights can be triggering for people with photosensitivity or other conditions aggravated by strobe/flashing lights
  • Knowing when and where a video takes place can help tremendously with understanding its impact
  • It can be hard to tell who is saying what, especially when people are in crowded areas or are wearing headgear/masks that impact how their voice sounds
  • By giving viewers objective descriptions of visual information, they can form their own conclusions about what is happening in the video



One of the most common questions new describers have is when to use inline description (which involves reading audio description over the video audio) versus extended description (which involves pausing the video to read audio description), and what to consider when choosing one type over the other. Some people prefer one style over the other, so here is what I prefer as someone who relies on audio description for understanding content.

When to use inline:

  • When there is limited voiceover/speaking, loud music, or audio that is hard to hear
  • If the necessary descriptions are short and can be quickly read in natural pauses
  • Whenever the narrator is describing movement, e.g. people running away or something being thrown

When to use extended:

  • If there is lots of description needed at the beginning for the layout of the scene
  • When a scene changes very quickly and additional description is needed, such as an explosion
  • If talking or voiceover content is a large focus of the video



Here are my recommendations for what to include in audio description for primary source videos:

  • Descriptions of people who are within the focus of the camera, including their race/ethnicity, gender (if relevant), and similar information
  • Relevant objects or buildings in view of the camera, such as shields or cars
  • Signs or objects that are being held, with text read verbatim when possible
  • Any relevant clothing descriptions, such as t-shirts with text or masks/head coverings
  • Relevant facial expressions or movement, such as someone running or ducking
  • Names of buildings, landmarks, or information about the location of the video if not otherwise mentioned- for example, a video filmed on the streets of Washington, DC
  • The time of day the video is being filmed, and any relevant weather conditions
  • Significant objects or visual information that contribute to the meaning of the video, such as the presence of broken glass or the camera panning to something on fire



Here are my recommendations for what not to include in audio description for primary source videos:

  • Information that is already covered with voiceover or speaking content
  • Identifying information that was not available in the original video- if I recognized my friend in a video where they were unnamed, I would not reveal my friend’s name when creating audio description
  • Specific descriptions of irrelevant environmental text- it’s ok to say there is graffiti on the walls, but there is no need to read out what the graffiti says if it isn’t relevant to the video
  • Descriptions of what everyday objects look like- I know what a traffic cone looks like, but I might not know that it is on top of a car
  • Precise counts for large crowds or images where there are lots of people- I don’t care that there are exactly 15 people in an image, but I do care if several people are circling around another person or area
  • Censoring text that is written on a sign, unless the audio description is specifically targeted at a younger audience or an audience that requested censoring of certain words- I write more about accessibility-friendly censoring in a separate post
  • Commentary or opinions about what is happening- let the viewer come to their own conclusions



Creating audio description for primary source videos is a great way to help the people filming these videos ensure that their message reaches as many people as possible, including audiences that are blind or visually impaired. I recommend posting links to audio-described content on social media or within video descriptions whenever they are available. I hope these tips for creating audio description for primary source videos are helpful for others!
