Why do we say 'uh' and 'um'?

Last updated: April 8, 2026

Quick Answer: 'Uh' and 'um' are filler words that speakers produce while pausing to plan what to say next, and they serve real cognitive and communicative functions. Research suggests these disfluencies occur about 4-6 times per minute in spontaneous speech, with 'uh' more common before short pauses and 'um' before longer ones. Studies from the 1970s onward, including work by Herbert Clark and Jean Fox Tree, indicate that listeners use these cues to anticipate upcoming speech content. Fillers appear to be universal across languages, though their specific forms vary from one language to another.

Key Facts

Filler words occur approximately 4-6 times per minute in spontaneous English speech
'Uh' typically precedes shorter pauses (averaging about 0.2 seconds), while 'um' signals longer ones (averaging about 0.5 seconds)
Research by Herbert Clark and Jean Fox Tree in the 1990s and early 2000s argued that listeners use disfluencies to predict speech content
The study of speech disfluencies gained significant attention in psycholinguistics starting in the 1970s
Fillers appear to be universal, but their specific sounds vary across languages (e.g., 'eto' in Japanese, 'euh' in French)

Overview

'Uh' and 'um' are filler words, one type of what linguists call speech disfluencies, and some form of them appears to exist in every language. Fillers have presumably existed throughout the history of spoken language, but their systematic study began in the 1970s, when researchers started examining spontaneous speech, and it grew with the development of psycholinguistics and corpus linguistics. In 2002, Herbert Clark and Jean Fox Tree published an influential paper, "Using uh and um in spontaneous speaking," which helped establish that these disfluencies carry information listeners can use to predict upcoming speech content, challenging earlier views that they were merely speech errors. Cross-linguistic studies indicate that while every language has equivalent filler sounds, their specific forms vary significantly: Japanese speakers use 'eto' and 'ano,' French speakers use 'euh,' and Spanish speakers use 'eh' and 'este.' Disfluencies are more frequent in spontaneous speech than in prepared remarks, and some studies estimate that they account for approximately 6-10% of spoken words in natural conversation.
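Frequency figures for fillers are quoted in two ways: occurrences per minute and share of all spoken words. The two are linked by how fast the person is speaking, so they are not interchangeable. The Python sketch below shows how both could be computed from a transcript of known duration. It is a minimal illustration only: the function name filler_stats, the two-item filler list, and the sample sentence are assumptions made for this example, not a method taken from any of the studies mentioned.

```python
import re

# Assumed filler inventory for English; a real study would define its own.
FILLERS = {"uh", "um"}

def filler_stats(transcript: str, duration_seconds: float) -> dict:
    """Count fillers and report them per minute and as a share of all words."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Lowercase the text and keep runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", transcript.lower())
    filler_count = sum(1 for word in words if word in FILLERS)
    minutes = duration_seconds / 60
    return {
        "words": len(words),
        "fillers": filler_count,
        "per_minute": filler_count / minutes,
        "share_of_words": filler_count / len(words) if words else 0.0,
    }

# Toy sentence, invented for illustration; not real research data.
sample = "So um I was thinking uh we could maybe um leave early"
print(filler_stats(sample, duration_seconds=30.0))
```

Because the per-minute rate depends on the recording length while the share of words depends on the word count, the same transcript can look filler-heavy by one measure and ordinary by the other.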

How It Works

Producing 'uh' and 'um' involves complex cognitive processes during speech planning and execution. When speakers have trouble retrieving a word or planning the next stretch of speech, they pause. During these micro-pauses (typically 0.1-0.5 seconds), the vocal apparatus produces a filler sound while cognitive resources work to resolve the planning problem. Eye-tracking and EEG research suggests that 'um' generally signals longer planning difficulties than 'uh,' with pauses averaging 0.5 seconds for 'um' versus 0.2 seconds for 'uh.' The brain's left inferior frontal gyrus and anterior cingulate cortex show increased activity during disfluency production, which points to active problem-solving. Fillers also act as 'conversational traffic signals' for the listener: studies show that listeners are better at predicting the words that follow when a disfluency precedes them.
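The comparison of pause lengths after 'uh' and 'um' comes down to grouping measured pauses by filler and averaging each group. A minimal Python sketch of that calculation follows. The observations list is made up purely for illustration and is not research data, and the function name mean_pause_by_filler is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotations: (filler, seconds of silence that followed it).
# These values are invented for illustration and are NOT research data.
observations = [
    ("uh", 0.15), ("um", 0.45), ("uh", 0.25), ("um", 0.55), ("uh", 0.20),
]

def mean_pause_by_filler(rows):
    """Group pause durations by filler and return the mean of each group."""
    grouped = defaultdict(list)
    for filler, pause in rows:
        grouped[filler].append(pause)
    return {filler: mean(pauses) for filler, pauses in grouped.items()}

averages = mean_pause_by_filler(observations)
for filler, avg in sorted(averages.items()):
    print(f"{filler}: {avg:.2f} s")
```

With real annotated recordings, the same grouping would show whether 'um' is in fact followed by longer pauses than 'uh' for a given speaker or corpus.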

Why It Matters

Understanding 'uh' and 'um' has implications across several domains. In education, recognizing that these disfluencies signal active cognitive processing helps teachers identify when students are grappling with complex concepts. In artificial intelligence and speech recognition, accounting for disfluencies has been reported to improve natural language processing accuracy by 15-20%. Clinically, characteristic disfluency patterns can help in diagnosing neurological conditions such as aphasia and Parkinson's disease. In communication training, the finding that moderate use of filler words can improve listener comprehension (by up to 10% in some studies) has changed how public speaking is taught. Finally, recognizing that norms around disfluency differ between cultures helps prevent misinterpretation in cross-cultural business communication, including international negotiations.