Artificial intelligence fails a basic human attention test
While humans can easily ignore distractions in long lists, leading AI models lose focus as the lists grow and fail at naming colors instead of reading words.
The Stroop task is a classic psychological experiment that measures executive control by asking participants to name the color of the ink a word is printed in rather than read the word itself. For most people, reading is an automatic habit that must be consciously suppressed when, for example, the word "red" is printed in blue ink. While humans can maintain high accuracy through long sequences of these conflicting cues, the performance of modern large language models collapses as the sequences grow longer.
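For readers who want to see the setup concretely, here is a minimal Python sketch of a Stroop-style list and how it might be scored. It is an illustration only, not the procedure used to test the models: the four-color palette and the helper names `make_trials` and `accuracy` are assumptions made for this example.

```python
import random

# Illustrative palette; an assumption for this sketch, not taken from the study.
COLORS = ["red", "blue", "green", "yellow"]

def make_trials(n, seed=0):
    """Return n conflicting (word, ink) pairs: the word never matches its ink."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n):
        word = rng.choice(COLORS)
        ink = rng.choice([c for c in COLORS if c != word])
        trials.append((word, ink))
    return trials

def accuracy(trials, answers):
    """Fraction of trials where the answer names the ink color."""
    correct = sum(answer == ink for (_, ink), answer in zip(trials, answers))
    return correct / len(trials)

trials = make_trials(20)
read_the_word = [word for word, _ in trials]  # the automatic habit
name_the_ink = [ink for _, ink in trials]     # the response the task asks for

print(accuracy(trials, read_the_word))  # 0.0
print(accuracy(trials, name_the_ink))   # 1.0
```

Because every pair conflicts, a responder that simply reads the words scores zero, while one that names the ink every time scores perfectly. Raising `n` lengthens the list, which is the sense in which the task can be made longer.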