Attention and Action Recognition

Biomorphic Computing with Bill Tomlinson, UCI


To create a simple vision-centered perceptual system capable of concentrating or diffusing its attention depending on a combination of its perceptual mechanisms and its internal state.

Because creatures with different sensorimotor capabilities segment the world in different ways (visually, temporally, sonically, etc.), this project will use some common computer vision techniques in novel ways to create alternate forms of visual perception for use in responsive objects and environments.

When I originally proposed this project, I had not yet decided on an interface for visualizing the attention of the perceptual system. In the process, I came across an article (Rizzolatti and Arbib) linking action recognition to language formation. Action recognition describes the similar neural response evoked both by performing an action and by observing a similar action performed by someone else. Much of my work has focused on vision and embodiment, particularly technologies of reading. This article helped connect my research on peripheral and foveal attention for Biomorphic Computing with my other interests, so I decided to incorporate action recognition into the project.

The scenario would be as follows. There would be different portions of text, some peripheral and some focal (the distinction is determined by content). Depending on whether the reader fell into the camera's peripheral or foveal area, the system would pay attention to that area. This was visualized by making the text associated with that area readable: it would stop fluctuating and become larger. If the reader was in the foveal area, the focal text would become readable but would move in a rhythmic pattern. If the reader moved in a similar pattern, the text would remain readable; if not, it would go back to the defocused, chaotic stew of words.
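The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Max patches themselves: the names `region`, `similarity`, and `text_state`, the circular foveal area, the 0.2 radius, and the 0.7 threshold are all hypothetical. "Moving in a similar pattern" is approximated here by the Pearson correlation between the reader's motion trace and the text's, and the focal text is treated as readable only while that correlation holds.

```python
import math


def region(x, y, foveal_radius=0.2):
    """Classify a reader position, normalized to 0..1 on both axes.

    Assumption: the foveal area is a circle around the frame center
    and everything outside it counts as peripheral.
    """
    distance = math.hypot(x - 0.5, y - 0.5)
    return "foveal" if distance <= foveal_radius else "peripheral"


def similarity(reader_motion, text_motion):
    """Pearson correlation (-1..1) of two equal-length motion traces.

    Returns 0.0 when either trace is constant, e.g. a motionless reader.
    """
    n = len(reader_motion)
    mean_r = sum(reader_motion) / n
    mean_t = sum(text_motion) / n
    cov = sum((r - mean_r) * (t - mean_t)
              for r, t in zip(reader_motion, text_motion))
    var_r = sum((r - mean_r) ** 2 for r in reader_motion)
    var_t = sum((t - mean_t) ** 2 for t in text_motion)
    if var_r == 0 or var_t == 0:
        return 0.0
    return cov / math.sqrt(var_r * var_t)


def text_state(x, y, reader_motion, text_motion, threshold=0.7):
    """Return (attended text, readable?) for the current reader.

    Peripheral text is readable whenever the reader is in the periphery.
    Focal text stays readable only while the reader's motion follows
    the text's rhythm closely enough (hypothetical threshold).
    """
    if region(x, y) == "peripheral":
        return ("peripheral", True)
    return ("focal", similarity(reader_motion, text_motion) >= threshold)
```

Under these assumptions, a reader at the center of the frame who sways in step with the text keeps the focal text readable, while a reader who stands still there lets it dissolve again.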


+non-focal awareness
+history vision/visual memory
+distortion (fisheye, etc.)
+non-object motion tracking
+peripheral focus
+loss, blur, noise amplification
+random sampling
+curved surface projection
+Troxler fading (peripheral attention)
+variable sight (fish that change their visual capabilities depending on environment, season, time of day)
+multiple focal points (birds have two foveal regions: frontal and lateral)
+variable color dimensions (this itself is not necessarily feasible but...)
+'surface' subjectivity (like color, see above)
+feedback, proprioception (outside the scope of this project)
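
One item above, history vision/visual memory, could be read as letting past frames linger in what the system currently sees. The Python sketch below shows that reading as an exponentially decaying average of grayscale frames. It is an assumption about what the item might mean, not a description of the Max patches; `update_memory` and its `decay` parameter are hypothetical names.

```python
def update_memory(memory, frame, decay=0.9):
    """Blend a new grayscale frame into a running memory image.

    memory and frame are equally sized lists of rows of floats.
    A decay near 1.0 makes old frames fade slowly (long visual memory);
    a decay near 0.0 makes the memory track the newest frame.
    """
    return [
        [decay * m + (1.0 - decay) * f for m, f in zip(mem_row, frame_row)]
        for mem_row, frame_row in zip(memory, frame)
    ]
```

Calling this once per captured frame leaves a fading trail wherever something has moved, which is one simple way to give the system a short-term visual history.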

video capture
graphical output

text researched (only partially implemented)
from If on a winter's night a traveler by Italo Calvino

Ch. 1
You are about to begin reading Italo Calvino's new novel, If on a winter's night a traveler. Relax. Concentrate. Dispel every other thought. Let the world around you fade. Best to close the door; the TV is always on in the next room. Tell the others right away, "No, I don't want to watch TV!" Raise your voice-they won't hear you otherwise-"I'm reading! I don't want to be disturbed!" Maybe they haven't heard you, with all that racket; speak louder, yell: "I'm beginning to read Italo Calvino's new novel!" Or if you prefer, don't say anything: just hope they'll leave you alone.

Ch. 6, pp. 127-128
in New York, in the control room, the reader is soldered to the chair at the wrists, with pressure manometers and a stethoscopic belt, her temples beneath their crown of hair held fast by the serpentine wires of the encephalogram that mark the intensity of her concentration and the frequency of stimuli. "All our work depends on the sensitivity of the subject at our disposal for the control tests: and it must, moreover, be a person of strong eyesight and nerves, to be subjected to the uninterrupted reading of novels and variants of novels as they are turned out by the computer. If reading attention reaches certain highs with a certain continuity, the product is viable and can be launched on the market; if attention, on the contrary, relaxes and shifts, the combination is rejected and its elements are broken up and used again in other contexts."

The man in the white smock rips off one encephalogram after another, as if they were pages from a calendar. "Worse and worse," he says. "Not one novel being produced holds up. Either the programming has to be revised or the reader is not functioning."

Download the Max patches: focus.sitx [56k]
Intel's OpenCV | Max objects: cv.jit

Gibson, James J. "A theory of direct visual perception." In Alva Noe and Evan Thompson, eds. Vision and Mind: selected readings in the philosophy of perception. pp. 77-90. 2002.

Handy, Todd, et al. "Graspable objects grab attention when the potential for action is recognized." Nature Publishing Group. 2003.

Helmholtz, H. Treatise on Physiological Optics. Trans. James P. C. Southall. 1894.

Luo, H., Gaborski, R., and Acharya, R. "Robust Snake Model." Computer Vision and Pattern Recognition 2000 (CVPR2000), Hilton Head Island, SC. 2000.

Lou, Lianggang. "Selective Peripheral Fading: Evidence for inhibitory effect of attention on visual sensation."

Nikolic, M.I. and Sarter, N.B. "Peripheral Visual Feedback: A Powerful Means of Supporting Attention Allocation and Human-Automation Coordination in Highly Dynamic Data-Rich Environments." Human Factors, 43(1), 2000.

Peters, Christopher and O'Sullivan, Carol. "Bottom Up Attention for Virtual Human Characters." Computer Animation for Social Agents 2003.

Rizzolatti, Giacomo and Arbib, Michael A. "Language within our grasp." Trends in Neurosciences Vol. 21, No. 5, 1998.

Thompson, E., Palacios, A., and Varela, F. "Ways of Coloring: Comparative color vision as a case study for cognitive science." Behavioral and Brain Sciences 15 (1992): 1-26.

Uexkull, Baron Jakob von. "A Stroll Through the Worlds of Animals and Men." 1934.

How to display a flying dragon, from Johann Kestler, Physiologia Kircheriana Experimentalis, p. 247.

© Erik Conrad 1998-2006