Date of Award

Fall 2014

Degree Type


Degree Name

Doctor of Philosophy (PhD)


Industrial Engineering

First Advisor

Juan Wachs

Committee Chair

Juan Wachs

Committee Member 1

Eugenio Culurciello

Committee Member 2

Shimon Nof

Committee Member 3

Brad Duerstock


Paraphrasing the theory of embodied cognition, all aspects of our cognition are determined primarily by the contextual information and the means of physical interaction with data and information. In hybrid human-machine systems involving complex decision making, continuously maintaining a high level of attention while employing a deep understanding concerning the task performed as well as its context are essential. Utilizing embodied interaction to interact with machines has the potential to promote thinking and learning according to the theory of embodied cognition proposed by Lakoff. Additionally, the hybrid human-machine system utilizing natural and intuitive communication channels (e.g., gestures, speech, and body stances) should afford an array of cognitive benefits outstripping the more static forms of interaction (e.g., computer keyboard). This research proposes such a computational framework based on a Bayesian approach; this framework infers operator's focus of attention based on the physical expressions of the operators. Specifically, this work aims to assess the effect of embodied interaction on attention during the solution of complex, time-sensitive, spatial navigational problems. Toward the goal of assessing the level of operator's attention, we present a method linking the operator's interaction utility, inference, and reasoning. The level of attention was inferred through networks coined Bayesian Attentional Networks (BANs). BANs are structures describing cause-effect relationships between operator's attention, physical actions and decision-making. The proposed framework also generated a representative BAN, called the Consensus (Majority) Model (CMM); the CMM consists of an iteratively derived and agreed graph among candidate BANs obtained by experts and by the automatic learning process. Finally, the best combinations of interaction modalities and feedback were determined by the use of particular utility functions. This methodology was applied to a spatial navigational scenario; wherein, the operators interacted with dynamic images through a series of decision making processes. Real-world experiments were conducted to assess the framework's ability to infer the operator's levels of attention. Users were instructed to complete a series of spatial-navigational tasks using an assigned pairing of an interaction modality out of five categories (vision-based gesture, glove-based gesture, speech, feet, or body balance) and a feedback modality out of two (visual-based or auditory-based). Experimental results have confirmed that physical expressions are a determining factor in the quality of the solutions in a spatial navigational problem. Moreover, it was found that the combination of foot gestures with visual feedback resulted in the best task performance (p< .001). Results have also shown that embodied interaction-based multimodal interface decreased execution errors that occurred in the cyber-physical scenarios (p < .001). Therefore we conclude that appropriate use of interaction and feedback modalities allows the operators maintain their focus of attention, reduce errors, and enhance task performance in solving the decision making problems.