Date of Award

8-2018

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Industrial Engineering

Committee Chair

Juan P. Wachs

Committee Member 1

Ramses Martinez

Committee Member 2

Karthik Ramani

Committee Member 3

Richard Voyles

Abstract

Humans are able to understand meaning intuitively and to generalize from a single observation, whereas machines require many examples to learn and recognize a new physical expression. The absence of this ability in machines is one of the main roadblocks to natural human-machine interaction, particularly for gestures, which are an intrinsic part of human communication. Natural interaction with machines therefore requires a framework that incorporates the adaptability humans display when understanding gestures from a single observation. This problem is known as one-shot gesture recognition. Although it has been studied before, most existing approaches rely on purely numerical solutions and set aside the mechanisms humans use to perceive and execute gestures. This dissertation proposes a framework that incorporates the processes associated with gesture perception and execution into the one-shot gesture recognition paradigm.

By observing how humans perceive and process gestures, machines can learn to generate "humanlike" gestures. This is achieved by employing model-based physiological constraints and quantifications of human variability to suggest how humans might replicate a particular gesture. The two approaches implemented, referred to as the forward and backward approaches, rely on different aspects of human motion to generate these artificial gesture examples. The forward approach leverages spatial variability centered on the human shoulder and the reach of the hand within that work envelope. Conversely, the backward approach leverages the kinematic model of the human arm together with trajectory-planning strategies commonly used to model human motion, such as jerk minimization and minimization of energy expenditure. Both approaches begin from the same subset of key points within the motion trajectory. These points correlate with fluctuations in mu-band power recorded with electroencephalography (EEG) during gesture observation. Executed gestures were found to contain a bounded set of salient points within the motion trajectory, referred to as the gist of the gesture, that relate to the neural signatures observed in EEG while passively watching gestures. This gist can therefore be used to introduce large variability into each gesture while preserving the main traits of the gesture class.

The method is evaluated in terms of its independence from the classifier used, its efficiency relative to traditional N-shot learning approaches, and the coherence of its recognition results between machines and humans. The results demonstrate the performance of the developed framework, showing independence from the selected classification strategy across four different state-of-the-art classification algorithms. In the context of one-shot learning, the proposed framework resembles the way humans use their bodies for gesture recognition by generating artificial gesture examples that capture humanlike variation for all gesture classes.
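A minimal sketch of this example-generation idea, assuming minimum-jerk interpolation (the classic Flash and Hogan point-to-point model, one form of the jerk minimization cited for the backward approach) chained through spatially jittered key points standing in for the gist; the function names, jitter magnitude, and sample coordinates below are illustrative assumptions, not the dissertation's implementation:

```python
import numpy as np

def min_jerk(p0, p1, n=50):
    """Minimum-jerk interpolation between two 3-D points.

    Position follows p(tau) = p0 + (p1 - p0) * (10*tau^3 - 15*tau^4 + 6*tau^5),
    the polynomial that minimizes integrated squared jerk for a
    point-to-point reach (Flash & Hogan, 1985).
    """
    tau = np.linspace(0.0, 1.0, n)[:, None]        # normalized time in [0, 1]
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5     # minimum-jerk time scaling
    return p0 + (p1 - p0) * s

def synthesize_example(gist, sigma=0.02, rng=None):
    """Create one artificial gesture example from the gist key points.

    gist  : (k, 3) array of salient trajectory points for a gesture class
    sigma : std. dev. (meters) of spatial jitter per key point, a hypothetical
            stand-in for the quantified human variability
    """
    rng = np.random.default_rng() if rng is None else rng
    jittered = gist + rng.normal(scale=sigma, size=gist.shape)
    # Chain minimum-jerk segments through the perturbed key points.
    segments = [min_jerk(jittered[i], jittered[i + 1])
                for i in range(len(jittered) - 1)]
    return np.vstack(segments)

# Expand a single observed gesture into an N-example training set.
gist = np.array([[0.0, 0.0, 0.0], [0.2, 0.3, 0.1], [0.4, 0.1, 0.2]])
training_set = [synthesize_example(gist) for _ in range(20)]
```

Chaining per-segment minimum-jerk polynomials is only one way to smooth through the gist points; the backward approach described above additionally constrains the trajectory with the kinematic model of the human arm, which this sketch omits.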
