Video Processing for Safe Food Handling

Chengzhang Zhong, Purdue University

Abstract

A majority of foodborne illnesses result from inappropriate food handling practices. One proven way to reduce pathogens is to perform effective hand hygiene before all stages of food handling. Food handling involves a sequence of steps designed to satisfy good manufacturing practices (GMPs). Traditionally, assessing food handling quality requires hiring a food expert to conduct an audit, which is expensive. Recently, recognizing activities in video has become a rapidly growing field with wide-ranging applications. In this thesis, we propose to assess food handling quality, especially hand-hygiene quality, with the video analysis methods of action recognition and action detection. Our approaches address hand-hygiene assessment under different requirements, including variation in camera view and scenario.

For hand hygiene with egocentric video data, we create a two-stage system to localize and recognize all the hand-hygiene actions in each untrimmed video. In the first stage, we apply low-cost hand mask and motion histogram features to localize the temporal regions of hand-hygiene actions. In the second stage, we use a two-stream network model combined with a search algorithm to recognize all types of hand-hygiene actions that occur in the untrimmed video.

For hand hygiene with multi-camera video data, we design a two-stage system that processes untrimmed video from both egocentric and third-person cameras. In the first stage, a low-cost coarse classifier efficiently localizes the hand-hygiene period; in the second stage, more complex refinement classifiers recognize seven specific actions within the hand-hygiene period.

For hand hygiene across different scenarios, we propose a multi-modality framework to recognize hand-hygiene actions in untrimmed video sequences. We explore the capability of each modality (RGB, optical flow, hand segmentation mask, and human skeleton joints) to recognize certain subsets of hand-hygiene actions. We then construct an individual CNN for each of these modalities and apply a hierarchical method to coordinate the modalities to recognize each hand-hygiene action in the input untrimmed video.
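As a rough illustration of the two-stage localize-then-recognize pattern used in the first two systems above, the following Python sketch pairs a cheap coarse pass over an untrimmed video with a more expensive per-segment classifier. It is not the thesis's actual pipeline: the frame-difference motion score, the thresholds, and the stub refinement classifier (which a real system would replace with a trained model such as a two-stream CNN) are invented placeholders for illustration.

import numpy as np

def motion_score(prev_frame, frame):
    """Cheap per-frame motion proxy: mean absolute frame difference."""
    return np.mean(np.abs(frame.astype(np.float32) - prev_frame.astype(np.float32)))

def coarse_localize(frames, threshold=10.0, min_len=5):
    """Stage 1: flag contiguous runs of high-motion frames as candidate segments."""
    scores = [motion_score(frames[i - 1], frames[i]) for i in range(1, len(frames))]
    active = [s > threshold for s in scores]
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i
        elif not a and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and len(active) - start >= min_len:
        segments.append((start, len(active)))
    return segments

def refine_classify(frames, segment):
    """Stage 2 stand-in: a real system would run a trained action
    classifier on the segment's frames; here a trivial brightness
    rule picks between two made-up action labels."""
    start, end = segment
    brightness = np.mean([f.mean() for f in frames[start:end]])
    return "rub_hands" if brightness > 127 else "rinse_hands"

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "video": mostly static frames with a burst of motion.
    frames = [np.full((32, 32), 100, dtype=np.uint8) for _ in range(40)]
    for i in range(15, 30):
        frames[i] = rng.integers(0, 255, (32, 32), dtype=np.uint8)
    for seg in coarse_localize(frames):
        print(seg, refine_classify(frames, seg))

The point of the coarse-to-fine split is economy: the cheap first stage discards the bulk of the untrimmed video, so the expensive classifier only runs on the short candidate segments.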
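The third system's hierarchical coordination of per-modality CNNs can be sketched in a similar spirit. In this toy version (again, not the thesis's actual architecture), each stub function stands in for a CNN trained on one modality and is trusted only on the subset of actions that modality distinguishes well; experts earlier in the hierarchy get the first chance to claim a clip, and a general fallback handles the rest. The action names, scores, and confidence gate are all invented for illustration.

import numpy as np

def skeleton_classifier(clip):
    """Stand-in for a skeleton-joint CNN: suited to large arm motions."""
    return {"turn_on_faucet": 0.1, "soap_hands": 0.2}

def hand_mask_classifier(clip):
    """Stand-in for a hand-segmentation CNN: suited to hand-shape actions."""
    return {"rub_palms": 0.7, "rub_fingers": 0.2}

def rgb_flow_classifier(clip):
    """Stand-in for RGB + optical-flow streams: the general fallback."""
    return {"rinse_hands": 0.5, "dry_hands": 0.4}

def hierarchical_predict(clip, gate=0.6):
    """Consult modality experts in order; accept the first confident
    prediction, otherwise fall back to the final (general) classifier."""
    experts = [skeleton_classifier, hand_mask_classifier, rgb_flow_classifier]
    for expert in experts[:-1]:
        scores = expert(clip)
        action, conf = max(scores.items(), key=lambda kv: kv[1])
        if conf >= gate:
            return action, conf
    scores = experts[-1](clip)
    return max(scores.items(), key=lambda kv: kv[1])

if __name__ == "__main__":
    clip = np.zeros((16, 32, 32))  # placeholder clip tensor
    print(hierarchical_predict(clip))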

Degree

Ph.D.

Advisors

Amy Reibman, Purdue University.

Subject Area

Artificial intelligence; Design
