You Only Gesture Once (YOUGO): American Sign Language Translation Using YOLOv3

Mehul Nanda, Purdue University

Abstract

Sign language is a medium of communication used primarily by people who are deaf or mute. It allows them to communicate their thoughts and ideas to the world. Sign language has a defined vocabulary, grammar, and associated lexicon. There are different types of sign language based on geography and the associated spoken language, such as American Sign Language, British Sign Language, and Japanese Sign Language. The emphasis of this research is on American Sign Language (ASL). Communication through sign language can be orchestrated in a variety of ways. Certain words of the spoken language, such as Hello, Mom, and Dad, have designated signs and can be directly represented and interpreted through simple gestures. However, some words do not have pre-defined signs. In these cases, a technique called “fingerspelling” is used to spell the word out using signs for the individual letters of the alphabet; typically, fingerspelling is used when someone is conveying their name. Research in the field of sign language interpretation and translation was sparse prior to the introduction of deep learning methods and algorithms. The most common technique for interpretation is to use image processing algorithms to extract features from orchestrated gestures and then use convolutional neural networks to learn from these features and classify the gestures. Advances in deep learning have led to object detection algorithms that, used in conjunction with neural networks, can identify a wide variety of objects. Current research applies such object detection and neural network combinations to identify words from the sign language vocabulary by treating the corresponding gestures as objects. You Only Look Once (YOLO) is one such algorithm that excels at identifying custom objects; it is used in conjunction with a neural network architecture known as Darknet. People who use sign language often need to rely on a translator to convey what they are trying to say to a person who does not understand sign language. Dependence on a translator can create issues and limit the person’s ability to act independently. A system that helps people use sign language without depending on another person can help them be independent and give them the confidence to present themselves to the world without fear. The focus of this research is to propose a system that accurately identifies an orchestrated gesture and maps it to the corresponding word or letter in the sign language vocabulary using object detection algorithms in conjunction with neural networks, specifically YOLOv3.
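
Below is a minimal sketch of the kind of detection pipeline the abstract describes: a Darknet-format YOLOv3 model run on a single frame through OpenCV's DNN module, with each detected gesture mapped to a class label. The file names (yolov3-asl.cfg, yolov3-asl.weights, asl.names) and the input image are hypothetical placeholders under assumed training on ASL gesture classes, not the thesis's actual model or code.

# Sketch: YOLOv3 (Darknet format) gesture detection on one frame via OpenCV DNN.
# All file names below are hypothetical placeholders.
import cv2
import numpy as np

CFG, WEIGHTS, NAMES = "yolov3-asl.cfg", "yolov3-asl.weights", "asl.names"
CONF_THRESH, NMS_THRESH = 0.5, 0.4

with open(NAMES) as f:
    classes = [line.strip() for line in f]          # e.g. "hello", "A", "B", ...

net = cv2.dnn.readNetFromDarknet(CFG, WEIGHTS)      # load Darknet architecture + weights
out_names = net.getUnconnectedOutLayersNames()      # YOLOv3's three detection heads

frame = cv2.imread("gesture.jpg")                   # placeholder input frame
h, w = frame.shape[:2]

# YOLOv3 expects a 416x416 RGB blob with pixel values scaled to [0, 1]
blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(out_names)

boxes, confidences, class_ids = [], [], []
for output in outputs:
    for det in output:                              # det = [cx, cy, bw, bh, objectness, class scores...]
        scores = det[5:]
        class_id = int(np.argmax(scores))
        conf = float(scores[class_id])
        if conf > CONF_THRESH:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(conf)
            class_ids.append(class_id)

# Non-maximum suppression keeps one box per detected gesture
keep = cv2.dnn.NMSBoxes(boxes, confidences, CONF_THRESH, NMS_THRESH)
for i in np.array(keep).flatten():
    x, y, bw, bh = boxes[i]
    print(f"{classes[class_ids[i]]}: {confidences[i]:.2f} at ({x}, {y}, {bw}, {bh})")

In a live-translation setting, the same loop would run on successive webcam frames, concatenating detected letters (for fingerspelling) or words into the translated output.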

Degree

M.Sc.

Advisors

Gusev, Purdue University.

Subject Area

Artificial intelligence|Communication|Language|Neurosciences|Translation studies
