Date of Award


Degree Type


Degree Name

Master of Science (MS)


Computer and Information Technology

Committee Chair

Baijian Yang

Committee Member 1

John A. Springer

Committee Member 2

Tonglin Zhang


This research proposes a new ftting algorithm of logistic regression on IRWLS that utilizes the procedure of scanning data row-by-row and has the ability to acquire an exact result with only a few iterations. Furthermore, this research also realizes the distributed parallelization of the proposed method on Spark and conducts various experiments to manifest its memory-wise advantage over the traditional methods such as Spark MLlib package. The results show that the proposed method can provide an exact result rather than an approximated one within 5 or 6 iterations; achieve a satisfying accuracy for fight delay prediction within 1 or 2 iterations; has a better potential for parallelization and a better performance than MLlib with a 3-4x faster speed without full optimizations; and its performance is not undermined by an increasing data memory ratio.