Abstract:
The purposes of this research were 1) to analyze the factors involved in the dropout
of undergraduate students, 2) to propose a model for predicting the dropout of undergraduate
students, and 3) to compare the performance of three classification techniques:
Decision Tree, K-Nearest Neighbors, and Naive Bayes. The data were collected from the
undergraduate student registration database of Ubon Ratchathani Rajabhat University for the
academic years 2015 to 2017. The dataset has 11 attributes and 13,729 records. The data
were analyzed using an information-theory-based selection method. The results showed that 1) eight
factors influence student dropout; 2) these factors were used to build models with the three
techniques, and 10-fold cross-validation was used to evaluate the prediction accuracy
of each technique; and 3) the Naive Bayes model
has the best performance among the three techniques. It has an average accuracy of 93.58%, which is
higher than those of Decision Tree and K-Nearest Neighbors, which have average accuracies of 93.52% and
87.95%, respectively. The findings also indicated that students' decision to drop out was
significantly influenced by student loans, major of study, grade point average, and the
occupation of their parents.
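
As an illustration of how an information-theory-based selection step can rank attributes, the following minimal sketch computes information gain for categorical attributes. The abstract does not state which measure was used, so information gain is an assumption here, and the attribute names and toy records are hypothetical, not taken from the study's dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on a categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each attribute value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy records (assumption: not from the study's dataset).
dropout = ["yes", "yes", "no", "no"]
has_loan = ["no", "no", "yes", "yes"]  # separates the two classes completely
gender = ["f", "m", "f", "m"]          # carries no information in this toy data

# Rank attributes from most to least informative.
ranked = sorted(
    {"has_loan": has_loan, "gender": gender}.items(),
    key=lambda item: information_gain(item[1], dropout),
    reverse=True,
)
```

Attributes with higher information gain would be kept as candidate factors; the threshold that yields the eight selected factors is not given in the abstract.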