A non-performing loan (NPL) occurs when a loan is not repaid by the agreed date. Under F88 regulations, a debt is classified as bad once it is 90 days overdue. NPLs contribute directly to the organization's losses. This study aims to build a predictive model for new customers using 22 attributes that describe customer characteristics. Each prediction is classified as either NPL (Bad) or performing loan (Good). The model was built on data from 83,732 customers covering 2019 to 2022. The results can be used as a management tool for analyzing loan applications from prospective customers. The expected benefits are to curb the growth of customers in the NPL category, automate the decision-making process, and reduce risk and diligence reporting costs. Because the data are highly imbalanced, over-sampling and under-sampling are used to handle the imbalance. In the modeling phase, a reinforcement learning algorithm, Deep Q-Learning, was compared with the machine learning algorithms Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, LightGBM and XGBoost. The final predictive model uses Deep Q-Learning because it outperforms the single-classifier algorithms and can handle imbalanced datasets. Finally, the project will be deployed to production using Streamlit.
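To make the two resampling strategies concrete, the following is a minimal sketch of plain random over-sampling and under-sampling for a Good/Bad label. It is an illustration only: the abstract does not say which resampling variant or library the study used, and the function names, the toy data, and the 95/5 class split are all hypothetical.

```python
import random
from collections import Counter


def _group_by_label(rows, labels):
    """Collect rows into one list per class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        # Sample with replacement to fill the gap up to the majority size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of larger classes so all classes match the smallest."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        # Sample without replacement down to the minority size.
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 95 performing loans and 5 NPLs, one feature per row.
toy_rows = [[i] for i in range(100)]
toy_labels = ["Good"] * 95 + ["Bad"] * 5

over_rows, over_labels = random_oversample(toy_rows, toy_labels)
under_rows, under_labels = random_undersample(toy_rows, toy_labels)
over_counts = Counter(over_labels)
under_counts = Counter(under_labels)
```

Over-sampling keeps every original row but repeats minority rows, which can encourage overfitting to those few examples; under-sampling avoids duplication but discards majority rows. In either case, resampling would normally be applied to the training split only, so that evaluation reflects the original class balance.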