This project explores the use of machine learning techniques to detect the implementation of Environmental, Social, and Governance (ESG) practices in banks through the analysis of their annual reports. As the importance of ESG factors continues to grow, effective and automated methods for evaluating ESG disclosures have become essential for investors, regulators, and stakeholders. While most existing studies process English-language text, this research proposes several methodologies designed specifically for a Vietnamese-language dataset. Because research on ESG in Vietnam is still limited, relevant data is scarce. We therefore build a dataset from the annual reports of major banks in the country and apply machine learning algorithms, particularly Natural Language Processing (NLP) techniques, to classify and assess the ESG content of these reports. Using models such as Support Vector Machines (SVM), Logistic Regression, and BERT-based architectures, we analyze the text of the reports to identify key ESG factors. The final classification model achieves an accuracy of 70.8%. Our findings contribute to improving transparency and accountability in ESG reporting, offering a valuable tool for assessing banks’ sustainability practices. This research highlights the potential of machine learning to streamline ESG evaluations, helping to ensure that financial institutions align with global sustainability standards and meet growing stakeholder expectations.