Credit Scoring on the LendingClub Lending Platform: Mitigating Racial Bias in Credit Scoring Models with Fairness Methods

Date

2024

Abstract

This research aims to mitigate racial bias in the credit scoring models of non-traditional P2P lending platforms. It combines the Reweighing pre-processing fairness algorithm, Cost-Sensitive Modeling, and Post Hoc Model Explanation to build a multi-class fair credit scoring process and examine its feasibility. The empirical study compares the classification results of different model designs using Accuracy, Average Cost, and Unfairness metrics.

Visual analysis combining the LendingClub dataset with data from the United States Census Bureau shows that the dataset does contain racial disparities. Two-sample nonparametric hypothesis tests on individual variables (Wilcoxon rank-sum tests and Chi-square tests) also show significant differences between Privileged and Unprivileged groups. In other words, the data used to predict LendingClub grades can lead to unfair results. The effect sizes of these tests (Rosenthal correlation, Cramer's V) serve as quantitative evidence of the association between individual variables and unfair results.

Models were built on the LendingClub P2P lending platform dataset with the C5.0 decision tree algorithm, which was chosen because it accommodates the weight settings required by the fairness algorithm, Cost-Sensitive Modeling, and Global Post Hoc Model Explanation. The results show that the Reweighing fairness algorithm reduces both the Unfairness and the Average Cost of the models without using additional proxy variables. In addition, combining the fairness algorithm with Cost-Sensitive Modeling further reduces Average Cost while still reducing Unfairness.

For operators of P2P lending platforms, the fair credit scoring process used in this research can give borrowers fairer access to credit while protecting the interests of lenders and other stakeholders. Post Hoc Model Explanation further helps borrowers, lenders, and other platform stakeholders understand and audit complex machine learning credit scoring models, which strengthens the relationship between platforms and stakeholders.
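As an illustration of the Reweighing step named in the abstract, the sketch below computes instance weights using the commonly cited formulation, in which each record is weighted by the expected joint probability of its group and label under independence divided by the observed joint probability. This is not the thesis's implementation, which is not shown in this record; the function names, group labels, and toy grades are all hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Return one weight per record: P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_label_share(groups, labels, weights, group, label):
    """Weighted fraction of records in `group` that carry `label`."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    hits = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == label
    )
    return hits / total


# Hypothetical toy data: group membership and a multi-class grade per record.
groups = ["privileged"] * 6 + ["unprivileged"] * 4
labels = ["A", "A", "A", "A", "B", "C", "A", "B", "B", "C"]

weights = reweighing_weights(groups, labels)

# After reweighing, the weighted share of each grade should match across groups.
share_priv = weighted_label_share(groups, labels, weights, "privileged", "A")
share_unpriv = weighted_label_share(groups, labels, weights, "unprivileged", "A")
```

In a pipeline like the one described above, such weights would be passed to a learner that accepts per-instance weights, which the abstract gives as one reason for choosing C5.0.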

Keywords

Credit Scoring, Fair Lending, Fair Credit Scoring, Fairness Algorithm, Cost-Sensitive, P2P Lending
