Credit Risk Control

Design an end-to-end machine learning system for real-time loan approval/rejection, for example for credit card applications. Discuss the infrastructure, features, model, training, and evaluation aspects of the system.

1. requirements

functional

  • stage: customer acquisition, pre-loan (loan origination), mid-loan (loan maintenance/servicing), post-loan (delinquency management/recovery)

  • types of loans it will support

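  • types of risk it will support -> this determines what kind of machine learning task it is and where the difficulties lie

  • goal: distinguish good customers from bad ones (risk, demand, value)

    • business metrics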

non-functional

  • compliance requirements

  • scalability goals

  • reliability, security

2. ML task & pipeline & keys

Credit risk decision flow: blacklist/whitelist, then rules + model. The model task can be binary classification, multi-class classification, or multi-label classification.
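The decision flow above can be sketched roughly as follows. This is a minimal illustration, not a production design: the `Decision` type, the `decide` function, the field names (`user_id`, `age`, `income`), the automatic approval of whitelisted users, and the 0.1 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass
class Decision:
    approved: bool
    stage: str   # which stage produced the decision
    reason: str


def decide(
    applicant: dict,
    blacklist: Set[str],
    whitelist: Set[str],
    rules: List[Tuple[str, Callable[[dict], bool]]],  # (name, predicate); True means "reject"
    score_fn: Callable[[dict], float],                # hypothetical model: returns P(bad)
    threshold: float = 0.1,                           # assumed cut-off, tuned per business
) -> Decision:
    uid = applicant["user_id"]
    # 1. Hard lists are checked first because they are cheap and deterministic.
    if uid in blacklist:
        return Decision(False, "blacklist", "user is blacklisted")
    if uid in whitelist:  # assumption: whitelisted users skip rules and model
        return Decision(True, "whitelist", "user is whitelisted")
    # 2. Expert rules reject clear violations before the model is called.
    for name, is_violated in rules:
        if is_violated(applicant):
            return Decision(False, "rule", name)
    # 3. The model scores whatever is left (binary classification here).
    p_bad = score_fn(applicant)
    if p_bad >= threshold:
        return Decision(False, "model", f"p_bad={p_bad:.3f} >= {threshold}")
    return Decision(True, "model", f"p_bad={p_bad:.3f} < {threshold}")


# Toy usage with a stand-in for the real model.
rules = [("age_under_18", lambda a: a["age"] < 18)]
score_fn = lambda a: 0.05 if a["income"] > 5000 else 0.4
print(decide({"user_id": "u3", "age": 30, "income": 6000}, {"u1"}, {"u2"}, rules, score_fn))
```

Returning the deciding stage and reason with every decision keeps the flow auditable, which matters for the compliance requirements listed earlier.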

3. data

  • user (relational database)

    • credit

    • fraud

  • log (distributed file system)

  • label

    • roll rate, vintage analysis

    • identify risk points from the data metrics

    • credit: loans that are overdue and not repaid are the bad ("black") samples
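Roll rate analysis is commonly used to choose the delinquency bucket that defines a "bad" label: the bucket from which accounts rarely return to current. A minimal sketch follows, assuming monthly snapshots given as `(loan_id, month_index, bucket)` tuples with bucket names such as `M0` (current) and `M1` (one month past due); the function name and the data layout are hypothetical.

```python
from collections import Counter, defaultdict


def roll_rates(records):
    """Return {from_bucket: {to_bucket: share}} for month-over-month transitions.

    records: iterable of (loan_id, month_index, bucket) snapshots.
    """
    by_loan = defaultdict(dict)
    for loan_id, month, bucket in records:
        by_loan[loan_id][month] = bucket

    counts = defaultdict(Counter)
    for months in by_loan.values():
        for month, bucket in months.items():
            next_bucket = months.get(month + 1)
            if next_bucket is not None:  # skip the last observed month
                counts[bucket][next_bucket] += 1

    return {
        src: {dst: n / sum(dst_counts.values()) for dst, n in dst_counts.items()}
        for src, dst_counts in counts.items()
    }


# Toy data: four loans observed for three months each.
records = [
    ("A", 0, "M0"), ("A", 1, "M1"), ("A", 2, "M2"),
    ("B", 0, "M0"), ("B", 1, "M0"), ("B", 2, "M0"),
    ("C", 0, "M1"), ("C", 1, "M2"), ("C", 2, "M3"),
    ("D", 0, "M1"), ("D", 1, "M0"), ("D", 2, "M0"),
]
rates = roll_rates(records)
print(rates["M0"])  # share of current loans that stay current vs. roll to M1
print(rates["M1"])  # share of M1 loans that cure vs. roll to M2
```

If, on real data, nearly all accounts in some bucket roll forward rather than cure, that bucket is a reasonable candidate threshold for the bad label.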

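Demand: Repayment (contract performance) risk: Repayment ability:

External data

  • Due to compliance requirements, much internet big data can only be brought in through credit reporting platforms, in the form of scores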

4. feature

Feature engineering is an art.

user

  • ID/Address Proof: Voter ID, Aadhaar, PAN Card

  • Employment Information, including salary slips

  • Credit Score

  • Bank Statements and Previous Loan Statements

  • important anti-fraud features, such as IP, device_id, idfv, phone number


  • graph

  • behavior sequences

  • compliance -> buy some user features

5. model

interpretability

  • Credit Scoring Models (scorecards)

  • LR

  • GBDT

  • NN

  • Probability Calibration
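To connect the scorecard and calibration items: a calibrated probability of default is often mapped to a score with the widely used "points to double the odds" (PDO) scaling, so that higher scores mean lower risk. The sketch below assumes that convention; the function name and the defaults (600 points at 50:1 good:bad odds, 20 points to double the odds) are illustrative choices, not values from these notes.

```python
import math


def probability_to_score(p_bad, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a calibrated P(bad) to a scorecard score.

    base_score is the score assigned at good:bad odds of base_odds,
    and every additional pdo points doubles those odds.
    """
    if not 0.0 < p_bad < 1.0:
        raise ValueError("p_bad must be strictly between 0 and 1")
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1.0 - p_bad) / p_bad
    return offset + factor * math.log(odds_good)


for p in (0.20, 1 / 51, 1 / 101):
    print(f"p_bad={p:.4f} -> score={probability_to_score(p):.1f}")
```

The mapping is only meaningful if the input probabilities are well calibrated, which is why calibration matters even when the ranking metrics already look good.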

6. evaluation

  • offline

    • accuracy, AUC, log loss, precision, recall

    • Kolmogorov-Smirnov (KS), a metric commonly used in risk control
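As a rough illustration of the two ranking metrics above: KS is the maximum gap between the cumulative share of bad and of good accounts as the score threshold is swept, and AUC is the probability that a randomly chosen bad account scores higher than a randomly chosen good one. The sketch assumes labels of 1 for bad and 0 for good, with higher scores meaning higher risk; the function names are hypothetical, and the quadratic AUC loop is for clarity only.

```python
def ks_statistic(scores, labels):
    """Max |cumulative bad share - cumulative good share| over thresholds."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one bad and one good sample")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])  # riskiest first
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume all samples tied at this score before measuring the gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks


def auc(scores, labels):
    """P(score of a bad > score of a good), ties counted as 0.5. O(n^2)."""
    bad = [s for s, y in zip(scores, labels) if y == 1]
    good = [s for s, y in zip(scores, labels) if y == 0]
    if not bad or not good:
        raise ValueError("need at least one bad and one good sample")
    wins = sum(1.0 if b > g else 0.5 if b == g else 0.0 for b in bad for g in good)
    return wins / (len(bad) * len(good))


scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(ks_statistic(scores, labels), auc(scores, labels))
```

Both metrics depend only on the ordering of the scores, so they say nothing about calibration; log loss covers that side.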

7. deploy & serving

  • feature service

  • prediction service

8. monitoring & maintenance

  • Approval Rate

9. QA & optimization

  • cold start

  • which (credit score) segment of customers does the profit/revenue come from, and which segment does the risk come from
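One simple way to approach the last question is to bucket accounts into credit score bands and compare profit and bad rate per band. The sketch below assumes each account is a dict with hypothetical keys `score`, `profit`, and `is_bad`; the 50-point band width is an arbitrary choice.

```python
from collections import defaultdict


def summarize_by_score_band(accounts, band_width=50):
    """Aggregate count, total profit, and bad rate per credit score band."""
    agg = defaultdict(lambda: {"n": 0, "profit": 0.0, "bad": 0})
    for account in accounts:
        low = (int(account["score"]) // band_width) * band_width
        band = (low, low + band_width - 1)
        agg[band]["n"] += 1
        agg[band]["profit"] += account["profit"]
        agg[band]["bad"] += 1 if account["is_bad"] else 0
    return {
        band: {
            "n": agg[band]["n"],
            "total_profit": agg[band]["profit"],
            "bad_rate": agg[band]["bad"] / agg[band]["n"],
        }
        for band in sorted(agg)
    }


# Toy data for illustration only.
accounts = [
    {"score": 610, "profit": -100.0, "is_bad": True},
    {"score": 640, "profit": 50.0, "is_bad": False},
    {"score": 720, "profit": 30.0, "is_bad": False},
]
for band, stats in summarize_by_score_band(accounts).items():
    print(band, stats)
```

Bands where total profit is negative or the bad rate is high are candidates for a tighter approval threshold or different pricing.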
