Credit card approval

A credit card dataset for machine learning.

Source data can be found here.


Credit scorecards are a common risk-control method in the financial industry. They use personal information and data submitted by credit card applicants to predict the probability of future defaults and credit card borrowing, which lets the bank decide whether to issue a credit card to the applicant. Credit scores objectively quantify the magnitude of risk.


The dataset consists of two tables:

  • application_record - contains information about each applicant; the primary key is ID.
  • credit_record - contains a set of monthly records for each applicant; ID is a foreign key.
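Because ID links the two tables, each applicant can be paired with their monthly credit records. The following is a minimal sketch of that one-to-many join using only the Python standard library. The rows are made up for illustration and are not taken from the dataset, and the helper names `group_credit_by_id` and `join_tables` are hypothetical.

```python
from collections import defaultdict

# Made-up rows standing in for the two tables (illustrative only).
application_record = [
    {"ID": 1, "CODE_GENDER": "F"},
    {"ID": 2, "CODE_GENDER": "M"},
]
credit_record = [
    {"ID": 1, "MONTHS_BALANCE": 0, "STATUS": "C"},
    {"ID": 1, "MONTHS_BALANCE": -1, "STATUS": "0"},
    {"ID": 3, "MONTHS_BALANCE": 0, "STATUS": "X"},
]


def group_credit_by_id(records):
    """Collect the monthly records of each client under their ID."""
    grouped = defaultdict(list)
    for row in records:
        grouped[row["ID"]].append(row)
    return grouped


def join_tables(applications, records):
    """Attach to every applicant the list of their monthly credit records.

    Applicants with no credit history get an empty list; credit records
    whose ID has no matching application are dropped.
    """
    grouped = group_credit_by_id(records)
    return [
        {**app, "credit_history": grouped.get(app["ID"], [])}
        for app in applications
    ]


joined = join_tables(application_record, credit_record)
for row in joined:
    print(row["ID"], len(row["credit_history"]))
```

Keeping applicants without any credit history (a left join) is a design choice made here for illustration; whether such rows are useful depends on the modelling task.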


Columns in application_record:

  • ID: Client number
  • CODE_GENDER: Gender
  • FLAG_OWN_CAR: Whether the applicant owns a car
  • FLAG_OWN_REALTY: Whether the applicant owns property
  • CNT_CHILDREN: Number of children
  • AMT_INCOME_TOTAL: Annual income
  • NAME_INCOME_TYPE: Income category
  • NAME_EDUCATION_TYPE: Education level
  • NAME_FAMILY_STATUS: Marital status
  • NAME_HOUSING_TYPE: Housing type
  • DAYS_BIRTH: Date of birth, as days counted backwards from the current day (0); -1 means yesterday
  • DAYS_EMPLOYED: Start date of employment, as days counted backwards from the current day (0). A positive value means the person is currently unemployed.
  • FLAG_MOBIL: Whether the applicant has a mobile phone
  • FLAG_WORK_PHONE: Whether the applicant has a work phone
  • FLAG_PHONE: Whether the applicant has a phone
  • FLAG_EMAIL: Whether the applicant has an email address
  • OCCUPATION_TYPE: Occupation
  • CNT_FAM_MEMBERS: Family size
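DAYS_BIRTH and DAYS_EMPLOYED are day offsets rather than calendar dates, so they usually need converting before use. The sketch below shows one way to do that. The function names are hypothetical, the 365.25-day year is an approximation, and the reference date passed to `days_to_date` is an assumption, since the extraction date is not stated here.

```python
from datetime import date, timedelta

DAYS_PER_YEAR = 365.25  # approximation that averages in leap years


def days_to_date(days_offset, reference):
    """Turn a day offset (0 = reference day, -1 = the day before) into a date."""
    return reference + timedelta(days=days_offset)


def age_in_years(days_birth):
    """Approximate age in years from a negative DAYS_BIRTH offset."""
    return -days_birth / DAYS_PER_YEAR


def employment_info(days_employed):
    """Interpret DAYS_EMPLOYED.

    A positive value marks a currently unemployed person, so no tenure
    is reported; otherwise the offset is converted to years of tenure.
    """
    if days_employed > 0:
        return {"employed": False, "years_employed": None}
    return {"employed": True, "years_employed": -days_employed / DAYS_PER_YEAR}


# Example with an assumed reference date (not given by the dataset).
reference_day = date(2020, 1, 2)
print(days_to_date(-1, reference_day))
print(age_in_years(-7305))
print(employment_info(-1461))
print(employment_info(365))
```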


Columns in credit_record:

  • ID: Client number
  • MONTHS_BALANCE: Record month, counted backwards from the month the data was extracted (0 is the current month, -1 is the previous month, and so on)
  • STATUS: Payment status (0: 1-29 days past due; 1: 30-59 days past due; 2: 60-89 days past due; 3: 90-119 days past due; 4: 120-149 days past due; 5: overdue for more than 150 days, including bad debts and write-offs; C: paid off that month; X: no loan for the month)
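The STATUS codes mix digits and letters, so a small lookup can make them easier to work with. The sketch below decodes the codes and picks the most severe status in a client's history. Ranking C and X below every overdue code is an assumption made for this example, and `severity` and `worst_status` are hypothetical helper names.

```python
# Human-readable labels for the STATUS codes listed above.
STATUS_LABELS = {
    "0": "1-29 days past due",
    "1": "30-59 days past due",
    "2": "60-89 days past due",
    "3": "90-119 days past due",
    "4": "120-149 days past due",
    "5": "more than 150 days overdue, bad debt or write-off",
    "C": "paid off that month",
    "X": "no loan for the month",
}


def severity(status):
    """Rank a STATUS code; higher means more overdue.

    Assumption: C and X carry no overdue balance, so both rank below 0.
    """
    if status not in STATUS_LABELS:
        raise ValueError(f"unknown STATUS code: {status!r}")
    return int(status) if status.isdigit() else -1


def worst_status(statuses):
    """Return the most severe STATUS code in a client's monthly history."""
    return max(statuses, key=severity)


history = ["C", "0", "2", "X"]  # made-up monthly statuses for one client
worst = worst_status(history)
print(worst, "->", STATUS_LABELS[worst])
```

A summary such as the worst status per client is one possible starting point for defining a default label; the cutoff that separates good from bad clients is not specified by the dataset and has to be chosen by the modeller.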
