Bank Customer Churn Prediction with Tidymodels - Part 2: Decision Threshold Analysis
Continuing the exploration of the Bank Customer Churn problem, this article explains, in terms accessible to non-technical audiences, how decision threshold analysis with the 'probably' package in R can be used to optimize model performance.
Decision threshold analysis involves systematically evaluating model performance across a range of thresholds, with the aim of minimizing the costs associated with false positives (unnecessary retention offers) and false negatives (lost customers).
To perform this analysis, we first obtain predicted probabilities for churn from our classification model. Then, we define a decision threshold that converts these probabilities into binary churn predictions. Next, we use the 'probably' package to systematically vary this threshold, evaluate model performance, and calculate a cost function.
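As a minimal sketch of those first two steps, the following assumes a fitted tidymodels workflow (`churn_fit`), a held-out test set (`churn_test`), and a factor outcome `churn` with levels "yes"/"no"; the object and column names are illustrative rather than taken from the original analysis.

```r
library(tidymodels)
library(probably)

# Assumed objects: `churn_fit` is a fitted workflow, `churn_test` the test set,
# and `churn` the factor outcome with levels "yes"/"no" (names are illustrative).
churn_preds <- augment(churn_fit, churn_test)  # adds .pred_class, .pred_yes, .pred_no

# Convert probabilities to hard classes at a chosen threshold instead of the default 0.5
churn_preds <- churn_preds %>%
  mutate(
    .pred_custom = make_two_class_pred(
      estimate  = .pred_yes,   # probability of the "yes" (churn) class
      levels    = levels(churn),
      threshold = 0.35         # example threshold, not the tuned value
    )
  )
```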
The 'probably' package in R facilitates decision threshold optimization and cost-based evaluation by working with predicted class probabilities. A typical workflow involves using the model to get predicted class probabilities on test data, creating a cost matrix reflecting business costs, and using 'probably' functions to evaluate metrics and choose an optimal threshold minimizing expected cost.
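A simple way to turn that workflow into numbers is to sweep a grid of thresholds and compute the expected cost from the confusion counts at each one. The sketch below builds on the `churn_preds` frame from above and uses placeholder costs; it is not the article's exact cost function.

```r
library(dplyr)
library(purrr)

# Placeholder costs: a false positive costs one retention offer, a false negative
# roughly one customer's annualized CLV (replaced by the CLV-based estimates later).
cost_fp <- 99
cost_fn <- 1500

thresholds <- seq(0.05, 0.95, by = 0.05)

cost_by_threshold <- map_dfr(thresholds, function(thr) {
  churn_preds %>%
    mutate(pred = factor(if_else(.pred_yes >= thr, "yes", "no"),
                         levels = levels(churn))) %>%
    summarise(
      .threshold = thr,
      fp         = sum(pred == "yes" & churn == "no"),
      fn         = sum(pred == "no"  & churn == "yes"),
      total_cost = fp * cost_fp + fn * cost_fn
    )
})

# Threshold with the lowest expected cost
cost_by_threshold %>% arrange(total_cost) %>% slice_head(n = 1)
```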
In our hypothetical scenario analysis, we approximated the total cost of false negatives and false positives using Customer Lifetime Value (CLV) and a cost of intervention, assumed to equal the $99 annual fee for a standard account. Annualized CLV was calculated as the sum of account and credit card fees, with each product carrying a $99 annual fee except credit cards, which carry $149; the median CLV was then taken as the approximate CLV per customer.
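As a worked example of those fee assumptions (the column names `num_products` and `has_credit_card` are guesses at the Kaggle fields, not confirmed), a customer holding three products including a credit card would have an annualized CLV of 2 × $99 + $149 = $347.

```r
# Annualized CLV per customer: $99 per product, except credit cards at $149.
# `num_products` counts products held and `has_credit_card` is 0/1; both
# column names are assumptions about the Kaggle data.
churn_clv <- churn_test %>%
  mutate(
    annual_clv = (num_products - has_credit_card) * 99 + has_credit_card * 149
  )

cost_fn <- median(churn_clv$annual_clv)  # approximate cost of a lost customer (FN)
cost_fp <- 99                            # cost of an intervention (standard account fee)
```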
The scenario analysis identified a decision threshold that minimizes cost. However, the lowest-cost threshold also reduces classification performance, presenting a trade-off between a model that differentiates the classes moderately well and a cheaper configuration that triggers more interventions and produces more false positive predictions.
The 'probably::threshold_perf()' function was used to carry out threshold analysis, identifying an optimal threshold based on the J-index. Additionally, the 'workflowsets::extract_workflow_set_result' function was used to generate a tibble of all trialled hyperparameter combinations, and the best was selected based on a specified metric.
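A hedged sketch of both calls, assuming the prediction frame from earlier and a workflow set named `churn_wf_set` with a workflow id such as "recipe_xgboost" (both names are illustrative):

```r
# Sweep thresholds and keep the one that maximizes the J-index
# (sensitivity + specificity - 1); assumes "yes" is the event level.
threshold_results <- churn_preds %>%
  threshold_perf(
    truth      = churn,
    estimate   = .pred_yes,
    thresholds = seq(0.05, 0.95, by = 0.01)
  )

best_j <- threshold_results %>%
  filter(.metric == "j_index") %>%
  slice_max(.estimate, n = 1)

# Pull the tuning results for one workflow out of a workflow set and
# select the best hyperparameter combination by a chosen metric.
best_params <- churn_wf_set %>%
  workflowsets::extract_workflow_set_result(id = "recipe_xgboost") %>%
  tune::select_best(metric = "roc_auc")
```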
When applying this analysis to your own dataset or model output, the costs should be adjusted to your exact business context, such as the "cost of retaining a non-churner" versus the "cost of missing a churner".
This approach makes the threshold decision explicit and data-driven rather than fixed at 0.5, providing a more informed and effective way to manage bank customer churn. The analysis was carried out using the 'probably' package from tidymodels, and the dataset used in the study was obtained from Kaggle.
More broadly, in data and cloud computing settings within the finance sector, the 'probably' package enables decision threshold optimization for business problems such as bank customer churn: by evaluating a range of thresholds against the costs of false positives and false negatives, churn prediction models can be tuned toward their most cost-effective operating point.