
Wise Data Science

Human-in-the-Loop Threshold Management System for Machine Learning Models

Machine Learning (ML) models are at the core of modern businesses, powering everything from fraud detection to personalized recommendations. However, effectively deploying these models requires ensuring their decisions align with real-world business goals, regulatory requirements, and user experience. This often comes down to optimizing a single, critical parameter: the model threshold.

A threshold is a cutoff point used to convert continuous probability scores (e.g., a number between 0 and 1) from classification models into discrete class labels (e.g., "fraud" or "not fraud"). For example, if a model predicts a transaction has a 0.8 probability of being fraudulent, a threshold of 0.7 would classify it as "fraud," while a threshold of 0.9 would classify it as "not fraud."

Decision Flow based on Threshold. This flowchart visually represents the application of a threshold to a model's score. If the score is greater than or equal to the threshold, the item is flagged for investigation; otherwise, it is allowed to proceed.
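The decision flow above can be sketched in a few lines of Python. This is a minimal illustration, not Wise's production code: the function name `classify` and the labels "flag" and "allow" are assumptions chosen for this example.

```python
def classify(score: float, threshold: float) -> str:
    """Flag the item when its score reaches the threshold, otherwise allow it."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    # ">=" matches the flowchart: a score equal to the threshold is flagged.
    return "flag" if score >= threshold else "allow"


# The example from the text: a score of 0.8 under two different thresholds.
print(classify(0.8, 0.7))  # flag
print(classify(0.8, 0.9))  # allow
```

The same score leads to opposite decisions, which is why the choice of threshold matters as much as the model itself.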


Traditionally, optimizing these thresholds has been a complex, time-consuming task, often left to highly technical experts. At Wise, we faced the challenge of managing diverse customer segments, each with unique risk profiles and compliance mandates. As a solution, we developed Threshold UI, a Human-in-the-Loop (HITL) system that empowers non-technical stakeholders to directly influence and optimize ML model performance, marking a shift towards Human-Guided Machine Learning.

The Bottleneck: Optimizing ML Thresholds

While seemingly simple, selecting the right threshold is profoundly impactful and deceptively difficult, especially in high-stakes real-world applications like financial services.

1) Diverse Customer Cohorts (Subrules): Our customer base isn't monolithic. We serve individuals, businesses, and different geographic regions, each with unique risk profiles and regulatory obligations. A "one-size-fits-all" threshold often leads to sub-optimal outcomes. This necessitates different thresholds for different customer segments, which we refer to as "subrules."


This diagram illustrates how incoming transfers are processed by a fraud engine and then distributed into various customer cohorts (A, B, C, D, E). Each customer cohort is shown with a distinct numerical value (e.g., 0.7, 0.55), representing the specific threshold applied to that cohort. This highlights the necessity of having diverse, segment-specific thresholds to cater to the unique risk profiles and requirements of different customer groups.
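As a rough sketch of how subrules might be represented, the snippet below keeps one threshold per cohort and looks it up at decision time. The cohort names and the two values 0.7 and 0.55 echo the diagram; the mapping of values to cohorts, the fallback `DEFAULT_THRESHOLD`, and the function name `should_hold` are assumptions made for illustration.

```python
# Hypothetical subrule table: one threshold per customer cohort.
SUBRULE_THRESHOLDS = {
    "A": 0.70,
    "B": 0.55,
}

# Assumed fallback for cohorts without a dedicated subrule.
DEFAULT_THRESHOLD = 0.60


def should_hold(cohort: str, score: float) -> bool:
    """Return True when the transfer should be held for investigation."""
    threshold = SUBRULE_THRESHOLDS.get(cohort, DEFAULT_THRESHOLD)
    return score >= threshold


# The same score of 0.6 is allowed for cohort A but held for cohort B.
print(should_hold("A", 0.6))  # False
print(should_hold("B", 0.6))  # True
```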

2) Interdependent and Complex Constraints: It's not enough to optimize for each subrule independently. We face a web of interconnected constraints:

  • Global Constraints: For instance, maximum fraud rates required to operate in a specific region or a global limit on the number of transactions that can be held for scrutiny.

  • Local (Cohort-Specific) Constraints: Regulatory requirements might dictate that for customers in a specific country, the false positive rate for a particular model must be below a certain percentage, or a minimum recall rate (percentage of actual fraud caught) must be maintained for high-risk cohorts.

  • Business Objectives: Beyond regulation, different business objectives might apply. For one cohort, minimizing false negatives (risk of potential fraud) might be paramount, while for another, minimizing false positives (risk of disrupting legitimate transactions) could be the priority due to customer experience considerations.
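To make the local constraints concrete, here is a small sketch that measures recall and false positive rate for one cohort at a given threshold and checks them against limits. The function names, the toy scores and labels, and the limit values are all assumptions for illustration, not figures from Wise.

```python
def confusion_counts(scores, labels, threshold):
    """Count true/false positives and negatives at a given threshold."""
    tp = fp = fn = tn = 0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_fraud:
            tp += 1
        elif flagged and not is_fraud:
            fp += 1
        elif not flagged and is_fraud:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def meets_local_constraints(scores, labels, threshold, min_recall, max_fpr):
    """Check a cohort-specific minimum recall and maximum false positive rate."""
    tp, fp, fn, tn = confusion_counts(scores, labels, threshold)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall >= min_recall and fpr <= max_fpr


# Toy cohort: model scores and whether each transfer was actually fraud.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

# Threshold 0.5 misses one fraud case (recall 2/3), so it fails min_recall=0.9.
print(meets_local_constraints(scores, labels, 0.50, min_recall=0.9, max_fpr=0.4))  # False
# Threshold 0.35 catches all fraud with a false positive rate of 1/3.
print(meets_local_constraints(scores, labels, 0.35, min_recall=0.9, max_fpr=0.4))  # True
```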


3) Combinatorial Explosion: Imagine having 20 customer cohorts for a given model, and for each cohort hundreds of potential threshold values to select from. Manually exploring all combinations to find the set of thresholds that satisfies all global and local constraints, while optimizing for an overall business objective, is a monumental task. The problem quickly becomes intractable: with n = 400 candidate thresholds for each of k = 20 cohorts, the search space consists of n^k = 400^20 combinations, roughly 10^52, which is more than the estimated number of atoms on Earth. Evaluating that many options by hand is simply not feasible.
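The size of this search space is easy to verify with Python's arbitrary-precision integers. The variable names below are chosen for this example only.

```python
import math

n_thresholds = 400  # candidate threshold values per cohort
k_cohorts = 20      # number of customer cohorts

# Each cohort picks one threshold independently, so the counts multiply.
combinations = n_thresholds ** k_cohorts

print(f"{combinations:.3e}")           # 1.100e+52
print(len(str(combinations)))          # 53 digits
print(math.log10(combinations) > 52)   # True
```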

4) Impact on Business and Compliance: The implications of sub-optimal threshold management are significant. Slow model refreshes mean models might not adapt quickly to changing fraud patterns or market conditions. Non-compliance with regulations can lead to severe penalties. Most importantly, it can negatively impact customer trust and experience if legitimate transactions are unnecessarily delayed or blocked, or if genuine risks are missed.

This "last mile" problem of ML deployment is a critical bottleneck, hindering the agility and effectiveness of our machine learning systems.

Introducing Threshold UI: Your Co-Pilot for ML Decisions

Threshold UI is a sophisticated Human-in-the-Loop (HITL) system designed to put the power of ML decision-making directly into the hands of those who understand the business context best. It combines invaluable human domain knowledge, business priorities, and regulatory constraints with advanced optimization algorithms to find the best possible solution.

1. Empowering Non-Technical Users 

The platform is engineered for usability, abstracting away underlying mathematical complexity. Non-technical users can interact with Threshold UI through:

  • Direct Input of Business Constraints: Users define constraints using clear, business-centric language (e.g., "Can we find a threshold combination that keeps average daily suspension count below X?").

  • Clear Objective Function Selection: Users select the primary business objective they want to optimize for each model or subrule (e.g., minimizing overall cost, maximizing fraud detected or balancing precision and recall).

  • Scalability: With a growing portfolio of models and an increasing number of customer cohorts, managing these constraints manually would be impossible. Threshold UI automates this process, ensuring that scalability isn’t compromised by complexity.

The Threshold UI configuration screen allows non-technical stakeholders to define model constraints and objectives. Users can set global limits (e.g., "Daily backlog limit") and subrule-specific constraints (e.g., minimum precision or recall). They can also choose the objective function (e.g., F2-score) for each subrule, translating complex business requirements into actionable optimization parameters. The "Threshold alternatives" field indicates the granularity of threshold options the underlying algorithm considers.
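For readers unfamiliar with the F2-score mentioned above, it is the F-beta score with beta = 2, which weights recall more heavily than precision. The sketch below uses the standard F-beta formula; the function name `f_beta` and the sample precision and recall values are illustrative assumptions.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is caught correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# With beta = 2, high recall is rewarded more than high precision.
high_recall = f_beta(precision=0.5, recall=0.8)
high_precision = f_beta(precision=0.8, recall=0.5)
print(high_recall > high_precision)  # True
```

Choosing F2 for a subrule therefore signals that missing fraud is considered more costly than holding a legitimate transfer for that cohort.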

2. Offline Experimentation and One-Click Deployment 

Threshold UI facilitates rapid, risk-free experimentation and deployment.

  • Generate Multiple "What-If" Scenarios: Users define several alternative sets of constraints and objective functions. Threshold UI then instantly calculates the optimal thresholds for each scenario, providing a comparative view of how different strategic choices would impact model performance.

  • Side-by-Side Comparison: The UI presents these different optimization options in a clear, comparative format, allowing product managers to understand the trade-offs and benefits of each.

  • One-Click Deployment: Once an optimal set of thresholds is identified, deploying it is as simple as a single click. This drastically reduces the time from decision to production, cutting model refresh times from hours or days to minutes.

This detailed view within Threshold UI provides a granular breakdown of metrics for each subrule across different threshold options (Baseline vs. New/Optimized). Users can examine how proposed threshold changes affect key performance indicators for specific customer segments, enabling informed decisions and fine-tuning for diverse cohorts. This helps in understanding the precise impact of optimized thresholds on various aspects of the business.
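The what-if comparison can be illustrated with a toy example. The sketch below is not the OTLP algorithm that Threshold UI uses; it is a brute-force search over a tiny, made-up candidate table, intended only to show how changing a global constraint (here a hypothetical daily backlog limit) changes the optimal set of thresholds. All names and numbers are assumptions.

```python
from itertools import product

# Hypothetical candidates per cohort: threshold -> (daily holds, fraud cases caught)
CANDIDATES = {
    "A": {0.5: (40, 9), 0.7: (20, 7), 0.9: (5, 3)},
    "B": {0.5: (30, 6), 0.7: (12, 5), 0.9: (4, 2)},
}


def best_thresholds(candidates, daily_backlog_limit):
    """Try every threshold combination and keep the feasible one that catches
    the most fraud. Returns (caught, holds, thresholds), or None if no
    combination satisfies the backlog limit."""
    cohorts = sorted(candidates)
    best = None
    for combo in product(*(sorted(candidates[c]) for c in cohorts)):
        holds = sum(candidates[c][t][0] for c, t in zip(cohorts, combo))
        caught = sum(candidates[c][t][1] for c, t in zip(cohorts, combo))
        if holds > daily_backlog_limit:
            continue  # violates the global constraint
        if best is None or caught > best[0]:
            best = (caught, holds, dict(zip(cohorts, combo)))
    return best


# Two what-if scenarios that differ only in the global backlog limit.
print(best_thresholds(CANDIDATES, daily_backlog_limit=35))
# (12, 32, {'A': 0.7, 'B': 0.7})
print(best_thresholds(CANDIDATES, daily_backlog_limit=60))
# (14, 52, {'A': 0.5, 'B': 0.7})
```

Brute force works here because there are only nine combinations; at the scale described earlier, an exhaustive search is infeasible, which is why a dedicated optimization method is needed.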

3. Ensuring Responsible AI: Threshold UI and Model Governance

Threshold UI plays a pivotal role in strengthening our model governance framework, particularly in model deployment and ongoing performance management. This includes:

  • Risk Management: Identifying and mitigating potential risks associated with model decisions, including bias, fairness, and unintended consequences.

  • Regulatory Compliance: Ensuring models adhere to internal policies, industry standards, and external regulations (e.g., GDPR, financial regulations).

  • Enhanced Transparency and Auditability: Every optimization run, along with the specified constraints, objectives, and resulting optimal thresholds, is recorded. This creates a clear, auditable trail of decision-making.

  • Scalable Governance: Threshold UI provides a scalable solution for operationalizing governance policies, allowing for consistent application of rules across a vast portfolio of models and customer segments.
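One way to picture the audit trail described above is a record per optimization run that is serialized to an append-only log. The sketch below is an assumption about what such a record could contain; the class name `OptimizationRun`, its fields, and the sample values are hypothetical and do not describe Wise's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class OptimizationRun:
    """Hypothetical audit record for a single optimization run."""
    model: str
    objective: str
    constraints: dict
    thresholds: dict
    requested_by: str
    # Timestamp is captured in UTC when the record is created.
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_audit_line(run: OptimizationRun) -> str:
    """Serialize a run as one JSON line, with sorted keys for stable diffs."""
    return json.dumps(asdict(run), sort_keys=True)


run = OptimizationRun(
    model="fraud-model-example",
    objective="f2_score",
    constraints={"daily_backlog_limit": 35, "A": {"min_recall": 0.9}},
    thresholds={"A": 0.7, "B": 0.7},
    requested_by="product.manager@example.com",
)
print(to_audit_line(run))
```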


4. Beyond the Numbers: The Human-Guided Machine Learning Paradigm

Threshold UI embodies a profound shift towards Human-Guided Machine Learning, recognizing that human intelligence, intuition, and domain expertise remain indispensable. It creates a synergistic relationship where humans and AI collaborate, each leveraging their unique strengths.

The design philosophy behind Threshold UI is deeply rooted in principles of Human-Computer Interaction (HCI):


  • Intuitive Design: Complex optimization problems are abstracted into clear, visual, and interactive elements, reducing cognitive load and broadening accessibility.

  • Feedback Loops: Users receive immediate, understandable feedback on the impact of their choices, reinforcing learning and building confidence.

  • Transparency and Explainability: While the underlying algorithm, Output Thresholding Using Mixed Integer Linear Programming (OTLP), is sophisticated, Threshold UI presents its results in an easily digestible format. Users can see how constraints are met, what trade-offs were made, and the specific impact on different cohorts. For details on the algorithm, refer to the post "OTLP: Output Thresholding Using Mixed Integer Linear Programming".

  • Human in the Loop: Threshold UI ensures that the human is always in control, providing the strategic direction and ethical guardrails, while the system handles the computational heavy lifting. This avoids the pitfalls of fully automated systems that might optimize for technical metrics at the expense of business reality.
