Hidden correlations can mislead optimization methods
*Towards Data Science*
p99, the value below which 99% of observations fall, is widely used to track and optimize worst-case performance across industries. For example, the time taken to load a web page, fulfill a shopping order, or deliver a shipment can all be optimized by monitoring p99.

While p99 is undoubtedly valuable, it's important to recognize that it ignores the top 1% of observations, which can have an unexpectedly large impact when they are correlated with other critical business metrics. Blindly chasing p99 without checking for such correlations can undermine other business goals.

In this article, we will analyze the limitations of p99 through an example with dummy data, understand when to rely on p99, and explore alternative metrics.

Imagine an e-commerce platform where a team is tasked with optimizing the shopping cart checkout experience. The team has received customer complaints that checking out is rather slow compared to other platforms. So, the team grabs the latest 1,000 checkouts and analyzes the time taken to check out. (I created some dummy data for this; you are free to use it and tinker with it without restrictions.)
```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

order_time = pd.read_csv('https://gist.githubusercontent.com/kkraoj/77bd8332e3155ed42a2a031ce63d8903/raw/458a67d3ebe5b649ec030b8cd21a8300d8952b2c/order_time.csv')
fig, ax = plt.subplots(figsize=(4, 2))
sns.histplot(data=order_time, x='fulfillment_time_seconds', bins=40, color='k', ax=ax)
print(f'p99 for fulfillment_time_seconds: {order_time.fulfillment_time_seconds.quantile(0.99):0.2f} s')
```
As expected, most shopping cart checkouts seem to complete within a few seconds, and 99% of checkouts happen within 12.1 seconds. In other words, the p99 is 12.1 seconds. There are a few long-tail cases that take as long as 30 seconds. Since they are so few, they might be outliers and should be safe to ignore, right?

Now, if we don't pause and analyze the implication of that last sentence, it could be quite dangerous. Is it really safe to ignore the top 1%? Are we sure checkout times are not correlated with any other business metric?

Let's say our e-commerce company also cares about gross merchandise value (GMV) and has an overall company-level goal to increase it. We should immediately check whether the time taken to check out is correlated with GMV before we ignore the top 1%.
```python
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter

order_value = pd.read_csv('https://gist.githubusercontent.com/kkraoj/df53cac7965e340356d6d8c0ce24cd2d/raw/8f4a30db82611a4a38a90098f924300fd56ec6ca/order_value.csv')
df = pd.merge(order_time, order_value, on='order_id')
fig, ax = plt.subplots(figsize=(4, 4))
sns.scatterplot(data=df, x='fulfillment_time_seconds', y='order_value_usd', color='k')
plt.yscale('log')
ax.yaxis.set_major_formatter(ScalarFormatter())
```
Oh boy! Not only is the cart value correlated with checkout times, it increases exponentially for longer checkout times. What is the penalty of ignoring the top 1% of checkout times?
```python
pct_revenue_ignored = df.loc[df.fulfillment_time_seconds > df.fulfillment_time_seconds.quantile(0.99), 'order_value_usd'].sum() / df.order_value_usd.sum() * 100
print(f'If we only focused on p99, we would ignore {pct_revenue_ignored:0.0f}% of revenue')
## >>> If we only focused on p99, we would ignore 27% of revenue
```
If we only focused on p99, we would ignore 27% of revenue (27 times greater than the 1% we thought we were ignoring). That is, p99 of checkout times is p73 of revenue. Focusing on p99 in this case inadvertently harms the business: it ignores the needs of our highest-value shoppers.
```python
df.sort_values('fulfillment_time_seconds', inplace=True)
dfc = df.cumsum() / df.cumsum().max()  # percent cumulative sum
fig, ax = plt.subplots(figsize=(4, 4))
ax.plot(dfc.fulfillment_time_seconds.values, color='k')
ax2 = ax.twinx()
ax2.plot(dfc.order_value_usd.values, color='magenta')
ax.set_ylabel('cumulative fulfillment time')
ax.set_xlabel('orders sorted by fulfillment time')
ax2.set_ylabel('cumulative order value', color='magenta')
ax.axvline(0.99 * 1000, linestyle='--', color='k')
ax.annotate('99% of orders', xy=(970, 0.05), ha='right')
ax.axhline(0.73, linestyle='--', color='magenta')
ax.annotate('73% of revenue', xy=(0, 0.75), color='magenta')
```
Above, we see why there is a large mismatch between the percentiles of checkout times and GMV. The GMV curve rises sharply near the 99th percentile of orders, resulting in the top 1% of orders having an outsized impact on GMV.

This is not just an artifact of our dummy data. Such extreme correlations are unfortunately not uncommon. For example, the top 1% of Slack's customers account for 50% of revenue, and about 12% of UPS's revenue comes from just one customer (Amazon).

To avoid the pitfalls of optimizing for p99 alone, we can take a more holistic approach.

One solution is to track both p99 and p100 (the maximum value) simultaneously. This way, we won't be prone to ignoring high-value users.
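As a minimal sketch of this (with made-up timing values, not the article's dataset), tracking both is just one extra call alongside the quantile we already compute:

```python
import pandas as pd

# Hypothetical checkout times in seconds (illustrative only)
times = pd.Series([1.2, 2.5, 3.1, 4.0, 30.0])

p99 = times.quantile(0.99)  # value below which 99% of observations fall
p100 = times.max()          # the single worst observation
print(f'p99 = {p99:.2f} s, p100 = {p100:.2f} s')
```

On a dashboard, a large and persistent gap between the two lines is the signal that the tail beyond p99 deserves a look.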
Another solution is to use revenue-weighted p99 (or weighting by gross merchandise value, profit, or any other business metric of interest), which assigns greater importance to observations with higher associated revenue. This metric ensures that optimization efforts prioritize the most valuable transactions or processes, rather than treating all observations equally.
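pandas has no built-in weighted quantile, but one can be sketched with NumPy. The helper below is hypothetical (not from the original article): it sorts observations and finds where the cumulative weight, here revenue, crosses the requested fraction:

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Return the value below which a fraction q of the total weight falls."""
    order = np.argsort(values)
    v = np.asarray(values, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cum_frac = np.cumsum(w) / w.sum()
    return v[np.searchsorted(cum_frac, q)]

# Toy data: the one slow checkout carries 70% of the revenue,
# so the revenue-weighted median is the slow checkout itself
times = [1.0, 2.0, 3.0, 30.0]
revenue = [10, 10, 10, 70]
print(weighted_quantile(times, revenue, 0.5))  # → 30.0
```

With equal weights this reduces to an ordinary (nearest-rank) quantile; with revenue weights, slow-but-valuable orders pull the metric toward the tail instead of vanishing into the ignored 1%.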
Finally, when extreme correlations exist between performance and business metrics, a more stringent p99.5 or p99.9 can mitigate the risk of ignoring high-value users.
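Operationally, a stricter tail percentile is just a different quantile argument (synthetic uniform data here, for illustration only):

```python
import pandas as pd

# Hypothetical checkout times: 1 s through 1000 s, one order each
times = pd.Series(range(1, 1001), dtype=float)

# Each stricter percentile covers more of the tail before we stop looking
print(times.quantile(0.99), times.quantile(0.995), times.quantile(0.999))
```

The trade-off is noise: the closer the metric gets to p100, the more a single slow request moves it, so stricter percentiles are best paired with larger sample windows.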
It's tempting to rely solely on metrics like p99 for optimization efforts. However, as we saw, ignoring the top 1% of observations can negatively impact a large share of other business outcomes. Tracking both p99 and p100, or using revenue-weighted p99, can provide a more comprehensive view and mitigate the risks of optimizing for p99 alone. At the very least, let's remember not to focus narrowly on a performance metric while losing sight of overall customer outcomes.