Function to find boundaries for outliers


Mon Sep 05 2022 09:55:05 GMT+0000 (Coordinated Universal Time)

Saved by @DataSynapse82 #python #pandas #dataset #eda #outliers #boundaries

def find_boundaries(df, variable, distance=1.5):
    # Return (upper, lower) outlier boundaries for a column using the IQR rule.
    q1 = df[variable].quantile(0.25)
    q3 = df[variable].quantile(0.75)
    IQR = q3 - q1

    lower_boundary = q1 - (IQR * distance)
    upper_boundary = q3 + (IQR * distance)

    # Note the order: upper boundary first, then lower.
    return upper_boundary, lower_boundary

1.5 is the standard distance value. To flag only very extreme values, increase it; 3 is a common choice.
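
A quick usage sketch, in case it helps to see the effect of `distance`. The DataFrame and the `price` column are made up for illustration, and the function is repeated so the block runs on its own.

```python
import pandas as pd


def find_boundaries(df, variable, distance=1.5):
    # Same IQR rule as the snippet above; returns (upper, lower).
    q1 = df[variable].quantile(0.25)
    q3 = df[variable].quantile(0.75)
    iqr = q3 - q1
    return q3 + iqr * distance, q1 - iqr * distance


# Hypothetical data: one mild outlier (19) and one extreme outlier (95).
df = pd.DataFrame({"price": [10, 11, 11, 12, 12, 13, 14, 19, 95]})

# Default distance of 1.5 flags both unusual values.
upper, lower = find_boundaries(df, "price")
outliers = df[(df["price"] > upper) | (df["price"] < lower)]

# A larger distance keeps only the very extreme value.
upper_ext, lower_ext = find_boundaries(df, "price", distance=3)
extreme = df[(df["price"] > upper_ext) | (df["price"] < lower_ext)]

print(outliers["price"].tolist())
print(extreme["price"].tolist())
```

Remember to unpack the result as `upper, lower`, since the function returns the upper boundary first.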