Function to find boundaries for outliers
Mon Sep 05 2022 09:55:05 GMT+0000 (Coordinated Universal Time)
Saved by @DataSynapse82
#python
#pandas
#dataset
#eda
#outliers
#boundaries
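def find_boundaries(df, variable, distance=1.5):
    # Interquartile range (IQR) of the column: Q3 - Q1
    IQR = df[variable].quantile(0.75) - df[variable].quantile(0.25)
    # The boundaries sit `distance` IQRs below Q1 and above Q3
    lower_boundary = df[variable].quantile(0.25) - (IQR * distance)
    upper_boundary = df[variable].quantile(0.75) + (IQR * distance)
    # Note the order: upper boundary first, then lower
    return upper_boundary, lower_boundary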
1.5 is the standard distance value (the usual IQR rule). To flag only very extreme values, increase it; 3 is a common choice.
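A minimal usage sketch, to show how the distance value changes what gets flagged. The DataFrame and its "price" column are made-up toy data, not from the original snippet, and the function is repeated in compact form only so the example runs on its own.

```python
import pandas as pd


def find_boundaries(df, variable, distance=1.5):
    # Same logic as the function above, repeated so this example is self-contained.
    q1 = df[variable].quantile(0.25)
    q3 = df[variable].quantile(0.75)
    iqr = q3 - q1
    return q3 + iqr * distance, q1 - iqr * distance


# Hypothetical toy data: mostly ordinary values, one mild and one extreme outlier.
df = pd.DataFrame({"price": [10, 12, 11, 13, 12, 11, 10, 12, 13, 17, 100]})

# Standard distance (1.5). The function returns the upper boundary first.
upper, lower = find_boundaries(df, "price")
outliers = df[(df["price"] > upper) | (df["price"] < lower)]
print(upper, lower)                 # 16.0 8.0
print(outliers["price"].tolist())   # [17, 100]

# Larger distance (3) keeps only the very extreme value.
upper_far, lower_far = find_boundaries(df, "price", distance=3)
extreme = df[(df["price"] > upper_far) | (df["price"] < lower_far)]
print(extreme["price"].tolist())    # [100]
```

With the default distance both 17 and 100 fall outside the boundaries; with distance=3 only 100 does.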