Forward fill column names

Thu Aug 06 2020 08:57:00 GMT+0000 (Coordinated Universal Time)

Saved by @import_fola #python #pandas #data-cleaning

def ffill_cols(df, cols_to_fill_name='Unn'):
    """
    Forward fills column names. Propagates the last valid column name forward to the following unnamed columns. Works
    similarly to pandas ffill().
    
    :param df: pandas DataFrame; the DataFrame whose column names should be filled
    :param cols_to_fill_name: str; prefix of the column names to forward fill. Default is 'Unn', since
    pandas names unnamed columns 'Unnamed: <number>'
    
    :returns: list; List of new column names
    """
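    cols = df.columns.to_list()
    for i, name in enumerate(cols):
        # Skip index 0: cols[-1] would wrap around and copy the last column's name.
        # Non-string labels (e.g. ints) have no startswith() and are never placeholders.
        if i > 0 and isinstance(name, str) and name.startswith(cols_to_fill_name):
            cols[i] = cols[i - 1]
    return cols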

When reading Excel files of government data, the header row often uses merged cells, so a single column name spans several columns. When such a file is read into pandas, only the first cell of the merged range keeps the name, and the others are named 'Unnamed: <number>'. This function forward fills those unnamed columns so they carry the same name as the named cell to their left.
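
To make the effect concrete, here is a minimal sketch of the same forward-fill logic applied to a plain list of header names, so it runs without pandas or an Excel file. The helper name `ffill_names` and the sample headers are hypothetical, chosen only for illustration.

```python
def ffill_names(names, prefix="Unnamed"):
    """Forward fill placeholder names in a plain list (hypothetical helper)."""
    filled = list(names)  # copy so the caller's list is left untouched
    for i, name in enumerate(filled):
        # The first name has nothing to its left, so it is never filled.
        if i > 0 and isinstance(name, str) and name.startswith(prefix):
            filled[i] = filled[i - 1]
    return filled


# Hypothetical header row, as pandas would report it for a sheet where
# "Population" and "Income" each span two merged cells.
raw = ["Region", "Population", "Unnamed: 2", "Income", "Unnamed: 4"]
filled = ffill_names(raw)
print(filled)
# ['Region', 'Population', 'Population', 'Income', 'Income']
```

With a real DataFrame, the equivalent step would be `df.columns = ffill_cols(df)`. The filled names are duplicated by design, so selecting `df['Population']` afterwards returns every column that shares that name.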