Forward fill column names
Thu Aug 06 2020 08:57:00 GMT+0000 (UTC)
Saved by
@import_fola
#python
#pandas
#data-cleaning
def ffill_cols(df, cols_to_fill_name='Unn'):
    """
    Forward fills column names. Propagates the last valid column name forward to the unnamed columns
    that follow it. Works similarly to pandas ffill().
    :param df: pandas DataFrame; DataFrame whose column names should be forward filled
    :param cols_to_fill_name: str; Prefix of the column names you would like forward filled. Default is 'Unn' as
    the default name pandas gives unnamed columns starts with 'Unnamed'
    :returns: list; List of new column names
    """
    cols = df.columns.to_list()
    for i, j in enumerate(cols):
        # Skip index 0: there is no earlier name to copy, and cols[-1] would wrap around to the last column.
        # str(j) keeps non-string labels (e.g. integers) from raising on startswith.
        if i > 0 and str(j).startswith(cols_to_fill_name):
            cols[i] = cols[i-1]
    return cols
When reading Excel files of government data, the header row is often built with merged cells, so a single column name is spread across multiple cells.
When such a sheet is read into pandas, only the cell holding the column name keeps it and the others are named 'Unnamed: <number>'. This code forward fills the unnamed columns so they carry the same name as the named cell.
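A minimal usage sketch of the idea described above. The DataFrame is hypothetical: it is built by hand to mimic what pandas produces for a sheet with merged header cells, and the column names 'Year', 'Population' and 'Income' are made up for illustration. The function is repeated in compact form only so the example runs on its own.

```python
import pandas as pd


def ffill_cols(df, cols_to_fill_name='Unn'):
    # Same forward-fill idea as ffill_cols above, repeated so this sketch is self-contained
    cols = df.columns.to_list()
    for i, j in enumerate(cols):
        if i > 0 and str(j).startswith(cols_to_fill_name):
            cols[i] = cols[i - 1]
    return cols


# Hypothetical frame: 'Population' and 'Income' each spanned two merged cells in the sheet
df = pd.DataFrame(
    [[2019, 10, 12, 3, 4], [2020, 11, 13, 5, 6]],
    columns=['Year', 'Population', 'Unnamed: 2', 'Income', 'Unnamed: 4'],
)

new_cols = ffill_cols(df)
print(new_cols)
# ['Year', 'Population', 'Population', 'Income', 'Income']

# The function only returns a list; assign it back to rename the columns
df.columns = new_cols
```

Note that the result contains duplicate column names, so selecting df['Population'] returns both columns. If the sheet has a second header row with sub-headings, combining the filled names with that row is one way to make them unique again.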