Useful Pandas Methods

Data Understanding

  1. read_csv()
  2. head()
  3. tail()
  4. info()
  5. describe()
  6. isnull(), isnull().sum()
  7. duplicated(keep=False)
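
As a rough sketch of how these inspection calls fit together, the snippet below loads a tiny in-memory CSV and runs each method once. The CSV content and its values are made up for illustration (only the Education and Income column names come from these notes); in practice you would pass a file path to read_csv().

```python
import io
import pandas as pd

# Hypothetical CSV content; normally this would be a file path.
csv_text = """ID,Education,Income
1,Graduation,58138
2,PhD,
3,Graduation,58138
3,Graduation,58138
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))      # first 2 rows (the default is 5)
print(df.tail(2))      # last 2 rows
df.info()              # column dtypes and non-null counts
print(df.describe())   # summary statistics for numeric columns

# Number of missing values in each column
missing_per_column = df.isnull().sum()

# keep=False marks every row that has an identical copy, not just the later ones
dup_mask = df.duplicated(keep=False)

print(missing_per_column)
print(df[dup_mask])
```

With keep=False, both copies of a repeated row are flagged, which makes it easier to inspect duplicates before deciding which ones to drop.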

Data Cleaning

  1. dropna()
  2. fillna()
  3. drop_duplicates()
  4. to_datetime()
    pd.to_datetime() converts strings (object dtype) or other date-like values to the datetime64 dtype.
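
A minimal sketch of the cleaning steps, assuming a small made-up DataFrame. The Dt_Customer column name and all values are hypothetical, and filling with the median is just one possible choice.

```python
import pandas as pd

# Hypothetical data: one missing income and one exact duplicate row.
df = pd.DataFrame({
    "Income": [50000.0, None, 70000.0, 70000.0],
    "Dt_Customer": ["2012-09-04", "2014-03-08", "2013-08-21", "2013-08-21"],
})

# Option 1: drop every row that contains a missing value
dropped = df.dropna()

# Option 2: keep the row and fill the gap (here with the column median)
filled = df.fillna({"Income": df["Income"].median()})

# Remove identical rows, keeping the first occurrence
deduped = filled.drop_duplicates()

# Convert the date strings to datetime64 in a new DataFrame
cleaned = deduped.assign(Dt_Customer=pd.to_datetime(deduped["Dt_Customer"]))

print(cleaned.dtypes)
```

Each of these methods returns a new object by default, so the result has to be assigned to a variable for the change to persist.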

Data Analysis and Manipulation

  1. value_counts()
    df['Education'].value_counts()
  2. unique()
    df['Education'].unique()
  3. nunique()
    df['Education'].nunique()  # Output: 5
  4. sort_values()
    df.sort_values(by='Income', ascending=False)
  5. query()
    query() filters the DataFrame, keeping only the rows that satisfy a condition written as a string.
    df.query('Income > 100000') or, equivalently, df[df['Income'] > 100000]
  6. groupby()

    df.groupby('Education')['Income'].mean()

  7. pivot_table()
    pivot_table() builds a spreadsheet-style pivot table.
    The four arguments used most often are data, index, columns and values.
    By default, it aggregates with the mean; pass aggfunc to use a different function.

    pd.pivot_table(data=df, values='Income', index='Education', columns='Marital_Status')

  8. apply()
    df['Response'].apply(lambda x: 'Accepted' if x == 1 else 'Not Accepted')
  9. replace()
    df['Marital_Status'].replace(to_replace=['Alone', 'Divorced', 'Widow', 'YOLO', 'Absurd'], value='Single')
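
The calls in this section can be tried end to end on a small made-up DataFrame, as in the sketch below. The column names follow the examples in these notes, while the rows are invented for illustration. The pivot_table() call passes aggfunc="median" to show one way of replacing the default mean.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the notes.
df = pd.DataFrame({
    "Education": ["PhD", "Master", "PhD", "Graduation"],
    "Marital_Status": ["Single", "Alone", "Married", "YOLO"],
    "Income": [120000, 80000, 60000, 40000],
    "Response": [1, 0, 0, 1],
})

counts = df["Education"].value_counts()              # rows per category
n_levels = df["Education"].nunique()                 # number of distinct values
top = df.sort_values(by="Income", ascending=False)   # highest income first
high = df.query("Income > 100000")                   # same rows as df[df["Income"] > 100000]
mean_income = df.groupby("Education")["Income"].mean()

# aggfunc overrides the default mean aggregation
pivot = pd.pivot_table(data=df, values="Income", index="Education",
                       columns="Marital_Status", aggfunc="median")

# Map numeric codes to readable labels
labels = df["Response"].apply(lambda x: "Accepted" if x == 1 else "Not Accepted")

# Collapse several category names into one
status = df["Marital_Status"].replace(to_replace=["Alone", "YOLO"], value="Single")

print(pivot)
```

Combinations of index and columns that never occur in the data show up as NaN in the pivot table.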