okidk
About Menu Contact

Pandas Cheat Sheet


Dataframe Manipulation

How to create a new dataframe using a mask to filter by date range:

mask = (df['date'] >= start_date) & (df['date'] <= end_date)
df2 = df.loc[mask]

How to filter a dataframe by column value

df2 = df[df['column_name']== 'filter_value']

How to remove column from dataframe

df = df.drop(['column_name'], axis=1)

How to reorder dataframe columns by moving last column to first

cols = df.columns.tolist()
cols = cols[-1:] + cols[:-1]
df = df[cols]

How to rename dataframe column

df = df.rename(columns={'old_name': 'new_name'})

How to merge two dataframes (equivalent to left join)

df = df.merge(df2,on='shared_column_name')

How to group dataframe column by sum of values and reset index

df = df.groupby(["column_a", "column_b"]).sum().reset_index()

How to sort dataframe by column values, high to low

df.sort_values(by=['col1'],ascending=False)

How to convert dataframe column values to list

new_list = df['column'].tolist()

How to convert list of integers to a string

string_name = ','.join([str(elem) for elem in list_name])

How to trim whitespace from column values

df['column'] = df['column'].str.strip()

How to convert column values to lowercase

df['column'] = df['column'].str.lower()

Printf total rows in dataframe

print(len(df.index))

How to save dataframe to csv

df.to_csv('file.csv')

How to convert csv to dataframe

df = pd.read_csv('file.csv')


Dataframe Time Manipulation

How to create first of month column from date column

df['month'] = df['date'].values.astype('datetime64[M]')

How to convert dataframe series column to date format

df['date'] = pd.to_datetime(df['date'], infer_datetime_format=True).dt.date

How to convert datetime object to date

date = date.date()

Basic datetime variable definitions

import datetime
from datetime import date

today = date.today()
yesterday = today - datetime.timedelta(1)
end = str(yesterday)

calculate days between two dates

start_date = datetime.strptime(start_date, '%Y-%m-%d')
end_date = datetime.strptime(end_date, '%Y-%m-%d')
days = (end_date - start_date).days

How to group data by date period

df = df.groupby(pd.Grouper(key="date", freq="1M")).sum().reset_index()


Dataframe Value Manipulation

How to replace strings in column

df['column'] = df['column'].str.replace('$', '')

How to replace blank or NaN values with 0

column_name.fillna(0,inplace=True)

How to change column data type to datetime obj

df['column'] = df['column'].astype('datetime64[ns]')

How to change dataframe values in a specific column that are found within a list

df.loc[(df["column"].isin(["a", "b", "c", "d"]), "column")] = "new_value"

How to correct mixed types of dollars and convert to float

df['column'] = df['column'].replace({'\$ -': '','\$': '', ',': ''}, regex=True).astype(float)


Lambda functions

How to generate column values based on different column values

df['column_b'] = df['column_a'].apply(lambda x: 1 if x > 0 else 0)

How to generate column values with division and avoid dividing by 0

def div(x, y):
if y > 0:
return round(x / y, 2)
else:
return 1

df['percentage'] = df.apply(lambda x: div(x['column_a'], x['column_b']), axis=1)

Error Handling

try:
# do stuff here

except Exception as e:
print(repr(e))
else:
print('Success.')

How to Create an Excel File

file = 'file_name.xlsx'
writer = pd.ExcelWriter(file, engine='xlsxwriter')

# Write each dataframe to a different worksheet.
df1.to_excel(writer, sheet_name=''df1')
df2.to_excel(writer, sheet_name='df2')
df3.to_excel(writer, sheet_name='df3')

# Close the Pandas Excel writer and output the Excel file.
writer.save()

How to Read Data from a Google Sheet with Pandas

import gspread

gc = gspread.service_account(filename='credentials.json')

try:
df = gc.open("google_sheet_name")
df1 = df.worksheet('sheet_name')
df1 = pd.DataFrame(sheet_name.get_all_records())

How to Insert Variables Into a String

name = "brian"
color = "blue"
string = "My name is {0}, my favorite color is {1}.".format(name, color)

How to Clean Currency Data with Lambda Function

def clean_currency(x):
if isinstance(x, str) == True:
x = x.replace('$ -', '').replace(',', '').replace('$', '')
if not x:
x = 0
else:
x = float(x)
return(x)

df['column_name'] = df['column_name'].apply(clean_currency)

How to Acess FB Ads API with Python

from facebook_business.api import FacebookAdsApi
from facebook_business.adobjects.adaccount import AdAccount

token = "your_token_here"
FacebookAdsApi.init(access_token=token)
ad_account = AdAccount('act_00000000000')