Pandas

Geekscoach

1 min read · Jul 17, 2020

Functions in this article

read_sql()
head()
columns
apply()
describe()
idxmax()
at
dropna()
replace()
rename()
astype()

Analyzing data with Pandas

Pandas offers so much functionality that it helps to gather the most useful functions in one place.

Read SQL

import pandas

# "SQL" stands for the query string; connection is an open database connection
df = pandas.read_sql("SQL", connection)
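
To try read_sql() without a real database, here is a minimal sketch that uses an in-memory SQLite database from the standard library. The sales table, its columns, and the query are hypothetical and exist only for this illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database and table, created only for this example
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
connection.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.5), (2, 20.0), (3, 7.25)],
)
connection.commit()

# read_sql runs the query and returns the result set as a DataFrame
df = pd.read_sql("SELECT id, amount FROM sales", connection)
connection.close()

print(df)
```

The DataFrame columns take their names from the columns selected in the query.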

Preview the first rows of the DataFrame

df.head()

Get the column names

df.columns
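
As a small sketch of both calls on a made-up DataFrame (the column names and values are assumptions for illustration): head() returns the first 5 rows unless a count is passed, and columns can be converted to a plain list.

```python
import pandas as pd

# Made-up data: 10 rows, two columns
df = pd.DataFrame({"col1": range(10), "col2": list("abcdefghij")})

first_rows = df.head()        # first 5 rows by default
first_two = df.head(2)        # first 2 rows
names = df.columns.tolist()   # column labels as a Python list

print(first_two)
print(names)
```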

Apply a function per row

Define the function to apply

def my_func(x):  # x is a Series holding the row's values (col1, col2)
    if x.iloc[0] == 3:  # first selected column, i.e. col1
        return 4000
    return None  # all other rows get no value

Apply the function to the DataFrame

df["result"] = df[["col1", "col2"]].apply(my_func, axis=1)
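
Putting the two steps together, a runnable sketch might look like this. The sample data and the fallback branch (returning col2 when col1 is not 3) are assumptions added for illustration, not part of the original function.

```python
import pandas as pd


def my_func(x):
    # x is a Series with one row's values: x.iloc[0] is col1, x.iloc[1] is col2
    if x.iloc[0] == 3:
        return 4000
    # Assumed fallback for this example: keep the col2 value
    return x.iloc[1]


df = pd.DataFrame({"col1": [1, 3, 5], "col2": [10, 20, 30]})

# axis=1 passes each row to my_func; axis=0 would pass each column instead
df["result"] = df[["col1", "col2"]].apply(my_func, axis=1)

print(df)
```

Row-wise apply() calls the Python function once per row, so for large DataFrames a vectorized expression is usually faster when one is available.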

Get an overview of a column: describe() reports the count, mean, std, min, quartiles and max

df["Column name"].describe()
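
A short sketch on a made-up numeric column (the name price and its values are assumptions) shows that describe() returns a Series whose entries can be read by label.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0, 40.0]})

# describe() returns a Series indexed by statistic name
summary = df["price"].describe()

print(summary)
print(summary["mean"])  # a single statistic can be picked out by its label
```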

Get the index label of the max value (use max() for the value itself)

df["Column name"].idxmax()
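
The difference between idxmax() and max() is easiest to see on a tiny example. The score column and the letter index below are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"score": [12, 47, 31]}, index=["a", "b", "c"])

label = df["score"].idxmax()  # index label of the row holding the maximum
value = df["score"].max()     # the maximum value itself
row = df.loc[label]           # the full row where the maximum occurs

print(label, value)
```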

Get a specific item

df.at[82, "Column name"]  # 82 is the row's index label
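
As a sketch, at can both read and write a single cell, addressed by row label and column name. The city and temp columns and the index labels are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Lima", "Oslo"], "temp": [19, 4]},
    index=[81, 82],  # row labels, not positions
)

value = df.at[82, "city"]  # read one cell
df.at[82, "temp"] = 5      # write one cell in place

print(value)
print(df)
```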

Drop NA values

df.dropna(subset=["col1", "col2"], axis=0)

axis=0 => drop rows (the default)

axis=1 => drop columns

dropna() returns a new DataFrame, so assign the result to keep it.
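
The effect of subset and axis can be checked on a small made-up DataFrame. The four columns and their values below are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "col1": [1.0, np.nan, 3.0],
        "col2": [np.nan, 5.0, 6.0],
        "col3": [7.0, 8.0, np.nan],
        "col4": [1, 2, 3],
    }
)

# axis=0: drop rows that have a missing value in col1 or col2
rows_kept = df.dropna(subset=["col1", "col2"], axis=0)

# axis=1: drop every column that contains at least one missing value
cols_kept = df.dropna(axis=1)

print(rows_kept)
print(cols_kept)
```

The last row survives the first call even though col3 is missing there, because col3 is not listed in subset.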

Replace

df["col1"].replace(old_value, new_value)

df["col1"].replace(np.nan, 1)  # needs: import numpy as np
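
One detail worth illustrating is that replace() returns a new Series and leaves the original untouched unless the result is assigned back. The values below are made up for the sketch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [1.0, np.nan, 3.0]})

# Returns a new Series; df["col1"] still contains the NaN afterwards
replaced = df["col1"].replace(np.nan, 1)

# Assign the result back to change the DataFrame itself
df["col1"] = df["col1"].replace(3.0, 30.0)

print(replaced)
print(df)
```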

Rename columns

df.rename(columns={"col_name": "new_name"}, inplace=True)
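
As an alternative sketch, rename() without inplace=True returns a renamed copy and leaves the original DataFrame as it was. The column other is a made-up addition for the example.

```python
import pandas as pd

df = pd.DataFrame({"col_name": [1, 2], "other": [3, 4]})

# Without inplace=True a new DataFrame is returned
renamed = df.rename(columns={"col_name": "new_name"})

print(list(renamed.columns))
print(list(df.columns))
```

Columns that are not mentioned in the mapping keep their names.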

Cast Type

df["col"] = df["col"].astype("int")
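
A minimal sketch of the cast on made-up data follows. It also shows one caveat: a column containing NaN cannot be cast to a plain integer type, and the nullable "Int64" dtype is one possible way around that.

```python
import numpy as np
import pandas as pd

# Strings that hold whole numbers can be cast directly
df = pd.DataFrame({"col": ["1", "2", "3"]})
df["col"] = df["col"].astype("int")

# A column with a missing value cannot become a plain integer column
with_gap = pd.Series([1.0, np.nan])
try:
    with_gap.astype("int")
    cast_failed = False
except ValueError:
    cast_failed = True

# The nullable integer dtype keeps the gap as a missing value
nullable = with_gap.astype("Int64")

print(df["col"].dtype, cast_failed, nullable.dtype)
```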

Written by Geekscoach
