Jul 17, 2020
Functions in this article
- read_sql()
- head()
- columns
- apply()
- describe()
- idxmax()
- at
- dropna()
- replace()
- astype()
Analyzing data with Pandas
Pandas offers so much functionality that it helps to gather the most useful functions in one place.
- Read SQL
df = pandas.read_sql("SQL", connection)
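To try read_sql without a real database, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the `sales` table, and its values are made up for illustration; any DBAPI or SQLAlchemy connection your setup supports should work in place of `connection`.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database so the example runs without external setup.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (col1 INTEGER, col2 REAL)")
connection.executemany(
    "INSERT INTO sales (col1, col2) VALUES (?, ?)",
    [(3, 10.5), (7, 20.0), (3, 5.25)],
)
connection.commit()

# read_sql runs the query and returns the result set as a DataFrame.
df = pd.read_sql("SELECT col1, col2 FROM sales", connection)
connection.close()
```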
- Preview the dataframe
df.head()
- Get the column names
df.columns
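A quick look at a dataframe usually starts with head() and columns, both listed at the top of this article. The sketch below uses made-up sample data to show what each one returns.

```python
import pandas as pd

# Hypothetical sample data; any DataFrame works the same way.
df = pd.DataFrame({"col1": range(10), "col2": list("abcdefghij")})

first_rows = df.head()           # first 5 rows by default
first_two = df.head(2)           # pass n to change the number of rows
column_names = list(df.columns)  # df.columns is an Index; list() gives plain strings
```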
- Apply a function per row
Define the function to apply:
def my_func(x):  # x is a Series with the row's elements (col1, col2)
    if x["col1"] == 3:
        return 4000
    return None  # rows that do not match get None
Apply the function to df:
df["result"] = df[["col1", "col2"]].apply(my_func, axis=1)
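Here is a runnable version of the row-wise apply on made-up data. The default of 0 for non-matching rows and the `result_fast` column are assumptions added for illustration. For a simple condition like this one, a boolean mask with loc gives the same result without calling a Python function on every row.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"col1": [3, 1, 3], "col2": [10, 20, 30]})


def my_func(row):
    # With axis=1, row is a Series; access values by column label.
    if row["col1"] == 3:
        return 4000
    return 0  # assumed default for rows that do not match


df["result"] = df[["col1", "col2"]].apply(my_func, axis=1)

# Vectorized alternative: set a default, then overwrite the matching rows.
df["result_fast"] = 0
df.loc[df["col1"] == 3, "result_fast"] = 4000
```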
- Get an overview of a column: for numeric data, describe() shows the count, mean, std, min, and max, plus the quartiles
df["Column name"].describe()
- Get the index label of the max value
df["Column name"].idxmax()
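Note that idxmax() returns the index label of the largest value, not the value itself; max() returns the value. A small sketch on made-up data with string index labels makes the difference visible.

```python
import pandas as pd

# Hypothetical sample data with string index labels.
df = pd.DataFrame({"Column name": [10, 50, 30]}, index=["a", "b", "c"])

summary = df["Column name"].describe()  # Series with count, mean, std, min, quartiles, max
max_label = df["Column name"].idxmax()  # index label of the largest value: "b"
max_value = df["Column name"].max()     # the largest value itself: 50
```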
- Get a specific item
df.at[82, "Column name"]
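The at accessor reads or writes a single cell by row label and column label. The sketch below uses made-up data and also shows iat, its position-based counterpart.

```python
import pandas as pd

# Hypothetical sample data; the index labels are 81 and 82.
df = pd.DataFrame({"Column name": [1.5, 2.5]}, index=[81, 82])

value = df.at[82, "Column name"]  # read one cell by row label and column label
df.at[82, "Column name"] = 9.0    # at can also assign a single cell
by_position = df.iat[1, 0]        # iat addresses the same cell by position
```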
- Drop NA values
df.dropna(subset=["col1", "col2"], axis=0)
axis=0 => drop rows
axis=1 => drop columns
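A sketch of both axis settings on made-up data. dropna returns a new dataframe and leaves the original untouched unless you assign the result back.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values in three of the four columns.
df = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0, 4.0],
    "col2": [4.0, 5.0, np.nan, 7.0],
    "col3": [np.nan, 8.0, 9.0, 10.0],
    "col4": [1, 2, 3, 4],
})

# Drop rows with a missing value in col1 or col2; the NaN in col3 is ignored.
rows_dropped = df.dropna(subset=["col1", "col2"], axis=0)

# Drop every column that contains at least one missing value.
cols_dropped = df.dropna(axis=1)
```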
- Replace
df["col1"].replace(old_value, new_value)
df["col1"].replace(np.nan, 1)  # np is NumPy: import numpy as np
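replace() also returns a new Series, so the result has to be assigned back to keep the change. A minimal sketch on made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame({"col1": [1.0, np.nan, 2.0]})

# Assign the result back; replace does not modify the column in place.
df["col1"] = df["col1"].replace(np.nan, 1)
df["col1"] = df["col1"].replace(2.0, 20.0)
```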
- Rename columns
df.rename(columns={"col_name": "new_name"}, inplace=True)
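If you would rather not modify the dataframe in place, rename() without inplace=True returns a renamed copy. The sketch below uses made-up column names.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"col_name": [1, 2], "other": [3, 4]})

# Without inplace=True, rename returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns={"col_name": "new_name"})
```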
- Cast Type
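The original example for this step is not shown, so the following is only an assumption about what it covered: in pandas, casting a column to another type is usually done with astype(). The data and target types here are made up.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings and as floats.
df = pd.DataFrame({"col1": ["1", "2", "3"], "col2": [1.0, 2.0, 3.0]})

df["col1"] = df["col1"].astype(int)  # cast one column: strings to integers
df = df.astype({"col2": "int64"})    # a dict casts selected columns at once
```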