Pandas

Geekscoach
1 min read · Jul 17, 2020

Functions in this article

  • read_sql()
  • head()
  • columns
  • apply()
  • describe()
  • idxmax()
  • at
  • dropna()
  • replace()
  • rename()
  • astype()

Analyzing data with Pandas

Pandas has so much functionality that it helps to gather the most useful functions in one place.

  • Read SQL

import pandas
df = pandas.read_sql("SQL", connection)  # "SQL" is your query string
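To try read_sql() without a database server, here is a minimal sketch using an in-memory SQLite database. The table name `scores` and its rows are made up for illustration; in practice the connection would point at your own database.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real connection (assumption).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE scores (name TEXT, points INTEGER)")
connection.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ana", 3), ("bob", 7)],
)
connection.commit()

# read_sql runs the query and returns the result set as a DataFrame.
df = pd.read_sql("SELECT name, points FROM scores", connection)
connection.close()

print(df)
```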

  • Preview the first rows of the dataframe

df.head()

  • Get the column names

df.columns
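Both head() and columns are easy to check on a small frame. The sketch below uses a made-up dataframe with two hypothetical columns.

```python
import pandas as pd

# Hypothetical ten-row dataframe, just for illustration.
df = pd.DataFrame({"col1": range(10), "col2": list("abcdefghij")})

first_rows = df.head()            # first 5 rows by default
first_three = df.head(3)          # or pass how many rows you want
column_names = list(df.columns)   # df.columns is an Index; list() gives plain names

print(first_three)
print(column_names)
```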

  • Apply a function per row

Define the function to apply:

def my_func(x):  # x is a Series holding one row (col1, col2)
    if x["col1"] == 3:
        return 4000
    return None  # every other row gets no value

Apply the function to df:

df["result"] = df[["col1", "col2"]].apply(my_func, axis=1)
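Here is a runnable sketch of the row-wise apply. The sample data and the fallback value of 0 are assumptions added for illustration; they are not part of the original snippet.

```python
import pandas as pd


def my_func(row):
    # With axis=1, each call receives one row as a Series,
    # so values are read by column name.
    if row["col1"] == 3:
        return 4000
    return 0  # hypothetical fallback for all other rows


# Made-up sample data.
df = pd.DataFrame({"col1": [1, 3, 5], "col2": [10, 20, 30]})
df["result"] = df[["col1", "col2"]].apply(my_func, axis=1)

print(df)
```

For a simple condition like this one, a vectorized expression is usually faster than apply(), but apply() is handy when the logic per row gets more involved.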

  • Get a statistical overview of a column: count, mean, std, min, quartiles and max

df["Column name"].describe()
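describe() returns a Series indexed by statistic name, so single values can be picked out of it. A small sketch with a made-up `price` column:

```python
import pandas as pd

# Hypothetical numeric column.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0, 40.0]})

summary = df["price"].describe()

# Individual statistics are available by label.
count = summary["count"]
mean = summary["mean"]
median = summary["50%"]

print(summary)
```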

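  • Get the index of the max value

df["Column name"].idxmax()

idxmax() returns the index label of the row holding the maximum; use df["Column name"].max() for the value itself.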
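The difference between max() and idxmax() is easiest to see side by side. The `score` column and the letter index below are made up for the example.

```python
import pandas as pd

# Hypothetical data with a labelled index.
df = pd.DataFrame({"score": [12, 45, 30]}, index=["a", "b", "c"])

largest = df["score"].max()    # the maximum value itself
label = df["score"].idxmax()   # the index label where that maximum sits
row = df.loc[label]            # the full row for that label

print(largest, label)
```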

  • Get a specific item

df.at[82, "Column name"]
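at takes a row label and a column name and works for both reading and writing a single cell. A minimal sketch, assuming a made-up frame whose index contains the label 82:

```python
import pandas as pd

# Hypothetical frame; 81 and 82 are index labels, not positions.
df = pd.DataFrame({"city": ["Lima", "Oslo"], "temp": [19, 4]}, index=[81, 82])

value = df.at[82, "city"]   # read one cell
df.at[82, "temp"] = 5       # write one cell

print(value)
print(df)
```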

  • Drop NA values

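df.dropna(subset=["col1", "col2"], axis=0)

axis=0 => drop rows that contain NA

axis=1 => drop columns that contain NA

dropna() returns a new dataframe, so assign the result to keep it.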
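A small sketch of how subset limits which columns are checked for NA. The three columns and their values are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values spread over three columns.
df = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0],
    "col2": [4.0, 5.0, np.nan],
    "col3": [np.nan, 8.0, 9.0],
})

# Only col1 and col2 are inspected; the NA in col3 of the first row is ignored.
cleaned = df.dropna(subset=["col1", "col2"], axis=0)

print(cleaned)
```

The original df is left untouched, which is why the result is stored in a new variable.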

  • Replace

df["col1"].replace(old_value, new_value)

df["col1"].replace(np.nan, 1)  # requires: import numpy as np
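Like dropna(), replace() returns a new object rather than changing the column in place. A sketch with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({"col1": [1.0, np.nan, 3.0]})

swapped = df["col1"].replace(3.0, 30.0)   # swap one value for another
filled = df["col1"].replace(np.nan, 1)    # turn missing values into 1

print(filled.tolist())
```

For the missing-value case, df["col1"].fillna(1) does the same job and is the more common spelling.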

  • Rename columns

df.rename(columns={"col_name": "new_name"}, inplace=True)
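If you would rather not modify the dataframe in place, rename() without inplace=True returns a renamed copy. A minimal sketch with hypothetical column names:

```python
import pandas as pd

# Hypothetical frame with one column to rename.
df = pd.DataFrame({"col_name": [1, 2], "other": [3, 4]})

# Columns not listed in the mapping keep their names.
renamed = df.rename(columns={"col_name": "new_name"})

print(list(renamed.columns))
```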

  • Cast Type

df["col"] = df["col"].astype("int")
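A typical use of astype() is turning numbers that were loaded as text into integers. The sketch below assumes a made-up column of numeric strings; values that cannot be parsed as integers would raise an error.

```python
import pandas as pd

# Hypothetical column where the numbers arrived as strings.
df = pd.DataFrame({"col": ["1", "2", "3"]})

df["col"] = df["col"].astype("int")

print(df["col"].dtype)
```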
