.. currentmodule:: pandas
.. ipython:: python :suppress: import pandas as pd import random pd.options.display.max_rows=15
Create a series of numbers following a set pattern in a certain set of cells. In Excel this would be done by shift+drag after entering the first number or by entering the first two or three values and then dragging.
This can be achieved by creating a series and assigning it to the desired cells.
.. ipython:: python df = pd.DataFrame({'AAA': [1] * 8, 'BBB': list(range(0, 8))}); df series = list(range(1, 5)); series df.iloc[2:(5+1)].AAA = series df
Filters can be achieved by using slicing.
The examples filter by 0 on column AAA, and also show how to filter by multiple values.
.. ipython:: python df[df.AAA == 0] df[(df.AAA == 0) | (df.AAA == 2)]
Another commonly used function is Drop Duplicates. This is directly supported in pandas.
.. ipython:: python df = pd.DataFrame({"class": ['A', 'A', 'A', 'B', 'C', 'D'], "student_count": [42, 35, 42, 50, 47, 45], "all_pass": ["Yes", "Yes", "Yes", "No", "No", "Yes"]}) df.drop_duplicates() df.drop_duplicates(["class", "student_count"])
This can be achieved by using pandas.pivot_table
for examples and reference,
please see pandas.pivot_table
Let's create a new column "girls_count" and try to compute the number of boys in each class.
.. ipython:: python df["girls_count"] = [21, 12, 21, 31, 23, 17]; df def get_count(row): return row["student_count"] - row["girls_count"] df["boys_count"] = df.apply(get_count, axis = 1); df
.. ipython:: python df1 = pd.DataFrame({"keys": [1, 2, 3, 4, 5, 6, 7], "first_names": ["harry", "ron", "hermione", "rubius", "albus", "severus", "luna"]}); df1 random_names = pd.DataFrame({"surnames": ["hadrid", "malfoy", "lovegood", "dumbledore", "grindelwald", "granger", "weasly", "riddle", "longbottom", "snape"], "keys": [ random.randint(1,7) for x in range(0,10) ]}) random_names random_names.merge(df1, on="keys", how='left')
To appended a row, we can just assign values to an index using iloc
.
NOTE: If the index already exists, the values in that index will be over written.
.. ipython:: python df1.iloc[7] = [8, "tonks"]; df1
The replace
method that comes associated with the DataFrame
object can perform
this function. Please see pandas.DataFrame.replace for examples.