7. Filter

7.1. Our next mission

Until November 2016, the use and sale of marijuana for recreational purposes was illegal in California. That changed when voters approved Proposition 64, which asked if the practice ought to be legalized.

A yes vote supported legalization. A no vote opposed it. In the final tally, 57% of voters said yes.

Our next mission is to use the DataFrames containing campaign committees and contributors to figure out the biggest donors both for and against the measure.

To do that, the first thing we need to do is isolate the fundraising committees active on Proposition 64, which are now buried among of the list of more than 100 groups active last November.

7.2. Filter a DataFrame

The most common way to filter a DataFrame is to pass an expression as an “index” that can be used to decide which records should be kept and which discarded.

You write the expression by combining a column on your DataFrame with an “operator” like == or > or < and a value to compare against each row.

Note

If you are familiar with writing SQL to manipulate databases, pandas’ filtering system is somewhat similar to a WHERE query. The official pandas documentation offers direct translations between the two.

In our case, the column we want to filter against is prop_name. We only want to keep those records where the value there matches the full name of Proposition 64.

Where do we get that? Our friend value_counts.

Running the value_counts method to spit out the full name of all 17 measures.

import pandas as pd
committee_list = pd.read_csv("https://raw.githubusercontent.com/california-civic-data-coalition/first-python-notebook/master/docs/src/_static/committees.csv")
committee_list.prop_name.value_counts()
PROPOSITION 057 - CRIMINAL SENTENCES. JUVENILE CRIMINAL PROCEEDINGS AND SENTENCING. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE.                           13
PROPOSITION 056 - CIGARETTE TAX TO FUND HEALTHCARE, TOBACCO USE PREVENTION, RESEARCH, AND LAW ENFORCEMENT. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE.    12
PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE.                                                                                                   11
PROPOSITION 066- DEATH PENALTY. PROCEDURES. INITIATIVE STATUTE.                                                                                                 9
PROPOSITION 055 - TAX EXTENSION TO FUND EDUCATION AND HEALTHCARE. INITIATIVE CONSTITUTIONAL AMENDMENT.                                                          8
PROPOSITION 067- REFERENDUM TO OVERTURN BAN ON SINGLE-USE PLASTIC BAGS.                                                                                         7
PROPOSITION 062- DEATH PENALTY. INITIATIVE STATUTE.                                                                                                             7
PROPOSITION 059- SB 254 (CHAPTER 20, STATUTES OF 2016), ALLEN. CAMPAIGN FINANCE: VOTER INSTRUCTION                                                              6
PROPOSITION 058 - SB 1174 (CHAPTER 753, STATUTES OF 2014), LARA. ENGLISH LANGUAGE EDUCATION                                                                     4
PROPOSITION 063- FIREARMS. AMMUNITION SALES. INTIATIVE STATUTE.                                                                                                 4
PROPOSITION 054 - LEGISLATURE. LEGISLATION AND PROCEEDINGS. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE.                                                    4
PROPOSITION 053 - REVENUE BONDS. STATEWIDE VOTER APPROVAL. INITIATIVE CONSTITUTIONAL AMENDMENT.                                                                 4
PROPOSITION 051 - SCHOOL BONDS. FUNDING FOR K-12 SCHOOL AND COMMUNITY COLLEGE FACILITIES. INITIATIVE STATUTORY AMENDMENT.                                       4
PROPOSITION 052 - STATE FEES ON HOSPITALS. FEDERAL MEDI-CAL MATCHING FUNDS. INITIATIVE STATUTORY AND CONSTITUTIONAL AMENDMENT.                                  3
PROPOSITION 061- STATE PRESCRIPTION DRUG PURCHASES. PRICING STANDARDS. INITIATIVE STATUTE.                                                                      3
PROPOSITION 060- ADULT FILMS. CONDOMS. HEALTH REQUIREMENTS. INITIATIVE STATUTE.                                                                                 2
PROPOSITION 065- CARRY-OUT BAGS. CHARGES. INITIATIVE STATUTE.                                                                                                   1
Name: prop_name, dtype: int64

From that result we can copy the full name of the proposition and place it between quotation marks in a variable in a new cell. This will allow us to reuse it later.

my_prop = 'PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE.'

In the next cell we will ask pandas to narrow down our list of committees to just those that match the proposition we’re interested in. We will create a filter expression and place it between two flat brackets following the DataFrame we wish to filter.

committee_list[committee_list.prop_name == my_prop]
ocd_prop_id calaccess_prop_id ccdc_prop_id prop_name ccdc_committee_id calaccess_committee_id committee_name committee_position
74 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 456 1381808 YES ON 64, CALIFORNIANS TO CONTROL, REGULATE A... SUPPORT
75 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 457 1382525 NEW APPROACH PAC (MPO) SUPPORT
76 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 458 1376077 CALIFORNIANS FOR SENSIBLE REFORM, SPONSORED BY... SUPPORT
77 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 459 1382568 PUBLIC AND MENTAL HEALTH ADVOCATES AGAINST 64,... OPPOSE
78 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 460 1371855 MARIJUANA POLICY PROJECT OF CALIFORNIA, YES ON 64 SUPPORT
79 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 461 1343793 CALIFORNIANS FOR RESPONSIBLE MARIJUANA REFORM,... SUPPORT
80 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 462 1386560 ADULT USE CAMPAIGN FOR PROPOSITION 64; THE SUPPORT
81 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 463 1387789 SAM ACTION, INC., A COMMITTEE AGAINST PROPOSIT... OPPOSE
82 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 464 1385506 DRUG POLICY ACTION - NON PROFIT 501C4, YES ON ... SUPPORT
83 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 465 1385745 FUND FOR POLICY REFORM (NONPROFIT 501(C)(4)) SUPPORT
84 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 466 1390962 MOMS STRONG AGAINST PROP 64 OPPOSE

Run it and it outputs the filtered dataset, just those committees active on Proposition 64.

7.3. Inspect the results

Now we should save the results of that filter into a new variable separate from the full list we imported from the CSV file.

Since it includes only the committees for the proposition we’re interested in let’s call it my_committees.

my_committees = committee_list[committee_list.prop_name == my_prop]

To check our work and find out how many committees are left after the filter, let’s run the DataFrame inspection commands we learned earlier.

First head.

my_committees.head()
ocd_prop_id calaccess_prop_id ccdc_prop_id prop_name ccdc_committee_id calaccess_committee_id committee_name committee_position
74 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 456 1381808 YES ON 64, CALIFORNIANS TO CONTROL, REGULATE A... SUPPORT
75 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 457 1382525 NEW APPROACH PAC (MPO) SUPPORT
76 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 458 1376077 CALIFORNIANS FOR SENSIBLE REFORM, SPONSORED BY... SUPPORT
77 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 459 1382568 PUBLIC AND MENTAL HEALTH ADVOCATES AGAINST 64,... OPPOSE
78 ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e 1381868 83 PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... 460 1371855 MARIJUANA POLICY PROJECT OF CALIFORNIA, YES ON 64 SUPPORT

Then info.

my_committees.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 11 entries, 74 to 84
Data columns (total 8 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   ocd_prop_id             11 non-null     object
 1   calaccess_prop_id       11 non-null     int64 
 2   ccdc_prop_id            11 non-null     int64 
 3   prop_name               11 non-null     object
 4   ccdc_committee_id       11 non-null     int64 
 5   calaccess_committee_id  11 non-null     int64 
 6   committee_name          11 non-null     object
 7   committee_position      11 non-null     object
dtypes: int64(4), object(4)
memory usage: 792.0+ bytes