7. Filters¶
Until November 2016, the use and sale of marijuana for recreational purposes was illegal in California. That changed when voters approved Proposition 64, which asked if the practice ought to be legalized.
A yes vote supported legalization. A no vote opposed it. In the final tally, 57% of voters said yes.
Our next mission is to use the DataFrames containing campaign committees and contributors to figure out the biggest donors both for and against the measure.
To do that, the first thing we need to do is isolate the fundraising committees active on Proposition 64, which are now buried among of the list of more than 100 groups active last November.
7.1. Filtering a DataFrame¶
The most common way to filter a DataFrame is to pass an expression as an “index” that can be used to decide which records should be kept and which discarded.
You write the expression by combining a column on your DataFrame with an “operator” like ==
or >
or <
and a value to compare against each row.
Note
If you are familiar with writing SQL to manipulate databases, pandas’ filtering system is somewhat similar to a WHERE
query. The official pandas documentation offers direct translations between the two.
In our case, the column we want to filter against is prop_name
. We only want to keep those records where the value there matches the full name of Proposition 64.
Where do we get that? Our friend value_counts
.
Running the value_counts
method to spit out the full name of all 17 measures.
import pandas as pd
committee_list = pd.read_csv("https://raw.githubusercontent.com/california-civic-data-coalition/first-python-notebook/master/docs/src/_static/committees.csv")
committee_list.prop_name.value_counts()
PROPOSITION 057 - CRIMINAL SENTENCES. JUVENILE CRIMINAL PROCEEDINGS AND SENTENCING. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE. 13
PROPOSITION 056 - CIGARETTE TAX TO FUND HEALTHCARE, TOBACCO USE PREVENTION, RESEARCH, AND LAW ENFORCEMENT. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE. 12
PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE. 11
PROPOSITION 066- DEATH PENALTY. PROCEDURES. INITIATIVE STATUTE. 9
PROPOSITION 055 - TAX EXTENSION TO FUND EDUCATION AND HEALTHCARE. INITIATIVE CONSTITUTIONAL AMENDMENT. 8
PROPOSITION 067- REFERENDUM TO OVERTURN BAN ON SINGLE-USE PLASTIC BAGS. 7
PROPOSITION 062- DEATH PENALTY. INITIATIVE STATUTE. 7
PROPOSITION 059- SB 254 (CHAPTER 20, STATUTES OF 2016), ALLEN. CAMPAIGN FINANCE: VOTER INSTRUCTION 6
PROPOSITION 058 - SB 1174 (CHAPTER 753, STATUTES OF 2014), LARA. ENGLISH LANGUAGE EDUCATION 4
PROPOSITION 063- FIREARMS. AMMUNITION SALES. INTIATIVE STATUTE. 4
PROPOSITION 054 - LEGISLATURE. LEGISLATION AND PROCEEDINGS. INITIATIVE CONSTITUTIONAL AMENDMENT AND STATUTE. 4
PROPOSITION 053 - REVENUE BONDS. STATEWIDE VOTER APPROVAL. INITIATIVE CONSTITUTIONAL AMENDMENT. 4
PROPOSITION 051 - SCHOOL BONDS. FUNDING FOR K-12 SCHOOL AND COMMUNITY COLLEGE FACILITIES. INITIATIVE STATUTORY AMENDMENT. 4
PROPOSITION 052 - STATE FEES ON HOSPITALS. FEDERAL MEDI-CAL MATCHING FUNDS. INITIATIVE STATUTORY AND CONSTITUTIONAL AMENDMENT. 3
PROPOSITION 061- STATE PRESCRIPTION DRUG PURCHASES. PRICING STANDARDS. INITIATIVE STATUTE. 3
PROPOSITION 060- ADULT FILMS. CONDOMS. HEALTH REQUIREMENTS. INITIATIVE STATUTE. 2
PROPOSITION 065- CARRY-OUT BAGS. CHARGES. INITIATIVE STATUTE. 1
Name: prop_name, dtype: int64
From that result we can copy the full name of the proposition and place it between quotation marks in a variable in a new cell. This will allow us to reuse it later.
my_prop = 'PROPOSITION 064- MARIJUANA LEGALIZATION. INITIATIVE STATUTE.'
In the next cell we will ask pandas to narrow down our list of committees to just those that match the proposition we’re interested in. We will create a filter expression that looks like this: committee_list.prop_name == my_prop
, and place it between two flat brackets following the DataFrame we wish to filter.
committee_list[committee_list.prop_name == my_prop]
ocd_prop_id | calaccess_prop_id | ccdc_prop_id | prop_name | ccdc_committee_id | calaccess_committee_id | committee_name | committee_position | |
---|---|---|---|---|---|---|---|---|
74 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 456 | 1381808 | YES ON 64, CALIFORNIANS TO CONTROL, REGULATE A... | SUPPORT |
75 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 457 | 1382525 | NEW APPROACH PAC (MPO) | SUPPORT |
76 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 458 | 1376077 | CALIFORNIANS FOR SENSIBLE REFORM, SPONSORED BY... | SUPPORT |
77 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 459 | 1382568 | PUBLIC AND MENTAL HEALTH ADVOCATES AGAINST 64,... | OPPOSE |
78 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 460 | 1371855 | MARIJUANA POLICY PROJECT OF CALIFORNIA, YES ON 64 | SUPPORT |
79 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 461 | 1343793 | CALIFORNIANS FOR RESPONSIBLE MARIJUANA REFORM,... | SUPPORT |
80 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 462 | 1386560 | ADULT USE CAMPAIGN FOR PROPOSITION 64; THE | SUPPORT |
81 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 463 | 1387789 | SAM ACTION, INC., A COMMITTEE AGAINST PROPOSIT... | OPPOSE |
82 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 464 | 1385506 | DRUG POLICY ACTION - NON PROFIT 501C4, YES ON ... | SUPPORT |
83 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 465 | 1385745 | FUND FOR POLICY REFORM (NONPROFIT 501(C)(4)) | SUPPORT |
84 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 466 | 1390962 | MOMS STRONG AGAINST PROP 64 | OPPOSE |
Run it and it outputs the filtered dataset, just those committees active on Proposition 64.
Now we should save the results of that filter into a new variable separate from the full list we imported from the CSV file.
Since it includes only the committees for one proposition let’s call it the singular prop.
my_committees = committee_list[committee_list.prop_name == my_prop]
To check our work and find out how many committees are left after the filter, let’s run the DataFrame inspection commands we learned earlier.
First head
.
my_committees.head()
ocd_prop_id | calaccess_prop_id | ccdc_prop_id | prop_name | ccdc_committee_id | calaccess_committee_id | committee_name | committee_position | |
---|---|---|---|---|---|---|---|---|
74 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 456 | 1381808 | YES ON 64, CALIFORNIANS TO CONTROL, REGULATE A... | SUPPORT |
75 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 457 | 1382525 | NEW APPROACH PAC (MPO) | SUPPORT |
76 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 458 | 1376077 | CALIFORNIANS FOR SENSIBLE REFORM, SPONSORED BY... | SUPPORT |
77 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 459 | 1382568 | PUBLIC AND MENTAL HEALTH ADVOCATES AGAINST 64,... | OPPOSE |
78 | ocd-contest/232cb72c-1b11-4293-b4e2-a181e80d172e | 1381868 | 83 | PROPOSITION 064- MARIJUANA LEGALIZATION. INITI... | 460 | 1371855 | MARIJUANA POLICY PROJECT OF CALIFORNIA, YES ON 64 | SUPPORT |
Then info
.
my_committees.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 11 entries, 74 to 84
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ocd_prop_id 11 non-null object
1 calaccess_prop_id 11 non-null int64
2 ccdc_prop_id 11 non-null int64
3 prop_name 11 non-null object
4 ccdc_committee_id 11 non-null int64
5 calaccess_committee_id 11 non-null int64
6 committee_name 11 non-null object
7 committee_position 11 non-null object
dtypes: int64(4), object(4)
memory usage: 792.0+ bytes