13. Export

Saving the dataframes you’ve created to your computer requires one final pandas method. It’s to_csv, an exporting companion to read_csv. Append it to any dataframe and provide a filepath. That’s all it takes.

import pandas as pd
# Load the accident records and standardize the model names
accident_list = pd.read_csv("https://raw.githubusercontent.com/palewire/first-python-notebook/main/docs/src/_static/ntsb-accidents.csv")
accident_list['latimes_make_and_model'] = accident_list['latimes_make_and_model'].str.upper()
# Count the accidents for each make and model
accident_counts = accident_list.groupby(["latimes_make", "latimes_make_and_model"]).size().reset_index().rename(columns={0: "accidents"})
# Load the survey of flight hours and standardize its model names the same way
survey = pd.read_csv("https://raw.githubusercontent.com/palewire/first-python-notebook/main/docs/src/_static/faa-survey.csv")
survey['latimes_make_and_model'] = survey['latimes_make_and_model'].str.upper()
# Join the two tables on the shared model column and compute the accident rates
merged_list = pd.merge(accident_counts, survey, on="latimes_make_and_model")
merged_list['per_hour'] = merged_list.accidents / merged_list.total_hours
merged_list['per_100k_hours'] = (merged_list.accidents / merged_list.total_hours) * 100_000
# Write the result to a CSV file in the current working directory
merged_list.to_csv("accident-rate-ranking.csv")

The file it creates can be imported into other programs for reuse, including the data visualization tools many newsrooms rely on to publish graphics. For instance, the file we’ve exported above could be used to quickly draft a chart with Datawrapper.

Note

Interested in learning more about how to publish data online? Check out “First Visual Story,” a tutorial that will show you how journalists at America’s top news organizations escape rigid content-management systems to publish custom interactive graphics on deadline.

The to_csv method takes the filepath as its first argument, which sets the path and name of the file that will be created. It also accepts several optional keyword arguments. Passing index=False tells pandas to leave out the dataframe’s index, which is otherwise written as the first column. The sep parameter sets the character that separates values, which is a comma by default.

merged_list.to_csv("accident-rate-ranking.csv", index=False, sep=";")

This will create a CSV file that omits the index and uses semicolons, rather than commas, as the separator between values.
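
To see what these two options change, here is a minimal sketch. It uses a small made-up dataframe called sample rather than the real merged_list, and its values are placeholders, not figures from the tutorial’s data. It leans on one detail of to_csv: when no filepath is given, the method returns the CSV text instead of writing a file, which makes the output easy to inspect.

```python
import io

import pandas as pd

# Hypothetical stand-in data, not the tutorial's real numbers
sample = pd.DataFrame(
    {
        "latimes_make_and_model": ["MAKE A", "MAKE B"],
        "accidents": [3, 5],
    }
)

# With no filepath, to_csv returns the CSV text instead of writing a file
with_index = sample.to_csv()
without_index = sample.to_csv(index=False, sep=";")

# Compare the header rows: the default output leads with an unnamed index column
print(with_index.splitlines()[0])
print(without_index.splitlines()[0])

# Reading the semicolon-separated text back in requires the same separator
round_trip = pd.read_csv(io.StringIO(without_index), sep=";")
print(round_trip)
```

The first header row begins with a stray comma, which is the unnamed index column. The second contains only the column names, joined by semicolons. Any program that opens the semicolon version, read_csv included, needs to be told which separator to expect.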

And with that, you’ve completed “First Python Notebook.” If you have any questions or critiques, you can get involved on our GitHub repository, where all of the code that powers this site is available as open source.