```{include} _templates/nav.html
```

# Adding sites and bundles

This page explains how to add new sites and bundles to the project by making a code contribution to the open-source repository.

```{note}
If you lack the technical skills or time to add sources yourself, you can always make a request of the project's maintainers by [filling out this form](https://github.com/palewire/news-homepages/issues/new?assignees=palewire&labels=enhancement%2Cgood+first+issue%2Chelp+wanted&projects=&template=add-a-site.yaml&title=%5BAdd+site%5D%3A+) on GitHub or by emailing [b@palewi.re](mailto:b@palewi.re).
```

## Adding a site

### Add record to `sites.csv` file

Adding a new site requires that a new row be added to [`sources/sites.csv`](https://github.com/palewire/news-homepages/blob/main/newshomepages/sources/sites.csv) with, at a minimum, the Twitter handle, URL, name, location, time zone, country and language of the target.

Time zones should be provided in [Python's standard formatting scheme](https://gist.github.com/heyalexej/8bf688fd67d7199be4a1682b3eec7568), which uses IANA names such as `America/New_York`. Countries should be provided as two-letter [ISO 3166-1](https://en.wikipedia.org/wiki/ISO_3166-1) alpha-2 codes. Languages should be provided as two-letter [ISO 639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) codes.

You can override the system's default by adding an optional attribute that sets the time delay before the screenshot. If provided, it is expected in milliseconds.

### Test the screenshot

After adding the row, verify that the site works by running the `screenshot.py` command and inspecting the result.

```bash
pipenv run python -m newshomepages.screenshot your-handle
```

### Hide ads and popups

If popups or ads interfere with the screenshot, our aim is to eliminate them via JavaScript. There are two techniques for achieving the goal:

1. Adding a selector to the `target_list` in the [`newshomepages.screenshot`](https://github.com/palewire/news-homepages/blob/main/newshomepages/screenshot.py) module. This should be done in cases where the offending element appears to be generated by a third-party library that may occur on other sites.
2. Creating a file in [`sources/javascript`](https://github.com/palewire/news-homepages/tree/main/newshomepages/sources/javascript) with its name slugged to match the Twitter handle of the site. This snippet will be run for just that site.

Here's a generic example that would remove any elements with the id of `ad_unit` or class of `popup`. If you identify the id or class of a page element you'd like to hide, you can insert it into this pattern.

```javascript
document.querySelectorAll(
  '#ad_unit,.popup'  // <-- Put your page's identifiers here. To target more than one element, separate the selectors with commas.
).forEach(el => el.remove())
```

This method can also accommodate more complicated manipulations of the page. Consult the examples in the repository to explore other techniques for targeting and hiding page elements.

### Add to a bundle

Next, link the site's row to one or more of the topical bundles defined in [`sources/bundles.csv`](https://github.com/palewire/news-homepages/blob/main/newshomepages/sources/bundles.csv). This is done by putting the slugs of the desired bundles into your site's bundle field. If you'd like to link a site with more than one bundle, separate the slugs with `|`.

For example, MSNBC is bundled with both national news outlets and left-wing sites, so its bundle field looks like this:

```
us-national|us-left-wing
```

If a suitable bundle for your site does not exist, you can add one to the separate bundle data file, as described below.

## Adding a bundle

Bundles are collections of sites that are grouped together for archiving, presentation and analysis.

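
To make the link between sites and bundles concrete, here is a minimal sketch of how a pipe-delimited bundle field could be expanded into bundle membership. It is not the project's own loader: the column names `handle` and `bundle`, the sample rows and the function name are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of sites.csv. The real file has more columns, and the
# column names and handles used here are assumptions for this example.
SAMPLE_SITES = """handle,name,bundle
msnbc,MSNBC,us-national|us-left-wing
example-site,Example Site,us-national
no-bundle-site,No Bundle Site,
"""


def group_sites_by_bundle(csv_text: str) -> dict[str, list[str]]:
    """Map each bundle slug to the handles of the sites that list it."""
    bundles = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A site may list several slugs separated by "|", or none at all.
        for slug in row["bundle"].split("|"):
            slug = slug.strip()
            if slug:  # skip the empty string produced by a blank field
                bundles[slug].append(row["handle"])
    return dict(bundles)


members = group_sites_by_bundle(SAMPLE_SITES)
print(members)
```

In this sketch a site with an empty bundle field simply belongs to no bundle, and a site listing two slugs appears under both.
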
Adding a new bundle requires that a new row be added to [`sources/bundles.csv`](https://github.com/palewire/news-homepages/blob/main/newshomepages/sources/bundles.csv) with a slug, name, location and time zone. When its slug value is entered in the bundle field of the `sites.csv` file, the site is considered a part of the bundle.

### Scheduling actions

While all sites in our directory are archived at least twice a day, bundles can have additional archiving runs scheduled via GitHub Actions. This allows us to optimize our runs for local time and also results in a tweet being posted automatically to the [@newshomepages](https://twitter.com/newshomepages) account.

Adding a new batch run requires creating a new YAML file in the `.github/workflows` directory that inherits from a reusable workflow shared by similar files. It should be named `archive-your-bundle-slug.yml`.

If you'd like to schedule a new bundle run, submit a file like this via pull request. You should only need to customize the name, the cron and the bundle.

```yaml
name: "Archive: Your bundle name"

on:
  workflow_dispatch:
  schedule:
    - cron: "0 18 * * *"  # <-- Your bundle's schedule goes here.

jobs:
  archive-bundle:
    name: Archive bundle
    uses: palewire/news-homepages/.github/workflows/reusable-archive-bundle-workflow.yml@main
    with:
      bundle: your-bundle-slug
    secrets: inherit
```
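
GitHub Actions evaluates `cron` schedules in UTC, so a run aimed at a particular local time has to be converted first. The following is a minimal sketch of that conversion using Python's standard `zoneinfo` module. The helper name `local_time_to_utc_cron` is hypothetical and not part of the repository, and the example time zone and dates are only illustrations.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def local_time_to_utc_cron(hour: int, minute: int, tz_name: str, on_date: date) -> str:
    """Return a daily cron expression in UTC for a local wall-clock time.

    The UTC offset is taken from `on_date`, so the result can differ between
    standard time and daylight saving time.
    """
    local_dt = datetime.combine(on_date, time(hour, minute), tzinfo=ZoneInfo(tz_name))
    utc_dt = local_dt.astimezone(timezone.utc)
    # Only the minute and hour fields vary; the run repeats every day.
    return f"{utc_dt.minute} {utc_dt.hour} * * *"


# 10:00 a.m. in Los Angeles during standard time (UTC-8)
winter = local_time_to_utc_cron(10, 0, "America/Los_Angeles", date(2024, 1, 15))
# The same local time during daylight saving time (UTC-7)
summer = local_time_to_utc_cron(10, 0, "America/Los_Angeles", date(2024, 7, 15))
print(winter)
print(summer)
```

Because a cron expression is fixed, a schedule chosen this way drifts by an hour in local terms whenever the bundle's time zone enters or leaves daylight saving time. The sketch also assumes that time zone data is available on the system running it.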