Why
Recklessroosters.com was built with a few things in mind:
- I wanted a light-hearted way to show how seemingly unrelated datasets can be combined to reveal something new
- I am a builder, and I like building stuff. This keeps me up to date with the latest tech (like FastAPI in this case)
- I think chickens really are planning to take over the world. Ever notice how they are all quiet and then start to chirp at the same time? Planning, I say.
How
- In short, RecklessRoosters uses data from the GBIF database for daily animal spottings. We enrich this data with each spotting's distance from a public road, using the OSM Overpass API. Through this process, we can pinpoint any animals that may be jaywalking.
- The GBIF database is a public BigQuery dataset
- The extraction and enrichment process runs on GCP Cloud Run
- The source dataset is written back to BigQuery and displayed in a Looker Studio report
- The website is hosted on GCP App Engine
- The frontend and backend code is available here:
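
As a rough illustration of the enrichment step described above (a sketch, not the site's actual code), the snippet below builds an Overpass QL query for roads near a spotting and estimates the distance to the closest one. The function names, the 50 m default radius, and the simplification of measuring to the nearest road vertex rather than the nearest point on a road segment are all assumptions made for this example.

```python
import math
from typing import Optional


def build_overpass_query(lat: float, lon: float, radius_m: int = 50) -> str:
    """Overpass QL: all ways tagged 'highway' within radius_m of the point.

    'out geom' asks Overpass to include each way's node coordinates.
    """
    return (
        "[out:json][timeout:25];"
        f'way(around:{radius_m},{lat},{lon})["highway"];'
        "out geom;"
    )


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def nearest_road_vertex_m(lat: float, lon: float,
                          overpass_json: dict) -> Optional[float]:
    """Distance to the closest road vertex in an Overpass JSON response.

    Simplification (assumption): measures to vertices only, so the result
    can overestimate the true distance to the road itself.
    Returns None when no roads were found in the search radius.
    """
    distances = [
        haversine_m(lat, lon, node["lat"], node["lon"])
        for element in overpass_json.get("elements", [])
        for node in element.get("geometry", [])
    ]
    return min(distances) if distances else None
```

The query string would be sent to an Overpass endpoint (for example the public one at https://overpass-api.de/api/interpreter) and the parsed JSON response passed to `nearest_road_vertex_m`. A spotting with a very small distance is a jaywalking candidate.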
Todo
- Automate new record sync from GBIF. This dataset is huge though, so think carefully about how much compute and storage each sync will use
- Automate the ELT/OSM Overpass backend query on GCP Cloud Run or AWS Lambda. GCP Dataflow seemed like overkill for this use case.
- Batch process this. With big data, remember idempotency, scalability, and simplicity
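
One possible way to keep the batch step idempotent, sketched under assumptions: split the records into fixed-size batches and write each record under a stable key, so re-running a batch overwrites rows instead of duplicating them. The `gbifid` key name and the in-memory dict standing in for the destination table are hypothetical choices for this example.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def batches(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def upsert_batch(store: Dict[str, dict], batch: List[dict],
                 key: str = "gbifid") -> None:
    """Write each record under its stable key (assumed field name).

    Writing the same batch twice leaves the store unchanged, which is
    what makes retries and re-runs safe.
    """
    for record in batch:
        store[record[key]] = record


# Usage: processing the same input twice does not create duplicates.
spottings = [{"gbifid": str(i), "species": "Gallus gallus"} for i in range(5)]
table: Dict[str, dict] = {}
for _ in range(2):
    for batch in batches(spottings, size=2):
        upsert_batch(table, batch)
```

In a real pipeline the same idea would map onto a keyed merge or upsert in the destination table rather than a Python dict.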
More questions?