A Log Archiving Service for MESA
01 Aug 2021
As part of testing the MESA stellar evolution code, we want to provide the developers with easy access to the output associated with failing tests. These logs come from the many different machines that MESA is tested on (ranging from a Raspberry Pi to large scientific computing clusters). As such, they can be especially useful in the case of failures that are not easy for an individual developer to reproduce (for example, because the failure is intermittent or because it only occurs with a specific operating system or compiler). Even in the case of easily reproducible failures, this saves a developer the time needed to re-run the case and trigger the failure themselves.
Our mesa_test testing framework manages the test runs and then (upon failure) transmits the logs to a remote server that makes them publicly available.
The MESA logs service is hosted on a Digital Ocean droplet (the same one that serves this website) and works as follows.
A Flask app (served with Gunicorn and Nginx) provides a route that accepts JSON POST requests. The expected JSON contains information about the computer, commit, and test case along with the base64-encoded logs.
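To make the request format concrete, here is a minimal sketch of how a client might assemble such a JSON body. The field names (`computer_name`, `commit`, `test_case`, `logs`) and the `build_payload` helper are hypothetical, chosen for illustration; the real schema used by mesa_test may differ.

```python
import base64
import json


def build_payload(computer_name: str, commit: str, test_case: str, log_bytes: bytes) -> str:
    """Return a JSON string describing one failing test, with its logs attached.

    All field names here are assumptions made for this sketch.
    """
    return json.dumps(
        {
            "computer_name": computer_name,
            "commit": commit,
            "test_case": test_case,
            # JSON cannot carry raw bytes, so the logs are base64-encoded
            # and stored as an ASCII string.
            "logs": base64.b64encode(log_bytes).decode("ascii"),
        }
    )
```

Base64 inflates the payload by roughly a third, but it lets arbitrary binary log data travel inside an ordinary JSON document.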
Upon receiving a correctly formatted request (authenticated via a secret API key), the Flask app writes the logs to disk on a small Digital Ocean block storage volume. Nginx also serves the contents of this volume, making the logs available at a standardized path:
https://logs.mesastar.org/<commit>/<computer_name>/<test_case>/
such that our testing dashboard can easily detect whether logs exist for a particular failure with an HTTP HEAD request.
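The storage step and the dashboard's existence check could look roughly like the sketch below. It uses only the standard library and leaves the Flask route itself out, so the route would simply call `store_logs` with the parsed JSON. The field names, the `logs.bin` file name, and every function name are assumptions for illustration. The post does not say how the API key is delivered, so the sketch takes it as a separate argument.

```python
import base64
import binascii
import hmac
import urllib.error
import urllib.request
from pathlib import Path

# Hypothetical field and file names; the real service may use others.
PATH_FIELDS = ("commit", "computer_name", "test_case")
LOG_FILENAME = "logs.bin"


def safe_component(value) -> str:
    """Reject values that could escape the storage root when used as a directory name."""
    if not isinstance(value, str) or value in ("", ".", "..") or "/" in value or "\\" in value:
        raise ValueError(f"unsafe path component: {value!r}")
    return value


def store_logs(payload: dict, supplied_key: str, expected_key: str, root: Path) -> Path:
    """Check the key, decode the logs, and write them under root/<commit>/<computer_name>/<test_case>/."""
    # Constant-time comparison avoids leaking key contents through timing.
    if not hmac.compare_digest(supplied_key.encode(), expected_key.encode()):
        raise PermissionError("invalid API key")
    missing = [field for field in PATH_FIELDS + ("logs",) if field not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    try:
        data = base64.b64decode(payload["logs"], validate=True)
    except binascii.Error as exc:
        raise ValueError("logs field is not valid base64") from exc
    target_dir = root.joinpath(*(safe_component(payload[field]) for field in PATH_FIELDS))
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / LOG_FILENAME
    target.write_bytes(data)
    return target


def logs_url(base_url: str, commit: str, computer_name: str, test_case: str) -> str:
    """Build the standardized public URL for one failure's logs."""
    return f"{base_url.rstrip('/')}/{commit}/{computer_name}/{test_case}/"


def logs_exist(url: str, timeout: float = 10.0) -> bool:
    """Issue an HTTP HEAD request; True on a 2xx response, False on 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return False
        raise  # other errors (e.g. 500) are not evidence either way
```

Because the on-disk layout mirrors the URL layout, the web server needs no lookup table: a HEAD request on the directory URL answers "do logs exist?" without transferring any log data. Validating each path component matters because the directory names come straight from the request.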
Because these files are intended for diagnosing failures, they are retained only for a limited time, under the assumption that their utility is largely exhausted once the underlying issues are fixed. A systemd timer prunes log files older than 60 days.
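The post does not say what command the timer runs, so the following is only a sketch of one way such a pruning job could be written. The `prune_old_logs` function is hypothetical, and it treats a file's modification time as its age.

```python
import os
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86400


def prune_old_logs(root: Path, max_age_days: float = 60, now: Optional[float] = None) -> List[Path]:
    """Delete files under root last modified more than max_age_days ago.

    Directories left empty by the deletions are removed too (root itself is kept).
    Returns the list of deleted files.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    # Walk bottom-up so each directory is visited after its contents.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        directory = Path(dirpath)
        for name in filenames:
            path = directory / name
            if path.stat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
        if directory != root and not any(directory.iterdir()):
            directory.rmdir()
    return removed
```

A script like this could be invoked from a oneshot systemd service that the timer triggers daily; a single `find` invocation with an age filter would serve the same purpose.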