After 18 months of heavy use within VIMC and in the COVID-19 modelling response, we have released orderly 1.3.0 to CRAN. This is a major update with many new features (more than 100 PRs since version 1.0.4 was published in early 2020).
Orderly is our reproducible research framework, designed to keep track of multiple versions of reports and allow researchers to automatically audit and roll back their data, or construct workflows around long-running tasks.
Since 2020 we have:
- Expanded the documentation, including a discussion of different workflow patterns orderly enables
- Added support for exporting work to be run elsewhere (e.g., on a compute cluster) and then sent back to a central server via bundles
- Massively improved the power of OrderlyWeb, which can be used as a central server to share artefacts among a team (see the remotes vignette)
- Expanded how dependencies work, to allow depending on the latest version with a particular parameter value (e.g., `latest(parameter:n_samples >= 1000)`) rather than just the latest version
- Added a new report development mode, `orderly_develop_start`, `orderly_develop_status` and `orderly_develop_clean`, which sets up an environment so reports can be developed in the same way as one might write code outside of orderly. It copies required files and dependencies, sources code files and loads declared packages. This supersedes `orderly_test_start`, but that function is retained in the package for now.
- Allowed easier use of a mixture of draft and archive reports in development, via the `use_draft` argument to `orderly_run` (rather than `draft` in the `orderly.yml`)
- Added tools to visualise the graph structure of source or archive reports with `orderly_graph`
- Added helper functions, inspired by `usethis`: `orderly_use_resource`, `orderly_use_source` and `orderly_use_package`, which can add a resource, source, or package into the `orderly.yml`
- Allowed environment variables and secrets to be used in reports
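
As an illustration of the bundle workflow listed above, here is a minimal sketch of a pack, run and import round trip. It assumes the orderly 1.x API is installed and uses the "minimal" example project that ships with the package. The function names `orderly_bundle_pack`, `orderly_bundle_run` and `orderly_bundle_import` are not named in this post; they are recalled from the 1.x reference documentation and should be checked against your installed version. Everything runs in one session here, whereas in practice the middle step would happen on another machine.

```r
# Assumption: orderly 1.3.0 (the 1.x API) is installed; the sketch is skipped otherwise.
has_orderly_1x <- requireNamespace("orderly", quietly = TRUE) &&
  utils::packageVersion("orderly") < "2.0.0"

if (has_orderly_1x) {
  # Throwaway copy of the "minimal" example project bundled with orderly
  root <- orderly::orderly_example("minimal")

  # Pack the "example" report into a zip bundle that can be copied elsewhere
  bundles <- tempfile()
  packed <- orderly::orderly_bundle_pack(bundles, "example", root = root)

  # Elsewhere (e.g., a cluster node; here the same session): run the bundle
  workdir <- tempfile()
  done <- orderly::orderly_bundle_run(packed$path, workdir)

  # Back at the central project: import the completed bundle into the archive
  orderly::orderly_bundle_import(done$path, root = root)

  # The imported report should now be listed in the archive
  print(orderly::orderly_list_archive(root = root))
}
```

The packed and completed bundles are ordinary zip files, so moving them between machines can be done with whatever file transfer the team already uses.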
In addition, there are many bug fixes and small features, such as better handling of metadata for failed reports, more flexible querying of reports, integration with Microsoft Teams, more flexibility with complex SQL database configurations, and better error messages.
Over the course of the pandemic we have used orderly to collect more than a terabyte of research outputs (mostly simulation data) across a distributed team. It has been key to many of our workflows over the last 18 months, and we hope that other groups can leverage this work.