みらい 未来
Minimalist Async Evaluation Framework for R
→ Designed for simplicity, a ‘mirai’ evaluates an R expression asynchronously in a parallel process, locally or distributed over the network.
→ Modern networking and concurrency, built on nanonext and NNG, ensure reliable scheduling over fast inter-process communications or TCP/IP secured by TLS.
→ Launch remote resources via SSH or cluster managers for distributed computing.
→ The queued architecture scales efficiently to millions of tasks over thousands of connections, requiring no storage on the file system.
→ Innovative features include event-driven promises, asynchronous parallel map, and seamless serialization of otherwise non-exportable reference objects.
mirai is Japanese for ‘future’ and is an implementation of futures in R.
→ mirai(): Sends an expression to be evaluated asynchronously in a separate R process and returns a mirai object immediately. Creation of a mirai is never blocking.
The result of a mirai m will be available at m$data once evaluation is complete and its return value is received. m[] may be used to wait for and collect the value.
library(mirai)

m <- mirai(
  {
    # slow operation
    Sys.sleep(2)
    sample(1:100, 1)
  }
)

m
#> < mirai [] >
m$data
#> 'unresolved' logi NA

# do other work

m[]
#> [1] 79
m$data
#> [1] 79
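Because the expression runs in a separate R process, any objects it needs have to be sent along with it. The following is a minimal sketch of passing values as named arguments and polling with unresolved() instead of blocking; the variable names x and y are illustrative only.

```r
library(mirai)

# Objects used by the expression are passed as named arguments,
# since the expression is evaluated in a separate R process
x <- 10
m <- mirai(
  {
    Sys.sleep(0.5)
    x * y
  },
  x = x,
  y = 2
)

# unresolved() returns TRUE while the result is still pending
while (unresolved(m)) {
  Sys.sleep(0.1) # placeholder for other useful work
}

m$data
#> [1] 20
```

Polling in this way keeps the calling session free between checks, whereas m[] simply waits until the value arrives.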
→ daemons(): Sets persistent background processes (daemons) where mirai are evaluated.
To launch 6 local daemons:
daemons(6)
#> [1] 6
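As a rough sketch of the daemon lifecycle, the example below sets a small pool, evaluates a mirai on it, inspects the pool with status(), and then resets. The pool size of 2 is arbitrary, and the initial reset is only there so the example can be run regardless of whether daemons are already set.

```r
library(mirai)

daemons(0)   # reset any existing daemons first
daemons(2)   # launch 2 local daemons

# mirai are now evaluated on the persistent daemons rather than
# each starting a new background process
m <- mirai(Sys.getpid())
m[]          # process ID of the daemon that evaluated the expression

status()     # connection information for the current daemons

daemons(0)   # shut down the daemons when finished
```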
Daemons may also be launched over the network for distributed computing, via SSH or cluster managers.
See the reference vignette for further details.
→ mirai_map(): Maps a function over a list or vector, with each element processed as a mirai. For a data frame or matrix, it automatically performs a multiple map over the rows.
A ‘mirai_map’ object is returned immediately, and is always non-blocking.
Its value may be retrieved using its [] method, returning a list. The [] method also provides options for flatmap, early stopping and progress indicators.
df <- data.frame(
  fruit = c("apples", "oranges", "pears"),
  price = c(3L, 2L, 5L)
)

m <- df |>
  mirai_map(\(...) sprintf("%s: $%d", ...))
m
#> < mirai map [0/3] >
m[.flat]
#> [1] "apples: $3" "oranges: $2" "pears: $5"
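For the simpler case of mapping over a vector, a sketch along the following lines may help. It assumes a pool of 2 local daemons and uses the .args argument to supply a constant to the mapped function; the argument name power is illustrative only.

```r
library(mirai)

daemons(0)  # reset any existing daemons first
daemons(2)

# Map over a vector; constant arguments are supplied through .args
m <- mirai_map(1:3, \(x, power) x^power, .args = list(power = 2))

squares <- m[.flat]  # wait for all results and flatten to a vector
squares
#> [1] 1 4 9

daemons(0)
```

Each element is sent as its own mirai, so the map returns straight away and only the m[.flat] call waits for the results.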
mirai is designed from the ground up to provide a production-grade experience.
→ Fast
→ Reliable
→ Scalable
mirai features the following core integrations, with usage examples in the linked vignettes:
Provides the first official alternative communications backend for R,
implementing the ‘MIRAI’ parallel cluster type, a feature request by
R-Core at R Project Sprint 2023.
Powers parallel map for the purrr functional programming toolkit, a
core tidyverse package.
Implements next generation, event-driven promises. ‘mirai’ and
‘mirai_map’ objects are readily convertible to ‘promises’, and may be
used directly with the promise pipe.
The primary async backend for Shiny, supporting ExtendedTask and the
next level of responsiveness and scalability for Shiny apps.
The built-in async evaluator behind the @async tag in plumber2; also provides an async backend for Plumber.
Allows Torch tensors and complex objects such as models and optimizers
to be used seamlessly across parallel processes.
Allows queries using the Apache Arrow format to be handled seamlessly
over ADBC database connections hosted in background processes.
Targets, a make-like pipeline tool, has adopted crew as its default
high-performance computing backend. Crew is a distributed
worker-launcher extending mirai to different distributed computing
platforms, from traditional clusters to cloud services.
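As an illustration of the first integration above, the parallel cluster backend, the sketch below creates a cluster with make_cluster() and passes it to a function from the base 'parallel' package. The node count and the doubling function are arbitrary choices for the example.

```r
library(mirai)

cl <- make_cluster(2)  # a cluster of 2 local nodes backed by mirai

# The cluster object is passed to functions from the base 'parallel' package
res <- parallel::parLapply(cl, 1:4, function(x) x * 2)
unlist(res)
#> [1] 2 4 6 8

stop_cluster(cl)       # shut down the cluster nodes
```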
We would like to thank in particular:
Will Landau for being instrumental in shaping development of the package, from initiating the original request for persistent daemons, through to orchestrating robustness testing for the high performance computing requirements of crew and targets.
Joe Cheng for integrating the ‘promises’ method to work seamlessly within Shiny, and prototyping event-driven promises.
Luke Tierney of R Core, for discussion on L’Ecuyer-CMRG streams to ensure statistical independence in parallel processing, and making it possible for mirai to be the first ‘alternative communications backend for R’.
Travers Ching for a novel idea in extending the original custom serialization support in the package.
Henrik Bengtsson for valuable insights leading to the interface accepting broader usage patterns.
Daniel Falbel for discussion around an efficient solution to serialization and transmission of torch tensors.
Kirill Müller for discussion on using parallel processes to host Arrow database connections.
Install the latest release from CRAN:
install.packages("mirai")
The current development version is available from R-universe:
install.packages("mirai", repos = "https://r-lib.r-universe.dev")
◈ mirai R package: https://mirai.r-lib.org/
◈ nanonext R package: https://nanonext.r-lib.org/
mirai is listed in the CRAN High Performance Computing Task View:
https://cran.r-project.org/view=HighPerformanceComputing
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.