r - txtProgressBar for parallel bootstrap not displaying properly -
below mwe of problem: have programmed progress bar function using bootstrap (via boot function boot package).
this works fine long don't use parallel processing (res_1core
below). if want use parallel processing setting parallel = "multicore"
, ncpus = 2
, progress bar isn't displayed (res_2core
below).
library(boot) rsq <- function(formula, data, r, parallel = c("no", "multicore", "snow"), ncpus = 1) { env <- environment() counter <- 0 progbar <- txtprogressbar(min = 0, max = r, style = 3) bootfun <- function(formula, data, indices) { d <- data[indices,] fit <- lm(formula, data = d) curval <- get("counter", envir = env) assign("counter", curval + 1, envir = env) settxtprogressbar(get("progbar", envir = env), curval + 1) return(summary(fit)$r.square) } res <- boot(data = data, statistic = bootfun, r = r, formula = formula, parallel = parallel, ncpus = ncpus) return(res) } res_1core <- rsq(mpg ~ wt + disp, data = mtcars, r = 1000) res_2core <- rsq(mpg ~ wt + disp, data = mtcars, r = 1000, parallel = "multicore", ncpus = 2)
i have read related fact boot function calls on lapply
single core processing , mclapply
multicore processing. know of easy workaround deal this? mean, display progress taking account of parallel processes.
update
thanks input of karolis koncevičius, have found workaround (just use updated rsq
function below):
rsq <- function(formula, data, r, parallel = c("no", "multicore", "snow"), ncpus = 1) { bootfun <- function(formula, data, indices) { d <- data[indices,] fit <- lm(formula, data = d) return(summary(fit)$r.square) } env <- environment() counter <- 0 progbar <- txtprogressbar(min = 0, max = r, style = 3) flush.console() intfun <- function(formula, data, indices) { curval <- get("counter", envir = env) + ncpus assign("counter", curval, envir = env) settxtprogressbar(get("progbar", envir = env), curval) bootfun(formula, data, indices) } res <- boot(data = data, statistic = intfun, r = r, formula = formula, parallel = parallel, ncpus = ncpus) return(res) }
unfortunately, works multicore processing when run r terminal. ideas how patch displays in r console or rstudio?
not ordered, might helpful.
a simple statistics function boot:
library(boot) bootfun <- function(formula, data, indices) { d <- data[indices,] fit <- lm(formula, data=d) summary(fit)$r.square }
higher order function display progress:
progressreporter <- function(total, nbars=100, f, ...) { count <- 1 step <- ceiling(total/nbars) cat(paste(rep("|", nbars), collapse=""), "\r") flush.console() function(...) { if (count %% step==0) { cat(".") } count <<- count + 1 f(...) } }
now function cheating - displays progress every "step" of iterations. if have 1000 iterations, use 2 cores , print every 10th iteration - job. cores don't share state, each run counter 500, , function respond both counters.
on other hand if 1000 iterations, run 10 cores , report every 200 - function stay silent, cores count 100 each. none reach 200 - no progress bar. hope idea. think should ok in of cases.
try out:
res_1core <- boot(formula="mpg ~ wt + disp", data=mtcars, r=1000, statistic=progressreporter(1000, nbars=100, f=bootfun)) res_2core <- boot(formula="mpg ~ wt + disp", data=mtcars, r=1000, statistic=progressreporter(1000, nbars=100, f=bootfun), parallel="multicore", ncpus=2)
Comments
Post a Comment