Since SNOW is being discontinued, today I spent some time looking for new ways to show a progress bar in R for jobs running in parallel. In this example, I run a simple function that calculates logarithms 10,000 times, using 2 threads, and monitor the progress of the 10,000 calculations.
Set up the parameters
The following are the three parameters needed for any parallel job: number of threads, number of replicates (jobs) and a function:
nthreads<-2
nreps<-10000
funrep<-function(i){
# return the base-2 and base-10 logarithms of i as a length-2 vector
c(log2(i),log10(i))
}
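Before going parallel, it can help to check sequentially what the function returns and what shape the combined output will have. The following is a minimal sketch in base R; `nreps_small` is a hypothetical small replicate count used only for illustration.

```r
# Same function as above: two logarithms per input value
funrep <- function(i) {
  c(log2(i), log10(i))
}

# A single call returns a numeric vector of length 2
single <- funrep(8)

# Hypothetical small replicate count, just for a quick sequential check
nreps_small <- 4

# Concatenating the per-replicate results gives 2 values per replicate,
# which is the same layout that .combine=c produces in foreach
sequential <- unlist(lapply(seq_len(nreps_small), funrep))
print(single)
print(length(sequential))
```

With the real `nreps` of 10,000, the combined vector would therefore hold 20,000 values.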
SNOW solution
This was my old solution in SNOW, but CRAN is flagging all packages that use SNOW with a “superseded packages” warning, so it has to be replaced:
library(doSNOW)
cl<-makeCluster(nthreads)
registerDoSNOW(cl)
pb<-txtProgressBar(0,nreps,style=3)
progress<-function(n){
setTxtProgressBar(pb,n)
}
opts<-list(progress=progress)
i<-0
output<-foreach(i=icount(nreps),.combine=c,.options.snow=opts) %dopar% {
s<-funrep(i)
return(s)
}
close(pb)
stopCluster(cl)
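The progress callback used above is an ordinary function of `n`, which, as far as I understand, doSNOW calls with the number of tasks completed so far. As a rough sketch, the same callback can be tried without any cluster by calling it from a plain loop; `n_tasks` and the `Sys.sleep()` delay are assumptions made for illustration.

```r
library(utils)

# Hypothetical small task count, just for the demonstration
n_tasks <- 5

pb <- txtProgressBar(min = 0, max = n_tasks, style = 3)

# Same shape as the callback passed through .options.snow
progress <- function(n) {
  setTxtProgressBar(pb, n)
}

# Stand-in for the cluster: call the callback after each simulated task
for (n in seq_len(n_tasks)) {
  Sys.sleep(0.05)  # pretend some work is being done
  progress(n)
}

# Read the bar's current value before closing it
final_value <- getTxtProgressBar(pb)
close(pb)
```

The point is that the bar lives in the main session and is only ever updated there, which is what makes the SNOW version work.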
Parallel solution (not working)
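Unfortunately, doParallel doesn’t offer a progress option through .options in foreach, and running it like this won’t work: as far as I can tell, the progress bar is updated inside the worker processes, whose console output is not shown in the main session, so nothing is drawn while the jobs run. Note also that the loop body below ends with setTxtProgressBar(), so the loop no longer returns the result of funrep(i):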
library(doParallel)
cl<-makeCluster(nthreads)
registerDoParallel(cl)
pb<-txtProgressBar(0,nreps,style=3)
output<-foreach(i=icount(nreps),.combine=c) %dopar% {
funrep(i)
setTxtProgressBar(pb,i)
}
stopCluster(cl)
Another parallel solution
After many tears, I finally found a solution that could work. Essentially, instead of c() I am running a progcombine() that contains c() and also updates a progress bar. Luckily, it works on both Windows and Linux:
library(doParallel)
progcombine<-function(){
pb <- txtProgressBar(min=1, max=nreps-1,style=3)
count <- 0
function(...) {
count <<- count + length(list(...)) - 1
setTxtProgressBar(pb,count)
flush.console()
c(...)
}
}
cl <- makeCluster(nthreads)
registerDoParallel(cl)
output<-foreach(i = icount(nreps),.combine=progcombine()) %dopar% {
funrep(i)
}
stopCluster(cl)
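To see why the counter in progcombine() ends at nreps-1, here is a small sequential sketch of the same closure pattern, without the progress bar. It assumes that foreach calls a custom combine function pairwise, as combine(accumulated, new_result), which I believe is the default when .multicombine is not set; `Reduce()` is used here only as a stand-in for that behaviour, and `make_counting_combine` is a hypothetical helper name.

```r
# Hypothetical helper: same closure pattern as progcombine(), minus the bar
make_counting_combine <- function() {
  count <- 0
  combine <- function(...) {
    # the first argument is the accumulated result, the rest are new results
    count <<- count + length(list(...)) - 1
    c(...)
  }
  # expose the combine function and a getter for the private counter
  list(combine = combine, get_count = function() count)
}

# Five small results, shaped like the output of funrep()
results <- lapply(1:5, function(i) c(log2(i), log10(i)))

cc <- make_counting_combine()

# Reduce() calls combine(accumulated, next_result) once per additional result,
# used here as an assumed approximation of foreach's pairwise combining
combined <- Reduce(cc$combine, results)
n_calls <- cc$get_count()
print(n_calls)
print(length(combined))
```

Because the first call already consumes two results, five results produce a final count of four, which matches the max=nreps-1 used for the progress bar.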
The working solution: pblapply
library(pbapply)
cl<-parallel::makeCluster(nthreads)
invisible(parallel::clusterExport(cl=cl,varlist=c("nreps")))
invisible(parallel::clusterEvalQ(cl=cl,library(utils)))
result<-pblapply(cl=cl,
X=1:nreps,
FUN=funrep)
parallel::stopCluster(cl)
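One difference from the foreach versions is that pblapply(), like lapply(), returns a list with one element per replicate rather than a single concatenated vector. The sketch below shows two ways to reshape such a list; it uses plain lapply() as a stand-in for pblapply() so that it runs without a cluster, and the column names are my own choice.

```r
nreps <- 10000
funrep <- function(i) c(log2(i), log10(i))

# Stand-in for pblapply(): lapply() returns the same list structure
result <- lapply(1:nreps, funrep)

# Option 1: flatten to one long vector, matching foreach with .combine=c
output <- unlist(result)

# Option 2: one row per replicate, one column per logarithm
output_matrix <- do.call(rbind, result)
colnames(output_matrix) <- c("log2", "log10")  # illustrative column names
print(length(output))
print(dim(output_matrix))
```

The matrix form is often easier to work with afterwards, since each row stays tied to its replicate.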