Background
R dogma holds that for loops are bad because they are slow, but this is not the case in C++. I had never programmed a line of C++ until last week, when my beloved firstborn started university and enrolled in an introductory C++ course, so I thought I would try to learn some and see whether it would speed up Passing Bablok regression.
Passing Bablok Regression
As mentioned in the past, the field of Clinical Chemistry has a peculiar devotion to Passing Bablok regression… and hey, why not?
Here is the code for a minimal implementation of Passing Bablok regression as discussed in this paper. This is the scale-invariant version.
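As a reading aid for the code below (my own summary of the estimator; consult the paper for the details): for every pair of points i < j, compute the pairwise slope

S_ij = (y_i - y_j) / (x_i - x_j)

The slope estimate b is then a shifted median of the sorted S values, where the offset K = floor(neg/2), with neg the number of negative pairwise slopes, shifts the median to account for those negative slopes. The intercept estimate is a = median(y - b*x).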
PB.reg <- function(x, y){
  lx <- length(x)
  l <- lx*(lx - 1)/2
  k <- 0
  S <- rep(NA, l)
  # Pairwise slopes for every pair of points i < j
  for (i in 1:(lx - 1)) {
    for (j in (i + 1):lx) {
      k <- k + 1
      S[k] <- (y[i] - y[j])/(x[i] - x[j])
    }
  }
  S.sort <- sort(S)
  N <- length(S.sort)
  neg <- length(subset(S.sort, S.sort < 0))
  K <- floor(neg/2)
  # Slope: the K-shifted median of the sorted pairwise slopes
  if (N %% 2 == 1) {
    b <- S.sort[(N + 1)/2 + K]
  } else {
    b <- sqrt(S.sort[N/2 + K]*S.sort[N/2 + K + 1])
  }
  # Intercept: the median residual at slope b
  a <- median(y - b * x)
  return(as.vector(c(a, b)))
}
So let’s make some fake data and see what we get:
set.seed(314)
x <- seq(1, 100, length.out = 100)
y <- 1.1*x + 20 + x*rnorm(100, 0, 0.10)
reg <- PB.reg(x, y)
round(reg, 2)
## [1] 18.90 1.14 |
Just to sanity check, we can get the coefficients from the mcr package.
library(mcr)
reg.mcr <- mcreg(x, y)
round(reg.mcr@glob.coef, 2)
## [1] 18.90 1.14 |
Ok, looks good.
Rcpp
The Rcpp package permits compiling and executing C++ code from within R, which can be good for computationally intensive tasks. Here is my child-like attempt at a recapitulation of the R script above.
#include <iostream>
#include <vector>
#include <cmath>
#include <limits>
#include <algorithm>
#include <numeric>
#include "Rcpp.h"
using namespace std;
using namespace Rcpp;

// [[Rcpp::plugins("cpp11")]]
// [[Rcpp::export]]
vector<double> PB(vector<double> x, vector<double> y){
  int lx = x.size();
  int l = lx*(lx - 1)/2;
  int k = -1;
  vector<double> S(l, numeric_limits<double>::quiet_NaN());
  vector<double> sortS;
  vector<double> neg;
  double b;
  double a;
  vector<double> aVec(x.size());
  vector<double> coef(2);
  // Calculate the components of the S Matrix
  for(int i = 0; i < lx - 1; i++){
    for(int j = i + 1; j < lx; j++){
      k += 1;
      S[k] = (y[i] - y[j])/(x[i] - x[j]);
    }
  }
  // Sort the S Matrix
  sort(S.begin(), S.end());
  // Delete any undefined values from S
  for(vector<double>::iterator it = S.begin(); it != S.end(); ++it){
    if (!isnan(*it)) sortS.push_back(*it);
  }
  // Put all the negative values into a vector called neg
  for(vector<double>::iterator it = S.begin(); it != S.end(); ++it){
    if ((*it) < 0) neg.push_back(*it);
  }
  // Calculate N and K
  int K = int(floor(float(neg.size()/2)));
  int N = sortS.size();
  // Calculate the slope
  if (N % 2 == 1) {
    b = S[static_cast<int>((N + 1)/2 + K - 1)];
  } else {
    b = sqrt(S[static_cast<int>(N/2 + K - 1)]*S[static_cast<int>(N/2 + K)]);
  }
  // Make a vector aVec of the estimates of the intercept
  for(int i = 0; i < x.size(); i++){
    aVec[i] = y[i] - b*x[i];
  }
  sort(aVec.begin(), aVec.end());
  // Calculate the median from the sorted values of aVec
  if(x.size() % 2 == 1){
    a = aVec[static_cast<int>((x.size() + 1.0)/2.0 - 1.0)];
  } else {
    a = (aVec[static_cast<int>(x.size()/2.0 - 1.0)] + aVec[static_cast<int>(x.size()/2.0)])/2.0;
  }
  // Report results
  coef[0] = a;
  coef[1] = b;
  return coef;
}

int main(){
  return 0;
}
Saving this to a local directory as PB_Reg.cpp, we can compile it and make the function available in the R session as follows:
library(Rcpp)
sourceCpp("PB_Reg.cpp")
And run it:
round(PB(x,y),2) |
## [1] 18.90 1.14 |
Looks good! But is it faster? Like, really faster? Let's find out, and let's also compare it against what the compiler package produces:
library(compiler)
PB.reg.cmpfun <- cmpfun(PB.reg)

library(rbenchmark)
bmark <- benchmark(
  Rscript = PB.reg(x, y),
  `Compiler Package` = PB.reg.cmpfun(x, y),
  Rcpp = PB(x, y),
  replications = 100,
  order = "relative"
)
knitr::kable(bmark)
|   | test             | replications | elapsed | relative | user.self | sys.self | user.child | sys.child |
|---|------------------|--------------|---------|----------|-----------|----------|------------|-----------|
| 3 | Rcpp             | 100          | 0.024   | 1.000    | 0.024     | 0.000    | 0          | 0         |
| 2 | Compiler Package | 100          | 0.214   | 8.917    | 0.214     | 0.000    | 0          | 0         |
| 1 | Rscript          | 100          | 0.217   | 9.042    | 0.213     | 0.004    | 0          | 0         |
So the C++ code is about 9x faster than the R source here. Not bad, but in this case it was probably not worth the effort. Oh well, I learned something. Incidentally, the compiler package did not help much in this case, as it is only marginally faster than raw R source; I have seen better results from compiler in other cases.
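For what it's worth, here is the sort of toy case where I would expect cmpfun() to show a bigger gain: a function dominated by scalar arithmetic in a tight loop. The slow.sum function below is made up for illustration, and note that recent versions of R byte-compile functions automatically through the JIT, which narrows the gap:

library(compiler)
library(rbenchmark)

# A deliberately loop-heavy function: scalar arithmetic, nothing vectorized
slow.sum <- function(n) {
  s <- 0
  for (i in 1:n) {
    s <- s + i^2
  }
  s
}
slow.sum.cmp <- cmpfun(slow.sum)

benchmark(
  interpreted = slow.sum(1e5),
  compiled = slow.sum.cmp(1e5),
  replications = 100,
  order = "relative"
)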
Final thought
Speaking of speed:
“Free yourself, like a gazelle from the hand of the hunter, like a bird from the snare of the fowler.”
Proverbs 6:5
Comments

The dogma to avoid for loops in R exists because, most of the time, it is better to rely on vectorized operations. For this function, a vectorized approach in R is much faster than the looped version:
PB.reg.vectorized <- function(x, y) {
xdiffs <- outer(x, x, "-")
xdiffs <- xdiffs[lower.tri(xdiffs)]
ydiffs <- outer(y, y, "-")
ydiffs <- ydiffs[lower.tri(ydiffs)]
S <- ydiffs / xdiffs
S.sort <- sort(S)
N <- length(S.sort)
neg <- length(subset(S.sort,S.sort < 0))
K <- floor(neg/2)
if (N %% 2 == 1) {
b <- S.sort[(N+1)/2+K]
} else {
b <- sqrt(S.sort[N / 2 + K]*S.sort[N / 2 + K + 1])
}
a <- median(y - b * x)
return(as.vector(c(a,b)))
}
library(microbenchmark)
set.seed(314)
x <- seq(1, 100, length.out = 100)
y <- 1.1* x + 20 + x*rnorm(100,0,0.10)
microbenchmark(
original = PB.reg(x, y),
vectorized = PB.reg.vectorized(x, y)
)
# Unit: microseconds
# expr min lq mean median uq max neval
# original 1824.6 2007.95 3380.896 2269.0 3780.20 59922.6 100
# vectorized 518.0 571.60 763.019 612.6 957.75 3125.0 100
In the vectorized version, the loop is replaced by calls to outer to create two matrices of pairwise differences. Matrix operations are vectorized in R, so this is faster even though more than half of each matrix goes unused (a small illustration follows below). Of course, the C++ version would be faster still.
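To make the indexing concrete, here is what outer() and lower.tri() produce on a toy four-point vector (named v here to avoid clobbering the x from above):

v <- 1:4
d <- outer(v, v, "-")  # 4 x 4 matrix of all pairwise differences v[i] - v[j]
d[lower.tri(d)]        # the 6 unique i > j differences, taken column by column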
Wow – cool… that’s awesome. Thanks for the quick reply and working code.
I’m also quite new to C++, although in my case I’ve taken to it out of necessity.
I’ve found the armadillo library (provided by the RcppArmadillo package) to be helpful while I’m getting started. As an example:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::vec PB_ARMA(arma::vec const &x, arma::vec const &y){
  arma::uword lx = x.n_rows;
  arma::uword l = lx * (lx - 1) / 2;
  int k = -1;
  double b = arma::datum::nan;
  double a = arma::datum::nan;
  // Calculate and sort the components of the S Matrix
  arma::vec S(l);
  S.fill(arma::datum::nan);
  for(arma::uword i = 0; i < lx - 1; i++){
    for(arma::uword j = i + 1; j < lx; j++){
      k += 1;
      S(k) = (y(i) - y(j)) / (x(i) - x(j));
    }
  }
  // Calculate the number of negative elements as well as the number of
  // elements with finite values.
  arma::uvec idxNeg = arma::find(S < 0);
  arma::uvec idxFinite = arma::find_finite(S);
  arma::uword K = idxNeg.n_rows / 2;
  arma::uword N = idxFinite.n_rows;
  // Remove the non-finite values (if any) and sort the result
  S = arma::sort(S(idxFinite));
  // Calculate the slope
  if ((N % 2) == 1) {
    b = S((N+1)/2 + K - 1);
  } else {
    b = sqrt(S(N/2 + K - 1) * S(N/2 + K));
  }
  // Make a vector aVec of the estimates of the intercept
  arma::vec aVec = arma::sort(y - b * x);
  // Calculate the median from the sorted values of aVec
  a = arma::median(aVec);
  // Report results
  arma::vec coef = {a, b};
  return coef;
}
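For anyone following along, this version compiles and runs the same way as the earlier one; the file name PB_ARMA.cpp below is just my choice:

library(Rcpp)
sourceCpp("PB_ARMA.cpp")  # assumes the code above was saved as PB_ARMA.cpp
round(PB_ARMA(x, y), 2)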
I do agree that, in the absence of a major bottleneck, involving C++ in one's R code is probably overkill, outside of an R package anyway.
The lack of a basic element-wise addition operator for std::vector had me baffled as well, but I’ve also seen some reasonable arguments that ‘+’ could be interpreted as a concatenation operator (which had never occurred to me). So I guess the designers decided that the behaviour of ‘+’ on vectors was ambiguous and that other libraries would implement their own operators where the context made the intention obvious (e.g. a linear algebra library).
I can definitely skip sorting aVec, good catch. Taking it one step further, aVec becomes unnecessary and the parameter 'a' can be calculated as 'a = arma::median(y - b*x);'.
A strange one, but vectors are a generic container that can contain types such as strings, stacks or queues (not to say that they should) and in these cases I understand why someone would get the impression that the addition operator implies concatenation.
If you can rely on vectorization, R is normally fast enough. Looping over pre-allocated matrices is also very fast, sometimes with no time penalty at all. However, when I've had to implement algorithms that involve recursion or that cannot be vectorized properly, Rcpp + Armadillo can give you huge gains. Last week I sped up some code 200x by moving it to Rcpp, because it involved recursion plus some odd calculations.
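As a footnote to the pre-allocation point, here is a toy sketch (my own, with made-up function names) contrasting a vector grown inside a loop with one allocated up front:

library(microbenchmark)

# Growing the result forces a reallocation and copy on every iteration
grow <- function(n) {
  v <- c()
  for (i in 1:n) v <- c(v, i * 2)
  v
}

# Pre-allocating writes into memory that already exists
prealloc <- function(n) {
  v <- numeric(n)
  for (i in 1:n) v[i] <- i * 2
  v
}

microbenchmark(grow(1e4), prealloc(1e4), times = 20)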