Is the pain worth it?: Can Rcpp speed up Passing Bablok Regression?

Background

R dogma is that for loops are bad because they are slow but this is not the case in C++. I had never programmed a line of C++ as of last week but my beloved firstborn started university last week and is enrolled in a C++ intro course, so I thought I would try to learn some and see if it would speed up Passing Bablok regression.

Passing Bablok Regression

As mentioned in the past, the field of Clinical Chemistry has a peculiar devotion to Passing Bablok regression… and hey, why not?

Here is the code for a minimal implementation of Passing Bablok regression as discussed in this paper. This is the scale-invariant version.

So let’s make some fake data and see what we get:

Just to sanity check, we can get the coefficients from the mcr() package.

Ok, looks good.

Rcpp

The Rcpp package permits compiling and execution of C++ code from within R. This can be good for computationally intensive tasks. Here is my child-like attempt at recapitulation of the R script above.

Saving this to a local directory as PB_Reg.cpp, we can compile it and call the function into the R session as follows:

And run it:

Looks good! But is it faster? Like really faster? Let’s find out. And let’s also compare it to what the compiler() package produces:

test replications elapsed relative user.self sys.self user.child sys.child
3 Rcpp 100 0.024 1.000 0.024 0.000 0 0
2 Compiler Package 100 0.214 8.917 0.214 0.000 0 0
1 Rscript 100 0.217 9.042 0.213 0.004 0 0

So this makes C++ code about 9x faster than R source code. Not bad but in this case it was probably not worth the effort. Oh well. I learned something. Incidentally, the compiler() package did not help much in this case as it is only marginally better than raw R source. I have seen better results from compiler() in other cases.

Final thought

Speaking of speed:

“Free yourself, like a gazelle from the hand of the hunter, like a bird from the snare of the fowler.”

Proverbs 6:5