RCPP Sugar PDF
This note describes Rcpp sugar, which was introduced in version 0.8.3 of Rcpp (Eddelbuettel et al., 2019a; Eddelbuettel and François, 2011). Rcpp sugar brings a higher level of abstraction to C++ code written using the Rcpp API. Rcpp sugar is based on expression templates (Abrahams and Gurtovoy, 2004; Vandevoorde and Josuttis, 2003) and provides some ‘syntactic sugar’ facilities directly in Rcpp. This is similar to how RcppArmadillo (Eddelbuettel et al., 2019b) offers linear algebra C++ classes based on Armadillo (Sanderson, 2010).

Rcpp | sugar | R | C++

1. Motivation

Rcpp facilitates development of internal compiled code in an R package by abstracting low-level details of the R API (R Core Team, 2018) into a consistent set of C++ classes.

Code written using Rcpp classes is easier to read, write and maintain, without losing performance. Consider the following code example which provides a function foo as a C++ extension to R by using the Rcpp API:

The goal of the function foo is simple. Given two numeric vectors, we create a third one. This is typical low-level C++ code that could be written much more concisely in R thanks to vectorisation, as shown in the next example.

    foo <- function(x, y) {
        ifelse(x < y, x * x, -(y * y))
    }

Put succinctly, the motivation of Rcpp sugar is to bring a subset of the high-level R syntax to C++. Hence, with Rcpp sugar, the C++ version of foo now becomes:

    Rcpp::NumericVector foo(Rcpp::NumericVector x,
                            Rcpp::NumericVector y) {
        return ifelse(x < y, x * x, -(y * y));
    }

Apart from being strongly typed and needing an explicit return statement, the code is now identical between highly vectorised R and C++.

Rcpp sugar is written using expression templates and lazy evaluation techniques (Abrahams and Gurtovoy, 2004; Vandevoorde and Josuttis, 2003). This not only allows a much nicer high-level syntax, but also makes it rather efficient (as we detail in section 4 below).

2. Operators

Rcpp sugar takes advantage of C++ operator overloading. The next few sections discuss several examples.

2.1. Binary arithmetic operators. Rcpp sugar defines the usual binary arithmetic operators: +, -, *, /.

    // two numeric vectors of the same size
    NumericVector x;
    NumericVector y;

The left hand side (lhs) and the right hand side (rhs) of each binary arithmetic expression must be of the same type (for example, they should both be numeric expressions).

The lhs and the rhs can either have the same size, or one of them can be a primitive value of the appropriate type, for example when adding a NumericVector and a double.

2.2. Binary logical operators. Binary logical operators create a logical sugar expression from either two sugar expressions of the same type or one sugar expression and a primitive value of the associated type.

    // two numeric vectors of the same size
    NumericVector x;
    NumericVector y;

    // expressions involving two vectors
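To make the operator overloading of section 2 more concrete, here is a minimal sketch of the expression template idea in plain C++, without Rcpp. It is not Rcpp's actual implementation: the names Expr, Vec, Binary, PlusScalar, AddOp and MulOp are hypothetical, elements are plain double values, and there is no NA handling. Each operator only builds a small node object; the element-wise loop runs once, when the expression is assigned to a Vec.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base (hypothetical): every expression type E derives from Expr<E>.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
    double operator[](std::size_t i) const { return self()[i]; }
    std::size_t size() const { return self().size(); }
};

// Concrete vector that owns its data.
class Vec : public Expr<Vec> {
public:
    explicit Vec(const std::vector<double>& d) : data_(d) {}

    // Materialise any expression; this loop is the only place work happens.
    template <typename E>
    Vec(const Expr<E>& e) : data_(e.size()) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e[i];
    }

    double operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

struct AddOp { static double apply(double a, double b) { return a + b; } };
struct MulOp { static double apply(double a, double b) { return a * b; } };

// Node for "lhs Op rhs": stores references, computes an element only when indexed.
template <typename L, typename R, typename Op>
class Binary : public Expr<Binary<L, R, Op> > {
public:
    Binary(const L& l, const R& r) : l_(l), r_(r) {
        assert(l.size() == r.size());  // both sides must have the same size
    }
    double operator[](std::size_t i) const { return Op::apply(l_[i], r_[i]); }
    std::size_t size() const { return l_.size(); }

private:
    const L& l_;
    const R& r_;
};

// Node for "expression + scalar": the scalar is stored by value.
template <typename L>
class PlusScalar : public Expr<PlusScalar<L> > {
public:
    PlusScalar(const L& l, double s) : l_(l), s_(s) {}
    double operator[](std::size_t i) const { return l_[i] + s_; }
    std::size_t size() const { return l_.size(); }

private:
    const L& l_;
    double s_;
};

// The overloaded operators build nodes; they do no arithmetic themselves.
template <typename L, typename R>
Binary<L, R, AddOp> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Binary<L, R, AddOp>(l.self(), r.self());
}

template <typename L, typename R>
Binary<L, R, MulOp> operator*(const Expr<L>& l, const Expr<R>& r) {
    return Binary<L, R, MulOp>(l.self(), r.self());
}

template <typename L>
PlusScalar<L> operator+(const Expr<L>& l, double s) {
    return PlusScalar<L>(l.self(), s);
}
```

With these definitions, a statement such as Vec r = x * x + y; builds one Binary node on top of another and evaluates both in a single pass, without a temporary vector for x * x. Because the nodes in this sketch hold references, an expression should be assigned to a Vec within the same statement that builds it.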
    // use it as part of a numerical expression
    NumericVector res = -x * (x + 2.0);

    // two numeric vectors of the same size
    NumericVector y;
    NumericVector z;

3. Functions

Rcpp sugar defines functions that closely match the behavior of R functions of the same name.

3.1. Functions producing a single logical result. Given a logical sugar expression, the all function identifies if all the elements are TRUE. Similarly, the any function identifies if any of the elements is TRUE when given a logical sugar expression.

3.2.2. seq_along. Given a sugar expression of any type, seq_along creates an integer sugar expression whose values go from 1 to the size of the input.

    IntegerVector x =
        IntegerVector::create( 0, 1, NA_INTEGER, 3 );

This is the most lazy function, as it only needs to call the size member function of the input expression. The input expression need not be resolved. The two examples above give the same result with the same efficiency at runtime. The compile time will be affected by the complexity of the second expression, since the abstract syntax tree is built at compile time.

3.2.3. seq_len. seq_len creates an integer sugar expression whose i-th element expands to i. seq_len is particularly useful in conjunction with sapply and lapply.
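The laziness of seq_along and seq_len can be illustrated with a small sketch in plain C++, without Rcpp. The names SeqLen, seq_along_sketch and CountingInput are hypothetical and are not part of the Rcpp API; the sketch only assumes that an input expression offers size() and operator[]. The counter shows that the sequence is built from the size of the input alone, without reading any of its elements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical lazy sequence 1, 2, ..., n: no element is ever stored.
class SeqLen {
public:
    explicit SeqLen(std::size_t n) : n_(n) {}

    // Zero-based position i holds the one-based value i + 1.
    int operator[](std::size_t i) const {
        assert(i < n_);
        return static_cast<int>(i) + 1;
    }
    std::size_t size() const { return n_; }

private:
    std::size_t n_;
};

// Only asks the input for its size; the input is never indexed.
template <typename E>
SeqLen seq_along_sketch(const E& e) {
    return SeqLen(e.size());
}

// Hypothetical input that records how many of its elements have been read.
class CountingInput {
public:
    explicit CountingInput(std::size_t n) : n_(n), reads_(0) {}

    double operator[](std::size_t) const {
        ++reads_;
        return 0.0;
    }
    std::size_t size() const { return n_; }
    std::size_t reads() const { return reads_; }

private:
    std::size_t n_;
    mutable std::size_t reads_;
};
```

In this sketch, calling seq_along_sketch on a CountingInput leaves its read counter at zero, which mirrors the statement that the input expression need not be resolved.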
5.2. The VectorBase class. The CRTP is used as the basis for Rcpp sugar with the VectorBase class template. All sugar expressions derive from one class generated by the VectorBase template. The current definition of VectorBase is given here:

    template <int RTYPE, bool na, typename VECTOR>
    class VectorBase {
    public:
        struct r_type :
            traits::integral_constant<int,RTYPE>{};
        struct can_have_na :
            traits::integral_constant<bool,na>{};

        typedef typename
            traits::storage_type<RTYPE>::type
            stored_type;

        // ...
    };

5.3. Example: sapply. As an example, the current implementation of sapply, supported by the template class Rcpp::sugar::Sapply, is given below:

    template <int RTYPE, bool NA,
              typename T, typename Function>
    class Sapply : public VectorBase<
        Rcpp::traits::r_sexptype_traits< typename
            ::Rcpp::traits::result_of<Function>::type
        >::rtype,
        true,
        Sapply<RTYPE, NA, T, Function>
    > {
    public:
        typedef typename
            ::Rcpp::traits::result_of<Function>::type
            result_type;

        const static int RESULT_R_TYPE =
            Rcpp::traits::r_sexptype_traits<
                result_type>::rtype;

        typedef Rcpp::VectorBase<RTYPE, NA, T> VEC;

        typedef typename
            Rcpp::traits::r_vector_element_converter<
                RESULT_R_TYPE>::type
            converter_type;

        typedef typename
            Rcpp::traits::storage_type<RESULT_R_TYPE>::type
            STORAGE;

        Sapply(const VEC& vec_, Function fun_) :
            vec(vec_), fun(fun_){}

        inline STORAGE operator[]( int i ) const {
            return converter_type::get(fun(vec[i]));
        }
        inline int size() const {
            return vec.size();
        }

    private:
        const VEC& vec;
        Function fun;
    };

    // sugar

    template <int RTYPE, bool _NA_,
              typename T, typename Function >
    inline sugar::Sapply<RTYPE, _NA_, T, Function>
    sapply(const Rcpp::VectorBase<RTYPE,_NA_,T>& t,
           Function fun) {

        return
            sugar::Sapply<RTYPE,_NA_,T,Function>(t, fun);
    }

5.3.1. The sapply function. sapply is a template function that takes two arguments. The first argument is a sugar expression, which we recognize because of the relationship with the VectorBase class template. The second argument is the function to apply.

The sapply function itself does not do anything; it is just used to trigger compiler detection of the template parameters that will be used in the sugar::Sapply template.

5.3.2. Detection of return type of the function. In order to decide which kind of expression is built, the Sapply template class queries the template argument via the Rcpp::traits::result_of template.

    typedef typename
        ::Rcpp::traits::result_of<Function>::type
        result_type;

The result_of type trait is implemented as such:

    template <typename T>
    struct result_of{
        typedef typename T::result_type type;
    };

    template <typename RESULT_TYPE,
              typename INPUT_TYPE>
    struct result_of<RESULT_TYPE (*)(INPUT_TYPE)> {
        typedef RESULT_TYPE type;
    };

The generic definition of result_of targets functors with a nested result_type type. The partial specialization handles pointers to functions that take a single argument.

5.3.3. Identification of expression type. Based on the result type of the function, the r_sexptype_traits trait is used to identify the expression type.

    const static int RESULT_R_TYPE =
        Rcpp::traits::r_sexptype_traits<
            result_type>::rtype;

5.3.4. Converter. The r_vector_element_converter class is used to convert an object of the function’s result type to the actual storage type suitable for the sugar expression.

    typedef typename
        Rcpp::traits::r_vector_element_converter<
            RESULT_R_TYPE>::type
        converter_type;

5.3.5. Storage type. The storage_type trait is used to get access to the storage type associated with a sugar expression type. For example, the storage type of a REALSXP expression is double.

    typedef typename
        Rcpp::traits::storage_type<RESULT_R_TYPE>::type
        STORAGE;

5.3.6. Input expression base type. The input expression (the expression over which sapply runs) is also typedef’ed for convenience:

    typedef Rcpp::VectorBase<RTYPE, NA, T> VEC;

5.3.7. Output expression base type. In order to be part of the Rcpp sugar system, the type generated by the Sapply class template must inherit from VectorBase.

    template <int RTYPE, bool NA,
              typename T, typename Function>
    class Sapply : public VectorBase<
        Rcpp::traits::r_sexptype_traits<
            typename
            ::Rcpp::traits::result_of<Function>::type
        >::rtype,
        true,
        Sapply<RTYPE,NA,T,Function>
    >

The expression built by Sapply depends on the result type of the function and may contain missing values; the third argument is the manifestation of the CRTP.

5.3.8. Constructor. The constructor of the Sapply class template is straightforward: it simply holds a reference to the input expression and a copy of the function.

    Sapply(const VEC& vec_, Function fun_):
        vec(vec_), fun(fun_){}

    private:
        const VEC& vec;
        Function fun;
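The pieces described in section 5.3 can be tried out in isolation with the following sketch in plain C++, without Rcpp. It is a simplified illustration rather than the Rcpp implementation: result_of_sketch, Base, Numbers, SapplySketch, sapply_sketch, square and IsPositive are hypothetical names, and the R-specific traits (r_sexptype_traits, the element converter, NA handling) are left out. It keeps the three central ideas: a trait that finds the function's result type, a CRTP base class, and a lazy node that applies the function only when an element is requested.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Generic case: functors expose a nested result_type.
template <typename F>
struct result_of_sketch {
    typedef typename F::result_type type;
};

// Partial specialisation: pointers to one-argument functions.
template <typename R, typename A>
struct result_of_sketch<R (*)(A)> {
    typedef R type;
};

// CRTP base (hypothetical): forwards to the derived class without virtual calls.
template <typename T, typename Derived>
class Base {
public:
    T operator[](std::size_t i) const {
        return static_cast<const Derived*>(this)->operator[](i);
    }
    std::size_t size() const {
        return static_cast<const Derived*>(this)->size();
    }
};

// Concrete input expression that owns its data.
class Numbers : public Base<double, Numbers> {
public:
    explicit Numbers(const std::vector<double>& d) : data_(d) {}
    double operator[](std::size_t i) const {
        assert(i < data_.size());
        return data_[i];
    }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// Lazy node: holds a reference to the input and a copy of the function.
template <typename T, typename In, typename Function>
class SapplySketch
    : public Base<typename result_of_sketch<Function>::type,
                  SapplySketch<T, In, Function> > {
public:
    typedef typename result_of_sketch<Function>::type result_type;
    typedef Base<T, In> VEC;

    SapplySketch(const VEC& vec, Function fun) : vec_(vec), fun_(fun) {}

    // The function is applied here, one element at a time, on demand.
    result_type operator[](std::size_t i) const { return fun_(vec_[i]); }
    std::size_t size() const { return vec_.size(); }

private:
    const VEC& vec_;
    Function fun_;
};

// Helper whose only job is to let the compiler deduce the template parameters.
template <typename T, typename In, typename Function>
SapplySketch<T, In, Function> sapply_sketch(const Base<T, In>& v, Function fun) {
    return SapplySketch<T, In, Function>(v, fun);
}

// A plain function and a functor, matching the two cases of the trait.
double square(double x) { return x * x; }

struct IsPositive {
    typedef bool result_type;
    bool operator()(double x) const { return x > 0.0; }
};

static_assert(std::is_same<result_of_sketch<double (*)(double)>::type,
                           double>::value,
              "function pointer case yields the return type");
static_assert(std::is_same<result_of_sketch<IsPositive>::type, bool>::value,
              "functor case yields the nested result_type");
```

As in the constructor of section 5.3.8, the node in this sketch stores the input by reference, so the input must outlive the node. Nested calls such as sapply_sketch(sapply_sketch(nums, square), IsPositive()) are safe as long as the result is indexed within the same statement.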
6. Summary
TBD
References
Abrahams D, Gurtovoy A (2004). C++ Template Metaprogramming: Concepts, Tools and Techniques from Boost and Beyond. Addison-Wesley, Boston.
Eddelbuettel D, François R (2011). “Rcpp: Seamless R and C++ Integration.” Journal of Statistical Software, 40(8), 1–18. URL http://www.jstatsoft.org/v40/i08/.
Eddelbuettel D, François R, Allaire J, Ushey K, Kou Q, Russell N, Chambers J, Bates D (2019a). Rcpp: Seamless R and C++ Integration. R package version 1.0.3, URL http://CRAN.R-Project.org/package=Rcpp.
Eddelbuettel D, François R, Bates D, Ni B (2019b). RcppArmadillo: Rcpp integration for Armadillo templated linear algebra library. R package version 0.9.800.1.0, URL http://CRAN.R-Project.org/package=RcppArmadillo.
R Core Team (2018). Writing R extensions. R Foundation for Statistical Computing, Vienna, Austria. URL http://CRAN.R-Project.org/doc/manuals/R-exts.html.
Sanderson C (2010). “Armadillo: An open source C++ Algebra Library for Fast Prototyping and Computationally Intensive Experiments.” Technical report, NICTA. URL http://arma.sf.net.
Vandevoorde D, Josuttis NM (2003). C++ Templates: The Complete Guide. Addison-Wesley, Boston.