Documentation ¶
Overview ¶
Package rsexp provides a translation between R and Go using cgo.
rsexp helps translate data from R's internal representation (a SEXP, hence the package name) in C to Go objects of standard types (floats, ints, strings, etc.). In cgo, C objects are always unexported, so the only way to use this package is via unsafe pointers, which can be cast both within the rsexp package and in other packages as a C.SEXP.
Hence, the workhorse object of the rsexp package is a GoSEXP:
type GoSEXP struct { Point unsafe.Pointer }
Whenever the rsexp package wants to read data that comes from R, or format data to send back to R, it uses the GoSEXP object. While there is no enforcement mechanism, a GoSEXP should always point to a SEXP object in C. The rsexp package will always use a GoSEXP assuming this is the case. Therefore, the cleanest and safest (also most convenient) way to create a new GoSEXP object is with the NewGoSEXP function.
Internally, the GoSEXP type has a method which dereferences the unsafe pointer and casts it as a C.SEXP:
func (g GoSEXP) deref() C.SEXP { return *(*C.SEXP)(g.Point) }
In other packages, a similar function can be written to do the same thing. In a perfect world, the rsexp package would export this method for other packages to use, but because C types are always unexported in cgo, rsexp's internal notion of a C.SEXP will be different from the package that imports it. In this specific case, Go's strict type safety is a bit of a hindrance.
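The pointer round trip described above can be tried without cgo. The following is a minimal sketch, not rsexp's own code: fakeSEXPREC and fakeSEXP are hypothetical stand-ins for R's record type and C.SEXP, and wrap is a hypothetical helper that stores the address of a SEXP the way a caller would pass &input.

```go
package main

import (
	"fmt"
	"unsafe"
)

// fakeSEXPREC and fakeSEXP are hypothetical stand-ins for R's SEXPREC and
// C.SEXP, so that this sketch compiles without cgo or the R headers.
type fakeSEXPREC struct{ sexptype int }
type fakeSEXP *fakeSEXPREC

// GoSEXP mirrors the struct shown in the overview: it holds an untyped pointer.
type GoSEXP struct{ Point unsafe.Pointer }

// wrap (hypothetical) stores the address of a fakeSEXP in a GoSEXP.
func wrap(s *fakeSEXP) GoSEXP { return GoSEXP{Point: unsafe.Pointer(s)} }

// derefGoSEXP casts the pointer back to this package's own SEXP type.
func derefGoSEXP(g GoSEXP) fakeSEXP { return *(*fakeSEXP)(g.Point) }

func main() {
	var s fakeSEXP = &fakeSEXPREC{sexptype: 14}
	g := wrap(&s)
	back := derefGoSEXP(g)
	fmt.Println(back == s, back.sexptype) // prints "true 14"
}
```

Each importing package would define its own version of derefGoSEXP, because the return type has to be that package's own C.SEXP.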
Sending data from R to Go ¶
The rsexp package uses R's internal functions to extract data from a SEXP and into a Go typed object. More about these objects can be found in R's documentation at https://cran.r-project.org/doc/manuals/r-release/R-ints.html#SEXPs. In short, everything in R is a SEXP, which is a pointer to a SEXPREC, which in turn contains some header information and a pointer to the data itself. A SEXP can point to a SEXPREC of up to a couple dozen types. The rsexp package only concerns itself with 5 of them:
- REALSXP, akin to a Go slice of float64s
- INTSXP, akin to a Go slice of integers
- CHARSXP, akin to a Go string
- STRSXP, akin to a Go slice of strings
- VECSXP, which is an R list and is akin to an rsexp.List. It has no parallel in base Go.
In C, the type of data a SEXP points to can be found using the “TYPEOF” function. It returns an integer, which can be matched to the relevant types based on the constant enumerations declared in this package.
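As a rough illustration of matching a TYPEOF result against the constants, the sketch below maps each supported type code to the Go type it is akin to. The constant values are the ones declared in this package; describeType and errUnsupported are hypothetical names used only here.

```go
package main

import (
	"errors"
	"fmt"
)

// SEXPTYPE values copied from the constants documented in this package.
const (
	CHARSXP = 9
	INTSXP  = 13
	REALSXP = 14
	STRSXP  = 16
	VECSXP  = 19
)

// errUnsupported is a hypothetical error used only in this sketch.
var errUnsupported = errors.New("unsupported SEXPTYPE")

// describeType (hypothetical) maps a TYPEOF result to its Go counterpart.
func describeType(sexptype int) (string, error) {
	switch sexptype {
	case REALSXP:
		return "[]float64", nil
	case INTSXP:
		return "[]int", nil
	case CHARSXP:
		return "string", nil
	case STRSXP:
		return "[]string", nil
	case VECSXP:
		return "rsexp.List", nil
	default:
		return "", errUnsupported
	}
}

func main() {
	for _, t := range []int{14, 16, 3} {
		name, err := describeType(t)
		fmt.Println(t, name, err)
	}
}
```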
The GoSEXP has several methods which will pull the data out of an underlying SEXP and into a Go slice (or Matrix). They are:
- func (g GoSEXP) AsFloats() ([]float64, error)
- func (g GoSEXP) AsInts() ([]int, error)
- func (g GoSEXP) AsStrings() ([]string, error)
- func (g GoSEXP) AsMatrix(nrow, ncol int) (Matrix, error)
Each of these functions checks the SEXPTYPE of the underlying SEXP and will return an error if it doesn't match the method that was called.
Matrices are a special case. In R, matrices are represented internally as a single vector containing all of the underlying data. However, using cgo to get the dimension metadata along with the underlying data is not feasible. Therefore, the user needs to know the size of the matrix being sent from R to Go *a priori*.
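The size check this implies can be sketched in plain Go. This is an assumption about the shape of the logic, not the package's implementation: floatsToMatrix and errSizeMismatch are hypothetical names, and the Matrix struct is copied from the type documented in this package.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix mirrors the struct documented in this package.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errSizeMismatch stands in for rsexp.SizeMismatch in this sketch.
var errSizeMismatch = errors.New("operation is not possible with given input dimensions")

// floatsToMatrix (hypothetical) attaches caller-supplied dimensions to a flat
// vector, rejecting dimensions that do not account for every element.
func floatsToMatrix(data []float64, nrow, ncol int) (Matrix, error) {
	if nrow < 0 || ncol < 0 || nrow*ncol != len(data) {
		return Matrix{}, errSizeMismatch
	}
	return Matrix{Nrow: nrow, Ncol: ncol, Data: data}, nil
}

func main() {
	data := []float64{1, 2, 3, 4, 5, 6}

	m, err := floatsToMatrix(data, 3, 2)
	fmt.Println(m.Nrow, m.Ncol, err) // prints "3 2 <nil>"

	_, err = floatsToMatrix(data, 4, 2)
	fmt.Println(err) // the dimensions claim 8 elements, but only 6 exist
}
```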
Sending data from Go to R ¶
Sending data from Go to R is done by creating a GoSEXP (which will always point to a newly created C.SEXP) from one of the supported Go types:
- func Float2sexp(in []float64) GoSEXP
- func Int2sexp(in []int) GoSEXP
- func String2sexp(in []string) GoSEXP
- func Matrix2sexp(in Matrix) GoSEXP
The SEXPTYPE of the SEXP returned by each of these functions is the R internal type that most naturally matches the Go input.
Because R does not allow functions to have multiple returns, the preferred way to return multiple pieces of data from a function is a list. Therefore, the rsexp package contains a wrapper type List, which is just a slice of GoSEXP objects.
type List []GoSEXP
One advantage of using a GoSEXP is that the SEXP underlying each element of the List can be of any type without violating Go's type safety. R lists work the same way. The function List2sexp will condense all the SEXPs underlying the List to a single SEXP/GoSEXP that can be sent back to R as a VECSXP, or list.
Once again, matrices are a special case. Because the underlying data is simply a vector which is indistinguishable from a normal numeric vector, the Matrix2sexp function returns a GoSEXP that points to a list rather than a numeric vector. The list always has two elements. The first is a pair of integers with the size information, and the second is the data itself. That way, the matrix is easier to create in R than it would be otherwise:
# This is some R code to quickly make an R matrix from a matrix, formatted as a list, sent from Go
rsexp.ParseGoMatrix <- function(rsexpMatrix) {
  # The matrix is the second element of the list, while the first is an integer vector that is c(nrow,ncol)
  outMat = matrix(data=rsexpMatrix[[2]], nrow=rsexpMatrix[[1]][1], ncol=rsexpMatrix[[1]][2])
  return(outMat)
}
In order to send data back to R, a GoSEXP must be dereferenced and cast as a C.SEXP, similar to rsexp's internal deref method. For convenience, the following function can be copy/pasted into other packages:
func derefGoSEXP(g GoSEXP) C.SEXP { return *(*C.SEXP)(g.Point) }
Building Your Package and Calling Functions in R ¶
In order for a package of Go functions to be callable from R, they must take any number of C.SEXP objects as input and return a single C.SEXP object. They also need to be marked to be exported, by including an export statement immediately above the function signature. Note that if there is a space between the comment slashes and the export mark, Go will parse it as a vanilla comment and the function won't be exported.
//export DoubleVector
func DoubleVector(C.SEXP) C.SEXP {}
The Go package must then be compiled to a C shared library:
go build -o <libName>.so -buildmode=c-shared <package>
Finally, the Go functions can be called in R using the .Call function:
output = .Call("DoubleVector", input)
For a more complete demonstration, see the example below, or the demo package at https://github.com/EMurray16/Rgo/demo.
Example ¶
The code below contains a functional example of a Go function that can be called from R:
package main

// We need to include the shared R headers here.
// One way to find them is via the Rgo/rsexp directory.
// Another way is to find them from your local R installation:
//   - Typical Linux: /usr/share/R/include/
//   - Typical MacOS: /Library/Frameworks/R.framework/Headers/
// If all else fails, you can also find the required header files wherever Rgo is located on your computer.
// For example, on my computer all github packages are put in /Go/mod/pkg/github.com/...

// Only the comment group directly above import "C" is handed to the C
// compiler, so the notes above are kept in a separate comment group.

// #cgo CFLAGS: -I/Go/mod/pkg/github.com/EMurray16/Rgo/rsexp/Rheader/
// #include <Rinternals.h>
import "C"

import (
	"fmt"

	"github.com/EMurray16/Rgo/rsexp"
)

//export DoubleVector
func DoubleVector(input C.SEXP) C.SEXP {
	// cast the incoming SEXP as a GoSEXP
	gs, err := rsexp.NewGoSEXP(&input)
	if err != nil {
		fmt.Println(err)
		return nil
	}

	// create a slice from the SEXP's data
	floats, err := gs.AsFloats()
	if err != nil {
		fmt.Println(err)
		return nil
	}

	// double each element of the slice
	for i := range floats {
		floats[i] *= 2
	}

	// create a SEXP and GoSEXP from the new data
	outputGoSEXP := rsexp.Float2sexp(floats)

	// return the result, dereferenced and cast as a C.SEXP
	return *(*C.SEXP)(outputGoSEXP.Point)
}

// -buildmode=c-shared requires a main package with a main function; it can be empty.
func main() {}
Once it is compiled to a shared library, the function can be called using R's .Call() interface:
input = c(0,2.71,3.14)
output = .Call("DoubleVector", input)
print(output)
The result would look like this:
[0, 5.42, 6.28]
Index ¶
- Constants
- Variables
- func AreMatricesEqual(A, B Matrix) bool
- func AreMatricesEqualTol(A, B Matrix, tolerance float64) bool
- type GoSEXP
- type List
- type Matrix
- func CopyMatrix(in Matrix) (out Matrix)
- func CreateIdentity(size int) (*Matrix, error)
- func CreateZeros(Nrow, Ncol int) (*Matrix, error)
- func MatrixAdd(A, B *Matrix) (C *Matrix, err error)
- func MatrixMultiply(A, B *Matrix) (C *Matrix, err error)
- func NewMatrix(Nrow, Ncol int, data []float64) (*Matrix, error)
- func (m *Matrix) AddConstant(c float64)
- func (m *Matrix) AppendCol(data []float64) error
- func (m *Matrix) AppendRow(data []float64) error
- func (m *Matrix) CreateTranspose() *Matrix
- func (m *Matrix) GetCol(ind int) ([]float64, error)
- func (m *Matrix) GetInd(row, col int) (float64, error)
- func (m *Matrix) GetRow(ind int) ([]float64, error)
- func (m *Matrix) MultiplyConstant(c float64)
- func (m *Matrix) SetCol(ind int, data []float64) error
- func (m *Matrix) SetInd(row, col int, data float64) error
- func (m *Matrix) SetRow(ind int, data []float64) error
Constants ¶
const (
	CHARSXP = 9
	INTSXP  = 13
	REALSXP = 14
	// A STRSXP is actually a vector of strings, where each element points to a CHARSXP.
	STRSXP = 16
	// It's not obvious from the name, but a VECSXP is a list. Each element of a
	// VECSXP is a SEXP and can be of any type.
	VECSXP = 19
)
These constants are enumerations of the SEXPTYPEs that are part of R's internals. There are about two dozen in all, but the rsexp package only supports 5 of them.
Variables ¶
var (
	SizeMismatch     = errors.New("operation is not possible with given input dimensions")
	InvalidIndex     = errors.New("given index is impossible (ie < 0)")
	IndexOutOfBounds = errors.New("index is out of bounds (ie too large)")
)
All matrix operations check inputs for validity and will return errors where applicable.
var ImpossibleMatrix = errors.New("matrix size and underlying data length are not compatible")
Any Matrix function or method which can return an error will first check the input Matrix or matrices for validity and return an ImpossibleMatrix error if there is an inconsistency between the matrix data and dimensions.
var NotASEXP = errors.New("non-SEXP object provided to a function that needs a SEXP")
NotASEXP is returned by NewGoSEXP when it cannot coerce the input object into a *C.SEXP.
var TypeMismatch = errors.New("input SEXP type does not match desired output type")
TypeMismatch is most often returned from an AsX method when the caller tries to extract the incorrect type from a SEXP.
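Because these errors are package-level sentinel values, callers can compare against them directly or with errors.Is. The sketch below is only an illustration: asFloats is a hypothetical stand-in for GoSEXP.AsFloats that checks a type code without touching any R data, and TypeMismatch is redeclared locally so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// TypeMismatch is redeclared here with the message documented above.
var TypeMismatch = errors.New("input SEXP type does not match desired output type")

// asFloats (hypothetical) only checks the type code, then returns the data.
func asFloats(sexptype int, data []float64) ([]float64, error) {
	if sexptype != 14 { // 14 is REALSXP
		return nil, TypeMismatch
	}
	return data, nil
}

func main() {
	// 13 is INTSXP, so asking for floats is a type mismatch.
	_, err := asFloats(13, nil)
	if errors.Is(err, TypeMismatch) {
		fmt.Println("wrong extractor for this SEXP:", err)
	}
}
```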
Functions ¶
func AreMatricesEqual ¶
AreMatricesEqual returns true if the input matrices are of the same dimension and have identical data vectors. It's important to note that this function uses strict equality - even if elements of two matrices differ by floating point error, it will return false.
func AreMatricesEqualTol ¶
AreMatricesEqualTol is the same as AreMatricesEqual, except the data vectors are checked in relation to the input tolerance allowed. If any elements differ by more than the tolerance, this function will return false.
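The difference between the two comparisons shows up with ordinary floating point arithmetic. The following sketch is an assumption about how such checks could be written, not the package's source; equalStrict and equalTol are hypothetical names, and Matrix is copied from the type documented in this package.

```go
package main

import (
	"fmt"
	"math"
)

// Matrix mirrors the struct documented in this package.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// sameShape reports whether two matrices have identical dimensions.
func sameShape(a, b Matrix) bool {
	return a.Nrow == b.Nrow && a.Ncol == b.Ncol && len(a.Data) == len(b.Data)
}

// equalStrict (hypothetical) requires every element to be exactly equal.
func equalStrict(a, b Matrix) bool {
	if !sameShape(a, b) {
		return false
	}
	for i := range a.Data {
		if a.Data[i] != b.Data[i] {
			return false
		}
	}
	return true
}

// equalTol (hypothetical) allows each element to differ by at most tol.
func equalTol(a, b Matrix, tol float64) bool {
	if !sameShape(a, b) {
		return false
	}
	for i := range a.Data {
		if math.Abs(a.Data[i]-b.Data[i]) > tol {
			return false
		}
	}
	return true
}

func main() {
	// Variables force the sum to be computed in float64 at run time.
	x, y := 0.1, 0.2
	a := Matrix{Nrow: 1, Ncol: 1, Data: []float64{x + y}}
	b := Matrix{Nrow: 1, Ncol: 1, Data: []float64{0.3}}
	fmt.Println(equalStrict(a, b), equalTol(a, b, 1e-9)) // prints "false true"
}
```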
Types ¶
type GoSEXP ¶
GoSEXP wraps an unsafe pointer, which should always point towards a C.SEXP object. Because cgo doesn't allow for the exporting of C types, a GoSEXP is used as a translation object to pass a C.SEXP into the rsexp package. The preferred way to create a new GoSEXP object is with the function NewGoSEXP.
A GoSEXP can be dereferenced in any package and asserted as a C.SEXP. Internally, the rsexp package uses an unexported method to do this:
func (g GoSEXP) deref() C.SEXP { return *(*C.SEXP)(g.Point) }
Other packages can interact with a GoSEXP in the exact same way and get the same results, but must have their own dereference implementation which returns their own package's definition of a C.SEXP.
func Float2sexp ¶
Float2sexp creates a SEXP, of type REALSXP, from data contained in a slice of floats. The output of this function is a GoSEXP, which can be dereferenced and asserted as a C.SEXP in an external package and returned to R. In R, the result is a numeric (aka double) vector.
func Int2sexp ¶
Int2sexp creates a SEXP, of type INTSXP, from data contained in a slice of integers. The output of this function is a GoSEXP, which can be dereferenced and asserted as a C.SEXP in an external package and returned to R. In R, the result is an integer vector.
func List2sexp ¶
List2sexp creates a SEXP of type VECSXP from data contained in a List, or slice of GoSEXPs. The input of this function is a list of GoSEXP objects, which should already point to SEXPs of the correct types. The output of this function is a GoSEXP, which can be dereferenced and asserted as a C.SEXP in an external package and returned to R. In R the result is a list.
func Matrix2sexp ¶
Matrix2sexp creates a SEXP of type VECSXP (a list) from the data contained in a matrix. The resulting SEXP will always have two elements. The first is a length 2 vector of integers, containing the number of rows and number of columns of the matrix in that order. The second is a SEXP of type REALSXP, created by converting the slice of matrix data into a numeric vector.
func NewGoSEXP ¶
NewGoSEXP creates a new GoSEXP from the input object. Because C types are not able to be exported by cgo, the input is the dreaded empty interface. Despite the empty interface, the input to NewGoSEXP must always be a pointer to a C.SEXP, like so:
// assume s is a C.SEXP
gs, err := NewGoSEXP(&s)
To enforce as much type safety as possible, NewGoSEXP will return a NotASEXP error if the input is not a *C.SEXP. It will also return a TypeMismatch error if the SEXP is not of a type that the rsexp package supports, like a list or a closure.
For a demonstration of how to use NewGoSEXP, see the example provided in the documentation or the demo of this package that can be found in the same repository on Github.
func String2sexp ¶
String2sexp creates a SEXP of type STRSXP from data contained in a slice of strings. The output of this function is a GoSEXP, which can be dereferenced and asserted as a C.SEXP in an external package and returned to R. In R, the result is a string vector.
func (GoSEXP) AsFloats ¶
AsFloats reads data from a SEXP into a slice of float64s. This function is only compatible with SEXPs which are of SEXPTYPE 14 - REALSXP. Attempts to read SEXPs of other types using this function will result in a TypeMismatch error.
func (GoSEXP) AsInts ¶
AsInts reads data from a SEXP into a slice of ints. This function is only compatible with SEXPs which are of SEXPTYPE 13 - INTSXP. Attempts to read SEXPs of other types using this function will result in a TypeMismatch error.
func (GoSEXP) AsMatrix ¶
AsMatrix reads data from a SEXP into a Matrix type. This function is only compatible with SEXPs which are of SEXPTYPE 14 - REALSXP. It simply wraps a call to AsFloats and attaches the input matrix size. Because there is no interface to infer the size of a matrix from the SEXP itself, the size of a matrix must be known a priori and provided as an input. If the input dimensions don't match the length of the vector in the SEXP, a SizeMismatch error will be returned.
type List ¶
type List []GoSEXP
List is a Go correlate to R's lists. The List type is a slice of GoSEXPs, which can be of any type. Just as in R, a List is the preferred way to return multiple objects from a function. However, a List is not the preferred way to provide multiple inputs - both R and Go support any number of function arguments.
type Matrix ¶
type Matrix struct {
// The Matrix header - two integers which specify its dimension
Nrow, Ncol int
// The data in a matrix is represented as a single slice of data
Data []float64
}
Matrix is a representation of a matrix in Go that mirrors how matrices are represented in R. The Matrix contains a vector of all the data, and a header of two integers that contain the dimensions of the matrix. The Data vector is organized so that column indices are together, but row indices are not. In other words, the data can be thought of as a concatenation of several vectors, each of which contains the data for one column.
For example, the following Matrix:
Matrix{Nrow: 3, Ncol: 2, Data: []float64{1.1,2.2,3.3,4.4,5.5,6.6}}
will look like this:
[1.1, 4.4
 2.2, 5.5
 3.3, 6.6]
Matrix data is accessed using 0-based indexing, which is natural in Go but differs from R. For example, the 0th row in the example matrix is [1.1, 4.4], while the "1st" row is [2.2, 5.5].
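With this column-major layout, element (row, col) sits at position col*Nrow + row of the Data slice. The sketch below illustrates that arithmetic on the example matrix. The helper names at, rowOf and colOf are hypothetical, and unlike the package's methods they do no bounds checking.

```go
package main

import "fmt"

// Matrix mirrors the struct documented in this package.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// at (hypothetical) returns element (row, col) using 0-based indices.
func at(m Matrix, row, col int) float64 {
	return m.Data[col*m.Nrow+row]
}

// rowOf (hypothetical) copies one row; its elements are Nrow apart in Data.
func rowOf(m Matrix, row int) []float64 {
	out := make([]float64, m.Ncol)
	for col := range out {
		out[col] = m.Data[col*m.Nrow+row]
	}
	return out
}

// colOf (hypothetical) copies one column, which is a contiguous run of Data.
func colOf(m Matrix, col int) []float64 {
	out := make([]float64, m.Nrow)
	copy(out, m.Data[col*m.Nrow:(col+1)*m.Nrow])
	return out
}

func main() {
	m := Matrix{Nrow: 3, Ncol: 2, Data: []float64{1.1, 2.2, 3.3, 4.4, 5.5, 6.6}}
	fmt.Println(rowOf(m, 0)) // prints "[1.1 4.4]"
	fmt.Println(rowOf(m, 1)) // prints "[2.2 5.5]"
	fmt.Println(colOf(m, 1)) // prints "[4.4 5.5 6.6]"
	fmt.Println(at(m, 2, 1)) // prints "6.6"
}
```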
func CopyMatrix ¶
CopyMatrix creates an exact copy of an existing matrix. The copies are independent, so that the output matrix can be changed without changing the input matrix and vice versa.
func CreateIdentity ¶
CreateIdentity creates an identity matrix, which is always square by definition, of the input dimension. An identity matrix is a matrix with all 0s, except for having a 1 in each element of the diagonal. If the given size is impossible, it will return an InvalidIndex error.
func CreateZeros ¶
CreateZeros creates a matrix of the given dimensions in which every element is 0. If the given dimensions are nonsensical (negative, for example) it will return an InvalidIndex error.
func MatrixAdd ¶
MatrixAdd adds two matrices. Matrix addition is done by adding each element of the two matrices together, so they must be of identical size. If they are not, a SizeMismatch error will be returned.
func MatrixMultiply ¶
MatrixMultiply performs a matrix multiplication of two matrices. This is not an element-wise multiplication, but a true multiplication as defined in elementary linear algebra. In matrix multiplication, order matters. Two matrices A and B can only be multiplied, as A times B, if A has the same number of columns as B has rows. The result has as many rows as A and as many columns as B. If the dimensions of the input matrices do not allow for a multiplication, a SizeMismatch error is returned.
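As a worked illustration of the dimension rule in standard linear algebra, the sketch below multiplies a 2x3 matrix by a 3x2 matrix stored in column-major order. For A times B, the column count of A must equal the row count of B. The function multiply and the error errSizeMismatch are hypothetical stand-ins, not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix mirrors the struct documented in this package.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errSizeMismatch stands in for rsexp.SizeMismatch in this sketch.
var errSizeMismatch = errors.New("operation is not possible with given input dimensions")

// multiply (hypothetical) computes A times B for column-major matrices.
func multiply(a, b Matrix) (Matrix, error) {
	if a.Ncol != b.Nrow {
		return Matrix{}, errSizeMismatch
	}
	out := Matrix{Nrow: a.Nrow, Ncol: b.Ncol, Data: make([]float64, a.Nrow*b.Ncol)}
	for col := 0; col < b.Ncol; col++ {
		for row := 0; row < a.Nrow; row++ {
			var sum float64
			for k := 0; k < a.Ncol; k++ {
				// A(row, k) * B(k, col)
				sum += a.Data[k*a.Nrow+row] * b.Data[col*b.Nrow+k]
			}
			out.Data[col*out.Nrow+row] = sum
		}
	}
	return out, nil
}

func main() {
	// A is 2x3 with rows [1 2 3] and [4 5 6].
	a := Matrix{Nrow: 2, Ncol: 3, Data: []float64{1, 4, 2, 5, 3, 6}}
	// B is 3x2 with rows [7 8], [9 10] and [11 12].
	b := Matrix{Nrow: 3, Ncol: 2, Data: []float64{7, 9, 11, 8, 10, 12}}

	c, err := multiply(a, b)
	fmt.Println(c.Nrow, c.Ncol, c.Data, err) // prints "2 2 [58 139 64 154] <nil>"

	_, err = multiply(a, a) // 2x3 times 2x3 is not defined
	fmt.Println(err)
}
```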
func NewMatrix ¶
NewMatrix creates a new matrix given a vector of data. The number of rows and columns must be provided, and it assumes the data is already in the order a Matrix should be, with column indexes adjacent. In other words, the data vector should be a concatenation of several vectors, one for each column. NewMatrix makes a copy of the input slice, so that changing the slice later will not affect the data in the matrix. If the provided dimensions don't match the length of the provided data, an ImpossibleMatrix error will be returned.
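The effect of copying the input slice can be seen in a small sketch. It assumes a constructor shaped like the one described, written here as the hypothetical newMatrix with a locally declared errImpossibleMatrix; it is not the package's source.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix mirrors the struct documented in this package.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errImpossibleMatrix stands in for rsexp.ImpossibleMatrix in this sketch.
var errImpossibleMatrix = errors.New("matrix size and underlying data length are not compatible")

// newMatrix (hypothetical) validates the dimensions and copies the data.
func newMatrix(nrow, ncol int, data []float64) (*Matrix, error) {
	if nrow < 0 || ncol < 0 || nrow*ncol != len(data) {
		return nil, errImpossibleMatrix
	}
	own := make([]float64, len(data))
	copy(own, data)
	return &Matrix{Nrow: nrow, Ncol: ncol, Data: own}, nil
}

func main() {
	src := []float64{1, 2, 3, 4}
	m, err := newMatrix(2, 2, src)
	if err != nil {
		fmt.Println(err)
		return
	}

	src[0] = 99            // change the caller's slice afterwards
	fmt.Println(m.Data[0]) // prints "1": the matrix kept its own copy

	_, err = newMatrix(3, 2, src)
	fmt.Println(err) // four values cannot fill a 3x2 matrix
}
```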
func (*Matrix) AddConstant ¶
AddConstant adds a constant to every element of a matrix. There is no SubtractConstant method. To subtract a constant N from a matrix, add its negative, -N.
func (*Matrix) AppendCol ¶
AppendCol appends a column onto an existing matrix and updates the dimension metadata accordingly. If the length of the provided column is not equal to the number of rows in the matrix, it will return a SizeMismatch error.
func (*Matrix) AppendRow ¶
AppendRow appends a row onto an existing matrix and updates the dimension metadata accordingly. If the length of the provided row is not equal to the number of columns in the matrix, it will return a SizeMismatch error.
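In a column-major layout the two append operations are not symmetric: a new column can be added at the end of the data, while a new row contributes one element to the end of every existing column. The sketch below shows one way this could be done. The functions appendCol and appendRow and the error errSizeMismatch are hypothetical and are not taken from the package's source.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix mirrors the struct documented in this package.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errSizeMismatch stands in for rsexp.SizeMismatch in this sketch.
var errSizeMismatch = errors.New("operation is not possible with given input dimensions")

// appendCol (hypothetical) adds a column at the end of the data.
func appendCol(m *Matrix, col []float64) error {
	if len(col) != m.Nrow {
		return errSizeMismatch
	}
	m.Data = append(m.Data, col...)
	m.Ncol++
	return nil
}

// appendRow (hypothetical) rebuilds the data, adding one value per column.
func appendRow(m *Matrix, row []float64) error {
	if len(row) != m.Ncol {
		return errSizeMismatch
	}
	out := make([]float64, 0, (m.Nrow+1)*m.Ncol)
	for col := 0; col < m.Ncol; col++ {
		out = append(out, m.Data[col*m.Nrow:(col+1)*m.Nrow]...)
		out = append(out, row[col])
	}
	m.Data = out
	m.Nrow++
	return nil
}

func main() {
	m := &Matrix{Nrow: 2, Ncol: 2, Data: []float64{1, 2, 3, 4}}

	fmt.Println(appendCol(m, []float64{5, 6}), m.Data)    // prints "<nil> [1 2 3 4 5 6]"
	fmt.Println(appendRow(m, []float64{7, 8, 9}), m.Data) // prints "<nil> [1 2 7 3 4 8 5 6 9]"
	fmt.Println(appendRow(m, []float64{1}))               // wrong length, so an error
}
```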
func (*Matrix) CreateTranspose ¶
CreateTranspose creates a new matrix which is a transpose of the input matrix. The output matrix is created from a copy of the input matrix such that they can be altered independently.
func (*Matrix) GetCol ¶
GetCol gets the column of the matrix specified by the provided index, using 0-based indexing. The first column of a matrix is index 0, even though it may be more intuitive that it should be 1. If the input index is too big, it will return an IndexOutOfBounds error. If you get this error, there's a good chance it's just an off-by-one error. The resulting slice does not point to the matrix itself, so it can be edited without altering the matrix.
func (*Matrix) GetInd ¶
GetInd returns the value of the matrix element at the given row and column, using 0-based indexing.
func (*Matrix) GetRow ¶
GetRow gets the row of the matrix specified by the provided index, using 0-based indexing. The first row of a matrix is index 0, even though it may be more intuitive that it should be 1. If the input index is too big, it will return an IndexOutOfBounds error. If you get this error, there's a good chance it's just an off-by-one error. The resulting slice does not point to the matrix itself, so it can be edited without altering the matrix.
func (*Matrix) MultiplyConstant ¶
MultiplyConstant multiplies each element of a matrix by a constant. There is no DivideConstant method. To divide a matrix by a constant N, multiply it by its reciprocal, 1/N.
func (*Matrix) SetCol ¶
SetCol sets the column of the matrix, specified by the input index, to match the data provided. If the length of the provided column is not the same as the number of rows in the matrix, it will return a SizeMismatch error.