Creates a scaler object that stores column means and standard deviations so that the same scaling can be applied to a similar dataset

scaler(data, center = TRUE, scale = TRUE)

Arguments

data

(numeric matrix or numeric dataframe) The dataset

center

(flag) whether to center the columns or not

scale

(flag) whether to scale the columns or not

Details

This computes the mean and standard deviation of each column and stores them for later use on a dataset via the predict method. If scale is TRUE, the columns are automatically centered even if center is set to FALSE.
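The fit-then-predict idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the row split and the names `train`, `test`, `centers`, and `scales` are hypothetical, and the data are converted to a matrix for simplicity.

```r
# Base R sketch only; not the package's implementation.
train <- as.matrix(mtcars[1:20, ])   # hypothetical "training" rows
test  <- as.matrix(mtcars[21:32, ])  # hypothetical "new" rows

# "Fit": column means and standard deviations of the training data
centers <- colMeans(train)
scales  <- apply(train, 2, sd)

# "Scale": subtract the training means, then divide by the training sds
test_scaled <- sweep(sweep(test, 2, centers, "-"), 2, scales, "/")

# "Unscale": reverse the two steps in the opposite order
test_back <- sweep(sweep(test_scaled, 2, scales, "*"), 2, centers, "+")

# The round trip recovers the original values (up to floating point error)
stopifnot(isTRUE(all.equal(test_back, test)))
```

The point of storing the training statistics is that new data are scaled with the training means and standard deviations rather than their own.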

The scaler class provides a model-predict interface to scale and unscale matrices and dataframes. The predict method supports a type argument: "scale" or "unscale". The scaler_ function constructs a scaler object from a user-supplied centering vector (used in place of the column means, e.g. column-wise medians) and a scaling vector (used in place of the column standard deviations, e.g. column-wise mean absolute deviations). The scaler class is meant to aid analysis; for performance-critical work, use Rfast::standardize()
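As a rough illustration of the alternative centering and scaling vectors mentioned above, the following base R sketch computes column-wise medians and mean absolute deviations and applies them by hand. How these vectors are passed to scaler_ is not shown, because its signature is not given in this section. Measuring the deviation around the median is an assumption made for the example.

```r
# Base R sketch only; it does not call scaler_().
x <- as.matrix(mtcars)

# Alternative centering vector: column-wise medians
centers <- apply(x, 2, median)

# Alternative scaling vector: column-wise mean absolute deviation
# (measured around the median here; an illustrative choice)
scales <- colMeans(abs(sweep(x, 2, centers, "-")))

# Apply the custom centering and scaling
x_scaled <- sweep(sweep(x, 2, centers, "-"), 2, scales, "/")

# Each scaled column now has median 0 and mean absolute deviation 1
stopifnot(
  isTRUE(all.equal(unname(apply(x_scaled, 2, median)), rep(0, ncol(x)))),
  isTRUE(all.equal(unname(colMeans(abs(x_scaled))), rep(1, ncol(x))))
)
```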

Examples

set.seed(1)
n_70 = round(nrow(mtcars) * 0.7)
index = sample(1:nrow(mtcars), n_70)
mtcars_A = mtcars[index, ]   # 70% sample used to build the model
mtcars_B = mtcars[-index, ]  # remaining 30% held out
model = scaler(mtcars_A) # creates model based on mtcars_A
mtcars_1 = predict(model, newdata = mtcars_A) # scale mtcars_A
mtcars_2 = predict(model, newdata = mtcars_B) # scale mtcars_B using model
class(mtcars_2) # does not convert to matrix
#> [1] "data.frame"
mtcars_2_B = predict(model, newdata = mtcars_2, type = "unscale")
all.equal(mtcars_2_B, mtcars_B)
#> [1] "Names: 11 string mismatches"
#> [2] "Attributes: < Component “row.names”: Modes: numeric, character >"
#> [3] "Attributes: < Component “row.names”: target is numeric, current is character >"