Reorder the rules/rows of a rulelist

Implements a greedy strategy: rules are added one at a time, each step choosing the rule that maximizes (or minimizes) a metric.
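The greedy strategy described above can be sketched in base R. This is only an illustration under simplifying assumptions, not the package's implementation: each rule is represented by the (hypothetical) set of row indices it covers, the metric is cumulative coverage, and the names `covered_by` and `greedy_order` are invented for this example.

```r
# Hypothetical data: each rule is represented by the row indices it covers.
covered_by <- list(
  r1 = c(1, 2, 3),
  r2 = c(3, 4, 5, 6, 7),
  r3 = c(1, 2),
  r4 = c(8)
)

# Illustrative helper (not part of any package): at each step, append the
# rule that yields the largest cumulative coverage.
greedy_order <- function(covered_by) {
  remaining <- names(covered_by)
  chosen <- character(0)
  covered <- integer(0)
  while (length(remaining) > 0) {
    # Cumulative coverage if each remaining rule were appended next
    gain <- vapply(
      remaining,
      function(r) length(union(covered, covered_by[[r]])),
      integer(1)
    )
    # which.max returns the first maximum, so ties fall back to current order
    best <- remaining[which.max(gain)]
    chosen <- c(chosen, best)
    covered <- union(covered, covered_by[[best]])
    remaining <- setdiff(remaining, best)
  }
  chosen
}

greedy_order(covered_by)
#> [1] "r2" "r1" "r4" "r3"
```

In this sketch `r1` and `r3` both bring the cumulative coverage to 7 at the second step, and the tie is resolved by the existing order, so `r1` is picked first.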

# S3 method for class 'rulelist'
reorder(x, metric = "cumulative_coverage", minimize = FALSE, init = NULL, ...)

Arguments

x: A rulelist
metric: (character vector or named list) Names of metrics, or custom function(s). See calculate. The (n+1)th metric is used only to break a tie at the nth metric, similar to base::order. If a tie remains at the last metric, the existing row order of the rulelist decides.
minimize: (logical vector) Whether to minimize the metric. Either a single TRUE/FALSE or a logical vector of the same length as metric.
init: (positive integer) Number of initial rows to keep in place; reordering begins after them.
...: passed to calculate
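The tie-breaking behaviour of `metric` is described by analogy with base::order, which can be seen in isolation as follows. The metric values here are made up for illustration; the snippet only demonstrates how base::order treats a second key and does not call the package.

```r
# Hypothetical metric values for four candidate rules
coverage <- c(10, 25, 25, 7)
overlap  <- c(0, 4, 1, 0)

# One key: maximize coverage. Candidates 2 and 3 tie, so their
# original order is kept.
ord_one <- order(-coverage)
ord_one
#> [1] 2 3 1 4

# Two keys: maximize coverage, then minimize overlap among ties.
# Candidate 3 (overlap 1) now comes before candidate 2 (overlap 4).
ord_two <- order(-coverage, overlap)
ord_two
#> [1] 3 2 1 4
```

The second key only matters where the first key ties, which mirrors how the (n+1)th metric is consulted only on a match at the nth level.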

Examples

library("magrittr")
library("tidyrules") # assumed source of tidy(), set_validation_data(), set_keys()
att = modeldata::attrition
tidy_c5 =
  C50::C5.0(Attrition ~ ., data = att, rules = TRUE) %>%
  tidy() %>%
  set_validation_data(att, "Attrition") %>%
  set_keys(NULL) %>%
  head(5)

# with defaults
reorder(tidy_c5)
#> ---- Rulelist --------------------------------
#> ▶ Keys: NULL
#> ▶ Number of rules: 5
#> ▶ Model type: C5
#> ▶ Estimation type: classification
#> ▶ Is validation data set: TRUE
#> 
#> 
#>   rule_nbr trial_nbr LHS      RHS   support confidence  lift cumulative_coverage
#>      <int>     <int> <chr>    <fct>   <int>      <dbl> <dbl>               <dbl>
#> 1        3         1 ( Daily… Yes        13      0.933   5.8                  13
#> 2        4         1 ( JobRo… No        195      0.924   1.1                 208
#> 3        2         1 ( Envir… No        521      0.941   1.1                 645
#> 4        1         1 ( JobLe… Yes        16      0.944   5.9                 656
#> 5        5         1 ( Envir… Yes         9      0.909   5.6                 664
#> ----------------------------------------------

# use 'cumulative_overlap' to break ties (if any)
reorder(tidy_c5, metric = c("cumulative_coverage", "cumulative_overlap"))
#> ---- Rulelist --------------------------------
#> ▶ Keys: NULL
#> ▶ Number of rules: 5
#> ▶ Model type: C5
#> ▶ Estimation type: classification
#> ▶ Is validation data set: TRUE
#> 
#> 
#>   rule_nbr trial_nbr LHS      RHS   support confidence  lift cumulative_coverage
#>      <int>     <int> <chr>    <fct>   <int>      <dbl> <dbl>               <dbl>
#> 1        3         1 ( Daily… Yes        13      0.933   5.8                  13
#> 2        4         1 ( JobRo… No        195      0.924   1.1                 208
#> 3        2         1 ( Envir… No        521      0.941   1.1                 645
#> 4        1         1 ( JobLe… Yes        16      0.944   5.9                 656
#> 5        5         1 ( Envir… Yes         9      0.909   5.6                 664
#> # ℹ 1 more variable: cumulative_overlap <dbl>
#> ----------------------------------------------

# reorder after 2 rules
reorder(tidy_c5, init = 2)
#> ---- Rulelist --------------------------------
#> ▶ Keys: NULL
#> ▶ Number of rules: 5
#> ▶ Model type: C5
#> ▶ Estimation type: classification
#> ▶ Is validation data set: TRUE
#> 
#> 
#>   rule_nbr trial_nbr LHS      RHS   support confidence  lift cumulative_coverage
#>      <int>     <int> <chr>    <fct>   <int>      <dbl> <dbl>               <dbl>
#> 1        1         1 ( JobLe… Yes        16      0.944   5.9                  16
#> 2        2         1 ( Envir… No        521      0.941   1.1                 537
#> 3        4         1 ( JobRo… No        195      0.924   1.1                 648
#> 4        3         1 ( Daily… Yes        13      0.933   5.8                 656
#> 5        5         1 ( Envir… Yes         9      0.909   5.6                 664
#> ----------------------------------------------
