diff options
author | Birte Kristina Friesel <birte.friesel@uos.de> | 2024-02-21 11:46:06 +0100 |
---|---|---|
committer | Birte Kristina Friesel <birte.friesel@uos.de> | 2024-02-21 11:46:06 +0100 |
commit | 5d83f255f05c3b74df0ace1f70b260959b392eca (patch) | |
tree | 19722b4889911cd3893990531fd62732f937e719 | |
parent | c3dbe93034bdeff9dba534d29b04daa527d70241 (diff) |
update documentation and examples
-rw-r--r-- | .gitlab-ci.yml | 2 | ||||
-rw-r--r-- | README.md | 5 | ||||
-rw-r--r-- | doc/analysis-nfp.md | 2 | ||||
-rw-r--r-- | doc/modeling-method.md | 26 | ||||
-rwxr-xr-x | examples/busybox.sh | 2 | ||||
-rwxr-xr-x | examples/explore-and-model-static | 4 |
6 files changed, 21 insertions, 20 deletions
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index 3309d76..1718a4f 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -58,7 +58,7 @@ make_model: - DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 bin/analyze-kconfig.py multipass.kconfig multipass.json.xz --export-webconf models/multipass-rmt.json - wget -q https://ess.cs.uos.de/.private/dfatool/x264.json.xz https://ess.cs.uos.de/.private/dfatool/x264.kconfig https://ess.cs.uos.de/.private/dfatool/x264.nfpkeys.json - mv x264.nfpkeys.json nfpkeys.json - - DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 bin/analyze-kconfig.py x264.kconfig x264.json.xz --export-webconf models/x264-cart.json + - DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 bin/analyze-kconfig.py x264.kconfig x264.json.xz --export-webconf models/x264-cart.json - DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 bin/analyze-kconfig.py x264.kconfig x264.json.xz --export-webconf models/x264-rmt.json artifacts: paths: @@ -111,12 +111,10 @@ The following variables may be set to alter the behaviour of dfatool components. | `DFATOOL_KCONF_WITH_CHOICE_NODES` | 0, **1** | Treat kconfig choices (e.g. "choice Model → MobileNet / ResNet / Inception") as enum parameters. If enabled, the corresponding boolean kconfig variables (e.g. "Model\_MobileNet") are not converted to parameters. If disabled, all (and only) boolean kconfig variables are treated as parameters. Mostly relevant for analyze-kconfig, eval-kconfig | | `DFATOOL_COMPENSATE_DRIFT` | **0**, 1 | Perform drift compensation for loaders without sync input (e.g. EnergyTrace or Keysight) | | `DFATOOL_DRIFT_COMPENSATION_PENALTY` | 0 .. 100 (default: majority vote over several penalties) | Specify penalty for ruptures.py PELT changepoint petection | +| `DFATOOL_MODEL` | cart, decart, lmt, **rmt**, xgb | Modeling method. See below for method-specific configuration options. | | `DFATOOL_DTREE_ENABLED` | 0, **1** | Use decision trees in get\_fitted | | `DFATOOL_DTREE_FUNCTION_LEAVES` | 0, **1** | Use functions (fitted via linear regression) in decision tree leaves when modeling numeric parameters with at least three distinct values. If 0, integer parameters are treated as enums instead. | -| `DFATOOL_DTREE_SKLEARN_CART` | **0**, 1 | Use sklearn CART ("Decision Tree Regression") algorithm for decision tree generation. Uses binary nodes and supports splits on scalar variables. Overrides `FUNCTION_LEAVES` (=0) and `NONBINARY_NODES` (=0). | -| `DFATOOL_DTREE_SKLEARN_DECART` | **0**, 1 | Use sklearn CART ("Decision Tree Regression") algorithm for decision tree generation. Ignore scalar parameters, thus emulating the DECART algorithm. | | `DFATOOL_CART_MAX_DEPTH` | **0** .. *n* | maximum depth for sklearn CART. Default (0): unlimited. | -| `DFATOOL_DTREE_LMT` | **0**, 1 | Use [Linear Model Tree](https://github.com/cerlymarco/linear-tree) algorithm for regression tree generation. Uses binary nodes and linear functions. Overrides `FUNCTION_LEAVES` (=0) and `NONBINARY_NODES` (=0). | | `DFATOOL_LMT_MAX_DEPTH` | **5** .. 20 | Maximum depth for LMT. | | `DFATOOL_LMT_MIN_SAMPLES_SPLIT` | 0.0 .. 1.0, **6** .. *n* | Minimum samples required to still perform an LMT split. A value below 1.0 sets the specified ratio of the total number of training samples as minimum. | | `DFATOOL_LMT_MIN_SAMPLES_LEAF` | 0.0 .. **0.1** .. 1.0, 3 .. *n* | Minimum samples that each leaf of a split candidate must contain. A value below 1.0 specifies a ratio of the total number of training samples. A value above 1 specifies an absolute number of samples. | @@ -125,7 +123,6 @@ The following variables may be set to alter the behaviour of dfatool components. | `DFATOOL_ULS_ERROR_METRIC` | **ssr**, rmsd, mae, … | Error metric to use when selecting best-fitting function during unsupervised least squares (ULS) regression. Least squares regression itself minimzes root mean square deviation (rmsd), hence the equivalent (but partitioning-compatible) sum of squared residuals (ssr) is the default. Supports all metrics accepted by `--error-metric`. | | `DFATOOL_ULS_MIN_DISTINCT_VALUES` | 2 .. **3** .. *n* | Minimum number of unique values a parameter must take to be eligible for ULS | | `DFATOOL_ULS_SKIP_CODEPENDENT_CHECK` | **0**, 1 | Do not detect and remove co-dependent features in ULS. | -| `DFATOOL_USE_XGBOOST` | **0**, 1 | Use Extreme Gradient Boosting algorithm for decision forest generation. | | `DFATOOL_XGB_N_ESTIMATORS` | 1 .. **100** .. *n* | Number of estimators (i.e., trees) for XGBoost. | | `DFATOOL_XGB_MAX_DEPTH` | 2 .. **6** .. *n* | Maximum XGBoost tree depth. | | `DFATOOL_XGB_SUBSAMPLE` | 0.0 .. **1.0** | XGBoost subsampling ratio. | diff --git a/doc/analysis-nfp.md b/doc/analysis-nfp.md index 877ac2a..5221c55 100644 --- a/doc/analysis-nfp.md +++ b/doc/analysis-nfp.md @@ -8,7 +8,7 @@ Classification and Regression Trees (CART) are capable of generating accurate mo Hence, after loading a CART model into kconfig-webconf, only a small subset of busybox features will be annotated with NFP deltas. ``` -DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 .../dfatool/bin/analyze-kconfig.py --export-webconf busybox.json --force-tree ../busybox-1.35.0/Kconfig . +DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 .../dfatool/bin/analyze-kconfig.py --export-webconf busybox.json --force-tree ../busybox-1.35.0/Kconfig . ``` Refer to the [kconfig-webconf README](https://ess.cs.uos.de/git/software/kconfig-webconf/-/blob/master/README.md#user-content-performance-aware-configuration) for details on using the generated model. diff --git a/doc/modeling-method.md b/doc/modeling-method.md index e4865d9..58fe03b 100644 --- a/doc/modeling-method.md +++ b/doc/modeling-method.md @@ -1,27 +1,22 @@ # Modeling Method Selection +Set `DFATOOL_MODEL` to an appropriate value, e.g. `DFATOOL_MODEL=cart`. + ## CART (Regression Trees) -Enable these with `DFATOOL_DTREE_SKLEARN_CART=1` and `--force-tree`. +sklearn CART ("Decision Tree Regression") algorithm. Uses binary nodes and supports splits on scalar variables. ### Related Options * `DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1` converts categorical parameters (which are not supported by CART) to numeric ones. -## XGB (Gradient-Boosted Forests / eXtreme Gradient boosting) +## DECART (Regression Trees) -Enable these with `DFATOOL_USE_XGBOOST=1` and `--force-tree`. -You should also specify `DFATOOL_XGB_N_ESTIMATORS`, `DFATOOL_XGB_MAX_DEPTH`, and possibly `OMP_NUM_THREADS`. - -### Related Options - -* `DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1` converts categorical parameters (which are not supported by XGB) to numeric ones. -* Anything prefixed with `DFATOOL_XGB_`. +sklearn CART ("Decision Tree Regression") algorithm. Ignores scalar parameters, thus emulating the DECART algorithm. ## LMT (Linear Model Trees) -Enable these with `DFATOOL_DTREE_LMT=1` and `--force-tree`. -They always use a maximum depth of 20. +[Linear Model Tree](https://github.com/cerlymarco/linear-tree) algorithm. Uses binary nodes and linear functions. ### Related Options @@ -51,6 +46,15 @@ All of these are valid regression model trees. * `DFATOOL_ULS_SKIP_CODEPENDENT_CHECK=1` * `DFATOOL_REGRESSION_SAFE_FUNCTIONS=1` +## XGB (Gradient-Boosted Forests / eXtreme Gradient boosting) + +You should also specify `DFATOOL_XGB_N_ESTIMATORS`, `DFATOOL_XGB_MAX_DEPTH`, and possibly `OMP_NUM_THREADS`. + +### Related Options + +* `DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1` converts categorical parameters (which are not supported by XGB) to numeric ones. +* Anything prefixed with `DFATOOL_XGB_`. + ## Least-Squares Regression If dfatool determines that there is no need for a tree structure, or if `DFATOOL_DTREE_ENABLED=0` has beenset, it will go straight to least-squares regression. diff --git a/examples/busybox.sh b/examples/busybox.sh index f9a1a93..a41875a 100755 --- a/examples/busybox.sh +++ b/examples/busybox.sh @@ -105,7 +105,7 @@ Once everything is done, you have various options for generating a kconfig-webco To do so, call bin/analyze-kconfig.py from the data directory. For example, to generate a CART: -> DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ~/var/ess/aemr/dfatool/bin/analyze-kconfig.py --force-tree --skip-param-stats --export-webconf /tmp/busybox-cart.json ../busybox-1.35.0/Kconfig . +> DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ~/var/ess/aemr/dfatool/bin/analyze-kconfig.py --force-tree --skip-param-stats --export-webconf /tmp/busybox-cart.json ../busybox-1.35.0/Kconfig . By adding the option "--cross-validation kfold:10", you can determine the model prediction error. diff --git a/examples/explore-and-model-static b/examples/explore-and-model-static index f2caf1a..74c6359 100755 --- a/examples/explore-and-model-static +++ b/examples/explore-and-model-static @@ -15,8 +15,8 @@ set -ex DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-rmt-b.json --export-raw-predictions ../models/example-static-rmt-b-eval.json ../examples/kconfig-static/Kconfig . DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-rmt-nb.json --export-raw-predictions ../models/example-static-rmt-nb-eval.json ../examples/kconfig-static/Kconfig . -DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-b.json --export-raw-predictions ../models/example-static-cart-b-eval.json ../examples/kconfig-static/Kconfig . -DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-nb.json --export-raw-predictions ../models/example-static-cart-nb-eval.json ../examples/kconfig-static/Kconfig . +DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-b.json --export-raw-predictions ../models/example-static-cart-b-eval.json ../examples/kconfig-static/Kconfig . +DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-nb.json --export-raw-predictions ../models/example-static-cart-nb-eval.json ../examples/kconfig-static/Kconfig . DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_FIT_FOL=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-fol-b.json --export-raw-predictions ../models/example-static-fol-b-eval.json ../examples/kconfig-static/Kconfig . DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_FIT_FOL=1 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-fol-nb.json --export-raw-predictions ../models/example-static-fol-nb-eval.json ../examples/kconfig-static/Kconfig . |