summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorBirte Kristina Friesel <birte.friesel@uos.de>2024-02-21 11:46:06 +0100
committerBirte Kristina Friesel <birte.friesel@uos.de>2024-02-21 11:46:06 +0100
commit5d83f255f05c3b74df0ace1f70b260959b392eca (patch)
tree19722b4889911cd3893990531fd62732f937e719
parentc3dbe93034bdeff9dba534d29b04daa527d70241 (diff)
update documentation and examples
-rw-r--r--.gitlab-ci.yml2
-rw-r--r--README.md5
-rw-r--r--doc/analysis-nfp.md2
-rw-r--r--doc/modeling-method.md26
-rwxr-xr-xexamples/busybox.sh2
-rwxr-xr-xexamples/explore-and-model-static4
6 files changed, 21 insertions, 20 deletions
diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 3309d76..1718a4f 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -58,7 +58,7 @@ make_model:
- DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 bin/analyze-kconfig.py multipass.kconfig multipass.json.xz --export-webconf models/multipass-rmt.json
- wget -q https://ess.cs.uos.de/.private/dfatool/x264.json.xz https://ess.cs.uos.de/.private/dfatool/x264.kconfig https://ess.cs.uos.de/.private/dfatool/x264.nfpkeys.json
- mv x264.nfpkeys.json nfpkeys.json
- - DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 bin/analyze-kconfig.py x264.kconfig x264.json.xz --export-webconf models/x264-cart.json
+ - DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 bin/analyze-kconfig.py x264.kconfig x264.json.xz --export-webconf models/x264-cart.json
- DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 bin/analyze-kconfig.py x264.kconfig x264.json.xz --export-webconf models/x264-rmt.json
artifacts:
paths:
diff --git a/README.md b/README.md
index 02d2188..8690959 100644
--- a/README.md
+++ b/README.md
@@ -111,12 +111,10 @@ The following variables may be set to alter the behaviour of dfatool components.
| `DFATOOL_KCONF_WITH_CHOICE_NODES` | 0, **1** | Treat kconfig choices (e.g. "choice Model → MobileNet / ResNet / Inception") as enum parameters. If enabled, the corresponding boolean kconfig variables (e.g. "Model\_MobileNet") are not converted to parameters. If disabled, all (and only) boolean kconfig variables are treated as parameters. Mostly relevant for analyze-kconfig, eval-kconfig |
| `DFATOOL_COMPENSATE_DRIFT` | **0**, 1 | Perform drift compensation for loaders without sync input (e.g. EnergyTrace or Keysight) |
| `DFATOOL_DRIFT_COMPENSATION_PENALTY` | 0 .. 100 (default: majority vote over several penalties) | Specify penalty for ruptures.py PELT changepoint petection |
+| `DFATOOL_MODEL` | cart, decart, lmt, **rmt**, xgb | Modeling method. See below for method-specific configuration options. |
| `DFATOOL_DTREE_ENABLED` | 0, **1** | Use decision trees in get\_fitted |
| `DFATOOL_DTREE_FUNCTION_LEAVES` | 0, **1** | Use functions (fitted via linear regression) in decision tree leaves when modeling numeric parameters with at least three distinct values. If 0, integer parameters are treated as enums instead. |
-| `DFATOOL_DTREE_SKLEARN_CART` | **0**, 1 | Use sklearn CART ("Decision Tree Regression") algorithm for decision tree generation. Uses binary nodes and supports splits on scalar variables. Overrides `FUNCTION_LEAVES` (=0) and `NONBINARY_NODES` (=0). |
-| `DFATOOL_DTREE_SKLEARN_DECART` | **0**, 1 | Use sklearn CART ("Decision Tree Regression") algorithm for decision tree generation. Ignore scalar parameters, thus emulating the DECART algorithm. |
| `DFATOOL_CART_MAX_DEPTH` | **0** .. *n* | maximum depth for sklearn CART. Default (0): unlimited. |
-| `DFATOOL_DTREE_LMT` | **0**, 1 | Use [Linear Model Tree](https://github.com/cerlymarco/linear-tree) algorithm for regression tree generation. Uses binary nodes and linear functions. Overrides `FUNCTION_LEAVES` (=0) and `NONBINARY_NODES` (=0). |
| `DFATOOL_LMT_MAX_DEPTH` | **5** .. 20 | Maximum depth for LMT. |
| `DFATOOL_LMT_MIN_SAMPLES_SPLIT` | 0.0 .. 1.0, **6** .. *n* | Minimum samples required to still perform an LMT split. A value below 1.0 sets the specified ratio of the total number of training samples as minimum. |
| `DFATOOL_LMT_MIN_SAMPLES_LEAF` | 0.0 .. **0.1** .. 1.0, 3 .. *n* | Minimum samples that each leaf of a split candidate must contain. A value below 1.0 specifies a ratio of the total number of training samples. A value above 1 specifies an absolute number of samples. |
@@ -125,7 +123,6 @@ The following variables may be set to alter the behaviour of dfatool components.
| `DFATOOL_ULS_ERROR_METRIC` | **ssr**, rmsd, mae, … | Error metric to use when selecting best-fitting function during unsupervised least squares (ULS) regression. Least squares regression itself minimzes root mean square deviation (rmsd), hence the equivalent (but partitioning-compatible) sum of squared residuals (ssr) is the default. Supports all metrics accepted by `--error-metric`. |
| `DFATOOL_ULS_MIN_DISTINCT_VALUES` | 2 .. **3** .. *n* | Minimum number of unique values a parameter must take to be eligible for ULS |
| `DFATOOL_ULS_SKIP_CODEPENDENT_CHECK` | **0**, 1 | Do not detect and remove co-dependent features in ULS. |
-| `DFATOOL_USE_XGBOOST` | **0**, 1 | Use Extreme Gradient Boosting algorithm for decision forest generation. |
| `DFATOOL_XGB_N_ESTIMATORS` | 1 .. **100** .. *n* | Number of estimators (i.e., trees) for XGBoost. |
| `DFATOOL_XGB_MAX_DEPTH` | 2 .. **6** .. *n* | Maximum XGBoost tree depth. |
| `DFATOOL_XGB_SUBSAMPLE` | 0.0 .. **1.0** | XGBoost subsampling ratio. |
diff --git a/doc/analysis-nfp.md b/doc/analysis-nfp.md
index 877ac2a..5221c55 100644
--- a/doc/analysis-nfp.md
+++ b/doc/analysis-nfp.md
@@ -8,7 +8,7 @@ Classification and Regression Trees (CART) are capable of generating accurate mo
Hence, after loading a CART model into kconfig-webconf, only a small subset of busybox features will be annotated with NFP deltas.
```
-DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 .../dfatool/bin/analyze-kconfig.py --export-webconf busybox.json --force-tree ../busybox-1.35.0/Kconfig .
+DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 .../dfatool/bin/analyze-kconfig.py --export-webconf busybox.json --force-tree ../busybox-1.35.0/Kconfig .
```
Refer to the [kconfig-webconf README](https://ess.cs.uos.de/git/software/kconfig-webconf/-/blob/master/README.md#user-content-performance-aware-configuration) for details on using the generated model.
diff --git a/doc/modeling-method.md b/doc/modeling-method.md
index e4865d9..58fe03b 100644
--- a/doc/modeling-method.md
+++ b/doc/modeling-method.md
@@ -1,27 +1,22 @@
# Modeling Method Selection
+Set `DFATOOL_MODEL` to an appropriate value, e.g. `DFATOOL_MODEL=cart`.
+
## CART (Regression Trees)
-Enable these with `DFATOOL_DTREE_SKLEARN_CART=1` and `--force-tree`.
+sklearn CART ("Decision Tree Regression") algorithm. Uses binary nodes and supports splits on scalar variables.
### Related Options
* `DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1` converts categorical parameters (which are not supported by CART) to numeric ones.
-## XGB (Gradient-Boosted Forests / eXtreme Gradient boosting)
+## DECART (Regression Trees)
-Enable these with `DFATOOL_USE_XGBOOST=1` and `--force-tree`.
-You should also specify `DFATOOL_XGB_N_ESTIMATORS`, `DFATOOL_XGB_MAX_DEPTH`, and possibly `OMP_NUM_THREADS`.
-
-### Related Options
-
-* `DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1` converts categorical parameters (which are not supported by XGB) to numeric ones.
-* Anything prefixed with `DFATOOL_XGB_`.
+sklearn CART ("Decision Tree Regression") algorithm. Ignores scalar parameters, thus emulating the DECART algorithm.
## LMT (Linear Model Trees)
-Enable these with `DFATOOL_DTREE_LMT=1` and `--force-tree`.
-They always use a maximum depth of 20.
+[Linear Model Tree](https://github.com/cerlymarco/linear-tree) algorithm. Uses binary nodes and linear functions.
### Related Options
@@ -51,6 +46,15 @@ All of these are valid regression model trees.
* `DFATOOL_ULS_SKIP_CODEPENDENT_CHECK=1`
* `DFATOOL_REGRESSION_SAFE_FUNCTIONS=1`
+## XGB (Gradient-Boosted Forests / eXtreme Gradient boosting)
+
+You should also specify `DFATOOL_XGB_N_ESTIMATORS`, `DFATOOL_XGB_MAX_DEPTH`, and possibly `OMP_NUM_THREADS`.
+
+### Related Options
+
+* `DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1` converts categorical parameters (which are not supported by XGB) to numeric ones.
+* Anything prefixed with `DFATOOL_XGB_`.
+
## Least-Squares Regression
If dfatool determines that there is no need for a tree structure, or if `DFATOOL_DTREE_ENABLED=0` has beenset, it will go straight to least-squares regression.
diff --git a/examples/busybox.sh b/examples/busybox.sh
index f9a1a93..a41875a 100755
--- a/examples/busybox.sh
+++ b/examples/busybox.sh
@@ -105,7 +105,7 @@ Once everything is done, you have various options for generating a kconfig-webco
To do so, call bin/analyze-kconfig.py from the data directory.
For example, to generate a CART:
-> DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ~/var/ess/aemr/dfatool/bin/analyze-kconfig.py --force-tree --skip-param-stats --export-webconf /tmp/busybox-cart.json ../busybox-1.35.0/Kconfig .
+> DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ~/var/ess/aemr/dfatool/bin/analyze-kconfig.py --force-tree --skip-param-stats --export-webconf /tmp/busybox-cart.json ../busybox-1.35.0/Kconfig .
By adding the option "--cross-validation kfold:10", you can determine the model prediction error.
diff --git a/examples/explore-and-model-static b/examples/explore-and-model-static
index f2caf1a..74c6359 100755
--- a/examples/explore-and-model-static
+++ b/examples/explore-and-model-static
@@ -15,8 +15,8 @@ set -ex
DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-rmt-b.json --export-raw-predictions ../models/example-static-rmt-b-eval.json ../examples/kconfig-static/Kconfig .
DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-rmt-nb.json --export-raw-predictions ../models/example-static-rmt-nb-eval.json ../examples/kconfig-static/Kconfig .
-DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-b.json --export-raw-predictions ../models/example-static-cart-b-eval.json ../examples/kconfig-static/Kconfig .
-DFATOOL_DTREE_SKLEARN_CART=1 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-nb.json --export-raw-predictions ../models/example-static-cart-nb-eval.json ../examples/kconfig-static/Kconfig .
+DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-b.json --export-raw-predictions ../models/example-static-cart-b-eval.json ../examples/kconfig-static/Kconfig .
+DFATOOL_MODEL=cart DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-cart-nb.json --export-raw-predictions ../models/example-static-cart-nb-eval.json ../examples/kconfig-static/Kconfig .
DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_FIT_FOL=1 DFATOOL_KCONF_WITH_CHOICE_NODES=0 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-fol-b.json --export-raw-predictions ../models/example-static-fol-b-eval.json ../examples/kconfig-static/Kconfig .
DFATOOL_DTREE_IGNORE_IRRELEVANT_PARAMS=0 DFATOOL_PARAM_CATEGORICAL_TO_SCALAR=1 DFATOOL_FIT_FOL=1 DFATOOL_KCONF_WITH_CHOICE_NODES=1 ../bin/analyze-kconfig.py --export-webconf ../models/example-static-fol-nb.json --export-raw-predictions ../models/example-static-fol-nb-eval.json ../examples/kconfig-static/Kconfig .