sklearn_fab.SklearnFABBernGateLinearClassifier
- class sklearn_fab.SklearnFABBernGateLinearClassifier(random_seed=None, max_fab_iterations=100, start_from_mstep=False, num_acceleration_steps=0, repeat_until_convergence=False, projection_estep=False, shrink_threshold=1.0, fab_stop_threshold=0.001, tree_depth=5, comp_weights_min_scale=-0.5, comp_weights_max_scale=0.5, comp_bias_min_scale=0.25, comp_bias_max_scale=0.75, gate_opt_mode='opt', gate_max_bins=None, comp_foba_skip='power_of_two', comp_foba_skip_max_interval=25, comp_opt_mode='opt', comp_backward_step=False, post_comp_opt_type='standard', comp_l2_regularize=0.0, with_comp_scaled_l0_regularize=True, max_comp_relevant_features=100, max_comp_foba_iterations=100, num_threads_gates=1, num_threads_gate_features=1, num_threads_comps=1, base_model_dict=None)
FAB linear classifier with Bernoulli-gates.
This estimator learns a model that follows a binary tree structure in which one or more components are created and gated by different branching conditions (see tree_). There are two learning methods: random start and model hot-start. Random start fits from an internally generated random initial state. Model hot-start fits from an initial solution generated by an existing model. The tree structure assigns a component to each sample of the feature data \(X\). The prediction for each sample depends on the decision function, whose value is calculated from the prediction formula information (see comps_) and the feature data \(X\).
- Parameters:
- random_seedNone or int, optional [default: None]
Random seed value of numpy.random.
This applies during initialization and learning of the FAB learner used by the SklearnFABEstimator. If not specified, a random seed value is generated automatically and used during fitting. To check the generated value, read the random_seed attribute after fitting.
- max_fab_iterationsint, optional [default: 100]
Maximum number of FAB-iterations. Domain = [1, inf).
- start_from_mstepbool, optional [default: False]
If True, the first iteration starts with the M-step; otherwise, it starts with the E-step. This parameter is ignored for model hot-start.
- num_acceleration_stepsint, optional [default: 0]
The number of steps of the acceleration algorithm for each FAB-iteration. If 0, the acceleration algorithm is disabled. Domain = [0, inf).
- repeat_until_convergencebool, optional [default: False]
If False, the FAB-iterations and the post-processing are executed only once, even if the FAB-iterations stopped because max_fab_iterations was reached rather than because the convergence condition was met.
- projection_estepbool, optional [default: False]
If True, the projection E-step algorithm is enabled; otherwise, the normal E-step is executed. Because the projection E-step takes the tree structure into account, it yields higher accuracy in some cases.
- shrink_thresholdfloat or str, optional [default: 1.0]
Threshold value for shrinkage. If a percentage string (e.g. ‘1.0%’) is specified, shrinkage is executed according to the relative value (= num_samples * shrink_threshold). Domain = [1, inf) for an absolute value, (0%, 100%) for a relative value.
- fab_stop_thresholdfloat or str, optional [default: 0.001]
Threshold value for FAB-iterations: if the increase in the FIC value is less than the threshold, the FAB-iterations are considered to have converged. If a percentage string (e.g. ‘1.0%’) is specified, the convergence check uses the relative value (= threshold > \((FIC^{(t)} - FIC^{(t-1)}) / | FIC^{(t-1)} |\)). Domain = (0, inf).
- tree_depthint, optional [default: 5]
Initial depth of the gate-tree structure of the latent variable prior. The initial number of components is 2^tree_depth. If 0, the optimization is executed with only one component. Domain = [0, inf). This parameter is ignored for model hot-start.
- comp_weights_min_scale, comp_weights_max_scalefloat, optional [default: -0.5, 0.5, respectively]
Scale values for the initialization of the weight values of components. Domain = (-inf, inf). These parameters are ignored for model hot-start.
- comp_bias_min_scale, comp_bias_max_scalefloat, optional [default: 0.25, 0.75, respectively]
Scale values for the initialization of the bias values of components. Domain = (-inf, inf). These parameters are ignored for model hot-start.
- gate_max_binsNone or int, optional [default: None]
Maximum number of bins for each feature, used for gate parameter optimization. If None, all unique sample values of each feature are used; otherwise, the equal-width binning algorithm is applied. Domain = [1, inf).
- comp_foba_skip{‘power_of_two’, ‘quarter_square’, ‘none’}, optional [default: ‘power_of_two’]
The checking function that decides at which steps the FoBa algorithm is skipped. If ‘none’, the FoBa algorithm is executed at every step.
- comp_foba_skip_max_intervalint, optional [default: 25]
The maximum interval for FoBa algorithm skipping. If comp_foba_skip is ‘none’, this value is ignored. Domain = [2, inf).
- comp_backward_stepbool, optional [default: False]
Whether backward-steps are enabled in the FoBa algorithm. In the post-processing, the backward-step is always enabled regardless of this value.
- post_comp_opt_type{‘standard’, ‘quadratic’}, optional [default: ‘standard’]
Algorithm of component parameter optimization in post-processing.
- comp_l2_regularizefloat, optional [default: 0.0]
L2-regularization hyper-parameter for component parameter optimization. The larger the specified value, the stronger the regularization effect is. If 0.0, L2-regularization is disabled. Domain = [0, inf).
- with_comp_scaled_l0_regularizebool, optional [default: True]
Whether to use scaled L0-regularization, which relies on a tighter lower bound of FIC for components; the approximation of det(F) is refined, where F is an empirical Fisher information matrix.
- max_comp_relevant_featuresint, optional [default: 100]
Maximum number of the relevant features for each component.
- max_comp_foba_iterationsint, optional [default: 100]
Maximum number of the FoBa-iterations for each component. Domain = [1, inf).
- num_threads_gatesint, optional [default: 1]
Maximum number of OpenMP threads across which the gate parameter optimization tasks for all gates are divided. If the number of gates is less than num_threads_gates, the number of gates is used instead. Domain = [1, inf).
- num_threads_gate_featuresint, optional [default: 1]
Maximum number of OpenMP threads across which the gate parameter optimization tasks for all features are divided. If the number of features is less than num_threads_gate_features, the number of features is used instead. Domain = [1, inf).
- num_threads_compsint, optional [default: 1]
Maximum number of OpenMP threads used for component parameter optimization. If the number of components is less than num_threads_comps, the number of components is used instead. Domain = [1, inf).
- base_model_dictNone or dict, optional [default: None]
Dictionary of information on an existing model, as returned by its get_model_dict(). If None, fitting uses random start and this parameter has no further effect; otherwise, fitting uses model hot-start. This parameter may be changed to receive an estimator object in a future release.
- gate_opt_mode{‘opt’, ‘refit’, ‘keep’}, optional [default: ‘opt’]
Mode of gate parameter optimization:
‘opt’: optimizing with all features (selecting and fitting the features).
‘refit’: only fitting with relevant features.
‘keep’: keeping all parameter values.
This parameter is ignored for random start.
- comp_opt_mode{‘opt’, ‘refit’}, optional [default: ‘opt’]
Mode of component parameter optimization:
‘opt’: optimizing with all features (selecting and fitting the features).
‘refit’: only fitting with relevant features.
This parameter is ignored for random start.
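As a rough illustration of how shrink_threshold and fab_stop_threshold could be interpreted, the sketch below resolves a percentage string such as '1.0%' into a relative value and applies the convergence check described above. The helper names parse_threshold, resolve_shrink_threshold, and fab_converged are hypothetical and are not part of sklearn_fab; the library's internal handling may differ.

```python
def parse_threshold(value):
    """Return (number, is_relative) for a float or a percentage string.

    Hypothetical helper: '1.0%' -> (0.01, True), 1.0 -> (1.0, False).
    """
    if isinstance(value, str):
        text = value.strip()
        if not text.endswith("%"):
            raise ValueError("string thresholds must end with '%'")
        return float(text[:-1]) / 100.0, True
    return float(value), False


def resolve_shrink_threshold(shrink_threshold, num_samples):
    """Keep an absolute threshold as is; scale a relative one by num_samples."""
    number, is_relative = parse_threshold(shrink_threshold)
    return num_samples * number if is_relative else number


def fab_converged(fic_prev, fic_curr, fab_stop_threshold):
    """True when the absolute or relative FIC increase is below the threshold."""
    number, is_relative = parse_threshold(fab_stop_threshold)
    increase = fic_curr - fic_prev
    if is_relative:
        # Relative check: (FIC_t - FIC_{t-1}) / |FIC_{t-1}|
        increase /= abs(fic_prev)
    return increase < number
```

With 1000 samples, a shrink_threshold of '1.0%' would resolve to about 10 under this reading, while 1.0 stays an absolute value of 1.0.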
Notes
The model uses the following algorithm for prediction:
Let
\(\mathtt{N}\) be the number of samples of feature data \(X\),
\(\mathtt{D}\) be the number of features of feature data \(X\),
\(Y_n\) be the prediction result value for sample \(X_n\) where \(n \in {\mathbb{Z}}, 0 \le n < \mathtt{N}\),
\(Z_n\) be the calculated decision function value for sample \(X_n\) where \(n \in {\mathbb{Z}}, 0 \le n < \mathtt{N}\),
\(w\) be the weights array of the component assigned to sample \(X_n\) where \(n \in {\mathbb{Z}}, 0 \le n < \mathtt{N}\),
\(b\) be the bias value of the component assigned to sample \(X_n\) where \(n \in {\mathbb{Z}}, 0 \le n < \mathtt{N}\)
\[Y_n = \frac{1}{1+\exp(-{Z_n})}\]
where
\[Z_n = \overbrace{{w_1}{X_{n1}}}^{\text{contribution of 1st feature}} + \overbrace{{w_2}{X_{n2}}}^{\text{contribution of 2nd feature}} + \dots + \overbrace{{w_\mathtt{D}}{X_{n\mathtt{D}}}}^{\text{contribution of Dth feature}} + b\]
Examples
>>> from sklearn_fab import SklearnFABBernGateLinearClassifier
>>> cl = SklearnFABBernGateLinearClassifier(random_seed=2, tree_depth=3, shrink_threshold=1.0)
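The prediction formula in the Notes can be traced by hand with a few lines of plain Python. This is only a sketch of the arithmetic: the weights, bias, and sample below are made up, and the function names are hypothetical rather than part of sklearn_fab.

```python
import math


def decision_value(x, w, b):
    """Z_n = w_1*X_n1 + ... + w_D*X_nD + b for one sample."""
    return sum(w_d * x_d for w_d, x_d in zip(w, x)) + b


def prediction_value(x, w, b):
    """Y_n = 1 / (1 + exp(-Z_n))."""
    return 1.0 / (1.0 + math.exp(-decision_value(x, w, b)))


# Made-up weights and bias of the component assigned to this sample.
w = [0.5, -1.0, 0.0]
b = 0.25
x = [2.0, 1.0, 7.0]

z = decision_value(x, w, b)    # 0.5*2.0 + (-1.0)*1.0 + 0.0*7.0 + 0.25
y = prediction_value(x, w, b)  # logistic function applied to z
```

A feature with weight 0.0, like the third one here, contributes nothing to \(Z_n\), which is how non-relevant features drop out of a component.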
- Attributes:
- feature_names_array of shape (n_features)
The feature names.
- model_fab.hme.model.HMESupervisedModel
Model whose parameters have been fitted.
- fic_float
The Factorized Information Criterion (FIC) after fitting. FIC is the asymptotic approximation value used by FAB-engine.
- tree_SklearnFABTree
A binary tree object based on the latent variable prior information found in the model_.
- comps_list[fab.model.component.linear_predict_comp.LogisticRgComponent]
List of FAB LogisticRgComponent instances created in the model. Refer to FAB Reference > API Documents > fab packages > fab.model.component package for details about the component class.
- comp_ids_list[int]
List of component ID numbers.
- n_classes_int
The number of classes.
- classes_array of shape (n_classes)
The class labels.
- assign_comp(X)
Assign a component for each sample.
This method assigns to each sample the component with the highest likelihood under the prior distribution.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Feature data.
- Returns:
- assigned_comp_idsarray of shape (n_samples)
Assigned component ID for each sample.
- Raises:
- NotFittedError
If this function is called before estimator is fitted.
- ValueError
Feature data is not a non-empty 2D array containing only finite values.
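To make the idea of routing samples through a binary gate tree concrete, here is a minimal sketch. It assumes each gate reduces to a threshold test on a single feature and that a sample simply follows the favored branch; the dictionary layout, the key names, and assign_component are all hypothetical and do not reflect the internal representation used by tree_ or model_.

```python
def assign_component(x, node):
    """Walk a binary gate tree and return the component ID at the leaf reached."""
    while "comp_id" not in node:
        # Assumed gate rule: go left when the gated feature is below the threshold.
        branch = "left" if x[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["comp_id"]


# Made-up tree with two gates and three components.
tree = {
    "feature": 0, "threshold": 1.5,
    "left": {"comp_id": 0},
    "right": {
        "feature": 1, "threshold": -0.5,
        "left": {"comp_id": 1},
        "right": {"comp_id": 2},
    },
}

X = [[1.0, 0.0], [2.0, -1.0], [2.0, 3.0]]
assigned_comp_ids = [assign_component(x, tree) for x in X]
```

Each sample ends at exactly one leaf, so the result has one component ID per row of X, matching the documented return shape (n_samples).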
- calc_decision_function(X)
Calculates the value of decision function.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Feature data.
- Returns:
- scoresarray of shape (n_samples)
Calculated value of the decision function for each sample.
- Raises:
- NotFittedError
If this function is called before estimator is fitted.
- ValueError
Feature data is not a non-empty 2D array containing only finite values.
- fit(X, y, gate_feature_ids=None, comp_feature_ids=None, comp_mandatory_feature_ids=None, comp_positive_feature_ids=None, comp_negative_feature_ids=None)
Fit model.
If random_seed is not specified during initialization or is None, a random seed value is generated internally.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Feature data.
- yarray-like of shape (n_samples)
Target data.
- gate_feature_idsNone or list[int], optional [default: None]
List of indices of feature data which are applied to gate parameter optimizations. If None, all features are used.
- comp_feature_idsNone or list[int], optional [default: None]
List of indices of feature data which are applied to component parameter optimizations. If None, all features are used. If an empty list, the model is learned as a decision tree (not supported in learners consisting of B-spline Rg/Cl components).
- comp_mandatory_feature_idsNone or list[int], optional [default: None]
List of indices of feature data to which non-L0-regularization constraints are applied; the specified features are always relevant for all components. If None, no features are specified for non-L0-regularization.
- comp_positive_feature_idsNone or list[int], optional [default: None]
List of indices of feature data whose weight values for all components are constrained to positive values. If None, all features are optimized with no constraints.
- comp_negative_feature_idsNone or list[int], optional [default: None]
List of indices of feature data whose weight values for all components are constrained to negative values. If None, all features are optimized with no constraint.
- Returns:
- self
- Raises:
- TypeError
random_seed is neither None nor an int.
- ValueError
Feature and target data are not of consistent length.
Feature data is not a non-empty 2D array containing only finite values.
Target data contains np.nan or np.inf.
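A minimal sketch of the kind of input validation these errors imply, assuming the intent is that X is a non-empty 2D array of finite values, y contains no NaN or infinity, and both have the same number of samples. The function check_fit_inputs is hypothetical; the checks performed inside fit() may be stricter or ordered differently.

```python
import math


def check_fit_inputs(X, y):
    """Raise ValueError for the input problems described under Raises."""
    if len(X) == 0 or any(len(row) == 0 for row in X):
        raise ValueError("feature data must be a non-empty 2D array")
    if len({len(row) for row in X}) != 1:
        raise ValueError("feature rows must all have the same length")
    if len(X) != len(y):
        raise ValueError("feature and target data have inconsistent lengths")
    if not all(math.isfinite(v) for row in X for v in row):
        raise ValueError("feature data must contain only finite values")
    if not all(math.isfinite(v) for v in y):
        raise ValueError("target data must not contain NaN or inf")
```

Running a check like this before calling fit() can surface data problems earlier, for example right after loading a dataset.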
- get_model_dict()
Get information on model_ as dictionary.
- Returns:
- model_dictdict
Information on model_.
This dictionary has the following keys and values:
- gatesdict
Information on the latent variable prior.
- compslist[dict]
Information on the components.
- num_featuresint
Number of features.
- num_targetsint
Number of targets.
- gate_feature_idsNone or list[int]
List of indices of feature data which were used in the gate-parameters’ optimizations.
- comp_feature_idsNone or list[int]
List of indices of feature data which were used in the component-parameters’ optimizations.
- Raises:
- NotFittedError
If this function is called before estimator is fitted.
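Since the returned dictionary can be passed back as base_model_dict for model hot-start, it may be useful to confirm that a stored dictionary still has the documented top-level keys. The sketch below does only that. The helper missing_model_dict_keys and the placeholder values are hypothetical; the real contents of gates and comps are not shown here.

```python
# Top-level keys listed in the get_model_dict() documentation.
DOCUMENTED_KEYS = (
    "gates", "comps", "num_features", "num_targets",
    "gate_feature_ids", "comp_feature_ids",
)


def missing_model_dict_keys(model_dict):
    """Return the documented keys that are absent from model_dict."""
    return [key for key in DOCUMENTED_KEYS if key not in model_dict]


# Placeholder dictionary: the values are illustrative only and do not
# reflect the actual structure produced by a fitted estimator.
example = {
    "gates": {},
    "comps": [{}],
    "num_features": 3,
    "num_targets": 1,
    "gate_feature_ids": None,
    "comp_feature_ids": [0, 2],
}

missing = missing_model_dict_keys(example)
```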
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- predict(X)
Predict target values.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Feature data.
- Returns:
- yarray of shape (n_samples)
Predicted values.
- Raises:
- NotFittedError
If this function is called before estimator is fitted.
- ValueError
Feature data is not a non-empty 2D array containing only finite values.
- predict_proba(X)
Probability estimates for each class based on the feature data.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Feature data.
- Returns:
- Parray of shape (n_samples, 2)
Probability of the samples for each class in the model.
- Raises:
- NotFittedError
If this function is called before estimator is fitted.
- ValueError
Feature data is not a non-empty 2D array containing only finite values.
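The relationship between decision values and the (n_samples, 2) probability array can be sketched as follows. The column order (second column holding \(Y_n\)), the tie-breaking rule, and the function names are assumptions made for illustration; they are not confirmed by this reference.

```python
import math


def proba_from_decision(scores):
    """Map decision values Z_n to rows [1 - Y_n, Y_n] (column order assumed)."""
    rows = []
    for z in scores:
        y = 1.0 / (1.0 + math.exp(-z))
        rows.append([1.0 - y, y])
    return rows


def labels_from_proba(P, classes):
    """Pick the class with the larger probability; ties go to classes[0]."""
    return [classes[1] if row[1] > row[0] else classes[0] for row in P]


scores = [-2.0, 0.0, 3.0]
P = proba_from_decision(scores)
labels = labels_from_proba(P, classes=["neg", "pos"])
```

Each row of P sums to 1, and a decision value of 0.0 corresponds to equal probabilities for the two classes.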
- score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Test samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for X.
- sample_weightarray-like of shape (n_samples,), default=None
Sample weights.
- Returns:
- scorefloat
Mean accuracy of self.predict(X) with respect to y.
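For reference, mean accuracy with optional sample weights amounts to a weighted fraction of correct predictions. The sketch below reproduces that arithmetic in plain Python; mean_accuracy is a hypothetical stand-in and not the implementation used by score().

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of samples whose predicted label equals the true label."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(
        w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p
    )
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]

unweighted = mean_accuracy(y_true, y_pred)
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1])
```

Giving the misclassified third sample a weight of 2 lowers the score from 3/4 to 3/5.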
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
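The <component>__<parameter> naming convention can be illustrated without any estimator at all. The sketch below separates top-level parameters from nested ones by splitting each key at the first double underscore. The function group_nested_params and the step name clf are hypothetical; this mirrors the naming convention only, not the actual set_params() implementation.

```python
def group_nested_params(params):
    """Split keys of the form '<component>__<parameter>' by component name."""
    own, nested = {}, {}
    for key, value in params.items():
        name, delim, sub_key = key.partition("__")
        if delim:
            # Nested parameter: route it to the named component.
            nested.setdefault(name, {})[sub_key] = value
        else:
            # Plain parameter of the outer object itself.
            own[key] = value
    return own, nested


own, nested = group_nested_params(
    {"memory": None, "clf__tree_depth": 3, "clf__random_seed": 2}
)
```

Here tree_depth and random_seed would be routed to the component named clf, while memory stays with the outer object.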