DecisionTree
Extends:
Decision tree learner. Builds a decision tree by greedily and recursively splitting the samples on one feature at a time.
Constructor Summary
Public Constructor | ||
public |
constructor(optionsUser: Object) Constructor. |
Member Summary
Public Members | ||
public |
criterion: * |
|
public |
maxDepth: * |
|
public |
numFeatures: * |
|
public |
numFeaturesInt: * |
|
public |
tree: * |
Method Summary
Public Methods | ||
public |
buildTree(XSub: Array<Array<number>>, ySub: Array<mixed>, depth: number): DecisionTreeNode Build a (sub-)tree from a set of samples. |
|
public |
calculateImpurity(groups: Array<Array<mixed>>): number Calculate the impurity for multiple groups of labels. |
|
public |
calculateWeightedImpurity(groups: Array<Array<mixed>>, impurityCallback: function(labels: Array<number>): number): number Calculate the weighted impurity for multiple groups of labels. |
|
public |
entropy(labels: Array<mixed>): number Calculate the Shannon entropy of a set of labels. |
|
public |
findSplit(XSub: Array<Array<number>>, ySub: Array<mixed>, baseImpurity: number): DataSplit Find the best splitting feature and feature value for a set of data points. |
|
public |
gini(labels: Array<mixed>): number Calculate the Gini coefficient of a set of labels. |
|
public |
predict(X: *): * Make a prediction for a data set. |
|
public |
predictSample(sampleFeatures: Array<number>): mixed Make a prediction for a single sample. |
|
public |
splitSamples(XSub: Array<number>, ySub: Array<mixed>, fInd: number, splitValue: number): DataSplitGroups Split a set of samples into two groups by some splitting value for a feature. |
|
public |
train(X: *, y: *) Train the supervised learning algorithm on a dataset. |
Inherited Summary
From class Estimator | ||
public abstract |
Make a prediction for a data set. |
|
public abstract |
Train the supervised learning algorithm on a dataset. |
Public Constructors
public constructor(optionsUser: Object) source
Constructor. Initialize class members and store user-defined options.
Params:
Name | Type | Attribute | Description |
optionsUser | Object | | User-defined options for decision tree |
optionsUser.criterion | string | | Splitting criterion. Either 'gini', for the Gini coefficient, or 'entropy' for the Shannon entropy |
optionsUser.numFeatures | number | string | | Number of features to subsample at each node. Either a number (float), in which case the input fraction of features is used (e.g., 1.0 for all features), or a string. If string, 'sqrt' and 'log2' are supported, causing the algorithm to use sqrt(n) and log2(n) features, respectively (where n is the total number of features) |
optionsUser.maxDepth | number | | Maximum depth of the tree. The depth of the tree is the number of nodes in the longest path from the decision tree root to a leaf. It is an indicator of the complexity of the tree. Use -1 for no maximum depth |
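The numFeatures option above can be read as a rule for turning a fraction or a keyword into a feature count. The following is a minimal sketch of that rule, not the library's own code: the helper name resolveNumFeatures is hypothetical, and the rounding and the clamping to the range 1..n are assumptions that the option description does not specify.

```javascript
// Hypothetical helper: turn the `numFeatures` option into an integer count.
// `n` is the total number of features in the data set.
// Assumption: the result is rounded and clamped to [1, n].
function resolveNumFeatures(numFeatures, n) {
  let count;
  if (numFeatures === 'sqrt') {
    count = Math.sqrt(n);
  } else if (numFeatures === 'log2') {
    count = Math.log2(n);
  } else if (typeof numFeatures === 'number') {
    // A float is interpreted as the fraction of features to use
    count = numFeatures * n;
  } else {
    throw new Error(`Unsupported numFeatures value: ${numFeatures}`);
  }
  return Math.min(n, Math.max(1, Math.round(count)));
}
```

Under these assumptions, 1.0 selects all features, while 'sqrt' and 'log2' both select 4 features out of 16.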
Public Members
public criterion: * source
public maxDepth: * source
public numFeatures: * source
public numFeaturesInt: * source
public tree: * source
Public Methods
public buildTree(XSub: Array<Array<number>>, ySub: Array<mixed>, depth: number): DecisionTreeNode source
Build a (sub-)tree from a set of samples.
public calculateImpurity(groups: Array<Array<mixed>>): number source
Calculate the impurity for multiple groups of labels. The impurity criterion used can be specified by the user through the user-defined options.
public calculateWeightedImpurity(groups: Array<Array<mixed>>, impurityCallback: function(labels: Array<number>): number): number source
Calculate the weighted impurity for multiple groups of labels. The returned impurity is calculated as the weighted sum of the impurities of the individual groups, where the weights are determined by the number of samples in the group.
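As an illustration of the weighted sum described above, here is a minimal standalone sketch. It assumes each group's weight is its share of the total sample count, which is one common reading of "determined by the number of samples in the group"; the function name weightedImpurity is hypothetical and this is not the method's actual source.

```javascript
// Weighted impurity of several label groups.
// Assumption: weight of a group = group size / total number of labels.
function weightedImpurity(groups, impurityCallback) {
  const total = groups.reduce((sum, group) => sum + group.length, 0);
  if (total === 0) {
    return 0; // no samples at all: treat as pure
  }
  return groups.reduce(
    (sum, group) => sum + (group.length / total) * impurityCallback(group),
    0
  );
}
```

Any impurity function taking an array of labels and returning a number can be passed as impurityCallback.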
public entropy(labels: Array<mixed>): number source
Calculate the Shannon entropy of a set of labels.
Params:
Name | Type | Attribute | Description |
labels | Array<mixed> | Array of predicted labels |
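For reference, the Shannon entropy of a label array can be sketched as follows. The base-2 logarithm is an assumption (the page does not state the base), and the function name shannonEntropy is hypothetical.

```javascript
// Shannon entropy H = -sum(p_i * log2(p_i)) over the label frequencies p_i.
// Assumption: base-2 logarithm, so the result is in bits.
function shannonEntropy(labels) {
  const counts = new Map();
  for (const label of labels) {
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / labels.length;
    entropy -= p * Math.log2(p);
  }
  return entropy; // 0 for an empty or single-class array
}
```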
public findSplit(XSub: Array<Array<number>>, ySub: Array<mixed>, baseImpurity: number): DataSplit source
Find the best splitting feature and feature value for a set of data points.
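One way to picture the search is an exhaustive scan over features and candidate thresholds, keeping the split with the largest impurity reduction relative to baseImpurity. The sketch below is written under several assumptions that this page does not confirm: every observed feature value is tried as a threshold, Gini impurity is the criterion, feature subsampling (numFeatures) is ignored, and the returned object shape is hypothetical rather than the actual DataSplit type.

```javascript
// Gini impurity of a label array (0 for an empty or pure array).
function giniOf(labels) {
  if (labels.length === 0) return 0;
  const counts = new Map();
  for (const label of labels) counts.set(label, (counts.get(label) || 0) + 1);
  let sumSquares = 0;
  for (const count of counts.values()) sumSquares += (count / labels.length) ** 2;
  return 1 - sumSquares;
}

// Hypothetical exhaustive split search over all features and observed values.
function findBestSplit(X, y, baseImpurity) {
  let best = { feature: -1, value: null, gain: 0 };
  const numFeatures = X.length > 0 ? X[0].length : 0;
  for (let f = 0; f < numFeatures; f += 1) {
    for (const value of new Set(X.map((sample) => sample[f]))) {
      const left = [];
      const right = [];
      // Samples below the threshold go left, all others go right
      X.forEach((sample, i) => (sample[f] < value ? left : right).push(y[i]));
      if (left.length === 0 || right.length === 0) continue;
      const weighted =
        (left.length * giniOf(left) + right.length * giniOf(right)) / y.length;
      const gain = baseImpurity - weighted;
      if (gain > best.gain) best = { feature: f, value, gain };
    }
  }
  return best; // feature is -1 when no split reduces the impurity
}
```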
public gini(labels: Array<mixed>): number source
Calculate the Gini coefficient of a set of labels.
Params:
Name | Type | Attribute | Description |
labels | Array<mixed> | Array of predicted labels |
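In the decision tree literature this quantity is usually called the Gini impurity, defined as one minus the sum of squared class frequencies. A minimal sketch under that assumption follows; the function name giniImpurity is hypothetical and the method's actual implementation may differ.

```javascript
// Gini impurity G = 1 - sum(p_i^2) over the label frequencies p_i.
function giniImpurity(labels) {
  if (labels.length === 0) {
    return 0; // assumption: an empty set counts as pure
  }
  const counts = new Map();
  for (const label of labels) {
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  let sumSquares = 0;
  for (const count of counts.values()) {
    const p = count / labels.length;
    sumSquares += p * p;
  }
  return 1 - sumSquares;
}
```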
public predict(X: *): * source
Make a prediction for a data set.
Override:
Estimator#predict
Params:
Name | Type | Attribute | Description |
X | * |
Return:
* |
public predictSample(sampleFeatures: Array<number>): mixed source
Make a prediction for a single sample.
Return:
mixed | Prediction. Label predicted by the decision tree for the sample |
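Predicting a single sample with a trained tree typically means walking from the root to a leaf, comparing one feature against a threshold at each internal node. The sketch below illustrates this traversal with a hypothetical node shape (type, feature, threshold, left, right, prediction); the fields of the actual DecisionTreeNode type are not documented on this page and may differ.

```javascript
// Walk a tree of plain objects from the root down to a leaf.
// Assumed node shapes (hypothetical):
//   internal: { type: 'node', feature, threshold, left, right }
//   leaf:     { type: 'leaf', prediction }
function predictWithTree(root, sampleFeatures) {
  let current = root;
  while (current.type === 'node') {
    // Values below the threshold follow the left branch, all others the right
    current = sampleFeatures[current.feature] < current.threshold
      ? current.left
      : current.right;
  }
  return current.prediction;
}
```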
public splitSamples(XSub: Array<number>, ySub: Array<mixed>, fInd: number, splitValue: number): DataSplitGroups source
Split a set of samples into two groups by some splitting value for a feature. The samples with a feature value lower than the split value go to the left (first) group, and the other samples go to the right (second) group.
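The splitting rule can be sketched as follows. The returned object shape (left and right, each holding features and labels) is a hypothetical stand-in for the DataSplitGroups type, whose real structure is not shown on this page, and the sketch assumes the samples are given as an array of feature arrays.

```javascript
// Split samples on feature index `fInd`: values strictly lower than
// `splitValue` go to the left group, all other values go to the right group.
function splitByFeature(X, y, fInd, splitValue) {
  const left = { features: [], labels: [] };
  const right = { features: [], labels: [] };
  X.forEach((sample, i) => {
    const group = sample[fInd] < splitValue ? left : right;
    group.features.push(sample);
    group.labels.push(y[i]);
  });
  return { left, right };
}
```

Note that a sample whose feature value equals the split value lands in the right group under this rule.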
public train(X: *, y: *) source
Train the supervised learning algorithm on a dataset.
Override:
Estimator#train
Params:
Name | Type | Attribute | Description |
X | * | ||
y | * |