Classification Model Generator

Description

Classification is a technique for sorting data into a given number of classes. A classification model learns from the input values in the training data and then predicts the class label, or category, of new data. The predicted classes are discrete, nominal values.
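
The idea of learning from labelled training data and then assigning a category to new data can be sketched in a few lines. This is only an illustration: it uses a nearest-neighbour rule, which is not one of the algorithms offered by this activity, and the data, labels, and the `predict` helper are hypothetical.

```python
import math

# Hypothetical training data: (height_cm, weight_kg) mapped to a class label.
training_rows = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]


def predict(features):
    """Return the label of the closest training row (1-nearest-neighbour)."""
    nearest = min(training_rows, key=lambda row: math.dist(row[0], features))
    return nearest[1]


# New, unlabelled data is mapped to one of the known categories.
print(predict((152.0, 53.0)))
print(predict((182.0, 88.0)))
```

The output is always one of the labels seen during training, which is what "discrete and nominal" means in practice.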

Properties

Input

  • Algorithm Type – Specifies the classification algorithm used for prediction. Select Random Forest, Support Vector Machine, or Decision Tree.
    • Random Forest – A meta estimator that fits several decision trees on various sub-samples of the dataset and uses averaging to improve predictive accuracy and control over-fitting.
    • Support Vector Machine – Represents the training data as points in space, divided into categories by a clear gap. New examples are mapped into the same space and assigned a category based on which side of the gap they fall.
    • Decision Tree – Produces a sequence of rules used to classify the data. It is simple to understand, requires little data preparation, and can handle both numerical and categorical data.
  • CSV File Path – Specify the path of the CSV file that contains the data set required for model creation.
  • Missing Values Handler – Datasets may contain missing values, which cause problems for many algorithms. The missing values handler identifies and replaces the missing values in each column of the input data before the model is trained.
  • Model Name – Specifies the name of the generated model.
  • Selected Dependent Columns – Specifies the dependent columns, that is, the columns whose values the model predicts.
  • Selected Independent Columns – Specifies the independent columns used to train the model.
  • Test Data Size – Specifies the fraction of the data set that is held out for testing. A value below 0.4 is recommended for the best model accuracy.
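
The data preparation implied by the input properties above can be sketched as follows. This is a minimal illustration, not the activity's actual implementation: the sample CSV, the column names, the helpers `fill_missing` and `split_rows`, and the replacement strategy (column mean for numeric columns, most common value otherwise) are all assumptions, since the documentation does not state how missing values are replaced or how the split is made.

```python
import csv
import io
import random
from collections import Counter
from statistics import mean

# Hypothetical CSV content; in practice it would be read from the CSV File Path.
SAMPLE_CSV = """age,income,segment,label
25,30000,a,no
,42000,b,yes
47,,a,yes
35,51000,,no
52,60000,a,yes
"""


def fill_missing(rows, numeric_columns):
    """Replace empty cells in place (assumed strategy: mean or most common value)."""
    for col in list(rows[0]):
        present = [r[col] for r in rows if r[col] != ""]
        if col in numeric_columns:
            fill = mean(float(v) for v in present)
            for r in rows:
                r[col] = float(r[col]) if r[col] != "" else fill
        else:
            fill = Counter(present).most_common(1)[0][0]
            for r in rows:
                if r[col] == "":
                    r[col] = fill
    return rows


def split_rows(rows, test_size, seed=0):
    """Shuffle the rows and hold out a test_size fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rows = fill_missing(rows, numeric_columns={"age", "income"})

independent = ["age", "income", "segment"]  # Selected Independent Columns
dependent = "label"                         # Selected Dependent Columns

train, test = split_rows(rows, test_size=0.2)  # Test Data Size
X_train = [[r[c] for c in independent] for r in train]
y_train = [r[dependent] for r in train]
print(len(train), len(test))
```

With five rows and a test size of 0.2, four rows are used for training and one is held out. The training rows would then be passed to whichever algorithm is selected under Algorithm Type.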

Misc

  • DisplayName – Add a display name to your activity.
  • Private – By default, the activity logs the values of its properties inside your workflow. If Private is selected, logging stops.

Output

  • Result – Name of the model generated by this activity.

Example