Building a flowchart in SmartPredict is as easy as dragging and dropping modules into the workspace.
As our flowchart is borrowed from a classification template, all that is left to do is configure the modules' parameters to meet our specific needs.
As a reminder, the flowchart we are going to build represents the ML pipeline, from the processing steps to the model that produces the predictions.
The default build flowchart is composed of the following elements:
the Dataframe Loader
the Features Selector
the ML Trainer
the Item Saver
the ML Evaluator
the Labeled Data Splitter
the Data Object Logger
and the Support Vector Classifier
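For readers who think in code, the modules listed above can be compared to a familiar scikit-learn workflow. The sketch below is only an analogy written in plain Python: it does not show how SmartPredict implements these modules, and the toy data and every column name other than 'Survived' are assumptions made for illustration.

```python
import pickle

import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Dataframe Loader: a tiny made-up dataframe stands in for the real dataset.
# "Pclass" and "Fare" are hypothetical feature columns.
df = pd.DataFrame({
    "Pclass": [1, 3, 2, 3, 1, 2, 3, 1, 2, 3, 1, 3],
    "Fare": [71.3, 7.9, 13.0, 8.1, 53.1, 26.0, 7.2, 80.0, 10.5, 8.7, 60.0, 7.8],
    "Survived": [1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0],
})

# Features Selector: choose the feature columns and the label column.
X = df[["Pclass", "Fare"]]
y = df["Survived"]

# Labeled Data Splitter: hold out part of the labeled data for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Support Vector Classifier + ML Trainer: fit the model on the training split.
model = SVC().fit(X_train, y_train)

# ML Evaluator: score the trained model on the held-out split.
accuracy = accuracy_score(y_test, model.predict(X_test))

# Data Object Logger: print the metric so it can be inspected.
print(f"accuracy: {accuracy:.2f}")

# Item Saver: serialize the trained model so it can be reused later.
saved = pickle.dumps(model)
restored = pickle.loads(saved)
```

Each comment names the module that plays the corresponding role in the flowchart; in SmartPredict these steps are configured through each module's parameters rather than written by hand.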
We also need to add:
an Ordinal Encoder, to convert the categorical values into integers that the model can handle.
either a processing pipeline with the original, uncleaned dataset, or a Dataframe Loader with a clean dataset
The Dataframe Loader loads the dataframe from a clean dataset. The clean dataset is the training dataset we started with, after it has been cleaned with the data processor.
In the 'Columns to keep' and 'Columns to drop' fields, enter the names of the corresponding columns.
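To make the effect of these two fields concrete, here is a minimal pandas sketch of keeping versus dropping columns. It assumes the fields behave like ordinary column selection; the dataframe and the column names "PassengerId", "Name", and "Sex" are hypothetical.

```python
import pandas as pd

# Hypothetical dataframe; only 'Survived' is named in this tutorial.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Name": ["A", "B", "C"],
    "Sex": ["male", "female", "female"],
    "Survived": [0, 1, 1],
})

# 'Columns to keep': list the columns that should remain.
kept = df[["Sex", "Survived"]]

# 'Columns to drop': list the columns that should be removed instead.
dropped = df.drop(columns=["PassengerId", "Name"])
```

Both variables end up with the same two columns, so in this sketch the two fields are simply two ways of describing the same selection.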
The second flowchart shown above is another way to structure our workflow, this time with a processing pipeline. It is the one we are going to use for all the following steps.
To obtain the required configuration starting from the default flowchart, we also need to add:
the processing pipeline
the unprocessed dataset
an ordinal encoder
The Features Selector is part of the Core modules and is located under the Data Selection sub-tab. To configure it, select the features from the dataset inside the drag and drop area, then select 'Survived' as the label.
Ordinal encoding deals with categorical data such as what we have here. You might already be familiar with the ordinal encoding function. If you need more information, check its official documentation.
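As a quick illustration of what ordinal encoding does, the sketch below uses scikit-learn's `OrdinalEncoder` on a small made-up dataframe. It assumes the SmartPredict module behaves similarly; the column names "Sex" and "Embarked" and their values are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical categorical columns, used only for illustration.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "Q", "S"],
})

encoder = OrdinalEncoder()
encoded = encoder.fit_transform(df[["Sex", "Embarked"]])

# By default, each column's categories are sorted and numbered from 0:
#   Sex:      female -> 0, male -> 1
#   Embarked: C -> 0, Q -> 1, S -> 2
print(encoder.categories_)
print(encoded)
```

Note that the resulting integers reflect the sorted order of the category names, not any meaningful ranking between them.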
The Ordinal Encoder module's configuration is shown below: