The example uses Fisher's classic iris data. The first Python step (at the top) simply computes some quartiles for the numeric columns in the iris data. This is output from the step as a pandas DataFrame, where each row corresponds to one of the quartiles computed (25th, 50th and 75th), and each column holds the value for one of the numeric fields in the iris data. The second Python step from the top uses the scikit-learn decomposition routine to compute a principal components analysis on the iris data and then transforms the iris data into the PCA space, which is then the output of the step. The third Python step from the top uses the matplotlib library and plotting routines from the pandas library to compute some visualisations of the iris data (scatter plot matrix, Andrews curves, parallel coordinates and RadViz). These are then extracted as binary PNG data from the Python environment and saved to files in the same directory that the transformation was loaded from. The two Python steps at the bottom of the transformation learn a decision tree model and then use that model to score the iris data, respectively. The model is saved (from the Python environment) to the directory that the transformation was loaded from.

It is an amazing opportunity to incorporate pure Python scripts inside PDI. But sometimes I get an error after saving and then trying to open the CPythonExecutor step. I use PDI 7.0 Community Edition on OS X and Ubuntu.
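
For reference, the quartile computation described for the first Python step can be sketched in plain pandas, outside PDI. This is only a minimal sketch, not the script from the example transformation: the column names and the six sample rows are assumptions chosen for illustration, and inside PDI the incoming rows would come from the step rather than from a hand-built DataFrame.

```python
import pandas as pd

# Small illustrative sample of iris-style rows (an assumption; the real
# transformation feeds the full data set into the step).
iris = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "petal_width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Keep only the numeric columns, then compute the 25th, 50th and 75th
# percentiles. The result is a DataFrame with one row per quartile and
# one column per numeric field.
numeric = iris.select_dtypes(include="number")
quartiles = numeric.quantile([0.25, 0.5, 0.75])

print(quartiles)
```

The resulting `quartiles` frame has the shape described above: three rows (indexed 0.25, 0.5, 0.75) and one column for each numeric field, with the non-numeric `species` column dropped.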