In the last part of this data science tutorial, we saw how to establish the correlation between the typical process parameters of an FDM 3D printing machine (layer height, nozzle temperature, material, etc.) and the printed part quality parameters (strength, elongation, etc.).
In this part we will actually build different machine learning models to predict the quality of the additively manufactured part, and compare the models in terms of accuracy and over-fitting.
If you remember, the dataset has the following features:
layer_height, wall_thickness, infill_density, infill_pattern, nozzle_temperature, bed_temperature, print_speed, material, fan_speed, roughness, tension_strenght, elongation
And the first few instances of our dataset look like this:
| | layer_height | wall_thickness | infill_density | infill_pattern | nozzle_temperature | bed_temperature | print_speed | material | fan_speed | roughness | tension_strenght | elongation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.02 | 8 | 90 | grid | 220 | 60 | 40 | abs | 0 | 25 | 18 | 1.2 |
| 1 | 0.02 | 7 | 90 | honeycomb | 225 | 65 | 40 | abs | 25 | 32 | 16 | 1.4 |
| 2 | 0.02 | 1 | 80 | grid | 230 | 70 | 40 | abs | 50 | 40 | 8 | 0.8 |
| 3 | 0.02 | 4 | 70 | honeycomb | 240 | 75 | 40 | abs | 75 | 68 | 10 | 0.5 |
| 4 | 0.02 | 6 | 90 | grid | 250 | 80 | 40 | abs | 100 | 92 | 5 | 0.7 |
Out of these, infill_pattern and material are categorical in nature, so we need to encode them as numbers:
#converting text labels to numbers
from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()
df['infill_pattern'] = labelencoder.fit_transform(df['infill_pattern'])
labelencoder1 = LabelEncoder()
df['material'] = labelencoder1.fit_transform(df['material'])
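If you want to check which number was assigned to which label, the fitted encoders keep that mapping; a quick optional check:
#inspecting the label-to-number mapping learned by the encoders (optional)
print(list(labelencoder.classes_))    #index in this list = encoded value for infill_pattern
print(list(labelencoder1.classes_))   #index in this list = encoded value for material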
Now if we check the dataset, it looks like:
| | layer_height | wall_thickness | infill_density | infill_pattern | nozzle_temperature | bed_temperature | print_speed | material | fan_speed | roughness | tension_strenght | elongation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.02 | 8 | 90 | 0 | 220 | 60 | 40 | 0 | 0 | 25 | 18 | 1.2 |
| 1 | 0.02 | 7 | 90 | 1 | 225 | 65 | 40 | 0 | 25 | 32 | 16 | 1.4 |
| 2 | 0.02 | 1 | 80 | 0 | 230 | 70 | 40 | 0 | 50 | 40 | 8 | 0.8 |
| 3 | 0.02 | 4 | 70 | 1 | 240 | 75 | 40 | 0 | 75 | 68 | 10 | 0.5 |
| 4 | 0.02 | 6 | 90 | 0 | 250 | 80 | 40 | 0 | 100 | 92 | 5 | 0.7 |
Please observe the change in the values of the two categorical features.
We now want to normalize the non-categorical features:
#selecting all the columns except the ones with categorical values
df1 = df[df.columns.difference(["infill_pattern", "material"])]
normalized_df = (df1 - df1.min()) / (df1.max() - df1.min())
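The same min-max scaling can also be done with scikit-learn's MinMaxScaler; a minimal equivalent sketch, assuming df1 is the numeric-only frame created above:
#equivalent min-max normalization using scikit-learn (sketch)
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
normalized_df = pd.DataFrame(scaler.fit_transform(df1), columns=df1.columns, index=df1.index)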
Joining the normalized non-categorical features with the encoded categorical features:
#joining the normalized numeric columns back with the encoded categorical columns
n_df1 = normalized_df.join(df[["infill_pattern", "material"]])
The dataset now looks like:
n_df1.head()
| | bed_temperature | elongation | fan_speed | infill_density | layer_height | nozzle_temperature | print_speed | roughness | tension_strenght | wall_thickness |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 60 | 1.2 | 0 | 90 | 0.02 | 220 | 40 | 25 | 18 | 8 |
| 1 | 65 | 1.4 | 25 | 90 | 0.02 | 225 | 40 | 32 | 16 | 7 |
| 2 | 70 | 0.8 | 50 | 80 | 0.02 | 230 | 40 | 40 | 8 | 1 |
| 3 | 75 | 0.5 | 75 | 70 | 0.02 | 240 | 40 | 68 | 10 | 4 |
| 4 | 80 | 0.7 | 100 | 90 | 0.02 | 250 | 40 | 92 | 5 | 6 |
n_df1_test_train=n_df1
In this dataset there are three target variables (roughness, tension_strenght, elongation). To keep things simple we are not going to use Multi-Target Regression (MTR); instead we will separate the three targets and run a regression for each of them:
y1=n_df1_test_train[['roughness']]
y2=n_df1_test_train[['tension_strenght']]
y3=n_df1_test_train[['elongation']]
X=n_df1_test_train[n_df1_test_train.columns.difference(['roughness', 'tension_strenght', 'elongation'])]
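For reference, scikit-learn can also fit all three targets at once by wrapping a single-output regressor; the following is only a sketch of what that multi-target approach would look like, and it is not used in the rest of this post:
#sketch only: fitting all three targets together with a multi-output wrapper
from sklearn.multioutput import MultiOutputRegressor
from sklearn.linear_model import LinearRegression
y_all = n_df1_test_train[['roughness', 'tension_strenght', 'elongation']]
multi_reg = MultiOutputRegressor(LinearRegression()).fit(X, y_all)
print(multi_reg.predict(X[:1]))   #predicts roughness, tension_strenght and elongation for the first row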
We now split the dataset into training and test sets, once for each target:
from sklearn.model_selection import train_test_split
#note: the roughness split uses a larger test fraction (0.5) than the other two targets (0.33)
X1_train, X1_test, y1_train, y1_test = train_test_split(X, y1, test_size=0.5, random_state=42)
X2_train, X2_test, y2_train, y2_test = train_test_split(X, y2, test_size=0.33, random_state=42)
X3_train, X3_test, y3_train, y3_test = train_test_split(X, y3, test_size=0.33, random_state=42)
We will then use the split datasets to build different machine learning models (linear regression, polynomial regression, MLP regression, KNN regression, ridge regression). We will train each model on its training split and evaluate its accuracy on the corresponding test split:
#linear regression
from sklearn.linear_model import LinearRegression
linreg = LinearRegression().fit(X1_train, y1_train)
print('R-squared score X1-roughness(training): {:.3f}'
.format(linreg.score(X1_train, y1_train)))
print('R-squared score X1-roughness(test): {:.3f}'
.format(linreg.score(X1_test, y1_test)))
R-squared score X1-roughness(training): 0.907
R-squared score X1-roughness(test): 0.746
from sklearn.linear_model import LinearRegression
linreg2 = LinearRegression().fit(X2_train, y2_train)
print('R-squared score X2-tension_strength(training): {:.3f}'
.format(linreg2.score(X2_train, y2_train)))
print('R-squared score X2-tension_strength(test): {:.3f}'
.format(linreg2.score(X2_test, y2_test)))
R-squared score X2-tension_strength(training): 0.652
R-squared score X2-tension_strength(test): 0.662
from sklearn.linear_model import LinearRegression
linreg3 = LinearRegression().fit(X3_train, y3_train)
print('R-squared score X3-elongation(training): {:.3f}'
.format(linreg3.score(X3_train, y3_train)))
print('R-squared score X3-elongation(test): {:.3f}'
.format(linreg3.score(X3_test, y3_test)))
R-squared score X3-elongation(training): 0.718
R-squared score X3-elongation(test): 0.663
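R-squared is not the only way to judge these fits; error metrics expressed in the units of the target are often easier to interpret. Below is a small optional sketch for the roughness model fitted above:
#mean absolute error and root mean squared error for the roughness model (sketch)
from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np
y1_pred = linreg.predict(X1_test)
print('MAE roughness: {:.3f}'.format(mean_absolute_error(y1_test, y1_pred)))
print('RMSE roughness: {:.3f}'.format(np.sqrt(mean_squared_error(y1_test, y1_pred))))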
# Fitting Polynomial Regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree = 4)
X1_train_poly = poly.fit_transform(X1_train)
X1_test_poly = poly.transform(X1_test)
lin2 = LinearRegression()
lin2.fit(X1_train_poly, y1_train)
print('R-squared score X1_poly-roughness(training): {:.3f}'
.format(lin2.score(X1_train_poly, y1_train)))
print('R-squared score X1_poly-roughness(test): {:.3f}'
.format(lin2.score(X1_test_poly, y1_test)))
R-squared score X1_poly-roughness(training): 1.000
R-squared score X1_poly-roughness(test): 0.619
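The same degree-4 polynomial model can be written more compactly as a scikit-learn Pipeline, which keeps the feature expansion and the regression together so the test set is transformed automatically; a minimal sketch for the roughness target:
#polynomial regression written as a pipeline (sketch)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
poly_pipe = make_pipeline(PolynomialFeatures(degree = 4), LinearRegression())
poly_pipe.fit(X1_train, y1_train)
print('R-squared score X1_polypipe-roughness(test): {:.3f}'.format(poly_pipe.score(X1_test, y1_test)))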
# Fitting Polynomial Regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree = 4)
X2_train_poly = poly.fit_transform(X2_train)
X2_test_poly = poly.transform(X2_test)
lin2_2 = LinearRegression()
lin2_2.fit(X2_train_poly, y2_train)
print('R-squared score X2_poly-tensileStrength(training): {:.3f}'
.format(lin2_2.score(X2_train_poly, y2_train)))
print('R-squared score X2_poly-tensileStrength(test): {:.3f}'
.format(lin2_2.score(X2_test_poly, y2_test)))
R-squared score X2_poly-tensileStrength(training): 1.000
R-squared score X2_poly-tensileStrength(test): -0.713
# Fitting Polynomial Regression to the dataset
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree = 4)
X3_train_poly = poly.fit_transform(X3_train)
X3_test_poly = poly.transform(X3_test)
lin2_3 = LinearRegression()
lin2_3.fit(X3_train_poly, y3_train)
print('R-squared score X3_poly-elongation(training): {:.3f}'
.format(lin2_3.score(X3_train_poly, y3_train)))
print('R-squared score X3_poly-elongation(test): {:.3f}'
.format(lin2_3.score(X3_test_poly, y3_test)))
R-squared score X3_poly-elongation(training): 1.000
R-squared score X3_poly-elongation(test): 0.395
#neural net (MLP) regression
from sklearn.neural_network import MLPRegressor
mlpreg = MLPRegressor(hidden_layer_sizes = [100,100], activation = 'relu', alpha = 0.0003, solver = 'lbfgs').fit(X1_train, y1_train.values.ravel())
print('R-squared score X1_nn(training): {:.3f}'
.format(mlpreg.score(X1_train, y1_train)))
print('R-squared score X1_nn(test): {:.3f}'
.format(mlpreg.score(X1_test, y1_test)))
R-squared score X1_nn(training): 1.000
R-squared score X1_nn(test): 0.828
#neural net (MLP) regression - X2
from sklearn.neural_network import MLPRegressor
mlpreg2 = MLPRegressor(hidden_layer_sizes = [100,100], activation = 'relu', alpha = 0.0003, solver = 'lbfgs').fit(X2_train, y2_train.values.ravel())
print('R-squared score X2_nn-tensileStrength(training): {:.3f}'
.format(mlpreg2.score(X2_train, y2_train)))
print('R-squared score X2_nn-tensileStrength(test): {:.3f}'
.format(mlpreg2.score(X2_test, y2_test)))
R-squared score X2_nn-tensileStrength(training): 1.000
R-squared score X2_nn-tensileStrength(test): 0.532
#neural net (MLP) regression - X3
from sklearn.neural_network import MLPRegressor
mlpreg3 = MLPRegressor(hidden_layer_sizes = [100,100], activation = 'relu', alpha = 0.0003, solver = 'lbfgs').fit(X3_train, y3_train.values.ravel())
print('R-squared score X3_nn-elongation(training): {:.3f}'
.format(mlpreg3.score(X3_train, y3_train)))
print('R-squared score X3_nn-elongation(test): {:.3f}'
.format(mlpreg3.score(X3_test, y3_test)))
R-squared score X3_nn-elongation(training): 1.000
R-squared score X3_nn-elongation(test): 0.676
#KNN regression
from sklearn.neighbors import KNeighborsRegressor
knnreg = KNeighborsRegressor(n_neighbors = 1).fit(X1_train, y1_train)
print('R-squared score X1_knnreg(training): {:.3f}'
.format(knnreg.score(X1_train, y1_train)))
print('R-squared score X1_knnreg(test): {:.3f}'
.format(knnreg.score(X1_test, y1_test)))
R-squared score X1_knnreg(training): 1.000
R-squared score X1_knnreg(test): -0.150
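With n_neighbors = 1 the model simply memorises the training points, which explains the perfect training score and the poor test score. A common way to pick a better neighbourhood size is a cross-validated grid search; below is a minimal sketch for the roughness target (the search range is just an assumption):
#choosing n_neighbors by cross-validated grid search (sketch)
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
grid = GridSearchCV(KNeighborsRegressor(), {'n_neighbors': list(range(1, 11))}, cv=5)
grid.fit(X1_train, y1_train.values.ravel())
print('best n_neighbors:', grid.best_params_)
print('R-squared score X1_knnreg_cv(test): {:.3f}'.format(grid.score(X1_test, y1_test)))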
#KNN regression-X2
from sklearn.neighbors import KNeighborsRegressor
knnreg2 = KNeighborsRegressor(n_neighbors = 1).fit(X2_train, y2_train)
print('R-squared score X2_knnreg-tensileStrength(training): {:.3f}'
.format(knnreg2.score(X2_train, y2_train)))
print('R-squared score X2_knnreg-tensileStrength(test): {:.3f}'
.format(knnreg2.score(X2_test, y2_test)))
R-squared score X2_knnreg-tensileStrength(training): 1.000
R-squared score X2_knnreg-tensileStrength(test): 0.014
#KNN regression-X3
from sklearn.neighbors import KNeighborsRegressor
knnreg3 = KNeighborsRegressor(n_neighbors = 1).fit(X3_train, y3_train)
print('R-squared score X3_knnreg-elongation(training): {:.3f}'
.format(knnreg3.score(X3_train, y3_train)))
print('R-squared score X3_knnreg-elongation(test): {:.3f}'
.format(knnreg3.score(X3_test, y3_test)))
R-squared score X3_knnreg-elongation(training): 1.000
R-squared score X3_knnreg-elongation(test): -0.304
#ridge regression
from sklearn.linear_model import Ridge
linridge = Ridge(alpha=20.0).fit(X1_train, y1_train)
print('R-squared score X1_ridgereg(training): {:.3f}'
.format(linridge.score(X1_train, y1_train)))
print('R-squared score X1_ridgereg(test): {:.3f}'
.format(linridge.score(X1_test, y1_test)))
R-squared score X1_ridgereg(training): 0.301
R-squared score X1_ridgereg(test): 0.169
#ridge regression-X2
from sklearn.linear_model import Ridge
linridge2 = Ridge(alpha=20.0).fit(X2_train, y2_train)
print('R-squared score X2_ridgereg-tensileStrength(training): {:.3f}'
.format(linridge2.score(X2_train, y2_train)))
print('R-squared score X2_ridgereg-tensileStrength(test): {:.3f}'
.format(linridge2.score(X2_test, y2_test)))
R-squared score X2_ridgereg-tensileStrength(training): 0.231
R-squared score X2_ridgereg-tensileStrength(test): 0.152
#ridge regression-X3
from sklearn.linear_model import Ridge
linridge3 = Ridge(alpha=20.0).fit(X3_train, y3_train)
print('R-squared score X3_ridgereg-elongation(training): {:.3f}'
.format(linridge3.score(X3_train, y3_train)))
print('R-squared score X3_ridgereg-elongation(test): {:.3f}'
.format(linridge3.score(X3_test, y3_test)))
R-squared score X3_ridgereg-elongation(training): 0.269
R-squared score X3_ridgereg-elongation(test): 0.220
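Before drawing conclusions, it helps to put the scores side by side. Below is a small sketch that tabulates the train/test R-squared of the roughness models fitted above (the polynomial model is left out because it needs the transformed features; the other two targets can be tabulated the same way):
#collecting train/test R-squared of the roughness models in one table (sketch)
import pandas as pd
models = {'linear': linreg, 'MLP': mlpreg, 'KNN': knnreg, 'ridge': linridge}
scores = pd.DataFrame({name: {'train R2': m.score(X1_train, y1_train),
                              'test R2': m.score(X1_test, y1_test)}
                       for name, m in models.items()}).T
print(scores)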
Conclusions
From the above comparisons we can see that polynomial regression, MLP regression and KNN regression all achieve a perfect R-squared score of 1.0 (100%) on the training set, but their scores drop drastically on the test set, sometimes even turning negative.
For linear regression, on the other hand, the test-set score stays close to the training-set score.
In other words, every model except linear regression and ridge regression is over-fitting the training dataset, while the ridge regression model is under-fitting.
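The under-fitting of the ridge model is at least partly down to the regularisation strength: alpha = 20 is quite aggressive for this data. As a possible follow-up, scikit-learn's RidgeCV can choose alpha by cross-validation; below is a minimal sketch for the roughness target (the alpha grid is only illustrative, and the chosen value is whatever the search finds):
#letting cross-validation pick the ridge regularisation strength (sketch)
from sklearn.linear_model import RidgeCV
ridge_cv = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0, 20.0]).fit(X1_train, y1_train)
print('chosen alpha:', ridge_cv.alpha_)
print('R-squared score ridge_cv-roughness(test): {:.3f}'.format(ridge_cv.score(X1_test, y1_test)))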
In a future post, I shall try to use the same dataset and explain how to use it for MTR.
That’s it for now, this is me Shibashis signing off for the day. Thanks for stopping by at mechGuru.com.
In case you want the complete code of both the parts, please visit my GitHub page.
Hi, I am Shibashis, a blogger by passion and an engineer by profession. I have written most of the articles for mechGuru.com. For more than a decade I have been closely associated with engineering design and manufacturing simulation technologies. I am a self-taught code hobbyist, presently in love with Python (OpenCV / ML / Data Science / AWS; 3000+ lines, 400+ hrs).