Machine Learning on a Cancer Dataset - Part 34

in programming •  7 years ago 

This is the final tutorial in which we discuss uncertainty estimation for machine learning in scikit-learn.

In the previous two videos we delved into uncertainty estimation for binary classification by looking at the 'decision_function' and 'predict_proba' methods. We used a support vector machine to categorize tumor samples as malignant or benign.

What happens if we work with a dataset where the data is labeled with more than two classes, that is, a non-binary (multiclass) situation? That's what we're getting into in this video tutorial.

First of all, we're using a different classifier - a GradientBoostingClassifier. Since the breast cancer dataset only has binary labels, we can't use it here, so we're using the 'iris' dataset instead, which, like the cancer dataset, comes preloaded in scikit-learn. The 'iris' dataset categorizes flowers into three species: setosa, versicolor, and virginica.
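As a rough sketch of the setup described above (not necessarily the exact code from the video), the snippet below loads the 'iris' dataset and fits a GradientBoostingClassifier on it. The train/test split, the `random_state=0` seed, and the variable names (`gbrt`, `X_train`, and so on) are my assumptions for illustration; the hyperparameters are left at their defaults.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Load the iris dataset that ships with scikit-learn
iris = load_iris()

# Hold out part of the data for evaluation (the seed is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0
)

# Fit a gradient boosting classifier with default hyperparameters
gbrt = GradientBoostingClassifier(random_state=0)
gbrt.fit(X_train, y_train)

# Three classes instead of two: the labels 0, 1, 2 map to the species names
print("Species names:", iris.target_names)
print("Class labels:", gbrt.classes_)

test_accuracy = gbrt.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```

The only real difference from the binary case so far is that `classes_` now holds three labels rather than two.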

So, let's see what uncertainty estimation looks like for this type of multiclass classification. Please watch the video below for the complete walkthrough.
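For readers who prefer text, here is a minimal sketch of what the two methods return in the multiclass case. It is my own illustration under assumed settings (default hyperparameters, `random_state=0`, hypothetical variable names), not a transcript of the video. The key point is that both 'decision_function' and 'predict_proba' return one column per class, so with three iris species each sample gets three values.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0
)
clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# One raw score per class for every sample: shape (n_samples, n_classes)
scores = clf.decision_function(X_test)
print("decision_function shape:", scores.shape)

# One probability per class for every sample; each row sums to 1
probs = clf.predict_proba(X_test)
print("predict_proba shape:", probs.shape)
print("Row sums (first 5):", probs[:5].sum(axis=1))

# The predicted class is the column with the highest probability
pred_from_probs = clf.classes_[np.argmax(probs, axis=1)]
print("First 5 predictions:", iris.target_names[pred_from_probs[:5]])
```

In the binary case 'decision_function' returned a single value per sample; here the shape is `(n_samples, 3)`, and a row with one probability close to 1 means the model is confident about that sample, while a flatter row signals more uncertainty.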


Previous videos in this series:

  1. Machine Learning on a Cancer Dataset - Part 30
  2. Machine Learning on a Cancer Dataset - Part 31
  3. Machine Learning on a Cancer Dataset - Part 32
  4. Machine Learning on a Cancer Dataset - Part 33


To stay in touch with me, follow @cristi


Cristi Vlad, Self-Experimenter and Author

