Tom Mitchell provides a definition.
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
ex) playing checkers.
E = the experience of playing many games of checkers
T = the task of playing checkers.
P = the probability that the program can be assigned to one of two broad classifications.
In general, any machine learning problem can be assigned to one of two broad classifications.
Supervised learning and Unsupervised learning.