Which of the following is an example of unsupervised learning? Select one:
a. Classifying emails as spam or not spam
b. Clustering customers based on their buying patterns
c. Using a chatbot to respond to customer inquiries
d. Predicting the next word in a sentence
The correct answer and explanation is:
Correct Answer: b. Clustering customers based on their buying patterns
Explanation (300 words):
Unsupervised learning is a type of machine learning where the model is not provided with labeled data. Instead, it is given input data without explicit instructions on what to do with it. The goal is for the system to find hidden patterns or intrinsic structures in the data. This contrasts with supervised learning, where the model learns from input-output pairs (i.e., data with labels).
Among the given options:
- a. Classifying emails as spam or not spam is an example of supervised learning because the model is trained on a labeled dataset (spam vs. not spam).
- b. Clustering customers based on their buying patterns is a classic example of unsupervised learning. There are no labels provided (e.g., “loyal customer” or “bargain shopper”); instead, the algorithm groups customers based on similarities in behavior or characteristics without prior categorization.
- c. Using a chatbot to respond to customer inquiries typically involves natural language processing models trained on labeled datasets (supervised), or reinforcement learning in some cases, but not unsupervised learning by default.
- d. Predicting the next word in a sentence (as in language modeling) is also typically a supervised or self-supervised learning task, where models are trained using large amounts of labeled text (the current and next word pairs), even if the labels are generated from the data itself.
Clustering (as in choice b) is a common unsupervised learning technique. Algorithms like K-means, DBSCAN, and hierarchical clustering are designed to group data based on feature similarity without prior knowledge of categories. These techniques are especially valuable in exploratory data analysis, customer segmentation, market research, and image compression.
Thus, b is the correct answer because it best exemplifies how unsupervised learning works: discovering structure in unlabeled data.