What is the Difference between Supervised and Unsupervised Machine Learning?

Nora Zajzon
3 min read · Oct 29, 2020

While you may have heard of Machine Learning, there is a good chance you have not seen a brief overview of its main methods. Supervised and Unsupervised learning are the two main subcategories of Machine Learning, which is itself a subset of Artificial Intelligence (AI).

Supervised Learning

Supervised learning develops a model that makes predictions based on the input and output data it receives. The machine learns from known inputs and their known responses. In other words, it is a learning method that uses a so-called ground truth: prior knowledge of what the response should be for the given samples. This knowledge is provided in the form of a large dataset tagged with the answers the algorithm should generate on its own when given that input. The algorithm learns from these training examples and, when given new input, uses the resulting prediction model to predict the correct label. This method of machine learning is useful when you need to predict responses for new data based on data whose responses are already known. The two main areas where this type of prediction is useful are classification and regression.

Classification is a supervised learning technique used whenever an algorithm needs to decide which category the input data belongs to. A common example of this technique is determining whether an email is spam or genuine. Using classification, the computer is able to decide where the email belongs and categorize it accordingly.
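To make the spam example concrete, here is a minimal sketch of a classifier that learns from labeled mail. It assumes a tiny, made-up training set and uses a simple naive Bayes word count as a stand-in technique. The names TRAINING_MAIL, train and classify are hypothetical and only illustrate the idea.

```python
import math
from collections import Counter

# Hypothetical labeled training data (the "ground truth").
TRAINING_MAIL = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting moved to monday morning", "genuine"),
    ("lunch on monday with the team", "genuine"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes score."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Log prior: how common this label is in the training set.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(counts.values())
        for word in text.split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, label_counts = train(TRAINING_MAIL)
print(classify("claim your free prize", word_counts, label_counts))   # spam
print(classify("team meeting on monday", word_counts, label_counts))  # genuine
```

The output is always one of the discrete labels seen during training, which is what makes this a classification task.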

Regression also predicts responses, but its outputs are continuous values rather than the discrete categories of classification. Using one of the many regression algorithms, the computer can predict quantities that vary and fluctuate rather than fall into fixed categories. Regression is helpful, for example, in the medical field, where data gathered from previous patients can be used to estimate a patient’s risk of a heart attack.
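As a rough illustration of a continuous prediction, the sketch below fits a straight line to known inputs and responses using ordinary least squares, one of many possible regression techniques. The data is made up, and fit_line and predict are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict a continuous response for a new input."""
    return slope * x + intercept

# Made-up training data that happens to follow y = 2x + 1 exactly.
known_inputs = [1, 2, 3, 4]
known_responses = [3, 5, 7, 9]

slope, intercept = fit_line(known_inputs, known_responses)
print(predict(5, slope, intercept))  # 11.0
```

Unlike the classifier, the prediction here can be any number, not just one of a fixed set of labels.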

Unsupervised Learning

Unsupervised learning is the other main technique used in machine learning. Here the algorithm is given unlabeled input data, with no explicit instructions on what to do with it and no specific desired outcome. The algorithm then attempts to find structure in the unstructured input by extracting useful features and deriving inferences. Unlike supervised learning, this technique has no labeled output to predict. Instead, it relies on techniques such as clustering and dimensionality reduction.

Clustering finds hidden patterns of similarity and groups the input elements into clusters. This technique is useful in marketing, and in presidential campaigns for identifying voters’ political orientation. It can also detect anomalies: input items that do not belong to any cluster signal unusual patterns. Banks can use this to detect fraud, for example when a credit card is used at the same time in different locations.
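The grouping idea can be sketched with k-means, one common clustering algorithm (the article does not name a specific one, so this is an assumption). The points and starting centroids below are made up, and squared_distance, assign and k_means are hypothetical names.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, assign(points, centroids)

# Made-up, unlabeled points that form two visible groups.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]

centroids, clusters = k_means(POINTS, centroids=[POINTS[0], POINTS[-1]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are supplied anywhere: the two groups emerge only from how close the points are to each other. A point far from every centroid would be a candidate anomaly.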

Dimensionality reduction is the method of eliminating redundant features in order to reduce processing cost. It learns the relationships between individual features and represents the data with a smaller set of features that capture those relationships, which results in far fewer features than we started with.
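One simple way to illustrate removing redundant features is a correlation filter: if a feature is almost perfectly correlated with one already kept, it adds little and can be dropped. This is a deliberately basic stand-in, not a full technique such as principal component analysis. The data, the 0.95 threshold and the names pearson and drop_redundant are all assumptions made for the example.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                       sum((y - mean_b) ** 2 for y in b))
    # A constant column has no spread; treat it as uncorrelated.
    return covariance / spread if spread else 0.0

def drop_redundant(columns, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with one already kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept

# Made-up data: height in inches carries the same information as height in cm.
FEATURES = {
    "height_cm": [150, 160, 170, 180],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "weight_kg": [70, 55, 80, 62],
}

reduced = drop_redundant(FEATURES)
print(list(reduced))
```

Here the two height columns describe the same thing in different units, so only one of them survives and the dataset shrinks from three features to two.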

Follow me on LinkedIn for more!
