In unsupervised machine learning, the input data is not labeled and has no known result. A model is prepared by deducing structures present in the input data. This may be to extract general rules, or it may be through a mathematical process to reduce redundancy.
Example problems are clustering, dimensionality reduction, and association rule learning.
Example algorithms include the Apriori algorithm and k-Means.
Regression Algorithms
Ordinary Least Squares Regression (OLSR)
Multivariate Adaptive Regression Splines (MARS)
Locally Estimated Scatterplot Smoothing (LOESS)
Instance-based Algorithms
k-Nearest Neighbor (kNN)
Learning Vector Quantization (LVQ)
Self-Organizing Map (SOM)
Locally Weighted Learning (LWL)
Decision Tree Algorithms
Decision tree methods construct a model of decisions based on the actual values of attributes in the data. Decisions fork in tree structures until a prediction is made for a given record. Decision trees are trained on data for classification and regression problems; they are often fast and accurate, which makes them a big favorite in machine learning. The most popular decision tree algorithms are:
Classification and Regression Tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5 and C5.0 (different versions of a powerful approach)
Chi-squared Automatic Interaction Detection (CHAID)
Conditional Decision Trees
Bayesian Algorithms
Bayesian methods are those that explicitly apply Bayes' Theorem to problems such as classification and regression. The most popular Bayesian algorithms are:
Clustering Algorithms
Clustering, like regression, describes both a class of problems and a class of methods. Clustering methods are typically organized by modeling approach, such as centroid-based and hierarchical. All methods are concerned with using the inherent structures in the data to best organize it into groups of maximum commonality. The most popular clustering algorithms are:
Association Rule Learning Algorithms
Association rule learning methods extract rules that best explain observed relationships between variables in the data. These rules can discover important and useful associations in large multidimensional datasets that can be exploited by an organization. The most popular association rule learning algorithms are:
Artificial Neural Network Algorithms
These are models inspired by the structure of biological neural networks. They are a class of pattern matching commonly used for regression and classification problems, but they form an enormous subfield comprising hundreds of algorithms and variations. The most popular artificial neural network algorithms are:
Deep Learning Algorithms
Deep learning methods are a modern update to artificial neural networks that exploit abundant cheap computation. They are concerned with building much larger and more complex neural networks. The most popular deep learning algorithms are:
Deep Boltzmann Machine (DBM)
Deep Belief Networks (DBN)
Convolutional Neural Network (CNN)
Dimensionality Reduction Algorithms
Like clustering methods, dimensionality reduction seeks an inherent structure in the data, but in this case in order to summarize the data using less information. This can be useful to visualize high-dimensional data, or to simplify data for use in a supervised learning method. Many of these methods can be adapted for use in classification and regression. The most popular dimensionality reduction algorithms are:
Principal Component Analysis (PCA)
Principal Component Regression (PCR)
Partial Least Squares Regression (PLSR)
Multidimensional Scaling (MDS)
Linear Discriminant Analysis (LDA)
Mixture Discriminant Analysis (MDA)
Quadratic Discriminant Analysis (QDA)
Flexible Discriminant Analysis (FDA)
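As a concrete illustration of summarizing data with fewer dimensions, here is a minimal, stdlib-only Python sketch of the core step of PCA: finding the first principal component of 2-D data by power iteration on the covariance matrix. The function name and sample points are illustrative, not from the original text.

```python
def pca_first_component(data, iters=200):
    """First principal component of 2-D data via power iteration (a sketch)."""
    n = len(data)
    # Mean-center the data
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    centered = [(x - mx, y - my) for x, y in data]
    # 2x2 sample covariance matrix
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    # Power iteration: repeatedly multiply a vector by the covariance
    # matrix and normalize; it converges to the dominant eigenvector
    v = (1.0, 0.0)
    for _ in range(iters):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v

# Points lying roughly along the line y = x, so the first component
# should point close to the (1, 1) direction
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
pc1 = pca_first_component(points)
```

Projecting each point onto `pc1` would then summarize the two coordinates with a single number, which is the "less information" the text refers to.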
Ensemble Algorithms
Ensemble methods are models composed of multiple weaker models that are independently trained and whose predictions are combined in some way to make the overall prediction. Much effort is put into what types of weak learners to combine and the ways in which to combine them. This is a very powerful class of techniques and as such is very popular. The most popular ensemble algorithms are:
Bootstrapped Aggregation (Bagging)
Stacked Generalization (blending)
Gradient Boosting Machines (GBM)
Gradient Boosted Regression Trees (GBRT)
List of Common Machine Learning Algorithms
Naïve Bayes Classifier Machine Learning Algorithm
It would be difficult, if not impossible, to classify a web page, a document, an email, or other lengthy text manually. This is where the Naïve Bayes Classifier algorithm comes to the rescue. A classifier is a function that assigns a class label to an element of a population. For instance, spam filtering is a popular application of the Naïve Bayes algorithm: the spam filter is a classifier that assigns the label "Spam" or "Not Spam" to every email. Naïve Bayes is among the most popular learning methods grouped by similarity; it works on the well-known Bayes' Theorem of probability, performs a simple classification based on words, and is also used for subjective analysis of content.
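To make the spam-filtering example concrete, here is a minimal, stdlib-only Python sketch of a Naïve Bayes text classifier with add-one (Laplace) smoothing. The tiny training messages and function names are invented for illustration.

```python
from collections import Counter
import math

def train(messages):
    """messages: list of (text, label) pairs; returns word counts and class counts."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in messages:
        for word in text.lower().split():
            counts[label][word] += 1
        totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Pick the label maximizing log P(label) + sum of log P(word | label)."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    best_label, best_score = None, -math.inf
    for label in counts:
        # Log prior plus log likelihood with Laplace (add-one) smoothing,
        # which keeps unseen words from zeroing out the whole probability
        score = math.log(totals[label] / sum(totals.values()))
        denom = sum(counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

train_data = [
    ("win money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch tomorrow with the team", "ham"),
]
counts, totals = train(train_data)
label = classify("claim your free money", counts, totals)  # -> "spam"
```

The "naïve" assumption is visible in the inner loop: each word contributes its log-probability independently of the others.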
K Means Clustering Machine Learning Algorithm
K-Means is a widely used unsupervised machine learning algorithm for cluster analysis. It is a non-deterministic, iterative method that operates on a given dataset through a pre-defined number of clusters, k. The output of the K-Means algorithm is k clusters, with the input data partitioned among them.
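The iterative procedure can be sketched in stdlib Python as Lloyd's algorithm, alternating an assignment step and an update step; the sample points are invented for illustration:

```python
import random

def k_means(points, k, iters=20, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, then
    recompute each centroid as the mean of its cluster (a 2-D sketch)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # non-deterministic start in general
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                                  + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = (sum(p[0] for p in cluster) / len(cluster),
                                sum(p[1] for p in cluster) / len(cluster))
    return centroids, clusters

# Two well-separated blobs, so k=2 should recover them exactly
data = [(1, 1), (1.5, 2), (2, 1.5), (8, 8), (8.5, 9), (9, 8.5)]
centroids, clusters = k_means(data, k=2)
```

The non-determinism the text mentions comes from the random choice of initial centroids; different seeds can yield different final clusterings on harder data.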
Support Vector Machine Learning Algorithm
This is a supervised machine learning algorithm for classification or regression problems, in which the dataset teaches the SVM about the classes so that it can classify new data. It works by separating the data into different classes by finding a hyperplane that divides the training dataset. Since many such separating hyperplanes may exist, the SVM tries to maximize the distance between the classes, which is referred to as margin maximization. If the hyperplane that maximizes the distance between the classes is identified, the chance of generalizing well to unseen data is increased. SVMs are classified into two categories:
Linear SVMs: the training data can be separated by a hyperplane.
Non-linear SVMs: the training data cannot be separated by a hyperplane, and a kernel is used to map it into a space where it can be.
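The margin-maximization idea in the linear case can be sketched, without any library, as subgradient descent on the hinge loss with an L2 penalty (a simplified primal SVM in the spirit of Pegasos). The data, learning rate, and function names are illustrative assumptions.

```python
def train_linear_svm(points, labels, lr=0.01, lam=0.01, epochs=500):
    """Minimize hinge loss + L2 penalty by subgradient descent.
    Labels must be +1 or -1; this is a sketch, not a full solver."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:
                # Point inside the margin: push the hyperplane away from it
                w[0] += lr * (y * x1 - lam * w[0])
                w[1] += lr * (y * x2 - lam * w[1])
                b += lr * y
            else:
                # Correct with margin to spare: only apply the L2 shrinkage
                w[0] -= lr * lam * w[0]
                w[1] -= lr * lam * w[1]
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

# Two linearly separable groups
X = [(1, 1), (2, 1), (1, 2), (6, 6), (7, 5), (6, 7)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
```

The `margin < 1` test is what encodes margin maximization: points that are classified correctly but too close to the hyperplane still generate updates.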
Apriori Machine Learning Algorithm
This is an unsupervised machine learning algorithm that generates association rules from a given dataset. An association rule implies that if an item A occurs, then item B also occurs with a certain probability. Most of the rules generated are in IF-THEN format, for example: IF people buy an iPad, THEN they also buy an iPad case to protect it. The basic principle the Apriori algorithm works on is this: if an item set occurs frequently, then all subsets of the item set also occur frequently; if an item set occurs infrequently, then all supersets of the item set also occur infrequently.
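That pruning principle is the heart of the algorithm, and can be sketched in stdlib Python as follows; the toy baskets and the `min_support` threshold are invented for illustration:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Find all itemsets appearing in at least min_support transactions.
    Apriori principle: a (k+1)-itemset can only be frequent if every one
    of its k-subsets is frequent, so other candidates are pruned early."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    k = 1
    candidates = [frozenset([i]) for i in items]
    while candidates:
        # Count the support of each surviving candidate
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        # Generate (k+1)-candidates whose k-subsets are all frequent
        candidates = []
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k + 1 and union not in candidates and all(
                    frozenset(s) in current for s in combinations(union, k)):
                candidates.append(union)
        k += 1
    return frequent

baskets = [frozenset(t) for t in [
    {"ipad", "case", "charger"},
    {"ipad", "case"},
    {"ipad", "pencil"},
    {"case", "charger"},
]]
freq = apriori(baskets, min_support=2)  # {ipad, case} is frequent; {pencil} is not
```

Turning the frequent itemsets into IF-THEN rules (with a confidence threshold) would be the natural next step, omitted here for brevity.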
Linear Regression Machine Learning Algorithm
Linear regression shows the relationship between two variables and how the change in one variable impacts the other. The algorithm shows the impact on the dependent variable of changing the independent variable. The independent variables are referred to as explanatory variables, as they explain the factors that impact the dependent variable; the dependent variable is often the factor of interest being predicted.
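For the two-variable case the ordinary least squares fit has a closed form, which can be sketched in stdlib Python; the study-hours dataset is invented for illustration:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x); the intercept then makes
    # the fitted line pass through the point of means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hours studied (independent variable) vs. exam score (dependent variable)
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = fit_line(hours, scores)   # slope = 4.1, intercept = 47.7
predicted = slope * 6 + intercept            # score expected after 6 hours
```

The slope is exactly the "impact" the text describes: each extra hour of study changes the predicted score by `slope` points.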
Decision Tree Machine Learning Algorithm
A decision tree is a graphical representation that makes use of branching to exemplify all possible outcomes of a decision. In a decision tree, an internal node represents a test on an attribute, each branch represents an outcome of the test, and a leaf node represents a class label, i.e. the decision made after computing all the attributes. A classification is represented by the path from the root to a leaf node.
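The "test on an attribute" at an internal node is typically chosen to minimize an impurity measure. Here is a minimal, stdlib-only Python sketch of that choice for one numeric attribute, using Gini impurity; the age/purchase data is invented for illustration:

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, labels):
    """Internal-node test: find the threshold on x that minimizes the
    weighted Gini impurity of the two branches it creates."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [l for x, l in zip(xs, labels) if x <= t]
        right = [l for x, l in zip(xs, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        if score < best[1]:
            best = (t, score)
    return best

# Attribute: age; class label: whether the person buys the product
ages = [22, 25, 30, 35, 40, 45]
buys = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, buys)  # age <= 30 splits perfectly
```

A full tree-growing algorithm such as CART applies this search recursively to each branch until the leaves are pure enough, which is exactly the forking the text describes.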
Random Forest Machine Learning Algorithm
Random forest is a go-to machine learning algorithm that uses a bagging approach to create a bunch of decision trees, each trained on a random subset of the data. The model is trained several times on random samples of the dataset to achieve good prediction performance. In this ensemble learning method, the outputs of all the decision trees are combined to make the final prediction, which is derived by polling the results of each decision tree.
Logistic Regression Machine Learning Algorithm
The name of this algorithm can be a little confusing, as logistic regression is used for classification tasks, not regression problems. The name "regression" implies that a linear model is fit in the feature space. The algorithm applies a logistic function to a linear combination of features to predict the outcome of a categorical dependent variable based on predictor variables: the probabilities that describe the outcome of a single trial are modeled as a function of the explanatory variables.
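The pieces named above (linear combination, logistic function, modeled probability) fit together as in this stdlib-only Python sketch, trained by gradient descent on the log loss; the study-hours data is invented for illustration:

```python
import math

def sigmoid(z):
    """The logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # Gradient of the log loss for one example is (p - y) times
            # the input; this nudges the linear model toward the label
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# Hours of study (explanatory variable) vs. pass (1) / fail (0)
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(hours, passed)
prob_pass = sigmoid(w * 4.0 + b)  # modeled probability for 4 hours of study
```

Classification then comes from thresholding the modeled probability at 0.5, which is why a "regression" of probabilities ends up solving a classification task.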