Algorithms – A Data Analyst

10 groups of Machine Learning Algorithms

In this article, I grouped some of the popular machine learning algorithms either by learning or problem type. There is a brief description of how these algorithms work and their potential use case. Regression How it works: A regression uses the historical relationship between an independent and a dependent variable to predict the future values […]
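As a rough illustration of the regression idea in this excerpt, here is a minimal sketch of fitting a straight line to past observations by ordinary least squares and using it to predict a new value. The function names `fit_line` and `predict` and the sample numbers are made up for this example and do not come from the original article.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the dependent variable for a new value of x."""
    return slope * x + intercept


# Hypothetical historical data: independent variable xs, dependent variable ys.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
forecast = predict(slope, intercept, 5.0)
```

The closed-form solution is used here only because it keeps the sketch short; a library such as scikit-learn would normally do this work.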

Coding FP-growth algorithm in Python 3

FP-growth algorithm Have you ever gone to a search engine, typed in a word or part of a word, and the search engine automatically completed the search term for you? Perhaps it recommended something you didn’t even know existed, and you searched for that instead. This requires a way to find frequent itemsets efficiently. FP-growth […]

AdaBoost (Python 3)

AdaBoost The AdaBoost (adaptive boosting) algorithm was proposed in 1995 by Yoav Freund and Robert Schapire as a general method for generating a strong classifier out of a set of weak classifiers. AdaBoost works even when the classifiers come from a continuum of potential classifiers (such as neural networks, linear discriminants, etc.). AdaBoost Pros: […]

Apriori Algorithm (Python 3.0)

Apriori Algorithm The Apriori principle says that if an itemset is frequent, then all of its subsets are frequent. This means that if {0,1} is frequent, then {0} and {1} have to be frequent. Turned around, the rule says that if an itemset is infrequent, then its supersets are also infrequent. We first need to […]
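The Apriori principle can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the original post: the function name `frequent_itemsets`, the `min_support` threshold, and the toy transactions are assumptions made for the example. The prune step is where the principle is applied: a candidate is discarded as soon as one of its subsets is known to be infrequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = support(s)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (Apriori principle): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return result


# Toy transactions, made up for illustration.
transactions = [[0, 1], [0, 1, 2], [0, 2], [1]]
found = frequent_itemsets(transactions, min_support=0.5)
```

With these transactions, {0,1,2} is never counted against the data, because its subset {1,2} is already known to be infrequent.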

Principal Component Analysis in scikit-learn

Principal Component Analysis (PCA) is an orthogonal linear transformation that turns a set of possibly correlated variables into a new set of variables that are as uncorrelated as possible. The new variables lie in a new coordinate system such that the greatest variance is obtained by projecting the data in the first coordinate, the second […]

Naïve Bayes in scikit-learn

Naïve Bayes is a simple but powerful classifier based on a probabilistic model derived from Bayes’ theorem. Basically, it determines the probability that an instance belongs to a class based on each of the feature value probabilities. One of the most successful applications of Naïve Bayes has been within the field of Natural Language […]

Decision Trees in scikit-learn

Decision trees are very simple yet powerful supervised learning methods that construct a decision tree model, which is then used to make predictions. The main advantage of this model is that a human being can easily understand and reproduce the sequence of decisions (especially if the number of attributes is small) taken to predict the […]

Regression in scikit-learn

We will compare several regression methods by using the same dataset. We will try to predict the price of a house as a function of its attributes. In [6]: import numpy as np import matplotlib.pyplot as plt %pylab inline Populating the interactive namespace from numpy and matplotlib Import the Boston House Pricing Dataset In [9]: from sklearn.datasets […]

Linear Classification method with ScikitLearn

This post is based on the book and is intended as learning material for myself only. The linear classification method implements regularized linear models with stochastic gradient descent (SGD) learning. The gradient of the loss is estimated one sample at a time, and the model is updated along the way with a decreasing strength schedule (aka learning rate). SGD allows […]
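To make the per-sample update concrete, here is a minimal sketch of SGD for a logistic-regression classifier in plain Python. It is not the scikit-learn implementation the post uses: the function names, the schedule `eta0 / (1 + decay * t)`, the omission of a regularization term, and the toy data are all assumptions chosen to keep the example short.

```python
import math
import random


def sgd_logistic(X, y, epochs=200, eta0=0.5, decay=0.01, seed=0):
    """Train a logistic-regression classifier with per-sample SGD updates."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    t = 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            t += 1
            eta = eta0 / (1.0 + decay * t)  # decreasing learning rate (assumed schedule)
            z = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y[i]  # gradient of the log loss with respect to z
            # Update the model from this single sample's gradient estimate.
            w = [wj - eta * err * xj for wj, xj in zip(w, X[i])]
            b -= eta * err
    return w, b


def classify(w, b, x):
    """Predict class 1 when the linear score is positive, else class 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


# Toy, linearly separable data made up for illustration.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [1.0, 1.0], [0.9, 1.2], [1.1, 0.8]]
y = [0, 0, 0, 1, 1, 1]
w, b = sgd_logistic(X, y)
predictions = [classify(w, b, x) for x in X]
```

In scikit-learn, `SGDClassifier` covers the same ground and adds the regularization term that this sketch leaves out.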

Evaluating Machine Learning Algorithms

This blog contains notes for me to understand how to evaluate machine learning algorithms. I want to see how models compare and contrast to each other. This is from the following web page: http://machinelearningmastery.com/machine-learning-in-python-step-by-step/ I am evaluating 6 different algorithms in this blog: Logistic Regression (LR) Linear Discriminant Analysis (LDA) K-Nearest Neighbors (KNN). […]
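One common way to compare models on equal terms is k-fold cross-validation. The sketch below is plain Python and only assumes that this is the kind of comparison meant; the helper names, the two stand-in models (a majority-class baseline and a 1-nearest-neighbor classifier), and the toy data are invented for illustration and are not the six algorithms from the post.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(make_predictor, X, y, k=3):
    """Average held-out accuracy of a model over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        # Train only on the data outside the current fold.
        predict = make_predictor([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


def majority_class(X_train, y_train):
    """Baseline: always predict the most common training label."""
    most_common = max(set(y_train), key=y_train.count)
    return lambda x: most_common


def one_nearest_neighbor(X_train, y_train):
    """Predict the label of the closest training point (squared distance)."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xt)) for xt in X_train]
        return y_train[dists.index(min(dists))]
    return predict


# Toy data made up for illustration; classes alternate so each fold has both.
X = [[0.0], [1.0], [0.1], [1.1], [0.2], [0.9]]
y = [0, 1, 0, 1, 0, 1]
baseline_score = cross_val_accuracy(majority_class, X, y, k=3)
knn_score = cross_val_accuracy(one_nearest_neighbor, X, y, k=3)
```

Every model is scored on the same folds, so the resulting accuracies can be compared directly.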

Tag: Algorithms