FP-growth algorithm: Have you ever gone to a search engine, typed in a word or part of a word, and had the search engine automatically complete the search term for you? Perhaps it recommended something you didn’t even know existed, and you searched for that instead. This requires a way to find frequent itemsets efficiently. The FP-growth algorithm finds frequent itemsets or…

# Category: Machine Learning

## Why should I blog?

Hi all, I started this blog as a personal notebook to learn machine learning and other topics that interest me. It is going to be a collection of blog articles about Machine Learning, Business Information Technology Strategy in an enterprise, some programming tips, and other topics I find interesting to read. What I am hoping is to have…

## AdaBoost (Python 3)

The AdaBoost (adaptive boosting) algorithm was proposed in 1995 by Yoav Freund and Robert Schapire as a general method for generating a strong classifier out of a set of weak classifiers. AdaBoost works even when the classifiers come from a continuum of potential classifiers (such as neural networks, linear discriminants, etc.). AdaBoost Pros: Low generalization error, easy to…

## Apriori Algorithm (Python 3.0)

The Apriori principle says that if an itemset is frequent, then all of its subsets are frequent. This means that if {0,1} is frequent, then {0} and {1} have to be frequent. The rule turned around says that if an itemset is infrequent, then its supersets are also infrequent. We first need to find the frequent itemsets, and…
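The pruning rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the post: the toy `transactions`, the `min_support` threshold of 0.5, and the helper names `support` and `frequent_itemsets` are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is a set of item ids.
transactions = [{0, 1, 2}, {0, 1}, {1, 2}, {0, 1, 3}]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search that only extends itemsets whose subsets are frequent."""
    items = sorted(set().union(*transactions))
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Build (k+1)-candidates from frequent k-itemsets, then apply the
        # Apriori principle: drop any candidate with an infrequent k-subset.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


frequent = frequent_itemsets(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this toy data the candidate {0, 1, 2} is never counted against the transactions, because its subset {0, 2} is already known to be infrequent.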

## Principal Component Analysis in scikit-learn

Principal Component Analysis (PCA) is an orthogonal linear transformation that turns a set of possibly correlated variables into a new set of variables that are as uncorrelated as possible. The new variables lie in a new coordinate system such that the greatest variance is obtained by projecting the data in the first coordinate, the second greatest variance by projecting in…
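As a rough illustration of the two properties described above (ordered variance and uncorrelated outputs), here is a small sketch using scikit-learn's `PCA`. The synthetic two-feature dataset and the variable names are assumptions made for this example and are not taken from the post.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: two strongly correlated features.
rng = np.random.RandomState(0)
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=200)])

pca = PCA(n_components=2)
Z = pca.fit_transform(X)  # data expressed in the new coordinate system

# Share of the total variance captured by each component, largest first.
print(pca.explained_variance_ratio_)

# Covariance of the transformed data: the off-diagonal entries are close
# to zero, so the new variables are (numerically) uncorrelated.
print(np.round(np.cov(Z, rowvar=False), 6))
```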

## Naïve Bayes in scikit-learn

Naïve Bayes is a simple but powerful classifier based on a probabilistic model derived from Bayes’ theorem. It estimates the probability that an instance belongs to a class by combining the probabilities of each of its feature values given that class. One of the most successful applications of Naïve Bayes has been within the field of Natural Language Processing (NLP). NLP is a…
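To make the text-classification use case concrete, here is a minimal sketch with scikit-learn's `CountVectorizer` and `MultinomialNB`. The four-document corpus and the "spam"/"ham" labels are invented for this example; the post itself may use a different dataset and a different Naïve Bayes variant.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus, made up for illustration only.
docs = [
    "free prize money now",
    "win money free",
    "meeting schedule tomorrow",
    "project meeting notes",
]
labels = ["spam", "spam", "ham", "ham"]

# Word counts are the features; MultinomialNB combines the per-word
# probabilities for each class (with Laplace smoothing by default).
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(docs, labels)

new_doc = ["free money"]
pred = model.predict(new_doc)
proba = model.predict_proba(new_doc)
print(pred[0], dict(zip(model.classes_, proba[0].round(3))))
```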

## Decision Trees in scikit-learn

Decision trees are simple yet powerful supervised learning methods that construct a tree model, which is then used to make predictions. The main advantage of this model is that a human being can easily understand and reproduce the sequence of decisions (especially if the number of attributes is small) taken to predict the target class of a new…

## Support Vector Machine in scikit-learn - part 2

Continued from part 1. In [8]: print_faces(faces.images, faces.target, 400) Training a Support Vector Machine: a Support Vector Classifier (SVC) will be used for classification. The SVC implementation has several important parameters; probably the most relevant is kernel, which defines the kernel function to be used in our classifier. In [10]: from sklearn.svm import SVC svc_1 = SVC(kernel='linear') print (svc_1) SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,…
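For readers who want to try the `kernel` parameter without the faces data, the following is a minimal sketch on scikit-learn's bundled iris dataset. The dataset, the train/test split, and the variable names are assumptions chosen for this example; they are not the setup used in the post.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in dataset (assumption): iris ships with scikit-learn, so nothing
# needs to be downloaded.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The kernel argument selects the kernel function; 'linear' fits a
# linear decision boundary.
svc = SVC(kernel="linear")
svc.fit(X_train, y_train)

accuracy = svc.score(X_test, y_test)  # mean accuracy on held-out data
print(accuracy)
```

Swapping `kernel="linear"` for `"rbf"` or `"poly"` is a quick way to compare how the choice of kernel affects the held-out accuracy.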