Machine Learning Algorithms

10 groups of Machine Learning Algorithms

Posted in Data Analysis Resources, Machine Learning, Predictive Analysis

In this article, I grouped some popular machine learning algorithms by learning type or by problem type. Each group comes with a brief description of how the algorithms work and their potential use cases. Regression. How it works: A regression uses the historical relationship between an independent and a dependent variable to predict the future values […]
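The regression idea in the excerpt above can be illustrated with a minimal sketch. It assumes a single independent variable and an ordinary least-squares straight line; the function names (`fit_line`, `predict`) and the toy numbers are hypothetical and are not taken from the article.

```python
# Minimal sketch: ordinary least squares with one independent variable.
# All names and data here are illustrative assumptions.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the dependent variable for a new value of the independent one."""
    return intercept + slope * x


# Hypothetical historical observations (independent, dependent).
history_x = [1.0, 2.0, 3.0, 4.0]
history_y = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(history_x, history_y)
forecast = predict(slope, intercept, 5.0)
print(slope, intercept, forecast)
```

The fitted line summarises the historical relationship, and `predict` extrapolates it to a value that has not been observed yet.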

CountVectorizer sklearn example

Posted in Data Analysis Resources, Machine Learning, scikit-learn

This CountVectorizer sklearn example is from PyCon Dublin 2016. For further information please visit this link. The dataset is from UCI.

In [2]: messages = [line.rstrip() for line in open('smsspamcollection/SMSSpamCollection')]

In [3]: print(len(messages))
5574

In [5]: for num, message in enumerate(messages[:10]):
            print(num, message)
            print('\n')
0 ham Go until jurong point, crazy.. Available only in bugis n great world la e […]
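To show what the counting step does conceptually, here is a small pure-Python sketch of a bag-of-words count matrix. It is not scikit-learn's implementation. It only mimics two of `CountVectorizer`'s defaults (lowercasing, and tokens of two or more word characters), and the helper names and the two sample messages are hypothetical rather than taken from the UCI dataset.

```python
import re
from collections import Counter

# Illustrative sketch only: a hand-rolled bag-of-words counter.
# Helper names and sample messages are assumptions, not library code or real data.

def tokenize(text):
    """Lowercase the text and keep tokens of two or more word characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


def fit_vocabulary(docs):
    """Map each distinct token to a column index, in alphabetical order."""
    terms = sorted({token for doc in docs for token in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def transform(docs, vocab):
    """Return one row of token counts per document, columns ordered as in vocab."""
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts.get(term, 0) for term in vocab])
    return rows


docs = ["free prize call now", "call me now now"]
vocab = fit_vocabulary(docs)
matrix = transform(docs, vocab)

print(vocab)
for row in matrix:
    print(row)
```

Each row is one message and each column is one vocabulary term, so a classifier can work with numeric counts instead of raw text.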

What Makes a Really Good Diamond?

Posted in Data Analysis Resources

The aim of this blog is to assess the quality and characteristics of diamonds and to gain insight into what makes a really good diamond. The data set is from ggplot2. The exploratory data analysis is done in Python, and the notebooks are available on my GitHub. This blog addresses a few important questions such as: […]

Visualise Categorical Variables in Python

Posted in Data Analysis Resources

It is crucial to learn how to deal with categorical variables, because they are known to hide and mask a lot of interesting information in a data set. A categorical variable identifies the group to which an observation belongs. You could categorise people according to their race or ethnicity, cities according to their geographic […]

Hostelworld Challenge

Exploratory Data Analysis for Hostelworld Challenge

Posted in Competition Notes, Data Analysis Resources, Personal Stories

I recently took part in a challenge by Hostelworld. The challenge is to build a recommendation engine for users. Recommendations can save travellers valuable time, improve their hostel experience, and increase user retention. The challenge uses user information, reviews, and hostel details. This is a link for Exploratory […]