Categories
Data Analysis Resources Personal Stories

Analysis of Residential Property Prices in Dublin

Living in Dublin, Ireland, is extremely expensive, and residential property prices in the city keep rising. Yet we all think about buying a home while still wondering whether we might be better off continuing to rent. The data analyst in me wanted to dive deeper, to look back historically, to quantify, to visualize the trends, etc. to […]

Categories
Data Analysis Resources Machine Learning Predictive Analysis

10 groups of Machine Learning Algorithms

In this article, I group some of the popular machine learning algorithms by either learning style or problem type. There is a brief description of how these algorithms work and their potential use cases. Regression. How it works: A regression uses the historical relationship between an independent and a dependent variable to predict the future values […]
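As a rough illustration of the regression idea in the excerpt above, here is a minimal sketch of simple linear regression by ordinary least squares in plain Python. The function names `fit_line` and `predict` and the toy data are assumptions made for this example; they are not taken from the article.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the dependent variable for a new value of x."""
    return intercept + slope * x


# Hypothetical historical data that follows y = 2x + 1 exactly.
history_x = [1, 2, 3, 4, 5]
history_y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(history_x, history_y)
forecast = predict(slope, intercept, 6)
print(slope, intercept, forecast)
```

The fitted line summarises the historical relationship between the two variables, and `predict` extends it to a value of the independent variable that has not been observed yet.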

Categories
Data Analysis Resources Machine Learning scikit-learn

CountVectorizer sklearn example

This CountVectorizer sklearn example is from PyCon Dublin 2016. For further information please visit this link. The dataset is from UCI.

In [2]: messages = [line.rstrip() for line in open('smsspamcollection/SMSSpamCollection')]
In [3]: print(len(messages))
5574
In [5]: for num, message in enumerate(messages[:10]):
            print(num, message)
            print('\n')
0 ham Go until jurong point, crazy.. Available only in bugis n great world la e […]
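To make the idea behind count vectorization concrete, here is a minimal plain-Python sketch of a bag-of-words count matrix. It is not scikit-learn's implementation: it splits on whitespace, whereas `CountVectorizer` uses a regex tokenizer that by default drops single-character tokens. The helper names and the two sample messages are invented for illustration and do not come from the UCI dataset.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (alphabetical order)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def count_vectorize(documents, vocabulary):
    """Return one row of token counts per document, with columns ordered by vocabulary."""
    rows = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        rows.append([counts.get(tok, 0) for tok in vocabulary])
    return rows


# Hypothetical messages, made up for this sketch.
docs = ["free prize call now", "call me now now"]

vocab = build_vocabulary(docs)
matrix = count_vectorize(docs, vocab)

print(vocab)
for row in matrix:
    print(row)
```

Each row is a document and each column a vocabulary term, which is the same document-term layout that a count vectorizer produces before a classifier is trained on it.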

Categories
Data Analysis Resources

What Makes a Really Good Diamond?

The aim of this blog is to assess the quality and characteristics of diamonds and gain insights into what makes a really good diamond. The data set is from ggplot2. The exploratory data analysis is done in Python and the notebooks are available on my GitHub. This blog addresses a few important questions such as: […]

Categories
Data Analysis Resources

Truth About Nutritional Information in Recipes

So much has been said about Proteins, Fat & Calories in recent years, that the single biggest challenge faced when trying to answer the question is how to “separate the wheat from the chaff.” Protein and fats are what our body uses and they all have calorie counts. Too many and we get fat, too […]

Categories
Data Analysis Resources

Visualise Categorical Variables in Python

It is crucial to learn the methods of dealing with categorical variables as categorical variables are known to hide and mask lots of interesting information in a data set. A categorical variable identifies a group to which the thing belongs. You could categorise persons according to their race or ethnicity, cities according to their geographic […]

Categories
Competition Notes Data Analysis Resources Personal Stories

Exploratory Data Analysis for Hostelworld Challenge

I recently took part in a challenge by Hostelworld. The challenge proposed by Hostelworld is to build a recommendation engine for users. Recommendations can save travellers valuable time, improve their hostel experience, and increase user retention. This challenge will use user information, reviews, and hostel details. This is a link for Exploratory […]

Categories
Data Analysis Resources Predictive Analysis

Time-Series Predictive Analysis of DAX 30

In this blog post we’ll examine some common techniques used in time-series analysis of the DAX 30 by applying them to a data set containing daily closing values from 1990 up to the present day. The DAX (Deutscher Aktienindex, the German stock index) is a blue chip stock market index consisting of the 30 major German companies trading […]