Categories
Data Analysis Resources Predictive Analysis

Text Analytics in the Healthcare Industry: Data Warehousing and Applications

Abstract: Text analytics is the process of extracting information from text. It involves structuring the text in order to evaluate it, discover patterns, and interpret the output. It adds meaning to data and surfaces nuggets of information from both transaction-based and decision support systems by removing the barrier between structured and unstructured data. Analysis of text data helps […]

Categories
Data Analysis Resources Experience

Analysis of winning numbers of Irish Lotto

This blog is an analysis of the winning numbers of the Irish Lotto over the last two years. The National Lottery introduced changes from Thursday, September 3, 2015, adding two numbers to the draw so that players choose from 47 numbers rather than 45. With this change, the odds of picking the six winning numbers went from just […]
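As a rough illustration of how growing the pool from 45 to 47 numbers changes the jackpot odds, the sketch below counts the possible six-number combinations. It assumes a plain six-from-N draw in which order does not matter and ignores bonus numbers; the function name `jackpot_combinations` is made up for this example and is not taken from the post.

```python
from math import comb


def jackpot_combinations(pool_size: int, picks: int = 6) -> int:
    """Number of distinct ways to choose `picks` numbers from `pool_size`."""
    return comb(pool_size, picks)


# Assumption: one ticket line matches exactly one combination,
# so the jackpot odds are 1 in (number of combinations).
old_combinations = jackpot_combinations(45)
new_combinations = jackpot_combinations(47)

print(f"45 numbers: 1 in {old_combinations:,}")
print(f"47 numbers: 1 in {new_combinations:,}")
print(f"The jackpot is about {new_combinations / old_combinations:.2f}x harder to hit")
```

Under these assumptions, two extra numbers add roughly 2.6 million possible combinations.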

Categories
Business Data Analysis Resources

Jobs which are most susceptible to automation

Throughout history, technological advances have raised fears that traditional jobs will become obsolete. In this post, I look at which jobs are most susceptible to automation. Elon Musk told the National Governors Association: “There certainly will be job disruption. Because what’s going to happen is robots will be able to do everything better […]

Categories
Data Analysis Resources Personal Stories

Analysis of Residential Property Prices in Dublin

Living in Dublin, Ireland is amazingly expensive. Residential property prices in Dublin are growing. Yet we all think about buying a home while still wondering whether we might be better off continuing to rent. The data analyst in me wanted to dive deeper, to look back historically, to quantify, to visualize the trends, etc. to […]

Categories
Data Analysis Resources Machine Learning Predictive Analysis

10 groups of Machine Learning Algorithms

In this article, I group some of the popular machine learning algorithms by either learning type or problem type. Each group comes with a brief description of how the algorithms work and their potential use cases. Regression. How it works: A regression uses the historical relationship between an independent and a dependent variable to predict the future values […]
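To make the regression description concrete, here is a minimal sketch of a one-variable least-squares fit in plain Python. The data points are invented for illustration, and `fit_line` is a hypothetical helper, not code from the article.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up historical data that follows y = 2x + 1 exactly.
xs = [1, 2, 3, 4, 5]   # independent variable
ys = [3, 5, 7, 9, 11]  # dependent variable

slope, intercept = fit_line(xs, ys)

# Use the fitted relationship to predict a value for an unseen x.
prediction = slope * 6 + intercept
print(slope, intercept, prediction)
```

Real data is noisy, so the fitted line would only approximate the points instead of passing through them.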

Categories
Data Analysis Resources Machine Learning scikit-learn

Countvectorizer sklearn example

This CountVectorizer sklearn example is from PyCon Dublin 2016. For further information please visit this link. The dataset is from UCI.

In [2]: messages = [line.rstrip() for line in open('smsspamcollection/SMSSpamCollection')]

In [3]: print(len(messages))
5574

In [5]: for num, message in enumerate(messages[:10]):
            print(num, message)
            print('\n')
0 ham Go until jurong point, crazy.. Available only in bugis n great world la e […]
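For readers unfamiliar with what a count vectorizer produces, the sketch below builds a small bag-of-words matrix using only the standard library. It is a simplified stand-in and not scikit-learn's implementation: the tokenizer, the `count_vectorize` helper, and the two sample messages are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_vectorize(documents):
    """Return (vocabulary, matrix), where matrix[i][j] is how often
    vocabulary[j] occurs in documents[i]."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    column = {token: j for j, token in enumerate(vocabulary)}
    matrix = []
    for doc in tokenized:
        row = [0] * len(vocabulary)
        for token, count in Counter(doc).items():
            row[column[token]] = count
        matrix.append(row)
    return vocabulary, matrix


# Invented sample messages, not rows from the SMS dataset.
docs = ["free prize, claim your free prize now", "see you at lunch"]
vocabulary, matrix = count_vectorize(docs)

print(vocabulary)
for row in matrix:
    print(row)
```

Each row is one message and each column one vocabulary word, which is the same shape of output a count-based vectorizer hands to a downstream classifier.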

Categories
Data Analysis Resources

What Makes a Really Good Diamond?

The aim of this blog is to assess the quality and characteristics of diamonds and gain insights into what makes a really good diamond. The data set is from ggplot2. The exploratory data analysis is done in Python and the notebooks are available on my GitHub. This blog addresses a few important questions such as: […]

Categories
Data Analysis Resources

Truth About Nutritional Information in Recipes

So much has been said about proteins, fat and calories in recent years that the single biggest challenge when trying to answer the question is how to “separate the wheat from the chaff.” Proteins and fats are what our body uses, and they all have calorie counts. Too many and we get fat, too […]