Predicting NBA winners with Decision Trees and Random Forests in Scikit-learn

Posted in Machine Learning, Predictive Analysis, scikit-learn

In this post, we predict NBA winners with decision trees and random forests in scikit-learn. The National Basketball Association (NBA) is the major men’s professional basketball league in North America and is widely considered the premier men’s professional basketball league in the world. It has 30 teams (29 in the United States and […]

CountVectorizer sklearn example

Posted in Data Analysis Resources, Machine Learning, scikit-learn

This CountVectorizer sklearn example is from PyCon Dublin 2016. For further information please visit this link. The dataset is from UCI.

In [2]: messages = [line.rstrip() for line in open('smsspamcollection/SMSSpamCollection')]

In [3]: print(len(messages))
5574

In [5]: for num, message in enumerate(messages[:10]):
            print(num, message)
            print('\n')

0 ham Go until jurong point, crazy.. Available only in bugis n great world la e […]
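The counting step that a bag-of-words vectorizer applies to messages like these can be sketched in plain Python. This is only an illustration, not scikit-learn's implementation. It mimics the documented defaults of `CountVectorizer` (lowercasing, tokens of two or more word characters, alphabetically ordered vocabulary). The sample messages and the helper names `tokenize` and `count_vector` are made up for the example and do not come from the original post or the UCI dataset.

```python
import re
from collections import Counter

# Hypothetical sample messages (not taken from the SMS Spam Collection).
docs = ["Free prize, call now", "Call me later", "FREE free entry"]

# Tokens are runs of two or more word characters, as in CountVectorizer's
# default token pattern.
TOKEN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into tokens."""
    return TOKEN.findall(text.lower())


# Vocabulary: every distinct token across all documents, sorted alphabetically.
vocabulary = sorted({tok for doc in docs for tok in tokenize(doc)})


def count_vector(text):
    """Return the number of times each vocabulary token occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[tok] for tok in vocabulary]


# One row per document, one column per vocabulary token.
matrix = [count_vector(doc) for doc in docs]

print(vocabulary)
for row in matrix:
    print(row)
```

Each row has one column per vocabulary word, so messages of different lengths end up as vectors of the same size, which is the form a classifier expects.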

Feature Selection for Determining House Prices

Posted in Kaggle, Predictive Analysis, scikit-learn

Home values are influenced by many factors, which fall into two broad groups. The first is environmental information, including location, local economy, school district, air quality, etc. The second is the characteristics of the property itself, such as lot size, house size and age, the number of rooms, heating / AC systems, garage, and so on. When people consider buying […]

Modeling Women’s Health Risk Assessment

Posted in Competition Notes, Machine Learning, scikit-learn

Women’s Health Risk Assessment is a multi-class classification competition for finding an optimized machine learning solution that allows a young woman (aged 15 to 30) to be accurately categorized by her particular health risk. Based on the category a patient falls within, healthcare providers can offer appropriate education and training programs to help reduce […]

Evaluating Algorithms using Kaggle’s Digit Recognizer Data

Posted in Kaggle, Machine Learning, scikit-learn

In [1]: import numpy as np
        import pandas as pd
        import seaborn as sns
        import matplotlib.pyplot as plt
        %matplotlib inline
        import warnings
        warnings.filterwarnings('ignore')

In [2]: # importing the train dataset
        train = pd.read_csv(r'C:\Users\piush\Desktop\Dataset\DigitRecognizer\train.csv')
        train.head(10)

Out[2]: label pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 … pixel774 pixel775 pixel776 pixel777 pixel778 pixel779 pixel780 pixel781 pixel782 pixel783
        0 […]