This notebook contains my notes for Predictive Analysis on Binary Classification. It acts as a cookbook. It is a continuation of the previous post on assessing the performance of predictive models. For deployment, retrain the model on the full data set and pull out the coefficients corresponding to the best alpha, the one determined to minimize out-of-sample error, which is estimated in…
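As a rough sketch of the deployment step described above: the excerpt does not show which model or error estimate is used, so this assumes a single-feature ridge regression with a closed-form fit and k-fold cross-validation as the out-of-sample estimate. The data, the alpha grid, and all function names are hypothetical.

```python
import random

def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit for y ~ w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

def mean_squared_error(w, xs, ys):
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def select_alpha_and_retrain(xs, ys, alphas, n_folds=5):
    """Estimate out-of-sample error per alpha, then refit on all the data."""
    pairs = list(zip(xs, ys))
    cv_error = {}
    for alpha in alphas:
        fold_errors = []
        for fold in range(n_folds):
            # Hold out every n_folds-th row, offset by the fold index
            train = [p for i, p in enumerate(pairs) if i % n_folds != fold]
            test = [p for i, p in enumerate(pairs) if i % n_folds == fold]
            w = fit_ridge_1d([x for x, _ in train], [y for _, y in train], alpha)
            fold_errors.append(
                mean_squared_error(w, [x for x, _ in test], [y for _, y in test])
            )
        cv_error[alpha] = sum(fold_errors) / n_folds
    best_alpha = min(cv_error, key=cv_error.get)
    # Deployment model: retrain on the full data set with the best alpha
    final_w = fit_ridge_1d(xs, ys, best_alpha)
    return best_alpha, final_w, cv_error

# Hypothetical synthetic data: y = 2x plus a little noise
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(100)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]
alphas = [0.0, 0.1, 1.0, 10.0, 100.0]
best_alpha, final_w, cv_error = select_alpha_and_retrain(xs, ys, alphas)
```

The point of the final refit is that the cross-validation models each saw only part of the data; once alpha is chosen, there is no reason to leave rows unused.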

# Tag: Binary Classification

## Predictive Analysis, Binary Classification (Cookbook) – 6

This notebook contains my notes for Predictive Analysis on Binary Classification. It acts as a cookbook. It is a continuation of the previous post on Pearson’s Correlation. This notebook discusses assessing the performance of predictive models. One of the most widely used measures is the misclassification error, that is, the fraction of examples that the function pred() predicts incorrectly. Reading and Arranging Data In [29]:…
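The misclassification error defined above can be sketched in a few lines. The function name and the sample labels here are hypothetical; the predictions stand in for whatever pred() returns.

```python
def misclassification_error(predictions, labels):
    """Fraction of examples whose predicted label differs from the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)

# Hypothetical example: one of four predictions is wrong
labels = ["M", "R", "M", "R"]
predictions = ["M", "M", "M", "R"]
error = misclassification_error(predictions, labels)
```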

## Predictive Analysis, Binary Classification (Cookbook) – 5

This notebook contains my notes for Predictive Analysis on Binary Classification. It acts as a cookbook. It is a continuation of the previous post on visualizing. This notebook discusses Pearson’s Correlation. Pearson’s Correlation Calculation for Attributes 2 versus 3 and 2 versus 21 In [21]: from math import sqrt #calculate correlations between real-valued attributes dataRow2 = rocksVMines.iloc[1,0:60] dataRow3 = rocksVMines.iloc[2,0:60] dataRow21…
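Since the excerpt cuts off before the calculation itself, here is a self-contained sketch of Pearson's correlation for two equal-length sequences. The function name is hypothetical, and the inputs are plain lists rather than the rocksVMines rows used in the post.

```python
from math import sqrt

def pearson_correlation(xs, ys):
    """Pearson's r: covariance of xs and ys divided by the product of their spreads."""
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products; the 1/n factors cancel in the ratio
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: a perfectly linear relationship gives r close to 1
r_linear = pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8])
```

A value near 1 or -1 indicates a strong linear relationship; a value near 0 indicates little linear relationship. The function divides by zero if either sequence is constant.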

## Predictive Analysis, Binary Classification (Cookbook) – 4

This notebook contains my notes for Predictive Analysis on Binary Classification. It acts as a cookbook. It is a continuation of the previous post on using pandas. Visualizing Parallel Coordinates Plots In [15]: for i in range(208): #assign color based on "M" or "R" labels if rocksVMines.iat[i,60] == "M": pcolor = "red" else: pcolor = "blue" #plot rows…

## Predictive Analysis, Binary Classification (Cookbook) – 3

This notebook contains my notes for Predictive Analysis on Binary Classification. It acts as a cookbook. It is a continuation of the previous post on summary statistics. Using Python Pandas to Read Data In [12]: import pandas as pd from pandas import DataFrame import matplotlib.pyplot as plot %matplotlib inline target_url = ("https://archive.ics.uci.edu/ml/machine-learning-" "databases/undocumented/connectionist-bench/sonar/sonar.all-data") #read rocks versus mines data into pandas…

## Predictive Analysis, Binary Classification (Cookbook) – 2

This notebook contains my notes for Predictive Analysis on Binary Classification. It acts as a cookbook. It is a continuation of the previous post. Summary Statistics for Numeric and Categorical Attributes In [6]: import numpy as np #generate summary statistics for column 3 (e.g.) col = 3 colData = [] for row in xList: colData.append(float(row[col])) colArray = np.array(colData) colMean = np.mean(colArray)…
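The excerpt is truncated, so as a hedged sketch of the same idea using only the standard library: numeric columns get a mean and standard deviation, and categorical columns get a count per category. The function names and sample rows are hypothetical, and `pstdev` (population standard deviation) is used because that matches the default behavior of `np.std`.

```python
from collections import Counter
from statistics import mean, pstdev

def summarize_numeric(rows, col):
    """Mean, population standard deviation, min, and max of one numeric column."""
    values = [float(row[col]) for row in rows]
    return {
        "mean": mean(values),
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
    }

def summarize_categorical(rows, col):
    """Number of rows in each category of one categorical column."""
    return Counter(row[col] for row in rows)

# Hypothetical rows shaped like the parsed file: strings, label in the last column
rows = [
    ["0.02", "0.5", "R"],
    ["0.04", "0.7", "M"],
    ["0.06", "0.9", "M"],
]
numeric = summarize_numeric(rows, 0)
counts = summarize_categorical(rows, 2)
```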

## Predictive Analysis, Binary Classification (Cookbook) – 1

This notebook contains my notes for Predictive Analysis on Binary Classification. It acts as a cookbook. Importing and sizing up a New Data Set The file is comma delimited, with the data for one experiment occupying one line of text. This makes it a simple matter to read a line, split it on the comma delimiters, and stack the resulting…
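The read, split, and stack steps described above can be sketched as follows. This assumes the rows end up in a list of lists; the name `x_list`, the function name, and the in-memory sample standing in for the real file are all hypothetical.

```python
import io

def read_rows(lines):
    """Split each comma-delimited line into fields and stack the rows in a list."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append(line.split(","))
    return rows

# Hypothetical two-row sample standing in for the real file
sample = io.StringIO("0.02,0.03,R\n0.04,0.05,M\n")
x_list = read_rows(sample)

# Size up the data set: number of rows and number of columns
n_rows = len(x_list)
n_cols = len(x_list[0])
```

Every field is still a string at this point, so numeric columns need an explicit `float()` conversion before any statistics are computed.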