Tom Fawcett

Co-author of the popular book Data Science for Business, Tom brings over 20 years of experience applying machine learning and data mining in practical applications. He is a veteran of companies such as Verizon and HP Labs, and an editor of the Machine Learning Journal.

Recent Posts

Avoiding Common Mistakes with Time Series Analysis

A basic mantra in statistics and data science is “correlation is not causation,” meaning that just because two things appear to be related to each other doesn’t mean that one causes the other. This is a lesson worth learning.

Imbalanced Classes FAQ

Here we share some further thoughts on imbalanced classes, and offer more resources.

Learning from Imbalanced Classes

This post gives insight and concrete advice on how to tackle imbalanced data.

Analyzing Caltrain Delays: What We Can Learn

In this post, we will explore some aspects of the train delay data we’ve been collecting from the Caltrain API over the past few months. The goal is to get our heads into the data before setting off on building a prediction model.

The Basics of Classifier Evaluation, Part 2

A previous blog post made the point that classifiers shouldn’t use classification accuracy as a performance metric. The next part in this series was going to discuss other evaluation techniques such as ROC curves, profit curves, and lift curves. However, there are several important points to be made first. Here I present a sequence that shows the progression and inter-relation of the issues.

The Basics of Classifier Evaluation, Part 1

If it’s easy, it’s probably wrong.

Avoiding Common Mistakes with Time Series

A basic mantra in statistics and data science is “correlation is not causation” […]

Listening to Caltrain: Analyzing Train Whistles with Data Science

Many people who live and work in Silicon Valley depend on Caltrain for transportation. And because the SVDS headquarters are in Sunnyvale, not far from a station, Caltrain is literally in our own backyard. So, as an R&D project, we have been playing with data science techniques to understand and predict delays in the Caltrain […]
