Technical Archives - Page 5 of 8 - Silicon Valley Data Science

Learning from Imbalanced Classes

This post gives insight and concrete advice on how to tackle imbalanced data.

TOM FAWCETT

August 25, 2016

Scaling Data Science: Dream Big, Start Medium-ish

On July 13th we welcomed the Open Data Science Conference meetup series to our HQ, where our speaker discussed how to think critically about the size of your data.

CHRISTIAN PEREZ

August 11, 2016

How I Learned to Stop Worrying and Love Ephemeral Storage

This post shows architects and developers how to set up Hadoop to communicate with S3, run Hadoop commands directly against S3, use distcp to transfer data between Hadoop and S3, and use distcp for regular updates that copy only the differences.

MAURICIO VACAS

August 4, 2016

Structured Streaming in Spark

This post gives you a quick overview of the new structured streaming feature in Spark 2.0, illustrating why it’s an exciting addition.

ANDREW RAY

July 28, 2016

Brain Monitoring with Kafka, OpenTSDB, and Grafana

A team of our data scientists recently won 2nd place in Confluent’s Kafka Hackathon. In this post, explore their project: streaming EEG data and visualizing it.

JEFF LAM

July 14, 2016

Building Pipelines to Understand User Behavior

In this post, we cover what’s needed to understand user activity, and we look at some pipeline architectures that support this analysis.

MARK MIMS

June 23, 2016

Kafka Simple Consumer Failure Recovery

This post walks you through a simple failure recovery mechanism, as well as a test harness that allows you to make sure this mechanism works as expected.

DMITRIY FEFERMAN

June 21, 2016

Noteworthy Links: Social Media Edition

In this post we share some links to interesting work being done with social media data.

June 14, 2016

Materialized Views with Cassandra

In this screencast, Principal Engineer and Cassandra committer Gary Dusbabek provides an overview of Materialized Views.

May 31, 2016
