Technical Archives - Page 7 of 8 - Silicon Valley Data Science

How to Choose a Data Format

It’s easy to become overwhelmed when it comes time to choose a data format. In this post Silvia gives you a framework for approaching this choice, and provide some example use cases.

SILVIA OLIVEROS

March 8, 2016

Crossing the Development to Production Divide

We know what it’s like to deal with complex production deployments that cover the gamut from infrastructure upgrades, to feature deployments, to data migrations, where each step threatens to derail the plan. In this post she’ll give an overview of obstacles she’s faced (you may be able to relate) and talk about solutions to overcome these obstacles.

HEATHER NELSON

March 3, 2016

Ethereum: Rise of the World Computer

The Ethereum network is a distributed economy like Bitcoin, except it is much, much more powerful. Rick Seeger dives into why you should be paying attention to its popularity.

RICK SEEGER

February 23, 2016

Reshaping Data with Pivot in Spark

Andrew gives you a deep dive into pivoting data with SparkSQL. This piece was originally posted on the Databricks blog.

ANDREW RAY

February 16, 2016

Data Day and Graph Day Texas Slides

Check out the slides from our recent presentations at Data Day TX and Graph Day.

February 2, 2016

Pivoting Data in SparkSQL

Andrew Ray, Senior Data Engineer, contributed to the most recent release of Spark. This post gives examples of how to use his pivot commit in PySpark.

ANDREW RAY

January 5, 2016

The Basics of Classifier Evaluation: Part 2

A previous blog post made the point that classifiers shouldn’t use classification accuracy as a performance metric. The next part in this series was going to discuss other evaluation techniques such as ROC curves, profit curves, and lift curves. However, there are several important points to be made first. Here I present a sequence that shows the progression and inter-relation of the issues.

TOM FAWCETT

December 10, 2015

Advanced Spark Meetup Recap

Our audience of engineers got right into the guts of Spark’s GraySort benchmark win last year with Chris Fregly from IBM Spark Technology Center. Here are a few highlights from the meetup.

CHRISTIAN PEREZ

December 3, 2015

From Impala to Hive with Love

While on paper it should be a seamless transition to run Impala code in Hive, in reality it’s more like playing a relentless game of whack-a-mole. This post provides hints to make the transition easier.

MAURICIO VACAS

October 20, 2015

Posts Tagged ‘Technical’

Sign In