Archive for the ‘Tools’ Category

Getting Started with Deep Learning

One way to give back to the open source community that provides our tools is to help others evaluate and choose those tools by drawing on our experience. We offer this analysis, along with explanations of the criteria on which we based our decisions.

Big Data is About Agility

Any technology is only as good as the way in which you use it.

Jupyter Notebook Best Practices for Data Science

Editor’s note: Welcome to Throwback Thursdays! Every third Thursday of the month, we feature a classic post from the earlier days of our company, gently updated as appropriate. We still find them helpful, and we think you will, too! The original version of this post can be found here. The Jupyter Notebook is a fantastic […]

Structured Streaming in Spark

This post gives you a quick overview of Structured Streaming, a new feature in Spark 2.0, and illustrates why it’s an exciting addition.

Brain Monitoring with Kafka, OpenTSDB, and Grafana

A team of our data scientists recently won second place in Confluent’s Kafka Hackathon. In this post, explore their project: streaming EEG data and visualizing it.

Materialized Views with Cassandra

In this screencast, Principal Engineer and Cassandra committer Gary Dusbabek provides an overview of Materialized Views.

Noteworthy Links: Hadoop Edition

Hadoop is 10 years old! Check out these related links.

Jupyter Notebook for Data Science Teams

Data Scientist Jonathan Whitmore has just released a screencast tutorial for Jupyter Notebooks.

Building a Prediction Engine using Spark, Kudu, and Impala

In this post, Richard walks you through a demo based on the Meetup.com streaming API to illustrate how to predict demand in order to adjust resource allocation.
