Archive for the ‘Tools’ Category

Big Data is About Agility

Any technology is only as good as the way in which you use it.

Jupyter Notebook Best Practices for Data Science

Editor’s note: Welcome to Throwback Thursdays! Every third Thursday of the month, we feature a classic post from the earlier days of our company, gently updated as appropriate. We still find them helpful, and we think you will, too! The original version of this post can be found here. The Jupyter Notebook is a fantastic […]

Structured Streaming in Spark

This post gives you a quick overview of the new structured streaming feature in Spark 2.0, illustrating why it’s an exciting addition.

Brain Monitoring with Kafka, OpenTSDB, and Grafana

A team of our data scientists recently won 2nd place in Confluent’s Kafka Hackathon. In this post, explore their project—streaming EEG data and visualizing it.

Materialized Views with Cassandra

In this screencast, Principal Engineer and Cassandra committer Gary Dusbabek provides an overview of Materialized Views.

Noteworthy Links: Hadoop Edition

Hadoop is 10 years old! Check out these related links.

Jupyter Notebook for Data Science Teams

Data Scientist Jonathan Whitmore has just released a screencast tutorial for Jupyter Notebooks.

Building a Prediction Engine using Spark, Kudu, and Impala

In this post, Richard walks you through a demo based on the Meetup.com streaming API to illustrate how to predict demand in order to adjust resource allocation.

Why Notebooks Are Super-Charging Data Science

There is little limit to what can be done with a notebook. As well as the data science work you might expect, such as manipulating and graphing data, we’ve used them for sharing work on analytical tasks such as motion detection in video. In this post Edd takes a look at why we’re seeing notebooks everywhere.

Sign up for our newsletter