
How to Choose a Data Format
It’s easy to become overwhelmed when it comes time to choose a data format. In this post Silvia gives you a framework for approaching this choice, and provide some example use cases.
It’s easy to become overwhelmed when it comes time to choose a data format. In this post Silvia gives you a framework for approaching this choice, and provide some example use cases.
We know what it’s like to deal with complex production deployments that cover the gamut from infrastructure upgrades, to feature deployments, to data migrations, where each step threatens to derail the plan. In this post she’ll give an overview of obstacles she’s faced (you may be able to relate) and talk about solutions to overcome these obstacles.
The Ethereum network is a distributed economy like Bitcoin, except it is much, much more powerful. Rick Seeger dives into why you should be paying attention to its popularity.
Andrew gives you a deep dive into pivoting data with SparkSQL. This piece was originally posted on the Databricks blog.
Check out the slides from our recent presentations at Data Day TX and Graph Day.
Andrew Ray, Senior Data Engineer, contributed to the most recent release of Spark. This post gives examples of how to use his pivot commit in PySpark.
A previous blog post made the point that classifiers shouldn’t use classification accuracy as a performance metric. The next part in this series was going to discuss other evaluation techniques such as ROC curves, profit curves, and lift curves. However, there are several important points to be made first. Here I present a sequence that shows the progression and inter-relation of the issues.
Our audience of engineers got right into the guts of Spark’s GraySort benchmark win last year with Chris Fregly from IBM Spark Technology Center. Here are a few highlights from the meetup.
While on paper it should be a seamless transition to run Impala code in Hive, in reality it’s more like playing a relentless game of whack-a-mole. This post provides hints to make the transition easier.