
The Data Platform Puzzle
Building or rebuilding a data platform can be a daunting task, as most of the questions you need to ask have open-ended answers. But that doesn’t mean you have to guess or go with your gut.
Deploying a model without a rigorous process in place has consequences. We go over techniques for successful deployment and management.
In this post, we will look at driving product engagement with behavioral data, as well as building an integrated analytical environment.
In this revamped classic, Edd looks at the challenges of moving forward with a new architecture, and where you need to start.
This post shows architects and developers how to set up Hadoop to communicate with S3, run Hadoop commands directly against S3, use distcp to transfer data between Hadoop and S3, and use distcp for regular updates that copy only the differences.
In this post, we cover what’s needed to understand user activity, and we look at some pipeline architectures that support this analysis.
This post walks you through a simple failure recovery mechanism, as well as a test harness that allows you to make sure this mechanism works as expected.
In this post, we go over the capabilities you need in place to successfully build and maintain data systems and data infrastructure.
In this post, Fausto talks about the characteristics that differentiate data infrastructure development from traditional development, and highlights key issues to look out for.