Posts Tagged ‘Architecture’

Big Data is About Agility

Any technology is only as good as the way in which you use it.

We Need a New Data Architecture: What Next?

In this revamped classic, Edd looks at the challenges of moving forward with a new architecture, and where you need to start.

Building Pipelines to Understand User Behavior

In this post, we cover what’s needed to understand user activity, and we look at some pipeline architectures that support this analysis.

Building Data Systems: What Do You Need?

In this post, we go over the capabilities you need in place to successfully build and maintain data systems and data infrastructure.

Understanding Modern Data Systems

In this post, Fausto talks about the characteristics that differentiate data infrastructure development from traditional development, and highlights key issues to look out for.

Building a Prediction Engine using Spark, Kudu, and Impala

In this post, Richard walks you through a demo based on the Meetup.com streaming API to illustrate how to predict demand in order to adjust resource allocation.

Crossing the Development to Production Divide

Heather knows what it’s like to deal with complex production deployments that run the gamut from infrastructure upgrades to feature deployments to data migrations, where each step threatens to derail the plan. In this post, she gives an overview of obstacles she’s faced (you may be able to relate) and talks about ways to overcome them.

The Data Platform Puzzle

Editor’s Note: We’re talking about this, and more, at Enterprise Dataversity and Strata NY this fall. Building or rebuilding a data platform can be a daunting task, as most questions that need to be asked have open-ended answers. Different people may answer each question a different way, including the dreaded response of “well, it depends.” Truthfully, […]

Space Shuttle Problems: Long-term Planning Amid Changing Technology

How can you manage your implementation in a way that allows you to take maximum advantage of technology innovation as you go, rather than having to freeze your view of technology to today’s state and design something that will be outdated when it launches? You must start by deciding which pieces are necessary now, and which can wait.

From Impala to Hive with Love

While on paper it should be a seamless transition to run Impala code in Hive, in reality it’s more like playing a relentless game of whack-a-mole. This post provides hints to make the transition easier.

Develop Spark Apps on YARN Using Docker

Rather than get bitten by the idiosyncrasies involved in running Spark on YARN vs. standalone when you go to deploy, here’s a way to set up a development environment for Spark that more closely mimics how it’s used in the wild.

5 Things a Blockchain Needs to Succeed

Today, the currency supply supported by the Bitcoin blockchain is worth four billion dollars. So, what have we learned? There are five essential properties any good blockchain must have.

Developing in a Microservice Environment: Part 4

Before writing its first line of code in a microservice project, the team has to set up an environment to facilitate an effective development process. The distributed nature of microservice architecture makes setting up such an environment non-trivial.

Developing in a Microservice Environment: Part 3

In the first and second posts of this series, I described the importance of effective documentation when developing microservices, as well as the challenge of maintaining consistency in an environment with distributed development teams. In this post, I will look at some strategies for effective communication.

Developing in a Microservice Environment: Part 2

In the first post of this series of guidelines for creating effective microservice architectures, I described the importance of effective documentation when developing microservices. In this post, I will tackle the challenge of maintaining consistency in an environment with distributed development teams.

Developing in a Microservice Environment: Part 1

If you have identified microservices as the best solution to the technical problems you are facing, then consider the following collection of guidelines to help you get started.

Evaluating Microservices: Real World Lessons

Microservices are a popular topic in developer circles, because they are a means of solving problems that have plagued monolithic software projects for decades: namely, tardiness and bugs, both caused by complexity.

The Basics of Classifier Evaluation, Part 1

If it’s easy, it’s probably wrong.

We Need a New Data Architecture: What Next?

It’s clear from the explosion of interest in newer platforms and technologies that the old tools, and their licensing costs, no longer meet new business needs.

Dust in the Chain

Since the blockchain is both easily accessible and immutable, it is incredibly useful for other purposes. Issuing a tiny fraction of a Bitcoin (called dust) with embedded data allows anyone to easily store data permanently and publicly.

Use Cases for Apache Spark

The Apache Spark big data processing platform has been making waves in the data world, and for good reason.

The Blockchain is Forever

Marc Andreessen has compared the current state of Bitcoin to the state of the internet in 1993.

Two Tips for Optimizing Hive

Hadoop is only beneficial if using it is efficient.

Using Docker to Build a Data Acquisition Pipeline with Kafka and HBase

It’s hard to miss that Docker has been taking off lately.

Flexible Data Architecture with Spark, Cassandra, and Impala

An important aspect of a modern data architecture is the ability to use multiple execution frameworks over the same data.

Data Architecture Reading List

Databases sure ain’t what they used to be—it takes more than a relational database to put together a modern data architecture.
