Thoughts from Euro PyData

The Hardest Part of Technology is the Humans  |  July 29th, 2014

The PyData Conference in Berlin gave me a lot to think about this past weekend. As a collaborator on the first PyData Workshop, I am glad to see what a successful and varied event it has become (thanks to the hard work of Peter Wang, Travis Oliphant, Leah Silen, and the great team at NumFOCUS). The conference has occurred an impressive seven times in half a dozen cities since its birth in 2012. Each time it is a unique experience, with subtly differing audiences and topical focuses that emerge organically from the local community, and each is thus an intriguing reflection of how different parts of the overall community are approaching data science with Python.

This time around, I was struck both by what was present and what wasn’t. At the April PyData, the IPython Notebook was mentioned with an enthusiasm that bordered on reverence in literally every single talk I attended. At the Berlin event, it came up once or twice in the talks I heard, and only in passing—everyone agreed it’s a marvelous tool, but for whatever reason there just wasn’t as much hype around it this time.

Instead, this event’s strongly recurring themes included the Pandas library and a fascinating host of wetware issues woven into otherwise technical sessions. Jean-Paul Schmetz kicked off the sessions with an insightful talk on “Dealing with Complexity” that perhaps dealt with the problems of wetware most directly. “People are complex, and when you intertwine people, code, and data, it gets worse,” he said. “You have to make your people be simplifiers, not complectifiers.” He pointed out that achieving this tends to be very “un-agile,” because it requires a lot of oversight in the form of training, review, and strong leadership. But ultimately, he said, “Simplicity is whatever helps you do the right thing; complexity is whatever prevents that.” If you are at all in charge of processes or culture at your company, his talk is well worth watching.

The wetware theme showed up again in Lynn Root’s simultaneously fascinating and terrifying talk, “How to Spy with Python,” in which she systematically demonstrated what the NSA is doing with PRISM and how it can be done with Python. Her talk had a decidedly technical focus, but listeners couldn’t help but confront and discuss their own feelings on privacy and security. What we do with data and how we treat it come down to our basic human instincts. This talk was a must-see for anyone concerned about these issues.

Another notable example of the wetware theme, and an obvious example of Pandas’ popularity as well, was a pleasing bookend to Schmetz’s opening thoughts: Chris Nyland gave a talk in the last time-slot of the conference on the “Panda’s Thumb: unexpected evolutionary use of a Python library.” Nyland is a programmer and a tax lawyer in the UK who argued—with considerable energy and humor—that the Pandas library, while not the perfect tool for the job, is a far superior substitute for spreadsheets in helping lawyers with the problems of tax compliance. This, of course, raises the issue of getting a group of people to adopt a tool with which they are unfamiliar. Evangelism is always an uphill battle.

With kudos to the PyData team for having so many videos available so soon (as Nyland put it, “The only way @PyDataConf could have put my talk online faster involves time travel…”), I encourage you to take advantage of all the great talks from this past weekend.

If you’d like to learn more about Python for data science, look for the PyData Day at Strata Conference + Hadoop World in New York this October.