Below you can find the slides from our presentation at DockerCon EU in Barcelona last week. It was a real success and we got a lot of interesting questions!
Our talk “Swarming Spark Applications”, submitted to DockerCon 2015, has been accepted! Here is the abstract:
We built Zoe, an open source user-facing service that ties together Spark, a data-intensive framework for big data computation, and Swarm, the Docker clustering system. It targets data scientists who need to run their data analysis applications without having to worry about system details. Zoe can execute long-running Spark jobs, as well as Scala or IPython interactive notebooks and streaming applications, covering the full Spark development cycle. Because all processes run in Docker containers, resources are automatically freed when a computation finishes and become available for other uses.
In this talk we present why Zoe, the Container Analytics as a Service, was born, its architecture, and the problems it tries to solve. Zoe would not exist without Swarm and Docker, and we will also talk about some of the stumbling blocks we encountered and the solutions we found, in particular in transparently connecting Docker hosts through a physical network. Zoe was born as a research prototype, but it is now stable and is used to run real jobs from users in our research institution. Application scheduling on top of Swarm and optimized container placement will also be covered during the presentation.
You can find more information about Zoe here: http://www.zoe-analytics.eu
Cost-based Memory Partitioning and Management in Memcached, by D. Carra, P. Michiardi and M. Steiner
In this work we present a cost-based memory partitioning and management mechanism for Memcached, an in-memory key-value store used as a Web cache, that dynamically adapts to user requests and manages memory according to both object sizes and costs. We then present a comparative analysis of the vanilla memory management scheme of Memcached and our approach, using real traces from a major content delivery network operator. Our results indicate that our scheme achieves near-optimal performance, striking a good balance between the performance perceived by end-users and the pressure imposed on back-end servers.
Third International Workshop on In-memory Data Management and Analytics: http://imdm.ws/2015/
Duy Hung Phan and Quang-Nhat are participating in the 2015 edition of the IEEE BigData Congress in New York!