begriffs

Intro to Apache Mesos, the distributed systems SDK

November 28, 2014

Niklas Nielsen, distributed systems engineer at Mesosphere, gives an overview of how Mesos manages resources to ensure their fair and efficient use in a compute cluster. He also demonstrates two higher-level frameworks that run on top of Mesos: one keeps jobs alive, and the other manages timing and dependencies.

Overview

  • What is Apache Mesos (and what is it not)
  • Abstracting from physical machines to resources
  • “Everything fails all the time”
  • Mesos’ heuristic for the NP-hard cloud scheduling problem
  • Delegation to local decision-making nodes
  • Resource offers
  • Scheduling tasks across racks or nodes using attributes
  • Avoiding resource starvation using reservations
  • Mesos is a kernel with which you rarely interact directly
    • You use frameworks on top
    • Marathon starts processes in a Mesos cluster and handles deployments and upgrades
    • Chronos is a distributed cron with dependencies
  • Twitter uses Mesos to handle
    • 240 million monthly users
    • 150k tweets per second
    • 100TB per day of compressed data
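
To make the resource-offer and attribute ideas from the outline more concrete, here is a minimal toy sketch in Python. It is not the Mesos API: `Offer`, `Task`, `schedule`, and the `rack` attribute are hypothetical names chosen for illustration, and the greedy first-fit placement is an assumed stand-in for whatever policy a real framework would use. The point is only the shape of the interaction: nodes advertise free resources, and the framework decides locally which offers to use.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Free resources on one node, as advertised to a framework (toy model)."""
    node: str
    cpus: float
    mem_mb: int
    attributes: dict = field(default_factory=dict)


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


def schedule(offers, tasks, required_attributes=None):
    """Greedy first-fit placement over the offers a framework received.

    Returns (placements, unplaced), where placements maps task name to node.
    """
    required = required_attributes or {}
    placements = {}
    unplaced = []
    # Track what is left of each offer as tasks are packed into it.
    remaining = {o.node: [o.cpus, o.mem_mb] for o in offers}
    for task in tasks:
        for offer in offers:
            # Skip offers from nodes that lack the attributes we require.
            if any(offer.attributes.get(k) != v for k, v in required.items()):
                continue
            cpus_left, mem_left = remaining[offer.node]
            if task.cpus <= cpus_left and task.mem_mb <= mem_left:
                remaining[offer.node] = [cpus_left - task.cpus,
                                         mem_left - task.mem_mb]
                placements[task.name] = offer.node
                break
        else:
            # No offer could hold this task; a real framework would wait
            # for a later round of offers.
            unplaced.append(task.name)
    return placements, unplaced


offers = [
    Offer("node-a", cpus=4.0, mem_mb=8192, attributes={"rack": "r1"}),
    Offer("node-b", cpus=2.0, mem_mb=4096, attributes={"rack": "r2"}),
]
tasks = [
    Task("web-1", cpus=1.0, mem_mb=1024),
    Task("web-2", cpus=1.0, mem_mb=1024),
    Task("batch", cpus=4.0, mem_mb=2048),
]

# Constrain placement to rack r2 using the (hypothetical) rack attribute.
placements, unplaced = schedule(offers, tasks, required_attributes={"rack": "r2"})
print(placements, unplaced)
```

In this sketch the two small tasks fit on `node-b`, while `batch` stays unplaced because no offer on rack `r2` has four CPUs free. Dropping the `required_attributes` argument lets the same tasks land on any node.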