As 2017 comes to an end, we’re looking back at the three blog posts that were most popular with our readers. We can’t go too far back (though we’ve had 13 posts and one release already), since we founded our startup just a little over one month ago (on November 20, 2017, to be precise), but during this short period we’ve achieved a whole lot, and laid the foundation for some exciting new projects we plan to ship out early next year.
This post and the pull requests we proposed and pushed upstream into Apache Spark are filling a deep void, and they’ve been met with a tremendously positive reception. They were featured in several newsletters and have been referenced by popular sites - starting with Hadoop Weekly - that specialize in enterprise monitoring on Grafana. We look forward to following up on how we use Prometheus for metric-based alerts, SLA policy enforcement, correlation in the cloud infrastructure, Kubernetes and application metrics, and predictive cluster behavior. Happy monitoring! We love Prometheus!
This is part of a Spark on Kubernetes series which already has several posts that are almost on par with this one’s popularity.
- Introduction to Spark on Kubernetes
- Scaling Spark made simple on Kubernetes
- Running Zeppelin Spark notebooks on Kubernetes
We’re all set to make Spark a first-class citizen on Kubernetes - this will involve some changes to the Apache Spark project, the most important of which is the scheduler. It’s no longer necessary to have multi-level schedulers (each without knowledge of the others) and create deployment islands; Spark is now scheduled by the Kubernetes scheduler: fast, efficient and with the potential to bring to bear all the available resources of your Kubernetes cluster. There are additional inherited benefits based on default Kubernetes building blocks, which we will blog about early next year. Needless to say, this blog post was also featured in several newsletters and has been cross-referenced from quite a few locations. Thank you, Google Analytics, for all your valuable insights; we can see a clear trend emerging here!
Interestingly, this is a non-technical blog post which highlights some of the values along which we are set to operate. While we normally blog about technical challenges/achievements, usually alongside code examples that add value for our readers, this post somehow broke the top three. We’re happy to see that, just like us, our blog post readers and early adopters (that means YOU) are interested in a sense of transparency, and share our core values. We believe ALL employees deserve as much, and we encourage others to give it a try. It definitely works for us.
Having said that, we’d like to wrap this year up by wishing you a well-deserved rest - and to encourage you to return to reading our blog in the near future, since there will be some very interesting projects coming out of our lab early next year. We’re close to completely removing ZooKeeper from Apache Kafka and relying only on the default Kubernetes etcd, and to releasing a MySQL wire protocol-based OLTP spotguide (yes, Pipeline is application agnostic, and, besides the default Spark, Zeppelin and Kafka spotguides, we’ll be adding spotguides for Java (with cgroups support), Java Enterprise Edition, TensorFlow and many others). We’re also close to the release of the first open source cloud cost management project, Hollowtrees (we’re sorry to be open sourcing this later than anticipated, but we’re extremely thorough with what we release, and we had to re-architect this project so as to be fully pluggable into any existing architecture. You will most definitely be thankful we did).
We would like to thank you for your support, feedback and interest in our technology and open source projects.