November 8, 2017 Timothy Prickett Morgan
Nobody likes to talk about the scope and scale of platforms more than we do at The Next Platform. Almost all of the interesting frameworks for various kinds of distributed computing are open source projects, but the lack of fit and finish is a common complaint across open source software projects.
As Mark Collier, chief operating officer at the OpenStack Foundation, puts it succinctly: “Open source doesn’t have an innovation problem. It has an integration problem.”
Collier’s chief concern, as well as that of his compatriot, Jonathan Bryce, executive director of the OpenStack Foundation and, like Collier, a former Racker – meaning a former employee of cloud provider Rackspace Hosting – is to make the OpenStack cloud controller that they helped create along with executives at NASA an enterprise-friendly platform. One that normal software engineers, not the platoons of PhDs that the hyperscalers and cloud builders employ, can deploy and use.
It is a tall order, and in many ways it is a much tougher job than the impressive scale goals that Rackspace Hosting and NASA both set when they merged their separate compute and storage cloud efforts back in July 2010. Back then, with clouds growing at an explosive exponential pace, it was not a surprise that the two organizations wanted to create a cloud controller that could span 1 million servers and 60 million virtual machines. While OpenStack has certainly proven that it can scale, for most enterprise customers a private cloud with a few thousand nodes is sufficient, and it is more important that a cloud do a lot of things, such as integrating with Docker containers and Kubernetes container orchestration, or supporting the latest TensorFlow or Caffe2 machine learning frameworks as transparently as it supports the Xen or KVM or ESXi server virtualization hypervisors. In a sense, clouds are scaling up with ever more complex workloads rather than scaling out across more and more nodes, and this has as much to do with how many cores and how much main memory can be crammed into a node these days as anything else.
The focus on testing and integration that Collier and Bryce are talking about this week at the OpenStack Summit in Sydney, Australia, is not as exciting as the early days of the cloud controller; but all good and used open source software eventually matures and becomes boring – just about when something new comes along and either replaces it or evolves it. To its credit, OpenStack is still evolving, absorbing new technologies such as bare metal provisioning and layering on containers, just to name two. And making this hodgepodge of open source projects into a platform – something the commercial Linux distributors labored heavily on for a decade until they mastered the task – is now the number one job of all OpenStackers.
“I think that the core services of compute, storage, and networking in OpenStack are very polished at this point,” Collier tells The Next Platform. “People have been running them at scale, but it is not always as massive as we think. I heard that China UnionPay was supporting 500 million users with over 150 applications on OpenStack, running over 50 million transactions per day – that’s over 3,000 transactions per second – that handled a total of 100 billion renminbi (about $15 billion) a day. I thought to myself, this OpenStack cloud must have over 1 million cores to do that, but it consists of two clusters with a total of 1,200 servers with around 10,000 cores. Because of OpenStack’s maturity and widespread adoption, now is the right time for our community to focus on the integration work. And as a foundation, we are trying to evolve to better support them.”
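The figures Collier cites lend themselves to a quick back-of-the-envelope check. The sketch below is ours, not UnionPay's or the foundation's, and it assumes the daily totals are spread evenly across 24 hours, which real payment traffic is not. On that assumption the daily average comes out well below the quoted 3,000 transactions per second, so that figure presumably describes a peak rate; that reading is our assumption and is not stated in the quote.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# All inputs come from the quote; the variable names are illustrative only.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

transactions_per_day = 50_000_000
daily_value_rmb = 100_000_000_000   # 100 billion renminbi
servers = 1_200
cores = 10_000

# Average rate if the load were perfectly flat (an assumption; real load peaks).
avg_tps = transactions_per_day / SECONDS_PER_DAY

# Rough density and per-unit figures implied by the quoted totals.
cores_per_server = cores / servers
tx_per_core_per_day = transactions_per_day / cores
avg_value_per_tx_rmb = daily_value_rmb / transactions_per_day

print(f"average throughput:       {avg_tps:,.0f} tx/s")
print(f"cores per server:         {cores_per_server:.1f}")
print(f"transactions/core/day:    {tx_per_core_per_day:,.0f}")
print(f"average transaction size: {avg_value_per_tx_rmb:,.0f} RMB")
```

Even read as a peak rate, 3,000 transactions per second across roughly 10,000 cores is a modest per-core load, which fits Collier's point that production OpenStack clouds are often smaller than one would guess.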
To better support OpenStack distributors as well as cloud builders and enterprises that work from the raw code and roll their own, the foundation is setting up what it is calling OpenLab, which as the name suggests is an effort by key players to improve the user experience and performance of software development kits, various frameworks, and multi-cloud management tools that run atop OpenStack. Huawei Technologies, Intel, Deutsche Telekom, and Vexxhost are ponying up the iron to do the testing of upstream projects with OpenStack, starting with Kubernetes, Terraform, and Gophercloud and expanding out from there.
Another subtle change in the OpenStack community is that the releases of the software, which have been on an October-April cadence for the past eight years, with each release coming out just before an OpenStack Summit, are now coming out a few months before the big shindigs. This, says Collier, allows the software engineers working on OpenStack to get a better sense of where the code is at before they attend the summits, where the changes and additions to the next release are discussed.
Yet another change being implemented in the OpenStack development effort, as steered by the foundation, is a shift from delineation by projects to a focus on specific use cases where different parts of the stack come into play. This “integration effort” has four core infrastructure use cases now – datacenter cloud, container, edge, and continuous integration/continuous delivery application development – and will be adding other variants like machine learning and AI, financial services, or augmented reality and virtual reality in the future.
“In a few years, I think that edge computing, in terms of total infrastructure use, will be larger than the core datacenter that companies like Amazon Web Services deploy,” says Collier. “You have the same need to manage all of this data, but you can do it more efficiently, faster, and with lower latency on site than back at a consolidated datacenter. This is the interesting emerging model.”
Stick A Fork In It
It is reasonable, then, to ask at this point just how close to done OpenStack is, so we did that. “A lot of the core OpenStack services are mostly done for the present-day use cases,” explains Bryce. “I think what is interesting to see is how much infrastructure is diversifying. Infrastructure just used to mean a bunch of servers in a datacenter. What we see now is this spreading out for edge applications, but also for retailers like Wal-Mart and Target, who are thinking about what the next generation of their in-store infrastructure is going to look like and what role OpenStack plays in that. AT&T has rolled out more than a hundred datacenters running OpenStack and that is handling different production services, including phone calls and a variety of network functions, but now they are looking at the next wave out. When you have a hundred regional offices, these fan out to thousands of branch locations as well as the cell towers and the half racks of gear sitting up on a telephone pole. They want the same level of automation for the infrastructure out there that they have in the datacenters.”
There are always new corner cases where new things need to be added to OpenStack. In some cases, interestingly enough, customers are deploying OpenStack only to manage bare metal provisioning using Ironic, while others use a mix of Ironic and Keystone identity management – with no server virtualization or containerization at all.
While OpenStack is complex – meaning it can do many different things and will be asked to do more – that is the easy part for software engineers, who are always eager to hack some code to solve a problem. The hard part has always been, and will always be, making that hack look like it had always been part of the plan from the beginning.
To do this and to bolster the efforts to document this stack of code and how it works together, the OpenStack Foundation is investing $20 million a year. It is not clear how much all of the “free” time put in by the tens of thousands of software engineers who work on OpenStack is worth, but it is surely at least an order of magnitude or two larger if you sat down and added up all of the hours and allocated pay. That is an interesting bit of accounting we would love to see for all the big open source projects, come to think of it . . . .