From Zero to 2500: Managing Development for OpenStack Completely in
Who am I?
Office of Technology
Who am I?
Developer Infrastructure Core Team
The Four Opens
- Open Source
we don't hold back "Enterprise" features, we don't cripple things
- Open Design
design process open to all, decisions are not made inside company doors
- Open Development
public source code, public code review, all code is reviewed and gated
- Open Community
lazy consensus, democratic leadership from participants,
public logged meetings in IRC, public archived mailing lists
Tooling, Automation and CI for OpenStack Project
The original OpenStack use case
- Fully automated gated commits
- Full end-to-end integration tests from scratch for every commit
- Massive scale
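A minimal sketch of what "gated" means in the list above, assuming a hypothetical `run_tests` callable and a branch modelled as a plain list. It illustrates the idea only and is not the actual OpenStack gate:

```python
from typing import Callable, List


def gate(branch: List[str], proposed: List[str],
         run_tests: Callable[[List[str]], bool]) -> List[str]:
    """Merge each proposed change only if tests pass on the branch
    state that would exist after merging it. Returns rejected changes."""
    rejected = []
    for change in proposed:
        candidate = branch + [change]  # the branch as it would look if merged
        if run_tests(candidate):       # hypothetical full test run
            branch.append(change)      # tests passed: the change lands
        else:
            rejected.append(change)    # tests failed: the change never lands
    return rejected
```

The point of the design is that the tested state is always the state that would be merged, so a change that fails never reaches the branch.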
OpenStack Scale by the numbers
- 2 KJPH (kilo-jobs / hour) (1/3 the total Travis job rate)
- 2376 arbitrary developers
- 1474 git repositories
- 11727 Jobs
- Merge 10k Changes / Month
For comparison, Ansible has received 13171 PRs (changes),
has merged 8190 of them and has 37788 commits in its entire lifetime
Infra operates the same way as OpenStack
- All server config management in git
- Puppet manages the servers: puppet apply
- Ansible runs puppet: ansible puppet module
- Ansible OpenStack Dynamic Inventory
- The only things not public are keys and secrets
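As a rough illustration of the dynamic inventory idea, the sketch below turns a list of servers into the JSON structure Ansible expects from an inventory script (groups with `hosts`, plus `_meta.hostvars`). The server records, host names and the `build_inventory` helper are assumptions made for this example. A real inventory would query the cloud API instead of a hard-coded list.

```python
import json
from typing import Any, Dict, List


def build_inventory(servers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Group servers and record per-host variables in Ansible's
    dynamic inventory JSON layout."""
    inventory: Dict[str, Any] = {"_meta": {"hostvars": {}}}
    for server in servers:
        name = server["name"]
        inventory["_meta"]["hostvars"][name] = {"ansible_host": server["ip"]}
        for group in server.get("groups", []):
            inventory.setdefault(group, {"hosts": []})["hosts"].append(name)
    return inventory


# Hypothetical servers; the names and addresses are placeholders.
servers = [
    {"name": "review.example.org", "ip": "192.0.2.10", "groups": ["gerrit"]},
    {"name": "build01.example.org", "ip": "192.0.2.20", "groups": ["build"]},
]
inventory = build_inventory(servers)
print(json.dumps(inventory, indent=2))
```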
It wasn't always this way!
Let me take you on a walk down memory lane ...
We started with 4 cloud servers in Rackspace
- Hudson Master (https://launchpad.net/~hudson-openstack)
- Nova Build Node
- Swift Build Node
- The other server (now known as old-wiki)
old-wiki is still running! (On Ubuntu 10.04)
I didn't even have access to the cloud account!
- Hudson jobs ran Tarmac, which tested and merged Launchpad Merge Requests
- Hudson ran Tarmac in a loop, published the build results
- One Job per project
- Three of us with direct Hudson Admin permissions
This state persisted for the first year and first three OpenStack
Each project got a node and a job. Configured by hand. By me.
It got annoying
Please remember we're talking 2011 here
Puppet vs. Chef and git vs. bzr and humans pushing things
We were so excited about sharing Ops best practices!
We were so sad
Brief Rant - I do not want to write Apache configs in Puppet DSL
Our developers wanted to collaborate on test jobs.
Giving hundreds of people access to directly edit test jobs == sadness
Did I mention our test jobs implement captive gating?
Jenkins Job Builder
YAML encoding of Jenkins Job definitions with templating
Allowed jobs to go through code review before being applied!
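The templating idea can be sketched in a few lines: one job template with a `{name}` placeholder is expanded once per project. This is an illustration under stated assumptions, not Jenkins Job Builder's implementation. The `expand` helper, the template keys and the command are hypothetical.

```python
from typing import Dict, List


def expand(template: Dict[str, str], projects: List[str]) -> List[Dict[str, str]]:
    """Produce one concrete job definition per project by filling the
    {name} placeholder in every template value."""
    return [
        {key: value.format(name=project) for key, value in template.items()}
        for project in projects
    ]


# Hypothetical template: the keys and the command are placeholders.
template = {
    "name": "{name}-unit-tests",
    "command": "run-tests.sh {name}",
}
jobs = expand(template, ["nova", "swift"])
```

Because the template and the project list are plain text files, a change to either can be reviewed like any other commit before it is applied.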
Andreas Jaeger is OpenStack's all-time contributions leader. He works
on docs and test jobs
Introduction of Puppetmaster
- A cron job updated the config repo on the puppetmaster
- puppet agent ran on each node
- Landing a commit == config running on hosts (eventually)
- What to do with passwords and keys?
- Create replica repos on git farm (and on github I suppose)
- Create repo in Gerrit
- Push contents
Too much clicking!
jeepyb - Gerrit Project Builder
Lesson: Don't let Monty name things
Back up: Salt to run Puppet
Ansible to run Puppet
- puppet ansible module
- Role copies subset of hiera secrets to node before puppet
- Moved from puppetmaster to puppet apply
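One way to picture "copies subset of hiera secrets to node" is a filter that gives each host only the values it needs. The layout of the secrets store (`common`, `groups`, `hosts`), the `secrets_for` helper and all sample values below are assumptions made for this sketch, not the actual role:

```python
from typing import Any, Dict, List


def secrets_for(host: str, groups: List[str],
                store: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return only the secrets this host should receive. More specific
    levels override less specific ones: common < group < host."""
    subset: Dict[str, Any] = dict(store.get("common", {}))
    for group in groups:
        subset.update(store.get("groups", {}).get(group, {}))
    subset.update(store.get("hosts", {}).get(host, {}))
    return subset


# Hypothetical store; keys and values are placeholders.
store = {
    "common": {"sysadmin_email": "root@example.org"},
    "groups": {"gerrit": {"db_password": "group-secret"}},
    "hosts": {"review.example.org": {"db_password": "host-secret"}},
}
subset = secrets_for("review.example.org", ["gerrit"], store)
```

A host outside the `gerrit` group would receive only the `common` values, so a compromised node exposes no secrets beyond its own.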
Remaining manual human tasks
- Adding new secrets to hiera
- Launching new servers
Ansible Role Cloud Launcher
- name: admin-clouds
- name: aoclcompany.xlarge
- name: ops
- name: ubuntu-trusty
- name: bootstrap-keypair
- name: bootstrap-key
- name: awesomecloud
- name: yaycloud-ops
Problems depending on services
Even when the service is Open Source, it can stop being
WAS an Open Source translations system.
We run Zanata ourselves now. (Thanks Lyz!)
Remaining external service dependencies
- Rackspace Public Cloud
Launchpad OpenID -> openstackid
Launchpad Bugs -> storyboard
The Multi-cloud OpenStack Story
- Our build nodes already span 12 different OpenStack Public Clouds
- Work starting on spreading the Control Plane out
- Starting with Vexxhost