Workflows are like kickball games: everyone knows the general idea of what’s going on, there’s an orderly progression towards an end-goal, nobody wants to be excluded, and people lose their shit when they get hit in the face by a big rubber ball. Okay, so maybe it’s not a perfect mapping but you get the idea.
The previous two posts (one and two) focused on writing modules, wrapping modules, and classification. While ALL of these things are very important in the grand scheme of things, one of the biggest problems people get hung up on is how to iterate upon your modules and, more importantly, how to eventually get those changes pushed into production in a reasonably orderly fashion.
This post is going to be all over the place. We’re gonna cover the idea of separate environments in Puppet, touch on dynamic environments, and round it out with that mother-of-a-shell-script-turned-personal-savior, R10k. Hold on to your shit.
Puppet Environments
Puppet has the concept of ‘environments’ where you can logically separate your modules and manifest (read: site.pp) into separate folders to allow for nodes to get entirely separate bits of code based on which ‘environment’ the node belongs to.
Puppet environments are statically set in puppet.conf, but, as other blog posts have noted, you can do some crafty things in puppet.conf to give you the solution of having ‘dynamic environments’.
NOTE: The solutions in this post are going to rely on Puppet environments; however, environments aren’t without their own shortcomings (namely, this bug on Ruby plugins in Puppet). For testing and promoting Puppet classes written in the DSL, environments will help you out greatly. For complete separation of Ruby instances and any plugins to Puppet written in Ruby, however, you’ll need separate masters (which is something that I won’t be covering in this article).
One step further – ‘dynamic’ environments
Adrien Thebo, hereafter known as ‘Finch’ – who is known for building awesome things and talking like he’s fresh from a Redbull binge – created the now-famous blog post on creating dynamic environments in Puppet with git. That post relied upon a post-receive hook to do all the jiggery-pokery necessary to checkout the correct branches in the correct places, and thus it had a heavy reliance upon git.
Truly, the only magic in puppet.conf was the inclusion of ‘$environment’ in the modulepath configuration entry on the Puppet master (literally that string and not the evaluated form of your environment). By doing that, the Puppet master would replace the string ‘$environment’ with the environment of the node checking in and would look to that path for Puppet manifests and modules. If you use something OTHER than git, it would be up to you to create a post-receive hook that populated those paths, but you could still replicate the results (albeit with a little work on your part).
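As a rough sketch of that setting, the snippet below writes a sample [master] section to a scratch directory rather than to a real puppet.conf. The paths are assumptions, and the manifest line is my own addition to show the same trick applied to site.pp. The quoted heredoc keeps the shell from expanding $environment, so the literal string lands in the file:

```shell
#!/bin/sh
set -e
# Scratch location so this sketch never touches a real puppet.conf.
demo="${TMPDIR:-/tmp}/r10k-dynamic-env-demo"
rm -rf "$demo"
mkdir -p "$demo"

# The quoted 'EOF' stops the shell from expanding $confdir and $environment:
# Puppet itself must see those literal strings.
cat > "$demo/puppet.conf" <<'EOF'
[master]
    modulepath = $confdir/environments/$environment/modules
    manifest   = $confdir/environments/$environment/manifests/site.pp
EOF

# Show the line that carries the literal $environment placeholder.
grep 'modulepath' "$demo/puppet.conf"
```

With a setting like this, a node checking in with the environment ‘development’ would be served modules from the environments/development/modules folder under the master’s configuration directory.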
People used this pattern and it worked fairly well. Hell, it STILL works fairly well, nothing has changed to STOP you from using it. What changed, however, was the ecosystem around modules, the need for individual module testing, and the further need to automate this whole goddamn process.
Before we deliver the ‘NEW SOLUTION’, let’s provide a bit of history and context.
Module repositories: the one-to-many problem
I touched on this topic in the first post, but one of the first problems you encounter when putting your modules in version control is whether or not to have ONE GIANT REPO with all of your modules, or a repository for every module you create. In the past we recommended putting every module in one repository (namely because it was easier, the module sharing landscape was pretty barren, and teams were smaller). Now, we recommend the opposite for the following reasons:
- Individual repos mean individual module development histories
- Most VCS solutions don’t have per-folder ACLs within a single repository; having multiple repos allows per-module security settings.
- With the one-giant-repository solution, modules you pull down from the Forge (or Github) must be committed to your repo. Having a separate repository for each module allows you to keep everything separate.
- Publishing a module to the Forge (or Github/Stash/whatever) is easier with separate repos (rather than having to split out the module later).
The problem with having a repository for every Puppet module you create is that you need a way to map every module to every Puppet master (and also to track which version of every module should be installed in which Puppet environment).
A project called librarian-puppet sprang up that created the ‘Puppetfile’, a file that would map modules and their versions to a specific directory. Librarian was awesome, but, as Finch noted in his post, it had some shortcomings when used in an environment with many and fast-changing modules. His solution, which he documented here, was the tool we have now come to know as R10k.
Enter R10k
R10k is essentially a Ruby project that wraps a bunch of shell commands you
would NORMALLY use to maintain an environment of ever-changing Puppet modules.
Its power is in its ability to use Git branches combined with a Puppetfile
to keep your Puppet environments in-sync. Because of this, R10k is CURRENTLY
restricted to git. There have been rumblings of porting it to Hg or svn, but
I know of no serious attempts at doing this (and if you ARE doing this, may
god have mercy on your soul). Great, so how does it work?
Well, you’ll need one main repository SIMPLY for tracking the Puppetfile. I’ve got one right here, and it only has my Puppetfile and a site.pp file for classification (should you use it).
NOTE: The Puppetfile and librarian-puppet-like capabilities under the hood are going to be doing most of the work here – this repository is solely so you can create topic branches with changes to your Puppetfile that will eventually become dynamically-created Puppet environments.
Let’s take a look at the Puppetfile and see what’s going on:
This example lists the syntax for dealing with modules from both the Forge and Github, as well as pulling specific versions of modules (whether versions in the case of the Forge, or Github references as tags, branches, or even specific commits). The syntax is not hard to follow – just remember that we’re mapping modules and their versions to a set/known environment.
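For readers without the listing in front of them, here is a minimal sketch of a Puppetfile in that style, written to a scratch directory. The repository URLs under github.com/example, the module names, the stdlib version, and the commit hash are all placeholders of mine and not the contents of the repo described here:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-puppetfile-demo"
rm -rf "$demo"
mkdir -p "$demo"

# Quoted 'EOF' so the shell leaves the Ruby-style syntax untouched.
cat > "$demo/Puppetfile" <<'EOF'
forge "http://forge.puppetlabs.com"

# Forge module pinned to a released version (illustrative version number).
mod "puppetlabs/stdlib", "4.1.0"

# Git module pinned to a tag (hypothetical repository URL).
mod "apache",
  :git => "https://github.com/example/puppet-apache.git",
  :ref => "1.0.0"

# Git module tracking a branch.
mod "notifyme",
  :git => "https://github.com/example/puppet-notifyme.git",
  :ref => "master"

# Git module pinned to one specific commit (placeholder hash).
mod "motd",
  :git => "https://github.com/example/puppet-motd.git",
  :ref => "0123456789abcdef0123456789abcdef01234567"
EOF

cat "$demo/Puppetfile"
```

The :ref value is the knob that matters for the rest of this post: changing it in a given branch of the main repo changes which version of the module that environment receives.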
For every topic branch on this repository (containing the Puppetfile), R10k will in turn create a Puppet environment with the same name. For this reason, it’s convention to rename the ‘master’ branch to ‘production’ since that’s the default environment in Puppet (note that renaming branches locally is easy – renaming the branch on Github can sometimes be a pain in the ass). You will also note why it’s going to be somewhat hard to map R10k to subversion, for example, due to the lack of lightweight branching schemes.
Explaining any more of R10k would read just as if I were describing its installation, so let’s quit screwing around and actually INSTALL/SET UP the damn thing.
Setting up R10k
As I mentioned before, we have the main repository that will be used to track the Puppetfile, which in turn will track the modules to be installed (whether from the Forge, Github, or some internal git repo). Like any good Puppet component, R10k itself can be set up with a Puppet module. The module I’ll be using was developed by Zack Smith, and it’s pretty simple to get started with. Let’s download it from the Forge first:
The module will be installed into the first path in your modulepath, which in the case above is /etc/puppetlabs/puppet/modules. This modulepath will change due to the way we’re going to set up our dynamic Puppet environments. For this example, I’m going to have environments dynamically generated at /etc/puppetlabs/puppet/environments, so let’s create that directory first:
Now, we need to set up R10k on this machine. The module we downloaded will allow us to do that, but we’ll need to create a small Puppet manifest that will allow us to set up R10k out-of-band from a regular Puppet run (you CAN continuously enforce R10k configuration in-band with your regular Puppet run, but if we’re setting up a Puppet master to use R10k to serve out dynamic environments it’s possible to create a chicken-and-egg situation). Let’s generate a file called r10k_installation.pp in /var/tmp and have it look like the following:
So what is every section of that declaration doing?
- version => '1.1.3' sets the version of the R10k gem to install
- sources => {...} is a hash of sources that R10k is going to track. For now it’s only our main Puppet repo, but you can also track a Hiera installation too. This hash accepts key/value pairs for configuration settings that are going to be written to /etc/r10k.yaml, which is R10k’s main configuration file. The keys in-use are remote, which is the path to the repository to-be-checked-out by R10k, basedir, which is the path on-disk to where dynamic environments are to be created (we’re using the $::settings::confdir variable which maps to the Puppet master’s configuration directory, or /etc/puppetlabs/puppet), and prefix which is a boolean to determine whether to use R10k’s source-prefixing feature. NOTE: the false value is a BOOLEAN value, and thus SHOULD NOT BE QUOTED. Quoting it turns it into a string, which matches as a boolean TRUE value. Don’t quote false – that’s bad, mmkay.
- purgedirs => ["${::settings::confdir}/environments"] is configuring R10k to implement purging on the environments directory (so any folders that R10k doesn’t create it will delete). This configuration MAY be moot with newer versions of R10k as I believe it implements this behavior by default.
- manage_modulepath => true will ensure that this module sets the modulepath configuration item in /etc/puppetlabs/puppet/puppet.conf
- modulepath => ... sets the modulepath value to be dropped into /etc/puppetlabs/puppet/puppet.conf. Note that we are interpolating variables ($::settings::confdir again), AND inserting the LITERAL string of $environment into the modulepath – this is because Puppet will replace $environment with the value of the agent’s environment at catalog compilation.
JUST IN CASE YOU MISSED IT: Don’t quote the false value for the prefix setting in the sources block. That is all.
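Putting the parameters described above together, a manifest along these lines would match the description. Treat it as a reconstruction under assumptions: the source name ‘puppet’, the remote URL, and the second modulepath entry are placeholders of mine, and the file is written to a scratch directory rather than /var/tmp:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-install-demo"
rm -rf "$demo"
mkdir -p "$demo"

# Quoted 'EOF': the shell must not expand ${::settings::confdir} or $environment.
cat > "$demo/r10k_installation.pp" <<'EOF'
class { 'r10k':
  version           => '1.1.3',
  sources           => {
    'puppet' => {
      # Hypothetical URL: substitute your own main Puppet repo.
      'remote'  => 'https://github.com/example/puppet_repository.git',
      'basedir' => "${::settings::confdir}/environments",
      # Unquoted on purpose: this must be a boolean, not a string.
      'prefix'  => false
    }
  },
  purgedirs         => ["${::settings::confdir}/environments"],
  manage_modulepath => true,
  # The escaped \$environment stays literal; Puppet substitutes it per agent.
  modulepath        => "${::settings::confdir}/environments/\$environment/modules:/opt/puppet/share/puppet/modules",
}
EOF

cat "$demo/r10k_installation.pp"
```

The backslash in front of $environment is the detail to watch: without it, Puppet would try to interpolate a variable while applying this manifest instead of writing the literal string into puppet.conf.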
Okay, we have our one-time Puppet manifest, and now the only thing left to do is to run it:
At this point, it goes without saying that git needs to be installed, but if you’re firing up a new VM that DOESN’T have git, then R10k is going to spit out an awesome error – so ensure that git is installed. After that, let’s synchronize R10k with the r10k deploy environment -pv command (-p for Puppetfile synchronization and -v for verbose mode):
I ran this first synchronization with verbose mode so you can see exactly what’s getting copied where. Further synchronizations don’t have to be in verbose mode, but it’s good for debugging. After all of that, we have an /etc/puppetlabs/puppet/environments folder containing our dynamic Puppet environments based off of the branches of the main Puppet repo:
As you can see (at the time of this writing), my main Puppet repo has three main branches: development, master, and production, and so R10k created three Puppet environments matching those names. It’s somewhat of a convention to rename the master branch to production, but in this case I left it alone to demonstrate how this works.
ONE OTHER BIG GOTCHA: R10k does NOT resolve dependencies, and so it is UP TO YOU to track them in your Puppetfile. Check this out:
I’ve installed Puppet Enterprise 3.1.0, and so /opt/puppet/share/puppet/modules reflects the state of the Puppet Enterprise (also known as ‘PE’) modules at that time. You can see that there are some conflicts because certain modules require certain versions of other modules. This is currently the nature of the beast with regard to Puppet modules. Some of these errors are loud and incidental (i.e. someone set a dependency on a version and forgot to update it), some are due to namespace changes (i.e. cprice404-inifile being ported over to puppetlabs-inifile), and so on. Basically, ensure that you handle the dependencies you care about inside the Puppetfile as R10k won’t do it for you.
There – we’ve done it! We’ve configured R10k! Now how the hell do you use it?
R10k demonstration – from module iteration to environment iteration
Let’s take the environment we’ve set up in the previous steps and walk through adding a new module to an environment, iterating upon that module, pushing the changes to that module, pushing the changes to a Puppet environment, and then promoting those changes to production.
NOTES ON THE SETUP OF THIS DEMO:
- In this demonstration, classification method is going to be left to the user (i.e. it’s not a part of the magic). So, when I tell you to classify your node with a specific class, I don’t care if you use the Puppet Enterprise Console, site.pp, or any other manner.
- I’m using Github for my repositories so that you folk watching and playing along at home can have something to follow. Feel free to substitute Github for something like Atlassian Stash/Bitbucket, internal repos, or whatever.
Add the module to an environment
The module we’ll be working with, a simple module called ‘notifyme’, will output a notify message that will help us track the module’s progress through all phases of iteration.
The first thing we need to do is to add the module to an environment, so let’s dynamically create a NEW environment by creating a new topic branch and pushing it up to the main puppet repo. I will perform this step on my laptop and outside of the VM I’m using to test R10k:
The contents I added to my Puppetfile look like this:
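The whole step can be sketched end to end without touching Github. In the following, a local bare repository stands in for the remote, the starting branch is assumed to be ‘production’, and the module URL is a placeholder; only the branch name ‘notifyme’ and the shape of the Puppetfile entry come from the walkthrough:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-topic-branch-demo"
rm -rf "$demo"
mkdir -p "$demo"

# A local bare repository stands in for the main Puppet repo on Github.
git init --quiet --bare "$demo/remote.git"

git init --quiet "$demo/puppet_repository"
cd "$demo/puppet_repository"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$demo/remote.git"

# Starting point: a Puppetfile already committed on the 'production' branch.
echo 'forge "http://forge.puppetlabs.com"' > Puppetfile
git add Puppetfile
git commit --quiet -m "Initial Puppetfile"
git branch -M production
git push --quiet origin production:production

# New topic branch; R10k will turn it into an environment of the same name.
git checkout --quiet -b notifyme

# Add the module to this branch's Puppetfile (hypothetical repository URL).
cat >> Puppetfile <<'EOF'
mod "notifyme",
  :git => "https://github.com/example/puppet-notifyme.git"
EOF

git commit --quiet -am "Add notifyme module"
git push --quiet origin notifyme:notifyme

# The remote now carries both branches.
git ls-remote --heads origin
```

Note that the ‘production’ branch’s Puppetfile is untouched, which is why the module only shows up in the new environment.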
Perform an R10k synchronization
To pull the new dynamic environment down to the Puppet master, do another R10k synchronization with r10k deploy environment -pv:
I only included the relevant messages, but you can see that it pulled in a new environment called ‘notifyme’ that ALSO pulled in a module called ‘notifyme’.
Rename the branch to avoid confusion
Suddenly I realize that this may get confusing having both an environment called ‘notifyme’ with a module/class called ‘notifyme’. No worries, how about we rename that branch?
That bit of git renamed the ‘notifyme’ branch to ‘garysawesomeenvironment’. The next git command is a bit tricky – when you git push to a remote, it’s supposed to be:
git push name_of_origin local_branch_name:remote_branch_name
In our case, the name of our origin is LITERALLY ‘origin’, but we actually want to DELETE a remote branch. The way to delete a local branch is with git branch -d branch_name, but the way to delete a REMOTE branch is to push NOTHING to it. So consider the following command:
git push origin :notifyme
We’re pushing to the origin named ‘origin’, but providing NO local branch name and pushing that bit of nothing to the remote branch of ‘notifyme’. This kills (deletes) the remote branch.
Finally, we push to our origin named ‘origin’ again and push the contents of the local branch ‘garysawesomeenvironment’ to the remote branch of ‘garysawesomeenvironment’ which in turn CREATES that branch if it doesn’t exist. Whew. Let’s run another damn synchronization:
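The three git commands described above can be tried safely against a throwaway remote. This sketch assumes a local bare repository in place of Github and a placeholder Puppetfile; the branch names are the ones from the walkthrough:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-branch-rename-demo"
rm -rf "$demo"
mkdir -p "$demo"

# Stand-in for the remote on Github: a local bare repository.
git init --quiet --bare "$demo/remote.git"

# Stand-in for the main Puppet repo on the laptop.
git init --quiet "$demo/puppet_repository"
cd "$demo/puppet_repository"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$demo/remote.git"

# Recreate the earlier state: a 'notifyme' branch that exists on the remote.
echo '# Puppetfile placeholder' > Puppetfile
git add Puppetfile
git commit --quiet -m "Add Puppetfile"
git checkout --quiet -b notifyme
git push --quiet origin notifyme:notifyme

# 1. Rename the local branch.
git branch -m notifyme garysawesomeenvironment
# 2. Delete the remote branch by pushing "nothing" to it.
git push --quiet origin :notifyme
# 3. Push the renamed branch, which creates it on the remote.
git push --quiet origin garysawesomeenvironment:garysawesomeenvironment

# List the branches that now exist on the remote.
git ls-remote --heads origin
```

After the last push, the only branch left on the remote is ‘garysawesomeenvironment’, so the next R10k synchronization has nothing called ‘notifyme’ to build an environment from.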
Cool, let’s check out our environments folder on our VM:
Run Puppet to test the new environment
Perfect! Now classify your node to include the ‘notifyme’ class, and let’s run Puppet to see what we get when we try to join the environment called ‘garysawesomeenvironment’:
Cool! Now let’s try to run Puppet with another environment, say ‘production’:
We get an error because that module hasn’t been loaded by R10k for that environment.
Tie a module version to an environment
Okay, so we added a module to a new environment, but what if we want to try out a specific commit, branch, or tag of a module in this new environment? This is frequently what you’ll be doing – making a change to an existing module, pushing your change to a topic branch of that module’s repository, tying it to an environment (or creating a new environment by branching the main Puppet repository), and then testing the change.
Let’s go back to my ‘notifyme’ module that I’ve cloned to my laptop and push a change to a BRANCH of that module’s Github repository:
What I’m showing you is the workflow that creates a new local branch called ‘change_the_message’ in the notifyme module, changes the message in my notify resource, commits the change, and pushes the change to a remote branch ALSO called ‘change_the_message’.
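That module-side workflow looks roughly like the sketch below. The class body and both messages are invented for illustration, and a local bare repository stands in for the module’s Github repo; the branch name is the one used above:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-module-branch-demo"
rm -rf "$demo"
mkdir -p "$demo"

# Local bare repository standing in for the module's Github repo.
git init --quiet --bare "$demo/remote.git"

git init --quiet "$demo/notifyme"
cd "$demo/notifyme"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$demo/remote.git"

# Hypothetical starting point for the module on its 'master' branch.
mkdir -p manifests
cat > manifests/init.pp <<'EOF'
class notifyme {
  notify { 'notifyme': message => 'Original message' }
}
EOF
git add manifests/init.pp
git commit --quiet -m "Initial notifyme class"
git branch -M master
git push --quiet origin master:master

# Cut the topic branch, change the message, commit, and push.
git checkout --quiet -b change_the_message
sed 's/Original message/Changed message/' manifests/init.pp > init.pp.tmp
mv init.pp.tmp manifests/init.pp
git commit --quiet -am "Change the notify message"
git push --quiet origin change_the_message:change_the_message

git ls-remote --heads origin
```

The ‘master’ branch on the remote still carries the original message, so nothing that points at ‘master’ sees the change yet.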
Because I created a topic branch, I can provide that branch name in the Puppetfile located in the ‘garysawesomeenvironment’ branch of the main Puppet repo. THAT is the piece that ties together the specific version of the module with the Puppet environment we want on the Puppet master. Here’s that change:
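As a sketch, the entry in that branch’s Puppetfile would take the following shape (the repository URL is a placeholder; the :ref value is the topic branch named above). The file is written to a scratch directory:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-puppetfile-ref-demo"
rm -rf "$demo"
mkdir -p "$demo"

# The :ref line is what ties this environment to the module's topic branch.
cat > "$demo/Puppetfile" <<'EOF'
mod "notifyme",
  :git => "https://github.com/example/puppet-notifyme.git",
  :ref => "change_the_message"
EOF

cat "$demo/Puppetfile"
```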
Again, that change gets put into the ‘garysawesomeenvironment’ branch of the main Puppet repo and pushed up to the remote:
Now let’s synchronize again!!
Cool, let’s check our work on the VM:
And finally, let’s run Puppet:
TADA! We’ve successfully tied a specific version of a module to a specific dynamic environment, deployed it to a master, and tested it out! Smell that? That’s the smell of awesome. Or Jeff in the next cubicle eating a burrito. Either way, I like it.
Merge your changes with master/production
It’s green – fuck it; ship it! NOW you’re speaking ‘agile’! Assuming everything went according to plan, let’s merge our changes in with the production environment and synchronize. This is up to your company’s workflow docs (whether you use pull requests, a merge master, or poke Patrick and tell him to tell Andy to merge in your change). I’m using git and Github, so let’s merge.
First, do the Module:
So now we have an issue, and that issue is that the production environment has YET to have the ‘notifyme’ module added to it. If we merge the contents of the ‘garysawesomeenvironment’ branch with the ‘production’ branch of the main Puppet repo, then we’re going to be pointing at the ‘change_the_message’ branch of the ‘notifyme’ module (because that was our last commit).
Because of this, I can’t do a straight merge, can I? For posterity’s sake (in the event that someone in the future wants to look for that branch on my Github repo), I’m going to keep that branch alive. In a production environment, I most likely would NOT have additional branches open for all my component modules as that would get pretty annoying/confusing. Understand that this is a one-off case because I’m doing a demo. BECAUSE of this, I’m going to modify the Puppetfile in the ‘production’ branch of the main Puppet repo:
Alright, we’ve updated the production environment, now synchronize again (I’ll spare you and do it WITHOUT verbose mode):
Okay, now run Puppet with the PRODUCTION environment:
Beautiful, we’re synchronized!!!
Making a change to an EXISTING module in an environment
Okay, so we saw previously how to add a NEW module to an environment, but what if we already HAVE a module in an environment and we want to make an update/change to it? Well, it’s largely the same process:
- Cut a branch to the module
- Commit your code and push it up to the module’s repo
- Cut a branch to the main Puppet repo
- Push that branch up to the main Puppet repo
- Perform an R10k synchronization to sync the environments
- Test your changes
- Merge the changes with the master branch of the module
- DONE!
Let’s go back and change that notify message again, shall we?
Okay, let’s re-use ‘garysawesomeenvironment’ because I like the name, but tie it to the new ‘another_change’ branch of the ‘notifyme’ module:
The Puppetfile for that branch now has an entry for the ‘notifyme’ module that looks like this:
Okay, synchronize again!
And now run Puppet in the ‘garysawesomeenvironment’ environment:
There’s the message that I changed in the ‘another_change’ branch of my ‘notifyme’ module! What’s it look like if I run in the ‘production’ environment, though?
There’s the old message that’s in the ‘master’ branch of the ‘notifyme’ module (which is where the ‘production’ branch Puppetfile is pointing). To merge the changes into the production environment, we now only have to do one thing: that’s merge the changes in the ‘another_change’ branch of the ‘notifyme’ module to the ‘master’ branch – that’s it! Why? Because the Puppetfile in the production branch of the main Puppet repo (and thus the production Puppet ENVIRONMENT) is already POINTING at the master branch of the ‘notifyme’ module. Let’s do the merge:
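A self-contained sketch of that merge follows. The two messages and the one-line class are made up, and a local bare repository plays the part of the module’s remote; the branch names match the ones above:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-merge-demo"
rm -rf "$demo"
mkdir -p "$demo"

git init --quiet --bare "$demo/remote.git"
git init --quiet "$demo/notifyme"
cd "$demo/notifyme"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$demo/remote.git"

# 'master' holds the old message (hypothetical class body).
mkdir -p manifests
echo "class notifyme { notify { 'notifyme': message => 'Old message' } }" > manifests/init.pp
git add manifests/init.pp
git commit --quiet -m "Old message"
git branch -M master
git push --quiet origin master:master

# 'another_change' holds the new message.
git checkout --quiet -b another_change
echo "class notifyme { notify { 'notifyme': message => 'New message' } }" > manifests/init.pp
git commit --quiet -am "New message"
git push --quiet origin another_change:another_change

# The merge itself: bring 'another_change' into 'master' and publish it.
git checkout --quiet master
git merge --quiet another_change
git push --quiet origin master:master

git --no-pager log --oneline -1 master
```

Nothing in the main Puppet repo changes in this case; the production Puppetfile already follows ‘master’, so the next synchronization picks up the merged commit.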
Another R10k synchronization is needed on the master:
And now let’s run Puppet in the production environment:
There’s the message that was previously in the ‘another_change’ branch that’s been merged to the ‘master’ branch (and thus is entered into the production Puppet environment).
OR, use tags
One more note – for production environments that want a BIT more stability (rather than hoping that someone follows the policy of pushing commits to a BRANCH of a module rather than pushing directly to master – by accident or otherwise – and allowing that commit to make it DIRECTLY into production), the better way is to tie all modules to some sort of release version. For modules released to the Puppet Forge, that’s a version; for modules stored in git repositories, that would be a tag. Tying all modules in your production environment (and thus production Puppetfile) to specific tags in git repositories IS a “best practice” to ensure that the code that’s executed in production has some sort of ‘safeguard’.
TL;DR: Example tied to ‘master’ branch above was demo, and not necessarily recommended for your production needs.
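To make the tag approach concrete, here is a small sketch: cut a tag in the module’s repository, push it, and point the production Puppetfile at the tag instead of a branch. The tag name 1.0.0, the class body, and the module URL are placeholders of mine, and a local bare repository stands in for the remote:

```shell
#!/bin/sh
set -e
demo="${TMPDIR:-/tmp}/r10k-tag-demo"
rm -rf "$demo"
mkdir -p "$demo"

git init --quiet --bare "$demo/remote.git"
git init --quiet "$demo/notifyme"
cd "$demo/notifyme"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$demo/remote.git"

# A released state of the module (hypothetical class body).
mkdir -p manifests
echo "class notifyme { notify { 'notifyme': message => 'Released message' } }" > manifests/init.pp
git add manifests/init.pp
git commit --quiet -m "Release-ready notifyme"
git branch -M master
git push --quiet origin master:master

# Tag the release and publish the tag.
git tag -a 1.0.0 -m "Release 1.0.0"
git push --quiet origin 1.0.0

# Production Puppetfile entry pinned to the tag rather than to 'master'.
cat > "$demo/Puppetfile" <<'EOF'
mod "notifyme",
  :git => "https://github.com/example/puppet-notifyme.git",
  :ref => "1.0.0"
EOF

cat "$demo/Puppetfile"
```

With the entry pinned this way, a stray commit to the module’s ‘master’ branch no longer reaches production; someone has to cut a new tag and update the Puppetfile deliberately.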
Holy crap, that’s a lot to take in…
Yeah, tell me about it. And, believe it or not, I’m STILL not done with everything that I want to talk about regarding R10k – there’s still more info on:
- Using R10k with a monolithic modules repo
- Incorporating Hiera data
- Triggering R10k with MCollective
- Tying R10k to CI workflow
Those will come in a later post once I have time to decide how to tackle them. Until then, this should give you more than enough information to get started with R10k in your own environment.
If you have any questions/comments/corrections, PLEASE enter them in the comments below and I’ll be happy to respond when I’m not flying from gig to gig! :) Cheers!
EDIT: 2/19/2014 – correct librarian-puppet assumption thanks to Reid Vandewiele