DigiACTive

[insert witty subtitle here]

Vagrant, NFS Synced Folders, Permissions, and Bindfs on CentOS 6

WARNING: highly technical post ahead!

A tale of two synced folder backend implementations

Here at DigiACTive, we’ve made much use of Vagrant to help us manage and deploy a consistent development environment across all our development machines. For the uninitiated, Vagrant essentially allows developers to create a standard configuration — operating system, software packages, configuration files, and so on — which can be automagically deployed (usually using a single command!) into a variety of environments, such as a VirtualBox virtual machine. Our development team uses both Linux and Mac OS X, and our software has a large number of dependencies, so having a standard development environment has proven very useful.

Our Vagrant configuration uses Vagrant’s synced folders to share the developer’s working copy of our Git repository with the test server. Depending on the provider onto which the Vagrant setup is deployed, there are a number of different backend implementations used for synced folders. By default, VirtualBox deployments will use VirtualBox shared folders — which, apart from being notoriously unreliable, have limited support for some important POSIX file system features, such as hard links.

To get around this, one can enable the alternative NFS backend, which doesn’t have these limitations — and can conveniently be enabled with a single line in the Vagrantfile. It all sounds great, right?

Not quite.

Daniel fixed up the last details of our Vagrantfile, gave it a final test, and pushed it up. Ben pulled it down on to his MacBook, ran vagrant up, and after a couple of hours of watching our Chef scripts download many, many dependencies, it all worked! (Well, mostly. Anyway.)

Meanwhile, on my Debian box, I (Andrew) was getting very, very close to tearing my hair out. After fixing up a number of other problems to get Vagrant running properly, I tried to provision the VM — and every time I tried, it complained that the provisioning scripts didn’t have appropriate permissions when working with our shared folders.

NFS and permissions

After some investigation, it became apparent what the issue was.

Our Chef scripts were attempting to change the owner of the configuration files they were modifying to vagrant, the default user set up inside the VM. However, as the synced folders are mounted using NFS, changing the ownership of a file on the remote client (the VM) means changing the ownership on the server (the host system). The host’s NFS export uses root squashing, which maps requests from root on the client to an unprivileged user on the server, so the client didn’t have the root privileges necessary to do that.

However, it soon became apparent that there was another issue here:

[vagrant@localhost vagrant]$ ls -la
total 84
...
drwxr-xr-x   2 1000 1000 4096 Dec  7 05:46 scope
drwxr-xr-x   3 1000 1000 4096 Jan 16 12:13 src
drwxr-xr-x   4 1000 1000 4096 Nov 19 00:51 vagrant

It turns out that NFS shares numeric UIDs between the server and clients, which is fine if all systems use the same authentication backend. This isn’t the case in a Vagrant system, obviously, which is why ls shows the bare UID 1000 here instead of a user name. NFS does not provide a built-in way to map between different server/client users. I believe this can be accomplished with a proper directory service, but we weren’t going to set that up just for a simple development VM.

This also explained why Daniel and Ben had no problems provisioning on their Macs. On my Debian host system, standard UIDs start at 1000, while in the CentOS guest system, UIDs start at 501. Coincidentally, Mac OS X also starts UIDs at 501 — so on their systems, the files were already owned by vagrant.
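If you want to check for this kind of mismatch yourself, you can compare numeric UIDs on each side. The following is only a minimal sketch: it assumes GNU coreutils stat (as found on Debian and CentOS), and owner_matches is a made-up helper name.

```shell
# Print "match" if the numeric owner of a path equals the given UID,
# otherwise print "mismatch".
# owner_matches is a hypothetical helper; stat -c is GNU coreutils syntax.
owner_matches() {
    path=$1
    uid=$2
    if [ "$(stat -c %u "$path")" = "$uid" ]; then
        echo match
    else
        echo mismatch
    fi
}

# Example: on the host, compare the working copy against your own UID...
#   owner_matches . "$(id -u)"
# ...and inside the guest, compare the share against the vagrant user's UID:
#   owner_matches /vagrant "$(id -u vagrant)"
```

On my setup the second check would have reported a mismatch (1000 on the files versus 501 for vagrant), while on the Macs it would have matched.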

I searched around for a few too many hours trying to find a solution, playing around with random NFS mounting options in an effort to make it work…

bindfs and vagrant-bindfs

Enter bindfs.

Bind mounts have existed for a while in various *nixes, allowing users to mount already-mounted filesystems to other locations.

bindfs takes this concept further, though. Using the wonderful powers of FUSE, bindfs allows you to virtually alter ownership and permission bits — which is exactly what I needed! As a FUSE filesystem, bindfs has some unfortunate performance issues, but hey, at least it’d work!
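As a rough illustration of what this looks like by hand, the sketch below re-mounts an existing directory so that every file appears to be owned by a given user and group. The function name and the example paths are assumptions made for this post; -u and -g are the bindfs options that force the apparent owner and group.

```shell
# Mirror an already-mounted directory at a second location, with all
# files appearing to be owned by the given user and group.
# bind_as is a hypothetical helper; the paths in the example are assumptions.
bind_as() {
    user=$1
    group=$2
    src=$3
    dest=$4
    bindfs -u "$user" -g "$group" "$src" "$dest"
}

# Example (run as root inside the guest, with the NFS share mounted
# at /vagrant-nfs and an empty directory at /vagrant):
#   bind_as vagrant vagrant /vagrant-nfs /vagrant
```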

Even better, there’s a Vagrant plugin, vagrant-bindfs, that allows easy configuration of bindfs mounts straight from the Vagrantfile!

I thought I’d found my solution, and proceeded to try and grab the bindfs RPM for CentOS 6…

…which didn’t exist. It used to be built as part of the EPEL repository, but alas, no more!

bindfs, vagrant-bindfs, and RPMs

After much searching around for information about building RPMs, I ended up downloading the RPM spec file from the old EPEL package, updating it to bindfs 1.12.3, and compiling it myself. After learning more than I ever wanted to know about RPM building, I managed to compile a CentOS 6-compatible bindfs RPM.

However, there was still the small issue of installing this RPM during the Vagrant configuration process. For vagrant-bindfs to work, bindfs needs to be installed in the boot-up configuration phase, not the provisioning phase, which meant that adding it to our Chef scripts wasn’t going to work.

Conveniently, vagrant-bindfs makes use of Vagrant’s guest capabilities framework to detect when the guest doesn’t have bindfs installed and trigger an installation capability to download the appropriate package. The installation capability needs to be implemented separately for each type of guest, and at present only Debian is supported.

I implemented a quick-and-dirty hack to extend vagrant-bindfs’s installation capability for CentOS. Because bindfs has a number of dependencies, we can’t just use rpm -i to install it automatically, so I decided to create a YUM repository (which is actually really easy!) and let yum pull in the dependencies for us.
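For anyone who hasn’t set one up before: a YUM repository is essentially a directory of RPMs plus metadata generated by createrepo, and the client only needs a small .repo file pointing at it. Here is a sketch of the client side. The repository id and URL are placeholders rather than our actual repository, and repo_file is a made-up helper name.

```shell
# Emit a yum repository definition for a self-hosted repository.
# Usage: repo_file <repo-id> <baseurl>
# gpgcheck=0 accepts unsigned packages, which is only reasonable for a
# throwaway development VM.
repo_file() {
    cat <<EOF
[$1]
name=$1
baseurl=$2
enabled=1
gpgcheck=0
EOF
}

# Server side (assumes the createrepo package is installed):
#   createrepo /path/to/directory/of/rpms
# Client side, as root (placeholder id and URL):
#   repo_file bindfs-example http://repo.example.com/centos/6/ \
#       > /etc/yum.repos.d/bindfs-example.repo
#   yum install -y bindfs
```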

So, after many, many hours of trying, I finally got bindfs working, and finally managed to provision my testing VM!

If you’re trying to get bindfs working with Vagrant and CentOS 6, and don’t want to go through the same pain I went through, I give you…

Downloads and Links!

DigiACTive provides these downloads as-is, and takes no responsibility for any issues! Use at your own risk!

Setting Up Python 2.7 on CentOS 6.4 the (Really) Easy Way

Setting up Python 2.7 on CentOS 6.4 is not particularly easy: CentOS ships with 2.6 and yum seems to depend on it.

The traditional way to do this is to build Python from source, and there is even a Chef cookbook to do this.

However, this is slow, and when you want to verify that your box builds correctly so that your team can start work on it, you want to minimise these sorts of delays. So I (Daniel) have been hunting around for the best way to install pre-built RPMs.

The closest I found was this excellent article from Red Hat: Setting up Django and Python 2.7 on Red Hat Enterprise 6 the easy way.

To summarise the article, RHEL (and CentOS) support ‘Software Collections’ (SCLs). They’re sort of like virtualenvs or RVMs for packages: isolated, but more lightweight than full chroots. Python 2.7 is a supported collection. Not all the steps are the same for CentOS as they are for RHEL, so here’s an update of the article for CentOS:

  • CentOS makes it even easier than RHEL to set up the SCLs. Instead of writing random files to your filesystem, you just need to yum install centos-release-SCL
  • You’re now ready to do a yum install python27
  • As with RHEL, you can now pop a shell with Python 2.7 by simply doing scl enable python27 bash
    • If you want to run a command made up of more than one word (say, one with arguments), you need to enclose the whole command in quotes, otherwise scl will interpret the leading words as other collections to enable.

As long as you are happy running scl enable python27 bash before work, you’re now set.
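If you’d rather not remember the quoting rule, a tiny wrapper can do it for you. This is just a sketch, and py27 is a name I made up. It joins its arguments into a single string before handing them to scl, so it won’t preserve arguments that themselves contain spaces.

```shell
# Run a single command inside the python27 collection.
# The arguments are joined into one string because scl treats every
# word before the last one as a collection to enable.
# py27 is a hypothetical helper name.
py27() {
    scl enable python27 "$*"
}

# Examples:
#   py27 python --version
#   py27 virtualenv foo
```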

The gotchas

While this is much, much faster than building from source, there are two interrelated gotchas to bear in mind.

Virtualenvs no longer shield you from interpreter differences

I naively expected that virtualenvs would ‘just work’ and would prevent me from having to think about the extra layer of ‘magic’ I had just added.

I was wrong.

If you create a virtualenv inside scl, and then exit scl and try to run Python, it fails:

[vagrant@localhost tmp]$ scl enable python27 bash # enable the Python 2.7 environment
[vagrant@localhost tmp]$ virtualenv foo
[vagrant@localhost tmp]$ exit # leave the 2.7 environment, back to 2.6
[vagrant@localhost tmp]$ source foo/bin/activate
(foo)[vagrant@localhost tmp]$ python
python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory

You get the same error when you try to create a virtualenv with the newly installed binary outside of scl.

The correct solution is to just wrap everything in scl: make scl enable python27 bash part of your workflow, just like source env/bin/activate.

Not everything is easy to wrap in scl enable

For times when it is impossible or very difficult to use scl enable as designed (for example when you’re using someone else’s Chef cookbook), you can work around it. On my 64-bit system, the library it’s looking for is in /opt/rh/python27/root/usr/lib64, so you just need to set an environment variable for that:

env LD_LIBRARY_PATH=/opt/rh/python27/root/usr/lib64 /opt/rh/python27/root/usr/bin/python

If you’re writing a Chef cookbook, you can give an execute block (or similar) an environment parameter:

execute "foo" do
  ...
  environment "LD_LIBRARY_PATH" => "/opt/rh/python27/root/usr/lib64/"
end

Conclusion

Using a packaged Python is a “good thing”. The most obvious reason is that of security: if you’ve hardcoded a version of Python to download, build and install, and a security hole is found and a new version is released, you’ll have to manually update all your recipes. If you’re using distro packages, it’ll happen more or less automatically in your regular updates.

The second reason distro packages are better is simply that of speed. Time is money: why spend it waiting for Python to build?

Hopefully this makes it a bit easier for you all.

Being Part of the Open Source Software Ecosystem

At DigiACTive, we’re building our solution on top of an open source software stack. Practically, we had a number of reasons for choosing open source, including price, ease of access and cross-platform compatibility. More fundamentally, open source software has played a huge role in bringing technological innovations to those who would otherwise not have had access to them, and that’s very much in line with how we see the Digital Canberra Challenge.

It’s therefore very exciting to be part of advancing the state of open source software in a couple of small ways.

Improving django-storage-swift

We recently submitted a set of patches to django-storage-swift, a Python module for interfacing between Django, our web framework of choice, and OpenStack Swift, our object store of choice.

Our main contribution is support for temporary URLs. Temporary URLs provide a way for us to grant users access to their files and their files only in a simple and transparent way. In particular, they are an excellent and simple mechanism for preventing attackers from accessing confidential information by guessing file names. We also improved the out-of-the-box experience for users of django-storage-swift, and made a couple of other minor fixes and improvements.
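For readers unfamiliar with the mechanism, here is a rough sketch of how a Swift temporary URL is signed, following the scheme used by Swift’s TempURL middleware: an HMAC-SHA1 over the request method, the expiry time and the object path, keyed with a secret set on the account. This is an illustration using the openssl command-line tool, not the code from our patches; the function name, key and path are made up.

```shell
# Compute the query string for a Swift temporary URL.
# Usage: temp_url_query <method> <expires-epoch> <object-path> <secret-key>
# temp_url_query is a hypothetical helper name.
temp_url_query() {
    method=$1
    expires=$2
    path=$3
    key=$4
    # The signed body is "METHOD\nEXPIRES\nPATH" with no trailing newline.
    # sed strips the "(stdin)= " style prefix that openssl prints.
    sig=$(printf '%s\n%s\n%s' "$method" "$expires" "$path" |
        openssl dgst -sha1 -hmac "$key" | sed 's/^.*= *//')
    printf 'temp_url_sig=%s&temp_url_expires=%s\n' "$sig" "$expires"
}

# Example: a GET link valid for one hour (placeholder path and key).
#   expires=$(( $(date +%s) + 3600 ))
#   temp_url_query GET "$expires" /v1/AUTH_example/container/report.pdf 'not-a-real-key'
```

Because the object path is part of the signed data, a signature issued for one file is useless for any other, which is what stops an attacker from simply guessing file names.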

We’ve made our changes available in our fork of the project, and contributed them back to the original project with pull requests.

Updating the clamav chef cookbook

We’re using chef-solo with Vagrant to deploy repeatable environments for development. We recently tried to install a cookbook to support clamav so we can virus scan user-submitted files, but discovered that it’s not compatible with the most recent updates to the yum cookbook. We’ve updated it and submitted the changes back to the original author, and they’ve been well received and merged in.

Looking forward

We’ve been making good progress building our framework and we’re looking forward to making increasingly rapid progress towards the Digital Canberra Challenge project!

Hello World

Greetings, Internet!

We’re DigiACTive, a team of students from the Australian National University competing in the inaugural Digital Canberra Challenge.

We’re working with the ACT Territory and Municipal Services Directorate and National ICT Australia to improve the system of obtaining government permits and approvals to hold public events. At the moment, permit applications are long and complex, and it’s easy to accidentally forget to submit important documents — making the process difficult and frustrating for event organisers. On the other side, the government’s legacy systems make it hard to track the progress of applications, making the approval process slower than it ought to be. Overall, it’s all a bit of a mess!

That’s where we come in. Over the next few months, we’ll be developing a proof-of-concept system to help both event organisers and government officials keep track of complex applications while making sure they’re complying with all the appropriate regulations. Our prototype will help guide the ACT Government as it adopts new technology to replace the existing permits system.

It’s early stages at the moment — we’ll try to keep this blog updated as we go so you can have some idea what we’re up to. In the meantime, if you’ve got some ideas on how to improve the permits system, we’d love to hear from you! Contact us at digiactive.canberra at gmail dot com.

- The DigiACTive Team