Debugging a segfault from Ansible

2017-10-25 23:09

I did a thing this week that solved^w worked around a really interesting problem and helped me learn a good deal about the inner workings of Ansible. I thought I’d share since there’s a lot of really useful Ansible debugging tools I learned about along the way, and maybe I can help someone else who’s encountering this same problem.

tl;dr:

If you are getting segfaults from Ansible on macOS, specifically while interacting with AWS services, try adding environment variable no_proxy='*' to your ansible-playbook command.

The symptom

Ansible playbook runs were sometimes (often) dying with ERROR! A worker was found in a dead state while running on macOS. This was happening in tasks that were looking up data via the AWS API (specifically Credstash) to render templates.

The problem was intermittent for me, but no one else on my project could reproduce it. I also couldn’t reproduce on a Linux VM as I’ll talk about in a bit. Convinced it was something in my virtualenv or laptop but unable to continue without fixing this, I dug deeper.

I was using Ansible 2.3 on macOS 10.12.

The approach

My initial approach involved trying to debug using tools built-in to Ansible. This is done at a few layers which I’ll talk about in a bit.

When those just led me to “yep, worker is ded”, I started to reach for a debugger. I’ve used pdb in the past, but that didn’t work because of Ansible’s execution model (again, more on that in a bit). I found another blog post with a broken link suggesting that epdb was the way to go, but it had the same issue. I did find out later that it’s now built into Ansible!

With all of those options exhausted and in order to keep going, I resorted to adding some debug statements in the upstream libraries.

Built in Ansible Debugging Tools

But first, let's talk about some of those built-in tools to Ansible!

Logging

My project did not have logging enabled initially. This was one of the first things I enabled because I thought maybe it would log something a bit more useful that the generic error message I received. Unfortunately for me, enabling it only confirmed what I already suspected -- a worker was dying unexpectedly. This was generally helpful to have enabled, but not useful in my situation.

Enabling is simple, update your ansible.cfg with a path.

Verbosity settings

Ansible has a couple of options for controlling the verbosity of its output. I’m not clear on what does what, but used the recommendation of a maintainer in an issue on Github. I ran it with both environment variable ANSIBLE_DEBUG=1 and -vvvv set, and gained a whole wealth of info. This also enabled me to more easily add tracing statements which helped in narrowing down the issue.

ANSIBLE_KEEP_REMOTE_FILES

Ansible writes out Python files to a .ansible/tmp directory, executes them, then deletes once it’s done. If you set ANSIBLE_KEEP_REMOTE_FILES=1, it will leave those files in place for you to inspect later. This can be useful if you want to use a debugger (read the next section instead!) or otherwise inspect what’s running during a task. I did a bit of a side quest on this, but it was ultimately not fruitful because the files were not being written in my situation.

Ansible debugger

This was one I learned while writing this, but you can add strategy: debug in a play to launch a debugger when a task in that play fails. This is really handy, but wouldn’t help in this situation as the worker wasn’t gracefully dying.

You can also enable this with an environment variable: ANSIBLE_STRATEGY=debug

The Fun Part

I first started encountering this error after making a somewhat major change in the project to some of the twistier parts of the environment. When I did a bit of searching around using Google for this error, most folks were commenting about running out of memory and the task being killed by OOM killer. This didn’t appear to be the case for me (unlike most of them, I had a machine with 16G of memory and quite a bit free). Still, I ran it using time -l to get memory usage. Seemed perfectly fine, hanging out around 68M or so.

Disabling a particular template let me get past the error (but without this file, obviously), so I was convinced that something in this particular change was causing issues. I added a dummy template in the role before that to do another lookup just to see if it happened to be related to the content in a lookup, but it failed in the same way too.

I talked with a few coworkers, had them try using my branch to run Ansible and no one else could reproduce it. They were on an older version of macOS than I was, but everything else was the same. I also tried it on a CentOS 7 VM and was also unable to reproduce there.

After that, I spent a bit of time working on increasing Ansible’s verbosity. I enabled logging and turned on the higher verbosity settings. I was able to see that the last thing it was doing was doing a lookup in Credstash for a secret. After that, the parent process was cleaning up when it found the dead worker.

Loading LookupModule 'credstash'

Next priority then was to get whatever information possible from these workers. Ansible will fork some processes to work in parallel, throwing tasks onto a queue and letting those workers chew through the jobs. This is managed in lib/ansible/executor/task_queue_manager.py, and a coworker helped point out that there were only 3 exit codes that could cause this error, found in has_dead_workers(). Some debug logging in this function (display.debug(..)) helped us figure out that the workers were segfaulting (exit code -11).

Since we knew that the last thing it was doing was attempting to load the Credstash module, we started with that module.

A few (dozen) display.debug() statements later, we discovered that when segfaulted it was doing the credstash.getSecret() call. Knowing that this rabbit hole was deep, I spent a bit of time first reviewing changelogs with both Credstash and boto3 with no success in finding anything useful.

When there was nothing useful to be found there, I started adding trace logging all over the third party libraries (boto, botocore, and requests) and got to a call to a function inside Python’s Standard Library that was failing: proxy_bypass(), passing a string which was the URL of DynamoDB for my region.

One quick search on Google, and I found a bug in Python related to this. The workaround listed (add no_proxy=’*’ environment variable) addressed it and there were no more segfaults.

I haven’t really decided where to go from here, aside from adding that environment variable to my Ansible calls. I can add it to my .zshrc file so it’s always set, which is one way to address it.. but only for myself. At some point I may take a pass at fixing it, but a lot of folks much smarter than me have already worked on it and haven’t addressed it.

MySQL Secure Installation Using Ansible

2016-01-22 21:56

Recently I was setting up MySQL using Ansible, and wanted to ensure the mysql_secure_installation script or an equivalent was run to get rid of the default users and db. Turned out that writing a task in Ansible wasn't all that bad using the check_implicit_admin feature of the MySQL plugins, it could even be idempotent.

- name: delete anonymous MySQL server user for {{ ansible_nodename }} mysql_user: login_user=root login_password='{{ mysql_root }}' check_implicit_admin=yes user="" host={{ item }} state="absent" with_items: - "" - "{{ ansible_nodename }}" - localhost - name: Change root user password on first run mysql_user: login_user=root login_password="{{ mysql_root }}" check_implicit_admin=yes name=root password={{ mysql_root }} priv=*.*:ALL,GRANT host={{ item }} with_items: - "{{ ansible_nodename }}" - 127.0.0.1 - ::1 - localhost - name: remove the MySQL test database action: mysql_db login_user=root login_password="{{ mysql_root }}" db=test state=absent

ansible mysql

Weekend Project: Library Checker

2015-10-12 04:21

This weekend I finally took some time to work a project that I've been toying with for a while. I built a small app that for now is being known as LibChecker. It's a Flask app which takes a URL to a public Amazon wishlist and searches for each item on the list at my local library (Hennepin County Library, hclib.org). Here's a screenshot of the app in action:

screenshot

The biggest challenge here was that neither Amazon wishlists nor the library have any sort of API. The app uses the BeautifulSoup library to scrape results. Since this is just screen scraping, there's quite a few ways that it can break or be wrong. The search on the library side is pretty aggressive, stripping out editions and subtitles. Then, it matches against a single author (the library gives back only a single author when in search view). Despite all of this, it does the job that I set out to do here. The links given are to a specific result within a set of search results, so the user can easily browse for related titles if the result isn't quite correct.

The code for this is available on BitBucket. The app has been deployed on OpenShift at libchecker.wyattwalter.com.

2015 Reading List - Part 2

2015-10-10 03:37

Continuing on with my post from earlier this year, here's my list of books for Q3 of 2015. This time I read most of the books during the first month or so and then slowed down considerably for a while (too busy with other things).

Good Boss, Bad Boss: How to Be the Best... and Learn from the Worst - Picked this one up at the recommendation of my own boss. A great read, for sure. The biggest takeaway for me reading it the first time is an understanding and acceptance that I'll never truly know what it's like to work for me. And that's ok. This one was packed with advice and definitely worth the read for me.
The Practice of System and Network Administration, Second Edition - This one is a classic in my field. Would definitely recommend at least a skim if you're a SysAdmin and have never read it. I plan on picking up the later released Practice of Cloud Administration at some point in the near future.
Zero to One: Notes on Startups, or How to Build the Future - This book is a long form of asking seven questions to answer when considering launching a business. It changed my perspective a bit and helped me evaluate previous failures a bit more.
The One Minute Negotiator: Simple Steps to Reach Better Agreements - I picked this book up at the library by accident. I had read The One Minute Manager in the past and confused it with this one sitting on the shelf. I'm glad I did. It's a short read, but a great prescriptive plan for analysis of whether or not a situation is a negotiation and what a good approach may be.
Resilience and Reliability on AWS - I have had this one on my wish list for a while, by the title and description, I thought it would be really helpful. Was a total disappointment to me. The content wasn't terribly deep and was a bunch of source code which could've been a repository on Github.
Execution: The Discipline of Getting Things Done - Thinking about getting things done at a level that's larger than myself is something I've thought about often, but rarely a thing I've been responsible for. This book was really an interesting one to me, but is focused at the executive layers of an organization.
Made to Stick: Why Some Ideas Survive and Others Die - While there's a good deal of creativity and art in advertising and other types of campaigns, Dan and Chip Heath argue that there's a framework to follow in order to convey a message that "sticks". Definitely one that I would recommend to anyone looking to be heard with an important message (who isn't?).

reading

DevOpsDays Minneapolis 2015

2015-07-10 03:31

I had the pleasure of attending DevOpsDays Minneapolis again this year. This is my third time at a DevOpsDays, the 2nd such event here in Minneapolis. Below are some of my highlights and takeaways from the conference this year.

Designated Ops

During Katherine Daniels' talk on DevOps at Etsy, she described what they call there "Designated Ops", which is a person from the Ops team designated (but not dedicated!) to each smaller group within their product engineering team. Each team's designated Ops person would attend standups most days for that team and develop a relationship over time with that team and be their first point of contact into the Ops team. This helps them develop familiarity, trust, and get an operational view of things in the development phase, rather than being reactive when some new feature shipped without any input.

"Developer of the week"

Relatedly, later a similar idea came out during an open space session (unfortunately, I don't know the person or company involved here so I can't give proper credit) where the development team had a rotation of what they called the "developer of the week". During a developer's "week", they would physically be sitting amongst the customer support team, helping answer questions about their company's product, seeing pain points for customers and customer support folks, as well as sharing tips and tricks that maybe they wouldn't have realized would be useful.

Remote Work

During another open space, we talked quite a bit about remote work and how to improve bonds between team members even though we can't often be physically in the same space often (for some people, ever). Some action items for me:

buy a real camera! - Using the little web cam that's built into the laptop is not flattering to anyone, you're almost always looking at someone at weird angles. Video chat really is the best way we have today to interact with a remote team today. Invest in making the experience as good as it can (reasonably) be.
everyone uses their camera - There can be exceptions to this rule occasionally, but if you're having a meeting everyone should be engaged and connecting which can be made far better by looking at each other and watching responses. If the team is not engaged, what's the point of the meeting?
establish protocol for communicating and asking/answering questions - This is especially important when working with teams which are cross-cutural. When asking questions via text or video chat, it can be impossible or difficult to read someone's body language. Examples in the discussion:
- When answering a question, be sure to ask back "did that answer your question?" if no positive response is given.
- If someone is intimidated by asking a group and asks via private message, ask the question to the group on their behalf (not revealing who asked it).

"Finished"

Another open space that I got a lot of value from was one proposed to talk on how to "finish" work in an either IT or development that can never really be "done". This is a reality that's faced in a lot of fields. If you're not careful this list of "I'd like to get back to that" can start to become depressing. This topic deserves a post on its own, but I'll add the takeaways that I thought were helpful for my situation.

You don't really need to accomplish more to feel accomplishment. Morale is really the most important thing. While getting tasks completed is important, accomplishing more tasks more quickly will happen naturally as a team accrues wins and gains momentum. In a lot of cases "done" really means finishing with the current iteration. While you should start with an end goal in mind, don't create grandiose tickets with all of the things you'd like to do on a project that stay open indefinitely.

Related to the previous point, retrospectives and demos are really important. Often in Ops work demos are difficult. Patching and maintenance work don't demo well, so this one may be used sparingly for some teams. Retrospecitves are something that I can't say I've ever participated in, will certainly be trying this one.

TODO

Recommended followup reading and watching:

The Year Without Pants: WordPress.com and the Future of Work - a book on remote work at Wordpress, recommended to me by someone in the remote work open space
The Box: How the Shipping Container Made the World Smaller and the World Economy Bigger - a book on literal shipping containers, about friction in the shipping industry and how it was changed, recommended by Mary Poppendeick during her talk "The New Software Development Game"
Focus: Use Different Ways of Seeing the World for Success and Influence - another recommendation by Mary, on motivation and how we view the world
This is Lean: Resolving the Efficiency Paradox - I believe also recommended during Mary's talk, don't remember the context for this one now
"How DevOps Can Fix Federal Government IT" by Mark Schwartz at DevOps Enterprise Summit 2014 - recommended by Joshua Zimmerman during his talk on DevOps and public sector

conference devopsdays

2015 Reading List - Part 1

2015-07-03 22:11

Inspired by other folks posting reading lists and wanting to start writing again, I've decided to start compiling my list of books read and post them periodically. So far this is the list I've recorded for the first half of 2015. Since I didn't start recording the books I read until recently, this list is only the highlights (and lowlights) that I can remember.

How to Win Friends & Influence People by Dale Carnegie - This one is a classic in the self-help genre. The book was less about winning friends as the title suggests, and more about influence and getting along with strangers. I'd definitely recommend it to anyone who struggles with conflicts and affecting behavior from others.
How to Fail at Almost Everything and Still Win Big: Kind of the Story of My Life by Scott Adams - Yes, that's the same Scott Adams of Dilbert fame. His storytelling is amazing and this book has actually had a major impact on my year. Specifically, his generic life advice - diet, exercise, and "using systems not goals" - has helped me quite a bit since I read the book. Not bad for a book by a guy that creates comics for a living.
Quiet: The Power of Introverts in a World That Can't Stop Talking by Susan Cain - This one was recommended at a local tech Meetup, so I decided to check it out from the library. As someone who considers himself an introvert, it was eye-opening to read examples of how my world experiences are vastly different from others with less of this trait. Definitely worth a read, whether you consider yourself introverted or not. I did skip over a bit of the first section of the book as it got a bit repetitive to me while describing all the reasons that it's an "extroverted world".
Managing Humans: Biting and Humorous Tales of a Software Engineering Manager by Michael Lopp - This book has been on my shelf for a while. I don't even remember where or why I picked it up. Since I recently became a manager (again), I figured now was as good a time as any to finally read it. A pretty generic read about managing software developers, covered quite a few topics that I didn't think about as someone who went from being an engineer to manager without a lot of formal training.
Great by Choice: Uncertainty, Chaos, and Luck--Why Some Thrive Despite Them All by Jim Collins - I've read a lot of the other popular books by Collins (Good to Great, Built to Last, How the Mighty Fall) and this one definitely didn't disappoint. None of the concepts about "greatness" and achivement introduced are new, but reiterating each of the concepts in the compilation and reinforcing their importance was good stuff.
Start with Why: How Great Leaders Inspire Everyone to Take Action by Simon Sinek - I read this one immediately following Great By Choice which was kind of interesting since several of the companies studied are the exact same ones. The two authors came to slightly different conclusions about what made those companies great (though not necessarily competing), which is expected since there's really not one solution for being "great". The book was a bit repetitive, but perhaps my perspective was skewed by reading a book on a similar topic so close together.
The Goal: A Process of Ongoing Improvement by Eliyahu M Goldratt and Jeff Cox - I really enjoy a well-done education fiction book, and The Goal definitely fits into that category. This fictional book is used to provide an introduction to the lean manufacturing techniques and does so in a really easy-to-read way.
Rework by Jason Fried - This book is a collection of essays from 37 Signals capturing lessons learned while building their business. It's been on my list of books to read for a while and was a bit disappointed at the lack of depth.

That's it for now. Hopefully a longer list for the second half of 2015 since I've started to record which books I read in a consistent place.

reading

Cleanup After Conversion from Aperture to Photos

2015-05-04 03:53

I finally got around to upgrading my Mac to Yosemite this past weekend. With the upgrade, Aperture stopped working and since Apple since stopped supporting it and pulled it from the App Store (jerks), I was stuck converting my library over to the new Photos app.

The actual conversion was fairly smooth. My photos all imported, and the old library was renamed "Aperture Library.migrated" or something like it on disk. That's where it got a bit confusing for me. See, I don't have that large of a library (130G or so), but it's larger than it really needs to be. Anyone who shoots in raw can likely relate. Each image at 20M+ - with a really low-end and outdated DSLR - can add up quickly, especially when you don't cleanup things like 30 blurry shots of that bear we encountered at Sequoia National Park a few years ago.

I have been wanting to cleanup my library for quite some time and decided that since I'd be learning this new Photos app, now seemed like a great time to do some cleaning. Since I wanted to know how much impact my cleanup efforts had actually made (what engineer doesn't measure before they start?), I did my usual thing when checking out disk space used. I hopped over to iTerm, cd'd to my Pictures directory, and started poking around. Turned out, du was showing ~130G for my Aperture library, but something like 6G for Photos. This was sort of weird to me since I thought my library had been "migrated" and now I didn't know what to do with that old Aperture library.

I sort of ignored this situation at first and started cleaning out photos (since I did know roughly where I had started). However, all of my efforts to clean out photos (I calculated that it should've been at least 10G at some point) had only produced more disk space usage than when I started. This was due to Photos generating thumbnails, which I assumed would happen, but no positive progress was made on the Masters part of the library. I read a few forum threads about older versions of Aperture not deleting originals, and was dreading the mess that I was likely going to have to clean up.

After exploring for a bit more (okay, like an hour), I discovered that du showed the Photos library was ~120G when I ran du directly against it, rather than using "*" in the Pictures directory. Clever. Photos was using hardlinks to maintain the same file structure for the "Masters" directory, but not duplicate space usage after the migration of the library. After quickly experimenting and confirming that the inodes were indeed the same between the two libraries for a few random masters, I realized the only thing left to do was delete the old Aperture library to get rid of the references to the deleted files (after safely archiving it to an additional external drive, of course).

Except for one more problem. It turned out that Time Machine will create a local backup copy of files (not cleverly using hardlinks!) while you delete them, until they can be transfered safely to your backup volume. This is great, except that I didn't have an extra 130G on my disk at the time to store them. There's also apparently no safety mechanism for Time Machine to detect this exact situation. Fortunately, I caught on around the 94% mark while it was emptying my trash. Cancelling the trash emptying (took a long time and) made some of the temporary files disappear, but not all. I was still stuck around 90%.

Looking at the storage data in the "About This Mac" window, it clearly showed that a lot of storage was still being used by "Backups". Since I had actually planned on switching Time Machine volumes, a quick sudo tmutil disablelocal cleared out that storage. Since this is normally a useful feature, I also turned it back on right away with sudo tmutil enablelocal.

Good times. I'm pretty late on this one, but hopefully I can help someone else avoid the same pains I went through!

apple photography

Documentation and Monitoring

2014-08-15 04:55

At DevOpsDays Minneapolis a few weeks ago, we were discussing the topic of documentation within the context of Operations/DevOps/IT/whatever. After I talked a bit about what we did at our company, I realized that we were sort of unique, and others found this technique useful. I thought I'd share a bit about what we're doing. Certainly we're not all the way there yet, but striving to improve over time.

Playbooks

For any check that gets added to the monitoring system, what we call a "playbook" must be written before the pull request is approved and merged. A playbook is essentially a document describing the following things:

What this check is checking - this seems obvious, but should include things like data sources; the "what"
What the impact of this alert could be - what services would be affected; the "why"
Where to go digging for more info on what could've caused this state; at a minimum, start with a log file
Bonus: an idea of what "normal" looks like

A playbook doesn't necessarily have to have all that much info, just enough to give the person who's oncall a fighting chance. This also seems to be a fairly nice format to begin documenting something. It gives practical knowledge at the time it's needed.

Notifications

We've recently migrated to Sensu, which helps us out a lot here. Since it acts as a 'monitoring router' and lets us setup everything as we want, we can easily add arbitrary data and display them in alerts however we want. All checks are defined in Puppet and we add a playbook in the custom data on a check like so:

$wiki = 'http://wiki.example.com' check 'elasticsearch' { ... custom => { ... playbook => "${wiki}/Elastic_Search#Dealing_with_Pages", ... } }

Once this playbook is defined as an attribute on the check, you can easily add it to the message goes out. In this case, we're using the standard mailer handler from the community repo with lines added something like this:

playbook = "Playbook: #{@event['check']['playbook']}" if @event['check']['playbook'] ... body = <<-BODY.gsub(/^ {14}/, '') #{@event['check']['output']} Host: #{@event['client']['name']} Timestamp: #{Time.at(@event['check']['issued'])} Address: #{@event['client']['address']} Check Name: #{@event['check']['name']} Command: #{@event['check']['command']} Status: #{@event['check']['status']} Occurrences: #{@event['occurrences']} #{playbook} BODY

This adds a link in the email body to the playbook if it exists (like I said, not perfect yet :)).

Conclusion

When faced with the challenge of building out documentation for an environment, writing down the 'what', 'why', and 'where to start digging' when something pages is an excellent (and seemingly often overlooked) first step. No one has time to read a 10 page manual in this scenario which will force the writing to be concise and as helpful as possible. Obviously, implementation of this concept will vary wildly, depending upon which monitoring solution you might use.

documentation monitoring sensu

Remote Access to AirPort via SSH Port Forwarding

2013-07-24 22:18

At home I use an AirPort Extreme as my firewall / access point / all of that, with a few ports forwarded through to access some services I have running on a small box at my house (among them SSH access to mange things). I ran into a situation while traveling where I wanted to make a couple of changes to the port forwarding configuration, but did not have the option to "Allow setup over WAN", and also have no desire to enable this. I've always had home routers that have some web GUI, so I just used SSH port forwarding and hit the web interface in my browser. As it turns out, it's not a lot more difficult with an AirPort Extreme, and still lets me leave the AirPort mangement port not to be exposed to the Internet. In my situation, I have my MacBook along and have a Linux machine at home that I access via SSH.

The AirPort Utility communicates with the AirPort via TCP port 5009. You can setup a port forward with something like:

ssh -L 5009:172.16.0.1:5009 home.mydomain.com

Of course, you'll want to change your internal IP address (172.16.0.1 in my case) to whatever the internal IP is for your AirPort and then change the SSH hostname to whatever you use. If you don't know the IP of your AirPort and have a fairly typical router, you can use 'route' on Linux or Mac systems to find the IP of your default gateway. For most people, this will be the IP of their AirPort.

Leave the above SSH session open in the background and launch the AirPort Utility on your Mac. When you launch the Utility, it will probably not find one (unless you happen to be on another network with an AirPort). Go to File -> Configure Other. In the pop-up, enter "localhost" in the Address field, and your password into the Password field. You should then be able to manage the AirPort via the utility. Note that as usual any changes you save to the AirPort will trigger a reboot and probably cut off the connection you established above. You will likely have to restart the tunnel after each time the box reboots.

airport ssh

Deploying Your Nikola Site to S3 (and CloudFront)

2013-07-03 00:26

A few weeks ago, I made a post about moving my site into S3 and a few things unexpected issues that I ran into. What I didn't mention was my migration to Nikola from WordPress and then my actual deployment into S3. I've outlined most of the deployment steps I went through. A basic understanding of AWS services (specifically S3 and CloudFront) is assumed, but I try to be as helpful as I can.

S3 Setup

This is fairly well documented in the S3 documentation, but to host a website you first need to create a bucket with the name of the hostname you want to use. For me, I have two buckets: whatan00b.com and www.whatan00b.com (more on the two buckets later). Once the buckets are created, you'll want to enable website hosting on them. In the S3 interface, if you click on Properties for the bucket, expand the "Static Website Hosting" bit. Check the box for "Enable website hosting". The default document root of index.html is perfectly fine. Error Document is something you'll likely want to come back to later.

If you created two buckets like I did, you'll want to do that step only on the primary bucket you want to host the website through. For the other bucket (for me, whatan00b.com), instead check the radio button for "Redirect requests to another host name". You'll want to fill out the hostname you used in the other bucket to redirect requests and save it.

"Pretty URLs"

This is somewhat related to the migration from WordPress, but I really didn't want to have URLs that pointed directly to the static HTML file, I wanted to hit the directory index to view pages. This was mainly to avoid adding redirects in S3. This is easily doable, I just like the cleaner look of URLs. If you left index.html as the document root as index.html in the S3 setup step, things should be fine on the S3 side.

On the Nikola side, I used the PRETTY_URLS option and set it to True in config.py. At the time of this writing, I had to install Nikola from the master branch on Github to make this work (see example) . It appears this feature will be released in 5.4.5. Enabling this option creates a directory for each blog post, with an index.html inside. Note that S3 will redirect and add a trailing slash if the user does not add one themselves (this was fine for me).

Deploying the files!

I have been using Fabric for other projects as of late, so I actually skipped the deployment bit altogether inside of Nikola. The deployment piece inside of Nikola currently only allows users to specify a list of commands that make up their deployment anyway (I assume this is generally an rsync to a list of servers or something along those lines for most folks). This is probably easily extendable, but I'm already in the habit of doing a 'fab deploy' when I am ready to push out code anyway, so I chose to stick with Fabric and ignore the Nikola deployment bit altogether for now. You could easily write the same type of function and add a "fab deploy" to the deployment section of your Nikola config file. You can view my deployment script over on Bitbucket (and the whole site for that matter!). It's probably not the most efficient file sync function ever written, but it's mine and it works for me.

If you don't want to go through all of this, there are lots of FTP / SFTP / whatever clients out there that support S3. You can also easily use something like Cyberduck on a Mac or Windows to push the files to your bucket.

CloudFront Setup

The CloudFront setup was one that bit me. This is something that I mentioned in my previous post, but I'll reiterate it again. For this setup, don't set CloudFront to use the S3 bucket as its source. Set the origin to be a "Custom Origin", using the S3 bucket website hosting endpoint as the URL. This can be found in the bucket properties section you looked at in the S3 setup section. If you don't do this, the site index will work, but lower directory indexes won't work.

Once the distribution is created and origins set, you should be able to hit the origin URL given in the AWS console. Once you are happy with the way things look, you can add an alias on the CloudFront distribution with the hostname you want to use for your site. After that's complete, you're just a CNAME away from the site being hosted on CloudFront. I happen to use Route53 as my DNS provider, so I just added an alias for the CloudFront distribution and things work nicely.

aws cloudfront nikola s3

What a n00b!