Home DevOps: Ansible for the Win!

I’ve a Raspberry Pi that I use for various things.  I’m a big fan of these little boxes, but they can be temperamental.  You end up fiddling around to get things installed and sometimes even a simple package update will leave the box a dead parrot.  A couple weeks back, I was just running a regular update and my Pi died a horrible death.  The upside with a Pi is you just re-image the disk and you’re back in business. The disks are small, the process simple.  However, if you’ve customized things all that’s gone.

I decided to rebuild my Pi, after re-imaging it, using Ansible.  Ansible is straightforward and easy to start up with. I’ve used it on and off over time and am proficient with it.  In under an hour I had rebuilt my Pi, with all my customizations with an ansible playbook. It didn’t take much more effort than doing it by hand really, but I did feel like maybe I’d gone a bit far using ansible.  Until the next morning that is.  I’d forgotten a few security measures, and my Pi is accessible to the internet, and in less than half a day someone or some bot had gotten in and taken over. Sigh. Now the whole ansible decision seemed far wiser.  I enhanced my playbook with the security changes, re-images, and reran, and the Pi was back and better in under twenty minutes.

Since that practical example, I’ve done everything on my Pi via ansible and had no regrets.

 

Advertisements

Fluentsee: Fluentd Log Parser

I wrote previously about using fluentd to collect logs as a quick solution until the “real” solution happened.  Well, like many “temporary” solutions, it settled in and took root. I was happy with it, but got progressively more bored of coming up with elaborate command pipelines to parse the logs.

Fluentsee

So in the best DevOps tradition, rather than solve the initial strategic problem, I came up with an another layer of paint to slap on as a tactical fix, and fluentsee was born.  Fluentsee is written in Java, and lets you filter the logs, and print out different format outputs:

$ java -jar fluentsee-1.0.jar --help
Option (* = required)          Description
---------------------          -----------
--help                         Get command line help.
* --log <String: filename>     Log file to use.
--match <String: field=regex>   Define a match for filtering output. May pass in
                                 multiple matches.
--tail                         Tail the log.
--verbose                      Print verbose format entries.

So, for example, to see all the log entries from the nginx container, with a POST you would:

$ java -jar fluentsee-1.0.jar --log /fluentd/data.log \
--match 'json.container_name=.*nginx.*' --match 'json.log=.*POST.*'

The matching uses Java regex’s. The parsing isn’t wildly efficient but keeps up generally.

Grab it on Github

There’s a functional version now on github, and you can expect enhancements, as I continue to ignore the original problem and focus on the tactical patch.

Collecting Docker Logs With Fluentd

I’m working on a project involving a bunch of services running in docker containers.  We are working on a design and implementation of our full blown log gathering and analysis solution, but what was I to do till then?  Having to bounce around to all the hosts and look at them there was getting tiresome, but I didn’t want to expend much energy on a stopgap measure either.

Enter Fluentd

Docker offers support for various different logging drivers, so I ran down the list and gave each choice about ten minutes of attention, and sure enough, one choice only needed ten minutes to get up and running – fluentd.

What it Took

  1. Pick a machine to host logs
  2. Run a docker image of fluentd on that host
  3. Add a couple of additional options on my docker invocations.

What That Got Me

With the above done, all my docker containers logs aggregated on the designated host in an orderly format, with log rolling etc.

But…

The orderly format in the aggregated log,  was well structured but maybe not friendly.  Its format is:

TIMESTAMP HOST_ID JSON_BLOB

So an example might look like:

20170804T140852+0000 9c501a9baf61 {"container_id":"...","container_name":"...","source":"stdout","log":"..."}

Everything in its place but…

How To Deal

So with everything going into one file, and a mix of text and JSON, I settled on the following approach.   First I installed jq to help format the JSON.  Then I just employed tried and true command line tools.

For example, lets say you just want to look at the log entries for an nginx container:

grep /nginx /fluentd/data.20170804.log | cut -c 35- | jq -C . | less -r

That’s all it takes!  Use grep to pull the lines with the container name, cut out the JSON, have jq format it, and view it.

Maybe you just want the log field, rather then the entire entry:

grep /nginx /fluentd/data.20170804.log | cut -c 35- | jq -C .log | less -r

Just have jq pull out the single field.

It’s Low Tech But…

For about ten minutes setup work, and a little command line magic, I’ve got a good solution until the real answer arrives.

Tech Notes

There were a couple of specifics worth noting in the process here.  First, there are at least two ways to direct docker to use a specific log driver. One is via the command line on a run. The other is to configure the docker daemon via its /etc/docker/daemon.json file.  The command line is more granular, you can pick and choose which containers log to which driver. That’s flexible and nice, but unfortunately docker “compose” and “cloud” don’t support setting the driver for a container.  Setting at the docker daemon level as a default solves the compose/cloud issue, but, creates a circular dependency if you’re running fluentd in docker, because that container won’t start unless fluentd is running, but fluentd is in that container.  I went with setting it at the daemon level, and I made sure to run the fluentd container first thing, with a command line option indicating the traditional log driver.

The second noteworthy point was that the fluentd container provides a data.log link that was supposed to always point to the newest log… for me it doesn’t.  I have to look into the log area and find the newest log myself because data.log doesn’t update correctly through some log rotations.

Information Graveyard

I’m trying to learn how to write a skill for Amazon’s Alexa, taking the tried and true approach of searching for tutorials on the internet.  At this point it’s been only frustration.  I’ve found both Amazon written tutorials and third party ones.  Not a single one yet has provided instructions that correspond to the current Amazon tools.  Some are relatively recent, or at least claim to have been recently updated, but not a one has actually provided a working example.  It’s not a matter of slight differences that can be worked around, each one has had at least one step that didn’t seem to correspond to anything in the AWS console as it is today.

Keeping posts up to date is work, I realize. I’m guilty too, of leaving out of date documentation out in the wild, but I make an effort to be responsible, and I’m not expecting revenue from my posts.  How is it that even Amazon’s own tutorials are completely borked?  I tried this about two months back and it was the same story. Since then both the tutorials and AWS tools have been updated, but the new combination is no more workable than the prior.

Some products are notably bad on this point.  Amazon’s SDKs and tools are a consistent pain point. The Spring ecosystem too is bad.  JBoss a mess.  The problem also is made worse by how the developers refactor code and API.  Making changes and improvements in a way that facilitated migrations is a skill.  I wish Amazon acquired that skill.

Compromises

I hit on a really good article on the Law of Demeter. If you’re not familiar with it read that article, and if you do you may find my discussion with the author in the comments. The discussion was around the how rigidly you took the term Law.

Why Quibble The Word Law?

I code mostly in Java, classified as an object-oriented language, and I’ve coded in the OO paradigm in C++, Objective-C and Smalltalk too.  But I started in procedural (Pascal and C) and I’ve worked with functional (Lisp, SML), and smatterings of other languages with their styles too.  They all have their strong points. I’ve learned tactics and patterns from them all and when I’m encounter a situation where one applies, if the current tools can implement the tactic I use it.  I’m not saying anything astonishing here, modern tools are rarely purist in their approaches anymore.

The Law of Demeter is a good OO maxim, but if you’re writing code that handles serialized data, whether if be a RESTful service, or data store persistence, etc. you’ll likely be dealing with composited data (addresses, inside accounts, inside portfolios etc.).  Accessing portfolio.account.address.state violates the Law of Demeter. There are patterns to mitigate some of the issues here, like Data Transfer Objects, the Visitor Pattern, or Facade Pattern,  but depending on the situation some of these cures are worse than the problem.

In Summary

Keep the Law of Demeter in mind as you write/review your code. If it’s been rampantly ignored that certainly is a code smell.  But paradigm “laws” are for purists, and writing software is a pragmatic process… so… yeah… it’s a maxim.

Revisiting Immutables

A while back I looked into the immutables project for Java. It lets you define immutable objects in a very simple way.  Depending on your programming background you might not even see the role of an immutable object, but as more languages and patterns concern themselves with more rigorous and safer practices you may find the call for immutability in your Java.

When I looked into the project I was impressed, but at the time, the current build tools and IDEs struggled with the preprocessing requirements, and Java 7 didn’t complement it either.  All that has changed, and this project is now one I’m adding to my short list of favorites.

What You Get

I’m not going to go into a lot of detail here, there are plenty of good sources for that just a google away, but here’s the concept in brief.  The project lets you define an immutable object by simply adding annotations to an interface or abstract class, for example:

@Value.Immutable
publice interface Item {
  String getName();
  Set<String> getTags();
  Optional<String> getDescription();
}

With that definition in place, a preprocessor will create you:

  • A immutable class that implements your interface
  • A very clean and powerful builder implementation with slick support for collections, defaults, optionals and more.

With the example above you can start using the immutable easily:

Item item = ImmutableItem.builder()
                         .name("Veggie Pizza")
                         .addTag("mushrooms")
                         .addTag("peppers")
                         .description("A pizza with vegetables.")
                         .build();

System.out.println("Order " + item.getName() + " it's a "
                       item.getDescritpion().orElse("nice choice"));

The Mechanics

The mechanics of using the project, at least with gradle, are now as simple as:

buildscript {
    dependencies {
        classpath 'gradle.plugin.org.inferred:gradle-processors:1.2.11'
    }
}

apply plugin: 'org.inferred.processors'

dependencies {
    processor 'org.immutables:value:2.4.6'
}

And the generated code appeared in IntelliJ seamlessly!

With Java 8

Java 8 complements the project in at least a couple of ways.  Java now has its own Optional class so you don’t need to pull in another projects for that. But what I found nicer was using default methods.  The project has long (always?) supported abstract classes as well as interfaces and so there was a way to add pure implementation code to the immutables. But interfaces provide more inheritance flexibility and now with Java 8’s default methods you can get the best if both, interfaces benefits, with accompanying method implementations!

So…

I highly recommend giving the project a test run and incorporating immutables in your toolbox.

GraphQL: A Java Server in Ratpack

I previously wrote about an implementation of a GraphQL server in Java.  That post is showing age because the code is part of a kata project and constantly evolves.

A Concise Example

So, I’ve created a new concise example in GitHub that exemplifies using:

The example is just the code needed to manage a list of companies in memory. It implements basic CRUD operations but with an extensible pattern.

Grab The Code

It’s all covered in ~300 lines of code:

  1. A package with a GraphQL Schema, using a Query and a Mutation class
  2. A GraphQLHandler that dispatches POST requests to GraphQL
  3. An Application that creates the Ratpack server for the GraphQLHandler

The README covers how to run it, and there are a series sample requests included.