GraphQL Java 6.0

I’ve been a fan of GraphQL ever since I first tried it.  I push back against RESTful APIs to anyone who will listen (or not).  I’ve written a few posts about it (GraphQL: Java Server, JavaScript Client; GraphQL: Java Schema Annotations; GraphQL: A Java Server in Ratpack).  What I haven’t done is stay current.  I got hooked on graphql-java at version 3.X and decided the annotations were the best way to go; sadly, the annotations’ development stalled and made upgrades tricky, and so I didn’t.  But upgrading was a constant nagging itch, and finally I scratched it.

This post will discuss a Ratpack server using GraphQL-java 6.0. I should note that as I did this work, the annotations finally released an upgrade. Doh.

GraphQL Java 6.0

I committed to upgrade. The annotations had not kept up, so this meant a bit of a rewrite.  Normally I’m pretty suspicious of gratuitous annotation use.  Annotations often mask too much of what’s really going on, and they tend to spray the information about one concern throughout your code, making it hard to locate coherent answers on a topic.  That was exactly the case here.  Leaving the annotations behind meant:

  • I had to figure out what previous magic I now had to handle on my own.
  • I had to determine just how deeply into my code they’d rooted themselves.

I tried to approach it intelligently, but in the end I went with brute force: I removed the dependency and kept trying to build the project, ripping out references until, while no longer working, the project built and the annotations were gone.  Then I set about fixing what I’d broken.

What Was Missing?

Basically without the annotations there were two things I needed to repair:

  • Defining the query schema
  • Wiring the queries up to the data sources

Defining Your Schema

GraphQL-java 6.0 supports a domain specific language for defining your schema, known as IDL.  It’s a major win.  First, it gets your schema, which is by definition a single concept, into one place and makes it coherent.  Second, they didn’t go off and write “Yet Another DSL” but instead supported one that, while not part of the official spec, is part of the reference implementation and has traction in the community. Nice.

Wiring Up Your Data Sources

The best practice for this now is using the “DataFetcher” interface. The name is a bit misleading, since these aren’t just for your queries (i.e. fetching data) but also for your mutations (modifying data).  The name is weak, but the interface and its use are a breeze.

To the Code

I did all this work on my snippets server kata project, so for a richer example go there, but for the sake of clarity here we’ll look at the more concise Ratpack GraphQL Server example.

The Dispatcher

This hardly changed at all.  It’s still as simple as grabbing a JSON map, pulling out the query and variables, and executing them:
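A sketch of that dispatcher, rather than the repo’s exact code (it assumes a `graphql.GraphQL` instance built from the schema and wiring discussed below, and Ratpack’s Jackson module for payload parsing):

```java
import java.util.Collections;
import java.util.Map;

import graphql.ExecutionInput;
import graphql.ExecutionResult;
import graphql.GraphQL;
import ratpack.handling.Context;
import ratpack.jackson.Jackson;

// A sketch of the dispatcher: grab the JSON map, pull out the query
// and variables, execute them, and render the data back as JSON.
public class GraphQLHandler {
    private final GraphQL graphQL; // built from the schema and wiring

    GraphQLHandler(GraphQL graphQL) {
        this.graphQL = graphQL;
    }

    @SuppressWarnings("unchecked")
    public void handle(Context ctx) {
        ctx.parse(Jackson.fromJson(Map.class)).then(payload -> {
            String query = (String) payload.get("query");
            Map<String, Object> variables = payload.get("variables") == null
                    ? Collections.emptyMap()
                    : (Map<String, Object>) payload.get("variables");
            ExecutionResult result = graphQL.execute(
                    ExecutionInput.newExecutionInput()
                            .query(query)
                            .variables(variables)
                            .build());
            ctx.render(Jackson.json(result.getData()));
        });
    }
}
```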

Pretty straightforward.

Defining the Schema: IDL

So in this trivial example all I have are Company entities, defined with this bean:
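A minimal stand-in for that bean (the real one lives in the linked repo; the fields here are illustrative):

```java
// A minimal Company bean. The fields are illustrative stand-ins for
// the real entity in the repository.
public class Company {
    private String name;
    private String address;

    public Company() {
    }

    public Company(String name, String address) {
        this.name = name;
        this.address = address;
    }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getAddress() { return address; }
    public void setAddress(String address) { this.address = address; }
}
```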

All I wanted to support was: get one, get all, save one, and delete one.  So I needed to define my Company type, two queries, and two mutations. Defining this in IDL was easy:
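A sketch of that IDL (field names here are illustrative, not necessarily the repo’s exact schema):

```graphql
# One type, two queries, two mutations.
type Company {
    name: String!
    address: String
}

type QueryType {
    companies: [Company]
    company(name: String!): Company
}

type MutationType {
    save(name: String!, address: String): Company
    delete(name: String!): Boolean
}

schema {
    query: QueryType
    mutation: MutationType
}
```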

Loading The Schema

I just tucked my schema definition into my resources folder and then:
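Loading it is only a few lines; a sketch, with an illustrative resource name:

```java
import java.nio.file.Files;
import java.nio.file.Paths;

import graphql.schema.idl.SchemaParser;
import graphql.schema.idl.TypeDefinitionRegistry;

// Read the IDL off the classpath and parse it into a type registry.
String idl = new String(Files.readAllBytes(
        Paths.get(getClass().getResource("/schema.graphqls").toURI())));
TypeDefinitionRegistry typeRegistry = new SchemaParser().parse(idl);
```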

Wiring The Data to The Schema

In GraphQL-java, the way I chose to do this is with DataFetcher implementations. So, for example, finding a company by name in a map would look roughly like:
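A sketch of such a fetcher, where a simple map stands in for a real data source:

```java
import java.util.Map;

import graphql.schema.DataFetcher;
import graphql.schema.DataFetchingEnvironment;

// Fetch a Company by the query's "name" argument from a simple map.
public class CompanyFetcher implements DataFetcher<Company> {
    private final Map<String, Company> companies;

    public CompanyFetcher(Map<String, Company> companies) {
        this.companies = companies;
    }

    @Override
    public Company get(DataFetchingEnvironment environment) {
        return companies.get(environment.getArgument("name"));
    }
}
```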

So that’s the way to “fetch” the data, but how do you connect this to your schema? You define a “RuntimeWiring”:
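Roughly like so (the type and field names must match those declared in the IDL; the lambdas close over the same illustrative map of companies):

```java
import graphql.schema.idl.RuntimeWiring;

// Tie each field declared in the IDL to a fetcher.
RuntimeWiring wiring = RuntimeWiring.newRuntimeWiring()
        .type("QueryType", builder -> builder
                .dataFetcher("company",
                        environment -> companies.get(environment.getArgument("name")))
                .dataFetcher("companies",
                        environment -> companies.values()))
        // MutationType's save and delete are wired the same way.
        .build();
```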

And then you associate that wiring with the schema you loaded:
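A sketch, assuming the type registry and wiring from the previous steps:

```java
import graphql.GraphQL;
import graphql.schema.GraphQLSchema;
import graphql.schema.idl.SchemaGenerator;

// Combine the parsed registry and runtime wiring into an executable
// schema, then build the GraphQL instance the dispatcher uses.
GraphQLSchema schema = new SchemaGenerator()
        .makeExecutableSchema(typeRegistry, wiring);
GraphQL graphQL = GraphQL.newGraphQL(schema).build();
```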

And Then…

Well that’s it.  You’ve:

  • Created a GraphQL dispatcher
  • Defined your entities
  • Defined your GraphQL schema
  • Created queries
  • Instantiated the schema, wired in the queries

Done.  Take a look at my GraphQL server in Ratpack for the complete working code.



Travis-CI to Docker Hub

More and more of my work involves docker images.  I consume them and produce them. My standard open source project CI/CD tool chain is Java, Gradle, GitHub, Travis-CI, CodeCov and Bintray.  End to end, free and functional.

Recently I moved my snippets server app into a docker image.  This added Docker Hub to my stack, and happily it was an easy addition because of Gradle and Travis-CI.

Setting up the Build

A quick search and review turned up the gradle-docker-plugin.  With this plugin and access to a docker engine you can create, run and push docker images easily. The docs for the plugin will walk you through how to add it to your build.gradle. Also note, to use the types in the tasks below, you’ll need proper import statements. My build.gradle is pretty clean, but I’ll walk through some details below.

Creating the Dockerfile

The plugin is pretty flexible, so the following notes are not the answer but my answer.  Rather than create a fixed Dockerfile, I create mine on the fly from gradle:

task createDockerfile(type: Dockerfile) {
    def labels = [maintainer: 'nwillc@gmail.com']

    destFile = project.file('build/docker/Dockerfile')
    from 'anapsix/alpine-java:8_server-jre_unlimited'
    label labels
    exposePort 4567
    runCommand 'mkdir -p /opt/service/db'
    volume '/opt/service/db'
    copyFile "libs/$project.name-$project.version-standalone.jar", '/opt/service'
    workingDir '/opt/service'
    defaultCommand '--port', '4567', '--store', 'H2'
    entryPoint 'java', '-Djava.awt.headless=true', '-jar', "$project.name-$project.version-standalone.jar"
}

There’s a fair bit going on there, so let’s walk through it.  First off, I’m creating the Dockerfile down in my build directory.  Then I’m using the plugin to do standard Dockerfile operations like setting the base image, creating folders, copying in artifacts, and setting up the command and entry point.  The plugin sticks pretty close to the Dockerfile syntax, so you should be able to pick it up easily.  It’s worth noting that because this is in Gradle, I can use Groovy variables to denote things like the artifact name.

Creating the Image

With the task to create the Dockerfile done, building an image is trivial.

task buildImage(type: DockerBuildImage, dependsOn: [assemble, createDockerfile]) {
    inputDir = project.file("build")
    dockerFile = createDockerfile.destFile
    tag = "nwillc/$project.name:$project.version"
}

So here I just indicate where I’ll root the build, in the build folder, and grab the previously created Dockerfile, and tag the image. Running this task will create your artifact, create your Dockerfile, and build the image.

Pushing the Image

I push my images into docker hub’s free public area. So, all I need to add to my build is info about my credentials and a push task.

docker {
    registryCredentials {
        url = 'https://index.docker.io/v1/'
        username = 'nwillc'
        password = System.getenv('DOCKER_PASSWD')
        email = 'nwillc@gmail.com'
    }
}
task pushImage(type: DockerPushImage, dependsOn: buildImage) {
    imageName buildImage.tag
}

Note I grab the password from an environment variable. That keeps it out of my GitHub repo, and you can set it securely in Travis-CI.

Running the Build and Doing the Deploy

With your build.gradle ready to go, and your DOCKER_PASSWD set you can now locally do a ./gradlew pushImage and it should all work, ending up with the image in docker hub.

But now let’s get our CI/CD working. Travis-CI supports all you need. Set DOCKER_PASSWD in your Travis-CI account’s profile, and then add the relevant bits to your .travis.yml. Here are the key elements:

sudo: required
services: docker
after_success:
  - docker login -u nwillc -p ${DOCKER_PASSWD}
  - ./gradlew pushImage

You’ll need sudo, you have to indicate you’re using the docker service, you’ll need to log in to docker hub, and finally you push the image after a successful build.

Done

With your build.gradle, and .travis.yml enhanced, it’s done. Every push to github builds and tests and if everything is happy your docker hub image is updated.

Home DevOps: Ansible for the Win!

I’ve a Raspberry Pi that I use for various things.  I’m a big fan of these little boxes, but they can be temperamental.  You end up fiddling around to get things installed and sometimes even a simple package update will leave the box a dead parrot.  A couple weeks back, I was just running a regular update and my Pi died a horrible death.  The upside with a Pi is you just re-image the disk and you’re back in business. The disks are small, the process simple.  However, if you’ve customized things all that’s gone.

I decided to rebuild my Pi, after re-imaging it, using Ansible.  Ansible is straightforward and easy to get started with. I’ve used it on and off over time and am proficient with it.  In under an hour I had rebuilt my Pi, with all my customizations, from an Ansible playbook. It didn’t take much more effort than doing it by hand really, but I did feel like maybe I’d gone a bit far using Ansible.  Until the next morning, that is.  I’d forgotten a few security measures, my Pi is accessible from the internet, and in less than half a day someone or some bot had gotten in and taken over. Sigh. Now the whole Ansible decision seemed far wiser.  I enhanced my playbook with the security changes, re-imaged, reran, and the Pi was back and better in under twenty minutes.

Since that practical example, I’ve done everything on my Pi via Ansible and had no regrets.
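For a sense of scale, a minimal playbook in this spirit (hosts, packages, and tasks here are illustrative, not my actual playbook):

```yaml
# Illustrative playbook: package installs plus one hardening tweak.
- hosts: pi
  become: yes
  tasks:
    - name: Install the packages I always end up needing
      apt:
        name: [vim, git, fail2ban]
        state: present
    - name: Disable ssh password logins
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PasswordAuthentication'
        line: 'PasswordAuthentication no'
      notify: restart sshd
  handlers:
    - name: restart sshd
      service:
        name: ssh
        state: restarted
```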


Fluentsee: Fluentd Log Parser

I wrote previously about using fluentd to collect logs as a quick solution until the “real” solution happened.  Well, like many “temporary” solutions, it settled in and took root. I was happy with it, but got progressively more bored of coming up with elaborate command pipelines to parse the logs.

Fluentsee

So in the best DevOps tradition, rather than solve the initial strategic problem, I came up with another layer of paint to slap on as a tactical fix, and fluentsee was born.  Fluentsee is written in Java, and lets you filter the logs and print entries in different formats:

$ java -jar fluentsee-1.0.jar --help
Option (* = required)           Description
---------------------           -----------
--help                          Get command line help.
* --log <String: filename>      Log file to use.
--match <String: field=regex>   Define a match for filtering output. May pass in
                                multiple matches.
--tail                          Tail the log.
--verbose                       Print verbose format entries.

So, for example, to see all the log entries from the nginx container, with a POST you would:

$ java -jar fluentsee-1.0.jar --log /fluentd/data.log \
--match 'json.container_name=.*nginx.*' --match 'json.log=.*POST.*'

The matching uses Java regexes. The parsing isn’t wildly efficient, but it generally keeps up.

Grab it on Github

There’s a functional version now on github, and you can expect enhancements, as I continue to ignore the original problem and focus on the tactical patch.

Collecting Docker Logs With Fluentd

I’m working on a project involving a bunch of services running in docker containers.  We are working on a design and implementation of our full blown log gathering and analysis solution, but what was I to do till then?  Having to bounce around to all the hosts and look at them there was getting tiresome, but I didn’t want to expend much energy on a stopgap measure either.

Enter Fluentd

Docker offers support for various different logging drivers, so I ran down the list and gave each choice about ten minutes of attention, and sure enough, one choice only needed ten minutes to get up and running – fluentd.

What it Took

  1. Pick a machine to host logs
  2. Run a docker image of fluentd on that host
  3. Add a couple of additional options on my docker invocations.
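Steps 2 and 3 sketched as commands (the host name is illustrative; the flags are docker’s standard fluentd logging options):

```shell
# 2. Run fluentd itself on the log host, explicitly on the default
#    json-file driver so it doesn't depend on itself.
docker run -d --name fluentd \
    --log-driver json-file \
    -p 24224:24224 \
    -v /fluentd/log:/fluentd/log \
    fluent/fluentd

# 3. Point every other container at it with the fluentd log driver.
docker run -d \
    --log-driver fluentd \
    --log-opt fluentd-address=loghost:24224 \
    nginx
```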

What That Got Me

With the above done, all my docker containers’ logs aggregated on the designated host in an orderly format, with log rolling, etc.

But…

The orderly format in the aggregated log was well structured, but maybe not friendly.  Its format is:

TIMESTAMP HOST_ID JSON_BLOB

So an example might look like:

20170804T140852+0000 9c501a9baf61 {"container_id":"...","container_name":"...","source":"stdout","log":"..."}

Everything in its place but…

How To Deal

So with everything going into one file, and a mix of text and JSON, I settled on the following approach.   First I installed jq to help format the JSON.  Then I just employed tried and true command line tools.

For example, let’s say you just want to look at the log entries for an nginx container:

grep /nginx /fluentd/data.20170804.log | cut -c 35- | jq -C . | less -r

That’s all it takes!  Use grep to pull the lines with the container name, cut out the JSON, have jq format it, and view it.

Maybe you just want the log field, rather than the entire entry:

grep /nginx /fluentd/data.20170804.log | cut -c 35- | jq -C .log | less -r

Just have jq pull out the single field.

It’s Low Tech But…

For about ten minutes setup work, and a little command line magic, I’ve got a good solution until the real answer arrives.

Tech Notes

There were a couple of specifics worth noting in the process here.  First, there are at least two ways to direct docker to use a specific log driver. One is via the command line on a run. The other is to configure the docker daemon via its /etc/docker/daemon.json file.  The command line is more granular, you can pick and choose which containers log to which driver. That’s flexible and nice, but unfortunately docker “compose” and “cloud” don’t support setting the driver for a container.  Setting at the docker daemon level as a default solves the compose/cloud issue, but, creates a circular dependency if you’re running fluentd in docker, because that container won’t start unless fluentd is running, but fluentd is in that container.  I went with setting it at the daemon level, and I made sure to run the fluentd container first thing, with a command line option indicating the traditional log driver.
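For reference, the daemon-level default described above looks roughly like this in /etc/docker/daemon.json (the address is illustrative):

```json
{
  "log-driver": "fluentd",
  "log-opts": {
    "fluentd-address": "loghost:24224"
  }
}
```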

The second noteworthy point: the fluentd container provides a data.log link that is supposed to always point to the newest log… for me it doesn’t.  I have to look in the log area and find the newest log myself, because data.log doesn’t update correctly through some log rotations.

Information Graveyard

I’m trying to learn how to write a skill for Amazon’s Alexa, taking the tried and true approach of searching for tutorials on the internet.  At this point it’s been only frustration.  I’ve found both Amazon written tutorials and third party ones.  Not a single one yet has provided instructions that correspond to the current Amazon tools.  Some are relatively recent, or at least claim to have been recently updated, but not a one has actually provided a working example.  It’s not a matter of slight differences that can be worked around, each one has had at least one step that didn’t seem to correspond to anything in the AWS console as it is today.

Keeping posts up to date is work, I realize. I’m guilty too of leaving out-of-date documentation in the wild, but I make an effort to be responsible, and I’m not expecting revenue from my posts.  How is it that even Amazon’s own tutorials are completely borked?  I tried this about two months back and it was the same story. Since then both the tutorials and the AWS tools have been updated, but the new combination is no more workable than the prior one.

Some products are notably bad on this point.  Amazon’s SDKs and tools are a consistent pain point. The Spring ecosystem is bad too.  JBoss is a mess.  The problem is also made worse by how developers refactor code and APIs.  Making changes and improvements in a way that facilitates migration is a skill.  I wish Amazon would acquire that skill.

Compromises

I hit on a really good article on the Law of Demeter. If you’re not familiar with it, read that article, and if you do, you may find my discussion with the author in the comments. The discussion was around how rigidly you should take the term Law.

Why Quibble The Word Law?

I code mostly in Java, classified as an object-oriented language, and I’ve coded in the OO paradigm in C++, Objective-C and Smalltalk too.  But I started in procedural languages (Pascal and C), and I’ve worked with functional ones (Lisp, SML), and smatterings of other languages and their styles too.  They all have their strong points. I’ve learned tactics and patterns from them all, and when I encounter a situation where one applies, if the current tools can implement the tactic, I use it.  I’m not saying anything astonishing here; modern tools are rarely purist in their approaches anymore.

The Law of Demeter is a good OO maxim, but if you’re writing code that handles serialized data, whether it be for a RESTful service, data store persistence, etc., you’ll likely be dealing with composite data (addresses, inside accounts, inside portfolios, etc.).  Accessing portfolio.account.address.state violates the Law of Demeter. There are patterns to mitigate some of the issues here, like Data Transfer Objects, the Visitor pattern, or the Facade pattern, but depending on the situation some of these cures are worse than the problem.
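To make that concrete, a hypothetical sketch (every class and field name here is invented) of the violation and of delegation, one common mitigation:

```java
// Hypothetical composite data of the sort serialized services deal
// in: a Portfolio holds an Account, which holds an Address.
class Address {
    final String state;
    Address(String state) { this.state = state; }
}

class Account {
    final Address address;
    Account(Address address) { this.address = address; }
}

class Portfolio {
    final Account account;
    Portfolio(Account account) { this.account = account; }

    // Delegation: callers ask the Portfolio directly instead of
    // reaching through the chain. The cost is boilerplate like this
    // on every level of every composite type.
    String accountState() {
        return account.address.state;
    }
}
```

With this in place, `portfolio.account.address.state` reaches through two intermediaries and violates the law, while `portfolio.accountState()` obeys it, at the price of Portfolio knowing how to answer for its parts.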

In Summary

Keep the Law of Demeter in mind as you write/review your code. If it’s been rampantly ignored that certainly is a code smell.  But paradigm “laws” are for purists, and writing software is a pragmatic process… so… yeah… it’s a maxim.