My goal for the day was to get consul infrastructure set up at work. I’d done proof of concept (PoC) work on all of the various bits and it was just a matter of putting the parts together:
- Everything from docker images
- The consul server on EC2 instances (in a VPC)
- The consul agents on a mix of client machines, some Linux, some OSX
I was working from home, and that seemed like it would help, but I didn’t realize that it would actually make the task basically impossible.
What Got Me
Consul is all about networking; that’s its game. So as I tried to glue all the PoC bits together, here’s what got me:
- Getting the EC2 security groups fixed up for consul’s too many ports – annoying but doable
- Dealing with the fact that the networking on Docker for OSX isn’t quite right. It lacks some of the bridging features, and things like --net=host “work” but behave non-intuitively
- Working from home, I was on a NAT’d machine connected through a VPN, so my address wasn’t always my address and some traffic wouldn’t traffic.
- The universe hates me. OK, that’s hyperbolic whining, but by the day’s end I was sure it was so.
Basically combining all those gotchas together meant that:
- Every example for consul in docker was from Linux and there was a 50/50 chance it would fail mysteriously on OSX.
- The errors I hit were often a lack of connectivity … and so I was left trying to diagnose silence … not a lot to go on there. Bueller? Bueller? Bueller?
- There was so much “useful” information out there that I just kept trying… I mean, if I just tried one more suggestion, that would get it, right?
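For diagnosing that kind of silence, a small probe can at least separate "nothing is listening" from "something is eating my packets". What follows is a minimal sketch, not what I actually ran that day: it assumes consul is on its default TCP ports, the names `probe` and `probe_consul` are made up for illustration, and it does not test the UDP side of the gossip and DNS ports.

```python
import socket

# Consul's default TCP ports. Serf LAN/WAN and DNS also use UDP,
# which this sketch does not test.
CONSUL_TCP_PORTS = {
    8300: "server RPC",
    8301: "Serf LAN gossip",
    8302: "Serf WAN gossip",
    8500: "HTTP API",
    8600: "DNS",
}


def probe(host, port, timeout=2.0):
    """Try one TCP connection and describe what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this port.
        return "refused"
    except socket.timeout:
        # No answer at all: often a security group, NAT, or VPN dropping packets.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return f"error: {exc}"


def probe_consul(host, timeout=2.0):
    """Probe every default Consul TCP port on host; returns {port: status}."""
    return {port: probe(host, port, timeout) for port in CONSUL_TCP_PORTS}


# Example use (the address is hypothetical):
# for port, status in probe_consul("10.0.0.5").items():
#     print(port, CONSUL_TCP_PORTS[port], status)
```

A "refused" usually means the path is fine and the service isn't up or isn't bound where you think it is; a "timeout" points back at security groups, NAT, or the VPN.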
Tomorrow I’ll be back on site and that will eliminate the NATing and the VPN. Perhaps with those two complexities removed I’ll make progress. I could always run consul natively rather than in docker…. but I really don’t want to admit defeat.