Kubernetes – setting up the hosts

Introduction

This is Step 1 in my recent Kubernetes setup, where I very quickly describe the process I followed to build and configure the basic requirements for a simple Kubernetes cluster.

Step 2 is here https://www.donaldsimpson.co.uk/2018/12/29/kubernetes-from-cluster-reset-to-up-and-running/

and Step 3 where I set up Helm and Tiller and deploy an initial chart to the cluster: https://www.donaldsimpson.co.uk/2019/01/03/kubernetes-adding-helm-and-tiller-and-deploying-a-chart/

The TL/DR

A quick summary should cover 99% of this, but I wanted to make sure I’d recorded my process/journey to get there – to cut a long story short, I ended up using this Ansible project:

https://github.com/DonaldSimpson/ansible-kubeadm


which I forked from the original here:

https://github.com/ben-st/ansible-kubeadm

on the 5 Ubuntu linux hosts I created by hand (the horror) on my VMWare ESX home lab server. I started off writing my own ansible playbook which did the job, then went looking for improvements and found the above fitted my needs perfectly.

The inventory file here: https://github.com/DonaldSimpson/ansible-kubeadm/blob/master/inventory details the addresses and functions of the 5 hosts – 4 x workers and a single master, which I’m planning on keeping solely for the master role.
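The actual file is in the repo linked above, but as a rough illustration (the group names here are just for the sketch, so check the real inventory for the exact layout) it is along these lines:

# illustrative sketch only - see the inventory file linked above for the real thing
[master]
umaster ansible_host=192.168.0.46

[workers]
ubuntu01 ansible_host=192.168.0.43
ubuntu02 ansible_host=192.168.0.44
ubuntu03 ansible_host=192.168.0.45
# plus a fourth worker, whose address I've not repeated here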

My notes:

Host prerequisites are in my rough notes below – simple things like SSH keys, passwordless sudo for the ansible user, installing required tools like Python, setting suitable IP addresses and adding the users you want. Also allocating suitable amounts of memory, CPU and disk – all of which come down to your preference, availability and expectations.

https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/

ubuntumaster is 192.168.0.46
su - ansible
check history

ansible setup

https://www.howtoforge.com/tutorial/setup-new-user-and-ssh-key-authentication-using-ansible/
1 x master:
 - sudo apt-get install open-vm-tools-desktop
 - sudo apt install openssh-server vim whois python ansible
 - export TERM=linux (re https://stackoverflow.com/questions/49643357/why-p-appears-at-the-first-line-of-vim-in-iterm)
 - /etc/hosts:
127.0.1.1       umaster
192.168.0.43    ubuntu01
192.168.0.44    ubuntu02
192.168.0.45    ubuntu03
// slave nodes need: ssh-rsa AAAAB3NzaC1y<snip>fF2S6X/RehyyJ24VhDd2N+Dh0n892rsZmTTSYgGK8+pfwCH/Vv2m9OHESC1SoM+47A0iuXUlzdmD3LJOMSgBLoQt ansible@umaster
added to the root user’s authorized_keys in .ssh, and apt install python ansible -y
//apt install python ansible -y
useradd -m -s /bin/bash ansible
passwd ansible <type the password you want>

echo -e 'ansible\tALL=(ALL)\tNOPASSWD:\tALL' > /etc/sudoers.d/ansible
echo -e 'don\tALL=(ALL)\tNOPASSWD:\tALL' > /etc/sudoers.d/don
mkpasswd --method=SHA-512 <type password "secret">
Password:
$6$dqxHiCXHN<snip>rGA2mvE.d9gEf2zrtGizJVxrr3UIIL9Qt6JJJt5IEkCBHCnU3nPYH/
su - ansible
ssh-keygen -t rsa

cd ansible01/
vim inventory.ini
ansible@umaster:~/ansible01$ cat inventory.ini
[webserver]
ubuntu01 ansible_host=192.168.0.43
ubuntu02 ansible_host=192.168.0.44
ubuntu03 ansible_host=192.168.0.45

ansible@umaster:~/ansible01$ cat ansible.cfg
[defaults]
 inventory = /home/ansible/ansible01/inventory.ini
ansible@umaster:~/ansible01$ ssh-keyscan 192.168.0.43 >> ~/.ssh/known_hosts
# 192.168.0.43:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# 192.168.0.43:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# 192.168.0.43:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
ansible@umaster:~/ansible01$ ssh-keyscan 192.168.0.44 >> ~/.ssh/known_hosts
# 192.168.0.44:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# 192.168.0.44:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# 192.168.0.44:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
ansible@umaster:~/ansible01$ ssh-keyscan 192.168.0.45 >> ~/.ssh/known_hosts
# 192.168.0.45:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# 192.168.0.45:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
# 192.168.0.45:22 SSH-2.0-OpenSSH_7.6p1 Ubuntu-4
ansible@umaster:~/ansible01$ cat ~/.ssh/known_hosts
or could have done:
for i in $(cat list-hosts.txt)
do
ssh-keyscan $i >> ~/.ssh/known_hosts
done
cat deploy-ssh.yml

- hosts: all
  vars:
    ansible_password: '$6$dqxHiCXH<kersnip>l.urCyfQPrGA2mvE.d9gEf2zrtGizJVxrr3UIIL9Qt6JJJt5IEkCBHCnU3nPYH/'
  gather_facts: no
  remote_user: root

  tasks:

  - name: Add a new user named provision
    user:
         name=ansible
         password={{ ansible_password }}

  - name: Add provision user to the sudoers
    copy:
         dest: "/etc/sudoers.d/ansible"
         content: "ansible ALL=(ALL)  NOPASSWD: ALL"

  - name: Deploy SSH Key
    authorized_key: user=ansible
                    key="{{ lookup('file', '/home/ansible/.ssh/id_rsa.pub') }}"
                    state=present

  - name: Disable Password Authentication
    lineinfile:
          dest=/etc/ssh/sshd_config
          regexp='^PasswordAuthentication'
          line="PasswordAuthentication no"
          state=present
          backup=yes
    notify:
      - restart ssh

  - name: Disable Root Login
    lineinfile:
          dest=/etc/ssh/sshd_config
          regexp='^PermitRootLogin'
          line="PermitRootLogin no"
          state=present
          backup=yes
    notify:
      - restart ssh

  handlers:
  - name: restart ssh
    # the service is called "ssh" on Ubuntu (it would be "sshd" on RHEL/CentOS)
    service:
      name=ssh
      state=restarted

// end of the above file

ansible-playbook deploy-ssh.yml --ask-pass

results in:

PLAY [all] *********************************************************************

TASK [Add a new user named provision] ******************************************
fatal: [ubuntu02]: FAILED! => {"msg": "to use the 'ssh' connection type with passwords, you must install the sshpass program"}

so, for each node/slave/host: sudo apt-get install -y sshpass
ubuntu01 ansible_host=192.168.0.43
ubuntu02 ansible_host=192.168.0.44
ubuntu03 ansible_host=192.168.0.45

kubernetes setup
https://www.techrepublic.com/article/how-to-quickly-install-kubernetes-on-ubuntu/
run install_apy.yml against all hosts and localhost too
on master:

kubeadm init

results in:
root@umaster:~# kubeadm init
[init] using Kubernetes version: v1.11.1
[preflight] running pre-flight checks
I0730 15:17:50.330589   23504 kernel_validator.go:81] Validating kernel version
I0730 15:17:50.330701   23504 kernel_validator.go:96] Validating kernel config
    [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.12.1-ce. Max validated version: 17.03
[preflight] Some fatal errors occurred:
    [ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
root@umaster:~#
do swapoff -a then try again
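Note that swapoff -a only lasts until the next reboot, so if you want swap to stay off permanently (kubeadm will complain again otherwise) the usual trick is to also comment out the swap entry in /etc/fstab, something like:

sudo swapoff -a
# comment out any swap lines so swap stays off after a reboot
sudo sed -i '/ swap / s/^/#/' /etc/fstab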
kubeadm init… wait for images to be pulled etc – takes a while

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.0.46:6443 --token 9e85jo.77nzvq1eonfk0ar6 --discovery-token-ca-cert-hash sha256:61d4b5cd0d7c21efbdf2fd64c7bca8f7cb7066d113daff07a0ab6023236fa4bc
root@umaster:~#

Next up…

The next post in the series is here: https://www.donaldsimpson.co.uk/2018/12/29/kubernetes-from-cluster-reset-to-up-and-running/ and details an automated process to scrub my cluster and reprovision it (from a Kubernetes point of view – the hosts are left intact).

Kubernetes – from cluster reset to up and running

This is Step 2 in a series of Kubernetes blog posts

Step 1 covers the initial host creation and basic provisioning with Ansible: https://www.donaldsimpson.co.uk/2019/01/03/kubernetes-setting-up-the-hosts/

and Step 3 is where I set up Helm and Tiller and deploy an initial chart to the cluster: https://www.donaldsimpson.co.uk/2019/01/03/kubernetes-adding-helm-and-tiller-and-deploying-a-chart/

These are notes on going from a freshly reset kubernetes cluster to a running & healthy cluster with a pod network applied and worker nodes connected.

To get to this starting point I provisioned 4 Ubuntu hosts (1 master & 3 workers) on my VMWare server – a Dell Poweredge R710 with 128GB RAM.

I then used this Ansible project:

https://github.com/DonaldSimpson/ansible-kubeadm

to configure the hosts and prep for Kubernetes with kubeadm:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

I’ll write about this in more detail in another post…

Please note that none of this is production grade or recommended, it’s simply what I have done to suit my needs in my home lab. My focus is on automating Kubernetes processes and deployments, not creating highly available bullet-proof production systems.

To reset and restore a ‘new’ cluster, first on the master instance: reboot, then as a normal user (I’m using an “ansible” user with sudo throughout) run:


sudo kubeadm reset
(y)
sudo swapoff -a
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

I’m passing that CIDR address as I’m using Flannel for pod networking (details follow) – if you use something else you may not need that, but may well need something else.
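For example, if you went with Calico instead, its standard manifests at the time assumed a pod CIDR of 192.168.0.0/16, so the init would be something like the line below instead (check the current Calico docs rather than trusting my memory):

sudo kubeadm init --pod-network-cidr=192.168.0.0/16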

That should be the MASTER started, with a message to add nodes with:


  kubeadm join 192.168.0.46:6443 --token 9w09pn.9i9uu1ht8gzv36od --discovery-token-ca-cert-hash sha256:4bb0bbb1033a96347c6dd888c769ec9c5f6caa1b699066a58720ffdb97a0f3d7

which all sounds good, but the first most basic check produces the following error:


ansible@umaster:~$ kubectl cluster-info
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

which I think is due to the kubeadm reset cleaning up the previous config, but can be easily fixed with this:


mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

then it works and MASTER is up and running ok:


ansible@umaster:~$ sudo kubectl cluster-info
Kubernetes master is running at https://192.168.0.46:6443
KubeDNS is running at https://192.168.0.46:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

————- ADD NODES ——————

Use the command and token provided by the master on the worker node(s) (in my case that’s “ubuntu01” to “ubuntu04”). Again I’m running as the ansible user everywhere, and I’m disabling swap and doing a kubeadm reset first as I want this repeatable:

sudo swapoff -a
sudo kubeadm reset
sudo  kubeadm join 192.168.0.46:6443 --token 9w09pn.9i9uu1ht8gzv36od --discovery-token-ca-cert-hash sha256:4bb0bbb1033a96347c6dd888c769ec9c5f6caa1b699066a58720ffdb97a0f3d7

I think the token expires after a few hours. If you want to get a new one you can query the Master using:

https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-token/

Or, as I’ve just found out, more recent versions of k8s provide “kubeadm token create --print-join-command”, which provides output like the following example that you can save to a file/variable/whatever:

kubeadm join 192.168.0.46:6443 --token 8z5obf.2pwftdav48rri16o --discovery-token-ca-cert-hash sha256:2fabde5ad31a6f911785500730084a0e08472bdcb8cf935727c409b1e94daf44

I believe options to specify JSON or alternative output formatting are in the works too.
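If you would rather piece the join command together yourself, the token and the CA cert hash can be fetched separately on the master, something along these lines (the openssl pipeline is the one from the kubeadm docs, if I recall correctly):

# list existing tokens, or create a fresh one
kubeadm token list
kubeadm token create

# recalculate the value for --discovery-token-ca-cert-hash
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //'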

That’s all that is needed. If you’ve not used this node before it may take a while to pull things in, but if you have it should be pretty much instant.

When ready, running a quick check on the MASTER shows the connected node (ubuntu01) and the Master (umaster) and their status:


ansible@umaster:~$ sudo kubectl get nodes --all-namespaces
NAME       STATUS     ROLES    AGE     VERSION
ubuntu01   NotReady   <none>   27s     v1.13.1
umaster    NotReady   master   8m26s   v1.13

The NotReady status is because there’s no pod network available – see here for details and options:

https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network

so apply a pod network (I’m using flannel) like this on the Master only:


ansible@umaster:~$ sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created

Then check again and things should look better now they can communicate…


ansible@umaster:~$ sudo kubectl get nodes --all-namespaces
NAME       STATUS   ROLES    AGE     VERSION
ubuntu01   Ready    <none>   2m23s   v1.13.1
umaster    Ready    master   10m     v1.13.1
ansible@umaster:~$

Adding any number of subsequent nodes is very easy and exactly the same (the pod networking setup is a one-off step on the master only). I added all 4 of my worker vms and checked they were all Ready and “schedulable”. My server coped with this no problem at all. Note that by default you can’t schedule tasks on the Master, but this can be changed if you want to.
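For reference, allowing pods to be scheduled on the Master is just a case of removing the taint kubeadm adds to it, something like this run on the Master (the taint name can vary between versions, so check kubectl describe node umaster first):

kubectl taint nodes --all node-role.kubernetes.io/master-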

That’s the very basic “reset and restore” steps done. I plan to add this process to a Jenkins Pipeline, so that I can chain a complete cluster destroy/reprovision and application build, deploy and test process together.

The next steps I did were to:

  • install the Kubernetes Dashboard to the cluster
  • configure the Kubernetes Dashboard and fix permissions
  • deploy a sample application, replicaset & service and expose it to the network
  • configure Heapster

which I’ll post more on soonish… and I’ll add the precursor to this post on the host provisioning and kubeadm setup too.

Meetup – Deploying Openshift to AWS with HashiCorp Terraform and Ansible

 

Automated IT Solutions presented a talk on “Deploying Openshift to AWS with HashiCorp Terraform and Ansible”, by Liam Lavelle on 16th October 2018.

 

We would like to thank

 

  • Liam Lavelle for an interesting, informative and fun session
  • Everyone that came along to make it such a good event, with some great questions, helpful answers and interesting discussions
  • Hays for the beer, pizza, venue and help with everything

Hope to see you all at the next one soon!

The slides and all materials used in this session are available on our GitHub repo here:

 

Deploying Openshift to AWS with HashiCorp Terraform and Ansible

Tuesday, Oct 16, 2018, 6:15 PM

HAYS
7 Castle St, Edinburgh EH2 3AH Edinburgh, GB

30 Members Went


 

Here are the details:
When:
Tuesday, October 16th, 2018
6:15 PM to 9:00 PM

 

Where:
Hays office on the 2nd floor
7 Castle St, Edinburgh EH2 3AH · Edinburgh

 

What:
Deploying Openshift to AWS with HashiCorp Terraform and Ansible

 

Agenda:

In this session we look at Infrastructure as Code and Configuration as Code, as we demonstrate how to use these approaches to deploy RedHat OpenShift to AWS with HashiCorp Terraform and Ansible.

We start off with configuring AWS credentials, then use HashiCorp Terraform to create the AWS infrastructure needed to deploy and run our own RedHat OpenShift cluster.

We then go through using Ansible to deploy OpenShift to AWS, followed by a review of the Cluster, then take a quick look at troubleshooting any issues you may encounter.

There will be a break in the middle for beer & pizza courtesy of Hays, and we will wrap things up with a quick Q&A and feedback session.

If you would like to bring your own laptop and follow along, please do!

Who:
Intermediate Linux and some AWS knowledge is useful but not essential.

New Meetup – Vagrant from scratch to LAMP stack

Automated IT Solutions are running a new Meetup in Edinburgh on Friday 18th May, check out the details and register for this free session here – beer, pizza and free HashiCorp stickers included!:

Vagrant from scratch to LAMP stack

Friday, May 18, 2018, 6:15 PM

HAYS
7 Castle St, Edinburgh EH2 3AH Edinburgh, GB

18 Members Attending

Automated IT Solutions are presenting a session on HashiCorp Vagrant: “from scratch to LAMP stack” by Adam Cheney. In this session you will learn: – Vagrant basics, introduction and usage – How to install and configure Vagrant – Provisioning VMs with Vagrant and Ansible followed by a live demonstration/workshop of building a LAMP stack within Vagra…


Adding an insecure-registry to Docker on Ubuntu

Quick note on adding an entry like --insecure-registry 172.30.0.0/16 to docker running on Ubuntu.

While trying to get oc cluster up working on an Ubuntu VM I was getting the following error message and (helpfully) a suggested solution:

don@ubuntu:~# oc cluster up doncluster
Starting OpenShift using registry.access.redhat.com/openshift3/ose:v3.7.23 ...
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for registry.access.redhat.com/openshift3/ose:v3.7.23 image ... OK
-- Checking Docker daemon configuration ... FAIL
   Error: did not detect an --insecure-registry argument on the Docker daemon
   Solution:
     Ensure that the Docker daemon is running with the following argument:
         --insecure-registry 172.30.0.0/16

I normally work on RedHat boxes, and this is usually easily solved by going to /etc/sysconfig/docker and adding the desired registry to the line:

INSECURE_REGISTRY=

On more recent RedHat docker installs this is now done in the externalised config file /etc/containers/registries.conf.

On my Ubuntu VM neither of these exist, and running locate with grep plus a quick google brings back loads of other file locations and suggestions, none of which worked for me (/etc/default/docker, exporting DOCKER_OPTS etc etc).

So, I checked systemctl status docker and got the following:

don@ubuntu:~# systemctl status docker
● docker.service - Docker Application Container Engine
 Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
 Active: active (running) since Wed 2018-01-24 11:29:25 GMT; 25min ago
 Docs: https://docs.docker.com
 Main PID: 4648 (dockerd)
 Tasks: 19 (limit: 19660)
 Memory: 26.8M
 CPU: 1.324s
 CGroup: /system.slice/docker.service
 ├─4648 /usr/bin/dockerd -H fd:// --insecure-registry 172.30.0.0/16
 └─4667 docker-containerd -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-di (...snip)

which prompted me to look at the file

/lib/systemd/system/docker.service

Adding the settings I wanted to the end of the ExecStart line like so:

ExecStart=/usr/bin/dockerd -H fd:// --insecure-registry 172.30.0.0/16

followed by a

systemctl daemon-reload
systemctl restart docker

did the trick, finally.
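As an aside, I believe the tidier way to do this on Docker versions that support it is to put the setting in /etc/docker/daemon.json rather than editing the unit file, roughly as below, followed by the same daemon-reload and restart. I stuck with the ExecStart edit at the time, so treat this as an untested alternative:

{
  "insecure-registries": ["172.30.0.0/16"]
}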

I am now hitting this issue, which looks like a systemd + docker mismatch… and am thinking CentOS may be a better place to test this!

don@ubuntu:~# oc cluster up doncluster
Starting OpenShift using registry.access.redhat.com/openshift3/ose:v3.7.23 ...
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for registry.access.redhat.com/openshift3/ose:v3.7.23 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... FAIL
   Error: Cannot get TCP port information from Kubernetes host
   Caused By:
     Error: cannot start container cec56a101a46aa25adb6806f7c84df218e5d79c392fa0c38207f92510eb46538
     Caused By:
       Error: Error response from daemon: {"message":"oci runtime error: rootfs_linux.go:53: mounting \"/sys/fs/cgroup\" to rootfs \"/var/lib/docker/aufs/mnt/aeedaa83596edc9cb2b2cd835000277f9a5355f709694f8ec70d88787395cbd0\" caused \"no subsystem for mount\""}

argh.

Getting started with Terraform and AWS

These are my notes from running through the Terraform getting started guide here:

https://www.terraform.io/intro/getting-started/install.html

to set up terraform (on a Mac) and provision a basic test instance in AWS.

Install process

This is very easy, simply download terraform for your platform (a single binary), extract it somewhere sensible and add that location to your PATH variable.

I set this up in my .profile, along the lines of:

export TFORM=/Users/donaldsimpson/TFORM
export PATH=$M2:$TFORM:$PATH

quick check that all looks ok:
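For example, something like this should confirm the binary is found and report its version:

which terraform
terraform version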

Setup

As per the guide, the next steps are to get a note of your AWS access_key and secret_key from this AWS page, then create and edit a local “example.tf” file for your project, like this:

provider "aws" {
  access_key = "ACCESS_KEY_HERE"
  secret_key = "SECRET_KEY_HERE"
  region     = "us-east-1"
}

resource "aws_instance" "example" {
  ami           = "ami-2757f631"
  instance_type = "t2.micro"
}

I hit this issue: https://github.com/hashicorp/terraform/issues/4367 as my AWS account is pretty old, and had to change the values for

ami = "ami-2757f631"
instance_type = "t2.micro"

to be:

ami = "ami-408c7f28"
instance_type = "t1.micro"

Terraform init

You should now be able to run terraform init and see something positive…

Check the plan

Running “terraform plan” provides a dry run/sanity check of what would be done

Make it so

terraform apply: run the plan, and actually create the resources listed above:

Show it is so

Once that has completed, you can check your AWS console and see the newly created instance:

“terraform show” can confirm the same details in a less pointy-clickety way:
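For anyone reading this without the screenshots, the whole flow from this guide boils down to something like the following (whether apply prompts you for a 'yes' depends on your Terraform version):

terraform init      # download the provider plugins (AWS in this case)
terraform plan      # dry run: show what would be created
terraform apply     # actually create the resources
terraform show      # inspect the resulting state
terraform destroy   # tear it all down again when you're done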

Next steps

This was all pretty simple, quick and straightforward.

The next steps are to manage the hosts in an Infrastructure as Code manner, adding in changes and deletions/reprovisioning, and to do something useful with them.

I’d also like to try using Terraform with Digital Ocean and VMWare providers.

 

Jenkins and Docker – Part 1 of 3

This post is the first in a series of 3 introducing the combined power of Jenkins, Docker, and the Jenkins DSL.

They should hopefully provide enough information to get to grips with both Docker and Jenkins – what they both do and how to use them – by showing some practical examples of them working together.

The first step, if you haven’t already, is to download and install Docker on your platform – the Docker website covers this in good detail for most platforms…

Docker for Mac

Docker for Windows

Docker for Linux

Once that’s done, you can try it out with the customary “Hello World” example…

I’m running Docker on an Ubuntu VM, but the commands and the results are the same regardless of platform – that’s one of the main Docker concepts.
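If you want to follow along, the customary check is simply:

docker run hello-world

which pulls the small hello-world image from Docker Hub (if you don't already have it locally) and prints a welcome message.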

You can then check which processes (docker containers) are running using the “docker ps” command – in my example you can see that there’s one Jenkins image running. If you run “docker ps -a” you will see all containers (including stopped ones, of which I have a few on this host):

and you can check your Docker version with:

root@ubuntud:~# docker --version
Docker version 1.13.0, build 49bf474

Now that the basic setup is done, we can move on to something a little more interesting – downloading and running a “Dockerised” Jenkins container.

I’m going to use my own Dockerised Jenkins Image, and there will be more detail on that in the next post – you’re welcome to try it out too, just run this command in your terminal:

docker run -d -p 8080:8080 donaldsimpson/dockerjenkins

if you don’t happen to have my docker image cached locally (like I do) then docker will automatically download it for you from Docker Hub then run it:

That command did quite a few important things, here’s a quick explanation of them all:

docker run -d

tells docker that we want to run the container in the background so that we can carry on and do other things while it runs. The alternative is -it, for an interactive/foreground session.

docker run -d -p 8080:8080

The -p 8080:8080 tells docker to map port 8080 on the local host to port 8080 in the running container. This means that when we visit localhost:8080 the request will be passed through to the container.

docker run -d -p 8080:8080 donaldsimpson/dockerjenkins

and finally, we have the namespace and name of the Docker image we want to run – my “donaldsimpson/dockerjenkins” one – more on this later!

You can now visit port 8080 on your Docker host and see that Jenkins is up and running….
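If you'd rather check from the command line first, something like this should show the container responding (a redirect or login page at this point is fine, it just proves the port mapping works):

curl -I http://localhost:8080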

 

That’s Jenkins up and running and being happily served from the Docker container that was just pulled from Docker Hub – how easy was that?!

And the best thing is, it’s entirely and reliably repeatable, it’s guaranteed to work the same on all platforms that can run Docker, and you can quickly and easily update, delete, replace, change or share it with others! Ok, that’s more than one thing, but the point is that there’s a lot to like here 🙂

That’s it for this post – in the next one we will look in to the various elements that came together to make this work – the code and configuration files in my Git repo, the automated build process on Docker Hub that builds and updates the Docker Image, and how the two are related.

Using ngrok to work around Carrier Grade NAT (CGNAT)

I wrote a while back about my troubles with Carrier Grade Nat (CGNAT), and described a solution that involved tunneling out of CGNAT using a combination of SSH and an AWS server – the full article is here.

That worked ok, but it was pretty fragile and not ideal – connections could be dropped, sessions expired, hosts rebooted etc etc. Passing data through my EC2 host is also not ideal.

My “new and improved” solution to this is to use a local tool like ngrok to create the tunnel for me. This is proving to be far simpler to manage, more reliable, and ngrok also provides a load of handy additional features too.

Here’s a very quick run through of getting it up and running on my Ubuntu VM, which sits behind CGNAT and hosts a webserver I’d like to be able to access from the outside occasionally. This is the front end to my ZoneMinder CCTV interface, but it could be anything you want to host and on any port.

First off, don’t use the default Ubuntu install, that will give you version 1.x which is out of date and didn’t work for me at all – it’s better, quicker and easier to get the latest binary for your platform directly from the ngrok website, extract that on your host and run it directly or add it to your PATH.

wget http://<YourDownloadURL>/ngrok-stable-linux-amd64.zip

unzip ngrok-stable-linux-amd64.zip

once that’s downloaded and extracted, you can (optionally) add your auth token, which you get when you register on the ngrok site – registering is optional, but you get some worthwhile features from doing so.

./ngrok authtoken <YourAuthTokenFromTheNgrokWebsite>

Then you simply run ngrok like so:

./ngrok http 80

which should give you a console something like this:

from here you can get the Forwarding URL (http://<uniqueid>.eu.ngrok.io in this example) and your local port 80 should be available on that from anywhere on the internet.

Note I’m using this command:

screen ./ngrok http -region eu 80

to start up ngrok using screen, so I can CTRL+A+D out of that and resume it when I want using screen -r.

Here’s a pic of the console running, showing requests, and Apache being served by the ngrok URL:

That’s it – quick and easy, more stable, and far less faffing too.

 

There are tons of other options worth exploring, like specifying basic HTTP auth, saving your config to a local file, running other ports etc, all of them are explained in the documentation.
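For example, as far as I remember the 2.x client lets you add basic auth straight on the command line, or keep your settings in ~/.ngrok2/ngrok.yml and start a named tunnel. The flags and file format have changed between versions, so check the current docs before copying this:

./ngrok http -auth="username:password" 80

# or, with a tunnel named "zoneminder" defined in ~/.ngrok2/ngrok.yml:
./ngrok start zoneminder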

There’s a handy review of ngrok and several very similar tools here: http://john-sheehan.com/blog/a-survey-of-the-localhost-proxying-landscape

And some good tips & tricks with ngrok here:
https://developer.atlassian.com/blog/2015/05/secure-localhost-tunnels-with-ngrok/
as noted in the comments on that page: you obviously need to be safe and sensible when opening up ports to the internet…

Cheers,

Don

PS: Update to add the script I use to update the ngrok URL when it changes.

I have this in a local Jenkins job that runs every 30 mins or so, and it has been happily doing the job for a couple of years now – it’s far from perfect and it’s a lot to set up if you’re not used to these tools, but I’m adding it here just in case it helps anyone else….

#!/bin/bash

# Backup of the Jenkins job/script I put together to automatically update my home ngrok tunnel.
# When the tunnel dies, this script will (via Jenkins) create a new one and update a PHP redirect file on my
# AWS Host that allows me to connect to my CGNAT'd home server via my AWS website using a dynamic ngrok end point
# Uses:
# - Jenkins
# - bash
# - ngrok
# - jq
# - grep and awk
# - PHP
# - Apache
# - AWS


# check if ngrok is running/not
pidof  ngrok >/dev/null
if [[ $? -ne 0 ]] ; then
		# A (re)start and update is required
    echo "Starting ngrok on $(date)"
    # Start up a new instance of ngrok
    BUILD_ID=dontKillMe nohup /root/ngrok/ngrok http -region eu 80 &
		# Give it a moment before testing it...
		echo "Sleeping for 15 seconds..."
    sleep 15
    # Get the updated public_url value from the ngrok api
		export NGROKURL=`curl -s http://127.0.0.1:4040/api/tunnels | jq '.' | grep public_url | grep https | awk -F\" '{print $4}'`
    echo "NGROKURL is $NGROKURL"
    # add that to a one-line PHP redirect page
		echo "<?php header('Location: $NGROKURL/zm'); exit;?>" > ZoneMinder.php
    # upload that to my AWS host
    echo "scp'ing zm.php to AWS host..."
		scp -i /MY_AWS_KEY_FILE.pem ZoneMinder.php MY_AWS_USER@MY_AWS_HOST.amazonaws.com:/MY_HTDOCS_DIR/ZoneMinder.php
		echo "Transfer complete."
    # Send an update message via email
		echo "New ngrok url is $NGROKURL/zm" | mailx -s "ngrok zm url updated" MY_EMAIL@gmail.com
else
		# Nothing needed, carry on
		echo "ngrok is currently running, nothing to do"
fi

Tunneling out of Carrier Grade Nat (CGNAT) with SSH and AWS

Update: there’s a new & improved solution here too.
Intro

After switching to a 4G broadband provider, who shall (pretty much) remain nameless, I discovered they were using Carrier-Grade  NAT (aka CGNAT) on me.

There are more details on that here and here but in short, the ISP is ‘saving’ IPv4 addresses by sharing them out amongst several users and NAT’ing their connections – in much the same way as you do at home, when you port forward multiple devices using one external IP address: my home network is just one ‘device’ in a pool of their users, who are all sharing the same external IP address.

The impact of this for me is that I can no longer NAT my internal network services, as I have been given a shared public-facing IPv4 address. This approach may be practical for a bunch of mobile phone users wanting to check Twitter and Facebook, but it sucks big time for gamers or anyone else wanting to connect things from their home network to the internet. So, rather than having “Everything Everywhere” through my very expensive new 4G connection – with 12 months contract – it turns out I get “not much to anywhere“.

The Aim

Point being: I would like to be able to check my internal servers and websites when I’m away – especially my ZoneMinder CCTV setup – but my home broadband no longer has its own internet address. So an alternative solution had to be found…

The “TL; DR” summary

I basically use 2 servers, the one at home (unhelpfully now stuck behind my ISP’s CGNAT) and one in the Amazon Cloud (my public facing AWS web server with DNS), and create a reverse SSH Tunnel between them. Plus a couple of essential tweaks you won’t find out about if you don’t read any further 🙂

The Steps
Step 1 – create the reverse SSH tunnel:

This is initiated on the internal/home server, and connects outwards to the AWS host on the internet, like so.

ssh -N -R 8888:localhost:80 -i /home/don/DonKey.pem awsuser@ec2-xx-xx-xx-xx.compute-x.amazonaws.com

Here is an explanation of each part of this command:

-N (from the SSH man page) “Do not execute a remote command.  This is useful for just forwarding ports.”

-R (from the SSH man page)  “Specifies that connections to the given TCP port or Unix socket  on the remote (server) host are to be forwarded to the given host and port, or Unix socket, on the local side.”

8888:localhost:80 – means, create the reverse tunnel from localhost port 80 (my ZoneMinder web app) to port 8888 on the destination host. This doesn’t look right to me, but it’s what’s needed for a reverse tunnel

the -i and everything after it is just me connecting to my AWS host as my user with an identity file. YMMV, whatever you normally do should be fine.

When you run this command you should not see any issues or warnings. You need to leave it running using whatever method you like – personally I like screen for this kind of thing, and will also be setting up Jenkins jobs later (below).

Step 2 – check on the AWS host

With that SSH command still running on your local server you should now be able to connect to the web app from your remote AWS Web Server, by reading from port 8888 with curl or wget.

This is a worthwhile check to perform at this point, before moving on to the next 2 steps – for example:

don@MyAWSHost:~$ wget -q -O- localhost:8888/zm | grep -i ZoneMinder
      <h1>ZoneMinder Login</h1>
don@MyAWSHost:~$

This shows that port 8888 on my AWS server is currently connected to the ZoneMinder application that’s running on port 80 of my home web server. A good sign.

Step 3 – configure AWS Security & Ports

Progress is being made, but in order to be able to hit that port with a browser and have things work as I’d like, I still need to configure AWS to allow incoming connections to the newly chosen port 8888.

This is done through the Amazon EC2 Management Console using the left hand menu item “Network & Security” then “Security Groups”:

This should load your current Security Groups, which you can click on to Edit. You may have a few to check.

Now select Add and configure a new Inbound rule something like so:

It’s the “Custom TCP Rule” second from the bottom, with port 8888 and “Anywhere” and “0.0.0.0/0” as the source in my picture. Don’t go for the HTTP option – unless you’re sure that’s what you want 🙂

Step 4 – configure SSH on AWS host

At this point I thought I was done… but it didn’t work and I couldn’t immediately see why, as the wget check was all good.

Some head scratching and checking of firewalls later, I realised it was most likely to be permissions on the port I was tunneling – it’s not very likely to be exposed and world readable by default, is it? Doh.

After a quick google I found a site that explained the changes I needed to make to my sshd_config file, so:

vim /etc/ssh/sshd_config

and add a new line that says:

GatewayPorts yes

to that file, checking that there’s no existing reference to GatewayPorts – edit this file carefully and at your own risk.

As I understand it – which may best be described as ‘loosely’ – the reason this worked when I tested with wget earlier is because I was connecting to the loopback interface; this change to sshd binds the port to all interfaces. See the detailed answer on this post for further detail, including ways to limit this to specific users.

Once that’s done, restart sshd with

service ssh restart

and you should now be able to connect by pointing a web browser at port 8888 (or whatever you set) of your AWS web server and see your app responding from the other end:

(screenshot: the ZoneMinder login page)
Step 5 – automate it with Jenkins

The final step for me is to wrap this (the ssh tunnel creation part) up in a Jenkins job running on my home server.

This is useful for a number of reasons, such as avoiding and resetting defunct/stale connections and enabling scheduling – i.e. I can have the port forwarded when I want it, and have it shutdown during the hours I don’t.
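A minimal sketch of the sort of thing that Jenkins job runs, in the same spirit as the ngrok script further up this page and reusing the placeholder key path and AWS hostname from the ssh command above, would be something like:

#!/bin/bash
# (re)create the reverse tunnel if it isn't already running
pgrep -f "ssh -N -R 8888:localhost:80" >/dev/null
if [[ $? -ne 0 ]] ; then
    echo "Tunnel not running, restarting it on $(date)"
    BUILD_ID=dontKillMe nohup ssh -N -R 8888:localhost:80 \
      -i /home/don/DonKey.pem awsuser@ec2-xx-xx-xx-xx.compute-x.amazonaws.com &
else
    echo "Tunnel is already up, nothing to do"
fi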

Sun server migration

The people of eBay kindly provided me with a Sun 900 38u rack cabinet for much, much cheapness. They also chucked in a whopping v890, a couple of StorEdge 3300s and something 2u-sized and servery that I’ve not yet managed to identify or attempt to power on.

Seeing a Sun cabinet being towed across the countryside by a mad man in a Land Rover Defender is not a regular occurrence, so I thought I’d share pics of the process…

It was delivered on a pallet, which was collapsing under the incredible weight of all the steel inside the cabinet; it must weigh about the same as a small car:


The Landy won the tug of war, but only just…


I had to partially dismantle the thing:


but it was soon restocked with some new additions when it was safely indoors – my old Cobalt 550 server and a SunBlade 100 I had sitting around.


I’ve not had a chance to fire up the v890 yet, need to speak to eBay about some disks first, but I did power on the Sun Microsystems light on the top – my wife now refers to it as “that geeky vending machine thingy”…

Will post progress!
