69 useful Terminal/CLI commands

For a long time I’ve maintained a memory aid in the form of a list of useful commands for the command line on Linux, macOS (OS X), BSD, Solaris, etc., so I thought I’d collect them in a sticky blog post in case they come in useful for others. Most of these will run on any Unix-type operating system, though I’ve indicated where a command is OS-specific. They can be run manually for admin tasks and also scripted for automation.

Continue reading “69 useful Terminal/CLI commands”

Testing the Monty Hall problem using a Python program to run game simulations

If a search brought you here then you probably already know what the Monty Hall problem is. If not, the Wikipedia page explains it in detail.

In a nutshell: the game comes from a gameshow called Let’s Make a Deal, in which the contestant is presented with three doors, behind one of which is a car that they win if they choose that door. Once the contestant has made their initial choice from the three doors, the host Monty Hall (who knows where the car is) opens one of the other doors to show that the car is not behind it. The contestant is then given the choice of staying with their original door or switching to the remaining unopened door.

I recently encountered the Monty Hall problem for the first time and fell into the same logical trap as most people: feeling certain that staying with my original choice had exactly the same probability of winning as switching to the other available door.

Due to my initial difficulty in accepting that switching doors doubles the chance of winning (from 1/3 to 2/3), I decided to take the scientific approach of writing a Python program to test it. This clearly demonstrated that my 50/50 belief was wrong, and that one should always switch, because doing so doubles the chance of winning.

There are plenty of similar programs out there that people have written to test this, but they generally seem unnecessarily complicated. Since the car’s door is chosen at random, nothing is lost by assuming the contestant always picks Door 1 initially; it’s then only necessary to compare the results of staying with Door 1 vs switching to whichever of Doors 2 and 3 is not already open.
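
For reference, the exact probabilities being simulated are easy to derive:

P(win if staying)   = P(car is behind Door 1)     = 1/3
P(win if switching) = P(car is not behind Door 1) = 2/3

In the long run “switch” should therefore win twice as often as “stay”, which is what the results below converge towards.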

To that end, here is my code:

import random

runs = 10 # total runs
games = 1000 # number of games per run for each choice

# Function to run the game simulations based on 
# choice of "stay" or "switch"
def play_games(choice):
  wins = 0

  # When running the game simulations we can assume 
  # an initial choice of door 1 each time. 
  # There's no need to make it any more complex than that
  for i in range(games):
    # Randomly select one of the three doors for the car to be behind
    car_door = random.randrange(1,4)
    # If we stay on door 1 we only need to know if the car is behind
    # door 1 (in which case we win) or not
    if choice == 'stay' and car_door == 1:
      wins += 1
    # If we switch to the other door (2 or 3, whichever of those two is
    # not already open) then we only need to know if the car is not
    # behind door 1 (in which case we win)
    elif choice == 'switch' and car_door != 1:
      wins += 1

  return wins

for j in range(runs):
  print('\nrun:\t' + str(j + 1))

  game_wins = {}

  # Run the game simulations for both choices, and print the results
  for choice in ('stay', 'switch'):
    game_wins[choice] = play_games(choice)
    print(choice + ':\t' + str(game_wins[choice]) + ' wins out of ' + str(games) + ' games')

  # Determine the win ratio of switches to stays and print the result
  win_ratio = game_wins['switch'] / game_wins['stay']
  print('result:\t"switch" won ' + str(round(win_ratio, 2)) + ' times as often as "stay"')

print('')

The results of the game simulations are very clear:

$ python3 ./montyhall.py 

run:	1
stay:	310 wins out of 1000 games
switch:	674 wins out of 1000 games
result:	"switch" won 2.17 times as often as "stay"

run:	2
stay:	350 wins out of 1000 games
switch:	667 wins out of 1000 games
result:	"switch" won 1.91 times as often as "stay"

run:	3
stay:	307 wins out of 1000 games
switch:	651 wins out of 1000 games
result:	"switch" won 2.12 times as often as "stay"

run:	4
stay:	336 wins out of 1000 games
switch:	670 wins out of 1000 games
result:	"switch" won 1.99 times as often as "stay"

run:	5
stay:	341 wins out of 1000 games
switch:	638 wins out of 1000 games
result:	"switch" won 1.87 times as often as "stay"

run:	6
stay:	322 wins out of 1000 games
switch:	679 wins out of 1000 games
result:	"switch" won 2.11 times as often as "stay"

run:	7
stay:	355 wins out of 1000 games
switch:	668 wins out of 1000 games
result:	"switch" won 1.88 times as often as "stay"

run:	8
stay:	311 wins out of 1000 games
switch:	665 wins out of 1000 games
result:	"switch" won 2.14 times as often as "stay"

run:	9
stay:	338 wins out of 1000 games
switch:	668 wins out of 1000 games
result:	"switch" won 1.98 times as often as "stay"

run:	10
stay:	358 wins out of 1000 games
switch:	646 wins out of 1000 games
result:	"switch" won 1.8 times as often as "stay"

It took me some time to comprehend why switching doubles your chances of winning. If you’re also having difficulty getting your head around it, this video of Derren Brown explaining how his version of the game works is helpful for grasping the logic. Enjoy.

How to use GitHub Actions and AWS CodeDeploy for automated CI/CD builds and deployment

Introduction

I recently migrated a client to a new AWS-based infrastructure, fully automated and managed via IaC (primarily Packer, Ansible and Terraform). However, a somewhat clunky old build/deploy system was still in use, so it was also time to migrate that to a new automated CI/CD (continuous integration/continuous delivery) system for builds and deployments. Keeping costs as low as possible was a priority, so I ruled out Jenkins, since that would have meant paying to keep an additional instance running for extended periods.

Since GitHub was already in use, GitHub Actions was an obvious choice: the virtual instances (known as “runners”) used for code builds exist only for as long as necessary to run the build commands, which keeps costs to a minimum. And since the infrastructure was already running on Amazon Web Services, AWS CodeDeploy made sense as an integrated solution for deploying code. The challenge was therefore twofold: get the builds working on GitHub Actions, then connect GitHub Actions to AWS CodeDeploy for full CI/CD deployments.
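
As a rough illustration of the hand-off (not the exact workflow from the article, and with hypothetical application, deployment group and bucket names), the final step on a GitHub Actions runner can trigger CodeDeploy with a few boto3 calls like these:

import boto3

# Hypothetical names: substitute your own CodeDeploy application,
# deployment group, and the S3 location of the build artifact
codedeploy = boto3.client('codedeploy', region_name='eu-west-1')

response = codedeploy.create_deployment(
    applicationName='my-webapp',
    deploymentGroupName='my-webapp-staging',
    revision={
        'revisionType': 'S3',
        's3Location': {
            'bucket': 'my-build-artifacts',
            'key': 'my-webapp/build-123.zip',
            'bundleType': 'zip',
        },
    },
)

# Optionally block until the deployment finishes
codedeploy.get_waiter('deployment_successful').wait(
    deploymentId=response['deploymentId']
)

The same call is available as aws deploy create-deployment in the AWS CLI; either way, the runner builds the artifact, uploads it to S3, and asks CodeDeploy to roll it out to the instances.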

This simple diagram shows the desired CI/CD architecture with GitHub Actions and AWS CodeDeploy:

Continue reading “How to use GitHub Actions and AWS CodeDeploy for automated CI/CD builds and deployment”

Using SNS and procmail for Amazon Simple Email Service (SES) logging

Introduction

I run my own mail system on a Linux VPS for all incoming and outgoing email. I’m very experienced with email server administration, and the system is fully set up with modern encryption and authentication methods such as TLS, SPF, DKIM and DMARC: everything a mail server needs to maintain a good reputation and maximise deliverability.

Nevertheless, it’s becoming increasingly difficult to run an email server, or cluster of email servers, now that more and more IP ranges are being placed on private blacklists which aren’t publicly accessible and offer no facility for removing IPs. My VPS’s IP range is apparently on some internal Microsoft blacklist; my VPS provider is aware of the problem but seems unable to do anything about it. It has therefore become more or less impossible to get email through to Microsoft-hosted addresses, despite all my best efforts. The logs show the emails being accepted, usually by servers whose names end with “mail.protection.outlook.com”, but after acceptance they are apparently sent straight to the Microsoft Hotmail and Outlook equivalent of /dev/null.

I’ve therefore had to accept that it’s become necessary to relay outgoing email via a service which can ensure the best possible deliverability, and I’m now using Amazon Simple Email Service (SES) for this purpose. However, SES doesn’t offer a simple way of viewing email logs with the kind of information you get from MTAs such as Postfix, Sendmail or Exim, so I had to set something up for that. There are various solutions for this, but I just wanted something quick and easy which would sit nicely alongside my existing mail logs.
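
As a sketch of the SES side of the setup (the configuration set name and topic ARN here are hypothetical, and the procmail side is covered in the full article), boto3 can attach an SNS event destination to an SES configuration set so that sends, deliveries, bounces and complaints are published to a topic:

import boto3

ses = boto3.client('ses', region_name='eu-west-1')

# Hypothetical configuration set name
ses.create_configuration_set(ConfigurationSet={'Name': 'mail-logging'})

# Publish message events to a hypothetical SNS topic
ses.create_configuration_set_event_destination(
    ConfigurationSetName='mail-logging',
    EventDestination={
        'Name': 'sns-mail-log',
        'Enabled': True,
        'MatchingEventTypes': ['send', 'delivery', 'bounce', 'complaint'],
        'SNSDestination': {
            'TopicARN': 'arn:aws:sns:eu-west-1:123456789012:mail-log',
        },
    },
)

Outgoing mail then just needs to reference the configuration set (e.g. via the X-SES-CONFIGURATION-SET header), and the SNS topic can deliver its notifications to an address where procmail files them alongside the existing mail logs.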

Continue reading “Using SNS and procmail for Amazon Simple Email Service (SES) logging”

How to set up a Kubernetes cluster with minikube and then with Amazon EKS

Purpose of this tutorial project

Our goal is to create a Kubernetes cluster serving the output of simple-webapp via nginx. simple-webapp is a simple Python app I wrote for these kinds of projects, which outputs a basic web page as proof of concept. In a real production environment, this would be a full-blown web application of some kind.

The Kubernetes cluster will consist of the following:

  • Two cluster Nodes.
  • A simple-webapp Deployment consisting of four Pods, each running the simple-webapp container, exposed internally to nginx via a ClusterIP Service.
  • An nginx Deployment consisting of four Pods, each running an nginx container with a modified nginx.conf file supplied via a ConfigMap, which allows nginx to reverse-proxy traffic to the simple-webapp Service; this Deployment is exposed externally via a LoadBalancer Service. (A sketch of one of these objects follows this list.)
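
The article builds these objects from YAML manifests; purely as an illustrative sketch, the internal ClusterIP Service for simple-webapp could equally be created with the official Kubernetes Python client (the app label and port 8080 are assumptions for illustration):

from kubernetes import client, config

# Load credentials from the local kubeconfig (works for both minikube
# and EKS once the cluster context is configured)
config.load_kube_config()

v1 = client.CoreV1Api()

# ClusterIP Service exposing the simple-webapp Pods internally to nginx
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="simple-webapp"),
    spec=client.V1ServiceSpec(
        type="ClusterIP",
        selector={"app": "simple-webapp"},
        ports=[client.V1ServicePort(port=8080, target_port=8080)],
    ),
)

v1.create_namespaced_service(namespace="default", body=service)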

Continue reading “How to set up a Kubernetes cluster with minikube and then with Amazon EKS”

Genrify: Python app to filter Spotify library based on genre

In a temporary departure from my usual Infrastructure/DevOps topics, this article is about “Genrify”, a Python app I’ve written to select tracks from Saved Albums or Playlists on Spotify based on Artist Genre and add them to the Queue or to a new Playlist.

I got frustrated at having a Spotify library of added albums and created playlists but not being able to query it by genre. Basically, I wanted functionality like Smart Playlists in Apple’s Music app (previously iTunes), where it’s possible to say something like “select all songs from recently added albums where the genre is darkwave”. With that goal in mind, I created Genrify, a Python 3 app which uses Spotify’s API via the Spotipy Python library to achieve similar functionality with my Spotify library.
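
As a minimal sketch of the core idea (assuming Spotipy’s usual OAuth environment variables are set, and using “darkwave” as the example genre; Genrify itself does more than this), selecting saved-album tracks by artist genre looks roughly like this:

import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Assumes SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET and
# SPOTIPY_REDIRECT_URI are set in the environment
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    scope='user-library-read playlist-modify-private'))

genre = 'darkwave'  # example genre
matching_tracks = []

# Walk through all saved albums, page by page
results = sp.current_user_saved_albums(limit=50)
while results:
    for item in results['items']:
        album = item['album']
        # Genres live on the artist object, so look up the primary artist
        artist = sp.artist(album['artists'][0]['id'])
        if genre in artist['genres']:
            matching_tracks += [t['uri'] for t in album['tracks']['items']]
    results = sp.next(results) if results['next'] else None

# Add the matching tracks to a new private playlist
user_id = sp.current_user()['id']
playlist = sp.user_playlist_create(user_id, 'Genrify: ' + genre, public=False)
for i in range(0, len(matching_tracks), 100):  # 100 items max per API call
    sp.playlist_add_items(playlist['id'], matching_tracks[i:i + 100])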

Continue reading “Genrify: Python app to filter Spotify library based on genre”

Creating a CI/CD pipeline with GitHub and Multibranch Pipelines in Jenkins

My intention here is to show how to set up a simple CI/CD multibranch pipeline for teams looking to explore the continuous integration and continuous delivery end of DevOps. This pipeline provides a starting point which can be changed and expanded depending on particular team requirements.

This article assumes your team is already familiar with git and GitHub, and that you have Jenkins installed and ready to use in a location accessible from GitHub. Installing Jenkins is quite straightforward and well covered by other guides, so I won’t go into it here. I also won’t spend time explaining what CI/CD is: if you’re looking for implementation guides then you probably already know, and just want to get started.

This pipeline uses three branches in the git repository: dev, test and main. The dev branch is for development builds. Upon creation of a pull request and successful merge from dev into the test branch, the test branch is used to run a simple automated test (a minimal example is sketched below). After a successful pull request and merge from test to main, the main branch is used for delivery to a staging environment for QA testing. This is quite basic and can be changed and expanded according to team needs, e.g. feature branches for specific areas of code, additional test environments, the addition of deployment to a production environment, etc.
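
The “simple automated test” can be as basic as a single pytest file run by the test stage of the pipeline. A hypothetical example (the module and function under test are placeholders; real tests naturally depend on your app):

# test_app.py: a deliberately minimal smoke test, run by the pipeline's
# test stage with something like: pytest test_app.py

from app import build_greeting  # hypothetical function in the app under test

def test_build_greeting():
    assert build_greeting('world') == 'Hello, world!'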

The pull requests and merges are done manually so that code can be reviewed and checked for issues before merging. Apart from that, the rest of the builds, tests, and deliveries/deployments are automated.

Continue reading “Creating a CI/CD pipeline with GitHub and Multibranch Pipelines in Jenkins”

AWS Provisioning and Deployment with Linux EC2 instances using PowerShell

I didn’t expect to find myself needing to learn PowerShell for automation purposes, but I must admit I really like it. It seems sort of like an amalgam of Bash, Perl and Python. It’s an unexpectedly impressive creation from Microsoft. I’ve been using PowerShell on macOS but it can also be used easily on Linux, and Windows of course.

I created three simple PowerShell scripts for automated provisioning of Linux EC2 instances within AWS. Running these will provision an Amazon Linux 2 EC2 instance with an SSH key pair and Security Group, a webapp deployed on it, and an associated DNS record in Route 53.
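
The scripts themselves are PowerShell (linked below), but purely to illustrate the underlying steps, the provisioning boils down to API calls like these, shown here as a boto3 sketch with hypothetical names and a placeholder AMI ID:

import boto3

ec2 = boto3.client('ec2', region_name='eu-west-1')

# Hypothetical key pair and Security Group
ec2.create_key_pair(KeyName='webapp-key')
sg = ec2.create_security_group(
    GroupName='webapp-sg',
    Description='SSH and HTTP access for the webapp instance',
)
ec2.authorize_security_group_ingress(
    GroupId=sg['GroupId'],
    IpPermissions=[
        {'IpProtocol': 'tcp', 'FromPort': 22, 'ToPort': 22,
         'IpRanges': [{'CidrIp': '0.0.0.0/0'}]},
        {'IpProtocol': 'tcp', 'FromPort': 80, 'ToPort': 80,
         'IpRanges': [{'CidrIp': '0.0.0.0/0'}]},
    ],
)

# Launch an Amazon Linux 2 instance (AMI IDs vary by region)
ec2.run_instances(
    ImageId='ami-0123456789abcdef0',  # placeholder AMI ID
    InstanceType='t3.micro',
    KeyName='webapp-key',
    SecurityGroupIds=[sg['GroupId']],
    MinCount=1,
    MaxCount=1,
)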

You can find these scripts and related config on my GitHub.

Continue reading “AWS Provisioning and Deployment with Linux EC2 instances using PowerShell”

How to provision an ECS cluster and deploy a webapp on it with load-balanced Docker containers, using Ansible

I wrote a suite of Ansible playbooks to provision an ECS (Elastic Container Service) cluster on AWS, running a webapp deployed on Docker containers in the cluster and load-balanced by an ALB (Application Load Balancer), with the Docker image for the app pulled from an ECR (Elastic Container Registry) repository.

This is a follow-up to my project/article “How to use Ansible to provision an EC2 instance with an app running in a Docker container”, which explains how to get a containerised Docker app running on a regular EC2 instance, using Docker Hub as the image repo. That could work well as a simple Staging environment, but for Production it’s desirable to easily cluster and scale the containers behind a load balancer, so I came up with this solution for provisioning and deploying on ECS, which is well suited to that kind of flexibility. (To quote AWS: “Amazon ECS is a fully managed container orchestration service that makes it easy for you to deploy, manage, and scale containerized applications”.) This solution also uses Amazon’s own ECR for Docker images, rather than Docker Hub.
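
The playbooks drive this through Ansible’s AWS modules; as a rough sketch of the underlying API calls (cluster and service names, image URI and target group ARN are all hypothetical), the core ECS pieces look like this in boto3:

import boto3

ecs = boto3.client('ecs', region_name='eu-west-1')

ecs.create_cluster(clusterName='webapp-cluster')

# Task definition pointing at the image in ECR (hypothetical URI);
# hostPort 0 requests a dynamic port for ALB target registration
ecs.register_task_definition(
    family='webapp',
    containerDefinitions=[{
        'name': 'webapp',
        'image': '123456789012.dkr.ecr.eu-west-1.amazonaws.com/webapp:latest',
        'memory': 256,
        'portMappings': [{'containerPort': 80, 'hostPort': 0}],
        'essential': True,
    }],
)

# Service running several load-balanced copies of the task behind
# the ALB's target group (hypothetical ARN)
ecs.create_service(
    cluster='webapp-cluster',
    serviceName='webapp-service',
    taskDefinition='webapp',
    desiredCount=3,
    loadBalancers=[{
        'targetGroupArn': 'arn:aws:elasticloadbalancing:eu-west-1:'
                          '123456789012:targetgroup/webapp/abc123',
        'containerName': 'webapp',
        'containerPort': 80,
    }],
)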

Continue reading “How to provision an ECS cluster and deploy a webapp on it with load-balanced Docker containers, using Ansible”

How to use Ansible to provision an EC2 instance with an app running in a Docker container

I created this suite of Ansible playbooks to provision a basic AWS (Amazon Web Services) infrastructure on EC2 with a Staging instance, and to deploy a webapp on the Staging instance which runs in a Docker container, pulled from Docker Hub.

Firstly a Docker image is built locally and pushed to a private Docker Hub repository, then the EC2 SSH key and Security Groups are created, then a Staging instance is provisioned. Next, the Docker image is pulled on the Staging instance, then a Docker container is started from the image, with nginx set up on the Staging instance to proxy web requests to the container. Finally, a DNS entry is added for the Staging instance in Route 53.
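
The playbooks do all of this with Ansible modules; purely as a sketch of the first step (the repository name and tag are hypothetical), the local build-and-push looks like this with the Docker SDK for Python:

import docker

client = docker.from_env()

# Build the webapp image from the local Dockerfile (hypothetical tag)
image, build_logs = client.images.build(path='.', tag='myuser/webapp:latest')

# Push it to the private Docker Hub repository (assumes credentials
# have already been stored via 'docker login')
client.images.push('myuser/webapp', tag='latest')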

This is a simple Ansible framework to serve as a basis for building Docker images for your webapp and deploying them as containers on Amazon EC2. It can be expanded in multiple ways, the most obvious being to add an auto-scaled Production environment with Docker containers and a load balancer. (For Ansible playbooks suitable for provisioning an auto-scaled Production environment, check out my previous article and associated files “How to use Ansible for automated AWS provisioning”.) More complex apps could be split across multiple Docker containers for handling front-end and back-end components, so this could also be added as needed.

Continue reading “How to use Ansible to provision an EC2 instance with an app running in a Docker container”

How to automate provisioning and deployment of RabbitMQ with cert-manager on a Kubernetes cluster in GKE within GCP

I was brought in by a startup to set up their core infrastructure so that it functioned as needed and could be provisioned and deployed safely and efficiently through automation. The key requirement was making RabbitMQ work only with secure certificate-based connections – the AMQPS protocol, rather than AMQP – for security and compliance purposes. This needed to be done within a Kubernetes cluster, for persistent storage and shared state via StatefulSets, ease of scaling and deployment, and general flexibility. It also had to be set up on GCP (Google Cloud Platform), which the startup already used and didn’t want to move away from at this stage, so GKE (Google Kubernetes Engine) was used for the Kubernetes cluster.
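
To make the requirement concrete: once everything is in place, clients connect over AMQPS on port 5671 with TLS, which in Python with pika looks roughly like this (the hostname and credentials are placeholders):

import ssl

import pika

# Standard TLS context, which verifies the server's certificate chain
# (a Let's Encrypt chain in this setup)
context = ssl.create_default_context()

params = pika.ConnectionParameters(
    host='rabbitmq.example.com',  # placeholder hostname
    port=5671,                    # AMQPS port
    ssl_options=pika.SSLOptions(context, 'rabbitmq.example.com'),
    credentials=pika.PlainCredentials('user', 'password'),  # placeholders
)

connection = pika.BlockingConnection(params)
channel = connection.channel()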

Getting certificates for use with RabbitMQ within Kubernetes required setting up cert-manager for certificate management, which in turn needed ingress-nginx to allow incoming connections for Let’s Encrypt verification so that certificates could be issued.

I successfully solved the problems and fulfilled the requirements, though it’s still a “work in progress” to some extent: some of the config is a little “rough and ready” and could be improved with more modularisation and better use of variables and secrets, and while the initial cluster provisioning is fully automated with Terraform, the rest is currently only semi-automated. So there is room for further improvement.

All the code and documentation is available in my GitHub repository. Below I will explain the whole process from start to finish.

Continue reading “How to automate provisioning and deployment of RabbitMQ with cert-manager on a Kubernetes cluster in GKE within GCP”