hughevans.dev

Aerospike Barcelona Data Management Community Meetup

I had an amazing time speaking at the Aerospike Barcelona Data Management Community Meetup this week about working with flight radar data in Apache Druid. The team at Criteo were wonderful hosts, welcoming and friendly, and the audience was really engaged, asking great questions after the talks wrapped up. I’m looking forward to speaking at another Aerospike event later this year in Copenhagen.

If you’d like to check out my talk you can watch the recording below.

read more

Backup your OpenSearch indices with manual snapshots

You’re making a change to your OpenSearch managed service and it’s all going great - right up until you make a mistake, destroying your cluster and causing you to lose all your indices. If only you had a snapshot you could restore your cluster from! Too bad you didn’t create any.

Kermit the frog makes a rookie devops error

Taking OpenSearch snapshots is relatively easy but may require making some configuration changes to your IAM roles. It’s definitely worth doing: once you’ve successfully taken a snapshot you can use it to restore the indices of a deleted, destroyed, or corrupted OpenSearch cluster, or even to create a duplicate cluster with the same data.

Prerequisites

In order to manually take snapshots you’ll need admin access to your OpenSearch service API, either via curl or OpenSearch devtools; in this guide I’ll be using the latter method.
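If you’d rather use curl, the same requests can be made against your domain endpoint. The sketch below assumes a domain with fine-grained access control and an internal user database; the endpoint and credentials are placeholders:

```
curl -XGET -u '<master username>:<master password>' \
  'https://<your opensearch domain endpoint>/_cat/repositories'
```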

Before taking a snapshot you will need to create a role that allows your OpenSearch service to write the snapshot to an S3 bucket, and grant the OpenSearch service permission to use that role. The Terraform for your IAM config should look something like the below; for more details see the AWS documentation.

IAM role

resource "aws_iam_role" "es_snapshot" {
  name                = "es-snapshot"
  managed_policy_arns = [aws_iam_policy.es_snapshot.arn]
  assume_role_policy  = <<EOF
{
  "Version" : "2012-10-17",
  "Statement" : [{
    "Sid" : "",
    "Effect" : "Allow",
    "Principal" : {
      "Service" : "es.amazonaws.com"
    },
    "Condition" : {
      "StringEquals" : {
        "aws:SourceAccount" : "<your aws account id>"
      },
      "ArnLike" : {
        "aws:SourceArn" : "<the arn for your opensearch cluster>"
      }
    },
    "Action" : "sts:AssumeRole"
  }]
}
EOF
}

Note the condition in the above Terraform statement: it limits use of this role to a specific OpenSearch service within your AWS account; without it, any OpenSearch service could assume this role.

IAM policy

resource "aws_iam_policy" "es_snapshot" {
  name = "es-snapshot-policy"
  policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [{
      "Action" : [
        "s3:ListBucket"
      ],
      "Effect" : "Allow",
      "Resource" : [
        "<arn of the s3 bucket you want to store your snapshots in>"
      ]
      },
      {
        "Action" : [
          "s3:GetObject",
          "s3:PutObject",
          "s3:DeleteObject"
        ],
        "Effect" : "Allow",
        "Resource" : [
          "<arn of the s3 bucket you want to store your snapshots in>/*"
        ]
      }
    ]
  })
}

Register a snapshot repository

In order to take a snapshot you first need to configure a snapshot repository to store your snapshots in. In this guide I’ll be covering how to do this using an S3 bucket.

First, if there isn’t one already, you will need to register a snapshot repository. You can use the GET request below to list any existing repositories (do not use cs-automated-enc; it is reserved by OpenSearch for automated snapshots).

GET _cat/repositories

If needed, register a new snapshot repository like so (note the use of the role we created in the previous section).

PUT _snapshot/opensearch-snapshots
{
  "type": "s3",
  "settings": {
    "bucket": "<your s3 bucket name>",
    "region": "eu-west-1",
    "role_arn": "<arn of your snapshot role>",
    "server_side_encryption": true
  }
}

Manually taking a snapshot

Check for any ongoing snapshots: you cannot take a snapshot if one is already in progress, and OpenSearch automatically takes snapshots periodically.

GET _snapshot/_status

Take a snapshot. Adding the date to the end of the snapshot name is optional, but I’d recommend including the current time so you can easily find the snapshot if you need to restore from it later.

PUT _snapshot/opensearch-snapshots/snapshot-2023-03-13-1135

Check snapshot progress with the first GET request below, then view the snapshot with the second once it is complete. The pretty query parameter is not required but makes the output more readable.

GET _snapshot/_status
GET _snapshot/opensearch-snapshots/_all?pretty

You should see your snapshot listed alongside any pre-existing snapshots. Congratulations, you’re now ready to restore from a snapshot should you ever need to. Don’t stop here though, I recommend that you continue with the next section to familiarise yourself with the process of restoring from a snapshot - you should also take snapshots regularly to help reduce the risk of data loss.

Restoring from a snapshot
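To give a flavour of what’s involved, a restore request looks something like the below. This sketch assumes the repository and snapshot names from the previous section, and excludes system indices, which typically conflict with those already present in the cluster:

```
POST _snapshot/opensearch-snapshots/snapshot-2023-03-13-1135/_restore
{
  "indices": "-.kibana*,-.opendistro*",
  "include_global_state": false
}
```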

read more

Clustered Keycloak SSO Deployment in AWS

Keycloak is an open source Identity and Access Management tool with features such as Single-Sign-On (SSO), Identity Brokering and Social Login, User Federation, Client Adapters, an Admin Console, and an Account Management Console.

Why use Keycloak?

There are several factors in deciding whether to use Keycloak or a SaaS IAM service like AWS SSO. SaaS IAM services are typically easier to implement, better supported, and do not require manual deployment, but Keycloak is free to use, feature-rich, and flexible.

Pre-requisites

This guide assumes you already have at least one Keycloak instance with a Postgres database configured. If this is the case, your keycloak.conf should include a section that looks something like the example below.

db=postgres
db-password=<your db password>
db-username=keycloak
db-pool-initial-size=1
db-pool-max-size=10
db-schema=public
db-url-database=keycloak
db-url-host=<url of your db>
db-url-port=5432

If you do not yet have your database configured please refer to the documentation on configuring relational databases for Keycloak.

Configuring JDBC Ping

In order for Keycloak instances to cluster they must discover each other. This can be achieved using JDBC Ping, which allows nodes to discover one another via your existing database. JDBC Ping is a convenient discovery method because it does not require creating additional AWS resources and is compatible with AWS, unlike the default discovery method (multicast), which AWS does not permit.

In order to use JDBC Ping we first need to define a transport stack. This can be done by adding the below element to the infinispan tag in your cache-ispn.xml file and replacing the placeholder values (these should match the db-password and db-url-host from your keycloak.conf file).

<jgroups>
    <stack name="jdbc-ping-tcp" extends="tcp">
        <JDBC_PING connection_driver="org.postgresql.Driver"
                    connection_username="keycloak"
                    connection_password="<your database password>"
                    connection_url="jdbc:postgresql://<url of your database>:5432/keycloak"
                    initialize_sql="CREATE TABLE IF NOT EXISTS JGROUPSPING (own_addr varchar(200) NOT NULL, cluster_name varchar(200) NOT NULL, ping_data BYTEA, constraint PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name));"
                    info_writer_sleep_time="500"
                    remove_all_data_on_view_change="true"
                    stack.combine="REPLACE"
                    stack.position="MPING" />
    </stack>
</jgroups>

We have now defined a new JGroups stack which will create a table in your database (if one doesn’t already exist) that Keycloak instances can use to discover each other; when you start a new Keycloak instance it will write its name as a new record into this table. To use this stack, simply amend the transport element as shown below to reference the newly defined stack.

<transport lock-timeout="60000" stack="jdbc-ping-tcp"/>

Configuring Security Groups

Keycloak uses Infinispan to cache data, both locally to the Keycloak instance and in remote caches. Infinispan uses port 7800 by default, so we need to configure the Security Group our Keycloak instances are deployed to so that it permits both ingress and egress on port 7800. This can be done in a number of ways, such as via the AWS Console; below is an example of configuring the ports for Keycloak using Terraform.

## keycloak cluster egress
resource "aws_security_group_rule" "keycloak_cluster_egress_to_keycloak" {
    description              = "keycloak cluster"
    from_port                = 7800
    protocol                 = "tcp"
    security_group_id        = aws_security_group.keycloak.id
    source_security_group_id = aws_security_group.keycloak.id
    to_port                  = 7800
    type                     = "egress"
}

## keycloak cluster ingress
resource "aws_security_group_rule" "keycloak_cluster_ingress_to_keycloak" {
    description              = "keycloak cluster"
    from_port                = 7800
    protocol                 = "tcp"
    security_group_id        = aws_security_group.keycloak.id
    source_security_group_id = aws_security_group.keycloak.id
    to_port                  = 7800
    type                     = "ingress"
}

Restarting Keycloak

Keycloak does not automatically apply changes made to its configuration, so you will need to restart your Keycloak instance(s) for clustering to work. First, run the following from the terminal to rebuild your Keycloak instance and register the changes we made to your configuration.

bin/kc.sh build

Once you have rebuilt Keycloak restart your Keycloak service by running the following (alternatively you can restart your Keycloak instance).

systemctl restart keycloak

Your Keycloak instances should now be running in a clustered state.

Testing your Keycloak cluster

To check that your Keycloak cluster is functioning correctly, query your database and see if the JGROUPSPING table both exists and includes the names of all instances currently in the cluster; your table should look something like the below.

own_addr cluster_name ping_data
***** ISPN *****
***** ISPN *****

If you terminate a Keycloak instance or start a new instance you should see the records in this table change.
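If you’d rather check from the command line, a query along the lines of the below (assuming the psql client and the connection details from your keycloak.conf) will show the current cluster membership; note that Postgres folds the unquoted table name to lowercase:

```
psql -h <url of your db> -U keycloak -d keycloak \
  -c "SELECT own_addr, cluster_name FROM jgroupsping;"
```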

Troubleshooting

Changes made to config files aren’t applied after building Keycloak

Ensure that the config files you have changed match those configured in keycloak.conf; this guide, for example, assumes that you have your Infinispan config file set as cache-ispn.xml in your keycloak.conf file.

cache-config-file=cache-ispn.xml

Keycloak services don’t start after changing config files

Check the Keycloak logs and ensure your database access details (password and host URL) are set correctly; if these values are incorrect, the Keycloak service will fail to start.

Resources

Use of JDBC_PING with Keycloak 17 (Quarkus distro)

Embedding Infinispan caches in Java applications

Keycloak Server caching

Clustered Keycloak SSO Deployment in AWS was originally published on the Daemon Insights blog

read more

Turn a Raspberry Pi into an IoT device with AWS

Cheap and easy IoT with AWS

read more

DALL·E 2 - what happens when machines make art?

3D render of DALL-E-2 making art in an open office on a red brick background, digital art

What is DALL·E 2?

read more

Exposing metrics to Prometheus with Service Monitors

You’ve done the hard part and added instrumentation to your application to gather metrics; now you just need to expose those metrics to Prometheus so you can monitor and alert on them. Easy, right?
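As a taste of what’s covered, with the Prometheus Operator installed a minimal ServiceMonitor looks something like the below; the app label, port name, and release label here are assumptions that must match your Service’s labels and your Prometheus instance’s serviceMonitorSelector:

```
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
      interval: 30s
```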

read more