Set up a Nomad cluster on Azure
This tutorial will guide you through deploying a Nomad cluster with access control lists (ACLs) enabled on Azure. Consider checking out the cluster setup overview first as it covers the contents of the code repository used in this tutorial.
Prerequisites
For this tutorial, you will need:
- Packer 1.7.7 or later installed locally
- Terraform 1.2.0 or later installed locally
- Nomad 1.3.3 or later installed locally
- An Azure account and the az CLI tool installed locally
Note
This tutorial creates Azure resources that may not qualify as part of the Azure free tier. Be sure to follow the Cleanup process at the end of this tutorial so you don't incur unnecessary charges.
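Before you start, it can help to confirm that the four command-line tools are on your PATH. The following is a minimal sketch: the function name missing_tools is hypothetical and not part of the tutorial repository, and it only checks that each command exists, not that it meets the minimum version listed above.

```shell
# Print the name of every listed command that cannot be found on the PATH.
# missing_tools is an illustrative helper, not part of the tutorial repository.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Example usage (prints nothing when all four tools are installed):
#   missing_tools packer terraform nomad az
```

If the function prints a name, install that tool before continuing.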
Clone the code repository
The cluster setup code repository contains configuration files for creating a Nomad cluster on Azure. It uses Consul for the initial setup of the Nomad servers and clients and enables ACLs for both Consul and Nomad.
Clone the code repository.
$ git clone https://github.com/hashicorp/learn-nomad-cluster-setup
Navigate to the cloned repository folder.
$ cd learn-nomad-cluster-setup
Check out the v0.3 tag of the repository as a local branch named nomad-cluster.
$ git checkout v0.3 -b nomad-cluster
Navigate to the azure folder.
$ cd azure
Set up your local environment
Before you begin, you will need to set up your environment. This includes a variables file with configurations for Packer and Terraform as well as the az CLI.
Rename variables.hcl.example to variables.hcl. You will update the variables in the file with values from the az commands that follow.
$ mv variables.hcl.example variables.hcl
Warning
The .gitignore file in the example repo is set to ignore variables.hcl, so your configurations will not be pushed if you commit your work to a source code repository. Do not commit sensitive data like credentials to your source code repository.
Open your terminal, log in to Azure with az, and follow the prompts to complete the login process.
$ az login
A web browser has been opened at https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize.
Please continue the login in the web browser. If no web browser is available or if the web browser
fails to open, use device code flow with `az login --use-device-code`.
[
{
"cloudName": "AzureCloud",
"homeTenantId": "1e472a2a-7ab3-9bd1-2016-a32fd04dfb29",
"id": "0e3e2e88-47a3-4107-a2b2-f325314dfb67",
"isDefault": true,
"managedByTenants": [
{
"tenantId": "c9ed8610-2016-4bf5-b919-437a07bf2464"
}
],
"name": "[SUBSCRIPTION_NAME]",
"state": "Enabled",
"tenantId": "1e472a2a-7ab3-9bd1-2016-a32fd04dfb29",
"user": {
"name": "[USER_EMAIL]",
"type": "user"
}
}
]
Copy the values for id and tenantId and paste them into the variables.hcl file as values for subscription_id and tenant_id. For this example, the value for subscription_id would be 0e3e2e88-47a3-4107-a2b2-f325314dfb67 and tenant_id would be 1e472a2a-7ab3-9bd1-2016-a32fd04dfb29.
azure/variables.hcl
# Packer variables (all are required)
location = "LOCATION"
subscription_id = "0e3e2e88-47a3-4107-a2b2-f325314dfb67"
tenant_id = "1e472a2a-7ab3-9bd1-2016-a32fd04dfb29"
client_id = "CLIENT_ID"
client_secret = "CLIENT_SECRET"
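If you prefer not to copy and paste by hand, the same update can be scripted. The sketch below assumes each variable in variables.hcl sits on its own line in the form name = "value". The helper name set_hcl_var is hypothetical and not part of the repository.

```shell
# Replace the value of one variable in an HCL variables file.
# Assumes the line has the form: name = "value" (any spacing around "=").
# set_hcl_var is an illustrative helper, not part of the tutorial repository.
set_hcl_var() {
  file=$1
  tmp=$(mktemp) || return 1
  # Pass the name and value through the environment so awk does not
  # interpret backslashes in the value.
  HCL_NAME=$2 HCL_VALUE=$3 awk '
    $1 == ENVIRON["HCL_NAME"] && $2 == "=" {
      print ENVIRON["HCL_NAME"] " = \"" ENVIRON["HCL_VALUE"] "\""
      next
    }
    { print }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Example usage, reading the values from the az CLI instead of the login output:
#   set_hcl_var variables.hcl subscription_id "$(az account show --query id --output tsv)"
#   set_hcl_var variables.hcl tenant_id "$(az account show --query tenantId --output tsv)"
```

The helper rewrites only the matching line and leaves comments and other variables untouched.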
Next, create an Azure service principal with the value of subscription_id in the --scopes argument.
$ az ad sp create-for-rbac \
--role="Contributor" \
--scopes="/subscriptions/0e3e2e88-47a3-4107-a2b2-f325314dfb67"
Creating 'Contributor' role assignment under scope '/subscriptions/0e3e2e88-47a3-4107-a2b2-f325314dfb67'
The output includes credentials that you must protect. Be sure that you do not include these credentials in your code or check the credentials into your source control. For more information, see https://aka.ms/azadsp-cli
{
"appId": "ab3cb7b2-c932-4eb7-89ce-a369de998a37",
"displayName": "azure-cli-2022-12-02-15-40-24",
"password": "UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS",
"tenant": "1e472a2a-7ab3-9bd1-2016-a32fd04dfb29"
}
Copy the values for appId and password and paste them into the variables.hcl file as values for client_id and client_secret. For this example, the value for client_id would be ab3cb7b2-c932-4eb7-89ce-a369de998a37 and client_secret would be UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS.
azure/variables.hcl
# Packer variables (all are required)
location = "LOCATION"
subscription_id = "0e3e2e88-47a3-4107-a2b2-f325314dfb67"
tenant_id = "1e472a2a-7ab3-9bd1-2016-a32fd04dfb29"
client_id = "ab3cb7b2-c932-4eb7-89ce-a369de998a37"
client_secret = "UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS"
Update the location variable with your Azure location preference.
azure/variables.hcl
# Packer variables (all are required)
location = "eastus"
subscription_id = "0e3e2e88-47a3-4107-a2b2-f325314dfb67"
tenant_id = "1e472a2a-7ab3-9bd1-2016-a32fd04dfb29"
client_id = "ab3cb7b2-c932-4eb7-89ce-a369de998a37"
client_secret = "UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS"
Update the retry_join variable with the values for subscription_id, tenant_id, client_id, and client_secret.
Tip
Newlines have been added to the code snippet below for readability. You do not need to add newlines to the retry_join variable.
azure/variables.hcl
# ...
# Terraform variables (all are required)
retry_join = "provider=azure tag_name=ConsulAutoJoin tag_value=auto-join
subscription_id=0e3e2e88-47a3-4107-a2b2-f325314dfb67
tenant_id=1e472a2a-7ab3-9bd1-2016-a32fd04dfb29
client_id=ab3cb7b2-c932-4eb7-89ce-a369de998a37
secret_access_key=UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS"
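Because the real value is a single line, it can be easier to assemble it from the four credentials than to edit it by hand. The following sketch builds the string in the same key order as the snippet above. The function name build_retry_join is hypothetical.

```shell
# Build the single-line retry_join value from the four Azure credentials.
# Arguments: subscription_id tenant_id client_id client_secret
# build_retry_join is an illustrative helper, not part of the tutorial repository.
build_retry_join() {
  printf 'provider=azure tag_name=ConsulAutoJoin tag_value=auto-join subscription_id=%s tenant_id=%s client_id=%s secret_access_key=%s\n' \
    "$1" "$2" "$3" "$4"
}

# Example usage with the values from this tutorial:
#   build_retry_join \
#     0e3e2e88-47a3-4107-a2b2-f325314dfb67 \
#     1e472a2a-7ab3-9bd1-2016-a32fd04dfb29 \
#     ab3cb7b2-c932-4eb7-89ce-a369de998a37 \
#     'UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS'
```

Note that the client secret is passed under the secret_access_key key, matching the snippet above.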
Create a resource group in the same Azure location you chose above. In this example, the resource group name is nomad-cluster-rg.
$ az group create -l eastus -n nomad-cluster-rg
{
"id": "/subscriptions/0e3e2e88-47a3-4107-a2b2-f325314dfb67/resourceGroups/nomad-cluster-rg",
"location": "eastus",
"managedBy": null,
"name": "nomad-cluster-rg",
"properties": {
"provisioningState": "Succeeded"
},
"tags": null,
"type": "Microsoft.Resources/resourceGroups"
}
Then, create a storage account for the Packer image in the resource group. In this example, the storage account name is nomadvms.
$ az storage account create \
-n nomadvms \
-g nomad-cluster-rg \
-l eastus \
-t Account
{
"accessTier": "Hot",
"allowBlobPublicAccess": true,
"allowCrossTenantReplication": null,
"allowSharedKeyAccess": null,
"allowedCopyScope": null,
"azureFilesIdentityBasedAuthentication": null,
"blobRestoreStatus": null,
"creationTime": "2022-12-02T16:56:57.339136+00:00",
# ...
"primaryEndpoints": {
"blob": "https://nomadvms.blob.core.windows.net/",
"dfs": "https://nomadvms.dfs.core.windows.net/",
"file": "https://nomadvms.file.core.windows.net/",
"internetEndpoints": null,
"microsoftEndpoints": null,
"queue": "https://nomadvms.queue.core.windows.net/",
"table": "https://nomadvms.table.core.windows.net/",
"web": "https://nomadvms.z13.web.core.windows.net/"
},
# ...
"statusOfPrimary": "available",
"statusOfSecondary": "available",
"storageAccountSkuConversionStatus": null,
"tags": {},
"type": "Microsoft.Storage/storageAccounts"
}
Update the resource_group_name and storage_account variables with the names you chose for them in the above commands. In this example, resource_group_name is nomad-cluster-rg and storage_account is nomadvms.
azure/variables.hcl
# ...
resource_group_name = "nomad-cluster-rg"
storage_account = "nomadvms"
The variables.hcl file now contains values for each of the variables except for image_name, which will come from the Packer build in the next section.
azure/variables.hcl
# Packer variables (all are required)
location = "eastus"
subscription_id = "0e3e2e88-47a3-4107-a2b2-f325314dfb67"
tenant_id = "1e472a2a-7ab3-9bd1-2016-a32fd04dfb29"
client_id = "ab3cb7b2-c932-4eb7-89ce-a369de998a37"
client_secret = "UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS"
resource_group_name = "nomad-cluster-rg"
storage_account = "nomadvms"
# Terraform variables (all are required)
retry_join = "provider=azure tag_name=ConsulAutoJoin tag_value=auto-join
subscription_id=0e3e2e88-47a3-4107-a2b2-f325314dfb67
tenant_id=1e472a2a-7ab3-9bd1-2016-a32fd04dfb29
client_id=ab3cb7b2-c932-4eb7-89ce-a369de998a37
secret_access_key=UVq8Q~7VPT9hIVYQ6QCtmCfUyNOTLoaIsze8IdwS"
Create the Nomad cluster
There are two main steps to creating the cluster: building a virtual machine image with Packer and provisioning the cluster infrastructure with Terraform.
Build the VM image
Initialize Packer to download the required plugins.
Tip
packer init returns no output when it finishes successfully.
$ packer init image.pkr.hcl
Then, build the image and provide the variables file with the -var-file flag.
Tip
Packer will print a Warning: Undefined variable message notifying you that some variables were set in variables.hcl but not used. This is only a warning, and the build will still complete successfully.
$ packer build -var-file=variables.hcl image.pkr.hcl
azure-arm.hashistack: output will be in this color.
==> azure-arm.hashistack: Running builder ...
==> azure-arm.hashistack: Getting tokens using client secret
==> azure-arm.hashistack: Getting tokens using client secret
azure-arm.hashistack: Creating Azure Resource Manager (ARM) client ...
==> azure-arm.hashistack: WARNING: Zone resiliency may not be supported in East US, checkout the docs at https://docs.microsoft.com/en-us/azure/availability-zones/
==> azure-arm.hashistack: Getting source image id for the deployment ...
# ...
==> azure-arm.hashistack: Cleanup requested, deleting resource group ...
==> azure-arm.hashistack: Resource group has been deleted.
Build 'azure-arm.hashistack' finished after 15 minutes 36 seconds.
==> Wait completed after 15 minutes 36 seconds
==> Builds finished. The artifacts of successful builds are:
--> azure-arm.hashistack: Azure.ResourceManagement.VMImage:
OSType: Linux
ManagedImageResourceGroupName: nomad-cluster-rg
ManagedImageName: hashistack.20221202190723
ManagedImageId: /subscriptions/0e3e2e88-47a3-4107-a2b2-f325314dfb67/resourceGroups/nomad-cluster-rg/providers/Microsoft.Compute/images/hashistack.20221202190723
ManagedImageLocation: East US
Update the variables file for Terraform
Open variables.hcl in your text editor and update image_name with the value output for ManagedImageName from the Packer build. In this example, the value would be hashistack.20221202190723.
azure/variables.hcl
# ...
# Alphanumeric and periods only
image_name = "hashistack.20221202190723"
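If you saved the Packer output to a file, the image name can be pulled out of it rather than copied by eye. This is a sketch that assumes the artifact summary has the same layout as the example output above. The function name managed_image_name and the log file name packer.log are hypothetical.

```shell
# Read Packer build output on stdin and print the ManagedImageName value.
# Matches the exact "ManagedImageName:" label, so the similarly named
# ManagedImageResourceGroupName line is ignored.
# managed_image_name is an illustrative helper, not part of the tutorial repository.
managed_image_name() {
  awk '$1 == "ManagedImageName:" { print $2; exit }'
}

# Example usage, assuming the build output was captured with tee:
#   packer build -var-file=variables.hcl image.pkr.hcl | tee packer.log
#   managed_image_name < packer.log
```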
Then, open your terminal and use the built-in uuid() function of the Terraform console to generate two new UUIDs for the token's credentials.
$ terraform console
> uuid()
> "a90a52ae-bcb7-e38a-5fe9-6ac084b37078"
> uuid()
> "d14d6a73-a0f1-508d-6d64-6b0f79e5cb44"
> exit
Copy these UUIDs and update the nomad_consul_token_id and nomad_consul_token_secret variables with the UUID values. Save the file.
In this example, the value for nomad_consul_token_id would be a90a52ae-bcb7-e38a-5fe9-6ac084b37078 and the value for nomad_consul_token_secret would be d14d6a73-a0f1-508d-6d64-6b0f79e5cb44.
azure/variables.hcl
# ...
# Alphanumeric and periods only
image_name = "hashistack.20221202190723"
nomad_consul_token_id = "a90a52ae-bcb7-e38a-5fe9-6ac084b37078"
nomad_consul_token_secret = "d14d6a73-a0f1-508d-6d64-6b0f79e5cb44"
# ...
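The Terraform console prints each UUID wrapped in double quotes, so it is easy to paste a stray quote or prompt character into the file. The sketch below checks that a value is a bare, lowercase, hyphenated UUID like the ones in the example. The function name is_uuid is hypothetical.

```shell
# Succeed only if the argument is a bare lowercase UUID (8-4-4-4-12 hex digits),
# with no surrounding quotes or prompt characters.
# is_uuid is an illustrative helper, not part of the tutorial repository.
is_uuid() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Example usage:
#   is_uuid "a90a52ae-bcb7-e38a-5fe9-6ac084b37078" && echo "looks like a UUID"
```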
The remaining variables in variables.hcl are optional.
- allowlist_ip is a CIDR range specifying which IP addresses are allowed to access the Consul and Nomad UIs on ports 8500 and 4646 as well as SSH on port 22. The default value of 0.0.0.0/0 will allow traffic from everywhere.
Note
We recommend that you update allowlist_ip to your machine's IP address or a range of trusted IPs.
- admin_password is the password for the ubuntu account on the server and client machines and can be used to access the machines over SSH.
Warning
We recommend that you update admin_password to a different value.
- name is a prefix for naming the Azure resources.
- server_instance_type and client_instance_type are the virtual machine instance types for the cluster server and client nodes, respectively.
- server_count and client_count are the number of nodes to create for the servers and clients, respectively.
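Since allowlist_ip expects a CIDR range, a single machine's IPv4 address needs a /32 suffix. The following sketch validates a dotted IPv4 address and prints it in that form. The function name to_host_cidr is hypothetical, and how you look up your public IP address is left to you.

```shell
# Validate a dotted IPv4 address and print it as a single-host CIDR range (/32).
# Fails (non-zero exit, no output) if the address does not have four
# numeric octets in the range 0-255.
# to_host_cidr is an illustrative helper, not part of the tutorial repository.
to_host_cidr() {
  printf '%s\n' "$1" | awk -F. '
    NF != 4 { exit 1 }
    {
      for (i = 1; i <= 4; i++)
        if ($i !~ /^[0-9]+$/ || $i + 0 > 255) exit 1
      print $0 "/32"
    }'
}

# Example usage with a documentation-range address:
#   to_host_cidr 203.0.113.7
```

The printed value can then be used as the allowlist_ip value in variables.hcl.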
Deploy the Nomad cluster
Initialize Terraform to download required plugins and set up the workspace.
$ terraform init
Initializing the backend...
Initializing provider plugins...
- Finding hashicorp/azurerm versions matching "3.0.0"...
- Installing hashicorp/azurerm v3.0.0...
- Installed hashicorp/azurerm v3.0.0 (signed by HashiCorp)
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
Provision the resources and provide the variables file with the -var-file flag. Respond yes to the prompt to confirm the operation. The provisioning takes several minutes. Once complete, the Consul and Nomad web interfaces will become available.
$ terraform apply -var-file=variables.hcl
# ...
Plan: 28 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
# ...
Apply complete! Resources: 28 added, 0 changed, 0 destroyed.
Outputs:
IP_Addresses = <<EOT
Client public IPs: 52.91.50.99, 18.212.78.29, 3.93.189.88
Server public IPs: 107.21.138.240, 54.224.82.187, 3.87.112.200
The Consul UI can be accessed at http://107.21.138.240:8500/ui
with the bootstrap token: dbd4d67b-4629-975c-e9a8-ff1a38ed1520
EOT
consul_bootstrap_token_secret = "dbd4d67b-4629-975c-e9a8-ff1a38ed1520"
lb_address_consul_nomad = "http://107.21.138.240"
Verify the services are in a healthy state. Navigate to the Consul UI in your web browser with the URL in the Terraform output.
Click on the Log in button and use the bootstrap token secret consul_bootstrap_token_secret from the Terraform output to log in.
Click on the Nodes page from the sidebar navigation. There are six healthy nodes, including three Consul servers and three Consul clients created with Terraform.
Set up access to Nomad
Run the post-setup.sh script.
Warning
If the nomad.token file already exists from a previous run, the script won't work until the token file has been deleted. Delete the file manually and re-run the script or use rm nomad.token && ./post-setup.sh.
Note
It may take some time for the setup scripts to complete and for the Nomad user token to become available in the Consul KV store. If the post-setup.sh script doesn't work the first time, wait a couple of minutes and try again.
$ ./post-setup.sh
The Nomad user token has been saved locally to nomad.token and deleted from the Consul KV store.
Set the following environment variables to access your Nomad cluster with the user token created during setup:
export NOMAD_ADDR=$(terraform output -raw lb_address_consul_nomad):4646
export NOMAD_TOKEN=$(cat nomad.token)
The Nomad UI can be accessed at http://107.21.138.240:4646/ui
with the bootstrap token: 22444f72-c222-bd26-6c2c-584fb9e5b698
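The wait-and-try-again advice in the note above can be automated with a small retry loop. This is a sketch only: the function name retry is hypothetical, and it assumes post-setup.sh exits with a non-zero status when the token is not yet available, which this tutorial does not confirm.

```shell
# Run a command up to a maximum number of attempts, sleeping between failures.
# Arguments: max_attempts delay_seconds command [args...]
# retry is an illustrative helper, not part of the tutorial repository.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example usage: try the script up to 5 times, waiting 60 seconds between tries.
#   retry 5 60 ./post-setup.sh
```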
Apply the export commands from the output.
$ export NOMAD_ADDR=$(terraform output -raw lb_address_consul_nomad):4646 && \
export NOMAD_TOKEN=$(cat nomad.token)
Finally, verify connectivity to the cluster with nomad node status.
$ nomad node status
ID Node Pool DC Name Class Drain Eligibility Status
06320436 default dc1 ip-172-31-18-200 <none> false eligible ready
6f5076b1 default dc1 ip-172-31-16-246 <none> false eligible ready
5fc1e22c default dc1 ip-172-31-17-43 <none> false eligible ready
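To check the result from a script rather than by reading the table, you can count the rows whose last column is ready. The sketch below assumes the table layout shown in the example output, with Status as the final column. The function name count_ready_nodes is hypothetical.

```shell
# Read "nomad node status" output on stdin and print how many nodes are ready.
# Skips the header row and counts rows whose last column is exactly "ready".
# count_ready_nodes is an illustrative helper, not part of the tutorial repository.
count_ready_nodes() {
  awk 'NR > 1 && $NF == "ready" { n++ } END { print n + 0 }'
}

# Example usage:
#   nomad node status | count_ready_nodes
```

With the example output above, the count would match the three client nodes listed.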
Navigate to the Nomad UI in your web browser with the URL in the post-setup.sh script output. Click on Sign In in the top right corner and log in with the bootstrap token saved in the NOMAD_TOKEN environment variable. Set the Secret ID to the token's value and click Sign in with secret. Click on the Clients page from the sidebar navigation.
Cleanup
Use terraform destroy to remove the provisioned infrastructure. Respond yes to the prompt to confirm removal.
$ terraform destroy -var-file=variables.hcl
# ...
azurerm_virtual_network.hashistack-vn: Destruction complete after 20s
azurerm_resource_group.hashistack: Destroying... [id=/subscriptions/c9ed8610-47a3-4107-a2b2-a322114dfb29/resourceGroups/hashistack]
azurerm_resource_group.hashistack: Still destroying... [id=/subscriptions/c9ed8610-47a3-4107-a2b2-a322114dfb29/resourceGroups/hashistack, 10s elapsed]
azurerm_resource_group.hashistack: Destruction complete after 16s
Destroy complete! Resources: 28 destroyed.
Next steps
In this tutorial you created a Nomad cluster on Azure with Consul and ACLs enabled. From here, you may want to:
- Run a job with a Nomad spec file or with Nomad Pack
- Test out native service discovery in Nomad
For more information, check out the following resources.
- Learn more about managing your Nomad cluster
- Read more about the ACL stanza and using ACLs