Automating Access Through a Jump Host - AWS

This is a continuation of a previous article and gets a lot more technical. It is not a part 2, but a branch. Think of it as a hub-and-spoke collection of articles. This is the Amazon Web Services (AWS) spoke.

I will be focusing on how to automate the user management on a jump host within the AWS cloud.

Security First

I know I am going to ruffle some feathers with this, but I find AWS takes more of a hard-line approach to security than the other public cloud providers. They all meet the same certification requirements, but when it comes to the intricate functionality, AWS takes the security-over-user-experience route.

Examples:

  • No details provided on login, switch role, or password reset failures.
  • Only a single second factor option (virtual, U2F, etc) allowed for users and root.
  • Identity Access Management (IAM) policy trust relationships can’t be granted to a group.
  • AWS Managed keys cannot be extended to or shared with other accounts within the organization.
  • You cannot change the key pair on a deployed instance without redeploying the instance.
  • There is no way to add additional local users through the AWS console.

The other cloud providers offer tools to make similar tasks easier or have transparent sharing of encrypted data and keys across subscriptions (Azure), projects (GCP), or regions.

Help, I Locked My Keys in My Instance!

In case you missed that, let me repeat it. With AWS you can’t manage additional local user accounts from outside of the instance and you can’t change the assigned key pair once an instance has been deployed. If you lose the private key and didn’t create a secondary sudo-capable account, you are going to have a bad day. If someone outside your organization acquires your private key, and that key is used across your instances, you are going to have a bad week or even month.

There is an involved way to reset the key stored on the instance. It involves using Systems Manager to run the AWSSupport-ResetAccess automation. This process will deploy another instance with EC2Rescue installed on it, shut down your instance, create an AMI, inject the new key pair, and deploy a replacement instance using the updated AMI. I assume it will create an EC2Rescue instance for each instance you target with the automation, which means it will not scale well if you need to reset keys on a large number of instances. That is a lot of effort for a lapse in securing your private keys.
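For reference, the automation can also be started from the AWS CLI instead of the console. This is a minimal sketch, assuming InstanceId is the only parameter you need to supply and that the document's defaults for everything else suit your account. The function name reset_access is my own. It starts one execution per instance, which is where the scaling concern above comes from.

```shell
# Start the AWSSupport-ResetAccess automation once per instance ID.
# Assumption: the document defaults (rescue instance type, subnet) are acceptable.
reset_access() {
    for instance_id in "$@"; do
        aws ssm start-automation-execution \
            --document-name "AWSSupport-ResetAccess" \
            --parameters "InstanceId=$instance_id" \
            --query AutomationExecutionId --output text
    done
}

# Usage (requires the AWS CLI and permission to run the automation):
#   reset_access i-0123456789abcdef0 i-0fedcba9876543210
```

Each call prints the execution ID, which you can follow in the Systems Manager console.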

Make sure your private keys are locked away in a virtual vault/keychain/keyring and never stored on an unencrypted disk or handed out.

Maintaining the User Accounts

Without a built-in way to manage local user accounts on instances, the problem is left to the administrator to solve. The challenge is keeping the list of who can gain access through the jump host current, while making that management as painless as possible.

A Jump Host is becoming a popular way of securing the entry into public cloud environments, even when a VPN is available. There just isn’t a quick and easy way to automate local user management in a secure fashion in AWS.

Elevating Privileges

On Linux, the all-powerful and mighty administrator account is called root. Some user accounts can assume root by way of a command called sudo. These users are referred to as sudoers. Using sudo to assume root usually requires the user to re-enter their password. When password authentication is turned off in favor of SSH keys, user accounts don’t have passwords. The password requirement for sudo gets disabled, and nothing except the optional passphrase on the private key prevents a sudoer from assuming root.

Most of the public clouds offer built-in tools for creating or resetting local administrator accounts. Managing basic user accounts is an afterthought and is generally not possible. Sudo access is handed out indiscriminately. This is kind of acceptable for internal servers (anyone who needs SSH access to your internal servers most likely needs to do administrator tasks). This is not acceptable for an externally facing Jump Host.
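To see how much passwordless sudo an image hands out, you can look for NOPASSWD entries in the sudoers drop-in directory. This is a rough sketch, assuming the image keeps those entries under /etc/sudoers.d (Ubuntu cloud images, for example, have cloud-init write one for the default user). The function name is my own.

```shell
# List sudoers entries that grant passwordless sudo.
# $1: directory of sudoers drop-in files (defaults to /etc/sudoers.d).
list_nopasswd_entries() {
    dir="${1:-/etc/sudoers.d}"
    # -r: recurse, -h: omit file names, -s: stay quiet about unreadable files.
    # The second grep drops lines that are only comments.
    grep -rhs 'NOPASSWD' "$dir" | grep -v '^[[:space:]]*#'
}

# Usage (run as root, the drop-in files are not world readable):
#   list_nopasswd_entries
```

Guest accounts on a Jump Host should never show up in that output.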

What is Available

Here is the highly manual, AWS-recommended method for managing local user accounts.

https://aws.amazon.com/premiumsupport/knowledge-center/new-user-accounts-linux-instance/

For a properly secured Jump Host, we need a way to automatically create additional guest users. These guest user accounts should not be able to make changes to the host and should be added or removed when required. Such a workflow is possible on AWS, but it isn’t set up with the flick of a switch. Currently it requires granting some permissions to the Jump Host instance and deploying a script. Once that is done, the Jump Host will keep itself in sync with an AWS IAM group of users.

For an instance to talk to the cloud around it, it needs to be granted special powers. All three leading clouds offer a way to grant abilities and permissions to an instance. With AWS, you can assign an IAM role to an instance, and that role can then be used as a gateway to other AWS services. This is a seamless way of allowing the instance to authenticate and elevate privileges when required. The alternative is to store access keys for authentication on your Jump Host, and you do not want to do that.
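A quick way to confirm that the instance is really using its role, and not stray access keys, is to ask STS who the CLI is running as. With an instance role, the ARN that comes back has the form arn:aws:sts::<account>:assumed-role/<role>/<instance id>. The helper below is a small sketch of mine that pulls the role name out of such an ARN. The role and instance names in the comments are made up.

```shell
# Extract the role name from an assumed-role ARN such as
# arn:aws:sts::123456789012:assumed-role/JumpHostRole/i-0123456789abcdef0
role_from_arn() {
    case "$1" in
        arn:*:sts::*:assumed-role/*/*)
            # Drop everything up to "assumed-role/", then the session suffix.
            rest="${1#*:assumed-role/}"
            printf '%s\n' "${rest%%/*}"
            ;;
        *)
            echo "not an assumed-role ARN: $1" >&2
            return 1
            ;;
    esac
}

# On the Jump Host (requires the AWS CLI and an attached instance role):
#   role_from_arn "$(aws sts get-caller-identity --query Arn --output text)"
```

If the function reports that the ARN is not an assumed-role ARN, the CLI is picking up credentials from somewhere other than the instance role.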

Did Someone Mention a Script?

For automated Jump Host user management, you will need a script that runs on the VM instance every hour to keep everything in sync. The VM instance will require limited read only access to IAM users, IAM groups, and public SSH keys for those users.

The script will need to be able to:

  1. Authenticate with IAM without the need of access keys.
  2. Get a list of users that are members of a specific group.
  3. Get any public SSH keys for each of those users.
  4. Create or re-activate users on the local OS.
  5. Add the current SSH key(s) for each user and remove any stale ones.
  6. Disable any user accounts that don’t belong.
  7. Modify the SSH server to allow access to the users.
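Step 3 only works if each user has a public SSH key stored against their IAM user. A user or an administrator can upload one with the AWS CLI. This is a minimal sketch. The function name and the example paths are my own, and you should check the IAM documentation for which key types it accepts before relying on it.

```shell
# Upload a public SSH key to an IAM user.
# $1: IAM user name, $2: path to the public key file.
upload_iam_ssh_key() {
    user="$1"
    pubkey="$2"
    [ -r "$pubkey" ] || { echo "cannot read $pubkey" >&2; return 1; }
    aws iam upload-ssh-public-key \
        --user-name "$user" \
        --ssh-public-key-body "file://$pubkey"
}

# Usage (requires the AWS CLI and permission to manage the user's keys):
#   upload_iam_ssh_key alice "$HOME/.ssh/id_rsa.pub"
```

Adding the user to the IAM group is then all it takes for the next sync to create the account.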

Configuration Steps

Create an IAM Role and Inline Policy

We will refer to this new role as IAMKeyLookup, but you can call it whatever you like.

The role will need a trust relationship that will allow the role to be assumed by the IAM role attached to the Jump Host instance. This is one of those rare occurrences of nested role assuming. Replace anything in <> brackets with your own values.

Trust Relationship
{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<Your Account ID>:role/<Instance Role>"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
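
If you would rather create the role from the command line, the trust relationship can be rendered from your account ID and instance role name and passed to aws iam create-role. This is a sketch only. The function name, the file name trust.json, and the example role name are placeholders of mine, and it uses the newer 2012-10-17 policy version.

```shell
# Print a trust policy that lets the instance role assume the lookup role.
# $1: AWS account ID, $2: name of the role attached to the Jump Host instance.
render_trust_policy() {
    account_id="$1"
    instance_role="$2"
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::${account_id}:role/${instance_role}"},
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Usage (requires the AWS CLI and IAM administrator permissions):
#   render_trust_policy 123456789012 JumpHostInstanceRole > trust.json
#   aws iam create-role --role-name IAMKeyLookup \
#       --assume-role-policy-document file://trust.json
```
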
Role Policy

You will need an inline policy that grants iam:ListUsers, iam:GetGroup, iam:GetSSHPublicKey, iam:ListSSHPublicKeys, iam:GetUser, iam:ListGroups permissions.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "iam:ListUsers",
                "iam:GetGroup",
                "iam:GetSSHPublicKey",
                "iam:ListSSHPublicKeys",
                "iam:GetUser",
                "iam:ListGroups"
            ],
            "Resource": "arn:aws:iam::<Your Account ID>:*",
            "Effect": "Allow"
        }
    ]
}
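
Attaching the inline policy can also be done from the CLI with aws iam put-role-policy. A minimal sketch, assuming you saved the JSON above to a file. The function name, the policy name, and the file name are placeholders of mine.

```shell
# Attach a JSON policy document to a role as an inline policy.
# $1: role name, $2: inline policy name, $3: path to the policy JSON file.
attach_inline_policy() {
    role_name="$1"
    policy_name="$2"
    policy_file="$3"
    [ -r "$policy_file" ] || { echo "cannot read $policy_file" >&2; return 1; }
    aws iam put-role-policy \
        --role-name "$role_name" \
        --policy-name "$policy_name" \
        --policy-document "file://$policy_file"
}

# Usage (requires the AWS CLI and IAM administrator permissions):
#   attach_inline_policy IAMKeyLookup ReadSSHKeys lookup-policy.json
```
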
Instance Role

I am assuming you already have an IAM role attached to your Jump Host instance for Systems Manager patch deployment using the SSM agent and for publishing metrics and logs from the CloudWatch agent. Those are both processes you would want to employ on your Jump Host.

https://docs.aws.amazon.com/systems-manager/latest/userguide/what-is-systems-manager.html

https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html

To allow the script access to details from IAM, you will need to modify the instance role and attach an inline policy that grants access to assume the IAMKeyLookup role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "sts:AssumeRole"
            ],
            "Resource": "arn:aws:iam::<Your Account ID>:role/IAMKeyLookup",
            "Effect": "Allow"
        }
    ]
}
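
Before deploying anything else, it is worth confirming from the Jump Host that the nested assume actually works. The check below is a sketch of mine. It tries to assume the role and reports the result, and the session name PermissionCheck is arbitrary.

```shell
# Check that the current credentials can assume the lookup role.
# $1: AWS account ID, $2: role name (defaults to IAMKeyLookup).
check_lookup_role() {
    account_id="$1"
    role_name="${2:-IAMKeyLookup}"
    if aws sts assume-role \
        --role-arn "arn:aws:iam::${account_id}:role/${role_name}" \
        --role-session-name PermissionCheck \
        --query 'AssumedRoleUser.Arn' --output text >/dev/null 2>&1; then
        echo "ok: $role_name can be assumed"
    else
        echo "failed: $role_name cannot be assumed" >&2
        return 1
    fi
}

# Usage on the Jump Host (requires the AWS CLI):
#   check_lookup_role "$(aws sts get-caller-identity --query Account --output text)"
```

A failure here usually points at either the trust relationship or the inline policy on the instance role.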

The Script

With this script, I have eliminated as many dependencies as possible and tested it on most of the available distributions (Amazon, RedHat, CentOS, Ubuntu, and SLES). I have also tested with the CIS Benchmark marketplace images. The requirements are a recent Linux install, systemd, bash, and the AWS CLI.

To have the script run hourly, you would put it in the /etc/cron.hourly/ directory. You could also specify a custom cron schedule, if you need it to run more or less often.
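
One thing to watch when dropping the file into /etc/cron.hourly/: on Debian-based systems, run-parts skips files whose names contain a dot, so a script saved with a .sh extension may silently never run. The sketch below installs a script under a safe name. The function name and the example file name are my own.

```shell
# Copy a script into the hourly cron directory under a run-parts friendly name.
# $1: path to the script, $2: destination directory (defaults to /etc/cron.hourly).
install_hourly() {
    src="$1"
    dest_dir="${2:-/etc/cron.hourly}"
    # Drop a trailing .sh and replace any remaining dots with hyphens.
    name=$(basename "$src" .sh | tr '.' '-')
    # install copies the file and sets the mode in one step.
    install -m 0755 "$src" "$dest_dir/$name" && printf '%s\n' "$dest_dir/$name"
}

# Usage (run as root):
#   install_hourly ./jumphost-user-sync.sh
```
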

#!/bin/bash
# Supply the IAM group whose members need local user accounts created.
GROUPNAME=JumpHostAccess
# Get the account ID of the current account. In the future I will allow providing this so that you can.
ACCOUNTID=$(aws sts get-caller-identity --output text --query Account)
# Assume a role that gets us access to the public keys.
# On failure the CLI prints the error (for example AccessDenied) to stderr and returns
# a non-zero exit code, so test the exit code instead of the captured output.
if ! ASSUMEDROLE=$(aws sts assume-role --role-arn "arn:aws:iam::$ACCOUNTID:role/IAMKeyLookup" --role-session-name JumphostUserSync --output text); then
    echo "An IAM role needs to be attached to the instance and have permission to assume the IAMKeyLookup role" >&2
    exit 1
fi
# Update the session keys with the new set.
# The unquoted echo flattens the text output onto one line, so the fields are positional.
export AWS_ACCESS_KEY_ID=$(echo $ASSUMEDROLE | awk '{print $5}')
export AWS_SECRET_ACCESS_KEY=$(echo $ASSUMEDROLE | awk '{print $7}')
export AWS_SESSION_TOKEN=$(echo $ASSUMEDROLE | awk '{print $8}')
# Get a list of users that should have access.
# Stop if the lookup fails; an empty list would otherwise disable every synced account.
if ! USERLIST=$(aws iam get-group --group-name "$GROUPNAME" --query 'Users[].UserName' --output text); then
    echo "Unable to list the members of the $GROUPNAME group" >&2
    exit 1
fi
# Make sure everything is lowercase and separated by single spaces.
USERLIST=$(echo ${USERLIST,,})
# Go through the list of IAM users.
for IAMUSERNAME in $USERLIST
do
    SSHDIR="/home/$IAMUSERNAME/.ssh"
    BACKUP="$SSHDIR/authorized_keys-$(date +%Y%m%d)"
    # Create a local user account for each IAM user.
    if ! getent passwd "$IAMUSERNAME" &>/dev/null; then
        useradd -c "AWS User" -m -s /bin/bash -k /etc/skel "$IAMUSERNAME"
    else
        # Enable the local user account if it exists.
        usermod --expiredate "" "$IAMUSERNAME" > /dev/null 2>&1
    fi
    # Create the hidden ssh directory for storing keys.
    mkdir -p "$SSHDIR"
    # Backup any existing authorized_keys file.
    if [ -f "$SSHDIR/authorized_keys" ]; then
        mv "$SSHDIR/authorized_keys" "$BACKUP"
    fi
    # Populate a new authorized_keys file with public keys from the IAM user.
    for KEYID in $(aws iam list-ssh-public-keys --user-name "$IAMUSERNAME" --query "SSHPublicKeys[?Status=='Active'].SSHPublicKeyId" --output text)
    do
        KEYBODY=$(aws iam get-ssh-public-key --user-name "$IAMUSERNAME" --ssh-public-key-id "$KEYID" --encoding "SSH" --query 'SSHPublicKey.SSHPublicKeyBody' --output text)
        echo "$KEYBODY" >> "$SSHDIR/authorized_keys"
    done
    if [ -f "$SSHDIR/authorized_keys" ]; then
        # Remove the backup keys file if nothing changed (diff -q succeeds when the files match).
        if [ -f "$BACKUP" ] && diff -q "$SSHDIR/authorized_keys" "$BACKUP" &>/dev/null; then
            rm "$BACKUP"
        fi
        # Make sure the file is protected. SSH server will fail to use it if it is not protected.
        chmod 0600 "$SSHDIR/authorized_keys"
    fi
    # Lock down the permissions. The trailing colon selects the user's login group,
    # which also works on distributions that do not create a group per user.
    chown -R "$IAMUSERNAME": "$SSHDIR"
    chmod 700 "$SSHDIR"
done
# Explicitly allow users to log in through SSH. This is needed for CIS and hardened images.
# This expects sshd_config to already contain a line starting with "AllowUsers ubuntu".
# USERLIST is a single space-separated line at this point, so sed can take it directly and no eval is needed.
sed -i "s/^AllowUsers ubuntu.*/AllowUsers ubuntu $USERLIST/" /etc/ssh/sshd_config
# The sshd service needs to be restarted before the new AllowUsers change takes effect.
systemctl restart sshd
# Get a list of existing users created by this script.
ACTIVELIST=$(grep "AWS User" /etc/passwd | awk -F':' '{ print $1 }')
# Go through the active list of local users and disable any that don't exist in the IAM group.
for ACTIVEUSER in $ACTIVELIST
do
    found=0
    for IAMUSERNAME in $USERLIST
    do
        if [ "$IAMUSERNAME" == "$ACTIVEUSER" ]; then
            found=1
            break
        fi
    done
    # If the user is not found, disable them.
    if [ "$found" == 0 ]; then
        echo "Disabling $ACTIVEUSER"
        usermod --expiredate 1 "$ACTIVEUSER" > /dev/null 2>&1
    fi
done

When creating a new user or changing SSH keys, you may need to wait up to an hour before the changes take effect. You could add a line to /etc/rc.local to run the script on start up. That way restarting the instance could speed up the process. If the file doesn’t exist, you can just create it.

You would deploy the script by way of cloud-init or user data.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html

To keep things short, I am going to leave that process up to you to discover.

Another piece that is required, and not always installed by default on marketplace images, is the AWS CLI. If it is not available as a package in an official repository, you will also need to install Python and pip to deploy it.

https://docs.aws.amazon.com/cli/latest/userguide/install-linux.html
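
Since the script fails in confusing ways when a dependency is missing, a small preflight check in your deployment step can save some head scratching. This is a sketch of mine, not part of the script above. It prints the name of every command it cannot find.

```shell
# Print each command from the argument list that is not available on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Usage: check the script's requirements before installing it.
#   missing=$(missing_commands bash systemctl aws)
#   [ -z "$missing" ] || echo "Install these first: $missing"
```
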

Thoughts and Ideas

Why Pull When I Can Push?

Scheduling the script to run on its own every hour takes less work to implement, but over time it will consume more CPU credits than an on-demand process. Another drawback is that when creating a new user or changing SSH keys, you may need to wait up to an hour before the changes take effect.

If you want changes to take place instantly and be more efficient you could run the script when a change is made. This would require a Lambda function to run when an event occurs. You would have CloudWatch IAM events trigger the Lambda function which will then have Systems Manager run the script on the instance.
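
To make that a little more concrete, here is a rough sketch of the two pieces, written as CLI calls instead of Lambda code. The first function prints an event pattern that matches the IAM API calls that change group membership or SSH keys, as recorded by CloudTrail. The second is the CLI equivalent of what the Lambda function would ask Systems Manager to do. The function names and the script path are hypothetical. As far as I know, IAM events are only delivered in us-east-1, so the rule would need to live there.

```shell
# Print an event pattern for IAM changes that should trigger a sync.
print_event_pattern() {
    cat <<'EOF'
{
  "source": ["aws.iam"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["iam.amazonaws.com"],
    "eventName": [
      "AddUserToGroup",
      "RemoveUserFromGroup",
      "UploadSSHPublicKey",
      "UpdateSSHPublicKey",
      "DeleteSSHPublicKey"
    ]
  }
}
EOF
}

# Ask Systems Manager to run the sync script on the Jump Host.
# $1: instance ID, $2: path of the script on the instance (hypothetical default).
trigger_sync() {
    instance_id="$1"
    script_path="${2:-/etc/cron.hourly/jumphost-user-sync}"
    aws ssm send-command \
        --document-name "AWS-RunShellScript" \
        --instance-ids "$instance_id" \
        --parameters "commands=[\"$script_path\"]"
}

# Usage (requires the AWS CLI and the SSM agent on the instance):
#   print_event_pattern > pattern.json
#   trigger_sync i-0123456789abcdef0
```
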

Fargate This!

With the plummeting price and speedy start-up time of ECS Fargate containers (now powered by Firecracker), deploying a Fargate Jump Host may be a solution. By default, Fargate containers are not designed to be connected to using SSH, but that doesn’t mean it is impossible. It would take a lot more trial and error to develop and deploy, and you would also have the extra overhead of an Application Load Balancer, but it would take you one step closer to serverless.

Why Not Lambda?

Now that Lambda can be a target for an Application Load Balancer, a truly serverless Jump Host could be possible. I would expect you would need to create some sort of SSH tunnel proxy or port an OpenSSH server over to Lambda. That fact, right there, shuts down that train of thought before it leaves the station. Dare to dream, dare to dream…

 

About James MacGowan

James started out as a web developer with an interest in hardware and open sourced software development. He made the switch to IT infrastructure and spent many years with server virtualization, networking, storage, and domain management.
After exhausting all challenges and learning opportunities provided by traditional IT infrastructure and a desire to fully utilize his developer background, he made the switch to cloud computing.
For the last 3 years he has been dedicated to automating and providing secure cloud solutions on AWS and Azure for our clients.
