How to Set up AWS SageMaker for Multiple Users
Introduction
Amazon SageMaker is a fully managed service that provides every machine learning (ML) developer and data scientist with the ability to build, train, and deploy ML models quickly. Amazon SageMaker Studio is a web-based, integrated development environment (IDE) for ML that lets you build, train, debug, deploy, and monitor your ML models. Amazon SageMaker Studio provides all the tools you need to take your models from experimentation to production while boosting your productivity. You can write code, track experiments, visualize data, and perform debugging and monitoring within a single, integrated visual interface.
Each user who is connected to SageMaker Studio has their own dedicated set of resources, such as a home directory on an Amazon Elastic File System (Amazon EFS) volume, compute instances and a dedicated AWS Identity and Access Management (IAM) execution role.
Most organizations have multiple users and data science teams who collaborate and all need access to SageMaker.
One of the hardest parts of setting up user access for SageMaker Studio is managing multiple users, groups, and data science teams so that data access and resource allocation stay under control.
Amazon SageMaker Studio supports the following authentication methods for onboarding users. When setting up Studio, you can pick an authentication method that you use for all your users:
IAM – Includes the following:
IAM users – Users managed in IAM
AWS account federation – Users managed in an external identity provider (IdP)
IAM Identity Center (formerly AWS Single Sign-On) – Users managed in an external IdP and federated through IAM Identity Center
In this blog, we discuss how to configure access control for teams or groups within Amazon SageMaker using Attribute-based access control (ABAC).
Attribute-based access control (ABAC) is an authorisation strategy that defines permissions based on attributes. In AWS, these attributes are called tags. You can attach tags to IAM resources, including IAM entities (users or roles) and AWS resources. You can create a single ABAC policy or a small set of policies for your IAM principals. These ABAC policies can be designed to allow operations when the principal’s tag matches the resource tag (ABAC Documentation).
ABAC is a powerful approach that you can utilize to configure AWS SageMaker Studio so that different ML and data science teams have complete isolation of team resources.
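To make the tag-matching idea concrete, here is a minimal Python sketch. It is only a simplified model of the principle, not how IAM evaluates policies internally; the function name `abac_allows` and the sample tag values are hypothetical.

```python
def abac_allows(principal_tags: dict, resource_tags: dict, tag_key: str) -> bool:
    """Return True when principal and resource carry tag_key with the same value."""
    principal_value = principal_tags.get(tag_key)
    resource_value = resource_tags.get(tag_key)
    # A tag missing on either side never matches in this model.
    if principal_value is None or resource_value is None:
        return False
    # Exact, case-sensitive comparison of the two tag values.
    return principal_value == resource_value


# Hypothetical tags: a Team A data scientist and two tagged resources.
data_scientist = {"team": "TeamA"}
team_a_experiment = {"team": "TeamA"}
team_b_experiment = {"team": "TeamB"}

print(abac_allows(data_scientist, team_a_experiment, "team"))  # True
print(abac_allows(data_scientist, team_b_experiment, "team"))  # False
```

Because the decision depends only on tags, adding a new team means tagging its role and resources rather than writing a new policy.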
This blog shows how to configure Amazon SageMaker Studio for both the AWS Identity and Access Management (IAM) and AWS IAM Identity Center (successor to AWS Single Sign-On) authentication methods. It also helps you set up IAM policies for users and roles using ABAC principles.
Typically, two groups of people interact with Amazon SageMaker Studio resources, and each needs a different level of access to fulfil its duties.
Admin User
Create, modify, and delete any IAM resource.
Create Amazon SageMaker Studio user profiles with a tag.
Sign in to the Amazon SageMaker console.
Read and describe Amazon SageMaker resources.
Data scientists/Developers
Launch an Amazon SageMaker Studio IDE assigned to a specific user.
Create Amazon SageMaker resources with necessary tags. For this post, we use the team tag.
Update, delete, and run resources created by themselves.
Sign in to the Amazon SageMaker console if an IAM user.
Read and describe Amazon SageMaker resources created by the team, i.e., having the right team tags.
Create an AWS SageMaker Domain with Source Identity feature
To restrict activity within SageMaker Studio by user profile, you can enable the sourceIdentity feature.
For example, suppose a team has two users, user 1 and user 2, and you want user 1 to be able to describe and view user 2’s artifacts (experiments, etc.) but to update or modify only their own artifacts.
sourceIdentity is turned off by default in SageMaker Studio. To enable it, use the AWS CLI when you create or update the domain. The feature is enabled at the domain level, not at the user profile level.
The following create-domain syntax shows how to enable propagation of the user profile name as the sourceIdentity when you create the domain.
aws sagemaker create-domain
    --domain-name <value>
    --auth-mode <value>
    --default-user-settings <value>
    --subnet-ids <value>
    --vpc-id <value>
    [--tags <value>]
    [--app-network-access-type <value>]
    [--home-efs-file-system-kms-key-id <value>]
    [--kms-key-id <value>]
    [--app-security-group-management <value>]
    [--domain-settings "ExecutionRoleIdentityConfig=USER_PROFILE_NAME"]
    [--cli-input-json <value>]
    [--generate-cli-skeleton <value>]
If you already have a domain, you can enable propagation of the user profile name as the sourceIdentity with the update-domain API, using the following syntax. Note that this setting can be changed only while no apps in the domain are in the InService or Pending state.
aws sagemaker update-domain
    --domain-id <value>
    [--default-user-settings <value>]
    [--domain-settings-for-update "ExecutionRoleIdentityConfig=USER_PROFILE_NAME"]
    [--cli-input-json <value>]
    [--generate-cli-skeleton <value>]
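If you manage the domain from Python rather than the AWS CLI, the same setting can be passed through boto3. The sketch below is a minimal illustration under a few assumptions: boto3 is installed and configured with credentials, the helper names `source_identity_update_kwargs` and `enable_source_identity` are our own, and the domain ID in the usage comment is a placeholder. It mirrors the update-domain call and has not been run against a live account here.

```python
def source_identity_update_kwargs(domain_id: str) -> dict:
    """Build the arguments for SageMaker's UpdateDomain call (hypothetical helper)."""
    return {
        "DomainId": domain_id,
        # Same setting as --domain-settings-for-update in the CLI syntax.
        "DomainSettingsForUpdate": {
            "ExecutionRoleIdentityConfig": "USER_PROFILE_NAME",
        },
    }


def enable_source_identity(sagemaker_client, domain_id: str) -> dict:
    """Call update_domain on a boto3 SageMaker client supplied by the caller."""
    return sagemaker_client.update_domain(**source_identity_update_kwargs(domain_id))


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   enable_source_identity(boto3.client("sagemaker"), "<domain id>")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.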
Applying your policy to the admin user
Next, attach a policy to the admin user, who is responsible for creating SageMaker Studio user profiles for the other users.
The policy requires the admin to include a tag, e.g. userId, on each profile. You can use a different name for the tag.
The SageMaker Studio console doesn’t allow adding tags when creating user profiles, so we use the AWS Command Line Interface (AWS CLI) instead.
For admin users managed in IAM, attach the following policy to the user. For admin users managed in an external IdP, add the following policy to the role that the user assumes upon federation. The policy requires the userId tag key to be present whenever the sagemaker:CreateUserProfile action is invoked.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CreateSageMakerStudioUserProfilePolicy",
            "Effect": "Allow",
            "Action": "sagemaker:CreateUserProfile",
            "Resource": "*",
            "Condition": {
                "ForAnyValue:StringEquals": {
                    "aws:TagKeys": [
                        "userId"
                    ]
                }
            }
        }
    ]
}
IAM Identity Center does not require this policy; it performs an identity check instead.
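To see why the ForAnyValue:StringEquals condition in the admin policy forces the tag to be present, it helps to spell out how such a check behaves. The sketch below is our own simplified model in Python, not IAM's implementation; the function name is hypothetical, and it compares tag keys exactly.

```python
def for_any_value_string_equals(request_values, policy_values) -> bool:
    """Simplified model: True if at least one request value equals a policy value."""
    allowed = set(policy_values)
    # any() over an empty sequence is False, so a request that carries
    # no tag keys at all does not satisfy the condition.
    return any(value in allowed for value in request_values)


required_keys = ["userId"]

# Tag keys sent with a hypothetical CreateUserProfile request.
print(for_any_value_string_equals(["userId", "team"], required_keys))  # True
print(for_any_value_string_equals(["team"], required_keys))            # False
print(for_any_value_string_equals([], required_keys))                  # False
```

In this model, a request without the userId tag key never matches the Allow statement, so the profile creation is not permitted by this policy.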
Assigning the policy to AWS SageMaker users
With multiple users, you will inevitably want to limit access to each SageMaker Studio profile to specific users.
The following policy does this by requiring the resource tag to match the username for the sagemaker:CreatePresignedDomainUrl action.
The check is performed when a user tries to open the SageMaker Studio launch URL.
As stated earlier, you choose an authentication method when setting up SageMaker Studio. The policy you attach to your users depends on that choice.
For IAM users, attach the following policy to the user. Use the IAM username for the userId tag value.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonSageMakerPresignedUrlPolicy",
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreatePresignedDomainUrl"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "sagemaker:ResourceTag/userId": "${aws:username}"
                }
            }
        }
    ]
}
For AWS account federation, where users are managed in an external Identity Provider (IdP), we will attach the following policy to the role that the user assumes after federation.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonSageMakerPresignedUrlPolicy",
            "Effect": "Allow",
            "Action": [
                "sagemaker:CreatePresignedDomainUrl"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "sagemaker:ResourceTag/userId": "${aws:PrincipalTag/userId}"
                }
            }
        }
    ]
}
The policies above apply when you choose IAM as the authentication method.
When you choose IAM Identity Center as the authentication method, no such policy is required because IAM Identity Center performs an identity check.
After attaching the policy, for AWS account federation you also need to add the following statement to the trust relationship of the role that users assume upon federation.
The statement defines the allowed transitive tag.
"Statement": [
    {
        --Existing statements
    },
    {
        "Sid": "IdentifyTransitiveTags",
        "Effect": "Allow",
        "Principal": {
            "Federated": "arn:aws:iam::<account id>:saml-provider/<identity provider>"
        },
        "Action": "sts:TagSession",
        "Condition": {
            "ForAllValues:StringEquals": {
                "sts:TransitiveTagKeys": [
                    "userId"
                ]
            }
        }
    }
]
We have created the policies and assigned them to our users. With this, you can control access to SageMaker Studio when you have multiple users.
In most companies, users are also organized in teams, e.g. data science teams and developer teams.
For these, you will need to create a role per team.
Creating roles for the teams
With per-user access in place, we now turn to teams.
We will create a role for each team, but first we must create the policies. For simplicity, we use the same policies for every team. In most cases one set of policies is enough for all teams, but you have the flexibility to create different policies for different teams.
After creating the policies, we create a role for each team, attach the policies, and tag each role with the appropriate team tag.
Creating the policies
We will create three policies that grant different privileges, but you can create them according to your needs.
Policy 1: Amazon SageMaker read-only access
This policy grants privileges to list and describe SageMaker resources.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonSageMakerListandSearchOnlyPolicy",
            "Effect": "Allow",
            "Action": [
                "sagemaker:GetSearchSuggestions",
                "sagemaker:List*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AmazonSageMakerUIandMetricsOnlyPolicy",
            "Effect": "Allow",
            "Action": [
                "sagemaker:*App",
                "sagemaker:Search",
                "sagemaker:RenderUiTemplate",
                "sagemaker:BatchGetMetrics"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AmazonSageMakerEC2ReadOnlyPolicy",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeVpcs"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AmazonSageMakerIAMReadOnlyPolicy",
            "Effect": "Allow",
            "Action": [
                "iam:ListRoles"
            ],
            "Resource": "*"
        }
    ]
}
The remaining policies, not reproduced here, cover other functions, such as granting create, read, update, and delete access to supporting AWS services.
You can customize this policy according to your needs. Since the source identity feature is enabled, you can use the aws:SourceIdentity condition key to restrict access to related services as well.
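As an illustration of the aws:SourceIdentity condition key, the Python sketch below builds a policy document that limits S3 object access to a per-user prefix. Everything specific in it is an assumption for the example: the helper name, the bucket name, the prefix layout, and the choice of S3 actions. Treat it as a starting point to adapt, not as the policy this post prescribes.

```python
import json


def s3_prefix_policy_for_profile(bucket: str, user_profile_name: str) -> dict:
    """Build a policy allowing object access under one prefix for one source identity."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowOwnPrefixOnly",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                # Hypothetical layout: one top-level prefix per user profile.
                "Resource": f"arn:aws:s3:::{bucket}/{user_profile_name}/*",
                "Condition": {
                    # With sourceIdentity enabled on the domain, this value
                    # is the Studio user profile name.
                    "StringEquals": {"aws:SourceIdentity": user_profile_name}
                },
            }
        ],
    }


# Hypothetical bucket and profile names.
policy = s3_prefix_policy_for_profile("example-team-a-bucket", "user1")
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the IAM console or attached to the team role with whichever tooling you already use.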
These policies could be combined into a single policy, but for readability it is advisable to split them by function.
After creating the policies, create and configure a role for each team and attach the policies to it.
Then tag the roles, either on the IAM console or with a CLI command. These steps are the same for all authentication types.
For example, if we have Team A, we will tag its role with the tag key = team and value = “
Creating the AWS SageMaker Studio user profile
With the IAM users and groups set up and the permissions for accessing SageMaker Studio and its services configured, we now add the userId tag when creating SageMaker Studio user profiles.
The process differs slightly depending on the authentication method you chose, so we go through each method in turn.
IAM Users
If you choose IAM as the authentication method, you will create Studio user profiles for each user by including the role that was created for the team the user belongs to.
The following sample CLI command creates a user profile that includes the tag.
aws sagemaker create-user-profile --domain-id <domain id> \
    --user-profile-name <unique profile name> \
    --tags Key=userId,Value=<aws user name> \
    --user-settings ExecutionRole=arn:aws:iam::<account id>:role/<Team Role Name>
AWS account federation
With AWS account federation, where users are managed in an external identity provider (IdP), you create a user attribute (userId) with a unique value for each user.
The following example configures the attribute using Okta as the external IdP. On Okta's SIGN ON METHODS screen, configure the following SAML 2.0 attributes.
Attribute 1:
Name: https://aws.amazon.com/SAML/Attributes/PrincipalTag:userId
Value: user.userId
Attribute 2:
Name: https://aws.amazon.com/SAML/Attributes/TransitiveTagKeys
Value: {"userId"}
Next, we will create the user profile using the following command. We will use the user attribute value in the preceding step for the userId tag value.
aws sagemaker create-user-profile --domain-id <domain id> --user-profile-name <unique profile name> --tags Key=userId,Value=<user attribute value> --user-settings ExecutionRole=arn:aws:iam::<account id>:role/<Team Role Name>
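When there are many users to onboard, the same call can be scripted. The following Python sketch builds the arguments for boto3's `create_user_profile` so that every profile gets its userId tag and its team role. The helper names and the sample values are hypothetical, and the boto3 client is passed in by the caller, so nothing here contacts AWS on its own.

```python
def user_profile_kwargs(domain_id: str, profile_name: str, user_id: str,
                        account_id: str, team_role_name: str,
                        tag_key: str = "userId") -> dict:
    """Build CreateUserProfile arguments with the ABAC tag and team role."""
    return {
        "DomainId": domain_id,
        "UserProfileName": profile_name,
        # IAM username or IdP attribute value, matching the CLI --tags option.
        "Tags": [{"Key": tag_key, "Value": user_id}],
        "UserSettings": {
            "ExecutionRole": f"arn:aws:iam::{account_id}:role/{team_role_name}"
        },
    }


def create_tagged_profiles(sagemaker_client, domain_id, account_id, users):
    """Create one profile per (profile_name, user_id, team_role_name) tuple."""
    return [
        sagemaker_client.create_user_profile(
            **user_profile_kwargs(domain_id, name, user_id, account_id, role)
        )
        for name, user_id, role in users
    ]


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   create_tagged_profiles(boto3.client("sagemaker"), "<domain id>",
#                          "<account id>", [("user1-profile", "user1", "TeamARole")])
```

Keeping the tag key as a parameter makes it easy to stay consistent if you chose a name other than userId in your policies.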
So far we have seen how to create SageMaker user profiles for IAM users and for AWS account federation.
IAM Identity Center follows a different path: it uses AWS SSO together with the Okta Universal Directory, with identity synchronization enabled between Okta and AWS SSO, after which users can connect.
Conclusion
One of the most common real-world challenges in setting up user access for SageMaker Studio is managing multiple users, groups, or teams.
In this blog, we discussed how to isolate access to SageMaker Studio using ABAC. We saw how tags can restrict access to a Studio profile to its assigned user, and how they can restrict access to a team's SageMaker Studio artifacts to members of that team.
Following the examples in this blog, you can customize the policies and apply more tags to build more complex controls when setting up SageMaker for multiple users.