Tuesday, June 9, 2026

Build a custom portal with embedded Amazon SageMaker AI MLflow Apps


As ML teams grow, embedding Amazon SageMaker AI MLflow Apps into a custom portal requires a scalable approach to access management. Distributing presigned URLs doesn’t scale for teams with dozens of data scientists, and granting individual AWS Management Console access adds operational overhead for administrators managing access controls. Teams that rely on SSO-integrated internal portals need their MLflow experiment tracking accessible alongside other internal applications through a single bookmarkable URL. With a custom portal, you reduce onboarding time for new team members, simplify access management, and give data scientists a consistent experience across your internal tools.

With this solution, you give your machine learning (ML) teams a persistent, bookmarkable URL to the full MLflow web UI without presigned URLs or AWS Management Console access. You can embed the MLflow experiment tracking UI directly into your organization’s SSO-integrated internal portal or custom dashboard, so users authenticate once and access experiment tracking alongside other internal tools. Your continuous integration and continuous delivery (CI/CD) pipelines and automation scripts can interact with MLflow REST APIs programmatically through the same proxy endpoint, with SigV4 authentication handled behind the scenes.

In this post, you learn how to build a custom portal with the SageMaker AI MLflow Apps UI embedded. You walk through the architecture pattern behind a React front end paired with a Flask reverse proxy that handles AWS Signature Version 4 (SigV4) authentication, deploy the entire stack through the AWS Cloud Development Kit (AWS CDK), validate the deployment, and review security considerations and cleanup procedures.

Solution overview

You deploy a custom React web application with the SageMaker AI MLflow Apps UI embedded in an iframe, backed by a Flask reverse proxy running on Amazon Elastic Compute Cloud (Amazon EC2). The architecture consists of four components that work together to give your team authenticated access to MLflow.

Application Load Balancer

The Application Load Balancer (ALB) serves as the single entry point for your users. It routes traffic for both the React dashboard and MLflow API requests to the appropriate backend targets, and it provides a stable, public-facing URL for the portal that can integrate with your organization’s existing SSO, DNS, and certificate infrastructure. It also supports custom domains and SSL/TLS termination.

Note: This implementation uses the ALB with HTTP. For production environments, you should add HTTPS with an SSL/TLS certificate from AWS Certificate Manager (ACM).

React front end portal

The React front end gives your team a branded entry point to the MLflow experience. It provides a custom portal that embeds the MLflow tracking UI in an iframe and serves as an integration point for organizational branding and additional tools. The Flask proxy serves its static files from the /app path.

Flask reverse proxy service

The Flask reverse proxy sits between the front end and the MLflow backend, handling authentication so your users never manage AWS credentials directly. A Python-based Flask application handles:

  • Intercepting incoming requests, including UI paths and REST API calls.
  • Signing each request with AWS SigV4 using temporary credentials obtained by assuming a dedicated AWS Identity and Access Management (IAM) role.
  • Forwarding signed requests to the Amazon SageMaker AI MLflow Apps endpoint.
  • Rewriting absolute MLflow URLs in HTML responses to relative paths and stripping X-Frame-Options headers so the UI renders correctly inside an iframe.
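
To make the signing step more concrete, the following is a minimal, standard-library sketch of how SigV4 headers could be computed for a forwarded request. It is not the code from the sample repository: the function names, the example endpoint, and the service name sagemaker-mlflow are assumptions made for illustration, and the temporary credentials are passed in as plain arguments rather than obtained from an assumed role. In practice you would more likely rely on botocore's SigV4Auth than hand-roll the signature.

```python
import datetime
import hashlib
import hmac
from urllib.parse import parse_qsl, quote, urlsplit


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: date -> region -> service -> 'aws4_request'."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sigv4_headers(method, url, body, access_key, secret_key, session_token,
                  region, service, now=None):
    """Return the headers a proxy would attach before forwarding the request."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    parts = urlsplit(url)
    payload_hash = hashlib.sha256(body).hexdigest()

    # Headers that take part in the signature (lowercase names).
    headers = {
        "host": parts.netloc,
        "x-amz-content-sha256": payload_hash,
        "x-amz-date": amz_date,
        "x-amz-security-token": session_token,  # present for temporary credentials
    }
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(f"{name}:{headers[name]}\n" for name in sorted(headers))

    # Query parameters are URI-encoded, then sorted.
    encoded_query = sorted(
        (quote(k, safe="-_.~"), quote(v, safe="-_.~"))
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
    )
    canonical_query = "&".join(f"{k}={v}" for k, v in encoded_query)

    canonical_request = "\n".join([
        method.upper(),
        quote(parts.path or "/", safe="/-_.~"),
        canonical_query,
        canonical_headers,
        signed_headers,
        payload_hash,
    ])
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    signing_key = derive_signing_key(secret_key, date_stamp, region, service)
    signature = hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers
```

A proxy built this way would call sigv4_headers once per incoming request and merge the returned headers into the outgoing call. The credentials themselves would come from the assumed IAM role and be refreshed before they expire.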

Amazon SageMaker AI MLflow Apps

Amazon SageMaker AI fully manages MLflow Apps for you, so there are no servers to provision or patch. Amazon SageMaker AI MLflow Apps provides experiment tracking with runs, metrics, parameters, and artifacts, along with a model registry for model versioning and lifecycle management.

This architecture supports secure communication while maintaining compatibility with existing enterprise portals. The proxy service acts as a bridge, transforming standard HTTPS requests into authenticated AWS API calls.

Architecture and request workflow

The following diagram shows how the different components work together to give your team secure, browser-based access to Amazon SageMaker AI MLflow Apps.

Here’s what happens when a user navigates to the portal:

  1. The user opens the ALB URL in their browser, either directly or through a link in your organization’s internal portal. The ALB routes the request to the Amazon EC2 instance running the Flask proxy.
  2. The Flask proxy serves the React dashboard (from the /app path). The React app renders the page and loads the MLflow UI inside an iframe pointing to /mlflow-ui/.
  3. From this point on, every request the iframe makes goes through the Flask proxy, whether it’s loading the MLflow UI pages or calling API endpoints like /api/2.0/mlflow/experiments/search. The proxy signs each request with AWS SigV4 using temporary credentials (obtained by assuming a dedicated IAM role) and forwards it to the serverless MLflow App endpoint.
  4. When the MLflow App responds, the proxy does two things before passing the response back to the browser. It rewrites absolute MLflow URLs to relative paths so that navigation works correctly through the proxy. It also strips X-Frame-Options headers so that the browser allows the content to render inside the iframe.
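
The response handling in step 4 can be sketched as a small pure function. This is an illustrative assumption about how such post-processing might look, not the proxy's actual code: the function name prepare_for_iframe, the proxy_prefix parameter, and the example origin are hypothetical.

```python
import re


def prepare_for_iframe(body, headers, upstream_origin, proxy_prefix=""):
    """Return (body, headers) adjusted so the page can render behind the proxy.

    body            -- decoded response text from the upstream MLflow endpoint
    headers         -- dict of upstream response headers
    upstream_origin -- scheme and host of the upstream endpoint (hypothetical value)
    proxy_prefix    -- optional path prefix under which the proxy serves the UI
    """
    content_type = next(
        (value for name, value in headers.items() if name.lower() == "content-type"), ""
    )
    drop = {"x-frame-options"}  # this header would block rendering in an iframe

    if content_type.lower().startswith("text/html"):
        # Turn absolute upstream URLs into root-relative paths, but only where
        # a path follows the origin, so bare mentions of the host stay intact.
        pattern = re.escape(upstream_origin.rstrip("/")) + r"(?=/)"
        body = re.sub(pattern, lambda _match: proxy_prefix, body)
        # The body length changed, so the old Content-Length is no longer valid.
        drop.add("content-length")

    kept = {name: value for name, value in headers.items() if name.lower() not in drop}
    return body, kept
```

Only HTML responses are rewritten in this sketch, and JSON API responses pass through unchanged. Header names are compared case-insensitively because upstream servers vary in how they capitalize them.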

Your users see the full MLflow tracking UI, including experiments, runs, metrics, and model registry, right in their browser, with AWS authentication handled behind the scenes.

Walkthrough

The following section walks you through how to deploy the solution.

Prerequisites

To follow along with this walkthrough, make sure you have the following prerequisites:

  • An AWS account.
  • AWS Command Line Interface (AWS CLI) v2.34.5 or later (required for the create-mlflow-app, list-mlflow-apps, and describe-mlflow-app commands).
  • Python 3.13 or later installed locally (used by the deployment script to parse JSON outputs).
  • AWS CDK v2 (aws-cdk-lib 2.243.0 or later) installed and bootstrapped in the target account and Region. For instructions, see Getting started with the AWS CDK.
  • Node.js 18.x or later installed locally for CDK deployment.
  • Python 3.13 installed on the Amazon EC2 instance (automated by the setup script).
  • Sufficient IAM permissions to create VPCs, Amazon EC2 instances, ALBs, Amazon SageMaker AI domains, MLflow Apps, and IAM roles.
  • An Ubuntu 24.04 LTS AMI available in the target AWS Region (automatically resolved using SSM Parameter Store).
  • Required knowledge:
    • Basic understanding of AWS services and IAM permissions.
    • Familiarity with Python and Flask applications.
    • Understanding of MLflow concepts and operations.
  • Cost considerations:
    • This solution creates AWS resources that may incur costs.
    • Key cost-driving resources include:
      • Amazon EC2 instances.
      • Application Load Balancer.
      • Amazon SageMaker AI resources.
      • Amazon Simple Storage Service (Amazon S3) storage.

For information about AWS service pricing, see the AWS Pricing Calculator.

Deploy the solution

This section guides you through deploying the solution in your AWS account and validating it. The deployment uses a single deploy.sh script that orchestrates CDK stack deployment and serverless MLflow App creation.

Step 1: Clone the repository and deploy the infrastructure

  1. Download the solution code and install dependencies:
    # Clone the repository
    git clone https://github.com/aws-samples/sample-sagemaker-mlflow-embedded-ui.git

    # Navigate to the project directory and install dependencies
    cd sample-sagemaker-mlflow-embedded-ui
    npm install

  2. Set your AWS account ID and Region as environment variables:
    export CDK_DEFAULT_ACCOUNT=
    export CDK_DEFAULT_REGION=
    export AWS_DEFAULT_REGION=
    export AWS_REGION=

    Note: If you previously deployed to a different Region, delete the cached context file (cdk.context.json).

  3. Bootstrap the AWS account and Region for AWS CDK. Skip this step if they are already bootstrapped.
  4. Deploy the required resources in your AWS account by running the deployment script (deploy.sh), which deploys the stacks.

    Note the ALB DNS name and Amazon EC2 instance ID from the deployment output. You need these in the following steps.

Step 2: Set up the Flask proxy service on Amazon EC2

  1. Sign in to the Amazon EC2 instance using the instance ID from Step 1. Use AWS Systems Manager Session Manager to access the instance. For detailed instructions, see the Session Manager connection guide.
  2. Install Python 3.13 and its dependencies:
    # Switch to the root user
    sudo su -
    cd /root

    # Install Python and dependencies
    chmod +x install_python13.sh
    ./install_python13.sh

    Note: This script works on Ubuntu-based systems. For other Linux distributions, verify that Python 3.12+, pip3, and virtualenv are installed using your system’s package manager.

  3. Install and start the MLflow proxy service:
    chmod +x setup_mlflow_proxy_app.sh
    ./setup_mlflow_proxy_app.sh

  4. Check the status of the Flask MLflow proxy service:
    systemctl status mlflowproxy

    If the service isn’t running, check the logs with the following command:

    journalctl -u mlflowproxy

Step 3: Validate the deployment

This section demonstrates how to interact with MLflow REST APIs through the ALB. These examples use the unsecured HTTP protocol; for production environments, HTTPS is recommended. The following examples use the curl tool to make API requests, but you can also use a tool like Postman or an equivalent.

  1. Get the ALB URL that you noted in Step 1. You can also retrieve it from the AWS CloudFormation stack output:
    aws cloudformation describe-stacks --stack-name sagemaker-infra-flaskapp --query 'Stacks[0].Outputs[?OutputKey==`ALBUrl`].OutputValue' --output text

  2. Open the ALB URL in your browser at http:///. You’re automatically redirected to /app, where the React dashboard displays the MLflow UI embedded in an iframe, as shown in the following figure.
    [Figure: React dashboard at the ALB URL with the SageMaker AI MLflow Apps experiment tracking UI embedded in an iframe]
  3. Verify the health endpoint:

    This should return {"status": "healthy"}.

  4. Test MLflow experiment tracking through the REST API.
    1. Create an experiment. Use the MLflow REST API through the ALB to create a new experiment. Note the experiment ID from the response.
      curl -X POST http:///api/2.0/mlflow/experiments/create -H "Content-Type: application/json" -d '{"name": "my-first-experiment"}'

    2. Create and log a run. Create a run under the experiment and log metrics and parameters.
      curl -X POST http:///api/2.0/mlflow/runs/create -H "Content-Type: application/json" -d '{"experiment_id": "", "run_name": "training-run-1"}'

      curl -X POST http:///api/2.0/mlflow/runs/log-parameter -H "Content-Type: application/json" -d '{"run_id": "", "key": "learning_rate", "value": "0.01"}'

      curl -X POST http:///api/2.0/mlflow/runs/log-metric -H "Content-Type: application/json" -d '{"run_id": "", "key": "accuracy", "value": 0.95, "timestamp": 1700000000000, "step": 1}'

    3. Verify the run in the React dashboard. Refresh the React dashboard in your browser at http:///app. The MLflow UI now displays the experiment, runs, metrics, and parameters you created in the preceding steps, as shown in the following figure.
      [Figure: MLflow UI in the React dashboard showing the new experiment, run, logged parameters, and metrics created through the REST API]
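
For CI/CD pipelines or automation scripts, the same sequence of REST calls can be scripted. The following is a minimal sketch using only the Python standard library. The helper names post_json and log_example_run and the base URL in the usage note are assumptions for illustration. The endpoint paths and payload fields mirror the curl examples above, and the sketch passes the experiment ID and run ID from each response into the next call.

```python
import json
import time
import urllib.request


def post_json(base_url, path, payload, opener=urllib.request.urlopen):
    """POST a JSON payload to the proxy and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8") or "{}")


def log_example_run(base_url, experiment_name, opener=urllib.request.urlopen):
    """Create an experiment and a run, then log one parameter and one metric."""
    experiment = post_json(
        base_url, "/api/2.0/mlflow/experiments/create",
        {"name": experiment_name}, opener,
    )
    run = post_json(
        base_url, "/api/2.0/mlflow/runs/create",
        {"experiment_id": experiment["experiment_id"], "run_name": "training-run-1"},
        opener,
    )
    run_id = run["run"]["info"]["run_id"]

    post_json(
        base_url, "/api/2.0/mlflow/runs/log-parameter",
        {"run_id": run_id, "key": "learning_rate", "value": "0.01"}, opener,
    )
    post_json(
        base_url, "/api/2.0/mlflow/runs/log-metric",
        {
            "run_id": run_id,
            "key": "accuracy",
            "value": 0.95,
            "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
            "step": 1,
        },
        opener,
    )
    return run_id
```

You could call log_example_run with your ALB URL as base_url, for example log_example_run("http://my-alb.example.com", "my-first-experiment"), where the host name is a placeholder. The opener parameter exists only so the function can be exercised without a network connection.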

Clean up

To avoid ongoing charges and remove the resources created by this solution, follow these cleanup steps:

  1. Run the cleanup script from the project root.

    This script tears down the deployed resources in reverse dependency order. It starts by destroying the Flask app stack, then deletes the serverless MLflow App through the AWS CLI and waits for the deletion to finish. After that, it removes the MLflow resources, Amazon SageMaker AI domain, and networking stacks. The networking stack includes an AWS Lambda-backed custom resource that automatically cleans up Amazon SageMaker AI-created Amazon Elastic File System (Amazon EFS) file systems, orphaned network interfaces, and security groups before deleting the VPC.

  2. Manual resource cleanup. The MLflow artifacts Amazon S3 bucket has a RETAIN removal policy and must be manually deleted if no longer needed. For detailed instructions, see Deleting a general purpose bucket in the Amazon S3 User Guide.
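
The idea of tearing down in reverse dependency order can be illustrated with a short sketch. The stack names and dependency edges below are assumptions inferred from the descriptions in this post, not values read from the repository. The sketch covers only the CDK stacks, not the MLflow App deletion that the script performs through the AWS CLI.

```python
from graphlib import TopologicalSorter

# Hypothetical stack names mapped to the stacks they depend on.
DEPENDS_ON = {
    "networking": set(),
    "sagemaker-domain": {"networking"},
    "sagemaker-mlflow": {"sagemaker-domain"},
    "flask-app": {"networking", "sagemaker-mlflow"},
}


def deploy_order(dependencies):
    """Return stacks ordered so that each one comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def teardown_order(dependencies):
    """Return stacks in reverse dependency order, dependents first."""
    return list(reversed(deploy_order(dependencies)))


print(teardown_order(DEPENDS_ON))
# ['flask-app', 'sagemaker-mlflow', 'sagemaker-domain', 'networking']
```

Destroying dependents first means no stack is removed while another stack still references its resources, which is why the networking stack goes last.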

CDK stack details

The solution deploys four CDK stacks, each responsible for a distinct layer of the architecture.

Networking stack

This stack creates the VPC and associated networking components, including public and private subnets, route tables, and security groups. It provides the network foundation that all other stacks depend on.

SageMaker AI area stack

This stack sets up the Amazon SageMaker AI domain, which serves as the organizational container for SageMaker AI resources. The domain provides the identity and access context needed for the MLflow App.

SageMaker MLflow stack

This stack deploys the serverless MLflow App within the SageMaker AI domain. The app stores experiments, runs, metrics, and model registry data.

Flask application stack

This stack deploys the Flask reverse proxy service on an Amazon EC2 instance behind an ALB. It handles SigV4 authentication and serves the React front end portal.

Next steps

After deploying the portal, consider extending it with these use cases:

When deploying this solution in a production environment, consider implementing these additional security measures:

  • Configure Amazon CloudWatch monitoring for the Flask-based proxy service to track application health, detect anomalies, and set up alerts for suspicious activity. For more information, see Monitor your instances using CloudWatch and Create a CloudWatch alarm based on anomaly detection.
  • Implement rate limiting for the Flask-based proxy service to protect against potential denial-of-service (DoS) attacks and control the number of requests from individual clients. You can use AWS WAF in conjunction with the Application Load Balancer to implement rate-based rules.
  • Enable HTTPS termination at the Application Load Balancer level to support secure communication between clients and your application. You can use ACM to provision and manage SSL/TLS certificates for your application. For instructions on configuring HTTPS listeners, see the Application Load Balancer HTTPS listeners documentation.

Conclusion

In this post, you learned how to build a React-based dashboard with the Amazon SageMaker AI MLflow Apps UI embedded in an iframe, backed by a Flask reverse proxy that handles SigV4 authentication. This solution helps ML infrastructure teams provide persistent, bookmarkable access to the full MLflow experiment tracking experience through a custom portal that integrates with existing organizational infrastructure.

With this approach, your team gets a persistent, bookmarkable URL for MLflow experiment tracking without presigned URLs, along with direct integration into existing SSO-protected internal portals. Users get the full MLflow UI experience, including run comparison, metric visualization, and model registry, while administrators benefit from reduced operational overhead by removing per-user console access. The entire solution is deployed as infrastructure as code with automated provisioning and cleanup. To get started, clone the sample repository and deploy the stack in your AWS account.


About the authors

Manish Garg

Manish is a Lead Consultant with AWS Professional Services, specializing in migrating and modernizing customer workloads on AWS. He is passionate about technology and has a keen interest in DevOps practices.

Ram Yennapusa

Ram is a Senior Delivery Consultant at Amazon Web Services (AWS). He works with enterprise customers to design and implement cloud-based solutions at scale, with a focus on DevOps and MLOps. Ram has over 15 years of experience in software development and cloud architecture, helping organizations navigate their cloud transformation journey. He helps customers build efficient, secure, and scalable solutions on AWS.

Ashish Bhatt

Ashish is a Senior Delivery Consultant with AWS Professional Services, specializing in designing and building solutions for customer workloads on AWS. He brings deep expertise in DevOps, MLOps, and infrastructure engineering with a focus on building scalable infrastructure and empowering development teams through modern engineering practices.
