Friday, November 14, 2025

Build a biomedical research agent with Biomni tools and Amazon Bedrock AgentCore Gateway


This post is co-authored with the Biomni team from Stanford.

Biomedical researchers spend roughly 90% of their time manually processing vast volumes of scattered information. This is evidenced by Genentech's challenge of processing 38 million biomedical publications in PubMed, public repositories like the Human Protein Atlas, and their internal repository of hundreds of millions of cells across hundreds of diseases. There is a rapid proliferation of specialized databases and analytical tools across different modalities, including genomics, proteomics, and pathology. Researchers must stay current with this massive landscape of tools, leaving less time for the hypothesis-driven work that drives breakthrough discoveries.

AI agents powered by foundation models offer a promising solution by autonomously planning, executing, and adapting complex research tasks. Stanford researchers built Biomni, which exemplifies this potential. Biomni is a general-purpose biomedical AI agent that integrates 150 specialized tools, 105 software packages, and 59 databases to execute sophisticated analyses such as gene prioritization, drug repurposing, and rare disease diagnosis.

However, deploying such agents in production requires robust infrastructure capable of handling computationally intensive workflows and multiple concurrent users while maintaining security and performance standards. Amazon Bedrock AgentCore is a set of comprehensive services for deploying and operating highly capable agents using any framework or model, with enterprise-grade security and scalability.

In this post, we show you how to implement a research agent using AgentCore with access to over 30 specialized biomedical database tools from Biomni, accelerating scientific discovery while maintaining enterprise-grade security and production scale. The code for this solution is available in the open-source toolkit repository of starter agents for life sciences on Amazon Web Services (AWS). The step-by-step instructions help you deploy your own tools and infrastructure, including AgentCore components, and examples.

Prototype-to-production complexity gap

Moving from a local biomedical research prototype to a production system accessible by multiple research teams requires addressing complex infrastructure challenges.

Agent deployment with enterprise security

Enterprise security challenges include OAuth-based authentication, secure tool sharing through scalable gateways, comprehensive observability for research audit trails, and automatic scaling to handle concurrent research workloads. Many promising prototypes fail to reach production because of the complexity of implementing these enterprise-grade requirements while maintaining the specialized domain expertise needed for accurate biomedical analysis.

Session-aware research context management

Biomedical research workflows often span multiple conversations and require persistent memory of previous analyses, experimental parameters, and research preferences across extended research sessions. Research agents must maintain contextual awareness of ongoing projects and remember specific protein targets, experimental conditions, and analytical preferences. All of this must be achieved while enforcing proper session isolation between different researchers and research projects in a multi-tenant production environment.

Scalable tool gateway

Implementing a reusable tool gateway that can handle concurrent requests from research agents with proper authentication and consistent performance becomes essential at scale. The gateway must enable agents to discover and use tools through secure endpoints, help agents find the right tools through contextual search capabilities, and manage both inbound authentication (verifying agent identity) and outbound authentication (connecting to external biomedical databases) in a unified service. Without this architecture, research teams face authentication complexity and reliability issues that prevent effective scaling.

Solution overview

We use Strands Agents, an open source agent framework, to build a research agent with a native tool implementation for PubMed biomedical literature search. We extended the agent's capabilities by integrating Biomni database tools, providing access to over 30 specialized biomedical databases.

The overall architecture is shown in the following diagram.

The AgentCore Gateway service centralizes Biomni database tools as secure, reusable endpoints with semantic search capabilities. The AgentCore Memory service maintains contextual awareness across research sessions using specialized strategies for research context. Security is handled by the AgentCore Identity service, which manages authentication for both users and tool access control. Deployment is streamlined with the AgentCore Runtime service, providing scalable, managed deployment with session isolation. Finally, the AgentCore Observability service enables comprehensive monitoring and auditing of research workflows, which is essential for scientific reproducibility.

Step 1 – Creating tools such as the Biomni database tools using AgentCore Gateway

In real-world use cases, we need to connect agents to different data sources. Each agent might duplicate the same tools, leading to extensive code, inconsistent behavior, and maintenance nightmares. The AgentCore Gateway service streamlines this process by centralizing tools into reusable, secure endpoints that agents can access. Combined with the AgentCore Identity service for authentication, AgentCore Gateway creates an enterprise-grade tool sharing infrastructure. To give the agent more context through reusable tools, we provided access to over 30 specialized public database APIs via the Biomni tools registered on the gateway. The gateway exposes Biomni's database tools through the Model Context Protocol (MCP), allowing the research agent to discover and invoke these tools alongside native tools like PubMed. It handles authentication, rate limiting, and error handling, providing a seamless research experience.

def create_gateway(gateway_name: str, api_spec: list) -> dict:
    # JWT authentication with Cognito
    auth_config = {
        "customJWTAuthorizer": {
            "allowedClients": [
                get_ssm_parameter("/app/researchapp/agentcore/machine_client_id")
            ],
            "discoveryUrl": 
                get_ssm_parameter("/app/researchapp/agentcore/cognito_discovery_url"),
        }
    }
    
    # Enable semantic search for Biomni tools
    search_config = {"mcp": {"searchType": "SEMANTIC"}}
    
    # Create the gateway
    gateway = gateway_client.create_gateway(
        name=gateway_name,
        roleArn=execution_role_arn,
        protocolType="MCP",
        authorizerType="CUSTOM_JWT",
        authorizerConfiguration=auth_config,
        protocolConfiguration=search_config,
        description="My App Template AgentCore Gateway",
    )
We use an AWS Lambda function to host the Biomni integration code. The Lambda function is automatically configured as an MCP target in the AgentCore Gateway. The Lambda function exposes its available tools through the API specification (api_spec.json).
# Gateway target configuration
lambda_target_config = {
    "mcp": {
        "lambda": {
            "lambdaArn": get_ssm_parameter("/app/researchapp/agentcore/lambda_arn"),
            "toolSchema": {"inlinePayload": api_spec},
        }
    }
}

# Create the target
create_target_response = gateway_client.create_gateway_target(
    gatewayIdentifier=gateway_id,
    name="LambdaUsingSDK",
    description="Lambda Target using SDK",
    targetConfiguration=lambda_target_config,
    credentialProviderConfigurations=[{
        "credentialProviderType": "GATEWAY_IAM_ROLE"
    }],
)
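To make the inline payload concrete, the following is a minimal sketch of one api_spec entry and a Lambda handler that routes gateway invocations to the Biomni integration code. The tool names, the biomni_tools module, and the bedrockAgentCoreToolName context key are assumptions shown for illustration; check the repository's api_spec.json and Lambda code for the actual definitions.

# Hypothetical excerpt of api_spec: each entry becomes an MCP tool on the gateway.
api_spec = [
    {
        "name": "query_uniprot",
        "description": "Query the UniProt REST API for protein sequence and functional information",
        "inputSchema": {
            "type": "object",
            "properties": {
                "prompt": {"type": "string", "description": "Search term, for example a protein or gene name"}
            },
            "required": ["prompt"],
        },
    },
    # ... one entry per Biomni database tool
]

# Minimal Lambda handler sketch: dispatch the invoked tool to the matching Biomni function.
def lambda_handler(event, context):
    # The gateway passes the resolved tool name in the Lambda client context;
    # it is typically prefixed with the target name (for example "LambdaUsingSDK___query_uniprot").
    tool_name = context.client_context.custom["bedrockAgentCoreToolName"]
    short_name = tool_name.split("___")[-1]

    if short_name == "query_uniprot":
        from biomni_tools import query_uniprot  # hypothetical module wrapping the Biomni database call
        return {"result": query_uniprot(event.get("prompt", ""))}

    return {"error": f"Unknown tool: {short_name}"}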

The full list of Biomni database tools included on the gateway is shown in the following table:

Group Tool Description
Protein and structure databases UniProt Query the UniProt REST API for comprehensive protein sequence and functional information
AlphaFold Query the AlphaFold Database API for AI-predicted protein structure predictions
InterPro Query the InterPro REST API for protein domains, families, and functional sites
PDB (Protein Data Bank) Query the RCSB PDB database for experimentally determined protein structures
STRING Query the STRING protein interaction database for protein-protein interaction networks
EMDB (Electron Microscopy Data Bank) Query for 3D macromolecular structures determined by electron microscopy
Genomics and variants ClinVar Query NCBI's ClinVar database for clinically relevant genetic variants and their interpretations
dbSNP Query the NCBI dbSNP database for single nucleotide polymorphisms and genetic variations
gnomAD Query gnomAD for population-scale genetic variant frequencies and annotations
Ensembl Query the Ensembl REST API for genome annotations, gene information, and comparative genomics
UCSC Genome Browser Query the UCSC Genome Browser API for genomic data and annotations
Expression and omics GEO (Gene Expression Omnibus) Query NCBI's GEO for RNA-seq, microarray, and other gene expression datasets
PRIDE Query the PRIDE database for proteomics identifications and mass spectrometry data
Reactome Query the Reactome database for biological pathways and molecular interactions
Clinical and drug data cBioPortal Query the cBioPortal REST API for cancer genomics data and clinical information
ClinicalTrials.gov Query the ClinicalTrials.gov API for information about clinical studies and trials
OpenFDA Query the OpenFDA API for FDA drug, device, and food safety data
GtoPdb (Guide to PHARMACOLOGY) Query the Guide to PHARMACOLOGY database for drug targets and pharmacological data
Disease and phenotype OpenTargets Query the OpenTargets Platform API for disease-target associations and drug discovery data
Monarch Initiative Query the Monarch Initiative API for phenotype and disease information across species
GWAS Catalog Query the GWAS Catalog API for genome-wide association study results
RegulomeDB Query the RegulomeDB database for regulatory variant annotations and functional predictions
Specialized databases JASPAR Query the JASPAR REST API for transcription factor binding site profiles and motifs
WoRMS (World Register of Marine Species) Query the WoRMS REST API for marine species taxonomic information
Paleobiology Database (PBDB) Query the PBDB API for fossil occurrence and taxonomic data
MPD (Mouse Phenome Database) Query the Mouse Phenome Database for mouse strain phenotype data
Synapse Query the Synapse REST API for biomedical datasets and collaborative research data

The following are examples of how individual tools are triggered through MCP, taken from our test suite:

# Protein and structure analysis
"Use uniprot tool to find information about human insulin protein"
# → Triggers the uniprot MCP tool with protein query parameters
"Use alphafold tool for structure predictions for uniprot_id P01308"
# → Triggers the alphafold MCP tool for 3D structure prediction
"Use pdb tool to find protein structures for insulin"
# → Triggers the pdb MCP tool for crystallographic structures

# Genetic variation analysis
"Use clinvar tool to find pathogenic variants in BRCA1 gene"
# → Triggers the clinvar MCP tool with gene variant parameters
"Use gnomad tool to find population frequencies for BRCA2 variants"
# → Triggers the gnomad MCP tool for population genetics data

As the tool collection grows, the agent can use built-in semantic search capabilities to discover and select tools based on the task context. This improves agent performance and reduces development complexity at scale. For example, if the user asks, "Tell me about HER2 variant rs1136201," instead of listing all 30 or more tools from the gateway back to the agent, semantic search returns the n most relevant tools, for example Ensembl, GWAS Catalog, ClinVar, and dbSNP. The agent then passes this smaller subset of tools as input to the model, returning a more efficient and faster response, as shown in the sketch that follows.
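The following is a minimal sketch of how a client could connect to the gateway's MCP endpoint, list all registered tools, and then narrow them with semantic search. The gateway URL placeholder, the bearer-token handling, and the built-in search tool name are assumptions that may differ from the repository's test scripts.

import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

GATEWAY_URL = "https://<gateway-id>.gateway.bedrock-agentcore.<region>.amazonaws.com/mcp"  # placeholder
BEARER_TOKEN = "<cognito-access-token>"  # obtained via the OAuth client credentials flow

async def main():
    headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}
    async with streamablehttp_client(GATEWAY_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Without search: the gateway returns every registered tool.
            tools = await session.list_tools()
            print(f"Total tools exposed: {len(tools.tools)}")

            # With semantic search: ask the gateway's built-in search tool for the most
            # relevant subset (the tool name below is an assumption, not from the post).
            result = await session.call_tool(
                "x_amz_bedrock_agentcore_search",
                {"query": "HER2 variant rs1136201"},
            )
            print(result.content)

asyncio.run(main())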

The following graphic illustrates using AgentCore Gateway for tool search.

AgentCore Gateway tool search

You can now test your deployed AgentCore gateway using the following test scripts and see how semantic search narrows down the list of relevant tools based on the search query.

uv run tests/test_gateway.py --prompt "What tools are available?"
uv run tests/test_gateway.py --prompt "Find information about human insulin protein" --use-search

Step 2 – Strands research agent with a local tool

The following code snippet shows model initialization and the PubMed local tool, which is declared using the Strands @tool decorator. We implemented the PubMed tool in research_tools.py; it calls PubMed APIs to enable biomedical literature search capabilities within the agent's execution context.

from agent.agent_config.tools.PubMed import PubMed

@tool(
    name="Query_pubmed",
    description=(
        "Query PubMed for relevant biomedical literature based on the user's query. "
        "This tool searches PubMed abstracts and returns relevant studies with "
        "titles, links, and summaries."
    ),
)
def query_pubmed(query: str) -> str:
    """
    Query PubMed for relevant biomedical literature based on the user's query.

    This tool searches PubMed abstracts and returns relevant studies with
    titles, links, and summaries.

    Args:
        query: The search query for PubMed literature

    Returns:
        str: Formatted results from PubMed search
    """
    pubmed = PubMed()

    print(f"\nPubMed Query: {query}\n")
    result = pubmed.run(query)
    print(f"\nPubMed Results: {result}\n")

    return result

class ResearchAgent:
    def __init__(
        self,
        bearer_token: str,
        memory_hook: MemoryHook = None,
        session_manager: AgentCoreMemorySessionManager = None,
        bedrock_model_id: str = "us.anthropic.claude-sonnet-4-20250514-v1:0",
        # bedrock_model_id: str = "openai.gpt-oss-120b-1:0",  # Alternative
        system_prompt: str = None,
        tools: List[callable] = None,
    ):

        self.model_id = bedrock_model_id
        # For Anthropic Sonnet 4 interleaved thinking
        self.model = BedrockModel(
            model_id=self.model_id,
            additional_request_fields={
                "anthropic_beta": ["interleaved-thinking-2025-05-14"],
                "thinking": {"type": "enabled", "budget_tokens": 8000},
            },
        )

        self.system_prompt = (
            system_prompt
            if system_prompt
            else """
You are a **Comprehensive Biomedical Research Agent** specialized in conducting
systematic literature reviews and multi-database analyses to answer complex biomedical research
questions. Your primary mission is to synthesize evidence from both published literature
(PubMed) and real-time database queries to provide comprehensive, evidence-based insights for
pharmaceutical research, drug discovery, and clinical decision-making.

Your core capabilities include literature analysis and extracting data from 30+ specialized
biomedical databases through the Biomni gateway, enabling comprehensive data analysis. The
database tool categories include genomics and genetics, protein structure and function, pathways
and systems biology, clinical and pharmacological data, expression and omics data, and other
specialized databases.
"""
        )
In addition, we implemented citations: a structured system prompt enforces numbered in-text citations [1], [2], [3] with standardized reference formats for both academic literature and database queries, making sure each data source is properly attributed. This lets researchers quickly access and reference the scientific literature that supports their biomedical research queries and findings. The citation rules in the system prompt are as follows:

- ALWAYS use numbered in-text citations [1], [2], [3], etc. when referencing any data source
- Provide a numbered "References" section at the end with full source details
- For academic literature: format as "1. Author et al. Title. Journal. Year. ID: [PMID/DOI], available at: [URL]"
- For database sources: format as "1. Database Name (Tool: tool_name), Query: [query_description], Retrieved: [current_date]"
- Use numbered in-text citations throughout your response to support all claims and data points
- Each tool query and each literature source must be cited with its own unique reference number
- When tools return academic papers, cite them using the academic format with full bibliographic details
- Structure: Format each reference on a separate line with proper numbering - NO bullet points
- Present the References section as a clean numbered list, not a confusing paragraph
- Maintain sequential numbering across all reference types in a single "References" section

You can now test your agent locally:

uv run tests/test_agent_locally.py --prompt "Find information about human insulin protein"
uv run tests/test_agent_locally.py --prompt "Find information about human insulin protein" --use-search

Step 3 – Add persistent memory for contextual research assistance

The research agent implements the AgentCore Memory service with three strategies: semantic for factual research context, user_preference for research methodologies, and summary for session continuity. The AgentCore Memory session manager is integrated with Strands session management, retrieving relevant context before queries and saving interactions after responses. This enables the agent to remember research preferences, ongoing projects, and domain expertise across sessions without manual context re-establishment.
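The following is a minimal sketch of how such a memory resource with the three strategies could be created using the bedrock-agentcore SDK. The MemoryClient interface, strategy keys, namespaces, and region shown here are assumptions; the repository's setup scripts may configure this differently.

from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-west-2")  # example region

# Create a memory resource with the semantic, user-preference, and summary strategies.
memory = client.create_memory_and_wait(
    name="ResearchAgentMemory",
    strategies=[
        {"semanticMemoryStrategy": {"name": "ResearchFacts", "namespaces": ["/facts/{actorId}"]}},
        {"userPreferenceMemoryStrategy": {"name": "ResearchPreferences", "namespaces": ["/preferences/{actorId}"]}},
        {"summaryMemoryStrategy": {"name": "SessionSummaries", "namespaces": ["/summaries/{actorId}/{sessionId}"]}},
    ],
)
print(memory["id"])  # memory ID used by the session manager and memory hook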

# Test memory functionality with research conversations

python tests/test_memory.py load-conversation
python tests/test_memory.py load-prompt "My preferred response format is detailed explanations"

Step 4 – Deploy with AgentCore Runtime

To deploy our agent, we use AgentCore Runtime to configure and launch the research agent as a managed service. The deployment process configures the runtime with the agent's main entrypoint (agent/main.py), assigns an IAM execution role for AWS service access, and supports both OAuth and IAM authentication modes. After deployment, the runtime becomes a scalable, serverless agent that can be invoked through API calls. The agent automatically handles session management, memory persistence, and tool orchestration while providing secure access to the Biomni gateway and local research tools.

agentcore configure --entrypoint agent/main.py -er arn:aws:iam::<Account-Id>:role/<Role> --name researchapp<AgentName>
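Once the runtime has been launched (for example with agentcore launch in the standard starter-toolkit flow, which is not shown above), it can be invoked programmatically. The following is a hedged sketch using boto3; the runtime ARN placeholder, payload shape, and response handling are assumptions that depend on how agent/main.py parses its input.

import json
import uuid
import boto3

client = boto3.client("bedrock-agentcore", region_name="us-west-2")  # example region

response = client.invoke_agent_runtime(
    agentRuntimeArn="arn:aws:bedrock-agentcore:<region>:<account-id>:runtime/<runtime-id>",
    runtimeSessionId=str(uuid.uuid4()),  # reuse the same session ID to keep conversational context
    payload=json.dumps({"prompt": "Find information about human insulin protein"}).encode("utf-8"),
)

# Non-streaming responses come back as a readable body (assumed JSON payload).
body = response["response"].read()
print(json.loads(body))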

For more information about deploying with AgentCore Runtime, see Get started with AgentCore Runtime in the Amazon Bedrock AgentCore Developer Guide.

Agents in action

The following are representative research scenarios that showcase the agent's capabilities across different domains: drug mechanism analysis, genetic variant investigation, and pathway exploration. For each query, the agent autonomously determines which combination of tools to use, formulates appropriate sub-queries, analyzes the returned data, and synthesizes a comprehensive research report with proper citations. The accompanying demo video shows the complete agent workflow, including tool selection, reasoning, and response generation.

  1. Conduct a comprehensive analysis of trastuzumab (Herceptin) mechanism of action and resistance mechanisms. You'll need:
    1. HER2 protein structure and binding sites
    2. Downstream signaling pathways affected
    3. Known resistance mechanisms from clinical data
    4. Current clinical trials investigating combination therapies
    5. Biomarkers for treatment response prediction
       Query relevant databases to provide a comprehensive research report.
  2. Analyze the clinical significance of BRCA1 variants in breast cancer risk and treatment response. Investigate:
    1. Population frequencies of pathogenic BRCA1 variants
    2. Clinical significance and pathogenicity classifications
    3. Associated cancer risks and penetrance estimates
    4. Treatment implications (PARP inhibitors, platinum agents)
    5. Current clinical trials for BRCA1-positive patients
       Use multiple databases to provide comprehensive evidence.

The following video is a demonstration of the biomedical research agent:

Scalability and observability

One of the most significant challenges in deploying sophisticated AI agents is making sure they scale reliably while maintaining comprehensive visibility into their operations. Biomedical research workflows are inherently unpredictable: a single genomic analysis might process thousands of files, while a literature review could span millions of publications. Traditional infrastructure struggles with these dynamic workloads, particularly when handling sensitive research data that requires strict isolation between different research projects. In this deployment, we use Amazon Bedrock AgentCore Observability to visualize each step in the agent workflow. You can use this service to inspect an agent's execution path, audit intermediate outputs, and debug performance bottlenecks and failures. For biomedical research, this level of transparency is not just helpful; it is essential for regulatory compliance and scientific reproducibility.

Sessions, traces, and spans form a three-tiered hierarchy in the observability framework. A session contains multiple traces, with each trace representing a discrete interaction within the broader context of the session. Each trace contains multiple spans that capture fine-grained operations. The following screenshot shows the usage of one agent: number of sessions, token usage, and error rate in production.

The following screenshot shows the agents in production and their usage (number of sessions and number of invocations).

The built-in dashboards surface performance bottlenecks and identify why certain interactions might fail, enabling continuous improvement and reducing the mean time to detect (MTTD) and mean time to repair (MTTR). For biomedical applications, where failed analyses can delay critical research timelines, this rapid issue resolution capability helps maintain research momentum.

Future directions

While this implementation focuses on only a subset of tools, the AgentCore Gateway architecture is designed for extensibility. Research teams can seamlessly add new tools without requiring code modifications by using MCP. Newly registered tools are automatically discoverable by agents, allowing your research infrastructure to evolve alongside rapidly changing tool sets.

For computational analysis that requires code execution, the AgentCore Code Interpreter service can be integrated into the research workflow. With AgentCore Code Interpreter, the research agent can retrieve data and execute Python-based analysis using domain-specific libraries like BioPython, scikit-learn, or custom genomics packages.
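As a hedged sketch of what that integration could look like, the snippet below runs a small Python analysis inside a managed Code Interpreter sandbox. The code_session helper, the executeCode action name, the response shape, and the availability of specific scientific libraries in the sandbox are assumptions and should be verified against the AgentCore documentation.

from bedrock_agentcore.tools.code_interpreter_client import code_session

# Example analysis: compute GC content of a DNA sequence with plain Python,
# so no assumptions are made about preinstalled scientific packages.
analysis_code = (
    "seq = 'ATGGCCATTGTAATGGGCCGCTG'\n"
    "gc = sum(base in 'GC' for base in seq) / len(seq)\n"
    "print(f'GC content: {gc:.2%}')"
)

with code_session("us-west-2") as sandbox:  # example region
    result = sandbox.invoke("executeCode", {"language": "python", "code": analysis_code})
    for event in result["stream"]:
        print(event)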

Future extensions could support multiple research agents collaborating on complex projects, with specialized agents for literature analysis, experimental design, data analysis, and result interpretation working together through multi-agent collaboration. Organizations could also develop specialized research agents tailored to specific therapeutic areas, disease domains, or research methodologies that share the same enterprise infrastructure and tool gateway.

Looking ahead with Biomni

“Biomni today is already useful for academic research and open exploration. But to enable real discovery, like advancing drug development, we need to move beyond prototypes and make the system enterprise-ready. Embedding Biomni into the workflows of biotech and pharma is essential to turn research potential into tangible impact.

That’s why we’re excited to integrate the open-source ecosystem with Amazon Bedrock AgentCore, bridging the gap from research to production. Looking ahead, we’re also excited about extending these capabilities with the Biomni A1 agent architecture and the Biomni-R0 model, which will unlock even more sophisticated biomedical reasoning and analysis. At the same time, Biomni will remain a thriving open-source ecosystem, where researchers and industry teams alike can contribute tools, share workflows, and push the frontier of biomedical AI together with AgentCore.”

Conclusion

This implementation demonstrates how organizations can use Amazon Bedrock AgentCore to transform biomedical research prototypes into production-ready systems. By integrating Biomni's comprehensive collection of over 150 specialized tools through the AgentCore Gateway service, we illustrate how teams can create enterprise-grade tool sharing infrastructure that scales across multiple research domains. By combining Biomni's biomedical tools with the enterprise infrastructure of Bedrock AgentCore, organizations can build research agents that maintain scientific rigor while meeting production requirements for security, scalability, and observability. Biomni's diverse tool collection, spanning genomics, proteomics, and clinical databases, exemplifies how specialized research capabilities can be centralized and shared across research teams through a secure gateway architecture.

To start building your own biomedical research agent with Biomni tools, explore the implementation by visiting our GitHub repository for the complete code and documentation. You can follow the step-by-step implementation guide to set up your research agent with local tools, gateway integration, and Bedrock AgentCore deployment. As your needs evolve, you can extend the system with your organization's proprietary databases and analytical tools. We encourage you to join the growing ecosystem of life sciences AI agents and tools by sharing your extensions and improvements.


About the authors

Hasan Poonawala is a Senior AI/ML Solutions Architect at AWS, working with Healthcare and Life Sciences customers. Hasan helps design, deploy and scale generative AI and machine learning applications on AWS. He has over 15 years of combined work experience in machine learning, software development and data science on the cloud. In his spare time, Hasan loves to explore nature and spend time with family and friends.

Pierre de Malliard is a Senior AI/ML Solutions Architect at Amazon Web Services and helps customers in the Healthcare and Life Sciences industry. He is currently based in New York City.

Necibe Ahat is a Senior AI/ML Specialist Solutions Architect at AWS, working with Healthcare and Life Sciences customers. Necibe helps customers advance their generative AI and machine learning journey. She has a background in computer science with 15 years of industry experience helping customers ideate, design, build and deploy solutions at scale. She is a passionate inclusion and diversity advocate.

Kexin Huang is a final-year PhD student in Computer Science at Stanford University, advised by Prof. Jure Leskovec. His research applies AI to enable interpretable and deployable biomedical discoveries, addressing core challenges in multi-modal modeling, uncertainty, and reasoning. His work has appeared in Nature Medicine, Nature Biotechnology, Nature Chemical Biology, Nature Biomedical Engineering and top ML venues (NeurIPS, ICML, ICLR), earning six best paper awards. His research has been highlighted by Forbes, WIRED, and MIT Technology Review, and he has contributed to AI research at Genentech, GSK, Pfizer, IQVIA, Flatiron Health, Dana-Farber, and Rockefeller University.
