An Incident is a logical entity for handling unplanned work. Each Incident includes a Title and Alerts or Context (sourced from incident management systems or entered manually). Incidents are assigned to a team, owned by a user responsible for oversight, and tracked with Start and End Times marked by users. During an Incident, the Agent collects Background Information and generates a Root Cause Analysis to support resolution efforts.
Stay tuned for updates on how DuploCloud's AI DevOps Engineers can automate incident detection, root cause analysis, and resolution workflows.
Bring Your Own Agent
While DuploCloud's built-in agents cover a wide range of standard cloud functionality, you can build your own agents tailored to your specific use cases, stitching together your organization's tool set and workflows. For example, you can build an agent that triages your customer support tickets by connecting to Zendesk and your cloud infrastructure tools.
Coming soon: case studies and descriptions of custom agents that our customers have implemented.
IDE Extension
AI DevOps Policy Model
A high-level overview of the building blocks of DuploCloud's AI DevOps Engineer
The DuploCloud AI DevOps Policy Model lays down the foundational building blocks of the system that orchestrates large and complex DevOps projects seamlessly—just as a human engineer would.
This diagram illustrates a hierarchical AI DevOps platform architecture. Administrators define permissions and integrations, Engineers are configured with capabilities and boundaries, Projects break down requirements into executable plans and tasks, and specialized Agents carry out the actual work through multi-agent orchestration in a ticket-based workflow.
Core Concepts
Below are brief introductions to the core concepts of the AI DevOps Policy Model, which will be explained in detail in subsequent sections.
1. Engineer
The Engineer is the primary entity representing an AI DevOps Engineer. Each Engineer has a unique name within your organization, is defined by a set of key attributes that govern its behavior and capabilities, and contains a container of Skills called a Persona. The Engineer operates within defined access boundaries called Scope, with specific exceptions managed through Guardrails. As it works, the Engineer builds a Knowledge Base of learned information about the environment. Resource usage is controlled through Quota settings, and the Engineer's performance is tracked through Health Metrics, which include analytics and cost data.
2. Skill
A Skill is an instruction or prompt that tells an agent how to behave or what capabilities it has. Skills represent the atomic unit of capability within the system. Examples include Kubernetes troubleshooting skills, Terraform provisioning skills, cost optimization skills, and custom skills tailored specifically to an organization's workflows or requirements.
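Conceptually, a Skill can be modeled as a named prompt. The sketch below is illustrative only; the field names (`name`, `prompt`, `tags`) are assumptions for this example, not DuploCloud's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    """Atomic unit of capability: a named instruction for an agent."""
    name: str
    prompt: str                              # instruction shaping the agent's behavior
    tags: list = field(default_factory=list) # hypothetical grouping metadata

# Example: a Kubernetes troubleshooting skill
k8s_triage = Skill(
    name="k8s-troubleshooting",
    prompt=("When a pod is unhealthy, inspect events, logs, and recent "
            "deployments before proposing a fix."),
    tags=["kubernetes", "sre"],
)
```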
3. Persona
A Persona serves as a logical container that organizes multiple Skills together. Personas help customers organize capabilities by role or function, making it easier to manage and deploy different sets of abilities. For instance, an SRE Persona might combine troubleshooting, monitoring, and incident response skills. Similarly, a Provisioning Persona could bundle Terraform skills with Kubernetes deployment skills, while a Security Persona might group compliance skills with security scanning skills.
4. Provider
A Provider is a system that an Engineer can access, including cloud platforms (AWS, Azure, GCP), repositories, MCP Servers, and more. Each Provider is associated with a specific account or namespace and requires a credential reference for authentication. Credentials can be stored in DuploCloud or referenced externally, enabling just-in-time, scoped access to resources without exposing sensitive authentication data.
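The key idea is that a Provider record holds a credential reference, never the secret itself. A minimal sketch, with hypothetical field names and a made-up reference format:

```python
from dataclasses import dataclass

@dataclass
class Provider:
    """A system an Engineer can access. Only a credential *reference* is
    stored; the secret itself stays in DuploCloud or an external vault."""
    kind: str            # e.g. "aws", "github", "mcp"
    account: str         # account ID or namespace
    credential_ref: str  # pointer to a stored credential, never the secret

aws_prod = Provider(kind="aws", account="123456789012",
                    credential_ref="duplo://credentials/aws-prod-readonly")

# The reference is resolved just-in-time at execution, so the policy
# document never contains the sensitive material itself.
```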
5. Scope
Scope defines what an Engineer can access within each Provider. Each Scope entry specifies a Provider and associated Credential, and includes granular access controls through regions, resource types, tags, or custom resource maps (key-value pairs for filtering specific resources like namespaces). Scope can also include MCP Servers for extended system access.
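Putting those pieces together, a single Scope entry might look like the following sketch. The structure and key names are assumptions for illustration, not a published schema:

```python
# One Scope entry: which Provider, which credential, and granular filters.
scope_entry = {
    "provider": "aws",
    "credential": "aws-prod-readonly",
    "regions": ["us-west-2"],
    "resource_types": ["eks", "rds", "s3"],
    "tags": {"team": "platform"},
    # Key-value resource map for filtering specific resources:
    "resource_map": {"namespace": "payments"},
    # Optional MCP Servers for extended system access:
    "mcp_servers": ["grafana-mcp"],
}
```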
6. Guardrails
Guardrails define exceptions within a Scope, specifying what an Engineer cannot access or perform. These restrictions can target specific resources to exclude, such as production database instances; specific operations that should be restricted; or entire environments that should be avoided. Guardrails provide fine-grained control over the Engineer's permissions within its broader Scope.
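The interaction between Scope and Guardrails can be sketched as a simple check: Scope grants broad access, and Guardrails carve out exceptions. The data shapes below are hypothetical, not DuploCloud's internal representation:

```python
def is_action_allowed(resource, operation, scope, guardrails):
    """Scope grants broad access; Guardrails carve out exceptions."""
    # 1. The resource must fall inside the Scope at all.
    if resource["type"] not in scope["resource_types"]:
        return False
    # 2. Guardrails: explicitly excluded resources are always denied.
    if resource["name"] in guardrails.get("excluded_resources", []):
        return False
    # 3. Guardrails: restricted operations are denied even in scope.
    if operation in guardrails.get("restricted_operations", []):
        return False
    return True

scope = {"resource_types": ["rds", "eks"]}
guardrails = {
    "excluded_resources": ["prod-orders-db"],  # a production database instance
    "restricted_operations": ["delete"],
}
```

For example, describing a staging database would pass all three checks, while any `delete` operation, or any access to `prod-orders-db`, would be denied even though both fall inside the broader Scope.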
7. Quota and Quality of Service
Quota and Quality of Service (QoS) settings control resource limits for the Engineer. These controls can include maximum concurrent projects, token limits, and cloud resource provisioning limits. The system is designed to accommodate additional options in future versions as requirements evolve.
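A minimal sketch of how such limits might be enforced, assuming hypothetical quota keys (the real option names may differ):

```python
def within_quota(usage, quota):
    """Return True if starting new work stays within all configured limits."""
    checks = [
        usage["concurrent_projects"] < quota["max_concurrent_projects"],
        usage["tokens_used"] < quota["token_limit"],
        usage["resources_provisioned"] < quota["max_provisioned_resources"],
    ]
    return all(checks)

quota = {"max_concurrent_projects": 3,
         "token_limit": 1_000_000,
         "max_provisioned_resources": 50}
```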
8. Knowledge Base
The Knowledge Base represents the Engineer's learned understanding of the environment. It captures architecture and topology information, relationships between systems, codebase structure, and historical context from completed work. This knowledge is stored both as files in customer repositories in markdown format and in a vector database within DuploCloud, enabling efficient retrieval and reference during operations.
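To make the retrieval idea concrete, here is a toy sketch of vector-style lookup over knowledge files. The bag-of-words "embedding" and file names are illustrative stand-ins; a real system uses learned embeddings and a proper vector database:

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding'; real systems use learned vectors."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Knowledge entries as they might be stored (markdown files + vectors).
knowledge = {
    "topology.md": "payments service runs on the eks cluster in us-west-2",
    "history.md": "last rds incident was caused by a connection pool leak",
}
vectors = {path: embed(text) for path, text in knowledge.items()}

def retrieve(query):
    """Return the most relevant knowledge file for a query."""
    return max(vectors, key=lambda p: cosine(embed(query), vectors[p]))
```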
Workflow concepts
Below are some of the workflow concepts in the AI DevOps Policy Model:
1. Project
A Project is a logical entity representing planned work. Each Project contains a Title and Summary. The heart of a Project is its requirements, which can be defined in plain English.
2. Requirements
Requirements are a user-generated document that enumerates a Project's goals and objectives in natural language; they may also specify how those goals are to be accomplished.
3. Plan
A Plan represents an execution plan containing tasks. It is an Agent-generated breakdown of how to accomplish a Project's Requirements. An Agent derives the Plan from the Requirements and requires human approval before execution. Plans are versioned whenever the underlying requirements change, ensuring full traceability. If you reject a Plan, you can provide feedback for regeneration, allowing iterative refinement until the Plan meets your requirements.
4. Task
A Task represents a unit of work within a Plan. An Agent generates Tasks from an approved Plan, and each Task is assigned to an Agent based on the Skill needed to complete it. Tasks require human approval before proceeding and become Tickets once approved. You have the flexibility to add new Tasks by providing feedback to the Agent and can override Task routing decisions within the Plan if needed.
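The Task lifecycle described above (human approval, then routing to an Agent by Skill) can be sketched as follows. The agent names and record shapes are hypothetical:

```python
def route_tasks(tasks, agents):
    """Assign each approved Task to an Agent advertising the needed Skill.
    Tasks only become Tickets once a human approves them."""
    tickets = []
    for task in tasks:
        if not task["approved"]:
            continue  # human-in-the-loop: unapproved tasks wait
        agent = next((a for a in agents if task["skill"] in a["skills"]), None)
        if agent:
            tickets.append({"task": task["title"], "agent": agent["name"]})
    return tickets

agents = [{"name": "k8s-agent", "skills": ["k8s-deploy", "k8s-triage"]},
          {"name": "terraform-agent", "skills": ["tf-provision"]}]
tasks = [{"title": "Create VPC", "skill": "tf-provision", "approved": True},
         {"title": "Deploy webapp", "skill": "k8s-deploy", "approved": False}]

# Only the approved task is routed and becomes a ticket.
tickets = route_tasks(tasks, agents)
```

Overriding a routing decision, as the text mentions, would amount to replacing the matched agent before the ticket is created.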
Overview
An overview and demo of DuploCloud's AI DevOps Engineer
DuploCloud is an agentic DevOps automation platform that leverages AI to accomplish a wide range of DevOps tasks. At the core of this platform is DuploCloud's AI DevOps Engineer, which organizations can leverage for complex DevOps projects. Unlike traditional AI coding assistants that merely enhance human capabilities, the AI DevOps Engineer is an autonomous worker, with human-in-the-loop governance, that can be onboarded, trained, assigned tasks, and managed just like a human team member.
What Can an AI DevOps Engineer Do for You?
Manage projects related to deployments, migrations, observability, security, and compliance.
Troubleshoot incidents end to end, from setting up monitoring and alerting to responding in real time to outages and service degradations.
Help with everyday tasks like infrastructure health reviews, code deployment, IaC maintenance, rollbacks, and backups.
Perform other functions like collecting evidence for compliance audits, reporting, discovering cloud resources to generate documentation, and more.
DuploCloud’s AI DevOps Engineer is highly customizable. You can start with DuploCloud’s out-of-the-box engineers, extend them with specific capabilities tailored to your organization, or build entirely new engineers with specialized skills. Think of this as a self-hosted Claude Code, purpose-built for DevOps, with significantly more autonomy and control. The possibilities are endless!
Main Platform Components
Engineer Hub
The Engineer Hub is the home of all your specialized AI engineers, each capable of autonomously handling complex projects with a human in the loop. In the hub, you can create Platform Engineers, CI/CD Engineers, SRE Engineers, and more, and manage your AI engineers' permissions, projects, and performance. Simply define your high-level project requirements, and your AI engineer will convert them into a detailed plan, coordinating a team of AI agents to complete the tasks and achieve your goals.
Agentic AI Helpdesk
The Agentic AI Helpdesk is where you go to achieve task-level objectives with the help of specialized agents. These agents are designed to execute specific tasks and include SRE, K8s, AWS, GCP, Docs, and Architecture agents. Modeled on a traditional IT help desk and accessible through a web browser, Slack, or Teams chat, or directly within an IDE extension (offering a "Cursor-like" experience), it lets you create tickets and assign them to AI agents, which then execute tasks in real time.
Integrations
The AI DevOps Engineer comes with a suite of integrations that help you accomplish tasks across cloud infrastructure provisioning, observability, security, compliance, cost, CI/CD, etc. These integrations provide real-world connectivity through tools and APIs, including:
Cloud Providers: AWS, GCP, Azure
Kubernetes: EKS, AKS, GKE
Git Repositories: GitHub, GitLab, Bitbucket
Getting Started with DuploCloud
An outline of the DuploCloud approach compared to existing DevOps
DuploCloud manages cloud infrastructure in over two hundred organizations, ranging from startups and small businesses to publicly listed companies. We categorize our clients into two segments:
Young companies with little to no in-house DevOps expertise and very limited automation. Typically, these organizations want to completely revamp their DevOps stack and move to modern technologies like Kubernetes. In this scenario, DuploCloud is installed in a fresh cloud account, and the organization’s existing deployment is migrated to the new account using state-of-the-art automation, pipelines, security, observability, and compliance—supporting Kubernetes and serverless deployment models. Data sources like databases, S3 buckets, etc., are migrated to the new account or remain in existing accounts. The onboarding process is described in detail in the next section.
Mature organizations with significant in-house DevOps expertise. These organizations must decide whether to operate in their existing account or migrate to a new one. If an organization wants to modernize, for example, by migrating from VMs to Kubernetes, moving to a new account is the best option. However, if a mature Kubernetes automation stack already exists, it may be imported into DuploCloud and mapped to the DuploCloud constructs described in the policy model. Most use cases in mature organizations involve leveraging AI to build AI DevOps teammates that help handle growing volumes of support tickets and reduce DevOps backlog, allowing human teams to focus on higher-value work.
A Project is a logical entity representing planned work. Each Project includes a Title and Summary. The heart of a Project is its requirements, which are organized into Infrastructure, CI/CD, Workloads, Observability, Compliance, Security, and Cost. Projects can optionally define their own Scope as a subset of the Engineer's Scope. If no custom Scope is specified, the Project defaults to using the Engineer’s full Scope.
Stay tuned for updates on leveraging DuploCloud's AI DevOps Engineers to execute complete end-to-end projects.
Overview
DuploCloud AI Suite is an advanced platform that transforms DevOps operations through purpose-built AI Agents that augment your existing infrastructure management capabilities. Designed to address the complexity of cloud operations and the scarcity of specialized DevOps expertise, AI Suite intelligently automates routine and complex tasks while maintaining a human-in-the-loop approach for critical decisions. The solution seamlessly integrates with your existing infrastructure and tooling, creating a secure collaborative environment between AI Agents and human operators.
The AI Suite consists of two primary components: AI Studio and AI HelpDesk. AI Studio enables the building, training, and deployment of specialized Agents for various DevOps functions including deployments, observability, CI/CD, security, incident management and more. AI HelpDesk provides an intuitive interface where end users can open tickets, interact with Agents, visualize operations through a shared browser, and maintain oversight of AI-executed commands in production environments. Together, these components deliver a comprehensive solution that enhances productivity, reduces operational overhead, and maintains the security and compliance standards intrinsic to the DuploCloud platform.
Reporting AI Concerns
DuploCloud is committed to the responsible development and deployment of AI systems. If you have experienced or observed an adverse impact from any DuploCloud AI product—including unexpected behavior, bias, privacy concerns, security issues, or other harmful outcomes—we encourage you to report it.
Reports may be submitted by anyone, whether or not you are a DuploCloud customer. We will acknowledge receipt of your report within one (1) business hour and investigate promptly. Please include a description of the issue, when it occurred, and any relevant details that can help us understand and address the concern.
Agents
Agents are AI-powered tools your Engineer uses on actual Project tasks, such as Kubernetes deployments or monitoring.
Adding an Agent
Navigate to Agents and click Add Agent.
Provide a Name, Description, technical details like access endpoint and path, and any needed metadata.
Click Update to create your Agent.
The DuploCloud Platform already provides several pre-built Agents that you can leverage.
For more details on how to create custom Agents on DuploCloud, see the custom Agents documentation.
Engineers
Adding an AI DevOps Engineer streamlines DevOps while keeping you in control. Assemble your AI DevOps Engineer by combining Providers, Skills/Personas, and Agents.
Creating an Engineer
Navigate to Engineers and click Add Engineer.
Give the Engineer a Name and Description.
Select the Persona(s) to equip the Engineer with the needed Skills, and click Next.
Select the Agent(s) that you want this Engineer to use, and click Next.
Select the Providers and their respective Scopes for the Engineer.
Click Create.
Congratulations! Your Engineer is now ready to receive assignments, interpret goals, break down work, and provide a transparent audit trail—all with human oversight and control.
Supporting Components
Cartography
DuploCloud Cartography is a Dockerized build of Cartography extended with custom DuploCloud intelligence and Neo4j models that unifies DuploCloud, Kubernetes, AWS, GitHub, and external systems into a single, queryable graph. It injects a Duplo ingestion stage at runtime (via sitecustomize) to ingest Tenants, Plans, Infrastructures, Hosts, Pods, Agents, and container versions, standardizing on IN_TENANT relationships for clean scoping and authorization. The project supports an optional dependencies manifest (local file or Kubernetes ConfigMap) to link workloads to Kubernetes objects, AWS resources (e.g., S3, RDS), and external services, while helper scripts and JIT credential support make it straightforward to run against Neo4j for local development or production.
The Duplo Presidio Service is a wrapper built on top of Microsoft Presidio that enhances its ability to detect and protect sensitive data by allowing custom recognizers and anonymizers.
This ensures secrets, tokens, and credentials are automatically identified and protected before data leaves your Duplo-managed Kubernetes cluster.
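To illustrate the idea without depending on Presidio itself, here is a minimal regex-based sketch of detection and anonymization. In a real deployment these patterns would be registered as custom Presidio recognizers; the patterns and labels below are examples, not the service's actual configuration:

```python
import re

# Hypothetical patterns; real deployments register these as custom
# Presidio recognizers rather than raw regexes.
RECOGNIZERS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "BEARER_TOKEN": re.compile(r"\bBearer\s+[A-Za-z0-9\-_.]+"),
}

def anonymize(text):
    """Replace every detected secret with a typed placeholder."""
    for label, pattern in RECOGNIZERS.items():
        text = pattern.sub(f"<{label}>", text)
    return text
```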
Personas
A Persona is a logical container that groups related Skills by role or function. For example, an SRE Persona combines troubleshooting, monitoring, and incident response skills, a Provisioning Persona bundles Terraform and Kubernetes deployment skills, and a Security Persona groups compliance and security scanning skills.
Creating a Persona
Select Personas and click Add Persona.
Select the relevant Skills to include in the Persona.
Click Update to create the Persona.
Prerequisites
Tasks to perform before you use AWS with DuploCloud
Providers store credentials for connecting to cloud accounts, Kubernetes namespaces, incident tools, and other services. After adding Providers, you create Scopes that define what resources can be accessed. Permissions are then granted by assigning these Scopes to Engineers when creating or editing them.
Adding Providers and Defining Access Scopes
Navigate to Providers and select the tab for the Provider type you are trying to add.
MCP Servers
MCP Servers provide an AI Engineer with access to essential external systems and tools that aren't directly part of cloud infrastructure or code repositories. This includes observability platforms (Prometheus, Grafana, Datadog), SIEM tools, ticketing systems, communication platforms, and specialized DuploCloud integrations.
Adding MCP Servers to an Engineer’s Scope
Navigate to MCP Servers
Services
A conceptual overview of DuploCloud Services
A Service could be a Kubernetes Deployment, StatefulSet, or DaemonSet. It can also be a Lambda function or an ECS task or service, capturing a microservice. Each Service (except Lambda) is given a Load Balancer to expose itself and is assigned a DNS name.
DuploCloud Services should not be confused with Kubernetes or ECS services. By Service, we mean application components that can be either Docker-based or serverless.
Public Cloud Tutorials
Links to the Quick Start Guide for each cloud provider
These tutorials are specific to various public cloud environments and demonstrate some of DuploCloud's most common use cases:
DuploCloud Common Components
DuploCloud components common to AWS, GCP, and Azure DuploCloud deployments
Several DuploCloud components are used with AWS, GCP, Azure, and Hybrid/On-Premises Services. These include Infrastructures, Plans, Tenants, Hosts, and Load Balancers. This section provides a conceptual explanation of the following common DuploCloud components:
Creating a Tenant (Environment)
Using DuploCloud Tenants for AWS
In AWS, cloud features such as AWS resource groups, AWS IAM, AWS security groups, and KMS keys, as well as Kubernetes Namespaces, are exposed through Tenants, which reference their configurations.
For more information about DuploCloud Tenants, see the Tenants topic in the DuploCloud Common Components documentation.
Creating a Tenant
Creating an ECS Service
Finish the Quick Start Tutorial by creating an ECS Service
This section of the tutorial shows you how to deploy a web application with AWS ECS.
For a full discussion of the benefits of using EKS vs. ECS, consult the related documentation.
Instead of creating a DuploCloud Service with AWS ECS, you can alternatively finish the tutorial by:
Management Portal Scope
An overview of the scope of cloud provider resources (accounts) that a DuploCloud Portal can manage
Following is the scope of cloud provider resources (accounts) that a single DuploCloud Portal can manage:
Azure: A single DuploCloud Portal can manage multiple Azure subscriptions. Azure's native identity service, Microsoft Entra ID (formerly Azure Active Directory), provides managed identities that can be granted access across multiple subscriptions. DuploCloud inherits the permissions of these managed identities, allowing it to seamlessly access and manage resources across the Azure subscriptions it is connected to.
GCP: Similar to Azure, a single instance of DuploCloud can manage multiple GCP Projects.
AWS: In AWS, a single DuploCloud Portal manages one and only one AWS account. This is in line with the AWS IAM implementation: even in the native AWS IAM model, building blocks like IAM Roles and Instance Profiles do not span multiple accounts, and cross-account SCP policies are quite lightweight. In fact, AWS Organizations was added almost 10 years after the launch of AWS. For example, when a user logs in using AWS Identity Center, they must choose an account, and the session is scoped to that account. See the picture below of the IAM login console.
Setting Tenant expiration
Manage Tenant expiry settings in the DuploCloud Portal
Managing Tenant Expiration
In the DuploCloud Portal, configure an expiration time for a Tenant. At the set expiration time, the Tenant and associated resources are deleted.
In the DuploCloud Portal, navigate to Administrator -> Tenants
Diagnostics
An overview of DuploCloud diagnostics
DuploCloud Diagnostics Functions
The DuploCloud platform automatically orchestrates the following main diagnostic functions:
DuploCloud Prerequisites
This section is in progress. In the meantime, please see the prerequisite section under the DuploCloud user guide for your cloud provider:
Agents in DuploCloud AI Studio are specialized AI components designed to perform specific DevOps and infrastructure management functions. Each Agent serves as a dedicated expert in areas such as Kubernetes troubleshooting, observability analysis, CI/CD operations, or security monitoring. Agents are built on foundation models and enhanced with the knowledge graph that the DuploCloud DevOps Automation Platform provides.
Agent Definitions
Agent definitions form the blueprint for creating AI Agents within DuploCloud AI Studio. These definitions specify the Agent's capabilities, LLM, and knowledge sources. Each definition includes parameters such as the base model to use, required tools, permission scopes, and response formats. Agent definitions are version-controlled, allowing for systematic improvement and rollback capabilities. This declarative approach to Agent creation ensures consistency, reusability, and transparent governance of AI capabilities across your organization.
Prebuilt Agents
Prebuilt Agents are ready-to-use solutions developed outside of AI Studio that conform to DuploCloud's agent standards and interfaces. These Agents are packaged as Docker images with pre-configured APIs that seamlessly integrate with the AI HelpDesk. Prebuilt agents offer immediate value for common DevOps scenarios without requiring custom development. They're ideal for organizations seeking to rapidly implement AI assistance for standardized workflows or those with existing Agent implementations they wish to incorporate into the DuploCloud ecosystem.
Dynamic Agents
Dynamic Agents are built using DuploCloud's LangChain-based framework, allowing for flexible, composable Agentic AI solutions tailored to specific operational needs. These Agents are constructed at build time by combining a base Agent configuration with selected Tools from the AI Studio repository. The LangChain architecture enables dynamic Agents to reason through complex problems, execute appropriate Tools based on context, and maintain coherent conversations with users. This approach facilitates rapid prototyping and iteration, as Tools can be developed independently and combined in various configurations to create specialized Agents without rewriting core functionality.
Tools
Tools are modular, reusable components that enable Agents to perform specific actions or access particular systems. Built on LangChain's Tool framework, each Tool encapsulates a discrete capability such as executing Kubernetes commands, querying observability platforms, or managing cloud resources. Tools can be shared across multiple Agents, promoting code reuse and consistency in functionality. AI Studio provides a library of pre-built Tools for common DevOps tasks while supporting custom Tool development for organization-specific needs.
Builds
The build process transforms Agent definitions and associated Tools into deployable Docker images. During a build, AI Studio pulls the specified base Agent configuration, incorporates selected Tools, and packages everything into a containerized application with standardized API endpoints. Each build generates logs for troubleshooting and validation, and the resulting artifact is versioned for traceability. The build system manages dependencies, ensures proper initialization of language models, and optimizes the final image for performance.
Images
Agent images are Docker Images that encapsulate the AI agent's code, dependencies, etc.
Deployments
Deploying an Agent makes it available for use within your infrastructure. AI Studio streamlines deployment, allowing users to configure resource limits, replica counts, and environment variables to customize behavior. Each deployment is exposed with a Load Balancer and DNS record.
Registration
Registering an Agent connects a deployed agent instance into AI HelpDesk, making it available for user interaction through Tickets. During registration, you provide essential information including the Agent's friendly name, endpoint URL, API path for message handling, and applicable Tenant scopes.
Cloud, Kubernetes, etc. The available Provider types are listed in the table below. To add a Provider type that is not listed, contact your DuploCloud Support team on Slack or via email at [email protected].
Cloud Providers
AWS, GCP, Azure
Kubernetes
EKS, AKS, GKE, RHOS
Observability
OpenTelemetry (Otel), Datadog, New Relic, Sentry
Incident Management
Grafana Alert Manager, Datadog, New Relic, Sentry, PagerDuty, Incident.io
Git Repositories
GitHub, GitLab, Bitbucket
Click Add. The Edit Provider screen displays, showing the relevant inputs required for connecting to that Provider type.
Complete the required fields.
Click Update to finish granting access. This will return you to the Provider’s screen.
Select the Credentials tab, and click Add. The Edit Credential pane displays.
Enter the credential specification, and click Update. This will return you to the Provider’s screen.
Click Scope and then Add Scope. Give the Scope a suitable name and description, select one of the credentials added and (optionally) select an MCP Server. Enter the resource map in Key: Values format.
Click Create.
Now, you can move on to creating access for MCP Servers for your Engineers.
Navigate to MCP Servers and click Add Server.
Enter the details required to connect the MCP Server and click Next.
Provide the credentials to connect the MCP Server and click Update.
Note: Scopes let you fine-tune the Engineer’s access. When creating or editing an Engineer, you can attach scope definition files (YAML or JSON) that specify the resources and permissions the Engineer is allowed to access.
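As a concrete illustration, a JSON scope definition file might look like the following. The schema here is hypothetical, shown only to convey the idea of resources plus permissions; consult your DuploCloud Portal for the actual format:

```python
import json

# Hypothetical scope definition file contents (JSON variant).
scope_file = """
{
  "provider": "kubernetes",
  "cluster": "prod-eks",
  "namespaces": ["payments", "checkout"],
  "permissions": ["get", "list", "watch"]
}
"""
scope = json.loads(scope_file)

# A read-only scope: mutation verbs are simply absent from the grant.
assert "delete" not in scope["permissions"]
```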
DuploCloud Supported Services
For information on cloud-specific Services supported by DuploCloud, see:
DuploCloud supports a simple, application-specific interface to configure dozens of cloud services, such as S3, SNS, SQS, Kafka, Elasticsearch, Data Pipeline, EMR, SageMaker, Azure Redis, Azure SQL, Google Redis, etc. Almost all commonly used services are supported, and new ones are constantly added. DuploCloud Engineers fulfill most requests for new Services within days, depending on their complexity.
All Services and cloud features are created within a Tenant. While users specify application-level constructs for provisioning cloud resources, DuploCloud implicitly adds all the underlying DevOps and compliance controls.
Below is an image of some properties of a Service:
The Add Service page in the DuploCloud Platform
Navigate to Administrator -> Tenant in the DuploCloud Portal and click Add. The Create a Tenant pane displays.
In the Name field, enter a name for the Tenant. Choose unique names that are not substrings of one another; for example, if you have a Tenant named dev, you cannot create another named dev2. We recommend using distinct numerical suffixes like dev01 and dev02.
In the Plan list box, select the Plan to associate the Tenant with.
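The naming rule from step 2 (no Tenant name may be a substring of another) can be sketched as a simple check:

```python
def tenant_name_ok(new_name, existing):
    """Reject names where one is a substring of another (e.g. dev vs dev2)."""
    return not any(new_name in name or name in new_name for name in existing)

# Distinct numeric suffixes like dev01/dev02 pass; dev vs dev2 does not.
existing = ["dev01", "prod"]
```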
Unlike with AWS EKS, creating and deploying services and apps with ECS requires a Task Definition, a blueprint for your application. Once you create a Task Definition, you can run it as a Task or as a Service. In this tutorial, we run the Task Definition as a Service.
To deploy your app with AWS ECS in this ECS tutorial, you:
Create a Task Definition using ECS.
Create a DuploCloud Service named webapp, backed by a Docker image.
Expose the app to the web with a Load Balancer.
Complete the tutorial by testing your application.
Estimated time to complete remaining tutorial steps: 30-40 minutes
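For reference, a minimal Task Definition for a web app like this one follows the shape that ECS's RegisterTaskDefinition API expects. The image, CPU, and memory values below are placeholders, not the tutorial's exact configuration:

```python
# Minimal ECS Task Definition (RegisterTaskDefinition shape); values are
# placeholders for illustration.
task_definition = {
    "family": "webapp",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "256",
    "memory": "512",
    "containerDefinitions": [
        {
            "name": "webapp",
            "image": "nginx:latest",
            "essential": True,
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        }
    ],
}
```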
Network Architecture and Configurations
Behind the scenes, the topology that DuploCloud creates resembles this low-level configuration in AWS.
We implement the same experience, providing an Account Switcher on the login page and inside the Portal, as shown below. For instructions on adding additional login portals to the DuploCloud login screen, see the Multiple Portal Login Options page.
The Tenants page displays.
From the Name column, select the Tenant for which you want to configure an expiration time.
The Tenants details page Action menu with Set Tenant Expiration selected.
From the Actions list box, select Set Tenant Expiration. The Tenant - Set Tenant Expiration pane displays.
The Tenant - Set Tenant Expiration pane
Select the date and time (using your local time zone) when you want the Tenant to expire.
Click Set. At the configured day and time, the Tenant and associated resources will be deleted.
The Set Tenant Expiration option is not available for Default or Compliance Tenants.
Central Logging
A shared Elasticsearch cluster is deployed and Filebeat is installed in all worker nodes to fetch logs from various applications across Tenants. The logs are injected with metadata corresponding to the Tenant, Service, container ID, Host, etc. Further, each Tenant has a central logging dashboard which includes the Kibana view of logs from applications within the Service. See the screenshot below:
The Logging dashboard in the DuploCloud Portal
Metrics
Metrics are fetched from Hosts, containers, and Services and displayed in Grafana. Service metrics are collected behind the scenes by calling cloud provider APIs like CloudWatch and Azure Monitor. For nodes and containers, metrics are collected using Prometheus, Node Exporter, and cAdvisor. The Metrics dashboards are Tenant-centric and segregated per application and Service, as shown in the image below.
The Metrics dashboard in the DuploCloud Portal
Alarms and Faults
The Platform creates faults for many failures automatically. For example, health check failures, container crashes, node crashes, deployment failures, etc. Further, users can easily set alarms like CPU and memory for EC2 instances or free disk space for RDS databases. Failures are displayed as faults under their respective Tenants. Sentry and PagerDuty projects can be linked to Tenants, and DuploCloud will send faults there so the user can set notification configurations.
Audit Trail
All system changes are logged in an audit trail in Elasticsearch where they can be sorted and viewed by Tenant, Service, change type, user, and dozens of other filters.
The Audit dashboard in the DuploCloud Portal
AI HelpDesk
AI HelpDesk serves as the central interface through which users engage with AI Agents in the DuploCloud ecosystem. It provides a structured, collaborative environment where DevOps tasks and infrastructure management operations can be executed with AI assistance while maintaining human oversight. The HelpDesk is designed to replicate the collaborative experience between engineers, offering visualization tools, command execution capabilities, and intuitive communication channels. This component of the AI Suite bridges the expertise gap in cloud operations by making specialized AI assistance readily available to all users regardless of their technical background. You can also interact with the AI HelpDesk directly in Slack (learn more here).
Key HelpDesk Features
Tickets
Tickets form the foundation of interaction within the AI HelpDesk, establishing a dedicated communication channel between a human user and an assigned AI Agent. Each ticket represents a specific task, request, or troubleshooting session that requires attention. When creating a ticket, users can select the appropriate AI Agent based on the nature of the task (such as Kubernetes troubleshooting, observability analysis, or deployment assistance). This ticketing system maintains context throughout the conversation, allowing for continuous collaboration until resolution while preserving a complete audit trail of actions and decisions for future reference.
Canvas
The Canvas is a dynamic, shared workspace within the HelpDesk Ticket that facilitates real-time collaboration between users and AI Agents. This interactive environment serves as a virtual "whiteboard" where both parties can visualize operations, execute commands, and share information seamlessly. This shared visual interface significantly enhances the effectiveness of human-AI interaction by providing transparency into the AI's reasoning and actions while allowing users to contribute their expertise when needed.
AI Suggested Commands
Within the Canvas, AI agents can analyze the current context and proactively suggest relevant terminal commands to address the task at hand. These suggested commands appear as interactive elements that users can approve, reject, ignore, or execute. This feature combines the AI's extensive knowledge of best practices with human judgment, preventing potentially harmful operations while expediting routine tasks. Each suggestion includes a brief explanation of its purpose and expected outcome, enabling users to learn while accomplishing their objectives efficiently.
Terminal
The Terminal component embedded within the Canvas provides users with direct command-line access to their infrastructure. When a user interacts with the Terminal, the AI Agent can observe these interactions, providing context-aware assistance, explaining unexpected outputs, or suggesting follow-up commands. This bidirectional visibility creates a seamless experience where users can freely switch between AI-guided operations and manual intervention. The Terminal maintains appropriate permission boundaries based on the user's role while providing the familiar command-line interface that DevOps professionals expect.
Users are also able to establish an interactive terminal session directly within a running application container from the HelpDesk Canvas. This capability is invaluable for troubleshooting application-specific issues, verifying configurations, or performing targeted diagnostics. The AI agent remains engaged during these application container shell sessions, offering assistance with application-specific commands and interpreting outputs in the context of the application's architecture. This deep level of access, combined with AI guidance, dramatically reduces the time required to identify and resolve application-level issues while maintaining security boundaries.
Getting Help with DuploCloud
Support features included with the product and how to contact DuploCloud Support
DuploCloud offers hands-on 24/7 support for all customers via Slack or email. Automation and developer self-service are at the heart of the DuploCloud Platform. We are dedicated to helping you achieve hands-off automation as fast as possible via rapid deployment of managed services or customized Terraform scripts using our exclusive Terraform provider. Additionally, you can access various help options, including product documentation and customer support, directly from the DuploCloud Portal. For real-time answers tailored specifically to your organization's needs, ask customer support about Ask DuploCloud, our AI-powered assistant.
How to Contact DuploCloud for Support
Use the customer Slack or Microsoft Teams channel created during onboarding.
Click the chat icon () in the DuploCloud Portal to post your question. If we are unable to respond immediately, we automatically create a ticket for you, and someone from the DuploCloud engineering team will reach out to you as soon as possible.
DuploCloud Support Features
Some of the support features we offer include:
Debugging Agent performance and hallucinations.
Configuring changes in your public cloud infrastructures and associated Kubernetes (K8s) Constructs managed by DuploCloud.
Setting up CI/CD Pipelines.
Unsupported or Partially Supported Features
We cover most of your DevOps needs, but there are some limitations. Examples of needs we do not support, or support only partially, include:
Patching an application inside a Docker image.
Monitoring alerts in a Network Operations Center (NOC).
Troubleshooting application code.
How to Get Help From Within the DuploCloud Portal
From any page in the DuploCloud Portal, click the Help menu icon () in the upper right (next to your name and person icon) to access a variety of tools and links for your self-service DevOps needs.
What's New: Stay informed about the latest features and updates in the DuploCloud Platform.
FAQs: Access frequently asked questions to quickly find answers to common inquiries.
Documentation: Browse through our comprehensive product documentation to help you navigate the Platform and optimize your usage.
Overview
An overview and demo of DuploCloud's comprehensive DevOps platform
DuploCloud is an agentic DevSecOps Automation platform that leverages AI for a wide range of Cloud Infrastructure automation needs. It encompasses DevOps, Security, Compliance, Observability, and CI/CD. The software can:
Configure and update resources safely and securely
Write IaC (a Cursor-like experience in an IDE)
Troubleshoot incidents
Perform several other functions, such as collecting evidence for compliance audits, generating reports, and discovering cloud resources to generate documentation.
The DuploCloud platform is entirely self-hosted in the customer's cloud account. With an open architecture, you can choose your own model, build your own Agents, and bring your own automation tools.
The platform is composed of two main components:
Agentic Orchestration (AI Suite)
This module coordinates a collection of AI Agents specialized in DevOps operations. Together, these Agents function as an AI DevOps Engineer, capable of handling complex, high-level tasks delegated by users.
The primary user experience mirrors an IT Help Desk. Users can create tickets and assign them to AI Agents, which execute tasks in real time. The help desk interface is accessible through a web browser, a Slack or Teams chat thread, or directly within an IDE extension (offering a “Cursor-like” experience).
Automation Platform
To effectively manage cloud infrastructure, AI Agents rely on a suite of automation tools. For example, enabling AI-driven observability requires an underlying observability framework such as Datadog or OpenTelemetry. DuploCloud’s automation platform delivers a comprehensive set of such capabilities, including:
Provisioning Toolkit: Automates the creation and management of hundreds of cloud services such as EKS, AKS, GKE, S3, SQS, RDS, Azure SQL, and Google Cloud SQL. The toolkit is accessible to AI agents through MCP, Terraform provider, CLI, and APIs.
Observability: Built on the OpenTelemetry stack, offering full-spectrum monitoring — including tracing, logging, alerting, profiling, infrastructure metrics, RUM, and SLO/SLA tracking.
Security Tooling: Provides integrated SIEM, vulnerability assessment, antivirus, just-in-time access control, and other security mechanisms.
Teams can also bring their own automation tools, whether built in-house or sourced from a third party.
Users can interface directly with the automation tools, bypassing the AI functionality, using the APIs, Terraform provider, CLI, and browser workflows.
AI Orchestration Demo
Check out a quick video of the AI functionality.
Automation Platform Demo
Check out a 6-minute video overview of DuploCloud's comprehensive Automation Platform.
Hosts
A conceptual overview of DuploCloud Hosts
Hosts (VMs) are a cornerstone of cloud infrastructure, essential for providing isolated, scalable, and flexible environments for running applications and services. Hosts can exist in various forms and configurations, depending on the environment and the technology stack.
For instructions to create a Host in DuploCloud, see the documentation for your specific cloud provider:
In DuploCloud, Hosts are virtualized computing resources provided by your cloud service provider (e.g., AWS EC2, Google Compute Engine, Azure VMs) or your organization's data center and managed by the DuploCloud Platform. They are used to provision scalable, on-demand infrastructure. DuploCloud abstracts the complexities of provisioning, configuring, and managing these Hosts. DuploCloud supports the following Host contexts:
Public Cloud: VMs provided by cloud providers and managed through the DuploCloud Platform.
Private Cloud: Virtualized environments managed within an organization's data center.
Combination of On-Premises and Cloud: A mix of physical hosts, VMs, and cloud-hosted instances.
Connect to the VPN
Obtain VPN credentials and connect to the VPN
DuploCloud integrates natively with OpenVPN by provisioning VPN users in the DuploCloud Portal. As a DuploCloud user, you can access resources in the private network by connecting to the VPN with the OpenVPN client.
The OpenVPN Access Server only forwards traffic destined for resources in the DuploCloud-managed private networks. Traffic accessing other resources on the internet does not pass through the tunnel.
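This split-tunnel behavior can be sketched as a simple routing predicate (the CIDR below is a hypothetical example of a DuploCloud-managed network, not a real deployment value):

```python
import ipaddress

# Hypothetical example of a DuploCloud-managed private network CIDR.
MANAGED_NETWORKS = [ipaddress.ip_network("10.220.0.0/16")]

def goes_through_vpn(dest_ip):
    """Return True only when the destination is inside a managed private
    network; all other traffic bypasses the OpenVPN tunnel."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in MANAGED_NETWORKS)

print(goes_through_vpn("10.220.4.17"))    # True: private resource
print(goes_through_vpn("93.184.216.34"))  # False: public internet
```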
Obtaining VPN Credentials
VPN credentials are listed on your user profile page in the DuploCloud Portal, which you can access by clicking the user icon () and selecting Profile.
Setting up the OpenVPN User Profile and Client App
Click the VPN URL link in the VPN Details section of your user profile. Modern browsers flag the link as unsafe because it uses a self-signed certificate; make the necessary selections to proceed.
Log into the OpenVPN Access Server user portal using the username and password from the VPN Details section of your DuploCloud user profile page.
Click on the OpenVPN Connect Recommended for your device icon to install the OpenVPN Connect app for your local machine.
Navigate to your Downloads folder, open the OpenVPN Connect file you downloaded in the previous step, and follow the prompts to finish the installation.
In the OpenVPN access server dialog box, click on the blue Yourself (user-locked profile) link to download your OpenVPN user profile.
Navigate to your Downloads folder and click on the .ovpn file downloaded in the previous step. The Onboarding Tour dialog box displays.
Click OK, and select Connect after import. Click Add in the upper right. If prompted to enter a password, use the password in the VPN Profile area of your user profile page in the DuploCloud Portal. You are now connected to the VPN.
Creating a Native Docker Service
Finish the Quick Start Tutorial by running a native Docker Service
This section of the tutorial shows you how to deploy a web application with a DuploCloud Docker Service by leveraging the DuploCloud platform's built-in container management capability.
Instead of creating a DuploCloud Docker Service, you can alternatively finish the tutorial by:
running Docker containers.
Deploying a DuploCloud Docker Service
Instead of creating a DuploCloud Service using EKS or ECS, you can deploy your application with native Docker containers and services.
To deploy your app with a DuploCloud Docker Service in this tutorial, you:
Create an EC2 host instance in DuploCloud.
Create a native Docker application and Service.
Expose the app to the web with an Application Load Balancer in DuploCloud.
Estimated time to complete remaining tutorial steps: 30-40 minutes
Network Architecture and Configurations
Behind the scenes, the topology that DuploCloud creates resembles this low-level configuration in AWS.
Enable EKS logs
Enable logging functionality for EKS
Enabling EKS logging while creating an Infrastructure
Follow the steps in the section Creating an Infrastructure. In the EKS Logging list box, select one or more ControlPlane Log types.
EKS Logging field with several ControlPlane Log types selected
Enabling EKS logging for an existing Infrastructure
Enable EKS logging for an Infrastructure that you have already created.
In the DuploCloud Portal, navigate to Administrator -> Infrastructure.
From the NAME column, select the Infrastructure for which you want to enable EKS logging.
Click the Settings tab.
EKS Setup
Enable Elastic Kubernetes Service (EKS) for AWS by creating a DuploCloud Infrastructure
In the DuploCloud platform, a Kubernetes Cluster maps to a DuploCloud Infrastructure.
Start by creating a new Infrastructure in DuploCloud. When prompted to provide details for the new Infrastructure, select Enable EKS. In the EKS Version field, select the desired release.
At most one EKS cluster (0 or 1) is supported for each DuploCloud Infrastructure.
Creating an Infrastructure with EKS can take some time. See the section for details about other elements on the Add Infrastructure form.
When the Infrastructure is in the ready state, as indicated by a Complete status, navigate to Kubernetes -> Services and select the Infrastructure from the NAME column to view the Kubernetes configuration details, including the token and configuration for kubectl.
When you create Tenants in an Infrastructure, a namespace is created in the Kubernetes cluster with the name duploservices-TENANT_NAME
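The namespace naming convention above can be expressed as a one-liner (a sketch; the helper name is hypothetical):

```python
def tenant_namespace(tenant_name):
    """Kubernetes namespace created for a Tenant: duploservices-TENANT_NAME."""
    return "duploservices-" + tenant_name

print(tenant_namespace("dev01"))  # duploservices-dev01
```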
AWS User Guide
Initial steps for AWS DuploCloud users
The DuploCloud platform installs in an EC2 instance within your AWS account. It can be accessed using a web interface, API, or Terraform provider.
You can log in to the DuploCloud portal, using single sign-on (SSO), with your GSuite or O365 login.
Before You Begin
Before getting started, complete the following steps:
Read the and learn about DuploCloud terms like , , and
Set up the DuploCloud Portal
Read the section and ensure at least one person has administrator access
Connect to the DuploCloud Slack channel for support from the DuploCloud team
Enable Cluster Autoscaler
Enable Cluster Autoscaler for a Kubernetes cluster
Configuring Cluster Autoscaler for your Infrastructure
The Cluster Autoscaler automatically adjusts the number of nodes in your cluster when Pods fail to schedule due to insufficient resources or when nodes are underutilized and their Pods can be rescheduled onto other nodes.
In the DuploCloud Portal, navigate to Administrator -> Infrastructure. The Infrastructure page displays.
From the NAME column, select the Infrastructure with which you want to use the Cluster Autoscaler.
Click the Settings tab.
Click Add. The Add Infra - Set Custom Data pane displays.
From the Setting Name list box, select Cluster Autoscaler.
Select Enable to enable the Cluster Autoscaler.
Click Set. Your configuration is displayed in the Settings tab.
Tenant Config settings
Configure settings for all new Tenants under a Plan
Configuring Tenant Config settings
You can configure settings to apply to all new Tenants under a Plan using the Config tab. Tenant Config settings will not apply to Tenants created under the Plan before the settings were configured.
From the DuploCloud portal, navigate to Administrator -> Plan.
Click on the Plan you want to configure settings under in the NAME column.
Select the Config tab.
Click Add. The Add Config pane displays.
From the Config Type field, select TenantConfig.
In the Name field, enter the setting that you would like to apply to new Tenants under this Plan. (In the example, the enable_alerting setting is entered.)
In the Value field, enter True.
Click Submit. The setting entered in the Name field (enable_alerting in the example) will apply to all new Tenants added under the Plan.
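The applies-to-new-Tenants-only semantics can be sketched as follows (a simplified illustration; the function and field names are hypothetical):

```python
from datetime import datetime

def settings_for_tenant(tenant_created_at, plan_settings):
    """Return only the Plan settings that existed before the Tenant was
    created; settings added later do not retroactively apply."""
    return {
        name: value
        for name, (value, added_at) in plan_settings.items()
        if added_at <= tenant_created_at
    }

plan = {"enable_alerting": ("True", datetime(2024, 5, 1))}
print(settings_for_tenant(datetime(2024, 6, 1), plan))  # applies: newer Tenant
print(settings_for_tenant(datetime(2024, 4, 1), plan))  # {}: older Tenant
```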
Viewing Tenant Config settings
You can check that the Tenant Config settings are enabled for new Tenants on the Tenants details page, under the Settings tab.
From the DuploCloud portal, navigate to Administrator -> Tenants.
From the NAME column, select a Tenant that was added after the Tenant Config setting was enabled.
Click on the Settings tab.
Add VPC endpoints
Securely access AWS Services using VPC endpoints
An AWS VPC endpoint creates a private connection to supported AWS services and to VPC endpoint services powered by AWS PrivateLink. Instances in your Amazon VPC do not require public IP addresses to communicate with the service's resources, and traffic between an Amazon VPC and a service does not leave the Amazon network.
VPC endpoints are virtual devices: horizontally scaled, redundant, and highly available Amazon VPC components that allow communication between instances in an Amazon VPC and services without imposing availability risks or bandwidth constraints on network traffic. There are two types of VPC endpoints: Interface Endpoints and Gateway Endpoints.
DuploCloud allows you to specify predefined AWS endpoints for your Infrastructure in the DuploCloud Portal.
Adding VPC endpoints to a DuploCloud Infrastructure
In the DuploCloud Portal, navigate to Administrator -> Infrastructure. The Infrastructure page displays.
Select the Infrastructure to which you want to add VPC endpoints.
Click the Endpoints tab.
Hosts (VMs)
Adding EC2 hosts in DuploCloud AWS
Once you have the Infrastructure (Networking, Kubernetes cluster, and other standard configurations) and an environment (Tenant) set up, the next step is to launch EC2 virtual machines (VMs). You create VMs to be:
EKS Worker Nodes
Worker Nodes (Docker Host), if the built-in container orchestration is used.
DuploCloud AWS requires at least one Host (VM) to be defined per AWS account.
You can also create VMs as regular nodes that are not part of any container orchestration. For example, a user might connect manually and install applications, such as Microsoft SQL Server in a VM or an IIS application, or handle similar custom use cases.
While all the lower-level details like IAM roles, Security groups, and others are abstracted away from the user (as they are derived from the Tenant), standard application-centric inputs must be provided. This includes a Name, Instance size, Availability Zone choice, Disk size, Image ID, etc. Most of these are optional, and some are published as a list of user-friendly choices by the admin in the plan (Image or AMI ID is one such example). Other than these AWS-centric parameters, there are two DuploCloud platform-specific values to be provided:
Agent Platform: This applies if the VM will be used as a container host by the platform. The choices are:
EKS Linux: Select this if the VM is to be added to the EKS cluster (that is, EKS is the chosen approach for container orchestration)
Linux Docker: If this is to be used for hosting Linux containers using the
Docker Windows: If this is to be used for hosting Windows containers using the
Allocation Tags (Optional): If the VM is being used for containers, you can set a label on it. This label can then be specified during docker app deployment to ensure the application containers are pinned to a specific set of nodes. Thus, you can further split a tenant into separate server pools and deploy applications.
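The node-pinning effect of Allocation Tags can be sketched as a placement filter (an illustrative sketch; the function name, host fields, and tag values are hypothetical):

```python
def eligible_hosts(hosts, allocation_tag=None):
    """If a deployment specifies an allocation tag, only Hosts carrying
    that tag are placement candidates; otherwise any Host qualifies."""
    if allocation_tag is None:
        return list(hosts)
    return [h for h in hosts if h.get("allocation_tag") == allocation_tag]

hosts = [
    {"name": "host-a", "allocation_tag": "gpu-pool"},
    {"name": "host-b", "allocation_tag": "web-pool"},
]
print([h["name"] for h in eligible_hosts(hosts, "gpu-pool")])  # ['host-a']
```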
If a VM is being used for container orchestration, ensure that the Image ID corresponds to an image for that container orchestration. This is set up for you: the list box contains self-descriptive Image IDs, such as EKS Worker, Duplo-Docker, and Windows Docker. Anything that starts with Duplo is an image for the built-in container orchestration.
ACM Certificate
Create an AWS Certificate Manager certificate
The DuploCloud Platform needs a wildcard AWS Certificate Manager (ACM) certificate corresponding to the domain for the Route 53 Hosted Zone.
For example, if the Route 53 Hosted Zone created is apps.acme.com, the ACM certificate specifies *.apps.acme.com. You can add additional domains to this certificate (for example, *.acme.com).
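The one-label semantics of such a wildcard certificate can be sketched as follows, using the example domains above (the helper name is hypothetical):

```python
def cert_covers(cert_name, hostname):
    """Check whether an ACM certificate name such as '*.apps.acme.com'
    covers a hostname. A wildcard matches exactly one DNS label."""
    if cert_name.startswith("*."):
        base = cert_name[2:]
        return (hostname.endswith("." + base)
                and len(hostname.split(".")) == len(base.split(".")) + 1)
    return cert_name == hostname

print(cert_covers("*.apps.acme.com", "billing.apps.acme.com"))  # True
print(cert_covers("*.apps.acme.com", "a.b.apps.acme.com"))      # False
```

This is why the example adds *.acme.com as an additional domain: *.apps.acme.com alone would not cover hostnames directly under acme.com.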
The ACM certificate is used with AWS Elastic Load Balancers (ELBs) created during DuploCloud application deployment. Follow this AWS guide to issue an ACM certificate.
Once the certificate is issued, add the Amazon Resource Name (ARN) of the certificate to the DuploCloud Plan (starting with the DEFAULT Plan) so that it is available to subsequent configurations.
Adding an ACM Certificate with ARN to a DuploCloud Plan
In the DuploCloud Platform, navigate to Administrator -> Plans. The Plans page displays.
Select the default Plan from the NAME column.
Click the Certificates tab.
Note that the ARN Certificate must be set for every new Plan created in a DuploCloud Infrastructure.
Enabling Automatic AWS ACM Certificate Creation
Configure DuploCloud to automatically generate Amazon Certificate Manager (ACM) certificates for your Plan's DNS.
From the DuploCloud portal, navigate to Administrator -> Systems Settings.
Select the System Config tab, and click Add. The Add Config pane displays.
From the Config Type list box, select Flags.
Policy Model
A high-level overview of the building blocks of DuploCloud's infrastructure-based architecture
The DuploCloud Policy Model is an application-infrastructure-centric abstraction created atop the user's cloud provider account. The following diagram shows the abstractions within which applications are deployed and users operate. The bullet points below briefly introduce the concepts; subsequent sub-pages explain them in detail.
The DuploCloud Platform installs in the customer's cloud account. For Azure and GCP, a single instance can manage multiple subscriptions and projects, respectively. For AWS, one Agent is required per AWS account, but the Agents come together in a federated fashion to expose a single interface. This is described in more detail in the Management Portal Scope section below.
Infrastructure: An Infrastructure maps to a VPC in a region, optionally with a Kubernetes cluster. One cloud account (AWS account, GCP project, or Azure subscription) can have multiple Infrastructures (1:N). For more details, see the Infrastructure documentation.
Plan: When you create an Infrastructure in DuploCloud, a Plan is automatically generated. A Plan is a placeholder or template for configurations. These configurations are consistently applied to all Tenants within the Plan (or Infrastructure). For more details, see the Plan documentation.
Tenant: A Tenant is like an environment and is a child of the Infrastructure. It is the most fundamental construct in DuploCloud. While an Infrastructure provides VPC-level isolation, a Tenant is the next level of isolation, implemented by segregating Tenants using concepts like Security Groups, IAM roles, Instance Profiles, K8S Namespaces, KMS Keys, etc. For more details, see the Tenant documentation.
Skills
Skills define the tasks your AI Engineer can perform, like Kubernetes operations, CI/CD workflows, or security tasks. Multiple Skills can be grouped into Personas which represent specific roles (for example, DevOps Engineer or Full Stack Engineer).
At its core, a Skill is a folder containing a SKILL.md file. This file includes required metadata (at minimum, name and description), and instructions that tell the Agent how to perform a specific task. A Skill can also include supporting assets such as scripts, templates, and reference materials.
There are three ways to assign Skills to an Engineer:
Pre-Built Skills: The DuploCloud platform provides pre-built Skills that you can use to quickly create and configure your first Engineer.
External Skills: You can use Skills from third-party vendors directly within the platform. Examples include:
Onboarding
What you can expect during the DuploCloud onboarding process
Phase 1. Kickoff and Delivery
During Kickoff and Delivery, your Team learns about the DuploCloud onboarding flow and what to expect in each phase. Our Team works closely with yours to review your project scope and objectives, technical specifications and information, and important dates and deadlines.
By the end of this phase, DuploCloud engineers will configure a DuploCloud Platform in your company's cloud account. We will ask your Team for any feedback about the onboarding approach to improve the process in the future.
FAQ
Getting Started and Infrastructure
What is the infrastructure cost of running the DuploCloud AI Suite?
Infrastructure costs (EC2 instances, load balancers, S3 buckets, Bedrock LLM cost) are typically estimated at $80-150 per month. LLM/Bedrock costs will only apply when you actively use the features and will be a few dollars per month. Additional costs may apply depending on your specific use case.
Plan
A conceptual overview of DuploCloud Plans
When you create an Infrastructure in DuploCloud, a Plan is automatically generated. A Plan is a placeholder or a template for configurations. These configurations are consistently applied to all Tenants within the Plan (or Infrastructure). Examples of such configurations include:
Certificates available to be attached to Load Balancers in the Plan's Tenants
Machine images
View Presidio Details
Duplo Presidio
The Duplo Presidio Service is a wrapper built on top of Presidio that enhances its ability to detect and protect sensitive data by supporting custom recognizers and anonymizers.
This ensures that secrets, tokens, passwords, and credentials are automatically identified and anonymized before data leaves your Duplo-managed Kubernetes cluster. By disabling irrelevant recognizers and introducing project-specific patterns, we reduce noise and improve accuracy.
Enable ECS logging
Enable ECS Elasticsearch logging for containers at the Tenant level
To generate logs for AWS ECS clusters, you must first create an Elasticsearch logging container. Once auditing is enabled, your container logging data can be captured for analysis.
Prerequisites
Infrastructure
A conceptual overview of DuploCloud Infrastructures
Infrastructures are abstractions that allow you to create a Virtual Private Cloud (VPC) instance in the DuploCloud Portal. When you create an Infrastructure, a Plan (with the same name as the Infrastructure) is automatically created and populated with the Infrastructure configuration.
For instructions to create an Infrastructure in the DuploCloud Portal, see:
Setting Tenant session duration
Manage Tenant session duration settings in the DuploCloud Portal
Managing Tenant session duration
In the DuploCloud Portal, you can configure the session duration for all Tenants or for a single Tenant. At the end of a session, the Tenant ceases to be active for a particular user, application, or Service.
For more information about IAM roles and session times in relation to a user, application, or Service, see the .
Step 4: Create a Task Definition for an Application
Create a Task Definition for your application in AWS ECS
You enabled ECS cluster creation when you created the Infrastructure. To create a Service using ECS, you first need to create a Task Definition that serves as a blueprint for your application.
Once you create a Task Definition, you can run it as a Task or as a Service. In this tutorial, we run the Task Definition as a Service.
Estimated time to complete Step 4: 10 minutes.
Step 5: Create a Service
Create a native Docker Service in the DuploCloud Portal
You can use the DuploCloud Portal to create a native Docker service without leaving the DuploCloud interface.
Estimated time to complete Step 5: 10 minutes.
Prerequisites
Before creating a Service, verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
ECS Setup
Enable Elastic Container Service (ECS) for AWS when creating a DuploCloud Infrastructure
Setting up an Infrastructure that uses ECS is similar to creating an Infrastructure with EKS, except that during creation, instead of selecting Enable EKS, you select Enable ECS Cluster.
For more information about ECS Services, see the documentation.
At most one ECS cluster (0 or 1) is supported for each DuploCloud Infrastructure.
VPN Setup
Accept OpenVPN, provision the VPN, and add VPN users
DuploCloud integrates with OpenVPN by provisioning VPN users that you add to the DuploCloud Portal. OpenVPN setup is a comprehensive process that includes accepting OpenVPN, provisioning the VPN, adding users, and managing connection limits to accommodate a growing team.
Accepting OpenVPN
Accept OpenVPN Free Tier (Bring Your Own License) in the AWS Marketplace:
Creating an EKS Service
Finish the Quick Start Tutorial by creating an EKS Service
So far in this DuploCloud AWS tutorial, you created a VPC network with configuration templates (), an isolated workspace (), and an (optionally).
Now you need to create a DuploCloud Service on top of your Infrastructure and configure it to run and deploy your application. In this tutorial path, we'll deploy an application using Docker containers and leveraging .
Alternatively, you can finish this tutorial by:
running Docker containers
Connect EC2 instance
Connect an EC2 instance with SSH by Session ID or by downloading a key
Once an EC2 instance is created, you can connect to it with SSH either by using a Session ID or by downloading a key.
Connecting to an EC2 Linux instance using SSH
In the DuploCloud Portal, navigate to Cloud Services -> Hosts and select the host to which you want to connect.
Click Add.
In the Name field, enter a certificate name.
In the Certificate ARN field, enter the ARN.
Click Create. The ACM Certificate with ARN is created.
From the Key list box, select Other.
In the Key field that displays, enter enabledefaultdomaincert.
In the Value list box, select True.
Click Submit. DuploCloud automatically generates Amazon Certificate Manager (ACM) certificates for your Plan's DNS.
Your Team Provides:
Project details, including objectives, technical specifications, and dates/deadlines.
A list of project members and roles.
A new cloud account with access for DuploCloud engineers.
Read-only access to your existing accounts, documents, repositories, and artifacts.
DuploCloud Provides:
Introduction to the onboarding process.
A DuploCloud Platform in your new cloud account.
Phase 2. Assessment and Project Planning
In the Assessment and Project Planning phase, DuploCloud engineers create and review a high-level block diagram of your project architecture, verify your containerization/migration needs, and confirm your service configurations, interdependencies, and data migration requirements. We also complete a compliance assessment to ensure your project meets all required compliance guidelines. Together, Teams choose a working-session cadence that aligns with your project needs and timeline.
In this phase, we confirm whether the out-of-the-box Agents accomplish the required automation goals, or whether we need to tweak those Agents or write new ones.
By the conclusion of this phase, we will provide you with a DuploCloud Portal your Team can access and detailed information about the project plan. At this point, no new workloads have been deployed and no existing infrastructure has been imported into the Dev Environment.
Your Team Provides:
Verification of your project's containerization needs, service configurations, interdependencies, and data migration requirements.
Project plan questions or feedback.
Input for the creation of a working session plan.
DuploCloud Provides:
List of in-scope services and their statuses.
Project plan for the initial workload deployment.
Confirmation of Tenant structure.
Assessment of any new AI Agents that should be built (if any).
A DuploCloud Portal with access for your Team.
Recurring working session schedule.
Phase 3. Initial Workload Deployment
In this phase, DuploCloud engineers deploy your Dev Environment, which includes all in-scope services and applications. During deployment working sessions, we provide your Team with comprehensive DuploCloud Platform training. Teams discuss and complete any necessary application-level changes and move on to app containerization, secret management, and Kubernetes configuration (where required). Finally, we review the Dev Environment and your Team's test plan.
If we are importing existing infrastructure, we map out the existing Dev Namespaces, wire them into the DuploCloud observability and security solutions as appropriate, and test agentic workflows. We aim to accomplish a majority of the required use cases.
In large organizations, onboarding targets a representative set of use cases. In smaller organizations, we can typically cover nearly all use cases and be ready to go live in a production environment fully managed by DuploCloud.
Your Team Provides:
Necessary application changes.
Dev Environment testing and signoff.
DuploCloud Provides:
A complete Dev Environment deployment for testing.
Training on the DuploCloud Platform during deployment work sessions.
Terraform code that can be used as a template for new environments, if needed.
Custom AI Agents, where applicable.
Phase 4. CI/CD & Release Management
The CI/CD & Release Management phase involves identifying Services and Tenants to implement pipelines, selecting and agreeing on a pipeline implementation logic, and building the pipelines. DuploCloud builds an operational CI/CD pipeline for each Service and trains your Team to add and modify CI/CD pipelines in the future.
Your Team Provides:
Input for CI/CD pipeline development.
Participation in information/knowledge sharing, training, and demo.
DuploCloud Provides:
An operational CI/CD pipeline for each of the project’s Services.
Training so your Team can add and modify pipelines.
Phase 5. Production Deployment
The fifth phase, Production Deployment, focuses on the Production environment. During this phase, the DuploCloud Team works with your Team to confirm your high-availability requirements and apply any needed adjustments. We also review and update infrastructure component scale parameters (e.g., CPU and memory utilization) and monitoring and alerting configurations. Lastly, we review data migration requirements and formulate a production cutover plan.
In large organizations, this phase covers only a subset of environments or use cases; your internal Teams take over from here and continue the process.
Shared Responsibilities
Deploy the Production environment
Test the Production environment
Stabilize production applications
Phase 6. Onboarding Signoff
Onboarding Signoff ensures that your Team is prepared for the following stages of support and operations, where you’ll receive ongoing maintenance assistance. We review your ongoing support needs, discuss your plans for the next 3 to 6 months, and establish the next steps with the Operations Team to ensure a smooth handover and continuity of service. On top of that, the DuploCloud Team delivers an updated architecture diagram, providing a clear and current overview of the system's structure. Lastly, we ask you for feedback about the onboarding experience, which is crucial for assessing the process and identifying areas for improvement.
Your Team Provides:
Feedback about the onboarding experience.
DuploCloud Provides:
An outline of your next steps with the Operations Team.
Creating an Infrastructure with ECS can take some time. See the Infrastructure section for details about other elements on the Add Infrastructure form.
Add Infrastructure page with Enable ECS Cluster selected
Cloud Migration from any existing platform.
Proactive, tailored EKS Cluster Upgrades designed for minimum downtime impact.
Accelerated onboarding of existing Services.
Troubleshooting and debugging for:
Apps and Services crashing.
Slow or crashing OpenSearch or Database Instances.
Proof-of-Concepts (PoCs) for third-party integrations, including roll-out to the development environment.
Downtime during rolling Upgrades.
Investigation and clarification of public cloud provider billing increases. Often, DuploCloud can suggest a more cost-effective alternative.
Consolidation of third-party tools you currently subscribe to that are included with your DuploCloud subscription.
Adding a CI/CD Pipeline for a new Service.
Database configuration.
Application functional, stress, and load testing.
The Help menu in the DuploCloud Portal
In the Onboarding Tour dialog box, click the > button twice. Click Agree and OK as needed to proceed to the Import .ovpn profile dialog box, and click OK.
The user menu accessible from the user icon in the upper right
OpenVPN Access Server login screen
The OpenVPN Access Server pane
The Import .ovpn profile dialog box
Click Add. The Infra - Set Custom Data pane displays.
From the Setting Name list box, select EKS ControlPlane Logs.
In the Setting Value field, enter: api;audit;authenticator;controllerManager;scheduler
Click Set. The EKS ControlPlane Logs setting is displayed in the Settings tab.
Infra - Set Custom Data pane for setting EKS ControlPlane Logs
Infrastructure page with Status Complete displayed
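For reference, the five values in the Setting Value string correspond to the standard EKS control plane log types. Outside of DuploCloud, the same configuration is expressed through the logging payload of the EKS UpdateClusterConfig API, roughly as follows (shown only to clarify what the setting enables):

```json
{
  "clusterLogging": [
    {
      "types": ["api", "audit", "authenticator", "controllerManager", "scheduler"],
      "enabled": true
    }
  ]
}
```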
📂 Structure of the ConfigMap
Presidio is configured through a YAML-based ConfigMap.
This config controls global flags, disabled built-in recognizers, and custom recognizers that we add.
Example (simplified from our deployment):
🔎 Built-in Recognizers
Presidio ships with a rich set of built-in recognizers for detecting common forms of sensitive data (PII). These recognizers are enabled by default and cover global as well as region-specific entities.
While recognizers detect sensitive data, anonymizers transform or redact that data to protect it. Presidio comes with several built-in anonymizers that can be applied when sanitizing text.
Common Anonymizers
replace → Replaces detected entity with a placeholder (e.g., <ENTITY_TYPE>)
mask → Masks characters with a symbol (e.g., ****1234)
redact → Removes the detected entity entirely
hash → Hashes the entity value (SHA256 by default)
encrypt / decrypt → Encrypts or decrypts values using AES
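To make the list above concrete, here is a small pure-Python sketch of what each transformation does to a detected entity span. This is not the Presidio API itself (Presidio anonymizers are configured declaratively); it only illustrates the behaviors described above:

```python
import hashlib

# Illustrative only: each function mimics the behavior of the corresponding
# Presidio anonymizer on a detected entity span [start, end).

def replace(text, start, end, entity_type):
    """replace: swap the entity for a placeholder such as <ENTITY_TYPE>."""
    return text[:start] + f"<{entity_type}>" + text[end:]

def mask(text, start, end, symbol="*", keep_last=4):
    """mask: hide all but the last few characters, e.g. ****1234."""
    entity = text[start:end]
    masked = symbol * (len(entity) - keep_last) + entity[-keep_last:]
    return text[:start] + masked + text[end:]

def redact(text, start, end):
    """redact: remove the entity entirely."""
    return text[:start] + text[end:]

def hash_entity(text, start, end):
    """hash: substitute a SHA256 digest (Presidio's default hash)."""
    digest = hashlib.sha256(text[start:end].encode()).hexdigest()
    return text[:start] + digest + text[end:]

sample = "Card number 4111111111111111 was charged."
print(replace(sample, 12, 28, "CREDIT_CARD"))
# -> Card number <CREDIT_CARD> was charged.
print(mask(sample, 12, 28))
# -> Card number ************1111 was charged.
```

The encrypt/decrypt anonymizers (AES) are omitted from the sketch because they additionally require key management.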
We provide several pre-built agents including Kubernetes troubleshooting, Private GPT, Architecture Diagram generation, and Observability. Full documentation is available in the out-of-the-box agents section.
Can we get custom agents built for our specific use cases?
Yes, we can develop custom agents tailored to your specific DevOps use cases.
What is Private GPT and why should we use it?
Private GPT is a ChatGPT-like interface that keeps all your data within your AWS account. It's ideal for organizations with data security requirements that prevent using public AI services.
AI Models and Framework
Does DuploCloud use Artificial Intelligence (AI) models?
Yes. DuploCloud AI Suite is a framework to build and deploy AI agents. These agents will use LLMs to respond to users, take action, etc.
What Model Providers do you support?
The default Model Provider will be AWS Bedrock to keep the data within your cloud account. With our PreBuilt Agent option, any Model Provider can be used.
What is the purpose of these AI models?
The models, provided through AWS Bedrock inside your own AWS account, are used to power your custom AI agents. They generate intelligent responses to support DevOps workflows, troubleshooting, and automation.
What type of output do the models generate?
Outputs include infrastructure provisioning commands, log analysis, troubleshooting recommendations, and other data that help DevOps teams operate more efficiently.
Data and Training
What data is used to train the AI models?
We do not train models. DuploCloud uses foundational models from AWS Bedrock. Your data is never used to train or improve these models.
What data is input into the models?
The platform provides context about your infrastructure (e.g., Kubernetes cluster names, namespaces, application logs) so the model can assist with relevant operations.
Does DuploCloud's AI process personal information?
No. The AI agents do not process personal or sensitive personal information.
How does DuploCloud ensure the security of my data in DuploCloud AI Suite?
The DuploCloud Platform has always been a self-hosted solution that lives inside your cloud, and the DuploCloud AI Suite follows the same pattern. The AI Agent, any knowledge sources (VectorDBs), and the LLM are all hosted inside your cloud account.
System Integration and Oversight
What systems do AI agents connect to?
DuploCloud's AI agents interact with your DevOps and infrastructure systems (such as AWS, Datadog, and application logs) to perform tasks and provide insights.
Is there human oversight of AI actions?
Yes. The DuploCloud AI HelpDesk is designed with human-in-the-loop oversight. Users can review, approve, or reject actions the AI proposes before execution, ensuring control and safety.
Business Value
Why should we try the Agentic Help Desk?
The Agentic Help Desk tackles the biggest DevOps headaches with AI-driven workflows:
Ticket overload → Automates repetitive tasks so engineers focus on higher-value work
Troubleshooting bottlenecks → Provides end-to-end visibility to accelerate root cause analysis
Deployment delays → Manages rollbacks, approvals, and scaling with human-in-the-loop safeguards
Knowledge silos → Captures and retains critical expertise, ensuring continuity when people leave
With agents for Kubernetes, observability, and log analysis, teams already see faster fixes, safer launches, and less burnout.
Future Development
What's coming next? What more can I expect?
We're expanding the HelpDesk with more prebuilt agents to tackle problems across your infrastructure, and new capabilities designed to deliver long-term value for your growth and modernization journey.
WAF web ACLs
Common IAM policies and SG rules to be applied to all resources in the Plan's Tenants
Unique or shared DNS domain names where applications provisioned in the Plan's Tenants can have a unique DNS name in the domain
Resource Quota that is enforced in each of the Plan's Tenants
DB Parameter Groups
Policies and feature flags applied at the Infrastructure level to the Plan's Tenants
The figure below shows a screenshot of the Plan constructs:
The Capabilities tab for the NONPROD Plan in the DuploCloud Portal
DuploCloud Plans and DNS Considerations
When creating DuploCloud Plans and DNS names, consider the following to prevent DNS issues:
Plans in different portals will delete each other's DNS records, so each portal must use a distinct subdomain for its Plans.
DuploCloud Plans in the same portal can share a DNS domain without deleting each other's records. DuploCloud-created DNS names will always include the Tenant name, which prevents collisions.
The recommended practice for most portals is to set all Plans to the same DNS name, including the default Plan.
Ideally, custom subdomains will be set in the Plans before turning on Shell, Monitoring, or Logging. If the DNS is changed later, those services may need to be updated.
Each Infrastructure represents a network connection to a unique VPC/VNET, in a Region, with a Kubernetes cluster. For AWS, it can also include an ECS cluster. An Infrastructure can be created with five basic inputs: Name, VPC CIDR, Number of AZs, Region, and a choice to enable or disable a K8s/ECS cluster.
The Add Infrastructure page in the DuploCloud Portal
When you create an Infrastructure, DuploCloud automatically creates the following components:
VPC with two subnets (private, public) in each availability zone
Required security groups
NAT Gateway
Internet Gateway
Route tables
VPC peering with the master VPC, which is initially configured in DuploCloud
Additional requirements like custom Private/Public Subnet CIDRs can be configured in the Advanced Options area.
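DuploCloud derives the subnet layout from the VPC CIDR and AZ count automatically. Purely to illustrate the kind of carving involved (a sketch, not DuploCloud's actual allocation logic; the /16 VPC and /24 subnet sizes are assumptions), here is how a CIDR can be split into one public and one private subnet per availability zone:

```python
import ipaddress

def carve_subnets(vpc_cidr, num_azs, subnet_prefix=24):
    """Split a VPC CIDR into a public and a private subnet per AZ.

    Illustrative sketch only -- not DuploCloud's actual algorithm.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = list(vpc.subnets(new_prefix=subnet_prefix))
    layout = {}
    for i in range(num_azs):
        layout[f"az{i + 1}"] = {
            "public": str(subnets[2 * i]),
            "private": str(subnets[2 * i + 1]),
        }
    return layout

print(carve_subnets("10.30.0.0/16", num_azs=2))
# az1 -> 10.30.0.0/24 (public), 10.30.1.0/24 (private)
# az2 -> 10.30.2.0/24 (public), 10.30.3.0/24 (private)
```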
A common use case is two Infrastructures: one for Prod and one for Nonprod. Another is having an Infrastructure in a different region for disaster recovery or localized client deployments.
Plans and Infrastructures
Once an Infrastructure is created, DuploCloud automatically creates a Plan (with the same Infrastructure name) containing the Infrastructure configuration. The Plan is used to create Tenants.
In the DuploCloud Portal, navigate to Administrator -> System Settings. The System Settings page displays.
Click the System Config tab.
Click Add. The App Config pane displays.
From the Config Type list box, select AppConfig.
From the Key list box, select AWS Role Max Session Duration.
From the Select Duration Hour list box, select the maximum session time in hours or set a Custom Duration in seconds.
Click Submit. The AWS Role Max Session Duration and Value are displayed in the System Config tab. Note that the Value you set for maximum session time in hours is displayed in seconds (for example, 2 hours is displayed as 7200). You can Delete or Update the setting in the row's Actions menu.
System Config tab on System Settings page displaying MaximumSessionDuration for all Tenants
Configuring session duration for a single Tenant
In the DuploCloud Portal, navigate to Administrator -> Tenants. The Tenants page displays.
From the Name column, select the Tenant for which you want to configure session duration time.
Click the Settings tab.
Click Add. The Add Tenant Feature pane displays.
From the Select Feature list box, select AWS Role Max Session Duration.
From the Select Duration Hour list box, select the maximum session time in hours or set a Custom Duration in seconds.
Click Add. The AWS Role Max Session Duration and Value are displayed in the Settings tab. Note that the Value you set for maximum session time in hours is displayed in seconds. You can Delete or Update the setting in the row's Actions menu.
The Tenants details page with AWS Role Max Session Duration enabled
The DuploCloud Tenant list box with dev01 selected
Navigate to Cloud Services -> ECS.
In the Task Definition tab, click Add. The Add Task Definition-Basic Options area displays.
In the Name field, enter sample-task-def.
From the vCPU list box, select 0.5 vCPU.
From the Memory list box, select 1 GB.
Click Next. The Advanced Options area displays.
In the Container - 1 section, in the Container Name field, enter sample-task-def-c1.
In the Image field, enter duplocloud/nodejs-hello:latest.
In the Port Mappings section, in the Port field, enter 3000. Port mappings allow containers to access ports for the host container instance to send or receive traffic.
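For reference, the values entered above correspond to fields of a standard ECS task definition (0.5 vCPU is 512 CPU units and 1 GB is 1024 MiB in ECS terms). A rough sketch of the resulting definition follows; the container values are taken from the steps above, while networkMode and the Fargate compatibility flag are assumptions for illustration:

```json
{
  "family": "sample-task-def",
  "cpu": "512",
  "memory": "1024",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "containerDefinitions": [
    {
      "name": "sample-task-def-c1",
      "image": "duplocloud/nodejs-hello:latest",
      "portMappings": [
        { "containerPort": 3000, "protocol": "tcp" }
      ]
    }
  ]
}
```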
Click Add. The Add Service - Basic Options page displays.
In the Service Name field, enter demo-service-d01.
From the Platform list box, select Linux/Docker Native.
In the Docker Image field, enter duplocloud/nodejs-hello:latest.
From the Docker Networks list box, select Docker Default.
Click Next. The Advanced Options page displays.
Click Create.
On the Add Service page, you can also specify optional Environment Variables (EVs) such as databases, Hosts, ports, etc. You can also pass Docker credentials using EVs for testing purposes.
Checking Your Work
In the Tenant list box, select dev01.
Navigate to Docker -> Services.
In the NAME column, select demo-service-d01.
Check the Current column to verify that demo-service-d01 has a status of Running.
Once the Service is Running, you can check the logs for additional information. On the Services page, select the Containers tab, click the menu icon ( ) next to the container name, and select Logs.
Accept the agreement. Other than the regular EC2 instance cost, no additional license costs are added.
Provisioning the VPN
In the DuploCloud Portal, navigate to Administrator -> System Settings.
Select the VPN tab.
Click Provision VPN.
DuploCloud automates the setup by launching a CloudFormation script that provisions the OpenVPN. Once provisioned, the OpenVPN is ready to use.
The VPN tab on the System Settings page in the DuploCloud Portal
The OpenVPN admin password can be found in the CloudFormation stack in your AWS console.
Managing VPN Connection Limits
To support a growing team, you may need to increase the number of VPN connections. This can be achieved by purchasing a larger license from your VPN provider. Once acquired, update the license key in the VPN's web user interface with the DuploCloud team's assistance. Ensure the user-count settings in the VPN reflect the new limit, and verify that your team can access and manage these settings.
After you select the Host, on the Host's page, click the Actions menu and select SSH. The connection to the Host launches in a new browser tab, where you can connect using SSH by session ID.
Connect by downloading a key
After you select the Host, on the Host's page click the Actions menu and select Connect -> Connection Details. The Connection Info for Host window opens. Follow the instructions to connect to the server.
Click Download Key.
Connection Info for Host window with Download Key button
Disable the option to download the SSH key
If you don't want to display the Download Key button, disable the button's visibility.
In the DuploCloud Portal, navigate to Administrator -> System Settings.
Click the System Config tab.
Click Add. The Add Config pane displays.
From the Config Type list box, select Flags.
From the Key list box, select Disable SSH Key Download.
From the Value list box, select true.
Click Submit.
Add Config pane with Disable SSH Key Download Key selected
Configuring admin-only access to the SSH key
Configuring the following system setting disables SSH access for read-only users. Once this setting is configured, only administrator-level users can access SSH.
From the DuploCloud Portal, navigate to Administrator -> Systems Settings.
Select the Settings tab, and click Add. The Update Config Flags pane displays.
The Update Config Flags pane
From the Config Type list box, select Flags.
In the Key list box, select Admin Only SSH Key Download.
From the Value list box, select true.
Click Submit. The setting is configured and SSH access is limited to administrators only.
Tickets
The AI HelpDesk provides a conversational interface for managing infrastructure tasks through AI-powered Tickets. Each Ticket connects you to an intelligent AI Agent who can reason through problems, suggest commands, and collaborate with you in real time via the Canvas workspace.
Creating a Ticket
To begin a new interaction, users create a Ticket within the context of a specific Tenant, ensuring that all Agent actions and suggestions occur within the correct infrastructure scope. When creating a Ticket, the system requires selecting an AI Agent with the appropriate domain expertise, such as Kubernetes operations, observability, or deployment troubleshooting. A ticket can include input from multiple Agents. This allows for collaboration across different domains (for example, Kubernetes and Observability) within the same troubleshooting session.
To create a Ticket, complete the following steps:
Navigate to AI Suite -> HelpDesk.
Select the appropriate Agent and Instance. Other Agents can be assigned to the Ticket later if new capabilities are needed (see Managing Tickets below).
In the Message to Agent field, describe your task or problem, or choose one of the provided prompts (for details on customizing prompt suggestions, see the ).
Tickets can also be created and managed in Slack via the .
Collaborating in the Canvas
Each Ticket provides a Canvas, a dynamic, interactive workspace. The Canvas preserves the state of the interaction, ensuring continuity across multiple exchanges and allowing users to collaborate with the AI Agent as a knowledgeable partner.
In the Canvas you can:
Send messages to the AI Agent
View the agent’s reasoning and suggestions
Track task progress over time
Reviewing Suggested Commands
As the conversation evolves, the AI Agent may propose command-line instructions based on the context of the Ticket. These suggestions appear within the Canvas and are fully interactive. You can:
Approve the command, queuing it for execution the next time a message is sent to the Agent.
Reject the command, optionally providing feedback such as preferences or restrictions (e.g., “Avoid using describe commands”).
Ignore the suggestion, leaving it unacknowledged.
Taking Action in the Terminal
Embedded in the Canvas is a shared Terminal that gives you secure, real-time command-line access to your environment, directly from the workspace. You can respond to AI Agent suggestions or run your own diagnostics and scripts, while the Agent, if granted access, can view the commands you’ve executed and suggest its own. This creates a collaborative space where you and the AI Agent can work together.
To access the Terminal:
Click Terminal in the Canvas and select the desired scope:
Admin Terminal: select for full environment-level access.
Tenant Terminal: select to limit commands to a specific Tenant.
All Terminal input and output are automatically saved to the Ticket history. Shared context allows the Agent to provide more accurate suggestions in future interactions while you maintain control over what is shared.
Using Agent-Suggested Resources
As part of a Ticket interaction, the AI Agent may provide helpful supporting resources in the form of clickable links, URLs and files. These assets are tailored to the specific context of your request and are intended to accelerate troubleshooting or task completion.
Clickable links: Follow Agent-provided links to access guides, external articles, or other web resources relevant to your request.
URLs: Open URLs in a new tab to access dashboards, CI/CD pipelines, logs, and more without leaving the Canvas, enabling monitoring or review of deployment activity.
Temporary files: Access scripts, manifests, configuration templates, or log snapshots stored in a secure Canvas directory. View and edit them directly in the embedded Terminal, with changes visible to the Agent; the directory is automatically deleted when the Ticket is resolved or closed.
Managing Tickets
DuploCloud HelpDesk allows you to manage your Tickets efficiently. You can access past Tickets for reference, assign a different AI Agent to a Ticket for additional expertise, switch between open Tickets, update Ticket status, or delete Tickets.
From the HelpDesk, select one of the following options:
Product Updates
DuploCloud AI Suite Release - 09/05/2025
Overview
We're excited to introduce DuploCloud AI Suite, a comprehensive artificial intelligence platform that transforms how you manage cloud infrastructure. This inaugural release brings intelligent automation to DevOps workflows through specialized AI Agents that work alongside your team to solve complex infrastructure challenges.
What's New
AI Studio
Build and Deploy Custom AI Agents
AI Studio provides everything you need to create, customize, and deploy AI Agents tailored to your organization's needs:
Agent Specification Builder - Define your Agent's capabilities and behavior
Vector Database Support - Enable Agents to access and search your documentation and knowledge base
One-Click Deployment - Automatically deploy Agents to Kubernetes
HelpDesk
Intelligent Support with Human Oversight
HelpDesk transforms traditional IT support through AI-powered assistance while maintaining complete human control:
Smart Ticketing System
Create tickets and get matched with the most appropriate AI Agent
View, search, and filter tickets
Track Agent assignments and ticket status in real time
Collaborative AI Assistance
Human-in-the-Loop Approval - All Agent actions require your explicit approval before execution
Shared Canvas - Work side-by-side with AI Agents in a collaborative workspace
Interactive Terminal - Share a live terminal where both you and the Agent can run commands
Advanced Security Controls
Double Approval - Sensitive commands require additional confirmation for extra security
Credential Proxying - Agents use your permissions, never their own
Smart Prompt Suggestions - Get started quickly with Agent-specific conversation starters
Chat Bubble Integration - Quick access from anywhere in the DuploCloud platform
Available AI Agents
Kubernetes Agent
Your intelligent Kubernetes troubleshooting companion that can:
Diagnose deployment issues and container problems
Create new deployments and services
Perform cluster maintenance and optimization tasks
Observability Agent
Powered by DuploCloud's Advanced Observability Suite, this Agent helps you:
Retrieve and analyze metrics and logs for any microservice
Identify performance bottlenecks and anomalies
Get intelligent insights from your OpenTelemetry data
CI/CD Agent
Intelligent pipeline support that automatically:
Monitors your Jenkins and GitHub Actions pipelines
Creates support tickets when builds fail
Attaches relevant logs, configuration, and error details
Architecture Diagram Agent
Transform complex infrastructure into clear visuals:
Generate architecture diagrams using natural language descriptions
Automatically map relationships between services, databases, and infrastructure
Create shareable Mermaid diagrams of your entire technology stack
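For example, a generated diagram might look like the following (the service names here are hypothetical):

```mermaid
graph TD
  user[User] --> lb[Load Balancer]
  lb --> web[web-service]
  web --> api[api-service]
  api --> db[(PostgreSQL)]
  api --> cache[(Redis)]
```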
Private GPT Agent
Secure AI assistance for security-conscious organizations:
Private ChatGPT-like experience using AWS Bedrock
Complete data privacy with enterprise-grade security
Perfect for teams who need AI assistance without third-party data sharing
Supporting Infrastructure
DuploCloud Cartography
Automatic Discovery - Continuously maps your AWS, Kubernetes, and DuploCloud resources
Relationship Mapping - Understands dependencies between microservices and infrastructure
Custom Dependencies - Define application-specific relationships for complete visibility
DuploCloud Presidio
Data Protection - Automatically redacts sensitive information in AI conversations
Customizable Rules - Configures what data should be protected based on your security policies
Privacy-First - Ensures sensitive data never leaves your environment
Key Benefits
🤖 Intelligent Automation - AI Agents that understand your infrastructure and can take action with your approval
🛡️ Security First - Human oversight for all actions with your credentials, never autonomous access
🚀 Faster Resolution - Collaborative workspace where you and AI work together to solve problems
📊 Better Insights - Automatic diagram generation and intelligent analysis of your systems
🔒 Enterprise Ready - Private AI models with complete data privacy and security controls
Shell Access for Containers
Access the shell for your Native Docker, EKS, and ECS containers
Enable and access shells for your DuploCloud Docker, EKS, and ECS containers directly through the DuploCloud Portal. This provides quick and easy access for managing and troubleshooting your containerized environments.
Native Docker Shell Access
Enabling the Shell for Docker
In the DuploCloud Portal, navigate to Docker -> Services.
From the Docker list box, select Enable Docker Shell. The Start Shell Service pane displays.
In the Platform list box, select Docker Native.
From the Certificate list box, select your certificate.
From the Visibility list box, select Public or Internal.
Accessing the Shell for Docker
From the DuploCloud portal, navigate to Docker -> Containers.
In the row of the container you want to access, click the options menu icon ( ).
Select Container Shell. A shell session launches directly into the running container.
EKS Shell Access
Enabling the Shell for Kubernetes
In the Tenant list box, select the Default Tenant.
In the DuploCloud Portal, navigate to Docker -> Services.
Click the Docker button, and select Enable Docker Shell. The Start Shell Service pane displays.
In the Platform list box, select Kubernetes.
In the Certificate list box, select your certificate.
In the Visibility list box, select Public or Internal.
Accessing the Shell for Kubernetes
From the DuploCloud Portal, navigate to Kubernetes -> Services.
Click the KubeCtl Shell button. The Kubernetes shell launches in your browser.
ECS Shell Access
Accessing the Shell for ECS
From the DuploCloud Portal, navigate to Cloud Services -> ECS. The ECS Task Definition page displays.
Select the name from the TASK DEFINITION FAMILY NAME column.
Select the Tasks tab.
Step 5: Create the ECS Service and Load Balancer
Create an ECS Service from Task Definition and expose it with a Load Balancer
Now that you've created a Task Definition, create a Service, which creates a Task (from the definition) to run your application. A Task is the instantiation of a Task Definition within a cluster. After you create a task definition for your application within Amazon ECS, you can specify multiple tasks to run on your cluster, based on your performance and availability requirements.
Once a Service is created, you must create a Load Balancer to expose the Service on the network. An Amazon ECS service runs and maintains the desired number of tasks simultaneously in an Amazon ECS cluster. If any of your tasks fail or stop, the Amazon ECS service scheduler launches a replacement based on the parameters specified in your Task Definition, maintaining the desired number of tasks.
Estimated time to complete Step 5: 10 minutes.
Prerequisites
Before creating the ECS Service and Load Balancer, verify that you accomplished the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
An Infrastructure and Plan exist, both named NONPROD.
The NONPROD Infrastructure has ECS enabled.
A Tenant named dev01 has been created.
Creating an ECS Service and Load Balancer
In the DuploCloud Portal's Tenant list box, select dev01.
Navigate to Cloud Services -> ECS.
In the Name field, enter sample-httpd-app as the Service name.
In the LB Listeners area, click Add. The Add Load Balancer Listener pane displays.
Checking Your Work
In the Service Details tab, information about the Service and Load Balancer you created is displayed. Verify that the Service and Load Balancer configuration details in the Service Details tab are correct.
Step 4: Create a Host
Creating a Host that acts as an EKS Worker node
Creating an AWS EKS Service uses technologies from AWS and the Kubernetes open-source container orchestration system.
Kubernetes uses worker nodes to distribute workloads within a cluster. The cluster automatically distributes the workload among its nodes, enabling seamless scaling as required system resources expand to support your applications.
Estimated time to complete Step 4: 5 minutes.
Prerequisites
Before creating a Host (which acts as an EKS worker node), verify that you completed the previous tutorial steps. Using the DuploCloud Portal, confirm that:
An Infrastructure and Plan exist, both named NONPROD.
The NONPROD Infrastructure has EKS enabled.
A Tenant named dev01 has been created.
Select the Tenant You Created
In the Tenant list box, select the dev01 Tenant that you created.
Creating a Host
In the DuploCloud Portal, navigate to Cloud Services -> Hosts. The Hosts page displays.
In the EC2 tab, click Add. The Add Host page displays.
In the Friendly Name field, enter
The EKS Image ID is the image published by AWS specifically for an EKS worker in the version of Kubernetes deployed at Infrastructure creation time. For this tutorial, the region is us-west-2, where the NONPROD Infrastructure was created.
If there is no Image ID with an EKS prefix, copy the AMI ID for the desired EKS version following this . Select Other from the Image ID list box and paste the AMI ID in the Other Image ID field. Contact the DuploCloud Support team via your Slack channel if you have questions or issues.
Checking Your Work
In the DuploCloud Portal, navigate to Cloud Services -> Hosts.
Select the EC2 tab.
Verify that the Host status is Running.
DNS Configuration
Managing custom DNS records in DuploCloud
DuploCloud automatically creates and manages DNS records for many resources you deploy, such as Kubernetes Services or VM hosts with public IPs, by integrating with your cloud provider’s DNS service. These DNS records are essential for routing traffic to your workloads and Services.
In most cases, DNS names are created automatically and can be customized within the DuploCloud Platform. However, you may sometimes need to manually configure or troubleshoot DNS entries, such as when using custom domain names, ensuring DuploCloud doesn’t overwrite DNS records you manage outside of the platform, or resolving DNS failures.
Prerequisites
Configure your DNS zones: Make sure your DNS zones are properly configured in both DuploCloud and your cloud provider. This often involves setting up subdomain zones (like apps.mycompany.com) and connecting them to DuploCloud. See DNS setup instructions for your cloud provider:
AWS:
GCP:
Adding Custom DNS Names
You can configure a custom DNS name for a resource directly in the DuploCloud Platform, or manually in your cloud provider’s platform.
Creating Custom DNS Names in DuploCloud
For resources that DuploCloud manages (like services behind Load Balancers), you can customize the automatically generated DNS name:
In the Tenant list box, select the Tenant.
Navigate to the Services page (Kubernetes -> Services, or Docker -> Services). The Services page displays.
Select your Service from the NAME column.
Creating Custom DNS Names in Your Cloud Provider
For resources that don’t have DNS configuration in DuploCloud (e.g., non-Kubernetes services), you will need to manually add DNS entries in your cloud provider’s DNS service.
For AWS:
For GCP:
For Azure:
DuploCloud automatically deletes DNS records that it does not manage. If you create custom DNS names directly in your cloud provider, you must configure DuploCloud to ignore them so they aren’t automatically removed.
Configuring DuploCloud to Ignore DNS Entries
If you create a DNS entry directly in your cloud provider’s platform (AWS, Google Cloud, or Azure), DuploCloud may delete it during updates, as it automatically deletes any DNS entries it did not create. To prevent this from happening, configure Systems Settings to ignore specific DNS entries.
From the DuploCloud Portal, navigate to Administrator -> System Settings -> System Config.
Click Add. The Add Config pane displays.
Fill the fields:
Click Submit. DuploCloud will ignore the specified DNS prefixes.
Resolving DNS Failures
Occasionally, DNS resolution can fail on local machines, especially for private resources behind VPNs. This is often caused by incorrect DNS server settings or local DNS caching.
To fix this:
Use public DNS servers like 8.8.8.8 (Google) or 1.1.1.1 (Cloudflare).
Flush your DNS cache.
Verify VPN connection if accessing private resources.
Step 6: Create a Load Balancer
Creating a Load Balancer to configure network ports to access the application
Now that your DuploCloud Service is running, you have a mechanism to expose the containers and images in which your application resides. However, since your containers are inside a private network, you need a Load Balancer listening on the correct ports to access the application.
In this step, we add a Load Balancer Listener to complete the network configuration.
Estimated time to complete Step 6: 10 minutes.
Prerequisites
Before creating a Load Balancer, verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
An Infrastructure and Plan exist, both named NONPROD.
The NONPROD Infrastructure has EKS enabled.
A Tenant named dev01 exists.
Creating a Load Balancer
In the Tenant list box, select the dev01 Tenant.
In the DuploCloud Portal, navigate to Kubernetes -> Services.
From the NAME column, select demo-service.
From the Type list box, select Application LB.
In the Container Port field, enter 3000. This is the configured port on which the application inside the Docker Container Image duplocloud/nodejs-hello:latest is running.
In the External Port field, enter 80. This is the port through which users will access the web application.
Checking your work
In the Tenant list box, select the dev01 Tenant.
In the DuploCloud Portal, navigate to Kubernetes -> Services.
From the NAME column, select demo-service.
Step 6: Create a Load Balancer
Create a Load Balancer to expose the native Docker Service
Now that your DuploCloud Service is running, you have a mechanism to expose the containers and images in which your application resides. Since your containers are in a private network, you need a Load Balancer to make the application accessible.
In this step, we add a Load Balancer Listener to complete this network configuration.
Estimated time to complete Step 6: 15 minutes.
Prerequisites
Before creating a Load Balancer, verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
An Infrastructure and Plan exist, both named NONPROD.
A Tenant named dev01 exists.
An EC2 Host named host01 exists.
Creating a Load Balancer using Native Docker
In the Tenant list box, select dev01.
Navigate to Docker -> Services.
Select the Service demo-service-d01.
When the LB Status card displays Ready, your Load Balancer is running and ready for use.
Securing the Load Balancer
If you want to secure the Load Balancer you created, follow the steps for securing a Load Balancer.
Creating a Custom DNS Name
You can modify the DNS name by clicking Edit in the DNS Name card in the Load Balancers tab. For more information about DNS setup and custom DNS names, see the DNS Configuration documentation.
Add a security layer and enable other Load Balancer options
This step is optional and unneeded for the example application in this tutorial; however, production cloud apps require an elevated level of protection.
In this tutorial step, for the Application Load Balancer (ALB) you created in Step 6, you will:
Enable access logging to monitor details and record incoming traffic data. Access logs are crucial for analyzing traffic patterns and identifying potential threats, but they are not enabled by default. You must manually activate them in the Load Balancer settings.
Protect against requests that contain .
Estimated time to complete Step 7: 5 minutes.
Prerequisites
Before securing a Load Balancer, verify that you accomplished the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
An Infrastructure and Plan exist, both named NONPROD.
The NONPROD Infrastructure has EKS enabled.
A Tenant named dev01 exists.
Securing the Load Balancer
In the Tenant list box, select the dev01 Tenant.
In the DuploCloud Portal, navigate to Kubernetes -> Services.
From the NAME column, select the Service (demo-service).
Checking Your Work
In the Tenant list box, select the dev01 Tenant.
In the DuploCloud Portal, navigate to Kubernetes -> Services.
From the NAME column, select the Service (demo-service).
Web ACL - None
HTTP to HTTPS Redirect - False
Enable Access Logs - True
Enabling access logs enhances the security and monitoring capabilities of your Load Balancer and provides insights into the traffic accessing your application, for a more robust security posture.
Route 53 Hosted Zone
Create a Route 53 Hosted Zone to program DNS entries
To enable automatic DNS record creation in AWS, the DuploCloud Platform requires a unique Route 53 hosted zone. This hosted zone must be created outside of DuploCloud and configured as a DNS suffix in your infrastructure Plans. The domain should be a dedicated subdomain, such as apps.[MY-COMPANY].com.
Important: Never use this subdomain for any other purpose. DuploCloud takes ownership of all CNAME records within the domain and will remove any entries it does not manage.
For more details on how DNS works in DuploCloud and how to set up custom DNS names, see the DNS Configuration documentation.
Creating a Route 53 Hosted Zone Using AWS Console
To enable DuploCloud to manage DNS records automatically, you need to create a dedicated Route 53 Hosted Zone in your AWS account.
Log in to the AWS Console.
Navigate to Route 53 -> Hosted Zones.
Create a new Route 53 Hosted Zone with the desired domain name, for example, apps.acme.com.
Once delegation is complete, provision the Route 53 Hosted Zone in each DuploCloud Plan—starting with the DEFAULT Plan, as described in the following section.
Configuring DNS for a Plan
Configure a DuploCloud Plan to support automatic DNS record creation using AWS Route 53.
Start by updating the DEFAULT Plan, then repeat the configuration for any additional Plans where DNS-managed Services or Ingresses will be deployed. DNS settings are not shared across Plans.
In the DuploCloud Portal, navigate to Administrator → Plans.
Click the name of the Plan from the NAME column (e.g., DEFAULT).
Select the DNS tab.
Click Save to apply your changes.
Both the External and Internal DNS Suffix values must begin with a dot (.).
For example: .apps.acme.com. DNS entries will not be created correctly without the leading dot.
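This leading-dot rule is easy to verify before saving a Plan. The sketch below (the `validate_dns_suffix` helper is hypothetical, not a DuploCloud API) checks a suffix value such as `.apps.acme.com`:

```python
def validate_dns_suffix(suffix: str) -> str:
    """Raise ValueError unless `suffix` is a dot-prefixed DNS suffix."""
    if not suffix.startswith("."):
        # DuploCloud requires the leading dot, e.g. ".apps.acme.com"
        raise ValueError(f"DNS suffix must begin with a dot: {suffix!r}")
    labels = suffix.lstrip(".").split(".")
    # Each label must be non-empty and contain only letters, digits, or hyphens
    if not all(label and label.replace("-", "").isalnum() for label in labels):
        raise ValueError(f"invalid DNS label in suffix: {suffix!r}")
    return suffix
```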
Step 8: Create a Custom DNS Name (Optional)
Changing the DNS Name for ease of use
After you create a Load Balancer Listener, you can modify the DNS Name for ease of use and reference by your applications. This step isn't necessary to run your application or complete this tutorial.
Once the Load Balancer is created, DuploCloud programs an autogenerated DNS Name registered to demo-service in the Route 53 domain. Before you create production deployments, you must create the Route 53 Hosted Zone domain (if DuploCloud has not already created one for you). For this tutorial, it is not necessary to create a domain.
Estimated time to complete Step 8: 5 minutes.
Prerequisites
Before creating a custom DNS name, verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
An Infrastructure and Plan exist, both named NONPROD.
The NONPROD Infrastructure has EKS enabled.
A Tenant named dev01 exists.
Creating a Custom DNS Name
In the Tenant list box, select the dev01 Tenant.
Navigate to Kubernetes -> Services. The Services page displays.
From the Name column, select demo-service.
An entry for your new DNS name is now registered with demo-service.
Checking Your Work
Navigate to Kubernetes -> Services.
From the Name column, select demo-service.
Select the Load Balancers tab and verify that the DNS Name card displays your modified DNS Name.
Creating an Infrastructure and Plan for AWS
Use the DuploCloud Portal to create an AWS Infrastructure and associated Plan
Creating an Infrastructure
In the DuploCloud Portal, navigate to Administrator -> Infrastructure, and click Add. The Add Infrastructure pane displays.
Add Infrastructure pane
Define the Infrastructure by completing the fields:
Cloud providers limit the number of Infrastructures that can run in each region. Refer to your cloud provider for further guidelines on how many Infrastructures you can create.
Viewing Infrastructure settings
In the DuploCloud Portal, navigate to Administrator -> Infrastructure. The Infrastructure page displays.
From the Name column, select the Infrastructure containing settings that you want to view.
Click the Settings tab. The Infrastructure settings display.
Up to one instance (0 or 1) of an EKS or ECS cluster is supported for each DuploCloud Infrastructure.
Configuring EKS features (optional)
You can customize your EKS configuration:
Enable EKS endpoints, logs, Cluster Autoscaler, and more. For information about configuration options, see these topics.
Configuring ECS features (optional)
You can customize your ECS configuration. See the topic for information about configuration options.
Terminologies in Container Orchestration
Key terms and concepts in DuploCloud container orchestration
The following concepts do not apply to ECS. ECS uses a proprietary policy model, which is explained in a later section.
Familiarize yourself with these DuploCloud concepts and terms before deploying containerized applications in DuploCloud. See the DuploCloud Common Concepts section for a description of DuploCloud Infrastructures, Tenants, Hosts, and Services.
Container Orchestration Terms
Hosts
These are virtual machines (EC2 Instances, GCP Node pools, or Azure Agent Pools). By default, apps within a Tenant are pinned to VMs in the same Tenant. One can also deploy Hosts in one Tenant that can be leveraged by apps in other Tenants. This is called the shared-host model. The shared-host model does not apply to ECS Fargate.
Services
Service is a DuploCloud term and is not the same as a Kubernetes Service. In DuploCloud, a Service is a micro-service defined by a name, Docker Image, number of replicas, and other optional parameters. Behind the scenes, a DuploCloud Service maps 1:1 to a Deployment or StatefulSet, based on whether it has stateful volumes. There are many optional Service configurations for Docker containers. Among these are:
Environment variables
Host Network Mode
Volume mounts
Allocation Tags
Allocation tags allow you to control which Hosts a Service can run on by specifying tags on both the Host and the Service. Services without allocation tags can be scheduled on any Host.
Docker Services use case-insensitive, substring-based matching.
For example, if a Host has the tag HighCpu;HighMem, a Service tagged highcpu or cpu would match and be eligible to run on that Host.
Kubernetes Deployments use exact, case-sensitive matching based on Kubernetes node labels and node selectors. For example, a Host tagged frontend-prod will only match a Service with the exact same tag.
If a Host is tagged and a matching Service exists, the Host may still be used by untagged Services unless all Services in the tenant are tagged. To fully isolate Hosts for a specific purpose, ensure all Services use allocation tags.
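The two matching behaviors can be sketched as simple predicates (illustrative only; the actual scheduling decisions are made by DuploCloud and the orchestrator, not by this code):

```python
def docker_tag_matches(host_tags: str, service_tag: str) -> bool:
    """Docker Services: case-insensitive, substring-based matching.

    `host_tags` is the raw tag string on the Host, e.g. "HighCpu;HighMem".
    """
    return service_tag.lower() in host_tags.lower()


def k8s_tag_matches(host_tag: str, service_tag: str) -> bool:
    """Kubernetes Deployments: exact, case-sensitive label matching."""
    return service_tag == host_tag
```

So `docker_tag_matches("HighCpu;HighMem", "cpu")` is true, while `k8s_tag_matches("frontend-prod", "Frontend-Prod")` is false.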
Host Networking
By default, Docker containers have their own network addresses. Sometimes, containers instead share the VM's network interface. This reuse is called host networking mode.
Load Balancer
A DuploCloud Service that communicates with other Services must be exposed by a Load Balancer. DuploCloud supports the following Load Balancers (LBs).
Application Elastic Load Balancer (ELB)
A DuploCloud Service exposed by an ELB is reachable from anywhere unless it is marked Internal, in which case it is only reachable from within the VPC (or DuploCloud Infrastructure). Application ELBs allow you to use a certificate to terminate SSL on the LB, avoiding the need to provision SSL certificates in the application itself.
In Kubernetes, the platform creates a NodePort Service pointing to the Deployment and adds the Worker Nodes' Host IPs to the ELB. Traffic flows from the client to the external port defined on the ELB (for example, 443), then to the ELB's NodePort (for example, 30004) on the Worker Node, where the Kubernetes proxy forwards it from the NodePort to the container.
Classic ELB (Only applicable to Built-in container orchestration)
Classic ELBs can be used when an application exposes non-HTTP ports that operate on any TCP port. Unless marked as Internal, Services exposed by an ELB are reachable from anywhere. Internal Services are reachable only from within the VPC (or DuploCloud Infrastructure). Classic ELBs let you use a certificate to terminate SSL on the LB, avoiding the need to provision SSL certificates in the application itself.
Cluster IP (Kubernetes only)
Cluster IP Load Balancers can be used when you need to expose the application only within the Kubernetes cluster.
AWS Use Cases
Use Cases supported for DuploCloud AWS
This section details common use cases for DuploCloud AWS.
Organization of use cases
Topics in this section are covered in the order of typical usage. Use cases that are foundational to DuploCloud such as Infrastructure, Tenant, and Hosts are listed at the beginning of this section; while supporting use cases such as Cost management for billing, JIT Access, Resource Quotas, and Custom Resource tags appear near the end.
Supported use cases for DuploCloud AWS
Infrastructure Security Group Rules
Add rules to custom configure your AWS Security Groups at the Infrastructure level
Infrastructure Security Group rules let you manage traffic controls at the Infrastructure level.
For security rules that apply to a specific Tenant, see the Security Groups page.
Adding Security Group Rules
In the DuploCloud Portal, navigate to Administrator -> Infrastructure.
Select the Infrastructure for which you want to add or view Security Group rules from the NAME column.
Select the Security Group Rules tab.
Click Add. The Add Infrastructure Security pane displays.
From the Source Type list box, select Tenant or IP Address.
From the Tenant list box, select the Tenant for which you want to set up the Security Rule.
Select the protocol from the Protocol list box.
In the Port Range field, specify the range of ports for access (for example, 1-65535).
Optionally, add a Description of the rule you are adding.
Click Add.
Viewing Security Group Rules
In the DuploCloud Portal, navigate to Administrator -> Infrastructure.
Select the Infrastructure from the Name column.
Click the Security Group Rules tab. Security Rules are displayed.
Deleting Security Group Rules
In the DuploCloud Portal, navigate to Administrator -> Infrastructure.
Select the Infrastructure from the Name column.
Click the Security Group Rules tab. Security Rules are displayed in rows.
Tenant
A conceptual overview of DuploCloud Tenants
Tenant as a Logical Concept
A Tenant is a project or a workspace and is a child of the Infrastructure. It is the most fundamental construct in DuploCloud. While Infrastructure provides VPC-level isolation, Tenant is the next level of isolation implemented by segregating Tenants using concepts like Security Groups, IAM roles, Instance Profiles, K8s Namespaces, KMS Keys, etc.
For instructions to create a Tenant in the DuploCloud Portal, see:
Step 5: Create a Service
Creating a Service to run a Docker-containerized application
DuploCloud supports three container orchestration technologies to deploy Docker-containerized applications in AWS:
Native EKS
Native ECS Fargate
Built-in container orchestration (native Docker)
Step 9: Test the Application
Test the application to ensure you get the results you expect
You can test your application directly from the Services page using the DNS status card.
Estimated time to complete Step 9 and finish tutorial: 10 minutes.
Prerequisites
Before testing your application, verify that you accomplished the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
Step 6: Test the Application
Test the application to ensure you get the results you expect
You can test your application using the DNS Name from the Services page.
Estimated time to complete Step 6 and finish tutorial: 5 minutes.
Prerequisites
Before testing your application, verify that you accomplished the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
Step 7: Test the Application
Test the application to ensure you get the results you expect.
Estimated time to complete Step 7 and finish tutorial: 5 minutes.
Prerequisites
Before testing your application, verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
AWS Quick Start
Get up and running with DuploCloud inside an AWS cloud environment; harness the power of generating application infrastructures.
This Quick Start tutorial shows you how to set up an end-to-end cloud deployment. You will create DuploCloud Infrastructure and Tenants and, by the end of this tutorial, you can view a deployed sample web application.
Estimated time to complete tutorial: 75-95 minutes.
AWS Tutorial Roadmap
When you complete the AWS Quick Start Tutorial, you have three options or paths, as shown in the table below.
Enable EKS endpoints
Specify EKS endpoints for an Infrastructure
AWS SDKs and the AWS Command Line Interface (AWS CLI) automatically use the default public endpoint for each service in an AWS Region. However, when you create an Infrastructure in DuploCloud, you can specify a custom Private endpoint, a custom Public endpoint, or Both public and private custom endpoints. If you specify no endpoints, the default Public endpoint is used.
For more information about AWS Endpoints, see the .
Specifying public and private endpoints
Step 2: Create a Tenant
Creating a DuploCloud Tenant that segregates your workloads
Now that the Infrastructure and Plan exist and a Kubernetes EKS or ECS cluster has been enabled, create one or more Tenants that use the configuration DuploCloud created.
Tenants in DuploCloud are similar to projects or workspaces and have a subordinate relationship to the Infrastructure. Think of the Infrastructure as a virtual "house" (cloud), with Tenants conceptually "residing" in the Infrastructure performing specific workloads that you define. As Infrastructure is an abstraction of a Virtual Private Cloud, Tenants abstract the segregation created by a Kubernetes Namespace, although Kubernetes Namespaces are only one component that Tenants can contain.
In AWS, cloud features such as IAM Roles, security groups, and KMS keys are exposed in Tenants, which reference these feature configurations.
Estimated time to complete Step 2: 10 minutes.
Step 4: Create an EC2 Host
Create an EC2 Host in DuploCloud
Before you create your application and service using native Docker, create an EC2 Host for storage in DuploCloud.
Estimated time to complete Step 4: 5 minutes.
Prerequisites
Before creating a Host (essentially a virtual machine), verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
# Terraform Management Skill
## Description
Expert in Terraform infrastructure as code for AWS, Azure, and GCP.
## Capabilities
- Create and modify Terraform configurations
- Manage state files and remote backends
- Troubleshoot plan and apply errors
- Implement modules and workspaces
- Handle provider version constraints
## Best Practices
- Always run `terraform plan` before apply
- Use remote state with locking
- Implement proper variable validation
- Tag all resources appropriately
- Use modules for reusable components
## Common Commands
```bash
terraform init
terraform plan -out=tfplan
terraform apply tfplan
terraform destroy
terraform state list
```
## Error Handling
- Check for state lock issues
- Verify provider credentials
- Validate syntax with `terraform validate`
- Review dependency cycles
At the logical level, a Tenant is fundamentally the following:
Container of Resources: All resources (except those corresponding to Infrastructure) are created within the Tenant. If we delete the Tenant, all resources within it are terminated.
Security Boundary: All resources within the Tenant can talk to each other. For example, a Docker container deployed in an EC2 instance within a Tenant will have access to S3 buckets and RDS instances in the same Tenant. By default, RDS instances in other Tenants cannot be reached. Tenants can expose endpoints to each other via ELBs or explicit inter-Tenant SG and IAM policies.
User Access Control: Self-service is the bedrock of the DuploCloud Platform. To that end, users can be granted Tenant-level access. For example, an administrator may be able to access all Tenants while developers can only access the Dev Tenant and a data scientist the data-science Tenant.
Billing Unit: Since a Tenant is a container of resources, all resources in a Tenant are tagged with the Tenant's name in the cloud provider, making it easy to segregate usage by Tenant.
Mechanism for Alerting: Alerts generate faults for all resources within a Tenant.
Mechanism for Logging: Each Tenant has a unique set of logs.
Mechanism for Metrics: Each Tenant has a unique set of metrics.
Tenants and Kubernetes
Each Tenant is mapped to a Namespace in Kubernetes.
When you create a Tenant in an Infrastructure, a Namespace called duploservices-TENANT_NAME is created in the Kubernetes cluster. For example, if a Tenant is called Analytics in DuploCloud, the Kubernetes Namespace is called duploservices-analytics.
All application components in the Analytics Tenant are placed in the duploservices-analytics Namespace. Since Nodes cannot be part of a Kubernetes Namespace, DuploCloud creates a tenantname label for all the Nodes launched within the Tenant. For example, a Node launched in the Analytics Tenant is labeled tenantname: duploservices-analytics.
Any Pods launched using the DuploCloud UI have an appropriate Kubernetes nodeSelector that ties the Pod to the Nodes within the Tenant. Ensure kubectl deployments use the proper nodeSelector.
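For example, a kubectl-managed Deployment for a hypothetical Analytics Tenant would carry a nodeSelector like the following (the workload name and image are illustrative; the tenantname label follows the convention described above):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: analytics-api                      # hypothetical workload name
  namespace: duploservices-analytics       # Namespace created for the Tenant
spec:
  replicas: 1
  selector:
    matchLabels:
      app: analytics-api
  template:
    metadata:
      labels:
        app: analytics-api
    spec:
      nodeSelector:
        tenantname: duploservices-analytics  # pins Pods to the Tenant's Nodes
      containers:
        - name: api
          image: duplocloud/nodejs-hello:latest
```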
Tenant Use Cases
DuploCloud customers often create at least two Tenants for their Prod and Nonprod cloud environments (Infrastructures).
You can map Tenants in each (or all) of your production environments.
For example:
Production Infrastructure
Pre-production Tenant: for preparing or reviewing production code
Production Tenant: for deploying tested code
Nonproduction Infrastructure
Development Tenant: For writing and reviewing code
Quality Assurance Tenant: For automated testing
Some customers in larger organizations create Tenants based on application environments: one Tenant for data science applications, another for web applications, etc.
Tenants can also isolate a single customer workload, allowing more granular performance monitoring, more flexible scaling, or tighter security. This is referred to as a single-Tenant setup. In this case, a DuploCloud Tenant maps to an environment used exclusively by the end client.
With large sets of applications accessed by different teams, it is helpful to map Tenants to team workloads (Dev-analytics, Stage-analytics, etc.).
Tenant Naming Conventions
Ensure Tenant names in DuploCloud are unique and not substrings of one another. For example, if you have a Tenant named dev, you cannot create another named dev2. This limitation arises because IAM policies and other security controls rely on pattern matching to enforce Tenant security boundaries. If Tenant names overlap, the patterns may not work correctly.
To avoid issues, we recommend using distinct numerical suffixes like dev01 and dev02.
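You can check a proposed set of names for this kind of overlap before creating Tenants; the helper below is an illustrative sketch, not a DuploCloud API:

```python
def overlapping_tenant_names(names: list[str]) -> list[tuple[str, str]]:
    """Return (name, other) pairs where `name` is a substring of `other`.

    Overlaps like ("dev", "dev2") break pattern-matched security
    boundaries; distinct suffixes like dev01/dev02 avoid the problem.
    """
    lowered = [n.lower() for n in names]  # compare case-insensitively, to be safe
    return [
        (names[i], names[j])
        for i in range(len(names))
        for j in range(len(names))
        if i != j and lowered[i] in lowered[j]
    ]
```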
Entrypoint or command overrides
Resource caps
Kubernetes health checks
For example, a Host tagged frontend-prod will not match a Service tagged Frontend-Prod or frontend-prod-1. Kubernetes allocation tags must start and end with an alphanumeric character and may only contain letters, numbers, hyphens (-), underscores (_), and periods (.).
In the Task Definitions tab, select the Task Definition Family Name, DUPLOSERVICES-DEV01-SAMPLE-TASK-DEF. This is the Task Definition Name you created prepended by a unique identifier, which includes your Tenant name (DEV01) and part of your Infrastructure name (ECS-TEST).
In the Service Details tab, click the Configure ECS Service link. The Add ECS Service page displays.
From the Select Type list box, select Application LB.
In the Container Port field, enter 3000.
In the External Port field, enter 80.
From the Visibility list box, select Public.
In the Health Check field, enter / to specify the root path where health checks are performed.
Click the Configure Load Balancer link. The Add Load Balancer Listener pane displays.
In the External Port field, enter 80. This is the port through which users will access the web application.
From the Visibility list box, select Public.
From the Application Mode list box, select Docker Mode.
In the Health Check field, type / (forward-slash) to indicate that health checks are performed against the root path.
In the Backend Protocol list box, select HTTP.
Click Add. The Load Balancer is created and initialized. Monitor the LB Status card on the Services page. The LB Status card displays Ready when the Load Balancer is ready for use.
Verify that the LB Status card displays a status of Ready.
Note the DNS Name of the Load Balancer that you created.
In the LB Listeners area of the Services page, note the configuration details of the Load Balancer's HTTP protocol, which you specified, when you added it above.
Click the Configure Load Balancer link. The Add Load Balancer Listener pane displays.
From the Select Type list box, select Application LB.
In the Container Port field, enter 3000: the port on which the application inside the container image (duplocloud/nodejs-hello:latest) is running.
In the External Port field, enter 80.
From the Visibility list box, select Public.
From the Application Mode list box, select Docker Mode.
In the Health Check field, enter /, indicating that health checks are performed against the root path.
Access the Hosted Zone and note the name server names.
Go to your root domain’s DNS provider (for example, the registrar managing acme.com) and create an NS record for your subdomain (e.g., apps.acme.com). This NS record should point to the name servers assigned to your Route 53 Hosted Zone (the ones you noted earlier). This delegation ensures that DNS queries for your subdomain are routed to AWS Route 53 for resolution.
Click Edit. The Set Plan DNS pane displays.
Complete the following fields:
Route53 Zone ID: Enter the Hosted Zone ID from AWS Route 53. This is required to enable DNS support.
External DNS Suffix: Enter the DNS suffix for public-facing records (e.g., .apps.acme.com). Must begin with a dot (.).
Internal DNS Suffix: Enter the DNS suffix for internal resources (e.g., .internal.acme.com). Must begin with a dot (.).
Ignore Global DNS: Optionally, select this option to disable global DNS record management for the Plan, allowing localized DNS control.
Subnet CIDR Bits: Enter the number of CIDR bits to define the size of each subnet (e.g., 22). Lower values create larger subnets.
Enable EKS: Select this option if you want to deploy a Kubernetes (EKS) cluster in the Infrastructure. Once selected, you will be prompted to provide additional EKS settings, such as Cluster Mode (Auto or Standard), EKS Version, EKS Endpoint Visibility, Cluster IP CIDR, and EKS logging.
Enable ECS Cluster: Select this option if you want to deploy an ECS cluster for running containerized workloads. Once selected, you can optionally select Enable Container Insights for ECS monitoring.
Advanced Options: Optionally, expand this section to configure additional network settings. You can enter custom values for Private Subnet CIDR and Public Subnet CIDR using semicolon-separated CIDR blocks (e.g., 10.10.0.0/22;10.10.4.0/22).
Click Create. The Infrastructure is created and listed on the Infrastructure page. DuploCloud automatically creates a Plan (with the same Infrastructure name) with the Infrastructure configuration.
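Before entering Advanced Options values, you can sanity-check that each semicolon-separated subnet CIDR actually falls inside the VPC CIDR. Here is a minimal sketch using Python's standard ipaddress module (the helper name is illustrative):

```python
import ipaddress


def validate_subnet_cidrs(vpc_cidr: str, subnet_list: str) -> list[str]:
    """Check semicolon-separated subnet CIDRs against the VPC CIDR.

    Example inputs match the portal's format:
    vpc_cidr="10.10.0.0/16", subnet_list="10.10.0.0/22;10.10.4.0/22".
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(s.strip()) for s in subnet_list.split(";")]
    for net in subnets:
        if not net.subnet_of(vpc):
            raise ValueError(f"{net} is not within the VPC CIDR {vpc}")
    return [str(n) for n in subnets]
```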
Name: Enter a unique name for the Infrastructure to identify it within the DuploCloud Portal.
Cloud: This is automatically set to AWS and cannot be changed.
VPC CIDR: Enter the CIDR block for the new VPC (e.g., 10.10.0.0/16). Make sure it doesn’t overlap with existing networks.
Region: Select the AWS region where you want to deploy your Infrastructure (e.g., us-east-1).
Availability Zones: Choose how many AWS availability zones to use. Select more zones for higher availability.
AWS Add Infrastructure page with highlighted Enable EKS and Enable ECS Cluster options
Settings tab on the Infrastructure page
Subnet CIDR Bits
In the first column of the Security Group row, click the Options Menu Icon ( ) and select Delete.
Add Infrastructure Security pane defining port range for Cross-tenant access
Viewing Security Rules using the Security Group Rules tab
Built-in container orchestration in DuploCloud using EKS/ECS
You don't need experience with Kubernetes to deploy an application in the DuploCloud Portal. However, it is helpful to be familiar with the Docker platform. Docker runs on any platform and provides an easy-to-use UI for creating, running, and managing containers.
To deploy your own applications with DuploCloud, you’ll choose a public image or provide credentials for your private repository and configure your Docker Registry credentials in DuploCloud.
This tutorial will guide you through deploying a simple Hello World NodeJS web app using DuploCloud's built-in container orchestration with EKS. We’ll use a pre-built Docker container and access Docker images from a preconfigured Docker Hub.
Estimated time to complete Step 5: 10 minutes.
Prerequisites
Before creating a Service, verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
In the DuploCloud Portal, navigate to Kubernetes -> Services.
Click Add. The Add Service page displays.
Add Service page
From the table below, enter the values that correspond to the fields on the Add Service page. Accept all other default values for fields not specified.
Click Next. The Advanced Options page is displayed.
At the bottom of the Advanced Options page, click Create. In about five (5) minutes, the Service will be created and initialized, displaying a status of Running in the Containers tab.
Add a Service page field
Value
Service Name
demo-service
Docker Image
duplocloud/nodejs-hello:latest
Use the Containers tab to monitor the Service creation status, between Desired (Running) and Current.
Note that if you skipped Step 7 and/or Step 8, the configuration in the Other Settings and DNS cards appears slightly different from the configuration depicted in the screenshot below. These changes do not impact you in testing your application, as these steps are optional. You can proceed to test your app with no visible change in the output of the deployable application.
In the Tenant list box, select the dev01 Tenant.
In the DuploCloud Portal, navigate to Kubernetes -> Services. The Services page displays.
From the Name column, select demo-service.
Click the Load Balancers tab.
In the DNS status card, click the Copy Icon ( ) to copy the DNS address displayed to your clipboard.
Open a browser instance and Paste the DNS in the URL field of your browser.
Press ENTER. A web page with the text Hello World! is displayed, from the JavaScript program residing in your Docker Container running in demo-service, which is exposed to the web by your Load Balancer.
Web page with Hello World! displayed
It can take from five to fifteen (5-15) minutes for the DNS Name to become active once you launch your browser instance to test your application.
Congratulations! You have just launched your first web service on DuploCloud!
Reviewing What You Learned
In this tutorial, your objective was to create a cloud environment to deploy an application for testing purposes, and to understand how the various components of DuploCloud work together.
The application rendered a simple web page with text, coded in JavaScript, from software application code residing in a Docker container. You can use this same procedure to deploy much more complex cloud applications.
Created a Tenant named dev01 in Infrastructure NONPROD. While generating the Infrastructure, DuploCloud created a set of templates (Plan) to configure multiple AWS and Kubernetes components needed for your environment.
Created a Host named host01, providing the application with storage resources.
Created a Service named demo-service to connect the Docker containers and associated images housing your application code to the DuploCloud Tenant environment.
Created a Load Balancer to expose your application via ports and backend network configurations.
Verified that the application works as expected by testing the DNS Name exposed by the Load Balancer Listener.
Cleaning Up Your Tutorial Environment
In this tutorial, you created many artifacts for testing purposes. Now that you are finished, clean them up so others can run this tutorial using the same names for Infrastructure and Tenant.
To delete the dev01 tenant follow these instructions, then return to this page. As you learned, the Tenant segregates all work in one isolated environment, so deleting the Tenant you created cleans up most of your artifacts.
Finish by deleting the NONPROD Infrastructure. In the DuploCloud Portal, navigate to Administrator -> Infrastructure. Click the Action menu icon () for the NONPROD row and select Delete.
The NONPROD Infrastructure is deleted and you have completed the clean-up of your test environment.
Thanks for completing this tutorial and proceed to the next section to learn more about using DuploCloud with AWS.
The sample-httpd-app Service and Load Balancer have been created.
Testing the Application
In the Tenant list box, select the dev01 Tenant that you created.
Navigate to Cloud Services -> ECS.
Click the Service Details tab.
In the DNS Name card, click the Copy Icon ( ) to copy the DNS address to your clipboard.
Service Details tab with DNS Name card highlighted
Open a browser and paste the DNS address in the URL field of your browser.
Press ENTER. A web page with the text It works! displays, served from the Apache HTTP Server (httpd) running in your Docker Container in sample-httpd-app, which is exposed to the web by your Application Load Balancer.
Web page with It works! displayed
It can take from five to fifteen (5-15) minutes for the Domain Name to become active once you launch your browser instance to test your application.
Congratulations! You have just launched your first web service on DuploCloud!
Reviewing What You Learned
In this tutorial, your objective was to create a cloud environment to deploy an application for testing purposes, and to understand how the various components of DuploCloud work together.
The application rendered a simple web page with text, coded in JavaScript, from software application code residing in a Docker container. You can use this same procedure to deploy much more complex cloud applications.
Created a Tenant named dev01 in Infrastructure NONPROD. While generating the Infrastructure, DuploCloud created a set of templates (Plan) to configure multiple AWS and Kubernetes components needed for your environment.
Created a Task Definition named sample-task-def, used to create a service to run your application.
Created a Service named sample-httpd-app to connect the Docker containers and associated images, in which your application code resides, to the DuploCloud Tenant environment. In the same step, you created a Load Balancer to expose your application via ports and backend network configurations.
Verified that the application works as expected by testing the DNS Name exposed by the Load Balancer Listener.
Cleaning Up Your Tutorial Environment
In this tutorial, you created many artifacts. When you are ready, clean them up so others can run this tutorial using the same names for Infrastructure and Tenant.
To delete the dev01 tenant follow these instructions, and then return to this page. As you learned, the Tenant segregates all work in one isolated environment, so deleting the Tenant cleans up most of your artifacts.
Finish by deleting the NONPROD Infrastructure. In the DuploCloud Portal, navigate to Administrator -> Infrastructure. Click the Action menu icon () for the NONPROD row and select Delete.
The NONPROD Infrastructure is deleted and you have completed the clean-up of your test environment.
Thanks for completing this tutorial and proceed to the next section to learn more about using DuploCloud with AWS.
Navigate to Docker -> Services. The Services page displays.
From the Name column, select demo-service-d01.
Click the Load Balancers tab. The Application Load Balancer configuration is displayed.
In the DNS status card on the right side of the Portal, click the Copy Icon ( ) to copy the DNS address displayed to your clipboard.
The Services Details page with the DNS status card highlighted.
Open a browser instance and paste the DNS in the URL field of your browser.
Press ENTER. A web page with the text Hello World! is displayed, from the JavaScript program residing in your Docker Container running in demo-service-d01, which is exposed to the web by your Load Balancer.
A Browser instance displaying Hello World!
It can take from five to fifteen (5-15) minutes for the DNS Name to become active once you launch your browser instance to test your application.
Congratulations! You have just launched your first web service on DuploCloud!
Reviewing What You Learned
In this tutorial, your objective was to create a cloud environment to deploy an application for testing purposes, and to understand how the various components of DuploCloud work together.
The application rendered a simple web page with text, coded in JavaScript, from software application code residing in a Docker container. You can use this same procedure to deploy much more complex cloud applications.
Created a Tenant named dev01 in Infrastructure NONPROD. While generating the Infrastructure, DuploCloud created a set of templates (Plan) to configure multiple Azure and Kubernetes components needed for your environment.
Created a Host named host01, so your application has storage resources.
Created a Service named demo-service-d01 to connect the Docker containers and associated images, in which your application code resides, to the DuploCloud Tenant environment.
Created a Load Balancer to expose your application via ports and backend network configurations.
Verified that the application works as expected by testing the DNS Name exposed by the Load Balancer Listener.
Cleaning Up Your Tutorial Environment
In this tutorial, you created many artifacts for testing purposes. Clean them up so others can run this tutorial using the same names for Infrastructure and Tenant.
To delete the dev01 tenant follow these instructions, then return to this page. As you learned, the Tenant segregates all work in one isolated environment, so deleting the Tenant that you created cleans up most of your artifacts.
Finish by deleting the NONPROD Infrastructure. In the DuploCloud Portal, navigate to Administrator -> Infrastructure. Click the Action menu icon () for the NONPROD row and select Delete.
The NONPROD Infrastructure is deleted and you have completed the clean-up of your test environment.
Thanks for completing this tutorial and proceed to the next section to learn more about using DuploCloud with AWS.
EKS (Elastic Kubernetes Service): Create a Service in DuploCloud using AWS Elastic Kubernetes Service and expose it using a Load Balancer within DuploCloud.
ECS (AWS Elastic Container Service): Create an app and Service in DuploCloud using AWS Elastic Container Service.
Native Docker: Create a Service in Docker and expose it using a Load Balancer within DuploCloud.
Optional steps in each tutorial path are marked with an asterisk in the table below. While these steps are not required to complete the tutorials, you may want to perform or read through them, as they are normally completed when you create production-ready services.
For information about the differences between these methods and to help you choose which method best suits your needs, skills, and environments, see this AWS blog and Docker documentation.
Step | EKS | ECS | Native Docker Services
1 | Create Infrastructure and Plan | Create Infrastructure and Plan | Create Infrastructure and Plan
2 | Create Tenant | Create Tenant | Create Tenant
3 | * Optional
AWS Video Demo
Click the card below to watch DuploCloud video demos.
Follow the steps in the section Creating an Infrastructure. Before clicking Create, specify EKS Endpoint Visibility.
From the EKS Endpoint Visibility list box, select Public, Private, or Both public and private. If you select Private or Both public and private, the Allow VPN Access to the EKS Cluster option is enabled.
Click Advanced Options.
Using the Private Subnet CIDR and Public Subnet CIDR fields, specify CIDRs for alternate public and private endpoints.
Click Create.
Infrastructure page with EKS Endpoint Visibility field and Advanced Options for specifying custom subnet CIDRs
Infrastructure page with EKS Endpoint Visibility Private option preconfigured
Changing VPN visibility from public to private (optional)
To change VPN visibility from public to private after you have created an Infrastructure, follow these steps.
In the DuploCloud Portal, navigate to Administrator -> Infrastructure. The Infrastructure page displays.
From the NAME column, select the Infrastructure.
Click the Settings tab.
In the EKS Endpoint Visibility row, in the Actions column, click the ( ) icon and select Update Setting. The Infra - Set Custom Data pane displays.
From the Setting Name list box, select Enable VPN Access to EKS Cluster.
Select Enable to enable VPN.
Click Set. The Allow VPN Access to the EKS Cluster option will be enabled.
Changing EKS endpoint visibility (optional)
Modifying endpoints can incur an outage of up to thirty (30) minutes in your EKS cluster. Plan your update accordingly to minimize disruption for your users.
To modify the visibility for EKS endpoints you have already created:
In the DuploCloud Portal, navigate to Administrator -> Infrastructure. The Infrastructure page displays.
From the Name column, select the Infrastructure for which you want to modify EKS endpoints.
Click the Settings tab.
In the EKS Endpoint Visibility row, in the Actions column, click the ( ) icon and select Update Setting. The Infra - Set Custom Data pane displays.
From the Setting Value list box, select the desired type of visibility for endpoints (private, public, or both).
Click Set.
Infra - Custom Data pane with Setting Value for EKS Endpoint Visibility
DuploCloud customers often create at least two Tenants for their production and non-production cloud environments (Infrastructures).
For example:
Production Infrastructure
Pre-production Tenant - for preparing or reviewing production code
Production Tenant - for deploying tested code
Non-production Infrastructure
Development Tenant - for writing and reviewing code
Quality Assurance Tenant - for automated testing
In larger organizations, some customers create Tenants based on application environments, such as one Tenant for Data Science applications, another for web applications, and so on.
Tenants are sometimes created to isolate a single customer workload, allowing more granular performance monitoring, scaling flexibility, or tighter security. This is referred to as a single-Tenant setup.
Prerequisites
Before creating a Tenant, verify that you completed the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
Integrate Slack with AI Helpdesk to access AI support from your Slack workspace
The DuploCloud AI HelpDesk Slack integration lets you get AI-powered support directly in your Slack workspace. You can create and manage Tickets, interact with AI Agents, and approve commands without leaving Slack. The integration works across all your DuploCloud Portals and maintains your existing shared Slack channel workflows. When needed, you can also open the full HelpDesk interface to perform advanced actions.
Here’s a quick look at how DuploCloud AI Agents work inside Slack.
Using the AI HelpDesk Slack Integration
Configuring Agent Permissions
Configuring permissions lets AI Agents assume specific user identities for proper access when handling your requests in Slack.
Navigate to Administrator -> Tenants.
Select the Tenant name from the NAME column.
Select the Settings tab.
Starting a Conversation
In your Slack channel, mention the Slack Bot using @DuploCloud AI. One of the following things will happen:
If the thread is already linked to a HelpDesk Ticket: The AI Agent joins the conversation. You can interact with it directly to request updates, ask questions, or continue troubleshooting in the Slack thread.
If the thread is not yet linked to a ticket: The Slack Bot will prompt you to create one.
Enter the following information to configure your Ticket:
The Slack backend creates a Ticket in the selected Tenant and assigns it to the chosen AI Agent. You can now interact with the AI Agent directly in the Slack thread. For more information about Tickets, see the .
Interacting with Tickets in Slack
Once your Ticket is created, you can manage the Ticket and interact with the AI Agents directly in Slack.
The AI Agent posts updates, responses, and command suggestions in the thread. Review each suggested command and choose Approve, Reject, or Ignore. For more information about interacting with Tickets, see the .
Click Submit. Approved commands are executed in your connected DuploCloud Portal, with results shown directly in the Slack thread.
Updating Ticket Context
Updating Ticket context lets you change the Portal, Tenant, Agent, or Default Permissions for your Ticket mid-conversation, without starting over. This ensures the AI Agent is always working with the correct environment and permissions.
From the Ticket Slack thread, click the menu icon () and select Update Context.
Modify any of the following: Portal, Tenant, Agent, or Permissions.
When updating the
Security Consideration: Historical context may contain sensitive data (API keys, credentials). Use discretion when transferring.
Opening a Ticket in HelpDesk
Some operations may require direct HelpDesk access. Opening a Ticket in the HelpDesk gives you access to the full interface in the correct Portal, allowing you to perform advanced operations not available in Slack.
In your Slack thread, click Open AI HelpDesk.
The Ticket opens in HelpDesk in the correct Portal, where you can view details, manage actions, and access additional HelpDesk features. For more information about using HelpDesk, see the .
Tools
In the DuploCloud AI Suite, a Tool is a modular component that enables the DuploCloud AI Agent to go beyond conversation and interact directly with your infrastructure. Each Tool accepts an input, executes custom logic, and returns a string output, which the Agent can use to make decisions, continue a conversation, or trigger follow-up actions. By defining Tools, you control what the AI Agent is capable of, ensuring it operates only within the boundaries you explicitly set.
Tools are built as Python classes that inherit from the BaseTool class in LangChain, making them compatible with LangChain-based agents and easily callable by the DuploCloud AI Agent. When a user query is received, the Agent determines whether a Tool is needed, selects the appropriate one, passes in the relevant input, and uses the output to guide its next steps. You can create Tools for:
Querying Tenant or infrastructure configuration
Fetching deployment statuses, logs, or metrics
Integrating with systems like Slack, PagerDuty, or GitHub
Retrieving usage or billing data for reporting and cost insights
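As a sketch of the pattern described above, the example below mimics the shape of a LangChain-style Tool. To keep it self-contained, it uses a minimal stand-in for LangChain's BaseTool rather than importing the real class, and the tool name, fields, and hard-coded status data are illustrative, not DuploCloud's actual API:

```python
# Minimal stand-in for LangChain's BaseTool so this sketch runs without
# LangChain installed. In a real Tool you would instead inherit from the
# BaseTool class provided by LangChain.
class BaseTool:
    name: str = ""
    description: str = ""

    def run(self, tool_input: str) -> str:
        return self._run(tool_input)


class DeploymentStatusTool(BaseTool):
    """Illustrative Tool: look up a service's deployment status.

    The data source here is a hard-coded dict; a real Tool would query
    the DuploCloud API or another system of record instead.
    """
    name: str = "deployment_status"
    description: str = (
        "Returns the deployment status for a named service. "
        "Input: the service name as a plain string."
    )

    _STATUSES = {"demo-service": "Running", "sample-httpd-app": "Pending"}

    def _run(self, tool_input: str) -> str:
        # A Tool always returns a string the Agent can reason over.
        status = self._STATUSES.get(tool_input.strip())
        if status is None:
            return f"No deployment found for service '{tool_input}'."
        return f"Service '{tool_input}' is {status}."


tool = DeploymentStatusTool()
print(tool.run("demo-service"))  # Service 'demo-service' is Running.
```

The name and description fields matter: the Agent uses them to decide when this Tool is the right one to invoke for a given user query.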
Uploading Tool Packages
Each Tool in the DuploCloud AI Suite relies on a Python package that contains the logic it will execute. Before creating a Tool, upload your package to DuploCloud.
Navigate to AI Suite → Studio → Tools, and select the Packages tab.
Click Add. The Upload Package pane appears.
In the Tool Name field, provide a name for the package.
Once uploaded, the package will appear in the list under the Packages tab. You can then reference this package when creating a Tool by specifying its name and providing an install script.
Creating Tools
Add a custom Tool to the DuploCloud AI Suite to enable the DuploCloud AI Agent to import and execute it during its reasoning process.
From the DuploCloud Portal, navigate to AI Suite → Studio → Tools, and select the Tools tab.
Click the Add button. The Add Tool Definition pane displays.
Complete the following fields:
Click Submit to create the Tool. Once the Tool is created, it can be selected when .
Note: An Agent can use multiple Tools at once. When you assign Tools during Agent creation, the Agent dynamically selects and invokes whichever Tool is most appropriate for the user’s request.
Managing Tools
After creating a Tool, you can manage it at any time:
Navigate to AI Suite → Studio → Tools, and select the Tools tab.
Click on a Tool name in the NAME column to view its details.
Select from the tabs at the top (e.g., MetaData, Build Env Vars, Deployment Env Vars, Packages, Details).
Note: If a Tool fails during execution, the Agent will capture the error message and return it in the Ticket conversation. This helps with troubleshooting and allows you to refine the Tool code or configuration without breaking the overall workflow.
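The failure-capture behavior described in the note can also be applied inside the Tool body itself: catching exceptions and returning the message as the Tool's string output keeps the conversation flowing instead of aborting the run. A hedged sketch, where the function names and log text are hypothetical:

```python
def fetch_logs(service_name: str) -> str:
    # Illustrative stand-in for real Tool logic that might raise
    # (network errors, missing resources, bad input, ...).
    if not service_name:
        raise ValueError("service name must not be empty")
    return f"last 10 log lines for {service_name}"


def run_tool_safely(tool_input: str) -> str:
    """Execute the Tool's logic, converting any failure into a string
    the Agent can surface in the Ticket conversation."""
    try:
        return fetch_logs(tool_input)
    except Exception as exc:  # return the error as output, don't re-raise
        return f"Tool error ({type(exc).__name__}): {exc}"


print(run_tool_safely("demo-service"))
print(run_tool_safely(""))  # the error is reported as text, not raised
```

Because the error comes back as ordinary Tool output, you can iterate on the Tool's code or configuration from the Ticket conversation without breaking the overall workflow.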
Container Orchestrators
An overview of the container orchestration technologies DuploCloud supports
Most application workloads deployed on DuploCloud are in Docker containers. The rest consist of serverless functions and big data workloads such as Amazon EMR, Apache Airflow, and Amazon SageMaker. DuploCloud abstracts the complexity of container orchestration technologies, allowing you to focus on deploying, updating, and debugging your containerized application.
Among the technologies DuploCloud supports are:
Kubernetes: On AWS, DuploCloud supports orchestration using Elastic Kubernetes Service (EKS). On GCP, we support GKE Autopilot and node-pool-based clusters. On Azure, we support Azure Kubernetes Service (AKS) and Azure Web Apps.
Built-In (DuploCloud): The DuploCloud Built-In container orchestration has the same interface as the docker run command, but it can be scaled to manage hundreds of containers across many Hosts, providing capabilities such as associated load balancers, DNS, and more.
AWS ECS Fargate: Fargate is a technology you can use with Elastic Container Service (ECS) to run containers without having to manage servers or clusters of EC2 instances.
Container Orchestration Feature Matrix
You can use the feature matrix below to compare the features of the orchestration technologies that DuploCloud supports. DuploCloud can help you implement whatever option you choose through the DuploCloud Portal or the Terraform API.
Feature
Kubernetes
Built-In
ECS Fargate
One dot indicates a low rating, two dots a medium rating, and three dots a high rating. For example, Kubernetes has a low ease-of-use rating but a high rating for stateful applications.
Feature Definitions
See the sections below for a detailed explanation of the cloud orchestrator's feature matrix ratings.
Ease of Use
Kubernetes is extensible and customizable, but not without a cost in ease of use. The DuploCloud Platform reduces the complexities of Kubernetes, making it comparable to other container orchestration technologies in ease of use/adoption.
DuploCloud's Built-In orchestration mirrors docker run. You can Secure Shell (SSH) into a virtual machine (VM) and run Docker commands to debug and diagnose. If you have an application with a few stateless microservices, or configurations that use environment variables or AWS services like AWS Systems Manager (SSM) and Amazon S3, consider using DuploCloud's Built-In container orchestration.
ECS Fargate contains proprietary constructs (such as task definitions, tasks, or services) that can be hard to learn. As Fargate is serverless, you can't control the host Docker, so commands such as docker ps and docker restart are unavailable. This makes debugging a container crash very difficult and time-consuming. DuploCloud simplifies Fargate with an out-of-the-box setup for Logging and Shell and abstraction of proprietary constructs and behavior.
Features and Ecosystem Tools
Kubernetes is rich in additional built-in features and ecosystem tools like Secrets and ConfigMaps. Built-In and ECS rely on native AWS services such as AWS Secrets Manager, AWS Systems Manager (SSM), Amazon S3, and others. While Kubernetes features have AWS equivalents, third parties such as InfluxDB (a time-series database) and Prefect tend to publish their software as Kubernetes packages (Helm charts).
Suitability for Stateful Apps
Self-managed stateful applications should be avoided in AWS; instead, leverage managed cloud storage solutions for the best availability and Service Level Agreement (SLA) compliance. If managed storage is undesirable due to cost, Kubernetes offers the best solution. Kubernetes uses PersistentVolumes and PersistentVolumeClaims to implicitly manage Amazon Elastic Block Store (EBS) volumes. With Built-In and ECS, you must use a shared Amazon Elastic File System (EFS) drive, which may not have feature parity with Kubernetes volume management.
Stability and Maintenance
Although Kubernetes is highly stable, it is an open-source product, and its native customizability and extensibility can create points of failure, for example when a mandatory cluster upgrade is needed. This complexity often leads to support costs from third-party vendors. Maintenance can be especially costly with EKS, as versions are frequently deprecated, requiring you to upgrade the control plane and data nodes. DuploCloud automates this upgrade process, but it still requires careful planning and execution.
AWS Cost
The EKS control plane is fairly inexpensive, but operating an EKS environment without business support (available at an additional premium) is not recommended. Small businesses may reduce costs by adding the support tier only when needed.
Multi-Cloud
For many enterprises and independent software vendors, multi-cloud capabilities are, or will soon be, a requirement. While Kubernetes provides this benefit, DuploCloud's implementation is much easier to implement and maintain.
DuploCloud Tenancy Models
An outline of the tenancy deployment models supported by DuploCloud
DuploCloud supports a variety of deployment models, from basic multi-Tenant applications to complex single-Tenant deployments within customer environments. These models cater to different security needs, allowing customers to achieve their desired isolation level while maintaining operational efficiency.
Description: The application manages Tenant isolation with DuploCloud structured pooled tenancy.
Use Case: The most common scenario is where the application logic isolates customer data. DuploCloud Tenants are then used to isolate development environments (i.e., Nonprod and Prod).
Infrastructure:
DuploCloud Tenant-per-Customer
Description: Each customer gets a separate DuploCloud Tenant.
Use Case: Suitable for older applications not designed for multi-tenancy, or security and compliance needs.
Infrastructure:
DuploCloud Infrastructure-per-Customer
Description: Each customer gets a separate DuploCloud Infrastructure.
Use Case: Provides a higher security boundary at the network layer where customer access and data are separated.
Infrastructure:
Cloud Account-per-Customer
Description: Each customer gets a separate cloud account.
Use Case: The least common model, used for customers requiring complete isolation.
Infrastructure:
Hybrid Model
Description: Combination of the above models as needed to meet specific requirements.
Use Case: Diverse customer needs.
Infrastructure:
Special Hybrid Case: Single-Tenant Deployment in an External Kubernetes Cluster
Description: DuploCloud imports existing Kubernetes clusters from external environments.
Use Case: A cluster and resources already exist, or customers require the application or service to run inside their client's cloud account. Customers are comfortable creating their own Kubernetes environments.
Infrastructure:
Documentation and Support
Documentation: is available to support the development of your DuploCloud tenancy model.
Support: can assist you in designing your deployment model or creating and managing Kubernetes clusters.
Step 1: Create Infrastructure and Plan
Create a DuploCloud Infrastructure and Plan
Each DuploCloud Infrastructure is a connection to a unique Virtual Private Cloud (VPC) network, residing in a region, that can host Kubernetes clusters, ECS clusters, or a combination of these, depending on your public cloud provider.
After you supply a few basic inputs, DuploCloud creates an Infrastructure within AWS and DuploCloud. Behind the scenes, DuploCloud does a lot with what little you supply, generating the VPC, subnets, NAT Gateway, routes, and EKS or ECS clusters.
With the Infrastructure as your foundation, you can customize an extensible, versatile platform engineering development environment by adding Tenants, Hosts, Services, and more.
Estimated time to complete Step 1: 40 minutes. Much of this time is consumed by DuploCloud's creation of the Infrastructure and enabling your EKS cluster with Kubernetes.
Prerequisites
Before starting this tutorial:
Learn more about DuploCloud , , and .
Reference the documentation to create User IDs with the Administrator role. To perform the tasks in this tutorial, you must have Administrator privileges.
Creating a DuploCloud Infrastructure
In the DuploCloud Portal, navigate to Administrator -> Infrastructure.
Click Add. The Add Infrastructure page displays.
Enter the values from the table below in the corresponding fields on the Add Infrastructure page. Accept default values for fields not specified.
Add Infrastructure field
Value
It may take up to forty-five (45) minutes for your Infrastructure to be created and Kubernetes (EKS/ECS) enablement to be complete. Use the Kubernetes card in the Infrastructure screen to monitor the status, which should display Enabled when complete. You can also monitor progress using the Kubernetes tab.
Verifying That a Plan Exists for Your Infrastructure
Every DuploCloud Infrastructure generates a Plan. Plans are sets of templates that are used to configure the Tenants, or workspaces, in your Infrastructure. You will set up Tenants in the next tutorial step.
Before proceeding, confirm that a Plan exists that corresponds to your newly created Infrastructure.
In the DuploCloud Portal, navigate to Administrator -> Plans. The Plans page displays.
Verify that a Plan exists with the name NONPROD, the name of the Infrastructure you created.
Checking Your Work
You previously verified that your Infrastructure and Plan were created. Now verify that Kubernetes is enabled before proceeding to create a Tenant.
In the DuploCloud Portal, navigate to Administrator -> Infrastructure. The Infrastructure page displays.
From the Name column, select the NONPROD Infrastructure.
Navigate to the EKS or ECS tab, depending on the platform you enabled.
Upgrading the EKS version
Upgrade the Elastic Kubernetes Service (EKS) version for AWS
AWS frequently updates the EKS version based on new features that are available in the Kubernetes platform. DuploCloud automates this upgrade in the DuploCloud Portal.
IMPORTANT: An EKS version upgrade can cause downtime to your application depending on the number of replicas you have configured for your services. Schedule this upgrade outside of your business hours to minimize disruption.
About the upgrade process
DuploCloud notifies users when an upgrade is planned. The upgrade process follows these steps:
A new EKS version is released.
DuploCloud adds support for the new EKS version.
DuploCloud tests all changes and new features thoroughly.
Updating the EKS version:
Updates the EKS Control Plane to the latest version.
Updates all add-ons and components.
Relaunches all Hosts to deploy the latest version on all nodes.
After the upgrade process completes successfully, you can assign allocation tags to Hosts.
Starting the upgrade
Upgrading the EKS version
Click Administrator -> Infrastructure.
Select the Infrastructure that you want to upgrade to the latest EKS version.
Select the EKS tab. If an upgrade is available for the Infrastructure, an Upgrade link appears in the Value column.
From the Target Version list box, select the version to which you want to upgrade.
From the Host Upgrade Action list box, select the method by which you want to upgrade Hosts.
Click Start. The upgrade process begins.
Updating EKS Components (Add-ons)
Click Administrator -> Infrastructure.
Select the Infrastructure with components you want to upgrade.
Select the EKS tab. If an upgrade is available for the Infrastructure components, an Upgrade Components link appears in the Value column.
From the Host Upgrade Action list box, select the method by which you want to upgrade Hosts.
Click Start. The upgrade process begins.
Monitoring upgrades
The EKS Upgrade Details page shows the upgrade status as In Progress.
Find more details about the upgrade by selecting your Infrastructure from the Infrastructure page. Click the EKS tab, and then click Show Details.
Upgrade completion
When you click Show Details, the EKS Upgrade Details page displays the progress of updates for all versions and Hosts. Green checkmarks indicate successful completion in the Status list. Red Xs indicate Actions you must take to complete the upgrade process.
Assign allocation tags
If any of your Hosts use allocation tags, you must assign allocation tags to the Hosts:
After your Hosts are online and available, navigate to Cloud Services -> Hosts.
Select the host group tab (EC2, ASG, etc.) on the Hosts screen.
Click the Add button.
For additional information about the EKS version upgrade process with DuploCloud, see the .
Common Use Cases
How DuploCloud is able to provide comprehensive DevOps support in a single intuitive tool
DuploCloud is a comprehensive solution for DevOps and SecOps, bringing cloud infrastructure management to businesses, regardless of expertise level.
Microservices can be created in minutes, accelerating time to market. Advanced DevOps users can leverage Kubernetes and Terraform to create custom solutions.
For a flat rate per year, personalized onboarding, cloud migration, SecOps questionnaire completion, and auditing support are included.
If there is a way to do something in the cloud, it can be done faster and more efficiently with DuploCloud.
1. Turbo-Charging Infrastructure and Workspace Creation
Did you know that DuploCloud can create a complete cloud infrastructure, comprising hundreds of components and sub-components, in ten to fifteen minutes? This usually takes hours to develop in a native cloud portal and even longer when using native Kubernetes (K8s). Individual workspaces (Tenants) can be created in less than a minute.
This acceleration is critical to many of the business value propositions DuploCloud offers. It is why we can perform cloud migrations at such an advanced pace, minimizing downtime and simultaneously ensuring security and compliance (and peace of mind).
2. Built-In Scaling and Managed Services
Virtually all of the services DuploCloud supports are designed to auto-scale as your cloud environment grows exponentially. These Managed Services include automated "set and forget" configurations that dovetail neatly into developer self-service.
As with creating Infrastructures and Tenants, DuploCloud Services are designed for the most common use cases. They enable users to supply a minimum number of inputs to get their service up and running quickly. At the same time, DuploCloud retains the ability to customize, using native Kubernetes YAML coding and custom scripting if needed.
Turnkey access to scalable Kubernetes constructs and managed services ensures minimal implementation detail, making DuploCloud the DevOps platform for the rapidly expanding AI/ML cloud space. In this arena, the power of an automated platform becomes readily apparent, not only in setting up your cloud infrastructure but also in maintaining it.
DuploCloud’s ready-made templatized approach to K8s makes adjustments to Kubernetes parameters, such as Horizontal Pod Autoscalers (HPA) for CPU and RAM requirements, easy to access and adjust.
DuploCloud is an efficient, user-friendly means of helping developers automate their environment, reducing the need for constant monitoring or "babysitting." Presenting more information on fewer screens, with easier navigation, enhances monitoring performance.
3. Intuitive Self-Service DevOps for Developers
DuploCloud's simplified UI guides developers and less savvy DevOps users in creating and managing DevOps components and constructs. Even advanced features such as AWS Batch, CloudFront, or Lambda functions are simplified through procedural documentation, step-by-step UI panels, and sample code blocks that can be accessed through info-tips in the UI.
Using a templatized approach, potentially complex Kubernetes constructs such as Ingress and Terraform scripting can be managed by developers with minimal exposure to such functionality. Experts who have invested time and money in creating custom solutions using such tools do not need to discard their work. DuploCloud can help integrate existing solutions and workflows, often automating them during onboarding at no additional cost.
Our website also features a comprehensive Chatbot () that can provide thorough answers, coding assistance, and troubleshooting. Every DuploCloud customer receives their own Slack channel for personalized support from our responsive team of DevOps specialists.
4. Ease of Use and Expedited Navigation with JIT Access
Complex navigation and workflows can be a huge headache for DevOps and cloud engineers. Using DuploCloud, you can minimize the time you spend logging in and out of AWS, Azure, and GCP consoles. Every DevOps and SecOps task can be completed from within the DuploCloud Portal, often with significantly reduced clicks.
Compare the keystrokes and navigation between DuploCloud and using a native cloud portal. Often, DevOps engineers "get used to the pain" inherent in many daily DevOps tasks, unaware they can gain back minutes, hours, and days by using DuploCloud.
Some commonly used tools that can be accessed directly within DuploCloud include kubectl, shell access, and JIT cloud console access.
5. Turn-Key Compliance and Security
When you let DuploCloud manage your DevOps environment, a scalable and robust SecOps framework and implementation strategy are included. Aligned with industry best practices, our staff of SecOps experts analyzes how your data is stored and transmitted, helps identify the standards you must meet, and then constructs a detailed implementation strategy to meet and exceed those requirements. In addition, we create a scalable model that adapts as your customer base and workloads grow.
DuploCloud walks you through each process step during , then ensures each implementation phase results in smooth and secure operations, laying the foundation for a reliable and compliant system.
Using easy-to-access "Single Pane of Glass" dashboards, DuploCloud provides a granular view of all security issues and compliance controls. Completing questionnaires and passing audits is simple, especially with our 24/7 support.
6. Seamless CI/CD Pipeline Integrations
DuploCloud supports all the primary for creating automated, streamlined CI/CD pipelines, ensuring consistent processes and repeatable workflows.
Some of the tools we support, such as GitHub Actions, include ready-to-run scripts for quickly creating Docker images, updating Services or Lambdas, uploading data to an S3 Bucket, or executing Terraform scripts.
Whatever your tool of choice, our DevOps experts can help you find the best workflow that requires the least effort to build and maintain.
7. Optimizing DevOps Spending
One of the biggest reasons to consider an automated DevOps solution comes down to dollars and cents. It's too easy to spend a lot on a public cloud solution without knowing precisely where your money goes. Sometimes, the components and services you've created (and even ones you've forgotten about) cost you more than they're earning you.
DuploCloud provides several billing dashboards that break down your spending by workspace and component. These dashboards are navigable with just a few clicks. Our support team can help you identify redundancies in services and tools and possibly cut costs by suggesting solutions leveraging the many third-party tools built into DuploCloud.
8. Accelerating Terraform with Ready-Made Templates
As with most platforms, the work required to set up and configure a Terraform environment can adversely impact accuracy, productivity gains, and effectiveness. Crafting scalable Terraform requires more than programming skill alone. In addition, as with any code base, it requires constant updating, refactoring, and other maintenance tasks.
But here again, the power of ready-made templates in DuploCloud works to your advantage. DuploCloud contains its own Terraform provider, which can access DuploCloud constructs such as and . This simplifies the creation of many cloud resources by assuming defaults for compliance and security. When you run DuploCloud, you’re already speeding up the creation of DevOps components, so adding another accelerator based on Terraform is a win-win proposition: less code, less maintenance, faster deployments, and faster time-to-market.
Using DuploCloud’s proprietary Terraform provider removes the need to write specifically for one public cloud. You can effectively use the same DuploCloud Terraform Code — as it maps to DuploCloud’s constructs, not one specific cloud — with several public clouds. You don’t need to worry about differentiating platform-specific specifications. DuploCloud handles all of this for you in a transparent, replicable manner. You use utilities such as DuploCloud’s Terraform Exporter to quickly clone Tenants and modify configuration details when needed for specific Infrastructures and Tenants.
9. Single Pane of Glass for Enhanced Observability
Attempting to monitor your Cloud Infrastructure from the numerous UIs offered by public providers often obscures problems or causes confusion. DuploCloud's monitoring interfaces combine multiple functionalities on one screen; our SIEM Dashboard is a primary example of such flexibility and comprehensiveness. Leveraging Wazuh, DuploCloud offers unprecedented insights from a single interface.
Using OpenSearch, Grafana, and Prometheus, you can get single snapshots of logging, auditing, compliance and security vulnerabilities, custom alerting, and fault lists with one click.
DuploCloud utilizes numerous , which are included in the cost of a DuploCloud subscription. Depending on what tools you already use and the capacity in which you use them, a DuploCloud subscription can sometimes eliminate the need for additional licenses. Our team of Solutions Architects can verify functional overlaps and suggest an optimal strategy to deliver the required functionality at the most efficient cost.
Support Options: Standard vs. Managed Operations
DuploCloud offers two levels of ongoing support designed to meet different customer needs: Standard Support and Managed Operations.
DuploCloud Support Services
Every new DuploCloud customer begins with a led by our DevOps Implementation Engineers and project Team. Onboarding includes infrastructure assessment, deployment of services and environments, CI/CD integration, observability setup, and a guided production cutover.
After onboarding, customers transition to Standard Support. For Teams that require deeper, hands-on operational assistance, our Managed Operations offering is available as an upgrade. For details on pricing, see .
Shared DuploCloud Infrastructure (VPC, Tenant, VM/instances, S3 bucket, RDS). Cluster/namespace can also be shared.
Scaling: Increase compute instances for Kubernetes worker nodes as needed.
Shared network layer (VPC).
Separate Tenants per customer with security boundaries (Security Group, KMS Key, SSH Key, Kubernetes Namespace).
The Kubernetes Cluster is shared, with boundaries enforced through Namespaces.
Separate VPC and Network Resources for each customer.
Clusters are inherently separate through Tenants isolated in different Infrastructures.
Higher cost due to duplicated resources and Operational Overhead.
Separate accounts with a DuploCloud Platform installed in each.
Each account then has its own DuploCloud Infrastructure and Tenant.
A combination of previous models.
Organization-specific depending on requirements: some organizations may be in a pooled application environment whereas others may be more isolated through Tenant boundaries.
Customer's Cloud Account or On-Premises Cluster (EKS, AKS, GKE, Oracle, DOKS, etc.) in conjunction with a DuploCloud Infrastructure. This could be any Kubernetes Cluster not created by DuploCloud.
Manages both multi-Tenant and single-Tenant environments from the DuploCloud UI.
In the Configuration field, enter: helpdesk-impersonatable-users
In the adjacent Value field, enter the usernames of the users the Agent can impersonate, separated by semicolons, e.g., user1;user2;user3.
Click Add to apply the setting. The users appear in the Permissions list box when creating a Slack Ticket.
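The semicolon-separated Value format above is easy to work with programmatically. Below is a minimal sketch for parsing it; the function name is hypothetical, and only the setting's format is taken from this page:

```python
def parse_impersonatable_users(value: str) -> list[str]:
    """Split a helpdesk-impersonatable-users value into usernames.

    The value is a semicolon-separated list, e.g. "user1;user2;user3".
    Surrounding whitespace and empty entries are discarded.
    """
    return [user.strip() for user in value.split(";") if user.strip()]

# The format shown above
print(parse_impersonatable_users("user1;user2;user3"))  # → ['user1', 'user2', 'user3']
```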
Create Ticket
, and a panel prompting you to enter details about your request will appear.
Select the Portal or Tenant, specify the Share Thread Context, and add a New Message, if needed:
New Message: Optionally, add a note or instruction for the AI Agent.
Share Thread Context: Choose whether the AI should retain the existing thread conversation in the updated context:
Yes - Share all the thread messages: Transfers the entire conversation history to the AI in the updated Ticket context.
No - Do not share thread messages: Updates the ticket context without transferring the previous thread messages. No historical data is transferred.
Click Update to apply your changes.
Portal
Select the DuploCloud Portal where the Ticket should be created.
Tenant
Choose the Tenant within the selected Portal. Only Tenants you have access to will appear.
Agent
Pick the AI Agent to handle your request. Each Agent specializes in specific tasks (Kubernetes, deployments, etc.). For more information, see the Agent documentation.
Default Permissions
Select the user identity the Agent should assume. Only users added in Tenant Settings (helpdesk-impersonatable-users) will appear. See the detailed steps for Configuring Agent Permissions.
This tutorial offers different paths for creating Services with EKS, ECS, or DuploCloud Docker. For creating ECS or EKS Services, select the following options:
Enable ECS Cluster: Choose this option to follow the ECS path in the tutorial.
Enable EKS Cluster, and set Cluster Mode to Standard to follow the EKS path.
If you prefer to use Auto Mode, which offers a fully managed EKS experience with no host access or custom AMIs, see the DuploCloud documentation for EKS Auto Mode.
Click Create to create the Infrastructure. It may take up to half an hour to create the Infrastructure. While the Infrastructure is being created, a Pending status is displayed in the Infrastructure page Status column, often with additional information about what part of the Infrastructure DuploCloud is currently creating. When creation completes, a status of Complete displays.
Check Kubernetes status:
For EKS: Confirm that the EKS card on the right shows Enabled.
For ECS: Confirm that the Cluster Name appears in the ECS tab.
NONPROD Infrastructure page showing the EKS card with Enabled.
NONPROD Infrastructure page with the Cluster Name displayed in the ECS tab.
Standard Support
Standard Support is designed to enable your internal Teams to effectively operate cloud infrastructure with confidence, while leveraging DuploCloud’s automation and best practices. This support level ensures timely help with questions, troubleshooting, and implementation guidance.
What’s Included:
24×7 Access
Available via Slack or Microsoft Teams for real-time communication.
Ideal for initiating support requests, troubleshooting issues collaboratively, or asking quick questions.
Responses are handled by DuploCloud’s experienced support engineers, ensuring fast and informed assistance.
Issues are logged and tracked in a dedicated ticketing system.
Support Scope:
CI/CD Pipelines
Guidance on integrating and managing pipelines for continuous integration and deployment workflows.
Infrastructure Automation
Support for setting up and modifying infrastructure components automated by DuploCloud, including networks, security policies, and containerized workloads.
Service Level Agreements (SLA)
DuploCloud provides responsive support for all critical issues and best-effort assistance for general queries, typically within a few business hours depending on severity and complexity. Customers with urgent or escalated issues can flag them via chat using Slack or Microsoft Teams, or a provided escalation email address.
Standard Support is designed for Teams who want direct access to knowledgeable engineers without the need for fully managed operations. It is especially well-suited for engineering-led Teams that prefer to maintain operational control while getting help when and where they need it.
Managed Operations
Managed Operations is DuploCloud’s premium support offering, designed for organizations seeking full-service operational management of their cloud infrastructure. This service builds on what’s included in Standard Support and is available as an add-on. You’ll work with named DevOps and AI Engineers who provide proactive, hands-on management and operate as an extension of your internal engineering and DevOps Teams.
What’s Included:
Custom Agent Development
Our engineers work closely with your Team to understand your business needs from an infrastructure perspective and help build AI Agents and agentic workflows that fit your custom requirements. Our goal is always to address the business need end-to-end (E2E), which often goes beyond cloud infrastructure alone.
Dedicated DevOps Support
Work directly with DuploCloud experts who understand your infrastructure, compliance posture, and operational goals.
Communicate via a shared Slack or Teams channel for faster collaboration and resolution.
Custom Statement of Work (SOW)
Every Managed Operations engagement is tailored to your organization. We collaborate with you to define a custom SOW that outlines the services, deliverables, and responsibilities aligned with your priorities.
The SOW can include any of the areas listed below — such as infrastructure provisioning, CI/CD automation, observability, security enforcement, and compliance support.
If you need to offload day-to-day cloud operations or ensure best-practice implementation at scale, Managed Operations delivers that through hands-on support, compliance reinforcement, and Platform optimization.
Comparing Support Options
The following table outlines key differences between Standard Support and Managed Operations:
Feature
Standard Support
Managed Operations
Support Access
24x7 via Slack or Teams
24x7 via dedicated channel with assigned DevOps Engineers
Level of Engagement
On-demand expert guidance
Proactive, ongoing management tailored to your environment
Scope of Coverage
Services deployed via the DuploCloud Platform
Full cloud infrastructure, including components outside the DuploCloud Platform
Troubleshooting & Issue Resolution
Support for issues within the DuploCloud Platform
Comprehensive troubleshooting across full cloud infrastructure
Which Support Option is Right for You?
Standard Support is ideal if you:
Self-service your DevOps needs using the DuploCloud Platform.
Manage most cloud operations internally.
Prefer on-demand access to cloud experts for questions and troubleshooting.
Use DuploCloud primarily for Platform-deployed services.
Want flexible, self-driven support with best-practice guidance.
Managed Operations is a great fit if you:
Have ongoing DevOps project work that warrants dedicated staff.
Need proactive, hands-on management of your entire cloud infrastructure.
Want to offload daily cloud operations to experienced DevOps Engineers.
Require support that extends beyond the DuploCloud Platform to all cloud resources.
Operate in regulated environments or want strong security and compliance assistance.
Seek a trusted partner embedded with your Team to optimize performance and reliability.
Creating an RDS database to integrate with your DuploCloud Service
Creating an RDS database is not essential to running a DuploCloud Service. However, as most services also incorporate an RDS, this step is included to demonstrate the ease of creating a database in DuploCloud. To skip this step, proceed to creating an EKS or ECS Service.
An RDS is a managed Relational Database Service that is easy to set up and maintain in DuploCloud for AWS public cloud environments. RDS supports many database engines, including MySQL, PostgreSQL, MariaDB, Oracle BYOL, and SQL Server.
See the for more information.
Cloud Provisioning
Help with resource creation and management across AWS, Azure, or GCP by using the DuploCloud Platform.
Deployment Troubleshooting
Diagnosis and resolution of errors or misconfigurations in application and infrastructure deployments.
Best Practice Guidance
Consultation on DevOps workflows, cloud architecture, and regulatory compliance to align with industry and platform standards.
Custom Agents
Design workflows that span multiple custom in-house and third-party tools.
Build Agents.
Evaluate performance and ROI.
LLM cost management, and more.
Cloud Infrastructure
Provision and manage cloud resources across environments.
Monitor system health, availability, and security posture.
Enforce policies and align with industry best practices.
Application & Workload Support
Deploy and manage workloads through the DuploCloud Platform.
Support for service migrations and CI/CD pipeline integration.
Configure infrastructure-as-code (e.g., Terraform) as needed.
Observability & Reliability
Implement and manage logging, monitoring, and alerting.
Automate backups and support periodic Disaster Recovery (DR) and Business Continuity Planning (BCP) exercises.
Cloud Cost Optimization
Analyze cloud resource usage and spending trends.
Recommend and implement cost-saving strategies.
Security & Compliance Support
Conduct annual penetration tests and provide findings with remediation plans.
Assist with audits and assessments.
Observability
Guidance on setting up logging and monitoring
Assistance configuring observability tools, aligned with your needs
Security & Compliance
General best-practice advice
Hands-on support, audit collaboration, and policy enforcement
from duplo_custom_tool.kubectl_tool import ExecuteKubectlCommandTool
# Initialize the tool
tool = ExecuteKubectlCommandTool()
Estimated time to complete Step 3: 5 minutes.
Prerequisites
Before creating an RDS, verify that you accomplished the tasks in the previous tutorial steps. Using the DuploCloud Portal, confirm that:
In the Tenant list box, select the dev01 Tenant that you created.
Navigate to Cloud Services -> Database.
Select the RDS tab, and click Add. The Create a RDS page displays.
From the table below, enter the values that correspond to the fields on the Create a RDS page. Accept default values for fields not specified.
Click Create. The database displays with a status of Submitted in the RDS tab. Database creation takes approximately ten (10) minutes.
DuploCloud prepends DUPLO to the name of your RDS database instance.
Create a RDS page field
Value
RDS Name
docs
User Name
YOUR_DUPLOCLOUD_ADMIN_USER_NAME
User password
YOUR_DUPLOCLOUD_ADMIN_PASSWORD
RDS Engine
MySQL
RDS Engine Version
LATEST_AVAILABLE_VERSION
Validating RDS Database Creation
You can monitor the status of database creation using the RDS tab and the Status column.
When the database status reads Available on the RDS tab on the Database page, the database's endpoint is ready for connection to a DuploCloud Service, which you create and start in the next step.
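Because creation is asynchronous (Submitted → Available over roughly ten minutes), any script that waits on the database needs a polling loop. A generic sketch follows; the `get_status` callable is hypothetical — substitute however you query the status, for example via the DuploCloud API:

```python
import time
from typing import Callable

def wait_until_available(get_status: Callable[[], str],
                         poll_seconds: int = 30,
                         timeout_seconds: int = 1200) -> str:
    """Poll get_status() until it returns 'Available' or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = get_status()
        if status == "Available":
            return status
        time.sleep(poll_seconds)  # avoid hammering the status endpoint
    raise TimeoutError("Database did not become Available in time")
```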
Troubleshooting Database Creation Failures
Faults can be viewed in the DuploCloud Portal by clicking the Fault/Alert ( ) Icon. Common database faults that may cause database creation to fail include:
Invalid passwords - Passwords cannot have special characters like quotes, @, commas, etc. Use a combination of uppercase and lowercase letters and numbers.
Invalid encryption - Encryption is not supported for small database instances (micro, small, or medium).
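The password constraint above can be checked before submitting the form. A minimal sketch — the exact forbidden-character set is an assumption based only on the examples listed here (quotes, @, commas):

```python
import re

# Assumed forbidden characters, per the fault description above
FORBIDDEN = set("'\"@,")

def is_valid_rds_password(password: str) -> bool:
    """True if the password avoids the forbidden characters and mixes
    uppercase letters, lowercase letters, and numbers, as recommended."""
    if any(ch in FORBIDDEN for ch in password):
        return False
    return (bool(re.search(r"[A-Z]", password))
            and bool(re.search(r"[a-z]", password))
            and bool(re.search(r"[0-9]", password)))

print(is_valid_rds_password("Secur3Pass"))    # → True
print(is_valid_rds_password("bad@password"))  # → False (contains @)
```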
The RDS tab with the Fault/Alert Icon highlighted
Verifying Database Endpoints
In the RDS tab, select the DUPLODOCS database you created.
Note the database endpoint, the name, and credentials. For security, the database is automatically placed in a private subnet to prevent access from the internet. Access to the database is automatically set up for all resources (EC2 instances, containers, Lambdas, etc.) in the DuploCloud dev01 Tenant. You need the endpoint to connect to the database from an application running in the EC2 instance.
RDS Database details page with the endpoint highlighted
When you place a DuploCloud Service in a live production environment, consider passing the database endpoint, name, and credentials to a DuploCloud Service using AWS Secrets Manager, or Kubernetes Configs and Secrets.
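A common pattern for the recommendation above is to have the secret mechanism (AWS Secrets Manager or Kubernetes Secrets) inject the endpoint and credentials as environment variables, then assemble the connection URL in the application. A sketch, with hypothetical variable names — use whatever keys your secrets actually expose:

```python
import os

def mysql_url_from_env() -> str:
    """Assemble a MySQL connection URL from environment variables.

    DB_ENDPOINT, DB_NAME, DB_USER, and DB_PASSWORD are illustrative
    names, not DuploCloud-defined keys.
    """
    endpoint = os.environ["DB_ENDPOINT"]  # host:port of the RDS endpoint
    name = os.environ["DB_NAME"]
    user = os.environ["DB_USER"]
    password = os.environ["DB_PASSWORD"]
    return f"mysql://{user}:{password}@{endpoint}/{name}"
```

Keeping these values out of source code means the same container image can run unchanged across Tenants and environments.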
Not sure what kind of DuploCloud Service you want to create? Consider the following:
AWS EKS is a managed Kubernetes service. AWS ECS is a fully managed container orchestration service using AWS technology. For a full discussion of the benefits of EKS vs. ECS, consult this AWS blog.
are ideal for lightweight deployments and run on any platform, using GitHub and other open-source tools.
In the DuploCloud AI Suite, the Vector Database (VectorDB) enables you to upload documents, such as architecture diagrams, runbooks, internal wikis, or API references, that you want the AI Agent to use for context during conversations. These documents are transformed into high-dimensional vector representations, which allow the system to retrieve the most relevant content when the Agent processes your queries. This enhanced context allows the Agent to better understand your cloud environment, use your terminology, and align with your organization’s best practices.
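The retrieval step described above can be illustrated with a toy example: each document chunk is stored as a vector, and the chunk whose vector is most similar to the query vector (by cosine similarity) is returned as context. The two-dimensional vectors below are stand-ins; real deployments use a proper embedding model:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_relevant(query_vec: list[float], chunks: dict[str, list[float]]) -> str:
    """Return the chunk text whose stored vector best matches the query."""
    return max(chunks, key=lambda text: cosine_similarity(query_vec, chunks[text]))

# Toy vectors standing in for real document embeddings
chunks = {
    "runbook: restart the payments service": [0.9, 0.1],
    "wiki: onboarding checklist":            [0.1, 0.9],
}
print(most_relevant([0.8, 0.2], chunks))  # → runbook: restart the payments service
```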
DuploCloud supports two types of VectorDBs:
Managed VectorDBs
DuploCloud deploys and manages VectorDBs directly within your Kubernetes environment, handling setup, environment variables, and connectivity for seamless integration. Supported engines include:
Chroma: Lightweight, fast, ideal for local AI workloads.
MilvusDB: Scalable for high-performance vector search at large scale.
Use managed VectorDBs if you want to keep all components within your cloud account, prefer zero setup, or don’t have an external VectorDB provider.
Third-Party VectorDBs
These are externally hosted vector databases like Pinecone or PostgreSQL that DuploCloud connects to but does not manage or deploy.
Choose third-party VectorDBs if you already use an external provider or need to integrate with specialized vector DB services outside your Kubernetes cluster.
Integrating VectorDBs with DuploCloud
The first step for working with VectorDBs in DuploCloud is to integrate a VectorDB with the DuploCloud AI Suite. This allows the platform to store and retrieve vectorized content.
Prerequisites
You must have access to the AI Suite feature in the DuploCloud Portal.
For third-party VectorDBs (e.g., Pinecone), make sure you have your API endpoint and any necessary authentication information.
For managed VectorDBs (e.g., Chroma, MilvusDB), ensure your Kubernetes environment is ready to deploy services.
Integrating Third-Party VectorDBs
To integrate a third-party vector database, such as Pinecone:
In the DuploCloud Platform, navigate to AI Suite → Studio → Vector DBs.
Click Add. The Add Vector Database pane displays.
Complete the following fields:
Click Submit to save the VectorDB. Your third-party VectorDB is ready to use immediately.
Integrating Managed VectorDBs
To integrate a DuploCloud-managed VectorDB (Chroma or MilvusDB), add and then deploy the database in the DuploCloud Platform.
Adding a Managed VectorDB
In the DuploCloud Platform, navigate to AI Suite → Studio → VectorDBs.
Click Add. The Add Vector Database pane displays.
Complete the following fields:
Click Submit to save the VectorDB.
Note: Adding a Managed Vector DB within DuploCloud saves the configuration, but does not deploy the database. Deployment is required before you can upload or ingest files.
Deploying a Managed VectorDB
After adding a managed VectorDB, deploy it to make it active and usable.
Navigate to AI Suite → Studio → Vector DBs.
Select the VectorDB from the NAME column.
Select the Deployment tab, and click Deploy. The Deploy pane displays.
Review or complete the deployment fields:
Choose either:
Quick Deploy to deploy with default settings immediately.
Advanced to customize deployment options before deploying.
Uploading Files
Upload your source documents or data files to your AWS S3 storage to make your files available for processing and ingestion into the VectorDB.
In the DuploCloud portal, go to AI Suite → Studio → Vector DBs.
Select the VectorDB you want to upload files to from the NAME column.
Select the Uploaded Files tab.
Ingesting Files
Ingesting transforms your uploaded files into vector representations.
In the DuploCloud portal, go to AI Suite → Studio → Vector DBs.
Select the Uploaded Files tab.
Click the checkbox(es) to select one or more files you want to ingest.
Viewing Ingestion Jobs
After uploading and ingesting documents into a VectorDB, you can monitor the status and output of each job in the Ingestion Jobs tab. This tab provides access to ingestion history, logs, and detailed configuration metadata to help validate behavior and troubleshoot issues.
In the DuploCloud portal, go to AI Suite → Studio → Vector DBs.
Select the Ingestion Jobs tab.
Click the menu icon () next to the job you want to inspect.
Using VectorDBs with DuploCloud AI Agents
To learn how to integrate the files uploaded to your VectorDBs with the DuploCloud AI Agent, see the DuploCloud documentation for .
If using Advanced Deploy, click Next to navigate through additional configuration screens, then click Create to start deployment. For Quick Deploy, click Quick Deploy.
Monitor the deployment status; it usually takes 4 to 5 minutes. Once complete, the status on the Deployment tab will show Running.
Click Browse. This will open your AWS S3 console where you can select the files you want to upload.
The AWS S3 Console
Select the files to upload (Click Upload Files → Add File, select your file(s), and click Open).
Return to the DuploCloud Uploaded Files tab, and click Sync to update the VectorDB’s Uploaded Files list. The uploaded files are displayed on the Uploaded Files tab.
Click Ingest. The Trigger Build pane displays.
The Trigger Build pane
Configure the fields as needed:
Docker Image: This field is prepopulated with the container used for ingestion. You usually do not need to change it unless you're using a custom image.
Timeout: Enter the maximum duration (in minutes) for the ingestion job.
Custom Meta Data (Optional): Use key-value pairs to customize how the ingestion job processes your data. Common options include:
chunk_size: Size of each text chunk in characters (e.g., 1000).
chunk-overlap: Number of overlapping characters between chunks (e.g., 100).
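The effect of the chunk size and overlap settings above can be illustrated with a simple character-based splitter. This is a sketch only — the real ingestion container's splitting logic may differ:

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Split text into chunks of up to chunk_size characters, where each
    chunk repeats the last chunk_overlap characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Small values so the overlap is visible
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Overlap preserves context across chunk boundaries so that a sentence split between two chunks can still be retrieved from either.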
Click Submit to trigger the ingestion job. Monitor the ingestion status on the Ingestion Jobs tab.
The Ingestion Jobs tab in the DuploCloud Platform
Choose one of the following options:
Logs: View output that includes source file paths, chunking progress, chunk IDs, and any success or error messages.
The Logs for an ingested job
Details: Open a structured JSON summary showing VectorDB type and provider, API endpoint, file paths ingested, output directory, chunking configuration, embedding model, and other technical metadata.
The Details for an ingested job
Name
Enter a friendly name for the VectorDB.
Vector DB Type
Select pinecone for a third-party VectorDB.
API Endpoint
Enter the endpoint URL for your Pinecone instance.
Metadata
Optionally, enter key-value pairs to organize or filter this VectorDB later.
Name
Enter a friendly name for the Vector DB.
Vector DB Type
Select your Vector DB type (e.g., chroma or milvusdb).
Deployment Environment Variables
Optionally, add custom environment variables (e.g., API keys, flags).
Metadata
Optionally, enter key-value pairs for organizing or tagging the VectorDB.
Name
Auto-filled with the VectorDB name; can be customized if desired.
Docker Image
Auto-filled for managed VectorDBs. For third-party VectorDBs, confirm or provide the correct image if applicable.
Deployment Environment Variables
Define any environment variables required for your VectorDB.
Advanced Options
Optional settings such as replicas, service name, network, volumes, and load balancer listeners.
The Add Vector Database pane in the DuploCloud Portal
The Deploy pane for the duplo-managed-db Vector DB
Deployment tab for the VectorDB with status Running
The Uploaded Files tab in the DuploCloud Platform
The Ingestion Jobs tab with the Logs and Details menu options highlighted
Agents
Agents are the core AI components in DuploCloud AI Suite. Each Agent is responsible for interpreting user inputs, deciding which Tools to invoke, and orchestrating intelligent responses using integrated data and logic. They serve as the execution layer for AI-powered workflows and can be tailored to a wide range of use cases, from conversational interfaces to backend automation.
The typical workflow for using Agents involves creating the Agent in AI Studio, deploying it with configurable resource limits and replicas, and registering it for use within your infrastructure. DuploCloud supports two types of Agents: Prebuilt and Dynamic. Each offers a different development path, depending on whether you're deploying an existing containerized service or creating a prompt-driven AI workflow within DuploCloud.
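The create → build → deploy → register workflow can be modeled as a small state machine. The stage names below come from this page; the linear transition rule is an illustrative assumption (Prebuilt Agents, for example, skip the build stage):

```python
from enum import Enum

class AgentState(Enum):
    CREATED = "created"        # Agent definition saved in AI Studio
    BUILT = "built"            # configuration packaged into a container image
    DEPLOYED = "deployed"      # image running on your infrastructure
    REGISTERED = "registered"  # routable from the AI HelpDesk

# Allowed forward transitions in the workflow described above
TRANSITIONS = {
    AgentState.CREATED: AgentState.BUILT,
    AgentState.BUILT: AgentState.DEPLOYED,
    AgentState.DEPLOYED: AgentState.REGISTERED,
}

def advance(state: AgentState) -> AgentState:
    """Move an Agent to the next stage of the workflow."""
    if state not in TRANSITIONS:
        raise ValueError(f"{state.name} is the final stage")
    return TRANSITIONS[state]
```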
Creating Agents
Creating a Prebuilt Agent
A Prebuilt Agent uses a pre-existing container image that defines its functionality. To read about the Prebuilt Agents included out-of-the-box, see the .
To create a Prebuilt Agent, follow these steps:
Navigate to AI Suite → Studio → Agents.
Click Add. The Add Agent Definition pane displays.
Complete the following fields:
Click Submit to create the Agent. You can view your Agents on the Agents tab.
Once the Agent has been created, you can deploy and register it so it can be used with .
If you are building a custom Prebuilt Agent that integrates programmatically with HelpDesk, refer to the for API standards, message formats, Terminal command handling, and other integration details.
Creating a Dynamic Agent
Dynamic Agents are configured using flexible, user-defined parameters, including Tools, prompt behavior, and optional custom build variables. To create a Dynamic Agent, first create an Agent definition, and then build an Agent image based on its configuration.
Creating an Agent Definition
Navigate to AI Suite → Studio → Agents.
Click Add. The Add Agent Definition pane displays.
Complete the following fields:
Name
Enter a name for the Agent.
Agent Type
Select Dynamic.
Prompt
Enter the initial instruction or context that guides the Agent’s behavior and responses.
Tools
Select one or more registered Tools for the Agent to use. For more about using Tools, see the DuploCloud Tools documentation.
Provider
Select the Large Language Model (LLM) provider that will power the Agent (e.g., bedrock or Other).
Model
Choose the specific version or configuration of the selected LLM to use for this Agent.
Temperature
Set the randomness of responses (e.g., 0 for deterministic behavior).
Click Submit. Once the Agent creation is complete, package your configuration into a deployable container image.
Note: Once a Dynamic Agent is built, you can view the list of Tools it is configured to use on the Agent’s details page. This makes it easy to confirm which Tool packages are active for troubleshooting or auditing purposes.
Building an Image
Now that the Agent is created, trigger a build to package your dynamic Agent’s configuration, including prompts, Tool selections, and variables, into a deployable container image.
Go to the Builds tab on the Agent’s details page.
Click Trigger. The Trigger Build pane displays.
Complete the following fields:
Builder Docker Image
This field is prepopulated based on your Agent configurations.
Timeout
Set a timeout for the build job (e.g., 0 for unlimited).
Tools
Select one or more registered tools for the Agent to use. See the Tools documentation for more details.
Build Environment Variables
Define environment variables to initialize tool behavior.
Example:
Key: init_code
Value: from duplo_custom_tool import ExecuteKubectlCommandTool; tool = ExecuteKubectlCommandTool(); print(isinstance(tool, BaseTool)); print(type(tool))
Mandatory
Check if the variable is required.
Custom Build Variables
(Optional) Add any custom key/value pairs for build-time configuration.
Click Submit to begin packaging your configured Agent into a runnable container image. After the build is complete, you can proceed to deploy the image and register the Agent.
Deploying Agent Images
Once an Agent image is created, it must be deployed. Deploying the Agent makes it available for use on your infrastructure.
Select the Images tab on the Agent page (AI Suite → Studio → Agents → select the Agent name).
Click the menu icon () next to the Agent image and select Deploy. The Deploy Image pane displays with name and image fields prepopulated.
Note: You can deploy multiple instances of the same Agent image. Each deployment can have its own configuration, scaling, and network settings, allowing the Agent to serve multiple environments or handle higher workloads.
Registering Agents
Once an Agent has been successfully deployed, it must be registered so that the DuploCloud AI HelpDesk can route queries to it.
Select the Register tab on the Agent page (AI Suite → Studio → Agents → select the Agent name).
Click Register. The Register Agent pane displays.
Complete the following fields:
Name
Provide a name for this Agent registration.
Instance ID
Enter the ID of the deployed instance (created during deployment).
Allowed Tenants
Select the tenants where this Agent is allowed to operate.
Endpoint
The service endpoint for the deployed Agent (prepopulated).
Path
The endpoint path that handles requests. You can retrieve this from the Agent's registration info, if needed: On the Register tab, click the menu icon () next to the Instance and select Edit. Copy the path from the Path field.
Headers
Optional key/value pairs to pass custom headers during API calls.
Click Submit to register the Agent. This Agent can now be utilized by the DuploCloud HelpDesk. To learn how to configure and use HelpDesk, see the .
Customizing Prompt Suggestions
You can customize the prompts that appear when interacting with an Agent by adding metadata to its configuration.
To add custom prompts for an Agent:
Go to AI Suite → AI Studio → Agents and select the Agent from the Name column.
Open the Metadata tab.
If a prompt_suggestions key exists, click the menu icon () and select Edit to update the prompt suggestions. If it doesn’t exist, click Add. The Add MetaData pane displays.
In the Key field, enter prompt_suggestions.
Update or enter new prompt suggestions in the Value field as a JSON-style array. For example: ["list all running pods", "show CPU and memory usage for my pods", "display recent events in the cluster"]
Click Add or Update to save the custom prompts.
These prompts appear as suggestions when creating a new Ticket with the selected Agent assigned, helping guide Agent interactions efficiently.
Environment Variables
Define key-value pairs (mark as mandatory if needed).
Meta Data
Optionally, enter additional key-value configurations.
Token Limit
Set the maximum number of tokens for responses (e.g., 1000).
Knowledge Sources
Optionally, click the plus icon () to connect the Agent to a knowledge source such as a vector database collection. This allows the Agent to retrieve and use information from previously uploaded documents stored in a vector database. Complete the fields:
Vector DB: Select a previously created vector database to connect as a knowledge source.
Collections: Choose one or more document collections within the VectorDB relevant to the Agent (required if Vector DB is selected).
Description: Enter a brief summary of the knowledge source’s purpose or contents (optional).
Meta Data: Add key-value pairs to filter or target specific content in the knowledge source (optional).
Meta Data
Add custom key-value metadata to the Agent.
Choose a deployment method:
Quick Deploy: Automatically sets up everything needed to run your Agent: it creates a DuploCloud Service, deploys a pod that runs the Agent container, and exposes it through a load balancer listener using the port specified during Agent creation.
Advanced: Allows full control over deployment settings, including network, scaling, and service options.
Proceed through the remaining steps to complete the deployment, following the prompts based on whether you selected Quick Deploy or Advanced. Monitor the deployment status on the Deployments tab.
Agents tab with the kubernetes-agent Agent displayed
The Add Agent Definition pane
Expanded Knowledge Sources section of the Add Agent Definition pane
Agents page with the Deploy Image pane
Agents Deployment tab showing the deployment with Running status
The Register Agent pane
View Cartography Details
The Cartography feature visualizes the relationships and dependencies within your infrastructure. By providing a Dependencies Manifest (YAML), DuploCloud maps services, workloads, and external resources into a Neo4j graph. This allows you to explore, analyze, and manage dependencies across Kubernetes, cloud services, and external systems, giving you a clear picture of how your applications interact and rely on each other.
This guide explains how to describe a service/workload's dependencies in YAML, how the file is discovered at runtime, and how each dependency type is matched in Neo4j. It is written for junior engineers; examples are included throughout.
Where the file comes from
You can supply the manifest as a plain file inside the container, or by mounting a Kubernetes ConfigMap to a file path.
Set DEPENDENCIES_MAPPING_FILE=/path/to/file.yaml to use a specific file path (e.g., a ConfigMap mounted at /config/config).
If DEPENDENCIES_MAPPING_FILE is not set, we log a warning and skip the dependencies step (no crash).
If the file cannot be opened, is empty, or has malformed YAML, we log a warning and skip the step (no crash).
Important: The mounted file must contain only the YAML shown below – do NOT wrap it in a config: | key. The file content should start with namespaces:.
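To illustrate the ConfigMap route, a minimal Deployment sketch might look like the following. The workload name, image, and mount path here are illustrative (the ConfigMap name duplo-cartography matches the full example later in this guide):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                                  # illustrative workload name
spec:
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/api:latest   # illustrative image
          env:
            - name: DEPENDENCIES_MAPPING_FILE
              value: /config/config                # path of the mounted file
          volumeMounts:
            - name: dependencies-manifest
              mountPath: /config
      volumes:
        - name: dependencies-manifest
          configMap:
            name: duplo-cartography     # ConfigMap holding the YAML manifest
```

The mounted file itself must start with namespaces:, as noted above.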
YAML schema (high level)
The detailed specification can be found .
namespaces:
  - <namespace>:
      <workload-name>:
        dependencies:
          kubernetes: []       # list of K8s dependency items
          aws: []              # list of AWS dependency items
          external: []         # list of external dependency items
          source_control: []   # list of source control dependency items
namespaces is a list. Each list item maps one namespace to one or more workloads under it.
<workload-name> is a logical workload key. We derive this from pod names by trimming hashes/ordinals, so all pod instances of the same workload share one definition.
Examples of derived workload names:
Deployment pod: api-7c9d88f9d9-abc12 → api
StatefulSet pod: db-0 → db
Kubernetes dependencies
These link your pod(s) to existing typed Kubernetes nodes in Neo4j. Supported kinds in v1:
We link to existing typed nodes: (:KubernetesService), (:KubernetesSecret), or (:KubernetesConfigMap) by name + namespace.
If the target doesn’t exist in the graph, we skip and log a warning. We do not create generic K8s nodes in v1.
Item shape:
- type: KubernetesService   # Kubernetes Kind (PascalCase)
  name: <k8s-name>
  namespace: <optional; defaults to this namespace>
  protocol: <optional>
  port: <optional int>
  url: <optional>   # for documentation; stored on the relationship
  description: <optional string>
Example:
kubernetes:
  - type: KubernetesService
    name: neo4j
    namespace: duploservices-andy
    protocol: bolt
    port: 7687
    url: bolt://neo4j.example.net:7687
    description: Graph database used by architecture-diagram
AWS dependencies
AWS dependencies link your pod(s) to existing typed AWS nodes in Neo4j.
Use the actual Neo4j label in type (e.g., S3Bucket, RDSInstance).
Optional identifier_name selects the node property to match. Default: name for all types.
Item shapes:
# S3 bucket
- type: S3Bucket
  name: <bucket>          # maps to S3Bucket.name (default property)
  identifier_name: name   # optional; explicit for clarity
  region: <optional>
  description: <optional string>
# RDS instance
- type: RDSInstance
  name: <db_identifier>   # used if db_instance_identifier not provided
  identifier_name: db_instance_identifier
  db_instance_identifier: <db_identifier>   # optional; when set, used over name
  region: <optional>
  description: <optional string>
Example:
aws:
  - type: S3Bucket
    name: my-diagrams
    region: us-east-1
    description: Stores rendered diagrams
  - type: RDSInstance
    name: orders-db-prod
    identifier_name: db_instance_identifier
    # db_instance_identifier: orders-db-prod   # optional; if set, used over name
    region: us-east-1
    description: Primary orders database
Matching behavior:
Generic: MATCH (n:<type> { <identifier_name or default>: <value> })
Examples:
S3: MATCH (b:S3Bucket {name: name})
RDS: MATCH (r:RDSInstance {db_instance_identifier: db_instance_identifier OR name})
If the target doesn’t exist, no edge is created (we log for troubleshooting).
Neo4j property mapping rules:
If identifier_name is not set, we match on the default property (name) using the item’s name value.
If identifier_name is set and a property with that exact key is provided on the item, we use that property’s value; otherwise we fall back to name.
External dependencies
External dependencies represent systems outside your cluster/AWS account (or AWS offerings that don’t create resources, like Bedrock/SES).
Item shape:
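Based on the examples and property rules in this section, the item shape looks roughly like this (the field list is inferred from the examples; only type and name are clearly required):

```yaml
- type: api                      # free-form category, e.g. api or bedrock
  name: <external-system-name>   # used as the ExternalService node name
  url: <optional>
  description: <optional string>
  # any additional keys (e.g. region, inference_profile_arn) are stored
  # on the ExternalService node as properties
```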
Example:
external:
  - type: api
    name: azure-communication-server
    url: https://example.communication.azure.com
    description: Customer notifications via Azure Communication Server
  - type: api
    name: anthropic-claude-sonnet-3.5
    url: https://bedrock.aws.amazon.com
    description: Anthropic Claude Sonnet 3.5 via Application Inference Profile
    region: us-east-1
    inference_profile_arn: arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abcd
# (git repositories are modeled under source_control → see section below)
Graph behavior:
We MERGE (:ExternalService {name}) and MERGE a DEPENDS_ON relationship from the pod. If the pod has IN_TENANT, we also MERGE (ExternalService)-[:IN_TENANT]->(DuploTenant) so the external system inherits tenant scoping.
On external relationships, we store helpful fields like critical, description, url, region, inference_profile_arn, and service_name.
Dynamic ExternalService properties:
In addition to the fields listed above, any extra keys you place on an external item are also written to the ExternalService node as properties.
Keys are sanitized to Neo4j-safe names: only letters, numbers, and underscore are kept; everything else becomes _.
Values are normalized:
Primitives (string, number, boolean) are stored directly.
Lists of primitives are stored as arrays.
Complex values (objects, mixed lists) are JSON-serialized into a string for stability.
Properties are added/updated when present in the YAML. Keys not present in the current YAML are removed on each run (protected keys name, firstseen, lastupdated are preserved).
Example with custom fields:
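For instance, an item carrying custom keys might look like this (an illustrative sketch; the system name, keys, and values are made up to show the sanitization and normalization rules above):

```yaml
external:
  - type: api
    name: payments-gateway              # illustrative external system
    url: https://payments.example.com
    description: Third-party payments API
    team-owner: platform                # custom key; sanitized to team_owner
    sla.tier: gold                      # custom key; sanitized to sla_tier
    regions:                            # list of primitives; stored as an array
      - us-east-1
      - eu-west-1
```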
Source control dependencies
These link your pod(s) to source code repositories.
Item shape (generic SCM):
Examples:
Example (GitHub):
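A sketch of the generic shape and a GitHub item follows. The field names here are inferred from the matching rules below (provider, url/giturl, optional name and description) and are assumptions, not confirmed by the source:

```yaml
source_control:
  # generic SCM item
  - type: scm
    provider: <optional, e.g. github, gitlab>
    url: <https URL of the repository>    # or giturl for git/ssh forms
    name: <optional; stored on the relationship>
    description: <optional string>
  # GitHub example
  - type: scm
    provider: github
    url: https://github.com/example-org/architecture-diagram
```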
Matching behavior:
We compute a canonical_url from the provided url (strip .git, strip trailing /, normalize host to lowercase, convert SSH/SCP forms to https://host/path). For example, git@github.com:org/repo.git becomes https://github.com/org/repo.
If provider is github or the host is github.com, we attempt to link to an existing (:GitHubRepository) by url (http/https) or giturl (git/ssh). If none exists, we skip linking (GitHub repositories are created by a separate ingestion).
Otherwise, we MERGE a generic (:SCMRepository {canonical_url}) node, setting helpful properties like host, provider, url/giturl (originals), and lastupdated.
The relationship created is (:KubernetesPod)-[:DEPENDS_ON {kind:'source_control', type:'scm', provider?, name?, description?, service_name}]->(<repo node>). name is optional and stored on the relationship when provided.
When you use our , here is an example of what the above configuration looks like:
config: |
  namespaces:
    - duploservices-andy:
        k8s:
          dependencies:
            kubernetes:
              - type: KubernetesService
                name: neo4j
                namespace: duploservices-andy
                protocol: bolt
                port: 7687
                url: bolt://neo4j.test10-apps.duplocloud.net:7687
                description: "Graph database used by architecture-diagram"
              - type: KubernetesConfigMap
                name: duplo-cartography
                namespace: duploservices-andy
                description: "Dependencies for architecture-diagram"
              - type: KubernetesSecret
                name: my-k8s-secret
                namespace: duploservices-andy
                description: "GPG passphrase for architecture-diagram"
            aws:
              - type: S3Bucket
                name: duplo-nonprod-awslogs-938690564755
                region: us-east-1
                description: "Stores exported diagrams"
              - type: RDSInstance
                name: duploandy-forcost
                identifier_name: db_instance_identifier
                # db_instance_identifier: duploandy-forcost   # optional; if set, used over name
                region: us-east-1
                description: "Primary application database"
            external:
              - type: api
                name: azure-communication-server
                url: https://example.communication.azure.com
                description: "Customer notifications via Azure Communication Server"
              - type: bedrock
                region: us-east-1
                name: anthropic-claude-sonnet-3.5
                description: |
                  Anthropic Claude Sonnet 3.5 via Application Inference Profile
                inference_profile_arn: arn:aws:bedrock:us-east-1:938690564755:application-inference-profile/5hli7gcftss9
Troubleshooting checklist
YAML wrapper: The file must start with namespaces: – do not wrap with config: |.
Indentation: Keys under list items must be indented two spaces more than the - line.
Workload key: The workload name must match the derived workload from pod names (e.g., pods named api-7c9d88f9d9-abc12 map to the workload key api).
Operational notes
The ingestion runs even when the manifest is missing or malformed; we log and continue.
You can change the file path at runtime by setting DEPENDENCIES_MAPPING_FILE and re-running.
We de-duplicate relationships by stable keys and set lastupdated on each run.
If you have questions or see skipped targets, copy the relevant log lines and open an issue with the exact pod name, namespace, and the YAML snippet.
Product Updates
New features and enhancements in DuploCloud
Last Updated, January 8, 2026
GCP — Support scaling node pools down to zero.¹,²
AWS — Support for globally distributed Aurora clusters.²
Kubernetes – Image Update API — Support image updates for additional and init containers.¹,²
FAQs
If your question isn't answered here, reach out to the team at [email protected] or contact us on Slack.
Security & Access
Do we need full admin access?
No. You don't need to grant any access to get started. DuploCloud's stack runs as a few Docker containers alongside a MongoDB instance and two S3 buckets — no privileged access to your environment is required upfront.
["list all running pods", "show CPU and memory usage for my pods", "display recent events in the cluster"]
Access is granted on your terms through Providers and Scopes. The platform uses IAM permissions defined in each Scope to generate temporary, just-in-time credentials that are passed to the agent as part of the ticket. You control exactly what the agent can and cannot touch.
How do you give access to Git?
Git is modeled as a Provider — the same way AWS, Kubernetes, and observability tools are. To give an Engineer Git access:
Navigate to Providers and add your Git provider (GitHub, GitLab, or Bitbucket).
Add your repository credentials under the Credentials tab.
Create a Scope — a named token with defined boundaries over specific repositories.
When creating a ticket, select the appropriate Scope.
See for step-by-step instructions.
How are credentials stored and secured?
Credentials are stored in DuploCloud or referenced from your own secrets manager. The platform uses them to generate scoped, temporary access at execution time — credentials are never passed to agents directly or stored in session context.
Each Scope defines the exact resources an Engineer can access. Guardrails can further restrict specific resources, operations, or environments within that Scope. See AI DevOps Policy Model — Provider and Scope.
What is the audit trail for AI actions?
Every ticket maintains a full context and audit trail throughout its lifecycle — what the agent was asked to do, what it proposed, what was approved, and what was executed. The Engineer Hub surfaces this history per Engineer, providing a transparent record of all completed work. Completed task history is also stored in the Engineer's Knowledge Base, queryable for future reference.
What if a user accidentally pastes credentials or secrets into a prompt?
DuploCloud applies a security validation Skill to all agents. When an agent detects that a prompt contains patterns consistent with secrets — API keys, access tokens, passwords, or cloud provider credentials — it refuses to process the request and returns a warning explaining why.
More broadly, the platform is designed so that users never need to supply credentials in prompts at all. Credentials are managed at the Scope level and injected as temporary, just-in-time credentials at execution time. If a user tries to bypass this by pasting credentials directly, the security Skill acts as a safety net.
For stronger guarantees, Scopes can be configured to remove sensitive credential access from the user-facing layer entirely — the agent simply doesn't have the access to misuse.
What compliance certifications does DuploCloud have?
DuploCloud is SOC 2 certified. Full security documentation is available for procurement review. The platform is used by customers in regulated industries including fintech and healthcare. Contact [email protected] for compliance documentation.
Pricing & Billing
What is the limit on the number of tokens?
There is no token-based billing. DuploCloud charges based on tickets (tasks completed) and nodes under management (infrastructure resources managed by the platform) — not on LLM token consumption. Think of it as the cost of a DevOps engineer for a fraction of the price. Contact the team for a business proposal with specific pricing assurances.
What exactly counts as a "ticket"?
A ticket is a unit of work assigned to an AI agent. In the workflow, a human approves a Task generated from a Project Plan — at that point, the Task becomes a Ticket and is dispatched to the appropriate agent for execution. Each ticket corresponds to one discrete, agent-executed action or investigation. See AI Helpdesk - Tickets for details.
What does "nodes under management" mean?
Nodes under management refers to the infrastructure resources — servers, Kubernetes nodes, cloud instances — that DuploCloud actively monitors and operates on. This forms the second dimension of pricing alongside tickets, reflecting the scope of infrastructure the platform is responsible for.
What's included in the 30-day PoC?
The PoC gives you a working AI Engineer running against your real infrastructure. DuploCloud's human operations team — infrastructure engineers, Kubernetes specialists, and security practitioners — is included to support setup, review complex work, and ensure the PoC runs against tasks from your actual backlog. Contact the team to scope a PoC around your specific environment.
Agents & Customisation
Do you use MCP servers or APIs to access AWS, Kubernetes, etc.?
It depends on the agent. For AWS and Kubernetes, the platform primarily uses the CLI — LLMs have strong CLI comprehension and it provides precise, auditable execution. For third-party systems that publish MCP servers (observability tools, ticketing systems, etc.), DuploCloud uses those MCP endpoints directly.
Agents are flexible. DuploCloud's core value is in the overall orchestration layer — individual agents can be modified or replaced for your specific environment. See MCP Servers for configuration details.
How do you handle long-running jobs?
The platform supports two communication modes:
Synchronous — for short, fast-turnaround tasks where the result is returned inline.
Pub-sub (asynchronous) — for long-running tasks such as code reviews that require a code checkout, analysis, and structured output. The agent publishes results when complete; no session needs to remain open.
Long-running tasks like generating code reviews or large deployments use the pub-sub model automatically.
Is agent memory persistent, and can it be shared across agents?
Agents themselves are stateless — each execution starts fresh with the context provided in the ticket. Persistence lives at the help desk layer: every ticket maintains a full history of the investigation, actions taken, and outcomes. This history is stored in the Engineer's Knowledge Base and is accessible to any agent working on related tickets.
The result is shared, searchable memory at the system level without individual agents needing to carry state between runs. Agents working on a follow-up ticket can query prior work, and human team members can review or build on the full investigation history.
How do you give AI agents context?
Context is assembled from four layers and delivered to the agent as part of each ticket:
Graph database — DuploCloud maintains a graph of your infrastructure that captures relationships between hosts, services, pods, dependencies, and cloud resources — giving agents a structured, queryable map of your environment rather than just flat text.
Knowledge Base retrieval — the platform uses vector search over the Engineer's Knowledge Base (previous tickets, runbooks, architecture notes) to pull relevant prior work into the prompt.
Skills — best practices, guardrails, and operational patterns are encoded as Skills and included in the agent's system prompt. This is how domain expertise is consistently applied without relying on the model to infer it.
Scope credentials — the agent receives temporary, just-in-time credentials scoped to the exact resources it's permitted to access, so it has the access it needs without ever needing to ask for it.
The result is a multi-layer context strategy: graph relationships for infrastructure awareness, vector retrieval for institutional knowledge, Skills for operational expertise, and Scopes for safe execution.
Can we build custom agents or bring our own?
Yes. There are three options:
Prebuilt Agents — use DuploCloud's out-of-the-box agents as-is.
Dynamic Agents — build agents through the platform UI by defining a prompt, selecting tools, choosing an LLM, and deploying a container image.
Bring your own — connect an existing agent by providing its access endpoint. DuploCloud can also provide code for its own agents as a starting point.
See for setup instructions.
What agents come out of the box?
DuploCloud provides the following pre-built agents:
Agent
What it does
SRE Agent
Orchestrates specialist sub-agents for broad incident and operations support
Kubernetes Agent
Cluster management, health checks, resource management, log analysis
See the .
Can we use our own LLM?
Yes. Dynamic Agents support AWS Bedrock as a first-class LLM provider, with additional providers available. The platform is model-agnostic at the agent level — you can configure each agent to use the model that fits your requirements and data residency constraints.
What AI back-ends does DuploCloud use, and why?
DuploCloud works with managed LLM services from major cloud providers — AWS Bedrock, GCP Vertex AI, and Azure AI Foundry, for example — depending on your cloud environment. Using managed services means your data stays within your own cloud account and is not used to train third-party models. This is important for enterprise security and compliance requirements.
The platform is model-agnostic at the agent level. DuploCloud's team continuously evaluates new models as they are released and updates default model assignments based on what performs best for each task type — reasoning-heavy tasks like Terraform plan analysis may use a different model than higher-volume tasks like log summarisation. Customers can always override the default and choose specific models for specific agents.
Does my choice of container platform affect the quality of AI output?
In practice, yes. LLMs perform better against widely adopted, open-standard platforms — such as Kubernetes (EKS, GKE, AKS) — than against proprietary or less common orchestration systems. This is because the volume of public documentation, community discussion, and training data is significantly higher for Kubernetes than for alternatives like ECS.
This doesn't mean proprietary platforms aren't supported — they are. But for complex tasks like troubleshooting, cost optimisation, and infrastructure generation, you'll typically get more accurate and detailed output on Kubernetes-based environments.
If you're choosing between platforms and AI-assisted operations is a priority, DuploCloud will factor this into its recommendation during the scoping phase.
Operations & Reliability
Who is responsible for AI's mistakes and how do I protect against them?
There are two layers of protection:
Deterministic, permission-based controls — the Scope you assign to an Engineer defines exactly what IAM permissions the agent gets. The platform uses those permissions to generate temporary credentials passed to the agent as part of the ticket. The agent cannot act outside those boundaries regardless of what it's asked to do.
Skills — best practices and operational guardrails are encoded directly into the agent's Skills. Skills define not just what an agent can do, but how it should do it, including safety checks and approval steps.
DuploCloud's human operations team also acts as a reliability layer — reviewing complex work and stepping in when something requires human judgment.
How do LLM model updates work, and will they affect my agents?
DuploCloud is model-agnostic. Each agent is configured to use a specific model through your cloud provider's managed LLM service (e.g., AWS Bedrock, Azure OpenAI), which you control. Model updates are not applied automatically — you decide when to change the model an agent uses.
DuploCloud monitors model performance across its customer base and makes recommendations when a newer model produces meaningfully better results for a specific task type (e.g., Kubernetes operations, Terraform plan analysis). These are recommendations, not forced updates.
Because Skills encode best practices as explicit, versioned instructions, agent behavior remains consistent even as underlying models evolve — the guardrails don't change with the model.
What happens to our data if we stop using DuploCloud?
Your infrastructure stays in your accounts — Terraform state, Kubernetes manifests, and all provisioned cloud resources remain fully under your control and continue operating. The Knowledge Base and audit trail are your data, stored in your own repositories (generally, as markdown files) and in DuploCloud's vector database, and can be exported at any time. DuploCloud does not own or lock in any of the artifacts produced.
Integration & Tooling
Can you show us some Jenkins agents?
Yes — DuploCloud has deployed Jenkins agents for multiple customers. The out-of-the-box CI/CD Agent supports Jenkins and GitHub Actions pipeline troubleshooting. Please contact the team to arrange a targeted demonstration.
Can we use our existing Terraform, Helm, or other IaC?
Yes. The platform includes a Terraform Skill out of the box, covering plan, apply, state management, and error handling. Helm and Kubernetes deployments are handled by the Kubernetes Agent and Skills. External Skill packages from HashiCorp and Pulumi can also be made available to the agents. Your existing IaC files, modules, and conventions are used as-is — the agent works with your code rather than replacing it.
Does DuploCloud support GitOps workflows (Flux, ArgoCD)?
Yes. Custom agents and Skills can be built for GitOps tools like Flux and ArgoCD. The platform's core model — agents operating on your Git repositories with scoped access and a full audit trail of proposed changes — maps naturally to GitOps pull-based delivery.
A GitOps-focused agent can manage Flux Kustomizations, HelmReleases, and GitRepository resources alongside your existing reconciliation workflow. Contact the team to scope a custom agent for your GitOps environment.
How does DuploCloud handle Terraform variable management across environments?
Terraform variable management is addressed at three levels:
System of record — variables and configuration are backed by your Git repositories. The platform treats your existing repo structure as the source of truth and works within it rather than replacing it.
Scope-based access control — each environment (dev, staging, production) is modeled as a separate Scope with its own credentials and boundaries. Engineers only access the variables relevant to the Scope they're operating in, preventing cross-environment leakage.
Skills — Terraform Skills encode best practices for module structure, variable organization, and environment promotion patterns, ensuring consistency across environments regardless of which team member or agent is making changes.
If you're partway through a migration, the platform can scan your existing repositories and cloud accounts to identify gaps and generate a remediation plan.
How does DuploCloud integrate with our existing CI/CD pipeline?
Git repositories (GitHub, GitLab, Bitbucket) are modeled as Providers with scoped access. The out-of-the-box CI/CD Agent integrates with Jenkins and GitHub Actions for pipeline troubleshooting and automation. For deeper pipeline integration, custom agents or Skills can be configured to fit your specific workflow.
Getting Started
What does DuploCloud's infrastructure look like at a high level?
DuploCloud runs within your own cloud account. The platform itself is a small footprint — a few Docker containers, a MongoDB instance, and a few S3 buckets.
The product has three main layers:
Engineer Hub — where you create and manage your AI DevOps Engineers (Platform Engineers, CI/CD Engineers, SRE Engineers, and more). You define high-level project requirements here; the Engineer converts them into a detailed plan and coordinates a team of agents to execute it.
Agentic AI Helpdesk — the work surface for task-level objectives. Accessible via web browser, Slack, Teams, or directly in an IDE, this is where tickets are created, assigned to specialized agents, and completed. Agents include SRE, Kubernetes, cloud-specific, Docs, and Architecture agents.
Integrations — the connectivity layer that links agents to your actual infrastructure through cloud providers (AWS, GCP, Azure), Kubernetes clusters (EKS, AKS, GKE), Git repositories (GitHub, GitLab, Bitbucket), observability tools, and MCP servers. Access is granted through Providers and Scopes, with temporary just-in-time credentials passed to agents at execution time.
Your existing infrastructure — Terraform state, Kubernetes clusters, CI/CD pipelines — is not migrated or replaced. Agents connect to it through the integrations layer using the permissions you define.
How long does it take to get started?
The platform is designed to be operational quickly. Setup involves deploying a few Docker containers, connecting your cloud and Git providers, and configuring an Engineer with the appropriate Skills and Scopes, all of which can be done in a few minutes, not days.
The 30-day PoC is structured to deliver measurable results against real infrastructure within the first sprint. Please contact the team to start scoping your onboarding.
Can DuploCloud scan our existing infrastructure and identify what still needs to be done?
Yes — this is the standard starting point for any project. When you create a Project Plan, you provide the platform with access to your Git repositories and cloud accounts (via Scopes). The planning phase scans what already exists and generates tasks only for what's missing or non-compliant with the target spec.
If you're partway through a migration, the agent picks up from where your team left off — assessing the current state, identifying the remaining gaps, and producing a prioritized task list with code reviews for each delta. You don't start from scratch.
What cloud providers and platforms are supported?
| Category   | Supported                 |
| ---------- | ------------------------- |
| Cloud      | AWS, GCP, Azure           |
| Kubernetes | EKS, AKS, GKE, RHOS       |
| Git        | GitHub, GitLab, Bitbucket |
See for the full list and setup instructions.
Do you support self-hosted or on-premise deployments?
DuploCloud runs within your own cloud environment — your infrastructure, your accounts, your data. The PrivateGPT Agent, for example, uses AWS Bedrock to ensure sensitive data never leaves your AWS environment.
For customers with strict data residency or on-premise requirements, contact [email protected] to discuss deployment options.
Developers
Prebuilt Agent Integration Guide
This guide documents the API standards required for custom Agents to integrate with the DuploCloud AI HelpDesk. By following these standards, your Agent can leverage HelpDesk features like Terminal command execution, browser interactions, and file operations.
Agent API Requirements
All custom Agents must expose a chat endpoint:
This endpoint handles message exchanges between your Agent and Service Desk, supporting contextual information and specialized response types.
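As a concrete illustration, the sketch below serves the chat endpoint with Python's standard-library `http.server`. The `handle_chat` helper and the echo behavior are illustrative, not part of the specification; a real Agent would generate its reply with an LLM or its own logic.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_chat(body: dict) -> dict:
    """Build a minimal assistant reply for the HelpDesk chat contract.

    The last element of `messages` is the current user request; all
    earlier elements are conversation context.
    """
    messages = body.get("messages", [])
    current = messages[-1] if messages else {}
    text = current.get("content", "")
    # A real Agent would do useful work here; this sketch just echoes.
    return {
        "role": "assistant",
        "content": f"Received: {text}",
        "data": {"cmds": [], "executed_cmds": [], "url_configs": []},
    }

class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/api/sendMessage":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        reply = json.dumps(handle_chat(body)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ChatHandler).serve_forever()
```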
Request Format
Request from HelpDesk to Agent
The HelpDesk sends a flat array of messages where the last message is the current user request. All previous messages provide conversation context.
Field Descriptions
messages (array, required)
Flat array of all conversation messages
Last element is the current message
Follows OpenAI/Anthropic conversation format
role (string, required)
"user": Message from user to Agent
"assistant": Message from Agent to user
content (string, required)
Human-readable message text
Empty string for pure approval/rejection messages
platform_context (object, only for user messages)
Environment-specific configuration and credentials
Set by HelpDesk
Example:
data (object, required)
Structured data for commands, URLs, and other actions
Contains cmds, executed_cmds, and url_configs arrays
timestamp (string, optional)
ISO 8601 formatted timestamp
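Before processing a request, an Agent can check the required fields described above. This validator is a sketch (the function name and error wording are illustrative, not part of the specification):

```python
VALID_ROLES = {"user", "assistant"}

def validate_messages(payload: dict) -> list[str]:
    """Return a list of problems with an incoming request body (empty = OK).

    Checks the required fields described above: a non-empty `messages`
    array whose elements carry `role`, `content`, and `data`.
    """
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["`messages` must be a non-empty array"]
    problems = []
    for i, msg in enumerate(messages):
        if msg.get("role") not in VALID_ROLES:
            problems.append(f"message {i}: role must be 'user' or 'assistant'")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i}: content must be a string")
        if not isinstance(msg.get("data"), dict):
            problems.append(f"message {i}: data must be an object")
    return problems
```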
Response Format
Response from Agent to HelpDesk
Capability-Specific Formats
Terminal Commands
Agents can provide Terminal commands for user approval and execution through a human-in-the-loop workflow.
Command Proposal (Agent → User)
Command Fields
command (string, required)
The shell command to execute
execute (boolean, required)
false: Command proposed by Agent, awaiting approval
true: Command approved by user
files (array, optional)
Files to create before command execution
Each file object contains:
file_path: Relative path where the file should be created
file_content: Contents to write to the file
rejection_reason (string, optional in user response)
User's reason for rejecting a command (when execute=false)
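A command proposal can be assembled from these fields with a small helper. This is a sketch; the helper name and parameters are illustrative:

```python
def propose_commands(explanation, commands, files_by_cmd=None):
    """Build an assistant message proposing Terminal commands for approval.

    Every proposed command is sent with execute=False so the HelpDesk
    presents it to the user for approval (human-in-the-loop).
    `files_by_cmd` optionally maps a command string to a list of
    {"file_path": ..., "file_content": ...} objects to create first.
    """
    files_by_cmd = files_by_cmd or {}
    cmds = []
    for command in commands:
        entry = {"command": command, "execute": False}
        if command in files_by_cmd:
            entry["files"] = files_by_cmd[command]
        cmds.append(entry)
    return {
        "role": "assistant",
        "content": explanation,
        "data": {"cmds": cmds, "executed_cmds": [], "url_configs": []},
    }
```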
Terminal Command Workflow
1. Agent Proposes Commands
Agent suggests commands with execute: false:
2. User Approves/Rejects Commands
When the user responds, they send back commands with updated execute status:
3. User Shares Executed Commands
In the next request, the user can also share commands they executed themselves in a shared user terminal, along with their outputs:
4. Agent Analyzes and Responds
The agent can share the commands it executed with the user via the executed_cmds array.
User-Initiated Commands
Users can run their own terminal commands between agent messages. These appear in the next user message:
File Operations with Commands
For commands requiring file creation (e.g., Helm charts, configurations):
Browser Actions
Agents can direct users to web resources:
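A browser-action message can be assembled with a helper like the following sketch (the helper name is illustrative):

```python
def suggest_dashboards(intro, links):
    """Build an assistant message pointing the user at web resources.

    `links` is a list of (url, description) pairs; they are sent in the
    url_configs array so the HelpDesk can render them as clickable links.
    """
    return {
        "role": "assistant",
        "content": intro,
        "data": {
            "cmds": [],
            "executed_cmds": [],
            "url_configs": [
                {"url": url, "description": desc} for url, desc in links
            ],
        },
    }
```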
Complete Workflow Example
Here's a full conversation flow showing all capabilities:
Best Practices
Use Platform Context: Always use provided platform context values instead of hardcoded values
Clear Explanations: Provide clear explanations with suggested actions
Human-in-the-Loop: Set execute: false for commands requiring approval
Maintain State: Include your executed commands in responses to maintain context
Progressive Disclosure: Start with diagnostic commands before suggesting changes
Analyze Outputs: Always analyze command outputs and provide insights
Thread Consistency: Return the same thread_id received in the request
Handle Rejections: Respect command rejections and adjust your approach
Symmetric Patterns: Use executed_cmds consistently for sharing command results
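Several of these practices (the approval workflow, handling rejections) reduce to inspecting the cmds array in the user's reply. A sketch, with an illustrative helper name:

```python
def split_approvals(user_message: dict):
    """Separate approved and rejected commands in a user approval reply.

    Approved commands come back with execute=True; rejected ones carry
    execute=False and an optional rejection_reason the agent should
    respect when adjusting its approach.
    """
    approved, rejected = [], []
    for cmd in user_message.get("data", {}).get("cmds", []):
        if cmd.get("execute"):
            approved.append(cmd["command"])
        else:
            rejected.append((cmd["command"], cmd.get("rejection_reason")))
    return approved, rejected
```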
POST /api/sendMessage
{
"messages": [
{
"role": "user" | "assistant",
"content": "Message text content",
"platform_context": {
// only for user messages
},
"data": {
// Structured data exchanges
},
"timestamp": "2025-05-20T18:00:46.123456Z"
}
// ... more messages
]
}
{
"k8s_namespace": "duploservices-andy",
"tenant_name": "andy",
"tenant_id": "7859b3f9-6d74-44fd-b20f-2b9c1e056761",
"duplo_base_url": "duplo.cloud.exmple.com",
"aws_credentials": { /* ... */ },
"kubeconfig": "base64...",
"duplo_token": "DuploCloud Token for the user sending the message to the Agent",
"grafana_base_url": "https://grafana.example.com",
"aws_security_group_name": "duploservices-andy",
"aws_iam_role_name": "duploservices-andy"
}
{
"role": "assistant",
"content": "Agent's text response to the user",
"data": {
"cmds": [],
"executed_cmds": [],
"url_configs": []
}
}
{
"role": "assistant",
"content": "I'll check the pod status in your namespace.",
"data": {
"cmds": [
{
"command": "kubectl get pods -n duploservices-andy",
"execute": false,
"files": [
{
"file_path": "config/app.yaml",
"file_content": "apiVersion: v1\nkind: ConfigMap..."
}
]
}
]
}
}
{
"role": "assistant",
"content": "Let me check your pod status and recent events.",
"data": {
"cmds": [
{
"command": "kubectl get pods -n duploservices-andy",
"execute": false
},
{
"command": "kubectl get events -n duploservices-andy --sort-by=.metadata.creationTimestamp",
"execute": false
}
]
}
}
{
"role": "user",
"content": "", // Empty for pure approval
"data": {
"cmds": [
{
"command": "kubectl get pods -n duploservices-andy",
"execute": true // Approved
},
{
"command": "kubectl get events -n duploservices-andy --sort-by=.metadata.creationTimestamp",
"execute": false, // Rejected
"rejection_reason": "Too much output, let's focus on pods first" //Optional Field may or may not be provided by the user
}
]
}
}
{
"role": "user",
"content": "I ran my own commands, analyze them",
"data": {
"executed_cmds": [
{
"command": "kubectl get pods -n duploservices-andy",
"output": "NAME READY STATUS RESTARTS AGE\napp-69fb74d9d4-j2l6x 1/1 Running 0 22h"
}
]
}
}
{
"content": "All pods are running successfully. The app pod has been stable for 22 hours.",
"data": {
"executed_cmds": [
{
"command": "kubectl get pods -n duploservices-andy",
"output": "NAME READY STATUS RESTARTS AGE\napp-69fb74d9d4-j2l6x 1/1 Running 0 22h"
}
],
"cmds": [
{
"command": "kubectl logs app-69fb74d9d4-j2l6x --tail=50",
"execute": false
}
]
}
}
{
"role": "user",
"content": "I checked the logs myself and found errors",
"data": {
"executed_cmds": [
{
"command": "kubectl logs app-69fb74d9d4-j2l6x | grep ERROR",
"output": "ERROR: Database connection timeout\nERROR: Failed to fetch user data"
}
]
}
}
{
"role": "assistant",
"content": "Here are the relevant monitoring dashboards for your issue:",
"data": {
"url_configs": [
{
"url": "https://grafana.example.com/d/k8s-pods",
"description": "Kubernetes pod health dashboard"
},
{
"url": "https://grafana.example.com/d/app-metrics",
"description": "Application performance metrics"
}
]
}
}
{
"messages": [
// 1. User reports issue
{
"role": "user",
"content": "My application is running slow",
"platform_context": {
"k8s_namespace": "duploservices-app",
"tenant_name": "app-team"
},
"data": {}
},
// 2. Agent proposes diagnostic commands
{
"role": "assistant",
"content": "I'll help diagnose the performance issue. Let me check your pod resources first.",
"data": {
"cmds": [
{
"command": "kubectl top pods -n duploservices-app",
"execute": false
}
]
}
},
// 3. User approves the diagnostic command
{
"role": "user",
"content": "",
"platform_context": { /* ... */ },
"data": {
"cmds": [
{
"command": "kubectl top pods -n duploservices-app",
"execute": true
}
]
}
},
// 4. Agent analyzes results and suggests scaling
{
"role": "assistant",
"content": "Your pod is using 95% CPU, which explains the slowdown. I recommend scaling your deployment to distribute the load.",
"data": {
"executed_cmds": [
{
"command": "kubectl top pods -n duploservices-app",
"output": "NAME CPU(cores) MEMORY(bytes)\napp-pod-abc123 950m 1900Mi"
}
],
"cmds": [
{
"command": "kubectl scale deployment app --replicas=3 -n duploservices-app",
"execute": false
}
]
}
},
// 5. User approves scaling and checks deployment status
{
"role": "user",
"content": "Good idea, let's scale it up. I also want to check the current deployment status.",
"platform_context": { /* ... */ },
"data": {
"cmds": [
{
"command": "kubectl scale deployment app --replicas=3 -n duploservices-app",
"execute": true
}
],
"executed_cmds": [
{
"command": "kubectl get deployment app -n duploservices-app",
"output": "NAME READY UP-TO-DATE AVAILABLE AGE\napp 1/1 1 1 5d"
}
]
}
},
// 6. Agent confirms scaling and provides monitoring dashboards
{
"role": "assistant",
"content": "Scaling complete! Your deployment is now running 3 replicas. Here are dashboards to monitor the performance improvement:",
"data": {
"executed_cmds": [
{
"command": "kubectl scale deployment app --replicas=3 -n duploservices-app",
"output": "deployment.apps/app scaled"
}
],
"url_configs": [
{
"url": "https://grafana.example.com/d/cpu-analysis",
"description": "CPU usage trends across all replicas"
},
{
"url": "https://grafana.example.com/d/load-distribution",
"description": "Load distribution between pods"
}
]
}
}
  ]
}
Adding Hosts
Add a Host (VM) in the DuploCloud Portal.
In DuploCloud, a Host represents a virtual machine (VM) that runs your workloads in the cloud. DuploCloud supports multiple Host types on AWS, including:
Individual EC2 instances
Auto Scaling Groups (ASGs) for scalable clusters
Bring Your Own Host (BYOH) for integrating existing or non-standard VMs
Use BYOH for any Host that is neither EC2 nor ASG.
Adding a Host (VM)
Adding an EC2 Host
Select the appropriate Tenant from the Tenant list box.
Navigate to Cloud Services -> Hosts.
Select the EC2 tab.
Optionally, select Advanced Options and complete the additional configurations:
Click Add. The Host will appear on the EC2 tab.
To connect to the Host using SSH, .
The EKS Image ID is the image published by AWS specifically for an EKS worker in the version of Kubernetes deployed at Infrastructure creation time.
If no Image ID is available with a prefix of EKS, copy the AMI ID for the desired EKS version by referring to this . Select Other from the Image ID list box and paste the copied AMI ID in the Other Image ID field. Contact the DuploCloud Support team via your Slack channel if you have questions or issues.
Adding an Auto Scaling Group (ASG) Host
An Auto Scaling Group (ASG) Host represents a scalable cluster of EC2 instances managed as a group. DuploCloud integrates with AWS Auto Scaling to automatically adjust capacity based on demand.
Select the appropriate Tenant from the Tenant list box.
Navigate to Cloud Services -> Hosts.
Select the ASG tab.
Optionally, select Advanced Options and complete the additional configurations:
Click Add. The Host will appear on the ASG tab. For information about more advanced features, such as Launch Templates, see the .
Adding a BYOH Host
Bring Your Own Host (BYOH) allows you to register and manage existing servers or virtual machines with DuploCloud without provisioning resources through AWS.
Select the appropriate Tenant from the Tenant list box.
Navigate to Cloud Services -> Hosts.
Select the BYOH tab.
Click Add. The Host will appear on the BYOH tab.
Creating Kubernetes StorageClass and PVC constructs in the DuploCloud Portal.
See .
Supported Host Actions
From the DuploCloud Portal, navigate to Cloud Services -> Hosts.
Select the Host name from the list.
From the Actions list box, you can select Connect, Host Settings, or Host State to perform the following supported actions:
Connect:
Host Settings:
Host State:
Adding custom code for EC2 or ASG Hosts
If you add custom code for EC2 or ASG Hosts using the Base64 Data field, your custom code overrides the code needed to start the EC2 or ASG Hosts, and the Hosts cannot connect to EKS. Instead, add custom code directly in EKS.
Click Add. The Add Host page displays.
Add Host page
Complete the required fields:
Select the Key Pair Type. Supported types are ED25519 and RSA. The default is ED25519.
Enable Block EBS Optimization
Select Yes to enable optimized EBS performance, or No to disable it.
Enable Hibernation
Select Yes to allow EC2 hibernation; select No to disable it.
Metadata service
Select V1 and V2 Enabled, V2 Only Enabled, or Disabled to manage instance metadata access. Default is V2 Only Enabled.
Prepend Duplo's userdata to my custom userdata
Select this option to prepend DuploCloud’s bootstrap scripts before your custom user data.
Base64 Data
Enter Base64-encoded data strings (for example, a startup script). On Linux, encode your script using:
cat <filepath> | base64 -w 0
Volumes
Specify additional storage volumes (in JSON format) to attach to the Host.
Tags
Enter AWS Tags as JSON key-value pairs to apply to the EC2 instance.
Node Labels
Optionally, enter Kubernetes Node Labels as key-value pairs to apply to this Host (for EKS Hosts only).
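The Base64 Data field above expects the single-line output of `base64 -w 0`. The same encoding can be produced programmatically; this Python sketch (the function name is illustrative) is equivalent to the Linux command shown above:

```python
import base64

def encode_userdata(script: bytes) -> str:
    """Base64-encode a startup script for the Base64 Data field.

    Produces the same single-line output as the Linux command
    `cat <filepath> | base64 -w 0`.
    """
    return base64.b64encode(script).decode("ascii")
```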
Click Add. The Add ASG page displays.
Add ASG page
Configure the basic ASG settings:
Select Yes to enable optimized EBS performance, or No to disable it.
Enable Hibernation
Select Yes to allow EC2 hibernation; select No to disable it.
Metadata service
Select V1 and V2 Enabled, V2 Only Enabled, or Disabled to manage instance metadata access. Default is V2 Only Enabled.
Use Spot Instances
Select to launch hosts using Spot Instances.
Scale from zero (BETA)
Enable the Scale From Zero (BETA) feature.
Enabled Metrics
Select this option to enable Auto Scaling Group metrics collection.
Prepend Duplo's userdata to my custom userdata
Select this option to prepend DuploCloud’s bootstrap scripts before your custom user data.
Base64 Data
Enter Base64-encoded data strings (for example, a startup script). On Linux, encode your script using:
cat <filepath> | base64 -w 0
Volumes
Specify additional storage volumes (in JSON format) to attach to the Host.
Tags
Enter Tags as JSON key-value pairs to apply to the ASG.
Node Labels
Optionally, enter Kubernetes node labels as key-value pairs to apply to this Host (for EKS Hosts only).
Click Add. The Add page displays.
Add page
Complete the following fields:
Friendly Name
Enter a descriptive name for the Host.
Availability Zone
Select an availability zone or choose Automatic to let AWS decide.
Instance Type
Select the EC2 instance type (e.g., 2 CPU 2 GB - t3a.small).
Agent Platform
Select the Agent Platform, such as EKS Linux or Docker Windows.
Image ID
The Image ID is auto-populated based on the Agent Platform; override it if needed.
Dedicated Host ID
Specify the Dedicated Host ID, e.g., h-0c6ab6f38bdcb24f6. DuploCloud uses this ID to launch the instance on the specified Dedicated Host.
Allocation Tags
Add Allocation Tags to manage where workloads are scheduled within the infrastructure.
OS
(displays only if Windows workloads are enabled)
Choose the operating system for the host. Select Linux for standard Linux workloads, or Windows to enable this host to run Windows containers on EKS.
Note: Windows hosts require that EnableK8sWindowsWorkload is enabled in infra settings. See AWS Infrastructure Settings
for instructions to configure infra settings.
Disk Size
Enter the EBS volume size in GB. If not specified, the volume size will be the same as defined in the AMI.
Enable Storage Encryption
Select Yes to encrypt the root volume, or No to leave it unencrypted.
Friendly Name
Enter a name to identify the ASG.
Availability Zones
Select an Availability Zone or choose Automatic to let AWS decide.
Instance Type
Select the Instance Type for the ASG instances (e.g., 2 CPU 2 GB - t3a.small).
Instance Count
Enter the desired instance count for the autoscaling group.
Minimum Instances
Enter the minimum number of instances allowed.
Maximum Instances
Enter the maximum number of instances allowed.
Use for Cluster Autoscaling
Select this option to allow the Kubernetes Cluster Autoscaler to manage scaling for this cluster.
Agent Platform
Select the Agent Platform (e.g., EKS Linux, or Docker Windows).
Image ID
The Image ID is auto-populated based on the Agent Platform; override it if needed.
Allocation Tags
Add Allocation Tags to manage where workloads are scheduled within the infrastructure.
OS
(displays only if Windows workloads are enabled)
Choose the operating system for the host. Select Linux for standard Linux workloads, or Windows to enable this host to run Windows containers on EKS.
Note: Windows hosts require that EnableK8sWindowsWorkload is enabled in infra settings. See AWS Infrastructure Settings
for instructions to configure infra settings.
Disk Size
Enter the EBS volume size in GB. If not specified, the volume size will be the same as defined in the AMI.
Enable Storage Encryption
Select Yes to encrypt the root volume, or No to leave it unencrypted.
Key Pair Type
Select the key pair type. Supported types are ED25519 and RSA. The default is ED25519.
Friendly Name
Enter a descriptive name for the Host.
Direct Address
Enter the IPv4 address that DuploCloud will use to communicate with the agent installed on your Host.
Fleet Type
Select the Fleet Type that corresponds to the container orchestrator running on your Host (e.g., Linux Docker/Native, Docker Windows, or None).
Username
Optionally, enter the Username for your Host.
Password
Optionally, enter the Password for access.
Private Key
Optionally, enter the Private Key used to log in to your Host via SSH. You can specify either a Password or a Private Key for authentication.
SSH
Establish an SSH connection to work directly in the AWS Console.
Connection Details
View connection details (connection type, address, user name, visibility) and download the key.
Host Details
View Host details in the Host Details JSON screen.
The Host Actions menu with Host Settings selected.
The Host Actions menu with Host State selected.
Key Pair Type
Enable Block EBS Optimization
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh duploinfra-MYINFRA --kubelet-extra-args '--node-labels=tenantname=duploservices-MYTENANT'
# Custom user code:
echo "hello world"
Agents
An overview of DuploCloud's pre-built AI Agents
DuploCloud AI Suite includes several Prebuilt, production-ready Agents that handle common DevOps and infrastructure management tasks. These Agents integrate seamlessly with your existing DuploCloud infrastructure and can be deployed immediately to automate routine operations and troubleshooting workflows.
Agents work within DuploCloud's secure Tenant architecture, inheriting user permissions and maintaining compliance with your organization's security policies. Agents can be accessed through the HelpDesk interface, where you can create Tickets and collaborate with AI Agents to resolve issues or perform tasks.
Out-of-the-Box Agents
Site Reliability Engineer (SRE) Agent
The Site Reliability Engineer (SRE) Agent is a central AI agent that orchestrates multiple specialized sub-agents to provide comprehensive support. When you interact with the SRE Agent, it automatically selects the appropriate sub-agent(s) to handle your queries, serving as a single point of interaction for all your operational questions.
Current integrations include the following sub-agents:
PrivateGPT Agent: secure access to DuploCloud documentation and tenant-specific knowledge
Future Integrations will include AWS, Observability, CI/CD, and additional sub-agents to provide end-to-end coverage for troubleshooting, diagnostics, monitoring, and observability.
View SRE (Master) Agent details
Core Capabilities
Automatic Sub-Agent Routing: Dynamically determines which sub-agent(s) to engage for each query.
AWS Agent
The AWS Agent is an AWS infrastructure expert that helps you diagnose, troubleshoot, and manage cloud resources across one or more AWS accounts. It suggests and executes AWS CLI commands with user approval, helping teams inspect, analyze, and take action on AWS infrastructure without requiring deep CLI expertise.
View AWS Agent details
Core Capabilities
Resource Discovery: List and inspect AWS resources across EC2, S3, RDS, Lambda, ECS, and more
Multi-Account Support
GCP Agent
The GCP Agent is a Google Cloud Platform infrastructure expert that helps you diagnose, troubleshoot, and manage GCP resources. It suggests and executes gcloud and kubectl commands with user approval, helping teams inspect, analyze, and take action on GCP infrastructure — including GKE clusters — without requiring deep CLI expertise.
View GCP Agent details
Core Capabilities
Resource Discovery: List and inspect GCP resources across Compute Engine, GKE, Cloud Storage, IAM, Networking, and more
Kubernetes Agent
The Kubernetes Agent is an expert DevOps engineer specialized in Kubernetes cluster management, maintenance, and troubleshooting. This Agent serves as your dedicated Kubernetes specialist, capable of handling everything from routine cluster health checks to complex resource deployments.
View Kubernetes Agent details
Core Capabilities
Cluster Health Monitoring: Assess overall cluster health and identify potential issues
Resource Management
Here’s a quick look at the Kubernetes AI Agent in action.
IaC Agent
The IaC Agent autonomously implements infrastructure changes in your Terraform repositories and opens pull requests for review. Give it a task — such as "add an S3 bucket with KMS encryption" — and it clones your repo, maps its structure, plans the changes, implements them, verifies the result, and creates a PR. Cloud-agnostic by design, it supports AWS, GCP, and Azure Terraform configurations.
View IaC Agent details
Core Capabilities
Autonomous Terraform Implementation: Clones your repo, plans changes, writes Terraform code, and opens a PR — end to end
Observability Agent
The Observability Agent provides intelligent monitoring and troubleshooting capabilities through integration with your observability stack. It is currently optimized for OpenTelemetry-based environments using Grafana and helps teams quickly identify and resolve application performance issues.
View Observability Agent details
Core Capabilities
Log Retrieval and Analysis: Fetch and summarize logs from Grafana with intelligent pattern recognition
Watch the Observability Agent in action below.
CI/CD Agent
The CI/CD Agent automates pipeline troubleshooting and failure resolution across your continuous integration and deployment workflows. Available for both Jenkins and GitHub Actions, this agent proactively engages when pipeline failures occur and provides intelligent assistance for resolution.
View CI/CD Agent details
Supported Platforms
Jenkins: Full integration with Jenkins pipelines and build processes
GitHub Actions
Here’s the GitHub Actions Agent at work, resolving pipeline issues.
Architecture Diagram Agent
The Architecture Diagram Agent leverages DuploCloud's cartography system to generate intelligent infrastructure and application architecture diagrams. Built on Neo4j graph database technology, this agent provides visual representations of complex system relationships and dependencies.
View Architecture Diagram Agent Details
Core Technology
Backend: Neo4j graph database populated by DuploCloud Cartography
Real-time Updates
Knowledgebase Agent
The Knowledgebase Agent answers questions by searching a vector database of previously resolved support tickets and knowledge base articles. Rather than relying solely on general AI knowledge, it grounds responses in your organization's actual resolution history — surfacing relevant tickets, steps, and references that have worked before. Over time, it becomes a living repository of your team's tribal knowledge, making institutional expertise available to everyone on demand.
View Knowledgebase Agent details
Core Capabilities
Semantic Search: Searches previously resolved tickets and knowledge base articles using vector similarity to find the most relevant matches for your query
PrivateGPT Agent
The PrivateGPT Agent provides a secure, enterprise-grade ChatGPT-like experience for organizations concerned about data privacy and security. This agent ensures that sensitive organizational data never leaves your AWS environment while providing powerful AI assistance.
View PrivateGPT Agent details
Security Architecture
Data Locality: All processing occurs within your AWS environment
AWS Bedrock Backend
Database Explorer Agent
The Database Explorer Agent provides secure, controlled access to database operations through pre-defined query templates. This agent enables non-technical teams to access database information safely without SQL knowledge or direct database access.
While these out-of-the-box agents cover many common use cases, DuploCloud's AI Studio platform enables you to build custom agents tailored to your specific workflows and tools. For assistance with agent customization or integration with additional tools, contact the DuploCloud team.
Comprehensive Operational Support: Covers troubleshooting, diagnostics, monitoring, and observability.
Context-Aware Assistance: Maintains awareness of tenant, cluster, and infrastructure context for precise answers.
Security and Compliance: Inherits sub-agent security and permission models (e.g., PrivateGPT Agent's in-tenant processing, Kubernetes Agent’s permission inheritance).
Key Features
Single Point of Interaction: Users can ask questions without needing to know which sub-agent to consult.
Real-Time Insights: Retrieves live cluster data, documentation, or logs as required by the query.
Dynamic Orchestration: Combines outputs from multiple sub-agents for complex queries spanning documentation and operational state.
Audit Logging: All agent interactions are logged for traceability and compliance.
Integration Workflow
User Query: User submits a question or task via the HelpDesk interface.
Agent Routing: The SRE Master Agent automatically routes queries to the appropriate sub-agent(s), currently including PrivateGPT and Kubernetes agents.
Sub-Agent Execution: Selected sub-agent(s) process the request (e.g., PrivateGPT Agent answers documentation questions, Kubernetes Agent retrieves cluster metrics).
Aggregated Response: SRE Master Agent consolidates results and presents a unified answer.
Logging & Auditing: All interactions are captured for traceability.
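The routing step above is LLM-driven in the real SRE Agent; as a purely illustrative sketch of the idea, a keyword-based router might look like this (all names and keyword lists are hypothetical):

```python
def route_query(query: str) -> list[str]:
    """Illustrative sub-agent routing (the real SRE Agent uses an LLM).

    Maps a user query to the sub-agent(s) that should handle it, mirroring
    the workflow above: documentation questions go to PrivateGPT, cluster
    questions to the Kubernetes agent, and mixed queries to both.
    """
    agents = []
    q = query.lower()
    if any(w in q for w in ("pod", "cluster", "kubectl", "deployment", "namespace")):
        agents.append("kubernetes")
    if any(w in q for w in ("docs", "documentation", "how do i", "what is")):
        agents.append("privategpt")
    return agents or ["privategpt"]  # default to documentation lookup
```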
Future Integrations
The SRE Agent will extend its operational coverage over time by incorporating additional sub-agents, such as:
AWS Agent: Infrastructure management and cloud resource queries.
Observability Agent: Logs, metrics, and alerting for full observability coverage.
CI/CD Agent: Pipeline monitoring, failure detection, and automated troubleshooting.
Additional agents: To provide end-to-end operational support across all DuploCloud-managed resources.
Multi-Account Support: Work across multiple AWS accounts in a single conversation using Scopes
Command Suggestions: Recommends precise AWS CLI commands based on your query
Command Execution: Executes approved commands and returns results in real time
Credential Handling: Securely processes AWS credentials from your defined Providers — each Provider maps to a separate AWS account and region
Context Awareness: Maintains conversation history for more accurate, relevant responses
Key Features
Approval Workflow: All suggested commands are presented for user review before execution
Scoped Execution: When multiple Providers are configured, the agent identifies which account to target and routes commands to the correct credentials automatically
AWS Bedrock Backend: Powered by Anthropic Claude via AWS Bedrock — processing stays within your AWS environment
Isolated Execution: Commands run in an isolated process per session
Audit Logging: All agent interactions and command executions are logged for traceability
Use Cases
Investigating resource configuration and availability issues across one or more AWS accounts
Discovering and inventorying resources within an Environment
Troubleshooting IAM permissions, security groups, and networking
Analyzing CloudWatch logs and metrics for performance issues
Auditing cost and usage patterns across AWS services and accounts
Root cause analysis for infrastructure incidents
Security Model
Permission Scope: You define the permission boundary of the agent based on the Provider Scope you define
Sandboxed Execution: Commands run in an isolated process per session
Audit Logging: All agent interactions and command executions are logged for traceability and compliance
Command Suggestions: Recommends precise gcloud and kubectl commands based on your query
Command Execution: Executes approved commands and returns results in real time
GKE Integration: Full Kubernetes support for GKE clusters, including pod management, deployments, and log retrieval
Credential Handling: Securely processes GCP credentials from your defined Providers
Context Awareness: Maintains conversation history for more accurate, relevant responses
Key Features
Approval Workflow: All suggested commands are presented for user review before execution
Dual CLI Support: Works across both gcloud for GCP resource management and kubectl for GKE workloads
Isolated Execution: Commands run in an isolated process per session
Audit Logging: All agent interactions and command executions are logged for traceability
Use Cases
Investigating Compute Engine instance configuration and availability
Diagnosing GKE pod failures, crashes, and networking issues
Managing Cloud Storage buckets and IAM policies
Analyzing Cloud Logging and Cloud Monitoring data for performance issues
Troubleshooting VPC networking, firewall rules, and load balancers
Root cause analysis for GCP infrastructure incidents
Security Model
Permission Scope: The agent's permission boundary follows the Provider Scope you define
Sandboxed Execution: Commands run in an isolated process per session
Audit Logging: All agent interactions and command executions are logged for traceability and compliance
Troubleshooting: Diagnose and resolve pod failures, networking issues, and resource constraints
Log Analysis: Retrieve and analyze logs from specific pods or services
Resource Inspection: Detailed examination of Kubernetes objects and their configurations
Key Features
Permission Inheritance: Operates with the requesting user's Kubernetes permissions - no additional access required
kubectl Integration: Executes kubectl commands securely within your cluster environment
Multi-Level Support: Handles both specific detailed requests (like "get logs for pod xyz") and high-level queries (like "assess cluster health")
Real-time Troubleshooting: Interactive problem-solving with immediate command execution
Use Cases
Investigating pod startup failures or crashes
Analyzing resource utilization and capacity planning
Deploying new applications or updating existing ones
Troubleshooting networking and service connectivity issues
Performing routine maintenance tasks and health checks
Security Model
No standalone permissions - inherits user's existing kubectl access
All actions are performed within DuploCloud's Tenant isolation
Command execution is logged and auditable
Multi-Cloud Support: Works with AWS, GCP, and Azure Terraform repositories
Repo Structure Discovery: Automatically maps Terraform roots, modules, and CI patterns before making changes
Clarification Q&A: Pauses to ask questions when a task is ambiguous before proceeding
Change Verification: Runs terraform fmt, init, validate, and plan to verify changes before the PR is created
Human Review Checkpoints: Optionally pause at the planning or implementation stage for review and feedback before continuing
Key Features
Fully Autonomous Mode: Submit a task and walk away — the agent handles the full pipeline without intervention
Checkpoint Feedback Loop: At any checkpoint, approve the plan, provide feedback for revision, or directly edit the artifact before continuing
PR Creation: Automatically branches, commits, and opens a GitHub pull request with a full summary of changes
Audit Trail: Every step of the pipeline produces artifacts — plan, diff summary, verification results — retained per run
Benefits
Faster Implementation: Infrastructure changes that would take hours to research, write, and validate manually can be completed in minutes
Consistent Code Quality: The agent follows your existing repo structure, naming conventions, and Terraform patterns — changes fit naturally into your codebase
Reduced Errors: Automated terraform fmt, validate, and plan checks catch mistakes before they ever reach a PR
Human Oversight Without Manual Work: Checkpoints give your team full control over what gets merged, without requiring them to write the code themselves
Democratizes IaC: Team members without deep Terraform expertise can contribute infrastructure changes safely
Use Cases
Adding new cloud resources (compute, storage, databases, networking) via Terraform
Modifying existing infrastructure configurations across environments
Automating repetitive IaC tasks that follow consistent patterns
Reviewing and approving AI-generated infrastructure changes before they reach your repo
Security Model
Credentials Never Persisted: API keys, GitHub tokens, and cloud credentials are passed per request and never written to disk
Isolated Execution: Each run operates in its own isolated directory
Human in the Loop: Optional checkpoints ensure no changes land in your repo without review
Metrics Analysis: Query and interpret application and infrastructure metrics
Contextual Filtering: Automatically scope queries to the user's current namespace
Pattern Detection: Identify anomalies and trends in log data
Time-based Analysis: Analyze data across specific time windows
Current Implementation
Backend: OpenTelemetry with Grafana integration
Data Types: Logs and metrics (traces, spans, and profiles coming in future versions)
Compliance Auditing: Visualize data flows for security and compliance reviews
AWS Resources
Architecture Diagram Agent has insight into your AWS resources. Currently supported types include AWSAccount, AWSRegion, EC2Instance, S3Bucket, and RDSInstance, among others. To see your AWS resources, ask the agent: Can you create a diagram of the aws resources?
Kubernetes Aware
Architecture Diagram Agent has extensive knowledge of Kubernetes in your infrastructure.
For example, we can ask the Architecture Diagram Agent Can you create a diagram of the duploservices-ai namespace? and it will create a diagram.
Custom Dependency Definition
Organizations can optionally define custom application dependencies:
Granular Control: Define dependencies per microservice
Multi-type Support: AWS resources, Kubernetes services, and external APIs
For example, once we define a pod's dependencies, we can ask our Architecture Diagram Agent: Can you create a diagram of dependencies for my Architecture Diagram Agent? and it will create a diagram of the dependencies:
Architecture Diagram Agent — Role Boundaries and Scope
Tenancy boundary:
All resources, relationships, metrics, and events are organized under a Tenant using tenantId.
All reads/writes require a tenantId; data from other Tenants is never returned or mutated.
Cross-tenant access is denied; background jobs and graph updates run within the same tenantId scope.
User role (least privilege):
Visibility: Only resources and relationships within their own tenantId.
Actions: Read and interact within Tenant scope; cannot access or reference other Tenants.
Admin role (most privilege):
Visibility: Intended for administration across Tenants.
Actions: Manage resources, relationships, and system-wide operations.
Least-privilege override (effective role):
When a principal has multiple roles, the least-privileged role determines access.
Example: If a principal has both admin and user roles, the effective scope is the user role.
What each role sees:
User: Only their Tenant’s nodes, edges, metrics, and events; Tenant-filtered diagrams and panels.
Admin: System-wide view and management; however, if also assigned the user role, the session is constrained to the user’s single-Tenant scope.
Enforcement points (high level):
API routes validate and require tenantId.
Graph/database queries filter by tenantId.
In short: data is strictly segmented by tenantId; users operate only within their Tenant; admins can operate broadly, but any concurrent user assignment forces least-privilege behavior, restricting access to the user’s Tenant.
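The least-privilege override described above can be sketched as a small role computation. This is an illustrative sketch only; the role names match the document, but the privilege ordering table and function names are assumptions.

```python
# Lower value = less privileged; the least-privileged role wins.
PRIVILEGE_ORDER = {"user": 0, "admin": 1}

def effective_role(roles: set[str]) -> str:
    # When a principal holds multiple roles, the effective role is the
    # least-privileged one it holds (the least-privilege override).
    return min(roles, key=lambda r: PRIVILEGE_ORDER[r])

def visible_tenants(roles: set[str], user_tenant: str, all_tenants: list[str]) -> list[str]:
    # An admin sees every Tenant, unless a concurrent user assignment
    # constrains the session to that user's single-Tenant scope.
    if effective_role(roles) == "admin":
        return list(all_tenants)
    return [user_tenant]
```

So a principal holding both admin and user roles resolves to the user role, and its visibility collapses to a single Tenant.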
Result Reranking: Applies a reranking model to optimize search results for accuracy before generating a response
Grounded Responses: Answers are based on your organization's real ticket history, with source references included
Intelligent Fallback: When no sufficiently relevant results are found, falls back gracefully to general AI knowledge
PII Sanitization: Optionally detects and redacts personally identifiable information from queries before processing
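Optional PII sanitization can be pictured as pattern-based redaction applied to the query before processing. The patterns below (emails and US-style phone numbers) are illustrative assumptions, not the agent's actual detection rules.

```python
import re

# Illustrative redaction patterns; a real detector would cover far more
# identifier types and locales.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}"), "<PHONE>"),
]

def sanitize(query: str) -> str:
    # Replace each detected identifier with a placeholder before the
    # query is sent on for processing.
    for pattern, placeholder in PII_PATTERNS:
        query = pattern.sub(placeholder, query)
    return query
```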
Key Features
Reference Links: Responses include links to the source tickets or documentation used to formulate the answer
Context Awareness: Maintains conversation history for more accurate follow-up responses
Context-Aware by Default: Automatically interprets questions within the context of your configured knowledge base unless explicitly stated otherwise
Benefits
Preserves Tribal Knowledge: Captures and surfaces institutional expertise that would otherwise live only in individuals' heads or be lost over time
Faster Resolution: Teams spend less time re-investigating known issues — past solutions are surfaced instantly
Continuous Improvement: The more tickets resolved and indexed, the more accurate and useful the agent becomes
Self-Service Support: Empowers team members to find answers independently without needing to escalate
Use Cases
Troubleshooting known issues by surfacing how similar problems were previously resolved
Onboarding new team members with answers grounded in real operational history
Reducing repeat support tickets by making past resolutions searchable
Quick lookups for configuration guidance, error resolutions, and operational steps
Security Model
PII Protection: Optional PII detection and redaction ensures sensitive data is sanitized before being processed
Processing Within Your Environment: All AI processing occurs within your own cloud environment — data does not leave your infrastructure
Audit Logging: All agent interactions are logged for traceability and compliance
AWS Bedrock Backend: Leverages AWS Bedrock for LLM capabilities
Enhanced Privacy: Stronger guarantees that input data won't be used for model training
DuploCloud Interface: Access through familiar HelpDesk interface
Core Capabilities
General AI Assistance: Natural language processing for various business needs
Document Analysis: Process and analyze internal documents securely
Key Features
Zero External Data Exposure: All interactions remain within your cloud environment
Familiar Interface: ChatGPT-like experience through DuploCloud HelpDesk
Enterprise Controls: Full audit trail and access controls
Compliance Ready: Meets strict data residency and privacy requirements
Use Cases
Analyzing sensitive business documents
Internal knowledge base queries
Compliance and regulatory document review
Benefits Over Public AI Services
Data Sovereignty: Complete control over where your data is processed
Compliance Alignment: Meets enterprise security and regulatory requirements
Audit Trail: Full logging and monitoring of AI interactions
Multi-Database Support: Works with MySQL, PostgreSQL, and other relational databases
Natural Language Interface: Users interact using plain language requests
Parameter Substitution: Intelligently fills in query parameters based on user input
Security Model
Controlled Access: Only pre-approved query patterns can be executed
No Raw SQL: Users cannot execute arbitrary database commands
Template Validation: All queries must match predefined templates
Audit Logging: Complete tracking of all database interactions
User Interaction
User Request: "Find the customer with phone number (555) 123-4567"
Agent Processing: Extracts phone number, maps to customer lookup template
Query Execution: Substitutes parameter and executes safe query
Response: Returns customer information in user-friendly format
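The four-step flow above can be sketched end to end with a parameterized template. This is a minimal sketch under assumed names: the template registry, table schema, and extraction regex are invented for illustration, and the user never supplies raw SQL.

```python
import re
import sqlite3

# Pre-approved query patterns; only these can ever be executed.
TEMPLATES = {
    "customer_by_phone": "SELECT name, phone FROM customers WHERE phone = ?",
}

def lookup_customer(conn: sqlite3.Connection, request: str):
    # Extract the phone number from the natural-language request and bind
    # it as a query parameter (never interpolated into the SQL string).
    match = re.search(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}", request)
    if not match:
        raise ValueError("no phone number found in request")
    return conn.execute(TEMPLATES["customer_by_phone"], (match.group(),)).fetchall()

# Demo data in an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, phone TEXT)")
conn.execute("INSERT INTO customers VALUES ('Ada', '(555) 123-4567')")
rows = lookup_customer(conn, "Find the customer with phone number (555) 123-4567")
```

Binding the extracted value as a parameter, rather than splicing it into the SQL text, is what keeps the template library safe from injection.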
Key Features
Template Library: Maintain a collection of approved query patterns
Parameter Validation: Automatic validation of input parameters
Result Formatting: Present database results in user-friendly formats
Use Cases
Customer Support: Quick customer information lookup
Data Analysis: Self-service access to business intelligence data
Report Generation: Automated generation of standard reports
Operational Queries: Access to operational data without technical expertise
Benefits
Rapid Development: Enable data access without building custom UIs
User Empowerment: Non-technical teams gain self-service capabilities
Reduced Development Overhead: No need to build custom data access interfaces
UI: Graph, search, filters, impact/stats panels, and websockets show only Tenant-scoped data.
UI: Can view and operate beyond a single Tenant unless least-privilege applies (see below).
Visibility and actions are restricted to the principal’s tenantId.
Cross-tenant views and operations are not permitted.
Websocket channels are partitioned by tenantId.
Diagram generation and analysis features apply tenantId filtering end-to-end.
YAML Dependencies Specification
Dependencies Manifest Specification - Version 2
Document Version: 2.0
Last Updated: October 3, 2025
Status: Current
Overview
This document specifies the schema, validation rules, and matching behavior for the dependencies manifest YAML format used by duplo-cartography. This manifest allows workloads to declare their runtime dependencies on Kubernetes resources, AWS services, external systems, and source control repositories.
⚠️ Important: As of duplo-cartography version 0.5.0 and higher, Version 1 manifests are no longer supported. Manifests without version: 2 will not cause import failures, but dependency items that don't match the Version 2 format will be silently skipped.
Document Structure
Top-Level Schema
Field Definitions
Field | Type | Required | Default | Description
Version Field
Specification
Type: Integer
Required: Yes (as of v0.5.0)
Required Value: 2
Behavior
Version 1 (deprecated as of v0.5.0):
⚠️ No longer supported in duplo-cartography 0.5.0+
Items will be silently skipped during import
Versioning Strategy
v0.5.0+: version: 2 is required; omitting version or using version: 1 results in items being skipped
v0.4.x and earlier: Version 1 was the default when version field was omitted
Namespace & Workload Mapping
Workload Name Derivation
Pod names are transformed to logical workload names using these rules:
StatefulSet Pattern: name-N → name
Example: worker-0 → worker
Matching Behavior
All pods with same derived workload name share one manifest entry
Workload name is case-sensitive
Namespace must match exactly
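The derivation rule above can be sketched in a few lines. This covers only the StatefulSet pattern documented here (strip a trailing -N ordinal); matching remains case-sensitive, as the spec requires.

```python
import re

def workload_name(pod_name: str) -> str:
    # StatefulSet pattern: name-N -> name (e.g. worker-0 -> worker).
    # Names without a trailing ordinal pass through unchanged, and case
    # is preserved because matching is case-sensitive.
    return re.sub(r"-\d+$", "", pod_name)
```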
Kubernetes Dependencies
Schema
Required Fields
Field | Validation Rule | Error Behavior
Optional Fields
Field | Type | Default | Description
Matching Logic
Label Matching:
type value must exactly match Neo4j label
Example: KubernetesService, KubernetesSecret
Target Validation
Targets are validated before relationship creation
Missing targets logged as WARNING and skipped
No nodes are created; K8s resources must exist in graph
Relationship Properties
Properties stored on DEPENDS_ON relationship:
Property | Source | Type
AWS Dependencies
Schema
Required Fields
Field | Validation Rule
Optional Fields
Field | Default | Description
Matching Logic
Match on {<identifier_name>: <value>}
If identifier_name field present in item, use that value
Otherwise, fall back to name value
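One possible reading of the matching rule above, sketched for illustration only: if the item carries an identifier_name field, match on that property; otherwise fall back to matching on name. The function name and exact fallback behavior are assumptions, not the documented implementation.

```python
def aws_match_key(item: dict) -> dict:
    # Pick which property to match on: the field named by identifier_name
    # when present, otherwise the item's name.
    key = item.get("identifier_name", "name")
    value = item.get(key, item.get("name"))
    return {key: value}
```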
Relationship Properties
Property | Source
External Dependencies
Schema
Required Fields
Field | Validation Rule
Node Creation
Creates (:ExternalService {name: <name>}) node
All item properties (except internal fields) stored on node
Properties sanitized: [^A-Za-z0-9_] → _
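The sanitization rule above maps directly to a one-line substitution. This sketch applies it to property names; whether values are sanitized the same way is not specified here.

```python
import re

def sanitize_property_key(key: str) -> str:
    # Any character outside [A-Za-z0-9_] is replaced with an underscore
    # before the property is stored on the ExternalService node.
    return re.sub(r"[^A-Za-z0-9_]", "_", key)
```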
Property Normalization
Input Type | Storage Format
Stale Property Cleanup
Properties present in previous run but not current run are removed
# ❌ This format NO LONGER WORKS in v0.5.0+
# Items will be silently skipped during import
namespaces:
  - default:
      worker:
        dependencies:
          kubernetes:
            - type: Service # v1 format - will be skipped
              name: redis
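A hypothetical Version 2 counterpart of the snippet above might look like the following. The exact v2 nesting shown here is an assumption; only the version requirement and the type/name fields (with type matching the Neo4j label, e.g. KubernetesService) come from this spec.

```yaml
# Hypothetical Version 2 equivalent — nesting is assumed for illustration
version: 2
namespaces:
  default:
    worker:
      dependencies:
        kubernetes:
          - type: KubernetesService  # v2 type must exactly match the Neo4j label
            name: redis
```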