YAML Dependencies Specification
Dependencies Manifest Specification - Version 2
Document Version: 2.0
Last Updated: October 3, 2025
Status: Current
Overview
This document specifies the schema, validation rules, and matching behavior for the dependencies manifest YAML format used by duplo-cartography. This manifest allows workloads to declare their runtime dependencies on Kubernetes resources, AWS services, external systems, and source control repositories.
⚠️ Important: As of duplo-cartography version 0.5.0 and higher, Version 1 manifests are no longer supported. Manifests without version: 2 will not cause import failures, but dependency items that don't match the Version 2 format will be silently skipped.
Document Structure
Top-Level Schema
version: <integer>            # REQUIRED as of v0.5.0: must be 2 (legacy default was 1 when omitted)
namespaces:                   # REQUIRED: list of namespace entries
  - <namespace-name>:         # namespace identifier
      <workload-name>:        # logical workload/service name
        dependencies:         # dependency declarations
          kubernetes: []
          aws: []
          external: []
          source_control: []
Field Definitions
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| version | integer | Yes (v0.5.0+) | - | Manifest version number. Must be 2 for v0.5.0+. Omitting or using version: 1 will cause items to be skipped. |
| namespaces | list | Yes | - | List of namespace entries. Each entry is a mapping of namespace name to workloads. |
Version Field
Specification
Type: Integer
Required: Yes (as of v0.5.0)
Required Value: 2
Legacy Default: 1 (v1 format, no longer supported)
Current Version: 2
Behavior
Version 1 (deprecated as of v0.5.0):
⚠️ No longer supported in duplo-cartography 0.5.0+
Items will be silently skipped during import
Manifests will not cause errors, but dependencies will not be created
Legacy behavior maintained for reference only
Version 2 (required):
Kubernetes type must match exact Neo4j label (e.g., KubernetesService)
Dynamic label matching for any K8s resource type
Support for identifier_name field in kubernetes section
Namespace defaulting behavior
Required for all manifests in v0.5.0+
Versioning Strategy
v0.5.0+: version: 2 is required; omitting version or using version: 1 results in items being skipped
v0.4.x and earlier: Version 1 was the default when the version field was omitted
Future versions may introduce additional breaking changes
Always specify version: 2 explicitly to ensure compatibility
Namespace & Workload Mapping
Workload Name Derivation
Pod names are transformed to logical workload names using these rules:
StatefulSet Pattern: name-N → name
  Example: worker-0 → worker
  Example: db-2 → db
Deployment Pattern: Remove last two dash-separated segments
  Example: api-7c9d88f9d9-abc12 → api
  Example: frontend-deployment-xyz → frontend
  If result ends with -<digits>, strip that suffix
Simple Names: Names with ≤2 segments remain unchanged
  Example: simple → simple
  Example: a-b → a-b
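The derivation rules can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the function name `derive_workload_name` is hypothetical, and the order in which the rules are tried (StatefulSet pattern first, then the segment count) is an assumption chosen because it reproduces every example listed above.

```python
import re

# name-N, where N is one or more digits (StatefulSet-style pod name).
_STATEFULSET = re.compile(r"^(?P<base>.+)-\d+$")
_TRAILING_DIGITS = re.compile(r"-\d+$")


def derive_workload_name(pod_name: str) -> str:
    """Map a pod name to a logical workload name (hypothetical helper)."""
    # Assumption: the StatefulSet pattern is checked before anything else.
    match = _STATEFULSET.match(pod_name)
    if match:
        return match.group("base")

    segments = pod_name.split("-")
    # Simple names: two segments or fewer remain unchanged.
    if len(segments) <= 2:
        return pod_name

    # Deployment pattern: remove the last two dash-separated segments.
    base = "-".join(segments[:-2])
    # If the result still ends with -<digits>, strip that suffix as well.
    return _TRAILING_DIGITS.sub("", base)


for pod in ("worker-0", "api-7c9d88f9d9-abc12", "frontend-deployment-xyz", "a-b"):
    print(pod, "->", derive_workload_name(pod))
```

Note that under this ordering a Deployment pod whose random suffix happens to be all digits would be treated as a StatefulSet pod; the specification does not say how that case is resolved.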
Matching Behavior
All pods with the same derived workload name share one manifest entry
Workload name is case-sensitive
Namespace must match exactly
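To make the matching behavior concrete, here is a small Python sketch of looking up a workload's dependencies in an already-parsed manifest (a plain dict, as a YAML loader would produce). The function name `find_dependencies` is hypothetical; the sketch only assumes the top-level structure described in this document, where namespaces is a list of single-namespace mappings.

```python
from typing import Any


def find_dependencies(manifest: dict[str, Any], namespace: str, workload: str) -> dict[str, Any]:
    """Return the dependencies block for (namespace, workload), or {} if none."""
    for entry in manifest.get("namespaces") or []:
        if not isinstance(entry, dict):
            continue
        # Plain dict lookups give exact, case-sensitive matching for both keys.
        workloads = entry.get(namespace)
        if not isinstance(workloads, dict):
            continue
        spec = workloads.get(workload)
        if isinstance(spec, dict):
            return spec.get("dependencies") or {}
    return {}


manifest = {
    "version": 2,
    "namespaces": [
        {"production": {"api": {"dependencies": {
            "kubernetes": [{"type": "KubernetesService", "name": "database"}],
        }}}},
    ],
}
print(find_dependencies(manifest, "production", "api"))
print(find_dependencies(manifest, "production", "API"))  # case differs, so no match
```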
Kubernetes Dependencies
Schema
kubernetes:
  - type: <string>              # REQUIRED: Neo4j label name
    name: <string>              # REQUIRED: resource identifier value
    namespace: <string>         # OPTIONAL: target namespace
    identifier_name: <string>   # OPTIONAL: property name to match on
    protocol: <string>          # OPTIONAL: protocol (stored on relationship)
    port: <integer>             # OPTIONAL: port number (stored on relationship)
    url: <string>               # OPTIONAL: connection URL (stored on relationship)
    description: <string>       # OPTIONAL: human-readable description
Required Fields
| Field | Validation | On failure |
|---|---|---|
| type | Non-empty string | ERROR logged, item skipped |
| name OR identifier_name field | At least one must have non-empty value | ERROR logged, item skipped |
Optional Fields
| Field | Type | Default | Description |
|---|---|---|---|
| namespace | string | Pod's namespace | Target resource namespace |
| identifier_name | string | "name" | Property name to match on in Neo4j |
| protocol | string | - | Protocol used for connection |
| port | integer | - | Port number for connection |
| url | string | - | Full connection URL |
| description | string | - | Human-readable description |
Matching Logic
Label Matching:
  type value must exactly match Neo4j label
  Example: KubernetesService, KubernetesSecret, KubernetesConfigMap
  Case-sensitive
Property Matching:
  Default: Match on {name: <value>}
  With identifier_name: Match on {<identifier_name>: <value>}
  Example: {name: "redis"} or {uid: "abc-123"}
Namespace Matching:
  Always matches on namespace property
  Defaults to pod's namespace if not specified
  Explicit namespace overrides default
Complete Match Pattern:
MATCH (target:<type> {<identifier_name>: <value>, namespace: <namespace>})
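As an illustration of how the pattern above could be assembled from one manifest item, here is a hedged Python sketch. The function `build_k8s_match` is hypothetical. It assumes the match value comes from the field named by identifier_name when present and from name otherwise; the identifier check on the label and property name and the trailing RETURN clause are additions of this sketch, not part of the specification.

```python
import re
from typing import Any, Optional

# Labels and property names cannot be passed as Cypher parameters, so this
# sketch interpolates them and restricts them to plain identifiers first.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_k8s_match(item: dict[str, Any], pod_namespace: str) -> Optional[tuple[str, dict[str, Any]]]:
    """Return (query, params) for one kubernetes item, or None if it must be skipped."""
    label = item.get("type")
    prop = item.get("identifier_name") or "name"   # default property is "name"
    value = item.get(prop) or item.get("name")     # assumption: fall back to name
    if not label or not value:
        return None  # the spec logs an ERROR and skips such items
    if not _IDENTIFIER.match(label) or not _IDENTIFIER.match(prop):
        return None
    # An explicit namespace overrides the pod's namespace.
    namespace = item.get("namespace") or pod_namespace
    query = f"MATCH (target:{label} {{{prop}: $value, namespace: $namespace}}) RETURN target"
    return query, {"value": value, "namespace": namespace}


print(build_k8s_match({"type": "KubernetesService", "name": "redis"}, "default"))
```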
Target Validation
Targets are validated before relationship creation
Missing targets logged as WARNING and skipped
No nodes are created; K8s resources must exist in graph
Relationship Properties
Properties stored on DEPENDS_ON relationship:

| Property | Source | Type / Value |
|---|---|---|
| kind | Fixed | "k8s" |
| type | type field | string |
| name | identifier value | string |
| protocol | protocol field | string (nullable) |
| port | port field | integer (nullable) |
| url | url field | string (nullable) |
| description | description field | string (nullable) |
| service_name | Derived workload name | string |
| firstseen | Auto | timestamp |
| lastupdated | Auto | timestamp |
AWS Dependencies
Schema
aws:
  - type: <string>              # REQUIRED: Neo4j label (e.g., S3Bucket)
    name: <string>              # REQUIRED: resource identifier
    identifier_name: <string>   # OPTIONAL: property to match on
    region: <string>            # OPTIONAL: AWS region
    engine: <string>            # OPTIONAL: database engine (RDS)
    description: <string>       # OPTIONAL: human-readable description
    # ... additional AWS-specific fields ...
Required Fields
| Field | Validation |
|---|---|
| type | Non-empty string (Neo4j label) |
| name | Non-empty string |
Optional Fields
| Field | Default | Description |
|---|---|---|
| identifier_name | "name" | Property name to match on |
| region | - | AWS region code |
| engine | - | Database engine (for RDS) |
| <identifier_name> | Falls back to name | Explicit identifier value |
Matching Logic
Match on {<identifier_name>: <value>}
If identifier_name field present in item, use that value
Otherwise, fall back to name value
Example: RDS uses db_instance_identifier, S3 uses name
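The fallback described above can be sketched in a few lines of Python. The helper name `resolve_aws_identifier` is hypothetical, and the reading that "identifier_name field" means the item field whose key is the value of identifier_name (for example db_instance_identifier) is an assumption based on the optional-fields table.

```python
from typing import Any, Optional


def resolve_aws_identifier(item: dict[str, Any]) -> tuple[str, Optional[str]]:
    """Return the (property, value) pair used to match an AWS resource."""
    prop = item.get("identifier_name") or "name"
    # Prefer the explicit identifier field; otherwise fall back to name.
    value = item.get(prop) or item.get("name")
    return prop, value


rds = {
    "type": "RDSInstance",
    "name": "staging-db",
    "identifier_name": "db_instance_identifier",
    "db_instance_identifier": "staging-db",
}
s3 = {"type": "S3Bucket", "name": "user-uploads-staging"}
print(resolve_aws_identifier(rds))
print(resolve_aws_identifier(s3))
```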
Relationship Properties
| Property | Source |
|---|---|
| kind | "aws" |
| type | type field |
| name | Unified name |
| description | description field |
| service_name | Derived workload name |
| firstseen | Auto |
| lastupdated | Auto |
External Dependencies
Schema
external:
  - type: <string>                    # REQUIRED: service type slug
    name: <string>                    # REQUIRED: unique service name
    url: <string>                     # OPTIONAL: service URL
    region: <string>                  # OPTIONAL: region/location
    inference_profile_arn: <string>   # OPTIONAL: AWS Bedrock ARN
    description: <string>             # OPTIONAL: description
    <custom-field>: <any>             # Additional fields allowed
Required Fields
| Field | Validation |
|---|---|
| type | Non-empty string (not "git") |
| name | Non-empty string (unique identifier) |
Node Creation
Creates (:ExternalService {name: <name>}) node
All item properties (except internal fields) stored on node
Property names are sanitized: each character matching [^A-Za-z0-9_] is replaced with _
Property Normalization
| Input type | Storage |
|---|---|
| string, int, float, bool | Direct storage |
| List of primitives | Array storage |
| Object, complex types | JSON string |
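The sanitization and normalization rules can be illustrated with a short Python sketch. The helper names are hypothetical, and two details are assumptions not stated in the specification: None is passed through unchanged, and JSON output uses sorted keys so repeated runs produce identical strings.

```python
import json
import re
from typing import Any

_UNSAFE = re.compile(r"[^A-Za-z0-9_]")
_PRIMITIVES = (str, int, float, bool)


def sanitize_key(key: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    return _UNSAFE.sub("_", key)


def normalize_value(value: Any) -> Any:
    """Convert a manifest value into something storable as a node property."""
    if value is None or isinstance(value, _PRIMITIVES):
        return value  # direct storage (None handling is an assumption)
    if isinstance(value, (list, tuple)) and all(isinstance(v, _PRIMITIVES) for v in value):
        return list(value)  # array storage for lists of primitives
    # Objects and other complex types are stored as a JSON string.
    return json.dumps(value, sort_keys=True, default=str)


item = {"tier": "production", "rate-limit.rps": 50, "owners": ["a", "b"], "meta": {"b": 1, "a": [1, 2]}}
props = {sanitize_key(k): normalize_value(v) for k, v in item.items()}
print(props)
```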
Stale Property Cleanup
Properties present in previous run but not current run are removed
Protected properties: name, firstseen, lastupdated
Cleanup uses explicit SET prop = NULL (no APOC required)
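A rough Python sketch of the cleanup logic follows: compute which property names disappeared between runs, exclude the protected ones, and emit an explicit SET ... = NULL clause. The function names and the node variable `svc` are hypothetical; how the previous run's property names are obtained is not covered here.

```python
from typing import Iterable

PROTECTED = frozenset({"name", "firstseen", "lastupdated"})


def stale_properties(previous: Iterable[str], current: Iterable[str]) -> list[str]:
    """Property names seen in the previous run but absent from the current one."""
    return sorted(set(previous) - set(current) - PROTECTED)


def cleanup_clause(stale: list[str], var: str = "svc") -> str:
    """Build a Cypher SET clause that removes properties by setting them to NULL."""
    if not stale:
        return ""
    return "SET " + ", ".join(f"{var}.{prop} = NULL" for prop in stale)


stale = stale_properties({"name", "url", "tier", "firstseen"}, {"name", "url"})
print(cleanup_clause(stale))
```

Property names fed into `cleanup_clause` are interpolated into the query text, so they should already be sanitized to [A-Za-z0-9_] before this step.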
Tenant Scoping
ExternalService inherits IN_TENANT relationship from pod
Backfill: All external services linked to pod's tenant
Orphaned services: Deleted when no pods reference them
Source Control Dependencies
Schema
source_control:
  - type: <string>          # REQUIRED: "github" or "scm"
    name: <string>          # OPTIONAL: display name
    url: <string>           # REQUIRED: repository URL
    description: <string>   # OPTIONAL: description
Required Fields
| Field | Validation |
|---|---|
| type | "github" or "scm" |
| url | Non-empty string |
URL Canonicalization
All URLs are canonicalized to https://host/path format:

| Input | Canonical form |
|---|---|
| git@host:org/repo.git | https://host/org/repo |
| ssh://git@host/org/repo.git | https://host/org/repo |
| git://host/org/repo.git | https://host/org/repo |
| https://host/org/repo.git | https://host/org/repo |
Rules:
Remove .git suffix
Remove trailing /
Lowercase scheme and host
Preserve path case
Preserve provider-specific segments (e.g., _git in Azure DevOps)
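The canonicalization rules can be approximated in Python with the standard library. This is a sketch under stated assumptions, not the project's code: the function name `canonicalize_repo_url` is hypothetical, user info and ports are dropped because the target format is https://host/path, and inputs that are neither scheme-based nor scp-style raise ValueError.

```python
import re
from urllib.parse import urlsplit

# scp-style syntax: [user@]host:path, e.g. git@host:org/repo.git
_SCP_LIKE = re.compile(r"^(?:[^@/]+@)?([^:/]+):(.+)$")


def canonicalize_repo_url(raw: str) -> str:
    """Canonicalize a repository URL to https://host/path."""
    raw = raw.strip()
    if "://" in raw:
        parts = urlsplit(raw)
        host = parts.hostname or ""   # hostname is lowercased, user/port dropped
        path = parts.path
    else:
        match = _SCP_LIKE.match(raw)
        if not match:
            raise ValueError(f"unrecognized repository URL: {raw!r}")
        host, path = match.group(1).lower(), match.group(2)

    path = path.strip("/")            # remove leading and trailing slashes
    if path.endswith(".git"):
        path = path[: -len(".git")]   # remove the .git suffix
    if not host or not path:
        raise ValueError(f"unrecognized repository URL: {raw!r}")
    # The path is left untouched otherwise, preserving case and segments like _git.
    return f"https://{host}/{path}"


for url in ("git@host:org/repo.git", "ssh://git@host/org/repo.git", "HTTPS://GitHub.com/MyOrg/Repo/"):
    print(url, "->", canonicalize_repo_url(url))
```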
GitHub Matching
Type "github" or host github.com: Link to (:GitHubRepository)
Match on url (HTTP) or giturl (git/ssh)
No node creation; repositories must exist in graph
Generic SCM
Other providers: Creates (:SCMRepository {canonical_url: <url>})
Properties: host, provider, url, giturl, lastupdated
Validation Rules
Field-Level Validation
| Field | Rule | On failure |
|---|---|---|
| K8s type | Non-empty string | Log ERROR, skip item |
| K8s name or identifier | At least one non-empty | Log ERROR, skip item |
| K8s namespace | String or omitted | Use pod namespace if omitted |
| AWS type | Non-empty string | Skip item silently |
| AWS name | Non-empty string | Skip item silently |
| External type | Non-empty, not "git" | Skip item silently |
| External name | Non-empty string | Skip item silently |
| Source Control url | Non-empty string | Skip item silently |
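One way to picture these rules is a single validation function that returns the outcome for an item. The sketch below is illustrative only: `validate_item` and its return values ("ok", "skip_error", "skip_silent") are hypothetical names, and for Kubernetes items it takes the literal reading that either the name field or the identifier_name field must be non-empty.

```python
from typing import Any


def _non_empty(value: Any) -> bool:
    return isinstance(value, str) and bool(value.strip())


def validate_item(section: str, item: dict[str, Any]) -> str:
    """Return "ok", "skip_error" (ERROR logged) or "skip_silent" for one item."""
    if section == "kubernetes":
        if not _non_empty(item.get("type")):
            return "skip_error"
        if not (_non_empty(item.get("name")) or _non_empty(item.get("identifier_name"))):
            return "skip_error"
        return "ok"
    if section == "aws":
        ok = _non_empty(item.get("type")) and _non_empty(item.get("name"))
    elif section == "external":
        ok = (_non_empty(item.get("type")) and item.get("type") != "git"
              and _non_empty(item.get("name")))
    elif section == "source_control":
        ok = _non_empty(item.get("url"))
    else:
        ok = False  # unknown sections are skipped in this sketch
    return "ok" if ok else "skip_silent"


print(validate_item("kubernetes", {"name": "redis"}))
print(validate_item("external", {"type": "git", "name": "repo"}))
```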
Logging Levels
| Event | Level | Message format |
|---|---|---|
| Missing required field | ERROR | [DepsIntel] K8s dependency missing required '<field>' field for pod=<name> ns=<ns> workload=<workload>; skipping |
| Target not found | WARNING | [DepsIntel] Skipping N K8s dependencies; targets not found: <type>/<prop>=<value>@<ns> |
| Successful mapping | DEBUG | [DepsIntel] K8s map pod=<name> ns=<ns> workload=<workload> -> type=<type> identifier=<prop>=<value> tgt_ns=<ns> |
| File issues | WARNING | [DepsIntel] DEPENDENCIES_MAPPING_FILE not set; skipping dependencies mapping. |
Continuation Behavior
Single item validation failure: Skip item, continue processing
Multiple failures: Log each, continue processing
File-level failure: Log warning, skip entire sync
No pods found: Log info, skip gracefully
No manifest data: Log info, skip gracefully
Examples
Minimal Example (v2)
version: 2
namespaces:
  - production:
      api:
        dependencies:
          kubernetes:
            - type: KubernetesService
              name: database
Complete Example (v2)
version: 2
namespaces:
  - staging:
      frontend:
        dependencies:
          kubernetes:
            - type: KubernetesService
              name: backend-api
              port: 8080
              protocol: http
              description: "Backend API service"
            - type: KubernetesSecret
              name: oauth-credentials
              namespace: platform  # cross-namespace
              description: "Shared OAuth credentials"
          aws:
            - type: S3Bucket
              name: user-uploads-staging
              region: us-east-1
              description: "User uploaded files"
            - type: RDSInstance
              name: staging-db
              identifier_name: db_instance_identifier
              db_instance_identifier: staging-db
              region: us-east-1
          external:
            - type: api
              name: stripe-payments
              url: https://api.stripe.com
              tier: production
              description: "Payment processing"
          source_control:
            - type: github
              name: myorg/frontend
              url: https://github.com/myorg/frontend
⚠️ Deprecated Example (v1 - No Longer Supported)
# ❌ This format NO LONGER WORKS in v0.5.0+
# Items will be silently skipped during import
namespaces:
  - default:
      worker:
        dependencies:
          kubernetes:
            - type: Service  # v1 format - will be skipped
              name: redis
Must be migrated to:
version: 2  # ✅ Required
namespaces:
  - default:
      worker:
        dependencies:
          kubernetes:
            - type: KubernetesService  # ✅ Exact Neo4j label
              name: redis
Implementation Notes
Cypher Generation
Kubernetes: Dynamic blocks generated per (type, identifier_name) combination
AWS: Dynamic blocks generated per (type, identifier_name) combination
External: Single static block with property merging
Source Control: Static blocks for GitHub, generic SCM
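The per-combination grouping can be pictured with a short Python sketch: items are bucketed by (type, identifier_name) so that each bucket could feed one generated block as a batch of rows. The function name `group_by_match_key` is hypothetical, and the sketch stops at grouping; it does not attempt to reproduce the generated Cypher.

```python
from collections import defaultdict
from typing import Any


def group_by_match_key(items: list[dict[str, Any]]) -> dict[tuple[str, str], list[dict[str, Any]]]:
    """Bucket dependency items by (type, identifier_name), defaulting to "name"."""
    batches: dict[tuple[str, str], list[dict[str, Any]]] = defaultdict(list)
    for item in items:
        key = (item["type"], item.get("identifier_name") or "name")
        batches[key].append(item)
    return dict(batches)


items = [
    {"type": "KubernetesService", "name": "backend-api"},
    {"type": "KubernetesService", "name": "redis"},
    {"type": "KubernetesSecret", "name": "oauth-credentials"},
]
for key, rows in group_by_match_key(items).items():
    print(key, len(rows))
```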
Neo4j Compatibility
Target: Neo4j 4.4 (Community)
No APOC required for core functionality
Stale property cleanup uses explicit SET prop = NULL
Bulk operations use UNWIND for efficiency
Performance Considerations
Pre-validation of targets reduces failed relationship attempts
Batch processing via UNWIND
Single query execution for all dependency types
Stale cleanup in separate transaction
Migration from v1 to v2
Breaking Changes
⚠️ As of v0.5.0: Version 1 manifests are no longer supported. Items in v1 format will be silently skipped during import.
Required Updates
All manifests must be migrated to Version 2 format to function in v0.5.0+:
Add version: 2 to manifest (required at top level)
Update K8s type values to exact Neo4j labels:
  Service → KubernetesService
  Secret → KubernetesSecret
  ConfigMap → KubernetesConfigMap
Explicitly set namespace for clarity (or omit to use default)
Use identifier_name for custom matching if needed
Example Migration
Before (v1):
namespaces:
  - prod:
      api:
        dependencies:
          kubernetes:
            - type: Service
              name: db
              namespace: prod
After (v2):
version: 2
namespaces:
  - prod:
      api:
        dependencies:
          kubernetes:
            - type: KubernetesService
              name: db
              # namespace omitted - defaults to pod namespace (prod)
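For larger manifests, the mechanical part of the migration could be scripted. The Python sketch below operates on an already-parsed manifest (a plain dict) and is hypothetical, not a tool shipped with duplo-cartography: it sets version to 2 and renames the three v1 Kubernetes types listed under Required Updates, leaving every other field, including any explicit namespace, untouched.

```python
import copy
from typing import Any

V1_TO_V2_TYPES = {
    "Service": "KubernetesService",
    "Secret": "KubernetesSecret",
    "ConfigMap": "KubernetesConfigMap",
}


def migrate_manifest(manifest: dict[str, Any]) -> dict[str, Any]:
    """Return a v2 copy of a parsed v1 manifest (input is not modified)."""
    migrated = copy.deepcopy(manifest)
    migrated["version"] = 2
    for entry in migrated.get("namespaces") or []:
        for workloads in entry.values():
            for spec in (workloads or {}).values():
                deps = (spec or {}).get("dependencies") or {}
                for item in deps.get("kubernetes") or []:
                    old_type = item.get("type")
                    if old_type in V1_TO_V2_TYPES:
                        item["type"] = V1_TO_V2_TYPES[old_type]
    return migrated


v1 = {"namespaces": [{"prod": {"api": {"dependencies": {
    "kubernetes": [{"type": "Service", "name": "db", "namespace": "prod"}],
}}}}]}
print(migrate_manifest(v1))
```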
Error Handling Reference
Common Errors
| Error message | Cause | Resolution |
|---|---|---|
| missing required 'type' field | type field missing or empty | Add valid type field |
| missing required identifier | Both name and identifier_name field missing | Add name field |
| targets not found | K8s resource doesn't exist in graph | Ensure resource ingested first |
| DEPENDENCIES_MAPPING_FILE not set | Environment variable not configured | Set env var to file path |
| is empty; skipping | File exists but has no content | Add valid YAML content |
| Malformed YAML | YAML syntax error | Fix YAML syntax |
Debugging
Enable DEBUG logging: Set log level to DEBUG
Check manifest loading: Look for Loaded manifest for N namespace/pod entries
Verify workload matching: Check No manifest for pod= messages
Review mapping logs: Look for K8s map, AWS map, etc.
Check target validation: Look for targets not found warnings
Changelog
duplo-cartography v0.5.0+ (October 2025)
Breaking Change:
Version 1 manifests no longer supported
Manifests without version: 2 will have items silently skipped
Migration to Version 2 format is now required
Version 2.0 (October 2025)
Added version field support
Kubernetes dependencies now use exact Neo4j labels
Added identifier_name support for Kubernetes (like AWS)
Namespace defaulting behavior for Kubernetes
Dynamic Cypher generation for any K8s resource type
Enhanced validation with ERROR-level logging
Improved debugging with detailed log messages
Version 1.0 (Initial - deprecated)
Basic manifest support
Hardcoded K8s types (Service, Secret, ConfigMap)
AWS dependencies with custom identifiers
External service tracking
Source control linking
No longer supported as of v0.5.0
Document Status: This specification is current and actively maintained. For implementation details, see src/intel/duplo/dependencies.py.