Achieving Compliance as Code Using Cloud Custodian

Introduction:

As more companies adopt the cloud, it has become essential to administer a sound governance policy on each account, one that ensures security and compliance. Resources must also be used optimally, without billing surprises.
The cloud consumption model is different: it is distributed and on demand, and even a developer can provision infrastructure resources. It is therefore critical for organizations to automate continuous monitoring of security policies, compliance, and resource utilization so that resources are used efficiently and predictably.

According to one survey, $216 billion was spent on cloud in 2020, and at least $17 billion of that was wasted on over-provisioned or unused resources.

Cloud Custodian

Cloud Custodian is an open-source application that gives an organization control over its cloud resources so they can be managed efficiently in a predictable, predefined fashion. It is written in Python and was launched in 2016 with the ability to define policies that govern all the cloud resources in an account. The policies are written in YAML, which makes it easy to manage cloud resources in a structured and coherent way. This helps an organization put cost-saving measures in place, enforce a standard tagging methodology, establish compliance and security on the account, and maintain a solid resource inventory.

Reference: linkedin.com/pulse/cloud-custodian-tre-king/

Key Features:

  • All major cloud providers are supported: AWS, Azure, GCP, and Tencent Cloud.
  • No agent or client needs to be installed; policies are executed remotely.
  • Policies are defined in a simple YAML structure.
  • Reports in detail on compliant and non-compliant resources in the cloud account and acts on them according to predefined remediation measures.
  • Provides real-time guardrails that protect the organization from surprises.
  • Offers a simple yet powerful way to filter resources based on their real-time properties or current state.
  • Takes remedial action in real time, such as sending notifications by email or SMS to multiple channels. Actions can also execute Lambda functions to terminate unauthorized login sessions or delete non-compliant resources in under a minute.
  • Produces output that can be ingested into a Security Information and Event Management (SIEM) solution.

Architecture:

How it works:

Custodian can be run on an EC2 instance, in a Docker container, or on a Windows machine, where we define the Custodian policies in YAML. A policy specifies two things:

  1. Filters
  2. Actions

Filters are the criteria that mark a resource as non-compliant, for example an unattached load balancer or an unencrypted EBS volume. The compliance rules to be enforced on the AWS account are expressed as filters.
Actions, on the other hand, are the remedial steps taken when resources matching the filters are detected. An action can simply send a notification or, and this is where Custodian shines over its peers, remediate the problem directly, for example by deleting the resource, without any manual intervention.
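
The filter-then-action model described above can be sketched in a few lines of Python. This is not Custodian's implementation; the resource dictionaries, the is_unencrypted filter, and the notify action are hypothetical names used only to illustrate the idea that a resource must match every filter before any action runs on it.

```python
from typing import Callable, Dict, List

Resource = Dict[str, object]


def apply_policy(
    resources: List[Resource],
    filters: List[Callable[[Resource], bool]],
    actions: List[Callable[[Resource], None]],
) -> List[Resource]:
    """Run every action on the resources that match all filters; return the matches."""
    matched = [r for r in resources if all(f(r) for f in filters)]
    for resource in matched:
        for action in actions:
            action(resource)
    return matched


# Hypothetical filter: flags volume-like records that are not encrypted.
def is_unencrypted(resource: Resource) -> bool:
    return resource.get("Encrypted") is False


# Hypothetical action: records a message instead of sending a real notification.
notifications: List[str] = []


def notify(resource: Resource) -> None:
    notifications.append(f"non-compliant: {resource['VolumeId']}")


# Made-up sample data.
volumes = [
    {"VolumeId": "vol-1", "Encrypted": True},
    {"VolumeId": "vol-2", "Encrypted": False},
]
non_compliant = apply_policy(volumes, [is_unencrypted], [notify])
print(notifications)
```

Passing an empty list of actions gives the report-only behaviour: the matching resources are returned but nothing is changed.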

When Custodian policies are deployed to the AWS account, they create Lambda functions and CloudWatch Events rules. A CloudWatch Events rule triggers each Lambda function, either at a specified interval such as once a day or every 6 hours, or in response to an event. The function then evaluates the filters and takes the configured corrective actions on any resources found to be non-compliant.

Some Use Cases (CIS Benchmarks for AWS):

The following example policies specify only filters and no actions, because here we only want to track the non-compliant resources.

1. Ensuring MFA on all accounts

policies:
- name: cis-iam-user-needs-mfa
  resource: aws.iam-user
  comment: |
    CIS AWS Foundations v1.4.0 (1.10)
    Multi-factor Authentication (MFA) adds an extra layer of 
    authentication assurance beyond traditional credentials. With 
    MFA enabled, when a user signs in to the AWS console, they will 
    be prompted for their username and password as well as for an 
    authentication code from their physical or virtual MFA device. 
    It is highly recommended that MFA be enabled for all accounts 
    that have a console password.
  filters:
    - type: credential
      key: password_enabled
      value: true
    - type: credential
      key: mfa_active
      value: false
  mode:
    schedule: "rate(24 hours)"
    type: periodic
    role: arn:aws:iam::123456789:role/AssumeRoleCustodianExternal
    execution-options:
      output_dir: s3://bucket/for/reports/custodian
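
To make the two credential filters concrete, here is a minimal Python sketch of the same check applied to an IAM credential report, which is a CSV file. The sample rows are made up and show only a subset of the report's columns, and users_missing_mfa is a hypothetical helper, not part of Custodian.

```python
import csv
import io
from typing import List

# Made-up sample using three of the columns found in an IAM credential report.
SAMPLE_REPORT = """user,password_enabled,mfa_active
alice,true,true
bob,true,false
ci-bot,false,false
"""


def users_missing_mfa(report_csv: str) -> List[str]:
    """Return users that have a console password but no active MFA device."""
    rows = csv.DictReader(io.StringIO(report_csv))
    return [
        row["user"]
        for row in rows
        # The report stores booleans as the strings "true" and "false".
        if row["password_enabled"] == "true" and row["mfa_active"] == "false"
    ]


print(users_missing_mfa(SAMPLE_REPORT))
```

Both conditions must hold, mirroring the way the two filters in the policy are combined: a user without a console password (such as ci-bot here) is not flagged.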

2. Ensuring access key rotation

policies:
- name: cis-iam-user-key-rotation
  resource: aws.iam-user
  comment: |
    CIS AWS Foundations v1.4.0 (1.14).
    Access keys should be rotated to ensure that data cannot be  
    accessed with an old key which might have been lost, cracked, or 
    stolen. Rotating access keys will reduce that window of 
    opportunity for an access key that is associated with a    
    compromised or terminated account to be used.
  filters:
    - type: credential
      key: access_keys.active
      value: true
    - type: credential
      key: access_keys.last_rotated
      value: 90
      op: gt
      value_type: age
  mode:
    schedule: "rate(24 hours)"
    type: periodic
    role: arn:aws:iam::123456789:role/AssumeRoleCustodianExternal
    execution-options:
      output_dir: s3://bucket/for/reports/custodian
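
The age comparison in this policy (value: 90, op: gt, value_type: age) amounts to asking whether an active key was last rotated more than 90 days ago. A minimal Python sketch of that logic follows; key_needs_rotation and its parameters are hypothetical names for illustration and do not reflect how Custodian evaluates the filter internally.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def key_needs_rotation(
    active: bool,
    last_rotated: datetime,
    max_age_days: int = 90,
    now: Optional[datetime] = None,
) -> bool:
    """True when an active key was last rotated more than max_age_days ago."""
    now = now or datetime.now(timezone.utc)
    return active and (now - last_rotated) > timedelta(days=max_age_days)


# Fixed reference time so the example is reproducible.
reference = datetime(2024, 1, 1, tzinfo=timezone.utc)
old_key = reference - timedelta(days=120)
new_key = reference - timedelta(days=10)

print(key_needs_rotation(True, old_key, now=reference))   # active, 120 days old
print(key_needs_rotation(True, new_key, now=reference))   # active, 10 days old
print(key_needs_rotation(False, old_key, now=reference))  # inactive key
```

Inactive keys are skipped regardless of age, which corresponds to the access_keys.active filter in the policy.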

3. Ensuring no security group allows ingress from 0.0.0.0/0 to ports 22 or 3389

policies:
- name: cis-security-group-ingress-is-restricted
  resource: aws.security-group
  comment: |
    CIS Amazon Web Services Foundations v1.4.0 (5.2). Security 
    groups provide stateful filtering of ingress and egress network 
    traffic to AWS resources. It is recommended that no security 
    group allows unrestricted ingress access to remote server 
    administration ports, such as SSH to port 22 and RDP to port 
    3389. Public access to remote server administration ports, such 
    as 22 and 3389, increases resource attack surface and 
    unnecessarily raises the risk of resource compromise.
  filters:
    - type: ingress
      Ports: [22,3389]
      Cidr: 
        value: 0.0.0.0/0
        op: eq 
        value_type: cidr
  mode:
    schedule: "rate(24 hours)"
    type: periodic
    role: arn:aws:iam::123456789:role/AssumeRoleCustodianExternal
    execution-options:
      output_dir: s3://bucket/for/reports/custodian
    runtime: python3.8
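
The intent of the ingress filter can be illustrated with a short Python sketch that checks rules shaped like the IpPermissions entries returned by the EC2 API (FromPort, ToPort, IpRanges). The open_admin_rules helper and the sample rules are hypothetical, and the sketch ignores IPv6 ranges and ICMP rules for brevity; it is an approximation of the check, not Custodian's own code.

```python
import ipaddress
from typing import Dict, List

ADMIN_PORTS = (22, 3389)
OPEN_NETWORK = ipaddress.ip_network("0.0.0.0/0")


def open_admin_rules(permissions: List[Dict]) -> List[Dict]:
    """Return ingress rules that expose port 22 or 3389 to 0.0.0.0/0."""
    flagged = []
    for rule in permissions:
        # Assumption: a rule without port fields allows every port.
        from_port = rule.get("FromPort", 0)
        to_port = rule.get("ToPort", 65535)
        covers_admin_port = any(from_port <= p <= to_port for p in ADMIN_PORTS)
        open_to_world = any(
            ipaddress.ip_network(ip_range["CidrIp"]) == OPEN_NETWORK
            for ip_range in rule.get("IpRanges", [])
        )
        if covers_admin_port and open_to_world:
            flagged.append(rule)
    return flagged


# Made-up sample rules.
rules = [
    {"FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"FromPort": 3389, "ToPort": 3389, "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
]
print([rule["FromPort"] for rule in open_admin_rules(rules)])
```

Only the first rule is flagged: the second is open to the world but not on an administration port, and the third covers port 3389 but only from a private range.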

Our Scripts for reference:

git clone https://siddiquimohammed@bitbucket.org/siddiquimohammed/compliance_as_code_custodian.git

Grafana Dashboard:

The one thing Custodian could be said to lack is a front end: a GUI to manage all the implemented policies and keep track of compliance on the account. Grafana can serve as that front end, with a dashboard containing an individual panel for each policy applied on the account. This puts all the information in a single pane and makes life easier for analysts.
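
As a rough illustration of how per-policy numbers for such panels might be derived, the Python sketch below counts matched resources per policy. It assumes the Custodian output has been copied to a local directory containing one sub-directory per policy, each with a resources.json file; the count_non_compliant helper and the sample data are hypothetical, and how the counts reach Grafana depends on the data source you choose.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def count_non_compliant(output_dir: Path) -> Dict[str, int]:
    """Map each policy directory name to the number of resources it matched."""
    counts = {}
    for resources_file in sorted(output_dir.glob("*/resources.json")):
        matched = json.loads(resources_file.read_text())
        counts[resources_file.parent.name] = len(matched)
    return counts


# Build a throwaway directory with made-up results to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sample = {
        "cis-iam-user-needs-mfa": [{"UserName": "bob"}],
        "cis-iam-user-key-rotation": [],
    }
    for policy_name, resources in sample.items():
        policy_dir = root / policy_name
        policy_dir.mkdir()
        (policy_dir / "resources.json").write_text(json.dumps(resources))
    summary = count_non_compliant(root)

print(summary)
```

A policy with a count of zero is fully compliant at the time of the run, so each count maps naturally onto one dashboard panel.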
