Introduction:
As more companies adopt the cloud, it has become essential to administer a sound governance policy on the account, one that ensures security and compliance. Resources must be used optimally, without billing surprises.
The cloud consumption model is different: it is distributed and on demand, and even a developer can provision infrastructure resources. Organizations therefore need automated, continuous monitoring of security policies, compliance, and resource utilization so that resources are used efficiently and predictably.
According to a survey, $216 billion was spent on cloud in 2020, and at least $17 billion of that was wasted on over-provisioned or unused resources.
Cloud Custodian
Cloud Custodian is an open-source application that gives an organization control over its cloud resources, so they can be managed efficiently in a predictable, predefined fashion. It is written in Python and was launched in 2016, with features such as policies that govern all the cloud resources in an account. Policies are written in YAML, which makes it easy to take a structured and coherent approach to provisioning cloud resources. This enables an organization to introduce cost-saving measures, enforce a standard tagging methodology for managing resources, establish compliance and security on the account, and maintain a solid mechanism for resource inventory.
Reference: linkedin.com/pulse/cloud-custodian-tre-king/
Key Features:
- All major cloud providers are supported: AWS, Azure, GCP, and Tencent Cloud.
- No agent or client needs to be installed; policies are executed remotely.
- Policies are defined in a simple YAML structure.
- Produces a detailed report of compliant and non-compliant resources in the cloud account and takes action according to predefined remediation measures.
- Provides real-time guardrails that protect the organization from surprises.
- Offers a simple yet powerful way to filter resources based on the values of their real-time properties or current state.
- Takes remedial measures in real time according to predefined actions, such as notifying multiple channels via email or SMS. Actions can also execute Lambda functions to terminate unauthorized login sessions or delete non-compliant resources in under a minute.
- Produces output that can be ingested into a Security Information and Event Management (SIEM) solution.
Architecture:
How it works:
Custodian can be run on an EC2 instance, in a Docker container, or on a Windows machine, where we define the Custodian policies in YAML. A policy specifies two things:
- Filters
- Actions
Filters are the criteria that mark a resource as non-compliant, for example an unattached load balancer or an unencrypted EBS volume. The compliance rules to be enforced on the AWS account are expressed as filters.
Actions, on the other hand, are the remedial steps taken when a resource matches the filters. An action can be a notification or, and this is where Custodian shines over its peers, a remediation such as deleting the resource without any manual intervention.
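The relationship between filters and actions can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Custodian's implementation: the `run_policy` helper, the sample volume records, and the "notify" action that just collects IDs are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Resource = Dict[str, object]


def run_policy(
    resources: List[Resource],
    filters: List[Callable[[Resource], bool]],
    actions: List[Callable[[Resource], None]],
) -> List[Resource]:
    """Apply every action to each resource that matches all filters."""
    matched = [r for r in resources if all(f(r) for f in filters)]
    for resource in matched:
        for action in actions:
            action(resource)
    return matched


# Hypothetical EBS volume records, shaped loosely like an API response.
volumes = [
    {"VolumeId": "vol-001", "Encrypted": True},
    {"VolumeId": "vol-002", "Encrypted": False},
]

notified: List[str] = []  # stands in for an email or SMS notification

non_compliant = run_policy(
    volumes,
    filters=[lambda v: not v["Encrypted"]],              # filter: unencrypted volume
    actions=[lambda v: notified.append(v["VolumeId"])],  # action: record a notification
)
```

A policy with filters but no actions simply returns the matched resources, which is how the report-only policies further down behave.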
When Custodian policies are applied to the account, they create Lambda functions and CloudWatch Events rules in the AWS account. A CloudWatch Events rule triggers the Lambda function, either at a specified interval (once a day, every 6 hours, and so on) or in response to an event. The Lambda function then evaluates the filters and takes the corrective actions on any resources found to be non-compliant.
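The interval is written as a rate expression such as `rate(24 hours)`. The sketch below shows how such an expression maps to a time interval. It is illustrative only: `rate_to_timedelta` is a hypothetical helper, and the regular expression assumes the `rate(value unit)` form with minute, hour, and day units.

```python
import re
from datetime import timedelta

# Assumed grammar: rate(<positive integer> <minute(s)|hour(s)|day(s)>)
_RATE = re.compile(r"^rate\((\d+) (minutes?|hours?|days?)\)$")


def rate_to_timedelta(expression: str) -> timedelta:
    """Convert a rate expression into the interval between two runs."""
    match = _RATE.match(expression)
    if match is None:
        raise ValueError(f"not a rate expression: {expression!r}")
    value = int(match.group(1))
    unit = match.group(2)
    if not unit.endswith("s"):
        unit += "s"  # timedelta keyword arguments are plural
    return timedelta(**{unit: value})


daily = rate_to_timedelta("rate(24 hours)")
six_hourly = rate_to_timedelta("rate(6 hours)")
```

Anything that does not match the expected form raises `ValueError`, which is a reminder that the schedule string must be written exactly.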
Some Use Cases (CIS Benchmarks for AWS):
The following example policies specify only filters and no actions, since here we are only tracking non-compliant resources.
1. Ensuring MFA on all accounts

    policies:
      - name: cis-iam-user-needs-mfa
        resource: aws.iam-user
        comment: |
          CIS AWS Foundations v1.4.0 (1.10)
          Multi-factor Authentication (MFA) adds an extra layer of
          authentication assurance beyond traditional credentials. With
          MFA enabled, when a user signs in to the AWS console, they will
          be prompted for their username and password as well as for an
          authentication code from their physical or virtual MFA device.
          It is highly recommended that MFA be enabled for all accounts
          that have a console password.
        filters:
          - type: credential
            key: password_enabled
            value: true
          - type: credential
            key: mfa_active
            value: false
        mode:
          schedule: "rate(24 hours)"
          type: periodic
          role: arn:aws:iam::123456789:role/AssumeRoleCustodianExternal
          execution-options:
            output_dir: s3://bucket/for/reports/custodian
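
To make the two `credential` filters concrete, here is a minimal Python sketch of the same check run against a CSV shaped like an IAM credential report. The sample rows and the `users_missing_mfa` helper are made up for illustration, and a real report has many more columns; this is not how Custodian itself is implemented.

```python
import csv
import io
from typing import List

# Hypothetical excerpt of a credential report (only the columns used here).
REPORT = """user,password_enabled,mfa_active
alice,true,true
bob,true,false
ci-bot,false,false
"""


def users_missing_mfa(report_csv: str) -> List[str]:
    """Return users that have a console password but no active MFA device."""
    rows = csv.DictReader(io.StringIO(report_csv))
    return [
        row["user"]
        for row in rows
        if row["password_enabled"] == "true" and row["mfa_active"] == "false"
    ]


missing = users_missing_mfa(REPORT)
```

Both conditions must hold for a user to be reported, which mirrors how the filters in a policy are combined: `ci-bot` has no console password, so it is not flagged even though it has no MFA device.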
2. Ensuring access key rotation

    policies:
      - name: cis-iam-user-key-rotation
        resource: aws.iam-user
        comment: |
          CIS AWS Foundations v1.4.0 (1.14).
          Access keys should be rotated to ensure that data cannot be
          accessed with an old key which might have been lost, cracked, or
          stolen. Rotating access keys will reduce that window of
          opportunity for an access key that is associated with a
          compromised or terminated account to be used.
        filters:
          - type: credential
            key: access_keys.active
            value: true
          - type: credential
            key: access_keys.last_rotated
            value: 90
            op: gt
            value_type: age
        mode:
          schedule: "rate(24 hours)"
          type: periodic
          role: arn:aws:iam::123456789:role/AssumeRoleCustodianExternal
          execution-options:
            output_dir: s3://bucket/for/reports/custodian
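
The age comparison in this policy (`value: 90`, `op: gt`, `value_type: age`) amounts to asking whether an active key was last rotated more than 90 days ago. A small Python sketch of that arithmetic follows; the `key_needs_rotation` helper and the dates are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def key_needs_rotation(
    active: bool,
    last_rotated: datetime,
    now: datetime,
    max_age_days: int = 90,
) -> bool:
    """True when the key is active and older than max_age_days."""
    return active and (now - last_rotated) > timedelta(days=max_age_days)


# Fixed reference time so the example is reproducible.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
old_key = datetime(2023, 9, 1, tzinfo=timezone.utc)     # 122 days before `now`
fresh_key = datetime(2023, 12, 1, tzinfo=timezone.utc)  # 31 days before `now`

old_flagged = key_needs_rotation(True, old_key, now)
fresh_flagged = key_needs_rotation(True, fresh_key, now)
inactive_flagged = key_needs_rotation(False, old_key, now)
```

An inactive key is never flagged, however old it is, because the first filter requires the key to be active.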
3. Ensuring no security group allows ingress from 0.0.0.0/0 to ports 22 or 3389

    policies:
      - name: cis-security-group-ingress-is-restricted
        resource: aws.security-group
        comment: |
          CIS Amazon Web Services Foundations v1.4.0 (5.2). Security
          groups provide stateful filtering of ingress and egress network
          traffic to AWS resources. It is recommended that no security
          group allows unrestricted ingress access to remote server
          administration ports, such as SSH to port 22 and RDP to port
          3389. Public access to remote server administration ports, such
          as 22 and 3389, increases resource attack surface and
          unnecessarily raises the risk of resource compromise.
        filters:
          - type: ingress
            Ports: [22, 3389]
            Cidr:
              value: 0.0.0.0/0
              op: eq
              value_type: cidr
        mode:
          schedule: "rate(24 hours)"
          type: periodic
          role: arn:aws:iam::123456789:role/AssumeRoleCustodianExternal
          execution-options:
            output_dir: s3://bucket/for/reports/custodian
          runtime: python3.8
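
The intent of the ingress filter can be illustrated with a short Python sketch that inspects a security group dictionary shaped like an EC2 `DescribeSecurityGroups` response. The `open_admin_ports` helper and the sample group are hypothetical, and the check is simplified (IPv4 ranges only, TCP or all-protocol rules only); it does not reproduce Custodian's own matching logic.

```python
from typing import Dict, List

ADMIN_PORTS = (22, 3389)  # SSH and RDP


def open_admin_ports(group: Dict) -> List[int]:
    """Return the admin ports this group exposes to 0.0.0.0/0."""
    exposed = set()
    for perm in group.get("IpPermissions", []):
        open_to_world = any(
            ip_range.get("CidrIp") == "0.0.0.0/0"
            for ip_range in perm.get("IpRanges", [])
        )
        if not open_to_world:
            continue
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            exposed.update(ADMIN_PORTS)
        elif protocol == "tcp":
            low, high = perm.get("FromPort"), perm.get("ToPort")
            if low is not None and high is not None:
                exposed.update(p for p in ADMIN_PORTS if low <= p <= high)
    return sorted(exposed)


# Hypothetical security group: SSH open to the world, RDP restricted.
group = {
    "GroupId": "sg-0123",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 3389, "ToPort": 3389,
         "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
}

exposed_ports = open_admin_ports(group)
```

Port 443 is open to the world but is not an administration port, and port 3389 is limited to a private range, so only port 22 is reported.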
Our Scripts for reference:
git clone https://siddiquimohammed@bitbucket.org/siddiquimohammed/compliance_as_code_custodian.git
Grafana Dashboard:
The one thing Custodian could be said to lack is a front end: a GUI to manage all the implemented policies and keep track of compliance on the account. Grafana can serve as that front end, with a dashboard that has an individual panel for each policy applied on the account. This presents all the information in a single pane and makes life easier for analysts.
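One possible way to feed such a dashboard is to reduce each policy's output to a count of matched resources. The sketch below assumes the Custodian output has been copied to a local directory containing one subdirectory per policy, each with a `resources.json` file holding a JSON list. The `compliance_summary` helper and the demo data are hypothetical, and how the numbers reach Grafana (for example through a data source of your choice) is left open.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def compliance_summary(output_dir: str) -> Dict[str, int]:
    """Map each policy name to the number of resources it matched."""
    summary = {}
    for resources_file in sorted(Path(output_dir).glob("*/resources.json")):
        with resources_file.open() as handle:
            summary[resources_file.parent.name] = len(json.load(handle))
    return summary


# Demo with a throwaway directory that imitates the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    sample = {
        "cis-iam-user-needs-mfa": [{"UserName": "bob"}],
        "cis-iam-user-key-rotation": [],
    }
    for policy_name, resources in sample.items():
        policy_dir = Path(tmp) / policy_name
        policy_dir.mkdir()
        (policy_dir / "resources.json").write_text(json.dumps(resources))
    demo_summary = compliance_summary(tmp)
```

Each key in the resulting dictionary could back one dashboard panel, with a count of zero meaning the policy found nothing non-compliant.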