# general
h
Hi Infracost community. A while ago I promised to show how Infracost and Keptn can work together to strengthen the CD process. Video coming soon (I promise) but for now, the top half of the screenshot shows:
1. Infracost running + metrics being pushed to Prometheus
2. A Keptn quality gate evaluating the cost (+ other metrics [security / performance]) and approving the build
3. The deployment happens in the `dev-deploy` stage
The bottom half shows the same, except we’ve updated the plan so now it is too expensive (the cost SLO was set to `<700 && no more than 10% more expensive than a previous release`), so the deployment is stopped. 🎉 You’ve prevented a very expensive release making it to production.
l
Looks neat! The Prometheus stuff looks interesting. What does Keptn use for the quality gates? Is it using OPA + Rego or something different?
h
Thanks 🙂 The setup is Keptn + Prometheus + Prometheus service + job executor service + Infracost API. The quality gate is out-of-the-box Keptn functionality, so you get it “for free” with that `- name: "evaluation"` line. There are two files Keptn needs: an sli.yaml and an slo.yaml. Partial files:
sli.yaml (partial)
---
spec_version: '1.0'
indicators:
  infracost_total_hourly_cost: keptn_infracost_totalHourlyCost{ci_platform="keptn",keptn_project="infracost",keptn_service="microservice1",keptn_stage="dev-precheck"}
slo.yaml (partial)
spec_version: '1.0'
comparison:
  compare_with: "single_result"
  include_result_with_score: "pass"
  aggregate_function: avg
objectives:
  - sli: infracost_total_monthly_cost
    pass:
      - criteria:
          - "<700"
          - "<=+10%"
    warning:
      - criteria:
          - "<=695"
total_score:
  pass: "90%"
  warning: "75%"
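To make the pass criteria concrete, here is a rough Python sketch of how I read `"<700"` and `"<=+10%"`: an unsigned number is an absolute threshold, a signed one is relative to the previous comparison value, and every entry in one `criteria` list has to hold. This is not Keptn's code. The helper names `meets` and `passes` are made up for illustration, and raising an error when there is no previous value is my simplification.

```python
import operator
import re
from typing import Optional, Sequence

# Comparison operators that may prefix a criterion string.
_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
}

# operator, optional sign, number, optional percent sign
_PATTERN = re.compile(r"(<=|>=|<|>|=)([+-]?)(\d+(?:\.\d+)?)(%?)")


def meets(criterion: str, value: float, previous: Optional[float] = None) -> bool:
    """Check one criterion such as "<700" (absolute) or "<=+10%" (relative)."""
    match = _PATTERN.fullmatch(criterion.strip())
    if match is None:
        raise ValueError(f"unsupported criterion: {criterion!r}")
    op, sign, number, percent = match.groups()
    amount = float(number)

    if not sign:
        if percent:
            raise ValueError("unsigned percentages are not handled in this sketch")
        threshold = amount  # absolute threshold, e.g. "<700"
    else:
        if previous is None:
            # Simplification: a relative check needs a baseline to compare with.
            raise ValueError("relative criterion needs a previous value")
        direction = 1.0 if sign == "+" else -1.0
        if percent:
            # e.g. "<=+10%" means at most 10% above the previous value
            threshold = previous * (1.0 + direction * amount / 100.0)
        else:
            # e.g. "<=+10" means at most 10 units above the previous value
            threshold = previous + direction * amount

    return _OPS[op](value, threshold)


def passes(criteria: Sequence[str], value: float, previous: Optional[float] = None) -> bool:
    """All criteria in one list must hold (logical AND)."""
    return all(meets(c, value, previous) for c in criteria)
```

So with a previous release at 600, a new plan at 650 passes both checks, while 690 stays under 700 but fails the 10% rule (the limit would be 660).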
The job executor service is configured to listen for the `checkcost.triggered` event and run an image that has Python, requests and the Prometheus client installed. The `app.py` queries the Infracost API and pushes metrics to a backend (Prometheus in this case) during the `checkcost.triggered` CloudEvent task. Then the built-in quality gate takes action during the `evaluation.triggered` task and does its thing, retrieving metrics from Prometheus and giving a score.
As promised, a video overview:

https://youtu.be/L8AWjCAHv-4

l
@high-shoe-19425 this is awesome! Maybe post it as a new thread so everyone sees.