Mistake that cost thousands (Kubernetes, GKE)

Lessons learned scaling a Kubernetes cluster

What is the difference between a Kubernetes cluster running 100x n1-standard-1 VMs (1 vCPU each) vs. 1x n1-standard-96 VM (96 vCPUs) or 6x n1-standard-16 VMs (16 vCPUs each)?

Premise

2019 August

Major red flag 🚩

NAME                       CPU(cores)   MEMORY(bytes)
admdesl-5fcfbb5544-lq7wc   3m           112Mi
admdesl-5fcfbb5544-mfsvf   3m           118Mi
admdesl-5fcfbb5544-nj49v   4m           107Mi
admdesl-5fcfbb5544-nkvk9   3m           103Mi
admdesl-5fcfbb5544-nxbrd   3m           117Mi
admdesl-5fcfbb5544-pb726   3m           98Mi
admdesl-5fcfbb5544-rhhgn   83m          119Mi
admdesl-5fcfbb5544-rhp76   2m           105Mi
admdesl-5fcfbb5544-scqgq   4m           117Mi
admdesl-5fcfbb5544-tn556   49m          101Mi
admdesl-5fcfbb5544-tngv4   2m           135Mi
admdesl-5fcfbb5544-vcmjm   22m          106Mi
admdesl-5fcfbb5544-w9dsv   180m         100Mi
admdesl-5fcfbb5544-whwtk   3m           103Mi
admdesl-5fcfbb5544-wjnnk   132m         110Mi
admdesl-5fcfbb5544-xrrvt   4m           124Mi
admdesl-5fcfbb5544-zhbqw   4m           112Mi
admdesl-5fcfbb5544-zs75s   144m         103Mi

NAME                       CPU(cores)   MEMORY(bytes)
admdesl-5fcfbb5544-lq7wc   152m         107Mi
admdesl-5fcfbb5544-mfsvf   49m          102Mi
admdesl-5fcfbb5544-nj49v   151m         116Mi
admdesl-5fcfbb5544-nkvk9   105m         100Mi
admdesl-5fcfbb5544-nxbrd   160m         119Mi
admdesl-5fcfbb5544-pb726   6m           103Mi
admdesl-5fcfbb5544-rhhgn   20m          109Mi
admdesl-5fcfbb5544-rhp76   110m         103Mi
admdesl-5fcfbb5544-scqgq   13m          120Mi
admdesl-5fcfbb5544-tn556   131m         115Mi
admdesl-5fcfbb5544-tngv4   52m          113Mi
admdesl-5fcfbb5544-vcmjm   102m         104Mi
admdesl-5fcfbb5544-w9dsv   18m          125Mi
admdesl-5fcfbb5544-whwtk   173m         122Mi
admdesl-5fcfbb5544-wjnnk   31m          110Mi
admdesl-5fcfbb5544-xrrvt   91m          126Mi
admdesl-5fcfbb5544-zhbqw   49m          107Mi
admdesl-5fcfbb5544-zs75s   87m          148Mi

resources:
  requests:
    memory: '150Mi'
    cpu: '20m'
  limits:
    memory: '250Mi'
    cpu: '200m'
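
To put numbers on the gap between what these pods request and what they use, here is a minimal Python sketch. It assumes the 20m request and 200m limit shown above were the values in effect when the second `kubectl top pods` snapshot was taken, which the listing itself does not confirm. The usage figures are copied by hand from that snapshot; the function name `summarize` is only illustrative.

```python
# CPU usage in millicores, copied from the second `kubectl top pods` snapshot.
usage_m = [152, 49, 151, 105, 160, 6, 20, 110, 13,
           131, 52, 102, 18, 173, 31, 91, 49, 87]

# Assumption: these are the request and limit that applied to the pods above.
CPU_REQUEST_M = 20
CPU_LIMIT_M = 200


def summarize(usage, request_m, limit_m):
    """Compare observed CPU usage with the configured request and limit."""
    total_used = sum(usage)
    total_requested = request_m * len(usage)
    return {
        "pods": len(usage),
        "total_used_m": total_used,
        "total_requested_m": total_requested,
        "used_to_requested": total_used / total_requested,
        "pods_over_request": sum(1 for u in usage if u > request_m),
        "peak_m": max(usage),
        "peak_to_limit": max(usage) / limit_m,
    }


summary = summarize(usage_m, CPU_REQUEST_M, CPU_LIMIT_M)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The scheduler places pods by their requests, not by their usage, so a ratio of used to requested CPU well above 1 suggests the nodes can be more crowded than the requests imply. A single snapshot is only a point-in-time reading, so treat the result as a hint rather than a measurement.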

admdesl-78fc6f5fc9-xftgr  0/1    Terminating                3         21m
admdesl-78fc6f5fc9-xgbcq  0/1    Init:CreateContainerError  0         10m
admdesl-78fc6f5fc9-xhfmh  0/1    Init:CreateContainerError  1         9m44s
admdesl-78fc6f5fc9-xjf4r  0/1    Init:CreateContainerError  0         10m
admdesl-78fc6f5fc9-xkcfw  0/1    Terminating                0         20m
admdesl-78fc6f5fc9-xksc9  0/1    Init:0/1                   0         10m
admdesl-78fc6f5fc9-xktzq  1/1    Running                    0         10m
admdesl-78fc6f5fc9-xkwmw  0/1    Init:CreateContainerError  0         9m43s
admdesl-78fc6f5fc9-xm8pt  0/1    Init:0/1                   0         10m
admdesl-78fc6f5fc9-xmhpn  0/1    CreateContainerError       0         8m56s
admdesl-78fc6f5fc9-xn25n  0/1    Init:0/1                   0         9m6s
admdesl-78fc6f5fc9-xnv4c  0/1    Terminating                0         20m
admdesl-78fc6f5fc9-xp8tf  0/1    Init:0/1                   0         10m
admdesl-78fc6f5fc9-xpc2h  0/1    Init:0/1                   0         10m
admdesl-78fc6f5fc9-xpdhr  0/1    Terminating                0         131m
admdesl-78fc6f5fc9-xqflf  0/1    CreateContainerError       0         10m
admdesl-78fc6f5fc9-xrqjv  1/1    Running                    0         10m
admdesl-78fc6f5fc9-xrrwx  0/1    Terminating                0         21m
admdesl-78fc6f5fc9-xs79k  0/1    Terminating                0         21m

resources:
  requests:
    memory: '150Mi'
    cpu: '100m'
  limits:
    memory: '250Mi'
    cpu: '500m'

Answer

What is the difference between a Kubernetes cluster running 100x n1-standard-1 VMs (1 vCPU each) vs. 1x n1-standard-96 VM (96 vCPUs) or 6x n1-standard-16 VMs (16 vCPUs each)?
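
One way to start reasoning about this question is raw capacity arithmetic. The Python sketch below is not the author's answer. It only compares the three cluster shapes by total vCPUs and by how much of a single node one pod occupies when it runs at the 500m CPU limit from the second resources block. The `NodePool` class and its fields are hypothetical names. The sketch deliberately ignores memory, the CPU that GKE reserves on each node for system components, per-node pod caps, and price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    """Illustrative description of a cluster built from one machine type."""
    machine_type: str
    nodes: int
    vcpus_per_node: int

    @property
    def total_vcpus(self) -> int:
        return self.nodes * self.vcpus_per_node

    def share_of_node(self, cpu_millicores: int) -> float:
        """Fraction of one node's raw CPU used by a pod at cpu_millicores."""
        return cpu_millicores / (self.vcpus_per_node * 1000)


# The three cluster shapes named in the question.
POOLS = [
    NodePool("n1-standard-1", nodes=100, vcpus_per_node=1),
    NodePool("n1-standard-16", nodes=6, vcpus_per_node=16),
    NodePool("n1-standard-96", nodes=1, vcpus_per_node=96),
]

CPU_LIMIT_M = 500  # CPU limit from the second resources block

for pool in POOLS:
    print(
        f"{pool.nodes:>3} x {pool.machine_type:<15}"
        f" total={pool.total_vcpus:>3} vCPU"
        f"  one pod at limit={pool.share_of_node(CPU_LIMIT_M):.1%} of a node"
    )
```

On these raw numbers the three shapes offer nearly the same total CPU (100, 96 and 96 vCPUs). What differs is granularity: one pod bursting to 500m takes half of an n1-standard-1 node, but only about 3% of an n1-standard-16 and about 0.5% of an n1-standard-96.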


Founder, engineer interested in JavaScript, PostgreSQL and DevOps. Follow me on Twitter for outbursts about startups & engineering. https://twitter.com/kuizinas

Gajus Kuizinas