ArthurSens
18a533db56
Create alertmanager alerts
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-09-02 09:33:17 +02:00
ArthurSens
1d013d794c
Add alerts for kubernetes nodes
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-09-02 09:26:18 +02:00
mustard
2f3b432707
[observability] add cluster selector for browser overview dashboard
2022-09-01 08:30:16 +02:00
Gero Posmyk-Leinemann
73cbd09b66
[ops] WebApp: review comments
2022-08-31 16:07:16 +02:00
Gero Posmyk-Leinemann
cb30274ccc
[ops] WebApp: Alert on services crashlooping
2022-08-31 16:07:16 +02:00
Gero Posmyk-Leinemann
ddf5651b7c
[ops] WebApp: Alerts on excessive RAM and CPU usage
2022-08-31 16:07:16 +02:00
Gero Posmyk-Leinemann
8701732104
[ops] WebApp: alert if db-sync is not running
2022-08-31 16:07:16 +02:00
Gero Posmyk-Leinemann
4035d1242a
[ops] WebApp: alert on messagebus not running
2022-08-31 16:07:16 +02:00
Gero Posmyk-Leinemann
d394c24727
[ops] WebApp: high websocket connection rate
2022-08-31 16:07:16 +02:00
Gero Posmyk-Leinemann
5a00651ffd
[ops] WebApp: Internal alert on JSON RPC error rates
2022-08-31 16:07:16 +02:00
Gero Posmyk-Leinemann
313b522c59
[ops] Meta Overview/server: Fix unit of "API Request Error rate" to be reqps
2022-08-31 16:07:16 +02:00
Thomas Schubart
ccb148f2a6
[observability] Add dashboard for network limiting
2022-08-31 10:27:15 +02:00
ArthurSens
aee56a583b
Add alerts related to kubernetes resources
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-08-30 22:20:15 +02:00
mustard
f62bc58f7f
[observability] add browser overview dashboard
2022-08-30 15:02:15 +02:00
ArthurSens
912410cdb0
Create alerts for certmanager
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-08-29 21:12:14 +02:00
Kyle Brennan
074d9842ca
Sum by type, phase (avoids nodepools dupes)
2022-08-25 18:34:10 +02:00
JenTing Hsiao
b31278f1f1
observability: assign default zero if no data found
...
Signed-off-by: JenTing Hsiao <hsiaoairplane@gmail.com>
2022-08-25 13:41:10 +02:00
Pavel Tumik @ GitPod
cc79d75a96
[alerts] increase the "for" duration of GitpodWorkspaceStuckOnStopping to 30min to reduce flakiness
2022-08-23 20:32:39 +02:00
Manuel Alejandro de Brito Fontes
7b4a885ee3
Update k8s dependencies to v0.24.3
2022-08-23 08:18:39 +02:00
JenTing Hsiao
d5462c0d02
observability: add #workspace > 20 in alert GitpodWorkspaceTooManyRegularNotActive
...
This prevents the alert from firing as soon as we start traffic shifting.
At that point the number of workspaces may be low, so the
gitpod_workspace_regular_not_active_percentage threshold is easy to hit because
gitpod_ws_manager_workspace_activity_total is a small number.
Therefore, we add #workspace > 20 as an additional criterion for the alert.
Signed-off-by: JenTing Hsiao <hsiaoairplane@gmail.com>
2022-08-11 16:35:28 +02:00
Arthur Silva Sens
79fdcdd0be
Update scrape.libsonnet
2022-08-10 14:11:54 +02:00
utam0k
2d1f66ae25
observability: Add an alert for network connections.
2022-08-10 05:55:54 +02:00
JenTing Hsiao
a986791728
Update the alert description unit
...
Signed-off-by: JenTing Hsiao <hsiaoairplane@gmail.com>
2022-08-09 02:40:52 -03:00
Andrew Farries
c4363513a5
Run gofmt
...
gofmt -w .
From the repository root.
2022-08-08 10:54:52 -03:00
Pavel Tumik
06a686acf1
[alerts] change load avg alert to critical
2022-08-05 16:11:49 -03:00
Milan Pavlik
fc2355c241
[usage] Add runbook link for GitpodUsageScheduledReconciliationFailures
2022-08-05 09:34:49 -03:00
Milan Pavlik
9a947a5a81
[usage] Fix UsageReconciliationFailures alert
2022-08-05 02:42:49 -03:00
Milan Pavlik
63f3bb78ae
[usage] Add alert on failed reconciliations
2022-08-04 05:32:48 -03:00
Arthur Silva Sens
5bdda8ecea
Remove check for absence
2022-08-01 10:31:45 -03:00
ArthurSens
1041c76306
Add alert for target down
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-07-29 17:28:24 -03:00
ArthurSens
5092ab3934
Add alert for OpenVSX-proxy scraping failures
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-07-29 09:41:23 -03:00
Jean Pierre
eec738731c
Add openvsx alert
2022-07-29 01:46:23 -03:00
Manuel Alejandro de Brito Fontes
a5dd648f06
Add dashboard for node problem detector
2022-07-26 16:05:21 -03:00
Manuel Alejandro de Brito Fontes
70eaa01676
Add dashboard for ephemeral storage
2022-07-26 15:24:21 -03:00
Arthur Silva Sens
cd28f4c34d
Route GitpodWorkspaceStuckOnStarting to #t_workspace_alerts
2022-07-26 14:15:21 -03:00
Manuel Alejandro de Brito Fontes
c7474500ae
Improve Summary dashboard row
2022-07-26 06:07:20 -03:00
Milan Pavlik
6893677724
[usage] Add grafana dashboard
2022-07-26 03:15:20 -03:00
Manuel Alejandro de Brito Fontes
b9db8b349b
Add Summary row to Gitpod overview dashboard
2022-07-20 13:32:15 -03:00
Manuel Alejandro de Brito Fontes
18c764cbac
Add dashboard for swap utilization per cluster and node
2022-07-19 19:40:14 -03:00
ArthurSens
735a30899f
Update Preview env's dashboard with new metrics
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-07-18 16:39:13 +02:00
Pudong Zheng
25c5bfbecb
[alerts] change alert for adding new nodes rapidly to only count if node type is regular workspace
2022-07-18 03:40:13 +02:00
Nandaja Varma
d13bdfd0cd
Improve GitpodWorkspaceTooManyRegularNotActive alert
2022-07-11 06:09:58 +05:30
utam0k
04d945d216
observability: Add an alert for AutoscaleFailure.
2022-07-06 00:36:52 +05:30
JenTing Hsiao
7800a21c4d
[alerts] fix pod/container/namespace not rendering
...
Every time series is uniquely identified by its metric name and
a set of labels, and every unique combination of key-value label pairs
represents a new alert for that time series.
There is no common value for these metrics:
- kube_pod_container_status_restarts_total
- gitpod_ws_manager_workspace_backups_failure_total
Signed-off-by: JenTing Hsiao <hsiaoairplane@gmail.com>
2022-07-01 06:23:39 +05:30
Arthur Silva Sens
028ef2608b
Update overview.json
2022-06-27 13:46:36 +05:30
utam0k
6c2705fbe4
observability: Ring the phone only when data loss occurs with GitpodWsDaemonCrashLooping
2022-06-23 19:06:32 +05:30
Pavel Tumik
cf35903aff
Apply suggestions from code review
2022-06-23 02:46:31 +05:30
Pavel Tumik
7e0fe457fb
Apply suggestions from code review
2022-06-23 02:46:31 +05:30
utam0k
62859996d5
observability: Add GitpodWorkspaceTooLongTerminating alert.
2022-06-23 02:46:31 +05:30
ArthurSens
9be43de166
Add SLIs to preview-environment dashboard
...
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
2022-06-22 12:15:31 +05:30