Thomas Schubart
|
7c41572bc9
|
[wsman-mk2] Remove mk1 from workspace success (#17497)
|
2023-05-25 17:37:59 +08:00 |
|
Wouter Verlaek
|
3bdd523046
|
[dashboards] Fix min/avg/max-over-time aggregate (#17536)
|
2023-05-08 20:48:46 +08:00 |
|
Thomas Schubart
|
44e5152607
|
[wsman-mk2] Make mk2 dashboard the default (#17496)
|
2023-05-04 17:59:42 +08:00 |
|
Thomas Schubart
|
e8a3c4e3bc
|
[obs] Include p01, p25 and p75 in success criteria (#17468)
* [obs] Include p25 and p75 in success criteria
* [obs] Include p01 workspace startup
|
2023-05-03 17:23:41 +08:00 |
|
Kyle Brennan
|
4cd2d5519f
|
[obs] change startup time rate to rate interval (#17453)
A rate of 5m makes the graph too dense, and it doesn't match the overview dashboard's heatmap.
|
2023-05-02 22:56:40 +08:00 |
|
Kyle Brennan
|
e952b75dd6
|
[obs] right align image builds in the Running Workspaces pane (#17454)
Without this change, it becomes difficult to see image builds, because they're plotted against the left Y axis rather than the right.
|
2023-05-02 19:44:40 +08:00 |
|
Thomas Schubart
|
958db8be9a
|
[obs] Fix casing of names (#17418)
|
2023-04-27 22:23:35 +08:00 |
|
Thomas Schubart
|
bea298ae17
|
[wsman-mk2] Include in workspace success criteria (#17375)
|
2023-04-27 03:39:35 +08:00 |
|
Thomas Schubart
|
307a7502e1
|
[wsman-mk2] Add overview dashboard (#17368)
|
2023-04-27 03:38:35 +08:00 |
|
Thomas Schubart
|
c55d1f911e
|
[wsman-mk2] Add alerts for ws-manager-mk2 (#17362)
|
2023-04-26 00:07:46 +08:00 |
|
Milan Pavlik
|
b371e258ed
|
Remove payment endpoint component (#17278)
* Remove payment endpoint component
* fix
* Fix
|
2023-04-18 22:15:51 +08:00 |
|
Wouter Verlaek
|
7050e289b4
|
[ws-manager-mk2] Dashboard controller heatmaps (WKS-21) (#17093)
* [ws-manager-mk2] Dashboard controller heatmaps
* [ws-daemon] Use heatmaps
|
2023-04-03 10:28:43 +02:00 |
|
Wouter Verlaek
|
e7b89d60d6
|
[ws-manager-mk2] Dashboard improvements (#17120)
|
2023-03-31 23:32:41 +02:00 |
|
Gero Posmyk-Leinemann
|
0095dcefd8
|
[prometheus] Remove references to db_write (#17041)
|
2023-03-28 20:49:26 +02:00 |
|
Huiwen
|
d9f1988f81
|
[observability] add supervisor dashboard (#17031)
|
2023-03-25 14:24:23 +01:00 |
|
Milan Pavlik
|
dbc8574c50
|
[redis] Adjust dashboard to include variables for instances (#16963)
|
2023-03-22 10:04:13 +01:00 |
|
Milan Pavlik
|
55719ede91
|
[redis] Add grafana dashboard (#16939)
|
2023-03-21 14:02:14 +01:00 |
|
Huiwen
|
77764b8d83
|
[observability] add supervisor dashboard (#16853)
|
2023-03-17 09:27:09 +01:00 |
|
Milan Pavlik
|
b3ca36eb2b
|
[spicedb] Add grafana dashboard (#16722)
|
2023-03-09 09:10:45 +01:00 |
|
Wouter Verlaek
|
a9810d6a0a
|
[ws-manager-mk2] Fix race where pod gets recreated in Stopped phase (#16622)
* [ws-manager-mk2] Fix race where pod gets recreated in Stopped phase
* [ws-manager-mk2] Add pod creation logs
* Change to Patch
|
2023-03-02 13:27:59 +01:00 |
|
Wouter Verlaek
|
cf0dd5571f
|
[ws-manager-mk2] Show start failures in dashboard, show daemon ctrl metrics (#16612)
|
2023-03-01 12:13:58 +01:00 |
|
Milan Pavlik
|
5317b53ef8
|
[db-sync] Remove comment references (#16602)
|
2023-03-01 11:06:58 +01:00 |
|
Milan Pavlik
|
dade6f7e9f
|
[db-sync] Remove alerts and dashboards (#16584)
|
2023-02-28 13:46:58 +01:00 |
|
Wouter Verlaek
|
d827a2b9dd
|
[ws-manager-mk2] Add queue depth and work duration panels (#16555)
|
2023-02-24 13:47:54 +01:00 |
|
Wouter Verlaek
|
733c37b2f8
|
[ws-manager-mk2] Import dashboard (#16532)
|
2023-02-23 15:12:53 +01:00 |
|
Wouter Verlaek
|
7440f00796
|
[ws-manager-mk2] Add Grafana dashboard (#16455)
* [ws-manager-mk2] Add Grafana dashboard
* [ws-manager-mk2] Add reconciliations by controller panel
|
2023-02-23 00:19:52 +01:00 |
|
Milan Pavlik
|
a02a5d9db8
|
[alert] Page on failing workspace starts
|
2023-02-17 13:23:21 +01:00 |
|
Kyle Brennan
|
598b5372e8
|
[obs] Refactor alerts for image builds
For the last 30 days:
GitpodImagebuildDoneSuccess would have triggered once, on January 26 if set to 2h, instead of 4h. A customer was potentially struggling with the outer loop. We hit a 60% error rate in Pyrra briefly: https://pyrra.gitpod.io/objectives?expr={__name__=%22workspace-imagebuild-buildsdone-success-ratio%22,%20namespace=%22monitoring-central%22,%20team=%22workspace%22}&grouping={}&from=1673297716785&to=1675716916785
GitpodImagebuildStartSuccess would have fired once, on Jan 8, when GCP was having scaling issues, and would have been correct to do so. https://gitpod.slack.com/archives/C01TNS8EVQT/p1673173223060219
Removed the warnings because they're unnecessary. Why? Pyrra sends them now for SLOs to #team-workspace-alerts.
|
2023-02-16 14:51:21 +01:00 |
|
Milan Pavlik
|
7a8f76f9e5
|
[ws-man-bridge] Adjust CPU alert to provide better signal
|
2023-02-16 14:17:20 +01:00 |
|
Milan Pavlik
|
994debf5c0
|
[dashboard] k8s applications
|
2023-02-16 08:56:21 +01:00 |
|
Kyle Brennan
|
fc1b4af8e0
|
[obs] Temporarily avoid imageBuildFailure reason
Why? This alert fires too often / is generally a false positive. In other words, in its current form, it's not a signal of a system failure.
|
2023-02-07 07:52:45 +01:00 |
|
Milan Pavlik
|
4628ccb5e6
|
[grafana] Cleanup server component dashboard
|
2023-01-27 16:27:34 +01:00 |
|
Milan Pavlik
|
961a3c33ed
|
[alerts] Exclude all of 2xx, 3xx, 4xx from JSON RPC API Error Rates
|
2023-01-25 16:37:32 +01:00 |
|
Milan Pavlik
|
8b88c8f99d
|
[dashboards] Fix double comma
|
2023-01-25 16:15:33 +01:00 |
|
Milan Pavlik
|
324b8d4950
|
[dashboard] Migrate server dashboard to timeseries visualization
|
2023-01-25 14:31:33 +01:00 |
|
Milan Pavlik
|
63817fdff0
|
[alerts] Reduce trigger duration for Stripe Webhook Failure alert
|
2023-01-23 11:45:30 +01:00 |
|
utam0k
|
33e6d1f540
|
obs: Make AutoscaleFailure ago down to warning level
|
2023-01-20 06:20:27 +01:00 |
|
Milan Pavlik
|
51c4adf124
|
[obs] Adjust CPU alert thresholds for webapp services
|
2023-01-18 15:07:26 +01:00 |
|
Milan Pavlik
|
dec43f11fe
|
[obs] Fix webapp monitoring rule names
|
2023-01-18 14:25:25 +01:00 |
|
Milan Pavlik
|
0ceaa6532f
|
[webapp] Group CPU alerts by deployment
|
2023-01-17 10:06:25 +01:00 |
|
Wouter Verlaek
|
b32eb221e7
|
Switch image builds axis on overview dashboard
|
2023-01-12 19:34:52 +01:00 |
|
Wouter Verlaek
|
e3ce970423
|
[observability] Add image build rate panels
|
2023-01-09 17:00:48 +01:00 |
|
Kyle Brennan
|
f08784fbc8
|
[obs] fix image-builder-mk3 dashboard
|
2022-12-26 02:24:34 +01:00 |
|
Kyle Brennan
|
c01d43b809
|
[obs] move blobserve from Workspace to IDE
|
2022-12-26 02:22:34 +01:00 |
|
ArthurSens
|
5d96084625
|
Delete unused PrometheusRules
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
|
2022-12-14 04:38:23 -03:00 |
|
mustard
|
0576091fe1
|
[observability] add job variable for grpc client
|
2022-12-14 03:53:23 -03:00 |
|
Andrea Falzetti
|
729e0d8aa7
|
[ide-service]: update grafana dashboard
Co-authored-by: Victor Nogueira <victor@gitpod.io>
|
2022-12-09 06:56:18 -03:00 |
|
Pudong Zheng
|
fc6355a8d2
|
[observability] fix datasource in preview environment
|
2022-12-09 06:54:19 -03:00 |
|
Christian Weichel
|
478a75e744
|
Switch license to AGPL
|
2022-12-08 13:05:19 -03:00 |
|
Pudong Zheng
|
422c7cb690
|
[observability] fix ide-service dashboard
|
2022-12-08 05:37:18 -03:00 |
|