GeoServer Cloud Monitoring with Prometheus + Grafana
This guide explains how to monitor your GeoServer Cloud deployment using Prometheus and Grafana.
Overview
GeoServer Cloud applications expose Spring Boot Actuator endpoints on port 8081, including:
- /actuator/health - Health check endpoints
- /actuator/metrics - Micrometer metrics
- /actuator/prometheus - Prometheus-formatted metrics
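For orientation, /actuator/prometheus returns plain-text Prometheus exposition format. An abridged excerpt with made-up values (metric names come from Micrometer; labels such as the uri value vary per service):
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Old Gen"} 2.68435456E8
jvm_threads_live_threads 87.0
http_server_requests_seconds_count{method="GET",outcome="SUCCESS",status="200",uri="/ows"} 1423.0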
The monitoring stack includes:
- Prometheus - Time-series database for metrics collection (port 9091)
- Grafana - Visualization and dashboards (port 3000)
Purpose and Scope
This monitoring setup provides basic observability for local development and serves as a starting point for building production monitoring infrastructure.
What this is:
- Simple Prometheus/Grafana stack for development environments
- Basic dashboard showing essential metrics (JVM, HTTP, service health)
- Foundation and reference implementation for custom observability solutions
- Example of Eureka-based service discovery integration
What this is NOT:
- Production-ready monitoring (no alerting, persistence config, security hardening)
- Comprehensive dashboard suite (intentionally kept simple)
- Replacement for enterprise observability platforms
For production deployments, you should:
- Customize dashboards for your specific needs and SLAs
- Configure alerting rules and notification channels
- Implement persistent storage with appropriate retention policies (see the sketch after this list)
- Integrate with your existing observability stack (Datadog, New Relic, etc.)
- Add security (authentication, TLS, network policies)
- Consider distributed tracing (OpenTelemetry, Jaeger, Zipkin)
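For instance, the persistence item above can be addressed directly in Compose. A minimal sketch, assuming the Prometheus service in compose/monitoring.yml is named prometheus (verify the actual service and volume names before reusing it):
services:
  prometheus:
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.retention.time=30d    # keep 30 days of metrics
    volumes:
      - ./prometheus-eureka.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus          # persist the TSDB across restarts

volumes:
  prometheus-data: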
Quick Start
1. Start GeoServer Cloud with Monitoring
Add the monitoring.yml file when starting your environment:
cd compose
# For pgconfig backend with monitoring:
./pgconfig -f monitoring.yml up -d
# For datadir backend with monitoring:
./datadir -f monitoring.yml up -d
This starts all GeoServer Cloud services plus Prometheus and Grafana.
2. Access the Monitoring Tools
- Grafana Dashboard: http://localhost:3000
  - Default credentials: admin/admin (you'll be prompted to change on first login)
- Prometheus UI: http://localhost:9091
  - View targets, explore metrics, test queries
3. View Pre-configured Dashboard
In Grafana:
- Navigate to Dashboards → GeoServer Cloud Overview
- This basic starter dashboard shows:
- Service availability (up/down status)
- JVM heap memory usage
- CPU usage per service
- HTTP request rate
- HTTP request duration
- JVM thread count
Note: This dashboard is intentionally simple and serves as a starting point. You should customize it for your specific monitoring needs, add alerting rules, and create additional dashboards as required.
Dashboard Preview
The dashboard provides basic observability metrics for development and debugging. Customize it to suit your production monitoring requirements.
Available Metrics
All Spring Boot services expose these metric categories:
JVM Metrics
- jvm_memory_used_bytes - Memory usage by area (heap, non-heap)
- jvm_memory_max_bytes - Maximum memory
- jvm_threads_live_threads - Thread count
- jvm_gc_pause_seconds - Garbage collection pauses
- jvm_classes_loaded_classes - Loaded class count
System Metrics
- system_cpu_usage - System CPU usage (0-1)
- process_cpu_usage - Process CPU usage (0-1)
- system_load_average_1m - System load average
HTTP Metrics
- http_server_requests_seconds_count - Request count
- http_server_requests_seconds_sum - Total request duration
- http_server_requests_seconds_max - Maximum request duration
Labeled by: service, uri, method, status, outcome
Tomcat Metrics
- tomcat_threads_current_threads - Current thread count
- tomcat_threads_busy_threads - Busy threads
- tomcat_sessions_active_current_sessions - Active sessions
Database Connection Pool (HikariCP)
- hikaricp_connections_active - Active connections
- hikaricp_connections_idle - Idle connections
- hikaricp_connections_pending - Pending connection requests
- hikaricp_connections_max - Maximum pool size
GeoServer-specific Metrics
- geoserver_metrics_* - Custom GeoServer metrics (if enabled)
RabbitMQ Metrics
The RabbitMQ management plugin exposes metrics at http://rabbitmq:15692/metrics
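If you ever need to add or adjust that scrape job yourself, a static target along these lines works (a sketch; the job name is arbitrary and the hostname assumes the Compose service is called rabbitmq):
scrape_configs:
  - job_name: rabbitmq
    static_configs:
      - targets: ['rabbitmq:15692']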
Exploring Metrics
Using Prometheus UI
- Go to http://localhost:9091
- Click Status → Targets to see all scraped services
- Click Graph to explore metrics with PromQL
Example queries:
# Services that are down
up == 0
# Per-second request rate (averaged over the last 5 minutes)
rate(http_server_requests_seconds_count[5m])
# 95th percentile request duration
histogram_quantile(0.95, rate(http_server_requests_seconds_bucket[5m]))
# JVM heap usage percentage
jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"} * 100
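A couple more queries can be handy when debugging, for example an error-rate query built on the status label listed under HTTP Metrics above:
# HTTP 5xx error rate per service (requests/sec)
sum by (service) (rate(http_server_requests_seconds_count{status=~"5.."}[5m]))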
Creating Custom Dashboards
In Grafana:
- Click + → Create Dashboard
- Add panels with Prometheus queries
- Use the Prometheus data source (pre-configured)
Checking Actuator Endpoints Directly
You can also access actuator endpoints directly from the host:
# Check health of a specific service (services don't expose 8081 to host by default)
# Note: container name depends on your backend (pgconfig, datadir, etc)
docker exec -it gscloud_dev_pgconfig-wms-1 curl localhost:8081/actuator/health
# Or access via the gateway (if configured to proxy actuator endpoints)
curl http://localhost:9090/geoserver/cloud/wms/actuator/health
Scaling Services
The monitoring setup uses Eureka-based service discovery to find all service replicas automatically! 🎉
How It Works
All GeoServer Cloud services register themselves with the Eureka discovery service. Prometheus:
- Queries the Eureka service registry every 30 seconds
- Discovers all instances of each service, including all scaled replicas
- Scrapes metrics from each instance independently
- Labels each instance with service name, instance ID, and hostname
This works perfectly with Docker Compose scaling because Eureka tracks every individual container that registers.
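For reference, the Eureka-based scrape job looks roughly like the sketch below; the discovery hostname and the exact relabeling rules are assumptions, so check the actual prometheus-eureka.yml:
scrape_configs:
  - job_name: geoserver-cloud-services
    eureka_sd_configs:
      - server: http://discovery:8761/eureka
        refresh_interval: 30s
    relabel_configs:
      # Derive a "service" label from the Eureka application name
      # (the real config may normalize it further)
      - source_labels: [__meta_eureka_app_name]
        target_label: service
      # Scrape the management port (8081) instead of the service port
      - source_labels: [__meta_eureka_app_instance_hostname]
        target_label: __address__
        replacement: "${1}:8081"
      # Metrics live under the actuator path
      - target_label: __metrics_path__
        replacement: /actuator/prometheus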
How to Scale
Scale any service to multiple replicas:
./pgconfig -f monitoring.yml up -d --scale wms=3 --scale wfs=2
Prometheus will automatically:
- Discover all 3 WMS replicas from Eureka
- Discover both WFS replicas from Eureka
- Scrape metrics from each replica independently
- Label each with a unique instance_id
Viewing Scaled Services
In Prometheus:
- Go to Status → Targets
- You'll see one target per replica (e.g., 3 WMS targets if scaled to 3)
- Each target has labels: service, instance_id, hostname, application
- Go to Status → Service Discovery to see Eureka discovery in action
In Grafana:
- Metrics are collected from all replicas simultaneously
- Use sum by (service) to aggregate across all replicas of a service
- Use the instance_id label to filter or view specific replicas
- Use hostname to identify individual containers
Understanding the Service Availability Dashboard:
The "Service Availability (Replica Count)" table shows:
- Service: The service name (wms, wfs, wcs, etc.)
- Replicas UP: Number of healthy replicas running
- Replicas DOWN: Number of unhealthy replicas (if any)
When you scale WMS to 3 replicas, you'll see:
Service | Replicas UP | Replicas DOWN
wms | 3 | 0
This is the correct behavior! Each replica is an independent instance that:
- Runs in its own container
- Registers independently with Eureka
- Has its own health status
- Can fail independently
Prometheus tracks each replica separately, allowing you to:
- Monitor individual replica health
- Detect partial failures (e.g., 2/3 replicas healthy)
- View metrics per replica or aggregated
Example Queries:
Total request rate across all WMS replicas:
sum(rate(http_server_requests_seconds_count{service="wms"}[5m]))
Request rate per WMS replica:
rate(http_server_requests_seconds_count{service="wms"}[5m])
Memory usage of a specific replica:
jvm_memory_used_bytes{service="wms", instance_id="wms-service:172.18.0.5:8080", area="heap"}
Count of healthy replicas per service:
count by (service) (up{job="geoserver-cloud-services"} == 1)
Configuration
Prometheus Configuration
The setup includes two Prometheus configurations:
- prometheus-eureka.yml (default, recommended)
  - Uses Eureka service discovery
  - Automatically discovers all replicas
  - Works well with scaling
  - Requires the Eureka discovery service to be running
- prometheus.yml (fallback)
  - Uses DNS-based service discovery (tasks.<service>)
  - Limited replica discovery in Docker Compose
  - Use if Eureka is disabled (see the sketch after this list)
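A sketch of what the DNS-based fallback job can look like (service list abridged; assumes the management port 8081 and the actuator metrics path):
scrape_configs:
  - job_name: geoserver-cloud-services
    metrics_path: /actuator/prometheus
    dns_sd_configs:
      - names: ['tasks.wms', 'tasks.wfs', 'tasks.wcs']
        type: A
        port: 8081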
To switch configurations, edit compose/monitoring.yml and change the volume mount:
volumes:
  # For Eureka-based discovery (default):
  - ./prometheus-eureka.yml:/etc/prometheus/prometheus.yml:ro
  # OR for DNS-based discovery:
  # - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
To customize, edit the active configuration file:
- Adjust scrape intervals (default: 15s)
- Add more static targets
- Configure alerting rules (see the example after this list)
- Modify relabeling rules
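As an example of the alerting item above, a minimal rule file could look like this (thresholds and file name are placeholders; you also need to load it via rule_files in the Prometheus config and set up Alertmanager for notifications):
# alert-rules.yml
groups:
  - name: geoserver-cloud
    rules:
      - alert: GeoServerServiceDown
        expr: up{job="geoserver-cloud-services"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.service }} instance {{ $labels.instance }} has been down for 2 minutes"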
Grafana Configuration
- Datasources: compose/grafana/provisioning/datasources/
- Dashboards: compose/grafana/provisioning/dashboards/
Add .json dashboard files to the dashboards directory - they'll be automatically loaded.
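If you want to hand-write one, this is roughly the minimum a dashboard file needs (a trimmed sketch; in practice it is easier to build the dashboard in the UI and export its JSON via Share → Export):
{
  "uid": "gscloud-custom",
  "title": "GeoServer Cloud - Custom",
  "schemaVersion": 39,
  "panels": [
    {
      "id": 1,
      "type": "timeseries",
      "title": "Total WMS request rate",
      "gridPos": { "x": 0, "y": 0, "w": 12, "h": 8 },
      "targets": [
        { "refId": "A", "expr": "sum(rate(http_server_requests_seconds_count{service=\"wms\"}[5m]))" }
      ]
    }
  ]
}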
Customizing Credentials
Set environment variables before starting:
export GRAFANA_USER=myuser
export GRAFANA_PASSWORD=mypassword
./pgconfig -f monitoring.yml up -d
Monitoring Additional Components
PostgreSQL
To monitor PostgreSQL, add the postgres_exporter:
- Uncomment the postgres section in compose/monitoring.yml
- Add this service:
postgres-exporter:
  image: prometheuscommunity/postgres-exporter:latest
  environment:
    DATA_SOURCE_NAME: "postgresql://geoserver:geoserver@geodatabase:5432/geoserver?sslmode=disable"
  ports:
    - "9187:9187"
  depends_on:
    - geodatabase
- Uncomment the postgres job in compose/prometheus.yml
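That job presumably looks something like the following (a sketch; check the file for the actual job name and target):
  - job_name: postgres
    static_configs:
      - targets: ['postgres-exporter:9187']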
RabbitMQ
RabbitMQ metrics are already configured if the management plugin is enabled (it is by default).
Stopping the Monitoring Stack
# For pgconfig backend:
./pgconfig -f monitoring.yml down
# For datadir backend:
./datadir -f monitoring.yml down
To also remove volumes (WARNING: deletes all data including Prometheus metrics and Grafana dashboards):
./pgconfig -f monitoring.yml down -v
# or
./datadir -f monitoring.yml down -v
Troubleshooting
Not Seeing All Replicas in Prometheus (Eureka Discovery)
If you're using Eureka discovery but only see one instance per service:
- Check the Eureka service registry:
  - Go to http://localhost:8761 (Eureka console)
  - Verify all service instances are registered
  - Look for multiple instances of the scaled service
- Verify services are registering:
  # Check if WMS instances are registered
  curl http://localhost:8761/eureka/apps/WMS-SERVICE | grep -o "<instance>.*</instance>"
- Check Prometheus service discovery:
  - Go to http://localhost:9091/service-discovery
  - Look for the eureka_sd_configs section
  - Verify discovered targets match the Eureka registry
- Check Prometheus logs:
  ./pgconfig -f monitoring.yml logs prometheus | grep -i eureka
Using DNS Discovery Instead
If Eureka discovery isn't working or you prefer DNS:
- Edit compose/monitoring.yml
- Change the volume mount to use prometheus.yml instead of prometheus-eureka.yml
- Restart: ./pgconfig -f monitoring.yml restart prometheus
Note: DNS discovery has limitations - see the configuration section for details.
Services showing as "Down" in Prometheus
- Check if actuator endpoints are accessible:
  docker exec -it gscloud_dev_pgconfig-wms-1 curl localhost:8081/actuator/prometheus
- Check Prometheus logs:
  ./pgconfig -f monitoring.yml logs prometheus
- Verify services are on the same Docker network:
  docker network inspect gscloud_dev_pgconfig_default
Grafana Dashboard Not Loading
- Check Grafana logs:
  ./pgconfig -f monitoring.yml logs grafana
- Verify the Prometheus datasource:
  - Go to Configuration → Data Sources
  - Test the Prometheus connection
High Memory Usage
The monitoring tools themselves consume resources. Adjust limits in compose/monitoring.yml:
deploy:
  resources:
    limits:
      memory: 256M  # Reduce if needed
Additional Resources
- Spring Boot Actuator Documentation
- Micrometer Documentation
- Prometheus Documentation
- Grafana Documentation
- Pre-built Grafana Dashboards
- Dashboard ID 4701: JVM (Micrometer)
- Dashboard ID 11378: Spring Boot Statistics
