bootjp/cloudflare-gslb

Cloudflare GSLB

A Global Server Load Balancing (GSLB) system that provides health checks and automatic failover for Cloudflare DNS records.

Features

  • Health checks for A and AAAA records
  • HTTPS health checks (with customizable paths and hostnames)
  • HTTP health checks (with customizable paths and hostnames)
  • ICMP health checks
  • Automatic DNS record replacement upon anomaly detection
  • Configurable check intervals
  • Priority-based failover with explicit priority levels (higher value = higher priority)
  • DNS round-robin across multiple IPs at the same priority level
  • Cloudflare proxy settings for each origin
  • One-shot mode for batch health checks via CLI or Docker container
  • Multiple zone support - Monitor and manage DNS records across multiple Cloudflare zones
  • Configuration migration tool - Convert legacy configs to the new priority-based format
  • Failover notifications - Send notifications to Slack and Discord webhooks when failover events occur

Installation

git clone https://github.com/bootjp/cloudflare-gslb.git
cd cloudflare-gslb
go build -o gslb ./cmd/gslb

Configuration

Cloudflare GSLB supports both JSON and YAML configuration file formats. You can use whichever format you prefer.

JSON Configuration

Copy config.json.example to create config.json and configure the necessary settings.

cp config.json.example config.json

YAML Configuration

Alternatively, you can use YAML format by copying config.yaml.example:

cp config.yaml.example config.yaml

Example configuration file (JSON):

{
  "cloudflare_api_token": "YOUR_CLOUDFLARE_API_TOKEN",
  "check_interval_seconds": 60,
  "cloudflare_zones": [
    {
      "zone_id": "YOUR_CLOUDFLARE_ZONE_ID_1",
      "name": "example.com"
    },
    {
      "zone_id": "YOUR_CLOUDFLARE_ZONE_ID_2",
      "name": "example.org"
    }
  ],
  "notifications": [
    {
      "type": "slack",
      "webhook_url": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
    },
    {
      "type": "discord",
      "webhook_url": "https://discord.com/api/webhooks/YOUR/DISCORD/WEBHOOK"
    }
  ],
  "origins": [
    {
      "name": "www",
      "zone_name": "example.com",
      "record_type": "A",
      "health_check": {
        "type": "https",
        "endpoint": "/health",
        "host": "www.example.com",
        "timeout": 5,
        "headers": {
          "X-Health-Check": "gslb"
        }
      },
      "priority_levels": [
        {
          "priority": 100,
          "ips": [
            "192.168.1.1",
            "192.168.1.2"
          ]
        },
        {
          "priority": 50,
          "ips": [
            "192.168.1.3",
            "192.168.1.4"
          ]
        }
      ],
      "proxied": true,
      "return_to_priority": true
    },
    {
      "name": "api",
      "zone_name": "example.com",
      "record_type": "A",
      "health_check": {
        "type": "http",
        "endpoint": "/status",
        "host": "api.example.com",
        "timeout": 5,
        "headers": {
          "X-Health-Check": "gslb"
        }
      },
      "priority_levels": [
        {
          "priority": 100,
          "ips": [
            "10.0.0.1"
          ]
        },
        {
          "priority": 50,
          "ips": [
            "10.0.0.2",
            "10.0.0.3"
          ]
        }
      ],
      "proxied": true,
      "return_to_priority": true
    },
    {
      "name": "ipv6",
      "zone_name": "example.org",
      "record_type": "AAAA",
      "health_check": {
        "type": "icmp",
        "timeout": 5
      },
      "priority_levels": [
        {
          "priority": 100,
          "ips": [
            "2001:db8::1"
          ]
        },
        {
          "priority": 50,
          "ips": [
            "2001:db8::2",
            "2001:db8::3",
            "2001:db8::4"
          ]
        }
      ],
      "proxied": false,
      "return_to_priority": true
    }
  ]
}
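
To illustrate how the example above maps onto typed data, here is a minimal Go sketch that decodes it and checks that every origin's zone_name matches an entry in cloudflare_zones. The JSON keys come from the example; the Go type and function names (Config, Origin, parseConfig, and so on) are assumptions made for illustration and are not taken from the project's source. The notifications section is omitted for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical types: the JSON tags mirror the documented keys, the Go
// names are illustrative only.
type Zone struct {
	ZoneID string `json:"zone_id"`
	Name   string `json:"name"`
}

type HealthCheck struct {
	Type     string            `json:"type"`
	Endpoint string            `json:"endpoint"`
	Host     string            `json:"host"`
	Timeout  int               `json:"timeout"`
	Headers  map[string]string `json:"headers"`
}

type PriorityLevel struct {
	Priority int      `json:"priority"`
	IPs      []string `json:"ips"`
}

type Origin struct {
	Name             string          `json:"name"`
	ZoneName         string          `json:"zone_name"`
	RecordType       string          `json:"record_type"`
	HealthCheck      HealthCheck     `json:"health_check"`
	PriorityLevels   []PriorityLevel `json:"priority_levels"`
	Proxied          bool            `json:"proxied"`
	ReturnToPriority bool            `json:"return_to_priority"`
}

type Config struct {
	APIToken             string   `json:"cloudflare_api_token"`
	CheckIntervalSeconds int      `json:"check_interval_seconds"`
	Zones                []Zone   `json:"cloudflare_zones"`
	Origins              []Origin `json:"origins"`
}

// parseConfig decodes a JSON document and rejects origins whose zone_name
// does not match any name declared in cloudflare_zones.
func parseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("decode config: %w", err)
	}
	known := make(map[string]bool, len(cfg.Zones))
	for _, z := range cfg.Zones {
		known[z.Name] = true
	}
	for _, o := range cfg.Origins {
		if !known[o.ZoneName] {
			return nil, fmt.Errorf("origin %q: unknown zone_name %q", o.Name, o.ZoneName)
		}
	}
	return &cfg, nil
}
```

Calling parseConfig with the contents of config.json returns the decoded structure, or an error naming the first origin whose zone_name has no matching zone.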

Configuration Options

  • cloudflare_api_token: Cloudflare API token
  • check_interval_seconds: Health check interval (in seconds)
  • cloudflare_zones: Array of Cloudflare zones to manage
    • zone_id: Cloudflare zone ID
    • name: A name to identify this zone (used in zone_name field of origins)
  • notifications (optional): Array of notification configurations for failover events
    • type: Notification type (slack or discord)
    • webhook_url: Webhook URL for the notification service
  • origins: Array of origin configurations
    • name: DNS record name (without the zone part)
    • zone_name: The name of the zone this record belongs to (must match one of the names in cloudflare_zones)
    • record_type: DNS record type (A or AAAA). CNAME and other record types are not supported.
    • health_check: Health check configuration
      • type: Health check type (http, https, or icmp)
      • endpoint: HTTP/HTTPS endpoint path
      • host: HTTP/HTTPS host header
      • timeout: Health check timeout in seconds
      • insecure_skip_verify: Skip TLS verification for HTTPS checks
      • headers: Additional HTTP headers to include with health check requests
    • priority_levels: Priority-based IP groups (higher priority values are preferred)
      • priority: Priority value (higher = higher priority)
      • ips: List of IPs for DNS round-robin at that priority level
    • proxied: Whether to enable Cloudflare proxy for this record
    • return_to_priority: Whether to return to priority IPs when they become healthy again

Backward Compatibility

For backward compatibility, you can still use the old configuration format with a single zone:

{
  "cloudflare_api_token": "YOUR_CLOUDFLARE_API_TOKEN",
  "cloudflare_zone_id": "YOUR_CLOUDFLARE_ZONE_ID",
  "check_interval_seconds": 60,
  "origins": [
    ...
  ]
}

When using the old format, all origins will be associated with the single zone specified by cloudflare_zone_id.

Legacy priority_failover_ips and failover_ips fields are still supported, but they are deprecated in favor of priority_levels.

Migration Guide

Use the migration tool to convert legacy configurations (single zone or legacy failover IP fields) into the new priority_levels structure:

go build -o gslb-migrate ./cmd/migrate
./gslb-migrate -config config.json -out config.migrated.json

The generated file will include explicit priority_levels with default priorities:

  • priority_failover_ips → priority 100
  • failover_ips → priority 0
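
The mapping above can be sketched in Go as follows. The type and function names are hypothetical and the migration tool's real logic may differ; only the field names and the default priorities (100 and 0) come from this document.

```go
package main

// legacyOrigin holds the two deprecated fields (hypothetical type).
type legacyOrigin struct {
	PriorityFailoverIPs []string `json:"priority_failover_ips"`
	FailoverIPs         []string `json:"failover_ips"`
}

// priorityLevel matches one entry of priority_levels (hypothetical type).
type priorityLevel struct {
	Priority int      `json:"priority"`
	IPs      []string `json:"ips"`
}

// migrate converts the legacy fields into explicit priority levels using the
// documented defaults: priority_failover_ips -> 100, failover_ips -> 0.
// Empty legacy fields produce no level.
func migrate(o legacyOrigin) []priorityLevel {
	var levels []priorityLevel
	if len(o.PriorityFailoverIPs) > 0 {
		levels = append(levels, priorityLevel{Priority: 100, IPs: o.PriorityFailoverIPs})
	}
	if len(o.FailoverIPs) > 0 {
		levels = append(levels, priorityLevel{Priority: 0, IPs: o.FailoverIPs})
	}
	return levels
}
```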

Priority Levels Behavior

When priority_levels are configured, the system behaves as follows:

  1. It selects the highest priority level that has at least one healthy IP
  2. All IPs at the selected priority level are published for DNS round-robin
  3. If no IP in a priority level is healthy, the system falls back to the next lower priority level
  4. If return_to_priority: true, it will move back to higher priorities once they recover
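
As a rough illustration of the selection rule in step 1, the Go sketch below picks the highest priority level that has at least one healthy IP and returns all of that level's IPs, as in step 2. The names priorityLevel and selectLevel and the healthy callback are assumptions for illustration, not the project's API.

```go
package main

import "sort"

// priorityLevel matches one entry of priority_levels (hypothetical type).
type priorityLevel struct {
	Priority int
	IPs      []string
}

// selectLevel returns the level with the highest priority value that contains
// at least one IP for which healthy reports true. The second return value is
// false when no level has a healthy IP.
func selectLevel(levels []priorityLevel, healthy func(ip string) bool) (priorityLevel, bool) {
	// Sort a copy so the caller's slice order is left untouched.
	sorted := append([]priorityLevel(nil), levels...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})
	for _, lvl := range sorted {
		for _, ip := range lvl.IPs {
			if healthy(ip) {
				return lvl, true
			}
		}
	}
	return priorityLevel{}, false
}
```

With return_to_priority enabled, running this selection on every check interval would move the record back to a higher level as soon as one of its IPs passes the health check again.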

Utilizing Priority Levels

By combining multiple priority levels, you can optimize resource efficiency as follows:

  1. During normal operation, traffic is directed to the highest priority IPs (e.g., dedicated servers with fixed pricing)
  2. During outages, traffic is directed to lower priority IPs (e.g., cloud VMs with pay-as-you-go pricing)
  3. When higher priority IPs recover, traffic automatically returns to them (if return_to_priority: true)

This approach offers the following benefits:

  • Cost optimization during normal operation (prioritizing fixed-cost resources)
  • Availability assurance during outages (backup with pay-as-you-go resources)
  • Reduced operational burden with automatic failback upon recovery

About Proxy Settings

You can specify Cloudflare proxy settings individually for each origin:

  • With proxy enabled ("proxied": true):

    • Traffic passes through Cloudflare's network
    • Cloudflare security protections (WAF, DDoS protection, etc.) are applied
    • The origin server's IP address is masked
    • Modern protocols like HTTP/2 and TLS 1.3 become available
  • With proxy disabled ("proxied": false):

    • Traffic is sent directly to the origin server
    • Cloudflare security protections are not applied
    • The origin server's IP address is exposed
    • Suitable when using ICMP health checks or when direct connections are required

Notifications

Cloudflare GSLB can send notifications when failover events occur, so you can follow infrastructure health and failover activity in real time.

Supported Notification Services

  • Slack: Send notifications to Slack channels via webhook
  • Discord: Send notifications to Discord channels via webhook

Setting Up Notifications

Slack

  1. Create a Slack webhook URL:

    • Go to your Slack workspace settings
    • Navigate to "Apps" → "Incoming Webhooks"
    • Create a new webhook and select the channel
    • Copy the webhook URL
  2. Add the webhook URL to your config.json:

    "notifications": [
      {
        "type": "slack",
        "webhook_url": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
      }
    ]

Discord

  1. Create a Discord webhook URL:

    • Open your Discord server settings
    • Navigate to "Integrations" → "Webhooks"
    • Create a new webhook and select the channel
    • Copy the webhook URL
  2. Add the webhook URL to your config.json:

    "notifications": [
      {
        "type": "discord",
        "webhook_url": "https://discord.com/api/webhooks/YOUR/DISCORD/WEBHOOK"
      }
    ]

Multiple Notification Channels

You can configure multiple notification channels simultaneously. The system will send notifications to all configured channels:

"notifications": [
  {
    "type": "slack",
    "webhook_url": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
  },
  {
    "type": "discord",
    "webhook_url": "https://discord.com/api/webhooks/YOUR/DISCORD/WEBHOOK"
  }
]

Notification Events

Notifications are sent for the following events:

  • Failover to Backup IP: When a health check fails and the system switches to a backup IP
  • Failover to Priority IP: When switching from a backup IP to a priority IP
  • Recovery (Return to Priority): When a priority IP becomes healthy again and the system returns to it

Each notification includes:

  • Origin name and zone
  • Record type (A or AAAA)
  • Old IP address(es)
  • New IP address(es)
  • Event type
  • Reason for the failover
  • Timestamp

Usage

The application accepts both JSON and YAML configuration files. The file format is automatically detected based on the file extension (.json, .yaml, or .yml).

Using JSON configuration:

./gslb -config config.json

Using YAML configuration:

./gslb -config config.yaml

You can also specify an alternative configuration file path:

./gslb -config /path/to/your/config.json
# or
./gslb -config /path/to/your/config.yaml

Using a directory path:

If you provide a directory path, the application will automatically search for configuration files in this order:

  1. config.yaml
  2. config.yml
  3. config.json

./gslb -config /path/to/config/directory

One-shot Mode

One-shot mode performs health checks and necessary failovers once without running continuously:

./cloudflare-gslb-oneshot -config config.json
# or with YAML
./cloudflare-gslb-oneshot -config config.yaml

This is useful for:

  • Running health checks via cron jobs
  • Batch processing in CI/CD pipelines
  • Kubernetes CronJobs
  • Testing configuration

Docker Usage

The application is available as Docker images for both continuous and one-shot modes:

Continuous Mode

# With JSON configuration
docker run -v /path/to/your/config.json:/app/config/config.json ghcr.io/bootjp/cloudflare-gslb:main

# With YAML configuration
docker run -v /path/to/your/config.yaml:/app/config/config.yaml ghcr.io/bootjp/cloudflare-gslb:main

One-shot Mode

# With JSON configuration
docker run -v /path/to/your/config.json:/app/config/config.json ghcr.io/bootjp/cloudflare-gslb-oneshot:main

# With YAML configuration
docker run -v /path/to/your/config.yaml:/app/config/config.yaml ghcr.io/bootjp/cloudflare-gslb-oneshot:main

Both images are published for multiple architectures (amd64/x86_64 and arm64); Docker pulls the variant that matches the host automatically.

Testing

To run tests, use the following command:

go test ./...

For detailed output, add the -v option:

go test ./... -v

To generate a coverage report:

go test ./... -coverprofile=coverage.out
go tool cover -html=coverage.out

Important Notes

  • This tool requires a Cloudflare API token with DNS edit permissions for the managed zones.
  • ICMP health checks may require elevated privileges (root on many systems).
  • When the proxy is enabled, traffic is routed through Cloudflare's network, which may restrict certain protocols or configurations.
  • Test in a staging environment before using the tool in production.
  • Failover also works for records with the Cloudflare proxy disabled, as long as lower priority levels (failover IPs) are configured.

Limitations of Cloudflare DNS Round Robin

While Cloudflare advertises DNS Round Robin as a "zero-downtime" solution, it has a significant limitation: when using Cloudflare Proxy (orange cloud), DNS Round Robin does not properly fail over when a server fails.

When a server fails behind Cloudflare Proxy:

  1. The DNS Round Robin continues to include the failed server's IP in rotation
  2. Cloudflare's proxy attempts to connect to the failed server
  3. Users experience connection failures or timeouts when their requests are routed to the failed server

This occurs because the proxy layer masks the actual server failures from the DNS layer. To achieve true zero-downtime with Cloudflare services, consider using:

  • This GSLB solution, which actively monitors servers and updates DNS records
  • Cloudflare Load Balancers (a paid service that properly handles failover)
  • Server-side health checks with proper error handling

If you must use DNS Round Robin with proxied records, implement additional client-side retry logic to handle potential failures.
