Failover Testing Guide

Overview

Every IronWiFi Network provides primary and secondary RADIUS servers for redundancy. Failover testing verifies that your access points correctly switch to the secondary server when the primary is unreachable, and that they recover to the primary when it becomes available again. Testing failover before you need it prevents downtime during real incidents.

Why Test Failover?

| Risk | Consequence Without Testing |
| --- | --- |
| AP not configured with secondary RADIUS | Complete authentication outage when primary is unavailable |
| Failover timeout too long | Users experience 30+ second delays during failover |
| Recovery not automatic | Users stay on secondary even after primary recovers |
| Accounting lost during failover | Gaps in session data for compliance reporting |

Prerequisites

Before testing failover:

  1. Verify dual RADIUS configuration:

    • Log in to the IronWiFi Console
    • Navigate to Networks > select your Network
    • Record both the primary and secondary RADIUS server IPs, ports, and shared secret
  2. Verify AP configuration:

    • Confirm your AP or controller is configured with both primary and secondary RADIUS servers
    • See Configuration Guides for vendor-specific setup
  3. Create test accounts:

    • Create one or more test user accounts in IronWiFi
    • Verify they authenticate successfully before running failover tests
  4. Schedule a maintenance window:

    • Failover testing may temporarily disrupt authentication
    • Notify affected users if testing against a production Network

Warning: Test failover during a scheduled maintenance window or use a dedicated test Network. Failover testing will briefly interrupt authentication for users connected through the AP under test.

Understanding RADIUS Failover

How Failover Works

AP Failover Behavior

Most access points implement failover using these parameters:

| Parameter | Description | Typical Default |
| --- | --- | --- |
| RADIUS Timeout | Time to wait for a response before retrying | 5 seconds |
| RADIUS Retries | Number of retry attempts before failing over | 3 |
| Dead Time | How long to mark a server as dead before retrying | 300 seconds (5 min) |
| Failback | Whether the AP returns to the primary after recovery | Varies by vendor |

Failover timeline example:
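A minimal sketch of this timeline, assuming the AP sends one initial request plus the configured number of retries to the primary, waits the full timeout after each one, and only then tries the secondary. The function names are illustrative, and real AP firmware may count attempts differently.

```python
def failover_delay(timeout_s: float, retries: int) -> float:
    """Seconds spent waiting on an unreachable primary before failing over."""
    attempts = 1 + retries  # initial request plus retries
    return timeout_s * attempts


def failover_timeline(timeout_s: float, retries: int) -> list[str]:
    """Describe each step the AP takes while the primary is unreachable."""
    events = []
    for attempt in range(1 + retries):
        label = "initial request" if attempt == 0 else f"retry {attempt}"
        events.append(
            f"t={attempt * timeout_s:g}s  {label} sent to primary, no response"
        )
    delay = failover_delay(timeout_s, retries)
    events.append(f"t={delay:g}s  primary marked dead, request sent to secondary")
    return events


if __name__ == "__main__":
    # Typical defaults from the parameter table: 5 second timeout, 3 retries.
    for line in failover_timeline(5, 3):
        print(line)
```

Under these assumptions the typical defaults reach the secondary at t=20s, while a 3 second timeout with 2 retries reaches it at t=9s.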

Tip: Reduce the failover delay by configuring a shorter RADIUS timeout (3 seconds) and fewer retries (2). The AP then fails over after approximately 3 × (1 + 2) = 9 seconds, instead of 5 × (1 + 3) = 20 seconds with the typical defaults.

Test Procedures

Test 1: Verify Dual Server Configuration

Objective: Confirm the AP is configured with both RADIUS servers.

Steps:

  1. Log in to your AP controller
  2. Navigate to the RADIUS server configuration for your SSID
  3. Verify both entries:

| Setting | Primary | Secondary |
| --- | --- | --- |
| Server IP | (from IronWiFi Console) | (from IronWiFi Console) |
| Auth Port | (from IronWiFi Console) | (from IronWiFi Console) |
| Accounting Port | (from IronWiFi Console) | (from IronWiFi Console) |
| Shared Secret | (from IronWiFi Console) | Same as primary |

  4. If the secondary is missing, add it now and retest

Pass criteria: Both primary and secondary RADIUS servers are configured with correct settings.

Test 2: Normal Authentication Baseline

Objective: Verify authentication works normally before testing failover.

Steps:

  1. Connect a test device to the WiFi network

  2. Authenticate with valid credentials

  3. Verify the connection succeeds

  4. Check the IronWiFi Console:

    • Navigate to Logs > Authentication Logs
    • Confirm the authentication appears with an Accept result
    • Note the RADIUS server that handled the request (should be primary)
  5. Measure the authentication time (time from WiFi connection attempt to connected state)

Pass criteria: Authentication succeeds within normal latency (typically under 3 seconds).
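To separate RADIUS latency from Wi-Fi association time, the RADIUS round trip can also be timed directly from a test machine. The sketch below builds a minimal PAP Access-Request as described in RFC 2865 and times one request. It rests on several assumptions: the server accepts PAP requests from the machine's IP address, no Message-Authenticator attribute is required, and the reply's authenticator is not verified. The function names are illustrative and not part of any IronWiFi tooling.

```python
import hashlib
import os
import socket
import struct
import time

ACCESS_REQUEST, ACCESS_ACCEPT, ACCESS_REJECT = 1, 2, 3  # RADIUS packet codes


def hide_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Encode a PAP User-Password as described in RFC 2865, section 5.2."""
    # Pad with NULs to a multiple of 16 octets (at least one block).
    width = max(16, -(-len(password) // 16) * 16)
    padded = password.ljust(width, b"\x00")
    hidden, previous = b"", authenticator
    for i in range(0, len(padded), 16):
        digest = hashlib.md5(secret + previous).digest()
        previous = bytes(p ^ d for p, d in zip(padded[i:i + 16], digest))
        hidden += previous
    return hidden


def build_access_request(username: str, password: str, secret: str,
                         identifier: int = 0) -> bytes:
    """Build an Access-Request carrying only User-Name and User-Password."""
    authenticator = os.urandom(16)
    name = username.encode()
    hidden = hide_password(password.encode(), secret.encode(), authenticator)
    attributes = (bytes([1, 2 + len(name)]) + name          # User-Name (type 1)
                  + bytes([2, 2 + len(hidden)]) + hidden)   # User-Password (type 2)
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(attributes))
    return header + authenticator + attributes


def time_auth(server: str, port: int, packet: bytes, timeout_s: float = 3.0):
    """Send one request; return (reply code or None if no reply, elapsed seconds)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        start = time.monotonic()
        sock.sendto(packet, (server, port))
        try:
            reply, _ = sock.recvfrom(4096)
        except (socket.timeout, ConnectionError):
            return None, time.monotonic() - start
        return reply[0], time.monotonic() - start


# Example with placeholder values (use the values recorded from the IronWiFi Console):
# packet = build_access_request("test-user", "test-password", "shared-secret")
# code, elapsed = time_auth("<primary-ip>", <auth-port>, packet)
# print("accepted" if code == ACCESS_ACCEPT else code, f"{elapsed:.2f}s")
```

A reply code of 2 is an Access-Accept and 3 is an Access-Reject. A result of `None` means no reply arrived within the timeout, which is what the AP sees when the primary is blocked in Test 3.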

Test 3: Primary Server Failover

Objective: Verify the AP fails over to the secondary when the primary is unreachable.

Steps:

  1. Block the primary RADIUS server:

    • On your firewall, create a rule blocking traffic from the AP to the primary RADIUS server IP on the authentication port (and on the accounting port, if you plan to run Test 5)
    • Alternatively, change the AP's primary RADIUS server to an unreachable IP (e.g., 192.0.2.1, from the range reserved for documentation and testing)
  2. Attempt authentication:

    • Disconnect the test device
    • Reconnect and authenticate
  3. Observe failover:

    • The AP should time out on the primary and fail over to the secondary
    • Authentication should succeed via the secondary server
    • Note the total time from connection attempt to success
  4. Verify in IronWiFi Console:

    • Navigate to Logs > Authentication Logs
    • Confirm the authentication was handled by the secondary server

Pass criteria:

  • Authentication succeeds via the secondary server
  • Total failover time is within acceptable limits (under 30 seconds recommended)

Record the results:

| Metric | Value |
| --- | --- |
| Failover time | _____ seconds |
| Authentication result | Accept / Reject |
| Server used | Secondary |
| User experience | Acceptable / Unacceptable |

Test 4: Authentication During Failover

Objective: Verify that multiple users can authenticate through the secondary server.

Steps:

  1. Keep the primary blocked (from Test 3)
  2. Connect multiple test devices simultaneously
  3. Verify all devices authenticate successfully via the secondary
  4. Check for any timeouts or rejections

Pass criteria: All devices authenticate successfully through the secondary server.

Test 5: Accounting During Failover

Objective: Verify that RADIUS accounting continues during failover.

Steps:

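  1. Keep the primary blocked, and make sure the block also covers the accounting port so that accounting traffic has to fail over as well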
  2. Authenticate a test device
  3. Use the device for a few minutes (generate some traffic)
  4. Check the IronWiFi Console for accounting records:
    • Navigate to Logs > Accounting
    • Verify session start, interim updates, and data usage are recorded

Pass criteria: Accounting data is present in IronWiFi logs during the failover period.
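If the accounting records for the test session are exported for review, the three checks in step 4 can be automated. This is a sketch only: the record layout (a list of dictionaries with `status`, `input_octets`, and `output_octets` keys) is a hypothetical export format, not a documented IronWiFi schema. The status names follow the Acct-Status-Type values defined in RFC 2866.

```python
def check_session(records: list[dict]) -> list[str]:
    """Return a list of problems found in one session's accounting records."""
    problems = []
    statuses = {record["status"] for record in records}
    if "Start" not in statuses:
        problems.append("no Start record")
    if "Interim-Update" not in statuses:
        problems.append("no Interim-Update record")
    # Octet counters are cumulative, so the largest value is the session total.
    total_octets = max(
        (r.get("input_octets", 0) + r.get("output_octets", 0) for r in records),
        default=0,
    )
    if total_octets == 0:
        problems.append("no data usage recorded")
    return problems


# Hypothetical records for one test session captured during the failover period.
session = [
    {"status": "Start"},
    {"status": "Interim-Update", "input_octets": 12000, "output_octets": 3400},
]
print(check_session(session) or "accounting looks complete")
```

An empty list means the session has a start record, at least one interim update, and non-zero data usage.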

Test 6: Recovery to Primary

Objective: Verify the AP returns to the primary server after it becomes available.

Steps:

  1. Restore the primary RADIUS server:

    • Remove the firewall block from Test 3
    • Or restore the correct primary RADIUS server IP on the AP
  2. Wait for the dead time to expire:

    • Check your AP's dead time setting (typically 300 seconds)
    • Wait for this period to pass
  3. Trigger a new authentication:

    • Disconnect and reconnect the test device
  4. Verify recovery:

    • Check the IronWiFi Console authentication logs
    • The authentication should be handled by the primary server again

Pass criteria: The AP returns to the primary RADIUS server after it becomes available.

Note: Some APs do not automatically fail back to the primary. They continue using the secondary until manually reconfigured or until the AP is restarted. Check your AP vendor's documentation for failback behavior.

Test 7: Rapid Failover and Recovery

Objective: Verify behavior during rapid primary server flapping.

Steps:

  1. Block the primary RADIUS server
  2. Authenticate a device (fails over to secondary)
  3. Unblock the primary
  4. Wait 60 seconds (less than typical dead time)
  5. Block the primary again
  6. Authenticate another device
  7. Verify consistent behavior

Pass criteria: The AP handles rapid changes predictably without getting stuck in an indeterminate state.
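To reason about what "predictable" looks like, it helps to model the dead time explicitly. The toy model below assumes the AP skips the primary entirely until the dead time expires, then tries it again. The class and method names are hypothetical, and actual vendor behavior varies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerSelector:
    """Toy model of an AP that skips the primary while it is marked dead."""

    dead_time_s: float = 300.0
    dead_until: Optional[float] = None

    def choose(self, now: float) -> str:
        """Return which server the AP would contact at time `now` (seconds)."""
        if self.dead_until is not None and now < self.dead_until:
            return "secondary"
        return "primary"

    def report_timeout(self, now: float) -> None:
        """Record that the primary exhausted its retries at time `now`."""
        self.dead_until = now + self.dead_time_s


selector = ServerSelector(dead_time_s=300)
selector.report_timeout(now=20)   # steps 1-2: retries exhausted at t=20s
print(selector.choose(now=80))    # steps 3-6: 60 s later, prints "secondary"
print(selector.choose(now=320))   # dead time expired, prints "primary"
```

Under this model the AP never contacts the primary during steps 3 to 6, so the second block goes unnoticed and the device in step 6 should authenticate through the secondary without a new failover delay. An AP that behaves differently, for example by probing the primary early, would show a delay at step 6 instead.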

Vendor-Specific Failover Configuration

Cisco Meraki

Meraki proactively tests RADIUS server health and automatically manages failover.

Ubiquiti UniFi

UniFi fails over based on timeout. Dead time is managed by the controller.

Aruba

MikroTik

Ruckus

For detailed vendor configuration, see Configuration Guides.

Failover Optimization

Reducing Failover Time

| Setting | Recommended Value | Impact |
| --- | --- | --- |
| RADIUS Timeout | 3 seconds | Faster detection of unreachable server |
| RADIUS Retries | 2 | Fewer retries before failover |
| Dead Time | 120 seconds | Faster recovery check |

With these settings, the AP fails over after approximately 9 seconds:

  • 3 seconds × 3 attempts (initial + 2 retries) = 9 seconds

Monitoring Failover Events

Set up monitoring to detect failover situations:

  1. Configure the IronWiFi Service Monitor to check RADIUS availability
  2. Set up email alerts for authentication failures that may indicate failover
  3. Monitor which RADIUS server is handling authentications in the logs
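For step 3, one simple signal is the share of recent authentications handled by the secondary. The sketch below assumes authentication log entries have been exported as dictionaries with a `server` field set to `"primary"` or `"secondary"`. That layout and the 50% threshold are illustrative assumptions, not an IronWiFi format or recommendation.

```python
from collections import Counter


def secondary_share(events: list[dict]) -> float:
    """Fraction of authentication events handled by the secondary server."""
    if not events:
        return 0.0
    counts = Counter(event["server"] for event in events)
    return counts["secondary"] / len(events)


def failover_suspected(events: list[dict], threshold: float = 0.5) -> bool:
    """Flag a likely failover when the secondary handles most recent requests."""
    return secondary_share(events) >= threshold


# Hypothetical sample: the last four authentications from the log export.
recent = [
    {"server": "primary"},
    {"server": "secondary"},
    {"server": "secondary"},
    {"server": "secondary"},
]
print(secondary_share(recent), failover_suspected(recent))
```

Running a check like this over a short sliding window of recent entries keeps it responsive without alerting on a single stray request.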

Documenting Failover Results

Test Report Template

Record the results of all failover tests:
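One way to keep results consistent between test runs is to generate the report from structured data. This is a hypothetical sketch rather than an official IronWiFi template: the `FailoverResult` class, the `render_report` function, and the sample values are placeholders to adapt.

```python
from dataclasses import dataclass


@dataclass
class FailoverResult:
    """Outcome of one test from this guide."""

    name: str
    passed: bool
    notes: str = ""


def render_report(network: str, results: list[FailoverResult]) -> str:
    """Render a plain-text report with one line per test and a summary."""
    lines = [f"Failover test report: {network}", ""]
    for number, result in enumerate(results, start=1):
        status = "PASS" if result.passed else "FAIL"
        line = f"Test {number}: {result.name}: {status}"
        if result.notes:
            line += f" ({result.notes})"
        lines.append(line)
    failed = sum(not result.passed for result in results)
    lines += ["", f"{len(results) - failed} passed, {failed} failed"]
    return "\n".join(lines)


# Placeholder results; replace with the values observed during testing.
results = [
    FailoverResult("Verify Dual Server Configuration", True),
    FailoverResult("Normal Authentication Baseline", True),
    FailoverResult("Primary Server Failover", False, "failover took 35 seconds"),
]
print(render_report("Example Network", results))
```

Extending the list to all seven tests gives a complete record that can be compared against the remediation table.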

Remediation for Failed Tests

| Failed Test | Remediation |
| --- | --- |
| Dual config missing | Add secondary RADIUS server to AP configuration |
| Failover too slow | Reduce RADIUS timeout and retries on the AP |
| No automatic recovery | Check AP vendor documentation; may require firmware update |
| Accounting gaps | Verify accounting is configured on both RADIUS servers on the AP |
| Secondary also fails | Verify shared secret and port match for secondary server |
