
Operational Runbooks

This page provides step-by-step procedures for common operational tasks in IronWiFi. Use these runbooks during incidents, scheduled maintenance, and bulk operations to ensure consistent, reliable outcomes.

Incident Response

Runbook: Authentication Outage

Trigger: Multiple users report that they cannot connect to WiFi, or monitoring alerts indicate that the RADIUS server is unreachable.

Severity: Critical

Target resolution time: 15 minutes

Step 1: Confirm the issue (2 minutes)

  1. Check status.ironwifi.com for platform-wide incidents
  2. Log in to the IronWiFi Console
  3. Navigate to Reports > Authentication
  4. Look for a sudden drop in successful authentications or spike in failures
  5. Navigate to Networks and check the status indicator for each Network

Step 2: Determine scope (3 minutes)

| Scope | Indicators | Likely Cause |
|---|---|---|
| All users, all sites | No authentications in reports | Platform issue or account issue |
| All users, one site | Failures from specific NAS IPs | Site-level network or AP issue |
| Some users, all sites | Specific usernames failing | User/group configuration issue |
| One user | Single username failing | Individual user account issue |
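
The scope table above can be read as a small decision procedure. The following is a minimal sketch, assuming you have exported recent authentication events as (username, NAS IP, accepted) records; the `AuthEvent` type and `classify_scope` function are hypothetical names for illustration, not part of IronWiFi. The site check is a heuristic: it treats a NAS IP with rejects and no accepts as a failing site.

```python
from typing import Iterable, NamedTuple


class AuthEvent(NamedTuple):
    """Hypothetical record for one row of the authentication report."""
    username: str
    nas_ip: str
    accepted: bool  # True for Access-Accept, False for a reject


def classify_scope(events: Iterable[AuthEvent]) -> str:
    """Map recent events to the 'Likely Cause' column of the scope table."""
    events = list(events)
    accepted = [e for e in events if e.accepted]
    rejected = [e for e in events if not e.accepted]

    # Nothing succeeded anywhere (or no events at all): all users, all sites.
    if not accepted:
        return "Platform issue or account issue"

    failing_users = {e.username for e in rejected}
    if not failing_users:
        return "No failure pattern"
    if len(failing_users) == 1:
        return "Individual user account issue"

    # NAS IPs that produced rejects but not a single accept.
    dead_nas_ips = {e.nas_ip for e in rejected} - {e.nas_ip for e in accepted}
    if dead_nas_ips:
        return "Site-level network or AP issue"

    return "User/group configuration issue"
```

Treat the result as a starting point for Step 3, not as a diagnosis; a quiet site with a single misconfigured user can look like a site failure in a short time window.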

Step 3: Triage and act

If platform-wide outage (status.ironwifi.com reports incident):

  1. Verify RADIUS caching is active on access points (previously authenticated users should still connect)
  2. Monitor the status page for updates
  3. Contact IronWiFi support if no status page update within 15 minutes

If site-level issue (one location affected):

  1. Verify network connectivity from the site to the internet
  2. Check if the access point can reach IronWiFi RADIUS IPs (ping or traceroute from the AP network)
  3. Verify firewall rules allow outbound UDP on RADIUS authentication and accounting ports
  4. Check if the AP is using the correct RADIUS server IPs, ports, and shared secret
  5. Restart the access point or controller if configuration appears correct

If user/group issue:

  1. Check the affected user's account in the IronWiFi Console (status, credentials, group membership)
  2. Look at the authentication report for the specific reject reason
  3. See Troubleshooting for resolution steps based on the reject reason

Step 4: Verify resolution (2 minutes)

  1. Connect a test device and authenticate
  2. Check Reports > Authentication for new Access-Accept entries
  3. Confirm monitoring alerts clear

Step 5: Post-incident

  1. Document what happened, when, and how it was resolved
  2. If a configuration change caused the issue, implement safeguards (pre-change testing, rollback plan)
  3. Review monitoring coverage -- were alerts timely?

Runbook: High Authentication Failure Rate

Trigger: Monitoring alert for authentication reject rate above threshold (e.g., >10% over 15 minutes).

Severity: Warning (escalate to Critical if >50% reject rate)
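
The trigger and escalation thresholds above can be expressed as a small check. This is a minimal sketch, assuming your monitoring can supply accept and reject counts for the 15-minute window; the function name, parameter names, and return labels are illustrative and not part of IronWiFi. The 10% default mirrors the example threshold in the trigger, so adjust it to match your own alert.

```python
def reject_severity(accepts: int, rejects: int,
                    warning_threshold: float = 0.10,
                    critical_threshold: float = 0.50) -> str:
    """Return 'no-data', 'ok', 'warning', or 'critical' for one time window."""
    total = accepts + rejects
    if total == 0:
        # No traffic is not the same as no failures; surface it separately.
        return "no-data"

    reject_rate = rejects / total
    if reject_rate > critical_threshold:
        return "critical"
    if reject_rate > warning_threshold:
        return "warning"
    return "ok"
```

Both comparisons are strict, matching the ">10%" and ">50%" wording: a window at exactly 10% rejects does not alert.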

Step 1: Identify the pattern

  1. Navigate to Reports > Authentication
  2. Filter by result: Reject
  3. Look for patterns:

| Pattern | Likely Cause |
|---|---|
| Same username, repeated failures | Wrong password, locked account |
| Many usernames, same NAS IP | AP misconfiguration (wrong shared secret) |
| Many usernames, all NAS IPs | Group policy change, expired certificates |
| Unknown usernames | Unauthorized access attempts |
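
The pattern table above can be applied mechanically to an exported list of rejects. The following is a minimal sketch, assuming each reject is available as a (username, NAS IP) pair and that you maintain your own sets of known usernames and NAS IPs; `classify_rejects` and its parameters are hypothetical names, not an IronWiFi API.

```python
def classify_rejects(rejects, known_usernames, all_nas_ips):
    """Map rejected attempts to the 'Likely Cause' column of the pattern table.

    rejects: list of (username, nas_ip) pairs taken from the report.
    known_usernames: set of usernames that exist in your account.
    all_nas_ips: set of NAS IPs across all of your sites.
    """
    if not rejects:
        return "No rejects"

    users = {username for username, _ in rejects}
    nas_ips = {nas_ip for _, nas_ip in rejects}

    # None of the failing usernames match a known account.
    if not (users & known_usernames):
        return "Unauthorized access attempts"
    if len(users) == 1:
        return "Wrong password or locked account"
    # Many usernames from here on.
    if len(nas_ips) == 1:
        return "AP misconfiguration (wrong shared secret)"
    if nas_ips >= all_nas_ips:
        return "Group policy change or expired certificates"
    return "Mixed pattern; review manually"
```

In a deployment with only one NAS IP, the last two table rows cannot be told apart by this check, and it reports AP misconfiguration first.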

Step 2: Resolve based on pattern

Wrong password / locked account:

  1. Navigate to Users > find the affected user
  2. Verify the account is enabled
  3. Reset the password if needed

AP misconfiguration:

  1. Identify the NAS IP from the report
  2. Check the AP's RADIUS configuration (IP, port, shared secret)
  3. Correct the shared secret -- it must match the IronWiFi Network settings exactly

Group policy change:

  1. Review recent changes to Groups in the IronWiFi Console
  2. Check if any attributes were modified that could cause rejections (e.g., Login-Time restrictions)
  3. Revert the change or adjust the policy

Unauthorized access attempts:

  1. Review the failing usernames -- do they match known accounts?
  2. If not, this may be a brute-force attempt
  3. Monitor but do not take action unless it causes service degradation

Runbook: Captive Portal Not Loading

Trigger: Guests report the splash page does not appear when connecting to WiFi.

Severity: High

Step 1: Verify the issue

  1. Connect a test device to the guest SSID
  2. Open a browser and navigate to http://neverssl.com
  3. Observe whether the redirect to the captive portal occurs

Step 2: Check common causes

| Check | How | Fix |
|---|---|---|
| Walled garden | Verify 107.178.250.42/32 and *.ironwifi.com are in the AP walled garden | Add missing entries |
| Captive portal URL | Verify the splash page URL in the AP matches IronWiFi Console | Correct the URL |
| RADIUS configuration | Verify RADIUS settings on the AP match the IronWiFi Network | Correct settings |
| Captive portal status | Check the portal is enabled in the IronWiFi Console | Enable the portal |
| DNS | Verify DNS resolution works from the guest VLAN | Fix DNS configuration |
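
The walled garden check above lends itself to a quick comparison when you can export the AP's configured entries as text. This is a minimal sketch under that assumption; `missing_walled_garden_entries` is a hypothetical helper, and the two required entries are the ones named in the table. A bare IP address is treated as equivalent to its /32 form.

```python
import ipaddress

# The two entries the walled garden check asks for.
REQUIRED_ENTRIES = {"107.178.250.42/32", "*.ironwifi.com"}


def _normalize(entry: str) -> str:
    """Lowercase hostnames and write IP entries in canonical CIDR form."""
    entry = entry.strip().lower()
    try:
        return str(ipaddress.ip_network(entry, strict=False))
    except ValueError:
        # Not an IP or network, so treat it as a hostname pattern.
        return entry


def missing_walled_garden_entries(configured):
    """Return the required entries that are absent from the AP's list."""
    present = {_normalize(entry) for entry in configured}
    return sorted(REQUIRED_ENTRIES - present)
```

This only compares entries literally; it does not detect a broader rule on the AP (for example a larger subnet) that already covers the required address.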

Step 3: Verify fix

  1. Reconnect the test device
  2. Confirm the splash page loads
  3. Complete authentication and verify internet access

Escalation Paths

When to escalate to IronWiFi support

Escalate to IronWiFi support when:

  • status.ironwifi.com shows no incident but the service is clearly down
  • An issue persists after completing the relevant runbook
  • You suspect a platform-level bug
  • You need configuration help beyond what documentation covers

How to contact support

| Channel | Response Time | Best For |
|---|---|---|
| Live chat on ironwifi.com | Minutes (24/7) | Urgent issues, quick questions |
| Email: support@ironwifi.com | Hours | Detailed issues, non-urgent requests |
| Status page: status.ironwifi.com | N/A | Platform-wide incident updates |

Information to provide

When contacting support, include:

  • Account email or organization name
  • Network name and region
  • Timestamp of when the issue started
  • Scope -- how many users/sites affected
  • Steps already taken from the relevant runbook
  • Screenshots of error messages or report data
  • NAS IP of the affected access point (if applicable)

Maintenance Procedures

Runbook: Shared Secret Rotation

Rotate the RADIUS shared secret periodically or if you suspect it has been compromised.

Warning: Changing the shared secret requires updating every access point that uses the Network. Plan this during a maintenance window.

Procedure

  1. Schedule a maintenance window during low-usage hours
  2. Document current settings -- Note the current shared secret for rollback
  3. Generate the new secret in the IronWiFi Console:
    • Navigate to Networks > select the Network
    • Regenerate the shared secret
    • Copy the new secret
  4. Update access points -- Change the shared secret on every AP or controller that uses this Network
  5. Test -- Authenticate a test user from each site
  6. Verify -- Check Reports > Authentication for successful authentications across all sites
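
For step 6, it helps to confirm that every site has produced at least one successful authentication since the change. The following is a minimal sketch, assuming you can export events as (timestamp, NAS IP, accepted) tuples and that you keep your own mapping of site names to NAS IPs; the function and parameter names are hypothetical.

```python
from datetime import datetime


def sites_without_success(events, site_nas_ips, changed_at: datetime):
    """List sites with no successful authentication since the secret changed.

    events: iterable of (timestamp, nas_ip, accepted) tuples.
    site_nas_ips: dict mapping a site name to the set of its NAS IPs.
    changed_at: when the new shared secret was applied.
    """
    healthy_ips = {
        nas_ip
        for timestamp, nas_ip, accepted in events
        if accepted and timestamp >= changed_at
    }
    return sorted(
        site for site, nas_ips in site_nas_ips.items()
        if not (nas_ips & healthy_ips)
    )
```

An empty result means every site has authenticated at least one user with the new secret. A site that appears in the list may simply have had no traffic yet, so run the test authentication from step 5 there before rolling back.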

Rollback

If authentication fails after the change:

  1. Revert the shared secret on the access points to the previous value
  2. Contact IronWiFi support to revert the secret in the Console if needed

Runbook: Adding a New Site

Use this runbook when deploying IronWiFi to a new physical location.

Procedure

  1. Determine the region -- Choose the IronWiFi region closest to the new site
  2. Decide: new Network or existing?
    • Same region as an existing Network: reuse the existing Network's RADIUS settings
    • Different region: create a new Network in the closer region
  3. If creating a new Network:
    • Navigate to Networks > Create Network
    • Select the appropriate region
    • Note the RADIUS server details
  4. Configure access points at the new site:
    • Enter the RADIUS server IPs, ports, and shared secret
    • Configure both primary and backup servers
    • Set up accounting
    • Configure captive portal if needed (see Quick Start: Guest WiFi)
  5. Test:
    • Authenticate a test user
    • Verify VLAN assignment (if applicable)
    • Verify captive portal (if applicable)
    • Check accounting data appears in Reports
  6. Set up monitoring for the new site's RADIUS connectivity (see Monitoring and Alerting)

Runbook: Planned Maintenance on Access Points

Use this runbook when you need to reboot or reconfigure access points.

Procedure

  1. Notify users of the maintenance window (email, Slack, signage)
  2. Verify IronWiFi RADIUS settings are documented (in case AP resets to defaults)
  3. Perform maintenance (firmware update, reboot, configuration change)
  4. After maintenance:
    • Verify the AP reconnects and reaches IronWiFi RADIUS servers
    • Authenticate a test user
    • Check Reports > Authentication for events from the maintained AP (NAS IP)
  5. Confirm accounting -- Verify sessions restart after the reboot (existing sessions will show NAS-Reboot terminate cause)

Bulk Operations

Runbook: Bulk User Import

Import a large number of users via the API.

Preparation

  1. Prepare a CSV file with columns: username, email, fullname, password, group
  2. Validate the data:
    • No duplicate usernames
    • Valid email formats
    • Passwords meet complexity requirements
  3. Back up your current user list via the API (see Backup and Restore)
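
The validation in step 2 can be automated before any API call is made. The following is a minimal sketch that checks the column names from step 1, duplicate usernames, and email shape. The password rule is an assumption: the complexity requirements are not stated here, so the sketch only enforces a configurable minimum length. `validate_users_csv` is a hypothetical helper, and the email pattern is a loose shape check rather than full address validation.

```python
import csv
import io
import re

REQUIRED_COLUMNS = ["username", "email", "fullname", "password", "group"]
# Loose shape check: something@something.something, no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_users_csv(text: str, min_password_length: int = 8) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    errors = []
    seen = set()
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        username = (row.get("username") or "").strip()
        email = (row.get("email") or "").strip()
        password = row.get("password") or ""

        if not username:
            errors.append(f"line {line_no}: empty username")
        elif username in seen:
            errors.append(f"line {line_no}: duplicate username '{username}'")
        seen.add(username)

        if not EMAIL_RE.match(email):
            errors.append(f"line {line_no}: invalid email '{email}'")
        if len(password) < min_password_length:
            # Assumed rule; replace with your actual complexity policy.
            errors.append(f"line {line_no}: password too short")
    return errors
```

Run it on the file contents, for example `validate_users_csv(open("users.csv", newline="").read())`, and fix every reported line before starting the import.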

Execution

  1. Use the batch import script from the Migration Guide
  2. Monitor the script output for errors
  3. Respect API rate limits (100 requests per minute)
  4. After the import, verify in the IronWiFi Console:
    • Total user count matches expected number
    • Spot-check several users' group assignments
    • Test authentication for a sample of imported users
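
To respect the rate limit in step 3, a client-side throttle can pace requests. This is a minimal sketch of a sliding-window limiter, assuming the limit of 100 requests per minute applies to your API key as a whole; the `RateLimiter` class is a hypothetical helper and does not reflect how the Migration Guide script handles throttling.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any rolling window of period seconds."""

    def __init__(self, max_calls=100, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls inside the window

    def wait(self):
        """Block until one more call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the rolling window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Sleep until the oldest recorded call expires.
            self._sleep(self.period - (now - self._calls[0]))
```

Call `limiter.wait()` immediately before each API request in the import loop. Setting `max_calls` slightly below 100 leaves headroom for other integrations that share the same key.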

Rollback

If the import introduces bad data:

  1. Stop the import script
  2. Identify problematic users via the Console or API
  3. Delete or correct them individually or via the API

Runbook: Bulk User Deactivation

Deactivate multiple user accounts (e.g., departing employees, end of event).

Procedure

Verification

  1. Attempt to authenticate with a deactivated user -- should receive
    Access-Reject
  2. Check the IronWiFi Console -- deactivated users should show as disabled

Runbook: Bulk Password Reset

Force password resets for a group of users (e.g., after a security incident).

Procedure

  1. Export the list of affected users
  2. Generate new temporary passwords
  3. Update passwords via the API
  4. Communicate new passwords to users through a secure channel
  5. Require users to change their password on next login (if your IdP supports this)
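
For step 2, temporary passwords should come from a cryptographically secure source rather than a general-purpose random generator. The following is a minimal sketch using Python's standard `secrets` module; the function name, the 16-character default, and the alphanumeric alphabet are assumptions, so adjust them to your password policy. It only generates passwords and does not call the IronWiFi API.

```python
import secrets
import string

# Letters and digits only, to avoid characters that are awkward to type.
ALPHABET = string.ascii_letters + string.digits


def generate_temp_passwords(usernames, length: int = 16) -> dict:
    """Return a mapping of username to a freshly generated temporary password."""
    return {
        username: "".join(secrets.choice(ALPHABET) for _ in range(length))
        for username in usernames
    }
```

Keep the resulting mapping in memory or in an encrypted store only for as long as it takes to update the accounts and notify users.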

Operational Checklists

Daily

  • Check status.ironwifi.com for any incidents
  • Review monitoring dashboard for alerts
  • Glance at authentication reports for unexpected patterns

Weekly

  • Review authentication failure trends
  • Check session counts for anomalies
  • Verify scheduled backups completed successfully
  • Review any pending user access requests

Monthly

  • Review and clean up unused user accounts
  • Audit Group policies for accuracy
  • Update documentation for any configuration changes
  • Review access point firmware for available updates

Quarterly

  • Test RADIUS failover (primary to backup)
  • Test backup restore procedure
  • Review and update monitoring thresholds
  • Rotate API keys used for integrations
  • Review Backup and Restore procedures
