Backup & Recovery
Backup procedures, disaster recovery, and node migration
Reliable backup and recovery procedures are essential for maintaining node availability and protecting against data loss. This section outlines what to back up, how often, where to store it, and how to restore from failure.
Plan for backup storage of 1.5–2× your current chain data size. Backup operations typically add 10–20% I/O load during execution.
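As a rough worked example of the sizing guideline above, the sketch below turns a chain data size into a storage budget. The function name and the 800 GB figure are illustrative assumptions, not values from the Plasma documentation.

```python
def backup_storage_range_gb(chain_data_gb: float) -> tuple[float, float]:
    """Return the (low, high) backup storage budget in GB using the 1.5-2x guideline."""
    if chain_data_gb < 0:
        raise ValueError("chain_data_gb must be non-negative")
    return (1.5 * chain_data_gb, 2.0 * chain_data_gb)


# 800 GB is a made-up chain data size used only for illustration.
low, high = backup_storage_range_gb(800)
print(f"Plan for {low:.0f}-{high:.0f} GB of backup storage")
```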
What to Back Up
Non-validator nodes store critical state and configuration data required for continued operation. Key components include:
- Blockchain database: Stores the full Plasma chain state. Restoring from a backup is significantly faster than a full resync.
- Configuration files: Includes Docker Compose files, .env variables, and any custom scripts.
- Keystores and peer state: Enables clean restarts without manual reconfiguration. May include auth tokens and networking metadata.
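One way to capture these components together is a single timestamped archive. The following is a minimal sketch using Python's standard `tarfile` module; the function name, the archive naming scheme, and the example paths in the comments are assumptions for illustration and will differ per deployment.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def create_backup(sources: list[Path], dest_dir: Path) -> Path:
    """Pack the given files and directories into one timestamped .tar.gz archive."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = dest_dir / f"node-backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in sources:
            # arcname keeps only the final path component so the archive
            # can be restored under a different parent directory.
            tar.add(src, arcname=src.name)
    return archive


# Hypothetical usage; these paths are placeholders, not Plasma defaults:
# create_backup(
#     [Path("./data"), Path("./docker-compose.yml"), Path("./.env")],
#     Path("/mnt/backups"),
# )
```

Copying a database directory while the node is writing to it can capture an inconsistent state, so it is generally safer to stop the node or take a filesystem snapshot before archiving.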
Backup Strategy
Frequency
Set backup intervals based on usage and risk profile. Daily snapshots are sufficient for most non-validator nodes. High-throughput deployments may require more frequent backups to minimize data loss during failure.
Storage Considerations
Store backups on separate infrastructure: cloud buckets, remote hosts, or offline disks. Avoid colocating backups with the primary node.
Do not store backups on the same physical machine as your running node. A single hardware failure can result in total data loss.
Encrypt backups that contain sensitive data, especially when using external storage providers. Ensure backup storage has enough capacity for your retention requirements and projected growth.
Recovery Scenarios
Partial Recovery
Use targeted restores when only certain files are affected:
- Restore configuration files after accidental edits
- Recover a corrupted database without resetting sync progress
- Reapply peer state to preserve existing networking setup
Partial recovery reduces downtime and avoids full resyncs.
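A targeted restore can be sketched as extracting only the matching entries from a backup archive. This example assumes the backup is a tar archive whose top-level entries are named after the backed-up items (for example `.env` or `data`); the function name and layout are hypothetical.

```python
import tarfile
from pathlib import Path


def restore_subset(archive: Path, prefix: str, target_dir: Path) -> list[str]:
    """Extract only the archive members equal to or nested under `prefix`."""
    base = prefix.rstrip("/")
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:*") as tar:
        wanted = [
            m for m in tar.getmembers()
            if m.name == base or m.name.startswith(base + "/")
        ]
        # Use the "data" extraction filter when this Python version provides
        # it; it rejects absolute paths and path traversal in member names.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(path=target_dir, members=wanted, **kwargs)
    return [m.name for m in wanted]
```

Restoring into a scratch directory first, then moving files into place, keeps a bad archive from overwriting a working configuration.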
Full Recovery
A full recovery is required when the node or host system is lost:
- Provision a new machine or VM
- Restore the blockchain database and configs from backup
- Start the node and rejoin the network
- Confirm sync with the latest finalized block
Recovery time depends on data size, network bandwidth, and storage speed.
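For a back-of-the-envelope estimate, transfer time is roughly data size divided by the slowest link in the path. The sketch below assumes the bottleneck is the slower of network and disk throughput and ignores decompression and post-restore sync time; all figures are illustrative.

```python
def estimate_restore_hours(
    data_gb: float, network_mb_per_s: float, disk_mb_per_s: float
) -> float:
    """Estimate transfer time in hours, limited by the slower of network and disk."""
    bottleneck = min(network_mb_per_s, disk_mb_per_s)
    if bottleneck <= 0:
        raise ValueError("throughput values must be positive")
    seconds = (data_gb * 1000) / bottleneck  # 1 GB treated as 1000 MB
    return seconds / 3600


# Illustrative numbers only: 800 GB over a 100 MB/s link onto a 400 MB/s disk.
print(f"{estimate_restore_hours(800, 100, 400):.1f} hours")
```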
Validation
Regularly verify backup integrity:
- Run checksum verification on stored files
- Periodically perform test restores on non-critical infrastructure
- Monitor backup success, duration, and data size
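Checksum verification can be sketched with Python's standard `hashlib`. This assumes a SHA-256 digest was recorded when the backup was created; the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path: Path, expected_hex: str) -> bool:
    """Return True when the file's current SHA-256 matches the recorded value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A matching checksum shows the file has not changed since it was hashed. It does not prove the archive restores cleanly, which is why periodic test restores are still needed.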
Best Practices
- Automate backups and alert on failure
- Use version control for configuration files
- Test restore procedures quarterly
- Track recovery time to evaluate RTO/RPO goals
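To compare a drill or incident against RTO/RPO goals, it helps to record three timestamps: the last good backup, the failure, and the moment service was restored. The sketch below derives the data-loss window and downtime from them; the function name and the example timestamps are assumptions.

```python
from datetime import datetime, timedelta


def recovery_metrics(
    last_backup: datetime, failure: datetime, restored: datetime
) -> tuple[timedelta, timedelta]:
    """Return (data_loss_window, downtime) for one incident or restore drill."""
    if not (last_backup <= failure <= restored):
        raise ValueError("expected last_backup <= failure <= restored")
    return failure - last_backup, restored - failure


# Made-up timestamps for illustration.
loss, downtime = recovery_metrics(
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 18, 30),
    datetime(2024, 1, 1, 21, 0),
)
print(loss <= timedelta(hours=24))     # within a 24-hour RPO target?
print(downtime <= timedelta(hours=4))  # within a 4-hour RTO target?
```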
Troubleshooting
Backup Failures
- Check disk space, permissions, and storage connectivity
- Review logs for I/O or timeout errors
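The disk space and permission checks above can be automated as a pre-flight step before each backup run. This is a minimal sketch using only the standard library; the function name and message wording are illustrative, and it does not test connectivity to remote storage.

```python
import os
import shutil
from pathlib import Path


def preflight_check(dest_dir: Path, required_bytes: int) -> list[str]:
    """Return a list of problems that would likely make a backup to dest_dir fail."""
    if not dest_dir.is_dir():
        return [f"{dest_dir} does not exist or is not a directory"]
    problems = []
    if not os.access(dest_dir, os.W_OK):
        problems.append(f"{dest_dir} is not writable by the current user")
    free = shutil.disk_usage(dest_dir).free
    if free < required_bytes:
        problems.append(f"only {free} bytes free, {required_bytes} required")
    return problems
```

An empty list means the checks passed; a scheduler could skip the run and raise an alert otherwise.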
Corruption Detection
- Validate checksums regularly
- Monitor sync logs for signs of database inconsistency
Recovery Performance
- Speed up restores by using fast storage and local disks
- Use parallel I/O if supported by the storage backend
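Where the backend handles concurrent reads well, copying many files in parallel can shorten a restore. The sketch below uses a small thread pool from the standard library; the function name and the default of four workers are assumptions, and the right degree of parallelism depends on the storage in use.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parallel_copy(src_dir: Path, dst_dir: Path, workers: int = 4) -> int:
    """Copy every file under src_dir into dst_dir using a small thread pool."""
    files = [p for p in src_dir.rglob("*") if p.is_file()]

    def copy_one(src: Path) -> None:
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so any worker exception is raised here.
        list(pool.map(copy_one, files))
    return len(files)
```

Parallelism mainly helps with many medium-sized files; a single large archive is usually limited by sequential throughput instead.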
A robust backup and recovery plan protects against data loss and minimizes downtime. Test regularly, store backups securely, and follow a structured recovery process to maintain reliable node operations.