Migration From Etch
- c.h.o migrated to new server, all services functioning as before for all users
- Regularly scheduled automated off-site backups of the new server, so that we can redeploy if we suddenly lose the server (hardware failure, VPS provider going belly up, etc.)
- A set of migration scripts that can be re-used in case we need to migrate again in the future
- A set of policies/procedures that will keep the migration scripts up-to-date
Assets to be moved
Services to migrate
The following can be done as separate chunks:
- Increase lun's memory allocation and move the disk image out of ~igloo
- Admin home directories
- Configure exim, mailman, apache, clamav
- planet (might need to migrate one or two users early for this)
- projects.haskell.org MX (hmm, this'll divorce the lists from the web interface)
- mailman (need to disable list creation script on nun first)
- User accounts and home directories
- RT on PostgreSQL
- c.h.o admin scripts for users
- c.h.o admin scripts for admins
User data to be transferred
- User accounts and home directories
- Project groups in /etc/group
- Darcs repos from /srv/projects
- Project data from /srv/code
- Trac projects
- Mailman lists
- Planet Haskell
- Physical mailboxes (igloo and malcolm)
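Most of the data sets above are directory trees that would be copied with rsync and then checked. A minimal sketch of how the copy and verify commands might be built from one shared manifest follows. Only `/srv/projects` and `/srv/code` come from the list above; the home directory path, the host name `new-server.example.org`, and all function names are assumptions made for illustration. Verification here is an rsync dry run with checksums: if it reports no itemized changes, the two trees match.

```python
# Sketch only: builds rsync command lines, it does not run them.
# /srv/projects and /srv/code come from the plan; the other path and the
# host name are placeholders (assumptions) to be replaced.
DATA_SETS = {
    "darcs repos": "/srv/projects/",
    "project data": "/srv/code/",
    "home directories": "/home/",  # assumed location
}

NEW_HOST = "new-server.example.org"  # hypothetical host name

# --delete removes files on the destination that no longer exist on the
# source, so a repeated sync converges on an exact mirror.
BASE_FLAGS = ["--archive", "--hard-links", "--delete"]


def copy_command(path, host=NEW_HOST):
    """Return the argv list that copies one tree to the same path on host."""
    return ["rsync", *BASE_FLAGS, path, f"{host}:{path}"]


def verify_command(path, host=NEW_HOST):
    """Return the argv list for a checksum-based dry run of the same copy."""
    check_flags = ["--dry-run", "--checksum", "--itemize-changes"]
    return ["rsync", *BASE_FLAGS, *check_flags, path, f"{host}:{path}"]


def differences(dry_run_output):
    """Return the non-empty lines of a dry run; an empty list means in sync."""
    return [line for line in dry_run_output.splitlines() if line.strip()]


if __name__ == "__main__":
    for name, path in DATA_SETS.items():
        print(name)
        print("  copy:  ", " ".join(copy_command(path)))
        print("  verify:", " ".join(verify_command(path)))
```

The commands could be passed to `subprocess.run(..., capture_output=True, text=True)` and the captured standard output handed to `differences`. Items that are not plain directory trees (user accounts, `/etc/group` entries, mailman list configuration) would need their own copy and verify steps.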
Domains to be transferred
Coordination will happen on the #haskell-infrastructure channel on FreeNode.
- Formulate plan on how to communicate with users during the migration process; begin gathering required contact info if necessary
- Calculate the approximate size and rate of change of each type of data on the server
- Give preliminary notice to users and ask for feedback
- Configure domain name etch.haskell.org (will later become a read-only copy and be left active for a while as a backup)
- Shorten the TTL on all domains
- Provision the new server
- Install and perform basic configuration of all services
- Test all installed services
- Set up backup mirror(s)
- Set up backup mirror services/scripts. Install on new server and test.
- Write scripts to copy user accounts and rsync home directories, and to verify
- Write scripts to copy project groups in /etc/group, and to verify
- Write scripts to rsync darcs repos from /srv/projects, and to verify
- Write scripts to rsync project data from /srv/code, and to verify
- Write scripts to rsync Trac projects, and to verify
- Write scripts to copy mailman lists and rsync archives, and to verify
- Write scripts to rsync Planet Haskell, and to verify
- Write a script to make all user data and projects read-only (except mailman archives), and to verify.
- Write a script to *undo* making things read-only, in case of emergency.
- Test all scripts thoroughly
- Set firm dates for beginning the initial copy and for the final migration, and give advance notice and instructions to users.
- Transfer RT database and verify (as a dry run, go through the entire migration process just for RT)
- Move the rt domain
- After TTL, verify that RT is working on the new server
- Run script to copy user accounts. Verify.
- Run script to copy /etc/group entries for projects. Verify.
- Run scripts to rsync home directories, darcs repos, project data, Trac projects, mailman archives, and Planet data
- Monitor progress; update schedule and notify users as needed
- When completed, verify.
- Notify users.
- Stop the Planet Haskell hourly cron job.
- Run script to make all user data and projects read-only. Verify.
- If any user accounts or projects were added since initial copy began, add them.
- Run scripts to rsync home directories, darcs repos, project data, and Trac projects. Verify.
- Stop mailman service on etch.
- Run the scripts to rsync mailman archives. Verify.
- Move the domain CNAME for community, trac, projects, and code.
- After TTL, verify remotely that the domains moved and that services are working.
- Notify users of current status.
- Tell Ian and Malcolm to check their mail.
- Move the domain MX records.
- Run the scripts to rsync Planet Haskell. Verify.
- Move the planet domain.
- Start the Planet Haskell hourly cron job.
- After TTL, verify remotely that all domains are moved and that all services are working.
- Notify users and community.
- Activate backup mirroring. Verify.
- Monitor network bandwidth and responsiveness of the new server.
- Monitor memory and CPU usage on the new server.
- Monitor and tune settings of PostgreSQL for resource usage on the new server
- Monitor and tune settings of Apache for resource usage on the new server
- After a day or two, raise TTL back to normal levels on all domains
- After a month or two, delete the Etch server
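The plan above calls for one script that makes user data read-only and another that undoes it in an emergency. The undo is only reliable if the original permissions are recorded before they are changed. The following is a minimal sketch of that idea, assuming that clearing the write bits on files and directories is an acceptable way to freeze the data; the function names are hypothetical, and note that mode bits do not restrain root or services running as root.

```python
import os
import stat

# All three write bits: owner, group, other.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def _paths(root):
    """Yield root and every path beneath it, skipping symbolic links."""
    yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                yield path


def make_read_only(root):
    """Clear the write bits under root and return {path: original mode}."""
    saved = {}
    # Collect the paths first so the walk is finished before modes change.
    for path in list(_paths(root)):
        mode = stat.S_IMODE(os.lstat(path).st_mode)
        saved[path] = mode
        os.chmod(path, mode & ~WRITE_BITS)
    return saved


def undo_read_only(saved):
    """Restore the modes recorded by make_read_only."""
    for path, mode in saved.items():
        os.chmod(path, mode)


def is_read_only(root):
    """Verification step: True if no path under root has any write bit set."""
    return all(
        stat.S_IMODE(os.lstat(path).st_mode) & WRITE_BITS == 0
        for path in _paths(root)
    )
```

In practice the dictionary returned by `make_read_only` would be written to disk (for example with `json.dump`) before the final rsync begins, so that the undo still works if the session that froze the data is lost. Mailman archives, which the plan excludes from the freeze, would simply not be passed to `make_read_only`.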