Revision as of 15:51, 10 November 2010 by Igloo
1 Migration From Etch
* c.h.o migrated to new server, all services functioning as before for all users
* Regularly scheduled automated off-site backups of new server, so that we can redeploy if we suddenly lose the server (hardware failure, VPS provider belly up, etc.)
* A set of migration scripts that can be re-used in case we need to migrate again in the future
* A set of policies/procedures that will keep the migration scripts up-to-date
1.2 Assets to be moved
1.2.1 Services to migrate
* RT on PostgreSQL
* Trac
* http
* Exim
* ClamAV
* mrtg
* darcs
* ghc
* mailman
* nagios
* planet
* c.h.o admin scripts for users
  * account_request
  * project_request
  * createlist
  * createtrac
  * addtoproject
  * crontabfor
* c.h.o admin scripts for admins
  * create_user.sh
  * create_project.sh
1.2.2 User data to be transferred
* User accounts and home directories
* Project groups in /etc/group
* Darcs repos from /srv/projects
* Project data from /srv/code
* Trac projects
* Mailman lists
* Planet Haskell
* Physical mailboxes (igloo and malcolm)
1.2.3 Domains to be transferred
* community
* code
* rt
* planet
* trac
* projects
1.3 Migration plan
Coordination will happen on the #haskell-infrastructure channel on FreeNode.
1.3.1 Preparatory stage
1. Formulate a plan for how to communicate with users during the migration process; begin gathering required contact info if necessary
1. Calculate the approximate size and rate of change of each type of data on the server
1. Give preliminary notice to users and ask for feedback
1. Configure domain name etch.haskell.org (will later become a read-only copy and be left active for a while as a backup)
1. Change TTL on all domains to be short
1. Provision the new server
1. Install and perform basic configuration of all services
1. Test all installed services
1. Set up backup mirror(s)
1. Set up backup mirror services/scripts. Install on new server and test.
1. Write scripts to copy user accounts and rsync home directories, and to verify
1. Write scripts to copy project groups in /etc/group, and to verify
1. Write scripts to rsync darcs repos from /srv/projects, and to verify
1. Write scripts to rsync project data from /srv/code, and to verify
1. Write scripts to rsync trac projects, and to verify
1. Write scripts to copy mailman lists and rsync archives, and to verify
1. Write scripts to rsync Planet Haskell, and to verify
1. Write a script to make all user data and projects read-only (except mailman archives), and to verify.
1. Write a script to *undo* making things read-only, in case of emergency.
1. Test all scripts thoroughly
1. Fix dates for beginning initial copy and final migration, and give advance notice and instructions to users.
1. Transfer RT database and verify (as a dry run, go through the entire migration process just for RT)
1. Move the rt domain
1. After TTL, verify that RT is working on the new server
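The read-only script and its undo counterpart from the list above could look roughly like the following Python sketch. It is an illustration only, not the script used for the migration: the function names, the JSON manifest, and the choice to clear write bits with chmod are all assumptions. The original modes are saved before anything is changed, which is what makes an emergency undo possible. Mailman archives would be left writable simply by not passing their directory to `make_read_only`.

```python
import json
import os
import stat

# Write permission for owner, group, and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def collect_modes(root):
    """Return {path: permission bits} for root and everything below it."""
    modes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            # Symlinks are skipped: chmod would act on their targets.
            if not os.path.islink(path):
                modes[path] = stat.S_IMODE(os.lstat(path).st_mode)
    return modes


def make_read_only(root, manifest_path):
    """Save current modes to manifest_path, then clear all write bits.

    manifest_path is a hypothetical location and should live outside root.
    """
    modes = collect_modes(root)
    with open(manifest_path, "w") as fh:
        json.dump(modes, fh)  # written first, so an undo is always possible
    for path, mode in modes.items():
        os.chmod(path, mode & ~WRITE_BITS)
    return modes


def undo_read_only(manifest_path):
    """Restore the modes recorded by make_read_only."""
    with open(manifest_path) as fh:
        modes = json.load(fh)
    for path, mode in modes.items():
        os.chmod(path, mode)


def writable_paths(root):
    """Verification step: list paths that still have any write bit set."""
    return sorted(p for p, m in collect_modes(root).items() if m & WRITE_BITS)
```

An empty list from `writable_paths` after `make_read_only` would count as a successful verification under these assumptions. Note that clearing permission bits does not stop root, and does not stop services that write through other means, so the real script may need additional steps.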
1.3.2 Initial copy
1. Run script to copy user accounts. Verify.
1. Run script to copy /etc/group entries for projects. Verify.
1. Run scripts to rsync home directories, darcs repos, project data, trac projects, mailman archives, and Planet data
1. Monitor progress; update schedule and notify users as needed
1. When completed, verify.
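The plan does not say how a copy is to be verified. One possible approach, sketched here in Python under that caveat, is to build a manifest of SHA-256 digests for each tree and compare the two manifests. The function names are hypothetical. In practice the source and the copy sit on different servers, so `tree_digests` would be run once on each machine and the resulting manifests compared, rather than calling `compare_trees` on two local paths as shown.

```python
import hashlib
import os


def tree_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            h = hashlib.sha256()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large repos do not exhaust memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests


def compare_trees(source, copy):
    """Report files missing from the copy, extra in it, or differing."""
    src, dst = tree_digests(source), tree_digests(copy)
    return {
        "missing": sorted(set(src) - set(dst)),
        "extra": sorted(set(dst) - set(src)),
        "changed": sorted(p for p in set(src) & set(dst) if src[p] != dst[p]),
    }
```

During the initial copy the source is still writable, so some "changed" or "missing" entries are to be expected. This sketch compares file contents only; ownership and permissions would need a separate check.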
1.3.3 Final migration
1. Notify users.
1. Stop the Planet Haskell hourly cron job.
1. Run script to make all user data and projects read-only. Verify.
1. If any user accounts or projects were added since initial copy began, add them.
1. Run scripts to rsync home directories, darcs repos, project data, and Trac projects. Verify.
1. Stop mailman service on etch.
1. Run the scripts to rsync mailman archives. Verify.
1. Move the domain CNAME for community, trac, projects, and code.
1. After TTL, verify remotely that the domains moved and that services are working.
1. Notify users of current status.
1. Tell Ian and Malcolm to check their mail.
1. Move the domain MX records.
1. Run the scripts to rsync Planet Haskell. Verify.
1. Move the planet domain.
1. Start the Planet Haskell hourly cron job.
1. After TTL, verify remotely that all domains are moved and that all services are working.
1. Notify users and community.
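The "after TTL, verify remotely that the domains moved" steps can be partly automated. The Python sketch below is one hypothetical way to do it: it resolves each name and reports those that do not yet point at the new server. The function name, the `192.0.2.10` address (a reserved documentation address), and the assumption that the listed domains are hosts under haskell.org are all illustrative. It should be run from a machine outside both servers, since a local resolver cache may not reflect what users see.

```python
import socket


def unmoved_domains(names, new_server_ip, resolve=socket.gethostbyname):
    """Return (name, resolved address) pairs not yet pointing at the new server.

    resolve is injectable so the check can be exercised without network
    access; by default it performs a normal IPv4 lookup.
    """
    pending = []
    for name in names:
        try:
            address = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            address = None
        if address != new_server_ip:
            pending.append((name, address))
    return pending


# Hypothetical values: the host names assume the domains listed in the
# plan live under haskell.org, and the IP is a placeholder.
DOMAINS = [
    "community.haskell.org",
    "code.haskell.org",
    "rt.haskell.org",
    "planet.haskell.org",
    "trac.haskell.org",
    "projects.haskell.org",
]
NEW_SERVER_IP = "192.0.2.10"
```

Calling `unmoved_domains(DOMAINS, NEW_SERVER_IP)` after the TTL has expired would return an empty list once every name resolves to the new server. This only checks DNS; confirming that each service actually responds still needs a separate test, and MX records are not covered by this lookup.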
1.3.4 Post-migration tasks
1. Activate backup mirroring. Verify.
1. Monitor net bandwidth and responsiveness of the new server.
1. Monitor memory and cpu usage on the new server.
1. Monitor and tune settings of PostgreSQL for resource usage on the new server
1. Monitor and tune settings of Apache for resource usage on the new server
1. After a day or two, raise TTL back to normal levels on all domains
1. After a month or two, delete the Etch server