Applies to SUSE OpenStack Cloud 6

11 SUSE OpenStack Cloud Maintenance

11.1 Keeping the Nodes Up-to-date

Keeping the nodes in SUSE OpenStack Cloud up-to-date requires an appropriate setup of the update and pool repositories and the deployment of either the Updater barclamp or the SUSE Manager barclamp. For details, see Section 5.2, “Update and Pool Repositories”, Section 9.4.1, “Deploying Node Updates with the Updater Barclamp”, and Section 9.4.2, “Configuring Node Updates with the SUSE Manager Client Barclamp”.

If one of these barclamps is deployed, patches are installed on the nodes. Installing patches that do not require a reboot does not cause any service interruption. If a patch (for example, a kernel update) requires a reboot after installation, services running on the rebooted machine will not be available within SUSE OpenStack Cloud. Therefore, it is strongly recommended to install such patches during a maintenance window.

Note
Note: No Maintenance Mode

As of SUSE OpenStack Cloud 6 it is not possible to put SUSE OpenStack Cloud into Maintenance Mode.

Consequences when Rebooting Nodes
Administration Server

While the Administration Server is offline, it is not possible to deploy new nodes. However, rebooting the Administration Server has no effect on starting instances or on instances already running.

Control Nodes

The consequences of rebooting a Control Node depend on the services running on that node:

Database, Keystone, RabbitMQ, Glance, Nova:  No new instances can be started.

Swift:  No object storage data is available. If Glance uses Swift, it will not be possible to start new instances.

Cinder, Ceph:  No block storage data is available.

Neutron:  No new instances can be started. On running instances the network will be unavailable.

Horizon:  Horizon will be unavailable. Starting and managing instances can be done with the command line tools.

Compute Nodes

Whenever a Compute Node is rebooted, all instances running on that particular node will be shut down and must be manually restarted. Therefore it is recommended to evacuate the node by migrating its instances to another node before rebooting it.
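Assuming the nova command line client is installed and admin credentials are sourced, the evacuation can be sketched as follows (the host name compute-node-1 and INSTANCE_ID are placeholders):

```shell
# List the instances on the node that is to be rebooted
nova list --host compute-node-1 --all-tenants

# Live-migrate an instance; without an explicit target host,
# the scheduler picks a suitable Compute Node
nova live-migration INSTANCE_ID

# Re-check: the list should be empty before rebooting the node
nova list --host compute-node-1 --all-tenants
```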

11.2 Service Order on SUSE OpenStack Cloud Start-up or Shutdown

In case you need to restart your complete SUSE OpenStack Cloud (after a complete shutdown or a power outage), the services need to be started in the following order:

  1. Control Node/Cluster on which the Database is deployed

  2. Control Node/Cluster on which RabbitMQ is deployed

  3. Control Node/Cluster on which Keystone is deployed

  4. Any remaining Control Node/Cluster. The following additional rules apply:

    • The Control Node/Cluster on which the neutron-server role is deployed needs to be started before starting the node/cluster on which the neutron-l3 role is deployed.

    • The Control Node/Cluster on which the nova-controller role is deployed needs to be started before starting the node/cluster on which Heat is deployed.

  5. Compute Nodes

If multiple roles are deployed on a single Control Node, the services are automatically started in the correct order on that node. If you have more than one node on which multiple roles are installed, make sure they are started so that the order listed above is observed as closely as possible.
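The ordering rules above can be sketched as a small shell script. All host and cluster names are placeholders, and the actual power-on mechanism (IPMI, a managed PDU, or similar) depends on your environment:

```shell
#!/bin/bash
# Start-up order sketch for SUSE OpenStack Cloud; names are placeholders.
START_ORDER=(
  "db-cluster"          # 1. Database first
  "rabbitmq-cluster"    # 2. RabbitMQ
  "keystone-cluster"    # 3. Keystone
  "neutron-server-node" # 4. neutron-server before neutron-l3 ...
  "neutron-l3-node"
  "nova-controller"     #    ... and nova-controller before Heat
  "heat-node"
  "compute-01"          # 5. Compute Nodes last
  "compute-02"
)

for node in "${START_ORDER[@]}"; do
  echo "Starting $node"
  # Replace with your power-on mechanism, for example:
  #   ipmitool -H "$node-bmc" -U admin -P secret chassis power on
  # and wait until the node's services are up before continuing.
done
```

Shutting down is the same list iterated in reverse.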

If you need to shut down SUSE OpenStack Cloud, the services need to be terminated in the reverse of the start-up order:

  1. Compute Nodes

  2. Control Node/Cluster on which Heat is deployed

  3. Control Node/Cluster on which the nova-controller role is deployed

  4. Control Node/Cluster on which the neutron-l3 role is deployed

  5. All Control Node(s)/Cluster(s) on which none of the following services is deployed: Database, RabbitMQ, and Keystone.

  6. Control Node/Cluster on which Keystone is deployed

  7. Control Node/Cluster on which RabbitMQ is deployed

  8. Control Node/Cluster on which the Database is deployed

11.3 Upgrading from SUSE OpenStack Cloud 5 to SUSE OpenStack Cloud 6

Upgrading from SUSE OpenStack Cloud 5 to SUSE OpenStack Cloud 6 is done via a Web interface that guides you through the process. The process consists of four phases:

  1. Saving the configuration data of your SUSE OpenStack Cloud 5 installation in a data dump.

  2. Re-Installing and setting up the Administration Server with SUSE OpenStack Cloud 6.

  3. Upgrading all nodes to SUSE Linux Enterprise Server 12 SP1 and SUSE OpenStack Cloud 6.

  4. Re-applying the barclamps.

11.3.1 Requirements

Before you start upgrading SUSE OpenStack Cloud, make sure the following requirements are met:

  • The Administration Server needs to have the latest SUSE OpenStack Cloud 5 updates installed. One of these updates will add the new upgrade routine to the Crowbar Web interface.

  • All other nodes need to have the latest SUSE OpenStack Cloud 5 updates and the latest SLES updates. If this is not the case, refer to Section 9.4.1, “Deploying Node Updates with the Updater Barclamp” for instructions.

  • All allocated nodes need to be turned on.

  • During the upgrade of the Control Nodes and the Compute Nodes, the instances need to be shut down. However, it is not necessary to do so at the beginning of the upgrade procedure. This step can be postponed until after the Administration Server has been upgraded to SUSE OpenStack Cloud 6 to keep the downtime as short as possible.

    Important
    Important: Instances Running on HyperV Nodes Will Not Survive an Upgrade

    As of SUSE OpenStack Cloud 6, HyperV Nodes need to be re-installed after the upgrade procedure. This re-installation overwrites the instances' data, which will therefore be lost. KVM, VMware and Xen instances are not affected.

    Tip
    Tip: Back Up the Administration Server

    It is strongly recommended to create a backup of the Administration Server before starting the upgrade procedure, to be able to restore the server in case the upgrade fails. Refer to the chapter Backing Up and Restoring the Administration Server in the SUSE Cloud 5 documentation for instructions.

11.3.2 The Upgrade Procedure

To start the upgrade procedure, proceed as follows:

Procedure 11.1: Upgrade Part 1: Create the Upgrade Data
  1. Open a browser and point it to the Crowbar Web interface, for example http://192.168.124.10/. Log in as user crowbar. The password is crowbar, if you have not changed the default.

  2. Open Utilities › Upgrade to Cloud 6.

  3. Follow the instructions in the Web interface to create and save the upgrade data. Part 1 of the upgrade procedure is finished when you have saved the data.

    Important
    Important: Location for Saving the Upgrade Data

    Make sure to save the upgrade data to a location that can be accessed from the Administration Server after having re-installed it. Do not save it on the Administration Server itself, since it might get overwritten when re-installing the machine.

When the upgrade data has been saved, the Administration Server needs to be re-installed with SUSE OpenStack Cloud 6 on SUSE Linux Enterprise Server 12 SP1:

Procedure 11.2: Part 2: Re-Installing the Administration Server
  1. Check the network configuration of the Administration Server with the command ifconfig. Note the MAC address and the IP address of the interface named eth0. Also note the IP addresses and ranges of all SUSE OpenStack Cloud networks. You can find them either in /etc/crowbar/network.json or in the Networks section in YaST Crowbar (see Section 7.2, “Networks” for details).

    Warning
    Warning: No Parallel Setup

    It is not possible to set up a second machine, install SUSE OpenStack Cloud 6 on it, and then swap the new machine in for the old one. The MAC addresses of the network interfaces need to be the same before and after the upgrade.
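The information requested in this step can be collected as follows. The interface name eth0 is as stated above; the target host for the safety copy is a placeholder:

```shell
# Record the MAC and IPv4 address of eth0
cat /sys/class/net/eth0/address
ip -4 addr show eth0

# Keep a copy of the Crowbar network definitions off the
# Administration Server (backup-host is a placeholder)
scp /etc/crowbar/network.json user@backup-host:/safe/location/
```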

  2. Reboot the Administration Server from a SUSE Linux Enterprise Server 12 SP1 installation source and install the operating system plus SUSE OpenStack Cloud 6 as an add-on product. For details, see Chapter 3, Installing the Administration Server.

    Tip
    Tip: Deleting Unused Mirror Data

    SUSE OpenStack Cloud 6 does not use any of the repositories that were required for SUSE OpenStack Cloud 5. In case you have mirrored repositories to the Administration Server and /srv resides on a separate partition, it is safe to format this partition to free space for the new repositories.
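A minimal sketch of re-creating the file system, assuming /srv resides on the device /dev/sdb1 (device name and file system type are placeholders for your setup):

```shell
# WARNING: this destroys all data on the partition
umount /srv
mkfs.xfs -f /dev/sdb1   # or mkfs.ext4, matching your previous setup
mount /dev/sdb1 /srv
```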

  3. Optional: If you have installed a local SMT server, configure it as described in Section 4.2, “SMT Configuration”. Make sure the repositories are set up and mirrored as described in Section 4.3, “Setting up Repository Mirroring on the SMT Server”.

  4. Make sure all required repositories are made available as described in Chapter 5, Software Repository Setup.

  5. Configure the network of the Administration Server as described in Chapter 6, Service Configuration: Administration Server Network Configuration. Make sure to use the exact same settings as in the previous installation.

  6. Configure SUSE OpenStack Cloud with YaST Crowbar as described in Chapter 7, Crowbar Setup. Make sure to configure the exact same network settings for Crowbar as in the previous installation.

  7. The Administration Server setup is finished as soon as you have finished the configuration with YaST Crowbar. Do not start the regular SUSE OpenStack Cloud Crowbar installation!

When the Administration Server has been set up and configured, return to the upgrade Web interface to upgrade all nodes in SUSE OpenStack Cloud:

Procedure 11.3: Part 3: Upgrading the Nodes
  1. Open a browser and point it to the Crowbar Web interface available on the Administration Server, for example http://192.168.124.10/.

    The SUSE OpenStack Cloud Installer
    Figure 11.1: The SUSE OpenStack Cloud Installer
  2. Choose Continue Upgrade from SUSE OpenStack Cloud 5 and start the upgrade process by uploading the upgrade data downloaded in Part 1 of the upgrade procedure. Follow the on-screen instructions to finish the upgrade process. Depending on the number of nodes in your installation, this will take up to several hours.

    Note
    Note: Login Credentials

    During the upgrade procedure you will be asked to provide login credentials for the Crowbar Web interface two times. The first time, provide the default login credentials (crowbar/crowbar). The second time, specify the ones you used with Cloud 5. These credentials are also the ones you need to provide for subsequent logins to the Crowbar Web interface.

When all nodes have been upgraded, the barclamps need to be re-applied:

Procedure 11.4: Part 4: Re-Applying the barclamps
  1. Go to Nodes › Dashboard on the Crowbar Web interface and check whether all nodes have been successfully upgraded—all nodes should be listed in state Ready, indicated by a green dot.

  2. If nodes have not been upgraded successfully, they are marked with a yellow or gray dot. Log in to those nodes (see How can I log in to a node as root?) and check the log files (see Appendix A, Log Files) for the reasons. Fix the issues and reboot the node to restart the upgrade process. For more information, also refer to What to do if a node is reported to be in the state Problem? and What to do if a node hangs at.

  3. When all nodes have been upgraded successfully, re-apply the barclamps. Go to Barclamps › All Barclamps and apply the barclamps in the given order. For each barclamp, the service configuration and the deployment configuration are the same as on SUSE OpenStack Cloud 5, since they were restored from the data dump.

  4. When all barclamps have been successfully deployed, you can restart the instances on the Compute Nodes.

11.4 Upgrading to an HA Setup

When making an existing SUSE OpenStack Cloud deployment highly available (by setting up HA clusters and moving roles to these clusters), there are a few issues to pay attention to. To make existing services highly available, proceed as follows. Note that moving to an HA setup cannot be done without SUSE OpenStack Cloud service interruption, because it requires OpenStack services to be restarted.

Important
Important: Teaming Network Mode is Required for HA

Teaming network mode is required for an HA setup of SUSE OpenStack Cloud. If you are planning to move your cloud to an HA setup at a later point in time, make sure to deploy SUSE OpenStack Cloud with teaming network mode from the beginning. Otherwise a migration to an HA setup is not supported.

  1. Make sure to have read the sections Section 1.5, “HA Setup” and Section 2.6, “High Availability” of this manual and taken any appropriate action.

  2. Make the HA repositories available on the Administration Server as described in Section 5.2, “Update and Pool Repositories”. Run the command chef-client afterwards.

  3. Set up your cluster(s) as described in Section 10.2, “Deploying Pacemaker (Optional, HA Setup Only)”.

  4. To move a particular role from a regular control node to a cluster, you need to stop the associated service(s) before re-deploying the role on a cluster:

    1. Log in to each node on which the role is deployed and stop its associated service(s) (a role can have multiple services). Do so by running the service's start/stop script with the stop argument, for example:

      rcopenstack-keystone stop

      See Appendix C, Roles and Services in SUSE OpenStack Cloud for a list of roles, services and start/stop scripts.

    2. The following roles need additional treatment:

      database-server (Database barclamp)
      1. Stop the database on the node on which the Database barclamp is deployed with the command:

        rcpostgresql stop
      2. Copy /var/lib/pgsql to a temporary location on the node, for example:

        cp -ax /var/lib/pgsql /tmp
      3. Redeploy the Database barclamp to the cluster. The original node may also be part of this cluster.

      4. Log in to a cluster node and run the following command to determine which cluster node runs the postgresql service:

        crm_mon -1
      5. Log in to the cluster node running postgresql.

      6. Stop the postgresql service:

        crm resource stop postgresql
      7. Copy the data backed up earlier to the cluster node:

        rsync -av --delete NODE_WITH_BACKUP:/tmp/pgsql/ /var/lib/pgsql/
      8. Restart the postgresql service:

        crm resource start postgresql

      Copy the content of /var/lib/pgsql/data/ from the original database node to the cluster node with DRBD or shared storage.

      keystone-server (Keystone barclamp)

      If using Keystone with PKI tokens, the PKI keys on all nodes need to be re-generated. This can be achieved by removing the contents of /var/cache/*/keystone-signing/ on the nodes. Use a command similar to the following on the Administration Server as root:

      for NODE in NODE1 NODE2 NODE3; do
        ssh $NODE rm /var/cache/*/keystone-signing/*
      done
  5. Go to the barclamp featuring the role you want to move to the cluster. From the left side of the Deployment section, remove the node the role is currently running on. Replace it with a cluster from the Available Clusters section. Then apply the proposal and verify in the Crowbar Web interface that it was applied successfully. You can also check the cluster status via Hawk or the crm / crm_mon CLI tools.

  6. Repeat these steps for all roles you want to move to a cluster. See Section 2.6.2.1, “Control Node(s)—Avoiding Points of Failure” for a list of services with HA support.

Important
Important: SSL Certificates

Moving to an HA setup also requires creating SSL certificates for nodes in the cluster that run services using SSL. Certificates need to be issued for the generated names (see Important: Proposal Name) and for all public names you have configured in the cluster.
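A key and a certificate signing request for such a name can be generated with openssl; cluster-data.example.com is a placeholder for one of the generated or public names:

```shell
# Generate a 2048-bit key and a CSR for the cluster name
# (cluster-data.example.com is a placeholder); submit the CSR to
# your CA and deploy the issued certificate on the cluster nodes
openssl req -new -newkey rsa:2048 -nodes \
  -keyout cluster.key -out cluster.csr \
  -subj "/CN=cluster-data.example.com"
```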

Important
Important: Service Management on the Cluster

After a role has been deployed on a cluster, its services are managed by the HA software. You must never manually start or stop an HA-managed service or configure it to start on boot. Services may only be started or stopped by using the cluster management tools Hawk or the crm shell. See http://www.suse.com/documentation/sle-ha-12/book_sleha/data/sec_ha_config_basics_resources.html for more information.
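For example, to inspect and manage an HA-managed resource from a cluster node (resource name postgresql as in the Database barclamp steps above):

```shell
crm_mon -1                       # one-shot overview of cluster status
crm resource status postgresql   # status of a single resource
crm resource restart postgresql  # restart under cluster control
```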

11.5 Backing Up and Restoring the Administration Server

Backing up and restoring the Administration Server can be done either via the Crowbar Web interface or on the Administration Server's command line via the crowbarctl backup command. Both tools provide the same functionality.

11.5.1 Backup and Restore via the Crowbar Web interface

To use the Web interface for backing up and restoring the Administration Server, go to the Crowbar Web interface on the Administration Server, for example http://192.168.124.10/. Log in as user crowbar. The password is crowbar, unless you have changed the default. Go to Utilities › Backup & Restore.

Backup and Restore: Initial Page View
Figure 11.2: Backup and Restore: Initial Page View

To create a backup, click the respective button. Provide a descriptive name (allowed characters are letters, numbers, dashes and underscores) and confirm with Create Backup. Alternatively, you can upload a backup, for example from a previous installation.

Existing backups are listed with name and creation date. For each backup, three actions are available:

Download

Download a copy of the backup file. The TAR archive you receive with this download can be uploaded again via Upload Backup Image.

Restore

Restore the backup.

Delete

Delete the backup.

Backup and Restore: List of Backups
Figure 11.3: Backup and Restore: List of Backups

11.5.2 Backup and Restore via the Command Line

Backing up and restoring the Administration Server from the command line can be done with the command crowbarctl backup. For general help, run the command crowbarctl --help backup; help on a subcommand is available by running crowbarctl SUBCOMMAND --help. The following commands for creating and managing backups exist:

crowbarctl backup create NAME

Create a new backup named NAME. It will be stored at /var/lib/crowbar/backup.

crowbarctl backup restore [--yes] NAME

Restore the backup named NAME. You will be asked for confirmation before any existing proposals are overwritten. With the option --yes, confirmations are turned off and the restore is forced.

crowbarctl backup delete NAME

Delete the backup named NAME.

crowbarctl backup download NAME [FILE]

Download the backup named NAME. If you specify the optional [FILE], the download is written to the specified file. Otherwise it is saved to the current working directory with an automatically generated file name. If specifying - for [FILE], the output is written to STDOUT.

crowbarctl backup list

List existing backups. You can optionally specify different output formats and filters—refer to crowbarctl backup list --help for details.

crowbarctl backup upload FILE

Upload a backup from FILE.
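A typical session could look as follows. The backup name is an example, and the restore subcommand spelling is assumed from the description above:

```shell
# Create a backup, list it, and download it to a file
crowbarctl backup create pre-maintenance
crowbarctl backup list
crowbarctl backup download pre-maintenance /tmp/pre-maintenance.tar.gz

# Later: upload and restore it on a freshly installed server
crowbarctl backup upload /tmp/pre-maintenance.tar.gz
crowbarctl backup restore --yes pre-maintenance
```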
