Patching Oracle Exalogic - Updating the ZFS 7320 Storage Appliance II
Author: Jos Nijhoff | Category: Oracle
Part 3b
In my previous post we checked the current software versions on the storage, started the rolling upgrade process for the ZFS 7320 storage appliance and upgraded the ILOM of storage head 2. Now we will finish what we started by performing step 3, upgrading the storage software to version 2011.1.1.0, which is needed for Exalogic versions 1.0.0.0.5 and 2.0.0.0.0.
1.6 Upgrading storage head 2 continued
Let’s see where we were in the upgrade guide, section 2.4.3:
- Step 3 : Upgrading ZFS Storage 7320 Software to version 2011.1.1.0.
This section describes how to upgrade the ZFS Storage 7320 software to version 2011.1.1.0 (2011.04.24.1.0,1-1.8). Ensure that the storage head is running version 2010.Q3.2.1 (2010.08.17.2.1,1-1.21) or higher before proceeding with the upgrade to 2011.1.1.0. Also, ensure that the ILOM is upgraded, and that ILOM/BIOS is running version 3.0.16.10, build r65138, before applying this software update.
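The version prerequisite above can be checked mechanically. The following is a minimal sketch, assuming that the numeric fields of an appliance version string can be compared left to right; that ordering is my assumption, not something the upgrade guide states, and the function names are made up for illustration.

```python
import re

def parse_ak_version(version: str) -> tuple:
    """Turn '2010.08.17.2.1,1-1.21' into a tuple of ints for comparison."""
    # Drop an optional 'ak-nas@' or 'ak/nas@' prefix, as seen in the listings.
    version = version.split("@")[-1]
    return tuple(int(part) for part in re.findall(r"\d+", version))

# Minimum release named in the upgrade guide: 2010.Q3.2.1.
MINIMUM = parse_ak_version("2010.08.17.2.1,1-1.21")

def meets_minimum(version: str) -> bool:
    """True if the version is at or above the minimum (assumed ordering)."""
    return parse_ak_version(version) >= MINIMUM

if __name__ == "__main__":
    # The release that storage head 2 was running before the upgrade.
    print(meets_minimum("ak-nas@2010.08.17.3.0,1-1.25"))
```

In our case the head was running 2010.08.17.3.0,1-1.25, which passes the check.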
To upgrade the ZFS storage 7320 software to version 2011.1.1.0, complete the following steps:
1. Log in to ILOM of the storage head where you updated the ILOM, as root:
% ssh root@xxxxsn2-c.qualogy.com
Password:
Oracle(R) Integrated Lights Out Manager
Version 3.0.16.10 r65138
Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started. To stop, type ESC (
...
...
2. Check if this storage head has network connectivity.
xxxxsn2:configuration cluster resources> ping xxxxsn1.qualogy.com
3. Run the following command:
xxxxsn2:configuration cluster resources> cd /
xxxxsn2:> maintenance system updates download
xxxxsn2:maintenance system updates download (uncommitted)>
Ensure that you have created the patches share on the Sun ZFS Storage 7320 appliance, and enabled the FTP service on the share with the permission for root access. Configure to download the new software from <ftp URL to ak-nas-2011-04-24-1-0-1-1-8-nd.pkg.gz> by using the set url, set user, and set password commands as follows:
set url=ftp://<storage VIP address>//export/common/patches/todo/13795376/Infrastructure/2.0.0.0.0/ZFS_Storage_7320/Software/2011.1.1.0/ak-nas-2011-04-24-1-0-1-1-8-nd.pkg.gz
url = ftp://192.168.xx.yy//export/common/patches/todo/13795376/Infrastructure/2.0.0.0.0/ZFS_Storage_7320/Software/2011.1.1.0/ak-nas-2011-04-24-1-0-1-1-8-nd.pkg.gz
xxxxsn2:maintenance system updates download (uncommitted)> set user=patcher
user = patcher
xxxxsn2:maintenance system updates download (uncommitted)> set password
Enter password:
password = **********

Now we have to actually start the download of the new package:

xxxxsn2:maintenance system updates download (uncommitted)> commit
Transferred 681M of 703M (96.9%) ... done
Unpacking ... done

Check the list of past and present updates:

xxxxsn2:maintenance> system updates
xxxxsn2:maintenance system updates> show
Updates:

UPDATE                          DATE                  STATUS
ak-nas@2010.08.17.1.1,1-1.16    2010-11-1 12:46:16    previous
ak-nas@2010.08.17.2.1,1-1.21    2011-3-10 23:49:47    previous
ak-nas@2010.08.17.3.0,1-1.25    2011-4-29 15:48:52    current
ak-nas@2011.04.24.1.0,1-1.8     2011-12-21 22:32:50   waiting

Deferred updates:

may have shared resources for which deferred updates are available. After all updates are completed, check both cluster peers for any deferred updates.
Then select the newly downloaded version and start the upgrade:
xxxxsn2:maintenance system updates> select ak-nas@2011.04.24.1.0,1-1.8
xxxxsn2:maintenance system updates ak-nas@2011.04.24.1.0,1-1.8> upgrade
The selected software update requires a system reboot in order to take effect. The system will automatically reboot at the end of the update process. The update will take several minutes. At any time during this process, you can cancel the update with [Control-C].

Are you sure? (Y/N) Y
Updating from ... ak/nas@2010.08.17.3.0,1-1.25
Loading media metadata ... done.
Selecting alternate product ... SUNW,maguro_plus
Installing Sun ZFS Storage 7320 2011.04.24.1.0,1-1.8
pkg://sun.com/ak/SUNW,maguro_plus@2011.04.24.1.0,1-1.8:20111221T223250Z
Creating system/ak-nas-2011.04.24.1.0_1-1.8 ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/boot ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/root ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/svc ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/var ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/home ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/stash ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/wiki ... done.
Customizing Solaris ... done.
Updating vfstab ... done.
Generating usr/man windex ... done.
Generating usr/sfw/man windex ... done.
Preserving ssh keys ... done.
...
...
Installing firmware ... done.
Installing device links ...
Installing device files ...
Updating device links ... done.
Updating /etc ... done.
Building boot menu ... done.
Installing boot unix ... done.
Installing boot amd64/unix ... done.
Installing boot menu ... done.
Snapshotting zfs filesystems ... done.
Installing grub on /dev/rdsk/c2t1d0s0 ... done.
Installing grub on /dev/rdsk/c2t0d0s0 ... done.
Update completed; rebooting.

xxxxsn2 console login: root
syncing file systems... done
rebooting...

SunOS Release 5.11 Version ak/generic@2011.04.24.1.0,1-1.8 64-bit
Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.
System update in progress.
Updating from: ak/nas@2010.08.17.3.0,1-1.25
Updating to: ak/nas@2011.04.24.1.0,1-1.8
Cloning active datasets ...... done.
Upgrading /var/ak/home ... 16 blocks
Upgrading /etc/svc/profile ... 176 blocks
Upgrading /var/apache2 ... 4416 blocks
Upgrading /var/sadm ... 6240 blocks
Upgrading /var/svc ... 64 blocks
Upgrading /var/dhcp/duid ... done.
Upgrading /var/crypto/pkcs11.conf ... done.
Updating system logs ... done.
Starting configd ... done.
Scanning manifests ... done.
...
...
Refreshing system/identity:node ... done.
Refreshing system/name-service-cache:default ... done.
Refreshing system/ndmpd:default ... done.
Applying service layer ak_nas ... done.
Applying service layer ak_SUNW,maguro_plus ... done.
Refreshing appliance/kit/identity:default ... done.
Applying service profile generic ... done.
Applying profile upgrade/akinstall.xml ... done.
Applying layer upgrade/composite.svc ... done.
Cleaning up services ... done.
Shutting down configd ... done.
Configuring devices.
Configuring network devices ... done.

Sun ZFS Storage 7320 Version ak/SUNW,maguro_plus@2011.04.24.1.0,1-1.8
Copyright 2012 Oracle All rights reserved.
OK, we have now upgraded the storage software on storage head 2. Now log back on to its CLI console. We see a warning that the two heads are now running different software versions (as we have not upgraded the active head 1 yet).
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started. To stop, type ESC (

xxxxsn2 console login: root
Password:
Last login: Tue Mar 20 13:34:45 on console

This controller is running a different software version from its cluster peer. Configuration changes made to either controller will not be propagated to its peer while in this state, and may be undone when the two software versions are synchronized. Please see the appliance documentation for more information.
This message is also shown when logging in to the storage web interface (figure 3), as a reminder that we should not leave it at this:
Check the current version:
xxxxsn2:> maintenance system updates
xxxxsn2:maintenance system updates> show
Updates:

UPDATE                          DATE                  STATUS
ak-nas@2010.08.17.1.1,1-1.16    2010-11-1 12:46:16    previous
ak-nas@2010.08.17.2.1,1-1.21    2011-3-10 23:49:47    previous
ak-nas@2010.08.17.3.0,1-1.25    2011-4-29 15:48:52    previous
ak-nas@2011.04.24.1.0,1-1.8     2011-12-21 22:32:50   current

Deferred updates:

may have shared resources for which deferred updates are available. After all updates are completed, check both cluster peers for any deferred updates.
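If you capture the output of show to a file, confirming which update is current does not have to be done by eye. Below is a minimal sketch that parses a listing like the one above; the regular expression and the function names are my own and only assume the three-column layout shown in this post.

```python
import re

# One row of the listing: update name, date and time, status.
UPDATE_LINE = re.compile(
    r"^\s*(?P<update>ak-nas@\S+)\s+"
    r"(?P<date>\d{4}-\d{1,2}-\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<status>\w+)\s*$"
)

def parse_updates(show_output: str) -> dict:
    """Map each update name to its status (previous, current or waiting)."""
    statuses = {}
    for line in show_output.splitlines():
        match = UPDATE_LINE.match(line)
        if match:
            statuses[match.group("update")] = match.group("status")
    return statuses

def current_update(show_output: str):
    """Return the name of the update marked 'current', or None."""
    for name, status in parse_updates(show_output).items():
        if status == "current":
            return name
    return None

# Listing copied from the console session after the upgrade of head 2.
SAMPLE = """\
UPDATE                          DATE                  STATUS
ak-nas@2010.08.17.1.1,1-1.16    2010-11-1 12:46:16    previous
ak-nas@2010.08.17.2.1,1-1.21    2011-3-10 23:49:47    previous
ak-nas@2010.08.17.3.0,1-1.25    2011-4-29 15:48:52    previous
ak-nas@2011.04.24.1.0,1-1.8     2011-12-21 22:32:50   current
"""

if __name__ == "__main__":
    print(current_update(SAMPLE))
```

The same parser can be run against the listing taken before the upgrade, where the new package should show up with status waiting.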
OK, storage head 2 is ready and we should now do a switchover to free head 1 from active duty and upgrade it next.
1.7 Doing the switchover between heads 1 and 2
The document says only this:
Now you can perform a takeover operation, as required, depending on your choice of storage head to serve as the active storage head. Ensure that one of the storage heads is in the Active (takeover completed) state, and the other is in the Ready (waiting for failback) state. This process completes software upgrade on the ZFS storage appliance.
OK, but how can we perform the takeover from 1 to 2? The takeover can be done via the webGUI, which I will demonstrate in a future post. Here’s how to do it from the CLI:
xxxxsn2:maintenance system updates> cd /
xxxxsn2:> configuration cluster
xxxxsn2:configuration cluster> show
Properties:
    state = AKCS_STRIPPED
    description = Ready (waiting for failback)
    peer_asn = a54b53a0-afba-eae1-a77a-b0013813b629
    peer_hostname = xxxxsn1
    peer_state = AKCS_OWNER
    peer_description = Active (takeover completed)

Children:
    resources => Configure resources

xxxxsn2:configuration cluster> help
Subcommands that are valid in this context:

    resources => Configure resources
    it must be one of "builtins", "commands", "general", "help", "script" or "properties".
    show => Show information pertinent to the current context
    done => Finish operating on "cluster"
    get [prop] => Get value for property [prop]. ("help properties" returns values for all properties.
    setup => Run through initial cluster setup
    failback => Fail back all resources assigned to the cluster peer
    takeover => Take over all resources assigned to the cluster peer
    links => Report the state of the cluster links

xxxxsn2:configuration cluster> takeover
Continuing will immediately take over the resources assigned to the cluster peer. This may result in clients experiencing a slight delay in service.

Are you sure? (Y/N)
xxxxsn2:configuration cluster>
xxxxsn2:configuration cluster> show
Properties:
    state = AKCS_OWNER
    description = Active (takeover completed)
    peer_asn = a54b53a0-afba-eae1-a77a-b0013813b629
    peer_hostname = xxxxsn1
    peer_state =
    peer_description = Unknown (disconnected or restarting)

Children:
    resources => Configure resources
Storage head 1 will now restart and rejoin the cluster, this time as the standby head. After waiting a little (30 seconds or so), check the status again:
xxxxsn2:configuration cluster> show
Properties:
    state = AKCS_OWNER
    description = Active (takeover completed)
    peer_asn = a54b53a0-afba-eae1-a77a-b0013813b629
    peer_hostname = xxxxsn1
    peer_state = AKCS_STRIPPED
    peer_description = Ready (waiting for failback)

Children:
    resources => Configure resources
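Instead of waiting and retyping show by hand, the wait can be expressed as a small polling loop. This is only a sketch: get_peer_state is a hypothetical callable that you would have to supply yourself (for example something that runs show on the active head and extracts peer_state), and the timeout and interval values are arbitrary.

```python
import time

def wait_for_peer_state(get_peer_state, wanted="AKCS_STRIPPED",
                        timeout=300.0, interval=10.0, sleep=time.sleep):
    """Poll until the peer reports the wanted state or the timeout expires.

    get_peer_state is a caller-supplied function returning the current
    peer_state string; how it talks to the appliance is not shown here.
    """
    waited = 0.0
    while True:
        if get_peer_state() == wanted:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval

if __name__ == "__main__":
    # Simulated sequence: the peer is restarting twice, then comes back.
    states = iter(["", "", "AKCS_STRIPPED"])
    print(wait_for_peer_state(lambda: next(states), sleep=lambda s: None))
```

Passing the sleep function in as a parameter keeps the loop easy to try out without actually waiting.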
Now check the status on head 1 as well:
xxxxsn1:configuration cluster> show
Properties:
    state = AKCS_STRIPPED
    description = Ready (waiting for failback)
    peer_asn = 9faf8ff1-c3a8-c090-8f4e-9871618a152e
    peer_hostname = xxxxsn2
    peer_state = AKCS_OWNER
    peer_description = Active (takeover completed)

Children:
    resources => Configure resources
Now that we have done the takeover, we can perform the same upgrade steps 2 and 3 on head 1 as well: first the ILOM upgrade (see previous post) and then the storage upgrade. As it is the same routine as before, I will not show it here.
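The condition from the guide (one head Active, the other Ready) can be verified by comparing the show output of both heads. Here is a minimal sketch, assuming the "key = value" layout seen in the sessions above; the function names are illustrative and nothing here talks to the appliance itself.

```python
def parse_properties(show_output: str) -> dict:
    """Collect 'key = value' pairs from a 'configuration cluster show' listing."""
    props = {}
    for line in show_output.splitlines():
        if "=>" in line:  # skip child entries such as 'resources => ...'
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props

def takeover_complete(active_show: str, standby_show: str) -> bool:
    """True if both heads agree on who is the owner and who is stripped."""
    active = parse_properties(active_show)
    standby = parse_properties(standby_show)
    return (active.get("state") == "AKCS_OWNER"
            and active.get("peer_state") == "AKCS_STRIPPED"
            and standby.get("state") == "AKCS_STRIPPED"
            and standby.get("peer_state") == "AKCS_OWNER")

# Output copied from head 2 (active) and head 1 (standby) after the takeover.
HEAD2 = """\
state = AKCS_OWNER
description = Active (takeover completed)
peer_hostname = xxxxsn1
peer_state = AKCS_STRIPPED
peer_description = Ready (waiting for failback)
resources => Configure resources
"""
HEAD1 = """\
state = AKCS_STRIPPED
description = Ready (waiting for failback)
peer_hostname = xxxxsn2
peer_state = AKCS_OWNER
peer_description = Active (takeover completed)
resources => Configure resources
"""

if __name__ == "__main__":
    print(takeover_complete(HEAD2, HEAD1))
```

Checking both sides guards against the intermediate situation shown earlier, where the peer state is still empty while head 1 is restarting.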
1.8 Conclusion
We have demonstrated that we can upgrade both the network and the storage infrastructure in our Exalogic quarter rack in a rolling fashion, without interrupting these services, so availability is maintained throughout!
1.9 Next time
In a following post, we will move on to the next part of patching the Exalogic infrastructure: upgrading the OS image on the compute nodes.
Publication date: 29 August 2012