
Patching Oracle Exalogic - Updating the ZFS 7320 Storage Appliance II

Published: 29 August 2012 Author: Jos Nijhoff Category: Oracle

Part 3b

In my previous post we checked the current software versions on the storage, started the rolling upgrade process for the ZFS 7320 storage appliance and upgraded the ILOM of storage head 2. Now we will finish what we started by performing step 3: upgrading the storage software to version 2011.1.1.0, which is needed for Exalogic versions 1.0.0.0.5 and 2.0.0.0.0.

1.6 Upgrading storage head 2 continued

Let’s see where we were in the upgrade guide, section 2.4.3:

Step 3 : Upgrading ZFS Storage 7320 Software to version 2011.1.1.0.

This section describes how to upgrade the ZFS Storage 7320 software to version 2011.1.1.0 (2011.04.24.1.0,1-1.8). Ensure that the storage head is running version 2010.Q3.2.1 (2010.08.17.2.1,1-1.21) or higher before proceeding with the upgrade to 2011.1.1.0. Also ensure that the ILOM is upgraded, and that ILOM/BIOS is running version 3.0.16.10, build r65138, before applying this software update.
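The "or higher" precondition above can be checked mechanically. The following is only a minimal sketch: it assumes that the build strings order correctly when every numeric component is compared left to right, which holds for the versions in this post but is not something the upgrade guide states. The function names are made up for illustration.

```python
import re


def parse_ak_version(version: str) -> tuple[int, ...]:
    """Turn '2010.08.17.2.1,1-1.21' into (2010, 8, 17, 2, 1, 1, 1, 21)."""
    # Drop an optional 'ak-nas@' style prefix, as seen in the update list.
    _, _, tail = version.rpartition("@")
    return tuple(int(part) for part in re.findall(r"\d+", tail))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Assumption: numeric, component-wise comparison reflects release order."""
    return parse_ak_version(installed) >= parse_ak_version(minimum)


MINIMUM = "2010.08.17.2.1,1-1.21"  # 2010.Q3.2.1, per the upgrade guide

print(meets_minimum("ak-nas@2010.08.17.3.0,1-1.25", MINIMUM))  # True
print(meets_minimum("ak-nas@2010.08.17.1.1,1-1.16", MINIMUM))  # False
```

The first call uses the release that storage head 2 is running before this upgrade, so it passes the check; the second is an older release from the update history and would not.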

To upgrade the ZFS storage 7320 software to version 2011.1.1.0, complete the following steps:

1. Log in to ILOM of the storage head where you updated the ILOM, as root:

% ssh root@xxxxsn2-c.qualogy.com
Password:
Oracle(R) Integrated Lights Out Manager
Version 3.0.16.10 r65138
Copyright (c) 2011, Oracle and/or its affiliates. All rights reserved.
-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started.  To stop, type ESC (
...
...

	

2. Check if this storage head has network connectivity.

xxxxsn2:configuration cluster resources> ping xxxxsn1.qualogy.com
xxxxsn1.qualogy.com is alive

3. Run the following command:

xxxxsn2:configuration cluster resources> cd /
xxxxsn2:> maintenance system updates download
xxxxsn2:maintenance system updates download (uncommitted)>

	

Ensure that you have created the patches share on the Sun ZFS Storage 7320 appliance, and enabled the FTP service on the share with the permission for root access. Configure the download of the new software from <ftp URL to ak-nas-2011-04-24-1-0-1-1-8-nd.pkg.gz> by using the set url, set user, and set password commands as follows:

set url=ftp://<storage VIP address>//export/common/patches/todo/13795376/Infrastructure/2.0.0.0.0/ZFS_Storage_7320/Software/2011.1.1.0/ak-nas-2011-04-24-1-0-1-1-8-nd.pkg.gz
url = ftp://192.168.xx.yy//export/common/patches/todo/13795376/Infrastructure/2.0.0.0.0/ZFS_Storage_7320/Software/2011.1.1.0/ak-nas-2011-04-24-1-0-1-1-8-nd.pkg.gz
xxxxsn2:maintenance system updates download (uncommitted)> set user=patcher
user = patcher
xxxxsn2:maintenance system updates download (uncommitted)> set password
Enter password:
password = **********

Now we have to actually start the download of the new package:

xxxxsn2:maintenance system updates download (uncommitted)> commit
Transferred 681M of 703M (96.9%) ... done
Unpacking ... done

Check the list of past and present updates:

xxxxsn2:maintenance> system updates
xxxxsn2:maintenance system updates> show
Updates:
UPDATE                           DATE                      STATUS
ak-nas@2010.08.17.1.1,1-1.16     2010-11-1 12:46:16        previous
ak-nas@2010.08.17.2.1,1-1.21     2011-3-10 23:49:47        previous
ak-nas@2010.08.17.3.0,1-1.25     2011-4-29 15:48:52        current
ak-nas@2011.04.24.1.0,1-1.8      2011-12-21 22:32:50       waiting
Deferred updates:
The appliance is currently configured as part of a cluster. The cluster peer
may have shared resources for which deferred updates are available. After all
updates are completed, check both cluster peers for any deferred updates.
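If you capture this listing in a script, it is easy to confirm that the new package is really staged before you select it. The sketch below parses a pasted copy of the table above. The regular expression and function name are my own, and the column layout is assumed from this one transcript rather than from any documented format.

```python
import re

SAMPLE = """\
Updates:
UPDATE                           DATE                      STATUS
ak-nas@2010.08.17.1.1,1-1.16     2010-11-1 12:46:16        previous
ak-nas@2010.08.17.2.1,1-1.21     2011-3-10 23:49:47        previous
ak-nas@2010.08.17.3.0,1-1.25     2011-4-29 15:48:52        current
ak-nas@2011.04.24.1.0,1-1.8      2011-12-21 22:32:50       waiting
Deferred updates:
"""

# Assumed row layout: update name, "date time", status word.
ROW = re.compile(r"^(ak-nas@\S+)\s+(\S+ \S+)\s+(\w+)\s*$")


def parse_updates(text: str) -> dict[str, str]:
    """Map each update name to the value in its STATUS column."""
    statuses = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            name, _date, status = match.groups()
            statuses[name] = status
    return statuses


updates = parse_updates(SAMPLE)
target = "ak-nas@2011.04.24.1.0,1-1.8"
if updates.get(target) != "waiting":
    raise SystemExit(f"{target} is not staged; status: {updates.get(target)}")
print("current:", [name for name, status in updates.items() if status == "current"])
print("staged: ", target)
```

The same parser can be reused after the reboot to confirm that the target release has moved from waiting to current.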

Then select the newly downloaded version and start the upgrade:

xxxxsn2:maintenance system updates> select ak-nas@2011.04.24.1.0,1-1.8
xxxxsn2:maintenance system updates ak-nas@2011.04.24.1.0,1-1.8> upgrade
The selected software update requires a system reboot in order to take effect.
The system will automatically reboot at the end of the update process. The
update will take several minutes. At any time during this process, you can
cancel the update with [Control-C].
Are you sure? (Y/N) Y
Updating from ... ak/nas@2010.08.17.3.0,1-1.25
Loading media metadata ... done.
Selecting alternate product ... SUNW,maguro_plus
Installing Sun ZFS Storage 7320 2011.04.24.1.0,1-1.8
pkg://sun.com/ak/SUNW,maguro_plus@2011.04.24.1.0,1-1.8:20111221T223250Z
Creating system/ak-nas-2011.04.24.1.0_1-1.8 ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/boot ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/root ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/svc ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/var ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/home ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/install/stash ... done.
Creating system/ak-nas-2011.04.24.1.0_1-1.8/wiki ... done.
Customizing Solaris ... done.
Updating vfstab ... done.
Generating usr/man windex ... done.
Generating usr/sfw/man windex ... done.
Preserving ssh keys ... done.
...
...
Installing firmware ... done.
Installing device links ... Installing device files ... Updating device links ... done.
Updating /etc ... done.
Building boot menu ... done.
Installing boot unix ... done.
Installing boot amd64/unix ... done.
Installing boot menu ... done.
Snapshotting zfs filesystems ...  done.
Installing grub on /dev/rdsk/c2t1d0s0 ... done.
Installing grub on /dev/rdsk/c2t0d0s0 ... done.
Update completed; rebooting.
xxxxsn2 console login: rootsyncing file systems... done
rebooting...
SunOS Release 5.11 Version ak/generic@2011.04.24.1.0,1-1.8 64-bit
Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.
System update in progress.
Updating from: ak/nas@2010.08.17.3.0,1-1.25
Updating to:   ak/nas@2011.04.24.1.0,1-1.8
Cloning active datasets ...... done.
Upgrading /var/ak/home ... 16 blocks
Upgrading /etc/svc/profile ... 176 blocks
Upgrading /var/apache2 ... 4416 blocks
Upgrading /var/sadm ... 6240 blocks
Upgrading /var/svc ... 64 blocks
Upgrading /var/dhcp/duid ... done.
Upgrading /var/crypto/pkcs11.conf ... done.
Updating system logs ... done.
Starting configd ... done.
Scanning manifests ... done.
...
...
Refreshing system/identity:node ... done.
Refreshing system/name-service-cache:default ... done.
Refreshing system/ndmpd:default ... done.
Applying service layer ak_nas ... done.
Applying service layer ak_SUNW,maguro_plus ... done.
Refreshing appliance/kit/identity:default ... done.
Applying service profile generic ... done.
Applying profile upgrade/akinstall.xml ... done.
Applying layer upgrade/composite.svc ... done.
Cleaning up services ... done.
Shutting down configd ... done.
Configuring devices.
Configuring network devices ... done.
Sun ZFS Storage 7320 Version ak/SUNW,maguro_plus@2011.04.24.1.0,1-1.8
Copyright 2012 Oracle  All rights reserved.
Use is subject to license terms.

OK, we have now upgraded the storage software on storage head 2. Now log back on to its CLI console. We see a warning that the two heads are now running different software versions (as we have not upgraded the active head 1 yet).

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started.  To stop, type ESC (
xxxxsn2 console login: root
Password:
Last login: Tue Mar 20 13:34:45 on console
This controller is running a different software version from its cluster
peer. Configuration changes made to either controller will not be propagated
to its peer while in this state, and may be undone when the two software
versions are synchronized. Please see the appliance documentation for more
information.

This message is also shown when logging in to the storage web interface (figure 3), as a reminder that we should not leave it at this.

Check the current version:

xxxxsn2:> maintenance system updates
xxxxsn2:maintenance system updates> show
Updates:
UPDATE                           DATE                      STATUS
ak-nas@2010.08.17.1.1,1-1.16     2010-11-1 12:46:16        previous
ak-nas@2010.08.17.2.1,1-1.21     2011-3-10 23:49:47        previous
ak-nas@2010.08.17.3.0,1-1.25     2011-4-29 15:48:52        previous
ak-nas@2011.04.24.1.0,1-1.8      2011-12-21 22:32:50       current
Deferred updates:
The appliance is currently configured as part of a cluster. The cluster peer
may have shared resources for which deferred updates are available. After all
updates are completed, check both cluster peers for any deferred updates.

OK, storage head 2 is ready and we should now do a switchover to free head 1 from active duty and upgrade it next.

1.7 Doing the switchover between heads 1 and 2

The document says only this:

Now you can perform a takeover operation, as required, depending on your choice of storage head to serve as the active storage head. Ensure that one of the storage heads is in the Active (takeover completed) state, and the other is in the Ready (waiting for failback) state. This process completes software upgrade on the ZFS storage appliance.

OK, but how can we perform the takeover from 1 to 2? The takeover can be done via the webGUI, which I will demonstrate in a future post. Here’s how to do it from the CLI:

xxxxsn2:maintenance system updates> cd /
xxxxsn2:> configuration cluster
xxxxsn2:configuration cluster> show
Properties:
state = AKCS_STRIPPED
description = Ready (waiting for failback)
peer_asn = a54b53a0-afba-eae1-a77a-b0013813b629
peer_hostname = xxxxsn1
peer_state = AKCS_OWNER
peer_description = Active (takeover completed)
Children:
resources => Configure resources
xxxxsn2:configuration cluster> help
Subcommands that are valid in this context:
resources            => Configure resources
help [topic]         => Get context-sensitive help. If [topic] is specified,
                        it must be one of "builtins", "commands", "general",
                        "help", "script" or "properties".
show                 => Show information pertinent to the current context
done                 => Finish operating on "cluster"
get [prop]           => Get value for property [prop]. ("help properties"
                        for valid properties.) If [prop] is not specified,
                        returns values for all properties.
setup                => Run through initial cluster setup
failback             => Fail back all resources assigned to the cluster peer
takeover             => Take over all resources assigned to the cluster peer
links                => Report the state of the cluster links
 
xxxxsn2:configuration cluster> takeover
Continuing will immediately take over the resources assigned to the cluster
peer. This may result in clients experiencing a slight delay in service.
Are you sure? (Y/N)
xxxxsn2:configuration cluster>
xxxxsn2:configuration cluster> show
Properties:
state = AKCS_OWNER
description = Active (takeover completed)
peer_asn = a54b53a0-afba-eae1-a77a-b0013813b629
peer_hostname = xxxxsn1
peer_state =
peer_description = Unknown (disconnected or restarting)
Children:
resources => Configure resources

Storage head 1 will now restart and rejoin the cluster, this time as the passive (standby) head. After waiting a little (30 seconds or so), check the status again:

xxxxsn2:configuration cluster> show
Properties:
state = AKCS_OWNER
description = Active (takeover completed)
peer_asn = a54b53a0-afba-eae1-a77a-b0013813b629
peer_hostname = xxxxsn1
peer_state = AKCS_STRIPPED
peer_description = Ready (waiting for failback)
Children:
resources => Configure resources
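Rather than waiting a fixed time and checking by hand, a script could poll until the peer reports the expected state. This is only a sketch: how you actually query peer_state (over SSH, for instance) is left out and passed in as a callable, and the timeout and interval values are arbitrary assumptions of mine.

```python
import time
from typing import Callable


def wait_for_state(get_peer_state: Callable[[], str],
                   expected: str = "AKCS_STRIPPED",
                   timeout: float = 300.0,
                   interval: float = 10.0) -> bool:
    """Poll until the peer reports `expected` or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if get_peer_state() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real query: replays the states seen in the transcript,
# where peer_state is empty while head 1 is restarting.
_states = iter(["", "", "AKCS_STRIPPED"])
print(wait_for_state(lambda: next(_states), interval=0.0))  # True
```

Using a deadline instead of a fixed number of attempts keeps the total wait bounded even when each query is slow.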

	

Now check the status on head 1 as well:

xxxxsn1:configuration cluster> show
Properties:
state = AKCS_STRIPPED
description = Ready (waiting for failback)
peer_asn = 9faf8ff1-c3a8-c090-8f4e-9871618a152e
peer_hostname = xxxxsn2
peer_state = AKCS_OWNER
peer_description = Active (takeover completed)
Children:
resources => Configure resources
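The guide's requirement (one head Active, the other Ready) can also be verified from a captured listing. The sketch below parses the key = value lines of the head 1 output above. The helper names are hypothetical, and it simply treats the AKCS_OWNER / AKCS_STRIPPED pair as the "takeover completed" condition, based on the descriptions shown in these transcripts.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse 'key = value' lines from a `show` listing into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        # Skip lines without '=' and child entries such as 'resources => ...'.
        if sep and not value.startswith(">"):
            props[key.strip()] = value.strip()
    return props


def takeover_complete(props: dict[str, str]) -> bool:
    """True when one head owns all resources and the other is stripped."""
    states = {props.get("state"), props.get("peer_state")}
    return states == {"AKCS_OWNER", "AKCS_STRIPPED"}


HEAD1 = """\
Properties:
state = AKCS_STRIPPED
description = Ready (waiting for failback)
peer_asn = 9faf8ff1-c3a8-c090-8f4e-9871618a152e
peer_hostname = xxxxsn2
peer_state = AKCS_OWNER
peer_description = Active (takeover completed)
Children:
resources => Configure resources
"""

props = parse_properties(HEAD1)
print(props["description"], "/", props["peer_description"])
print(takeover_complete(props))  # True
```

Running the same check against the listing taken right after the takeover, where peer_state is still empty, would return False, which is the cue to wait a little longer.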

	

Now that we have done the takeover, we can perform the same upgrade steps 2 and 3 on head 1 as well: first the ILOM upgrade (see previous post) and then the storage software upgrade. As it is the same routine as before, I will not show it here.

1.8 Conclusion

We have demonstrated that we can upgrade both the network and the storage infrastructure in our Exalogic quarter rack in a rolling fashion, without interrupting these services, thus maintaining availability!

1.9 Next time

In a following post, we move on to the next part of patching the Exalogic infrastructure: upgrading the OS image on the compute nodes.


About the author: Jos Nijhoff

Jos Nijhoff is an experienced Application Infrastructure consultant at Qualogy. Currently he plays a key role as technical presales and hands-on implementation lead for Qualogy's exclusive Exalogic partnership with Oracle for the Benelux area. Thus he keeps in close contact with Oracle presales and partner services on new developments, but maintains an independent view. He gives technical guidance and designs, reviews, manages and updates the application infrastructure before, during and after the rollout of new and existing Oracle (Fusion) Applications & Fusion Middleware implementations. Jos is also familiar with subjects like high availability, disaster recovery scenarios, virtualization, performance analysis, data security, and identity management integration with respect to Oracle applications.
