
Upgrading Solaris 10 with zones using "Live Upgrade" (from Update 7 to Update 9)

Last updated: 19 September 2011

In this document I go deeper into my experience with "Live Upgrade". Here I describe the migration from Solaris 10 Update 7 (5/09) to Solaris 10 Update 9 (9/10), in an environment with "/var" on a separate dataset and with Solaris zones. This migration had to wait until I found a few hours to reconfigure my Solaris zones to make them compatible with "Live Upgrade" and "Boot Environments".

To fully understand this document, you should read the earlier articles on this subject. To keep things tidy, I link only to the previous upgrade, but I recommend reading the whole "Live Upgrade" series.

The new Solaris update is fairly "light", although some users will find valuable changes: various zone improvements, P2V migration, ZFS improvements (snapshot holds, RAIDZ-3, better handling of ZFS log devices, recovery from corrupt "uberblocks", "zpool split", better observability), support for disks with sectors larger than 512 bytes, iSCSI improvements, the GLDv3 API for network drivers, new drivers, and changes to the SMF configuration of "sendmail".

The steps to upgrade our system with "Live Upgrade" are as follows.

First, we create a new "BE" by cloning the current one:

# lustatus
Boot Environment           Is       Active Active    Can    Copy      
Name                       Complete Now    On Reboot Delete Status    
-------------------------- -------- ------ --------- ------ ----------
Solaris10u7                yes      yes    yes       no     -         
Solaris10u7BACKUP          yes      no     no        yes    -

# time lucreate -n Solaris10u9
Checking GRUB menu...
System has findroot enabled GRUB
Analyzing system configuration.
Comparing source boot environment <Solaris10u7> file systems with the file 
system(s) you specified for the new boot environment. Determining which 
file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment <Solaris10u9>.
Source boot environment is <Solaris10u7>.
Creating boot environment <Solaris10u9>.
Cloning file systems from boot environment <Solaris10u7> to create boot environment <Solaris10u9>.
Creating snapshot for <datos/ROOT/Solaris10u7> on <datos/ROOT/Solaris10u7@Solaris10u9>.
Creating clone for <datos/ROOT/Solaris10u7@Solaris10u9> on <datos/ROOT/Solaris10u9>.
Setting canmount=noauto for </> in zone <global> on <:datos/ROOT/Solaris10u9>.
Creating snapshot for <datos/ROOT/Solaris10u7/var> on <datos/ROOT/Solaris10u7/var@Solaris10u9>.
Creating clone for <datos/ROOT/Solaris10u7/var@Solaris10u9> on <datos/ROOT/Solaris10u9/var>.
Setting canmount=noauto for </var> in zone <global> on <datos/ROOT/Solaris10u9/var>.
Creating snapshot for <datos/ROOT/Solaris10u7/zones> on <datos/ROOT/Solaris10u7/zones@Solaris10u9>.
Creating clone for <datos/ROOT/Solaris10u7/zones@Solaris10u9> on <datos/ROOT/Solaris10u9/zones>.
Setting canmount=noauto for </zones> in zone <global> on <datos/ROOT/Solaris10u9/zones>.
Creating snapshot for <datos/ROOT/Solaris10u7/zones/stargate> on <datos/ROOT/Solaris10u7/zones/stargate@Solaris10u9>.
Creating clone for <datos/ROOT/Solaris10u7/zones/stargate@Solaris10u9> on <datos/ROOT/Solaris10u9/zones/stargate-Solaris10u9>.
Creating snapshot for <datos/ROOT/Solaris10u7/zones/babylon5> on <datos/ROOT/Solaris10u7/zones/babylon5@Solaris10u9>.
Creating clone for <datos/ROOT/Solaris10u7/zones/babylon5@Solaris10u9> on <datos/ROOT/Solaris10u9/zones/babylon5-Solaris10u9>.
Saving existing file </boot/grub/menu.lst> in top level dataset for BE <Solaris10u7BACKUP> as <mount-point>//boot/grub/menu.lst.prev.
Saving existing file </boot/grub/menu.lst> in top level dataset for BE <Solaris10u9> as <mount-point>//boot/grub/menu.lst.prev.
File </boot/grub/menu.lst> propagation successful
Copied GRUB menu from PBE to ABE
No entry for BE <Solaris10u9> in GRUB menu
Population of boot environment <Solaris10u9> successful.
Creation of boot environment <Solaris10u9> successful.

real    1m56.650s
user    0m3.935s
sys     0m7.265s

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
Solaris10u7                yes      yes    yes       no     -         
Solaris10u7BACKUP          yes      no     no        yes    -         
Solaris10u9                yes      no     no        yes    -

Cloning the current "BE", including two Solaris zones, takes less than two minutes. Well done, ZFS!
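
The "Creating snapshot" and "Creating clone" lines in the output above explain why this is so fast: each dataset is snapshotted and cloned rather than copied. As a rough illustration only, the following dry-run sketch prints the ZFS commands that approximately correspond to those lines for the global-zone datasets. The function name "print_clone_plan" is hypothetical, nothing is executed, and "lucreate" does additional bookkeeping (and names the zone clones differently, e.g. "stargate-Solaris10u9") that is not reproduced here.

```shell
# POSIX shell syntax (on Solaris 10 use ksh or /usr/xpg4/bin/sh).
# Dry run: print, but do not execute, the snapshot/clone commands that
# roughly match the lucreate output for the given datasets.
print_clone_plan() {
    old_root=$1   # e.g. datos/ROOT/Solaris10u7
    new_root=$2   # e.g. datos/ROOT/Solaris10u9
    shift 2
    snap=${new_root##*/}          # the snapshot is named after the new BE
    for ds in "$@"; do
        suffix=${ds#"$old_root"}  # "", "/var", "/zones", ...
        echo "zfs snapshot $ds@$snap"
        echo "zfs clone $ds@$snap $new_root$suffix"
    done
}
```

Calling it as "print_clone_plan datos/ROOT/Solaris10u7 datos/ROOT/Solaris10u9 datos/ROOT/Solaris10u7 datos/ROOT/Solaris10u7/var" prints one snapshot line and one clone line per dataset.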

The next step is to upgrade the operating system in the new "BE". To do so, I copy the ISO image to "/tmp", mount it and upgrade from it:

# lofiadm -a /tmp/sol-10-u9-ga-x86-dvd.iso
/dev/lofi/1
# mkdir /tmp/sol-10-u9-ga-x86-dvd
# mount -o ro -F hsfs /dev/lofi/1 /tmp/sol-10-u9-ga-x86-dvd
# time luupgrade -n Solaris10u9 -u -s /tmp/sol-10-u9-ga-x86-dvd
System has findroot enabled GRUB
No entry for BE <Solaris10u9> in GRUB menu
Uncompressing miniroot
Copying failsafe kernel from media.
61364 blocks
miniroot filesystem is <lofs>
Mounting miniroot at </tmp/sol-10-u9-ga-x86-dvd/Solaris_10/Tools/Boot>
Validating the contents of the media </tmp/sol-10-u9-ga-x86-dvd>.
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains <Solaris> version <10>.
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE <Solaris10u9>.
Checking for GRUB menu on ABE <Solaris10u9>.
Saving GRUB menu on ABE <Solaris10u9>.
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE <Solaris10u9>.
Performing the operating system upgrade of the BE <Solaris10u9>.
CAUTION: Interrupting this process may leave the boot environment unstable 
or unbootable.
Upgrading Solaris: 100% completed
Installation of the packages from this media is complete.
Restoring GRUB menu on ABE <Solaris10u9>.
Updating package information on boot environment <Solaris10u9>.
Package information successfully updated on boot environment <Solaris10u9>.
Adding operating system patches to the BE <Solaris10u9>.
The operating system patch installation is complete.
ABE boot partition backing deleted.
PBE GRUB has no capability information.
PBE GRUB has no versioning information.
ABE GRUB is newer than PBE GRUB. Updating GRUB.
GRUB update was successfull.
Configuring failsafe for system.
Failsafe configuration is complete.
INFORMATION: The file </var/sadm/system/logs/upgrade_log> on boot 
environment <Solaris10u9> contains a log of the upgrade operation.
INFORMATION: The file </var/sadm/system/data/upgrade_cleanup> on boot 
environment <Solaris10u9> contains a log of cleanup operations required.
WARNING: <1> packages failed to install properly on boot environment <Solaris10u9>.
INFORMATION: The file </var/sadm/system/data/upgrade_failed_pkgadds> on 
boot environment <Solaris10u9> contains a list of packages that failed to 
upgrade or install properly.
INFORMATION: Review the files listed above. Remember that all of the files 
are located on boot environment <Solaris10u9>. Before you activate boot 
environment <Solaris10u9>, determine if any additional system maintenance 
is required or if additional media of the software distribution must be 
installed.
The Solaris upgrade of the boot environment <Solaris10u9> is partially complete.
Installing failsafe

Failsafe install is complete.

real    120m56.533s
user    10m15.707s
sys     11m26.822s

One package fails to upgrade. We mount the new "BE" with "lumount Solaris10u9" and review the log files indicated:

# lumount Solaris10u9
/.alt.Solaris10u9
# cat /.alt.Solaris10u9/var/sadm/system/data/upgrade_failed_pkgadds
SUNWscpr
# cat /.alt.Solaris10u9/var/sadm/system/data/upgrade_cleanup
(nothing relevant)
# cat /.alt.Solaris10u9/var/sadm/system/logs/upgrade_log
[...]
Doing pkgadd of SUNWscpr to /


This appears to be an attempt to install the same architecture and
version of package <SUNWscpr> which is already installed on zones
<babylon5, stargate>.  This installation will attempt to overwrite
this package.


This appears to be an attempt to install the same architecture and
version of a package which is already installed.  This installation
will attempt to overwrite this package.

pkgadd: ERROR: unable to create package object </a/home>.
    unable to fix attributes
ERROR: attribute verification of </a/home> failed
    unable to fix attributes

Installation of <SUNWscpr> partially failed.
pkgadd return code = 2
[...]
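
When several packages fail, reading the whole log by hand gets tedious. The following is a minimal sketch that, for each package named in "upgrade_failed_pkgadds", prints the ERROR lines from its section of "upgrade_log". The function name "show_failed_pkg_errors" is hypothetical, and the assumption that each section starts with a "Doing pkgadd of <package> to" line is taken only from the excerpt above.

```shell
# POSIX shell syntax; on Solaris 10 replace "awk" with "nawk" or
# /usr/xpg4/bin/awk, since the legacy awk lacks the -v option.
# Assumption: a package's log section runs from its
# "Doing pkgadd of <pkg> to" line up to the next such line.
show_failed_pkg_errors() {
    failed_list=$1    # e.g. .../var/sadm/system/data/upgrade_failed_pkgadds
    upgrade_log=$2    # e.g. .../var/sadm/system/logs/upgrade_log
    while read -r pkg; do
        [ -n "$pkg" ] || continue        # skip blank lines
        echo "== $pkg"
        awk -v pkg="$pkg" '
            /^Doing pkgadd of / { in_pkg = ($4 == pkg) }
            in_pkg && /ERROR/   { print }
        ' "$upgrade_log"
    done < "$failed_list"
}
```

With the "BE" mounted as above, both arguments would be the corresponding paths under "/.alt.Solaris10u9".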

"/a/home"? Suspicious. I try to upgrade that package by hand:

# luumount Solaris10u9
# luupgrade -p -n Solaris10u9 -s /tmp/sol-10-u9-ga-x86-dvd/Solaris_10/Product SUNWscpr
System has findroot enabled GRUB
No entry for BE <Solaris10u9> in GRUB menu
Validating the contents of the media </tmp/sol-10-u9-ga-x86-dvd/Solaris_10/Product>.
Mounting the BE <Solaris10u9>.
Adding packages to the BE <Solaris10u9>.
## Verifying package <SUNWscpr> dependencies in zone <babylon5>
## Verifying package <SUNWscpr> dependencies in zone <stargate>


This appears to be an attempt to install the same architecture and
version of package <SUNWscpr> which is already installed on zones
<babylon5, stargate>.  This installation will attempt to overwrite
this package.


The package <SUNWscpr> contains scripts which will be executed on
zones <babylon5, stargate> with super-user permission during the
process of installing this package.

Do you want to continue with the installation of <SUNWscpr> [y,n,?] y

Processing package instance <SUNWscpr> from </tmp/sol-10-u9-ga-x86-dvd/Solaris_10/Product>
## Installing package <SUNWscpr> in global zone

Source Compatibility, (Root)(i386) 11.10.0,REV=2005.01.21.16.34
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.

This appears to be an attempt to install the same architecture and
version of a package which is already installed.  This installation
will attempt to overwrite this package.


The installation of this package was previously terminated and
installation was never successfully completed.

Do you want to continue with the installation of <SUNWscpr> [y,n,?] y
Using </a> as the package base directory.
## Processing package information.
## Processing system information.
   10 package pathnames are already properly installed.
## Verifying package dependencies.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
## Checking for setuid/setgid programs.

This package contains scripts which will be executed with super-user
permission during the process of installing this package.

Do you want to continue with the installation of <SUNWscpr> [y,n,?] y

Installing Source Compatibility, (Root) as <SUNWscpr>

## Installing part 1 of 1.
pkgadd: ERROR: unable to create package object </a/home>.
    unable to fix attributes
/a/home <attribute change only>
[ verifying class <none> ]
ERROR: attribute verification of </a/home> failed
    unable to fix attributes
[ verifying class <preserve> ]

Installation of <SUNWscpr> partially failed.
## Interrupted: package <SUNWscpr> not installed in any non-global zones

1 package was not processed!

Unmounting the BE <Solaris10u9>.
The package add to the BE <Solaris10u9> completed with Warnings.

Looking at this package's manifest (its "pkgmap" file), I see the following:

# cat /tmp/sol-10-u9-ga-x86-dvd/Solaris_10/Product/SUNWscpr/pkgmap
: 1 44
1 i copyright 71 6125 1281069637
1 i depend 1036 21264 1281069637
1 d none etc 0755 root sys
1 s none etc/chroot=../usr/sbin/chroot
1 s none etc/fuser=../usr/sbin/fuser
1 s none etc/link=../usr/sbin/link
1 d none etc/mail 0755 root mail
1 e preserve etc/mail/Mail.rc 0644 root bin 163 13006 1106352419
1 s none etc/mvdir=../usr/sbin/mvdir
1 s none etc/pwck=../usr/sbin/pwck
1 s none etc/termcap=../usr/share/lib/termcap
1 s none etc/unlink=../usr/sbin/unlink
1 d none export 0755 root sys
1 d none home 0555 root root
1 i i.preserve 186 13489 1281069637
1 i pkginfo 1096 19457 1281545246
1 d none tmp 1777 root sys
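
To pick out just the directories a package declares, together with the permissions it will try to enforce, a one-line filter is enough. This is a small sketch; the function name "list_pkgmap_dirs" is hypothetical, and the field layout (part, type, class, path, mode, owner, group) is assumed from the listing above.

```shell
# Print path, mode, owner and group of every directory ("d") entry
# in a pkgmap file. Assumed field order for "d" entries:
#   part ftype class path mode owner group
list_pkgmap_dirs() {
    awk '$2 == "d" { print $4, $5, $6, $7 }' "$1"
}
```

Run against the "pkgmap" above, it would list "etc", "etc/mail", "export", "home" and "tmp", with "home" declared as mode 0555 owned by root:root.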

Running the upgrade under "truss", we see the following:

[...]
24674:  chmod("/a/home", 0555)                          Err#30 EROFS
[...]

We can see that it tries to set attributes on the "/home" directory (the leading "/a" is "lofs" trickery that is not relevant here). That operation fails with an "EROFS" error, which means the attributes cannot be changed because the directory lives on a read-only file system.
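
A "truss" trace of an upgrade is long, so it helps to filter it. The sketch below assumes the trace was saved to a file (the exact "truss" invocation is not shown in this article) and that failed calls end in "Err#<number> <NAME>", as in the line above. The function name "truss_failures" is hypothetical.

```shell
# Keep only the system calls that failed with a given errno name
# (EROFS by default) from a saved truss log.
truss_failures() {
    log=$1
    errno_name=${2:-EROFS}
    grep "Err#[0-9]* *$errno_name" "$log"
}
```

For example, "truss_failures /tmp/upgrade.truss" would show only the calls rejected because of a read-only file system; the file name here is just a placeholder.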

The problem is that in our zones "home" is a separate dataset, not a directory inside the zone's root dataset. Because that dataset is shared between the zones of different "BE"s, when it is mounted in an inactive "BE", as is the case here, it is mounted read-only.

I can think of five things to try in order to work around the problem:

  • Set mode 555 on that directory by hand, hoping the tool will not try to change the permissions if it sees that no change is needed. After the installation we can restore the original permissions.

    I tried it, and it does not work. The installer tries to change the attributes even when it is not necessary.

  • Use DTrace to return an OK when the problematic operation is performed.

    Feasible, but complex.

  • Detach the zones, clone the "BE", reattach the zones in the current "BE", upgrade the new "BE", reboot the machine into it and attach the zones with the "upgrade" option.

    Complicated for what we want to do, and it requires several service outages.

  • Change where the "home" dataset is mounted in the zones, so that it is mounted somewhere else, leaving the "home" directory empty, undisturbed and available. After installing the package, we put everything back as it was.

    The drawback of this approach is that the zones must be stopped during the installation. It may be the best option if many packages failed in the same way.

  • Modify the package manifest so that it leaves the "home" directory alone.

    This looks like the simplest and most harmless option, since it only requires modifying one file and does not need any service to be stopped.

It is worth noting that this package fails because we are sharing the zones' "home" dataset between their versions in different "BE"s. I find this configuration logical, but Sun/Oracle does not seem to like it much. Why this package needs to touch the permissions of "home" is a mystery, though. I do not see the need.

To modify the package manifest, we copy the package from the DVD ISO image, delete the "home" line from the manifest and try to install our modified version:

# cp -a /tmp/sol-10-u9-ga-x86-dvd/Solaris_10/Product/SUNWscpr /tmp

(We edit the file "/tmp/SUNWscpr/pkgmap" and delete the line that refers to "home")
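
The manual edit can also be done non-interactively. The following is a minimal sketch, not the procedure actually used here: the function name "drop_pkgmap_dir" is hypothetical, it keeps a ".orig" copy of the manifest next to it, and it removes only a directory ("d") entry whose path matches exactly.

```shell
# POSIX shell syntax; on Solaris 10 replace "awk" with "nawk" or
# /usr/xpg4/bin/awk, since the legacy awk lacks the -v option.
# Remove the directory entry for the given path from a pkgmap,
# leaving every other line unchanged. A backup is kept as <pkgmap>.orig.
drop_pkgmap_dir() {
    pkgmap=$1   # e.g. /tmp/SUNWscpr/pkgmap
    path=$2     # e.g. home
    cp "$pkgmap" "$pkgmap.orig" &&
    awk -v p="$path" '!($2 == "d" && $4 == p)' "$pkgmap.orig" > "$pkgmap"
}
```

In this case the call would be "drop_pkgmap_dir /tmp/SUNWscpr/pkgmap home"; comparing the result with the ".orig" copy using "diff" should show a single removed line.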

# luupgrade -p -n Solaris10u9 -s /tmp SUNWscpr
System has findroot enabled GRUB
No entry for BE <Solaris10u9> in GRUB menu
Validating the contents of the media </tmp>.
Mounting the BE <Solaris10u9>.
Adding packages to the BE <Solaris10u9>.
## Verifying package <SUNWscpr> dependencies in zone <babylon5>
## Verifying package <SUNWscpr> dependencies in zone <stargate>


This appears to be an attempt to install the same architecture and
version of package <SUNWscpr> which is already installed on zones
<babylon5, stargate>.  This installation will attempt to overwrite
this package.


The package <SUNWscpr> contains scripts which will be executed on
zones <babylon5, stargate> with super-user permission during the
process of installing this package.

Do you want to continue with the installation of <SUNWscpr> [y,n,?] y

Processing package instance <SUNWscpr> from </tmp>
## Installing package <SUNWscpr> in global zone

Source Compatibility, (Root)(i386) 11.10.0,REV=2005.01.21.16.34
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.

This appears to be an attempt to install the same architecture and
version of a package which is already installed.  This installation
will attempt to overwrite this package.


The installation of this package was previously terminated and
installation was never successfully completed.

Do you want to continue with the installation of <SUNWscpr> [y,n,?] y
Using </a> as the package base directory.
## Processing package information.
## Processing system information.
   11 package pathnames are already properly installed.
## Verifying package dependencies.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
## Checking for setuid/setgid programs.

This package contains scripts which will be executed with super-user
permission during the process of installing this package.

Do you want to continue with the installation of <SUNWscpr> [y,n,?] y

Installing Source Compatibility, (Root) as <SUNWscpr>

## Installing part 1 of 1.
[ verifying class <none> ]
[ verifying class <preserve> ]

Installation of <SUNWscpr> was successful.
## Installing package <SUNWscpr> in zone <SUNWlu-babylon5>

Source Compatibility, (Root)(i386) 11.10.0,REV=2005.01.21.16.34

This appears to be an attempt to install the same architecture and
version of a package which is already installed.  This installation
will attempt to overwrite this package.

Using </a> as the package base directory.
## Processing package information.
## Processing system information.
   11 package pathnames are already properly installed.

Installing Source Compatibility, (Root) as <SUNWscpr>

## Installing part 1 of 1.
[ verifying class <none> ]
[ verifying class <preserve> ]

Installation of <SUNWscpr> on zone <SUNWlu-babylon5> was successful.
## Installing package <SUNWscpr> in zone <SUNWlu-stargate>

Source Compatibility, (Root)(i386) 11.10.0,REV=2005.01.21.16.34

This appears to be an attempt to install the same architecture and
version of a package which is already installed.  This installation
will attempt to overwrite this package.

Using </a> as the package base directory.
## Processing package information.
## Processing system information.
   11 package pathnames are already properly installed.

Installing Source Compatibility, (Root) as <SUNWscpr>

## Installing part 1 of 1.
[ verifying class <none> ]
[ verifying class <preserve> ]

Installation of <SUNWscpr> on zone <SUNWlu-stargate> was successful.
Unmounting the BE <Solaris10u9>.
The package add to the BE <Solaris10u9> completed.

Problem solved. We will have to watch out for similar problems in the future.

Now the system is fully upgraded in the new "BE". We activate it and try rebooting:

# lustatus
Boot Environment           Is       Active Active    Can    Copy      
Name                       Complete Now    On Reboot Delete Status    
-------------------------- -------- ------ --------- ------ ----------
Solaris10u7                yes      yes    yes       no     -         
Solaris10u7BACKUP          yes      no     no        yes    -         
Solaris10u9                yes      no     no        yes    -         
# luactivate Solaris10u9
System has findroot enabled GRUB
Generating boot-sign, partition and slice information for PBE <Solaris10u7>
WARNING: <1> packages failed to install properly on boot environment <Solaris10u9>.
INFORMATION: </var/sadm/system/data/upgrade_failed_pkgadds> on boot 
environment <Solaris10u9> contains a list of packages that failed to 
upgrade or install properly. Review the file before you reboot the system 
to determine if any additional system maintenance is required.

Generating boot-sign for ABE <Solaris10u9>
Generating partition and slice information for ABE <Solaris10u9>
Copied boot menu from top level dataset.
Generating multiboot menu entries for PBE.
Generating multiboot menu entries for ABE.
Disabling splashimage
Re-enabling splashimage
No more bootadm entries. Deletion of bootadm entries is complete.
GRUB menu default setting is unaffected
Done eliding bootadm entries.

**********************************************************************

The target boot environment has been activated. It will be used when you 
reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You 
MUST USE either the init or the shutdown command when you reboot. If you 
do not use either init or shutdown, the system will not boot using the 
target BE.

**********************************************************************

In case of a failure while booting to the target BE, the following process 
needs to be followed to fallback to the currently working boot environment:

1. Boot from Solaris failsafe or boot in single user mode from the Solaris 
Install CD or Network.

2. Mount the Parent boot environment root slice to some directory (like 
/mnt). You can use the following command to mount:

     mount -Fzfs /dev/dsk/c2d0s0 /mnt

3. Run <luactivate> utility with out any arguments from the Parent boot 
environment root slice, as shown below:

     /mnt/sbin/luactivate

4. luactivate, activates the previous working boot environment and 
indicates the result.

5. Exit Single User mode and reboot the machine.

**********************************************************************

Modifying boot archive service
Propagating findroot GRUB for menu conversion.
File </etc/lu/installgrub.findroot> propagation successful
File </etc/lu/stage1.findroot> propagation successful
File </etc/lu/stage2.findroot> propagation successful
File </etc/lu/GRUB_capability> propagation successful
Deleting stale GRUB loader from all BEs.
File </etc/lu/installgrub.latest> deletion successful
File </etc/lu/stage1.latest> deletion successful
File </etc/lu/stage2.latest> deletion successful
Activation of boot environment <Solaris10u9> successful.

The warning about a package that failed to install refers to the one we have already fixed by hand, but the tool has not noticed (the log files would have to be deleted).

We cross our fingers and reboot the machine.

# init 6
propagating updated GRUB menu
Saving existing file </boot/grub/menu.lst> in top level dataset for BE <Solaris10u9> as <mount-point>//boot/grub/menu.lst.prev.
File </boot/grub/menu.lst> propagation successful
File </etc/lu/GRUB_backup_menu> propagation successful
File </etc/lu/menu.cksum> propagation successful
File </sbin/bootadm> propagation successful

Everything works perfectly, but note that the non-global zones (that is, "babylon5" and "stargate") will not accept SSH connections (port 22) until we log in with "zlogin -C <zone>" (from the global zone) to choose the terminal type to use (I pick option "3": DEC VT100).

Once everything seems to be working, it is a good idea to create a backup "BE", in case a later change, a faulty patch, etc., ruins our day. If there are problems, we boot from that backup "BE".
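
As a small illustration of that last step, the sketch below reads "lustatus" output and prints the "lucreate" command for a backup of whichever "BE" is currently active. The function names are hypothetical, and the "BACKUP" suffix simply mirrors the existing "Solaris10u7BACKUP" name. The command is only printed, not run.

```shell
# POSIX shell syntax (on Solaris 10 use ksh or /usr/xpg4/bin/sh).
# Read lustatus output on stdin and print the name of the BE whose
# "Active Now" column (3rd field) is "yes". The first three lines
# are the table header and are skipped.
active_be() {
    awk 'NR > 3 && $3 == "yes" { print $1 }'
}

# Print (do not execute) the lucreate command for a backup BE.
backup_be_command() {
    be=$(active_be)
    [ -n "$be" ] && echo "lucreate -n ${be}BACKUP"
}
```

Typical use would be "lustatus | backup_be_command", then reviewing the printed command before running it.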


History

  • 19/Sep/11: Publication of this web page.

  • 07/Oct/10: First version of this page.



Python Zope ©2010-2011 jcea@jcea.es
