TOC

Diagnosing and repairing virtual nodes in a pool or active tier
ESX VM fails to boot when a VMware Console is attached
LUN disconnection results in virtual application node failure
"Export disk" errors
Running out of file space on the SAN
Deleting a VM guest tier
Related articles

Sidebars
Understanding host node/guest node dependencies

know-how:

VMs: Troubleshooting

Intended for use with Cassatt Active Response Premium Edition and Data Center Edition V5.1.

The following material outlines problems you may encounter with VMs and virtual application nodes in your Cassatt Active Response environment, along with the steps to solve those problems. It also contains material related to using virtual application nodes with Cassatt Active Response that is not covered in the related blueprints.

Diagnosing and repairing VMware virtual application nodes in a pool or active guest tier

Description

A VMware virtual application node is not working as expected.

Resolution

Before taking any action, you may want to observe the status of VMs that seem to have issues by using the VMware Management Interface or the Remote Console. Be aware of two side effects of working in these tools. If you stop a VM, Cassatt Active Response starts another virtual application node to keep the tier at its defined service level. If you change a VM's configuration, the changes are not propagated to other VMs on the host.

If you add or delete VMs using the VMware Management Interface or the Remote Console, you must shut down and restart the host application node from the Controller for your changes to take effect in Cassatt Active Response. Shutting down and restarting a host node causes Cassatt Active Response to delete the node's guest virtual application nodes, then recreate the virtual application nodes, taking a new inventory of each one. Alternatively, you can move the host application node to the discovered pool and reinventory the node. For more information, see Understanding host application node/guest application node dependencies.

If something goes wrong with a virtual application node, move it to the maintenance pool and run the diagnostic. Then move the virtual application node to the discovered pool, inventory it, and move it to the free pool. Note that, depending on your automation settings, Cassatt Active Response does most of these tasks without your intervention.

Do not delete a virtual application node from the Controller to force rediscovery. Cassatt Active Response does not rediscover deleted virtual application nodes the way it rediscovers deleted physical nodes; it rediscovers virtual application nodes only when their host nodes are rebooted. Stopping a host application node deletes all of the node's virtual application nodes. Rebooting a physical node with the same image instance regenerates the node's virtual application nodes, which triggers discovery.

ESX virtual application node fails to boot when a VMware Console is attached to the VM

Description

An ESX virtual application node fails to boot when a VMware Console is attached to the VM.

Resolution

Cassatt Active Response modifies key definition items for an ESX virtual application node during the boot process. If a VMware Console is attached to an ESX VM while these modifications are being made, the virtual application node may not boot correctly. If this happens, see Diagnosing and repairing VMware virtual application nodes in a pool or active guest tier.

LUN disconnection results in virtual application node failure

Description

SAN only

If you are using a SAN to host guest virtual disks and a LUN fails or becomes disconnected, the virtual disks that were copied there when you created the VM guest tier become unavailable. The virtual application nodes consequently fail and move to the maintenance pool. After the same virtual application node fails twice, Cassatt Active Response disables the image instance.

Resolution

SAN only

Restore the LUN connectivity, then reenable the image instance.

"Exporting disk" errors when capturing Windows images from VMware ESX host

Description

When capturing Windows images from an ESX host, the cccapture program copies the virtual machine disk (VMDK) image to /tmp/vmexport on the image host. If there is insufficient disk space on /tmp, cccapture displays error messages similar to the following:

Exporting disk vmhba0:0:0:5:vm1.vmdk:
Export: 20% done.DiskLib_Clone() failed

Resolution

If you encounter these errors, mount a larger directory via NFS on /tmp. (The ESX blueprint recommends that /tmp be at least 1024 MB plus the size of the largest VM disk you will configure.)
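
The sizing rule above can be checked before a capture with a short script. The following is a minimal sketch, not part of the product: the function names are hypothetical, and it compares the free space reported by df against the rule rather than the total size of /tmp, on the assumption that free space is what the export actually needs.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not Cassatt or VMware tools.

# required_tmp_mb LARGEST_DISK_MB
# Prints the minimum size in MB: 1024 MB plus the largest VM disk.
required_tmp_mb() {
    echo $(( 1024 + $1 ))
}

# available_mb DIR
# Prints the free space, in MB, of the file system that holds DIR.
available_mb() {
    # -P forces one POSIX-format line per file system; -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# tmp_is_large_enough LARGEST_DISK_MB [DIR]
# Exit status 0 if DIR (default /tmp) has at least the required free space.
tmp_is_large_enough() {
    need=$(required_tmp_mb "$1")
    have=$(available_mb "${2:-/tmp}")
    [ "$have" -ge "$need" ]
}
```

For example, with a largest VM disk of 8192 MB (an illustrative figure), `tmp_is_large_enough 8192 || echo "/tmp is too small for cccapture"` warns when less than 9216 MB is free.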

Running out of file space on the SAN

Description

SAN only

SAN file space fills up while working with virtual application nodes.

Resolution

SAN only

Remove unneeded virtual disk and administrative files.

Cassatt Active Response puts two types of administrative files on the SAN: fence files and temporary files. Temporary files are created when network-booting guest Linux images that require local disk. Fence files are created when network-booting Linux guest images that do not require local disk.

Follow these guidelines when removing administrative files:

  1. Identify the temporary and fence files by looking for these names:

    VM_Ordinal.MAC_address.temp.vmdk
    VM_Ordinal.MAC_address.fence.vmdk


    Where:
    VM_Ordinal is the number of the VM
    MAC_address is the MAC address shown in the Controller for the VM's bootable NIC
  2. Make sure the files are no longer needed. You can safely delete files when the virtual application node with the MAC address you've identified in the file name is in any of these circumstances:

    Temporary files:

    - In the maintenance pool
    - In the discovered pool or the free pool AND is not in the process of being inventoried
    - Allocated to a tier that is assigned a network-booted Linux image without local disk requirements
    - Allocated to a tier that is assigned a network-booted Linux image with local disk requirements BUT the tier is not activated
    - Allocated to a tier that is assigned a Windows image

    Fence files:

    - In the maintenance pool AND is powered off
    - In the maintenance pool, discovered pool, or free pool AND is powered off
    - Allocated to a tier that is assigned a network-booted Linux image with local disk requirements
    - Allocated to a tier that is assigned a network-booted Linux image without local disk requirements BUT the tier is not activated
    - No virtual application node exists with that MAC address


  3. Remove the files.
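
The naming convention in step 1 can be used to pick out administrative files and read the VM ordinal and MAC address from each name. The sketch below is one way to do that under stated assumptions: the function names are hypothetical, VM_Ordinal is assumed to be a plain number, and the directory to scan is supplied by the caller because the location of these files on the SAN is not given here. It only reports files; whether a file is safe to delete still has to be checked against step 2.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not Cassatt tools.

# classify_admin_file NAME
# Prints "<kind> <ordinal> <mac>" for names of the form
#   VM_Ordinal.MAC_address.temp.vmdk  or  VM_Ordinal.MAC_address.fence.vmdk
# and returns 1 for any other name.
classify_admin_file() {
    name=${1##*/}                        # drop any leading directory
    case $name in
        *.temp.vmdk)  kind=temp;  stem=${name%.temp.vmdk} ;;
        *.fence.vmdk) kind=fence; stem=${name%.fence.vmdk} ;;
        *) return 1 ;;
    esac
    ordinal=${stem%%.*}                  # text before the first dot
    mac=${stem#*.}                       # text after the first dot
    [ "$stem" != "$ordinal" ] || return 1    # no dot: MAC part is missing
    case $ordinal in
        ''|*[!0-9]*) return 1 ;;         # assumption: the ordinal is numeric
    esac
    [ -n "$mac" ] || return 1
    echo "$kind $ordinal $mac"
}

# list_admin_files DIR
# Prints one "<kind> <ordinal> <mac> <path>" line per administrative file in DIR.
list_admin_files() {
    for f in "$1"/*.temp.vmdk "$1"/*.fence.vmdk; do
        [ -e "$f" ] || continue          # the pattern matched nothing
        info=$(classify_admin_file "$f") && echo "$info $f"
    done
    return 0
}
```

The MAC address in each output line can then be looked up in the Controller to find the node's pool and power state before anything is removed.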

Follow these guidelines when removing virtual disk files:

  1. Deactivate and deallocate the VM guest tier that uses the files.
  2. Make a list of the IP addresses assigned to the tier.
  3. Log in as root to a node in the ESX base tier.
  4. List the virtual disk files for each IP address, for example:

    ls -l /vmfs/*/ipaddress*
  5. Because the files may contain changes made by the VM after the files were copied to the SAN, you may want to save the files before deleting them. To save a file, use the vmkfstools command, for example:

    vmkfstools -e filename.vmdk directoryname:ipaddress.scsidisknumber.vmdk
  6. Remove the files.
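
Step 4 can be repeated for a whole list of addresses with a small helper. This is a sketch under assumptions, not a supplied tool: the function name is hypothetical, and it assumes virtual disk names begin with the IP address followed by a dot, as the vmkfstools example suggests. Matching on the trailing dot keeps an address such as 10.0.0.5 from also matching files for 10.0.0.50. The helper only prints paths and deletes nothing.

```shell
#!/bin/sh
# Sketch only: the function name is illustrative.

# list_vdisks_for_ips ROOT IP...
# Prints every file matching ROOT/*/IP.* for each address given,
# a stricter form of "ls -l /vmfs/*/ipaddress*".
list_vdisks_for_ips() {
    root=$1
    shift
    for ip in "$@"; do
        for f in "$root"/*/"$ip".*; do
            [ -e "$f" ] || continue      # the pattern matched nothing
            echo "$f"
        done
    done
    return 0
}
```

For example, `list_vdisks_for_ips /vmfs 10.0.0.5 10.0.0.6` (illustrative addresses) produces a list to review before saving files with vmkfstools and removing them.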

Deleting a VM guest tier

SAN only

Before deleting a VM guest tier, deactivate and deallocate the tier. If you need to reclaim file space on the SAN, delete unneeded files as described in Running out of file space on the SAN.

Related Articles

See Blueprints