Tuesday, 29 December 2015

How To Analyze PSOD

Purple Screen of Death, or PSOD as it is commonly known, is something most of us come across at some point while running ESXi hosts.

Usually when we experience a PSOD, we reboot the host (which is a must), gather the logs, and upload them to VMware support for analysis (where I spend a good amount of time going through them).

Why not take a look at the dumps by yourself?

Step 1:
I am going to simulate a PSOD on my ESXi host. You need to be logged in to the host over SSH. The command is:

# vsish -e set /reliability/crashMe/Panic 1 

When you open a DCUI session to the ESXi host, you can see the PSOD.


Step 2:
Sometimes we might miss out on the screenshot of the PSOD. Well, that's alright! If a core dump is configured for the ESXi host, we can extract the dump files to gather the crash logs.
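If you are not sure whether a coredump target is configured, you can check from the ESXi shell before you need it. These are standard esxcli namespaces, although the exact sub-commands available depend on your ESXi version (the file-based coredump commands, for instance, only exist on newer releases):

# esxcli system coredump partition get
# esxcli system coredump file list

If a target shows up as configured and active, the dump written during the PSOD is what ends up under /var/core as a vmkernel-zdump file after the reboot.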

Reboot the host if it is sitting at the PSOD screen. Once the host is back up, log in to the host over SSH (PuTTY) and go to the core directory. The core directory is the location where your PSOD dumps are written.

# cd /var/core 


Then list out the files here:



# ls -lh 

Step 3:
How do we extract it?

Well, there is a handy extraction utility that does all the work: vmkdump_extract. This command must be executed against the vmkernel-zdump.1 file, and looks something like this:

# vmkdump_extract vmkernel-zdump.1 

It creates four files:
a) vmkernel-log.1
b) vmkernel-core.1
c) visorFS.tar
d) vmkernel-pci

All we need for this analysis is the vmkernel-log.1 file.

Step 4:
Open the vmkernel-log.1 file using the below command:

# less vmkernel-log.1 

Skip to the end of the file by pressing Shift+G, then slowly work your way back up by pressing Page Up.
You will come across a line that says @BlueScreen: <event>
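If you would rather jump straight to that marker instead of scrolling, you can search from inside less by typing /@BlueScreen, or grep the extracted log. The -A option below asks grep to print the lines following the match; the busybox grep on ESXi generally supports it, but treat that as an assumption for your particular build:

# grep -A 20 "@BlueScreen" vmkernel-log.1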

In my case, the dumps were:

2015-12-17T20:34:03.603Z cpu3:47209)@BlueScreen: CrashMe
2015-12-17T20:34:03.603Z cpu3:47209)Code start: 0x418021200000 VMK uptime: 0:01:14:16.524>
2015-12-17T20:34:03.603Z cpu3:47209)0x412461a5dc10:[0x41802128d249]PanicvPanicInt@vmkernel#nover+0x575 stack: 0x726f632000000008
2015-12-17T20:34:03.603Z cpu3:47209)0x412461a5dc70:[0x41802128d48d]Panic_NoSave@vmkernel#nover+0x49 stack: 0x412461a5dcd0
2015-12-17T20:34:03.604Z cpu3:47209)0x412461a5dd60:[0x41802157a63b]CrashMeCurrentCore@vmkernel#nover+0x553 stack: 0x100000278
2015-12-17T20:34:03.604Z cpu3:47209)0x412461a5dda0:[0x41802157a8ca]CrashMe_VsiCommandSet@vmkernel#nover+0x13e stack: 0x0
2015-12-17T20:34:03.604Z cpu3:47209)0x412461a5de30:[0x41802160c3c7]VSI_SetInfo@vmkernel#nover+0x2fb stack: 0x41109d630330
2015-12-17T20:34:03.604Z cpu3:47209)0x412461a5dec0:[0x4180217bd7a7]UWVMKSyscallUnpackVSI_Set@<none>#<none>+0xef stack: 0x412461a67000
2015-12-17T20:34:03.604Z cpu3:47209)0x412461a5df00:[0x418021783a47]User_UWVMKSyscallHandler@<none>#<none>+0x243 stack: 0x412461a5df20
2015-12-17T20:34:03.604Z cpu3:47209)0x412461a5df10:[0x4180212aa90d]User_UWVMKSyscallHandler@vmkernel#nover+0x1d stack: 0xffbc0bb8

2015-12-17T20:34:03.604Z cpu3:47209)0x412461a5df20:[0x4180212f2064]gate_entry@vmkernel#nover+0x64 stack: 0x0 

  • The first line, @BlueScreen:, tells you the crash exception (for example Exception 13 or Exception 14). In my case it is CrashMe, which indicates a manually triggered crash. 
  • The VMK uptime field tells you how long the kernel had been running before the crash.
  • The backtrace that follows is the information we need to look at: it points to why the crash occurred. 
Now, the crash dump varies for every crash. The causes can range from hardware errors to driver issues to problems with the ESXi build, and a lot more.

Each dump analysis will be different, but the basics are the same. 
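One simple check that often helps is to see whether a particular driver or module name keeps appearing in the backtrace. For example, if you suspected a NIC driver (the module name below is only a placeholder, not something taken from this dump), a quick grep narrows it down:

# grep -i e1000 vmkernel-log.1

If the crash repeatedly lands inside the same module, that module is usually the first thing to take up with the hardware or driver vendor.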

So, you can try analyzing the dumps by yourself. However, if you are entitled to VMware support, I will do the job for you.

Cheers!

Unable To Delete Orphaned/Stale VMDK File

So today I got a case where we were trying to delete an orphaned flat.vmdk file.

A brief background of what was being experienced here:

There were three ESXi hosts and two datastores shared among them. A couple of folders on these two shared datastores contained only flat.vmdk files. These flat files were not associated with any virtual machines, and their last-modified dates were roughly a year old.

However, every time we tried to delete the file from the datastore browser GUI, we got the error:

Cannot Delete File [Datastore Name] File_Name.vmdk


And when we tried to delete the file from the command line using rm -f <file_name>, we got the error:

rm: cannot remove 'File.vmdk': No such file or directory

Also, we were able to move the file to another datastore and remove it there successfully, but a stale copy of the file was still left behind on the original datastore.

So, how do we remove this stale file?

Step 1:

  • Open an SSH session to every host that has access to the datastore where the stale file resides. 
  • In my case, that was all three hosts in the cluster.

Step 2:

  • Run the command below. It has to be executed over SSH (PuTTY) on every host that has connectivity to that datastore.
# vmkfstools -L release /vmfs/volumes/<datastore_name>/<folder_name>/<file_name>-flat.vmdk

This can result in one of two errors:

First error:
Could not open /vmfs/volumes/xxxxxxxx/xxxxxxx/xxxxxx-flat.vmdk 
Command release failed Error: Device or resource busy

Second error:
Command release failed 
Error: Inappropriate ioctl for device


In my case it was the second error.

The host that gives you the second error holds the stale lock on the file. In my case, all three hosts returned the second error, so I had to reboot all three of them. 
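If you want to double-check which host actually owns the lock before resorting to reboots (useful when, as here, more than one host returns the second error), vmkfstools can also dump the lock metadata for the file. Depending on the build, the output is printed to the console and/or written to /var/log/vmkernel.log, and the owner field ends in the MAC address of the locking host's management interface:

# vmkfstools -D /vmfs/volumes/<datastore_name>/<folder_name>/<file_name>-flat.vmdk

Matching that MAC address against the vmkernel NICs of your hosts tells you which host to reboot first.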

Once the hosts are rebooted, you can successfully remove the stale flat.vmdk files. 

Understanding VMkernel.log for vMotion Operation

Let's decode the vMotion logging in VMkernel.log.

Open an SSH (PuTTY) session to the host where the virtual machine currently resides. Change to the log directory:


# cd /var/log 


Capture the live logging of VMkernel using the following command:

# tail -f vmkernel.log 
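On a busy host the vmkernel.log scrolls quickly, so it can help to filter the live output down to migration-related lines. This is purely a convenience; note that it hides some of the surrounding entries (Hbr, Tcpip_Vmk and so on) that appear in the walkthrough below, so switch back to the plain tail if you want the full picture:

# tail -f vmkernel.log | grep -iE 'migrate|vmotion'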



Perform a vMotion of a virtual machine residing on this host to any other available host with shared storage. You will see the logging below:

I will break down the logging, using "//" to mark my comments.

2015-12-25T16:39:25.565Z cpu4:2758489)Migrate: vm 2758492: 3284: Setting VMOTION info: Source ts = 1451061663105920, src ip = <192.168.1.176> dest ip = <192.168.1.177> Dest wid = 1830931 using SHARED swap

//The first line, Migrate: vm 2758492, does not tell you which virtual machine is being migrated; it gives the world ID of the virtual machine that is about to be migrated. To find the world ID of a virtual machine, run the command # esxcli vm process list before migrating. This command lists the world IDs of all the virtual machines residing on the host.
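For reference, the output of that command looks roughly like the block below, with one entry per running virtual machine (the values are placeholders, except the World ID, which I have filled in to match this example; it is the value you match against the Migrate entry in the vmkernel.log):

# esxcli vm process list
<VM display name>
   World ID: 2758492
   Process ID: 0
   VMX Cartel ID: <cartel_id>
   UUID: <uuid>
   Display Name: <VM display name>
   Config File: /vmfs/volumes/<datastore>/<vm_folder>/<vm_name>.vmx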

// The Setting VMOTION info value 1451061663105920 is the vMotion ID. This vMotion ID is useful because grepping for it in hostd.log, or in the vmware.log residing in the virtual machine's directory, gives you further information about the vMotion. In vmware.log you can see the transitioning states of the vMotion, with each state performing a set of steps.
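For example, once you have the vMotion ID, a couple of greps like the ones below pull out the related entries (the vmware.log path depends on where the virtual machine's files live):

# grep 1451061663105920 /var/log/hostd.log
# grep 1451061663105920 /vmfs/volumes/<datastore>/<vm_name>/vmware.log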

// The src ip, where this virtual machine currently resides, is 192.168.1.176, and the dest ip, where the virtual machine is being migrated to, is 192.168.1.177.

// The Dest wid 1830931 is the world ID this virtual machine will have on the destination host once the vMotion is completed. 


2015-12-25T16:39:25.567Z cpu4:2758489)Tcpip_Vmk: 1288: Affinitizing 192.168.1.176 to world 2772001, Success
2015-12-25T16:39:25.567Z cpu4:2758489)VMotion: 2734: 1451061663105920 S: Set ip address '192.168.1.176' worldlet affinity to send World ID 2772001
2015-12-25T16:39:25.567Z cpu4:2758489)Hbr: 3340: Migration start received (worldID=2758492) (migrateType=1) (event=0) (isSource=1) (sharedConfig=1)

// Here the host is being prepared for migration by affinitizing the vMotion IP address to the world that will send the migration data.

//The Migration start received line logs the vMotion type. The world ID 2758492 is recorded, and migrateType=1 indicates a host migration.

//The host where I am currently logged in via SSH is the source host, which is why the entry shows isSource=1 and sharedConfig=1.


2015-12-25T16:39:25.567Z cpu5:2771999)CpuSched: 583: user latency of 2771999 vmotionStreamHelper0-2758492 0 changed by 2771999 vmotionStreamHelper0-2758492 -1
2015-12-25T16:39:25.568Z cpu4:2772001)MigrateNet: 1186: 1451061663105920 S: Successfully bound connection to vmknic '192.168.1.176'

//Here the connection is successfully bound to the source host's vMotion vmkernel interface.

2015-12-25T16:39:25.570Z cpu5:33435)MigrateNet: vm 33435: 2096: Accepted connection from <::ffff:192.168.1.177>

// Here the destination host has accepted the connection for the vMotion.


2015-12-25T16:39:25.570Z cpu5:33435)MigrateNet: vm 33435: 2166: dataSocket 0x410958a8dc00 receive buffer size is -565184049
2015-12-25T16:39:25.570Z cpu4:2772001)MigrateNet: 1186: 1451061663105920 S: Successfully bound connection to vmknic '192.168.1.176'
2015-12-25T16:39:25.571Z cpu4:2772001)VMotionUtil: 3396: 1451061663105920 S: Stream connection 1 added.
2015-12-25T16:39:25.571Z cpu4:2772001)MigrateNet: 1186: 1451061663105920 S: Successfully bound connection to vmknic '192.168.1.176'
2015-12-25T16:39:25.572Z cpu4:2772001)VMotionUtil: 3396: 1451061663105920 S: Stream connection 2 added.

//Both the source and destination have established their connections and the vMotion process takes place. The vmkernel.log does not record the detailed progress of the vMotion; if you check the vmware.log for this virtual machine, you can see the states and progress of the vMotion in detail.

2015-12-25T16:39:25.848Z cpu3:2758492)VMotion: 4531: 1451061663105920 S: Stopping pre-copy: only 0 pages left to send, which can be sent within the switchover time goal of 0.500 seconds (network bandwidth ~2.116 MB/s, 52403100% t2d)

//In short, this is how vMotion works:


  • A shadow VM is created on the destination host.
  • Each memory page is copied from the source to the destination over the vMotion network. This is known as the pre-copy.
  • Another pass is then made over the VM's memory, copying any pages that changed during the last pre-copy iteration.
  • The pre-copy iterations continue until no changed pages remain.
  • The source VM is stunned and the destination VM is resumed.
//Basically, the memory state of the virtual machine is transferred to the shadow virtual machine created on the destination host. That memory is made up of pages, and the pages are transferred to the shadow VM over the vMotion network. The more actively the VM's memory changes, the longer the vMotion process takes. 

//Towards the end of the vMotion, the source VM must be destroyed and operations must continue on the destination side. For this, ESXi has to determine that the last few memory pages can be transferred to the destination quickly enough, which is the switchover time goal of 0.5 seconds.

//So when the log says only 0 pages left to send, which can be sent within the switchover time goal of 0.500 seconds, it means there are no more changed memory pages left to transfer. The host therefore declares that the source VM can be destroyed, the vMotion can complete, and the destination VM can resume, all within the feasible switchover time. 


2015-12-25T16:39:25.952Z cpu5:2772001)VMotionSend: 3643: 1451061663105920 S: Sent all modified pages to destination (no network bandwidth estimate)

//This tells us that, for this vMotion ID, the source (the "S" in the line) has sent all modified memory pages to the destination.


2015-12-25T16:39:26.900Z cpu0:2758489)Hbr: 3434: Migration end received (worldID=2758492) (migrateType=1) (event=1) (isSource=1) (sharedConfig=1)
2015-12-25T16:39:26.908Z cpu3:32820)Net: 3354: disconnected client from port 0x200000c
2015-12-25T16:39:26.967Z cpu3:34039)DLX: 3768: vol 'Recovery_LUN', lock at 116094976: [Req mode 1] Checking liveness:

//Here the migration has ended for that world ID and migration type. The virtual machine, which in my case resides on the Recovery_LUN datastore, is now locked by the destination host under the new world ID that was assigned during the vMotion.


So now you know what a successful vMotion looks like in the vmkernel.log.
The in-depth vMotion logging can be found in the vmware.log, which is fairly self-explanatory once you know what to look for and where to look.