Friday 27 November 2020

MS ATA Gateway Service not starting after Nutanix Move

Microsoft Advanced Threat Analytics Gateway not starting after Nutanix Move

The Issue

After moving one of our Domain Controllers from Hyper-V to Nutanix AHV using Nutanix Move, I was unable to start the Microsoft ATA Lightweight Gateway service.

ATA not starting

Checking the log in C:\Program Files\Microsoft Advanced Threat Analytics\Gateway\Logs\Microsoft.Tri.Gateway-Errors.log showed the following error:

Error [WebClient+<InvokeAsync>d__8`1] System.Net.Http.HttpRequestException: PostAsync failed [requestTypeName=StopNetEventSessionRequest]

Log error

This led me to this blog post, which explained the issue with the MSFT_NetEventSession WMI class. Unfortunately, rebuilding the WMI repository did not help.
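
For reference, the usual commands for checking and rebuilding the WMI repository are below; verifyrepository checks consistency, salvagerepository attempts a repair and resetrepository rebuilds it from scratch (run them from an elevated prompt, and treat the last one with care).

winmgmt /verifyrepository
winmgmt /salvagerepository
winmgmt /resetrepository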

It did, however, lead me to this WMI query, which on my system returned a generic error instead of nothing.

Get-WmiObject -Namespace root\standardcimv2 -class "MSFT_NetEventSession" | Select Name

WMI Generic Error

Resolution

Since the network adapter configuration was one of the few things that changed during the move, and since I knew the original adapter would still be present in Device Manager, I decided to try removing the old device.

Run Device Manager and enable “Show hidden devices” so that the old adapters are visible

Show Hidden Devices

Remove the hidden Hyper-V Network Adapter

Remove Hyper-V Adapter
Remove Hyper-V Adapter 2

I noticed an old, hidden ISATAP adapter as well, which I also removed. I suspect this was the cause of the issue.

Remove ISATAP Adapter
Remove ISATAP Adapter 2
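
If you’d like to find these stale adapters from PowerShell before removing them, here’s a quick sketch, assuming the PnpDevice cmdlets are available (Windows 8.1 / Server 2012 R2 and later); non-present devices normally report a status of Unknown.

# List network-class devices Windows still knows about but which are no longer present
Get-PnpDevice -Class Net | Where-Object Status -eq 'Unknown' | Select-Object FriendlyName, InstanceId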

Once the adapters were removed, the WMI query worked.

working wmi

And the service starts too. If removing the adapters doesn’t immediately fix things, uninstalling and reinstalling the gateway once they’re gone should resolve it.

service running

Written with StackEdit.

Sunday 22 November 2020

VLAN on Ubuntu Live Disc

Configure VLAN tagging on Ubuntu Live DVD

Most of the guides I found online show you how to configure a VLAN interface on Ubuntu by running apt install vlan. That’s all good, but what happens if you’re booted into a Live DVD OS and don’t already have network access to install a package? Well, it turns out you can use the ip command to add a VLAN-tagged interface on top of your physical interface and get connectivity.


I’ve tested these commands on both an Ubuntu 16.04 and a 20.04 Live DVD, and both work fine.

Here’s what you need to do

Show your existing interfaces

ip -c -br a

Once you can see which interfaces are up, note the interface name of the one that carries the VLAN you’re interested in. In my case, enp4s0f1 has VLAN ID 777 tagged on it.

ip brief

Add a VLAN interface to your physical interface

sudo ip link add link enp4s0f1 name enp4s0f1.777 type vlan id 777

Bring the new interface up

sudo ip link set enp4s0f1.777 up

Add an IP address to the interface. You can use CIDR notation (e.g. 192.168.0.1/24) or IP/netmask (e.g. 192.168.0.1/255.255.255.0)

sudo ip a add 172.23.98.176/24 dev enp4s0f1.777

Add a default gateway

sudo ip route add default via 172.23.98.254

Ping something

ping -c 3 172.23.98.254

commands

Hopefully you now have connectivity.
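
If the ping fails, it’s worth re-checking the address and route you just added before looking anywhere else:

ip -c -br a
ip route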

Written with StackEdit.

VLAN on Hiren's Boot CD

Configure VLAN tagging on Hiren’s Boot CD

I had to configure a network card to use a VLAN on some Live CD operating systems to image a disk on a server. Luckily with HBCD it’s pretty much the same process as in normal Windows:


Wait for all your devices to be detected and configured

hbcd-waitfordevices.png

Click Start > Run

hvcd-run.png

Type ncpa.cpl and click OK

hbcd-ncpa.png

Right click in some white space in the Network Connections window, click View > Details

hbcd-viewdetails.png

Right click on the enabled NIC that has your VLAN configured and click properties

hbcd-nicproperties.png

Click the Configure button for the NIC

hbcd-confignic.png

Click the Advanced tab and look for something like “VLAN ID”. This may be referred to differently depending on the network adapter and/or driver. Set the relevant VLAN ID and click OK

hbcd-advancedvlan.png

Go back into the NIC properties, select Internet Protocol Version 4 (TCP/IPv4) and click Properties

hbcd-ipv4.png

Set an IP address, subnet mask and gateway address.

hbcd-ipv4props.png
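
If you’d rather script this last step, netsh can do the same job from a command prompt; the connection name and addresses below are just placeholders for your own values. The VLAN ID itself still needs to be set in the adapter’s advanced properties as above.

netsh interface ip set address name="Local Area Connection" static 192.168.0.10 255.255.255.0 192.168.0.1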

Test it!

hbcd-ping.png

Hopefully it’s all working now.

Written with StackEdit.

Friday 13 November 2020

Packer for Nutanix AHV Windows Templates

Packer for Nutanix AHV

Packer automates the creation of virtual machine images. It’s quite a bit simpler than SCCM to set up if you just want basic, up to date images that you can deploy virtual machines from. Packer uses ‘builders’ to create virtual machine images for various hypervisors and cloud services. Unfortunately, there isn’t yet a builder for Nutanix AHV. AHV is based on KVM for virtualisation though, so it’s possible to generate images using a basic KVM hypervisor and then upload them to the image service ready to deploy.

Since it’s not possible to create the templates natively in the platform, a helper virtual machine is needed to run KVM and build the images. In this post, I’ll go through the set up for an Ubuntu virtual machine and give a Windows Server 2019 example ready to deploy.

I used Nutanix CE 5.18 in most of my testing, but it’s also possible to run the KVM builder on a physical machine or any VM that supports CPU passthrough such as VMware workstation, Hyper-V or ESXi. If you’re running an older version of Nutanix AOS/AHV then it may still be possible with caveats. Check the troubleshooting section for more information.

Create the builder virtual machine

Create the VM in AHV

  • Create VM, 2 vCPU, 4 GB RAM; I’m using the name packer
  • Add disk, ~100 GB
  • Enable CPU passthrough in the Nutanix CLI

SSH to a CVM as nutanix and run the following command

acli vm.update <VMNAME> cpu_passthrough=true

cpu passthrough

Install Packer with the following commands (first-party guide here). Make sure you use the latest version; the URL below is just an example, but it is the version I used.

sudo apt update
sudo apt -y upgrade
wget https://releases.hashicorp.com/packer/1.6.5/packer_1.6.5_linux_amd64.zip
unzip packer_1.6.5_linux_amd64.zip
sudo mv packer /usr/local/bin/

Run the packer binary to make sure it’s in the path and executable

packer

packer run

Download the Packer Windows Update provisioner and install it per the instructions.

wget https://github.com/rgl/packer-provisioner-windows-update/releases/download/v0.10.1/packer-provisioner-windows-update_0.10.1_linux_amd64.tar.gz
tar -xvf packer-provisioner-windows-update_0.10.1_linux_amd64.tar.gz
chmod +x packer-provisioner-windows-update
sudo mv packer-provisioner-windows-update /usr/local/bin/

Install QEMU, a VNC viewer and Git

sudo apt -y install qemu qemu-kvm tigervnc-viewer git

Check that you have virtualisation enabled (kvm-ok is provided by the cpu-checker package if it’s not already installed)

kvm-ok

Add your local user to the kvm group and reboot

sudo usermod -aG kvm dave
sudo reboot

I have published an example Windows build on GitHub here

Clone the example files to your local system with the following command

git clone https://github.com/bobalob/packer-examples

You will need to download the Windows Server 2019 ISO from here, the Nutanix VirtIO drivers from here and LAPS from here.

Place the Windows installation media in the iso folder and the two MSI files in the files folder. They must be named exactly as in the win2019.json file for this to work; update the JSON file if your MSI or ISO file names differ. Use this as a base to build from. You can upload additional MSIs or add scripts. Experiment with the Packer provisioners.

If you run with a different ISO, you will either need to obtain the SHA256 hash for the ISO or just run the packer build command and it will tell you what hash it expects. Be careful here: I trusted my ISO file, so I simply copied the hash Packer wanted into my JSON and ran the build again.

If you wish to change the password for the build, change the variable in the win2019.json file and the plain-text password in the Autounattend.xml file.
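
For orientation, the relevant variables in a Packer JSON template usually look something like the block below. The names and values here are illustrative only, so check win2019.json in the repo for the real ones.

"variables": {
    "iso_url": "./iso/windows_server_2019.iso",
    "iso_checksum": "sha256:<hash of your ISO>",
    "winrm_password": "ChangeMe123!"
}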

My folder structure looks like this:

win2019 folders

Run packer build

cd packer-examples/win2019/
packer build win2019.json

packer building

Once the machine is built, upload the artifact from vm/win2019-qemu/win2019 to the image service in Prism

upload the file

Once uploaded you can create a VM from the image. Hopefully it will have all the correct VirtIO drivers installed.

Finished VM

Troubleshooting

In all cases where a build fails, it’s useful to set the PACKER_LOG environment variable as follows:

PACKER_LOG=1 packer build win2019.json

==> qemu: Failed to send shutdown command: unknown error Post “http://127.0.0.1:3931/wsman”: dial tcp 127.0.0.1:3931: connect: connection refused

In my case this was because I had put my sysprep command in a regular script. Since sysprep runs and shuts the machine down, there is no longer a WinRM endpoint for Packer to connect to.

The issue with this is that Packer attempts to clean up once it has run the script and then runs the shutdown_command. I removed sysprep from the script and ran it as my shutdown_command instead.
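
In other words, the sysprep call ends up as the builder’s shutdown_command rather than in a provisioner script. A sketch of the idea is below; the exact sysprep arguments and timeout are assumptions, so adjust them for your own build.

"shutdown_command": "C:\\Windows\\System32\\Sysprep\\sysprep.exe /generalize /oobe /shutdown /quiet",
"shutdown_timeout": "30m"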

Build hangs when installing VirtIO MSI

I realised this is because the network driver installs and disconnects the network for a second, causing Packer to hang and not receive output from the script. Changing the build VM NIC to e1000 in the JSON file means the NIC doesn’t get disconnected when installing VirtIO.
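
With the QEMU builder this is the net_device option; the default is the virtio network device, so switching it in the builder block is a one-line change:

"net_device": "e1000"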

OpenJDK / Java issue with ncli

Running ncli fails with an error similar to:

System default is
but java 8 is required to run ncli

Edit the ncli file in a text editor and replace

JAVA_VERSION=`java -version 2>&1 | grep -i "java version"

with

JAVA_VERSION=`java -version 2>&1 | grep -i "openjdk version"
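
If you’d rather make that edit in one shot on the CVM, something along these lines should work, assuming ncli is on the PATH (the -i.bak switch keeps a backup of the original file):

sed -i.bak 's/"java version"/"openjdk version"/' "$(which ncli)"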

MSR 0xe1 to 0x0 error on Ryzen system

Fix here

Essentially, run the following and try again. If this fixes it, try the linked blog for the permanent fix.

echo 1 | sudo tee /sys/module/kvm/parameters/ignore_msrs
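
If that works, the usual way to make it stick across reboots is to set the module parameter in a modprobe config file; this is the generic approach, the linked blog has the full detail.

# Takes effect after the kvm module is reloaded or the host is rebooted
echo "options kvm ignore_msrs=1" | sudo tee /etc/modprobe.d/kvm.conf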

Windows build hangs the host VM or the guest

I think this is a problem in AHV 5.10; it’s tested as working on AHV 5.18 CE. A workaround is changing the machine type to pc or pc-i440fx-4.2. Unfortunately this appears to be REALLY slow! It might be worth experimenting with different machine types; q35 is just as slow.

Update the Qemu args to include the machine type:

"qemuargs": [
    [
      "-m",
      "2048"
    ],
    [
      "-smp",
      "2"
    ],
    [
      "-machine",
      "pc"
    ]
  ]

Written with StackEdit.

Saturday 7 November 2020

Add disk to Nutanix CE

Adding a new disk to Single Node Nutanix CE

There wasn’t much detailed information for adding an additional disk to a Nutanix CE node, so I’ve compiled the steps and written a guide here. This guide is for a single-node CE cluster, and therefore I have to stop the cluster to perform most of these actions. If you have a multi-node cluster, most of the steps can be done by shutting down a single CVM, but these are the steps I took and verified.

At a high level, the drive should be added to the physical host then added to the CVM config. Once the CVM is booted back up, there are some scripts to run to partition, format, mount and set the tier for the drive.

I have read that there is a script named ce_add_disk but running this on my system didn’t work.

This was tested on AOS 5.18 / 2020.09.16.


On the CVM

SSH to the CVM and Stop the cluster

cluster stop

Cluster Stopping

Cluster Stopped

Shut down the CVM

cvm_shutdown -P now

CVM Shutdown


On the AHV Host

SSH to the AHV host and shut it down. Check that the CVM is not running first.

virsh list --all

poweroff

virsh list

CVM Shutdown

Open up your “server” and add your disk. Bonus points for professional cable management and mounting 😉

Add Disk

Power up your host and wait for it to come back up. Once you can SSH back in, run lsblk to identify the new disk. My disk had been cleaned using DISKPART, so it’s the one listed with no partitions. Note the sdX identifier for the new drive. In this case my drive is sdb.

lsblk

Run hdparm -i /dev/sdX and note the Model and SerialNo of the device

hdparm

Run ls -la /dev/disk/by-id and note the filename that points to your /dev/sdX

ls -la path

Also note the wwn-0x5XXXXXXXX number that points to your /dev/sdX

ls -la wwn

Now you have all the information required to add the disk to the CVM config. Edit the CVM config by running virsh list --all to get the CVM name, then run virsh edit CVMNAME. You may need to shut down the CVM with virsh shutdown CVMNAME first if it’s not already off.

virsh edit

Add a new disk block to the XML config. I copied and pasted an existing block. Make sure to update the entries in the following list (there’s an illustrative example after the list). The editor is vi, so press i to enter INSERT mode.

  • source dev is the ata-DeviceName path from the ls -la command
  • target dev is a unique device ID for the CVM; I went with sdd as it was the next one available across all the disk blocks currently in this file
  • serial is the SerialNo of the device from the hdparm command
  • wwn is the id after wwn-0x in the ls -la command
  • product is the Model from the hdparm command; however, this disk’s model name caused the update of the CVM config to fail. It appears you need exactly two words, alphanumeric characters only, so I modified the model string appropriately
  • address is a unique SCSI LUN for the CVM; I just incremented the unit by 1 from the previous disk block
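
To give an idea of the shape of the block, here’s an illustrative example with made-up values; copy a real disk block from your own config and edit it per the list above.

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <!-- source dev: the ata-... path from /dev/disk/by-id -->
  <source dev='/dev/disk/by-id/ata-ExampleVendor_ExampleModel_SERIAL1234'/>
  <target dev='sdd' bus='scsi'/>
  <serial>SERIAL1234</serial>
  <wwn>5000000000000001</wwn>
  <product>Example Disk</product>
  <!-- address: bump unit by one from the previous disk block -->
  <address type='drive' controller='0' bus='0' target='0' unit='3'/>
</disk>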

add block

Once you’ve made your edits, hit Esc, then type :wq and press Enter to save.

virsh edited


Back on the CVM

Start your CVM (virsh start CVMNAME) and SSH to it. If you had previously stopped the cluster then it should still be stopped. You can check with cluster status. Since this is a single node guide, I won’t restart the cluster just yet.

Run lsblk to see what sdX designation the new drive has been given inside the CVM. In my case the drive has no partitions and is not mounted.

lsblk cvm

To create a new partition on the drive run sudo cluster/bin/repartition_disks -d /dev/sdX

repartition disks

The disk now shows with a single partition

Partitioned disk

Format the disk with sudo cluster/bin/clean_disks -p /dev/sdX1 (get the partition ID from the lsblk output above; mine is sdb1)

clean disk

Mount the new partition with sudo cluster/bin/mount_disks

Mount disk

Verify the partition is mounted with lsblk

Partitioned mounted

Restart the cluster with cluster start. Once the cluster is up, verify the disk is visible using ncli disk ls. If the disk is an SSD, then note the disk id number. In my example it is 0005b1cc-e043-664e-02c5-001b2118e830::1480606.

If the disk is not showing in ncli disk ls then you may need to stop and restart the cluster again. In my case, I waited around 10 minutes with no disk showing up, but after a cluster stop / start the disk was immediately visible.*

* It’s possible I ran the mount command while the cluster was running, which is what actually brought the disk online in ncli and PE. If I get a chance I’ll test this more thoroughly.

ncli ls

If this is a hard disk then skip this step. If you added an SSD, you need to update the disk tier with the following ncli command:

ncli disk update id=0005b1cc-e043-664e-02c5-001b2118e830::1480606 tier-name=SSD-SATA

ncli update

Once the disk is showing and in the correct tier, you should shortly be able to see the disk in Prism Element. Check the Hardware > Table > Disk section.

ncli update

Sources

/u/gdo on this Reddit thread

A now deleted blog post

Written with StackEdit.
