Creating an iSCSI template from scratch

Note: This method is now deprecated; please use the new metalcloud-cli os-template register command instead. For more information, see Adding OS Templates.

Creating an iSCSI template is relatively straightforward in principle, but there are a few important gotchas that need to be taken into consideration.

Installation with native iSCSI support in the OS installer

  1. Deploy an instance array with an empty unformatted drive.

  2. An AFC will be stuck waiting for SSH. Skip it. (This will not be necessary in the future.)

  3. Retrieve iSCSI credentials

    Using the CLI, identify the infrastructure then the instance-array and drive-array:

    alex@AlexandrusMBP2 ~ % metalcloud-cli infra get -id template-dev
    Infrastructure template-dev (27021) - datacenter ro-bucharest owner [email protected]
    +-------+----------------+----------------------+--------------------------------------------------------------------------+--------+
    | ID    | OBJECT_TYPE    | LABEL                | DETAILS                                                                  | STATUS |
    +-------+----------------+----------------------+--------------------------------------------------------------------------+--------+
    | 38084 | InstanceArray  | instance-array-38084 | 1 instances (1 RAM, 1 cores, 0 disks pxe_iscsi )                         | active |
    | 48762 | DriveArray     | drive-array-48762    | 1 drives - 40.0 GB iscsi_ssd  attached to: instance-array-38084 [#38084] | active |
    +-------+----------------+----------------------+--------------------------------------------------------------------------+--------+
    Total: 2 elements
    
    
    metalcloud-cli instance-array get -id instance-array-38084 --show-iscsi-credentials --format yaml
    - details: 'M.8.32.v2 (#9095) '
      id: 59725
      iscsi: 'Initiator IQN: iqn.2020-10.com.metalsoft.storage:instance-59725 Username:
        GMBYHg5keXknd Password: WpccjYD9NgYhR '
      status: active
      subdomain: instance-59725.metalsoft.io
      wanIp: 185.90.50.238
    
    metalcloud-cli drive-array get -id  drive-array-48762 --show-iscsi-credentials --format yaml
    - attachedTo: instance-59725
      credentials: 'Target: 100.97.0.0 Port:3260 IQN:iqn.2013-01.com.metalsoft:storage.pb429od.q6xjxws.6vykjr4
        LUN ID:47'
      details: none  none
      id: 75618
      label: drive-75618
      sizemb: 40960
      status: active
      template: ""
      type: iscsi_ssd
    

    Alternatively, in the infrastructure editor click on the drive array then select the drive:

    ../_images/creating_an_os_template_from_scratch_03.png

  4. Log in to the IPMI interface of the server.

    The server should be cycling in the iPXE stage:

    ../_images/creating_an_os_template_from_scratch_01.png

    That is fine, as there is no operating system installed on the drive yet.

    Attach an ISO image via the virtual media facility of the IPMI interface of the server. Some servers will reboot at this point. I’ve used Ubuntu 20.04 ‘live server’.

  5. If the installer has support for iSCSI, simply let it run its course, mount the drive from within the installer, and continue the setup that way. Most Linux installers should support this natively.

Installation by mounting the iSCSI LUN using the installer shell

This option is used when the installer does not support iSCSI installation per se but the OS itself does. For example, we use this method with Ubuntu 20.04:

  1. Perform the steps 1-3 above.

  2. Insert the ISO or virtual media with the OS and boot it via the IPMI interface.

  3. In the installer, enter the shell. I use the SSH version of the shell as it makes pasting easy. Ubuntu’s Subiquity server installer has an option in the Help menu to access the system during installation via SSH.

  4. Log in to the iSCSI portal

    Set the initiator IQN by editing the /etc/iscsi/initiatorname.iscsi file (this identifies your host to the iSCSI target - the SAN storage):

    InitiatorName=iqn.2020-10.com.metalsoft.storage:instance-59725
    

    Edit the CHAP credentials in /etc/iscsi/iscsid.conf:

    node.session.auth.authmethod = CHAP
    node.session.auth.username = GMBYHg5keXknd
    node.session.auth.password = WpccjYD9NgYhR
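
    If iscsid is already running in the live environment, restart it so that the new initiator name and CHAP settings are picked up (a precautionary step; if the daemon is not running yet, iscsiadm will start it with the new settings on first use):

    systemctl restart iscsid.service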
    

    Discover and log in to the iSCSI target:

    iscsiadm -m discovery -t st -p 100.97.0.0
    iscsiadm -m node -T iqn.2013-01.com.metalsoft:storage.pb429od.q6xjxws.6vykjr4 -p 100.97.0.0 -l
    Logging in to [iface: default, target: iqn.2013-01.com.metalsoft:storage.pb429od.q6xjxws.6vykjr4, portal: 100.97.0.0,3260]
    Login to [iface: default, target: iqn.2013-01.com.metalsoft:storage.pb429od.q6xjxws.6vykjr4, portal: 100.97.0.0,3260] successful.
    

    Check if the drive has been attached. In this case we now have an sda drive attached. You can see this in dmesg:

    dmesg | grep sda
    [ 2288.470097] sd 3:0:0:47: [sda] 83886080 512-byte logical blocks: (42.9 GB/40.0 GiB)
    [ 2288.470099] sd 3:0:0:47: [sda] 8192-byte physical blocks
    [ 2288.470181] sd 3:0:0:47: [sda] Write Protect is off
    [ 2288.470182] sd 3:0:0:47: [sda] Mode Sense: 43 00 10 08
    [ 2288.470340] sd 3:0:0:47: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
    [ 2288.470511] sd 3:0:0:47: [sda] Optimal transfer size 8192 bytes
    [ 2288.511975] sd 3:0:0:47: [sda] Attached SCSI disk
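
    You can also list the block devices directly; lsblk is part of util-linux and is available in most installer environments (the TRAN column should show iscsi for the attached drive):

    lsblk -o NAME,SIZE,TYPE,TRAN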
    
  5. Switch back to the installer and perform the regular install, but DO NOT REBOOT.

    ../_images/creating_an_os_template_from_scratch_04.png

  6. After the setup completes, in the console, chroot into the newly installed system:

    cd /target
    mount -t proc /proc proc/
    mount --rbind /sys sys/
    mount --rbind /dev dev/
    chroot /target
    
  7. Add the iSCSI initrd modules. Edit the /etc/initramfs-tools/modules file and append:

    iscsi_tcp
    iscsi_ibft
    

    These are needed so that the initramfs has the necessary support to mount the root filesystem from an iSCSI drive.

  8. Add the metalsoft-iscsi initrd hooks and scripts. Note that this is for Ubuntu; we provide different script sets for different operating systems. Note: These helpers will also disable iSCSI-related systemd and sysV scripts and change some DHCP defaults. This is important, as those scripts might otherwise trigger a premature logout from the iSCSI target.

    wget http://<repo>/metalsoft/ubuntu/metalsoft-ubuntu-helpers-1.0.0_1.0.0_amd64.deb
    dpkg -i metalsoft-ubuntu-helpers-1.0.0_1.0.0_amd64.deb
    
  9. Update initramfs

    update-initramfs -u
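
    To confirm the iSCSI modules actually made it into the rebuilt image, you can list the initramfs contents (a quick sanity check; adjust the image name to the kernel installed in the chroot, e.g. the one reported by update-grub in the next step):

    lsinitramfs /boot/initrd.img-5.4.0-52-generic | grep -i iscsi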
    
  10. Add iBFT as the root source in GRUB

    Edit /etc/default/grub and add iscsi_firmware ip=ibft to the kernel command line:

    GRUB_CMDLINE_LINUX="iscsi_firmware ip=ibft edd=off"
    

    Update GRUB:

    $ update-grub
    Sourcing file `/etc/default/grub'
    Sourcing file `/etc/default/grub.d/init-select.cfg'
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-5.4.0-52-generic
    Found initrd image: /boot/initrd.img-5.4.0-52-generic
    Adding boot menu entry for UEFI Firmware Settings
    done
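
    Optionally, confirm that the new kernel parameters made it into the generated configuration (the path below assumes the standard Ubuntu GRUB layout):

    grep -m1 'iscsi_firmware ip=ibft' /boot/grub/grub.cfg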
    
  11. Install and activate Python 2

    apt-get install python2
    sudo update-alternatives --install /usr/bin/python python /usr/bin/python2 1
    
  12. Disable all built-in iSCSI services

    This is needed because these services would otherwise perform a premature logout and the system would hang before reboot.

    echo "Disabling all iscsi related services to avoid the hanging issue"
    systemctl daemon-reload
    systemctl disable iscsid.service
    systemctl disable iscsid.socket
    systemctl disable open-iscsi.service
    systemctl enable metalsoft.service
    echo "Removing from sysV as well"
    sudo rm -f /etc/rc0.d/K01open-iscsi
    sudo rm -f /etc/rc0.d/K01iscsid
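
    A quick way to confirm the services ended up in the expected state (metalsoft.service is provided by the helpers package installed earlier):

    systemctl is-enabled iscsid.service iscsid.socket open-iscsi.service metalsoft.service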
    
  13. Install snmpd for monitoring

    apt-get install snmpd
    
  14. Reduce the DHCP timeout to 10s

    Set the timeout property to 10 in /etc/dhcp/dhclient.conf. This is needed on systems with more than one NIC to speed up the boot process.

    sed -i 's/#timeout 300/timeout 10/g' /etc/dhcp/dhclient.conf
    
  15. Reboot and check that everything works, paying particular attention to the iSCSI-related services; a quick post-boot sanity check is sketched below. If something goes wrong, see the Troubleshooting section at the end of this document.
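
    A minimal post-boot check (values taken from the example above; adjust to your own SAN target subnet):

    # confirm the root filesystem is served from the LVM volume on the iSCSI drive
    findmnt /
    # confirm the classless static route towards the SAN target subnet is present (see the note further below)
    ip route | grep 100.97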

  16. Clean up the root filesystem. It is useful to provide the user with a clean slate.

    history -c
    history -w
    sudo su
    
    #clear audit logs
    if [ -f /var/log/audit/audit.log ]; then
        cat /dev/null > /var/log/audit/audit.log
    fi
    if [ -f /var/log/wtmp ]; then
        cat /dev/null > /var/log/wtmp
    fi
    if [ -f /var/log/lastlog ]; then
        cat /dev/null > /var/log/lastlog
    fi
    #cleanup persistent udev rules
    if [ -f /etc/udev/rules.d/70-persistent-net.rules ]; then
        rm /etc/udev/rules.d/70-persistent-net.rules
    fi
    #cleanup /tmp directories
    rm -rf /tmp/*
    rm -rf /var/tmp/*
    #cleanup current ssh keys
    #rm -f /etc/ssh/ssh_host_*
    
    #cleanup apt
    apt-get clean
    #cleanup shell history
    history -c
    history -w
    exit
    
  17. Shut down the server

    Use the IPMI shutdown button (or the CLI command below) so as not to leave a ‘power off’ line in the bash history. Also remove the virtual media to prevent the server from rebooting from the installation medium.

    metalcloud-cli instance power-control --autoconfirm -id 59725 --operation off
    
  18. Convert the drive into a template. If everything checks out, you are now ready to do so:

    metalcloud-cli volume-template create  --id 75618 --boot-methods-supported pxe_iscsi --boot-type uefi_only --label "ubuntu-20-04-1" --description "Ubuntu 20.04.1" --name "Ubuntu 20.04.1" --os-bootstrap-function-name "provisioner_os_cloudinit_prepare_ubuntu" --os-type "Ubuntu" --os-version "20.04" --os-architecture "x86_64"
    
  19. The template should now be visible as a ‘private template’:

    alex@Alexandrus-MacBook-Pro-2 ~ % metalcloud-cli volume-template list                              
    Volume templates I have access to as user alex.bordei@metalsoft.io:
    +-------+-----------------------------------------+----------------------------------+-------+---------------------------+------------------------+
    | ID    | LABEL                                   | NAME                             | SIZE  | STATUS                    | FLAGS                  |
    +-------+-----------------------------------------+----------------------------------+-------+---------------------------+------------------------+
    | 173   | ubuntu-20-04-1                          | Ubuntu 20.04.1                   | 40960 | not_deprecated            | pxe_iscsi              |
    +-------+-----------------------------------------+----------------------------------+-------+---------------------------+------------------------+
    Total: 39 Volume templates
    

Now you will need to mark this template as public, which makes it available to all users of the platform, and/or experimental, which makes it available only to users that have the volume_templates experimental flag set.

A note on the classless static route for SAN

The part that generates issues for most people is accidentally cutting off the iSCSI connection. This often happens while DHCP runs on all NICs: the SAN target devices are on a different subnet than the instance's SAN subnet, so the iSCSI connection goes through a gateway (unique to each server). During the DHCP phase, when the WAN NIC has already configured a default gateway on the system but the SAN NIC is not yet configured, there is a momentary ‘no route to host’ situation for the iSCSI traffic, which freezes the system.

This is the typical routing table of a server instance:

default via 185.90.50.245 dev ens2f1  proto static  metric 100  #THIS IS THE DEFAULT GW, VIA THE WAN NETWORK
100.64.16.48/29 dev ens2f0  proto kernel  scope link  src 100.64.16.54  #THIS IS THE SAN NETWORK
100.97.0.0/16 via 100.64.16.49 dev ens2f0 #THIS IS THE SAN TARGET SUBNET
185.90.50.244/30 dev ens2f1  proto kernel  scope link  src 185.90.50.246  metric 100 #THIS IS THE WAN NETWORK

Notice the classless static route 100.97.0.0/16 via 100.64.16.49 dev ens2f0; this needs to be present in the system at all times.

To ensure this, make sure that the SAN NIC is always configured first via DHCP, or configure it statically using the information about the NIC from the iBFT, as sketched below.
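
One way to do the static configuration is to read the SAN NIC and target details that the iscsi_ibft module exposes under /sys/firmware/ibft and add the route explicitly (a sketch, using the addresses from the routing table above):

# NIC address, gateway and target address recorded in the iBFT
cat /sys/firmware/ibft/ethernet0/ip-addr
cat /sys/firmware/ibft/ethernet0/gateway
cat /sys/firmware/ibft/target0/ip-addr

# add the classless static route towards the SAN target subnet
ip route add 100.97.0.0/16 via 100.64.16.49 dev ens2f0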

Populating the iBFT table

Certain operating system installers can read the iBFT table for iSCSI disk information and use it to perform the install, avoiding the need to manually log in to the target during the install. This is useful for unattended image build processes. Both Anaconda (CentOS/RHEL) and Subiquity (Ubuntu) can read the iBFT table out of the box, but the feature might need to be enabled. Windows also supports this, but it depends on other factors such as support for the NIC in question.

The iPXE bootloader can be used to populate this table prior to an installation. ‘Smart NICs’ such as the Emulex OCE-14000 or Mellanox cards can also populate this table.

To use it with iPXE:

  1. Perform the steps 1-3 above.

  2. Insert the ISO or virtual media with the OS but DO NOT BOOT IT. Let iPXE boot first, use it to log in to the iSCSI volume so that the iBFT table is populated, then boot the inserted virtual media as follows:

    When prompted press Control+B then:

    dhcp
    set username GMBYHg5keXknd
    set password WpccjYD9NgYhR
    set initiator-iqn iqn.2020-10.com.metalsoft.storage:instance-59725
    set keep-san 1
    sanhook --keep iscsi:100.97.0.0:::2F:iqn.2013-01.com.metalsoft:storage.pb429od.q6xjxws.6vykjr4
    sanboot --no-describe https://releases.ubuntu.com/20.04/ubuntu-20.04.1-live-server-amd64.iso
    

    Other servers might use 0x81 or 0x82 as the drive descriptor for the CD-ROM; 0x80 should be the iSCSI volume.

    Note the 2F, which is the LUN ID in hexadecimal (47 decimal = 0x2F in our case). It can be left empty, in which case it defaults to 0.

    The format of the sanhook iPXE command is iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname> where:

    • <servername> is the DNS name or IP address of the iSCSI target.
    • <protocol> is ignored and can be left empty.
    • <port> is the TCP port of the iSCSI target. It can be left empty, in which case the default port (3260) will be used.
    • <LUN> is the SCSI LUN of the boot disk, in hexadecimal. It can be left empty, in which case the default LUN (0) will be used.
    • <targetname> is the iSCSI target IQN.
  3. Continue the install as usual and the disk should be visible as a regular disk.

Troubleshooting

  1. If you run into issues and you are unable to boot your newly installed system, boot the installer again, log in to its console, then attach the drive by logging in to the iSCSI portal (steps 1-4 from above).

  2. Mount the iSCSI volume

    vgscan
     Found volume group "ubuntu-vg" using metadata type lvm2
    

    Mount the LVM volume and the boot partition

    mkdir /target
    mount /dev/ubuntu-vg/ubuntu-lv /target
    mount /dev/sda2 /target/boot/
    
  3. Mount the special filesystems into the target

    mount -t proc /proc /target/proc/
    mount --rbind /sys /target/sys/
    mount --rbind /dev /target/dev/
    
  4. Chroot into the drive

    chroot /target