Deploying MetalSoft to support legacy (PXE) deployments

In certain situations where older servers need to be automated, the legacy PXE protocol can be used to provision servers. Unlike the regular setup, this requires an in-band network connection between the agent and the leaf switches, either at L2 or through a DHCP relay configuration.

Cabling

The cabling is shown as follows:

Note: In this setup each server is connected to the ToR switch with 4 ports, plus a dedicated BMC port connected to the OOB network. MetalSoft doesn’t need an extra provisioning network and thus doesn’t need shared BMC ports. If a shared port is present, MetalSoft will simply ignore the extra port in the OS. Provisioning is done over the in-band network by reconfiguring the switch during the setup.

In fact, our recommendation is also NOT to use the shared port for security reasons. A rogue tenant could potentially use it to access the OOB network.

IP Allocation

MetalSoft uses several IP ranges with various purposes such as OOB IPs, In-band IPs, storage IPs etc. The following IP ranges are the recommended defaults:

  1. WAN subnet - IPv4

    This subnet will be used to reach tenant servers from outside the managed datacenter. For a baremetal-as-a-service IaaS product this would be a routable IP address subnet.

    Example: 10.255.226.0/24

  2. WAN subnet - IPv6

    This subnet will be used to reach tenant servers from outside the managed datacenter. For a baremetal-as-a-service IaaS product this would be a routable IP address subnet.

    Example: fd1f:8bbb:56b3:2000:0000:0000:0000:0000/53

  3. SAN client subnet - IPv4

    This subnet is assigned to servers on the SAN interface.

    Example: 100.64.0.0/21

  4. SAN storage subnet - IPv4

    This subnet is used for the storage arrays. Servers will get (via DHCP) a static route that pushes storage traffic (towards this subnet) via the SAN interface on the server.

    Example: 100.96.0.0/16

  5. OOB Subnet - IPv4

    This subnet will be used to allocate IPs to server BMCs and potentially to switch BMCs.

    Example: 10.255.227.0/24

  6. Quarantine subnet - IPv4 (legacy)

    This subnet will be used by MetalSoft to allocate IPs to servers during server registration, before the server’s identity has been established. All unused ports on the ToR switch are configured to be part of the Quarantine VLAN (by convention VLAN ID 5). This allows zero-touch enrollment of servers while preventing someone with physical access from reaching tenant networks.

    Example: 172.16.0.0/24

  7. Dummy subnet (“Primary” WAN IP Subnet)

    This is only required for older HP switches. These IPs are permanently allocated on the interfaces as “Primary” IPs in order to avoid downtime when manipulating secondary IP addresses on the switch interfaces. Traffic should not be routed to them in any way.

    Example: 172.24.4.0/22

  8. Router ‘public’ IP Address

    This is the IP configured on the router, reachable from the outside world. It is the entry point towards the managed datacenter. It is also the endpoint for VPN connections if the VPN has been configured on this machine.

    Example: 83.246.0.140

  9. Router ‘in-band’ IP Address

    This is the IP configured on the router, reachable from the in-band network. This will be used as a gateway towards the datacenter agents as well as towards the internet or the rest of the network.

    Example: 172.16.10.1

  10. Router ‘out-of-band’ IP Address

    This is the IP configured on the router, reachable from the out-of-band network. This will be used as a gateway towards the datacenter agents for on-boarding of zero-touch servers and BMC configuration.

    Example: 10.255.227.1

  11. DC Agents machine IP Address

    This is the IP configured on the datacenter agents machine.

    Example: 172.16.10.6

  12. Top of Rack (ToR) 01 switch in-band IP Address

    This is the switch in-band IP address. In this example setup, it serves as:

    1. The gateway towards the Quarantine network

    2. The gateway towards the SAN in-band network

    3. The in-band management interface

    Example: 172.16.10.2

  13. Quarantine L3 interface

    This virtual switch interface (VSI) is used as a gateway for the quarantine network. This IP must be configured on one of the ToR switches.

    Example: 172.16.0.1
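
Before committing to an allocation, it can help to confirm that the ranges do not overlap and that each gateway address sits inside its subnet. The following is a minimal sketch using Python's standard `ipaddress` module. The dictionary keys and the helper names (`find_overlaps`, `misplaced_gateways`) are illustrative assumptions, not part of MetalSoft; the values are the examples from the list above.

```python
import ipaddress
from itertools import combinations

# Example ranges copied from the list above; replace them with your own allocation.
# The key names are hypothetical labels used only by this sketch.
SUBNETS = {
    "wan_ipv4": "10.255.226.0/24",
    "san_client": "100.64.0.0/21",
    "san_storage": "100.96.0.0/16",
    "oob": "10.255.227.0/24",
    "quarantine": "172.16.0.0/24",
    "dummy_wan": "172.24.4.0/22",
}

# Gateway addresses and the subnet each one is expected to live in.
GATEWAYS = {
    "10.255.227.1": "oob",        # router out-of-band address
    "172.16.0.1": "quarantine",   # quarantine L3 interface
}


def find_overlaps(subnets):
    """Return sorted pairs of subnet names whose ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return sorted(
        (a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])
    )


def misplaced_gateways(gateways, subnets):
    """Return gateway IPs that fall outside their expected subnet."""
    return sorted(
        ip for ip, name in gateways.items()
        if ipaddress.ip_address(ip) not in ipaddress.ip_network(subnets[name])
    )


if __name__ == "__main__":
    print("overlaps:", find_overlaps(SUBNETS))
    print("misplaced gateways:", misplaced_gateways(GATEWAYS, SUBNETS))
```

Both lists are empty for the example values; a non-empty result points at the ranges that need to be revisited.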

Router configuration

The reference architecture uses a router machine that bridges the in-band, out-of-band and the uplink links.

Edit /etc/netplan/50-cloud-init.yaml so that the netplan configuration looks like this:

network:
  version: 2
  ethernets:
    ens160:
      addresses: [83.246.0.140/24]
      gateway4: 83.246.0.1
      nameservers:
        addresses:
        - 8.8.8.8
        - 1.1.1.1
    ens192:
      addresses: [10.255.227.1/24]
    ens224:
      addresses: [172.16.10.1/30]
      routes:
        #this is a static route that takes DHCP traffic coming from the agents machine
        #to the quarantine network via the in-band link.
        - to: 172.16.0.0/16
          via: 172.16.10.2
        #this is a static route that allows a connection between a datacenter agent and the SAN network 
        #via one of the ToR switches. This is only required to initiate volume replication processes via SSH.
        - to: 100.96.0.0/16
          via: 172.16.10.2
        #this is a static route that allows a connection between a datacenter agent and the SAN L3 interfaces
        #used as gateways on the ToR switches. This is required to allow the server to perform diskless boot.
        - to: 100.64.0.0/21
          via: 172.16.10.2
        #this is a static route that allows a connection between the agent and the WAN L3 interfaces
        #used on ToR switches
        - to: 10.255.226.0/24
          via: 172.16.10.2
        #this is a static route that allows DHCP traffic between the agent and the WAN gateway (VLAN VSI) interface's  'primary' (dummy) ip address
        #used on ToR switches
        - to: 172.24.4.0/22
          via: 172.16.10.2
    ens256:
      addresses: [172.16.10.5/30]

Apply the configuration:

root@router:~# netplan --debug generate
root@router:~# netplan --debug apply

Check that the routes have been added properly. You should see routes for: Quarantine, OOB, In-band towards the WAN interfaces and in-band towards the DC agents machine and towards the in-band SAN network.

root@router:~# ip route
default via 83.246.0.1 dev ens160 proto static  #default gw
10.255.226.0/24 via 172.16.10.2 dev ens224 proto static #in-band towards the ip ranges configured on the server's WAN ports
10.255.227.0/24 dev ens192 proto kernel scope link src 10.255.227.1  #OOB subnet
83.246.0.0/24 dev ens160 proto kernel scope link src 83.246.0.140  #uplink
100.64.0.0/21 via 172.16.10.2 dev ens224 proto static  #in-band towards the ip range configured on the server's SAN ports. This allows DHCP relay traffic to WAN.
100.96.0.0/16 via 172.16.10.2 dev ens224 #in-band SAN, configured on the SAN array
172.16.0.0/16 via 172.16.10.2 dev ens224 #quarantine subnet
172.16.10.0/30 dev ens224 proto kernel scope link src 172.16.10.1  #in-band towards the rest of the network and WAN IP addresses
172.16.10.4/30 dev ens256 proto kernel scope link src 172.16.10.5  #towards the DC agent machine
172.24.4.0/22 via 172.16.10.2 dev ens224 proto static #in-band towards the WAN gateway L3 primary ip address on the ToR. This allows DHCP relay traffic to WAN.
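
Note that the 172.16.0.0/16 static route also covers the two /30 link networks (172.16.10.0/30 and 172.16.10.4/30). This works because the kernel selects the most specific (longest-prefix) route for each destination. The sketch below models that selection with the routes from the output above; the `lookup` helper is a hypothetical name used only for illustration.

```python
import ipaddress

# Routes taken from the `ip route` output above: (prefix, next hop or None, device).
ROUTES = [
    ("0.0.0.0/0", "83.246.0.1", "ens160"),
    ("10.255.226.0/24", "172.16.10.2", "ens224"),
    ("10.255.227.0/24", None, "ens192"),
    ("83.246.0.0/24", None, "ens160"),
    ("100.64.0.0/21", "172.16.10.2", "ens224"),
    ("100.96.0.0/16", "172.16.10.2", "ens224"),
    ("172.16.0.0/16", "172.16.10.2", "ens224"),
    ("172.16.10.0/30", None, "ens224"),
    ("172.16.10.4/30", None, "ens256"),
    ("172.24.4.0/22", "172.16.10.2", "ens224"),
]


def lookup(destination, routes=ROUTES):
    """Return (prefix, next_hop, device) of the most specific matching route."""
    addr = ipaddress.ip_address(destination)
    matches = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


if __name__ == "__main__":
    for dest in ("172.16.10.6", "172.16.0.1", "100.96.0.2", "8.8.8.8"):
        print(dest, "->", lookup(dest))
```

For example, traffic to the DC agents machine (172.16.10.6) leaves directly through ens256, while traffic to the quarantine gateway (172.16.0.1) is sent to the ToR switch at 172.16.10.2.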

The VPN configuration is beyond the scope of this tutorial. See the guide “How to Set Up an OpenVPN server on Ubuntu 18.04” for more details.

Switch configurations

Out-of-band (OOB) Switch configuration

Normally the OOB network is already configured, and the only requirement is reachability between all the hosts. If a new OOB switch is being added, at a minimum:

  1. Configure serial console

  2. Reset admin password

  3. Configure maximum number of VLANs to 256

Top-of-Rack (ToR) switch configuration

MetalSoft supports automatic configuration on certain switches via the ‘Zerotouch’ configuration mechanism (ONIE). The following is an example of a manual configuration of a ToR switch:

  1. Configure management IPs. This scenario uses the in-band management IP of the switch (172.16.10.2), but out-of-band management could have been used as well.

  2. Configure telnet

  3. Enable SSH console access

  4. Create quarantine VLAN 5

  5. Add all ports into the quarantine VLAN

  6. Configure DHCP relay with the DCAgents machine as a destination

  7. Activate RESTCONF if the device supports it and it is not activated automatically.

Example configuration for an HP5900 switch:

sysname HP5900-H1060
#
telnet server enable
telnet server acl 2000
telnet server ipv6 acl ipv6 2000
#
irf mac-address persistent timer
irf auto-update enable
undo irf link-delay
irf member 1 priority 1
#
link-aggregation global load-sharing mode destination-ip source-ip destination-port source-port
#
undo link-aggregation load-sharing mode local-first
#
ip ttl-expires enable
#
ipv6 option drop enable
#
dhcp enable
#
lldp global enable
#
cut-through enable
burst-mode enable
#
system-working-mode standard
password-recovery enable
#
vlan 1
#
vlan 5
name QUARANTINE
description QUARANTINE

interface Vlan-interface5
description QUARANTINE
ip address 172.16.0.1 255.255.255.0
dhcp select relay
dhcp relay information circuit-id string VLAN5
dhcp relay information enable
dhcp relay information remote-id sysname
dhcp relay server-address 172.16.10.6
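
When several ToR switches need the same quarantine interface, the stanza can be generated rather than typed by hand. The sketch below renders only the `Vlan-interface` lines shown in the example above. The function name and its parameters are hypothetical, and the command syntax is assumed to match the HP5900 example; other switch models may differ.

```python
import ipaddress


def render_quarantine_interface(vlan_id, gateway_cidr, relay_server):
    """Render the Vlan-interface stanza from the example for the given values."""
    # ip_interface keeps the host address and derives the dotted netmask.
    iface = ipaddress.ip_interface(gateway_cidr)
    lines = [
        f"interface Vlan-interface{vlan_id}",
        "description QUARANTINE",
        f"ip address {iface.ip} {iface.netmask}",
        "dhcp select relay",
        f"dhcp relay information circuit-id string VLAN{vlan_id}",
        "dhcp relay information enable",
        "dhcp relay information remote-id sysname",
        f"dhcp relay server-address {relay_server}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Values from this guide: quarantine gateway and the DC agents machine.
    print(render_quarantine_interface(5, "172.16.0.1/24", "172.16.10.6"))
```

The relay server address is the DC agents machine, so DHCP requests from servers in the quarantine VLAN reach the agent through the in-band link.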

Optional recommended settings:

The switches are automatically managed but MetalSoft allows for local customization of the switch settings on elements beyond the scope of the ports being managed.

  1. Configure global hash settings for LAG (layer3+4)

  2. Enable cut-through switching + dynamic buffer allocation (DBA)

  3. Enable spanning-tree

  4. Enable TTL-expires for IPv4 (ip ttl-expires enable), option drop for IPv6 (ipv6 option drop enable), etc. for security hardening

  5. Configure “administrative distance” for static routes so that they are preferred over learned prefixes. This is the default on most switches.

Register (enroll) a switch

Registering switches can be done manually or automatically via ONIE - if the equipment supports it. The following instructions will register a switch manually. Check that the IP ranges specified match the allocated ones for your local infrastructure:

Create a configuration file:

#the hostname of the switch
identifierString: HP5900-H1060
description: ToR switch
#the datacenter label
datacenterName: uk-london
#
provisionerType: vpls
provisionerPosition: tor
driver: hp5900
#connection details. This can be the in-band or the out-of-band management IP. In this setup we use the in-band address.
managementAddress: 172.16.10.2
managementProtocol: ssh
managementPort: 22
managementUsername: admin
managementPassword: 

#Network address (first IP) of the WAN IPv6 subnet.
#These will be the IPs that will be allocated to hosts
#that need to be reachable from outside an infrastructure
#Both IPv4 and IPv6 IPs are allocated to servers
#The IPv6 subnet will be automatically created
primaryWANIPv6SubnetPool: fd1f:8bbb:56b3:2000:0000:0000:1000:0000
#Subnet size of the above subnet. If it is fd1f:8bbb:56b3:2000:0000:0000:1000:0000/53 in CIDR notation this number will be 53
primaryWANIPv6SubnetPrefixSize: 53

#Network address (first IP) of the SAN subnet. 
primarySANSubnetPool: 100.64.0.1
#By default, the SAN subnet has a netmask of 21.
primarySANSubnetPrefixSize: 21

# The pool from which to allocate ips to servers during registration
quarantineSubnetStart: 172.16.0.0
quarantineSubnetEnd: 172.16.0.255
#the subnet size (CIDR) of the above pool
quarantineSubnetPrefixSize: 24
#the IP of the VSI in the quarantine network
quarantineSubnetGateway: 172.16.0.1

#This IP range is used as a hack to prevent downtime during dynamic provisioning of 
#ports (since we have multiple secondary IPs on the same interface).
#the traffic is not routed to these IPs.
#The real IPv4 WAN subnet pool needs to be added later using a 'subnet pool'.
primaryWANIPv4SubnetPool: 172.24.4.0
#Subnet size of the above subnet. If it is 172.24.4.0/22 in CIDR notation this number will be 22
primaryWANIPv4SubnetPrefixSize: 22


#Set to true to enable ONIE for this switch.
requiresOSInstall: false
#Set this to the ID of the volume template that holds the ONIE image
volumeTemplateID: 0
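
Before registering the switch, the quarantine values can be cross-checked against each other. The sketch below assumes the convention used in the example (the start and end are the first and last address of the subnet, and the gateway lies inside it). Whether MetalSoft enforces this is not stated here, so treat it as a sanity check only. The `check_quarantine` helper is hypothetical.

```python
import ipaddress

# Quarantine values copied from the configuration file above.
SWITCH_CONFIG = {
    "quarantineSubnetStart": "172.16.0.0",
    "quarantineSubnetEnd": "172.16.0.255",
    "quarantineSubnetPrefixSize": 24,
    "quarantineSubnetGateway": "172.16.0.1",
}


def check_quarantine(cfg):
    """Return a list of problems found in the quarantine settings."""
    problems = []
    # strict=False lets us build the network even if the start has host bits set.
    net = ipaddress.ip_network(
        f"{cfg['quarantineSubnetStart']}/{cfg['quarantineSubnetPrefixSize']}",
        strict=False,
    )
    if str(net.network_address) != cfg["quarantineSubnetStart"]:
        problems.append("start is not the first address of the subnet")
    if str(net.broadcast_address) != cfg["quarantineSubnetEnd"]:
        problems.append("end is not the last address of the subnet")
    if ipaddress.ip_address(cfg["quarantineSubnetGateway"]) not in net:
        problems.append("gateway is outside the subnet")
    return problems


if __name__ == "__main__":
    print(check_quarantine(SWITCH_CONFIG) or "quarantine settings look consistent")
```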

Register the new switch by running:

$ metalcloud-cli switch create --raw-config examples/tor_switch.yaml --format yaml

Pre-flight checklist

Before proceeding with server registration it is useful to go through the following checklist to verify your configuration:

  1. DC agents machine is able to reach the Quarantine L3 interface

     root@uk-metalsoft-poc-dcagents01  ~# ping 172.16.0.1
      PING 172.16.0.1 (172.16.0.1) 56(84) bytes of data.
      64 bytes from 172.16.0.1: icmp_seq=1 ttl=63 time=0.332 ms
    
  2. DC agents machine is able to reach a SAN array in-band interface

    root@uk-metalsoft-poc-dcagents01  ~# ping 100.96.0.2
    PING 100.96.0.2 (100.96.0.2) 56(84) bytes of data.
    64 bytes from 100.96.0.2: icmp_seq=1 ttl=63 time=0.332 ms
    
  3. Router has a route for the SAN L3 interfaces (100.64.0.0/21)

    root@bsirouter-hpe01:~# ip r | grep 100.64
    100.64.0.0/21 via 172.16.10.2 dev ens224 proto static 
    
  4. Router has a route for the WAN subnet (10.255.226.0/24)

    root@bsirouter-hpe01:~# ip r | grep 10.255.226
    10.255.226.0/24 via 172.16.10.2 dev ens224 proto static 
    
  5. DC agents machine is able to reach a switch in-band interface

    root@uk-metalsoft-poc-dcagents01  ~# ping 172.16.10.2 
    PING 172.16.10.2 (172.16.10.2) 56(84) bytes of data.
    64 bytes from 172.16.10.2: icmp_seq=1 ttl=254 time=1.01 ms
    
  6. DC agents machine is able to resolve the hostname of the repository using one of the DNS servers configured on the datacenter “DNSServers”:

    root@bsirouter-hpe01:~# nslookup <repo> 8.8.8.8
    Server:		8.8.8.8
    Address:	8.8.8.8#53
    
    Non-authoritative answer:
    Name:	<repo>
    Address: 84.40.58.43
    
  7. The DNS DC agent overrides the DNS record for the repoURL host:

    $ nslookup <repo> 172.16.10.6
    Server:		172.16.10.6
    Address:	172.16.10.6#53
    
    Name:	<repo>
    Address: 172.16.10.6
    
  8. If using HTTPS for the repository, check that HAProxy is configured correctly by adding a temporary entry to the /etc/hosts file of a server that points the repository host name to the DC agents machine:

root@bsirouter:~$ cat /etc/hosts
...
172.16.10.6 <repo>

Use curl to pull any asset from the repository. This forces the request through HAProxy and the DC agent proxy:

root@bsirouter:~$ curl http://<repo>/templates/catalog.json
{
	"OSTemplates":{
		"centos6-8": "CentOS/centos6-8",
		"centos6-9": "CentOS/centos6-9",
		"centos7-4": "CentOS/centos7-4",
		"centos7-5": "CentOS/centos7-5",
                "centos7-6-old": "CentOS/centos7-6-old",
		"centos7-6": "CentOS/centos7-6",
		"centos7-7-old": "CentOS/centos7-7-old",
                "centos7-7": "CentOS/centos7-7",
		"ubuntu16-04": "Ubuntu/ubuntu16-04",
		"WindowsServer2012R2": "Windows/WindowsServer2012R2",
		"WindowsServer2016": "Windows/WindowsServer2016",
		"WindowsServer2019": "Windows/WindowsServer2019",
		"WindowsServer2016Template221118": "Windows/WindowsServer2016Template221118",
		"Cloudware2019Stdv3": "Windows/Cloudware2019Stdv3"
	}
}

Note: Don’t forget to remove the hosts entry afterwards!

  9. Check that the VLAN ranges in the switchProvisioner configured in the datacenter record do not overlap with each other.

  10. Check that the Controller’s (master DC) repoURL matches that of the datacenter:

./metalcloud-cli datacenter get --id master  --show-config --format yaml | grep repoURL
    repoURLRoot: http://<repo>
    repoURLRootQuarantineNetwork: http://<repo>

A mismatch can sometimes cause issues with SAN template replication. This check will not be required in the future.
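
The two `ip r | grep` checks in the checklist above can be scripted so that they are easy to repeat after every router change. This is a minimal sketch that parses text in the format printed by `ip route`. The function names and the `EXPECTED` table are illustrative assumptions; the sample text is taken from the router output earlier in this guide.

```python
# Expected static routes: destination prefix -> next hop (the ToR in-band IP).
EXPECTED = {
    "100.64.0.0/21": "172.16.10.2",
    "10.255.226.0/24": "172.16.10.2",
}

SAMPLE = """\
default via 83.246.0.1 dev ens160 proto static
10.255.226.0/24 via 172.16.10.2 dev ens224 proto static
10.255.227.0/24 dev ens192 proto kernel scope link src 10.255.227.1
100.64.0.0/21 via 172.16.10.2 dev ens224 proto static
"""


def parse_static_routes(ip_route_output):
    """Map destination prefix -> next hop for every 'via' route in the output."""
    routes = {}
    for line in ip_route_output.splitlines():
        # Drop trailing '#' annotations, then split into whitespace fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[1] == "via":
            routes[fields[0]] = fields[2]
    return routes


def missing_routes(ip_route_output, expected):
    """Return expected prefixes that are absent or use a different next hop."""
    found = parse_static_routes(ip_route_output)
    return sorted(p for p, hop in expected.items() if found.get(p) != hop)


if __name__ == "__main__":
    print("missing routes:", missing_routes(SAMPLE, EXPECTED))
```

On the router itself, the `SAMPLE` text would be replaced with the captured output of `ip route`.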

Configuring SSL-enabled repository access

To enable an HTTPS-only repository the following are needed:

  1. The datacenter’s repoURLRoot must hold a URL that starts with https://

  2. The datacenter agent’s HAProxy configuration on the datacenter machine must be configured with the repository server’s SSL certificates

  3. The CN of the SSL certificate must match the host name in the URL specified in repoURLRoot
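
The first and third requirements can be checked with a few lines of code before touching the HAProxy configuration. The sketch below is a simplified illustration: `check_repo_url` is a hypothetical helper, the CN is passed in as a plain string rather than read from the certificate, and wildcard handling covers only a single leading `*.` label. Modern TLS clients also consult the certificate's subjectAltName entries, which this sketch does not model.

```python
from urllib.parse import urlsplit


def check_repo_url(repo_url_root, certificate_cn):
    """Return a list of problems with an HTTPS repository configuration."""
    problems = []
    parts = urlsplit(repo_url_root)
    if parts.scheme != "https":
        problems.append("repoURLRoot does not start with https://")
    host = (parts.hostname or "").lower()
    cn = certificate_cn.lower()
    if cn.startswith("*."):
        # A wildcard CN covers exactly one extra label, e.g. *.example.com.
        matches = "." in host and host.split(".", 1)[1] == cn[2:]
    else:
        matches = host == cn
    if not matches:
        problems.append("certificate CN does not match the repoURLRoot host")
    return problems


if __name__ == "__main__":
    # Host name borrowed from the HAProxy example in this guide; the CN is assumed.
    print(check_repo_url("https://repo.poc.metalsoft.io", "*.poc.metalsoft.io"))
```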

Create a new /opt/agents/haproxy.cfg file:

global
    chroot /var/lib/haproxy
    user root
    group root
    daemon
    ssl-default-bind-options no-sslv3 no-tls-tickets
    ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
    ssl-default-server-options no-sslv3 no-tls-tickets
    ssl-default-server-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
defaults
    mode http
    log global

    retries 3
    timeout connect 10s
    timeout client 100m
    timeout server 30m
    timeout check 10s
    timeout http-keep-alive 10s
    timeout queue 10m
    timeout http-request 30m
    timeout tunnel 480m
    maxconn 3000
    option httpclose
    option forwardfor except 127.0.0.0/8
    option redispatch
    option abortonclose
    option httplog
    option dontlognull
    option http-server-close

frontend ft_local_apache_80
    mode http
    bind :80
    bind 127.0.0.1:80
    acl host_ws path_beg -i /api-ws
    acl host_dhcpe path_beg -i /dhcpe
    acl host_tftp path_beg -i /tftp8069
    acl host_repo hdr_dom(Host) -i repo.poc.metalsoft.io
    acl host_repo hdr_dom(Host) -i repo.metalsoft.io
    use_backend bk_local_apache_8080 if host_ws
    use_backend bk_fullmetal_dhcpe_8067 if host_dhcpe
    use_backend bk_fullmetal_tftpe_8069 if host_tftp
    use_backend bk_repo_443 if host_repo
    default_backend bk_local_apache_81

frontend ft_local_apache_443
    mode http
    bind :443 ssl crt /etc/ssl/certs/poc.metalsoft.io.pem
    acl host_ws path_beg -i /api-ws
    acl host_dhcpe path_beg -i /dhcpe
    acl host_tftp path_beg -i /tftp8069
    acl host_repo hdr_dom(Host) -i repo.poc.metalsoft.io
    acl host_repo hdr_dom(Host) -i repo.metalsoft.io
    http-response add-header Strict-Transport-Security max-age=157680001
    use_backend bk_local_apache_8080 if host_ws
    use_backend bk_fullmetal_dhcpe_8067 if host_dhcpe
    use_backend bk_fullmetal_tftpe_8069 if host_tftp
    use_backend bk_repo_443 if host_repo
    default_backend bk_local_apache_81

backend bk_fullmetal_dhcpe_8067
    server localhost 127.0.0.1:8067

    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_fullmetal_tftpe_8069
    server localhost 127.0.0.1:8069

    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_local_apache_81
    server localhost 127.0.0.1:81

    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_local_apache_8080
    server localhost 127.0.0.1:8080

    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_repo_443
    server repo.poc.metalsoft.io 127.0.0.1:9080
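
To reason about which backend a request ends up on, it helps to remember that HAProxy evaluates `use_backend` rules in declaration order and falls back to `default_backend` when none match. The sketch below models the frontends above in plain Python. It is a simplified approximation: `host_matches` only imitates `hdr_dom` by requiring the pattern to line up with dot-separated labels, and `pick_backend` is a hypothetical helper, not an HAProxy API.

```python
# (backend, match kind, pattern) in the order the rules appear in the frontends.
RULES = [
    ("bk_local_apache_8080", "path_beg", "/api-ws"),
    ("bk_fullmetal_dhcpe_8067", "path_beg", "/dhcpe"),
    ("bk_fullmetal_tftpe_8069", "path_beg", "/tftp8069"),
    ("bk_repo_443", "hdr_dom", "repo.poc.metalsoft.io"),
    ("bk_repo_443", "hdr_dom", "repo.metalsoft.io"),
]
DEFAULT_BACKEND = "bk_local_apache_81"


def host_matches(host, pattern):
    """Simplified stand-in for hdr_dom: the pattern must align with whole labels."""
    host = host.lower().split(":", 1)[0]  # drop an optional port
    return f".{pattern.lower()}." in f".{host}."


def pick_backend(path, host):
    """Return the first backend whose rule matches, else the default backend."""
    for backend, kind, pattern in RULES:
        # The -i flag in the ACLs makes both kinds of match case-insensitive.
        if kind == "path_beg" and path.lower().startswith(pattern.lower()):
            return backend
        if kind == "hdr_dom" and host_matches(host, pattern):
            return backend
    return DEFAULT_BACKEND


if __name__ == "__main__":
    print(pick_backend("/templates/catalog.json", "repo.poc.metalsoft.io"))
    print(pick_backend("/dhcpe/lease", "repo.poc.metalsoft.io"))
    print(pick_backend("/", "172.16.10.6"))
```

Because the path rules come first, a request for `/dhcpe` is sent to the DHCP backend even when its Host header names the repository.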

Update the docker-compose file to map the file inside the container:

haproxy:
    network_mode: host
    container_name: dc-haproxy
    image: registry.metalsoft.dev/datacenter-agents/dc-haproxy:latest
    restart: always
    privileged: true
    volumes:
      - /opt/agents/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    environment:
      - TZ=Etc/UTC
    hostname: dc-haproxy