# Deploying MetalSoft to support legacy (PXE) deployments

In certain situations where older servers need to be automated, the legacy PXE protocol method can be used to provision servers. As opposed to the regular setup, an in-band network connection between the agent and the leaf switches is required. This can be either L2 or a DHCP relay configuration.

## Cabling

The cabling is shown as follows:

![](/assets/guides/preparing_to_setup_a_datacenter_01.svg)

> Note: In this setup each server is connected with 4 ports to the ToR switch and an extra dedicated BMC port to the OOB network. MetalSoft doesn't need an extra provisioning network and thus doesn't need shared BMC ports. If present, MetalSoft will simply ignore the extra port in the OS. Provisioning is done using the in-band network by reconfiguring the switch during the setup. In fact, our recommendation is also **NOT** to use the shared port for security reasons. A rogue tenant could potentially use it to access the OOB network.

## IP Allocation

MetalSoft uses several IP ranges with various purposes such as OOB IPs, in-band IPs, storage IPs etc. The following IP ranges are the recommended defaults:

![](/assets/guides/preparing_to_setup_a_datacenter_02.svg)

1. **WAN subnet - IPv4**
   This subnet will be used to reach tenant servers from outside the managed datacenter. For a baremetal-as-a-service IaaS product this would be a routable IP address subnet.
   *Example: 10.255.226.0/24*
2. **WAN subnet - IPv6**
   This subnet will be used to reach tenant servers from outside the managed datacenter. For a baremetal-as-a-service IaaS product this would be a routable IP address subnet.
   *Example: fd1f:8bbb:56b3:2000:0000:0000:0000:0000/53*
3. **SAN client subnet - IPv4**
   This subnet is assigned to servers on the SAN interface.
   *Example: 100.64.0.0/21*
4. **SAN storage subnet - IPv4**
   This subnet is used for the storage arrays.
   Servers will get (via DHCP) a static route that pushes storage traffic (towards this subnet) via the SAN interface on the server.
   *Example: 100.96.0.0/16*
5. **OOB Subnet - IPv4**
   This subnet will be used to allocate IPs to server BMCs and potentially to switch BMCs.
   *Example: 10.255.227.0/24*
6. **Quarantine subnet - IPv4 (legacy)**
   This subnet will be used by MetalSoft to allocate IPs to servers during server registration and before the server's identity has been established. All unused ports on the ToR switch are configured to be part of the Quarantine VLAN (by convention VLAN ID 5). This allows zero-touch enrollment of servers but does not allow a physical attack to access tenant networks.
   *Example: 172.16.0.0/24*
7. **Dummy subnet ("Primary" WAN IP Subnet)**
   This is only required for older HP switches. These IPs are permanently allocated on the interfaces as "Primary" IPs in order to avoid downtime when manipulating secondary IP addresses on the switch interfaces. Traffic should not be routed to them in any way.
   *Example: 172.24.4.0/22*
9. **Router 'public' IP Address**
   This is the IP configured on the router, reachable from the outside world. It is the entry point towards the managed datacenter. It is also the endpoint for VPN connections if the VPN has been configured on this machine.
   *Example: 83.246.0.140*
10. **Router 'in-band' IP Address**
    This is the IP configured on the router, reachable from the in-band network. This will be used as a gateway towards the datacenter agents as well as towards the internet or the rest of the network.
    *Example: 172.16.10.1*
11. **Router 'out-of-band' IP Address**
    This is the IP configured on the router, reachable from the out-of-band network. This will be used as a gateway towards the datacenter agents for on-boarding of zero-touch servers and BMC configuration.
    *Example: 10.255.227.1*
12. **DC Agents machine IP Address**
    This is the IP configured on the datacenter agents machine.
    *Example: 172.16.10.6*
8. **Top of Rack (ToR) 01 switch in-band IP Address**
   This is the switch in-band IP address. In this example setup, it is used:
   1. As the gateway towards the Quarantine network
   2. As the gateway towards the SAN in-band network
   3. As the in-band management interface

   *Example IP 172.16.10.2*
13. **Quarantine L3 interface**
    This virtual switch interface (VSI) is used as a gateway for the quarantine network. This IP must be configured on one of the ToR switches.
    *Example: 172.16.0.1*

## Router configuration

The reference architecture uses a router machine that bridges the in-band, out-of-band and uplink links. Edit `/etc/netplan/50-cloud-init.yaml` so that the `netplan` configuration looks like this:

```yaml
network:
  version: 2
  ethernets:
    ens160:
      addresses: [83.246.0.140/24]
      gateway4: 83.246.0.1
      nameservers:
        addresses:
          - 8.8.8.8
          - 1.1.1.1
    ens192:
      addresses: [10.255.227.1/24]
    ens224:
      addresses: [172.16.10.1/30]
      routes:
        #this is a static route that takes DHCP traffic coming from the agents machine
        #to the quarantine network via the in-band link.
        - to: 172.16.0.0/16
          via: 172.16.10.2
        #this is a static route that allows a connection between a datacenter agent and the SAN network
        #via one of the ToR switches. This is only required to initiate volume replication processes via SSH.
        - to: 100.96.0.0/16
          via: 172.16.10.2
        #this is a static route that allows a connection between a datacenter agent and the SAN L3 interfaces
        #used as gateways on the ToR switches. This is required to allow the server to perform diskless boot.
        - to: 100.64.0.0/21
          via: 172.16.10.2
        #this is a static route that allows a connection between the agent and the WAN L3 interfaces
        #used on ToR switches
        - to: 10.255.226.0/24
          via: 172.16.10.2
        #this is a static route that allows DHCP traffic between the agent and the WAN gateway (VLAN VSI) interface's 'primary' (dummy) ip address
        #used on ToR switches
        - to: 172.24.4.0/22
          via: 172.16.10.2
    ens256:
      addresses: [172.16.10.5/30]
```

Apply the configuration:

```bash
root@router:~# netplan --debug generate
root@router:~# netplan --debug apply
```

Check that the routes have been added properly. You should see routes for the quarantine subnet, the OOB subnet, the WAN interfaces (in-band), the DC agents machine (in-band) and the in-band SAN network.

```bash
root@router:~# ip route
default via 83.246.0.1 dev ens160 proto static #default gw
10.255.226.0/24 via 172.16.10.2 dev ens224 proto static #in-band towards the ip ranges configured on the server's WAN ports
10.255.227.0/24 dev ens192 proto kernel scope link src 10.255.227.1 #OOB subnet
83.246.0.0/24 dev ens160 proto kernel scope link src 83.246.0.140 #uplink
100.64.0.0/21 via 172.16.10.2 dev ens224 proto static #in-band towards the ip range configured on the server's SAN ports. This allows DHCP relay traffic to WAN.
100.96.0.0/16 via 172.16.10.2 dev ens224 #in-band SAN, configured on the SAN array
172.16.0.0/16 via 172.16.10.2 dev ens224 #quarantine subnet
172.16.10.0/30 dev ens224 proto kernel scope link src 172.16.10.1 #in-band towards the rest of the network and WAN IP addresses
172.16.10.4/30 dev ens256 proto kernel scope link src 172.16.10.5 #towards the DC agent machine
172.24.4.0/22 via 172.16.10.2 dev ens224 proto static #in-band towards the WAN gateway L3 primary ip address on the ToR. This allows DHCP relay traffic to WAN.
```

The VPN configuration is beyond the scope of this tutorial.
Visit [How to Set Up an OpenVPN server on Ubuntu 18.04](https://linuxize.com/post/how-to-set-up-an-openvpn-server-on-ubuntu-18-04/) for more details.

## Switch configurations

### Out-of-band (OOB) Switch configuration

Normally the OOB network is already configured and there are no special requirements apart from reachability between all the hosts. If a new OOB switch is being added, at the minimum:

1. Configure serial console
2. Reset admin password
3. Configure maximum number of VLANs to 256

### Top-of-Rack (ToR) switch configuration

MetalSoft supports automatic configuration on certain switches via the 'Zerotouch' configuration mechanism (ONIE). The following is an example of a **manual** configuration of a ToR switch:

1. Configure management IPs (in this scenario we use the in-band management IP for the switch, but we could have used out-of-band management as well. The IP used in this case is **172.16.10.2**)
2. Configure telnet
3. Enable SSH console access
4. Create quarantine VLAN 5
5. Add all ports into the quarantine VLAN
6. Configure DHCP relay with the DC agents machine as a destination
7. Activate Restconf if not automatically activated and if the device supports it.
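
Before typing the management IP and the DHCP relay destination from the steps above into a switch, it can help to confirm that they fit the addressing plan. The following is a minimal Python sketch, not part of MetalSoft: the names `LINKS`, `HOSTS` and `misplaced_hosts` are hypothetical, and the values are simply the example addresses used in this guide.

```python
import ipaddress

# Example subnets taken from this guide; replace them with your own plan.
LINKS = {
    "router to ToR in-band": ipaddress.ip_network("172.16.10.0/30"),
    "router to DC agents": ipaddress.ip_network("172.16.10.4/30"),
    "quarantine": ipaddress.ip_network("172.16.0.0/24"),
}

# Each entry maps a role to (IP address, name of the subnet it should live in).
HOSTS = {
    "ToR management IP": ("172.16.10.2", "router to ToR in-band"),
    "DHCP relay target (DC agents)": ("172.16.10.6", "router to DC agents"),
    "Quarantine L3 interface": ("172.16.0.1", "quarantine"),
}


def misplaced_hosts(hosts, links):
    """Return the roles whose IP is not a usable host address of its expected subnet."""
    bad = []
    for role, (ip, link_name) in hosts.items():
        usable = set(links[link_name].hosts())  # excludes network and broadcast addresses
        if ipaddress.ip_address(ip) not in usable:
            bad.append(role)
    return bad


if __name__ == "__main__":
    problems = misplaced_hosts(HOSTS, LINKS)
    print("misplaced:", problems)
```

An empty list means every address sits inside the subnet it is expected to use; any role that is printed should be reviewed before the switch is configured.
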
Example configuration for an HP5900 switch:

```
sysname HP5900-H1060
#
telnet server enable
telnet server acl 2000
telnet server ipv6 acl ipv6 2000
#
irf mac-address persistent timer
irf auto-update enable
undo irf link-delay
irf member 1 priority 1
#
link-aggregation global load-sharing mode destination-ip source-ip destination-port source-port
#
undo link-aggregation load-sharing mode local-first
#
ip ttl-expires enable
#
ipv6 option drop enable
#
dhcp enable
#
lldp global enable
#
cut-through enable
burst-mode enable
#
system-working-mode standard
password-recovery enable
#
vlan 1
#
vlan 5
 name QUARANTINE
 description QUARANTINE

interface Vlan-interface5
 description QUARANTINE
 ip address 172.16.0.1 255.255.255.0
 dhcp select relay
 dhcp relay information circuit-id string VLAN5
 dhcp relay information enable
 dhcp relay information remote-id sysname
 dhcp relay server-address 172.16.10.6
```

Optional recommended settings:

The switches are automatically managed, but MetalSoft allows for local customization of the switch settings on elements beyond the scope of the ports being managed.

1. Configure global hash settings for LAG (layer3+4)
2. Enable cut-through switching + dynamic buffer allocation (DBA)
3. Enable spanning-tree
4. Enable TTL for IPv4, Option Drop for IPv6 etc. for security hardening
5. Configure "administrative distance" for static routes so that they are preferred over learned prefixes. This is default on most servers.

## Register (enroll) a switch

Registering switches can be done manually or automatically via ONIE, if the equipment supports it. The following instructions will register a switch manually. Check that the IP ranges specified match the ones allocated for your local infrastructure.

Create a configuration file:

```yaml
#the hostname of the switch
identifierString: HP5900-H1060
description: ToR switch
#the datacenter label
datacenterName: uk-london
#
provisionerType: vpls
provisionerPosition: tor
driver: hp5900
#connection details. This can be the in-band management IP or the out-of-band one. In this setup we used the in-band one
managementAddress: 172.16.10.2
managementProtocol: ssh
managementPort: 22
managementUsername: admin
managementPassword:
#Network address (first IP) of the WAN IPv6 subnet.
#These will be the IPs that will be allocated to hosts
#that need to be reachable from outside an infrastructure
#Both IPv4 and IPv6 IPs are allocated to servers
#The IPv6 subnet will be automatically created
primaryWANIPv6SubnetPool: fd1f:8bbb:56b3:2000:0000:0000:1000:0000
#Subnet size of the above subnet. If it is fd1f:8bbb:56b3:2000:0000:0000:1000:0000/53 in CIDR notation this number will be 53
primaryWANIPv6SubnetPrefixSize: 53
#Network address (first IP) of the SAN subnet.
primarySANSubnetPool: 100.64.0.1
#By default, the SAN subnet has a netmask of 21.
primarySANSubnetPrefixSize: 21
# The pool from which to allocate ips to servers during registration
quarantineSubnetStart: 172.16.0.0
quarantineSubnetEnd: 172.16.0.255
#the subnet size (CIDR) of the above pool
quarantineSubnetPrefixSize: 24
#the IP of the VSI in the quarantine network
quarantineSubnetGateway: 172.16.0.1
#This IP range is used as a hack to prevent downtime during dynamic provisioning of
#ports (since we have multiple secondary IPs on the same interface).
#the traffic is not routed to these IPs.
#The real IPv4 WAN subnet pool needs to be added later using a 'subnet pool'.
primaryWANIPv4SubnetPool: 172.24.4.0
#Subnet size of the above subnet. If it is 172.24.4.0/22 in CIDR notation this number will be 22
primaryWANIPv4SubnetPrefixSize: 22
#Set to true to enable ONIE for this switch.
requiresOSInstall: false
#Set this to the ID of the volume template that holds the ONIE image
volumeTemplateID: 0
```

Register the new switch by running:

```bash
$ metalcloud-cli switch create --raw-config examples/tor_switch.yaml --format yaml
```

## Pre-flight checklist

Before proceeding with server registration it is useful to go through the following checklist to verify your configuration:

1. DC agents machine is able to reach the *Quarantine L3 interface*

   ```bash
   root@uk-metalsoft-poc-dcagents01 ~# ping 172.16.0.1
   PING 172.16.0.1 (172.16.0.1) 56(84) bytes of data.
   64 bytes from 172.16.0.1: icmp_seq=1 ttl=63 time=0.332 ms
   ```

2. DC agents machine is able to reach a SAN array in-band interface

   ```bash
   root@uk-metalsoft-poc-dcagents01 ~# ping 100.96.0.2
   PING 100.96.0.2 (100.96.0.2) 56(84) bytes of data.
   64 bytes from 100.96.0.2: icmp_seq=1 ttl=63 time=0.332 ms
   ```

3. Router has a route for the SAN L3 interfaces (100.64.0.0/21) and a route for the WAN subnet (10.255.226.0/24)

   ```bash
   root@bsirouter-hpe01:~# ip r | grep 100.64
   100.64.0.0/21 via 172.16.10.2 dev ens224 proto static
   ```

   ```bash
   root@bsirouter-hpe01:~# ip r | grep 10.255.226
   10.255.226.0/24 via 172.16.10.2 dev ens224 proto static
   ```

4. DC agents machine is able to reach a switch in-band interface

   ```bash
   root@uk-metalsoft-poc-dcagents01 ~# ping 172.16.10.2
   PING 172.16.10.2 (172.16.10.2) 56(84) bytes of data.
   64 bytes from 172.16.10.2: icmp_seq=1 ttl=254 time=1.01 ms
   ```

5. DC agents machine is able to resolve the hostname of the repository using one of the DNS servers configured in the datacenter's "DNSServers" setting:

   ```bash
   root@bsirouter-hpe01:~# nslookup 8.8.8.8
   Server: 8.8.8.8
   Address: 8.8.8.8#53

   Non-authoritative answer:
   Name:
   Address: 84.40.58.43
   ```

6. The DC agent's DNS overrides the DNS record for the repoURL host:

   ```
   $ nslookup 172.16.10.6
   Server: 172.16.10.6
   Address: 172.16.10.6#53

   Name:
   Address: 172.16.10.6
   ```

7.
   If using HTTPS for the repository, check that HAProxy is configured correctly by adding a temporary entry to the /etc/hosts file of a server and pointing it to the DC agents machine:

   ```bash
   root@bsirouter:~$ cat /etc/hosts
   ...
   172.16.10.6
   ```

   Use curl to pull any asset from the repository. This will force a request through HAProxy and the DC agent proxy:

   ```bash
   root@bsirouter:~$ curl http:///templates/catalog.json
   {
     "OSTemplates":{
       "centos6-8": "CentOS/centos6-8",
       "centos6-9": "CentOS/centos6-9",
       "centos7-4": "CentOS/centos7-4",
       "centos7-5": "CentOS/centos7-5",
       "centos7-6-old": "CentOS/centos7-6-old",
       "centos7-6": "CentOS/centos7-6",
       "centos7-7-old": "CentOS/centos7-7-old",
       "centos7-7": "CentOS/centos7-7",
       "ubuntu16-04": "Ubuntu/ubuntu16-04",
       "WindowsServer2012R2": "Windows/WindowsServer2012R2",
       "WindowsServer2016": "Windows/WindowsServer2016",
       "WindowsServer2019": "Windows/WindowsServer2019",
       "WindowsServer2016Template221118": "Windows/WindowsServer2016Template221118",
       "Cloudware2019Stdv3": "Windows/Cloudware2019Stdv3"
     }
   }
   ```

   > Note: Don't forget to remove the hosts entry afterwards!

8. Check that the VLAN ranges in the switchProvisioner configured in the datacenter record do not overlap with each other.
9. Check that the Controller's (master DC) repoURL matches that of the datacenter:

   ```bash
   ./metalcloud-cli datacenter get --id master --show-config --format yaml | grep repoURL
   repoURLRoot: http://
   repoURLRootQuarantineNetwork: http://
   ```

   A mismatch can sometimes cause issues with SAN template replication. This check will not be required in the future.

## Configuring SSL-enabled repository access

To enable an HTTPS-only repository the following are needed:

1. The datacenter's repoURLRoot must hold a URL that starts with `https://`
2. The datacenter agent's HAProxy configuration on the datacenter machine must be configured with the repository server's SSL certificates
3.
   The CN of the SSL certificate must match the URL specified in the repoURLRoot

Create a new `/opt/agents/haproxy.cfg` file:

```
global
    chroot /var/lib/haproxy
    user root
    group root
    daemon
    ssl-default-bind-options no-sslv3 no-tls-tickets
    ssl-default-bind-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256
    ssl-default-server-options no-sslv3 no-tls-tickets
    ssl-default-server-ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256

defaults
    mode http
    log global
    retries 3
    timeout connect 10s
    timeout client 100m
    timeout server 30m
    timeout check 10s
    timeout http-keep-alive 10s
    timeout queue 10m
    timeout http-request 30m
    timeout tunnel 480m
    maxconn 3000
    option httpclose
    option forwardfor except 127.0.0.0/8
    option redispatch
    option abortonclose
    option httplog
    option dontlognull
    option http-server-close

frontend ft_local_apache_80
    mode http
    bind :80
    bind 127.0.0.1:80
    acl host_ws path_beg -i /api-ws
    acl host_dhcpe path_beg -i /dhcpe
    acl host_tftp path_beg -i /tftp8069
    acl host_repo hdr_dom(Host) -i repo.poc.metalsoft.io
    acl host_repo hdr_dom(Host) -i repo.metalsoft.io
    use_backend bk_local_apache_8080 if host_ws
    use_backend bk_fullmetal_dhcpe_8067 if host_dhcpe
    use_backend bk_fullmetal_tftpe_8069 if host_tftp
    use_backend bk_repo_443 if host_repo
    use_backend bk_repo_443 if host_repo
    default_backend bk_local_apache_81

frontend ft_local_apache_443
    mode http
    bind :443 ssl crt /etc/ssl/certs/poc.metalsoft.io.pem
    acl host_ws path_beg -i /api-ws
    acl host_dhcpe path_beg -i /dhcpe
    acl host_tftp path_beg -i /tftp8069
    acl host_repo hdr_dom(Host) -i repo.poc.metalsoft.io
    acl host_repo hdr_dom(Host) -i repo.metalsoft.io
    http-response add-header Strict-Transport-Security max-age=157680001
    use_backend bk_local_apache_8080 if host_ws
    use_backend bk_fullmetal_dhcpe_8067 if host_dhcpe
    use_backend bk_fullmetal_tftpe_8069 if host_tftp
    use_backend bk_repo_443 if host_repo
    use_backend bk_repo_443 if host_repo
    default_backend bk_local_apache_81

backend bk_fullmetal_dhcpe_8067
    server localhost 127.0.0.1:8067
    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_fullmetal_tftpe_8069
    server localhost 127.0.0.1:8069
    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_local_apache_81
    server localhost 127.0.0.1:81
    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_local_apache_8080
    server localhost 127.0.0.1:8080
    http-request set-header X-HAPROXY-OUTSIDE-SAFE %[src]
    option forwardfor header X-HAPROXY-OUTSIDE-IP

backend bk_repo_443
    server repo.poc.metalsoft.io 127.0.0.1:9080
```

Update the docker-compose file to map the file inside the container:

```
  haproxy:
    network_mode: host
    container_name: dc-haproxy
    image: registry.metalsoft.dev/datacenter-agents/dc-haproxy:latest
    restart: always
    privileged: true
    volumes:
      - /opt/agents/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    environment:
      - TZ=Etc/UTC
    hostname: dc-haproxy
```
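
Two of the requirements listed at the start of this section (the `https://` scheme and the certificate CN matching the repoURLRoot) can be checked offline before restarting the proxy. The following is a minimal Python sketch under stated assumptions: the function name `repo_url_matches_cert` is hypothetical, the CN is passed in as a plain string (reading it from the certificate file is left out), and the single-label wildcard handling is an assumption about how the certificate was issued. Real TLS clients usually validate the subjectAltName entries as well, which this sketch does not cover.

```python
from urllib.parse import urlparse


def repo_url_matches_cert(repo_url_root: str, cert_common_name: str) -> bool:
    """Return True if repo_url_root uses HTTPS and its host matches the certificate CN."""
    parsed = urlparse(repo_url_root)
    if parsed.scheme != "https" or not parsed.hostname:
        return False

    host = parsed.hostname.lower()
    cn = cert_common_name.strip().lower()

    if cn.startswith("*."):
        # Assumption: a wildcard CN covers exactly one extra left-most label.
        suffix = cn[1:]  # e.g. ".poc.metalsoft.io"
        if not host.endswith(suffix):
            return False
        prefix = host[: -len(suffix)]
        return prefix != "" and "." not in prefix

    return host == cn


if __name__ == "__main__":
    # Hostname taken from the HAProxy example in this guide.
    print(repo_url_matches_cert("https://repo.poc.metalsoft.io", "repo.poc.metalsoft.io"))
    print(repo_url_matches_cert("http://repo.poc.metalsoft.io", "repo.poc.metalsoft.io"))
```

The first call passes both checks, while the second fails because the URL does not start with `https://`.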