Networking overview¶

MetalSoft can automatically deploy complex network configurations on a multitude of switch brands, including Cisco Nexus, Cisco ACI, Juniper, Dell, Cumulus, HP and others.

This document describes MetalSoft-related concepts. If you are looking for the cabling and network topologies that MetalSoft creates, consult the Network topology section.

The approach that we take is “intent-based” in the sense that the user fully describes the desired network “services” and the system then translates those into actual switch and OS configurations. This is a service provider-inspired approach where the end-users are “consumers” of network services offered by the “provider”. As such, they don’t control which subnets to allocate from or which VLANs will be used, but they can specify, for example, the size of the subnet they need.

The user defines what the desired network setup looks like via the infrastructure editor or the API (or Terraform, Ansible etc.).

Important concepts of MetalSoft networking¶

MetalSoft uses several important concepts:

Infrastructures

This is a security construct similar to a “VPC” in which a tenant-specific network blueprint is defined. It defines a security boundary. In general, a multi-server application is assigned to an infrastructure.
Networks

Networks are an abstract concept that signals the intent of connecting servers or endpoints within the same infrastructure to each other or with the outside world. From an implementation perspective, a typical network maps to a separate L2 network, usually implemented with a different VLAN ID, and one or more L3 IPv4 and IPv6 subnets.

Servers and other endpoints are connected to one or more networks and have a network profile associated with each network that defines how that connection is made. If no network profile is selected the default network profile (set in the data-center configuration) is used.
Network profiles

Network profiles alter the default behavior of a network. For example, a profile can request additional tagged VLANs, additional subnets, or the transport of the L2 networks to External connections.

A network profile can have one or more VLAN entries. The default VLAN entry defines the network behavior for untagged traffic. Note that in EVPN environments there are VNIs associated with these VLANs and the VLANs are VTEPs, so the term VLAN should not be read here in the strict RFC 2674 sense but rather as a wider overlay L2 network.

If the VLAN ID is not set or is set to ‘auto’, a VLAN ID will be automatically allocated for the infrastructure. That VLAN ID will be the same for all servers or endpoints within that infrastructure. Within this overlay L2 network, additional settings such as subnets or external connections can be defined.
Subnet pools

Subnet pools allow the admins to specify IP address ranges from which IPs are allocated to in-band, out-of-band and SAN networks. Subnet pools are further subdivided into subnets. The subnet pools are used in order. Subnet pools marked as “manually allocated” are not used in this automatic allocation; they are used only if explicitly referenced in a VLAN entry of a network profile.
Subnets

Subnets are allocated from a subnet pool automatically to satisfy the need for IPs of a particular infrastructure. By default the system will allocate a subnet from the list of public subnet pools that is large enough for 1 IP for each endpoint connected to a network in an infrastructure. Two different infrastructures will get two different subnets for the WAN network.

Multiple subnets will be allocated if multiple subnet pools are listed in the network profile’s VLAN entries.
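
The automatic allocation described above can be sketched as follows. This is a minimal illustration, not MetalSoft's implementation: the function names, the pool dictionary layout and the choice to reserve a network, broadcast and gateway address on top of the endpoint IPs are all assumptions made for the example.

```python
import ipaddress
from typing import Dict, List


def prefix_for_endpoints(endpoint_count: int) -> int:
    """Smallest IPv4 prefix that fits one IP per endpoint.

    Assumption: three extra addresses (network, broadcast, gateway) are reserved.
    """
    needed = endpoint_count + 3
    host_bits = max(2, (needed - 1).bit_length())
    return 32 - host_bits


def allocate_subnet(pools: List[Dict], allocated: List[ipaddress.IPv4Network],
                    endpoint_count: int) -> ipaddress.IPv4Network:
    """Walk the pools in order and return the first free subnet of the right size."""
    prefix = prefix_for_endpoints(endpoint_count)
    for pool in pools:
        if pool["manual"]:
            continue  # "manually allocated" pools are skipped by automatic allocation
        pool_net = ipaddress.ip_network(pool["range"])
        if prefix < pool_net.prefixlen:
            continue  # requested subnet is larger than the whole pool
        for candidate in pool_net.subnets(new_prefix=prefix):
            if not any(candidate.overlaps(used) for used in allocated):
                allocated.append(candidate)
                return candidate
    raise RuntimeError("no free subnet of the requested size")


# Hypothetical pools: the first one is reserved for manual use and is skipped.
pools = [
    {"range": "10.0.0.0/24", "manual": True},
    {"range": "192.168.0.0/16", "manual": False},
]
allocated: List[ipaddress.IPv4Network] = []
subnet_a = allocate_subnet(pools, allocated, endpoint_count=1)
subnet_b = allocate_subnet(pools, allocated, endpoint_count=3)
```

Because the second request must not overlap the first, the two infrastructures end up with distinct subnets from the same pool.
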
Servers

We use the term server to refer to physical equipment. A server can have one or more server interfaces.
Endpoints

Unmanaged equipment (such as unmanaged servers in brownfield environments) is called an endpoint. The admin must manually define endpoints along with their host-port-to-switch-port associations. In the future this process will also be automated.
Instances

When added to an infrastructure, a server or a VM is presented as an “instance” to the user. Each instance has a distinct ID. IPs allocated to an instance from a subnet are tied to the instance and not to the actual server or VM. This allows the server to be replaced while the instance object itself remains the same. The VM behind a VM-instance object might move to another host or vendor and might receive a new identifier, whereas the VM-instance object will remain the same.
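
The separation between an instance and the hardware behind it can be pictured with a small data model. This is only a sketch: the class, field and function names are hypothetical and do not reflect the MetalSoft API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Instance:
    instance_id: int                               # stable for the life of the instance
    ips: List[str] = field(default_factory=list)   # tied to the instance, not the hardware
    server_id: Optional[str] = None                # server or VM currently backing the instance


def replace_server(instance: Instance, new_server_id: str) -> None:
    """Swap the backing hardware; the instance ID and its IPs are left untouched."""
    instance.server_id = new_server_id


web = Instance(instance_id=1, ips=["192.168.0.2"], server_id="server-a")
replace_server(web, "server-b")
```
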
InstanceArrays

A group of typically identical instances manipulated as a single entity. Usually all instances have the same server type but they can also be different if needed.
Provisioners

A provisioner is the core logic that uses a certain network overlay technology to separate tenants. MetalSoft currently provides the following provisioners:
- EVPN-VXLAN (supported with SONiC, Juniper, Dell OS10)
- VLAN (supported with Cisco Nexus, HP)
- SDN (supported with Cisco ACI)
- VPLS (supported with HP, H3C) - note that this is obsolete and will be phased out in later releases.
Networks cannot be extended across data centers, and typically a data center should use a single vendor. While multi-vendor setups are possible in MetalSoft, we do not recommend them.
Datacenters

A data center is a group of endpoints and switches operating under the same provisioner, supported by a Site Controller. Infrastructures are also tied to a single data center. Interconnects across infrastructures or data centers should be handled via external connections.

An example setup¶

The final network configuration depends on:

Admin-level configurations in the form of Network Profiles and Subnets, Switch and Datacenter configurations
User-level intent definition in the form of the infrastructure design.

To help clarify the concepts the following is an example configuration:

To build this configuration the following configurations are needed:

User-level Intent configuration¶

Let’s assume two Users define their intents in two different infrastructures as follows:

User 1 (Infrastructure A):

Server 1: a web server with two interfaces: one connected to the “WAN” Network that needs to reach the internet and the other to the “LAN” Network
Server 2: a db server with a single interface connected to a “LAN” Network TODO: Add infrastructure design

User 2 (Infrastructure B):

Three identical servers with an interface connected to the WAN Network. Note that here the network can be called “WAN”, the same as in Infrastructure A, because a network is scoped to its Infrastructure. TODO: Add infrastructure design

Admin-level configuration¶

The network admin would have configured the following in the system:

Subnet pools:

  Label: default
  Description: Subnet pool for subnets allocated in WAN networks
  Range: 192.168.0.0/16
  Type: WAN 
  Routable: true

Network profiles:

Label: Default network profile
Description: Network profile that will be configured by default on all servers attached to the WAN network
Type: WAN
VLANS:
  - index: 0
    type: auto
    subnetPools: 192.168.0.0/16
    vxlan: true
    provisionGateways: true

Datacenter configurations:

WANVLANRange: 100-199
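
With a VLAN entry of type auto and the VLAN range above, each infrastructure receives its own VLAN ID. The sketch below illustrates the idea only. The class name, the lowest-free-ID-first policy and keying the allocation by infrastructure and network are assumptions, not documented MetalSoft behavior.

```python
from typing import Dict, List, Tuple


class VlanAllocator:
    """Hands out one VLAN ID per (infrastructure, network) from an admin-defined range."""

    def __init__(self, vlan_range: str) -> None:
        first, last = (int(part) for part in vlan_range.split("-"))
        self._free: List[int] = list(range(first, last + 1))
        self._assigned: Dict[Tuple[str, str], int] = {}

    def vlan_for(self, infrastructure: str, network: str) -> int:
        key = (infrastructure, network)
        if key not in self._assigned:
            if not self._free:
                raise RuntimeError("VLAN range exhausted")
            # Assumption: the lowest free ID is used first.
            self._assigned[key] = self._free.pop(0)
        return self._assigned[key]


allocator = VlanAllocator("100-199")
vlan_a = allocator.vlan_for("Infrastructure A", "WAN")
vlan_b = allocator.vlan_for("Infrastructure B", "WAN")
```

Every port that Infrastructure A connects to its WAN network gets the same ID back from `vlan_for`, while Infrastructure B is isolated on a different one.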

How abstract operations define intents¶

L2 and L3 configurations on switches, servers and storage pools are all controlled at the same time by using Networks, Network Profiles, Infrastructures and the other concepts described above. Some example intent definitions:

By adding networks, servers or other endpoints to an infrastructure users signal the intent of separating them from other ones.
By connecting an instance port to a Network the user expresses the intent to consume the services of that network from the respective instance.
By selecting a Network Profile other than the default the user requests some alterations to the default service offered by the respective network.
By setting a Network Profile VLAN entry to auto, a network admin expresses the intent to allocate a new VLAN ID and use it for the L2 networks of all servers or endpoints within a particular infrastructure that use the respective network profile.

MetalSoft Networks¶

Networks are an abstract concept that links endpoints together or with the outside world.

There are 3 types of networks:

WAN networks

This is the default network type and in most cases the only type deployed. It is signaled with the “internet” or “WAN” labels in the infrastructure editor and by the color green.

WAN networks typically provide connectivity not only between servers but also with the outside world. WAN networks typically provide the following services:
1. An L2 broadcast domain that is shared between ports of the same infrastructure connected to this network but isolated from WAN networks provisioned for other Infrastructures. This means that users can use their own IPs if they want to, for example to connect the nodes of a Kubernetes cluster. In most implementations this will be achieved by using a unique VLAN ID, VXLAN VNI or EPG.
2. One or more subnets of sufficient size (taken together) to provide one IP for each port connected to the network. These subnets are provisioned from the default subnet pools configured by admins in the Subnets section. These IPs are usually public (routable) IPv4 and IPv6 IPs but not always, depending on the design of the service.
3. A gateway for all allocated subnets that provides connectivity to outside of the infrastructure. MetalSoft provides a mechanism to provision L3 switch virtual interfaces (SVIs) to act as gateways. These L3 interfaces will be private to this network. Each infrastructure will get their own gateways.
4. Link aggregation support on the switch side if two or more ports of the same instance are connected to the same WAN network. If the physical ports are connected to two different switches an M-LAG or equivalent concept will be used to create the switch-side aggregated port. See the LACP section for more details. Note that more than one gateway SVI will be created in this case to ensure redundancy in case a switch fails.
SAN networks

A SAN network is a special network dedicated to SAN traffic. It provides one of the following services depending on the type of storage pool and storage protocols used:
1. An isolated L2 broadcast domain that is shared between the servers on the same infrastructure connected to the SAN network. This L2 broadcast domain is terminated on the storage pools from which volumes are offered to the instances of the same infrastructure.
2. An isolated L3 network where the gateway resides on the leaf switch to which the server port is connected, isolated from other SAN networks with ACLs.
3. One or more subnets of sufficient size (taken together) to provide one IP for each port connected to the network. These subnets are provisioned from the default subnet pools configured by admins in the Subnets section of type SAN. These IPs are usually private IPv4 and IPv6 IPs but not always, depending on the design of the service.
4. iSCSI/FC multi-path support, which is enabled if two ports of the same instance are connected to the same SAN network. In this case, different subnets will be allocated for each pair of switch port and storage pool virtual port.
LAN networks

LAN networks typically provide connectivity between servers of the same infrastructure. They typically provide the following services:
1. An L2 broadcast domain that is shared between ports of the same infrastructure connected to this network but isolated from LAN networks provisioned for other Infrastructures and from other LAN networks. This means that users can use their own IPs if they want to, for example to connect the nodes of a Kubernetes cluster. In most implementations this will be achieved by using a unique VLAN ID, VXLAN VNI or EPG.
2. One or more subnets of sufficient size (taken together) to provide one IP for each port connected to the network. These subnets are provisioned from the default subnet pools configured by admins in the Subnets section. These IPs are usually private IPv4 and IPv6 IPs.
3. Link aggregation support on the switch side if two or more ports of the same instance are connected to the same LAN network. If the physical ports are connected to two different switches an M-LAG or equivalent concept will be used to create the switch-side aggregated port. See the LACP section for more details.
More than one LAN network can be created on the same infrastructure.

MetalSoft Network profiles¶

Both users and admins can create network profiles that can change the behavior of a network. Note that users are restricted in what they can do with the networks versus admins.

To define the behavior of a network, a network profile can:

Choose a specific native (access mode) VLAN ID instead of one that is automatically allocated
Specify a series of allowed VLAN IDs on the same ports
Choose a specific subnet pool to use for the native VLAN or for the other allowed VLANs.
Choose if the SVIs should not be provisioned on the WAN network
Choose if a specific network should be extended on a border device to connect to an External Connection.

Refer to Managing network profiles for more details.

Details on link-aggregation¶

If the user chooses to connect two or more ports to the same WAN or LAN network a link aggregation will be created on the switches and a bond interface will be created on the server.

If the two ports lead to different switches defined as a switch-pair then an M-LAG or equivalent multi-chassis link aggregation will be created. LACP type 4 will be used as the negotiation protocol. Refer to Managing switch pairs for more details.
If the two ports lead to two different switches that are configured in a “stacked” configuration, in which case a single switch record is registered in MetalSoft, the system will create a regular link aggregation using LACP type 4.
If the two ports lead to the same switch a simple link-aggregation will be created on the respective switch. The link will be configured using LACP type 4.
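
The three cases above can be summarized as a small decision function. This is a sketch under stated assumptions: the type and function names are hypothetical, a stack is modeled as a single switch record (as described above), and raising an error for switches that are not a registered pair is an assumption, since the text does not cover that case.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class SwitchPort:
    switch: str  # switch record as registered in the system (a stack is one record)
    port: str


def aggregation_kind(ports: List[SwitchPort], switch_pairs: Set[FrozenSet[str]]) -> str:
    """Return "none", "lag" or "mlag" for the ports an instance connects to one network."""
    if len(ports) < 2:
        return "none"  # a single port needs no aggregation
    switches = frozenset(p.switch for p in ports)
    if len(switches) == 1:
        return "lag"   # same switch, or a stack registered as one record
    if switches in switch_pairs:
        return "mlag"  # multi-chassis aggregation across a switch pair
    raise ValueError("ports lead to switches that are not defined as a switch pair")


pairs = {frozenset({"sw1", "sw2"})}
across_pair = aggregation_kind(
    [SwitchPort("sw1", "Ethernet10"), SwitchPort("sw2", "Ethernet10")], pairs)
same_switch = aggregation_kind(
    [SwitchPort("sw1", "Ethernet10"), SwitchPort("sw1", "Ethernet11")], pairs)
```
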

To ensure consistent behavior, the LACP priority will be higher on the port with the lower index, so that if LACP cannot negotiate with the server (such as during the boot process) the fallback port will always be the port with index #0.

How server interface indexes are allocated¶

To ensure consistent behavior the network ports of the servers will always have indexes based on the switch hostname and then the switch port in lexicographic order irrespective of the actual order in the server. For example if a server is connected to two switches with hostnames sw1 and sw2 and connected to Ethernet10 on both, the port connected to sw1 will always be port index #0 and the port connected to sw2 will always be port #1. If the same server is connected to sw1 Ethernet11 and sw1 Ethernet10 the second port will be index #0 and the first one will be #1.
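
The ordering rule can be reproduced with a plain sort on the (switch hostname, switch port) pair. The function name below is hypothetical, and the sketch assumes an ordinary string comparison.

```python
from typing import Dict, List, Tuple

Connection = Tuple[str, str]  # (switch hostname, switch port)


def assign_interface_indexes(connections: List[Connection]) -> Dict[Connection, int]:
    """Index server ports by switch hostname, then switch port, in lexicographic order."""
    ordered = sorted(connections)  # tuples compare hostname first, then port
    return {conn: index for index, conn in enumerate(ordered)}


# The two examples from the paragraph above, listed in "server order".
two_switches = assign_interface_indexes([("sw2", "Ethernet10"), ("sw1", "Ethernet10")])
one_switch = assign_interface_indexes([("sw1", "Ethernet11"), ("sw1", "Ethernet10")])
```

Note that with a plain string comparison, as used in this sketch, a port named Ethernet2 would sort after Ethernet10. The text does not say whether the product applies a natural sort to port names.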

OS Interface Index¶

MetalSoft provides an OS index for each interface, which is what VMware ESXi uses to name the interfaces; thus OS index 1 will be the vmk1 interface.

Virtualization networking¶

MetalSoft uses a novel networking approach for virtualization by tying the VMs into the same network as bare metal servers. This means that the network admin can reuse the network profiles, subnets, external connections etc. To learn more, go to Managing VM Pools.
