Skip to content

ELI - AI assisted diagnosis overview

ELI (now at version 2.0) is our agentic AI component. It has various capabilities that can help users manage infrastructure at scale.

ELI offers multiple services:

  1. A user-level chat interface that can be used to build or modify tenant infrastructures.
  2. An admin-level chat interface that can be used to perform troubleshooting and generally to interact with both the MetalSoft database and with MetalSoft-managed equipment (readonly for the moment)

ELI can be deployed fully on-prem, a server with 2xH100 is recommended. We recommend using the qwen3:235b model but any Ollama-supported ‘thinking’ and ‘tools’-enabled model would work such as Mistral.

ELI can also be automatically invoked when an equipment-generated syslog event is triggered. To enable this behaviour check the “Send to monitoring agent” checkbox in the Global Configurations -> Device Alerts for the rules that need to invoke the agent.

These skills can be combined in any way in complex prompts such as: get me all the vlans configured on ports where LLDP neighbors show a switch or what is the mac address of the Management0 switch connected to switch 15?

Eli will execute a multi-stage process to retrieve the needed information and finally combine it into a single result etc. Eli is able to iterate of elements, do complex logic and more.

prompt example: what switch ports are admin up on switch 5? output:

The following switch ports are admin up on switch 5:
ethernet1/1/2
ethernet1/1/13:2
ethernet1/1/13:4
ethernet1/1/14:2
ethernet1/1/14:4
ethernet1/1/15:4
ethernet1/1/7
ethernet1/1/11
ethernet1/1/15:1
ethernet1/1/15:2
ethernet1/1/15:3
ethernet1/1/5
ethernet1/1/6
ethernet1/1/8
ethernet1/1/12
ethernet1/1/13:1
ethernet1/1/14:1
ethernet1/1/1
ethernet1/1/13:3
ethernet1/1/14:3

Getting the VLANs configurations on a switch

Section titled “Getting the VLANs configurations on a switch”

prompt example: get all vlans configured on switch 15

Ports with VLAN 5 configured on switch 15: Eth1/2/2, Eth1/1/1, Eth1/2/3, Eth1/2/4, Eth1/4/2, Eth1/3/3, Eth1/1/2, PortChannel256, Eth1/4/1, Eth1/3/4, Eth1/1/3, Eth1/2/1, Eth1/3/1, Eth1/1/4, Eth1/4/4, Eth1/3/2, Eth1/4/3

another prompt example:is there any port that is down configured with vlan 5 on switch 15?

Analysis result
The following down ports are configured with VLAN 5 on switch 15:
Eth1/1/3
Eth1/3/3
Eth1/2/1
Eth1/4/2
Eth1/1/2
Eth1/2/3
Eth1/2/4
Eth1/3/2
Eth1/1/1
Eth1/3/4
Eth1/4/3
Eth1/2/2
Eth1/4/4
Eth1/3/1
Eth1/1/4
Eth1/4/1

prompt example:get the bgp sessions of switch 5 output:

BGP Sessions for switch 5:
Session 1: Neighbor Name: 192.168.122.1 Enabled: true Active: true Uptime Seconds: 1203847756 State: established Last Error: Reset by peer

prompt example: is there any lldp neighbour of type switch on switch 15?

LLDP Neighbors Analysis Results
The following LLDP neighbors of type 'switch' were found on switch 15:
Neighbor Name: sw1-dell-s5232f
Neighbor Chassis MAC Address: c4:5a:b1:48:e4:85
Neighbor Name: US_CHG_QTS01_01_MJ40_ML43_01
Neighbor Chassis MAC Address: d8:94:03:96:5c:fb
Neighbor Name: sw1-dell-s5232f
Neighbor Chassis MAC Address: c4:5a:b1:48:e4:85
Neighbor Name: US_CHG_QTS01_01_MJ40_MGMT_01
Neighbor Chassis MAC Address: d0:7e:28:ec:3b:e8
Neighbor Name: US_CHG_QTS01_01_MJ40_ML43_01
Neighbor Chassis MAC Address: d8:94:03:96:5c:fb
  1. Getting a switch log prompt examples: get the switch logs of switch 5 output:
Thu Jun 27 22:25:26 2024 informational - LACP_PORT_UNGROUPED port-channel30 : ethernet1/1/4
Thu Jun 27 22:25:26 2024 informational - VLT_PORT_CHANNEL_DOWN 30 is down
Thu Jun 27 22:25:53 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Thu Jun 27 22:25:54 2024 informational - IFM_OSTATE_DN :ethernet1/1/4
Thu Jun 27 22:25:55 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Thu Jun 27 22:26:10 2024 informational - VLT_PORT_CHANNEL_UP 30 is up
Thu Jun 27 22:26:11 2024 informational - IFM_OSTATE_DN :ethernet1/1/4
Thu Jun 27 22:26:11 2024 informational - LACP_PORT_GROUPED port-channel30 : ethernet1/1/4
Thu Jun 27 22:26:11 2024 informational - LACP_PORT_UNGROUPED port-channel30 : ethernet1/1/4
Thu Jun 27 22:26:11 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Thu Jun 27 22:26:11 2024 informational - VLT_PORT_CHANNEL_DOWN 30 is down
Thu Jun 27 22:26:11 2024 informational - IFM_OSTATE_DN :ethernet1/1/4
Thu Jun 27 22:26:12 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Thu Jun 27 22:26:27 2024 informational - VLT_PORT_CHANNEL_UP 30 is up
Thu Jun 27 22:30:13 2024 informational - VLT_PORT_CHANNEL_DOWN 30 is down
Thu Jun 27 22:30:15 2024 informational - IFM_OSTATE_UP :port-channel30
Thu Jun 27 22:30:15 2024 informational - LACP_PORT_GROUPED port-channel30 : ethernet1/1/4
Thu Jun 27 22:30:15 2024 informational - VLT_PORT_CHANNEL_UP 30 is up
Fri Jun 28 01:29:05 2024 informational - IFM_OSTATE_DN :ethernet1/1/4
Fri Jun 28 01:29:06 2024 informational - IFM_OSTATE_DN :port-channel30
Fri Jun 28 01:29:06 2024 informational - LACP_PORT_UNGROUPED port-channel30 : ethernet1/1/4
Fri Jun 28 01:29:06 2024 informational - VLT_PORT_CHANNEL_DOWN 30 is down
Fri Jun 28 01:29:34 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Fri Jun 28 01:29:34 2024 informational - IFM_OSTATE_DN :ethernet1/1/4
Fri Jun 28 01:29:35 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Fri Jun 28 01:29:50 2024 informational - VLT_PORT_CHANNEL_UP 30 is up
Fri Jun 28 01:29:51 2024 informational - IFM_OSTATE_DN :ethernet1/1/4
Fri Jun 28 01:29:51 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Fri Jun 28 01:29:51 2024 informational - VLT_PORT_CHANNEL_DOWN 30 is down
Fri Jun 28 01:29:52 2024 informational - IFM_OSTATE_DN :ethernet1/1/4
Fri Jun 28 01:29:53 2024 informational - IFM_OSTATE_UP :ethernet1/1/4
Fri Jun 28 01:30:08 2024 informational - VLT_PORT_CHANNEL_UP 30 is up
Fri Jun 28 01:33:50 2024 informational - VLT_PORT_CHANNEL_DOWN 30 is down
Fri Jun 28 01:33:52 2024 informational - LACP_PORT_GROUPED port-channel30 : ethernet1/1/4
Fri Jun 28 01:33:52 2024 informational - IFM_OSTATE_UP :port-channel30
Fri Jun 28 01:33:53 2024 informational - VLT_PORT_CHANNEL_UP 30 is up

Prompt example: is there any error in the log for switch 15?

Note that this part is subject to change.

The topology information is available to the AI component in the form of a graph with nodes. Note that the AI might confuse lldp-based toplogy with the known topology. When questioned if switch 5 is connected to server 15 it will try an lldp query first. To force it to use the known topology use explicit prompts:

prompt: based on the topology what are the peers of switch 15?