Server lifecycle

MetalSoft defines the following server states:

Server states

Available

A server is healthy and ready to be used in a new provisioning operation. An admin manually changes a server from unavailable to available once the server's configuration (disks and switch interfaces) has been checked. This is a manually set state.

Available_reserved

State available_reserved is automatically set on a server that is in the available state but is reserved under a subscription for a specific user. This is an automatically set state.

Used

A server is already in use and cannot be used in a new provisioning process in its current state. A server cannot be manually changed to used and should not be manually changed from used to another state. This is an automatically set state.

Registering

When a new server is first added to the application, it passes through a complex process so that the system becomes aware of the server's hardware components, configuration, etc. For this, a live, lightweight, custom Linux image is booted via the BMC and a series of scripts is run. This is an automatically set state.

Cleaning

A server in the cleaning state is in the middle of a de-provision operation. The same live Linux image runs clean-up scripts during this stage, making the server ready for a new provisioning operation. This is an automatically set state.

Defective

When specific issues occur with a server, or hardware faults are present, the server can be marked as defective from the advanced page. While in the defective state, a server cannot pass into the used state. This is a manually set state.

Used_registering

A server is in the used_registering state when an already used server is passed through the registering process. Once the registration process completes, the server will re-enter used state. This is an automatically set state.

Decommissioned

The decommissioned state describes a server that has been removed from the application and cannot be used unless it is re-added (going back through the first server cycle described below). This is an automatically set state.

Removed from Rack

When a server is no longer required, but the administrator needs to keep audit data, the server should be changed to removed_from_rack. This is a manually set state.
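The states described above can be summarized in a small lookup. This is only an illustrative sketch: the `ServerState` enum and the `is_manually_set` helper are hypothetical names, not part of any MetalSoft API, and the lowercase string values for states other than available_reserved, used_registering and removed_from_rack are assumed spellings.

```python
from enum import Enum


class ServerState(Enum):
    # Hypothetical enum; one member per state described in this section.
    AVAILABLE = "available"
    AVAILABLE_RESERVED = "available_reserved"
    USED = "used"
    REGISTERING = "registering"
    CLEANING = "cleaning"
    DEFECTIVE = "defective"
    USED_REGISTERING = "used_registering"
    DECOMMISSIONED = "decommissioned"
    REMOVED_FROM_RACK = "removed_from_rack"


# States the descriptions above call "manually set".
# Every other state is described as automatically set.
MANUALLY_SET = frozenset({
    ServerState.AVAILABLE,
    ServerState.DEFECTIVE,
    ServerState.REMOVED_FROM_RACK,
})


def is_manually_set(state: ServerState) -> bool:
    """Return True if the state is set by an administrator."""
    return state in MANUALLY_SET
```

For example, `is_manually_set(ServerState.DEFECTIVE)` is true, while `is_manually_set(ServerState.CLEANING)` is false.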

Interact with the server states

All servers, in all their states, can be seen on the Server type utilization report page.

Administrators can interact with the server states by opening the server page, going to the Server status section and pressing the desired button. From here, the server can be moved into available, unavailable, defective or removed_from_rack. In normal circumstances, you should not move a server to Used, or from Used to another state, unless the server is defective or decommissioned.
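The rule above can be expressed as a simple guard. This is a minimal sketch, not MetalSoft's implementation: `ADMIN_TARGETS` and `manual_change_allowed` are hypothetical names, state names are assumed to be lowercase strings, and the exception for defective or decommissioned servers is deliberately left out.

```python
# Target states offered in the Server status section, per the text above.
ADMIN_TARGETS = {"available", "unavailable", "defective", "removed_from_rack"}


def manual_change_allowed(current: str, target: str) -> bool:
    """Return True if a manual state change looks safe in normal circumstances."""
    if target not in ADMIN_TARGETS:
        # For example "used", which is set by the system and not by hand.
        return False
    if current == "used":
        # Used servers should normally not be moved manually.
        return False
    return True
```

With this sketch, `manual_change_allowed("unavailable", "available")` is true, while `manual_change_allowed("available", "used")` and `manual_change_allowed("used", "available")` are both false.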

The registration and decommissioning processes can be triggered from the specific section.

Server cycles

1. Adding a server and using it

When a new server is first added to the application, it passes through the registering process. If the registering process succeeds, the server goes into the unavailable state. From there, it can be manually set to the available state. From then on, the server can be selected and used in a provisioning process.
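The first cycle can be written down as an ordered list of transitions. The sketch below is illustrative only: `FIRST_CYCLE` and `next_step` are hypothetical names, and the "automatic" and "manual" labels are read from the state descriptions in this document and do not come from a MetalSoft API.

```python
# (from_state, to_state, how the change is triggered)
FIRST_CYCLE = [
    ("registering", "unavailable", "automatic"),  # registration succeeded
    ("unavailable", "available", "manual"),       # admin checked the server
    ("available", "used", "automatic"),           # picked for provisioning
]


def next_step(state):
    """Return (next_state, trigger) for a state in the first cycle, else None."""
    for source, target, trigger in FIRST_CYCLE:
        if source == state:
            return target, trigger
    return None
```

Calling `next_step("unavailable")` yields the manual move to available, which is the only step in this cycle that needs an administrator.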

2. Modifying and re-registering a used server

When modifications need to be made to a server in the used state, the re-register process can be issued on the active server.

3. Releasing a server and using it again

The following diagram shows the states of a server from its release to its re-use.

[Diagram: server lifecycle]

4. Marking a server as defective

Marking a server as defective is a manual process. Once a server is diagnosed as unhealthy, the button shown below can be used to mark it as defective. Note that the server must not be in the Used state when it is marked as defective.

[Image: server lifecycle]

During the server recovery, the notes section can be used to note down the fault, current status and other details.

After the server is repaired, it can be set to available again using the Available button.

The following diagram shows the states of a server from marking it as defective to setting it as available again and then being used.

[Diagram: server lifecycle]

Managing hardware failures and changes

During a server's life, components will need to be replaced or the server upgraded. For some changes, MetalSoft needs to be made aware of them so that the server type is presented correctly and the networking performs as expected.

No Re-Registration required (but advised)

For the following types of replacement, nothing will need to be changed in MetalSoft:

  • Replace RAM with a module of the same size as the failed module

  • Replace a disk with one of the same size/model/type as the failed disk (if a disk is replaced, we still strongly suggest re-registering the server at some point so MetalSoft is aware of the new serial number. This can be done using Re-register, but it will cause the server to reboot)

  • Replace a CPU with one of the same model/type as the failed CPU
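The like-for-like rule in the list above can be sketched as a comparison of the attributes named for each component. This is only an illustration: `MATCH_KEYS`, `is_like_for_like` and the dictionary shape used for a component are assumptions made for this example, not MetalSoft data structures.

```python
# Attributes that must match for each component kind, per the list above.
MATCH_KEYS = {
    "ram": ("size",),
    "disk": ("size", "model", "type"),
    "cpu": ("model", "type"),
}


def is_like_for_like(kind: str, failed: dict, replacement: dict) -> bool:
    """Return True if the replacement matches the failed part on every
    attribute that matters for this kind of component."""
    keys = MATCH_KEYS[kind]
    return all(
        key in failed and failed.get(key) == replacement.get(key)
        for key in keys
    )
```

A replacement that fails this check would be treated as a change that MetalSoft needs to learn about through re-registration. Even when the check passes for a disk, re-registering is still advised so the serial number is recorded.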

Network card changes

If the NIC needs to be replaced, the server needs to be re-registered. How this is done depends on whether the server is in Available, Unavailable or Used state:

  • Available - If the server is in Available state, change it to Unavailable state and then re-register it

  • Unavailable - If the server is in Unavailable state, it can simply be re-registered

  • Used - If the server is in Used state, you can still re-register it. You do not need to change the state from Used, you just need to click Re-register. Please be aware that this will cause a reboot of the server. If the NIC is plugged into different ports on the switch, the network configuration in the OS may need to be changed manually, as MetalSoft does not have access to an installed OS.
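The three cases above amount to a small lookup from the current state to the steps to take. The following is a hedged sketch: `nic_replacement_steps` is a hypothetical helper, and the step strings are plain descriptions for an operator, not MetalSoft commands.

```python
def nic_replacement_steps(state: str):
    """Return the ordered steps to take after a NIC replacement,
    given the server's current state."""
    state = state.lower()
    if state == "available":
        return ["change state to unavailable", "re-register"]
    if state == "unavailable":
        return ["re-register"]
    if state == "used":
        # No state change is needed, but re-registering reboots the server.
        return ["re-register (reboots the server)"]
    raise ValueError(f"no NIC replacement procedure described for state {state!r}")
```

States not covered by the list above raise an error here, so that a caller does not silently apply the wrong procedure.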

BMC board/Motherboard changes

If the BMC board (and, in some instances, the motherboard) is replaced on a server that is in Used state, we strongly suggest backing up the server and removing it from the infrastructure. The server status should then be changed from Available to Unavailable, after which the server needs to be decommissioned (from the Advanced tab) and then registered again. Before decommissioning the server, please make a copy of the BMC IP address, user and password, as well as the serial number.

Upgrading a server

To upgrade a server's resources, the server cannot be in use by any infrastructure (it needs to be in Available state). The server then needs to be changed to Unavailable and re-registered. Once it is changed back to Available, the server will have a new server type that reflects the new configuration.