Enhancing Server State Detection in OpenStack for Immediate Host Fault Reporting
This blueprint proposes the development of a new API in Nova to promptly update the server state in OpenStack when a host fault occurs. The intention is to ensure reliability and real-time updates of server and host states for Telco-grade VIM, allowing users to take necessary actions swiftly in case of errors. Implementation alternatives and suggestions for enhancing server state visibility are discussed.
Uploaded on Sep 22, 2024
Presentation Transcript
**DRAFT** NOVA Blueprint: Report host fault to update server state immediately
https://blueprints.launchpad.net/nova/+spec/update-server-state-immediately
03/10/2015
BP Draft description
When a server goes down because of a host hardware, OS, or hypervisor error, its state remains operational in the OpenStack API. A new API is needed to report a host fault and change the state of the affected server(s) immediately. The new API makes it possible to detect any kind of host failure externally and to inform OpenStack about it.
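The behaviour described above can be illustrated with a minimal in-memory sketch. It is not Nova code: the `Inventory` class, the `report_host_fault` method, the host and server names, and the choice of `ERROR` as the resulting server state are all assumptions made for illustration, since the draft does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Server:
    name: str
    host: str
    state: str = "ACTIVE"


@dataclass
class Inventory:
    """Hypothetical in-memory stand-in for the state OpenStack keeps."""

    servers: List[Server] = field(default_factory=list)
    host_up: Dict[str, bool] = field(default_factory=dict)

    def report_host_fault(self, host: str) -> List[str]:
        """Mark the host down and move every server on it to ERROR.

        Returns the names of the servers whose state changed.
        ERROR is an assumed target state, not taken from the blueprint.
        """
        if host not in self.host_up:
            raise KeyError(f"unknown host: {host}")
        self.host_up[host] = False
        changed = []
        for server in self.servers:
            if server.host == host and server.state != "ERROR":
                server.state = "ERROR"
                changed.append(server.name)
        return changed


# An external fault detector would call this as soon as it sees a failure.
inventory = Inventory(
    servers=[
        Server("vm-1", "compute-1"),
        Server("vm-2", "compute-1"),
        Server("vm-3", "compute-2"),
    ],
    host_up={"compute-1": True, "compute-2": True},
)
affected = inventory.report_host_fault("compute-1")
print(affected)
```

The point of the sketch is that a single external report updates both the host and every server on it in one step, instead of waiting for an internal timeout to expire.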
Intention
For a Telco-grade VIM, users expect server and host state to be reliable and up to date. State should change quickly when an error occurs, so that the user can take corresponding action. The host needs to be fenced (shut down) if its server(s) are moved to another host, which guarantees that the same server instance does not exist twice. Fencing will be done by an external tool in the NFVI, which can be Pacemaker or any other tool that detects host faults (Doctor). While the Doctor project will notify host faults to the northbound interface (NB IF) quickly, the OpenStack state should reflect the same information and not conflict with it.
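The ordering constraint between fencing and moving servers can be sketched as follows. This is an illustrative toy under stated assumptions: `recover_host`, the `fence` callback, and the `placement` mapping are hypothetical names, and a real deployment would delegate fencing to Pacemaker or a similar tool.

```python
from typing import Callable, Dict, List


def recover_host(
    failed_host: str,
    target_host: str,
    placement: Dict[str, str],
    fence: Callable[[str], bool],
) -> List[str]:
    """Move servers off failed_host only after fencing has succeeded.

    placement maps server name -> host name and is updated in place.
    fence is a hypothetical callback that powers the host off and
    returns True on success.
    """
    if not fence(failed_host):
        # Without confirmed fencing the same instance could run twice.
        raise RuntimeError(f"fencing {failed_host} failed; refusing to move servers")
    moved = []
    for server, host in placement.items():
        if host == failed_host:
            placement[server] = target_host
            moved.append(server)
    return moved


fenced_hosts: List[str] = []


def fake_fence(host: str) -> bool:
    """Stand-in for a real fencing agent; records the call and succeeds."""
    fenced_hosts.append(host)
    return True


placement = {"vm-1": "compute-1", "vm-2": "compute-2"}
moved = recover_host("compute-1", "compute-3", placement, fake_fence)
print(moved, placement)
```

Refusing to move anything when fencing fails is the design choice that keeps a server instance from existing on two hosts at once.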
Report host fault immediately: implementation alternatives
1. Add a new API to Nova to report that a host has failed. This expects an external, configurable tool that can detect any host fault reliably and report it to Nova immediately on detection. It may not be easily accepted, because it overlaps with OpenStack's current fault-detection capability and depends on an external tool (or an OpenStack tool that does not exist yet). This can be the next step once there is a tool to collect and report faults (Doctor).
2. Use the Pacemaker servicegroup driver to report host faults. The OpenStack community is already aware of this, so it might be accepted easily: https://blueprints.launchpad.net/nova/+spec/Pacemaker-servicegroup-driver
   Related spec to use the tooz library for service groups: https://blueprints.launchpad.net/nova/+spec/tooz-for-service-groups
   This alternative limits the solution to Pacemaker. It also is not reflected in the server state that a user might currently be polling.
Enhance nova list to show state better
This expects the Pacemaker servicegroup driver to report host faults. Currently, a user interested in server state polls this information. As host fault detection becomes faster, it would be beneficial to make the host state visible in the server state query. Any existing user HA implementation that polls server state would benefit: seeing that the host is really down, the user could reliably expect the server to be down and take the necessary action.
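A minimal sketch of what such an enriched listing could look like to a polling client is given below. It does not reproduce the real `nova list` output: the `ServerRow` fields, the `host_state` column, and the `needs_recovery` rule are assumptions for illustration only.

```python
from typing import Dict, List, NamedTuple


class ServerRow(NamedTuple):
    name: str
    status: str      # server state as stored, e.g. "ACTIVE"
    host: str
    host_state: str  # hypothetical extra column: "up" or "down"


def list_servers(
    servers: Dict[str, Dict[str, str]],
    host_up: Dict[str, bool],
) -> List[ServerRow]:
    """Return one row per server with the state of its host attached."""
    rows = []
    for name, info in sorted(servers.items()):
        # Assumption: a host with no reported state is treated as down.
        up = host_up.get(info["host"], False)
        rows.append(
            ServerRow(name, info["status"], info["host"], "up" if up else "down")
        )
    return rows


def needs_recovery(row: ServerRow) -> bool:
    """A polling HA client acts if the server or its host is reported bad."""
    return row.status == "ERROR" or row.host_state == "down"


servers = {
    "vm-1": {"status": "ACTIVE", "host": "compute-1"},
    "vm-2": {"status": "ACTIVE", "host": "compute-2"},
}
host_up = {"compute-1": False, "compute-2": True}

rows = list_servers(servers, host_up)
to_recover = [row.name for row in rows if needs_recovery(row)]
print(to_recover)
```

In this example `vm-1` still reports `ACTIVE`, yet the client can act on it because the host state is visible in the same query.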