OpenShift Platform in a Hybrid Operation Model

Introduction
Most organizations today face growing demands on the flexibility and performance of their IT infrastructure. A hybrid operation model for the OpenShift platform, in which worker nodes run on both virtual servers and physical (bare metal) servers, is an effective response. It combines the benefits of both environments: the flexibility and easy management of virtual machines with the high performance and efficiency of bare metal servers.
Goals and Benefits of the Hybrid Model
Key Goals
- Optimization of infrastructure operating costs.
- Ensuring performance for demanding workloads (AI/ML) while maintaining flexible cluster scaling.
- Flexibility in deploying and scaling workloads.
- Easier extensibility and compatibility with existing IT systems.
Benefits of the Hybrid Model
- Performance: Bare metal workers provide better performance for applications that need direct access to hardware.
- Scalability: Virtual machines allow quick response to changes in computational capacity requirements.
- Cost Optimization: More expensive but high-performance bare metal workers equipped with powerful graphics cards can be combined with virtual workers, which allow the cluster as a whole to scale more flexibly.
Hybrid OpenShift Cluster Architecture
Cluster Structure
- Control plane: Runs on virtual servers for easy management and redundancy.
- Worker nodes:
  - Virtual servers: Intended for standard workloads and microservices applications.
  - Bare metal servers: Used for applications with high performance requirements.
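The split between the two worker types can be read as a simple placement rule. The following minimal Python sketch illustrates that idea and is not part of the described solution: the `Workload` fields, the pool names `virtual` and `baremetal`, and the selection criteria (GPU need or direct hardware access) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an application to be placed."""
    name: str
    needs_gpu: bool = False
    needs_direct_hardware: bool = False


def choose_worker_pool(workload: Workload) -> str:
    """Return the assumed worker pool name for a workload.

    Workloads that need a GPU or direct hardware access go to bare metal;
    everything else goes to virtual workers.
    """
    if workload.needs_gpu or workload.needs_direct_hardware:
        return "baremetal"
    return "virtual"


if __name__ == "__main__":
    examples = (
        Workload("web-frontend"),
        Workload("model-training", needs_gpu=True),
    )
    for w in examples:
        print(f"{w.name} -> {choose_worker_pool(w)}")
```

In practice the criteria would come out of the planning step, where each application is assigned to a node type.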
Integration with Existing Infrastructure
- Connection to the existing virtualization platform (VMware).
- Cluster provisioning automation using Infrastructure-as-Code (Ansible).
Implementation and Operation
Hybrid Cluster Deployment
- Planning: Definition of workloads and decision on which applications will run on which nodes.
- OpenShift Installation: Fully automated deployment with the OpenShift Installer, following our proposed design.
- Configuration:
  - Setting conditions for workload scheduling (node affinity, tolerations).
  - Optimization of network communication between virtual and bare metal workers.
- Monitoring and observability:
  - Using Prometheus, Grafana and OpenTelemetry to monitor cluster performance and status.
  - Log management using the OpenShift Logging stack with Elastic Cloud on Kubernetes (ECK) deployed on-premises.
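The scheduling conditions mentioned under Configuration can be sketched as a Pod manifest built in Python. This is an illustrative sketch only: the node label `example.com/worker-type`, the taint key `example.com/baremetal`, and the image name are hypothetical, while the field names (`nodeAffinity`, `requiredDuringSchedulingIgnoredDuringExecution`, `tolerations`) follow the standard Kubernetes Pod specification. It assumes that bare metal nodes carry this label and a `NoSchedule` taint, so that ordinary workloads stay on virtual workers.

```python
import json

# Hypothetical label and taint keys; a real cluster would use its own.
NODE_LABEL_KEY = "example.com/worker-type"
TAINT_KEY = "example.com/baremetal"


def build_baremetal_pod(name: str, image: str) -> dict:
    """Build a Pod manifest that can only be scheduled on bare metal workers."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            # Node affinity: require a node labeled as a bare metal worker.
            "affinity": {
                "nodeAffinity": {
                    "requiredDuringSchedulingIgnoredDuringExecution": {
                        "nodeSelectorTerms": [
                            {
                                "matchExpressions": [
                                    {
                                        "key": NODE_LABEL_KEY,
                                        "operator": "In",
                                        "values": ["baremetal"],
                                    }
                                ]
                            }
                        ]
                    }
                }
            },
            # Toleration: allow scheduling despite the assumed taint.
            "tolerations": [
                {
                    "key": TAINT_KEY,
                    "operator": "Exists",
                    "effect": "NoSchedule",
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = build_baremetal_pod(
        "model-training", "registry.example.com/ml/train:latest"
    )
    print(json.dumps(pod, indent=2))
```

The affinity rule pulls the pod toward bare metal nodes, and the toleration lets it pass the taint that keeps other workloads away. Both are needed for a dedicated node pool: a toleration alone would not prevent the pod from landing on a virtual worker.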
Conclusion
The hybrid operation model of OpenShift combines the flexibility and cost efficiency of virtual servers with the high performance of bare metal workers. This approach suits demanding workloads, such as modern AI/ML applications, that need a scalable and efficient platform.