Introduction

Whether you have already adopted Openshift or are considering it, this article will help you increase your ROI and productivity by listing the 12 essential features included with any Openshift subscription. This is where Openshift shines as a platform compared to pure Kubernetes engine distributions like EKS, AKS, etc., which are more barebones and require quite a bit of setup to be production and/or enterprise ready. When you consider the total value of Openshift and factor in the total cost of ownership of the alternatives, Openshift is a very competitive option not only for cost-conscious buyers but also for organizations that like to get things done, and get them done the right way. Here we go:

Managed Openshift in the cloud
Operators
GitOps
Cluster Monitoring
Cluster Logging
Distributed Tracing
Pipelines
Autoscaling
Service Mesh
Serverless
External Secrets
Hyperscaler Operators

Special Bonus: API Management

ROSA, ARO, ROKS: Managed Openshift in the cloud

If you want an easy way to manage your Openshift cloud infrastructure, these managed Openshift solutions are an excellent value and a great way to get ROI fast. They run pay-as-you-go on the hyperscaler's infrastructure, and you can save a ton of money by using reserved instances with a one-year commitment. RedHat manages the control plane (master and infra nodes) and you pay a small fee per worker. We like the seamless integration with native hyperscaler services: storage, node pools for easy autoscaling, zone awareness for HA, networking, and RBAC security with IAM or AAD. Definitely worth considering over the more barebones EKS/AKS-type solutions.

Check out our Openshift Spring Boot Accelerator for ROSA, which leverages most of the tools I’m introducing down below…

Operators

Available by default on Openshift, the OperatorHub is pretty much the app store for Kubernetes. Operators manage the installation, upgrade and lifecycle of complex Kubernetes-based solutions like the tools we're going to present in this list. Operators are based on the controller pattern, which is at the core of the Kubernetes architecture, and enable declarative configuration through the use of Custom Resource Definitions (CRDs). Operators are a very common way to distribute 3rd party software nowadays, and the Operator Framework makes it easy to create custom controllers to automate common Kubernetes operations tasks in your organization. The OperatorHub included with Openshift out-of-the-box allows you to install said 3rd party tools with the click of a button, so you can set up a full-featured cluster in just minutes, instead of spending days, weeks, or months gathering installation packages from all over. The Operator Framework supports Helm, Ansible and plain Go based controllers to manage your own CRDs and extend the Kubernetes APIs. At Perficient, we leverage custom operators to codify operations of high-level resources like a SpringBootApp. To me, Operators represent the pinnacle of devsecops automation, or at least a giant leap forward.
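
To give a concrete idea of how declarative this is, here is a minimal sketch of the kind of Subscription manifest the OperatorHub creates behind the scenes when you click install. The operator name and channel shown here happen to be the GitOps operator's, purely as an example:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-gitops-operator
  namespace: openshift-operators
spec:
  channel: latest                        # illustrative; use the channel published in OperatorHub
  name: openshift-gitops-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic         # let OLM apply upgrades automatically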

Openshift GitOps (AKA ArgoCD)

The first thing you should install on your clusters is GitOps, to centralize the management of your cluster configuration with Git. Openshift GitOps is RedHat's distribution of ArgoCD, delivered as an Operator, and it integrates seamlessly with Openshift RBAC and single sign-on authentication. Instead of relying on a CI/CD pipeline and the oc (kubectl) CLI to push changes to your clusters, ArgoCD works as an agent running on your cluster which automatically pulls your configuration manifests from a Git repository. This is the single most important tool in my opinion for so many reasons, the main ones being:

central management and synchronization of multi-cluster configuration (think multi-region active/active setups at the minimum)
ability to version control cluster states (auditing, rollback, git flow for change management)
reduction of learning curve for development teams (no new tool required, just git, manage simple yaml files)
governance and security (quickly propagating policy changes, no need to give non-admin users access to clusters' APIs)
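
To make this concrete, here is a minimal sketch of an ArgoCD Application resource pointing at a hypothetical Git repository of manifests; the repository URL, path, names and namespaces are placeholders:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: team-a-config                    # hypothetical application name
  namespace: openshift-gitops
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/cluster-config.git   # hypothetical repo
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: team-a
  syncPolicy:
    automated:
      prune: true                        # remove resources deleted from Git
      selfHeal: true                     # revert manual drift on the cluster

With automated sync and selfHeal enabled, any drift between the cluster and the Git repository is reverted automatically.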


I have a very detailed series on GitOps on the Perficient blog; it's a must-read whether you're new to Openshift or not.

Cluster Monitoring

Openshift comes with a pre-configured monitoring stack powered by Prometheus and Grafana. Openshift Monitoring manages the collection and visualization of internal metrics like resource utilization, which can be leveraged to create alerts and used as the source of data for autoscaling. This is generally a cheaper and more powerful alternative to the native monitoring systems provided by the hyperscalers, like CloudWatch and Azure Monitor. Like other RedHat managed operators, it comes already integrated with Openshift RBAC and authentication. The best part is it can be managed through GitOps by using the provided, super simple CRDs.


A less-known feature is the ability to leverage Cluster Monitoring to collect your own application metrics. This is called user-workload monitoring and can be enabled with one line in a manifest file. You can then create ServiceMonitor resources to indicate where Prometheus can scrape your application's custom metrics, which can then be used to build custom alerts, framework-aware dashboards, and best of all, used as a source for autoscaling (beyond CPU/memory). All with a declarative approach which you can manage across clusters with GitOps!
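
As a rough sketch (application and namespace names are hypothetical, and the exact keys can vary slightly between Openshift versions), enabling user-workload monitoring and scraping a Spring Boot actuator endpoint looks something like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true             # the "one line" that turns on user-workload monitoring
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-spring-app                    # hypothetical
  namespace: my-team                     # hypothetical
spec:
  selector:
    matchLabels:
      app: my-spring-app                 # labels on the Service exposing the metrics
  endpoints:
  - port: http                           # name of the Service port
    path: /actuator/prometheus           # Spring Boot actuator metrics endpoint
    interval: 30s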

Cluster Logging

Based on a Fluentd-Elasticsearch stack, Cluster Logging can be deployed through the OperatorHub and comes with production-ready configuration to collect logs from the Kubernetes engine as well as all your custom workloads in one place. Like Cluster Monitoring, Cluster Logging is generally a much cheaper and more powerful alternative to the hyperscalers' native services. Again, the integration with Openshift RBAC and single sign-on makes it very easy to secure on day one. The built-in Kibana deployment allows you to visualize all your logs through a web browser without requiring access to the Kubernetes API or CLI. The ability to visualize logs from multiple pods simultaneously, sort and filter messages based on specific fields, and create custom analytics dashboards makes Cluster Logging a must-have.

Another feature of Cluster Logging is log forwarding. Through a simple ClusterLogForwarder CRD, you can easily (and through GitOps too!) forward logs to external systems for additional processing such as real-time notifications, anomaly detection, or simply to integrate with the rest of your organization's logging systems. A great use case of log forwarding is to selectively send log messages to a central location, which is invaluable when managing multiple clusters in an active-active configuration, for example.
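
Here is a minimal, illustrative ClusterLogForwarder sketch that sends application logs to a hypothetical external Elasticsearch while keeping the in-cluster default store; the output name and URL are placeholders:

apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - name: central-es                     # hypothetical central Elasticsearch
    type: elasticsearch
    url: https://central-logging.example.com:9200
  pipelines:
  - name: forward-app-logs
    inputRefs:
    - application                        # only forward application logs, not infra/audit
    outputRefs:
    - central-es
    - default                            # keep sending to the in-cluster store too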

Last but not least is the addition of custom Elasticsearch index schema in recent versions, which allows developers to output structured log messages (JSON) and build application-aware dashboards and analytics. This feature is invaluable when it comes to filtering log messages based on custom fields like log levels, or a trace ID, to track logs across distributed transactions (think Kafka messages transiting through multiple topics and consumers). Bonus points for being able to use Elasticsearch as a metrics source for autoscaling with KEDA for example.

Openshift Distributed Tracing

Based on Jaeger and Opentracing, Distributed Tracing can again be quickly installed through the OperatorHub and makes implementing Opentracing for your applications ridiculously easy. Just deploy a Jaeger instance in your namespace and you can annotate any Deployment resource in that namespace with one single line to start collecting your traces. Distributed tracing is invaluable for pinpointing performance bottlenecks in distributed systems. Alongside Cluster Logging with structured logs as mentioned above, it makes up a complete solution for troubleshooting transactions across multiple services if you just log your Opentracing trace IDs.
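
As an illustration (names are hypothetical), a development-grade Jaeger instance is just a tiny custom resource:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger                        # hypothetical instance name
  namespace: my-team                     # hypothetical namespace
spec:
  strategy: allInOne                     # in-memory storage, fine for dev; use the production strategy with a real backend otherwise

The single line mentioned above is the sidecar.jaegertracing.io/inject: "true" annotation on your Deployment's metadata, which tells the operator to inject the Jaeger agent sidecar next to your application container.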

Openshift Distributed Tracing also integrates with Service Mesh, which we’ll introduce further down, to monitor and troubleshoot traffic between services inside a mesh, even for applications which are not configured with Opentelemetry to begin with.

Openshift Pipelines

Based on Tekton, Openshift Pipelines allows you to create declarative pipelines for all kinds of purposes. Pipelines are the recommended way to create CI/CD workflows and replace the original Jenkins integration. The granular, declarative nature of Tekton makes creating re-usable pipeline steps, tasks and entire pipelines a breeze, and again, they can be managed through GitOps (!) and custom operators. Openshift Pipelines can be deployed through the OperatorHub in one click and comes with a very intuitive (Jenkins-like) UI and pre-defined tasks like S2I to containerize applications easily. Creating custom tasks is a breeze as tasks are simply containers, which allows you to leverage the massive ecosystem of 3rd party containers without having to install anything additional.

You can use Openshift Pipelines for any kind of workflow, from standard CI/CD for application deployments, to on-demand integration tests, to executing operations maintenance tasks, or even step functions. Being Openshift-native, Pipelines are very scalable as they leverage the Openshift infrastructure to execute tasks in pods, which can be very finely tuned for maximum performance and high availability, and they integrate with Openshift RBAC and storage.
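
As a small sketch of how granular this is, a reusable task is just a container with a script; the task name, image and workspace below are illustrative:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: run-unit-tests                   # hypothetical task name
spec:
  workspaces:
  - name: source                         # the pipeline mounts the cloned repo here
  steps:
  - name: mvn-test
    image: registry.access.redhat.com/ubi9/openjdk-17   # any 3rd party image can be a step
    workingDir: $(workspaces.source.path)
    script: |
      ./mvnw test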

Autoscaling

Openshift supports all three types of autoscalers: the horizontal pod autoscaler, the vertical pod autoscaler, and the cluster autoscaler. The horizontal pod autoscaler is included OOTB alongside the node autoscaler, and the vertical pod autoscaler can be installed through the OperatorHub.

The horizontal pod autoscaler is a controller which increases and decreases the number of pod replicas for a deployment based on CPU and memory metric thresholds. It leverages Cluster Monitoring to source the Kubernetes pod metrics from the included Prometheus server and can be extended to use custom application metrics. The HPA is great for scaling stateless REST services up and down to maximize utilization and increase responsiveness during traffic spikes.
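
A typical HPA manifest is short enough to show in full; the deployment name and thresholds below are just examples:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-spring-app                    # hypothetical
  namespace: my-team                     # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-spring-app                  # the Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70           # add replicas when average CPU crosses 70%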

The vertical pod autoscaler is another controller which analyses utilization patterns to optimize pod resource configuration. It automatically tweaks your deployments' CPU and memory requests to reduce waste or undercommitment and ensure maximum performance. It's worth noting that a drawback of the VPA is that pods have to be shut down and replaced during scaling operations. Use with caution.

Finally, the cluster autoscaler is used to increase or decrease the number of nodes (machines) in the cluster to adapt to the number of pods and requested resources. The cluster autoscaler, paired with the hyperscaler's machine pool integration, can automatically create new nodes when additional capacity is required and remove nodes when the load decreases. There are a lot of considerations to account for before turning on cluster autoscaling, related to cost, stateful workloads requiring local storage, multi-zone setups, etc. Use with caution too.

Special mention for KEDA, which is not commercially supported by RedHat (yet), although it is actually a RedHat-Microsoft led project. KEDA is an event-driven scaler which sits on top of the built-in HPA and provides extensions to integrate with 3rd party metrics aggregating systems like Prometheus, Datadog, Azure Application Insights, and many, many more. It's most well-known for autoscaling serverless or event-driven applications backed by tools like Kafka, AMQ, Azure Event Hub, etc., but it's very useful for autoscaling REST services as well. Really cool tech if you want to move your existing AWS Lambda or Azure Functions over to Kubernetes.
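
For illustration, here is a sketch of a KEDA ScaledObject scaling a hypothetical Kafka consumer deployment to zero when its topic is idle; all names and thresholds are placeholders:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-consumer                  # hypothetical
  namespace: my-team                     # hypothetical
spec:
  scaleTargetRef:
    name: orders-consumer                # the Deployment to scale
  minReplicaCount: 0                     # scale to zero when the topic is idle
  maxReplicaCount: 20
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: my-kafka-bootstrap.kafka.svc:9092   # hypothetical broker address
      consumerGroup: orders
      topic: orders
      lagThreshold: "50"                 # add a replica for every 50 messages of lag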

Service Mesh

Service Mesh is supported by default and can also be installed through the OperatorHub. It leverages Istio and integrates nicely with other Openshift operators such as Distributed Tracing, Monitoring and Logging, as well as SSO. Service Mesh serves many different functions that you might be managing inside your application today (for example, if you're using Netflix OSS components like Eureka, Hystrix, Ribbon, etc.):

Blue/green deployments
Canary deployments (weighted traffic)
A/B testing
Chaos testing
Traffic encryption
OAuth and OpenID authentication
Distributed tracing
APM

You don’t even need to use microservices to take advantage of Service Mesh; a lot of these features apply to re-platformed monoliths as well.

Finally, you can leverage Service Mesh as a simple API management tool thanks to the Ingress Gateway component, in order to expose APIs outside of the cluster behind a single pane of glass.
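
As an example of the weighted traffic mentioned above, a canary release boils down to an Istio VirtualService like the following sketch; it assumes a DestinationRule defining the v1 and v2 subsets, and all names are hypothetical:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-spring-app                    # hypothetical
  namespace: my-team                     # hypothetical
spec:
  hosts:
  - my-spring-app
  http:
  - route:
    - destination:
        host: my-spring-app
        subset: v1
      weight: 90                         # 90% of traffic stays on the current version
    - destination:
        host: my-spring-app
        subset: v2
      weight: 10                         # 10% goes to the canary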

Serverless

Now we’re getting into real modern application development and deployment. If you want peak performance, want to maximize your compute resources, and/or want to bring down your costs, serverless is the way to go for APIs. Openshift Serverless is based on KNative and provides 2 main components: Serving and Eventing. Serving handles autoscaling and basic routing for HTTP API containers, while Eventing supports event-driven architectures with CloudEvents.

If you’re familiar with AWS Lambda or Azure Functions, Serverless is the equivalent in the Kubernetes world, and there are ways to migrate from one to the other if you want to leverage more Kubernetes in your infrastructure.

We can build a similar solution with some of the tools we already discussed like KEDA and Service Mesh, but KNative is a more opinionated model for HTTP-based serverless applications. You will get better results with KNative if you’re starting from scratch.
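
For reference, a KNative service is a single, compact resource; the name, image and concurrency target below are placeholders:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: greeter                          # hypothetical service name
  namespace: my-team                     # hypothetical namespace
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "50"   # target concurrent requests per pod
    spec:
      containers:
      - image: quay.io/my-org/greeter:1.0      # hypothetical image
        ports:
        - containerPort: 8080

KNative handles the Deployment, Service, Route and revisioning for you, and scales the pods (down to zero) based on incoming traffic.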

The big new thing is Eventing, which promotes a message-based approach to service-to-service communication (as opposed to point-to-point). If you’ve used that kind of decoupling before, you might have used Kafka, AWS SQS or other types of queues to decouple your applications, and maybe Mulesoft, Spring Integration or Camel (Fuse) to produce and consume messages. KNative Eventing provides a unified model for message formats with CloudEvents and abstracts the transport layer with a concept called the event mesh. Check it out: https://knative.dev/docs/eventing/event-mesh/#knative-event-mesh.

External Secrets Add-On

One of the first things to address when deploying applications to Kubernetes is the management of sensitive configuration variables like passwords to external systems. Though Openshift doesn’t officially support loading secrets from external vaults, there are widely used solutions which are easily set up on Openshift clusters:

Sealed Secrets: if you just want to manage your secrets in Git, you cannot store them in the clear, even if you’re using GitHub or another Git provider. SealedSecrets allows you to encrypt secrets in Git which can only be read by your Openshift cluster. This requires an extra encryption step before committing, using the provided client certificate, but doesn’t require a 3rd party store.
External Secrets: this operator allows you to map secrets stored in external vaults like HashiCorp Vault, Azure Key Vault and AWS Secrets Manager to internal Openshift secrets. Very similar to the CSI driver below, it essentially creates a Secret resource automatically, but doesn’t require an application’s deployment manifest to be modified in order to be leveraged.
Secrets Store CSI Driver: another operator which syncs an external secrets store to an internal secret in Openshift but works differently than the External Secrets operator above. Secrets managed by the CSI driver only exist as long as a pod using it is running, and the application’s deployment manifest has to explicitly “invoke” it. It’s not usable for 3rd party containers which are not built with CSI driver support out-of-the-box.

Each has its pros and cons depending on whether you’re in the cloud, whether you use GitOps, your organization’s policies, existing secrets management processes, etc. If you’re starting from scratch and are not sure which one to use, I recommend starting with External Secrets and your cloud provider’s secret store, like AWS Secrets Manager or Azure Key Vault.
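
As a sketch of the External Secrets approach (store name, secret names and paths are all hypothetical), an ExternalSecret maps an entry in AWS Secrets Manager to a plain Openshift Secret:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials                   # hypothetical
  namespace: my-team                     # hypothetical
spec:
  refreshInterval: 1h                    # re-sync from the vault every hour
  secretStoreRef:
    name: aws-secrets-manager            # a ClusterSecretStore defined separately with AWS credentials
    kind: ClusterSecretStore
  target:
    name: db-credentials                 # the regular Openshift Secret that gets created
  data:
  - secretKey: password
    remoteRef:
      key: prod/my-team/db               # hypothetical path in AWS Secrets Manager
      property: password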

Special Mention: Hyperscaler Operators

If you’re running on AWS or Azure, each cloud provider has released their own operators to manage cloud infrastructure components through GitOps (think vaults, databases, disks, etc.), allowing you to consolidate all your cloud configuration in one place instead of using additional tools like Terraform and CI/CD. This is particularly useful when automating integration or end-to-end tests with ephemeral Helm charts to set up the various components of an application.
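
For example, assuming the AWS Controllers for Kubernetes (ACK) S3 controller is installed, an S3 bucket becomes just another GitOps-managed manifest; the names below are hypothetical:

apiVersion: s3.services.k8s.aws/v1alpha1
kind: Bucket
metadata:
  name: my-app-uploads                   # hypothetical resource name
  namespace: my-team                     # hypothetical namespace
spec:
  name: my-org-my-app-uploads            # the actual bucket name created in AWS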

API Management Add-On

If you’re a Mulesoft, Boomi or Cloud Pak for Integration customer, this one is an add-on, but it’s well worth considering if you want to reduce your APIM costs: RedHat Application Foundations and Integration. These suites include a bunch of cool tech like Kafka (with a registry) and AMQ, SSO (SAML, OIDC, OAuth), runtimes like Quarkus, Spring and Camel, 3scale for API management (usage plans, keys, etc.), CDC, caching and more.

Again because it’s all packaged as an operator, you can install and start using all these things in just a few minutes, with the declarative configuration goodness that enables GitOps and custom operators.