Cloudera Streaming Operators
Setting up a new Macbook for Data Engineering doesn’t have to be a chore. In this guide, we will go from a fresh macOS install to running the full suite of Cloudera Streaming Operators (CFM, CSA, CSM) on a local Minikube cluster.
We are utilizing the Evaluation examples for these services. This is a critical step for any engineer; mastering the foundational “batteries-included” configurations is essential before moving into complex, high-availability production architectures.
🚀 Let’s get started!
💻 My Macbook Setup
First, we need the “Swiss Army Knife” of macOS: Homebrew. We will use it to install our core CLI tools, container runtimes, and local Kubernetes environment.
# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install Core Tools
brew install git docker python minikube kubernetes-cli k9s helm
# Start Docker Desktop (or Colima)
open /Applications/Docker.app
# Spin up Minikube with enough resources for the full stack
minikube start --cpus 4 --memory 12288
Pro Tip! Makes sure you have enough resources in docker, then use minikube start to allocate cpu and memory as shown above.
📦 Some Helm and Kubectl Setup
We need the Operators installed first to manage our custom resources. We’ll pull these from the Cloudera Helm Registry.
# 1. Create the namespace
kubectl create namespace cld-streaming
# 2. Add Cloudera License Credentials
kubectl create secret docker-registry cloudera-creds \
--docker-server=container.repository.cloudera.com \
--docker-username=<CLOUDERA_LICENSE_USER> \
--docker-password=<CLOUDERA_LICENSE_PASSWORD> \
-n cld-streaming
helm registry login container.repository.cloudera.com
🎡 Install Cloudera Streams Messaging Operator(CSM 1.6)
Next, we stand up our messaging backbone. The CSM 1.6 operator makes it incredibly simple to deploy a Kafka cluster for local development. First lets install the Strimzi Kafka Operator:
# Install Cloudera Kafka Strimzi Operator
helm install strimzi-cluster-operator
--namespace cld-streaming
--version 1.6.0-b99
--set 'image.imagePullSecrets[0].name=cloudera-creds'
--set-file clouderaLicense.fileContent=./license.txt
--set watchAnyNamespace=true
oci://container.repository.cloudera.com/cloudera-helm/csm-operator/strimzi-kafka-operator
Create the kafka-eval.yaml as follows:
kafka-eval.yaml
apiVersion: kafka.strimzi.io/v1
kind: Kafka
metadata:
name: my-cluster
annotations:
strimzi.io/node-pools: enabled
strimzi.io/kraft: enabled
spec:
kafka:
version: 4.1.1.1.6
listeners:
- name: plain
port: 9092
type: internal
tls: false
- name: tls
port: 9093
type: internal
tls: true
config:
offsets.topic.replication.factor: 3
transaction.state.log.replication.factor: 3
transaction.state.log.min.isr: 2
default.replication.factor: 3
min.insync.replicas: 2
entityOperator:
topicOperator: {}
userOperator: {}
Create the kafka-nodepool.yaml as follows:
kafka-nodepool.yaml
apiVersion: kafka.strimzi.io/v1
kind: KafkaNodePool
metadata:
name: combined
labels:
strimzi.io/cluster: my-cluster
spec:
replicas: 3
roles:
- controller
- broker
storage:
type: jbod
volumes:
- id: 0
type: persistent-claim
size: 10Gi
kraftMetadata: shared
deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1
kind: KafkaNodePool
metadata:
name: broker-only
labels:
strimzi.io/cluster: my-cluster
spec:
replicas: 3
roles:
- broker
storage:
type: jbod
volumes:
- id: 0
type: persistent-claim
size: 10Gi
kraftMetadata: shared
deleteClaim: false
Execute the following kubectl apply command to create the kafka cluster:
kubectl apply --filename kafka-eval.yaml,kafka-nodepool.yaml --namespace cld-streaming
Notice the response:
kafka.kafka.strimzi.io/my-cluster created
kafkanodepool.kafka.strimzi.io/combined created
kafkanodepool.kafka.strimzi.io/broker-only created
📑 Deploy Schema Registry (CSM 1.6)
To manage our data serialization and ensure compatibility across the stack, we deploy the Cloudera Schema Registry. This allows to define schema for data that NiFI, Kafka, Flink and SSB can easily decipher.
Create the sr-values.yaml file as follows:
sr-values.yaml
tls:
enabled: false
authentication:
oauth:
enabled: false
authorization:
simple:
enabled: false
database:
type: in-memory
service:
type: NodePort
Execute the following helm command to install Schema Registry:
# Install Schema Registry from the Cloudera OCI Registry
helm install schema-registry
--namespace cld-streaming
--version 1.6.0-b99
--values sr-values.yaml
--set "image.imagePullSecrets[0].name=cloudera-creds"
oci://container.repository.cloudera.com/cloudera-helm/csm-operator/schema-registry
Notice the following output:
Pulled: container.repository.cloudera.com/cloudera-helm/csm-operator/schema-registry:1.6.0-b99
Digest: sha256:6e7d57bf42ddefc3f216b977d91fd606caf78a456c6aa2989002ad4387fb5a6d
NAME: schema-registry
LAST DEPLOYED: Mon Mar 9 13:15:09 2026
NAMESPACE: cld-streaming
STATUS: deployed
REVISION: 1
DESCRIPTION: Install complete
TEST SUITE: None
NOTES:
Thank you for installing schema-registry.
Your release is named schema-registry.
WARNING: You configured Schema Registry to use an in-memory database. This setup is only suitable for testing and evaluation. Cloudera does not recommend this configuration for production use. All schemas will be lost when Pods restart.
🕵️ Deploy Kafka Surveyor (CSM 1.6)
A new standout in the CSM Operator 1.6 is Kafka Surveyor. This provides advanced observability into your Kafka environment. It is a game-changer for monitoring broker health and topic metrics in a Kubernetes-native way.
Before we can create a surveyor yaml we need to find our kafka boostrapServers:
kubectl get kafka my-cluster -n cld-streaming -o jsonpath='{.status.listeners[?(@.name=="plain")].bootstrapServers}'
Grab the following from the output:
my-cluster-kafka-bootstrap.cld-streaming.svc:9092
and use it to create the surveyor-eval.yaml as follows:
kafka-surveyor.yaml
clusterConfigs:
clusters:
- clusterName: my-cluster
tags:
- csm1.6
bootstrapServers: my-cluster-kafka-bootstrap.cld-streaming.svc:9092
adminOperationTimeout: PT1M
authorization:
enabled: false
commonClientConfig:
security.protocol: PLAINTEXT
surveyorConfig:
surveyor:
authentication:
enabled: false
tlsConfigs:
enabled: false
Execute the following helm command to install Cloudera Surveyor:
helm install cloudera-surveyor \\
--namespace cld-streaming \\
--version 1.6.0-b99 \\
--values kafka-surveyor.yaml \\
--set image.imagePullSecrets=cloudera-creds \\
--set-file clouderaLicense.fileContent=./license.txt
Notice the following output:
Pulled: container.repository.cloudera.com/cloudera-helm/csm-operator/surveyor:1.6.0-b99
Digest: sha256:7e12c2b2ab8aca0351296f5c986d64d55a92b66e7ac6b599ccbd5b3fa9c13167
NAME: cloudera-surveyor
LAST DEPLOYED: Mon Mar 9 13:18:14 2026
NAMESPACE: cld-streaming
STATUS: deployed
REVISION: 1
DESCRIPTION: Install complete
TEST SUITE: None
NOTES:
Thank you for installing surveyor.
Your release is named cloudera-surveyor.
⚡ Deploy SQL Stream Builder (CSA 1.5)
First lets install the Cloudera Streaming Analytics Operator
kubectl create -f https://github.com/jetstack/cert-manager/releases/download/v1.8.2/cert-manager.yaml
kubectl wait -n cert-manager --for=condition=Available deployment --all
helm install csa-operator --namespace cld-streaming \\
--set 'flink-kubernetes-operator.imagePullSecrets[0].name=cloudera-creds' \\
--set 'ssb.sse.image.imagePullSecrets[0].name=cloudera-creds' \\
--set 'ssb.sqlRunner.image.imagePullSecrets[0].name=cloudera-creds' \\
--set-file flink-kubernetes-operator.clouderaLicense.fileContent=./license.txt \\
Notice the following output:
Pulled: container.repository.cloudera.com/cloudera-helm/csa-operator/csa-operator:1.5.0-b275
Digest: sha256:2ee057584ec167a9d70cabbe9a4f0aa8cd93ddf74e73da3c2617a829a87e593c
NAME: csa-operator
LAST DEPLOYED: Mon Mar 9 14:43:32 2026
NAMESPACE: cld-streaming
STATUS: deployed
REVISION: 1
DESCRIPTION: Install complete
TEST SUITE: None
🌊 Deploy NiFi (CFM 3.0)
We’ll start with a NiFi cluster using the standard evaluation spec. This gives us a fully functional NiFi instance without the overhead of complex external persistence.
Create the nifi-eval.yaml as follows:
nifi-eval.yaml
apiVersion: cfm.cloudera.com/v1alpha1
kind: NifiCluster
metadata:
name: nifi-eval
namespace: cld-streaming
spec:
nodes: 1
imagePullSecrets:
- name: cloudera-creds
externalConfiguration:
nodeProperties:
nifi.web.http.host: 0.0.0.0
kubectl apply -f nifi-eval.yaml -n cld-streaming
🕵️ Final Verification
The best way to see the fruits of your labor is via k9s.
# Check the deployment progress
k9s
Behold: The full Cloudera Streaming stack running harmoniously in your local cluster.
You can also expose all the services with minikube like this:
minikube service list -n cld-streaming
┌───────────────┬────────────────────────────────┬──────────────────┬─────┐
│ NAMESPACE │ NAME │ TARGET PORT │ URL │
├───────────────┼────────────────────────────────┼──────────────────┼─────┤
│ cld-streaming │ cloudera-surveyor-service │ http/8080 │ │
│ cld-streaming │ flink-operator-webhook-service │ No node port │ │
│ cld-streaming │ my-cluster-kafka-bootstrap │ No node port │ │
│ cld-streaming │ my-cluster-kafka-brokers │ No node port │ │
│ cld-streaming │ schema-registry-service │ application/9090 │ │
│ cld-streaming │ ssb-mve │ No node port │ │
│ cld-streaming │ ssb-postgresql │ No node port │ │
│ cld-streaming │ ssb-sse │ No node port │ │
└───────────────┴────────────────────────────────┴──────────────────┴─────┘
Pro Tip! Once your pods are running, use minikbue service to expose the UI endpoints.
# Check our the service UIs:
minikube service cloudera-surveyor-service --namespace cld-streaming
minikube service schema-registry-service --namespace cld-streaming
minikube service ssb-sse --namespace cld-streaming
This setup is your “sandbox” for building end-to-end streaming architectures.



📚 Resources
- Cloudera Flow Management (CFM) 3.0 Docs
- Cloudera Streaming Analytics (CSA) 1.5 Docs
- Cloudera Streams Messaging (CSM) 1.6 Docs
Macbook Setup with Cloudera Streaming Operators Deep Dive
I hope you enjoyed this foundational guide to getting your local environment ready with the Cloudera Streaming stack. By starting with these evaluation examples, you’ve established the baseline knowledge required to navigate the latest CSA 1.5, CSM 1.6, and CFM 3.0 Cloudera Operators.
If you have any questions about Cloudera Streaming Operators or the integration between these components, please reach out!