HySDS in Kubernetes (k8)
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.
K8 pods run an instance of a docker image, similar to a docker container. We can run all of our HySDS services as K8 pods/deployments (& services if they need to be exposed to users)
kubectl is the CLI tool used to communicate with your k8 cluster
New Dockerfile
I want to make the base docker image for the HySDS services as light as possible, so it would be best to not use hysds/pge-base because it’s ~3.5GB and installs a lot of extra tools needed for PGE execution
Resorted to creating a new docker image from centos:7 and installed only python 3.7 and the core hysds libraries; the new image is ~850MB (but will try to shrink it more)
in the future we can set an ARG to a python version (3.7.9) and give users the option of installing a different version of python with --build-arg
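For example, the override could look like this (the image tag names are illustrative, not from the HySDS repos):

```shell
# build with the default python version baked into the Dockerfile (3.7.9)
docker build -t hysds-base .

# override the python version at build time
docker build --build-arg VERSION=3.7.4 -t hysds-base:py374 .
```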
the docker images for the various services in HySDS (mozart, grq2, pele rest APIs, celery workers) will be branched off this image and run in a k8 environment
FROM centos:7
ARG HOME=/root
ARG VERSION="3.7.9"
WORKDIR $HOME
# RUN yum update -y && \
RUN yum install gcc openssl-devel bzip2-devel libffi-devel openldap-devel readline-devel make wget git -y && \
cd /tmp && \
# installing python 3
wget https://www.python.org/ftp/python/${VERSION}/Python-${VERSION}.tgz && \
tar xzf Python-${VERSION}.tgz && \
cd Python-${VERSION} && \
./configure --enable-optimizations && \
make altinstall && \
ln -s /usr/local/bin/python${VERSION:0:3} /usr/local/bin/python3 && \
ln -s /usr/local/bin/pip${VERSION:0:3} /usr/local/bin/pip3 && \
pip3 install --no-cache-dir --upgrade pip && \
pip3 install --no-cache-dir gnureadline && \
rm -f /tmp/Python-${VERSION}.tgz && \
rm -rf /tmp/Python-${VERSION} && \
# installing HySDS libraries
cd $HOME && \
git clone https://github.com/hysds/prov_es.git && \
git clone https://github.com/hysds/osaka.git && \
git clone https://github.com/hysds/hysds_commons.git && \
git clone https://github.com/hysds/hysds.git && \
pip3 install --no-cache-dir -e prov_es/ && \
pip3 install --no-cache-dir -e osaka/ && \
pip3 install --no-cache-dir -e hysds_commons/ && \
pip3 install --no-cache-dir -e hysds/ && \
yum clean all && \
rm -rf /var/cache/yum && \
rm -rf /tmp/*
WORKDIR $HOME
CMD ["/bin/bash"]
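One detail worth noting: the ln -s lines above rely on bash substring expansion, and ${VERSION:0:3} only yields the correct major.minor prefix for single-digit minor versions (3.10.2 would become 3.1). A quick sketch of both expansions:

```shell
VERSION="3.7.9"
# substring expansion: first three characters of the version string
echo "${VERSION:0:3}"   # -> 3.7

# suffix removal handles double-digit minor versions too
VERSION="3.10.2"
echo "${VERSION:0:3}"   # -> 3.1 (wrong prefix)
echo "${VERSION%.*}"    # -> 3.10
```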
Kubernetes YAML files
Example of all services in mozart in a kubernetes environment; can run on your local k8 cluster (minikube or docker for desktop)
k8 services and deployments are defined in a .yaml file
a k8 service exposes your “pod” (similar to a docker container) to allow other entities to communicate with it (another pod or a user)
service.yml
apiVersion: v1
kind: Service
metadata:
  name: mozart
  labels:
    app: mozart
spec:
  ports:
    - port: 8888
  selector:
    app: mozart
  type: LoadBalancer
deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mozart
  labels:
    app: mozart
spec:
  # replicas: 2  # will allow you to run multiple instances of the app
  selector:
    matchLabels:
      app: mozart
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mozart
    spec:
      containers:
        - name: mozart
          image: mozart:test
          # env:  # passing environment variables
          #   - name: WORKERS
          #     value: "4"
          ports:
            - containerPort: 8888
              name: mozart
          volumeMounts:
            - ...
      volumes:
        - ...
Use the kubectl CLI tool to deploy your application in your kubernetes cluster
Your deployment and service are now running
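With the two files above saved as service.yml and deployment.yml, deploying and checking the result looks roughly like this (assumes kubectl is already pointed at a running cluster):

```shell
kubectl apply -f service.yml
kubectl apply -f deployment.yml

# confirm the pod and service came up
kubectl get pods -l app=mozart
kubectl get service mozart
```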
HySDS services:
Mozart
Celery workers (wrapped with supervisord)
orchestrator (job validation)
etc.
GRQ
Elasticsearch
aws-es-proxy (if we’re using AWS ES for GRQ)
Factotum
Celery workers
user_rules (jobs & datasets) processor
factotum job worker (small & large)
etc.
Metrics
CI
Before, HySDS would run its services on their respective machine/instance (Mozart, GRQ, Metrics & Factotum), but moving to a k8 deployment gets rid of that, as the engine will determine which k8 node runs which service
Stateless Application(s)
Mozart rest API
Logstash
Celery workers
GRQ2 rest API
Pele rest API
Kibana
sdswatch
stateless applications are applications that don’t store data (besides logs), so their deployment in k8 is very simple & straightforward
can scale out easily without worrying about a leader/worker architecture; just add replicas: # in the deployment.yml file and your k8 LoadBalancer will handle the rest
most of the work revolves around creating a PersistentVolume to store logs & maybe cache data
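As a sketch, scaling the mozart deployment above out to three pods only requires the replicas field (the count here is arbitrary):

```yaml
spec:
  replicas: 3          # run three identical mozart pods behind the Service
  selector:
    matchLabels:
      app: mozart
```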
Stateful Application(s)
Elasticsearch: Run & Deploy Elasticsearch on Kubernetes [Best Practices] - Sematext
Redis
RabbitMQ
stateful applications save client data and deployments are more complicated
examples are databases, queues and cache stores
scaling out stateful applications requires the usage of a StatefulSet
it keeps the ordering of pods in your deployment so that it can accommodate a leader/worker architecture
example provided by RabbitMQ’s documentation: https://github.com/rabbitmq/diy-kubernetes-examples/blob/master/gke/statefulset.yaml
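For orientation, a StatefulSet manifest has roughly this shape (a minimal Redis sketch, not taken from the RabbitMQ example; names and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis          # headless Service giving each pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:6
          ports:
            - containerPort: 6379
  volumeClaimTemplates:       # each pod gets its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```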
Another recommended option is to move your stateful applications into cloud-managed services
Redis is supported by AWS ElastiCache, Azure Cache & GCP Memorystore
RabbitMQ is supported by Amazon MQ; not sure if Azure or GCP support it
AWS Elasticsearch Service (but adds complexity, as every request will have to be signed)
Elastic on Azure, but not sure how it works
Helm
Helm is a package manager (similar to homebrew, yum) for kubernetes; its repositories host k8 “packages” (charts), and it “helps you manage Kubernetes applications — Helm Charts help you define, install, and upgrade even the most complex Kubernetes application.”
Use Helm v3; v2 has known security vulnerabilities (its cluster-side Tiller component runs with broad privileges)
Deploying stateful applications can often be complicated and can take a lot of k8 yaml files to get it to work, especially if you’re planning on running a multi-node setup: example for RabbitMQ
Using helm to handle the templating and .yml creation makes things much easier: example by Bitnami
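e.g., installing the Bitnami RabbitMQ chart is only a couple of commands (the replica count flag is illustrative):

```shell
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install rabbitmq bitnami/rabbitmq --set replicaCount=3
```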
Concerns
Kubernetes is dropping support for Docker as a container runtime
it will use containerd, which contains only the runtime component of Docker
saves resources (RAM, CPU, storage, etc.) & reduces security risk
Kubernetes on cloud (EKS, GKE & AKS) users don’t have to worry about it, but it will affect users who are managing a K8 cluster themselves
SDSCLI - https://github.com/sdskit/sdscli
moving to kubernetes will drastically affect sdscli
it was written under the design of SSHing into the other HySDS components (grq, factotum & metrics) and running commands such as pip install, etc.
it relies on fabric to copy files from mozart to the other HySDS components
for example, sds update grq will clear out ~/sciflo/ops/ and copy over all the necessary files/repos from ~/mozart/ops/ to grq
kubectl can copy files from pod -> pod (kubectl cp my-pod:my-file my-file) but it can potentially mess things up
this will not work with k8 b/c every service is completely de-coupled and in its own environment
sds [start|stop|reset] [mozart|grq|metrics|factotum] will become somewhat obsolete (in its current state) because there’s no need for supervisord to run its services
services will be running in their own standalone pod(s)
instead we will use kubectl to manage the k8 services
supervisord may still be used in a k8 pod for the celery workers
b/c we have many celery workers (user_rules processing, orchestrator, etc.), wrapping them in supervisord in a pod may clean things up
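A sketch of what that supervisord config inside a worker pod could look like (the celery app module and queue names here are assumptions, not taken from the HySDS repos):

```ini
[supervisord]
nodaemon=true   ; keep supervisord in the foreground so the pod stays alive

[program:orchestrator]
command=celery -A hysds worker -Q jobs_processed -c 1 -l INFO
autorestart=true

[program:user_rules]
command=celery -A hysds worker -Q user_rules_job -c 1 -l INFO
autorestart=true
```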
will need to see how sds ship will be affected by kubernetes