HySDS in Kubernetes (k8s)

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.

K8s pods run an instance of a Docker image, similar to a Docker container. We can run all of the HySDS services as k8s pods/deployments (& k8s services for anything that needs to be exposed to users)

kubectl is the CLI tool used to communicate with your k8s cluster
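For quick reference, a few common kubectl commands (assuming kubectl is installed and pointed at a running cluster):

kubectl cluster-info                # show cluster endpoint info
kubectl get nodes                   # list the nodes in the cluster
kubectl get pods -A                 # list pods across all namespaces
kubectl describe pod <pod-name>     # inspect a single pod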

New Dockerfile

I want to make the base Docker image for the HySDS services as light as possible, so it would be best not to use hysds/pge-base, because it is ~3.5 GB and installs a lot of extra tools that are only needed for PGE execution

I resorted to creating a new Docker image from centos:7 and installed only Python 3.7 and the core HySDS libraries; the new image is ~850 MB (but I will try to shrink it more)

  • in the future we can set an ARG for the Python version (3.7.9) and give users the option of installing a different version of Python with --build-arg
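For example, a minimal sketch of such a build (the hysds/base:test image tag is hypothetical):

# build the base image, overriding the default python version via the ARG
docker build --build-arg VERSION="3.8.5" -t hysds/base:test .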

the Docker images for the various services in HySDS (the mozart, grq2 & pele REST APIs, and the celery workers) will be branched off this image and run in a k8s environment

FROM centos:7

ARG HOME=/root
ARG VERSION="3.7.9"

WORKDIR $HOME

# RUN yum update -y && \
RUN yum install gcc openssl-devel bzip2-devel libffi-devel openldap-devel readline-devel make wget git -y && \
    cd /tmp && \
    # installing python 3
    wget https://www.python.org/ftp/python/${VERSION}/Python-${VERSION}.tgz && \
    tar xzf Python-${VERSION}.tgz && \
    cd Python-${VERSION} && \
    ./configure --enable-optimizations && \
    make altinstall && \
    ln -s /usr/local/bin/python${VERSION:0:3} /usr/local/bin/python3 && \
    ln -s /usr/local/bin/pip${VERSION:0:3} /usr/local/bin/pip3 && \
    pip3 install --no-cache-dir --upgrade pip && \
    pip3 install --no-cache-dir gnureadline && \
    rm -f /tmp/Python-${VERSION}.tgz && \
    rm -rf /tmp/Python-${VERSION} && \
    # installing HySDS libraries
    cd $HOME && \
    git clone https://github.com/hysds/prov_es.git && \
    git clone https://github.com/hysds/osaka.git && \
    git clone https://github.com/hysds/hysds_commons.git && \
    git clone https://github.com/hysds/hysds.git && \
    pip3 install --no-cache-dir -e prov_es/ && \
    pip3 install --no-cache-dir -e osaka/ && \
    pip3 install --no-cache-dir -e hysds_commons/ && \
    pip3 install --no-cache-dir -e hysds/ && \
    yum clean all && \
    rm -rf /var/cache/yum && \
    rm -r /tmp/*

WORKDIR $HOME

CMD ["/bin/bash"]

Kubernetes YAML files

Example of all of the Mozart services in a Kubernetes environment; this can run on your local k8s cluster (minikube or Docker Desktop)
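For example, one way to bring up a local cluster with minikube (Docker Desktop's built-in Kubernetes works as well):

# start a local single-node kubernetes cluster
minikube start
# point the local docker CLI at minikube's docker daemon so locally
# built images (e.g., the mozart:test image below) are visible to the cluster
eval $(minikube docker-env)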

k8s services and deployments are defined in .yaml files

a k8s service exposes your “pod” (similar to a docker container) so that other entities (another pod or a user) can communicate with it

service.yml

apiVersion: v1
kind: Service
metadata:
  name: mozart
  labels:
    app: mozart
spec:
  ports:
    - port: 8888
  selector:
    app: mozart
  type: LoadBalancer

deployment.yml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mozart
  labels:
    app: mozart
spec:
  # replicas: 2  # will allow you to run multiple instances of the app
  selector:
    matchLabels:
      app: mozart
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mozart
    spec:
      containers:
        - name: mozart
          image: mozart:test
          # env:  # passing environment variables
          #   - name: WORKERS
          #     value: "4"
          ports:
            - containerPort: 8888
              name: mozart
          volumeMounts:
            - ...
      volumes:
        - ...

Use the kubectl CLI tool to deploy your application to your Kubernetes cluster
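Assuming the two manifests above are saved as service.yml and deployment.yml:

kubectl apply -f service.yml
kubectl apply -f deployment.yml

# verify that everything came up
kubectl get deployments
kubectl get services
kubectl get pods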

Your deployment and service are now running

HySDS services:


Previously, HySDS ran its services on their respective machines/instances (Mozart, GRQ, Metrics & Factotum), but moving to a k8s deployment will get rid of that, as the engine determines which k8s node runs each service

Stateless Application(s)

  • Mozart REST API

  • Logstash

  • Celery workers

  • GRQ2 REST API

  • Pele REST API

  • Kibana

  • sdswatch

stateless applications are applications that don’t store data (besides logs), so their deployment in k8s is very simple & straightforward

they can scale out easily without worrying about a leader/worker architecture; just add replicas: <n> in the deployment.yml file and your k8s LoadBalancer will handle the rest, as shown below
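A minimal sketch, reusing the mozart deployment above (the count of 3 is arbitrary):

# deployment.yml (excerpt)
spec:
  replicas: 3  # run 3 identical mozart pods behind the k8s Service
  selector:
    matchLabels:
      app: mozart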

most of the work revolves around creating a PersistentVolume to store logs & maybe cache data
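A minimal sketch, assuming a PersistentVolumeClaim named mozart-logs (the name, size, and mount path are all assumptions):

# pvc.yml: request persistent storage for service logs
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mozart-logs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

# deployment.yml (excerpt): mount the claim where the service writes its logs
      containers:
        - name: mozart
          volumeMounts:
            - name: logs
              mountPath: /root/mozart/logs  # hypothetical log directory
      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: mozart-logs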

Stateful Application(s)

stateful applications save client data, so their deployments are more complicated

examples are databases, queues and cache stores

scaling out stateful applications requires the use of a StatefulSet
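A minimal sketch of a StatefulSet for a single-node Redis (the name, image tag, and volume size here are assumptions). Unlike a Deployment, a StatefulSet gives each pod a stable identity (redis-0, redis-1, …) and its own persistent volume via volumeClaimTemplates:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis  # headless service that gives each pod a stable DNS name
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:6
          ports:
            - containerPort: 6379
          volumeMounts:
            - name: data
              mountPath: /data  # where redis persists its data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi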

Another commonly recommended option is to move your stateful applications into cloud-managed services (e.g., a managed Elasticsearch, Redis, or RabbitMQ offering)

Helm

Helm is a package manager (similar to Homebrew or yum) and chart repository for Kubernetes, which hosts k8s “packages” (charts) and …

… helps you manage Kubernetes applications — Helm Charts help you define, install, and upgrade even the most complex Kubernetes application.

Use Helm v3; v2 has security vulnerabilities (its server-side Tiller component required broad cluster permissions)

Deploying stateful applications can often be complicated and can take a lot of k8s yaml files to get working, especially if you’re planning on running a multi-node setup: example for RabbitMQ

Using helm to handle the templating and .yml creation makes things much easier: example by Bitnami
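For instance, a minimal sketch of installing Bitnami's RabbitMQ chart (the release name hysds-mq and the replica count are hypothetical):

# add the Bitnami chart repository and install a 3-node RabbitMQ cluster
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install hysds-mq bitnami/rabbitmq --set replicaCount=3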

Concerns

Kubernetes is dropping support for Docker as a container runtime (the dockershim deprecation)

  • it keeps only the runtime component of Docker: containerd

  • saves resources (RAM, CPU, storage, etc.) & reduces security risk

Users of managed Kubernetes on the cloud (EKS, GKE & AKS) don’t have to worry about it, but it will affect users who are managing a k8s cluster themselves (images built with Docker will still run, since containerd runs the same OCI-compliant images)

SDSCLI

moving to kubernetes will drastically affect sdscli

  • it was designed around SSHing into the other HySDS components (grq, factotum & metrics) and running commands such as pip install, etc.

  • it relies on fabric to copy files from mozart to other HySDS components

    • for example, sds update grq will clear out ~/sciflo/ops/ and copy over all the necessary files/repos from ~/mozart/ops/ to grq

    • files can be copied to/from pods with kubectl cp (e.g., kubectl cp my-pod:my-file my-file copies a file from a pod to the local machine), but doing so can potentially mess things up

  • this will not work with k8s b/c every service is completely decoupled and runs in its own environment

  • sds [start|stop|reset] [mozart|grq|metrics|factotum] will become somewhat obsolete (in its current state) because there’s no need for supervisord to run its services

    • services will be running in their own standalone pod(s)

    • instead we will use kubectl to manage the k8s services (see the sketch after this list)

    • supervisord may still be used in the k8s pod for celery workers

      • b/c we have many celery workers (user_rules processing, orchestrator, etc.), wrapping them in supervisord in a pod may clean things up

  • will need to see how sds ship will be affected by kubernetes
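As referenced above, a rough sketch of kubectl equivalents for sds [start|stop|reset], assuming each service runs as its own k8s deployment (the mozart deployment name matches the earlier example):

kubectl scale deployment mozart --replicas=1  # "start" the service
kubectl scale deployment mozart --replicas=0  # "stop" the service
kubectl rollout restart deployment mozart     # restart/"reset" the service
kubectl logs deployment/mozart                # view the service logs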

Note: JPL employees can also get answers to HySDS questions at Stack Overflow Enterprise: