Customizing Rollover of old jobs in Mozart
Overview
HySDS Core release version 5 adds the capability to roll over old jobs from Mozart's Elasticsearch. The associated Jira ticket can be found here:
HC-177: investigate improving scalability of mozart ES indices for job, worker, task and event statuses by creating rolling indices by date (status: Open, legacy ticket)
Elasticsearch's Index Lifecycle Management (ILM) policy is used to determine when to roll over old indices: https://www.elastic.co/guide/en/elasticsearch/reference/7.10/index-lifecycle-management.html
Previously, all jobs were stored under the job_status-current index. With this feature, jobs are partitioned by date under indices named job_status-YYYY.MM.DD.
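As a small illustration of the naming scheme, the sketch below builds a dated index name from a calendar date. The helper name job_status_index is hypothetical, and which timestamp HySDS actually uses to choose the date is not stated here, so treat the use of the current UTC date as an assumption.

```python
from datetime import date, datetime, timezone


def job_status_index(day: date, prefix: str = "job_status") -> str:
    """Return a dated index name in the form <prefix>-YYYY.MM.DD."""
    return f"{prefix}-{day:%Y.%m.%d}"


if __name__ == "__main__":
    # Assumption: the current UTC date is used only for demonstration.
    print(job_status_index(datetime.now(timezone.utc).date()))
```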
This wiki will describe the default behavior of the system and how to customize it to adapt to project needs.
ILM (Index Lifecycle Management) Policy
A default ILM policy comes with HySDS Core version 5. It is found in the sdscli repository, along with the index templates it depends on:
sdscli/sdscli at develop-v5 · sdskit/sdscli
NOTE: At the time of this writing, to avoid breaking other projects that use hysds_release=develop, this feature is merged into a develop-v5 branch until an official v5 of HySDS Core is cut. At that time, the feature will be fully merged into the develop branch.
The default ILM policy is as follows:
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "90d",
        "actions": {
          "migrate": {
            "enabled": false
          },
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "97d",
        "actions": {
          "set_priority": {
            "priority": 0
          },
          "migrate": {
            "enabled": false
          },
          "freeze": {}
        }
      },
      "delete": {
        "min_age": "104d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
At a high level, this will do the following:
Upon creation, the index goes into the hot phase.
After 90 days, it moves into the warm phase. It can still be written to there, so that jobs that were still running when the index crossed from hot to warm can have their statuses updated.
7 days after that (97 days after creation), it moves into the cold phase, where it is frozen by the freeze action in the policy. At that point, its jobs no longer appear in Figaro.
Finally, 7 days after that (104 days after creation), the index is deleted from Elasticsearch.
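The timeline above can be sketched as a lookup from index age to phase. This is only an illustration: phase_for_age is a hypothetical helper, the thresholds are copied from the default policy, and real transitions depend on when Elasticsearch next evaluates the policy, so they are not instantaneous.

```python
from datetime import timedelta

# min_age thresholds from the default policy, in days since index creation.
PHASE_MIN_AGE_DAYS = [("hot", 0), ("warm", 90), ("cold", 97), ("delete", 104)]


def phase_for_age(age: timedelta) -> str:
    """Return the latest phase whose min_age the index has reached."""
    current = "hot"
    for name, min_days in PHASE_MIN_AGE_DAYS:
        if age >= timedelta(days=min_days):
            current = name
    return current
```

For example, an index created 95 days ago would be expected in the warm phase, and one created 100 days ago in the cold phase.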
See the comments section in HC-447, which details how this was tested and what you can expect to see in the HySDS UI at the different phases: HC-447: investigate and implement mechanism for rollover of old ES docs in job_status from mozart ES (status: Closed)
Override Default Behavior
In order to override the default ILM behavior, do the following:
Get the following files from the sdscli repo and put them into the ~/.sds/files area on your Mozart:
es_ilm_policy_mozart.json
event_status.template
job_status.template
task_status.template
worker_status.template
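A quick way to confirm that all five files are in place is sketched below. The helper name missing_overrides is hypothetical; the file names and the ~/.sds/files location come from the list above.

```python
from pathlib import Path
from typing import List

# File names taken from the list above.
OVERRIDE_FILES = [
    "es_ilm_policy_mozart.json",
    "event_status.template",
    "job_status.template",
    "task_status.template",
    "worker_status.template",
]


def missing_overrides(files_dir: Path) -> List[str]:
    """Return the override files that are not present in files_dir."""
    return [name for name in OVERRIDE_FILES if not (files_dir / name).is_file()]
```

Calling missing_overrides(Path.home() / ".sds" / "files") on Mozart should return an empty list once every file has been copied.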
At this point, you can update your ILM policy to fit your project needs. The Elasticsearch documentation referenced in the Overview section has more details on how you can customize it.
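When editing the min_age values, it can help to check that later phases do not start before earlier ones. The sketch below is a local sanity check under stated assumptions, not a rule taken from this wiki: parse_min_age and check_phase_order are hypothetical helpers, and only the d, h, m, s and ms time units are handled.

```python
import re
from datetime import timedelta
from typing import List

_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds", "ms": "milliseconds"}


def parse_min_age(value: str) -> timedelta:
    """Parse a value such as '90d' or '0ms' into a timedelta."""
    match = re.fullmatch(r"(\d+)(ms|d|h|m|s)", value)
    if match is None:
        raise ValueError(f"unsupported min_age value: {value!r}")
    number, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(number)})


def check_phase_order(policy: dict) -> List[str]:
    """Return one message per phase whose min_age is below the previous phase's."""
    phases = policy["policy"]["phases"]
    order = [name for name in ("hot", "warm", "cold", "delete") if name in phases]
    problems = []
    for earlier, later in zip(order, order[1:]):
        earlier_age = parse_min_age(phases[earlier].get("min_age", "0ms"))
        later_age = parse_min_age(phases[later].get("min_age", "0ms"))
        if later_age < earlier_age:
            problems.append(f"{later} min_age is earlier than {earlier} min_age")
    return problems
```

Loading es_ilm_policy_mozart.json with json.load and passing the result to check_phase_order should return an empty list for the default policy.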
The templates should be included here as well, mainly because each of them contains the number of shards setting:
"settings": {
  "number_of_shards": 8,
  "index": {
    "refresh_interval": "5s"
  }
}
It is best to verify that the shards setting here is consistent with your deployment.
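One way to review the shard settings across all templates at once is sketched below. It assumes each *_status.template file parses as plain JSON as-is and that number_of_shards sits under "settings" (or under "settings.index"); shard_counts is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def shard_counts(files_dir: Path) -> Dict[str, Optional[int]]:
    """Map each *_status.template file name to its number_of_shards, or None if unset."""
    counts: Dict[str, Optional[int]] = {}
    for path in sorted(files_dir.glob("*_status.template")):
        settings = json.loads(path.read_text()).get("settings", {})
        value = settings.get("number_of_shards")
        if value is None:
            # Fall back to the nested form, in case a template uses it.
            value = settings.get("index", {}).get("number_of_shards")
        counts[path.name] = None if value is None else int(value)
    return counts
```

Running shard_counts(Path.home() / ".sds" / "files") gives a per-template view that can be compared against what your deployment expects.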
Testing
At the time of this writing, HySDS Core is in the middle of transitioning from v4 to v5. While that is occurring, the rollover feature temporarily resides under a develop-v5 branch in the repos that were updated to add this rollover, which include hysds and sdscli (the two repos checked out in the Terraform snippet below).
For NISAR, we have temporarily updated Terraform so that we can bring this feature into our development clusters for further vetting and testing. The following provisioner runs before we call the sds update commands that push the HySDS Core code out to the cluster:
provisioner "remote-exec" {
  inline = [
    "set -ex",
    "source ~/.bash_profile",
    # NOTE THAT THIS WILL BE REMOVED ONCE WE MERGE THE develop-v5 changes into the develop branch
    # get v5 develop versions of sdscli and hysds repo
    "if [ \"${var.hysds_release}\" = \"develop\" ]; then",
    " cd ~/mozart/ops/hysds",
    " git checkout develop-v5",
    " pip install -e .",
    " cd ~/mozart/ops/sdscli",
    " git checkout develop-v5",
    " pip install -e .",
    "fi",
  ]
}