Version: 1.7 → 7.1
Relevant Github Repos/Branches (develop-es7 branch):
- hysds https://github.com/hysds/hysds.git
- container-builder https://github.com/hysds/container-builder.git
- Rest APIs
- Good for now, may move to hysds repo
- hysds_commons https://github.com/hysds/hysds_commons.git
- Will be removed because its code and scripts have been moved to the grq/mozart repo
- sdscli https://github.com/sdskit/sdscli.git
- lightweight-jobs https://github.com/hysds/lightweight-jobs.git
- Other repos that are installed in every hysds component
Big Changes
PLEASE LOOK AT AND USE THE NEW ELASTICSEARCH UTILITY CLASS: (SOURCE CODE)
- Only 1 type allowed in each index: _doc
- Need to manually enable all field text searches
- Removal of filtered since ES 5.0
- Split string into text and keyword
- text allows for more searching capabilities (Documentation)
- keyword allows for aggregation, etc. (Documentation)
- fielddata: true in the mapping allows for sorting (but we'll sort on the keyword instead): Documentation
- _default_ mapping deprecated in ES 6.0.0 (Link)
- workaround is using index templates: (Documentation)
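To illustrate the text/keyword split, here is a minimal sketch that builds a search body matching on the analyzed text field while sorting and aggregating on its keyword sub-field. The helper name facet_search_body and the example field dataset are hypothetical; the raw sub-field name is an assumption based on the convention used in our own indices.

```python
def facet_search_body(field, text, subfield="raw", size=10):
    """Match on the analyzed text field; sort and aggregate on its keyword sub-field."""
    keyword_field = f"{field}.{subfield}"
    return {
        # full-text search runs against the text field
        "query": {"match": {field: text}},
        # sorting on the keyword sub-field avoids enabling fielddata on text
        "sort": [{keyword_field: {"order": "asc"}}],
        # terms aggregations (facets) also need the keyword sub-field
        "aggs": {
            f"{field}_facet": {"terms": {"field": keyword_field, "size": size}}
        },
    }


# hypothetical field name and search text, for illustration only
body = facet_search_body("dataset", "S1-IW_SLC")
```

Aggregating directly on the text field instead is what triggers the "Fielddata is disabled on text fields by default" error.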
Changes in the geo coordinates query
Note: {"type": "geo_shape","tree": "quadtree","tree_levels": 26} makes uploading documents slow, specifically "tree_levels": 26
{ "query": { "bool": { "filter": { "geo_shape": { "location": { "shape": { "type": "polygon", "coordinates": [[<coordinates>]] }, "relation": "within" } } } } } }
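The query above can be generated from a list of points. This is a minimal sketch, not the HySDS implementation: the function name geo_shape_filter and the sample coordinates are hypothetical, and the default field name location is taken from the query shown here.

```python
def geo_shape_filter(coordinates, field="location", relation="within"):
    """Build an ES 7 bool/filter geo_shape query for a single polygon."""
    ring = [list(point) for point in coordinates]
    # GeoJSON polygon rings must be closed: first point == last point
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_shape": {
                        field: {
                            "shape": {"type": "polygon", "coordinates": [ring]},
                            "relation": relation,
                        }
                    }
                }
            }
        }
    }


# hypothetical bounding polygon given as [lon, lat] pairs
geo_query = geo_shape_filter([(-120, 34), (-118, 34), (-118, 36), (-120, 36)])
```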
- Changes in Percolator
- Removal of the .percolator type (Documentation)
- Instead a percolator field type must be configured prior to indexing percolator queries
- Complete overhaul in the percolate index mapping
Removal of _all: { "enabled": true } in indices, so we can no longer search across all fields
workaround is adding copy_to in the field mapping, especially in dynamic templating
- Does not work with multi-fields
"random_field_name": {
  "type": "keyword",
  "ignore_above": 256,
  "copy_to": "all_text_fields",  # DOES WORK
  "fields": {
    "keyword": {
      "type": "text",
      "copy_to": "all_text_fields"  # DOES NOT WORK
    }
  }
}
Proper mapping with text fields
"random_field_name": {
  "type": "text",
  "copy_to": "all_text_fields",
  "fields": {
    "keyword": {  # WE USE 'raw' instead of 'keyword' in our own indices
      "type": "keyword",  # THIS IS NEEDED FOR AGGREGATION ON THE FACETS FOR THE UI
      "ignore_above": 256
    }
  }
}
- Need to add the copy_to target field to the mapping: "all_text_fields": { "type": "text" }
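As a rough sketch of the mapping pattern above, the properties section can be assembled programmatically. The helper names text_field_mapping and build_properties are hypothetical, and the field names in the example are placeholders.

```python
def text_field_mapping(copy_to="all_text_fields", subfield="raw", ignore_above=256):
    """Text field copied into a catch-all field, with a keyword sub-field for facets."""
    return {
        "type": "text",
        "copy_to": copy_to,  # set on the top-level field, not on the sub-field
        "fields": {subfield: {"type": "keyword", "ignore_above": ignore_above}},
    }


def build_properties(field_names, copy_to="all_text_fields"):
    """Build a 'properties' mapping section, including the copy_to target itself."""
    properties = {name: text_field_mapping(copy_to) for name in field_names}
    # declare the copy_to target explicitly as a text field
    properties[copy_to] = {"type": "text"}
    return properties


# placeholder field names, for illustration only
properties = build_properties(["dataset", "system_version"])
```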
- General changes to the mapping
create example mapping called grq_v1.1_s1-iw_slc
copied example data into new ES index, using built in dynamic mapping to build initial mapping
mapping changes:
metadata.context to {"type": "object", "enabled": false}
properties.location to {"type": "geo_shape","tree": "quadtree"}
use type keyword to be able to use msearch:
"reason": "Fielddata is disabled on text fields by default. ... Alternatively use a keyword field instead."
- Changes to query_string
- removal of escaping literal double quotes in query_string
- old query_string from 1.7, would return S1B_IW_SLC__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16D
{ "query": { "query_string": { "query": "\"__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16\"", "default_operator": "OR" } } }
- new query_string returns equivalent document, requires wildcard * at the beginning and end of string
{ "query": { "query_string": { "default_field": "all_text_fields", "query": "*__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16*", "default_operator": "OR" } } }
- date searches do not appear to have changed much
{ "query": { "query_string": { "query": "starttime: [2019-01-01 TO 2019-01-31]", "default_operator": "OR" } } }
- can combine different fields as well
{ "query": { "query_string": { "fields": ["all_text_fields", "all_date_fields"], "query": "[2019-01-01 TO 2019-01-31] AND *__1SDV_20190109T020750_20190109T020817_014411*", "default_operator": "OR" } } }
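Since substring matches now need leading and trailing wildcards, the wrapping can be done in one place. This is a minimal sketch under the assumption that all_text_fields is the copy_to target; the function name substring_query is hypothetical, and it does not escape query_string reserved characters inside the term.

```python
def substring_query(term, default_field="all_text_fields", default_operator="OR"):
    """Build an ES 7 query_string body that matches `term` as a substring."""
    wrapped = term if term.startswith("*") else "*" + term
    wrapped = wrapped if wrapped.endswith("*") else wrapped + "*"
    return {
        "query": {
            "query_string": {
                "default_field": default_field,
                "query": wrapped,
                "default_operator": default_operator,
            }
        }
    }


qs = substring_query("__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16")
```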
- Removal of search_type=scan
- https://www.elastic.co/guide/en/elasticsearch/reference/5.5/breaking_50_search_changes.html
Requires changes in our HySDS code, wherever it uses search_type=scan
curl -X POST "http://localhost:9200/hysds_ios/_search?search_type=scan&scroll=10m&size=100"
{ "error": { "root_cause": [ { "type": "illegal_argument_exception", "reason": "No search type for [scan]" } ], "type": "illegal_argument_exception", "reason": "No search type for [scan]" }, "status": 400 }
# removing search_type=scan from the endpoint fixes this problem
curl -X POST "http://100.64.134.55:9200/user_rules/_search?scroll=10m&size=100"
{ "_scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWMUpVeFNzVXpTVktlUzFPc0NKa1dndw==", "took": 34, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 0, "relation": "eq" }, "max_score": null, "hits": [] } }
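A scroll loop without search_type=scan might look like the sketch below. It assumes a client object that exposes search, scroll and clear_scroll with keyword arguments in the style of elasticsearch-py 7.x; the function name scroll_hits is hypothetical and this is not the HySDS utility class. Unlike the old scan mode, the initial search response already carries the first page of hits, so the loop has to consume it before scrolling.

```python
def scroll_hits(es, index, body=None, scroll="10m", size=100):
    """Yield every hit from `index` using a plain scroll (no search_type=scan)."""
    if body is None:
        body = {"query": {"match_all": {}}}
    # note: no search_type argument is passed
    page = es.search(index=index, body=body, scroll=scroll, size=size)
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            for hit in page["hits"]["hits"]:
                yield hit
            page = es.scroll(scroll_id=scroll_id, scroll=scroll)
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # release the server-side scroll context once done
        if scroll_id:
            es.clear_scroll(scroll_id=scroll_id)
```

Usage would be along the lines of `for hit in scroll_hits(es, "user_rules"): ...`, where es is the assumed client.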
- Removal of filter and filtered: Link and Guide
- deprecated in version 5.x, move all logic to query and bool
- and, or, not changed to must, should and must_not
- if using should, will need to add minimum_should_match: 1
# from this:
{ "filtered": { "filter": { "and": [ { "match": { "tags": "ISL" } }, { "range": { "metadata.ProductReceivedTime": {"gte": "2020-03-24T00:00:00.000000Z"} } }, { "range": { "metadata.ProductReceivedTime": {"lte": "2020-03-24T23:59:59.999999Z"} } } ] } } }
# change to this:
{ "query": { "bool": { "must": [ { "match": { "tags": "ISL" } } ], "filter": [ { "range": { "metadata.ProductReceivedTime": {"gte": "2020-03-24T00:00:00.000000Z"} } }, { "range": { "metadata.ProductReceivedTime": {"lte": "2020-03-24T23:59:59.999999Z"} } } ] } } }
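For rules that still carry the old syntax, a mechanical rewrite along these lines may help. It is only a sketch: the function names are hypothetical, it handles just the and/or/not filters (list form or the {"filters": [...]} form), and it keeps every converted clause in filter context rather than moving match clauses into must as in the hand-written example.

```python
def _clauses(value):
    """1.x and/or accept either a list or {"filters": [...]}."""
    return value["filters"] if isinstance(value, dict) else value


def convert_filter(old):
    """Recursively rewrite 1.x and/or/not filters as bool clauses."""
    if "and" in old:
        return {"bool": {"must": [convert_filter(c) for c in _clauses(old["and"])]}}
    if "or" in old:
        return {
            "bool": {
                "should": [convert_filter(c) for c in _clauses(old["or"])],
                "minimum_should_match": 1,  # required so at least one clause must match
            }
        }
    if "not" in old:
        inner = old["not"]
        inner = inner.get("filter", inner)  # 1.x also allowed {"not": {"filter": {...}}}
        return {"bool": {"must_not": [convert_filter(inner)]}}
    return old  # leaf clause (term, range, match, ...) is kept as-is


def convert_filtered(old_query):
    """Rewrite {"filtered": {...}} as {"query": {"bool": {...}}}."""
    filtered = old_query["filtered"]
    bool_query = {}
    if "query" in filtered:
        bool_query["must"] = [filtered["query"]]
    if "filter" in filtered:
        bool_query["filter"] = [convert_filter(filtered["filter"])]
    return {"query": {"bool": bool_query}}
```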
Running Elasticsearch 7 on EC2 instance
To make Elasticsearch listen on all network interfaces (0.0.0.0), we need to edit the config/elasticsearch.yml file
network.host: 0.0.0.0
cluster.name: grq_cluster
node.name: ESNODE_CYR
node.master: true
node.data: true
transport.host: localhost
transport.tcp.port: 9300
http.port: 9200
discovery.zen.minimum_master_nodes: 2
# allows the UI to talk to Elasticsearch (in production we would put the actual hostname of the UI)
http.cors.enabled: true
http.cors.allow-origin: "*"
Running Kibana on EC2 instance
Install Kibana from the command line
curl -O https://artifacts.elastic.co/downloads/kibana/kibana-7.1.1-darwin-x86_64.tar.gz
tar -xzf kibana-7.1.1-darwin-x86_64.tar.gz
cd kibana-7.1.1-darwin-x86_64/
Edit the config/kibana.yml file to expose host 0.0.0.0
server.host: 0.0.0.0
Index Template
- So that every index created automatically follows this template for its mapping
- grq2 _default_ mapping template: Link
- python code to create the index template: Link
- Documentation
{ "order": 0, "index_patterns": [ "{{ prefix }}_*" ], "settings": { "index.refresh_interval": "5s", "analysis": { "analyzer": { "default": { "filter": [ "standard", "lowercase", "word_delimiter" ], "tokenizer": "keyword" } } } }, "mappings": { "dynamic_templates": [ { "integers": { "match_mapping_type": "long", "mapping": { "type": "integer" } } }, { "strings": { "match_mapping_type": "string", "mapping": { "norms": false, "type": "text", "copy_to": "all_text_fields", "fields": { "raw": { "type": "keyword", "ignore_above": 256 } } }, "match": "*" } } ], "properties": { "browse_urls": { "type": "text", "copy_to": "all_text_fields" }, "urls": { "type": "text", "copy_to": "all_text_fields" }, "location": { "tree": "quadtree", "type": "geo_shape" }, "center": { "tree": "quadtree", "type": "geo_shape" }, "starttime": { "type": "date" }, "endtime": { "type": "date" }, "creation_timestamp": { "type": "date" }, "metadata": { "properties": { "context": { "type": "object", "enabled": false } } }, "prov": { "properties": { "wasDerivedFrom": { "type": "keyword" }, "wasGeneratedBy": { "type": "keyword" } } }, "all_text_fields": { "type": "text" } } }, "aliases": { "{{ alias }}": {} } }
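The {{ prefix }} and {{ alias }} placeholders have to be filled in before the template is sent to Elasticsearch. The sketch below uses plain string replacement as a stand-in; the linked code may use a real template engine instead. The function name render_template, the shortened template text and the value "grq" are illustrative assumptions.

```python
import json


def render_template(template_text, values):
    """Fill '{{ name }}' placeholders and parse the result as JSON."""
    for key, value in values.items():
        template_text = template_text.replace("{{ " + key + " }}", value)
    return json.loads(template_text)


# shortened stand-in for the full template above
example_text = '{"index_patterns": ["{{ prefix }}_*"], "aliases": {"{{ alias }}": {}}}'
rendered = render_template(example_text, {"prefix": "grq", "alias": "grq"})

# The rendered dict would then be the request body for PUT _template/<name>.
```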
Percolator
Percolator needs to be compatible with ES 7.1 (not applicable because HySDS uses its own version of percolator)
- User Rules (Documentation for user rules triggering)
- mapping added in mozart server: /home/ops/mozart/ops/tosca/configs/user_rules_dataset.mapping
- python code to create the user_rules index: Link
- Mapping template for user_rules index: Link
# PUT user_rules
{ "mappings": { "properties": { "creation_time": { "type": "date" }, "enabled": { "type": "boolean", "null_value": true }, "job_type": { "type": "keyword" }, "kwargs": { "type": "keyword" }, "modification_time": { "type": "date" }, "modified_time": { "type": "date" }, "passthru_query": { "type": "boolean" }, "priority": { "type": "long" }, "query": { "type": "object", "enabled": false }, "query_all": { "type": "boolean" }, "query_string": { "type": "text" }, "queue": { "type": "text" }, "rule_name": { "type": "keyword" }, "username": { "type": "keyword" }, "workflow": { "type": "keyword" } } } }
hysds_ios Index
- Github Link to template.json: Link
- Python code to create hysds_ios index template: Link
- Follow HySDS and Job-Spec documentation for Jenkins build: Link
{ "order": 0, "template": "{{ index }}", "settings": { "index.refresh_interval": "5s", "analysis": { "analyzer": { "default": { "filter": [ "standard", "lowercase", "word_delimiter" ], "tokenizer": "keyword" } } } }, "mappings": { "dynamic_templates": [ { "integers": { "match_mapping_type": "long", "mapping": { "type": "integer" } } }, { "strings": { "match_mapping_type": "string", "mapping": { "norms": false, "type": "text", "copy_to": "all_text_fields", "fields": { "raw": { "type": "keyword", "ignore_above": 256 } } }, "match": "*" } } ], "properties": { "_timestamp": { "type": "date", "store": true } } } }