Version: 1.7 → 7.1

Relevant GitHub Repos/Branches (develop-es7 branch):

...

  • Please look at and use the new Elasticsearch utility class: (source code)

  • Only one type is allowed in each index: _doc
  • Searching across all fields as text must now be enabled manually
  • The filtered query was removed in ES 5.0
  • The string type is split into text and keyword
  • fielddata: true in the mapping allows sorting on text fields (but we'll sort on the keyword field instead): Documentation
  • Support for a z coordinate in geoshapes: documentation
    • It won't affect searches but adds more flexibility in location data
  • _default_ mapping deprecated in ES 6.0.0 (Link)
  • Changes in the geo coordinates query

    • Note: {"type": "geo_shape","tree": "quadtree","tree_levels": 26} makes uploading documents slow, specifically "tree_levels": 2


    • Code Block
      {
        "query": {
          "bool": {
            "filter": {
              "geo_shape": {
                "location": {
                  "shape": {
                    "type": "polygon",
                    "coordinates": [[<coordinates>]]
                  },
                  "relation": "within"
                }
              }
            }
          }
        }
      }
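
As a minimal sketch, the query above could be built from a list of coordinates in Python. The function name geo_shape_within_query and the closed-ring check are assumptions for illustration; only the query structure comes from the example above.

```python
def geo_shape_within_query(ring, field="location"):
    """Build the bool/filter geo_shape query shown above for one polygon ring.

    ring: list of [lon, lat] pairs; GeoJSON polygons expect a closed ring,
    so the first and last points must be identical.
    """
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("polygon ring needs at least 4 points and must be closed")
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_shape": {
                        field: {
                            "shape": {"type": "polygon", "coordinates": [ring]},
                            "relation": "within",
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    import json

    # Hypothetical bounding triangle, for illustration only.
    ring = [[-120.0, 34.0], [-118.0, 34.0], [-118.0, 36.0], [-120.0, 34.0]]
    print(json.dumps(geo_shape_within_query(ring), indent=2))
```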


  • Changes in Percolator


  • The _all: { "enabled": true } setting was removed from index mappings, so we can no longer search across all fields by default

    • workaround is adding copy_to in field mapping, especially in dynamic templating

    • Link to copy_to documentation

    • Does not work with multi-fields

      • Code Block
        "random_field_name": {
          "type": "keyword",
          "ignore_above": 256,
          "copy_to": "all_text_fields", # DOES WORK
          "fields": {
            "keyword": {
              "type": "text",
              "copy_to": "all_text_fields" # DOES NOT WORK
            }
          }
        }


    • Proper mapping with text fields


      • Code Block
        "random_field_name": {
          "type": "text",
          "copy_to": "all_text_fields",
          "fields": {
            "keyword": { # WE USE 'raw' instead of 'keyword' in our own indices
              "type": "keyword", # THIS IS NEEDED FOR AGGREGATION ON THE FACETS FOR THE UI
              "ignore_above": 256
            }
          }
        }


    • Need to add the copy_to field mapping

      • Code Block
        "all_text_fields": {
          "type": "text"
        }
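
Since the workaround is aimed at dynamic templating, here is a hedged sketch of a mapping body that applies copy_to to every dynamically mapped string field. The function name build_mapping and the template name strings_as_text are made up for illustration; the field names all_text_fields and raw follow the notes above.

```python
def build_mapping(catch_all="all_text_fields", subfield="raw"):
    """Return a mappings body whose dynamic template copies strings to one field.

    Every dynamically detected string becomes a text field that is copied into
    the catch-all field, with a keyword sub-field for sorting and aggregations.
    """
    return {
        "dynamic_templates": [
            {
                "strings_as_text": {  # template name is arbitrary (assumption)
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "text",
                        "copy_to": catch_all,  # copy_to sits on the parent field
                        "fields": {
                            subfield: {"type": "keyword", "ignore_above": 256}
                        },
                    },
                }
            }
        ],
        # The copy_to target itself still has to be mapped explicitly.
        "properties": {catch_all: {"type": "text"}},
    }


if __name__ == "__main__":
    import json

    print(json.dumps({"mappings": build_mapping()}, indent=2))
```

The copy_to is placed on the parent text field rather than on the sub-field, matching the note above that it does not work inside multi-fields.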


  • General changes to the mapping

    • Created an example mapping called grq_v1.1_s1-iw_slc

    • Copied example data into a new ES index, using the built-in dynamic mapping to build the initial mapping

    • mapping changes:

      • metadata.context to {"type": "object", "enabled": false}

        • properties.location to {"type": "geo_shape","tree": "quadtree"}

        • use type keyword for fields that are sorted or aggregated on; otherwise msearch returns this error:


          • Code Block
            "reason": "Fielddata is disabled on text fields by default. ... Alternatively use a keyword field instead."


  • Changes to query_string

    • Removal of escaped literal double quotes in query_string
    • The old query_string from 1.7 below would return S1B_IW_SLC__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16D

      • Code Block
        {
          "query": {
            "query_string": {
              "query": "\"__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16\"",
              "default_operator": "OR"
            }
          }
        }


    • The new query_string returns the equivalent document but requires a wildcard * at the beginning and end of the string

      • Code Block
        {
          "query": {
            "query_string": {
              "default_field": "all_text_fields",
              "query": "*__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16*",
              "default_operator": "OR"
            }
          }
        }


    • Date searches do not appear to have changed much

      • Code Block
        {
          "query": {
            "query_string": {
              "query": "starttime: [2019-01-01 TO 2019-01-31]",
              "default_operator": "OR"
            }
          }
        }


    • can combine different fields as well

      • Code Block
        {
          "query": {
            "query_string": {
              "fields": ["all_text_fields", "all_date_fields"],
              "query": "[2019-01-01 TO 2019-01-31] AND *__1SDV_20190109T020750_20190109T020817_014411*",
              "default_operator": "OR"
            }
          }
        }
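
The wildcard wrapping described above could be done in one place, for example with a small Python helper. This is only a sketch: the function name wildcard_query_string is hypothetical, and it does not escape other query_string special characters in the term.

```python
def wildcard_query_string(term, default_field="all_text_fields"):
    """Wrap a partial ID in leading/trailing * and build a query_string query."""
    core = term.strip("*")  # avoid doubling wildcards the caller already added
    if not core:
        raise ValueError("search term must contain more than wildcards")
    return {
        "query": {
            "query_string": {
                "default_field": default_field,
                "query": "*" + core + "*",
                "default_operator": "OR",
            }
        }
    }


if __name__ == "__main__":
    import json

    partial_id = "__1SDV_20170812T010949_20170812T011016_006900_00C25E_B16"
    print(json.dumps(wildcard_query_string(partial_id), indent=2))
```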



  • Removal of search_type=scan

    • https://www.elastic.co/guide/en/elasticsearch/reference/5.5/breaking_50_search_changes.html
    • NOTE: must clear _scroll_id after using the scroll API to pull data
      • Will return an error if the _scroll_id's are not cleared

      • Code Block
        query_phase_execution_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [11000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.


    • Requires changes in our HySDS code, wherever it uses search_type=scan

    • Code Block
      curl -X POST "http://localhost:9200/hysds_ios/_search?search_type=scan&scroll=10m&size=100"
      {
        "error": {
          "root_cause": [
            {
              "type": "illegal_argument_exception",
              "reason": "No search type for [scan]"
            }
          ],
          "type": "illegal_argument_exception",
          "reason": "No search type for [scan]"
        },
        "status": 400
      }
      
      # removing search_type=scan from the endpoint fixes this problem
      curl -X POST "http://100.64.134.55:9200/user_rules/_search?scroll=10m&size=100"
      {
        "_scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWMUpVeFNzVXpTVktlUzFPc0NKa1dndw==",
        "took": 34,
        "timed_out": false,
        "_shards": {
          "total": 1,
          "successful": 1,
          "skipped": 0,
          "failed": 0
        },
        "hits": {
          "total": {
            "value": 0,
            "relation": "eq"
          },
          "max_score": null,
          "hits": []
        }
      }
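
A possible replacement for search_type=scan in Python code is a plain scroll loop that always clears the scroll context when it finishes. The sketch below assumes a client object exposing search, scroll and clear_scroll with keyword arguments in the style of the elasticsearch-py 7.x client; the function name scroll_all is hypothetical and this is not the HySDS utility class itself.

```python
def scroll_all(client, index, body, scroll="10m", size=100):
    """Yield every hit for a query via the scroll API, then clear the scroll.

    client is assumed to behave like an elasticsearch-py 7.x Elasticsearch
    instance (search / scroll / clear_scroll with keyword arguments).
    """
    scroll_ids = set()
    try:
        # No search_type=scan: a normal search with a scroll parameter is enough.
        resp = client.search(index=index, body=body, scroll=scroll, size=size)
        while True:
            scroll_id = resp.get("_scroll_id")
            if scroll_id:
                scroll_ids.add(scroll_id)
            hits = resp["hits"]["hits"]
            if not hits:
                break  # an empty page means the scroll is exhausted
            for hit in hits:
                yield hit
            resp = client.scroll(scroll_id=scroll_id, scroll=scroll)
    finally:
        # Clear every scroll context we saw, even if the caller stops early.
        for sid in scroll_ids:
            client.clear_scroll(scroll_id=sid)
```

Usage would look like `for doc in scroll_all(es, "user_rules", {"query": {"match_all": {}}}): ...`, where es is an already configured client.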


  • Removal of filter and filtered: Link and Guide

    • removed in version 5.0 (deprecated since 2.0); move all logic to query and bool
    • and, or and not changed to must, should and must_not
      • if using should, will need to add minimum_should_match: 1
      • Link


    • Code Block
      # from this:
      {
        "filtered": {
          "filter": {
            "and": [
              {
                "match": {
                  "tags": "ISL"
                }
              },
              {
                "range": {
                  "metadata.ProductReceivedTime": {"gte": "2020-03-24T00:00:00.000000Z"}
                }
              },
              {
                "range": {
                  "metadata.ProductReceivedTime": {"lte": "2020-03-24T23:59:59.999999Z"}
                }
              }
            ]
          }
        }
      }
      
      # change to this:
      {
        "query": {
          "bool": {
            "must": [
              {
                "match": {
                  "tags": "ISL"
                }
              }
            ],
            "filter": [
              {
                "range": {
                  "metadata.ProductReceivedTime": {"gte": "2020-03-24T00:00:00.000000Z"}
                }
              },
              {
                "range": {
                  "metadata.ProductReceivedTime": {"lte": "2020-03-24T23:59:59.999999Z"}
                }
              }
            ]
          }
        }
      }
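
For code that builds many such queries, the and/or/not rewrite could be automated. The following is a rough sketch under stated assumptions: the function names are hypothetical, only the and, or and not wrappers are handled, and the converted clauses are all placed under filter (no scoring), whereas the hand-written example above keeps the match clause under must.

```python
def convert_filter(node):
    """Recursively rewrite 1.x and/or/not filters into bool clauses."""
    if "and" in node:
        return {"bool": {"must": [convert_filter(c) for c in node["and"]]}}
    if "or" in node:
        return {
            "bool": {
                "should": [convert_filter(c) for c in node["or"]],
                "minimum_should_match": 1,  # needed so should acts like OR
            }
        }
    if "not" in node:
        return {"bool": {"must_not": [convert_filter(node["not"])]}}
    return node  # leaf clause (match, range, term, ...) is kept unchanged


def convert_filtered(old):
    """Turn {"filtered": {...}} into an equivalent {"query": {"bool": {...}}}."""
    filtered = old["filtered"]
    bool_query = {"filter": [convert_filter(filtered["filter"])]}
    if "query" in filtered:
        bool_query["must"] = [filtered["query"]]  # scoring part, if present
    return {"query": {"bool": bool_query}}


if __name__ == "__main__":
    import json

    old = {"filtered": {"filter": {"and": [
        {"match": {"tags": "ISL"}},
        {"range": {"metadata.ProductReceivedTime": {"gte": "2020-03-24T00:00:00.000000Z"}}},
    ]}}}
    print(json.dumps(convert_filtered(old), indent=2))
```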


...