Elasticsearch version 8.16.0
IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.
Also see Breaking changes in 8.16.
Breaking changes
- Analysis
  - Set lenient to true by default when using updateable synonyms #110901
- Data streams
  - Update data stream lifecycle telemetry to track global retention #112451
- ES|QL
  - Entirely remove META FUNCTIONS #113967
- Mapping
  - JDK locale database change #113975
- Search
  - Adding breaking change entry for retrievers #115399
Bug fixes
- Aggregations
- Authentication
- Authorization
- CRUD
- Cluster Coordination
  - Ensure clean thread context in `MasterService` #114512
- Data streams
  - Adding support for data streams with a match-all template #111311 (issue: #111204)
  - Exclude internal data streams from global retention #112100
  - Fix verbose get data stream API not requiring extra privileges #112973
  - OTel mappings: avoid metrics to be rejected when attributes are malformed #114856
  - Resolve pipelines from template on lazy rollover write #116031 (issue: #112781)
  - [apm-data] Apply lazy rollover on index template creation #116219 (issue: #116230)
  - [otel-data] Add more kubernetes aliases #115429
  - logs-apm.error-*: define log.level field as keyword #112440
- Distributed
  - Handle `InternalSendException` inline for non-forking handlers #114375
- EQL
- ES|QL
  - Add Values aggregation tests, fix `ConstantBytesRefBlock` memory handling #111367
  - Align year diffing to the rest of the units in DATE_DIFF: chronological #113103 (issue: #112482)
  - Disable pushdown of WHERE past STATS #115308 (issue: #115281)
  - Fix CASE when conditions are multivalued #112401 (issue: #112359)
  - Fix DEBUG log of filter #116086 (issue: #116055)
  - Fix Double operations returning infinite #111064 (issue: #111026)
  - Fix `REVERSE` with backspace character #115245 (issues: #114372, #115227, #115228)
  - Fix a bug in VALUES agg #115952
  - Fix a bug in `MV_PERCENTILE` #112218 (issues: #112193, #112180, #112187, #112188)
  - Fix filtered grouping on ords #115312 (issue: #114897)
  - Fix grammar changes around per agg filtering #114848
  - Fix serialization during `can_match` #111779 (issues: #111701, #111726)
  - Fix synthetic attribute pruning #111413 (issue: #105821)
  - Don’t lose the original casting error message #111968 (issue: #111967)
  - Fix for missing indices error message #111797 (issue: #111712)
  - Restrict sorting for `_source` and counter field types #114638 (issues: #114423, #111976)
  - Better validation for GROK patterns #110574 (issue: #110533)
  - Better validation for RLIKE patterns #112489 (issue: #112485)
  - Better validation of GROK patterns #112200 (issue: #112111)
  - Fix LIMIT pushdown past MV_EXPAND #115624 (issues: #102084, #102061)
  - Fix ST_CENTROID_AGG when no records are aggregated #114888 (issue: #106025)
  - Spatial search functions support multi-valued fields in compute engine #112063 (issues: #112102, #112505, #110830)
  - Check expression resolved before checking its data type in `ImplicitCasting` #113314 (issue: #113242)
  - Simplify patterns for subfields #111118
  - Simplify syntax of named parameter for identifier and pattern #115061
  - Skip validating remote cluster index names in parser #114271
  - Use `RangeQuery` and String in `BinaryComparison` on datetime fields #110669 (issue: #107900)
  - Verify aggregation filter’s type is boolean to avoid `class_cast_exception` #116274
  - Add tests for stats by constant #110593 (issue: #105383)
  - Make named parameter for identifier and pattern snapshot #114784
  - Validate `mv_sort` order #110021 (issue: #109910)
- Geo
- Health
  - Set `replica_unassigned_buffer_time` in constructor #112612
- ILM+SLM
  - Make `SnapshotLifecycleStats` immutable so `SnapshotLifecycleMetadata.EMPTY` isn’t changed as side-effect #111215
- Indices APIs
- Infra/Core
  - Fix max file size check to use `getMaxFileSize` #113723 (issue: #113705)
  - Guard blob store local directory creation with `doPrivileged` #115459
  - Handle `BigInteger` in xcontent copy #111937 (issue: #111812)
  - Report JVM stats for all memory pools (97046) #115117 (issue: #97046)
  - `ByteArrayStreamInput`: Return -1 when there are no more bytes to read #112214
- Infra/Logging
- Infra/Settings
- Ingest Node
- License
- Logs
  - Do not expand dots when storing objects in ignored source #113910
  - Fix `ignore_above` handling in synthetic source when index level setting is used #113570 (issue: #113538)
  - Fix synthetic source for flattened field when used with `ignore_above` #113499 (issue: #112044)
  - Prohibit changes to index mode, source, and sort settings during restore #115811
- Machine Learning
  - Avoid `ModelAssignment` deadlock #109684
  - Avoid `catch (Throwable t)` in `AmazonBedrockStreamingChatProcessor` #115715
  - Allow for `pytorch_inference` results to include zero-dimensional tensors
  - Empty percentile results no longer throw no_such_element_exception in Anomaly Detection jobs #116015 (issue: #116013)
  - Fix NPE in Get Deployment Stats #115404
  - Fix bug in ML serverless autoscaling which prevented trained model updates from triggering a scale up #110734
  - Fix stream support for `TaskType.ANY` #115656
  - Fix parameter initialization for large forecasting models #2759
  - Forward bedrock connection errors to user #115868
  - Ignore unrecognized openai sse fields #114715
  - Prevent NPE if model assignment is removed while waiting to start #115430
  - Send mid-stream errors to users #114549
  - Temporarily return both `modelId` and `inferenceId` for GET /_inference until we migrate clients to only `inferenceId` #111490
  - Warn for model load failures if they have a status code <500 #113280
  - [Inference API] Remove unused Cohere rerank service settings fields in a BWC way #110427
  - [ML] Create Inference API will no longer return model_id and now only return inference_id #112508
- Mapping
- Ranking
- Search
  - Allow for queries on `_tier` to skip shards in the `can_match` phase #114990 (issue: #114910)
  - Allow out of range term queries for numeric types #112916
  - Do not exclude empty arrays or empty objects in source filtering #112250 (issue: #109668)
  - Fix synthetic source handling for `bit` type in `dense_vector` field #114407 (issue: #114402)
  - Improve DateTime error handling and add some bad date tests #112723 (issue: #112190)
  - Improve date expression/remote handling in index names #112405 (issue: #112243)
  - Make "too many clauses" throw IllegalArgumentException to avoid 500s #112678 (issue: #112177)
  - Make empty string searches be consistent with case (in)sensitivity #110833
  - Prevent flattening of ordered and unordered interval sources #114234
  - Remove needless forking to GENERIC in `TransportMultiSearchAction` #110796
  - Search/Mapping: KnnVectorQueryBuilder support for allowUnmappedFields #107047 (issue: #106846)
  - Span term query to convert to match no docs when unmapped field is targeted #113251
  - Speedup `CanMatchPreFilterSearchPhase` constructor #110860
  - Update `BlobCacheBufferedIndexInput::readVLong` to correctly handle negative long values #115594
  - [8.x] Limit the number of tasks that a single search can submit #115932
- Security
- Snapshot/Restore
- TSDB
- Task Management
  - Improve handling of failure to create persistent task #114386
- Transform
- Vector Search
- Watcher
Deprecations
- Analysis
- CRUD
  - Deprecate dot-prefixed indices and composable template index patterns #112571
- Search
Enhancements
- Aggregations
  - Account for `DelayedBucket` before reduction #113013
  - Add protection for OOM during aggregations partial reduction #110520
  - Deduplicate `BucketOrder` when deserializing #112707
  - Lower the memory footprint when creating `DelayedBucket` #112519
  - Reduce heap usage for `AggregatorsReducer` #112874
  - Remove reduce and `reduceContext` from `DelayedBucket` #112547
- Allocation
- Application
  - [Profiling] add `container.id` field to event index template #111969
- Authorization
- Codec
- Data streams
  - Add verbose flag retrieving `maximum_timestamp` for get data stream API #112303
  - Display effective retention in the relevant data stream APIs #112019
  - Expose global retention settings via data stream lifecycle API #112210
  - Ignore warning on yaml test put template #116201 (issue: #116158)
  - Make ecs@mappings work with OTel attributes #111600
- Distributed
  - Add link to Max Shards Per Node exception message #110993
- ES|QL
  - Add EXP ES|QL function #110879
  - Delay construction of warnings #114368
  - Add `CircuitBreaker` to TDigest, Step 3: Connect with ESQL CB #113387
  - Add `CircuitBreaker` to TDigest, Step 4: Take into account shallow classes size #113613 (issue: #113916)
  - Collect and display execution metadata for ES|QL cross cluster searches #112595 (issue: #112402)
  - Add support for multivalue fields in Arrow output #114774
  - BUCKET: allow numerical spans as whole numbers #111874 (issues: #104646, #109340, #105375)
  - Have BUCKET generate friendlier intervals #111879 (issue: #110916)
  - Profile more timing information #111855
  - Push down filters even in case of renames in Evals #114411
  - Speed up CASE for some parameters #112295
  - Speed up grouping by bytes #114021
  - Use less memory in listener #114358
  - Add support for cached strings in plan serialization #112929
  - Add Telemetry API and track top functions #111226
  - Enhance SORT push-down to lucene to cover references to fields and ST_DISTANCE function #112938 (issue: #109973)
  - Siem ea 9521 improve test #111552
  - Support multi-valued fields in compute engine for ST_DISTANCE #114836 (issue: #112910)
  - Add `SPACE` function #112350
  - Add finish() elapsed time to aggregation profiling times #113172 (issue: #112950)
  - Make query wrapped by `SingleValueQuery` cacheable #110116
  - Add hypot function #114382
  - Cast mixed numeric types to a common numeric type for Coalesce and In at Analyzer #111917 (issue: #111486)
  - Combine Disjunctive CIDRMatch #111501 (issue: #105143)
  - Create `Range` in `PushFiltersToSource` for qualified pushable filters on the same field #111437
  - Name parameter with leading underscore #111950 (issue: #111821)
  - Named parameter for field names and field name patterns #112905
  - Validate index name in parser #112081
  - Add reverse function #113297
  - Explicit cast a string literal to `date_period` and `time_duration` in arithmetic operations #109193
- Experiences
  - Integrate IBM watsonx to Inference API for text embeddings #111770
- Geo
- Health
- ILM+SLM
  - ILM: Add `total_shards_per_node` setting to searchable snapshot #112972 (issue: #112261)
  - PUT slm policy should only increase version if actually changed #111079
  - Preserve Step Info Across ILM Auto Retries #113187
  - Register SLM run before snapshotting to save stats #110216
  - SLM interval schedule followup - add back `getFieldName` style getters #112123
- Infra/Core
- Infra/Metrics
  - Add `TaskManager` to `pluginServices` #112687
- Infra/REST API
- Infra/Scripting
  - Expose `HexFormat` in Painless #112412
- Infra/Settings
- Ingest Node
  - Add `size_in_bytes` to enrich cache stats #110578
  - Add support for templates when validating mappings in the simulate ingest API #111161
  - Adding `index_template_substitutions` to the simulate ingest API #114128
  - Adding component template substitutions to the simulate ingest API #113276
  - Adding mapping validation to the simulate ingest API #110606
  - Adds example plugin for custom ingest processor #112282 (issue: #111539)
  - Fix unnecessary mustache template evaluation #110986 (issue: #110191)
  - Listing all available databases in the _ingest/geoip/database API #113498
  - Make enrich cache based on memory usage #111412 (issue: #106081)
  - Tag redacted document in ingest metadata #113552
  - Verify Maxmind database types in the geoip processor #114527
- Logs
- Machine Learning
  - Add Completion Inference API for Alibaba Cloud AI Search Model #112512
  - Add Streaming Inference spec #113812
  - Add chunking settings configuration to `CohereService`, `AmazonBedrockService`, and `AzureOpenAiService` #113897
  - Add chunking settings configuration to `ElasticsearchService/ELSER` #114429
  - Add custom rule parameters to force time shift #110974
  - Adding chunking settings to `GoogleVertexAiService`, `AzureAiStudioService`, and `AlibabaCloudSearchService` #113981
  - Adding chunking settings to `MistralService`, `GoogleAiStudioService`, and `HuggingFaceService` #113623
  - Adds a new Inference API for streaming responses back to the user. #113158
  - Allow users to force a detector to shift time series state by a specific amount #2695
  - Create `StreamingHttpResultPublisher` #112026
  - Create an ml node inference endpoint referencing an existing model #114750
  - Default inference endpoint for ELSER #113873
  - Default inference endpoint for the multilingual-e5-small model #114683
  - Dynamically get of num allocations #114636
  - Enable OpenAI Streaming #113911
  - Filter empty task settings objects from the API response #114389
  - Migrate Inference to `ChunkedToXContent` #111655
  - Register Task while Streaming #112369
  - Server-Sent Events for Inference response #112565
  - Stream Anthropic Completion #114321
  - Stream Azure Completion #114464
  - Stream Bedrock Completion #114732
  - Stream Cohere Completion #114080
  - Stream Google Completion #114596
  - Stream OpenAI Completion #112677
  - Support sparse embedding models in the elasticsearch inference service #112270
  - Switch default chunking strategy to sentence #114453
  - Update the Pytorch library to version 2.3.1 #2688
  - Upgrade to AWS SDK v2 #114309 (issue: #110590)
  - Use the same chunking configurations for models in the Elasticsearch service #111336
  - Validate streaming HTTP Response #112481
  - Wait for allocation on scale up #114719
  - [Inference API] Add Alibaba Cloud AI Search Model support to Inference API #111181
  - [Inference API] Add Docs for AlibabaCloud AI Search Support for the Inference API #111181
  - [Inference API] Introduce Update API to change some aspects of existing inference endpoints #114457
  - [Inference API] Prevent inference endpoints from being deleted if they are referenced by semantic text #110399
  - [Inference API] alibabacloud ai search service support chunk infer to support semantic_text field #110399
- Mapping
  - Add Field caps support for Semantic Text #111809
  - Add lucene segment-level fields stats #111123
  - Add Search Inference ID To Semantic Text Mapping #113051
  - Add object param for keeping synthetic source #113690
  - Add support for multi-value dimensions #112645 (issue: #110387)
  - Allow dimension fields to have multiple values in standard and logsdb index mode #112345 (issues: #112232, #112239)
  - Allow fields with dots in sparse vector field mapper #111981 (issue: #109118)
  - Allow querying `index_mode` #110676
  - Configure keeping source in `FieldMapper` #112706
  - Control storing array source with index setting #112397
  - Introduce mode `subobjects=auto` for objects #110524
  - Update `semantic_text` field to support indexing numeric and boolean data types #111284
  - Use fallback synthetic source for `copy_to` and doc_values: false cases #112294 (issues: #110753, #110038, #109546)
- Network
  - Add links to network disconnect troubleshooting #112330
- Ranking
  - Add timeout and cancellation check to rescore phase #115048
- Relevance
  - Add a query rules tester API call #114168
- Search
  - Add more `dense_vector` details for cluster stats field stats #113607
  - Add range and regexp Intervals #111465
  - Adding support for `allow_partial_search_results` in PIT #111516
  - Allow incubating Panama Vector in simdvec, and add vectorized `ipByteBin` #112933
  - Avoid using concurrent collector manager in `luceneChangesSnapshot` #113816
  - Bool query early termination should also consider `must_not` clauses #115031
  - Deduplicate Kuromoji User Dictionary #112768
  - Multi term intervals: increase max_expansions #112826 (issue: #110491)
  - Search coordinator uses `event.ingested` in cluster state to do rewrites #111523
  - Update cluster stats for retrievers #114109
- Security
- Snapshot/Restore
  - Add `max_multipart_parts` setting to S3 repository #113989
  - Add support for Azure Managed Identity #111344
  - Add telemetry for repository usage #112133
  - Add workaround for missing shard gen blob #112337
  - Clean up dangling S3 multipart uploads #111955 (issues: #101169, #44971)
  - Execute shard snapshot tasks in shard-id order #111576 (issue: #108739)
  - Include account name in Azure settings exceptions #111274
  - Introduce repository integrity verification API #112348 (issue: #52622)
- Stats
  - Track search and fetch failure stats #113988
- TSDB
- Vector Search
New features
- Data streams
- ES|QL
  - Add match function #113374
  - Add `MV_PSERIES_WEIGHTED_SUM` for score calculations used by security solution #109017
  - Add async ID and `is_running` headers to ESQL async query #111840
  - Add boolean support to Max and Min aggs #110527
  - Add boolean support to TOP aggregation #110718
  - Added `mv_percentile` function #111749 (issue: #111591)
  - Introduce per agg filter #113735
  - Strings support for MAX and MIN aggregations #111544
  - Support IP fields in MAX and MIN aggregations #110921
  - TOP aggregation IP support #111105
  - TOP support for strings #113183 (issue: #109849)
  - `mv_median_absolute_deviation` function #112055 (issue: #111590)
  - Add MATCH operator #110971
- ILM+SLM
  - SLM Interval based scheduling #110847
- Inference
  - EIS integration #111154
- Ingest Node
- Machine Learning
- Relevance
  - [Query rules] Add `exclude` query rule type #111420
- Search
- Vector Search
  - Adding new bbq index types behind a feature flag #114439
Upgrades
- Infra/Core
- Infra/Metrics
- Search
  - Upgrade to lucene 9.12 #113333
- Snapshot/Restore
Known issues
- ES|QL
  - Some valid queries using an `ENRICH` command can fail when a match field is used that is absent from some indices or shards, either with a 500 status code due to `NullPointerException` or `ClassCastException`, or with a 400 status code and `IllegalArgumentException`. This is fixed in #126187.
  - A bug in the ES|QL STATS command may yield incorrect results. The bug only happens in very specific cases that follow this pattern: `STATS ... BY keyword1, keyword2`, i.e. the command must have exactly two grouping fields, both keywords, where the first field has high cardinality (more than 65k distinct values).

    The bug is described in detail in [this issue](https://github.com/elastic/elasticsearch/issues/130644). The problem was introduced in 8.16.0 and [fixed](https://github.com/elastic/elasticsearch/pull/130705) in 8.17.9 and 8.18.7.

    Possible workarounds include:
    * switching the order of the grouping keys (e.g. `STATS ... BY keyword2, keyword1`, if `keyword2` has a lower cardinality)
    * reducing the grouping key cardinality by filtering out values before STATS
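
The version bounds given for the STATS known issue can be turned into a quick check of whether a cluster should apply one of the workarounds. The following is a minimal sketch, not an official tool: the function names are hypothetical, it assumes a plain `major.minor.patch` version string, and it assumes every 8.16.x release is affected because the note lists no fixed 8.16 patch. For release lines the note does not mention, it returns `None` rather than guessing.

```python
from typing import Optional, Tuple

# Patch releases that carry the fix, as stated in the known-issue note.
# Only the 8.17 and 8.18 lines are named there.
FIXED_IN = {(8, 17): 9, (8, 18): 7}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'major.minor.patch' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def stats_bug_affects(version: str) -> Optional[bool]:
    """Return True/False when the note settles the question, else None."""
    major, minor, patch = parse_version(version)
    if (major, minor) < (8, 16):
        return False  # the problem was introduced in 8.16.0
    if (major, minor) == (8, 16):
        return True  # assumption: no fixed 8.16.x release is listed
    fixed_patch = FIXED_IN.get((major, minor))
    if fixed_patch is None:
        return None  # release line not covered by the note; see the linked issue
    return patch < fixed_patch
```

For example, `stats_bug_affects("8.17.8")` reports an affected release, while `stats_bug_affects("8.17.9")` does not. Even on an affected release, a query is only at risk when it matches the two-keyword grouping pattern described in the note.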