What’s new in 8.15
Coming in 8.15.
Here are the highlights of what’s new and improved in Elasticsearch 8.15! For detailed information about this release, see the Release notes and Migration guide.
Stricter failure handling in multi-repo get-snapshots requests
If a multi-repo get-snapshots request encountered a failure in one of the targeted repositories, earlier versions of Elasticsearch would proceed as if the faulty repository did not exist, except for a per-repository failure report in a separate section of the response body. This made it impossible to paginate the results properly in the presence of failures. In versions 8.15.0 and later this API’s failure handling behaviour has been made stricter: it reports an overall failure if any targeted repository’s contents cannot be listed.
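To illustrate why the stricter behaviour helps pagination, here is a minimal client-side sketch. It assumes a hypothetical fetch_page callable that performs one get-snapshots request, passes the given token as the after parameter, returns the parsed response body, and raises on an error response. The snapshots and next fields are taken from the get-snapshots response format; everything else is illustrative.

```python
def iter_snapshots(fetch_page):
    """Yield every snapshot across all pages, or raise if any page fails.

    fetch_page is a hypothetical callable: it takes the `after` token (None
    for the first page) and returns the parsed JSON body of one response.
    """
    after = None
    while True:
        # Any failure propagates to the caller, so a partial listing is
        # never mistaken for a complete one.
        page = fetch_page(after)
        for snapshot in page.get("snapshots", []):
            yield snapshot
        after = page.get("next")  # absent on the last page
        if after is None:
            return


if __name__ == "__main__":
    # Fake two-page response sequence standing in for a real cluster.
    pages = {
        None: {"snapshots": [{"snapshot": "s1"}, {"snapshot": "s2"}], "next": "tok"},
        "tok": {"snapshots": [{"snapshot": "s3"}]},
    }
    print([s["snapshot"] for s in iter_snapshots(lambda after: pages[after])])
```

Because the request now fails as a whole, the loop either walks every page or stops with an error; it cannot silently skip a repository.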
Introduce logsdb index mode as Tech Preview
This change introduces a new index mode named logsdb.
When the new index mode is enabled, the following storage-saving features are enabled automatically:
- Synthetic source, which omits storing the _source field. When _source or part of it is requested, it is synthesized on the fly at runtime.
- Index sorting. By default, indices are sorted by the host.name and @timestamp fields at index time. This can be overridden if other sorting fields yield a better compression rate.
- More space-efficient compression for fields with doc values enabled. These are the same codecs used when the time_series index mode is enabled.
The index.mode index setting, set to logsdb, should be configured in index templates or defined when creating a plain index.
Benchmarks and other tests have shown that logs data sets use around 2.5 times less storage with the new index mode enabled compared to not configuring it.
The new logsdb index mode is a tech preview feature.
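As a rough sketch of the index template route, the following builds a template body with index.mode set to logsdb. The index pattern and priority value are hypothetical placeholders; only the index.mode setting comes from the description above.

```python
import json


def logsdb_template(index_patterns, priority=500):
    """Build an index template body that turns on the logsdb index mode."""
    return {
        "index_patterns": list(index_patterns),  # hypothetical patterns
        "priority": priority,                    # hypothetical priority
        "template": {
            "settings": {
                "index.mode": "logsdb",
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(logsdb_template(["my-logs-*"]), indent=2))
```

The resulting body could be sent with a put index template request; the same settings object can also be supplied directly when creating a plain index.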
Add new int4 quantization to dense_vector
New int4 (half-byte) scalar quantization support via two new index types: int4_hnsw and int4_flat.
This gives an 8x reduction from float32 with some accuracy loss. In addition to requiring less memory, this improves query and merge speed significantly when compared to raw vectors.
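The 8x figure follows from the component sizes: 32 bits per float32 component versus 4 bits per int4 component. The sketch below works through that arithmetic and builds a dense_vector mapping using one of the new index types. The dimension count is a hypothetical example, and the calculation ignores any per-vector overhead.

```python
def bytes_per_vector(dims, bits_per_component):
    """Bytes needed for the raw components of one vector, rounded up."""
    return (dims * bits_per_component + 7) // 8


def int4_mapping(dims, index_type="int4_hnsw"):
    """Build a dense_vector field mapping that uses int4 quantization."""
    if index_type not in ("int4_hnsw", "int4_flat"):
        raise ValueError("index_type must be int4_hnsw or int4_flat")
    return {
        "type": "dense_vector",
        "dims": dims,
        "index": True,
        "index_options": {"type": index_type},
    }


if __name__ == "__main__":
    dims = 1024  # hypothetical dimension count
    float32_bytes = bytes_per_vector(dims, 32)
    int4_bytes = bytes_per_vector(dims, 4)
    print(float32_bytes, int4_bytes, float32_bytes // int4_bytes)  # 4096 512 8
    print(int4_mapping(dims))
```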
Mark Query Rules as GA
Query rules are now Generally Available. None of the query rules APIs remain in tech preview.
Adds new bit element_type for dense_vectors
This adds bit vector support by adding element_type: bit for vectors. This new element type works for indexed and non-indexed vectors. Additionally, it works with the hnsw and flat index types. No quantization-based codec works with this element type, which is consistent with byte vectors.
bit vectors accept up to 32768 dimensions and expect vectors that are being indexed to be encoded either as a hexadecimal string or as a byte[] array where each element of the byte array represents 8 bits of the vector.
bit vectors support script usage and regular query usage. When indexed, all comparisons done are xor and popcount summations (that is, Hamming distance), and the scores are transformed and normalized given the vector dimensions.
For scripts, l1norm is the same as hamming distance and l2norm is sqrt(l1norm). dotProduct and cosineSimilarity are not supported.
Note that the dimensions expected by this element_type must always be divisible by 8, and the byte[] vectors provided for indexing must have size dim/8, where each byte element represents 8 bits of the vector.
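The encoding and distance rules above can be sketched in a few lines. This is an illustration only, not Elasticsearch's implementation: the function names are hypothetical, and the bit order within each byte (most significant bit first) is an assumption.

```python
import math


def pack_bits(bits):
    """Pack a sequence of 0/1 values into bytes, 8 bits per byte.

    Assumption: the first bit of each group of 8 is the most significant bit.
    """
    if len(bits) % 8 != 0:
        raise ValueError("bit vector dimensions must be divisible by 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            if bit not in (0, 1):
                raise ValueError("bits must be 0 or 1")
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def hamming(a, b):
    """Hamming distance: xor each byte pair, then count the set bits."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def l2norm(a, b):
    """For bit vectors, l2norm is the square root of the Hamming distance."""
    return math.sqrt(hamming(a, b))


if __name__ == "__main__":
    a = pack_bits([1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1])
    b = pack_bits([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
    print(len(a), a.hex())           # 2 aa0f  (16 dimensions -> dim/8 bytes)
    print(hamming(a, b), l2norm(a, b))  # 4 2.0
```

The hexadecimal string printed for a is the kind of value that could be supplied when indexing a 16-dimension bit vector.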
The Redact processor is Generally Available
The Redact processor uses the Grok rules engine to obscure text in the input document matching the given Grok patterns. The Redact processor was initially released as Technical Preview in 8.7.0, and is now released as Generally Available.
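As a small sketch, an ingest pipeline body using the Redact processor might be built as follows. The field name, description, and choice of Grok patterns are hypothetical examples.

```python
import json


def redact_pipeline(field, patterns):
    """Build an ingest pipeline body with a single redact processor."""
    return {
        "description": "Obscure sensitive values",  # hypothetical description
        "processors": [
            {
                "redact": {
                    "field": field,
                    "patterns": list(patterns),
                }
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical field and patterns: mask IP addresses and email addresses.
    body = redact_pipeline("message", ["%{IP:client}", "%{EMAILADDRESS:email}"])
    print(json.dumps(body, indent=2))
```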
New custom parser for ISO-8601 datetimes
This introduces a new custom parser for ISO-8601 datetimes, for the iso8601, strict_date_optional_time, and strict_date_optional_time_nanos built-in date formats. This provides a performance improvement over the default Java date-time parsing. Whilst it maintains much of the same behaviour, the new parser does not accept nonsensical date-time strings that have multiple fractional seconds fields or multiple timezone specifiers. If the new parser fails to parse a string, it will then use the previous parser to parse it. If a large proportion of the input data consists of these invalid strings, this may cause a small performance degradation. If you wish to force the use of the old parsers regardless, set the JVM property es.datetime.java_time_parsers=true on all ES nodes.
New custom parser for more ISO-8601 date formats
Following on from #106486, this extends the custom ISO-8601 datetime parser to cover the strict_year, strict_year_month, strict_date_time, strict_date_time_no_millis, strict_date_hour_minute_second, strict_date_hour_minute_second_millis, and strict_date_hour_minute_second_fraction date formats. As before, the parser will use the existing java.time parser if there are parsing issues, and the es.datetime.java_time_parsers=true JVM property will force the use of the old parsers regardless.
Preview: Support for the Connection Type, Domain, and ISP databases in the geoip processor
As a Technical Preview, the geoip processor can now use the commercial GeoIP2 Connection Type, GeoIP2 Domain, and GeoIP2 ISP databases from MaxMind.
Update Elasticsearch to Lucene 9.11
Elasticsearch is now updated to the latest Lucene version, 9.11. See the Lucene release notes for full details. Some particular highlights:
- Usage of MADVISE for better memory management: https://github.com/apache/lucene/pull/13196
- Use RWLock to access LRUQueryCache to reduce contention: https://github.com/apache/lucene/pull/13306
- Speedup multi-segment HNSW graph search for nested kNN queries: https://github.com/apache/lucene/pull/13121
- Add a MemorySegment Vector scorer, for scoring without copying on-heap vectors: https://github.com/apache/lucene/pull/13339
Synthetic _source improvements
There are multiple improvements to synthetic _source functionality:
- Synthetic _source is now supported for all field types including nested and object. object fields are supported with enabled set to false.
- Synthetic _source can be enabled together with ignore_malformed and ignore_above parameters for all field types that support them.
Index sorting on indexes with nested fields
Index sorting is now supported for indexes with mappings containing nested objects.
The index sort spec (as specified by index.sort.field) still can’t contain any nested fields.
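The restriction can be checked on the client side before creating an index. The helper below is a hypothetical pre-check, not Elasticsearch's own validation: it walks a mappings object, collects the paths of nested fields, and reports any sort field that is a nested field or sits under one.

```python
def nested_paths(properties, prefix=""):
    """Return the dotted paths of all fields mapped with type nested."""
    paths = []
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "nested":
            paths.append(path)
        # Recurse into sub-fields of object and nested fields alike.
        paths.extend(nested_paths(spec.get("properties", {}), path + "."))
    return paths


def invalid_sort_fields(sort_fields, mappings):
    """Return the sort fields that are a nested field or live under one."""
    nested = nested_paths(mappings.get("properties", {}))
    return [
        field
        for field in sort_fields
        if any(field == p or field.startswith(p + ".") for p in nested)
    ]


if __name__ == "__main__":
    # Hypothetical mappings: one top-level date field and one nested field.
    mappings = {
        "properties": {
            "@timestamp": {"type": "date"},
            "events": {
                "type": "nested",
                "properties": {"name": {"type": "keyword"}},
            },
        }
    }
    print(invalid_sort_fields(["@timestamp"], mappings))   # []
    print(invalid_sort_fields(["events.name"], mappings))  # ['events.name']
```

An index whose index.sort.field lists only top-level fields such as @timestamp passes this check even though its mappings contain a nested object.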