OpenSearch Flattened Field Numeric Range Queries in Java: Solutions
Are you trying to perform OpenSearch flattened field numeric range queries, only to discover that your numeric ranges are producing unexpected results? Maybe you’re querying for values between 1 and 10, but somehow 100 and 1000 are matching while 2 is mysteriously excluded. Flattened fields (OS flat object, ES flattened) store all sub-properties as keywords (unanalyzed strings), which means OpenSearch compares your numbers as ASCII strings rather than numerically. This creates unique challenges for range queries that can produce results that make no sense at all if you’re not prepared.
This blog post demonstrates numeric range query patterns using code examples from the Code Sloth Code Samples repository. You can find the complete samples in GitHub via our Code Sloth Code Samples page.
If you’re new to flattened fields or want to understand how they work in general (including indexing, searching, and array limitations), be sure to check out our OpenSearch Flattened Field Type in Java blog post for more detailed information.
Note: This post focuses specifically on numeric range queries. If you’re working with date range queries on flattened fields, be sure to check out our companion post on OpenSearch Flattened Field Range Queries in Java: ISO8601 Dates. Other than some small syntax differences, the OpenSearch and Elasticsearch implementations are very similar—this article will refer to OpenSearch but link to both implementations, so feel free to adapt to Elasticsearch if that is your Engine of choice!
Let’s dive in! 🦥
What’s to Come
In this post, we’ll explore:
- Understanding Flattened Field Range Query Challenges – Why range queries on flattened fields behave differently
- Numeric Range Queries: ASCII Comparison and Zero-Padding – Why unpadded numbers fail and how zero-padding solves the problem
- Max-Size Padding: Supporting Full Integer Ranges – The tradeoff of padding all numbers to maximum size
- Negative Numbers: Additional Complications – How negative numbers behave in ASCII comparison
- Single Point Range Queries – Working with gte, lte, gt, and lt operators
- Sloth Summary – Key takeaways and recommendations
Understanding Flattened Field Range Query Challenges
Before we dive into the specific challenges, let’s quickly recap what makes flattened fields special. Flattened fields (OS flat object, ES flattened) in OpenSearch treat an entire object as a single field, with all sub-properties indexed as keywords (unanalyzed strings). This means that when you perform range queries on flattened field properties, OpenSearch compares them lexicographically (ASCII order), not as the native data types they represent.
This fundamental difference creates a critical challenge for numeric range queries:
Numeric Range Queries: Numbers are compared as ASCII strings, not numerically, which breaks range queries. For example, unpadded numbers like "1", "2", "10" sort incorrectly as "1" < "10" < "2" in ASCII (because character-by-character comparison means "10" < "2" since the first character "1" < "2"), but when zero-padded to the same length ("0001", "0002", "0010"), they sort correctly as "0001" < "0002" < "0010", matching the numeric order.
Let’s explore this challenge in detail with actual test code examples.
OpenSearch Flattened Field Numeric Range Queries: ASCII Comparison and Zero-Padding
Before we dive into numeric range queries, let’s take a quick look at the document classes used in these tests. They follow the same pattern as other flattened field examples, using strongly-typed records for nested objects:
public class ProductWithNumericAttribute implements IDocumentWithId {
private String id;
private String name;
private NumericAttribute attribute; // Strongly-typed record that will be serialized as JSON object
// Constructors, getters, setters, equals, hashCode, toString...
}
public record NumericAttribute(
String value
) {
}
The ProductWithNumericAttribute class contains a NumericAttribute attribute, which will be mapped as a flattened field. The NumericAttribute record stores the numeric value as a String rather than using a native numeric type (like Integer or Long) to provide maximum flexibility over the format of the numeric values. This allows testing both padded and unpadded numeric values, and ensures that the exact string representation (including zero-padding) is preserved when serialized to JSON.
Note: While Java doesn’t have native integer types that include padding, Java provides several ways to pad integers when converting them to strings, such as String.format("%010d", value) to pad to 10 digits with zeros, or using DecimalFormat. Using String in the attribute class allows you to control the exact padding format that gets indexed in OpenSearch.
The Problem: Unpadded Numbers Fail Range Queries
When numbers are stored in flattened fields, they’re indexed as keywords (strings). This means OpenSearch compares them lexicographically (ASCII order), not numerically. This creates a fundamental problem:
- Numerically: 1 < 2 < 10 < 100 < 1000 (correct order)
- ASCII order: “1” < “10” < “100” < “1000” < “2” (incorrect order!)
ASCII comparison works left-to-right, character by character. When comparing these strings, they align on the left side:
1
10
100
1000
2
When comparing "10" to "2", we compare the first character: '1' vs '2'. Since '1' < '2', the comparison stops there—we never even look at the second character of "10". This is why "10" < "2" in ASCII order, even though numerically 10 > 2. This breaks range queries entirely.
Let’s see this failure in action from FlattenedNumericRangeTests.java:
@Test
public void flattenedNumericRange_UnpaddedNumbers_FailsRangeQueries() throws Exception {
try (OpenSearchTestIndex testIndex = fixture.createTestIndex(mapping ->
mapping.properties("attribute", Property.of(p -> p.flatObject(f -> f))))) {
// Create documents with unpadded numbers
ProductWithNumericAttribute[] products = new ProductWithNumericAttribute[]{
new ProductWithNumericAttribute("1", "Product1", new NumericAttribute("1")),
new ProductWithNumericAttribute("2", "Product2", new NumericAttribute("2")),
new ProductWithNumericAttribute("3", "Product3", new NumericAttribute("10")),
new ProductWithNumericAttribute("4", "Product4", new NumericAttribute("100")),
new ProductWithNumericAttribute("5", "Product5", new NumericAttribute("1000"))
};
testIndex.indexDocuments(products);
// Try to query for values between "1" and "10"
// This will fail because "10" < "2" in ASCII comparison
SearchResponse<ProductWithNumericAttribute> result = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of("1"))
.lte(JsonData.of("10"))
)
),
ProductWithNumericAttribute.class
);
// The query will fail because ASCII comparison order is: "1", "10", "100", "1000", "2"
// So "1" <= x <= "10" matches "1", "10", "100", "1000" but not "2"
// Expected 3 (1, 2, 10) but will be wrong due to ASCII comparison
assertThat(result.hits().total().value()).isNotEqualTo(3);
// Verify the actual incorrect results: matches "1", "10", "100", "1000" but excludes "2"
List<String> matchedIds = result.hits().hits().stream()
.map(h -> h.source().getId())
.sorted()
.collect(Collectors.toList());
assertThat(matchedIds).containsExactly("1", "3", "4", "5"); // Missing "2"!
}
}
This test clearly demonstrates the problem. When querying for values between 1 and 10, we expect to match products with values 1, 2, and 10. However, due to ASCII comparison, the query incorrectly matches "1", "10", "100", and "1000" but misses "2" because in ASCII order, "2" comes after "1000"!
Key points from this failure:
- Unpadded numbers produce incorrect ASCII ordering
- Range queries fail because ASCII order doesn’t match numeric order
- The value
"2"is incorrectly excluded from the range["1", "10"] - Values like
"100"and"1000"are incorrectly included
HTTP Commands for OpenSearch Dashboards
Index Creation Request:
PUT /range_query_numeric_unpadded
{
"mappings": {
"properties": {
"id": { "type": "keyword" },
"name": { "type": "keyword" },
"attribute": {
"type": "flat_object"
}
}
},
"settings": {
"number_of_replicas": 0,
"number_of_shards": 1
}
}
Index Documents Request:
POST /range_query_numeric_unpadded/_bulk
{ "index" : { "_index" : "range_query_numeric_unpadded", "_id" : "1" } }
{ "id" : "1", "name" : "Product1", "attribute" : { "value" : "1" } }
{ "index" : { "_index" : "range_query_numeric_unpadded", "_id" : "2" } }
{ "id" : "2", "name" : "Product2", "attribute" : { "value" : "2" } }
{ "index" : { "_index" : "range_query_numeric_unpadded", "_id" : "3" } }
{ "id" : "3", "name" : "Product3", "attribute" : { "value" : "10" } }
{ "index" : { "_index" : "range_query_numeric_unpadded", "_id" : "4" } }
{ "id" : "4", "name" : "Product4", "attribute" : { "value" : "100" } }
{ "index" : { "_index" : "range_query_numeric_unpadded", "_id" : "5" } }
{ "id" : "5", "name" : "Product5", "attribute" : { "value" : "1000" } }
Search Request 1: Range Query for Values Between 1 and 10 (Fails Due to ASCII Comparison)
POST /range_query_numeric_unpadded/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "1",
"lte": "10"
}
}
}
}
Search Response 1:
{
"took": 3,
"hits": {
"total": {
"value": 4,
"relation": "eq"
},
"hits": [
{
"_id": "1",
"_source": {
"id": "1",
"name": "Product1",
"attribute": {
"value": "1"
}
}
},
{
"_id": "3",
"_source": {
"id": "3",
"name": "Product3",
"attribute": {
"value": "10"
}
}
},
{
"_id": "4",
"_source": {
"id": "4",
"name": "Product4",
"attribute": {
"value": "100"
}
}
},
{
"_id": "5",
"_source": {
"id": "5",
"name": "Product5",
"attribute": {
"value": "1000"
}
}
}
]
}
}
Notice that the response includes "100" and "1000" (which shouldn’t be in the range) and excludes "2" (which should be in the range).
The Solution: Zero-Padding
Zero-padding solves the ASCII comparison problem by ensuring all numbers have the same length. When numbers are zero-padded to the same length (e.g., "0001", "0002", "0010", "0100", "1000"), ASCII comparison matches numeric order:
0001
0002
0010
0100
1000
With zero-padding, all strings share the same length and align perfectly on the left. This uniform length guarantees that lexicographic (ASCII) ordering will match numeric ordering: "0001" < "0002" < "0010" < "0100" < "1000" (correct numeric order!)
Let’s see this solution in action:
@Test
public void flattenedNumericRange_ZeroPaddedNumbers_WorksWithRangeQueries() throws Exception {
try (OpenSearchTestIndex testIndex = fixture.createTestIndex(mapping ->
mapping.properties("attribute", Property.of(p -> p.flatObject(f -> f))))) {
// Create documents with zero-padded numbers (4 digits)
ProductWithNumericAttribute[] products = new ProductWithNumericAttribute[]{
new ProductWithNumericAttribute("1", "Product1", new NumericAttribute("0001")),
new ProductWithNumericAttribute("2", "Product2", new NumericAttribute("0002")),
new ProductWithNumericAttribute("3", "Product3", new NumericAttribute("0010")),
new ProductWithNumericAttribute("4", "Product4", new NumericAttribute("0100")),
new ProductWithNumericAttribute("5", "Product5", new NumericAttribute("1000"))
};
testIndex.indexDocuments(products);
// Query for values between "0001" and "0010" (inclusive)
SearchResponse<ProductWithNumericAttribute> result = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of("0001"))
.lte(JsonData.of("0010"))
)
),
ProductWithNumericAttribute.class
);
// Should match Product1 (0001), Product2 (0002), and Product3 (0010)
assertThat(result.hits().total().value()).isEqualTo(3);
assertThat(result.hits().hits().stream()
.map(h -> h.source().getId())
.sorted())
.containsExactly("1", "2", "3");
}
}
This test demonstrates the solution! With zero-padded numbers, the range query correctly matches all values between 1 and 10, including 2 which was missing in the unpadded version.
Key points from this solution:
- Zero-padding ensures all numbers have the same length
- ASCII comparison now matches numeric order
- Range queries work correctly with zero-padded numbers
- All expected values are correctly included in the results
HTTP Commands for OpenSearch Dashboards
Index Creation Request:
PUT /range_query_numeric_padded
{
"mappings": {
"properties": {
"id": { "type": "keyword" },
"name": { "type": "keyword" },
"attribute": {
"type": "flat_object"
}
}
},
"settings": {
"number_of_replicas": 0,
"number_of_shards": 1
}
}
Index Documents Request:
POST /range_query_numeric_padded/_bulk
{ "index" : { "_index" : "range_query_numeric_padded", "_id" : "1" } }
{ "id" : "1", "name" : "Product1", "attribute" : { "value" : "0001" } }
{ "index" : { "_index" : "range_query_numeric_padded", "_id" : "2" } }
{ "id" : "2", "name" : "Product2", "attribute" : { "value" : "0002" } }
{ "index" : { "_index" : "range_query_numeric_padded", "_id" : "3" } }
{ "id" : "3", "name" : "Product3", "attribute" : { "value" : "0010" } }
{ "index" : { "_index" : "range_query_numeric_padded", "_id" : "4" } }
{ "id" : "4", "name" : "Product4", "attribute" : { "value" : "0100" } }
{ "index" : { "_index" : "range_query_numeric_padded", "_id" : "5" } }
{ "id" : "5", "name" : "Product5", "attribute" : { "value" : "1000" } }
Search Request 1: Range Query for Values Between 1 and 10 (Works with Zero-Padding)
POST /range_query_numeric_padded/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "0001",
"lte": "0010"
}
}
}
}
Search Response 1:
{
"took": 2,
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"hits": [
{
"_id": "1",
"_source": {
"id": "1",
"name": "Product1",
"attribute": {
"value": "0001"
}
}
},
{
"_id": "2",
"_source": {
"id": "2",
"name": "Product2",
"attribute": {
"value": "0002"
}
}
},
{
"_id": "3",
"_source": {
"id": "3",
"name": "Product3",
"attribute": {
"value": "0010"
}
}
}
]
}
}
Perfect! Now the query correctly includes all three values (1, 2, and 10) and excludes values outside the range.
Max-Size Padding: Supporting Full Integer Ranges
While zero-padding solves the ASCII comparison problem, it introduces a significant tradeoff: all numbers must be padded to the same maximum length. To support the full integer range (up to Integer.MAX_VALUE = 2,147,483,647, which is 10 digits), all numbers must be padded to 10 digits.
This means even small numbers like 1 must be stored as "0000000001" instead of "1", which increases storage requirements. Let’s see this in action:
@Test
public void flattenedNumericRange_MaxSizePadding_RequiredForFullIntegerRange() throws Exception {
try (OpenSearchTestIndex testIndex = fixture.createTestIndex(mapping ->
mapping.properties("attribute", Property.of(p -> p.flatObject(f -> f))))) {
// Create documents with numbers padded to 10 digits (max size for Integer.MAX_VALUE = 2,147,483,647)
ProductWithNumericAttribute[] products = new ProductWithNumericAttribute[]{
new ProductWithNumericAttribute("1", "Product1", new NumericAttribute("0000000001")),
new ProductWithNumericAttribute("2", "Product2", new NumericAttribute("0000000002")),
new ProductWithNumericAttribute("3", "Product3", new NumericAttribute("0000000010")),
new ProductWithNumericAttribute("4", "Product4", new NumericAttribute("0000000100")),
new ProductWithNumericAttribute("5", "Product5", new NumericAttribute("0000001000")),
new ProductWithNumericAttribute("6", "Product6", new NumericAttribute("2147483647")) // Integer.MAX_VALUE
};
testIndex.indexDocuments(products);
// Query for values between "0000000001" and "0000000100" (inclusive)
SearchResponse<ProductWithNumericAttribute> result = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of("0000000001"))
.lte(JsonData.of("0000000100"))
)
),
ProductWithNumericAttribute.class
);
// Should match Product1 (0000000001), Product2 (0000000002), Product3 (0000000010), and Product4 (0000000100)
assertThat(result.hits().total().value()).isEqualTo(4);
assertThat(result.hits().hits().stream()
.map(h -> h.source().getId())
.sorted())
.containsExactly("1", "2", "3", "4");
// Query for Integer.MAX_VALUE
SearchResponse<ProductWithNumericAttribute> maxResult = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of("2147483647"))
.lte(JsonData.of("2147483647"))
)
),
ProductWithNumericAttribute.class
);
// Should match Product6 (Integer.MAX_VALUE)
assertThat(maxResult.hits().total().value()).isEqualTo(1);
assertThat(maxResult.hits().hits().get(0).source().getId()).isEqualTo("6");
}
}
This test highlights the tradeoff: to support the full integer range, all numbers must be padded to 10 digits, even if most numbers in your dataset are much smaller. This increases storage requirements but ensures range queries work correctly across the entire integer range.
Key points from this test:
- All numbers must be padded to the same maximum length (10 digits for full integer range)
- Small numbers like
1must be stored as"0000000001"instead of"1" - This increases storage requirements but ensures correct range queries
- Range queries work correctly across the entire integer range when properly padded
HTTP Commands for OpenSearch Dashboards
Index Creation Request:
PUT /range_query_numeric_max_size_padding
{
"mappings": {
"properties": {
"attribute": {
"type": "flat_object"
}
}
},
"settings": {
"number_of_replicas": 0,
"number_of_shards": 1
}
}
Index Documents Request:
POST /range_query_numeric_max_size_padding/_bulk
{ "index" : { "_index" : "range_query_numeric_max_size_padding", "_id" : "1" } }
{ "id" : "1", "name" : "Product1", "attribute" : { "value" : "0000000001" } }
{ "index" : { "_index" : "range_query_numeric_max_size_padding", "_id" : "2" } }
{ "id" : "2", "name" : "Product2", "attribute" : { "value" : "0000000002" } }
{ "index" : { "_index" : "range_query_numeric_max_size_padding", "_id" : "3" } }
{ "id" : "3", "name" : "Product3", "attribute" : { "value" : "0000000010" } }
{ "index" : { "_index" : "range_query_numeric_max_size_padding", "_id" : "4" } }
{ "id" : "4", "name" : "Product4", "attribute" : { "value" : "0000000100" } }
{ "index" : { "_index" : "range_query_numeric_max_size_padding", "_id" : "5" } }
{ "id" : "5", "name" : "Product5", "attribute" : { "value" : "0000001000" } }
{ "index" : { "_index" : "range_query_numeric_max_size_padding", "_id" : "6" } }
{ "id" : "6", "name" : "Product6", "attribute" : { "value" : "2147483647" } }
Search Request 1: Range Query for Values Between 1 and 100 (Works with Max-Size Padding)
POST /range_query_numeric_max_size_padding/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "0000000001",
"lte": "0000000100"
}
}
}
}
Search Response 1:
{
"took": 24,
"timed_out": false,
"_shards": {
"failed": 0,
"skipped": 0,
"successful": 1,
"total": 1
},
"hits": {
"total": {
"value": 4,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_id": "1",
"_index": "range_query_numeric_max_size_padding",
"_score": 1.0,
"_source": {
"id": "1",
"name": "Product1",
"attribute": {
"value": "0000000001"
}
}
},
{
"_id": "2",
"_index": "range_query_numeric_max_size_padding",
"_score": 1.0,
"_source": {
"id": "2",
"name": "Product2",
"attribute": {
"value": "0000000002"
}
}
},
{
"_id": "3",
"_index": "range_query_numeric_max_size_padding",
"_score": 1.0,
"_source": {
"id": "3",
"name": "Product3",
"attribute": {
"value": "0000000010"
}
}
},
{
"_id": "4",
"_index": "range_query_numeric_max_size_padding",
"_score": 1.0,
"_source": {
"id": "4",
"name": "Product4",
"attribute": {
"value": "0000000100"
}
}
}
]
}
}
Search Request 2: Range Query for Integer.MAX_VALUE
POST /range_query_numeric_max_size_padding/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "2147483647",
"lte": "2147483647"
}
}
}
}
Search Response 2:
{
"took": 2,
"timed_out": false,
"_shards": {
"failed": 0,
"skipped": 0,
"successful": 1,
"total": 1
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_id": "6",
"_index": "range_query_numeric_max_size_padding",
"_score": 1.0,
"_source": {
"id": "6",
"name": "Product6",
"attribute": {
"value": "2147483647"
}
}
}
]
}
}
Negative Numbers: Additional Complications
Negative numbers introduce additional complications in ASCII comparison. Negative numbers start with a minus sign (-), which means they come before positive numbers in ASCII order. However, unpadded negative numbers also suffer from the same ASCII ordering issues as positive numbers.
Let’s explore this behavior with two separate tests:
Unpadded Negative Numbers Fail Range Queries
When negative numbers are stored in flattened fields without padding, they suffer from the same ASCII comparison issues as positive numbers. When comparing these strings, they align on the left side:
-100
-10
-2
-1
0
1
10
String comparison proceeds sequentially from left to right, examining one character at a time. Consider "-10" versus "-2": the first characters match ('-' = '-'), but the second character comparison ('1' vs '2') determines the result. Because '1' < '2', the algorithm concludes "-10" < "-2" without examining further characters—even though numerically -10 < -2, which is correct. The real problem emerges with "-1" versus "-10": after matching the first two characters ('-' and '1'), the comparison continues with "-1"‘s end-of-string versus "-10"‘s '0'. Since '1' > '0', we incorrectly get "-1" > "-10" in ASCII order, contradicting the numeric relationship (-1 > -10). This fundamental mismatch causes range queries to fail.
@Test
public void flattenedNumericRange_UnpaddedNegativeNumbers_FailsRangeQueries() throws Exception {
// Create a test index with flattened mapping for the attribute field
try (OpenSearchTestIndex testIndex = fixture.createTestIndex(mapping ->
mapping.properties("attribute", Property.of(p -> p.flatObject(f -> f))))) {
// Create documents with unpadded negative numbers
// ASCII order: "-1", "-10", "-100", "-2", "0", "1", "10" (partial order)
ProductWithNumericAttribute[] products = new ProductWithNumericAttribute[]{
new ProductWithNumericAttribute("1", "Product1", new NumericAttribute("-100")),
new ProductWithNumericAttribute("2", "Product2", new NumericAttribute("-10")),
new ProductWithNumericAttribute("3", "Product3", new NumericAttribute("-2")),
new ProductWithNumericAttribute("4", "Product4", new NumericAttribute("-1")),
new ProductWithNumericAttribute("5", "Product5", new NumericAttribute("0")),
new ProductWithNumericAttribute("6", "Product6", new NumericAttribute("1")),
new ProductWithNumericAttribute("7", "Product7", new NumericAttribute("10"))
};
testIndex.indexDocuments(products);
// Query for range [-10, 10] with unpadded numbers
// The query fails because ASCII comparison doesn't match numeric ordering for negative numbers
// Result: Matches "-10", "-100", "-2", "0", "1", "10" but incorrectly excludes "-1"
SearchResponse<ProductWithNumericAttribute> result = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of("-10"))
.lte(JsonData.of("10"))
)
),
ProductWithNumericAttribute.class
);
// Demonstrates the failure with unpadded negative numbers
assertThat(result.hits().total().value()).isEqualTo(6); // "-10", "-100", "-2", "0", "1", "10" (missing "-1")
List<String> matchedIds = result.hits().hits().stream()
.map(h -> h.source().getId())
.sorted()
.collect(Collectors.toList());
// "-1" is missing even though it should be in the range [-10, 10] numerically
// This demonstrates the incorrect behavior with unpadded negative numbers
assertThat(matchedIds).containsExactly("1", "2", "3", "5", "6", "7"); // Missing "4" (-1)!
}
}
This test demonstrates the failure with unpadded negative numbers. The query for range [-10, 10] should match all values from -10 to 10, including -1. However, based on ASCII comparison, "-1" should satisfy both conditions ("-1" > "-10" and "-1" < "10"), yet it’s incorrectly excluded from the results.
The actual behavior: The query matches "-10", "-100", "-2", "0", "1", "10" but incorrectly excludes "-1". This demonstrates that range queries on unpadded negative numbers spanning from negative to positive values produce incorrect results.
Key points from this test:
- Unpadded negative numbers fail range queries due to ASCII comparison issues
- Negative numbers come before positive numbers in ASCII order (because
"-"<"0") - The value
"-1"is incorrectly excluded from the range[-10, 10]when unpadded, even though it logically should be included - Range queries spanning from negative to positive values produce incorrect results with unpadded numbers
HTTP Commands for OpenSearch Dashboards
Index Creation Request:
PUT /range_query_numeric_negative_unpadded
{
"mappings": {
"properties": {
"attribute": {
"type": "flat_object"
}
}
},
"settings": {
"number_of_replicas": 0,
"number_of_shards": 1
}
}
Index Documents Request:
POST /range_query_numeric_negative_unpadded/_bulk
{ "index" : { "_index" : "range_query_numeric_negative_unpadded", "_id" : "1" } }
{ "id" : "1", "name" : "Product1", "attribute" : { "value" : "-100" } }
{ "index" : { "_index" : "range_query_numeric_negative_unpadded", "_id" : "2" } }
{ "id" : "2", "name" : "Product2", "attribute" : { "value" : "-10" } }
{ "index" : { "_index" : "range_query_numeric_negative_unpadded", "_id" : "3" } }
{ "id" : "3", "name" : "Product3", "attribute" : { "value" : "-2" } }
{ "index" : { "_index" : "range_query_numeric_negative_unpadded", "_id" : "4" } }
{ "id" : "4", "name" : "Product4", "attribute" : { "value" : "-1" } }
{ "index" : { "_index" : "range_query_numeric_negative_unpadded", "_id" : "5" } }
{ "id" : "5", "name" : "Product5", "attribute" : { "value" : "0" } }
{ "index" : { "_index" : "range_query_numeric_negative_unpadded", "_id" : "6" } }
{ "id" : "6", "name" : "Product6", "attribute" : { "value" : "1" } }
{ "index" : { "_index" : "range_query_numeric_negative_unpadded", "_id" : "7" } }
{ "id" : "7", "name" : "Product7", "attribute" : { "value" : "10" } }
Search Request: Range Query for Values Between -10 and 10 (Fails with Unpadded Negative Numbers)
POST /range_query_numeric_negative_unpadded/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "-10",
"lte": "10"
}
}
}
}
Search Response:
{
"took": 53,
"timed_out": false,
"_shards": {
"failed": 0,
"skipped": 0,
"successful": 1,
"total": 1
},
"hits": {
"total": {
"value": 6,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_id": "1",
"_index": "range_query_numeric_negative_unpadded",
"_score": 1.0,
"_source": {
"id": "1",
"name": "Product1",
"attribute": {
"value": "-100"
}
}
},
{
"_id": "2",
"_index": "range_query_numeric_negative_unpadded",
"_score": 1.0,
"_source": {
"id": "2",
"name": "Product2",
"attribute": {
"value": "-10"
}
}
},
{
"_id": "3",
"_index": "range_query_numeric_negative_unpadded",
"_score": 1.0,
"_source": {
"id": "3",
"name": "Product3",
"attribute": {
"value": "-2"
}
}
},
{
"_id": "5",
"_index": "range_query_numeric_negative_unpadded",
"_score": 1.0,
"_source": {
"id": "5",
"name": "Product5",
"attribute": {
"value": "0"
}
}
},
{
"_id": "6",
"_index": "range_query_numeric_negative_unpadded",
"_score": 1.0,
"_source": {
"id": "6",
"name": "Product6",
"attribute": {
"value": "1"
}
}
},
{
"_id": "7",
"_index": "range_query_numeric_negative_unpadded",
"_score": 1.0,
"_source": {
"id": "7",
"name": "Product7",
"attribute": {
"value": "10"
}
}
}
]
}
}
Notice that the response includes "-100" (which shouldn’t be in the range) and excludes "-1" (which should be in the range). The value "-2" is included, but "-1" (Product4) is missing, demonstrating the incorrect behavior with unpadded negative numbers.
Zero-Padded Negative Numbers: Still Problematic for Mixed Ranges
Unfortunately, zero-padding alone does not solve the problem for negative numbers when the range spans from negative to positive values. When comparing zero-padded negative and positive numbers, they align on the left side:
-0100
-0010
-0002
-0001
0000
0001
0010
The lexicographic comparison algorithm processes characters sequentially from the start. For "-0002" and "-0010", the first three character pairs are identical ('-', '0', '0'), so the comparison hinges on the fourth position: '0' versus '1'. Because '0' < '1' in ASCII, the result is "-0002" < "-0010"—but this contradicts the numeric truth where -2 > -10. The presence of the minus sign combined with zero-padding creates this ordering paradox, causing range queries to malfunction when the range crosses from negative to positive territory.
This test demonstrates that even with zero-padding, negative numbers still fail range queries due to ASCII comparison:
@Test
public void flattenedNumericRange_ZeroPaddedNegativeNumbers_StillFailsForMixedRanges() throws Exception {
// Create a test index with flattened mapping for the attribute field
try (OpenSearchTestIndex testIndex = fixture.createTestIndex(mapping ->
mapping.properties("attribute", Property.of(p -> p.flatObject(f -> f))))) {
// Create documents with zero-padded negative numbers (4 digits after minus sign)
ProductWithNumericAttribute[] products = new ProductWithNumericAttribute[]{
new ProductWithNumericAttribute("1", "Product1", new NumericAttribute("-0100")),
new ProductWithNumericAttribute("2", "Product2", new NumericAttribute("-0010")),
new ProductWithNumericAttribute("3", "Product3", new NumericAttribute("-0002")),
new ProductWithNumericAttribute("4", "Product4", new NumericAttribute("-0001")),
new ProductWithNumericAttribute("5", "Product5", new NumericAttribute("0000")),
new ProductWithNumericAttribute("6", "Product6", new NumericAttribute("0001")),
new ProductWithNumericAttribute("7", "Product7", new NumericAttribute("0010"))
};
testIndex.indexDocuments(products);
// Query for range [-10, 10] with padded numbers
// Zero-padding alone doesn't solve the problem: "-0002" and "-0001" are < "-0010" in ASCII
// Result: Matches "-0010", "-0100", "0000", "0001", "0010" but incorrectly excludes "-0002" and "-0001"
SearchResponse<ProductWithNumericAttribute> result = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of("-0010"))
.lte(JsonData.of("0010"))
)
),
ProductWithNumericAttribute.class
);
// The result shows that even with zero-padding, negative numbers don't work correctly
// The query matches "-0010", "-0100", "0000", "0001", "0010" but misses "-0002" and "-0001"
assertThat(result.hits().total().value()).isEqualTo(5); // "-0010", "-0100", "0000", "0001", "0010"
List<String> matchedIds = result.hits().hits().stream()
.map(h -> h.source().getId())
.sorted()
.collect(Collectors.toList());
// Missing "-0002" (ID 3) and "-0001" (ID 4) because they are < "-0010" in ASCII
// This demonstrates that zero-padding doesn't fully solve the problem for negative numbers
assertThat(matchedIds).containsExactly("1", "2", "5", "6", "7"); // Missing "3" (-0002) and "4" (-0001)!
}
}
This test demonstrates that zero-padding alone does not solve the problem for negative numbers when querying ranges that span from negative to positive values.
The Problem: Even with zero-padding, "-0002" and "-0001" are less than "-0010" in ASCII comparison (because after "-00", the next character "0" < "1"). This means they don’t satisfy gte("-0010"), even though numerically they are greater than -10.
The Result: The query matches "-0010", "-0100", "0000", "0001", "0010" but incorrectly excludes "-0002" and "-0001". This shows that zero-padding doesn’t fully solve the ASCII comparison problem for negative numbers when ranges span across zero.
Key points from this test:
- Zero-padding alone does NOT solve the ASCII comparison problem for negative numbers when ranges span from negative to positive
- Values like
"-0002"and"-0001"are less than"-0010"in ASCII, so they fail thegtecondition - This demonstrates a fundamental limitation of using ASCII comparison for numeric ranges that include negative numbers
- The solution is to use a two’s complement approach (see next section)
HTTP Commands for OpenSearch Dashboards (Zero-Padded Negative Numbers – Still Problematic)
Index Creation Request:
PUT /range_query_numeric_negative_padded
{
"mappings": {
"properties": {
"attribute": {
"type": "flat_object"
}
}
},
"settings": {
"number_of_replicas": 0,
"number_of_shards": 1
}
}
Index Documents Request:
POST /range_query_numeric_negative_padded/_bulk
{ "index" : { "_index" : "range_query_numeric_negative_padded", "_id" : "1" } }
{ "id" : "1", "name" : "Product1", "attribute" : { "value" : "-0100" } }
{ "index" : { "_index" : "range_query_numeric_negative_padded", "_id" : "2" } }
{ "id" : "2", "name" : "Product2", "attribute" : { "value" : "-0010" } }
{ "index" : { "_index" : "range_query_numeric_negative_padded", "_id" : "3" } }
{ "id" : "3", "name" : "Product3", "attribute" : { "value" : "-0002" } }
{ "index" : { "_index" : "range_query_numeric_negative_padded", "_id" : "4" } }
{ "id" : "4", "name" : "Product4", "attribute" : { "value" : "-0001" } }
{ "index" : { "_index" : "range_query_numeric_negative_padded", "_id" : "5" } }
{ "id" : "5", "name" : "Product5", "attribute" : { "value" : "0000" } }
{ "index" : { "_index" : "range_query_numeric_negative_padded", "_id" : "6" } }
{ "id" : "6", "name" : "Product6", "attribute" : { "value" : "0001" } }
{ "index" : { "_index" : "range_query_numeric_negative_padded", "_id" : "7" } }
{ "id" : "7", "name" : "Product7", "attribute" : { "value" : "0010" } }
Search Request: Range Query for Values Between -10 and 10 (Still Fails with Zero-Padded Negative Numbers)
POST /range_query_numeric_negative_padded/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "-0010",
"lte": "0010"
}
}
}
}
Search Response:
{
"took": 20,
"timed_out": false,
"_shards": {
"failed": 0,
"skipped": 0,
"successful": 1,
"total": 1
},
"hits": {
"total": {
"value": 5,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_id": "1",
"_index": "range_query_numeric_negative_padded",
"_score": 1.0,
"_source": {
"id": "1",
"name": "Product1",
"attribute": {
"value": "-0100"
}
}
},
{
"_id": "2",
"_index": "range_query_numeric_negative_padded",
"_score": 1.0,
"_source": {
"id": "2",
"name": "Product2",
"attribute": {
"value": "-0010"
}
}
},
{
"_id": "5",
"_index": "range_query_numeric_negative_padded",
"_score": 1.0,
"_source": {
"id": "5",
"name": "Product5",
"attribute": {
"value": "0000"
}
}
},
{
"_id": "6",
"_index": "range_query_numeric_negative_padded",
"_score": 1.0,
"_source": {
"id": "6",
"name": "Product6",
"attribute": {
"value": "0001"
}
}
},
{
"_id": "7",
"_index": "range_query_numeric_negative_padded",
"_score": 1.0,
"_source": {
"id": "7",
"name": "Product7",
"attribute": {
"value": "0010"
}
}
}
]
}
}
Notice that the response includes "-0100" (which shouldn’t be in the range) and excludes "-0002" and "-0001" (which should be in the range). This demonstrates that even with zero-padding, negative numbers still fail range queries when the range spans from negative to positive values.
Two’s Complement Approach: The Solution for Negative Numbers
The two’s complement approach solves the problem by converting all negative numbers to positive values before storing them. This ensures that ASCII ordering matches numeric ordering across the entire range, including negative and positive values.
When using two’s complement with an offset (e.g., adding Integer.MIN_VALUE absolute value = 2,147,483,648), all values become positive. Here’s how the conversion works:
Original numeric values:
-100, -10, -2, -1, 0, 1, 10
After adding offset (2,147,483,648):
2,147,483,548 (was -100)
2,147,483,638 (was -10)
2,147,483,646 (was -2)
2,147,483,647 (was -1)
2,147,483,648 (was 0)
2,147,483,649 (was 1)
2,147,483,658 (was 10)
The offset ensures all values are already the same length (10 digits), so they align perfectly when stored as strings. When comparing these strings, they align on the left side:
2147483548
2147483638
2147483646
2147483647
2147483648
2147483649
2147483658
By applying the offset, every value becomes a positive integer with uniform 10-digit length. This standardization means that when OpenSearch performs string comparison, the lexicographic ordering will precisely mirror the underlying numeric ordering: "2147483548" < "2147483638" < "2147483646" < "2147483647" < "2147483648" < "2147483649" < "2147483658" (which correctly represents the numeric sequence -100, -10, -2, -1, 0, 1, 10 after applying the offset!)
How it works:
- Add the absolute value of the minimum possible value (e.g.,
Integer.MIN_VALUE= 2,147,483,648 forint) to all numbers - Store all values as strings without signs (the offset ensures they’re already the same length)
- All values become positive, ensuring correct ASCII ordering
For Integer (int) range:
- Offset:
2,147,483,648(absolute value ofInteger.MIN_VALUE) - Max value after offset:
4,294,967,295 - All values are automatically 10 digits (no padding needed)
For Long range:
- Offset:
9,223,372,036,854,775,808(absolute value ofLong.MIN_VALUE) - Max value after offset:
18,446,744,073,709,551,615 - All values are automatically 20 digits (no padding needed)
Here’s a test that demonstrates this approach:
@Test
public void flattenedNumericRange_TwosComplement_WorksWithRangeQueries() throws Exception {
// Create a test index with flattened mapping for the attribute field
try (OpenSearchTestIndex testIndex = fixture.createTestIndex(mapping ->
mapping.properties("attribute", Property.of(p -> p.flatObject(f -> f))))) {
// Two's complement offset: Integer.MIN_VALUE absolute value = 2,147,483,648
// This converts all negative numbers to positive, ensuring correct ASCII ordering
// The offset ensures all values are already 10 digits (max value after offset: 4,294,967,295)
final long offset = 2147483648L; // |Integer.MIN_VALUE|
ProductWithNumericAttribute[] products = new ProductWithNumericAttribute[]{
new ProductWithNumericAttribute("1", "Product1", new NumericAttribute(String.valueOf(-100 + offset))),
new ProductWithNumericAttribute("2", "Product2", new NumericAttribute(String.valueOf(-10 + offset))),
new ProductWithNumericAttribute("3", "Product3", new NumericAttribute(String.valueOf(-2 + offset))),
new ProductWithNumericAttribute("4", "Product4", new NumericAttribute(String.valueOf(-1 + offset))),
new ProductWithNumericAttribute("5", "Product5", new NumericAttribute(String.valueOf(0 + offset))),
new ProductWithNumericAttribute("6", "Product6", new NumericAttribute(String.valueOf(1 + offset))),
new ProductWithNumericAttribute("7", "Product7", new NumericAttribute(String.valueOf(10 + offset)))
};
testIndex.indexDocuments(products);
// Query for range [-10, 10] using two's complement offset values
// Convert range bounds to offset values: -10 + offset and 10 + offset
SearchResponse<ProductWithNumericAttribute> result = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of(String.valueOf(-10 + offset)))
.lte(JsonData.of(String.valueOf(10 + offset)))
)
),
ProductWithNumericAttribute.class
);
// All values in range [-10, 10] are correctly included
assertThat(result.hits().total().value()).isEqualTo(6); // -10, -2, -1, 0, 1, 10 (excludes -100)
List<String> matchedIds = result.hits().hits().stream()
.map(h -> h.source().getId())
.sorted()
.collect(Collectors.toList());
// All values in range are correctly included
// Note: Product1 (-100) is excluded because it's outside the range [-10, 10]
assertThat(matchedIds).containsExactly("2", "3", "4", "5", "6", "7");
}
}
Key points from this test:
- Two’s complement approach solves the ASCII comparison problem for negative numbers by converting them to positive values
- The offset ensures all values are the same length (10 digits) without needing zero-padding, and they’re stored as strings without signs, ensuring correct ASCII ordering
- Range queries work correctly across the entire range, including negative and positive values
- The offset value must be added to both the stored values and the query bounds
HTTP Commands for OpenSearch Dashboards (Two’s Complement Approach)
Index Creation Request:
PUT /range_query_numeric_twos_complement
{
"mappings": {
"properties": {
"attribute": {
"type": "flat_object"
}
}
},
"settings": {
"number_of_replicas": 0,
"number_of_shards": 1
}
}
Index Documents Request:
POST /range_query_numeric_twos_complement/_bulk
{ "index" : { "_index" : "range_query_numeric_twos_complement", "_id" : "1" } }
{ "id" : "1", "name" : "Product1", "attribute" : { "value" : "2147483548" } }
{ "index" : { "_index" : "range_query_numeric_twos_complement", "_id" : "2" } }
{ "id" : "2", "name" : "Product2", "attribute" : { "value" : "2147483638" } }
{ "index" : { "_index" : "range_query_numeric_twos_complement", "_id" : "3" } }
{ "id" : "3", "name" : "Product3", "attribute" : { "value" : "2147483646" } }
{ "index" : { "_index" : "range_query_numeric_twos_complement", "_id" : "4" } }
{ "id" : "4", "name" : "Product4", "attribute" : { "value" : "2147483647" } }
{ "index" : { "_index" : "range_query_numeric_twos_complement", "_id" : "5" } }
{ "id" : "5", "name" : "Product5", "attribute" : { "value" : "2147483648" } }
{ "index" : { "_index" : "range_query_numeric_twos_complement", "_id" : "6" } }
{ "id" : "6", "name" : "Product6", "attribute" : { "value" : "2147483649" } }
{ "index" : { "_index" : "range_query_numeric_twos_complement", "_id" : "7" } }
{ "id" : "7", "name" : "Product7", "attribute" : { "value" : "2147483658" } }
Search Request: Range Query for Values Between -10 and 10 (Using Two’s Complement Offset)
POST /range_query_numeric_twos_complement/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "2147483638",
"lte": "2147483658"
}
}
}
}
Search Response:
{
"took": 20,
"timed_out": false,
"_shards": {
"failed": 0,
"skipped": 0,
"successful": 1,
"total": 1
},
"hits": {
"total": {
"value": 6,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_id": "2",
"_index": "range_query_numeric_twos_complement",
"_score": 1.0,
"_source": {
"id": "2",
"name": "Product2",
"attribute": {
"value": "2147483638"
}
}
},
{
"_id": "3",
"_index": "range_query_numeric_twos_complement",
"_score": 1.0,
"_source": {
"id": "3",
"name": "Product3",
"attribute": {
"value": "2147483646"
}
}
},
{
"_id": "4",
"_index": "range_query_numeric_twos_complement",
"_score": 1.0,
"_source": {
"id": "4",
"name": "Product4",
"attribute": {
"value": "2147483647"
}
}
},
{
"_id": "5",
"_index": "range_query_numeric_twos_complement",
"_score": 1.0,
"_source": {
"id": "5",
"name": "Product5",
"attribute": {
"value": "2147483648"
}
}
},
{
"_id": "6",
"_index": "range_query_numeric_twos_complement",
"_score": 1.0,
"_source": {
"id": "6",
"name": "Product6",
"attribute": {
"value": "2147483649"
}
}
},
{
"_id": "7",
"_index": "range_query_numeric_twos_complement",
"_score": 1.0,
"_source": {
"id": "7",
"name": "Product7",
"attribute": {
"value": "2147483658"
}
}
}
]
}
}
Notice that all values in the range [-10, 10] are correctly included: -10 (2147483638), -2 (2147483646), -1 (2147483647), 0 (2147483648), 1 (2147483649), and 10 (2147483658). The value -100 (2147483548) is correctly excluded because it’s outside the range. This demonstrates that the two’s complement approach correctly handles range queries spanning negative to positive values.
Single Point Numeric Range Queries
⚠️ Note: In OpenSearch versions prior to 2.19.3, a bug required specifying both parts (gte/lte or gt/lt) of a range query when querying flattened fields, otherwise a null pointer exception would occur: “failed to create query: Cannot invoke "Object.toString()" because "inputValue" is null”. This bug has been fixed in OpenSearch 2.19.3 or higher. See GitHub issue #19890 for more details.
Just like with dates, numeric range queries support all the standard operators: gte, lte, gt, and lt. Let’s see one example below:
@Test
public void flattenedNumericRange_ZeroPaddedNumbers_GreaterThanOrEqualSinglePoint() throws Exception {
try (OpenSearchTestIndex testIndex = fixture.createTestIndex(mapping ->
mapping.properties("attribute", Property.of(p -> p.flatObject(f -> f))))) {
// Create documents with zero-padded numbers (4 digits)
ProductWithNumericAttribute[] products = new ProductWithNumericAttribute[]{
new ProductWithNumericAttribute("1", "Product1", new NumericAttribute("0001")),
new ProductWithNumericAttribute("2", "Product2", new NumericAttribute("0002")),
new ProductWithNumericAttribute("3", "Product3", new NumericAttribute("0010")),
new ProductWithNumericAttribute("4", "Product4", new NumericAttribute("0100"))
};
testIndex.indexDocuments(products);
// Query for values >= 0002
SearchResponse<ProductWithNumericAttribute> result = loggingOpenSearchClient.search(s -> s
.index(testIndex.getName())
.query(q -> q
.range(r -> r
.field("attribute.value")
.gte(JsonData.of("0002"))
)
),
ProductWithNumericAttribute.class
);
// Should match Product2 (0002), Product3 (0010), and Product4 (0100)
assertThat(result.hits().total().value()).isEqualTo(3);
assertThat(result.hits().hits().stream()
.map(h -> h.source().getId())
.sorted())
.containsExactly("2", "3", "4");
}
}
This test shows that gte (greater than or equal) correctly matches documents with values greater than or equal to the specified value. The lte (less than or equal), gt (greater than), and lt (less than) operators work similarly, with gte and lte being inclusive (matching the boundary value) and gt and lt being exclusive (not matching the boundary value). The corresponding test methods are flattenedNumericRange_ZeroPaddedNumbers_LessThanOrEqualSinglePoint(), flattenedNumericRange_ZeroPaddedNumbers_GreaterThanSinglePoint(), and flattenedNumericRange_ZeroPaddedNumbers_LessThanSinglePoint().
HTTP Commands for OpenSearch Dashboards
Index Creation Request:
PUT /range_query_numeric_single_point
{
"mappings": {
"properties": {
"attribute": {
"type": "flat_object"
}
}
},
"settings": {
"number_of_replicas": 0,
"number_of_shards": 1
}
}
Index Documents Request:
POST /range_query_numeric_single_point/_bulk
{ "index" : { "_index" : "range_query_numeric_single_point", "_id" : "1" } }
{ "id" : "1", "name" : "Product1", "attribute" : { "value" : "0001" } }
{ "index" : { "_index" : "range_query_numeric_single_point", "_id" : "2" } }
{ "id" : "2", "name" : "Product2", "attribute" : { "value" : "0002" } }
{ "index" : { "_index" : "range_query_numeric_single_point", "_id" : "3" } }
{ "id" : "3", "name" : "Product3", "attribute" : { "value" : "0010" } }
{ "index" : { "_index" : "range_query_numeric_single_point", "_id" : "4" } }
{ "id" : "4", "name" : "Product4", "attribute" : { "value" : "0100" } }
Search Request: Range Query for Values >= 0002 (gte Single Point)
POST /range_query_numeric_single_point/_search
{
"query": {
"range": {
"attribute.value": {
"gte": "0002"
}
}
}
}
Search Response:
{
"took": 2,
"timed_out": false,
"_shards": {
"failed": 0,
"skipped": 0,
"successful": 1,
"total": 1
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_id": "2",
"_index": "range_query_numeric_single_point",
"_score": 1.0,
"_source": {
"id": "2",
"name": "Product2",
"attribute": {
"value": "0002"
}
}
},
{
"_id": "3",
"_index": "range_query_numeric_single_point",
"_score": 1.0,
"_source": {
"id": "3",
"name": "Product3",
"attribute": {
"value": "0010"
}
}
},
{
"_id": "4",
"_index": "range_query_numeric_single_point",
"_score": 1.0,
"_source": {
"id": "4",
"name": "Product4",
"attribute": {
"value": "0100"
}
}
}
]
}
}
Sloth Summary
Throughout this post, we’ve explored the unique challenges and solutions for performing numeric range queries on flattened fields. While this article focused specifically on numeric range queries, if you’re looking for more detailed information about flattened fields in general—including how they work, when to use them, searching patterns, and array limitations—be sure to check out our OpenSearch Flattened Field Type in Java blog post.
If you’re working with date range queries on flattened fields, be sure to check out our companion post on OpenSearch Flattened Field Range Queries in Java: ISO8601 Dates.
Let’s recap the key takeaways from this post:
Numeric Range Queries on Flattened Fields
- Unpadded numbers fail: Unpadded numbers (e.g.,
"1","2","10") fail range queries because ASCII comparison doesn’t match numeric order ("10"<"2"in ASCII, but10 > 2numerically) - Zero-padding solves the problem: Zero-padding numbers to the same length (e.g.,
"0001","0002","0010") makes ASCII comparison match numeric order - Max-size padding required: To support the full integer range (up to
Integer.MAX_VALUE= 2,147,483,647), all numbers must be padded to 10 digits, even small numbers like1must be stored as"0000000001" - Storage tradeoff: Zero-padding increases storage requirements but ensures correct range queries
- Negative numbers need special handling: Zero-padding alone doesn’t work for negative numbers when ranges span from negative to positive values. The two’s complement approach is required for correct range queries across negative and positive numbers
- All range operators work:
gte,lte,gt, andltall work correctly with zero-padded numbers (and two’s complement for negative numbers)
Key Recommendations
- For numbers: Always zero-pad numbers to a consistent length when storing them in flattened fields if you need to perform range queries
- Consider the tradeoffs: Zero-padding increases storage requirements but ensures correct range queries
- Plan your padding strategy: Decide on a maximum number size for your use case and pad all numbers to that length
- Handle negative numbers carefully: Use the two’s complement approach if you need to support ranges that span from negative to positive values
- Test your queries: Always test range queries with your actual data to ensure they work as expected
If you’re working with flattened fields and need numeric range query functionality, these patterns will ensure your queries work correctly. Remember: flattened fields store everything as keywords, so lexicographic ordering is what matters, not the native data type ordering!
Happy Code Slothing! 🦥