[BUG] Incorrect docs.deleted count with Soft Delete Enabled #13725

Open
monusingh-1 opened this issue May 17, 2024 · 1 comment
Labels
bug (Something isn't working), Indexing (Indexing, Bulk Indexing and anything related to indexing)

monusingh-1 commented May 17, 2024

Describe the bug

The docs.deleted count reported for OpenSearch indices is inconsistent when soft delete is enabled: it shows more deleted documents than were actually deleted.

Related component

Indexing

To Reproduce

Steps to reproduce

Case with Soft Delete Enabled [Incorrect behavior]:

  1. Create an index with soft delete enabled:
> PUT myindex1
{
  "settings":{
    "number_of_shards": 1,
    "number_of_replicas": 2,
    "index.soft_deletes.enabled" : true
  }
}
  2. Index 2 documents:
> POST /_bulk
{"index": {"_index": "myindex1", "_id": "1"}}
{"field1": "value1", "field2": "value1"}
{"index": {"_index": "myindex1", "_id": "2"}}
{"field1": "value3", "field2": "value2"}
  3. Delete 1 document and issue a refresh:
> DELETE /myindex1/_doc/1
> GET /_refresh
  4. Check the document stats:
> GET /_cat/indices?v
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myindex1             8BsMrKplRdW1yQEpUE6Jag   5   2          1            2     27.9kb          9.3kb

docs.deleted is 2, which is incorrect because only 1 document was deleted.

Check segments

> GET _cat/segments/myindex1?v
index    shard prirep ip      segment generation docs.count docs.deleted  size size.memory committed searchable version compound
myindex1 3     p      x.x.x.x _0               0          1            0 4.1kb        1876 false     true       8.10.1  true
myindex1 4     p      x.x.x.x _0               0          0            1 5.4kb        2084 false     true       8.10.1  true
myindex1 4     p      x.x.x.x _1               1          0            1 2.9kb         852 false     true       8.10.1  true
  5. Issue a force merge:
> POST /myindex1/_forcemerge
  6. Check the document stats again:
> GET /_cat/indices?v
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myindex1             8BsMrKplRdW1yQEpUE6Jag   5   2          1            0     41.8kb         13.9kb

The docs.deleted value is corrected only after the force merge.
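For anyone who wants to script the steps above, here is a minimal sketch that builds the same requests in order. The helper names (`bulk_body`, `repro_requests`, `send`) are made up for illustration, and `send` assumes a cluster reachable over plain HTTP without authentication, which may not match your setup.

```python
import json
import urllib.request


def bulk_body(index, docs):
    """Build the NDJSON payload for POST /_bulk (one action line per document)."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


def repro_requests(index, soft_deletes):
    """Return the reproduction steps as (method, path, body) tuples, in order."""
    settings = {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 2,
            "index.soft_deletes.enabled": soft_deletes,
        }
    }
    docs = {
        "1": {"field1": "value1", "field2": "value1"},
        "2": {"field1": "value3", "field2": "value2"},
    }
    return [
        ("PUT", f"/{index}", json.dumps(settings)),
        ("POST", "/_bulk", bulk_body(index, docs)),
        ("DELETE", f"/{index}/_doc/1", None),
        ("GET", "/_refresh", None),
        ("GET", "/_cat/indices?v", None),
        ("POST", f"/{index}/_forcemerge", None),
        ("GET", "/_cat/indices?v", None),
    ]


def send(base_url, method, path, body=None):
    """Send one request; base_url is an assumption (no TLS or auth handling)."""
    content_type = "application/x-ndjson" if path.endswith("/_bulk") else "application/json"
    request = urllib.request.Request(
        base_url + path,
        data=body.encode("utf-8") if body is not None else None,
        method=method,
        headers={"Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

To run it against a test cluster, loop over `repro_requests("myindex1", True)` and pass each tuple to `send("http://localhost:9200", ...)`; that URL is a placeholder, not something taken from this report.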

Expected Behavior:

The docs.deleted count should accurately reflect the number of documents deleted in the index.

Actual Behavior:

The docs.deleted count appears to be doubled after deletion, only resolving after a force merge operation.
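A rough way to see where the value 2 comes from is to sum docs.deleted over the primary segment rows of the `_cat/segments` output for myindex1. The sketch below uses the rows from this report, with the ip and segment columns separated (they appear fused in the pasted output). It assumes the index-level figure is aggregated over primary shards. It only shows the arithmetic and does not explain why both segments of shard 4 report a deleted document.

```python
from collections import defaultdict

# Rows copied from the _cat/segments output for myindex1 in this report.
SEGMENTS = """\
index    shard prirep ip      segment generation docs.count docs.deleted size  size.memory committed searchable version compound
myindex1 3     p      x.x.x.x _0      0          1          0            4.1kb 1876        false     true       8.10.1  true
myindex1 4     p      x.x.x.x _0      0          0          1            5.4kb 2084        false     true       8.10.1  true
myindex1 4     p      x.x.x.x _1      1          0          1            2.9kb 852         false     true       8.10.1  true
"""


def parse_cat(text):
    """Turn whitespace-separated _cat output (with ?v header) into a list of dicts."""
    header, *rows = [line.split() for line in text.strip().splitlines()]
    return [dict(zip(header, row)) for row in rows]


def deleted_per_shard(rows):
    """Sum docs.deleted per shard, counting primary segments only."""
    totals = defaultdict(int)
    for row in rows:
        if row["prirep"] == "p":
            totals[row["shard"]] += int(row["docs.deleted"])
    return dict(totals)


per_shard = deleted_per_shard(parse_cat(SEGMENTS))
print(per_shard)                # {'3': 0, '4': 2}
print(sum(per_shard.values()))  # 2
```

The total matches the docs.deleted value of 2 shown by `_cat/indices` before the force merge: segments `_0` and `_1` of shard 4 each count one deleted document.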

Case with Soft Delete Disabled [Correct behavior]:

  1. Create an index with soft delete disabled:
> PUT myindex2
{
  "settings":{
    "number_of_shards": 1,
    "number_of_replicas": 2,
    "index.soft_deletes.enabled" : false
  }
}
  2. Index 2 documents:
> POST /_bulk
{"index": {"_index": "myindex2", "_id": "1"}}
{"field1": "value1", "field2": "value1"}
{"index": {"_index": "myindex2", "_id": "2"}}
{"field1": "value3", "field2": "value2"}
  3. Delete 1 document and issue a refresh:
> DELETE /myindex2/_doc/1
> GET /_refresh
  4. Check the document stats:
> GET /_cat/indices?v
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myindex2             hWdayX6oTr64bSPJGGQTFA   1   2          1            1     24.8kb          8.2kb

In this case, docs.deleted shows the correct value without a merge.

Check segments

> GET _cat/segments/myindex2?v
index    shard prirep ip      segment generation docs.count docs.deleted  size size.memory committed searchable version compound
myindex2 3     p      x.x.x.x _0               0          1            0 4.1kb        1876 true      true       8.10.1  true
myindex2 4     p      x.x.x.x _0               0          1            0 4.1kb           0 true      false      8.10.1  true
myindex2 4     p      x.x.x.x _1               1          0            1 2.9kb         852 false     true       8.10.1  true
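
To put the two cases side by side, the pre-merge `_cat/indices` rows can be checked against the number of deletes actually issued (1 in both cases). This is only a sketch of such a check. The function name `deleted_matches` is made up here, and the rows are the ones pasted in this report.

```python
# Pre-merge _cat/indices rows from both cases in this report.
INDICES = """\
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myindex1 8BsMrKplRdW1yQEpUE6Jag 5   2   1          2            27.9kb     9.3kb
green  open   myindex2 hWdayX6oTr64bSPJGGQTFA 1   2   1          1            24.8kb     8.2kb
"""

# One DELETE request was issued per index in the reproduction steps.
EXPECTED_DELETED = 1


def deleted_matches(text, expected):
    """Map each index name to whether its docs.deleted equals the expected count."""
    header, *rows = [line.split() for line in text.strip().splitlines()]
    result = {}
    for row in rows:
        record = dict(zip(header, row))
        result[record["index"]] = int(record["docs.deleted"]) == expected
    return result


print(deleted_matches(INDICES, EXPECTED_DELETED))
# {'myindex1': False, 'myindex2': True}
```

Only myindex1 (soft delete enabled) fails the check, which is the discrepancy this issue is about.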

Additional Information:

OpenSearch version: OpenSearch-1.3, OpenSearch-2.1

Expected behavior

The docs.deleted count should accurately reflect the number of documents deleted in the index.

Additional Details

No response

monusingh-1 added the bug and untriaged labels on May 17, 2024
github-actions bot added the Indexing label on May 17, 2024
@sarthakaggarwal97
Contributor

Thanks @monusingh-1 for creating this issue. Could you also add the output of _cat/segments?v? I would like to see the doc count and deleted doc count for each segment of the index.
