[7.x] [DOCS] Adds helpers section to Python book

2021-04-01 18:46:18 +02:00
parent 45a430e7c9
commit 9502ab5f90
2 changed files with 97 additions and 1 deletions
@@ -0,0 +1,94 @@
+[[client-helpers]]
+== Client helpers
+
+This section describes a collection of simple helper functions that abstract
+some specifics of the raw API. For detailed examples, refer to
+https://elasticsearch-py.readthedocs.io/en/stable/helpers.html[the helpers documentation].
+
+
+[discrete]
+[[bulk-helpers]]
+=== Bulk helpers 
+
+The bulk API requires specific formatting and has other considerations that
+make it cumbersome to use directly, so several helpers are provided for it.
+
+All bulk helpers accept an instance of the `{es}` class and an iterable
+`actions`. Any iterable works, including a generator, which is ideal in most
+cases because it lets you index large datasets without loading them into
+memory.
+
+Each item in the `actions` iterable is a document to index, and can be given in
+one of several formats. The most common one is the same as returned by
+`search()`, for example:
+
+[source,yml]
+----------------------------
+{
+  '_index': 'index-name',
+  '_type': 'document',
+  '_id': 42,
+  '_routing': 5,
+  'pipeline': 'my-ingest-pipeline',
+  '_source': {
+    "title": "Hello World!",
+    "body": "..."
+  }
+}
+----------------------------
+
+Alternatively, if `_source` is not present, the helper pops all metadata fields
+from the document and uses the rest as the document data:
+
+[source,yml]
+----------------------------
+{
+  "_id": 42,
+  "_routing": 5,
+  "title": "Hello World!",
+  "body": "..."
+}
+----------------------------
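As a rough illustration of the splitting behaviour described above, the following sketch separates metadata from document data in plain Python. It is not the library's implementation: `split_action` and `METADATA_KEYS` are hypothetical names, and the key set covers only the metadata fields shown in this section.

```python
# Hypothetical illustration only; not the elasticsearch-py implementation.
# The key list is an assumption limited to the fields used in this section.
METADATA_KEYS = ("_index", "_type", "_id", "_routing", "pipeline")


def split_action(action):
    """Return (metadata, document) for a single bulk action dict."""
    data = dict(action)  # shallow copy so the caller's dict is not mutated
    op_type = data.pop("_op_type", "index")  # `index` is the default action
    metadata = {key: data.pop(key) for key in METADATA_KEYS if key in data}
    # If `_source` is present it is the document; otherwise the remainder is.
    document = data.get("_source", data)
    return {op_type: metadata}, document


meta, doc = split_action(
    {"_id": 42, "_routing": 5, "title": "Hello World!", "body": "..."}
)
```

Both input formats reduce to the same pair: a small metadata header keyed by the action type, and the document body.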
+
+The `bulk()` API accepts `index`, `create`, `delete`, and `update` actions. Use 
+the `_op_type` field to specify an action (`_op_type` defaults to `index`):
+
+[source,yml]
+----------------------------
+{
+  '_op_type': 'delete',
+  '_index': 'index-name',
+  '_type': 'document',
+  '_id': 42,
+}
+{
+  '_op_type': 'update',
+  '_index': 'index-name',
+  '_type': 'document',
+  '_id': 42,
+  'doc': {'question': 'The life, universe and everything.'}
+}
+----------------------------
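A minimal sketch of how such actions might be produced lazily with a generator. The function name `generate_actions`, its parameters, and the sample data are assumptions made for illustration. The commented call at the end shows how the generator could be handed to the `bulk` helper when a cluster is available.

```python
def generate_actions(docs, stale_ids, index_name="index-name"):
    """Yield delete actions for stale IDs, then index actions for docs."""
    for doc_id in stale_ids:
        # Delete actions carry metadata only; there is no document body.
        yield {"_op_type": "delete", "_index": index_name, "_id": doc_id}
    for doc_id, source in docs.items():
        # `_op_type` is omitted here, so the action defaults to `index`.
        yield {"_index": index_name, "_id": doc_id, "_source": source}


actions = generate_actions(
    docs={42: {"title": "Hello World!", "body": "..."}},
    stale_ids=[7],
)

# With a reachable cluster, the generator could be passed to the helper:
#   from elasticsearch import Elasticsearch
#   from elasticsearch.helpers import bulk
#   bulk(Elasticsearch(), actions)
```

Because the actions are yielded one at a time, nothing is materialized until the helper consumes the generator.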
+
+
+[discrete]
+[[scan]]
+=== Scan
+
+The `scan` helper is a simple abstraction on top of the `scroll()` API: an
+iterator that yields all hits as returned by the underlying scroll requests.
+
+By default, `scan` does not return results in any predetermined order. To get
+a standard order in the returned documents (either by score or by an explicit
+sort definition) when scrolling, use `preserve_order=True`. This may be an
+expensive operation and negates the performance benefits of using `scan`.
+
+
+[source,py]
+----------------------------
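+from elasticsearch import Elasticsearch
+from elasticsearch.helpers import scan
+
+es = Elasticsearch()  # assumes a cluster reachable at the default address
+
+scan(es,
+    query={"query": {"match": {"title": "python"}}},
+    index="orders-*",
+    doc_type="books"
+)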
+----------------------------
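To illustrate the idea of an iterator layered over paged requests, here is a self-contained sketch that does not talk to Elasticsearch at all. `fake_scroll` and `iter_hits` are hypothetical stand-ins: the first simulates a paged API, and the second flattens its pages into a single stream of hits, which is roughly the role `scan` plays for `scroll()`.

```python
def fake_scroll(hits, page_size):
    """Simulate a paged API: yield successive pages (lists) of hits."""
    for start in range(0, len(hits), page_size):
        yield hits[start:start + page_size]


def iter_hits(pages):
    """Flatten pages into one stream, yielding a single hit at a time."""
    for page in pages:
        for hit in page:
            yield hit


# Made-up sample data shaped like search hits.
hits = [{"_id": i, "_source": {"title": "python %d" % i}} for i in range(5)]

titles = [
    hit["_source"]["title"]
    for hit in iter_hits(fake_scroll(hits, page_size=2))
]
```

With a real cluster, the equivalent loop would iterate over the return value of `scan` directly and read each document from `hit["_source"]`.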
@@ -14,4 +14,6 @@ include::configuration.asciidoc[]

 include::integrations.asciidoc[]

-include::examples.asciidoc[]
+include::examples.asciidoc[]
+
+include::helpers.asciidoc[]