pyams/pyams_user_guide: comparison src/source/plugins.rst

--1:000000000000
+:b2be9a32f3fc
+.. _plugins:
+PyAMS additional features and services
+======================================
+Elasticsearch
++++++++++++++
+First, install Elasticsearch (ES); PyAMS is currently compatible with version 5.4. The `ingest-attachment`
+plug-in is also required to handle attachments correctly.
+Visit https://www.elastic.co/ to learn how to install the Elasticsearch server and the `ingest-attachment` plug-in.
+.. tip:: Documentation for installing Elasticsearch 5.4
+- https://www.elastic.co/guide/en/elasticsearch/reference/5.4/gs-installation.html
+- https://www.elastic.co/guide/en/elasticsearch/plugins/5.4/ingest-attachment.html
+Once Elasticsearch is installed, the following steps describe how to configure it for use with PyAMS.
+Initializing Elasticsearch index
+--------------------------------
+If you want to use an Elasticsearch index, you have to initialize index settings and mappings;
+Elasticsearch integration is defined through the *PyAMS_content_es* package.
+1. Enable service
+'''''''''''''''''
+In Pyramid INI application files (*etc/development.ini* and *etc/production.ini*):
+.. code-block:: ini
+# Elasticsearch server settings
+elastic.server = http://127.0.0.1:9200
+elastic.index = pyams
+Where:
+- **elastic.server**: address of the Elasticsearch server; authentication credentials can be included, in the form
+*http://login:password@w.x.y.z:9200*
+- **elastic.index**: name of the Elasticsearch index.
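As a rough illustration of how these two settings fit together, the sketch below reads them from INI text with Python's standard ``configparser`` and splits optional credentials out of the server URL. The ``[app:main]`` section name and the ``read_elastic_settings`` helper are assumptions made for this example; this is not how PyAMS itself loads its configuration.

```python
from configparser import ConfigParser
from urllib.parse import urlsplit

# Sample INI content; the [app:main] section name is an assumption.
SAMPLE_INI = """
[app:main]
elastic.server = http://login:password@127.0.0.1:9200
elastic.index = pyams
"""


def read_elastic_settings(ini_text, section="app:main"):
    """Return server URL parts and index name from INI text (hypothetical helper)."""
    parser = ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    server = urlsplit(parser.get(section, "elastic.server"))
    return {
        "scheme": server.scheme,
        "host": server.hostname,
        "port": server.port,
        "login": server.username,     # None when the URL carries no credentials
        "password": server.password,  # None when the URL carries no credentials
        "index": parser.get(section, "elastic.index"),
    }


settings = read_elastic_settings(SAMPLE_INI)
print(settings["host"], settings["port"], settings["index"])
```

Keeping credentials inside the URL means a single setting carries everything needed to reach the server, at the cost of storing the password in clear text in the INI file.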
+On startup, the main PyAMS application process can start an *indexer* process, which handles indexing requests
+asynchronously; the settings of this process are defined as follows:
+.. code-block:: ini
+# PyAMS content Elasticsearch indexer process settings
+pyams_content.es.tcp_handler = 127.0.0.1:5557
+pyams_content.es.start_handler = false
+pyams_content.es.allow_auth = admin:admin
+pyams_content.es.allow_clients = 127.0.0.1
+Where:
+- **pyams_content.es.tcp_handler**: IP address and listening port of the PyAMS indexer process
+- **pyams_content.es.start_handler**: if *true*, the indexer process is started on PyAMS startup; otherwise (typically
+in a cluster configuration), the process is expected to be started from another *master* server
+- **pyams_content.es.allow_auth**: login and password used to connect to the indexer process (these settings must be
+defined identically on the indexer process and on all its clients)
+- **pyams_content.es.allow_clients**: list of IP addresses allowed to connect to the indexer process.
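To make the shape of these values concrete, here is a minimal sketch that splits them into typed fields. The ``parse_indexer_settings`` helper is hypothetical, and the whitespace separator for several client addresses is an assumption, since the text above only shows a single address; PyAMS may parse these settings differently.

```python
def parse_indexer_settings(settings):
    """Split raw indexer settings into typed values (hypothetical helper)."""
    host, _, port = settings["pyams_content.es.tcp_handler"].rpartition(":")
    login, _, password = settings["pyams_content.es.allow_auth"].partition(":")
    start = settings["pyams_content.es.start_handler"].strip().lower()
    return {
        "host": host,
        "port": int(port),
        # Only the literal string "true" (in any case) enables the handler here.
        "start_handler": start == "true",
        "login": login,
        "password": password,
        # Assumption: several client addresses are separated by whitespace.
        "allow_clients": settings["pyams_content.es.allow_clients"].split(),
    }


raw = {
    "pyams_content.es.tcp_handler": "127.0.0.1:5557",
    "pyams_content.es.start_handler": "false",
    "pyams_content.es.allow_auth": "admin:admin",
    "pyams_content.es.allow_clients": "127.0.0.1",
}
indexer = parse_indexer_settings(raw)
print(indexer)
```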
+2. Initialize Elasticsearch database
+''''''''''''''''''''''''''''''''''''
+Configuration files for the attachment pipeline, index settings and mappings are available in the `pyams_content_es`
+source package or in the PyAMS installation folder:
+.. code-block:: bash
+(env) $ cd docs/elasticsearch
+(env) $ curl --noproxy localhost -XPUT http://localhost:9200/_ingest/pipeline/attachment -d @attachment-pipeline.json
+Then, with ``elastic.index = pyams`` set as the Elasticsearch index name, the index URL is
+*http://localhost:9200/pyams*. Note that the first command below deletes any existing index with that name:
+.. code-block:: shell
+(env) $ curl -XDELETE http://localhost:9200/pyams
+(env) $ curl -XPUT http://localhost:9200/pyams -d @index-settings.json
+(env) $ curl -XPUT http://localhost:9200/pyams/WfTopic/_mapping -d @mappings/WfTopic.json
+(env) $ curl -XPUT http://localhost:9200/pyams/WfNewsEvent/_mapping -d @mappings/WfNewsEvent.json
+(env) $ curl -XPUT http://localhost:9200/pyams/WfBlogPost/_mapping -d @mappings/WfBlogPost.json
+*Troubleshooting*: if you get a 406 error, try adding ``-H 'Content-Type: application/json'`` to the curl command lines.
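For readers who prefer scripting this step, the index-related curl commands above could be expressed with Python's standard ``urllib.request``. The sketch below only builds the requests and does not send them; the ``build_setup_requests`` helper is hypothetical, and the ``Content-Type`` header is set on every request to sidestep the 406 error mentioned above.

```python
import urllib.request


def build_setup_requests(server, index, doc_types):
    """Build, without sending, the index requests issued by the curl commands."""
    headers = {"Content-Type": "application/json"}
    plan = [
        ("DELETE", f"{server}/{index}", None),  # drops any existing index
        ("PUT", f"{server}/{index}", "index-settings.json"),
    ]
    for doc_type in doc_types:
        url = f"{server}/{index}/{doc_type}/_mapping"
        plan.append(("PUT", url, f"mappings/{doc_type}.json"))
    prepared = []
    for method, url, body_file in plan:
        request = urllib.request.Request(url, method=method, headers=headers)
        prepared.append((request, body_file))
    return prepared


setup_requests = build_setup_requests(
    "http://localhost:9200", "pyams", ["WfTopic", "WfNewsEvent", "WfBlogPost"]
)
for request, body_file in setup_requests:
    print(request.get_method(), request.full_url, body_file)
```

To actually send a request, read the matching JSON file as bytes into ``request.data`` and pass the request to ``urllib.request.urlopen``. Keep in mind that the first request removes the whole index.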
+3. Update index contents
+''''''''''''''''''''''''
+If your ZODB database already stores content, you can update the Elasticsearch index with all of it using the
+``pyams_es_index`` command-line script. From a shell:
+.. code-block:: bash
+(env) $ ./bin/pyams_es_index ../etc/development.ini
+Natural Language Toolkit - NLTK
++++++++++++++++++++++++++++++++
+PyAMS uses NLTK features through the *PyAMS_catalog* package.
+.. seealso::
+Visit https://www.nltk.org/ to learn more about NLTK
+Initializing NLTK (Natural Language ToolKit)
+--------------------------------------------
+Some NLTK collections, such as the **tokenizers** and **stopwords** utilities, are used to index full-text
+content. You can extend NLTK indexing to suit your own needs. NLTK requires downloading and
+configuring several elements, as follows:
+*1. Run the Python shell in the PyAMS environment:*
+.. code-block:: bash
+(env) $ ./bin/py
+*2. In the Python shell:*
+.. code-block:: pycon
+>>> import nltk
+>>> nltk.download()
+*3. Configure the installation directory:*
+.. tip::
+On Debian GNU/Linux, you can choose any of the following directories: '*~/nltk_data*' (where '~' is the home
+directory of the user running the Pyramid application), '*/usr/share/nltk_data*', '*/usr/local/share/nltk_data*',
+'*/usr/lib/nltk_data*' and '*/usr/local/lib/nltk_data*'
+Make sure you have write permission on the chosen directory!
+.. code-block:: shell
+NLTK Downloader
+---------------------------------------------------------------------------
+d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
+---------------------------------------------------------------------------
+Downloader> c
+Data Server:
+- URL: <https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml>
+- 6 Package Collections Available
+- 107 Individual Packages Available
+Local Machine:
+- Data directory: /home/tflorac/nltk_data
+Config> d
+New directory> /usr/local/lib/nltk_data
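Before choosing a data directory in the configuration step above, it can help to check which of the candidate directories is writable by the current user. The following sketch uses only the standard library; ``first_writable_dir`` is a hypothetical helper, not part of NLTK or PyAMS.

```python
import os

# Candidate directories listed in the tip above ('~' is expanded at call time).
CANDIDATE_DIRS = (
    "~/nltk_data",
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
)


def first_writable_dir(candidates=CANDIDATE_DIRS):
    """Return the first usable candidate path, or None (hypothetical helper)."""
    for candidate in candidates:
        path = os.path.expanduser(candidate)
        # A missing directory can still be created if its parent is writable.
        target = path if os.path.isdir(path) else os.path.dirname(path)
        if os.path.isdir(target) and os.access(target, os.W_OK | os.X_OK):
            return path
    return None


print(first_writable_dir())
```

Run it as the user that runs the Pyramid application, since permissions are checked for the current user.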
+*4. Return to the main menu:*
+.. code-block:: shell
+---------------------------------------------------------------------------
+s) Show Config   u) Set Server URL   d) Set Data Dir   m) Main Menu
+---------------------------------------------------------------------------
+Config> m
+*5. Download utilities:*
+- **punkt**: Punkt Tokenizer Models
+- **stopwords**: Stopwords Corpus
+.. code-block:: shell
+---------------------------------------------------------------------------
+d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
+---------------------------------------------------------------------------
+Downloader> d
+Download which package (l=list; x=cancel)?
+Identifier> punkt
+Downloading package punkt to /usr/local/lib/nltk_data...
+Downloader> d
+Download which package (l=list; x=cancel)?
+Identifier> stopwords
+Downloading package stopwords to /usr/local/lib/nltk_data...
+.. tip::
+The full list of NLTK collections can be displayed with the ``l) List`` option.
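The interactive session above can also be scripted. The sketch below, offered as one possible approach rather than the procedure PyAMS prescribes, calls ``nltk.download`` with a package identifier and a ``download_dir`` argument for each of the two packages. The ``download_packages`` helper is hypothetical.

```python
# Package identifiers taken from the download step above.
PACKAGES = ("punkt", "stopwords")


def download_packages(download_dir, packages=PACKAGES):
    """Download each package into download_dir; return identifiers that failed."""
    import nltk  # imported here so this module loads even where NLTK is absent

    failed = []
    for package in packages:
        # nltk.download() returns True on success and False otherwise.
        if not nltk.download(package, download_dir=download_dir):
            failed.append(package)
    return failed
```

Calling ``download_packages("/usr/local/lib/nltk_data")`` needs network access and write permission on the target directory; an empty list means both packages were fetched.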

changeset 99	b2be9a32f3fc
child 104	942151432421