pyams/pyams_user_guide: comparison src/source/admln


-:097b0c025eec
+:49e4432c0a1d
-.. _plugins:
-PyAMS additional features and services
-======================================
-Elasticsearch
-+++++++++++++
-First, you need to install Elasticsearch (ES); PyAMS is currently compatible with version 5.4. The Ingest attachment
-plug-in is also required to handle attachments correctly.
-Visit https://www.elastic.co/ to learn how to install the Elasticsearch server and the `ingest-attachment` plug-in.
-.. tip:: Documentation for installing ElasticSearch 5.4
-- https://www.elastic.co/guide/en/elasticsearch/reference/5.4/gs-installation.html
-- https://www.elastic.co/guide/en/elasticsearch/plugins/5.4/ingest-attachment.html
-Once Elasticsearch is installed, the following steps describe how to configure it for use with PyAMS.
-Initializing Elasticsearch index
---------------------------------
-If you want to use an Elasticsearch index, you have to initialize index settings and mappings;
-Elasticsearch integration is defined through the *PyAMS_content_es* package.
-1. Enable service
-'''''''''''''''''
-In Pyramid INI application files (*etc/development.ini* and *etc/production.ini*):
-.. code-block:: ini
-# Elasticsearch server settings
-elastic.server = http://127.0.0.1:9200
-elastic.index = pyams
-Where:
-- **elastic.server**: address of Elasticsearch server; you can include authentication arguments in the form
-*http://login:password@w.x.y.z:9200*
-- **elastic.index**: name of Elasticsearch index.
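As an illustration of how these two settings combine, the following sketch reads them from an INI fragment and derives the index URL used by the commands later in this section. The ``[app:main]`` section name and the ``index_url`` helper are assumptions made for this example; they are not part of PyAMS.

```python
import configparser

# Hypothetical INI fragment; the [app:main] section name is an assumption.
SAMPLE_INI = """
[app:main]
elastic.server = http://127.0.0.1:9200
elastic.index = pyams
"""


def index_url(ini_text, section="app:main"):
    """Return '<elastic.server>/<elastic.index>' built from the INI text."""
    # Interpolation is disabled so that a '%' in a password is kept verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    settings = parser[section]
    server = settings["elastic.server"].rstrip("/")
    return f"{server}/{settings['elastic.index']}"


print(index_url(SAMPLE_INI))
```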
-On startup, the main PyAMS application process can start an *indexer* process which handles indexing requests in
-asynchronous mode; the settings of this process are defined like this:
-.. code-block:: ini
-# PyAMS content Elasticsearch indexer process settings
-pyams_content.es.tcp_handler = 127.0.0.1:5557
-pyams_content.es.start_handler = false
-pyams_content.es.allow_auth = admin:admin
-pyams_content.es.allow_clients = 127.0.0.1
-Where:
-- **pyams_content.es.tcp_handler**: IP address and listening port of PyAMS indexer process
-- **pyams_content.es.start_handler**: if *true*, the indexer process is started on PyAMS startup; otherwise (typically
-in a cluster configuration), the process is supposed to be started from another *master* server
-- **pyams_content.es.allow_auth**: login and password used to connect to the indexer process (these settings are
-defined in the same way on the indexer process and on all its clients)
-- **pyams_content.es.allow_clients**: list of IP addresses allowed to connect to indexer process.
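To make the shape of these values concrete, here is a minimal sketch that splits them into typed fields. The ``IndexerSettings`` class and the ``parse_indexer_settings`` helper are hypothetical, and treating ``allow_clients`` as a whitespace-separated list is an assumption; PyAMS may parse these values differently.

```python
from dataclasses import dataclass


@dataclass
class IndexerSettings:
    """Typed view of the indexer settings (illustrative only)."""
    host: str
    port: int
    start_handler: bool
    login: str
    password: str
    allowed_clients: list


def parse_indexer_settings(settings):
    """Split raw string settings into typed fields (hypothetical helper)."""
    prefix = "pyams_content.es."
    # 'host:port' pair; the port is the part after the last colon
    host, _, port = settings[prefix + "tcp_handler"].rpartition(":")
    # 'login:password' pair; the login is the part before the first colon
    login, _, password = settings[prefix + "allow_auth"].partition(":")
    return IndexerSettings(
        host=host,
        port=int(port),
        start_handler=settings[prefix + "start_handler"].strip().lower() == "true",
        login=login,
        password=password,
        # Assumption: addresses are separated by whitespace
        allowed_clients=settings[prefix + "allow_clients"].split(),
    )


parsed = parse_indexer_settings({
    "pyams_content.es.tcp_handler": "127.0.0.1:5557",
    "pyams_content.es.start_handler": "false",
    "pyams_content.es.allow_auth": "admin:admin",
    "pyams_content.es.allow_clients": "127.0.0.1",
})
print(parsed)
```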
-2. Initialize Elasticsearch database
-''''''''''''''''''''''''''''''''''''
-Configuration files for the attachment pipeline, index settings and mappings are available in the `pyams_content_es`
-source package or in the PyAMS installation folder:
-.. code-block:: bash
-(env) $ cd docs/elasticsearch
-(env) $ curl --noproxy localhost -XPUT http://localhost:9200/_ingest/pipeline/attachment -d @attachment-pipeline.json
-Then, with ``elastic.index = pyams`` defined as the Elasticsearch index name, the index URL is
-*http://localhost:9200/pyams*:
-.. code-block:: shell
-(env) $ curl -XDELETE http://localhost:9200/pyams
-(env) $ curl -XPUT http://localhost:9200/pyams -d @index-settings.json
-(env) $ curl -XPUT http://localhost:9200/pyams/WfTopic/_mapping  -d @mappings/WfTopic.json
-(env) $ curl -XPUT http://localhost:9200/pyams/WfNewsEvent/_mapping -d @mappings/WfNewsEvent.json
-(env) $ curl -XPUT http://localhost:9200/pyams/WfBlogPost/_mapping -d @mappings/WfBlogPost.json
-*Troubleshooting*: if you get a 406 error, try adding ``-H 'Content-Type: application/json'`` to the curl command lines.
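For readers who prefer scripting these calls, the sketch below builds an equivalent PUT request with the Python standard library and always sets the JSON content type mentioned in the troubleshooting note. The ``build_put_request`` helper and the placeholder body are assumptions for illustration; the real body comes from the JSON files listed above.

```python
import json
import urllib.request


def build_put_request(url, payload):
    """Build (without sending) a PUT request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Explicit content type, as suggested for the 406 error
        headers={"Content-Type": "application/json"},
    )


# Placeholder body for illustration only; load index-settings.json in practice.
request = build_put_request("http://localhost:9200/pyams", {"settings": {}})
print(request.get_method(), request.full_url)
```

The request is only constructed here; sending it is left to the caller, for example with ``urllib.request.urlopen(request)`` once the server is running.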
-3. Update index contents
-''''''''''''''''''''''''
-If your ZODB database already stores contents, you can update the Elasticsearch index with all of them using the
-``pyams_es_index`` command line script. From a shell:
-.. code-block:: bash
-(env) $ ./bin/pyams_es_index ../etc/development.ini
-Natural Language Toolkit - NLTK
-+++++++++++++++++++++++++++++++
-PyAMS uses NLTK features through the *PyAMS_catalog* package.
-.. seealso::
-Visit https://www.nltk.org/ to learn more about NLTK
-Initializing NLTK (Natural Language ToolKit)
---------------------------------------------
-Some NLTK collections, such as the **tokenizers** and **stopwords** utilities, are used to index full-text content
-elements. You can extend NLTK indexing according to your own needs. This package requires several elements to be
-downloaded and configured, which is done as follows:
-*1. Run the Python shell into PyAMS environment:*
-.. code-block:: bash
-(env) $ ./bin/py
-*2. In the Python shell:*
-.. code-block:: pycon
->>> import nltk
->>> nltk.download()
-*3. Configure the installation directory:*
-.. tip::
-On Debian GNU/Linux, you can choose any directory among '*~/nltk_data*' (where '~' is the home directory of the user
-running the Pyramid application), '*/usr/share/nltk_data*', '*/usr/local/share/nltk_data*', '*/usr/lib/nltk_data*' and
-'*/usr/local/lib/nltk_data*'
-Please check that you have permission to write to this directory!
-.. code-block:: shell
-NLTK Downloader
----------------------------------------------------------------------------
-d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
----------------------------------------------------------------------------
-Downloader> c
-Data Server:
-- URL: <https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml>
-- 6 Package Collections Available
-- 107 Individual Packages Available
-Local Machine:
-- Data directory: /home/tflorac/nltk_data
-Config> d
-New directory> /usr/local/lib/nltk_data
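The write-permission check recommended in the tip can be sketched in a few lines of Python. The ``first_writable`` helper is hypothetical and only inspects the directories quoted above; it does not create anything.

```python
import os

# Candidate directories quoted from the tip above.
CANDIDATES = [
    "~/nltk_data",
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
]


def first_writable(candidates):
    """Return the first candidate that can be written to, or None."""
    for candidate in candidates:
        path = os.path.expanduser(candidate)
        # If the directory does not exist yet, check its parent instead
        target = path if os.path.isdir(path) else os.path.dirname(path)
        if os.path.isdir(target) and os.access(target, os.W_OK):
            return path
    return None


print(first_writable(CANDIDATES))
```

The selected path can then be entered at the ``New directory>`` prompt, or passed as the ``download_dir`` argument of ``nltk.download`` in a non-interactive script.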
-*4. Return to the main menu:*
-.. code-block:: shell
----------------------------------------------------------------------------
-s) Show Config   u) Set Server URL   d) Set Data Dir   m) Main Menu
----------------------------------------------------------------------------
-Config> m
-*5. Download utilities:*
-punkt
-Punkt Tokenizer Models
-stopwords
-Stopwords Corpus
-.. code-block:: shell
----------------------------------------------------------------------------
-d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
----------------------------------------------------------------------------
-Downloader> d
-Download which package (l=list; x=cancel)?
-Identifier> punkt
-Downloading package punkt to /usr/local/lib/nltk_data...
-Downloader> d
-Download which package (l=list; x=cancel)?
-Identifier> stopwords
-Downloading package stopwords to /usr/local/lib/nltk_data...
-.. tip::
-The full list of NLTK collections can be displayed with the ``l) List`` option.

branch	doc-dc
changeset 112	49e4432c0a1d
parent 111	097b0c025eec
child 113	5108336d3a4c