diff -r 1de74ac0628f -r 31b3d00edb8a src/source/plugins.rst
--- a/src/source/plugins.rst	Mon Dec 10 18:17:20 2018 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,209 +0,0 @@
-.. _plugins:
-
-PyAMS additional features and services
-======================================
-
-
-Elasticsearch
-+++++++++++++
-
-First, you need to install Elasticsearch (ES); PyAMS is currently compatible with version 5.4. The Ingest attachment
-plug-in is also required to handle attachments correctly.
-
-Visit https://www.elastic.co/ to learn how to install the Elasticsearch server and the `ingest-attachment` plug-in.
-
-
-.. tip:: Documentation for installing Elasticsearch 5.4
-
-
-    https://www.elastic.co/guide/en/elasticsearch/reference/5.4/gs-installation.html
-
-    https://www.elastic.co/guide/en/elasticsearch/plugins/5.4/ingest-attachment.html
-
-
-Once Elasticsearch is installed, the following steps describe how to configure it for PyAMS.
-
-
-Initializing Elasticsearch index
---------------------------------
-
-If you want to use an Elasticsearch index, you have to initialize index settings and mappings;
-Elasticsearch integration is defined through the *PyAMS_content_es* package.
-
-
-1. Enable service
-'''''''''''''''''
-
-In Pyramid INI application files (*etc/development.ini* and *etc/production.ini*):
-
-.. code-block:: ini
-
-    # Elasticsearch server settings
-    elastic.server = http://127.0.0.1:9200
-    elastic.index = pyams
-
-Where:
- - **elastic.server**: address of the Elasticsearch server; you can include authentication arguments in the form
-   *http://login:password@w.x.y.z:9200*
- - **elastic.index**: name of the Elasticsearch index.
-
-
-On startup, the main PyAMS application process can start an *indexer* process which handles indexing requests
-asynchronously; the settings of this process are defined like this:
-
-.. code-block:: ini
-
-    # PyAMS content Elasticsearch indexer process settings
-    pyams_content.es.tcp_handler = 127.0.0.1:5557
-    pyams_content.es.start_handler = false
-    pyams_content.es.allow_auth = admin:admin
-    pyams_content.es.allow_clients = 127.0.0.1
-
-Where:
- - **pyams_content.es.tcp_handler**: IP address and listening port of the PyAMS indexer process
- - **pyams_content.es.start_handler**: if *true*, the indexer process is started on PyAMS startup; otherwise (typically
-   in a cluster configuration), the process is expected to be started from another *master* server
- - **pyams_content.es.allow_auth**: login and password used to connect to the indexer process (settings are defined
-   in the same way on the indexer process and on all its clients)
- - **pyams_content.es.allow_clients**: list of IP addresses allowed to connect to the indexer process.
-
-
-2. Initialize Elasticsearch database
-''''''''''''''''''''''''''''''''''''
-
-Configuration files for the attachment pipeline, index settings and mappings are available in the `pyams_content_es`
-source package or in the PyAMS installation folder:
-
-
-.. code-block:: bash
-
-    (env) $ cd docs/elasticsearch
-    (env) $ curl --noproxy localhost -XPUT http://localhost:9200/_ingest/pipeline/attachment -d @attachment-pipeline.json
-
-
-Then, with ``elastic.index = pyams`` defined as the Elasticsearch index name, the index URL is *"http://localhost:9200/pyams"*:
-
-.. code-block:: shell
-
-    (env) $ curl -XDELETE http://localhost:9200/pyams
-
-    (env) $ curl -XPUT http://localhost:9200/pyams -d @index-settings.json
-
-    (env) $ curl -XPUT http://localhost:9200/pyams/WfTopic/_mapping -d @mappings/WfTopic.json
-    (env) $ curl -XPUT http://localhost:9200/pyams/WfNewsEvent/_mapping -d @mappings/WfNewsEvent.json
-    (env) $ curl -XPUT http://localhost:9200/pyams/WfBlogPost/_mapping -d @mappings/WfBlogPost.json
-
-
-*Troubleshooting*: If you get a 406 error, try adding ``-H 'Content-Type: application/json'`` to the curl command lines.
-
-
-3. Update index contents
-''''''''''''''''''''''''
-
-If your ZODB database already stores contents, you can update the Elasticsearch index with all these contents using the
-``pyams_es_index`` command line script. From a shell:
-
-.. code-block:: bash
-
-    (env) $ ./bin/pyams_es_index ../etc/development.ini
-
-
-
-Natural Language Toolkit - NLTK
-+++++++++++++++++++++++++++++++
-
-PyAMS uses NLTK features through the *PyAMS_catalog* package.
-
-.. seealso::
-
-    Visit https://www.nltk.org/ to learn more about NLTK
-
-
-Initializing NLTK (Natural Language ToolKit)
---------------------------------------------
-
-Some NLTK collections, such as the **tokenizers** and **stopwords** utilities, are used to index full-text content
-elements. You can extend NLTK indexing according to your own needs. NLTK requires several elements to be downloaded
-and configured, which is done as follows:
-
-
-*1. Run the Python shell in the PyAMS environment:*
-
-.. code-block:: bash
-
-    (env) $ ./bin/py
-
-
-*2. In the Python shell:*
-
-.. code-block:: pycon
-
-    >>> import nltk
-    >>> nltk.download()
-
-
-*3. Configure the installation directory:*
-
-.. tip::
-
-    On Debian GNU/Linux, you can choose any directory among '*~/nltk_data*' (where '~' is the home directory of the user
-    running the Pyramid application), '*/usr/share/nltk_data*', '*/usr/local/share/nltk_data*', '*/usr/lib/nltk_data*' and
-    '*/usr/local/lib/nltk_data*'
-
-    Make sure you have permission to write to this directory!
-
-
-.. code-block:: shell
-
-    NLTK Downloader
-    ---------------------------------------------------------------------------
-        d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
-    ---------------------------------------------------------------------------
-    Downloader> c
-
-    Data Server:
-      - URL:
-      - 6 Package Collections Available
-      - 107 Individual Packages Available
-
-    Local Machine:
-      - Data directory: /home/tflorac/nltk_data
-
-    Config> d
-      New directory> /usr/local/lib/nltk_data
-
-
-*4. Return to the main menu:*
-
-.. code-block:: shell
-
-    ---------------------------------------------------------------------------
-        s) Show Config   u) Set Server URL   d) Set Data Dir   m) Main Menu
-    ---------------------------------------------------------------------------
-    Config> m
-
-
-*5. Download utilities:*
-
-    - **punkt**: Punkt Tokenizer Models
-    - **stopwords**: Stopwords Corpus
-
-
-.. code-block:: shell
-
-    ---------------------------------------------------------------------------
-        d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
-    ---------------------------------------------------------------------------
-    Downloader> d
-    Download which package (l=list; x=cancel)?
-      Identifier> punkt
-        Downloading package punkt to /usr/local/lib/nltk_data...
-    Downloader> d
-    Download which package (l=list; x=cancel)?
-      Identifier> stopwords
-        Downloading package stopwords to /usr/local/lib/nltk_data...
-
-
-.. tip::
-
-    The full list of NLTK collections can be displayed with the ``l) List`` option.