src/source/plugins.rst
branchdoc-dc
changeset 55 949d496c4e96
child 56 60a1fbdbbed3
54:e3d33ef363fd 55:949d496c4e96
       
     1 .. _plugins:
       
     2 
       
     3 PyAMS additional features and services
       
     4 ++++++++++++++++++++++++++++++++++++++
       
     5 
       
     6 
       
     7 Elasticsearch 5.4
       
     8 =================
       
     9 
       
    10 First, install Elasticsearch (ES). PyAMS is currently compatible with version 5.4; the Ingest Attachment
       
    11 plug-in is also required to handle attachments correctly.
       
    12 
       
    13 Visit https://www.elastic.co/ to learn how to install the Elasticsearch server and the ``ingest-attachment`` plug-in.
       
    14 
       
    15 
       
    16 .. tip:: Documentation for installing Elasticsearch 5.4
       
    17 
       
    18     - https://www.elastic.co/guide/en/elasticsearch/reference/5.4/gs-installation.html
       
    19     - https://www.elastic.co/guide/en/elasticsearch/plugins/5.4/ingest-attachment.html
       
    20 
       
    21 
       
    22 Once Elasticsearch is installed, the following steps describe how to configure ES with PyAMS.
       
    23 
       
    24 Initializing Elasticsearch index
       
    25 --------------------------------
       
    26 
       
    27 If you want to use an Elasticsearch index, you have to initialize its settings and mappings.
       
    28 Elasticsearch integration is defined through the *PyAMS_content_es* package.
       
    29 
       
    30 
       
    31 1. Enable Service:
       
    32 ''''''''''''''''''
       
    33 
       
    34 In the Pyramid application INI file (*etc/development.ini*):
       
    35 
       
    36 .. code-block:: ini
       
    37 
       
    38     # ElasticSearch settings
       
    39     elastic.server = http://127.0.0.1:9200
       
    40     elastic.index = pyams
       
    41 
       
    42 .. code-block:: ini
       
    43 
       
    44     # PyAMS content elasticsearch index settings
       
    45     pyams_content.es.tcp_handler = 127.0.0.1:5557
       
    46     pyams_content.es.start_handler = false
       
    47     pyams_content.es.allow_auth = admin:admin
       
    48     pyams_content.es.allow_clients = 127.0.0.1
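
To check these values before starting the application, the settings can be read back with Python's standard ``configparser`` module. This is only a minimal sketch: the ``[app:main]`` section name and the ``read_es_settings`` helper are assumptions made for illustration and are not part of PyAMS.

.. code-block:: python

    import configparser

    # Hypothetical excerpt of etc/development.ini; the "app:main" section
    # name is an assumption, adjust it to match your own file.
    SAMPLE_INI = """
    [app:main]
    elastic.server = http://127.0.0.1:9200
    elastic.index = pyams
    pyams_content.es.tcp_handler = 127.0.0.1:5557
    pyams_content.es.start_handler = false
    """

    def read_es_settings(ini_text, section="app:main"):
        """Extract the Elasticsearch-related settings from INI text."""
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(ini_text)
        settings = parser[section]
        # The handler address is written as "host:port".
        host, _, port = settings["pyams_content.es.tcp_handler"].partition(":")
        return {
            "server": settings["elastic.server"],
            "index": settings["elastic.index"],
            "handler_host": host,
            "handler_port": int(port),
            "start_handler": settings.getboolean("pyams_content.es.start_handler"),
        }

    if __name__ == "__main__":
        print(read_es_settings(SAMPLE_INI))
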
       
    49 
       
    50 
       
    51 2. Initialize Elasticsearch Database:
       
    52 '''''''''''''''''''''''''''''''''''''
       
    53 
       
    54 Configuration files for the attachment pipeline, index settings and mappings are available in the ``pyams_content_es`` package or in the PyAMS installation folder:
       
    55 
       
    56 
       
    57 .. code-block:: bash
       
    58 
       
    59     (env) $ cd docs/elasticsearch
       
    60     (env) $ curl --noproxy localhost -XPUT http://localhost:9200/_ingest/pipeline/attachment -d @attachment-pipeline.json
       
    61 
       
    62 
       
    63 With ``elastic.index = pyams`` defined as the Elasticsearch index name, the index URL is *http://localhost:9200/pyams*:
       
    64 
       
    65 .. code-block:: shell
       
    66 
       
    67     (env) $ curl -XDELETE http://localhost:9200/pyams
       
    68 
       
    69     (env) $ curl -XPUT http://localhost:9200/pyams -d @index-settings.json
       
    70 
       
    71     (env) $ curl -XPUT http://localhost:9200/pyams/WfTopic/_mapping  -d @mappings/WfTopic.json
       
    72     (env) $ curl -XPUT http://localhost:9200/pyams/WfNewsEvent/_mapping -d @mappings/WfNewsEvent.json
       
    73     (env) $ curl -XPUT http://localhost:9200/pyams/WfBlogPost/_mapping -d @mappings/WfBlogPost.json
       
    74 
       
    75 
       
    76 *Troubleshooting*: if you get a 406 error, try adding ``-H 'Content-Type: application/json'`` to the curl options.
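
The same sequence of requests can be scripted when the setup has to be repeated. The sketch below uses only the Python standard library and is not a PyAMS tool. The ``plan_index_requests`` and ``send`` helpers are hypothetical names, and the script assumes it is run against the JSON files from the ``docs/elasticsearch`` folder used above. The ``Content-Type`` header is always sent, as suggested in the troubleshooting note.

.. code-block:: python

    import urllib.error
    import urllib.request
    from pathlib import Path

    # Document types taken from the curl commands above.
    DOC_TYPES = ("WfTopic", "WfNewsEvent", "WfBlogPost")

    def plan_index_requests(server, index, doc_types=DOC_TYPES):
        """Return (method, url, json_file) tuples in the order used above."""
        root = server.rstrip("/")
        base = f"{root}/{index}"
        plan = [
            ("PUT", f"{root}/_ingest/pipeline/attachment", "attachment-pipeline.json"),
            ("DELETE", base, None),  # drops any existing index and its data
            ("PUT", base, "index-settings.json"),
        ]
        plan += [("PUT", f"{base}/{name}/_mapping", f"mappings/{name}.json")
                 for name in doc_types]
        return plan

    def send(plan, config_dir):
        """Send each planned request, reading JSON bodies from config_dir."""
        for method, url, json_file in plan:
            data = (Path(config_dir) / json_file).read_bytes() if json_file else None
            request = urllib.request.Request(
                url, data=data, method=method,
                headers={"Content-Type": "application/json"})
            try:
                with urllib.request.urlopen(request) as response:
                    print(method, url, response.status)
            except urllib.error.HTTPError as error:
                # A 404 on DELETE only means the index did not exist yet.
                if method == "DELETE" and error.code == 404:
                    print(method, url, "index not found, skipping")
                else:
                    raise

    if __name__ == "__main__":
        send(plan_index_requests("http://localhost:9200", "pyams"), "docs/elasticsearch")

Keeping the plan separate from the sending step makes it possible to review the URLs before anything is deleted.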
       
    77 
       
    78 
       
    79 3. Create or update index:
       
    80 ''''''''''''''''''''''''''
       
    81 
       
    82 You then have to index PyAMS objects into the ES database. From a shell:
       
    83 
       
    84 .. code-block:: bash
       
    85 
       
    86     (env) $ ./bin/pyams_es_index ../etc/development.ini
       
    87 
       
    88 
       
    89 
       
    90 -------------------------------
       
    91 
       
    92 Natural Language Toolkit - NLTK
       
    93 ===============================
       
    94 
       
    95 
       
    96 With the *PyAMS_nltk* package, PyAMS can use NLTK features.
       
    97 
       
    98 .. seealso::
       
    99 
       
   100     Visit https://www.nltk.org/ to learn more about NLTK
       
   101 
       
   102 
       
   103 
       
   104 
       
   105 Initializing NLTK
       
   106 -----------------
       
   107 
       
   108 Some NLTK (Natural Language Toolkit) tokenizers and stopword utilities are used to index full-text content elements.
       
   109 This package requires downloading and configuring several elements, as follows:
       
   110 
       
   111 
       
   112 *1. Run the Python shell with PyAMS environment:*
       
   113 
       
   114 .. code-block:: bash
       
   115 
       
   116     (env) $ ./bin/py
       
   117 
       
   118 
       
   119 *2. In the Python shell:*
       
   120 
       
   121 .. code-block:: python
       
   122 
       
   123     >>> import nltk
       
   124     >>> nltk.download()
       
   125 
       
   126 .. code-block:: pycon
       
   127 
       
   128     NLTK Downloader
       
   129     ---------------------------------------------------------------------------
       
   130         d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
       
   131     ---------------------------------------------------------------------------
       
   132     Downloader> c
       
   133 
       
   134     Data Server:
       
   135       - URL: <https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml>
       
   136       - 6 Package Collections Available
       
   137       - 107 Individual Packages Available
       
   138 
       
   139     Local Machine:
       
   140       - Data directory: /home/tflorac/nltk_data
       
   141     ---------------------------------------------------------------------------
       
   142         s) Show Config   u) Set Server URL   d) Set Data Dir   m) Main Menu
       
   143     ---------------------------------------------------------------------------
       
   144     Config> d
       
   145       New directory> /usr/local/lib/nltk_data
       
   146 
       
   147 .. tip::
       
   148 
       
   149     On Debian GNU/Linux, you can choose any of the following directories: '*~/nltk_data*' (where '~' is the home
       
   150     directory of the user running the Pyramid application), '*/usr/share/nltk_data*', '*/usr/local/share/nltk_data*',
       
   151     '*/usr/lib/nltk_data*' or '*/usr/local/lib/nltk_data*'.
       
   152 
       
   153 
       
   154 .. code-block:: pycon
       
   155 
       
   156     Config> m
       
   157     ---------------------------------------------------------------------------
       
   158         d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
       
   159     ---------------------------------------------------------------------------
       
   160     Downloader> d
       
   161 
       
   162     Download which package (l=list; x=cancel)?
       
   163       Identifier> punkt
       
   164         Downloading package punkt to /usr/local/lib/nltk_data...
       
   165 
       
   166     Downloader> d
       
   167 
       
   168     Download which package (l=list; x=cancel)?
       
   169       Identifier> stopwords
       
   170         Downloading package stopwords to /usr/local/lib/nltk_data...