Created attachment 908598
Picture of problem

Description of problem:
I host a Python web service application on OpenShift that uses the RSLP stemmer module of NLTK, but the service log reports:

"[...]
Resource 'stemmers/rslp/step0.pt' not found. Please use the NLTK Downloader to obtain the resource: >>> nltk.download()
Searched in:
  - '/var/lib/openshift/539a61ab5973caa2410000bf/nltk_data'
  - '/usr/share/nltk_data'
  - '/usr/local/share/nltk_data'
  - '/usr/lib/nltk_data'
  - '/usr/local/lib/nltk_data'
[...]"

I concluded that the module is not installed properly, so I am reporting this as a bug.

How reproducible:
Use the following code snippet:

    import nltk
    from nltk.stem import RSLPStemmer
    stemmer = RSLPStemmer()

Actual results:
The application does not work.

Expected results:
The application should work.
Junior, the problem is that NLTK by default looks for its data in an nltk_data folder under the user's home directory (plus a few system paths, as your log shows). Unfortunately, you cannot write to the home directory on OpenShift; you have to use $OPENSHIFT_DATA_DIR for storing data. To solve this problem, do the following:

1. Create an environment variable called NLTK_DATA with the value $OPENSHIFT_DATA_DIR. After creating the environment variable, restart the app using the rhc app-restart command.
2. SSH into your application gear using the rhc ssh command.
3. Activate the virtual environment and download the corpora using the commands shown below:

    . $VIRTUAL_ENV/bin/activate
    curl https://raw.githubusercontent.com/sloria/TextBlob/dev/textblob/download_corpora.py | python

There is also a blog post that covers this problem:
https://www.openshift.com/blogs/day-9-textblob-finding-sentiments-in-text
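If you would rather fetch only the RSLP data from inside the application instead of running the TextBlob script, something along these lines might work. This is a minimal sketch, not a tested OpenShift recipe: the helper names resolve_nltk_data_dir and ensure_rslp are made up for illustration, and it assumes that NLTK is installed in the virtual environment and that the NLTK package id for this resource is "rslp".

```python
import os


def resolve_nltk_data_dir(environ=None):
    """Return the directory NLTK data should live in, or None if unknown.

    Hypothetical helper: prefers NLTK_DATA and falls back to
    OPENSHIFT_DATA_DIR, the writable location mentioned in this thread.
    """
    env = os.environ if environ is None else environ
    return env.get("NLTK_DATA") or env.get("OPENSHIFT_DATA_DIR") or None


def ensure_rslp(data_dir):
    """Make sure the RSLP stemmer data is available under data_dir.

    Hypothetical helper. NLTK is imported lazily so that this module
    can still be loaded where NLTK is not installed.
    """
    import nltk

    # Let NLTK search the writable directory first.
    if data_dir not in nltk.data.path:
        nltk.data.path.insert(0, data_dir)
    try:
        # Same resource name that appears in the error message.
        nltk.data.find("stemmers/rslp/step0.pt")
    except LookupError:
        # Assumption: "rslp" is the downloader id for this resource.
        nltk.download("rslp", download_dir=data_dir)
```

At application start-up you would call ensure_rslp(resolve_nltk_data_dir()) once, before creating the RSLPStemmer, after checking that the resolved directory is not None.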
Thanks for the help, Shekhar. I followed your instructions, but the URL was broken. However, the ability to create environment variables was useful: I created a folder containing the NLTK data I needed and set the NLTK_DATA environment variable to point to that folder. Again, thanks for the help.
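For anyone copying the data folder by hand like this, a quick check of the folder layout can save a deploy cycle. The sketch below is only an illustration: missing_resources is a hypothetical helper, it assumes the data is stored unpacked (NLTK can also read zipped packages, which this check does not cover), and it uses only the resource name from the error message above.

```python
import os


def missing_resources(data_dir, resources):
    """Return the resource names that are not present as files under data_dir."""
    missing = []
    for resource in resources:
        # NLTK resource names use forward slashes; map them to local paths.
        candidate = os.path.join(data_dir, *resource.split("/"))
        if not os.path.isfile(candidate):
            missing.append(resource)
    return missing
```

Calling missing_resources(os.environ["NLTK_DATA"], ["stemmers/rslp/step0.pt"]) on the gear should return an empty list if the folder that NLTK_DATA points to contains the unpacked stemmers/rslp directory.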
Junior, the correct URL is:
https://raw.githubusercontent.com/sloria/TextBlob/dev/textblob/download_corpora.py
This one works and is the right one.
-Jakub