Bug 1265609 - pandas not getting installed
Summary: pandas not getting installed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: ImageStreams
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Vu Dinh
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-09-23 10:59 UTC by Jaspreet Kaur
Modified: 2016-02-26 11:40 UTC (History)

Fixed In Version: openshift-origin-cartridge-python-1.34.1.1-1.el6op
Doc Type: Bug Fix
Doc Text:
When using the Python cartridge, the pandas package had several dependencies that were not installed successfully using the setup.py method. This bug fix updates the cartridge to use the `pip install` method, which resolves the dependency issue and allows the pandas package to be installed properly. However, to avoid a regression issue, a marker `pip_install` is required to use `pip install`. Otherwise, the standard setup.py installation method is used instead.
Clone Of:
Environment:
Last Closed: 2015-12-17 17:10:48 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2015:2666
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat OpenShift Enterprise 2.2.8 security, bug fix, and enhancement update
Last Updated: 2015-12-17 22:07:54 UTC

Description Jaspreet Kaur 2015-09-23 10:59:58 UTC
Description of problem:

pandas does not get installed when it is added to setup.py as a dependency in a Python 2.7 application.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
Problem:  

rhc app-show python
python @ http://python-rbajaj.rhcloud.com/ (uuid: 55ff845989f5cf8b3b00013c)
---------------------------------------------------------------------------
  Domain:          rbajaj
  Created:         Sep 21  9:45 AM
  Gears:           1 (defaults to small)
  Git URL:         ssh://55ff845989f5cf8b3b00013c.com/~/git/python.git/
  Initial Git URL: https://github.com/openshift-quickstart/flask-base
  SSH:             55ff845989f5cf8b3b00013c.com
  Deployment:      auto (on git push)

  python-2.7 (Python 2.7)
  -----------------------
    Gears: Located with mysql-5.5, phpmyadmin-4


When adding pandas to setup.py:
...
install_requires=['pandas'],
...

vi setup.py 
[root@jkaur python]# git add setup.py
[root@jkaur python]# git commit -m "added"
[master 43a0451] added
 1 files changed, 1 insertions(+), 0 deletions(-)
[root@jkaur python]# git push
Counting objects: 5, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 291 bytes, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Stopping Python 2.7 cartridge
Connection to python-rbajaj.rhcloud.com closed by remote host.
fatal: The remote end hung up unexpectedly
fatal: The remote end hung up unexpectedly


Actual results:

The push simply hangs and then exits.

Expected results:

It should be installed without any issue.


Additional info:

Comment 1 Jaspreet Kaur 2015-09-25 07:25:50 UTC
Hello, 

The issue can be reproduced with Python 3.3 as well. It ends with:

=======
remote: pandas/src/parser/tokenizer.c: In function 'precise_xstrtod':
remote: pandas/src/parser/tokenizer.c:2345: warning: suggest parentheses around comparison in operand of '&'
remote: In file included from pandas/src/parser/io.c:1:
remote: pandas/src/parser/io.h:33:1: warning: "HAVE_MMAP" redefined
remote: In file included from /opt/rh/python33/root/usr/include/python3.3m/pyconfig.h:6,
remote:                  from /opt/rh/python33/root/usr/include/python3.3m/Python.h:8,
remote:                  from pandas/src/parser/io.h:1,
remote:                  from pandas/src/parser/io.c:1:
remote: /opt/rh/python33/root/usr/include/python3.3m/pyconfig-64.h:578:1: warning: this is the location of the previous definition
remote: In file included from /opt/rh/python33/root/usr/lib64/python3.3/site-packages/numpy/core/include/numpy/ndarraytypes.h:1728,
remote:                  from /opt/rh/python33/root/usr/lib64/python3.3/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
remote:                  from /opt/rh/python33/root/usr/lib64/python3.3/site-packages/numpy/core/include/numpy/arrayobject.h:15,
remote:                  from pandas/algos.c:250:
remote: /opt/rh/python33/root/usr/lib64/python3.3/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION"
remote: In file included from pandas/algos.c:259:
remote: pandas/src/datetime_helper.h:7:1: warning: "PyInt_AS_LONG" redefined
remote: pandas/algos.c:145:1: warning: this is the location of the previous definition
remote: pandas/algos.c: In function 'PyInit_algos':
remote: pandas/algos.c:212222: warning: dereferencing pointer '__pyx_f_6pandas_3lib_is_null_datetimelike.16058' does break strict-aliasing rules
remote: pandas/algos.c:199798: note: initialized from here
Connection to python-jasdomain.rhcloud.com closed by remote host.
fatal: The remote end hung up unexpectedly
error: error in sideband demultiplexer
To ssh://5604f0a07628e1e5ac000111.com/~/git/python.git/
   1b26745..206add6  master -> master

==========

Comment 2 Rahul Bajaj 2015-09-29 07:38:41 UTC
Hello Team,

Kindly prioritize the resolution of this bug, as it is delaying the deployment of one of our key projects.

We would appreciate a resolution within this week, i.e. by 2-Oct-15.

Regards
Rahul Bajaj

Comment 3 Rahul Bajaj 2015-10-26 07:29:32 UTC
Hello Team,

I see that the status of this bug is still NEW. Please provide an update on the progress and the timeline for its resolution.

Warm Regards
Rahul Bajaj

Comment 4 Vu Dinh 2015-10-27 17:18:32 UTC
PR <https://github.com/openshift/origin-server/pull/6292> is submitted to fix this bug.

Comment 5 Vu Dinh 2015-10-27 17:19:18 UTC
Fixed for both python-2.7 and python-3.3

Comment 6 Vu Dinh 2015-10-29 15:10:19 UTC
This PR <https://github.com/openshift/origin-server/pull/6295> is also associated with this bug.

Comment 7 openshift-github-bot 2015-10-30 23:05:36 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/9d05226005c8318e5616964071eb10dada8b8c4f
Bug 1265609: Fix pandas not getting installed

Pandas package fails to install properly if it's included in setup.py
for Python 2 & 3 applications.

After this commit, python cartridge control file will check to see if
'pip_install' marker is present. If it is, then control file will use
pip to install packages. If not, the standard 'python setup.py install'
is executed instead. Also, environment variable "OPENSHIFT_PYTHON_USE_PIP"
is set to 'enable' if the marker file exists. Otherwise, it is set to
'disable'.

In order to use pip install as default, a file named 'pip_install' needs
to be created in directory .openshift/markers/ inside the application git
repository.

Bug <1265609>
Link <https://bugzilla.redhat.com/show_bug.cgi?id=1265609>

Signed-off-by: Vu Dinh <vdinh>
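
The behaviour described in this commit message can be sketched in Python as follows. This is only an illustration, not the cartridge's actual control script: the function names are hypothetical, and the exact pip arguments (`pip install -e <repo>`) are an assumption inferred from the "Obtaining file://..." and "Running setup.py develop" lines in the verification log in this report.

```python
import os

# Marker location taken from the commit message above.
MARKER_PATH = os.path.join(".openshift", "markers", "pip_install")


def enable_pip_install(repo_dir):
    """Create the empty pip_install marker inside repo_dir and return its path."""
    marker = os.path.join(repo_dir, MARKER_PATH)
    marker_dir = os.path.dirname(marker)
    if not os.path.isdir(marker_dir):
        os.makedirs(marker_dir)
    open(marker, "a").close()  # same effect as `touch` on the marker file
    return marker


def choose_install_method(repo_dir):
    """Return (command, OPENSHIFT_PYTHON_USE_PIP value) for repo_dir.

    The marker check follows the commit message; the concrete command
    lines are assumptions made for this sketch.
    """
    marker = os.path.join(repo_dir, MARKER_PATH)
    if os.path.isfile(marker):
        # Assumption: editable install of the application repository.
        return ["pip", "install", "-e", repo_dir], "enable"
    # No marker: fall back to the standard setup.py installation.
    return ["python", "setup.py", "install"], "disable"
```

Calling `enable_pip_install(".")` from the root of the application git repository, then committing and pushing the new file, should have the same effect as creating the marker by hand.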

Comment 12 Johnny Liu 2015-11-18 07:08:12 UTC
Verified this bug with , and PASS.


Create a python-3.3 app (or python-2.7).
In the app's git repo, add pandas to setup.py and create the following marker file:
$ vi setup.py
...
install_requires=['pandas'],
...
$ touch .openshift/markers/pip_install
$ git add .
$ git commit -a -m"added"; git push

remote: Activating virtenv
remote: Checking for pip dependency listed in requirements.txt file..
remote: The directory '/var/lib/openshift/jialiu-python33app-1/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
remote: The directory '/var/lib/openshift/jialiu-python33app-1/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
remote: You must give at least one requirement to install (see "pip help install")
remote: Checking pip install marker..
remote: Running pip install..
remote: The directory '/var/lib/openshift/jialiu-python33app-1/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
remote: The directory '/var/lib/openshift/jialiu-python33app-1/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
remote: Obtaining file:///var/lib/openshift/jialiu-python33app-1/app-root/runtime/repo
remote: Collecting pandas (from YourAppName==1.0)
remote:   Downloading pandas-0.17.0.tar.gz (6.5MB)
remote: Collecting python-dateutil>=2 (from pandas->YourAppName==1.0)
remote:   Downloading python_dateutil-2.4.2-py2.py3-none-any.whl (188kB)
remote: Collecting pytz>=2011k (from pandas->YourAppName==1.0)
remote:   Downloading pytz-2015.7-py2.py3-none-any.whl (476kB)
remote: Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7.0 in /opt/rh/python33/root/usr/lib64/python3.3/site-packages (from pandas->YourAppName==1.0)
remote: Collecting six>=1.5 (from python-dateutil>=2->pandas->YourAppName==1.0)
remote:   Downloading six-1.10.0-py2.py3-none-any.whl
remote: Installing collected packages: six, python-dateutil, pytz, pandas, YourAppName
remote:   Found existing installation: six 1.3.0
remote:     DEPRECATION: Uninstalling a distutils installed project (six) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
remote:     Not uninstalling six at /opt/rh/python33/root/usr/lib/python3.3/site-packages, outside environment /var/lib/openshift/jialiu-python33app-1/python/virtenv/venv
remote:   Running setup.py install for pandas
remote:   Running setup.py develop for YourAppName
remote: Successfully installed YourAppName pandas python-dateutil-2.4.2 pytz-2015.7 six-1.10.0
remote: Preparing build for deployment
remote: Deployment id is ad918187
remote: Activating deployment
remote: Starting Python 3.3 cartridge (Apache+mod_wsgi)
remote: Application directory "/" selected as DocumentRoot
remote: Application "wsgi.py" selected as default WSGI entry point


From the output, "pandas" is installed as a dependency.
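
Beyond reading the push log, the result could also be checked from inside the gear's virtualenv. The following is a minimal sketch, assuming it is run with the application's own Python interpreter; the helper name `is_importable` is hypothetical and is not part of the cartridge.

```python
import importlib


def is_importable(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Prints True only if pandas was actually installed into this environment.
    print(is_importable("pandas"))
```

`importlib.import_module` is available on both Python 2.7 and Python 3.3, which are the two cartridge versions discussed in this report.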

Comment 13 Johnny Liu 2015-11-18 07:20:35 UTC
The above verification was executed against the OpenShiftEnterpriseErrata/2.2/2015-11-12.1 puddle.

Comment 15 errata-xmlrpc 2015-12-17 17:10:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-2666.html

Comment 20 Vu Dinh 2016-02-18 21:32:09 UTC
Hi Madhavprasad,

I just want to give you an update on this issue. Installing pandas seems to require more memory than a small gear (512MB) can provide. As a result, the ssh connection is terminated due to the memory issue. I have tested pip install pandas on medium and large gears and it works fine.

pandas is pretty big and requires quite a few dependencies, including numpy. It's used for heavy data analysis, so it should be restricted to a medium gear or bigger.
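
One rough way to check how much memory an install needs is to run it under a small wrapper that reports peak resident memory. The sketch below is an assumption-laden illustration, not a tool shipped with OpenShift: the function name is hypothetical, it works only on Unix-like systems, and it reports the peak of the single largest child process, so a build that runs several compiler processes at once may use more in total than the figure shown.

```python
import resource
import subprocess
import sys

SMALL_GEAR_LIMIT_MB = 512  # small gear size quoted in this comment


def run_and_report_peak_rss(cmd):
    """Run cmd, wait for it, and return (exit code, peak child RSS in MB)."""
    returncode = subprocess.call(cmd)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return returncode, peak / divisor


if __name__ == "__main__":
    # Example: measure a trivial command; substitute the real install command.
    code, peak_mb = run_and_report_peak_rss([sys.executable, "-c", "pass"])
    print("exit code:", code)
    print("peak RSS of largest child (MB):", round(peak_mb, 1))
    print("over small gear limit:", peak_mb > SMALL_GEAR_LIMIT_MB)
```

Running the wrapper around the actual install on a larger gear or a local machine gives a lower bound on the memory the install needs, which can then be compared with the gear size.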

If you can give it a try and let me know if it's working for you, that would be great.

Thanks,
Vu

