Bug 1444729 - Container Review Request: cassandra - OpenSource database server for high-scale application
Summary: Container Review Request: cassandra - OpenSource database server for high-sca...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora Container Images
Classification: Fedora
Component: Container Review
Version: 26
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Honza Horak
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-04-24 06:54 UTC by Tomas Repik
Modified: 2017-07-20 13:37 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-20 13:37:40 UTC
Type: Bug
Embargoed:
hhorak: fedora-review+


Attachments (Terms of Use)

Description Tomas Repik 2017-04-24 06:54:04 UTC
Dockerfile URL: https://trepik.fedorapeople.org/fedora-cassandra-3.9/v0/Dockerfile
Other files URL: https://trepik.fedorapeople.org/fedora-cassandra-3.9/v0/

Description: Cassandra is a partitioned row store. Rows are organized into tables with a required primary key. Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent matter. Cassandra will automatically re-partition as machines are added/removed from the cluster. Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

You can find more information on the Apache Cassandra project from the project Web site (https://cassandra.apache.org).

Fedora Account System Username: trepik

Comment 1 Juraci Paixão Kröhling 2017-04-27 13:11:52 UTC
I just tried this, but not sure I'm missing something. I built via `docker build . -t jpkroehling/cassandra` and ran as `docker run --rm jpkroehling/cassandra`, but I saw no output at first. When issuing a Ctrl+C, I see the typical Cassandra output. 

The line 8 of `run-cassandra` looks suspicious to me and could explain the behavior above, as it seems that it's starting Cassandra and not stopping it.

Other comments:

1) I'm not familiar with the Fedora Docker images, but wouldn't it be better to use `FROM fedora:rawhide` instead of `FROM registry.fedoraproject.org/fedora:rawhide`? Leaving the registry address out of the image means that I could use a local/corporate registry instead of forcing the usage of Fedora's. Also, the default registry is served by a good CDN, which is waaay faster than `registry.fedoraproject.org` (for me, at least). Of course, I'd personally prefer to have `jboss/base-jdk:8` as the base image :)

2) CONTAINER_SCRIPTS_PATH is exported, but the referred path doesn't exist:

```
$ docker run -it --rm jpkroehling/cassandra bash
bash-4.4$ ls $CONTAINER_SCRIPTS_PATH 
ls: cannot access '/usr/share/container-scripts/cassandra': No such file or directory
```

3) OpenShift assigns a random UID to the process within the container, so, the user that ends up running the process might not have enough permissions. For JBoss images, it's common to do stuff like this: 

https://github.com/jboss-dockerfiles/hawkular-apm/blob/master/elasticsearch/Dockerfile#L25
https://github.com/jboss-dockerfiles/infinispan/blob/master/server/Dockerfile#L30

Comment 2 Tomas Repik 2017-05-02 12:22:34 UTC
(In reply to Juraci Paixão Kröhling from comment #1)
> The line 8 of `run-cassandra` looks suspicious to me and could explain the
> behavior above, as it seems that it's starting Cassandra and not stopping it.

What I'm actually doing is running cassandra to get the IP address of the machine to store it as a SEED in the config (cassandra.yaml) file. The first run of cassandra always fails, because there are no seeds set in config file, and thus does not have to be stopped (no need for Ctrl+C). The first run is done in silence so you don't see any output in terminal. Once the first run is done and the config file is filled with the obtained seed value, the server starts normally.

> 1) wouldn't it be better to use `FROM fedora:rawhide` instead of `FROM
> registry.fedoraproject.org/fedora:rawhide`?

guidelines say that registry is mandatory therefore using it that way [1]

> 2) CONTAINER_SCRIPTS_PATH is exported, but the referred path doesn't exist:

Yes you're right the concept is taken from other database containers we have and so far in cassandra we don't have any use for it. I will skip it if don't use it.
 
> 3) OpenShift assigns a random UID to the process within the container, so,
> the user that ends up running the process might not have enough permissions.

I had the loosening of permissions in the dockerfile and I did not know the purpose for it so I commented it out. For Fedora it is not necessary for openshift we can definitely include it.

Thank you very much for feedback!

[1] https://fedoraproject.org/wiki/Container:Guidelines#FROM

Comment 3 Honza Horak 2017-05-24 07:45:44 UTC
As for the guidelines [1], I don't see:

* summary and description labels, that are marked as required; I propose having those defined as ENV first, and then re-use for other labels, like in [2]; description should give enough information about purpose and content

* help.md or README.md file in the root, where usage would be mentioned; structure of the README.md might be taken from [3]

Otherwise I don't see any blockers for the review.

[1] https://fedoraproject.org/wiki/Container:Guidelines#LABELS
[2] https://hhorak.fedorapeople.org/nginx-docker/Dockerfile
[3] https://github.com/hhorak/mariadb-container/blob/test-new-man-page/10.1/README.md

Comment 4 Tomas Repik 2017-05-25 08:10:46 UTC
Dockerfile URL: https://trepik.fedorapeople.org/fedora-cassandra-3.9/v1/Dockerfile
Other files URL: https://trepik.fedorapeople.org/fedora-cassandra-3.9/v1/

I added both summary and description labels as suggested, although only summary seems to be mandatory according to the guidelines. Also I created a README.md file and placed it into root/usr/share/container-scripts/cassandra/ with a symlink in the root directory, as done by mongodb container.

Thanks for the review!

Comment 5 Honza Horak 2017-05-26 15:40:10 UTC
Labels seem fine, and the idea with README.md looks like a good idea as well, although I don't see those files in https://trepik.fedorapeople.org/fedora-cassandra-3.9/v1/root/usr/share/container-scripts/cassandra/.. maybe wrong file permissions?

Anyway, I believe the file is there, so I don't take this as a blocker.. So approving.

Comment 6 Tomas Repik 2017-05-29 06:58:15 UTC
File permissions are not the issue however the naming is. No "README.md" file is displayed in the pedorapeople.org web interface.

Comment 7 Juraci Paixão Kröhling 2017-05-29 09:11:32 UTC
On the Dockerfile and container-entrypoint/run.sh, would it be possible to externalize the options into env vars that could be overriden by the end user? Something like this:

https://github.com/jpkrohling/jboss-dockerfiles-hawkular/blob/JPK-AddedCassandra/cassandra/files/bin/run.sh

It would also be interesting to have a readiness probe, so that Kubernetes/OpenShift (or any other platform starting this container) would know when it's ready to serve requests.

Comment 8 Gwyn Ciesla 2017-05-29 19:23:37 UTC
Package request has been approved: https://admin.fedoraproject.org/pkgdb/package/container/cassandra


Note You need to log in before you can comment on or make changes to this bug.