Bug 1494543
| Summary: | ceph-disk prepare cannot find partition that does exist when run inside containers | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | John Fulton <johfulto> | |
| Component: | Ceph-Disk | Assignee: | Kefu Chai <kchai> | |
| Status: | CLOSED ERRATA | QA Contact: | John Fulton <johfulto> | |
| Severity: | high | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 3.0 | CC: | afazekas, gfidente, hnallurv, jefbrown, johfulto, jschluet, kchai, kdreyer, sasha, shan, yrabl | |
| Target Milestone: | rc | Keywords: | Triaged | |
| Target Release: | 3.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | RHEL: ceph-12.2.1-13.el7cp Ubuntu: ceph_12.2.1-15redhat1xenial | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1496509 (view as bug list) | Environment: | ||
| Last Closed: | 2017-12-05 23:45:31 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1496509 | |||
In the workaround section I provided an example with the upstream docker.io/ceph/daemon:tag-build-master-jewel-centos-7. However, I have also seen this issue, and used the same workaround, with docker-registry.engineering.redhat.com/ceph/rhceph-2-rhel7:latest.

John,
> [Errno 2] No such file or directory: '/dev/vdb1'
The exception looks weird if "/dev/vdb1" exists. Can you reproduce this by rerunning "ceph-disk prepare /dev/vdb1" after this command fails? And what is the output if you run
$ stat /dev/sda
right after "ceph-disk prepare /dev/vdb1" fails?
I see that the release flag has this targeted at ceph-3.0, but the package/container reported against is jewel/ceph 2.

I was able to reproduce this again:
[root@overcloud-cephstorage-2 ~]# docker start ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb ; docker logs -f ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb
...
command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_type
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
get_dm_uuid: get_dm_uuid /dev/vdd uuid path is /sys/dev/block/252:48/dm/uuid
prepare_device: OSD will not be hot-swappable if journal is not the same device as the osd data
get_dm_uuid: get_dm_uuid /dev/vdd uuid path is /sys/dev/block/252:48/dm/uuid
ptype_tobe_for_name: name = journal
get_dm_uuid: get_dm_uuid /dev/vdd uuid path is /sys/dev/block/252:48/dm/uuid
command: Running command: /usr/sbin/parted --machine -- /dev/vdd print
get_free_partition_index: get_free_partition_index: analyzing BYT;
/dev/vdd:1074MB:virtblk:512:512:gpt:Virtio Block Device:;
2:269MB:538MB:268MB::ceph journal:;
create_partition: Creating journal partition num 3 size 256 on /dev/vdd
command_check_call: Running command: /usr/sbin/sgdisk --new=3:0:+256M --change-name=3:ceph journal --partition-guid=3:d7dcab81-6d89-48ba-bd9e-081dceba85b3 --typecode=3:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/vdd
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot.
The operation has completed successfully.
update_partition: Calling partprobe on created device /dev/vdd
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
command: Running command: /usr/bin/flock -s /dev/vdd /usr/sbin/partprobe /dev/vdd
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid: get_dm_uuid /dev/vdd uuid path is /sys/dev/block/252:48/dm/uuid
get_dm_uuid: get_dm_uuid /dev/vdd uuid path is /sys/dev/block/252:48/dm/uuid
get_dm_uuid: get_dm_uuid /dev/vdd3 uuid path is /sys/dev/block/252:51/dm/uuid
prepare_device: Journal is GPT partition /dev/disk/by-partuuid/d7dcab81-6d89-48ba-bd9e-081dceba85b3
prepare_device: Journal is GPT partition /dev/disk/by-partuuid/d7dcab81-6d89-48ba-bd9e-081dceba85b3
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
set_data_partition: Creating osd partition on /dev/vdb
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
ptype_tobe_for_name: name = data
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
create_partition: Creating data partition num 1 size 0 on /dev/vdb
command_check_call: Running command: /usr/sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:ad14c7b6-c23e-4357-a594-009234f7069e --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be --mbrtogpt -- /dev/vdb
The operation has completed successfully.
update_partition: Calling partprobe on created device /dev/vdb
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
command: Running command: /usr/bin/flock -s /dev/vdb /usr/sbin/partprobe /dev/vdb
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 9, in <module>
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5095, in run
main(sys.argv[1:])
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5046, in main
args.func(args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1855, in main
Prepare.factory(args).prepare()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1844, in prepare
self.prepare_locked()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1875, in prepare_locked
self.data.prepare(self.journal)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2542, in prepare
self.prepare_device(*to_prepare_list)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2705, in prepare_device
self.set_data_partition()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2632, in set_data_partition
self.partition = self.create_data_partition()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2616, in create_data_partition
return device.get_partition(partition_number)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1622, in get_partition
path=self.path, dev=dev, args=self.args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1685, in factory
(dev is not None and is_mpath(dev))):
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 535, in is_mpath
uuid = get_dm_uuid(dev)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 522, in get_dm_uuid
uuid_path = os.path.join(block_path(dev), 'dm', 'uuid')
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 516, in block_path
rdev = os.stat(path).st_rdev
OSError: [Errno 2] No such file or directory: '/dev/vdb1'
[root@overcloud-cephstorage-2 ~]# stat /dev/vdb1
File: ‘/dev/vdb1’
Size: 0 Blocks: 0 IO Block: 4096 block special file
Device: 5h/5d Inode: 3594514 Links: 1 Device type: fc,11
Access: (0660/brw-rw----) Uid: ( 0/ root) Gid: ( 6/ disk)
Context: system_u:object_r:fixed_disk_device_t:s0
Access: 2017-09-29 19:43:35.502364450 +0000
Modify: 2017-09-29 19:43:35.502364450 +0000
Change: 2017-09-29 19:43:35.502364450 +0000
Birth: -
[root@overcloud-cephstorage-2 ~]# stat /dev/sda
stat: cannot stat ‘/dev/sda’: No such file or directory
[root@overcloud-cephstorage-2 ~]# stat /dev/vda
File: ‘/dev/vda’
Size: 0 Blocks: 0 IO Block: 4096 block special file
Device: 5h/5d Inode: 9480 Links: 1 Device type: fc,0
Access: (0660/brw-rw----) Uid: ( 0/ root) Gid: ( 6/ disk)
Context: system_u:object_r:fixed_disk_device_t:s0
Access: 2017-09-29 19:24:19.883364450 +0000
Modify: 2017-09-25 01:58:34.239364450 +0000
Change: 2017-09-25 01:58:34.239364450 +0000
Birth: -
[root@overcloud-cephstorage-2 ~]#
(In reply to Jon Schlueter from comment #6)
> I see that release flag has this targeted at ceph-3.0 but package/container
> reported against is jewel/ceph 2

re-assigning needinfo after accidentally closing all need-infos

(In reply to Kefu Chai from comment #5)
> John,
>
> > [Errno 2] No such file or directory: '/dev/vdb1'
>
> the exception looks weird if "/dev/vdb1" exists. can you reproduce this by
> rerunning "ceph-disk prepare /dev/vdb1" after this command fails?

Rather than directly run "ceph-disk prepare /dev/vdb1", I had the ceph container run it by running:

docker start ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb

> and what is the output if you run
>
> $ stat /dev/sda
>
> right after "ceph-disk prepare /dev/vdb1" fails?

I ran this as requested after reproducing the error and pasted the output in comment #7.

If you want to reproduce this error, here are the steps.
1. Use ceph-ansible to do a containerized deployment using site-docker.yaml.
2. If the deployment fails on this, you've reproduced it. If it doesn't fail, you can do the following.
Given a working containerized Ceph deployment:
[root@overcloud-cephstorage-2 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ba88ee78641b docker.io/ceph/daemon:tag-build-master-jewel-centos-7 "/entrypoint.sh" 10 minutes ago Up 10 minutes ceph-osd-overcloud-cephstorage-2-devvdb
983ef4bc33b7 tripleoupstream/centos-binary-cron:latest "kolla_start" 4 days ago Up 4 days logrotate_crond
b0cebb90f70f docker.io/ceph/daemon:tag-build-master-jewel-centos-7 "/entrypoint.sh" 4 days ago Up 4 days ceph-osd-overcloud-cephstorage-2-devvdc
[root@overcloud-cephstorage-2 ~]#
Determine the OSD number of the container you want to stop by asking that container to show its output of df:
[root@overcloud-cephstorage-2 ~]# docker exec -ti ba88ee78641b df
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 52416492 4906864 47509628 10% /
tmpfs 4005268 0 4005268 0% /sys/fs/cgroup
devtmpfs 3979576 0 3979576 0% /dev
shm 65536 0 65536 0% /dev/shm
/dev/vda2 52416492 4906864 47509628 10% /etc/ceph
/dev/vdb1 1044076 33576 1010500 4% /var/lib/ceph/osd/ceph-0
[root@overcloud-cephstorage-2 ~]#
To tell systemd not to restart the container after you take it down, change its unit file as follows (note that "no" is systemd's value for disabling restarts):
-Restart=always
+Restart=no
The line to change is marked below; don't forget to reload systemd afterwards.
[root@overcloud-cephstorage-2 ~]# vi /etc/systemd/system/ceph-osd\@.service
# Please do not change this file directly since it is managed by Ansible and will be overwritten
[Unit]
Description=Ceph OSD
After=docker.service
[Service]
EnvironmentFile=-/etc/environment
ExecStartPre=-/usr/bin/docker stop ceph-osd-overcloud-cephstorage-2-dev%i
ExecStartPre=-/usr/bin/docker rm -f ceph-osd-overcloud-cephstorage-2-dev%i
ExecStart=/usr/share/ceph-osd-run.sh %i
ExecStop=-/usr/bin/docker stop ceph-osd-overcloud-cephstorage-2-dev%i
Restart=always # <---------------------------------------------------------- HERE!
RestartSec=10s
TimeoutStartSec=120
TimeoutStopSec=15
[Install]
WantedBy=multi-user.target
[root@overcloud-cephstorage-2 ~]# systemctl daemon-reload
[root@overcloud-cephstorage-2 ~]#
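The unit-file edit above can also be scripted. A minimal sketch; the `set_restart_policy` helper is hypothetical, and the unit path is the one from this environment (systemd's valid value for disabling restarts is "no"):

```shell
# set_restart_policy: rewrite the Restart= line of a systemd unit file
# in place (hypothetical helper, not part of ceph-ansible).
set_restart_policy() {
  local unit=$1 policy=$2
  # Only lines that are exactly "Restart=..." match; "RestartSec=" is untouched.
  sed -i "s/^Restart=.*/Restart=${policy}/" "$unit"
}

# Usage sketch in this environment:
#   set_restart_policy /etc/systemd/system/ceph-osd@.service no
#   systemctl daemon-reload
```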
Stop the container with systemd (the command below is what systemd runs):
/usr/bin/docker stop ceph-osd-overcloud-cephstorage-2-devvdb
Verify the OSD is down
[root@overcloud-cephstorage-2 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
983ef4bc33b7 tripleoupstream/centos-binary-cron:latest "kolla_start" 4 days ago Up 4 days logrotate_crond
b0cebb90f70f docker.io/ceph/daemon:tag-build-master-jewel-centos-7 "/entrypoint.sh" 4 days ago Up 4 days ceph-osd-overcloud-cephstorage-2-devvdc
[root@overcloud-cephstorage-2 ~]# ceph -s
cluster 7c5b1446-a16b-11e7-aacd-00e165f39225
health HEALTH_WARN
1/6 in osds are down
monmap e1: 1 mons at {overcloud-controller-0=192.168.24.9:6789/0}
election epoch 3, quorum 0 overcloud-controller-0
fsmap e5: 1/1/1 up {0=overcloud-controller-0=up:active}
osdmap e22: 6 osds: 5 up, 6 in
flags sortbitwise,require_jewel_osds
pgmap v1600: 240 pgs, 8 pools, 40164 kB data, 30 objects
240 MB used, 5877 MB / 6117 MB avail
240 active+clean
[root@overcloud-cephstorage-2 ~]#
Have a container zap the disk:
[root@overcloud-cephstorage-2 ~]# docker run -ti --privileged=true -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdb docker.io/ceph/daemon:tag-build-master-jewel-centos-7 zap_device
ownership of '/var/run/ceph/' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-mds' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-osd' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-rgw' retained as ceph:ceph
ownership of '/var/lib/ceph/mds' retained as ceph:ceph
changed ownership of '/var/lib/ceph/mds/ceph-mds-1f2980da8c8d' from root:root to ceph:ceph
ownership of '/var/lib/ceph/mon' retained as ceph:ceph
changed ownership of '/var/lib/ceph/mon/ceph-1f2980da8c8d' from root:root to ceph:ceph
ownership of '/var/lib/ceph/osd' retained as ceph:ceph
changed ownership of '/var/lib/ceph/radosgw' from root:root to ceph:ceph
changed ownership of '/var/lib/ceph/radosgw/1f2980da8c8d' from root:root to ceph:ceph
ownership of '/var/lib/ceph/tmp' retained as ceph:ceph
2017-09-29 19:13:37 /entrypoint.sh: Zapping the entire device /dev/vdb
GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
The operation has completed successfully.
2017-09-29 19:13:39 /entrypoint.sh: Executing partprobe on /dev/vdb
[root@overcloud-cephstorage-2 ~]#
If you're using a separate disk for a journal, then you need to zap the journal partition too.
parted /dev/vdd --script rm 1
Note that I had verified that partition 1 was the journal in use by vdb.
[root@overcloud-cephstorage-2 ~]# ceph-disk list
/dev/vda :
/dev/vda1 other, iso9660
/dev/vda2 other, xfs, mounted on /
/dev/vdb other, unknown
/dev/vdc :
/dev/vdc1 ceph data, prepared, cluster ceph, osd.5, journal /dev/vdd2
/dev/vdd :
/dev/vdd2 ceph journal, for /dev/vdc1
[root@overcloud-cephstorage-2 ~]#
In my case I want to remove all partitions on the journal disk except the second one.
for p in $(parted /dev/vdd --script print | awk {'print $1'} | egrep [0-9] | grep -v 2); do parted /dev/vdd --script rm $p; done
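One caveat with the one-liner above: `grep -v 2` filters by substring, so it would also skip partitions 12, 20, and so on. A safer sketch that filters by exact partition number (the `filter_partitions` helper is hypothetical, and assumes plain, non `--machine`, `parted print` output):

```shell
# filter_partitions: read `parted --script print` output on stdin and emit
# every partition number except the one passed as $1. Exact match, so
# keeping "2" does not also keep 12 or 20 the way `grep -v 2` would.
filter_partitions() {
  awk '/^ *[0-9]+/ {print $1}' | grep -vx "$1"
}

# Usage sketch against the journal disk in this environment:
#   for p in $(parted /dev/vdd --script print | filter_partitions 2); do
#     parted /dev/vdd --script rm "$p"
#   done
```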
Have a container prepare the disk:
[root@overcloud-cephstorage-2 ~]# docker start ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb
ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb
[root@overcloud-cephstorage-2 ~]#
See if the preparation worked.
docker logs -f ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb
If it did, then re-zap and re-prepare until the prepare fails.
In Summary
Erase:
docker run -ti --privileged=true -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdb docker.io/ceph/daemon:tag-build-master-jewel-centos-7 zap_device ; for p in $(parted /dev/vdd --script print | awk {'print $1'} | egrep [0-9] | grep -v 2); do parted /dev/vdd --script rm $p; done
Try:
docker start ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb ; docker logs -f ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb
I had to run the above commands to erase and try five times in a row, but I did get the error the last time.
Note that I am using virtual hardware.
(In reply to Kefu Chai from comment #5)
> John,
>
> > [Errno 2] No such file or directory: '/dev/vdb1'
>
> the exception looks weird if "/dev/vdb1" exists. can you reproduce this by
> rerunning "ceph-disk prepare /dev/vdb1" after this command fails? and what
> is the output if you run
>
> $ stat /dev/sda
>
> right after "ceph-disk prepare /dev/vdb1" fails?

I want to be sure I'm providing the info you need, so I reproduced the error again using the following loop and then doing a Ctrl-C when I saw it.

while [ 1 ] ; do docker run -ti --privileged=true -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdb docker.io/ceph/daemon:tag-build-master-jewel-centos-7 zap_device ; for p in $(parted /dev/vdd --script print | awk {'print $1'} | egrep [0-9] | grep -v 2); do parted /dev/vdd --script rm $p; done ; docker start ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb ; docker logs -f ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb ; sleep 1; done ;

Here is a paste from that...

get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
create_partition: Creating data partition num 1 size 0 on /dev/vdb
command_check_call: Running command: /usr/sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:4a323d18-e61e-4f47-ae98-291d44625203 --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be --mbrtogpt -- /dev/vdb
The operation has completed successfully.
update_partition: Calling partprobe on created device /dev/vdb
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
command: Running command: /usr/bin/flock -s /dev/vdb /usr/sbin/partprobe /dev/vdb
command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
get_dm_uuid: get_dm_uuid /dev/vdb uuid path is /sys/dev/block/252:16/dm/uuid
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 9, in <module>
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5095, in run
main(sys.argv[1:])
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5046, in main
args.func(args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1855, in main
Prepare.factory(args).prepare()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1844, in prepare
self.prepare_locked()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1875, in prepare_locked
self.data.prepare(self.journal)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2542, in prepare
self.prepare_device(*to_prepare_list)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2705, in prepare_device
self.set_data_partition()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2632, in set_data_partition
self.partition = self.create_data_partition()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2616, in create_data_partition
return device.get_partition(partition_number)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1622, in get_partition
path=self.path, dev=dev, args=self.args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1685, in factory
(dev is not None and is_mpath(dev))):
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 535, in is_mpath
uuid = get_dm_uuid(dev)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 522, in get_dm_uuid
uuid_path = os.path.join(block_path(dev), 'dm', 'uuid')
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 516, in block_path
rdev = os.stat(path).st_rdev
OSError: [Errno 2] No such file or directory: '/dev/vdb1'
ownership of '/var/run/ceph/' retained as ceph:ceph
ownership of '/var/lib/ceph/osd' retained as ceph:ceph
ownership of '/var/lib/ceph/osd/ceph-0' retained as ceph:ceph
ownership of '/var/lib/ceph/osd/ceph-5' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-mds' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-mds/ceph.keyring' retained as ceph:ceph
ownership of '/var/lib/ceph/radosgw' retained as ceph:ceph
ownership of '/var/lib/ceph/radosgw/overcloud-cephstorage-2' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-osd' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-osd/ceph.keyring' retained as ceph:ceph
ownership of '/var/lib/ceph/mds' retained as ceph:ceph
ownership of '/var/lib/ceph/mds/ceph-mds-overcloud-cephstorage-2' retained as ceph:ceph
ownership of '/var/lib/ceph/mon' retained as ceph:ceph
ownership of '/var/lib/ceph/mon/ceph-overcloud-cephstorage-2' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-rgw' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-rgw/ceph.keyring' retained as ceph:ceph
ownership of '/var/lib/ceph/tmp' retained as ceph:ceph
ownership of '/var/lib/ceph/tmp/ceph-disk.prepare.lock' retained as ceph:ceph
ownership of '/var/lib/ceph/tmp/ceph-disk.activate.lock' retained as ceph:ceph
ownership of '/var/lib/ceph/bootstrap-rbd' retained as ceph:ceph
2017-09-29 20:47:39 /entrypoint.sh: static: does not generate config
HEALTH_OK
2017-09-29 20:47:39 /entrypoint.sh: Regarding parted, device /dev/vdb is inconsistent/broken/weird.
2017-09-29 20:47:39 /entrypoint.sh: It would be too dangerous to destroy it without any notification.
2017-09-29 20:47:39 /entrypoint.sh: Please set OSD_FORCE_ZAP to '1' if you really want to zap this disk.
^C
[root@overcloud-cephstorage-2 ~]# ^C
[root@overcloud-cephstorage-2 ~]# ceph-disk prepare /dev/vdb1
Traceback (most recent call last):
File "/sbin/ceph-disk", line 9, in <module>
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5047, in run
main(sys.argv[1:])
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5000, in main
main_catch(args.func, args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5025, in main_catch
func(args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1812, in main
Prepare.factory(args).prepare()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1808, in factory
return PrepareFilestore(args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1820, in __init__
self.data = PrepareFilestoreData(args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2420, in __init__
self.set_type()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2428, in set_type
dmode = os.stat(self.args.data).st_mode
OSError: [Errno 2] No such file or directory: '/dev/vdb1'
[root@overcloud-cephstorage-2 ~]#

However, I had originally asked it to prepare /dev/vdb, not /dev/vdb1. It's prepare that tries to create vdb1, as the above contains the following:

create_partition: Creating data partition num 1 size 0 on /dev/vdb

but it's not there:

[root@overcloud-cephstorage-2 ~]# parted /dev/vdb --script print
Error: /dev/vdb: unrecognised disk label
Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 1074MB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:
[root@overcloud-cephstorage-2 ~]#

If you were asking me to run that, not in a container but directly, then here's that output:

[root@overcloud-cephstorage-2 ~]# ceph-disk prepare /dev/vdb
Creating new GPT entries.
The operation has completed successfully.
The operation has completed successfully.
meta-data=/dev/vdb1              isize=2048   agcount=4, agsize=49087 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=196347, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=864, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
The operation has completed successfully.
[root@overcloud-cephstorage-2 ~]# parted /dev/vdb --script print
Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 1074MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size   File system  Name          Flags
 2      1049kB  269MB   268MB  xfs          ceph journal
 1      269MB   1074MB  804MB  xfs          ceph data

[root@overcloud-cephstorage-2 ~]#

We seem to only hit this when using containers. I received no reports about it outside of containers, and I tried to reproduce it without containers using the following loop:

while [ 1 ] ; do sgdisk -Z /dev/vdb ; ceph-disk prepare /dev/vdb ; done

However, I didn't run into it. Until there's other evidence, I think this is a containers-only bug for ceph-disk.

*** Bug 1490283 has been marked as a duplicate of this bug. ***

I changed the subject to say "in containers" as this does not happen outside of containers. This appears to be a timing issue when running inside the container: partitions need more time to show up and ceph-disk doesn't wait for them to be fully available. Similar symptoms might have been fixed in the past by this:

https://github.com/ceph/ceph/commit/702edb5519e67bc5f8c5b65c6f63c9635cd758cf

but that fix is in my container and it is still an issue. I can provide an environment to anyone working on ceph-disk where the problem exists.
(In reply to Jon Schlueter from comment #6)
> I see that release flag has this targeted at ceph-3.0 but package/container
> reported against is jewel/ceph 2

The same fix could be made against ceph-3.0 and then be backported to the following clone of this bug to get it into jewel/ceph 2:

https://bugzilla.redhat.com/show_bug.cgi?id=1496509

Kefu, I had a brief conversation with Seb. He thinks that this pattern may need to be applied elsewhere in ceph-disk for Jewel (not just where it is currently applied). Do you think that's relevant?

https://github.com/ceph/ceph/commit/93e7b95ed8b4c78daebf7866bb1f0826d7199075

Thanks, John

John, to the best of my knowledge, https://github.com/ceph/ceph/commit/93e7b95ed8b4c78daebf7866bb1f0826d7199075 keeps polling /sys/block until the expected partition sysfs file shows up there. A sysfs attribute file is created by the Linux kernel when a new partition is found: partprobe sends ioctl(BLKPG, ..) calls to instruct the kernel to add/del/resize partitions, and the ioctl() syscall does not return until the sysfs file is created. So it is entirely the kernel's job to create sysfs files, and it does so in a blocking way.

In general, though, the files under /dev/ representing partitions are created by udev. As Loïc pointed out, udev and (partprobe + the Linux kernel) are asynchronous: partprobe instructs the kernel to add new partitions when they are created, the kernel sends uevents to udev in userspace, and in the meanwhile udev adds the device files under /dev/ when handling the uevents from the kernel.

Anyway, applying the same fix -- retry if a /dev/ file does not exist -- might help. As per the analysis of http://tracker.ceph.com/issues/19428, the sysfs file takes up to 1 sec to show up under /sys/block; probably the same applies to /dev/ in a docker container. I will prepare a patch addressing it.

Also, "udevadm settle" waits until the event queue is empty by watching /run/udev using inotify(7).
And the funny thing about udevadm is that it returns 0 if the /run/udev directory does not exist at all. See adm_settle() and udev_queue_get_fd(). Quote from adm_settle() in src/udev/udevadm-settle.c of systemd-234:

===========>8============
pfd[0].fd = udev_queue_get_fd(queue);
if (pfd[0].fd < 0) {
        log_debug("queue is empty, nothing to watch");
        rc = EXIT_SUCCESS;
        goto out;
}
===========8<============

I verified my theory in the testbed:

[root@overcloud-cephstorage-2 ~]# docker run -ti --entrypoint=bash docker.io/ceph/daemon:tag-build-master-jewel-centos-7
[root@df045229ec6c /]# udevadm --version
219
[root@a036184074e4 /]# ls /run/udev/
ls: cannot access /run/udev/: No such file or directory
[root@a036184074e4 /]# udevadm settle
[root@a036184074e4 /]# echo $?
0
[root@a036184074e4 /]# exit
[root@overcloud-cephstorage-2 ~]# ls /run/udev/
control  data  links  static_node-tags  tags  watch

So, to ensure that "udevadm" is able to check the status of udev's event queue, we need to pass "-v /run/udev:/run/udev:ro" to docker. Take the reproducer as an example:

while true ; do docker run -ti --privileged=true -v /dev/:/dev/ -v /run/udev:/run/udev:ro -e OSD_DEVICE=/dev/vdb docker.io/ceph/daemon:tag-build-master-jewel-centos-7 zap_device ; for p in $(parted /dev/vdd --script print | awk {'print $1'} | egrep [0-9] | grep -v 2); do parted /dev/vdd --script rm $p; done ; docker start ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb ; docker logs -f ceph-osd-prepare-overcloud-cephstorage-2-devdevvdb ; sleep 1; done ;

Also, https://github.com/ceph/ceph-ansible/pull/1531 could address this issue. @john, have you had a chance to try it?

Upstream PR posted at https://github.com/ceph/ceph/pull/18162. Will backport it once it gets merged.

Forgot to mention: ceph-disk fails even with "-v /run/udev:/run/udev:ro".
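The retry-on-missing-/dev-file idea described above can be sketched in shell. This is a hypothetical helper for illustration, not the actual ceph-disk patch (which is in Python); it simply polls for the device node that udev creates asynchronously after partprobe returns:

```shell
# wait_for_node: poll until a path exists, for up to $2 seconds (default 10).
# Useful because partprobe returns before udev has created /dev entries.
wait_for_node() {
  local path=$1 timeout=${2:-10} waited=0
  until [ -e "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $path" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Usage sketch: after partprobe and udevadm settle, wait for the data
# partition before handing it to anything that stat()s it:
#   wait_for_node /dev/vdb1 10 || exit 1
```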
John, I prepared a test build with the backport of PR 18162 at https://2.chacra.ceph.com/r/ceph/wip-bz1494543-jewel-kefu/d36009b00ddbaf2f34e106be7c0e824b327d124a/centos/7/flavors/default/. Could you give it a try?

Sorry, the URL of the repo should be https://3.chacra.ceph.com/r/ceph/wip-bz1494543-jewel-kefu/e0b02ba9bdccf04bef6dc44ccabe8cb5b77cf6ee/centos/7/flavors/default/

John, I built a new image with Kefu's fix here: https://hub.docker.com/r/ceph/daemon/builds/bafedtemdtvo6ofheper28a/

Use the following tag: tag-build-race-jewel-devel-wip-bz1494543-jewel-kefu-centos-7

New image building here, same tag: https://hub.docker.com/r/ceph/daemon/builds/b6h8yysq4mrjmmitdjybear/

Thanks Kefu! I conducted 5 tests in a row with a docker image [1] that contains the fix and didn't hit the issue. I am going to pursue getting this promoted into the puddle so OpenStack QE can try it too.

[1] https://hub.docker.com/r/ceph/daemon/builds/bvqtdrttvmbajeepyjbnvm3

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387
When using ceph-disk prepare as provided by ceph-osd-10.2.9-0.el7.x86_64 on /dev/vdb, which exists on my system, I get a stack trace [1] and see [Errno 2] No such file or directory: '/dev/vdb1'. Unfortunately it seems intermittent. For example, if I zap the device and re-run prepare, then I don't see this error, but it does recur and will fail a ceph-ansible deployment. I am using ceph-docker [2].

[1]
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 9, in <module>
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5095, in run
main(sys.argv[1:])
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5046, in main
args.func(args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1855, in main
Prepare.factory(args).prepare()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1844, in prepare
self.prepare_locked()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1875, in prepare_locked
self.data.prepare(self.journal)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2542, in prepare
self.prepare_device(*to_prepare_list)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2705, in prepare_device
self.set_data_partition()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2632, in set_data_partition
self.partition = self.create_data_partition()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2616, in create_data_partition
return device.get_partition(partition_number)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1622, in get_partition
path=self.path, dev=dev, args=self.args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 1685, in factory
(dev is not None and is_mpath(dev))):
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 535, in is_mpath
uuid = get_dm_uuid(dev)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 522, in get_dm_uuid
uuid_path = os.path.join(block_path(dev), 'dm', 'uuid')
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 516, in block_path
rdev = os.stat(path).st_rdev

[2] Workaround: zap and re-run:

docker run -ti --privileged=true -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdb docker.io/ceph/daemon:tag-build-master-jewel-centos-7 zap_device
docker start ceph-osd-prepare-overcloud-cephstorage-1-devdevvdb; docker logs -f ceph-osd-prepare-overcloud-cephstorage-1-devdevvdb

[root@overcloud-cephstorage-1 ~]# docker run -ti --entrypoint=bash docker.io/ceph/daemon:tag-build-master-jewel-centos-7
[root@e902236a00de /]#
[root@e902236a00de /]# rpm -qa | grep ceph
libcephfs1-10.2.9-0.el7.x86_64
python-cephfs-10.2.9-0.el7.x86_64
ceph-base-10.2.9-0.el7.x86_64
ceph-osd-10.2.9-0.el7.x86_64
ceph-radosgw-10.2.9-0.el7.x86_64
ceph-release-1-1.el7.noarch
ceph-common-10.2.9-0.el7.x86_64
ceph-selinux-10.2.9-0.el7.x86_64
ceph-mds-10.2.9-0.el7.x86_64
ceph-mon-10.2.9-0.el7.x86_64
[root@e902236a00de /]#