Bug 709318
Summary: | conflicting file hashes lead to data corruption | ||||||
---|---|---|---|---|---|---|---|
Product: | [Retired] Pulp | Reporter: | Daniel Mach <dmach> | ||||
Component: | z_other | Assignee: | Pradeep Kilambi <pkilambi> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Preethi Thomas <pthomas> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | urgent | ||||||
Version: | unspecified | CC: | dgao, dgregor | ||||
Target Milestone: | --- | Keywords: | Triaged | ||||
Target Release: | Sprint 24 | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2011-08-16 14:19:58 UTC | Type: | --- | ||||
Bug Blocks: | 563609, 647488 | ||||||
Attachments: |
|
I forgot one thing: if full hashes are used in paths, it will probably be an incompatible change. I'd use this opportunity to propose a new path schema.

This would be easily searchable:
<pulp_root>/files/<file_name[:3]>/<file_name>/<sha256>

and this is more like the current paths:
<pulp_root>/files/<sha256[:3]>/<sha256>/<file_name>

I prefer the first option because it is easier to search manually.

commit bfb294c0b679d4ec43d261dc2d77cbeeca8d23c1
Author: Pradeep Kilambi <pkilambi>
Date: Thu Jun 2 15:39:01 2011 -0400

$ sudo pulp-admin content upload --repoid=test test/file1/reproducer -v
* Starting Content Upload
* Performing Content Uploads to Pulp server
Successfully uploaded [reproducer] to server
* Performing Repo Associations
Content association Complete for Repo [test]:
Packages: None
Files: reproducer
* Content Upload complete.

[pkilambi@prad ~]$ sudo pulp-admin content upload --repoid=test test/file2/reproducer -v
* Starting Content Upload
* Performing Content Uploads to Pulp server
Successfully uploaded [reproducer] to server
* Performing Repo Associations
Content association Complete for Repo [test]:
Packages: None
Files: reproducer
* Content Upload complete.

$ ls -l /var/lib/pulp/files/rep/reproducer/963*
/var/lib/pulp/files/rep/reproducer/96364261a9e076d43a21d6ff17fa0694eaf66f0e6f577ef59847202a1708fb26:
total 4
-rw-r--r--. 1 apache apache 31 Jun 2 15:34 reproducer

/var/lib/pulp/files/rep/reproducer/963b29f07b9c24a234123676dfad905fd61d93e9c8bcca1002625966535ff96a:
total 4
-rw-r--r--. 1 apache apache 9 Jun 2 15:34 reproducer

Just a note: if you already have files pushed, you might have to delete and re-push them to get the new format. Otherwise, if you prefer, let me know and I can put together a migration script that updates the file system and db paths.
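The layout visible in the listing above can be sketched in a few lines of Python. This is only an illustration, not the code from the commit: `proposed_path` and `store` are hypothetical names, the payloads are made up, and the sketch assumes the full SHA-256 becomes a directory between `<file_name>` and the stored file, as the `ls` output suggests.

```python
import hashlib
import os
import tempfile


def proposed_path(root, name, data):
    """Hypothetical helper: build
    <root>/files/<file_name[:3]>/<file_name>/<sha256>/<file_name>."""
    digest = hashlib.sha256(data).hexdigest()
    return os.path.join(root, "files", name[:3], name, digest, name)


def store(root, name, data):
    """Hypothetical helper: write data under its full-hash directory."""
    path = proposed_path(root, name, data)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    if os.path.exists(path):
        # Same name and same full SHA-256: assume the content is already stored.
        return path
    with open(path, "wb") as handle:
        handle.write(data)
    return path


with tempfile.TemporaryDirectory() as root:
    # Two different payloads uploaded under the same file name.
    a = store(root, "reproducer", b"first contents\n")
    b = store(root, "reproducer", b"second contents\n")
    # One directory per distinct SHA-256 under files/rep/reproducer/.
    stored = sorted(os.listdir(os.path.join(root, "files", "rep", "reproducer")))
```

Because the full digest is part of the path, two uploads with the same name only share a path when their contents hash to the same SHA-256.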
build: 0.188

[root@pulp-qe ~]# pulp-admin content upload --repoid=bar file1/reproducer -v
* Starting Content Upload
* Performing Content Uploads to Pulp server
Successfully uploaded [reproducer] to server
* Performing Repo Associations
Content association Complete for Repo [bar]:
Packages: None
Files: reproducer
* Content Upload complete.

[root@pulp-qe ~]# pulp-admin content upload --repoid=bar file2/reproducer -v
* Starting Content Upload
* Performing Content Uploads to Pulp server
Successfully uploaded [reproducer] to server
* Performing Repo Associations
Content association Complete for Repo [bar]:
Packages: None
Files: reproducer
* Content Upload complete.

[root@pulp-qe ~]# ls -l /var/lib/pulp/repos/bar/
ABCD ABCE MANIFEST repodata/ reproducer

[root@pulp-qe ~]# ls -l /var/lib/pulp/files/rep/reproducer/963
96364261a9e076d43a21d6ff17fa0694eaf66f0e6f577ef59847202a1708fb26/
963b29f07b9c24a234123676dfad905fd61d93e9c8bcca1002625966535ff96a/

[root@pulp-qe ~]# ls -l /var/lib/pulp/files/rep/reproducer/963*
/var/lib/pulp/files/rep/reproducer/96364261a9e076d43a21d6ff17fa0694eaf66f0e6f577ef59847202a1708fb26:
total 4
-rw-r--r--. 1 apache apache 31 Jun 17 15:14 reproducer

/var/lib/pulp/files/rep/reproducer/963b29f07b9c24a234123676dfad905fd61d93e9c8bcca1002625966535ff96a:
total 4
-rw-r--r--. 1 apache apache 9 Jun 17 15:14 reproducer
[root@pulp-qe ~]#

Closing with Community Release 15 pulp-0.0.223-4.
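The manual check above compares directory names by eye. A rough way to automate the same verification is to recompute each stored file's SHA-256 and compare it with the name of its parent directory. This is a sketch only: `find_mismatches` is a hypothetical helper, not part of pulp-admin, and it assumes every file sits directly inside a directory named after its full digest.

```python
import hashlib
import os
import tempfile


def find_mismatches(files_root):
    """Return paths whose parent directory name is not the file's SHA-256."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(files_root):
        expected = os.path.basename(dirpath)
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                actual = hashlib.sha256(handle.read()).hexdigest()
            if actual != expected:
                mismatches.append(path)
    return mismatches


with tempfile.TemporaryDirectory() as root:
    good = b"intact contents\n"
    good_dir = os.path.join(root, hashlib.sha256(good).hexdigest())
    # Simulate an overwrite: the directory is named after the original
    # payload, but the file inside holds different bytes.
    bad_dir = os.path.join(root, hashlib.sha256(b"original\n").hexdigest())
    for directory, data in ((good_dir, good), (bad_dir, b"overwritten\n")):
        os.makedirs(directory)
        with open(os.path.join(directory, "reproducer"), "wb") as handle:
            handle.write(data)
    mismatches = find_mismatches(root)
    expected_bad = os.path.join(bad_dir, "reproducer")
```

On a real server the function would be pointed at a directory such as /var/lib/pulp/files/rep/reproducer; an empty result means every file still matches its digest.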
Created attachment 501974
files to reproduce the issue

Files are staged into the following directory structure:
<pulp_root>/files/<sha256[:3]>/<file_name>

Only the first 3 characters of the hash are used, which gives 16^3 (4096) possible combinations. That is fine for most files, but for common file names like README or LICENSE it is easy to get a conflict.

I was able to reproduce this issue:

$ pulp-admin content upload --repoid=<REPO> file1/reproducer
$ sha256sum /var/lib/pulp/files/963/reproducer
963b29f07b9c24a234123676dfad905fd61d93e9c8bcca1002625966535ff96a /var/lib/pulp/files/963/reproducer

$ pulp-admin content upload --repoid=<REPO> file2/reproducer
$ sha256sum /var/lib/pulp/files/963/reproducer
96364261a9e076d43a21d6ff17fa0694eaf66f0e6f577ef59847202a1708fb26 /var/lib/pulp/files/963/reproducer

The file is overwritten *WITHOUT ANY WARNING*. The database contains both files, but there is only one file on disk.

BTW, packages shouldn't conflict, because we usually release only a few copies of an RPM with the same name, signed with different keys. These files are identical except for several bytes and should have completely different hashes due to the nature of hashing.
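To make the 16^3 argument concrete, here is a small Python sketch of the old layout. It searches generated payloads for two that share a 3-character SHA-256 prefix and estimates how quickly same-named files are expected to collide. The helper names (`old_path`, `find_prefix_collision`, `collision_probability`) and the generated payloads are assumptions for illustration; they are not taken from Pulp or from the attached reproducer files.

```python
import hashlib
import itertools
import os


def old_path(root, name, data):
    """Layout from the report: <pulp_root>/files/<sha256[:3]>/<file_name>."""
    digest = hashlib.sha256(data).hexdigest()
    return os.path.join(root, "files", digest[:3], name)


def find_prefix_collision(prefix_len=3):
    """Try generated payloads until two share the same hex prefix.

    With 16**prefix_len buckets this must stop after at most
    16**prefix_len + 1 payloads (pigeonhole principle).
    """
    seen = {}
    for i in itertools.count():
        data = b"payload-%d" % i
        prefix = hashlib.sha256(data).hexdigest()[:prefix_len]
        if prefix in seen:
            return seen[prefix], data
        seen[prefix] = data


def collision_probability(n, buckets=16 ** 3):
    """Chance that n same-named files hit at least one shared bucket,
    assuming uniformly distributed hash prefixes (birthday problem)."""
    p_unique = 1.0
    for k in range(n):
        p_unique *= 1.0 - k / buckets
    return 1.0 - p_unique


first, second = find_prefix_collision()
# Different contents, same file name, same 3-character prefix: one path,
# so the second upload silently replaces the first.
same_path = old_path("/var/lib/pulp", "README", first) == old_path(
    "/var/lib/pulp", "README", second
)
```

Calling `collision_probability(n)` for a few values of `n` shows how the risk grows with the number of distinct files that share one name.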