Bug 74088
Summary: | DB_File broken in perl-DB_File-1.804-51 -- can't store key
---|---
Product: | [Retired] Red Hat Raw Hide
Component: | perl
Version: | 1.0
Hardware: | i386
OS: | Linux
Status: | CLOSED CURRENTRELEASE
Severity: | medium
Priority: | medium
Reporter: | Jonathan Kamens <jik>
Assignee: | Warren Togami <wtogami>
QA Contact: | David Lawrence <dkl>
Doc Type: | Bug Fix
Last Closed: | 2005-05-28 06:59:58 UTC
Description
Jonathan Kamens
2002-09-15 18:18:25 UTC
Created attachment 76212: test database file
Created attachment 76213: C test case
Created attachment 76214: Perl test case
looks like you're running into UTF8 issues. when I changed your key to be valid ascii (i.e., all bytes < 128), it seems to work. also, if I 'utf8::upgrade($key)' it works as well. this is almost certainly a bug in DB_File itself. it looks like the data in your db was possibly already utf8 encoded; where does it come from? a result of disk IO?

It's an IP address in binary (packed sockaddr_in) format. It is not UTF8 data, and neither Perl nor DB_File should be treating it like UTF8 data. There's nothing that says I can't use binary keys in hashes or databases. If it is a UTF8 problem, it's certainly a weird one, given that the key is added successfully by Perl if I dump and reload the database. I don't really care whether it's a UTF8 bug or some other kind of bug; all I know is that it is a bug, and it has the potential to affect anyone who uses DB_File with binary keys (and since we haven't even proven that it's dependent on the key, perhaps even non-binary keys!), and it thus seems like a rather significant bug.

perl 5.8.0 (and 5.6.1 to a lesser extent) has been fully tooled for utf8 inside and out. you may not intend the data to be utf8, but the internal data model perl uses IS utf8 in almost every case, more so than it has been in previous releases. can you provide a smaller test case that doesn't involve your already existing database? preferably a test case that creates the database itself. typically these kinds of utf8 issues show up when dealing with perl modules that have C bindings, especially the Digest:: modules, but possibly this case as well. if you can simplify your test case I will submit the issue upstream to the perl maintainers.

Perl utf8 support is only supposed to affect actual program source code, and this problem was occurring even in a dynamically generated variable; the only reason the key is a constant in the test case I submitted is to reduce the size of the test case. Furthermore, I don't have "use utf8" in my source code, which means Perl shouldn't be enabling its utf8 support, and the problem occurs even if I put "no utf8" in the source code explicitly. Even if the bug *is* in some way due to the new utf8 source code, that doesn't change the fact that it's a bug. No, I can't reduce the test case any further. The database having this problem has been built up over the course of many years; I have no idea which particular operations and in what order would cause the bug to manifest itself. That is why I included the database as part of the test case. The database attachment is only 137KB bzipped; I hardly think that's so large that it can't be passed upstream. This looks to me like it has the potential to be a rather serious bug. Given that, I don't understand why it seems like you're looking for a reason to ignore it rather than aggressively pursuing it, especially when I've given you a simple test case which reproduces it on demand.

A smaller test case that shows the bug more simply is always better than a larger one. A 12k entry database isn't as good a test case as a smaller, simpler, self-contained test case. If one isn't available, then we'll do what we can, but chances of getting a fix from upstream are much greater if the test case is simpler. I'm sorry that you feel this is some attempt to avoid the issue. utf8 issues do indeed come up more than just in the actual source of your program. an example is this:

    echo y | perl -e 'use Devel::Peek; my $x = "y\n"; Dump($x); $x = <>; Dump($x)'

Notice that they are the same string, but one has the UTF8 flag. This causes a number of differences in how perl treats the variable internally. Also notice:

    perl -le 'use Socket; print inet_aton("200.200.200.200")' | hexdump

When printed, the 4 bytes become 8 bytes. If you use binmode and a perlio-ism (perldoc PerlIO), it becomes correct:

    perl -le 'use Socket; binmode(STDOUT, ":raw"); print inet_aton("200.200.200.200")' | hexdump

utf8 is most definitely more than just an issue surrounding what encoding your .pl or .pm file uses. Also note that the above snippets behave very differently in perl 5.6.1. Getting these issues solved is very high priority. The solution comes quicker with simpler test cases. There are also issues with other modules, though, so any help you can provide will make the process much smoother.

Closing due to inactivity. Assuming this is fixed with modern perl.
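The UTF8-flag point raised in the thread can be illustrated without a database or any I/O layer. The following is a minimal sketch, not a reproduction of this bug: it only shows that the same four-character binary value can have two internal representations in perl, which is why code that reads a scalar's raw buffer from C may see 8 bytes where the script sees 4. The variable names are illustrative, and nothing here establishes how DB_File 1.804 actually read its keys.

```perl
use strict;
use warnings;
use bytes ();                 # only for bytes::length(); the pragma itself stays off
use Socket qw(inet_aton);

# A 4-byte binary value, like the address part of a packed sockaddr_in.
my $key = inet_aton("200.200.200.200");    # "\xC8\xC8\xC8\xC8"

# Same four characters, but stored internally as UTF-8 (flag on).
my $upgraded = $key;
utf8::upgrade($upgraded);

# Convert the internal storage back to one byte per character.
my $restored = $upgraded;
utf8::downgrade($restored) or die "value contains characters above 0xFF";

for my $pair ([original => $key], [upgraded => $upgraded], [restored => $restored]) {
    my ($name, $value) = @$pair;
    printf "%-9s utf8 flag=%d  chars=%d  internal bytes=%d\n",
        $name, (utf8::is_utf8($value) ? 1 : 0),
        length($value), bytes::length($value);
}

# Perl-level comparison looks at characters, so all three compare equal.
print "equal at Perl level: ",
    (($key eq $upgraded && $key eq $restored) ? "yes" : "no"), "\n";
```

The upgraded copy reports 4 characters but 8 internal bytes, because each 0xC8 byte takes two bytes in UTF-8. A script that passes binary strings to a C-backed module could call utf8::downgrade on them first as a defensive measure; whether that would have avoided the failure reported here is not established by this thread.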