RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1862929 - Scanner is slower with JDK 1.8.0_252 and above
Summary: Scanner is slower with JDK 1.8.0_252 and above
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: java-1.8.0-openjdk
Version: 7.4
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Martin Balao
QA Contact: OpenJDK QA
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-08-03 08:50 UTC by Simeon Andreev
Modified: 2024-12-20 19:11 UTC (History)
11 users (show)

Fixed In Version: java-1.8.0-openjdk-1.8.0.312.b05-0.3.ea.el7_9
Doc Type: Bug Fix
Doc Text:
A defensive security change in an earlier OpenJDK update led to a performance degradation when using the Scanner class. This was due to the change being applied to many common cases that did not need this protection. With this update, we provide the original behaviour for these cases.
Clone Of:
Environment:
Last Closed: 2021-10-20 14:38:48 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Snippet to reproduce the problem with. (5.87 KB, text/plain)
2020-08-03 08:50 UTC, Simeon Andreev
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:3889 0 None None None 2021-10-20 14:39:12 UTC
openjdk bug system JDK-8253451 0 None None None 2020-09-21 19:59:28 UTC

Description Simeon Andreev 2020-08-03 08:50:12 UTC
Created attachment 1703250 [details]
Snippet to reproduce the problem with.

Description of problem:

We observe that reading streams with java.util.Scanner is slower with JDK 1.8.0_262, JDK 11.0.7 and JDK 11.0.8 (likely other version as well), when compared to JDK 1.8.0_181.

Version-Release number of selected component (if applicable):

OpenJDK 1.8.0_262, OpenJDK 11.0.7, OpenJDK 11.0.8 (tested only on these, likely the performance degradation is on other versions as well).


How reproducible:

Always.

Steps to Reproduce:
1. Execute the attached snippet "TestScannerNoIOManyStreams.java" on different Java versions.
2. Observe a performance degradation with OpenJDK 1.8.0_262 and above.

Actual results:

When running the snippet with different OpenJDK versions we observe the following output.

java.version=1.8.0_181
reading with Scanner took: 24204ms
useless line to ensure everything is read 2.0714783E7

java.version=1.8.0_262
reading with Scanner took: 29761ms
useless line to ensure everything is read 2.0714783E7

java.version=11.0.8
reading with Scanner took: 31346ms
useless line to ensure everything is read 2.0714783E7

java.version=15-ea
reading with Scanner took: 30845ms
useless line to ensure everything is read 2.0714783E7

Expected results:

No performance degradation should be observed.

Additional info:

We run with RHEL 7.4, here are CPU details (HPZ640 workstation):

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    8                                                                                                                                                                                                                                                       
Socket(s):             1                                                                                                                                                                                                                                                       
NUMA node(s):          1                                                                                                                                                                                                                                                       
Vendor ID:             GenuineIntel                                                                                                                                                                                                                                            
CPU family:            6                                                                                                                                                                                                                                                       
Model:                 85                                                                                                                                                                                                                                                      
Model name:            Intel(R) Xeon(R) W-2145 CPU @ 3.70GHz                                                                                                                                                                                                                   
Stepping:              4                                                                                                                                                                                                                                                       
CPU MHz:               3699.774                                                                                                                                                                                                                                                
CPU max MHz:           3700.0000                                                                                                                                                                                                                                               
CPU min MHz:           1200.0000                                                                                                                                                                                                                                               
BogoMIPS:              7400.00                                                                                                                                                                                                                                                 
Virtualization:        VT-x                                                                                                                                                                                                                                                    
L1d cache:             32K                                                                                                                                                                                                                                                     
L1i cache:             32K                                                                                                                                                                                                                                                     
L2 cache:              1024K                                                                                                                                                                                                                                                   
L3 cache:              11264K                                                                                                                                                                                                                                                  
NUMA node0 CPU(s):     0-15                                                                                                                                                                                                                                                    
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt mba tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local ibpb ibrs stibp dtherm arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req spec_ctrl intel_stibp ssbd

Comment 2 Simeon Andreev 2020-08-03 08:52:29 UTC
Note that the attached snippet "TestScannerNoIOManyStreams.java" will use multiple byte streams with different sizes. This is not necessary to reproduce the problem, one big stream instead of many smaller ones also reproduces the problem. The attached snippet is however closer to the problem case observed in our product.

Comment 3 jiri vanek 2020-08-03 09:12:12 UTC
just fyi, u265 is not fixing it:

 jvanek  jvanek  10:58:12    ~  Desktop  $  java TestScannerNoIOManyStreams
java.version=1.8.0_242
reading with Scanner took: 25660ms
useless line to ensure everything is read 2.0714783E7
 jvanek  jvanek  10:58:43    ~  Desktop  $  java TestScannerNoIOManyStreams
java.version=1.8.0_265
reading with Scanner took: 30951ms
useless line to ensure everything is read 2.0714783E7
 jvanek  jvanek  11:04:12    ~  Desktop  $  java TestScannerNoIOManyStreams
java.version=1.8.0_265
reading with Scanner took: 31374ms
useless line to ensure everything is read 2.0714783E7
 jvanek  jvanek  11:07:28    ~  Desktop  $  java TestScannerNoIOManyStreams
java.version=1.8.0_262
reading with Scanner took: 31251ms
useless line to ensure everything is read 2.0714783E7
 jvanek  jvanek  11:09:29    ~  Desktop  $  java TestScannerNoIOManyStreams
java.version=1.8.0_262
reading with Scanner took: 33305ms
useless line to ensure everything is read 2.0714783E7
 jvanek  jvanek  11:10:08    ~  Desktop  $  java TestScannerNoIOManyStreams
java.version=1.8.0_222
reading with Scanner took: 26183ms
useless line to ensure everything is read 2.0714783E7

Comment 4 Severin Gehwolf 2020-08-03 10:52:54 UTC
Are you sure this is 8u262 and above? Looks like its observable with 8u252[1] as well:

$ openjdk-8u262-b10/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_262
reading with Scanner took: 29344ms
useless line to ensure everything is read 2.0714783E7
$ openjdk-8u262-b10/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_262
reading with Scanner took: 31366ms
useless line to ensure everything is read 2.0714783E7


$ openjdk-8u252-b09/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_252
reading with Scanner took: 29647ms
useless line to ensure everything is read 2.0714783E7
$ openjdk-8u252-b09/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_252
reading with Scanner took: 31243ms
useless line to ensure everything is read 2.0714783E7
$ openjdk-8u252-b09/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_252
reading with Scanner took: 30117ms
useless line to ensure everything is read 2.0714783E7


$ openjdk-8u242-b08/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_242
reading with Scanner took: 26024ms
useless line to ensure everything is read 2.0714783E7
$ openjdk-8u242-b08/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_242
reading with Scanner took: 26095ms
useless line to ensure everything is read 2.0714783E7
$ openjdk-8u242-b08/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_242
reading with Scanner took: 26248ms
useless line to ensure everything is read 2.0714783E7


$ openjdk-8u252-b09/bin/java TestScannerNoIOManyStreams
java.version=1.8.0_252
reading with Scanner took: 32680ms
useless line to ensure everything is read 2.0714783E7


[1] https://github.com/AdoptOpenJDK/openjdk8-upstream-binaries/releases/tag/jdk8u252-b09

Comment 5 Andrey Loskutov 2020-08-03 11:05:14 UTC
(In reply to Severin Gehwolf from comment #4)
> Are you sure this is 8u262 and above? Looks like its observable with
> 8u252[1] as well:

We only tested with 8u262 because we needed the patch from that build. We haven't bisected where after 8u181 it was introduced.

Comment 6 Simeon Andreev 2020-08-03 11:07:04 UTC
From the description:

> Version-Release number of selected component (if applicable):
>
> OpenJDK 1.8.0_262, OpenJDK 11.0.7, OpenJDK 11.0.8 (tested only on these, likely the performance degradation is on other versions as well).

Comment 7 Severin Gehwolf 2020-08-03 12:08:23 UTC
(In reply to Andrey Loskutov from comment #5)
> (In reply to Severin Gehwolf from comment #4)
> > Are you sure this is 8u262 and above? Looks like its observable with
> > 8u252[1] as well:
> 
> We only tested with 8u262 because we needed the patch from that build. We
> haven't bisected where after 8u181 it was introduced.

OK, thanks. I've updated the bug summary.

Comment 8 Martin Balao 2020-08-03 16:21:36 UTC
This may be related to http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/a8f0a9ef1797

I'll investigate.

Comment 10 Martin Balao 2020-08-03 21:28:20 UTC
I can confirm my guess in comment:8. In particular, the lines that are causing the extra delay you noticed are 1.34 [1] and 1.35 [2].

Compiling a "\x{...}" (i.e.: \x{2c}) string pattern in the regular expression takes more time than a plain "\..." pattern (i.e.: \,). This is the time it takes to call Pattern::x [3]. I've done a quick test [4]. The automata node built is the same (Pattern::Single [5]), so there shouldn't be any difference when running the Matcher.

I'll look into a possible fix.

With that said, initializing the Scanner instance in each iteration and compiling the regular expression over and over looks inefficient to me (even if we use the previous string pattern). For high performance I'd try something different, that reuses the compiled regular expression.

--
[1] - http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/a8f0a9ef1797#l1.34
[2] - http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/a8f0a9ef1797#l1.35
[3] - http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/84c5676f140b/src/share/classes/java/util/regex/Pattern.java#l3209
[4] -

public class Main {
    private static volatile Pattern p = null;
    private static long totalElapsed = 0L;
    public static void main(String[] args) throws Throwable {
        for (int i = 0; i < 200_000_000; i++) {
            long start = System.currentTimeMillis();
            p = Pattern.compile("\\x{2c}\\x{2e}");
            //p = Pattern.compile("\\,\\.");
            long end = System.currentTimeMillis();
            long elapsed = (end - start);
            totalElapsed += elapsed;
        }
        System.out.println("Total elapsed: " + totalElapsed + "ms");
    }
}

[5] - http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/84c5676f140b/src/share/classes/java/util/regex/Pattern.java#l3831

Comment 11 Martin Balao 2020-08-04 06:00:30 UTC
(In reply to Martin Balao from comment #10)
> With that said, initializing the Scanner instance in each iteration and
> compiling the regular expression over and over looks inefficient to me (even
> if we use the previous string pattern). For high performance I'd try
> something different, that reuses the compiled regular expression.

I've just noticed that there are other uses of the modified regexp strings while parsing the input (instead of while initializing the Scanner instance or setting a new locale only) [1]. That should explain why you also noticed the performance degradation even with a single Scanner and a large input.

Anyways, I'm working on an optimization/fix. Keep you posted.

--
[1] - http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/84c5676f140b/src/share/classes/java/util/Scanner.java#l2042

Comment 13 Simeon Andreev 2020-08-04 06:50:06 UTC
For better performance we will (try to) recommend using Files.readAllLines(Path) and Double.valueOf(String). That seems to have a lot better performance (and is applicable) in the particular case where we encountered the performance degradation. Though the problem is in code written by our customers, so not in our control (another reason why we are reporting this issue).

Comment 14 Simeon Andreev 2020-08-04 07:07:41 UTC
> Hi Andrey and Simeon,
>
> I'm looking into "Bug 1862929 - Scanner is slower with JDK 1.8.0_252 and above" [1]. As commented here [2] [3], the issue seems to be clear and I'm working on a fix. To make sure that your use-case is covered by the optimization, would you run the statements below and give me the result back?
>
> *
> Integer.toHexString((char)(DecimalFormatSymbols.getInstance(Locale.getDefault(Locale.Category.FORMAT))).getGroupingSeparator());
>
> *
> Integer.toHexString((char)(DecimalFormatSymbols.getInstance(Locale.getDefault(Locale.Category.FORMAT))).getDecimalSeparator());
>
> Thanks,
> Martin.-
>
> --
> [1] - https://bugzilla.redhat.com/show_bug.cgi?id=1862929
> [2] - https://bugzilla.redhat.com/show_bug.cgi?id=1862929#c10
> [3] - https://bugzilla.redhat.com/show_bug.cgi?id=1862929#c11

Output with JDK 1.8:

2c
2e

Output with JDK 11 (I assume there would be no differences, but listing the results anyway):

2c
2e

> And also, if you are setting a non-default locale please let me know. I want to know which are the grouping and decimal separators you are seeing.

Here are the outputs of the "locale" command in the environment in which our product runs:

LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=

"Locale.getDefault()" prints "en_US".

Let me know if you need to know some other sort of configuration, I'm not particularly knowledgeable of locales.

Comment 15 Martin Balao 2020-08-06 22:33:00 UTC
Simeon: thanks for providing that information.

Seems like the candidate fix is working:

 * 11u
  * Without fix: reading with Scanner took: 164855ms
  * With fix: reading with Scanner took: 123054ms

 * 8u
  * Without fix: reading with Scanner took: 137425ms
  * With fix: reading with Scanner took: 111691ms

Note: the absolute numbers are not relevant but the difference, as I'm running a slow-debug build inside a VM.

Please also note that this candidate fix is still under discussion, we haven't yet made any decision. I'll share any updates here.

Comment 19 Andrey Loskutov 2020-08-27 07:12:57 UTC
@Martin: any updates on this ticket? Could you share the patch proposal (we could try it out with our custom JDK build)?

Comment 20 Martin Balao 2020-08-27 15:51:20 UTC
@Andrey: we are planning to fix this bug for our next 8 and 11 releases. Feedback from your side would be appreciated. I'll reach you out directly.

Comment 22 Simeon Andreev 2020-08-28 06:15:57 UTC
I'll give the patch a try, probably early next week.

Comment 23 Simeon Andreev 2020-08-28 06:50:29 UTC
Can you also send the JDK 11 patch? I can't build JDK 1.8 without further RPM installs; it would be simpler to build JDK 11.

Comment 26 Andrew John Hughes 2020-08-31 23:28:54 UTC
(In reply to Simeon Andreev from comment #25)
> OK, I checked with JDK 11.0.8 plus the patch. I no longer see the
> performance degradation we observed previously (and reported here). Thanks!
> 
> I also have a question about the patch itself and how "risky" it is. We have
> built 11.0.8 with 1 patch on top for bug 1751985 (as that fix is not yet
> downported to JDK 11)

That fix has been backported to both 11u and 8u. It was in the July release of
8u, 8u262, and will be in 11.0.9 in October (just missed the cut-off for July)

Both that bug and this are filed against java-1.8.0-openjdk. If an earlier fix
is desired, we need a bug against java-11-openjdk to deliver the fix there.

However, I'm not sure you're even using the RHEL packages, from our previous
discussion.

, we plan to update to that JDK (up from 11.0.7) in
> about 1 month; we have already verified the impact of the update. How high
> do you judge the risk of including the proposed patch to that build?
> 

snip..

> Unfortunately I don't know the code so I can't judge this myself (the change
> does seem minimal). For the patch in bug 1751985, we waited until the patch
> was merged in JDK 15 before using it. In the case here however, Andrey would
> like to include it sooner (so that customers mostly don't notice the
> performance degradation, unless they moved to JDK 11 very early). Its
> unclear when we are shipping the next JDK 11 update, so this is not
> unimportant to us.
> 
> We can re-do all validation with the patch, if its certain that the patch is
> going in the JDK main code stream, but the patch is still deemed somewhat
> risky by you. If its not clear whether the patch will be merged in the JDK
> code in its current form, we'll likely wait.

I don't personally deem it risky; it's simply an optimisation for two common
cases. However, the original fix was part of a security update and so changes
to it need to be handled with caution and with co-ordination with Oracle.

Martin would prefer that the patch is not shared publicly until we get feedback
from them. I'm not sure if you can see the private comments on this bug, which
is why I've elided the patch in this response. Please let us know if you can.

Comment 27 Simeon Andreev 2020-09-04 06:27:19 UTC
Yes, I can see private comments. Sorry for commenting with the patch, I did that out of habit.

I.e. you don't see a reason for us not to include the patch to our next 11.0.8 build? We will validate as much as we can with the patch, of course.

Comment 28 Martin Balao 2020-09-04 13:01:30 UTC
I don't see a reason not to ship it. As previously said, the risk should be low. The patch is about optimizing the most common paths, in a fast-path / slow-path scheme. Testing is very welcomed. Please let us know if you find anything.

We've coordinated with Oracle through the Vulnerability Group and are cleared to propose our patch upstream -which we will do as soon as possible-.

Comment 30 Martin Balao 2020-09-22 04:52:14 UTC
Status update

 * Opened bug upstream
  * https://bugs.openjdk.java.net/browse/JDK-8253451

 * Proposed fix upstream
  * https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/069013.html

Comment 33 Chris Williams 2020-11-11 21:42:00 UTC
Red Hat Enterprise Linux 7 shipped it's final minor release on September 29th, 2020. 7.9 was the last minor releases scheduled for RHEL 7.
From intial triage it does not appear the remaining Bugzillas meet the inclusion criteria for Maintenance Phase 2 and will now be closed. 

From the RHEL life cycle page:
https://access.redhat.com/support/policy/updates/errata#Maintenance_Support_2_Phase
"During Maintenance Support 2 Phase for Red Hat Enterprise Linux version 7,Red Hat defined Critical and Important impact Security Advisories (RHSAs) and selected (at Red Hat discretion) Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available."

If this BZ was closed in error and meets the above criteria please re-open it flag for 7.9.z, provide suitable business and technical justifications, and follow the process for Accelerated Fixes:
https://source.redhat.com/groups/public/pnt-cxno/pnt_customer_experience_and_operations_wiki/support_delivery_accelerated_fix_release_handbook  

Feature Requests can re-opened and moved to RHEL 8 if the desired functionality is not already present in the product. 

Please reach out to the applicable Product Experience Engineer[0] if you have any questions or concerns.  

[0] https://bugzilla.redhat.com/page.cgi?id=agile_component_mapping.html&product=Red+Hat+Enterprise+Linux+7

Comment 34 Chris Williams 2020-11-11 23:18:39 UTC
Apologies for the inadvertent closure.

Comment 47 errata-xmlrpc 2021-10-20 14:38:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: java-1.8.0-openjdk security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3889

Comment 48 Red Hat Bugzilla 2023-09-15 00:35:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days


Note You need to log in before you can comment on or make changes to this bug.