Bug 2280606

Summary: nova-manage db purge fails on large datasets
Product: Red Hat OpenStack
Reporter: Alex Stupnikov <astupnik>
Component: openstack-nova
Assignee: melanie witt <mwitt>
Status: CLOSED MIGRATED
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: medium
Docs Contact:
Priority: medium
Version: 17.1 (Wallaby)
CC: alifshit, dasmith, eglynn, jhakimra, kchamart, mwitt, sbauza, sgordon, vromanso
Target Milestone: async
Keywords: Patch, Triaged
Target Release: 17.1
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2025-01-14 20:41:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Alex Stupnikov 2024-05-15 11:42:37 UTC
Description of problem:
One of our customers hit upstream bug https://bugs.launchpad.net/nova/+bug/1983188 in their RHOSP 16.2 deployment when running the following command against a nova DB with 18 GB of records in the shadow_instances table:

# podman exec -it nova_api_cron bash -c "nova-manage db purge --verbose --all --all-cells"                                                                         
An error has occurred:                                                                                                                                                                            
Traceback (most recent call last):                                                                                                                                                                
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 761, in _commit_impl                                                                                                  
    self.engine.dialect.do_commit(self.connection)                                                                                                                                                
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/dialects/mysql/base.py", line 2215, in do_commit                                                                                            
    dbapi_connection.commit()                                                                                                                                                                     
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 422, in commit                                                                                                             
    self._read_ok_packet()                                                                                                                                                                        
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 396, in _read_ok_packet                                                                                                    
    pkt = self._read_packet()                                                                                                                                                                     
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 676, in _read_packet                                                                                                       
    packet.raise_for_error()                                                                                                                                                                      
  File "/usr/lib/python3.6/site-packages/pymysql/protocol.py", line 223, in raise_for_error                                                                                                       
    err.raise_mysql_exception(self._data)                                                                                                                                                         
  File "/usr/lib/python3.6/site-packages/pymysql/err.py", line 107, in raise_mysql_exception                                                                                                      
    raise errorclass(errno, errval)                                                                                                                                                               
pymysql.err.OperationalError: (1180, 'Got error 90 "Message too long" during COMMIT')                                                                                                             
                                                                                                                                                                                                  
The above exception was the direct cause of the following exception:                                                                                                                              
                                                                                                                                                                                                  
Traceback (most recent call last):                                                                                                                                                                
  File "/usr/lib/python3.6/site-packages/nova/cmd/manage.py", line 3314, in main                                                                                                                  
    ret = fn(*fn_args, **fn_kwargs)                                                                                                                                                               
  File "/usr/lib/python3.6/site-packages/nova/cmd/manage.py", line 747, in purge                                                                                                                  
    status_fn=status)                                                                                                                                                                             
  File "/usr/lib/python3.6/site-packages/nova/db/sqlalchemy/api.py", line 5817, in purge_shadow_tables                                                                                            
    deleted = conn.execute(delete)                                                                                                                                                                
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 988, in execute                                                                                                       
    return meth(self, multiparams, params)                                                                                                                                                        
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection                                                                                       
    return connection._execute_clauseelement(self, multiparams, params)                                                                                                                           
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement                                                                                       
    distilled_params,                                                                                                                                                                             
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1272, in _execute_context                                                                                             
    self._root._commit_impl(autocommit=True)                                                                                                                                                      
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 763, in _commit_impl                                                                                                  
    self._handle_dbapi_exception(e, None, None, None, None)                                                                                                                                       
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1464, in _handle_dbapi_exception                                                                                      
    util.raise_from_cause(newraise, exc_info)                                                                                                                                                     
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause                                                                                              
    reraise(type(exception), exception, tb=exc_tb, cause=cause)                                                                                                                                   
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 128, in reraise                                                                                                       
    raise value.with_traceback(tb)                                                                                                                                                                
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 761, in _commit_impl                                                                                                  
    self.engine.dialect.do_commit(self.connection)                                                                                                                                                
  File "/usr/lib64/python3.6/site-packages/sqlalchemy/dialects/mysql/base.py", line 2215, in do_commit                                                                                            
    dbapi_connection.commit()                                                                                                                                                                     
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 422, in commit                                                                                                             
    self._read_ok_packet()                                                                                                                                                                        
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 396, in _read_ok_packet                                                                                                    
    pkt = self._read_packet()                                                                                                                                                                     
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 676, in _read_packet                                                                                                       
    packet.raise_for_error()                                                                                                                                                                      
  File "/usr/lib/python3.6/site-packages/pymysql/protocol.py", line 223, in raise_for_error                                                                                                       
    err.raise_mysql_exception(self._data)                                                                                                                                                         
  File "/usr/lib/python3.6/site-packages/pymysql/err.py", line 107, in raise_mysql_exception                                                                                                      
    raise errorclass(errno, errval)                                                                                                                                                               
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1180, 'Got error 90 "Message too long" during COMMIT')                                                                           
(Background on this error at: http://sqlalche.me/e/e3q8)


I understand that this will not be fixed in RHOSP 16.2, but I kindly ask engineering to consider fixing it in RHOSP 17.1.

Version-Release number of selected component (if applicable):
RHOSP 16.2 +

How reproducible: see description

Actual results: nova-manage fails to purge archived entries when there are too many of them

Expected results: nova-manage successfully purges archived entries regardless of how many there are
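
For illustration only, a purge that does not depend on table size could delete rows in bounded batches and commit after each batch, so that no single COMMIT covers the whole table. The sketch below uses Python's built-in sqlite3 module and a made-up table named shadow_demo. The function name purge_in_batches and the batch size are hypothetical, and this is not nova's code nor the upstream fix.

```python
import sqlite3


def purge_in_batches(conn, table, batch_size=1000):
    """Delete every row from `table`, committing after each batch.

    `table` is interpolated into the SQL text, so it must be a trusted
    identifier (this is a demo, not production code).
    """
    total = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN "
            f"(SELECT rowid FROM {table} LIMIT ?)",
            (batch_size,),
        )
        # One commit per batch keeps each transaction small.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Demo on an in-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shadow_demo (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO shadow_demo (payload) VALUES (?)", [("x",)] * 2500)
conn.commit()

deleted = purge_in_batches(conn, "shadow_demo", batch_size=1000)
remaining = conn.execute("SELECT COUNT(*) FROM shadow_demo").fetchone()[0]
print(deleted, remaining)
```

With 2500 rows and a batch size of 1000, the loop commits three non-empty batches plus one empty batch that ends the loop.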

Comment 1 melanie witt 2024-05-15 16:52:46 UTC
I had a look in the code and can confirm this bug.

Depending on the environment, the customer may be able to work around the problem by passing --before to 'nova-manage db purge' to manually limit the number of rows the command attempts to delete in one transaction, then repeating the command with successively later --before values until all rows are deleted or until the number of rows in the shadow tables has decreased enough for --all to be used again.
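
As a rough sketch of that workaround, the following Python (3.8+) helper prints one 'nova-manage db purge --before <date>' command per cutoff date, moving forward in fixed steps. The helper name purge_commands, the 30-day step, the example dates, and the ISO date format are assumptions for illustration. The right step depends on how many rows fall into each window, and the commands would still need to be run inside the appropriate container as in the description.

```python
import shlex
from datetime import date, timedelta


def purge_commands(oldest, newest, step_days=30):
    """Yield one purge argv per cutoff date, from oldest to newest.

    Each command only has to delete rows archived before its cutoff
    that the previous commands did not already remove.
    """
    cutoff = oldest + timedelta(days=step_days)
    while True:
        cutoff = min(cutoff, newest)
        yield [
            "nova-manage", "db", "purge", "--verbose", "--all-cells",
            "--before", cutoff.isoformat(),
        ]
        if cutoff >= newest:
            break
        cutoff += timedelta(days=step_days)


if __name__ == "__main__":
    # Hypothetical date range; adjust to the age of the archived rows.
    for argv in purge_commands(date(2023, 1, 1), date(2024, 5, 15)):
        print(shlex.join(argv))
```

If a given step still fails with the same COMMIT error, a smaller step_days for that window would be the obvious thing to try.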

It may be worth noting that the related 'nova-manage db archive_deleted_rows' command was enhanced to better handle very large numbers of records in 16.2.6 [1] and 17.1.1 [2], so it might be possible to get around the problem by purging records indirectly with 'nova-manage db archive_deleted_rows --purge'.

We will aim to get 'nova-manage db purge' fixed for 17.1 provided we have a z4 release.


[1] https://bugzilla.redhat.com/show_bug.cgi?id=2170683
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2170686

Comment 3 Artom Lifshitz 2024-09-03 14:41:28 UTC
Moving to async since we're out of runway for 17.1.4 non-blockers, and the impact does not seem too high given the --before workaround.

Comment 4 Artom Lifshitz 2025-01-14 20:42:50 UTC
We'll continue the work to get this into 18 as tracked by https://issues.redhat.com/browse/OSPRH-13035, but for 17.1 this no longer fits into the inclusion criteria: https://groups.google.com/a/redhat.com/g/openstack-program/c/sn4Pso92fuM