Bug 1576817
Summary: | RFE - couple of minutes after the transfer(upload disk) was paused by the system due to inactivity upload did not stop but 2 minutes afterwards | |
---|---|---|---
Product: | [oVirt] ovirt-engine | Reporter: | Avihai <aefrat>
Component: | RFEs | Assignee: | Rob Young <royoung>
Status: | CLOSED DEFERRED | QA Contact: | Avihai <aefrat>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | 4.2.3.5 | CC: | bugs, derez, nsoffer
Target Milestone: | --- | Keywords: | FutureFeature
Target Release: | --- | Flags: | rule-engine: planning_ack? rule-engine: devel_ack? rule-engine: testing_ack?
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: |
|
Engine shows the transfer paused by the system:

2018-05-10 12:19:05 - EVENT_ID: UPLOAD_IMAGE_PAUSED_BY_SYSTEM_TIMEOUT

The daemon log shows another write (upload) starting 2 minutes later, at 2018-05-10 12:21:14:

2018-05-10 12:21:14,835 INFO (Thread-5) [web] START [10.35.162.7] PUT /images/c3db24af-70b2-45e4-9d3d-fca7872cb184
2018-05-10 12:21:14,836 INFO (Thread-5) [images] Writing 16777216 bytes at offset 33554432 flush True to

Created attachment 1434350 [details]
engine, vdsm, proxy, daemon logs

If the transfer was paused because of inactivity, it means inactivity_timeout seconds passed since the last client request finished.

If the client is using code like:

    while not done:
        send PUT request
        write data
        sleep inactivity_timeout + 10

then I think we can expect that the ticket will be paused, and the next PUT request will fail with an error like "403 Forbidden", with a JSON error about "Transfer was paused".

We don't want to remove the ticket since we would lose the ticket state, but we can add a "paused" attribute.

To pause a transfer, engine will ask vdsm to modify the ticket:

    Host.update_image_ticket(ticket_uuid, {"paused": true})

Vdsm will send this request to the daemon:

    PUT /tickets/ticket-uuid
    ...
    {
        "paused": true
    }

To resume a ticket, engine will send this to vdsm:

    Host.update_image_ticket(ticket_uuid, {"paused": false, "timeout": 300})

Vdsm will send this request to the daemon:

    PUT /tickets/ticket-uuid
    ...
    {
        "paused": false,
        "timeout": 300
    }

Host.update_image_ticket() will replace Host.extend_image_ticket(), which will be deprecated and removed in a future version.

Daniel, what do you think?

(In reply to Nir Soffer from comment #3)
> If the transfer was paused because of inactivity, it means inactivity_timeout
> seconds passed since the last client request finished.
>
> If the client is using code like:
>
>     while not done:
>         send PUT request
>         write data
>         sleep inactivity_timeout + 10
>
> then I think we can expect that the ticket will be paused, and the next PUT
> request will fail with an error like "403 Forbidden", with a JSON error about
> "Transfer was paused".
>
> We don't want to remove the ticket since we would lose the ticket state, but
> we can add a "paused" attribute.
>
> To pause a transfer, engine will ask vdsm to modify the ticket:
>
>     Host.update_image_ticket(ticket_uuid, {"paused": true})
>
> Vdsm will send this request to the daemon:
>
>     PUT /tickets/ticket-uuid
>     ...
>     {
>         "paused": true
>     }
>
> To resume a ticket, engine will send this to vdsm:
>
>     Host.update_image_ticket(ticket_uuid, {"paused": false, "timeout": 300})
>
> Vdsm will send this request to the daemon:
>
>     PUT /tickets/ticket-uuid
>     ...
>     {
>         "paused": false,
>         "timeout": 300
>     }
>
> Host.update_image_ticket() will replace Host.extend_image_ticket(),
> which will be deprecated and removed in a future version.
>
> Daniel, what do you think?

We can just add a new API (e.g. pause_image_ticket), which could be simpler. Or do we expect any more values we might want to update in the ticket later on?

(In reply to Daniel Erez from comment #4)
> We can just add a new API (e.g. pause_image_ticket), which could be simpler.
> Or do we expect any more values we might want to update in the ticket later
> on?

I don't know if we will need new ways to update the ticket, but having a generic API means we will have to change only engine next time, and it means we need a single update() API instead of pause() and resume().

This bug didn't get any attention for a while; we didn't have the capacity to make any progress. If you deeply care about it or want to work on it, please assign/target accordingly.

ok, closing. Please reopen if still relevant/you want to work on it. |
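
The proposal in the comments (a "paused" attribute on the ticket, set through one generic update call, with writes rejected while paused) could look roughly like the sketch below. This is not daemon code: the `Ticket` class, `TransferPaused`, and `handle_put_image()` are hypothetical names made up for illustration, and the status/body pair only mimics the "403 Forbidden" plus JSON error that comment #3 describes.

```python
import time


class TransferPaused(Exception):
    """Hypothetical error for a request that hits a paused ticket."""


class Ticket:
    """Hypothetical in-memory ticket; not the real daemon data structure."""

    def __init__(self, uuid, timeout):
        self.uuid = uuid
        self.paused = False
        self.expires = time.monotonic() + timeout

    def update(self, values):
        # Generic update: apply only the keys present in the request,
        # so pause and resume share one entry point.
        if "paused" in values:
            self.paused = bool(values["paused"])
        if "timeout" in values:
            self.expires = time.monotonic() + values["timeout"]

    def authorize_write(self):
        # The ticket is kept (state is not lost), but writes are refused.
        if self.paused:
            raise TransferPaused("Transfer was paused")


def handle_put_image(ticket, offset, data):
    """Return (status, body) roughly the way a PUT /images handler might."""
    try:
        ticket.authorize_write()
    except TransferPaused as e:
        return 403, {"error": str(e)}
    # A real daemon would write `data` at `offset` here.
    return 200, {"written": len(data)}
```

With this shape, pausing is `ticket.update({"paused": True})` and resuming is `ticket.update({"paused": False, "timeout": 300})`, which mirrors the two request bodies shown in comment #3. Adding another updatable field later would only touch `update()`, which is the argument made for the generic API in comment #5.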
Created attachment 1434349 [details]
upload_with_inactivity_script

Description of problem:
After the transfer (disk upload) was paused by the system due to inactivity, the upload itself did not stop right away; it stopped only about 2 minutes later.

Daniel E. stated:
"Currently pause doesn't remove the ticket, as we need to allow resume using the existing ticket. We could perhaps add some 'paused' flag to the ticket and handle it in the daemon accordingly. You can open an RFE about that and we'll decide if it's interesting enough."

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Use the attached script to simulate an upload with an inactivity value of 10 sec and a sleep of 70 sec.
2.
3.

Actual results:

Expected results:

Additional info:
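
The attached script is not reproduced in this report. As a rough idea of what such a reproducer might do, here is a minimal sketch under stated assumptions: `send_chunk` is a hypothetical callable that performs one PUT and returns the HTTP status, and only the 10 sec inactivity value and 70 sec sleep come from the report.

```python
import time

INACTIVITY_TIMEOUT = 10    # seconds, inactivity value from the report
SLEEP_BETWEEN_CHUNKS = 70  # seconds, sleep value from the report


def upload_with_inactivity(send_chunk, data, chunk_size,
                           pause=SLEEP_BETWEEN_CHUNKS, sleep=time.sleep):
    """Upload data in chunks, idling longer than the inactivity timeout.

    send_chunk(offset, chunk) is a hypothetical callable that sends one
    PUT request and returns the HTTP status code.
    Returns (bytes_sent, last_status).
    """
    offset = 0
    while offset < len(data):
        chunk = data[offset:offset + chunk_size]
        status = send_chunk(offset, chunk)
        if status != 200:
            # With a "paused" flag honoured by the daemon, the request
            # after the idle period would be expected to land here.
            return offset, status
        offset += len(chunk)
        if offset < len(data):
            sleep(pause)  # idle long enough for the system to pause
    return offset, 200
```

Passing `sleep` as a parameter lets the loop be exercised without actually waiting 70 seconds between chunks.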