Bug 678440

Summary: opening urls fails with error 403 on picky servers
Product: [Fedora] Fedora
Reporter: Matěj Cepl <mcepl>
Component: libreoffice
Assignee: Stephan Bergmann <sbergman>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: caolanm, dtardon, jorton, ltinkl, mcepl
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-06-12 16:43:49 UTC
Attachments:
Description | Flags
screencast of the issue | none
this is what we do | none

Description Matěj Cepl 2011-02-17 22:38:22 UTC
Created attachment 479418 [details]
screencast of the issue

Description of problem:
When trying to include a table from http://en.wikipedia.org/wiki/Intel_GMA, I get a 403 error as soon as I enter the URL in the dialog box.

Version-Release number of selected component (if applicable):
libreoffice-calc-3.3.0.4-1.fc15.x86_64

How reproducible:
100% (3 out of 3)

Steps to Reproduce:
1. Choose Insert/Link to External Data.
2. Enter the URL in the "URL of external data source" input box.
  
Actual results:
Error dialog

Expected results:
List of available tables in the "Available tables/ranges" listbox

Comment 1 David Tardon 2011-02-18 07:33:26 UTC
Created attachment 479463 [details]
this is what we do

Compile with

gcc $(pkg-config --cflags --libs neon) neon_propfind.c -o neon_propfind

and observe the error.

Comment 2 David Tardon 2011-02-18 07:36:08 UTC
dtardon->jorton: Should the above test case work?

Comment 3 Joe Orton 2011-02-18 08:52:58 UTC
It's entirely dependent on whether the server supports DAV or not. If it doesn't support DAV, a 403 error is not surprising.

Comment 4 David Tardon 2011-02-18 09:01:37 UTC
Okay, thanks for clarification. Sorry, Matej, nothing we can do here.

Comment 5 Matěj Cepl 2011-02-18 11:12:01 UTC
(In reply to comment #4)
> Okay, thanks for clarification. Sorry, Matej, nothing we can do here.

Sure you can: state in the Help that OOo doesn't support arbitrary URLs, only WebDAV (BTW, why don't you support read-only access to arbitrary URLs?). For example, in the help for this function I read:

With the help of the Web Page Query (LibreOffice Calc) import filter, you can insert tables from HTML documents in a Calc spreadsheet.

or

3. Enter the URL of the HTML document or the name of the spreadsheet. Press Enter when finished. Click the ...

There is nothing to indicate that this functionality is limited just to WebDAV.

Comment 6 Caolan McNamara 2012-03-20 15:18:58 UTC
oowriter http://www.redhat.com works
oowriter http://www.google.com works
oowriter http://en.wikipedia.org doesn't work

On the face of it, it would *appear* to make sense to try WebDAV first and, if that fails, just grab the content of the URL and open a read-only copy of it.

caolanm->sbergmann: could you have a look and see whether that's what's happening, i.e. whether those other two servers support WebDAV and that's why they work, or whether something else is going on? Also, would a fallback to slurping down the full content, as I think the FTP handler does, be a good idea?
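
For illustration only, the "try WebDAV first, otherwise open a read-only copy" idea could be sketched roughly as below. This is not LibreOffice code: `open_with_fallback`, `OpenResult`, `DavUnavailable` and the two opener callables are hypothetical names invented for this sketch, and the real WebDAV and HTTP logic is deliberately left to whatever the caller passes in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OpenResult:
    """What the caller gets back: the bytes plus whether they are editable."""
    content: bytes
    read_only: bool


class DavUnavailable(Exception):
    """Hypothetical error raised by the WebDAV opener when the server refuses DAV."""


def open_with_fallback(
    url: str,
    open_dav: Callable[[str], bytes],
    fetch_get: Callable[[str], bytes],
) -> OpenResult:
    """Try the WebDAV path first; on refusal, fetch a read-only copy instead."""
    try:
        # Preferred path: the server speaks WebDAV, so the document is writable.
        return OpenResult(open_dav(url), read_only=False)
    except DavUnavailable:
        # Fallback path: plain download of the content, opened read-only.
        return OpenResult(fetch_get(url), read_only=True)
```

A caller would pass in its own WebDAV and plain-GET functions; the only contract assumed here is that the WebDAV function raises `DavUnavailable` when the server rejects the DAV request.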

Comment 7 Stephan Bergmann 2012-06-12 16:43:49 UTC
As it turns out, the difference with <http://en.wikipedia.org> is that it sends back a 403 "Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice" as long as you do not include a User-Agent header.

Fixed upstream now as <http://cgit.freedesktop.org/libreoffice/core/commit/?id=4d0e3127ed2def7212bc05aa860cd06704bb1efe> "rhbz#678440: Always include User-Agent to avoid 403 from picky servers," included in LibreOffice 3.6 and intended for inclusion into LibreOffice 3.5.5 as well.
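
For anyone wanting to see the effect in isolation, here is a small self-contained sketch. It does not reproduce the upstream commit: it starts a toy local server (`PickyHandler`, a made-up stand-in for a picky server) that answers 403 when no User-Agent header is present, then sends one request without the header and one with it. The `USER_AGENT` value is a placeholder, and Python's `http.client` is used because it does not add a User-Agent on its own.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder value for the sketch, not the string LibreOffice sends.
USER_AGENT = "ExampleClient/1.0 (contact: admin@example.invalid)"


class PickyHandler(BaseHTTPRequestHandler):
    """Toy server: refuses any GET that carries no User-Agent header."""

    def do_GET(self):
        allowed = bool(self.headers.get("User-Agent"))
        body = b"ok" if allowed else b"User-Agent required"
        self.send_response(200 if allowed else 403)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def get_status(port, headers):
    """Send one GET to the local toy server and return the status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers=headers)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def demo():
    """Return (status without User-Agent, status with User-Agent)."""
    server = HTTPServer(("127.0.0.1", 0), PickyHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        without_agent = get_status(port, {})
        with_agent = get_status(port, {"User-Agent": USER_AGENT})
        return without_agent, with_agent
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Against this toy server the first request is refused and the second succeeds, which mirrors the behaviour described above for servers that insist on a User-Agent.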