Red Hat Bugzilla – Bug 678440
opening urls fails with error 403 on picky servers
Last modified: 2012-06-12 12:47:37 EDT
Created attachment 479418
screencast of the issue
Description of problem:
When trying to include a table from http://en.wikipedia.org/wiki/Intel_GMA, I get a 403 error as soon as I enter the URL in the dialog box.
Version-Release number of selected component (if applicable):
How reproducible:
100% (3 out of 3)
Steps to Reproduce:
1. Insert/Link to External Data
2. Insert the URL into the "URL of external data source" input box
Expected results:
List of available tables in the "Available tables/ranges" listbox
Created attachment 479463
this is what we do
gcc $(pkg-config --cflags --libs neon) neon_propfind.c -o neon_propfind
and observe the error.
dtardon->jorton: Should the above test case work?
It's entirely dependent on whether the server supports DAV or not. If it doesn't support DAV, a 403 error is not surprising.
Okay, thanks for clarification. Sorry, Matej, nothing we can do here.
(In reply to comment #4)
> Okay, thanks for clarification. Sorry, Matej, nothing we can do here.
Sure, you can. Say in Help that OOo doesn't support arbitrary URLs but just WebDAV (BTW, why don't you support read-only access to arbitrary URLs?). For example, in the help for this function I read:
With the help of the Web Page Query (LibreOffice Calc) import filter, you can insert tables from HTML documents in a Calc spreadsheet.
3. Enter the URL of the HTML document or the name of the spreadsheet. Press Enter when finished. Click the ...
There is nothing to indicate that this functionality is limited just to WebDAV.
oowriter http://www.redhat.com works
oowriter http://www.google.com works
oowriter http://en.wikipedia.org doesn't work
On the face of it, it would *appear* to make sense to try webdav first, and if that fails just grab the content of the url and open a read-only copy of it.
caolanm->sbergmann: could you have a look and see if that's what's happening, i.e. that those other two servers support WebDAV and that's why they work, or if something else is going on? Also, is a fallback to slurping down the full content, like I think the ftp handler does, a good idea?
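The fallback idea proposed above can be sketched roughly as follows. This is only an illustration in Python, not the LibreOffice or neon code: `open_with_fallback`, `propfind`, and `get` are hypothetical names, and the two callables stand in for whatever actually performs the HTTP requests. The sketch assumes a PROPFIND that succeeds returns 207 Multi-Status, and treats any other status as "no usable WebDAV here".

```python
from typing import Callable, Optional, Tuple

MULTI_STATUS = 207  # status a WebDAV server returns for a successful PROPFIND


def open_with_fallback(
    url: str,
    propfind: Callable[[str], int],   # hypothetical: sends PROPFIND, returns the HTTP status
    get: Callable[[str], bytes],      # hypothetical: sends GET, returns the response body
) -> Tuple[str, Optional[bytes]]:
    """Try WebDAV first; if that fails, fetch the content and open it read-only."""
    status = propfind(url)
    if status == MULTI_STATUS:
        # The server speaks WebDAV, so the document can be opened through DAV.
        return ("webdav", None)
    # Anything else (403, 405, ...) is treated as "not a DAV resource":
    # download the content and hand back a read-only copy.
    return ("read-only", get(url))
```

With stub callables, `open_with_fallback(url, lambda u: 403, lambda u: b"<html>")` takes the read-only branch, which is the behaviour suggested for servers without DAV.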
As it turns out, the difference with <http://en.wikipedia.org> is that it sends back 403 "Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice" as long as you do not include a User-Agent header.
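To illustrate what "including a User-Agent header" means on the client side, here is a minimal sketch using Python's standard urllib. It is not the upstream fix (that lives in the LibreOffice commit referenced in this bug); `build_request` and the User-Agent string are made up for the example. The request is only constructed here, not sent.

```python
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a request that always carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical User-Agent value; a real client would identify itself properly.
req = build_request(
    "http://en.wikipedia.org/wiki/Intel_GMA",
    "ExampleOffice/0.0 (hypothetical; contact@example.org)",
)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Passing `req` to `urllib.request.urlopen` would send the request with that header; whether a given server then answers 200 or 403 is up to the server.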
Fixed upstream now as <http://cgit.freedesktop.org/libreoffice/core/commit/?id=4d0e3127ed2def7212bc05aa860cd06704bb1efe> "rhbz#678440: Always include User-Agent to avoid 403 from picky servers," included in LibreOffice 3.6 and intended for inclusion into LibreOffice 3.5.5 as well.