Bug 672378 - HTML -> * conversion fails with files from texi2html
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: calibre
Version: 14
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kevin Fenzi
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-01-24 23:55 UTC by Sean Stangl
Modified: 2012-08-16 21:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-16 21:28:18 UTC
Type: ---


Attachments
Rust Documentation, HTML format (fails conversion) (353.13 KB, text/html)
2011-01-24 23:55 UTC, Sean Stangl

Description Sean Stangl 2011-01-24 23:55:22 UTC
Created attachment 475087
Rust Documentation, HTML format (fails conversion)

Description of problem:

I am trying to read documentation on an e-book reader. The documentation is provided as .texi, so I converted it to HTML with texi2html, loaded the HTML into Calibre, and asked it to convert to PDF and MOBI. Both conversions fail with the same error. The HTML file in question is attached.


Version-Release number of selected component (if applicable):

0.7.38-3.fc14

Steps to Reproduce:

1. Add attached HTML file to library.
2. Convert individually with default options to PDF or MOBI.
3. Error is promptly displayed.

Actual Results:

The following error is given:

------------------------------------------
ERROR: Conversion Error: <b>Failed</b>: Convert book 1 of 1 (Rust Documentation)

Convert book 1 of 1 (Rust Documentation)
Processing archive...
Resolved conversion options
calibre version: 0.7.38
{'asciiize': True,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'breadth_first': False,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part|prologue|epilogue\\s+', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': None,
 'disable_font_rescaling': False,
 'dont_compress': False,
 'dont_package': False,
 'extra_css': None,
 'font_size_mapping': None,
 'footer_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'header_regex': u'(?i)(?<=<hr>)((\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?\\d+<br>\\s*.*?\\s*)|(\\s*<a name=\\d+></a>((<img.+?>)*<br>\\s*)?.*?<br>\\s*\\d+))(?=<br>)',
 'html_unwrap_factor': 0.4,
 'input_encoding': '',
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x3a94750>,
 'insert_blank_line': False,
 'insert_metadata': False,
 'isbn': None,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'max_levels': 5,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'mobi_ignore_margins': False,
 'no_chapters_in_toc': False,
 'no_inline_navbars': True,
 'no_inline_toc': False,
 'output_profile': <calibre.customize.profiles.KindleOutput object at 0x3a94d10>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'personal_doc': u'[PDOC]',
 'prefer_author_sort': False,
 'prefer_metadata_cover': False,
 'preprocess_html': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': '/tmp/calibre_0.7.38_tmp_r_4Wkt/calibre_0.7.38_FQbpxq.opf',
 'remove_first_image': False,
 'remove_footer': False,
 'remove_header': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'rescale_images': False,
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: HTML Input running
on /tmp/calibre_0.7.38_tmp_r_4Wkt/calibre_0.7.38_Eedtrt_plumber/content.opf
Parsing all content...
Traceback (most recent call last):
  File "/usr/bin/calibre-parallel", line 19, in <module>
    sys.exit(main())
  File "/usr/lib64/calibre/calibre/utils/ipc/worker.py", line 106, in main
    result = func(*args, **kwargs)
  File "/usr/lib64/calibre/calibre/gui2/convert/gui_conversion.py", line 24, in gui_convert
    plumber.run()
  File "/usr/lib64/calibre/calibre/ebooks/conversion/plumber.py", line 853, in run
    accelerators, tdir)
  File "/usr/lib64/calibre/calibre/customize/conversion.py", line 216, in __call__
    log, accelerators)
  File "/usr/lib64/calibre/calibre/ebooks/html/input.py", line 299, in convert
    encoding=opts.input_encoding)
  File "/usr/lib64/calibre/calibre/ebooks/conversion/plumber.py", line 990, in create_oebbook
    reader()(oeb, path_or_stream)
  File "/usr/lib64/calibre/calibre/ebooks/oeb/reader.py", line 71, in __call__
    opf = self._read_opf()
  File "/usr/lib64/calibre/calibre/ebooks/oeb/reader.py", line 104, in _read_opf
    data = self.oeb.decode(data)
  File "/usr/lib64/calibre/calibre/ebooks/oeb/base.py", line 1897, in decode
    return fix_data(data.decode(self.input_encoding, 'replace'))
LookupError: unknown encoding: 
------------------------------------------
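
For reference, the last frame of the traceback can be reproduced outside calibre. The following is only a minimal sketch, assuming the empty 'input_encoding': '' option shown above reaches decode() unchanged. The helper name safe_decode and the UTF-8 fallback are hypothetical and are not calibre's actual fix.

```python
# Sketch only: reproduces the LookupError from the traceback in isolation.
# safe_decode and its utf-8 fallback are hypothetical, not calibre code.

def safe_decode(data, input_encoding, fallback="utf-8"):
    """Treat an empty or missing encoding name as 'not set'."""
    encoding = input_encoding or fallback
    return data.decode(encoding, "replace")


raw = b"<html><body>Rust Documentation</body></html>"

# Passing an empty string as the codec name fails the codec lookup.
try:
    raw.decode("", "replace")
    error = None
except LookupError as exc:
    error = str(exc)

# With the guard, the empty option falls back to the default encoding.
text = safe_decode(raw, "")

print(error)
print(text)
```

An empty string is not a registered codec name, so the lookup fails before any bytes are decoded. That matches the error above being independent of the output format (PDF or MOBI).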

Comment 1 Kevin Fenzi 2011-01-25 00:26:50 UTC
This looks like: 
http://bugs.calibre-ebook.com/ticket/8232
which was fixed upstream in 0.7.40. 

Can you: 

su
cd /etc/yum.repos.d
wget http://repos.fedorapeople.org/repos/kevin/calibre/fedora-calibre.repo
yum clean all
yum update calibre

and see if that version works for your use case?

Comment 2 Sean Stangl 2011-01-25 00:37:45 UTC
It converts successfully with 0.7.40-1.fc14 from your repo.

Comment 3 Fedora End Of Life 2012-08-16 21:28:20 UTC
This message is a notice that Fedora 14 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 14. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained.  At this time, all open bugs with a Fedora 'version'
of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advance warning of this 
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen 
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we were unable to fix it before Fedora 14 reached end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" (top right of this page) and open it against that 
version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

