ocrodjvu (0.10.4) unstable; urgency=low

  * Fix handling input files with non-ASCII names.
    Regression introduced in 0.9.1.
    https://github.com/jwilk/ocrodjvu/issues/23
    Thanks to @derrikF for the bug report.
  * Improve documentation:
    + Fix punctuation.
    + Clarify Python version requirements.
    + Update the credits file.

 -- Jakub Wilk <jwilk@jwilk.net>  Thu, 12 Jul 2018 21:07:40 +0200

ocrodjvu (0.10.3) unstable; urgency=low

  * Tesseract: fix stripping multi-line comments.
  * Drop support for python-djvulibre < 0.3.9.
  * Improve documentation:
    + Document untrusted search path vulnerability that was unknowingly fixed
      in 0.4.7.
    + Document that argparse is only needed for Python 2.6.
    + Link to Python 2 (not Python 3) documentation.
    + Use HTTPS for unicode.org URLs.
    + Update Tesseract bug tracker URL.
    + Update HTML5 specification URL.
    + Update PyPI URLs.
  * Improve the setup script:
    + Don't import any own modules.
    + Use distutils644 to sanitize tarball permissions etc.
  * Improve the test suite:
    + Fix compatibility with subprocess32 3.5.0.

 -- Jakub Wilk <jwilk@jwilk.net>  Mon, 21 May 2018 23:47:49 +0200

ocrodjvu (0.10.2) unstable; urgency=low

  * Make --version print also versions of Python and the libraries.
  * Make --version print to stdout, not stderr.
  * Make bad usage exit status 1.
  * Drop support for PyICU < 1.0.
  * Update DocBook XSL homepage URL.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 07 Feb 2017 23:58:08 +0100

ocrodjvu (0.10.1) unstable; urgency=low

  * Don't hardcode the Python interpreter path in script shebangs; use
    “#!/usr/bin/env python” instead.
  * Include a missing test image in the tarball.
  * Update Tesseract homepage URL.
  * Update bug tracker URLs.
    The project repo has moved to GitHub.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 22 Nov 2016 16:45:13 +0100

ocrodjvu (0.10) unstable; urgency=low

  * Add support for cuneiform-multilang as OCR engine.
    Thanks to Alexey Shipunov for the bug report.
  * Improve error handling.

 -- Jakub Wilk <jwilk@jwilk.net>  Fri, 17 Jun 2016 11:04:16 +0200

ocrodjvu (0.9.2) unstable; urgency=low

  * Fix crashes on empty pages.
    https://github.com/jwilk/ocrodjvu/issues/18
    https://github.com/jwilk/ocrodjvu/issues/7
    Thanks to Janusz S. Bień for the bug report.
  * Fix typos.
  * Ignore boring diagnostic messages from Tesseract.
  * Update the HTML5 specification URLs.
  * Update the ICU website URL.
  * Update the PyICU website URL.
  * Rename the test modules, so that passing --all to nosetests is no longer
    necessary.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 31 May 2016 22:10:30 +0200

ocrodjvu (0.9.1) unstable; urgency=low

  * Use the subprocess32 module (a thread-safe replacement for the subprocess
    module) when it's available.
  * Issue a warning when the -j/--jobs is enabled, but the subprocess module
    is not thread-safe.
  * Include an example script for converting scans to DjVu + hOCR.
  * Improve error handling.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 25 Aug 2015 23:36:12 +0200

ocrodjvu (0.9) unstable; urgency=low

  * If python-djvulibre >= 0.4 is installed, don't escape non-ASCII characters
    in djvused scripts.
    https://github.com/jwilk/ocrodjvu/issues/13
    Thanks to Janusz S. Bień for the bug report.
  * Improve error handling.

 -- Jakub Wilk <jwilk@jwilk.net>  Mon, 27 Jul 2015 21:34:39 +0200

ocrodjvu (0.8) unstable; urgency=low

  * Change the default OCR engine to Tesseract.
  * Add the “tesseract: ” prefix to messages Tesseract prints on stderr.
    https://github.com/jwilk/ocrodjvu/issues/10
    Thanks to Janusz S. Bień for the bug report.
  * Ensure that exit code is non-zero if the program recovered from an error.
    https://github.com/jwilk/ocrodjvu/issues/6
  * Improve error handling.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 10 Jun 2015 21:17:32 +0200

ocrodjvu (0.7.19) unstable; urgency=low

  * Make sure that text zones are at least 1 pixel wide and 1 pixel high.
  * Tesseract: fix splitting bounding boxes for character clusters.
    https://github.com/jwilk/ocrodjvu/issues/12
    Thanks to Janusz S. Bień for the bug report.
  * Fix typos in the documentation.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 11 Nov 2014 18:11:36 +0100

ocrodjvu (0.7.18) unstable; urgency=low

  [ Filip Graliński ]
  * Fix counting pages when file identifier cannot be converted to locale
    encoding.

  [ Jakub Wilk ]
  * Use HTTPS URLs when they are available, in documentation and code.
  * Update some stale URLs in documentation and code.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 22 Apr 2014 11:22:01 +0200

ocrodjvu (0.7.17) unstable; urgency=low

  * Fix compatibility with Tesseract > 3.02.
    https://github.com/jwilk/ocrodjvu/issues/9
    Thanks to Heinrich Schwietering for the bug report.
  * ocrodjvu:
    + Ensure that exit code is non-zero if the program was interrupted by
      user.
    + Fix typos in the documentation.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 04 Feb 2014 11:28:46 +0100

ocrodjvu (0.7.16) unstable; urgency=low

  * Use “en-US-POSIX” as the default locale for ICU.
  * ocrodjvu:
    + Fix option names in documentation of the --ocr-only option.
    + Don't crash if file identifier is not in UTF-8 or if it cannot be
      converted to locale encoding; use the page number instead.
      https://github.com/jwilk/ocrodjvu/issues/4
    + Don't hang if a page cannot be decoded.
      https://github.com/jwilk/ocrodjvu/issues/5

 -- Jakub Wilk <jwilk@jwilk.net>  Sun, 28 Apr 2013 15:08:19 +0200

ocrodjvu (0.7.15) unstable; urgency=low

  * Strip trailing whitespace from text zones bigger than words (lines,
    paragraphs, …).
  * Fix compatibility with Tesseract 3.02.
    Thanks to Janusz S. Bień for the bug report.
  * ocrodjvu:
    + Make it possible to pass multiple languages to Tesseract ≥ 3.02.
      https://github.com/jwilk/ocrodjvu/issues/3
      Thanks to Janusz S. Bień for the bug report.
    + Cuneiform: rename mixed Russian-English language code:
      “rus-eng” → “rus+eng”. This is for consistency with Tesseract.
    + Tesseract: fix support for Chinese language pack.
    + Tesseract: make it possible to pass the -psm option in order to
      customize layout analysis. For example, to enable OSD, use:
      “-X extra_args='-psm 1'”.
    + Make --list-languages output sorted.
    + Tesseract: remove “osd” from language list.
    + Accept both ISO 639-2/T and ISO 639-2/B language codes.
    + Add the --save-raw-ocr option.
    + Add the --raw-ocr-filename-template option.
    + Improve documentation of the --ocr-only option.
  * Require Python ≥ 2.6.
  * Fix compatibility with nose 1.2.
    Thanks to Kyrill Detinov for the bug report.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 17 Apr 2013 00:59:23 +0200

ocrodjvu (0.7.14) unstable; urgency=low

  * Document which versions of OCRopus are supported.
  * Document that PyICU and html5lib are only required for some optional
    features.
  * Document what software is needed to rebuild the manual pages from source.
  * djvu2hocr:
    + Add the --title option.
    + Add the --css option.
    + Document the -p/--pages option.

 -- Jakub Wilk <jwilk@jwilk.net>  Fri, 15 Mar 2013 13:54:05 +0100

ocrodjvu (0.7.13) unstable; urgency=low

  * Abort early if one tries to use an incompatible Python version.
  * Improve the manual pages, as per man-pages(7) recommendations:
    + Remove the “AUTHOR” sections.
    + Rename the “REPORTING BUGS” sections as “BUGS”.
  * Improve the test suite.
  * Make “setup.py clean -a” remove compiled manual pages (unless they were
    built by “setup.py sdist”).

 -- Jakub Wilk <jwilk@jwilk.net>  Thu, 14 Feb 2013 23:28:56 +0100

ocrodjvu (0.7.12) unstable; urgency=low

  * Don't let “-X fix-html=1” break HTML snippets ocrodjvu generates itself
    for the “-t chars” Tesseract support.
    Thanks to Janusz S. Bień for the test case.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 15 Aug 2012 19:32:57 +0200

ocrodjvu (0.7.11) unstable; urgency=low

  * hocr2djvused:
    + Allow processing multiple hOCR documents at once.
      https://github.com/jwilk/ocrodjvu/issues/1
      Thanks to Thomas Koch for the bug report and the initial patch.
  * Fix merging results of two Tesseract runs.
    Thanks to Janusz S. Bień for the bug report.

 -- Jakub Wilk <jwilk@jwilk.net>  Mon, 28 May 2012 19:43:22 +0200

ocrodjvu (0.7.10) unstable; urgency=low

  * Improve error handling.
  * ocrodjvu:
    + Attempt to fix encoding issues and eliminate unwanted control characters
      in files produced by Tesseract and Cuneiform.
      https://bugs.debian.org/671764
      Thanks to Thomas Koch for the bug report.
  * hocr2djvused:
    + Add the --fix-utf8 option.
  * djvu2hocr:
    + Translate DjVu “region” to <div class="ocrx_block"> (instead of <span…>,
      which was causing XHTML validity errors).
  * Tests: fix compatibility with PIL ≥ 1.2.
  * Include example scans2djvu+hocr script.
  * Fix merging results of two Tesseract runs.
    Thanks to Janusz S. Bień for the bug report.
  * Use RFC 3339 date format in the manual page. Don't call external programs
    to build it.

 -- Jakub Wilk <jwilk@jwilk.net>  Sat, 12 May 2012 00:37:50 +0200

ocrodjvu (0.7.9) unstable; urgency=low

  * Improve error handling.
  * Fix compatibility with Tesseract > 3.01.

 -- Jakub Wilk <jwilk@jwilk.net>  Sat, 10 Mar 2012 23:36:03 +0100

ocrodjvu (0.7.8) unstable; urgency=low

  * Improve test suite.

 -- Jakub Wilk <jwilk@jwilk.net>  Sun, 22 Jan 2012 00:04:16 +0100

ocrodjvu (0.7.7) unstable; urgency=low

  * Raise proper import error if html5lib is not installed.
    Thanks to Kyrill Detinov for the bug report.

 -- Jakub Wilk <jwilk@jwilk.net>  Sun, 11 Dec 2011 23:08:05 +0100

ocrodjvu (0.7.6) unstable; urgency=low

  * Improve error handling.
  * ocrodjvu:
    + Fix a regression in gocr, ocrad and tesseract engines, which made them
      unusable.

 -- Jakub Wilk <jwilk@jwilk.net>  Thu, 27 Oct 2011 18:06:38 +0200

ocrodjvu (0.7.5) unstable; urgency=low

  * Check Python version in setup.py.
  * Accept slightly malformed hOCR documents (with a text zone not completely
    within the page area).
    https://bugs.debian.org/575484#35
  * Fix compatibility with Tesseract > 3.00.
    Thanks to Janusz S. Bień for the bug report.
  * ocrodjvu, hocr2djvused:
    + Add the --html5 option.

 -- Jakub Wilk <jwilk@jwilk.net>  Sat, 27 Aug 2011 01:25:33 +0200

ocrodjvu (0.7.4) unstable; urgency=low

  * Use a better method to detect Debian-based systems.
  * hocr2djvused:
    + Ignore comments and <script> elements in hOCR.
  * ocrodjvu:
    + For Tesseract ≥ 3.00, extract bounding boxes of particular characters
      with higher accuracy.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 27 Jul 2011 17:34:38 +0200

ocrodjvu (0.7.2) unstable; urgency=low

  * Don't hang if one of the threads raises an exception.
  * Use the logging module for printing progress messages, errors etc.
  * Produce more useful import error messages on Debian-based systems.

 -- Jakub Wilk <jwilk@jwilk.net>  Mon, 04 Apr 2011 01:14:22 +0200

ocrodjvu (0.7.1) unstable; urgency=low

  * Windows: guess location of the DjVuLibre DLL (requires python-djvulibre
    ≥ 0.3.3).
  * ocrodjvu:
    + Work around a bug in Cuneiform, which mistakenly use “slo” (rather than
      “slv”) as language code for Slovenian.
      https://bugs.launchpad.net/cuneiform-linux/+bug/707951
    + Accept “ces”, “nld”, “slv”, “ron” as language codes for Czech, Dutch,
      Slovenian and Romanian languages, even when Cuneiform internally use
      different ones.
  * djvu2hocr:
    + Don't flip hOCR upside-down.
      https://bugs.debian.org/611460

 -- Jakub Wilk <jwilk@jwilk.net>  Sat, 29 Jan 2011 18:14:40 +0100

ocrodjvu (0.7.0) unstable; urgency=low

  * Correctly handle empty pages recognized by Cuneiform and Ocrad.
    Thanks to Alexey Shipunov for the bug report.
  * Fix crash on Cuneiform-generated hOCR with bounding boxes for whitespace
    characters.
    Thanks to Alexey Shipunov for the bug report.
  * Fix compatibility with Tesseract 3.00.
  * Fix colors in 24-bit BMP images.
  * ocrodjvu:
    + Make “-e” an alias for “--engine”.
    + Make “-l” an alias for “--language”.
    + Add the -X option (for advanced users).
    + Work-around for Cuneiform returning files with control characters is now
      disabled by default. Use “-X fix-html=1” to re-enable it.
    + Add the --on-error option (for advanced users).
  * djvu2hocr:
    + Fix a typo, which prevented hocr2djvused from correctly parsing files
      produced by it.
      https://bugs.debian.org/600539
  * Extend the test suite.

 -- Jakub Wilk <jwilk@jwilk.net>  Sun, 07 Nov 2010 21:37:00 +0100

ocrodjvu (0.6.1) unstable; urgency=high

  * Improve detection of Tesseract.
  * Correctly handle unrecognized and non-ASCII characters in Ocrad ORF output.
    Thanks to Heinrich Schwietering for the bug report.
  * Correctly handle text that is closer than 100 pixels from the left edge in
    Ocrad ORF output.
    Thanks to Heinrich Schwietering for the test case.
  * Fix crash on hOCR with image elements.
    https://bugs.debian.org/598139
    Thanks to Alexey Shipunov for the bug report.
  * Fix insecure use of temporary files when using Cuneiform.
    https://bugs.debian.org/598134
    CVE-2010-4338

 -- Jakub Wilk <jwilk@jwilk.net>  Sun, 26 Sep 2010 15:01:51 +0200

ocrodjvu (0.6.0) unstable; urgency=low

  * Add support for the Tesseract OCR engine.
  * Fix Cuneiform support (a regression introduced in 0.5).
    Thanks to Kyrill Detinov for the bug report.

 -- Jakub Wilk <jwilk@jwilk.net>  Thu, 16 Sep 2010 19:24:20 +0200

ocrodjvu (0.5.1) unstable; urgency=low

  * Fix crash when listing engines/languages if OCRopus is not found.
    Thanks to Kyrill Detinov for the bug report.
  * lxml is no longer required for OCR engines that are not using hOCR as
    output format.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 15 Sep 2010 18:38:00 +0200

ocrodjvu (0.5.0) unstable; urgency=low

  * Add support for the Ocrad OCR engine.
  * Add support for the GOCR engine.
  * Cuneiform is no longer required to be linked with ImageMagick.
  * Prevent Cuneiform from asking interactive questions.
    Thanks to Heinrich Schwietering for the bug report.
  * Make sure that signals are handled in a sane way.
    Thanks to Heinrich Schwietering for the bug report.
  * Drop support for guessing page size from image (scan) contents.
  * Let the setup.py script install manual pages.
    Thanks to Kyrill Detinov and Heinrich Schwietering for bug reports.

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 14 Sep 2010 23:00:35 +0200

ocrodjvu (0.4.7) unstable; urgency=low

  * Preserve as much environment as possible when calling external programs.
    https://bugs.debian.org/594385
    Thanks to Heinrich Schwietering for the bug report.
    + In particular, preserve PATH, instead of letting Python fall back to
      insecure default PATH, which contains current working directory.
      https://bugs.python.org/issue26414

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 25 Aug 2010 20:27:17 +0200

ocrodjvu (0.4.6) unstable; urgency=low

  * Implement work-around for Cuneiform returning files with control
    characters.
    Thanks to Kyrill Detinov for the bug report.
  * Avoid deprecation warnings with PyICU ≥ 1.0.
    https://bugs.debian.org/589027
  * djvu2hocr:
    + Don't crash on very long documents.
      https://bugs.debian.org/591389

 -- Jakub Wilk <jwilk@jwilk.net>  Tue, 03 Aug 2010 20:33:49 +0200

ocrodjvu (0.4.5) unstable; urgency=low

  * Fix handling of “deu” and “rus-eng” languages.
    Thanks to Kyrill Detinov for the bug report.
  * Properly handle hOCR with inline formatting.
    Thanks to Kyrill Detinov for the bug report.
  * djvu2hocr:
    + Add ocr-system and ocr-capabilities meta information.

 -- Jakub Wilk <jwilk@jwilk.net>  Mon, 24 May 2010 21:22:39 +0200

ocrodjvu (0.4.4) unstable; urgency=low

  * Document that ocrodjvu honours TMPDIR environment variable.
    https://bugs.debian.org/575488
  * Don't remove temporary directory if ocrodjvu crashed.
    https://bugs.debian.org/575487

 -- Jakub Wilk <jwilk@jwilk.net>  Fri, 02 Apr 2010 12:00:11 +0200

ocrodjvu (0.4.3) unstable; urgency=low

  * Don't crash on --version.
    https://bugs.debian.org/573496
  * Give more meaningful error messages on a malformed hOCR produced by
    Cuneiform.
    https://bugs.debian.org/572522
  * Document how djvu2hocr deals with non-XML characters.

 -- Jakub Wilk <jwilk@jwilk.net>  Fri, 19 Mar 2010 01:22:54 +0100

ocrodjvu (0.4.2) unstable; urgency=low

  * New options for ocrodjvu:
    + --render=mask,
    + --render=foreground,
    + --render=all.
    https://bugs.debian.org/572081
  * Fix off-by-one error in text area coordinates.
  * Add support for Cuneiform 0.9.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 03 Mar 2010 21:27:15 +0100

ocrodjvu (0.4.1) unstable; urgency=low

  * Be stricter when reading hOCR produced by OCRopus 0.3.1.

 -- Jakub Wilk <jwilk@jwilk.net>  Fri, 22 Jan 2010 20:25:54 +0100

ocrodjvu (0.4.0) unstable; urgency=low

  * Add support for the Cuneiform OCR engine.
    New options for ocrodjvu:
    + --engine,
    + --list-engines.
  * Don't crash on non-ASCII file names.
    Thanks to Jean-Christophe Heger for the bug report.
  * hocr2djvused:
    + Add the --page-size option.
  * ocrodjvu:
    + Add the -j/--jobs option.

 -- Jakub Wilk <jwilk@jwilk.net>  Thu, 21 Jan 2010 23:41:37 +0100

ocrodjvu (0.3.2) unstable; urgency=low

  * Accept negative numbers in hOCR bounding boxes.
  * djvu2hocr:
    + Fix broken UAX #29 segmentation.
    + Provide correct page bounding boxes.

 -- Jakub Wilk <jwilk@jwilk.net>  Fri, 08 Jan 2010 17:46:51 +0100

ocrodjvu (0.3.1) unstable; urgency=low

  * djvu2hocr:
    + Fix broken UAX #29 segmentation.

 -- Jakub Wilk <jwilk@jwilk.net>  Sun, 03 Jan 2010 12:56:08 +0100

ocrodjvu (0.3.0) unstable; urgency=low

  * Python ≥ 2.5 is now required.
  * argparse module in now required.
  * Add support for OCRopus 0.3.1.
  * Give better error messages when Tesseract language pack cannot be found.
  * New options for ocrodjvu:
    + -t/--details;
    + --word-segmentation.
  * New options for hocr2djvused:
    + --rotation,
    + -t/--details,
    + --word-segmentation,
  * New tool: djvu2hocr.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 16 Dec 2009 18:42:21 +0100

ocrodjvu (0.2.1) unstable; urgency=low

  * Give a clearer error message if OCRopus were interrupted by a signal.
  * Add the --language option.
  * Add the --list-languages option.

 -- Jakub Wilk <jwilk@jwilk.net>  Sat, 17 Oct 2009 17:34:43 +0200

ocrodjvu (0.2.0) unstable; urgency=low

  * Provide a manual page.
  * Add the -D/--debug option.
  * Add options to specify how results are stored:
    + -o/--save-bundled,
    + -i/--save-indirect,
    + --save-script,
    + --in-place,
    + --dry-run.
  * Add the --clear-text option.
  * Add the --ocr-only option.

  * Please use the --in-place and --clear-text options to retain compatibility
    with ocrodjvu < 0.2.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 14 Oct 2009 20:53:48 +0200

ocrodjvu (0.1.3) unstable; urgency=low

  * Use ocroscript, rather than ocrocmd.

 -- Jakub Wilk <jwilk@jwilk.net>  Sun, 15 Mar 2009 19:01:11 +0100

ocrodjvu (0.1.2) unstable; urgency=low

  * Make hocr2djvused work with hOCR for multiple pages.
  * Handle rotated pages correctly.
  * Ignore IW44-only pages.

 -- Jakub Wilk <jwilk@jwilk.net>  Mon, 23 Jun 2008 20:14:42 +0200

ocrodjvu (0.1.1) unstable; urgency=low

  * Depend on python-lxml.
  * Better compatibility with Python 2.4.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 14 May 2008 11:23:13 +0200

ocrodjvu (0.1) unstable; urgency=low

  * Initial release.

 -- Jakub Wilk <jwilk@jwilk.net>  Wed, 07 May 2008 18:29:40 +0200
