Changelog¶
This changelog mostly follows Keep a Changelog. Release numbering mostly follows Semantic Versioning.
Version 3.0.0 (2020-09-15)¶
Milestone¶
Initial release of webchanges as an updated fork of urlwatch 2.21. Changes below are relative to urlwatch 2.21.
Added¶
You can now specify just a url and the “just works for web” philosophy optimizes the monitoring of text in webpages by applying all necessary filters TODO
If no job name is provided, the title of an HTML page will be used for a job name in reports
The Python html2text package (used by the html2text filter, previously known as pyhtml2text) is now initialized with the following purpose-optimized non-default options: unicode_snob = True, body_width = 0, single_line_break = True, and ignore_images = True
The output from html2text filter is reconstructed into HTML (for html reports), preserving basic formatting such as bolding, italics, list bullets, etc. as well as making links clickable
The formatting of HTML has been made radically more legible and useful, including long lines wrapping around
Reports are now rendered correctly by HTML email clients that reformat style sheets
The format-xml filter reformats (pretty-prints) XML
webchanges --errors will run all jobs and list all errors and empty responses (after filtering)
Browser jobs now recognize cookies, headers, http_proxy, https_proxy, and timeout
The revision number of the Chromium browser to use can be selected with chromium_revision
The user directory for the Chromium browser can be set with user_data_dir
Chromium can be directed to ignore HTTPS errors with ignore_https_errors
Chromium can be directed as to when to consider a page loaded with wait_until
Additional command line switches can be passed to Chromium with switches
New report filters additions_only and deletions_only allow tracking only content that was added to (or deleted from) the source
Support for Python 3.9
Backward compatibility with urlwatch 2.21
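The job-name fallback listed above (using the title of an HTML page when no name is given) can be sketched as follows. This is only an illustration using the standard library, not the webchanges implementation: the helper job_display_name, the _TitleParser class, and the final fallback to the URL when a page has no title are all assumptions made for this example.

```python
from html.parser import HTMLParser
from typing import Optional


class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element (illustrative only)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self._chunks: list = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    @property
    def title(self) -> str:
        # Collapse runs of whitespace and newlines into single spaces.
        return " ".join("".join(self._chunks).split())


def job_display_name(name: Optional[str], url: str, html: str) -> str:
    """Hypothetical helper: explicit name first, then the page title, then the URL."""
    if name:
        return name
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    # Assumption: fall back to the URL when the page has no usable title.
    return parser.title or url
```

An explicit name always wins in this sketch, so existing job files that already set name would be unaffected.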
Changed and deprecated¶
Navigation by full browser is now accomplished by specifying the url and adding the use_browser: true directive
The navigate directive has been deprecated for clarity and will trigger a warning *TODO*
The name of the default jobs configuration file has been changed to jobs.yaml; if at program launch urls.yaml is found and no jobs.yaml exists, it is copied over for backward compatibility
The name of the default program configuration file has been changed to config.yaml; if at program launch urlwatch.yaml is found and no config.yaml exists, it is copied over for backward compatibility
The location of config files in Windows has been moved to %USERPROFILE%\Documents\webchanges where they can be more easily edited (they are indexed there) and backed up
The html2text filter defaults to using the Python html2text package (with optimized defaults)
New additions_only directive to report only added lines (useful when monitoring only new content)
New deletions_only directive to report only deleted lines
keyring and cssselect Python packages are no longer installed by default
html2text and markdown2 Python packages are installed by default
Installation of Python packages required by a feature is now made easier with pip extras
The html2text filter’s re method has been renamed strip_tags; the old name is deprecated and will trigger a warning
The grep filter has been renamed keep_lines_containing; the old name is deprecated and will trigger a warning
The grepi filter has been renamed delete_lines_containing; the old name is deprecated and will trigger a warning
Both the keep_lines_containing and delete_lines_containing filters accept text (default) in addition to re (regular expressions)
The --test command line switch is used to test a job (formerly --test-filter, deprecated)
The --test-diff command line switch is used to test a job’s diff (formerly --test-diff-filter, deprecated)
The -V command line switch has been added as an alias to --version
If a filename for --jobs, --config or --hooks is supplied without a path and the file is not present in the current directory, webchanges now looks for it in the default configuration directory
In Windows, --edit defaults to using the built-in notepad.exe if %EDITOR% or %VISUAL% are not set
When using the --job command line switch, if there’s no file by that name in the specified directory, webchanges will look in the default one before giving up
The use of the kind directive in jobs.yaml configuration files has been deprecated (but is, for now, still used internally)
The database (cache) file is backed up at every run to *.bak
The list of default and optional dependencies has been updated (see documentation) to enable “Just works”
Dependencies are now specified as PyPI extras to simplify their installation
Changed timing from datetime to timeit.default_timer
Upgraded concurrent execution loop to concurrent.futures.ThreadPoolExecutor.map
Reports’ elapsed time now always has at least 2 significant digits
Added flake8 to the test suite
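The timing and concurrency changes listed above (timeit.default_timer, concurrent.futures.ThreadPoolExecutor.map, and elapsed times shown with at least 2 significant digits) can be illustrated with a small standard-library sketch. The names run_jobs, worker and format_elapsed are hypothetical, and the rounding rule is one plausible reading of "at least 2 significant digits" rather than the exact rule webchanges applies.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from timeit import default_timer


def format_elapsed(seconds: float) -> str:
    """Format a duration so that at least 2 significant digits are shown."""
    if seconds <= 0:
        return "0.0"
    # Position of the leading digit: 0 for 1-9.99, -1 for 0.1-0.99, 1 for 10-99.9, ...
    exponent = math.floor(math.log10(seconds))
    # Durations below 10 need decimals to reach 2 significant digits.
    decimals = max(1 - exponent, 0)
    return f"{seconds:.{decimals}f}"


def run_jobs(jobs, worker, max_workers=4):
    """Run worker(job) for each job in a thread pool; return (results, elapsed seconds)."""
    start = default_timer()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in the same order as the input jobs.
        results = list(executor.map(worker, jobs))
    return results, default_timer() - start
```

For example, `results, elapsed = run_jobs(urls, fetch)` followed by `format_elapsed(elapsed)` would give a string suitable for a report footer, where urls and fetch stand in for whatever jobs and worker function are in use.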
Removed¶
The html2text filter’s lynx method is no longer supported; use html2text instead
Python 3.5 (obsoleted by 3.6 on December 23, 2016) is no longer supported
Fixed¶
The html2text filter’s html2text method defaults to unicode handling
HTML href links ending with spaces are no longer broken by xpath replacing spaces with %20
The initial config file no longer has its directives sorted alphabetically; they are saved in a logical order (e.g. enabled is always the first sub-directive)
Security¶
None
Documentation changes¶
Complete rewrite
Known bugs¶
An empty report will still be generated for a job when no reportable changes survive the additions_only or deletions_only report filters
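To make the behavior behind this known bug concrete, here is a rough sketch of what filtering a diff down to additions or deletions might look like, built on the standard library's difflib. It is not the webchanges implementation: the function filter_diff and its signature are assumptions made for this example.

```python
import difflib


def filter_diff(old: str, new: str, additions_only: bool = False,
                deletions_only: bool = False) -> list:
    """Return unified-diff lines, optionally keeping only added or only deleted lines."""
    diff = list(difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm=""))
    if additions_only:
        marker = "+"
    elif deletions_only:
        marker = "-"
    else:
        return diff
    # Skip the two file header lines ('---' and '+++'), then keep only lines
    # carrying the wanted marker; hunk headers ('@@') and context lines are dropped.
    return [line for line in diff[2:] if line.startswith(marker)]
```

With old content "a\nb\nc" and new content "a\nc", the only change is a deletion, so `filter_diff(old, new, additions_only=True)` returns an empty list. A caller that wanted to avoid the empty report described above could, in this sketch, skip reporting whenever the filtered list is empty.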