64 Commits

Author SHA1 Message Date
James Lu
400ffd7899 Wikifetch: fix quote_plus import 2017-09-07 19:20:21 -07:00
James Lu
6011742299 Wikifetch: remove Python 2 compatibility code 2017-09-01 18:09:19 -07:00
James Lu
08d8f48db5 Wikifetch: refactor text fetching, fix listing disambig results 2017-09-01 18:04:20 -07:00
James Lu
9986babd2e Wikifetch: strip inline notes in the form "text[note 1]" from IRC 2017-09-01 18:02:52 -07:00
James Lu
1dbcdb746d Wikifetch: declare encoding for ancient Python support 2017-06-03 18:40:24 -07:00
James Lu
71458857f9 Wikifetch: more tests for --site and foreign Wikipedia 2017-06-03 18:35:02 -07:00
James Lu
b6231f56ef Revert "Wikifetch: intelligently filter out <p> lines with little or no content"
This broke parsing for CJK languages (e.g. Chinese and Japanese), which don't use traditional spaces...

(but I should've known that)

This reverts commit 91cfa7acb0975fd5b5bab6e6f2c760781ccd84e2.
2017-06-03 18:30:25 -07:00
James Lu
b8e04f167e Wikifetch: add tests for non-English Wikipedia & articles with symbols in their title 2017-06-03 18:11:55 -07:00
James Lu
670b41950b Wikifetch: rm broken Commons test 2017-06-03 18:11:48 -07:00
James Lu
346f72d816 Wikifetch: fix lookup of articles with symbols (e.g. "/") in their title
The normalization for the special cases was previously ignored if the query matched a "/"; why was this added in the first place?
2017-06-03 18:09:20 -07:00
James Lu
b79ddf2f7e Wikifetch: update URL for commons.wikimedia.org test, as the old one has been removed 2017-06-03 17:47:06 -07:00
James Lu
092055d491 Wikifetch: fix Wikipedia parsing again
As of 2017-06-03, Wikipedia has put its text content under a new "mw-parser-output" div, while other sites (e.g. Wikia) still have it directly under "mw-content-text".
2017-06-03 17:46:30 -07:00
James Lu
11a03ad9a0 Bump version to 2017.05.31 2017-05-31 13:10:47 -07:00
James Lu
394158bea5 Wikifetch: fix Wikipedia test 2017-04-16 16:53:37 -07:00
James Lu
7611f0fa9c Wikifetch: fix 'random' help text syntax 2017-03-24 19:10:47 -07:00
James Lu
001b49b6c3 Wikifetch: prefer <link rel="canonical"> links again when available 2017-03-24 19:09:38 -07:00
James Lu
d5f498bfcb Wikifetch: switch to a different article for testMediaWiki 2017-03-18 23:52:38 -07:00
James Lu
91cfa7acb0 Wikifetch: intelligently filter out <p> lines with little or no content
More specifically, this skips lines that have a lower word count than the search query (e.g. page titles, some navigation links).
This allows some pages on https://wiki.ubuntu.com/ to work, for example
2017-03-18 19:07:56 -07:00
James Lu
819fcc6c09 Wikifetch: add a --no-mw-parsing option in an attempt to support non-MediaWiki sites 2017-03-18 18:47:20 -07:00
James Lu
194ac4d7be Wikifetch: clarify _get_article_tree docstring 2017-03-18 18:23:00 -07:00
James Lu
a9dfb1009d Wikifetch: add a three second timeout in fetch 2017-02-04 18:28:11 -08:00
James Lu
2fbfc37f98 Wikifetch: leave a fallback reply if paragraph parsing failed 2017-01-27 18:16:00 -08:00
James Lu
2bd06a39a9 Wikifetch: return the address in _get_article_tree as well 2017-01-27 18:10:52 -08:00
James Lu
18493a5e23 Wikifetch: revamp tests to be more complete
This now tests different combinations of --site, and tries to parse some other common wikis.
2017-01-27 18:01:13 -08:00
James Lu
100f503783 Wikifetch: support wikimedia.org and mediawiki.org 2017-01-27 18:00:48 -08:00
James Lu
8d586dad47 Wikifetch: fix NameError on redirect parsing 2017-01-27 17:38:37 -08:00
James Lu
5bf0bd6fd5 Wikifetch: bump copyright years 2017-01-27 17:32:20 -08:00
James Lu
9f1f04d25c Wikifetch: only show "possible results" in disambiguation pages if parsing succeeds 2017-01-27 17:25:43 -08:00
James Lu
d1eea2a0a4 Wikifetch: support disambiguation parsing on Wikia 2017-01-27 17:25:24 -08:00
James Lu
3ca87bb686 Wikifetch: abstract out article fetching, fix Wikia search support 2017-01-27 17:12:41 -08:00
James Lu
482a8dd1d9 Wikifetch: remove special case for articles about years
This isn't really relevant anymore, since most years on Wikipedia have an introductory paragraph describing them now.
2017-01-27 17:10:32 -08:00
James Lu
4d487dd0e0 Wikifetch: log the URL when fetching a link fails 2017-01-26 21:28:06 -08:00
James Lu
22a710649e Bump version to 2017.01.16+git 2017-01-16 21:23:59 -08:00
James Lu
f97b54d709 Bump version to 2017.01.16 2017-01-16 21:16:25 -08:00
James Lu
b6f0397665 Bump version to 2016.09.26+git 2016-09-26 11:08:14 -07:00
James Lu
23989f692e Bump version to 2016.09.26 2016-09-26 11:07:28 -07:00
James Lu
eb96578e91 Wikifetch: remove another broken test 2016-09-18 12:19:39 -07:00
James Lu
9db9f000ed Wikifetch: remove long-broken test 2016-09-18 11:16:46 -07:00
James Lu
7c3c90ee37 Bump version to 2016.07.03+git 2016-07-03 12:49:10 -07:00
James Lu
1e96e5fd80 Bump version to 2016.07.03 2016-07-03 12:49:04 -07:00
James Lu
274027c94a Bump version to 2016.05.15+git 2016-05-15 10:28:42 -07:00
James Lu
f2d9a3a3ff Bump version to 2016.05.15 2016-05-15 10:28:39 -07:00
James Lu
bb1ed2eda8 Bump version to 2016.02.28.1+git 2016-03-04 16:39:58 -08:00
James Lu
e73e47f211 Merge branch 'release/2016.02.28.1' 2016-03-04 15:30:26 -08:00
James Lu
eb4f2b31e9 Bump version to 2016.02.28.1 2016-03-04 15:28:14 -08:00
James Lu
66b3ef6d17 Wikifetch: make _wiki() return the fetched text instead of replying it directly 2016-02-28 18:20:52 -08:00
James Lu
d1e3cd9837 Update my email in various places 2015-11-26 16:27:57 -08:00
James Lu
2d0e90b2dc Wikifetch: Limit supybot.commands import to remove __builtins__.any hack 2015-11-14 16:19:11 -08:00
James Lu
4109741a01 Wikifetch: "Not found or page malformed" should be an error, not a reply 2015-11-01 11:14:32 -08:00
James Lu
facd42da12 Wikifetch: add test for 'random' 2015-10-25 17:44:17 -07:00