Rewrite the link re-writing logic
The previous logic parsed URLs with a regex first and then with
the markdown library. It then used urlopen().read() to validate
links.
We now use the markdown library only to extract the list of links,
and then urlparse to deconstruct, analyse, adapt and reconstruct
the link. We no longer attempt to fetch links, which means
that external links are not guaranteed to work.
Absolute URLs are not changed (they may be external).
Fragments are relative to the page and do not need changes.
Path-only links should point to a file synced to the website,
but sometimes the file may be missing (if it's not in the sync
configuration), so we follow this approach:
- prefix with base_path and check for the file locally
- if not found, prefix with base_url instead
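The rules above could be sketched roughly as follows. This is an
illustration only, not the code in this commit: the function name
rewrite_link and the local_root parameter are hypothetical, and
the local check is assumed to be a plain os.path.isfile lookup.

```python
import os
from urllib.parse import urlparse, urlunparse


def rewrite_link(link, base_path, base_url, local_root):
    """Hypothetical helper: apply the rewrite rules to one link."""
    parsed = urlparse(link)
    # Absolute URLs (scheme or network location present) are not changed.
    if parsed.scheme or parsed.netloc:
        return link
    # No path at all (e.g. a bare fragment): relative to the page, keep it.
    if not parsed.path:
        return link
    relative = parsed.path.lstrip("/")
    # Path-only link: prefix with base_path and check for the file locally.
    candidate = base_path.rstrip("/") + "/" + relative
    # local_root is an assumed directory where synced files are stored.
    if os.path.isfile(os.path.join(local_root, candidate.lstrip("/"))):
        return urlunparse(parsed._replace(path=candidate))
    # File not found locally: prefix with base_url instead.
    rest = urlunparse(parsed._replace(path=relative))
    return base_url.rstrip("/") + "/" + rest
```

Any query string or fragment on a path-only link is carried over
unchanged, because only the path component is replaced before the
link is put back together.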
Note that urlparse treats URLs without a scheme like path-only
URLs, so 'github.com' will be rewritten to base_url/github.com.
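The urlparse behaviour behind this note can be checked with the
standard library alone. The variable names bare and full are just
for illustration.

```python
from urllib.parse import urlparse

# Without a scheme, the whole string is parsed as a path.
bare = urlparse("github.com")
print(repr(bare.scheme), repr(bare.netloc), repr(bare.path))
# prints: '' '' 'github.com'

# With a scheme, the host is recognised as the network location.
full = urlparse("https://github.com")
print(repr(full.scheme), repr(full.netloc), repr(full.path))
# prints: 'https' 'github.com' ''
```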
Signed-off-by: Andrea Frittoli <andrea.frittoli@gmail.com>
Showing
sync/test-content/content.md
0 → 100644
sync/test-content/test.txt
0 → 100644