Main orchestration module for web page to markdown conversion.
This is the primary entry point for the mdfetch library. It combines all the
individual components (HTTP fetching, content extraction, markdown conversion)
into a single, easy-to-use pipeline.
The main function readURL handles the complete workflow:
1. Fetches HTML from the provided URL (with retries and a timeout)
2. Converts all relative URLs to absolute URLs
3. Extracts readable content using Mozilla Readability
4. Converts the content to Markdown using Turndown with GitHub Flavored Markdown
5. Returns the content in all three formats (HTML, text, Markdown) with metadata
Example
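The workflow described above can be sketched roughly as follows. This is a minimal illustration and not the library's actual implementation: the real signature and return type of readURL are not shown here, so `ReadResult`, `Stages`, `absolutizeUrls`, `withRetries`, and `readUrlSketch` are all hypothetical names. The fetch, extraction, and Markdown stages are injected so the sketch runs without network access or third-party packages. In practice they would presumably wrap an HTTP client with a timeout, Mozilla Readability, and Turndown. The regex-based URL rewriting is a simplification; a DOM parser would be more robust.

```typescript
// Hypothetical result shape: three content formats plus metadata.
interface ReadResult {
  html: string;
  text: string;
  markdown: string;
  metadata: { url: string; title?: string };
}

// Hypothetical pluggable stages, so the pipeline itself has no dependencies.
interface Stages {
  fetchHtml(url: string): Promise<string>; // expected to enforce its own timeout
  extract(html: string): { title?: string; html: string; text: string };
  toMarkdown(html: string): string;
}

// Resolve double-quoted href/src attribute values against the page URL.
// Simplified on purpose: single-quoted or unquoted attributes are left untouched.
function absolutizeUrls(html: string, baseUrl: string): string {
  return html.replace(
    /\b(href|src)="([^"]*)"/g,
    (match: string, attr: string, value: string) => {
      try {
        return `${attr}="${new URL(value, baseUrl).href}"`;
      } catch {
        return match; // leave values that cannot be resolved as they are
      }
    },
  );
}

// Run a task up to `attempts` times and rethrow the last error if all fail.
async function withRetries<T>(task: () => Promise<T>, attempts: number): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Orchestrates the steps: fetch, absolutize, extract, convert, return.
async function readUrlSketch(url: string, stages: Stages, attempts = 3): Promise<ReadResult> {
  const rawHtml = await withRetries(() => stages.fetchHtml(url), attempts);
  const absoluteHtml = absolutizeUrls(rawHtml, url);
  const article = stages.extract(absoluteHtml);
  const markdown = stages.toMarkdown(article.html);
  return {
    html: article.html,
    text: article.text,
    markdown,
    metadata: { url, title: article.title },
  };
}
```

Injecting the stages keeps the orchestration logic testable in isolation: a caller can pass stub stages in tests and the real fetcher, extractor, and converter in production.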