elm.web.html_pw.load_html_with_pw
- async load_html_with_pw(url, browser_semaphore=None, timeout=90000, use_scrapling_stealth=False, load_state='networkidle', **pw_launch_kwargs)
Extract HTML from URL using Playwright.
- Parameters:
url (str) – URL to pull HTML for.
browser_semaphore (asyncio.Semaphore, optional) – Semaphore instance that can be used to limit the number of Playwright browsers open concurrently. If None, no limits are applied. By default, None.
timeout (int, optional) – Maximum time to wait for the page to reach the requested load state, in milliseconds. Pass 0 to disable the timeout. By default, 90,000.
use_scrapling_stealth (bool, default=False) – Option to use scrapling stealth scripts instead of tf-playwright-stealth. By default, False.
load_state (str, default="networkidle") – The load state to wait for. One of:
- "load" - consider navigation to be finished when the load event is fired.
- "domcontentloaded" - consider navigation to be finished when the DOMContentLoaded event is fired.
- "networkidle" - consider navigation to be finished when there are no network connections for at least 500 ms.
By default, "networkidle".
**pw_launch_kwargs – Keyword-value argument pairs to pass to async_playwright.chromium.launch().
- Returns:
str – HTML from page.
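Example (a minimal sketch, not taken from the library's own documentation): sharing one asyncio.Semaphore across several calls caps how many browsers are open at once. The helper fetch_all and the class FakeLoader are hypothetical names introduced here for illustration. FakeLoader is a stand-in that mimics the documented signature so the sketch runs without launching a browser, and it assumes the real function holds the semaphore for the duration of each page load.

```python
import asyncio


async def fetch_all(urls, loader, max_browsers=2, **loader_kwargs):
    """Load many URLs concurrently, sharing one semaphore across all calls.

    `loader` is any coroutine function with the same signature as
    load_html_with_pw (hypothetical helper, not part of the library).
    """
    # Create the semaphore inside the running event loop.
    semaphore = asyncio.Semaphore(max_browsers)
    tasks = [
        loader(url, browser_semaphore=semaphore, **loader_kwargs)
        for url in urls
    ]
    # gather() returns results in the same order as `urls`.
    return await asyncio.gather(*tasks)


class FakeLoader:
    """Hypothetical stand-in for load_html_with_pw that records concurrency."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def __call__(self, url, browser_semaphore=None, timeout=90000,
                       load_state="networkidle", **pw_launch_kwargs):
        # Assumption: the real function also guards the browser this way.
        async with browser_semaphore:
            self.active += 1
            self.peak = max(self.peak, self.active)
            await asyncio.sleep(0.01)  # simulate the page load
            self.active -= 1
        return f"<html>{url}</html>"


# With the real function, the call would look like this (left as a
# comment because it needs Playwright and network access):
#
#   from elm.web.html_pw import load_html_with_pw
#   pages = asyncio.run(fetch_all(urls, load_html_with_pw, max_browsers=2,
#                                 timeout=60000,
#                                 load_state="domcontentloaded",
#                                 headless=True))
```

Extra keyword arguments such as headless=True are forwarded through **pw_launch_kwargs to async_playwright.chromium.launch(). Choosing load_state="domcontentloaded" typically returns sooner than "networkidle" but may miss content rendered by later network requests.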