elm.web.search.duckduckgo.PlaywrightDuckDuckGoLinkSearch

class PlaywrightDuckDuckGoLinkSearch(use_homepage=True, use_scrapling_stealth=False, **launch_kwargs)[source]

Bases: PlaywrightSearchEngineLinkSearch

Search for top links on the main DuckDuckGo search engine

Parameters:
  • use_homepage (bool, default=True) – If True, the browser will be navigated to the search engine homepage and the query will be input into the search bar. If False, the query will be embedded in the URL and the browser will navigate directly to the filled-out URL. By default, True.

  • use_scrapling_stealth (bool, default=False) – Option to use scrapling stealth scripts instead of tf-playwright-stealth. If set to True, the _SC class attribute will be ignored. By default, False.

  • **launch_kwargs – Keyword arguments to be passed to playwright.chromium.launch. For example, you can pass headless=False, slow_mo=50 for a visualization of the search.
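As a minimal sketch of how the constructor arguments above fit together, the helpers below build the launch keyword arguments and pass them through. The names launch_options and make_search are hypothetical, introduced only for illustration; the import path is taken from the module name at the top of this page, and the import is deferred so the snippet can be loaded without ELM or Playwright installed.

```python
def launch_options(debug=False):
    """Build keyword arguments forwarded to playwright.chromium.launch.

    Hypothetical helper, not part of ELM.
    """
    if debug:
        # Visible browser window, with each action slowed by 50 ms
        return {"headless": False, "slow_mo": 50}
    # No extra arguments: Playwright's own launch defaults apply
    return {}


def make_search(debug=False, use_homepage=True):
    """Create a DuckDuckGo link search instance (hypothetical helper)."""
    # Deferred import: requires ELM and Playwright at call time only
    from elm.web.search.duckduckgo import PlaywrightDuckDuckGoLinkSearch

    return PlaywrightDuckDuckGoLinkSearch(
        use_homepage=use_homepage, **launch_options(debug)
    )
```

Calling make_search(debug=True) should open a visible browser once a search runs, which can help when checking why a query returns no links.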

Methods

results(*queries[, num_results])

Retrieve links for the first num_results of each query

Attributes

MAX_RESULTS_CONSIDERED_PER_PAGE

Number of results displayed per DuckDuckGo page

PAGE_LOAD_TIMEOUT

Default page load timeout value in milliseconds

MAX_RESULTS_CONSIDERED_PER_PAGE = 10

Number of results displayed per DuckDuckGo page

PAGE_LOAD_TIMEOUT = 60000

Default page load timeout value in milliseconds

async results(*queries, num_results=10)

Retrieve links for the first num_results of each query

This function executes a search for each input query and returns a list of links corresponding to the top num_results.

Parameters:
  • *queries (str) – One or more queries to search for.

  • num_results (int, optional) – Maximum number of top results to retrieve for each query. This value cannot exceed the number of results per page (typically 10); a larger value is reduced to that limit. The search is not guaranteed to return this many results, since the actual count depends on how many results (excluding ads) appear on the page. You can, however, use this input to limit the number of results returned. By default, 10.

Returns:

list – List with one entry per input query, where each entry is itself a list containing no more than num_results links.
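The following sketch shows one way results might be awaited and its nested return value consumed. The helper names (effective_num_results, pair_with_queries, search) are hypothetical. The min() call only models the documented reduction of num_results to the per-page limit and is not the library's actual code. Pairing by position assumes the outer list follows the order of the input queries, which this page does not state explicitly.

```python
import asyncio


def effective_num_results(requested, per_page=10):
    """Model the documented cap: larger values are reduced to the page size."""
    return min(requested, per_page)


def pair_with_queries(queries, links_per_query):
    """Map each query to its list of links, assuming matching order."""
    return dict(zip(queries, links_per_query))


async def search(*queries, num_results=5):
    """Run one search per query and return a {query: links} mapping."""
    # Deferred import: requires ELM and Playwright at call time only
    from elm.web.search.duckduckgo import PlaywrightDuckDuckGoLinkSearch

    engine = PlaywrightDuckDuckGoLinkSearch()
    # results is a coroutine, so it must be awaited
    links_per_query = await engine.results(*queries, num_results=num_results)
    return pair_with_queries(queries, links_per_query)


# Example invocation (needs network access and a Playwright browser install):
# found = asyncio.run(search("wind energy ordinance", "solar setback rules"))
# for query, links in found.items():
#     print(query, len(links))
```

Because each inner list may hold fewer than num_results links, code that consumes the output should check lengths rather than index fixed positions.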