elm.web.google_search.google_results_as_docs
- async google_results_as_docs(queries, num_urls=None, browser_semaphore=None, task_name=None, **file_loader_kwargs)
Retrieve top N google search results as document instances.
- Parameters:
queries (collection of str) – Collection of strings representing google queries. Documents for the top num_urls google search results (from all of these queries _combined_) will be returned from this function.
num_urls (int, optional) – Number of unique top Google search results to return as docs. The google search results from all queries are interleaved and the top num_urls unique URLs are downloaded as docs. If this number is less than len(queries), some of your queries may not contribute to the final output. By default, None, which sets num_urls = 3 * len(queries).
browser_semaphore (asyncio.Semaphore, optional) – Semaphore instance that can be used to limit the number of playwright browsers open concurrently. If None, no limits are applied. By default, None.
task_name (str, optional) – Optional task name to use in asyncio.create_task(). By default, None.
**file_loader_kwargs – Keyword-argument pairs to initialize elm.web.file_loader.AsyncFileLoader with. If found, the “pw_launch_kwargs” key in these will also be used to initialize the elm.web.google_search.PlaywrightGoogleLinkSearch used for the google URL search. By default, None.
- Returns:
list of elm.web.document.BaseDocument – List of documents representing the top num_urls results from the google searches across all queries.
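A call to this coroutine might look like the sketch below. It uses only the parameters documented above. The wrapper name fetch_docs, its max_browsers argument, the search hook, and the task name string are hypothetical and exist only for illustration. The hook lets a stand-in coroutine replace the real function so the sketch can be exercised without a network connection or a browser.

```python
import asyncio


async def fetch_docs(queries, num_urls=None, max_browsers=2, search=None):
    """Return documents for the top search results across all queries.

    ``max_browsers`` and ``search`` are hypothetical names used only in
    this sketch; ``search`` lets a stand-in coroutine replace the real
    function when no network or browser is available.
    """
    if search is None:
        # Imported lazily so the sketch loads even without ELM installed.
        from elm.web.google_search import google_results_as_docs as search

    # Cap the number of playwright browsers open concurrently.
    semaphore = asyncio.Semaphore(max_browsers)

    # num_urls=None keeps the documented default of 3 * len(queries).
    return await search(
        queries,
        num_urls=num_urls,
        browser_semaphore=semaphore,
        task_name="google-search-sketch",
    )
```

Assuming ELM and a playwright browser are installed, asyncio.run(fetch_docs(["first query", "second query"])) should return a list of documents drawn from both queries combined. With the default num_urls, that would be up to six documents for two queries.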