Yes, I did. The script runs for a while (the time varies), then stops with the error mentioned above.
Are you using the latest code, and have you run npm install to update the dependencies, particularly puppeteer?
I’m asking because this was a known issue in the library.
I ran a pull yesterday and then followed up with npm install. Happy to try it again if you think it will help.
We updated the puppeteer plugins today. Do you mind trying that?
Sure, just so I know I’m doing it correctly would you mind posting the commands you want me to enter and I’ll copy and paste them in?
I’ve run the following:
PS C:\Users\Neocleous\Desktop\streetmerchant> git pull origin main
remote: Enumerating objects: 71, done.
remote: Counting objects: 100% (71/71), done.
remote: Compressing objects: 100% (36/36), done.
remote: Total 71 (delta 49), reused 37 (delta 34), pack-reused 0
Unpacking objects: 100% (71/71), 134.11 KiB | 213.00 KiB/s, done.
From https://github.com/jef/streetmerchant
* branch main -> FETCH_HEAD
d792116..2d01cfd main -> origin/main
error: Your local changes to the following files would be overwritten by merge:
package-lock.json
Please commit your changes or stash them before you merge.
Aborting
Updating d0ebffd..2d01cfd
PS C:\Users\Neocleous\Desktop\streetmerchant> npm install
up to date, audited 1360 packages in 4s
6 packages are looking for funding
run `npm fund` for details
found 0 vulnerabilities
PS C:\Users\Neocleous\Desktop\streetmerchant> npm fund
streetmerchant
+— https://github.com/sponsors/isaacs
| -- [email protected], [email protected], [email protected], [email protected]
— [email protected]
+-- https://github.com/sponsors/sindresorhus
|
-- https://github.com/sponsors/ljharb
— [email protected]
To give an update, I have tried pulling it down again and reinstalling it, I am getting the same issue.
Please commit your changes or stash them before you merge.
Aborting
Updating d0ebffd..2d01cfd
It looks like it didn’t update.
To give an update, I have tried pulling it down again and reinstalling it, I am getting the same issue.
This would definitely update.
It’s hard for me to reproduce. Can you post your .env and remove any secrets?
Please see my .env file below. When I copied it in, I noticed that I don’t have quotation marks ("") around the EMAIL_PASSWORD or EMAIL_TO fields. I’m not sure whether that will cause an issue, but I ran the notifications test and it worked.
Thanks for all the help
# * All configuration variables are optional *
# Read https://github.com/jef/streetmerchant#customization for help on customizing this file
#
ASCII_BANNER=
ASCII_COLOR=
AUTO_ADD_TO_CART=
BROWSER_TRUSTED=
COUNTRY=
DESKTOP_NOTIFICATIONS=
DISCORD_NOTIFY_GROUP=
DISCORD_WEB_HOOK=
EMAIL_PASSWORD="***"
EMAIL_TO=***
EMAIL_USERNAME=*******
HEADLESS=
IN_STOCK_WAIT_TIME=
LOG_LEVEL=
LOW_BANDWIDTH=
MAX_PRICE_SERIES_3070=
MAX_PRICE_SERIES_3080=
MAX_PRICE_SERIES_3090=
MAX_PRICE_SERIES_RYZEN5600=
MAX_PRICE_SERIES_RYZEN5800=
MAX_PRICE_SERIES_RYZEN5900=
MAX_PRICE_SERIES_RYZEN5950=
MICROCENTER_LOCATION=
MQTT_BROKER_ADDRESS=
MQTT_BROKER_PORT=
MQTT_CLIENT_ID=
MQTT_PASSWORD=
MQTT_QOS=
MQTT_TOPIC=
MQTT_USERNAME=
NVIDIA_ADD_TO_CART_ATTEMPTS=
NVIDIA_SESSION_TTL=
OPEN_BROWSER=
PAGE_BACKOFF_MIN=
PAGE_BACKOFF_MAX=
PAGE_SLEEP_MIN=
PAGE_SLEEP_MAX=
PAGE_TIMEOUT=
PAGERDUTY_INTEGRATION_KEY=
PAGERDUTY_SEVERITY=
PHILIPS_HUE_API_KEY=
PHILIPS_HUE_CLOUD_ACCESS_TOKEN=
PHILIPS_HUE_CLOUD_CLIENT_ID=
PHILIPS_HUE_CLOUD_CLIENT_SECRET=
PHILIPS_HUE_CLOUD_REFRESH_TOKEN=
PHILIPS_HUE_LAN_BRIDGE_IP=
PHILIPS_HUE_LIGHT_COLOR=
PHILIPS_HUE_LIGHT_IDS=
PHILIPS_HUE_LIGHT_PATTERN=
PHONE_CARRIER=
PHONE_NUMBER=
PLAY_SOUND=
PROXY_ADDRESS=
PROXY_PORT=
PUSHBULLET=
PUSHOVER_TOKEN=
PUSHOVER_USER=
PUSHOVER_PRIORITY=
SCREENSHOT=
SHOW_ONLY_BRANDS="asus,nvidia,amd"
SHOW_ONLY_MODELS=
SHOW_ONLY_SERIES="3080,ryzen5900"
SLACK_CHANNEL=
SLACK_TOKEN=
SMTP_ADDRESS=
SMTP_PORT=
STORES="amazon-uk,overclockers,scan,very,nvidia,novatech,game,currys,aria"
TELEGRAM_ACCESS_TOKEN=
TELEGRAM_CHAT_ID=
TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=
TWILIO_FROM_NUMBER=
TWILIO_TO_NUMBER=
TWITCH_ACCESS_TOKEN=
TWITCH_CHANNEL=
TWITCH_CLIENT_ID=
TWITCH_CLIENT_SECRET=
TWITCH_REFRESH_TOKEN=
TWITTER_ACCESS_TOKEN_KEY=
TWITTER_ACCESS_TOKEN_SECRET=
TWITTER_CONSUMER_KEY=
TWITTER_CONSUMER_SECRET=
TWITTER_TWEET_TAGS=
USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"
WEB_PORT=
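On the question of whether missing quotation marks matter: dotenv-style loaders usually strip one pair of matching quotes around a value, so a quoted and an unquoted value end up identical when the value has no spaces or special characters. The Python sketch below illustrates that behaviour only. `parse_env_line` is a hypothetical helper, not code from streetmerchant or from the dotenv package, and the real loader may treat edge cases differently.

```python
from typing import Optional, Tuple


def parse_env_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse one KEY=VALUE line; return None for blank lines and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None
    key, _, value = stripped.partition("=")
    value = value.strip()
    # Drop one pair of matching surrounding quotes, if present.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ('"', "'"):
        value = value[1:-1]
    return key.strip(), value


quoted = parse_env_line('SHOW_ONLY_SERIES="3080,ryzen5900"')
unquoted = parse_env_line("SHOW_ONLY_SERIES=3080,ryzen5900")
print(quoted == unquoted)  # True: both give ('SHOW_ONLY_SERIES', '3080,ryzen5900')
```

Under that assumption, the unquoted EMAIL_TO value would be read the same way as a quoted one, which is consistent with the notifications test passing.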
This is the error I am seeing, in this screenshot it ran for less than 10 mins and on other occasions it can run for a few hours. It seems to be the same error each time.
I’m also having this issue. I am not sure if this is a solid workaround, but it hasn’t crashed on me since making this edit.
streetmerchant/node_modules/@cliqz/adblocker-puppeteer/dist/cjs/adblocker.js
async disable() {
  if (this.blocker.config.loadNetworkFilters === true) {
    this.page.removeListener('request', this.onRequest);
    await this.page.setRequestInterception(true);
I am having this same issue. Is there any way to get this fixed? It usually runs completely through once or twice, then it crashes with this same error.
So it seems like it’s a problem with the adblocker dependency. Adding this.page.removeListener('request', this.onRequest); to the adblocker code fixed my problem as well. I’m not sure what is happening since I’m not an expert on Puppeteer. I’m guessing that a promise doesn’t resolve, so when request interception is done later on in the code base it errors out. A ticket should probably be filed.
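One plausible reading of why removing the listener helps, sketched as a toy model in Python: if a request handler stays registered after interception has been switched off, its next call to abort a request fails with the error seen in this thread. `FakePage`, `FakeRequest` and `blocker` are hypothetical stand-ins, not Puppeteer or adblocker code, and this is not a confirmed diagnosis.

```python
class FakePage:
    """Toy stand-in for a browser page: listeners plus an interception flag."""

    def __init__(self):
        self.interception_enabled = False
        self.listeners = []

    def on_request(self, handler):
        self.listeners.append(handler)

    def remove_listener(self, handler):
        if handler in self.listeners:
            self.listeners.remove(handler)

    def emit_request(self, url):
        for handler in list(self.listeners):
            handler(FakeRequest(self, url))


class FakeRequest:
    def __init__(self, page, url):
        self.page = page
        self.url = url

    def abort(self):
        # Mimics the check behind the error message reported above.
        if not self.page.interception_enabled:
            raise RuntimeError("Request Interception is not enabled!")


def blocker(request):
    """Hypothetical ad-blocking handler that aborts every request it sees."""
    request.abort()


def run(remove_listener_on_disable):
    page = FakePage()
    page.interception_enabled = True
    page.on_request(blocker)
    # "Disable" step: interception goes off, the listener may or may not stay.
    page.interception_enabled = False
    if remove_listener_on_disable:
        page.remove_listener(blocker)
    try:
        page.emit_request("https://example.com/ad.js")
        return "ok"
    except RuntimeError as exc:
        return str(exc)


print(run(remove_listener_on_disable=False))  # Request Interception is not enabled!
print(run(remove_listener_on_disable=True))   # ok
```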
This may have nothing to do with it; I am not sure. But if I connect to a different server on my VPN, I do not experience this error. I am super green, so I have no other thoughts on how to fix it. It does work, but not on all servers: it only runs indefinitely on specific ones on the VPN.
Perhaps let’s make a report to that repository. I can downgrade if necessary.
I have no clue how to do that. Sorry, I am very new to git. The VPN thing definitely works though. There are certain nodes where I can predict it will come up with this error.
That’s ok! It’s fixed now, so you should be good to go. Thank you!
Prerequisites
helpers: {
  Puppeteer: {
    url: process.env.CODECEPT_URL || 'http://seller-ui.sc.stg.s.o3.ru',
    show: false,
    browser: 'chrome',
    windowSize: '1920x1080',
    waitForAction: 0,
    waitForNavigation: 'load',
    restart: false,
    keepBrowserState: true,
    typeInterval: '-1',
    chrome: {
      args: ['--no-sandbox', '--headless', '--window-size=1920,1080', '--disable-web-security'],
    },
    capabilities: {
      chromeOptions: {
        w3c: false,
      },
    },
  },
  MockRequest: {
    mode: 'passthrough',
    logging: false,
    recordIfMissing: false,
    recordFailedRequests: false,
  },
},
Description
When I run tests that use MockRequest, I get very large logs that contain everything: my request, the response, and all images in the response. I don’t need this; I want small logs without it.
Also, after each request I get this error in the logs:
UnhandledPromiseRejectionWarning: Error: Request Interception is not enabled!
Error Message & Stack Trace
id: '1390c110deecf8652c41310288b789bc',
order: 6,
timestamp: '2020-03-17T07:08:26.713Z',
responseTime: 203,
[Symbol()]: URL {
slashes: true,
protocol: 'http:',
hash: '',
query: {},
pathname: '/api/site/mailbox/topic/list',
auth: '',
host: 'seller-ui.sc.stg.s.o3.ru',
port: '',
hostname: 'seller-ui.sc.stg.s.o3.ru',
password: '',
username: '',
origin: 'http://seller-ui.sc.stg.s.o3.ru',
href: 'http://seller-ui.sc.stg.s.o3.ru/api/site/mailbox/topic/list'
},
[Symbol()]: <ref *1> Polly {
logger: Logger {
polly: [Circular *1],
groupName: 'Test',
_middleware: [Handler [Map]]
},
server: Server {
[Symbol()]: '',
[Symbol()]: [Object],
[Symbol()]: [],
[Symbol()]: [Array]
},
config: {
mode: 'passthrough',
adapters: [Array],
adapterOptions: [Object],
persister: null,
persisterOptions: [Object],
logging: false,
recordIfMissing: false,
recordFailedRequests: false,
expiresIn: null,
expiryStrategy: 'warn',
timing: [Function (anonymous)],
matchRequestsBy: [Object]
},
container: Container { _registry: [Map] },
adapters: Map { _c: [Map] },
persister: null,
_requests: [
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[PollyRequest], [PollyRequest], [PollyRequest],
[Circular *2]
],
[Symbol()]: 'Test',
[Symbol()]: 'Test_805092869'
},
[Symbol()]: EventEmitter {
[Symbol()]: Map { _c: [Map] },
[Symbol()]: Set { _c: [Set] }
},
[Symbol()]: Route {
params: {},
queryParams: {},
handlers: [],
middleware: [ [Route] ],
[Symbol()]: [ [Object] ]
}
}
Response: PollyResponse {
headers: {
date: 'Tue, 17 Mar 2020 07:08:26 GMT',
'content-encoding': 'gzip',
server: 'nginx/1.17.7',
vary: 'Origin, Accept-Encoding, Origin',
'content-type': 'application/json',
'access-control-allow-origin': 'http://seller-ui.sc.stg.s.o3.ru',
'access-control-expose-headers': 'X-O3-Trace-Id',
'transfer-encoding': 'chunked',
connection: 'keep-alive',
'access-control-allow-credentials': 'true',
'x-o3-trace-id': '6e1dda4a844febbf, 6e1dda4a844febbf'
},
statusCode: 200,
body: bigbigbig body....
(node:51520) UnhandledPromiseRejectionWarning: Error: Request Interception is not enabled!
at assert (/Users/aelzhanova/Documents/Projects/seller-ui/__tests__/codeceptjsTests/node_modules/puppeteer/lib/helper.js:283:11)
at Request.abort (/Users/aelzhanova/Documents/Projects/seller-ui/__tests__/codeceptjsTests/node_modules/puppeteer/lib/NetworkManager.js:494:5)
at Request.<anonymous> (/Users/aelzhanova/Documents/Projects/seller-ui/__tests__/codeceptjsTests/node_modules/puppeteer/lib/helper.js:112:23)
at PuppeteerAdapter.respondToRequest (/Users/aelzhanova/Documents/Projects/seller-ui/__tests__/codeceptjsTests/node_modules/@pollyjs/adapter-puppeteer/dist/cjs/pollyjs-adapter-puppeteer.js:2397:21)
at PuppeteerAdapter.onRequestFailed (/Users/aelzhanova/Documents/Projects/seller-ui/__tests__/codeceptjsTests/node_modules/@pollyjs/adapter/dist/cjs/pollyjs-adapter.js:1394:18)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
at async PuppeteerAdapter.handleRequest (/Users/aelzhanova/Documents/Projects/seller-ui/__tests__/codeceptjsTests/node_modules/@pollyjs/adapter/dist/cjs/pollyjs-adapter.js:1167:7)
(node:51520) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 95)
Environment
Tell us which operating system you are using, as well as which versions of Node.js and npm/yarn. If applicable, include the browser and the corresponding version.
Run the following to get it quickly:
node -e "var os=require('os');console.log('Node.js ' + process.version + '\n' + os.platform() + ' ' + os.release())"
npm --version
yarn --version
Node.js v13.8.0
darwin 18.7.0
6.13.7
1.22.0
"dependencies": {
"@pollyjs/adapter-puppeteer": "^4.0.2",
"chai": "^4.2.0",
"codeceptjs": "^2.5.0",
"codeceptjs-resemblehelper": "^1.9.0",
"eslint-config-google": "^0.14.0",
"mocha": "^7.0.1",
"node-fetch": "^2.6.0",
"puppeteer": "^2.1.1"
},
"devDependencies": {
"@pollyjs/core": "^2.5.0",
"@codeceptjs/configure": "^0.4.1",
"eslint": "^6.8.0",
"mochawesome": "^4.1.0"
},
"engines": {
"node": ">=8.0.0"
}
}
Whenever the page sends a request, the following events are emitted by puppeteer’s page:
- Request: emitted when the request is issued by the page.
- Response: emitted when/if the response is received for the request.
- RequestFinished: emitted when the response body is downloaded and the request is complete.
If the request fails at some point, then instead of the RequestFinished event (and possibly instead of the Response event), the RequestFailed event is emitted.
If the request gets a ‘redirect’ response, the request is successfully finished with the RequestFinished event, and a new request is issued to the redirected URL.
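The event ordering described above can be summarised in a few lines of Python. This is only a sketch of the documented sequence; `request_events` is a hypothetical helper and does not call PuppeteerSharp.

```python
from typing import List


def request_events(outcome: str, response_received: bool = True) -> List[str]:
    """Return the page events for one request, in order.

    outcome is "finished" or "failed"; response_received only matters for
    failed requests, which may fail before or after a response arrives.
    """
    events = ["Request"]
    if outcome == "finished":
        events += ["Response", "RequestFinished"]
    elif outcome == "failed":
        if response_received:
            events.append("Response")
        events.append("RequestFailed")
    else:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return events


print(request_events("finished"))
print(request_events("failed", response_received=False))
# A redirect is a finished request followed by a new request to the new URL.
print(request_events("finished") + request_events("finished"))
```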
Inheritance
System.Object
Request
Implements
Namespace: PuppeteerSharp
Assembly: PuppeteerSharp.dll
Syntax
public class Request : object, IRequest
Properties
Failure
Gets the failure.
Declaration
public string Failure { get; }
Property Value
Type | Description |
---|---|
System.String | The failure. |
Frame
Gets the frame.
Declaration
public IFrame Frame { get; }
Property Value
Type | Description |
---|---|
IFrame | The frame. |
Headers
Gets the HTTP headers.
Declaration
public Dictionary<string, string> Headers { get; }
Property Value
Type | Description |
---|---|
Dictionary<System.String, System.String> | HTTP headers. |
InterceptionId
Gets the interception identifier.
Declaration
public string InterceptionId { get; }
Property Value
Type | Description |
---|---|
System.String | The interception identifier. |
IsNavigationRequest
Gets whether this request is driving the frame’s navigation.
Declaration
public bool IsNavigationRequest { get; }
Property Value
Type | Description |
---|---|
System.Boolean | |
Method
Gets the HTTP method.
Declaration
public HttpMethod Method { get; }
Property Value
Type | Description |
---|---|
HttpMethod | HTTP method. |
PostData
Gets the post data.
Declaration
public object PostData { get; }
Property Value
Type | Description |
---|---|
System.Object | The post data. |
RedirectChain
A redirectChain is a chain of requests initiated to fetch a resource.
If there are no redirects and the request was successful, the chain will be empty.
If a server responds with at least a single redirect, then the chain will contain all the requests that were redirected.
redirectChain is shared between all the requests of the same chain.
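The sharing described above can be pictured with a small Python sketch. `SketchRequest` and `follow` are hypothetical names used for illustration, not PuppeteerSharp types; the sketch assumes that each redirected request is appended to one list that every request in the chain references.

```python
class SketchRequest:
    def __init__(self, url, redirect_chain):
        self.url = url
        # The same list object is handed to every request of one chain.
        self.redirect_chain = redirect_chain


def follow(urls):
    """Build the requests for `urls`, where every URL but the last redirected."""
    chain = []
    request = SketchRequest(urls[0], chain)
    for next_url in urls[1:]:
        chain.append(request)  # the redirected request joins the chain
        request = SketchRequest(next_url, chain)
    return request


final = follow(["http://a.example", "http://b.example", "http://c.example"])
print([r.url for r in final.redirect_chain])  # ['http://a.example', 'http://b.example']
print(final.redirect_chain[0].redirect_chain is final.redirect_chain)  # True
print(follow(["http://a.example"]).redirect_chain)  # []
```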
Declaration
public IRequest[] RedirectChain { get; }
Property Value
Type | Description |
---|---|
IRequest[] | The redirect chain. |
RequestId
Gets the request identifier.
Declaration
public string RequestId { get; }
Property Value
Type | Description |
---|---|
System.String | The request identifier. |
ResourceType
Gets the type of the resource.
Declaration
public ResourceType ResourceType { get; }
Property Value
Type | Description |
---|---|
ResourceType | The type of the resource. |
Response
Declaration
public Response Response { get; }
Property Value
Type | Description |
---|---|
Response | |
Url
Gets the URL.
Declaration
public string Url { get; }
Property Value
Type | Description |
---|---|
System.String | The URL. |
Methods
AbortAsync(RequestAbortErrorCode)
Aborts request. To use this, request interception should be enabled with SetRequestInterceptionAsync(Boolean).
Exception is immediately thrown if the request interception is not enabled.
Declaration
public async Task AbortAsync(RequestAbortErrorCode errorCode = RequestAbortErrorCode.Failed)
Parameters
Type | Name | Description |
---|---|---|
RequestAbortErrorCode | errorCode | Optional error code. Defaults to Failed. |
Returns
Type | Description |
---|---|
Task | Task |
ContinueAsync(Payload)
Continues request with optional request overrides. To use this, request interception should be enabled with SetRequestInterceptionAsync(Boolean). Exception is immediately thrown if the request interception is not enabled.
If the URL is set it won’t perform a redirect. The request will be silently forwarded to the new url. For example, the address bar will show the original url.
Declaration
public async Task ContinueAsync(Payload overrides = null)
Parameters
Type | Name | Description |
---|---|---|
Payload | overrides |
Returns
Type | Description |
---|---|
Task | Task |
RespondAsync(ResponseData)
Fulfills request with given response. To use this, request interception should be enabled with SetRequestInterceptionAsync(Boolean). Exception is thrown if request interception is not enabled.
Declaration
public async Task RespondAsync(ResponseData response)
Parameters
Type | Name | Description |
---|---|---|
ResponseData | response | Response that will fulfill this request. |
Returns
Type | Description |
---|---|
Task | Task |
Explicit Interface Implementations
IRequest.Response
Response attached to the request.
Declaration
IResponse IRequest.Response { get; }
Returns
Type | Description |
---|---|
IResponse | The response. |
Implements
Commands
pyppeteer-install: Download and install chromium for pyppeteer.
Environment Variables
- $PYPPETEER_HOME: Specify the directory to be used by pyppeteer. Pyppeteer uses this directory for extracting downloaded Chromium, and for making temporary user data directory. Default location depends on platform:
  - Windows: C:\Users\<username>\AppData\Local\pyppeteer
  - OS X: /Users/<username>/Library/Application Support/pyppeteer
  - Linux: /home/<username>/.local/share/pyppeteer, or $XDG_DATA_HOME/pyppeteer if $XDG_DATA_HOME is defined.
  For details, see appdirs’s user_data_dir.
- $PYPPETEER_DOWNLOAD_HOST: Overwrite host part of URL that is used to download Chromium. Defaults to https://storage.googleapis.com.
- $PYPPETEER_CHROMIUM_REVISION: Specify a certain version of chromium you’d like pyppeteer to use. Default value can be checked by pyppeteer.__chromium_revision__.
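The platform-dependent default for $PYPPETEER_HOME can be approximated with the standard library alone. The sketch below follows the locations listed above; `default_pyppeteer_home` is a hypothetical helper, and pyppeteer itself delegates to appdirs, so the real result may differ in detail.

```python
import os
import sys
from pathlib import Path


def default_pyppeteer_home(platform=None, env=None, home=None):
    """Approximate the directory pyppeteer would use (illustrative only)."""
    platform = platform or sys.platform
    env = os.environ if env is None else env
    home = Path(home) if home else Path.home()
    # An explicit $PYPPETEER_HOME always wins.
    if env.get("PYPPETEER_HOME"):
        return Path(env["PYPPETEER_HOME"])
    if platform.startswith("win"):
        return home / "AppData" / "Local" / "pyppeteer"
    if platform == "darwin":
        return home / "Library" / "Application Support" / "pyppeteer"
    # Linux and other Unix-likes: honour $XDG_DATA_HOME when it is defined.
    xdg = env.get("XDG_DATA_HOME")
    base = Path(xdg) if xdg else home / ".local" / "share"
    return base / "pyppeteer"


print(default_pyppeteer_home())
```

Passing `platform`, `env` and `home` explicitly makes it easy to check what another machine would resolve to without touching the real environment.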
Launcher

pyppeteer.launcher.launch(options: dict = None, **kwargs) → pyppeteer.browser.Browser
Start chrome process and return Browser.
This function is a shortcut to Launcher(options, **kwargs).launch().
Available options are:
- ignoreHTTPSErrors (bool): Whether to ignore HTTPS errors. Defaults to False.
- headless (bool): Whether to run browser in headless mode. Defaults to True unless the appMode or devtools option is True.
- executablePath (str): Path to a Chromium or Chrome executable to run instead of default bundled Chromium.
- slowMo (int|float): Slow down pyppeteer operations by the specified amount of milliseconds.
- args (List[str]): Additional arguments (flags) to pass to the browser process.
- ignoreDefaultArgs (bool): Do not use pyppeteer’s default args. This is a dangerous option; use with care.
- handleSIGINT (bool): Close the browser process on Ctrl+C. Defaults to True.
- handleSIGTERM (bool): Close the browser process on SIGTERM. Defaults to True.
- handleSIGHUP (bool): Close the browser process on SIGHUP. Defaults to True.
- dumpio (bool): Whether to pipe the browser process stdout and stderr into process.stdout and process.stderr. Defaults to False.
- userDataDir (str): Path to a user data directory.
- env (dict): Specify environment variables that will be visible to the browser. Defaults to same as python process.
- devtools (bool): Whether to auto-open a DevTools panel for each tab. If this option is True, the headless option will be set to False.
- logLevel (int|str): Log level to print logs. Defaults to same as the root logger.
- autoClose (bool): Automatically close browser process when script completed. Defaults to True.
- loop (asyncio.AbstractEventLoop): Event loop (experimental).
- appMode (bool): Deprecated.
Note
Pyppeteer can also be used to control the Chrome browser, but it works best with the version of Chromium it is bundled with. There is no guarantee it will work with any other version. Use the executablePath option with extreme caution.
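A minimal sketch of passing some of these options to launch(), assuming pyppeteer is installed. The option values are illustrative rather than recommended settings, and `main` is a hypothetical entry point. The import is placed inside the coroutine so the snippet can be loaded on a machine without pyppeteer.

```python
import asyncio

# Option names come from the list above; the values are only examples.
LAUNCH_OPTIONS = {
    "headless": True,
    "args": ["--no-sandbox"],
    "handleSIGINT": True,
    "autoClose": True,
}


async def main() -> str:
    # Imported lazily so that defining main() does not require pyppeteer.
    from pyppeteer import launch

    browser = await launch(LAUNCH_OPTIONS)
    try:
        return await browser.version()
    finally:
        await browser.close()


# To run it (pyppeteer may download Chromium on first use):
# print(asyncio.run(main()))
```

Closing the browser in a finally block is a conservative choice here; with autoClose left at its default the process should also be cleaned up when the script ends.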

pyppeteer.launcher.connect(options: dict = None, **kwargs) → pyppeteer.browser.Browser
Connect to the existing chrome.
The browserWSEndpoint option is necessary to connect to the chrome. The format is ws://${host}:${port}/devtools/browser/<id>. This value can be obtained from wsEndpoint.
Available options are:
- browserWSEndpoint (str): A browser websocket endpoint to connect to. (required)
- ignoreHTTPSErrors (bool): Whether to ignore HTTPS errors. Defaults to False.
- slowMo (int|float): Slow down pyppeteer’s operations by the specified amount of milliseconds.
- logLevel (int|str): Log level to print logs. Defaults to same as the root logger.
- loop (asyncio.AbstractEventLoop): Event loop (experimental).

pyppeteer.launcher.executablePath() → str
Get executable path of default chrome.
Browser Class

class pyppeteer.browser.Browser(connection: pyppeteer.connection.Connection, contextIds: List[str], ignoreHTTPSErrors: bool, setDefaultViewport: bool, process: Optional[subprocess.Popen] = None, closeCallback: Callable[[], Awaitable[None]] = None, **kwargs)
Bases: pyee.EventEmitter
Browser class.
A Browser object is created when pyppeteer connects to chrome, either through launch() or connect().

browserContexts
Return a list of all open browser contexts.
In a newly created browser, this will return a single instance of [BrowserContext].

coroutine close() → None
Close connections and terminate browser process.

coroutine createIncogniteBrowserContext() → pyppeteer.browser.BrowserContext
[Deprecated] Misspelled method.
Use the createIncognitoBrowserContext() method instead.

coroutine createIncognitoBrowserContext() → pyppeteer.browser.BrowserContext
Create a new incognito browser context.
This won’t share cookies/cache with other browser contexts.
    browser = await launch()
    # Create a new incognito browser context.
    context = await browser.createIncognitoBrowserContext()
    # Create a new page in a pristine context.
    page = await context.newPage()
    # Do stuff
    await page.goto('https://example.com')
    ...

coroutine disconnect() → None
Disconnect browser.

coroutine newPage() → pyppeteer.page.Page
Make new page on this browser and return its object.

coroutine pages() → List[pyppeteer.page.Page]
Get all pages of this browser.
Non-visible pages, such as "background_page", will not be listed here. You can find them using pyppeteer.target.Target.page().

process
Return process of this browser.
If the browser instance was created by pyppeteer.launcher.connect(), return None.

targets() → List[pyppeteer.target.Target]
Get a list of all active targets inside the browser.
In case of multiple browser contexts, the method will return a list with all the targets in all browser contexts.

coroutine userAgent() → str
Return browser’s original user agent.
Note
Pages can override browser user agent with pyppeteer.page.Page.setUserAgent().

coroutine version() → str
Get version of the browser.

wsEndpoint
Return websocket end point url.
BrowserContext Class

class pyppeteer.browser.BrowserContext(browser: pyppeteer.browser.Browser, contextId: Optional[str])
Bases: pyee.EventEmitter
BrowserContext provides multiple independent browser sessions.
When a browser is launched, it has a single BrowserContext used by default. The method browser.newPage() creates a page in the default browser context.
If a page opens another page, e.g. with a window.open call, the popup will belong to the parent page’s browser context.
Pyppeteer allows creation of “incognito” browser contexts with the browser.createIncognitoBrowserContext() method. “incognito” browser contexts don’t write any browser data to disk.
    # Create new incognito browser context
    context = await browser.createIncognitoBrowserContext()
    # Create a new page inside context
    page = await context.newPage()
    # ... do stuff with page ...
    await page.goto('https://example.com')
    # Dispose context once it's no longer needed
    await context.close()

browser
Return the browser this browser context belongs to.

coroutine close() → None
Close the browser context.
All the targets that belong to the browser context will be closed.
Note
Only incognito browser context can be closed.

isIncognite() → bool
[Deprecated] Misspelled method.
Use the isIncognito() method instead.

isIncognito() → bool
Return whether BrowserContext is incognito.
The default browser context is the only non-incognito browser context.
Note
The default browser context cannot be closed.

coroutine newPage() → pyppeteer.page.Page
Create a new page in the browser context.

targets() → List[pyppeteer.target.Target]
Return a list of all active targets inside the browser context.
Page Class

class pyppeteer.page.Page(client: pyppeteer.connection.CDPSession, target: Target, frameTree: Dict[KT, VT], ignoreHTTPSErrors: bool, screenshotTaskQueue: list = None)
Bases: pyee.EventEmitter
Page class.
This class provides methods to interact with a single tab of chrome. One Browser object might have multiple Page objects.
The Page class emits various Events which can be handled by using the on or once method, which is inherited from pyee’s EventEmitter class.

Events = namespace(Close='close', Console='console', DOMContentLoaded='domcontentloaded', Dialog='dialog', Error='error', FrameAttached='frameattached', FrameDetached='framedetached', FrameNavigated='framenavigated', Load='load', Metrics='metrics', PageError='pageerror', Request='request', RequestFailed='requestfailed', RequestFinished='requestfinished', Response='response', WorkerCreated='workercreated', WorkerDestroyed='workerdestroyed')
Available events.

coroutine J(selector: str) → Optional[pyppeteer.element_handle.ElementHandle]
alias to querySelector()

coroutine JJ(selector: str) → List[pyppeteer.element_handle.ElementHandle]
alias to querySelectorAll()

coroutine JJeval(selector: str, pageFunction: str, *args) → Any
alias to querySelectorAllEval()

coroutine Jeval(selector: str, pageFunction: str, *args) → Any
alias to querySelectorEval()

coroutine Jx(expression: str) → List[pyppeteer.element_handle.ElementHandle]
alias to xpath()

coroutine addScriptTag(options: Dict[KT, VT] = None, **kwargs) → pyppeteer.element_handle.ElementHandle
Add script tag to this page.
One of the url, path or content options is necessary.
- url (string): URL of a script to add.
- path (string): Path to the local JavaScript file to add.
- content (string): JavaScript string to add.
- type (string): Script type. Use module in order to load a JavaScript ES6 module.
Return ElementHandle: ElementHandle of added tag.

coroutine addStyleTag(options: Dict[KT, VT] = None, **kwargs) → pyppeteer.element_handle.ElementHandle
Add style or link tag to this page.
One of the url, path or content options is necessary.
- url (string): URL of the link tag to add.
- path (string): Path to the local CSS file to add.
- content (string): CSS string to add.
Return ElementHandle: ElementHandle of added tag.

coroutine authenticate(credentials: Dict[str, str]) → Any
Provide credentials for http authentication.
credentials should be None or a dict which has username and password fields.

coroutine bringToFront() → None
Bring page to front (activate tab).

browser
Get the browser the page belongs to.

coroutine click(selector: str, options: dict = None, **kwargs) → None
Click element which matches selector.
This method fetches an element with selector, scrolls it into view if needed, and then uses mouse to click in the center of the element. If there’s no element matching selector, the method raises PageError.
Available options are:
- button (str): left, right, or middle, defaults to left.
- clickCount (int): defaults to 1.
- delay (int|float): Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
Note
If this method triggers a navigation event and there’s a separate waitForNavigation(), you may end up with a race condition that yields unexpected results. The correct pattern for click and wait for navigation is the following:
    await asyncio.gather(
        page.waitForNavigation(waitOptions),
        page.click(selector, clickOptions),
    )

coroutine close(options: Dict[KT, VT] = None, **kwargs) → None
Close this page.
Available options:
- runBeforeUnload (bool): Defaults to False. Whether to run the before unload page handlers.
By default, close() does not run beforeunload handlers.
Note
If runBeforeUnload is passed as True, a beforeunload dialog might be summoned and should be handled manually via page’s dialog event.

coroutine content() → str
Get the full HTML contents of the page.
Returns HTML including the doctype.

coroutine cookies(*urls) → dict
Get cookies.
If no URLs are specified, this method returns cookies for the current page URL. If URLs are specified, only cookies for those URLs are returned.
Returned cookies are a list of dictionaries which contain these fields:
- name (str)
- value (str)
- url (str)
- domain (str)
- path (str)
- expires (number): Unix time in seconds
- httpOnly (bool)
- secure (bool)
- session (bool)
- sameSite (str): 'Strict' or 'Lax'

coverage
Return Coverage.

coroutine deleteCookie(*cookies) → None
Delete cookie.
cookies should be dictionaries which contain these fields:
- name (str): required
- url (str)
- domain (str)
- path (str)
- secure (bool)

coroutine emulate(options: dict = None, **kwargs) → None
Emulate given device metrics and user agent.
This method is a shortcut for calling two methods:
- setUserAgent()
- setViewport()
options is a dictionary containing these fields:
- viewport (dict)
  - width (int): page width in pixels.
  - height (int): page height in pixels.
  - deviceScaleFactor (float): Specify device scale factor (can be thought of as dpr). Defaults to 1.
  - isMobile (bool): Whether the meta viewport tag is taken into account. Defaults to False.
  - hasTouch (bool): Specifies if viewport supports touch events. Defaults to False.
  - isLandscape (bool): Specifies if viewport is in landscape mode. Defaults to False.
- userAgent (str): user agent string.

coroutine emulateMedia(mediaType: str = None) → None
Emulate css media type of the page.
Parameters:
- mediaType (str): Changes the CSS media type of the page. The only allowed values are 'screen', 'print', and None. Passing None disables media emulation.
-
coroutine
evaluate
(pageFunction: str, *args, force_expr: bool = False) → Any[source]¶ -
Execute js-function or js-expression on browser and get result.
Parameters: - pageFunction (str) – String of js-function/expression to be executed
on the browser. - force_expr (bool) – If True, evaluate
pageFunction
as expression.
If False (default), try to automatically detect
function or expression.
note:
force_expr
option is a keyword only argument. - pageFunction (str) – String of js-function/expression to be executed
-
coroutine
evaluateHandle
(pageFunction: str, *args) → pyppeteer.execution_context.JSHandle[source]¶ -
Execute function on this page.
Difference between
evaluate()
and
evaluateHandle()
is that
evaluateHandle
returns JSHandle object (not value).Parameters: pageFunction (str) – JavaScript function to be executed.
-
coroutine
evaluateOnNewDocument
(pageFunction: str, *args) → None[source]¶ -
Add a JavaScript function to the document.
This function would be invoked in one of the following scenarios:
- whenever the page is navigated
- whenever the child frame is attached or navigated. In this case, the
function is invoked in the context of the newly attached frame.
-
coroutine
exposeFunction
(name: str, pyppeteerFunction: Callable[[…], Any]) → None[source]¶ -
Add python function to the browser’s
window
object asname
.Registered function can be called from chrome process.
Parameters: - name (string) – Name of the function on the window object.
- pyppeteerFunction (Callable) – Function which will be called on
python process. This function should
not be asynchronous function.
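A small sketch of exposing a synchronous Python function, under the assumption that the exposed function is reached from the page as `window.<name>` and returns a promise there. The names `sha256_hex` and `pySha256` are hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Plain synchronous function, as required for exposed functions."""
    return hashlib.sha256(text.encode('utf-8')).hexdigest()


async def expose_hash(page):
    """Register sha256_hex on the window object and call it from the page."""
    await page.exposeFunction('pySha256', sha256_hex)
    # Assumption: the call is awaited inside the page before returning.
    return await page.evaluate('async () => await window.pySha256("abc")')
```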
-
coroutine
focus
(selector: str) → None[source]¶ -
Focus the element which matches
selector
.If no element matched the
selector
, raisePageError
.
-
frames
¶ -
Get all frames of this page.
-
coroutine
goBack
(options: dict = None, **kwargs) → Optional[pyppeteer.network_manager.Response][source]¶ -
Navigate to the previous page in history.
Available options are same as
goto()
method.If cannot go back, return
None
.
-
coroutine
goForward
(options: dict = None, **kwargs) → Optional[pyppeteer.network_manager.Response][source]¶ -
Navigate to the next page in history.
Available options are same as
goto()
method.If cannot go forward, return
None
.
-
coroutine
goto
(url: str, options: dict = None, **kwargs) → Optional[pyppeteer.network_manager.Response][source]¶ -
Go to the url.
Parameters: url (string) – URL to navigate page to. The url should include scheme, e.g. https://.
Available options are:
- timeout (int): Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the setDefaultNavigationTimeout() method.
- waitUntil (str|List[str]): When to consider navigation succeeded, defaults to load. Given a list of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
  - load: when the load event is fired.
  - domcontentloaded: when the DOMContentLoaded event is fired.
  - networkidle0: when there are no more than 0 network connections for at least 500 ms.
  - networkidle2: when there are no more than 2 network connections for at least 500 ms.
Page.goto will raise errors if:
- there’s an SSL error (e.g. in case of self-signed certificates)
- target URL is invalid
- the timeout is exceeded during navigation
- the main resource failed to load
Note
goto() either raises an error or returns the main resource response. The only exceptions are navigation to about:blank or navigation to the same URL with a different hash, which would succeed and return None.
Note
Headless mode doesn’t support navigation to a PDF document.
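The options above can be combined as in the following sketch. The helper name and the 10-second timeout are arbitrary choices for illustration, and the returned object is assumed to expose a `status` attribute.

```python
async def open_url(page, url, timeout_ms=10000):
    """Navigate and return the HTTP status, or None when goto() returns None."""
    options = {
        'timeout': timeout_ms,
        # Navigation counts as done once both events have fired.
        'waitUntil': ['load', 'networkidle2'],
    }
    response = await page.goto(url, options)
    # about:blank and same-URL hash navigation succeed but yield no response.
    if response is None:
        return None
    return response.status
```

Errors raised by the navigation are deliberately not caught here, so the caller decides how to handle a timeout or a failed load.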
-
coroutine
hover
(selector: str) → None[source]¶ -
Mouse hover the element which matches
selector
.If no element matched the
selector
, raisePageError
.
-
coroutine
injectFile
(filePath: str) → str[source]¶ -
[Deprecated] Inject file to this page.
This method is deprecated. Use
addScriptTag()
instead.
-
isClosed
() → bool[source]¶ -
Indicate that the page has been closed.
-
keyboard
¶ -
Get
Keyboard
object.
-
mainFrame
¶ -
Get main
Frame
of this page.
-
coroutine
metrics
() → Dict[str, Any][source]¶ -
Get metrics.
Returns dictionary containing metrics as key/value pairs:
- Timestamp (number): The timestamp when the metrics sample was taken.
- Documents (int): Number of documents in the page.
- Frames (int): Number of frames in the page.
- JSEventListeners (int): Number of event listeners in the page.
- Nodes (int): Number of DOM nodes in the page.
- LayoutCount (int): Total number of full or partial page layouts.
- RecalcStyleCount (int): Total number of page style recalculations.
- LayoutDuration (int): Combined duration of all page layouts.
- RecalcStyleDuration (int): Combined duration of all page style recalculations.
- ScriptDuration (int): Combined duration of JavaScript execution.
- TaskDuration (int): Combined duration of all tasks performed by the browser.
- JSHeapUsedSize (float): Used JavaScript heap size.
- JSHeapTotalSize (float): Total JavaScript heap size.
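As one possible use of the returned dictionary, the sketch below derives a heap usage ratio from the two heap fields. The helper names are hypothetical.

```python
def heap_usage(metrics: dict) -> float:
    """Fraction of the JavaScript heap in use, from a metrics dictionary."""
    total = metrics['JSHeapTotalSize']
    return metrics['JSHeapUsedSize'] / total if total else 0.0


async def sample_heap_usage(page) -> float:
    """Fetch fresh metrics from the page and reduce them to one number."""
    return heap_usage(await page.metrics())
```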
-
mouse
¶ -
Get
Mouse
object.
-
coroutine
pdf
(options: dict = None, **kwargs) → bytes[source]¶ -
Generate a pdf of the page.
Options:
- path (str): The file path to save the PDF.
- scale (float): Scale of the webpage rendering, defaults to 1.
- displayHeaderFooter (bool): Display header and footer. Defaults to False.
- headerTemplate (str): HTML template for the print header. Should be valid HTML markup with following classes.
  - date: formatted print date
  - title: document title
  - url: document location
  - pageNumber: current page number
  - totalPages: total pages in the document
- footerTemplate (str): HTML template for the print footer. Should use the same template as headerTemplate.
- printBackground (bool): Print background graphics. Defaults to False.
- landscape (bool): Paper orientation. Defaults to False.
- pageRanges (string): Paper ranges to print, e.g., '1-5,8,11-13'. Defaults to empty string, which means all pages.
- format (str): Paper format. If set, takes priority over width or height. Defaults to Letter.
- width (str): Paper width, accepts values labeled with units.
- height (str): Paper height, accepts values labeled with units.
- margin (dict): Paper margins, defaults to None.
  - top (str): Top margin, accepts values labeled with units.
  - right (str): Right margin, accepts values labeled with units.
  - bottom (str): Bottom margin, accepts values labeled with units.
  - left (str): Left margin, accepts values labeled with units.
Returns: Return generated PDF bytes object.
Note
Generating a pdf is currently only supported in headless mode.
pdf() generates a pdf of the page with print css media. To generate a pdf with screen media, call page.emulateMedia('screen') before calling pdf().
Note
By default, pdf() generates a pdf with modified colors for printing. Use the -webkit-print-color-adjust property to force rendering of exact colors.
    await page.emulateMedia('screen')
    await page.pdf({'path': 'page.pdf'})
The width, height, and margin options accept values labeled with units. Unlabeled values are treated as pixels.
A few examples:
- page.pdf({'width': 100}): prints with width set to 100 pixels.
- page.pdf({'width': '100px'}): prints with width set to 100 pixels.
- page.pdf({'width': '10cm'}): prints with width set to 10 centimeters.
All available units are:
- px: pixel
- in: inch
- cm: centimeter
- mm: millimeter
The format options are:
- Letter: 8.5in x 11in
- Legal: 8.5in x 14in
- Tabloid: 11in x 17in
- Ledger: 17in x 11in
- A0: 33.1in x 46.8in
- A1: 23.4in x 33.1in
- A2: 16.5in x 23.4in
- A3: 11.7in x 16.5in
- A4: 8.27in x 11.7in
- A5: 5.83in x 8.27in
- A6: 4.13in x 5.83in
Note
headerTemplate and footerTemplate markup have the following limitations:
- Script tags inside templates are not evaluated.
- Page styles are not visible inside templates.
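The options described for pdf() can be assembled as in this sketch. The helper names and the 1cm default margin are assumptions chosen for illustration.

```python
def a4_pdf_options(path: str, margin: str = '1cm') -> dict:
    """Build a pdf() options dictionary for an A4 page with equal margins."""
    return {
        'path': path,
        'format': 'A4',           # takes priority over width/height
        'printBackground': True,
        'margin': {
            'top': margin, 'right': margin,
            'bottom': margin, 'left': margin,
        },
    }


async def save_screen_pdf(page, path: str) -> bytes:
    """Render with screen media instead of the default print media."""
    await page.emulateMedia('screen')
    return await page.pdf(a4_pdf_options(path))
```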
-
coroutine
plainText
() → str[source]¶ -
[Deprecated] Get page content as plain text.
-
coroutine
queryObjects
(prototypeHandle: pyppeteer.execution_context.JSHandle) → pyppeteer.execution_context.JSHandle[source]¶ -
Iterate js heap and finds all the objects with the handle.
Parameters: prototypeHandle (JSHandle) – JSHandle of prototype object.
-
coroutine
querySelector
(selector: str) → Optional[pyppeteer.element_handle.ElementHandle][source]¶ -
Get an Element which matches
selector
.Parameters: selector (str) – A selector to search element. Return Optional[ElementHandle]: If element which matches the
selector
is found, return its
ElementHandle
. If not found,
returnsNone
.
-
coroutine
querySelectorAll
(selector: str) → List[pyppeteer.element_handle.ElementHandle][source]¶ -
Get all element which matches
selector
as a list.Parameters: selector (str) – A selector to search element. Return List[ElementHandle]: List of
ElementHandle
which matches the
selector
. If no element is matched to theselector
, return
empty list.
-
coroutine
querySelectorAllEval
(selector: str, pageFunction: str, *args) → Any[source]¶ -
Execute function with all elements which matches
selector
.Parameters: - selector (str) – A selector to query page for.
- pageFunction (str) – String of JavaScript function to be evaluated on
browser. This function takes Array of the
matched elements as the first argument. - args (Any) – Arguments to pass to
pageFunction
.
-
coroutine
querySelectorEval
(selector: str, pageFunction: str, *args) → Any[source]¶ -
Execute function with an element which matches
selector
.Parameters: - selector (str) – A selector to query page for.
- pageFunction (str) – String of JavaScript function to be evaluated on
browser. This function takes an element which
matches the selector as a first argument. - args (Any) – Arguments to pass to
pageFunction
.
This method raises error if no element matched the
selector
.
-
coroutine
reload
(options: dict = None, **kwargs) → Optional[pyppeteer.network_manager.Response][source]¶ -
Reload this page.
Available options are same as
goto()
method.
-
coroutine
screenshot
(options: dict = None, **kwargs) → Union[bytes, str][source]¶ -
Take a screen shot.
The following options are available:
- path (str): The file path to save the image to. The screenshot type will be inferred from the file extension.
- type (str): Specify screenshot type, can be either jpeg or png. Defaults to png.
- quality (int): The quality of the image, between 0-100. Not applicable to png image.
- fullPage (bool): When true, take a screenshot of the full scrollable page. Defaults to False.
- clip (dict): An object which specifies clipping region of the page. This option should have the following fields:
  - x (int): x-coordinate of top-left corner of clip area.
  - y (int): y-coordinate of top-left corner of clip area.
  - width (int): width of clipping area.
  - height (int): height of clipping area.
- omitBackground (bool): Hide default white background and allow capturing screenshot with transparency.
- encoding (str): The encoding of the image, can be either 'base64' or 'binary'. Defaults to 'binary'.
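A brief sketch of two option combinations from the list above: a clipped region written to a file, and a full-page capture returned as base64 text. The helper names are hypothetical.

```python
def clip_options(path: str, x: int, y: int, width: int, height: int) -> dict:
    """Options for capturing only a rectangular region of the page."""
    return {
        'path': path,
        'clip': {'x': x, 'y': y, 'width': width, 'height': height},
    }


async def capture_region(page, path, x, y, width, height):
    """Save a clipped screenshot; the type is inferred from the extension."""
    return await page.screenshot(clip_options(path, x, y, width, height))


async def capture_full_page_base64(page):
    """Return the whole scrollable page as a base64-encoded string."""
    return await page.screenshot({'fullPage': True, 'encoding': 'base64'})
```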
-
coroutine
select
(selector: str, *values) → List[str][source]¶ -
Select options and return selected values.
If no element matched the
selector
, raiseElementHandleError
.
-
coroutine
setBypassCSP
(enabled: bool) → None[source]¶ -
Toggles bypassing page’s Content-Security-Policy.
Note
CSP bypassing happens at the moment of CSP initialization rather than evaluation. Usually this means that page.setBypassCSP should be called before navigating to the domain.
-
coroutine
setCacheEnabled
(enabled: bool = True) → None[source]¶ -
Enable/Disable cache for each request.
By default, caching is enabled.
-
coroutine
setContent
(html: str) → None[source]¶ -
Set content to this page.
Parameters: html (str) – HTML markup to assign to the page.
-
coroutine
setCookie
(*cookies) → None[source]¶ -
Set cookies.
cookies should be dictionaries which contain these fields:
- name (str): required
- value (str): required
- url (str)
- domain (str)
- path (str)
- expires (number): Unix time in seconds
- httpOnly (bool)
- secure (bool)
- sameSite (str): 'Strict' or 'Lax'
-
setDefaultNavigationTimeout
(timeout: int) → None[source]¶ -
Change the default maximum navigation timeout.
This method changes the default timeout of 30 seconds for the following
methods:goto()
goBack()
goForward()
reload()
waitForNavigation()
Parameters: timeout (int) – Maximum navigation time in milliseconds. Pass 0
to disable timeout.
-
coroutine
setExtraHTTPHeaders
(headers: Dict[str, str]) → None
Set extra HTTP headers.
The extra HTTP headers will be sent with every request the page initiates.
Note
page.setExtraHTTPHeaders does not guarantee the order of headers in the outgoing requests.
Parameters: headers (Dict) – A dictionary containing additional HTTP headers to be sent with every request. All header values must be strings.
-
coroutine
setJavaScriptEnabled
(enabled: bool) → None[source]¶ -
Set JavaScript enable/disable.
-
coroutine
setOfflineMode
(enabled: bool) → None[source]¶ -
Set offline mode enable/disable.
-
coroutine
setRequestInterception
(value: bool) → None[source]¶ -
Enable/disable request interception.
Activating request interception enables the Request class’s abort(), continue_(), and response() methods. This provides the capability to modify network requests that are made by a page.
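One way interception might be used is to drop heavy resources, as sketched below. The set of blocked types is an arbitrary choice, the helper names are hypothetical, and the request object is assumed to expose a lowercase `resourceType` attribute alongside `abort()` and `continue_()`.

```python
import asyncio

# Assumed choice of resource types to drop; adjust to taste.
BLOCKED_TYPES = {'image', 'media', 'font'}


async def filter_request(request):
    """Abort requests for blocked resource types, let the rest through."""
    if request.resourceType in BLOCKED_TYPES:
        await request.abort()
    else:
        await request.continue_()


async def block_heavy_resources(page):
    """Turn on interception and route every request through the filter."""
    await page.setRequestInterception(True)
    # Event handlers are synchronous, so the coroutine is scheduled instead.
    page.on('request', lambda req: asyncio.ensure_future(filter_request(req)))
```

Every intercepted request has to be either aborted or continued, otherwise it stays pending.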
-
coroutine
setUserAgent
(userAgent: str) → None[source]¶ -
Set user agent to use in this page.
Parameters: userAgent (str) – Specific user agent to use in this page
-
coroutine
setViewport
(viewport: dict) → None[source]¶ -
Set viewport.
Available options are:
- width (int): page width in pixels.
- height (int): page height in pixels.
- deviceScaleFactor (float): Defaults to 1.0.
- isMobile (bool): Defaults to False.
- hasTouch (bool): Defaults to False.
- isLandscape (bool): Defaults to False.
-
coroutine
tap
(selector: str) → None[source]¶ -
Tap the element which matches the
selector
.Parameters: selector (str) – A selector to search element to touch.
-
target
¶ -
Return a target this page created from.
-
coroutine
title
() → str[source]¶ -
Get page’s title.
-
touchscreen
¶ -
Get
Touchscreen
object.
-
tracing
¶ -
Get tracing object.
-
coroutine
type
(selector: str, text: str, options: dict = None, **kwargs) → None[source]¶ -
Type
text
on the element which matchesselector
.If no element matched the
selector
, raisePageError
.Details see
pyppeteer.input.Keyboard.type()
.
-
url
¶ -
Get URL of this page.
-
viewport
¶ -
Get viewport as a dictionary.
Fields of returned dictionary is same as
setViewport()
.
-
waitFor
(selectorOrFunctionOrTimeout: Union[str, int, float], options: dict = None, *args, **kwargs) → Awaitable[T_co][source]¶ -
Wait for function, timeout, or element which matches on page.
This method behaves differently with respect to the first argument:
- If selectorOrFunctionOrTimeout is a number (int or float), then it is treated as a timeout in milliseconds and this returns a future which will be done after the timeout.
- If selectorOrFunctionOrTimeout is a string of JavaScript function, this method is a shortcut to waitForFunction().
- If selectorOrFunctionOrTimeout is a selector string or xpath string, this method is a shortcut to waitForSelector() or waitForXPath(). If the string starts with //, the string is treated as xpath.
Pyppeteer tries to automatically detect function or selector, but sometimes misdetects. If it does not work as you expect, use waitForFunction() or waitForSelector() directly.
Parameters:
- selectorOrFunctionOrTimeout – A selector, xpath, or function string, or timeout (milliseconds).
- args (Any) – Arguments to pass the function.
Returns: Return awaitable object which resolves to a JSHandle of the success value.
Available options: see waitForFunction() or waitForSelector()
-
waitForFunction
(pageFunction: str, options: dict = None, *args, **kwargs) → Awaitable[T_co][source]¶ -
Wait until the function completes and returns a truthy value.
Parameters: args (Any) – Arguments to pass to pageFunction.
Returns: Return awaitable object which resolves when the pageFunction returns a truthy value. It resolves to a JSHandle of the truthy value.
This method accepts the following options:
- polling (str|number): An interval at which the pageFunction is executed, defaults to raf. If polling is a number, then it is treated as an interval in milliseconds at which the function would be executed. If polling is a string, then it can be one of the following values:
  - raf: to constantly execute pageFunction in requestAnimationFrame callback. This is the tightest polling mode which is suitable to observe styling changes.
  - mutation: to execute pageFunction on every DOM mutation.
- timeout (int|float): maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
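A short sketch of passing both options and an argument, following the signature above in which extra positional arguments come after the options dictionary. The helper name and the chosen interval and timeout are illustrative.

```python
async def wait_for_narrow_window(page, max_width=600):
    """Poll every 250 ms until the window is narrower than max_width."""
    options = {'polling': 250, 'timeout': 10000}
    # max_width is forwarded to the JavaScript function as `limit`.
    return await page.waitForFunction(
        '(limit) => window.innerWidth < limit', options, max_width
    )
```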
-
coroutine
waitForNavigation
(options: dict = None, **kwargs) → Optional[pyppeteer.network_manager.Response][source]¶ -
Wait for navigation.
Available options are same as goto() method.
This returns Response when the page navigates to a new URL or reloads. It is useful for when you run code which will indirectly cause the page to navigate. In case of navigation to a different anchor or navigation due to History API usage, the navigation will return None.
Consider this example:
    navigationPromise = asyncio.ensure_future(page.waitForNavigation())
    await page.click('a.my-link')  # indirectly cause a navigation
    await navigationPromise  # wait until navigation finishes
or,
    # gather() is used because asyncio.wait() rejects bare coroutines on Python 3.11+
    await asyncio.gather(
        page.click('a.my-link'),
        page.waitForNavigation(),
    )
Note
Usage of the History API to change the URL is considered a navigation.
-
coroutine
waitForRequest
(urlOrPredicate: Union[str, Callable[[pyppeteer.network_manager.Request], bool]], options: Dict[KT, VT] = None, **kwargs) → pyppeteer.network_manager.Request[source]¶ -
Wait for request.
Parameters: urlOrPredicate – A URL or function to wait for.
This method accepts below options:
- timeout (int|float): Maximum wait time in milliseconds, defaults to 30 seconds, pass 0 to disable the timeout.
Example:
    firstRequest = await page.waitForRequest('http://example.com/resource')
    finalRequest = await page.waitForRequest(lambda req: req.url == 'http://example.com' and req.method == 'GET')
    return firstRequest.url
-
coroutine
waitForResponse
(urlOrPredicate: Union[str, Callable[[pyppeteer.network_manager.Response], bool]], options: Dict[KT, VT] = None, **kwargs) → pyppeteer.network_manager.Response[source]¶ -
Wait for response.
Parameters: urlOrPredicate – A URL or function to wait for.
This method accepts below options:
- timeout (int|float): Maximum wait time in milliseconds, defaults to 30 seconds, pass 0 to disable the timeout.
Example:
    firstResponse = await page.waitForResponse('http://example.com/resource')
    finalResponse = await page.waitForResponse(lambda res: res.url == 'http://example.com' and res.status == 200)
    return finalResponse.ok
-
waitForSelector
(selector: str, options: dict = None, **kwargs) → Awaitable[T_co][source]¶ -
Wait until element which matches selector appears on page.
Wait for the selector to appear in page. If at the moment of calling the method the selector already exists, the method will return immediately. If the selector doesn’t appear after the timeout milliseconds of waiting, the function will raise error.
Parameters: selector (str) – A selector of an element to wait for.
Returns: Return awaitable object which resolves when element specified by selector string is added to DOM.
This method accepts the following options:
- visible (bool): Wait for element to be present in DOM and to be visible; i.e. to not have display: none or visibility: hidden CSS properties. Defaults to False.
- hidden (bool): Wait for element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties. Defaults to False.
- timeout (int|float): Maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
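A common pattern built on these options is to wait for visibility before interacting, sketched below. The helper name and the 5-second default are assumptions, and `page.click()` is the Page method referenced elsewhere in this document.

```python
async def click_when_visible(page, selector, timeout_ms=5000):
    """Wait until the element is present and visible, then click it."""
    await page.waitForSelector(
        selector, {'visible': True, 'timeout': timeout_ms}
    )
    await page.click(selector)
```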
-
waitForXPath
(xpath: str, options: dict = None, **kwargs) → Awaitable[T_co][source]¶ -
Wait until element which matches xpath appears on page.
Wait for the xpath to appear in page. If at the moment of calling the method the xpath already exists, the method will return immediately. If the xpath doesn’t appear after timeout milliseconds of waiting, the function will raise exception.
Parameters: xpath (str) – An XPath of an element to wait for.
Returns: Return awaitable object which resolves when element specified by xpath string is added to DOM.
Available options are:
- visible (bool): wait for element to be present in DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties. Defaults to False.
- hidden (bool): wait for element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties. Defaults to False.
- timeout (int|float): maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
-
workers
¶ -
Get all workers of this page.
-
coroutine
xpath
(expression: str) → List[pyppeteer.element_handle.ElementHandle][source]¶ -
Evaluate the XPath expression.
If there are no such elements in this page, return an empty list.
Parameters: expression (str) – XPath string to be evaluated.
-
Worker Class¶
-
class
pyppeteer.worker.
Worker
(client: CDPSession, url: str, consoleAPICalled: Callable[[str, List[pyppeteer.execution_context.JSHandle]], None], exceptionThrown: Callable[[Dict[KT, VT]], None])[source]¶ -
Bases:
pyee.EventEmitter
The Worker class represents a WebWorker.
The events workercreated and workerdestroyed are emitted on the page object to signal the worker lifecycle.
    page.on('workercreated', lambda worker: print('Worker created:', worker.url))
-
coroutine
evaluate
(pageFunction: str, *args) → Any[source]¶ -
Evaluate
pageFunction
withargs
.Shortcut for
(await worker.executionContext).evaluate(pageFunction, *args)
.
-
coroutine
evaluateHandle
(pageFunction: str, *args) → pyppeteer.execution_context.JSHandle[source]¶ -
Evaluate
pageFunction
withargs
and returnJSHandle
.Shortcut for
(await worker.executionContext).evaluateHandle(pageFunction, *args)
.
-
coroutine
executionContext
() → pyppeteer.execution_context.ExecutionContext[source]¶ -
Return ExecutionContext.
-
url
¶ -
Return URL.
-
Keyboard Class¶
-
class
pyppeteer.input.
Keyboard
(client: pyppeteer.connection.CDPSession)[source]¶ -
Bases:
object
Keyboard class provides an API for managing a virtual keyboard.
The high-level API is type(), which takes raw characters and generates proper keydown, keypress/input, and keyup events on your page.
For finer control, you can use down(), up(), and sendCharacter() to manually fire events as if they were generated from a real keyboard.
An example of holding down Shift in order to select and delete some text:
    await page.keyboard.type('Hello, World!')
    await page.keyboard.press('ArrowLeft')
    await page.keyboard.down('Shift')
    for i in ' World':
        await page.keyboard.press('ArrowLeft')
    await page.keyboard.up('Shift')
    await page.keyboard.press('Backspace')
    # Result text will end up saying 'Hello!'.
An example of pressing A:
    await page.keyboard.down('Shift')
    await page.keyboard.press('KeyA')
    await page.keyboard.up('Shift')
-
coroutine
down
(key: str, options: dict = None, **kwargs) → None[source]¶ -
Dispatch a keydown event with key.
If key is a single character and no modifier keys besides Shift are being held down, a keypress/input event will also be generated. The text option can be specified to force an input event to be generated.
If key is a modifier key, like Shift, Meta, or Alt, subsequent key presses will be sent with that modifier active. To release the modifier key, use up() method.
Parameters:
- key (str) – Name of key to press, such as ArrowLeft.
- options (dict) – Option can have text field, and if this option is specified, generate an input event with this text.
Note
Modifier keys DO influence down(). Holding down Shift will type the text in upper case.
-
coroutine
press
(key: str, options: Dict[KT, VT] = None, **kwargs) → None[source]¶ -
Press key.
If key is a single character and no modifier keys besides Shift are being held down, a keypress/input event will also be generated. The text option can be specified to force an input event to be generated.
Parameters: key (str) – Name of key to press, such as ArrowLeft.
This method accepts the following options:
- text (str): If specified, generates an input event with this text.
- delay (int|float): Time to wait between keydown and keyup. Defaults to 0.
Note
Modifier keys DO affect press(). Holding down Shift will type the text in upper case.
-
coroutine
sendCharacter
(char: str) → None[source]¶ -
Send character into the page.
This method dispatches a keypress and input event. This does not send a keydown or keyup event.
Note
Modifier keys DO NOT affect sendCharacter(). Holding down Shift will not type the text in upper case.
-
coroutine
type
(text: str, options: Dict[KT, VT] = None, **kwargs) → None[source]¶ -
Type characters into a focused element.
This method sends keydown, keypress/input, and keyup events for each character in the text.
To press a special key, like Control or ArrowDown, use press() method.
Parameters:
- text (str) – Text to type into a focused element.
- options (dict) – Options can have delay (int|float) field, which specifies time to wait between key presses in milliseconds. Defaults to 0.
Note
Modifier keys DO NOT affect type(). Holding down Shift will not type the text in upper case.
-
coroutine
up
(key: str) → None[source]¶ -
Dispatch a
keyup
event of thekey
.Parameters: key (str) – Name of key to release, such as ArrowLeft
.
-
Mouse Class¶
-
class
pyppeteer.input.
Mouse
(client: pyppeteer.connection.CDPSession, keyboard: pyppeteer.input.Keyboard)[source]¶ -
Bases:
object
Mouse class.
-
coroutine
click
(x: float, y: float, options: dict = None, **kwargs) → None[source]¶ -
Click button at (x, y).
Shortcut to move(), down(), and up().
This method accepts the following options:
- button (str): left, right, or middle, defaults to left.
- clickCount (int): defaults to 1.
- delay (int|float): Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
-
coroutine
down
(options: dict = None, **kwargs) → None[source]¶ -
Press down button (dispatches mousedown event).
This method accepts the following options:
- button (str): left, right, or middle, defaults to left.
- clickCount (int): defaults to 1.
-
coroutine
move
(x: float, y: float, options: dict = None, **kwargs) → None[source]¶ -
Move mouse cursor (dispatches a mousemove event).
Options can accept a steps (int) field. If this steps option is specified, intermediate mousemove events are sent. Defaults to 1.
-
coroutine
up
(options: dict = None, **kwargs) → None[source]¶ -
Release pressed button (dispatches mouseup event).
This method accepts the following options:
- button (str): left, right, or middle, defaults to left.
- clickCount (int): defaults to 1.
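The move(), down(), and up() methods can be combined into a drag gesture, as in this sketch. The helper name is hypothetical, and `page.mouse` is the Mouse object exposed by the page.

```python
async def drag(page, start, end, steps=10):
    """Press at `start`, move to `end` in several steps, then release."""
    mouse = page.mouse
    await mouse.move(start[0], start[1])
    await mouse.down()
    # Intermediate mousemove events make the drag look continuous.
    await mouse.move(end[0], end[1], {'steps': steps})
    await mouse.up()
```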
-
Tracing Class¶
-
class
pyppeteer.tracing.
Tracing
(client: pyppeteer.connection.CDPSession)[source]¶ -
Bases:
object
Tracing class.
You can use start() and stop() to create a trace file which can be opened in Chrome DevTools or timeline viewer.
    await page.tracing.start({'path': 'trace.json'})
    await page.goto('https://www.google.com')
    await page.tracing.stop()
-
coroutine
start
(options: dict = None, **kwargs) → None[source]¶ -
Start tracing.
Only one trace can be active at a time per browser.
This method accepts the following options:
- path (str): A path to write the trace file to.
- screenshots (bool): Capture screenshots in the trace.
- categories (List[str]): Specify custom categories to use instead of default.
-
coroutine
stop
() → str[source]¶ -
Stop tracing.
Returns: trace data as string.
-
Dialog Class¶
-
class
pyppeteer.dialog.
Dialog
(client: pyppeteer.connection.CDPSession, type: str, message: str, defaultValue: str = '')[source]¶ -
Bases:
object
Dialog class.
Dialog objects are dispatched by page via the dialog event.
An example of using Dialog class:
    browser = await launch()
    page = await browser.newPage()

    async def close_dialog(dialog):
        print(dialog.message)
        await dialog.dismiss()
        await browser.close()

    page.on(
        'dialog',
        lambda dialog: asyncio.ensure_future(close_dialog(dialog))
    )
    await page.evaluate('() => alert("1")')
-
coroutine
accept
(promptText: str = '') → None[source]¶ -
Accept the dialog.
- promptText (str): A text to enter in prompt. If the dialog’s type is not prompt, this has no effect.
-
defaultValue
¶ -
If dialog is prompt, get default prompt value.
If dialog is not prompt, return empty string (
''
).
-
coroutine
dismiss
() → None[source]¶ -
Dismiss the dialog.
-
message
¶ -
Get dialog message.
-
type
¶ -
Get dialog type.
One of
alert
,beforeunload
,confirm
, orprompt
.
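Since the dialog type is one of a fixed set of values, a handler can branch on it. The sketch below shows one possible policy (answer prompts, accept confirms, dismiss everything else); the policy and the helper names are assumptions, not library behavior.

```python
import asyncio


async def handle_dialog(dialog, prompt_answer='yes'):
    """Answer prompts, accept confirms, and dismiss alert/beforeunload."""
    if dialog.type == 'prompt':
        await dialog.accept(prompt_answer)
    elif dialog.type == 'confirm':
        await dialog.accept()
    else:
        await dialog.dismiss()


def install_dialog_handler(page):
    """Schedule handle_dialog for every dialog event the page emits."""
    page.on('dialog', lambda d: asyncio.ensure_future(handle_dialog(d)))
```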
-
ConsoleMessage Class¶
-
class
pyppeteer.page.
ConsoleMessage
(type: str, text: str, args: List[pyppeteer.execution_context.JSHandle] = None)[source]¶ -
Bases:
object
Console message class.
ConsoleMessage objects are dispatched by page via the console event.
-
args
¶ -
Return list of args (JSHandle) of this message.
-
text
¶ -
Return text representation of this message.
-
type
¶ -
Return type of this message.
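A minimal sketch of collecting console output through the console event, using only the type and text properties listed above. The helper name is hypothetical.

```python
def collect_console(page):
    """Attach a console listener and return the list it appends to."""
    messages = []
    page.on('console', lambda msg: messages.append((msg.type, msg.text)))
    return messages
```

The returned list fills up as the page logs, so it can be inspected after navigation or script evaluation.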
-
Frame Class¶
-
class
pyppeteer.frame_manager.
Frame
(client: pyppeteer.connection.CDPSession, parentFrame: Optional[Frame], frameId: str)[source]¶ -
Bases:
object
Frame class.
Frame objects can be obtained via pyppeteer.page.Page.mainFrame.
-
coroutine
J
(selector: str) → Optional[pyppeteer.element_handle.ElementHandle]¶ -
Alias to
querySelector()
-
coroutine
JJ
(selector: str) → List[pyppeteer.element_handle.ElementHandle]¶ -
Alias to
querySelectorAll()
-
coroutine
JJeval
(selector: str, pageFunction: str, *args) → Optional[Dict[KT, VT]]¶ -
Alias to
querySelectorAllEval()
-
coroutine
Jeval
(selector: str, pageFunction: str, *args) → Any¶ -
Alias to
querySelectorEval()
-
coroutine
Jx
(expression: str) → List[pyppeteer.element_handle.ElementHandle]¶ -
Alias to
xpath()
-
coroutine
addScriptTag
(options: Dict[KT, VT]) → pyppeteer.element_handle.ElementHandle[source]¶ -
Add script tag to this frame.
Details see
pyppeteer.page.Page.addScriptTag()
.
-
coroutine
addStyleTag
(options: Dict[KT, VT]) → pyppeteer.element_handle.ElementHandle[source]¶ -
Add style tag to this frame.
Details see
pyppeteer.page.Page.addStyleTag()
.
-
childFrames
¶ -
Get child frames.
-
coroutine
click
(selector: str, options: dict = None, **kwargs) → None[source]¶ -
Click element which matches
selector
.Details see
pyppeteer.page.Page.click()
.
-
coroutine
content
() → str[source]¶ -
Get the whole HTML contents of the page.
-
coroutine
evaluate
(pageFunction: str, *args, force_expr: bool = False) → Any[source]¶ -
Evaluate pageFunction on this frame.
Details see
pyppeteer.page.Page.evaluate()
.
-
coroutine
evaluateHandle
(pageFunction: str, *args) → pyppeteer.execution_context.JSHandle[source]¶ -
Execute function on this frame.
Details see
pyppeteer.page.Page.evaluateHandle()
.
-
coroutine
executionContext
() → Optional[pyppeteer.execution_context.ExecutionContext][source]¶ -
Return execution context of this frame.
Return
ExecutionContext
associated to this frame.
-
coroutine
focus
(selector: str) → None[source]¶ -
Focus element which matches
selector
.Details see
pyppeteer.page.Page.focus()
.
-
coroutine
hover
(selector: str) → None[source]¶ -
Mouse hover the element which matches
selector
.Details see
pyppeteer.page.Page.hover()
.
-
coroutine
injectFile
(filePath: str) → str[source]¶ -
[Deprecated] Inject file to the frame.
-
isDetached
() → bool[source]¶ -
Return
True
if this frame is detached.Otherwise return
False
.
-
name
¶ -
Get frame name.
-
parentFrame
¶ -
Get parent frame.
If this frame is main frame or detached frame, return
None
.
-
coroutine
querySelector
(selector: str) → Optional[pyppeteer.element_handle.ElementHandle][source]¶ -
Get element which matches
selector
string.Details see
pyppeteer.page.Page.querySelector()
.
-
coroutine
querySelectorAll
(selector: str) → List[pyppeteer.element_handle.ElementHandle][source]¶ -
Get all elements which matches
selector
.Details see
pyppeteer.page.Page.querySelectorAll()
.
-
coroutine
querySelectorAllEval
(selector: str, pageFunction: str, *args) → Optional[Dict[KT, VT]][source]¶ -
Execute function on all elements which matches selector.
Details see
pyppeteer.page.Page.querySelectorAllEval()
.
-
coroutine
querySelectorEval
(selector: str, pageFunction: str, *args) → Any[source]¶ -
Execute function on element which matches selector.
Details see
pyppeteer.page.Page.querySelectorEval()
.
-
coroutine
select
(selector: str, *values) → List[str][source]¶ -
Select options and return selected values.
Details see
pyppeteer.page.Page.select()
.
-
coroutine
setContent
(html: str) → None[source]¶ -
Set content to this page.
-
coroutine
tap
(selector: str) → None[source]¶ -
Tap the element which matches the selector.
For details, see pyppeteer.page.Page.tap().
-
coroutine
title
() → str[source]¶ -
Get title of the frame.
-
coroutine
type
(selector: str, text: str, options: dict = None, **kwargs) → None[source]¶ -
Type text on the element which matches selector.
For details, see pyppeteer.page.Page.type().
-
url
¶ -
Get url of the frame.
-
waitFor
(selectorOrFunctionOrTimeout: Union[str, int, float], options: dict = None, *args, **kwargs) → Union[Awaitable[T_co], pyppeteer.frame_manager.WaitTask][source]¶ -
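Wait until the selector appears, the function returns a truthy value, or the
timeout elapses, depending on the type of selectorOrFunctionOrTimeout.
For details, see pyppeteer.page.Page.waitFor().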
-
waitForFunction
(pageFunction: str, options: dict = None, *args, **kwargs) → pyppeteer.frame_manager.WaitTask[source]¶ -
Wait until the function completes.
Details see
pyppeteer.page.Page.waitForFunction()
.
-
waitForSelector
(selector: str, options: dict = None, **kwargs) → pyppeteer.frame_manager.WaitTask[source]¶ -
Wait until an element which matches selector appears on the page.
For details, see pyppeteer.page.Page.waitForSelector().
-
waitForXPath
(xpath: str, options: dict = None, **kwargs) → pyppeteer.frame_manager.WaitTask[source]¶ -
Wait until an element which matches xpath appears on the page.
For details, see pyppeteer.page.Page.waitForXPath().
-
coroutine
xpath
(expression: str) → List[pyppeteer.element_handle.ElementHandle][source]¶ -
Evaluate the XPath expression.
If there are no such elements in this frame, return an empty list.
Parameters: expression (str) – XPath string to be evaluated.
-
ExecutionContext Class¶
-
class
pyppeteer.execution_context.
ExecutionContext
(client: pyppeteer.connection.CDPSession, contextPayload: Dict[KT, VT], objectHandleFactory: Any, frame: Frame = None)[source]¶ -
Bases:
object
Execution Context class.
-
coroutine
evaluate
(pageFunction: str, *args, force_expr: bool = False) → Any[source]¶ -
Execute pageFunction on this context.
For details, see pyppeteer.page.Page.evaluate().
-
coroutine
evaluateHandle
(pageFunction: str, *args, force_expr: bool = False) → pyppeteer.execution_context.JSHandle[source]¶ -
Execute pageFunction on this context.
For details, see pyppeteer.page.Page.evaluateHandle().
-
frame
¶ -
Return frame associated with this execution context.
-
coroutine
queryObjects
(prototypeHandle: pyppeteer.execution_context.JSHandle) → pyppeteer.execution_context.JSHandle[source]¶ -
Send query.
Details see
pyppeteer.page.Page.queryObjects()
.
-
JSHandle Class¶
-
class
pyppeteer.execution_context.
JSHandle
(context: pyppeteer.execution_context.ExecutionContext, client: pyppeteer.connection.CDPSession, remoteObject: Dict[KT, VT])[source]¶ -
Bases:
object
JSHandle class.
JSHandle represents an in-page JavaScript object. JSHandle can be created
with the evaluateHandle() method.
-
asElement
() → Optional[ElementHandle][source]¶ -
Return either None or the object handle itself.
-
coroutine
dispose
() → None[source]¶ -
Stop referencing the handle.
-
executionContext
¶ -
Get execution context of this handle.
-
coroutine
getProperties
() → Dict[str, pyppeteer.execution_context.JSHandle][source]¶ -
Get all properties of this handle.
-
coroutine
getProperty
(propertyName: str) → pyppeteer.execution_context.JSHandle[source]¶ -
Get property value of
propertyName
.
-
coroutine
jsonValue
() → Dict[KT, VT][source]¶ -
Get Jsonized value of this object.
-
toString
() → str[source]¶ -
Get string representation.
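The JSHandle methods above can be combined to copy an in-page object into Python. The following is a minimal sketch, not part of pyppeteer: the helper name handle_to_dict is hypothetical, and it assumes every property value is JSON-serializable so that jsonValue() succeeds.

```python
from typing import Any, Dict


async def handle_to_dict(handle: Any) -> Dict[str, Any]:
    """Copy the properties of a JSHandle-like object into a plain dict.

    `handle` is expected to behave like pyppeteer's JSHandle; the helper
    itself is a hypothetical illustration.
    """
    # getProperties() resolves to a mapping of property name -> JSHandle.
    properties = await handle.getProperties()
    result: Dict[str, Any] = {}
    for name, prop in properties.items():
        # jsonValue() converts the remote value into a Python value.
        result[name] = await prop.jsonValue()
        # Stop referencing each child handle once its value is copied.
        await prop.dispose()
    return result
```

A handle for this helper could come from evaluateHandle(), for example `handle = await page.evaluateHandle('() => ({a: 1})')`.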
-
ElementHandle Class¶
-
class
pyppeteer.element_handle.
ElementHandle
(context: pyppeteer.execution_context.ExecutionContext, client: pyppeteer.connection.CDPSession, remoteObject: dict, page: Any, frameManager: FrameManager)[source]¶ -
Bases:
pyppeteer.execution_context.JSHandle
ElementHandle class.
This class represents an in-page DOM element. ElementHandle can be created
by the pyppeteer.page.Page.querySelector() method.
ElementHandle prevents the DOM element from being garbage collected unless the
handle is disposed. ElementHandles are automatically disposed when their
origin frame gets navigated.
ElementHandle instances can be used as arguments in the
pyppeteer.page.Page.querySelectorEval() and pyppeteer.page.Page.evaluate()
methods.
-
coroutine
J
(selector: str) → Optional[pyppeteer.element_handle.ElementHandle]¶ -
alias to
querySelector()
-
coroutine
JJ
(selector: str) → List[pyppeteer.element_handle.ElementHandle]¶ -
alias to
querySelectorAll()
-
coroutine
JJeval
(selector: str, pageFunction: str, *args) → Any¶ -
alias to
querySelectorAllEval()
-
coroutine
Jeval
(selector: str, pageFunction: str, *args) → Any¶ -
alias to
querySelectorEval()
-
coroutine
Jx
(expression: str) → List[pyppeteer.element_handle.ElementHandle]¶ -
alias to
xpath()
-
asElement
() → pyppeteer.element_handle.ElementHandle[source]¶ -
Return this ElementHandle.
-
coroutine
boundingBox
() → Optional[Dict[str, float]][source]¶ -
Return bounding box of this element.
If the element is not visible, return None.
This method returns a dictionary of the bounding box, which contains:
- x (int): The X coordinate of the element in pixels.
- y (int): The Y coordinate of the element in pixels.
- width (int): The width of the element in pixels.
- height (int): The height of the element in pixels.
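As an illustration of the returned dictionary, the center point of an element can be derived from the four fields listed above. This is a small sketch; box_center is a hypothetical helper name, not a pyppeteer function.

```python
from typing import Dict, Optional, Tuple


def box_center(box: Optional[Dict[str, float]]) -> Optional[Tuple[float, float]]:
    """Return the (x, y) center of a bounding box, or None for a hidden element."""
    if box is None:
        # boundingBox() returns None when the element is not visible.
        return None
    return (box['x'] + box['width'] / 2, box['y'] + box['height'] / 2)
```

The input would typically come from `box = await element.boundingBox()`.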
-
coroutine
boxModel
() → Optional[Dict[KT, VT]][source]¶ -
Return boxes of element.
Return None if the element is not visible. Boxes are represented as a
list of points; each point is a dictionary {x, y}. Box points are
sorted clockwise.
The returned value is a dictionary with the following fields:
- content (List[Dict]): Content box.
- padding (List[Dict]): Padding box.
- border (List[Dict]): Border box.
- margin (List[Dict]): Margin box.
- width (int): Element’s width.
- height (int): Element’s height.
-
coroutine
click
(options: dict = None, **kwargs) → None[source]¶ -
Click the center of this element.
If needed, this method scrolls the element into view. If the element is
detached from the DOM, the method raises ElementHandleError.
options can contain the following fields:
- button (str): left, right, or middle; defaults to left.
- clickCount (int): Defaults to 1.
- delay (int|float): Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
-
coroutine
contentFrame
() → Optional[pyppeteer.frame_manager.Frame][source]¶ -
Return the content frame for the element handle.
Return
None
if this handle is not referencing iframe.
-
coroutine
focus
() → None[source]¶ -
Focus on this element.
-
coroutine
hover
() → None[source]¶ -
Move mouse over to center of this element.
If needed, this method scrolls the element into view. If this element is
detached from the DOM tree, the method raises an ElementHandleError.
-
coroutine
isIntersectingViewport
() → bool[source]¶ -
Return
True
if the element is visible in the viewport.
-
coroutine
press
(key: str, options: Dict[KT, VT] = None, **kwargs) → None[source]¶ -
Press key onto the element.
This method focuses the element, and then uses
pyppeteer.input.keyboard.down() and pyppeteer.input.keyboard.up().
Parameters: key (str) – Name of key to press, such as ArrowLeft.
This method accepts the following options:
- text (str): If specified, generates an input event with this text.
- delay (int|float): Time to wait between keydown and keyup. Defaults to 0.
-
coroutine
querySelector
(selector: str) → Optional[pyppeteer.element_handle.ElementHandle][source]¶ -
Return the first element which matches selector under this element.
If no element matches the selector, returns None.
-
coroutine
querySelectorAll
(selector: str) → List[pyppeteer.element_handle.ElementHandle][source]¶ -
Return all elements which match selector under this element.
If no element matches the selector, returns an empty list ([]).
-
coroutine
querySelectorAllEval
(selector: str, pageFunction: str, *args) → Any[source]¶ -
Run Page.querySelectorAllEval within the element.
This method runs Array.from(document.querySelectorAll) within the
element and passes it as the first argument to pageFunction. If
there is no element matching selector, the method raises
ElementHandleError.
If pageFunction returns a promise, then wait for the promise to
resolve and return its value.
Example:
    <div class="feed">
      <div class="tweet">Hello!</div>
      <div class="tweet">Hi!</div>
    </div>

    feedHandle = await page.J('.feed')
    assert (await feedHandle.JJeval('.tweet', '(nodes => nodes.map(n => n.innerText))')) == ['Hello!', 'Hi!']
-
coroutine
querySelectorEval
(selector: str, pageFunction: str, *args) → Any[source]¶ -
Run Page.querySelectorEval within the element.
This method runs document.querySelector within the element and
passes it as the first argument to pageFunction. If there is no
element matching selector, the method raises ElementHandleError.
If pageFunction returns a promise, then wait for the promise to
resolve and return its value.
ElementHandle.Jeval is a shortcut of this method.
Example:
    tweetHandle = await page.querySelector('.tweet')
    # innerText is a string, so compare against strings
    assert (await tweetHandle.querySelectorEval('.like', 'node => node.innerText')) == '100'
    assert (await tweetHandle.Jeval('.retweets', 'node => node.innerText')) == '10'
-
coroutine
screenshot
(options: Dict[KT, VT] = None, **kwargs) → bytes[source]¶ -
Take a screenshot of this element.
If the element is detached from the DOM, this method raises an
ElementHandleError.
Available options are the same as for pyppeteer.page.Page.screenshot().
-
coroutine
tap
() → None[source]¶ -
Tap the center of this element.
If needed, this method scrolls the element into view. If the element is
detached from the DOM, the method raises ElementHandleError.
-
coroutine
type
(text: str, options: Dict[KT, VT] = None, **kwargs) → None[source]¶ -
Focus the element and then type text.
For details, see the pyppeteer.input.Keyboard.type() method.
-
coroutine
uploadFile
(*filePaths) → dict[source]¶ -
Upload files.
-
coroutine
xpath
(expression: str) → List[pyppeteer.element_handle.ElementHandle][source]¶ -
Evaluate the XPath expression relative to this elementHandle.
If there are no such elements, return an empty list.
Parameters: expression (str) – XPath string to be evaluated.
-
Request Class¶
-
class
pyppeteer.network_manager.
Request
(client: pyppeteer.connection.CDPSession, requestId: Optional[str], interceptionId: str, isNavigationRequest: bool, allowInterception: bool, url: str, resourceType: str, payload: dict, frame: Optional[pyppeteer.frame_manager.Frame], redirectChain: List[Request])[source]¶ -
Bases:
object
Request class.
Whenever the page sends a request, such as for a network resource, the
following events are emitted by pyppeteer’s page:
- 'request': emitted when the request is issued by the page.
- 'response': emitted when/if the response is received for the request.
- 'requestfinished': emitted when the response body is downloaded and the request is complete.
If the request fails at some point, then instead of the 'requestfinished' event
(and possibly instead of the 'response' event), the 'requestfailed' event is emitted.
If the request gets a 'redirect' response, the request is successfully
finished with the 'requestfinished' event, and a new request is issued
to the redirect url.
-
coroutine
abort
(errorCode: str = 'failed') → None[source]¶ -
Abort request.
To use this, request interception should be enabled by
pyppeteer.page.Page.setRequestInterception().
If request interception is not enabled, raise NetworkError.
errorCode is an optional error code string. Defaults to failed, and
could be one of the following:
- aborted: An operation was aborted (due to user action).
- accessdenied: Permission to access a resource, other than the network, was denied.
- addressunreachable: The IP address is unreachable. This usually means that there is no route to the specified host or network.
- blockedbyclient: The client chose to block the request.
- blockedbyresponse: The request failed because the request was delivered along with requirements which are not met (‘X-Frame-Options’ and ‘Content-Security-Policy’ ancestor check, for instance).
- connectionaborted: A connection timeout as a result of not receiving an ACK for data sent.
- connectionclosed: A connection was closed (corresponding to a TCP FIN).
- connectionfailed: A connection attempt failed.
- connectionrefused: A connection attempt was refused.
- connectionreset: A connection was reset (corresponding to a TCP RST).
- internetdisconnected: The Internet connection has been lost.
- namenotresolved: The host name could not be resolved.
- timedout: An operation timed out.
- failed: A generic failure occurred.
-
coroutine
continue_
(overrides: Dict[KT, VT] = None) → None[source]¶ -
Continue request with optional request overrides.
To use this method, request interception should be enabled by
pyppeteer.page.Page.setRequestInterception(). If request
interception is not enabled, raise NetworkError.
overrides can have the following fields:
- url (str): If set, the request url will be changed.
- method (str): If set, change the request method (e.g. GET).
- postData (str): If set, change the post data of the request.
- headers (dict): If set, change the request HTTP headers.
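A common use of abort() and continue_() together is to drop some resource types and let everything else through. The sketch below is illustrative only: the names BLOCKED_RESOURCE_TYPES, filter_request, and enable_filter are hypothetical, and the choice of blocked types is an assumption.

```python
import asyncio
from typing import Any

# Resource types to drop; an arbitrary choice for this illustration.
BLOCKED_RESOURCE_TYPES = {'image', 'media', 'font'}


async def filter_request(request: Any) -> None:
    """Abort blocked resource types and continue every other request."""
    if request.resourceType in BLOCKED_RESOURCE_TYPES:
        # 'blockedbyclient' is one of the error codes accepted by abort().
        await request.abort('blockedbyclient')
    else:
        await request.continue_()


async def enable_filter(page: Any) -> None:
    """Turn on interception for a pyppeteer Page and attach the filter."""
    # Interception must be enabled first, otherwise abort()/continue_() raise.
    await page.setRequestInterception(True)
    # Event listeners are called synchronously, so schedule the coroutine.
    page.on('request', lambda req: asyncio.ensure_future(filter_request(req)))
```

Every intercepted request has to be either aborted, continued, or responded to; a request that is left unhandled stalls.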
-
failure
() → Optional[Dict[KT, VT]][source]¶ -
Return error text.
Return None unless this request failed, as reported by the
requestfailed event.
When the request failed, this method returns a dictionary which has an
errorText field containing a human-readable error message, e.g.
'net::ERR_FAILED'.
-
frame
¶ -
Return a matching frame object.
Return None if navigating to an error page.
-
headers
¶ -
Return a dictionary of HTTP headers of this request.
All header names are lower-case.
-
isNavigationRequest
() → bool[source]¶ -
Whether this request is driving frame’s navigation.
-
method
¶ -
Return this request’s method (GET, POST, etc.).
-
postData
¶ -
Return post body of this request.
-
redirectChain
¶ -
Return chain of requests initiated to fetch a resource.
- If there are no redirects and the request was successful, the chain will be empty.
- If a server responds with at least a single redirect, then the chain will contain all the requests that were redirected.
redirectChain is shared between all the requests of the same chain.
-
resourceType
¶ -
Resource type of this request perceived by the rendering engine.
ResourceType will be one of the following: document, stylesheet, image,
media, font, script, texttrack, xhr, fetch, eventsource, websocket,
manifest, other.
-
coroutine
respond
(response: Dict[KT, VT]) → None[source]¶ -
Fulfills request with given response.
To use this, request interception should be enabled by
pyppeteer.page.Page.setRequestInterception(). If request
interception is not enabled, raise NetworkError.
response is a dictionary which can have the following fields:
- status (int): Response status code, defaults to 200.
- headers (dict): Optional response headers.
- contentType (str): If set, equals to setting the Content-Type response header.
- body (str|bytes): Optional response body.
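For instance, an intercepted request could be answered with a canned JSON body built from the fields above. This is a sketch under stated assumptions: respond_with_json is a hypothetical helper, and request is assumed to be an intercepted pyppeteer Request.

```python
import json
from typing import Any


async def respond_with_json(request: Any, payload: Any, status: int = 200) -> None:
    """Fulfil an intercepted request with a JSON-encoded payload."""
    await request.respond({
        'status': status,
        # Equivalent to setting the Content-Type response header.
        'contentType': 'application/json',
        'body': json.dumps(payload),
    })
```

This can be useful for stubbing a backend endpoint while testing a page.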
-
response
¶ -
Return the matching Response object, or None.
If the response has not been received, return None.
-
url
¶ -
URL of this request.
Response Class¶
-
class
pyppeteer.network_manager.
Response
(client: pyppeteer.connection.CDPSession, request: pyppeteer.network_manager.Request, status: int, headers: Dict[str, str], fromDiskCache: bool, fromServiceWorker: bool, securityDetails: Dict[KT, VT] = None)[source]¶ -
Bases:
object
Response class represents responses which are received by Page.
-
buffer
() → Awaitable[bytes][source]¶ -
Return awaitable which resolves to bytes with response body.
-
fromCache
¶ -
Return True if the response was served from cache.
Here cache is either the browser’s disk cache or memory cache.
-
fromServiceWorker
¶ -
Return
True
if the response was served by a service worker.
-
headers
¶ -
Return a dictionary of HTTP headers of this response.
All header names are lower-case.
-
coroutine
json
() → dict[source]¶ -
Get JSON representation of response body.
-
ok
¶ -
Return a bool indicating whether this response was successful (status 200-299) or not.
-
request
¶ -
Get matching
Request
object.
-
securityDetails
¶ -
Return security details associated with this response.
Security details if the response was received over a secure
connection, or None otherwise.
-
status
¶ -
Status code of the response.
-
coroutine
text
() → str[source]¶ -
Get text representation of response body.
-
url
¶ -
URL of the response.
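The Response members above can be combined to decode a body only when it is safe to do so. The following is a minimal sketch; read_json_if_ok is a hypothetical helper, and the content-type check is an assumption about what the caller wants rather than anything pyppeteer requires.

```python
from typing import Any, Optional


async def read_json_if_ok(response: Any) -> Optional[dict]:
    """Return the decoded JSON body of a successful JSON response, else None."""
    if not response.ok:
        # Status outside 200-299.
        return None
    # Header names are lower-case, as noted for the headers property.
    content_type = response.headers.get('content-type', '')
    if 'application/json' not in content_type:
        return None
    return await response.json()
```

A response for this helper could come from `response = await page.goto(url)` or from a 'response' event listener.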
-
Target Class¶
-
class
pyppeteer.browser.
Target
(targetInfo: Dict[KT, VT], browserContext: BrowserContext, sessionFactory: Callable[[], Coroutine[Any, Any, pyppeteer.connection.CDPSession]], ignoreHTTPSErrors: bool, setDefaultViewport: bool, screenshotTaskQueue: List[T], loop: asyncio.events.AbstractEventLoop)[source]¶ -
Bases:
object
Browser’s target class.
-
browser
¶ -
Get the browser the target belongs to.
-
browserContext
¶ -
Return the browser context the target belongs to.
-
coroutine
createCDPSession
() → pyppeteer.connection.CDPSession[source]¶ -
Create a Chrome Devtools Protocol session attached to the target.
-
opener
¶ -
Get the target that opened this target.
Top-level targets return
None
.
-
coroutine
page
() → Optional[pyppeteer.page.Page][source]¶ -
Get page of this target.
If the target is not of type “page” or “background_page”, return
None
.
-
type
¶ -
Get type of this target.
Type can be 'page', 'background_page', 'service_worker', 'browser',
or 'other'.
-
url
¶ -
Get url of this target.
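As a sketch of how type and page() fit together, the helper below collects a Page for every target of type 'page'. The name open_pages is hypothetical, and the code assumes that browser.targets() returns the list of targets synchronously.

```python
from typing import Any, List


async def open_pages(browser: Any) -> List[Any]:
    """Collect the Page object of every 'page' target of a browser."""
    pages = []
    # Assumption: targets() is a plain method that returns a list of Target.
    for target in browser.targets():
        if target.type != 'page':
            continue
        page = await target.page()
        # page() may still return None, so guard before collecting.
        if page is not None:
            pages.append(page)
    return pages
```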
-
CDPSession Class¶
-
class
pyppeteer.connection.
CDPSession
(connection: Union[pyppeteer.connection.Connection, CDPSession], targetType: str, sessionId: str, loop: asyncio.events.AbstractEventLoop)[source]¶ -
Bases:
pyee.EventEmitter
Chrome Devtools Protocol Session.
The CDPSession instances are used to talk raw Chrome Devtools Protocol:
- protocol methods can be called with the send() method.
- protocol events can be subscribed to with the on() method.
Documentation on DevTools Protocol can be found here.
-
coroutine
detach
() → None[source]¶ -
Detach session from target.
Once detached, session won’t emit any events and can’t be used to send
messages.
-
send
(method: str, params: dict = None) → Awaitable[T_co][source]¶ -
Send message to the connected session.
Parameters:
- method (str) – Protocol method name.
- params (dict) – Optional method parameters.
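A short sketch of send() in use: the helper below enables the DevTools Animation domain and changes the playback rate. The helper name speed_up_animations is hypothetical; Animation.enable and Animation.setPlaybackRate are DevTools Protocol methods, and session is assumed to be a CDPSession obtained from Target.createCDPSession().

```python
from typing import Any


async def speed_up_animations(session: Any, rate: float = 2.0) -> None:
    """Make page animations run `rate` times faster through raw CDP calls."""
    # A protocol method without parameters.
    await session.send('Animation.enable')
    # A protocol method with parameters passed as a dict.
    await session.send('Animation.setPlaybackRate', {'playbackRate': rate})
```

A session could be created with `session = await page.target.createCDPSession()` and released with `await session.detach()` when no longer needed.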
Coverage Class¶
-
class
pyppeteer.coverage.
Coverage
(client: pyppeteer.connection.CDPSession)[source]¶ -
Bases:
object
Coverage class.
Coverage gathers information about parts of JavaScript and CSS that were
used by the page.
An example of using JavaScript and CSS coverage to get the percentage of
initially executed code:
    # Enable both JavaScript and CSS coverage
    await page.coverage.startJSCoverage()
    await page.coverage.startCSSCoverage()

    # Navigate to page
    await page.goto('https://example.com')
    # Disable JS and CSS coverage and get results
    jsCoverage = await page.coverage.stopJSCoverage()
    cssCoverage = await page.coverage.stopCSSCoverage()
    totalBytes = 0
    usedBytes = 0
    coverage = jsCoverage + cssCoverage
    for entry in coverage:
        totalBytes += len(entry['text'])
        for range in entry['ranges']:
            usedBytes += range['end'] - range['start'] - 1

    print('Bytes used: {}%'.format(usedBytes / totalBytes * 100))
-
coroutine
startCSSCoverage
(options: Dict[KT, VT] = None, **kwargs) → None[source]¶ -
Start CSS coverage measurement.
Available options are:
- resetOnNavigation (bool): Whether to reset coverage on every navigation. Defaults to True.
-
coroutine
startJSCoverage
(options: Dict[KT, VT] = None, **kwargs) → None[source]¶ -
Start JS coverage measurement.
Available options are:
- resetOnNavigation (bool): Whether to reset coverage on every navigation. Defaults to True.
- reportAnonymousScript (bool): Whether anonymous scripts generated by the page should be reported. Defaults to False.
Note
Anonymous scripts are ones that don’t have an associated url. These
are scripts that are dynamically created on the page using eval
or new Function. If reportAnonymousScript is set to True, anonymous
scripts will have __pyppeteer_evaluation_script__ as their url.
-
coroutine
stopCSSCoverage
() → List[T][source]¶ -
Stop CSS coverage measurement and get result.
Return list of coverage reports for all stylesheets. Each report includes:
- url (str): StyleSheet url.
- text (str): StyleSheet content.
- ranges (List[Dict]): StyleSheet ranges that were executed. Ranges are sorted and non-overlapping.
  - start (int): A start offset in text, inclusive.
  - end (int): An end offset in text, exclusive.
Note
CSS coverage doesn’t include dynamically injected style tags without
sourceURLs (but currently includes… to be fixed).
-
coroutine
stopJSCoverage
() → List[T][source]¶ -
Stop JS coverage measurement and get result.
Return list of coverage reports for all scripts. Each report includes:
- url (str): Script url.
- text (str): Script content.
- ranges (List[Dict]): Script ranges that were executed. Ranges are sorted and non-overlapping.
  - start (int): A start offset in text, inclusive.
  - end (int): An end offset in text, exclusive.
Note
JavaScript coverage doesn’t include anonymous scripts by default.
However, scripts with sourceURLs are reported.
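Given the report structure described above, the share of used bytes can be computed without a browser. This is a minimal sketch with a hypothetical helper name, used_percentage. It counts each range as end - start bytes, on the assumption that start is inclusive and end is exclusive as stated in the list.

```python
from typing import Any, Dict, List


def used_percentage(reports: List[Dict[str, Any]]) -> float:
    """Return the percentage of bytes covered by the ranges of all reports."""
    total_bytes = 0
    used_bytes = 0
    for entry in reports:
        total_bytes += len(entry['text'])
        for span in entry['ranges']:
            # Half-open interval [start, end): its length is end - start.
            used_bytes += span['end'] - span['start']
    if total_bytes == 0:
        # No content at all; avoid dividing by zero.
        return 0.0
    return used_bytes / total_bytes * 100
```

The lists returned by stopJSCoverage() and stopCSSCoverage() can be concatenated and passed in directly.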
-
Debugging¶
For debugging, you can set the logLevel option to logging.DEBUG for the
pyppeteer.launcher.launch() and pyppeteer.launcher.connect() functions.
However, this option prints too many logs, including SEND/RECV
messages of pyppeteer. In order to only show suppressed error messages, you
should set pyppeteer.DEBUG to True.
Example:
    import asyncio
    import pyppeteer
    from pyppeteer import launch

    pyppeteer.DEBUG = True  # print suppressed errors as error log

    async def main():
        browser = await launch()
        ...  # do something

    asyncio.get_event_loop().run_until_complete(main())