2021-06-27
Handling POST Requests in a Python Crawler
POST requests that send Form Data
A POST request carrying Form Data
Request URL:http://127.0.0.1:8080/test/test.do
Request Method:POST
Status Code:200 OK
Request Headers
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate,sdch
Accept-Language:zh-CN,zh;q=0.8,en;q=0.6
AlexaToolbar-ALX_NS_PH:AlexaToolbar/alxg-3.2
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:25
Content-Type:application/x-www-form-urlencoded
Cookie:JSESSIONID=74AC93F9F572980B6FC10474CD8EDD8D
Host:127.0.0.1:8080
Origin:http://127.0.0.1:8080
Referer:http://127.0.0.1:8080/test/index.jsp
User-Agent:Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.149 Safari/537.36
Form Data
name:mikan
address:street
Response Headers
Content-Length:2
Date:Sun, 11 May 2014 11:05:33 GMT
Server:Apache-Coyote/1.1
Sending a POST request with Form Data using the Requests module
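import requests

# Example values are taken from the captured request above
response = requests.post(
    url="http://127.0.0.1:8080/test/test.do",  # request URL
    headers={"Referer": "http://127.0.0.1:8080/test/index.jsp"},  # dict of request headers
    data={"name": "mikan", "address": "street"},  # dict of form fields
)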
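To see what Requests actually puts on the wire for a `data` dict, the request can be prepared without being sent. The following is a minimal sketch, assuming the field values from the captured request above; it only inspects the prepared request and makes no network call.

```python
import requests

# Build the request from the captured example values; nothing is sent.
request = requests.Request(
    method="POST",
    url="http://127.0.0.1:8080/test/test.do",
    data={"name": "mikan", "address": "street"},
)
prepared = request.prepare()

# Requests URL-encodes the dict and sets the matching headers itself.
print(prepared.headers["Content-Type"])    # application/x-www-form-urlencoded
print(prepared.body)                       # name=mikan&address=street
print(prepared.headers["Content-Length"])  # 25
```

The 25-byte body agrees with the `Content-Length:25` header in the capture, which suggests the browser sent the same encoding.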
POST requests that send a Request Payload
A POST request carrying a Request Payload
General
Request URL: https://account.alibabacloud.com/csp/report.htm
Request Method: POST
Status Code: 200
Remote Address: 47.88.251.186:443
Referrer Policy: strict-origin-when-cross-origin
Response Headers
content-encoding: gzip
content-type: text/html;charset=UTF-8
date: Fri, 25 Jun 2021 07:54:07 GMT
eagleeye-traceid: 0a98a6bb16246076475364468e2bdc
server: Tengine
strict-transport-security: max-age=0
timing-allow-origin: *
vary: Accept-Encoding
Request Headers
:authority: account.alibabacloud.com
:method: POST
:path: /csp/report.htm
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: zh-CN,zh;q=0.9
content-length: 1047
content-type: application/csp-report
cookie: ******************************************************************************************************
origin: https://account.alibabacloud.com
referer: https://account.alibabacloud.com/login/login.htm
sec-ch-ua: " Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"
sec-ch-ua-mobile: ?0
sec-fetch-dest: report
sec-fetch-mode: no-cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36
Request Payload
{"csp-report":{"document-uri":"https://account.alibabacloud.com/login/login.htm","referrer":"https://www.google.com/","violated-directive":"script-src-elem","effective-directive":"script-src-elem","original-policy":"base-uri 'self';script-src 'self' 'unsafe-inline' 'unsafe-eval' 'report-sample' https: http: 'sha256-lfXlPY3+MCPOPb4mrw1Y961+745U3WlDQVcOXdchSQc=' 'sha256-QbgF6nrAFOI1VumLs3RwKgg0Qmj5JImgLwiAhJOUoeQ=' 'sha256-rRMdkshZyJlCmDX27XnL7g3zXaxv7ei6Sg+yt4R3svU=' 'sha256-kbHtQyYDQKz4SWMQ8OHVol3EC0t3tHEJFPCSwNG9NxQ=' 'sha256-46mc3H6z56gnOReRHr//8M7FxjqtSaDN7KetqqduuiE=' 'Strict-Dynamic' 'unsafe-hashes' 'nonce-0MVMRYu19o';frame-src 'self' *.aliyun.com *.alibaba.com *.alibabacloud.com gaic.alicdn.com g.alicdn.com;worker-src blob: 'self' data:;object-src 'self' g.alicdn.com;frame-ancestors *.aliyun.com;report-uri /csp/report.htm;","disposition":"report","blocked-uri":"inline","line-number":59,"source-file":"https://account.alibabacloud.com/login/login.htm","status-code":0,"script-sample":"var ALIYUN_ACCOUNT_LOGIN_CONFIG = {\n "}}
Sending a POST request with a Request Payload using the Requests module
import json
import requests

url = ""  # URL of the target endpoint (left empty here)
payloadHeaders = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36",
    "content-type": "application/json;charset=UTF-8",
}
payloadData = {
    "query": [],
    "start": 6000,
    "rows": 100,
    "sort_field": {"sort_field": "ImpactFactor"},
    "highlight_field": "",
    "pinyin_title": [],
    "class_code": "",
    "core_periodical": [],
    "sponsor_region": [],
    "publishing_period": [],
    "publish_status": "",
    "return_fields": ["Title", "Id", "CorePeriodical", "Award", "IsPrePublished"],
}
# Serialize the dict to a JSON string so it is sent as the request payload
response = requests.post(url, data=json.dumps(payloadData), headers=payloadHeaders)
# Parse the JSON response body into Python objects
dates = response.json()
print(dates['value'][-1])
print(len(dates['value']))
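As an alternative to calling `json.dumps` by hand, Requests also accepts a `json=` argument that serializes the dict and sets the `Content-Type` header for you. The sketch below assumes a shortened payload and a hypothetical placeholder URL (`http://127.0.0.1:8080/search` is not from the original); the request is only prepared, never sent.

```python
import json
import requests

# Shortened version of the payload, for illustration only.
payload = {"start": 6000, "rows": 100, "sort_field": {"sort_field": "ImpactFactor"}}

# Hypothetical placeholder URL; prepare() builds the request without sending it.
prepared = requests.Request(
    method="POST",
    url="http://127.0.0.1:8080/search",
    json=payload,
).prepare()

# The header is filled in automatically because none was supplied.
print(prepared.headers["Content-Type"])      # application/json
# The body is the JSON serialization of the dict.
print(json.loads(prepared.body) == payload)  # True
```

In a real crawler the equivalent call would be `requests.post(url, json=payload, headers=...)`. If the server insists on a specific value such as `application/json;charset=UTF-8`, passing it explicitly in `headers` takes precedence over the automatic one.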
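Differences between the two
- If the request headers contain Content-Type: application/x-www-form-urlencoded, the request is a POST form submission, and the request body is a standard query string of key=value pairs joined by &. This is the default encoding for HTML forms, which is why it used to be the more common of the two.
- Other kinds of POST request carry their data in the Request Payload (nowadays usually in a format such as JSON, which is easier to read), and the request's Content-Type is application/json;charset=UTF-8 or is left unspecified.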
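The contrast between the two body formats can be illustrated with the standard library alone. This is a minimal sketch, assuming the two form fields from the first capture; it encodes the same dict both ways and decodes each body back, roughly as a server would.

```python
import json
from urllib.parse import parse_qsl, urlencode

fields = {"name": "mikan", "address": "street"}

# Body for Content-Type: application/x-www-form-urlencoded
form_body = urlencode(fields)
# Body for Content-Type: application/json
json_body = json.dumps(fields)

print(form_body)  # name=mikan&address=street
print(json_body)  # {"name": "mikan", "address": "street"}

# Each format needs its own decoder on the receiving side.
print(dict(parse_qsl(form_body)) == fields)  # True
print(json.loads(json_body) == fields)       # True
```

When reproducing a request captured in the browser, the section it appears under in DevTools (Form Data or Request Payload) indicates which of the two encodings to use.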