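When a Python crawler scrapes resources on the external network, Python needs to connect through a proxy such as SOCKS5. This post records two common ways to do that.
Modifying the built-in Python libraries
When the program connects through built-in libraries such as urllib or socket, edit urllib.py or socket.py respectively.
However, once the file is changed, every other Python program that uses the same installation is affected too, so remember to revert the change afterwards.
Alternatively, use pyenv or a similar tool to give the program its own isolated Python environment, so that other programs are not affected.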
Modifying urllib.py

import socks
import socket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 1080)
socket.socket = socks.socksocket
Modifying socket.py

socket = SocketType = _socketobject  # find this line in socket.py (Python 2)
# then add the following code right below it
import socks
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 1080)
socket = socks.socksocket
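The side effect described above, where other programs pick up the proxy, comes from editing the library files on disk. A minimal sketch of a runtime alternative, assuming the PySocks module socks is installed: swap socket.socket only inside a with block and restore it afterwards. The helper name socks5_proxy and its socket_factory parameter are hypothetical, introduced here only for illustration.

```python
import contextlib
import socket


@contextlib.contextmanager
def socks5_proxy(host="127.0.0.1", port=1080, socket_factory=None):
    """Temporarily make socket.socket create proxied sockets.

    socket_factory is a hypothetical hook (handy for testing); when it is
    None, PySocks' socks.socksocket is used with a SOCKS5 default proxy.
    """
    original = socket.socket
    if socket_factory is None:
        import socks  # PySocks, imported lazily so the helper loads without it
        socks.set_default_proxy(socks.SOCKS5, host, port)
        socket_factory = socks.socksocket
    socket.socket = socket_factory
    try:
        yield
    finally:
        # Always restore the original class, even if the block raised.
        socket.socket = original


# Usage (assumes a SOCKS5 proxy is listening on 127.0.0.1:1080):
# with socks5_proxy("127.0.0.1", 1080):
#     ...  # code that opens sockets through socket.socket
```

Only code that looks up socket.socket while the block is active is affected; a module that already holds its own reference to the original class (for example via from socket import socket) keeps using direct connections.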
Using PySocks
PySocks is a SOCKS proxy client for Python. It is a fork of SocksiPy that fixes a number of bugs and adds some extra features.
Installation
git clone https://github.com/Anorov/PySocks
cd PySocks
python setup.py install

Or install it directly with pip:

pip install PySocks
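Usage

import socket
import socks
import requests

# Route every new socket through the local SOCKS5 proxy
socks.set_default_proxy(socks.SOCKS5, "127.0.0.1", 1080)
socket.socket = socks.socksocket

# requests needs a full URL including the scheme
response = requests.get("http://www.google.com")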
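The snippet above replaces socket.socket for the whole process. As a hedged sketch of a narrower option, PySocks also lets a single socket carry its own proxy setting through socksocket.set_proxy, which leaves every other connection untouched. The function name http_head_via_socks5 and its parameters are hypothetical, and the proxy address is the same assumed 127.0.0.1:1080 used in this post.

```python
def http_head_via_socks5(host, proxy_host="127.0.0.1", proxy_port=1080, timeout=10):
    """Send a plain HTTP HEAD request to host through a SOCKS5 proxy.

    Returns the first line of the response (the status line), or an
    empty string if the server sent nothing back.
    """
    import socks  # PySocks, imported lazily

    s = socks.socksocket()  # TCP socket, used like a normal socket object
    s.set_proxy(socks.SOCKS5, proxy_host, proxy_port)  # applies to this socket only
    s.settimeout(timeout)
    try:
        s.connect((host, 80))
        request = (
            "HEAD / HTTP/1.1\r\n"
            "Host: " + host + "\r\n"
            "Connection: close\r\n\r\n"
        )
        s.sendall(request.encode("ascii"))
        lines = s.recv(4096).decode("latin-1").splitlines()
        return lines[0] if lines else ""
    finally:
        s.close()


# Usage (needs PySocks and a running proxy):
# print(http_head_via_socks5("www.google.com"))
```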