Question:

I was trying to scrape a website for practice, but I keep getting HTTP Error 403. Does the site think I'm a bot?

Here is my code:

#import requests
import urllib.request
from bs4 import BeautifulSoup
#from urllib import urlopen
import re

# urlopen() is the call that raises the HTTP Error 403 shown below.
# read() must be called (not just referenced) and the bytes decoded
# (UTF-8 assumed) before the str patterns can be applied.
webpage = urllib.request.urlopen('http://www.cmegroup.com/trading/products/#sortField=oi&sortAsc=false&venues=3&page=1&cleared=1&group=1').read().decode('utf-8')
findrows = re.compile('<tr class="- banding(?:On|Off)>(.*?)</tr>')
findlink = re.compile('<a href =">(.*)</a>')

row_array = re.findall(findrows, webpage)
links = re.findall(findlink, webpage)

print(len(row_array))

iterator = []

The error I get is:

 File "C:\Python33\lib\urllib\request.py", line 160, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python33\lib\urllib\request.py", line 479, in open
    response = meth(req, response)
  File "C:\Python33\lib\urllib\request.py", line 591, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python33\lib\urllib\request.py", line 517, in error
    return self._call_chain(*args)
  File "C:\Python33\lib\urllib\request.py", line 451, in _call_chain
    result = func(*args)
  File "C:\Python33\lib\urllib\request.py", line 599, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
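
One frequent cause of a 403 like this is that urllib identifies itself with a default User-Agent of the form Python-urllib/3.x, which some sites reject. The sketch below assumes the site only checks that header (it may well apply other checks), so it is a starting point rather than a guaranteed fix. The names build_request, fetch, and USER_AGENT are illustrative and do not come from the post. Note also that the part of the URL after "#" is a fragment and is not sent to the server.

```python
import urllib.request

# Hypothetical browser-like value; any realistic User-Agent string could be used.
USER_AGENT = "Mozilla/5.0"


def build_request(url):
    """Return a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url, timeout=10):
    """Download the URL and return the decoded body (UTF-8 is assumed)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch('http://www.cmegroup.com/trading/products/') would take the place of the urlopen(...).read() line in the code above. Whether the site then accepts the request is not guaranteed.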

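Question:

I am working on an iOS project.

In this application, I am downloading images from the server.

Problem:

While downloading images, I get a request timeout. According to the documentation, the HTTP status code for a request timeout is 408.

But in my application I get HTTP status code 0, with the following error:

Error Domain=NSURLErrorDomain Code=-1001 "The request timed out." UserInfo=0xb9af710 {NSErrorFailingURLStringKey=http://xxxx.com/resources/p/PNG/1383906967_5621_63.jpg, NSErrorFailingURLKey=http://xxxx.com/resources/p/PNG/1383906967_5621_63.jpg, NSLocalizedDescription=The request timed out., NSUnderlyingError=0x13846870 "The request timed out."}

Searching the internet, I found no information about HTTP status code 0.

Can anyone explain this to me?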

Question:

In Java, this code throws an exception when the HTTP result is in the 404 range:

URL url = new URL("http://stackoverflow.com/asdf404notfound");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.getInputStream(); // throws!

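In my case, I happen to know that the response is a 404, but I'd still like to read the body of the response anyway.

(In my actual case the response code is 403, but the body of the response explains the reason for the rejection, and I'd like to display that to the user.)

How can I access the response body?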

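Question:

I am trying to automate the download of historic stock data using Python. The URL I am trying to open responds with a CSV file, but I am unable to open it using urllib2. I have tried changing the user agent as suggested in a few earlier questions, and I even tried accepting response cookies, with no luck. Can you please help?

Note: The same method works for Yahoo Finance.

Code: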

import urllib2,cookielib

site= "http://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/getHistoricalData.jsp?symbol=JPASSOCIAT&fromDate=1-JAN-2012&toDate=1-AUG-2012&datePeriod=unselected&hiddDwnld=true"

hdr = {'User-Agent':'Mozilla/5.0'}

req = urllib2.Request(site,headers=hdr)

page = urllib2.urlopen(req)

Error:

File "C:\\Python27\\lib\\urllib2.py", line 527, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden

Thanks for your assistance.
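
When a bare User-Agent is not enough, one thing worth trying is a fuller set of browser-like headers, together with reading the body of the 403 response, which sometimes states why the request was refused. The sketch below is written for Python 3 (urllib.request rather than urllib2). The header values and the names BROWSER_HEADERS and open_or_explain are assumptions for illustration, and nothing here guarantees that this particular site will accept the request.

```python
import urllib.error
import urllib.request

# Assumed, browser-like header values; adjust them to match a real browser.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.8",
    "Connection": "keep-alive",
}


def open_or_explain(url, opener=urllib.request.urlopen):
    """Return (status_code, body_bytes) for both success and HTTP errors.

    The opener parameter only exists so the function can be exercised
    without network access; by default it is urllib.request.urlopen.
    """
    request = urllib.request.Request(url, headers=BROWSER_HEADERS)
    try:
        with opener(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # HTTPError is also a file-like response, so its body can be read.
        return err.code, err.read()
```

A call such as open_or_explain(site) returns the CSV bytes on success, or the 403 status along with whatever explanation the server put in the response body.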

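Question:

I'm creating a RESTful API that will process a number of user interactions, including placing orders using stored credit cards.

In the case of a successful order, I'm returning a 200 OK, and in the case where the order request is malformed or invalid, I'm returning a 400 Bad Request. But what should I return if there is a problem during the actual processing of the order?

  1. Client POSTs an order to the server for a user resource. If the user does not exist, 404 Not Found is returned.
  2. Order format and information are validated. If not valid, 400 Bad Request is returned.
  3. Order is processed. If the order is successful, 201 Created is returned for the order. If an unexpected error is encountered, 500 Server Error is returned.

The last step is the problem: what do I return if the order doesn't complete for any other reason? Possible scenarios include:

  • Product is sold out
  • User maximum order limit reached
  • Credit card transaction failure (insufficient funds, etc.)

This doesn't seem appropriate for either a 400 or a 500. If there's no better code, I could see it as a 400, since the request was invalid according to the business rules. It just doesn't seem accurate.

Edit: I also found this existing discussion of the same topic. All of the answers there seem to point to using status codes for this type of violation, with some discussion between using 400, 409, or the 422 extension.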
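
The flow in the numbered steps can be sketched as a mapping from order outcome to status code. This is only one possible reading of the discussion mentioned in the edit: it assumes 409 Conflict for a sold-out product and 422 Unprocessable Entity for the other business-rule failures, and a different split among 400, 409, and 422 would be equally defensible. Outcome, STATUS_BY_OUTCOME, and status_for are hypothetical names.

```python
from enum import Enum, auto
from http import HTTPStatus


class Outcome(Enum):
    """Hypothetical results of handling a POSTed order."""
    USER_NOT_FOUND = auto()
    MALFORMED_REQUEST = auto()
    CREATED = auto()
    SOLD_OUT = auto()
    ORDER_LIMIT_REACHED = auto()
    PAYMENT_FAILED = auto()
    UNEXPECTED_ERROR = auto()


# Assumed mapping; the business-rule rows are a judgment call.
STATUS_BY_OUTCOME = {
    Outcome.USER_NOT_FOUND: HTTPStatus.NOT_FOUND,                   # 404
    Outcome.MALFORMED_REQUEST: HTTPStatus.BAD_REQUEST,              # 400
    Outcome.CREATED: HTTPStatus.CREATED,                            # 201
    Outcome.SOLD_OUT: HTTPStatus.CONFLICT,                          # 409
    Outcome.ORDER_LIMIT_REACHED: HTTPStatus.UNPROCESSABLE_ENTITY,   # 422
    Outcome.PAYMENT_FAILED: HTTPStatus.UNPROCESSABLE_ENTITY,        # 422
    Outcome.UNEXPECTED_ERROR: HTTPStatus.INTERNAL_SERVER_ERROR,     # 500
}


def status_for(outcome):
    """Return the HTTP status for an outcome, defaulting to 500."""
    return STATUS_BY_OUTCOME.get(outcome, HTTPStatus.INTERNAL_SERVER_ERROR)
```

Keeping the mapping in one table makes the 4xx choice for business-rule failures easy to change later, and the response body can carry a machine-readable reason such as "sold_out" alongside the status code.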
