This is my completed Python script so far. It parses a news website and organizes the companies' news into a Google Sheets file. The news is searched by keyword, and only companies with certain market caps are kept.
while True:
    import bs4 as bs
    import urllib.request
    from selenium import webdriver
    driver = webdriver.Chrome()
    driver.get('https://login.globenewswire.com/?ReturnUrl=%2fSecurity%2fLogin%3fculture%3den-US&culture=en-US#login')
    # MANUALLY DO THE LOGIN
    import pygsheets
    gc = pygsheets.authorize()
    sh = gc.open('GNW API')
    wks = sh.sheet1
    import datetime
    import time
    x = 18
    y = 26
    KeyWords = ['develop', 'contract', 'award', 'certif', 'execut', 'research', 'drug', 'theraputic',
                'pivotal', 'trial', 'patient', 'data', 'fda', 'stud', 'phase', 'licenc', 'cancer', 'agree',
                'clinical', 'acquisition', 'translational', 'trial', 'worldwide', 'world wide', 'world-wide',
                'exclusiv', 'positive', 'successful', 'enter', 'sell', 'acquir', 'buy', 'bought', 'payment',
                'availiab', 'design', 'transaction', 'increas', 'sale', 'record', 'clearance', 'right',
                'launch', 'introduc', 'payment', 'meet', 'endpoint', 'primary', 'secondary', 'major',
                'milestone', 'collaborat', 'beat', 'astound', 'sign', 'order', 'suppl', 'produc', 'made',
                'make', 'making', 'customer', 'client', 'mulitpl', 'result', 'distribut', 'disease', 'treat',
                'chmp', 'priority', 'promis', 'patent', 'purchas', 'allianc', 'strategic', 'team',
                'commercializ', 'approv', 'select', 'strong', 'strength', 'grow', 'profit', 'improv',
                'partner', 'cannabis', 'crypto', 'bitcoin', 'platform', 'expands', 'extends']
    break
while True:
    now = datetime.datetime.today()
    while now.hour == x:
        while now.minute == y:
            list = []
            listfinal = []
            driver.get('https://globenewswire.com/Search?runSearchId=41556723')
            elementals = driver.find_elements_by_class_name('post-title16px')
            for elements in elementals:
                list.append(elements.find_element_by_css_selector('a').get_attribute('href'))
            for elements in list:
                if any(KeyWords_item in elements.lower() for KeyWords_item in KeyWords):
                    listfinal.append(elements)
            for elementals in listfinal:
                sauce = urllib.request.urlopen(elementals).read()
                soup = bs.BeautifulSoup(sauce,'lxml')
                desc = soup.find_all(attrs={"name":"ticker"}, limit=1)
                tickerraw = (desc[0]['content'].encode('utf-8'))
                decodedticker = tickerraw.decode('utf')
                souptitle = soup.title.text
                while True:
                    if ', ' in decodedticker.lower():
                        finaltickerlist = decodedticker.split(', ')
                        for elements in finaltickerlist:
                            if 'nyse' in elements.lower():
                                if ':' in elements:
                                    a, b = elements.split(':')
                                    finaltickerexchange = 'NYSE'
                                    finalticker = b
                                    if ' ' in finalticker:
                                        finalticker = finalticker.replace(' ', '')
                                        break
                                    else:
                                        break
                                else:
                                    finalticker = 'NoTicker'
                                    finaltickerexchange = 'NoTicker'
                            elif 'nasdaq' in elements.lower():
                                if ':' in elements:
                                    a, b = elements.split(':')
                                    finaltickerexchange = 'NASDAQ'
                                    finalticker = b
                                    if ' ' in finalticker:
                                        finalticker = finalticker.replace(' ', '')
                                        break
                                    else:
                                        break
                                else:
                                    finalticker = 'NoTicker'
                                    finaltickerexchange = 'NoTicker'
                            elif 'tsx' in elements.lower():
                                if ':' in elements:
                                    a, b = elements.split(':')
                                    finaltickerexchange = 'TSX'
                                    finalticker = b
                                    if ' ' in finalticker:
                                        finalticker = finalticker.replace(' ', '')
                                        break
                                    else:
                                        break
                                else:
                                    finalticker = 'NoTicker'
                                    finaltickerexchange = 'NoTicker'
                            else:
                                finalticker = 'NoTicker'
                                finaltickerexchange = 'NoTicker'
                    elif 'nasdaq' in decodedticker.lower():
                        if ':' in decodedticker.lower():
                            a, b = decodedticker.split(':', maxsplit=1)
                            finalticker = b
                            if ' ' in finalticker:
                                finalticker = finalticker.replace(' ', '')
                            finaltickerexchange = 'NASDAQ'
                    elif 'nyse' in decodedticker.lower():
                        if ':' in decodedticker.lower():
                            a, b = decodedticker.split(':', maxsplit=1)
                            finalticker = b
                            if ' ' in finalticker:
                                finalticker = finalticker.replace(' ', '')
                            finaltickerexchange = 'NYSE'
                    elif 'tsx' in decodedticker.lower():
                        if ':' in decodedticker.lower():
                            a, b = decodedticker.split(':', maxsplit=1)
                            finalticker = b
                            if ' ' in finalticker:
                                finalticker = finalticker.replace(' ', '')
                            finaltickerexchange = 'TSX'
                    else:
                        finalticker = 'NoTicker'
                        finaltickerexchange = 'NoTicker'
                    break
                if finalticker != 'NoTicker':
                    sauce = urllib.request.urlopen('https://finance.yahoo.com/quote/' + finalticker + '?p=' + finalticker).read()
                    soup = bs.BeautifulSoup(sauce,'lxml')
                    mc_elm = soup.find(attrs={"data-test":"MARKET_CAP-value"})
                    while True:
                        if mc_elm:
                            marketcap = mc_elm.get_text()
                        else:
                            marketcap = "TickerNotFound"
                        break
                    while True:
                        if 'B' in marketcap:
                            marketcap = 'Billion Kalppa'
                        else:
                            values_list = ([finalticker,finaltickerexchange,marketcap,souptitle,elements])
                            wks.insert_rows(row=0, number=1, values=values_list)
                        break
            if x == 23:
                x = 0
                time.sleep(3000)
            else:
                x += 1
                time.sleep(3000)
            break
        time.sleep(55)
        break
I intend to have this program run in the background and scrape news every hour, at the 26th minute of the hour. However, after a while (not immediately; the code can run for 7-8 hours, or iterations, before this happens) I get this error:
Traceback (most recent call last):
  File "<pyshell#3>", line 93, in <module>
    sauce = urllib.request.urlopen('https://finance.yahoo.com/quote/' + finalticker + '?p=' + finalticker).read()
  File "C:\Users\Arbi717\AppData\Local\Programs\Python\Python36-32\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\Arbi717\AppData\Local\Programs\Python\Python36-32\lib\urllib\request.py", line 532, in open
    response = meth(req, response)
  File "C:\Users\Arbi717\AppData\Local\Programs\Python\Python36-32\lib\urllib\request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Users\Arbi717\AppData\Local\Programs\Python\Python36-32\lib\urllib\request.py", line 570, in error
    return self._call_chain(*args)
  File "C:\Users\Arbi717\AppData\Local\Programs\Python\Python36-32\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Users\Arbi717\AppData\Local\Programs\Python\Python36-32\lib\urllib\request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable
I believe urllib.request is the problem because it appears throughout the traceback, but I have no idea what the solution is. Any help is much appreciated.
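One way to keep a long-running script like this alive is to treat a 503 from the quote page as "skip this ticker for now" instead of letting the exception end the loop. The following is only a minimal sketch under that assumption: `fetch_or_none` and its `opener` parameter are hypothetical names introduced here, not part of the original script.

```python
import urllib.error
import urllib.request


def fetch_or_none(url, opener=urllib.request.urlopen, timeout=10):
    """Return the response body, or None if the server answers 503.

    `opener` is injectable only so the function can be exercised
    without touching the network; by default it is urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 503:
            # Temporary server-side refusal: skip this URL for now.
            return None
        raise  # any other HTTP error is still reported to the caller
```

In the script above, the Yahoo lookup could then become `sauce = fetch_or_none(url)` followed by `if sauce is None: continue`, so one unavailable page does not stop the hourly run.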
#1 korj__33, 15.12.2015, 23:31
I'm using
Python version 3.5. The catch is that I pass various sites into this construct and, most of the time, the response is 200, but the one I need answers with 503. Yet it opens in the browser without any problems. Please tell me what is causing this and how to fix it. Thanks.
#2 alex925, 15.12.2015, 23:42
Which site is it? All the psychics are on vacation, so you'll have to post more details.
#3 korj__33 [OP], 15.12.2015, 23:48
#4 alex925, 15.12.2015, 23:53
They say openly there that they defend against DDoS in every way they can. So most likely they simply detect that you are a bot, and that's all.
#5 korj__33 [OP], 15.12.2015, 23:57
OK, thanks for the info! It's just that I'm a complete novice in this area. Could you throw together a small example
swapping in some real header substitutions? Maybe then it will click.
#6 alex925, 16.12.2015, 00:06
#7 korj__33 [OP], 16.12.2015, 00:15
Thanks, I'll keep digging.
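The header substitution requested in the thread can be sketched roughly as follows. By default `urllib.request` identifies itself with a `Python-urllib/3.x` User-Agent, which some sites reject; sending browser-like headers may avoid that, though nothing guarantees a given site will accept them. The header values and the `build_request` helper below are illustrative assumptions, not the code originally posted in the thread.

```python
import urllib.request

# Assumed, illustrative header values; any realistic browser string would do.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Wrap url in a Request that carries browser-like headers."""
    return urllib.request.Request(url, headers=headers)
```

The request is then passed to `urllib.request.urlopen(build_request(url))` in place of the bare URL string.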
Errors happen: there's some unexpected maintenance, a bug that went unnoticed, or a page goes viral and the flood of connections takes the server down.
If you’ve been online for any amount of time, no doubt you’ve seen the somewhat vague 503 Service Unavailable error.
In this article we’ll go over HTTP status codes, what the 503 error means, and some possible ways to solve it – both for a site you’re trying to visit and for your own site.
An overview of HTTP status codes
Servers that host web pages listen for requests from web browsers or devices, also known as clients. The server then uses a bunch of different status codes to communicate back.
These status codes are organized into different classes, which are indicated by the first digit of the status code:
- 1xx: Information – the server is still processing the request
- 2xx: Success – the request succeeded and the server responds with the page or resource
- 3xx: Redirection – the page or resource has moved and the server will respond with its new location
- 4xx: Client error – there is an error in the request from the browser or device
- 5xx: Server error – there is an error with the server
The last two digits of each HTTP status code represent a more specific status for each class. For example, 301 means that a page or resource has moved permanently, while 302 means the move is temporary.
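As a small illustration of that structure, the sketch below derives the class from the first digit and looks up the specific meaning in Python's standard `http.HTTPStatus` enum. The `describe` helper and the `CLASSES` table are names made up for this example.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code):
    """Return a one-line description such as '503 Service Unavailable (Server error)'."""
    status_class = CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one of the registered statuses Python knows about.
        phrase = "Unrecognized"
    return f"{code} {phrase} ({status_class})"


print(describe(301))
print(describe(302))
print(describe(503))
```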
Check out this page for a list of common HTTP status codes and their meaning: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
Most status codes go by totally unnoticed, which is fine because it means everything is working. It's only when you get to the 4xx-5xx range that you might notice a status code, because you'll see an error page instead of the content you expected.
Now that you have a basic understanding of HTTP status codes, let’s dig a bit deeper into the 503 Service Unavailable error.
What does the 503 error code mean?
As mentioned above, 5xx status codes mean there’s a problem with the server itself.
A 503 Service Unavailable error means that the page or resource is unavailable. There are many reasons why a server might return a 503 error, but some common reasons are maintenance, a bug in the server’s code, or a sudden spike in traffic that causes the server to become overwhelmed.
The message that's sent with the 503 error can vary depending on the server it's coming from, but here are some of the common ones you'll see:
- 503 Service Unavailable
- 503 Service Temporarily Unavailable
- HTTP Server Error 503
- HTTP Error 503
- Error 503 Service Unavailable
- The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
Whatever the reason for the 503 error, it’s usually temporary – the server will restart, traffic will die down, and the issue will resolve itself.
How to solve the 503 Service Unavailable error
When trying to solve a 503 error, there are two general camps.
The first is where you’re an end user, and you’re trying to visit a site that you don’t own. In the second, you own the site, and it’s throwing 503 errors to people who are trying to visit.
The method to solve 503 errors is different depending on which group you fall into. Let’s take a look at some things you can do as an end user if you see a 503 error.
How to solve a 503 Service Unavailable error as an end user
Since 5xx status codes mean that the error is on the server-side, there isn’t a lot you can do directly.
Even though 503 errors are usually temporary, there are some things you can do while you wait.
#1: Refresh the page
Sometimes the error is so temporary that a simple refresh is all it takes. With the page open, just press Ctrl+R on Windows and Linux, or Cmd+R on macOS to refresh the page.
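For a script rather than a browser, the equivalent of refreshing is to retry the request after a pause. Below is a minimal sketch that retries only on 503 and doubles the wait each time; `retry_on_503` and its parameters are hypothetical names, and the delays are arbitrary starting points rather than recommended values.

```python
import time
import urllib.error


def retry_on_503(action, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call action() and retry on HTTP 503, waiting 1x, 2x, 4x... base_delay.

    Any other error, or a 503 on the final attempt, is re-raised.
    `sleep` is injectable so the waits can be observed without real delays.
    """
    for attempt in range(attempts):
        try:
            return action()
        except urllib.error.HTTPError as err:
            last_try = attempt == attempts - 1
            if err.code != 503 or last_try:
                raise
            sleep(base_delay * 2 ** attempt)
```

A call might look like `retry_on_503(lambda: urllib.request.urlopen(url).read())`, with `urllib.request` imported and `url` standing for whatever page is being fetched.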
#2: See if the page is down for other people
The next thing you can do is use a service like Is It Down Right Now? or Down For Everyone Or Just Me to see if other people are getting the same error.
Just go to either of those sites and enter in the URL for the page you’re trying to visit.
The service will ping the URL you entered to see if it gets a response. Then it'll show you some stats and graphs about the page.
If you scroll down a bit you’ll see some comments from other people. Often people will give their general location and other data, so this can be a good way to determine if the error is just affecting certain regions or specific devices.
#3: Restart your router
Sometimes the issue has to do with a DNS server failure.
DNS stands for Domain Name System. DNS servers basically act as translators between IP addresses and human-readable domain names.
For example, you can visit Google by entering one of its IP addresses directly (172.217.25.206), or you can just enter the domain name, www.google.com.
It's a DNS server that handles all of that translation behind the scenes.
All of that is to say, many routers cache responses from DNS servers (www.google.com <==> 172.217.25.206). But sometimes this cache can get corrupted and cause errors.
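To see which addresses a name currently resolves to on your machine, you can ask the system resolver directly. This is only a small sketch using the standard `socket` module; the `resolve` helper is a name invented for this example, and the addresses returned for a real site will vary by location and over time.

```python
import socket


def resolve(hostname):
    """Return the sorted IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("www.google.com")` before and after restarting the router is one way to check whether a stale or corrupted answer was being served.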
An easy way to reset or "flush" the cache is to restart your router. Just unplug your router for about 5 seconds, then plug it back in again.
It should restart after a minute and all of your devices should reconnect automatically. Once they do, try visiting the site again.
How to solve a 503 Service Unavailable error as the site's owner
If you are the owner/developer of the site that’s returning 503 errors, there’s a bit more you can do to diagnose and resolve the issue.
Here are some general tips to get you started:
#1: Restart the server
Development is tough – even a simple static page can have so many moving parts that it can be difficult to pin down what’s causing the 503 error.
Sometimes the best thing to do is to restart the server and see if that fixes the issue.
The exact method of restarting your server can vary, but usually you can access it from your provider’s dashboard or by SSH’ing into the server and running a restart command.
The server should restart after a couple of minutes. If you’ve configured everything to run automatically on boot, you can visit your site and see if it’s working.
#2: Check the server logs
The next thing to do is check the logs.
The location of the server logs can vary depending on what service you're running, but they're often found in /var/log/... .
Take a look around that directory and see if you can find anything. If not, check the manual for your programs by running man program_name.
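Once you have found an access log, a short script can show which paths are returning 503. The sketch below assumes the log uses the Common Log Format that Apache and Nginx write by default; the sample lines, the `LINE` pattern, and the `count_503` helper are all made up for illustration.

```python
import re
from collections import Counter

# Captures the request path and the status code from a Common Log Format line.
LINE = re.compile(r'"(?:[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_503(lines):
    """Count 503 responses per request path."""
    hits = Counter()
    for line in lines:
        match = LINE.search(line)
        if match and match.group("status") == "503":
            hits[match.group("path")] += 1
    return hits


# Invented sample data standing in for a real access log.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /quote/ABC HTTP/1.1" 503 312',
    '203.0.113.5 - - [10/Oct/2023:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /quote/ABC HTTP/1.1" 503 312',
]

for path, count in count_503(SAMPLE).most_common():
    print(path, count)
```

For a real log, pass an open file object instead of `SAMPLE`, since the function only needs an iterable of lines.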
#3: Check if there’s ongoing automated maintenance
Some service providers offer automated package updates and maintenance. Normally this is a good thing – they usually occur during downtime, and help make sure everything is up-to-date.
Occasionally 503 errors are due to these scheduled maintenance sessions.
For example, some hosting providers that specialize in WordPress hosting automatically update WP whenever there’s a new release. WordPress automatically returns a 503 Service Unavailable error whenever it’s being updated.
Check with your service providers to see if the 503 error is being caused by scheduled maintenance.
#4: Check your server’s firewall settings
Sometimes 503 Service Unavailable errors are caused by a misconfigured firewall where connections can get through, but fail to get back out to the client.
Your firewall might also need special settings for a CDN, where multiple connections from a small handful of IP addresses might be misinterpreted as a DDoS attack.
The exact method of adjusting your firewall’s settings depends on a lot of factors. Take a look at your pipeline and your service provider’s dashboards to see where you can configure the firewall.
#5: Check the code
Bugs, like errors, happen. Try as you might, it’s impossible to catch them all. Occasionally one might slip through and cause a 503 error.
If you’ve tried everything else and your site is still showing a 503 Service Unavailable error, the cause might be somewhere in the code.
Check any server-side code, and pay special attention to anything having to do with regular expressions – a small regex bug is what caused a huge spike in CPU usage, rolling outages, and about three days of panic for us at freeCodeCamp.
Hopefully you’ll be able to track down the culprit, deploy a fix, and everything will be back to normal.
In summary
That should be everything you need to know about 503 Service Unavailable errors. While there’s usually not much you can do when you see a 503 error, hopefully some of these steps will help the next time you encounter one.
Stay safe, and happy refreshing-until-it-works
The Hypertext Transfer Protocol (HTTP) 503 Service Unavailable server error response code indicates that the server is not ready to handle the request. HTTPS uses the same code for the same reason. In this tutorial we will examine the causes of the 503 error code, along with client-side and server-side solutions.
503 Expressions
The HTTP 503 code can be worded slightly differently depending on the web server, such as Apache, IIS, lighttpd, or Nginx.
503 Service Unavailable
503 Service Temporarily Unavailable
Http/1.1 Service Unavailable
HTTP Server Error 503
Service Unavailable - DNS Failure
503 Error
HTTP 503
HTTP Error 503
Error 503 Service Unavailable
Reasons
This error code simply means Service Unavailable: the server cannot handle the request and respond to it properly. Here is a list of causes of the HTTP 503 error.
- The web server is being updated
- There is a bug in the server software
- There is a bug in the web application
- The request does not comply with the request filter
- The server is receiving more requests than it can handle at the same time
- There is a DDoS attack on the web server
- The client cache is poisoned with improper data
Client or Browser Solutions
The error mainly originates on the server side, but there are a few steps worth trying on the client side.
- Use a different browser, since some browsers can send improper requests.
- Clear the browser cache, since poisoned data can be retrieved from the cache.
Server Side Solutions
The error mainly originates on the server side, where there is a lot we can do to solve a 503 error. In some cases we may need to apply several of the following solutions.
- Restart the web server service
- Reload the web application
- Examine the server logs
- Check DNS server
- Increase concurrent request limit of web server
- Increase bandwidth of network connection
- Check the application logic related to the URL
Programming Language and Frameworks Code References
In some cases we may want to send a 503 code in response to a client's HTTP request. This can be done easily with the constants already defined in programming languages and frameworks.
:service_unavailable
Go HTTP 503 Status Code
http.StatusServiceUnavailable
Symfony HTTP 503 Status Code
Response::HTTP_SERVICE_UNAVAILABLE
Python2 HTTP 503 Status Code
httplib.SERVICE_UNAVAILABLE
Python3 HTTP 503 Status Code
http.client.SERVICE_UNAVAILABLE
Python 3.5+ HTTP 503 Status Code
http.HTTPStatus.SERVICE_UNAVAILABLE
PHP HTTP 503 Status Code
StatusCodes::httpHeaderFor(503)
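As an illustration of using one of these constants, here is a minimal sketch of a maintenance-mode server built on Python's standard `http.server` that answers every GET with `http.HTTPStatus.SERVICE_UNAVAILABLE`. The `MaintenanceHandler` and `make_server` names, the message text, and the optional `Retry-After` value are assumptions made for this example.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every GET with 503 while a (hypothetical) site is being updated."""

    def do_GET(self):
        body = b"Service temporarily unavailable. Please try again later.\n"
        self.send_response(HTTPStatus.SERVICE_UNAVAILABLE)
        # Optional standard header: suggests waiting 120 seconds before retrying.
        self.send_header("Retry-After", "120")
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def make_server(port=0):
    """Bind to localhost; port 0 lets the operating system pick a free port."""
    return HTTPServer(("127.0.0.1", port), MaintenanceHandler)
```

Running `make_server(8000).serve_forever()` would serve the 503 response on port 8000 until interrupted.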