Possible way to fix 403 Error #62

Open
KostyaCholak opened this issue Oct 21, 2022 · 14 comments
Assignees
Labels
bug Something isn't working

Comments

@KostyaCholak

Hi, @alvarobartt!
I ran into the 403 error today and found that curl works fine, with no 403 error.
The only difference I can see is the header ordering: requests shuffles the headers, while curl preserves them as provided.
So I tried urllib.request, and it worked.

I'm using Python 3.10.5

Maybe this can solve all 403 errors in the project?

minimal working example:

import urllib.request

# Take these from your browser; no cookies are required.
headers = {}

url = 'https://sbcharts.investing.com/events_charts/us/222.json'
# Note: passing a body (even b"") makes urllib send a POST request.
req = urllib.request.Request(url, b"", headers)
with urllib.request.urlopen(req) as response:
    body = response.read().decode()
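
For anyone who wants to check what urllib would send before hitting the server, here is a minimal offline sketch. The header values and the `build_request` helper are placeholders made up for illustration, not something from investiny; it only builds the request objects and prints the method and the headers as urllib stores them.

```python
import urllib.request

# Hypothetical placeholder headers; replace them with the ones your browser sends.
EXAMPLE_HEADERS = {
    "user-agent": "Mozilla/5.0 (placeholder)",
    "referer": "https://www.investing.com/",
}

URL = "https://sbcharts.investing.com/events_charts/us/222.json"


def build_request(url, headers, data=None):
    """Build a urllib Request object without sending it."""
    return urllib.request.Request(url, data=data, headers=headers)


# Same shape as the example above: an empty body is still a body.
post_req = build_request(URL, EXAMPLE_HEADERS, data=b"")
# Without a body, urllib falls back to its default method.
get_req = build_request(URL, EXAMPLE_HEADERS)

print(post_req.get_method())    # method used when data is given
print(get_req.get_method())     # method used when data is None
print(post_req.header_items())  # header names and values as urllib stores them
```

Nothing here touches the network, so it is safe to run even from a blocked IP. It does not prove anything about header ordering on the wire; it only shows what the request object contains.
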
@RyuuOujiXS

#56

@alvarobartt
Owner

So I've just tested investiny and it seems to be working fine again... I assume their Cloudflare has some limitations but it's not blacklisting every IP forever, just after a certain number of requests...

Look:

(investiny-py3.9) alvarobartt@Alvaros-MacBook-Air investiny % poetry run python
Python 3.9.6 (default, Aug  5 2022, 15:21:02) 
[Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from investiny import historical_data
>>> d = historical_data(investing_id=6408)
>>> d
{'date': ['09/26/2022', '09/27/2022', '09/28/2022', '09/29/2022', '09/30/2022', '10/03/2022', '10/04/2022', '10/05/2022', '10/06/2022', '10/07/2022', '10/10/2022', '10/11/2022', '10/12/2022', '10/13/2022', '10/14/2022', '10/17/2022', '10/18/2022', '10/19/2022', '10/20/2022', '10/21/2022'], 'open': [149.66000366211, 152.74000549316, 147.63999938965, 146.10000610352, 141.2799987793, 138.21000671387, 145.0299987793, 144.07499694824, 145.80999755859, 142.53999328613, 140.41999816895, 139.89999389648, 139.13000488281, 134.99000549316, 144.30999755859, 141.06500244141, 145.49000549316, 141.69000244141, 143.02000427246, 142.96000671387], 'high': [153.7700958252, 154.7200012207, 150.64140319824, 146.7200012207, 143.10000610352, 143.07000732422, 146.2200012207, 147.38000488281, 147.53999328613, 143.10000610352, 141.88999938965, 141.35000610352, 140.36000061035, 143.58999633789, 144.52000427246, 142.89999389648, 146.69999694824, 144.94920349121, 145.88999938965, 147.83999633789], 'low': [149.63999938965, 149.94500732422, 144.83999633789, 140.67999267578, 138, 137.68499755859, 144.25999450684, 143.00999450684, 145.2200012207, 139.44500732422, 138.57290649414, 138.2200012207, 138.16000366211, 134.36999511719, 138.19000244141, 140.27000427246, 140.61000061035, 141.5, 142.64999389648, 142.67999267578], 'close': [150.77000427246, 151.75999450684, 149.83999633789, 142.47999572754, 138.19999694824, 142.44999694824, 146.10000610352, 146.39999389648, 145.42999267578, 140.08999633789, 140.41999816895, 138.97999572754, 138.33999633789, 142.99000549316, 138.38000488281, 142.41000366211, 143.75, 143.86000061035, 143.38999938965, 147.27000427246], 'volume': [93339000, 84443000, 146691008, 128138000, 124925000, 114312000, 87134000, 79148000, 68402000, 85926000, 74591000, 77034000, 69833000, 112876000, 88237000, 84684000, 98716000, 61758000, 64277000, 85641896]}

@alvarobartt alvarobartt self-assigned this Oct 23, 2022
@alvarobartt alvarobartt added the invalid This doesn't seem right label Oct 23, 2022
@KostyaCholak
Author

So what is the problem with this solution @alvarobartt ? It works for me without any limitations from Cloudflare. Did it stop working for you after some time?

Seems like a reliable solution to me

@alvarobartt
Owner

So what is the problem with this solution @alvarobartt ? It works for me without any limitations from Cloudflare. Did it stop working for you after some time?

Seems like a reliable solution to me

Not at all, I was about to test it when I realized that investiny was working fine with the current solution!

Is the current implementation not working on your side? Can you run some stress tests against Investing.com to see whether you end up getting HTTP 403 or not? Thanks 😄
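
A stress test like the one requested could look roughly like the sketch below. `stress_test` and `urllib_status` are hypothetical helper names, not part of investiny, and the attempt count and delay are arbitrary; the idea is just to repeat the request and tally the status codes.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def urllib_status(url, headers):
    """Return the HTTP status code for one request (this does hit the network)."""
    req = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is on the exception.
        return err.code


def stress_test(fetch_status, attempts=20, delay=0.0):
    """Call fetch_status() repeatedly and tally the status codes it returns."""
    counts = Counter()
    for _ in range(attempts):
        counts[fetch_status()] += 1
        if delay:
            time.sleep(delay)  # be polite between requests
    return counts


# Example usage (commented out so nothing is sent on import):
# url = "https://sbcharts.investing.com/events_charts/us/222.json"
# print(stress_test(lambda: urllib_status(url, headers={}), attempts=50, delay=1.0))
```

Keeping the fetch function separate means the same loop can be pointed at investiny, httpx, or urllib to compare how often each one gets a 403.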

@alvarobartt alvarobartt added bug Something isn't working and removed invalid This doesn't seem right labels Oct 24, 2022
@alvarobartt
Owner

Oops, maybe the invalid label was confusing @KostyaCholak, I meant that it was related to something not valid e.g. the current implementation, not that your solution was not valid 👍🏻 I've updated the label to be more clear!

@gbonariva

gbonariva commented Oct 24, 2022

 >>> from investiny import historical_data
 >>> d = historical_data(investing_id=6408)
 >>> d

ConnectionError: Request to Investing.com API failed with error code: 403.

@alvarobartt, no need for stress tests: I got it on the first attempt.

The only reliable solution for now seems to be the one posted here: alvarobartt/investpy#611 (comment)

@alvarobartt
Owner

So your solution failed too? Or just the default investiny?

That seems to be a solution, yes, but I prefer to wait until I get a response from Investing.com, as I want to approach this the best way possible, but thanks for mentioning it again 😄

@alvarobartt
Owner

alvarobartt commented Oct 24, 2022

@KostyaCholak see this, launched right now, and working fine:

Screenshot 2022-10-24 at 20 26 13

So maybe you are blocked or something, because it's working fine for me... both using investiny and plain httpx as shown in the screenshot above.

P.S. I'll be attaching the code in the Jupyter Notebook here so that you can reproduce it!

import httpx

headers = {
    "Content-Type": "application/json",
    "Origin": "https://tvc-invdn-com.investing.com",
    "Host": "tvc4.investing.com",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15",
    "Referer": "https://tvc-invdn-com.investing.com/",
    "Connection": "keep-alive",
}

url = "https://tvc4.investing.com/8f72bb4be70f8f06f5bad539977ee7ce/1666473162/1/1/8/history?symbol=6408&resolution=30&from=1663881169&to=1666473229"

r = httpx.get(url, headers=headers)
print(r)
print(r.json())
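
As a side note for anyone reproducing this, the query string of that URL can be inspected offline. The sketch below assumes that `from` and `to` are Unix timestamps in seconds, which is a guess based on their values and not something Investing.com documents here; `describe_history_url` is a made-up helper name, and the meaning of the path segments is left alone.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

URL = (
    "https://tvc4.investing.com/8f72bb4be70f8f06f5bad539977ee7ce/1666473162/1/1/8/"
    "history?symbol=6408&resolution=30&from=1663881169&to=1666473229"
)


def describe_history_url(url):
    """Split the query string and convert the time bounds to UTC datetimes."""
    query = parse_qs(urlsplit(url).query)
    params = {key: values[0] for key, values in query.items()}
    # Assumption: "from" and "to" are Unix timestamps in seconds.
    for key in ("from", "to"):
        params[key] = datetime.fromtimestamp(int(params[key]), tz=timezone.utc)
    return params


info = describe_history_url(URL)
for name, value in info.items():
    print(name, value)
```

If the assumption holds, the request above asks for roughly one month of data for symbol 6408.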

@alvarobartt
Owner

And same thing if I run investiny's unit tests with poetry run make tests

image

@KostyaCholak
Author

Oops, maybe the invalid label was confusing @KostyaCholak, I meant that it was related to something not valid e.g. the current implementation, not that your solution was not valid 👍🏻 I've updated the label to be more clear!

yes, thanks)

@KostyaCholak
Author

Just tried the latest version and the result is the same. Should I try the master branch?
But the urllib version works somehow.

Screenshot 2022-11-08 at 21 19 29

Screenshot 2022-11-08 at 21 21 59

@KostyaCholak
Author

Also tried httpx, got 403

Screenshot 2022-11-08 at 23 48 02

@alvarobartt
Owner

@KostyaCholak okay, let me stress test it so that I also get HTTP 403 and can reproduce it, then I'll tell you! Also, could you paste the headers you're using for the request here, or send them to me via email? Copy-pasting them from the browser won't work as a solution if that cannot be automated 😩

@KostyaCholak
Author

headers = {
    'authority': 'sbcharts.investing.com',
    'accept': 'application/json, text/javascript, */*; q=0.01',
    'cache-control': 'no-cache',
    'origin': 'https://www.investing.com',
    'pragma': 'no-cache',
    'referer': 'https://www.investing.com/',
    'sec-ch-ua': '"Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"macOS"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-site',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36',
}
