Using Proxies with Python Requests: A Step-by-Step Guide

Written by: Robin Geuens

In this post, you'll learn how to use proxies alongside the Python requests module.

Prerequisites

Before we delve into the nitty-gritty, we need to talk about the prerequisites. If you're already comfortable with Python, you're all set for this ride. Otherwise, you might want to brush up your Python skills first.

To kick things off, we need the Python requests library, an indispensable tool for making HTTP requests. You can grab it using Python's package installer, pip, like this:


pip install requests
 

With the requests library now in our toolkit, let's forge ahead and configure proxies with Python requests.

Setting Up Proxies with Python Requests

Working with proxies involves creating a dictionary of proxies for different protocols. This dictionary will come into play when we pass the proxy address into the requests.get() method. Here's an example:


proxies = {   
  'http': 'http://10.10.1.10:3128',   
  'https': 'http://10.10.1.10:1080',   
}  

And here's how you pass that dictionary to the requests.get() method (don't forget to import requests first):


import requests

requests.get('http://example.org', proxies=proxies)

Now, let's take a moment to differentiate between HTTP, HTTPS, and SOCKS proxies. HTTP and HTTPS proxies are the go-to choices for web traffic, while SOCKS proxies operate at a lower level and can tunnel almost any kind of TCP traffic, offering more versatility at the cost of extra setup.

Unfortunately, the requests module doesn't support SOCKS proxies out of the box. To use them, you need to install requests with its socks extra, which pulls in the PySocks package. You can install it using pip, just like we did with the requests library:

pip install requests[socks]  

After installing requests[socks], you can configure SOCKS proxies like this:


proxies = {   
    'http': 'socks5://user:pass@host:port',   
    'https': 'socks5://user:pass@host:port'   
}  
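One detail worth knowing: with a socks5:// URL, the DNS lookup for the target site happens on your machine, while socks5h:// asks the proxy server to resolve hostnames for you, which also keeps the domains you visit out of your local resolver. Here's a small sketch; the helper function name and credentials are just for illustration:

```python
def socks_proxies(user, password, host, port, remote_dns=True):
    # 'socks5h' resolves hostnames on the proxy; 'socks5' resolves locally
    scheme = "socks5h" if remote_dns else "socks5"
    url = f"{scheme}://{user}:{password}@{host}:{port}"
    return {"http": url, "https": url}

print(socks_proxies("user", "pass", "198.51.100.7", 1080))
```

The returned dictionary drops straight into the proxies argument of requests.get().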

Managing Proxy Authentication

Proxies serve as intermediaries between your local network and the internet. Some proxy servers, in their quest to ensure top-notch security, require users to authenticate themselves with a username and password. This process, known as proxy authentication, is crucial in preventing unauthorized access.

If you're using Python's requests module to make HTTP requests, there may be times when you need to pass your proxy credentials (username and password) along with the request. If you're unsure how to go about it, you're in the right place. Let's explore this in detail.

Imagine you have a proxy with the following details: IP - 192.0.2.0, port - 8080, username - myuser, and password - mypassword. You can send a GET request using the get() method and pass along your proxy details in this manner:


import requests   
   
proxies = {   
    "http": "http://myuser:mypassword@192.0.2.0:8080",   
    "https": "http://myuser:mypassword@192.0.2.0:8080"   
}   
   
response = requests.get("http://example.com", proxies=proxies)  
  
print(response.text)   

The script you just saw sends a GET request to http://example.com via your specified proxy, using your provided username and password for authentication.
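One gotcha: if your proxy password contains characters like @ or :, the URL above breaks, because those characters have special meaning in URLs. A small helper (build_proxy_url is our own name, not part of requests) can percent-encode the credentials first:

```python
from urllib.parse import quote

def build_proxy_url(user, password, host, port, scheme="http"):
    # Percent-encode credentials so characters like '@' or ':' in a
    # password don't break the URL structure
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"

proxy_url = build_proxy_url("myuser", "p@ss:word", "192.0.2.0", 8080)
print(proxy_url)  # http://myuser:p%40ss%3Aword@192.0.2.0:8080
```

You can then use proxy_url as the value for both the "http" and "https" keys in your proxies dictionary.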

Leveraging Environment Variables

Hardcoding your own proxy server details into your script might not always be the best approach, especially if you're looking for a more flexible solution. That's where environment variables come into play.

Python allows you to use environment variables to configure proxies for your requests. Set them in your shell before running your script (if you'd rather keep them in an .env file, load it with a package such as python-dotenv, since os.getenv only reads the process environment):


export HTTP_PROXY=http://myuser:mypassword@192.0.2.0:8080
export HTTPS_PROXY=http://myuser:mypassword@192.0.2.0:8080

Afterwards, you can access these environment variables using the os module in Python:


import os   
import requests   
   
http_proxy = os.getenv("HTTP_PROXY")   
https_proxy = os.getenv("HTTPS_PROXY")   
   
proxies = {   
    "http": http_proxy,   
    "https": https_proxy   
}   
   
response = requests.get("http://example.com", proxies=proxies)  
  
print(response.text)   
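A handy detail: requests picks up HTTP_PROXY and HTTPS_PROXY from the environment automatically (trust_env is enabled by default on sessions), so the os.getenv() calls above are optional. You can check what proxy settings your environment exposes using the standard library; the credentials below are placeholders:

```python
import os
import urllib.request

# Placeholder credentials, set here for demonstration; normally these
# would already be exported in your shell
os.environ["HTTP_PROXY"] = "http://myuser:mypassword@192.0.2.0:8080"
os.environ["HTTPS_PROXY"] = "http://myuser:mypassword@192.0.2.0:8080"

# getproxies() reports the proxy configuration visible to Python
print(urllib.request.getproxies()["http"])
```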

Using Sessions Alongside Python Requests and Proxies

Ever wondered how to persist certain parameters across multiple requests when interacting with a web application? Enter sessions! In Python, sessions are a powerful tool that can help maintain your cookies, headers, and proxies, saving you the hassle of redefining them with every request.

Think of a session as a 'hangout period' between a user and a web application. In the realm of Python requests, sessions are a godsend. Why? Because they allow you to persist parameters like cookies, headers, and proxies across requests. This comes in especially handy when you're making multiple requests to the same server.

To kick things off, you'll need to create a session object by instantiating the requests.Session class. Here's how it's done:


import requests  
  
session = requests.Session()  

Once you have a session up and running, it's time to configure your proxies. This is achieved using the .proxies attribute:


session.proxies = {  
    "http": "http://myuser:mypassword@192.0.2.0:8080",  
    "https": "http://myuser:mypassword@192.0.2.0:8080"  
}  

With the above code, all requests made using this session will automatically use the proxies you've specified.

Now that your proxies are set, let's make a GET request using this session:


response = session.get("http://example.com")  
print(response.text)  
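As a side note, sessions also work as context managers, which closes the underlying connections for you when the block ends. The proxy details below are placeholders:

```python
import requests

# Sessions support the with statement; any proxies set on the session
# apply to every request made inside the block
with requests.Session() as session:
    session.proxies = {
        "http": "http://myuser:mypassword@192.0.2.0:8080",
        "https": "http://myuser:mypassword@192.0.2.0:8080",
    }
    print(session.proxies["https"])
```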
 

Using Rotating Proxies in Python Requests

When you're neck-deep in HTTP requests using Python, you've probably bumped into some roadblocks such as blocked proxy IP addresses, bans, captchas, and pesky rate limits. These can be a real pain, slowing down or even bringing your web scraping or data gathering activities to a complete standstill.  That's where rotating proxies come in handy.

Rotating proxies are IP addresses that take turns, or 'rotate', after a set period or a certain number of requests. This strategy spreads your requests across many addresses, making them seem more natural and less likely to trip defenses like IP bans or rate limits. Using proxy types like residential or mobile proxies makes your traffic look even more like that of a regular user.

There's more than one way to skin a cat when it comes to rotating proxies. You can opt for free proxies, which are available on various websites. However, these tend to be unreliable and sluggish. For a more robust and efficient solution, consider using a proxy provider like SOAX.

Here's how you can get rotating proxies to play nice with Python requests:

1. Use a List of Proxies
You can throw a list of proxies at the requests.get() method and use the random.choice() function to cherry-pick a random proxy for each request. Here's a basic example:


import requests  
import random  
   
# List of proxies (replace with real proxy URLs)  
proxies = ['http://192.0.2.1:8080', 'http://192.0.2.2:8080', 'http://192.0.2.3:8080']  
   
# Randomly select a proxy  
proxy = random.choice(proxies)  
   
# Make a request  
response = requests.get('https://example.com',
                        proxies={"http": proxy, "https": proxy}
                        )  

This method is as easy as pie. However, the quality and speed of your proxies hang on the sources you've picked.
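If you'd rather give every proxy an even share of requests instead of picking at random, itertools.cycle from the standard library walks the list round-robin. The proxy URLs here are placeholders:

```python
import itertools

# Placeholder proxy URLs; swap in your own
proxy_list = [
    "http://192.0.2.1:8080",
    "http://192.0.2.2:8080",
    "http://192.0.2.3:8080",
]

# cycle() repeats the list endlessly, so each next() call hands out
# the following proxy in order
proxy_pool = itertools.cycle(proxy_list)

for _ in range(4):
    proxy = next(proxy_pool)
    print(proxy)
    # response = requests.get("https://example.com",
    #                         proxies={"http": proxy, "https": proxy})
```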

2. Use a Premium Proxy Service
For a smoother ride, use a premium proxy service like SOAX that takes care of proxy rotation for you. Here's how to do it:


import requests   
   
# Using a SOAX proxy with a test username and password  
proxies = {    
    "http": "http://0YFEkZzfrwBX4Wfp:wifi;pl;@proxy.soax.com:9000",    
    "https": "http://0YFEkZzfrwBX4Wfp:wifi;pl;@proxy.soax.com:9000"    
}     
   
# Make a request   
response = requests.get('https://example.com', proxies=proxies)   
  
print(response.text)  

This method helps ensure you're using high-quality proxies that are less likely to get blacklisted.

Handling Proxy Errors and Exceptions in Python

Working with proxies and Python requests, you're bound to come across a few hiccups like ProxyError or ConnectionError. But don't fret! You can elegantly handle these exceptions using try-except blocks. Let's dive into how you can do this.

Here's an example of how you can use a try-except block to handle proxy errors:


import requests
import random

# List of proxies to choose from. Replace these with your own list.
PROXIES = [
    {'http': 'http://192.168.0.1:8080', 'https': 'https://192.168.0.1:8080'},
    {'http': 'http://192.168.0.2:8080', 'https': 'https://192.168.0.2:8080'},
    {'http': 'http://192.168.0.3:8080', 'https': 'https://192.168.0.3:8080'},
    # Add more proxies here...
]

# Number of retries
MAX_RETRIES = 5

def fetch_content():
    retries = 0
    while retries < MAX_RETRIES:
        # Pick a random proxy
        proxy = random.choice(PROXIES)

        try:
            # Make the request
            response = requests.get("http://www.google.com", proxies=proxy, timeout=5)
            
            # Check if the request was successful
            if response.status_code == 200:
                print(f"Successfully fetched content using proxy {proxy}")
                print(response.text[:100])  # Print the first 100 characters of the response
                return
            else:
                print(f"Received unexpected status code {response.status_code} using proxy {proxy}")
        
        except requests.RequestException as e:
            print(f"An error occurred while using proxy {proxy}: {e}")
        
        retries += 1
        print(f"Retrying... ({retries}/{MAX_RETRIES})")

    print("Max retries reached. Exiting.")

if __name__ == "__main__":
    fetch_content()

 

The fetch_content function tries to fetch the content of Google's homepage through a randomly chosen proxy. If the request fails for any reason, it catches the exception and retries with another randomly chosen proxy, up to a maximum number of retries (MAX_RETRIES).

By using this approach, your script will continue to run even if a proxy fails. This is critical when using free proxies which can often turn out to be unreliable.
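Instead of hand-rolling the retry loop, you can also let requests do the retrying for you by mounting an HTTPAdapter with a urllib3 Retry policy on a session. This sketch only configures the session (the proxy address is a placeholder); no request is actually sent:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 3 times with exponential backoff, also retrying on
# common transient HTTP status codes
retry_policy = Retry(total=3, backoff_factor=0.5,
                     status_forcelist=[429, 500, 502, 503, 504])

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry_policy)
session.mount("http://", adapter)
session.mount("https://", adapter)

# Placeholder proxy; every request on this session now gets both the
# proxy and the retry policy
session.proxies = {"http": "http://192.0.2.0:8080",
                   "https": "http://192.0.2.0:8080"}

print(session.get_adapter("http://example.com").max_retries.total)  # 3
```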

Wrapping Up

The use of Python requests with proxies offers several advantages, particularly for web scraping, extracting data, privacy maintenance, and bypassing geo-restrictions. The code snippets provided in this guide give you a solid foundation to experiment with different proxies.

To broaden your knowledge on this topic, consider exploring the official Python requests documentation, tutorials, and blogs. These resources will provide you with a comprehensive understanding of how to effectively use Python requests and proxies.

Robin Geuens

Robin is the SEO specialist at SOAX. He likes learning new skills and automating things with Python and GPT. Outside of work he likes reading, playing videogames, and traveling.
