Your First Plan is on Us!

Get 100% of your first residential proxy purchase back as wallet balance, up to $900.

Start now
EN
English
简体中文
Log inGet started for free

Blog

Scraper

How to Scrape Amazon Reviews with Python: Ultimate Guide for 2026

Scrape Amazon Reviews

author jenny
Jenny Avery
Last updated on
 
2026-01-09
 
8 min read
 

As e-commerce continues to dominate retail in 2026, platforms like Amazon host millions of customer reviews that offer invaluable insights into product performance, consumer preferences, and market trends. Web scraping Amazon reviews with Python enables businesses, researchers, and developers to automate data collection, transforming raw feedback into actionable insights. This guide provides a comprehensive, step-by-step approach to building a robust Amazon review scraper.

Whether you’re new to web scraping reviews in Python or looking to optimize your existing tools, this article covers everything from setup to advanced integrations.

Why Scrape Amazon Reviews?

Web scraping reviews with Python isn’t just a technical exercise—it’s a strategic tool for gaining a competitive edge. In 2026, with AI-driven analytics on the rise, scraping Amazon reviews helps companies monitor sentiment, identify pain points, and benchmark against competitors.

Key benefits

● Sentiment Analysis: Aggregate ratings and text to gauge overall customer satisfaction.

● Trend Detection: Spot emerging patterns in feedback over time.

● Product Improvement: Use detailed reviews to refine offerings and boost sales.

● Market Research: Compare reviews across products without manual effort.

By leveraging Python for web scraping reviews, you can process thousands of entries efficiently. However, for large-scale operations, Thordata’s SERP Scraper API shines by providing seamless access to data without the hassle of blocks or bans.

Setting Up Your Python Environment for Web Scraping Reviews

Before diving into code, prepare your environment to ensure smooth web scraping of Amazon reviews with Python. This setup minimizes errors and supports efficient data handling.

Essential Libraries for Python Web Scraping

You’ll need Python 3.8 or higher. Install the following packages via pip:

Code Block Example
text
             pip install requests beautifulsoup4 pandas lxml

● Requests: For sending HTTP requests to Amazon pages.

● BeautifulSoup: To parse HTML and extract review elements.

● Pandas: For structuring scraped data into DataFrames and exporting to CSV.

● lxml: A fast parser for BeautifulSoup.

These libraries form the core of most Python web scraping projects. If you're facing installation issues, refer to Python's official documentation.

Configuring Headers and Proxies

To avoid detection, use custom headers mimicking a real browser. For added reliability, integrate residential proxies early— they offer rotating residential IPs that boost success rates to over 99%, far superior to free options.

Understanding Amazon's Review Page Structure

Amazon's product pages (e.g., https://www.amazon.com/dp/B098FKXT8L) display a limited set of local and global reviews. As of 2026, reviews are typically wrapped in <div data-hook="review"> elements under IDs like #cm-cr-dp-review-list for local reviews and #cm-cr-global-review-list for global ones.

amazon

Key elements include:

● Author: .a-profile-name

● Rating: .review-rating > span

● Date: .review-date

● Title and Content: Vary by local/global, often under .review-title and .review-text

● Images: .review-image-tile or similar

● Verified: span.a-size-mini

Inspect the page using browser dev tools to confirm selectors as the structure evolves. For full reviews, consider extending to /product-reviews/ASIN with pagination.

Element

CSS Selector

Description

Author

.a-profile-name

Reviewer's name

Rating

.review-rating > span

Star rating (e.g., "5.0")

Date

.review-date

Review timestamp

Title (Local)

.review-title span:not([class])

Review headline

Content (Local)

.review-text

Full review body

Images

.review-image-tile

Attached photos

This table summarizes core selectors for quick reference in your web scraping reviews Python project.

Step-by-Step Guide to Building an Amazon Review Scraper in Python

Let's build the scraper incrementally.

Sending Requests and Parsing HTML

Start by importing libraries and defining the ASIN (Amazon Standard Identification Number).

Code Block Example
Python
import requests
from bs4 import BeautifulSoup
import pandas as pd

asin = "B098FKXT8L"
custom_headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/135.0"
}

def get_soup(url):
    response = requests.get(url, headers=custom_headers)
    if response.status_code != 200:
        print("Error in getting webpage")
        return None
    return BeautifulSoup(response.text, "lxml")

This function fetches and parses the page.

Extracting Key Review Data

Define a function to extract details from each review element.

Code Block Example
Python
def extract_review(review, is_local=True):
    author_element = review.select_one(".a-profile-name")
    author = author_element.text.strip() if author_element else "Unknown"

    rating_element = review.select_one(".review-rating > span")
    rating = rating_element.text.replace("out of 5 stars", "").strip() if rating_element else "0"

    date_element = review.select_one(".review-date")
    date = date_element.text.strip() if date_element else "Unknown"

    if is_local:
        title = review.select_one(".review-title span:not([class])").text.strip() if review.select_one(".review-title span:not([class])") else ""
        content = ' '.join(review.select_one(".review-text").stripped_strings) if review.select_one(".review-text") else ""
        img_selector = ".review-image-tile"
    else:
        title = review.select_one(".review-title .cr-original-review-content").text.strip() if review.select_one(".review-title .cr-original-review-content") else ""
        content = ' '.join(review.select_one(".review-text .cr-original-review-content").stripped_strings) if review.select_one(".review-text .cr-original-review-content") else ""
        img_selector = ".linkless-review-image-tile"

    verified_element = review.select_one("span.a-size-mini")
    verified = verified_element.text.strip() if verified_element else None

    image_elements = review.select(img_selector)
    images = [img.attrs.get("data-src") for img in image_elements if img.attrs.get("data-src")] if image_elements else None

    return {
        "type": "local" if is_local else "global",
        "author": author,
        "rating": rating,
        "title": title,
        "content": content.replace("Read more", ""),
        "date": date,
        "verified": verified,
        "images": images
    }

This handles variations between local and global reviews.

Handling Local and Global Reviews

Collect reviews with this function:

Code Block Example
Python
def get_reviews(soup):
    reviews = []
    local_reviews = soup.select("#cm-cr-dp-review-list > div[data-hook='review']")
    global_reviews = soup.select("#cm-cr-global-review-list > div[data-hook='review']")
    for review in local_reviews:
        reviews.append(extract_review(review, is_local=True))
    for review in global_reviews:
        reviews.append(extract_review(review, is_local=False))
    return reviews

Exporting Scraped Data to CSV

Wrap it up in main:

Code Block Example
Python
def main():
    search_url = f"https://www.amazon.com/dp/{asin}"
    soup = get_soup(search_url)
    if soup is None:
        return
    reviews = get_reviews(soup)
    df = pd.DataFrame(reviews)
    df.to_csv(f"reviews_{asin}.csv", index=False)
    print("Data exported to CSV.")

if __name__ == "__main__":
    main()

Run this script to save reviews in a CSV file.

Full Python Code for Scraping Amazon Reviews

Here's the complete, runnable script combining all parts:

Code Block Example
Python
import requests
from bs4 import BeautifulSoup
import pandas as pd

asin = "B098FKXT8L"
custom_headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/135.0"
}

def get_soup(url):
    response = requests.get(url, headers=custom_headers)
    if response.status_code != 200:
        print("Error in getting webpage")
        return None
    return BeautifulSoup(response.text, "lxml")

def extract_review(review, is_local=True):
    author_element = review.select_one(".a-profile-name")
    author = author_element.text.strip() if author_element else "Unknown"

    rating_element = review.select_one(".review-rating > span")
    rating = rating_element.text.replace("out of 5 stars", "").strip() if rating_element else "0"

    date_element = review.select_one(".review-date")
    date = date_element.text.strip() if date_element else "Unknown"

    if is_local:
        title = review.select_one(".review-title span:not([class])").text.strip() if review.select_one(".review-title span:not([class])") else ""
        content = ' '.join(review.select_one(".review-text").stripped_strings) if review.select_one(".review-text") else ""
        img_selector = ".review-image-tile"
    else:
        title = review.select_one(".review-title .cr-original-review-content").text.strip() if review.select_one(".review-title .cr-original-review-content") else ""
        content = ' '.join(review.select_one(".review-text .cr-original-review-content").stripped_strings) if review.select_one(".review-text .cr-original-review-content") else ""
        img_selector = ".linkless-review-image-tile"

    verified_element = review.select_one("span.a-size-mini")
    verified = verified_element.text.strip() if verified_element else None

    image_elements = review.select(img_selector)
    images = [img.attrs.get("data-src") for img in image_elements if img.attrs.get("data-src")] if image_elements else None

    return {
        "type": "local" if is_local else "global",
        "author": author,
        "rating": rating,
        "title": title,
        "content": content.replace("Read more", ""),
        "date": date,
        "verified": verified,
        "images": images
    }

def get_reviews(soup):
    reviews = []
    local_reviews = soup.select("#cm-cr-dp-review-list > div[data-hook='review']")
    global_reviews = soup.select("#cm-cr-global-review-list > div[data-hook='review']")
    for review in local_reviews:
        reviews.append(extract_review(review, is_local=True))
    for review in global_reviews:
        reviews.append(extract_review(review, is_local=False))
    return reviews

def main():
    search_url = f"https://www.amazon.com/dp/{asin}"
    soup = get_soup(search_url)
    if soup is None:
        return
    reviews = get_reviews(soup)
    df = pd.DataFrame(reviews)
    df.to_csv(f"reviews_{asin}.csv", index=False)
    print("Data exported to CSV.")

if __name__ == "__main__":
    main()

This code is optimized for 2026 Amazon structures, with error handling for robustness.

Enhancing Your Scraper with Thordata's Advanced Tools

While custom Python scripts offer flexibility, scaling web scraping reviews can be challenging. Thordata's Amazon Scraper API addresses this by providing enterprise-grade solutions.

Using Thordata Proxies for Reliable Scraping

For smaller projects, Thordata offers a free trail to get started. They ensure connection stability and basic anti-scraping bypass. For larger scales, Thordata Residential Proxies mimic real user behavior, minimizing blocks—ideal for web scraping Amazon reviews with Python.

Best Practices and Legal Considerations for Web Scraping Reviews

To maintain a positive user experience and comply with standards:

1. Rate-limit requests to avoid overloading servers.

2. Use ethical data practices—don't misuse personal info.

3. Check Amazon's robots.txt and terms; scraping public data is common but consult legal advice.

4. Monitor for changes in page structure.

5. Opt for Thordata to ensure compliant, efficient scraping.

Conclusion

Web scraping Amazon reviews with Python empowers data-driven decisions in 2026's competitive landscape. By following this guide and leveraging Thordata's superior scraping products, you can achieve reliable, scalable results with minimal effort. Start small with the provided code, then scale up with Thordata for professional-grade performance.

Contact us at support@thordata.com for tailored advice.

 
Get started for free

Frequently asked questions

How to Scrape All Amazon Reviews with Python?

 

To scrape beyond the initial page, add pagination by looping through URLs like https://www.amazon.com/product-reviews/{asin}?pageNumber={page}. Use a while loop until no more reviews are found, and integrate Thordata proxies to handle volume without blocks.

What Are the Best Python Libraries for Web Scraping Amazon Reviews?

 

Top libraries include Requests for HTTP, BeautifulSoup for parsing, Pandas for data export, and Selenium for dynamic content. For advanced needs, combine with Thordata's API to simplify integration.

How to Avoid Blocks When Web Scraping Amazon Reviews with Python?

 

Yes, for public data if you respect robots.txt and ToS—focus on ethical crawling and scraping to avoid blocks, prioritizing proxies and rate limits.

About the author

Jenny is a Content Specialist with a deep passion for digital technology and its impact on business growth. She has an eye for detail and a knack for creatively crafting insightful, results-focused content that educates and inspires. Her expertise lies in helping businesses and individuals navigate the ever-changing digital landscape.

The thordata Blog offers all its content in its original form and solely for informational intent. We do not offer any guarantees regarding the information found on the thordata Blog or any external sites that it may direct you to. It is essential that you seek legal counsel and thoroughly examine the specific terms of service of any website before engaging in any scraping endeavors, or obtain a scraping permit if required.