Custom Project (Web Scraping / Data Extraction) June 2026

TikTok Shop Indonesia Scraper

Apify actor that scrapes public product data from TikTok Shop Indonesia (shop-id.tokopedia.com): prices, units sold, sellers, and reviews. Built-in CAPTCHA solver, no proxy, pay per result.

Client

Published Actor

Role

Solo Developer

Status

Live on Apify Store

Category

Custom Project (Web Scraping / Data Extraction)

Tech Stack

Python, Playwright (patchright), OpenCV, Apify SDK, Docker

Overview

A web scraper for TikTok Shop Indonesia, published as a paid actor on the Apify Store. After the 2023 ban and the Tokopedia merger, TikTok Shop Indonesia runs on shop-id.tokopedia.com, a storefront protected by ByteDance's anti-bot system. This actor extracts public product data (price, discount, rating, units sold, seller, brand, images) from the homepage and 28 categories, plus full product detail and customer reviews by URL.

It runs with no proxy and no paid CAPTCHA service, and bills per result, so a buyer just presses Run and only pays for the products that come back.

Challenge

The Indonesian storefront is one of the harder TikTok Shop surfaces to reach, and existing scrapers avoid it:

  • shop-id.tokopedia.com serves a ByteDance slider CAPTCHA on cold, automated sessions.
  • The block is driven by session and behavior, not IP reputation. A real Indonesian residential IP still gets challenged, so residential proxies alone do not solve it.
  • The storefront exposes no public keyword search, so targeting has to work through categories and product URLs.
  • Popular TikTok Shop scrapers target the global shop.tiktok.com or the US storefront, not the actual Indonesian storefront on Tokopedia infrastructure.

Solution

Clean browser fingerprint

Hand-rolled stealth was still detectable. After switching to patchright, a patched Playwright build, the browser passed all 31 bot-detection checks on bot.sannysoft.com and no longer exposed the CDP leak that ByteDance inspects.

Endpoint discovery instead of guessing

Rather than guess API paths, every JSON response and server-rendered payload was logged on a real session. Homepage products arrive through a products_by_component request, category products are rendered inside the page itself, and product detail and reviews live in the product page payload. Parsers were tuned to the real structures and validated on hundreds of live products.
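
As an illustration of reading data that is rendered inside the page, the sketch below pulls JSON payloads out of script tags using only the Python standard library. It assumes the payload sits in a script tag with type "application/json"; the tag type, the function name extract_embedded_json, and the sample HTML are assumptions made for this example, not the storefront's confirmed markup.

```python
import json
from html.parser import HTMLParser


class EmbeddedJsonParser(HTMLParser):
    """Collects JSON found inside <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._chunks = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        # Assumption: the embedded payload uses type="application/json".
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_json_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.payloads.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                # Skip script bodies that are not valid JSON.
                pass


def extract_embedded_json(html):
    """Return every JSON payload embedded in the given HTML string."""
    parser = EmbeddedJsonParser()
    parser.feed(html)
    parser.close()
    return parser.payloads
```

Logging every payload first and deciding which keys matter afterwards keeps the parser tied to what the site actually returns rather than to guessed structures.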

Free computer-vision CAPTCHA solver

Instead of paying a solver service, the actor solves the slider with OpenCV. The gap position is found by edge template matching between the puzzle background and the piece, then the handle is dragged with a human-like motion curve. In live tests against real puzzles, it clears about 93 percent on the first attempt and close to 99 percent with a refresh-and-retry loop.
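
The two steps can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the Canny thresholds, the ease-out cubic curve, the jitter size, and the function names find_gap_x and drag_offsets are all assumptions chosen for the example.

```python
import random


def find_gap_x(background_bgr, piece_bgr):
    """Estimate the x offset of the gap by matching edge maps.

    Both arguments are BGR images as NumPy arrays (e.g. from cv2.imread).
    """
    # Imported lazily so drag_offsets below works without OpenCV installed.
    import cv2

    # Edge maps make the match depend on outlines rather than colour.
    # The thresholds 100/200 are illustrative, not tuned values.
    bg_edges = cv2.Canny(cv2.cvtColor(background_bgr, cv2.COLOR_BGR2GRAY), 100, 200)
    piece_edges = cv2.Canny(cv2.cvtColor(piece_bgr, cv2.COLOR_BGR2GRAY), 100, 200)

    # Slide the piece outline over the background outline and score each spot.
    scores = cv2.matchTemplate(bg_edges, piece_edges, cv2.TM_CCOEFF_NORMED)
    _, _, _, best_location = cv2.minMaxLoc(scores)
    return best_location[0]  # x coordinate of the best match


def drag_offsets(distance, steps=30, jitter=1.5, seed=None):
    """Return (x, y) offsets for a drag that starts fast and slows down.

    Uses an ease-out cubic curve plus small vertical jitter (both are
    assumptions; the real motion curve is not described in detail).
    """
    rng = random.Random(seed)
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        eased = 1 - (1 - t) ** 3  # fast at first, decelerating near the end
        points.append((distance * eased, rng.uniform(-jitter, jitter)))
    # Land exactly on the target with no vertical offset.
    points[-1] = (float(distance), 0.0)
    return points
```

In a browser session, each offset would be added to the handle's start position and passed to the mouse-move call between the mouse-down and mouse-up events.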

Targeted extraction

The actor supports category and subcategory crawling for broad coverage, best-seller filters (minimum units sold, rating, and price range) for product research, and a detail mode that returns the description, all images, variants, specifications, and reviews for given product URLs.
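
The best-seller filters amount to a few threshold checks per product. A minimal sketch, assuming a simplified Product record; the field names and the filter_best_sellers signature are hypothetical and do not reflect the actor's real input schema:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Product:
    # Hypothetical field names for illustration only.
    title: str
    price: float
    rating: float
    units_sold: int


def filter_best_sellers(
    products: Iterable[Product],
    min_units_sold: int = 0,
    min_rating: float = 0.0,
    min_price: Optional[float] = None,
    max_price: Optional[float] = None,
) -> List[Product]:
    """Keep products that meet every threshold that was supplied."""
    kept = []
    for product in products:
        if product.units_sold < min_units_sold:
            continue
        if product.rating < min_rating:
            continue
        if min_price is not None and product.price < min_price:
            continue
        if max_price is not None and product.price > max_price:
            continue
        kept.append(product)
    return kept
```

For example, filter_best_sellers(items, min_units_sold=1000, min_rating=4.5, max_price=200000) would keep only well-rated, high-volume products priced at or below 200,000.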

One-click cloud delivery

Packaged as an Apify actor with a Dockerfile, input schema, and dataset output. Buyers run it with zero configuration; the CAPTCHA is handled automatically and the data exports to JSON, CSV, or Excel.

Results

  • 733 products in a single cold cloud run, with the CAPTCHA solved automatically and no proxy.
  • No per-run cost for proxies or CAPTCHA solving, which keeps the resale margin high at a low price.
  • Published and monetized on the Apify Store with pay-per-result pricing.
  • Targets the real Indonesian storefront (shop-id.tokopedia.com), a niche the popular competitors do not cover.

Tech stack rationale

  • patchright over playwright-stealth: patches detection at the engine level, including the CDP leak that hand-written scripts cannot hide.
  • OpenCV solver over a paid API: no per-solve fee, so the actor stays profitable at a low price, with a retry loop covering the occasional miss.
  • Parsing server-rendered data: category and product pages embed their data in the page, so it is read directly from the page payload instead of relying on a keyword search the storefront does not offer.
  • Apify for delivery: managed runs, storage, and pay-per-result billing, with the same portable core able to run on a plain VPS.
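
The retry loop mentioned for the OpenCV solver can be sketched as a small wrapper. This is an assumption about its shape, not the actor's code: attempt_solve and refresh_puzzle are hypothetical callables standing in for the browser actions, and the default of three attempts is illustrative.

```python
from typing import Callable


def solve_with_retries(
    attempt_solve: Callable[[], bool],
    refresh_puzzle: Callable[[], None],
    max_attempts: int = 3,
) -> int:
    """Return the 1-based attempt number that succeeded, or 0 if none did."""
    for attempt in range(1, max_attempts + 1):
        if attempt_solve():
            return attempt
        # Request a fresh puzzle before trying again, but not after the last try.
        if attempt < max_attempts:
            refresh_puzzle()
    return 0
```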

