Facebook Data Extractor

A web-based Facebook group scraper that opens Chrome, searches for groups, extracts posts, and exports them to CSV.

Quick Start

For the full startup flow, see:

System Flow

What This Project Does

Who Needs Installation?

Host Requirements

docker compose up --build -d

Open:

http://localhost:5000

For Facebook login inside the Selenium browser container, open:

http://localhost:7900

The noVNC password is disabled for local development in the current compose file.
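After `docker compose up --build -d`, it can help to confirm that both ports mentioned above accept connections before opening them in a browser. The following is a minimal sketch, not part of the project: `port_is_open` is a hypothetical helper name, and it only checks that something is listening on the port, not that the Flask app or noVNC is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Ports taken from this README: 5000 (web UI) and 7900 (noVNC).
    for name, port in (("web UI", 5000), ("noVNC", 7900)):
        state = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

If a port shows as not reachable, the containers may still be starting, or the port mapping in your compose file may differ from the one assumed here.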

requirements.txt is still required because the Docker image for the Flask app installs Python packages from it during build.

Docker Services

If you expose this app with ngrok, users open the ngrok URL in their browser and use it directly. They do not install Python, ChromeDriver, Docker, or this repository.

Web UI Features

Queue Behavior (Important)

Docker Notes

Input Fields

Search in Facebook: phrase sent to Facebook search.

group_links_number: requested number of groups to process.

Posts from each group (posts_from_each_group): requested number of posts per group.

Expected rows = group_links_number * posts_from_each_group (best effort, based on available data and page behavior).
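The formula above can be sketched as a small helper for estimating the size of a run before starting it. `expected_rows` is a hypothetical function name used only for illustration; the two parameter names come from the formula. The result is a target, not a guarantee, since the actual row count depends on available data and page behavior.

```python
def expected_rows(group_links_number: int, posts_from_each_group: int) -> int:
    """Target number of CSV rows for one run; the real count may differ."""
    if group_links_number < 0 or posts_from_each_group < 0:
        raise ValueError("both inputs must be non-negative")
    return group_links_number * posts_from_each_group


# Example: 5 groups with 20 posts requested from each.
print(expected_rows(5, 20))  # 100
```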

Output

Share with Others (ngrok)

If you want to share a temporary public link to your local app:

  1. Run the app:
     docker compose up --build -d
  2. Run ngrok to expose port 5000:
     ngrok http 5000
  3. Send the https://...ngrok... URL.

What End Users Need

ngrok Notes

Troubleshooting

Push blocked by GitHub secret scanning

Do not commit browser profile/cache/history folders (web_profiles/** should be ignored).
If sensitive data was already committed, rewrite history and rotate exposed credentials.
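One way to spot browser profile files before pushing is to filter the list of tracked paths for anything under `web_profiles/`. The sketch below assumes you pass in paths such as those printed by `git ls-files`; `profile_paths` and the sample file names are hypothetical and not part of the project.

```python
from pathlib import PurePosixPath


def profile_paths(tracked_files: list[str], folder: str = "web_profiles") -> list[str]:
    """Return the tracked paths that sit inside the given top-level folder."""
    return [p for p in tracked_files if PurePosixPath(p).parts[:1] == (folder,)]


# Hypothetical sample of tracked paths, for illustration only.
tracked = ["app.py", "web_profiles/Default/History", "requirements.txt"]
print(profile_paths(tracked))  # ['web_profiles/Default/History']
```

An empty result means no tracked file lives under that folder. It does not prove that earlier commits are clean.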

ngrok not recognized

Make sure ngrok is installed and in PATH, or run the full executable path.
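To check whether ngrok is visible on PATH, a short lookup with Python's standard `shutil.which` may be enough. This is a minimal sketch; `find_tool` is a hypothetical wrapper name.

```python
import shutil
from typing import Optional


def find_tool(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool("ngrok")
    print(path if path else "ngrok was not found on PATH")
```

If the lookup returns a path, that is the full executable path you can run directly.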

Chrome/FB login behavior

When Facebook login is required, open:

http://localhost:7900

That page shows the Selenium browser through noVNC so you can log in manually when needed.