Cookie3 Data Collection Engines
A set of scalable scrapers feeding the company's data ecosystem with content from YouTube, Telegram and news platforms.
What I used
.NET, MongoDB, Docker, GitHub Actions
01
What I did
- Built fault-tolerant data pipelines for many different sources.
- Normalized diverse formats into one consistent model.
- Automated deployment and maintenance with Docker and GitHub Actions.
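
As a rough illustration of the normalization step, the sketch below maps two differently shaped raw payloads onto one shared record. It is written in Python for brevity rather than the project's .NET stack, and everything in it is an assumption made for the example: the `ContentItem` model, the `from_youtube` and `from_telegram` helpers, and the raw field names (`videoId`, `publishedAt`, `message`, `date`) are hypothetical and not taken from the actual codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical shared model that every source is normalized into."""
    source: str
    external_id: str
    text: str
    published_at: datetime


def from_youtube(raw: dict) -> ContentItem:
    # Assumed payload shape: ISO 8601 timestamp with an explicit UTC offset.
    return ContentItem(
        source="youtube",
        external_id=raw["videoId"],
        text=raw["title"],
        published_at=datetime.fromisoformat(raw["publishedAt"]),
    )


def from_telegram(raw: dict) -> ContentItem:
    # Assumed payload shape: numeric id and a Unix timestamp in seconds.
    return ContentItem(
        source="telegram",
        external_id=str(raw["id"]),
        text=raw["message"],
        published_at=datetime.fromtimestamp(raw["date"], tz=timezone.utc),
    )
```

Downstream code then only ever deals with `ContentItem`, so adding a source means writing one more small mapping function rather than touching consumers.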
02
What I learned
- Working with different APIs and handling rate limiting.
- Building pipelines that survive a single source going down.
- Thinking of data as a product, from raw stream to finished insight.
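
The first two lessons can be sketched together: retry a rate-limited call with exponential backoff, and isolate each source so that one failure does not stop the rest of the run. This is a minimal Python illustration, not the project's .NET implementation. `RateLimited`, `fetch_with_backoff`, and `run_all` are hypothetical names, and the retry count and delays are arbitrary example values.

```python
import time
from typing import Callable, Dict, Tuple


class RateLimited(Exception):
    """Hypothetical error a fetcher raises when an API asks the caller to slow down."""


def fetch_with_backoff(fetch: Callable[[], list], retries: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> list:
    # Retry only on rate limiting, doubling the wait after each attempt.
    for attempt in range(retries):
        try:
            return fetch()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller record the failure
            sleep(base_delay * 2 ** attempt)
    return []


def run_all(sources: Dict[str, Callable[[], list]],
            **backoff_options) -> Tuple[Dict[str, list], Dict[str, Exception]]:
    # Each source runs in its own try block, so one outage never aborts the others.
    results: Dict[str, list] = {}
    failures: Dict[str, Exception] = {}
    for name, fetch in sources.items():
        try:
            results[name] = fetch_with_backoff(fetch, **backoff_options)
        except Exception as exc:
            failures[name] = exc
    return results, failures
```

Returning failures alongside results, instead of raising, lets a scheduler log or alert on the broken source while still storing whatever the healthy sources produced.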
03
Challenges
- Sources changed structure often, so scrapers had to be flexible and easy to fix.
- Keeping data fresh within external API constraints.
- Scale: what worked for one source had to generalize to dozens.