Apify + Meteor: Making the Web More Programmable Together

Meteor Software
Published in Meteor Blog
9 min read · Sep 20, 2022

[image source: Apify]

The web is the largest and most important source of information ever created by humankind. But since it was designed for people, computers and automated systems can’t easily gather or automatically understand all that data.

So if someone wants to compare stock prices, product details, or hotel rates, they may spend hours gathering and translating this intel on their own. Bad news? This task isn’t just tedious and labor-intensive; it’s also prone to human errors.

That’s why web scrapers like Apify “make the web work for you.”

Web scraping is the process of extracting data from websites and exporting it into more usable formats. That harvested data can then be compiled into a spreadsheet or routed to an API, for example, to leverage it for your needs.
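To make that definition concrete, here is a minimal sketch of the "extract" half: turning a snippet of HTML into structured records. The markup, the `Product` shape, and the `extractProducts` helper are all made up for illustration; a regular expression keeps the example dependency-free, while real pages usually call for a proper HTML parser.

```typescript
// Hypothetical record shape for one scraped product.
interface Product {
  title: string;
  price: number;
}

// Pulls products out of markup shaped like the sample below.
// The pattern is tied to this exact markup; it is not a general HTML parser.
function extractProducts(html: string): Product[] {
  const pattern =
    /<li class="product">\s*<span class="title">([^<]+)<\/span>\s*<span class="price">\$([0-9]+(?:\.[0-9]+)?)<\/span>\s*<\/li>/g;
  const found: Product[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    found.push({ title: match[1].trim(), price: Number(match[2]) });
  }
  return found;
}

// Made-up sample page.
const sampleHtml = `
<ul>
  <li class="product"><span class="title">Green tea</span><span class="price">$4.50</span></li>
  <li class="product"><span class="title">Coffee beans</span><span class="price">$12</span></li>
</ul>`;

const products = extractProducts(sampleHtml);
```

Once the data is in a typed array like `products`, writing it to a spreadsheet or sending it to an API is ordinary application code.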

Apify is a web scraping and automation platform built with Meteor. It lets you extract data from websites, process harvested data, and automate workflows on the web.

Let’s explore Apify’s one-stop shop for web scraping and robotic process automation (RPA) in this case study and find out why they chose to build smarter with Meteor.

Meet the Apify Team: Two Co-Founders Who Participated in the First Ever Y Combinator Fellowship

Back in the summer of 2015, Y Combinator (YC) launched a new program called the Y Combinator Fellowship. The idea was simple: participants would receive two months of guidance from the world’s most prestigious startup community and a grant of $12,000 to develop a minimum viable product (MVP).

Jan Čurn and Jakub Balada heard about the YC Fellowship while reading Hacker News. They were working part-time on what would later become Apify. So they recorded a one-minute video and filled out the application. Jan accidentally clicked the “Submit” button instead of “Save draft,” a full three days before the deadline, and “the die was cast.”

[image source: Apify]

Only 20 companies would be invited to the YC Fellowship (out of the 6,500 applications). Jan and Jakub calculated their chance of being one of them, which came to an abysmal 0.3%.

Against the odds, the duo from Prague, Czech Republic, received an email inviting them to an interview. To “maximize” their chances, Jan and Jakub skipped the Skype interview option and booked flights to interview in person, choosing to travel “10,000 km for a 10-minute interview.”

[image source: Apify]

The gamble paid off. Their startup Apifier was among 32 other projects selected to participate in the inaugural Y Combinator Fellowship. They spent August through November 2015 in Mountain View, California, building what would become the most powerful web scraping and automation platform.

🖥️ You can read about Jan and Jakub’s entire experience at the YC Fellowship here.

So What Exactly Does Apify Do?

Apify is a software-as-a-service (SaaS) business-to-business (B2B) platform for web scraping, data extraction, and web automation. It enables people to automate any workflow that a person can do manually in a web browser and then run it at scale in the cloud.

“We’re making the web more programmable by making it possible to turn any website into an API.”

An API, or application programming interface, is software that allows two applications to talk to each other. To turn any website into an API, Apify uses this 3-step process:

1. Collect data from any website. Extract unlimited amounts of structured data right away with their ready-to-use scraping tools. You can also work with the Apify team to build a custom solution to solve your unique use case. You’ll gain fast, accurate results you can rely on.

Plus, Apify uses a smart rotation of data center and residential proxies, combined with industry-leading browser fingerprinting technology. This makes Apify bots hard to distinguish from human visitors, so they are far less likely to get blocked.

Find out how web scraping and Apify can make the web work for you.

2. Automate any online process. Scale processes, robotize tedious tasks, and speed up workflows with Apify’s flexible automation software. Their automation lets you work faster and smarter than your competitors with less effort. Automating mundane tasks online allows people to spend more time on things that matter.

3. Integrate your harvested data with any system. Export scraped data in machine-readable formats like JSON or CSV. Apify lets you seamlessly integrate with your existing Zapier or Make workflows or any other web app using API and webhooks.
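Step 3 is the easiest one to picture in code. Below is a rough sketch of serializing the same scraped rows to both JSON and CSV. The `Row` type, the `toCsv` helper, and the hotel data are invented for illustration and are not Apify's export code; the quoting follows the usual RFC 4180 convention of doubling embedded quotes.

```typescript
type Row = Record<string, string | number>;

// Quotes a field when it contains a comma, a double quote, or a line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serializes rows to CSV, using the keys of the first row as the header.
function toCsv(rows: Row[]): string {
  if (rows.length === 0) return "";
  const columns = Object.keys(rows[0]);
  const lines = [columns.map((column) => csvField(column)).join(",")];
  for (const row of rows) {
    lines.push(columns.map((column) => csvField(row[column] ?? "")).join(","));
  }
  return lines.join("\n");
}

// Made-up sample data.
const hotels: Row[] = [
  { hotel: "Hotel Praha", city: "Prague", rate: 89 },
  { hotel: 'The "Grand" Inn', city: "Brno, CZ", rate: 120.5 },
];

const asJson = JSON.stringify(hotels, null, 2);
const asCsv = toCsv(hotels);
```

Both outputs come from the same in-memory rows, which is why offering several export formats is cheap once the data is structured.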

From small startups to Fortune 500 companies, Apify empowers its innovative users to improve their products, marketing, and decision-making using data from the web.

💡 Check out the Apify platform, pricing, use cases, and examples here.

4 Reasons We Dig Apify

Building in-house web scraping solutions is time-consuming and expensive. Whether you’re a developer or a startup, this process may be essential, yet it takes you away from your core business.

Building your own scrapers isn’t even an ideal solution. Scrapers can get blocked or become unreliable over time. Plus, if you need to scrape a lot of data, these solutions can’t easily scale to keep up with your needs.

The Apify platform processes more than 1 billion web pages monthly. And that’s not even its greatest achievement.

The Apify Platform is Incredibly Powerful and Flexible

The Apify platform can automate, at scale, anything that can be done manually in a web browser.

Apify has built-in features such as autoscaling, run schedulers, and rotating proxy pools. You can scrape millions of data points simultaneously, set up complex solutions, and maintain them. This crucial data keeps rolling in, in any format, and can be pushed directly to your database.
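To give a feel for what a "rotating proxy pool" means, here is a minimal round-robin sketch that skips proxies marked as blocked. The `ProxyPool` class and the `.example` proxy URLs are hypothetical and only illustrate the general idea; the article does not describe how Apify Proxy works internally.

```typescript
// Round-robin pool that skips proxies marked as blocked.
class ProxyPool {
  private readonly proxies: string[];
  private readonly blocked = new Set<string>();
  private cursor = 0;

  constructor(proxies: string[]) {
    if (proxies.length === 0) throw new Error("ProxyPool needs at least one proxy");
    this.proxies = proxies.slice();
  }

  // Returns the next usable proxy, or undefined when every proxy is blocked.
  next(): string | undefined {
    for (let tried = 0; tried < this.proxies.length; tried++) {
      const candidate = this.proxies[this.cursor];
      this.cursor = (this.cursor + 1) % this.proxies.length;
      if (!this.blocked.has(candidate)) return candidate;
    }
    return undefined;
  }

  // Takes a proxy out of rotation, for example after a target site rejects it.
  markBlocked(proxy: string): void {
    this.blocked.add(proxy);
  }
}

// Placeholder proxy addresses.
const pool = new ProxyPool([
  "http://proxy-a.example:8000",
  "http://proxy-b.example:8000",
  "http://proxy-c.example:8000",
]);

const firstProxy = pool.next();
const secondProxy = pool.next();
pool.markBlocked("http://proxy-c.example:8000");
const thirdProxy = pool.next(); // wraps around past the blocked proxy
```

A production pool would also need health checks, per-site block lists, and a way to bring proxies back into rotation, which is the kind of upkeep a hosted platform takes off your plate.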

Regarding flexibility, Apify is built on solid open-source tools (like Meteor!), so you never have to worry about vendor lock-in.

It’s a Dream for Businesses and Developers

Over 1,000 customers in 95 countries trust Apify’s flexible, ready-to-use tools to get the job done quickly and accurately.

When it comes to developers, Apify is the most effortless way to ship automation software. Their rich developer ecosystem also enables devs to earn passive income from tools they create on Apify.

So think of it as the Airbnb of automation software. Any company that needs web scraping tools or web RPA solutions can find what they need on Apify, or they can rely on the thriving community of Apify Freelancers to create an affordable solution.

Along with Apify Proxy and Apify Storage, the Apify platform is a complete solution for developers and clients alike.

👍 Devs can score tutorials, tips, advice, and articles on web scraping and automation on the Apify blog. You can also learn web scraping in their free Academy via the Apify Developer Portal.

The Apify Store is a Legit Treasure Chest of Ready-Made Tools for Web Scraping and Automation Projects

Apify’s powerful software platform enables forward-thinking companies to leverage the web’s full potential with thousands of Apify actors.

An Apify actor is a serverless microservice that accepts input and produces output. An actor can perform anything from a simple action (such as filling out a web form or sending an email) to complex operations (such as crawling an entire website and removing duplicates from a large dataset).
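Conceptually, that makes an actor a function from input to output. The sketch below models one of the operations mentioned above, removing duplicates from a dataset, in that shape. The `ActorFn` type and the `dedupeByUrl` function are hypothetical illustrations, not the Apify SDK, and the sample items are made up.

```typescript
// Hypothetical model of an actor: structured input in, structured output out.
type ActorFn<Input, Output> = (input: Input) => Output;

interface Page {
  url: string;
  title: string;
}

interface DedupeInput {
  items: Page[];
}

interface DedupeOutput {
  items: Page[];
  removed: number;
}

// Keeps the first item seen for each URL and reports how many were dropped.
const dedupeByUrl: ActorFn<DedupeInput, DedupeOutput> = (input) => {
  const seen = new Set<string>();
  const unique = input.items.filter((item) => {
    if (seen.has(item.url)) return false;
    seen.add(item.url);
    return true;
  });
  return { items: unique, removed: input.items.length - unique.length };
};

// Made-up sample input.
const dedupeResult = dedupeByUrl({
  items: [
    { url: "https://shop.example/tea", title: "Green tea" },
    { url: "https://shop.example/coffee", title: "Coffee beans" },
    { url: "https://shop.example/tea", title: "Green tea (duplicate)" },
  ],
});
```

Because the contract is just input and output, small units like this can be chained, scheduled, or swapped out independently.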

You’ll find hundreds of ready-made actors, tools, and APIs built by programmers in the Apify Store.

The Apify Culture & “Garage Spirit”

Jan and Jakub say they strive to keep the web open as a public good and a basic right for everyone, regardless of the way you want to use it, as its creators intended. Likewise, they’re proud to maintain the open, dynamic startup culture they established from the beginning.

“Apify is still like a big family, even as we grow.”

According to them, they’re hackers who always find a way forward, even if it doesn’t look like there is one. Although they no longer fit in a garage, they still have their “garage spirit.” They’re proud to be building the company that they always wanted to work for.

And Meteor is proud to play a starring role in Apify’s tech stack.

The Apify Tech Stack

In short, the Apify tech stack is 100% Node.js and TypeScript, running on AWS and Kubernetes. Here’s a full rundown:

  • Front-end: React.js, styled components, Storybook, Cypress
  • Back-end: TypeScript/Node.js, Next.js, Express.js, Meteor.js, Jest
  • Infra: AWS, Kubernetes, Helm, MongoDB, Redis, DynamoDB, S3
  • Monitoring: New Relic, LogDNA, Sentry, PagerDuty

You’ll find a detailed reference to architecture in the image Apify shared below:

After learning what powers Apify, we were keen to find out:

Why Apify Builds With Meteor

Meteor is an open-source framework for seamlessly building and deploying full-stack web, mobile, and desktop applications in JavaScript.

The Apify co-founders said they’ve been using Meteor for six years. They started using it when they founded the company during the Y Combinator Fellowship.

“Meteor is a huge advantage for a bootstrapped startup. You install Meteor and start developing right after. It’s a complete toolchain from back-end to front-end.”

Now that processes are becoming more complex and the Apify team continues expanding, they’ve had to overcome new challenges and tackle scaling.

Solved: Apify’s Challenges and Scaling with Meteor

Apify says their main technical challenge on the back-end remains scaling up, while the user experience is becoming their primary focus on the front-end.

Migrating from Handlebars to React was a “huge” challenge Jan and Jakub had to figure out. They accomplished this by going from the bottom up, component by component, with hundreds of deploys spread gradually over more than a year.

They also had to brainstorm workarounds for:

Poorly Performing Oplog Tailing at >5 GB/Hour

Jan and Jakub love Meteor’s reactivity. They say, “Oplog polling is an amazing idea and works great in production and when developing locally.” They described this as a “killer feature” they took advantage of during their first four years.

Unfortunately, the duo learned it “stops performing when your oplog reaches gigabytes of throughput every hour.”

When their oplog throughput exceeded 5 GB/hour, they were slammed with “badly performing Oplog tailing”: oplog processing was consuming too much processor time. So they decided what needed to be reactive and what didn’t. For anything they deemed a reactive publication, they replaced oplog tailing with queries run at intervals.
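The general technique behind "queries run at intervals" is often called poll-and-diff: re-run the query on a timer, compare the result with the previous snapshot, and send clients only what changed. Here is a minimal sketch of that idea. The `Doc` and `Diff` types, `diffSnapshots`, `createPoller`, and the sample run documents are all hypothetical; this is not Apify's implementation. Meteor also documents server-side `find` options (`disableOplog`, `pollingIntervalMs`) for a built-in version of this behavior, though the article doesn't say whether Apify used them.

```typescript
interface Doc {
  _id: string;
  [field: string]: unknown;
}

interface Diff {
  added: Doc[];
  changed: Doc[];
  removed: string[];
}

// Compares the previous snapshot with fresh query results.
// JSON.stringify is a crude equality check that assumes stable key order.
function diffSnapshots(previous: Map<string, Doc>, current: Doc[]): Diff {
  const diff: Diff = { added: [], changed: [], removed: [] };
  const currentIds = new Set<string>();
  for (const doc of current) {
    currentIds.add(doc._id);
    const before = previous.get(doc._id);
    if (before === undefined) {
      diff.added.push(doc);
    } else if (JSON.stringify(before) !== JSON.stringify(doc)) {
      diff.changed.push(doc);
    }
  }
  previous.forEach((_doc, id) => {
    if (!currentIds.has(id)) diff.removed.push(id);
  });
  return diff;
}

// Returns a poll function; a caller would schedule it on a timer,
// for example every few seconds, instead of tailing the oplog.
function createPoller(fetchDocs: () => Doc[], onDiff: (diff: Diff) => void): () => void {
  let snapshot = new Map<string, Doc>();
  return () => {
    const docs = fetchDocs();
    onDiff(diffSnapshots(snapshot, docs));
    snapshot = new Map<string, Doc>(docs.map((doc): [string, Doc] => [doc._id, doc]));
  };
}

// Made-up data standing in for a database query.
let runs: Doc[] = [{ _id: "run1", state: "RUNNING" }];
const diffs: Diff[] = [];
const poll = createPoller(() => runs, (diff) => diffs.push(diff));

poll();
runs = [{ _id: "run1", state: "SUCCEEDED" }, { _id: "run2", state: "RUNNING" }];
poll();
runs = [{ _id: "run2", state: "RUNNING" }];
poll();
```

The trade-off is latency for predictability: updates arrive only as often as the poll runs, but the cost scales with the size of each query result rather than with total write volume across the database.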

The team is slowly replacing different Meteor features with custom implementations like this. PS: Apify’s oplog throughput is now at 10 GB/hour.

Heavy API Workloads Call for a Split

Initially, the Apify team had both their app (console.apify.com) and API (api.apify.com) implemented in a single Meteor.js codebase. But at some point, their API workloads were too heavy (150k+ req/s).

The team decided to split the API from the common codebase into a separate Express.js application. At that point, they decided to start using their API from the front-end instead of code sharing. Jan and Jakub say this helped decrease the size of the Meteor server.

What’s Next for Apify

The Apify team is currently focusing on expanding all the ways people can integrate their systems with the Apify platform. They’re exploring new pricing models for the hundreds of ready-made actors available in the Apify Store. And they plan to move more into the web automation market.

🤖 Psst! Apify is also hiring!

So How Can Meteor Take Your Team To the Next Level?

Developed for over a decade and trusted by industry giants like Apify, Meteor is a mature open-source framework that allows you to build and scale efficiently, so you can serve millions of users.

You can create full-stack JavaScript apps using the same code, whether you’re developing for the web, iOS, Android, or desktop. Leveraging popular frameworks and out-of-the-box tools lets you focus on building features instead of configuring disparate components yourself.

See why over 500k developers rely on Meteor now!
