From Vercel to Monolith, improving API speeds

Friday, June 14, 2024

Titouan Launay

I'm a software engineer and entrepreneur, co-founder at Fileforge. I'm passionate about design, AI and the future of work.

Going serverless was both the best and worst decision we have made so far. It went from being the reason we could ship fast to the reason users would churn.

When we started Fileforge, we were pivoting from an AI startup. We were already a few weeks into the Y Combinator batch, and we needed to launch in days.

We had two options: either build with a major cloud provider and spend days setting up the infrastructure, or go serverless and ship in hours. The choice would be obvious for most startups. However, because we manage documents, we knew that close control over the infrastructure would be key.

In the end, what mattered most was how much time we would need to validate the idea. We went serverless.

Challenges with Serverless

To keep things simple, our serverless stack includes Supabase and a Next.js full-stack app, hosted on Vercel. It was easily set up in an afternoon, and we were able to launch our waitlist the next day and the product within a week.

Serverless allowed us to ship fast, but API latency became a major complaint from our users.

Serverless is widely used for production apps, where this isn’t as much of an issue. To better understand why we were especially impacted, let’s dive into how a document is generated.

How is a Document Generated?

For our prototype, the objectives were:

  • Secure: We needed to ensure that the documents were generated in a secure environment. Every step of the process needed to be encrypted, with appropriate access controls.
  • Versatile: We wanted to support the upload of assets, as this was a major pain point we encountered with existing solutions.
  • Reasonably Fast: We needed to generate the documents in a few seconds. We set 10 seconds as our target.
  • Easy to Deploy: We wanted to limit the amount of infrastructure we had to manage.

The best fit for these objectives was to leverage each of our providers’ strengths:

  • Supabase: For the database and bucket-based file storage.
  • Vercel: For the frontend and API.
  • PDF Processor: For the actual document conversion.

This resulted in a sequence of steps that looked like this:

Document Generation Sequence

Not reinventing the wheel meant we could ship securely and quickly.

Performance Bottlenecks

Document generation would take just under 10 seconds. While this was on par with our expectations, it was still an issue for our users. It meant that quickly iterating on a document was painful, and that the user experience was subpar.

At that time, the client would make 3 requests to create a document:

  1. Initiate PDF Generation: The client would send a request to the Fileforge API to start the document generation. (~300ms)
  2. Upload Assets: The client would upload the assets to the temporary bucket. (~1s)
  3. Serve PDF: The client would download the PDF. (~6s)
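Because each request only starts once the previous one has finished, the three latencies add up. The sketch below tallies the approximate figures from the list above; the `Step` type and `summarize` helper are illustrative names, not part of the Fileforge SDK.

```typescript
// Approximate per-request latencies quoted in the list above.
// `Step` and `summarize` are illustrative names, not Fileforge APIs.
interface Step {
  name: string;
  ms: number;
}

const oldClientFlow: Step[] = [
  { name: "Initiate PDF Generation", ms: 300 },
  { name: "Upload Assets", ms: 1000 },
  { name: "Serve PDF", ms: 6000 },
];

// Sequential requests: the client-visible latency is the sum of all steps.
function summarize(steps: Step[]): { totalMs: number; slowest: Step; slowestShare: number } {
  const totalMs = steps.reduce((sum, step) => sum + step.ms, 0);
  const slowest = steps.reduce((a, b) => (b.ms > a.ms ? b : a));
  return { totalMs, slowest, slowestShare: slowest.ms / totalMs };
}

const summary = summarize(oldClientFlow);
console.log(
  `${summary.totalMs}ms total, ${Math.round(summary.slowestShare * 100)}% in "${summary.slowest.name}"`,
);
// prints: 7300ms total, 82% in "Serve PDF"
```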

While not much could be done about the first two steps without changing the SDK and API structure, the last step was definitely something we could improve. Using Sentry’s tracing feature, we were able to uncover interesting insights.

Sample Trace of a PDF Generation Call

Out of the 7 seconds it took to generate a PDF, only 3.5 were spent in the PDF Processor. The rest was spent going back and forth between Vercel and Supabase.

Of the 3.5 seconds spent in the PDF Processor, most was also spent connecting to Vercel, which was acting as a proxy between Supabase and the PDF Processor.

With our low volume at the time, most requests to Vercel incurred a cold start. Compounded by the networking overhead between Vercel and Supabase, this resulted in a latency of up to 500ms per request.

Moving to a Monolith

Two key takeaways from our analysis:

  • Storage needed to be moved as close as possible to the proxy serving the assets.
  • The number of steps needed to generate a document had to be reduced to avoid client-server roundtrips. As a side effect, the API would be easier to understand.

We also encountered issues specific to our use case when processing files, especially with serverless functions’ hard limits on memory and execution time.

Planning and Execution

To address these issues while limiting the time spent on non-feature work, we decided to move the document API to a monolith.

  • Vercel and Supabase would be kept and used for user authentication, data storage, and other non-document related tasks.
  • The document generation API would be moved to a monolith, hosted using ECS on AWS. Asset storage would happen on S3, in the same region the generation API call was made.

Moving to a monolith would also give us better control over API requests, letting us collapse our 3 client-server requests into a single one.

Moving to a single request meant we could simplify our cross-region asset management, as we were certain that assets would be stored in a single region.
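As an illustration of what a single-request flow can look like from the client, here is a minimal sketch that bundles the HTML and its assets into one multipart body. The endpoint URL, the `files` field name, and the `buildGenerationForm` and `generatePdf` helpers are assumptions made for this example, not the actual Fileforge API. It relies on the `FormData`, `Blob` and `fetch` globals available in browsers and Node.js 18+.

```typescript
// Hypothetical asset description used only for this sketch.
interface Asset {
  filename: string;
  content: Blob;
}

// Bundle the document and all of its assets into ONE multipart body,
// so no separate "initiate" or "upload" round trip is needed.
function buildGenerationForm(html: string, assets: Asset[]): FormData {
  const form = new FormData();
  form.append("files", new Blob([html], { type: "text/html" }), "index.html");
  for (const asset of assets) {
    form.append("files", asset.content, asset.filename);
  }
  return form;
}

// Single client-server request: send everything, receive the PDF bytes back.
// `endpoint` is a placeholder supplied by the caller, not a real URL.
async function generatePdf(endpoint: string, html: string, assets: Asset[]): Promise<ArrayBuffer> {
  const response = await fetch(endpoint, {
    method: "POST",
    body: buildGenerationForm(html, assets),
  });
  if (!response.ok) {
    throw new Error(`PDF generation failed with status ${response.status}`);
  }
  return response.arrayBuffer();
}
```

Since everything arrives in one request, the server handling it can write the assets to storage in its own region, which is what makes the single-region assumption above hold.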

Technical Stack

Document API endpoints would be moved from Next.js routes to a dedicated monolithic service. We chose to use fastify, a Node.js framework known for its speed and low overhead. Combined with fastify-multipart and fastify-swagger, the API was quickly set up.

The fastify approach fit our requirement of contract-based development, as we could easily define the API contract first, then implement the logic.
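To illustrate the contract-first idea, the sketch below declares a route's JSON Schema before any logic exists, then attaches a handler typed against it. The object mirrors the shape of fastify's route options (`method`, `url`, `schema`, `handler`), but the route path, the field names, and the `GenerateOptions` and `GenerateReply` types are hypothetical. It is a simplified JSON route rather than the real multipart endpoint, and it deliberately avoids importing fastify so that it stays self-contained.

```typescript
// 1. The contract: written first, independent of any implementation.
const generateContract = {
  method: "POST",
  url: "/pdf/generate", // hypothetical path
  schema: {
    body: {
      type: "object",
      required: ["html"],
      properties: {
        html: { type: "string" },
        fileName: { type: "string" },
      },
    },
    response: {
      200: {
        type: "object",
        required: ["url"],
        properties: { url: { type: "string" } },
      },
    },
  },
} as const;

// 2. Types that mirror the contract (kept in sync by hand in this sketch).
interface GenerateOptions {
  html: string;
  fileName?: string;
}
interface GenerateReply {
  url: string;
}

// 3. The implementation, added only after the contract is settled.
async function handleGenerate(body: GenerateOptions): Promise<GenerateReply> {
  const name = body.fileName ?? "document.pdf";
  return { url: `/files/${name}` }; // placeholder logic
}

// The full route definition: contract plus handler.
const generateRoute = {
  ...generateContract,
  handler: async (request: { body: GenerateOptions }) => handleGenerate(request.body),
};
```

Keeping the schema on the route is also what lets fastify-swagger derive API documentation from the same definition.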

The system is auto-scaled on ECS, and allows for an easy cross-region deployment.

Performance Improvements

Let’s have a look at the new sequence of steps:

Document Generation Sequence

From 9 steps, we are down to 6. The client now only needs to make a single request to generate a document.

New Performance Traces

Back in Sentry, here are the new traces:

Sample Trace of a PDF Generation Call

The end-to-end generation now only takes 3.6 seconds, with 2.5 seconds spent in the PDF Processor. Billing operations are now asynchronous, so the client only needs to wait for the PDF to be generated.
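One common way to take billing off the critical path is to start it without awaiting it and only report failures. The sketch below illustrates that pattern under stated assumptions: `renderPdf` and `recordUsage` are hypothetical stand-ins injected as parameters, not Fileforge internals.

```typescript
type RenderPdf = (html: string) => Promise<Uint8Array>;
type RecordUsage = (bytes: number) => Promise<void>;

async function generateDocument(
  html: string,
  renderPdf: RenderPdf,
  recordUsage: RecordUsage,
  onBillingError: (error: unknown) => void = (error) => console.error("billing failed", error),
): Promise<Uint8Array> {
  // The client has to wait for this part.
  const pdf = await renderPdf(html);

  // Fire and forget: billing runs in the background and is NOT awaited,
  // so a slow or failing billing call cannot delay the response.
  void recordUsage(pdf.byteLength).catch(onBillingError);

  return pdf;
}
```

In-process fire-and-forget loses pending work if the process exits, so a durable queue is the safer choice when billing records must never be dropped.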

There is still room for improvement, as almost 1 second is spent checking for authorization in the Fileforge API and Supabase. This will be the next focus of our optimization efforts.

API Speed Metrics

In the end, here are the metrics reported by Sentry:

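API Endpoint                     | Average Latency | 95th Percentile Latency
---------------------------------|-----------------|------------------------
(Old) Initiate PDF Generation    | 1.11s           | 1.42s
(Old) Upload Assets (Estimated)  | 1s              | 3s
(Old) Generate PDF               | 7.35s           | 9.78s
(Old) Total                      | 9.46s           | 12.2s
(New) Generate PDF               | 3.67s           | 5.41s
(New) Total                      | 3.67s           | 5.41s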
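For reference, the totals in the table work out to roughly a 2.6x improvement in average latency and 2.3x at the 95th percentile. The arithmetic, as a small sketch (the `speedup` and `reduction` helpers are illustrative):

```typescript
// Totals taken from the table above, in seconds.
const oldTotal = { average: 9.46, p95: 12.2 };
const newTotal = { average: 3.67, p95: 5.41 };

// How many times faster the new flow is for a given metric.
function speedup(before: number, after: number): number {
  return before / after;
}

// Share of the original latency that was removed.
function reduction(before: number, after: number): number {
  return 1 - after / before;
}

console.log(speedup(oldTotal.average, newTotal.average).toFixed(2)); // 2.58
console.log(speedup(oldTotal.p95, newTotal.p95).toFixed(2)); // 2.26
console.log((reduction(oldTotal.average, newTotal.average) * 100).toFixed(1)); // 61.2
```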

Looking Forward

The move to a monolith was a success, and we will now tackle the remaining elements: moving our PDF rendering services closer to our API and changing the authentication flow.

Let’s cut generation time by half, again.
