How I Built My Own Loom-Style Screen Recorder

yash kedia | Published on 28 Feb 2026

1. The Friction That Started It All

1.1 The 5-Minute Loom Problem

A few days ago I needed to record a product walkthrough. Nothing complicated — just a simple screen recording where I explain a feature while navigating through the UI.

Naturally, the first tool I reached for was Loom. It’s probably the fastest way to record something and send a shareable link, but I ran into a limitation almost immediately: the free plan only allows recordings up to 5 minutes.

My demo needed more time. I could have split it into multiple videos, but that breaks the flow of explaining a product. A single continuous recording just makes more sense when you’re walking someone through something. At that moment I realized something slightly annoying — the tool that was supposed to make recording easy was suddenly introducing friction.

So I started looking for alternatives.

1.2 Why Traditional Recording Tools Didn’t Fit

The obvious alternative was using a traditional recording tool like OBS Studio.

OBS is incredibly powerful. Streamers use it, YouTubers use it, and it supports almost every configuration you could imagine. But for my use case, it felt like overkill. Every time I wanted to record something, the workflow looked like this:

  1. open OBS

  2. configure screen capture

  3. check microphone input

  4. configure webcam overlay

  5. start recording

  6. stop recording

  7. export the video

  8. upload it somewhere (likely Loom)

  9. generate a shareable link

It works, but it’s a lot of setup for something that should be quick.

Other tools like Camtasia have a similar issue. They’re great for producing polished videos, but they’re not optimized for quick “record and share” moments.

1.3 The Idea: A Minimal Loom-Style Recorder

After going through these options, it occurred to me that modern browsers already provide the APIs needed to capture screens, cameras, and microphones directly from a web application.

Instead of relying on external software each time I wanted to record something, I decided to build a small tool for myself — essentially a minimal Loom-style recorder that could:

  • record my screen

  • capture my microphone

  • overlay my webcam

  • automatically upload the recording

  • generate a shareable link

In other words, recreate the convenience of Loom, but without the recording limits and without needing to juggle multiple tools.

2. Designing the Recorder

2.1 What the Recorder Needed to Do

Once I decided to build it, the goal wasn’t to recreate a full product like Loom. I only needed something that solved my own recording workflow.

At a minimum, the recorder needed to capture three things: the screen, the microphone, and my webcam. Product demos feel much more natural when the person explaining is visible, so the webcam had to appear as a small picture-in-picture overlay on the screen recording.

Beyond recording, convenience was the most important requirement. I didn’t want to deal with exporting files or uploading them manually every time. Once the recording stopped, the tool should automatically upload the video and give me a shareable link.

At the same time, I wanted the flexibility to download the recording locally if needed — for example if I wanted to edit it later or upload it somewhere else.

In short, the workflow I wanted looked like this:

Open recorder
↓
Record screen + webcam
↓
Stop recording
↓
Get a shareable link

2.2 Choosing Mux for Video Infrastructure

Recording the video in the browser solved only half of the problem. The next question was: where should the video go after recording?

Handling video infrastructure yourself can get complicated very quickly. You need storage, encoding pipelines, streaming formats, and a reliable way to deliver the video. Instead of building all of that from scratch, I decided to use Mux.

Mux provides video infrastructure as an API. You upload a video file and Mux takes care of processing it, encoding it into streaming formats, and generating a playback ID that can be used to share or embed the video.

Another advantage is that Mux supports direct uploads, meaning the browser can upload the recording straight to Mux without sending the video through my server. My backend only needs to generate the upload URL and later check when the video processing is complete.

2.3 System Architecture

With the tools decided, the overall architecture turned out to be surprisingly simple. Most of the heavy lifting happens directly in the browser.

At a high level, the flow looks like this:

Screen + Webcam + Microphone
            ↓
        Canvas compositor
            ↓
        MediaRecorder
            ↓
          Video Blob
            ↓
       Direct Upload
            ↓
            Mux
            ↓
      Shareable Video Link

The browser captures the screen, webcam, and microphone using built-in Web APIs. The video streams are composited together on a canvas so the webcam appears as an overlay on top of the screen recording.

That combined stream is then recorded into a video file and uploaded directly to Mux, which handles processing and streaming.

The backend only performs two small tasks: generating upload URLs and retrieving the playback ID once the video is ready.

3. The Coding: Frontend

3.1 The Recording Pipeline

The frontend handles the actual recording experience.

The recording pipeline captures three inputs:

  • screen

  • microphone

  • webcam

3.2 Capturing the Screen, Microphone and Webcam

The screen is captured using the browser's getDisplayMedia API.

const screenStream = await navigator.mediaDevices.getDisplayMedia({
  video: true,
  audio: false,
});

This triggers the browser's native screen sharing prompt.

Next, the microphone and webcam are captured using getUserMedia.

const micStream = await navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: true,
    noiseSuppression: true,
  },
});

And the webcam:

const webcamStream = await navigator.mediaDevices.getUserMedia({
  video: { width: 320, height: 320 },
});

3.3 Compositing Video with Canvas

Because browsers cannot directly composite multiple video streams into a single picture, the screen and webcam are combined on a canvas.

The screen is drawn as the background:

ctx.drawImage(screenVideo, 0, 0, canvas.width, canvas.height);

Then the webcam is rendered as a circular clipped overlay:

ctx.save();
ctx.beginPath();
ctx.arc(x + size / 2, y + size / 2, size / 2, 0, Math.PI * 2);
ctx.clip();
ctx.drawImage(webcamVideo, x, y, size, size);
ctx.restore();

The result is a picture-in-picture recording similar to Loom.

3.4 Recording the Stream

Once the canvas is rendering frames, it can be converted into a video stream.

const canvasStream = canvas.captureStream(30);

The microphone audio track is then added:

const combinedStream = new MediaStream([
  ...canvasStream.getVideoTracks(),
  ...micStream.getAudioTracks(),
]);

Finally, the combined stream is recorded using the MediaRecorder API.

const mediaRecorder = new MediaRecorder(combinedStream, {
  mimeType: "video/webm; codecs=vp9",
});

When recording stops, the browser produces a video blob, which is ready to be uploaded.
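The chunk-to-blob step isn't shown above, so here is a minimal sketch of how it can look (illustrative, not my exact implementation; recordToBlob is a hypothetical helper, and the recorder argument only relies on the standard MediaRecorder events):

```javascript
// Sketch: collect MediaRecorder chunks into a single Blob.
// `recorder` is assumed to follow the MediaRecorder event interface.
function recordToBlob(recorder) {
  return new Promise((resolve) => {
    const chunks = [];
    recorder.ondataavailable = (event) => {
      // Chunks arrive periodically, or once on stop, depending on how
      // start() was called; skip empty chunks.
      if (event.data && event.data.size > 0) chunks.push(event.data);
    };
    recorder.onstop = () => resolve(new Blob(chunks, { type: "video/webm" }));
    recorder.start();
  });
}
```

In the real flow you would pass the mediaRecorder created above and call mediaRecorder.stop() when the user finishes recording.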

4. The Coding: Backend

4.1 Handling Uploads and Video Processing

The backend is responsible for a few small but important tasks:

  • generating upload URLs

  • checking when video processing is complete

  • retrieving transcripts

  • generating summaries

Instead of handling video uploads directly, the server generates signed upload URLs using Mux.

4.2 Creating an Upload URL

When the frontend wants to upload a recording, it calls the createUploadUrl server action.

export async function createUploadUrl() {
  const upload = await mux.video.uploads.create({
    new_asset_settings: {
      playback_policy: ["public"],
      video_quality: "plus",
      mp4_support: "standard",
      input: [
        {
          generated_subtitles: [
            { language_code: "en", name: "English (Auto)" },
          ],
        },
      ],
    },
    cors_origin: "*",
  });

  return upload;
}

This tells Mux to create a new upload session with the following settings:

  • public playback so the video can be shared

  • automatic MP4 support

  • auto-generated subtitles

  • higher video quality encoding

The important part here is that Mux returns a direct upload URL, which allows the browser to upload the video without passing through the server.
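The browser-side upload itself is then a single PUT request to that URL. A minimal sketch (uploadRecording is a hypothetical helper; the injectable fetchImpl parameter is only there so the logic can be exercised outside a browser, where you would simply use the global fetch):

```javascript
// Sketch: upload the recorded blob straight to the Mux direct upload URL,
// bypassing our own server entirely.
async function uploadRecording(uploadUrl, blob, fetchImpl = fetch) {
  const response = await fetchImpl(uploadUrl, {
    method: "PUT",
    headers: { "Content-Type": "video/webm" },
    body: blob,
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response;
}
```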

4.3 Retrieving the Playback ID

Once the upload finishes, Mux processes the video asynchronously.

To check when processing is complete, the server retrieves the asset linked to the upload.

export async function getAssetIdFromUpload(uploadId: string) {
  const upload = await mux.video.uploads.retrieve(uploadId);

  if (upload.asset_id) {
    const asset = await mux.video.assets.retrieve(upload.asset_id);

    return {
      playbackId: asset.playback_ids?.[0]?.id,
      status: asset.status,
    };
  }

  return { status: "waiting" };
}

When the asset becomes ready, Mux provides a playback ID that can be used to stream the video.
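Because processing is asynchronous, the frontend has to keep checking until the status flips to ready. A small sketch of that polling loop (waitForPlayback is a hypothetical helper; the interval and attempt limit are arbitrary values picked for illustration, and getStatus stands in for a call to getAssetIdFromUpload):

```javascript
// Sketch: poll the backend until Mux reports the asset as ready,
// then return its playback ID.
async function waitForPlayback(getStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await getStatus();
    if (result.status === "ready" && result.playbackId) {
      return result.playbackId;
    }
    // Not ready yet: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for the Mux asset to become ready");
}
```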

4.4 Listing and Deleting Videos

The backend also exposes helpers to manage videos stored in Mux.

export async function listVideos() {
  const assets = await mux.video.assets.list({ limit: 25 });
  return assets.data;
}

And deleting a video:

export async function deleteVideo(assetId: string) {
  await mux.video.assets.delete(assetId);
}

These functions make it easy to build a simple video dashboard if needed.

4.5 Retrieving the Transcript

Because we enabled auto-generated subtitles, we can also retrieve transcripts from the video.

The server finds the auto-generated subtitle text track on the asset (textTrack below), then fetches the .vtt file generated by Mux and parses it.

const vttUrl = `https://stream.mux.com/${playbackId}/text/${textTrack.id}.vtt`;
const response = await fetch(vttUrl);
const vttText = await response.text();

The .vtt file is then parsed into timestamped transcript blocks:

00:01 Hello everyone
00:04 Today I'll show you this feature

This allows the UI to display searchable transcripts or timestamps alongside the video.
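The parsing step can be sketched as a small function (parseVtt is a hypothetical helper; it assumes simple cues, a timing line followed by text lines, while real Mux .vtt files may include extra header or styling lines):

```javascript
// Sketch: parse WebVTT text into timestamped transcript blocks.
function parseVtt(vttText) {
  const cues = [];
  for (const block of vttText.split(/\n\s*\n/)) {
    const lines = block.trim().split("\n");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // skip the WEBVTT header block
    // "00:00:01.000 --> 00:00:03.000" -> keep the start time as mm:ss
    const start = lines[timingIndex].split("-->")[0].trim();
    const time = start.replace(/^00:/, "").slice(0, 5);
    const text = lines.slice(timingIndex + 1).join(" ").trim();
    if (text) cues.push({ time, text });
  }
  return cues;
}
```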

4.6 Generating AI Summaries

One interesting addition is automatic summaries.

Using the @mux/ai workflows, the system can generate:

  • a video title

  • a description

  • relevant tags

const result = await getSummaryAndTags(asset.id, {
  provider: "google",
  tone: "professional",
});

This uses the auto-generated transcript to summarize the video content.

The end result is exactly what I set out to build: a simple record → upload → share experience.

5. Final Thoughts

What started as a small annoyance — Loom’s five-minute limit — turned into a fun exercise in building a lightweight recording tool. Modern browsers already provide powerful primitives for media capture, and when combined with services like Mux, it becomes surprisingly straightforward to build something that feels close to a production video platform. Sometimes the best tools come from solving a tiny problem you run into every day.

If you're curious to explore the implementation, you can check out the GitHub repository, or try the live demo to see the recorder in action.
