How to run image classification in the browser with TensorFlow.js and a pre-trained model?

I want to add client-side image classification to a web app — the user picks a photo and the app labels it without any server round-trip. I’ve heard TensorFlow.js can do this with pre-trained models like MobileNet, but I’m not sure how to set it up.

Specifically:

  • How do I load a pre-trained model in the browser?
  • What format does the image need to be in for inference?
  • What kind of latency should I expect on a typical laptop?
  • Are there gotchas around memory or model size?

I’d prefer a vanilla JS approach (no React required) so I can integrate it into any stack.



This is a great use case for TensorFlow.js — running MobileNet in the browser is surprisingly straightforward. Here’s a complete walkthrough.

1. Include the libraries

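Add these scripts to your HTML, or install the @tensorflow/tfjs and @tensorflow-models/mobilenet packages via npm if you are bundling. The unversioned CDN URLs below always serve the latest release, so pin an explicit version in production: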

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/mobilenet"></script>

2. Minimal working example

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Browser Image Classifier</title>
</head>
<body>
  <input type="file" id="imageUpload" accept="image/*" />
  <img id="preview" style="max-width:400px; display:none;" />
  <pre id="results">Upload an image to classify it.</pre>

  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/mobilenet"></script>
  <script>
    let model;

    async function init() {
      document.getElementById('results').textContent = 'Loading model...';
      model = await mobilenet.load({ version: 2, alpha: 1.0 });
      document.getElementById('results').textContent = 'Model ready. Upload an image.';
    }

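    document.getElementById('imageUpload')
      .addEventListener('change', (event) => {
        const file = event.target.files[0];
        if (!file) return;

        // The model download may still be in progress when the user picks a file.
        if (!model) {
          document.getElementById('results').textContent =
            'Model is still loading. Please try again in a moment.';
          return;
        }

        const img = document.getElementById('preview');
        const objectUrl = URL.createObjectURL(file);

        // Attach the handler before setting src so the load event is never missed.
        img.onload = async () => {
          const start = performance.now();
          const predictions = await model.classify(img);
          const elapsed = (performance.now() - start).toFixed(0);

          // Release the blob URL once the image has been classified.
          URL.revokeObjectURL(objectUrl);

          // Each prediction has a className and a probability between 0 and 1.
          const output = predictions
            .map(p => p.className + ': ' + (p.probability * 100).toFixed(1) + '%')
            .join('\n');

          document.getElementById('results').textContent =
            'Inference took ' + elapsed + 'ms\n\n' + output;
        };

        img.src = objectUrl;
        img.style.display = 'block';
      });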

    init();
  </script>
</body>
</html>

3. How it works under the hood

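  • mobilenet.load() downloads the model weights (~16 MB for MobileNet v2 alpha 1.0). Repeat visits are usually served from the browser’s HTTP cache; the weights are not stored in IndexedDB unless you explicitly save a model there yourself.
  • model.classify(imgElement) accepts an <img>, <canvas>, or <video> element directly (an ImageData object or a 3D tensor also works), so no manual tensor conversion is needed. It returns the top 3 predictions by default; pass a second argument, as in model.classify(img, 5), to get more.
  • The @tensorflow/tfjs bundle ships the WebGL and CPU backends. It uses WebGL when available and falls back to CPU otherwise. WebGPU (Chrome 113+) is not picked automatically: it lives in the separate @tensorflow/tfjs-backend-webgpu package and has to be enabled with tf.setBackend('webgpu').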
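
If you want to control the backend rather than rely on the default, something like the following could work. This is only a sketch: chooseBackend is a made-up helper name, the preference order is an assumption, and tf is passed in as a parameter so the same code works with the global from the script tag or with an imported module. It relies on tf.setBackend(), tf.ready(), and tf.getBackend().

```javascript
// Sketch: try backends in order of preference and keep the first one that initializes.
// `chooseBackend` is a hypothetical helper; `tf` is the TensorFlow.js namespace.
async function chooseBackend(tf, preferred = ['webgpu', 'webgl', 'cpu']) {
  for (const name of preferred) {
    try {
      // setBackend() resolves to false when the backend fails to initialize
      // and can throw when the backend was never registered.
      if (await tf.setBackend(name)) {
        await tf.ready();
        return tf.getBackend();
      }
    } catch (err) {
      // Backend not registered (for example, the WebGPU package is not loaded).
      // Fall through and try the next candidate.
    }
  }
  throw new Error('No TensorFlow.js backend could be initialized');
}
```

In the page above you would call chooseBackend(tf) before mobilenet.load() and log the returned name to see which backend is actually in use.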

4. Performance expectations

Device | Backend | Typical latency
Modern laptop (discrete GPU) | WebGL | 30-80 ms
Mid-range laptop (integrated GPU) | WebGL | 80-200 ms
Mobile phone (recent) | WebGL | 100-300 ms
WebGPU-capable browser | WebGPU | 15-50 ms

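Treat these as rough ballpark figures and measure on your own target devices. The first inference is noticeably slower because the GPU shaders are compiled on first use; subsequent runs are much faster. If first-click latency matters, run one throwaway prediction right after the model loads.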
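
To get numbers for your own hardware, you could time one cold run and then take the median of a few warm runs. The helper below is a sketch under assumptions: measureLatency, its parameter names, and the default of five warm runs are all invented for this example. The classify argument is any async function, for instance (img) => model.classify(img).

```javascript
// Sketch: time one cold run, then report the median of several warm runs.
// `measureLatency` is a hypothetical helper; `classify` is any async function.
async function measureLatency(classify, input, warmRuns = 5) {
  const timeOnce = async () => {
    const start = performance.now();
    await classify(input);
    return performance.now() - start;
  };

  // The first call includes one-time setup such as shader compilation.
  const coldMs = await timeOnce();

  const samples = [];
  for (let i = 0; i < warmRuns; i++) {
    samples.push(await timeOnce());
  }

  // Median of the warm samples (the upper middle value for even counts).
  samples.sort((a, b) => a - b);
  const warmMedianMs = samples[Math.floor(samples.length / 2)];

  return { coldMs, warmMedianMs };
}
```

With the page above, calling measureLatency((el) => model.classify(el), img) inside the onload handler would give you both values for the selected photo.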

5. Gotchas to watch out for

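  • Memory leaks: model.classify() cleans up the tensors it creates internally, but any tensor you create yourself (for example with tf.browser.fromPixels() when processing video frames) stays in memory until you release it. Call dispose() on those tensors, or wrap synchronous tensor code in tf.tidy(). Note that tf.tidy() cannot wrap an async function.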
  • Model size: The initial download is ~16 MB. Use alpha: 0.5 for a faster model with roughly half the download size and somewhat lower accuracy. Check the Network panel in DevTools for the exact figures.
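  • CORS: If you load images from another origin, the browser blocks pixel access unless the image was requested with CORS. Set img.crossOrigin = 'anonymous' before assigning src, and make sure the image server sends an Access-Control-Allow-Origin header. Images picked through a file input are not affected.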
  • Mobile memory: On low-end mobile devices, loading multiple models simultaneously can cause OOM crashes. Load one at a time and dispose when switching.
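
For the memory point, here is a minimal sketch of classifying a frame from a tensor you create yourself and always releasing it afterwards. classifyFrame and liveTensorCount are hypothetical helper names, and tf and model are passed in as parameters. The sketch uses tf.browser.fromPixels(), the tensor’s dispose() method, and tf.memory().

```javascript
// Sketch: classify one frame from an explicit tensor and always release it.
// `classifyFrame` is a hypothetical helper; `source` is an <img>, <canvas>,
// <video>, or ImageData object.
async function classifyFrame(tf, model, source) {
  // fromPixels() produces an int32 tensor of shape [height, width, 3].
  const pixels = tf.browser.fromPixels(source);
  try {
    // tf.tidy() cannot wrap this call because classify() is async,
    // so the tensor is disposed manually in `finally`.
    return await model.classify(pixels);
  } finally {
    pixels.dispose();
  }
}

// Sketch: number of tensors currently alive. A count that keeps growing
// from frame to frame points to a leak.
function liveTensorCount(tf) {
  return tf.memory().numTensors;
}
```

Logging liveTensorCount(tf) every few seconds during a video loop is a cheap way to confirm that the count stays flat.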

This pattern scales nicely — once you’re comfortable with MobileNet, you can swap in other TF.js models (COCO-SSD for object detection, PoseNet for body tracking, etc.) using the same general approach.
