Technical walkthrough

How HTMstudio works

A nine-stage pipeline that turns any self-contained webpage into a production-ready Next.js 14 App Router project — including CSS injected at runtime by JavaScript, semantic class renaming, real React state, extracted images, design tokens, sub-components, and a typed interaction map.

Step 01

Capture the full page with SingleFile

Save any live webpage as a single, self-contained .html file — including computed styles, fonts, base64 images, and the fully-rendered DOM after JavaScript runs.

  • Install the free SingleFile extension (Chrome, Firefox, Edge, Safari)
  • Navigate to any page — including pages behind login
  • Click the SingleFile icon → one .html file saved to disk
  • Every external stylesheet is inlined, and fonts and images are embedded as base64 data URIs, so nothing is missing
  • Captures the post-JS DOM, not the source HTML — dynamic content included
Terminal
$ ls ~/Downloads/
my-dashboard.html (2.4 MB, fully self-contained)

# Everything is inside this one file:
# ✓ CSS from 6 stylesheets
# ✓ Google Fonts inlined as base64
# ✓ 14 images inlined as data URIs
# ✓ Fully rendered DOM
Step 02

Drop the file — staged, not instant

Drop your .html file onto the converter drop zone (or open it from VS Code). The file is read client-side, the component name is inferred from the filename, and you can edit that name before spending a credit.

  • Drag-and-drop or click to open a file picker — no upload size limit
  • Component name auto-inferred from filename (my-dashboard.html → MyDashboard)
  • Edit the component name before scanning — it becomes the function name and file prefix
  • A credit is charged only when you click 'Scan now · 1 credit'
  • VS Code: right-click any .html file, or press Ctrl+Shift+H to send the active file directly
UI
┌─────────────────────────────────────────────┐
│  Drop your .html file here                  │
│                                             │
│  📄 my-dashboard.html (2.4 MB)              │
│                                             │
│  Component name: [ MyDashboard ]            │
│                                             │
│  [ Scan now · 1 credit ]                    │
└─────────────────────────────────────────────┘
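
The filename-to-component-name inference could look roughly like the sketch below. The helper name `inferComponentName` and both fallbacks (`ConvertedPage` for an empty name, a `Page` prefix for names starting with a digit) are assumptions made for illustration; the converter's actual rules are not documented here.

```typescript
// Hypothetical helper, not the converter's real implementation.
function inferComponentName(filename: string): string {
  // Drop any directory part and the final extension:
  // "downloads/my-dashboard.html" -> "my-dashboard".
  const base = filename.replace(/^.*[\\/]/, "").replace(/\.[^.]*$/, "");

  // Split on anything that is not a letter or digit, then capitalise each piece.
  const words = base.split(/[^A-Za-z0-9]+/).filter(Boolean);
  const name = words.map((w) => w.charAt(0).toUpperCase() + w.slice(1)).join("");

  // Assumed fallbacks: a React component name must be non-empty
  // and must not start with a digit.
  if (name === "") return "ConvertedPage";
  return /^[0-9]/.test(name) ? `Page${name}` : name;
}

console.log(inferComponentName("my-dashboard.html")); // MyDashboard
```
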
Step 03

CSS extraction — static + runtime-injected

Every <style> block is merged into ConvertedPage.css — including CSS that JavaScript injected at runtime. Pages like Google, YouTube, and modern SPAs inject critical layout rules (flex grids, component styles) via JS after load. SingleFile captures those injected blocks. HTMstudio extracts all of them.

  • All <style> elements extracted — including runtime JS-injected styles captured by SingleFile
  • SingleFile marks JS-injected styles with class='sf-hidden' — HTMstudio includes these in extraction and removes them from the JSX (they live in ConvertedPage.css instead)
  • @media and @supports blocks parsed with recursive depth counting — never flattened
  • URL references rewritten to absolute using the page's base href
  • Vendor prefixes and CSS custom properties preserved exactly as-is
  • Design tokens (--color-*, --spacing-*, --font-*) extracted into styles/tokens.css
  • :root and body variable blocks grouped by category (colors, spacing, typography, borders, shadows)
  • Result: pages that rely on JS-injected CSS (search result pages, dashboards, SPAs) now render with correct layouts out of the box
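
To illustrate the depth counting used for @media and @supports blocks, here is a minimal sketch of a brace-depth splitter. The function name `splitTopLevelBlocks` and the `CssBlock` shape are hypothetical, and the sketch assumes no braces appear inside strings or comments, which a production parser would have to handle.

```typescript
interface CssBlock {
  prelude: string; // selector or at-rule header, e.g. "@media (min-width:600px)"
  body: string;    // everything between the matching outer braces
}

// Splits CSS into top-level blocks by tracking brace depth, so a nested
// at-rule stays intact inside its parent's body instead of being flattened.
function splitTopLevelBlocks(css: string): CssBlock[] {
  const blocks: CssBlock[] = [];
  let depth = 0;
  let blockStart = 0; // where the current prelude begins
  let bodyStart = -1; // index just after the opening "{" of the current block

  for (let i = 0; i < css.length; i++) {
    const ch = css[i];
    if (ch === "{") {
      if (depth === 0) bodyStart = i + 1;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) {
        blocks.push({
          prelude: css.slice(blockStart, bodyStart - 1).trim(),
          body: css.slice(bodyStart, i).trim()
        });
        blockStart = i + 1;
      }
    }
  }
  return blocks;
}
```

Calling the function again on the `body` of an at-rule block yields the rules nested inside it, which is where the recursion comes from.
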
CSS source breakdown
# Sources included in ConvertedPage.css:

✓ <style> in <head>          static page CSS
✓ <style data-late-css>      deferred/async CSS
✓ <style class=sf-hidden>    JS-injected at runtime
✓ <link rel=stylesheet>      external sheets (inlined by SingleFile)

# Result for Google Search scan:
  Before: 45 KB extracted (sf-hidden excluded)
  After:  162 KB extracted (sf-hidden included)

  components.css: 13 KB → 80 KB (5× more semantic rules)
  Multi-column knowledge panel: ✓ now renders correctly
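
Rewriting url() references against the page's base href can be sketched with the standard `URL` constructor. The function name `absolutizeCssUrls` and its `baseHref` parameter are hypothetical; the sketch skips data URIs and fragment references and only handles simple quoted or unquoted url() values.

```typescript
// Hypothetical helper: resolves relative url(...) references against baseHref.
function absolutizeCssUrls(css: string, baseHref: string): string {
  return css.replace(
    /url\(\s*(['"]?)([^'")]+)\1\s*\)/g,
    (match, _quote: string, ref: string) => {
      // Data URIs and fragment-only references are already self-contained.
      if (ref.startsWith("data:") || ref.startsWith("#")) return match;
      try {
        return `url("${new URL(ref, baseHref).href}")`;
      } catch {
        return match; // leave references that cannot be parsed untouched
      }
    }
  );
}
```
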
Step 04

CSS de-obfuscation + semantic naming

Hash-generated class names (s17s3xpb, _3TG4x) are detected and renamed to readable semantic equivalents based on their CSS rules and the HTML element they appear on. A two-pass algorithm avoids junk number suffixes when a name is unique.

  • Detection: 3–20 chars, mix of digits + letters, no hyphens = obfuscated
  • Two-pass naming: count frequency first, only add -1/-2 suffix when a collision exists
  • Element-aware: a class on <button> with border becomes btn-outline, not card
  • Patterns: flex, grid, card, row, btn, btn-outline, icon, panel, section, animated, etc.
  • Rename applied to both JSX (className props) AND CSS selectors — always consistent
  • Word-boundary regex: .card never corrupts .card-header or .card-footer
  • css-class-map.json in ZIP: full table of original → semantic with CSS rules
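
The detection heuristic and the two-pass suffix rule can be sketched as follows. Both function names are hypothetical, and the `proposals` input (pairs of original class and proposed semantic name) stands in for whatever the rule-matching stage produces.

```typescript
// Heuristic from the list above: 3-20 chars, letters and digits mixed, no hyphens.
function looksObfuscated(cls: string): boolean {
  return (
    cls.length >= 3 &&
    cls.length <= 20 &&
    !cls.includes("-") &&
    /[A-Za-z]/.test(cls) &&
    /[0-9]/.test(cls)
  );
}

// Two-pass naming: pass 1 counts how often each proposed name occurs,
// pass 2 appends -1, -2, ... only to names that actually collide.
function assignSemanticNames(proposals: Array<[string, string]>): Map<string, string> {
  const counts = new Map<string, number>();
  for (const [, proposed] of proposals) {
    counts.set(proposed, (counts.get(proposed) ?? 0) + 1);
  }

  const used = new Map<string, number>();
  const result = new Map<string, string>();
  for (const [original, proposed] of proposals) {
    if ((counts.get(proposed) ?? 0) > 1) {
      const n = (used.get(proposed) ?? 0) + 1;
      used.set(proposed, n);
      result.set(original, `${proposed}-${n}`);
    } else {
      result.set(original, proposed); // unique name: no junk suffix
    }
  }
  return result;
}
```

A single-pass approach would have to suffix the first occurrence before knowing whether a second one exists; counting first is what keeps unique names clean.
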
Before → After
BEFORE (unreadable):
  <div className="_3TG4x s17s3xpb xk2d9">
    <button className="_AbcX1">Submit</button>
  </div>

AFTER (semantic):
  <div className="card flex row">
    <button className="btn-primary">Submit</button>
  </div>

  // css-class-map.json
  { "_3TG4x": "card", "s17s3xpb": "flex",
    "xk2d9": "row", "_AbcX1": "btn-primary" }
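
A rename that stays consistent across CSS and JSX without touching longer class names might look like this sketch. The function names are hypothetical. Note that a plain `\b` would not be enough here, because a regex word boundary also exists between `card` and `-header`; the sketch uses a negative lookahead for word characters and hyphens instead.

```typescript
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Renames ".from" in selectors, but not ".from-header" or ".fromX".
function renameClassInCss(css: string, from: string, to: string): string {
  const selector = new RegExp(`\\.${escapeRegExp(from)}(?![\\w-])`, "g");
  return css.replace(selector, () => `.${to}`);
}

// Renames whole tokens inside className="..." string literals only.
function renameClassInJsx(jsx: string, from: string, to: string): string {
  return jsx.replace(/className="([^"]*)"/g, (_m, list: string) => {
    const renamed = list
      .split(/\s+/)
      .filter(Boolean)
      .map((c) => (c === from ? to : c));
    return `className="${renamed.join(" ")}"`;
  });
}
```
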
Step 05

JSX conversion + attribute normalisation

The HTML DOM is walked node-by-node and serialised to valid JSX. All 40+ HTML→JSX renames are applied, inline styles become JS objects, form controls get React defaults, and open/active ARIA states are reset to closed.

  • All 40+ HTML→JSX renames: class→className, for→htmlFor, tabindex→tabIndex, etc.
  • SVG attributes camelCased: stroke-width→strokeWidth, clip-path→clipPath
  • Inline style strings become JS objects: style={{ display: 'flex', gap: '8px' }}
  • CSS custom properties in inline styles stay quoted: style={{ '--rail-width': '240px' }}
  • Boolean attributes emit bare prop or {true}/{false} — no empty strings
  • Inline event handlers stripped; replaced with TODO stubs: onClick={() => { /* TODO */ }}
  • Open states reset: aria-expanded=true → false, data-state=open → closed
  • Script and noscript blocks removed — they would re-execute and break hydration
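
As a rough illustration of the attribute pass, the sketch below applies a rename table and resets open states. The table contains only the five renames named above (the full set is said to cover 40+), and `normaliseAttribute` is a hypothetical name.

```typescript
// Partial rename table: only the examples listed above.
const JSX_RENAMES = new Map<string, string>([
  ["class", "className"],
  ["for", "htmlFor"],
  ["tabindex", "tabIndex"],
  ["stroke-width", "strokeWidth"],
  ["clip-path", "clipPath"]
]);

// Returns the JSX attribute name and value for one HTML attribute.
function normaliseAttribute(name: string, value: string): [string, string] {
  const lower = name.toLowerCase();

  // Reset captured open states so the component starts closed.
  if (lower === "aria-expanded" && value === "true") return [lower, "false"];
  if (lower === "data-state" && value === "open") return [lower, "closed"];

  // Unknown names pass through unchanged (this preserves SVG casing such as viewBox).
  return [JSX_RENAMES.get(lower) ?? name, value];
}
```
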
components/MyDashboard.tsx
"use client"; // required in the App Router because the component attaches an onClick handler

export default function MyDashboard() {
  return (
    <nav className="sticky row">
      <a href="/" className="logo">
        <span className="logo-icon">⬡</span>
      </a>
      <button
        className="btn-outline"
        aria-expanded={false}
        onClick={() => { /* TODO: toggleMenu */ }}
      >
        Menu
      </button>
    </nav>
  );
}
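
The inline-style conversion can be sketched as a small parser. `styleStringToObject` is a hypothetical name, and the sketch splits naively on semicolons, so values that themselves contain semicolons (for example a data URI inside url()) would need extra handling.

```typescript
// Converts 'display: flex; gap: 8px' into { display: "flex", gap: "8px" }.
function styleStringToObject(style: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const decl of style.split(";")) {
    const colon = decl.indexOf(":");
    if (colon === -1) continue;

    const prop = decl.slice(0, colon).trim();
    const value = decl.slice(colon + 1).trim();
    if (!prop || !value) continue;

    // Custom properties keep their exact name; everything else is camelCased.
    // React expects "ms" lowercase but other vendor prefixes capitalised.
    const key = prop.startsWith("--")
      ? prop
      : prop
          .toLowerCase()
          .replace(/^-ms-/, "ms-")
          .replace(/-([a-z])/g, (_m, c: string) => c.toUpperCase());
    out[key] = value;
  }
  return out;
}
```
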
Step 06

Component decomposition

Landmark elements (<nav>, <header>, <main>, <aside>, <footer>) up to 2 levels deep are extracted into their own .tsx sub-components. Names are derived from aria-label, role, or id — not generic numbers.

  • Landmarks detected: nav, header, main, aside, footer
  • Only landmarks at depth ≤ 2 are extracted; more deeply nested sections stay inline
  • Name from aria-label: aria-label="Primary navigation" → PrimaryNavigationNav.tsx
  • Name from id: id="sidebar" → SidebarAside.tsx
  • Name from role attribute when no other identifier exists
  • Main component imports and composes sub-components cleanly
  • Each sub-component is a standalone .tsx file with its own "use client" if needed
File tree
components/
  MyDashboard.tsx            ← main shell
  PrimaryNavigationNav.tsx   ← <nav aria-label='Primary navigation'>
  SidebarAside.tsx           ← <aside id='sidebar'>
  MainContent.tsx            ← <main>
  SiteFooter.tsx             ← <footer>
  ClientInit.tsx             ← boot: body reveal + delegated events
  _interactions.ts           ← typed event handler map
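
The naming rule for identified landmarks (aria-label first, then id, then role) can be sketched as below. The names `landmarkFileName` and `LandmarkAttrs` are hypothetical. The fallback for landmarks with no identifier at all is an assumption: the file tree shows names such as MainContent.tsx and SiteFooter.tsx, whose rule is not described, so this sketch does not try to reproduce them.

```typescript
type Landmark = "nav" | "header" | "main" | "aside" | "footer";

interface LandmarkAttrs {
  ariaLabel?: string;
  id?: string;
  role?: string;
}

function toPascalCase(text: string): string {
  return text
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1))
    .join("");
}

// Builds "<Identifier><Tag>.tsx", e.g. "PrimaryNavigationNav.tsx".
function landmarkFileName(tag: Landmark, attrs: LandmarkAttrs): string {
  // Priority: aria-label, then id, then role. Empty strings are skipped.
  const source = attrs.ariaLabel || attrs.id || attrs.role || "";
  // Assumed fallback: with no identifier, only the tag name remains ("Main.tsx").
  return `${toPascalCase(source)}${toPascalCase(tag)}.tsx`;
}
```
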
Step 07

ARIA interactivity injection

Eight ARIA interaction patterns are detected in the original HTML and automatically converted to working React useState hooks — with real onClick, onMouseEnter/Leave, onFocus/Blur handlers wired into the JSX.

  • role="tablist" + role="tab" → Tab switcher with useState(activeIndex)
  • aria-expanded + aria-controls → Disclosure/accordion toggle
  • aria-haspopup + aria-controls → Dropdown menu open/close
  • role="dialog" + aria-modal → Modal with open/close state
  • role="switch" + aria-checked → Toggle switch
  • role="progressbar" + aria-valuenow → Animated progress bar with state
  • role="tooltip" + aria-describedby → Hover + focus tooltip
  • data-state="open/closed" → Generic panel toggle
  • Each pattern adds "use client" + import { useState } automatically
ARIA → useState (example: tabs)
"use client";
import { useState } from "react";

export default function MyDashboard() {
  const [s1, setS1] = useState(0); // tabs
  const [s2, setS2] = useState(false); // dropdown
  const [s3, setS3] = useState(false); // modal

  return (
    <div role="tablist">
      <button role="tab"
        aria-selected={s1 === 0 ? "true" : "false"}
        onClick={() => setS1(0)}>Overview</button>
      <button role="tab"
        aria-selected={s1 === 1 ? "true" : "false"}
        onClick={() => setS1(1)}>Settings</button>
    </div>
  );
}
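
Before any state hook is generated, each element has to be classified into one of the eight patterns. A possible classifier is sketched below; the function name, the pattern labels, and the check order are assumptions. Dropdown is tested before disclosure because a menu trigger commonly carries both aria-haspopup and aria-expanded, and the tooltip check looks only at the role, since aria-describedby normally sits on the trigger element rather than on the tooltip itself.

```typescript
type AriaPattern =
  | "tabs" | "disclosure" | "dropdown" | "modal"
  | "switch" | "progressbar" | "tooltip" | "panel";

// attrs: the element's attributes as a plain name -> value record.
function detectAriaPattern(attrs: Record<string, string>): AriaPattern | null {
  const has = (name: string): boolean => name in attrs;
  const role = attrs["role"];

  if (role === "tablist" || role === "tab") return "tabs";
  if (role === "dialog" && has("aria-modal")) return "modal";
  if (role === "switch" && has("aria-checked")) return "switch";
  if (role === "progressbar" && has("aria-valuenow")) return "progressbar";
  if (role === "tooltip") return "tooltip";
  if (has("aria-haspopup") && has("aria-controls")) return "dropdown";
  if (has("aria-expanded") && has("aria-controls")) return "disclosure";

  const state = attrs["data-state"];
  if (state === "open" || state === "closed") return "panel";
  return null;
}
```
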
Step 08

Base64 image + favicon extraction

Data URIs embedded in JSX src= attributes and CSS url() values are extracted to real image files in public/images/. The favicon is detected, format-corrected from magic bytes (not filename extension), and wired into Next.js metadata. Duplicates share one file.

  • Scans JSX src="data:image/..." and CSS url(data:image/...)
  • Shared deduplication map: same image in JSX + CSS = one file, not two
  • Supports PNG, JPG, GIF, WebP, SVG, AVIF
  • Filenames: img-1.png, img-2.jpg, etc. — stable across runs
  • ZIP includes real binary files (Uint8Array) — not re-encoded strings
  • Favicon: extracted from <link rel=icon>, format detected from binary magic bytes (PNG 89 50 4E 47, JPEG FF D8, GIF 47 49 46 38) — not just the declared MIME type or .ico extension
  • Favicon format correction: .ico files that are actually PNG are saved as .png so browsers render them correctly as tab icons
  • Preview tab favicon updates to match the converted page — no stale icon from previous scans
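
The shared deduplication map can be pictured as a registry keyed by the data URI itself. The sketch below is an assumption about the approach, not the converter's code: `createImageRegistry` is a hypothetical name, the extension is taken from the declared MIME type, and a real implementation might key on a content hash rather than the full URI string.

```typescript
const EXT_BY_MIME = new Map<string, string>([
  ["image/png", "png"],
  ["image/jpeg", "jpg"],
  ["image/gif", "gif"],
  ["image/webp", "webp"],
  ["image/svg+xml", "svg"],
  ["image/avif", "avif"]
]);

// One registry is shared by the JSX pass and the CSS pass.
function createImageRegistry() {
  const pathByUri = new Map<string, string>();
  return {
    // Returns the public path for a data URI, or null for unsupported types.
    pathFor(dataUri: string): string | null {
      const known = pathByUri.get(dataUri);
      if (known !== undefined) return known; // same image seen before: reuse the file

      const mime = /^data:([^;,]+)/.exec(dataUri)?.[1] ?? "";
      const ext = EXT_BY_MIME.get(mime.toLowerCase());
      if (ext === undefined) return null;

      const path = `/images/img-${pathByUri.size + 1}.${ext}`;
      pathByUri.set(dataUri, path);
      return path;
    },
    fileCount: (): number => pathByUri.size
  };
}
```
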
Before → After (image + favicon)
BEFORE:
  <img src="data:image/png;base64,iVBOR..." />
  background-image: url('data:image/png;base64,iVBOR...')
  favicon: data:image/x-icon;base64,... (PNG content, wrong MIME)

AFTER:
  <img src="/images/img-1.png" />
  background-image: url("/images/img-1.png")
  favicon: /favicon.png (correct MIME detected from magic bytes)

public/
  images/
    img-1.png    ← real binary file, 24 KB
    img-2.jpg    ← extracted from CSS background
  favicon.png    ← PNG magic bytes detected → saved as .png
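
Magic-byte detection for the favicon can be sketched with the three signatures listed above. The function names are hypothetical, and falling back to the declared extension when no signature matches is an assumption.

```typescript
type SniffedFormat = "png" | "jpg" | "gif" | null;

function hasSignature(bytes: Uint8Array, signature: number[]): boolean {
  // Out-of-range reads give undefined, so short inputs simply fail the match.
  return signature.every((byte, i) => bytes[i] === byte);
}

function sniffImageFormat(bytes: Uint8Array): SniffedFormat {
  if (hasSignature(bytes, [0x89, 0x50, 0x4e, 0x47])) return "png";
  if (hasSignature(bytes, [0xff, 0xd8])) return "jpg";
  if (hasSignature(bytes, [0x47, 0x49, 0x46, 0x38])) return "gif";
  return null;
}

// Chooses the file name from the content, not from the declared extension.
function faviconFileName(bytes: Uint8Array, declaredExt: string): string {
  return `favicon.${sniffImageFormat(bytes) ?? declaredExt}`;
}
```
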
Step 09

Prettier formatting + bundle assembly

The final component is formatted with Prettier (printWidth 100, trailingComma none) server-side. Then all 20+ files are assembled into a ZIP — a complete, immediately-runnable Next.js 14 App Router project.

  • Prettier runs server-side (Node.js) — identical output across all browsers
  • Applied after de-obfuscation so renamed classes are formatted correctly
  • Skipped for components > 1.6 MB to prevent timeout
  • ZIP assembled with binary-safe encoder — images are real Uint8Array blobs
  • No manual config — unzip, npm install, npm run dev and it starts
ZIP contents
my-dashboard-nextjs-app.zip
├── app/
│   ├── layout.tsx          ← metadata, CSS imports, flash fix
│   ├── page.tsx            ← renders <MyDashboard />
│   ├── globals.css         ← minimal reset
│   ├── error.tsx
│   ├── loading.tsx
│   └── not-found.tsx
├── components/
│   ├── MyDashboard.tsx     ← main component (useState wired)
│   ├── SidebarAside.tsx    ← extracted landmark
│   ├── ClientInit.tsx      ← body reveal + delegation
│   └── _interactions.ts    ← typed event map
├── styles/
│   ├── MyDashboard.css     ← all page CSS
│   ├── components.css      ← semantic class rules + @media
│   └── tokens.css          ← design tokens
├── public/images/          ← extracted base64 images
├── css-class-map.json      ← rename table
├── asset-manifest.json     ← asset status report
├── report.json             ← conversion metadata
└── package.json + next.config.ts + tsconfig.json ...
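
The formatting settings and the size guard could be expressed roughly as follows. `printWidth`, `trailingComma` and the `typescript` parser are real Prettier options; the constant names, the `shouldFormat` helper, and reading "1.6 MB" as 1.6 × 1024 × 1024 bytes of UTF-8 are assumptions.

```typescript
// Options as described above; "typescript" is the Prettier parser that handles .tsx.
const PRETTIER_OPTIONS = {
  parser: "typescript",
  printWidth: 100,
  trailingComma: "none"
} as const;

// Assumption: the 1.6 MB limit is measured in UTF-8 bytes of the component source.
const MAX_FORMAT_BYTES = 1.6 * 1024 * 1024;

// Components above the limit skip formatting to avoid a server timeout.
function shouldFormat(source: string): boolean {
  return new TextEncoder().encode(source).length <= MAX_FORMAT_BYTES;
}
```

A server-side caller would pass `PRETTIER_OPTIONS` to Prettier's `format` function only when `shouldFormat` returns true, and ship the unformatted source otherwise.
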

Ready to try it?

Drop a SingleFile capture and see the full output in under 3 seconds.
