Flutter
AWS
3D

DiffStudio

A cross-platform 3D product visualization platform for e-commerce, converting videos into interactive 3D Gaussian Splat models through an AWS-powered serverless neural rendering pipeline.

Overview

E-commerce product photography has remained largely unchanged for decades: static images shot from limited angles, leaving customers to imagine how products look from other perspectives. DiffStudio changes this by converting ordinary product videos into fully interactive 3D experiences that customers can explore from any angle.

The platform lets online retailers create photorealistic 3D models from simple smartphone videos, delivering an immersive shopping experience without specialized 3D scanning hardware or expertise. Upload a video, wait 20-30 minutes, and receive a web-ready interactive 3D viewer that works across all devices.

Built on a serverless, event-driven ML pipeline, DiffStudio orchestrates the entire 3D reconstruction process across AWS services, from video upload through neural training to global CDN delivery. This enables horizontal scaling from 0 to 100+ concurrent training jobs while maintaining cost efficiency.

How it works

From Capture to Commerce: The DiffStudio Pipeline

A serverless, event-driven pipeline that transforms smartphone videos into interactive 3D experiences. The Flutter app uploads to S3, triggering an AWS Step Functions workflow that orchestrates GPU-powered neural training on SageMaker, then delivers compressed 3D models via CloudFront CDN for real-time WebGL viewing.
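The orchestration step described above could be expressed as an Amazon States Language definition along the following lines. This is a minimal sketch, not the project's actual state machine: the state names, the input fields (`job_id`, `video_s3_uri`, `output_s3_uri`), the retry policy, and every placeholder ARN and image URI are hypothetical. Only the `sagemaker:createTrainingJob.sync` and `sns:publish` service-integration resources are standard Step Functions identifiers.

```python
import json

# Hypothetical placeholders; the source gives no real ARNs or image names.
TRAINING_IMAGE = "<account>.dkr.ecr.<region>.amazonaws.com/<training-image>:latest"
SAGEMAKER_ROLE_ARN = "arn:aws:iam::<account>:role/<sagemaker-role>"
STATUS_TOPIC_ARN = "arn:aws:sns:<region>:<account>:<job-status-topic>"


def notify_state(outcome: str) -> dict:
    """Build an SNS publish task that reports the job outcome."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {
            "TopicArn": STATUS_TOPIC_ARN,
            "Message.$": f"States.Format('job {{}} {outcome}', $.job_id)",
        },
        "End": True,
    }


def build_state_machine(instance_type: str = "ml.g4dn.xlarge") -> dict:
    """Return an Amazon States Language definition as a plain dict."""
    return {
        "Comment": "Sketch: video in S3 -> SageMaker training -> notification",
        "StartAt": "TrainSplat",
        "States": {
            "TrainSplat": {
                "Type": "Task",
                # The .sync suffix makes the workflow wait for the job to finish.
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {
                    "TrainingJobName.$": "$.job_id",
                    "RoleArn": SAGEMAKER_ROLE_ARN,
                    "AlgorithmSpecification": {
                        "TrainingImage": TRAINING_IMAGE,
                        "TrainingInputMode": "File",
                    },
                    "InputDataConfig": [{
                        "ChannelName": "video",
                        "DataSource": {"S3DataSource": {
                            "S3DataType": "S3Prefix",
                            "S3Uri.$": "$.video_s3_uri",
                            "S3DataDistributionType": "FullyReplicated",
                        }},
                    }],
                    "OutputDataConfig": {"S3OutputPath.$": "$.output_s3_uri"},
                    "ResourceConfig": {
                        "InstanceType": instance_type,
                        "InstanceCount": 1,
                        "VolumeSizeInGB": 50,
                    },
                    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
                },
                # Assumed retry policy, for illustration only.
                "Retry": [{
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 60,
                    "MaxAttempts": 2,
                    "BackoffRate": 2.0,
                }],
                "Catch": [{
                    "ErrorEquals": ["States.ALL"],
                    "ResultPath": "$.error",
                    "Next": "NotifyFailure",
                }],
                "ResultPath": "$.training",
                "Next": "NotifySuccess",
            },
            "NotifySuccess": notify_state("completed"),
            "NotifyFailure": notify_state("failed"),
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_state_machine(), indent=2))
```

Building the definition as a dict keeps it easy to validate in tests before it is handed to Terraform or the Step Functions API as JSON.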

Architecture & Features

Architecture & Technical Highlights

A serverless, event-driven ML pipeline that orchestrates 3D reconstruction across AWS services, enabling horizontal scaling from 0 to 100+ concurrent training jobs.

  • Serverless ML at Scale: On-demand SageMaker instances (ml.g4dn.xlarge, ml.g5.4xlarge) with Spot pricing reduce idle GPU costs from $378/month to ~$26/month for 100 training jobs
  • Event-Driven Orchestration: AWS Step Functions + SNS/Lambda decouple the API from long-running ML workloads, keeping the backend responsive while training runs asynchronously
  • Neural Rendering Pipeline: Containerized Python pipeline (COLMAP → NerfStudio splatfacto → PLY export) executes on GPU instances with automatic error recovery
  • Real-Time 3D Rendering: WebGL-based viewer achieves 60+ FPS using Gaussian Splatting (10x faster than NeRF) with compressed PLY files (50-200 MB, <2s load time)
  • Cross-Platform Single Codebase: Flutter achieves 80% code reuse across iOS/Android/Web with unified abstractions for platform-specific features (drag-drop vs camera)
  • Database-Driven Asset Management: PostgreSQL stores asset metadata with S3 keys (not filesystem paths), enabling transactional creation and queryable analytics
  • Infrastructure as Code: Complete AWS infrastructure (ECR, S3, CloudFront, Step Functions, IAM) defined in Terraform with version-controlled deployments
  • Key Metrics: Training time 15-30min (100 images), 90%+ GPU utilization, <100ms API response (p95), 70% cost reduction via Spot instances
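The cost comparison in the first bullet can be sanity-checked with simple arithmetic. The sketch below is an illustration under assumed inputs, not the author's own calculation: the hourly rate of $0.526 and the 30-minute job duration are assumptions chosen for the example (the source states neither), and real SageMaker and Spot prices vary by region and over time.

```python
HOURS_PER_MONTH = 24 * 30  # 720-hour month, a common billing approximation


def always_on_cost(hourly_rate: float) -> float:
    """Monthly cost of a GPU instance that runs around the clock."""
    return hourly_rate * HOURS_PER_MONTH


def per_job_cost(hourly_rate: float, jobs: int, minutes_per_job: float) -> float:
    """Monthly cost when the GPU is billed only while jobs are running."""
    return hourly_rate * jobs * (minutes_per_job / 60.0)


ASSUMED_RATE = 0.526  # USD per hour; assumption, not taken from the source

idle = always_on_cost(ASSUMED_RATE)
on_demand = per_job_cost(ASSUMED_RATE, jobs=100, minutes_per_job=30)

print(f"always-on: ${idle:.2f}/month, pay-per-job: ${on_demand:.2f}/month")
# prints: always-on: $378.72/month, pay-per-job: $26.30/month
```

Under these assumptions, the two results land close to the $378 and ~$26 figures quoted above, which suggests the saving comes mainly from not paying for idle GPU hours.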

Tech Stack

Flutter
Dart
Provider
go_router
Go 1.22
Gin
PostgreSQL
pgx v5
AWS SDK v2
Python 3.11+
NerfStudio
COLMAP
PyTorch
PlayCanvas
CUDA
AWS SageMaker
Step Functions
Lambda
S3
CloudFront
ECR
SNS
Terraform
Docker
GitHub Actions
Firebase Hosting

Event-Driven Serverless Architecture

AWS Step Functions orchestrates SageMaker training jobs with on-demand GPU instances, enabling horizontal scaling from 0 to 100+ concurrent jobs with pay-per-use pricing.
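One plausible way to wire the event-driven hand-off is a small Lambda function that receives a job message from SNS and starts a Step Functions execution. The sketch below assumes that wiring; the source does not describe the exact flow, and the message fields (`job_id`, `video_s3_uri`) and the state machine ARN are hypothetical. The Step Functions client is injectable so the handler can be exercised without AWS credentials.

```python
import json
from typing import Any, Optional

# Hypothetical placeholder; the source does not name the state machine.
STATE_MACHINE_ARN = "arn:aws:states:<region>:<account>:stateMachine:<name>"


def handler(event: dict, context: Any = None,
            sfn_client: Optional[Any] = None) -> list:
    """Start one Step Functions execution per SNS record in the event."""
    if sfn_client is None:
        import boto3  # imported lazily; only needed when running against AWS
        sfn_client = boto3.client("stepfunctions")

    started = []
    for record in event.get("Records", []):
        # SNS delivers the published message as a string under Sns.Message.
        job = json.loads(record["Sns"]["Message"])
        response = sfn_client.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            name=job["job_id"],  # execution names must be unique per state machine
            input=json.dumps(job),
        )
        started.append(response["executionArn"])
    return started
```

Because the API only publishes a message and returns, request latency stays independent of how long training takes.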

Cross-Platform Flutter Application

Single Dart codebase deployed to iOS, Android, and Web with 80% code reuse, featuring camera integration, drag-and-drop upload, and real-time job status polling.
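Job status polling for a task that runs for tens of minutes is usually done with a growing delay between requests. The sketch below shows that idea in Python (the app itself is written in Dart); the status values, the delays, and the `fetch_status` callback are all hypothetical, since the source only states that the client polls for job status.

```python
import time
from typing import Callable

TERMINAL_STATES = {"completed", "failed"}  # hypothetical status values


def poll_job(fetch_status: Callable[[], str],
             initial_delay: float = 2.0,
             max_delay: float = 30.0,
             timeout: float = 3600.0,
             sleep: Callable[[float], None] = time.sleep,
             clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the job reaches a terminal state, doubling the delay each time."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() + delay > deadline:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        sleep(delay)
        # Exponential backoff, capped so a long job is still checked regularly.
        delay = min(delay * 2, max_delay)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without waiting; a client would normally leave them at their defaults.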

High-Performance Go Backend

RESTful API built with the Gin framework, featuring JWT authentication via Supabase, multipart S3 streaming, PostgreSQL with connection pooling, and SNS event publishing.

Neural Rendering Pipeline

NerfStudio-powered Gaussian Splatting training with COLMAP camera pose estimation, running on SageMaker GPU instances with TensorBoard metrics and automatic checkpointing.
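The three pipeline stages map onto Nerfstudio's command-line tools roughly as sketched below. This is an assumption about how such a container might invoke them, not the project's actual entry point: the directory layout and the `config.yml` location are hypothetical, and `ns-process-data` relies on COLMAP and ffmpeg being installed in the image. The function only builds the commands; it does not run them.

```python
import shlex
from pathlib import Path


def build_pipeline_commands(video: Path, workdir: Path) -> list[list[str]]:
    """Return the three CLI invocations for one reconstruction job."""
    processed = workdir / "processed"
    outputs = workdir / "outputs"
    export = workdir / "export"
    # Hypothetical path: ns-train writes config.yml inside a timestamped run
    # directory, so real code would locate it after training finishes.
    config = outputs / "config.yml"
    return [
        # 1. Extract frames and estimate camera poses with COLMAP.
        ["ns-process-data", "video",
         "--data", str(video), "--output-dir", str(processed)],
        # 2. Train a Gaussian Splatting model, logging metrics to TensorBoard.
        ["ns-train", "splatfacto",
         "--data", str(processed), "--output-dir", str(outputs),
         "--vis", "tensorboard"],
        # 3. Export the trained splats as a PLY file.
        ["ns-export", "gaussian-splat",
         "--load-config", str(config), "--output-dir", str(export)],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline_commands(Path("input.mp4"), Path("work")):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` so that a failing stage raises and lets the orchestrator decide whether to retry.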

Real-Time 3D Rendering

WebGL-based viewer achieves 60+ FPS on modern devices using Gaussian Splatting, with PlayCanvas compression reducing model sizes by 70-80% for fast CDN delivery.

Infrastructure as Code

Complete Terraform-managed AWS infrastructure including ECR container registry, S3 buckets, CloudFront CDN, Step Functions state machines, and IAM roles with least-privilege access.

Future Vision

The next frontier is real-time streaming—imagine training 3D models while filming, with progressive refinement as you capture more angles. Mobile-first optimization, AR viewer integration (USDZ for iOS AR, glTF for WebXR), and multi-format export will make DiffStudio the standard for e-commerce 3D visualization.
