From Perception to Presence
Turning AI into a Living Being
The Problem with Today's Robots
Industry Status Quo
- Weak vision capabilities
- Passive responses (trigger-based only)
- "Chat" only, cannot drive behavior
- Pre-scripted mechanical motions
- Breaks down in multi-person scenarios
"Pophie is built as an AI Lifeform where perception, cognition, emotion, memory, and expression operate as one."
The World's First AI Lifeform Architecture
An AI Lifeform continuously perceives the real world, understands people and context, forms memory and emotion, and expresses itself through body, gaze, and voice.
On-Device Intelligence
Low-latency perception and real-time control. It drives instant reactions, smooth motion, and always-on attention—before the cloud responds.
- Face tracking and gaze locking
- Audio interrupt and turn-taking cues
- Touch, posture, and reflex loops
- Motion control and safety limits
- Local state: awake, sleepy, engaged
Cloud Lifeform Engine
Full-modal understanding, reasoning, memory, and personality. It builds context, learns preferences, and plans responses across voice, motion, and expression.
- Multimodal understanding and reasoning
- Identity, emotion, and intent modeling
- Long-term memory and personalization
- Story generation and dialogue planning
- Continuous behavior orchestration
Edge + Cloud, One Living Loop
Edge-first sensing.
Cloud-ready context.
Eyes, microphones, touch, and motion sense continuously on-device, capturing who is here and what's happening — in real time.
Cloud-level understanding.
Edge-level attention.
The cloud fuses vision with audio cues to build scene and social context, while the edge maintains attention — gaze tracking, speaker direction, and interruption cues.
Personality, memory,
and social rules.
The cloud chooses intent and behavior using character, memory, and social dynamics — while the edge enforces timing and safety constraints.
Expressed by the body.
Synchronized by the loop.
Eyes move first, body follows. Motion, voice, and belly light deliver responses with lifelike timing — no pre-scripted loops.
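The four stages above form one continuous loop. As a rough illustration, here is a minimal Python sketch of that cycle; every class, method, and field name is invented for this example and does not reflect Pophie's actual API.

```python
# A minimal, hypothetical sketch of the edge + cloud living loop
# described above. All names here are illustrative, not Pophie's API.

class Edge:
    """On-device: continuous sensing and instant attention."""
    def sense(self):
        # In reality: cameras, mics, and touch sampled every tick.
        return {"face_seen": True, "speaking": False}

    def hold_attention(self, percept):
        # Gaze tracking and interruption cues run locally,
        # so reactions never wait on the network.
        return "gaze_locked" if percept["face_seen"] else "scanning"

    def express(self, plan):
        # Eyes move first, body follows; timing enforced on-device.
        return ["eyes:" + plan["gaze"], "body:" + plan["motion"]]

class Cloud:
    """Cloud: scene understanding, personality, behavior planning."""
    def understand(self, percept):
        return {"scene": "greeting" if percept["face_seen"] else "idle"}

    def plan(self, context):
        # Intent chosen from character, memory, and social rules.
        if context["scene"] == "greeting":
            return {"gaze": "eye_contact", "motion": "lean_forward"}
        return {"gaze": "wander", "motion": "sway"}

def tick(edge, cloud):
    """One pass of the loop: sense -> attend -> understand -> plan -> express."""
    percept = edge.sense()
    edge.hold_attention(percept)                   # edge keeps attention now
    plan = cloud.plan(cloud.understand(percept))   # cloud adds context, intent
    return edge.express(plan)                      # the body delivers it

print(tick(Edge(), Cloud()))
```

The point of the split: the edge half of the loop must never block on the cloud half, so attention and expression stay responsive even while context is still being computed.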
Capabilities That Create Presence
A unified loop across vision, conversation, emotion, memory, self-model, and motion.
Real-world Understanding
Cloud vision reasoning that turns what Pophie sees into meaning—and action.
Visual Reasoning
She interprets scenes and context, not just objects or faces.
Intent & Emotion Cues
She infers attention, intent, and mood from gaze, posture, and situation.
Vision-to-Interaction
What she sees naturally shapes dialogue and behavior—so responses feel timely and lifelike.
Proactive Attention
Powered by continuous real-time visual reasoning, not trigger-based detection.
Attention
You look at her → she notices. She follows your gaze and keeps eye contact naturally.
Initiation
You keep looking → she reacts emotionally. You wave → she starts the conversation.
Context Awareness
Actively scans surroundings. Understands who is present and what's happening.
Natural Multi-person Conversation
A truly usable home conversation system.
No Wake Word
Look and speak naturally. Interrupt anytime.
Multi-Person Awareness
Knows who is speaking, who is being spoken to, and won't interrupt human-to-human conversations.
Memory-Driven Dialogue
Remembers each person separately. Asks better questions over time.
Self-awareness & Boundaries
She knows "who she is".
Identity & Boundaries
Knows what she can and cannot do. Has boundaries and emotions.
Emotional Autonomy
Can say "no", get annoyed, or feel proud based on interaction history.
Lifelike Expression
Not pre-set expression packs — a continuous emotion space where feelings blend and flow naturally.
Continuous Emotion, Not Preset Faces
Most robots switch between "Happy", "Sad", "Angry" like stickers. Pophie uses a 3D emotion model (Valence, Arousal, Dominance) where feelings transition smoothly — the way real emotions do.
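To make the contrast concrete, here is a toy sketch of a continuous VAD state that blends toward a target feeling instead of snapping to a preset face. The exponential-smoothing rule and the rate value are assumptions for illustration; the real model is certainly richer.

```python
# Hypothetical sketch: a 3D emotion state (Valence, Arousal, Dominance)
# that transitions smoothly. The smoothing rule is an assumption.

class EmotionState:
    def __init__(self, valence=0.0, arousal=0.0, dominance=0.0):
        self.v, self.a, self.d = valence, arousal, dominance

    def move_toward(self, target, rate=0.2):
        """Each control tick closes `rate` of the gap to the target
        feeling, so emotions flow rather than switch like stickers."""
        self.v += rate * (target.v - self.v)
        self.a += rate * (target.a - self.a)
        self.d += rate * (target.d - self.d)

mood = EmotionState()                                  # neutral start
joy = EmotionState(valence=0.8, arousal=0.6, dominance=0.3)
for _ in range(10):                                    # ten control ticks
    mood.move_toward(joy)
# mood is now most of the way to "joy", with no discrete jump
print(round(mood.v, 3), round(mood.a, 3))
```

Because the state is a point in a continuous space, in-between feelings (content-but-tired, curious-but-cautious) exist for free, which preset expression packs cannot represent.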
Hyper-Real Eyes
Six blendable eye primitives create infinite expressions. Independent gaze per eye, iris micro-motion, highlight shimmer, and eyelid-driven rotation — no "dead eyes" here.
Whole-Body Coherence
The same emotion drives eyes, voice (speed, pitch, pauses), and body (speed, amplitude, posture) in sync. You experience one unified emotional being, not separate systems.
Memory & Growth
She doesn't just remember conversations — she builds a real understanding of who you are.
Four-Layer Memory
From moment-to-moment awareness to permanent knowledge. She'll remember you mentioned hating broccoli three months ago, or that your daughter's birthday is coming up.
Natural Forgetting
Like human memory — vivid details of important moments, faded impressions of the routine. What matters stays sharp; what doesn't fades naturally.
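The two ideas above (layered retention plus natural forgetting) can be sketched as importance-weighted decay. The half-life formula and thresholds below are invented for illustration and are not Pophie's actual memory design.

```python
# Hypothetical sketch of importance-weighted forgetting: salient
# memories get a long half-life, routine ones fade fast. The decay
# rule and numbers are assumptions, not Pophie's design.

class Memory:
    def __init__(self, fact, importance):
        self.fact = fact                  # e.g. "hates broccoli"
        self.importance = importance      # 0.0 routine .. 1.0 milestone
        self.age_days = 0.0

    def strength(self):
        # Important moments keep a months-long half-life;
        # routine details are gone within days.
        half_life = 2.0 + 180.0 * self.importance     # in days
        return 0.5 ** (self.age_days / half_life)

def recall(memories, threshold=0.3):
    """What still comes to mind: memories above a strength threshold."""
    return [m.fact for m in memories if m.strength() >= threshold]

store = [Memory("hates broccoli", importance=0.9),
         Memory("wore a blue shirt", importance=0.05)]
for m in store:
    m.age_days = 90.0                     # three months later

print(recall(store))   # the broccoli aversion survives; the shirt fades
```

The same mechanism explains both halves of the card: what matters stays sharp because its decay is slow, and what doesn't fades without ever needing an explicit "delete".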
A Life of Her Own
When idle, she plays, hums, and explores. She has her own rhythms and curiosities — not just a blank screen waiting for a command.
An Ecosystem of Capabilities
Skill = Prompt + Code + Lifeform APIs. Developers define "what to do" — the OS handles making it feel alive.
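The equation above can be read as a developer pattern. Here is one hedged sketch of what it might look like; the decorator, the `LifeformAPI` methods, and all field names are assumptions for illustration, not Pophie's real SDK.

```python
# Hypothetical sketch of "Skill = Prompt + Code + Lifeform APIs".
# The decorator and API names are invented; the real SDK may differ.

SKILLS = {}

def skill(name, prompt):
    """Register a skill: the prompt tells the lifeform engine when to
    use it; the decorated function is the code that runs."""
    def register(fn):
        SKILLS[name] = {"prompt": prompt, "run": fn}
        return fn
    return register

class LifeformAPI:
    """Stand-in for the OS-provided expression APIs."""
    def say(self, text):    return f"say:{text}"
    def gaze(self, target): return f"gaze:{target}"
    def glow(self, color):  return f"glow:{color}"

@skill("greet_guest",
       prompt="When an unfamiliar person enters, greet them warmly.")
def greet_guest(api, person):
    # The developer defines *what* to do; the OS blends the motion,
    # voice, and light timing so the result feels alive.
    return [api.gaze(person), api.glow("warm_amber"),
            api.say(f"Hi {person}, I'm Pophie!")]

actions = SKILLS["greet_guest"]["run"](LifeformAPI(), "Alex")
print(actions)
```

The division of labor is the key idea: skill code stays declarative and short, while liveliness (timing, easing, emotional coloring) is supplied by the OS underneath.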
Embodied Reflexes
Instant on-device reactions to touch, motion, and presence. No cloud latency — pure instinct.
Interactive Skills
Dialogue-driven, single-step tasks combining conversation with real-world actions.
Agentic Skills
Multi-step autonomous tasks that plan, execute, and adapt — like a real assistant with follow-through.
Where Presence Becomes Physical
Embodied Expression.
Not Pre-Set Animations.
Every reaction is a coordinated full-body performance—driven by the life simulation system.
Whole-body coordination
Motion is never single-axis—eyes, head, body, and timing move as one.
Eyes lead, body follows
Gaze moves first, body follows, then gaze stabilizes—like a real being.
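That gaze-first sequencing can be pictured as a staged timeline. The offsets below are invented for illustration, not measured from Pophie.

```python
# Illustrative sketch of "eyes lead, body follows" as a staged
# timeline. The timing offsets are assumptions, not real values.

def orient_to(target, t0=0.0):
    """Schedule a lifelike re-orientation: gaze snaps first, head and
    body follow with increasing lag, then gaze re-stabilizes."""
    return [
        (t0 + 0.00, "eyes", f"saccade_to:{target}"),   # eyes move first
        (t0 + 0.12, "head", f"turn_to:{target}"),      # head follows
        (t0 + 0.30, "body", f"rotate_to:{target}"),    # body last
        (t0 + 0.45, "eyes", "stabilize_gaze"),         # gaze settles
    ]

timeline = orient_to("door")
for t, stage, action in timeline:
    print(f"{t:.2f}s  {stage:5s} {action}")
```

Reversing this order (body first, eyes last) is what makes many robots read as mechanical: real animals orient with the eyes before the mass follows.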
Expression without a screen
Eyes stay pure—no UI overlays, no icons, no "display face."
5-DOF Expressive Motion
Hands, ears, and full-body rotation enable rich emotional language.
A Warm Body, Not Cold Plastic
Constant warmth adds a subtle "living" comfort when you hold her.
Hyper-Real Eyes
Micro gaze dynamics, eyelid-follow, and subtle iris motion create true presence.
Belly Light as Expression
Speech-synced light replaces a mouth—and color becomes emotion.
No Buttons, No Mode Switching
Power on/off, volume, and settings happen through natural interaction.
Natural wake. Clear status.
Wake her by touch or by name. Ask for battery and connectivity anytime.
Pophie is not a robot that reacts.
She is a lifeform that perceives,
understands, and responds
with presence, emotion, and intention.