# Overview

An AI system that produces perfect responses is worthless if no one uses it. Level 2 Evaluation moves beyond technical accuracy to measure the "digital traces" users leave behind. By tracking how users move from their first interaction to long-term habit formation, we can ensure the product actually delivers value in the real world.

***

#### Key Motivation

Technical performance (Level 1) does not guarantee user adoption. Level 2 evaluation is critical because:

* Value Validation: If users stop interacting, they likely see no value, and an intervention no one uses cannot achieve its intended life outcomes.
* Continuous Improvement: It transforms product development from opinion-driven to data-driven through iterative cycles and A/B testing.
* Safety & Risk Management: Monitoring user signals allows for controlled rollouts of experimental features, preventing negative reactions from reaching your entire user base at once.

<a href="overview/why-is-this-level-of-evaluation-important" class="button primary">Read more -></a>

***

#### Core Concept: The User Funnel

To evaluate the product, we "instrument" the application to track users as they progress through four distinct stages. We prioritize "Time to Success" (solving the user's problem) over "Time on Device" to ensure we are optimizing for welfare rather than just addiction.

| Stage       | Goal                                | Key Metric Example                        |
| ----------- | ----------------------------------- | ----------------------------------------- |
| Acquisition | Bring users into the ecosystem.     | New User Count, Customer Acquisition Cost (CAC) |
| Activation  | Ensure users find "First Value."    | Activation Rate, Time to Activate         |
| Engagement  | Measure depth and frequency of use. | Active Users (DAU/WAU), Interaction Depth |
| Retention   | Build long-term habits/commitment.  | Stickiness (DAU/MAU), Retention Rate      |
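The stage metrics above can be computed directly from an instrumented event log. The sketch below is illustrative, not from the source: the event tuples, stage names, and user IDs are hypothetical stand-ins for whatever your analytics tool captures.

```python
from datetime import date

# Hypothetical event log: (user_id, stage, day) tuples captured by instrumentation.
events = [
    ("u1", "acquired", date(2024, 1, 1)),
    ("u1", "activated", date(2024, 1, 1)),
    ("u2", "acquired", date(2024, 1, 2)),
    ("u2", "activated", date(2024, 1, 3)),
    ("u3", "acquired", date(2024, 1, 3)),  # acquired but never reached "First Value"
]

def activation_rate(events):
    """Share of acquired users who reached the activation event."""
    acquired = {u for u, stage, _ in events if stage == "acquired"}
    activated = {u for u, stage, _ in events if stage == "activated"}
    return len(activated & acquired) / len(acquired)

def mean_time_to_activate(events):
    """Average days between acquisition and activation, per activated user."""
    acquired = {u: d for u, s, d in events if s == "acquired"}
    activated = {u: d for u, s, d in events if s == "activated"}
    gaps = [(activated[u] - acquired[u]).days for u in activated if u in acquired]
    return sum(gaps) / len(gaps)

def stickiness(daily_active, monthly_active):
    """Stickiness = DAU / MAU: the share of monthly users active on a given day."""
    return len(daily_active) / len(monthly_active)

print(activation_rate(events))        # 2 of 3 acquired users activated
print(mean_time_to_activate(events))  # u1 same day, u2 one day later
print(stickiness({"u1"}, {"u1", "u2", "u3"}))
```

In practice tools like Amplitude or Mixpanel compute these on dashboards; the point of the sketch is that each funnel metric reduces to simple set operations over events.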

<a href="overview/what-is-the-product-being-evaluated" class="button primary">Read more -></a>

***

#### How to Evaluate

Level 2 evaluation is performed by integrating third-party analytics tools (e.g., Amplitude, Mixpanel) to capture real-time usage data.

1. Define & Instrument: Map your user journey and identify specific "events" (e.g., "audio advice played") that signal progress.
2. Analyze Trends: Use dashboards to identify friction points where users consistently drop off.
3. Experiment: Run A/B Tests to compare different versions of a feature. By randomly assigning users to "Version A" or "Version B," you can statistically determine which design better supports user goals.
4. Diagnose: If metrics are low, conduct a Process Evaluation (interviews or surveys) to understand the "why" behind the data—such as connectivity constraints or literacy barriers.
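Steps 1 and 3 can be sketched in a few lines: deterministic hash-based bucketing assigns each user a stable variant, and a two-proportion z-test checks whether the versions' activation rates differ. The experiment name, counts, and significance threshold below are illustrative assumptions, not figures from the source.

```python
import hashlib
import math

def assign_variant(user_id, experiment="advice_format_v1"):
    """Deterministic bucketing: the same user always sees the same version."""
    h = int(hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest(), 16)
    return "A" if h % 2 == 0 else "B"

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test on whether Version B's conversion rate differs from A's."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical experiment: 120/400 users activated on Version A vs 156/400 on B.
z, p = two_proportion_z(120, 400, 156, 400)
print(f"z = {z:.2f}, p = {p:.4f}")  # p < 0.05 would favor shipping Version B
```

Hash-based assignment (rather than a random draw per session) matters here: it keeps the experience consistent for each user and makes the experiment reproducible without storing an assignment table.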

<a href="how-is-level-2-evaluation-performed" class="button primary">Read more -></a>

***

<details>

<summary>💬 Want to suggest edits or provide feedback?</summary>

{% embed url="https://tally.so/r/A788l0?originPage=level-2-product-evaluation%2Foverview" %}

</details>
