Convert User Needs to Data Needs

Bringing human-centered design and AI together means turning real user behaviors into meaningful data requirements.
Defining User Needs
First and foremost, design for the user by understanding their needs and pain points, not for the technology or its capabilities. Reviewing both qualitative and quantitative data, and observing emerging behaviours, can shift your thinking from technology-first to people-first.

Toolkit
Triangulated Research:
- Qualitative: User interviews, observational studies
- Quantitative: Market reports, surveys and questionnaires
- Emergent: Product reviews, social listening

Research Synthesis:
- Jobs To Be Done (JTBD) framework to frame needs as desired outcomes.
- Empathy Map to visualise user emotions and perspectives.
- Value Proposition Canvas to align user gains and pains with features.
For example:
| User Interview | User Need | JTBD |
| --- | --- | --- |
| "I find it hard to find the right apartment. I want to see potential properties on sale, filtered by my budget, location, size, and other factors, fast." | Avoid wasting time visiting unsuitable homes based on price and location | When searching for a 3BHK in London, I want accurate budget-filtered matches so I can prioritize viable options |
Defining Data Needs
Once user needs are clearly defined, systematically map them into model-ready data requirements.
| User Need | ML Prediction | Required Data Type |
| --- | --- | --- |
| Budget constraint | Price estimation | Historical sale prices and inflation trends |
| Location type | Geo-match scoring | Polygon maps of neighbourhoods + transport hubs |
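A mapping like the table above can double as a lightweight checklist during data sourcing. The sketch below (keys and dataset names are hypothetical, taken from the example) flags which required datasets are still missing:

```python
# Illustrative mapping of user needs to ML predictions and required data.
data_needs = {
    "budget_constraint": {
        "prediction": "price_estimation",
        "required_data": ["historical_sale_prices", "inflation_trends"],
    },
    "location_type": {
        "prediction": "geo_match_scoring",
        "required_data": ["neighbourhood_polygons", "transport_hubs"],
    },
}

def missing_data(need, available):
    """Return the required datasets we do not yet have for a given need."""
    required = set(data_needs[need]["required_data"])
    return sorted(required - set(available))

print(missing_data("budget_constraint", ["historical_sale_prices"]))
# prints ['inflation_trends']
```

Running this per user need gives the cross-functional team a concrete backlog of data to source before model training starts.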
How to use this pattern: collaborate as a cross-functional team to identify the features, labels, and examples needed to train an effective AI model.

Important Notes on Data Sourcing
As you gather or plan to collect data, carefully inspect it for quality and potential biases, and ensure robust data collection methods. Popular tools include:
- Pandas (Python) – for checking missing values, duplicates, and inconsistencies.
- IBM AI Fairness 360 (AIF360) – for detecting and mitigating bias in datasets.
- Apache Airflow – for orchestrating and monitoring data pipelines.
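The kinds of checks Pandas automates (e.g. `isnull()`, `duplicated()` on a DataFrame) can be illustrated in plain Python. The records below are invented for the example:

```python
# Minimal data-quality checks: find rows with missing values and duplicate
# rows. Pandas' isnull()/duplicated() do the same on DataFrames at scale.
records = [
    {"id": 1, "price": 450000, "location": "Camden"},
    {"id": 2, "price": None,   "location": "Hackney"},  # missing value
    {"id": 1, "price": 450000, "location": "Camden"},   # duplicate row
]

missing = [r["id"] for r in records if any(v is None for v in r.values())]

seen, duplicates = set(), []
for rec in records:
    sig = tuple(sorted(rec.items()))  # hashable row signature
    if sig in seen:
        duplicates.append(rec["id"])
    seen.add(sig)

print("rows with missing values:", missing)  # [2]
print("duplicate row ids:", duplicates)      # [1]
```

Checks like these belong early in the pipeline, before any labeling or training work depends on the data.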
Follow best practices in data collection, data documentation, and data labelling as defined in your DataCard policy.
Design for labeling: correctly labeled data is a crucial ingredient of an effective supervised ML system.
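Designing for labeling includes auditing the labels themselves. A hedged sketch, using a hypothetical label vocabulary for the apartment-search example: flag labels outside the agreed schema and listings where annotators disagree, so both go to review rather than into training.

```python
# Hypothetical label audit for a supervised listing classifier.
ALLOWED_LABELS = {"suitable", "unsuitable", "needs_review"}

annotations = {
    "listing_001": ["suitable", "suitable"],
    "listing_002": ["suitable", "unsuitable"],  # annotators disagree
    "listing_003": ["unsuitble"],               # typo: invalid label
}

def audit_labels(annotations):
    """Split listings into schema violations and annotator disagreements."""
    invalid, disagreements = [], []
    for listing, labels in annotations.items():
        if any(label not in ALLOWED_LABELS for label in labels):
            invalid.append(listing)
        elif len(set(labels)) > 1:
            disagreements.append(listing)
    return invalid, disagreements

invalid, disagreements = audit_labels(annotations)
print(invalid)        # ['listing_003']
print(disagreements)  # ['listing_002']
```

Surfacing disagreements instead of silently keeping one label is what makes label quality measurable over time.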