ORB International has been the sole UK representative of WIN (Worldwide Independent Network) since 2015. What does it offer us as a business?

Each member from around the world is independent. In practice, this means they are all decision makers: on behalf of their organisations (which across the network range from those turning over a couple of million to those turning over £100m+), they decide which AI software, panel providers, translation software and so on they use. ORB’s current personas solution is an example of the benefits of membership: the original solution comes from our Swedish member, who shares it with other members (for a fee). We have also presented our wAIve capability to the network.

But the network also offers thought leadership. The challenges we face in the UK are very often exactly the same as those facing members around the world. At our last conference in Helsinki, many members had been questioned by their clients about “synthetic data”. So the network ran an exercise using our End of Year survey, and the results are below.

 

Please read this short article to familiarise yourself with synthetic data. The full report is in the appendix.

 

The Synthetic Shortcut? Why Augmented Data Still Has a Long Road Ahead

By Richard Colwell, Laura Ruvalcaba & Edouard Lecerf | WIN Association

As the market research industry races toward faster, cheaper, and more scalable data solutions, synthetic and augmented data have emerged as promising contenders. But a recent study by the WIN Association—spanning five countries and three AI-driven models—reveals that while synthetic data offers efficiency, it also carries significant limitations that researchers must not overlook.

The Promise—and the Problem

Synthetic data, particularly in its augmented form, aims to enrich real survey datasets by algorithmically generating responses. This approach can reduce costs, accelerate timelines, and protect privacy. However, the WIN study highlights several critical challenges:

  • Loss of Authenticity: Synthetic records often fail to capture the nuanced behaviours and cultural subtleties present in real-world data. This can lead to misleading conclusions, especially in attitudinal and behavioural research.
  • Subgroup Sensitivity: Models struggle with accuracy in underrepresented segments. For example, in Mexico, synthetic data showed notable degradation when applied to older male respondents.
  • Validation Complexity: Ensuring that synthetic data truly mirrors real population behaviour is difficult. High-dimensional datasets make it hard to detect subtle but impactful deviations.
  • Overfitting Risks: AI models may latch onto patterns that don’t generalise well, especially when trained on limited or unbalanced real data.

Model Performance: A Mixed Picture

The study compared three augmentation methodologies—Lite, Express, and Full (N-Infinite)—under two mixing scenarios (50/50 and 70/30 real-to-synthetic ratios). While the Full 70% model consistently delivered the highest fidelity, even it showed limitations in replicating question-level outcomes across all demographic groups.

Interestingly, the Lite model was dropped entirely due to its inability to scale or produce high-quality synthetic cases. Express 70% performed well in some contexts but still fell short in others, underscoring the variability of model performance depending on geography and sample composition.

A Cautious Path Forward

The WIN team concludes that while synthetic data can complement traditional methods, it is not yet a reliable substitute. Hybrid datasets—those combining real and synthetic responses—offer a middle ground, but even these require careful planning and robust validation.

For market researchers, the message is clear: synthetic data is a powerful tool, but not a panacea. Its use demands rigour, transparency, and a deep understanding of its limitations. As the industry embraces AI-driven methodologies, it must do so with eyes wide open—and a commitment to preserving the integrity of insights.

 

Appendix:
See full report here.
