Global AI Firms Form Strategic Data Partnerships Amid Privacy Concerns

Leading companies in the field of artificial intelligence are forming regional and sector-specific strategic partnerships to access user data that is not readily available on the open internet.

Editor

August 21, 2025

Global artificial intelligence firms are increasingly forging strategic partnerships with regional companies to gain access to user data that is not available through public web sources. These partnerships aim to enhance the performance of AI models in specific sectors such as healthcare, finance, logistics, and e-commerce, fields where structured, real-time data is more valuable than what the open internet provides.

One notable example is OpenAI, which has entered into partnerships with Shopee, a leading e-commerce platform in Southeast Asia operated by Sea Ltd., and Shopify, one of the most widely used platforms for online stores globally. Through these agreements, OpenAI gains access to valuable behavioral data, customer queries, and transaction logs—elements that are rarely available through traditional web scraping. This type of data allows OpenAI to fine-tune models like ChatGPT for e-commerce use cases such as smart customer support, dynamic product recommendations, and personalized marketing (https://www.theverge.com/2025/07/10/openai-shopify-partnership-ai-data).

Other AI companies are taking alternative approaches. Perplexity AI, a rapidly growing AI search startup, has partnered with telecom operators to distribute its services at scale and simultaneously gain data insights. In India, it partnered with Bharti Airtel to offer all 360 million Airtel subscribers free access to Perplexity Pro—its premium AI product. Similar agreements have also been made in Japan with SoftBank and in South Korea with SK Telecom. According to mobile analytics firm Sensor Tower, Perplexity’s app downloads in India increased by over 600% year-on-year, outpacing ChatGPT, which recorded a 587% increase over the same period (https://www.business-standard.com/companies/news/bharti-airtel-perplexity-artificial-intelligence-partnership-free-subscription-125071700521_1.html; https://www.theoutpost.ai/news-story/perplexity-ai-surges-in-india-challenging-chat-gpt-s-dominance-17934).

Google is also experimenting with a similar approach by offering free access to AI tools in university networks and select markets like India and Brazil. These deployments allow the company to gather insights into how people in different cultural and linguistic environments interact with generative AI, something that generalized datasets often fail to capture.

Industry analysts say these partnerships are redefining how AI companies train their models. Sameer Patil from the Observer Research Foundation in India noted, “These data-sharing arrangements provide not just volume but context—allowing models to learn more effectively in domain-specific environments.” He emphasized that the personalization and local relevance offered by such models are largely dependent on high-quality proprietary datasets.

However, these practices are not without controversy. As AI companies gain deeper access to personal, transactional, and behavioral data, concerns around data privacy, user consent, and sovereignty are growing—especially in emerging markets. Governments in countries such as India, Turkey, Nigeria, and Vietnam are increasingly pushing for data localization laws that require companies to store and process user data within national borders. This trend aims to reduce the influence of foreign entities over local digital infrastructure while enhancing control over how data is collected and used (https://www.thehindu.com/sci-tech/technology/data-localisation-in-india/article67719286.ece).

India’s recent Digital Personal Data Protection Act (DPDP), although not fully implemented yet, reflects the government’s intention to tighten oversight over foreign tech firms. Legal experts argue that partnerships like the Airtel–Perplexity deal exist in a regulatory gray zone, where enforcement and transparency are still developing. Privacy advocates have also raised questions about whether users truly understand the scope of data sharing when they use “free” AI services bundled into telecom or platform offerings (https://www.the-secretariat.in/article/airtel-is-giving-free-perplexity-ai-can-india-handle-the-pandora-s-box-it-ll-open).

Additionally, ethical considerations about who benefits from these partnerships are now entering policy discussions. While AI companies receive a steady flow of high-quality data to improve their products, local entities often receive limited returns or influence over how the models are deployed or monetized. Some experts have called for global frameworks to ensure more equitable benefit-sharing in AI data partnerships.

The shift from scraping publicly available online data to acquiring exclusive offline datasets via commercial partnerships represents a new phase in AI development. These deals give companies access to fresher, more structured, and domain-specific information—but they also raise concerns about user rights, consent, and international data flows.

In summary, global AI firms are actively forming strategic alliances with e-commerce platforms, telecom companies, and service providers to access offline user data. This new model of AI data acquisition provides a significant advantage in developing localized and accurate models. However, without updated legal frameworks and clear safeguards, these strategies could outpace existing privacy protections and challenge national data governance structures. The evolving landscape calls for stronger regulations, greater transparency, and more inclusive decision-making in shaping the future of global AI development.