# Why Data Ownership Matters More and More to Internet Users

The digital landscape has transformed dramatically over the past two decades, evolving from simple information repositories into complex ecosystems where personal data has become the currency of the internet economy. Every click, search, purchase, and interaction generates valuable data points that companies harvest, analyse, and monetise. In 2023 alone, cybersecurity incidents affected over 343 million individuals worldwide—a figure that rivals the entire population of the United States. This staggering statistic reveals an uncomfortable truth: our digital identities are under constant siege, and the traditional models of data stewardship are failing to protect us. As internet users become increasingly aware of how their information is collected, stored, and exploited, the question of data ownership has shifted from a niche privacy concern to a fundamental digital right that affects everyone who ventures online.

## The Cambridge Analytica Scandal and the Awakening of User Privacy Consciousness

The Cambridge Analytica scandal of 2018 marked a watershed moment in public awareness of data exploitation. When journalists revealed that the political consulting firm had harvested personal information from over 87 million Facebook users without explicit consent, it sent shockwaves through the technology industry and beyond. The data, obtained through a seemingly innocuous personality quiz app, was subsequently used to build psychological profiles for targeted political advertising during the 2016 US presidential election and the Brexit referendum.

What made this incident particularly alarming was the revelation of how easily third-party developers could access vast troves of user data through Facebook’s permissive API policies. The scandal exposed a fundamental asymmetry in the digital ecosystem: whilst users believed they were simply sharing information with friends and family, platform operators were quietly constructing detailed dossiers of their behaviours, preferences, and social connections. Following the exposé, Facebook experienced a significant decline in user engagement, with some analysts estimating usage dropped to approximately 80% of pre-scandal levels—a commercial impact that demonstrated users’ willingness to vote with their feet when trust evaporates.

The Cambridge Analytica affair catalysed what technology commentators now call the “Privacy Great Awakening.” It forced millions of internet users to confront an uncomfortable reality: the free services they enjoyed daily came at a substantial cost—their personal autonomy and privacy. This awakening wasn’t confined to individual users; it galvanised lawmakers, privacy advocates, and even some technology leaders to reconsider the fundamental architecture of the internet and the power dynamics inherent in contemporary data practices. The scandal proved that data ownership isn’t merely a technical concern—it’s a question of democratic participation, personal autonomy, and the balance of power in the digital age.

## GDPR, CCPA, and the Global Legislative Framework for Personal Data Protection

The regulatory response to growing privacy concerns has been swift and increasingly comprehensive. Governments worldwide have recognised that self-regulation by technology companies has proven inadequate, necessitating legislative intervention to protect citizens’ digital rights. This regulatory evolution represents a fundamental shift in how societies conceptualise personal data—from a commodity that companies can freely exploit to a protected asset that individuals control.

### General Data Protection Regulation: Territorial Scope and Extraterritorial Enforcement Mechanisms

The European Union’s General Data Protection Regulation (GDPR), which came into effect in May 2018, established the most comprehensive data protection framework to date. Unlike previous privacy legislation with limited geographical reach, GDPR employs an extraterritorial enforcement mechanism that applies to any organisation processing the personal data of EU residents, regardless of where that organisation is based. This expansive territorial scope has effectively made GDPR a de facto global standard, as multinational corporations found it more practical to implement GDPR-compliant practices universally rather than maintain separate systems for different jurisdictions.

GDPR introduced several groundbreaking principles that fundamentally altered the data landscape. The regulation enshrines the right to access, requiring organisations to provide individuals with clear information about what data is collected and how it’s used. The right to erasure—commonly known as the “right to be forgotten”—empowers individuals to request deletion of their personal data under certain circumstances. Perhaps most significantly, GDPR established the principle of data portability, enabling users to transfer their information between service providers, thereby reducing vendor lock-in and promoting competition.

### California Consumer Privacy Act: Right to Delete and Opt-Out Provisions

Across the Atlantic, California took a leading role in shaping the U.S. response to growing privacy concerns with the California Consumer Privacy Act (CCPA), which came into force in 2020. Often described as “GDPR-lite,” CCPA grants California residents the right to know what personal information businesses collect about them, the right to request deletion of that information, and the right to opt out of the sale of their data. These provisions strike at the heart of data brokerage and ad-tech business models that rely on trading user profiles at scale.

From a data ownership perspective, CCPA shifts the default from silent exploitation to informed choice. Businesses subject to the law must provide clear “Do Not Sell My Personal Information” links and respond to consumer requests within defined timelines, or face enforcement actions and statutory damages in the event of certain data breaches. While CCPA does not go as far as GDPR in some areas—for example, it lacks a full-fledged data portability right in practice—it nonetheless signals that U.S. states are prepared to assert robust rights over personal data where federal legislation remains absent or fragmented.

### Brazil’s LGPD and India’s Digital Personal Data Protection Act: Emerging Market Approaches

Beyond Europe and the United States, emerging economies are also redefining what data ownership means for their citizens. Brazil’s Lei Geral de Proteção de Dados (LGPD), effective since 2020, closely mirrors GDPR in many respects, introducing principles of purpose limitation, data minimisation, and user consent as the default basis for processing. The law applies both to online and offline processing activities and, like GDPR, contains extraterritorial provisions that extend its reach to organisations outside Brazil that handle Brazilian residents’ data.

India’s Digital Personal Data Protection Act (DPDPA), passed in 2023, reflects a different political and economic context, balancing user privacy with ambitions for digital innovation and state capacity. The Act grants individuals rights to access, correct, and delete their personal data, while also permitting certain broad exemptions for state functions such as national security and public order. For internet users, these laws are significant because they embed the idea that personal data is not an unregulated resource for global platforms; it is subject to local democratic oversight, cultural expectations, and sovereignty concerns in major emerging markets.

### Divergence Between EU and US Privacy Models: Adequacy Decisions and Cross-Border Transfers

Despite converging rhetoric around data protection, the EU and US continue to embody fundamentally different privacy philosophies. The EU treats privacy as a fundamental right anchored in the Charter of Fundamental Rights, while the US approach remains more sectoral and market-driven, with a patchwork of laws for specific domains such as healthcare (HIPAA) and finance (GLBA). This divergence has concrete implications for cross-border data flows and for everyday data ownership in a globalised internet economy.

The Court of Justice of the European Union has twice invalidated EU–US data transfer frameworks (Safe Harbor in 2015 and Privacy Shield in 2020), arguing that US surveillance laws undermine the “essentially equivalent” protection required for adequacy. Adequacy decisions—formal recognitions that a third country provides sufficient protection—have become a powerful geopolitical tool, rewarding jurisdictions that align with EU-style privacy standards. For users, this legal tug-of-war is more than a diplomatic detail: it determines whether their personal data can be routinely transferred to foreign servers, which oversight regime applies, and ultimately, who is accountable when things go wrong.

## Data Monetisation by Platform Monopolies: Facebook, Google, and the Attention Economy

While lawmakers race to catch up, platform monopolies have spent years perfecting the art of turning user attention and personal data into revenue. Facebook, Google, TikTok, and other giants of the attention economy design their services to maximise engagement, because more engagement produces more behavioural data, and more data refines the advertising models that pay their bills. This circular dynamic illustrates why data ownership matters so much: when your time online is engineered to be addictive, your data becomes the raw material in a vast industrial system you barely see.

In 2023, digital advertising spending exceeded US$600 billion globally, with Google and Meta alone capturing a dominant share of that market. Their power rests not only on scale, but on their ability to build detailed user profiles that predict what content and ads you are most likely to click. The less control you have over how that profile is built and traded, the more the platform, rather than you, “owns” your digital identity in practical terms.

### Surveillance Capitalism and Behavioural Surplus Extraction Models

Harvard scholar Shoshana Zuboff popularised the term surveillance capitalism to describe business models that rely on the capture and monetisation of “behavioural surplus”—data about your actions that goes far beyond what is strictly necessary to provide a service. Think of a navigation app that not only helps you find your way, but also logs your commuting patterns, favourite restaurants, and the times you usually travel, all to refine advertising and sell insights.

In this model, platforms claim ownership over the predictive value of your data, treating it as a proprietary asset rather than a shared resource or a user-controlled property. The more they know about you, the more accurately they can forecast your behaviour, and the more valuable your profile becomes to advertisers, political campaigns, and data brokers. If data is the new oil, then behavioural surplus is the refined fuel—and without transparent controls, users are relegated to being unknowing oil wells rather than empowered owners.

### Third-Party Cookie Deprecation and the Future of Programmatic Advertising

For years, third-party cookies acted as the backbone of programmatic advertising, allowing advertisers to track users across websites and build cross-site profiles. However, growing privacy concerns and browser-level interventions have triggered a major shift. Apple’s Safari and Mozilla’s Firefox now block most third-party cookies by default, and Google Chrome—still the dominant browser—has spent years working towards deprecating them in favour of alternatives such as the Privacy Sandbox, although its timeline has repeatedly slipped.

This transition raises an important question: will it truly strengthen data ownership for users, or merely consolidate power in the hands of a few large platforms with extensive first-party data? On one hand, blocking cross-site tracking reduces the spread of your data across countless intermediaries. On the other, platforms like Google and Meta, which already control massive logged-in ecosystems, may become even more central because advertisers will rely more heavily on their walled gardens. For users seeking meaningful control over how their personal data is monetised, this shift underscores the need to look beyond technical tweaks and towards structural changes in how the attention economy operates.

### Shadow Profiles and Inference-Based Data Collection Techniques

Even if you lock down your privacy settings, large platforms can still assemble so-called shadow profiles—collections of inferred data about people who have not explicitly provided information. This can include non-users whose contact details have been uploaded by friends, or behavioural traits inferred from minimal activity, such as a single “like” or search query. Machine learning models excel at filling in the blanks, predicting your political leaning, income level, or health risks based on patterns observed in similar users.

From a data ownership standpoint, inference-based profiling is particularly problematic because it operates in the grey area between what you explicitly provide and what algorithms deduce. How do you exercise your right to access or erase information that has been inferred rather than collected? And if your digital twin—a probabilistic model of you—can be traded and targeted without your knowledge, to what extent do you truly own your online self? These questions reveal why regulation alone cannot resolve every tension; technical design and ethical choices by platforms also matter.

### AWS, Azure, and Cloud Service Provider Access to Customer Infrastructure Data

Data monetisation is not limited to consumer-facing platforms. Major cloud service providers such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud sit at a critical layer of the internet stack, hosting everything from small blogs to global enterprises. While their business models revolve around infrastructure rather than advertising, they still have potential visibility into customer metadata, usage patterns, and sometimes even content, depending on how services are configured.

Most hyperscalers assert that customer data remains the property of the customer, and they invest heavily in security and encryption. Yet fine print in service agreements often permits some level of analysis of aggregated or anonymised data for service improvement, fraud detection, or new feature development. For organisations that care deeply about data ownership—such as healthcare providers or financial institutions—this raises subtle but important questions: who controls the logs, telemetry, and derived analytics produced by their use of the cloud, and how portable is that data if they decide to migrate to another provider?

## Decentralised Web Technologies: IPFS, Blockchain, and Self-Sovereign Identity Solutions

In response to the centralisation of data in a handful of corporate silos, a new wave of decentralised web technologies has emerged with an ambitious promise: to give users cryptographic control over their data and digital identities. Rather than trusting a single company to store and mediate access to your information, you can rely on distributed networks, smart contracts, and self-sovereign identity frameworks. These tools do not make the challenges of data ownership disappear, but they shift the balance of power in interesting ways.

At their core, decentralised systems replace institutional guarantees with mathematical ones. If data is encrypted and distributed across a peer-to-peer network, no single party can unilaterally alter or confiscate it. If access permissions are encoded in smart contracts, they execute exactly as written, much like the “digital laws” of gravity. For internet users wary of platform overreach, this vision of user-centric control is compelling—though not without trade-offs in usability, governance, and regulation.

### InterPlanetary File System Architecture for Distributed Content Storage

The InterPlanetary File System (IPFS) is a distributed file protocol designed to make the web more resilient, censorship-resistant, and content-addressable. Instead of locating content by its server address (a URL tied to a specific domain), IPFS locates it by its cryptographic hash—a unique fingerprint of the file. When you request a piece of content, the network fetches it from any node that has a copy, not from a central host.

This architecture has profound implications for data ownership and content permanence. Once you add a file to IPFS, you or anyone else can “pin” it, ensuring it remains available regardless of what happens to any single server. You’re not dependent on a platform deciding whether your content violates terms of service or whether your account should be suspended. However, this also means you must take more responsibility for availability and access control, since a purely open, replicated network does not inherently distinguish between authorised and unauthorised viewers without additional encryption or access layers.
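The core idea of content addressing can be illustrated with a short Python sketch. This is a deliberate simplification: real IPFS uses multihash-based CIDs, chunking, and a distributed hash table for peer discovery, whereas here a plain SHA-256 hex digest stands in for the content address and a dictionary stands in for the network.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Return a content address: the SHA-256 fingerprint of the bytes themselves."""
    return hashlib.sha256(data).hexdigest()

# A toy "network": nodes keyed by name, each holding {address: content}.
network = {"node_a": {}, "node_b": {}}

def pin(node: str, data: bytes) -> str:
    """Store data on a node under its content address and return that address."""
    addr = content_address(data)
    network[node][addr] = data
    return addr

def fetch(addr: str):
    """Fetch content by address from any node that holds a copy."""
    for store in network.values():
        if addr in store:
            data = store[addr]
            # Integrity is built in: the address *is* the fingerprint of the data.
            assert content_address(data) == addr
            return data
    return None

addr = pin("node_a", b"hello, distributed web")
pin("node_b", b"hello, distributed web")  # replication: same address on every node
print(fetch(addr))
```

Because the address is derived from the content, any node can serve the file and any tampering is immediately detectable, which is exactly what makes pinning on multiple nodes meaningful.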

### Ethereum-Based dApps and Smart Contract Data Ownership Models

Public blockchains like Ethereum enable decentralised applications (dApps) whose logic and state are managed by smart contracts rather than by centralised servers. In many Web3 designs, your data is represented by tokens or on-chain records controlled by your private keys. You do not log into a platform so much as connect a wallet, and the dApp reads or updates data that you, cryptographically, own.

This model changes the traditional platform-user relationship. Instead of uploading your photos or credentials to a company database, you might store them in decentralised storage and reference them from a smart contract that enforces who can access them and under what conditions. You can grant a service temporary rights to use a piece of data—say, your age verification credential—without revealing your full identity. The trade-off is that key management becomes critical: lose your private keys, and you lose access, much like misplacing the only keys to a safe that holds all your documents.
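To make the wallet-controlled access model concrete, here is a minimal Python sketch (not Solidity, and not a real Ethereum contract) of the kind of logic such a contract might enforce: an owner address, a pointer to off-chain data, and temporary, revocable grants. All addresses and the data pointer are hypothetical.

```python
import time

class AccessRegistry:
    """Toy model of a smart contract gating access to an off-chain record.

    The owner address controls the record; grants are time-limited and
    revocable, mirroring how an on-chain contract might enforce access.
    """

    def __init__(self, owner: str, data_pointer: str):
        self.owner = owner
        self.data_pointer = data_pointer  # e.g. a content address in decentralised storage
        self.grants = {}  # grantee address -> expiry timestamp

    def grant_access(self, caller: str, grantee: str, ttl_seconds: float) -> None:
        if caller != self.owner:
            raise PermissionError("only the owner can grant access")
        self.grants[grantee] = time.time() + ttl_seconds

    def revoke_access(self, caller: str, grantee: str) -> None:
        if caller != self.owner:
            raise PermissionError("only the owner can revoke access")
        self.grants.pop(grantee, None)

    def read(self, caller: str) -> str:
        if caller == self.owner or self.grants.get(caller, 0) > time.time():
            return self.data_pointer
        raise PermissionError("no valid grant")

registry = AccessRegistry(owner="0xAlice", data_pointer="cid-of-encrypted-credential")
registry.grant_access("0xAlice", "0xVerifier", ttl_seconds=3600)
print(registry.read("0xVerifier"))
```

The essential shift is visible in the API: the service (here, the verifier) asks the owner-controlled registry for permission, rather than the owner asking the service.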

### Decentralised Identifiers (DIDs) and Verifiable Credentials Standards

Decentralised Identifiers (DIDs) and Verifiable Credentials (VCs) are emerging standards from the W3C that aim to reinvent digital identity with user sovereignty at the centre. A DID is a globally unique identifier that you control, independent of any particular service provider. Associated with it is a DID document, which specifies public keys and endpoints. Verifiable Credentials are digitally signed attestations—such as “over 18,” “employee of Company X,” or “holder of degree Y”—that you can store in your own wallet and selectively disclose.

Instead of filling out the same forms on every website and scattering your personal data across databases, you could present cryptographic proofs derived from your credentials. For example, you might prove you are an adult without revealing your exact date of birth, using techniques like zero-knowledge proofs. This shifts data ownership in a tangible way: you no longer rely on each platform to safeguard your identity attributes, because you retain them and grant limited, revocable access as needed. It is akin to holding a folder of official documents at home rather than leaving the originals with every institution you deal with.
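A stripped-down sketch of the issue-and-verify flow follows. Real Verifiable Credentials use asymmetric signatures (for example Ed25519) and standardised JSON-LD or JWT envelopes, and selective disclosure in production relies on far more sophisticated cryptography; this toy uses a symmetric HMAC and a hypothetical issuer key purely to show the shape of the protocol, including issuing a minimal derived claim ("over_18") instead of the raw date of birth.

```python
import hashlib
import hmac
import json

ISSUER_KEY = b"issuer-secret"  # hypothetical key; real issuers sign with a private key

def issue_credential(claims: dict) -> dict:
    """Issuer signs a set of claims the holder can later present."""
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "signature": signature}

def verify_credential(credential: dict) -> bool:
    """Verifier checks the claims were signed by the issuer and not altered."""
    payload = json.dumps(credential["claims"], sort_keys=True).encode()
    expected = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["signature"])

# The issuer embeds a minimal derived claim rather than the birth date itself,
# so the holder can prove adulthood without disclosing anything more.
vc = issue_credential({"subject": "did:example:alice", "over_18": True})
print(verify_credential(vc))
```

The data-ownership point is structural: the credential lives in the holder's wallet, and each verifier checks a signature rather than querying (and copying) a central identity database.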

### The Solid Protocol by Tim Berners-Lee: Personal Online Data Stores (Pods)

Solid, a project spearheaded by web inventor Tim Berners-Lee, takes a slightly different but complementary approach. Instead of anchoring everything on a blockchain, Solid proposes Personal Online Data Stores (Pods) where you keep your data—on your own server, with a trusted provider, or even in a corporate environment. Applications no longer hoard your personal information; they request permission to read or write to your Pod, and you can revoke that access at any time.

In the Solid model, data ownership becomes explicit and practical. You might use one Pod for health data, another for social interactions, and a third for financial records, all under your control. If you switch services—for instance, from one social app to another—you can take your data with you because it resides in your Pod, not in the app’s proprietary database. For internet users frustrated by platform lock-in, this vision of “bring your own data” offers a path towards real portability and autonomy.
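The permission model can be sketched in a few lines of Python. This is an illustration of the idea only, not the Solid protocol itself, which uses WebID authentication and Web Access Control documents over HTTP; here a dictionary of per-app permissions stands in for those mechanisms.

```python
class Pod:
    """Toy Personal Online Data Store: resources plus a per-app permission list."""

    def __init__(self):
        self.resources = {}            # path -> content
        self.permissions = {}          # app -> set of modes, e.g. {"read", "write"}

    def set_permissions(self, app: str, modes: set) -> None:
        self.permissions[app] = set(modes)

    def revoke(self, app: str) -> None:
        self.permissions.pop(app, None)

    def write(self, app: str, path: str, content: str) -> None:
        if "write" not in self.permissions.get(app, set()):
            raise PermissionError(f"{app} has no write access")
        self.resources[path] = content

    def read(self, app: str, path: str) -> str:
        if "read" not in self.permissions.get(app, set()):
            raise PermissionError(f"{app} has no read access")
        return self.resources[path]

pod = Pod()
pod.set_permissions("social-app", {"read", "write"})
pod.write("social-app", "/social/posts", "hello from my pod")

# Switching services: the data stays in the Pod; only the permissions change.
pod.revoke("social-app")
pod.set_permissions("new-social-app", {"read"})
print(pod.read("new-social-app", "/social/posts"))
```

Note where the data lives when the user switches apps: nowhere. The post never leaves the Pod, which is precisely what makes the migration trivial.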

## Vendor Lock-In and Data Portability Challenges Across SaaS Platforms

Even without adopting cutting-edge decentralised tools, many organisations and individuals face a more immediate concern: vendor lock-in within Software-as-a-Service (SaaS) platforms. When your emails, documents, CRM records, or project histories live inside a proprietary ecosystem, leaving that ecosystem can feel as daunting as moving house without being allowed to pack your furniture. If exporting your data is difficult, incomplete, or prohibitively expensive, your ability to exercise meaningful ownership is severely constrained.

Some SaaS providers make data portability a first-class feature, offering comprehensive export tools and open APIs. Others bury export options behind paywalls or provide only partial access to critical data like logs, analytics, or configuration settings. For internet users and businesses alike, a practical way to assert data ownership is to ask hard questions before signing any contract: How easy is it to export all my data? In what format? Can I run my own backups? Data portability is not merely a compliance checkbox under laws like GDPR; it is a concrete test of whether ownership is more than a marketing slogan.

## Zero-Knowledge Proofs, Homomorphic Encryption, and Privacy-Preserving Computation

As data becomes more valuable and regulation tightens, a new category of technologies is emerging to reconcile two seemingly conflicting goals: extracting insights from data while preserving privacy and user control. Privacy-preserving computation techniques such as zero-knowledge proofs, homomorphic encryption, and secure multiparty computation promise a future where we can prove facts about data or run computations on encrypted information without revealing the underlying raw data itself.

At first glance, these ideas may sound like science fiction, but they are already finding real-world applications in messaging apps, mobile operating systems, and analytics platforms. For users who care about data ownership, the key takeaway is that you may no longer need to choose between utility and privacy in absolute terms. You can benefit from smart, personalised services while keeping tight cryptographic control over what is shared and with whom.
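Secure multiparty computation, one of the techniques named above, is surprisingly easy to demonstrate in miniature with additive secret sharing. In this sketch (a toy protocol, with the salary figures invented for illustration), each party splits its private input into random shares that sum to the input modulo a prime; the parties can then jointly compute a total without any party ever seeing another's raw value.

```python
import random

MODULUS = 2**61 - 1  # a Mersenne prime; arithmetic is done modulo this value

def share(secret: int, n_parties: int) -> list:
    """Split a secret into n additive shares that sum to it mod MODULUS."""
    shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def mpc_sum(secrets: list) -> int:
    """Jointly sum private inputs; no single party ever sees a raw input."""
    n = len(secrets)
    # Each party splits its input and sends one share to each peer.
    all_shares = [share(s, n) for s in secrets]
    # Each party i sums the i-th share it received from every peer.
    partials = [sum(all_shares[p][i] for p in range(n)) % MODULUS for i in range(n)]
    # Only the combined partial sums reveal the total.
    return sum(partials) % MODULUS

salaries = [52_000, 61_500, 48_250]  # hypothetical private inputs
print(mpc_sum(salaries))  # → 161750, computed without exposing any one salary
```

Each individual share is a uniformly random number that leaks nothing on its own; only the final recombination yields the aggregate, which is exactly the privacy/utility split the section describes.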

### End-to-End Encryption in Signal and WhatsApp: Architecture and Limitations

End-to-end encryption (E2EE), popularised by apps like Signal and WhatsApp, ensures that only the communicating parties can read the content of their messages. Even the service provider cannot decrypt the conversations because the encryption keys are generated and stored on user devices. Technically, this is like sending a locked box to a friend where only the two of you possess the keys; intermediaries can transport the box but cannot open it.

However, E2EE does not solve every aspect of data ownership. While message content is encrypted, metadata—who you talk to, when, and how often—can still be collected and analysed. Moreover, backup solutions, spam detection, and abuse reporting sometimes require trade-offs that weaken absolute privacy. You might own the content of your chats, but platforms still possess powerful levers via metadata and account-level controls. Understanding this distinction helps users make informed choices about which services align with their expectations of control.
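The "locked box" key agreement at the heart of E2EE can be sketched with classic Diffie-Hellman. This is a deliberately insecure toy: real messengers use X25519 with authenticated key exchange and the Double Ratchet, whereas this example uses a small Mersenne prime group and an XOR keystream purely to show how two parties derive the same key without ever transmitting it.

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime; far too small for real-world security
G = 5           # toy generator

def keypair():
    private = secrets.randbelow(P - 2) + 2
    public = pow(G, private, P)
    return private, public

def shared_key(my_private: int, their_public: int) -> bytes:
    """Both sides compute G^(ab) mod P and hash it into a symmetric key."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(str(secret).encode()).digest()

def xor_encrypt(key: bytes, message: bytes) -> bytes:
    """XOR with a hash-expanded keystream (demo only; XOR is its own inverse)."""
    stream = b""
    counter = 0
    while len(stream) < len(message):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(m ^ s for m, s in zip(message, stream))

# Alice and Bob exchange only public values; the relaying server sees both
# publics yet cannot derive the shared key.
a_priv, a_pub = keypair()
b_priv, b_pub = keypair()
key_a = shared_key(a_priv, b_pub)
key_b = shared_key(b_priv, a_pub)

ciphertext = xor_encrypt(key_a, b"meet at noon")
print(xor_encrypt(key_b, ciphertext))  # → b'meet at noon'
```

Notice what the sketch does not hide: the server still observes who exchanged keys with whom and when, which is precisely the metadata limitation discussed above.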

### Federated Learning and On-Device Machine Learning for Data Minimisation

Federated learning is a promising technique that allows machine learning models to be trained across many devices without centralising the underlying data. Your phone, for example, may locally train a predictive text model based on your typing habits. Rather than sending your raw keystrokes to a server, it sends only aggregated model updates, which are combined with updates from thousands or millions of other devices.

This approach embodies the principle of data minimisation: collecting only what is necessary and keeping sensitive information as close to the user as possible. For you, the internet user, this means you can still benefit from personalised experiences—better autocorrect, more relevant recommendations—while significantly reducing the amount of personal data exposed to central servers. Of course, federated learning is not a silver bullet; implementation details, security of model updates, and potential reconstruction attacks still need careful attention. Yet it represents a concrete step towards aligning sophisticated AI with stronger notions of data ownership.
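The round structure of federated averaging can be shown with a tiny linear model. This is a bare-bones sketch with invented device data, not a production framework (real systems add secure aggregation, client sampling, and compression): each device takes a local gradient step on its own samples, and the server only ever sees and averages the resulting model weights.

```python
def local_update(weights, local_data, lr=0.1):
    """One gradient step on y = w0 + w1*x, using only this device's data."""
    w0, w1 = weights
    n = len(local_data)
    g0 = sum((w0 + w1 * x - y) for x, y in local_data) / n
    g1 = sum((w0 + w1 * x - y) * x for x, y in local_data) / n
    return [w0 - lr * g0, w1 - lr * g1]

def federated_average(updates):
    """Server step: average the weight vectors; raw data never leaves devices."""
    n = len(updates)
    return [sum(u[i] for u in updates) / n for i in range(len(updates[0]))]

global_model = [0.0, 0.0]
device_data = [
    [(1.0, 2.1), (2.0, 4.0)],  # device A's private samples (roughly y = 2x)
    [(3.0, 6.2), (4.0, 7.9)],  # device B's private samples
]

for _ in range(1000):  # each round: local training, then server-side averaging
    updates = [local_update(global_model, data) for data in device_data]
    global_model = federated_average(updates)

print(global_model)  # converges towards the pooled least-squares fit, ≈ [0.15, 1.96]
```

The model ends up close to what training on the pooled data would give, yet the server only ever handled weight vectors, which is the data-minimisation trade at the heart of the technique.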

### Differential Privacy Implementations in Apple’s iOS Analytics

Differential privacy is another key technique designed to enable data analysis without exposing individuals. It works by adding carefully calibrated statistical noise to datasets or queries, so that the presence or absence of any single user does not significantly change the result. Apple has integrated forms of differential privacy into iOS analytics and features like QuickType and emoji suggestions, allowing them to learn from aggregate user behaviour while reducing the risk of identifying specific individuals.

From a data ownership perspective, differential privacy can be seen as a kind of protective blur over your digital reflection. You still contribute to improving services, but no one—at least in theory—can reverse-engineer the noise to single you out. For internet users, this suggests a future in which contributing data to collective intelligence does not automatically entail surrendering control over one’s personal profile. The challenge ahead lies in ensuring that claims of “anonymous” or “privacy-preserving” analytics are backed by rigorous implementation rather than mere marketing, and that users are given clear choices about if and how their data participates in such systems.
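The basic mechanism is simple enough to sketch. The example below (a generic Laplace-mechanism count query on invented data, not Apple's implementation, which uses local differential privacy on-device) adds noise calibrated to the query's sensitivity: a count changes by at most 1 when any single user is added or removed, so Laplace noise with scale 1/ε yields ε-differential privacy.

```python
import random

def dp_count(values, epsilon: float) -> float:
    """Differentially private count via the Laplace mechanism.

    A counting query has sensitivity 1 (one user shifts the count by at
    most 1), so noise drawn from Laplace(0, 1/epsilon) gives epsilon-DP.
    """
    true_count = sum(values)
    scale = 1.0 / epsilon
    # A Laplace sample is the difference of two exponential samples.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_count + noise

random.seed(7)
# Hypothetical dataset: which of 10,000 users enabled a sensitive feature.
users = [random.random() < 0.3 for _ in range(10_000)]
noisy = dp_count(users, epsilon=0.5)
print(round(noisy))  # close to the true count, but any single user is hidden
```

The aggregate stays useful (the error is a few units on a count of thousands), while no individual's participation can be confidently inferred from the released number, which is the trade-off the section describes.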