Personal Data

Defining the Scope of ‘Personal Data’ in Digital India


Key Points

  • Last week, a revised version of India’s new data protection law was approved by the Union Cabinet and could be tabled in Parliament during the monsoon session (July 20 – August 11, 2023).
  • Like the EU’s General Data Protection Regulation (“GDPR”), India’s proposed law follows the notice-and-consent model for collecting and processing personal data. Personal data has been defined as “any data about an individual who is identifiable by or in relation to such data.”
  • Specifically, the proposed data protection framework applies to the processing of digital personal data within Indian territory (whether it is collected online or offline, as long as it is digitized).
  • In addition, it applies to the processing of digital personal data outside Indian territory when such processing involves profiling individuals or offering goods/services within Indian territory.
  • However, it does not apply to non-automated data processing or to offline personal data – if the latter is left undigitized.


At present, India regulates the use of data under the Information Technology Act, 2000, as amended, including by the Information Technology (Amendment) Act, 2008 (“the 2008 Amendment,” and collectively, the “IT Act”) – along with rules framed under the IT Act, such as the Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data or Information) Rules, 2011 (the “SPDI Rules,” and together with the IT Act, the “Existing Data Protection Framework”).

However, this regime is likely to change soon. Last week, India’s Union Cabinet approved a revised version (the “2023 Draft”) of the Digital Personal Data Protection Bill (“DPDP”) – which proposes to replace the Existing Data Protection Framework.

A previous draft (the “2022 Draft”) of DPDP was released by the Ministry of Electronics and Information Technology (“MeitY”) late last year for public comments. According to the Government of India (Allocation of Business) Rules, 1961, MeitY is the nodal authority in respect of matters related to the IT Act as well as other laws concerning information technology (“IT”).


Previously, the Department of Personnel and Training (“DoPT”) (under the Ministry of Personnel, Public Grievances and Pensions) had been the nodal authority with respect to a proposed privacy law in India. For instance, in 2011, a draft ‘Right to Privacy Bill’ was prepared and discussed under the auspices of the DoPT. A year later, India’s erstwhile Planning Commission constituted a group of experts, which then submitted a report in October 2012 recommending a separate legal framework for privacy. However, this law did not come to pass.

Thereafter, the year 2017 saw a flurry of activity. In the absence of a dedicated data protection law in India, along with perceived inadequacies in the Existing Data Protection Framework (especially on account of the country’s needs in a data-driven global economy), MeitY constituted a committee of experts chaired by Justice B. N. Srikrishna (the “Expert Committee”) with a mandate to (i) examine issues related to data protection, and (ii) make specific suggestions towards a data protection bill.

Separately, the Telecom Regulatory Authority of India (“TRAI”) released a consultation paper in August 2017 on privacy, security and data ownership in the telecom sector. That same month, the Supreme Court unanimously recognized privacy as a fundamental right under the Indian Constitution, stemming from the latter’s Article 21. Further, by late November, the Expert Committee managed to release a white paper on a revised data protection framework (the “White Paper”) for public comments.

Pursuant to multicity consultations in connection with the White Paper, in July 2018, the Expert Committee submitted a report on a free and fair digital economy (the “EC Report”) to MeitY – along with a draft Personal Data Protection Bill, 2018 (“PDP 18”). A year and a half after the EC Report was released, based on suggestions from stakeholders, the Personal Data Protection Bill, 2019 (“PDP 19”) came to fruition. Upon its introduction in Parliament, PDP 19 was referred to a joint committee (the “Joint Committee”). After 78 sittings conducted over two years during the pandemic, the Joint Committee’s report (the “JC Report”) was presented in Parliament at the end of 2021. Among other things, the JC Report suggested that the short title of PDP 19 ought to be changed to the ‘Data Protection Act, 2021’ (“Proposed DP Act”), including for the purpose of extending legal protection to non-personal data.

Comprising an amalgam of revisions made to PDP 19, the Proposed DP Act was included within the JC Report. However, PDP 19 was withdrawn last August in light of extensive changes, including those suggested by the Joint Committee. A few months later, MeitY released the 2022 Draft of DPDP.

Present status

Pursuant to extensive feedback received on the 2022 Draft, the 2023 Draft of DPDP – a version of which was approved by a parliamentary standing committee in March – is now ready. However, unlike the 2022 Draft, the 2023 Draft has not yet been made publicly available. In the end, it is possible that the 2022 and 2023 Drafts will resemble each other in large part, barring a few provisions, such as those on voluntary disclosure of data breaches and alternative dispute resolution, along with reduced penalties for non-compliance.

At any rate, since the Union Cabinet has now officially cleared the 2023 Draft, it appears that DPDP is poised for parliamentary deliberation – i.e., if the government manages to get it tabled before the Lok Sabha during the latter’s monsoon session, scheduled between July 20 and August 11.

Defining Personal Data

In this note, we discuss the statutory definition of personal data under the 2022 Draft of DPDP, including with reference to the GDPR.


Global regulatory consensus appears to have broadly converged to define ‘personal data’ as information that relates to an identified or identifiable individual. For instance, Article 4(1) of GDPR defines ‘personal data’ as any information relating to an ‘identified or identifiable natural person’. Under GDPR, such a ‘data subject’ may be identified either directly or indirectly with reference to an ‘identifier’ (such as a name, an identification number, location data, or an online identifier), or with reference to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of a natural person.

Definitions of personal data may exhibit variation in terms of details, parameters and complexity across jurisdictions, subject to local requirements and national/cultural perspectives. Nevertheless, the definition itself, other than providing a basis for interpretative debate among lawyers and policymakers, involves matters of practical importance as well – including for businesses and consumers. After all, it is this definition that goes on to determine the zone of informational privacy guaranteed by a country’s data protection framework, and thus, indicates the extent of obligations for those that handle personal data.

Furthermore, global definitional convergence is useful for regulatory and negotiation purposes. For instance, the definition of personal data under the 2022 Draft of DPDP is substantially similar to the one contained in the EU’s current proposal for a legal text on digital trade (the “EU Proposed Text”) – as tabled for discussion in respect of an anticipated free trade agreement with India (the “FTA”), negotiations for which resumed in June last year.

Nevertheless, the equivalent definition in GDPR stems from, and improves upon, Article 2(a) of Directive 95/46/EC (the “EU DP Directive”) – the former’s predecessor – which got replaced in May 2018 (see Article 94(1) of GDPR). Thus, for instance, GDPR has added the word ‘genetic’ to the list of identifiers in respect of the specific factors related to a person’s identity, whereas Article 4(13) separately defines ‘genetic data’.

While the 2022 Draft of India’s DPDP defines personal data somewhat sparsely, GDPR goes on to explain that an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier. In practice, the illustrative list of identifiers that GDPR refers to includes data which has been, or is capable of being, assigned to a person. For example, telephone numbers, credit card numbers, personnel IDs, account data, number plates, customer numbers, as well as private addresses, may constitute personal data.

India’s existing framework vs. DPDP vs. GDPR


Pursuant to the 2008 Amendment, India’s Existing Data Protection Framework specifically stems from Section 43A of the IT Act, read with the SPDI Rules. Rule 3 of the latter defines ‘sensitive personal data or information’ (“SPDI”) as that which consists of information relating to items such as passwords, financial information (e.g., details related to a bank account, credit/debit card or other payment instruments), health and medical records, biometric information, etc.


However, the IT Act defines ‘data’ quite broadly: viz., “a representation of information, knowledge, facts, concepts or instructions” – which are prepared in a formalized manner and processed (or intended to be processed) in a computer system/network. Further, the IT Act specifies that data can be in any form, and may be stored internally in the memory of a computer.

While Section 2(4) of the 2022 Draft of DPDP defines ‘data’ more or less consistently with the IT Act, the DPDP definition is more refined – such as by including ‘opinions’ within the ambit of represented information, and adding components like suitability for communication, interpretation, or processing by human beings or through automated means.

The slightly modified definition of ‘data’ under DPDP (vis-à-vis the IT Act) may be relevant in the context of new technologies such as artificial intelligence (“AI”) that involve automated decision-making or machine learning (“ML”) systems, which are algorithmically controlled. For instance, in 2019, the Organisation for Economic Co-operation and Development (“OECD”) defined ‘AI’ as a machine-based system that can make predictions, recommendations, or decisions influencing real or virtual environments, designed to operate with varying levels of autonomy.

On the question of including ‘opinions’ under data, while jurisdictions like Singapore explicitly clarify that information need not be true or proven in order to constitute personal data (see Section 2(1) of the Personal Data Protection Act 2012 of Singapore (“PDPAS”)), this is implicit in other national legislations as well. For instance, under DPDP, an individual has the right to correct their personal data (see Section 13(1)), including when it is inaccurate or misleading.

Personal data or information

Even though it clarifies what ‘data’ is, the IT Act does not contain an explicit definition of personal data. On the other hand, Rule 2(1)(i) of the SPDI Rules defines personal information. Importantly, while the latter definition reflects elements from both GDPR and DPDP, it does so by linking identifiability to information that is available (or likely to be available) with a body corporate – which term was defined in the 2008 Amendment of the IT Act to include companies, firms, sole proprietorships, and other associations of individuals engaged in commercial or professional activities. Further, while DPDP refers to an ‘individual’ in the context of personal data, the SPDI Rules, like GDPR, refer to a ‘natural person’. Nonetheless, these terms appear to be functional equivalents.

Information or data: what’s the difference?

The terms ‘information’ and ‘data’ are both used in the context of privacy and data protection laws around the world. Accordingly, the use of either term – i.e., ‘data’ or ‘information’ – may not signify a distinction as such. Other jurisdictions also appear to not make this distinction. Although the IT Act contains separate definitions for ‘information’ and ‘data’ under Sections 2(1)(v) and 2(1)(o), respectively, Explanation (iii) to its Section 43A, read with Rule 3 of the SPDI Rules, refers to SPDI as ‘sensitive personal data or information’ (emphasis added) – suggesting synonymous legislative use.

The DPDP definition

Section 2(13) of the 2022 Draft of DPDP defines ‘personal data’ as “any data about an individual who is identifiable by or in relation to such data.” In the next few paragraphs, we will unpack this definition part by part.

‘Any data’

Personal data may involve any kind of information about an individual. Accordingly, it could cover both objective and subjective information (including opinions, estimates or assessments related to such person).


Further, Section 4 of DPDP limits the operation of the proposed data protection framework to the processing of digital personal data within Indian territory. Such data may be collected either online or offline, but as long as it has been digitized, DPDP will apply.

In addition, DPDP applies to the processing of digital personal data outside Indian territory when such processing relates to: (i) the profiling of a ‘data principal’ (i.e., the equivalent of a ‘data subject’ under GDPR); or (ii) an activity of offering goods or services within Indian territory. Both conditions appear to track GDPR: Recital 23 deals with the offering of goods or services to data subjects in the EU, while Recital 24 deals with the monitoring of their behavior, including through profiling.
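The scope rules summarized in this note can be read as a short decision procedure. The sketch below is only an illustration of that reading, not a statement of the law: the class `ProcessingActivity`, its field names, and the function `dpdp_applies` are hypothetical and do not appear in DPDP. It also assumes that the exclusions for undigitized offline data and non-automated processing are checked first.

```python
from dataclasses import dataclass


@dataclass
class ProcessingActivity:
    """Hypothetical description of a processing activity (illustrative fields only)."""
    is_digital: bool                      # collected online, or collected offline and digitized
    is_automated: bool                    # processed through an automated operation
    occurs_in_india: bool                 # processing takes place within Indian territory
    involves_profiling_in_india: bool = False
    offers_goods_or_services_in_india: bool = False


def dpdp_applies(activity: ProcessingActivity) -> bool:
    """Return True if, on the reading in this note, the 2022 Draft would apply."""
    # Assumed exclusions: undigitized (offline) data and non-automated processing.
    if not activity.is_digital or not activity.is_automated:
        return False
    # Processing within Indian territory is covered.
    if activity.occurs_in_india:
        return True
    # Processing outside India is covered only if one of two alternative links exists.
    return (activity.involves_profiling_in_india
            or activity.offers_goods_or_services_in_india)


# A foreign retailer that offers goods within India and processes digital data abroad.
foreign_retailer = ProcessingActivity(
    is_digital=True, is_automated=True, occurs_in_india=False,
    offers_goods_or_services_in_india=True,
)
# A paper register kept in India that is never digitized.
paper_register = ProcessingActivity(
    is_digital=False, is_automated=False, occurs_in_india=True,
)
```

Under these assumptions, `dpdp_applies(foreign_retailer)` evaluates to `True` and `dpdp_applies(paper_register)` to `False`. Borderline cases, such as data captured on a device that is not connected to the internet, depend on how ‘offline’ is ultimately interpreted and are not resolved by this sketch.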


In general, GDPR protects personal data irrespective of the technology used for data processing (see Recital 15). Thus, GDPR is technology neutral and applies to both automated and manual processing, as long as the data is organized in accordance with pre-defined criteria (e.g., in alphabetical order). Further, as far as GDPR is concerned, it does not matter how and where the data is stored – e.g., in an IT system, through video surveillance, or on paper. In all such cases, personal data will be subject to the protection requirements set out in the law.

While DPDP does not apply to non-automated data processing (or to offline personal data, if subsequently left undigitized), GDPR too does not apply to non-automated data processing when the underlying personal information is not intended to be part of a ‘filing system’ (see Article 2(1) of GDPR (“Material Scope”)). This GDPR provision, in turn, stems from Article 3(1) of the EU DP Directive.

Indeed, while the definition of ‘processing’ under DPDP seems to be inspired by GDPR, the latter’s definition, in turn, stems from Article 2(b) of the EU DP Directive.


Article 2(b) of the EU DP Directive defined the ‘processing of personal data’ (or ‘processing’) as “any operation or set of operations which is performed upon personal data, whether or not by automatic means, such as collection, recording, organization, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, blocking, erasure or destruction.”

Article 4(2) of GDPR defines ‘processing’ as “any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction.”

Section 2(16) of the 2022 Draft of DPDP defines ‘processing’ in relation to personal data to mean “an automated operation or set of operations performed on digital personal data, and may include operations such as collection, recording, organisation, structuring, storage, adaptation, alteration, retrieval, use, alignment or combination, indexing, sharing, disclosure by transmission, dissemination or otherwise making available, restriction, erasure or destruction.”
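For readers who wish to compare the three definitions term by term, the operations listed in each can be treated as sets. The sketch below is purely illustrative: the helper `operations` and the variable names are hypothetical, spellings have been normalized (e.g., ‘organisation’), and paired terms such as ‘adaptation or alteration’ have been split into separate entries. Since each definition introduces its list with “such as” or “may include,” the differences shown relate to the enumerated examples only, and not necessarily to legal scope.

```python
def operations(listing: str) -> set:
    """Split a comma-separated listing into a set of operation names."""
    return {item.strip() for item in listing.split(",")}


# Operations enumerated in Article 2(b) of the EU DP Directive.
DIRECTIVE = operations(
    "collection, recording, organisation, storage, adaptation, alteration, "
    "retrieval, consultation, use, disclosure by transmission, dissemination, "
    "making available, alignment, combination, blocking, erasure, destruction"
)

# Operations enumerated in Article 4(2) of GDPR.
GDPR = operations(
    "collection, recording, organisation, structuring, storage, adaptation, "
    "alteration, retrieval, consultation, use, disclosure by transmission, "
    "dissemination, making available, alignment, combination, restriction, "
    "erasure, destruction"
)

# Operations enumerated in Section 2(16) of the 2022 Draft of DPDP.
DPDP = operations(
    "collection, recording, organisation, structuring, storage, adaptation, "
    "alteration, retrieval, use, alignment, combination, indexing, sharing, "
    "disclosure by transmission, dissemination, making available, restriction, "
    "erasure, destruction"
)

added_by_gdpr = sorted(GDPR - DIRECTIVE)      # ['restriction', 'structuring']
dropped_by_gdpr = sorted(DIRECTIVE - GDPR)    # ['blocking']
added_by_dpdp = sorted(DPDP - GDPR)           # ['indexing', 'sharing']
missing_in_dpdp = sorted(GDPR - DPDP)         # ['consultation']
```

The more significant difference lies outside these lists: both EU texts cover operations “whether or not” automated, whereas the DPDP definition is limited to automated operations on digital personal data.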

Digital data

Templates of data protection laws first emerged when new technologies in electronic data processing started to enable widespread access to personal information. To that extent, the material scope of GDPR (see Article 2) refers to a ‘filing system’ as any structured set of personal data which is accessible according to specific criteria, whether centralized, decentralized, or dispersed on a functional or geographical basis. This term is defined under Article 4(6) of GDPR, using the same language as contained in Article 2(c) of the erstwhile EU DP Directive.

Meanwhile, the explanatory note to the 2022 Draft of DPDP (the “Explanatory Note”), which accompanied the draft’s release in India, specified that the law would apply to digital personal data only – in recognition of the rising role of the internet and increased ‘digitalization’. However, the sole purpose of the Explanatory Note was to make it easier for the public to understand the various provisions of DPDP. Thus, the Explanatory Note is not intended to form a part of DPDP, and therefore, it should not be used for the purpose of interpreting the latter’s legal provisions.

Nevertheless, this is the first time that Indian law has clarified, both in terms of nomenclature and scope, that the proposed data protection framework is restricted to digital data only. While ‘digitization’ refers to the process of converting physical or analog information (such as paper documents or images) into digital, machine-readable formats – which, in turn, may be accessed, stored, and manipulated using computers and digital technologies – ‘digitalization’ refers to the use of digital technologies (such as cloud computing, AI, and the Internet-of-Things) to automate and/or otherwise improve upon business processes, create additional value for customers through new products/services, enhance customer experience, as well as generate revenue. Thus, digitization is the first step towards digitalization. By converting physical data into digital formats, businesses are able to better leverage digital technologies.

To maintain its focus on digital interactions in the context of widespread digitization across the Indian economy, DPDP recognizes that data in general, and personal data in particular, remains at the core of a fast-growing ecosystem of digital products, services, and intermediation. In this regard, DPDP may involve a focused pivot with respect to the automated processing of online digital datasets.

Nevertheless, digital datasets are abundant and on the rise. Along with India’s transition to a digital economy, the processing of personal data has increased exponentially in both public and private sectors. In addition, other than being innately valuable, personal data assumes special significance when shared. Almost every human activity today involves some species of data transaction. Accordingly, the internet has birthed new markets, including those engaged in the collection, organization, and processing of personal data, where such activities remain a critical component of the underlying business model. ‘Big Data’ offers specific methods and technologies for statistical data evaluation, which arise at the interface of business informatics and commercial data management – combining the fields of business intelligence, data warehousing (an infrastructural technology that helps evaluate data inventories) and data mining (i.e., the application of exploratory methods to a data inventory with the aim of discovering knowledge from such databases, and specifically, for the purpose of pattern recognition). In this regard, issues of privacy and customer confidentiality have acquired added prominence on account of the rise of digital tracking and targeted advertising.

Offline data and non-automated processing

Section 4(3) (“Application of the Act”) excludes ‘offline’ personal data and ‘non-automated’ processing from DPDP’s ambit. Section 2(1) of DPDP clarifies what ‘automated’ means, i.e., “any digital process capable of operating automatically in response to instructions given or otherwise for the purpose of processing data.” However, in the absence of a separate clarification on what the terms ‘offline’ and ‘non-automated,’ respectively, connote – it could be argued that DPDP excludes both: (i) processing of offline data; and (ii) offline processing of data. Thus, data processing that does not involve either digitization or automation may be considered ‘offline’.

However, according to Section 4(1)(b), DPDP does apply to data when it is subsequently digitized – even if collected offline. Strictly speaking, such collection may include the use of a computer or other device which is not connected to the internet (and hence is ‘offline’) – although the underlying data gets digitized at the time of collection itself. Accordingly, this exclusion is not clear and could be clarified further.

‘In relation to’

In general terms, information can be considered to ‘relate’ to an individual when it is about that person. In many situations, this relationship can be easily established. For instance, digitized data with respect to someone’s file in a personnel office is “in relation to” their status and situation as an employee. Similarly, data about the results of a patient’s medical test, or the image of a person filmed on video, are in relation to such individual.

However, in some situations, the information conveyed by a piece of data may involve something or someone else, and not a specific individual as such. Even in these cases, the information may relate to an individual indirectly. For example, a digitized automobile service register may contain information about a car, its mileage, dates of service checks, technical problems, and material condition. In turn, this information may be associated with the digitized record of vehicular registration, number plate, and/or engine number, which, in turn, may be linked to a specific owner. Where a garage or insurance company establishes a connection between the vehicle and such owner, the information will ‘relate’ to the owner for the purpose of billing or assessment. If, on the other hand, a similar connection is made in respect of a mechanic for the purpose of ascertaining productivity or expertise, the underlying information may ‘relate’ to such mechanic.

In sum, for data to ‘relate’ to an individual, any one of three key elements, associated with (1) content, (2) purpose, or (3) result, needs to exist.


The element of content is present where information about a particular person may be available, regardless of the purpose or impact of such information on the individual concerned. For example, results from a medical test may unequivocally relate to a patient, or the information contained in a bank’s database under the name of a certain customer will obviously relate to such customer.


The element of purpose may be invoked when certain valuable information – which a ‘data fiduciary’ or a third party intends to exploit commercially – relates to a certain person. Section 2(5) of DPDP defines a ‘data fiduciary’ as “any person who alone or in conjunction with other persons determines the purpose and means of processing of personal data.” Such persons may include companies, firms, associations of persons or bodies of individuals (whether incorporated or not), the state itself, and any artificial juristic person (see Section 2(12) of DPDP).

Thus, the element of purpose may exist when personal data is used, or is likely to be used, with the purpose of evaluating, collating, commercially exploiting, and/or influencing the status, preferences, profile, or behavior of an individual.

For instance, individual persons may be associated with certain online identifiers provided by their devices, applications, and protocols, such as IP addresses, cookie identifiers, and other identifiers like radio frequency identification tags. Such online identifiers may leave behind traces which – particularly when combined with uniquely identifying elements and/or other information received by servers – can be used to create personal profiles (see Recital 30 of GDPR). Similar to Article 4(4) of GDPR, Section 4(2) of DPDP refers to ‘profiling’ as any form of data processing that analyzes or predicts certain aspects related to the behavior, attributes, or interests of a data principal.

Thus, it is clear that DPDP, even while extending special protections to children (see Section 10), does include online tracking, behavioral monitoring, and targeted advertising within its ambit.


The third element deals with result. Accordingly, despite the absence of either content or purpose as discussed above, data may nevertheless ‘relate’ to an individual when its use is likely to have an impact on their rights and/or interests. In this regard, it may not be necessary for the potential result to have a major impact, as long as an individual is likely to be treated differently from other persons as a result of data processing.

For instance, there may be a satellite-based radionavigation system used by a cab service provider/aggregator that helps discern the position of available cars within a fleet in real time. The purpose of data processing in this context may be to provide better services by assigning to each client a taxi that is closest to their current location. While the aim is not to evaluate the performance of drivers, the system may nonetheless allow the monitoring of their performance and itineraries, including their adherence to speed limits and other activities. Such a system can therefore have a considerable impact on drivers, where the data relates to individual persons. Thus, the processing of such information could be subject to data protection rules.


The three elements (viz., content, purpose and result) need to be viewed as alternative conditions (and not cumulatively). For instance, where the element of content exists, there is no requirement to check for the other two elements to ascertain whether the information relates to a specific individual. On the other hand, the same piece of information may relate to different individuals at the same time, depending on the element present with regard to each such individual. Thus, the same data may relate to Person A because of content (e.g., where the underlying information is about Person A), while also relating to Person B on account of purpose (e.g., where such information might be used to treat Person B in a certain way), and relate to Person C as well – stemming from result (e.g., when the information is likely to have an impact on the rights of Person C). Taken together, this implies that a certain piece of data need not focus on one particular person alone in order to relate to that person.
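The alternative (rather than cumulative) nature of the three elements can be expressed as a simple logical ‘or’. The following sketch is offered only as an aid to understanding: the class `Link` and the function `relates_to` are hypothetical names, and in practice, whether each element is present is a matter of legal assessment rather than a true/false flag.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical record of how one piece of data connects to one individual."""
    content: bool = False   # the data is about this individual
    purpose: bool = False   # the data is, or is likely to be, used to evaluate or influence them
    result: bool = False    # use of the data is likely to affect their rights or interests


def relates_to(link: Link) -> bool:
    """Data 'relates' to an individual if any one element is present."""
    return link.content or link.purpose or link.result


# The same piece of data, assessed separately against four individuals.
links = {
    "Person A": Link(content=True),
    "Person B": Link(purpose=True),
    "Person C": Link(result=True),
    "Person D": Link(),  # no element present
}

related = [name for name, link in links.items() if relates_to(link)]
```

Here, `related` contains Persons A, B and C, each on account of a different element, while Person D is left out because none of the three elements is present.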

This insight has been authored by Deborshi Barat (Counsel). This insight is intended only as a general discussion of issues and is not intended for any solicitation of work. It should not be regarded as legal advice and no legal or business decision should be based on its content.