Intellectual Property

‘Big Data’ - IP Protection a Challenge

Volume 2 Issue 1 June 2018 : By ITM Research Team

‘Big Data’ is a technological change taking place at a rapid pace these days, and one may well wonder which IPRs can protect this Intellectual Property. According to Gartner, “Big Data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and processing automation”. However, the generation, analysis and usage of Big Data present every business with a serious challenge in terms of time, labour and money. We all know that Intellectual Capital is at the heart of any innovative business: using it well will be the recipe for future business success, and failing to use it well will be a recipe for failure. Hence, it is prudent to frame IP strategies around Big Data. We therefore need to assess the Intellectual Property protection available to Big Data along the following routes:



Patents

For Big Data patents, companies will have to disclose the underlying algorithms, which are, strictly speaking, not technical inventions. They are theoretical structures or methods and could therefore easily fall within the area of non-patentable subject matter. The second problem is that algorithmic patents are particularly vulnerable to being “innovated” around by others. It is quite unlikely that a data analysis algorithm will be unique from a technical point of view: most data analysis algorithms do similar things in sequence, such as search, clever search and pattern recognition. One can call this a commoditisation process, starting from search and ending with pattern recognition.


Section 3(k) of the Indian Patents Act specifies that the following cannot be considered inventions within the meaning of the Act:


"a mathematical or business method or a computer programme per se or algorithms"


But many companies have managed to secure patents by presenting their innovations as "more" than just a computer program. Examples of patents granted include Patent 252220, awarded to Google for "Generating user data for use in targeted advertising", and Patent 252448, awarded to Oracle for "In-place evolution of XML schemas in databases".


The key to a successful patent grant for a software innovation therefore lies in the applicant’s ability to present the innovation as something more than a mere computer program. Subject matter that is “not just a computer program” should be intelligently made out to be an essential part of the invention, without compromising the scope of protection. Patent claims on algorithms must accordingly be quite specific, focused on the algorithm’s application and usage: the more broadly they are described, the higher the likelihood of rejection on grounds of prior art. Narrow claims, however, reduce their usefulness in blocking others’ access to the market. In practice, this means that a patent around data analysis can almost always be circumvented with relative ease. Moreover, for algorithms to have continued business value in a Big Data setting, they must be adapted continuously, because the volume, sources and behaviour of Big Data change continuously.


Google’s core search algorithms are a case in point: they are continuously modified and updated to stay relevant. If Google did not do so, it would fall behind the competition very quickly and its algorithms would become irrelevant in a short time. The consequence is that even if a business manages to successfully patent its Big Data analytical algorithms, and avoids the pitfalls described above, such a patent will lose its value very quickly, because the algorithms used in the product or service will rapidly evolve away from those described in the patent as the volume, variety and velocity of the data keep changing.


Hence, from an IP strategy point of view, businesses will have to become much more selective in applying for and using patents, and will have to re-assess the value that patents add to the business on a continuous basis.


Data Exclusivity

Data Exclusivity is, simply put, ownership of data that one has generated using money, labour and time. In most countries it is possible to “own” data under the law. The legal principle rests on protecting the effort expended to create or gather the data, and it allows the holder to block, or charge for, access or use (data generated in drug development in particular). India has no separate legislation for the protection of databases generally (including pharmaceutical product/drug development data) of the kind found in the European Union (the EU Database Directive, 96/9/EC). The limited protection available in India for databases is as follows:


Article 21 of the Constitution protects an individual’s data from being placed in the public domain. The right extends to data in electronic form, and Section 66E of the IT Act 2000, which prescribes punishment for violation of privacy, facilitates the protection of such data. However, there are a number of challenges related to the ownership of data.



Copyright

The applicability of copyright to machine- or user-generated content remains an open question. Copyright is granted for the expression of creative activity: writing a book or a blog, creating or playing music, making a film, and so on. Copyright also applies to software code, on the reasoning that code is like language and therefore subject to copyright. As such, copyright covers the code, but not the software functionality expressed through the code. Big Data is information, and copyright does not apply to the semantic content or meaning of text written by human authors; big data generated by machines or sensors will not be covered by copyright either. In general, statistical or mathematical data are, as such, not covered by copyright. Claiming copyright is easy, since there is no registration system, and there is no sanction attached to wrongfully claiming copyright, or to claiming copyright over something that copyright law cannot cover (e.g. machine-generated data).


Hence, in practice, the copyright approach to Big Data does not work. Most business value extracted from Big Data will either be in open breach of copyright, typically by ignoring it (as, for example, Facebook and other large social media platforms do), or will involve data that are not under copyright but have not necessarily been recognised as such by the courts yet.


Therefore, before relying on copyright protection, one has to analyse a) whether copyright applies, and b) whether it adds any business value. Today, technological advances have made it easier for persons and organisations to copy and distribute data in electronic form from any source for commercial gain. Hence, in the absence of a data protection law, companies must rely on the courts’ interpretation of the Copyright Act. An original work into which considerable skill and labour have been put can be protected under the Copyright Act; but on the courts’ interpretation, the mere assimilation of data does not constitute an original work and cannot be protected. In Big Data, most data are generated by someone else, and the value of data increases with use when there is no restriction on their use. Most of the business value in Big Data lies in combining data from multiple sources. Moreover, the actual source of the data, derived as it is from various levels of communication, is often unknown. For example, data from customers will be combined with data from suppliers; data from government agencies will be combined with data from machines; internal data will be compared with external data.


The value of data lies in its flow, not its sources. Big Data can be compared to river systems springing up everywhere, and the value of a river lies in having access to the flow, not in control over the sources. Of course, the sources have some relevance, and control over specific forms or aspects of data may be valuable for certain applications. But gaining and providing access to data will be much more valuable than preventing access to data. As a result, the question of “ownership” of data is probably not the right question to ask. It does not matter so much who “owns” the data as who can use them, and for what purpose.


The conclusion on ownership (copyright) is again best illustrated by the river analogy: we should not focus on who owns the land alongside the river; we should focus on being able to use the flow, and on extracting value from that flow.


Trade Secrets

Secrecy and know-how protection can be a very valuable business asset, as the famously secret formulas of the soft drink Coca-Cola and of Heineken beer illustrate: the owners have not protected these products through registered IP rights, yet the formulas carry significant business value and are protected by other legal instruments. Typically this means contract law, with confidentiality agreements playing a major role in protecting business secrets and know-how. Most legal systems allow businesses to bring legal claims against competitors, business partners or employees who disclose or use secret information in unauthorised ways.


By and large, this is the approach many businesses take. Often, the strategy for protecting Intellectual Capital will consist of identifying what the business secrets are and building appropriate procedures for their protection or disclosure.


Here, a key consideration for a Big Data IP strategy is the word “secret”. This is where Big Data presents an important shift: not only does it become much harder to know who owns or generates which data, or what those data contain; it also becomes much riskier not to grant relatively free access to data, because much of the value in Big Data comes from recombining data from diverse sources or approaching data in a unique way.


Therefore, in conclusion, IP strategies around Big Data should focus on instruments for accessing and using the flow of data, rather than on patenting, data ownership or copyright, since none of these takes into account the ‘continuous growth’ aspect of Big Data.


Contributed by the Research Team, ITM Business School, Navi Mumbai