Comparing Hadoop Big Data side by side with traditional relational database solutions helps you understand the advantages and drawbacks of each. If you want to engage in meaningful IT discussions and make informed business decisions, knowing the available technologies and techniques is a key first step. As Viktor Mayer-Schönberger and Kenneth Cukier put it:
Just as the telescope enabled us to comprehend the universe and the microscope allowed us to understand germs, the new techniques for collecting and analyzing huge bodies of data will help us make sense of our world in ways we are just starting to appreciate.[1]
According to Munvo software partner, SAS:
Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.[2]
A more concise colleague put it this way:
Hadoop is a technology architecture that makes use of commodity hardware in a highly distributed and scalable fashion, enabling fast data retrieval at a lower cost.
Both definitions are admirably succinct, and both show how the world (and the market) is transforming the way data, in small and large amounts alike, is collected and stored. It’s time to get on board.
To see how well Hadoop Big Data stands up against Relational Database solutions like IBM Campaign (formerly IBM Unica), we compared the two, designating seven different characteristics from the outset. In our study, Hadoop Big Data and traditional Relational Databases went head-to-head in the following arenas:
As more organizations realize that larger, more comprehensive quantities of data can flesh out CRM platforms, feed current marketing solutions and enhance ongoing Business Intelligence (BI) initiatives, Big Data solutions like Hadoop look very attractive.
If you look back at Figure 1, however, you’ll see that Hadoop Big Data is no cure-all. In fact, more traditional relational databases are still superior when it comes to security, IT support, static customer profiles and profile integration.
And why is that?
Hadoop Big Data and Relational Databases function in markedly different ways.
Relational databases follow a principle known as Schema “On Write.” Hadoop uses Schema “On Read.”
With Schema “On Write,” in IBM Campaign for example, the structure of the data is taken into account at the moment it is written. That structural information is used to construct tables, joins, rules and constraints, and every write has to conform to them. This approach gives users the advantage of clean data, because specific rules and structures are enforced before anything is stored.
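To make Schema “On Write” concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for any relational database. The customers table, its columns and the sample rows are hypothetical and chosen only for illustration; they are not taken from IBM Campaign.

```python
import sqlite3

# In-memory database standing in for any relational store (an assumption
# made for this illustration).
conn = sqlite3.connect(":memory:")

# The schema is declared up front: rules and constraints exist before any data.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE,
        age         INTEGER CHECK (age >= 0)
    )
""")

def try_insert(row):
    """Attempt one write; return True if the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customers (customer_id, email, age) VALUES (?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        # The write broke a rule (NOT NULL, UNIQUE or CHECK) and was rejected.
        return False

results = [
    try_insert((1, "ana@example.com", 34)),  # valid row
    try_insert((2, "ana@example.com", 41)),  # duplicate email
    try_insert((3, None, 29)),               # missing email
    try_insert((4, "bo@example.com", -5)),   # negative age
]
stored = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(results, stored)
```

Only the first row is stored. The other three are rejected at write time, which is how the data stays clean without any effort from whoever queries it later.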
Hadoop, on the other hand, uses a Schema “On Read” approach, in which it typically “dumps” data by effectively ignoring all structure when writing, resulting in “unstructured” data. As a result, cleaning and interpreting data is left to whoever is querying Hadoop during the “read.”
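The Schema “On Read” idea can be sketched in a few lines of plain Python. This is an illustration of the principle only, not Hadoop code: the raw lines stand in for files dumped into storage, and the field names and cleaning rules are hypothetical.

```python
import json

# Raw records as they were "dumped": nothing was validated at write time.
raw_lines = [
    '{"email": "ana@example.com", "age": 34}',
    '{"email": "ANA@example.com ", "age": "41"}',  # messy casing, age as text
    '{"name": "Bo", "visits": 3}',                 # different shape entirely
    'not even json',                               # unparseable line
]

def read_customers(lines):
    """Apply a schema while reading: parse, filter and clean each record."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # the reader decides what to do with bad input
        if not isinstance(record, dict):
            continue
        email = record.get("email")
        if not isinstance(email, str):
            continue  # record does not fit the shape this reader expects
        try:
            age = int(record.get("age"))
        except (TypeError, ValueError):
            age = None
        yield {"email": email.strip().lower(), "age": age}

customers = list(read_customers(raw_lines))
print(customers)
```

Every line was accepted when written, so all of the interpretation happens in read_customers. Two records survive, and both resolve to the same email address with different ages. Nothing at write time prevented the duplicate, so reconciling it is left to the reader as well.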
The absence of identifiable rules, constraints and overall structure makes it difficult to maintain a static customer profile that is unambiguous and free of duplicate data. Relational databases are better suited to storing and maintaining clear systems of customer records, especially where critical information is involved. Hadoop isn’t looking for a specific, single column or row. Hadoop searches for patterns, probabilities and ambiguous recurrences.
Your organization may have already invested in advanced tools, like ETL (“Extract, Transform, Load”), that do not easily transfer to Hadoop. What’s more, chances are that your organization has already based its applications, such as IBM Campaign, and maybe its entire infrastructure, on relational databases.
Through it all, it is important to remember that technologies, requirements, skillsets and objectives can, and will, change. Learn all you can and ask the right questions.
[1] Mayer-Schönberger, Viktor and Kenneth Cukier. Big Data: A Revolution That Will Transform How We Live, Work, and Think. New York: Houghton Mifflin Harcourt, 2013, p. 7.
[2] https://www.sas.com/en_ca/insights/big-data/hadoop.html#