Theodora Fragkouli

About Theodora Fragkouli

Theodora graduated from the Computer Engineering and Informatics Department at the University of Patras. She also holds a Master's degree in Economics from the National Technical University of Athens. During her studies she was involved in a large number of projects ranging from programming and software engineering to telecommunications, hardware design and analysis.

Big Data: What about Security?

Hadoop has had a security problem since it first appeared. Apache Knox and Cloudera Manager have emerged as solutions that provide authentication and authorization for basic database management functions, and the underlying Hadoop filesystem now incorporates Unix-like permissions. But the issue has not been solved, so the usual pattern is to “plunk the S-word after the name of a new technology and you have a ‘BOLD IDEA FOR A NEW STARTUP!!!!’”, as explained in “Trust me: Big data is a huge security risk”.

The same thing has happened before with SOA security, AJAX security and open source security: in each case a security startup came up around the new technology.

In Hadoop, and in big data in general, the real security problem is that when we aggregate a lot of data we may lose its context. Hadoop allows us to store that context, but checking all of it against each piece of data is an expensive proposition.

What matters about context is, for example, not only how to access a database as a certain user, but also how to aggregate more data while preserving granular rights and permissions.
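The idea of aggregating data while preserving granular rights can be illustrated with a small sketch. It is not taken from Hadoop, Accumulo or any other product; the class `LabeledAggregation`, the `Row` type and the label names are all hypothetical, and real systems use richer label expressions than the simple "reader must hold every label" rule assumed here. It only shows that when each row carries its own context, the permission check has to run once per row during aggregation, which is also where the cost comes from.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names belong to a Hadoop or Accumulo API.
final class LabeledAggregation {

    // A data point that carries its own context: the labels a reader must hold.
    static final class Row {
        final String source;
        final long amount;
        final Set<String> requiredLabels;

        Row(String source, long amount, Set<String> requiredLabels) {
            this.source = source;
            this.amount = amount;
            this.requiredLabels = requiredLabels;
        }
    }

    // Assumed rule: a row is visible only if the reader holds every label it requires.
    static boolean canRead(Row row, Set<String> readerLabels) {
        return readerLabels.containsAll(row.requiredLabels);
    }

    // Sum only the rows the reader may see, so per-row rights survive
    // the aggregation step. The check runs once for every row.
    static long sumVisible(List<Row> rows, Set<String> readerLabels) {
        return rows.stream()
                .filter(row -> canRead(row, readerLabels))
                .mapToLong(row -> row.amount)
                .sum();
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row("sales", 100, Set.of("finance")),
                new Row("payroll", 200, Set.of("finance", "hr")),
                new Row("public-report", 50, Set.<String>of()));

        System.out.println(sumVisible(rows, Set.of("finance")));       // 150
        System.out.println(sumVisible(rows, Set.of("finance", "hr"))); // 350
        System.out.println(sumVisible(rows, Set.<String>of()));        // 50
    }
}
```

The same total is never shown to every reader: a reader holding only the `finance` label gets a sum that silently excludes the payroll row, which is the behavior one would want when aggregated data comes from sources with different owners.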

To put data ownership and data context rules in place without killing performance, there are emerging technology solutions such as Accumulo, created by the big data community, including everyone’s favorite member, the NSA.

Since the security problem has been a hot topic for almost a decade now, there has also been research on it. When building a big data project for data aggregation and wondering about security, one should search for “data warehouse security”.

Though 70 percent of the results will be vendor pitches or complaints about RBAC, there will also be plenty of results that explain exactly how this was done before. They describe neither technologies nor tools but methodologies, and those translate more or less directly to big data.

Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
