Theodora Fragkouli

About Theodora Fragkouli

Theodora graduated from the Computer Engineering and Informatics Department at the University of Patras. She also holds a Master's degree in Economics from the National and Technical University of Athens. During her studies she was involved in a large number of projects, ranging from programming and software engineering to telecommunications, hardware design and analysis.

Big Data: What about Security?

Hadoop has had a security problem since it first appeared. Apache Knox and Cloudera Manager emerged as solutions that provide authentication and authorization for basic database management functions, and the underlying Hadoop filesystem now incorporates Unix-like permissions. The issue has not been solved, though, so the usual pattern is to “plunk the S-word after the name of a new technology and you have a ‘BOLD IDEA FOR A NEW STARTUP!!!!’”, as explained in “Trust me: Big data is a huge security risk”.

The same pattern has appeared before: SOA security, AJAX security and open source security were all cases where security startups came up.

In Hadoop, and in big data in general, the real security problem is that we may lose context when we aggregate a lot of data. Hadoop lets you store context, but checking all of that context against each piece of data is an expensive proposition.

Context matters because the question is not only how a given user gets access to a database, but also how to aggregate more data while preserving granular rights and permissions.
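
To make the idea of aggregating while preserving granular rights more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Record` type, the `readers` field and the rule that an aggregate inherits the intersection of its inputs' readers are all hypothetical and do not come from Hadoop or any specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str
    value: float
    readers: frozenset  # hypothetical: roles allowed to read this record


def aggregate(records):
    """Sum the values and carry forward the most restrictive rights.

    Assumption: a derived value may only be read by roles that were
    allowed to read every input record.
    """
    records = list(records)
    if not records:
        # An empty aggregate is readable by nobody in this sketch.
        return Record("aggregate", 0.0, frozenset())
    readers = frozenset.intersection(*(r.readers for r in records))
    total = sum(r.value for r in records)
    return Record("aggregate", total, readers)


def visible_to(records, role):
    """Return only the records that the given role may read."""
    return [r for r in records if role in r.readers]
```

Note that every aggregation step has to touch the rights of every input record, which is one way to see why checking context with each piece of data gets expensive at scale.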

To put data ownership and data context rules in place without killing performance, teams can turn to emerging technologies such as Accumulo, created by the big data community, including everyone’s favorite member, the NSA.
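
The general idea behind this kind of store is that each stored value carries its own visibility information, and a read returns only what the reader is authorized to see. The Python sketch below models that idea only; it is not Accumulo's API. The `Cell` type, the `required` field and the `scan` function are hypothetical names, and the rule that a reader must hold every required label is a simplification (Accumulo's real visibility expressions are richer than this).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    required: frozenset  # hypothetical: labels a reader must all hold


def scan(cells, authorizations):
    """Yield only the cells whose required labels the reader holds."""
    auths = frozenset(authorizations)
    for cell in cells:
        # Subset test: every required label must be among the reader's.
        if cell.required <= auths:
            yield cell
```

Because the check travels with each cell, the same table can serve readers with different rights without keeping a separate copy of the data per audience.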

Because this security problem has been a hot topic for almost a decade now, there is also research on it. When building a big data project for data aggregation and wondering about security, search for “data warehouse security”.

Although 70 percent of the results will be vendor pitches or complaints about RBAC, plenty of others explain exactly how this was done before. They describe methodologies rather than technologies or tools, and those methodologies translate more or less directly to big data.




Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy | Contact
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.