Big Data: What about Security?

Hadoop has had a security problem since it first appeared. Apache Knox and Cloudera Manager have provided authentication and authorization for basic database management functions, and the underlying Hadoop Distributed File System (HDFS) now incorporates Unix-like permissions. But the issue has not been solved, so the usual pattern follows: “plunk the S-word after the name of a new technology and you have a ‘BOLD IDEA FOR A NEW STARTUP!!!!’”, as explained in “Trust me: Big data is a huge security risk”.

The same pattern has appeared before: SOA security, AJAX security and open source security each gave rise to security startups.

In Hadoop, and in big data in general, the real security problem is that when we aggregate a lot of data we may lose its context. Hadoop allows us to store that context, but checking all of it against each piece of data is an expensive proposition.

Context matters because the question is not only how to access a database as a certain user, but also how to aggregate more data while preserving granular rights and permissions.
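To make the idea concrete, here is a minimal sketch of an aggregation that keeps per-record permissions. It is an illustration only, not how Hadoop or any particular product does it: the `Record` type, the `required_labels` field and the `aggregate_visible` function are hypothetical names invented for this example. Note that the permission check runs once per record, which is exactly the per-item cost described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical data item that carries its own access context."""
    region: str
    amount: float
    required_labels: frozenset  # labels a reader must hold to see this record


def aggregate_visible(records, user_labels):
    """Sum amounts per region, counting only records the user may read."""
    totals = {}
    for rec in records:
        # One check per record: this is where the cost of keeping context shows up.
        if rec.required_labels <= user_labels:
            totals[rec.region] = totals.get(rec.region, 0.0) + rec.amount
    return totals


records = [
    Record("EU", 100.0, frozenset({"sales"})),
    Record("EU", 250.0, frozenset({"sales", "finance"})),
    Record("US", 75.0, frozenset()),  # no labels required: readable by anyone
]

analyst_view = aggregate_visible(records, frozenset({"sales"}))
auditor_view = aggregate_visible(records, frozenset({"sales", "finance"}))
```

The two users run the same aggregation over the same data but get different totals, because each record's rights travel with it into the aggregate instead of being dropped at load time.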

To keep data ownership and data context rules in place without killing performance, new technology solutions are emerging, such as Accumulo, created by the big data community, including everyone’s favorite member, the NSA.
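Accumulo attaches a visibility expression to each stored cell, combining labels with `&` and `|`, and compares it with the authorizations of the reader. The sketch below is a simplified illustration of that idea in plain Python and does not use the Accumulo API. The `is_visible` function and its helpers are hypothetical names, and two behaviors are assumptions of this sketch: `&` binds tighter than `|` when no parentheses are given, and an empty expression is visible to everyone.

```python
import re

# A label is a run of letters, digits or underscores; operators are & | ( ).
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[&|()])")


def tokenize(expr):
    """Split a visibility expression into labels and operators."""
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expr, user_labels):
    """Return True if a reader holding user_labels satisfies the expression."""
    tokens = tokenize(expr)
    if not tokens:
        return True  # assumption of this sketch: no expression means no restriction
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()  # always parse the right side, then combine
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_term()
            result = result and rhs
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in user_labels  # a bare label: does the reader hold it?

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

For example, `is_visible("admin&(audit|finance)", {"admin", "finance"})` evaluates to `True`, while the same expression with only `{"finance"}` evaluates to `False`. Because the expression is stored next to the data, the rule survives however the cells are later combined.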

Since the security problem has been a hot topic for almost a decade now, there has also been research. When building a big data project for data aggregation and wondering about security, one should search for “datawarehouse security”.

Though 70 percent of the results will be vendor pitches or complaints about RBAC, there will also be plenty of results that explain exactly how this was done before. They describe neither technologies nor tools but methodologies, and those translate more or less directly to big data.

Theodora Fragkouli

Theodora graduated from the Computer Engineering and Informatics Department at the University of Patras. She also holds a Master's degree in Economics from the National and Technical University of Athens. During her studies she was involved in a large number of projects ranging from programming and software engineering to telecommunications, hardware design and analysis. She works as a junior Software Engineer in the telecommunications sector, where she is mainly involved in projects based on Java and Big Data technologies.