

Top 5 Picks of Google IO 2012

The Google IO 2012 developer conference concluded last week amid a lot of fanfare. I, for one, think that Google has a lot more influence on enterprise technology than it seems. Some of what enterprises see as the latest and greatest technology (such as MapReduce) was pioneered at Google a while ago. This is one of the main reasons why I followed this event closely. Over the 3 days, Google made several technology and product announcements across its product lines. Here are the top 5 picks at Techspot.

#1 Google Compute Engine

With Compute Engine, Google made its intentions clear as a cloud service provider. Google entering the Infrastructure as a Service (IaaS) market brings more innovation and cheaper compute resources, and extends its portfolio beyond PaaS. How exciting! It is interesting to see the trend reverse from PaaS offerings to IaaS offerings (Microsoft also recently started offering Linux VMs as a service). Does this mean uptake of PaaS is very low? Compute Engine offers Linux VMs running Ubuntu 12.04 and CentOS 6.2. Most of the concepts that Amazon EC2 uses, such as zones and ephemeral versus persistent disks, are present in Compute Engine. That said, this is a reasonable start for Google, but it has a long way to go before it is a threat to Amazon EC2. It is interesting to see enterprises like Best Buy listed as beta customers.

#2 Jelly Bean, Android 4.1

Jelly Bean, Android 4.1, is the newest version of the popular mobile operating system, announced at IO. Apart from several performance optimizations and various user-level features, Jelly Bean packs several interesting technology innovations. Prime among them is “Google Now”. Gartner has been talking about Context Delivery Architecture for a few years, and Google Now is probably the first mainstream execution of it. Google Now combines user location, calendar details, past searches, traffic, weather and time in interesting ways and provides useful information before you even ask.
One of the use cases shown: when you have an appointment at a certain location, it checks traffic and tells you how long it’ll take to get there. It notifies you when you should leave, so that you can reach the destination on time. There are many other interesting features, including Project Butter, the Systrace tool, peer-to-peer service discovery, cloud messaging and smart app updates.

#3 Packaged Apps & Chrome Everywhere

Google Chrome is now generally available on Android 4.1. With good HTML 5 support, mobile web applications on Chrome in Android are very powerful. Chrome is also available on iOS. Packaged apps are one of the interesting features announced for Chrome. Packaged apps have access to the Chrome API, but are written in HTML 5, JavaScript and CSS. Packaged apps are loaded locally and support offline mode. In a way these are like Adobe AIR applications, but written using standard web technologies. This is probably another category-defining feature: developing cross-platform hybrid applications that run locally. Some interesting stats on Chrome were revealed: Chrome now has 310M active users across the world.

#4 Project Glass

A lot has been written and said about Project Glass from Google. Wearable computing is about to start. Project Glass could well create a new way for us to use and interact with computers. Technically not much has been revealed about Project Glass, other than the fact that it has a camera, microphone, gyroscope and wi-fi connectivity. The keynote demo by Sergey Brin, which involved a blimp, skydivers, Project Glass and Google+ Hangouts, was epic.

#5 Nexus 7 Tablet

Technically the Nexus 7 is just an Android Jelly Bean device. There is no reason to pick it other than the fact that it is the first official tablet from Google. At a price of $199 and with a quad-core Tegra 3 architecture and a 12-core GPU, this tablet is a real powerhouse.
With wi-fi, gyroscope, GPS, accelerometer, Gorilla Glass and a front-facing camera, this is a sure mainstream device that will probably put Android tablets into millions of hands, which is another great revenue opportunity for mobile developers. Interestingly, the Nexus 7 ships with Chrome as the default browser instead of Android’s native WebKit-based browser. I guess Google is trying to avoid an IE 6 scenario here by separating the browser from the OS.

There are many more interesting announcements that didn’t make it to the top 5 list, but I found them interesting enough to mention here:

- Offline maps and offline voice recognition for Android
- Enhanced gesture support in Android
- Apps Script for Google Apps automation
- Google Docs Offline

What are your top 5 picks?

Reference: Top 5 Picks of Google IO 2012 from our JCG partner Munish K Gupta at the Tech Spot blog.

DevOps is Culture – so is a lot of other stuff

I hung out in an excellent discussion at DevopsDays driven by Spike Morelli around culture. The premise was that DevOps started as an idea around culture – around behavior patterns that lead to better software. Somewhere along the way our industry shifted this discussion into a tools discussion & now the amount of noise out there about “DevOps tools” is orders of magnitude higher than any discussion about the real reason DevOps exists – to shift culture.

I looked up the definition of “culture” – here are a few definitions:

- the quality in a person or society that arises from a concern for what is regarded as excellent in arts, letters, manners, scholarly pursuits, etc.
- that which is excellent in the arts, manners, etc.
- the behaviors and beliefs characteristic of a particular social, ethnic, or age group: the youth culture; the drug culture.

Note that culture is the manifestation of intellectual achievement. It’s the evidence and result of achievement. I think the 3rd definition is most appropriate for DevOps – what are the behaviors that are characteristic of a well integrated Development & Operations organization?

The challenge, the discussion, was how we can re-balance the scales and get the word out that this is actually about culture and that tools happen as a result of culture, not the other way around. This post begins my contribution to that effort.

The question was asked – do we all agree that culture is the most important thing when it comes to creating a successful business? The short answer is “yes”. If you wanted to hear all the if/and/but/what-if/etc discussions, you should have come to Devopsdays. For the sake of this blog post – culture is the most important factor. If you want case studies and analysis that prove that culture matters – read Jim Collins’ Good to Great.

My present company has a really excellent culture of Developer / Ops cooperation and collaboration.
I wasn’t there when it wasn’t that way (if ever) and so I can’t tell a story about how to change your organization. What I can tell you is what a healthy and thriving Dev/Ops practicing organization looks like and what I think some of the key factors in that success are. I see this as two components – there are fundamental core values that enable and support the culture, and then there are tactical things that are done to make the culture work for us. I’d like to talk about both. The culture is the result of these actions and ideas put into practice.

Background

I work for a company with a well defined set of core values. Those values set forth parameters under which the culture exists. Here’s what they are:

These values are public and they matter – they matter a lot. These might sound hokey to you – but every single one of them is held high at the company & strongly defended. Defending a list of values like this is hard sometimes. When someone doesn’t show respect to others, how do you uphold that core value? When someone’s idea of “work life balance” is different than another person, how do you support both of them? When creating your own reality means you don’t want to work for Rally anymore – what do you do?

I’m proud to say that in Rally’s case – they are generally true to the core values. Putting “Create your own reality” on a list of core values doesn’t create culture – what creates culture is having repeated examples where individuals have followed their passion & the company has supported them. This support doesn’t just mean they have permissions, it means the company uses whatever resources it can to help. Sometimes this means using your resources to help someone find another job. Sometimes this means helping them get an education they can use at another company. Usually though, it means getting them into a role where they can do their best work.
Whatever the case – Rally’s culture is to always be true to that core value and do whatever they can to support an employee in creating their own reality. This is repeated for all of the core values. By being explicit & public about these values they set the stage for what an employee can expect from Rally as a workplace. But there’s more to it – you have to make sure these core values are upheld and you have to make sure they thrive – and this is where some of the tactical parts come in.

What are the tactical things?

- Collaboration – at Rally collaboration is a requirement. Development is done almost exclusively in pairs, planning is done as groups, retrospectives are done regularly and the actions from those retrospectives are announced publicly and followed up on. Architecture decisions are reviewed by a group made up of Developers, Operations and Product.
- Self Formed Teams – teams are largely formed of individuals who have an interest in that team’s work. When we need a task force, an email will go out to the organization looking for people interested in participating & those teams self-form. This also gives anyone in the company the ability to participate in areas of the business they may never otherwise get exposure to.
- Servant Leadership – Leaders at Rally often do very similar work to everyone else – they just have the additional responsibility of enabling their teams. Decisions about how to do things don’t often come from managers, they come from the teams.
- Data Driven Decisions – Not strictly associated with a core value, I think this is one of the most important aspects of the Rally culture. There is an expectation that we establish evidence that a decision is correct. Sometimes this evidence is established before any dev work is done but sometimes this data comes from dark launching a new service or testing out some new piece of software. Either way, it’s understood that the job isn’t really done until you have data to support why a particular decision is right & have talked to the broader group about it.

There are plenty of other things here and there but you get the general idea. We talk a lot & tell each other what we’re doing, we enlist passionate individuals in areas they have interest, we embrace & seek out change and we empower individuals to drive change by working with others.

So what? What does that have to do with DevOps? Everything

2.5 years ago the company had some very serious performance & stability problems. Technical debt had caught up with them and the only real way to fix the problem was to completely change the way the company did development & prioritized their work. The good news is that they did it, but it was made possible by the fact that individuals were empowered to drive that change. Almost overnight, two teams were formed to focus on architectural issues. A council was formed to prioritize architectural work. The things we all complain about never being able to prioritize became a priority and remain a priority to a degree I’ve never experienced before at other companies. Prioritizing this work is defended and advocated by the development teams – something only possible because of the collaborative environment in which we operate.

I have been personally involved in two services that literally started out as a skeleton of an app when they went into production. The goal was to lay the groundwork to allow fast production deployments, get monitoring in place & enable visibility while the system was being developed. This was all done because the developers understand the value of these things, but they don’t know exactly how to build it – they need Ops help. Having tight Ops and Dev collaboration on these projects has made them examples of what works in our organization. These projects become examples for other teams in the company and they push the envelope on new tech.
These two projects have:

- Implemented a completely new monitoring framework that allows developers to add any metric they want to the system
- Implemented Continuous deployment
- Established an example of how and why to Dark Launch a service

I’m sure the list will continue to go on… it’s fantastic stuff.

The Rub – culture isn’t much of anything without people who embrace it. Along with a responsibility for pushing change from the bottom up in Rally comes responsibility for defending culture – or changing it. This means that when you hire people, they have to align with your core values – they have to be willing to defend that culture or the company as a whole needs to shift culture. All those core values and tactical things will not maintain a culture that the team members do not support.

Rally’s culture is what it is because everyone takes it seriously and that includes taking it seriously when there’s a problem that needs fixing. This has happened. There are core values that used to be on that list above but they aren’t anymore. At one point or another things changed and those core values were eroding other core values. This takes time to surface, it takes time to collect data to show it’s true, but when the teams start to observe this trend they have to take action. This isn’t the job of management alone – this is the job of every member of the company. When the voice begins to develop asking for change – you need a culture that allows that change to take place and for everyone to agree on the new shape things take.

That said, it also isn’t possible if management doesn’t support those same core values. Management has the same responsibility to take those core values seriously.

DevOps is our little corner of a much bigger idea

There’s a problem that we’re trying to fix – we’re trying to improve the happiness of people, the quality of software, and the general health of our industry.
Our industry is totally healthy when you look at the bottom line, but we’re looking for something more. We want a happy and healthy development organization (including Ops, because Ops is part of the Development organization), but we also want our other teams to be part of that. As Ops folks and Developers, we can clean up our side of the street – we can do better. We seek to set an example for the rest of the organization. For culture to really improve in companies it has to go beyond Dev and Ops into Executives, Product, Support, Marketing, Sales and everyone else. You ALL own quality by building a healthy substrate (culture) on top of which all else evolves. But in the end it’s about culture. It’s really only about culture for now – because when you get culture right the other problems are easy to solve. Congratulations to those of you who read this far – shoot me a note and let me know you read this far because you probably share the same passion about this that I do. Also – putting up blog posts from 32,000 feet is awesome – thanks Southwest. Reference: DevOps is Culture – so is a lot of other stuff… from our JCG partner Aaron Nichols at the Operation Bootstrap blog....

EasyMock tutorial – Getting Started

In this post, I’m going to show you what EasyMock is and how you can use it to test your Java application. For this purpose, I’m going to create a simple Portfolio application and test it using the JUnit and EasyMock libraries.

Before we begin, let’s first understand the need for EasyMock. Let’s say you are building an Android mobile application for maintaining users’ stock portfolios. Your application would use a stock market service to retrieve stock prices from a real server (such as NASDAQ). When it comes to testing your code, you wouldn’t want to hit the real stock market server to fetch the stock prices. Instead, you would like some dummy price values. So, you need to mock the stock market service so that it returns dummy values without hitting the real server.

EasyMock does exactly that – it helps you mock interfaces. You can pre-define the behavior of your mock objects and then use them in your code for testing, because you are only concerned with testing your own logic and not the external services or objects. So, it makes sense to mock the external services.

To make it clear, have a look at the code excerpt below (we’ll see the complete code in a while):

    StockMarket marketMock = EasyMock.createMock(StockMarket.class);
    EasyMock.expect(marketMock.getPrice("EBAY")).andReturn(42.00);
    EasyMock.replay(marketMock);

In the first line, we ask EasyMock to create a mock object for our StockMarket interface. In the second line, we define how this mock object should behave – i.e., when the getPrice() method is called with the parameter “EBAY”, the mock should return 42.00. Then we call the replay() method to make the mock object ready to use.

So, that pretty much sets the context for EasyMock and its usage. Let’s dive into our Portfolio application. You can download the complete source code from Github.

Portfolio application

Our Portfolio application is really simple.
It has a Stock class to represent a stock name and quantity, and a Portfolio class to hold a list of stocks. The Portfolio class has a method to calculate the total value of the portfolio. Our class uses a StockMarket (an interface) object to retrieve the stock prices. While testing our code, we will mock this StockMarket using EasyMock.

Stock.java

A very simple Plain Old Java Object (POJO) to represent a single stock.

    package com.veerasundar.easymock;

    public class Stock {

        private String name;
        private int quantity;

        public Stock(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public int getQuantity() {
            return quantity;
        }

        public void setQuantity(int quantity) {
            this.quantity = quantity;
        }
    }

StockMarket.java

An interface to represent a stock market service. It has a method that returns the stock price of the given stock name.

    package com.veerasundar.easymock;

    public interface StockMarket {
        public Double getPrice(String stockName);
    }

Portfolio.java

This object holds a list of Stock objects and a method to calculate the total value of the portfolio. It uses a StockMarket object to retrieve the stock prices. Since it is not a good practice to hard code the dependencies, we haven’t initialized the stockMarket object. We’ll inject it later using our test code.

    package com.veerasundar.easymock;

    import java.util.ArrayList;
    import java.util.List;

    public class Portfolio {

        private String name;
        private StockMarket stockMarket;
        private List<Stock> stocks = new ArrayList<Stock>();

        /*
         * this method gets the market value for each stock, sums it up and returns
         * the total value of the portfolio.
         */
        public Double getTotalValue() {
            Double value = 0.0;
            for (Stock stock : this.stocks) {
                value += (stockMarket.getPrice(stock.getName()) * stock.getQuantity());
            }
            return value;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public List<Stock> getStocks() {
            return stocks;
        }

        public void setStocks(List<Stock> stocks) {
            this.stocks = stocks;
        }

        public void addStock(Stock stock) {
            stocks.add(stock);
        }

        public StockMarket getStockMarket() {
            return stockMarket;
        }

        public void setStockMarket(StockMarket stockMarket) {
            this.stockMarket = stockMarket;
        }
    }

So, now we have coded the entire application. We are going to test the Portfolio.getTotalValue() method, because that’s where our business logic is.

Testing Portfolio application using JUnit and EasyMock

If you haven’t used JUnit before, then it is a good time to Get started with JUnit.

PortfolioTest.java

    package com.veerasundar.easymock.tests;

    import static org.junit.Assert.assertEquals;

    import org.easymock.EasyMock;
    import org.junit.Before;
    import org.junit.Test;

    import com.veerasundar.easymock.Portfolio;
    import com.veerasundar.easymock.Stock;
    import com.veerasundar.easymock.StockMarket;

    public class PortfolioTest {

        private Portfolio portfolio;
        private StockMarket marketMock;

        @Before
        public void setUp() {
            portfolio = new Portfolio();
            portfolio.setName("Veera's portfolio.");
            marketMock = EasyMock.createMock(StockMarket.class);
            portfolio.setStockMarket(marketMock);
        }

        @Test
        public void testGetTotalValue() {
            /* Set up our mock object with the expected values */
            EasyMock.expect(marketMock.getPrice("EBAY")).andReturn(42.00);
            EasyMock.replay(marketMock);

            /* Now start testing our portfolio */
            Stock ebayStock = new Stock("EBAY", 2);
            portfolio.addStock(ebayStock);
            // Compare doubles with a small tolerance
            assertEquals(84.00, portfolio.getTotalValue(), 0.001);
        }
    }

As you can see, during setUp() we create a new Portfolio object. Then we ask EasyMock to create a mock object for the StockMarket interface.
Then we inject this mock object into our portfolio object using the portfolio.setStockMarket() method. In the @Test method, we define how our mock object should behave when called, using the code below:

    EasyMock.expect(marketMock.getPrice("EBAY")).andReturn(42.00);
    EasyMock.replay(marketMock);

From here on, our mock object’s getPrice method returns 42.00 when called with EBAY. Then we create an ebayStock with a quantity of 2 and add it to our portfolio. Since we set the stock price of EBAY to 42.00, we know that the total value of our portfolio is 84.00 (i.e. 2 x 42.00). In the last line, we assert exactly that using the JUnit assertEquals() method. The above test should run successfully if we haven’t made any mistakes in the getTotalValue() code. Otherwise, the test would fail.

Conclusion

So, that’s how we use the EasyMock library to mock external services/objects and use them in our testing code. EasyMock can do much more than what I have shown in this post. I’ll probably try to cover some advanced usage scenarios in my next posts.

Reference: EasyMock tutorial – Getting Started from our JCG partner Veera Sundar at the Veera Sundar blog.
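To see what the mock in the tutorial above is standing in for, it can help to write the same test double by hand once. The following is only a sketch under stated assumptions: FixedPriceStockMarket, valueOf and HandWrittenStubSketch are hypothetical names invented for this illustration, the StockMarket interface is re-declared locally so the file is self-contained, and none of this reflects how EasyMock itself is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HandWrittenStubSketch {

    // Same shape as the StockMarket interface used in the tutorial.
    interface StockMarket {
        Double getPrice(String stockName);
    }

    // Hypothetical hand-written stand-in: returns canned prices and records calls.
    static class FixedPriceStockMarket implements StockMarket {
        private final Map<String, Double> prices = new HashMap<>();
        private final List<String> requestedNames = new ArrayList<>();

        void setPrice(String stockName, double price) {
            prices.put(stockName, price);
        }

        @Override
        public Double getPrice(String stockName) {
            requestedNames.add(stockName);
            Double price = prices.get(stockName);
            if (price == null) {
                // Refuse calls that were never set up, so mistakes surface early.
                throw new IllegalStateException("Unexpected call: getPrice(" + stockName + ")");
            }
            return price;
        }

        List<String> getRequestedNames() {
            return requestedNames;
        }
    }

    // Simplified version of the portfolio calculation, for a single holding.
    static double valueOf(StockMarket market, String stockName, int quantity) {
        return market.getPrice(stockName) * quantity;
    }

    public static void main(String[] args) {
        FixedPriceStockMarket market = new FixedPriceStockMarket();
        market.setPrice("EBAY", 42.00);
        System.out.println(valueOf(market, "EBAY", 2));
        System.out.println(market.getRequestedNames());
    }
}
```

A mocking library saves you from writing a class like this for every interface and every test, which is the convenience the expect/andReturn/replay calls provide.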

Grails Custom AuthenticationProvider

In order to tighten up security in our new Grails app I went about implementing the Spring Security Plugin. Getting it up and running with a standard username/password scenario was simple, as that is all wired up automagically by the plugin. That solved half of my problem, but we also need to support authentication with SAML, and there were no clear examples of how to do that. I’d like to share what I built in case anyone has a similar requirement. I won’t focus on the SAML specifics, but rather on how to build any custom authentication provider in Grails.

You can map a URL to a filter by extending AbstractAuthenticationProcessingFilter and registering it with Spring. Then you can provide that URL for custom authentication. In my case it looked something like this:

    import javax.servlet.http.HttpServletRequest
    import javax.servlet.http.HttpServletResponse

    import org.springframework.security.authentication.AuthenticationServiceException
    import org.springframework.security.core.Authentication
    import org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter

    class SamlAuthenticationFilter extends AbstractAuthenticationProcessingFilter {

        public SamlAuthenticationFilter() {
            super("/somecustomauth")
        }

        @Override
        Authentication attemptAuthentication(HttpServletRequest request, HttpServletResponse response) {
            if (!request.getMethod().equals("POST")) {
                throw new AuthenticationServiceException("Authentication method not supported: " + request.getMethod())
            }

            String accessToken = request.getParameter("sometoken")
            return this.getAuthenticationManager().authenticate(new SamlAuthenticationToken(accessToken))
        }
    }

The filter is then set up as a Spring bean, along with an authentication provider which I’ll discuss shortly:

    import SamlAuthenticationFilter
    import SamlAuthenticationProvider

    beans = {
        samlAuthenticationFilter(SamlAuthenticationFilter) {
            authenticationManager = ref('authenticationManager')
            sessionAuthenticationStrategy = ref('sessionAuthenticationStrategy')
            authenticationSuccessHandler = ref('authenticationSuccessHandler')
            authenticationFailureHandler = ref('authenticationFailureHandler')
            rememberMeServices = ref('rememberMeServices')
            authenticationDetailsSource = ref('authenticationDetailsSource')
        }

        samlAuthenticationProvider(SamlAuthenticationProvider) {
            sAMLAuthenticationService = ref('SAMLAuthenticationService')
            sAMLSettingsService = ref('SAMLSettingsService')
            userDetailsService = ref('userDetailsService')
            passwordEncoder = ref('passwordEncoder')
            userCache = ref('userCache')
            saltSource = ref('saltSource')
            preAuthenticationChecks = ref('preAuthenticationChecks')
            postAuthenticationChecks = ref('postAuthenticationChecks')
        }
    }

And the bean is then registered as a filter in the Bootstrap:

    import org.codehaus.groovy.grails.plugins.springsecurity.SecurityFilterPosition
    import org.codehaus.groovy.grails.plugins.springsecurity.SpringSecurityUtils

    class BootStrap {

        def init = { servletContext ->
            SpringSecurityUtils.clientRegisterFilter('samlAuthenticationFilter',
                    SecurityFilterPosition.SECURITY_CONTEXT_FILTER.order + 10)
        }

        def destroy = {
        }
    }

We also need to create the Token class that is used by the Filter and the Authentication Provider:

    import org.springframework.security.authentication.UsernamePasswordAuthenticationToken
    import org.springframework.security.core.userdetails.UserDetails

    class SamlAuthenticationToken extends UsernamePasswordAuthenticationToken {

        String token

        // Unauthenticated token, created by the filter from the request parameter
        public SamlAuthenticationToken(String token) {
            super(null, null)
            this.token = token
        }

        // Authenticated token, created by the provider once the user is known
        public SamlAuthenticationToken(UserDetails principal, String samlResponse) {
            super(principal, samlResponse, principal.getAuthorities())
        }
    }

And finally the AuthenticationProvider itself:

    import org.springframework.security.authentication.dao.DaoAuthenticationProvider
    import org.springframework.security.core.Authentication
    import sonicg.authentication.SAMLAuthenticationService

    class SamlAuthenticationProvider extends DaoAuthenticationProvider {

        @Override
        Authentication authenticate(Authentication authentication) {
            def token = (SamlAuthenticationToken) authentication

            def user = null // placeholder: define user if credentials check out

            if (user) {
                def userDetails = userDetailsService.loadUserByUsername(user.username)
                // Note: the token class shown above only declares a 'token' property,
                // so 'samlResponse' must be supplied by your own token class.
                def token1 = new SamlAuthenticationToken(userDetails, token.samlResponse)
                return token1
            } else {
                return null
            }
        }

        @Override
        public boolean supports(Class authentication) {
            return (SamlAuthenticationToken.class.isAssignableFrom(authentication))
        }
    }

The last piece of the puzzle is to tell Spring to try using this authentication provider before the other three standard ones in Config.groovy:

    grails.plugins.springsecurity.providerNames = [
        'samlAuthenticationProvider',
        'daoAuthenticationProvider',
        'anonymousAuthenticationProvider',
        'rememberMeAuthenticationProvider']

In this case it’s important that the custom provider goes first, as its token is a subclass of UsernamePasswordAuthenticationToken. If the DAO provider was first, it would try to authenticate the custom token before our provider gets a chance.

That’s it! Hopefully this proves useful to someone. It’s also just a first draft, and perhaps once the security requirements evolve I can refine the implementation and share what I’ve learned.

Reference: Custom AuthenticationProvider with Grails from our JCG partner Kali Kallin at the Kallin Nagelberg’s journey into the west blog.
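The ordering point in the Grails post above comes down to how a supports() check based on isAssignableFrom treats subclasses. Here is a small plain-Java sketch of just that effect. Everything in it is a hypothetical stand-in (PasswordToken, SamlToken, Provider, firstSupporting); it does not use the Spring Security classes, and it deliberately ignores everything else a real provider manager does, such as handling failures.

```java
import java.util.Arrays;
import java.util.List;

class ProviderOrderSketch {

    // Hypothetical stand-ins for the parent token class and the custom subclass.
    static class PasswordToken {}
    static class SamlToken extends PasswordToken {}

    // Hypothetical, heavily simplified provider: a name plus the token type it accepts.
    static class Provider {
        final String name;
        final Class<?> supportedType;

        Provider(String name, Class<?> supportedType) {
            this.name = name;
            this.supportedType = supportedType;
        }

        // True for the supported type and for every subclass of it.
        boolean supports(Class<?> tokenType) {
            return supportedType.isAssignableFrom(tokenType);
        }
    }

    // Returns the name of the first provider in the list that claims the token type.
    static String firstSupporting(List<Provider> providers, Class<?> tokenType) {
        for (Provider provider : providers) {
            if (provider.supports(tokenType)) {
                return provider.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Provider dao = new Provider("dao", PasswordToken.class);
        Provider saml = new Provider("saml", SamlToken.class);

        // With the parent-type provider first, it claims the subclass token too.
        System.out.println(firstSupporting(Arrays.asList(dao, saml), SamlToken.class));
        // With the custom provider first, it gets the first look at its own token.
        System.out.println(firstSupporting(Arrays.asList(saml, dao), SamlToken.class));
    }
}
```

Because the custom provider only accepts its own token type, putting it first does not take plain username/password tokens away from the provider that handles them.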

Concurrency – Sequential and Raw Thread

I worked on a project a while back, where the report flow was along these lines:

1. The user would request a report
2. The report request would be translated into smaller parts/sections
3. The report for each part, based on the type of the part/section, would be generated by a report generator
4. The constituent report parts would be reassembled into a final report and given back to the user

My objective is to show how I progressed from a bad implementation to a fairly good implementation.

Some of the basic building blocks that I have are best demonstrated by a unit test. This is a test helper which generates a sample report request, with constituent report request parts:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class FixtureGenerator {

        public static ReportRequest generateReportRequest() {
            List<ReportRequestPart> requestParts = new ArrayList<ReportRequestPart>();
            Map<String, String> attributes = new HashMap<String, String>();
            attributes.put("user", "user");
            Context context = new Context(attributes);

            ReportRequestPart part1 = new ReportRequestPart(Section.HEADER, context);
            ReportRequestPart part2 = new ReportRequestPart(Section.SECTION1, context);
            ReportRequestPart part3 = new ReportRequestPart(Section.SECTION2, context);
            ReportRequestPart part4 = new ReportRequestPart(Section.SECTION3, context);
            ReportRequestPart part5 = new ReportRequestPart(Section.FOOTER, context);

            requestParts.add(part1);
            requestParts.add(part2);
            requestParts.add(part3);
            requestParts.add(part4);
            requestParts.add(part5);

            ReportRequest reportRequest = new ReportRequest(requestParts);
            return reportRequest;
        }
    }

And the test for the report generation:

    // The original listing reused the name FixtureGenerator here; a separate
    // test class name is assumed. The reportGenerator and logger fields are
    // set up elsewhere and are not shown.
    public class SequentialReportGeneratorTest {

        @Test
        public void testSequentialReportGeneratorTime() {
            long startTime = System.currentTimeMillis();
            Report report = this.reportGenerator.generateReport(FixtureGenerator.generateReportRequest());
            long timeForReport = System.currentTimeMillis() - startTime;
            assertThat(report.getSectionReports().size(), is(5));
            logger.error(String.format("Sequential Report Generator : %s ms", timeForReport));
        }
    }

The component which generates a part of the report is a dummy implementation with a 2 second delay to simulate an IO-intensive call:

    public class DummyReportPartGenerator implements ReportPartGenerator {

        @Override
        public ReportPart generateReportPart(ReportRequestPart reportRequestPart) {
            try {
                // Deliberately introduce a delay
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            return new ReportPart(reportRequestPart.getSection(), "Report for " + reportRequestPart.getSection());
        }
    }

Sequential Implementation

Given this base set of classes, my first naive sequential implementation is the following:

    public class SequentialReportGenerator implements ReportGenerator {

        private ReportPartGenerator reportPartGenerator;

        @Override
        public Report generateReport(ReportRequest reportRequest) {
            List<ReportRequestPart> reportRequestParts = reportRequest.getRequestParts();
            List<ReportPart> reportSections = new ArrayList<ReportPart>();
            for (ReportRequestPart reportRequestPart : reportRequestParts) {
                reportSections.add(reportPartGenerator.generateReportPart(reportRequestPart));
            }
            return new Report(reportSections);
        }
        ......
    }

Obviously, for a report request with 5 parts in it, each part taking 2 seconds to be fulfilled, this report takes about 10 seconds to be returned to the user. It begs to be made concurrent.

Raw Thread Based Implementation

The first concurrent implementation, not good but better than sequential, is the following: a thread is spawned for every report request part, the generator waits for the report parts to be generated (using the thread.join() method), and then aggregates the pieces once all threads have finished.
    public class RawThreadBasedReportGenerator implements ReportGenerator {

        private static final Logger logger = LoggerFactory.getLogger(RawThreadBasedReportGenerator.class);

        private ReportPartGenerator reportPartGenerator;

        @Override
        public Report generateReport(ReportRequest reportRequest) {
            List<ReportRequestPart> reportRequestParts = reportRequest.getRequestParts();
            List<Thread> threads = new ArrayList<Thread>();
            List<ReportPartRequestRunnable> runnablesList = new ArrayList<ReportPartRequestRunnable>();

            // Start one thread per report request part
            for (ReportRequestPart reportRequestPart : reportRequestParts) {
                ReportPartRequestRunnable reportPartRequestRunnable =
                        new ReportPartRequestRunnable(reportRequestPart, reportPartGenerator);
                runnablesList.add(reportPartRequestRunnable);
                Thread thread = new Thread(reportPartRequestRunnable);
                threads.add(thread);
                thread.start();
            }

            // Wait for every thread to finish
            for (Thread thread : threads) {
                try {
                    thread.join();
                } catch (InterruptedException e) {
                    logger.error(e.getMessage(), e);
                }
            }

            // Collect the generated parts in request order
            List<ReportPart> reportParts = new ArrayList<ReportPart>();
            for (ReportPartRequestRunnable reportPartRequestRunnable : runnablesList) {
                reportParts.add(reportPartRequestRunnable.getReportPart());
            }

            return new Report(reportParts);
        }
        .....
    }

The danger with this approach is that a new thread is created for every report part, so in a real-world scenario, if 100 simultaneous requests come in with each request spawning 5 threads, this can potentially end up creating 500 costly threads in the VM! So thread creation has to be constrained in some way. I will go through two more approaches where threads are controlled in the next blog entry.

Reference: Concurrency – Sequential and Raw Thread from our JCG partner Biju Kunjummen at the all and sundry blog.
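As an illustration of what "constraining thread creation" in the concurrency post above can look like, here is a minimal sketch that runs the part generation on a fixed-size thread pool from java.util.concurrent. It is not necessarily one of the approaches the author covers in the follow-up post. The names BoundedPoolSketch and generateParts are assumptions made for this example, plain strings stand in for the ReportRequestPart and ReportPart types, and a short sleep stands in for the slow call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedPoolSketch {

    // Generates one "report part" per section name on the given pool and
    // returns the parts in the same order as the requested sections.
    static List<String> generateParts(List<String> sections, ExecutorService pool) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (String section : sections) {
            tasks.add(() -> {
                Thread.sleep(50); // stand-in for the slow, IO-bound call
                return "Report for " + section;
            });
        }

        List<String> parts = new ArrayList<>();
        try {
            // invokeAll blocks until all tasks are done; futures keep task order.
            for (Future<String> future : pool.invokeAll(tasks)) {
                parts.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while generating report", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A report part failed", e.getCause());
        }
        return parts;
    }

    public static void main(String[] args) {
        // The pool size caps the number of worker threads, however many requests arrive.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<String> sections = Arrays.asList("HEADER", "SECTION1", "SECTION2", "SECTION3", "FOOTER");
            System.out.println(generateParts(sections, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

The design choice here is that the pool is created once and shared, so 100 simultaneous requests queue their parts on the same small set of threads instead of each spawning 5 new ones. The right pool size depends on the workload and is not something this sketch tries to answer.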

Anti cross-site scripting (XSS) filter for Java web apps

Here is a good and simple anti cross-site scripting (XSS) filter written for Java web applications. What it basically does is remove all suspicious strings from request parameters before returning them to the application. It’s an improvement over my previous post on the topic.

You should configure it as the first filter in your chain (web.xml), and it’s generally a good idea to let it catch every request made to your site.

The implementation consists of two classes. The filter itself is quite simple: it wraps the HTTP request object in a specialized HttpServletRequestWrapper that performs the filtering.

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class XSSFilter implements Filter {

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
    }

    @Override
    public void destroy() {
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Hand the wrapped request to the rest of the chain
        chain.doFilter(new XSSRequestWrapper((HttpServletRequest) request), response);
    }
}

The wrapper overrides the getParameterValues(), getParameter() and getHeader() methods to execute the filtering before returning the desired field to the caller. The actual XSS checking and stripping is performed in the private stripXSS() method.
import java.util.regex.Pattern;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;

public class XSSRequestWrapper extends HttpServletRequestWrapper {

    public XSSRequestWrapper(HttpServletRequest servletRequest) {
        super(servletRequest);
    }

    @Override
    public String[] getParameterValues(String parameter) {
        String[] values = super.getParameterValues(parameter);

        if (values == null) {
            return null;
        }

        int count = values.length;
        String[] encodedValues = new String[count];
        for (int i = 0; i < count; i++) {
            encodedValues[i] = stripXSS(values[i]);
        }

        return encodedValues;
    }

    @Override
    public String getParameter(String parameter) {
        String value = super.getParameter(parameter);

        return stripXSS(value);
    }

    @Override
    public String getHeader(String name) {
        String value = super.getHeader(name);
        return stripXSS(value);
    }

    private String stripXSS(String value) {
        if (value != null) {
            // NOTE: It's highly recommended to use the ESAPI library and uncomment the following line to
            // avoid encoded attacks.
            // value = ESAPI.encoder().canonicalize(value);

            // Avoid null characters
            value = value.replaceAll("\0", "");

            // Avoid anything between script tags
            Pattern scriptPattern = Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE);
            value = scriptPattern.matcher(value).replaceAll("");

            // Avoid anything in a src='...' type of expression
            scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\'(.*?)\\\'", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
            value = scriptPattern.matcher(value).replaceAll("");

            scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\"(.*?)\\\"", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
            value = scriptPattern.matcher(value).replaceAll("");

            // Remove any lonesome </script> tag
            scriptPattern = Pattern.compile("</script>", Pattern.CASE_INSENSITIVE);
            value = scriptPattern.matcher(value).replaceAll("");

            // Remove any lonesome <script ...> tag
            scriptPattern = Pattern.compile("<script(.*?)>", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
            value = scriptPattern.matcher(value).replaceAll("");

            // Avoid eval(...) expressions
            scriptPattern = Pattern.compile("eval\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
            value = scriptPattern.matcher(value).replaceAll("");

            // Avoid expression(...) expressions
            scriptPattern = Pattern.compile("expression\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
            value = scriptPattern.matcher(value).replaceAll("");

            // Avoid javascript:... expressions
            scriptPattern = Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE);
            value = scriptPattern.matcher(value).replaceAll("");

            // Avoid vbscript:... expressions
            scriptPattern = Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE);
            value = scriptPattern.matcher(value).replaceAll("");

            // Avoid onload= expressions
            scriptPattern = Pattern.compile("onload(.*?)=", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
            value = scriptPattern.matcher(value).replaceAll("");
        }
        return value;
    }
}

Notice the comment about the ESAPI library; I strongly recommend you check it out and try to include it in your projects. If you want to dig deeper into the topic, I suggest you check out the OWASP page about XSS and RSnake’s XSS (Cross Site Scripting) Cheat Sheet.
Reference: Stronger anti cross-site scripting (XSS) filter for Java web apps from our JCG partner Ricardo Zuasti at Ricardo Zuasti’s blog.
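To see the effect of a few of the filter's patterns in isolation, here is a minimal standalone sketch that needs no servlet container. `StripDemo` and its `strip` method are hypothetical names, and only three of the filter's patterns are reproduced; the patterns are compiled once as constants here, which is a small change from the post's code, where they are compiled on every call.

```java
import java.util.regex.Pattern;

public class StripDemo {

    // Same expressions as in the filter, compiled once instead of per call.
    private static final Pattern SCRIPT_BLOCK =
            Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE);
    private static final Pattern JAVASCRIPT_URL =
            Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE);
    private static final Pattern ONLOAD_ATTR =
            Pattern.compile("onload(.*?)=", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);

    // Applies the three patterns in the same order the filter does.
    static String strip(String value) {
        if (value == null) {
            return null;
        }
        value = SCRIPT_BLOCK.matcher(value).replaceAll("");
        value = JAVASCRIPT_URL.matcher(value).replaceAll("");
        value = ONLOAD_ATTR.matcher(value).replaceAll("");
        return value;
    }

    public static void main(String[] args) {
        // prints: Hello world
        System.out.println(strip("Hello<script>alert(1)</script> world"));
        // prints: <a href="doEvil()">x</a>
        System.out.println(strip("<a href=\"JavaScript:doEvil()\">x</a>"));
    }
}
```

Running small inputs like these through the patterns is a quick way to check what a given expression does and does not remove before relying on it in the filter.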

The Activiti Performance Showdown

The question everybody always asks when they learn about Activiti is as old as software development itself: “How does it perform?”. Up till now, when you asked me that question, I would tell you about how Activiti minimizes database access in every way possible, how we break down the process structure into an ‘execution tree’ which allows for fast queries, or how we leverage ten years of workflow framework development knowledge. You know, trying to get around the question without answering it. We knew it was fast, because of the theoretical foundation upon which we have built it. But now we have proof: real numbers. Yes, it’s going to be a lengthy post. But trust me, it’ll be worth your time!

Disclaimer: performance benchmarks are hard. Really hard. Different machines, slightly different test setups … very small things can change the results seriously. The numbers here are only to prove that the Activiti engine has a very minimal overhead, while also integrating very easily into the Java ecosystem and offering BPMN 2.0 process execution.

The Activiti Benchmark Project

To test the process execution overhead of the Activiti engine, I created a little side project on github: https://github.com/jbarrez/activiti-benchmark

The project currently contains 9 test processes, which we’ll analyse below. The logic in the project is pretty straightforward:

A process engine is created for each test run.
Each of the processes is executed sequentially on this process engine, using a threadpool of 1 up to 10 threads.
All the processes are thrown into a bag, from which a number of random executions are drawn.
All the results are collected and an HTML report with some nice charts is generated.

To run the benchmark, simply follow the instructions on the github page to build and execute the jar.

Benchmark Results

The test machine I used for the results is my (fairly old) desktop machine: AMD Phenom II X4 940 3.0 GHz, 8 GB 800 MHz RAM and an old-skool 7200 rpm HD running Ubuntu 11.10.
The database used for the test runs on the same machine on which the tests also run. So keep in mind that in a ‘real’ server environment the results could be even better! The benchmark project I mentioned above was executed on a default Ubuntu MySQL 5 database. I just switched to the ‘large.cnf’ setting (which throws more RAM at the db and stuff like that) instead of the default config.

Each of the test processes ran 2500 times, using a threadpool going from one to ten threads. In simpleton language: 2500 process executions using just one thread, 2500 process executions using two threads, 2500 process executions using three … yeah, you get it.
Each benchmark run was done using a ‘default’ Activiti process engine. This basically means a ‘regular’ standalone Activiti engine, created in plain Java.
Each benchmark run was also done in a ‘Spring’ config. Here, the process engine was constructed by wrapping it in the factory bean, the datasource is a Spring datasource, and the transactions and connection pool are managed by Spring (I’m actually using a tweaked BoneCP connection pool).
Each benchmark run was executed with history on the default history level (i.e. ‘audit’) and without history enabled (i.e. history level ‘none’).

The processes are analyzed in detail in the sections below, but here are the full results of the test runs already:

Activiti 5.9 – MySQL – default – history enabled
Activiti 5.9 – MySQL – default – history disabled
Activiti 5.9 – MySQL – Spring – history enabled
Activiti 5.9 – MySQL – Spring – history disabled

I ran all the tests using the latest public release of Activiti, being Activiti 5.9. However, my test runs brought some potential performance fixes to the surface (I also ran the benchmark project through a profiler). It quickly became clear that most of the process execution time was actually spent cleaning up when a process ended. Basically, queries were often fired which would not be necessary if we saved some more state in our execution tree.
I sat together with Daniel Meyer from Camunda and my colleague Frederik Heremans, and they’ve managed to commit fixes for this! As such, the current trunk of Activiti, being Activiti 5.10-SNAPSHOT at the moment, is significantly faster than 5.9.

Activiti 5.10 – MySQL – default – history enabled
Activiti 5.10 – MySQL – default – history disabled
Activiti 5.10 – MySQL – Spring – history enabled
Activiti 5.10 – MySQL – Spring – history disabled

From a high-level perspective (scroll down for detailed analysis), there are a few things to note:

I had expected some difference between the default and Spring config, due to the more ‘professional’ connection pool being used. However, the results for both environments are quite alike. Sometimes the default is faster, sometimes Spring. It’s hard to really find a pattern. As such, I omitted the Spring results in the detailed analyses below.
The best average timings are most of the time found when using four threads to execute the processes. This is probably due to having a quad-core machine.
The best throughput numbers are most of the time found when using eight threads to execute the processes. I can only assume that it also has something to do with having a quad-core machine.
When the number of threads in the threadpool goes up, the throughput (processes executed / second) goes up, but it has a negative effect on the average time. Certainly with more than six or seven threads, you see this effect very clearly. This basically means that while the processes themselves take a little longer to execute, due to the multiple threads you can execute more of these ‘slower’ processes in the same amount of time.
Enabling history does have an impact. Often, enabling history will double execution time. This is logical, given that many extra records are inserted when history is on the default level (i.e. ‘audit’).

There was one last test I ran, just out of curiosity: running the best performing setting on an Oracle XE 11.2 database.
Oracle XE is a free version of the ‘real’ Oracle database. No matter how hard I tried, I couldn’t get it running decently on Ubuntu. As such, I used an old Windows XP install on that same machine. However, the OS is 32 bit, which means the system only has 3.2 of the 8 GB of RAM available. Here are the results:

Activiti 5.10 – Oracle on Windows – default – history disabled

The results speak for themselves. Oracle blows away any of the (single-threaded) results on MySQL (and they are already very fast!). However, when going multi-threaded it is far worse than any of the MySQL results. My guess is that this is due to the limitations of the XE version: only one CPU is used, only 1 GB of RAM, etc. I would really like to run these tests on a real Oracle-managed-by-a-real-DBA … Feel free to contact me if you are interested!

In the next sections, we will take a detailed look at the performance numbers of each of the test processes. An Excel sheet containing all the numbers and charts below can be downloaded.

Process 1: The bare minimum (one transaction)

The first process is not a very interesting one, business-wise at least. After starting the process, the end is immediately reached. Not very useful on its own, but its numbers teach us one essential thing: the bare overhead of the Activiti engine. Here are the average timings:

This process runs in a single transaction, which means that nothing is saved to the database when the history is disabled due to Activiti’s optimizations. With history enabled, you’ll basically get the cost for inserting one row into the historical process instance table, which is around 4.44 ms here. It is also clear that our fix for Activiti 5.10 has an enormous impact here. In the previous version, 99% of the time was spent in the cleanup check of the process. Take a look at the best result here: 0.47 ms when using 4 threads to execute 2500 runs of this process. That’s only half a millisecond!
It’s fair to say that the Activiti engine overhead is extremely small. The throughput numbers are equally impressive:

In the best case here, 8741 processes are executed. Per second. By the time you arrive here reading the post, you could have executed a few million instances of this process. You can also see that there is little difference between 4 or 8 threads here. Most of the execution time here is CPU time, and no potential collisions such as waiting for a database lock happen here. In these numbers, you can also easily see that Oracle XE doesn’t scale well with multiple threads (which is explained above). You will see the same behavior in the following results.

Process 2: The same, but a bit longer (one transaction)

This process is pretty similar to the previous one. We have again only one transaction. After the process is started, we pass through seven no-op passthrough activities before reaching the end.

Some things to note here:

The best result (again 4 threads, with history disabled) is actually better than that of the simpler previous process. But also note that the single-threaded execution is a tad slower. This means that the process in itself is a bit slower, which is logical as it has more activities. But using more threads and having more activities in the process does allow for more potential interleaving. In the previous case, the thread was barely born before it was killed again.
The difference between history enabled/disabled is bigger than for the previous process. This is logical, as more history is written here (for each activity one record in the database).
Again, Activiti 5.10 is far superior to Activiti 5.9.

The throughput numbers follow these observations: there is more opportunity to use threading here. The best result lingers around 12000 process executions per second.
Again, it demonstrates the very lightweight execution of the Activiti engine.

Process 3: Parallelism in one transaction

This process executes a parallel gateway that forks and one that joins in the same transaction. You would expect something along the lines of the previous results, but you’d be surprised:

Comparing these numbers with the previous process, you see that execution is slower. So why is this process slower, even though it has fewer activities? The reason lies with how the parallel gateway is implemented, especially the join behavior. The hard part, implementation-wise, is that you need to cope with the situation when multiple executions arrive at the join. To make sure that the behavior is atomic, we internally do some locking and fetch all child executions in the execution tree to find out whether the join activates or not. So it is quite a ‘costly’ operation, compared to the ‘regular’ activities. Do mind, we’re talking here about only 5 ms single-threaded and 3.59 ms in the best case for MySQL. Given the functionality that is required for implementing the parallel gateway, this is peanuts if you ask me. The throughput numbers:

This is the first process which actually contains some ‘logic’. In the best case above, it means 1112 processes can be executed in a second. Pretty impressive, if you ask me!

Process 4: Now we’re getting somewhere (one transaction)

This process already looks like something you’d see when modeling real business processes. We’re still running it in one database transaction though, as all the activities are automatic passthroughs. Here we also have two forks and two joins.

Take a look at the lowest number: 6.88 ms on Oracle when running with one thread. That’s freaking fast, taking into account all that is happening here. The history numbers are at least doubled here (Activiti 5.10), which makes sense because there is quite a bit of activity audit logging going on here.
You can also see that this causes a higher average time for four threads here, which is probably due to the implementation of the joining. If you know a bit about Activiti internals, you’ll understand this means there are quite a few executions in the execution tree. We have one big concurrent root, but also multiple children which are sometimes also concurrent roots. But while the average time rises, the throughput definitely benefits:

Running this process with eight threads allows you to do 411 runs of this process in a single second. There is also something peculiar here: the Oracle database performs better with more thread concurrency. This is completely contrary to all other measurements, where Oracle is always slower in that environment (see above for explanation). I assume it has something to do with the internal locking and forced update we are applying when forking/joining, which is better handled by Oracle it seems.

Process 5: Adding some Java logic (single transaction)

I added this process to see the influence of adding a Java service task in a process. In this process, the first activity generates a random value, stores it as a process variable and then goes up or down in the process depending on the random value. The chance is about 50/50 to go up or down.

The average timings are very, very good. Actually, the results are in the same range as those of process 1 and 2 above (which had no activities or only automatic passthroughs). This means that the overhead of integrating Java logic into your process is nearly non-existent (nothing is of course for free). Of course, you can still write slow code in that logic, but you can’t blame the Activiti engine for that. Throughput numbers are comparable to those of process 1 and 2: very, very high. In the best case here, more than 9000 processes are executed per second.
That indeed also means 9000 invocations of your own Java logic.

Process 6, 7 and 8: adding wait states and transactions

The previous processes showed us the bare overhead of the Activiti engine. Here, we’ll take a look at how wait states and multiple transactions influence performance. For this, I added three test processes which contain user tasks. For each user task, the engine commits the current transaction and returns the thread to the client. Since the results are pretty much comparable for these processes, we’re grouping them here. These are the processes:

Here are the average timing results, in order of the processes above. For the first process, containing just one user task:

It is clear that having wait states and multiple transactions does influence the performance. This is also logical: before, the engine could optimize by not inserting the runtime state into the database, because the process was finished in one transaction. Now, the whole state, meaning the pointers to where you currently are, needs to be saved into the database. The process could be ‘sleeping’ like this for many days, months, years. The Activiti engine no longer holds it in memory, and is free to give its full attention to other processes. If you check the results of the process with only one user task, you can see that in the best case (Oracle, single thread – the 4 threads on MySQL is pretty close) this is done in 6.27 ms. This is really fast, if you take into account we have a few inserts (the execution tree, the task), a few updates (the execution tree) and deletes (cleaning up) going on here. The second process here, with 7 user tasks:

The second chart teaches us that, logically, more transactions means more time. In the best case here the process is done in 32.12 ms. That is for seven transactions, which gives 4.6 ms for each transaction. So it is clear that average time scales linearly when adding wait states.
This of course makes sense, because transactions aren’t free. Also note that enabling history does add quite some overhead here. This is due to having the history level set to ‘audit’, which stores all the user task information in the history tables. This is also noticeable from the difference between Activiti 5.9 with history disabled and Activiti 5.10 with history enabled: this is a rare case where Activiti 5.10 with history enabled is slower than 5.9 with history disabled. But it is logical, given the volume of history stored here. And the third process teaches us how user tasks and parallel gateways interact:

The third chart doesn’t teach us much new. We have two user tasks now, and the more ‘expensive’ fork/join (see above). The average timings are as we expected. The throughput charts are as you would expect given the average timings: between 70 and 250 processes per second. Aw yeah!

Process 9: So what about scopes?

For the last process, we’ll take a look at ‘scopes’. A ‘scope’ is what we call it internally in the engine, and it has to do with variable visibility, relationships between the pointers indicating process state, event catching, etc. BPMN 2.0 has quite some cases for those scopes, for example with embedded subprocesses as shown in the process here. Basically, every subprocess can have boundary events (catching an error, a message, etc.) that are only applied to its internal activities when its scope is active. Without going into too much technical detail: to get scopes implemented in the correct way, you need some not so trivial logic. The example process here has 4 subprocesses, nested in each other. The inner process is using concurrency, which is again a scope in itself for the Activiti engine. There are also two user tasks here, so that means two transactions. So let’s see how it performs:

You can clearly see the big difference between Activiti 5.9 and 5.10.
Scopes are indeed an area where the fixes around the ‘process cleanup’ at the end have a huge benefit, as many execution objects are created and persisted to represent the many different scopes. Single-threaded performance is not so good on Activiti 5.9. Luckily, as you can see from the gap between the blue and the red bars, those scopes do allow for high concurrency. The numbers of Oracle, combined with the multi-threaded results of the 5.10 tests, do prove that scopes are now efficiently handled by the engine. The throughput charts prove that the process nicely scales with more threads, as you can see by the big gap between the red and green line in the second last block. In the best case, 64 instances of this more complex process are handled by the engine per second.

Random execution

If you have already clicked on the full reports at the beginning of the post, you have probably noticed that random execution is also tested for each environment. In this setting, 2500 process executions were done, but the process was randomly chosen. As shown in those reports, this meant that over 2500 executions, each process was executed almost the same number of times (a uniform distribution). This last chart shows the best setting (Activiti 5.10, history disabled) and how the throughput of those random process executions evolves when adding more threads:

As we’ve seen in many of the tests above, once past four threads things don’t change that much anymore. The numbers (167 processes/second) prove that in a realistic situation (i.e. multiple processes executing at the same time), the Activiti engine nicely scales up.

Conclusion

The average timing charts show two things clearly:

The Activiti engine is fast and overhead is minimal!
The difference between history enabled or disabled is definitely noticeable. Sometimes it even comes down to half the time needed. All history tests were done using the ‘audit’ level, but there is a simpler history level (‘activity’) which might be good enough for the use case.
Activiti is very flexible in history configuration, and you can tweak the history level for each process specifically. So do think about the level your process needs to have, if it needs to have history at all!

The throughput charts prove that the engine scales very well when more threads are available (i.e. any modern application server). Activiti is well designed to be used in high-throughput and high-availability (clustered) architectures.

As I said in the introduction, the numbers are what they are: just numbers. The main point I want to conclude with is that the Activiti engine is extremely lightweight. The overhead of using Activiti for automating your business processes is small. In general, if you need to automate your business processes or workflows, you want top-notch integration with any Java system and you like all of that fast and scalable … look no further!

Reference: The Activiti Performance Showdown from our JCG partner Joram Barrez at the Small steps with big feet blog.
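The measurement scheme described in the post (a fixed number of executions pushed through thread pools of increasing size, reporting both average time and throughput) can be sketched roughly as follows. This is not the code of the activiti-benchmark project: `MiniBenchmark`, `Result`, `run` and the dummy CPU-bound workload are all hypothetical, and whatever numbers it prints depend entirely on the machine it runs on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MiniBenchmark {

    // Result of one run: average time per execution (ms) and executions per second.
    static final class Result {
        final double averageMillis;
        final double throughputPerSecond;

        Result(double averageMillis, double throughputPerSecond) {
            this.averageMillis = averageMillis;
            this.throughputPerSecond = throughputPerSecond;
        }
    }

    // Executes the workload `executions` times on a pool of `threads` threads.
    static Result run(Runnable workload, int executions, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < executions; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    workload.run();
                    return System.nanoTime() - start; // duration of this single execution
                });
            }
            long wallStart = System.nanoTime();
            long totalNanos = 0;
            for (Future<Long> future : pool.invokeAll(tasks)) {
                totalNanos += future.get();
            }
            long wallNanos = System.nanoTime() - wallStart;
            // Average is taken over individual executions; throughput over wall-clock time.
            double averageMillis = totalNanos / 1e6 / executions;
            double throughput = executions / (wallNanos / 1e9);
            return new Result(averageMillis, throughput);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Dummy CPU-bound workload standing in for "start one process instance".
        Runnable workload = () -> {
            long sum = 0;
            for (int i = 0; i < 100_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable, keeps the loop from being optimized away");
            }
        };
        for (int threads = 1; threads <= 10; threads++) {
            Result r = run(workload, 2500, threads);
            System.out.printf("%2d threads: avg %.3f ms, %.0f executions/s%n",
                    threads, r.averageMillis, r.throughputPerSecond);
        }
    }
}
```

Keeping the two measurements separate is what makes the post's observation visible: with more threads, each execution may take longer on average while the number of executions completed per second still rises.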

Demeter Law

Hello, how are you?

Let us talk today about the Demeter Law. It is an object-oriented design principle that helps us lower coupling, decrease the impact of maintenance and raise the adaptability of our systems.

What is the use of those “weird” words? If you have to do any maintenance in your application, it will have a smaller impact; your classes will only know the classes they should know, and your code changes will be quicker and affect less of your system.

Only advantages in the Demeter law, right? Let us take it easy; at the end of this post we will see the disadvantage of this approach.

If you want to see another post about OO, just click the link: “Tell, do not ask!”

Take a look at the code below. What could we do to make it better?

package com;

public class Main {

    public static void main(String[] args) {
        IAddress address = new Address();
        address.setName("01");
        address.setZipCode("000001");

        IHouse house = new House();
        house.setAddress(address);

        IPerson person = new Person();
        person.setHouse(house);

        // Print the person zip code
        System.out.println(person.getHouse().getAddress().getZipCode());
    }
}

The code above will run as expected; we are coding to interfaces, and our code is well indented and well formatted. What could we do to “upgrade” our code?

The Demeter law says that a class should only talk to its immediate friends. WHAT? Let us take small steps and analyze the code above; notice that our Main class wants to print the Person ZipCode, but to do this the Main class gets to know two more classes. If you did not notice, there is coupling there.

To print the ZipCode, our Main class is going through the Person, House and finally Address classes. Why is this a bad approach? Imagine that our analyst decides to remove the Address class from the system, and the House class becomes responsible for keeping the ZipCode.

In our code it will be very easy to change; but imagine now that we had a huge system with the ZipCode printed in more than 100 lines of code.
You would have to change 100 lines of code in your system.

The Demeter law comes to help us with this kind of situation; with a little change in our Person and House classes, we can avoid this huge impact when we remove the Address class. Take a look at our new code.

package com;

public class Main {

    public static void main(String[] args) {
        IAddress address = new Address();
        address.setName("01");
        address.setZipCode("000001");

        IHouse house = new House();
        house.setAddress(address);

        IPerson person = new Person();
        person.setHouse(house);

        // Print the person zip code
        System.out.println(person.getZipCode());
    }
}

package com;

public interface IPerson {

    void setHouse(IHouse house);

    IHouse getHouse();

    String getZipCode();
}

package com;

public class Person implements IPerson {
    private IHouse house;

    @Override
    public void setHouse(IHouse house) {
        this.house = house;
    }

    @Override
    public IHouse getHouse() {
        return house;
    }

    @Override
    public String getZipCode() {
        // Delegate to the house instead of exposing it to callers
        return house.getZipCode();
    }
}

package com;

public interface IHouse {

    void setAddress(IAddress address);

    IAddress getAddress();

    String getZipCode();
}

package com;

public class House implements IHouse {

    private IAddress address;

    @Override
    public void setAddress(IAddress address) {
        this.address = address;
    }

    @Override
    public IAddress getAddress() {
        return address;
    }

    @Override
    public String getZipCode() {
        // Delegate to the address instead of exposing it to callers
        return address.getZipCode();
    }
}

package com;

public interface IAddress {

    void setName(String string);

    void setZipCode(String string);

    String getZipCode();

    String getName();
}

package com;

public class Address implements IAddress {

    private String name;
    private String zipCode;

    @Override
    public void setName(String name) {
        this.name = name;
    }

    @Override
    public void setZipCode(String zipCode) {
        this.zipCode = zipCode;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public String getZipCode() {
        return zipCode;
    }
}

Look at our new code, and think about where you would need to change it if you needed to remove the Address class.
Only the House class will be edited; the rest of our code will remain the same.

This is the greatest advantage of the Demeter law. When you have to do maintenance, the impact on your project will be small. New features will be easily adapted, code edits will be simpler, and the cost will be low. In the code that we saw today only one class would be impacted. The other classes of your system would remain the same and your system will have low coupling.

The disadvantage of this approach is an impact on system performance. You may see lower performance if you use this approach in loops (“while”, “for”, …). In this case you will have to find out which parts of your system can apply the Demeter law without hurting performance.

I believe that even with this disadvantage in the performance of some pieces of code, this approach is useful and worth using in our systems; the Demeter law could be used in almost all the code of our system.

I hope this post helps you. If you have any doubt or question, just post it.

See you soon! \o_

Reference: Demeter Law from our JCG partner Hebert Coelho at the uaiHebert blog.
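To make the maintenance argument concrete, here is a hedged sketch of the scenario discussed in the post, in which the Address class is removed and House keeps the zip code itself. It is only one possible outcome of that refactoring: the wrapper class `DemeterSketch` is hypothetical, the interfaces are nested and trimmed to what the example needs, and the `setZipCode` method on the house is an assumption about how the zip code would be supplied once Address is gone.

```java
public class DemeterSketch {

    interface IHouse {
        void setZipCode(String zipCode);

        String getZipCode();
    }

    interface IPerson {
        void setHouse(IHouse house);

        String getZipCode();
    }

    // House now stores the zip code directly; the Address class no longer exists.
    static class House implements IHouse {
        private String zipCode;

        @Override
        public void setZipCode(String zipCode) {
            this.zipCode = zipCode;
        }

        @Override
        public String getZipCode() {
            return zipCode;
        }
    }

    // Person is the same as in the refactored version: it still delegates to House.
    static class Person implements IPerson {
        private IHouse house;

        @Override
        public void setHouse(IHouse house) {
            this.house = house;
        }

        @Override
        public String getZipCode() {
            return house.getZipCode();
        }
    }

    public static void main(String[] args) {
        IHouse house = new House();
        house.setZipCode("000001");

        IPerson person = new Person();
        person.setHouse(house);

        // The call that reads the zip code is the same as before Address was removed.
        System.out.println(person.getZipCode()); // prints 000001
    }
}
```

The setup code that used to build an Address has to change, but every caller that reads the zip code through person.getZipCode() stays untouched, which is the point the post is making.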

MongoDB with Java Kickstart

NoSQL databases are becoming increasingly popular due to their scalability. When used appropriately, NoSQL databases can offer real benefits. MongoDB is one such highly scalable open-source NoSQL database, written in C++.

1. Installing MongoDB

You can install MongoDB without much trouble using the instructions given on the official MongoDB site for whichever OS you are using.

2. Starting the MongoDB server

This is quite simple. Run the mongod.exe file inside the bin folder (I am using Windows here) to start the MongoDB server. By default the server will start on port 27017 and the data will be stored in the /data/db directory, which you’ll have to create during the installation process.

3. Starting the MongoDB shell

You can start the MongoDB shell by running the mongo.exe file.

4. Creating a database with MongoDB

To create a database named ‘company’ using MongoDB, type the following in the MongoDB shell:

use company

Mind that MongoDB will not create a database until you save something inside it. Use the following command to view the available databases; it will show you that the ‘company’ database hasn’t been created yet.

show dbs;

5. Saving data in MongoDB

Use the following commands to save employee data to a collection called employees:

employee = {name : 'A', no : 1}
db.employees.save(employee)

To view the data inside the collection, use the following command:

db.employees.find();

Do it with Java :)

The following is simple Java code which does the same thing we did above. You can get the mongo-java driver from here. Just go through the code; it’s very simple, and hopefully you’ll get the idea.
    package com.eviac.blog.mongo;

    import java.net.UnknownHostException;

    import com.mongodb.BasicDBObject;
    import com.mongodb.DB;
    import com.mongodb.DBCollection;
    import com.mongodb.DBCursor;
    import com.mongodb.Mongo;
    import com.mongodb.MongoException;

    public class MongoDBClient {

        public static void main(String[] args) {
            try {
                // Connect to the MongoDB server on the default port
                Mongo mongo = new Mongo("localhost", 27017);

                // Get the database and collection (created on first insert)
                DB db = mongo.getDB("company");
                DBCollection collection = db.getCollection("employees");

                // Build and insert an employee document
                BasicDBObject employee = new BasicDBObject();
                employee.put("name", "Hannah");
                employee.put("no", 2);
                collection.insert(employee);

                // Query for the employee whose "no" field is 2
                BasicDBObject searchEmployee = new BasicDBObject();
                searchEmployee.put("no", 2);
                DBCursor cursor = collection.find(searchEmployee);

                while (cursor.hasNext()) {
                    System.out.println(cursor.next());
                }

                System.out.println("The Search Query has Executed!");
            } catch (UnknownHostException e) {
                e.printStackTrace();
            } catch (MongoException e) {
                e.printStackTrace();
            }
        }
    }

Result

    { "_id" : { "$oid" : "4fec74dc907cbe9445fd2d70"} , "name" : "Hannah" , "no" : 2}
    The Search Query has Executed!

Reference: MongoDB with Java from our JCG partner Pavithra Siriwardena at the EVIAC blog.
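The point from step 4 of the MongoDB post above, that 'use company' does not create anything until the first save, can be pictured with a toy in-memory model. This is a sketch only: it does not use the MongoDB driver, it needs no running server, and every class and method name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model, NOT the MongoDB driver: all names here are hypothetical.
final class LazyDatabaseSketch {

    // database name -> collection name -> documents
    private final Map<String, Map<String, List<Map<String, Object>>>> databases =
            new LinkedHashMap<>();
    private String current = "test";

    /** Mimics "use company": only switches the current name, creates nothing. */
    void use(String name) {
        current = name;
    }

    /** Mimics "show dbs": lists only databases that already hold data. */
    List<String> showDbs() {
        return new ArrayList<>(databases.keySet());
    }

    /** Mimics "db.<collection>.save(doc)": the first save creates the database. */
    void save(String collection, Map<String, Object> document) {
        databases.computeIfAbsent(current, k -> new LinkedHashMap<>())
                 .computeIfAbsent(collection, k -> new ArrayList<>())
                 .add(document);
    }

    /** Mimics "db.<collection>.find()": returns a copy of the stored documents. */
    List<Map<String, Object>> find(String collection) {
        Map<String, List<Map<String, Object>>> db = databases.get(current);
        if (db == null || !db.containsKey(collection)) {
            return new ArrayList<>();
        }
        return new ArrayList<>(db.get(collection));
    }

    public static void main(String[] args) {
        LazyDatabaseSketch shell = new LazyDatabaseSketch();
        shell.use("company");
        System.out.println(shell.showDbs()); // prints [] because nothing was saved yet

        Map<String, Object> employee = new LinkedHashMap<>();
        employee.put("name", "A");
        employee.put("no", 1);
        shell.save("employees", employee);
        System.out.println(shell.showDbs()); // prints [company]
    }
}
```

In this model, as in the shell session, querying a collection that was never saved to (for example find("users")) simply returns nothing, so it pays to double-check the collection name when a query comes back empty.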

Apache Bigtop – Installing Hive, HBase and Pig

In the previous post we learnt how easy it was to install Hadoop with Apache Bigtop! We know it's not just Hadoop: there are sub-projects around the table! So, let's have a look at how to install Hive, HBase and Pig in this post.

Before rowing your boat… Please follow the previous post and get ready with Hadoop installed! Follow the link for the previous post: http://femgeekz.blogspot.in/2012/06/hadoop-hangover-introduction-to-apache.html The same can also be found at DZone, the developer site: http://www.dzone.com/links/hadoop_hangover_introduction_to_apache_bigtop_and.html

All set?? Great! Head on.. Make sure all the services of Hadoop are running, namely JobTracker, SecondaryNameNode, TaskTracker, DataNode and NameNode. [standalone mode]

Hive with Bigtop:

The steps here are almost the same as installing Hive as a separate project. However, a few steps are no longer needed. The Hadoop installed in the previous post is Release 1.0.1. We had installed Hadoop with the following command:

    sudo apt-get install hadoop\*

Step 1: Installing Hive

We have installed Bigtop 0.3.0, so issuing the following command installs all the Hive components, i.e. hive, hive-metastore and hive-server. The daemon names are different in Bigtop 0.3.0.

    sudo apt-get install hive\*

After installing, the scripts must be able to create /tmp and /user/hive/warehouse, but HDFS doesn't allow these to be created during installation as it is unaware of the path to Java. So, create the directories if they do not exist and grant the execute permissions. In the Hadoop directory, i.e. /usr/lib/hadoop/:

    bin/hadoop fs -mkdir /tmp
    bin/hadoop fs -mkdir /user/hive/warehouse
    bin/hadoop fs -chmod g+x /tmp
    bin/hadoop fs -chmod g+x /user/hive/warehouse

Step 2: The alternative directories could be /var/run/hive and /var/lock/subsys

    sudo mkdir /var/run/hive
    sudo mkdir /var/lock/subsys

Step 3: Start the hive server, a daemon

    sudo /etc/init.d/hive-server start

Step 4: Running Hive

Go to the directory /usr/lib/hive and run:

    bin/hive

Step 5: Operations on Hive

[Screenshot: Hive operations]

HBase with Bigtop:

Installing HBase is similar to Hive.

Step 1: Installing HBase

    sudo apt-get install hbase\*

Step 2: Starting HMaster

    sudo service hbase-master start

Step 3: Starting HBase shell

    hbase shell

Step 4: HBase Operations

[Screenshots: HBase operations]

Pig with Bigtop:

Installing Pig is similar too.

Step 1: Installing Pig

    sudo apt-get install pig

Step 2: Moving a file to HDFS

[Screenshot: moving a file to HDFS]

Step 3: Installed Pig-0.9.2

[Screenshot: installed Pig version]

Step 4: Starting the grunt shell

    pig

Step 5: Pig Basic Operations

[Screenshots: basic Pig operations]

We saw that it is possible to install the sub-projects and work with Hadoop with no issues. Apache Bigtop has its own spark! :)

There is a release coming, BIGTOP-0.4.0, which is expected to fix the following issues: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12318889&styleName=Html&projectId=12311420
Source and binary files: http://people.apache.org/~rvs/bigtop-0.4.0-incubating-RC0
Maven staging repo: https://repository.apache.org/content/repositories/orgapachebigtop-279
Bigtop's KEYS file containing PGP keys we use to sign the release: http://svn.apache.org/repos/asf/incubator/bigtop/dist/KEYS

Let us see how to install other sub-projects in the coming posts! Until then, Happy Learning!

Reference: Hadoop Hangover: Introduction To Apache Bigtop and Installing Hive, HBase and Pig from our JCG partner Swathi V at the * Techie(S)pArK * blog.
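The Bigtop post above asks you to make sure the five Hadoop daemons are running before installing Hive, HBase or Pig. One way to check this is to look at the output of the JDK's jps tool, which prints one "pid MainClass" pair per line. The sketch below is not part of Bigtop; the class name, method name and sample process IDs are all made up for illustration, and it assumes the jps output has already been captured into a string (jps only lists the JVMs the invoking user is allowed to see, so it may need to be run with sufficient privileges).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hadoop or Bigtop.
final class HadoopDaemonCheck {

    // The five daemons named in the post.
    static final List<String> REQUIRED = Arrays.asList(
            "NameNode", "SecondaryNameNode", "DataNode", "JobTracker", "TaskTracker");

    /** Returns the required daemons that do not appear in jps-style output. */
    static List<String> missingDaemons(String jpsOutput) {
        Set<String> running = new HashSet<>();
        for (String line : jpsOutput.split("\\R")) {
            // Each useful line looks like "<pid> <MainClass>"
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                running.add(parts[1]);
            }
        }
        List<String> missing = new ArrayList<>();
        for (String daemon : REQUIRED) {
            if (!running.contains(daemon)) {
                missing.add(daemon);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up sample output; the process IDs are arbitrary.
        String sample = "2101 NameNode\n2215 DataNode\n2330 SecondaryNameNode\n"
                + "2412 JobTracker\n2999 Jps\n";
        System.out.println(missingDaemons(sample)); // prints [TaskTracker]
    }
}
```

An empty result means all five daemons were found; anything else lists what still has to be started before moving on to the Hive steps.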
Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
