Enterprise Java

Scan DynamoDB Items with DynamoDBMapper

Previously we covered how to query a DynamoDB database either using DynamoDBMapper or the low level java api.

Apart from issuing queries, DynamoDB also offers Scan functionality.
What scan does, is fetching all the Items you might have on your DynamoDB Table.
Therefore scan does not require any rules based on our partition key or your global/local secondary indexes.
What scan offers is filtering based on the items already fetched and return specific attributes from the items fetched.

The snippet below issues a scan on the Logins table by filtering items with a lower date.

public List<Login> scanLogins(Long date) {

        Map<String, String> attributeNames = new HashMap<String, String>();
        attributeNames.put("#timestamp", "timestamp");

        Map<String, AttributeValue> attributeValues = new HashMap<String, AttributeValue>();
        attributeValues.put(":from", new AttributeValue().withN(date.toString()));

        DynamoDBScanExpression dynamoDBScanExpression = new DynamoDBScanExpression()
                .withFilterExpression("#timestamp < :from")
                .withExpressionAttributeNames(attributeNames)
                .withExpressionAttributeValues(attributeValues);

        List<Login> logins = dynamoDBMapper.scan(Login.class, dynamoDBScanExpression);

        return logins;
    }

Another great feature of DynamoDBMapper is parallel scan. Parallel scan divides the scan task among multiple workers, one for each logical segment. The workers process the data in parallel and return the results.
Generally the performance of a scan request depends largely on the number of items stored in a DynamoDB table. Therefore parallel scan might lift some of the performance issues of a scan request, since you have to deal with large amounts of data.

public List<Login> scanLogins(Long date,Integer workers) {

        Map<String, String> attributeNames = new HashMap<String, String>();
        attributeNames.put("#timestamp", "timestamp");

        Map<String, AttributeValue> attributeValues = new HashMap<String, AttributeValue>();
        attributeValues.put(":from", new AttributeValue().withN(date.toString()));

        DynamoDBScanExpression dynamoDBScanExpression = new DynamoDBScanExpression()
                .withFilterExpression("#timestamp < :from")
                .withExpressionAttributeNames(attributeNames)
                .withExpressionAttributeValues(attributeValues);

        List<Login> logins = dynamoDBMapper.parallelScan(Login.class, dynamoDBScanExpression,workers);

        return logins;
    }

Before using scan to our application we have to take into consideration that scan fetches all table items. Therefore It has a high cost both on charges and performance. Also it might consume your provision capacity.
Generally it is better to stick to queries and avoid scans.

You can find full source code with unit tests on github.

Reference: Scan DynamoDB Items with DynamoDBMapper from our JCG partner Emmanouil Gkatziouras at the gkatzioura blog.

Emmanouil Gkatziouras

He is a versatile software engineer with experience in a wide variety of applications/services.He is enthusiastic about new projects, embracing new technologies, and getting to know people in the field of software.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button