Scan DynamoDB Items with DynamoDBMapper

Previously we covered how to query a DynamoDB database either using DynamoDBMapper or the low level java api.

Apart from issuing queries, DynamoDB also offers Scan functionality.
What scan does, is fetching all the Items you might have on your DynamoDB Table.
Therefore scan does not require any rules based on our partition key or your global/local secondary indexes.
What scan offers is filtering based on the items already fetched and return specific attributes from the items fetched.

The snippet below issues a scan on the Logins table by filtering items with a lower date.

    public List<Login> scanLogins(Long date) {

        Map<String, String> attributeNames = new HashMap<String, String>();
        attributeNames.put("#timestamp", "timestamp");

        Map<String, AttributeValue> attributeValues = new HashMap<String, AttributeValue>();
        attributeValues.put(":from", new AttributeValue().withN(date.toString()));

        DynamoDBScanExpression dynamoDBScanExpression = new DynamoDBScanExpression()
                .withFilterExpression("#timestamp < :from")
                .withExpressionAttributeNames(attributeNames)
                .withExpressionAttributeValues(attributeValues);

        List<Login> logins = dynamoDBMapper.scan(Login.class, dynamoDBScanExpression);

        return logins;
    }

Another great feature of DynamoDBMapper is parallel scan. Parallel scan divides the scan task among multiple workers, one for each logical segment. The workers process the data in parallel and return the results.
Generally the performance of a scan request depends largely on the number of items stored in a DynamoDB table. Therefore parallel scan might lift some of the performance issues of a scan request, since you have to deal with large amounts of data.

    public List<Login> scanLogins(Long date,Integer workers) {

        Map<String, String> attributeNames = new HashMap<String, String>();
        attributeNames.put("#timestamp", "timestamp");

        Map<String, AttributeValue> attributeValues = new HashMap<String, AttributeValue>();
        attributeValues.put(":from", new AttributeValue().withN(date.toString()));

        DynamoDBScanExpression dynamoDBScanExpression = new DynamoDBScanExpression()
                .withFilterExpression("#timestamp < :from")
                .withExpressionAttributeNames(attributeNames)
                .withExpressionAttributeValues(attributeValues);

        List<Login> logins = dynamoDBMapper.parallelScan(Login.class, dynamoDBScanExpression,workers);

        return logins;
    }

Before using scan to our application we have to take into consideration that scan fetches all table items. Therefore It has a high cost both on charges and performance. Also it might consume your provision capacity.
Generally it is better to stick to queries and avoid scans.

You can find full source code with unit tests on github.

Query DynamoDB Items with DynamoDBMapper

On a previous post we issued queries on a DynamoDB database using the low level java api.

Querying using the DynamoDBMapper is pretty easy.

Issue a query using a hash key is as simple as it gets. The best candidate for a query like this would be the Users table by searching using the email hash key.

    public User getUser(String email) {

        User user = dynamoDBMapper.load(User.class,email);
        return user;
    }

Since we use only hashkey for the Users table, our result would be limited to one.

The load function can also be used for composite keys. Therefore querying for a Logins Table Item would require a hash key and a range key.

    public Login getLogin(String email,Long date) {

        Login login =  dynamoDBMapper.load(Login.class,email,date);
        return login;
    }

Next step is to issue more complex queries using conditions. We will issue a query that will fetch the login attempts between two dates.


 public List<Login> queryLoginsBetween(String email, Long from, Long to) {

        Map<String,String> expressionAttributesNames = new HashMap<>();
        expressionAttributesNames.put("#email","email");
        expressionAttributesNames.put("#timestamp","timestamp");

        Map<String,AttributeValue> expressionAttributeValues = new HashMap<>();
        expressionAttributeValues.put(":emailValue",new AttributeValue().withS(email));
        expressionAttributeValues.put(":from",new AttributeValue().withN(Long.toString(from)));
        expressionAttributeValues.put(":to",new AttributeValue().withN(Long.toString(to)));

        DynamoDBQueryExpression<Login> queryExpression = new DynamoDBQueryExpression<Login>()
                .withKeyConditionExpression("#email = :emailValue and #timestamp BETWEEN :from AND :to ")
                .withExpressionAttributeNames(expressionAttributesNames)
                .withExpressionAttributeValues(expressionAttributeValues);

        return dynamoDBMapper.query(Login.class,queryExpression);
    }

We use DynamoDBQueryExpression, in the same manner that we used it in the low level api.
The main difference is that we do not have to handle the paging at all. DynamoDBMapper will map the DynamoDB items to objects but also it will return a “lazy-loaded” collection. It initially returns only one page of results, and then makes a service call for the next page if needed.

Last but not least querying on indexes is one of the basic actions. It is the same routine either for local or global secondary indexes.
Keep in mind that the results fetched, depend on the projection type we specified once creating the Table. In our case the projection type is for all fields.

   public Supervisor getSupervisor(String company,String factory) {

        Map<String,String> expressionAttributesNames = new HashMap<>();
        expressionAttributesNames.put("#company","company");
        expressionAttributesNames.put("#factory","factory");

        Map<String,AttributeValue> expressionAttributeValues = new HashMap<>();
        expressionAttributeValues.put(":company",new AttributeValue().withS(company));
        expressionAttributeValues.put(":factory",new AttributeValue().withS(factory));

        DynamoDBQueryExpression<Supervisor> dynamoDBQueryExpression = new DynamoDBQueryExpression<Supervisor>()
                .withIndexName("FactoryIndex")
                .withKeyConditionExpression("#company = :company and #factory = :factory ")
                .withExpressionAttributeNames(expressionAttributesNames)
                .withExpressionAttributeValues(expressionAttributeValues)
                .withConsistentRead(false);

        List<Supervisor> supervisor = dynamoDBMapper.query(Supervisor.class,dynamoDBQueryExpression);

        if(supervisor.size()>0) {
            return supervisor.get(0);
        } else {
            return null;
        }
    }

Pay extra attention to the fact that consistent read is set to false. DynamoDBQueryExpression uses by defaut consistent reads. When using a global secondary index you cannot issue a consistent read.

You can find full source code with unit tests on github.