Scan DynamoDB Items with DynamoDBMapper

Previously we covered how to query a DynamoDB database using either DynamoDBMapper or the low-level Java API.

Apart from issuing queries, DynamoDB also offers scan functionality.
A scan fetches all the items in a DynamoDB table.
Therefore a scan does not require any conditions on the partition key or on global/local secondary indexes.
What a scan does offer is filtering, applied to the items already fetched, and the option to return only specific attributes of those items.
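
To make the cost of filtering concrete, here is a minimal plain-Java sketch. It does not call DynamoDB. `ScanModel`, `Result` and `scannedCount` are hypothetical names chosen for illustration, and the "table" is just a list of timestamps. The point it tries to show is that every item is read before the filter is applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory model of a scan; it does not call DynamoDB.
final class ScanModel {

    // Result of a modeled scan: the matching items plus how many items were read.
    static final class Result {
        final List<Long> items;
        final int scannedCount;

        Result(List<Long> items, int scannedCount) {
            this.items = items;
            this.scannedCount = scannedCount;
        }
    }

    // Reads every timestamp in the "table", then applies the filter to what was read.
    static Result scan(List<Long> table, Predicate<Long> filter) {
        List<Long> matched = new ArrayList<>();
        int scanned = 0;
        for (Long timestamp : table) {
            scanned++;                      // the item is read regardless of the filter
            if (filter.test(timestamp)) {
                matched.add(timestamp);     // only matching items are returned
            }
        }
        return new Result(matched, scanned);
    }

    public static void main(String[] args) {
        List<Long> table = Arrays.asList(100L, 200L, 300L, 400L);
        Result result = scan(table, t -> t < 250L);
        System.out.println(result.items.size() + " returned, " + result.scannedCount + " scanned");
    }
}
```

In this model the number of returned items shrinks with a stricter filter, while the number of items read stays equal to the table size.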

The snippet below issues a scan on the Logins table, keeping only the items whose timestamp is lower than the given date.

    public List<Login> scanLogins(Long date) {

        Map<String, String> attributeNames = new HashMap<String, String>();
        attributeNames.put("#timestamp", "timestamp");

        Map<String, AttributeValue> attributeValues = new HashMap<String, AttributeValue>();
        attributeValues.put(":from", new AttributeValue().withN(date.toString()));

        DynamoDBScanExpression dynamoDBScanExpression = new DynamoDBScanExpression()
                .withFilterExpression("#timestamp < :from")
                .withExpressionAttributeNames(attributeNames)
                .withExpressionAttributeValues(attributeValues);

        List<Login> logins = dynamoDBMapper.scan(Login.class, dynamoDBScanExpression);

        return logins;
    }

Another great feature of DynamoDBMapper is parallel scan. A parallel scan divides the scan task among multiple workers, one for each logical segment of the table. The workers process the data in parallel and return the results.
Generally, the performance of a scan request depends largely on the number of items stored in the DynamoDB table. A parallel scan can therefore ease some of the performance problems of a scan request when you have to deal with large amounts of data.

    public List<Login> scanLogins(Long date,Integer workers) {

        Map<String, String> attributeNames = new HashMap<String, String>();
        attributeNames.put("#timestamp", "timestamp");

        Map<String, AttributeValue> attributeValues = new HashMap<String, AttributeValue>();
        attributeValues.put(":from", new AttributeValue().withN(date.toString()));

        DynamoDBScanExpression dynamoDBScanExpression = new DynamoDBScanExpression()
                .withFilterExpression("#timestamp < :from")
                .withExpressionAttributeNames(attributeNames)
                .withExpressionAttributeValues(attributeValues);

        List<Login> logins = dynamoDBMapper.parallelScan(Login.class, dynamoDBScanExpression,workers);

        return logins;
    }
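
The idea of splitting a scan into segments can be sketched in plain Java. This is only a model, not DynamoDB code. `ParallelScanModel` and `scanSegment` are hypothetical names, and the round-robin split is an assumption made for illustration, since DynamoDB decides internally which items belong to each segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model of a parallel scan over an in-memory list; it does not call DynamoDB.
final class ParallelScanModel {

    // Filters the items of one segment. Assumption: item i belongs to segment (i % totalSegments).
    static List<Long> scanSegment(List<Long> table, long from, int segment, int totalSegments) {
        List<Long> matched = new ArrayList<>();
        for (int i = segment; i < table.size(); i += totalSegments) {
            if (table.get(i) < from) {
                matched.add(table.get(i));
            }
        }
        return matched;
    }

    // Runs one worker per segment and merges the partial results.
    static List<Long> parallelScan(List<Long> table, long from, int totalSegments) {
        ExecutorService pool = Executors.newFixedThreadPool(totalSegments);
        try {
            List<Future<List<Long>>> futures = new ArrayList<>();
            for (int segment = 0; segment < totalSegments; segment++) {
                final int current = segment;
                Callable<List<Long>> task = () -> scanSegment(table, from, current, totalSegments);
                futures.add(pool.submit(task));
            }
            List<Long> merged = new ArrayList<>();
            for (Future<List<Long>> future : futures) {
                merged.addAll(future.get());    // wait for each worker, then merge
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel scan model failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> table = new ArrayList<>();
        for (long i = 1; i <= 10; i++) {
            table.add(i * 100);
        }
        System.out.println(parallelScan(table, 550L, 3));
    }
}
```

Note that the merged result follows segment order, not the original item order, so callers should not rely on ordering.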

Before using scan in an application, we have to take into consideration that a scan fetches all the items of a table. Therefore it has a high cost in both charges and performance, and it might also consume your provisioned capacity.
Generally it is better to stick to queries and avoid scans.

You can find full source code with unit tests on github.

Query DynamoDB Items with DynamoDBMapper

In a previous post we issued queries on a DynamoDB database using the low-level Java API.

Querying using the DynamoDBMapper is pretty easy.

Issuing a query using a hash key is as simple as it gets. The best candidate for a query like this would be the Users table, searching by the email hash key.

    public User getUser(String email) {

        User user = dynamoDBMapper.load(User.class,email);
        return user;
    }

Since we use only a hash key for the Users table, our result is limited to one item.

The load function can also be used for composite keys. Therefore querying for a Logins Table Item would require a hash key and a range key.

    public Login getLogin(String email,Long date) {

        Login login =  dynamoDBMapper.load(Login.class,email,date);
        return login;
    }

The next step is to issue more complex queries using conditions. We will issue a query that fetches the login attempts between two dates.

    public List<Login> queryLoginsBetween(String email, Long from, Long to) {

        Map<String,String> expressionAttributesNames = new HashMap<>();
        expressionAttributesNames.put("#email","email");
        expressionAttributesNames.put("#timestamp","timestamp");

        Map<String,AttributeValue> expressionAttributeValues = new HashMap<>();
        expressionAttributeValues.put(":emailValue",new AttributeValue().withS(email));
        expressionAttributeValues.put(":from",new AttributeValue().withN(Long.toString(from)));
        expressionAttributeValues.put(":to",new AttributeValue().withN(Long.toString(to)));

        DynamoDBQueryExpression<Login> queryExpression = new DynamoDBQueryExpression<Login>()
                .withKeyConditionExpression("#email = :emailValue and #timestamp BETWEEN :from AND :to ")
                .withExpressionAttributeNames(expressionAttributesNames)
                .withExpressionAttributeValues(expressionAttributeValues);

        return dynamoDBMapper.query(Login.class,queryExpression);
    }

We use DynamoDBQueryExpression in much the same manner as we issued queries with the low-level API.
The main difference is that we do not have to handle paging at all. DynamoDBMapper maps the DynamoDB items to objects, and it also returns a “lazy-loaded” collection: it initially returns only one page of results and makes a service call for the next page only if needed.
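
The lazy-loading behaviour can be pictured with a small plain-Java model. This is not the DynamoDBMapper implementation. `LazyPagesModel`, `fetchPage` and `pagesFetched` are hypothetical names, and unlike the mapper this sketch fetches even the first page on demand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical model of a lazily paged result; it is not the DynamoDBMapper implementation.
final class LazyPagesModel implements Iterable<Integer> {

    private final List<Integer> source;   // stands in for the remote table
    private final int pageSize;
    int pagesFetched = 0;                 // counts the simulated service calls

    LazyPagesModel(List<Integer> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    // Simulates one service call returning the page that starts at the given offset.
    private List<Integer> fetchPage(int offset) {
        pagesFetched++;
        int end = Math.min(offset + pageSize, source.size());
        return new ArrayList<>(source.subList(offset, end));
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int nextOffset = 0;
            private Iterator<Integer> page = new ArrayList<Integer>().iterator();

            @Override
            public boolean hasNext() {
                // Fetch the next page only when the current one is exhausted.
                while (!page.hasNext() && nextOffset < source.size()) {
                    List<Integer> fetched = fetchPage(nextOffset);
                    nextOffset += fetched.size();
                    page = fetched.iterator();
                }
                return page.hasNext();
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.next();
            }
        };
    }

    public static void main(String[] args) {
        LazyPagesModel results = new LazyPagesModel(Arrays.asList(1, 2, 3, 4, 5), 2);
        for (Integer item : results) {
            System.out.println("item " + item + ", pages fetched so far: " + results.pagesFetched);
        }
    }
}
```

A caller that stops iterating early never triggers the remaining page fetches, which is the practical benefit of a lazy collection.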

Last but not least, querying on indexes is one of the basic actions. The routine is the same for both local and global secondary indexes.
Keep in mind that the attributes fetched depend on the projection type we specified when creating the table. In our case the projection type covers all attributes.

    public Supervisor getSupervisor(String company,String factory) {

        Map<String,String> expressionAttributesNames = new HashMap<>();
        expressionAttributesNames.put("#company","company");
        expressionAttributesNames.put("#factory","factory");

        Map<String,AttributeValue> expressionAttributeValues = new HashMap<>();
        expressionAttributeValues.put(":company",new AttributeValue().withS(company));
        expressionAttributeValues.put(":factory",new AttributeValue().withS(factory));

        DynamoDBQueryExpression<Supervisor> dynamoDBQueryExpression = new DynamoDBQueryExpression<Supervisor>()
                .withIndexName("FactoryIndex")
                .withKeyConditionExpression("#company = :company and #factory = :factory ")
                .withExpressionAttributeNames(expressionAttributesNames)
                .withExpressionAttributeValues(expressionAttributeValues)
                .withConsistentRead(false);

        List<Supervisor> supervisor = dynamoDBMapper.query(Supervisor.class,dynamoDBQueryExpression);

        if(supervisor.size()>0) {
            return supervisor.get(0);
        } else {
            return null;
        }
    }

Pay extra attention to the fact that consistent read is set to false. DynamoDBQueryExpression uses consistent reads by default, and a global secondary index does not support consistent reads.

You can find full source code with unit tests on github.

Insert DynamoDB Items with DynamoDBMapper

In a previous post we used DynamoDBMapper in order to map DynamoDB Tables into Java objects.

When it comes to inserts, the steps are pretty much the same but more convenient. In order to insert an item, all you have to do is persist an object using the object mapper.

In our case, we will create a User repository that does a simple insert.

package com.gkatzioura.dynamodb.mapper.repository;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
import com.gkatzioura.dynamodb.mapper.entities.User;

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

/**
 * Created by gkatzioura on 9/22/16.
 */
public class UserMapperRepository {

    private DynamoDBMapper dynamoDBMapper;

    public UserMapperRepository(AmazonDynamoDB amazonDynamoDB) {
        dynamoDBMapper = new DynamoDBMapper(amazonDynamoDB);
    }

    public void insert(User user) {

        dynamoDBMapper.save(user);
    }

}

To persist we just have to create a simple object.

    @Test
    public void testInsertUser() {

        User user = new User();
        user.setRegisterDate(new Date().getTime());
        user.setFullName("John Doe");
        user.setEmail("john@doe.com");

        userMapperRepository.insert(user);
    }

Also using DynamoDBMapper we can do batch inserts or batch deletes. Therefore we will add two extra methods to the repository.

    public void insert(List<User> users) {

        dynamoDBMapper.batchWrite(users,new ArrayList<>());
    }

    public void delete(List<User> users) {
        dynamoDBMapper.batchDelete(users);
    }

Adding or deleting items in batch simply requires passing a list of objects that contain values for the defined keys.

    @Test
    public void testBatchUserInsert() {

        List<User> users = new ArrayList<>();

        for(int i=0;i<10;i++) {

            String email = emailPrefix+i+"@doe.com";
            User user = new User();
            user.setRegisterDate(new Date().getTime());
            user.setFullName("John Doe");
            user.setEmail(email);
            users.add(user);
        }

        userMapperRepository.insert(users);
    }

    @Test
    public void testBatchDelete() {

        testBatchUserInsert();

        List<User> users = new ArrayList<>();

        for(int i=0;i<10;i++) {

            String email = emailPrefix+i+"@doe.com";
            User user = new User();
            user.setRegisterDate(new Date().getTime());
            user.setFullName("John Doe");
            user.setEmail(email);
            users.add(user);
        }
        
        userMapperRepository.delete(users);
    }
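
A single BatchWriteItem call accepts at most 25 put or delete requests, so larger lists have to be split into several calls. As far as I know DynamoDBMapper takes care of this splitting for you. The sketch below only illustrates the idea in plain Java, and `BatchChunker` and `chunk` are hypothetical names, not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing how a large list could be split into batch-sized chunks.
final class BatchChunker {

    // BatchWriteItem accepts at most 25 put or delete requests per call.
    static final int MAX_BATCH_WRITE_SIZE = 25;

    // Splits the items into consecutive chunks of at most chunkSize elements.
    static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> emails = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            emails.add("john" + i + "@doe.com");
        }
        // Each chunk would correspond to one batch request.
        for (List<String> batch : chunk(emails, MAX_BATCH_WRITE_SIZE)) {
            System.out.println("batch of " + batch.size());
        }
    }
}
```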

You can find the source code on github.

Map DynamoDB Items to Objects using DynamoDBMapper

Previously we created DynamoDB Tables using Java.

For various databases, such as SQL or NoSQL databases, there is a set of tools that help to access, persist, and manage data between objects/classes and the underlying database. For example, for SQL databases we use JPA, and for Cassandra we use MappingManager.

DynamoDBMapper is a tool that enables you to access your data in various tables, perform various CRUD operations on items, and execute queries and scans against tables.

We will try to map the Users, Logins, Supervisors and Companies tables from the previous example.
Users is a simple table using the user's email as a hash key.

package com.gkatzioura.dynamodb.mapper.entities;

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

/**
 * Created by gkatzioura on 9/20/16.
 */
@DynamoDBTable(tableName="Users")
public class User {

    private String email;
    private String fullName;

    @DynamoDBHashKey(attributeName="email")
    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }

    @DynamoDBAttribute(attributeName="fullname")
    public String getFullName() {
        return fullName;
    }

    public void setFullName(String fullName) {
        this.fullName = fullName;
    }
}

However in various cases our DynamoDB Table uses a hash and a range key. The Logins table keeps track of the login attempts of a user. The email is the hash key and the timestamp the range key.

package com.gkatzioura.dynamodb.mapper.entities;

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

/**
 * Created by gkatzioura on 9/20/16.
 */
@DynamoDBTable(tableName="Logins")
public class Login {

    private String email;
    private Long timestamp;

    @DynamoDBHashKey(attributeName="email")
    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }

    @DynamoDBRangeKey(attributeName="timestamp")
    public Long getTimestamp() {
        return timestamp;
    }

    public void setTimestamp(Long timestamp) {
        this.timestamp = timestamp;
    }
}

Another popular case is tables with global secondary indexes (GSI). For example, the Supervisors table is used to retrieve a supervisor by name. However, we also use this table to retrieve all the supervisors of a specific company, or the supervisors who work at a specific factory of a company.
The supervisor name is the hash key of the table, while the company name is the hash key and the factory name is the range key of the global secondary index.

package com.gkatzioura.dynamodb.mapper.entities;

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIndexHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIndexRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

/**
 * Created by gkatzioura on 9/21/16.
 */
@DynamoDBTable(tableName="Supervisors")
public class Supervisor {

    private String name;
    private String company;
    private String factory;

    @DynamoDBHashKey(attributeName="name")
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @DynamoDBIndexHashKey(globalSecondaryIndexName = "FactoryIndex",attributeName = "company")
    public String getCompany() {
        return company;
    }

    public void setCompany(String company) {
        this.company = company;
    }

    @DynamoDBIndexRangeKey(globalSecondaryIndexName = "FactoryIndex",attributeName = "factory")
    public String getFactory() {
        return factory;
    }

    public void setFactory(String factory) {
        this.factory = factory;
    }
}
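
The relation between the table key and the index keys can be modelled with two in-memory maps. This is only an illustration in plain Java, not DynamoDB code. `SupervisorIndexModel` and its methods are hypothetical names, and the model assumes each supervisor name is inserted once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of a table with a global secondary index; not DynamoDB code.
final class SupervisorIndexModel {

    // Table: supervisor name (table hash key) mapped to {company, factory}.
    private final Map<String, String[]> byName = new HashMap<>();
    // Index: company (index hash key), then factory (index range key, kept sorted).
    // Index keys need not be unique, so each entry holds a list of supervisor names.
    private final Map<String, TreeMap<String, List<String>>> factoryIndex = new HashMap<>();

    // Assumes each name is inserted once; re-inserting a name would leave a stale index entry.
    void put(String name, String company, String factory) {
        byName.put(name, new String[] {company, factory});
        factoryIndex
                .computeIfAbsent(company, c -> new TreeMap<>())
                .computeIfAbsent(factory, f -> new ArrayList<>())
                .add(name);
    }

    // Lookup through the table key.
    String[] getByName(String name) {
        return byName.get(name);
    }

    // Lookup through the index using both the hash key and the range key.
    List<String> queryFactoryIndex(String company, String factory) {
        TreeMap<String, List<String>> factories = factoryIndex.get(company);
        if (factories == null || !factories.containsKey(factory)) {
            return new ArrayList<>();
        }
        return new ArrayList<>(factories.get(factory));
    }

    // Lookup through the index using only the hash key, ordered by factory name.
    List<String> queryCompany(String company) {
        List<String> names = new ArrayList<>();
        TreeMap<String, List<String>> factories = factoryIndex.get(company);
        if (factories != null) {
            for (List<String> perFactory : factories.values()) {
                names.addAll(perFactory);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        SupervisorIndexModel model = new SupervisorIndexModel();
        model.put("Anna", "Acme", "Berlin");
        model.put("Bob", "Acme", "Athens");
        System.out.println(model.queryCompany("Acme"));
    }
}
```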

Last but not least we can use Local Secondary Indexes. The Companies table uses the company name as a hash key and the subsidiary name as a range key. Since we want to issue queries based on a company’s CEOs a local secondary index is used with a range key based on the name of the CEO.

package com.gkatzioura.dynamodb.mapper.entities;

import com.amazonaws.services.dynamodbv2.datamodeling.*;

/**
 * Created by gkatzioura on 9/21/16.
 */
@DynamoDBTable(tableName="Companies")
public class Company {

    private String name;
    private String subsidiary;
    private String ceo;

    @DynamoDBHashKey(attributeName="name")
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @DynamoDBRangeKey(attributeName = "subsidiary")
    public String getSubsidiary() {
        return subsidiary;
    }

    public void setSubsidiary(String subsidiary) {
        this.subsidiary = subsidiary;
    }

    @DynamoDBIndexRangeKey(localSecondaryIndexName = "CeoIndex",attributeName = "ceo")
    public String getCeo() {
        return ceo;
    }

    public void setCeo(String ceo) {
        this.ceo = ceo;
    }
}

You can find the source code on github.

Update DynamoDB Items with Node.js

In a previous post we inserted items into DynamoDB using Node.js. DynamoDB also supports updating items.

We will use the Users table for the update examples.
When issuing an update you must specify the primary key of the item you want to update.

var updateName = function(email,fullName,callback) {
	
	var docClient = new AWS.DynamoDB.DocumentClient();
	
	var params = {
			TableName:"Users",
			Key: {
				email : email
			},
			UpdateExpression: "set fullname = :fullname",
		    ExpressionAttributeValues:{
		        ":fullname":fullName
		    },
		    ReturnValues:"UPDATED_NEW"
		};
	
	docClient.update(params,callback);
}

We can proceed to more advanced statements using conditional updates. Conditional updates can help us in many cases, such as handling concurrent updates. In our case we will update an item’s full name only if it starts with a certain prefix.

var updateConditionally = function(email,fullName,prefix,callback) {
	
	var docClient = new AWS.DynamoDB.DocumentClient();
	
	var params = {
			TableName:"Users",
			Key: {
				email : email
			},
			UpdateExpression: "set fullname = :fullname",
			ConditionExpression: "begins_with(fullname,:prefix)",
			ExpressionAttributeValues:{
		        ":fullname":fullName,
		        ":prefix":prefix
		    },
		    ReturnValues:"UPDATED_NEW"
		};
	
	docClient.update(params,callback);
}
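
The semantics of a conditional update are language neutral, so here is a small model of them written in plain Java. It does not call DynamoDB. `ConditionalUpdateModel` and `updateIfBeginsWith` are hypothetical names, and the model returns false where DynamoDB would instead reject the request with a ConditionalCheckFailedException.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a conditional update; it does not call DynamoDB.
final class ConditionalUpdateModel {

    private final Map<String, String> fullNameByEmail = new HashMap<>();

    void put(String email, String fullName) {
        fullNameByEmail.put(email, fullName);
    }

    String get(String email) {
        return fullNameByEmail.get(email);
    }

    // Applies the update only if the stored full name starts with the prefix.
    // Returns true when the update was applied and false when the condition failed.
    boolean updateIfBeginsWith(String email, String newFullName, String prefix) {
        String current = fullNameByEmail.get(email);
        if (current == null || !current.startsWith(prefix)) {
            return false;   // condition not met: the stored item is left untouched
        }
        fullNameByEmail.put(email, newFullName);
        return true;
    }

    public static void main(String[] args) {
        ConditionalUpdateModel model = new ConditionalUpdateModel();
        model.put("john@doe.com", "John Doe");

        // Two writers both expect the name to start with "John"; only the first one succeeds.
        System.out.println(model.updateIfBeginsWith("john@doe.com", "Jack Doe", "John"));
        System.out.println(model.updateIfBeginsWith("john@doe.com", "Jim Doe", "John"));
        System.out.println(model.get("john@doe.com"));
    }
}
```

The second writer fails because the first one already changed the value it was relying on, which is how a condition protects against lost updates.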

Another feature is atomic counters. We can issue updates to a DynamoDB item that increase the value of a numeric attribute. We will add an extra attribute called counter. We will also add another update function which, once called, updates the specified field and also increases the counter attribute. Thus the counter attribute will represent how many times an update was performed on a specific item.

var addUpdateCounter = function(email,callback) {
	
	var docClient = new AWS.DynamoDB.DocumentClient();
	
	var params = {
			TableName:"Users",
			Key: {
				email : email
			},
			UpdateExpression: "set #counter = :counter",
			ExpressionAttributeNames:{
		        "#counter":"counter"
		    },
			ExpressionAttributeValues:{
		        ":counter":0
		    },
			ReturnValues:"UPDATED_NEW"
		};
	
	docClient.update(params,callback);
}

var updateAndIncreaseCounter = function(email,fullName,callback) {

	var docClient = new AWS.DynamoDB.DocumentClient();
	
	var params = {
			TableName:"Users",
			Key: {
				email : email
			},
			UpdateExpression: "set fullname = :fullname ADD #counter :incva",
			ExpressionAttributeNames:{
		        "#counter":"counter"
		    },
			ExpressionAttributeValues:{
		        ":fullname":fullName,
		        ":incva":1
		    },
		    ReturnValues:"UPDATED_NEW"
		};
	
	docClient.update(params,callback);
}

You can find the source code on github.

Update DynamoDB Items with Java

In a previous post we inserted items into DynamoDB using Java. DynamoDB also supports updating items.

We will use the Users table for the update examples.
When issuing an update you must specify the primary key of the item you want to update.

    public void updateName(String email,String fullName) {

        UpdateItemRequest updateItemRequest = new UpdateItemRequest()
                .withTableName(TABLE_NAME)
                .addKeyEntry("email",new AttributeValue().withS(email))
                .addAttributeUpdatesEntry("fullname",
                        new AttributeValueUpdate().withValue(new AttributeValue().withS(fullName)));

        UpdateItemResult updateItemResult = amazonDynamoDB.updateItem(updateItemRequest);
    }

We can proceed to more advanced statements using conditional updates. Conditional updates can help us in many cases, such as handling concurrent updates.

We can achieve this by using plain expressions.

    public void updateConditionallyWithExpression(String email,String fullName,String prefix) {

        Map<String, AttributeValue> key = new HashMap<>();
        key.put("email", new AttributeValue().withS(email));

        Map<String, AttributeValue> attributeValues = new HashMap<>();
        attributeValues.put(":prefix", new AttributeValue().withS(prefix));
        attributeValues.put(":fullname", new AttributeValue().withS(fullName));

        UpdateItemRequest updateItemRequest = new UpdateItemRequest()
                .withTableName(TABLE_NAME)
                .withKey(key)
                .withUpdateExpression("set fullname = :fullname")
                .withConditionExpression("begins_with(fullname,:prefix)")
                .withExpressionAttributeValues(attributeValues);
        UpdateItemResult updateItemResult = amazonDynamoDB.updateItem(updateItemRequest);
    }

Or by specifying attribute entries.

    public void updateConditionallyWithAttributeEntries(String email, String fullName, String prefix){

        Map<String,AttributeValue> key = new HashMap<>();
        key.put("email",new AttributeValue().withS(email));

        UpdateItemRequest updateItemRequest = new UpdateItemRequest()
                .withTableName(TABLE_NAME)
                .withKey(key)
                .addAttributeUpdatesEntry("fullname",new AttributeValueUpdate().withValue(new AttributeValue().withS(fullName)).withAction(AttributeAction.PUT))
                .addExpectedEntry("fullname",new ExpectedAttributeValue().withValue(new AttributeValue().withS(prefix)).withComparisonOperator(ComparisonOperator.BEGINS_WITH));

        UpdateItemResult updateItemResult = amazonDynamoDB.updateItem(updateItemRequest);
    }

Another feature is atomic counters. We can issue updates to a DynamoDB item that increase the value of a numeric attribute. We will add an extra attribute called counter. We will also add another update function which, once called, updates the specified field and also increases the counter attribute. Thus the counter attribute will represent how many times an update was performed on a specific item.

    public void addUpdateCounter(String email) {

        Map<String,AttributeValue> key = new HashMap<>();
        key.put("email",new AttributeValue().withS(email));

        UpdateItemRequest updateItemRequest = new UpdateItemRequest()
                .withTableName(TABLE_NAME)
                .withKey(key)
                .addAttributeUpdatesEntry("counter",new AttributeValueUpdate().withValue(new AttributeValue().withN("0")).withAction(AttributeAction.PUT));

        UpdateItemResult updateItemResult = amazonDynamoDB.updateItem(updateItemRequest);
    }

    public void updateAndIncreaseCounter(String email,String fullname) {

        Map<String,AttributeValue> key = new HashMap<>();
        key.put("email",new AttributeValue().withS(email));

        UpdateItemRequest updateItemRequest = new UpdateItemRequest()
                .withTableName(TABLE_NAME)
                .withKey(key)
                .addAttributeUpdatesEntry("fullname",new AttributeValueUpdate().withValue(new AttributeValue().withS(fullname)).withAction(AttributeAction.PUT))
                .addAttributeUpdatesEntry("counter",new AttributeValueUpdate().withValue(new AttributeValue().withN("1")).withAction(AttributeAction.ADD));

        UpdateItemResult updateItemResult = amazonDynamoDB.updateItem(updateItemRequest);
    }
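
The behaviour of the ADD action on a number can be modelled in plain Java. This is a sketch and not DynamoDB code, and `AtomicCounterModel` is a hypothetical name. It relies on the documented behaviour that ADD treats a missing numeric attribute as 0, which suggests the initialisation step is a convenience rather than a requirement.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of an atomic counter; it does not call DynamoDB.
final class AtomicCounterModel {

    private final Map<String, Long> counterByEmail = new ConcurrentHashMap<>();

    // Mirrors the ADD action: a missing counter is treated as 0 before the increment is applied.
    // ConcurrentHashMap.merge performs the read-modify-write atomically per key.
    long add(String email, long increment) {
        return counterByEmail.merge(email, increment, Long::sum);
    }

    long get(String email) {
        return counterByEmail.getOrDefault(email, 0L);
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicCounterModel model = new AtomicCounterModel();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            // Each writer increments the same counter 1000 times.
            writers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    model.add("john@doe.com", 1);
                }
            });
            writers[i].start();
        }
        for (Thread writer : writers) {
            writer.join();
        }
        System.out.println(model.get("john@doe.com"));
    }
}
```

Because each increment is applied atomically, concurrent writers do not overwrite each other's updates, which is what distinguishes ADD from reading the value and writing it back with PUT.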

You can find the source code on github.

Scan DynamoDB Items with Node.js

In previous posts (Query DynamoDB Part 1 and Query DynamoDB Part 2) we covered how to query a DynamoDB database.

Apart from issuing queries, DynamoDB also offers scan functionality.
A scan fetches all the items in a DynamoDB table.
Therefore a scan does not require any conditions on the partition key or on global/local secondary indexes.
What a scan does offer is filtering, applied to the items already fetched, and the option to return only specific attributes of those items.

The snippet below issues a scan on the Logins table by adding filtering and selecting only the email field.

var scanLogins = function(date,callback) {

	var docClient = new AWS.DynamoDB.DocumentClient();

	var params = {
		TableName:"Logins",
		ProjectionExpression: "email",
	    FilterExpression: "#timestamp < :from",
	    ExpressionAttributeNames: {
	        "#timestamp": "timestamp",
	    },
	    ExpressionAttributeValues: {
	         ":from": date.getTime()
	    }
	};

	var items = []
	
	var scanExecute = function(callback) {
	
		docClient.scan(params,function(err,result) {

			if(err) {
				callback(err);
			} else {
				
				items = items.concat(result.Items);

				if(result.LastEvaluatedKey) {

					params.ExclusiveStartKey = result.LastEvaluatedKey;
					scanExecute(callback);				
				} else {
					callback(err,items);
				}	
			}
		});
	}
	
	scanExecute(callback);
};

Before using scan in an application, we have to take into consideration that a scan fetches all the items of a table. Therefore it has a high cost in both charges and performance, and it might also consume your provisioned capacity.
It is better to stick to queries and avoid scans.

You can find the sourcecode on github.

Scan DynamoDB Items with Java

In previous posts (Query DynamoDB Part 1 and Query DynamoDB Part 2) we covered how to query a DynamoDB database.

Apart from issuing queries, DynamoDB also offers scan functionality.
A scan fetches all the items in a DynamoDB table.
Therefore a scan does not require any conditions on the partition key or on global/local secondary indexes.
What a scan does offer is filtering, applied to the items already fetched, and the option to return only specific attributes of those items.

The snippet below issues a scan on the Logins table by adding filtering and selecting only the email field.

    public List<String> scanLogins(Date date) {

        List<String> emails = new ArrayList<>();

        Map<String, String> attributeNames = new HashMap<String, String >();
        attributeNames.put("#timestamp", "timestamp");

        Map<String, AttributeValue> attributeValues = new HashMap<String, AttributeValue>();
        attributeValues.put(":from", new AttributeValue().withN(Long.toString(date.getTime())));

        ScanRequest scanRequest = new ScanRequest()
                .withTableName(TABLE_NAME)
                .withFilterExpression("#timestamp < :from")
                .withExpressionAttributeNames(attributeNames)
                .withExpressionAttributeValues(attributeValues)
                .withProjectionExpression("email");

        Map<String,AttributeValue> lastKey = null;

        do {

            ScanResult scanResult = amazonDynamoDB.scan(scanRequest);

            List<Map<String,AttributeValue>> results = scanResult.getItems();
            results.forEach(r->emails.add(r.get("email").getS()));
            lastKey = scanResult.getLastEvaluatedKey();
            scanRequest.setExclusiveStartKey(lastKey);
        } while (lastKey!=null);

        return emails;
    }

Before using scan in an application, we have to take into consideration that a scan fetches all the items of a table. Therefore it has a high cost in both charges and performance, and it might also consume your provisioned capacity.
It is better to stick to queries and avoid scans.

You can find the sourcecode on github.

Configure Hazelcast with EC2

Hazelcast is hands down a great caching tool when it comes to a JVM-based application. If you use Amazon Web Services, Hazelcast integrates wonderfully.

The first task is to create a policy responsible for describing instances. We shall name this policy describe-instances-policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1467219263000",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

Applications that have to access Amazon resources should use either a user or a role that has the policies for the required resources attached. Using an Amazon user for your application is bad practice: managing keys becomes a maintenance headache, let alone the security issues.
Therefore we will focus on Hazelcast configuration using IAM roles.

Our role will be called my-ec2-role and will have the policy describe-instances-policy attached.

By doing so, an EC2 instance with Hazelcast is able to retrieve the private IPs of other EC2 instances and can therefore identify which instances are eligible to establish a distributed cache.

Now we can proceed to the hazelcast configuration.
We can either do a java based configuration or an xml based configuration.

Let us start with the xml configuration.

<hazelcast
        xsi:schemaLocation="http://www.hazelcast.com/schema/config https://hazelcast.com/schema/config/hazelcast-config-3.7.xsd"
        xmlns="http://www.hazelcast.com/schema/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <group>
        <name>ec2-group</name>
        <password>ec2-password</password>
    </group>
    <network>
        <join>
            <multicast enabled="false">
            </multicast>
            <tcp-ip enabled="false">
            </tcp-ip>
            <aws enabled="true">
                <!--optional, default is us-east-1 -->
                <region>eu-west-1</region>
                <iam-role>my-ec2-role</iam-role>
                <!-- optional, only instances belonging to this group will be discovered, default will try all running instances -->
                <security-group-name></security-group-name>
                <tag-key></tag-key>
                <tag-value></tag-value>
            </aws>
        </join>
    </network>
</hazelcast>

And the main class to load the xml file.

package com.gkatzioura.hazelcastec2;

import com.hazelcast.config.*;
import com.hazelcast.core.Hazelcast;

/**
 * Created by gkatzioura on 7/26/16.
 */
public class HazelCastXMLExample {

    public static void main(String args[]) {

        Config config = new ClasspathXmlConfig("hazelcast.xml");

        Hazelcast.newHazelcastInstance(config);
    }

}

Pay extra attention that multicast and tcp-ip should be disabled.
Since we specify an IAM role there is no need to provide credentials.
tag-key and tag-value represent the tags that you can add to an EC2 machine. If you specify a tag key and value, a connection will be established only with machines that have the same tag key and value.

You can leave the security group element empty. Hazelcast uses this information for instance filtering. However, you must make sure that the security group of the EC2 instance has ports 5701, 5702, and 5703 open for inbound and outbound traffic.

The java configuration follows the same rules.

package com.gkatzioura.hazelcastec2;

import com.hazelcast.aws.AWSClient;
import com.hazelcast.config.AwsConfig;
import com.hazelcast.config.Config;
import com.hazelcast.config.GroupConfig;
import com.hazelcast.config.JoinConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

/**
 * Created by gkatzioura on 7/25/16.
 */
public class HazelCastJavaExample {

    public static void main(String args[]) {

        Config config = new Config();

        GroupConfig groupConfig = new GroupConfig();
        groupConfig.setName("ec2-group");
        groupConfig.setPassword("ec2-password");

        config.setGroupConfig(groupConfig);

        JoinConfig joinConfig = config.getNetworkConfig().getJoin();
        joinConfig.getTcpIpConfig().setEnabled(false);
        joinConfig.getMulticastConfig().setEnabled(false);

        AwsConfig awsConfig = joinConfig.getAwsConfig();
        awsConfig.setIamRole("my-ec2-role");
        awsConfig.setEnabled(true);
        awsConfig.setRegion("eu-west-1");

        Hazelcast.newHazelcastInstance(config);
    }

}

After uploading your Hazelcast apps to EC2 and running them, you should see a log like the following.

Jul 26, 2016 6:34:50 PM com.hazelcast.cluster.ClusterService
INFO: [172.31.33.104]:5701 [dev] [3.5.4] 

Members [2] {
	Member [172.31.33.104]:5701 this
	Member [172.31.41.154]:5701
}

I have added a gradle file for some quick testing either with xml or java configuration.

group 'com.gkatzioura'
version '1.0-SNAPSHOT'

apply plugin: 'java'

sourceCompatibility = 1.5

repositories {
    mavenCentral()
}

apply plugin: 'idea'

dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.11'
    compile group: 'com.hazelcast', name:'hazelcast-cloud', version:'3.5.4'
}

task javaConfJar(type: Jar) {
    manifest {
        attributes 'Main-Class': 'com.gkatzioura.hazelcastec2.HazelCastJavaExample'
    }
    baseName = project.name + '-jconf'
    from { configurations.compile.collect { it.isDirectory() ? it : zipTree(it) } }
    with jar
}

task javaXMLJar(type: Jar) {
    manifest {
        attributes 'Main-Class': 'com.gkatzioura.hazelcastec2.HazelCastXMLExample'
    }
    baseName = project.name + '-xmlconf'
    from { configurations.compile.collect { it.isDirectory() ? it : zipTree(it) } }
    with jar
}

You can find the sourcecode on github.

Query DynamoDB Items with Node.js Part 2

In a previous post we had the chance to issue some basic DynamoDB query actions.

However, apart from the basic actions, the DynamoDB API provides us with some extra functionality.

Projection is a feature with select-like functionality.
You choose which attributes of a DynamoDB item shall be fetched. Keep in mind that using a projection will not have any impact on your query billing.

var getRegisterDate = function(email,callback) {
	
	var docClient = new AWS.DynamoDB.DocumentClient();
	
	var params = {
		    TableName: "Users",
		    KeyConditionExpression: "#email = :email",
		    ExpressionAttributeNames:{
		        "#email": "email"
		    },
		    ExpressionAttributeValues: {
		        ":email":email
		    },
		    ProjectionExpression: 'registerDate'
		};
	
	docClient.query(params,callback);
}

Apart from selecting the attributes, we can also specify the order according to our range key. We shall query the Logins table in descending order using ScanIndexForward.

var fetchLoginsDesc = function(email,callback) {

	var docClient = new AWS.DynamoDB.DocumentClient();

	var params = {
	    TableName:"Logins",
	    KeyConditionExpression:"#email = :emailValue",
	    ExpressionAttributeNames: {
	    	"#email":"email"
	    },
	    ExpressionAttributeValues: {
	    	":emailValue":email
	    },
	    ScanIndexForward: false
	};
	
	docClient.query(params,callback);
}

A common functionality of databases is counting the items persisted in a collection. In our case we want to count the login occurrences of a specific user. However, pay extra attention: the count functionality does nothing more than count the items fetched, so it will cost you as much as fetching the items themselves.

var countLogins = function(email,callback) {

	var docClient = new AWS.DynamoDB.DocumentClient();

	var params = {
	    TableName:"Logins",
	    KeyConditionExpression:"#email = :emailValue",
	    ExpressionAttributeNames: {
	    	"#email":"email"
	    },
	    ExpressionAttributeValues: {
	    	":emailValue":email
	    },
	    Select:'COUNT'
	};
	
	docClient.query(params,callback);
}
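
The billing remarks above can be turned into a small worked calculation. The sketch below is plain Java and only a model. `ReadCostModel` and `queryReadUnits` are hypothetical names, the item sizes are made up, and the model ignores any minimum charge per request. It relies on the documented rule that a query adds up the sizes of all the items it reads, rounds the total up to the next 4 KB, and charges one read capacity unit per 4 KB for strongly consistent reads or half of that for eventually consistent reads.

```java
// Hypothetical cost model for a query; it does not call DynamoDB and ignores per-request minimums.
final class ReadCostModel {

    static final int BLOCK_BYTES = 4 * 1024;

    // Read capacity units consumed by a query that reads the given items.
    // Neither a projection nor Select COUNT changes the input: whole items are still read.
    static double queryReadUnits(long[] itemSizesInBytes, boolean consistentRead) {
        long total = 0;
        for (long size : itemSizesInBytes) {
            total += size;
        }
        long blocks = (total + BLOCK_BYTES - 1) / BLOCK_BYTES;   // round up to whole 4 KB blocks
        return consistentRead ? blocks : blocks / 2.0;
    }

    public static void main(String[] args) {
        long[] logins = new long[10];
        java.util.Arrays.fill(logins, 1000L);   // ten login items of 1000 bytes each (made-up sizes)

        // 10 000 bytes round up to three 4 KB blocks.
        System.out.println(queryReadUnits(logins, true));
        System.out.println(queryReadUnits(logins, false));
    }
}
```

Under this model, counting the ten logins costs exactly as much as fetching them, since the same items are read either way.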

Another feature of DynamoDB is getting items in batches, even if they belong to different tables. This is really helpful in cases where data that belong to a specific context are spread across different tables. Every get item is handled and charged as a DynamoDB read action. In a batch get item request all table keys must be specified, since the purpose of every entry in a BatchGetItem request is to fetch a single item.
It is important to know that you can fetch up to 16 MB of data and up to 100 items per BatchGetItem request.

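var getMultipleInformation = function(email,name,callback) {

	// Low-level client: the keys below use typed attribute values such as { "S": ... }.
	var dynamodb = new AWS.DynamoDB();

	var params = {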
			"RequestItems" : {
			    "Users": {
			      "Keys" : [
			        {"email" : { "S" : email }}
			      ]
			    },
			    "Supervisors": {
				   "Keys" : [
					{"name" : { "S" : name }}
				  ]
			    }
			  }
			};
	
	dynamodb.batchGetItem(params,callback);
};

You can find the source code on github.