Posts tagged ‘simpledb’

Testing Cloudfork AWS SimpleDB based classes

September 20, 2009

The Cloudfork framework includes an alternate implementation of CFSimpleDB that stores all items in memory. The CFSimpleDBEmulator was initially created to support unit testing of the ActiveItem framework. With the exception of some query constructs, it implements the complete API and is therefore suitable for unit testing your own applications as well.

| emulator domain item |
emulator := CFSimpleDBEmulator new.
domain := (emulator createDomain: 'myapp.development') result. 
item := CFSimpleDBItem newNamed: 'user.dennis'.
item valueAt: 'birthday' put: '20060916'; valueAt: 'hobby' put: 'soccer'.
domain putItem: item.

After running the tests, you can inspect the emulator to see which items have been stored in which domains and what attributes they have. If you keep the emulator in a class variable, the data stays around for development too.

Because the ActiveItem framework is built on top of SimpleDB, the same emulator class can be used to unit test those applications. ActiveItem uses a globally shared CFSimpleDB instance, so you only need to replace that with an emulated instance.

CFActiveItem activateWithSimpleDB: CFSimpleDBEmulator new.
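
In a test case this replacement could go into setUp, so that every test starts with an empty in-memory store. This is only a sketch: MyAppTest is a hypothetical TestCase subclass, and the only Cloudfork message used is the one shown above.

```smalltalk
MyAppTest>>setUp
    "Hypothetical test class. Swap the globally shared SimpleDB instance
     for a fresh emulator before each test, so no test touches AWS."
    super setUp.
    CFActiveItem activateWithSimpleDB: CFSimpleDBEmulator new
```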

Using the CFActiveItemSerializer you can even dump the items to the local file system for development convenience.

HTTP Clients for Squeak

May 25, 2009

Cloudfork-AWS makes the Amazon Web Services (AWS) S3, SQS and SimpleDB easily accessible from Smalltalk. All the communication between the Smalltalk image and AWS is done via HTTP, so an HTTP client is an important requirement for Cloudfork-AWS.

Cloudfork-AWS needs more than simple HTTP GET and POST requests. The following features are also needed:

  • Setting custom request headers – S3 uses custom headers for authentication and for attaching meta-data to S3 objects, so we need to be able to set these headers. This feature is also required for range requests, which let you download part of an S3 object instead of the entire object.
  • Access to the response headers – So we can read the meta-data of S3 objects.
  • Support for PUT, HEAD and DELETE requests – Also required for S3. PUT is required for storing objects and creating buckets. DELETE is required for removing objects and HEAD for getting the object meta-data without downloading the object itself.
  • HTTPS support – The AWS services can be accessed via plain HTTP or via secure HTTPS, and the choice is up to the client. However, the release notes of the latest SimpleDB releases mention that HTTP support will be deprecated and that future versions will require HTTPS.
  • HTTP/1.1 support – Not a must-have feature, but version 1.1 requests can be more efficient than version 1.0 requests because of the keep-alive feature of version 1.1. With this feature socket connections can be reused between requests.
  • Streaming uploads and downloads – Also not a must-have feature for most use cases. It only matters when large S3 objects need to be handled.
  • Proxy support – Not a requirement of any of the AWS services, but a feature that is often required by client configurations.

Note that most of these features are only required for S3 and not for SQS and SimpleDB. SQS and SimpleDB only use GET and POST requests, and the authentication is done in the URL and not through HTTP header fields. The HTTP responses of SQS and SimpleDB always contain XML, and the maximum size is about 8KB for SQS and 1MB for a SimpleDB result set, so streaming support is not required.

As far as I know there are three HTTP clients available for Squeak:

  • The HTTPSocket class – This class is part of the Network-Protocols package and is part of the standard images of the latest Squeak and Pharo versions.
  • SWHTTPClient – This is an extensive HTTP client library. It was originally developed for Dolphin Smalltalk and was ported to Squeak. The latest release is not fully compatible with the latest Squeak release. There are a number of class extension collisions.
  • CurlPlugin – This is a Squeak plugin that uses the libcurl C library. libcurl is a well-known and powerful open source “URL transfer library” with support for HTTP, FTP and many other protocols.


HTTPSocket

This is a very simple implementation of an HTTP client in a single class. HTTP GET and POST requests are supported, the headers are accessible, and simple proxy configurations are supported. HTTP version 1.1 and HTTPS are not supported.

The current version of Cloudfork-AWS does not work with HTTPSocket as an HTTP client. With the provided functionality it should be possible to support the SQS and SimpleDB APIs, but when I use HTTPSocket I get an AWS error telling me that the calculated signature is wrong. I think this is because HTTPSocket always adds the port number to the host header field. Cloudfork doesn’t do this when it calculates the signature, so you get a mismatch. It is on my todo list to fix this.
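
To make the suspected mismatch concrete, here is a minimal workspace sketch. It assumes a string to sign in which the host header value is one of the signed lines (as in AWS Signature Version 2); the stringToSign and stripDefaultPort blocks are hypothetical helpers for illustration, not Cloudfork code.

```smalltalk
| lf stringToSign withPort withoutPort stripDefaultPort |
lf := String with: Character lf.

"Hypothetical helper: verb, host, path and query string, one per line."
stringToSign := [:host | 'GET', lf, host, lf, '/', lf, 'Action=ListDomains'].

withPort := stringToSign value: 'sdb.amazonaws.com:80'.
withoutPort := stringToSign value: 'sdb.amazonaws.com'.

"The client signs one string while the server rebuilds the other, so the signatures differ."
withPort = withoutPort.

"One possible fix: drop an explicit default port before signing."
stripDefaultPort := [:host |
    (host endsWith: ':80')
        ifTrue: [host copyFrom: 1 to: host size - 3]
        ifFalse: [host]].
stripDefaultPort value: 'sdb.amazonaws.com:80'.
```

Either side can be adjusted: leave the port out of the host header, or include it in the signed string. What matters is that both ends sign the same host value.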


SWHTTPClient

SWHTTPClient is a full-featured HTTP client library. It supports HTTP/1.1, access to the header fields and the PUT, HEAD and DELETE methods. Streaming uploads and downloads are also possible. The one thing that is not supported, or that I couldn’t get working, is HTTPS. Perhaps it’s possible to get this working by plugging in the Cryptography package, but I have no idea how.

Another issue is that SWHTTPClient is not fully compatible with the latest Squeak and Pharo releases. The package contains some class extensions that override existing methods with different behavior, for example the String>>subStrings: method.

Cloudfork-AWS can use SWHTTPClient, and all AWS features work except HTTPS. I have fixed all the incompatibilities I bumped into. The patched version of SWHTTPClient is available from the Cloudfork project page on SqueakSource.


CurlPlugin

The installation of this library is a bit more work. You need to place the correct binaries for your platform in the Squeak installation directory and load the CurlPlugin package from SqueakSource. When you load the package you may get a warning that the class CurlPlugin cannot be loaded. This is no problem: you can still use the plugin through the Curl class. The CurlPlugin class is only needed if you want to create a new version of the plugin or support a new platform.

The libcurl library that the CurlPlugin uses supports all the HTTP features we need and many more. It is one of the best HTTP client libraries around, and it’s open source. It has an optional integration with OpenSSL, which provides the functions required for HTTPS.

The current version of the CurlPlugin doesn’t expose all the features of libcurl. Currently HEAD and DELETE requests are not supported. It is also not yet possible to set the header fields for a request. The other methods work very well and HTTPS also works fine.


For the SimpleDB and SQS services the CurlPlugin is the best HTTP client. All the required features are there and the performance is very good. SimpleDB and SQS also work with the SWHTTPClient, only without HTTPS support. If the Curl class is present in your image, Cloudfork-AWS will use this class for all SimpleDB and SQS service calls; otherwise the SWHTTPClient is used.

The current CurlPlugin doesn’t support all the features required by the S3 service. For this reason the Cloudfork S3 functionality requires the SWHTTPClient.
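
The selection rule described above can be sketched in a workspace like this. It only illustrates the rule and is not the actual Cloudfork code; the symbols merely name the client that would be chosen.

```smalltalk
| curlClass clientForSimpleDBAndSQS clientForS3 |
"Look up the Curl class without failing when the plugin package is not loaded."
curlClass := Smalltalk at: #Curl ifAbsent: [nil].

"SimpleDB and SQS prefer the CurlPlugin when it is available."
clientForSimpleDBAndSQS := curlClass isNil
    ifTrue: [#SWHTTPClient]
    ifFalse: [#Curl].

"S3 needs custom headers, HEAD and DELETE, so it always uses SWHTTPClient."
clientForS3 := #SWHTTPClient.
```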

Future work

I think the CurlPlugin has the potential to become a very good HTTP client library for Squeak and Pharo. It will also be relatively easy to maintain this library because all of the complex work of supporting the different protocols is implemented in libcurl. This C library has a very large community and is well maintained. I will try to extend the plugin and add the missing features.

I will also try to make Cloudfork-AWS compatible with HTTPSocket. This will not be the best performing solution but it can be an easy starting point.

Composition relations in Cloudfork-ActiveItem

April 20, 2009

In UML, the composition relation between objects is a special association that is used to model a “private-container” relationship. The typical classroom example is the Car object having 4 Wheel objects. Although you can replace wheels on a car, one particular Wheel object is never shared with other Car objects.

In Amazon SimpleDB there is no concept of relations; it is a simple store of items having attributes (key-value pairs). The Cloudfork-ActiveItem framework can map relations to foreign-key-like attributes, but that should be used with care. Because SimpleDB is not a relational database, operations such as joins are simply not possible. The composition relation, however, fits much better in the SimpleDB storage model: a SimpleDB item is meant to be a container of information.

To illustrate how ActiveItem supports this design construct, I will give an example that models multiple-choice questions for an exam training application. A Question is a composition of 4 Choices; one of them is the correct answer to that question.

Question class>>describe: aQuestion
    aQuestion
        hasString: #code;
        hasText: #text;
        ownsMany: #choices

Choice class>>describe: aChoice
    aChoice
        hasText: #text;
        hasBoolean: #isAnswer

When saving a Question, ActiveItem will create one SimpleDB item with both the attributes of the Question and the attributes of each Choice:

code -> '010-0001'
text -> 'What language is best for writing Web applications?'
choices.1.text -> 'Java'
choices.1.isAnswer -> 'false'
choices.2.text -> 'Ruby'
choices.2.isAnswer -> 'false'
choices.3.text -> 'PHP'
choices.3.isAnswer -> 'false'
choices.4.text -> 'Smalltalk'
choices.4.isAnswer -> 'true'

By supporting composition, class Choice can be a normal ActiveItem subclass with its own attribute description. However, as you can see in the example, Choice objects do not need an id (actually the collection index is the id).
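
The flattening itself is easy to picture with plain collections. The following workspace sketch is not ActiveItem code. It only rebuilds the attribute names from the example above, using associations as stand-ins for Choice objects, and it assumes the Squeak/Pharo message withIndexDo:.

```smalltalk
| choices attributes |
"Each association stands in for a Choice: text -> isAnswer."
choices := OrderedCollection new.
choices
    add: 'Java' -> false;
    add: 'Ruby' -> false;
    add: 'PHP' -> false;
    add: 'Smalltalk' -> true.

attributes := Dictionary new.
attributes at: 'code' put: '010-0001'.

"The collection index takes the place of an id in the attribute name."
choices withIndexDo: [:choice :index | | prefix |
    prefix := 'choices.', index printString, '.'.
    attributes at: prefix, 'text' put: choice key.
    attributes at: prefix, 'isAnswer' put: choice value printString].

attributes at: 'choices.4.text'.
```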

Because of the limit on the number of attributes per item (currently 256), this composition solution is not suitable for arbitrarily large collections. If you expect large collections in your model, I suggest you use the normal hasMany: method, which maps associations using “foreign-key” attributes.
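
As a rough back-of-the-envelope check for the Question example, assuming every attribute value is stored as exactly one SimpleDB attribute and using the limit of 256 mentioned above:

```smalltalk
| attributeLimit ownAttributes attributesPerChoice maxChoices |
attributeLimit := 256.      "per item, as stated above"
ownAttributes := 2.         "code and text of the Question"
attributesPerChoice := 2.   "text and isAnswer of each Choice"

"Integer division: how many owned Choices fit into one item."
maxChoices := (attributeLimit - ownAttributes) // attributesPerChoice.
```

For this model the ceiling is 127 choices per question, which is plenty for multiple choice but shows how quickly a richer owned object would use up the budget.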

Problems with Daylight saving time in VA Smalltalk

March 29, 2009

All the requests that Cloudfork-AWS sends to the Amazon web services contain the current date and time in Coordinated Universal Time (UTC). If this timestamp differs from the current time by more than the allowed skew (about 15 minutes for S3), you get an error. For example the S3 error: RequestTimeTooSkewed – The difference between the request time and the current time is too large. The reason for this time check is security: it prevents “record and playback” (replay) attacks.

So systems that make use of AWS must have the correct time, and the timezone must be correct as well. Otherwise the conversion to UTC will give the wrong result. A few days ago this all worked perfectly in VA Smalltalk, but tonight all AWS calls fail 😦. Last night we in the Netherlands switched to daylight saving time (DST), and VA Smalltalk doesn’t seem to handle this very well. A call to “DateAndTime now” still returns an offset from UTC of one hour instead of two. It seems that this is a known problem.

Until this problem is fixed we have to use a less than elegant solution to get things working again. We have added a “DSTMode” flag. When this flag is true, we subtract an extra hour when converting to UTC. You can enable this mode by executing:

CFPlatformServiceVASTUtils enableDSTMode: true

Support for the DSTMode was built into Cloudfork version jvds.79.
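
The effect of the flag can be illustrated with ANSI DateAndTime and Duration protocol. This is a sketch of the arithmetic only, not the Cloudfork implementation. It assumes a wall-clock time of 12:00 on the day of the switch, which the image reports with an offset of +1 hour although the real offset is +2.

```smalltalk
| oneHour reported naiveUtc correctedUtc |
oneHour := Duration days: 0 hours: 1 minutes: 0 seconds: 0.

"What the image reports: 12:00 local time with a stale +1:00 offset."
reported := DateAndTime
    year: 2009 month: 3 day: 29
    hour: 12 minute: 0 second: 0
    offset: oneHour.

"Plain conversion gives 11:00 UTC, one hour ahead of the real UTC time."
naiveUtc := reported asUTC.

"With DSTMode enabled an extra hour is subtracted, giving 10:00 UTC."
correctedUtc := naiveUtc - oneHour.
```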

Getting started with SimpleDB using Cloudfork

March 29, 2009

The Cloudfork-AWS project has classes to use the Amazon Web Services Simple Database (SimpleDB) directly from Smalltalk. Using these classes, you can create domains, put items, and query items even using regular Smalltalk blocks. Cloudfork contains the CFSimpleDB class that makes the generic calls available as Smalltalk methods. Calls that are related to a domain are implemented in the CFSimpleDBDomain class.

In short, SimpleDB items are stored in a domain which has a name. Each item has a name and a collection of attributes. Each attribute has a name and one or more values. Values can be String only; the application must take care of conversion. SimpleDB provides a database in the cloud that supports large volumes of data that can be accessed anywhere on the Internet. Amazon Web Services (AWS) takes care of high availability, consistency, indexing and performance.
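
Because values are Strings only, comparisons are made on text rather than on numbers. A common way to keep numeric order intact is to store numbers zero-padded to a fixed width (dates work the same way when written as yyyymmdd). A small workspace sketch, assuming the Squeak/Pharo message printPaddedWith:to:, with a padding width of 5 chosen arbitrarily:

```smalltalk
| unpaddedInOrder paddedInOrder stored |
"As plain text, '9' sorts after '10', which is wrong for numbers."
unpaddedInOrder := '9' < '10'.

"Zero-padded to the same width, text order and numeric order agree."
paddedInOrder := '00009' < '00010'.

"Converting an Integer before storing it as an attribute value."
stored := 42 printPaddedWith: $0 to: 5.
```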

Before throwing away your current persistence layer, it is important to realize that SimpleDB is not a relational database. It does not provide relational consistency (constraints), does not have a schema, and “records” contain String values only. Queries on items require expressions that have limited operators: no joins and no subselects (see documentation).

Use case
Besides being a simple object database, the service can be used to store reference data (e.g. zipcode tables, GPS locations, currencies) or logging information (audits). Another example is storing social user profiles, which typically have a variable set of properties. SimpleDB can also be used for storing metadata and references to S3 objects such as images, video and documents. Because an S3 object can contain any data, it can be used to store large attribute values that do not fit into a SimpleDB item.

After subscribing to the SimpleDB services with your AWS account, you must create the credentials object:

awsCredentials := CFAWSCredentials  
  newWith: '<your access key>'  
  andSecret: '<your secret access key>'.

Create a SimpleDB domain
To store items we need a domain, so let us create one. Every API call returns a CFAWSResponse, which must be checked for errors before using its result.

sdb := CFSimpleDB newWith: awsCredentials.
response := sdb createDomain: 'zipcodes'.
response isError 
  ifFalse:[domain := response result]

Add items
The variable “domain” will be an instance of CFSimpleDBDomain, which has various methods to access its items. For convenience, class CFSimpleDBItem can be used to encapsulate the name of the item and its attributes (name-value pairs).

item := CFSimpleDBItem newNamed: '3768GX'.
item valueAt: 'city' put: 'Soest'.
item valueAt: 'country' put: 'Netherlands'.
domain putItem: item.

Query items using expressions
AWS SimpleDB offers two sets of API calls that support criteria-based retrieval. You can use the query syntax:

domain query: '[''country'' = ''Netherlands'']'.

Or the select expressions:

sdb select: 'select city from zipcodes'.

More details on querying using “select” and “query” can be found in the AWS SimpleDB documentation.

Query items using Block
Cloudfork has classes that support the use of normal Smalltalk blocks to define the select condition. See the documentation for all possible operators and functions and class CFSSWOperand for the Smalltalk counterpart.

domain selectAllWhere: [:each | each country = 'Netherlands'].

Delete a SimpleDB domain
Deleting a domain will also delete all its items. No warning here.

sdb deleteDomain: 'zipcodes'.

This post showed you the basic usage of the Cloudfork SimpleDB services. Browse the classes in this package to see how other SimpleDB API services are mapped to Smalltalk messages. Also have a look at the CFSimpleDBEmulator, which can be used for unit testing classes that use Cloudfork-AWS SimpleDB.

If you are planning to use SimpleDB as an object database, then have a look at Cloudfork-ActiveItem. It is a framework that can help in mapping your objects to SimpleDB items and takes care of String conversions, data sharding and can handle associations similar to Rails ActiveRecord.

VA Smalltalk version of Cloudfork is ready for use

March 27, 2009

All the functionality of Cloudfork-AWS is now also available for VA Smalltalk. With Cloudfork-AWS you can access the Amazon S3, SQS and SimpleDB services from a simple to use Smalltalk interface. The code is hosted at, the SourceForge for VA related projects.

All tests are green!


As you can see, all tests pass. Porting from one Smalltalk dialect to another is a tedious job because there are a lot of little differences you have to take care of. For example, asSortedCollection is case insensitive in VA Smalltalk and case sensitive in Squeak/Pharo. Because of this the AWS signatures were calculated wrong in VA. The functionality for parsing XML and using HTTP is also completely different. We have isolated all this dialect-specific stuff in a separate package/application.
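
One way to take the default string ordering out of the equation is to pass an explicit sort block. The sketch below is an illustration rather than the code used in the port: it compares strings character by character on their integer values, so upper case sorts before lower case regardless of the dialect's default. It assumes Character>>asInteger, as found in Squeak/Pharo.

```smalltalk
| byteOrder names sorted |
"Answer true when a should come before (or equal) b in code point order."
byteOrder := [:a :b | | i n |
    n := a size min: b size.
    i := 1.
    [i <= n and: [(a at: i) = (b at: i)]] whileTrue: [i := i + 1].
    i > n
        ifTrue: [a size <= b size]
        ifFalse: [(a at: i) asInteger < (b at: i) asInteger]].

names := #('Timestamp' 'action' 'Action').
sorted := (names asSortedCollection: byteOrder) asArray.
```

With this block the three names come out as 'Action', 'Timestamp', 'action', whatever the dialect's own String comparison does.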

For installation instructions and for reporting issues you can use our project page on Google code:

Cloudfork SimpleDB now supports Batch Put

March 26, 2009

Yesterday, Amazon AWS announced the availability of the feature known as “Batch Put” for its SimpleDB web services. This operation allows you to do a faster put of multiple items (max 25) using a single HTTP request in a transactional way: either all inserts and updates succeed or nothing gets processed. Read the documentation for details and the warning about URL limits.

Using Cloudfork-AWS, the operation can be used like this:

simpleDB := CFSimpleDB newWith: awsCredentials.
"create new or open existing domain"
domain := (simpleDB createDomain: 'cloudfork-batch-put') result.  "normally check for errors first"

"create some items"
item1 := CFSimpleDBItem newNamed: 'Jack'.
item1 valueAt: 'gender' put: 'male'.
item2 := CFSimpleDBItem newNamed: 'Jill'.
item2 valueAt: 'gender' put: 'female'.

"store them all at once"
domain batchPutItems: (Array with: item1 with: item2). "normally check for errors afterwards"
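
Since one call accepts at most 25 items, a larger collection has to be split up first. A minimal workspace sketch of the splitting, using integers as stand-ins for items. Each resulting batch would then be passed to batchPutItems: as shown above, and only the items within one batch are stored atomically.

```smalltalk
| items batchSize batches |
items := (1 to: 60) asOrderedCollection.   "stand-ins for CFSimpleDBItem objects"
batchSize := 25.                            "maximum per Batch Put call"
batches := OrderedCollection new.

"Walk the collection in steps of 25 and copy out one slice per step."
1 to: items size by: batchSize do: [:start |
    batches add: (items
        copyFrom: start
        to: (start + batchSize - 1 min: items size))].

batches size.
```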

After adding the “batchPutItems:” method, one helper class and a few tests, I finished this feature within 2 hours thanks to the great Smalltalk language and a powerful IDE. So after a day, Cloudfork already supports the new API.

This feature is now available for Squeak/Pharo at Cloudfork. You can expect updates of the VA Smalltalk port and the VisualWorks port (Public Cincom Store) once we have finished the exports and imports.

