Posts tagged ‘aws’

Amazon AWS Region Endpoints in Europe

June 1, 2010

To use the Cloudfork classes for services located in Europe (Ireland), you need to change the serviceUrl property, for example:

| sdb |
sdb := CFSimpleDB new.
sdb serviceUrl: 'http://sdb.eu-west-1.amazonaws.com'
Service    Endpoint
SimpleDB   sdb.eu-west-1.amazonaws.com
SQS        eu-west-1.queue.amazonaws.com
EC2        eu-west-1.ec2.amazonaws.com
SNS        sns.eu-west-1.amazonaws.com
S3         (set the bucket location constraint to EU)
RDS        rds.eu-west-1.amazonaws.com
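The same pattern applies to the other services. For example, pointing an SQS client at the European region could look like this (assuming CFSimpleQueueService exposes the same serviceUrl: setter as CFSimpleDB):

```smalltalk
| sqs |
sqs := CFSimpleQueueService new.
"Use the EU (Ireland) endpoint instead of the default US one."
sqs serviceUrl: 'http://eu-west-1.queue.amazonaws.com'
```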

A comprehensive list can be found over at Elastician

Testing Cloudfork AWS SimpleDB based classes

September 20, 2009

The Cloudfork framework includes an alternate implementation of CFSimpleBase that stores all items in memory. The CFSimpleDBEmulator was initially created to support unit testing of the ActiveItem framework. With the exception of some query constructs, it implements the complete API and is therefore suitable for unit testing your own applications as well.

| emulator domain item |
emulator := CFSimpleDBEmulator new.
domain := (emulator createDomain: 'myapp.development') result. 
item := CFSimpleDBItem newNamed: 'user.dennis'.
item valueAt: 'birthday' put: '20060916'; valueAt: 'hobby' put: 'soccer'.
domain putItem: item.

After running the tests, you can inspect the emulator to see which items have been stored in which domains and what attributes they have. If you store the emulator in a class variable, you can keep the data around during development too.
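A minimal SUnit test using the emulator might look like the sketch below, with emulator and domain as instance variables of the test case. The getItem: call and the valueAt: reader are assumptions here; check the actual CFSimpleDBDomain and CFSimpleDBItem protocols for the exact accessors.

```smalltalk
setUp
	"Fresh in-memory emulator and domain for every test."
	emulator := CFSimpleDBEmulator new.
	domain := (emulator createDomain: 'myapp.test') result

testPutItemStoresAttributes
	| item stored |
	item := CFSimpleDBItem newNamed: 'user.dennis'.
	item valueAt: 'hobby' put: 'soccer'.
	domain putItem: item.
	"Read the item back and verify the attribute survived the round trip."
	stored := (domain getItem: 'user.dennis') result.
	self assert: (stored valueAt: 'hobby') = 'soccer'
```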

ActiveItem
Because the ActiveItem framework is built on top of SimpleDB, the same emulator class can be used to unit test those applications. ActiveItem uses a globally shared CFSimpleDB instance, so you only need to replace that with an emulated instance.

CFActiveItem activateWithSimpleDB: CFSimpleDBEmulator new.

Using the CFActiveItemSerializer you can even dump the items to a local Filesystem for development convenience.

Secure access to AWS from VisualWorks

July 9, 2009

Cloudfork implements the REST API of the Amazon Web Services using both secure (https) and non-secure (http) communication. In order to use the https protocol to access S3, SimpleDB, SQS, or EC2, you need to prepare the Smalltalk image by registering a trusted certificate. Without that certificate, your application will produce an error saying “CA Not in Trust Registry!” (CA = Certificate Authority). The steps below describe how to register the correct certificate in a VisualWorks (or WebVelocity) image.

Install HTTPS
Unless already loaded in your image, you need to install the HTTPS parcel (use Parcel Manager).

Export Certificate
Amazon Web Services uses the root certificate “VeriSign Class 3 Secure Server CA”. You can verify this by inspecting the chain object in the debugger that opens after a failed secure test.

One way to get this certificate file is to export it from the list of certificates known to your Internet browser. For Firefox users, open Preferences>Advanced>Encryption>View Certificates. Under VeriSign, Inc., select the certificate, export it using the format “X.509 Certificate with chain (PEM)” and name it “VeriSignClass3SecureServerCA.pem”.

Import Certificate
The following script will import the Base-64 encoded certificate file.

| certificate registry |
registry := Security.X509.X509Registry default.
certificate := Security.X509.Certificate fromFile: 'VeriSignClass3SecureServerCA.pem'.
registry addCertificate: certificate.

Please be aware of what is stated in the VisualWorks SecurityGuide.pdf (page 72): “Adding a CA certificate to your registry is deceivingly simple and does not convey the degree of trust actually involved in that action. Be sure to understand what it is you are trusting a CA to do and ensure that it matches the security requirements of your application.”

Run the Tests
Results of the secure Cloudfork Integration tests should all be in the green now.

HTTP Clients for Squeak

May 25, 2009

Cloudfork-AWS makes the Amazon Web Services (AWS) S3, SQS and SimpleDB easily accessible from Smalltalk. All the communication between the Smalltalk image and AWS is done via HTTP, so an HTTP client is an important requirement for Cloudfork-AWS.

Cloudfork-AWS needs more than simple HTTP GET and POST requests; the following features are also needed:

  • Setting custom request headers – S3 uses custom headers for authentication and for attaching meta-data to S3 objects. We need to be able to set these headers. This feature is also required for range requests, which let you download part of an S3 object instead of the entire object.
  • Access to the response headers – So we can read the meta-data of S3 objects.
  • Support for PUT, HEAD and DELETE requests – Also required for S3. PUT is required for storing objects and creating buckets. DELETE is required for removing objects and HEAD for getting the object meta-data without downloading the object itself.
  • HTTPS support – The AWS services can be accessed via plain HTTP or via secure HTTPS. The choice is up to the client. But the release notes of the latest releases of SimpleDB mention that HTTP support will be deprecated and that future versions will require HTTPS.
  • HTTP/1.1 support – Not a must-have feature, but version 1.1 requests can be more efficient than version 1.0 requests because of the keep-alive feature of version 1.1, which lets socket connections be reused between requests.
  • Streaming uploads and downloads – Also not a must-have feature for most use cases; only needed when large S3 objects have to be handled.
  • Proxy support – Not a requirement of one of the AWS services but a feature that is often required by client configurations.

Note that most of these features are only required for S3 and not for SQS and SimpleDB. SQS and SimpleDB only use GET and POST requests, and the authentication is done in the URL and not through HTTP header fields. The HTTP responses of SQS and SimpleDB always contain XML, and the maximum size is about 8KB for SQS and 1MB for a SimpleDB result set, so streaming support is not required.

As far as I know there are three HTTP clients available for Squeak:

  • The HTTPSocket class – This class is part of the Network-Protocols package and is part of the standard images of the latest Squeak and Pharo versions.
  • SWHTTPClient – This is an extensive HTTP client library. It was originally developed for Dolphin Smalltalk and was ported to Squeak. The latest release is not fully compatible with the latest Squeak release. There are a number of class extension collisions.
  • CurlPlugin – This is a Squeak plugin that uses the libcurl C library, libcurl is a well-known and powerful open source “URL transfer library” with support for HTTP, FTP and many other protocols.

HTTPSocket

This is a very simple implementation of an HTTP client in a single class. It supports HTTP GET and POST requests, gives access to the headers, and handles simple proxy configurations. HTTP/1.1 is not supported, and neither is HTTPS.

The current version of Cloudfork-AWS does not work with HTTPSocket as an HTTP client. With the provided functionality it should be possible to support the SQS and SimpleDB APIs, but when I use HTTPSocket I get an AWS error telling me that the calculated signature is wrong. I think this is because HTTPSocket always adds the port number to the Host header field; Cloudfork doesn’t include the port when it calculates the signature, so you get a mismatch. It is on my todo list to fix this.

SWHTTPClient

SWHTTPClient is a full featured HTTP client library. It supports HTTP/1.1, access to the header fields and the PUT, HEAD and DELETE methods. Streaming uploads and downloads are also possible. The one thing that is not supported or that I couldn’t get working is HTTPS. Perhaps it’s possible to get this working by plugging in the Cryptography package but I have no idea how.

Another issue is that SWHTTPClient is not fully compatible with the latest Squeak and Pharo releases. The package contains some class extensions that override existing methods with different behavior, for example the String>>subStrings: method.

Cloudfork-AWS can use SWHTTPClient, all AWS features work except HTTPS. I have fixed all the incompatibilities I bumped into. The patched version of SWHTTPClient is available from the Cloudfork project page on SqueakSource.

CurlPlugin

The installation of this library is a bit more work. You need to place the correct binaries for your platform in the Squeak installation directory and load the CurlPlugin package from SqueakSource. When you load the package you may get a warning that the class CurlPlugin cannot be loaded. This is no problem; you can still use the plugin through the Curl class. The CurlPlugin class is only needed if you want to create a new version of the plugin or support a new platform.

The libcurl library that the CurlPlugin uses supports all the HTTP features we need and many more. It is one of the best HTTP client libraries around, and it’s open source. It has optional integration with OpenSSL, which provides the functions required for HTTPS.

The current version of the CurlPlugin doesn’t expose all the features of libcurl. Currently HEAD and DELETE requests are not supported, and it is not yet possible to set the header fields for a request. The other methods work very well, and HTTPS also works fine.

Cloudfork-AWS

For the SimpleDB and SQS services the CurlPlugin is the best HTTP client. All the required features are there and the performance is very good. SimpleDB and SQS also work with the SWHTTPClient, only without HTTPS support. If the Curl class is present in your image, Cloudfork-AWS will use it for all SimpleDB and SQS service calls; otherwise the SWHTTPClient is used.
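The fallback described above could be sketched roughly like this (a hypothetical method; the actual Cloudfork selection logic may differ):

```smalltalk
preferredHttpClientClass
	"Prefer the Curl plugin when it is loaded in the image;
	otherwise fall back to SWHTTPClient."
	^ (Smalltalk includesKey: #Curl)
		ifTrue: [ Smalltalk at: #Curl ]
		ifFalse: [ SWHTTPClient ]
```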

The current CurlPlugin doesn’t support all the features required by the S3 service. For this reason the Cloudfork S3 functionality requires the SWHTTPClient.

Future work

I think the CurlPlugin has the potential to become a very good HTTP client library for Squeak and Pharo. It will also be relatively easy to maintain this library because all of the complex work of supporting the different protocols is implemented in libcurl. This C library has a very large community and is well maintained. I will try to extend the plugin and add the missing features.

I will also try to make Cloudfork-AWS compatible with HTTPSocket. This will not be the best performing solution but it can be an easy starting point.

Problems with Daylight saving time in VA Smalltalk

March 29, 2009

All the requests that Cloudfork-AWS sends to the Amazon web services contain the current date and time in Coordinated Universal Time (UTC). If this timestamp differs more than a few seconds from the current time, you get an error, for example the S3 error: RequestTimeTooSkewed – The difference between the request time and the current time is too large. The reason for this time check is security: it prevents “record and playback” (replay) attacks.

So systems that make use of AWS must have the correct time, and the timezone must be correct as well; otherwise the conversion to UTC will give the wrong result. A few days ago this all worked perfectly in VA Smalltalk, but tonight all AWS calls fail :-( Last night, here in the Netherlands, we switched to daylight saving time (DST). VA Smalltalk doesn’t seem to handle this very well: a call to “DateAndTime now” still returns an offset from UTC of one hour instead of two. It seems that this is a known problem.

Until this problem is fixed, we have to use a less than elegant solution to get things working again. We have added a “DSTMode” flag; when this flag is true, we subtract an extra hour when converting to UTC. You can enable this mode by executing:


CFPlatformServiceVASTUtils enableDSTMode: true

Support for the DSTMode was built into Cloudfork version jvds.79.

Getting started with SimpleDB using Cloudfork

March 29, 2009

The Cloudfork-AWS project has classes to use the Amazon Web Services Simple Database (SimpleDB) directly from Smalltalk. Using these classes, you can create domains, put items, and query items, even using regular Smalltalk blocks. Cloudfork contains the CFSimpleDB class that makes the generic calls available as Smalltalk methods. Calls that are related to a domain are implemented in the CFSimpleDBDomain class.

SimpleDB
In short, SimpleDB items are stored in a domain which has a name. Each item has a name and a collection of attributes. Each attribute has a name and one or more values. Values can only be Strings; the application must take care of conversion. SimpleDB provides a database in the cloud that supports large volumes of data that can be accessed anywhere on the Internet. Amazon Web Services (AWS) takes care of high availability, consistency, indexing and performance.

Before throwing away your current persistence solution, it is important to realize that SimpleDB is not a relational database. It does not provide relational consistency (constraints), does not have a schema, and “records” contain String values only. Queries on items require expressions with a limited set of operators: no joins, no subselects (see the documentation).
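Because attribute values are Strings only, the application converts other types on the way in and out; a small sketch (using valueAt: as a reader is an assumption here, check the CFSimpleDBItem protocol):

```smalltalk
| item price |
item := CFSimpleDBItem newNamed: 'product.42'.
"Store a number as its String representation."
item valueAt: 'price' put: 19.95 printString.
"Convert back to a number when reading."
price := (item valueAt: 'price') asNumber
```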

Use case
Besides being a simple object database, the service can be used to store reference data (e.g. zipcode tables, GPS locations, currencies) or logging information (audits). Another example is storing social user profiles, which typically have a variable set of properties. SimpleDB can also be used for storing metadata and references to S3 objects such as images, video and documents. Because an S3 object can contain any data, it can be used to store large attribute values that do not fit into a SimpleDB item.

Smalltalk
After subscribing to the SimpleDB services with your AWS account, you must create the credentials object:

awsCredentials := CFAWSCredentials  
  newWith: '<your access key>'  
  andSecret: '<your secret access key>'.

Create a SimpleDB domain
To store items we need a domain, so let us create one. Every API call returns a CFAWSResponse, which must be checked for errors before using its result.

sdb := CFSimpleDB newWith: awsCredentials.
response := sdb createDomain: 'zipcodes'.
response isError
  ifFalse: [ domain := response result ]

Add items
The variable “domain” will be an instance of CFSimpleDBDomain, which has various methods to access its items. For convenience, the class CFSimpleDBItem can be used to encapsulate the name of the item and its attributes (name, value pairs).

item := CFSimpleDBItem newNamed: '3768GX'.
item valueAt: 'city' put: 'Soest'.
item valueAt: 'country' put: 'Netherlands'.
domain putItem: item.

Query items using expressions
AWS SimpleDB offers two sets of API calls that support criteria-based retrieval. You can use the query syntax:

domain query: '[''country'' = ''Netherlands'']'.

Or the select expressions:

sdb select: 'select city from zipcodes'.

More details on querying using “select” and “query” can be found in the AWS SimpleDB documentation.

Query items using Block
Cloudfork has classes that support the use of normal Smalltalk blocks to define the select condition. See the documentation for all possible operators and functions and class CFSSWOperand for the Smalltalk counterpart.

domain selectAllWhere: [:each | each country = 'Netherlands'].

Delete a SimpleDB domain
Deleting a domain will also delete all its items. No warning here.

sdb deleteDomain: 'zipcodes'.

This post showed you the basic usage of the Cloudfork SimpleDB services. Browse the classes in this package to see how other SimpleDB API services are mapped to Smalltalk messages. Also have a look at the CFSimpleDBEmulator, which can be used for unit testing classes that use Cloudfork-AWS SimpleDB.

If you are planning to use SimpleDB as an object database, have a look at Cloudfork-ActiveItem. It is a framework that helps map your objects to SimpleDB items, takes care of String conversions and data sharding, and can handle associations similar to Rails’ ActiveRecord.

VisualWorks version of Cloudfork is ready for use

March 28, 2009

All the functionality of Cloudfork-AWS is now also available for Cincom VisualWorks 7.6. With Cloudfork-AWS you can access the Amazon S3, SQS and SimpleDB services from a simple to use Smalltalk interface. The code is available on the Cincom Public Repository.

Cloudfork has been designed and implemented with portability in mind. Early in the development, we refactored classes and methods such that a large portion of the packages is platform (Squeak, VA Smalltalk, VisualWorks) independent. In addition, we paid a lot of attention to avoiding indirect dependencies on other open-source projects such as Seaside. Therefore our implementation should work in “clean images” with minimal dependencies (some packages, such as SHA and an HTTP client, are required).

In the case of the VisualWorks port, I spent most of the time getting all the Date codecs working (RFC822, RFC3339, ISO8601) and the HTTP requests (all the HTTP verbs). In particular, getting streaming support for GET and PUT turned out to be quite an investigation into the inner workings of the standard HttpClient. There are also some issues with the apparent auto-inclusion of HTTP headers and the case-sensitivity of header keys. Both the streaming support and the HTTP header handling will require some rework; uploading and downloading of large S3 objects is not very efficient at the moment.

For installation instructions and for reporting issues, you can use our project page on Google code: InstallingForVisualWorks

BTW: my VW tests are green too :-)

Uploading to an Amazon S3 Bucket from Seaside

February 18, 2009

The Cloudfork repository on Squeaksource contains a package named Cloudfork-S3Upload. This package contains a Seaside component that shows how you can let a web user upload data directly to a S3 bucket. The Seaside component uses the AWS POST feature to implement this.

An advantage of this approach is that the data goes directly to S3 and doesn’t pass through your server. This increases the scalability and robustness of the system, especially when uploading large files such as media files.

The Cloudfork-S3Upload package doesn’t depend on any other Cloudfork package. It depends on the Seaside web framework and also on the Cryptography Team Package (required for the SHA1 hash function). I used the alpha version of Seaside 2.9 to develop this package.

My ambition is to develop a reusable S3 upload component that uses AJAX functionality to start the upload process without submitting the complete page. In this way it will be possible to perform multiple uploads simultaneously from a single page. This is not trivial, and with my limited AJAX knowledge it will take me some time to get this working. The current code (version jvds.3) contains the functionality to configure the policy and to create a signature of this policy. The Seaside component CFS3UploadExample1 shows how this code can be used to implement an upload form. This sample form doesn’t have any fancy AJAX functionality, but it does work without problems.

Getting started with SQS using Cloudfork

February 5, 2009

The Cloudfork-AWS project makes it easy to use the Amazon Simple Queue Service (SQS) from Smalltalk. SQS is a highly scalable and simple to use messaging system. Basically it lets you send and receive messages via one or more queues you define. It is available via a REST and a SOAP based HTTP interface. Cloudfork contains the CFSimpleQueueService class that makes the generic SQS calls available as Smalltalk methods. Calls that are related to a queue are implemented in the CFSimpleQueue class.

It is important to remember that SQS is not a transactional messaging system. This means that messages you send are immediately available on the queue without requiring a commit. For receiving messages, SQS uses the concept of a visibility timeout. The timeout value, in seconds, is the amount of time you have to process the message after receiving it. During this time the message is invisible to other clients that receive messages from the queue. If you don’t delete the message before the timeout expires, the message becomes visible to other clients again. So you should think carefully about how to set the visibility timeout or, better, design your system in such a way that the occasional message being processed more than once doesn’t cause serious problems.
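One way to tolerate such duplicate deliveries is to make the message handler idempotent, as in this sketch (processedIds and the id accessor are illustrative, not part of the Cloudfork API):

```smalltalk
handleMessage: aMessage
	"Skip messages that were already processed; duplicates can occur
	when the visibility timeout expires before the message is deleted."
	(processedIds includes: aMessage id) ifTrue: [ ^ self ].
	self process: aMessage.
	processedIds add: aMessage id
```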

Use case
There are many situations in which a messaging system can improve the scalability, flexibility, and robustness of a system. As an example, think of a web application that needs to send out emails with a PDF report as an attachment. You can generate the PDF and send the mail directly from your web application, but if creating the report involves a lot of data processing, this will seriously limit the scalability of the web application. In these situations it is better to send out a CreateAndMailReport message and let some other process handle the work. You can start with a single report process; if your website becomes popular, you can easily add as many report processes on as many servers as you need without changing a single line of code.

With the cloud computing facilities of Amazon you can completely automate this process. Create an EC2 image that runs one or more report processes, and create a monitor process that checks the number of messages in the queue. If the number of messages in the queue increases, this monitor process can start up extra EC2 report instances; if the queue is empty, it can stop them. I use exactly this construction for transcoding media files (audio and video). Sometimes I need to process a lot of media files in a short period of time, and I use a maximum of eight EC2 instances to handle these peaks. Most of the time I have just one transcoding EC2 instance running during working hours and zero instances during the night.

Smalltalk
Ok, that was a short intro of SQS, Amazon provides excellent documentation where you can find all the details. Now let’s turn to some Smalltalk code. The assumption is that you have an AWS account and that account is subscribed to the SQS service.

Sending messages
Messages in SQS must be Strings with a maximum size of 8KB. The method below creates a new queue and sends one message to it. If the queue already exists, the createQueue call does nothing.

sendMessage: sqs

	| name qurl queue |
	name := 'cloudfork-example-q'.
	qurl := (sqs createQueue: name) result.
	queue := sqs openQueue: qurl.
	queue sendMessage: 'Cloudfork says Hi!'

Receiving messages
The code below shows how a receive loop can be implemented. This loop will run forever and call handleMessage: for every message received. With SQS you have to poll for new messages; if no message is available, nil is returned as the result, in which case the process sleeps for 30 seconds before trying again. This loop should run as a background process, otherwise it will lock up the user interface.

startReceiveLoopFor: aQueue

	| message response |
	[ response := aQueue receiveMessage.
	response isError
		ifTrue: [ self error: 'error receiving message' ]
		ifFalse: [
			message := response result.
			message isNil
				ifTrue: [
					"Nothing available; wait before polling again."
					(Delay forSeconds: 30) wait ]
				ifFalse: [
					[ self handleMessage: message ]
						ensure: [ aQueue deleteMessage: message ] ] ] ] repeat

Note that the error handling is very basic: the loop will keep running if an error occurs during the handling of a message, but it will end when the receiveMessage method returns an error.
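A slightly more robust variant could back off and retry instead of stopping on the first receive error. This is just a sketch; the retry limit and delays are arbitrary:

```smalltalk
startReceiveLoopFor: aQueue

	| response message errors |
	errors := 0.
	[ response := aQueue receiveMessage.
	response isError
		ifTrue: [
			"Back off on errors instead of ending the loop;
			give up after five consecutive failures."
			errors := errors + 1.
			errors > 5 ifTrue: [ ^ self error: 'receiving keeps failing' ].
			(Delay forSeconds: 60) wait ]
		ifFalse: [
			errors := 0.
			message := response result.
			message isNil
				ifTrue: [ (Delay forSeconds: 30) wait ]
				ifFalse: [
					[ self handleMessage: message ]
						ensure: [ aQueue deleteMessage: message ] ] ] ] repeat
```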

