{"id":3623,"date":"2020-02-06T01:13:42","date_gmt":"2020-02-06T01:13:42","guid":{"rendered":"https:\/\/www.migenius.com\/?p=3623"},"modified":"2020-02-06T01:13:48","modified_gmt":"2020-02-06T01:13:48","slug":"authenticated-aws-s3-downloads-in-v8","status":"publish","type":"post","link":"https:\/\/www.migenius.com\/technical\/authenticated-aws-s3-downloads-in-v8","title":{"rendered":"Authenticated AWS S3 Downloads in V8"},"content":{"rendered":"\n

We’ve had quite a few customers express interest in downloading content from AWS S3<\/a> rather than persistently storing it on their RealityServer instances. While this introduces latency, in many use cases it can still make a lot of sense. Our recently released HTTP Request functionality for V8<\/a> makes it easy to download content from public URLs. What do you do if your S3 buckets require authentication though? Let’s dive in.<\/p>\n\n\n\n\n\n\n\n

Introduction<\/h3>\n\n\n\n

This won’t be a comprehensive guide. S3 has a lot of different options and settings we won’t cover here but by far the most complex to get working is the authentication, so we are going to focus on this. Using this technique you’ll still download files to your RealityServer instance and then load them from disk but this can be done on-demand. Batch rendering is one example where this can make a lot of sense since it is not as latency sensitive. If you are also running your instances on AWS in EC2 then it is likely the download times will be quite reasonable as well.<\/p>\n\n\n\n

For public S3 buckets and objects you don’t need to do anything special, just make a regular HTTP request. However, accessing buckets and objects that are not flagged for public access requires jumping through some hoops.<\/p>\n\n\n\n

AWS S3 Authentication<\/h3>\n\n\n\n

There are two different authentication methods currently in use for AWS S3: the now deprecated Signature Version 2<\/a> method and the current Signature Version 4<\/a> method. We’re only covering the Signature Version 4 method here since it works in all regions and really should be what you are using today. Getting Signature Version 2 working is a little easier; if you want an example of that, contact us<\/a>.<\/p>\n\n\n\n

For the full details of the process please refer to the Authenticating Requests (AWS Signature Version 4)<\/a> article published by Amazon. Basically the process involves computing HMAC and hash results from a very specific set of information and then setting the HTTP Authorization<\/code> header (and a few others) based on the results. This process is of course only needed if your buckets are private.<\/p>\n\n\n\n

This guide assumes you have a variable called s3_config<\/code> which is an object containing key<\/code>, secret<\/code> and region<\/code> properties. We like to put this into a JavaScript file in v8\/include\/s3_config.js<\/code> that looks something like this.<\/p>\n\n\n

\nconst s3_config = {\n    key: 'AKIAIOSFODNN7EXAMPLE',\n    secret: 'wJalrXUtnFEMI\/K7MDENG\/bPxRfiCYEXAMPLEKEY',\n    region: 'us-east-1'\n};\n<\/pre><\/div>\n\n\n

Then include it like this.<\/p>\n\n\n

\nconst s3_config = require('s3_config');\n<\/pre><\/div>\n\n\n

Of course you can populate this any way you like. Whichever key and secret you provide should have the IAM<\/a> permissions needed to perform whatever operations you want on your bucket objects. Before we can actually start computing the values needed to make this work, we also need some cryptographic functionality.<\/p>\n\n\n\n

Cryptographic Tools in JavaScript<\/h3>\n\n\n\n

AWS S3 authentication requires the use of SHA256<\/a> and HMAC<\/a> functionality in order to calculate the final string for the authorization header. Fortunately there is already a great JavaScript library called Forge<\/a> which provides both of these in an easy to use way. With a little cajoling it can be made to work in the RealityServer V8 environment.<\/p>\n\n\n\n

Installing and Using Forge<\/h4>\n\n\n\n

Download the latest release<\/a> of Forge. Open the archive and extract the contents of the forge-x.x.x\/lib<\/code> directory into v8\/include\/forge<\/code> in your RealityServer directory. It can go in any directory you have configured for V8 includes but should be in its own folder. Now within your command you can include Forge like this.<\/p>\n\n\n

\nwindow = {};\n\nconst forge = require('forge\/index');\n<\/pre><\/div>\n\n\n

The window<\/code> variable needs to be defined for the library to work correctly. The library normally detects whether it is running in Node.js or the browser but doesn’t know anything about RealityServer so we have to trick it. Other than that it works perfectly out of the box. Now since you will call the SHA256 and HMAC functionality many times, you’ll want to create a couple of convenience functions.<\/p>\n\n\n

\nfunction hmac_sha256(key, data) {\n    let hmac = forge.hmac.create();\n    hmac.start('sha256', key);\n    hmac.update(data);\n    return hmac.digest().getBytes();\n}\n\nfunction hash_sha256(data) {\n    let sha = forge.md.sha256.create();\n    sha.update(data);\n    return sha.digest().toHex();\n}\n<\/pre><\/div>\n\n\n

You’ll notice the hmac_sha256<\/code> function returns the bytes from the result while hash_sha256<\/code> returns the hex string. This is because the HMAC calls we will use are chained into each other as specified by the AWS documentation. What the documentation does not mention (at least not in the same place) is that each link in the chain expects the bytes from the last, not the hex string. We’ll use the Forge util.bytesToHex<\/code> helper later to convert the last HMAC into the hex string needed.<\/p>\n\n\n\n
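To see why the bytes matter, the sketch below runs the same two-step chain both ways. It uses the built-in Node.js `crypto` module instead of Forge purely so it can be run outside RealityServer (that substitution is our assumption for illustration; inside a V8 command you would keep using the Forge helpers above). The secret and date are the example values from the Amazon documentation.

```javascript
const crypto = require('crypto');

// Returns the raw HMAC-SHA256 digest as a Buffer (bytes, not a hex string).
function hmac_bytes(key, data) {
    return crypto.createHmac('sha256', key).update(data, 'utf8').digest();
}

const secret = 'AWS4wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY';
const date_key = hmac_bytes(secret, '20130524');

// Correct: the raw bytes of one HMAC become the key of the next.
const chained_bytes = hmac_bytes(date_key, 'us-east-1').toString('hex');

// Incorrect: the hex string of one HMAC is used as the key of the next.
const chained_hex = hmac_bytes(date_key.toString('hex'), 'us-east-1').toString('hex');

console.log(chained_bytes === chained_hex); // false
```

Both results are valid looking 64 character hex strings, which is why this mistake is easy to miss until AWS rejects the signature.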

Forge is now ready to use in your command. You might be surprised how many other off the shelf JavaScript libraries can be run in RealityServer V8 with a few tricks.<\/p>\n\n\n\n

Computing the Authorization Header<\/h3>\n\n\n\n

Here is the code you’ll need to build the headers for the request. If you read the Amazon documentation you should recognise all of the elements here. This code assumes there are variables called bucket<\/code> and key<\/code> which define the AWS S3 bucket you want to access and the object key which uniquely identifies a given object within that bucket (not to be confused with the AWS access key).<\/p>\n\n\n

\n\/\/ Required data\nconst host = `${bucket}.s3.amazonaws.com`;\nconst url = `https:\/\/${host}\/${key}`;\nconst path = `\/${bucket}\/${key}`;\nconst algorithm = 'AWS4-HMAC-SHA256';\nconst time = new Date();\nconst date = time.toISOString().slice(0,10).replace(\/-\/g, '');\nconst time_stamp = time.toISOString().replace(\/[-:]\/g, '').split('.')[0] + 'Z';\nconst scope = `${date}\/${s3_config.region}\/s3\/aws4_request`;\n\n\/\/ The canonical request\nconst hashed_payload = hash_sha256('');\nconst http_method = 'GET';\nconst canonical_uri = `\/${key}`;\nconst canonical_query_string = '';\nconst canonical_headers = `host:${host}\\n`\n    + `x-amz-content-sha256:${hashed_payload}\\n`\n    + `x-amz-date:${time_stamp}\\n`;\nconst signed_headers = 'host;x-amz-content-sha256;x-amz-date';\nconst request = `${http_method}\\n`\n    + `${canonical_uri}\\n`\n    + `${canonical_query_string}\\n`\n    + `${canonical_headers}\\n`\n    + `${signed_headers}\\n`\n    + `${hashed_payload}`;\nconst canonical_request = hash_sha256(request);\n\n\/\/ The signing key (each HMAC is keyed with the raw bytes of the previous one;\n\/\/ the region must match the one used in the scope)\nconst signing_key = hmac_sha256(hmac_sha256(hmac_sha256(hmac_sha256(\n        `AWS4${s3_config.secret}`, date), s3_config.region), 's3'), 'aws4_request');\n\n\/\/ Sign the string and create authorization header\nconst string_to_sign = `${algorithm}\\n${time_stamp}\\n${scope}\\n${canonical_request}`;\nconst signature = forge.util.bytesToHex(hmac_sha256(signing_key, string_to_sign));\n\n\/\/ Required headers\nconst authorization = `${algorithm} `\n    + `Credential=${s3_config.key}\/${scope},`\n    + `SignedHeaders=${signed_headers},Signature=${signature}`;\nconst x_amz_content_sha256 = hashed_payload;\nconst x_amz_date = time_stamp;\n<\/pre><\/div>\n\n\n
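The two timestamp formats are easy to get subtly wrong, so here is a small worked example of just that part. It uses a fixed instant instead of the current time so the output is reproducible; the `region` constant stands in for `s3_config.region` and is only there to make the snippet self-contained.

```javascript
// A fixed instant so the result is reproducible: 24 May 2013, 00:00:00 UTC.
const time = new Date(Date.UTC(2013, 4, 24, 0, 0, 0));
const region = 'us-east-1'; // stand-in for s3_config.region

// YYYYMMDD, used in the credential scope and as the first signing key input.
const date = time.toISOString().slice(0, 10).replace(/-/g, '');

// YYYYMMDD'T'HHMMSS'Z', used for x-amz-date and in the string to sign.
const time_stamp = time.toISOString().replace(/[-:]/g, '').split('.')[0] + 'Z';

const scope = `${date}/${region}/s3/aws4_request`;

console.log(date);       // 20130524
console.log(time_stamp); // 20130524T000000Z
console.log(scope);      // 20130524/us-east-1/s3/aws4_request
```

Because `toISOString` always reports UTC, both values stay consistent with each other regardless of the time zone configured on the server.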

This is what is needed for a simple GET request. If you want to use other HTTP verbs and send body text or use query string parameters you’ll need to expand on this code. However the most troublesome elements are all covered in the above code. While there is a lot of good documentation from Amazon, the above still needed a fair amount of trial and error to get right (which we’ve done so you don’t have to).<\/p>\n\n\n\n
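If you do need query string parameters, or object keys containing spaces or other special characters, the canonical request expects them percent-encoded in a specific way. The sketch below shows one way to do that under the Signature Version 4 rules (encode everything except `A-Z a-z 0-9 - _ . ~`, and sort parameters by name). The function names `aws_uri_encode` and `canonical_query` are our own and not part of any library.

```javascript
// Percent-encode everything except the unreserved characters A-Z a-z 0-9 - _ . ~
// encodeURIComponent leaves ! ' ( ) * untouched, so those are handled explicitly.
function aws_uri_encode(value, encode_slash = true) {
    const encoded = encodeURIComponent(value).replace(/[!'()*]/g,
        c => '%' + c.charCodeAt(0).toString(16).toUpperCase());
    // Object keys keep their slashes in the canonical URI; query values do not.
    return encode_slash ? encoded : encoded.replace(/%2F/g, '/');
}

// Build the canonical query string from a plain object of parameters.
function canonical_query(params) {
    return Object.keys(params)
        .map(name => [aws_uri_encode(name), aws_uri_encode(String(params[name]))])
        .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0))
        .map(pair => `${pair[0]}=${pair[1]}`)
        .join('&');
}

console.log(canonical_query({ prefix: 'scenes/a b', 'max-keys': 2 }));
// max-keys=2&prefix=scenes%2Fa%20b
console.log(aws_uri_encode('scenes/my file.bin', false));
// scenes/my%20file.bin
```

The first result would replace the empty `canonical_query_string` and the second would replace the raw `key` in `canonical_uri`. Whatever you sign also has to match what you send, so the request URL should be built from the same encoded values.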

Making the Request<\/h3>\n\n\n\n

So we want to download a binary scene file from S3. We’ve computed the headers above, so how do we make the request? With the new HTTP request functionality in V8 that is very straightforward. Here is the code.<\/p>\n\n\n

\nconst response = http.get({\n    url: url,\n    headers: {\n        'Host': host,\n        'x-amz-content-sha256': x_amz_content_sha256,\n        'x-amz-date': x_amz_date,\n        'Authorization': authorization\n    },\n    encoding: null\n});\n\nfs.writeFile('file.bin', response.body);\n<\/pre><\/div>\n\n\n

So it’s just a matter of setting some headers. Since this example is requesting a binary file, we set encoding<\/code> to null<\/code>. If you were downloading JSON data you could use the json<\/code> property to have it automatically parsed. We can then write the file directly to disk. Obviously you’d want to use a specific name or derive it from the URL. Once the file is written you can of course access it from other RealityServer methods such as Scene.import<\/code> or the filename<\/code> property of the Image<\/code> class.<\/p>\n\n\n\n
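As a rough illustration of deriving the name rather than hard-coding `file.bin`, the following sketch takes the last segment of the object key and replaces characters that tend to cause trouble in file names. The function name `local_name_from_key` and the fallback name are hypothetical choices of ours.

```javascript
// Derive a local file name from an S3 object key by taking the last path
// segment and replacing anything other than letters, digits, '.', '_' and '-'.
function local_name_from_key(key, fallback = 'download.bin') {
    const segment = key.split('/').filter(part => part.length > 0).pop();
    // Nothing usable, or a directory reference we never want to write to.
    if (!segment || segment === '.' || segment === '..') {
        return fallback;
    }
    return segment.replace(/[^A-Za-z0-9._-]/g, '_');
}

console.log(local_name_from_key('scenes/2020/my scene.mib')); // my_scene.mib
console.log(local_name_from_key(''));                         // download.bin
```

Note that two different keys can map to the same local name with this approach (for example the same file name under two prefixes), so if that matters for your content you may prefer to include part of the prefix or a hash of the full key.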

Putting it Together<\/h3>\n\n\n\n

There are lots of ways to use the above. You could wrap it in a class and re-use it in many places or just put it straight into a command. You can take a look at a complete example command<\/a> which shows a working version, you just need to provide the AWS credentials.<\/p>\n\n\n\n
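As one possible shape for a reusable wrapper, here is a sketch that packages the signing steps into a single function returning the URL and headers. To keep it runnable outside RealityServer it uses the Node.js `crypto` module in place of Forge, which is an assumption on our part; the function name `s3_signed_get` and its parameters are also ours. Inside a V8 command you would swap the two helpers for the Forge based ones shown earlier.

```javascript
const crypto = require('crypto');

// Stand-ins for the Forge based helpers (raw bytes from HMAC, hex from hash).
const hmac = (key, data) => crypto.createHmac('sha256', key).update(data, 'utf8').digest();
const sha256_hex = data => crypto.createHash('sha256').update(data, 'utf8').digest('hex');

// Returns the URL and headers for a signed GET of a single object.
// `time` can be passed in so the result is reproducible.
function s3_signed_get(config, bucket, key, time = new Date()) {
    const host = `${bucket}.s3.amazonaws.com`;
    const time_stamp = time.toISOString().replace(/[-:]/g, '').split('.')[0] + 'Z';
    const date = time_stamp.slice(0, 8);
    const scope = `${date}/${config.region}/s3/aws4_request`;
    const hashed_payload = sha256_hex('');
    const signed_headers = 'host;x-amz-content-sha256;x-amz-date';

    // Method, URI, (empty) query string, headers, blank line, signed headers, payload hash.
    const canonical_request = [
        'GET', `/${key}`, '',
        `host:${host}`,
        `x-amz-content-sha256:${hashed_payload}`,
        `x-amz-date:${time_stamp}`,
        '', signed_headers, hashed_payload
    ].join('\n');

    const string_to_sign = ['AWS4-HMAC-SHA256', time_stamp, scope,
        sha256_hex(canonical_request)].join('\n');

    // Each step is keyed with the raw bytes of the previous step.
    const k_date = hmac(`AWS4${config.secret}`, date);
    const k_region = hmac(k_date, config.region);
    const k_service = hmac(k_region, 's3');
    const signing_key = hmac(k_service, 'aws4_request');
    const signature = hmac(signing_key, string_to_sign).toString('hex');

    return {
        url: `https://${host}/${key}`,
        headers: {
            'Host': host,
            'x-amz-content-sha256': hashed_payload,
            'x-amz-date': time_stamp,
            'Authorization': `AWS4-HMAC-SHA256 Credential=${config.key}/${scope},`
                + `SignedHeaders=${signed_headers},Signature=${signature}`
        }
    };
}

const example = s3_signed_get(
    { key: 'AKIAIOSFODNN7EXAMPLE', secret: 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', region: 'us-east-1' },
    'examplebucket', 'test.txt', new Date(Date.UTC(2013, 4, 24, 0, 0, 0)));
console.log(example.headers['x-amz-date']); // 20130524T000000Z
```

The returned object has the same `url` and `headers` layout used in the request shown earlier, so it could be passed along with `encoding: null` for binary downloads.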

Going Further<\/h3>\n\n\n\n

This article only shows the basics of how to get the most common request type going (downloading a file with a GET request). You might need more functionality, such as query string support, which can be added based on the code shown here. Also note that the Signature Version 4 authentication method is used by other AWS services, so you might be able to use this to access things other than S3.<\/p>\n\n\n\n

Building a caching system to avoid re-downloading content is also something you could do to enhance this functionality further. This way only cold starts of new instances would need to download content and subsequent requests would just re-use the already downloaded files. You could implement your own cache policies and store the information in the RealityServer database as Attribute_container<\/code>, a trick we use in our compositing V8 scripts to persist data between calls.<\/p>\n\n\n\n
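To give a feel for what such a cache policy could look like, here is a minimal sketch of a time based cache in plain JavaScript. It is deliberately not tied to RealityServer: a `Map` stands in for whatever persistent storage you choose, and `make_download_cache`, `max_age_ms` and the `download` callback are hypothetical names of ours. The callback is where the signed request and file write would go.

```javascript
// Hypothetical cache: remembers which local file an object key was saved to
// and when, and only calls `download` again once the entry is too old.
function make_download_cache(max_age_ms, download) {
    const entries = new Map(); // stand-in for persistent storage
    return function get(key, now = Date.now()) {
        const hit = entries.get(key);
        if (hit && now - hit.fetched_at < max_age_ms) {
            return hit.file;            // still fresh, skip the download
        }
        const file = download(key);     // cold or stale, fetch again
        entries.set(key, { file, fetched_at: now });
        return file;
    };
}

// Count how often the (fake) download actually runs.
let downloads = 0;
const cached_get = make_download_cache(60000, key => {
    downloads += 1;
    return `cache/${key}`;
});

cached_get('scene.mib', 0);      // cold: downloads
cached_get('scene.mib', 1000);   // fresh: re-uses the file
cached_get('scene.mib', 61000);  // stale after 60 seconds: downloads again
console.log(downloads); // 2
```

A time based policy is only one option. If your objects change rarely you might instead key the cache on something that identifies the object version and never expire entries at all.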

Another cool aspect of what has been shown in this article is the use of an off-the-shelf JavaScript library to save a lot of time by avoiding implementing functionality that has already been created by others (in this case cryptographic functions). We’ve tried a few other libraries with some great results. Libraries with complex dependencies are trickier to get working but self-contained ones can usually be convinced to work.<\/p>\n\n\n\n

Got another API or JavaScript library you’d like to see us try out? Get in touch<\/a> and tell us more.<\/p>\n","protected":false},"excerpt":{"rendered":"

We’ve had quite a few customers express interest in downloading content from AWS S3 rather than persistently storing it on their RealityServer instances. While this introduces latency, in many use cases it can still make a lot of sense. Our recently released HTTP Request functionality for V8 makes it easy to download content from public […]<\/p>\n","protected":false},"author":2,"featured_media":3625,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"spay_email":"","jetpack_publicize_message":"How to use authenticated AWS S3 requests to download content.","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[15,11,30],"tags":[44,14,46,27,34],"jetpack_featured_media_url":"https:\/\/www.migenius.com\/migenius\/wp-content\/uploads\/2020\/02\/aws_s3_feature.jpg","_links":{"self":[{"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/posts\/3623"}],"collection":[{"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/comments?post=3623"}],"version-history":[{"count":2,"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/posts\/3623\/revisions"}],"predecessor-version":[{"id":3626,"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/posts\/3623\/revisions\/3626"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/media\/3625"}],"wp:attachment":[{"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/media?parent=3623"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/categories?post=3623"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.migenius.com\/wp-json\/wp\/v2\/tags?post=3623"}],"curies"
:[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}