Amazon S3

Introduction

Amazon Simple Storage Service (Amazon S3) is an object storage service for collecting, storing, and analyzing data. Offering management features to organize data, it stores and protects data for web and mobile applications, IoT, and Big Data.

Supported actions

For your Amazon S3 account, YAP supports the following actions:

  • Write files

  • Write bulk files

  • Read files

  • Delete files

Write files

Fields available for the Write files action on the Amazon S3 connector:

Field         Type
Bucket        String
FileName      String
Value         String
Content-type  String
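As an illustration only, the four fields of this action could be assembled as shown here. The build_write_request helper, the dictionary layout, and the rule that Bucket and FileName are required are all hypothetical, since the actual request format YAP uses is not described on this page. The only library call is mimetypes.guess_type from the Python standard library, which suggests a Content-type from the file name.

```python
import mimetypes

def build_write_request(bucket, file_name, value, content_type=None):
    """Hypothetical helper: collect the Write files fields as strings."""
    # Assumption: Bucket and FileName must not be empty.
    if not bucket or not file_name:
        raise ValueError("Bucket and FileName are required")
    if content_type is None:
        # Guess from the file extension; fall back to a generic binary type.
        guessed, _ = mimetypes.guess_type(file_name)
        content_type = guessed or "application/octet-stream"
    return {
        "Bucket": bucket,
        "FileName": file_name,
        "Value": value,
        "Content-type": content_type,
    }

request = build_write_request("my-bucket", "reports/summary.json", '{"ok": true}')
print(request["Content-type"])
```

Passing content_type explicitly skips the guess, which is useful when the file name has no extension.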

Write bulk files

Fields available for the Write bulk files action on the Amazon S3 connector:

Field         Type
Values        String

Read files

Fields available for the Read files action on the Amazon S3 connector:

Field         Type
Bucket        String
Path          String

Delete files

Fields available for the Delete files action on the Amazon S3 connector:

Field         Type
Bucket        String
Path          String

Setup requirements

Before setup, check that:

  • Your S3 bucket contains files with supported encodings and file types.

  • You have granted YAP the following permissions:

    • to send notifications to your Amazon S3 account,

    • to read from the bucket (the object container).

    To do this, sign in to your AWS account and authorize the permissions requested by YAP.

When you specify a bucket or an object by name in Amazon S3, remember that names are case sensitive and must match exactly.

Configuration options

The configuration options below let you select subsets of your folders, specific files, and more, so that you sync only the files you need in your warehouse. Setting up multiple S3 connectors that target the same bucket with different options lets you explore a bucket in any way you like.

Folder path

The folder path specifies the section of the bucket in which YAP looks for files. YAP searches the specified folder and all of its nested subfolders for files it can upload. If no folder path (prefix) is specified, YAP searches the entire bucket for files to synchronize.
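The effect of the folder path can be sketched as a simple prefix filter over object keys. This is a minimal illustration, not YAP's implementation: the keys_under_folder function and the sample keys are made up, and it assumes that the folder path is compared against the start of each key with "/" as the separator.

```python
def keys_under_folder(keys, folder_path=""):
    """Return the keys under folder_path, or every key if no path is given."""
    if not folder_path:
        return list(keys)
    # Append "/" so that "logs" does not also match "logs-old/...".
    prefix = folder_path.rstrip("/") + "/"
    return [key for key in keys if key.startswith(prefix)]

keys = ["logs/2018/a.csv", "logs/errors/b.log", "logs-old/c.csv", "readme.txt"]
print(keys_under_folder(keys, "logs"))  # ['logs/2018/a.csv', 'logs/errors/b.log']
print(len(keys_under_folder(keys)))     # 4
```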

File pattern

The file pattern is a regular expression that YAP uses to decide whether to sync certain files. It applies to everything under the prefix. For instance, suppose that under the prefix logs you have three folders: 2018, 2019, and errors. Using the pattern \d\d\d\d/.*, you can exclude all the files in the errors folder, because \d\d\d\d matches the four-digit folder names and .* matches the files under them. If you're not sure what regular expression to use, leave this field blank, and YAP will synchronize everything under the prefix. If you're feeling particularly bold, you can learn to write your own regular expression.
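To try a pattern before using it, you can test it locally against a few sample paths. The sketch below uses Python's re module and assumes that the pattern has to match the whole path relative to the prefix. That matching rule, the matches_file_pattern function, and the sample keys are assumptions for illustration; YAP's regular expression engine may behave differently.

```python
import re

def matches_file_pattern(key, prefix, pattern):
    """Assumed rule: the pattern must match the full path below the prefix."""
    if not key.startswith(prefix):
        return False
    relative = key[len(prefix):].lstrip("/")
    if not pattern:
        # A blank pattern keeps everything under the prefix.
        return True
    return re.fullmatch(pattern, relative) is not None

keys = ["logs/2018/jan.csv", "logs/2019/feb.csv", "logs/errors/crash.log"]
selected = [k for k in keys if matches_file_pattern(k, "logs", r"\d\d\d\d/.*")]
print(selected)  # ['logs/2018/jan.csv', 'logs/2019/feb.csv']
```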

Archive folder pattern

If archive (TAR or ZIP) folders contain multiple files, you can use the archive folder pattern to filter those as well. For example, the archive folder pattern .*json synchronizes only those files in an archive folder whose names end in json, such as files with a .json extension.
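The same idea can be tried locally on a ZIP archive with Python's zipfile module. This is a sketch under assumptions: the select_archive_members function is hypothetical, the archive is built in memory for the demonstration, and the pattern is assumed to match the full member name.

```python
import io
import re
import zipfile

def select_archive_members(archive_bytes, pattern):
    """Return the names of ZIP members whose full name matches the pattern."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [name for name in archive.namelist() if re.fullmatch(pattern, name)]

# Build a small ZIP archive in memory for the demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("events.json", "{}")
    archive.writestr("events.csv", "a,b\n1,2\n")
    archive.writestr("nested/users.json", "[]")

print(select_archive_members(buffer.getvalue(), r".*json"))
# ['events.json', 'nested/users.json']
```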

File type

The file type is used to let YAP know that even files without a file extension ought to be parsed as this file type. For example, if you have an automated CSV output system that saves files without a CSV extension, you can specify the CSV type and YAP will synchronize them as CSVs. Selecting "infer" will let YAP infer from a file's extension (.csv, .tsv, .json, .avro, or .log) what to synchronize. If you do choose a file type, every file examined by YAP will be interpreted as the file type you select, so make sure everything YAP synchronizes has the same file type.
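The choice between "infer" and a fixed file type can be pictured as follows. The resolve_file_type function and the INFERABLE_TYPES table are hypothetical names; the list of extensions comes from the paragraph above, and exact, case-sensitive extension matching is an assumption.

```python
import os

# Extensions that "infer" recognizes, as listed above.
INFERABLE_TYPES = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".json": "json",
    ".avro": "avro",
    ".log": "log",
}

def resolve_file_type(file_name, file_type="infer"):
    """Return the type used to parse file_name, or None if it cannot be inferred."""
    if file_type != "infer":
        # A fixed file type applies to every file, whatever its extension.
        return file_type
    extension = os.path.splitext(file_name)[1]
    return INFERABLE_TYPES.get(extension)

print(resolve_file_type("export_2019", file_type="csv"))  # csv
print(resolve_file_type("events.json"))                   # json
print(resolve_file_type("export_2019"))                   # None
```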

Escape character (optional)

CSVs have a special rule for escaping quotation marks as opposed to other characters – they require two consecutive double quotes to represent an escaped double quote. However, some CSV generators do not follow this rule and use other characters like backslash for escaping. Only use this field if you are sure your CSVs have a different escape character.
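The difference between the two escaping styles can be reproduced with Python's csv module. This only illustrates the concept and says nothing about how YAP parses files; parse_csv is a hypothetical helper, while escapechar and doublequote are real csv.reader options.

```python
import csv
import io

def parse_csv(text, escape_char=None):
    """Parse CSV text, optionally with a non-standard escape character."""
    if escape_char is None:
        # Standard rule: two consecutive double quotes mean one quote.
        reader = csv.reader(io.StringIO(text))
    else:
        # Non-standard rule: escape_char precedes a literal quote.
        reader = csv.reader(io.StringIO(text), escapechar=escape_char, doublequote=False)
    return list(reader)

standard = '"She said ""hi""",1\n'
backslash = '"She said \\"hi\\"",1\n'

print(parse_csv(standard))                     # [['She said "hi"', '1']]
print(parse_csv(backslash, escape_char="\\"))  # [['She said "hi"', '1']]
```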

Null sequence (optional)

CSV has no native notion of a null character. However, some CSV generators have created one, using characters such as \N to represent null. Note: text is un-escaped before the null sequence is matched, so don't use the escape character in your null sequence. Only use this field if you are sure your CSVs have a null sequence.
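A minimal sketch of null sequence handling, again with Python's csv module rather than YAP's own parser. The read_rows function is hypothetical. The second call shows why the escape character should not appear in the null sequence: once the text is un-escaped, \N has already become N and no longer matches.

```python
import csv
import io

def read_rows(text, null_sequence=None, escape_char=None):
    """Parse CSV text and replace cells equal to null_sequence with None."""
    reader = csv.reader(io.StringIO(text), escapechar=escape_char)
    rows = []
    for row in reader:
        rows.append([
            None if null_sequence is not None and cell == null_sequence else cell
            for cell in row
        ])
    return rows

sample = "id,name\n1,\\N\n2,Ada\n"
print(read_rows(sample, null_sequence="\\N"))
# [['id', 'name'], ['1', None], ['2', 'Ada']]

# With backslash as the escape character, \N is un-escaped to N first.
print(read_rows("1,\\N\n", null_sequence="\\N", escape_char="\\"))
# [['1', 'N']]
```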

Delimiter (optional)

The delimiter is the character used in CSV files to separate one field from the next. If you leave this field blank, YAP infers the delimiter for each file, so files with different delimiters can be stored in the same folder without problems. If you set a delimiter, all CSV files in your search path are parsed with it.
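How YAP infers a delimiter is not documented here. One simple heuristic, shown purely as an illustration, is to pick the candidate character that appears the same non-zero number of times on every line. The infer_delimiter function and the candidate list are assumptions, and the heuristic ignores quoted fields.

```python
CANDIDATE_DELIMITERS = [",", "\t", ";", "|"]

def infer_delimiter(sample, candidates=CANDIDATE_DELIMITERS):
    """Return the candidate with a consistent, non-zero count on every line."""
    lines = [line for line in sample.splitlines() if line]
    best, best_count = None, 0
    for delimiter in candidates:
        counts = {line.count(delimiter) for line in lines}
        if len(counts) == 1:
            count = counts.pop()
            # Prefer the delimiter that splits each line into the most fields.
            if count > best_count:
                best, best_count = delimiter, count
    return best

print(repr(infer_delimiter("a;b;c\n1;2;3\n")))  # ';'
print(repr(infer_delimiter("a\tb\n1\t2\n")))    # '\t'
print(repr(infer_delimiter("no delimiters")))   # None
```

Python's csv.Sniffer class offers a more thorough heuristic if you need to handle quoted fields.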

Connection

The Amazon S3 connector uses the following versions:

  • Amazon S3 REST API, version 2006-03-01

  • AWS Signature Version 4 to authenticate to Amazon S3
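For background, Signature Version 4 signs each request with a key derived from the secret access key through a chain of HMAC-SHA256 operations. The sketch below shows only that key derivation step, using the Python standard library. The function names and the sample secret, date, and region are placeholders, and the rest of the signing process (canonical request and string to sign) is omitted. YAP performs this for you.

```python
import hashlib
import hmac

def _hmac_sha256(key, message):
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()

def derive_signing_key(secret_access_key, date_stamp, region, service="s3"):
    """Derive the Signature Version 4 signing key (date_stamp is YYYYMMDD)."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")

# Placeholder values, not real credentials.
signing_key = derive_signing_key("EXAMPLE-SECRET-KEY", "20240101", "us-east-1")
print(signing_key.hex())
```

Because the date, region, and service are part of the derivation, a signing key is only valid for one day, one region, and one service.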

The Amazon S3 connector connects by using an access key. This is the simplest way to connect to Amazon S3. You need to provide the access key of an IAM User in your AWS account. To learn how to create an IAM User, refer to the AWS documentation.

YAP performs operations in your Amazon S3 account as this IAM User. To use the full set of triggers and actions, the IAM User should have list/read/write permissions on the specific buckets and folders.
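A policy that grants list/read/write access to a single bucket or folder might look like the one generated below. This is a sketch, not a policy supplied by YAP: the build_bucket_policy function and the bucket and folder names are placeholders, and s3:DeleteObject is included on the assumption that you also want to use the Delete files action. The action names and ARN formats are standard IAM syntax.

```python
import json

def build_bucket_policy(bucket, folder=""):
    """Build an IAM policy document limited to one bucket (and optional folder)."""
    object_path = f"{folder.strip('/')}/*" if folder else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Read, write, and delete apply to objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{object_path}",
            },
        ],
    }

policy = build_bucket_policy("my-bucket", "logs")
print(json.dumps(policy, indent=2))
```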

For more information, see the Amazon S3 documentation.

Creating an IAM role for YAP

1) In the AWS console, under your username, open the drop-down menu and select My Security Credentials.

2) Select Roles > Create role.

3) Select Another AWS Account. Ask your YAP manager (support@youngapp.co) to provide you with the Account ID and External ID. Enter YAP's AWS Account ID, select the Require external ID box, and enter the External ID. Record the External ID, because you will also need it in the connection settings when creating an Amazon S3 connection with YAP. Click Next: Permissions.

4) Select appropriate permissions for YAP to run automations in your Amazon S3 account. At a minimum, YAP should have List/Read/Write access to the specific buckets or folders. In this tutorial, we select AmazonS3FullAccess.
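Behind the scenes, choosing Another AWS Account with an external ID produces a trust policy on the role. The sketch below builds a document of that shape so that you can recognize it when reviewing the role. The build_trust_policy function is hypothetical, and the account ID and external ID shown are placeholders; use the values provided by your YAP manager.

```python
import json

def build_trust_policy(yap_account_id, external_id):
    """Build a role trust policy that requires the given external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The trusted account that is allowed to assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{yap_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The role can only be assumed when this external ID is supplied.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }

# Placeholder values for illustration only.
trust_policy = build_trust_policy("123456789012", "example-external-id")
print(json.dumps(trust_policy, indent=2))
```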

Questions? We're always happy to help with any issues you might have. Send an email to support@youngapp.co or request a demo from our sales team.