The **Arduino AWS S3 CSV Exporter** is designed to extract time series data from **Arduino Cloud** and publish it to an **AWS S3** bucket in CSV format.
A scheduled AWS Lambda function manages the data extraction process, running at configurable intervals. The extraction frequency, sampling resolution and filters can be customized to refine the data stored in S3.
At the end of this tutorial, the stack will be configured to extract data from Arduino Cloud every hour, aggregate samples at a five minute resolution and store structured CSV files in AWS S3. The setup will also allow filtering by tags to include only specific data, providing a scalable and structured approach to managing cloud connected device data and ensuring easy retrieval and long term storage.
## Goals
* Set up the required AWS S3 bucket and deploy resources using CloudFormation.
* Understand the functionality of the Arduino AWS S3 CSV Exporter.
* Configure and deploy the Lambda function for automated data extraction using the Arduino AWS S3 CSV Exporter’s pre-defined template.
* Apply filters and resolution settings to optimize data aggregation.
* Use CloudFormation templates to simplify deployment and configuration.
* Learn how Lambda, CloudWatch and EventBridge help monitor the deployed CloudFormation stack.
## Required Software
## How It Works
The **Arduino AWS S3 CSV Exporter** extracts time series data from **Arduino Cloud** and publishes it to an **AWS S3** bucket. Data extraction is handled by an AWS Lambda function written in **Go**, which runs at scheduled intervals using **AWS EventBridge**.
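As an illustration of how such a scheduled trigger is expressed, an hourly EventBridge schedule corresponds to a rate expression like the one below; the exact expression used by the exporter's CloudFormation template is an assumption here:

```shell
# Default hourly trigger expressed as an EventBridge rate expression
# (assumed; check the CloudFormation template for the exact value).
SCHEDULE="rate(1 hour)"
echo "$SCHEDULE"

# A cron-style equivalent firing at the top of every hour would be:
# cron(0 * * * ? *)
```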
Each function execution retrieves data from the selected **Arduino Things** and generates a CSV file. The file is then uploaded to **S3** for structured storage and accessibility.
Data is extracted every hour by default, with samples aggregated at a 5 minute resolution. Both the extraction period and the aggregation rate are configurable. Aggregation is performed by calculating the average over the aggregation period, while non-numeric values, such as strings, are sampled at the specified resolution.
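The exporter performs this averaging internally. Purely as a local illustration of the behavior, the sketch below buckets made-up per-minute samples into 5 minute windows and averages them; the timestamps, values and bucketing logic are assumptions, not the exporter's actual implementation:

```shell
# Made-up per-minute samples as "timestamp,value" lines.
samples='2024-01-01T10:00:00Z,10
2024-01-01T10:01:00Z,20
2024-01-01T10:04:00Z,30
2024-01-01T10:05:00Z,40
2024-01-01T10:07:00Z,50'

# Average numeric samples into 5-minute buckets keyed by the truncated minute.
aggregated=$(printf '%s\n' "$samples" | awk -F, '
{
  # Minute field sits at characters 15-16 of an ISO-8601 UTC timestamp.
  minute = substr($1, 15, 2) + 0
  bucket = sprintf("%s%02d", substr($1, 1, 14), int(minute / 5) * 5)
  sum[bucket] += $2; count[bucket]++
}
END {
  for (b in sum) printf "%s:00Z,%g\n", b, sum[b] / count[b]
}' | sort)

echo "$aggregated"
# 2024-01-01T10:00:00Z,20
# 2024-01-01T10:05:00Z,45
```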
Time-series data is exported in **UTC** by default. All Arduino Things in the account are included in the export unless filtered using [**tags**](#tag-filtering).
This setup provides a structured and scalable approach for managing time series data from connected devices, with configurable parameters such as sampling intervals and data filtering.
## AWS Account & CloudFormation Template
An active AWS account is required to deploy the **Arduino AWS S3 CSV Exporter**. If an account is not available, refer to the [online AWS documentation](https://docs.aws.amazon.com/iot/latest/developerguide/setting-up.html) for account setup. The following steps can help you get started:
- [Sign up for an AWS account](https://docs.aws.amazon.com/iot/latest/developerguide/setting-up.html#aws-registration)
- [Create an administrative user](https://docs.aws.amazon.com/iot/latest/developerguide/setting-up.html#create-an-admin)
The exporter setup involves deploying resources using a [**CloudFormation template**](https://github.com/arduino/aws-s3-integration/blob/0.3.0/deployment/cloud-formation-template/deployment.yaml). This template provisions and configures the necessary AWS resources automatically.
CloudFormation requires the following **IAM permissions** to automatically provision and manage the AWS resources used in this deployment.
* Parameter management in SSM (policy: `AmazonSSMFullAccess`)
These permissions allow CloudFormation to create and manage the required resources automatically. The stack will deploy an AWS Lambda function, configure an EventBridge rule to trigger executions and set up S3 buckets for data storage.
## S3 Buckets (Pre-Requisite)
Before continuing with the CloudFormation stack deployment, two **S3** buckets need to be created:
- **Temporary bucket**: Stores the Lambda binaries and the **CloudFormation template (CFT)** required for deployment.
- **CSV destination bucket**: This is the storage location for all generated CSV files. This bucket must be created in the same AWS region where the CloudFormation stack will be deployed.
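For reference, the same two buckets can also be created from the AWS CLI. The bucket names and region below are examples only; the commands are shown commented out because they require AWS credentials, and the console flow described next achieves the same result:

```shell
# Example names and region -- replace with your own values.
REGION="us-east-1"
TEMP_BUCKET="lambdas3binaries"
CSV_BUCKET="csvdests3int"

# aws s3api create-bucket --bucket "$TEMP_BUCKET" --region "$REGION"
# aws s3api create-bucket --bucket "$CSV_BUCKET" --region "$REGION"
# Outside us-east-1, add: --create-bucket-configuration LocationConstraint="$REGION"
echo "$TEMP_BUCKET $CSV_BUCKET"
```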
### Creating S3 Bucket
To create the **temporary bucket** and the **CSV destination bucket**, navigate to **Amazon S3** or search for **S3** in the AWS Management Console. Click on **Create bucket** to begin the setup.
During an S3 bucket creation, several configuration options will be presented:
- General configuration
- Object ownership
- Default encryption
- Advanced settings
Each configuration option is briefly explained within the S3 bucket creation process.
For this integration, the key configuration is the bucket name, and the bucket type is set to **General purpose** under the **General configuration** section.
Other settings can remain at their **default values** unless specific customizations are needed.
After defining the required settings, proceed to **Submit** the bucket creation. Once successfully created, the bucket will be listed under **General purpose buckets**.
This process creates the **temporary bucket** to store the **Lambda binaries** and the **CloudFormation template (CFT)**. The assigned bucket name is **lambdas3binaries** in this example.
Please download the binaries and the CFT file, then upload them to the **lambdas3binaries** bucket.
To upload the files, navigate to **Amazon S3** and open the **`lambdas3binaries`** bucket. The available options for managing the bucket will be displayed:
Select the **Upload** option within the **Objects** panel.
Manually upload the required files by either using the file browser to select the `.zip` and `.yaml` files or dragging and dropping them into the designated upload area. Once the files are recognized, the screen should resemble the following image:
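The same upload can also be scripted with the AWS CLI. The file names below are placeholders for the actual `.zip` and `.yaml` release assets downloaded earlier, and the commands are commented out because they require AWS credentials:

```shell
# Placeholder bucket name -- adjust to match your temporary bucket.
TEMP_BUCKET="lambdas3binaries"
URI="s3://${TEMP_BUCKET}/"
echo "$URI"

# aws s3 cp arduino-s3-integration-lambda.zip "$URI"   # placeholder file name
# aws s3 cp deployment.yaml "$URI"                     # placeholder file name
```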
A second bucket needs to be created following the same process as the [Temporary bucket](#temporary-bucket). This bucket will be the **CSV destination bucket**, where all generated CSV files will be stored. It is important to make sure this bucket is created in the same AWS region where the CloudFormation stack will be deployed.
Navigate to the **Amazon S3** service and select **Create bucket**. In the bucket creation window, specify the bucket name and check that the same AWS region is selected.
Keep the recommended default settings for **Object Ownership** and **Public Access** to maintain security compliance. Once all settings are verified, proceed with the bucket creation.
After the bucket has been successfully created, it will be listed among the available S3 buckets. Select the newly created **CSV destination bucket** to proceed with additional configurations if necessary.
During folder creation, options for **server-side encryption** will be displayed for data protection. By default, encryption settings are inherited from the bucket's global configuration.
A **custom encryption key** can be specified before finishing the folder creation.
Once the folder is created, it will be displayed under the **Objects** tab of the **CSV destination bucket**. This ensures that all exported CSV files are well organized within the dedicated bucket.
## CloudFormation Stack
### Preparing CloudFormation Stack
The CloudFormation stack is deployed using a [predefined template](https://github.com/arduino/aws-s3-integration/releases). This process involves specifying the required parameters and selecting the appropriate template source.
Navigate to the **AWS CloudFormation** service and select **Create stack**.
This **Object URL** needs to be provided in the **Amazon S3 URL** field when creating the stack.
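The Object URL follows the standard virtual-hosted S3 pattern. As a sketch with example names (bucket, region and key here are placeholders), it can be composed as follows:

```shell
# Example bucket, region and key -- S3 object URLs follow the pattern
# https://<bucket>.s3.<region>.amazonaws.com/<key>.
BUCKET="lambdas3binaries"
REGION="us-east-1"
KEY="deployment.yaml"
OBJECT_URL="https://${BUCKET}.s3.${REGION}.amazonaws.com/${KEY}"
echo "$OBJECT_URL"
```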
Proceed with the stack creation by following the steps. The configuration requires specifying parameters, including the Arduino API key and secret, the S3 bucket for code storage and the CSV destination bucket.
Configure the following required parameters before proceeding with stack creation:
* Arduino API key
* Arduino API secret
* S3 bucket for code storage
* CSV destination bucket
***For more information about Arduino Cloud API, please refer to the [APIs Overview](https://docs.arduino.cc/arduino-cloud/api/api-overview/) or [Arduino Cloud API from Getting started with Arduino Cloud for Business](https://docs.arduino.cc/arduino-cloud/business/arduino-cloud-for-business/#arduino-cloud-api).***
You can also configure optional parameters like **tag filters**, **organization ID (Space ID for Arduino Cloud)** and **data resolution settings**.
In the **Specify stack details** step, provide a stack name and enter the necessary parameters.
The **`csvdests3int`** bucket is the designated location where the CSV files will be stored.
The **`LambdaCodeS3Bucket`** refers to the bucket containing the Lambda function ZIP file.
Specify the corresponding API key and secret in the `IotApiKey` and `IotApiSecret` fields.
Additional parameters include scheduling execution frequency, resolution settings and optional filters. These settings define how often data is exported and the aggregation method applied to collected data.
Once all parameters are filled in, proceed to the review stage. This allows you to verify the stack configuration before finishing the deployment.
The following animation shows the final review stage, which summarizes all stack parameters before starting the deployment process. The review screen confirms the selected CloudFormation template, stack name and all defined configuration parameters.
To export specific Arduino Things from the Arduino Cloud, **tag filtering** is applied.
**Tags** can be added in Arduino Cloud under the **Metadata** section of each device, which is referred to as a **Thing**.
Click on **ADD** to define a tag by specifying a **key** and its **value**.
During CloudFormation stack creation, tag filters are configured using:
```bash
/arduino/s3-exporter/{stack-name}/iot/filter/tags
```
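For example, with a hypothetical stack name the full parameter path resolves as shown below. The `key=value` format for the tag filter value is an assumption and should be checked against the exporter's documentation:

```shell
# Hypothetical stack name -- substitute the name chosen during stack creation.
STACK_NAME="arduino-s3-csv-exporter"
PARAM="/arduino/s3-exporter/${STACK_NAME}/iot/filter/tags"
echo "$PARAM"

# Setting the filter from the CLI would then look like (requires AWS credentials):
# aws ssm put-parameter --name "$PARAM" --value "environment=production" \
#   --type String --overwrite
```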
After confirming the stack creation, AWS CloudFormation will begin deploying the required resources.
The **Stacks** section displays the newly created stack and its status. At this stage, the status appears as **`CREATE_IN_PROGRESS`**, indicating that AWS is actively provisioning resources.
The **Events - updated** tab logs real time updates for each resource creation. The status **`CREATE_IN_PROGRESS`** is shown alongside timestamps and event details, allowing visibility of the deployment process.
Once all resources are successfully deployed, the **stack status** updates to **`CREATE_COMPLETE`**. This confirms that the deployment is finished without errors.
The stack is now ready for operation, with AWS S3 integrated with Arduino Cloud and automated CSV data export in place.
## AWS S3 CSV Exporter Result
Once the CloudFormation stack is successfully deployed, the AWS S3 CSV Exporter runs on the configured execution schedule. Based on this configuration, the [**Lambda function**](#lambda) is triggered every hour, retrieving data from the relevant Arduino Cloud Things tagged with the appropriate metadata key.
This process allows only the selected Arduino Cloud Things to export data to the generated CSV files, which are then stored in an AWS S3 bucket for further processing, retrieval or integration with other services.
### CSV File Storage and Organization
The generated CSV files are stored in the `csvdests3int` S3 bucket. Within this bucket, files are structured in a date-based hierarchy for organized storage and easy access. Each folder corresponds to a specific date and within those folders, CSV files are named according to their respective timestamps.
The top-level structure of the `csvdests3int` bucket appears as follows:
CSV files are stored inside date-specific folders within the `csvdests3int` S3 bucket. These folders are named according to the extraction date and within them, CSV files are organized by timestamp. This structure provides a chronological view of the exported data.
It also helps with data retrieval, processing, and analysis, particularly when exporting multiple data sets over extended periods:
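As an illustration only (the exporter's exact file naming convention may differ), a date-plus-timestamp object key of this kind can be derived from an extraction timestamp as follows:

```shell
# Example extraction timestamp -- derive an assumed date-based object key.
TS="2024-03-15T10:00:00Z"
DATE_PART=$(printf '%s' "$TS" | cut -c1-10)               # 2024-03-15
TIME_PART=$(printf '%s' "$TS" | cut -c12-16 | tr ':' '-') # 10-00
KEY="${DATE_PART}/${DATE_PART}-${TIME_PART}.csv"
echo "$KEY"
```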
To view a specific CSV file, navigate to its **object details page** within the S3 bucket. You can access metadata such as file size, storage class, last modified timestamp and the AWS S3 URI for automated access here:
The CSV files can be downloaded directly from the object view or by selecting them from the list of objects within the bucket. This provides methods to analyze data locally, integrate it into external workflows or visualize trends.
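Downloads can also be scripted with the AWS CLI; the object key below is a made-up example and the command is commented out because it requires AWS credentials:

```shell
# Example bucket and object key -- the actual key depends on the extraction date.
BUCKET="csvdests3int"
KEY="2024-03-15/2024-03-15-10-00.csv"
URI="s3://${BUCKET}/${KEY}"
echo "$URI"

# Download for local analysis:
# aws s3 cp "$URI" ./export.csv
```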
### CSV File Format and Data Structure
The exported CSV files follow a standardized column based structure, ensuring consistency across all data sets. Each row represents a data sample from a specific Arduino Cloud Thing, including timestamp, thing ID, property values and aggregation type:
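As a quick illustration of working with such a file locally, the snippet below sanity-checks a made-up export. The column names are assumptions based on the fields described above, not the exporter's exact header:

```shell
# Made-up sample matching the described fields (timestamp, thing ID,
# property value, aggregation type). Column names are assumptions.
csv='timestamp,thing_id,property_name,value,aggregation_type
2024-03-15T10:00:00Z,abc-123,temperature,21.4,AVG
2024-03-15T10:05:00Z,abc-123,temperature,21.9,AVG'

# Extract the header and count the data rows, e.g. to sanity-check an export.
header=$(printf '%s\n' "$csv" | head -n1)
rows=$(printf '%s\n' "$csv" | tail -n +2 | wc -l | tr -d ' ')
echo "$header"
echo "$rows"
```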
With the stack successfully deployed, AWS S3 and Arduino Cloud are now connected. Data extraction will follow the defined schedule, storing CSV files in the designated S3 bucket.
## Lambda, CloudWatch & EventBridge
Once the CloudFormation stack has been deployed and the CSV destination bucket begins filling every hour, there are three useful tools for monitoring the stack: Lambda, CloudWatch and EventBridge.
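As a starting point, the stack and its Lambda function can also be checked from the CLI. The stack name and log group below are assumptions (Lambda log groups conventionally follow `/aws/lambda/<function-name>`), and the commands are commented out because they require AWS credentials:

```shell
# Hypothetical stack name -- substitute the name chosen during stack creation.
STACK_NAME="arduino-s3-csv-exporter"
echo "$STACK_NAME"

# Stack status (expect CREATE_COMPLETE after a successful deployment):
# aws cloudformation describe-stacks --stack-name "$STACK_NAME" \
#   --query 'Stacks[0].StackStatus' --output text

# Tail recent Lambda logs via CloudWatch (log group name is an assumption):
# aws logs tail "/aws/lambda/${STACK_NAME}" --since 1h
```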