The Weather Platform uses Amazon S3 as a simple data lake to store IoT messages from weather sensors. This is the primary storage for raw sensor data.
The data lake is an S3 bucket named itea-weather-data-lake-storage that stores the raw IoT sensor messages and the AWS Glue ETL scripts.
Click Create bucket to finish.
The data lake is organized with the following folder structure:
itea-weather-data-lake-storage/
├── raw-data/                          # Raw IoT sensor messages
│   └── weatherPlatform/               # Weather platform data
│       └── telemetry/                 # Telemetry data organized by location
│           ├── {location-1}/          # e.g., "itea-lab-room-a"
│           │   └── sensor-data.json
│           ├── {location-2}/          # e.g., "outdoor-station-1"
│           │   └── sensor-data.json
│           └── {location-n}/          # Additional locations
│               └── sensor-data.json
└── glue-scripts/                      # AWS Glue ETL scripts
    └── weather-transform.py           # Data transformation script
When IoT devices send weather data, the messages are stored in:
raw-data/weatherPlatform/telemetry/{location}/
Where {location} is the specific location identifier (e.g., itea-lab-room-a, outdoor-station-1).
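The path convention above can be sketched as a small TypeScript helper. This is only an illustration: the function names telemetryPrefix and sensorDataKey are hypothetical and are not part of the Amplify backend, and the sensor-data.json file name is taken from the folder structure shown earlier.

```typescript
// Prefix layout taken from the folder structure described in this section.
const TELEMETRY_ROOT = "raw-data/weatherPlatform/telemetry";

// Hypothetical helper: returns the S3 key prefix for one location.
function telemetryPrefix(locationId: string): string {
  // Reject empty identifiers and identifiers containing a slash,
  // because a slash would introduce an extra folder level.
  if (locationId.length === 0 || locationId.indexOf("/") !== -1) {
    throw new Error(`Invalid location identifier: "${locationId}"`);
  }
  return `${TELEMETRY_ROOT}/${locationId}/`;
}

// Hypothetical helper: full object key for a location's sensor file.
function sensorDataKey(locationId: string): string {
  return `${telemetryPrefix(locationId)}sensor-data.json`;
}

console.log(telemetryPrefix("itea-lab-room-a"));
// raw-data/weatherPlatform/telemetry/itea-lab-room-a/
console.log(sensorDataKey("outdoor-station-1"));
// raw-data/weatherPlatform/telemetry/outdoor-station-1/sensor-data.json
```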
Critical: S3 bucket names must be globally unique across all AWS accounts. Append your name or a timestamp to make the name unique, for example: itea-weather-data-lake-storage-yourname
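As a rough sketch of the naming advice, the TypeScript below appends a suffix to the base name and checks the result against a subset of the S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, and hyphens; starting and ending with a letter or digit). The helper names are hypothetical, dots are deliberately excluded even though S3 permits them, and only S3 itself can confirm at creation time that a name is actually unused.

```typescript
const BASE_BUCKET_NAME = "itea-weather-data-lake-storage";

// Checks a subset of the S3 naming rules: 3-63 characters, lowercase letters,
// digits, and hyphens only, starting and ending with a letter or digit.
function isPlausibleBucketName(candidate: string): boolean {
  return /^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$/.test(candidate);
}

// Hypothetical helper: appends a suffix (your name or a timestamp) to the base name.
function uniqueBucketName(suffix: string): string {
  // Lowercase the suffix, turn runs of other characters into single hyphens,
  // and trim hyphens from both ends.
  const cleaned = suffix
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  const candidate = `${BASE_BUCKET_NAME}-${cleaned}`;
  if (cleaned.length === 0 || !isPlausibleBucketName(candidate)) {
    throw new Error(`Cannot build a valid bucket name from suffix "${suffix}"`);
  }
  return candidate;
}

console.log(uniqueBucketName("yourname")); // itea-weather-data-lake-storage-yourname
console.log(uniqueBucketName("Jane Doe")); // itea-weather-data-lake-storage-jane-doe
```

For a timestamp suffix, a call such as uniqueBucketName(String(Date.now())) would work under the same assumptions.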
Backend Code Update Required: The Amplify backend code has hardcoded bucket names that must be updated to match your unique bucket name.
This data lake bucket is separate from the processed dataset bucket. Raw IoT sensor data flows into this bucket, while processed datasets are stored in a different bucket.
After creating the bucket under your unique name, you must update the hardcoded references in the Amplify backend:
Files to Update:
amplify/backend.ts (Lines 233, 272, 273, 318):
// Change from:
sourceBucketName: "itea-weather-data-lake-storage";
// To your unique name:
sourceBucketName: "itea-weather-data-lake-storage-yourname";
amplify/functions/getTotalReadings/handler.ts (Line 16):
// Change from:
const bucket = "itea-weather-data-lake-storage";
// To your unique name:
const bucket = "itea-weather-data-lake-storage-yourname";
amplify/custom/WeatherDataGlue/resource.ts (Lines 65, 66, 195):
// Change all S3 ARNs and paths from:
arn:aws:s3:::itea-weather-data-lake-storage
s3://itea-weather-data-lake-storage/glue-scripts/
// To your unique name:
arn:aws:s3:::itea-weather-data-lake-storage-yourname
s3://itea-weather-data-lake-storage-yourname/glue-scripts/
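The ARN and script path above both follow directly from the bucket name, which the following sketch makes explicit. The DataLakeRefs interface and dataLakeRefs function are hypothetical names used only for illustration; they do not appear in resource.ts.

```typescript
// Hypothetical shape for the two values that depend on the bucket name.
interface DataLakeRefs {
  bucketArn: string;
  glueScriptsUri: string;
}

// Hypothetical helper: derives the bucket ARN and the Glue scripts location
// from a single bucket name, using the formats shown in this section.
function dataLakeRefs(bucketName: string): DataLakeRefs {
  return {
    bucketArn: `arn:aws:s3:::${bucketName}`,
    glueScriptsUri: `s3://${bucketName}/glue-scripts/`,
  };
}

const exampleRefs = dataLakeRefs("itea-weather-data-lake-storage-yourname");
console.log(exampleRefs.bucketArn);
// arn:aws:s3:::itea-weather-data-lake-storage-yourname
console.log(exampleRefs.glueScriptsUri);
// s3://itea-weather-data-lake-storage-yourname/glue-scripts/
```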
Use Find & Replace in your code editor to quickly update all instances of itea-weather-data-lake-storage to your unique bucket name.
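If you prefer to script the replacement, the logic could look roughly like the sketch below. The replaceBucketName function is a hypothetical helper that operates on a string of source text; reading and writing the actual files is left out. The negative lookahead is a design choice that skips names already carrying a suffix, so running the replacement twice does not append the suffix again.

```typescript
const OLD_BUCKET_NAME = "itea-weather-data-lake-storage";

// Hypothetical helper: replaces every occurrence of the original bucket name
// in a source string. Occurrences followed by a letter, digit, or hyphen
// (for example "...-storage-yourname") are left untouched.
function replaceBucketName(source: string, newName: string): string {
  const pattern = new RegExp(`${OLD_BUCKET_NAME}(?![a-z0-9-])`, "g");
  // A replacer function avoids any special handling of "$" in newName.
  return source.replace(pattern, () => newName);
}

const sampleSource = [
  'const bucket = "itea-weather-data-lake-storage";',
  "arn:aws:s3:::itea-weather-data-lake-storage",
  "s3://itea-weather-data-lake-storage/glue-scripts/",
].join("\n");

console.log(replaceBucketName(sampleSource, "itea-weather-data-lake-storage-yourname"));
```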
With the S3 data lake configured, raw IoT sensor messages are stored under raw-data/weatherPlatform/telemetry/{location}/, and the AWS Glue ETL scripts in glue-scripts/ can process the raw data.