To secure data in transit in AWS, use SSL/TLS for encryption during transmission and implement VPNs or AWS Direct Connect for secure connections. To secure data at rest, use AWS services like S3 Server-Side Encryption, EBS encryption, and RDS encryption, along with IAM policies to control access.
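
As a rough illustration of enforcing encryption in transit, the sketch below builds an S3 bucket policy that denies any request not made over TLS, using the `aws:SecureTransport` condition key. The bucket name and the helper function name are hypothetical, and the policy is only constructed and printed here, not applied to any account.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name, for illustration only


def deny_insecure_transport_policy(bucket: str) -> dict:
    """Build a bucket policy that denies any request not sent over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # aws:SecureTransport is "false" when the request did not use TLS
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy = deny_insecure_transport_policy(BUCKET)
print(json.dumps(policy, indent=2))
```
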

The AWS CLI (AWS Command Line Interface) is a tool for interacting with AWS services from a terminal or a script instead of the web-based console.
Spot Instances are Amazon EC2 instances that run on unused EC2 capacity at a discount relative to On-Demand prices. You no longer bid for them: you pay the current Spot price, optionally capped by a maximum price you set. EC2 can reclaim a Spot Instance with a two-minute interruption notice when it needs the capacity back.
A Public Subnet is a subnet whose route table has a route to an Internet Gateway, so resources in it that have a public IP address can be reached from the internet. A Private Subnet has no route to an Internet Gateway, so its resources cannot be reached directly from the internet; they can still make outbound connections if the subnet routes through a NAT gateway.
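
The distinction comes down to the subnet's route table. The sketch below is a simplified model, not an AWS API call: routes are plain dictionaries, the gateway IDs are made up, and a subnet is treated as public when any route targets an Internet Gateway (IDs prefixed `igw-`). A private subnet may instead route outbound traffic through a NAT gateway (`nat-`).

```python
def is_public_subnet(routes: list[dict]) -> bool:
    """Return True if any route in the table targets an Internet Gateway."""
    return any(route["target"].startswith("igw-") for route in routes)


# Hypothetical route tables; the IDs are illustrative only.
public_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "igw-0abc"},
]
private_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "nat-0def"},
]

print(is_public_subnet(public_routes))
print(is_public_subnet(private_routes))
```
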
Elastic Beanstalk is a platform-as-a-service (PaaS) that simplifies application deployment and management, automatically handling infrastructure provisioning, load balancing, and scaling. CloudFormation, on the other hand, is an infrastructure-as-code (IaC) service that allows you to define and provision AWS resources using templates, giving you more control over the infrastructure setup but requiring more manual configuration.
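
To give a feel for what "defining resources in a template" means, here is a minimal sketch that assembles a CloudFormation template as a Python dictionary and prints it as JSON. The logical ID `ExampleBucket` and the description are hypothetical; the template declares a single versioned S3 bucket and nothing is deployed.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal illustrative template",
    "Resources": {
        # "ExampleBucket" is a hypothetical logical ID chosen for this sketch
        "ExampleBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"},
            },
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

With Elastic Beanstalk you would not write this kind of resource definition yourself; the service provisions the underlying resources for you.
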
**Difference between DWH and Data Mart:**
- A Data Warehouse (DWH) is a centralized repository that stores large volumes of data from multiple sources for analysis and reporting. A Data Mart is a subset of a Data Warehouse, focused on a specific business area or department.
**Difference between Views and Materialized Views:**
- A View is a virtual table that provides a way to present data from one or more tables without storing it physically. A Materialized View, on the other hand, stores the result of a query physically, allowing for faster access at the cost of needing to refresh the data periodically.
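
The trade-off can be sketched with Python's built-in `sqlite3` module. SQLite has no native materialized views, so this example assumes a plain table filled from the view as a stand-in; the table and view names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 100), ('west', 50);

    -- A view stores only the query; it is re-evaluated on every read.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- Stand-in for a materialized view: the query result is stored as a table.
    CREATE TABLE sales_by_region_mat AS SELECT * FROM sales_by_region;
""")


def total(relation: str, region: str) -> int:
    """Read the total for one region from the view or the stored copy."""
    row = conn.execute(
        f"SELECT total FROM {relation} WHERE region = ?", (region,)
    ).fetchone()
    return row[0]


def refresh() -> None:
    """Rebuild the stored copy from the view (a full refresh)."""
    conn.executescript("""
        DELETE FROM sales_by_region_mat;
        INSERT INTO sales_by_region_mat SELECT * FROM sales_by_region;
    """)


conn.execute("INSERT INTO sales VALUES ('east', 25)")
view_total = total("sales_by_region", "east")       # reflects the new row
stale_total = total("sales_by_region_mat", "east")  # still the old result
refresh()
fresh_total = total("sales_by_region_mat", "east")  # up to date after refresh
print(view_total, stale_total, fresh_total)
```
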
**Indexing:**
- Indexing is a database optimization technique that improves the speed of data retrieval operations on a database table. Common indexing techniques include B-tree indexing, hash indexing, and bitmap indexing.
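
A small experiment with Python's built-in `sqlite3` module shows the effect on the query plan. The table, column, and index names are hypothetical, and the exact plan wording varies between SQLite versions, so the sketch only prints the plans rather than assuming their text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT * FROM orders WHERE customer_id = 42"

plan_before = plan(query)  # no index on customer_id yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)   # the planner can now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```
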
Second Normal Form (2NF) is a database normalization level where a table is in First Normal Form (1NF) and all non-key attributes are fully functionally dependent on the entire primary key, meaning there are no partial dependencies on a composite primary key.
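
As a worked example with made-up data, suppose an `order_items` table has the composite key (`order_id`, `product_id`). The column `product_name` depends only on `product_id`, which is a partial dependency. The sketch below splits the rows so that each non-key attribute depends on the whole key of its table.

```python
# Rows keyed by (order_id, product_id); product_name depends on product_id alone.
order_items = [
    {"order_id": 1, "product_id": "P1", "product_name": "Widget", "quantity": 2},
    {"order_id": 1, "product_id": "P2", "product_name": "Gadget", "quantity": 1},
    {"order_id": 2, "product_id": "P1", "product_name": "Widget", "quantity": 5},
]

# New table keyed by product_id alone: removes the partial dependency.
products = {row["product_id"]: row["product_name"] for row in order_items}

# Remaining table: quantity depends on the full (order_id, product_id) key.
order_items_2nf = [
    {
        "order_id": row["order_id"],
        "product_id": row["product_id"],
        "quantity": row["quantity"],
    }
    for row in order_items
]

print(products)
print(order_items_2nf)
```
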
Joins are used to combine rows from two or more tables based on a related column, while scope relationships define how data is related within a single table or between tables in terms of hierarchy or context, often influencing data visibility and access.
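
For the join half of that answer, the following sketch uses Python's built-in `sqlite3` module with two hypothetical tables to contrast an inner join, which keeps only matching rows, with a left join, which keeps every row from the left table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 30.0);
""")

# Inner join: only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Left join: every customer; order columns are NULL where nothing matches.
left_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(inner_rows)
print(left_rows)
```
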
The star schema has a central fact table connected directly to multiple dimension tables, resembling a star shape. The snowflake schema, on the other hand, normalizes dimension tables into multiple related tables, creating a more complex structure that resembles a snowflake.
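
A tiny star schema can be sketched with Python's built-in `sqlite3` module. All table and column names here are invented for illustration: one fact table references two dimension tables directly, and a typical query joins the fact table to each dimension and aggregates. In a snowflake design, `dim_product.category` would move into its own table referenced by `dim_product`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);

    -- Fact table: measures plus a foreign key to each dimension
    CREATE TABLE fact_sales (
        date_id INTEGER REFERENCES dim_date(date_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        amount INTEGER
    );

    INSERT INTO dim_date VALUES (1, 2023), (2, 2024);
    INSERT INTO dim_product VALUES (1, 'toys'), (2, 'books');
    INSERT INTO fact_sales VALUES (1, 1, 10), (1, 2, 20), (2, 1, 5), (2, 1, 7);
""")

sales_by_year_category = conn.execute("""
    SELECT d.year, p.category, SUM(f.amount) AS total
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
""").fetchall()

print(sales_by_year_category)
```
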
To fine-tune the mappings in ETL processes, you can:
1. **Optimize Source Queries**: Ensure that source queries are efficient and only retrieve necessary data.
2. **Use Incremental Loads**: Implement incremental loading to process only new or changed data.
3. **Reduce Data Volume**: Filter out unnecessary columns and rows early in the process.
4. **Leverage Pushdown Optimization**: Push transformations to the source database when possible to reduce data movement.
5. **Optimize Transformations**: Simplify complex transformations and use efficient functions.
6. **Monitor Performance**: Use performance monitoring tools to identify bottlenecks and optimize accordingly.
7. **Parallel Processing**: Utilize parallel processing to improve throughput.
8. **Indexing**: Ensure proper indexing on source and target tables to speed up data retrieval and loading.
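
Point 3 in particular is easy to show in miniature. The sketch below is a toy pipeline over in-memory rows, with hypothetical field names, that filters rows and projects columns during extraction so the transformation step only touches the data it needs. A real ETL tool would express the same idea through source filters or pushdown rather than Python generators.

```python
def extract(rows, needed_columns, predicate):
    """Yield only the needed columns of rows that pass the filter."""
    for row in rows:
        if predicate(row):
            yield {column: row[column] for column in needed_columns}


def transform(rows):
    """Add a derived column; runs only on the already reduced rows."""
    for row in rows:
        yield {**row, "amount_cents": round(row["amount"] * 100)}


# Hypothetical source data with columns the target does not need.
source_rows = [
    {"id": 1, "status": "active", "amount": 12.5, "notes": "long free text"},
    {"id": 2, "status": "closed", "amount": 3.0, "notes": "long free text"},
]

loaded = list(
    transform(
        extract(source_rows, ["id", "amount"], lambda r: r["status"] == "active")
    )
)
print(loaded)
```
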
Full load refers to the process of loading all the data from the source system into the data warehouse, replacing any existing data. Incremental or refresh load, on the other hand, involves loading only the new or changed data since the last load, thereby updating the existing data without replacing everything.
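
The two strategies can be contrasted with a short sketch. It assumes each source row carries an `updated_at` value that increases on every change (a common but not universal setup), and uses a dictionary keyed by `id` as a stand-in for the warehouse table. The function and field names are illustrative.

```python
def full_load(source):
    """Replace the target with a complete copy of the source."""
    return {row["id"]: dict(row) for row in source}


def incremental_load(target, source, last_loaded_at):
    """Upsert only rows changed after the previous load's watermark."""
    changed = [row for row in source if row["updated_at"] > last_loaded_at]
    for row in changed:
        target[row["id"]] = dict(row)
    new_watermark = max(
        (row["updated_at"] for row in changed), default=last_loaded_at
    )
    return len(changed), new_watermark


# Initial full load.
source = [
    {"id": 1, "value": "a", "updated_at": 1},
    {"id": 2, "value": "b", "updated_at": 2},
]
target = full_load(source)
watermark = 2

# Later, one row changes and one row is added in the source.
source[1] = {"id": 2, "value": "b2", "updated_at": 3}
source.append({"id": 3, "value": "c", "updated_at": 4})

rows_processed, watermark = incremental_load(target, source, watermark)
print(rows_processed, watermark, target)
```

Note that a watermark like this does not detect rows deleted at the source; handling deletes needs a separate mechanism.
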
Partitioning is the process of dividing a large database table into smaller, more manageable pieces, called partitions, while still treating it as a single table. The main types of partitioning are:
1. **Range Partitioning**: Divides data based on a specified range of values.
2. **List Partitioning**: Divides data based on a list of values.
3. **Hash Partitioning**: Distributes data across partitions based on a hash function.
4. **Composite Partitioning**: Combines two or more partitioning methods, such as range and hash.
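
The four methods differ only in how a row is routed to a partition. The sketch below models that routing in plain Python; the partition names, year boundaries, country lists, and partition count are all made up for illustration, and a real database would declare these rules in its DDL instead.

```python
import zlib


def range_partition(year: int) -> str:
    """Range: pick a partition by where the value falls between boundaries."""
    if year < 2023:
        return "p_before_2023"
    if year < 2024:
        return "p_2023"
    return "p_2024_onward"


# List: each partition owns an explicit set of values.
REGION_PARTITIONS = {
    "US": "p_americas",
    "CA": "p_americas",
    "DE": "p_europe",
    "FR": "p_europe",
}


def list_partition(country: str) -> str:
    return REGION_PARTITIONS.get(country, "p_other")


def hash_partition(key: str, partitions: int = 4) -> int:
    """Hash: spread keys across a fixed number of partitions."""
    # crc32 is used because it is stable across runs, unlike built-in hash()
    return zlib.crc32(key.encode("utf-8")) % partitions


def composite_partition(year: int, key: str, partitions: int = 4) -> str:
    """Composite: range on year first, then hash within that range."""
    return f"{range_partition(year)}_h{hash_partition(key, partitions)}"


print(range_partition(2023))
print(list_partition("DE"))
print(hash_partition("customer-42"))
print(composite_partition(2023, "customer-42"))
```
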