Find Interview Questions for Top Companies
Ques:- Difference between DWH and Data mart, Difference between views and materialized views. What is Indexing and which kind of Indexing Technique we use in
Right Answer:
**Difference between DWH and Data Mart:**
- A Data Warehouse (DWH) is a centralized repository that stores large volumes of data from multiple sources for analysis and reporting. A Data Mart is a subset of a Data Warehouse, focused on a specific business area or department.

**Difference between Views and Materialized Views:**
- A View is a virtual table that provides a way to present data from one or more tables without storing it physically. A Materialized View, on the other hand, stores the result of a query physically, allowing for faster access at the cost of needing to refresh the data periodically.
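
The contrast can be sketched with Python's built-in `sqlite3` module. SQLite has no materialized views, so the snapshot table `mv_sales_total` and the helper `refresh_mv` below are stand-ins assumed for illustration; all table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('north', 100), ('south', 50);

    -- A view stores only the query; it is re-evaluated on every read.
    CREATE VIEW v_sales_total AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- Stand-in for a materialized view: the query result is stored physically.
    CREATE TABLE mv_sales_total AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")

def refresh_mv(conn):
    """Rebuild the snapshot table, mimicking a materialized-view refresh."""
    conn.executescript("""
        DELETE FROM mv_sales_total;
        INSERT INTO mv_sales_total
            SELECT region, SUM(amount) FROM sales GROUP BY region;
    """)

def north_total(conn, relation):
    """Read the total for the 'north' region from a view or table."""
    return conn.execute(
        f"SELECT total FROM {relation} WHERE region = 'north'"
    ).fetchone()[0]

conn.execute("INSERT INTO sales VALUES ('north', 25)")

view_total = north_total(conn, "v_sales_total")    # reflects the new row
stale_total = north_total(conn, "mv_sales_total")  # still the old snapshot
refresh_mv(conn)
fresh_total = north_total(conn, "mv_sales_total")  # current again after refresh
```

The view always reflects the base table, while the stored result is only as current as its last refresh.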

**Indexing:**
- Indexing is a database optimization technique that improves the speed of data retrieval operations on a database table. Common indexing techniques include B-tree indexing, hash indexing, and bitmap indexing.
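
As a rough illustration, the sketch below uses Python's built-in `sqlite3` module (whose indexes are B-trees) to show a query plan before and after an index is created. The `orders` table, its columns, and the index name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

def plan(conn, sql):
    """Return the detail text (last column) of EXPLAIN QUERY PLAN rows."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT amount FROM orders WHERE customer_id = 42"

plan_before = plan(conn, query)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, query)   # with the index: a search using idx_orders_customer
```

Printing `plan_before` and `plan_after` shows the planner switching from scanning every row to searching the index.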
Ques:- What is BI? Which kind of modeling is suitable for OLAP reporting, and why? What are the steps to create a database?
Right Answer:
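BI stands for Business Intelligence: the practice of collecting and analyzing data to support informed business decisions. For OLAP (Online Analytical Processing) reporting, dimensional modeling with a star schema or snowflake schema is suitable, because separating numeric facts from descriptive dimensions keeps joins simple and predictable and makes it easy to aggregate along dimensions such as date, product, or region.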

Steps to create a database:
1. Define the purpose and requirements.
2. Design the schema (tables, relationships).
3. Choose a database management system (DBMS).
4. Create the database and tables using SQL.
5. Populate the database with data.
6. Implement indexing for performance.
7. Test the database for functionality and performance.
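
Steps 3 to 7 above might look like the following minimal sketch. SQLite (via Python's `sqlite3` module) is chosen purely for illustration, and the `department` and `employee` tables are hypothetical.

```python
import sqlite3

# Steps 3-4: pick a DBMS (SQLite here) and create the tables.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
""")

# Step 5: populate the tables.
conn.executemany("INSERT INTO department VALUES (?, ?)", [(1, "Finance"), (2, "IT")])
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 1), (2, "Ben", 2), (3, "Chen", 2)],
)

# Step 6: index the column used in joins and filters.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept_id)")

# Step 7: test with a query and a constraint check.
headcount = dict(conn.execute("""
    SELECT d.name, COUNT(*) FROM employee e
    JOIN department d ON d.dept_id = e.dept_id
    GROUP BY d.name
"""))

try:
    conn.execute("INSERT INTO employee VALUES (4, 'Dee', 99)")  # no department 99
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```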
Ques:- What is the CDC technique? What is a conformed dimension, and in what scenario would you use one? What is a role-playing dimension? What are the types of hierarchy?
Right Answer:
CDC (Change Data Capture) technique is a method used to identify and capture changes made to data in a database, allowing for efficient data synchronization and updates in data warehousing.
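
One simple way to picture CDC is a snapshot comparison, sketched below. This is only one of several approaches (production systems often read the database transaction log or use triggers or timestamps instead), and the function name `capture_changes` and the sample rows are hypothetical.

```python
def capture_changes(previous, current):
    """Compare two snapshots keyed by primary key and list the changes."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif previous[key] != row:
            changes.append(("update", key, row))
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes

previous = {
    1: {"name": "Asha", "city": "Pune"},
    2: {"name": "Ben", "city": "Delhi"},
}
current = {
    1: {"name": "Asha", "city": "Mumbai"},
    3: {"name": "Chen", "city": "Goa"},
}

# Only these three changes need to be applied to the warehouse,
# rather than reloading the whole table.
changes = capture_changes(previous, current)
```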

A conformed dimension is a dimension that is shared across multiple fact tables with the same keys and meaning, ensuring consistency in reporting. For example, a "Customer" dimension can be conformed across sales and returns fact tables, so both can be analyzed by the same customer attributes.

A role-playing dimension is a dimension that can be used in multiple contexts within the same data model. For instance, a "Date" dimension can represent different roles like "Order Date," "Ship Date," and "Delivery Date."
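
A role-playing dimension usually shows up in SQL as the same dimension table joined more than once under different aliases. The sketch below assumes hypothetical `dim_date` and `fact_order` tables and covers two of the roles mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT);
    CREATE TABLE fact_order (
        order_id       INTEGER PRIMARY KEY,
        order_date_key INTEGER REFERENCES dim_date(date_key),
        ship_date_key  INTEGER REFERENCES dim_date(date_key)
    );
    INSERT INTO dim_date VALUES (20240101, '2024-01-01'), (20240104, '2024-01-04');
    INSERT INTO fact_order VALUES (1, 20240101, 20240104);
""")

# One physical date dimension, joined twice: once per role.
row = conn.execute("""
    SELECT f.order_id, od.full_date AS order_date, sd.full_date AS ship_date
    FROM fact_order f
    JOIN dim_date od ON od.date_key = f.order_date_key
    JOIN dim_date sd ON sd.date_key = f.ship_date_key
""").fetchone()
```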

Types of hierarchy include:
1. **Parent-Child Hierarchy**: A hierarchy where each member can have multiple children and a single parent.
2. **Level-Based Hierarchy**: A hierarchy where members are organized into levels, such as Year > Quarter > Month > Day.
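
A parent-child hierarchy is commonly stored as a self-referencing table and walked with a recursive query. The following is a minimal sketch using `sqlite3`; the `employee` table and its contents are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT,
        manager_id INTEGER REFERENCES employee(emp_id)  -- points at the parent row
    );
    INSERT INTO employee VALUES
        (1, 'Asha', NULL), (2, 'Ben', 1), (3, 'Chen', 1), (4, 'Dee', 2);
""")

# Start at the root (no manager) and repeatedly attach direct reports.
rows = conn.execute("""
    WITH RECURSIVE org(emp_id, name, depth) AS (
        SELECT emp_id, name, 0 FROM employee WHERE manager_id IS NULL
        UNION ALL
        SELECT e.emp_id, e.name, org.depth + 1
        FROM employee e JOIN org ON e.manager_id = org.emp_id
    )
    SELECT name, depth FROM org ORDER BY depth, emp_id
""").fetchall()
```

Unlike a level-based hierarchy, the depth here is not fixed in the schema; it is discovered by the query.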
Ques:- What is data sparsity and how does it affect aggregation?
Right Answer:
Data sparsity refers to the condition where a dataset contains a high proportion of empty or zero values; in a cube, this means most combinations of dimension members have no fact recorded. It affects aggregation because averages and totals are computed from very few actual data points, and the result changes depending on whether empty cells are ignored or counted as zero. This can skew results and make trends or patterns harder to identify.
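
A small worked example, with made-up numbers, shows how much an average can move depending on how empty cells are handled:

```python
# Daily sales for one product; None marks days with no recorded value.
daily_sales = [120, None, None, 80, None, None, None]

recorded = [v for v in daily_sales if v is not None]

avg_recorded_only = sum(recorded) / len(recorded)       # empty cells ignored
avg_missing_as_zero = sum(recorded) / len(daily_sales)  # empty cells counted as 0
sparsity = 1 - len(recorded) / len(daily_sales)         # share of empty cells
```

Here the average is 100 when empty days are ignored but under 29 when they count as zero, so the aggregation rule has to be chosen deliberately.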
Ques:- Why are recursive relationships bad? How do you resolve them?
Right Answer:
Recursive relationships can lead to complexity and ambiguity in data modeling, making it difficult to enforce constraints and maintain data integrity. To resolve them, you can create a separate linking table to manage the relationships or use additional attributes to clarify the hierarchy or relationship type.
Ques:- What is First Normal Form?
Right Answer:
First Normal Form (1NF) is a property of a relation in a database that ensures all columns contain atomic, indivisible values, and each entry in a column is of the same data type. Additionally, each row must be unique, typically achieved by having a primary key.
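
To make "atomic values" concrete, the sketch below takes rows whose `phones` field packs several values together and splits them into one value per row. The data and the function name `to_first_normal_form` are hypothetical.

```python
# Not in 1NF: the phones column holds several values in one field.
unnormalized = [
    {"student_id": 1, "name": "Asha", "phones": "111, 222"},
    {"student_id": 2, "name": "Ben", "phones": "333"},
]

def to_first_normal_form(rows):
    """Split the multi-valued phones field into one atomic value per row."""
    students, phones = [], []
    for row in rows:
        students.append({"student_id": row["student_id"], "name": row["name"]})
        for phone in row["phones"].split(","):
            phones.append({"student_id": row["student_id"], "phone": phone.strip()})
    return students, phones

students, student_phones = to_first_normal_form(unnormalized)
```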
Ques:- What is an artificial (derived) primary key? When should it be used?
Right Answer:
An artificial (derived) primary key is a unique identifier for a database record that is created by the database designer, rather than being derived from the data itself. It is typically a sequential number or a unique string that has no business meaning. It should be used when natural keys are not available, are too complex, or when there is a need for a stable identifier that won't change over time.
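
The stability argument can be sketched as follows, using SQLite's auto-incrementing integer key as the artificial key. The `customer` table and its columns are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- artificial key, no business meaning
        email        TEXT NOT NULL UNIQUE,               -- natural key, may change
        name         TEXT
    )
""")
conn.execute("INSERT INTO customer (email, name) VALUES ('asha@example.com', 'Asha')")
key_before = conn.execute("SELECT customer_key FROM customer").fetchone()[0]

# The natural key changes, but anything referencing customer_key is unaffected.
conn.execute(
    "UPDATE customer SET email = 'asha@example.org' WHERE customer_key = ?",
    (key_before,),
)
key_after = conn.execute(
    "SELECT customer_key FROM customer WHERE email = 'asha@example.org'"
).fetchone()[0]
```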
Ques:- What is the difference between star schema and snowflake schema?
Right Answer:
The star schema has a central fact table connected directly to multiple dimension tables, resembling a star shape. The snowflake schema, on the other hand, normalizes dimension tables into multiple related tables, creating a more complex structure that resembles a snowflake.
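
The sketch below builds the same product dimension both ways and runs the same report against each. All table names are hypothetical; the point is only that the snowflake version needs one extra join to reach the category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: category is stored directly on the product dimension.
    CREATE TABLE star_dim_product (
        product_key INTEGER PRIMARY KEY, name TEXT, category TEXT
    );

    -- Snowflake: category is normalized into its own table.
    CREATE TABLE snow_dim_category (category_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE snow_dim_product (
        product_key  INTEGER PRIMARY KEY,
        name         TEXT,
        category_key INTEGER REFERENCES snow_dim_category(category_key)
    );

    CREATE TABLE fact_sales (product_key INTEGER, amount INTEGER);

    INSERT INTO star_dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
    INSERT INTO snow_dim_category VALUES (10, 'Stationery'), (20, 'Home');
    INSERT INTO snow_dim_product VALUES (1, 'Pen', 10), (2, 'Lamp', 20);
    INSERT INTO fact_sales VALUES (1, 5), (1, 7), (2, 30);
""")

# Star schema: one join from the fact table to the dimension.
star_totals = dict(conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN star_dim_product p ON p.product_key = f.product_key
    GROUP BY p.category
"""))

# Snowflake schema: a second join to reach the normalized category table.
snowflake_totals = dict(conn.execute("""
    SELECT c.category, SUM(f.amount)
    FROM fact_sales f
    JOIN snow_dim_product p ON p.product_key = f.product_key
    JOIN snow_dim_category c ON c.category_key = p.category_key
    GROUP BY c.category
"""))
```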
Ques:- When should you consider denormalization?
Right Answer:
Denormalization should be considered when you need to improve read performance, reduce the complexity of queries, or when you have specific reporting requirements that benefit from fewer joins in the database.
Ques:- What is second normal form?
Right Answer:
Second Normal Form (2NF) is a database normalization level where a table is in First Normal Form (1NF) and all non-key attributes are fully functionally dependent on the entire primary key, meaning there are no partial dependencies on a composite primary key.
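
A partial dependency can be spotted in sample data with a small check like the one below. This is only a sketch: real functional dependencies come from business rules, and sample rows can merely suggest or refute them. The helper `determines` and the `enrollment` rows are hypothetical.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, the lhs columns determine column rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        # The first value seen for a key is remembered; a different one breaks the dependency.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

# Composite primary key: (student_id, course_id).
enrollment = [
    {"student_id": 1, "course_id": "C1", "student_name": "Asha", "grade": "A"},
    {"student_id": 1, "course_id": "C2", "student_name": "Asha", "grade": "B"},
    {"student_id": 2, "course_id": "C1", "student_name": "Ben", "grade": "C"},
]

# student_name depends on student_id alone: a partial dependency, so not 2NF.
partial = determines(enrollment, ["student_id"], "student_name")

# grade needs the whole key.
grade_by_student_only = determines(enrollment, ["student_id"], "grade")
grade_by_full_key = determines(enrollment, ["student_id", "course_id"], "grade")
```

To reach 2NF, `student_name` would move to a separate student table keyed by `student_id`.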
Ques:- Describe the third normal form
Right Answer:
The third normal form (3NF) is a database normalization rule that requires a table to be in second normal form (2NF) and have no transitive dependencies. This means that all non-key attributes must depend only on the primary key and not on other non-key attributes.
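
As an illustration with made-up data, `dept_name` below depends on `dept_id`, which in turn depends on the key `emp_id`. Splitting it out removes the transitive dependency; the function name `to_third_normal_form` is hypothetical.

```python
# dept_name depends on dept_id, which depends on emp_id: a transitive dependency.
employees = [
    {"emp_id": 1, "name": "Asha", "dept_id": 10, "dept_name": "Finance"},
    {"emp_id": 2, "name": "Ben", "dept_id": 20, "dept_name": "IT"},
    {"emp_id": 3, "name": "Chen", "dept_id": 20, "dept_name": "IT"},
]

def to_third_normal_form(rows):
    """Move dept_name into its own relation keyed by dept_id."""
    employee = [
        {"emp_id": r["emp_id"], "name": r["name"], "dept_id": r["dept_id"]}
        for r in rows
    ]
    department = {r["dept_id"]: r["dept_name"] for r in rows}
    return employee, department

employee, department = to_third_normal_form(employees)

# Renaming a department is now a single update instead of one per employee.
department[20] = "Technology"
names = [department[e["dept_id"]] for e in employee]
```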
Ques:- What role do views play in dimensional modeling?
Right Answer:
Views in dimensional modeling serve as a way to simplify complex queries by presenting data in a more user-friendly format. They can encapsulate complex joins and aggregations, making it easier for users to access and analyze data without needing to understand the underlying database structure.
Ques:- What is normalization? Explain normalization types.
Right Answer:
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. The main types of normalization are:

1. **First Normal Form (1NF)**: Ensures that all columns contain atomic values and each entry in a column is of the same type.
2. **Second Normal Form (2NF)**: Achieves 1NF and ensures that all non-key attributes are fully functionally dependent on the primary key.
3. **Third Normal Form (3NF)**: Achieves 2NF and ensures that no non-key attribute depends on another non-key attribute (no transitive dependencies).
4. **Boyce-Codd Normal Form (BCNF)**: A stronger version of 3NF that deals with certain types of anomalies not handled by 3NF.
5. **Fourth Normal Form (4NF)**: Achieves BCNF and addresses multi-valued dependencies.
6. **Fifth Normal Form (5NF)**: Achieves 4NF and addresses join dependencies.
Ques:- What is LIS ?
Right Answer:
LIS stands for Laboratory Information System, which is a software system that manages and processes laboratory data, including test orders, results, and patient information.
Ques:- What is the difference between joins and scope relationship?
Right Answer:
Joins are used to combine rows from two or more tables based on a related column, while scope relationships define how data is related within a single table or between tables in terms of hierarchy or context, often influencing data visibility and access.
Ques:- How can you apply linked list to improve your college’s database?
Right Answer:
You can use a linked list to manage student records in your college's database by linking each student node to the next one. This allows for efficient insertion and deletion of records, as you can easily add or remove students without needing to shift other records, making it easier to handle dynamic data like enrollments and course registrations.
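
A minimal singly linked list along those lines might look like this. The class names `StudentNode` and `StudentList` are hypothetical, and the sketch only illustrates the data structure; it is not how a database engine stores rows.

```python
class StudentNode:
    """One student record plus a link to the next record."""
    def __init__(self, student_id, name):
        self.student_id = student_id
        self.name = name
        self.next = None

class StudentList:
    def __init__(self):
        self.head = None

    def add_front(self, student_id, name):
        """Insert at the head in O(1); no existing nodes are moved."""
        node = StudentNode(student_id, name)
        node.next = self.head
        self.head = node

    def remove(self, student_id):
        """Unlink the first node with this id; return True if one was found."""
        prev, node = None, self.head
        while node is not None:
            if node.student_id == student_id:
                if prev is None:
                    self.head = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False

    def ids(self):
        """Return student ids in list order, head first."""
        out, node = [], self.head
        while node is not None:
            out.append(node.student_id)
            node = node.next
        return out

roster = StudentList()
for sid, name in [(1, "Asha"), (2, "Ben"), (3, "Chen")]:
    roster.add_front(sid, name)

removed = roster.remove(2)
remaining = roster.ids()
```

Note the trade-off: unlinking a node shifts nothing, but finding the node still requires walking the list from the head.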
Ques:- What is fish-bone analysis?
Right Answer:
Fishbone analysis, also known as Ishikawa or cause-and-effect diagram, is a visual tool used to identify and organize potential causes of a problem. It helps teams analyze the root causes of issues by categorizing them into different branches, resembling the bones of a fish.
Ques:- What is the difference between the hashed file stage and the sequential file stage in DataStage Server?
Right Answer:
The hashed file stage in DataStage Server allows for fast access and retrieval of data using a hash key, enabling efficient lookups and joins. In contrast, the sequential file stage reads and writes data in a linear fashion, processing records one after another without indexing, making it slower for random access operations.
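
The access-pattern difference can be pictured with a generic Python analogy. This is not DataStage code and uses no DataStage API; the record layout and the function `sequential_lookup` are assumptions for illustration.

```python
# 1000 key/value records, as they might appear line by line in a flat file.
records = [("C%03d" % i, "customer %d" % i) for i in range(1000)]

def sequential_lookup(records, key):
    """Scan records in order; return the value and the number of rows read."""
    for reads, (k, value) in enumerate(records, start=1):
        if k == key:
            return value, reads
    return None, len(records)

# Building the hash table costs one pass up front; each lookup afterwards
# goes straight to the key instead of scanning.
hashed = dict(records)

seq_value, seq_reads = sequential_lookup(records, "C999")  # reads every row
hash_value = hashed["C999"]                                # direct access by key
```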

