Dataiku Interview Questions and Answers
Ques:- What is the purpose of feature engineering in data analysis?
Right Answer:
The purpose of feature engineering in data analysis is to create, modify, or select variables (features) that improve the performance of machine learning models by making the data more relevant and informative for the analysis.
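As a minimal sketch of creating, modifying, and encoding features, the example below derives three features from raw records using only the standard library. The records, the field names (`signup`, `income`, `plan`), and the `engineer` helper are assumptions made up for illustration.

```python
import math
from datetime import date

# Hypothetical raw records; field names are illustrative assumptions.
raw = [
    {"signup": date(2024, 1, 6), "income": 30_000, "plan": "basic"},
    {"signup": date(2024, 1, 8), "income": 120_000, "plan": "pro"},
]

def engineer(row):
    """Derive model-friendly features from one raw record."""
    return {
        # Create: weekend flag derived from the raw date (Saturday=5, Sunday=6).
        "signup_weekend": int(row["signup"].weekday() >= 5),
        # Modify: a log transform compresses a skewed numeric scale.
        "log_income": math.log10(row["income"]),
        # Encode: binary indicator for one categorical value.
        "plan_pro": int(row["plan"] == "pro"),
    }

features = [engineer(r) for r in raw]
```

Which features actually help depends on the model and the data, so in practice each candidate is checked against validation performance.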
Ques:- What are descriptive and inferential statistics?
Right Answer:
Descriptive statistics summarize and describe the main features of a dataset, using measures like mean, median, mode, and standard deviation. Inferential statistics use sample data to make predictions or inferences about a larger population, often employing techniques like hypothesis testing and confidence intervals.
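The distinction can be sketched with Python's `statistics` module. The sample values are made up, and the confidence interval uses a normal approximation, which is an assumption; for a sample this small a t-distribution would usually be more appropriate.

```python
import statistics
from statistics import NormalDist

sample = [12, 15, 15, 18, 20, 22, 24]  # made-up sample data

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
stdev = statistics.stdev(sample)  # sample standard deviation

# Inferential statistics: estimate a range for the population mean.
# Approximate 95% confidence interval (normal approximation, an assumption).
z = NormalDist().inv_cdf(0.975)
margin = z * stdev / len(sample) ** 0.5
confidence_interval = (mean - margin, mean + margin)
```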
Ques:- What are outliers and how do you handle them in data analysis?
Right Answer:
Outliers are data points that significantly differ from the rest of the dataset. They can skew results and affect statistical analyses. To handle outliers, you can:

1. Identify them using methods like the IQR (Interquartile Range) or Z-scores.
2. Remove them if they are errors or irrelevant.
3. Transform them using techniques like log transformation.
4. Use robust statistical methods that are less affected by outliers.
5. Analyze them separately if they provide valuable insights.
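The identification step can be sketched as follows, using made-up values. The 1.5 x IQR fences and the z-score cutoff of 2 are common conventions rather than fixed rules, and are assumptions here.

```python
import statistics

values = [10, 12, 11, 13, 12, 11, 14, 95]  # made-up data with one extreme value

# IQR method: flag points beyond 1.5 * IQR from the quartiles.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = [v for v in values if v < low_fence or v > high_fence]

# Z-score method: flag points far from the mean in standard-deviation units.
mean = statistics.mean(values)
sd = statistics.pstdev(values)
z_outliers = [v for v in values if abs(v - mean) / sd > 2]

# A robust summary: the median is barely moved by the extreme value.
robust_center = statistics.median(values)
```

In small samples an extreme value inflates the standard deviation, so a stricter z-score cutoff such as 3 can miss it; the IQR fences are less sensitive to that effect.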
Ques:- What is the difference between correlation and causation?
Right Answer:
Correlation is a statistical measure that indicates the extent to which two variables fluctuate together, while causation implies that one variable directly affects or causes a change in another variable.
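A small illustration, using made-up numbers: two variables that are both driven by a third one correlate strongly even though neither causes the other. The `pearson` helper and the data are assumptions for the example.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

# Temperature is the common cause (confounder) of both series below.
temperature = [18, 22, 25, 28, 31, 35]
ice_cream_sales = [3 * t + 5 for t in temperature]
sunburn_cases = [2 * t - 10 for t in temperature]

# Strong correlation, yet ice cream does not cause sunburn.
r = pearson(ice_cream_sales, sunburn_cases)
```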
Ques:- How do you handle missing data in a dataset?
Right Answer:
To handle missing data in a dataset, you can use the following methods:

1. **Remove Rows/Columns**: Delete rows or columns with missing values if they are not significant.
2. **Imputation**: Fill in missing values using techniques like mean, median, mode, or more advanced methods like KNN or regression.
3. **Flagging**: Create a new column to indicate missing values for analysis.
4. **Predictive Modeling**: Use algorithms to predict and fill in missing values based on other data.
5. **Leave as Is**: In some cases, you may choose to leave missing values if they are meaningful for analysis.
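Flagging, simple imputation, and row removal from the list above can be sketched with the standard library. The rows and field names are made up, and median/mode imputation is only one of the options mentioned.

```python
import statistics

# Made-up rows; None marks a missing value.
rows = [
    {"age": 34, "city": "Paris"},
    {"age": None, "city": "Lyon"},
    {"age": 29, "city": None},
    {"age": 41, "city": "Paris"},
]

# Flagging: record where a value was missing before filling it in.
for row in rows:
    row["age_missing"] = row["age"] is None

# Imputation: median for the numeric column, mode for the categorical one.
age_median = statistics.median([r["age"] for r in rows if r["age"] is not None])
city_mode = statistics.mode([r["city"] for r in rows if r["city"] is not None])
imputed = [
    {
        **r,
        "age": age_median if r["age"] is None else r["age"],
        "city": city_mode if r["city"] is None else r["city"],
    }
    for r in rows
]

# Removal: keep only rows with no missing values.
complete = [r for r in rows if r["age"] is not None and r["city"] is not None]
```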
Ques:- What are the different ways to generate random numbers?
Comments
Admin May 17, 2020

The random module in the standard library is the usual way to generate pseudo-random numbers in Python.
Its basic function is called as follows:
import random
random.random()
random.random() returns a floating-point number in the range [0.0, 1.0). The module-level functions are bound methods of a hidden instance of the random.Random class. You can also create your own Random instances to get generators that do not share state, for example one per thread in a multi-threaded program. Other commonly used functions are:
• randrange(a, b): returns a randomly selected integer from the range [a, b). It does not build a range object.
• uniform(a, b): returns a floating-point number N such that a <= N <= b. The endpoint b may or may not be included, depending on floating-point rounding.
• normalvariate(mu, sigma): draws from a normal distribution, where mu is the mean and sigma is the standard deviation.
• Random(): instantiating the Random class creates an independent random number generator; each instance keeps its own state.
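The functions above can be tried on separate `Random` instances, as in this short sketch. The seed value 42 and the argument values are arbitrary choices for the example.

```python
import random

# Two independent generators seeded identically (42 is an arbitrary seed).
rng_a = random.Random(42)
rng_b = random.Random(42)

x = rng_a.random()                 # float in [0.0, 1.0)
n = rng_a.randrange(1, 7)          # integer in [1, 7), like a die roll
u = rng_a.uniform(2.0, 5.0)        # float between 2.0 and 5.0
g = rng_a.normalvariate(0.0, 1.0)  # normal distribution, mu=0.0, sigma=1.0

# Same seed, same sequence: rng_b's first value matches rng_a's first value.
same_start = rng_b.random() == x
```

Seeding makes a run reproducible, which helps in tests; the random module is not suitable for security-sensitive values, where the secrets module is the standard choice.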

Ques:- Describe how to use sessions for web applications in Python.
Comments
Admin May 17, 2020

Sessions are the server-side counterpart of cookies. While a cookie preserves state on the client side, a session preserves state on the server side.
The session state is kept in a file or in a database on the server. Each session is identified by a unique session ID (SID). So that the client can identify itself to the server, the server creates the SID and sends it to the client, and the client returns it with every subsequent request, typically in a cookie.
Session handling is done through the web.session module in the following manner:
import web.session
session = web.session.start( option1, Option2,... )
session['myVariable'] = 'It can be requested'
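To make the mechanism concrete, here is a minimal in-memory sketch of the idea. It is not the `web.session` API; the `SessionStore` class and its `start` method are hypothetical names, and a dictionary stands in for the file or database that would hold session state on a real server.

```python
import secrets

class SessionStore:
    """Hypothetical in-memory stand-in for server-side session storage."""

    def __init__(self):
        self._sessions = {}  # maps SID -> session state dictionary

    def start(self, sid=None):
        """Return (sid, state); create a new session if sid is missing or unknown."""
        if sid is None or sid not in self._sessions:
            sid = secrets.token_hex(16)  # unguessable session ID
            self._sessions[sid] = {}
        return sid, self._sessions[sid]

store = SessionStore()

# First request: the client has no SID yet, so the server creates one.
sid, session = store.start()
session["myVariable"] = "It can be requested"

# Later request: the client sends the SID back and gets the same state.
same_sid, session_again = store.start(sid)
```

Unknown SIDs are never trusted: the sketch issues a fresh one instead of adopting a value supplied by the client.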

Ques:- What is DRF (Django REST framework)?
Right Answer:
DRF stands for Django REST framework, a powerful toolkit for building Web APIs in Django. It provides features like serialization, authentication, and viewsets to simplify the process of creating RESTful APIs.
Ques:- What are the types of model relationships in Django?
Right Answer:
The types of model relationships in Django are:

1. **One-to-One** (`OneToOneField`): Each record in one model is related to exactly one record in another model.
2. **One-to-Many** (`ForeignKey`, declared on the "many" side): A record in one model can be related to multiple records in another model.
3. **Many-to-Many** (`ManyToManyField`): Records in one model can be related to multiple records in another model, and vice versa.
Ques:- What is deadlock?
Right Answer:
A deadlock is a situation in computer systems where two or more processes are unable to proceed because each is waiting for the other to release a resource, resulting in a standstill.
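The circular wait can be reproduced safely with two threads and two locks. In this sketch the acquire timeout is an assumption added so the program terminates; without it, both threads would wait forever. All names are illustrative.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
barrier = threading.Barrier(2)  # synchronizes the two threads
results = {}

def worker(name, first, second):
    with first:
        barrier.wait()  # both threads now hold their first lock
        # Each thread waits for the lock the other one holds.
        got = second.acquire(timeout=0.2)
        results[name] = got
        barrier.wait()  # keep holding until both attempts have finished
        if got:
            second.release()

# The threads take the same two locks in opposite order: a circular wait.
t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
```

Neither thread obtains its second lock, so `results` records a failed acquisition for both. A common remedy is to make every thread acquire locks in the same global order.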
Ques:- What do you mean by a trigger?
Right Answer:
A trigger is a special type of stored procedure in a database that automatically executes or fires in response to certain events on a particular table or view, such as insertions, updates, or deletions.
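As a small illustration, the sketch below uses SQLite through Python's `sqlite3` module as a stand-in for a full database server. The table names, column names, and trigger name are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER, new_balance INTEGER);

    -- Fires automatically after every update of accounts.balance.
    CREATE TRIGGER log_balance_change
    AFTER UPDATE OF balance ON accounts
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
    END;
""")

conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.execute("UPDATE accounts SET balance = 150 WHERE id = 1")

# The application never wrote to audit_log; the trigger did.
log_rows = conn.execute("SELECT * FROM audit_log").fetchall()
```

Trigger syntax differs between database systems, so the `CREATE TRIGGER` statement above would need adjusting for engines such as SQL Server, Oracle, or PostgreSQL.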
Ques:- What are the fields used for project planning in MS Project?
Right Answer:
The fields used for Project Planning in MS Project include:

1. Task Name
2. Duration
3. Start Date
4. Finish Date
5. Predecessors
6. Resources
7. Percent Complete
8. Work
9. Cost
10. Milestones
Ques:- Given a dataset, how would you reshape it into different formats such as a pivot table and match it against another dataset?
Right Answer:
To analyze data for different formats like pivot tables and matching datasets, you should:

1. **Identify Key Variables**: Determine the key fields that will be used for matching and pivoting.
2. **Clean the Data**: Ensure that the data is free from duplicates, errors, and inconsistencies.
3. **Use Pivot Tables**: Create pivot tables to summarize and analyze the data by aggregating values based on categories.
4. **Match Data**: Use functions like VLOOKUP or JOIN operations in SQL to match data from different sources based on the identified key variables.
5. **Validate Results**: Check the accuracy of the matched data and the pivot table outputs to ensure they meet business requirements.
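The pivot and matching steps above can be sketched in plain Python. The datasets, the key field `customer_id`, and the other field names are made up for illustration; the lookup plays the role of VLOOKUP or a SQL LEFT JOIN.

```python
from collections import defaultdict

# Made-up sales records and a second dataset keyed by customer_id.
sales = [
    {"customer_id": 1, "region": "North", "quarter": "Q1", "amount": 100},
    {"customer_id": 2, "region": "South", "quarter": "Q1", "amount": 80},
    {"customer_id": 1, "region": "North", "quarter": "Q2", "amount": 120},
    {"customer_id": 3, "region": "South", "quarter": "Q2", "amount": 50},
]
customers = {1: "Acme", 2: "Globex"}

# Pivot: rows are regions, columns are quarters, values are summed amounts.
pivot = defaultdict(lambda: defaultdict(int))
for row in sales:
    pivot[row["region"]][row["quarter"]] += row["amount"]

# Match: attach the customer name to each sale; None when there is no match.
matched = [{**row, "customer": customers.get(row["customer_id"])} for row in sales]

# Validate: list keys that found no match in the second dataset.
unmatched = [row["customer_id"] for row in matched if row["customer"] is None]
```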
Ques:- What is the difference between BRD, SRS, and use case documents?
Right Answer:
BRD (Business Requirements Document) outlines the high-level business needs and objectives. SRS (Software Requirements Specification) details the functional and non-functional requirements for the software. Use Case documents describe specific interactions between users and the system to achieve particular goals.