55 Dataiku Interview Questions and Answers | HR, Technical & Aptitude Prep

Data Analytics

Ques:- What is the purpose of feature engineering in data analysis?

Asked In :- KRIOS Info Solutions, WSNE Consulting, AnAr Solutions, Queppelin Technology Solutions, Rock Solid Solutions, Ziffity Solutions, Aranca (Mumbai), Solace Infotech, Born Commerce, GOQii Technologies,

Right Answer:
The purpose of feature engineering in data analysis is to create, modify, or select variables (features) that improve the performance of machine learning models by making the data more relevant and informative for the analysis.
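
As a rough illustration of creating features, the sketch below derives three new variables from one raw record. The field names (signup_date, total_spend, orders) and the reference date are hypothetical and chosen only for this example.

```python
from datetime import date

# Hypothetical raw customer record.
raw = {"signup_date": date(2023, 1, 14), "total_spend": 240.0, "orders": 6}
as_of = date(2023, 3, 15)  # assumed reference date for the analysis

# Derived features that a model may find more informative than the raw fields.
features = {
    "tenure_days": (as_of - raw["signup_date"]).days,
    "avg_order_value": raw["total_spend"] / raw["orders"],
    "signed_up_on_weekend": raw["signup_date"].weekday() >= 5,  # 5 = Sat, 6 = Sun
}

print(features)
```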

Data Analytics

Ques:- What are descriptive and inferential statistics?

Asked In :- Vinove Software & Services Pvt Ltd, KRIOS Info Solutions, Shipco IT, STIC SOFT E-SOLUTIONS, Toxsl Technologies, GOQii Technologies, Namecheap Web Services, iROID Technologies, TRICON INFOTECH PVT, SWYM,

Right Answer:
Descriptive statistics summarize and describe the main features of a dataset, using measures like mean, median, mode, and standard deviation. Inferential statistics use sample data to make predictions or inferences about a larger population, often employing techniques like hypothesis testing and confidence intervals.
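
The difference can be sketched with Python's statistics module. The sample values are made up, and the interval uses a normal approximation (z = 1.96), which is an assumption; for a sample this small a t-distribution would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical sample drawn from a larger population (e.g. response times in ms).
sample = [12.1, 11.8, 12.5, 13.0, 12.2, 11.9, 12.7, 12.4]

# Descriptive statistics: summarize the sample itself.
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Inferential statistics: estimate the population mean from the sample.
# Assumption: normal approximation with z = 1.96 for a roughly 95% interval.
standard_error = stdev / math.sqrt(len(sample))
ci_low = mean - 1.96 * standard_error
ci_high = mean + 1.96 * standard_error

print(f"mean={mean:.2f} median={median:.2f} stdev={stdev:.2f}")
print(f"approx. 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```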

Data Analytics

Ques:- What are outliers and how do you handle them in data analysis?

Asked In :- Fluid AI, Trigent Software, Itobuz Technologies, RANDSTAD INDIA PVT, GOQii Technologies, Dhruvsoft Services, Radicle Software, Webvillee Technology, Fission Infotech, Noesys Consulting,

Right Answer:
Outliers are data points that significantly differ from the rest of the dataset. They can skew results and affect statistical analyses. To handle outliers, you can:

1. Identify them using methods like the IQR (Interquartile Range) or Z-scores.
2. Remove them if they are errors or irrelevant.
3. Transform them using techniques like log transformation.
4. Use robust statistical methods that are less affected by outliers.
5. Analyze them separately if they provide valuable insights.
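
The identification step can be sketched as follows, using only the standard library. The data and the z-score cut-off of 2.5 are assumptions for illustration. In small samples the largest possible z-score is limited, so the common cut-off of 3 would miss the obvious outlier here.

```python
import statistics

# Hypothetical measurements with one suspicious value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]

# IQR rule: flag values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = [x for x in data if x < low or x > high]

# Z-score rule: flag values far from the mean, measured in standard deviations.
mean = statistics.mean(data)
stdev = statistics.stdev(data)
z_outliers = [x for x in data if abs(x - mean) / stdev > 2.5]  # assumed cut-off

# A robust alternative to the mean: the median barely moves because of the outlier.
robust_center = statistics.median(data)

print(iqr_outliers, z_outliers, robust_center)
```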

Data Analytics

Ques:- What is the difference between correlation and causation?

Asked In :- Unyscape Infocom Pvt. Ltd., Vinove Software & Services Pvt Ltd, AnAr Solutions, STIC SOFT E-SOLUTIONS, Toxsl Technologies, Chegg India, Aranca (Mumbai), iROID Technologies, TNQ Technologies, MatchMove India,

Right Answer:
Correlation is a statistical measure that indicates the extent to which two variables fluctuate together, while causation implies that one variable directly affects or causes a change in another variable.
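
A small sketch of why correlation alone does not establish causation: two made-up series that both depend on temperature are perfectly correlated with each other, although neither causes the other. The variable names and formulas are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data: temperature is the common cause (a confounder).
temperature = [18, 21, 24, 27, 30, 33]
ice_cream_sales = [2 * t + 5 for t in temperature]
sunburn_cases = [3 * t - 10 for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between sales and sunburn: {r:.3f}")
```

The coefficient is high because both series follow temperature, not because ice cream sales cause sunburn.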

Data Analytics

Ques:- How do you handle missing data in a dataset?

Asked In :- Rock Solid Solutions, Protege Solutions, Ziffity Solutions, Toxsl Technologies, Cybage Software, WFM, Oodles Technologies, Sun Dew Solutions, Startup - Navya Network, LenDenClub,

Right Answer:
To handle missing data in a dataset, you can use the following methods:

1. **Remove Rows/Columns**: Delete rows or columns with missing values if they are not significant.
2. **Imputation**: Fill in missing values using techniques like mean, median, mode, or more advanced methods like KNN or regression.
3. **Flagging**: Create a new column to indicate missing values for analysis.
4. **Predictive Modeling**: Use algorithms to predict and fill in missing values based on other data.
5. **Leave as Is**: In some cases, you may choose to leave missing values if they are meaningful for analysis.
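
Removal, median imputation, and flagging from the list above can be sketched in plain Python. The rows, column names, and the impute_with_median helper are hypothetical; in practice a library such as pandas is usually used for this.

```python
import statistics

# Hypothetical dataset; None marks a missing value.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": 41, "income": 58000},
]

def impute_with_median(rows, column):
    """Return copies of rows with missing values in column filled and flagged."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = statistics.median(observed)
    result = []
    for r in rows:
        new_row = dict(r)
        new_row[f"{column}_missing"] = r[column] is None  # flagging
        if r[column] is None:
            new_row[column] = fill  # imputation
        result.append(new_row)
    return result

# Removal: keep only rows without any missing value.
complete_rows = [r for r in rows if None not in r.values()]

# Imputation plus flagging, applied to each column in turn.
imputed = impute_with_median(impute_with_median(rows, "age"), "income")

print(len(complete_rows), imputed[1], imputed[2])
```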

Python Software Developer/ Programmer

Ques:- If we hire you, what do you want to work on?

Asked In :- iASYS, NovoInvent, Zysk Technologies, Intugine Technologies, Zorang, Marolix Technology Solutions, zynga, national payments corporation of india (npci), traction on demand, dataiku,

Python

Ques:- What are the different ways to generate random numbers?

Asked In :- DecisionTree Analytics and Services, Talent Smart Soft Solutions (OPC), Sion Semiconductors, iProgrammer, MindNerves Technologies, Technoforte Software, Emertxe, Mechlin Technologies, Flannels, Intelligraphics,

Comments

Admin May 17, 2020

The random module is the standard library module for generating pseudo-random numbers. Basic usage:
import random
random.random()
random.random() returns a floating-point number in the range [0, 1). The module-level functions are bound methods of a hidden, shared instance of the random.Random class. In multi-threaded programs you can instantiate random.Random yourself to give each thread its own generator. Other commonly used functions are:
• randrange(a, b): returns a randomly selected integer from the range [a, b). It does not build a range object.
• uniform(a, b): returns a floating-point number between a and b. The endpoint b may or may not be included, depending on floating-point rounding.
• normalvariate(mu, sigma): returns a value from the normal distribution, where mu is the mean and sigma is the standard deviation.
• Random: instantiating the Random class creates an independent random number generator, so several generators can be used side by side.
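
The functions above can be tried out as follows. The seed value 42 and the argument values are arbitrary choices for this sketch. The secrets module is included as an extra option that the answer above does not mention; it is the standard library choice when the numbers must be unpredictable, for example for tokens.

```python
import random
import secrets

# An independent generator with a fixed seed, so runs are reproducible.
rng = random.Random(42)

f = rng.random()               # float in [0.0, 1.0)
i = rng.randrange(1, 7)        # integer in [1, 7), i.e. 1 to 6
j = rng.randint(1, 6)          # integer in [1, 6], both ends included
u = rng.uniform(2.5, 5.0)      # float between 2.5 and 5.0
g = rng.normalvariate(0, 1)    # normal distribution, mu=0, sigma=1
pick = rng.choice(["red", "green", "blue"])  # random element of a sequence

# Not reproducible by design: suitable for security-sensitive values.
s = secrets.randbelow(10)      # integer in [0, 10)

print(f, i, j, u, g, pick, s)
```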

Python

Ques:- Describe how to use sessions in Python web programming.

Asked In :- Rudder Analytics, Testrig Technologies, Infometry, Kireeti Soft Technologies, Tetherfi, Spring Computing Technologies, birlasoft, qinetiq, advantest, datorama,

Comments

Admin May 17, 2020

Sessions are the server-side counterpart of cookies. While a cookie preserves state on the client side, a session preserves state on the server side.
The session state is kept in a file or in a database on the server. Each session is identified by a unique session ID (SID). So that the client can identify itself to the server, the SID is created by the server, sent to the client, and then sent back by the client with every request.
Session handling is done through the web.session module in the following manner:
import web.session
session = web.session.start( option1, Option2,... )
session['myVariable'] = 'It can be requested'
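
The mechanism itself can be sketched without any web framework. The SessionStore class below is hypothetical and is not the web.session API. It only shows the idea of a server-side store keyed by an SID, with an in-memory dictionary standing in for the file or database.

```python
import secrets

class SessionStore:
    """Hypothetical in-memory stand-in for a file- or database-backed store."""

    def __init__(self):
        self._sessions = {}

    def start(self, sid=None):
        """Return (sid, data); create a new session if the SID is unknown."""
        if sid is None or sid not in self._sessions:
            sid = secrets.token_urlsafe(32)  # unguessable session ID
            self._sessions[sid] = {}
        return sid, self._sessions[sid]

store = SessionStore()

# First request: no SID yet, so the server creates one and would send it
# to the client, typically in a cookie.
sid, session = store.start()
session["myVariable"] = "It can be requested"

# Later request: the client sends the SID back and the state is found again.
same_sid, same_session = store.start(sid)
print(same_session["myVariable"])
```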

Django Python

Ques:- What is DRF (Django REST framework)?

Asked In :- Techwave Consulting Inc, CloudIX, iLeap, Vantage Systech, Synapse Labs, Zorang, Digilytics, playstation, smiths detection, latentview analytics,

Right Answer:
DRF stands for Django REST framework, a toolkit for building Web APIs on top of Django. It provides serializers, authentication and permission classes, and viewsets, which simplify the process of creating RESTful APIs.

Django Python

Ques:- What are the types of model relationships in Django?

Asked In :- Incise Infotech, EpiSource, Talentica, Relinns Technologies, Zinier, Intugine Technologies, Ansoft, qinetiq, medidata solutions, sopra banking software,

Right Answer:
The types of model relationships in Django are:

1. **One-to-One** (OneToOneField): Each record in one model is related to exactly one record in another model.
2. **Many-to-One** (ForeignKey): Many records in one model can be related to a single record in another model. Seen from the other side, this is a one-to-many relationship.
3. **Many-to-Many** (ManyToManyField): Records in one model can be related to multiple records in another model, and vice versa.

ASP.NET C HTML JAVA sql Technical Support Engineer XML

Ques:- What is deadlock?

Asked In :- Addweb solutions, Sun Technology Integrators, Lucid Imaging, FOX SOLUTIONS, Anand Rathi IT, AgilePoint Software India, Fulcrum Logic (I), Plural Technology Inc, Cloud Assert, Caliber Technologies,

Right Answer:
A deadlock is a situation in computer systems where two or more processes are unable to proceed because each is waiting for the other to release a resource, resulting in a standstill.
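
A common way to avoid this situation is to make every thread acquire locks in the same order. The sketch below uses Python threads rather than processes, and the lock names and the worker function are made up for the example.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
counter = {"value": 0}

def worker(first, second, repeats):
    # Deadlock risk: one thread takes A then B while the other takes B then A,
    # so each can end up holding one lock and waiting for the other.
    # Sorting by id() gives every thread the same acquisition order.
    ordered = sorted((first, second), key=id)
    for _ in range(repeats):
        with ordered[0]:
            with ordered[1]:
                counter["value"] += 1

# The two threads request the locks in opposite order on purpose.
t1 = threading.Thread(target=worker, args=(lock_a, lock_b, 1000))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a, 1000))
t1.start()
t2.start()
t1.join()
t2.join()

print(counter["value"])
```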

Database Administrator (dba) SQL Server

Ques:- Can somebody tell me the difference between a clustered and a non-clustered index?

Asked In :- New Vision Info Tech, leaap international, SimpleMinds Technologies LLP, Comviva, Clover Technologies, Payoda, Miracle Software Systems, GoldenSource Corporation, KareXpert Technologies, Axxonet System Technologies,

Project Leader/ Project Manager SQL Server

Ques:- What are the new enhancements in SQL server 2005?

Asked In :- Xoriant Solutions, Infomats Technologies, PrimeSoft Solutions, RoboSoft, Mazenet Solution, Vinove Software and Services, QuantumLink, Valuefy Solutions, ACC Cement, VectoScalar Technologies,

Software Developer/ Programmer SQL Server

Ques:- What is the sum of 3 consecutive integers between 20 and 50?

Asked In :- TRICON INFOTECH PVT, McKinsey Knowledge Center, Tatvic Analytics, FACTENTRY DATA SOLUTIONS, ION Group, Celebal, Skolaro, Healthcare Informatics, Ultramain Systems, Classplus,

Senior Database Administrator sql

Ques:- What do you mean by a trigger?

Asked In :- Amrut Software, Chella Software, Quantile, World Bank, Sevya IT, ITH Technologies (Infotech HUB), SGV & Co, Saral Technologies, iNEXTURE, Susquehanna International Group (SIG),

Right Answer:
A trigger is a special type of stored procedure in a database that automatically executes or fires in response to certain events on a particular table or view, such as insertions, updates, or deletions.
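
A minimal sketch of a trigger, using SQLite through Python's sqlite3 module because it runs without a database server. The table and trigger names are hypothetical, and the syntax differs in other systems such as SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER, new_balance INTEGER);

    -- Fires automatically after every update of the balance column.
    CREATE TRIGGER log_balance_change
    AFTER UPDATE OF balance ON accounts
    BEGIN
        INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
    END;
""")

conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.execute("UPDATE accounts SET balance = 150 WHERE id = 1")

# The application never wrote to audit_log directly; the trigger did.
audit_rows = conn.execute("SELECT * FROM audit_log").fetchall()
print(audit_rows)
```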

Business Analysis Business Analyst

Ques:- What are the fields used for Project Planning in Ms Project?

Asked In :- Hidden Brains InfoTech, MattsenKumar Services, FIS Global Business Solutions India, SPARX IT SOLUTIONS, Ray Business Technologies, Elsner Technologies, Noesys Consulting, Target Integration, Evolvus Solutions, sciative,

Right Answer:
The fields used for Project Planning in MS Project include:

1. Task Name
2. Duration
3. Start Date
4. Finish Date
5. Predecessors
6. Resources
7. Percent Complete
8. Work
9. Cost
10. Milestones

Business Analysis

Ques:- Given a set of data, how do you present it in different formats such as a pivot table, and match it against another dataset?

Asked In :- WSNE Consulting, AnAr Solutions, MattsenKumar Services, FIS Global Business Solutions India, TRICON INFOTECH PVT, SPARX IT SOLUTIONS, SmartData Enterprises, Adnate IT Solutions, Happiest Minds Technologies Pvt., Ray Business Technologies,

Right Answer:
To analyze data for different formats like pivot tables and matching datasets, you should:

1. **Identify Key Variables**: Determine the key fields that will be used for matching and pivoting.
2. **Clean the Data**: Ensure that the data is free from duplicates, errors, and inconsistencies.
3. **Use Pivot Tables**: Create pivot tables to summarize and analyze the data by aggregating values based on categories.
4. **Match Data**: Use functions like VLOOKUP or JOIN operations in SQL to match data from different sources based on the identified key variables.
5. **Validate Results**: Check the accuracy of the matched data and the pivot table outputs to ensure they meet business requirements.
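
The pivot and matching steps can be sketched in plain Python. The sales rows, the managers lookup, and all field names are invented for this example. The dictionary lookup plays the role of VLOOKUP or a SQL JOIN on the key field.

```python
from collections import defaultdict

# Hypothetical source data; "region" is the key variable.
sales = [
    {"region": "North", "product": "A", "amount": 100},
    {"region": "North", "product": "B", "amount": 50},
    {"region": "South", "product": "A", "amount": 70},
    {"region": "North", "product": "A", "amount": 30},
]
managers = {"North": "Asha", "South": "Ravi"}  # second dataset to match against

# Pivot: rows are regions, columns are products, values are summed amounts.
pivot = defaultdict(lambda: defaultdict(int))
for row in sales:
    pivot[row["region"]][row["product"]] += row["amount"]

# Match: look up each row's region in the other dataset.
matched = [{**row, "manager": managers.get(row["region"])} for row in sales]

# Validate: rows whose key found no match need a closer look.
unmatched = [row for row in matched if row["manager"] is None]

print(dict(pivot["North"]), matched[2]["manager"], len(unmatched))
```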

Business Analysis Product Manager

Ques:- What is the difference between BRD, SRS, and use case documents?

Asked In :- Hidden Brains InfoTech, MattsenKumar Services, Protege Solutions, FIS Global Business Solutions India, TRICON INFOTECH PVT, Webvillee Technology, Energytech Global, COEPD, Tredence Analytics Solutions, Noesys Consulting,

Right Answer:
BRD (Business Requirements Document) outlines the high-level business needs and objectives. SRS (Software Requirements Specification) details the functional and non-functional requirements for the software. Use Case documents describe specific interactions between users and the system to achieve particular goals.

Business Analysis

Ques:- About database related

Asked In :- MattsenKumar Services, Protege Solutions, Webvillee Technology, SmartData Enterprises, Adnate IT Solutions, Happiest Minds Technologies Pvt., COEPD, Noesys Consulting, Promact Infotech, scandid,

Right Answer:
Could you please specify the exact question related to databases?

