Data Scientist - interview questions

Freshers to Experienced Candidates

Sorted interview questions


Python

  1. What is a decorator in Python, and how does it work?

    A decorator is like a wrapper that adds extra features to a function without changing its original code. It works by taking one function as input and returning another function that adds new behavior.
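    For example, a minimal hand-written decorator (the name log_calls is just illustrative) might look like this:

    import functools

    def log_calls(func):
        @functools.wraps(func)          # keeps the original function's name and docstring
        def wrapper(*args, **kwargs):
            print(f"Calling {func.__name__}")
            return func(*args, **kwargs)
        return wrapper

    @log_calls
    def add(a, b):
        return a + b

    print(add(2, 3))   # prints "Calling add", then 5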

  1. How do you apply a decorator using the @ syntax?

    You just write @decorator_name above the function you want to decorate. For example:

    @login_decorator
    def user_login():
        print("Welcome to the system!")
  1. Can you explain decorators with arguments and give an example?

    Yes. A decorator with arguments means we can pass values to the decorator.

    For example:

    def repeat(num):
        def inner(func):
            def wrapper():
                for _ in range(num):
                    func()
            return wrapper
        return inner
    
    @repeat(3)
    def greet():
        print("Hi!")
    
    

    Here, repeat(3) will make greet() run 3 times.

  1. What is the difference between iterating normally and using enumerate()?

    Plain iteration just loops through the items one by one. enumerate() gives both the index and the value, which is helpful when we need the position of items in a loop.
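    For example:

    fruits = ["apple", "banana", "cherry"]

    for fruit in fruits:                          # plain iteration: values only
        print(fruit)

    for i, fruit in enumerate(fruits, start=1):   # enumerate: index and value
        print(i, fruit)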

  1. What is the difference between a list and a tuple? Which one is faster and why?
    • List → can be changed (mutable).
    • Tuple → cannot be changed (immutable).

      Tuples are faster because they use less memory and can’t be modified.

  1. What is the difference between a static method, a class method, and an abstract method?

  • Static method: Belongs to a class, but doesn’t use class or object data.
    class MathHelper:
        @staticmethod
        def add(a, b):
            return a + b
    
    # Calling without creating an object
    print(MathHelper.add(5, 3))
    
  • Class method: Uses the class itself (cls) as a parameter.
    class Student:
        school_name = "ABC School"
    
        @classmethod
        def change_school(cls, name):
            cls.school_name = name
    
    # Call class method without creating object
    Student.change_school("XYZ School")
    
    print(Student.school_name)
    🖨️ Output:
    XYZ School
  • Abstract method: Defined in abstract classes and must be implemented in child classes.
    from abc import ABC, abstractmethod
    
    class Shape(ABC):
        @abstractmethod
        def area(self):
            pass  # No implementation here
    
    class Circle(Shape):
        def area(self):
            return 3.14 * 5 * 5
    
    # obj = Shape()  ❌ Not allowed (abstract class)
    c = Circle()
    print(c.area())
    
    🖨️ Output: 78.5

  1. What is an iterator and a generator?
    • Iterator: Object that can be looped over using next().
    • Generator: A simpler way to create iterators using the yield keyword.
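    A small sketch showing both:

    nums = iter([1, 2, 3])        # iterator: values are pulled with next()
    print(next(nums))             # 1
    print(next(nums))             # 2

    def countdown(n):             # generator: yield produces values lazily
        while n > 0:
            yield n
            n -= 1

    for value in countdown(3):
        print(value)              # 3, 2, 1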
  1. What are the different types of inheritance (single, multiple, multilevel, hierarchical, hybrid)?
  1. Have you used multithreading or multiprocessing? What’s the difference?
    • Multithreading: Runs multiple threads in the same process (good for I/O tasks).
    • Multiprocessing: Runs multiple processes with separate memory (good for CPU tasks).
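    A minimal sketch contrasting the two with concurrent.futures (the task functions are only illustrative):

    import time
    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

    def io_task(n):
        time.sleep(1)                            # simulates waiting on I/O; threads overlap this wait
        return n

    def cpu_task(n):
        return sum(i * i for i in range(n))      # CPU-bound work; separate processes bypass the GIL

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=4) as pool:
            print(list(pool.map(io_task, range(4))))          # ~1 second total, not 4

        with ProcessPoolExecutor(max_workers=4) as pool:
            print(list(pool.map(cpu_task, [2_000_000] * 4)))  # can use multiple cores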
  1. What is the Global Interpreter Lock (GIL)?

    The GIL is a lock in Python that allows only one thread to run at a time, even on multi-core CPUs.

    It helps avoid memory conflicts but limits true parallelism.

  1. Have you worked with polymorphism?

    Yes. Polymorphism means using a single function name or method in different ways based on the object or data type.

    For example, two different classes can have the same method name, like speak(), but each class can implement it differently.

    It helps in writing flexible and reusable code by allowing one interface to work with multiple object types.
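    A small sketch (Dog and Cat are just illustrative classes):

    class Dog:
        def speak(self):
            return "Woof"

    class Cat:
        def speak(self):
            return "Meow"

    for animal in [Dog(), Cat()]:   # one interface, different behavior per object
        print(animal.speak())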

  1. Have you written or used decorators in Python or FastAPI?

    Yes. I’ve used decorators in both Python and FastAPI.

    In Python, I’ve written custom decorators to add logging, authentication, or validation logic around functions.

    In FastAPI, I’ve used built-in decorators like @app.get("/") and @app.post("/login") to define API routes.

    Decorators help keep the code clean and reusable by separating extra features from the main logic.
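    A minimal FastAPI sketch using those route decorators (the handler names and bodies are only illustrative):

    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/")                      # registers this function for GET /
    def read_root():
        return {"message": "Welcome to the system!"}

    @app.post("/login")                # registers this function for POST /login
    def user_login(username: str):
        return {"status": "ok", "user": username}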

  1. How would you find the employee with the maximum salary using Python (pandas)?

    df.loc[df['salary'].idxmax()]

  1. Given a dictionary with cities, speeds, and units, convert to DataFrame and standardize to km/h.
    import pandas as pd  
    
    data = {
        'city': ['A', 'B'],
        'speed': [60, 40],
        'unit': ['km/h', 'miles/h']
    }
    
    df = pd.DataFrame(data)
    
    df['speed_kmh'] = df.apply(
        lambda x: x['speed'] * 1.60934 if x['unit'] == 'miles/h' else x['speed'],  # 1 mile ≈ 1.609 km
        axis=1
    )
    


🧮 SQL

  1. How would you find the second-highest salary and the employee name?

    We can use the LIMIT and OFFSET keywords in SQL.

    SELECT name, salary
    FROM employees
    ORDER BY salary DESC
    LIMIT 1 OFFSET 1;
    
    

    Or using a subquery:

    SELECT name, salary
    FROM employees
    WHERE salary = (
      SELECT MAX(salary)
      FROM employees
      WHERE salary < (SELECT MAX(salary) FROM employees)
    );
    
    
  1. What is the OFFSET keyword in SQL?

    OFFSET is used to skip a specific number of rows before starting to return rows from the result set.

    For example:

    SELECT * FROM employees
    ORDER BY salary DESC
    LIMIT 5 OFFSET 5;
    
    

    This will skip the first 5 rows and show the next 5.

  1. What’s the SQL equivalent for “employee with max salary” logic?

We can use the MAX() function or ORDER BY with LIMIT.

SELECT name, salary
FROM employees
WHERE salary = (SELECT MAX(salary) FROM employees);

Or,

SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 1;

  1. What is the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN?
    • INNER JOIN: Returns only matching records from both tables.
    • LEFT JOIN: Returns all records from the left table and matched records from the right.
    • RIGHT JOIN: Returns all records from the right table and matched records from the left.
    • FULL JOIN: Returns all records from both tables, with NULLs filled in where there is no match on either side.
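    For example, assuming employees and departments tables linked by dept_id:

    -- INNER JOIN: only employees that have a matching department
    SELECT e.name, d.dept_name
    FROM employees e
    INNER JOIN departments d ON e.dept_id = d.dept_id;

    -- LEFT JOIN: every employee, with NULL dept_name when there is no match
    SELECT e.name, d.dept_name
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.dept_id;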
  1. What is the difference between WHERE and HAVING clauses?
    • WHERE filters rows before grouping (used with individual rows).
    • HAVING filters groups after aggregation (used with GROUP BY).

      Example:

    SELECT department, COUNT(*)
    FROM employees
    WHERE salary > 30000
    GROUP BY department
    HAVING COUNT(*) > 5;
    
    
  1. How do you find duplicate records in a table?

    We can use GROUP BY with HAVING COUNT() > 1.

    SELECT name, COUNT(*)
    FROM employees
    GROUP BY name
    HAVING COUNT(*) > 1;
    
    
  1. What is the difference between RANK(), DENSE_RANK(), and ROW_NUMBER()?
    • RANK(): Skips ranking numbers if there are ties.
    • DENSE_RANK(): Doesn’t skip ranking numbers for ties.
    • ROW_NUMBER(): Assigns a unique sequential number to each row.

    Example:

    Salary | RANK | DENSE_RANK | ROW_NUMBER
    5000   | 1    | 1          | 1
    5000   | 1    | 1          | 2
    4000   | 3    | 2          | 3
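    For example, ranking employees by salary (assuming the same employees table):

    SELECT name, salary,
           RANK()       OVER (ORDER BY salary DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
    FROM employees;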

  1. What is the difference between UNION and UNION ALL?
    • UNION: Combines results and removes duplicates.
    • UNION ALL: Combines results and keeps duplicates.

      Example:

    SELECT name FROM employees_2024
    UNION
    SELECT name FROM employees_2025;
    
    
  1. How do you find the second highest salary from the Employee table?
    SELECT MAX(salary)
    FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees);
    
    

    Or, using ORDER BY:

    SELECT salary
    FROM employees
    ORDER BY salary DESC
    LIMIT 1 OFFSET 1;
    
    
  1. What is a subquery and what are its types?

    A subquery is a query inside another query.

    Types:

    1. Single-row subquery — returns one value.
    1. Multi-row subquery — returns multiple values.
    1. Correlated subquery — depends on the outer query.
    1. Nested subquery — multiple levels of subqueries.
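    For example, a correlated subquery (assuming employees has a dept_id column):

    -- Employees earning above their own department's average salary
    SELECT e.name, e.salary
    FROM employees e
    WHERE e.salary > (
      SELECT AVG(salary)
      FROM employees
      WHERE dept_id = e.dept_id   -- refers back to the outer query's row
    );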
  1. What are primary key and foreign key?
    • Primary Key: Uniquely identifies each record in a table. It cannot have NULL values.
    • Foreign Key: Creates a relationship between two tables, linking a column in one table to the primary key of another.

    Example:

    CREATE TABLE Department (
      dept_id INT PRIMARY KEY,
      dept_name VARCHAR(50)
    );
    
    CREATE TABLE Employee (
      emp_id INT PRIMARY KEY,
      emp_name VARCHAR(50),
      dept_id INT,
      FOREIGN KEY (dept_id) REFERENCES Department(dept_id)
    );
    
    
  1. What is normalization and why is it important?

    Normalization is the process of organizing data to reduce redundancy and improve data integrity.

    Common normal forms:

    • 1NF: Remove repeating groups.
    • 2NF: Remove partial dependencies.
    • 3NF: Remove transitive dependencies.

      It helps make the database efficient, consistent, and easier to maintain.

  1. What is the difference between DELETE, TRUNCATE, and DROP?

    Command  | Description                          | Can Rollback | Removes Structure
    DELETE   | Removes specific rows                | ✅ Yes       | ❌ No
    TRUNCATE | Removes all rows, faster than DELETE | ❌ No        | ❌ No
    DROP     | Deletes the entire table             | ❌ No        | ✅ Yes

    Example:

    DELETE FROM employees WHERE id = 5;
    TRUNCATE TABLE employees;
    DROP TABLE employees;
    
    





☁️ AWS / Cloud / GCP

  1. Do you have experience with GCP services?
    1. Yes, I have hands-on experience with key GCP services such as Compute Engine for virtual machines, Cloud Storage for object storage, BigQuery for analytics, and Cloud Functions for serverless execution. I’ve also used Vertex AI for model deployment and Cloud Run for containerized applications.
  1. What is AWS and what are its key services?

    AWS (Amazon Web Services) is a cloud computing platform that provides infrastructure and services on demand.

    Key services include:

    • EC2 for virtual servers,
    • S3 for object storage,
    • RDS for relational databases,
    • Lambda for serverless computing,
    • VPC for network isolation, and
    • CloudWatch for monitoring and logging.
  1. How would you assess AWS Lambda file-processing security?

    To secure AWS Lambda for file processing:

    • Use IAM roles with least privilege.
    • Validate and sanitize all input files.
    • Store temporary data in encrypted S3 buckets.
    • Use AWS KMS for encryption.
    • Enable VPC access if connecting to internal systems.
    • Implement CloudWatch Logs and AWS Config for monitoring.

      This ensures both data integrity and secure file handling.

  1. Which AWS services have you used in RAG or AI projects?

    In RAG (Retrieval-Augmented Generation) and AI workflows, I’ve used:

    • S3 for document and embedding storage,
    • Lambda for data preprocessing,
    • OpenSearch or Aurora for vector storage,
    • SageMaker for model training and inference, and
    • API Gateway for serving model endpoints securely.
  1. What is the difference between EC2, Lambda, and ECS?
    • EC2: Provides full control over virtual machines — you manage scaling and OS.
    • Lambda: Runs functions on demand — completely serverless, no infrastructure management.
    • ECS: Manages Docker containers — suitable for microservices and containerized workloads.
  1. What is IAM and why is it important?

    IAM (Identity and Access Management) controls who can access AWS resources and what actions they can perform.

    It’s important for:

    • Implementing least-privilege access,
    • Managing users, groups, and roles,
    • Enabling MFA (Multi-Factor Authentication),
    • Ensuring compliance and security of the cloud environment.
  1. What are Security Groups and NACLs in AWS?
    • Security Groups act as virtual firewalls for EC2 instances, controlling inbound and outbound traffic at the instance level.
    • Network ACLs (Access Control Lists) control traffic at the subnet level.

      Security Groups are stateful, while NACLs are stateless.

  1. What is the difference between S3 and EBS?
    • S3 (Simple Storage Service): Object storage — used for static files, backups, and big data.
    • EBS (Elastic Block Store): Block storage — used as a hard disk for EC2 instances.

      S3 is scalable and accessed via APIs; EBS is attached to a single instance.

  1. What is the difference between Load Balancer types in AWS?

    AWS provides three main types:

    • Application Load Balancer (ALB): Works at Layer 7, for HTTP/HTTPS traffic.
    • Network Load Balancer (NLB): Works at Layer 4, for high performance and TCP/UDP traffic.
    • Gateway Load Balancer (GLB): Used for deploying and scaling network appliances like firewalls.
  1. What is Auto Scaling in AWS?

    Auto Scaling automatically adjusts the number of EC2 instances or containers based on demand.

    It helps maintain performance while reducing costs by scaling resources up or down according to metrics like CPU or request rate.

  1. What is the difference between RDS and DynamoDB?
    • RDS: Relational database service supporting MySQL, PostgreSQL, and others. It uses structured schema and SQL queries.
    • DynamoDB: Fully managed NoSQL database with key-value pairs and high scalability.

      Use RDS for transactional workloads, and DynamoDB for high-throughput, flexible schema applications.

  1. What is the difference between Availability Zone (AZ) and Region?
    • Region: A geographical area containing multiple data centers.
    • Availability Zone (AZ): One or more isolated data centers within a region.

      Deploying across multiple AZs increases fault tolerance and uptime.

  1. How do you monitor AWS resources?

    AWS provides several monitoring tools:

    • CloudWatch for metrics, alarms, and logs,
    • CloudTrail for tracking API calls and user activity,
    • AWS Config for resource configuration changes, and
    • Trusted Advisor for cost and security optimization.

      These tools help maintain visibility, security, and operational health of AWS environments.

  • How did you deploy your applications — Docker, Kubernetes, or others?
  • What is the difference between Docker image and container?
  • Why do we use Docker?
  • How would you Dockerize a FastAPI app?
  • Have you created any DevOps pipelines?
  • How do you ensure responsible and ethical use of AI?
  • How would you design cloud security architecture?
  • What is Zero Trust in cloud security?

📊 EDA (Exploratory Data Analysis) / Data Handling

  1. What is EDA and why is it important?

    EDA (Exploratory Data Analysis) is the process of examining datasets to summarize their main characteristics using visual and statistical methods.

    It helps to understand data distribution, detect patterns, find missing values or outliers, and form hypotheses before applying any model.

    In short, EDA gives insights that guide better decision-making in data preprocessing and modeling.

  1. What are the main steps in performing EDA?

    The key steps in EDA are:

    1. Data Collection – gathering data from different sources.
    1. Data Cleaning – handling missing, duplicate, or inconsistent data.
    1. Data Profiling – understanding data types and summary statistics.
    1. Visualization – using plots like histograms, box plots, and scatter plots.
    1. Correlation Analysis – checking relationships between variables.
    1. Feature Engineering – creating or transforming variables for modeling.
  1. How do you handle missing data?

    There are several ways to handle missing data:

    • Delete rows or columns if missingness is small.
    • Impute using mean, median, or mode for numerical data.
    • Use interpolation or forward-fill for time series data.
    • Use predictive imputation with algorithms like KNN or regression.

      Choosing the right method depends on how much data is missing and its pattern.
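    A small pandas sketch of the imputation options (column names are just illustrative):

    import pandas as pd

    df = pd.DataFrame({"age": [25, None, 32, 40], "city": ["A", "B", None, "A"]})

    df["age"] = df["age"].fillna(df["age"].median())       # numeric: median imputation
    df["city"] = df["city"].fillna(df["city"].mode()[0])   # categorical: mode imputation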

  1. How do you detect and handle outliers?

    Outliers can be detected using:

    • Statistical methods: z-score or IQR (Interquartile Range).
    • Visual methods: box plots or scatter plots.

      Handling methods include:

    • Removing the outliers if they are due to data entry errors.
    • Transforming data (like using log or square root).
    • Capping values (winsorization) or using robust models less affected by outliers.
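    A small IQR sketch in pandas (assuming a DataFrame df with a numeric amount column):

    q1, q3 = df["amount"].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    outliers = df[(df["amount"] < lower) | (df["amount"] > upper)]   # flag outliers
    df["amount"] = df["amount"].clip(lower, upper)                   # or cap them (winsorization)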
  1. What types of plots do you use in EDA?

    Common plots used in EDA include:

    • Histogram: to check data distribution.
    • Box Plot: to identify outliers.
    • Scatter Plot: to see relationships between variables.
    • Heatmap: to visualize correlation.
    • Bar Chart and Pie Chart: for categorical variables.
  1. How do you check correlation between variables?

    Correlation measures how two variables move together.

    We use:

    • Pearson correlation for linear relationships.
    • Spearman correlation for ranked or non-linear relationships.

      A heatmap can visually show correlation between all numeric variables.

      In Python, we use df.corr() or sns.heatmap(df.corr()).

  1. What is multicollinearity and how do you detect it?

    Multicollinearity occurs when independent variables are highly correlated with each other.

    It can make model coefficients unstable.

    We detect it by:

    • Checking the correlation matrix.
    • Calculating the VIF (Variance Inflation Factor).

      If VIF is greater than 10, that variable may cause multicollinearity and should be removed or combined.
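    A sketch using statsmodels (the feature names are only illustrative):

    import pandas as pd
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    X = df[["age", "income", "spend"]]   # numeric predictors only
    vif = pd.DataFrame({
        "feature": X.columns,
        "VIF": [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
    })
    print(vif)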

  1. How do you identify skewness in data?

    Skewness shows how data is distributed — whether it leans to the left or right.

    We can detect it using:

    • Histogram or density plot.
    • The skew() function in pandas (df.skew()).

      If skewness is high, we can apply log, square root, or Box-Cox transformation to make it more normal.

  1. What's the difference between univariate, bivariate, and multivariate analysis?
    • Univariate Analysis: Examines one variable (e.g., histogram of age).
    • Bivariate Analysis: Studies relationships between two variables (e.g., scatter plot of income vs. age).
    • Multivariate Analysis: Looks at more than two variables together (e.g., correlation matrix or multiple regression).
  1. How do you detect and handle duplicate data?

    Duplicate data can be detected using:

    df.duplicated().sum()
    
    

    and removed using:

    df.drop_duplicates(inplace=True)
    

    It’s important to check duplicates carefully — sometimes, multiple entries may appear similar but represent valid data (like multiple transactions by the same customer).



  1. How large is the dataset in your project?
    1. In my project, the dataset contains around 5 lakh records collected from multiple sources such as transaction logs and customer activity data. The size helps in creating robust models with sufficient data for training and validation.
  1. How many features (variables) are included?
    1. The dataset includes approximately 25 to 30 features. These include numerical, categorical, and time-based variables like transaction amount, frequency, user ID, and location. Feature selection was performed using correlation analysis and domain knowledge to improve model efficiency.
  1. How did you handle missing values and data quality?

    Missing values were handled using techniques like mean or median imputation for numerical columns, and mode or most frequent category imputation for categorical data.

    For data quality, I removed duplicates, standardized formats, and validated ranges to ensure consistency and accuracy before training.

  • How did you identify patterns in fraud detection data?
    • I analyzed behavioral patterns such as unusual transaction times, high transaction amounts, and device changes. Using EDA techniques, I visualized data through histograms and boxplots to detect anomalies and applied correlation heatmaps to identify relationships between fraudulent activities and features.
  • How did you handle class imbalance (SMOTE, oversampling, undersampling)?
    • Since fraud cases were very rare compared to normal transactions, I used SMOTE (Synthetic Minority Oversampling Technique) to generate synthetic samples for the minority class. In some tests, I also tried undersampling the majority class to balance the dataset without losing important patterns.
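    A minimal SMOTE sketch with imbalanced-learn (X_train and y_train are assumed to exist):

    from imblearn.over_sampling import SMOTE

    smote = SMOTE(random_state=42)
    X_resampled, y_resampled = smote.fit_resample(X_train, y_train)   # synthesizes minority-class samples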
  • How did you process raw or historical data?
    • I collected raw data from multiple sources like APIs and SQL databases. Then, I performed preprocessing — such as removing noise, handling missing values, encoding categorical variables, and scaling numerical data. For historical data, I created time-based features like transaction trends, rolling averages, and frequency patterns to improve model accuracy.

🤖 Machine Learning (ML)

  1. Which ML algorithm did you use and why?
    1. I used Random Forest for classification because it handles both categorical and numerical data efficiently, reduces overfitting through ensemble learning, and provides good accuracy with minimal tuning. For regression tasks, I sometimes use XGBoost for its speed and high performance.
  1. What are the evaluation metrics you used?

    I used accuracy, precision, recall, F1-score, and AUC-ROC depending on the problem type.
    For imbalanced datasets like fraud detection, I prefer precision, recall, and F1-score over accuracy to better measure the model’s real-world performance.
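    A sketch with scikit-learn (X and y are assumed feature/label arrays):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    model = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)
    pred = model.predict(X_test)

    print("precision:", precision_score(y_test, pred))
    print("recall:   ", recall_score(y_test, pred))
    print("f1:       ", f1_score(y_test, pred))
    print("auc:      ", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))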

  1. How do you interpret results with a confusion matrix?

    A confusion matrix shows True Positives, True Negatives, False Positives, and False Negatives.

    It helps evaluate how well the model is performing by showing the number of correct and incorrect predictions in each category.
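    For example, with scikit-learn (toy labels, just for illustration):

    from sklearn.metrics import confusion_matrix

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")   # TP=3, TN=3, FP=1, FN=1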

  1. Why isn’t accuracy alone reliable?

    Accuracy can be misleading when the dataset is imbalanced.

    For example, in fraud detection where fraud cases are only 2%, even a model that predicts everything as “not fraud” can have 98% accuracy — but it’s useless. That’s why we consider precision, recall, and F1-score.

  1. What’s the difference between bagging and boosting?

    Bagging builds multiple independent models on random subsets of data and averages their results to reduce variance — Random Forest is an example.

    Boosting, on the other hand, builds models sequentially where each model corrects the previous one’s errors — examples include AdaBoost and XGBoost.

  1. Explain Random Forest (and Gini Index).

    Random Forest is an ensemble of decision trees built using bagging.

    It takes multiple samples of data, trains a tree on each, and averages the results.

    The Gini Index measures node impurity in decision trees — a lower Gini value means a purer node.

  1. What’s the difference between Random Forest and XGBoost?

    Random Forest uses bagging — it trains trees independently in parallel.
    XGBoost uses boosting — it builds trees sequentially, where each tree focuses on correcting errors made by the previous one.
    XGBoost is generally faster and performs better on complex datasets.

  1. What is cross-validation?

    Cross-validation is a technique to evaluate model performance by dividing data into multiple folds.

    The model trains on some folds and tests on the remaining one, repeating the process to get a more reliable performance estimate.
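    For example, 5-fold cross-validation with scikit-learn (X and y assumed):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    scores = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=5, scoring="f1")
    print(scores.mean(), scores.std())   # average score and its spread across folds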

  1. How do you handle model overfitting?

    I handle overfitting using techniques like cross-validation, early stopping, regularization (L1/L2), pruning in decision trees, and using dropout in neural networks.

    I also ensure data is properly split and features are not leaking information.

  1. What is the ROC (Receiver Operating Characteristic) curve?

    The ROC curve plots True Positive Rate against False Positive Rate at different thresholds.

    It shows the trade-off between sensitivity and specificity, helping evaluate model performance.

  1. What is AUC (Area Under the ROC Curve)?

    AUC represents the area under the ROC curve.

    It measures the model’s ability to distinguish between classes.

    A higher AUC indicates a better model — 1 means perfect classification, 0.5 means random guessing.
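    A sketch assuming a trained classifier model and a held-out X_test / y_test:

    from sklearn.metrics import roc_curve, auc

    fpr, tpr, thresholds = roc_curve(y_test, model.predict_proba(X_test)[:, 1])
    print("AUC:", auc(fpr, tpr))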

  1. What is Out-of-Bag error?

    In Random Forest, some data samples are left out while training each tree — these are called Out-of-Bag samples.

    OOB error is the average error for these samples, used as an unbiased estimate of model accuracy.

  1. What is bias vs. variance?

    Bias is the error due to simplifying the model too much — it leads to underfitting.

    Variance is the error due to model complexity — it leads to overfitting.

    The goal is to balance both using techniques like regularization and ensemble methods.

  1. What are dimensionality reduction techniques? (PCA, etc.)

    Dimensionality reduction reduces the number of input variables while preserving most of the information.

    PCA (Principal Component Analysis) transforms features into new components that capture maximum variance.

    Other methods include LDA, t-SNE, and Autoencoders.

  1. What’s the difference between K-Means and DBSCAN?

    K-Means groups data into a fixed number of clusters by minimizing distance to cluster centers.

    DBSCAN groups data based on density and can find clusters of arbitrary shapes without needing to specify the number of clusters.

  1. How does DBSCAN calculate data density?

    DBSCAN uses two parameters — epsilon (distance threshold) and minPoints (minimum neighbors).

    A point is considered a core point if it has at least minPoints within the epsilon distance.

    Clusters are formed by connecting core points that are close to each other.
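    A small scikit-learn sketch comparing the two on a toy dataset:

    from sklearn.cluster import KMeans, DBSCAN
    from sklearn.datasets import make_moons

    X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)

    kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(X)
    dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)   # eps = epsilon, min_samples = minPoints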


🧠 Deep Learning (DL)

  1. What is Deep Learning?
    1. Deep Learning is a subset of Machine Learning that uses neural networks with multiple layers to automatically learn complex patterns from large amounts of data. It is especially effective in areas like image recognition, speech processing, and natural language understanding.
  1. What is the difference between Machine Learning and Deep Learning?
    1. Machine Learning uses algorithms that rely on feature extraction done by humans, while Deep Learning automatically extracts features using neural networks. ML works well with structured data, while DL excels with unstructured data like images, videos, and text.
  1. What is a Neural Network?
    1. A Neural Network is a set of algorithms modeled after the human brain. It consists of layers of interconnected nodes (neurons) that process input data, learn relationships, and make predictions by adjusting weights through training.
  1. What are Activation Functions?
    1. Activation Functions introduce non-linearity into neural networks, allowing them to learn complex relationships. Common activation functions include ReLU, Sigmoid, and Tanh.
  1. What is Backpropagation?
    1. Backpropagation is the process used to train neural networks by calculating the gradient of the loss function with respect to each weight and updating the weights to minimize the error.
  1. What is Overfitting, and how can you prevent it?
    1. Overfitting occurs when a model performs well on training data but poorly on unseen data. It can be prevented using techniques like regularization, dropout, early stopping, and data augmentation.
  1. What are CNNs and RNNs?
    1. CNNs (Convolutional Neural Networks) are designed for spatial data like images; they use convolution layers to extract features. RNNs (Recurrent Neural Networks) are used for sequential data like text or time series, as they retain memory of previous inputs.
  1. Difference between RNN and LSTM.
    1. RNNs suffer from vanishing gradient problems and have short memory, while LSTMs (Long Short-Term Memory) solve this issue by using gates (input, forget, output) to store information over long sequences.
  1. What is the difference between Batch Gradient Descent, Stochastic, and Mini-batch?
    • Batch Gradient Descent: Uses the entire dataset for each update — accurate but slow.
    • Stochastic Gradient Descent (SGD): Updates after every sample — fast but noisy.
    • Mini-batch Gradient Descent: Uses small batches of data — balances speed and stability.
  1. What is Dropout in Deep Learning?
    1. Dropout is a regularization technique where a fraction of neurons are randomly turned off during training to prevent overfitting and improve generalization.
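    A minimal PyTorch sketch (layer sizes are just illustrative):

    import torch
    import torch.nn as nn

    class SimpleNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = nn.Linear(128, 64)
            self.dropout = nn.Dropout(p=0.5)   # randomly zeroes 50% of activations during training
            self.fc2 = nn.Linear(64, 10)

        def forward(self, x):
            x = torch.relu(self.fc1(x))
            x = self.dropout(x)                # automatically disabled in eval() mode
            return self.fc2(x)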
  1. What is Transfer Learning?
    1. Transfer Learning involves reusing a pre-trained model on a new task with a smaller dataset. It saves time and improves performance, especially when data is limited.
  1. Explain Transformer architecture and Attention mechanism.
    1. Transformers use self-attention mechanisms to weigh the importance of different parts of the input sequence. Unlike RNNs, they process all tokens in parallel, making them faster and more efficient. The architecture includes encoder-decoder blocks and is widely used in NLP models like BERT and GPT.


⚙️ MLOps

  • What tools or platforms did you use for deployment (Flask, FastAPI, SageMaker, Docker)?
  • Have you built CI/CD pipelines?
  • What’s your retraining frequency and strategy?
  • How do you monitor model drift or performance drops?
  • How do you deploy algorithms (cloud/on-premise/containerized)?

🔍 LLM / RAG / Generative AI

  • What LLM did you integrate and how?
  • Explain the architecture/flow of your RAG system.
  • How do you handle document chunking?
  • How do you reduce hallucinations in RAG?
  • What are your prompt engineering strategies?
  • What is the temperature parameter in LLMs?
  • How do you select or evaluate the best LLM?
  • Have you worked with LangChain or LangGraph?
  • What vector databases have you used (Pinecone, FAISS, etc.)?
  • How do you extract text and tables from PDFs and images?
  • How do you measure RAG performance?

💡 Other / Basic / HR

  • Tell me about yourself and your current role.
  • Why did you leave your previous job?
  • How many total years of experience do you have?
  • What are your strengths in Python?
  • Are you comfortable working in shifts or client locations?
  • What is your current and expected salary?
  • Why do you want to join this role?
  • Do you have any questions for me?