
Adaface Sample Natural Language Processing Questions

Here are some sample Natural Language Processing questions from our premium questions library (10273 non-googleable questions).


Medium

Hate Speech Detection Challenge
Text Classification
Data Imbalance
You are working on a project to detect hate speech in social media posts. Your initial model, a basic binary classification model, has achieved high accuracy during training, but it's not performing well on the validation set. You also notice that your dataset has significantly more non-hate-speech examples than hate-speech examples. Given this situation, which of the following strategies could likely improve the performance of your model?
A: Collect more data and retrain the model.
B: Introduce data augmentation techniques specifically for hate-speech examples.
C: Change the model architecture from binary classification to multi-class classification.
D: Replace all the words in the posts with their synonyms to increase the diversity of the data.
E: Remove the non-hate-speech examples from the dataset to focus on the hate-speech examples.
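
To make the minority-class augmentation idea in option B concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used by the question's author: the helper names `augment` and `balance_by_augmentation`, the label strings, and the token-dropout noise are all hypothetical choices made for this example.

```python
import random

def augment(text, rng, drop_prob=0.1):
    """Return a noisy copy of text by randomly dropping tokens.

    Token dropout is only one possible augmentation (an assumption here).
    If every token would be dropped, the original text is returned.
    """
    tokens = text.split()
    kept = [t for t in tokens if rng.random() > drop_prob]
    return " ".join(kept) if kept else text

def balance_by_augmentation(examples, minority_label, seed=0):
    """Append augmented minority examples until both classes have equal size.

    examples is a list of (text, label) pairs; the input list is not modified.
    """
    rng = random.Random(seed)
    minority = [text for text, label in examples if label == minority_label]
    if not minority:
        raise ValueError("no examples carry the minority label")
    majority_count = len(examples) - len(minority)
    balanced = list(examples)
    for _ in range(max(0, majority_count - len(minority))):
        source = rng.choice(minority)
        balanced.append((augment(source, rng), minority_label))
    return balanced

# Placeholder posts: five majority examples and one minority example.
train = [("placeholder neutral post number %d" % i, "not_hate") for i in range(5)]
train.append(("placeholder flagged post text", "hate"))
balanced_train = balance_by_augmentation(train, minority_label="hate")
```

In a setup like this, augmentation would normally be applied to the training split only, so that the validation set still reflects the real class distribution.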

Easy

Identifying Fake Reviews
Text Classification
You are a data scientist at an online marketplace company. Your task is to develop a solution to identify fake reviews on your platform. You have a dataset where each review is marked as either 'genuine' or 'fake'. After developing an initial model, you find that it's accurately classifying 'genuine' reviews but performing poorly with 'fake' ones. Which of the following steps can likely improve your model's performance in this context?
A: Use a more complex model to capture the intricacies of 'fake' reviews.
B: Obtain more data to improve the overall performance of the model.
C: Implement a cost-sensitive learning approach, placing a higher penalty on misclassifying 'fake' reviews.
D: Translate the reviews to another language and then back to the original language to enhance their clarity.
E: Remove the 'genuine' reviews from your training set to focus on 'fake' reviews.
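
The cost-sensitive approach named in option C can be sketched as a class-weighted log loss. This is a minimal illustration, assuming weights inversely proportional to class frequency; the function names `class_weights` and `weighted_log_loss` and the example labels are hypothetical and do not come from the question.

```python
import math

def class_weights(labels):
    """Weight each class as n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    return {y: n / (k * c) for y, c in counts.items()}

def weighted_log_loss(labels, probs_fake, weights, eps=1e-12):
    """Weighted mean binary cross-entropy.

    probs_fake[i] is the predicted probability that review i is 'fake'.
    Errors on a class with a larger weight contribute more to the loss.
    """
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, probs_fake):
        p = min(max(p, eps), 1 - eps)          # avoid log(0)
        p_true = p if y == "fake" else 1 - p   # probability given to the true class
        total += -weights[y] * math.log(p_true)
        weight_sum += weights[y]
    return total / weight_sum

labels = ["genuine"] * 8 + ["fake"] * 2
weights = class_weights(labels)  # 'fake' gets the larger weight
```

With eight 'genuine' and two 'fake' reviews, the weights come out to 0.625 and 2.5, so a missed 'fake' review is penalized four times as heavily as a missed 'genuine' one.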

Medium

Sentence probability
N-Grams
Language Models
Consider the following pseudo code for calculating the probability of a sentence using a bigram language model:
[image: pseudo code, not reproduced]
Assume that the bigram and unigram counts are as follows:

bigram_counts = {("i", "like"): 2, ("like", "cats"): 1, ("cats", "too"): 1}
unigram_counts = {"i": 2, "like": 2, "cats": 2, "too": 1}
vocabulary_size = 4

What is the probability of the sentence "I like cats too" using the bigram language model?
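
Because the question's pseudo code is an image that is not reproduced here, the following Python sketch is only one plausible reading of it. It assumes the sentence probability is the product of bigram conditionals with add-k smoothing (suggested by the presence of `vocabulary_size`), with no start-of-sentence token; the function name `sentence_probability` and the `smoothing` parameter are hypothetical.

```python
from fractions import Fraction

bigram_counts = {("i", "like"): 2, ("like", "cats"): 1, ("cats", "too"): 1}
unigram_counts = {"i": 2, "like": 2, "cats": 2, "too": 1}
vocabulary_size = 4

def sentence_probability(sentence, smoothing=1):
    """Product of P(word | previous word) over consecutive word pairs.

    Each factor is (count(prev, word) + k) / (count(prev) + k * V).
    smoothing=0 gives the unsmoothed maximum-likelihood estimate, which
    raises ZeroDivisionError if a previous word was never seen.
    """
    words = sentence.lower().split()
    prob = Fraction(1)
    for prev, word in zip(words, words[1:]):
        numerator = bigram_counts.get((prev, word), 0) + smoothing
        denominator = unigram_counts.get(prev, 0) + smoothing * vocabulary_size
        prob *= Fraction(numerator, denominator)
    return prob

print(sentence_probability("I like cats too"))               # 1/18 (add-one smoothing)
print(sentence_probability("I like cats too", smoothing=0))  # 1/4 (unsmoothed)
```

Under add-one smoothing the factors are 3/6, 2/6, and 2/6, giving 1/18; without smoothing they are 2/2, 1/2, and 1/2, giving 1/4. If the original pseudo code also multiplies in a unigram probability for the first word, the result would differ.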

Easy

Tokenization and Stemming
Stemming
You are working on a natural language processing project and need to preprocess the text data for further analysis. Your task is to tokenize the text and apply stemming to the tokens. Assuming you have an English text corpus, which of the following combinations of tokenizer and stemmer would most likely result in the best balance between token granularity and generalization?
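
The answer options for this question are not shown on the page, so the sketch below only illustrates the general tokenize-then-stem pipeline the question describes. It is a toy under stated assumptions: `tokenize` is a simple regex word tokenizer and `toy_stem` is a crude suffix stripper written for this example, not the Porter or Snowball algorithm that a real project would more likely take from an NLP library.

```python
import re

# Toy suffix list (an assumption for this sketch), checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def tokenize(text):
    """Lowercase the text and split it into runs of letters, digits, or apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def toy_stem(token):
    """Strip the first matching suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The cats were jumping!")
stems = [toy_stem(t) for t in tokens]
print(tokens)  # ['the', 'cats', 'were', 'jumping']
print(stems)   # ['the', 'cat', 'were', 'jump']
```

The trade-off the question points at is visible even here: a more aggressive suffix list merges more word forms (better generalization) but also conflates unrelated words, while a finer tokenizer keeps more distinctions at the cost of a larger vocabulary.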

Medium

Word Sense Disambiguation
You have been provided with a pre-trained BERT model (pretrained_bert_model) and you need to perform Word Sense Disambiguation (WSD) on the word "bat" in the following sentence:

"The bat flew around the room."

You have also been provided with a function called cosine_similarity(vec1, vec2) that calculates the cosine similarity between two vectors.
Which of the following steps should you perform to disambiguate the word "bat" in the given sentence using the BERT model and cosine similarity?

1. Tokenize the sentence and pass it through the pre-trained BERT model.
2. Extract the embeddings of the word "bat" from the sentence.
3. Calculate the cosine similarity between the "bat" embeddings and each sense's representative words.
4. Choose the sense with the highest cosine similarity.
5. Calculate the Euclidean distance between the "bat" embeddings and each sense's representative words.
6. Choose the sense with the lowest Euclidean distance.
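
The cosine-similarity path through the steps above (compare the "bat" embedding with each sense and keep the most similar one) can be sketched as follows. This is a minimal illustration, not the question's reference solution: the three-dimensional vectors are made-up stand-ins for real BERT embeddings, the `cosine_similarity` body is a stand-in for the function the question says is provided, and `disambiguate` and the sense names are hypothetical.

```python
import math

def cosine_similarity(vec1, vec2):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)

def disambiguate(target_vec, sense_vecs):
    """Return the sense whose representative vector is most similar to target_vec."""
    return max(sense_vecs, key=lambda s: cosine_similarity(target_vec, sense_vecs[s]))

# Made-up vectors standing in for contextual embeddings (assumption).
bat_embedding = [0.9, 0.1, 0.2]
sense_vectors = {
    "animal": [0.8, 0.2, 0.1],
    "sports equipment": [0.1, 0.9, 0.3],
}
print(disambiguate(bat_embedding, sense_vectors))  # animal
```

In practice the target vector would come from the model's output at the position of "bat" (possibly averaged over several sub-word pieces), and each sense vector from embeddings of representative words or example sentences for that sense.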
🧐 Question | 🔧 Skill | 💪 Difficulty | ⌛ Time
Hate Speech Detection Challenge (Text Classification, Data Imbalance) | Natural Language Processing | Medium | 2 mins
Identifying Fake Reviews (Text Classification) | Natural Language Processing | Easy | 2 mins
Sentence probability (N-Grams, Language Models) | Natural Language Processing | Medium | 2 mins
Tokenization and Stemming (Stemming) | Natural Language Processing | Easy | 2 mins
Word Sense Disambiguation | Natural Language Processing | Medium | 2 mins

Trusted by recruitment teams in enterprises globally

Amazon, Morgan Stanley, Vodafone, United Nations, HCL, PayPal, Bosch, WeWork, Optimum Solutions, Deloitte, Microsoft, NCS, Doubtnut, Sokrati, J&T Express, Capgemini

We evaluated several of their competitors and found Adaface to be the most compelling. Great library of questions that are designed to test for fit rather than memorization of algorithms.


Swayam Narain, CTO, Affable

Join 1200+ companies in 75+ countries.
Try the most candidate-friendly skills assessment tool today.
Ready to streamline your recruitment efforts with Adaface?
40 min tests.
No trick questions.
Accurate shortlisting.