SAS interview questions and answers 👇

SAS General Interview Questions

What is Base SAS?

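Base SAS is the core of the SAS system: the DATA step language, the built-in procedures, the macro facility and the Output Delivery System. Its classic interface is a basic, text-based environment with an older look, whereas Enterprise Guide (EG) is a more GUI-like IDE with wizards to assist with writing code for various processes.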

Define the STD function.

The STD function returns the standard deviation of its nonmissing arguments.

What is BY-Group processing?

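BY-group processing is a method of processing observations that are grouped or ordered by the values of one or more common variables. The data must be sorted or indexed by the BY variables, and in the DATA step SAS creates the temporary variables FIRST.variable and LAST.variable to mark the start and end of each group.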
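
As a minimal sketch of BY-group processing, the step below totals an amount within each group. The data set and variable names (sales, region, amount) are made up for illustration.

```sas
/* Hypothetical sample data */
data sales;
  input region $ amount;
  datalines;
East 10
West 5
East 20
West 15
;
run;

/* BY-group processing needs the data sorted (or indexed) by the BY variable */
proc sort data=sales;
  by region;
run;

data region_totals;
  set sales;
  by region;
  if first.region then total = 0;   /* reset at the start of each group */
  total + amount;                   /* sum statement: total is retained */
  if last.region then output;       /* one row per region */
  keep region total;
run;
```

With this input, region_totals should hold one observation per region: East with 30 and West with 20.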

If a variable contains letters or special characters, can it be numeric data type?

No, it must be character data type.

How would you identify a macro variable?

By the ampersand (&) that prefixes its name wherever it is referenced.

When will you use SELECT construct instead of IF statement?

When there is a long series of mutually exclusive conditions and the comparison is numeric, a SELECT group is slightly more efficient than a series of IF-THEN or IF-THEN-ELSE statements because CPU time is reduced.
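
A short sketch of a SELECT group with a numeric comparison follows. The variable names (month, season) and the sample values are illustrative assumptions.

```sas
data labeled;
  length season $ 6;
  input month;
  select (month);                      /* numeric select-expression */
    when (12, 1, 2)  season = 'Winter';
    when (3, 4, 5)   season = 'Spring';
    when (6, 7, 8)   season = 'Summer';
    when (9, 10, 11) season = 'Autumn';
    otherwise        season = ' ';     /* anything outside 1-12 */
  end;
  datalines;
1
7
10
;
run;
```

The equivalent IF-THEN-ELSE chain would test month once per branch; the SELECT group states the mutually exclusive cases in one place.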

What are Global Statements in SAS?

Global statements can be used anywhere in a SAS program, and each one stays in effect until it is changed or canceled or until the SAS session ends.

What are SYMGET and SYMPUT?

CALL SYMPUT assigns a value produced in a DATA step to a macro variable, whereas SYMGET returns the value of a macro variable to the DATA step during execution.
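
The round trip can be sketched as below. The macro variable name report_title and the data set name titles are hypothetical.

```sas
/* DATA step value -> macro variable */
data _null_;
  call symput('report_title', 'Monthly Sales');
run;

/* The macro variable is available after the step boundary */
%put Title is &report_title;

/* Macro variable -> DATA step variable, resolved at execution time */
data titles;
  length rpt_title $ 20;
  rpt_title = symget('report_title');
run;
```

Note that a macro variable created with CALL SYMPUT cannot be referenced with & inside the same DATA step that creates it, because the reference is resolved before the step executes.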

What is SAS program?

A SAS program is a sequence of statements executed in order. Every SAS statement ends with a semicolon, and statements can be written in uppercase or lowercase letters.

Full form for SAS?

SAS originally stood for “Statistical Analysis System”.

What are the 3 components in SAS programming?

  • Statements
  • Variables
  • Dataset

Difference between Informat and Format.

Informat - Tells SAS how to read data values into a variable.

Format - Tells SAS how to write (display) the values of a variable.

Elucidate the FILECLOSE data set option.

The FILECLOSE= data set option specifies how a tape is positioned when a SAS data set is closed.

How to limit decimal places for variable using PROC MEANS?

By using the MAXDEC= option in the PROC MEANS statement.
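
For example, assuming the SASHELP.CLASS sample data set that ships with SAS is available, the following limits the printed statistics to one decimal place:

```sas
proc means data=sashelp.class maxdec=1;
  var height weight;
run;
```

MAXDEC= affects only how the statistics are displayed, not the precision of any values saved to an output data set.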

What is Linear Regression?

Linear regression is a statistical technique where the score of a variable Y is predicted from the score of a second variable X. X is referred to as the predictor variable and Y as the criterion variable.

Does SAS 'translate' (compile) or 'interpret'?

Compile. SAS compiles each step, checking its syntax and setting up its variables, and then executes it.

How do DATA step MERGE and PROC SQL handle a many-to-many relationship?

DATA step MERGE does not create a Cartesian product in a many-to-many relationship: within each BY group it pairs observations one-to-one, in order. PROC SQL, by contrast, produces a Cartesian product of the matching rows.
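
The difference can be seen with two small data sets that both repeat the same key. The names ds_a, ds_b, merged and joined are made up for this sketch.

```sas
data ds_a;
  input id a $;
  datalines;
1 a1
1 a2
;
run;

data ds_b;
  input id b $;
  datalines;
1 b1
1 b2
;
run;

/* MERGE pairs rows in order: (a1,b1) and (a2,b2), so 2 observations */
data merged;
  merge ds_a ds_b;
  by id;
run;

/* SQL join pairs every a with every b for id=1, so 4 rows */
proc sql;
  create table joined as
  select x.id, x.a, y.b
  from ds_a as x inner join ds_b as y
    on x.id = y.id;
quit;
```

SAS also writes a note to the log when a MERGE finds repeated BY values in more than one data set, which is a useful signal that the result may not be what was intended.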

Can you explain the process of CALENDAR?

PROC CALENDAR displays the data from a SAS data set in a monthly calendar format.

How do you use the do loop if you don’t know how many times you should execute the do loop?

Use DO UNTIL or DO WHILE, which repeat the loop based on a condition instead of a fixed count.

How many data types are there in SAS?

There are two data types in SAS: character and numeric. Dates are stored as numeric values (the number of days since January 1, 1960), and date formats and functions are used to display and work with them.
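
A quick way to see that a date is just a number is to print the same value with and without a date format:

```sas
data _null_;
  d = '01JAN1960'd;     /* date literal */
  put d=;               /* d=0 */
  put d= date9.;        /* d=01JAN1960 */
run;
```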

What is the one statement to set the criteria of data that can be coded in any step?

The WHERE statement sets the criteria for selecting observations from a SAS data set and can be used in both a DATA step and a PROC step.

What is the difference between PROC MEANS and PROC SUMMARY?

The difference between the two procedures is that PROC MEANS produces a report by default. By contrast, to produce a report in PROC SUMMARY, you must include a PRINT option in the PROC SUMMARY statement.

Explain the use of PROC GPLOT.

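PROC GPLOT produces high-resolution plots of one variable against another. The PROC GPLOT statement identifies the data set that contains the plot variables. Compared with PROC PLOT, it has more options and, therefore, can create more colorful and fancier graphics.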

What is the use of $BASE64X?

The $BASE64X format converts character data into ASCII text by using Base 64 encoding.

What is PDV?

The Program Data Vector (PDV) is the area of memory where SAS builds a data set, one observation at a time. When a DATA step that reads raw data executes, data values are read from the input buffer and assigned to their respective variables in the PDV.

How to redirect SAS user folder?

Use LIBNAME statement to redirect SAS user folder.

What are _N_ and _ERROR_ in SAS?

  • _N_ is an automatic counter variable that indicates the number of times SAS has looped through the DATA step.
  • _ERROR_ is an automatic variable that SAS creates during DATA step processing. It is 0 by default and is set to 1 when an error, such as an input data error or a conversion error, occurs during the current iteration.

What is the difference between %LOCAL and %GLOBAL?

%LOCAL creates macro variables that exist only while the macro in which they are defined is executing. %GLOBAL creates macro variables that are available for the whole SAS session, both inside macros and in open code.
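
A minimal sketch of the two scopes, using a hypothetical macro named demo and hypothetical variables tmp and shared:

```sas
%macro demo;
  %local tmp;        /* exists only while demo is running */
  %global shared;    /* survives after demo finishes */
  %let tmp = visible only inside demo;
  %let shared = visible everywhere;
%mend demo;

%demo

%put &shared;
/* A reference to tmp at this point would not resolve,
   because the local symbol table was deleted when demo ended. */
```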

What do you understand by CALL MISSING Routine?

The CALL MISSING routine assigns missing values to the character or numeric variables that are specified.

Do you know what the CALL PRXCHANGE routine is?

The CALL PRXCHANGE routine performs a pattern-matching replacement.

What is the purpose of trailing @ and @@?

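  • The single trailing @ tells the SAS system to “hold the line”: the current input record stays available to the next INPUT statement in the same iteration of the DATA step.
  • The double trailing @@ tells the SAS system to “hold the line more strongly”: the record is held across iterations of the DATA step, so several observations can be read from one line.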
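
Both line-hold specifiers can be sketched as follows. The data set and variable names are illustrative only.

```sas
/* Double trailing @@: three observations are read from one line */
data scores;
  input name $ score @@;
  datalines;
Ann 90 Bob 85 Cy 70
;
run;

/* Single trailing @: read the type first, then decide
   whether to read the rest of the same record */
data type_a;
  input type $ @;
  if type = 'A' then do;
    input id amount;
    output;
  end;
  datalines;
A 1 100
B 2 200
A 3 300
;
run;
```

In the second step the held record is released at the end of each iteration, so the B record is simply skipped and type_a should contain the two A records.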

How to count unique values by a grouping variable?

You can use PROC SQL with COUNT(DISTINCT variable_name) and a GROUP BY clause to count the unique values of a column within each group.
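
A small sketch, assuming a hypothetical orders data set with region as the grouping variable:

```sas
/* Hypothetical sample data */
data orders;
  input region $ customer_id;
  datalines;
East 101
East 101
East 102
West 201
;
run;

proc sql;
  select region,
         count(distinct customer_id) as unique_customers
  from orders
  group by region;
quit;
```

With this input the query should report 2 unique customers for East and 1 for West.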

What is the difference between do while and do until?

An important difference between the DO UNTIL and DO WHILE statements is that the DO WHILE expression is evaluated at the top of the DO loop, so if the expression is false the first time it is evaluated, the loop never executes. The DO UNTIL expression is evaluated at the bottom of the loop, so DO UNTIL always executes at least once.
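
The behavior is easy to check with a condition that is already decided before the loop starts:

```sas
data _null_;
  n = 5;

  do while (n < 5);            /* tested at the top: false, body skipped */
    put 'DO WHILE body ran';
  end;

  do until (n >= 5);           /* tested at the bottom: body runs once */
    put 'DO UNTIL body ran';
  end;
run;
```

Only the DO UNTIL message is written to the log.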

When grouping is in effect, can the WHERE clause be used in PROC SQL to subset data?

WHERE can still subset individual rows before they are grouped, but it cannot refer to summary statistics. To subset the groups themselves, the HAVING clause must be used, and its expression normally contains a summary function.

What do you mean by the ALTER= Data Set option?

It assigns an ALTER password to a SAS file, which stops users from replacing or deleting the file unless they supply the password.

What does the TRANWRD function do?

The TRANWRD function replaces or removes all occurrences of a substring within a character string.

How to specify variables to be processed by the FREQ procedure?

By using the TABLES statement.

What are the statements in PROC SQL?

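SELECT is the main statement. Its clauses are SELECT, FROM, WHERE, GROUP BY, HAVING and ORDER BY, written in that order. PROC SQL also supports other statements such as CREATE, INSERT, UPDATE, DELETE, ALTER and DROP.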

Name statements that function at both compile and execution time.

OPTIONS, TITLE, FOOTNOTE

What is the APPEND procedure?

PROC APPEND adds the observations from one SAS data set (DATA=) to the end of another (BASE=). Because it does not process the observations that are already in the base data set, it is more efficient than concatenating with a SET statement.

What does ODS stand for?

ODS stands for the Output Delivery System.

Given an unsorted data set, how to read the last observation to a new data set?

We can read the last observation to a new data set by using the END= option in the SET statement, which creates a temporary variable that is set to 1 when the last observation is read.
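
A minimal sketch, assuming the SASHELP.CLASS sample data set is available. The names last_obs and is_last are arbitrary.

```sas
data last_obs;
  set sashelp.class end=is_last;   /* is_last = 1 only on the final observation */
  if is_last;                      /* subsetting IF keeps just that row */
run;
```

The END= variable is temporary, so it is not written to last_obs.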

Give some examples where PROC REPORT’s defaults are same as PROC PRINT’s defaults?

  • Variables/Columns in position order.
  • Rows ordered as they appear in data set.

What is the difference between using the DROP= data set option in the DATA statement and in the SET statement?

If you do not want to process certain variables and do not want them to appear in the new data set, specify the DROP= data set option in the SET statement.

If you want to process certain variables but do not want them to appear in the new data set, specify the DROP= data set option in the DATA statement.
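
The two placements can be compared side by side. This sketch assumes the SASHELP.CLASS sample data set (with height in inches and weight in pounds); the output names bmi_a and bmi_b are made up.

```sas
/* DROP= on SET: age is never read, so it cannot be used in the step */
data bmi_a;
  set sashelp.class(drop=age);
  bmi = 703 * weight / height**2;
run;

/* DROP= on DATA: height and weight are available for the calculation
   but are left out of the output data set */
data bmi_b(drop=height weight);
  set sashelp.class;
  bmi = 703 * weight / height**2;
run;
```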

How to sort in descending order?

Use the DESCENDING keyword before the variable name in the BY statement of PROC SORT.

Give some examples where PROC REPORT’s defaults are different than PROC PRINT’s defaults?

  • No Record Numbers in Proc Report.
  • Labels (not var names) used as headers in Proc Report.
  • REPORT needs NOWINDOWS option.

How would you define the end of a macro?

The end of a macro is defined by the %MEND statement.

What is PROC UNIVARIATE?

PROC UNIVARIATE produces detailed descriptive statistics for numeric variables, such as moments, quantiles and extreme values. It helps you examine how the data are distributed.

What is the use of the DIVIDE function?

The DIVIDE function returns the result of a division, and it handles special missing values for ODS output.

Difference between Missover and Truncover.

MISSOVER - When the MISSOVER option is used in the INFILE statement, the INPUT statement does not jump to the next line when reading a short line. Instead, MISSOVER sets the variables without values to missing.

TRUNCOVER - It assigns the raw data value to the variable even if the value is shorter than the length that is expected by the INPUT statement.

Briefly explain Input and Put function.

INPUT function - Character to numeric conversion: INPUT(source, informat).

PUT function - Numeric to character conversion: PUT(source, format).
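
Both conversions in one short step, with made-up variable names:

```sas
data _null_;
  char_amount = '1234.5';
  num_amount  = input(char_amount, 8.);   /* character -> numeric */

  num_id  = 42;
  char_id = put(num_id, z5.);             /* numeric -> character, zero padded */

  put num_amount= char_id=;               /* num_amount=1234.5 char_id=00042 */
run;
```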

Do you know the functions that are used for character handling?

SAS provides many character-handling functions. Two that change case are UPCASE and LOWCASE; others include TRIM, SUBSTR, SCAN and CATX.

What is the purpose of _error_?

_ERROR_ flags whether an error occurred during the current iteration of the DATA step. It has only 2 values, 1 for error and 0 for no error.

What are the default statistics for means procedure?

N (the count of nonmissing values), mean, standard deviation, minimum, and maximum.

What is DATA _NULL_?

DATA _NULL_ is mainly used to create macro variables. It can also be used to write output without creating a data set. The idea of "null" here is that the DATA step doesn't actually create a data set.

Explain the BOR function.

It is a bitwise logical function that returns the bitwise logical OR of two arguments.

Explain the COMPRESS= Data set option.

It specifies how observations are compressed in a new output SAS data set, which can reduce the storage that the data set needs.

What is RUN-Group processing?

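RUN-group processing lets you submit a group of statements within a PROC step by ending them with a RUN statement, without ending the procedure. The procedure stays active until a QUIT statement or the next step boundary, so you can keep submitting statements to it.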

What are the special Input Delimiters?

The special input delimiter options are DLM= (DELIMITER=) and DSD, which are specified in the INFILE statement.

Which command is used to save logs in the external file?

PROC PRINTTO, with its LOG= option, is used to save the log in an external file.

Is using ‘group’ the only way to define variables in a ‘PROC report’?

No. GROUP is only one of the usages that a DEFINE statement can assign to a variable; the others are DISPLAY, ORDER, ACROSS, ANALYSIS and COMPUTED.

What are the features of SAS?

  • Business Solutions: SAS provides business analysis that can be used as business products for various companies to use.
  • Analytics: SAS is the market leader in the analytics of various business products and services.
  • Data Access & Management: SAS can also be used as DBMS software.
  • Reporting & Graphics: SAS helps to visualize the analysis in the form of summaries, lists and graphic reports.
  • Visualization: We can visualize the reports in the form of graphs ranging from simple scatter plots and bar charts to complex multi-page classification panels.

Can you explain the CALL PRXFREE routine?

CALL PRXFREE frees the memory that was allocated for a Perl regular expression.

Describe CROSSLIST option in TABLES statement

Adding the CROSSLIST option to TABLES statement displays crosstabulation tables in ODS column format.

Difference between SET and MERGE.

SET concatenates the data sets, whereas MERGE matches the observations of the data sets.

Can PROC MEANS analyze ONLY the character variables?

No, Proc Means requires at least one numeric variable.

Difference between NODUP and NODUPKEY Options?

The NODUPKEY option removes observations whose BY-variable values duplicate those of an earlier observation, keeping the first observation of each BY group. The NODUP option removes an observation only when the values of all its variables match those of the preceding observation (identical observations).

What is the difference between Order and Group variable in proc report?

  • If the variable is used as a group variable, rows that have the same values are collapsed into one summary row.
  • Group variables produce a summary report, whereas order variables produce a list report.

How would you include common or reuse code to be processed along with your statements?

  • Using SAS Macros.
  • Using a %include statement

What is the maximum length of the macro variable?

The name of a macro variable can be up to 32 characters long. Its value can be much longer, up to 65,534 characters.

Which are the statements whose placement in the DATA step is critical?

DATA, INPUT, RUN, CARDS, INFILE, WHERE, LABEL, SELECT, INFORMAT, FORMAT

What is the difference between reading data from an external file and reading data from an existing data set?

The main difference is that when reading an existing data set with the SET statement, SAS retains the values of the variables from one observation to the next. When reading raw data from an external file with the INPUT statement, the variables are reset to missing at the start of each iteration of the DATA step.

Describe the VFORMATX function.

The VFORMATX function returns the format that is associated with the value of the specified argument, where the argument is an expression that evaluates to a variable name.

What is SAS?

  • SAS is a software suite for advanced analytics, multivariate analyses, business intelligence, data management and predictive analytics
  • It is developed by SAS Institute.
  • SAS provides a graphical point-and-click user interface for non-technical users and more advanced options through the SAS language.

For what purpose would you use the RETAIN statement?

A RETAIN statement tells SAS not to set variables to missing when going from the current iteration of the DATA step to the next. Instead, SAS retains the values.
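
A running total is the usual illustration. The names running, amount and running_total below are illustrative.

```sas
data running;
  input amount;
  retain running_total 0;                   /* initial value 0, kept across iterations */
  running_total = running_total + amount;   /* 10, 30, 60 */
  datalines;
10
20
30
;
run;
```

Without the RETAIN statement, running_total would be reset to missing at the start of every iteration and the result would be missing on every row.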

What are the default statistics that PROC MEANS produce?

PROC MEANS produces the “default” statistics of N, MIN, MAX, MEAN and STD DEV.

What does the CATX function do?

CATX removes leading and trailing blanks, inserts delimiters between the values, and returns a concatenated character string.
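
For instance, with two padded strings and a blank as the delimiter (variable names are made up):

```sas
data _null_;
  first = '  Ada ';
  last  = ' Lovelace  ';
  full  = catx(' ', first, last);   /* blanks stripped, one space inserted */
  put full=;                        /* full=Ada Lovelace */
run;
```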

What is the SCAN function in SAS and how is it used?

The SCAN function returns the nth word from a character string, where the words are separated by delimiters, and puts the value in the target variable. In older releases, a target variable that has not already been given a length is assigned a length of 200 characters.

How to debug SAS Macros?

There are some system options that can be used to debug SAS Macros: MPRINT, MLOGIC, SYMBOLGEN.

What is the ANYDIGIT function?

ANYDIGIT searches a character string for a digit and returns the first position at which a digit is found, or 0 if there is none.

What is the function of Stop statement in a SAS Program?

The STOP statement causes SAS to stop processing the current DATA step immediately and resume processing with the statements after the end of the current DATA step.

What are the differences between sum function and using “+” operator?

SUM function returns the sum of non-missing arguments whereas “+” operator returns a missing value if any of the arguments are missing.
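
The difference shows up as soon as one operand is missing:

```sas
data _null_;
  a = 10;
  b = .;
  total_fn = sum(a, b);   /* 10: the missing argument is ignored */
  total_op = a + b;       /* . : missing propagates through the operator */
  put total_fn= total_op=;
run;
```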

What is PROC SORT?

PROC SORT orders the observations of a SAS data set by the values of one or more BY variables. It either replaces the original data set or, with the OUT= option, writes a new sorted data set for further use.

How to create list output for crosstabulations in proc freq?

To generate list output for crosstabulations, add a slash (/) and the LIST option to the TABLES statement in your PROC FREQ step.

TABLES variable-1*variable-2 <* … variable-n> / LIST;

What is Debugging?

Debugging is a technique for testing the program logic, and this can be done with the help of Debugger.

What is the function of output statement in a SAS Program?

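In a DATA step, the OUTPUT statement writes the current observation to the output data set immediately instead of at the end of the step. In a procedure, you can use the OUTPUT statement to save summary statistics in a SAS data set. This information can then be used to create customized reports or to save historical information about a process.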

You can use options in the OUTPUT statement to

  • Specify the statistics to save in the output data set,
  • Specify the name of the output data set, and
  • Compute and save percentiles not automatically computed by the CAPABILITY procedure.

Where do you use PROC MEANS over PROC FREQ?

We will use PROC MEANS for numeric variables whereas we use PROC FREQ for categorical variables.

Define the TRANSLATE function in detail.

The TRANSLATE function replaces specific characters in a character string with other characters that you specify.

What does the trace option do?

ODS TRACE is used to find the names of the output objects that a procedure creates. It is turned on with ODS TRACE ON; and off with ODS TRACE OFF;

Difference between SCAN and SUBSTR.

SCAN extracts words within a value that are marked by delimiters. SUBSTR extracts a portion of the value by stating the specific location. It is best used when we know the exact position of the substring to extract from a character value.
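
A side-by-side sketch on the same string (the variable names are arbitrary):

```sas
data _null_;
  text = 'SAS-Base-Macro';
  second_word = scan(text, 2, '-');    /* by word position: Base */
  first_three = substr(text, 1, 3);    /* by character position: SAS */
  put second_word= first_three=;
run;
```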

What is the length assigned to the target variable by the scan function?

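200 characters in older releases, when the variable has not already been given a length. More recent releases document the default as the length of the first argument, so check the documentation for your version.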

What is interleaving in SAS?

Interleaving combines individual, sorted SAS data sets into one sorted SAS data set.

Differentiate ‘CEIL’ and ‘FLOOR’.

The CEIL function returns the smallest integer that is greater than or equal to its argument, while FLOOR does the opposite and returns the largest integer that is less than or equal to its argument.
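
Negative arguments make the direction of rounding clear:

```sas
data _null_;
  up       = ceil(2.1);     /*  3 */
  down     = floor(2.9);    /*  2 */
  neg_up   = ceil(-2.1);    /* -2 */
  neg_down = floor(-2.9);   /* -3 */
  put up= down= neg_up= neg_down=;
run;
```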

Which statements take effect at execution time only?

INFILE, INPUT, OUTPUT, CALL routines

Name statements that are recognized at compile time only.

drop, keep, rename, label, format, informat, attrib, where, by, retain, length, array.

Name a few SAS functions.

SCAN, SUBSTR, TRIM, CATX, INDEX, TRANWRD, FIND, SUM.

What are _numeric_ and _character_ and what do they do?

  1. _NUMERIC_ specifies all numeric variables that are already defined in the current DATA step.
  2. _CHARACTER_ specifies all character variables that are currently defined in the current DATA step.

What is the BMDP procedure?

PROC BMDP calls a BMDP program (BMDP is a separate statistical software package) to analyze the data in a SAS data set.

What is the difference between SAS functions and procedures?

Functions expect argument values to be supplied across an observation in a SAS data set whereas a procedure expects one variable value per observation.