Teradata interview questions and answers 👇

  1. Teradata Interview Questions


Teradata Interview Questions

What is PDE in Teradata?

View answer

Parallel Database Extensions (PDE) is a software interface layer that lies between the operating system and Teradata Database. PDE supports the parallelism across system nodes that contributes to Teradata Database speed and linear scalability.

What is a Parsing engine?

View answer

The Parser does the following: Interprets the SQL statement received from the application. Verifies SQL requests for the proper syntax and evaluates them semantically. Consults the Data Dictionary to ensure that all objects exist and that the user has authority to access them.

What is Transient Journaling in Teradata?

View answer

In Teradata database, the transient journal (TJ) is a system-maintained log that provides a way to protect transactions from various system failures. Each transaction processed by the database records a before image of rows that are touched by the transaction.

What are Database Privileges in Teradata?

View answer

A database privilege is permission to access or to manipulate a database object or data. Specific privileges are required for nearly everything that can be done in the Teradata Database. Teradata Database privileges are used by administrators to control access to database objects and data and to control the types of actions and activities available to users.

The privileges are used to control which users can:

  • Access, create, modify, or delete specific database objects and data
  • Execute specific macros, stored procedures, and UDFs
  • Monitor system-wide activity
  • Grant privileges to other users

What are Volatile Tables in Teradata?

View answer

Volatile tables are created, used and dropped within a user session. Their definition is not stored in data dictionary. They hold intermediate data of the query which is frequently used.

Explain Permanent Journaling.

View answer

The permanent journal is a user‑specified table that can be used to capture both before images and after images of Teradata Database transactions. Journal entries in this table can be used by the Archive/Recovery utility to roll forward or roll back transactions during a recovery operation.

Explain block compression.

View answer

During the compression process, Teradata Database reforms data blocks into their maximum size. With larger block sizes defined, Teradata Database can achieve a greater degree of block-level compression. For performance purposes, the size to make data blocks depends on the application (as it does without BLC).

Explain Surrogate Keys in Teradata?

View answer

A surrogate key in Teradata is used to map the natural keys of source systems to a unique key, usually an integer value. Usually, one or more natural key columns are mapped to a surrogate key worthy of an INTEGER. Often a consecutive number is generated.

Explain Referential Constraints.

View answer

Referential integrity ensures that a value created in a foreign key (FK) column of a referencing (child) table also exists as a primary key (PK) in the referenced (parent) table. In Teradata, the parent table's column does not have to be specified explicitly as PK, but an FK can reference each unique NOT NULL column.

What are Transition Tables in Teradata?

View answer

A transition table is a dynamically created table that you can reference using a correlation name.

What are the main components of Teradata Architecture?

View answer

The major components of Teradata are Parsing Engine, BYNET and Access Module Processors (AMPs). The following diagram shows the high level architecture of a Teradata Node.

What are Locks in Teradata?

View answer

In Teradata locking prevents multiple users from trying to access the same data simultaneously.

What is Fallback in Teradata?

View answer

Fallback is a feature of the Teradata Database, Fallback protects data in case of an AMP vproc failure. This protection can be done at the database or table level.

What is a dispatcher?

View answer

The Dispatcher exercises execution and response control for requests and dynamic plan fragments, and manages transaction and request aborts and the queue table cache.

The Dispatcher has the following primary functions:

  • Apply TASM rules and classify into workload.
  • Route AMP steps to the appropriate AMPs.
  • Return result or statistics feedback to the optimizer so it can generate the next plan fragment when executing a dynamic plan.
  • Return results of a request to the user application that submitted that request.
  • Notify client applications and Teradata platform processors of aborted transactions and requests.

What is BYNET?

View answer

The BYNET interconnect was designed to address the YNET's weaknesses, especially that of scalability limitation. The BYNET handles the inter-vproc messaging via shared memory.

How would you find duplicates in a Table?

View answer

GROUP BY clause can be used to find the duplicates in a table by associating the columns. If you want to find duplicates over whole table, all the columns must be supplied in GROUP BY clause.

Explain the purpose of FALLBACK.

View answer

If you have specified FALLBACK in creating the table, either explicitly or by default, the system automatically maintains a duplicate copy of the data in the table. This fallback copy is then used if the primary copy becomes unavailable.

Fallback is very important when a system needs to reconstruct data from fallback copies when a single‑bit read error occurs when it attempts to read the primary copy of the data. When a hardware read error occurs in this case, the file system reads the fallback copy of the rows and reconstructs a memory‑resident image of them on their home AMP.

What is RAID in context of Teradata?

View answer

RAID is a technology that uses multiple physical disk drives to protect data from a single disk failure. The purpose of RAID is to ensure that at the time of failure, there should be one copy of data which should be available for immediate use. In some cases the Teradata systems are pre-configured with RAID 1.

What is Partitioned Primary Index (PPI) and what are the advantages of using it in a query?

View answer

Partitioned Primary Index (PPI) is an indexing mechanism that improves the performance of certain queries. Within a table, with Partitioned Primary Index (PPI) defined, rows are sorted according to their partition number.

Within each partition, records are arranged by their row hash. Also, rows are assigned to a partition based on the definition of a partition.

Advantages of Partitioned Primary Index (PPI):

  • For several queries, PPI helps to avoid a full table scan and only required partitions are accessed.
  • Prevents using the secondary index and it helps to avoid additional input/output maintenance.
  • Quick access to the subset of a large table.
  • Easy to drop old data and add new data.

What is the use of the UPSERT command?

View answer

UPSERT command is used to perform the tasks Update Else Insert nature, and it is one of the features only available in Teradata.