- Why use SSIS when there is DTS?
- What is ‘data transformation’?
- What is a ‘task’?
- What are the important components of SSIS package?
- Explain solution Explorer in SSIS.
- What is the control flow?
- What is a data flow?
- Define a "task" in SSIS.
- What is an SSIS package?
- Name different types of connection or files that support SSIS.
- What is a container? How many types of containers are there in SSIS?
- What are the different types of containers used in SSIS?
- What is Precedence Constraint in SSIS?
- What are variables in SSIS, and what types of variables are there?
- Explain what is a checkpoint in SSIS?
- Explain connection managers in SSIS.
- What is SSIS breakpoint?
- Explain event logging in SSIS.
- What is logging mode property?
- Explain the term data flow buffer?
- For which containers is checkpoint data not saved?
- What is the Conditional Split transformation in SSIS?
- Name different types of Data viewers in SSIS.
- Explain the possible locations to save the SSIS package.
- What will be your first approach if a package runs fine in Business Intelligence Development Studio (BIDS) but fails when run from a SQL Agent job?
- What is the role of the Event Handlers tab in SSIS?
- How can you notify staff members about a package failure?
- How would you do logging in SSIS?
- How would you deploy an SSIS package on production?
- How to handle Early Arriving Facts or Late Arriving Dimension?
- Explain the method to perform incremental load?
- Name three data flow components in SSIS.
- Why are checkpoints used in SSIS?
- Explain the event logging mode property.
- Explain the different options for dynamic configuration in SSIS.
- Explain Data conversion Transformation.
- Explain a few features of SSIS.
- Explain two disadvantages of SSIS.
- What is the use of Execute SQL task in SSIS?
- What is an SSIS Catalog?
- How would you stop a package that is running forever?
- Explain project and package control flow in SSIS.
- Explain use of XML Task.
- What is the use of a sequence container?
- What are important best practices for using SSIS?
- What is the use of control flow tab in SSIS?
- How to create the deployment utility?
- What is the Manifest file in SSIS?
- What is File system deployment?
- Difference between Merge and Union All?
- What is the OLE DB Command Transform?
- Difference between Execute TSQL Task and Execute SQL Task
- A package runs without a hitch in BIDS (Business Intelligence Development Studio), but doesn’t run with SQL Agent. What is the most likely reason for this?
- What are the types of Lookup Cache Modes present in SSIS?
- How does an error occur in SSIS, and what are the most critical errors in SSIS?
DTS (Data Transformation Services) is the predecessor of SSIS. SSIS was overhauled to be faster, more flexible, and better optimized. In short, SSIS is the newer, more advanced successor to DTS.
Data transformation is the process of extracting specific data from its source, reshaping it as needed, and transferring it to the destination of your choosing (more often than not, the end file).
A task is a unit of work that you issue, for example against a database, to get a desired result. SSIS has two types of tasks: control flow tasks and database maintenance tasks.
The important components of an SSIS package are:
- Data flow
- Control flow
- Package Explorer
- Event handler
Solution Explorer in SSIS Designer is a screen where you can view and access all the data sources, data source views, projects, and other miscellaneous files.
The control flow is part of the package. It contains tasks (create backups, execute scripts, execute SQL statements, connect to FTP, etc.), containers (Sequence, Foreach Loop, and For Loop), and precedence constraints that join them into a flow.
The data flow lets you move data from different sources to different destinations and transform it along the way if necessary. The Data Flow task sits in the control flow; when you double-click it, you get a new design surface with components to import, transform, and export data.
A task in SSIS is much like a method in a programming language: it represents or carries out an individual unit of work. Tasks fall into two categories:
- Control Flow Tasks
- Database Maintenance Tasks
A package in SSIS is an organized collection of connections, control flow elements, data flow elements, event handlers, parameters, variables, and configurations. You assemble a package either programmatically or with the graphical design tools that SSIS provides.
Different types of connections that work within SSIS include:
- .NET SqlClient
- Flat File
A container is a set of logically linked tasks. Containers are essential in SSIS because they let us manage the scope of those tasks together.
There are mainly four types of containers used in SSIS:
- Task Host Container
- Sequence Container
- Foreach Loop Container
- For Loop Container
Three of these containers are used to group or repeat tasks:
Sequence Container: This container is used to put similar tasks in the same group. It is an organizational container used mainly in more complex packages.
For Loop Container: This container is used to execute a task a specific number of times. It lets you run the same tasks several times instead of creating multiple packages or executing the entire package multiple times. For example, assume we want a task to update records 10 times. We can put the task inside a For Loop container and set 10 as the loop's end value. The same task then executes 10 times within the same package.
Foreach Loop Container: This container is used when we want to execute a task multiple times but do not know in advance how many times. The task runs once for each item in a collection. For example, suppose we want to delete all the files inside a folder and we do not know how many files it holds. We can use a Foreach Loop container that goes through the collection of files one by one and deletes each of them until the collection is exhausted.
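The difference between the two loop containers can be sketched outside SSIS. The following is a rough Python analogy, not SSIS code: `run_fixed_times` and `delete_all_files` are hypothetical helper names chosen for illustration.

```python
import os


def run_fixed_times(task, count):
    """For Loop analogy: the number of iterations is known up front."""
    results = []
    for i in range(count):
        results.append(task(i))
    return results


def delete_all_files(folder):
    """Foreach Loop analogy: iterate over a collection of unknown size."""
    deleted = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            os.remove(path)
            deleted.append(name)
    return deleted
```

In the first function the caller supplies the end value, as with the For Loop container. In the second, the folder contents decide how many iterations happen, as with the Foreach Loop container.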
A Precedence Constraint in SSIS lets you define the logical sequence in which tasks should be executed. You connect tasks to one another using these connectors.
A variable in SSIS is used to store values. There are two types of variables in SSIS: system variables and user variables.
A checkpoint in SSIS allows a package to restart from the point of failure. The checkpoint file stores information about the package execution. If the package runs successfully, the checkpoint file is deleted; otherwise, the next run restarts from the point of failure.
Connection managers are helpful when gathering data from different sources and writing it to a destination. A connection manager holds the information needed to connect to a system, such as the data provider, server name, authentication mechanism, and database name.
A breakpoint lets you pause the execution of a package in Business Intelligence Development Studio while troubleshooting or developing an SSIS package.
In SSIS, event logging lets you select specific events of a task or a package to be logged. It is useful when you are troubleshooting a package or trying to understand its performance.
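The restart behaviour can be illustrated with a small sketch. This is a simplified Python analogy of the idea, not how SSIS implements checkpoint files: the JSON file layout and the name `run_with_checkpoint` are assumptions made for illustration.

```python
import json
import os


def run_with_checkpoint(tasks, checkpoint_path):
    """Run (name, callable) tasks in order, skipping ones already completed.

    Completed task names are written to a JSON file after each task.
    The file is deleted after a fully successful run, so the next run
    starts again from the first task.
    """
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)

    executed = []
    for name, task in tasks:
        if name in done:
            continue  # finished during an earlier, failed run
        task()  # an exception here leaves the checkpoint file in place
        done.append(name)
        executed.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)

    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return executed
```

If the second of two tasks fails, the file records only the first task, and the next call runs just the second one.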
SSIS packages and all the associated tasks have a property called LoggingMode. This property accepts three possible values.
- Disabled: disables logging for the component.
- Enabled: enables logging for the component.
- UseParentSetting: applies the parent's logging setting to the component.
SSIS operates using buffers; a buffer is a kind of in-memory virtual table that holds data.
Checkpoint data is not saved for For Each Loop and For Loop containers.
The Conditional Split transformation in SSIS works like an IF statement: it evaluates the given conditions for each row and routes the row to an output based on the result.
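As a rough illustration of that routing, here is a minimal Python sketch. It assumes conditions are checked in order, each row goes to the first output whose condition is true, and unmatched rows go to a default output; `conditional_split` and the output names are hypothetical.

```python
def conditional_split(rows, conditions, default_output="default"):
    """Route each row to the first output whose condition is true."""
    outputs = {name: [] for name, _ in conditions}
    outputs[default_output] = []
    for row in rows:
        for name, predicate in conditions:
            if predicate(row):
                outputs[name].append(row)
                break
        else:
            # No condition matched, so the row goes to the default output.
            outputs[default_output].append(row)
    return outputs
```

For example, passing `[("large", lambda r: r["amount"] >= 1000), ("small", lambda r: r["amount"] > 0)]` splits order rows into large, small, and everything else.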
Different types of data viewers in SSIS include
- Scatter Plot
- Column Chart
You can save an SSIS package to:
- SQL Server
- Package Store
- File System
On the event handlers tab, workflows can be configured to respond to package events. For instance, you can configure workflow when any task stops, fails or starts.
You can either add a Send Mail Task to the event handlers inside the package, or set up a notification in SQL Agent for when the package runs.
Logging in SSIS is done by logging events such as OnError and OnWarning to one of several destinations, such as a flat file, an XML file, or a SQL Server table.
To deploy an SSIS package, we execute the manifest file and decide whether to deploy to the File System or to SQL Server. Alternatively, you can import the package from SSMS, from either SQL Server or the File System.
Late Arriving Dimensions are unavoidable. To handle them, we can create a dummy dimension row with the natural/business key and keep the rest of the attributes as null or default. When the actual dimension data arrives, the dummy row is updated with a Type 1 change. This is also referred to as an Inferred Dimension.
A common and fast way to do an incremental load is to use a timestamp column in the source table and store the timestamp of the last ETL run, so that each run picks up only the rows changed since then.
Three data flow components in SSIS are sources, transformations, and destinations.
Checkpoints are used in SSIS to allow a package to restart at the point of failure.
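The inferred-row approach can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the `dim_customer` table, its columns, and the helper names are hypothetical, and a real warehouse would do this inside the ETL tool or in set-based SQL.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE dim_customer ("
        " customer_key INTEGER PRIMARY KEY,"
        " business_key TEXT UNIQUE,"
        " name TEXT,"
        " is_inferred INTEGER)"
    )


def get_or_create_customer_key(conn, business_key):
    """Called while loading facts: insert a dummy row if the key is unknown."""
    row = conn.execute(
        "SELECT customer_key FROM dim_customer WHERE business_key = ?",
        (business_key,),
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_customer (business_key, name, is_inferred)"
        " VALUES (?, NULL, 1)",
        (business_key,),
    )
    return cur.lastrowid


def apply_dimension_row(conn, business_key, name):
    """Called when the real dimension data arrives: Type 1 overwrite."""
    cur = conn.execute(
        "UPDATE dim_customer SET name = ?, is_inferred = 0"
        " WHERE business_key = ?",
        (name, business_key),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO dim_customer (business_key, name, is_inferred)"
            " VALUES (?, ?, 0)",
            (business_key, name),
        )
```

The fact load can proceed with the surrogate key of the dummy row, and the later overwrite keeps that key stable.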
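A timestamp-based incremental load might look like the following Python sketch using `sqlite3`. The table names (`orders`, `etl_watermark`), the column names, and the function name are assumptions for illustration; timestamps are assumed to be ISO-8601 strings so that string comparison matches chronological order.

```python
import sqlite3


def incremental_load(source, target):
    """Copy rows modified since the stored watermark, then advance it.

    Assumes both databases have orders(id PRIMARY KEY, amount, modified_at)
    and the target has a single-row table etl_watermark(last_loaded).
    """
    last_loaded = target.execute(
        "SELECT last_loaded FROM etl_watermark"
    ).fetchone()[0]

    rows = source.execute(
        "SELECT id, amount, modified_at FROM orders"
        " WHERE modified_at > ? ORDER BY modified_at",
        (last_loaded,),
    ).fetchall()

    for row in rows:
        # Insert new rows and overwrite rows that changed at the source.
        target.execute(
            "INSERT OR REPLACE INTO orders (id, amount, modified_at)"
            " VALUES (?, ?, ?)",
            row,
        )

    if rows:
        # The newest timestamp seen becomes the next run's starting point.
        target.execute(
            "UPDATE etl_watermark SET last_loaded = ?", (rows[-1][2],)
        )
    target.commit()
    return len(rows)
```

Running the function twice in a row without source changes loads nothing the second time, because the watermark has already moved past every existing row.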
The three values accepted by the event logging mode property are:
- Enabled: enables logging for the component.
- Disabled: disables logging for the component.
- UseParentSetting: applies the parent's logging setting to the component.
The different options for dynamic configuration are:
- XML file
- Customer variables.
- Database per environment with the variables.
- A centralized database with all variables.
The Data Conversion transformation converts column data from one type to another. However, you need to make sure that the data in the column is compatible with the target type.
Some important features of SSIS are:
- Studio Environments.
- Relevant data analytics and integration functions.
- Tight integration with other Microsoft SQL family.
- Data Mining Query Transformation.
Disadvantages of SSIS include:
- SSIS sometimes creates issues in non-Windows environments.
- Unclear vision and strategy.
- SSIS doesn't provide support for alternative data integration styles.
Execute SQL helps you to execute a SQL statement against a relational database.
The SSIS catalog is a database to store all the deployed packages. It is widely used for security reasons to store and handle the deployed packages.
It depends. If you are running the package in SQL Agent, you can kill the process using T-SQL. However, if the package is running in the SSIS catalog, you can stop it using the Active Operations window or the stop operation stored procedure (catalog.stop_operation).
In SSIS, a project is a container for developing packages, while a package is an object that implements ETL.
The XML task allows you to merge, split, or reformat any XML file.
A Sequence container helps you organize subsidiary tasks by dividing them into groups. It also lets you apply a transaction or assign logging to the container as a whole.
The best practices for using SSIS are:
- You should avoid performing logged operations
- You should make a clear plan for resource utilization.
- Optimize the data source, lookup transformation, and destination
The Control Flow tab in SSIS includes the Data Flow task, containers, and precedence constraints, which connect containers and tasks.
Deployment is the process in which packages are converted from development mode into executable mode. To deploy an SSIS package, you can right-click the Integration Services project and build it.
This saves the package.dtsx file in the project's bin folder. You can also create a deployment utility, with which the package can be deployed either to SQL Server or as a file in any location.
For creating deployment utility, follow these steps:
- Right-click on the project and click on properties.
- Select “True” for the CreateDeploymentUtility option. You can also set the deployment path.
- Now close the window after making the changes and build the project by right-clicking on the project.
- A deployment folder will be created in the BIN folder of your main project location.
- Inside the deployment folder, you will find a .manifest file. Double-clicking it gives you options to deploy the package to SQL Server.
- Log in to SQL Server and check MSDB under Integration Services.
The manifest file is the utility that can be used to deploy the package, through a wizard, to the file system or a SQL Server database.
File system deployment means saving package files on a local or network drive. You can then use a SQL Agent job to schedule when the packages run.
The Merge transformation merges data from two paths into a single output. It is useful when you want to break out your Data Flow into a path that handles certain errors and then merge it back into the main Data Flow downstream after the errors have been handled. It is also useful if you want to merge data from two data sources.
Note that the data must be sorted before using the Merge transformation. You can do this by using the Sort transformation prior to the merge or by specifying an ORDER BY clause in the source query. Also, the metadata must be the same for both paths. For example, the CustomerID column cannot be a numeric column in one path and a character column in the other path.
The Union All Transformation works much the same way as the Merge Transformation, but it does not require the data to be sorted. It takes the outputs from multiple sources or transforms and combines them into a single result set.
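The contrast between the two can be sketched in Python with standard-library tools. This is an analogy rather than SSIS code: `merge_sorted` and `union_all` are hypothetical names, and the rows are plain dictionaries.

```python
import heapq
from itertools import chain


def merge_sorted(left, right, key):
    """Merge analogy: both inputs must already be sorted by `key`.

    The output stays sorted without re-sorting the combined data.
    """
    return list(heapq.merge(left, right, key=key))


def union_all(*inputs):
    """Union All analogy: concatenate any number of inputs, no sort needed."""
    return list(chain(*inputs))
```

With two inputs sorted by `CustomerID`, `merge_sorted` interleaves them in key order, while `union_all` simply appends one input after the other.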
The OLE DB Command Transform is a component designed to execute a SQL Statement for each row in an input stream. This task is analogous to an ADO Command Object being created, prepared, and executed for each row of a result set. The input stream provides that data for parameters that can be set into the SQL Statement that is either an Inline statement or a stored procedure call.
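The row-by-row pattern can be illustrated with Python's `sqlite3` module. This is a sketch of the idea only: the `customer` table, its columns, and `apply_per_row` are hypothetical, and the parameter markers stand in for the values mapped from input columns.

```python
import sqlite3


def apply_per_row(conn, rows):
    """Execute one parameterized statement for every incoming row."""
    statement = "UPDATE customer SET status = ? WHERE id = ?"
    updated = 0
    for row in rows:
        # Each input row supplies the parameter values for one execution.
        cur = conn.execute(statement, (row["status"], row["id"]))
        updated += cur.rowcount
    conn.commit()
    return updated
```

Because the statement runs once per row, the cost grows with the number of rows in the input stream.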
SSIS has an Execute T-SQL task that is similar to the Execute SQL task. The differences between the two are as follows.
Execute T-SQL Task:
- Pros: Takes less memory, faster performance.
- Cons: Output into variables is not supported; only supports ADO.NET connections.
Execute SQL Task:
- Pros: Supports output into variables and multiple types of connection; parameterized queries are possible.
- Cons: Takes more memory, slower performance compared to the T-SQL task.
For this specific question, the most probable cause is that the account the job runs under does not have the permissions it needs when run by SQL Agent. A simple solution is to grant the required permissions or to create a proxy account.
There are mainly three different types of Lookup Cache Modes present in SSIS Lookup Transformation:
Full Cache Mode: This type of cache mode helps SSIS query the database before the beginning of the data-flow task execution. This mode is a critical part of the pre-execute phase. Besides, SSIS copies all the data from the reference table (or lookup table) into the SSIS lookup cache during full cache mode.
Partial Cache Mode: This cache mode helps SSIS to query the database against new rows from different sources. In this mode, the row is cached into the SSIS lookup cache only in the case when there is a subsequent match. Once the cache gets full, SSIS automatically starts removing existing rows based on the match and usage stats. After that, new matching rows are loaded into the lookup cache.
No Cache Mode: As the name suggests, SSIS doesn't cache any rows in this cache mode unless two consecutive rows have the same lookup values. In 'No Cache Mode', the database is queried to get the matching data/value from the reference table for each row coming through the source.
In most cases, errors occur during transformation because of unexpected input data values. Errors can arise in several different scenarios, for example while applying a transformation to column data, loading data into destinations, or extracting data from sources.
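The trade-off between the three modes is mainly how often the reference table is queried. The following Python sketch models that with a query counter. It is a simplified analogy: `ReferenceTable` and the three functions are hypothetical, the partial cache uses least-recently-used eviction as an assumption, and the no-cache version ignores the consecutive-row case.

```python
from collections import OrderedDict


class ReferenceTable:
    """Stand-in for the lookup's reference table; counts queries made."""

    def __init__(self, data):
        self.data = dict(data)
        self.queries = 0

    def fetch_all(self):
        self.queries += 1
        return dict(self.data)

    def fetch_one(self, key):
        self.queries += 1
        return self.data.get(key)


def lookup_full_cache(keys, ref):
    cache = ref.fetch_all()  # one query before any row is processed
    return [cache.get(k) for k in keys]


def lookup_partial_cache(keys, ref, max_size):
    cache = OrderedDict()
    results = []
    for k in keys:
        if k in cache:
            cache.move_to_end(k)  # mark as recently used
            value = cache[k]
        else:
            value = ref.fetch_one(k)
            if value is not None:  # only matching rows are cached
                cache[k] = value
                if len(cache) > max_size:
                    cache.popitem(last=False)  # evict least recently used
        results.append(value)
    return results


def lookup_no_cache(keys, ref):
    return [ref.fetch_one(k) for k in keys]  # one query per row
```

All three return the same lookup results; they differ in `ref.queries` and in how much memory the cache holds.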
The most critical errors commonly found in SSIS are:
- Data Connection Errors: This type of error is commonly seen when the connection manager cannot be initialized with a connection string. This can be seen in both the data-source and the data-destination, along with the control flow that uses the connection strings.
- Data Transformation Errors: This type of error is observed while converting the data from the data source to a destination (in the data pipeline).
- Expression Evaluation Errors: This type of error usually appears when an expression evaluated at run time produces an invalid value.