24 Apr 2013

Teradata Mode Transactions ?


As mentioned earlier, Teradata mode considers every SQL statement as a stand-alone transaction. This means that if the outcome of the statement is successful, the work is committed to the database. This is particularly important when data is being written onto disk instead of simply read. When multiple tables are being updated, multiple SQL commands must be used. A single transaction can be established using a couple of different techniques in Teradata. The easiest and surest technique is to put all the SQL statements into a macro. This works well in BTEQ, Queryman, and all client software applications. Therefore, it is the best technique.
A second reliable technique is available only when using BTEQ. It involves taking advantage of the way BTEQ delivers the SQL to the optimizer. The trick is to continue each subsequent SQL statement on the same line as the semi-colon (;) of the previous statement. When BTEQ finds this condition, it automatically delivers the commands as a single transaction.
The following demonstrates this technique:
image\img00581.gif
In the above script, both updates must work successfully, or both will be rolled back because the second UPDATE is on the same line as the semi-colon for the first UPDATE. When a semicolon (;) is not the last thing on a line, BTEQ treats the next SQL as part of the same transaction.
The last technique uses the Teradata BEGIN TRANSACTION (BT) and END TRANSACTION (ET) commands to delineate the transaction in Teradata mode. Although these work in some of the other client tools, they should primarily be used in BTEQ.
We say this because BTEQ has the ability to execute in batch (background without user interaction) and to check the outcome of the previous SQL statement using a .if command for verification that it was successful. It also provides hooks to terminate the execution of the script or branch around subsequent SQL statements that should not be executed when a failure occurs. This control is important to guarantee the all or nothing philosophy of a transaction.
The next transaction operates exactly the same as the previous two techniques when using BTEQ:
image\img00582.gif
Or
image\img00583.gif
NOTE: BT and ET are the abbreviations for BEGIN TRANSACTION and END TRANSACTION to establish an explicit transaction. In the first script, the if statement checks for an good completion as 0 and uses the .quit 12 as the error return code from the script. The second example uses a “go to” command to branch to the end script. The concept of a script implies a batch (without interactivity of a user) operation and therefore it is important to use the .if to have BTEQ check for a failure.
It is important because if a failure occurs in the first UPDATE, it causes the transaction to abort and the transaction automatically ends. Since there is no longer a transaction in process and the flow is from top to bottom, the second UPDATE is executed as a new and single “implied” transaction in Teradata mode. Then, a warning is issued when the ET is executed because there is no transaction in process, due to the ABORT. See Aborting Teradata Transactions later in this chapter. Therefore, the check is important to prevent the execution of the second UPDATE statement. If this same SQL were executed interactively, the user would never enter the second UPDATE. However, in batch scripts all commands are performed sequentially (top to bottom) from a file stored on disk instead of being typed ad hoc by a person. As seen above, BTEQ can run in either batch or interactive mode, but since Queryman is interactive only, the user is notified immediately of a failure with an error code. Once the error is evaluated, the user takes the appropriate action. However, when doing “batch” or off-line processing, there is no user to take an action. Therefore, it is important to provide the appropriate checks and balances in a script.

Setting the Transaction Mode ?


Compatibility: Teradata Extension
As mentioned above, the Teradata default Transaction mode is set at the system level. A Teradata system can default to either Teradata or ANSI mode. The system level setting is established in the DBS Control Record.
When using BTEQ, it is possible to over-ride the transaction mode at the session level. Since the session is established at logon time, it is necessary to set the mode prior to issuing a logon connection request. Remember, the transaction mode impacts the way SQL will execute, so the mode must be established at the Parsing Engine (PE) to affect the session.
In BTEQ, either of the following commands can be used to change to ANSI or Teradata (BTET) mode:
image\img00579.gif
or
image\img00580.gif
NOTE: the dot (.) is necessary because it is a BTEQ command and not SQL.
Although the ANSI specification is obvious, the BTET is not intuitive. In the chart above, it is indicated that BEGIN TRANSACTION (BT) and END TRANSACTION (ET) commands can be used to delineate an explicit transaction. The BTET simply comes from a combination of these two transactional commands to indicate Teradata transaction mode. An explanation of implicit and explicit transactions is addressed in this chapter.
The transaction mode only needs to be specified if the SQL output requires different characteristics than the mode that is established as the default for the system. If the default is acceptable, there is no need to change it.

what is the size of BYTEINT,SMALLINT,INTEGER?


BYTEINT - 1 Bytes - 8 bits -- > -128 to 127
SMALLINT - 2 bytes - 16 bites: ---> -32768 to 32767

INTEGER - 4 BYTES - 32 BITS ---> -2,147,483,648 to 2,147,483,647

What is Advantage of PPI?


Mainly we use for Range based data storing or category based Data storing.
Range queries don't have to utilize a Full Table Scan.Its directly goes to the corresponding partition and skips other partitions.
Fastload and Multiload work with PPI tables, but not with all Secondary Indexes.

what about UPSERT command in teradata?


UPSERT means update else insert.
In Teradata we have this option.

How many AMPs involved in case of Primary Index?

Its always one AMP.

At what level teradata can apply a LOCK?


1.Database level lock --- All objects inside database will be locked.
2.Table level -- All rows inside a table will be locked.
3,Row hash level lock-- Only Corresponding row will be locked.

List different types of LOCKS in teradata?


Teradata can apply four types of LOCKS
a.Access Lock
b.Read lock
c.Write Lock
d.Exclusive Lock

what is Clique?


It prevents from Node Failure.
a.A Clique is a collection of Nodes will Share Common Disk drives.
b.whenever any node went down automatically Vprocs all migrate from fail node to other node to retrieve data from common disk drives.

What is BYNET?


It acts like a "Message Communication" happens between Components.It is responsible for
1. Sending messages
2. Merging data
3.Sorting data

How many MAX session PE handles at a time?

PE handles MAX 120 sessions.

What is Parsing Engine?


PE is type of Vproc it takes SQL request and delivers SQL response.
It has software components to break SQL into steps, and send the steps to the AMPs.

Session Control
A session is nothing but logical connection between User and Application.Here it controls Authorization if its a Valid it does log on otherwise log off.
Parser
checks syntactical errors
checks semantical errors
checks existence of objects.
Dispatcher
It takes set of request and keep in a queue ,delivers set of responses by keeping the same queue that means it does request response flow control.

what is Amp?


Access Module Processor
It is a Virtual processor responsible for managing one portion of the database(collection of virtual
disks).This portion is not sharable by any other AMP.so well call this Architecture as shared nothing Architecture.
Amp contains Database Manager subsystem and it performs the below operations
a.Performing DDL
b.Performing DML
c.Implementing Joins,Aggregations.
d.Applying and Releasing Locks etc.

What is Virtual disk?

A collection of Cylinders(physical disks) arranged in an Array fashion is called Vdisk or Virtual disk.Traditionally this is called as disk Array or Array of disk.

What is Teradata Gateway?

Terdata Gateway software gives communication between the application and the PEs assigned to network-attached clients. There is one Gateway per node.

What is channel driver?

A Software that communicates between the PEs and applications running on channel-attached clients.

What is Trusted parallel database (TPD)?


A database is called TPD if it runs under PDE.
Teradata is a database which runs under PDE.so we call Teradata as pure Parallel database or Trusted parallel database.

What is PDE?


Parallel Data Extension
A software interface layer on top of the operating system that enables the database to operate in a parallel environment. It was created by NCR to support the parallel environment.

what is Node?

A Node is nothing but Collection of Hardware and Software components.Typically a Server is called as node.

How to see current teradata version?

.SHOW VERSION

There is file it contains 100 records,need to load 60 records by skipping first 20 records and last 20 records.


use BTEQ Utility to do this task by Skip = 20 and Repeat 60 in script.

While Executing MLOAD Teradata Server Restarted?

Along with the server MLAOD script will also restarted and Runs from last check point.

While Executing MLOAD Client System Restarted?

We need to Manualy Submit the script,so that it loads data from last checkpoint.

Why Multiload does not supports USI and supports NUSI?


Index subtable row is on the same Amp as the data row in NUSI.so it operates each Amp independently and in parallel.

Why Multiload and Fastload does not supports SI,JI,RI and TRIGGERS?


Above all concepts require communication between Multiple Amps. As per Fastload and Multiload doesnot provide any communication b/w multiple Amps and they Operate Independently.As concept is violating as well as it takes time to implement the above operation they are not allowing them.

How do you load Multiple files to a table by using fast load scripts?

Remove End Loading statement in the script and Replace the file one by one in the script till last file and submit every time so that data appended in Amp Level.For the last file specify End Loading statement in the script and Run.so that it runs from Amp to table.

How do you Generate Sequence in Teradata?


By Using Identity Column
1-for storing purpose using identity.
2-for display purpose using csum.

How do you Generate sequence at the time of Display?

By Using CSUM 

Aborted in Phase 2 data acquisition completed in fastload?

Simply take Begin and End loading in scripts and Run again.so that it runs from Amp to table.

Major advantages of Other ETL tools than TD Utilities:


1.Heterogeneous Sources and Destinations we can Operate. (Oracle,Sqlserver,Excel,Flat Files etc....)
2.As they are providing full GUI support,degugging is so easier.
3.Reusability of components(Informatica:mapplets,worklets etc...) available in ETL tools so if we change in main location automatically other applications(mappings) which are using these components can update instantly. 4.Pivoting(Normalization),Unpivot(Denormalization) we can implement very easily in ETL tools.
5.Caching is one more advantage when we work with a source (Heterogenous) which is not changing frequently.Some times cache can be shared across applications etc....

Fast Load Script is failed and error tables are available then how do you restart?


2.Incase of new file to Run
Drop Error tables
Try to run the script with only begin and end loading statements. ,so that it releases the lock on target table If possible remove the record from fastlog table.
Run the script with new file freshly.

There is a load to the Table every one hour and 24/7.Morning trafic is high, afternoon trafiic is less, Night trafiic is high.According to this situation which Utility you use and how do you load,which utility used?


Tpump is suggestable here
By Using packet size increasing or decreasing we can handle traffic.

There is a load to the Table every one hour and 24/7.Morning trafic is high, afternoon trafiic is less, Night trafiic is high.According to this situation which Utility you use and how do you load,which utility used?


At unix prompt,Please execute in the below way.
Tpump is suggestable here
By Using packet size increasing or decreasing we can handle traffic.

How to run the Teradata jobs in unix environment ?


$Sh>BTEQ<[Script Path]>[Logfile Path]
or
$Sh>BTEQ<[Script Path] TEE [Logfile Path]

Not only these ways ,there are many ways.
Similarly to other utilities.

Important Components:


a.PE(Parsing Engine )
b.Bynet
c.Amp (Access Module Processor)
d.VSS(Virtual Storage System) (from Teradata13)
e.Vdisk(Virtual Disk)

New Featues (Teradata 13):


Designed with the below appliances.
a.Extreme Data warehouse 1550.
d.Active Enterprise Datawarehouse 5550 H,5555 C/H
c.Internet and E-business
d.Customer Relationship Management.
e.Datamart Appliance 2500,2550,2555

Definition of Teradata? Tell me important components of Teradata?

Definition:
It is an RDBMS that drives company's Datawarehouse,Datamart,OLTP,OLAP,DSS Appliances.
a.An "Open system which executes on UNIX MR-RAS(Discontinued in teradata 13),WIN2K, Suse Linux ETC & Above OS.
b.Compatible with ANSI standards
c.Runs on Single or Multi nodes.
d.Built in Parallelism
e.Acts Like a Server.

Diff bw BTEQ and SQL Assistant ?


BTEQ:
Basic teradata query utility,sql front end,report writing and formatting features,
Interactive and batch queries,import/export across all platforms,the default no of sessions, upon login is’ 1’
TD query manager/queryman/td sql assistant :
Sql front end of ODBC compliant database,historical records of queries including –timings,-status,-row_count,random sampling feature,limit amount of data returned,import/export b/w database and PC.
Export to excel or Access

what is the diference between timestamp(0),timestamp(6)?


Timestamp (0) is 19 charecters(is yyyy-mm-ddbhh:mi:ss).
Timestamp (6) is 26 charectors(is yyyy-mm-ddbhh:mi:ss ssssss)/milli seconds extra.

when should be the statistics collected ?


Some greate guide lines to collect statistics
All non unique indices
Non index join columns
Primary index of small tables
Secondary indexs mentained in the tables

what are the examples of anOLTP enivironment ?

There are 2  examples of OLTP nvironment
 1.ATM,
2.POS(purchase order system).

what is optimization and performance tuning ?

ANS :  optimization is the skill of selecting the least expensive plan is also the fastest plan for the query to fetch reslts. Performance is a tool for improving the process for the query to perform fastest with minimalistic usage of the resources of CPU.

what are the basic rules of PI in td.


ANS : 
Here are the following basic rules of applying the above command
Only one primary index per table.
Primary index may be unique or non unique.
Primary index can be null, modified or it may be so populated that it can’t be modified.

What is the Secondary index advantages and dise


ANS : ADVANTAGES :-
     A secondary index may be created and dropped dynamically
     It can be created on  any column be it unique or not unique.
    A  table may consists of up to 32 secondary indexes
DISADVANTAGES :
 They require additional i/o to maintain their sub-tables.
If  the statistics are not collected properly then the ptimiger goes for a full table scan
There is always an overhead for additional spaces, since there no sub-tables to be created

How does indexing improve query performance ?


ANS : It is a technique to physically organize the records to enable frequently used queries to run faster. Anything,  Updated ,deleted,re-done has to be updated in the table of indexes.
  This helps the process to run faster the nexttime you operate. And indexes cannot be accessed directly by the users , only the optimizer can access the index.

what is the ERROR Table ? what are the uses of it ?


ANS :Contains rows which failed to be adjusted due to transalation error,
          Captures rows that contain duplicate values for UPAs,
          Data conversion errors and other error conditions,
          Captures rows that contain duplicate values.

What is the types of Hash functions ?



ANS : Hashrow (column(s)),
HashBucket(Hashrow),
HashAmp(HashBucket),
HashBackAmp(HashBucket),

what are the different types of temporary tables in TD?


These are 3 tables
Derived temporary tables.   // query level only (query completed automatically erase data and dd)
Volatile temporary tables.  // session level and  and (logoff session and automatic erase the table and DD)
Global temporary tables.//sesson level (logoff session and automatic erase the table data, not delete the DD).

what is BTET transaction processing over HEAD in teradata ?


BTET makes all the queries running between BT and ET as single transaction . if any of query fails than the entire set of queries will not be commited.
BTET also has an overhead with locking , since it holds locks on tables till the ET is occurred or all the queries have executed successfully
DDL statements can’t be used everywhere in BTET processing , but these statements can be given towards the ed of BTET transaction.
Using large number of BTET caued transient journal to grow and has tendancy for system restarts.

spool “Last use”?.

ANS : the particular spool file used in the step wil be used for last time and the spol space will be released since it is no longer used in further steps of the query execution.

what is residual condition ?

It means a filter is applied on particular table to lmit the numberof rows fetched into spool. Say ex : sel * from dbc.tname where sal>1500  ,Here, residual condition will act upon to fetch only employe with sal greater than 1500.

what does “psudo table” Locks mean ?

ANS : it is a false lock which is applied AMP on the table to prevent two users from getting conflicting locks with all-AMP requests.PE will etermine an particular AMP to manage all AMP Lock requests for given table and put psudo lock on the table.To put in simple terms, it’s like an University with 10 gates and at a given time you can enter the university through one gate (or main gate) for security check.


What are the advantages of other ETL tools(Informatica,DataStage,SSIS etc...) than Teradata Utilities or vice versa?

1.Incase of Old file to RunDont drop error tables simply rectify the error in the script or file and Run again.so that it runs from last configuration.TD Utilities run faster than other ETL tools incase of File to Table(Import) or Table to File (Export) Operations. Inmod and Outmod also help us to do better programing while Importing and Exporting.