Teradata Fastload Reference Manual

Teradata FastLoad processes a series of FastLoad commands and Teradata SQL statements, entered either interactively or in batch mode. The FastLoad commands handle session control and the data transfer itself; the SQL statements create, maintain, and drop tables on the Teradata Database. Table 20 of the reference manual (entering FastLoad commands) gives examples: one command directs FastLoad to log on to the Teradata Database for up to four sessions, and another directs it to begin reading at record 100 of the input data source and stop at record 100,000.

A common scripting mistake is to place BEGIN LOADING between DEFINE and its FILE specification. FILE is part of the DEFINE statement, so the two must stay together, and BEGIN LOADING must use its own correct syntax. When in doubt, the FastLoad reference manual documents the full syntax of every command.


Teradata Fastload User Guide

The FastLoad utility loads data into empty tables. Because it does not use transient journals, it loads data quickly. It does not load duplicate rows, even if the target table is a MULTISET table.

Limitation

The target table must not have secondary indexes, join indexes, or foreign key references.

How FastLoad Works

FastLoad is executed in two phases.

Phase 1

  • The parsing engines read the records from the input file and send a block to each AMP.

  • Each AMP stores the blocks of records.

  • The AMPs then hash each record and redistribute it to the correct AMP.

  • At the end of Phase 1, each AMP has its rows but they are not in row hash sequence.

Phase 2

  • Phase 2 starts when FastLoad receives the END LOADING statement.

  • Each AMP sorts the records on row hash and writes them to the disk.

  • Locks on the target table are released and the error tables are dropped.
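
The two phases can be pictured with a small simulation. The following Python sketch is an illustration only, not Teradata's implementation: the CRC32-based row_hash function, the round-robin block delivery, the block size, and the sample records are all assumptions made for this example.

```python
import zlib
from itertools import cycle


def row_hash(row, key_index=0):
    # Stand-in for the real row hash: CRC32 of the (assumed) primary index value.
    return zlib.crc32(str(row[key_index]).encode("utf-8"))


def phase1(records, num_amps, block_size=2):
    """Deliver blocks to AMPs round-robin, then redistribute each row by hash."""
    received = [[] for _ in range(num_amps)]
    amp_ids = cycle(range(num_amps))
    for start in range(0, len(records), block_size):
        received[next(amp_ids)].extend(records[start:start + block_size])

    # Each AMP hashes the rows it received and forwards them to the owning AMP.
    owned = [[] for _ in range(num_amps)]
    for amp_rows in received:
        for row in amp_rows:
            owned[row_hash(row) % num_amps].append(row)
    return owned  # rows are on the right AMP but not yet in row hash sequence


def phase2(owned):
    """Each AMP sorts its rows on row hash before writing them out."""
    return [sorted(rows, key=row_hash) for rows in owned]


# Made-up sample records: (employee number, name).
records = [(101, "A"), (102, "B"), (103, "C"), (104, "D"), (105, "E")]
for amp_id, rows in enumerate(phase2(phase1(records, num_amps=3))):
    print("AMP", amp_id, rows)
```

Running the sketch prints each simulated AMP with the rows it owns, in row hash order.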

Example

Create a text file with the following records and name the file as employee.txt.

Following is a sample FastLoad script to load the above file into Employee_Stg table.

Executing a FastLoad Script

Once the input file employee.txt is created and the FastLoad script is saved as EmployeeLoad.fl, you can run the FastLoad script using the following command on UNIX and Windows.

Once the above command is executed, the FastLoad script runs and produces a log. In the log, you can see the number of records processed by FastLoad and the status code.

FastLoad Terms

The following terms are commonly used in a FastLoad script.

  • LOGON − Logs into Teradata and initiates one or more sessions.

  • DATABASE − Sets the default database.

  • BEGIN LOADING − Identifies the table to be loaded.

  • ERRORFILES − Identifies the two error tables that need to be created or updated.

  • CHECKPOINT − Defines when to take checkpoint.

  • SET RECORD − Specifies if the input file format is formatted, binary, text or unformatted.

  • DEFINE − Defines the input file layout.

  • FILE − Specifies the input file name and path.

  • INSERT − Inserts the records from the input file into the target table.

  • END LOADING − Initiates phase 2 of the FastLoad. Distributes the records into the target table.

  • LOGOFF − Ends all sessions and terminates FastLoad.

SAS/ACCESS Interface to Teradata


Overview

To significantly improve performance when loading data, SAS/ACCESS Interface to Teradata provides these facilities. These correspond to native Teradata utilities.

  • FastLoad

  • MultiLoad

  • Multi-Statement

SAS/ACCESS also supports the Teradata Parallel Transporter application programming interface (TPT API), which you can also use with these facilities.

Using FastLoad

FastLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports a bulk-load capability called FastLoad that greatly accelerates insertion of data into empty Teradata tables. For general information about using FastLoad and error recovery, see the Teradata FastLoad documentation. SAS/ACCESS examples are available.

Note: Implementation of the SAS/ACCESS FastLoad facility will change in a future release of SAS. So you might need to change the SAS programming statements and options that you specify to enable this feature in the future.

The SAS/ACCESS FastLoad facility is similar to the native Teradata FastLoad Utility. They share these limitations:

  • FastLoad can load only empty tables; it cannot append to a table that already contains data. If you attempt to use FastLoad when appending to a table that contains rows, the append step fails.

  • Both the Teradata FastLoad Utility and the SAS/ACCESS FastLoad facility log data errors to tables. Error recovery can be difficult. To find the error that corresponds to the code that is stored in the error table, see the Teradata FastLoad documentation.

  • FastLoad does not load duplicate rows (rows where all corresponding fields contain identical data) into a Teradata table. If your SAS data set contains duplicate rows, you can use the normal insert (load) process.
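
Because duplicate rows are silently left out, it can help to check a data set for them before choosing FastLoad. The following Python sketch is one way to do that outside SAS; the function name split_duplicates and the sample rows are hypothetical, and the check simply treats two rows as duplicates when every field matches, as described above.

```python
def split_duplicates(rows):
    """Separate rows into those FastLoad would keep and exact duplicates."""
    seen = set()
    kept, duplicates = [], []
    for row in rows:
        key = tuple(row)  # every field must match for a row to be a duplicate
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            kept.append(row)
    return kept, duplicates


# Made-up sample data: the third row repeats the first one exactly.
rows = [(1, "north"), (2, "south"), (1, "north"), (1, "south")]
kept, duplicates = split_duplicates(rows)
print(len(kept), "distinct rows,", len(duplicates), "duplicate rows")
```

If the second count is not zero and the duplicates matter, the normal insert process is the safer choice.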


Starting FastLoad

If you do not specify FastLoad, your Teradata tables are loaded normally (slowly). To start FastLoad in the SAS/ACCESS interface, you can use one of these items:

  • the BULKLOAD=YES data set option in a processing step that populates an empty Teradata table

  • the BULKLOAD=YES LIBNAME option on the destination libref (the Teradata DBMS library where one or more intended tables are to be created and loaded)

  • the FASTLOAD= alias for either of these options


FastLoad Data Set Options

Here are the data set options that you can use with the FastLoad facility.

  • BL_LOG= specifies the names of error tables that are created when you use the SAS/ACCESS FastLoad facility. By default, FastLoad errors are logged in Teradata tables named SAS_FASTLOAD_ERRS1_randnum and SAS_FASTLOAD_ERRS2_randnum, where randnum is a randomly generated number. For example, if you specify BL_LOG=my_load_errors, errors are logged in tables my_load_errors1 and my_load_errors2. If you specify BL_LOG=errtab, errors are logged in tables named errtab1 and errtab2.

    Note: SAS/ACCESS automatically deletes the error tables if no errors are logged. If errors occur, the tables are retained and SAS/ACCESS issues a warning message that includes the names of the error tables.

  • DBCOMMIT=n causes a Teradata 'checkpoint' after each group of n rows is transmitted. Using checkpoints slows performance but provides known synchronization points if failure occurs during the loading process. Checkpoints are not used by default if you do not explicitly set DBCOMMIT= and BULKLOAD=YES. The Teradata alias for this option is CHECKPOINT=.
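
The effect of DBCOMMIT=n can be pictured as a checkpoint marker after every n transmitted rows. The following Python sketch models only that idea; the function names, the event tuples, and the resume_point helper are hypothetical and do not describe how SAS or Teradata implement checkpoints internally.

```python
def send_with_checkpoints(rows, dbcommit=None):
    """List the events of a load: each row, plus a checkpoint after every
    complete group of `dbcommit` rows. No checkpoints if dbcommit is unset."""
    events = []
    for count, row in enumerate(rows, start=1):
        events.append(("row", row))
        if dbcommit and count % dbcommit == 0:
            events.append(("checkpoint", count))
    return events


def resume_point(events):
    """Row count at the last checkpoint: the known synchronization point."""
    last = 0
    for kind, value in events:
        if kind == "checkpoint":
            last = value
    return last


events = send_with_checkpoints(range(10), dbcommit=4)
print("last synchronization point: row", resume_point(events))
```

A smaller n gives more synchronization points at the cost of more checkpoint overhead, which is the trade-off noted above.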

To see whether threaded reads are actually generated, turn on SAS tracing by setting OPTIONS SASTRACE=',,,d' in your program.

Using MultiLoad

MultiLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports a bulk-load capability called MultiLoad that greatly accelerates insertion of data into Teradata tables. For general information about using MultiLoad with Teradata tables and for information about error recovery, see the Teradata MultiLoad documentation. SAS/ACCESS examples are available.

Unlike FastLoad, which only loads empty tables, MultiLoad loads both empty and existing Teradata tables. If you do not specify MultiLoad, your Teradata tables are loaded normally (inserts are sent one row at a time).

The SAS/ACCESS MultiLoad facility loads both empty and existing Teradata tables. SAS/ACCESS supports these features:

  • You can load only one target table at a time.

  • Only insert operations are supported.

Because the SAS/ACCESS MultiLoad facility is similar to the native Teradata MultiLoad utility, they share a limitation: you must drop the following items on the target tables before the load:

  • unique secondary indexes

  • foreign key references

  • join indexes

Both the Teradata MultiLoad utility and the SAS/ACCESS MultiLoad facility log data errors to tables. Error recovery can be difficult, but you can restart from the last checkpoint. To find the error that corresponds to the code that is stored in the error table, see the Teradata MultiLoad documentation.


MultiLoad Setup

Here are the requirements for using the MultiLoad bulk-load capability in SAS.

  • The native Teradata MultiLoad utility must be present on your system. If you do not have the Teradata MultiLoad utility and you want to use it with SAS, contact Teradata to obtain the utility.

  • SAS must be able to locate the Teradata MultiLoad utility on your system.

  • The Teradata MultiLoad utility must be able to locate the SASMlam access module and the SasMlne exit routine. They are supplied with SAS/ACCESS Interface to Teradata.

  • SAS MultiLoad requires Teradata client TTU 8.2 or later.

If this was not already done as part of the post-installation configuration process, see the SAS configuration documentation for your system for information about how to configure SAS to work with MultiLoad.

MultiLoad Data Set Options

Call the SAS/ACCESS MultiLoad facility by specifying MULTILOAD=YES. See the MULTILOAD= data set option for detailed information and examples on loading data and recovering from errors during the load process.

Here are the data set options that are available for use with the MultiLoad facility. For detailed information about these options, see Data Set Options for Relational Databases.

  • MBUFSIZE=

  • ML_CHECKPOINT=

  • ML_ERROR1= lets the user name the error table that MultiLoad uses for tracking errors from the acquisition phase. See the Teradata MultiLoad reference for more information about what is stored in this table. By default, the acquisition error table is named SAS_ML_ET_randnum where randnum is a random number. When restarting a failed MultiLoad job, you need to specify the same acquisition table from the earlier run so that the MultiLoad job can restart correctly. Note that the same log table, application error table, and work table must also be specified upon restarting, using the ML_RESTART, ML_ERROR2, and ML_WORK data set options. ML_ERROR1 and ML_LOG are mutually exclusive and cannot be specified together.

  • ML_ERROR2=

  • ML_LOG= specifies a prefix for the temporary tables that the Teradata MultiLoad utility uses during the load process. The MultiLoad utility uses a log table, two error tables, and a work table while loading data to the target table. These tables are named by default as SAS_ML_RS_randnum, SAS_ML_ET_randnum, SAS_ML_UT_randnum, and SAS_ML_WT_randnum where randnum is a randomly generated number. ML_LOG= is used to override the default names. For example, if you specify ML_LOG=MY_LOAD the log table is named MY_LOAD_RS. Errors are logged in tables MY_LOAD_ET and MY_LOAD_UT. The work table is named MY_LOAD_WT.

  • ML_RESTART= lets the user name the log table that MultiLoad uses for tracking checkpoint information. By default, the log table is named SAS_ML_RS_randnum where randnum is a random number. When restarting a failed MultiLoad job, you need to specify the same log table from the earlier run so that the MultiLoad job can restart correctly. Note that the same error tables and work table must also be specified upon restarting the job, using the ML_ERROR1, ML_ERROR2, and ML_WORK data set options. ML_RESTART and ML_LOG are mutually exclusive and cannot be specified together.

  • ML_WORK= lets the user name the work table that MultiLoad uses for loading the target table. See the Teradata MultiLoad reference for more information about what is stored in this table. By default, the work table is named SAS_ML_WT_randnum where randnum is a random number. When restarting a failed MultiLoad job, you need to specify the same work table from the earlier run so that the MultiLoad job can restart correctly. Note that the same log table, acquisition error table, and application error table must also be specified upon restarting the job, using the ML_RESTART, ML_ERROR1, and ML_ERROR2 data set options. ML_WORK and ML_LOG are mutually exclusive and cannot be specified together.

  • SLEEP= specifies the number of minutes that MultiLoad waits before it retries a logon operation when the maximum number of utilities are already running on the Teradata database. The default value is 6. SLEEP= functions very much like the SLEEP run-time option of the native Teradata MultiLoad utility.

  • TENACITY= specifies the number of hours that MultiLoad tries to log on when the maximum number of utilities are already running on the Teradata database. The default value is 4. TENACITY= functions very much like the TENACITY run-time option of the native Teradata MultiLoad utility.
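
The naming rules for the MultiLoad temporary tables can be summarized in a short Python sketch. It is a model of the rules stated above, not SAS code: the function name, the dictionary keys, and the range used for randnum are hypothetical, and treating ML_ERROR2= as mutually exclusive with ML_LOG= (like the other three options) is an assumption, because the text above does not describe ML_ERROR2=.

```python
import random

# Role of each table -> suffix used in both default and ML_LOG= names.
SUFFIXES = {"log": "RS", "error1": "ET", "error2": "UT", "work": "WT"}


def multiload_table_names(ml_log=None, ml_restart=None, ml_error1=None,
                          ml_error2=None, ml_work=None, randnum=None):
    """Return the log, error, and work table names a load would use."""
    individual = {"log": ml_restart, "error1": ml_error1,
                  "error2": ml_error2, "work": ml_work}
    if ml_log is not None:
        if any(name is not None for name in individual.values()):
            raise ValueError("ML_LOG= cannot be combined with ML_RESTART=, "
                             "ML_ERROR1=, ML_ERROR2=, or ML_WORK=")
        return {role: f"{ml_log}_{suffix}" for role, suffix in SUFFIXES.items()}
    if randnum is None:
        randnum = random.randint(0, 99999)  # assumed format of randnum
    return {role: individual[role] or f"SAS_ML_{suffix}_{randnum}"
            for role, suffix in SUFFIXES.items()}


print(multiload_table_names(ml_log="MY_LOAD"))
```

When restarting a failed job, the same four names from the earlier run have to be supplied again, which is why it is worth recording them (or using ML_LOG= so that they are predictable).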

Be aware that these options are disabled while you are using the SAS/ACCESS MultiLoad facility.

  • The DBCOMMIT= LIBNAME and data set options are disabled because DBCOMMIT= functions very differently from CHECKPOINT of the native Teradata MultiLoad utility.

  • The ERRLIMIT= data set option is disabled because the number of errors is not known until all records have been sent to MultiLoad. The default value of ERRLIMIT=1 is not honored.

To see whether threaded reads are actually generated, turn on SAS tracing by setting OPTIONS SASTRACE=',,,d' in your program.

Using the TPT API


TPT API Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for loading data. The TPT API provides a consistent interface for FastLoad, MultiLoad, and Multi-Statement insert. TPT API documentation refers to FastLoad as the load driver, MultiLoad as the update driver, and Multi-Statement insert as the stream driver. SAS supports all three load methods and can restart loading from checkpoints when you use the TPT API with any of them.


TPT API Setup

Here are the requirements for using the TPT API in SAS for loading data.

  • Loading data from SAS to Teradata using the TPT API requires Teradata client TTU 8.2 or later. Verify that you have applied all of the latest Teradata eFixes.

  • This feature is supported only on platforms for which Teradata provides the TPT API.

  • The native TPT API infrastructure must be present on your system. Contact Teradata if you do not already have it but want to use it with SAS.

The SAS configuration document for your system contains information about how to configure SAS to work with the TPT API. However, those steps might already have been completed as part of the post-installation configuration process for your site.



TPT API LIBNAME Options

The TPT= LIBNAME option is common to all three supported load methods. If SAS cannot use the TPT API, it reverts to using FastLoad, MultiLoad, or Multi-Statement insert, depending on which method of loading was requested, without generating any errors.
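
The silent fallback can be expressed as a small decision function. The following Python sketch only models the rule described in this section; the function name, the method keys, and the returned strings are hypothetical, and the tpt_modules_found flag stands in for whatever check SAS performs to locate the TPT API modules.

```python
# TPT API driver names as used in the TPT API documentation.
TPT_DRIVERS = {
    "fastload": "load driver",
    "multiload": "update driver",
    "multistatement": "stream driver",
}


def choose_load_path(method, tpt=True, tpt_modules_found=True):
    """Pick the TPT API driver when possible; otherwise fall back quietly
    to the non-TPT version of the requested load method."""
    if method not in TPT_DRIVERS:
        raise ValueError(f"unknown load method: {method!r}")
    if tpt and tpt_modules_found:
        return f"TPT API {TPT_DRIVERS[method]}"
    return f"legacy {method}"  # no error is raised for the fallback


print(choose_load_path("fastload"))
print(choose_load_path("fastload", tpt_modules_found=False))
```

Because the fallback raises no error, tracing or log inspection is the only way to confirm which path a job actually used.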


TPT API Data Set Options

These data set options are common to all three supported load methods:

  • SLEEP=

  • TENACITY=

  • TPT=

  • TPT_CHECKPOINT_DATA=

  • TPT_DATA_ENCRYPTION=

  • TPT_LOG_TABLE=

  • TPT_MAX_SESSIONS=

  • TPT_MIN_SESSIONS=

  • TPT_RESTART=

  • TPT_TRACE_LEVEL=

  • TPT_TRACE_LEVEL_INF=

  • TPT_TRACE_OUTPUT=
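
SLEEP= and TENACITY=, described earlier for MultiLoad, together define a retry loop: wait SLEEP minutes between logon attempts and give up after TENACITY hours. The following Python sketch shows one plausible reading of that behavior, using the MultiLoad defaults of 6 minutes and 4 hours; the function and parameter names are hypothetical, and the exact timing rules of the Teradata utilities may differ.

```python
import time


def logon_with_retry(try_logon, sleep_minutes=6, tenacity_hours=4,
                     sleep=time.sleep, now=time.monotonic):
    """Call try_logon() until it succeeds, pausing sleep_minutes between
    attempts, and raise TimeoutError once tenacity_hours would be exceeded."""
    deadline = now() + tenacity_hours * 3600
    attempts = 0
    while True:
        attempts += 1
        if try_logon():
            return attempts
        if now() + sleep_minutes * 60 > deadline:
            raise TimeoutError(f"gave up after {attempts} logon attempts")
        sleep(sleep_minutes * 60)


# Demonstration with a simulated clock so that nothing really waits.
clock = [0.0]


def fake_sleep(seconds):
    clock[0] += seconds


def fake_now():
    return clock[0]


outcomes = iter([False, False, True])  # the third attempt finds a free slot
attempts = logon_with_retry(lambda: next(outcomes),
                            sleep=fake_sleep, now=fake_now)
print(attempts, "attempts,", clock[0] / 60, "simulated minutes waited")
```

Passing the sleep and clock functions as parameters keeps the sketch testable without real delays.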


TPT API FastLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for FastLoad, also known as the load driver. SAS/ACCESS works by interfacing with the load driver through the TPT API, which in turn uses the Teradata FastLoad protocol for loading data. See your Teradata documentation for more information about the load driver.

This is the default FastLoad method. If SAS cannot find the Teradata modules that are required for the TPT API, or if TPT=NO is specified, then SAS/ACCESS uses the old method of FastLoad. SAS/ACCESS can restart FastLoad from checkpoints when FastLoad uses the TPT API. The SAS/ACCESS FastLoad facility using the TPT API is similar to the native Teradata FastLoad utility. They share these limitations.

  • FastLoad can load only empty tables. It cannot append to a table that already contains data. If you try to use FastLoad when appending to a table that contains rows, the append step fails.

  • Data errors are logged in Teradata tables. Error recovery can be difficult if you do not set TPT_CHECKPOINT_DATA= to enable restart from the last checkpoint. To find the error that corresponds to the code that is stored in the error table, see your Teradata documentation. You can restart a failed job from the last checkpoint by following the instructions in the SAS error log.

  • FastLoad does not load duplicate rows (those where all corresponding fields contain identical data) into a Teradata table. If your SAS data set contains duplicate rows, you can use other load methods.


Starting FastLoad with the TPT API

See the SAS configuration document for instructions on setting up the environment so that SAS can find the TPT API modules.

You can use one of these options to start FastLoad in the SAS/ACCESS interface using the TPT API:

  • the TPT=YES data set option in a processing step that populates an empty Teradata table

  • the TPT=YES LIBNAME option on the destination libref (the Teradata DBMS library where one or more tables are to be created and loaded)


FastLoad with TPT API Data Set Options

These data set options are specific to FastLoad using the TPT API:

  • TPT_BUFFER_SIZE=

  • TPT_ERROR_TABLE_1=

  • TPT_ERROR_TABLE_2=


TPT API MultiLoad Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for MultiLoad, also known as the update driver. SAS/ACCESS works by interfacing with the update driver through the TPT API. This API then uses the Teradata MultiLoad protocol for loading data. See your Teradata documentation for more information about the update driver.

This is the default MultiLoad method. If SAS cannot find the Teradata modules that are required for the TPT API, or if TPT=NO is specified, then SAS/ACCESS uses the old method of MultiLoad. SAS/ACCESS can restart MultiLoad from checkpoints when MultiLoad uses the TPT API.

The SAS/ACCESS MultiLoad facility loads both empty and existing Teradata tables. SAS/ACCESS supports only insert operations and loading only one target table at a time.

The SAS/ACCESS MultiLoad facility using the TPT API is similar to the native Teradata MultiLoad utility. A common limitation that they share is that you must drop these items on target tables before the load:

  • unique secondary indexes

  • foreign key references

  • join indexes

Errors are logged to Teradata tables. Error recovery can be difficult if you do not set TPT_CHECKPOINT_DATA= to enable restart from the last checkpoint. To find the error that corresponds to the code that is stored in the error table, see your Teradata documentation. You can restart a failed job from the last checkpoint by following the instructions in the SAS error log.


Starting MultiLoad with the TPT API

See the SAS configuration document for instructions on setting up the environment so that SAS can find the TPT API modules.

You can use one of these options to start MultiLoad in the SAS/ACCESS interface using the TPT API:

  • the TPT=YES data set option in a processing step that populates an empty Teradata table

  • the TPT=YES LIBNAME option on the destination libref (the Teradata DBMS library where one or more tables are to be created and loaded)


MultiLoad with TPT API Data Set Options

These data set options are specific to MultiLoad using the TPT API:

  • TPT_BUFFER_SIZE=

  • TPT_ERROR_TABLE_1=

  • TPT_ERROR_TABLE_2=


TPT API Multi-Statement Insert Supported Features and Restrictions

SAS/ACCESS Interface to Teradata supports the TPT API for Multi-Statement insert, also known as the stream driver. SAS/ACCESS works by interfacing with the stream driver through the TPT API, which in turn uses the Teradata Multi-Statement insert (TPump) protocol for loading data. See your Teradata documentation for more information about the stream driver.

This is the default Multi-Statement insert method. If SAS cannot find the Teradata modules that are required for the TPT API, or if TPT=NO is specified, then SAS/ACCESS uses the old method of Multi-Statement insert. SAS/ACCESS can restart Multi-Statement insert from checkpoints when Multi-Statement insert uses the TPT API.

The SAS/ACCESS Multi-Statement insert facility loads both empty and existing Teradata tables. SAS/ACCESS supports only insert operations and loading only one target table at a time.

Errors are logged to Teradata tables. Error recovery can be difficult if you do not set TPT_CHECKPOINT_DATA= to enable restart from the last checkpoint. To find the error that corresponds to the code that is stored in the error table, see your Teradata documentation. You can restart a failed job from the last checkpoint by following the instructions in the SAS error log.


Starting Multi-Statement Insert with the TPT API

See the SAS configuration document for instructions on setting up the environment so that SAS can find the TPT API modules.

You can use one of these options to start Multi-Statement insert in the SAS/ACCESS interface using the TPT API:

  • the TPT=YES data set option in a processing step that populates an empty Teradata table

  • the TPT=YES LIBNAME option on the destination libref (the Teradata DBMS library where one or more tables are to be created and loaded)


Multi-Statement Insert with TPT API Data Set Options

These data set options are specific to Multi-Statement insert using the TPT API.

  • TPT_PACK=

  • TPT_PACKMAXIMUM=

Examples

This example starts the FastLoad facility.

This next example uses FastLoad to append SAS data to an empty Teradata table and specifies the BL_LOG= option to name the error tables Append_Err1 and Append_Err2. In practice, applications typically append many rows.

This example starts the MultiLoad facility.

This example loads data using TPT FastLoad.

This example restarts a MultiLoad step that recorded checkpoints and failed after loading 2000 rows of data.
