SAS Data Integration Studio 3.4: User's Guide

The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2007. SAS Data Integration Studio 3.4: User's Guide. Cary, NC: SAS Institute Inc.

SAS Data Integration Studio 3.4: User's Guide
Copyright © 2002-2007, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-59994-198-1

All rights reserved. Produced in the United States of America.

For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.

U.S. Government Restricted Rights Notice. Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19 Commercial Computer Software-Restricted Rights (June 1987).

SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st printing, May 2007

SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/pubs or call 1-800-727-3228.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries.
® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

Contents

PART 1: Introduction 1

Chapter 1: Introduction to SAS Data Integration 3
    About SAS Data Integration 3
    A Basic Data Integration Environment 4
    Overview of Building a Process Flow 8
    Advantages of SAS Data Integration 11
    Online Help for SAS Data Integration Studio 12
    Administrative Documentation for SAS Data Integration Studio 12
    Accessibility Features in SAS Data Integration Studio 13

Chapter 2: About the Main Windows and Wizards 17
    Overview of the Main Windows and Wizards 18
    Metadata Windows and Wizards 20
    Job Windows and Wizards 27
    SAS Data Integration Studio Application Windows 32
    Tree View on the Desktop 36

PART 2: General User Tasks 45

Chapter 3: Getting Started 47
    Required Components for SAS Data Integration Studio 48
    Main Tasks for Creating Process Flows 49
    Starting SAS Data Integration Studio 50
    Connecting to a Metadata Server 51
    Reconnecting to a Metadata Server 54
    Selecting a Default SAS Application Server 55
    Registering Any Libraries That You Need 55
    Registering Sources and Targets 56
    Working with Change Management 59
    Specifying Global Options in SAS Data Integration Studio 63

Chapter 4: Importing, Exporting, and Copying Metadata 65
    About Metadata Management 66
    Working with SAS Metadata 66
    Best Practices for Importing or Exporting SAS Metadata 71
    Preparing to Import or Export SAS Metadata 71
    Exporting SAS Metadata 72
    Importing SAS Metadata 73
    Copying and Pasting SAS Metadata 74
    Working with Other Metadata 75
    Other Metadata That Can Be Imported and Exported 76
    Usage Notes for Importing or Exporting Other Metadata 76
    Preparing to Import or Export Other Metadata 77
    Importing As New Metadata (without Change Analysis) 77
    Importing Metadata with Change Analysis 78
    Exporting Other Metadata 80

Chapter 5: Working with Tables 83
    About Tables 85
    Registering Tables with a Source Designer 85
    Registering Tables with the Target Table Wizard 87
    Viewing or Updating Table Metadata 89
    Using a Physical Table to Update Table Metadata 90
    Specifying Options for Tables 91
    Supporting Case and Special Characters in Table and Column Names 93
    Maintaining Column Metadata 98
    Identifying and Maintaining Key Columns 104
    Maintaining Indexes 107
    Browsing Table Data 109
    Editing SAS Table Data 112
    Using the View Data Window to Create a SAS Table 115
    Specifying Browse and Edit Options for Tables and External Files 116

Chapter 6: Working with External Files 119
    About External Files 120
    Registering a Delimited External File 122
    Registering a Fixed-Width External File 126
    Registering an External File with User-Written Code 130
    Viewing or Updating External File Metadata 134
    Overriding the Code Generated by the External File Wizards 135
    Specifying NLS Support for External Files 136
    Accessing an External File With an FTP Server or an HTTP Server 136
    Viewing Data in External Files 138
    Registering a COBOL Data File That Uses a COBOL Copybook 139

Chapter 7: Creating, Executing, and Updating Jobs 141
    About Jobs 142
    Creating an Empty Job 143
    Creating a Process Flow for a Job 144
    About Job Options 145
    Submitting a Job for Immediate Execution 147
    Accessing Local and Remote Data 148
    Viewing or Updating Job Metadata 150
    Displaying the SAS Code for a Job 152
    Common Code Generated for a Job 152
    Troubleshooting a Job 155

Chapter 8: Monitoring Jobs 157
    About Monitoring Jobs 157
    Using the Job Status Manager 159
    Managing Status Handling 161
    Managing Return Code Check Transformations 162
    Maintaining Status Code Conditions and Actions 165

Chapter 9: Deploying Jobs 169
    About Deploying Jobs 170
    About Job Scheduling 170
    Deploying Jobs for Scheduling 171
    Redeploying Jobs for Scheduling 173
    Using Scheduling to Handle Complex Process Flows 174
    Deploying Jobs for Execution on a Remote Host 175
    About SAS Stored Processes 177
    Deploying Jobs as SAS Stored Processes 178
    Redeploying Jobs to Stored Processes 180
    Viewing or Updating Stored Process Metadata 181
    About Deploying Jobs for Execution by a Web Service Client 182
    Requirements for Jobs That Can Be Executed by a Web Service Client 183
    Creating a Job That Can Be Executed by a Web Service Client 185
    Deploying Jobs for Execution by a Web Service Client 188
    Using a Web Service Client to Execute a Job 191

Chapter 10: Working with Transformations 195
    About Transformations 195
    Viewing the Code for a Transformation 196
    Viewing or Updating the Metadata for Transformations 197
    Creating and Maintaining Column Mappings 197
    About Archived Transformations 204

Chapter 11: Working with Generated Code 207
    About Code Generated for Jobs 207
    Displaying the Code Generated for a Job 209
    Displaying the Code Generated for a Transformation 210
    Specifying Options for Jobs 211
    Specifying Options for a Transformation 211
    Modifying Configuration Files or SAS Start Commands for Application Servers 212

Chapter 12: Working with User-Written Code 215
    About User-Written Code 216
    Editing the Generated Code for a Job 218
    Adding User-Written Source Code to an Existing Job 219
    Creating a New Job That Consists of User-Written Code 220
    Editing the Generated Code for a Transformation 221
    About the User Written Code Transformation 222
    Creating a Job That Includes the User Written Code Transformation 223
    Creating and Using a Generated Transformation 226
    Maintaining a Generated Transformation 232

Chapter 13: Optimizing Process Flows 237
    About Process Flow Optimization 238
    Managing Process Data 238
    Managing Columns 243
    Streamlining Process Flow Components 245
    Using Simple Debugging Techniques 246
    Using SAS Logs 249
    Reviewing Temporary Output Tables 251
    Additional Information 253

Chapter 14: Using Impact Analysis 255
    About Impact Analysis and Reverse Impact Analysis 255
    Prerequisites 256
    Performing Impact Analysis 257
    Performing Impact Analysis on a Generated Transformation 258
    Performing Reverse Impact Analysis 259

PART 3: Working with Specific Transformations 261

Chapter 15: Working with Loader Transformations 263
    About Loader Transformations 263
    Setting Table Loader Transformation Options 265
    Selecting a Load Technique 270
    Removing Non-Essential Indexes and Constraints During a Load 272
    Considering a Bulk Load 273

Chapter 16: Working with SAS Sorts 275
    About SAS Sort Transformations 275
    Setting Sort Options 276
    Optimizing Sort Performance 277
    Creating a Table That Sorts the Contents of a Source 279

Chapter 17: Working with the SQL Join Transformation 283
    About SQL Join Transformations 285
    Using the SQL Designer Tab 286
    Reviewing and Modifying Clauses, Joins, and Tables in an SQL Query 287
    Understanding Automatic Joins 289
    Selecting the Join Type 292
    Adding User-Written SQL Code 295
    Debugging an SQL Query 296
    Adding a Column to the Target Table 297
    Adding a Join to an SQL Query in the Designer Tab 298
    Creating a Simple SQL Query 299
    Configuring a SELECT Clause 301
    Adding a CASE Expression 303
    Creating or Configuring a WHERE Clause 305
    Adding a GROUP BY Clause and a HAVING Clause 307
    Adding an ORDER BY Clause 310
    Adding Subqueries 311
    Submitting an SQL Query 315
    Joining a Table to Itself 316
    Using Parameters with an SQL Join 318
    Constructing a SAS Scalable Performance Data Server Star Join 319
    Optimizing SQL Processing Performance 320
    Performing General Data Optimization 321
    Influencing the Join Algorithm 322
    Setting the Implicit Property for a Join 324
    Enabling Pass-Through Processing 325
    Using Property Sheet Options to Optimize SQL Processing Performance 327

Chapter 18: Working with Iterative Jobs and Parallel Processing 331
    About Iterative Jobs 331
    Creating and Running an Iterative Job 332
    Creating a Parameterized Job 335
    Creating a Control Table 338
    About Parallel Processing 340
    Setting Options for Parallel Processing 341

Chapter 19: Working with Slowly Changing Dimensions 343
    About Slowly Changing Dimensions (SCD) 344
    Loading a Dimension Table Using Begin and End Datetime Values 349
    Loading a Dimension Table Using Version Numbers or Current-Row Indicators 353
    Loading a Fact Table 353
    Generating Retained Keys for an SCD Dimension Table 356
    Updating Closed-Out Rows in SCD Dimension Tables 357
    Optimizing SQL Pass-Through in the SCD Type 2 Loader
358

Chapter 20: Working with Message Queues 361
    About Message Queues 361
    Selecting Message Queue Transformations 363
    Processing a WebSphere MQ Queue 364

Chapter 21: Working with SPD Server Cluster Tables 369
    About SPD Server Clusters 369
    Creating an SPD Server Cluster 370
    Maintaining an SPD Server Cluster 371

PART 4: Appendixes 373

Appendix 1: Recommended Reading 375
    Recommended Reading 375

Glossary 377

Index 385

PART 1: Introduction

Chapter 1: Introduction to SAS Data Integration 3
Chapter 2: About the Main Windows and Wizards 17

CHAPTER 1: Introduction to SAS Data Integration

    About SAS Data Integration 3
    A Basic Data Integration Environment 4
        Overview of a Data Integration Environment 4
        SAS Management Console 5
        SAS Data Integration Studio 5
        Servers 6
            SAS Application Servers 6
            SAS Data Servers 7
            Database Management System (DBMS) Servers 7
            Enterprise Resource Management Servers 8
        Libraries 8
        Additional Information 8
    Overview of Building a Process Flow 8
        Problem 8
        Solution 9
        Tasks 9
            Connect to the Metadata Server 9
            Register Source Tables 9
            Register Target Tables 10
            Create a Job That Specifies the Desired Process Flow 10
            Run the Job 11
            Next Tasks 11
        Impact of Change Management 11
    Advantages of SAS Data Integration 11
    Online Help for SAS Data Integration Studio 12
    Administrative Documentation for SAS Data Integration Studio 12
    Accessibility Features in SAS Data Integration Studio 13
        Accessibility Standards 13
        Enabling Assistive Technologies 15

About SAS Data Integration

Data integration is the process of consolidating data from a variety of sources in order to produce a unified view of the data. SAS supports data integration in the following ways:

- Connectivity and metadata. A shared metadata environment provides consistent data definition across all data sources. SAS software enables you to connect to, acquire, store, and write data back to a variety of data stores, streams, applications, and systems on a variety of platforms and in many different environments.
For example, you can manage information in Enterprise Resource Planning (ERP) systems; relational database management systems (RDBMS), flat files, legacy systems, message queues, and XML.

- Data cleansing and enrichment. Integrated SAS Data Quality software enables you to profile, cleanse, augment, and monitor data to create consistent, reliable information. SAS Data Integration Studio provides a number of transformations and functions that can improve the quality of your data.

- Extraction, transformation, and loading (ETL). SAS Data Integration Studio enables you to extract, transform, and load data from across the enterprise to create consistent, accurate information. It provides a point-and-click interface that enables designers to build process flows, quickly identify inputs and outputs, and create business rules in metadata, all of which enable the rapid generation of data warehouses, data marts, and data streams.

- Migration and synchronization. SAS Data Integration Studio enables you to migrate, synchronize, and replicate data among different operational systems and data sources. Data transformations are available for altering, reformatting, and consolidating information. Real-time data quality integration allows data to be cleansed as it is being moved, replicated, or synchronized, and you can easily build a library of reusable business rules.

- Data federation. SAS Data Integration Studio enables you to query and use data across multiple systems without the physical movement of source data. It provides virtual access to database structures, ERP applications, legacy files, text, XML, message queues, and a host of other sources. It enables you to join data across these virtual data sources for real-time access and analysis. The semantic business metadata layer shields business staff from underlying data complexity.

- Master data management.
SAS Data Integration Studio enables you to create a unified view of enterprise data from multiple sources. Semantic data descriptions of input and output data sources uniquely identify each instance of a business element (such as customer, product, and account) and standardize the master data model to provide a single source of truth. Transformations and embedded data quality processes ensure that master data is correct.

A Basic Data Integration Environment

Overview of a Data Integration Environment

The following figure shows the main clients and servers in a SAS data integration environment.

Figure 1.1 SAS Data Integration Studio Environment (diagram: the SAS Management Console and SAS Data Integration Studio clients connect to a SAS Metadata Server and its metadata repository; a SAS Workspace Server, a SAS/CONNECT server, and other servers read source data and write target data)

Administrators use SAS Management Console to connect to a SAS Metadata Server. They enter metadata about servers, libraries, and other resources on your network and save this metadata to a repository. SAS Data Integration Studio users connect to the same metadata server and register any additional libraries and tables that they need. Then, they create process flows that read source tables and create target tables in physical storage.

SAS Management Console

SAS Management Console provides a single interface through which administrators can explore and manage metadata repositories. With this interface, administrators can efficiently set up system resources, manage user and group accounts, and administer security.

SAS Data Integration Studio

SAS Data Integration Studio is a visual design tool that enables you to consolidate and manage enterprise data from a variety of source systems, applications, and technologies.
This software enables you to create process flows that accomplish the following tasks:

- extract, transform, and load data for use in data warehouses and data marts
- cleanse, migrate, synchronize, replicate, and promote data for applications and business services

SAS Data Integration Studio enables you to create metadata that defines sources, targets, and the processes that connect them. This metadata is stored in one or more shareable repositories. SAS Data Integration Studio uses the metadata to generate or retrieve SAS code that reads sources and creates targets in physical storage. Other applications that share the same repositories can use the metadata to access the targets and use them as the basis for reports, queries, or analyses.

Through its metadata, SAS Data Integration Studio provides a single point of control for managing the following resources:

- data sources (from any platform that is accessible to SAS and from any format that is accessible to SAS)
- data targets (to any platform that is accessible to SAS, and to any format that is supported by SAS)
- processes that specify how data is extracted, transformed, and loaded from a source to a target
- jobs that organize a set of sources, targets, and processes (transformations)
- source code generated by SAS Data Integration Studio
- user-written source code

Note: SAS Data Integration Studio was formerly named SAS ETL Studio.

Servers

SAS Application Servers

When the SAS Intelligence Platform was installed at your site, a metadata object that represents the SAS server tier in your environment was defined. In the SAS Management Console interface, this type of object is called a SAS Application Server. If you have a SAS server, such as a SAS Workspace Server, on the same machine as your SAS Metadata Server, the application server object is named SASMain; otherwise, it is named SASApp.

A SAS Application Server is not an actual server that can execute SAS code submitted by clients.
Rather, it is a logical container for a set of application server components, which do execute code (typically SAS code, although some components can execute Java code or MDX queries). For example, a SAS Application Server might contain a workspace server, which can execute SAS code that is generated by clients such as SAS Data Integration Studio. A SAS Application Server might also contain a stored process server, which executes SAS Stored Processes, and a SAS/CONNECT server, which can upload or download data and execute SAS code submitted from a remote machine.

The following table lists the main SAS Application Server components and describes how each one is used.

Table 1.1 SAS Application Servers

- SAS Metadata Server. How used: reads and writes metadata in a SAS Metadata Repository. How specified: in each user's metadata profile.
- SAS Workspace Server. How used: executes SAS code; reads and writes data. How specified: as a component in a SAS Application Server object.
- SAS/CONNECT Server. How used: submits generated SAS code to machines that are remote from the default SAS Application Server; can also be used for interactive access to remote libraries. How specified: as a component in a SAS Application Server object.
- SAS OLAP Server. How used: creates cubes and processes queries against cubes. How specified: as a component in a SAS Application Server object.
- Stored Process Server. How used: submits stored processes for execution by a SAS session. Stored processes are SAS programs that are stored and can be executed by client applications. How specified: as a component in a SAS Application Server object.
- SAS Grid Server. How used: supports a compute grid that can execute grid-enabled jobs created in SAS Data Integration Studio. How specified: as a component in a SAS Application Server object.

Typically, administrators install, start, and register SAS Application Server components.
SAS Data Integration Studio users are told which SAS Application Server object to use.

SAS Data Servers

The following table lists two special-purpose servers for managing SAS data.

Table 1.2 SAS Data Servers

- SAS/SHARE Server. How used: enables concurrent access of server libraries from multiple users. How specified: in a SAS/SHARE library.
- SAS Scalable Performance Data (SPD) Server. How used: provides parallel processing for large SAS data stores; provides a comprehensive security infrastructure, backup and restore utilities, and sophisticated administrative and tuning options. How specified: in an SPD Server library.

Typically, administrators install, start, and register these servers and register the SAS/SHARE library or the SPD Server library. SAS Data Integration Studio users are told which library to use.

Database Management System (DBMS) Servers

SAS Data Integration Studio uses a SAS Application Server and a database server to access tables in database management systems such as Oracle and DB2.

When you start a source designer or a target designer, the wizard tries to connect to a SAS Application Server. You are then prompted to select an appropriate database library. SAS Data Integration Studio uses the metadata for the database library to generate a SAS/ACCESS LIBNAME statement, and the statement is submitted to the SAS Application Server for execution.

The SAS/ACCESS LIBNAME statement specifies options that are required to communicate with the relevant database server. The options are specific to the DBMS to which you are connecting. For example, here is a SAS/ACCESS LIBNAME statement that could be used to access an Oracle database:

libname mydb oracle user=admin1 pass=ad1min path='V2o7223.world';

Typically, administrators install, start, and register DBMS servers and register the DBMS libraries.
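To show the shape of such a statement more fully, here is a hedged sketch of a SAS/ACCESS LIBNAME statement for Oracle, followed by a quick way to confirm the assignment. The libref, credentials, and path are placeholder values, not settings from any real environment:

```sas
/* Sketch only: a SAS/ACCESS LIBNAME statement for Oracle.        */
/* The libref (mydb) and the user, password, and path values are  */
/* placeholders; substitute the connection details for your site. */
libname mydb oracle
   user=admin1
   password=ad1min
   path='V2o7223.world';

/* Once the library is assigned, list its members to confirm access. */
proc datasets lib=mydb;
quit;
```

In SAS Data Integration Studio itself you would not write this statement by hand; the software generates it from the registered metadata for the database library.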
SAS Data Integration Studio users are told which library to use.

Enterprise Resource Management Servers

Optional data surveyor wizards can be installed that provide access to the metadata and data from enterprise applications. Applications from vendors such as SAP, Oracle, PeopleSoft, and Siebel are supported. Typically, administrators install, start, and register ERP servers. SAS Data Integration Studio users are told which server metadata to use.

Libraries

In SAS software, a library is a collection of one or more files that are recognized by SAS and that are referenced and stored as a unit. Libraries are critical to SAS Data Integration Studio. You cannot begin to enter metadata for sources, targets, or jobs until the appropriate libraries have been registered in a metadata repository.

Accordingly, one of the first tasks in a SAS Data Integration Studio project is to specify metadata for the libraries that contain sources, targets, or other resources. At some sites, an administrator adds and maintains most of the libraries that are needed, and the administrator tells SAS Data Integration Studio users which libraries to use.

The steps for specifying metadata about a Base SAS library are described in "Registering Any Libraries That You Need" on page 55.

Additional Information

For more information about setting up a data integration environment, administrators should see "Administrative Documentation for SAS Data Integration Studio" on page 12.

Overview of Building a Process Flow

Problem

You want to become familiar with SAS Data Integration Studio, so you decide to create a simple process flow that reads data from a source table, sorts the data, and then writes the sorted data to a target table, as shown in the following figure.

Figure 1.2 Simple Process Flow (diagram: unsorted table → sort process → sorted table)

Solution

Create a job in SAS Data Integration Studio that specifies the desired process flow. Perform the following tasks:

- Connect to a metadata server.
- Register the source table.
- Register the target table.
- Create an empty job.
- Drag and drop the SAS Sort transformation on the job.
- Drag and drop the source table metadata and target table metadata on the job.
- Update the metadata for the tables and the SAS Sort transformation as needed for your environment.
- Execute the job.

It is assumed that administrators have installed, configured, and registered the relevant servers, libraries, and other resources that are required to support SAS Data Integration Studio in your environment.

Tasks

Connect to the Metadata Server

Most servers, data, and other resources on your network are not available to SAS Data Integration Studio until they are registered in a repository on a SAS Metadata Server. Accordingly, when you start SAS Data Integration Studio, you are prompted to select a metadata profile, which specifies a connection to a metadata server. You might have a number of different profiles that connect to different metadata servers at your site. Select the profile that will connect to the metadata server with the metadata that you will need during the current session.

For details about creating a metadata profile, see "Connecting to a Metadata Server" on page 51.

Register Source Tables

Suppose that the source table in the example process flow is an existing SAS table, but the table is not currently registered; that is, metadata about this table has not been saved to the current metadata server. One way to register a table that exists in physical storage is to use a source designer wizard. To display the source designer wizard for a SAS table, select Tools → Source Designer from the menu bar. A selection window displays. Click SAS, and then click OK. The SAS source designer displays.
A source designer wizard enables you to:

- specify the library that contains the table to be registered (typically, this library has been registered ahead of time)
- display a list of tables contained in the selected library
- select one or more tables in that library
- generate and save metadata for the selected tables

For details about using source designers, see "Registering Tables with a Source Designer" on page 85.

Register Target Tables

Suppose that the target table in the example process flow is a new table, one that does not yet exist in physical storage. You could use the Target Table wizard to specify metadata for the table. Later, you can drag and drop this metadata on the target position in a process flow. When the process flow is executed, SAS Data Integration Studio will use the metadata for the target table to create a physical instance of that table.

One way to register a table that does not exist in physical storage is to use the Target Table wizard. To display the Target Table wizard, select Tools → Target Designer from the menu bar. A selection window displays. Click Target Table, and then click OK. The Target Table wizard displays.

The Target Table wizard enables you to specify the physical location, column structure, and other attributes of the target table and save that metadata to the current repository.

For details about using the Target Table wizard, see "Registering Tables with the Target Table Wizard" on page 87.

Create a Job That Specifies the Desired Process Flow

In SAS Data Integration Studio, a process flow is contained in a job. One way to create a job is to use the New Job wizard to create an empty job, then drag and drop metadata for the source tables, the target tables, and the desired transformations onto the empty job, and build the desired process flow. For details about this method, see "Creating an Empty Job" on page 143.
For now, assume that you have used this method to create the process flow shown in the following display.

Display 1.1 Process Flow Diagram for a Job That Sorts Data

Given the direction of the arrows in the previous display:

- ALL_EMP specifies metadata for the source table.
- SAS Sort specifies metadata for the sort process, which writes its output to a temporary output table, Sort Target-W5BU8XGB. (For more information about temporary output tables, see "Manage Temporary and Permanent Tables for Transformations" on page 239.)
- Table Loader specifies metadata for a process that reads the output from the previous step and loads this data into a target table.
- Employees Sorted specifies metadata for the target table.

SAS Data Integration Studio uses the preceding process flow diagram to generate SAS code that reads ALL_EMP, sorts this information, and writes the sorted information to a temporary output table. Then, the information is written to the Employees Sorted table.

Run the Job

One way to execute a SAS Data Integration Studio job is to select Process → Submit from the menu bar. The code is then submitted to a SAS Application Server, which executes the code. If the job is successful, the output of the job is created or updated.

Next Tasks

The output from a job can become the source for another job in SAS Data Integration Studio, or it can be the source for a report or query in another SAS application. Any tables, libraries, or other resources that were registered in order to create the job are also available to other SAS applications that connect to the same metadata repository.

Impact of Change Management

The change management feature adds a few steps to some of the previous tasks.
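The generated program for the sort example described earlier might, in rough outline, resemble the following sketch. This is not the literal code that SAS Data Integration Studio emits; the librefs (srclib, tgtlib), the temporary table name, and the BY column are invented for illustration:

```sas
/* Sketch only: the rough shape of code generated for the sort job. */
/* srclib, tgtlib, the WORK table name, and the BY column "name"    */
/* are placeholders, not names produced by SAS Data Integration     */
/* Studio.                                                          */

/* SAS Sort transformation: read ALL_EMP, write a temporary table.  */
proc sort data=srclib.all_emp out=work.sort_target;
   by name;
run;

/* Table Loader transformation: load the target table from the      */
/* temporary output table.                                          */
data tgtlib.employees_sorted;
   set work.sort_target;
run;
```

You can view the code that is actually generated for a job from within the software, as described in "Displaying the SAS Code for a Job" on page 152.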
For more information, see "Working with Change Management" on page 59.

Advantages of SAS Data Integration

SAS data integration projects have a number of advantages over projects that rely heavily on custom code and multiple tools that are not well integrated.

- SAS data integration reduces development time by enabling the rapid generation of data warehouses, data marts, and data streams.
- It controls the costs of data integration by supporting collaboration, code reuse, and common metadata.
- It increases returns on existing IT investments by providing multi-platform scalability and interoperability.
- It creates process flows that are reusable, easily modified, and have embedded data quality processing. The flows are self-documenting and support data lineage analysis.

Online Help for SAS Data Integration Studio

The online Help describes all windows in SAS Data Integration Studio, and it summarizes the main tasks that you can perform with the software. The Help includes examples for all source designer wizards, all target designer wizards, and all transformations in the Process Library. The Help also includes a What's New topic and a set of Usage Note topics for the current version of the software.

Perform the following steps to display the main Help window for SAS Data Integration Studio.

1. Start SAS Data Integration Studio as described in "Starting SAS Data Integration Studio" on page 50.
2. From the menu bar, select Help → Contents. The main Help window displays.

To display the Help for an active window or tab, click its Help button. If the window or tab does not have a Help button, press the F1 key.

To search for topics about concepts or features that are identified by specific words, such as "application server," display the main Help window. Then, click the Search tab (magnifying glass icon).
Enter the text to be found and press the Enter key.

Administrative Documentation for SAS Data Integration Studio

Many administrative tasks, such as setting up the servers that are used to execute jobs, are performed outside of the SAS Data Integration Studio interface. Such tasks are described in SAS Intelligence Platform documentation, which can be found at the following location: http://support.sas.com/913administration.

The following table identifies the main SAS Intelligence Platform documentation for SAS Data Integration Studio.

Table 1.3 SAS Intelligence Platform Documentation for SAS Data Integration Studio

- SAS Intelligence Platform: System Administration Guide. Administrative tasks: set up metadata servers and metadata repositories.
- SAS Intelligence Platform: Data Administration Guide. Administrative tasks: set up data servers and libraries for common data sources.
- SAS Intelligence Platform: Application Server Administration Guide. Administrative tasks: set up SAS Application Servers; set up grid computing (so that jobs can execute on a grid).
- SAS Intelligence Platform: Desktop Application Administration Guide. Administrative tasks: set up change management; manage operating system privileges on target tables (job outputs); set up servers and libraries for remote data (multi-tier environments); set up security for Custom tree folders; set up a central repository for importing and exporting generated transformations; set up support for message queue jobs; set up support for Web service jobs and other stored process jobs; enable the bulk-loading of data into target tables in a DBMS; set up SAS Data Quality software; set up support for job status handling; set up support for FTP and HTTP access to external files.

Accessibility Features in SAS Data Integration Studio

Accessibility Standards

SAS Data Integration Studio includes features that improve usability of the product for users with disabilities.
These features are related to accessibility standards for electronic information technology that were adopted by the U.S. Government under Section 508 of the U.S. Rehabilitation Act of 1973, as amended. SAS Data Integration Studio supports Section 508 standards except as noted in the following table.

Table 1.4 Accessibility Exceptions

Section 508 accessibility criteria (a): When software is designed to run on a system that has a keyboard, product functions shall be executable from a keyboard where the function itself or the result of performing a function can be discerned textually.
Support status: Supported with exceptions.
Explanation: The software supports keyboard equivalents for all user actions. Tree controls in the user interface can be individually managed and navigated through using the keyboard. However, some exceptions exist. Some ALT key shortcuts are not functional. Also, some more advanced manipulations require a mouse. Still, the basic functionality for displaying trees in the product is accessible from the keyboard.
Based on guidance from the Access Board, keyboard access to drawing tasks does not appear to be required for compliance with Section 508 standards. Accordingly, keyboard access does not appear to be required for the Process Editor tab in the Process Designer window, or the Designer tab in the SQL Join properties window. Specifically, use of the Process Editor tab in the Process Flow Diagram and the Designer tab in the SQL Join Properties window are functions that cannot be discerned textually. Both involve choosing a drawing piece, dragging it into the workspace, and designing a flow. These tasks require a level of control that is provided by a pointing device. Moreover, the same result can be achieved by editing the source code for flows.
Example: Use of the Process Editor tab in the Process Flow Diagram is designed for visual rather than textual manipulation. Therefore, it cannot be operated via keyboard.
If you have difficulty using a mouse, then you can create process flows with user-written source code. See Chapter 12, "Working with User-Written Code," on page 215.

Section 508 accessibility criteria (c): A well-defined on-screen indication of the current focus shall be provided that moves among interactive interface elements as the input focus changes. The focus shall be programmatically exposed so that Assistive Technology can track focus and focus changes.
Support status: Supported with exceptions.
Explanation: In some wizards, when focus is on an element in a wizard pane, rather than a button, focus is not apparent. If an element in the pane is highlighted and focus is moved to a button, the element's appearance is unchanged, so the user might not be certain when focus is on such an item.
Example: When you launch the Target Designer and press the down arrow, you can traverse the Targets tree to select the type of target that you want to design even though no object has visual focus.

Section 508 accessibility criteria (d): Sufficient information about a user interface element including the identity, operation, and state of the element shall be available to Assistive Technology. When an image represents a program element, the information conveyed by the image must also be available in text.
Support status: Supported with exceptions.
Explanation: In some wizards, the identity, operation, and state of some interface elements are ambiguous. SAS currently plans to address this in a future release.
Example: When you select a library in the Source Designer wizard, you must use the SAS Library combo box. If you are using the JAWS screen reader, the reader immediately reads not only the library name but also all of its details. If you want to know the libref, you must know that the label exists and that its shortcut is Alt+F. Then, you must press Alt+F so that the JAWS screen reader will read the label and its read-only text.
You can move among the items in Library Details only after you use a shortcut to get to one of them.

Section 508 accessibility criteria (g): Applications shall not override user-selected contrast and color selections and other individual display attributes.
Support status: Supported with exceptions.
Explanation: When the user sets the operating system settings to high contrast, some attributes of that setting are not inherited.
Example: As with most other Java applications, system font settings are not inherited in the main application window. If you need larger fonts, consider using a screen magnifier.

Section 508 accessibility criteria (l): When electronic forms are used, the form shall allow people using Assistive Technology to access the information, field elements, and functionality required for completion and submission of the form, including all directions and cues.
Support status: Supported with exceptions.
Explanation: When navigating with a keyboard to choose a path in the Browse dialog box, the focus disappears. To work around the problem, either (1) count the number of times you press the TAB key and listen closely to the items, or (2) type the path explicitly.
Example: In some wizards, such as the Source Designer, the visual focus can sometimes disappear when you operate the software with only a keyboard.
If so, continue to press the TAB key until an interface element regains focus.

If you have questions or concerns about the accessibility of SAS products, send e-mail to [email protected].

Enabling Assistive Technologies

For instructions on how to configure SAS Data Integration Studio software so that assistive technologies will work with the application, see the information about downloading the Java Access Bridge in the section about accessibility features in the SAS Intelligence Platform: Desktop Application Administration Guide.

Chapter 2
About the Main Windows and Wizards

Overview of the Main Windows and Wizards 18
Metadata Windows and Wizards 20
  Property Windows for Tables and Other Objects 20
    Property Window for a Table or External File 20
    Property Window for a Job 21
    Transformation Property Window 22
  Windows Used to View Data 22
    View Data Window 22
    View File Window 23
    View Statistics Window 24
  Impact Analysis and Reverse Impact Analysis Windows 24
  Import and Export Metadata Wizards 25
  Source Designer Wizards 25
  Target Table Wizard 26
  SAS OLAP Cube Wizards 26
  Data Surveyor Wizards 27
Job Windows and Wizards 27
  Process Designer Window 27
    Process Editor Tab 28
    Source Editor Tab 29
    Log Tab 29
    Output Tab 29
  Expression Builder Window 29
  Job Status Manager Window 30
  Source Editor Window 31
  New Job Wizard 32
  Transformation Generator Wizard 32
SAS Data Integration Studio Application Windows 32
  Metadata Profile Window 32
  Desktop Window 33
    Overview of the Desktop 33
    Metadata Profile Name 33
    Menu Bar 33
    Toolbar 34
    Default SAS Application Server 34
    User ID and Identity 34
    Shortcut Bar 34
    Metadata Server and Port 34
    Job Status Icon 34
  Options Window 34
Tree View on the Desktop 36
  Tree View 36
  Inventory Tree 36
  Custom Tree 38
  Process Library Tree 39
    Overview of the Process Library 39
    Access Folder 39
    Analysis Folder 40
    Archived Transforms Folder 40
    Control Folder 41
    Data Transforms Folder 41
    Output Folder 42
    Publish Folder 42
    SPD Server Dynamic Cluster Folder 43
    Additional Information about the Process Library Transformations 43
  Project Tree 43
  Quick Properties Pane 43
  Comparison Results Tree 44
  Metadata Tree 44

Overview of the Main Windows and Wizards

The windows and wizards in SAS Data Integration Studio can be divided into categories that facilitate working with metadata, jobs, and the SAS Data Integration Studio application as a whole. The wizards are available from the shortcut bar or from the Tools menu on the SAS Data Integration Studio desktop. The following table lists the metadata windows and wizards.

Table 2.1 Metadata Windows

Property windows for tables and other objects: View and update the metadata for tables or external files, jobs, and transformations.
Windows for viewing data: Display the data in tables or external files. Includes the View Data, View File, and View Statistics windows.
Impact analysis windows: Identify the tables, columns, jobs, and transformations that are affected by a change in a selected table or column; identify the tables, columns, jobs, and transformations that contribute to the content of a selected table or column.
Import/Export Metadata Wizards: Export metadata from and import metadata into SAS Data Integration Studio.
Source Designer Wizards: Register one or more tables that exist in physical storage.
Target Table wizard: Register a new table that will be created when a SAS Data Integration Studio job is executed; register a new table that reuses column metadata from one or more registered tables.
Cube wizards: Create or maintain SAS cubes.
Data surveyors: Extract, search, and navigate data from SAP, Siebel, PeopleSoft, Oracle, and other enterprise application vendors.

For more information about these components, see "Metadata Windows and Wizards" on page 20.
The following table lists the job windows and wizards.

Table 2.2 Job Windows

Process Designer window: Create process flows; generate and submit code for jobs.
Expression Builder: Create SAS expressions that aggregate columns, perform conditional processing, and perform other tasks in SAS Data Integration Studio jobs.
Job Status Manager window: View the status of jobs that have been submitted for execution. You can also cancel, kill, or resubmit jobs.
Source Editor window: Write SAS code and submit it for execution.
New Job wizard: Select one or more tables as the targets (outputs) of a job. Alternatively, you can create an empty job onto which you can drag and drop transformations and tables.
Transformation Generator wizard: Specify user-written SAS code for a generated transformation and save metadata for the transformation to the current repository.

For more information about these components, see "Job Windows and Wizards" on page 27. The following table lists the application windows.

Table 2.3 SAS Data Integration Application Windows

Metadata Profile window: Select or create a metadata profile. You can also connect to the metadata server that is specified in the profile.
Desktop window: Work with the menus, trees, windows, and other objects that comprise the user interface for SAS Data Integration Studio.
Options window: Specify global options for SAS Data Integration Studio.
Tree views: Display and work with metadata in the repositories that are specified in the current metadata profile.
The following trees are possible, depending on configuration: Inventory, Custom, Process Library, Project, Quick Properties, Comparison Results, and Metadata.

For more information about these components, see "SAS Data Integration Studio Application Windows" on page 32.

Metadata Windows and Wizards

Property Windows for Tables and Other Objects

Property Window for a Table or External File

Use the properties window for a table or an external file to view or update the metadata for its columns and other attributes. The following display shows a typical window.

Display 2.1 Table Properties Window

The window shown in the previous display contains the metadata for a typical table. For a summary of how to use this window, see "Viewing or Updating Table Metadata" on page 89.

Property Window for a Job

Use the properties window for a job to view or update its basic metadata. For example, you can specify whether the code for the current job will be generated by SAS Data Integration Studio or will be retrieved from a specified location. You can also use this window to specify code that should be run before or after a job executes. The following display shows a typical window.

Display 2.2 Job Properties Window

If you want to specify user-written code for the job, you can enter metadata that is similar to the metadata that is shown in the previous display. In the job properties window shown in the previous display, the User written option has been selected, and the physical path to a source code file has been specified.

If you want to execute code before or after the Sort Staff job is executed, you can click the Pre and Post Process tab and specify the code.
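A pre-process or post-process entered on that tab is ordinary SAS code. As a hedged sketch only (the libref and path below are hypothetical and are not part of the Sort Staff example), a pre-process might look like this:

```sas
/* Hypothetical pre-process: assign an input library before the job code runs */
libname detail "/data/warehouse/detail";

/* Write a note to the log so that the pre-process is easy to spot */
%put NOTE: Pre-process complete. Library DETAIL assigned.;
```

A post-process could follow the same pattern, for example clearing the libref after the job completes.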
For example, you might want to issue a SAS LIBNAME statement before the job is run. For a summary of how to use the job properties window, see "Viewing or Updating Job Metadata" on page 150.

Transformation Property Window

Use a transformation properties window to view or update the metadata for a process in a job. The metadata for a transformation specifies how SAS Data Integration Studio will generate code for the corresponding process. The window for each kind of transformation has one or more tabs that are unique to the corresponding process. The following display shows a typical window.

Display 2.3 Transformation Properties Window

The window shown in the previous display contains the metadata for the SAS Sort transformation, which is described in Chapter 16, "Working with SAS Sorts," on page 275. Note that the rows in the output table for this transformation will be sorted by employee ID. For a summary of how to use transformation property windows, see "Viewing or Updating the Metadata for Transformations" on page 197.

Windows Used to View Data

View Data Window

The View Data window is available in the tree views on the desktop. It works in two modes, browse and edit. The browse mode enables you to view the data displayed in a SAS table or view, in an external file, in a temporary output table displayed in a process flow diagram, or in a DBMS table or view that is part of a SAS library for DBMS data stores. The table, view, or external file must be registered in a current metadata repository and must exist in physical storage. However, temporary output tables are retained until the Process Designer window is closed or the current server session is ended in some other way (for example, by selecting Process > Kill from the menu bar).

Use the edit mode to perform simple editing operations on the data in the View Data window. For example, you can overwrite the data in a cell, copy and paste rows of data, and delete data.
You can even create completely new tables. However, this editing mode is enabled only on SAS tables that are stored in a BASE engine library and are assigned on a workspace server.

The View Data window typically uses the metadata for a data store to format the data for display. Accordingly, the View Data window can be used to verify that the metadata for a data store is appropriate for use in the intended job. If the window does not correctly display the data in the selected data store, then you might have to update the corresponding metadata before you use it in a job.

The following display shows a typical View Data window.

Display 2.4 View Data Window

The title bar in the View Data window displays the name of the object that is being viewed and the total number of rows. If a column has a description, the description displays in the column heading in the View Data window. Otherwise, the physical name of the column displays in the column heading. A round icon to the left of the name indicates that the column is numeric, and a pyramid-shaped icon to the left of the name indicates that the column contains character data.

To customize the data view displayed in the View Data window, right-click on a column name, row number, or table cell. Then, select an appropriate option from the pop-up menu. To display Help for the View Data window, press F1. See also "Browsing Table Data" on page 109.

View File Window

Use the View File window to display the raw contents of an external file. Unlike the View Data window, the View File window does not use SAS metadata to format the contents of the corresponding external file. The View File option reads the structure of the external file directly and displays the data accordingly. The external file must exist in physical storage.
You cannot use the View File window to view an external file that is accessed with user-written code. For an example of the View File window, see the following display.

Display 2.5 View File Window

View Statistics Window

Use the View Statistics window to see or print statistics for a table. Table statistics might include size, number of rows, number of populated rows, constraints, and other information that is generally available through SAS. An example is shown in the following display:

Display 2.6 View Statistics Window

Note: The preceding example shows only a portion of the screen.

Impact Analysis and Reverse Impact Analysis Windows

Impact analysis identifies the tables, columns, jobs, and transformations that are affected by a change in a selected table or column. Reverse impact analysis identifies the tables, columns, jobs, and transformations that contribute to the content of a selected table or column.
For more information about these windows, see Chapter 14, "Using Impact Analysis," on page 255.

Import and Export Metadata Wizards

SAS Data Integration Studio provides the following wizards for the import and export of SAS metadata (metadata in SAS Open Metadata Architecture format):

Table 2.4 Wizards for Import and Export of SAS Metadata

Export Wizard: Exports SAS metadata to a SAS Package (SPK) file.
Import Wizard: Imports SAS metadata that was exported to a SAS Package (SPK) file.
Transformation Importer: Imports one or more generated transformations that were exported in XML format in SAS Data Integration Studio versions earlier than 3.4.
Job Export and Merge Wizard: Imports a job that was previously exported in XML format in SAS Data Integration Studio versions earlier than 3.4.

SAS Data Integration Studio provides the following wizards for the import and export of other metadata (metadata that is not in SAS Open Metadata Architecture format):

Table 2.5 Wizards for Import and Export of Other Metadata

Metadata Importer: Imports metadata in Common Warehouse Metamodel (CWM) format or in a format that is supported by a SAS Metadata Bridge. You have the option of comparing the imported metadata to existing metadata. You can view any changes in the Differences window and choose which changes to apply.
Metadata Exporter: Exports the default metadata repository that is specified in your metadata profile. You can export metadata in Common Warehouse Metamodel (CWM) format or in a format that is supported by a SAS Metadata Bridge. If you are not exporting in CWM format, then you must license the appropriate bridge.

For more information about metadata import and export, see Chapter 4, "Importing, Exporting, and Copying Metadata," on page 65.

Source Designer Wizards

Use a source designer to register one or more tables or external files that exist in physical storage.
Source designers use a combination of libraries and servers to access the desired tables or external files. Typically, administrators set up the appropriate resources, and SAS Data Integration Studio users simply select the appropriate library, tables, or files in the source designer.

For more information about using source designers, see "Registering Tables with a Source Designer" on page 85. For details about setting up the libraries and servers for source designers, administrators should see the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.

Target Table Wizard

The Target Table wizard enables you to register a table that does not yet exist in physical storage. For example, suppose that you want to create a SAS Data Integration Studio job that sends its output to a new table. You can use the Target Table wizard to register that table so that you can add the table to the process flow for the job.

As you specify the metadata for a new table, the Target Table wizard enables you to select column metadata from one or more tables that are registered in the current repository. For example, suppose that you want to register a new table, Table C, and that Table C will have some of the same columns as two registered tables, Tables A and B. You can use the Target Table wizard to copy the desired column metadata from Tables A and B into the metadata for Table C. For more information about using the Target Table wizard, see "Registering Tables with the Target Table Wizard" on page 87.

SAS OLAP Cube Wizards

A SAS OLAP cube is a logical set of data that is organized and structured in a hierarchical, multidimensional arrangement. It is a data store that supports online analytical processing (OLAP). When you specify a cube, you specify the dimensions and measures for the cube along with information about how aggregations should be created and stored. A cube can be quite complex.
Accordingly, someone who is familiar with OLAP design and the business goals for a particular cube should design the cube before you create it in SAS Data Integration Studio. For more information about maintaining cubes, see the cube chapter in the SAS Intelligence Platform: Data Administration Guide. See also the topic "Maintaining Cubes" in the SAS Data Integration Studio Help.

Here are the main cube wizards in SAS Data Integration Studio.

Table 2.6 Cube Wizards

Cube Designer: Adds and maintains a SAS OLAP cube. For more information, see the Help for this wizard.
Export Cube: Exports a SAS OLAP cube in XML format. For more information, see the Help for this wizard.
Import Cube: Imports a SAS OLAP cube that was exported in XML format. For more information, see the Help for this wizard.
Advanced Aggregation Tuning: Adds, edits, and deletes the aggregations associated with the cubes that are registered to a current metadata repository. For more information, see the Help for this wizard.
Calculated Members: Adds, edits, and deletes the calculated members associated with the cubes that are registered to a current metadata repository. For more information, see the Help for this wizard.

Data Surveyor Wizards

You can install optional data surveyor wizards that enable you to extract, search, and navigate data from SAP, Siebel, PeopleSoft, Oracle, and other enterprise application vendors. For details about setting up the libraries, servers, and client software for Enterprise Resource Planning (ERP) systems, administrators should see the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.

Job Windows and Wizards

Process Designer Window

Use the Process Designer window to perform these tasks:

- Maintain the process flow diagram for the selected job.
- View or update the metadata for sources, targets, and transformations within the selected job.
- View or update the code that is generated for the entire selected job or for a transformation within that job.
- View a log that indicates whether code was successfully generated for the selected job or for one of its transformations (and was successfully executed, if the code was submitted for execution).
- View any output that the selected job or one of its transformations sends to the SAS output window.

The following display shows a typical view of this window.

Display 2.7 Process Designer Window

In the previous display, the Process Designer window contains the process flow diagram for the Sort Staff job that is described in "Creating a Table That Sorts the Contents of a Source" on page 279. Note that the Process Editor tab is shown by default.

The following steps describe one way to open an existing job in the Process Designer window:

1 From the SAS Data Integration Studio desktop, display the Inventory tree.
2 In the Inventory tree, expand the Jobs group.
3 Select the desired job, then select View > View Job from the menu bar. The process flow diagram for the job displays in the Process Editor tab in the Process Designer window.

If the diagram is too large to view in the Process Editor tab, select View > Overview from the menu bar. A small image of the complete process flow diagram displays in the Overview window.

To change the size or the orientation of the process flow diagram, select Process > Zoom or Process > Layout from the menu bar.

The tabs in the Process Designer window are described in the following sections. To display the online Help for each tab, select the tab and press the F1 key.

Process Editor Tab

Use the Process Editor tab to add and maintain a process flow diagram for the selected job.
For an example of how you can use the Process Editor to create a process flow diagram for a job, see "Creating a Process Flow for a Job" on page 144.

Source Editor Tab

Use the Source Editor tab to view or modify SAS code for the selected job. For example, if the Sort Staff job is displayed on the Process Editor tab, and you select the Source Editor tab, code for the entire job is generated and displayed. The following display shows some of the code that would be generated for the Sort Staff job.

Display 2.8 Source Editor Tab

Log Tab

Use the Log tab to view the SAS log that is returned from the code that was submitted for execution. The Log tab can help you identify the cause of problems with the selected job or transformation. For information about how you can use the Log tab, see "Using SAS Logs" on page 249.

Output Tab

Use the Output tab to view any printed output from a SAS program. For example, SAS Data Integration Studio jobs that produce reports can specify that the reports are sent to the Output tab. For an example of a job that sends output to the Output tab, see "Creating and Using a Generated Transformation" on page 226.

Expression Builder Window

Use the Expression Builder to create SAS expressions that aggregate columns, perform conditional processing, and perform other tasks in a SAS Data Integration Studio job. For example, the following display shows an expression that sums the values in a column named Total_Retail_Price.

Display 2.9 Expression Builder Window with a SUM Expression

The Expression Builder is displayed from tabs in the property windows of many SAS Data Integration Studio transformations. It is used to add or update expressions in SAS, SQL, or MDX. The expression can transform columns, provide conditional processing, calculate new values, and assign new values.
The expressions specify the following elements, among others:

- column names
- SAS functions
- constants (fixed values)
- sequences of operands (something to be operated on, such as a column name or a constant) and operators, which form a set of instructions to produce a value

An expression can be as simple as a constant or a column name, or an expression can contain multiple operations connected by logical operators. For example, an expression to define how the values for the column COMMISSION are calculated can be amount * .01. An example of conditional processing to subset data can be amount > 10000 and region = 'NE'. Other examples are an expression to convert a character date into a SAS date or an expression to concatenate columns. For details about SAS expressions, see SAS Language Reference: Concepts in SAS OnlineDoc.

Job Status Manager Window

Use the Job Status Manager window to display the name, status, starting time, ending time, and application server used for all jobs submitted in the current session. Right-click on any row to clear, view, cancel, kill, or resubmit a job.

Note: To debug the SAS code for a job, use the Log tab in the Process Designer window. For more information, see "Troubleshooting a Job" on page 155.

The Job Status Manager window opens in a table that displays all jobs that are submitted in the current session. Each row in the table represents a job, as in the following example.

Display 2.10 Job Status Manager Window

For more information, see "Using the Job Status Manager" on page 159.

Source Editor Window

SAS Data Integration Studio also provides a Source Editor window that you can use as a general-purpose SAS code editor.
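Any valid SAS program can be entered and submitted in this editor. As a hedged sketch (the libref and table name below are hypothetical, not objects from this guide), you might run a quick check against a registered table:

```sas
/* Hypothetical ad hoc check submitted from the Source Editor window */
proc sql;
  select count(*) as row_count
  from staging.orders;
quit;
```

The result appears on the Output tab, and any messages appear on the Log tab.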
The following display shows an example of this window.

Display 2.11 Source Editor Window

To display the Source Editor window, from the SAS Data Integration Studio desktop, select Tools > Source Editor. To submit code from the Source Editor, from the SAS Data Integration Studio desktop, select Editor > Submit. To display Help for this window, press the F1 key.

New Job Wizard

Use the New Job wizard to select one or more tables as the targets (outputs) of a job. This wizard can also be used to create an empty job into which you can drag and drop tables and transformation templates. This is the approach that is described in "Creating an Empty Job" on page 143.

Transformation Generator Wizard

One of the easiest ways to customize SAS Data Integration Studio is to create your own generated transformations. Unlike Java-based plug-ins that require software development, generated transformations are created with a wizard. The Transformation Generator wizard guides you through the steps of specifying SAS code for a generated transformation and saving the transformation to the current metadata repository. After the transformation is saved, it is displayed in the Process Library tree, where it is available for use in any job. For details about using the Transformation Generator wizard, see "Creating and Using a Generated Transformation" on page 226.

SAS Data Integration Studio Application Windows

Metadata Profile Window

A metadata profile is a client-side definition that specifies the location of a metadata server. The definition includes a host name, a port number, and a list of one or more metadata repositories.
In addition, the metadata profile can contain a user's login information and instructions for connecting to the metadata server either automatically or manually.

When you start SAS Data Integration Studio, the Metadata Profile window displays. The following display shows an example of this window.

Display 2.12 Metadata Profile Window

Use the Metadata Profile window to open an existing metadata profile, edit an existing metadata profile, or add a new metadata profile. You must open a metadata profile before you can do any work in SAS Data Integration Studio.

Desktop Window

Overview of the Desktop

After you open a metadata profile, the SAS Data Integration Studio desktop displays.

Display 2.13 SAS Data Integration Studio Desktop

The desktop consists of the following main parts:

1 Metadata profile name
2 Menu bar
3 Toolbar
4 Shortcut bar
5 Tree view
6 Tree tabs
7 Default SAS Application Server
8 User ID and identity
9 Metadata server and port
10 Job status icon

Metadata Profile Name

The title bar in this window shows the name of the metadata profile that is currently in use. For more information about metadata profiles, see "Connecting to a Metadata Server" on page 51.

Menu Bar

Use the menu bar to access the drop-down menus. The list of active options varies according to the current work area and the kind of object that you select. Inactive options are disabled or hidden.

Toolbar

The toolbar contains shortcuts for items on the menu bar. The list of active options varies according to the current work area and the kind of object that you select. Inactive options are disabled or hidden.

Default SAS Application Server

This pane displays the default SAS Application Server, if you have selected one. The message "No Default Application Server" indicates that you need to select a default SAS Application Server. If you double-click this pane, the Default SAS Application Server window opens.
From that window, you can select a default server or test your connection to the default server. Alternatively, you can select the default SAS Application Server from the Options window, as described in "Options Window" on page 34. For more information about the impact of the default SAS Application Server, see "SAS Application Servers" on page 6.

User ID and Identity

This pane displays the domain, login user ID, and metadata identity that are specified in the current metadata profile.

Shortcut Bar

The shortcut bar displays a pane of task icons on the left side of the SAS Data Integration Studio desktop. To display it, select View > Shortcut Bar from the menu bar. Each icon displays a commonly used window, wizard, or a selection window for wizards.

Metadata Server and Port

This pane displays the name and port of the SAS Metadata Server that is specified in the current metadata profile.

Job Status Icon

You can right-click this icon to display a list of the last five unique jobs that were submitted in the current SAS Data Integration Studio session and the status of each of them.

Note: The appearance of the icon changes to indicate changes in the status of the jobs that you submitted in the current session. The icon appearance returns to normal when you right-click the icon.
Options Window

Use the Options window to specify global options for SAS Data Integration Studio. The following steps describe one way to display this window:
1 From the SAS Data Integration Studio desktop, select Tools > Options.
2 Select the tab that contains the options that you want to view or update.

The following display shows an example of this window.

Display 2.14 Options Window

The Options window contains the tabs listed in the following table.

Table 2.7 Options Window Tabs

General: Specifies general user interface options for SAS Data Integration Studio, such as desktop colors or whether to show the SAS Log tab in the Process Designer.
Process: Specifies options for the Process Editor tab in the Process Designer. These options control how process flow diagrams display.
Editor: Specifies options for the appearance and behavior of the Source Editor tab in the Process Designer window.
Metadata Tree: Specifies what types of metadata are displayed in the Metadata tree on the SAS Data Integration Studio desktop.
SAS Server: Specifies the default SAS Application Server for SAS Data Integration Studio.
Data Quality: Specifies options for Data Quality transformation templates that are available in the Process Library tree.
View Data: Specifies options for the View Data window that is available for tables, external files, temporary output tables displayed in process flow diagrams, and selected transformations.
Impact Analysis: Specifies whether impact analysis extends to dependent metadata repositories.
Code Generation: Specifies options for code generation and parallel processing.

Tree View on the Desktop

Tree View

The tree view is the pane on the left side of the SAS Data Integration Studio desktop that displays the contents of the current metadata repositories.
Most of the tabs at the bottom of this pane, such as Inventory and Custom, are used to display different hierarchical lists, or trees, of metadata. To display or hide a tree, use the View option on the menu bar.

Inventory Tree

Use the Inventory tree to select metadata from a default folder for each kind of metadata. For example, you can select table metadata from the folder named Tables, job metadata from the folder named Jobs, and so on. The Inventory tree displays the objects in the default metadata repository, as well as the objects from any repositories on which the default repository depends.

The following table describes the main icons for metadata objects in the Inventory tree.

Table 2.8 Main Icons for Metadata Objects in the Inventory Tree

Repositories: Icon for a SAS Metadata Repository. A metadata repository is a collection of related metadata objects, such as the metadata for a set of tables and columns that are maintained by an application.
Cubes: Metadata for a SAS cube, a logical set of data that is organized and structured in a hierarchical, multidimensional arrangement. See SAS OLAP Cube Wizards on page 26.
Deployed Jobs: Metadata for a SAS Data Integration Studio job that has been deployed for scheduling. See About Job Scheduling on page 170.
Documents: Metadata for a document. Use the Notes tab in the properties window for a job, table, column, or another object to associate a document with that object. Unlike notes, which are plain text, a document can contain graphics as well as text.
External Files: Metadata for an external file. An external file is a file that is created and maintained by a host operating system or by another vendor's software application. A comma-delimited file is one example. See Chapter 6, Working with External Files, on page 119.
Generated Transforms: Metadata for a transformation that is created with the Transformation Generator wizard. The wizard helps you specify SAS code for the transformation. See Creating and Using a Generated Transformation on page 226.
Information Maps: Metadata for an Information Map. Information Maps are created and maintained in SAS Information Map Studio. In SAS Data Integration Studio, you can run impact analysis on Information Maps.
Jobs: Metadata for a SAS Data Integration Studio job. A job is a collection of SAS tasks that create output. See About Jobs on page 142.
Libraries: Metadata for a library. In SAS software, a library is a collection of one or more files that are recognized by SAS and that are referenced and stored as a unit. See Libraries on page 8.
Message Queues: Metadata for a message queue. A message queue is a place where one program can send messages that will be retrieved by another program. See About Message Queues on page 361.
Mining Results: Metadata for the output of a Mining Results transformation.
Notes: Metadata for a note. Use the Notes tab in the properties window for a job, table, column, or another object to associate a note with that object. Unlike documents, which can contain graphics as well as text, notes can contain text only. See Add and Maintain Notes and Documents for a Column on page 103.
OLAP Schema: Metadata for an OLAP schema.
Stored Processes: Metadata for a stored process that was generated from a SAS Data Integration Studio job. Enables users to execute jobs from applications such as SAS Enterprise Guide or a Web Service client. See About SAS Stored Processes on page 177.
Tables: Metadata for a table. See Chapter 5, Working with Tables, on page 83.

The following table lists modifier icons that are used in combination with the main icons for folder objects. A modifier icon indicates that an object is in a certain state or has special attributes.

Table 2.9 Modifier Icons for Objects in the Inventory Tree

Deployed Jobs: A job icon with a clock indicates that the job has been deployed for scheduling. See About Job Scheduling on page 170.
Jobs: A job icon with a blue triangle indicates that the job has been deployed for scheduling or for execution as a stored process. If the job is deployed for scheduling, then the deployed job appears with a clock icon in the Deployed Jobs folder of the Inventory tree, as shown in the previous table. See About Job Scheduling on page 170. If the job is deployed for execution as a stored process, then the deployed job appears with a stored process icon in the Stored Processes folder of the Inventory tree, as shown in the previous table. See About SAS Stored Processes on page 177.
Various: An icon with an ampersand indicates that some attributes of the object, such as its physical path, are specified as variables rather than literal values. For example, parameterized tables and jobs are often used in iterative jobs. See Creating a Parameterized Job on page 335.
Jobs: A job icon with both an ampersand and a blue triangle indicates that the job specifies its inputs, outputs, or both as parameters. It also indicates that the job has been deployed for scheduling or for execution as a stored process.
Various: An icon with a red check mark indicates that this metadata object has been checked out for updating, under change management. Change management enables multiple SAS Data Integration Studio users to work with the same metadata repository at the same time without overwriting each other's changes. See Working with Change Management on page 59.

Custom Tree

Use the Custom tree to group related metadata objects together. For example, you can create a folder named Sales to contain the metadata for libraries, tables, and jobs that contain sales information. The Custom tree displays the objects in the default metadata repository, as well as the objects from any repositories on which the default repository depends. The Custom tree has the same folder structure as the tree for the BI Manager plug-in for SAS Management Console.
In the Custom tree, the main folder of interest is named SAS Data Integration Studio Custom Tree. The other folders (BIP Tree, Integration Technologies, and Samples) contain metadata objects from other applications.

Process Library Tree

Overview of the Process Library

The Process Library tree organizes transformations into a set of folders. You can drag a transformation from the Process Library tree into the Process Editor, where you can populate the drop zones and update the default metadata for the transformation. By updating a transformation with the metadata for actual sources, targets, and transformations, you can quickly create process flow diagrams for common scenarios. The following sections describe the contents of the Process Library folders.

Access Folder

The following table describes the transformations in the Access folder in the Process Library.

Table 2.10 Access Folder Transformations

File Reader: Reads an external file and writes to a target. Added automatically to a process flow when an external file is specified as a source. Executes on the host where the external file resides, as specified in the metadata for the external file.
File Writer: Reads a source and writes to an external file. Added automatically to a process flow when an external file is specified as a target. Executes on the host where the external file resides, as specified in the metadata for the external file.
Library Contents: Creates a control table to use with an iterative job, a job with a control loop in which one or more processes are executed multiple times.
Microsoft Queue Reader: Delivers content from a Microsoft MQ message queue to SAS Data Integration Studio. If the message is being sent into a table, the message queue content is sent to a table or a SAS Data Integration Studio transformation. If the message is being sent to a macro variable or file, then these files or macro variables can be referenced by a later step.
Microsoft Queue Writer: Enables writing files in binary mode, tables, or structured lines of text to the Microsoft MQ messaging system. The queue and queue manager objects necessary to get to the messaging system are defined in SAS Management Console.
SPD Server Table Loader: Reads a source and writes to a SAS SPD Server target. Added automatically to a process flow when a SAS SPD Server table is specified as a source or as a target. Enables you to specify options that are specific to SAS SPD Server tables.
Table Loader: Reads a source table and writes to a target table. Added automatically to a process flow when a table is specified as a source or a target.
Websphere Queue Reader: Delivers content from a WebSphere MQ message queue to SAS Data Integration Studio. If the message is being sent into a table, the message queue content is sent to a table or a SAS Data Integration Studio transformation. If the message is being sent to a macro variable or file, then these files or macro variables can be referenced by a later step.
Websphere Queue Writer: Enables writing files in binary mode, tables, or structured lines of text to the WebSphere MQ messaging system. The queue and queue manager objects necessary to get to the messaging system are defined in SAS Management Console.
XML Writer: Puts data into an XML table. In a SAS Data Integration Studio job, if you want to put data into an XML table, you must use an XML Writer transformation.
You cannot use the Table Loader transformation to load an XML table, for example.

Analysis Folder

The following table describes the transformations in the Analysis folder in the Process Library.

Table 2.11 Analysis Folder Transformations

Correlations: Creates an output table that contains correlation statistics.
Correlations - Report: Creates an HTML report that contains summary correlation statistics.
Distribution Analysis: Creates an output table that contains a distribution analysis.
Distribution Analysis - Report: Creates an HTML report that contains a distribution analysis.
Forecasting: Generates forecasts for sets of time-series or transactional data.
Frequency: Creates an output table that contains frequency information.
Frequency - Report: Creates an HTML report that contains frequency information.
Summary Statistics: Creates an output table that contains summary statistics.
Summary Statistics - Report: Creates an HTML report that contains summary statistics.
Summary Tables - Report: Creates an HTML report that contains summary tables.

Archived Transforms Folder

The following table describes the deprecated and archived transformations in the Archived Transforms folder in the Process Library.

Table 2.12 Archived Transforms Folder Transformations

Fact Table Lookup (archived): Loads source data into a fact table and translates business keys into generated keys.
SQL Join (archived): Selects multiple sets of rows from one or more sources and writes each set of rows to a single target. Typically used to merge two or more sources into one target. Can also be used to merge two or more copies of a single source.
Table Loader (archived): Reads a source table and writes to a target table. Added automatically to a process flow when a table is specified as a source or a target.

Control Folder

The following table describes the transformations in the Control folder in the Process Library.

Table 2.13 Control Folder Transformations

Loop: Marks the beginning of the iterative processing sequence in an iterative job.
Loop End: Marks the end of the iterative processing sequence in an iterative job.
Return Code Check: Provides status-handling logic at a desired point in the process flow diagram for a job. Can be inserted between existing transformations and removed later without affecting the mappings in the original process flow.

Data Transforms Folder

The following table describes the transformations in the Data Transforms folder in the Process Library.

Table 2.14 Data Transforms Folder Transformations

Append: Creates a single target table by combining data from several source tables.
Apply Lookup Standardization: Applies scheme data sets to transform source columns.
Create Match Code: Establishes relationships between source rows using match code analysis or cluster analysis.
Data Transfer: Moves data directly from one machine to another. Direct data transfer is more efficient than the default transfer mechanism.
Data Validation: Cleanses data before it is added to a data warehouse or data mart.
Extract: Selects multiple sets of rows from a source and writes those rows to a target. Typically used to create one subset from a source. Can also be used to create columns in a target that are derived from columns in a source.
Fact Table Lookup: Loads source data into a fact table and translates business keys into generated keys.
Key Effective Date: Enables change tracking in intersection tables.
Lookup: Loads a target with columns taken from a source and from several lookup tables.
Mining Results: Integrates a SAS Enterprise Miner model into a SAS Data Integration Studio data warehouse. Typically used to create target tables from a SAS Enterprise Miner model.
SAS Rank: Ranks one or more numeric column variables in the source and stores the ranks in the target.
SAS Sort: Reads data from a source, sorts it, and writes the sorted data to a target.
SAS Splitter: Selects multiple sets of rows from one source and writes each set of rows to a different target. Typically used to create two or more subsets of a source. Can also be used to create two or more copies of a source.
SCD Type 2 Loader: Loads source data into a dimension table, detects changes between source and target rows, updates change tracking columns, and applies generated key values. This transformation implements slowly changing dimensions.
SQL Join: Selects multiple sets of rows from one or more sources and writes each set of rows to a single target. Typically used to merge two or more sources into one target. Can also be used to merge two or more copies of a single source.
Standardize: Creates an output table that contains data standardized to a particular number.
Surrogate Key Generator: Loads a target, adds generated whole-number values to a surrogate key column, and sorts and saves the source based on the values in the business key column(s).
Transpose: Creates an output table that contains transposed data.
User Written Code: Retrieves a user-written transformation. Can be inserted between existing transformations and removed later without affecting the mappings in the original process flow. Can also be used to document the process flow for the transformation so that you can view and analyze the metadata for a user-written transformation, similarly to how you can for other transformations.

Output Folder

The following table describes the transformations in the Output folder in the Process Library.

Table 2.15 Output Folder Transformations

List Data: Creates an HTML report that contains selected columns from a source table.
See also Add the List Data Transformation to a Process Flow on page 253.

Publish Folder

The following table describes the transformations in the Publish folder in the Process Library.

Table 2.16 Publish Folder Transformations

Publish to Archive: Creates an HTML report and an archive of the report.
Publish to Email: Creates an HTML report and e-mails it to a designated address.
Publish to Queue: Creates an HTML report and publishes it to a queue using MQSeries.

SPD Server Dynamic Cluster Folder

The following table describes the transformations in the SPD Server Dynamic Cluster folder in the Process Library.

Table 2.17 SPD Server Dynamic Cluster Folder Transformations

Create or Add to a Cluster: Creates or updates an SPD Server cluster table.
List Cluster Contents: Lists the contents of an SPD Server cluster table.
Remove Cluster Definition: Deletes an SPD Server cluster table.

Additional Information about the Process Library Transformations

Perform the following steps to display examples that illustrate how the Process Library templates can be used in a SAS Data Integration Studio job:
1 From the SAS Data Integration Studio menu bar, select Help > Contents. The online Help window displays.
2 In the left pane of the Help window, select Examples > Process Library Examples.

Project Tree

Use the Project tree to work with metadata in your Project repository. A Project repository is a temporary work area where metadata can be added or updated before it is checked in to a change-managed repository. For more information, see Working with Change Management on page 59.

Quick Properties Pane

On the desktop, use the Quick Properties pane to display the main attributes of an object that is selected in a tree view.

On the Designer tab of the properties window for the SQL Join transformation, use the Quick Properties pane to display the main attributes of an object that is selected on the Create/Subquery tab.
To display or hide this pane, select or deselect View > SQL Designer Properties from the menu bar. On this Designer tab, you can also use the Quick Properties pane to update the properties for some objects.

Comparison Results Tree

The Comparison Results tree displays the metadata objects that result from change analysis operations. You can select a comparison result object and view the comparison in the Differences window, recompare the specified metadata, and perform other tasks. For more information, see Working with Other Metadata on page 75.

Metadata Tree

Use the Metadata tree to identify the metadata type associated with a particular object, as defined in the SAS Open Metadata Architecture. This can be useful in some situations. For example, after identifying the metadata type associated with an object, a developer can write a program that uses the SAS Open Metadata Interface to read and write the metadata for that kind of object. The Metadata tree displays the objects in the default metadata repository, as well as the objects from any repositories on which the default repository depends.

Part 2: General User Tasks

Chapter 3 Getting Started 47
Chapter 4 Importing, Exporting, and Copying Metadata 65
Chapter 5 Working with Tables 83
Chapter 6 Working with External Files 119
Chapter 7 Creating, Executing, and Updating Jobs 141
Chapter 8 Monitoring Jobs 157
Chapter 9 Deploying Jobs 169
Chapter 10 Working with Transformations 195
Chapter 11 Working with Generated Code 207
Chapter 12 Working with User-Written Code 215
Chapter 13 Optimizing Process Flows 237
Chapter 14 Using Impact Analysis 255

Chapter 3: Getting Started

Required Components for SAS Data Integration Studio 48
Main Tasks for Creating Process Flows 49
Starting SAS Data Integration Studio 50
Connecting to a Metadata Server 51
Reconnecting to a Metadata Server 54
Selecting a Default SAS Application Server 55
Registering Any Libraries That You Need 55
Registering Sources and Targets 56
Working with Change Management 59
Specifying Global Options in SAS Data Integration Studio 63

Required Components for SAS Data Integration Studio

Certain servers, metadata repositories, and other components must be put in place in order for SAS Data Integration Studio to operate properly.
The most important components are listed in the following table.

Table 3.1 Required Components

Data Integration Environment: In most cases, administrators install and configure servers, metadata repositories, and other elements of a SAS data integration environment, and SAS Data Integration Studio users are told which of these resources to use. For more information, see A Basic Data Integration Environment on page 4.
Hot Fixes: At least one hot fix must be applied to the metadata servers in order to fully support SAS Data Integration Studio 3.4. Other hot fixes might be needed to support certain features. For more information, administrators should see the installation instructions for SAS Data Integration Studio 3.4.
SAS Metadata Server Version and Hot Fixes: The improved import/export and copy/paste features in SAS Data Integration Studio 3.4 require a SAS 9.1.3 Metadata Server with Service Pack 4, as well as a hot fix. The hot fix is applied once per metadata server.
SAS Workspace Server Version and Hot Fixes: SAS Workspace Servers execute SAS Data Integration Studio jobs and are also used to access data. A SAS 9.1.3 Workspace Server with Service Pack 3 will support all features in SAS Data Integration Studio 3.4 except for WebSphere message queues. If you will be using WebSphere message queues, administrators should apply the appropriate hot fix to the SAS Workspace Server that will execute jobs that read output from, or send output to, WebSphere message queues.
Metadata Updates for SAS Data Integration Studio: SAS Data Integration Studio 3.4 requires metadata updates that must be applied by SAS Management Console 9.1 with hot fixes for SAS Data Integration Studio 3.4. Administrators should perform the following steps once per metadata server.
Repeat for all metadata servers that SAS Data Integration Studio users will access.
1 Open SAS Management Console as an unrestricted user (such as sasadm).
2 From the desktop, in the Repository selection panel, select the relevant Foundation repository.
3 Select Tools > Update Metadata for SAS Data Integration Studio from the menu bar. The Update Metadata for SAS Data Integration Studio dialog box displays.
4 Click OK to update the metadata for the current metadata server.
Note: It is no longer necessary to restart the metadata server so that the updates can take effect.

Main Tasks for Creating Process Flows

When you create process flows in SAS Data Integration Studio, perform the following main tasks:
1 Start SAS Data Integration Studio. See Starting SAS Data Integration Studio on page 50.
2 Create and open a metadata profile. See Connecting to a Metadata Server on page 51.
3 Select a default SAS Application Server. See Selecting a Default SAS Application Server on page 55.
4 Add metadata for the job's inputs (data sources). See Registering Tables with a Source Designer on page 85.
5 Add metadata for the job's outputs (data targets). See Registering Tables with the Target Table Wizard on page 87.
6 Create a new job and a process flow that will read the appropriate sources, perform the required transformations, and load the target data store with the desired information. See Chapter 7, Creating, Executing, and Updating Jobs, on page 141.
7 Run the job. See Submitting a Job for Immediate Execution on page 147.

The change management feature adds a few steps to some of the previous tasks. For more information, see Working with Change Management on page 59.

Starting SAS Data Integration Studio

Problem

You want to start SAS Data Integration Studio and perhaps specify one or more start options.

Solution

Start SAS Data Integration Studio as you would any other SAS application on a given platform.
You can specify one or more options in the start command or in the etlstudio.ini file.

Tasks

Start Methods

Under Microsoft Windows, you can select Start > Programs > SAS > SAS Data Integration Studio.

You can also start the application from a command line. Navigate to the SAS Data Integration Studio installation directory and issue the etlstudio.exe command. If you do not specify any options, SAS Data Integration Studio uses the parameters specified in the etlstudio.ini file. The following sections contain information about options that you can specify on the command line or add to the etlstudio.ini file.

Specify Java Options

To specify Java options when you start SAS Data Integration Studio, use the -javaopts option and enclose the Java options in single quotation marks. For example, the following command starts SAS Data Integration Studio on Windows and contains Java options that specify the locale as Japanese:

etlstudio -javaopts '-Duser.language=ja -Duser.country=JP'

Specify the Plug-in Location

By default, SAS Data Integration Studio looks for plug-ins in a plugins directory under the directory in which the application was installed. If you are starting SAS Data Integration Studio from another location, you must specify the location of the plug-in directory by using the --pluginsDir option. The syntax of the option is:

etlstudio --pluginsDir <plug-in directory>

Specify the Error Log Location

SAS Data Integration Studio writes error information to a file named errorlog.txt in the working directory. Because each SAS Data Integration Studio session overwrites this log, you might want to specify a different name or location for the log file. Use the following option to change the error logging location:

etlstudio --logfile '<file path and name>'

Specify Message Logging

You can specify the server status messages that are encountered in a SAS Data Integration Studio session by using the --MessageLevel level_value option.
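The start options described in the preceding sections can be combined on a single command line. The following sketch only assembles and prints such a command rather than executing it, because etlstudio is available only inside a SAS installation; the log path and locale values shown are illustrative examples, not defaults:

```shell
# Build an illustrative start command for SAS Data Integration Studio.
# The log path and locale values below are examples only.
DIS_CMD="etlstudio --logfile 'C:/temp/dislog.txt' --MessageLevel FINEST"
DIS_CMD="$DIS_CMD -javaopts '-Duser.language=ja -Duser.country=JP'"
echo "$DIS_CMD"
```

Issued from the SAS Data Integration Studio installation directory, a command like this would write the error log to the specified file, log highly detailed server-connection messages, and start the application with Japanese locale settings.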
Valid values for level_value are listed in the following table.

Table 3.2 Values for level_value

ALL: All messages are logged.
CONFIG: Static configuration messages are logged.
FINE: Basic tracing information is logged.
FINER: More detailed tracing information is logged.
FINEST: Highly detailed tracing information is logged. Specify this option to debug problems with SAS server connections.
INFO: Informational messages are logged.
OFF: No messages are logged.
SEVERE: Messages indicating a severe failure are logged.
WARNING: Messages indicating a potential problem are logged.

Allocate More Memory to SAS Data Integration Studio

There might be a number of reasons to increase the amount of memory for SAS Data Integration Studio. For example, if you click the Log tab or the Output tab after running a job and SAS Data Integration Studio does not respond, you might need to increase the amount of memory allocated to the application.

Locate the subdirectory where the SAS Data Integration Studio executable (etlstudio.exe) is found. There will be a .ini file with the same name as the executable (etlstudio.ini). Edit the .ini file and increase the memory values on the Java invocation. If that does not help, the problem might be server memory or another issue.

Connecting to a Metadata Server

Problem

You want to connect to the metadata server where the metadata for the required servers, data libraries, and tables is stored.

Solution

Create and open a metadata profile that connects to the appropriate metadata server and repository.

A metadata profile is a client-side definition of where the metadata server is located. The definition includes a machine name, a port number, and one or more metadata repositories. In addition, the metadata profile can contain a user's logon information and instructions for connecting to the metadata server automatically. A metadata profile is stored in a file that is local to the SAS Data Integration Studio client.
It is not stored in a metadata repository.

Tasks

Create a Profile for a User, without Change Management

Perform the following steps to create a metadata profile that will enable you to work with a metadata repository that is not under change-management control. Assume that you are not an administrator.
1 Obtain the following information from an administrator:
  - the network name of the metadata server
  - the port number used by the metadata server
  - a logon ID and password for that server
  - the name of the repository that should be selected as the default metadata repository for this profile
2 Start SAS Data Integration Studio. The Open a Metadata Profile window displays.
3 Select Create a new metadata profile. The Metadata Profile wizard displays.
4 Click Next. In the general information window, enter a name for the profile.
5 Click Next. In the Connection Information window, enter a machine address, port, user name, and password that will enable you to connect to the appropriate SAS Metadata Server.
6 Click Next. The wizard attempts to connect to the metadata server. If the connection is successful, the Select Repositories window displays.
7 In the Select Repositories window, select the appropriate repository as the default metadata repository for this profile.
8 Click Finish to exit the Metadata Profile wizard. You are returned to the Open a Metadata Profile window.

Create a Profile for a User, with Change Management

The change management feature in SAS Data Integration Studio enables multiple users to work simultaneously with a shared metadata repository. For more information about this feature, see Working with Change Management on page 59.

Perform the following steps to create a metadata profile that will enable you to work with a metadata repository that is under change-management control.
Assume that you are not an administrator and that administrators have set up change-managed repositories.
1 Obtain the following information from an administrator:
  - the network name of the metadata server
  - the port number used by the metadata server
  - a logon ID and password for that server
  - the name of the Project repository that should be selected as the default metadata repository for this profile. The administrator who created the repository must have specified you as the owner of the repository.
2 Start SAS Data Integration Studio. The Open a Metadata Profile window displays.
3 Select Create a new metadata profile. The Metadata Profile wizard displays.
4 Click Next. In the general information window, enter a name for the profile.
5 Click Next. In the Connection Information window, enter a machine address, port, user name, and password that will enable you to connect to the appropriate SAS Metadata Server.
6 Click Next. The wizard attempts to connect to the metadata server. If the connection is successful, the Select Repositories window displays.
7 In the Select Repositories window, select the appropriate Project repository as the default metadata repository for this profile.
8 Click Finish to exit the Metadata Profile wizard. You are returned to the Open a Metadata Profile window.

Create a Profile for an Administrator, with Change Management

Some tasks, such as deploying a job for scheduling or customizing status handling, cannot be done under change management control. Perform the following steps to create a metadata profile that will grant an administrator special privileges in a change-managed repository:
1 Obtain the following information:
  - the network name of the metadata server
  - the port number used by the metadata server
  - an administrative logon ID and password for that server
  - the name of the change-managed repository that should be selected as the default metadata repository for this profile
2 Start SAS Data Integration Studio.
The Open a Metadata Profile window displays.
3. Select Create a new metadata profile. The Metadata Profile wizard displays.
4. Click Next. In the general information window, enter a name for the profile.
5. Click Next. In the Connection Information window, enter a machine address, port, user name, and password that will enable you to connect to the appropriate SAS Metadata Server.
6. Click Next. The wizard attempts to connect to the metadata server. If the connection is successful, the Select Repositories window displays.
7. In the Select Repositories window, select the change-managed repository as the default metadata repository for this profile.
8. Click Finish to exit the Metadata Profile wizard. You are returned to the Open a Metadata Profile window.

Open a Metadata Profile

Perform the following steps to open a metadata profile:

1. Start SAS Data Integration Studio. The Open a Metadata Profile window displays.
2. Select Open an existing metadata profile. The selected profile is opened in SAS Data Integration Studio.

Another way to open a metadata profile is to start SAS Data Integration Studio, then select File > Open a Metadata Profile from the menu bar.

After you open a metadata profile, the tree views on the desktop will be populated with metadata that is associated with the default repository that is specified in the profile. If you are working under change-management control, you will see a Project tab in the tree view. If you are not working under change management, you will not see a Project tab. For more information about change management, see Working with Change Management on page 59.

Additional Information

Impact of the Default Repository

When you create a metadata profile, you specify a default metadata repository for that profile.
The administrator who creates metadata repositories typically tells SAS Data Integration Studio users which repository to select as the default.

The default metadata repository has two main impacts:

- It determines which metadata objects are available in a SAS Data Integration Studio session. Wizards and the tree views display metadata objects from the default metadata repository and from any repository on which the default repository "depends" (inherits metadata from).
- New metadata objects are always saved to the default metadata repository that is specified in the current profile.

Reconnecting to a Metadata Server

Problem

You are working in SAS Data Integration Studio, and the connection to the metadata server is broken. Any metadata that you have entered but not saved could be lost unless the connection to the metadata server is restored.

Solution

If the connection to the metadata server is broken, a dialog box displays and asks whether you want to attempt reconnection. Click Try Now, and SAS Data Integration Studio will attempt to reconnect to the metadata server.

If the reconnection is successful, you can continue your work. The user credentials from the previous session will be used. If the tree views are not populated with the appropriate metadata, select View > Refresh. If the reconnection is not successful, contact your server administrator.

Selecting a Default SAS Application Server

Problem

You want to work with SAS Data Integration Studio without having to select a server each time that you access data, execute SAS code, or perform other tasks that require a SAS server.

Solution

Use the Tools > Options window to select a default SAS Application Server.

When you select a default SAS Application Server, you are actually selecting a metadata object that can provide access to a number of servers, libraries, schemas, directories, and other resources. An administrator typically creates this object.
The administrator then tells the SAS Data Integration Studio user which object to select as the default server.

Tasks

Select a SAS Application Server

Perform the following steps to select a default SAS Application Server:

1. From the SAS Data Integration Studio menu bar, select Tools > Options to display the Options window.
2. Select the SAS Server tab.
3. On the SAS Server tab, select the desired server from the Server drop-down list. The name of the selected server appears in the Server field.
4. Click Test Connection to test the connection to the SAS Workspace Server(s) that are specified in the metadata for the server. If the connection is successful, go to the next step. If the connection is not successful, contact the administrator who defined the server metadata for additional help.
5. After you have verified the connection to the default SAS Application Server, click OK to save any changes. The server that is specified in the Server field is now the default SAS Application Server.

Note: For more information about the impact of the default SAS Application Server, see Accessing Local and Remote Data on page 148.

Registering Any Libraries That You Need

Problem

You want to register a library so that you can access some tables in that library.

Solution

Use the New Library wizard to register the library.

At some sites, an administrator adds and maintains most of the libraries that are needed, and the administrator tells users which libraries to use. It is possible, however, that SAS Data Integration Studio users will need to register additional libraries.

Tasks

Register a Library

Perform the following steps to register a library:

1. Start SAS Data Integration Studio and open an appropriate metadata profile. The default metadata repository for this profile will store metadata about the library that you will register.
2. From the SAS Data Integration Studio desktop, in the Inventory tree, select the Libraries folder, then select File > New from the menu bar. The New Library wizard displays.
The first page of the wizard enables you to select the kind of library that you want to create.
3. After you have selected the library type, click OK.
4. Enter the rest of the library metadata as prompted by the wizard.

Note: Registering a library does not, in itself, provide access to tables in the library. You must also specify metadata for all tables that you want to access in the library. For more information, see Registering Sources and Targets on page 56. For more information about libraries, see the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.

Registering Sources and Targets

Problem

You want to create a process flow in SAS Data Integration Studio, but the sources and targets are not visible in the tree view.

Solution

Use wizards in SAS Data Integration Studio to register the sources and targets in the desired process flow. For example, suppose that you wanted to create the process flow that is shown in the following display.

Display 3.1 Sample Job with a Source and a Target

Given the direction of the arrows in the process flow:

- ALL_EMP specifies metadata for the source table.
- SAS Sort specifies metadata for the sort process, which writes its output to a temporary output table, Sort Target-W5BU8XGB.
- Table Loader specifies metadata for a process that reads the output from the previous step and loads this data into a target table.
- Employees Sorted specifies metadata for the target table.

Before you can create the previous process flow, you must register the source table (ALL_EMP) and the target table (Employees Sorted).
The tables will then be visible in the tree views, and you can drag and drop them into the process flow.

Tasks

Use Wizards to Register Sources and Targets

Use the methods in the following table to enter metadata for both sources and targets in SAS Data Integration Studio jobs.

Table 3.3 Methods for Specifying Metadata for Data Stores

- A set of tables that are specified in a data model: Import the data model in CWM format or in a format for which you have the appropriate Meta Integration Model Bridge. See Working with Other Metadata on page 75.
- One or more SAS tables or DBMS tables that exist in physical storage: Use a source designer to register the tables. See Registering Tables with a Source Designer on page 85. See also the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.
- A table that is specified in a comma-delimited file or another external file: Use an external file source designer to register the table. See Chapter 6, Working with External Files, on page 119. See also the external file source designer examples in the Help for SAS Data Integration Studio.
- A new table that will be created when a SAS Data Integration Studio job is executed, or a new table that reuses column metadata from one or more registered tables: Use the Target Table wizard to register the table. See Registering Tables with the Target Table Wizard on page 87.
- One or more tables that are specified in an XML file: Various methods can be used. One way is to use the XML source designer. See the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.
- A table that can be accessed with an Open Database Connectivity (ODBC) driver, such as a Microsoft Access table: Use an ODBC source designer to register the table. See the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.
- A Microsoft Excel spreadsheet:
Various methods can be used. One way is to use the Excel source designer. Another way is to use the ODBC source designer. See the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.
- One or more tables that exist in physical storage and that can be accessed with an Open Database Connectivity (ODBC) driver: Use an ODBC source designer to register the tables. See the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.
- A table in a format for which you do not have a source designer: Use the Generic source designer to register the table. See the Generic source designer example in the Help for SAS Data Integration Studio.
- A SAS cube that you want to add and maintain: See the cube topics in the Help for SAS Data Integration Studio.

Register DBMS Tables with Keys

Tables in a database management system often have primary keys, unique keys, and foreign keys.

A primary key is one or more columns that are used to uniquely identify a row in a table. A table can have only one primary key. The column(s) in a primary key cannot contain null values.

A unique key is also one or more columns that can be used to uniquely identify a row in a table. A table can have one or more unique keys. Unlike a primary key, a unique key can contain null values.

A foreign key is one or more columns that are associated with a primary key or unique key in another table. A table might have one or more foreign keys. A foreign key is dependent upon its associated primary or unique key. In other words, a foreign key cannot exist without a primary or unique key.

Note: When you register a DBMS table with foreign keys, you must also register all of the tables that are referenced by the foreign keys if you want to preserve those keys. For example, suppose that Table 1 had foreign keys that referenced primary keys in Table 2 and Table 3.
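These key relationships can be sketched with a small generic relational example. The snippet below uses Python's sqlite3 module as a stand-in for a DBMS; it is illustrative only, not SAS code, and the table and column names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# The referenced table: its primary key uniquely identifies each row.
conn.execute("""
    CREATE TABLE departments (
        dept_id   INTEGER PRIMARY KEY,   -- primary key: unique and non-null
        dept_name TEXT)""")

# A table with a unique key and a foreign key into departments.
conn.execute("""
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        badge   TEXT UNIQUE,             -- unique key: duplicates rejected, NULL allowed
        dept_id INTEGER REFERENCES departments(dept_id))""")

conn.execute("INSERT INTO departments VALUES (10, 'ETL')")
conn.execute("INSERT INTO employees VALUES (1, 'B-100', 10)")  # valid foreign key value
conn.execute("INSERT INTO employees VALUES (2, NULL, 10)")     # NULL is allowed in a unique key

try:
    # A foreign key cannot exist without its associated primary or unique key:
    conn.execute("INSERT INTO employees VALUES (3, 'B-101', 99)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

The last insert fails because the foreign key value 99 has no matching primary key, which is why preserving foreign keys also involves the tables that those keys reference.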
To preserve the foreign keys in Table 1, you could use the Metadata Importer wizard or a source designer wizard to import metadata for Tables 1, 2, and 3.

Working with Change Management

Problem

You want to manage the needs of the users that have access to your metadata. Specifically, you must ensure that users can access the metadata that they need without compromising the data that other users need. This means that you must have a reliable mechanism for adding and deleting metadata. You also must be able to check metadata in and out as needed.

Solution

You can use the change management feature. The change management feature enables multiple users to work simultaneously with a shared metadata repository. It enables teams of developers to build and maintain process flows without overwriting each other's changes.

To implement change management, an administrator creates a metadata repository and puts it under change-management control. Each SAS Data Integration Studio user then creates and opens a metadata profile that enables him or her to work with the change-managed repository.

To add new metadata under change management, you add the metadata as usual. When you are finished working with the new metadata, you check it in to the change-managed repository.

To update existing metadata under change management, you check out the metadata from the change-managed repository. The metadata is locked so that no one else can update it as long as you have it checked out. When you are finished updating the metadata, you check it in to the change-managed repository.
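In effect, the check-out/check-in cycle places an exclusive lock on each metadata object. The following sketch is a loose, non-SAS analogy of that locking behavior; the class and names are invented for illustration and do not correspond to any SAS API:

```python
# Illustrative analogy only: each checked-out object is locked for one user.
class Repository:
    def __init__(self):
        self.locks = {}          # object name -> user who has it checked out

    def check_out(self, obj, user):
        holder = self.locks.get(obj, user)
        if holder != user:
            raise RuntimeError(f"{obj} is checked out by {holder}")
        self.locks[obj] = user   # only this user may now update obj

    def check_in(self, obj, user):
        if self.locks.get(obj) == user:
            del self.locks[obj]  # object is unlocked for other users

repo = Repository()
repo.check_out("ALL_EMP", "dev1")
try:
    repo.check_out("ALL_EMP", "dev2")   # blocked while dev1 holds the checkout
except RuntimeError as exc:
    print(exc)
repo.check_in("ALL_EMP", "dev1")
repo.check_out("ALL_EMP", "dev2")       # succeeds after dev1 checks in
```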
Change management enables you to perform the following tasks:

- set prerequisites for change management
- add new metadata
- check existing metadata out for updates
- check metadata in
- delete metadata
- remove metadata permanently
- clear all metadata from the Project tree
- clear project repositories

Tasks

Review the Prerequisites for Change Management

To implement change management, administrators create one or more change-managed repositories. Then they set up user profiles for all SAS Data Integration Studio users who will be working under change management. For more information, administrators should see the SAS Data Integration Studio chapter in the SAS Intelligence Platform: Desktop Application Administration Guide. Each user must create and open a metadata profile that enables him or her to work with a change-managed repository. For more information, users should see Create a Profile for a User, with Change Management on page 52.

Add New Metadata

You can add metadata about a new library, table, or another resource to a change-managed repository. Perform the following steps:

1. If you have not done so already, open a metadata profile that enables you to work under change management. For more information, see Create a Profile for a User, with Change Management on page 52.
2. Add the metadata as usual. For example, to register a table, you can use the methods described in Registering Sources and Targets on page 56. New metadata objects will appear in your Project tree on the desktop.

Check Out Existing Metadata for Updates

You can update existing metadata for a library, table, or another resource in a change-managed repository. Perform the following steps:

1. If you have not done so already, open a metadata profile that enables you to work under change management. For more information, see Create a Profile for a User, with Change Management on page 52.
2. On the SAS Data Integration Studio desktop, click the Inventory tab or the Custom tab.
The appropriate tree displays.
3. Open the folder for the kind of metadata that you want to check out, such as the Tables folder for tables in the Inventory tree.
4. Right-click the metadata that you want to check out and select Change Management > Check Out. You can also left-click the metadata that you want to check out, then go to the drop-down menu and select Project > Check Out. The metadata is checked out and displays in your Project tree. The Project tree displays the contents of your Project repository.

After you are finished working with the metadata, you can check it in as described in Check Metadata In on page 60.

Note: You must check out the metadata objects to be changed before you can update the metadata. As long as you have the metadata checked out, it is locked so that it cannot be updated by another person. If two or more tables share a common object, such as a primary key, a note, or a document, and you check out one of these tables, only you can check out the other tables that share that common object. (Other users cannot access the common object that you have checked out, and the shared object is required to check out a table that uses that object.)

Check Metadata In

When you have finished adding or updating metadata, you can check it in to your change-managed repository. Perform the following steps:

1. In the Project tree, right-click the Project repository icon and select Check In Repository. You can also left-click the Project repository icon, open the drop-down menu, and select Project > Check In Repository. The Check In window displays.
2. Enter meaningful comments in the Name field (and perhaps in the Description field) about the changes that were made to all of the objects that you are about to check in. The text entered here becomes part of the check-in/check-out history for all objects that you are checking in.
If you do not enter meaningful comments, the check-in/check-out history is less useful.
3. When you are finished entering comments in the Check In window, click OK. All metadata objects that are in the Project repository are checked in.

After check-in, any new or updated metadata that was in your Project tree will now be visible to others who are connected to the same metadata repository. You might find it convenient to work with small sets of related objects in the Project tree. That way, it is easier to track the metadata changes that you are making.

Note: A check-in operation checks in all metadata objects that are in the Project tree. You cannot check in selected objects and leave other objects in the Project tree.

Delete Metadata

You can delete selected metadata objects from your Project tree. This is helpful if you added or updated objects in your temporary work area that you do not want to check in to the change-managed repository.

You can use the Delete option to remove metadata objects that have been checked out or to remove new metadata objects that have never been checked in. New metadata objects are simply deleted. Metadata objects that are checked out are deleted, and the corresponding objects in the change-managed repository are restored to the state that they were in before check-out. Alternatively, you can use the Undo Checkout option to remove metadata objects that have been checked out. It has the same effect as the Delete option.

Perform the following steps to delete selected metadata from your Project tree:

1. In the Project tree, select one or more objects that you want to remove.
2. Right-click the object or objects and select Delete. Alternatively, for checked-out objects only, you can right-click the objects and select Change Management > Undo Checkout. The objects are removed from the Project tree.

Remove Metadata Permanently

You can permanently remove selected metadata objects from the change-managed repository.
This is helpful if you added or updated some metadata objects, checked them in, and then discovered that you do not need them. Perform the following steps:

1. Check out the objects to be removed, as described in Check Out Existing Metadata for Updates on page 60.
2. Select the objects to be removed from the Project tree.
3. Right-click the objects and select Change Management > Destroy.
4. Click Yes in response to the dialog box. The objects are removed from the Project tree.
5. Check in the Project repository, as described in Check Metadata In on page 60. The objects are permanently removed from the change-managed repository.

Note: Metadata objects that are destroyed cannot be recovered except by restoring the change-managed repository from backup.

Clear All Metadata from the Project Tree

You can clear all metadata from your Project tree. You could do this to remove metadata objects that fail to check in due to technical problems. You perform this task with the Clear Project Repository option, which deletes all new objects and unlocks all checked-out objects in the Project tree. When you clear a Project repository, all changes that have not been checked in are lost. Perform the following steps:

1. If you have not done so already, click the Project tab. The Project tree displays.
2. Select Project > Clear Project Repository from the menu bar. A warning displays that any changes to the current objects will be lost if you click OK.
3. Click OK. All new objects are deleted. All checked-out objects are unlocked. All changes that have not been checked in are lost.

Clear Project Repositories That You Do Not Own

An administrator might need to clear all metadata from one or more Project repositories that he or she does not own. This would be helpful in the following situations:

- A user checked out metadata objects but forgot to check them back in before going on a long vacation. In the meantime, other users need to update the checked-out metadata.
- An administrator accidentally deletes a user's Project repository that contains checked-out objects.

You can use the Clear Project Repository option to delete all new objects and unlock all checked-out objects in one or more Project repositories that you select in the Clear Project Repository window. Perform the following steps:

1. Start SAS Data Integration Studio. A window displays with various options for maintaining a metadata profile.
2. Select a metadata profile that enables you to work as an unrestricted user. The default repository specified in this profile must be the parent repository on which the desired Project repositories depend (the Project repositories that you want to clear).
3. On the SAS Data Integration Studio desktop, select the Inventory tab.
4. In the Inventory tree, select the parent repository of the Project repository to be cleared, then select Project > Clear Project Repository from the menu bar. The Clear Project Repository window displays. Unrestricted users see all project repositories that depend on the selected repository.
5. Has the Project repository been deleted? If not, skip to the next step. If so, select Search for deleted project repository information. Unrestricted users see all deleted project repositories that depend on the selected repository.
6. In the Clear Project Repository window, select one or more project repositories to be cleared. Then, click OK. All new objects are deleted. All checked-out objects are unlocked. All changes that have not been checked in are lost.

Note: An unrestricted user (a special user who is specified for a metadata server) can clear all Project repositories that depend on the default repository in his or her metadata profile. For more information about administrative profiles, see Create a Profile for an Administrator, with Change Management on page 53.
For details about the unrestricted user, administrators should see the "How to Designate an Unrestricted, Administrative, or Trusted User" section in the SAS Intelligence Platform: Security Administration Guide.

Additional Information

The Help for SAS Data Integration Studio provides more details about change management. To display the relevant Help topics, perform the following steps:

1. From the SAS Data Integration Studio menu bar, select Help > Contents. The Help window displays.
2. In the left pane of the Help window, select Task Overviews > SAS Data Integration Studio Task Reference > Using Change Management in SAS Data Integration Studio.

Specifying Global Options in SAS Data Integration Studio

Problem

You want to set default options for SAS Data Integration Studio.

Solution

Specify the appropriate start option, or specify an option in the global Options window.

Tasks

Specify Start Options

You can specify one or more options in the start command or in the etlstudio.ini file. See Starting SAS Data Integration Studio on page 50.

Use the Global Options Window

To display the global Options window from the SAS Data Integration Studio desktop, select Tools > Options from the menu bar. From the Options window, you can specify options such as the following:

- the default support for case, special characters, or both in DBMS names
- the default SAS Application Server
- the default display options for the Process Designer window
- options for the code that SAS Data Integration Studio generates for jobs and transformations; parallel processing options are included
- options for Data Quality transformations that are available in the Process Library tree
- options for the View Data window that is available for tables, external files, and selected transformations

Chapter 4: Importing, Exporting, and Copying Metadata

About Metadata Management
Working with SAS Metadata
  About Importing and Exporting SAS Metadata
  SAS Metadata That Can Be Imported and Exported
  Automatic Content
  Optional Content
  Restoration of Metadata Associations
  Optional Promotion of Access Controls
  Logging after Import or Export
  Best Practices for Importing or Exporting SAS Metadata
  Preparing to Import or Export SAS Metadata
Exporting SAS Metadata
  Problem, Solution, Tasks
    Document the Metadata That Will Be Exported (optional)
    Export Selected Metadata
Importing SAS Metadata
  Problem, Solution, Tasks
    Identify the Metadata That Should Be Imported (optional)
    Import the SAS Package File
Copying and Pasting SAS Metadata
  Problem, Solution, Tasks
    Copy
    Paste
    Paste Special
Working with Other Metadata
  About Exporting and Importing Other Metadata
  Other Metadata That Can Be Imported and Exported
  Usage Notes for Importing or Exporting Other Metadata
  Preparing to Import or Export Other Metadata
Importing As New Metadata (without Change Analysis)
  Problem, Solution, Tasks
    Import As New Metadata (No Change Analysis)
Importing Metadata with Change Analysis
  Problem, Solution, Tasks
    Compare Import Metadata to Repository
    View the Comparison
    Select and Apply Changes
    Applying Changes to Tables with Foreign Keys
    Restoring Metadata for Foreign Keys
    Deleting an Invalid Change Analysis Result
Exporting Other Metadata
  Problem, Solution, Tasks
    Export the Default Metadata Repository

About Metadata Management

This chapter describes how to export, import, copy, and paste selected metadata objects in SAS Data Integration Studio. The metadata can be in SAS Open Metadata Architecture format or in other formats.

The online Help for SAS Data Integration Studio has additional information about the import and export of SAS metadata and other metadata.
To display the relevant topics, from the desktop select Help > Contents, then select Task Overviews > SAS Data Integration Studio Task Reference > Importing and Exporting Metadata.

For a more comprehensive view of metadata management, administrators should see the metadata management chapters in the SAS Intelligence Platform: System Administration Guide.

Working with SAS Metadata

About Importing and Exporting SAS Metadata

SAS Data Integration Studio enables you to import and export metadata in SAS Open Metadata Architecture format. This feature enables you to reuse the metadata for tables, jobs, and other objects. For example, you can develop a job in a test environment, export it, and then import the job into a production environment.

From the desktop, you can select the object or objects to be exported and export them. The metadata is saved to a SAS Package file. You then import the package in SAS Data Integration Studio and save it to the same metadata repository or to a different metadata repository. The source and target repository can be located on the same host machine or on different host machines.

The source metadata server and the target metadata server can run on host machines that have different operating systems. For example, you can import objects into a metadata repository on a UNIX system even though those objects were exported from a Microsoft Windows system.

The metadata Export Wizard and Import Wizard enable you to perform the following tasks:

- export the metadata for one or more selected objects in the Inventory tree, Custom tree, or Project tree
- export the metadata for all objects in one or more selected folders in the Custom tree
- export access controls that are associated with exported objects (optional)
- export data, dependent metadata, and other content that is associated with exported objects (optional)
- change object names, port numbers, host machine names, schema names, physical paths, and other attributes when you import metadata (optional). For example, you can export the metadata for a SAS table, and upon import, change the metadata so that it specifies a DBMS table in the target environment.

SAS Metadata That Can Be Imported and Exported

You can import and export metadata for the following types of objects in SAS Data Integration Studio:

- libraries
- physical tables
- external files
- jobs
- generated transformations
- output from a Mining Results transformation
- stored processes (including jobs deployed for execution by a SAS stored process server or a Web service client)
- information maps
- notes
- documents
- folders containing the previous objects, including the folder for a repository

Note: The metadata Export Wizard is optimized for selective metadata export, not the export of whole repositories. The Export Wizard will not export the metadata for servers, database schemas, or users, for example. For more information about the export and import of whole metadata repositories, administrators should see the metadata management chapters in the SAS Intelligence Platform: System Administration Guide, especially the "Creating, Moving, Copying, and Deleting Metadata Repositories" chapter.

Automatic Content

When you select metadata objects for export, some content is automatically moved with them. The following table describes the content that is automatically moved with some objects.

Table 4.1 Automatic Content

- Job:
  - attributes (Name, Description, Roles, Extended Attributes, etc.)
  - documents and notes
  - parameters
  - Automap setting for transformations
  - Message Queue objects
  - custom code associated with a job, a transformation, a pre-process, a post-process, or a custom action code (status-handling action code)
- Physical table:
  - attributes (Name, Description, Roles, Extended Attributes [including Web Stream flags], etc.)
  - documents and notes
  - parameters
  - primary keys and unique keys, but not foreign keys
- External file:
  - attributes (Name, Description, Roles, Extended Attributes [including Web Stream flags], etc.)
  - documents and notes
  - parameters
  - primary keys and unique keys, but not foreign keys
  - external format files
  - COBOL copybooks
- Generated transformation:
  - attributes (Name, Description, Roles, Extended Attributes [including Web Stream flags], etc.)
  - documents and notes
  - default settings for option values

Keep the following in mind about the content that is automatically moved with some objects:

- When importing a job, or when using the Paste Special option to paste a job, you can specify where any custom code for the job will be written on the target system.
- When importing an external file, or when using the Paste Special option to paste an external file, you can specify that the external file and any format files (external format files or COBOL copybooks) are written to separate locations on the target system.

Optional Content

When you select metadata objects for export, you have the option to move some of the content. Later, you can decide whether you want to import this optional content.
The following table describes the optional content that can be moved with some objects.

Table 4.2 Optional Content

- Job:
  - metadata for any libraries, tables, external files, notes, and documents in the job
  - metadata for any generated transformations in the job
  - metadata for the output of a Mining Results transformation in the job
  - metadata for any job nested within the exported job
- SAS library or DBMS library:
  - metadata for the tables in the library
- SAS XML library:
  - the XML file that is specified in the library metadata
  - any XML Map file that is specified in the library metadata
- SAS data set:
  - the physical SAS data set
  - SAS constraints: check, not null, primary keys and unique keys, but not foreign keys
- DBMS table:
  - the physical DBMS table
  - DBMS constraints: none
- Slowly changing dimension (SCD) table:
  - the physical SCD table
  - any physical cross-reference table that is specified in the SCD table metadata
- External file:
  - the physical external file

Keep the following in mind about the optional content that can be moved with some objects:

- Metadata for DBMS table constraints is not exported or imported.
- Avoid using the metadata Export and Import Wizards to move large data tables along with the metadata for those tables. All exported content must be packaged into a SAS Package file, and it is not efficient to add large data tables to the package.
- Foreign keys, user-defined column formats and informats, and ODS tag sets associated with a SAS XML library are not handled by the metadata Export and Import Wizards and must be moved manually.
- The best way to export metadata for XML tables is to export the metadata for the parent SAS XML library.

Restoration of Metadata Associations

Some metadata objects depend on other metadata objects in order to work properly. For example, the metadata for a table must be associated with the metadata for a library, or the table will not be accessible in SAS Data Integration Studio.

If you import metadata objects that depend on other metadata, the Import Wizard will prompt you to associate the imported objects with appropriate metadata in the target environment. Depending on the object that you are importing, the Import Wizard might prompt you to restore an association between the imported object and the following metadata in the target environment:

- a SAS Application Server
- a WebDAV server
- base path on a WebDAV server
- libraries
- tables
- external files
- physical locations for external files or for libraries
- OLAP schemas
- output from a Mining Results transformation
- a source code repository and an archive path (for stored processes)

The Import Wizard enables you to select a default match, if any, between the imported metadata and metadata that exists in the target environment. The next table describes the criteria for a default match.

Table 4.3 Criteria for a Default Match

folder: Parent Folder.Metadata Name for the folder
job: Folder.Metadata Name for the job
library: SAS Application Server.Metadata Name for the library
physical table: Library.Metadata Name for the table
external file: Folder.Metadata Name for the file
generated transformation: Unique ID for the transformation (each generated transformation has a unique ID)

You can restore the same metadata associations that were specified in the source environment, or you can specify different associations. The Import Wizard also enables you to select None (no matching object in the target environment).
You might want to select None if there is no appropriate match at the time of import, and you intend to supply the missing metadata association later.

Optional Promotion of Access Controls

You can include direct assignments of access control entries (ACEs) and access control templates (ACTs) in the promotion process. The following limitations apply to the promotion of access controls:

- The wizards cannot promote ACTs, but they can promote direct associations between ACTs and objects. In order for an object's ACT association to be promoted, an ACT of the same name must exist in the destination metadata repository.
- Access control entries and ACT associations that are inherited from a folder are not promoted. For inheritance to take effect in the destination metadata repository, you must also promote the folder that contains the direct access control entries or ACT association.
- Access control entries are not promoted if they refer to users or groups that do not exist in the destination metadata repository (or in the parent Foundation repository, if the destination is a dependent repository).

If you do not specify that you want to include access controls in the promotion process, then access controls are applied as follows:

- If you import an object that already exists in the destination metadata repository, then the permissions that have been applied to the existing object are preserved when you overwrite the object.
- If you import an object for the first time, then the object inherits permissions from the folder into which the object is imported.

Logging after Import or Export

The metadata Import and Export Wizards create a log for the current session. The log can be viewed from the Finish panel of the wizards.
The export log is included in the SAS package file upon export.

On Windows systems, metadata export and import logs are written to the user's folder at a location such as the following: Documents and Settings\user name\My Documents. The log file is named Import_datetime.log or Export_datetime.log.

Best Practices for Importing or Exporting SAS Metadata

If possible, keep sets of related jobs, tables, and other objects under the same parent folder in the Custom tree. Exporting a parent folder structure provides a well-defined container for changes to be exported.

CAUTION: If an import operation will result in the update of existing objects in the target repository, make sure that the update is necessary. Importing over an existing object, such as a table, overwrites dependent metadata objects with new IDs, which can affect the column mappings in existing jobs. If no changes are being made to existing objects, you should not include them in the import process. The Import Wizard enables you to select only those objects that you want to import.

If you have to move a large volume of metadata from one repository to another, such as 100 jobs and their related tables, consider using full metadata repository promotion instead of the Export Wizard and Import Wizard in SAS Data Integration Studio.
For more information about the export and import of whole metadata repositories, administrators should see the metadata management chapters in the SAS Intelligence Platform: System Administration Guide, especially the 'Creating, Moving, Copying, and Deleting Metadata Repositories' chapter.

For more information about best practices, see the technical white paper, Best Practices for SAS®9 Metadata Server Change Control, at the following location: http://support.sas.com/documentation/whitepaper/technical.

Preparing to Import or Export SAS Metadata

Some metadata import and export features require the latest metadata server, as described in 'Required Components for SAS Data Integration Studio' on page 48.

Metadata import tasks require planning and preparation of the target environment. Typical preparation tasks include the following:

- Verify that the appropriate servers are running in the source and target environments.
- Set up Custom tree folders in the source repository, so the objects can be exported from a convenient folder structure.
- Set up Custom tree folders in the target repository, so the objects can be imported into a convenient folder structure.
- Set up security for folders and objects in the target environment.
- For stored processes, set up a source code repository in an archive path (if needed) in the target environment.

For more information about preparing to use the metadata Import and Export Wizards, administrators should see the 'Promoting Individual BI Objects' chapter in the SAS Intelligence Platform: System Administration Guide.

Exporting SAS Metadata

Problem

You want to export selected metadata objects from SAS Data Integration Studio so that you can import them later.

Solution

Use the Export Wizard to export the metadata. You can then import the package in SAS Data Integration Studio and save it to the same metadata repository or to a different metadata repository.
The source and target repository can be located on the same host machine or on different host machines.

Tasks

Document the Metadata That Will Be Exported (optional)

Metadata export and import tasks will be easier to manage if you create a document that describes the metadata to be exported, the metadata that should be imported, and the main metadata associations that must be reestablished in the target environment. Otherwise, you might have to guess about these issues when you are using the Export Wizard and the Import Wizard.

Export Selected Metadata

Perform the following steps:

1. Start SAS Data Integration Studio. A window displays that has various options for maintaining a metadata profile.
2. Select a profile that enables you to connect to the metadata repository with the objects to be exported.
3. In a tree view, right-click the selected objects and select Export from the pop-up menu. The Export Wizard displays. Alternatively, you can left-click the objects to be exported and select File > Export from the menu bar.
4. In the first page of the wizard, specify a path and name for the export package or accept the default. When finished, click Next.
5. Review the list of objects that you have selected for export. Deselect the check box for any objects that you do not want to export.
6. If desired, click an object, and then click the Options tab to view its options. For example, you can click the Options tab to specify whether or not you want to export content with the object. When you are finished, click Next.
7. Review the metadata to be exported. When finished, click Export. The metadata is exported to a SAS package file. A status page displays, indicating whether the export was successful. A log with a datetime stamp is saved to your user directory.
8. If desired, click View Log to view a log of the export operation.
When you are finished, click Finish.

Importing SAS Metadata

Problem

You want to import metadata into SAS Data Integration Studio that was exported with the Export Wizard.

Solution

Use the Import Wizard to import the SAS package file that contains the metadata. The package can be saved to the same metadata repository or to a different metadata repository. The source and target repository can be located on the same host machine or on different host machines.

Tasks

Identify the Metadata That Should Be Imported (optional)

It will be easier to import metadata if you have a document that describes the metadata that was exported, the metadata that should be imported, and the main metadata associations that must be reestablished in the target environment.

For example, suppose that a SAS Data Integration Studio job was exported. When you import the job, the Import Wizard will prompt you to associate tables in the job with libraries in the target environment. If appropriate libraries do not exist, you might have to cancel the wizard, register appropriate libraries, and run the wizard again. However, if the library requirements are known and addressed ahead of time, you can simply import the tables and specify an appropriate library in the target environment.

Import the SAS Package File

Perform the following steps:

1. Start SAS Data Integration Studio. A window displays that has various options for maintaining a metadata profile.
2. Select a profile that enables you to connect to the metadata repository into which you will import the exported metadata.
3. In a tree view, right-click the icon for the target repository (or a folder in the target repository) and select Import from the pop-up menu. The Import Wizard displays. Alternatively, you can left-click the icon for the target repository (or a folder in the target repository) and select File > Import from the menu bar.
4. In the first page of the wizard, select the package to be imported.
Select the option to import all objects in the package or just the new objects (objects that do not exist in the target repository). When finished, click Next.
5. Review the list of objects that you have selected for import. Deselect the check box for any objects that you do not want to import.
6. If desired, click an object, and then click the Options tab to view its options. For example, you can click the Options tab to specify whether or not you want to import content, if content was exported with the object. When finished, click Next.
7. Review the metadata associations to be restored. For example, if you are importing a table, you will be prompted to specify a library for that table. Click Next to begin restoring the required associations.
8. You will be presented with one or more pages that prompt you to associate the imported object with other objects (such as libraries and servers). When finished, click Next.
9. Review the metadata to be imported. When finished, click Import. The metadata is imported. A status page displays, indicating whether the import was successful. A log with a datetime stamp is saved to your user directory.
10. If desired, click View Log to view a log of the import operation. When finished, click Finish.

Copying and Pasting SAS Metadata

Problem

You want to create a metadata object that is similar to another metadata object in a SAS Data Integration Studio tree view.

Solution

Use the Copy and Paste menu options to create a copy of the object, then modify the copy as desired.
As an alternative to Paste, you can use Paste Special to display the Import Wizard, which enables you to select which attributes are copied and to change some attributes in the pasted object.

Tasks

Copy

To copy an object in a tree view, right-click the object and select Copy from the pop-up menu.

Paste

To paste a copy in which almost all of the attributes are the same as the original, right-click the icon for the target repository, then select Paste from the pop-up menu. If you are working under change management, the new object will appear in the Project tree. To prevent duplication, the new object will be named 'Copy of object_name.'

To paste a copy into a different Custom tree folder, right-click the target folder and select Paste from the pop-up menu. If you are working under change management, the new object will appear in the Project tree. The new object will have the same name as the original, but if you check the properties of the new object, you will see that it is associated with the folder that you selected. (Duplicate names in different Custom tree folders are valid.)

Paste Special

After using the Copy option on a selected object, you can display a wizard that enables you to select which attributes are copied and to change some attributes in the pasted object. Right-click a target folder in the Custom tree, then select Paste Special from the pop-up menu. The Import Wizard will be displayed. The general steps for using this wizard are described in 'Importing SAS Metadata' on page 73.

Working with Other Metadata

About Exporting and Importing Other Metadata

SAS Data Integration Studio supports the SAS Open Metadata Architecture, but you can import metadata in other formats, and you can export SAS metadata to other formats. Supported metadata formats include Common Warehouse Metamodel (CWM) format or a format that is supported by a SAS Metadata Bridge.
For example, a modeling application can be used to create a data model that specifies a set of tables, columns, indexes, and keys. The model can be exported in CWM format, and you can import it into SAS Data Integration Studio.

The Metadata Importer option enables you to perform the following tasks:

- Import relational metadata in CWM format or a format that is supported by a SAS Metadata Bridge.
- Perform change analysis on the imported metadata (compare imported metadata to existing metadata).
- View any changes in the Differences window.
- Run impact analysis or reverse impact analysis on tables and columns in the Differences window, to help you understand the impact of a given change on the target environment.
- Choose which changes to apply to the target environment.

The Metadata Exporter option enables you to export metadata in the default metadata repository that is specified in your metadata profile, in CWM format or a format that is supported by a SAS Metadata Bridge.

Other Metadata That Can Be Imported and Exported

You can import and export relational metadata in CWM format or in a format accessible with a SAS Metadata Bridge. Relational metadata includes the metadata for the following objects:

- tables
- columns
- indexes
- keys (including primary keys and foreign keys)

Usage Notes for Importing or Exporting Other Metadata

- Avoid importing the same DBMS data model as new multiple times. A DBMS data model specifies a library and schema for each table in the model. If you use the Metadata Import Wizard to import the same DBMS data model multiple times into the same metadata repository, and you use the Import as new metadata option each time, you will create duplicate metadata for schemas and libraries in your repository. Duplicate schema names can cause unpredictable results when you attempt to compare the newest imported metadata from the model with existing metadata.
- You cannot run change analysis on metadata that is imported from z/OS systems.
- If you are working under change management, it is a good practice to check in the comparison result metadata before viewing or applying the results.
- When imported metadata is compared to existing metadata, the differences between the two are stored in a comparison result library. In the current release, the comparison result library cannot be a SAS/SHARE library. Accordingly, in an environment where two or more people will be performing change analysis on imported metadata, take care to avoid contention over the same comparison result library. For example, each user can create his or her own comparison result library.
- To avoid problems that arise when character sets from different locales are combined in the same comparison result library, create one or more comparison result libraries for each locale.
- SAS uses the combination of a DBMS server, a DBMS library, and a schema to access a DBMS table. When you select a DBMS server in the Metadata Import Wizard, it is assumed that the metadata for the corresponding library and schema exists in a current metadata repository. If it does not exist, you will receive an error when you try to import the metadata, indicating that a required schema is missing.
- If you are working under change management, empty your Project tree of any metadata before importing more metadata with the Metadata Importer. This will make it easier to manage the imported metadata from a particular session. If you want to save any metadata in the Project tree, check in the repository. Then select Project > Clear Repository.
- The Metadata Import Wizard enables you to select a metadata file that is local or remote to SAS Data Integration Studio. Remote support is provided for Windows and UNIX hosts only.
- Under change management, imported metadata is compared to checked-in metadata.
Accordingly, any metadata in the Project repository that has not been checked in will not be included in the comparison. If you mistakenly run a comparison before the appropriate metadata has been checked in, you can check in the contents of the Project repository and then select Comparison > Recompare from the menu bar.
- Null SAS formats that show as differences in change analysis will, when applied, overwrite user-defined SAS formats in a metadata repository. Be careful when you apply formats during change analysis.

Preparing to Import or Export Other Metadata

In SAS Data Integration Studio, no preparation is required in order to import or export metadata in CWM format. To import or export metadata in a format that is accessible with a SAS Metadata Bridge, you must license the appropriate SAS Metadata Bridge. For more information, contact your SAS representative.

Importing As New Metadata (without Change Analysis)

Problem

You want to import a data model for a set of new tables. The model is in CWM format or a format that is accessible with a SAS Metadata Bridge. You are certain that none of the imported metadata is a duplicate of existing metadata.

Solution

Use the Import as new metadata option in the Metadata Import Wizard.

The Import as new metadata option specifies that metadata in the selected file will be imported without comparing it to existing metadata. This eliminates some steps, but it also means that you can import duplicate metadata. Under change management, the imported metadata appears in your Project tree, where you can review it before checking it in. Without change management, all metadata in the selected file will be registered to the default repository in the current profile.

Tasks

Import As New Metadata (No Change Analysis)

Perform these steps to use the Import as new metadata option in the Metadata Import Wizard:

1. From the desktop, select Tools > Metadata Importer from the menu bar.
The Metadata Import Wizard displays.
2. From the Metadata Import Wizard, select the format of the file that you want to import, such as CWM Import. Click Next.
3. From the File Location window, specify a path to the file that contains the metadata to be imported, such as OrionStar.xml. (The path must be accessible to the default SAS Application Server for SAS Data Integration Studio or to another SAS Application Server that you select, using the Advanced button.) Click Next.
4. From the Import Selection window, select Import as new metadata and click Next.
5. From the Metadata Location window, specify a single SAS library or DBMS server for all of the table metadata that is being imported from the selected file. In this case, select Library A. Click Next.
6. In the Folder Selection window, select a Custom tree folder for any new table metadata that is being imported. (Updated table metadata will keep any current Custom tree folder that is specified for it.)
7. In the Finish window, review the metadata and click Finish when you are finished. All tables that are specified in the imported metadata are registered to the default repository. The imported tables might not appear in the Project tree until you refresh the tree.
8. Right-click the Project tree and select Refresh from the pop-up menu. The metadata for the imported tables appears in the Project tree.

Importing Metadata with Change Analysis

Problem

You want to import a data model for a set of tables. The model is in CWM format or a format that is accessible with a SAS Metadata Bridge. It is possible that some of the imported metadata is a duplicate of existing metadata.

Solution

Use the Compare import metadata to repository option in the Metadata Import Wizard.

The Compare import metadata to repository option specifies that metadata in the selected file will be imported and compared to existing metadata. Differences in tables, columns, indexes, and keys are detected.
Under change management, imported metadata is compared to checked-in metadata that is associated with the library or DBMS server that you selected in the wizard. Without change management, imported metadata is compared to the metadata in the default repository that is associated with the selected library or DBMS server. Differences will be stored in a comparison result library. You can view the changes in the Differences window.

Tasks

Compare Import Metadata to Repository

Perform these steps to use the Compare import metadata to repository option in the Metadata Import Wizard:

1. From the SAS Data Integration Studio desktop, select Tools > Metadata Importer. The Metadata Import Wizard displays.
2. From the Metadata Import Wizard, select the format of the file that you want to import, such as CWM Import. Click Next.
3. From the File Location window, specify a path to the file that contains the metadata to be imported. (The path must be accessible to the default SAS Application Server for SAS Data Integration Studio or to another SAS Application Server that you select, using the Advanced button.) Click Next.
4. From the Import Selection window, select Compare import metadata to repository. The Comparison result library field becomes active.
5. In the Comparison result library field, select a comparison result library. You can change the default options for the comparison by clicking the Advanced button to display the Advanced Comparison Options window. Click Next.
6. From the Metadata Location window, specify a SAS library or DBMS server for all of the table metadata that is being imported from the selected file. Click Next.
7. In the Folder Selection window, select a Custom tree folder for any new table metadata that is being imported. (Updated table metadata will keep any current Custom tree folder that is specified for it.) Click Next.
8. In the Finish window, review the metadata and click Finish when you are finished.
The imported metadata will be compared to checked-in metadata. If the comparison is successful, a dialog box displays, indicating that the comparison has been completed.
9. You are given the option of viewing a log of the comparison process or closing the dialog box. The Comparison Results tree displays a comparison results object that is named after the imported file.
10. If you are working under change management, it is a good practice to check in the comparison result metadata before viewing or applying the results. From the Project tree, right-click the Project repository icon and select Check In Repository.

View the Comparison

Perform the following steps to view the results of an import metadata comparison:

1. In the Comparison Results tree, select the comparison results object. The comparison results object is named after the imported file, and it has an XML extension. The Comparison menu displays on the menu bar.
2. From the menu bar, select Comparison > View Differences. The Differences window displays.

The Differences window is divided into two panes: Import Metadata and Repository Metadata. The Import Metadata pane displays metadata that is being imported. Under change management, the Repository Metadata pane displays any matching metadata in the change-managed repository. Only checked-in metadata displays. Without change management, the Repository Metadata pane displays any matching metadata in the default repository.

Any items that are new, changed, or deleted in the imported metadata are indicated with icons and displayed in a special background color.

Select and Apply Changes

Perform the following steps to select certain imported metadata and apply it to the current metadata repository:

1. Use the check boxes in the Import Metadata pane to select which tables and columns to import.
2. Click Apply. The Apply Confirmation window displays.
3. Click OK to accept the changes.
When the new metadata has been registered to the default repository, the Differences window will be empty.

Applying Changes to Tables with Foreign Keys

When you import metadata about a set of tables that are related by primary keys or foreign keys, and the keys have been either added or updated in the imported metadata, do one of the following:

- apply all changes in the imported metadata, or
- apply selective changes, making sure to select all tables that are related by primary keys or foreign keys.

Otherwise, the key relationships will not be preserved.

Restoring Metadata for Foreign Keys

When you apply changes from imported metadata, a warning message is displayed if foreign key metadata is about to be lost. At that time, you can cancel or continue the apply operation. However, if you accidentally lose foreign key metadata as a result of an apply operation, it is possible to restore this metadata.

Assuming that the imported metadata correctly specifies the primary keys or foreign keys for a set of tables, you can recompare the imported metadata to the metadata in the repository. In the Comparison Results window, select the icon for the appropriate comparison result and then select Comparison > Recompare from the menu bar. In the Differences window, accept all changes, or select the primary key table and all related foreign key tables together and apply changes to them.

After you import the metadata for a table, you can view the metadata for any keys by displaying the properties window for the table and clicking the Keys tab.

Deleting an Invalid Change Analysis Result

When you perform change analysis on imported metadata, it is possible to import the wrong metadata or compare the imported metadata to the wrong current metadata.
If this happens, the comparison result metadata in the Comparison Results tree will not be valid, and neither will the data sets for this comparison in the comparison result library.

If you are not working under change management, delete the bad comparison result metadata.

If you are working under change management, perform the following steps:

1. Check in the bad comparison result metadata. From the Project tree, right-click the Project repository icon and select Check In Repository. This will make the comparison result metadata available to others, such as the administrator in the next step.
2. In SAS Data Integration Studio, have an administrator open the repository that contains the bad comparison result metadata.
3. Have the administrator delete the bad comparison result from the Comparison Results tree. This will delete both the metadata and the data sets for a comparison result.

Exporting Other Metadata

Problem

You want to export metadata from SAS Data Integration Studio in CWM format or a format that is supported by a SAS Metadata Bridge. Some SAS solutions rely on this method.

Solution

Use the Metadata Export Wizard to export the default metadata repository that is specified in your metadata profile.

Tasks

Export the Default Metadata Repository

Perform the following steps to export the default metadata repository:

1. From the SAS Data Integration Studio desktop, select Tools > Metadata Exporter. The Metadata Export Wizard displays.
2. Select the format for the exported repository and click Next.
If the selected file format requires you to specify options, an options window displays.
3. If you must specify options for the export format, specify them and click Next.
4. Specify a path for the export file and click Next.
5. Review the metadata in the Finish window and click Finish when you are finished.

Chapter 5
Working with Tables

About Tables 85
Registering Tables with a Source Designer 85
  Problem 85
  Solution 85
  Tasks 86
    Register a Table with a Source Designer 86
Registering Tables with the Target Table Wizard 87
  Problem 87
  Solution 87
  Tasks 87
    Register a Table with a Target Table Wizard 87
Viewing or Updating Table Metadata 89
  Problem 89
  Solution 89
Using a Physical Table to Update Table Metadata 90
  Problem 90
  Solution 90
  Tasks 90
    Run Update Table Metadata 90
Specifying Options for Tables 91
  Problem 91
  Solution 91
  Tasks 91
    Set Global Options for Tables 91
    Set Local Options for Tables 91
Supporting Case and Special Characters in Table and Column Names 93
  About Case and Special Characters in SAS Names 93
    Rules for SAS Names 93
    Case and Special Characters in SAS Names 94
  About Case and Special Characters in DBMS Names 94
    Overview 94
    DBMSs for Which Case and Special Characters are Supported 95
  Verify Name Options for Database Libraries 95
  Enabling Name Options for a New Database Library 95
  Enabling Name Options for an Existing Database Library 96
  Verify DBMS Name Options in Table Metadata 96
  Enabling Name Options for Existing Tables 97
  Set DBMS Name Options in the Source Designers 97
  Setting Default Name Options for Tables and Columns 97
Maintaining Column Metadata 98
  Problem 98
  Solution 98
  Tasks 99
    Add Metadata for a Column 99
    Modify Metadata for a Column 100
    Perform Additional Operations on Column Metadata 100
    Add and Maintain Notes and Documents for a Column 103
Identifying and Maintaining Key Columns 104
  Problem 104
  Solution 104
  Tasks 105
    Identify a Column as a Primary or Unique Key 105
    Add a Foreign Key Column 106
    Add a Column to a Primary or Unique Key 106
    Remove a Column from a Primary or Unique Key 106
    Rearrange the Columns in a Multi-Column Key 106
    Rename a Key 106
    Delete a Key 107
Maintaining Indexes 107
  Problem 107
  Solution 107
  Tasks 107
    Create a New Index 107
    Delete an Index or a Column 108
    Rearrange the Columns in an Index 108
Browsing Table Data 109
  Problem 109
  Solution 109
  Tasks 109
    Use Browse Mode in the View Data Window 109
    Browse Functions 110
    Additional Information 112
Editing SAS Table Data 112
  Problem 112
  Solution 112
  Tasks 112
    Use Edit Mode in the View Data Window 112
    Edit Tasks 113
    Additional Information 115
Using the View Data Window to Create a SAS Table 115
  Problem 115
  Solution 116
  Tasks 116
    Using the Create Table Function in the View Data Window 116
    Additional Information 116
Specifying Browse and Edit Options for Tables and External Files 116
  Problem 116
  Solution 117
  Tasks 117
    Set General Options 117
    Set Column Header Options 117
    Set Format Options 118
    Set Search Options 118
    Set Editing Options 118

About Tables

A table contains data arranged in rows and columns. SAS Data Integration Studio supports SAS tables and the tables created by the database management systems that are supported by SAS/ACCESS software. The main places where tables are displayed within SAS Data Integration Studio include:

- the Inventory, Custom, and Project trees
- the Process Designer window

The following table shows some common tasks that you perform when you work with tables in SAS Data Integration Studio:

Table 5.1 Common Table Tasks

If you want to work with a table in SAS Data Integration Studio:
Register the table. See the sections about source designers and the Target Table wizard in this chapter for more information. See also 'Registering Sources and Targets' on page 56.

If you want to specify a registered table as a source or a target in a SAS Data Integration Studio job:
Select the table in a tree. Then, drag it to the Process Designer window for the job and drop it onto the appropriate drop zone.
You can also select target tables when you create a job in the New Job wizard.

If you want to view the data or metadata for a registered table: Use the View Data window to review table data and metadata in the trees and the Process Designer window. See Browsing Table Data on page 109 for more information.

If you want to import, export, copy, and paste table metadata: Use the Metadata Importer and Metadata Exporter wizards. See Working with SAS Metadata on page 66.

Registering Tables with a Source Designer

Problem

You want to create a job that includes one or more tables that exist in physical storage, but the tables are not registered in a metadata repository.

Solution

Use the appropriate source designer to register the tables. Later, you can drag and drop this metadata into the target position in a process flow. When the process flow is executed, SAS Data Integration Studio will use the metadata for the target table to create a physical instance of that table.

The first page of the wizard will prompt you to select a library that contains the tables to be registered. (Typically, this library has been registered ahead of time.) SAS Data Integration Studio must be able to access this library. For details about SAS libraries, see Libraries on page 8.

Tasks

Register a Table with a Source Designer

Perform the following steps to register one or more tables that exist in physical storage:
1 Display the source designer selection window in one of the following ways:
   - Click Source Designer on the shortcuts bar.
   - Select Tools > Source Designer.
   - Select File > New > Object. Then, click Source Table on the New Object Wizard window.
2 When the source designer selection window opens, only those data formats for which source designers have been installed are available. Select the appropriate wizard for the data type of the tables that you need to register.
3 Click Next. The wizard tries to open a connection to the default SAS Application Server.
If there is a valid connection to this server, you might be prompted for a user name and a password. After you have provided that information, you are taken directly to the Select a SAS Library window.
4 Select the SAS library that contains the tables that you want to register, and review the settings displayed in the Library Details section of the window. Sample settings are shown in the following display.

Display 5.1 Sample Library Settings

5 Click Next to access the Define Tables window. Select one or more tables to register.
6 Click Next to access the Select Folder window. Specify a Custom tree group for the table or tables that you are registering. A Custom tree group is a folder that you can use to keep similar types of metadata together in the Custom tree on the SAS Data Integration Studio desktop.
7 Click Next to access the Wizard Finish window. Review the metadata that will be created. When you are satisfied that the metadata is correct, click Finish to save the data and close the wizard.

Registering Tables with the Target Table Wizard

Problem

You want to create a job that includes a table that does not yet exist. This new table could hold the final results of the job, or it could serve as the input to a transformation that would continue the job.

Solution

Use the Target Table wizard to register the new table. Later, you can drag and drop this metadata onto the target position in a process flow. When the process flow is executed, SAS Data Integration Studio will use the metadata for the target table to create a physical instance of that table.

The physical storage page of the wizard will prompt you to select a library that will contain the table to be registered. (Typically, this library has been registered ahead of time.)
For details about SAS libraries, see Libraries on page 8.

Tasks

Register a Table with the Target Table Wizard

Perform the following steps to register a table that does not yet exist:
1 Display a selection window in one of the following ways:
   - Click Target Designer on the shortcuts bar.
   - Select Tools > Target Designer.
   - Select File > New > Object. Then, click Target Table on the New Object Wizard window.
2 The target designer selection screen opens. Select Target Table. Note that the list of wizards includes only target types for the Target Designer wizards that have been installed.
3 Click Next. Because you have set up a default SAS Application Server, the wizard tries to open a connection to this default server. If a valid connection to this server exists, you are prompted for a valid user name and a password. After you provide that information, the name and description window displays in the Target Table Designer. Enter a name and description for the table that you want to register. Note that the metadata object might or might not have the same name as the corresponding physical table. You will specify a name for the physical table in a later window in this wizard.
4 Click Next to access the Table Storage Information window. Enter appropriate values in the following fields:
   - DBMS
   - Library
   - Name (must follow the rules for table names in the format that you select in the DBMS field. For example, if SAS is the selected DBMS, the name must follow the rules for SAS data sets. If you select another DBMS, the name must follow the rules for tables in that DBMS. For a SAS table or a table in a database management system, you can enable the use of mixed-case names or special characters in names.)
   - Schema (if required by DBMS type)
   Use the Table Storage Information window to specify the format and location of the table that you are registering.
You also specify the database management system that is used to create the target, the library where the target is to be stored, and a valid name for the target. You can specify new libraries or edit the metadata definitions of existing libraries by using the New and Edit buttons. You can use the Table Options button to specify options for SAS tables and tables in a DBMS. The following display shows these settings for a sample table.

Display 5.2 Sample Table Storage Settings

5 Click Next to access the Select Columns window. Use the Select Columns window to import column metadata from existing tables registered for use in SAS Data Integration Studio. Drill down in the Available Columns field to find the columns that you need for the target table. Then, move the selected columns to the Selected Columns field.
6 Click Next to access the Change Columns/Indexes window. Use this window to accept or modify any column metadata that you selected in the Select Columns window. You can add new columns or modify existing columns in various ways. (For details, click the Help button for the window.)
7 Click Next when you are finished reviewing and modifying the column metadata. If you change the default order of the column metadata, you are prompted to save the new order. The Select Folder window is displayed. Specify a Custom tree group for the table or tables that you are registering. A Custom tree group is a folder that you can use to keep similar types of metadata together in the Custom tree on the SAS Data Integration Studio desktop.
8 Click Next to access the Wizard Finish window. Review the metadata that will be created.
When you are satisfied that the metadata is correct, click Finish to save the data and close the wizard.

Viewing or Updating Table Metadata

Problem

You want to view or update the metadata for a table that you have registered in SAS Data Integration Studio.

Solution

You can access the properties window for the table and change the settings on the appropriate tab of the window. The following tabs are available on the properties windows for tables:
- General
- Columns
- Indexes
- Keys
- Physical Storage
- Parameters
- Notes
- Extended Attributes
- Advanced

Use the properties window for a table to view or update the metadata for its columns, keys, indexes, and other attributes. You can right-click a table in any of the trees on the SAS Data Integration Studio desktop or in the Process Designer window. Then, click Properties to access its properties window.

Note that any updates that you make to the metadata for a table change the physical table when you run a job that contains the table. These changes can have the following consequences for any jobs that use the table:
- Changes, additions, or deletions to column metadata are reflected in all of the jobs that include the table.
- Changes to column metadata often affect mappings. Therefore, you might need to remap your columns.
- Changes to keys, indexes, physical storage options, and parameters affect the physical table and are reflected in any job that includes the table.

You can use the impact analysis and reverse impact analysis tools in SAS Data Integration Studio to estimate the impact of these updates on your existing jobs. For information, see About Impact Analysis and Reverse Impact Analysis on page 255.

Using a Physical Table to Update Table Metadata

Problem

You want to ensure that the metadata for a table matches the columns contained in the physical table.
If the metadata does not match the columns in the physical table, you need to update the metadata to match the physical table.

Solution

You can use the update table metadata feature. This feature compares the columns in a physical table to the columns that are defined in the metadata for that table. If the column metadata does not match the columns in the physical table, the metadata is updated to match the physical table.

For existing tables, the update table metadata feature adds new columns, removes deleted columns, and records changes to all of the column attributes. When you select and run this feature against one or more tables simultaneously, the update log lists which tables have been updated successfully and which have failed. When you use the update table metadata option on a physical table in DBMS format and the DBMS table has more than one schema, the update table metadata option selects the first schema.

The update table metadata feature uses the following resources:
- the current metadata server and the SAS Application Server to read the physical table
- the current metadata server to update the metadata to match the physical table

A warning message displays if the SAS Workspace Server component of the SAS Application Server is older than SAS 9.1.3, Service Pack 3. See the usage note "Some New Features Require the Latest Servers" in the "SAS Data Integration Studio Usage Notes" topic in SAS Data Integration Studio Help. For release-specific information on this feature, see the "Update Table Metadata Cannot Be Used for Some Tables" usage note.

Tasks

Run Update Table Metadata

Perform the following steps to run the update table metadata feature:
1 Select one or more tables from a SAS Data Integration Studio tree. Then, click Update Table Metadata on the Tools menu.
You might be prompted to supply a user name and password for the relevant servers.
2 When the update is finished, you can choose to view the resulting SAS log.

Specifying Options for Tables

Problem

You want to set options for tables used in SAS Data Integration Studio jobs, such as DBMS name options; library, name, and schema options; and compression scheme and password protection options.

Solution

You can set global options for tables on the General tab of the Options window. The Options window is available from the Tools menu on the SAS Data Integration Studio menu bar. You can set local options on the tabs available on the properties window for each table.

Tasks

Set Global Options for Tables

Table 5.2 Global Table Options

Enable case-sensitive DBMS object names: Specifies whether the source designer and target designer will generate code that supports case-sensitive table and column names by default. If you do not select the check box, no case-sensitive support is provided. If you select the check box, support is provided.

Enable special characters within DBMS object names: Specifies whether the source designer and target designer will generate code that supports special characters in table and column names by default. If you select the check box, support is provided by default. When you select this check box, the Enable case-sensitive DBMS object names check box is also automatically selected.

The global settings apply to any new table metadata object, unless the settings are overridden by a local setting. See Supporting Case and Special Characters in Table and Column Names on page 93 for more information about DBMS object names.

Set Local Options for Tables

You can set local options that apply to individual tables. These local options override global options for the selected table, but they do not affect any other tables. For example, you can set local DBMS name options on the Physical Storage tab of the properties window for a table.
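When these name options are enabled for SAS tables, mixed-case or special-character names are typically handled in SAS code through the VALIDVARNAME system option and SAS name literals. The following is a minimal sketch of that syntax only; it is not the exact code that SAS Data Integration Studio generates, and the table and column names are invented:

```sas
/* Sketch only: SAS name-literal syntax for mixed-case and
   special-character column names (names are hypothetical). */
options validvarname=any;    /* allow any character in variable names */

data work.orders;
   'Order ID'n = 1001;       /* name literal: name contains a blank  */
   'SalesAmount'n = 250.75;  /* name literal: mixed case is preserved */
run;

proc print data=work.orders;
run;
```

Without VALIDVARNAME=ANY, names such as 'Order ID'n would be rejected under the default SAS naming rules described later in this chapter.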
These DBMS name options are listed in the following table.

Table 5.3 Local Table Options on the Physical Storage Tab

DBMS: Specifies the database management system (DBMS) where the table is stored. To select a DBMS from a list, click the down arrow. DBMSs that are valid in the current context are listed.

Library: Specifies a library that you can use to access the table. You can choose from a list of libraries that are in the Project repository or its parent repositories. To select a library, click the selection arrow. To create a new library, click New, which opens the New Library wizard. To edit the properties of an existing library, click Edit, which opens the properties window for the data library.

Name: Specifies the name of the table. The name must follow the rules for table names in the DBMS that is selected in the DBMS field. For example, if SAS is the selected DBMS, the name must follow the rules for SAS data sets. If you select another DBMS, the name must follow the rules for tables in that DBMS. If the table is used for iterative or parallel processing, the table name must be preceded by an ampersand (&) character (for example, &sourcetable). This usage ensures that the parameters set for the iteration are recognized and that the table is included when the iterative process works through the list of tables contained in the control table.

Schema: When present, this field specifies the name of the schema that is associated with the DBMS table that is specified in the Name field. A schema is a map or model of the overall data structure of a database. To select a schema that has been defined in a current metadata repository, click the selection arrow. To specify a new schema, click New, which opens the New Database Schema wizard. To edit the properties of an existing schema, click Edit, which opens the properties window for the schema.
When adding or editing a schema, note that the name of the schema in the metadata must match the name of the schema in the DBMS exactly (including case).

Enable case-sensitive DBMS object names: When present, specifies whether SAS Data Integration Studio will generate code that supports case-sensitive table and column names for the current table. If the check box is deselected, case sensitivity is not supported. If the check box is selected, case sensitivity is supported. This option overrides the global option with the same name.

Enable special characters within DBMS object names: When present, specifies whether the source designers and the Target Table wizard will generate code that supports special characters in table and column names by default. If you select the check box, support is provided by default. When you select this check box, the Enable case-sensitive DBMS object names check box is also automatically selected. This option overrides the global option with the same name.

Table Options: When present, displays the Table Options window, where you can specify a compression scheme, password protection, or other options for the current table.

See Supporting Case and Special Characters in Table and Column Names on page 93 for more information about DBMS object names.

You can set additional table options in the Table Options window. To access this window, click Table Options on the Physical Storage tab of the properties window for a table. These options are covered in the following table.

Table 5.4 Local Table Options in the Table Options Window

Compressed: Specifies the kind of compression used, if any, for a SAS data set. (You cannot compress SAS data views because they contain no data.) Compression reduces storage requirements, but it increases the amount of CPU time that is required to process the file.
Although compression affects performance by increasing the amount of CPU time that is required to process the file, it significantly reduces storage requirements. You should therefore use the default value (no compression) only if you have sufficient disk space. In particular, you will want to enable compression for files, such as Web logs, that have columns with large widths that are sparsely filled. Select one of the following:
- NO (default): The SAS data set is not compressed.
- YES: The SAS data set is compressed using Run Length Encoding (RLE). Use this method for character data.
- BINARY: The SAS data set is compressed using Ross Data Compression (RDC). Use this method for medium to large (several hundred bytes or larger) blocks of binary data (numeric variables).

Encrypted: Specifies whether a SAS data set is encrypted (YES) or not encrypted (NO). You cannot encrypt SAS data views because they contain no data.

Additional Options: Specifies options for SAS data sets or views. Separate each option with a blank. The field is restricted to a maximum of 200 characters. You can specify a password for the table in this field. Use table option syntax, such as read=readpw write=writepw.

Supporting Case and Special Characters in Table and Column Names

About Case and Special Characters in SAS Names

Rules for SAS Names

By default, the names for SAS tables and columns must follow these rules:
- Blanks cannot appear in SAS names.
- The first character must be a letter (such as A through Z) or an underscore (_).
- Subsequent characters can be letters, numeric digits (such as 0 through 9), or underscores.
- You can use uppercase or lowercase letters. SAS processes names as uppercase, regardless of how you enter them.
- Special characters are not allowed, except for the underscore.
Only in filerefs can you use the dollar sign ($), pound sign (#), and at sign (@).

The following SAS language elements have a maximum length of eight characters:
- librefs and filerefs
- SAS engine names and passwords
- names of SAS/ACCESS access descriptors and view descriptors (to maintain compatibility with SAS Version 6 names)
- variable names in SAS/ACCESS access descriptors and view descriptors

Beginning in SAS 7 software, SAS naming conventions were enhanced to allow longer names for SAS data sets and SAS variables. The conventions also allow case-sensitive or mixed-case names for SAS data sets and variables. The following SAS language elements can now be up to 32 characters in length:
- members of SAS libraries, including SAS data sets, data views, catalogs, catalog entries, and indexes
- variables in a SAS data set
- macros and macro variables

For a complete description of the rules for SAS names, see the topic "Names in the SAS Language" in SAS Language Reference: Concepts.

Case and Special Characters in SAS Names

By default, the names for SAS tables and columns must follow the rules for SAS names. However, SAS Data Integration Studio supports case-sensitive names for tables and columns, as well as special characters in column names, if you specify the appropriate DBMS name options in the metadata for the SAS table, as described in Enabling Name Options for Existing Tables on page 97 or Setting Default Name Options for Tables and Columns on page 97. Double-byte character set (DBCS) column names are supported in this way, for example.

The DBMS name options apply to most tables in SAS format. The following are exceptions:
- Special characters are not supported in SAS table names.
- Leading blanks are not supported for SAS column names and are therefore stripped out.
- Neither the External File source designers nor SAS/SHARE libraries and tables support case-sensitive names for SAS tables or special characters in column names.
When you use these components, the names for SAS tables and columns must follow the standard rules for SAS names.

About Case and Special Characters in DBMS Names

Overview

You can access tables in a database management system (DBMS), such as Oracle or DB2, through a special SAS library that is called a database library. SAS Data Integration Studio cannot access a DBMS table with case-sensitive names or with special characters in names unless the appropriate DBMS name options are specified in both of these places:
- in the metadata for the database library that is used to access the table, as described in Enabling Name Options for a New Database Library on page 95 or Enabling Name Options for an Existing Database Library on page 96
- in the metadata for the table itself, as described in Enabling Name Options for Existing Tables on page 97

Use the following methods to avoid or fix problems with case-sensitive names or with special characters in names in DBMS tables.

DBMSs for Which Case and Special Characters Are Supported

SAS Data Integration Studio generates SAS/ACCESS LIBNAME statements to access tables and columns that are stored in DBMSs. The following SAS/ACCESS LIBNAME statements have options that preserve case-sensitive names and names with special characters:
- DB2 z/OS
- DB2 UNIX/PC
- Informix
- MySQL
- ODBC
- OLE DB
- Oracle
- Microsoft SQL Server
- Teradata

Verify Name Options for Database Libraries

Perform the following steps to verify that the appropriate DBMS name options have been set for all database libraries that are listed in the Inventory tree:
1 From the SAS Data Integration Studio desktop, select the Inventory tree.
2 In the Inventory tree, open the Libraries folder.
3 Right-click a database library and select Display Libname from the pop-up menu. A SAS LIBNAME statement is generated for the selected library.
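For reference, a generated LIBNAME statement with both name options set might look similar to the following sketch. The library name and connection options here are hypothetical; only the PRESERVE_TAB_NAMES= and PRESERVE_COL_NAMES= options are the point of the example:

```sas
/* Hypothetical example: an Oracle database library whose metadata
   preserves case-sensitive and special-character names.          */
libname salesdb oracle
   user=dbuser password=dbpass path=orapath
   preserve_tab_names=yes    /* Preserve DBMS table names = YES        */
   preserve_col_names=yes;   /* Preserve column names as in the DBMS   */
```

If either option is missing or set to NO in the generated statement, the corresponding name option is not enabled in the library metadata.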
In the LIBNAME statement, verify that both the Preserve DBMS table names option and the Preserve column names as in the DBMS option are set to YES.
4 If these options are not set correctly, update the metadata for the library, as described in Enabling Name Options for an Existing Database Library on page 96.

Enabling Name Options for a New Database Library

The following steps describe how to specify name options for a new database library so that table and column names are supported as they are in the DBMS. This task is typically done by an administrator. It is assumed that the appropriate database server has been installed and registered, and the appropriate database schema has been registered. For more information about database servers and schemas, see the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.

1 Follow the general instructions in Registering Any Libraries That You Need on page 55. In the first window of the New Library wizard, select the appropriate kind of database library and click Next.
2 Enter a name for the library and click Next.
3 Enter a SAS libref for the library, then click Advanced Options. The Advanced Options window displays.
4 In the Advanced Options window, click the Output tab. In the Preserve column names as in the DBMS field, select Yes.
5 Click OK and enter the rest of the metadata as prompted by the wizard.

Enabling Name Options for an Existing Database Library

Perform the following steps to update the existing metadata for a database library to support table and column names as they exist in the DBMS:
1 In SAS Data Integration Studio, click the Inventory tab to display the Inventory tree.
2 In the Inventory tree, expand the folders until the Libraries folder is displayed.
3 Select the Libraries folder and then select the library whose metadata must be updated.
4 Select File > Properties from the menu bar.
The properties window for the library displays.
5 In the properties window, click the Options tab.
6 On the Options tab, click Advanced Options. The Advanced Options window displays.
7 In the Advanced Options window, click the Output tab. In the Preserve column names as in the DBMS field, select Yes.
8 In the Advanced Options window, click the Input/Output tab. In the Preserve DBMS table names field, select Yes.
9 Click OK twice to save your changes.

Verify DBMS Name Options in Table Metadata

Perform the following steps to verify that the appropriate DBMS name options have been set for DBMS tables that are used in SAS Data Integration Studio jobs:
1 From the SAS Data Integration Studio desktop, select the Inventory tree.
2 In the Inventory tree, open the Jobs folder.
3 Right-click a job that contains DBMS tables and select View Job from the pop-up menu. The job opens in the Process Designer window.
4 In the process flow diagram for the job, right-click a DBMS table and select Properties from the pop-up menu.
5 In the properties window, click the Physical Storage tab.
6 Verify that the Enable case-sensitive DBMS object names option and the Enable special characters within DBMS object names option are selected.
7 If these options are not set correctly, update the metadata for the table, as described in Enabling Name Options for Existing Tables on page 97.

Enabling Name Options for Existing Tables

Perform the following steps to enable name options for tables whose metadata has been saved to a metadata repository. These steps apply to tables in SAS format or in DBMS format.
1 From the SAS Data Integration Studio desktop, display the Inventory tree or another tree view.
2 Open the Tables folder.
3 Select the desired table and then select File > Properties from the menu bar.
The properties window for the table displays.
4 In the properties window, click the Physical Storage tab.
5 On the Physical Storage tab, select the check box to enable the appropriate name option for the current table. Select Enable case-sensitive DBMS object names to support case-sensitive table and column names. Select Enable special characters within DBMS object names to support special characters in table and column names.
6 Click OK to save your changes.

Set DBMS Name Options in the Source Designers

The first window in a DBMS source designer wizard enables you to select the library that contains the DBMS table or tables for which you want to generate metadata. In the first window, check the boxes labeled Enable case-sensitive DBMS object names and Enable special characters within DBMS object names.

Setting Default Name Options for Tables and Columns

You can set default name options for all table metadata that is entered with a source designer wizard or a target designer wizard in SAS Data Integration Studio. These defaults apply to tables in SAS format or in DBMS format.

Defaults for table and column names can make it easier for users to enter the correct metadata for tables. Administrators still have to set name options on database libraries, and users should at least verify that the appropriate name options are selected for a given table.

Perform the following steps to set default name options for all table metadata that is entered with a source designer wizard or a target designer wizard in SAS Data Integration Studio:
1 Start SAS Data Integration Studio.
2 Open the metadata profile that specifies the repository where metadata for the tables is stored.
3 On the SAS Data Integration Studio desktop, select Tools > Options from the menu bar.
The Options window is displayed.
4 In the Options window, select the General tab.
5 On the General tab, select Enable case-sensitive DBMS object names to have source designers and the Target Table wizard support case-sensitive table and column names by default.
6 On the General tab, select Enable special characters within DBMS object names to have source designers and the Target Table wizard support special characters in table and column names by default.
7 Click OK to save any changes.

Maintaining Column Metadata

Problem

You want to work with the metadata for the columns of the tables or external files that you have registered in SAS Data Integration Studio. You might also need to perform common tasks such as sorting, reordering, and restoring the columns of the tables or external files. You might even need to attach a note or document to the metadata for a column.

Solution

You can use the Columns window, tab, or pane to maintain the metadata for columns in a table or external file. You can perform the following tasks on the metadata:
- Add Metadata for a Column
- Modify Metadata for a Column
- Import Metadata for a Column
- Delete Metadata for a Column
- Propagate New Columns
- Reorder Columns
- Sort Columns
- Restore the Order of Columns
- Save Reordered Columns
- Add and Maintain Notes and Documents for a Column

Note: Updates to any shared resource should be carefully planned to avoid unintended impacts on other applications that might use the same resource. For example, the metadata for a table might be used in a SAS Data Integration Studio job. If a table is a target in a SAS Data Integration Studio job, changes to the metadata for that table are not reflected in the corresponding physical table until the next time that the job is executed.
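As a point of reference, the column attributes maintained in the Columns window (name, description, length, type, informat, and format) map onto ordinary SAS column attributes. The following hypothetical DATA step sketch shows where each attribute would appear in SAS code; the table and column names are invented for illustration:

```sas
/* Hypothetical sketch: how column metadata attributes map onto
   SAS column attributes (names and values are invented).       */
data work.customers;
   length cust_id 8 cust_name $ 40;    /* length and type (numeric/char) */
   informat cust_name $40.;            /* informat                       */
   format cust_name $40.;              /* format                         */
   label cust_name = "Customer name";  /* description                    */
   cust_id = 1;
   cust_name = "Example customer";
run;
```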
Tasks

Add Metadata for a Column

Perform the following steps to add a new column to the metadata for the current table:
1 In the list of columns, select the column that the new column should follow.
2 Click the New button. A row of default metadata describing the new column displays. The name of the column, Untitledn, is selected and ready for editing. The other attributes of the column have their default values:
   - Description: Blank
   - Length: 8
   - Type: Character
   - Summary Role: (None)
   - Sort Order: (None)
   - Informat: (None)
   - Format: (None)
   - Is Nullable: No
3 Change the name of the column to give it a meaningful name.
4 Change the values of other attributes for the column, as described in Modify Metadata for a Column on page 100.
5 Click Apply or OK.

For information about the implications of adding metadata for a column, see the note at the end of Import Metadata for a Column in Table 5.6 on page 101.

Modify Metadata for a Column

To modify the metadata for a column in the current table, select the attribute, make the change, and then click Apply or OK. The following table explains how to change each type of attribute.

Table 5.5 Column Metadata Modifications

Name: Perform the following steps to enter a name:
1 Double-click the current name to make it editable.
2 Enter a new name of 32 characters or fewer.
3 Press the Enter key.

Description: Perform the following steps to enter a description:
1 Double-click in the Description field.
2 Edit the description, using 200 characters or fewer.
3 Press the Enter key.

Length: Perform the following steps to enter the column length:
1 Double-click the current length.
2 Enter a new length. A numeric column can be from 3 to 8 bytes long (2 to 8 in the z/OS operating environment).
A character column can be up to 32,767 characters long.
3 Press the Enter key.

Type: Perform the following steps to enter the data type:
1 Double-click the current value to display the drop-down list arrow.
2 Click the arrow to display a list of valid choices.
3 Select a value from the list.

Summary Role: Perform the same steps as for Type.

Sort Order: Perform the same steps as for Type.

Informat: Perform the following steps to enter an informat:
1 Double-click the current value to display the drop-down list arrow.
2 Click the arrow to display a list of valid choices, and then select a value from the list, or type in a new value and press the Enter key.

Format: Perform the same steps as for Informat.

Is Nullable: Perform the same steps as for Type.

You can also make a value editable by tabbing to it and pressing F2 or any alphanumeric key. For information about the implications of modifying metadata for a column, see the note at the end of Delete Metadata for a Column in Table 5.6 on page 101.

Perform Additional Operations on Column Metadata

The following table describes some additional operations that you can perform on column metadata.

Table 5.6 Additional Operations on Column Metadata

Delete Metadata for a Column: Perform the following steps to delete the metadata for a column in the current table:
1 Select a column.
2 Click Delete.

Note: When you modify or delete the metadata for a column in a table and that table is used in a SAS Data Integration Studio job, you might also have to make the same modifications to other tables in the job. For example, if you change the data type of a column and that table is used as a source in a job, then you need to change the data type of that column in the target table and in the temporary work tables in the transformations in that job. Changes to column metadata in SAS Data Integration Studio do not appear in the physical table unless you select Drop Table on the Load Technique tab of the Loader transformation that loads the current table.
Import Metadata for a Column: Perform the following steps to import column metadata that has been added to the repositories that are specified in your current metadata profile:
1 Click Import Column to access the Import Column window.
2 Select one or more columns from the Available Columns field in the Import Column window.
3 Select the right arrow to move the selected columns into the Selected Columns field.
4 Reorder the columns in the Selected Columns field by selecting columns and clicking the Moves select items up or Moves select items down arrows.
5 Click OK to import the columns into the table.
Be aware of the following implications if you add or import metadata for a column:
- You might need to propagate that column metadata through the job or jobs that include the current table, as described in Propagate Metadata for New Columns in this table.
- The new column does not appear in the physical table unless you select Drop Table in the Load Technique tab of the Loader transformation that loads the current table.

Propagate Metadata for New Columns: Perform the following steps to propagate a new column through the temporary work tables in the transformations in a job. In this way, the new column is propagated to the next target in the process flow diagram:
1 As necessary, open the job in the SAS Data Integration Studio Process Editor by double-clicking the job in the Project tree or another tree on the desktop.
2 In the Process Editor, add or import the column in all source and target tables that need that column, as described in the earlier portions of this section.
3 Select Process > Propagate from the menu bar and click Yes in the pop-up window.
Selecting Propagate adds the new column to all of the temporary work tables that use that column. It also adds mappings that connect these work tables to the next targets.
Note: Any mappings that were removed before selecting Propagate are restored after you select Propagate.
If you add or import columns into a source and if that source is used in other jobs, you should consider propagating that change through those other jobs.

Reorder Columns: You can rearrange the columns in a table (without sorting them) by (1) dragging a column to a new location or (2) using the arrow buttons at the bottom of the window. Drag a column to a new location by dragging the column-number cell. Perform the following steps to move a column or columns using the arrow buttons:
1 Select one or more columns.
2 Click the Moves columns up arrow to move the columns up, or click the Moves columns down arrow to move them down.

Restore the Order of Columns: Click the column number heading to restore all of the columns to their original order.

Save Reordered Columns: Some windows allow you to change the default order of columns. Then, you can save that new order in the metadata for the current table or file. If you can save reordered columns before you exit the current window, SAS Data Integration Studio displays a dialog box that asks if you want to save the new order.

Sort Columns: You can sort the columns in a table based on the value of any column attribute (such as Name or Description) in either ascending or descending order. For example, you can sort the columns in ascending order by name by left-clicking the Name heading. To sort the columns in descending order by name, click the same heading a second time.

Add and Maintain Notes and Documents for a Column

The Columns window, tab, or pane enables you to attach text notes, and documents produced by word processors, to the metadata for a table column. Such a note or document usually contains information about the table column or the values stored in that column.
Note: If a column currently has notes or documents associated with it, you can see a notes icon to the left of the column name.
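Stepping back from the window mechanics for a moment: the column attributes maintained in the preceding tables (type, length, informat, and format) map directly onto DATA step column attributes in SAS code. The following is a minimal sketch, not code generated by SAS Data Integration Studio; the table and column names are hypothetical:

```sas
/* Hypothetical sketch: column metadata (type, length, informat, format)
   expressed as DATA step column attributes.                            */
data work.employees;
   attrib emp_id    length=8                  format=z6.      /* numeric   */
          emp_name  length=$40                                /* character */
          hire_date length=8  informat=date9. format=date9.;  /* SAS date  */
   input emp_id emp_name $ hire_date;
   datalines;
1001 Smith 01JAN2005
;
run;
```

Because the associated informat is applied when the value is read, hire_date is stored as a SAS date value and displayed with the DATE9. format.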
To add a note or document to a column, modify an existing note or document, or remove an existing note or document, you can use the Notes window. Follow these steps to get to this window:
1 Right-click the column you want to work with. Then, click Notes on the pop-up menu to access the Notes window for the selected column.
2 Perform one or more of the following tasks in the Notes group box:
- Click New to create a new note. Enter a title in the Assigned field and the text of the note in the Note text field. Use the editing and formatting tools at the top of the window if you need them.
- Click the name of an existing note in the Assigned field to review or update the content in the Note text field.
- Click Delete to delete the note.
- Click Attach to access the Select Additional Notes window and attach an additional note to the column.
3 Perform one or more of the following steps in the Documents group box:
- Click New to attach a new document to the note. Enter a title in the Name field. Then, enter a path to the document in the Path field.
- Click the name of an existing document in the Name field to review or update the path in the Path field.
- Click Delete to delete the document.
- Click Attach to access the Select Additional Documents window and attach an additional document to the column.
The following display depicts a sample of a completed Notes window.

Display 5.3 Completed Notes Window

Identifying and Maintaining Key Columns

Problem

You want to identify existing columns in a table as primary, unique, or foreign keys, which are defined as follows:
- Primary key: one or more columns that are used to uniquely identify a row in a table. A table can have only one primary key. The columns in a primary key cannot contain null values.
- Foreign key: one or more columns that are associated with a primary key or unique key in another table. A table can have one or more foreign keys.
A foreign key is dependent upon its associated primary or unique key. In other words, a foreign key cannot exist without that primary or unique key.
- Unique key: one or more columns that can be used to uniquely identify a row in a table. A table can have one or more unique keys. Unlike a primary key, a unique key can contain null values.
You also need to modify and delete existing key columns. If you need to generate new key values, see Chapter 19, Working with Slowly Changing Dimensions, on page 343.

Solution

You can use the Keys tab in the properties window for a table to identify key columns in the table, modify existing keys, and delete keys. You can perform the following tasks when you work with keys:
- Identify a Primary or Unique Key Column
- Add a Foreign Key Column
- Add a Column to a Primary or Unique Key
- Remove a Column from a Primary or Unique Key
- Rearrange the Columns in a Multi-Column Key
- Rename a Key
- Delete a Key

Tasks

Identify a Column as a Primary or Unique Key

Perform the following steps to designate existing columns as primary keys or unique keys in a table:
1 Access the Keys tab on the properties sheet for a table.
2 Select one or more columns from the Columns table that you want to serve as the primary or unique key.
3 Click the down-arrow portion of the New combination button, and select Primary Key or Unique Key, depending on the type of key that you are creating. The new nodes display in the tree in the Keys field. The nodes include the key and the columns in the key.
4 Click Apply or OK. A sample primary node is depicted in the following display.

Display 5.4 Completed Primary Node

Note: By default, the keys are named as follows:
- TableName.Primary indicates a primary key.
- Unique keys are named TableName.Unique Key#, such as EMPLOYEE_JAN_DTM.Unique Key1 for the first unique key.

Add a Foreign Key Column

Perform the following steps to identify a foreign key in a table:
1 Click the down-arrow portion of the New combination button, and select Foreign Key. This displays the Foreign Key Wizard window.
2 In the wizard, click Help for guidance in creating the foreign key. When you have finished running the wizard, additional nodes display in the tree in the Keys field.
Note: For a complete example of how to run the Foreign Key Wizard, see the 'Example: Running the Foreign Key Wizard' topic in SAS Data Integration Studio help.

Add a Column to a Primary or Unique Key

Perform the following steps to add a new column to an existing primary or unique key:
1 From the columns table, select the column that you want to add to the primary or unique key.
2 In the keys tree, select either the folder that contains the key columns or one of the columns in the key.
3 Click Add, or drag and drop one or more columns into the Columns field in the keys tree. If you selected the folder, the column that you add becomes the first column in the list of columns that make up the key.
If you selected a column, the new column is added below the selected column.
4 Click OK or Apply.

Remove a Column from a Primary or Unique Key

Perform the following steps to remove a column from an existing primary or unique key:
1 Select the column that you want to remove in the tree in the Keys field.
2 Click Remove, or press the Delete key on your keyboard.
3 Click OK or Apply.

Rearrange the Columns in a Multi-Column Key

Perform the following steps to rearrange the columns in a multi-column key:
1 Select the column that you want to move up or down in the list of columns in the tree in the Keys field.
2 Click Move a column up to move the column up in the list, or click Move a column down to move the column down.
3 Click OK or Apply.

Rename a Key

Perform the following steps to rename a key:
1 Right-click the Key folder and click Rename in the pop-up menu. This makes the key name editable.
2 Enter the new name, and press the Enter key.
3 Click OK or Apply.

Delete a Key

Perform the following steps to delete a key:
1 Click the folder that represents the key that you want to delete. This action selects the key.
2 Click the Delete button, or press the Delete key on your keyboard.
3 Click OK or Apply.

Maintaining Indexes

Problem

You want to create a new index or modify or delete an existing index. See the 'Indexes Window/Tab' topic in SAS Data Integration Studio help for more information about indexes.

Solution

You can use the Indexes tab on the properties window for a table.

Tasks

Create a New Index

Perform the following steps to create a new index in the Indexes tab:
1 Click New. A folder displays in the tree in the Indexes field. This folder represents an index and has an appropriate default name.
The name is selected for editing. You can rename the index to a more appropriate value by typing over the existing name and pressing the Enter key.
2 Perform the following steps to add one or more columns to the index:
- Drag a column name from the Columns field to an index folder in the Indexes field.
- Select a column name in the Columns field. Then, select an index. Finally, click Add.
3 Click OK. The following display depicts a sample index.

Display 5.5 Sample Completed Index

Note: If you add one column to the index, you create a simple index. If you add two or more columns, you create a composite index. If you want the index to be unique, select the index name in the Indexes field, and then select the Unique values check box. Finally, if you are working with a SAS table and want to ensure that the index contains no missing values, check the No missing values check box.

Delete an Index or a Column

Perform the following steps to delete an index or to delete a column from an index in the Indexes window or tab:
1 Select the index or column in the tree in the Indexes field.
2 Click the Delete button, or press the Delete key on your keyboard.
3 Click OK or Apply.

Rearrange the Columns in an Index

You can reorder the columns for composite indexes, which contain more than one column. Perform the following steps to move a column up or down in the list of index columns in the Indexes window or tab:
1 Select the column that you want to move in the tree in the Indexes field.
2 Use the Move columns up in an index and Move columns down in an index buttons to move the column up or down.
3 After you have arranged the columns as you want them, click OK or Apply.
Note: It is generally best to list first the column that you plan to search most often.
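On SAS tables, the keys and indexes described in the two preceding sections are stored as integrity constraints and index files, so both can also be created in code. The following is a rough sketch (not wizard-generated code) using PROC SQL constraint syntax and the PROC DATASETS INDEX statement; the table and column names are hypothetical:

```sas
proc sql;
   /* Primary key: uniquely identifies a row; no null values allowed. */
   create table work.department
      (dept_id   num      primary key,
       dept_code char(8)  unique);        /* unique key: nulls allowed */

   /* Foreign key: depends on the primary key in work.department. */
   create table work.employee
      (emp_id    num primary key,
       dept_id   num references work.department,
       hire_date num);
quit;

proc datasets library=work nolist;
   modify employee;
   index create hire_date / nomiss;              /* simple index; add UNIQUE
                                                    to require unique values */
   index create dept_hire=(dept_id hire_date);   /* composite index          */
quit;
```

Listing dept_id first in the composite index matches the guideline above of putting the most frequently searched column first.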
Browsing Table Data

Problem

You want to display data in a SAS table or view, in an external file, in a temporary output table displayed in a process flow diagram, or in a DBMS table or view that is part of a SAS library for DBMS data stores.

Solution

You can use the browse mode of the View Data window, provided that the table, view, or external file is registered in a current metadata repository and exists in physical storage. You can browse temporary output tables until the Process Designer window is closed or the current server session is ended in some other way.

Transformations in a SAS Data Integration Studio job can create temporary output tables. If these temporary tables have not been deleted, you can also use the browse mode to display the data that they contain. The transformation must have been executed at least once for the temporary output tables to exist in physical storage.

Tasks

Use Browse Mode in the View Data Window

Perform the following steps to browse data in the View Data window:
1 In the Inventory tree, another tree view, or the Process Designer window, right-click the metadata object for the table, view, external file, temporary output, or transformation. Then, select View Data from the pop-up menu.
2 Enter the appropriate user ID and password, if you are prompted for them. The information in the table, view, or external file displays in the View Data window, as shown in the following display.

Display 5.6 View Data Window in Browse Mode

The title bar of the View Data window displays the name of the object that is being viewed and the total number of rows.

Browse Functions

The browse mode of the View Data window contains a group of functions that enable you to customize how the data in the window is displayed.
These functions are controlled by the view data toolbar, as shown in the following display.

Display 5.7 View Data Browse Toolbar

Perform the tasks listed in the following table to customize the data display:

Table 5.7 Browse Functions in the View Data Window

Navigate within the data: Perform the following steps:
- Enter a row number in the Go to row field and click Go to row to specify the number of the first row that is displayed in the table.
- Click Go to first row to navigate to the first row of data in the View Data window.
- Click Go to last row to navigate to the last row of data in the View Data window.

Select a View Data window mode: Perform the following steps:
- Click Switch to browse mode to switch to the browse mode.
- Click Switch to edit mode to switch to the edit mode.
Note: The Switch to browse mode and Switch to edit mode buttons are displayed only for SAS tables.

Copy one or more rows of data into the copy buffer: Highlight one or more rows of data. Then, click Copy to copy the selected text into the copy buffer.

Manipulate the data displayed in the View Data window: Perform the following steps:
- Click Launch search screen. Then, use the search toolbar to search for string occurrences in the data set that is currently displayed in the View Data window.
- Click Launch sort screen. Then, use the Sort By Columns tab in the View Data Query Options window to specify a sort condition on multiple columns. The sort is performed on the data set that is currently displayed in the View Data window.
- Click Launch filter screen. Then, use the Filter tab in the View Data Query Options window to specify a filter clause on the data set that is currently displayed in the View Data window. This filter clause is specified as an SQL WHERE clause that is used when the data is fetched.
- Click Launch column subset screen.
Use the Columns tab in the View Data Query Options window to select a list of columns that you want to see displayed in the View Data window. You can create a subset of the data currently displayed in the View Data window by selecting only some of the available columns in the Columns field. The redrawn View Data window will include only the columns that you select here on the Columns tab.

Determine what is displayed in the column headers: You can display any combination of column metadata, physical column names, and descriptions in the column headers.
- Click Show column name in column header to display physical column names in the column headings.
- Click Show description in column header to display optional descriptions in the column headers.
- Click Show metadata name in column header to display optional column metadata in the column headers. This metadata can be entered in some SAS Business Intelligence applications, such as the SAS Information Mapping Studio.

Determine whether metadata formats are applied: Click Apply metadata formats to toggle between showing formatted and unformatted data in the View Data window.

To sort columns and perform related tasks, right-click a column name and select an appropriate option from the pop-up menu. To refresh the view, select View > Refresh from the menu bar. For other details about using this window, see 'Pop-Up Menu Options in the View Data Window' in the SAS Data Integration Studio help.

To set options for the View Data window, select File > Options from the SAS Data Integration Studio menu bar to display the Options window. Then, click the View Data tab. See Specifying Browse and Edit Options for Tables and External Files on page 116 for information about the available options.

To access the View Data drop-down menu, click anywhere in the View Data window and click the View Data heading on the SAS Data Integration Studio menu bar.
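As the table above notes, the filter that you specify on the Filter tab is applied as an SQL WHERE clause when the data is fetched. A sketch of an equivalent query in PROC SQL, showing how a column subset, filter, and sort combine; the table, column names, and filter values are hypothetical:

```sas
/* Hypothetical equivalent of View Data query options in PROC SQL. */
proc sql;
   select name, hire_date, salary        /* column subset (Columns tab)   */
      from work.staff
      where salary > 40000               /* filter clause (Filter tab),   */
        and hire_date >= '01JAN2005'd    /* applied as an SQL WHERE clause */
      order by name;                     /* sort (Sort By Columns tab)    */
quit;
```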
For information about the available menu items, see the View Data section in the 'Menu Bar' topic in the SAS Data Integration Studio help.

Additional Information

General information about the View Data window is available in the 'View Data Window' Help topic in SAS Data Integration Studio. For detailed information about specific usage issues, see the 'Usage Notes for the View Data Window' Help topic.

Editing SAS Table Data

Problem

You want to edit SAS table data that is displayed in the View Data window.

Solution

You can use the edit mode of the View Data window to perform simple editing operations in a SAS table. The editing mode is enabled only on SAS tables that are stored in a BASE engine library and are assigned on the workspace server. The entity that you want to edit must be registered in a current metadata repository, and it must exist in physical storage. If you are working under change management, you must check out the entity before you can edit it in the View Data window.

Tasks

Use Edit Mode in the View Data Window

Perform the following steps to edit data for a SAS table in the View Data window:
1 In the Inventory tree, another tree view, or the Process Designer window, right-click the metadata object for a SAS table. Then, select View Data from the pop-up menu.
2 Enter the appropriate user ID and password, if you are prompted for them. The information in the table displays in the browse mode of the View Data window.
3 Click Switch to edit mode on the view data toolbar. The View Data window displays in edit mode, as shown in the following display.

Display 5.8 View Data Window in Edit Mode

The title bar of the View Data window displays the name of the object that is being viewed.
4 Double-click inside a cell and then change the data in the cell. Click Commit edit row to commit the change to the database. Of course, you must have operating system access for the file for the change to be saved.
5 Click Undo last action to reverse the change that you just made.
(You can click Redo last action to return to the changed version of the cell.) Note that you can undo only the last operation because only a single level of undo is supported. So if multiple rows have been deleted or pasted, only the last row affected can be undone. Similarly, you can redo only your latest undo.
6 Click a row number to select the row. Click Copy to copy the row into the buffer.
7 Click Go to last row to move to the last row in the table.
8 Click in the row marked by the New Row icon at the end of the View Data window. The New Row icon changes to the Editing Row icon. Click Paste to paste the copied data into the row.
9 Click Delete selected rows to delete the pasted data and remove the row from the table.

Edit Tasks

The edit mode of the View Data window contains a group of functions that enable you to customize how the data in the window is displayed. These functions are controlled by the view data toolbar, as shown in the following display.

Display 5.9 View Data Edit Toolbar

Perform the tasks listed in the following table to edit the data displayed:

Table 5.8 Edit Functions in the View Data Window

Navigate within the data: Perform the following steps:
- Enter a row number in the Go to row field and click Go to row to specify the number of the first row that is displayed in the table.
- Click Go to first row to navigate to the first row of data in the View Data window.

Select a View Data window mode: Perform the following steps:
- Click Switch to browse mode to switch to the browse mode.
- Click Switch to edit mode to switch to the edit mode.
Note: The Switch to browse mode and Switch to edit mode buttons are displayed only for SAS tables.

Copy or paste data: Perform the following steps:
- Highlight one or more rows of data. Then, click Copy to copy the selected text into the copy buffer.
- Place the cursor in the row where you want to place the data.
Then, click Paste to paste the data into the table.

Search the data displayed in the View Data window: Click Launch search screen. Then, use the search toolbar to search for string occurrences in the data set that is currently displayed in the View Data window.

Undo or redo editing operations: Perform the following steps:
- Click Undo last action to reverse the most recent editing operation.
- Click Redo last action to restore the results of the most recent editing operation.

Determine what is displayed in the column headers: You can display any combination of column metadata, physical column names, and descriptions in the column headers.
- Click Show column name in column header to display physical column names in the column headers.
- Click Show description in column header to display optional descriptions in the column headers.

Commit or delete editing changes: Perform the following steps:
- Click Commit edit row to commit the changes that you have made to the currently edited row.
- Click Delete edit row to delete the changes that you have made to the currently edited row.

To hide, show, hold, and release columns, right-click a column name and select an appropriate option from the pop-up menu. To refresh the view, select View > Refresh from the menu bar. For other details about using this window, see 'Pop-Up Menu Options in the View Data Window' in the SAS Data Integration Studio help.

To set options for the View Data window, select File > Options from the SAS Data Integration Studio menu bar to display the Options window. Then, click the View Data tab. See Specifying Browse and Edit Options for Tables and External Files on page 116 for information about the available options.

To access the View Data drop-down menu, click anywhere in the View Data window and click the View Data heading on the SAS Data Integration Studio menu bar.
For information about the available menu items, see the View Data section in the 'Menu Bar' topic in the SAS Data Integration Studio help.

Additional Information

General information about the View Data window is available in the 'View Data Window' Help topic in SAS Data Integration Studio. For detailed information about specific usage issues, see the 'Usage Notes for the View Data Window' Help topic.

Using the View Data Window to Create a SAS Table

Problem

You want to create a new SAS table. This method can be used to create small tables for testing purposes.

Solution

Use the create table function of the View Data window. This function enables you to create a new SAS table based on metadata that you register by using the Target Table wizard.

Tasks

Using the Create Table Function in the View Data Window

Perform the following steps to create a new table in the View Data window:
1 Create the metadata for a new SAS table in the Target Table wizard. Select the columns that you need from existing tables.
2 Right-click the newly registered table and click View Data. The dialog box in the following display is shown.

Display 5.10 Create Table Dialog Box

3 Click Yes to create the table in the SAS library that you specified in the metadata for the table. The table is opened in edit mode.

Additional Information

General information about the View Data window is available in the 'View Data Window' Help topic in SAS Data Integration Studio. For detailed information about specific usage issues, see the 'Usage Notes for the View Data Window' Help topic.

Specifying Browse and Edit Options for Tables and External Files

Problem

You want to set options that control how tables and external files are processed in the browse and edit modes in the View Data window.

Solution

You can use the View Data tab in the Options window to specify options for the View Data window. The options that you set on the View Data tab are applied globally.
The tab is divided into the General group box, the Column Headers group box, the Format group box, the Search group box, and the Editing group box.

Tasks

Set General Options

The General group box contains the following items:

Table 5.9 General Options

Clear Query Options when refreshing: Clears any options that you set on the query when you refresh the data.

Prompt for long-running navigation operation: Determines whether the user is prompted to decide whether the View Data query should proceed with a long-running navigation operation. If this option is selected, the prompt is displayed whenever the row count of the table is either not known or greater than 100,000. If the option is deselected, the navigation operation proceeds without the warning prompt.

Set Column Header Options

The Column Headers group box contains the following items:

Table 5.10 Column Headers Options

Show column name in column header: Displays physical column names in the column headers.

Show column description in column header: Displays optional descriptions in the column headers.

Show column metadata name in column header: Displays optional column metadata names in the column headers. This metadata can be entered in some SAS Business Intelligence applications, such as the SAS Information Mapping Studio.

Note: You can display any combination of column metadata, SAS column names, and descriptions in the column headers by selecting the combination of check boxes that are required to get the result that you want.

Set Format Options

The Format group box contains the following items:

Table 5.11 Format Options

Apply formats: When selected, displays formatted data in the View Data window.
This option applies the permanent formats that are specified for the data when the data set is created. Deselect the check box to view unformatted data in the View Data window.

Apply metadata formats: When selected, uses metadata formats for formatted data that is displayed in the View Data window. These formats are specified in the metadata for the data set.

Set Search Options

The Search group box contains the following items:

Table 5.12 Search Options

Recently specified search string (entries): Specifies the number of recently searched strings that are displayed when you click the drop-down menu in the Search for field.

Ignore invalid column names: When selected, ignores any invalid column names that are entered into the search.

Set Editing Options

The Editing group box contains the following items:

Table 5.13 Editing Options

Allow editing of SCD2 tables without prompting: Determines whether a warning dialog box is displayed that states that edits to Slowly Changing Dimension (SCD) tables will cause the SCD to no longer be valid.

Always delete rows without prompting: Determines whether a warning dialog box is displayed before rows are deleted.

On multi-row operation errors: When one or more errors occur in a multi-row editing operation, determines whether the user is prompted, errors are ignored and the operation is continued, or the operation is canceled.

Default date format: Specifies a default format for date values.

Default datetime format: Specifies a default format for datetime values.

Chapter 6
Working with External Files

About External Files 120
Registering a Delimited External File 122
  Problem 122
  Solution 122
  Tasks 123
    Run the Delimited External File Source Designer 123
    View the External File Metadata 125
    View the Data 125
  Additional Information 126
Registering a Fixed-Width External File 126
  Problem 126
  Solution 126
  Tasks 126
    Run the Fixed-Width External File Source Designer 126
    View the External File Metadata 129
    View the Data 129
  Additional Information 130
Registering an External File with User-Written Code 130
  Problem 130
  Solution 130
  Tasks 131
    Test Your Code 131
    Run the User-Written External File Source Designer 131
    View the External File Metadata 133
    View the Data 133
  Additional Information 133
Viewing or Updating External File Metadata 134
  Problem 134
  Solution 134
Overriding the Code Generated by the External File Wizards 135
  Problem 135
  Solution 135
  Tasks 135
    Replace a Generated SAS INFILE Statement 135
  Additional Information 135
Specifying NLS Support for External Files 136
  Problem 136
  Solution 136
  Tasks 136
    Specify NLS Encoding Options 136
  Additional Information 136
Accessing an External File With an FTP Server or an HTTP Server 136
  Problem 136
  Solution 137
  Tasks 137
    Select an HTTP server or an FTP server 137
  Additional Information 137
Viewing Data in External Files 138
  Problem 138
  Solution 138
  Tasks 138
    View Raw Data in an External File 138
    View Formatted Data in the External File Wizards 138
Registering a COBOL Data File That Uses a COBOL Copybook 139
  Problem 139
  Solution 139
  Tasks 139
    Import the COBOL Copybook 139
    Copy Column Metadata From the COBOL Format File 140

About External Files

An external file is a file that is maintained by the machine operating environment or by a software product other than SAS. A flat file with comma-separated values is one example.

SAS Data Integration Studio provides the following source designer wizards that enable you to create metadata objects for external files:
- Delimited External File wizard: Use for external files in which data values are separated with a delimiter character. You can specify multiple delimiters, nonstandard delimiters, missing values, and a multi-line record.
- Fixed Width External File wizard: Use for external files in which data values appear in columns that are a specified number of characters wide. You can specify non-contiguous data.
- User Written External File wizard: Use for complex external files that require user-written SAS code for data access.

You can use the external file source designer wizards to do the following:
- display a raw view of the data in the external file
- display a formatted view of the data in the external file, as specified in the SAS metadata for that file
- display the SAS DATA step and SAS INFILE statement that the wizard generates for the selected file
- display the SAS log for the code that is generated by the wizard
- specify options for the SAS INFILE statement that is generated by the wizard, such as National Language Support (NLS) encoding
- override the generated SAS INFILE statement with a user-written statement
- supply a user-written SAS DATA step to access an external file

The most common tasks that you can perform with external files are listed in the following table.

Table 6.1 Common External File Tasks

Access external files on a server: The external file source designers prompt you to specify the physical path to an external file and to select a server that can resolve that path. You can select any SAS Application Server, FTP server, or HTTP/HTTPS server that is defined in a current metadata repository. For details, see Accessing an External File With an FTP Server or an HTTP Server on page 136.

Generate column metadata: The external file wizards enable you to automatically generate column metadata based on the structure of the external file, a guessing algorithm, and some parameters that you supply. The guessing algorithm supports numeric, character, date/time, and monetary formats. Defaults can be supplied for numeric and character formats. The external file wizards enable you to import column metadata in the following ways:
- from the metadata for a table or an external file in a current metadata repository.
- from an external format file.
For information, see the 'Using External Format Files' topic in SAS Data Integration Studio Help.
- from a COBOL copybook. For information, see Registering a COBOL Data File That Uses a COBOL Copybook on page 139.
- from a column heading in the external file.
You can also manually enter the metadata for each column.

If you want to: View and update external files
Do this: You can update the metadata for an external file, as described in the 'Viewing or Updating Metadata for Other Objects' topic in SAS Data Integration Studio Help. You can display a raw view of the data in an external file or display a formatted view of this data, as specified in the metadata for the file. For details, see the 'Viewing Data for Tables and Files' topic in SAS Data Integration Studio Help.

If you want to: Use external files in jobs
Do this: The metadata for an external file can be used in a SAS Data Integration Studio job that reads data from the file or writes data to the file. For more information about jobs, see About Jobs on page 142. To support jobs that include external files, two new transformations have been added to the Process Library:
- File Reader: reads an external file and writes to a target. It is added automatically to a process flow when an external file is specified as a source. It executes on the host where the external file resides, as specified in the metadata for the external file.
- File Writer: reads a source and writes to an external file. It is added automatically to a process flow when an external file is specified as a target. It executes on the host where the external file resides, as specified in the metadata for the external file.
The Mapping tab in the properties window for these two transformations enables you to define derived mappings.

Registering a Delimited External File

Problem

You want to create metadata for a delimited external file so that it can be used in SAS Data Integration Studio.

Solution

Use the delimited external file source designer to register the file.
The source designer enables you to create metadata for external files that contain delimited data. This metadata is saved to a SAS Metadata Repository.

Tasks

Run the Delimited External File Source Designer

Perform the following steps to use one method to register an external file in the delimited external file source designer:

1 Open the Source Designer window. Open the External Files folder and click Delimited External File. Click Next to access the External File Location window.
2 If you are prompted, enter the user ID and password for the default SAS Application Server that is used to access the external file.
3 Specify the physical path to the external file in the File name field. Click Next to access the Delimiters and Parameters window.
4 Select the check box for the appropriate delimiter in the Delimiters group box. Accept the default values for the remaining fields, and click Next to access the Column Definitions window.
5 Click Refresh to view the raw data from the external file in the File tab in the view pane at the bottom of the window. Sample data is shown in the following display.

Display 6.1 Delimited Data in the File Tab

Note: If your external file contains fewer than 10 rows, a warning box is displayed. Click OK to dismiss the warning window.

6 Click Auto Fill to access the Auto Fill Columns window and populate preliminary data into the columns component of the Column Definitions window.
7 The first row in most external files is unique because it holds the column names for the file. Therefore, you should change the value that is entered in the Start record field in the Guessing records group box to 2. This setting ensures that the guessing algorithm begins with the second data record in the external file. Excluding the first record from the guessing process yields more accurate preliminary data.
8 Accept all of the remaining default settings.
Click OK to return to the Column Definitions window.
9 Click Import to access the Import Column Definitions window and use the import function to simplify the task of entering column names.
10 Select the Get the column names from column headings in the file radio button, and keep the default settings for the fields underneath it. Click OK to save the settings and return to the Column Definitions window. The names from the first record in the external file are populated in the Name column. You can now edit them as needed.

Note: If you use the get column names from column headings function, the value in the Starting record field in the Data tab of the view pane in the Column Definitions window is automatically changed. The new value is one greater than the value in the The column headings are in file record field in the Import Column Definitions window.

11 The preliminary metadata that is populated into the columns component usually includes column names and descriptions that are too generic to be useful for SAS Data Integration Studio jobs. Fortunately, you can modify the columns component by clicking in the cells that you need to change and entering the correct data. Enter appropriate values for the external file that you are registering. The following display depicts a sample completed Column Definitions window.

Display 6.2 Sample Completed Column Definitions Window

12 To verify that the metadata you have entered is appropriate for the data in the external file, click the Data tab and then click Refresh. If the metadata matches the data, the data is properly displayed in the Data tab. In the current example, the Data tab would look similar to the View Data window for the registered external file. If the data does not display properly, update the column metadata and click Refresh to verify that the appropriate updates have been made. To view the code that will be generated for the external file, click the Source tab. To view the SAS log for the generated code, click the Log tab.
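The code on the Source tab is essentially a SAS DATA step whose INFILE statement reflects the delimiter and parameter choices made in the wizard. The following sketch only illustrates that general shape; the path, delimiter, and column definitions are hypothetical, not values produced by the wizard:

```sas
/* Illustrative sketch only: the kind of DATA step the wizard
   generates for a comma-delimited file with a heading record.
   The path and column definitions are hypothetical. */
data work.customers;
   infile "c:\source_data\customers.csv"
      dlm = ','       /* the delimiter selected in the wizard  */
      dsd             /* two consecutive delimiters = missing  */
      firstobs = 2    /* skip the column-heading record        */
      truncover;      /* short records do not abort the read   */
   attrib CustomerID length = 8   informat = 8.   format = 8.;
   attrib Name       length = $40 informat = $40. format = $40.;
   input CustomerID Name $;
run;
```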
The code that is displayed in the Source tab is the code that will be generated for the current external file when it is included in a SAS Data Integration Studio job.
13 Click Next to access the Select Folder window. Select an appropriate Custom tree folder for the external file metadata.
14 Click Next to access the General window. Enter an appropriate name for the external file metadata. (If you leave the Name field blank, the file is identified by its path.) You can also enter an optional description.
15 Click Finish to save the metadata and exit the delimited external file source designer.

View the External File Metadata

After you have generated the metadata for an external file, you can use SAS Data Integration Studio to view, and possibly make changes to, that metadata. For example, you might want to remove a column from a table or change the data type of a column. Any changes that you make to this metadata will not affect the physical data in the external file. However, the changes will affect the data that is included when the external table is used in SAS Data Integration Studio. Perform the following steps to view or update external file metadata:

1 Right-click the external file, and click Properties. Then, click the Columns tab. The Columns window is displayed, as shown in the following example.

Display 6.3 External File Columns Window

2 Click OK to save any changes and close the properties window.

View the Data

Right-click the external file, and click View Data. The View Data window is displayed, as shown in the following example.

Display 6.4 External File Data in the View Data Window

If the data in the external file displays correctly, the metadata for the file is correct and the table is available for use in SAS Data Integration Studio. If you need to review the original data for the file, right-click on its metadata object.
Then, click View File.

Additional Information

General information about registering external files is available in the 'Managing External Files in SAS Data Integration Studio' Help topic in SAS Data Integration Studio. For detailed information about specific usage issues, see the 'Usage Notes for External Files' Help topic.

Registering a Fixed-Width External File

Problem

You want to create metadata for a fixed-width external file so that it can be used in SAS Data Integration Studio.

Solution

Use the fixed-width external file source designer to register the file. The source designer enables you to create metadata for external files that contain fixed-width data. The metadata is saved to a SAS Metadata Repository.

You need to know the width of each column in the external file. This information might be provided in a document that describes the structure of the external file.

Tasks

Run the Fixed-Width External File Source Designer

Perform the following steps to use one method to register an external file in the fixed-width external file source designer:

1 Open the Source Designer window. Open the External Files folder and click Fixed Width External File. Click Next to access the External File Location window.
2 If you are prompted, enter the user ID and password for the default SAS Application Server that is used to access the external file.
3 Specify the physical path to the external file in the File name field. Click Next to access the Parameters window.
4 The Pad column values with blanks check box is selected by default. Deselect this check box if the columns in your external file are short. It is unnecessary to pad values in short columns, and padded values can hurt performance. In addition, select the Treat unassigned values as missing check box.
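These two check boxes correspond to options on the INFILE statement that the wizard generates (the guide confirms that the second one adds TRUNCOVER; the first presumably maps to PAD). A hedged sketch of the kind of fixed-width read they control; the path and column positions are hypothetical:

```sas
/* Hypothetical sketch of a fixed-width read. TRUNCOVER makes
   variables that a short record cannot fill come out as missing
   values instead of triggering a jump to the next record. */
data work.orders;
   infile "c:\source_data\orders.dat"
      lrecl = 80
      pad          /* pad short records with blanks       */
      truncover;   /* treat unassigned values as missing  */
   input @1  OrderID  8.
         @9  Product  $20.
         @29 Quantity 4.;
run;
```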
The Treat unassigned values as missing setting adds the TRUNCOVER option to the SAS code, which sets variables without assigned values to missing.
5 Accept the default for the Logical record length, and click the Next button to access the Column Definitions window.
6 Click Refresh to view the raw data from the external file in the File tab in the view pane at the bottom of the window. Sample data is shown in the following display.

Display 6.5 Fixed-Width Data in the File Tab

7 The next step sets the boundaries of the columns in the external file. The process is similar to the process that is used to set tabs in word processing programs. Click the appropriate tick marks in the ruler displayed at the top of the view pane. You can get the appropriate tick mark position numbers from the documentation that comes with the data. To set the first column boundary, click the tick mark on the ruler that immediately follows the end of its data. A break line displays, and the column is highlighted. For example, if the data in the first column extends to the eighth tick mark, you should click the ninth mark. Notice that the metadata for the column is also populated into the column component at the top of the window.
8 Click the appropriate tick marks in the ruler for the other columns in the external file. Break lines and metadata for these columns are set.
9 Click Auto Fill to refine this preliminary data by using the auto fill function. Accept all default settings and then click OK to return to the Column Definitions window. More accurate metadata is entered into the column components section of the window.
10 The preliminary metadata that is populated into the columns component usually includes column names and descriptions that are too generic to be useful for SAS Data Integration Studio jobs.
Fortunately, you can modify the columns component by clicking in the cells that you need to change and by entering the correct data.

Note: The only values that need to be entered for the sample file are appropriate names and descriptions for the columns in the table. The other values were created automatically when you defined the columns and clicked Auto Fill. However, you should make sure that all variables have informats that describe the data that you are importing, because the auto fill function provides only a best guess at the data, and you need to verify that guess. If appropriate informats are not provided for all variables in the fixed-width file, incorrect results can be encountered when the external file is used in a job or its data is viewed. A sample of a completed Column Definitions window is shown in the following display.

Display 6.6 Completed Column Definitions Window

You can click Data to see a formatted view of the external file data. To view the code that will be generated for the external file, click the Source tab. To view the SAS log for the generated code, click the Log tab. The code that is displayed in the Source tab is the code that will be generated for the current external file when it is included in a SAS Data Integration Studio job.

11 Click Next to access the Select Folder window. Select an appropriate Custom tree folder for the external file metadata.
12 Click Next to access the General window. Enter an appropriate name for the external file metadata. (If you leave the Name field blank, the file is identified by its path.) You can also enter an optional description.
13 Click Finish to save the metadata and exit the fixed-width external file source designer.

View the External File Metadata

After you have generated the metadata for an external file, you can use SAS Data Integration Studio to view, and possibly make changes to, that metadata.
For example, you might want to remove a column from a table or change the data type of a column. Any changes that you make to this metadata will not affect the physical data in the external file. However, the changes will affect the data that is surfaced when the external table is used in SAS Data Integration Studio. Perform the following steps to view or update external file metadata:

1 Right-click the external file, and click Properties. Then, click the Columns tab. The Columns tab is displayed, as shown in the following display.

Display 6.7 External File Columns Tab

2 Click OK to save any changes and close the properties window.

View the Data

Right-click the external file, and click View Data. The View Data window is displayed, as shown in the following display.

Display 6.8 External File Data in the View Data Window

If the data in the external file displays correctly, the metadata for the file is correct and the table is available for use in SAS Data Integration Studio. If you need to review the original data for the file, right-click on its metadata object. Then, click View File.

Additional Information

General information about registering external files is available in the 'Managing External Files in SAS Data Integration Studio' Help topic in SAS Data Integration Studio. For detailed information about specific usage issues, see the 'Usage Notes for External Files' Help topic.

Registering an External File with User-Written Code

Problem

You want to use user-written code to create metadata for an external file so that it can be used in SAS Data Integration Studio.

Solution

Use the user-written external file source designer to register external files with user-written SAS code that includes an INFILE statement. The metadata is saved to a SAS Metadata Repository.
This metadata can then be used as a source or a target in a SAS Data Integration Studio job.

Tasks

Test Your Code

You should test your SAS code before you run it in the User Written External File source designer. That way, you can ensure that any problems that you encounter in the wizard come from the wizard itself and not from the code. Perform the following steps to run this test:

1 Open the Source Editor from the Tools menu in the menu bar on the SAS Data Integration Studio desktop.
2 Paste the SAS code into the Source Editor window. Here is the code that is used in this example (the directory paths shown are examples and will differ on your system):

libname temp base "d:\9585\output_sas";
%let _output=temp.temp;
data &_output;
   infile "d:\9585\sources_external\birthday_event_data.txt"
      lrecl = 256
      pad
      firstobs = 2;
   attrib Birthday length = 8   format = ddmmyy10.  informat = YYMMDD8.;
   attrib Event    length = $19 format = $19.       informat = $19.;
   attrib Amount   length = 8   format = dollar10.2 informat = comma8.2;
   attrib GrossAmt length = 8   format = Dollar12.2 informat = Comma12.2;
   input @ 1  Birthday YYMMDD8.
         @ 9  Event    $19.
         @ 28 Amount   Comma8.2
         @ 36 GrossAmt Comma12.2;
run;

Note: The first two lines of this SAS code are entered to set the LIBNAME and output parameters that the SAS code needs to process the external file. After you have verified that the code ran successfully, delete them. They are not needed when the SAS code is used to process the external file.

3 Review the log in the Source Editor window to ensure that the code ran without errors. The expected number of records, variables, and observations should have been created.
4 Close the Source Editor window. Do not save the results.

Run the User-Written External File Source Designer

Perform the following steps to use one method to register an external file in the user-written source designer:

1 Open the Source Designer window. Open the External Files folder and click User Written External File.
Click Next to access the User Written Source Code window.
2 If you are prompted, enter the user ID and password for the default SAS Application Server that is used to access the external file.
3 Enter the appropriate value in the Type field. The available types are file and metadata.
4 Verify that the correct server is displayed in the Host field.
5 Specify the physical path to the external file in the Path field. Click Next to access the Column Definitions window.
6 You can either enter the column definitions manually or click Import to access the Import Column Definitions window. For information on the column import functions available there, see the 'Import Column Definitions Window' topic in the SAS Data Integration Studio Help. The column definitions for this example were entered manually.
You can find the information that you need to define the columns in the attributes list in the SAS code file. For example, the first variable in the birthday_event_code.sas file has a name of Birthday, a length of 8, the yymmdd8. informat, and the ddmmyy10. format. Click New to add a row to the columns component at the top of the Column Definitions window.
7 Review the data after you have defined all of the columns. To view this data, click the Data tab under the view pane at the bottom of the window. To view the code that will be generated for the external file, click the Source tab. To view the SAS log for the generated code, click the Log tab. The code that is displayed in the Source tab is the code that will be generated for the current external file when it is included in a SAS Data Integration Studio job. The following display shows the completed Column Definitions window.

Display 6.9 Completed Column Definitions Window

8 Click Next to access the Select Folder window. Select an appropriate Custom tree folder for the external file metadata.
9 Click Next to access the General window. Enter an appropriate name for the external file metadata.
(If you leave the Name field blank, the file is identified by its path.) You can also enter an optional description.
10 Click Finish to save the metadata and exit the user-written external file source designer.

View the External File Metadata

After you have generated the metadata for an external file, you can use SAS Data Integration Studio to view, and possibly make changes to, that metadata. For example, you might want to remove a column from a table or change the data type of a column. Any changes that you make to this metadata will not affect the physical data in the external file. However, the changes will affect the data that is included when the external table is used in SAS Data Integration Studio. Perform the following steps to view or update external file metadata:

1 Right-click the external file, and click Properties. Then, click the Columns tab. The Columns tab is displayed, as shown in the following display.

Display 6.10 External File Columns Tab

2 Click OK to save any changes and close the properties window.

View the Data

Right-click the external file, and click View Data. The View Data window is displayed, as shown in the following display.

Display 6.11 External File Data in the View Data Window

If the data in the external file displays correctly, the metadata for the file is correct and the table is available for use in SAS Data Integration Studio. If you need to review the original data for the file, right-click on its metadata object. Then, click View File.

Additional Information

General information about registering external files is available in the 'Managing External Files in SAS Data Integration Studio' Help topic in SAS Data Integration Studio.
For detailed information about specific usage issues, see the 'Usage Notes for External Files' Help topic.

Viewing or Updating External File Metadata

Problem

You want to view or update the metadata for an external file that you have registered in SAS Data Integration Studio.

Solution

You can access the properties window for the external file and change the settings on the appropriate tab of the window. The following tabs are available on the properties windows for external files:

- General
- File Location (not available for user-written external files)
- File Parameters
- Columns
- Parameters
- Notes
- Extended Attributes
- Advanced

Use the properties window for an external file to view or update the metadata for its columns, file locations, file parameters, and other attributes. You can right-click an external file in any of the trees on the SAS Data Integration Studio desktop or in the Process Designer window. Then, click Properties to access its properties window.

Note that any updates that you make to an external file change the physical external file when you run a job that contains the file. These changes can have the following consequences for any jobs that use the external file:

- Changes, additions, or deletions to column metadata are reflected in all of the jobs that include the external file.
- Changes to column metadata often affect mappings. Therefore, you might need to remap your columns.
- Changes to file locations, file parameters, and parameters affect the physical external file and are reflected in any job that includes the external file.

You can use the impact analysis and reverse impact analysis tools in SAS Data Integration Studio to estimate the impact of these updates on your existing jobs.
For information, see About Impact Analysis and Reverse Impact Analysis on page 255.

Overriding the Code Generated by the External File Wizards

Problem

You want to substitute your own SAS INFILE statement for the code that is generated by the Delimited External File wizard and the Fixed Width External File wizard. For details about the SAS INFILE statement, see SAS Language Reference: Dictionary.

Solution

Use the Override generated INFILE statement with the following statement check box in the Advanced File Parameters window of the Source Designer wizard.

Note: If you override the generated code that is provided by the external file wizards and specify a non-standard access method such as PIPE, FTP, or a URL, then the Preview button on the External File Location window, the File tab on the Column Definitions window, and the Auto Fill button on the Column Definitions window will not work.

Tasks

Replace a Generated SAS INFILE Statement

Perform the following steps to substitute your own SAS INFILE statement for the code that is generated by the Delimited External File wizard and the Fixed Width External File wizard:

1 Open the Source Designer selection window, and select either the Delimited External File wizard or the Fixed Width External File wizard.
2 Specify the physical path for the external file and click Next. Either the Parameters window or the Parameters/Delimiters window displays (depending on the selected wizard).
3 Click the Advanced button to display the Advanced File Parameters window.
4 Select the Override generated INFILE statement with the following statement check box.
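As an illustration, a replacement statement might use an access method that the wizards do not generate on their own, such as PIPE. The command, path, and options in this sketch are hypothetical:

```sas
/* Hypothetical override: read the file through a pipe instead of
   directly from disk. The command and options are illustrative. */
infile 'gunzip -c /data/archive/transactions.txt.gz' pipe
   lrecl = 256
   truncover
   firstobs = 2;
```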
Then, paste your SAS INFILE statement into the text area.
5 Enter other metadata for the external file as prompted by the wizard.

Additional Information

For details about the effects of using overridden code with a non-standard access method, see the 'Accessing Data With Methods Other Than the SAS Application Server' topic in SAS Data Integration Studio Help.

Specifying NLS Support for External Files

Problem

You want to specify the National Language Support (NLS) encoding for an external file. You must have the proper NLS encoding to view the contents of the selected file or automatically generate its column metadata.

Solution

Enter the appropriate encoding value into the Encoding options field in the Advanced File Parameters window of the Source Designer wizard.

Tasks

Specify NLS Encoding Options

Perform the following steps to specify NLS encoding for the Delimited External File wizard or the Fixed Width External File wizard:

1 Open the Source Designer selection window, and click either Delimited External File or Fixed Width External File.
2 Specify the physical path for an external file for which NLS options must be set, such as a Unicode file. Normally, after you have specified the path to the external file, you can click Preview to display the raw contents of the file. However, the Preview button will not work yet, because the required NLS options have not been specified.
3 Click Next. Either the Parameters window or the Parameters/Delimiters window displays (depending on the selected wizard).
4 Click Advanced to display the Advanced File Parameters window.
5 Enter the appropriate NLS encoding for the selected file in the Encoding options field.
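The value entered here is passed through to the generated INFILE statement. A sketch, assuming a UTF-8 source file (the path and other options are hypothetical):

```sas
/* Hypothetical: how an NLS encoding value surfaces in the
   generated INFILE statement for a UTF-8 delimited file. */
infile "d:\source_data\unicode_customers.txt"
   encoding = "utf-8"   /* the value from the Encoding options field */
   dlm = ','
   truncover;
```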
Then, click OK.

Additional Information

For detailed information about encoding values, see 'Encoding Values in SAS Language Elements' in SAS National Language Support (NLS): User's Guide.

Accessing an External File With an FTP Server or an HTTP Server

Problem

You want to access an external file that is located on either an HTTP server or an FTP server. The Delimited External File wizard and the Fixed Width External File wizard prompt you to specify the physical path to an external file. By default, a SAS Application Server is used to access the file. However, you can access the file with an HTTP server, HTTPS server, or FTP server if the metadata for that server is available in a current metadata repository.

Note: If you use a method other than a SAS Application Server to access an external file, then the Preview button on the External File Location window, the File tab on the Column Definitions window, and the Auto Fill button on the Column Definitions window will not work.

Solution

You can select the server in the FTP Server field or the HTTP Server field. These fields are located on the Access Method tab in the Advanced File Location Settings window of the Source Designer wizard.

Tasks

Select an HTTP Server or an FTP Server

Perform the following steps to select an HTTP server or an FTP server in the external file wizards:

1 Open the Source Designer selection window, and click either Delimited External File or Fixed Width External File.
2 Click Advanced in the External File Location window. The Advanced File Location Settings window displays.
3 Click the Access Method tab. Then, select either the FTP check box or the URL check box.
4 Select either an FTP server or an HTTP server in the FTP Server field or the HTTP Server field. Click OK to save the setting and close the Advanced File Location Settings window.
5 Specify a physical path for the external file.
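For background, these access methods correspond to SAS FILENAME access methods. A conceptual sketch of an FTP read; the host, credentials, path, and columns are placeholders, and in practice the wizard and the server metadata supply the real values:

```sas
/* Conceptual sketch only: reading an external file through the
   FTP access method. All names and credentials are placeholders. */
filename src ftp 'mirror/sales_data.csv'
   host = 'ftp.example.com'
   user = 'ftpuser'
   pass = 'XXXXXXXX';

data work.sales;
   infile src dlm = ',' firstobs = 2 truncover;
   input Region $ Amount;
run;
```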
The path must be appropriate for the server that you selected.
6 Enter other metadata for the external file as prompted by the wizard.

Additional Information

For details about defining metadata for an HTTP server, HTTPS server, or an FTP server, administrators should see 'Enabling the External File Wizards to Retrieve Files Using FTP or HTTP' in the SAS Data Integration Studio chapter of SAS Intelligence Platform: Desktop Application Administration Guide. Also see the usage note 'Accessing Data With Methods Other Than the SAS Application Server' in the 'Usage Notes for External Files' topic in SAS Data Integration Studio Help.

Viewing Data in External Files

Problem

You want to view raw data or formatted data in one of the external file wizards that are included in the Source Designer. You might also need to view this raw or formatted data in an external file that you have already registered by using one of the external file wizards.

Solution

You can view raw data in the External File Location window or Column Definitions window in the external file wizards, or in the View File window for a registered external file. You can view formatted data in the Column Definitions window in the external file wizards or in the View Data window for a registered external file.

Tasks

View Raw Data in an External File

You can click Preview on the External File Location window in the external file wizards to view raw data for an unregistered file. You can also click the File tab on the Column Definitions window. There are two main situations where the Preview button and the File tab will not be able to display data in the external file:

- when you use a method other than a SAS Application Server to access the external file (see Accessing an External File With an FTP Server or an HTTP Server on page 136)
- when you use the User Written External File wizard (because your SAS code, not the wizard, is manipulating the raw data in the file)

For an example of how you can use the File tab to help you define metadata, see the explanation of the Column Definitions window in Registering a Delimited External File on page 122. You can also view the raw data in an external file after you have registered it in the Source Designer. To do this, access the View File window for the external file. The raw data surfaced in the external file wizards and the View File window is displayed without detailed column specifications or data formatting. You can use it to better understand the structure of the external file.

View Formatted Data in the External File Wizards

The Data tab on the Column Definitions window displays data in the external file after metadata from the external file wizard has been applied. Use the Data tab to verify that the appropriate metadata has been specified for the external file. The Data tab is populated as long as the SAS INFILE statement that is generated by the wizard is valid. The tab cannot display data for a fixed-width external file unless the SAS informats in the metadata are appropriate for the columns in the data. For an example of how you can use the Data tab to help you verify your metadata, see the explanation of the Column Definitions window in Registering a Delimited External File on page 122.

You can also view the formatted data in an external file after you have registered it in the Source Designer. To do this, access the View Data window for the external file.

Registering a COBOL Data File That Uses a COBOL Copybook

Problem

You want to create metadata for a COBOL data file that uses column definitions from a COBOL copybook.
The copybook is a separate file that describes the structure of the data file.

Solution

Perform the following steps to specify metadata for a COBOL data file in SAS Data Integration Studio:

1 Use the import COBOL copybook feature to create a COBOL format file from the COBOL copybook file.
2 Use the Fixed-Width External File wizard to copy column metadata from the COBOL format file.

Tasks

Import the COBOL Copybook

Server administrators should perform the following steps, which describe one way to import the COBOL copybook:

1 Obtain the required set of SAS programs that supports copybook import. Perform the following steps from Technical Support document TS-536 to download the version of COB2SAS8.SAS that was modified for SAS V8:
a Go to the following Web page and download the zipped file: http://ftp.sas.com/techsup/download/mvs/cob2sas8.zip.
b Unzip the file into an appropriate directory.
c Read the README.TXT file. It contains information about this modified version of COB2SAS8.SAS. It also contains additional information about the installation process.
2 Click Import COBOL Copybook in the Tools menu for SAS Data Integration Studio to access the Cobol Copybook Location and Options window.
3 Select a SAS Application Server in the Server field. The selected SAS Application Server must be able to resolve the paths that are specified in the Copybook(s) field and the COBOL format file directory field.
4 Indicate the original platform for the COBOL data file by selecting the appropriate radio button in the COBOL data resides on field.
5 Select a copybook file to import in the Copybook(s) field. If you have imported copybooks in the past, you can select from a list of up to eight physical paths to previously selected copybook files. If you need to import a copybook that you have never used in SAS Data Integration Studio, you have two options. First, you can click Add to type a local or remote path manually.
Second, you can click Browse to browse for a copybook that is local to the selected SAS Application Server.
6 Specify a physical path to the directory for storing the COBOL format file in the COBOL format file directory field. You can enter a local or remote path in the field, choose a previously selected location from the drop-down menu, or browse to the file.
7 Click OK when you are finished. The Review object names to be created window displays.
8 Verify the name of the COBOL format file or files. Specify a physical path for the SAS log file in the SAS Log field. This file will be saved to the SAS Data Integration Studio client machine.
9 Click OK when you are finished. One or more COBOL format files are created from the COBOL copybook file.

Note: If the external file resides on the MVS operating system, and the file system is native MVS, then the following usage notes apply.
- Add the MVS: tag as a prefix to the name of the COBOL copybook file in the Copybook(s) field. Here is an example file name: MVS:wky.tst.v913.etls.copybook.
- Native MVS includes partitioned data sets (PDS and PDSE). Take this into account when you specify a physical path to the directory for storing the COBOL format file in the COBOL format file directory field. Here is an example path: MVS:dwatest.tst.v913.cffd.
- The COB2SAS programs must reside in a PDS with certain characteristics. For more information about these characteristics, see http://support.sas.com/techsup/technote/ts536.html.
- The path to the r2cob1.sas program should specify the PDS and member name. Here is an example path, which would be specified in the Full path for r2cob1.sas field in the Advanced options window: mvs:dwatest.tst.v913.cob2sasp(r2cob1).

Copy Column Metadata From the COBOL Format File

You can copy column metadata from the COBOL format file in the Column Definitions window of the Fixed-Width External File wizard.
Perform the following steps:
1 Access the Column Definitions screen of the Fixed-Width External File wizard. For information about the wizard, see Registering a Fixed-Width External File on page 126.
2 Click Import to access the Import Columns window.
3 Select the Get the column definitions from a COBOL copybook radio button. Then, use the down arrow to select the appropriate COBOL format file and click OK. The column metadata from the COBOL format file is copied into the Column Definitions window.
4 Specify any remaining column metadata in the Column Definitions window. Click Next when you are finished.
5 Click through the Select Group and General windows of the wizard, entering any unfinished configuration as you go. Click Finish when you are finished. The metadata for the external file is saved into the appropriate repository.

Chapter 7: Creating, Executing, and Updating Jobs

About Jobs 142
Jobs with Generated Source Code 142
Jobs with User-Supplied Source Code 143
Run Jobs 143
Manage Job Status 143
Creating an Empty Job 143
Problem 143
Solution 143
Tasks 144
Use the New Job Wizard 144
Related Tasks 144
Creating a Process Flow for a Job 144
Problem 144
Solution 144
Tasks 144
Create and Populate a Sample Job 144
About Job Options 145
Submitting a Job for Immediate Execution 147
Problem 147
Solution 147
Tasks 147
Submit a Job from a Process Designer Window 147
Submit a Job from a Tree 148
Resubmit a Job from the Job Status Manager Window 148
Accessing Local and Remote Data 148
Access Data in the Context of a Job 148
Access Data Interactively 149
Use a Data Transfer Transformation 150
Viewing or Updating Job Metadata 150
Problem 150
Solution 151
Tasks 151
View or Update Basic Job Properties 151
View or Update the Job Process Flow Diagram 151
Displaying the SAS Code for a Job 152
Problem 152
Solution 152
Tasks 152
View SAS Code in the Source Editor Tab 152
View SAS Code in the View Code Window 152
Common Code Generated for a Job 152
Overview 152
LIBNAME Statements 153
SYSLAST Macro Statements 153
Remote Connection Statements 154
Macro Variables for Status Handling 154
User Credentials in Generated Code 155
Troubleshooting a Job 155
Problem 155
Solution 155
Tasks 155
Display the Log tab 155
Review the Log for a Job 155

About Jobs

Jobs with Generated Source Code

A job is a collection of SAS tasks that create output. SAS Data Integration Studio uses the metadata for each job to generate or retrieve SAS code that reads sources and creates targets in physical storage. If you want SAS Data Integration Studio to generate code for a job, you must define a process flow diagram that specifies the sequence of each source, target, and process in a job. In the diagram, each source, target, and process has its own metadata object. For example, the following process flow diagram shows a job that reads data from a source table, sorts the data, and then writes the sorted data to a target table.

Display 7.1 Process Flow Diagram for a Job That Sorts Data

Given the direction of the arrows in the process flow:
- ALL_EMP specifies metadata for the source table.
- SAS Sort specifies metadata for the sort process, which writes its output to a temporary output table, Sort Target-W5BU8XGB.
- Table Loader specifies metadata for a process that reads the output from the previous step and loads this data into a target table.
- Employees Sorted specifies metadata for the target table.

SAS Data Integration Studio uses this metadata to generate SAS code that reads ALL_EMP, sorts this information, writes the sorted information to a temporary output table, and then writes it to the Employees Sorted table. Each process in a process flow diagram is specified by a metadata object called a transformation. In the example, SAS Sort and Table Loader are transformations. A transformation specifies how to extract data, transform data, or load data into data stores. Each transformation that you specify in a process flow diagram generates or retrieves SAS code. You can specify user-written code for any transformation in a process flow diagram. For more details about the process flow diagram shown in the preceding example, see Creating a Table That Sorts the Contents of a Source on page 279.

Jobs with User-Supplied Source Code

For all jobs except the read-only jobs that create cubes, you can specify user-written code for the entire job or for any transformation within the job. For details, see Chapter 12, Working with User-Written Code, on page 215.

Run Jobs

There are four ways to run a job:
- submit the job for immediate execution; see Submitting a Job for Immediate Execution on page 147
- deploy the job for scheduling; see About Job Scheduling on page 170
- deploy the job as a SAS stored process; see About SAS Stored Processes on page 177
- deploy the job as a SAS stored process that can be accessed by a Web service client; see About Deploying Jobs for Execution by a Web Service Client on page 182

Manage Job Status

After you have submitted the job, you can review and manage its status in the Job Status Manager window. The Job Status Manager displays all of the jobs that have been submitted and not cleared in the current SAS Data Integration Studio session. You can use this tool to view, clear, cancel, kill, and resubmit jobs. For details, see Using the Job Status Manager on page 159.

Creating an Empty Job

Problem

You want to create an empty job. After you have an empty job, you can create a process flow diagram by dragging and dropping tables and transformations into the Process Designer window.

Solution

Use the New Job wizard.

Tasks

Use the New Job Wizard

Perform the following steps to create an empty job:
1 Access the New Job wizard through one of the following methods:
  - Click Process Designer on the shortcuts bar.
  - Select Tools > Process Designer.
  - Select File > New > Object. Then, click New Job on the New Object Wizard window.
2 Enter an appropriate name for the job in the New Job wizard.
You can also enter an optional description of the job.
3 Click Next. Do not select tables to be loaded in this empty job. (You can use this window to select target tables for a job.)
4 Click Next. Select an appropriate group in the SAS Data Integration Studio Custom Tree.
5 Click Next, review the metadata for the job, and click Finish. The Process Designer displays an empty job.

Related Tasks

Save the new job before you close the Process Designer window. To save the job, select File > Save from the menu bar. After you have created an empty job, you can populate and execute the job.

Creating a Process Flow for a Job

Problem

You want to create a job to perform a business task. Then, you need to populate the job with the source tables, transformations, and target tables required to complete the task.

Solution

You can use the New Job Wizard to create an empty job. Then, you can populate the job in the Process Designer window with the source tables, transformations, and target tables that you need to accomplish your task.

Tasks

Create and Populate a Sample Job

To illustrate the process of creating and populating a job, we will build a sample job. Perform the following steps to create and populate the job:
1 Create an empty job. For information, see Creating an Empty Job on page 143.
2 Drop the SQL Join transformation from the Data Transformations folder in the Process Library tree into the empty job. You generally drop the transformation into the flow first. This way, the appropriate input and output drop zones are provided.
3 Drop the first source table on the input drop zone for the SQL Join transformation. Then, drop any additional source tables on the SQL Join transformation. The source tables can be tables or external files. However, they must be registered in SAS Data Integration Studio.
4 Drop the target table on the target drop zone.
The target table must be registered in SAS Data Integration Studio.
5 Delete the Table Loader transformation and the temporary worktable SQL Target from the job. If you keep the Table Loader and the worktable, you must configure two sets of mappings: one from the source tables to the worktable and another from the worktable to the target table. The extra processing required can degrade performance when the job is run. For information that will help you decide whether to delete the Table Loader transformation and the temporary worktable from a job, see Manage Temporary and Permanent Tables for Transformations on page 239. The following example shows the sample process flow.

Display 7.2 Sample Process Flow

Note: You can set global options for jobs on the Code Generation tab of the Options menu. The Options menu is available on the Tools menu on the SAS Data Integration Studio menu bar. You can set local options on the Options tab available on the properties window for each table. For detailed information, see Specifying Options for Jobs on page 211.

About Job Options

Options can be set for SAS Data Integration Studio, such as enabling parallel processing and configuring grid processing. Use the Options window to specify options for SAS Data Integration Studio. You can display this window by selecting Tools > Options from the menu bar. In most cases the appropriate options are selected by default. You can override the defaults by using one of the options in the following tables.

Table 7.1 Global Options for Jobs

Enable optional macro variables for new jobs: When selected, specifies that optional macro variables are to be included in the code that SAS Data Integration Studio generates for new jobs.

Enable parallel processing macros for new jobs: When selected, adds parallel processing macros to the code that is generated for all new jobs.

Default grid workload specification: Enables you to select a default workload specification value for all new jobs.
For example, if the grid is partitioned, you can designate that specific applications will run on designated servers. The grid workload specification consists of a string value that must match the name defined in the Platform Computing grid configuration files. These files are text files set up by administrators when they configure a grid.

Default maximum number of concurrent processes group box: Contains concurrent processes options.

One process for each available CPU node: When selected, sets the number of concurrent processes to one process for each available CPU node for all new jobs. Generally, this is the most effective setting.

Use this number: Specifies an exact number of concurrent processes to run for all new jobs.

Run all processes concurrently: When selected, runs all processes concurrently by using SAS load balancing for new jobs. Typically, this option is used only in a grid computing environment where a managing scheduler, such as Platform Computing software, is used to handle workload management for the grid. In a grid computing environment, you should also adjust your job slots to match your environment and perform other necessary tuning. Too many processes sent to an overloaded environment can dramatically reduce performance, and potentially cause deadlock.

You can set local options that apply to individual jobs by selecting the job and using the right mouse button to open the pop-up menu. Select Properties and then select the Options tab. These local options override global options for the selected job, but they do not affect any other jobs.

Table 7.2 Local Options for Jobs

Enable optional macro variables: When set to YES, specifies that optional macro variables are to be included in the code that SAS Data Integration Studio generates for the selected job.
This option overrides the global option with the same name.

Enable parallel processing macros: When set to YES, adds parallel processing macros to the code that is generated for the selected job. This option overrides the global option with the same name.

System Options: Enables you to set options by using a SAS OPTIONS statement.

Submitting a Job for Immediate Execution

Problem

You want to execute a job immediately.

Solution

You can submit jobs from the Process Designer window, from any tree on the SAS Data Integration Studio desktop, or from the Job Status Manager window. You can submit a job after you have defined its metadata. Until you submit a job, its output tables (or targets) might not exist on the file system. Note that you can open multiple jobs in multiple Process Designer windows and submit each job for execution. These jobs execute in the background, so you can do other tasks in SAS Data Integration Studio while a job is executing. Each job has its own connection to the SAS Application Server so that the jobs can execute in parallel.

Note: Two jobs that load the same target table should not be executed in parallel. They will either overwrite each other's changes, or they will try to open the target at the same time.

The SAS Application Server that executes the job must have been installed, and the appropriate metadata must have been defined for it. For details, see Selecting a Default SAS Application Server on page 55.

Tasks

Submit a Job from a Process Designer Window

You can submit a job that is displayed in a Process Designer window. Click Submit in the SAS Data Integration Studio Process menu. The job is submitted to the default SAS Application Server and to any server that is specified in the metadata for a transformation within the job.

Submit a Job from a Tree

You select a job in the Custom tree, the Metadata tree, or the Inventory tree.
After the job is selected, you can view it in the Process Designer window, submit it, or review its code. Perform the following steps if the job to be submitted is not displayed in the Process Designer:
1 Display the tree in the SAS Data Integration Studio desktop where the job is stored.
2 Navigate to the job in the selected tree. For example, jobs are located in the Jobs folder of the Inventory tree.
3 Right-click the job. Then, click Submit Job in the pop-up menu.

If you need to review the job before you submit it, perform the following steps:
- Right-click the job that you want to execute. Then, click View Job in the pop-up menu. You can also access View Job from the View menu in the menu bar. The process flow diagram for the job displays in the Process Designer.
- Right-click in the Process Designer. Then, click Submit in the pop-up menu.

Resubmit a Job from the Job Status Manager Window

You can resubmit any job that is listed in the Job Status Manager window. Right-click on the row for the job in the Job Status Manager window and click Resubmit from the pop-up menu. The job is submitted to the default SAS Application Server and to any server that is specified in the metadata for a transformation within the job.

Note: After the job is submitted, you can review and manage its status in the Job Status Manager window. The Job Status Manager window displays all jobs that have been submitted and not cleared in the current SAS Data Integration Studio session. For details, see Using the Job Status Manager on page 159.

If the job executes without error, you can view the data in one or more of the targets to verify that they have the appropriate content.
For details, see Browsing Table Data on page 109. If the job fails to execute without error, see Troubleshooting a Job on page 155.

Accessing Local and Remote Data

Access Data in the Context of a Job

You can access data implicitly in the context of a job, access data interactively in a job, or use the Data Transfer transformation to move data directly from one machine to another. When code is generated for a job, it is generated in the current context. The context includes the default SAS Application Server when the code was generated, the credentials of the person who generated the code, and other information. The context of a job affects the way that data is accessed when the job is executed. In order to access data in the context of a job, you need to understand the distinction between local data and remote data. Local data is addressable by the SAS Application Server when code is generated for the job. Remote data is not addressable by the SAS Application Server when code is generated for the job.

For example, the following data is considered local in the context of a job:
- data that can be accessed as if it were on one or more of the same computers as the SAS Workspace Server components of the default SAS Application Server
- data that is accessed with a SAS/ACCESS engine (used by the default SAS Application Server)

The following data is considered remote in a SAS Data Integration Studio job:
- data that cannot be accessed as if it were on one or more of the same computers as the SAS Workspace Server components of the default SAS Application Server
- data that exists in a different operating environment from the SAS Workspace Server components of the default SAS Application Server (such as MVS data that is accessed by servers running under Microsoft Windows)

Note: Avoid or minimize remote data access in the context of a SAS Data Integration Studio job.
Remote data has to be moved because it is not addressable by the relevant components in the default SAS Application Server at the time that the code was generated. SAS Data Integration Studio uses SAS/CONNECT and the UPLOAD and DOWNLOAD procedures to move data. Accordingly, it can take longer to access remote data than local data, especially for large data sets. It is especially important to understand where the data is located when using advanced techniques such as parallel processing because the UPLOAD and DOWNLOAD procedures would run in each iteration of the parallel process.

For information about accessing remote data in the context of a job, administrators should see 'Multi-Tier Environments' in the SAS Data Integration Studio chapter of the SAS Intelligence Platform: Desktop Application Administration Guide. Administrators should also see Deploying Jobs for Execution on a Remote Host on page 175. For details about the code that is generated for local and remote jobs, see the subheadings about LIBNAME statements and remote connection statements in Common Code Generated for a Job on page 152.

Access Data Interactively

When you use SAS Data Integration Studio to access information interactively, the server that is used to access the resource must be able to resolve the physical path to the resource. The path can be a local path or a remote path, but the relevant server must be able to resolve the path. The relevant server is the default SAS Application Server, a server that has been selected, or a server that is specified in the metadata for the resource. For example, in the source designers for external files, the Server tab in the Advanced File Location Settings window enables you to specify the SAS Application Server that is used to access the external file. This server must be able to resolve the physical path that you specify for the external file. As another example, assume that you use the View Data option to view the contents of a table in the Inventory tree.
If you want to display the contents of the table, the default SAS Application Server or a SAS Application Server that is specified in the library metadata for the table must be able to resolve the path to the table. In order for the relevant server to resolve the path to a table in a SAS library, one of the following conditions must be met:
- The metadata for the library does not include an assignment to a SAS Application Server, and the default SAS Application Server can resolve the physical path that is specified for this library.
- The metadata for the library includes an assignment to a SAS Application Server that contains a SAS Workspace Server component, and the SAS Workspace Server is accessible in the current session.
- The metadata for the library includes an assignment to a SAS Application Server, and SAS/CONNECT is installed on both the SAS Application Server and the machine where the data resides. For more information about configuring SAS/CONNECT to access data on a machine that is remote to the default SAS Application Server, administrators should see 'Multi-Tier Environments' in the SAS Data Integration Studio chapter of the SAS Intelligence Platform: Desktop Application Administration Guide.

Note: If you select a library that is assigned to an inactive server, you receive a Cannot connect to workspace server error. Check to make sure that the server assigned to the library is running and is the active server.

Use a Data Transfer Transformation

You can use the Data Transfer transformation to move data directly from one machine to another. Direct data transfer is more efficient than the default transfer mechanism. For example, assume that you have the following items:
- a source table on machine 1
- the default SAS Application Server on machine 2
- a target table on machine 3

You can use SAS Data Integration Studio to create a process flow diagram that moves data from the source on machine 1 to the target on machine 3.
By default, SAS Data Integration Studio generates code that moves the source data from machine 1 to machine 2 and then moves the data from machine 2 to machine 3. This is an implicit data transfer. For large amounts of data, this might not be the most efficient way to transfer data. You can add a Data Transfer transformation to the process flow diagram to improve a job's efficiency. The transformation enables SAS Data Integration Studio to generate code that migrates data directly from the source machine to the target machine. You can use the Data Transfer transformation with a SAS table or a DBMS table whose table and column names follow the standard rules for SAS names. For an example of how you can use a Data Transfer transformation, see the 'Example: Move Data Directly from One Machine to Another Machine' Help topic in SAS Data Integration Studio. Also see the 'Data Transfer Will Not Work for DBMS Tables With Special Characters in Table Names' section in the 'SAS Data Integration Studio Usage Notes' Help topic.

Viewing or Updating Job Metadata

Problem

You want to view or update the metadata that is associated with a job. All jobs have basic properties that are contained in metadata that is viewed from the job properties window. If you want SAS Data Integration Studio to generate code for the job, the job must also have a process flow diagram. If you supply the source code for a job, no process flow diagram is required.
However, you might want to create one for documentation purposes.

Solution

You can find metadata for a job in its properties window or process flow diagram.

Tasks

View or Update Basic Job Properties

Perform the following steps to view or update the metadata that is associated with the job properties window:
1 Open the Jobs folder in the Inventory tree on the SAS Data Integration Studio desktop.
2 Right-click the desired job and click Properties to access the properties window for the job.
3 Click the appropriate tab to view or update the desired metadata.

For details about the metadata that is maintained on a particular tab, click the Help button on that tab. The Help topics for complex tabs often include task topics that can help you perform the main tasks that are associated with the tab.

View or Update the Job Process Flow Diagram

Perform the following steps to view or update the process flow diagram for a job:
1 Open the Jobs folder in the Inventory tree on the SAS Data Integration Studio desktop.
2 Right-click the desired job and click View Job to display the process flow diagram for the job in the Process Designer window. View or update the metadata displayed in the process flow diagram as follows:
  - To update the metadata for tables or external files in the job, see Viewing or Updating Table Metadata on page 89 or Viewing or Updating External File Metadata on page 134.
  - To update the metadata for transformations in the job, see Viewing or Updating the Metadata for Transformations on page 197.
  - To add a transformation to a process flow diagram, select the transformation and drop it in the Process Designer window.

Note: Updates to job metadata are not reflected in the output for that job until you rerun the job. For details about running jobs, see Submitting a Job for Immediate Execution on page 147.

Displaying the SAS Code for a Job

Problem

You want to display the SAS code for a job.
(To edit the SAS code for a job, see About User-Written Code on page 216.)

Solution

You can display the SAS code for a job in the Source Editor tab of the Process Designer window or in the View Code window. In either case, SAS Data Integration Studio must be able to connect to a SAS Application Server with a SAS Workspace Server component to generate the SAS code for a job. See Connecting to a Metadata Server on page 51.

Tasks

View SAS Code in the Source Editor Tab

You can view the code for a job that is currently displayed in the Process Designer window. To do this, click the Source Editor tab. The job is submitted to the default SAS Application Server and to any server that is specified in the metadata for a transformation within the job. The code for the job is displayed in the Source Editor tab.

View SAS Code in the View Code Window

Perform the following steps to view the code for a job that is not displayed in the Process Designer window:
1 Expand the Jobs folder in the Inventory tree on the SAS Data Integration Studio desktop.
2 Right-click the job that you want to view, and then select View Code from the pop-up menu.

The job is submitted to the default SAS Application Server and to any server that is specified in the metadata for a transformation within the job. The job is opened in the Process Designer window, and the code for the job is displayed in the Source Editor tab.

Common Code Generated for a Job

Overview

When SAS Data Integration Studio generates code for a job, it typically generates the following items:
- a LIBNAME statement for each table in the job
- a SYSLAST macro statement at the end of each transformation in the job
- remote connection statements for any remote execution machine that is specified in the metadata for a transformation within a job
- macro variables for status handling

The generated code includes the user name and password of the person who created the job.
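To make the items in this list concrete, the following is a simplified, hypothetical sketch of the kind of code that might be generated for the sort job shown in Display 7.1. The libref, path, table names, and the status macro variable name are illustrative only; the exact code that SAS Data Integration Studio emits differs.

```
/* LIBNAME statement for a table in the job (path is illustrative) */
libname srclib base "/data/hr";

/* SAS Sort transformation: read the source, write a temporary output table */
proc sort data=srclib.all_emp out=work.sort_target;
   by employee_name;
run;

/* SYSLAST macro statement: make the output the input of the next step */
%let SYSLAST = work.sort_target;

/* Status-handling macro variable (name is illustrative) */
%let trans_rc = &syserr;
```

The essential pattern is that each transformation ends by pointing SYSLAST at its last output table, so the next transformation in the flow can pick up `&SYSLAST` as its input.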
You can set options for the code that SAS Data Integration Studio generates for jobs and transformations. For details, see Specifying Options for Jobs on page 211.

LIBNAME Statements

When SAS Data Integration Studio generates code for a job, a library is considered local or remote in relation to the SAS Application Server that executes the job. If the library is stored on one of the machines that is specified in the metadata for the SAS Application Server that executes the job, it is local. Otherwise, it is remote. SAS Data Integration Studio generates the appropriate LIBNAME statements for local and remote libraries.

The following syntax is generated for a local library:

libname libref ... ;

The following syntax is generated for a remote library:

options comamid=connection_type;
%let remote_session_id=host_name ;
signon remote_session_id ;
rsubmit remote_session_id;
libname libref ... ;
endrsubmit;

SYSLAST Macro Statements

When the Create SYSLAST Macro Variable option for a transformation is set to YES, a SYSLAST macro statement of the following form is generated at the end of the transformation:

%let SYSLAST = transformation_output_table_name;

where transformation_output_table_name is the name of the last output table created by the transformation. The SYSLAST macro variable is used to make transformation_output_table_name the input for the next step in the process flow. In most cases, this setting is appropriate. Setting the value to NO is appropriate when you have added a transformation to a process flow, and that transformation does not produce output, or it produces output that should not become the input to the next step in the flow. The following example illustrates a sample process flow.

Display 7.3 Process Flow with a Custom Error Handling Transformation

In this example, the Custom Error Handling transformation contains user-written code that handles errors from the Extract transformation, and the error-handling code does not produce output that should become the input to the target table, ALL_MALE_EMP. Instead, the output from the Extract transformation should become the input to ALL_MALE_EMP. In that case, you would do the following:
- Leave the Create SYSLAST Macro Variable option set to YES for the Extract transformation.
- Set the Create SYSLAST Macro Variable option to NO for the Custom Error Handling transformation.

Remote Connection Statements

Each transformation within a job can specify its own execution host. When SAS Data Integration Studio generates code for a job, a host is considered local or remote in relation to the SAS Application Server that executes the job. If the host is one of the machines that is specified in the metadata for the SAS Application Server that executes the job, it is local. Otherwise, it is remote. A remote connection statement is generated if a remote machine has been specified as the execution host for a transformation within a job, as shown in the following sample statement:

options comamid=connection_type;
%let remote_session_id=host_name ;
signon remote_session_id ;
rsubmit remote_session_id;
... SAS code ...
endrsubmit;

Macro Variables for Status Handling

When SAS Data Integration Studio generates the code for a job, the code includes a number of macro variables that can be used to monitor the status of jobs. For details, see Chapter 8, Monitoring Jobs, on page 157.

User Credentials in Generated Code

The code that is generated for a job contains the credentials of the person who created the job. If a person's credentials are changed and a deployed job contains outdated user credentials, the deployed job fails to execute. The solution is to redeploy the job with the appropriate credentials.
For details, see About Job Scheduling on page 170.

Troubleshooting a Job

Problem

You have run a job that has failed to execute without error, and you need to determine why the job did not execute successfully.

Solution

You can troubleshoot a job that has failed to execute without error by viewing the contents of the Log tab in the Process Designer window.

Tasks

Display the Log tab

Perform the following steps if the Log tab does not currently display in the Process Designer window:
1 Display the Options window by using one of the following methods:
  - Click the Options item in the Shortcut Bar.
  - Click Options in the Tools menu on the SAS Data Integration Studio menu bar.
2 Select the Show Log tab in Process Designer check box on the General tab. Then, click OK to save the setting.

Review the Log for a Job

Perform the following steps if a job fails to execute without error:
1 Click the Log tab in the Process Designer window.
2 Scroll through the SAS log information on the Log tab that was generated during the execution of the job. Locate the first error in the log and try to correct the problem. For example, if there are errors in the metadata for a table or transformation in the job, select it in the process flow diagram. Then, right-click it and click Properties in the pop-up menu to display the properties window. Correct the metadata and resubmit the job until you cannot find an error.
3 After the job runs without error, right-click anywhere in the Process Designer window and click Save to save the job.

Note: SAS Data Integration Studio includes a Job Status Manager that allows you to review and manage multiple jobs. For information, see Using the Job Status Manager on page 159.
Chapter 8
Monitoring Jobs

About Monitoring Jobs
Prerequisites for Job Status Code Handling
Using the Job Status Manager
   Problem
   Solution
   Tasks
      Review a Job in the Job Status Manager
Managing Status Handling
   Problem
   Solution
   Tasks
      Use the Status Handling Tab
Managing Return Code Check Transformations
   Problem
   Solution
   Tasks
      Use Return Code Check Transformations
Maintaining Status Code Conditions and Actions
   Problem
   Solution
   Tasks
      Manage Default Actions

About Monitoring Jobs

There are four ways to monitor the status of jobs and transformations in SAS Data Integration Studio:

Table 8.1 Methods for Monitoring Jobs and Transformations

If you want to check the status of the last five jobs that have been submitted in the current session:
   Use the Job Status Manager, described in "Using the Job Status Manager" on page 159.

If you want to define a specific action based on the return code of a job or of some transformations:
   Use status code handling, described in "Managing Status Handling" on page 161.

If you want to monitor the status of a transformation that does not have a Status Handling tab:
   Insert a Return Code Check transformation into the process flow in order to check the transformation's return code. See "Managing Return Code Check Transformations" on page 162.

If you want to monitor the status of a Lookup transformation, which has some advanced methods for handling exceptions:
   See the online Help for the Lookup transformation.

When you execute a job in SAS Data Integration Studio, a return code for each transformation in the job is captured in a macro variable. The return code for the job is set according to the least successful transformation in the job.

SAS Data Integration Studio enables you to associate a return code condition, such as Successful, with an action, such as Send Email or Send Event. In this way, you can specify how a return code is handled for the job or transformation.
For example, you can associate a return code with an action that performs one or more of these tasks:
- Terminate the job or transformation.
- Call a user-defined SAS macro.
- Send a status message to a person, a file, or an event broker that then passes the status code to another application.

You can also use status handling to capture job statistics, such as the number of records before and after the append of the last table loaded in the job. To capture statistics about a job, associate a return code with the Send Job Status action.

The Status Handling tab, which is included in the properties windows for jobs and for some transformations, is used to associate a return code condition with an action. See "Use the Status Handling Tab" on page 161.

The properties windows for most transformations do not have a Status Handling tab. To return the status of a transformation that does not have a Status Handling tab, you can use a Return Code Check transformation to insert status-handling logic at a desired point in the process flow diagram for a job. See "Use Return Code Check Transformations" on page 163.

Prerequisites for Job Status Code Handling

When you execute a job in SAS Data Integration Studio, notification of its success or failure can be automatically e-mailed to a person, written to a file, or sent to an event broker that passes job status on to another application. To support status code handling in general, the SAS Application Server that is used to execute jobs must have a SAS Workspace Server component that is invoked from Base SAS 9.1.2 or later. The following message is displayed if you attempt to use the Status Handling tab and the current metadata repository is not set up correctly for status handling:

   The metadata server does not contain the needed information to support this feature. Please notify your metadata administrator to add the default initialization metadata.
If you see this message, an administrator must use the Metadata Manager plug-in to SAS Management Console to initialize the repository.

A Status Handling tab is included in the properties windows for jobs and for some transformations. Users can select from a list of code conditions and actions on the Status Handling tab. For example, you can select a code condition such as Successful and associate it with an action, such as Send Email or Send Event. The prerequisites listed in the following table apply to specific default actions.

Table 8.2 Status Handling Actions

Email
   SAS system options for e-mail must be set for the SAS Application Server that is used to execute jobs. For details about the relevant e-mail options, see the "Administering SAS Data Integration Studio" chapter in the SAS Intelligence Platform: Desktop Application Administration Guide.

Send Event
   SAS Foundation Services must be installed, and the Event Broker Service must be properly configured for the SAS solution that will receive the events. For details about setting up the Event Broker Service so that a SAS solution can receive events from SAS Data Integration Studio, see the documentation for the SAS solution.

Custom
   An appropriate SAS Autocall macro library must be accessible by the SAS Application Server that is used to execute the job. For details about making Autocall macro libraries available to SAS Data Integration Studio, see the "Administering SAS Data Integration Studio" chapter in the SAS Intelligence Platform: Desktop Application Administration Guide.

Send Entry to a Data Set
   The library that contains the data set must be assigned before the job or transformation is executed.
   To assign a library within SAS Data Integration Studio, you can select the Pre and Post Process tab in the properties window for the job or transformation and then specify a SAS LIBNAME statement as a preprocess. To assign a library outside of SAS Data Integration Studio, you can preassign the library to the SAS Application Server that is used to execute the job. Some tasks that are associated with preassigning a SAS library must be done outside of SAS Data Integration Studio or SAS Management Console. For details, see the "Assigning Libraries" chapter in the SAS Intelligence Platform: Data Administration Guide.

Send Job Status
   Saves status messages to a SAS data set. Consecutive messages are appended to the data set with a timestamp. If the data set does not exist, it is created the first time that the job is executed.

Using the Job Status Manager

Problem
You have submitted one or more jobs for execution and need to review the status of the jobs. Depending on the status, you might want to clear, kill, resubmit, or perform another action on the job.

Solution
You can display a pop-up menu that shows the status of the last five unique jobs that have been submitted in the current SAS Data Integration Studio session. You can also display the Job Status Manager window, which shows the name, status, starting time, ending time, and application server used for all jobs submitted in the current session.

Tasks

Review a Job in the Job Status Manager
Perform the following steps to review a job in the Job Status Manager window:
1 On the SAS Data Integration Studio desktop, select the Inventory tab.
2 In the Inventory tree, open the Jobs folder.
3 Select Tools > Job Status Manager. A list is displayed of the last five unique jobs that have been submitted in the current SAS Data Integration Studio session and the status for each of them.
4 Right-click on a job.
A pop-up menu enables you to clear, view, cancel, kill, or resubmit the job.
5 Right-click on any column heading to sort the list of jobs.

You can click Clear All to clear all of the completed jobs that are displayed in the Job Status Manager table. You can also right-click in the Job Status Manager window to access a pop-up menu that enables you to control the sort order and manage jobs. The following table explains the pop-up menu. The options that you see vary according to the context.

Table 8.3 Job Status Manager Window Pop-Up Menu Options

Cancel Job
   Available when you: right-click on a currently running job.
   Description: Stops the selected running job after the current SAS procedure or DATA step completes.

Clear
   Available when you: right-click on a completed job.
   Description: Clears the selected job and disconnects the application server. Available only for completed jobs.

Clear All
   Available when you: right-click on any job or in any blank space within the Job Status Manager window.
   Description: Clears all completed jobs. Disconnects the application server from the cleared jobs. Affects only completed jobs.

Kill Job
   Available when you: right-click on a currently running job.
   Description: Stops the selected running job after five seconds by killing the server session for the job.

Hide Column
   Available when you: right-click on a column heading.
   Description: Hides the selected column from view in the Job Status Manager window.

Resubmit Job
   Available when you: right-click on a completed job.
   Description: Resubmits the code for a selected job. Available only for completed jobs.

Show
   Available when you: right-click on any column heading.
   Description: Opens a submenu that displays the Show Hidden Column and Show All Columns options.

Show All Columns
   Available when you: right-click on any column heading and select the Show option.
   Description: Displays all of the columns that have been hidden with the Hide Column option.

Show Hidden Column
   Available when you: right-click on any column heading and select the Show option.
   Then click on the name of the column that you want to show.
   Description: Displays a selected column that has been hidden with the Hide Column option.

Sort Ascending
   Available when you: right-click on any column heading.
   Description: Sorts the jobs in ascending order.

Sort Descending
   Available when you: right-click on any column heading.
   Description: Sorts the jobs in descending order.

Toggle Sort
   Available when you: left-click on any column heading.
   Description: Toggles the sort order of the jobs between ascending and descending. Also displays an arrow in the column heading to indicate the sort direction.

View Job
   Available when you: right-click on any job.
   Description: Displays the Process Editor window for the selected job.

Managing Status Handling

Problem
You need to specify a given action based on the status code returned by a job (or, in some cases, a transformation).

Solution
A Status Handling tab is included in the properties windows for jobs and for some transformations within a job. If the prerequisites for status handling have been met when the job or transformation is executed, the actions that are specified on the Status Handling tab are carried out. For example, assume that you want to capture statistics about a job, such as the number of records before and after the append of the last table loaded in the job. You would perform the following steps:
1 Display the properties window for the job.
2 Click the Status Handling tab.
3 Click the Errors code condition.
4 Click the Send Job Status action.

Tasks

Use the Status Handling Tab
Perform the following steps to manage status codes and options for a job:
1 On the SAS Data Integration Studio desktop, select the Inventory tab.
2 Open the Jobs folder in the Inventory tree.
3 If you want to specify status handling for the job, select the job in the project tree and then select File > Properties from the menu bar.
The properties window for the job is displayed.
   If you want to specify status handling for a transformation within the job, select the job in the project tree and then select View > View Job from the menu bar. The job process flow diagram is displayed in the Process Editor window. In the process flow diagram, select the transformation for which you plan to specify status handling and then select File > Properties from the menu bar. The properties window for the transformation is displayed.
4 Click Status Handling in the properties window for the job or transformation.
5 Click New to add a row that consists of one code condition and one action.
6 Select the Code Condition field to display the selection arrow. Click a code condition, such as Successful.
7 Select the Action field to display the selection arrow. Click an action that should be performed when the condition is met, such as Send Email. (The default actions are described in "Maintaining Status Code Conditions and Actions" on page 165.)
8 When you select most actions, the Action Options window is displayed. Use this window to specify options, such as an e-mail address and a message to include in the body of the e-mail. For details about action options, select Help in the Action Options window.
9 Click OK when you are finished.

Managing Return Code Check Transformations

Problem
You want to check the status of a transformation in a job when that transformation does not have a Status Handling tab.

Solution
You can insert status-handling logic where you want it in the process flow diagram for a job by using a Return Code Check transformation. In the example presented here, a Return Code Check transformation is inserted into the Employee Sort job. Assume the following:
- The Employee Sort job is registered in a current metadata repository.
- You want to return the status of the Sort transformation in the process flow diagram for the Employee Sort job.
- Status messages for the Sort transformation will be written to a file in the directory c:\output.
- Prerequisites for status handling have been met, as described in "Prerequisites for Job Status Code Handling" on page 158.

Tasks

Use Return Code Check Transformations
1 On the SAS Data Integration Studio desktop, select the Inventory tab.
2 In the Inventory tree, open the Jobs folder. Right-click Employee Sort, and click View Job in the pop-up menu. The job is opened in the Process Editor window.
3 Open the Process Library tree. Then, open the Control folder.
4 In the Control folder, select, drag, and drop the Return Code Check transformation over the arrow that is between the Sort Target and Loader transformations. The process flow diagram now resembles the following display. The Return Code Check transformation captures the status of the previous transformation in the process flow, which in this case is the Sort transformation.

Display 8.1 Process Flow Diagram with a Return Code Check Transformation

5 Right-click the Return Code Check transformation and select Properties from the pop-up menu. The properties window is displayed.
6 Click the Status Handling tab.
7 Click New. A new code condition and action row is added, as shown in the following display. The code condition Successful and the action None are the default values. If these values were saved without change, no action would be taken whenever the previous transformation in the process flow was successfully executed. For this example, the default action is changed so that a status message is written to a file.

Display 8.2 Status Handling Tab

8 Select the Action field. A selection arrow appears.
9 Use the selection arrow to select an appropriate action. In this example, the Send Entry to Text File action is used. An Action Options window is displayed for this action because you must enter both a fully qualified path to the file that contains the status messages and the text of the message.
The options window is shown in the following display.

Display 8.3 Action Options Window

10 When you finish entering the options for this action, click OK. The options are saved, and you are returned to the Status Handling tab.
11 Click OK to close the properties window.
12 Right-click the job in the project tree, and select Submit Job. The server executes the SAS code for the job. If the job completes without error, go to the next step. If error messages appear, respond to the messages and try again.
13 View the status handling output. To verify that a status message was written to a file, go to the directory where the message file was to have been written. In the current example, look for a file with the physical path c:\output\Sort_Emp_Job_Status.txt. Open the file and view any status messages. In the current example, the message resembles the following: 10FEB04:17:40:23 Sort Successful. If you can view the appropriate status message, the job produced the desired result.

Maintaining Status Code Conditions and Actions

Problem
You want to edit the default conditions and actions that appear in the Status Handling tab. However, you cannot edit these conditions and actions in the current release. Moreover, user-defined code conditions and actions are not fully supported. Fortunately, you can prevent an inappropriate action from being displayed in the Status Handling tab. For example, assume that your site has no plans to implement an event broker.
You can prevent all Send Event actions from showing up in the Status Handling tab so that users cannot select them.

Solution
To prevent an action from showing up in the Status Handling tab, use the Configure Status Handling window to delete all code condition and action combinations that include the action that you want to exclude. For example, if you want to exclude all Send Event actions, you can delete all code condition and action combinations that include Send Event. You can use this method only when the status handling prerequisites have been met.

Tasks

Manage Default Actions
Perform the following steps to prevent an inappropriate action from being displayed in the Status Handling tab:
1 From the SAS Data Integration Studio desktop, select Tools > Configure Status Handling. The Configure Status Handling window is displayed in update mode.
2 To delete a condition and action combination, in the Available code conditions/action combinations pane, select the desired combination and then select Delete. The condition and action combination is no longer available in the Status Handling tab.
3 Click OK.

Users can select from a list of return code conditions and actions in the Status Handling tab for jobs and transformations. The following table provides a list of all the available actions. Note, however, that not all of the actions are available to all transformations.
For a list of which actions are available for each type of transformation, see the SAS Data Integration Studio Help.

Table 8.4 Default Actions

Abort
   Terminates the job or transformation.

Abort All Processes (Executing and Remaining)
   Terminates all of the currently executing and remaining processes.

Abort Remaining Iterations
   Terminates all of the remaining iterations of a loop without affecting currently executing iterations.

Abort After Looping
   Completes all of the processes in the loop and then terminates the job.

Custom
   Enables you to call a macro in a SAS Autocall library to provide user-defined status handling for a job or transformation. The SAS Application Server that executes the job in which custom action code is used must be able to access the relevant Autocall library. If the library contains a macro that is named SENDCUSTOM (a macro that would perform a status handling action), you can use a Custom action to call that macro. Enter a call to a status handling macro in a SAS Autocall library, such as the following: %sendcustom;.

Email Exception Report
   E-mails an exception report that lists the column name, type of exception, action taken, and related information. Enter one or more recipient e-mail addresses in the Options window for the Email Exception Report action. Separate multiple e-mail addresses with a semicolon.

Save Exception Report
   Saves an exception report to a file. The report lists the column name, type of exception, action taken, and related information. Enter a fully qualified path to the file that contains the report in the Options window for the Save Exception Report action. The path to the report file must be accessible to the SAS Application Server that is executing the job.

Save Table
   Saves status messages to a table. Consecutive messages are appended to the table with a timestamp. Enter the table name in the libref.dataset SAS format, such as targetlib.allemp.
   Note that the libref must be assigned before the job or transformation executes.

Send Email
   E-mails a message that you specify. Enter one or more recipient e-mail addresses and a message in the options window for the Send Email action. To specify more than one e-mail address, enclose the group of addresses in parentheses, enclose each address in quotation marks, and separate the addresses with a space, as in ("[email protected]" "[email protected]"). Any text in the Message field that includes white space must be enclosed in single quotation marks so that the mail is processed correctly.

Send Entry to Text File
   Saves status messages to a file. Consecutive messages are appended to the file with a timestamp. Enter a fully qualified path to the file that contains the status messages and the text of the message in the Options window for the Send Entry to Text File action. The path must be accessible to the SAS Application Server that is executing the job.

Send Entry to Dataset
   Saves status messages to a SAS data set. Consecutive messages are appended to the data set with a timestamp. Enter a libref for the SAS library where the status messages are to be written, a data set name for the message, and the text of the message in the options window for the Send Entry to Dataset action. The libref in the Libref field must be assigned before the job or transformation executes.

Send Event
   Sends a status message to an event broker, which sends the message to applications that have subscribed to the broker. The subscribing applications can then respond to the status of the SAS Data Integration Studio job or transformation. For details about the options for the Send Event action, see the SAS Data Integration Studio Help for the Event Options window.

Send Job Status
   Saves status messages to a SAS data set. Consecutive messages are appended to the data set with a timestamp. If the data set does not exist, it is created the first time that the job is executed.
   Enter a libref for the SAS library where the status messages are to be written and a data set name for the message in the Options window for the Send Job Status action. The libref in the Libref field must be assigned before the job or transformation executes. When the Send Job Status action is selected, the following values are captured:
   - job name
   - job status
   - job return code
   - number of records before append of the last table loaded in the job
   - number of records after append of the last table loaded in the job
   - the library and table name of the last table loaded in the job
   - the user who ran the job
   - the time the job started and finished

Chapter 9
Deploying Jobs

About Deploying Jobs
About Job Scheduling
Prerequisites for Scheduling
Deploying Jobs for Scheduling
   Problem
   Solution
   Tasks
      Deploy a Job for Scheduling
      Schedule a Job
Redeploying Jobs for Scheduling
   Problem
   Solution
   Tasks
      Redeploy a Job for Scheduling
      Reschedule the Job
Using Scheduling to Handle Complex Process Flows
   Problem
   Solution
   Tasks
      Schedule Complex Process Flows
Deploying Jobs for Execution on a Remote Host
   Problem
   Solution
   Tasks
      Deploy One or More Jobs for Execution on a Remote Host
About SAS Stored Processes
Prerequisites for SAS Stored Processes
Deploying Jobs as SAS Stored Processes
   Problem
   Solution
   Tasks
      Deploy a Job as a Stored Process
      Execute the Stored Process for a Job
Redeploying Jobs to Stored Processes
   Problem
   Solution
   Tasks
      Redeploy a Selected Job with a Stored Process
      Redeploy Most Jobs with Stored Processes
Viewing or Updating Stored Process Metadata
   Problem
   Solution
   Tasks
      Update the Metadata for a Stored Process
About Deploying Jobs for Execution by a Web Service Client
Requirements for Jobs That Can Be Executed by a Web Service Client
   Overview of Requirements
   Process Flow Requirements
   Data Format for Web Client Inputs and Outputs
   Libraries and Librefs for Web Client Inputs and Outputs
   Web Streams for Web Client Inputs and Outputs
   (Optional) Parameters for User Input
Creating a Job That Can Be Executed by a Web Service Client
   Problem
   Solution
   Tasks
      Create a Job That Can Be Executed by a Web Service Client
Deploying Jobs for Execution by a Web Service Client
   Problem
   Solution
   Tasks
      Deploy a Job for Execution by a Web Service Client
Using a Web Service Client to Execute a Job
   Problem
   Solution
   Tasks
      Use a Web Service Client to Execute a Job

About Deploying Jobs

In a production environment, SAS Data Integration Studio jobs must often be executed outside of SAS Data Integration Studio. For example, a job might have to be scheduled to run at a specified time, or a job might have to be made available as a stored process. Accordingly, SAS Data Integration Studio enables you to do the following tasks:
- Deploy a job for scheduling; see "About Job Scheduling" on page 170.
- Deploy a job as a SAS stored process; see "About SAS Stored Processes" on page 177.
- Deploy a job as a SAS stored process that can be accessed by a Web service client; see "About Deploying Jobs for Execution by a Web Service Client" on page 182.

You can also deploy a job in order to accomplish the following tasks:
- Divide a complex process flow into a set of smaller flows that are joined together and can be executed in a particular sequence; see "Using Scheduling to Handle Complex Process Flows" on page 174.
- Execute a job on a remote host; see "Deploying Jobs for Execution on a Remote Host" on page 175.

About Job Scheduling

You can select a job in the Inventory tree or the Custom tree on the SAS Data Integration Studio desktop and deploy it for scheduling. Code is generated for the job, and the generated code is saved to a file in a source repository. Metadata about the deployed job is saved to the current metadata repository.
After a job has been deployed for scheduling, the person responsible for scheduling jobs can use the appropriate software to schedule the job for execution by a SAS Workspace Server.

The main tasks that are associated with deploying a job for scheduling are as follows:
- "Deploying Jobs for Scheduling" on page 171
- "Redeploying Jobs for Scheduling" on page 173
- "Using Scheduling to Handle Complex Process Flows" on page 174

Prerequisites for Scheduling

To deploy a job for scheduling in SAS Data Integration Studio, you need the following components:
- a scheduling server
- a SAS Workspace Server that is used to deploy jobs for scheduling
- a source repository for the code that is generated from a SAS Data Integration Studio job

Typically, an administrator installs, configures, and registers the appropriate servers and source code repositories. The administrator then tells SAS Data Integration Studio users which server and source repository to select when deploying jobs for scheduling. For more information, see the scheduling chapters in the SAS Intelligence Platform: System Administration Guide.

Deploying Jobs for Scheduling

Problem
You want to schedule a SAS Data Integration Studio job to run in batch mode at a specified date and time.

Solution
Scheduling a job is a two-stage process:
- Deploy the job for scheduling. See "Deploy a Job for Scheduling" on page 171.
- Schedule the job for execution. See "Schedule a Job" on page 173.

If possible, test a job and verify its output before deploying the job for scheduling.

Tasks

Deploy a Job for Scheduling
Perform the following steps to deploy a job for scheduling:
1 From the SAS Data Integration Studio desktop, select the Inventory tab in the tree view.
2 In the Inventory tree, expand the Jobs folder.
3 Select the job that you want to deploy. Then, right-click to display the pop-up menu, and select Deploy for Scheduling. The Deploy for Scheduling window is displayed.
If you select only one job, the window resembles the following display.

Display 9.1 Deploy for Scheduling Window for a Single Job

   By default, the deployed job file (in this case, Extract Balances Job.sas) is named after the selected job. If you select more than one job, the window resembles this display.

Display 9.2 Deploy for Scheduling Window for Multiple Jobs

   When you deploy more than one job, a separate file is created for each job that you select. Each deployed job file is named after the corresponding job.
4 In the SAS server field, accept the default server or select the server that is used to generate and store code for the selected job. The next step is to select the job deployment directory. One or more job deployment directories (source repositories) were defined for the selected server when the metadata for that server was created.
5 Click Select. The Deployment Directories window is displayed.
6 In the Deployment Directories window, select a directory where the generated code for the selected job will be stored. Then, click OK. The Deploy for Scheduling window is displayed. The directory that you selected is specified in the Directory name field.
7 If you selected one job, you can edit the default name of the file that contains the generated code for the selected job in the File name field. The name must be unique in the context of the directory specified in the Directory name field.
8 When you are ready to deploy the job or jobs, click OK.

Code is generated for the selected job or jobs and is saved to the directory that is specified in the Directory name field. Metadata about the deployed jobs is saved to the current metadata repository. A status window is displayed and indicates whether the deployment was successful. In the Inventory tree, the icon beside each deployed job includes the image of a small clock.
The clock indicates that the jobs are now available for scheduling.

Schedule a Job
After a job has been deployed for scheduling, the person responsible for scheduling can use the appropriate software to schedule the job. For more information, see the scheduling chapters in the SAS Intelligence Platform: System Administration Guide.

Redeploying Jobs for Scheduling

Problem
After deploying a job for scheduling and then updating the job, you need to redeploy the job so that the latest version of the job is scheduled. You also need to redeploy jobs if the computing environment changes, for example, when a job is exported from a test environment and is imported into a production environment.

Solution
Use the Redeploy Jobs for Scheduling feature to find any jobs that have been deployed for scheduling, regenerate the code for these jobs, and save the new code to a job deployment directory. The redeployed jobs can then be rescheduled. Rescheduling a job is a two-stage process:
- Redeploy the job for scheduling. See "Redeploy a Job for Scheduling" on page 174.
- Reschedule the job for execution. See "Reschedule the Job" on page 174.

Tasks

Redeploy a Job for Scheduling
Perform the following steps to redeploy a job for scheduling:
1 From the SAS Data Integration Studio desktop, select Tools > Redeploy for Scheduling. Any jobs that have been deployed are found. A dialog box is displayed, citing the number of deployed jobs found and prompting you to click OK to continue.
2 Click OK. A window is displayed that has various options for maintaining a metadata profile. Code is generated for all deployed jobs and saved to the job deployment directory for the SAS Application Server that is used to deploy jobs. The regenerated code contains references to servers and libraries that are appropriate for the current metadata repositories.
The regenerated jobs are now available for scheduling.

Reschedule the Job
After a job has been redeployed from SAS Data Integration Studio, the person responsible for scheduling can use the appropriate software to reschedule the job. For example, you can use the Schedule Manager in SAS Management Console to reschedule all flows, resubmitting all job flows that have been submitted to the scheduling server. For details, see the online Help for the Schedule Manager in SAS Management Console.

Using Scheduling to Handle Complex Process Flows

Problem
You have a complex job involving joins and transformations from many different tables. You want to reduce the complexity by creating a set of smaller jobs that are joined together and can then be executed in a particular sequence.

Solution
Group all of the jobs in the flow together in a single folder in the Custom tree. Perform the steps in "Schedule Complex Process Flows" on page 174 to deploy and schedule the jobs in the proper sequence.

Tasks

Schedule Complex Process Flows
Perform the following steps to schedule complex process flows:
1 Divide the complex job into a series of smaller jobs that create permanent tables. Those tables can then be used as input for succeeding jobs.
2 Keep all of your jobs in the flow together in a single folder in the Custom tree, and give the jobs a prefix that causes them to be displayed in the appropriate execution order.
3 Deploy the jobs for scheduling.
4 The person responsible for scheduling can use the appropriate software to schedule the jobs to be executed in the proper sequence.

Deploying Jobs for Execution on a Remote Host

Problem
You want to execute one or more SAS Data Integration Studio jobs that process a large amount of data on a remote machine and then save the results to that remote machine.
In this case, it might be efficient to move the job itself to the remote machine.

Solution

Administrators must configure the servers that are described in the "Processing Jobs Remotely" section of the SAS Intelligence Platform: Desktop Application Administration Guide. A SAS Data Integration Studio user can then use the Deploy for Scheduling window to deploy a job for execution on the remote host. Code is generated for the job, and the generated code is saved to a file on the remote host. After a job has been deployed to the remote host, it can be executed by any convenient means.

For example, assume that the default SAS Application Server for SAS Data Integration Studio is called SASMain, but you want a job to execute on another SAS Application Server that is called DEZ_App Server. You would select DEZ_App Server in the Deploy for Scheduling window, and the code that is generated for the job will be local to DEZ_App Server.

Tasks

Deploy One or More Jobs for Execution on a Remote Host

Perform the following tasks to deploy jobs for execution on a remote host:
1. From the SAS Data Integration Studio desktop, select Repositories > Foundation > Jobs in the Inventory tree.
2. Right-click the job or jobs that you want to deploy. Then, select Deploy for Scheduling. The Deploy for Scheduling window is displayed. If you selected only one job, the window resembles the following display:

Display 9.3 Scheduling Window for One Job

If you selected more than one job, the window resembles the following display:

Display 9.4 Scheduling Window for More Than One Job

3. In the drop-down list in the SAS Server field, select the SAS Application Server that contains the servers on the remote host.
4. In the Directory field, use the down arrow to select a predefined directory where the generated code for the selected job will be stored.
5. If you selected one job, you can edit the default name of the file that contains the generated code for the selected job in the File name field.
The name must be unique in the context of the directory that is specified above. Click OK to deploy the job.

If you selected more than one job, SAS Data Integration Studio automatically generates filenames that match the job names. If the files already exist, a message asking whether you want to overwrite the existing files is displayed. Click Yes to overwrite them. Otherwise, click No.

Code is generated for the current jobs and saved to the directory that is specified in the Directory field. Metadata about the deployed jobs is saved to the current metadata repository. In the Inventory tree, the icon beside each deployed job includes the image of a small clock. The clock indicates that the jobs are now available for scheduling. The deployed job can either be scheduled or executed by any convenient means.

About SAS Stored Processes

A SAS stored process is a SAS program that is stored on a server and can be executed as required by requesting applications. You can use stored processes for Web reporting, analytics, building Web applications, delivering result packages to clients or the middle tier, and publishing results to channels or repositories. Stored processes can also access any SAS data source or external file and create new data sets, files, or other data targets supported by the SAS System.

After a job has been deployed as a stored process, any application that can execute a SAS stored process can execute the job. For example, the following applications can execute stored processes for SAS Data Integration Studio jobs:
- SAS Add-In for Microsoft Office: a Component Object Model (COM) add-in that extends Microsoft Office by enabling you to dynamically execute stored processes and embed the results in Microsoft Word documents and Microsoft Excel spreadsheets.
- SAS Enterprise Guide: an integrated solution for authoring, editing, and testing stored processes.
- SAS Information Map Studio: an application that can be used to implement information map data sources. Stored processes can use the full power of SAS procedures and the DATA step to generate or update the data in an information map.
- SAS Information Delivery Portal: a portal that provides integrated Web access to SAS reports, stored processes, information maps, and channels.
- Stored Process Service: a Java application programming interface (API) that enables you to execute stored processes from a Java program.
- SAS Stored Process Web applications: Java Web applications that can execute stored processes and return results to a Web browser.
- SAS BI Web Services: a Web service interface to SAS stored processes.

The main tasks that are associated with deploying a stored process are as follows:
- "Deploying Jobs as SAS Stored Processes" on page 178
- "Redeploying Jobs to Stored Processes" on page 180
- "Viewing or Updating Stored Process Metadata" on page 181

Prerequisites for SAS Stored Processes

To deploy a SAS Data Integration Studio job as a SAS stored process, you need the following components:
- A server that can execute SAS stored processes. Stored processes that can be executed by Web service clients require a connection to a SAS Stored Process Server. Other stored processes can be executed on a SAS Stored Process Server or a SAS Workspace Server. For details about how these servers are installed, configured, and registered in a SAS metadata repository, see the SAS Intelligence Platform: Application Server Administration Guide.
- A source repository for the stored process that is generated from a SAS Data Integration Studio job.

Typically, an administrator installs, configures, and registers the appropriate servers and source code repositories.
The administrator then tells SAS Data Integration Studio users which server and source repository to select when deploying jobs as stored processes.

To use the stored process feature to your best advantage, you should be familiar with stored process parameters, input streams, and result types. For a detailed discussion of stored processes, see "SAS Stored Processes" and "SAS BI Web Services" in the SAS Integration Technologies: Developer's Guide.

Deploying Jobs as SAS Stored Processes

Problem

You want to deploy a job as a SAS stored process so that any application that can execute a SAS stored process can execute the job.

Solution

Deploying and executing a job as a stored process is a two-stage procedure:
1. Deploy the job as a stored process. See "Deploy a Job as a Stored Process" on page 178.
2. Execute the stored process. See "Execute the Stored Process for a Job" on page 180.

If possible, test a job and verify its output before deploying the job as a stored process.

Tasks

Deploy a Job as a Stored Process

Perform the following steps to deploy a job as a stored process:
1. In the Inventory tree or the Custom tree on the SAS Data Integration Studio desktop, right-click the job for which you would like to generate a stored process. Then select Stored Process > New from the pop-up menu. The first window of the Stored Process wizard is displayed.

Display 9.5 First Window of the Stored Process Wizard

2. In the first window, enter a descriptive name for the stored process metadata. You might want to use a variation of the job name. Enter other information as desired. For details about the fields in this window, select Help. When finished, select Next.
The second window of the wizard is displayed.
3. In the second window, specify a SAS server, a source repository, a source filename, any input stream, and any output type (result type) for the new stored process. The following display shows some sample values for this window.

Display 9.6 Second Window of the Stored Process Wizard

4. The main fields in this window are as follows:
- SAS server: a SAS Workspace Server or a SAS Stored Process Server.
- Source code repository: a location, such as a directory, that contains stored processes. The physical path that is specified in this field is associated with the server that is specified in the SAS server field. You must have write access to this repository.
- Source file: the name of the stored process file that is to be generated from the job that was selected.
- Input: specifies the type of input for the stored process. This is automatically set to None if you are using a SAS Workspace Server or if you are using a SAS Stored Process Server with no defined streams.
- Output: specifies the type of results produced by the stored process. A list of result types is provided for you. The list of result types changes depending on whether you select a SAS Workspace Server or a SAS Stored Process Server.

For details about the fields in this window, select Help. When finished, select Next.
5. The third window of the wizard is displayed. In the third window, specify any parameters for the stored process. For details about the fields in this window, select Help. When finished, select Finish.

A stored process is generated for the current job and is saved to the source repository. Metadata about the stored process is saved to the current metadata repository. In the Inventory tree, the icon beside the selected job includes the image of a blue triangle. The blue triangle indicates that a stored process has been generated for the job.
A metadata object for the stored process is added to the Stored Process folder in the Inventory tree.

Execute the Stored Process for a Job

After a job has been deployed as a stored process, the person responsible for executing that job can use the appropriate application to access and run the job.

Redeploying Jobs to Stored Processes

Problem

After generating a stored process for a job and then updating the job, you want to redeploy the stored process so that it matches the latest version of the job. You also need to redeploy stored processes if the computing environment changes, for example, when a stored process is exported from a test environment and is imported into a production environment.

Solution

You can select a job for which a stored process has been generated, regenerate code for the job, and update any stored processes associated with the selected job. See "Redeploy a Selected Job with a Stored Process" on page 181.

Alternatively, you can use the Redeploy Jobs to Stored Processes feature to regenerate the code for most jobs with stored processes and update any stored processes associated with these jobs. Each redeployed stored process then matches the current version of the corresponding job.
See "Redeploy Most Jobs with Stored Processes" on page 181.

Tasks

Redeploy a Selected Job with a Stored Process

Perform the following steps to select a job for which a stored process has been generated, regenerate code for the job, and update any stored processes associated with the selected job:
1. Open the Jobs folder in the Inventory tree.
2. Right-click the job metadata for a stored process.
3. Select Stored Process > job_name > Redeploy from the pop-up menu.

Redeploy Most Jobs with Stored Processes

Perform the following steps to regenerate the code for most jobs with stored processes and update any stored processes associated with these jobs.

Note: The Redeploy Jobs to Stored Processes feature does not redeploy a job that has been deployed for execution by a Web service client.

1. From the SAS Data Integration Studio desktop, select Tools > Redeploy Jobs to Stored Processes.
2. A dialog box is displayed, citing the number of stored processes found and prompting you to click OK to continue.

For each job that has one or more associated stored processes, the code is regenerated for that job. For each stored process associated with a job, the generated code is written to the file associated with the stored process. The regenerated code contains references to servers and libraries that are appropriate for the current metadata repositories.

Viewing or Updating Stored Process Metadata

Problem

You want to update or delete the metadata for a stored process.

Solution

Locate the metadata for the stored process in the Stored Process folder of the Inventory tree. Display the properties window and update the metadata.

The SAS server and source repository for a stored process cannot be changed.
If you need to change these values, create a new stored process and specify a different server and source repository.

Tasks

Update the Metadata for a Stored Process

Perform the following steps to update the metadata for a stored process that was generated for a SAS Data Integration Studio job:
1. In the Inventory tree on the SAS Data Integration Studio desktop, locate the Stored Process folder.
2. Locate the metadata for the stored process that you want to update.
3. To delete the metadata for a stored process, right-click the appropriate process and select Delete. (The physical file that contains the stored process code is not deleted; only the metadata that references the file is deleted.)

To view or update the metadata for a stored process, right-click the appropriate process and select Properties. A properties window for the stored process is displayed.
4. View or update the metadata as desired. For details about the tabs in this window, select Help.

About Deploying Jobs for Execution by a Web Service Client

A Web service is an interface that enables communication between distributed applications, even if the applications are written in different programming languages or are running on different operating systems. In SAS Data Integration Studio, a job can be selected in the Inventory tree or the Custom tree and deployed as a SAS stored process that can be executed by a Web service client.

You might want to deploy a job for execution by a Web service client when the job meets all of the following criteria:
- The job must be accessed across platforms.
- The amount of data to be input and output is not large.
- Any input from the Web service client, or output to the Web service client, can be formatted as an XML table.

The main tasks that are associated with deploying a job for execution by a Web service client are as follows:
- Review the requirements for a job that will be deployed for execution by a Web service client. See "Requirements for Jobs That Can Be Executed by a Web Service Client" on page 183.
- Create the job. See "Creating a Job That Can Be Executed by a Web Service Client" on page 185.
- Deploy the job. See "Deploying Jobs for Execution by a Web Service Client" on page 188.

After the job has been deployed, the person responsible for executing the deployed job can use the appropriate Web service client to access and execute the job. See "Using a Web Service Client to Execute a Job" on page 191.

Requirements for Jobs That Can Be Executed by a Web Service Client

Overview of Requirements

In addition to the "Prerequisites for SAS Stored Processes" on page 177, a SAS Data Integration Studio job that can be executed by a Web service client must meet the requirements that are described in this section.

Process Flow Requirements

If a job does not need to receive input from the client and does not have to send output to the client, then the job does not have to meet any additional requirements before it is deployed for execution by a Web service client. It can be deployed as described in "Deploying Jobs for Execution by a Web Service Client" on page 188.

Usually, however, a job receives input from the client and sends output to the client. For example, suppose that you wanted to deploy a job that will enable a user to provide a temperature value in Fahrenheit and obtain the corresponding value in Celsius. The following display shows how you can specify such a process flow in SAS Data Integration Studio.

Display 9.7 Process Flow for a Temperature Conversion Job

A job that obtains input from and sends output to a Web service client must meet the following requirements before it is deployed:
- The job can receive zero or more inputs from the Web service client that executes the job. The preceding temperature conversion job receives one input from the client, as specified in F_TEMP.
- The job can send zero or one output to the client that executes the job. The job shown here sends one output to the client, as specified in C_TEMP.
- Input to the job from the client, and output from the job to the client, must be in XML table format. In this job, F_TEMP and C_TEMP are XML tables.
- XML tables that specify client input or output in the job must be members of a SAS XML library.
- The XML table for a client input can have an XMLMap associated with it through the library. An XMLMap can help the XML LIBNAME engine to read the table. However, the XML table that specifies a client output cannot have an XMLMap associated with it through the library.
- The XML table for each client input or output in the job must have a unique libref.
- The client output in the process flow must have a libref named _WEBOUT.
- The XML table for each client input or output in the job must be configured as a Web stream. F_TEMP and C_TEMP are configured as Web streams, as indicated by the blue circles on their icons. (See "Web Streams for Web Client Inputs and Outputs" on page 185.)

Data Format for Web Client Inputs and Outputs

All Web services exchange information in XML format. SAS Data Integration Studio jobs are optimized for structured data such as tables.
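The example listings in this section (Example Code 9.1 and 9.2) show XML tables for a single temperature value. As a rough sketch of what such a table might look like, assuming the generic markup that the SAS XML LIBNAME engine reads by default (the tag names and the sample value are illustrative, not taken from the original listings):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- f_temps.xml: an XML table with one row and one Fahrenheit column -->
<TABLE>
   <F_TEMPS>
      <F_TEMP>98.6</F_TEMP>
   </F_TEMPS>
</TABLE>
```

In this generic layout, the outer element represents the table, each repeated inner element is a row, and the elements within a row become columns. The corresponding Celsius output table (c_temps.xml) would have the same shape with a C_TEMP column.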
Accordingly, exchanges between a SAS Data Integration Studio job and a Web service client must be in XML table format. For example, the following sample code shows an XML table that is appropriate for the client input in our temperature conversion job.

Example Code 9.1 An XML Table for One Temperature Value in Fahrenheit

The XML code in this example can be stored in an XML file named f_temps.xml. The next example code shows an XML table that is appropriate for the client output in the temperature conversion job.

Example Code 9.2 An XML Table for One Temperature Value in Celsius

The preceding XML code could be stored in an XML file named c_temps.xml. Later, these tables could be registered with the XML source designer, and the table metadata can be used to create a process flow in SAS Data Integration Studio.

Libraries and Librefs for Web Client Inputs and Outputs

A job that is deployed for execution by a Web service client has a number of requirements that are related to libraries and librefs.

A SAS XML library is a library that uses the SAS XML LIBNAME engine to access an XML file. In the process flow for the job, the XML tables that specify input from the client or output to the client must be members of a SAS XML library. For example, suppose that you had created the XML tables that are described in the previous section. You can create one SAS XML library for the input table (f_temps.xml) and another SAS XML library for the output table (c_temps.xml).

Each library has a libref, an alias for the full physical name of a SAS library. A SAS LIBNAME statement maps the libref to the full physical name. The process flow for a job that is deployed for execution by a Web service client has two requirements in regard to librefs:
- The XML table for each client input and output must have a unique libref. When the job is deployed for execution by a Web service client, the input and output streams will be named after the librefs of the inputs and the output.
Different streams in the job cannot have the same librefs.
- The client output must have a libref named _WEBOUT.

For example, if you created a SAS XML library for the input file f_temps.xml and another SAS XML library for the output file c_temps.xml, you can specify librefs of f_temps and _WEBOUT, respectively. In that case, there will be no duplicate librefs for client inputs and outputs, and the libref for the output will be _WEBOUT, as required.

If needed, you can change the libref in the metadata for a library. In general, to change the libref, right-click the library and select Properties from the pop-up menu. Click the Options tab and update the Libref field.

Web Streams for Web Client Inputs and Outputs

In the process flow for a job that will be deployed for execution by a Web service client, the XML table for each client input or output must be configured as a Web stream. To configure an XML table as a Web stream, right-click the table in the process flow and then select Web Stream from the pop-up menu. An extended attribute named ETLS_WEB_STREAM is added to the selected table. A blue circle icon is added to the icon for the table.

(Optional) Parameters for User Input

Suppose that you wanted the Web client user to supply a character or string value to the Web service. In that case, you would:
- Add the appropriate column or columns to the Web client input table in the job. See "Data Format for Web Client Inputs and Outputs" on page 184.
- When you deploy the job, use the Web Service Wizard to add a parameter to the Web service.
See "Deploying Jobs for Execution by a Web Service Client" on page 188.

Creating a Job That Can Be Executed by a Web Service Client

Problem

You want to create a SAS Data Integration Studio job that can be executed by a Web service client.

Solution

Create a SAS Data Integration Studio job that meets the requirements that are described in "Requirements for Jobs That Can Be Executed by a Web Service Client" on page 183.

Tasks

Create a Job That Can Be Executed by a Web Service Client

Perform the following steps to create a job that can be executed by a Web service client. The temperature conversion job that is described in "Process Flow Requirements" on page 183 is used as an example.
1. Use an XML editor to create an XML table for each input from the Web service client. Include test values in the input tables, if desired. Save each table to a separate file. For additional information, see "Data Format for Web Client Inputs and Outputs" on page 184.
2. Use an XML editor to create an XML table for the output to the Web service client. Save that table to a file.
3. Using the New Library wizard for SAS XML libraries, create a separate XML library for each input from the Web service client, or a unique libref for each input, or both. In the following display, the input XML library has a libref of inputtab.

Display 9.8 New Library Wizard for inputtab

4. Using the New Library wizard for SAS XML libraries, create an XML library with a libref of _WEBOUT for the output to the Web service client. The library must have a libref of _WEBOUT; otherwise, the stream will be ignored.

Display 9.9 New Library Wizard for _WEBOUT

5. Use Tools > Source Designer to register both the InputTable and OutputTable libraries. Perform the following steps for both libraries:
a. Select XML --- All Documents from XML Sources. Click Next.
b. Select the library.
c. Select All Tables. Click Next.
d. Select the appropriate folder. Click Next.
Click Finish.
6. Using the Process Designer window, create a process flow for the Web services job, such as the process flow shown in the temperature conversion job. Use the XML table metadata from the previous step to specify the Web service client input and output.

Display 9.10 Specify Web Service Client Input and Output

7. If the metadata for each client input table points to an XML table with test values, you can test the job in SAS Data Integration Studio. Run the job, and then use the View Data window to verify that the values in the client output table are correct. If not, troubleshoot and correct the job.
8. Configure the client input and output as Web streams. Right-click a client input in the process flow and then select Web Stream from the pop-up menu. Repeat for all inputs and the output in the job.
9. Save and close the job.

Note: After the job is deployed, and the Web client executes the job, any physical table specified in the metadata for a Web stream input or output is ignored, and data submitted by the client is used instead.

Deploying Jobs for Execution by a Web Service Client

Problem

You want to deploy a SAS Data Integration Studio job for execution by a Web service client.

Solution

Select the job and deploy it as described in the following section. It is assumed that the selected job meets the requirements that are described in "Requirements for Jobs That Can Be Executed by a Web Service Client" on page 183.

Tasks

Deploy a Job for Execution by a Web Service Client

Perform the following steps to deploy a job for execution by a Web service client. The temperature conversion job that is described in "Process Flow Requirements" on page 183 is used as an example.
1. In the Inventory tree or the Custom tree on the SAS Data Integration Studio desktop, right-click the job that you want to deploy for execution by a Web service client. Then select Web Service > New from the pop-up menu.
The first window of the Web Service wizard is displayed:

Display 9.11 General Tab

In the first window, the Name field specifies the name of the job that you are about to deploy. The name of the selected job (Convert F to C Job) is displayed by default. Note the keyword XMLA Web Service, which is the required keyword for a job that can be executed by a Web service client.
2. In the first window, enter a descriptive name for the stored process that will be generated from the selected job. You can use a variation of the job name. For example, if the selected job was named Convert F to C Job, the metadata for the stored process can be named Convert F to C Service.
3. In the second window, specify a SAS Stored Process Server, a source repository, a source filename, any input stream, and any output type (result type) for the new stored process. The following display shows some example values for this window.

Display 9.12 Execution Tab

4. The main fields in this window are as follows:
- SAS server: for Web services, only SAS Stored Process Servers are available for selection.
- Source repository: a location, such as a directory, that contains stored processes. The physical path that is specified in this field is associated with the server that is specified in the SAS server field. You must have write access to this repository.
- Source file: the name of the stored process file that is to be generated from the selected job.
- Input: specifies the type of input for the stored process. This is automatically set to Streaming if you are deploying a job with one or more Web Stream inputs.
- Output: specifies the type of results produced by the stored process. This is automatically set to Streaming if you are deploying a job with a Web Stream output.

For details about the fields in this window, select the Help button. When you are finished, select Next.
5. In the third window, specify any parameters for the stored process. For details about the fields in this window, select the Help button.
When finished, select Finish.

A stored process that can be executed by a Web service client is generated for the current job and saved to the source repository. Metadata about the stored process is saved to the current metadata repository. In the Inventory tree, the icon beside the selected job includes the image of a blue triangle. The blue triangle indicates that a stored process has been generated for the job. A metadata object for the stored process is added to the Stored Process folder in the Inventory tree.

Using a Web Service Client to Execute a Job

Problem

You want to execute a SAS Data Integration Studio job that was deployed for execution by a Web service client.

Solution

Use a Web service client to execute the stored process that was generated from the job.

Tasks

Use a Web Service Client to Execute a Job

Perform the following steps to execute the stored process that was generated from a SAS Data Integration Studio job. The SAS BI Web Services Explorer is used as an example of a Web service client. The example job is the temperature conversion job that is described in "Process Flow Requirements" on page 183.
1. Start the Web service client and display a list of stored processes that are available for execution. For example, in the SAS BI Web Services Explorer, you can click Discover Stored Processes. After the request is submitted, the response looks similar to the following display.

Display 9.13 Stored Processes in SAS BI Web Services Explorer

2. Select the stored process that you want to execute. If the stored process requires input, the Web service client will display one or more input areas.
3. Enter any required input. The input must be in the format described in "Data Format for Web Client Inputs and Outputs" on page 184. For example, suppose that you selected the stored process for the temperature conversion job.
In that case, you would paste an XML table that contains a parameter and a temperature value, as shown in the next display.

Display 9.14 Input Is One Value in an XML Table

4. Submit the input and execute the stored process. For example, in the SAS BI Web Services Explorer, you would click Encode to encode the input, and then click Execute to execute the stored process.

The stored process executes. If an output value is returned, it is displayed in the Web service client. For example, the stored process for the temperature conversion job converts a Fahrenheit input value and displays a Celsius output value in the client.

Chapter 10: Working with Transformations

About Transformations 195
Viewing the Code for a Transformation 196
  Problem 196
  Solution 196
  Tasks 196
    Access the Code in a Transformation 196
Viewing or Updating the Metadata for Transformations 197
  Problem 197
  Solution 197
  Tasks 197
    Access the Metadata for a Transformation 197
  Additional Information 197
Creating and Maintaining Column Mappings 197
  Problem 197
  Solution 198
  Tasks 198
    Create Automatic Column Mappings 198
    Create One-to-One Column Mappings 199
    Create Derived Column Mappings 199
    Create Mappings with Multiple Columns 201
    Delete Column Mappings 202
    Use the Pop-Up Menu Options for Mappings 203
About Archived Transformations 204

About Transformations

A transformation is a metadata object that specifies how to extract data, transform data, or load data into data stores. Each transformation that you specify in a process flow diagram generates or retrieves SAS code. You can also specify user-written code in the metadata for any transformation in a process flow diagram. For information about user-written code in transformations, see Chapter 11, "Working with Generated Code," on page 207 and Chapter 12, "Working with User-Written Code," on page 215.

You can find transformations in the Process Library tree on the SAS Data Integration Studio desktop.
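To make "generates or retrieves SAS code" concrete, consider the temperature conversion job from the previous chapter. The code that a transformation generates for it might look something like the following sketch (a hand-written illustration, not actual generated output; librefs and column names follow the Chapter 9 example, and the SET member name is assumed):

```sas
/* Sketch of the kind of DATA step a transformation might generate
   for the Fahrenheit-to-Celsius conversion job */
data _WEBOUT.c_temps;
   set f_temps.f_temps;               /* client input via the XML library */
   C_TEMP = (F_TEMP - 32) * 5 / 9;    /* Fahrenheit to Celsius */
   keep C_TEMP;
run;
```

The actual generated code is metadata-driven and includes additional housekeeping, but each transformation in a process flow ultimately resolves to SAS steps of this kind.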
The transformations that are supplied with SAS Data Integration Studio are divided into the following categories:
- Access
- Analysis
- Archived Transforms
- Control
- Data Transforms
- Output
- Publish
- SPD Server Dynamic Cluster

See "Process Library Tree" on page 39 for information about the transformations included in these categories. You can create additional transformations and store them in the User Defined section of the Process Library tree.

Each transformation comes with an appropriate set of options. You can set them on the Options tab of the properties window for the transformation. You can also decide whether to send the output of the transformation to a temporary table or to a permanent table. See "Manage Temporary and Permanent Tables for Transformations" on page 239.

Once you have located the transformation that you need, you can drag it from the Process Library tree and drop it into a job. You can view and update the metadata for any transformation that is included in a process flow diagram, as described in "Viewing or Updating the Metadata for Transformations" on page 197. You can also import and export the transformation metadata when the jobs that contain them are imported or exported. The metadata is simply treated as a part of the job.

Viewing the Code for a Transformation

Problem

You want to view the code for a transformation that is included in an existing SAS Data Integration Studio job.

Solution

You can view the code for a transformation in the transformation's Source Editor window. This window is available only when the transformation is included in a SAS Data Integration Studio job.

Tasks

Access the Code in a Transformation

You can access the code in a transformation that is included in a SAS Data Integration Studio job. Perform the following steps:
1. Open an existing SAS Data Integration Studio job.
2. Right-click the transformation in the Process Designer window that contains the code that you want to review.
Then, click View Step Code.
3 Review the code that is displayed in the Source Editor window for the selected transformation.

Viewing or Updating the Metadata for Transformations

Problem

You want to view or update the metadata for a transformation. This metadata can include the metadata for column mappings and the metadata that specifies whether you or SAS Data Integration Studio will supply the code for the transformation.

Solution

You can view or update the metadata for a transformation in the transformation's properties window. This window is available only when the transformation is included in a SAS Data Integration Studio job.

Tasks

Access the Metadata for a Transformation

You can access the metadata for a transformation that is included in a SAS Data Integration Studio job. Perform the following steps:
1 Open an existing SAS Data Integration Studio job.
2 Right-click the transformation in the Process Designer window that contains the metadata that you need to review or update. Then, click Properties.
3 Click the appropriate tab to view or update the desired metadata.

For details about the metadata that is maintained on a particular tab, click Help on that tab. The Help topics for complex tabs often include task topics that can help you perform the main tasks that are associated with that tab. Updates to transformation metadata are not reflected in the output for that transformation until you rerun the job in which the transformation appears, as described in "Submitting a Job for Immediate Execution" on page 147.

Additional Information

For details about maintaining column mappings, see "Creating and Maintaining Column Mappings" on page 197.
For information about maintaining generated transformations, see "Maintaining a Generated Transformation" on page 232.

Creating and Maintaining Column Mappings

Problem

You want to create or maintain the column mappings between the source tables and the target tables in a SAS Data Integration Studio job.

Solution

You create or maintain column mappings in the Mapping tab in the properties window for a transformation. You can work with mappings in the following ways:
- Create automatic column mappings.
- Create one-to-one column mappings.
- Create derived column mappings.
- Create mappings with multiple columns.
- Delete mappings.

You can also modify the columns in the Target table field of the Mapping tab. For information, see "Maintaining Column Metadata" on page 98.

Tasks

Create Automatic Column Mappings

You can review the mappings that are automatically generated when a transformation is submitted for execution in the context of a SAS Data Integration Studio job. The mappings are depicted on the Mapping tab of the transformation, as shown in the following display.

Display 10.1 Automatic Column Mappings

The arrows in the preceding display represent mappings that associate source columns with target columns. By default, SAS Data Integration Studio automatically creates a mapping when a source column and a target column have the same column name, data type, and length. Events that trigger automatic mapping include:
- dropping a source and a target in the drop zones for a transformation template
- clicking Propagate in the toolbar or in the pop-up menu in the Process Designer window
- clicking Quick Map or Quick Propagate in the pop-up menu for the Mapping tab of the transformation

SAS Data Integration Studio might not be able to automatically create all column mappings that you need in a transformation. SAS Data Integration Studio automatically creates a mapping when a source column and a target column have the same column name, data type, and length.
However, even though such mappings are valid, they might not be appropriate in the current job.

You can also disable or enable automatic mapping for a transformation. For example, suppose that both the source table and the target table for a transformation have two columns that have the same column name, data type, and length, as shown in the preceding display. These columns are mapped automatically unless you disable automatic mapping for the transformation. If you delete the mappings between these columns, the mappings are restored upon a triggering event, such as selecting the Propagate, Quick Map, or Quick Propagate options.

The only way to prevent automatic mappings is to disable automatic mapping for the transformation. To disable automatic mapping for a selected transformation, right-click the transformation and deselect Automap from the pop-up menu. To restore automatic mapping for a selected transformation, right-click the transformation and select Automap from the pop-up menu.

Note: If you disable automatic mapping for a transformation, you must maintain its mappings manually.

Create One-to-One Column Mappings

You need to manually map between a column in the source table and a column in the target table. Perform the following steps to map between two columns:
1 Open the Mapping tab in the properties window for the transformation.
2 Select the column in the source table.
3 Select the column in the target table.
4 Right-click an empty space in the Mapping tab. Then, click New Mapping in the pop-up menu. An arrow appears between the two selected columns. This arrow represents the new mapping.
5 Click OK to save the new mapping and close the properties window.

You can also create a mapping in the Mapping tab by clicking on a source column and dragging a line to the appropriate target column.

Create Derived Column Mappings

A derived mapping is a mapping between a source column and a target column in which the value of the target column is a function of the source column.
For example, you could use a derived column to accomplish the following tasks:
- Write the date to a Date field in the target when there is no source column for the date.
- Multiply the value of the Price source column by 1.06 to get the value of the PriceIncludingTax target column.
- Write the value of the First Name and Last Name columns in the source table to the Name field in the target table.

You can use the techniques illustrated in the following table to create different types of derived column mappings. All of the techniques are used on the Mapping tab in the properties window for the transformation.

Table 10.1 Derived Column Techniques

Directly enter an expression into an Expression field: You can create any type of expression by entering the expression directly into an Expression field. The expression can be a constant or an expression that uses the values of one or more source columns. For example, you can create a sample expression that writes today's date to a Date column in a target table. Perform the following steps:
1 Double-click in the field in which you want to enter the expression. A cursor displays in the field. (The button disappears.)
2 Enter your expression into the field. For example, to write today's date to every row in a column, you can enter the expression &SYSDATE.
3 Click OK.

Create expressions that use no source columns: Some transformations, such as Extract, Lookup, and SCD Type 2 Loader, provide an Expression column in the target table. You can perform the following steps to enter an expression into this column that does not use source columns:
1 Right-click in an Expression column. Then, click Expression in the pop-up menu to access the Expression Builder window.
2 Use the Expression Builder to create an expression. (For information about the Expression Builder window, see the "Expression Builder" topic in SAS Data Integration Studio Help.)
Then, click OK to save the expression, close the Expression Builder window, and display the expression in the selected column in the target table.
3 Click OK to close the properties window for the transformation.

Create expressions that use a single source column: Assume that you want to define the value of a DiscountedPrice column in the target by using the Price source column in an expression. This would be possible if the discount were a constant, such as 6 percent. That is, you might want to define an expression as Price * .94. You could perform the following steps:
1 Select the Price source column and the DiscountedPrice target column.
2 Right-click either selected variable, and select Expression from the pop-up menu to access the Expression Builder window.
3 Use the Expression Builder to create an expression. Then, click OK to save the expression, close the Expression Builder window, and display the expression in the selected column in the target table.
4 Click OK to close the properties window for the transformation.

Create expressions that use two or more source columns: You can create a derived mapping that uses two or more source columns. Perform the following steps:
1 Select the source columns and target column to be used in the mapping. For example, you could use the values of the Price and Discount columns in the source in an expression. Then, the result could be written to the DiscountedPrice column in the target.
2 Right-click any of the selected variables and select New Mapping from the pop-up menu that displays. Because two source columns were selected when you requested the new mapping, a warning dialog box is displayed.
3 Click Yes to access the Expression Builder window.
4 Create the expression, which would be Price - (Price * (Discount / 100)) in this example.
Then, click OK to save the expression, close the Expression Builder window, and display the expression in the selected column in the target table.
5 Click OK to close the properties window for the transformation.

Create Mappings with Multiple Columns

In certain transformations, the Mapping tab in the properties window for the transformation displays a single source table and multiple target tables, or multiple source tables and a single target table. To create mappings for multiple sources or targets, you can right-click the Mapping tab and click Quick Propagate. The Quick Propagate window is displayed, as depicted in the following display.

Display 10.2 Quick Propagate Window

You can select All in the Select Target Table field to propagate the columns listed in the Selected Columns field to both of the target tables listed beneath the All item. (These columns are moved to the Selected Columns field from the Available Columns field.)

You can also select one of the tables listed in the field to propagate the columns to only that table. In either case, click OK to complete the propagation process. The following display shows a Mapping tab after two columns are propagated to all of the target tables.

Display 10.3 Source Columns Added to Both Targets

Notice that the Sex and Name columns are added twice, once for the W5DN37YQ target table and once for the W5DN37Z0 target table. If one of the target tables had been selected in the Select Target Table field, the Sex and Name columns would appear only once.

Delete Column Mappings

You can delete a column mapping in the Mapping tab of the properties window for a transformation by using one of the following methods:
- Left-click the arrow that connects a column in the Source table field to a column in the Target table field. Then, press the Delete key.
- Right-click the arrow that connects a column in the Source table field to a column in the Target table field.
Then, click Delete in the pop-up menu.

Note: You must disable automatic mapping for a transformation to delete mappings that would otherwise be automatically created. See "Create Automatic Column Mappings" on page 198.

Use the Pop-Up Menu Options for Mappings

You can use the pop-up menu in the Mapping tab of the properties window to control the behavior of the tab. The available menu options are listed in the following table.

Table 10.2 Pop-Up Options for the Mapping Tab

Delete (available after selecting one or more columns in the Source table(s) or Target table(s) list box): Removes one mapping for each selected column. See "Delete Column Mappings" on page 202. To delete the column itself, remove all mappings and press the Delete key.

Expression (available when you right-click on a single mapping arrow or in an Expression column in a target table; this menu option is not available in all transformations): Opens the Expression Builder so that you can manipulate source data before that data is associated with a target table. See "Create Derived Column Mappings" on page 199.

Hide Column (available with the cursor in a column heading in the table view): Removes the specified column from the table view.
To restore the column, use Show.

Hold Column to Left (with the cursor in a column heading in a table view): The selected column moves to the far left and remains there as you scroll left and right.

Hold Column to Right (with the cursor in a column heading in a table view): The selected column moves to the far right and remains there as you scroll left and right.

Hold Row to Bottom (with the cursor in the # column of a table view): The selected row moves to the bottom of the table view and remains at the bottom of the table view as you scroll up and down.

Hold Row to Top (with the cursor in the # column of a given row in a table view): The selected row moves to the top of the table view and remains at the top of the table view as you scroll up and down.

Import Columns (with the cursor in the Target table(s) list box): Opens the Import Columns window, which enables you to define metadata for new columns using existing columns in a current metadata repository.

New Column (with the cursor in the Target table(s) list box, with a column optionally selected to set the insertion point): Adds a new column below the selected column or at the bottom of the table. The name of the new column is Untitledn, where n is a number that is assigned automatically. Double-click the default name to enter a new name.

New Mapping (after selecting a source column with a left-click and selecting a target column with a Shift-left-click): Adds a mapping arrow between a source column and a target column.

Quick Map (with the cursor anywhere in the Mapping tab): Automatically adds mappings between source and target columns that have the same name, data type (numeric or character), and length. See "Create Automatic Column Mappings" on page 198.

Quick Propagate (with the cursor anywhere in the Mapping tab): Opens the Quick Propagate window, which enables you to define new target columns and add mappings to those new target columns based on existing source columns.
See "Create Mappings with Multiple Columns" on page 201.

Release (with the cursor in the # column or in a column heading): Enables you to regain the ability to reorder any columns that have been held to the left or right, or any rows that have been held to the top or bottom.

Select All Mappings (with the cursor anywhere in the Mapping tab): Selects all mappings, generally for deletion.

Show (with the cursor in a column heading): Displays any or all columns in a table view that have been hidden with the Hide Column menu option.

Sort Ascending (with the cursor in a column heading): Sorts all rows in ascending alphanumeric order based on the values in the selected column.

Sort Descending (with the cursor in a column heading): Sorts rows in descending alphanumeric order based on the values in the selected column. Rows that are held to the top or bottom are not included in the sort.

Sort Original (with the cursor in a column heading): Restores the column sort order to the order that is displayed when the Properties window is first displayed. Any prior ascending or descending sort order is removed.

About Archived Transformations

In order to support backward compatibility for existing processes and guarantee that processes run exactly as defined using older transformations, SAS has developed a methodology for archiving older versions of transformations in the Process Library. The Process Library will continue to surface the archived transformations for some number of releases. When you open a job that contains an older transformation that has a newer replacement, a dialog box displays the name of the old transformation. The dialog box also provides the name and location of the new transformation in the Process Library tree.

The following transformations are being archived and replaced with newer versions in version 3.4 of SAS Data Integration Studio:
- Table Loader
- SQL Join
- Fact Table

In addition, older transformations will be marked with a flag on their icons.
This flag indicates that each transformation is an older version of an updated transformation.

CHAPTER 11 Working with Generated Code

About Code Generated for Jobs 207
  Overview 207
  LIBNAME Statements 208
  SYSLAST Macro Statements 208
  Remote Connection Statements 208
  Macro Variables for Status Handling 209
  User Credentials in Generated Code 209
Displaying the Code Generated for a Job 209
  Problem 209
  Solution 209
  Tasks 209
    Prerequisite 210
    View Code Displayed in the Process Designer Window 210
    View Code Not Displayed in the Process Designer Window 210
Displaying the Code Generated for a Transformation 210
  Problem 210
  Solution 210
  Tasks 210
Specifying Options for Jobs 211
  Problem 211
  Solution 211
  Tasks 211
    Set Global Options for Jobs 211
    Set Local Options for Jobs 211
Specifying Options for a Transformation 211
  Problem 211
  Solution 212
  Tasks 212
Modifying Configuration Files or SAS Start Commands for Application Servers 212

About Code Generated for Jobs

Overview

When SAS Data Integration Studio generates code for a job, it typically generates the following items:
- specific code to perform the transformations used in the job
- a LIBNAME statement for each table in the job
- a SYSLAST macro statement at the end of each transformation in the job
- remote connection statements for any remote execution machine that is specified in the metadata for a transformation within a job
- macro variables for status handling

You can set options for the code that SAS Data Integration Studio generates for jobs and transformations. For details, see "Specifying Options for Jobs" on page 211 and "Specifying Options for a Transformation" on page 211.

LIBNAME Statements

When SAS Data Integration Studio generates code for a job, a library is considered local or remote in relation to the SAS Application Server that executes the job. If the library is stored on one of the machines that is specified in the metadata for the SAS Application Server that executes the job, it is local.
Otherwise, it is remote.

SAS Data Integration Studio generates the appropriate LIBNAME statements for local and remote libraries.

Here is the syntax that is generated for a local library:

   libname libref ;

Here is the syntax that is generated for a remote library:

   options comamid=connection_type;
   %let remote_session_id=host_name ;
   signon remote_session_id ;
   rsubmit remote_session_id;
      libname libref ;
   endrsubmit;
   rsubmit remote_session_id;
      proc download data=table_on_remote_machine
         out=table_on_local_machine;
      run;
   endrsubmit;

SYSLAST Macro Statements

The Options tab in the property window for most transformations includes a field that is named Create SYSLAST Macro Variable. This field specifies whether SAS Data Integration Studio generates a SYSLAST macro statement at the end of the current transformation. In general, accept the default value of YES for the Create SYSLAST Macro Variable option.

Remote Connection Statements

Each transformation within a job can specify its own execution host. When SAS Data Integration Studio generates code for a job, a host is considered local or remote in relation to the SAS Application Server that executes the job. If the host is one of the machines that is specified in the metadata for the SAS Application Server that executes the job, it is local. Otherwise, it is remote.

A remote connection statement is generated if a remote machine has been specified as the execution host for a transformation within a job:

   options comamid=connection_type;
   %let remote_session_id=host_name ;
   signon remote_session_id ;
   rsubmit remote_session_id;
      ... SAS code ...
   endrsubmit;

Note: This is done implicitly for users if the machine is remote. Users can also use the Data Transfer transformation to explicitly handle moving data between machine environments when needed. The Data Transfer transformation provides more control over the transfer when needed, such as support for locale-specific settings.
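As an illustration, a filled-in version of the remote connection template might look like the following sketch. The connection type, host name, session ID, library path, and table names here are invented for illustration; they are not values that SAS Data Integration Studio necessarily generates.

```sas
/* Hypothetical remote connection block; host name, session  */
/* ID, library path, and table names are invented examples.  */
options comamid=tcp;              /* assume a TCP/IP connection    */
%let rs1=unxhost1;                /* session ID set to a host name */
signon rs1;                       /* start the remote session      */

rsubmit rs1;                      /* run code on the remote host   */
   libname dw '/data/warehouse';  /* assign the remote library     */
endrsubmit;

rsubmit rs1;                      /* copy a remote table back      */
   proc download data=dw.sales out=work.sales;
   run;
endrsubmit;

signoff rs1;                      /* end the remote session        */
```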
Macro Variables for Status Handling

When SAS Data Integration Studio generates the code for a job, the code includes a number of macro variables for status handling. For details, see "About Monitoring Jobs" on page 157.

User Credentials in Generated Code

The code that is generated is based on the credentials and permission settings of the user who generated the code. When required, such as in LIBNAME statements to a relational DBMS, for pass-through, or for remote machine data movement, the generated code might also contain embedded credentials, with encoded passwords.

If the credentials of the person who created the job are changed and a deployed job contains outdated user credentials, the deployed job fails to execute. The solution is to redeploy the job with the appropriate credentials. For details, see "Redeploying Jobs for Scheduling" on page 173.

Displaying the Code Generated for a Job

Problem

You want to see the code that is generated for a job.

Solution

SAS Data Integration Studio uses the metadata in a job to generate code or to retrieve user-written code. You can display the SAS code for a job by opening the job in the Process Designer window and selecting the Source Editor tab.

Tasks

This section describes two ways to display the SAS code for a job. It is assumed that you are working under change management. However, you do not have to check out a job to generate and display its code.

Prerequisite

SAS Data Integration Studio must be able to connect to a SAS Application Server with a SAS Workspace Server component in order to generate the SAS code for a job.

View Code Displayed in the Process Designer Window

To view the code for a job that is currently displayed in the Process Designer window, click the Source Editor tab.
The code for the job is displayed in the Source Editor.

View Code Not Displayed in the Process Designer Window

Perform the following steps to view the code for a job that is not displayed in the Process Designer window:
1 From the SAS Data Integration Studio desktop, in the Inventory tree, expand the Jobs folder.
2 Do one of the following:
   - Select the job, go to the View drop-down menu, and select View Code.
   - Right-click the job that you want to view, then select View Code from the pop-up menu.
The code for the job is displayed in the Source Editor.

Displaying the Code Generated for a Transformation

Problem

You want to see the code that is generated for a transformation.

Solution

To see the code as part of the job, see "Displaying the SAS Code for a Job" on page 152. To see the code generated specifically for a transformation, see the following Tasks section.

Tasks

Perform the following steps to see the code generated for a transformation:
1 Right-click the transformation icon.
2 Select View Step Code from the pop-up menu.
A new Source Editor window opens. The code generated for the transformation is displayed in this window.

Specifying Options for Jobs

Problem

You want to set options for SAS Data Integration Studio jobs, such as enabling parallel processing and configuring grid processing. In most cases the appropriate options are selected by default. You can override the defaults by using this procedure.

Solution

You can set global options for jobs on the Code Generation tab of the Options window. The Options window is available from the Tools menu on the SAS Data Integration Studio menu bar. You can set local options on the Options tab that is available in the properties window for each job.

Tasks

Set Global Options for Jobs

You can set the options listed on the Code Generation tab of the Options window, which is available from the Tools menu on the SAS Data Integration Studio menu bar.
The options that are relevant for jobs are examined in Table 7.1 on page 146.

Set Local Options for Jobs

You can set local options that apply to individual jobs by selecting the job and using the right mouse button to open the pop-up menu. Select Properties and then select the Options tab. These local options override global options for the selected job, but they do not affect any other jobs. For the local options, refer to Table 7.2 on page 147.

You can also add code to a job using the Pre and Post Process tab that is available in the properties window for each job. Select the check box for Pre Processing and select the Edit button. The Edit Source Code window enables you to add or update user-written source code.

Specifying Options for a Transformation

Problem

You want to set options for a SAS Data Integration Studio transformation, such as SAS Sort, SQL Join, or Extract.

Solution

You can specify SAS system options, SAS statement options, or transformation-specific options on the Options tab or other tabs in the properties window for many transformations. Use this method to select these options when a particular transformation executes.

Tasks

Perform the following steps to display the Options tab in the properties window for a transformation in a job:
1 Open the job to display its process flow.
2 Right-click the transformation and select Properties from the pop-up menu.
3 Select the Options tab.

For a description of the available options for a particular transformation, see the Help for the Options tab or other tabs that enable you to specify options. If the Options tab includes a System Options field, you can specify options such as UBUFNO for the current transformation. Some transformations enable you to specify options that are specific to that transformation. For example, the Options tab for the Sort transformation has specific fields for sort size and sort sequence.
It also has a PROC SORT Options field where you can specify sort-related options that are not otherwise surfaced in the interface. For example, you could use these fields to specify the options that are described in "Setting Sort Options" on page 276.

In addition to the Options tab, some transformations have other tabs that enable you to specify options that affect performance. For example, the properties window for the SAS Scalable Performance Data Server Loader transformation has an Other SPD Server Settings tab, which enables you to specify several SAS Scalable Performance Data Server options.

Modifying Configuration Files or SAS Start Commands for Application Servers

There are several ways to customize the environment where the code generated by SAS Data Integration Studio runs. When you submit a SAS Data Integration Studio job for execution, it is submitted to a SAS Workspace Server component of the relevant SAS Application Server. The relevant SAS Application Server is one of the following:
- the default server that is specified on the SAS Server tab in the Options window
- the SAS Application Server to which a job is deployed with the Deploy for Scheduling option

To set SAS invocation options for all SAS Data Integration Studio jobs that are executed by a particular SAS server, specify the options in the configuration files for the relevant SAS Workspace Servers, batch or scheduling servers, and grid servers. (You do not set these options on SAS Metadata Servers or SAS Stored Process Servers.) Examples of these options include UTILLOC, NOWORKINIT, and ETLS_DEBUG.

To specify SAS system options or startup options for all jobs that are executed on a particular SAS Workspace Server, modify one of the following for the server:
- the config.sas file
- the autoexec.sas file
- the SAS start command

For example, suppose that your SAS logs have become too large and you want to suppress the MPRINT option in your production environment.
Invoke the ETLS_DEBUG option in the autoexec.sas file by following these steps:
1 Open the autoexec.sas file.
2 Add the following code to the autoexec.sas file for your production run:

   %let etls_debug=0;

3 Save and close the file.

Note: If the condition etls_debug=0 is true, then the logic in the deployed job prevents execution of the OPTIONS MPRINT; statement. To turn on the MPRINT option again, remove %let etls_debug=0; from the autoexec.sas file.

CAUTION: We strongly recommend that you do not turn off MPRINT in a development environment.

CHAPTER 12 Working with User-Written Code

About User-Written Code 216
Editing the Generated Code for a Job 218
  Problem 218
  Solution 218
  Tasks 218
    Edit and Save the Generated Code 218
    Specify a Path to the Edited Code in the Job Metadata 219
Adding User-Written Source Code to an Existing Job 219
  Problem 219
  Solution 219
  Tasks 219
    Write SAS Code and Add It to an Existing Job 219
Creating a New Job That Consists of User-Written Code 220
  Problem 220
  Solution 220
  Tasks 220
    Create a Job That Consists of User-Written Code 220
Editing the Generated Code for a Transformation 221
  Problem 221
  Solution 221
  Tasks 221
    Edit and Save the Generated Code 221
    Specify a Path to the Edited Code in the Transformation Metadata 222
About the User Written Code Transformation 222
Creating a Job That Includes the User Written Code Transformation 223
  Problem 223
  Solution 223
  Tasks 223
    Add User-Written Code to a Job 223
    Map the Columns From the Source Table to the Work Table 224
    Map the Columns From the Work Table to the Target Table 225
Creating and Using a Generated Transformation 226
  Problem 226
  Solution 226
  Tasks 227
    Write a Generated Transformation 227
    Use a Generated Transformation in a Job 230
  Additional Information 232
Maintaining a Generated Transformation 232
  Problem 232
  Solution 233
  Tasks 233
    Analyze the Impact of Generated Transformations 233
    Import Generated Transformations Exported Before SAS Data Integration Studio 3.4 233
    Update Generated
Transformations 234

About User-Written Code

A job is a collection of SAS tasks that create output. By default, SAS Data Integration Studio uses the metadata for a job to generate code for the job. You can also specify user-written code for the entire job or for any transformation within the job.

If you want to supply user-written code for a job or a transformation within a job, you create the code and then specify the location of the user-written code in the metadata for the job or transformation. SAS Data Integration Studio then retrieves the specified code instead of generating code.

In most cases, you can specify user-written code for a job or a transformation by displaying the properties window for the job or the properties window for the transformation and then updating the metadata on the Process tab. Cubes are the only exception to this rule. The job for a cube can be updated only through the Cube Designer, not the job properties window. For details about specifying user-written code for cubes, see the "User-Written Source Code for Cubes" topic in SAS Data Integration Studio Help.

For details about adding status handling macros to the user-written code for a job or transformation, see the "Macro Variables for Status Handling" topic in SAS Data Integration Studio Help. The following table describes the main methods for working with user-written code in jobs and transformations.

Table 12.1 Methods for Working with User-Written Code

Specify user-written code in a transformation: Specify user-written code for a transformation within a job in one of the following ways:
- For an existing transformation, generate code for the transformation, edit the generated code and save the edited code, and then specify the location of the edited code in the metadata for the transformation.
- For an existing transformation, write a SAS program that performs the transformation that you want, and then specify the location of the user-written code in the metadata for the transformation.
- For a new transformation (one that does not currently exist in the job process flow diagram), write a SAS program that performs the transformation that you want, add a User Written Code transformation to the process flow diagram, and then specify the location of the user-written code in the metadata for the User Written Code transformation.
For details, see "Editing the Generated Code for a Transformation" on page 221.

Specify user-written code for an entire job: Specify user-written code for an entire job in one of the following ways:
- For an existing job, generate code for the job, edit the generated code and save the edited code, and then specify the location of the edited code in the metadata for the job.
- For an existing job, write a SAS program that performs the desired job, and then specify the location of the user-written code in the metadata for the job.
- For a new job, write a SAS program that performs the desired job, and then specify the location of the user-written code in the metadata for the job.
For details, see the following topics:
- "Editing the Generated Code for a Job" on page 218
- "Adding User-Written Source Code to an Existing Job" on page 219
- "Creating a New Job That Consists of User-Written Code" on page 220

Create user-written transformations: The most common way to create a custom transformation that you can drag from the Process Library and drop into any job is to create a generated transformation that uses SAS code to produce the output that you want. In SAS Data Integration Studio, use the Transformation Generator wizard to include your SAS code in a transformation, which is then available in the Process Library.
For details, see "Creating and Using a Generated Transformation" on page 226.

If you want to: Use the User Written Code transformation
Do this: The User Written Code transformation is provided by default in the Process Library tree. It enables you to add user-written code to a job. For details, see "About the User Written Code Transformation" on page 222.

Editing the Generated Code for a Job

Problem

You want to modify the code that is generated for a job in order to get the desired output from the job.

Solution

When you work with an existing job, you can generate SAS code for the job. Then, you can edit the generated code and save the edited code. Finally, you need to specify the location of the edited code in the metadata for the job. This edited code is saved as a file on the file system. Therefore, you might want to create a special directory for this type of code. Naturally, this method requires a basic understanding of the SAS programming language.

Tasks

Edit and Save the Generated Code

Perform the following steps to generate code for a job, edit the code, and then save the edited code to a file:
1. Right-click a job and click View Job in the pop-up menu. The job opens in the Process Editor.
2. Click the Source Editor tab in the Process Editor. Code is generated for the entire job and displayed in the Source Editor tab.
3. Edit the generated code in the Source Editor tab. Then, close the Process Designer window. Click Yes when prompted to save your changes. A file selection window displays.
4. Specify a path name for the edited code in the file selection window. Then click Save. The edited code is saved to a file, and the Process Designer window closes.
At this point, you have saved the edited code.
The next step is to specify the location of the edited code in the metadata for the job.

Specify a Path to the Edited Code in the Job Metadata

Perform the following steps to specify a path to the edited code in the job metadata:
1. Open the Process tab in the properties window for the job.
2. Click User Written. The Type field and related fields become active.
3. Select File with the drop-down menu in the Type field.
4. You will usually accept the default host in the Host field.
5. Specify a path to the file that contains the edited code in the Path field. The server in the Host field must be able to resolve this path. You can type in the path or use the Browse button to display a file selection window.
6. After specifying the path, click OK to save your changes.
The specified user-written code is retrieved whenever code for this job is generated. After you have specified user-written code in the metadata for a job, you might want to execute the job to verify that it works in the context of SAS Data Integration Studio.

Adding User-Written Source Code to an Existing Job

Problem

You want to completely replace the code that is generated for a job in order to get the desired output from the job.

Solution

You can use the Process tab in the properties window for the job to write your own SAS code for an existing job. Then, you can specify the location of your code in the metadata for the job.

Tasks

Write SAS Code and Add It to an Existing Job

Perform the following steps to write the code and add it to an existing job:
1. Write a SAS program that will perform the desired job. Verify that your program produces the appropriate output.
2. Open the Process tab in the properties window for the job.
3. Click User Written in the Code Generation group box.
The Type, Name, and Description fields become active.
4. You can now perform one of the following steps to specify the location of the user-written SAS program:
- Select File in the drop-down menu in the Type field if you want to maintain the SAS program as a file on the file system. Then select the server that you use to access the file in the Host field. Finally, specify the physical path to the SAS program in the Path field.
- Select Metadata in the drop-down menu in the Type field if you want to maintain the SAS program as an object in the current metadata repository. Begin by clicking Edit to display the Edit Source Code window. Next, copy the SAS program and paste it into the window. Then, click OK to save the code to the current metadata repository. Finally, enter a name for the metadata object that specifies the location of the user-written code in the Name field. You can also enter a Description.
5. Accept the default SAS Application Server or select a different application server in the Execution Host field. This server executes the user-written source code.
6. Click OK to save your changes.
The specified user-written code is retrieved whenever code for this job is generated. After you have specified user-written code in the metadata for a job, you might want to execute the job to verify that it works in the context of SAS Data Integration Studio.

Creating a New Job That Consists of User-Written Code

Problem

You want to create a new job that consists entirely of user-written code.

Solution

You can write a SAS program that produces the desired output. Then, you can add metadata for the job and specify the location of the user-written code.
For documentation purposes, you can add a process flow diagram to the metadata for the job, but a process flow diagram is optional when user-written code is supplied for the entire job.

Tasks

Create a Job That Consists of User-Written Code

Perform the following steps to write the code and use it to create the job:
1. Write a SAS program that will perform the desired job. Verify that your program produces the appropriate output.
2. Create a new job and give it an appropriate name. The Process Designer window for the new job is displayed.
3. Open the Process tab in the properties window for the job.
4. Click User Written in the Code Generation group box. The Type, Name, and Description fields become active.
5. You can now perform one of the following steps to specify the location of the user-written SAS program:
- Select File in the drop-down menu in the Type field if you want to maintain the SAS program as a file on the file system. Then select the server that you use to access the file in the Host field. Finally, specify the physical path to the SAS program in the Path field.
- Select Metadata in the drop-down menu in the Type field if you want to maintain the SAS program as an object in the current metadata repository. Begin by clicking Edit to display the Edit Source Code window. Next, copy the SAS program and paste it into the window. Then, click OK to save the code to the current metadata repository. Finally, enter a name for the metadata object that specifies the location of the user-written code in the Name field. You can also enter a Description.
6. Accept the default SAS Application Server or select a different application server in the Execution Host field.
This server executes the user-written source code.
7. Click OK to save your changes.
The specified user-written code is retrieved whenever code for this job is generated. After you have specified user-written code in the metadata for a job, you might want to execute the job to verify that it works in the context of SAS Data Integration Studio.

Editing the Generated Code for a Transformation

Problem

You want to modify the code that is generated for a particular transformation in order to get the desired output from the transformation.

Solution

When you work with an existing transformation in a process flow diagram, you can generate SAS code for the transformation. Then, you can edit the generated code and save the edited code. Finally, you need to specify the location of the edited code in the metadata for the transformation. This edited code is saved as a file on the file system. Therefore, you might want to create a special directory for this type of code. Naturally, this method requires a basic understanding of the SAS programming language.

Tasks

Edit and Save the Generated Code

Perform the following steps to generate code for a transformation, edit the code, and then save the edited code to a file:
1. Right-click the job that contains the transformation that you want and click View Job in the pop-up menu. The job opens in the Process Editor window.
2. Right-click the transformation and click View Step Code. Click the Source Editor tab in the Process Editor window. Code is generated for the transformation and displayed in the Source Editor window.
3. Edit the generated code in the Source Editor window. Then, close the window. Click Yes when prompted to save your changes. A file selection window displays.
4. Specify a pathname for the edited code in the file selection window. Then click Save. The edited code is saved to a file, and the Source Editor window closes.
At this point, you have saved the edited code.
The next step is to specify the location of the edited code in the metadata for the transformation.

Specify a Path to the Edited Code in the Transformation Metadata

Perform the following steps to specify a path to the edited code in the transformation metadata:
1. Open the Process tab in the properties window for the transformation.
2. Click User Written. The Type field and related fields become active.
3. Select File with the drop-down menu in the Type field.
4. You will usually accept the default host in the Host field.
5. Specify a path to the file that contains the edited code in the Path field. The server in the Host field must be able to resolve this path. You can type in the path or use the Browse button to display a file selection window.
6. After specifying the path, click OK to save your changes.
The specified user-written code is retrieved whenever code for this transformation is generated. After you have specified user-written code in the metadata for a transformation, you might want to execute the job that contains the transformation to verify that it works in the context of SAS Data Integration Studio.

About the User Written Code Transformation

When you create a process flow diagram for a job, you can drag and drop transformations from the Process Library tree into the Process Editor. Sometimes, however, the predefined transformations in the Process Library tree might not meet your needs. In these cases, you can write a SAS program that produces the desired output. Then, you can add your code to a User Written Code transformation in the process flow diagram. Finally, you can specify the location of the new code in the metadata for the User Written Code transformation. The following display shows a sample User Written Code transformation dropped into a process flow diagram.

Display 12.1 Sample User Written Code Transformation in a Job

After the transformation has been inserted, you can update its metadata so that it specifies the location of your user-written code.
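By convention, a program supplied for a User Written Code transformation reads its input through the &SYSLAST macro variable (the last data set created) and writes its output to the work table named by the &_OUTPUT macro variable. The following is a minimal sketch only; the ID and STATUS variables are invented for illustration:

```sas
/* Minimal user-written transformation body.                     */
/* &SYSLAST resolves to the transformation's input table;        */
/* &_OUTPUT names the work table that downstream steps consume.  */
data &_OUTPUT;
   set &SYSLAST;
   /* Illustrative derived column; adjust to your own logic. */
   length status $1;
   if missing(id) then status = 'E';  /* flag rows with a missing key */
   else status = 'K';
run;
```

Any columns that such code derives (STATUS in this sketch) must also be defined manually on the transformation's Mapping tab, as described below.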
When the job is executed, the user-written code is retrieved and executed as part of the job.

Note: When SAS Data Integration Studio generates all of the code for a job, it can automatically generate the metadata for column mappings between sources and targets. However, when you specify user-written code for part of a job, you must manually define the column metadata for that part of the job that the user-written code handles. SAS Data Integration Studio needs this metadata to generate the code for the part of the job that comes after the User Written Code transformation.

For the example depicted in the process flow diagram in the display, you need to manually define the column metadata in the User Written Code transformation and the Loader transformation to create the target, Converted Employee Data. For details, see "Creating a Job That Includes the User Written Code Transformation" on page 223.

Note that you can add the RCSET macro and the TRANS_RC and JOB_RC variables to user-written code, such as the code for User Written transformations and generated transformation templates.

Creating a Job That Includes the User Written Code Transformation

Problem

You want to add user-written code to a job. One method uses the User Written Code transformation, which is provided by default in the SAS Data Integration Studio Process Library. Once you place this transformation in a job, you can add user-written code in the Process tab of its properties window and map its columns to the target table. This approach works particularly well with jobs that need quick custom code or require only one input and output and no parameters. More complicated situations are handled more effectively with the Transformation Generator wizard. For more information, see "Creating and Using a Generated Transformation" on page 226.

Solution

You can create a job that includes the User Written Code transformation. You need to add the code to the job in the User Written Code transformation.
Then, you must map the columns from the source table to the work table. Finally, you need to map the columns from the work table to the target table.

Tasks

Add User-Written Code to a Job

Perform the following steps to add the user-written code to a job:
1. Write SAS code and test it to ensure that it produces the required output. The following code was written for the sample job:

   data &_OUTPUT;
      set &SYSLAST;
      length sex $1;
      if gender = 1 then
         sex = 'M';
      else if gender = 2 then
         sex = 'F';
      else
         sex = 'U';
   run;

2. Create a new job and give it an appropriate name. The Process Designer window for the new job is displayed.
3. Drop the User Written Code transformation from the Data Transformations folder in the Process Library tree into the empty job.
4. Drop the source table on the drop zone for the User Written Code transformation.
5. Drop the target table on the target drop zone for the User Written Code transformation. The flow for the sample job is shown in the following display.

Display 12.2 Sample User Written Code Transformation in a Job

Note that a work table and the Table Loader transformation are automatically included in the job.
6. Open the Process tab in the properties window for the User Written Code transformation.
7. Click User Written in the Code Generation group box. The Type, Name, and Description fields become active.
8. Select Metadata in the drop-down menu in the Type field.
9. Click Edit to display the Edit Source Code window. Then, copy the SAS program and paste it into the window.
10. Click OK to save the code to the current metadata repository. Enter a name for the metadata object that specifies the location of the user-written code in the Name field. You can also enter a Description.
11. Accept the default SAS Application Server or select a different application server in the Execution Host field. This server executes the user-written source code.
12. Click Apply to save your changes.
Do not close the properties window for the User Written Code transformation.
At this point, you have updated the User Written Code transformation so that it can retrieve the appropriate code when the job is executed.

Map the Columns from the Source Table to the Work Table

When SAS Data Integration Studio generates all of the code for a job, it can automatically generate the metadata for column mappings between sources and targets. However, when you use user-written code for part of a job, you must manually define the column metadata for that part of the job that the user-written code handles. SAS Data Integration Studio needs this metadata to generate the code for the part of the job that comes after the User Written Code transformation. Perform the following steps to map between the columns in the source table and the columns in the work table:
1. Open the Mapping tab in the properties window for the User Written Code transformation.
2. Review the mappings between the source table in the Source table field and the work table in the Target table field. In the sample job, the following steps were required to map the columns properly:
   a. Right-click the GENDER column in the Target table field. Then, click Delete Columns.
   b. Right-click in the Target table field and click New Column. Give the new column an appropriate name (SEX, in this case). You should also make sure that column properties such as type and length match the properties of the same row in the target table. In this case, the length of the column needed to be changed to 1.
The mappings for this part of the sample job are shown in the following display.

Display 12.3 Sample Mapping from Source Table to Work Table

3. Click OK to save the settings and close the properties window for the User Written Code transformation.

Map the Columns from the Work Table to the Target Table

Perform the following steps to map between the columns in the work table and the columns in the target table:
1. Open the Mapping tab in the properties window for the Table Loader transformation.
2. Review the mappings between the work table in the Source table field and the target table in the Target table field. In the sample job, the SEX column in the Source table field was mapped to the SEX column in the Target table field.
3. Click OK to save the settings and close the properties window for the Table Loader transformation.
4. Run the job. The output of the job is shown in the following display.

Display 12.4 Output from the Sample Job

Creating and Using a Generated Transformation

Problem

You want to create a generated transformation that will enable you to process multiple outputs or inputs, macro variables, and parameters. Then, you need to use the generated transformation in a SAS Data Integration Studio job. Once written and saved, the generated transformation can be used in any job.

Solution

One of the easiest ways to customize SAS Data Integration Studio is to write your own generated transformations. The Transformation Generator wizard in SAS Data Integration Studio guides you through the steps of specifying SAS code for the transformation and saving the transformation in a current metadata repository.
After the transformation is saved and checked in, it displays in the Process Library tree, where it is available for use in any job.

Tasks

Write a Generated Transformation

Perform the following steps to write a generated transformation:
1. Open the Transformation Generator wizard from the Tools menu in the SAS Data Integration Studio menu bar. The wizard opens to the general information screen.
2. Enter an appropriate name for the transformation. Then, select the subfolder beneath the Process Library folder where you would like to place the transformation from the drop-down menu in the Process Library Folder field. You can also enter a description of the transformation. Click Next to access the SAS Code screen.
3. Enter the SAS code that will be generated by the transformation. You can either enter code manually or paste in SAS code from an existing source. The following display shows the SAS code for a sample generated transformation.

Display 12.5 Sample Transformation Code

A number of macro variables appear in this sample code. One of these, &SYSLAST, is normally available and refers to the last data set created. The transformation also includes other macro variables, such as &ColumnsToPrint. The type of each such variable is defined in the Options screen of the wizard. You will supply values for these user-defined variables when the transformation is included in a job. Click Next to access the Options window.
4. Click New to access the Create Option window. Define an option that corresponds to the first macro variable listed on the SAS Code screen. The following display shows a Create Option window for the sample transformation.

Display 12.6 Sample Create Option Window

Note that you provide an option name, a macro name, a description, and a data type for the option. If there are constraints on an option, you can click Constraints to access the Constraints window. The constraints are different for each data type.
You need to define each of the macro variables included in the transformation as an option. These options display on the Options tab of the transformation when it is used in a job. The completed Options screen for the sample transformation is depicted in the following display.

Display 12.7 Completed Options Screen

When you have defined options for each of the macro variables, click Next to access the Transform properties screen.
5. Use the Transform properties screen to specify the number of inputs and outputs for the generated transformation. The Transform properties screen for the sample transformation is depicted in the following display.

Display 12.8 Transform Properties

These values add a single input drop zone to the transformation when it is used in a job. The number of drop zones displayed in the Process Designer window is equal to the value in the Minimum number of inputs field. Therefore, only one drop zone is displayed. If you later update the transformation to increase this minimum number of inputs value, any jobs that have been submitted and saved use the original value. The increased minimum number of inputs is enforced only for subsequent jobs. This means that you can increase the minimum number of inputs without breaking existing jobs.

Increasing the maximum number of inputs allows you to drop additional inputs into the drop zone. (In the sample transformation, you can have up to three inputs because you set the maximum to three.) The same rules apply to outputs. The report generated by this transformation is sent to the Output panel of the Process Designer window. Therefore, you do not need to add an output drop zone to the transformation by using the controls in the Outputs group box.
6. After you have set up these properties, click Next to access the Select Folder window.
Select an appropriate Custom tree folder for this generated transformation.
7. After you have selected a Custom tree folder, click Next to access the Wizard Finish window. Verify that the metadata is correct, and then click Finish. Your transformation is created and saved.
8. Verify that the generated transformation is available in the Process Library. From the SAS Data Integration Studio desktop, click the tab for the Process Library tree. Locate the transformation that you just created.

Use a Generated Transformation in a Job

Perform the following steps to create and run a job that contains the generated transformation:
1. Create an empty job.
2. Drag the generated transformation from its folder in the Process Library tree. Then, drop it into the empty job.
3. Drop the source table for the job into the input drop zone for the generated transformation.
4. Drop the target table into the output drop zone for the generated transformation, if an output exists. You can also send the output to the Output tab of the Process Designer window. The appropriate option must be set so that the Output tab appears in the Process Designer window. For details, see the "Process Designer Window" topic in SAS Data Integration Studio Help. The sample job shown in the following display uses the Output tab in this way.

Display 12.9 Generated Transformation in a Sample Job

5. Open the Options tab in the properties window for the generated transformation. Enter appropriate values for each of the options created for the transformation. The values for the sample job are shown in the following display.

Display 12.10 Option Values in a Sample Job

Click Column Options to select the columns that are displayed in the Output tab of the Process Designer window.
The selections for the sample job are depicted in the following display.

Display 12.11 Column Options in a Sample Job

With these settings, the following %LET statements are generated when you run the job:

   %let ReportTitle = %nrquote(Employee Dependent Data);
   %let ColumnsToPrint = 'Name'n 'Age'n 'Weight'n;

Click OK to close the properties window and save the settings.
6. Run the job by right-clicking inside the Process Designer and selecting Submit from the pop-up menu. SAS Data Integration Studio generates and runs the following code:

   /* Options */
   %let ReportTitle = %nrquote(Employee Dependent Data);
   %let ColumnsToPrint = 'Name'n 'Sex'n 'Weight'n;

   PROC PRINT DATA=&SYSLAST;
      VAR &ColumnsToPrint;
      WHERE Sex='M' and Weight > 65;
      TITLE "&ReportTitle";
   RUN;

7. After the code has executed, check the Output tab of the Process Designer window for the report shown in the following display.

Display 12.12 Sample Output Report

Additional Information

For detailed information about the Transformation Generator wizard, see the "Maintaining Generated Transformations" topic in SAS Data Integration Studio Help. For more information about setting options on the Options screen, see the "Create/Edit Option Window" Help topic.

Maintaining a Generated Transformation

Problem

You want to perform maintenance tasks such as the following:
- analyzing the impact of generated transformations
- importing generated transformations that were generated before SAS Data Integration Studio 3.4
- updating generated transformations
These tasks are necessary because of (1) the way that generated transformations function in jobs and (2) the effects that changes made to the transformations could have on the jobs that contain them.

Solution

You can update a generated transformation, but the change can affect the existing jobs that include that transformation.
Before you change a generated transformation, you should run impact analysis on that transformation to see all of the jobs that would be affected by the change.

After you have run impact analysis, you can make updates to the transformations. Changes to a generated transformation can affect existing jobs that include that transformation. They can also affect any new jobs that will include that transformation. Therefore, you should be very careful about changing any generated transformation that has been included in existing jobs. This precaution reduces the possibility that any one user could make changes to a generated transformation that would adversely affect many users.

Tasks

Analyze the Impact of Generated Transformations

Perform the following steps to run impact analysis on a generated transformation:
1. Find the generated transformation that you want to analyze in the Process Library tree.
2. Right-click the transformation and click Impact Analysis. (You can also click Impact Analysis in the Tools menu.) The Report tab of the Impact Analysis window displays, as shown in the following display.

Display 12.13 Impact Analysis on a Sample Generated Transformation

The selected generated transformation is named Employee Dependent Data. The Impact Analysis window shows that the selected transformation is used in several jobs. You can right-click the objects in the Report tab to access their properties windows and obtain information about them. For details about the available options, see the "Report Tab Options" topic in SAS Data Integration Studio Help.
   %macro clear_work;
      proc sql noprint;
         select memname into :work_members separated by ','
            from dictionary.tables
            where libname = 'WORK' and
                  memtype = 'DATA';
      quit;
      data _null_;
         work_members = symget('work_members');
         num_members = input(symget('sqlobs'), best.);
         do n = 1 to num_members;
            this_member = scan(work_members, n, ',');
            call symput('member'||trim(left(put(n,best.))), trim(this_member));
         end;
         call symput('num_members', trim(left(put(num_members,best.))));
      run;
      %if &num_members gt 0 %then %do;
         proc datasets library = work nolist;
            %do n=1 %to &num_members;
               delete &&member&n;
            %end;
         quit;
      %end;
   %mend clear_work;
   %clear_work

Note: The previous macro deletes all data sets in the Work library.

For details about adding a post-process to a SAS Data Integration Studio job, see "Specifying Options for Jobs" on page 211.

The transformation output tables for a process flow remain until the SAS session that is associated with the flow is terminated. Analyze the process flow and determine whether there are output tables that are not being used (especially if these tables are large). If so, you can add transformations to the flow that will delete these output tables and free up valuable disk space and memory. For example, you can add a generated transformation that will delete output tables at a certain point in the flow. For details about generated transformations, see "Creating and Using a Generated Transformation" on page 226.

Cleanse and Validate Data

Clean and de-duplicate the incoming data early in the process flow so that extra data that might cause downstream errors in the flow is caught and eliminated quickly. This process can reduce the volume of data that is being sent through the process flow.
To clean the data, consider using the Sort transformation with the NODUPKEY option or the Data Validation transformation. The Data Validation transformation can perform missing-value detection and invalid-value validation in a single pass of the data. It is important to eliminate extra passes over the data, so try to code all of these validations into a single transformation.
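As a rough illustration of the NODUPKEY approach mentioned above (the table and BY-variable names here are invented for the sketch), de-duplication in a single pass looks like this:

```sas
/* Keep the first row for each CUSTOMER_ID and route the      */
/* duplicates to a separate table for review.                 */
proc sort data=staging.customers
          out=work.customers_dedup
          dupout=work.customers_dups
          nodupkey;
   by customer_id;
run;
```

Placing this step early in the flow reduces the row count that every downstream transformation must process.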
The Data Validation transformation also provides de-duplication capabilities and error-condition handling. For information, search for "data validation" in SAS Data Integration Studio Help.

Minimize Remote Data Access

Remote data has to be copied locally because it is not accessible by the relevant components in the default SAS Application Server at the time that the code is generated. SAS uses SAS/CONNECT and the UPLOAD and DOWNLOAD procedures to move data. It can take longer to access remote data than local data, especially when you access large data sets.
For example, the following data is considered local in a SAS Data Integration Studio job:
- data that can be accessed as if it were on the same computers as the SAS Workspace Server components of the default SAS Application Server
- data that is accessed with a SAS/ACCESS engine (used by the default SAS Application Server)
The following data is considered remote in a SAS Data Integration Studio job:
- data that cannot be accessed as if it were on the same computers as the SAS Workspace Server
- data that exists in a different operating environment from the SAS Workspace Server components of the default SAS Application Server (such as mainframe data that is accessed by servers running under Microsoft Windows)
Avoid or minimize remote data access in a process flow.
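To illustrate the cost being described, moving a remote table with SAS/CONNECT typically looks like the following sketch (the session name, library path, and table names are invented, and assume a configured SAS/CONNECT setup):

```sas
/* Sign on to the remote host and download a remote table so   */
/* that it can be processed locally. Each transfer like this   */
/* adds network and I/O time to the process flow.              */
signon rhost;                        /* remote session; assumed configuration */
rsubmit;
   libname src '/data/sales';        /* library on the remote host */
   proc download data=src.orders out=work.orders;
   run;
endrsubmit;
signoff;
```

Every table that crosses the tier boundary pays this transfer cost, which is why keeping data local to the SAS Workspace Server is preferred.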
For information about accessing remote data, or executing a job on a remote host, administrators should see "Multi-Tier Environments" in the SAS Data Integration Studio chapter of the SAS Intelligence Platform: Desktop Application Administration Guide.

Managing Columns

Problem

Your process flows are running slowly, and you suspect that the columns in your source tables are either poorly managed or superfluous.

Solution

You can perform the following tasks on columns to improve the performance of process flows:
- drop unneeded columns
- avoid adding unneeded columns
- aggregate columns
- match the size of column variables to the data length

Tasks

Drop Unneeded Columns

As soon as the data comes in from a source, consider dropping any columns that are not required for subsequent transformations in the flow. You can drop columns and make aggregations early in the process flow instead of later. This prevents the extraneous detail data from being carried along between all transformations in the flow. You should work to create a structure that matches the ultimate target table structure as closely as possible early in the process flow. Then, you can avoid carrying extra data along with the process flow.
To drop columns in the output table for a SAS Data Integration Studio transformation, click the Mapping tab and remove the extra columns from the Target table area on the tab. Use derived mappings to create expressions to map several columns together. You can also turn off automatic mapping for a transformation by right-clicking the transformation in the process flow and deselecting the Automap option in the pop-up menu. You can then build your own transformation output table columns to match your ultimate target table and map them.

Avoid Adding Unneeded Columns

As data is passed from step to step in a process flow, columns could be added or modified.
For example, column names, lengths, or formats might be added or changed. In SAS Data Integration Studio, these modifications to a table, which are done on a transformation's Mapping tab, often result in the generation of an intermediate SQL view step. In many situations, that intermediate step adds processing time. Try to avoid generating more of these steps than is necessary.

Rework your flow so that column modifications and additions that are spread across many transformations are consolidated within fewer transformations. Avoid using unnecessary aliases; if the mapping between columns is one-to-one, then keep the same column names. Avoid multiple mappings on the same column, such as converting a column from a numeric to a character value in one transformation and then converting it back from a character to a numeric value in another transformation. For aggregation steps, rename any columns within those transformations, rather than in subsequent transformations.

Aggregate Columns for Efficiency

When you add column mappings, also consider the level of detail that is being retained. Ask these questions:
- Is the data being processed at the right level of detail?
- Can the data be aggregated in some way?

Aggregations and summarizations eliminate redundant information and reduce the number of records that have to be retained, processed, and loaded into a data collection.

Match the Size of Column Variables to Data Length

Verify that the size of the column variables in the data collection is appropriate to the data length. Consider both the current and future uses of the data:
- Are the keys the right length for the current data?
- Will the keys accommodate future growth?
- Are the data sizes on other variables correct?
- Do the data sizes need to be increased or decreased?

Data volumes multiply quickly, so ensure that the variables that are being stored in the data warehouse are the right size for the data.

Streamlining Process Flow Components

Problem

You have worked hard to optimize the data and columns in your process flow, but your flow is still running too slowly.

Solution

You can try the following best practices when they are relevant to your process flows:
- Work from simple to complex.
- Use transformations for star schemas and lookups.
- Use surrogate keys.

Tasks

Work from Simple to Complex

When you build process flows, build complexity up rather than starting with a complex task. For example, consider building multiple individual jobs and validating each, rather than building large, complex jobs. This ensures that the simpler logic produces the expected results.

Also, consider subsetting incoming data or setting a pre-process option to limit the number of observations that are initially processed, so that you can fix job errors and validate results before applying processes to large volumes of data or complex tasks. For details about limiting input to SAS Data Integration Studio jobs and transformations, see "Limit Input to a Transformation" on page 247.

Use Transformations for Star Schemas and Lookups

Consider using the Lookup transformation when you build process flows that require lookups, such as fact table loads. The Lookup transformation is built using a fast in-memory lookup technique known as DATA step hashing that is available in SAS 9. The transformation allows for multi-column keys and has useful error-handling techniques, such as control over missing-value handling and the ability to set limits on errors.

When you are working with star schemas, consider using the SCD Type 2 transformation. This transformation efficiently handles change data detection, and has been optimized for performance.
Several change detection techniques are supported: date-based, current indicator, and version number. For details about the SCD Type 2 transformation, see Chapter 19, "Working with Slowly Changing Dimensions," on page 343.

Use Surrogate Keys

Another technique to consider when you are building a data warehouse is to use incrementing integer surrogate keys as the main key technique in your data structures. Surrogate keys are values that are assigned sequentially as needed to populate a dimension. They are very useful because they can shield users from changes in the operational systems that might invalidate the data in a warehouse (and thereby require redesign and reloading). For example, using a surrogate key can avoid issues if the operational system changes its key length or type. In this case, the surrogate key remains valid; an operational key would not.

The SCD Type 2 transformation includes a surrogate key generator. You can also plug in your own key-generation methodology that matches your business environment and point the transformation to it. There is also a Surrogate Key Generator transformation that can be used to build incrementing integer surrogate keys.

Avoid character-based surrogate keys. In general, functions that are based on integer keys are more efficient because they avoid the need for subsetting or string partitioning that might be required for character-based keys. Numeric values are also smaller than character strings, thereby reducing the storage that is required in the warehouse.

For details about surrogate keys and the SCD Type 2 transformation, see Chapter 19, "Working with Slowly Changing Dimensions," on page 343.

Using Simple Debugging Techniques

Problem

Occasionally a process flow might run longer than you expect, or the data that is produced might not be what you anticipate (either too many records or too few). In such cases, it is important to understand how a process flow works.
Then, you can correct errors in the flow or improve its performance.

Solution

A first step in analyzing process flows is being able to access information from SAS that explains what happened during the run. If there were errors, you need to understand what happened before the errors occurred. If you are having performance issues, the logs will show where you are spending your time. Finally, knowing which SAS options are set, and how they are set, can help you determine what is going on in your process flows. You can perform the following tasks:
- Monitor job status.
- Verify output.
- Limit input.
- Add debugging code.
- Set SAS invocation options.
- Set and check status codes.

Tasks

Monitor Job Status

You can use the Job Status Manager window to display the name, status, starting time, ending time, and application server used for all jobs submitted in the current session. You can also right-click on any row to clear, view, cancel, kill, or resubmit a job. From the SAS Data Integration Studio desktop, select Tools > Job Status Manager to open the Job Status Manager window. For more information, see "Using the Job Status Manager" on page 159.

Verify Output from a Transformation

You can view the output tables for the transformations in the job. Reviewing the output tables enables you to verify that each transformation is creating the expected output. This review can be useful when a job is not producing the expected output or when you suspect that something is wrong with a particular transformation in the job. For more information, see "Browsing Table Data" on page 109.

Limit Input to a Transformation

When you are debugging and working with large data files, you might find it useful to decrease some or all of the data that is flowing into a particular step or steps.
One way of doing this is to use the OBS= data set option on input tables of DATA steps and procedures.

To specify OBS= for an entire job in SAS Data Integration Studio, add the following code to the Pre and Post Processing tab in the job's properties window:

options obs=<number>;

To specify OBS= for a transformation within a job, you can temporarily add the option to the System options field on the Options tab in the transformation's properties window. Alternatively, you can edit the code that is generated for the transformation and execute the edited code. For more information about this method, see "Specifying Options for Jobs" on page 211.

Important considerations when you are using the OBS= system option include the following:
- All inputs into all subsequent steps are limited to the specified number, until the option is reset.
- Setting the number too low before a join or merge step can result in few or no matches, depending on the data.
- In the SAS Data Integration Studio Process Editor, this option stays in effect for all runs of the job until it is reset or the Process Designer window is closed.

The syntax for resetting the option is as follows:

options obs=MAX;

Note: Removing the OBS= line of code from the Process Editor does not reset the OBS= system option. You must reset it as shown previously, or by closing the Process Designer window.
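For example, a debugging session might limit a flow to a small sample and then restore full processing; the library and table names here are hypothetical:

```sas
/* Process only the first 1000 observations while debugging. */
options obs=1000;

data work.subset;
   set srclib.big_table;   /* hypothetical input table */
run;

/* Reset the option so that subsequent steps read all observations. */
options obs=MAX;
```

Remember that until the reset, every subsequent step in the session is also limited to 1000 observations.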
Add Debugging Code to a Process Flow

If you are analyzing a SAS Data Integration Studio job, and the information that is provided by logging options and status codes is not enough, consider the following methods for adding debugging code to the process flow.

Table 13.1 Methods for Adding Custom Debugging Code

- Replace the generated code for a transformation with user-written code. See "Adding User-Written Source Code to an Existing Job" on page 219.
- Add a User-Written Code transformation to the process flow. See "Creating a Job That Includes the User Written Code Transformation" on page 223.
- Add a generated transformation to the process flow. See "Creating and Using a Generated Transformation" on page 226.
- Add a return code to the process flow. See "Set and Check Status Codes" on page 248.

Custom code can direct information to the log or to alternate destinations such as external files or tables. Possible uses include tests of frequency counts, dumping SAS macro variable settings, or listing the run-time values of system options.

Set SAS Invocation Options on Jobs

When you submit a SAS Data Integration Studio job for execution, it is submitted to a SAS Workspace Server component of the relevant SAS Application Server. The relevant SAS Application Server is one of the following:
- the default server that is specified on the SAS Server tab in the Options window
- the SAS Application Server to which a job is deployed with the Deploy for Scheduling option

To set SAS invocation options for all SAS Data Integration Studio jobs that are executed by a particular SAS server, specify the options in the configuration files for the relevant SAS Workspace Servers, batch or scheduling servers, and grid servers. (You do not set these options on SAS Metadata Servers or SAS Stored Process Servers.) Examples of these options include UTILLOC, NOWORKINIT, or ETLS_DEBUG.
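As an illustration, invocation options such as these could be appended to a workspace server's SAS configuration file; the option values and path shown are hypothetical, and the exact file to edit depends on your site's configuration:

```sas
/* Hypothetical additions to a workspace server's sasv9.cfg file */
-utilloc "D:\saswork\util"   /* place utility files on a faster volume */
-noworkinit                  /* keep existing Work files at invocation */
```

Because these options apply to every job that the server executes, coordinate such changes with your SAS administrator.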
For more information, see "Modifying Configuration Files or SAS Start Commands for Application Servers" on page 212.

To set SAS global options for a particular job, you can add these options to the Pre and Post Process tab in the Properties window for the job. For more information about adding code to this window, see "Specifying Options for Jobs" on page 211.

The properties window for most transformations within a job has an Options tab with a System Options field. Use the System Options field to specify options for a particular transformation in a job's process flow. For more information, see "Specifying Options for a Transformation" on page 211.

For more information about SAS options, search for relevant phrases such as "system options" and "invoking SAS" in SAS OnlineDoc.

Set and Check Status Codes

When you execute a job in SAS Data Integration Studio, a return code for each transformation in the job is captured in a macro variable. The return code for the job is set according to the least successful transformation in the job. SAS Data Integration Studio enables you to associate a return code condition, such as Successful, with an action, such as Send Email or Abort. In this way, users can specify how a return code is handled for the job or transformation.

For example, you could specify that a transformation in a process flow will terminate based on conditions that you define. This can reduce the log to just the transformations leading up to the problem being investigated, making the log more manageable and eliminating inconsequential error messages. For more information about status code handling, see "Managing Status Handling" on page 161.

Using SAS Logs

Problem

The errors, warnings, and notes in the SAS log provide information about process flows. However, large SAS logs can decrease performance, so the costs and benefits of large SAS logs should be evaluated.
For example, in a production environment, you might not want to create large SAS logs by default.

Solution

You can use SAS logs in the following ways:
- Evaluate SAS logs.
- Capture additional SAS options in the SAS log.
- View or hide SAS logs.
- Redirect large SAS logs to a file.

Tasks

Evaluate SAS Logs

The SAS logs from your process flows are an excellent resource to help you understand what is happening as the flows execute. For example, when you look at the run times in the log, compare the real-time values to the CPU time (user CPU plus system CPU). For read operations, the real time and CPU time should be close. For write operations, however, the real time can substantially exceed the CPU time, especially in environments that are optimized for read operations. If the real time and the CPU time are not close, and they should be close in your environment, investigate what is causing the difference.

If you suspect that you have a hardware issue, see A Practical Approach to Solving Performance Problems with the SAS System, a document that is available from the Scalability and Performance Papers page at http://support.sas.com/rnd/scalability/papers/.

If you determine that your hardware is properly configured, then review the SAS code. Transformations generate SAS code. Understanding what this code is doing is very important to ensure that you do not duplicate tasks, especially sorts, which are resource-intensive.
The goal is to configure the hardware so that there are no bottlenecks, and to avoid needless I/O in the process flows.

Capture Additional SAS Options in the SAS Log

To analyze performance, we recommend that you turn on the following SAS options so that detailed information about what the SAS tasks are doing is captured in the SAS log:

FULLSTIMER
MSGLEVEL=I (this option prints additional notes pertaining to index, merge processing, sort utilities, and CEDA usage, along with the standard notes, warnings, and error messages)
SOURCE, SOURCE2
MPRINT
NOTES

To interpret the output from the FULLSTIMER option, see A Practical Approach to Solving Performance Problems with the SAS System, a document that is available from the Scalability and Performance Papers page at http://support.sas.com/rnd/scalability/papers/.

In addition, the following SAS statements also send useful information to the SAS log:

PROC OPTIONS OPTION=UTILLOC; run;
PROC OPTIONS GROUP=MEMORY; run;
PROC OPTIONS GROUP=PERFORMANCE; run;
LIBNAME _ALL_ LIST;

The PROC OPTIONS statement sends SAS options and their current settings to the SAS log. There are hundreds of SAS options, so if, for example, you prefer to see only the value that has been set for the SAS MEMORY option, you can issue the PROC OPTIONS statement with the GROUP=MEMORY parameter. The same is true if you want to see only the SAS options that pertain to performance.

The LIBNAME _ALL_ LIST statement sends information (such as physical path location and the engine being used) to the SAS log about each libref that is currently assigned to the SAS session. This data is helpful for understanding where all the work occurs during the process flow. For details about setting SAS invocation options for SAS Data Integration Studio, see "Set SAS Invocation Options on Jobs" on page 248.

View or Hide SAS Logs

The Process Designer window in SAS Data Integration Studio has a Log tab that displays the SAS log for the job in the window. Perform the following steps to display or
Perform the following steps to display orhide the Log tab:1 Select Tools Options on the SAS Data Integration Studio menu bar to displaythe Options window.2 Click the General tab in the Options window. Then, select or deselect the checkbox that controls whether the Log tab is displayed in the Process Designer window.3 Click OK in the Options window to save the setting and close the window.Redirect Large SAS Logs to a FileThe SAS log for a job provides critical information about what happened when a jobwas executed. However, large jobs can create large logs, which can slow down SAS DataIntegration Studio. In order to avoid this problem, you can redirect the SAS log to apermanent file. Then, you can turn off the Log tab in the Process Designer window.When you install SAS Data Integration Studio, the Configuration Wizard enables youto set up as permanent SAS log files for each job that is executed. The SAS logOptimizing Process Flows Tasks 251filenames will contain the name of the job that created the log, plus a timestamp ofwhen the job was executed.Alternatively, you can add the following code to the Pre and Post Process tab inthe properties window for a job:proc printto log=...path_to_log_file NEW; run;For details about adding pre-process code to a SAS Data Integration Studio job, seeSpecifying Options for Jobs on page 211. This code will cause the log to be redirectedto the specified file. Be sure to use the appropriate host-specific syntax of the hostwhere your job will be running when you specify this log file, and make sure that youhave Write access to the location where the log will be written.Reviewing Temporary Output TablesProblemMost transformations in a SAS Data Integration Studio job create at least one outputtable. Then, they store these tables in the Work library on the SAS Workspace Serverthat executes the job. The output table for each transformation becomes the input tothe next transformation in the process flow. 
All output tables are deleted when the job is finished or the current server session ends.

Sometimes a job does not produce the expected output. Other times, something can be wrong with a particular transformation. In either case, you can view the output tables for the transformations in the job to verify that each transformation is creating the expected output. Output tables can also be preserved to determine how much disk space they require. You can even use them to restart a process flow after it has failed at a particular step (or in a specific transformation).

Solution

You can (1) view a transformation's temporary output table from the Process Designer window and (2) preserve temporary output tables so that you can view their contents by other means. You can perform the following tasks to accomplish these objectives:
- Preserve temporary output tables.
- View temporary output tables.
- Redirect temporary output tables.
- Add a List Data transformation to a process flow.
- Add a User-Written Code transformation to a process flow.

Tasks

Preserve Temporary Output Tables

When SAS Data Integration Studio jobs are executed in batch mode, a number of SAS options can be used to preserve intermediate files in the Work library. These system options can be set as described in "Set SAS Invocation Options on Jobs" on page 248.

Use the NOWORKINIT system option to prevent SAS from erasing existing Work files on invocation. Use the NOWORKTERM system option to prevent SAS from erasing existing Work files on termination.

For example, to create a permanent SAS Work library in UNIX and PC environments, you can start the SAS Workspace Server with the WORK option to redirect the Work files to a permanent Work library.
The NOWORKINIT and NOWORKTERM options must be included, as follows:

C:\> "C:\Program Files\SAS\SAS 9.1\sas.exe"
   -work "C:\Documents and Settings\sasapb\My Documents\My SAS Files\My SAS Work Folder"
   -noworkinit
   -noworkterm

This redirects the generated Work files to the folder My SAS Work Folder.

To create a permanent SAS Work library in the z/OS environment, edit your JCL statements and change the WORK DD statement to a permanent MVS data set. For example:

//STEP1 EXEC SDSSAS9,REGION=50M
//* changing work lib definition to a permanent data set
//SDSSAS9.WORK DD DSN=userid.somethin.sasdata,DISP=OLD
//* other file defs
//INFILE DD ... .

CAUTION: If you redirect Work files to a permanent library, you must manually delete these files to avoid running out of disk space.

View Temporary Output Tables

Perform the following steps to view the output file:

1. Open the job in the Process Designer window.
2. Submit the job for execution. The transformations must execute successfully. (Otherwise, a current output table is not available for viewing.)
3. Right-click the transformation of the output table that you want to view, and select View Data from the pop-up menu. The transformation's output table is displayed in the View Data window.

This approach works only if you do not close the Process Designer window. When you close the Process Designer window, the current server session ends, and the output tables are deleted. For information, see "Browsing Table Data" on page 109.

Redirect Temporary Output Tables

The default name for a transformation's output table is a two-level name that specifies the Work libref and a generated member name, such as work.W54KFYQY. You can specify the name and location of the output table for that transformation on the Physical Storage tab of the properties window for the temporary output table. Note that this location can be a SAS library or an RDBMS library. This has the added benefit of letting users specify which output tables they want to retain and allowing the rest to be deleted by default.
Users can use this scheme as a methodology for checkpoints by writing specific output tables to disk when needed.

Note: If you want to save a transformation output table to a library other than the SAS User library, replace the default name for the output table with a two-level name.

If you refer to an output table with a single-level name (for example, employee), instead of a two-level name (for example, work.employee), SAS automatically sends the output table to the User library, which defaults to the Work library. However, this default behavior can be changed by any SAS user. Through the USER= system option, a SAS user can redirect the User library to a different library. If the USER= system option is set, single-level tables are stored in the redirected User library instead of the Work library.

Add the List Data Transformation to a Process Flow

In SAS Data Integration Studio, you can use the List Data transformation to print the contents of an output table from the previous transformation in a process flow. Add the List Data transformation after any transformation whose output table is of interest to you.

The List Data transformation uses the PRINT procedure to produce output. Any options associated with that procedure can be added from the Options tab in the transformation's properties window. By default, output goes to the Output tab in the Process Designer window. Output can also be directed to an HTML file. For large data, customize this transformation to print just a subset of the data. For details, see the "Example: Create Reports from Table Data" topic in SAS Data Integration Studio Help.

Add a User-Written Code Transformation to the Process Flow

You can add a User Written Code transformation to the end of a process flow that moves or copies some of the data sets in the Work library to a permanent library.
For example, assume that there are three tables in the Work library (test1, test2, and test3). The following code moves all three tables from the Work library to a permanent library named PERMLIB and then deletes them from the Work library:

libname permlib base
   'C:\Documents and Settings\ramich\My Documents\My SAS Files\9.1';
proc copy move in=work out=permlib;
   select test1 test2 test3;
run;

For information about User Written Code transformations, see "Creating a Job That Includes the User Written Code Transformation" on page 223.

Additional Information

The techniques covered in this chapter address general performance issues that commonly arise for process flows in SAS Data Integration Studio jobs. For specific information about the performance of the SQL Join transformation, see "Optimizing SQL Processing Performance" on page 320. For specific information about the performance of the Table Loader transformation, see "Selecting a Load Technique" on page 270 and "Removing Non-Essential Indexes and Constraints During a Load" on page 272.

You can also access a library of SAS technical papers that cover a variety of performance-related topics. You can find these papers at http://support.sas.com/documentation/whitepaper/technical/.

Chapter 14: Using Impact Analysis

About Impact Analysis and Reverse Impact Analysis 255
Prerequisites 256
Performing Impact Analysis 257
   Problem 257
   Solution 257
   Tasks 257
      Perform Impact Analysis 257
Performing Impact Analysis on a Generated Transformation 258
   Problem 258
   Solution 259
   Tasks 259
      Perform Impact Analysis on a Generated Transformation 259
Performing Reverse Impact Analysis 259
   Problem 259
   Solution 259
   Tasks 260
      Perform Reverse Impact Analysis 260

About Impact Analysis and Reverse Impact Analysis

Impact analysis identifies the tables, columns, jobs, and transformations that are affected by a change in a selected table or column.
Use impact analysis before changing or deleting a metadata object to see how that change can affect other objects.

Reverse impact analysis identifies the tables, columns, jobs, and transformations that contribute to the content of a selected table or column. Use reverse impact analysis to help determine the cause of data anomalies.

The following figure shows the difference between impact analysis and reverse impact analysis for a selected object.

Figure 14.1 Differentiating Impact Analysis and Reverse Impact Analysis
[Figure: a data flow from External File through Job 1 and Job 2 to an OLAP Cube. Reverse impact analysis traces the upstream data flow into the selected object; impact analysis traces the downstream data flow out of it.]

As shown in the figure, impact analysis traces the impact of the selected object on later objects in the data flow. Reverse impact analysis traces the impact that previous objects in the data flow have had on the selected object.

In SAS Data Integration Studio, you can perform impact analysis and reverse impact analysis on the following kinds of metadata:
- table metadata
- column metadata
- external file metadata
- OLAP cube metadata
- metadata for OLAP cube features, levels, and measures
- metadata for Information Maps, Enterprise Guide projects, or Add-in for Microsoft Office (AMO) objects
- message queue metadata
- metadata for SAS Data Integration Studio jobs
- metadata for transformations, including generated transformations

For further information, see the SAS Data Integration Studio Help topic on using impact analysis and reverse impact analysis.

Prerequisites

To ensure that your analyses include a search of all repositories, select Tools > Options and click the Impact Analysis tab. On the Impact Analysis tab, select the check box Continue analysis into dependent repositories.

Impact analysis and reverse impact analysis display only the objects for which users have ReadMetadata permission. Administrators should make sure that users have ReadMetadata permissions for all repositories needed for their analysis.
For more information, the administrator should see the SAS Intelligence Platform: Security Administration Guide.

Performing Impact Analysis

Problem

A table is used in the process flow for a job. You want to delete the metadata for a column in the table, and you want to trace the impact that this will have on later objects in the process flow.

Solution

Use impact analysis to trace the impact of the selected object on later objects in the process flow for the job.

Tasks

Perform Impact Analysis

In general, to perform impact analysis on a metadata object, right-click the object in a tree view or in the context of a process flow in the Process Designer window, and then select Impact Analysis from the pop-up menu. Alternatively, you can select the object in a tree view or in the context of a process flow, and then select Tools > Impact Analysis from the menu bar.

Perform the following steps to trace the impact of the metadata for a table column:

1. Double-click the metadata object for the table that contains the column to be analyzed. The properties window for the table displays.
2. Select the Columns tab.
3. On the Columns tab, right-click the column to be analyzed and select Impact Analysis from the pop-up menu. In the resulting Impact Analysis window, the Report tab displays by default. In the following display, the Report tab shows the result of an analysis performed on a column named Customer_ID in a table named CUSTOMER.

Display 14.1 Report Tab in the Impact Analysis Window

The Report tab uses a hierarchical list to illustrate the impact of the selected object (the Customer_ID column) on later objects in a process flow. In the previous display, the tab contains the following objects:
- CUSTOMER.Customer_ID (Foundation): specifies the selected column, Customer_ID, in the table CUSTOMER, which is registered in the Foundation repository.
- SCD_1 (Foundation): specifies the job, SCD_1, to which the Customer_ID column is an input.
- SCD Type 2 Loader.Customer_ID (1:1) (Foundation): specifies the transformation that maps data from the Customer_ID column to a column later in the process flow. The mapping type is 1:1.
- CUSTOMER_DIM.Customer_ID (Foundation): specifies the target column, Customer_ID, in the table CUSTOMER_DIM. The target column is loaded with data from the selected column.

4. Click the Graph tab to display a graphical view of the analytical results, as shown in the following example.

Display 14.2 Graph Tab of the Impact Analysis Window

The Graph tab uses a process flow to illustrate the impact of the selected object (the Customer_ID column) on later objects in the flow.

Performing Impact Analysis on a Generated Transformation

Problem

You want to determine how many jobs are impacted by a change to a generated transformation.

A generated transformation is a transformation that you create with the Transformation Generator wizard. You can use this wizard to create your own generated transformations and register them in your metadata repository. After they are registered, your transformations display in the Process Library, where they are available for use in any job. For more information about these transformations, see "Creating and Using a Generated Transformation" on page 226.

When you change or update a generated transformation, the change can affect the jobs that include that transformation.
Before you change a generated transformation, you should run impact analysis on that transformation to see all of the jobs that would be affected by the change.

Solution

Run impact analysis on the generated transformation.

Tasks

Perform Impact Analysis on a Generated Transformation

Perform the following steps to run an impact analysis on a generated transformation:

1. From the SAS Data Integration Studio desktop, select the Process Library tree.
2. Open the folder that contains the generated transformation that you want to analyze.
3. Select that transformation and then, from the menu bar, select Tools > Impact Analysis. The Report tab of the Impact Analysis window displays, as shown in the following example.

Display 14.3 Impact Analysis on a Generated Transformation

In the preceding display, the selected generated transformation is named Summary Statistics. The Impact Analysis window shows that the selected transformation is used in the job Home Run Stats.

You can right-click the objects in the Report tab to obtain information about those objects. For details about the available options, see the SAS Data Integration Studio Help topic on Report tab options. For a process flow view of the impacts, select the Graph tab.

Performing Reverse Impact Analysis

Problem

A table is used in the process flow for a job.
You notice an error in the data for one column, and you want to trace the data flow to that column.
Solution
Use reverse impact analysis to identify the tables, columns, jobs, and transformations that contribute to the content of a selected column.
Tasks
Perform Reverse Impact Analysis
In general, to perform reverse impact analysis on a metadata object, right-click the object in a tree view or in the context of a process flow in the Process Designer window, then select Reverse Impact Analysis from the pop-up menu.
Alternatively, you can select the object in a tree view or in the context of a process flow, then select Tools > Reverse Impact Analysis from the menu bar.
The steps for performing reverse impact analysis on a column are similar to the steps in Perform Impact Analysis on page 257.
PART 3: Working with Specific Transformations
Chapter 15. Working with Loader Transformations 263
Chapter 16. Working with SAS Sorts 275
Chapter 17. Working with the SQL Join Transformation 283
Chapter 18. Working with Iterative Jobs and Parallel Processing 331
Chapter 19. Working with Slowly Changing Dimensions 343
Chapter 20. Working with Message Queues 361
Chapter 21.
Working with SPD Server Cluster Tables 369
CHAPTER 15: Working with Loader Transformations
About Loader Transformations 263
About the SPD Server Table Loader Transformation 264
About the Table Loader Transformation 264
Setting Table Loader Transformation Options 265
Problem 265
Solution 265
Tasks 265
Set Options Based on Load Style 265
Set Constraint Condition and Index Condition Options 268
Selecting a Load Technique 270
Problem 270
Solution 270
Tasks 271
Remove All Rows 271
Add New Rows 271
Match and Update Rows 272
Removing Non-Essential Indexes and Constraints During a Load 272
Problem 272
Solution 272
Considering a Bulk Load 273
Problem 273
Solution 273
About Loader Transformations
SAS Data Integration Studio provides the following transformations for loading output tables in a process flow:
- the SCD Type 2 Loader transformation, which loads source data into a dimension table, detects changes between source and target rows, updates change tracking columns, and applies generated key values. This transformation implements slowly changing dimensions. For more information, see Chapter 19, Working with Slowly Changing Dimensions, on page 343.
- the SPD Server Table Loader transformation, which reads a source and writes to an SPD Server target. This transformation is automatically added to a process flow when an SPD Server table is specified as a source or as a target. It enables you to specify options that are specific to SPD Server tables.
- the Table Loader transformation, which reads a source table and writes to a target table. This transformation is added automatically to a process flow when a table is dropped onto a Process Designer window.
About the SPD Server Table Loader Transformation
The SPD Server Table Loader transformation is automatically added to a process flow when a SAS Scalable Performance Data (SPD) Server table is added as a source or as a target in the Process Designer window.
The SPD Server Table Loader generates code that is appropriate for the special data format that the server uses. It also enables you to specify options that are unique to SPD Server tables.
You can specify a variety of table options in the Additional Data Table Options field on the Options tab of the SPD Server Table Loader properties window. These options are described in detail in the documentation that is installed with the SPD Server. One example of an additional table option is the MINMAXVARLIST option that is described in the SAS Data Integration Studio Usage Notes topic in SAS Data Integration Studio Help.
About the Table Loader Transformation
The Table Loader transformation provides load options and combinations of load options. You can use it in the following ways:
- You can update-and-insert rows, and you can also update only or insert only by using skip options.
- After you find and match new records, you can process them by using SQL Insert or the APPEND procedure.
- SQL Update has ignore-blanks options similar to those for DATA Step Modify in previous releases.
- For SAS tables and for some database tables, indexes and constraints can be revised without having to recreate the table.
- For SAS tables and for some database tables, indexes and constraints can be dropped or added before or after a load. For example, Job One can load Table A without an index, and Job Two can update the same table and add the indexes.
- Index dropping is available for all update methods, not just the APPEND procedure. If an index is needed for the update method, that index will not be dropped.
Finally, the current version of the Table Loader transformation inserts a comment into the generated code for queries that states that the code was generated by version 2 of the transformation.
This comment distinguishes the code from the code that is generated by the earlier version of the Table Loader transformation.
To understand how the Table Loader transformation functions in the context of a job, you need to understand the following points about how jobs are constructed:
- When a transformation is added to a process flow, it sends its output to a temporary output table by default. Each temporary output table has its own icon in the process flow. Having a separate metadata object for each temporary table makes it easier to manage the data and metadata for these tables.
- When you drop a permanent data structure such as a physical table onto the output drop zone for the transformation, the Table Loader transformation is added to the process flow. It is located between the temporary output table and the permanent data structure.
When a process flow contains a temporary output table and a Table Loader transformation, you might often need to verify or create the column mappings between the temporary output table and the permanent data structure (output table) attached to the Table Loader. Performance could be degraded for the job because of the overhead associated with the temporary output table and the Table Loader transformation. Therefore, you have the option of deleting the Table Loader transformation. For information about deleting the Table Loader transformation, see Manage Temporary and Permanent Tables for Transformations on page 239.
Setting Table Loader Transformation Options
Problem
You want to specify options that control how the Table Loader transformation updates the target.
Solution
You can use the settings available on the Load Technique tab in the properties window for the Table Loader transformation. Some of the settings available on the tab vary depending on which load style you use, although some settings appear for more than one load style.
Some options are set in the Advanced Load Options window.
Tasks
Set Options Based on Load Style
When you use the append to existing table load style, you can set the options listed in the following table.
Table 15.1 Append to Existing Table Load Style Options (Setting / Location / Description)
- Append to Existing / Load Style: Creates the output table if it does not exist; then appends input data rows to the output table.
- Append (Proc Append) / New Rows: Appends rows by using PROC APPEND. This is the default.
- Insert SQL / New Rows: Appends rows by using SQL Insert code.
When you use the replace table load style, you can set the options listed in the following table.
Table 15.2 Replace Table Load Style Options (Setting / Location / Description)
- Replace / Load Style: Specifies that the output table or rows in the output table are replaced.
- Entire table / Replace: Replaces the entire output table by deleting and recreating the table before adding new rows. This is the default for SAS data sets.
- All rows using delete / Replace: Replaces all rows in the output table by first deleting all rows before adding new rows. This is the default for DBMS tables that do not support truncation.
- All rows using truncate / Replace: Replaces all rows in the output table by using the truncate method of the DBMS before adding new rows. This is the default for DBMS tables that support truncation.
- Append (Proc Append) / New rows: Inserts new rows into the empty table by using PROC APPEND. This is the default.
- Insert (SQL) / New rows: Inserts new rows into the empty table by using SQL Insert code.
When you use the update/insert load style, you can set the options listed in the following table.
Table 15.3 Update/Insert Load Style Options (Setting / Location / Description)
- Update/Insert / Load Style: Specifies that existing rows are updated, new rows are inserted, or both.
- SQL Set / Matching rows: Specifies that PROC SQL is used to update output table rows with all matching input data.
Also enables the Match by Columns group box, which allows you to select columns.
- Modify by Column(s) / Matching rows: Specifies that the DATA STEP MODIFY BY method is used to update output table rows with all matching input data. Also enables the Match by Columns group box, which allows you to select columns.
- Modify Using Index / Matching rows: Specifies that the DATA STEP MODIFY with KEY= method will be used to update the output table rows with all matching input data. Also enables the Modify Using Index group box, the Index field, and the Return to the top of the index for duplicate values coming from the input data check box.
- Index / Modify Using Index group box: Enables you to select an index by using a drop-down menu.
- Return to the top of the index for duplicate values coming from the input data / Modify Using Index group box: When selected, processes duplicate values coming from the input data by returning the update or insert operation to the top of the index.
- Skip Matching Rows / Matching rows: Ignores any input rows that match rows in the output table (by selected match-columns or index). This prevents any existing rows in the output table from being modified.
- Blanks can replace non-blank values / Technique(s) group box: When selected, specifies that blanks can replace non-blank values during updates and insertions.
- Append (Proc Append) / New rows: Appends any unmatched rows to the output table. This is the default.
- Insert (SQL) / New rows: Uses SQL to insert unmatched rows into the output table. Available only when you select SQL Set or Skip Matching Rows in the Matching rows field.
- Modify by Column(s) / New rows: Adds any unmatched rows into the output table. Available only when you select Modify by Column(s) or Skip Matching Rows in the Matching rows field.
- Modify Using Index / New rows: Adds any unmatched rows into the output table.
Available only when you select Modify Using Index or Skip Matching Rows in the Matching rows field.
- Skip New Rows / New rows: Ignores any input rows that do not match rows in the output table (by selected match-columns or index).
- Available columns / Match by Column(s): Lists the columns that are available for matching.
- Column(s) to match / Match by Column(s): Specifies the columns to match during updates and insertions to the output table.
Set Constraint Condition and Index Condition Options
The settings in the Constraint Condition and Index Condition group boxes display only when the output table for the Table Loader transformation contains one or more constraints or indexes. The options available in the Constraint Condition group box are listed in the following table.
Table 15.4 Constraint Condition Options (Setting / Location / Description)
- On table creation / Before load: Creates constraints on the output table before the load, but only during the run when the table is first created. This is the default.
- As is (do nothing) or Leave Off / Before load: Does not generate SAS code related to adding or deleting constraints before the load. Alternatively worded as Leave Off if the selected load technique is Replace entire table.
- Off / Before load: Identifies and removes constraints from the pre-existing output table before the load. Not applicable when the load technique is Replace entire table.
- On refresh / Before load: Deletes obsolete constraints and recreates the constraints based on the current key definitions before the load.
Not applicable when the load technique is Replace entire table.
- As is (do nothing), Leave on, or Leave off / After load: Does not generate SAS code related to adding or deleting constraints after the load. Alternatively worded as Leave on if the before action adds constraints and as Leave off if the before action removes constraints. This is the default.
- Off / After load: Identifies and removes constraints after the load.
- On table creation / After load: Creates constraints on the output table after the load, but only during the run when the table is first created.
- On refresh or On / After load: Deletes obsolete constraints and recreates the constraints based on the current key definitions. Available as On/Refresh if the before action is As-is, and as On if the before action is Off. Not applicable when the load technique is Replace entire table.
The options available in the Index Condition group box are listed in the following table.
Table 15.5 Index Condition Options (Setting / Location / Description)
- On table creation / Before load: Creates indexes on the output table before the load, but only during the run when the table is first created. This is the default.
- As is (do nothing) or Leave Off / Before load: Does not generate SAS code related to adding or deleting indexes before the load. Alternatively worded as Leave Off if the selected load technique is Replace entire table.
- Off / Before load: Identifies and removes indexes from the pre-existing output table before the load. Not applicable when the load technique is Replace entire table.
- On refresh / Before load: Deletes obsolete indexes and recreates the indexes based on the current key definitions before the load.
Not applicable when the load technique is Replace entire table.
- As is (do nothing), Leave on, or Leave off / After load: Does not generate SAS code related to adding or deleting indexes after the load. Alternatively worded as Leave On if the before action adds indexes and as Leave Off if the before action removes indexes. This is the default.
- Off / After load: Identifies and removes indexes after the load.
- On table creation / After load: Creates indexes on the output table after the load, but only during the run when the table is first created.
- On refresh or On / After load: Deletes obsolete indexes and recreates the indexes based on the current key definitions. Available as On/Refresh if the before action is As-is, and as On if the before action is Off. Not applicable when the load technique is Replace entire table.
Selecting a Load Technique
Problem
You want to load data into a permanent physical table that is structured to match your data model. As the designer or builder of a process flow in SAS Data Integration Studio, you must identify the load style that your process requires. Then you can proceed to (1) append all of the source data to any previously loaded data, (2) replace all previously loaded data with the source data, or (3) use the source data to update and add to the previously loaded data based on specific key columns. When you know the load style that is required, you can select the techniques and options that will maximize the step's performance.
Solution
You can use the Table Loader transformation to perform any of the three load styles. The transformation generates the code that is required in order to load SAS data sets, database tables, and other types of tables, such as an Excel spreadsheet.
When you load tables, you can use the transformation to manage indexes and constraints on the table that is being loaded.
You select the load style in the Load style field on the Load Technique tab of the Table Loader transformation. After you have selected the load style, you can choose from a number of load techniques and options. Based on the load style that you select and the type of table that is being loaded, the choice of techniques and options will vary. The Table Loader transformation generates code to perform a combination of the following loading tasks:
- Remove all rows.
- Add new rows.
- Match and update rows.
The following sections describe the SAS code alternatives for each load task and provide tips for selecting the load technique (or techniques) that will perform best.
Tasks
Remove All Rows
This task is associated with the Replace load style. Based on the type of target table that is being loaded, two or three of the following selections are listed in the Replace field:
- Replace entire table: uses PROC DATASETS to delete the target table.
- Replace all rows using truncate: uses PROC SQL with TRUNCATE to remove all rows (only available for some databases).
- Replace all rows using delete: uses PROC SQL with DELETE * to remove all rows.
When you select Replace entire table, the table is removed and disk space is freed. Then the table is recreated with 0 rows. Consider this option unless your security requirements restrict table deletion permissions (a restriction that is commonly imposed by a database administrator on database tables). Also, avoid this method if the table has any indexes or constraints that SAS Data Integration Studio cannot recreate from metadata (for example, check constraints).
If available, consider using Replace all rows using truncate. Either of the remove-all-rows selections enables you to keep all indexes and constraints intact during the load. By design, using TRUNCATE is the quickest way to remove all rows.
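As a rough illustration, the remove-all-rows alternatives correspond to SAS code along the following lines. This is a hedged sketch: the DW library and SALES_FACT table are hypothetical, and the code that the transformation actually generates is more elaborate. Truncation is issued through SQL pass-through to a DBMS that supports it.

```sas
/* Replace entire table: delete the target; a later step recreates it */
proc datasets lib=dw nolist;
   delete sales_fact;
quit;

/* Replace all rows using delete (the "DELETE *" technique) */
proc sql;
   delete from dw.sales_fact;
quit;
```

Note that for a SAS table the PROC SQL DELETE performs only logical deletes, which is why the text cautions that the table's physical size can grow over repeated loads.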
The DELETE * syntax also removes all rows; however, based on the database and table settings, this choice can incur overhead that will degrade performance. Consult your database administrator or the database documentation for a comparison of the two techniques.
CAUTION: When DELETE * is used repeatedly to clear a SAS table, the size of that table should be monitored over time. DELETE * performs only logical deletes for SAS tables. Therefore, a table's physical size will grow, and the increased size can negatively affect performance.
Add New Rows
For this task, the Table Loader transformation provides two techniques for all three load styles: PROC APPEND with the FORCE option and PROC SQL with the INSERT statement. The two techniques handle discrepancies between source and target table structures differently.
PROC APPEND with the FORCE option is the default. If the source is a large table and the target is in a database that supports bulk-load, PROC APPEND can take advantage of the bulk-load feature. Consider bulk-loading the data into database tables with the optimized SAS/ACCESS engine bulk loaders. (We recommend that you use native SAS/ACCESS engine libraries instead of ODBC libraries or OLE DB libraries for relational database data. SAS/ACCESS engines have native access to the databases and have superior bulk-loading capabilities.)
PROC SQL with the INSERT statement performs well when the source table is small because you don't incur the overhead needed to set up bulk-loading. PROC SQL with INSERT adds one row at a time to the database.
Match and Update Rows
The Table Loader transformation provides three techniques for matching and updating rows in a table.
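A minimal sketch of the matching techniques that this section describes, assuming a source table WORK.TRANS, a target WORK.MASTER, a key column CUSTID, and a data column AMOUNT (all names are illustrative; the generated code includes additional error handling):

```sas
/* DATA step with MODIFY BY: both tables are matched by the key column */
data master;
   modify master trans;
   by custid;
run;

/* DATA step with MODIFY KEY=: requires an index named CUSTID on MASTER;
   matched rows are replaced, unmatched source rows are appended */
data master;
   set trans;
   modify master key=custid;
   if _iorc_ = 0 then replace;  /* key found: update the row   */
   else do;                     /* key not found: add the row  */
      _error_ = 0;
      output;
   end;
run;

/* PROC SQL with WHERE and SET: updates matching rows only */
proc sql;
   update master
      set amount = (select t.amount from trans t
                    where t.custid = master.custid)
      where master.custid in (select custid from trans);
quit;
```

As the text notes, only the two MODIFY forms can add unmatched rows in the same pass; the PROC SQL form needs a separate INSERT step for new rows.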
All of the following techniques are associated with the Update/Insert load style:
- DATA step with the MODIFY BY option
- DATA step with the MODIFY KEY= option
- PROC SQL with the WHERE and SET statements
For each of these techniques, you must select one or more columns or an index for matching. All three techniques update matching rows in the target table. The MODIFY BY and MODIFY KEY= options have the added benefit of being able to take unmatched records and add them to the target table during the same pass through the source table.
Of these three choices, the DATA step with the MODIFY KEY= option often outperforms the other update methods in tests conducted on loading SAS tables. An index is required. The MODIFY KEY= option can also perform adequately for database tables when indexes are used.
When the Table Loader uses PROC SQL with WHERE and SET statements to match and update rows, performance varies. When used in PROC SQL, neither of these statements requires data to be indexed or sorted, but indexing on the key columns can greatly improve performance. Both of these statements use WHERE processing to match each row of the source table with a row in the target table.
The update technique that you choose depends on the percentage of rows being updated. If the majority of target records are being updated, the DATA step with MERGE (or UPDATE) might perform better than the DATA step with MODIFY BY or MODIFY KEY= or PROC SQL, because MERGE makes full use of record buffers. Performance results can be hardware and operating environment dependent, so you should consider testing more than one technique.
Note: The general Table Loader transformation does not offer the DATA step with MERGE as a load technique.
However, you can revise the code for the MODIFY BY technique to do a merge and save that as user-written code for the transformation.
Removing Non-Essential Indexes and Constraints During a Load
Problem
You want to improve the performance of a job that includes a table that contains one or more non-essential indexes.
Solution
You can remove non-essential indexes before a load and recreate those indexes after the load. In some situations, this procedure improves performance. As a general rule, consider removing and recreating indexes if more than 10 percent of the data in the table will be reloaded.
You might also want to temporarily remove key constraints in order to improve performance. If there are significant numbers of transactions and the data that is being loaded conforms to the constraints, then removing the constraints from the target before a load removes the overhead that is associated with maintaining those constraints during the load.
To control the timing of index and constraint removal, use the options that are available on the Load Technique tab of the Table Loader transformation. The following settings are provided to enable you to specify the desired conditions for the constraints and indexes before and after the load:
- the Before Load field in the Constraint Condition group box
- the After Load field in the Constraint Condition group box
- the Before Load field in the Index Condition group box
- the After Load field in the Index Condition group box
The options that are available depend on the load technique that you choose. The choices translate to three different tasks: put on, take off, and leave as is. When you select Off for the Before Load options, the generated code checks for and removes any indexes (or constraints) that are found. Then, it loads the table. If an index is required for an update, that index is not removed, or it is added as needed.
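Conceptually, taking an index off before the load and putting it back on afterward resembles the following PROC DATASETS steps. This is a sketch only: the DW library, SALES_FACT table, and CUSTID index are hypothetical, and the generated code also handles constraints and error checking.

```sas
/* Before load: remove the non-essential index */
proc datasets lib=dw nolist;
   modify sales_fact;
   index delete custid;
quit;

/* ... the load step runs here, for example PROC APPEND ... */

/* After load: recreate the index */
proc datasets lib=dw nolist;
   modify sales_fact;
   index create custid;
quit;
```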
Select On for the After Load options to have indexes added after the load.
There might be times when you want to select Leave Off in the After Load field. Here is an example of a situation when you might want to leave the indexes off during and after the table loading for performance reasons. The table is updated multiple times in a series of load steps that appear in separate jobs. Indexes are defined on the table only to improve the performance of a query and reporting application that runs after the nightly load. None of the load steps need the indexes, and leaving the indexes on impedes the performance of the load. In this scenario, the indexes can be taken off before the first update and left off until after the final update.
Considering a Bulk Load
Problem
You want to load a table on a row-by-row basis, but you find that it takes too long to load.
Solution
You should consider using the optimized SAS/ACCESS engine bulk loaders to bulk load the data into database tables. Bulk-load options are set in the metadata for an RDBMS library. To set these options from the SAS Data Integration Studio Inventory tree, right-click an RDBMS library, then select Properties > Options > Advanced Options > Output. Select the check box that will enable the RDBMS bulk-load facility for the current library.
You can set additional bulk-load options for the tables in an RDBMS library. To set these options from the SAS Data Integration Studio Inventory tree, right-click an RDBMS table, then select Properties > Physical Storage > Table Options.
Specify the appropriate bulk-load option for the table.
Also, you should consider using native SAS/ACCESS engine libraries instead of ODBC libraries or OLE DB libraries for RDBMS data.
CHAPTER 16: Working with SAS Sorts
About SAS Sort Transformations 275
Setting Sort Options 276
Problem 276
Solution 276
Optimizing Sort Performance 277
Problem 277
Solution 278
Creating a Table That Sorts the Contents of a Source 279
Problem 279
Solution 280
Tasks 280
Create and Populate the Job 280
Specify How Information in the Target Is to Be Sorted and Run the Job 280
View the Output 281
About SAS Sort Transformations
The SAS Sort transformation provides a graphic interface for the functions available in PROC SORT. You can use the transformation to read data from a source, sort it, and write the sorted data to a target in a SAS Data Integration Studio job. Sorting occurs implicitly with index creation, ORDER BY clauses, SAS SQL joins, and procedure execution that requires ordering. For any caller, the underlying sort engine is the same. Sort callers include, but are not limited to, the DATA step, PROC SORT, PROC SQL, and PROC SUMMARY.
The properties window for the SAS Sort transformation contains tabs that enable you to select the columns that you sort by and to set options for the sort, as described in Setting Sort Options on page 276. You can also optimize sort performance, as described in Optimizing Sort Performance on page 277. For an example of how you can use a SAS Sort transformation, see Creating a Table That Sorts the Contents of a Source on page 279.
Setting Sort Options
Problem
You want to set options for a table sort. The sort is performed with the SAS Sort transformation in a SAS Data Integration Studio job.
Solution
You can access the Options tab in the properties window of the SAS Sort transformation.
The available options are listed in the following table.
Table 16.1 SAS Sort Options (Option / Description)
- Create SYSLAST Macro Variable: Specifies whether SAS Data Integration Studio generates a SYSLAST macro statement at the end of the current transformation. Accept the default value of YES when the current transformation creates an output table that should be the input of the next transformation in the process flow. Otherwise, select NO.
- Equals: Maintains relative order within BY groups. The available values are NOEQUALS and EQUALS (default).
- Force: Overrides the default sort behavior by forcing data to be sorted, even if the information that is stored with the table indicates that the data is already in sorted order. The available values are FORCE and no FORCE (default).
- Tagsort: Reduces temporary disk usage. The available values are TAGSORT and no TAGSORT (default).
- Duplicates: Deletes duplicate observations. The available values are blank/no value (default), NODUPKEY, and NODUPREC.
- Sortseq: Specifies the collating sequence. The available values are ASCII, DANISH (alias NORWEGIAN), EBCDIC, FINNISH, ITALIAN, NATIONAL, REVERSE, SPANISH, SWEDISH, and user-supplied.
- Sortsize: Specifies the maximum amount of memory that is available to PROC SORT. The available values are MAX, MIN, n, nK, nM, and nG.
- System Options: Specifies one or more SAS System options on the OPTIONS statement. For details, see SAS Language Reference: Dictionary.
- PROC SORT Options: Specifies one or more options in the SORT procedure.
Optimizing Sort Performance
Problem
You want to sort the data in your source tables before running a job. Sorting is a common and resource-intensive component of SAS Data Integration Studio. Sorts occur explicitly as PROC SORT steps and implicitly in other operations such as joins. Effective sorting requires a detailed analysis of performance and resource usage.
Sorting large SAS tables requires large SORT procedure utility files.
When SAS Data Integration Studio is running multiple SAS jobs simultaneously, multiple SORT procedure utility files can be active. For these reasons, tuning sort performance and understanding sort disk space consumption are critical.
Solution
You can enhance sort performance with the techniques listed in the following table. For more information, see the ETL Performance Tuning Tips white paper that is available from http://support.sas.com/documentation/whitepaper/technical/.
Table 16.2 Sort Performance Enhancement Techniques (Technique / Notes)
- Use the improved SAS9 sort algorithm: SAS9 includes a rewritten SORT algorithm that incorporates threading and data latency reduction algorithms. The SAS9 sort uses multiple threads and outperforms a SAS 8 sort in almost all circumstances.
- Minimize data: Minimize row width, drop unnecessary columns, and minimize pad bytes.
- Direct sort utility files to fast storage devices: Use the WORK invocation option, the UTILLOC invocation option, or both options to direct SORT procedure utility files to fast, less-utilized storage devices. Some procedure utility files are accessed heavily, and separating them from other active files might improve performance.
- Distribute sort utility files across multiple devices: Distribute SORT procedure utility files across multiple fast, less-utilized devices. Direct the SORT procedure utility file of each job to a different device. Use the WORK invocation option, the UTILLOC invocation option, or both options.
- Pre-sort explicitly on the most common sort key: SAS Data Integration Studio might arrange a table in sort order, one or multiple times. For large tables in which sort order is required multiple times, look for a common sort order. Use the MSGLEVEL=I option to expose information that is in the SAS log to determine where sorts occur.
- Change the default SORTSIZE value: For large tables, set SORTSIZE to 256 MB or 512 MB.
For extremely large tables (a billion or more wide rows), set SORTSIZE to 1 GB or higher. Tune these recommended values further based on empirical testing or based on in-depth knowledge of your hardware and operating system.
- Change the default MEMSIZE value: Set MEMSIZE at least 50% larger than SORTSIZE.
- Set the NOSORTEQUALS system option: In an ETL process flow, maintaining relative row order is rarely a requirement. If maintaining the relative order of rows with identical key values is not important, set the system option NOSORTEQUALS to save resources.
- Set the UBUFNO option to the maximum of 20: The UBUFNO option specifies the number of utility I/O buffers. In some cases, maximizing UBUFNO increases sort performance up to 10%. Increasing UBUFNO has no negative ramifications.
- Use the TAGSORT option for nearly sorted data: TAGSORT is an alternative SAS 8 sort algorithm that is useful for data that is almost in sort order. The option is most effective when the sort-key width is no more than 5 percent of the total uncompressed column width. Using the TAGSORT option on a large unsorted data set results in extremely long sort times compared to a SAS9 sort that uses multiple threads.
- Use relational database sort engines to pre-sort tables without data order issues: Pre-sorting in relational databases might outperform sorting based on SAS. Use options of the SAS Data Integration Studio Extract transformation to generate an ORDER BY clause in the SAS SQL.
The ORDER BY clause asks the relational database to return the rows in that particular sorted order.
- Determine disk space requirements to complete a sort: Size the following sort data components: input data, SORT procedure utility files, and output data.
- Size input data: Because sorting is so I/O intensive, it is important to start with only the rows and columns that are needed for the sort. The SORT procedure WORK files and the output file depend on the input file size.
- Size SORT procedure utility files: Consider a number of factors to size the SORT procedure utility files: the sizing information of the input data; any pad bytes added to character columns; any pad bytes added to short numeric columns; pad bytes that align each row by 8 bytes (for SAS data sets); 8 bytes per row of overhead for EQUALS processing; per-page unused space in the SORT procedure utility files; and, for a multi-pass merge, a doubling of the SORT procedure utility files (or sort failure).
- Size of output data: To size the output data, apply the sizing rules of the destination data store to the columns that are produced by the sort.
Creating a Table That Sorts the Contents of a Source
Problem
You want to create a job that reads data from a source, sorts it, and writes the sorted data to a target.
Solution
You can create a job that uses a SAS Sort transformation to sort the data in a source table and write it to a target table.
Tasks
Create and Populate the Job
Perform the following steps to create and populate a new job:
1 Create a new job.
2 Select and drag the SAS Sort transformation from the Data Transforms folder in the Process Library tree into the empty job in the Process Designer window.
3 Drop the source table on the source drop zone for the SAS Sort transformation.
4 Drop the target table on the target drop zone for the SAS Sort transformation.
5 Delete the Table Loader and the temporary worktable from the job.
The job resembles the sample shown in the following display.

Display 16.1 Sample SAS Sort Process Flow Diagram

Specify How Information in the Target Is to Be Sorted and Run the Job
Perform the following steps to specify how information in the target table is to be sorted:
1 Open the Sort By Columns tab of the properties window for the SAS Sort transformation.
2 Select the first variable for the new sort from the list in the Columns field. Then, move the variable to the Sort by columns field. Then, specify the sort direction for the variable with the drop-down menu in the Sort Order column.
3 Move the other variables that you want to sort by to the Sort by columns field. Then, set the sort direction for each. The following display depicts the completed Sort By Columns tab for the sample sort job.

Display 16.2 Completed SAS Sort Tab for Sample Job

4 Save the selection criteria for the target and close the properties window.
5 Run the job. If you are prompted to do so, enter a user ID and password for the default SAS Application Server that generates and runs SAS code for the job. The server executes the SAS code for the job.
6 If the job completes without error, go to the next section. If error messages appear, read and respond to the messages.

View the Output
You can verify that the job created the desired output by reviewing the View Data window. The View Data window for the sample job is shown in the following display.

Display 16.3 Data in Sample Sorted Table

You can review the View Data window to ensure that the data from the source table was properly sorted.
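Behind the scenes, the code generated for a SAS Sort transformation amounts to a PROC SORT step. A minimal hand-written equivalent of the sample job might look like the following sketch; the library and table names are assumptions, not the names that the code generator produces:

```sas
/* Sketch of the sort step that the SAS Sort transformation generates.
   SRCLIB, TGTLIB, and the table names are hypothetical. */
proc sort data=srclib.students out=tgtlib.students_sorted;
   by name sex;   /* the columns chosen on the Sort By Columns tab */
run;
```

Each entry in the Sort by columns field becomes a variable in the BY statement, and a Descending sort direction becomes the DESCENDING keyword before that variable.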
Note that the Name and Sex columns in the sample target table are sorted, but the other columns remained unsorted.

Chapter 17: Working with the SQL Join Transformation

Contents
About SQL Join Transformations 285
Using the SQL Designer Tab 286
  Problem 286 / Solution 286 / Tasks 286 (Using Components on the Designer Tab 286) / Additional Information 287
Reviewing and Modifying Clauses, Joins, and Tables in an SQL Query 287
  Problem 287 / Solution 287 / Tasks 287 (Review Clauses, Joins, and Tables 287; Modify Properties of Clauses and Tables 288)
Understanding Automatic Joins 289
  A Sample Auto-Join Process 290
Selecting the Join Type 292
  Problem 292 / Solution 292 / Tasks 292 (Change Join Types in a Sample SQL Query 292)
Adding User-Written SQL Code 295
  Problem 295 / Solution 295 / Additional Information 296
Debugging an SQL Query 296
  Problem 296 / Solution 296 / Tasks 296 (Set the Debug Property 296; Examine Some Sample Method Traces 296)
Adding a Column to the Target Table 297
  Problem 297 / Solution 297 / Tasks 297 (Add a Column with the Columns Tab for the Target Table 297)
Adding a Join to an SQL Query in the Designer Tab 298
  Problem 298 / Solution 298 / Tasks 298 (Add a Join to the Create Tab 298)
Creating a Simple SQL Query 299
  Problem 299 / Solution 299 / Tasks 299 (Create and Populate the Job 299; Create the SQL Query 300) / Additional Information 301
Configuring a SELECT Clause 301
  Problem 301 / Solution 302 / Tasks 302 (Configure the SELECT Clause with the Select Tab 302)
Adding a CASE Expression 303
  Problem 303 / Solution 304 / Tasks 304 (Add a CASE Expression to an SQL Query in the Designer Tab 304)
Creating or Configuring a WHERE Clause 305
  Problem 305 / Solution 306 / Tasks 306 (Configure the WHERE Clause with the Where Tab 306)
Adding a GROUP BY Clause and a HAVING Clause 307
  Problem 307 / Solution 307 / Tasks 308 (Add a GROUP BY Clause to an SQL Query in the Designer Tab 308; Add a HAVING Clause to an SQL Query on the Designer Tab 309)
Adding an ORDER BY Clause 310
  Problem 310 / Solution 310 / Tasks 310 (Add an ORDER BY Clause to an SQL Query in the Designer Tab 310)
Adding Subqueries
311
  Problem 311 / Solution 311 / Tasks 312 (Add a Subquery to an Input Table 312; Add a Subquery to an SQL Clause 314)
Submitting an SQL Query 315
  Problem 315 / Solution 315 / Tasks 316 (Submit a Query from the Designer Tab of the SQL Join Transformation 316; Submit a Query as a Part of a SAS Data Integration Studio Job 316)
Joining a Table to Itself 316
  Problem 316 / Solution 317 / Tasks 317 (Join the Table to Itself 317)
Using Parameters with an SQL Join 318
  Problem 318 / Solution 318
Constructing a SAS Scalable Performance Data Server Star Join 319
  Problem 319 / Solution 319 / Tasks 319 (Construct an SPD Server Star Join 319) / Additional Information 320
Optimizing SQL Processing Performance 320
  Problem 320 / Solution 321
Performing General Data Optimization 321
  Problem 321 / Solution 321 / Tasks 322 (Minimize Input/Output (I/O) Processing 322; Pre-Sort Data 322)
Influencing the Join Algorithm 322
  Problem 322 / Solution 322 / Tasks 323 (Sort-Merge Joins 323; Index Joins 323; Hash Joins 323)
Setting the Implicit Property for a Join 324
  Problem 324 / Solution 324
Enabling Pass-Through Processing 325
  Problem 325 / Solution 325 / Tasks 326 (Explicit Pass-Through Processing 326; Implicit Pass-Through Processing 326)
Using Property Sheet Options to Optimize SQL Processing Performance 327
  Problem 327 / Solution 327 / Tasks 328 (Bulk Load Tables 328; Optimize the SELECT Statement 328; Set Buffering Options 328; Use Threaded Reads 329; Write User-Written Code 329)

About SQL Join Transformations

The SQL Join transformation enables you to create SQL queries that run in the context of SAS Data Integration Studio jobs. The transformation features a graphical interface that provides a consistent and intuitive setting for building the statements and clauses that constitute queries. The transformation supports the PROC SQL syntax of Create table/view as and accommodates up to 256 tables in a single query. The Select statement now supports joining the table to itself.
It also supports subqueries; the CASE expression; and WHERE, GROUP BY, HAVING, and ORDER BY clauses. Finally, the current version of the SQL Join transformation inserts a comment into the generated code for queries that states that the code was generated by version 2 of the transformation. This comment distinguishes the code from the code that is generated by the earlier version of the SQL Join transformation.

The process of building the SQL query is performed on the Designer tab. Use the Designer tab to create, edit, and review an SQL query. The tab contains sections that are designed to simplify creating the SQL query and configuring its parts.

Using the SQL Designer Tab

Problem
You want to create SQL queries that you can use in SAS Data Integration Studio jobs. You want to build these queries in a graphical interface that enables you to drag and drop components onto a visual representation of a query. After a component is added to the query, you need the ability to open and configure it.

Solution
Use the Designer tab of the properties window for the SQL transformation to create, edit, and review an SQL query.
The tab contains sections that are designed to simplify creating the SQL query and configuring its parts.

Tasks

Using Components on the Designer Tab
The Designer tab enables you to perform the tasks listed in the following table:

Table 17.1 Designer Tab Tasks

If you want to: Select and manipulate an object that displays in the Create tab.
Use the: Navigate pane.
To access: Click the object that you need to access.

If you want to: Add SQL clauses to the flow shown on the Create/Subquery tab.
Use the: SQL Clauses pane.
To access: Double-click the clause or drop it on the Create tab.

If you want to: Review the list of columns in the source table and the target table. Note that you can specify alphabetic display of the columns by selecting Display columns in alphabetical order.
Use the: Tables pane.
To access: Click Select, Where, Having, Group by, or Order by in the SQL Clauses pane.

If you want to: Display and update the main properties of an object that is selected on the Create tab. The title of this pane changes to match the object selected in the Navigate pane.
Use the: Properties pane.
To access: Click an object on the Create tab.

If you want to: Create SQL statements, configure the clauses contained in the statement, and edit the source table to target table mappings. The name of this component changes as you click different statements and clauses in the Navigate pane.
Use the: Create tab.
To access: Click Create in the Navigate pane.

If you want to: View the SAS code generated for the query.
Use the: Source tab.
To access: Click Source at the bottom of the Create tab.

If you want to: View the log of a SAS program, such as the code that is executed or validated for the SQL query.
Use the: Log tab.
To access: Click Log at the bottom of the Create tab.

Additional Information
For more information about using the Navigate pane, see the 'Navigate Pane' topic in SAS Data Integration Studio help. For information about using the Create Query/Subquery tab, see the 'Create/Subquery Tab' topic. For information about other Designer tab components, see the 'Where/Having/Join Tab' topic, the 'Group by Tab' topic, the 'Order By Tab' topic, and the 'Select Tab' topic.
For information about the Source and Log tabs, click Help when the component is displayed.

Reviewing and Modifying Clauses, Joins, and Tables in an SQL Query

Problem
You want to view a clause, join, or table in an SQL query or modify its properties.

Solution
Use the Navigate and properties panes on the Designer tab of the properties window for the SQL transformation. If you click an SQL clause or a join, its SQL code is highlighted in the Source pane. If you change a property for a clause, join, or table in its properties pane, the change is also displayed in the Source pane.

Tasks

Review Clauses, Joins, and Tables
When you click an item in the Navigate pane, the Designer tab responds in the following ways:
- The properties pane for the clause, join, or table is displayed.
- The appropriate tab for the clause or join is displayed on the right side of the Designer tab. When you click a table, the currently displayed tab continues to be shown.
- The appropriate SQL code is highlighted on the Source tab when you click a clause or a join. The Source tab is unaffected when you click a table.
- If you click SQL Join, Create, or From in the Navigate pane, the SQL Clauses pane is displayed.
- If you click Select, Where, or one of the joins in the Navigate pane, the Tables pane is displayed.

The following display shows the Designer tab for a sample job.

Display 17.1 Information About a Select Clause on a Designer Tab

Note that Select is highlighted in the Navigate pane, and the SQL code for the SELECT clause is highlighted on the Source tab. Also note that the Select tab, the Tables pane, and the Select Properties pane are displayed.

Modify Properties of Clauses and Tables
You can use the properties pane that is displayed when you click an object in the Navigate pane as if it were the Options tab in the stand-alone properties window for the object. For example, if you enter text in the Description field in the Select Properties pane, a comment is added to the SELECT clause on the Source tab.
See the following display for a sample view of this behavior.

Display 17.2 Using the Description Field to Comment a Select Clause

Note that text entered in the Description field in the Select Properties pane is also displayed next to the Select object in the Navigate pane and immediately before the SQL code on the Source tab. If you were to delete the text from the Description field, it would also be removed from the Navigate pane and the Source tab. You can make similar modifications to any field in a properties pane for any object, unless the field is dimmed. Dimmed fields are read-only.

Understanding Automatic Joins

The automatic join (auto-join) process determines the initial relationships and conditions for a query formulated in the SQL Join transformation. You can understand how these relationships and conditions are established. You can also examine how drop order, key relationships, and indexes are used in the auto-join process.

The process for determining the join relationships is based on the order of the tables added to the SQL transformation as input. When more than one table is dropped on the SQL transformation, a best guess is made about the join relationships between the tables. The join order is determined by taking the first table dropped and making it the left side of the join. Then, the next table dropped becomes the right side. If more than two tables are dropped, the next join is added so that the existing join is placed on the left side and the next table is placed on the right. This process continues until no more source tables are found. The default join type is an inner join.

As each join is created and has its left and right sides added, a matching process is run to determine the best relationships for the join. The process evaluates the join tables from the left side to the right side.
For example, if a join is connected on the left, the process follows that left-side join until it reaches all of the tables that are connected to the join. This includes any joins that are connected to it.

The auto-join process is geared toward finding the best relationships based on the known relationships documented in the physical tables when each of the tables being joined contains key constraints, indexes, or both. Therefore, the process is most likely to find the correct relationships when the primary and foreign key relationships are defined between the tables that are being joined. The auto-join process can still find the correct relationships using indexes alone, but an index-only match can occur only when there are columns that are matched between the two tables in the join.

The key-matching process proceeds as follows:
1 Each of the left-side table's unique keys is evaluated to find any existing associated foreign keys in any table on the right side of the join. If no associations are found, the left-side table's foreign keys are checked to see whether a relationship is found to a unique key in a table on the right side of the join. If a match is found, both tables are removed from the search.
2 If tables are still available on both the left and right sides, the table indexes are searched. The left side is searched first. If an index is found, then the index columns are matched to any column in the tables on the right. As matches are found, both tables are removed from the search. The right side is searched if tables are still available on both the right and left sides.
3 If tables are still available on both the left and right sides, the left-side table's columns are matched to the right side by name and type. If the type is numeric, the lengths must match. As a match is found, both tables are removed from the search.

A Sample Auto-Join Process

This is an abstract explanation. Therefore, it is useful to illustrate it with a specific example.
Suppose that you add the following tables as input to the SQL Join transformation, in the following order:
- CUSTOMERS, with the following constraint defined: Primary key: CUSTOMER_ID
- INVOICE, with the following constraints defined: Primary key: INVOICE_NUMBER; Foreign key: CUSTOMER_ID; Foreign key: PRODUCT_NUMBER
- PRODUCTS, with the following constraint defined: Primary key: PRODUCT_NUMBER
- INVENTORY, with the following constraint defined: Index: PRODUCT_NUMBER

After the auto-join process is run for this source data, the process flow depicted in the following display is shown on the Create tab in the Designer tab in the properties window for the SQL Join transformation.

Display 17.3 Sample Process Flow for an Auto-Join Process

This process flow is resolved to the following order: CUSTOMERS, INVOICE, PRODUCTS, and INVENTORY. As each join is created and has its left and right sides, a matching process is used to determine the best relationships for the join. The process evaluates the join tables from the left side to the right side. For example, if a join is connected on the left, it follows that left-side join until it gets all of the tables that are connected to the join. The matching process uses the following criteria to determine a good match. Note that the tables are removed from the search process as the relationships are found.

The first join is created with the left table of CUSTOMERS and the right table of INVOICE. Going through the join relationship process, the key relationship on CUSTOMER_ID is found between the two tables. Both tables are removed from the search and the matching process is finished.

The next join is created with the search results of the CUSTOMERS and INVOICE tables as the new left table and PRODUCTS as the right table. A key relationship between INVOICE and PRODUCTS on the column PRODUCT_NUMBER is found, and an expression is created.
Both tables are removed from the search and the matching process is finished.

The last join is created with the search results of the CUSTOMERS, INVOICE, and PRODUCTS tables as the new left table and INVENTORY as the right table. No key relationships are found, so the indexes are searched. A match is found between PRODUCTS and INVENTORY on the column PRODUCT_NUMBER. Both tables are then removed from the search and the matching process is finished.

The relationship is initialized as follows:

CUSTOMERS.CUSTOMER_ID = INVOICE.CUSTOMER_ID and
INVOICE.PRODUCT_NUMBER = PRODUCTS.PRODUCT_NUMBER and
PRODUCTS.PRODUCT_NUMBER = INVENTORY.PRODUCT_NUMBER

Selecting the Join Type

Problem
You want to select a specific type for a join in an SQL query. You can use the join type selection to gain precise control over the data included in the results of the query.

Solution
Right-click an existing join in an SQL query, and click the appropriate join type in the pop-up menu.

Tasks

Change Join Types in a Sample SQL Query
Examine a sample SQL query in a SAS Data Integration Studio job to see the effects of changing the join types used in the query. The sample query contains the tables and columns listed in the following table:

Table 17.2 Sample Query Data
Source Table 1: POSTALCODES
Source Table 2: UNITEDSTATES
Target Table: State_Data (Name, Code, Capital, Population, Area, Continent, Statehood)

The join condition for the query is POSTALCODES.Name = UNITEDSTATES.Name. The query is depicted in the following display.

Display 17.4 Sample SQL Query in a SAS Data Integration Studio Job

Notice that the query contains an inner join and a WHERE statement. These components are included by default when a query is first created.
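For the sample tables, the default query corresponds to PROC SQL code along the following lines; the join criterion initially lives in the WHERE clause because the default inner join is implicit. (The library names, aliases, and the assignment of individual columns to the two source tables are assumptions for illustration.)

```sas
/* Sketch of the default implicit inner join for the sample query.
   SRCLIB, TGTLIB, and the column-to-table assignment are hypothetical. */
proc sql;
   create table tgtlib.State_Data as
   select pc.Name, pc.Code,
          us.Capital, us.Population, us.Area, us.Continent, us.Statehood
   from srclib.POSTALCODES as pc, srclib.UNITEDSTATES as us
   where pc.Name = us.Name;   /* join condition held in the WHERE clause */
quit;
```

Selecting an explicit type such as a left join rewrites the FROM clause instead, for example: from srclib.POSTALCODES as pc left join srclib.UNITEDSTATES as us on pc.Name = us.Name.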
The following table illustrates how the query is affected when you run through all of the available join types in succession:

Table 17.3 Results by Join Type

Inner (implicit): Combines and displays only the rows from the first table that match rows from the second table, based on the matching criteria that are specified in the WHERE clause. Data included in results: 50 rows (50 matches on the Name column; 0 non-matches).

Full (explicit): Retrieves both the matching rows and the non-matching rows from both tables. Data included in results: 59 rows (50 matches on the Name column; 8 non-matches from POSTALCODES, the left table; 1 non-match from UNITEDSTATES, the right table).

Left (explicit): Retrieves both the matching rows and the non-matching rows from the left table. Data included in results: 58 rows (50 matches on the Name column; 8 non-matches from POSTALCODES, the left table).

Right (explicit): Retrieves both the matching rows and the non-matching rows from the right table. Data included in results: 51 rows (50 matches on the Name column; 1 non-match from UNITEDSTATES, the right table).

Cross (explicit): Combines each row in the first table with every row in the second table (creating a Cartesian product of the tables). Data included in results: 2958 rows.

Union (explicit): Selects unique rows from both tables together and overlays the columns. PROC SQL first concatenates and sorts the rows from the two tables, and then eliminates any duplicate rows. See the following display for the results of a sample union join. Data included in results: 109 rows (58 rows from POSTALCODES, the left table; 51 rows from UNITEDSTATES, the right table).

A section of the View Data window for a sample query that includes a union join is depicted in the following display.

Display 17.5 Sample Section from a View of a Union Join

Rows 45 to 51 come from the POSTALCODES table. Rows 52 to 59 come from the UNITEDSTATES table.

These joins are contained in the FROM clause in the SELECT statement, which comes earlier in an SQL query than a WHERE statement.
You can often create more efficient query performance by using the proper join type in a SELECT statement than you can by setting conditions in a WHERE statement that comes later in the query.

Adding User-Written SQL Code

Problem
You want to add user-written code to an SQL query that is used in a SAS Data Integration Studio job. This user-written code can consist of SQL code that is added to a WHERE, HAVING, or JOIN clause. It can also overwrite the entire DATA step for the SQL Join transformation.

Solution
You can add SQL code to an SQL WHERE, HAVING, or JOIN clause in the properties window for the clause. To set the user-written property for a clause, click the clause in the SQL Clauses pane on the Designer tab for the SQL Join transformation. Then, select Yes in the User Written field and enter the code in the SQL field on the clause's tab. The following display shows sample user-written code added to a WHERE clause.

Display 17.6 Sample User-Written SQL Code

Note that the following line of SQL code was added to the SQL field on the Where tab:

and us.'Population'n < 5000000

This code is also highlighted on the Source tab.

Additional Information
For information about how to overwrite the entire DATA step for the SQL Join transformation, see Chapter 12, "Working with User-Written Code," on page 215.

Debugging an SQL Query

Problem
You want to determine which join algorithm is selected for an SQL query by the SAS SQL Optimizer.
You also need to know how long it takes to run the job that contains the SQL Join transformation.

Solution
You can enable debugging for the query by setting the Debug property in the SQL Properties pane.

Tasks

Set the Debug Property
The Debug property in the SQL Properties pane enables the following debugging option:

options sastrace=',,,sd' sastraceloc=saslog nostsuffix fullstimer;

You can use this option to determine which join algorithms are used in the query and to get timing data for the SAS job.

You can use the keywords from the trace output that are listed in the following table to determine which join algorithm was used:

Table 17.4 Debugging Keywords and Join Algorithms
Keyword   Join Algorithm
sqxsort   sort step
sqxjm     sort-merge join
sqxjndx   index join
sqxjhsh   hash join
sqxrc     table name

Examine Some Sample Method Traces
The following sample fragments illustrate how these keywords appear in a _method trace.

In the first example, each data set is sorted and sort-merge is used for the join:

sqxjm
sqxsort
sqxsrc( WORK.JOIN_DATA2 )
sqxsort
sqxsrc( LOCAL.MYDATA )

In the next example, an index nested loop is used for the join:

sqxjndx
sqxsrc( WORK.JOIN_DATA2 )
sqxsrc( LOCAL.MYDATA )

In the final example, a hash is used for the join:

sqxjhsh
sqxsrc( LOCAL.MYDATA )
sqxsrc( WORK.JOIN_DATA1 )

Adding a Column to the Target Table

Problem
You want to add a column to the target table for an SQL query that is used in a SAS Data Integration Studio job.

Solution
You can use the Columns tab on the properties window for the target table to add a column to the target table. (You can also add a column on the Select tab. To do this, right-click in the Target table field and click New Column in the pop-up menu.)

Tasks

Add a Column with the Columns Tab for the Target Table
Perform the following steps to add a column to the target table:
1 Right-click the target table in the Navigation pane.
Then, open the Columns tab in its properties window.
2 Click New to add a row to the list of columns.
3 Enter the column name in the Column field of the new row.
4 Click the drop-down menu in the Type field. Then, click either Character or Numeric.
5 Review the other columns in the new row to ensure that they contain appropriate values. Make any needed changes.
6 Click OK to save the new column and close the properties window.

Adding a Join to an SQL Query in the Designer Tab

Problem
You want to add a join to an SQL query that is used in a SAS Data Integration Studio job. Then you can connect an additional source table, join, or subquery for the query to the join.

Solution
You can drop the join on the Create tab in the Designer tab. The Designer tab is located in the properties window for the SQL Join transformation. This join enables you to add a new drop zone to the query flow in the Create pane.

Tasks

Add a Join to the Create Tab
Perform the following steps to add a join to the Create tab:
1 Select one of the join objects in the Joins folder in the SQL Clauses pane, and drop it on the Select object in a query flow displayed on the Create tab. The join object and its drop zone are added to the query flow. (You can also double-click the join object to insert it between the existing join and the SELECT clause.)
2 Drop an appropriate table, join, or SubQuery object on the drop zone. The SQL code for the join is shown on the Source tab.

The join and its drop zone are displayed in the query flow, as shown in the following display.

Display 17.7 Add a Join and Drop Zone

Note: You can add the source and target tables directly to the process flow diagram for the job. You can also add a table, join, or subquery to a job by dragging and dropping it on an empty source or target drop zone on the Create tab on the Designer tab of the properties window for the SQL Join transformation.
If you drop a table on an existing table on the Designer tab, the new table will replace the existing table.

Creating a Simple SQL Query

Problem
You want to add a simple SQL query to a SAS Data Integration Studio job.

Solution
Use the SQL Join transformation. The SQL Join transformation enables you to create SQL queries that run in the context of SAS jobs. The transformation features a graphical interface that enables you to build the statements and clauses that constitute queries. This example describes how to use the transformation to create a job that uses an SQL query to select data from two SAS tables. The data is merged into a target table.

Tasks

Create and Populate the Job
Perform the following steps to create and populate the job:
1 Create an empty job.
2 Drop the SQL Join transformation from the Data Transformations folder in the Process Library tree onto the empty job.
3 Drop the first source table on the input drop zone for the SQL Join transformation.
4 Drop any additional source tables on the SQL Join transformation.
5 Drop the target table on the target drop zone.
6 Delete the Table Loader transformation and the temporary worktable SQL Target from the job. If you keep the Table Loader and the worktable, you must configure two sets of mappings: one from the source tables to the worktable and another from the worktable to the target table. The extra processing required could degrade performance when the job is run. In addition, the Table Loader step should be deleted if you use pass-through processing when your target table is a DBMS table and your DBMS engine supports the Create as Select syntax.

Note: If you use work tables as the source for your query, you can use the Run Prior Steps on Submit option to control whether the steps that are used to create the source tables for the SQL query are run for a given query submission.
When you select the option, the steps that are placed before the SQL query code are run each time that the query is submitted. These steps are not needed when you want to check only the results of the query. After that first run, you can deselect the option to improve your performance while you develop and test the query within the Designer tab. However, you must run the query at least once to generate the source tables.

The following display shows a sample SQL job.

Display 17.8 Sample SQL Job

Now you can create the SQL query that populates the target table.

Create the SQL Query
Perform the following steps to create the SQL query that populates the target table:
1 Open the properties window for the SQL Join transformation.
2 Click Designer to access the Designer tab.
3 Click SQL Join in the Navigate pane. The right-hand side of the Designer tab contains a Navigate pane, an SQL Clauses/Tables pane, and a properties pane. You might need to resize the horizontal borders of the panes to see all three of them. For more information, see "Using the SQL Designer Tab" on page 286. You can enter options that affect the entire query. Note that the SQL Join Properties pane displays at the bottom of the tab. For example, you can limit the number of observations output from the job in the Max Output Rows field.
4 Click Create in the Navigate pane to display an initial view of the query on the Create tab. Note that the sample query already contains an INNER join, a SELECT clause, and a WHERE clause. These elements are created when you drop source tables on the transformation template. The joins shown in the query process flow are not necessarily joined in the order in which the SQL optimizer will actually join the tables. However, they do reflect the SQL syntax. The Show Columns option for the query flow diagram on the Create tab is also turned on.
This option provides scrollable lists of the columns included in each of the source and target tables for the job, but the time that it takes to generate the column lists can degrade performance. If you need to improve performance, you can right-click an empty area of the flow diagram and deselect Show Columns to turn the option off. You can also click the tables included in the query and set an alias in the properties pane for each. These aliases help simplify the SQL code generated in the query. The Designer tab is shown in the following display.

Display 17.9 Sample Designer Tab

Note that the query is shown in the Navigate pane, complete with the aliases that were set for the source tables. The process flow for the query is displayed on the Create tab. Finally, you can review the code for the query in the SQL Join properties pane. You can see the SQL code for the query on the Source tab.

Additional Information
For detailed information about specific usage issues with the SQL Join transformation, see the 'SAS Data Integration Studio Usage Notes' topic in SAS Data Integration Studio help.

Configuring a SELECT Clause

Problem
You want to configure the SELECT clause for an SQL query that is used in a SAS Data Integration Studio job. This clause defines which columns will be read from the source tables and which columns will be saved in the query result tables. You must review the automappings for the query, and you might need to create one or more derived expressions for the query.

Solution
You need to use the Select tab on the Designer tab of the properties window for the SQL Join transformation.

Tasks

Configure the SELECT Clause with the Select Tab
Perform the following steps to configure the SELECT clause for the SQL query:
1 Click Select in the Navigate pane to access the Select tab.
2 Review the automappings to ensure that the columns in the source table are mapped to corresponding columns in the target table.
If some columns are not mapped, right-click in an empty area of the Select tab and click Quick Map in the pop-up menu.
3 Perform the following steps if you need to create a derived expression for a column in the target table for the sample query:
- Click the drop-down menu in the Expression field, and click Advanced. The Expression Builder window displays.
- Enter the expression that you need to create into the Expression Text field. (You can use the Data Sources tab to drill down to the column names.) For information about the Expression Builder window, see the 'Expression Builder' topic in SAS Data Integration Studio help. Click OK to close the window.
- Click Yes when the Warning window prompts you to update your mappings to take account of the expression that you just created.
- Review the data in the row that contains the derived expression. Ensure that the column formats are appropriate for the data that will be generated by the expression. Change the formats as necessary.
4 Click Apply to save the SELECT clause settings to the query. The following display depicts a sample Select tab.

Display 17.10 Sample Select Tab Settings

Note that the SQL code for the SELECT clause is displayed on the Source tab.

Adding a CASE Expression

Problem
You want to create a CASE expression to incorporate conditional processing into an SQL query contained in a SAS Data Integration Studio job. The CASE expression can be added to the following parts of a query:
- a SELECT statement
- a WHERE condition
- a HAVING condition
- a JOIN condition

Solution
You can use the CASE Expression window to add a conditional expression to the query.

Tasks

Add a CASE Expression to an SQL Query in the Designer Tab
Perform the following steps to add a CASE expression to the SQL query in the Designer tab of the properties window for the SQL Join transformation:
1 Access the CASE Expression window.
To do this, click CASE in the drop-down menu for an Operand in a WHERE, HAVING, or JOIN condition. You can also access the CASE option in the Expression column for any column listed in the Target table field in the Select tab.
2. Click New to begin the first condition of the expression. An editable row appears in the table.
3. Enter the appropriate WHEN condition and THEN result for the first WHEN/THEN clause. For information about entering these values, see the "Case Expression Window" topic in SAS Data Integration Studio Help.
4. Add the remaining WHEN/THEN clauses. You need to add one row for each clause.
5. Enter an appropriate value in the ELSE Result field. This value is returned for any row that does not satisfy one of the WHEN/THEN clauses.
6. Click OK to save the CASE expression and close the window. The following display depicts a sample completed CASE Expression window.

Display 17.11 Sample Completed CASE Expression Window

Note that the Operand field is blank. You can specify the operand only when the conditions in the CASE expression are all equality tests. The expression in this sample query uses comparison operators. Therefore, the UNITEDSTATES.Population column name must be entered for each WHEN condition in the expression. The following display depicts the Select tab and the highlighted CASE expression for a sample query.

Display 17.12 Sample CASE Expression Query

Note that the Population column in the Source tables field in the Select tab is mapped to both the Population and the Pop_Group columns in the Target table field. The second mapping, which links Population to Pop_Group, is created by the CASE expression described in this topic.

Creating or Configuring a WHERE Clause

Problem

You want to configure the WHERE clause for an SQL query that is used in a SAS Data Integration Studio job.
The conditions included in this clause determine which subset of the data from the source tables is included in the query results that are collected in the target table.

Solution

You can use the Where tab in the Designer tab of the properties window for the SQL Join transformation.

Tasks

Configure the WHERE Clause with the Where Tab

The WHERE clause for the query is an SQL expression that creates subsets of the source tables in the SQL query. It also defines the join criteria for joining the source tables and the subquery to each other by specifying which values to match. Perform the following steps to configure the Where tab:

1. If the Where clause object is missing from the process flow in the Create tab, double-click Where in the SQL Clauses pane. The Where clause object is added to the query flow in the Create tab. Note that Where clause objects are automatically populated into the Create tab. The WHERE clause is not automatically generated under the following circumstances:
   - the query contains only one source table
   - no relationship was found during the auto-join process
2. Click Where in the Navigate pane to access the Where tab.
3. Click New on the Where tab to begin the first condition of the expression. An editable row appears in the table near the top of the tab.
4. Enter the appropriate operands and operator for the first condition. For information about entering these values, see the "Where/Having/Join Tab" topic in SAS Data Integration Studio Help.
5. Add the remaining conditions for the WHERE clause. You need to add one row for each condition.
6. Click Apply to save the WHERE clause settings to the query.
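Taken together, a CASE expression built in the CASE Expression window and conditions built on the Where tab produce generated PROC SQL code of this general shape. This is an illustrative sketch only: the srclib library, the population cutoffs, and the column modifiers are assumptions patterned on the sample query, not code that the product is guaranteed to emit verbatim.

```sas
proc sql;
   create table srclib.STATE_DATA as
   select
      us.Name,
      us.Population,
      /* derived column defined in the CASE Expression window */
      case
         when us.Population > 5000000 then 'LARGE'
         when us.Population > 1000000 then 'MEDIUM'
         else 'SMALL'
      end as Pop_Group length=8
   from srclib.UNITEDSTATES as us,
        srclib.POSTALCODES as pc
   where us.Name = pc.Name;   /* join criterion set on the Where tab */
quit;
```

The CASE expression populates the derived Pop_Group column in the SELECT clause, while the WHERE clause supplies the join criterion between the two aliased source tables.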
The conditions created for the sample query are depicted in the SQL code generated in this step in the SQL field, as shown in the following display.

Display 17.13 Sample Where Tab Settings

Note that the SQL code for the WHERE clause that is shown in the SQL field is identical to the highlighted WHERE clause code displayed in the Source tab.

Adding a GROUP BY Clause and a HAVING Clause

Problem

You want to group your results by a selected variable. Then, you want to subset the number of groups displayed in the results.

Solution

You can add a GROUP BY clause to group the results of your query. You can also add a HAVING clause that uses an aggregate expression to subset the groups returned by the GROUP BY clause that are displayed in the query results.

Tasks

Add a GROUP BY Clause to an SQL Query in the Designer Tab

Perform the following steps to add a GROUP BY clause to the SQL query in the Designer tab of the properties window for the SQL Join transformation:

1. Click Create in the Navigate pane to gain access to the Create tab and the SQL Clauses pane.
2. Double-click Group by in the SQL Clauses pane. The Group by object is added to the query flow in the Create tab.
3. Click Group by in the Navigate pane to access the Group by tab.
4. Select the column that you want to use for grouping the query results from the Available columns field. Then, move the column to the Group by columns field.
5. Click Apply to save the GROUP BY clause settings to the query. The following display depicts a sample SQL query grouped with a GROUP BY clause.

Display 17.14 Sample SQL Query Grouped With a GROUP BY Clause

Note that the group by column is set on the Group by tab, and the resulting SQL code is highlighted on the Source tab.
The GROUP BY clause in the sample query groups the results of the query by the region of the United States.

Add a HAVING Clause to an SQL Query on the Designer Tab

Perform the following steps to add a HAVING clause to the SQL query on the Designer tab of the properties window for the SQL Join transformation:

1. Click Create in the Navigate pane to gain access to the Create tab and the SQL Clauses pane.
2. Double-click Having in the SQL Clauses pane. The Having object is added to the query flow on the Create tab.
3. Click Having in the Navigate pane to access the Having tab.
4. Click New on the Having tab to begin the first condition of the expression. An editable row appears in the table near the top of the tab.
5. Enter the appropriate operands and operator for the first condition. For information about entering these values, see the "Where/Having/Join Tab" topic in SAS Data Integration Studio Help.
6. Add the remaining conditions for the HAVING clause. You need to add one row for each condition.
7. Click Apply to save the HAVING clause settings to the query. The condition created for the sample query is depicted in the SQL code generated in this step in the SQL field, as shown in the following display.

Display 17.15 Sample SQL Query Subsetted with a HAVING Clause

Note that the SQL code for the HAVING clause that is shown in the SQL field is identical to the highlighted HAVING clause code displayed on the Source tab. The HAVING clause subsets the groups that are included in the results for the query. In the sample, only the regions with an average population density of less than 100 are included in the query results.
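The GROUP BY and HAVING settings described above correspond to PROC SQL code along these lines. This is a sketch: the srclib library and the Region and Density column names are assumptions modeled on the sample query.

```sas
proc sql;
   create table srclib.REGION_SUMMARY as
   select
      Region,
      avg(Density) as Avg_Density
   from srclib.UNITEDSTATES
   group by Region               /* set on the Group by tab  */
   having avg(Density) < 100     /* set on the Having tab    */
   order by Region;              /* optional, Order by tab   */
quit;
```

Note the division of labor: a WHERE condition filters individual rows before grouping, while the HAVING condition filters whole groups after the aggregate function is computed.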
Therefore, the Western and Mountain region results are included, but the Eastern and Midwestern region results are not.

Adding an ORDER BY Clause

Problem

You want to sort the output data in an SQL query that is included in a SAS Data Integration Studio job.

Solution

You can use the Order by tab on the Designer tab in the properties window for the SQL Join transformation.

Tasks

Add an ORDER BY Clause to an SQL Query in the Designer Tab

You can add an ORDER BY clause to establish a sort order for the query results. Perform the following steps to add an ORDER BY clause to the SQL query in the Designer tab of the properties window for the SQL Join transformation:

1. Click Create in the Navigate pane to gain access to the Create tab and the SQL Clauses pane.
2. Double-click Order by in the SQL Clauses pane. The Order by object is added to the query flow in the Create tab.
3. Right-click the Order by object and click Edit in the pop-up menu.
4. Select the column that you want to use for ordering the query results from the Available columns field. Then, move the column to the Order by columns field. Finally, enter a value in the Sort Order field to determine whether the results are sorted in ascending or descending order.
5. Click Apply to save the ORDER BY clause settings to the query. The following display depicts a sample SQL query with an ORDER BY clause.

Display 17.16 Sample SQL Query Sorted With an ORDER BY Clause

Note that Order by columns is set in the Order by tab, and the resulting SQL code is highlighted in the Source tab.

Adding Subqueries

Problem

You want to add one or more subqueries to an existing SQL query by using the Designer tab of the properties window for the SQL Join transformation.

Solution

Use the Subquery object in the Designer tab that is located in the properties window for the SQL Join transformation.
The sample job used in Add a Subquery to an Input Table on page 312 adds a subquery to an input table. This subquery reduces the amount of data that is processed in the main SQL query because it runs and subsets data before the SELECT clause is run. Add a Subquery to an SQL Clause on page 314 covers adding a subquery to a SELECT, WHERE, or HAVING clause in an SQL query.

Tasks

Add a Subquery to an Input Table

You can add the source and target tables directly to the process flow diagram for the job. You can also add a table, join, or subquery to a job by dragging and dropping it on an empty source or target drop zone in the Create tab in the Designer tab of the properties window for the SQL Join transformation. If you drop a table on an existing table in the Designer tab, the new table will replace the existing table.

You can even add a new drop zone to the query flow in the Create tab. To perform this task, select one of the join icons from the Joins directory in the SQL Clauses pane and drop it on the Select object, a table, or an existing join in the flow. The join and its drop zone will be displayed in the query flow. Use this method to add a subquery to the job.

Perform the following steps to create a subquery that refines the SQL query:

1. Select Inner in the Joins folder in the SQL Clauses pane, and drop it on the Select object in a query flow displayed on the Create tab. An Inner object is added to the query flow.
2. Select SubQuery in the Select Clauses folder in the SQL Clauses pane. Drop it in the drop zone attached to the new Inner object in the query flow. The subquery is added to the join, as shown in the sample job depicted in the following display.

Display 17.17 SQL Query with an Added Subquery

3. Click the SubQuery object. Note that the SubQuery Properties pane displays. Enter an appropriate value in the Alias field. (RegionQry was entered in the sample job.) Enter an alias here; otherwise, the subquery will fail.
The system-generated name for the subquery results table is too ambiguous to be recognized as an input to the full SQL query.
4. Right-click SubQuery, and click Edit. The SubQuery pane is displayed.
5. Drop the source table on the drop zone for the Select clause object.
6. Enter an alias for the source table into its properties pane. This step is optional, but it does simplify the SQL code.
7. Right-click Select and click Edit to display the Select tab. Make sure that the source table columns are mapped properly to the RegionQry target table. Also, ensure that the Select * property in the Select Properties pane is set to No.
8. Click SubQuery in the Navigate pane to return to the SubQuery tab. Then, select Where in the SQL Clauses folder of the SQL Clause pane. Finally, drop the Where icon into an empty spot in the SubQuery tab. A Where clause object is added to the SubQuery tab.
9. Right-click Where and click Edit to display the Where tab.
10. Click New on the Where tab to begin the first part of the expression. An editable row appears in the table near the top of the tab.
11. Create your first WHERE condition. In this example, a condition was created that subsets the Region column from the Region table to select values from the eastern region. To recreate the condition, click the drop-down menu in the Operand field on the left side of the row, and click Choose column(s). Then, drill down into the Region table, and select the Region column. The value r.Region displays in the field.
12. Keep the default value of = in the Operator field. Enter the value E in the Operand field on the right side of the row.
13. Create the remaining conditions for the WHERE statement. Review the SQL code generated in this step in the SQL field, as shown in the following display.

Display 17.18 Where Tab in the Subquery

14. A connection is required between the source table for the subquery and the target table for the query.
To recreate the sample, right-click in the Target table field of the Select tab and click New Column in the pop-up menu.
15. Enter the name of the subquery source table in the Name field. Then, make sure that the new column has the appropriate data type.
16. Click Apply to save the new column into the target table. You will have to add a mapping for the subquery to the main query SELECT clause, and add the subquery to the main query WHERE clause. The following display depicts the input table subquery.

Display 17.19 Sample Input Table Subquery

You can compare the tree view of the subquery in the Navigate pane to the process flow in the SubQuery tab and the highlighted code in the Source tab. Note that you can add a subquery any place that you can add a table or a column.

Add a Subquery to an SQL Clause

You can also add a subquery to SELECT, WHERE, and HAVING clauses in SQL queries. The following display shows how a subquery can be added as a condition to a WHERE clause.

Display 17.20 Add a Subquery to a WHERE Clause

Note that the subquery is connected to the WHERE clause with the EXISTS operator, which you can select from the drop-down menu in the Operator field. To add the subquery, click in the Operand field on the right-hand side of the Where tab. Then, click Subquery from the drop-down menu. The following display shows the completed sample subquery.

Display 17.21 Sample WHERE Clause Subquery

The subquery includes a SELECT clause, a WHERE clause, and a HAVING clause. You can compare the tree view of the subquery in the Navigate pane to the process flow in the SubQuery tab and the highlighted code in the Source tab.
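In generated code, the two subquery patterns described above correspond to an inline view (a subquery used as an input table) and an EXISTS condition in the WHERE clause. The sketch below is illustrative; the srclib library, the REGION and POSTALCODES tables, and the column names are assumptions patterned on the sample job, including its RegionQry-style alias.

```sas
proc sql;
   create table srclib.EAST_STATES as
   select us.Name, us.Population, rq.Region
   from srclib.UNITEDSTATES as us
        inner join
        /* subquery as an input table; the alias stands in for RegionQry */
        (select r.Region, r.Name
         from srclib.REGION as r
         where r.Region = 'E') as rq
        on us.Name = rq.Name
   /* subquery as a WHERE condition, connected with EXISTS */
   where exists
      (select p.Name
       from srclib.POSTALCODES as p
       where p.Name = us.Name);
quit;
```

Because the inline view runs and subsets its rows before the outer SELECT, it can reduce the volume of data that the main query has to process, which is the rationale given for the input-table subquery.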
The HAVING clause nested into the WHERE clause enables you to further refine data selection in the query.

Submitting an SQL Query

Problem

You want to submit an SQL query, either to verify that it will work properly or to run it as part of a SAS Data Integration Studio job.

Solution

You can submit an SQL query in two distinct contexts. First, you can submit it from the Designer tab of the SQL Join transformation. This approach can be helpful when you want to make sure that your query runs properly and returns the data that you're seeking. You can also submit the query when you run the SAS Data Integration Studio job that contains the SQL Join transformation.

Tasks

Submit a Query from the Designer Tab of the SQL Join Transformation

Perform the following steps to submit a query from the Designer tab:

1. Submit the query in one of the following ways:
   - Click Submit on the SAS Data Integration Studio menu bar.
   - Right-click on the Create, SubQuery, Where, Having, Join, or Source tab. Then, click Submit.
   - Click Submit on the SAS Data Integration Studio SQL menu.
   If Run Prior Steps on Submit is selected on the pop-up menu or the SQL menu, the steps that are placed before the SQL query code are submitted. (These steps are used to create the source tables for the query.) When deselected, this option runs the SQL query code only. This setting enables you to test changes to the SQL query. This option takes effect only when work tables are used as the source.
2. Validate the query as needed. For example, you can check the properties of the target table. You can also review the data populated into the target table in the View Data window. Finally, you can examine the Log tab to verify that the query was submitted successfully or to troubleshoot an unsuccessful submission.

Note: You can use the Job Status Manager in SAS Data Integration Studio to cancel the SQL query. The SQL Join transformation is displayed as a row in the Job Status Manager.
You can right-click the row and click Cancel Job to cancel the query. The SQL Join transformation is currently the only transformation that supports this type of cancellation.

Submit a Query as a Part of a SAS Data Integration Studio Job

Perform the following steps to submit a query from the SAS Data Integration Studio job:

1. Submit the query in one of the following ways:
   - Click Submit on the SAS Data Integration Studio menu bar.
   - Right-click in the Process Designer window. Then, click Submit.
   - Click Submit on the SAS Data Integration Studio Process menu.
2. Validate the job as needed. For example, you can check the properties of the target table. You can also review the data populated into the target table in the View Data window. Finally, you can examine the Log tab to verify that the job was submitted successfully or to troubleshoot an unsuccessful submission.

Joining a Table to Itself

Problem

You want to join a table to itself.

Solution

You can join the table to itself by creating the second version of the table with an alias. Then, you can take some of the variables used in a query from the original table and take the remaining variables from the newly created copy of the table.

Tasks

Join the Table to Itself

Perform the following steps to join a table to itself and use the resulting hierarchy of tables in a query:

1. Create an SQL query in an empty job. The query should contain the SQL Join transformation, at least one source table, and a target table.
2. Open the Designer tab in the SQL Join transformation.
Click Create in the Navigate pane to access the Create tab and the SQL Clauses pane.
3. Drop a join from the SQL Clauses pane on the Select object on the Create tab.
4. Drop the table used as a source table for the query on the drop zone for the join. You will be prompted to supply an alias for the table because it is already being used as a source table for the query.
5. Enter the alias in the Alias field of the properties pane for the table.
6. Click Apply to save the alias for the table that you just added.
7. Complete any additional configuration needed to finish the query. The following display shows a sample job that includes a table joined to itself.

Display 17.22 Sample Job with a Table Joined to Itself

Note that the table joins shown on the Create tab are reflected in the FROM clause highlighted on the Source tab. The query shown in the sample job pulls the Name variable from the original table (denoted with the us alias). However, it pulls the Population and Area variables from the copy of the original table (denoted with the uscopy alias).

Using Parameters with an SQL Join

Problem

You want to include an SQL Join transformation in a parameterized job. Then, the parameterized job is run in an iterative job. The iterative job contains a control loop in which one or more processes are executed multiple times. This arrangement allows you to iteratively run a series of tables in a library through your SQL query. For example, you could process a series of 50 tables that represent each of the 50 states in the United States through the same SQL query.

Solution

You can create one or more parameters on the Parameters tab in the properties window for the SQL Join transformation. Then, you can use the parameters to tie the SQL Join transformation to the other parts of the parameterized job and the iterative job that contains it.
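The "Joining a Table to Itself" topic above aliases one source table so that it can appear twice in the FROM clause. In generated code, the pattern looks roughly like the following sketch; the srclib library and the column names are illustrative, following the sample job's us and uscopy aliases.

```sas
proc sql;
   create table srclib.STATE_FACTS as
   select
      us.Name,              /* taken from the original table   */
      uscopy.Population,    /* taken from the aliased copy     */
      uscopy.Area
   from srclib.UNITEDSTATES as us
        inner join
        srclib.UNITEDSTATES as uscopy
        on us.Name = uscopy.Name;
quit;
```

The alias is what makes the self-join legal: without distinct names for the two references, the join condition could not distinguish one side of the table from the other.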
The following prerequisites must be satisfied before the SQL Join transformation can work in this iterative setting:
- The SQL Join transformation must be placed in a parameterized job. See Creating a Parameterized Job on page 335.
- One or more parameters must be set for the input and output tables for the parameterized job. See Set Input and Output Parameters on page 336.
- One or more parameters must be set for the parameterized job. See Set Parameters for the Job on page 337.
- The parameterized job must be embedded in an iterative job. See About Iterative Jobs on page 331.
- The parameters from the parameterized job must be mapped on the Parameter Mapping tab of the properties window for the iterative job. See About Iterative Jobs on page 331.
- The tables that you need to process through the query created in the SQL Join transformation must be included in the control table for the iterative job. See Creating a Control Table on page 338.

Constructing a SAS Scalable Performance Data Server Star Join

Problem

You want to construct SAS Scalable Performance Data (SPD) Server star joins.

Solution

You can use the SAS Data Integration Studio SQL Join transformation to construct SAS SPD Server star joins when you use SAS SPD Server version 4.2 or later.

Tasks

Construct an SPD Server Star Join

Star joins are useful when you query information from dimensional models that are constructed of two or more dimension tables that surround a centralized fact table, referred to as a star schema. SAS SPD Server star joins are queries that validate, optimize, and execute SQL queries in the SAS SPD Server database for performance. If the star join is not used, the SQL will be processed in SAS SPD Server using pair-wise joins, which require one step for each table to complete the join.
When the SAS SPD Server options are set, the star join is enabled. You must meet the following requirements to enable a star join in the SAS SPD Server:
- All dimension tables must surround a single fact table.
- Dimension-to-fact table joins must be equal joins, and there should be one join per dimension table.
- You must have two or more dimension tables in the join condition.
- The fact table must have at least one subsetting condition placed on it.
- All subsetting and join conditions must be specified in the WHERE clause.
- Star join optimization must be enabled through the setting of options on the SAS SPD Server library.

In order to enable star join optimization, code that will run on the generated SAS SPD Server system library must have the following options added to the library:
- LIBGEN=YES
- IP=YES

Here is a commented example of a WHERE clause that will enable a SAS SPD Server star join optimization:

   where
      /* dimension1 equi-joined on the fact */
      hh_&statesimple.geosur = hh_dim_geo_&statesimple.geosur
      /* dimension2 equi-joined on the fact */
      and hh_&statesimple.utilsur = hh_dim_utility_&statesimple.utilsur
      /* dimension3 equi-joined on the fact */
      and hh_&statesimple.famsur = hh_dim_family_&statesimple.famsur
      /* subsetting condition on the fact */
      and hh_dim_family_&statesimple.PERSONS = 1;

The following display depicts the Designer tab in an SQL Join transformation that is included in a job that constructs an SPD Server star join.

Display 17.23 Sample SPD Server Star Join Code in an SQL Query

Note that the SAS SPD Server requires all subsetting to be implemented on the Where tab in the SQL Join transformation. For more information about SAS SPD Server support for star joins, see the SAS Scalable Performance Data Server 4.4: User's Guide.
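The LIBGEN= and IP= options listed above are set on the SAS SPD Server LIBNAME statement. The following is a minimal sketch; the domain, host, port, and credential values are placeholders for your own SPD Server installation.

```sas
/* Hypothetical SPD Server library; domain, host, port, and
   credentials are placeholders for a real installation.      */
libname mylib sasspds 'spdsdomain'
   host='spdshost'
   serv='5400'
   user='myuser'
   password='XXXXXXXX'
   LIBGEN=YES   /* required for star join optimization */
   IP=YES;      /* required for star join optimization */
```

With these options on the library, queries that meet the structural requirements listed above become candidates for the STARJOIN optimization rather than pair-wise joins.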
When the code is properly configured, the following output is generated in the log:

   SPDS_NOTE: STARJOIN optimization used in SQL execution

Additional Information

For detailed information about specific usage issues with the SQL Join transformation, see the "SAS Data Integration Studio Usage Notes" topic in SAS Data Integration Studio Help.

Optimizing SQL Processing Performance

Problem

Joins are a common and resource-intensive part of SAS Data Integration Studio. SAS SQL implements several well-known join algorithms: sort-merge, index, and hash. You can use common techniques to aid join performance, irrespective of the algorithm chosen. Conditions often cause the SAS SQL optimizer to choose the sort-merge algorithm; techniques that improve sort performance also improve sort-merge join performance. However, understanding and leveraging index and hash joins will enhance performance.

It is common in SAS Data Integration Studio to perform lookups between tables. Based on key values in one table, you look up matching keys in a second table and retrieve associated data in the second table. SQL joins can perform lookups. However, SAS and SAS Data Integration Studio provide special lookup mechanisms that typically outperform a join. The problems associated with joins are similar to those with sorting:
- Join performance seems slow.
- You have trouble influencing the join algorithm that SAS SQL chooses.
- You experience higher than expected disk space consumption.
- You have trouble operating SAS SQL joins with RDBMS data.

Solution

Review the techniques explained in the following topics:
- Debugging an SQL Query on page 296
- Enabling Pass-Through Processing on page 325
- Influencing the Join Algorithm on page 322
- Performing General Data Optimization on page 321
- Understanding Automatic Joins on page 289
- Setting the Implicit Property for a Join on page 324
- Selecting the Join Type on page 292
- Using Property Sheet Options to Optimize SQL Processing Performance on page 327

Performing General Data Optimization

Problem

You want to streamline the data as much as possible before you run it through SQL processing in a SAS Data Integration Studio job.

Solution

You can minimize the input and output overhead for the data. You can also pre-sort the data.

Tasks

Minimize Input/Output (I/O) Processing

To help minimize I/O and improve performance, you can drop unneeded columns, minimize column widths (especially from Database Management System [DBMS] tables that have wide columns), and delay the inflation of column widths until the end of your SAS Data Integration Studio flow. (Column width inflation becomes an issue when you combine multiple columns into a single column to use a key value.)

Pre-Sort Data

Pre-sorting can be the most effective means to improve overall join performance. A table that participates in multiple joins on the same join key usually benefits from pre-sorting. For example, if the ACCOUNT table participates in four joins on ACCOUNT_ID, then pre-sorting the ACCOUNT table on ACCOUNT_ID helps optimize three joins. However, the overhead associated with sorting can degrade performance. You can sometimes achieve better performance when you subset by using the list of columns in the SELECT statement and the conditions set in the WHERE clause.

Note: Integrity constraints are automatically generated when the query target of the SQL transformation is a physical table.
You can control the generation of these constraints by using a Table Loader transformation between the SQL Join transformation and its physical table.

Influencing the Join Algorithm

Problem

You want to influence the SAS SQL optimizer to choose the join algorithm that will yield the best possible performance for the SQL processing included in a SAS Data Integration Studio job. SAS SQL implements several well-known join algorithms: sort-merge, index, and hash.

Solution

There are common techniques to aid join performance, irrespective of the algorithm chosen. These techniques use options that are found on the SQL Properties pane and the properties panes for the tables found in SAS queries. However, selecting a join algorithm is important enough to merit a dedicated topic. You can use the Debug property on the SQL Join Properties pane to run the _method option, which adds a trace to the Log tab that indicates which algorithm is used and when.

Tasks

Sort-Merge Joins

Conditions often cause the SAS SQL optimizer to choose the sort-merge algorithm, and techniques that improve sort performance also improve sort-merge join performance. However, understanding and using index and hash joins can provide performance gains. Sort-merge is the algorithm most often selected by the SQL optimizer. When index nested loop and hash join are eliminated as choices, a sort-merge join or simple nested loop join is used. A sort-merge sorts one table, stores the sorted intermediate table, sorts the second table, and finally merges the two to form the join result. Use the Suggest Sort Merge Join property on the SQL Properties pane to encourage a sort-merge. This property adds MAGIC=102 to the PROC SQL invocation, as follows: proc sql _method magic=102;.

Index Joins

An index join looks up each row of the smaller table by querying an index of the larger table. When chosen by the optimizer, an index join usually outperforms a sort-merge join on the same data.
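The _method trace and the IDXWHERE= hint mentioned above can be combined in PROC SQL code like the following sketch. The table names are illustrative; the larger table is assumed to have an index on the join key.

```sas
/* Trace the chosen join algorithm in the log (_method) and
   encourage an index join with the IDXWHERE= data set option.
   smalltable and largetable are hypothetical names.            */
proc sql _method;
   create table work.joined as
   select s.key, s.attr, l.detail
   from work.smalltable as s,
        work.largetable(idxwhere=yes) as l
   where s.key = l.key;
quit;
```

Reading the _method output in the log is the practical way to confirm whether the optimizer actually honored the hint, since these options are suggestions rather than directives.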
To get the best join performance, you should ensure that both tables have indexes created on any columns that you want to participate in the join relationship. The SAS SQL optimizer considers an index join when:
- The join is an equijoin in which tables are related by equivalence conditions on key columns.
- Joins with multiple conditions are connected by the AND operator.
- The larger table has an index composed of all the join keys.

Encourage an index nested loop with IDXWHERE=YES as a data set option, as follows: proc sql _method; select ... from smalltable, largetable(idxwhere=yes). You can also turn on the Suggest Index Join property on the properties panes for the tables in the query.

Hash Joins

The optimizer considers a hash join when an index join is eliminated as a possibility. With a hash join, the smaller table is reconfigured in memory as a hash table. SQL sequentially scans the larger table and performs row-by-row hash lookups against the small table to form the result set. A memory-sizing formula, which is not presented here, determines whether a hash join is chosen. The formula is based on the PROC SQL option BUFFERSIZE, whose default value is 64 KB. On a memory-rich system, you can encourage a hash join by increasing BUFFERSIZE, which increases the likelihood that a hash join is chosen. For example, set the Buffer Size property on the SQL Properties pane to 1048576.

Setting the Implicit Property for a Join

Problem

You want to decide whether the Implicit property for a join should be enabled. This setting determines whether the join condition is processed implicitly in a WHERE statement or explicitly in a FROM clause in the SELECT statement.

Solution

You can access the Implicit property in the SQL Properties pane. You can also right-click a join in the Create tab to access the property in the pop-up menu.
The following table depicts the settings available for each type of join, along with a sample of the join condition code generated for the join type:

Table 17.5 Implicit and Explicit Properties for SQL Join Types

Inner: Can generate an implicit inner join condition in a WHERE statement near the end of the query:

      where
         POSTALCODES.Name = UNITEDSTATES.Name

   You can use an implicit join only when the tables are joined with the equality operator. You can also generate an explicit inner join condition in a FROM clause in the SELECT statement:

      from
         srclib.POSTALCODES inner join
         srclib.UNITEDSTATES
      on
         (POSTALCODES.Name = UNITEDSTATES.Name)

Full: Can generate an explicit join condition in a FROM clause in the SELECT statement:

      from
         srclib.POSTALCODES full join
         srclib.UNITEDSTATES
      on
         (POSTALCODES.Name = UNITEDSTATES.Name)

Left: Can generate an explicit join condition in a FROM clause in the SELECT statement:

      from
         srclib.POSTALCODES left join
         srclib.UNITEDSTATES
      on
         (POSTALCODES.Name = UNITEDSTATES.Name)

Right: Can generate an explicit join condition in a FROM clause in the SELECT statement:

      from
         srclib.POSTALCODES right join
         srclib.UNITEDSTATES
      on
         (POSTALCODES.Name = UNITEDSTATES.Name)

Cross: Can generate an explicit join condition in a FROM clause in the SELECT statement:

      from
         srclib.POSTALCODES cross join
         srclib.UNITEDSTATES

Union: Can generate an explicit join condition in a FROM clause in the SELECT statement:

      from
         srclib.POSTALCODES union join
         srclib.UNITEDSTATES

The Implicit property is disabled by default for all of the join types except the inner join.

Enabling Pass-Through Processing

Problem

You want to decide whether to enable pass-through processing, which sends DBMS-specific statements to a database management system and retrieves the DBMS data directly.
In some situations, pass-through processing can improve the performance of the SQL Join transformation in the context of a SAS Data Integration Studio job. Pass-through processing is enabled with options that are found on the SQL Properties pane and the properties panes for the tables found in SAS queries. However, its impact can be significant enough to merit a dedicated topic.

Solution

You can use the Pass Through property on the SQL Join Properties pane to determine whether explicit pass-through processing is used. When the Pass Through property is set to Yes, you can send DBMS-specific statements to a database management system and retrieve DBMS data directly, which is sometimes faster than processing the SQL query on the SAS system. When Pass Through is set to No, explicit pass-through processing is not used.

Tasks

Explicit Pass-Through Processing

Explicit pass-through is not always feasible. The query has to be able to work as is on the database. Therefore, if the query contains anything specific to SAS beyond the outermost SELECT columns portion, the database generates errors. For example, using any of the following in a WHERE clause expression, or in a subquery on the WHERE or FROM clauses, causes the code to fail on the database if Pass Through is set to Yes:

- SAS formats
- SAS functions
- DATE or DATETIME literals or actual numeric values
- date arithmetic (usually will not work)
- INTO: macro variables
- data set options

The SQL Properties pane also contains the Target Table is Pass Through property, which determines whether explicit pass-through is active for the target table. This property enables the selected rows to be inserted into the target within the explicit operation. This is valid only when all the tables in the query, including the target, are on the same database server.
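In generated code, explicit pass-through typically uses PROC SQL's pass-through facility, in which the inner query is sent verbatim to the database. A minimal sketch, assuming a hypothetical Oracle connection; the connection values and table and column names are placeholders, not taken from this guide:

```sas
proc sql;
   connect to oracle (user=scott password=XXXX path=dbsrv);
   /* The inner query runs entirely on Oracle, so it must be valid
      Oracle SQL: no SAS formats, functions, or data set options. */
   create table work.result as
   select * from connection to oracle
      (select c.custid, sum(o.amount) as total
       from customers c join orders o on c.custid = o.custid
       group by c.custid);
   disconnect from oracle;
quit;
```

Only the final result set crosses back into SAS, which is why this style can be much faster than reading every row into SAS first.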
The Target Table is Pass Through property has a corresponding property named Target Table Pass Through Action. The Truncate option in this property is useful for DBMS systems that do not allow the target to be deleted or created. In this case, the only option is removing all of the rows. If Truncate is selected, the table has all of its rows deleted. If the table doesn't exist, it is created.

Implicit Pass-Through Processing

Even if Pass Through is set to No, PROC SQL still tries to pass the query, or part of the query, down to the database with implicit pass-through. This attempt to optimize performance is made without the user having to request it. SQL implicit pass-through is a silent optimization that is done in PROC SQL. Implicit pass-through interprets SAS SQL statements and, whenever possible, rewrites the SAS SQL into database SQL. There is no guarantee that the SQL will be passed to the database, but PROC SQL tries to generate SQL that will pass. If the optimization succeeds in passing a query (or parts of a query) directly to a database, the SQL query executes on the database and only the results of the query are returned to SAS. This can greatly improve the performance of the PROC SQL code. If the query cannot be passed to the database, records are read and passed back to SAS, one at a time. Implicit pass-through is disabled by the following query constructs:

- Heterogeneous queries: Implicit pass-through is not attempted for queries that involve different engines, or for queries that involve a single engine with multiple librefs that cannot share a single connection because they have different connection properties (such as a different DATABASE= value).
You can use the Pass Through property to run these queries with explicit pass-through processing. You can also use the Upload Library Before SQL, Pre-Upload Action, and Use Bulkload for Upload properties in the table properties panes to improve the situation.

   Note: The Upload Library Before SQL property can be used to create a homogeneous join, which then can enable an explicit pass-through operation. This property allows you to select another library on the same database server as other tables in the SQL query. The best choice for a library would be a temporary space on that database server. The operations on that temporary table can also be modified to choose between deleting all rows or deleting the entire table. Bulk load is also an option for the upload operation with the Use Bulkload for Uploading property. It is generally good practice to upload the smaller of the tables in the SQL query because this operation could be expensive.

- Queries that incorporate explicit pass-through statements: If explicit pass-through statements are used, the statements are passed directly to the database as they are. Therefore, there is no need to prepare or translate the SQL with implicit pass-through to make it compatible with the database. It is already assumed to be compatible.
- Queries that use SAS data set options: SAS data set options cannot be honored in a pass-through context.
- Queries that use an INTO: clause: The memory associated with the host variable is not available to the DBMS processing the query. The INTO: clause is not supported in the SQL Join transformation.
- Queries that contain the SAS OUTER UNION operator: This is a non-ANSI SAS SQL extension.
- Specification of a SAS language function that is not mapped to a DBMS equivalent by the engine. These vary by database.
- Specification of ANSIMISS or NOMISS in the join syntax.

Using Property Sheet Options to Optimize SQL Processing Performance

Problem

You want to set specific options in the SQL Properties pane or table properties panes located on the Designer tab in the properties window of an SQL Join transformation. These options are intended to improve the performance of SQL processes included in a SAS Data Integration Studio job.

Solution

Use one of the following techniques:

- Bulk load tables.
- Optimize the SELECT statement.
- Set buffering options.
- Use threaded reads.
- Write user-written code.

Tasks

Bulk Load Tables

The fastest way to insert data into a relational database when using the SAS/ACCESS engine is to use the bulk-loading capabilities of the database. By default, the SAS/ACCESS engines load data into tables by preparing an SQL INSERT statement, executing the INSERT statement for each row, and issuing a COMMIT. If you specify BULKLOAD=YES as a DATA step or LIBNAME option, the database load utility is invoked instead. This enables you to bulk load rows of data as a single unit, which can significantly enhance performance. You can set the BULKLOAD option on the Bulkload to DBMS property pane for the target table. Some databases require that the table be empty in order to load records with their bulk-load utilities. Check your database documentation for this and other restrictions.

For smaller tables, the extra overhead of the bulk-load process might slow performance. For larger tables, the speed of the bulk-load process outweighs the overhead costs. Each SAS/ACCESS engine invokes a different load utility and uses different options. More information about how to use the bulk-load option for each SAS/ACCESS engine can be found in the online documentation for each engine.

The Use Bulkload for Uploading and Bulkload Options properties are available on the properties window for each table in a query. The Use Bulkload for Uploading property applies to the source table.
It is a valid option only when the source table is being uploaded to the DBMS to create a homogeneous join. The Bulkload to DBMS property applies to target tables and turns bulk loading on and off. The Bulkload to DBMS property is not valid when the Target Table is Pass Through property on the SQL Properties pane is set to Yes.

The option to bulk load tables applies only to source tables that are participating in a heterogeneous join. Also, the user must be uploading the table to the DBMS where the join is performed.

Optimize the SELECT Statement

If you set the Select * property to Yes in the Select Properties pane, a SELECT * statement that selects all columns in the order in which they are stored in a table is run when the query is submitted. If you set the Select * property to No and enter only the columns that you need for the query in the SELECT statement, you can improve performance. You can also enhance performance by carefully ordering columns so that non-character columns (such as numeric, DATE, and DATETIME) come first and character columns come last.

Set Buffering Options

You can adjust I/O buffering. Set the Buffer Size property to 128 KB to promote fast I/O performance (or to 64 KB to enhance large, sequential processes). The Buffer Size property is available in the SQL Properties pane. Other buffering options are database-specific and are available in the properties pane for each of the individual tables in the query. For example, you can set the READBUFF option by entering a number in the Number of Rows in DBMS Read property in the properties pane, which buffers the database records read before passing them to SAS. INSERTBUFF is an example of another option available on some database management systems. You should experiment with different settings for these options to find optimal performance for your query.
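In generated code, these properties surface as SAS/ACCESS data set options. A hedged sketch of what that looks like; the dblib libref, table names, and column names are placeholders:

```sas
/* READBUFF=5000 fetches rows from the DBMS in batches of 5000;
   INSERTBUFF=1000 groups the rows inserted into the target. */
proc sql;
   create table dblib.target (insertbuff=1000) as
   select acct_id, balance
   from dblib.accounts (readbuff=5000)
   where balance > 0;
quit;
```

As the next paragraphs explain, this convenience has a cost: data set options like these can block or break pass-through processing, so they pay off mainly when the query is going to run in SAS anyway.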
These are data set options; therefore, do not specify them unless you know that neither explicit nor implicit pass-through will be used on that portion of the query, because they could actually slow performance. If these options are present in the query at all, they prevent implicit pass-through processing. If these options are present on the part that is being explicitly passed through, a database error occurs because the database won't recognize these options. For example, if the Target Table is Pass Through property on the SQL Properties pane is set to Yes, then using INSERTBUFF data set options on this target table causes an error on the database. If the Pass Through property in the SQL Properties pane is set to Yes and a number is specified in the Buffer Size property, the database will not recognize this option on the FROM clause of the query and will return an error. One way around the risk of preventing implicit pass-through is to specify these options on the LIBNAME statement instead, but then they apply to all tables that use that LIBNAME and to all access to those tables. That said, these buffering data set options are great performance boosters if the database records will all be copied to SAS before the query runs in SAS (with no pass-through), because they buffer the I/O between the database and SAS into memory. The default is 1, which is inefficient.

Use Threaded Reads

Threaded reads divide resource-intensive tasks into multiple independent units of work and execute those units simultaneously. SAS can create multiple threads, and a read connection is established between the DBMS and each SAS thread. The result set is partitioned across the connections, and rows are passed to SAS simultaneously (in parallel) across the connections. This improves performance.

To perform a threaded read, SAS first creates threads, which are standard operating-system tasks controlled by SAS, within the SAS session.
Next, SAS establishes a DBMS connection on each thread. SAS then causes the DBMS to partition the result set and reads one partition per thread. To cause the partitioning, SAS appends a WHERE clause to the SQL so that a single SQL statement becomes multiple SQL statements, one for each thread. The DBSLICE option specifies user-supplied WHERE clauses to partition a DBMS query for threaded reads. The DBSLICEPARM option controls the scope of DBMS threaded reads and the number of DBMS connections. You can enable threaded reads with the Parallel Processing with Threads property on the SQL Properties pane.

Write User-Written Code

The User Written property determines whether the query is user-written or generated. When the User Written property on the SQL Properties pane is set to Yes, you can edit the code on the Source tab, and the entire job is saved as user-written. When the User Written property in the Where, Having, or Join Properties pane is set to Yes, you can enter code directly into the field. Therefore, you can either write a new SQL query from scratch or modify a query that is generated when conditions are added to the top section of the Where/Having/Join tab. When User Written is set to No in any properties pane, the SQL field is read-only and displays only the generated query. User-written code should be used as a last resort because the code can't be regenerated from the metadata when there are changes.
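The DBSLICE and DBSLICEPARM options described above appear in code as SAS/ACCESS data set options. A sketch under assumed names (the dblib libref, the transactions table, and the region column are placeholders):

```sas
/* DBSLICEPARM=(ALL, 4) asks the engine to thread any read,
   using up to four DBMS connections. */
proc sql;
   create table work.big as
   select * from dblib.transactions (dbsliceparm=(all, 4));
quit;

/* DBSLICE supplies explicit WHERE clauses, one partition per thread. */
data work.split;
   set dblib.transactions (dbslice=("region='EAST'" "region='WEST'"));
run;
```

DBSLICEPARM lets the engine pick the partitioning column, while DBSLICE gives you full control; with DBSLICE, the supplied clauses should cover the whole table without overlapping, or rows are lost or duplicated.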
The User Written property is available in the SQL Properties pane and in the Where/Having/Join Properties pane.

Chapter 18: Working with Iterative Jobs and Parallel Processing

About Iterative Jobs 331
Additional Information 332
Creating and Running an Iterative Job 332
   Problem 332
   Solution 333
   Tasks 333
      Create and Run the Iterative Job 333
      Variation: Add the Input and Transformation Directly To a Job 333
      Examine the Results 334
   Additional Information 335
Creating a Parameterized Job 335
   Problem 335
   Solution 335
   Tasks 336
      Create the Parameterized Job 336
      Set Input and Output Parameters 336
      Set Parameters for the Job 337
      Complete Parameterized Job Configuration 337
   Additional Information 337
Creating a Control Table 338
   Problem 338
   Solution 338
   Tasks 338
      Create and Register the Control Table 338
      Populate the Control Table Job 339
   Additional Information 340
About Parallel Processing 340
Setting Options for Parallel Processing 341
   Problem 341
   Solution 342
   Tasks 342
   Additional Information 342

About Iterative Jobs

An iterative job is a job with a control loop in which one or more processes are executed multiple times. For example, the following display shows the process flow for an iterative job.

Display 18.1 Iterative Job

The process flow specifies that the inner Extract Balance job is executed multiple times, as specified by the Loop transformations and the CHECKLIB control table. The inner job is also called a parameterized job because it specifies its inputs and outputs as parameters.

The job shown in the previous example uses a control table that was created in a separate library contents job. This job created a control table that contains a static list of the tables included in the input library at the time that the job was run. You can also reuse an existing control table or create a new one.
Many times, you will want to add the library input and the Library Contents transformation directly to an iterative job, as shown in the following example.

Display 18.2 Control Table Job in an Iterative Job

When the input library and the Library Contents transformation are added to the iterative job, the contents of the control table are dynamically generated each time that the iterative job is run. This arrangement ensures that the list of tables in the Check Acct Lib table is refreshed each time that the job is run. It also ensures that the tables are processed iteratively as each row in the control table is read.

Additional Information

For an example of how the steps in the iterative process are performed, see the "Example: Create an Iterative Job That Uses Parallel Processing" topic in the SAS Data Integration Studio Help.

Creating and Running an Iterative Job

Problem

You want to run a series of similarly structured tables through the same task or series of tasks. For example, you might need to extract specific items of census data from a series of 50 tables. Each table in the series contains data from one of the 50 states in the United States.

Solution

You need to create an iterative job that enables you to run a series of tables through the tasks contained in a job that is placed between Loop and Loop End transformations. This iterative job also contains a control table that lists the tables that are fed through the loop.

Tasks

Create and Run the Iterative Job

Perform the following steps to create and run the iterative job:

1 Create the control table and parameterized job that will be included in the iterative job.
2 Create an empty job.
3 Drag the Loop End transformation from the Control folder in the Process Library tree. Then, drop it into the empty job.
4 Drag the parameterized job from the Project tree.
Then, drop it into the input drop zone for the Loop End transformation.
5 Drag the Loop transformation from the Control folder in the Process Library tree. Then, drop it into the input drop zone for the parameterized job.
6 Drag the control table and drop it into the input drop zone for the Loop transformation. A sample completed iterative job is shown below.

Display 18.3 Completed Iterative Job

7 Open the Loop Options tab in the properties window for the Loop transformation. Select the Execute iterations in parallel check box. Also select the One process for each available CPU node check box in the Maximum number of concurrent processes group box.
8 Open the Parameter Mapping tab. Make sure that the appropriate value in the Source table field is mapped to the parameter listed in the Parameters field. The exact mapping depends on the columns that are included in the source table and the parameter that is set on the parameterized job.
9 Close the properties window for the Loop transformation.
10 Run the iterative job.

Variation: Add the Input and Transformation Directly To a Job

You can customize the basic process by adding the library input and the Library Contents transformation directly to an iterative job, as shown in the following example.

Display 18.4 Control Table Job in an Iterative Job

When the input library and the Library Contents transformation are added to the iterative job, the contents of the control table are dynamically generated each time that the iterative job is run. This arrangement ensures that the list of tables in the control table is refreshed each time that the job is run. It also ensures that the tables are processed iteratively as each row in the control table is read. For information about control table jobs, see "Creating a Control Table" on page 338.

Examine the Results

The output for the completed iterative processing is found in the output table for the parameterized job.
In addition, the Loop transformation provides status and run-time information in the temporary output table that is available when it is included in a submitted job. Once the job has completed without any errors, you can perform the following steps to review both the status data and the iterative job output:

1 Right-click the Loop transformation and click View Data. A sample View Data window for the status information in the Loop transformation temporary output table is shown in the following example.

Display 18.5 Loop Transformation Temporary Output Table

Each row in this table contains information about an iteration in the job.
2 Double-click the icon for the parameterized job. After the parameterized job opens, right-click the target table icon and click View Data. A sample View Data window for the iterative data is shown in the following example.

Display 18.6 View of Target Table Output

Remember that you set a default value for the parameter on the output table when you set up the parameterized job. You can change the default value to see a different portion of the output data.

Additional Information

For detailed information about iterative jobs, see the "Example: Create an Iterative Job That Uses Parallel Processing" topic in the SAS Data Integration Studio Help. For detailed information about specific usage issues, see the "Usage Notes for Iterative Jobs" topic in the SAS Data Integration Studio Help.

Creating a Parameterized Job

Problem

You want to create a job that will enable you to perform an identical set of tasks on a series of tables. For example, you might need to extract specific demographic information for each of the 50 states in the United States when the data for each state is contained in a separate table.

Solution

You need to create a job that enables you to run each table through the loop in an iterative job. This job then writes data to an output table with each iteration.
You set parameters on the job, the input table, and the output table. Then, you connect the parameters to the control table in the iterative job.

Tasks

Create the Parameterized Job

Perform the following steps to create the parameterized job:

1 Create and register the input and output tables.
2 Create an empty job.
3 Drag the output table and drop it into the empty job. The output table must contain exactly the same columns as the tables listed in the control table for the loop processing in the iterative job to work properly.
4 Drag the transformation that will be used to process the job from the Process Library tree. Then, drop it into the input drop zone for the Loader transformation and release the mouse button.
5 Drag the input table from the Project tree. Then, drop it into the input drop zone for the transformation just added to the Process Designer.
6 By default, a temporary output table and a Table Loader transformation are added to the Process Designer, but they are not needed for this job. Right-click the Table Loader transformation and select the Delete option. Click OK in the Confirm Remove window. The temporary output table and the Table Loader are deleted. A sample completed parameterized job is shown in the following example.

Display 18.7 Completed Parameterized Job

Set Input and Output Parameters

Perform the following steps to set the input and output table parameters for the parameterized job:

1 Open the Parameters tab in the properties window for the input table. Click New to display the Create Parameter window. Enter appropriate values in the following fields:

   Parameter Name: a name for the macro variable, such as Marital Status.
   Macro Variable Name: a valid macro variable name, such as mstatus.
   Default value: a default value for the input table, such as CHECKING_ACCOUNT_DIVORCED. (You can enter the name of any table listed in the control table here.)

2 Close the Create Parameter window and open the Physical Storage tab.
Enter an appropriate value in the Name field. Create this value by combining an ampersand with the value entered in the Macro Variable Name field on the Create Parameter window (for example, &mstatus). Close the properties window for the input table.
3 Open the Parameters tab in the properties window for the output table. Click New to display the Create Parameter window. Enter appropriate values in the following fields:

   Parameter Name: a name for the macro variable, such as Marital Status Out.
   Macro Variable Name: a valid macro variable name, such as mstatus.
   Default value: a default value for the output table, such as CHECKING_ACCOUNT_DIVORCED. (You can enter the name of any table listed in the control table here.)

4 Close the Create Parameter window and open the Physical Storage tab. Enter an appropriate value in the Name field. Create this value by combining an ampersand with the value entered in the Macro Variable Name field on the Create Parameter window and appending .OUT to the combination (for example, &mstatus.OUT). Close the properties window for the output table.

Set Parameters for the Job

Perform the following steps to set the parameters for the parameterized job and to complete job configuration:

1 Open the Parameters tab in the properties window for the parameterized job.
2 Click Import to display the Import Parameters window. Click an appropriate value, such as PARAMTABLE_IN, in the Available Parameters field. Select the parameter assigned to the input table and move it to the Selected Parameters field. Then, close the properties window.

Complete Parameterized Job Configuration

Perform the following steps to complete the configuration of the parameterized job:

1 Configure any settings needed to process the data in the parameterized job. For example, you can set a WHERE condition in an Extract transformation if one is included in the job.
These settings vary depending on the structure of the individual job.
2 Open the Mapping tab in the properties window for the transformation included in the parameterized job. Verify that all of the columns in the source table are mapped to an appropriate column in the target table, and close the properties window.
3 Do not run the job. It will be submitted as a part of the iterative job.

Note: For detailed information about parameterized jobs, see the "Example: Create a Parameterized Job For Use In an Iterative Job" topic in the SAS Data Integration Studio Help.

Additional Information

For detailed information about iterative jobs, see the "Example: Create an Iterative Job That Uses Parallel Processing" topic in the SAS Data Integration Studio Help. For detailed information about specific usage issues, see the "Usage Notes for Iterative Jobs" topic in the SAS Data Integration Studio Help.

Creating a Control Table

Problem

You want to create a control table that lists the tables that you plan to include in an iterative job. Iterative jobs are used to run a series of similarly structured tables through the same task or series of tasks. The control table supplies the name of the table that is run through each iteration of the job.

Solution

You can reuse an existing control table or create one manually. You can also create a job that uses the Library Contents transformation. This transformation generates a listing of the tables contained in the library that holds the tables that you plan to run through the iterative job. This control table is based on the dictionary table of that library.

Tasks

Create and Register the Control Table

If you have an existing control table, you can use it. If you don't, you can use the Source Editor window in SAS Data Integration Studio to execute an SQL statement. The statement creates an empty instance of the table that has the same column structure as the dictionary table for the library.
Then use a Source Designer to register the empty table. Perform the following steps to create the empty control table:

1 Determine the identity and location of the library that contains the tables that you need to process in an iterative job.
2 From the SAS Data Integration Studio desktop, select Tools > Source Editor. The Source Editor window appears. Submit code similar to the following:

   libname tgt 'C:\targets\sas1_tgt';
   proc sql;
      create table tgt.CHECKLIB
      as select *
      from dictionary.tables
      where libname='checklib';
   quit;

Be sure to check the Log tab to verify that the code ran without errors.
3 Register the table that you just created using the Source Designer. This action creates a metadata object for the table.
4 (Optional) You can confirm that the empty control table was created in physical storage. Right-click the metadata object for the table and select View Data. A sample table is shown in the following example.

Display 18.8 View of Empty Control Table Output

Populate the Control Table Job

Perform the following steps to populate a control table job:

1 Create an empty job.
2 Drag the Library Contents transformation from the Access folder in the Process Library tree. Then, drop it into the empty job.
3 Drag the control table and drop it into the target table location in the Process Editor.
4 By default, a temporary output table and a Table Loader transformation are added to the Process Designer, but they are not needed for this job. Right-click the Table Loader transformation and select the Delete option. Click OK in the Confirm Remove window. The temporary output table and the Table Loader are deleted.
5 Open the Mapping tab in the properties window for the Library Contents transformation.
Verify that all of the rows in the source table are mapped to the corresponding rows in the target table, and click Quick Map to correct any errors.
6 Drag the icon for the library that contains the tables that will be iteratively processed from the Libraries folder in the Inventory tree. Then, drop the icon into the input zone for the Library Contents transformation. A sample completed control table job is shown in the example below.

Display 18.9 Completed Control Table Job

7 Run the job.
8 If the job completes without error, right-click the control table icon and click View Data. The View Data window appears, as shown in the following example.

Display 18.10 View of Control Table Output

Note that all of the rows in the table are populated with the name of the control table in the libname column. This confirms that all of the rows are drawn from the appropriate library. You can now use the table as the control table for the iterative job.

For detailed information about control table jobs, see the "Example: Create and Populate the Control Table With a Library Contents Transformation" topic in the SAS Data Integration Studio Help.

Additional Information

For detailed information about iterative jobs, see the "Example: Create an Iterative Job That Uses Parallel Processing" topic in the SAS Data Integration Studio Help. For detailed information about specific usage issues, see the "Usage Notes for Iterative Jobs" topic in the SAS Data Integration Studio Help.

About Parallel Processing

SAS Data Integration Studio uses a set of macros to enable parallel processing. You can enable these macros by doing one of the following:

- Selecting YES in the Enable parallel processing macros option on the Options tab of the properties window for a job.
- Including a Loop transformation in a job.

When you enable the parallel-processing option for a job, macros are generated at the top of the job code, with comments, to enable you to create your own transformations or code to take advantage of parallel processing.

When you include a Loop transformation in a job, the transformation generates the necessary macros to take advantage of sequential execution, symmetric multiprocessing (SMP) execution, or execution on a grid computing network.

No special software or metadata is required to enable parallel processing on SMP servers. If grid options have been enabled for a job, but the grid software has not been configured and licensed, SAS Data Integration Studio does not generate grid-enabled code for the job. It generates code that is appropriate for SMP on the SAS Application Server.

The following table describes the prerequisites that are required to enable parallel processing for SAS Data Integration Studio jobs. For details about these prerequisites, see the appropriate section in the documentation mentioned below.

Table 18.1 Prerequisites for Parallel Processing of SAS Data Integration Studio Jobs

SMP machine with one or more processors
   Specify a SAS 9 Workspace Server in the metadata for the default SAS application server for SAS Data Integration Studio. See the "Specifying Metadata for the Default SAS Application Server" topic in the SAS Data Integration Studio Help.

Grid computing network
   Specify an appropriate SAS Metadata Server to get the latest metadata object for a grid server. See the SAS Data Integration Studio chapter in the SAS Intelligence Platform: Desktop Application Administration Guide.
   Specify an appropriate SAS 9 Workspace Server in the metadata for the default SAS application server.
   Grid software must be licensed.
   Define or add a Grid Server component to the metadata that points to the grid server installation.
   The controlling server machine must have both a Grid Server definition and a SAS Workspace Server definition, as a minimum, to be able to run your machines in a grid. It is recommended that you also have the SAS Metadata Server component accessible to the server definition where your grid machines are located.
   Install Platform Computing software to handle workload management for the grid.

Note: For additional information about these requirements, see the grid chapter in the SAS Intelligence Platform: Application Server Administration Guide.

Setting Options for Parallel Processing

Problem

You want to take advantage of parallel processing and grid processing in SAS Data Integration Studio jobs.

Solution

If you need to enable parallel or grid processing for all jobs, set global options on the Options tab of the Options window for SAS Data Integration Studio. If you need to enable parallel or grid processing for a single iterative job, set the options available on the Loop Options tab of the properties window for the Loop transformation.

Tasks

The following tables describe how to set options for parallel processing and grid processing in SAS Data Integration Studio jobs.

Table 18.2 Global Options (affects all new jobs)

Enable parallel processing macros for new jobs
   Purpose: adds parallel-processing macros to the code that is generated for all new jobs.
   To specify: select Tools > Options from the menu bar. Click the Options tab. Specify the desired option.

Various grid computing options
   Purpose: sets grid computing options for all new jobs.
   To specify: select Tools > Options from the menu bar. Click the Options tab.
Specify the desired option.

Table 18.3 Local Options (affects the current job or transformation)

Option: Enable parallel processing macros
Purpose: When YES is selected, this option adds parallel processing macros to the code that is generated for the current job. Parallel processing macros are always included in the code that is generated for a Loop transformation.
To Specify: Open the Options tab in the properties window for the job. Select YES or NO in the field for this option.

Option: Various grid computing options for the Loop transformation
Purpose: Sets grid options for the current Loop transformation.
To Specify: Open the Loop Options tab in the properties window for the Loop transformation. Specify the desired option.

Additional Information

For details about the global options included on the Code Generation tab of the Options window, see the description of the options in the Code Generation Tab topic in SAS Data Integration Studio Help. For information about the options available on the Loop Options tab in the Loop transformation properties window, see the About Loop Transformations and Loop Options Tab topics.
For information about specific usage issues, see the Usage Notes for Parallel Processing topic.

CHAPTER 19
Working with Slowly Changing Dimensions

About Slowly Changing Dimensions (SCD) 344
    SCD and the Star Schema 344
    Type 1 SCD 345
    Type 2 SCD 345
    Type 3 SCD 345
About the Star Schema Loading Process 345
About Type 2 SCD Dimension Tables 346
About Change Detection and the Loading Process for SCD Dimension Tables 346
About Cross-Reference Tables 347
About the Structure and Loading of Fact Tables 347
About Keys 348
About Generated Keys 348
Transformations That Support Slowly Changing Dimensions 349
Loading a Dimension Table Using Begin and End Datetime Values 349
    Problem 349
    Solution 350
    Tasks 350
Loading a Dimension Table Using Version Numbers or Current-Row Indicators 353
    Problem 353
    Solution 353
    Tasks 353
Loading a Fact Table 353
    Problem 353
    Solution 353
    Tasks 354
Generating Retained Keys for an SCD Dimension Table 356
    Problem 356
    Solution 356
    Tasks 356
Updating Closed-Out Rows in SCD Dimension Tables 357
    Problem 357
    Solution 357
    Tasks 357
Optimizing SQL Pass-Through in the SCD Type 2 Loader 358
    Problem 358
    Solution 358
    Tasks 358

About Slowly Changing Dimensions (SCD)

SCD and the Star Schema

The star schema is an architecture that separates operational data into two categories: factual events and the detail data that describes those events. Numerical data about events is stored in a fact table. Character data that describes the events is stored in dimension tables. In the fact table, numerical values called keys identify the detail data in the dimension tables that is associated with each event.

As shown in the following diagram, SAS Data Integration Studio enables you to create jobs that load data into star schemas. Other jobs extract knowledge from the star schema.

Figure 19.1 The Star Schema and SAS Data Integration Studio

Dimension tables provide collection points for categories of information.
Typical categories are customers, suppliers, products, and organizations.

To provide analytical power, type 2 slowly changing dimensions are implemented in dimension tables to retain a record of changes to data over time. Analysis of the record of data changes provides knowledge. For example, analysis of a dimension table that contains customer information might allow buying incentives to be offered to customers that are most likely to take advantage of those incentives.

In SAS Data Integration Studio, the process of loading dimension tables and maintaining the record of data changes takes place in the transformation called the SCD Type 2 Loader. Other transformations handle fact table loading and other aspects of star schema management, as described in Transformations That Support Slowly Changing Dimensions on page 349.

Three types of slowly changing dimensions are commonly defined, as described in the following subsections.

Type 1 SCD

Type 1 SCD stores one row of data for each member of a dimension. (Members are individuals with a unique ID number.) New data overwrites existing data, and a history of data changes is not maintained. For example, in a dimension table that contains customer information, if a row contains data for Mr. Jones, and if a new home address for Mr. Jones is included in an update of the dimension table, then the new address overwrites the old address.

Type 2 SCD

You can identify dimension tables as type 2 SCD when you see multiple rows of data per member. One row is the current row, which is the latest set of information for each member in the dimension. The other rows are said to be closed out, which means that they are no longer the current row.
The closed-out rows maintain a historical record of changes to data.

Type 3 SCD

Type 3 SCD retains a limited history of data changes using separate columns for different versions of the same value. As in Type 1 SCD, each member of the dimension is represented by a single row in the dimension table. An example of a type 3 SCD table might contain three columns for postal codes. The column names might be Current Postal Code, Previous Postal Code, and Oldest Postal Code. As dimension table updates add new postal codes, the values in the Current column move into the Previous column, and the values in the Previous column move into the Oldest column.

About the Star Schema Loading Process

The process for loading a star schema for slowly changing dimensions follows these general steps:

1 Stage operational data. In this initial step you capture data and validate the quality of that data. Your staging jobs make use of the Data Validation transformation, along with other data quality transformations and processes.

2 Load dimension tables. Data from the staging area is moved into the dimension tables of the star schema. Dimension tables are loaded before the fact table in order to generate the primary key values that are needed in the fact table.

3 Load the fact table. In this final step you run a job that includes the Lookup transformation, which loads numerical columns from the staging area into the fact table. Then the Lookup transformation captures foreign key values from the dimension tables.

About Type 2 SCD Dimension Tables

Dimension tables that are loaded with the SCD Type 2 Loader consist of a primary key column, a business key column, one or two change tracking columns, and any number of detail data columns. The primary key column is often loaded with values that are generated by the transformation. The business keys are supplied in the source data.
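The three update styles described above can be contrasted in a brief sketch. This is illustrative Python, not SAS code and not code generated by the product; the column names (Customer_ID, Current Postal Code, and so on) and the placeholder end date are borrowed from the surrounding examples for illustration only.

```python
# Type 1: new data overwrites existing data; no history is kept.
def scd_type1_update(row, new_data):
    row.update(new_data)

# Type 2: close out the member's current row, then append a new current row.
# 01JAN2599 stands in for the placeholder future end date.
def scd_type2_update(rows, member_id, new_data, load_datetime):
    for row in rows:
        if row["Customer_ID"] == member_id and row["End"] == "01JAN2599":
            row["End"] = load_datetime  # close out the previous current row
    rows.append({"Customer_ID": member_id, "Begin": load_datetime,
                 "End": "01JAN2599", **new_data})

# Type 3: shift each value into the next "older" column; one row per member.
def scd_type3_update(row, new_code):
    row["Oldest Postal Code"] = row["Previous Postal Code"]
    row["Previous Postal Code"] = row["Current Postal Code"]
    row["Current Postal Code"] = new_code
```

Note how the type 2 sketch preserves every prior row for the member, while types 1 and 3 keep a single row and lose all but a fixed amount of history.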
Both the business key and the primary key can be defined to consist of more than one column, as determined by the structure of the source data.

Change tracking columns can consist of begin and end datetime columns, a version number column, or a current-row indicator column.

Begin and end datetime values specify the period of time in which each row was the current row for that member. The following diagram shows how data is added to begin and end datetime columns. Note how the end value for one row relates to the begin value for the row that superseded it. The end value for the current row is a placeholder future date.

Figure 19.2 Structure of an SCD Dimension Table

Business Key (member ID) | Begin Datetime | End Datetime | Primary Key (generated) | Detail Data A | Detail Data B
2138 | 27JUL2007 | 01JAN2599 | 25 | ... | ...  (current row)
2138 | 15MAY2007 | 27JUL2007 | 18 | ... | ...  (closed-out row)
2138 | 22FEB2007 | 15MAY2007 |  6 | ... | ...  (closed-out row)

Tracking changes by version number increments a counter when a new row is added. The current row has the highest version number for that member. The version number for new members is current_version_number + 1.

Tracking changes using a current-row indicator column loads a 1 for the current row and 0s for all of the other rows that apply to that same member.

The preceding diagram shows a primary key column, the values for which are generated by the SCD Type 2 Loader. The generated primary key is necessary in order to uniquely identify individual rows in the dimension table. The generated primary key values are loaded into the star schema's fact table as foreign keys, to connect factual or numerical events to the detail data that describes those events.

About Change Detection and the Loading Process for SCD Dimension Tables

In jobs that run the SCD Type 2 Loader transformation, the dimension table loading process repeats the following process for each source row:

1 Compare the business key of the source row to the business keys of all of the current rows in the dimension table.
If no match is found, then the source row represents a new member. The source row is written to the target and the loading process moves on to the next source row.

2 If the business key in the source matches a business key in the target, then specified detail data columns are compared between the matching rows. If no differences in data are detected, then the source row is a duplicate of the target row. The source row is not loaded into the target and the loading process moves on to the next source row.

3 If business keys match and data differences are detected, then the source row represents a new current row for that member. The source row is written to the target, and the previous current row for that member is closed out. To close out a row, the change tracking column or columns are updated as necessary, depending on the selected method of change tracking.

To learn how to use the SCD Type 2 Loader, see Loading a Dimension Table Using Begin and End Datetime Values on page 349.

About Cross-Reference Tables

During the process of loading an SCD dimension table, the comparison of incoming source rows to the current rows in the target is facilitated by a cross-reference table. The cross-reference table consists of all of the current rows in the dimension table, one row for each member. The columns consist of the generated key, the business key, and a digest column named DIGEST_VALUE.

The digest column is used to detect changes in data between the source row and the target row that has a matching business key. DIGEST_VALUE is a character column with a length of 32. The values in this column are encrypted concatenations of the data columns that were selected for change detection.
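The three-step loading process and the digest comparison can be sketched as follows. This is an illustrative Python sketch of the logic, not the code that the transformation generates; the function and column names are hypothetical.

```python
import hashlib

def digest_value(row, change_cols):
    # A 32-character digest of the concatenated change-detection columns,
    # analogous to the DIGEST_VALUE column in the cross-reference table.
    joined = "".join(str(row[c]) for c in change_cols)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def classify_source_row(source_row, xref, change_cols):
    """xref maps a business key to the digest of that member's current row."""
    key = source_row["business_key"]
    new_digest = digest_value(source_row, change_cols)
    if key not in xref:              # step 1: no match -> new member
        xref[key] = new_digest
        return "new member"
    if xref[key] == new_digest:      # step 2: same data -> duplicate, skip
        return "duplicate"
    xref[key] = new_digest           # step 3: changed data -> close out the
    return "new current row"         #   old row, write a new current row
```

Because only the fixed-length digests are compared, the comparison cost does not depend on how many detail columns were selected for change detection.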
The encryption uses the MD5 algorithm, which is described in detail at http://www.faqs.org/rfcs/rfc1321.html.

If a cross-reference table exists and has been identified, it will be used and updated. If a cross-reference table has not been identified, then a new temporary table is created each time you run the job.

Cross-reference tables are identified in the Options tabs of the following transformations: SCD Type 2 Loader and Key Effective Date, in the field Cross-Reference Table Name.

About the Structure and Loading of Fact Tables

Fact tables contain numerical columns that describe events, along with foreign-key columns that identify detail data that describes the events. The detail data is stored in dimension tables.

Fact tables are loaded with the Lookup transformation. Input data for the transformation consists of a source table that contains the numeric columns and the dimension tables that contain the foreign-key columns. Because the dimension tables need to have primary keys assigned, you need to load your dimension tables before you load your fact table. The primary keys in the dimension table become the foreign keys in the fact table.

For each dimension table, you configure the Lookup transformation to find a match between specified columns in the source data and the dimension table. If a match is found, then the primary key of that dimension table row is loaded into the fact table as a foreign key.

The Lookup transformation also enables you to flexibly configure responses to exceptions and errors, so that you can maintain the quality of the data in your star schema.

To learn how to use the Lookup transformation, see Loading a Fact Table on page 353.

About Keys

In a star schema, key values connect fact tables to dimension tables. The following types of keys are used in the SCD Type 2 Loader:

primary keys
uniquely identify the rows in fact and dimension tables.
Primary key columns are identified in the Keys tab of the dimension table's Properties dialog box.

foreign keys
connect rows in fact tables to rows in dimension tables. Foreign key values in a fact table are primary key values in the dimension tables. Foreign keys are loaded into fact tables using the Lookup transformation.

business keys
uniquely identify the members in dimension tables. These keys are essential to Type 2 SCD, because each member can be represented by more than one row. They're called business keys because the values are supplied in the source table, as delivered from the operational system that collected the data. An example of a business key column might be named Customer_ID. This column can be loaded into the Customer dimension table to identify all of the rows (current and closed-out) that are associated with each customer. Business keys are identified in the Business Key tab of the SCD Type 2 Loader's Properties dialog box.

generated keys
establish key values in dimension tables. The SCD Type 2 Loader can generate surrogate, retained, and unique keys.

surrogate keys
are integer values that are generated in the SCD Type 2 Loader to provide primary or unique key values in dimension tables. By default, the generated values increment the highest existing number in a specified key column. You have the option of using an expression to modify the default generated value. You can also specify a starting point for each load using a lookup column.

retained keys
consist of a numeric column of generated values that is combined with a datetime column to make up the primary key of a dimension table. The keys are said to be retained because the generated numeric values do not change after they have been assigned to a member. When a member is updated, the retained value is copied into the new row from the previous current row.
The datetime value uniquely identifies the row.

Keys of any type can be defined to include multiple columns in your dimensional target table, as needed according to the structure of your source data.

You identify key columns in the Keys tab of the target table's Properties dialog box.

About Generated Keys

The SCD Type 2 Loader enables you to generate key values when you load a dimension table. The generated values are used as primary keys. After the keys are generated and the dimension table has been loaded, the primary key column is added to the fact table as a foreign key for that dimension.

In the Generated Keys tab of the SCD Type 2 Loader, you can configure a simple surrogate key that increments the highest existing value in a specified column. You can also use an expression to generate values in other than simple increments. To specify a unique starting point for the keys that are generated in each load, you can specify a lookup column.

In addition to surrogate keys, you can also generate retained keys. Retained keys provide a primary key value that consists of two columns, one containing a datetime value and the other containing a generated value. The generated value is retained because a single generated value is used for all of the rows that apply to a given member. The datetime value differentiates the rows.

As with surrogate keys, you can generate retained key values using expressions and lookup columns.

To enhance performance, you should create an index for your generated key column. If you identify your generated key column as the primary key of the table, then the index is created automatically. Surrogate keys should receive a unique or simple index consisting of one column.
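The difference between the two kinds of generated values can be sketched as follows. This is illustrative Python, not product code; the names are hypothetical, and the datetime half of the retained-key pair is left out of the sketch.

```python
def next_surrogate_key(existing_keys):
    # Default behavior: increment the highest existing value in the key column.
    return (max(existing_keys) if existing_keys else 0) + 1

def retained_key_for(member_id, retained, existing_keys):
    """retained maps a business key to its generated value. A member keeps
    its generated value across updates; in the real table, the begin-datetime
    column is what makes each (retained key, datetime) pair unique."""
    if member_id not in retained:
        retained[member_id] = next_surrogate_key(existing_keys)
        existing_keys.append(retained[member_id])
    return retained[member_id]
```

A surrogate key yields a fresh value for every loaded row, while a retained key yields a fresh value only for a new member and reuses it for every subsequent row of that member.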
Retained keys should receive a composite index that includes the generated key column and the datetime column.

To create an index, open the Properties dialog box for the table and use the Index and Keys tabs.

Transformations That Support Slowly Changing Dimensions

SAS Data Integration Studio provides the following transformations that you can use to implement slowly changing dimensions:

SCD Type 2 Loader
loads dimension tables, detects changes, tracks changes, and generates key values.

Lookup
loads source data into fact tables and loads foreign keys from dimension tables, with configurable error handling. The lookup process accesses dimension tables using hash objects for optimal performance.

Fact Table Lookup
loads source data into fact tables using key values from dimension tables. The lookup process uses SAS formats rather than the more efficient hash objects used in the Lookup transformation.

Key Effective Date
updates dimension tables based on changes to the business key, when change detection is unnecessary.

Surrogate Key Generator
generates unique key numbers for dimension tables, in a manner that is similar to but less feature-rich than the SCD Type 2 Loader transformation.
Use the Surrogate Key Generator when key generation is the sole task that is required at that point in the job.

Loading a Dimension Table Using Begin and End Datetime Values

Problem

You want to load data into a dimension table and maintain a history of data changes using begin and end datetime values.

Solution

Use the SCD Type 2 Loader to load datetime values from the source, or generate datetime values in specified formats.

Tasks

Perform the following steps to load an SCD dimension table that tracks data changes using begin and end datetime values:

1 Create a job that includes the SCD Type 2 Loader transformation, a source table with a business key, and a target table that has numeric columns for begin and end datetime values.

2 In the Process Editor, double-click the target table to display the Properties dialog box. If you intend to generate primary key values, display the Columns tab and add a numeric column.

3 Click Apply, and then open the Keys tab. Right-click New and select Primary Key.

4 In the Columns list of the Keys tab, select the column or columns that make up your primary key and click the right arrow. The primary key column or columns appear in the Keys pane, as shown in the following display.

Display 19.1 Target's Primary Key Specification

5 Click OK to save changes and close the Properties dialog box.

6 Open the Properties dialog box for the SCD Type 2 Loader.

7 Open the Mapping tab. For the business key and detail data columns in the source, click in the source column and drag to a target column to create arrows that connect the source and target columns. If you plan to load datetime values from the source, also add connecting arrows for those columns.

8 Click Apply, and then open the Change Tracking tab. Triple-click under Column Name to select numeric datetime columns from drop-down lists.
By default, the datetime columns are the first two numeric columns in your target table.

Display 19.2 Typical Default Values in the Change Tracking Tab

9 If you plan to load datetime values from the source, realize that the values in the two Expression fields will be used when datetime values are missing in the source. You might wish to change the default expressions to better suit the datetime format in your source data.

To modify the expressions with the Expression Builder, double-click under Expression and click the button that appears in the field. The Expression Builder enables you to use logical operators, SAS functions, and a variety of datetime formats.

Display 19.3 The Expression Builder

10 When you're ready, click Apply and open the Options tab. Click Help to review the valid values for Format Type for Date. Specify the value that suits the nature of your source data. Then click Apply to save your changes.

11 If you plan to generate datetime values for change tracking, examine the default expressions in the Change Tracking tab.

12 Click Apply, and then open the Business Key tab.

13 In the Business Key tab, change the default values as necessary to specify the numeric source column or columns that make up the business key. If your business key consists of multiple columns, be sure to specify at the top of the list the column that provides unique member identifiers. If your business key uses a combination of columns to establish uniqueness, consider that the columns will be evaluated by the transformation in top-down order as listed in the Business Key tab.

14 Click Apply, and then open the Change Detection tab. In that tab you will see a list of all target columns other than those that are used for change tracking, the business key, or any generated key. To specify the columns that you include in change tracking, select the columns and click the right arrow.
To optimize performance, select only the columns that you intend to analyze, and select shorter columns over longer columns when possible.

15 Click OK to save changes and close the Properties dialog box.

16 Review column mappings in the SCD Type 2 Loader and the target.

17 With the cursor in the Process Editor, right-click and select Save to save the changes to your job.

18 Click the Submit Source button in the toolbar to run the job.

Loading a Dimension Table Using Version Numbers or Current-Row Indicators

Problem

Your Type 2 SCD dimension table needs to use a means of tracking data changes other than begin and end datetime columns.

Solution

Use the Change Tracking tab in the SCD Type 2 Loader to generate version numbers or current-row indicators to identify the current row for each member in the dimension.

Tasks

Tracking changes with version numbers generates and loads a sequential number in a specified column. The advantages of version numbering are that the previous current row doesn't need to be updated, and a fully ordered list of entries is maintained. The current row for a given member is the row that has the highest version number.

The current-row indicator uses a 1 in the target column for all current rows and a 0 in the same column for all closed-out rows. This method of change tracking is useful if you have other means of ordering closed-out rows.

To implement version numbers or current-row indicators, follow the steps in Loading a Dimension Table Using Begin and End Datetime Values on page 349. In the target table, define one target column for change tracking instead of two.
In the Change Tracking tab of the SCD Type 2 Loader, select version numbering or current-row indicators instead of begin and end datetime values.

Loading a Fact Table

Problem

You want to load data into a fact table and add foreign keys to connect the fact table to the dimension tables in the star schema.

Solution

Create a job that uses the Lookup transformation, and run that job after you run the jobs that load your dimension tables. The Lookup transformation loads fact columns from a source table and loads foreign key values from dimension tables. The generated primary key in the dimension table is loaded into the fact table when the business key in the fact table matches the business key in the dimension table.

Tasks

Perform the following steps to load a fact table:

1 Before you build the job that loads your fact table, be sure to first submit the jobs that load your dimension tables, as described in Loading a Dimension Table Using Begin and End Datetime Values on page 349.

2 Create a job in the Process Editor that includes a Lookup transformation.

3 Drop the source table into the source table drop box. The source table contains the fact columns and business key columns.

4 Drop one of the dimension tables onto the lookup table drop box. The dimension tables contain the key values that you want to load into your fact table.

5 To add drop boxes for additional dimension tables, right-click the Lookup transformation and click Add Input.

6 Drop dimension tables onto the lookup table drop boxes.

7 In the Lookup transformation, open the Mapping tab, right-click, and select Import.
Import the fact data columns from the source table, as shown in the following display.

Display 19.4 Importing Source Columns into the Lookup Transformation

8 Right-click in Selected Columns and select Quick Map to draw connecting arrows between the columns in the source and the columns in the work table of the Lookup transformation.

9 To add foreign key columns, repeat the column import process for each of the lookup (dimension) tables. There's no need to add mapping lines for these columns. Click Apply, and then open the Lookup tab.

10 In the Lookup tab, note that the first lookup table is highlighted. For this table you will define the matching condition between the dimension table and the source table. The matching condition specifies when a foreign key value is to be loaded into the target from the selected dimension table. The compared values are the business key columns. If the business key values match, the foreign key value is loaded into the target from the dimension table.

To set up the match between the source and the dimension table, click Lookup Properties. In the Source to Lookup Mapping tab, left-click on the business key column in the source and drag across to the business key column in the lookup table. Release the mouse to display a connecting arrow between the two columns, as shown in the following display:

Display 19.5 Defining the Match Between Business Key Columns

11 If you want to define a WHERE clause that further refines the match between the business key columns, click the Where tab and build an expression.
Click Apply to save changes.

Note: If you use a WHERE clause, and if the lookup table uses a generated key, you can improve performance by creating an index on the generated key column, as described in About Generated Keys on page 348.

12 To define the lookup table columns that will be loaded into the fact table when a match occurs between business key values, open the Lookup to Target Mapping tab, left-click in the lookup column, and drag across to the target column. Release the mouse to display a connecting arrow between the columns.

13 To specify how the transformation deals with exception conditions in the loading of this lookup table, click the Exception tab. Review the default values, and then select and configure responses to various conditions, such as Lookup Value Not Found. These entries apply only to the currently selected lookup table.

Note: Responses to error conditions for the entire transformation are specified in the Error tab of the Lookup transformation.

14 Click OK to complete the definition of the lookup and load operations for the first lookup table.

15 For each of the other lookup tables, repeat the steps that define the match between source table and lookup table. Then repeat the steps that define how the foreign key values are mapped from the lookup table to the target table.

16 Use the Target Designer to create a target fact table. Drop that table onto Place Target Table Here.

17 Double-click the table in the job and open the Columns tab. Configure the target columns to match the columns in the work table of the Lookup transformation. Save and close the Properties dialog box.

18 Open the Table Loader and open the Mapping tab. Right-click and select Quick Map to connect the Lookup transformation's work table to the target. Save and close the Properties dialog box.

19 Open the Lookup transformation and open the Errors tab. Error handling is not configured by default.
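The business-key match and foreign-key load that these steps configure can be sketched as follows. This is an illustrative Python sketch using a dictionary in place of the transformation's hash objects; the function and column names are hypothetical, not part of the product.

```python
def build_lookup(dimension_rows, business_key_cols, generated_key_col):
    # Index the dimension's rows by business key, mirroring the hash-object
    # lookup that the Lookup transformation performs against each dimension.
    return {tuple(row[c] for c in business_key_cols): row[generated_key_col]
            for row in dimension_rows}

def foreign_key_for(source_row, lookup, business_key_cols):
    # None models the 'Lookup Value Not Found' condition, whose response
    # is configurable per lookup table on the Exception tab.
    return lookup.get(tuple(source_row[c] for c in business_key_cols))
```

One such lookup is built per dimension table, and the returned generated keys become the fact table's foreign-key columns.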
You can specify an error limit and error tables for specified columns.

20 Save and close the job. If the dimension tables have been updated, you're ready to run your fact table loading job.

Generating Retained Keys for an SCD Dimension Table

Problem

You want to assign a single unique number to all of the rows in a dimension table that apply to individual members, and combine that identifier with a datetime column to uniquely identify each row in the table (as the primary key).

Solution

Generate retained key values using the SCD Type 2 Loader transformation.

Tasks

Perform the following steps to configure a retained key in a job that includes the SCD Type 2 Loader:

1 In the target's Properties dialog box, use the Columns tab to add and configure columns for the begin datetime value, end datetime value, and the retained key value, all of which are numeric.

2 In the Keys tab, identify the begin datetime column and the retained key column as the primary key columns.

3 In the Properties dialog box of the SCD Type 2 Loader, configure the Change Tracking tab, as described in Loading a Dimension Table Using Begin and End Datetime Values on page 349.

4 Open the Business Key tab and specify the column or columns that make up the business key.

If the business key includes more than one column, and if only one of those columns identifies members in the source table, order your business key specifications so that the top entry in the list is the column that differentiates between members.

If multiple columns in the source uniquely identify members, the column order in the Business Key tab is not significant. In this case, you'll need to select a check box in the Generated Key tab, as described in a subsequent step.

5 Open the Generated Key tab and specify the column you defined in the target that will receive the retained key values.

6 Select the check box Generate retained key. As you do so, note that the field Changed record becomes unavailable.
This happens because changed records receive the same key value that was present in the original record.

7 If you have a multi-column business key, and if more than one of those columns is used to uniquely identify members in the source, then select the check box Generate unique keys for each column in the business key. Selecting this check box ensures that all columns of the business key are evaluated to determine whether a source row represents a new member, an updated member, or no change to an existing member. Click OK and open the target's Properties dialog box.

8 In the target's Properties dialog box, open the Keys tab and specify the datetime column and the retained key column as the primary key. Click Apply.

9 To enhance performance, open the Indexes tab to create a composite index of the datetime column and the retained key column. Remove any other columns from the index.

10 For each indexed column, select at minimum the Unique values check box, and consider selecting No missing values. Click Save to complete the job.

Updating Closed-Out Rows in SCD Dimension Tables

Problem

You want to update data in closed-out rows of dimension tables that use begin and end datetime values for change tracking.

Solution

Closed-out rows are automatically updated if changes are not detected between the source row and the current row for that member, and if the end datetime value of the closed-out row is older than the end datetime value of the current row.

Tasks

Perform the following steps to ensure that your SCD dimension table is configured to automatically load updates to closed-out rows:

1 Open the job that loads the SCD dimension table, or create a job as described in Loading a Dimension Table Using Begin and End Datetime Values on page 349.

Note: Make sure that your begin and end datetime columns are mapped into target columns.
2 Open the SCD Type 2 Loader transformation in the job.

3 Open the Change Tracking tab.

4 Verify that change tracking is implemented with begin and end datetime values.

5 If you are generating datetime values in the transformation, examine the expressions for the begin datetime and end datetime columns.

For the current row, in the end datetime column, specify an expression that generates a future date that far exceeds any value in the end datetime column for closed-out rows. This ensures that the end datetime for the current row will always be later than any end datetime value in a closed-out row.

6 If you are loading datetime values from the source table, examine that data to ensure that the current-row end datetime values are later than the end datetime values for the closed-out rows that will be updated in the dimension table.

7 If you need to load date values from the source, instead of datetime values, note that you have the option of converting the date to a datetime value. This is the default mode for the SCD Type 2 Loader. To confirm that you are using the default mode, open the Options tab in the Properties dialog box of the SCD Type 2 Loader.
Confirm that the value of Format Type for Date is DATE.
8 Save changes, close the job, and submit the job.

Optimizing SQL Pass-Through in the SCD Type 2 Loader

Problem
You want to optimize SQL pass-through performance in your SCD Type 2 Loader transformation.

Solution
Specify that your target table use the appropriate database engine (ORACLE, DB2, TERADATA, or ODBC), rather than the SAS or SPDS engine. This improves performance for SQL pass-through.

Tasks
Perform the following steps to ensure that your dimension table uses the appropriate database engine for SQL pass-through:
1 If you are creating a new target table with the Target Designer, specify the appropriate database engine in the DBMS field, which appears in the second panel, as shown in the following display.
Display 19.6 Specifying a Database Engine in the DBMS Field
2 If your target table already exists, right-click the table in the Inventory tree or job and select Properties.
3 In the Properties dialog box, open the Physical Storage tab.
4 In the DBMS field, select the appropriate database engine for SQL pass-through. Pass-through is optimized only in the following engines: ORACLE, DB2, TERADATA, and ODBC.
5 Click OK to save changes and close the dialog box.

Notes: SQL pass-through is enabled by default in the Options tab of the SCD Type 2 Loader, in the field Use SQL pass-through.
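The performance benefit of pass-through can be illustrated outside of SAS. The following Python sketch is not generated SAS code; it uses sqlite3 to stand in for the DBMS, and the table and column names are hypothetical. It contrasts letting the database evaluate a WHERE clause (the pass-through style) with pulling every row and filtering on the client side:

```python
import sqlite3

# Hypothetical dimension table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER, name TEXT, current_flag TEXT);
    INSERT INTO dim_customer VALUES
        (1, 'Alpha', 'Y'), (1, 'Alpha', 'N'),
        (2, 'Beta',  'Y'), (3, 'Gamma', 'N');
""")

# Pass-through style: the DBMS evaluates the WHERE clause, so only
# the surviving rows are returned to the client.
pushed_down = conn.execute(
    "SELECT customer_id, name FROM dim_customer WHERE current_flag = 'Y'"
).fetchall()

# Client-side style: every row crosses the interface, then is filtered locally.
all_rows = conn.execute(
    "SELECT customer_id, name, current_flag FROM dim_customer"
).fetchall()
filtered_locally = [(cid, name) for cid, name, flag in all_rows if flag == "Y"]

# Same answer either way; pass-through simply moves less data.
assert pushed_down == filtered_locally
```

Both approaches return the same rows, but the pass-through style moves only the rows that survive the filter, which is why one of the supported DBMS engines outperforms routing every row through SAS.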
In the Options tab, you can set options on the SQL procedure using the SQL Options field.

C H A P T E R
20
Working with Message Queues

About Message Queues 361
Prerequisites for Message Queues 362
Selecting Message Queue Transformations 363
  Problem 363
  Solution 363
  Tasks 363
  Additional Information 364
Processing a WebSphere MQ Queue 364
  Problem 364
  Solution 364
  Tasks 365
    Create the WebSphere Queue Writer Job 365
    Configure and Run the WebSphere Queue Writer Job 365
    Verify the WebSphere Queue Writer Job 366
    Create the WebSphere Queue Reader Job 366
    Configure and Run the WebSphere Queue Reader Job 367
    Verify the WebSphere Queue Reader Job 367
  Additional Information 368

About Message Queues
A message queue is a guaranteed message delivery mechanism for handling data sharing in a user-defined format. Several widely used messaging technologies are currently available. The format of the message content can be completely user defined, or it can be a format that has been commonly accepted for a particular industry segment. The message queues in SAS Data Integration Studio support all of the following data transfer types:

Table 20.1 Supported Data Transfer Types

Text: Transmits text with a maximum length of 32,767 characters, or a macro variable, to the message queue. The default value for the Text field is the etls_qms macro variable. The text is entered directly into the Text field on the Queue Options tab of the properties windows for the Message Queue Reader and Message Queue Writer transformations.

Tables: Transmits records from a table (a SAS data set, a DBMS table, or an XML table). In order to successfully handle tables, the structure of the table must be known on the receiving end so that input data values can be correctly formatted to accurately reconstitute the data. A queue is mapped to the data set or table.
Each message sent to the queue corresponds to a database record.

Binary Files: Transmits files, provided that the receiver understands the file format.

Unlike other SAS Data Integration Studio jobs, message queue jobs can handle both structured data such as tables and unstructured data such as text.

Prerequisites for Message Queues
The following prerequisites are required to use message queues in SAS Data Integration Studio jobs:
- Base SAS and SAS Integration Technologies must be installed on the machine where the message queue server is installed.
- The message queue server must be installed (WebSphere MQ server for WebSphere MQ queues; MSMQ Server for Microsoft MQ queues). Then, the queues must be defined on the server.
- The workspace server must have client/server or client access to the message queue server. The workspace server that is defined and used to run queue jobs is critical. For example, if you are using a metadata server on your machine and using the workspace server on Machine X and the model is client/server, then messages are sent to the message queue server running on Machine X.
- The machine used to run the job must be able to access the message queue server.
- The queue manager and queues must be defined in SAS Management Console. For more information, see the Administering Message Queues section in the 'Administering SAS Data Integration Studio' chapter of the SAS Intelligence Platform: System Administration Guide.

Selecting Message Queue Transformations

Problem
You want to select the transformations that are appropriate for a Microsoft or WebSphere message queue that contains information that you need to either send or receive.

Solution
Four transformations are provided in SAS Data Integration Studio to facilitate the processing of message queues.
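Independent of the specific messaging system, all four transformations implement the same writer/reader pattern. The following Python sketch uses the standard library's queue module rather than a real WebSphere MQ or Microsoft MQ client, and the table contents are hypothetical; it shows a source table written to a queue one message per row, then drained into a target table:

```python
import queue

# Hypothetical source table: one dict per row.
source_table = [
    {"customer_id": 1, "name": "Alpha"},
    {"customer_id": 2, "name": "Beta"},
]

mq = queue.Queue()  # stands in for a managed message queue

# Queue Writer role: send one message per source row.
for row in source_table:
    mq.put(row)

# Queue Reader role: drain the messages into the target table.
# Afterward the queue is empty, as in the verification step below.
target_table = []
while not mq.empty():
    target_table.append(mq.get())

assert target_table == source_table  # source and target are identical
assert mq.empty()                    # the queue has been cleared
```

The writer and reader do not need to run at the same time; the queue decouples them, which is the point of the guaranteed-delivery mechanism described above.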
Select the transformations that you need for your process from the table in the Tasks section.

Tasks

Table 20.2 Message Queue Transformations

Microsoft Queue Writer transformation: Enables writing files in binary mode, tables, or structured lines of text to the Microsoft MQ messaging system. The queue and queue manager objects necessary to get to the messaging system are defined in SAS Management Console.

WebSphere Queue Writer transformation: Enables writing files in binary mode, tables, or structured lines of text to the WebSphere MQ messaging system. The queue and queue manager objects necessary to get to the messaging system are defined in SAS Management Console.

Microsoft Queue Reader transformation: Enables content from a Microsoft MQ message queue to be delivered to SAS Data Integration Studio. If the message is being sent into a table, the message queue content is sent to a table or a SAS Data Integration Studio transformation. If the message is being sent to a macro variable or file, then these files or macro variables can be referenced by a later step.

WebSphere Queue Reader transformation: Enables content from a WebSphere MQ message queue to be delivered to SAS Data Integration Studio. If the message is being sent into a table, the message queue content is sent to a table or a SAS Data Integration Studio transformation. If the message is being sent to a macro variable or a SAS data set file, then these data set files or macro variables can be referenced by a later step.

Additional Information
For detailed information about the Microsoft Queue Writer transformation, see the 'About Microsoft Queue Writer Transformations' topic in the SAS Data Integration Studio Help. For information about the WebSphere Queue Writer transformation, see the 'About WebSphere Queue Writer Transformations' topic. For information about the Microsoft Queue Reader transformation, see the 'About Microsoft Queue Reader Transformations' topic.
For information about the WebSphere Queue Reader transformation, see the 'About WebSphere Queue Reader Transformations' topic. For detailed information about a specific usage issue, see the 'SAS Data Integration Studio Usage Notes' topic.

Processing a WebSphere MQ Queue

Problem
You want to write rows from a source table into a message queue. Then, you need to read the messages from the queue into a target table.

Solution
You can use the WebSphere Queue Writer transformation in SAS Data Integration Studio to write the data to the message queue. Then, you can use the WebSphere Queue Reader transformation to read the messages from the queue and populate them into a target table. In addition, you can use the Message browser window in the IBM WebSphere MQ Explorer application to browse the messages. This enables you to monitor the flow of data from the source table to the message queue and from the message queue to the target table.

Text and file transfers are also supported in message queues, but these transfers are not covered in this example. Finally, SAS Data Integration Studio can also process Microsoft MQ queues. However, this example focuses exclusively on WebSphere queues because they are more commonly encountered.

Tasks

Create the WebSphere Queue Writer Job
Perform the following steps to create and populate the job:
1 Create an empty job.
2 Select and drag the WebSphere Queue Writer transformation from the Access folder in the Process Library tree into the empty job in the Process Editor tab in the Process Designer window.
3 Drop the source table on the source drop zone for the WebSphere Queue Writer transformation.
4 Drop the target table on the target drop zone for the WebSphere Queue Writer transformation.
The job resembles the following sample.
Display 20.1 Write Records from Table to Queue Job

Configure and Run the WebSphere Queue Writer Job
Perform the following steps to configure the job:
1 Open the Queue Options tab of the properties window for the WebSphere Queue Writer transformation.
2 Select Table in the Message Type group box. Save the setting and close the properties window.
3 Open the IBM WebSphere MQ application to verify that the message queue is empty. Open IBM WebSphere MQ in the Navigator section of the application. Navigate to the Queues item under the Queue Managers and POPLAR_QMGR folders. Double-click Queues to access the Queues table.
4 Right-click the row for POPLAR_QMGR and click Browse Messages to access the Message browser window.
5 Verify that no messages are displayed in the window. This step verifies that the message queue is empty before the message queue reader job is run.
6 Run the job. If you are prompted to do so, enter a user ID and password for the default SAS Application Server that generates and runs SAS code for the job. The server executes the SAS code for the job.
7 If the job completes without error, go to the next section. If error messages appear, read and respond to the messages.

Verify the WebSphere Queue Writer Job
Perform the following steps to verify the results of the queue writer job:
1 Access the View Data window for the source table. A sample source table is shown in the following example.
Display 20.2 Sample Source Table Data
2 Click Refresh on the Browse messages window in IBM WebSphere MQ. The messages that were written to the sample queue are displayed, and can be used to ensure that the data is consistent.
If the Message data column is not populated, double-click a row in the table to access the Messages Properties window. You can enable the Message data column on the Data tab.
3 Notice that the rows in the Message Data column of the Browse messages window reflect the corresponding rows in the source table.
If you do not see the data that you expected, check the Message Format column on the Columns tab in the WebSphere Queue Writer Properties window. To access this window, right-click WebSphere Queue Writer in the Process Flow Diagram, and click Properties. You can compare this column to the message and formatting columns in the Browse messages window. Then, you can correct the formats as needed.

Create the WebSphere Queue Reader Job
Perform the following steps to create the WebSphere Queue Reader job:
1 Create an empty job.
2 Select and drag the WebSphere Queue Reader transformation from the Access folder in the Process Library tree into the empty job in the Process Editor tab in the Process Designer window.
3 Drop the source table on the source drop zone for the WebSphere Queue Reader transformation.
4 Drop the target table on the target drop zone for the WebSphere Queue Reader transformation.
5 Delete the Table Loader and the temporary work table from the job.
6 After these steps have been completed, the process flow diagram for this example resembles the following display.
Display 20.3 Read Records to a Table Job

Configure and Run the WebSphere Queue Reader Job
Perform the following steps to configure the job:
1 Open the Queue Options tab of the properties window for the WebSphere Queue Reader transformation.
2 Select Table in the Message Type group box. Save the setting and close the properties window. Remember that you verified that the message queue contained the messages from the source table in the Verify the WebSphere Queue Writer Job section above.
3 Run the job. If you are prompted to do so, enter a user ID and password for the default SAS Application Server that generates and runs SAS code for the job. The server executes the SAS code for the job.
4 If the job completes without error, go to the next section.
If error messages appear, read and respond to the messages.

Verify the WebSphere Queue Reader Job
Perform the following steps to verify the results of the queue reader job:
1 Access the View Data window for the target table. A sample target table is shown in the following example.
Display 20.4 Sample Target Table Data
The source table and the target table contain identical data. This means that the data was transferred successfully through the WebSphere message queue. If you do not see the data that you expected, check the Message Format column on the Columns tab in the WebSphere Queue Writer Properties window. To access this window, right-click WebSphere Queue Writer in the Process Flow Diagram, and click Properties. You can compare this information to the Message Browser and the source table. Then, you can correct the formats as needed.
2 Click Refresh on the Browse messages window in IBM WebSphere MQ. Notice that the Browse messages window is empty. The queue has been cleared until data is written to it in a later job.

Additional Information
For information about processing a Microsoft MQ message queue, see the 'Example: Process a Microsoft MQ Queue' topic in SAS Data Integration Studio Help.

C H A P T E R
21
Working with SPD Server Cluster Tables

About SPD Server Clusters 369
Creating an SPD Server Cluster 370
  Problem 370
  Solution 370
  Tasks 370
    Build the SPD Server Cluster 370
Maintaining an SPD Server Cluster 371
  Problem 371
  Solution 372

About SPD Server Clusters
The SAS Scalable Performance Data (SPD) Server enables you to create dynamic cluster tables. A dynamic cluster table is two or more SPD Server tables that are virtually concatenated into a single entity, using metadata that is managed by the SPD Server.
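Conceptually, a dynamic cluster table is metadata that chains its member tables together at read time, not a physical copy of their rows. The following Python sketch illustrates that behavior; the member tables are hypothetical, and this is not the SPD Server API:

```python
import itertools

# The cluster's column structure, which every member table must share.
columns = ("customer_id", "name")

# Hypothetical member tables registered in the same library.
members = {
    "DAN_JAN": [(1, "Alpha"), (2, "Beta")],
    "DAN_FEB": [(3, "Gamma")],
}

# Each member must have the same column structure as the cluster.
assert all(len(row) == len(columns)
           for rows in members.values() for row in rows)

# The "cluster" itself is only metadata: an ordered list of member names.
cluster = list(members)

# Reading the cluster chains the members as if they were one table,
# so the combined data is processed as a single unit.
rows = list(itertools.chain.from_iterable(members[name] for name in cluster))
```

Because only metadata is combined, adding or removing a member changes what a read sees without rewriting any member table's data.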
Dynamic cluster tables can be used as the inputs or outputs in SAS Data Integration Studio jobs.
Before you can create an SPD Server cluster, the following prerequisites must be satisfied:
- Administrators must have installed, started, and registered an SPD Server. The application server that executes the cluster table job must be able to access the SPD Server. For more information about SPD Servers, see the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.
- An SPD Server library must be available. For more information about SPD Server libraries, see the chapters about common data sources in the SAS Intelligence Platform: Data Administration Guide.
- A cluster table must be registered in the SPD Server library. For more information, see Registering Tables with the Target Table Wizard on page 87.
- All of the tables that are to be added to the cluster table must be registered in the SPD Server library. Each table must also have the same column structure as the cluster table.

Creating an SPD Server Cluster

Problem
You want to create an SPD Server cluster. The cluster table and the tables that you will include in the cluster must be registered in the same SPD Server library and share a common column structure. These cluster tables can be used as the inputs or outputs in SAS Data Integration Studio jobs and can improve the performance of the jobs.

Solution
You can use the Create or Add to a Cluster transformation to create or add tables to an SPD Server cluster table.
Use this transformation to create an SPD Server cluster table in a SAS Data Integration Studio job and list its contents in the Output tab in the Process Designer window.

Tasks

Build the SPD Server Cluster
Perform the following steps to build an SPD Server cluster:
1 Create a job in SAS Data Integration Studio and give it an appropriate name.
2 Drop the Create or Add to a Cluster transformation on the Process Designer window and then drop an SPD Server cluster table in the SPD Server cluster table drop zone. The Table Loader transformation is populated into the Process Designer window. However, this SPD Server cluster job does not actually load a physical table. Instead, it combines all of the data from the tables included in the SPD Server library into a virtual table that is processed as a single unit. You must therefore remove the Table Loader for the job to run successfully. See the following example.
Display 21.1 Sample SPD Server Cluster Table Job with Table Loader Removed
3 You can drop the List Cluster Contents transformation on the Process Designer window and then drag the SPD Server cluster table into the drop zone for the List Cluster Contents transformation. This step creates a single process flow diagram for the job, which is shown in the following example.
Display 21.2 Sample SPD Server Cluster Table Job with List Cluster Contents
The List Cluster Contents transformation sends a list of all tables included in the cluster table to the Output tab.
4 Right-click the Create or Add to a Cluster transformation and click Properties to access the Create or add to a cluster Properties window. Then click Options to access the Options tab.
5 Limit the tables included in the cluster table by entering a string in the Filter: table name contains ... (Optional) field.
In this case, enter DAN because all of the required tables include this string in their names.
6 Enter a value into the Set maximum number of slots (Optional) field. This value must be large enough to accommodate the potential growth of the cluster because the number of slots cannot be increased after the cluster is created. Otherwise, you are required to delete the existing cluster definition and define a new cluster that includes an adequate value for the maximum number of slots.
7 Click OK to save the setting and close the properties window.
8 Submit and run the job. Click Output to access the Output tab and verify that the expected tables were added to the SPD Server cluster table, as shown in the following example:
Display 21.3 Cluster Contents on Output Tab
9 Save the job and check it in to the repository for future use.

Maintaining an SPD Server Cluster

Problem
You want to maintain an existing SPD Server cluster by generating a list of tables included in a cluster or removing a cluster definition.

Solution
You can use the List Cluster Contents transformation or the Remove Cluster transformation. These transformations are explained in the following table.

Table 21.1 SPD Server Cluster Transformations

Generate a list of tables in a cluster: Perform the following steps to use the List Cluster Contents transformation:
1 Create an empty job.
2 Drop the List Cluster Contents transformation into the Process Designer window.
3 Drop the cluster table into the drop zone for the List Cluster Contents transformation.
4 Run the job.
Note that you can also include the List Cluster Contents transformation in an SPD Server cluster job.
Then, you will generate a cluster list each time you create a cluster.

Remove a cluster definition: Perform the following steps to use the Remove Cluster transformation:
1 Create an empty job.
2 Drop the Remove Cluster transformation into the Process Designer window.
3 Drop the cluster table into the drop zone for the Remove Cluster transformation.
4 Run the job.

P A R T
4
Appendixes

Appendix 1 . . . Recommended Reading 375

A P P E N D I X
1
Recommended Reading

Recommended Reading 375

Recommended Reading
Here is the recommended reading list for this title:
- Customer Data Integration: Reaching a Single Version of the Truth
- Cody's Data Cleaning Techniques Using SAS Software
- Communications Access Methods for SAS/CONNECT and SAS/SHARE
- Moving and Accessing SAS Files
- PROC SQL: Beyond the Basics Using SAS
- SAS Intelligence Platform: Application Server Administration Guide
- SAS Intelligence Platform: Desktop Application Administration Guide
- SAS Intelligence Platform: Data Administration Guide
- SAS Intelligence Platform: Installation Guide
- SAS Intelligence Platform: Overview
- SAS Intelligence Platform: Security Administration Guide
- SAS Intelligence Platform: System Administration Guide
- SAS Management Console: User's Guide
- SAS OLAP Server: Administrator's Guide
- SAS SQL Procedure User's Guide

For a complete list of SAS publications, see the current SAS Publishing Catalog. To order the most current publications or to receive a free copy of the catalog, contact a SAS representative at
SAS Publishing Sales
SAS Campus Drive
Cary, NC 27513
Telephone: (800) 727-3228*
Fax: (919) 677-8166
E-mail: [email protected]
Web address: support.sas.com/pubs
* For other SAS Institute business, call (919) 677-8000.
Customers outside the United States should contact their local SAS office.

Glossary

administrator
the person who is responsible for maintaining the technical attributes of an object such as a table or a library. For example, an administrator might specify where a table is stored and who can access the table.
See also owner.

alternate key
another term for unique key. See unique key.

analysis data set
in SAS data quality, a SAS output data set that provides information on the degree of divergence in specified character values.

business key
one or more columns in a dimension table that comprise the primary key in a source table in an operational system.

change analysis
the process of comparing one set of metadata to another set of metadata and identifying the differences between the two sets of metadata. For example, in SAS Data Integration Studio, you have the option of performing change analysis on imported metadata. Imported metadata is compared to existing metadata. You can view any changes in the Differences window and choose which changes to apply. To help you understand the impact of a given change, you can run impact analysis or reverse impact analysis on tables and columns in the Differences window.

change management
in the SAS Open Metadata Architecture, a facility for metadata source control, metadata promotion, and metadata replication.

change-managed repository
in the SAS Open Metadata Architecture, a metadata repository that is under metadata source control.

cluster
in SAS data quality, a set of character values that have the same match code.

comparison result
the output of change analysis. For example, in SAS Data Integration Studio, the metadata for a comparison result can be selected, and the results of that comparison can be viewed in a Differences window and applied to a metadata repository. See also change analysis.

cross-reference table
a table that contains only the current rows of a larger dimension table. Columns generally include all business key columns and a digest column. The business key column is used to determine if source rows are new dimensions or updates to existing dimensions. The digest column is used to detect changes in source rows that might update an existing dimension.
During updates of the fact table that is associated with the dimension table, the cross-reference table can provide generated keys that replace the business key in new fact table rows.

custom repository
in the SAS Open Metadata Architecture, a metadata repository that must be dependent on a foundation repository or custom repository, thus allowing access to metadata definitions in the repository or repositories on which it depends. A custom repository is used to specify resources that are unique to a particular data collection. For example, a custom repository could define sources and targets that are unique to a particular data warehouse. The custom repository would access user definitions, group definitions, and most server metadata from the foundation repository. See also foundation repository, project repository.

data analysis
in SAS data quality, the process of evaluating input data sets in order to determine whether data cleansing is needed.

data cleansing
the process of eliminating inaccuracies, irregularities, and discrepancies from data.

data integration
the process of consolidating data from a variety of sources in order to produce a unified view of the data.

data lineage
a search that seeks to identify the tables, columns, and transformations that have an impact on a selected table or column. See also impact analysis, reverse impact analysis, transformation.

data store
a table, view, or file that is registered in a data warehouse environment. Data stores can contain either individual data items or summary data that is derived from the data in a database.

data transformation
in SAS data quality, a cleansing process that applies a scheme to a specified character variable. The scheme creates match codes internally to create clusters.
All values in each cluster are then transformed to the standardization value that is specified in the scheme for each cluster.

database library
a collection of one or more database management system files that are recognized by SAS and that are referenced and stored as a unit. Each file is a member of the library.

database server
a server that provides relational database services to a client. Oracle, DB2, and Teradata are examples of relational databases.

delimiter
a character that separates words or phrases in a text string.

derived mapping
a mapping between a source column and a target column in which the value of the target column is a function of the value of the source column. For example, if two tables contain a Price column, the value of the target table's Price column might be equal to the value of the source table's Price column multiplied by 0.8.

delivery transport
in the Publishing Framework, the method of delivering a package to the consumer. Supported transports include e-mail, message queue, and WebDAV. Although not a true transport, a channel also functions as a delivery mechanism.

digest column
a column in a cross-reference table that contains a concatenation of encrypted values for specified columns in a target table. If a source row has a digest value that differs from the digest value for that dimension, then changes are detected and the source row becomes the new current row in the target. The old target row is closed out and receives a new value in the end date/time column.

dimension
a category of contextual data or detail data that is implemented in a data model such as a star schema. For example, in a star schema, a dimension named Customers might associate customer data with transaction identifiers and transaction amounts in a fact table.

dimension table
in a star schema or snowflake schema, a table that contains data about a particular dimension. A primary key connects a dimension table to a related fact table.
For example, if a dimension table named Customers has a primary key column named Customer ID, then a fact table named Customer Sales might specify the Customer ID column as a foreign key.

dynamic cluster table
two or more SAS SPD Server tables that are virtually concatenated into a single entity, using metadata that is managed by the SAS SPD Server.

fact table
the central table in a star schema or snowflake schema. A fact table typically contains numerical measurements or amounts and is supplemented by contextual information in dimension tables. For example, a fact table might include transaction identifiers and transaction amounts. Dimension tables could add contextual information about customers, products, and salespersons. Fact tables are associated with dimension tables via key columns. Foreign key columns in the fact table contain the same values as the primary key columns in the dimension tables.

foreign key
one or more columns that are associated with a primary key or unique key in another table. A table can have one or more foreign keys. A foreign key is dependent upon its associated primary or unique key. In other words, a foreign key cannot exist without that primary or unique key.

foundation repository
in the SAS Open Metadata Architecture, a metadata repository that is used to specify metadata for global resources that can be shared by other repositories. For example, a foundation repository is used to store metadata that defines users and groups on the metadata server. Only one foundation repository should be defined on a metadata server. See also custom repository, project repository.

generated key
a column in a dimension table that contains values that are sequentially generated using a specified expression. Generated keys are used to implement surrogate keys and retained keys.

generated transformation
in SAS Data Integration Studio, a transformation that is created with the Transformation Generator wizard, which helps you specify SAS code for the transformation.
See also transformation.

global resource
an object, such as a server or a library, that is shared on a network.

impact analysis
a search that seeks to identify the tables, columns, and transformations that would be affected by a change in a selected table or column. See also transformation, data lineage.

intersection table
a table that describes the relationships between two or more tables. For example, an intersection table could describe the many-to-many relationships between a table of users and a table of groups.

iterative job
a job with a control loop in which one or more processes are executed multiple times. Iterative jobs can be executed in parallel. See also job.

iterative processing
a method of processing in which a control loop executes one or more processes multiple times.

job
a collection of SAS tasks that create output.

locale
a value that reflects the language, local conventions, and culture for a geographic region. Local conventions can include specific formatting rules for dates, times, and numbers, and a currency symbol for the country or region. Collating sequences, paper sizes, and conventions for postal addresses and telephone numbers are also typically specified for each locale. Some examples of locale values are French_Canada, Portuguese_Brazil, and Chinese_Singapore.

lookup standardization
a process that applies a scheme to a data set for the purpose of data analysis or data cleansing.

match code
an encoded version of a character value that is created as a basis for data analysis and data cleansing. Match codes are used to cluster and compare character values. See also sensitivity.

message queue
in application messaging, a place where one program can send messages that will be retrieved by another program. The two programs communicate asynchronously. Neither program needs to know the location of the other program nor whether the other program is running.
See also delivery transport.

metadata administrator
a person who defines the metadata for servers, metadata repositories, users, and other global resources.

metadata model
a definition of the metadata for a set of objects. The model describes the attributes for each object, as well as the relationships between objects within the model.

metadata object
a set of attributes that describe a table, a server, a user, or another resource on a network. The specific attributes that a metadata object includes vary depending on which metadata model is being used.

metadata repository
a collection of related metadata objects, such as the metadata for a set of tables and columns that are maintained by an application. A SAS Metadata Repository is an example.

metadata server
a server that provides metadata management services to one or more client applications. A SAS Metadata Server is an example.

metadata source control
in the SAS Open Metadata Architecture, a feature that enables multiple users to work with the same metadata repository at the same time without overwriting each other's changes. See also change management.

operational data
data that is captured by one or more applications in an operational system. For example, an application might capture and manage information about customers, products, or sales. See also operational system.

operational system
one or more applications that capture and manage data for an organization. For example, a business might have a set of applications that manage information about customers, products, and sales.

owner
the person who is responsible for the contents of an object such as a table or a library. See also administrator.

parameterized job
a job that specifies its inputs and outputs as parameters. See also job.

parameterized table
a table whose metadata specifies some attributes as variables rather than as literal values. For example, the input to an iterative job could be a parameterized table whose metadata specifies its physical pathname as a variable.
See also iterative job.

primary key
one or more columns that are used to uniquely identify a row in a table. A table can have only one primary key. The column(s) in a primary key cannot contain null values. See also unique key, foreign key.

process flow diagram
a diagram in the Process Editor that specifies the sequence of each source, target, and process in a job. In the diagram, each source, target, and process has its own metadata object. Each process in the diagram is specified by a metadata object called a transformation.

project repository
a repository that must be dependent on a foundation repository or custom repository that will be managed by the Change Management Facility. A project repository is used to isolate changes from a foundation repository or from a custom repository. The project repository enables metadata programmers to check out metadata from a foundation repository or custom repository so that the metadata can be modified and tested in a separate area. Project repositories provide a development/testing environment for customers who want to implement a formal change management scheme. See also custom repository, foundation repository.

Quality Knowledge Base
a collection of locales and other information that is referenced during data analysis and data cleansing. For example, to create match codes for a data set that contains street addresses in Great Britain, you would reference the ADDRESS match definition in the ENGBR locale in the Quality Knowledge Base.

register
to save metadata about an object to a metadata repository. For example, if you register a table, you save metadata about that table to a metadata repository.

retained key
a numeric column in a dimension table that is combined with a begin-date column to make up the primary key. During the update of a dimensional target table, source rows that contain a new business key are added to the target. A key value is generated and added to the retained key column and a date is added to the begin-date column.
When a source row has the same business key as a row in the target, the source row is added to the target, including a new begin-date value. The retained key of the new row is copied from the target row.

reverse impact analysis
See data lineage.

SAS Application Server
in the SAS Intelligence Platform, a logical entity that represents the SAS server tier. This logical entity contains specific servers (for example, a SAS Workspace Server and a SAS Stored Process Server) that execute SAS code. A SAS Application Server has relationships with other metadata objects. For example, a SAS library can be assigned to a SAS Application Server. When a client application needs to access that library, the client submits code to the SAS Application Server to which the library is assigned.

SAS Management Console
a Java application that provides a single user interface for performing SAS administrative tasks.

SAS metadata
metadata that is created by SAS software. Metadata that is in SAS Open Metadata Architecture format is one example.

SAS OLAP Server
a SAS server that provides access to multidimensional data. The data is queried using the multidimensional expressions (MDX) language.

SAS Open Metadata Architecture
a general-purpose metadata management facility that provides metadata services to SAS applications. The SAS Open Metadata Architecture enables applications to exchange metadata, which makes it easier for these applications to work together.

SAS task
a logical process that is executed by a SAS session. A task can be a procedure, a DATA step, a window, or a supervisor process.

SAS XML library
a library that uses the SAS XML LIBNAME engine to access an XML file.

SAS/CONNECT server
a server that provides SAS/CONNECT services to a client. When SAS Data Integration Studio generates code for a job, it uses SAS/CONNECT software to submit code to remote computers.
SAS Data Integration Studio can also use SAS/CONNECT software for interactive access to remote libraries.

SAS/SHARE library
a SAS library for which input and output requests are controlled and executed by a SAS/SHARE server.

SAS/SHARE server
the result of an execution of the SERVER procedure, which is part of SAS/SHARE software. A server runs in a separate SAS session that services users' SAS sessions by controlling and executing input and output requests to one or more libraries.

SAS Stored Process Server
a SAS IOM server that is launched in order to fulfill client requests for SAS Stored Processes.

scheme
a lookup table or data set of character variables that contains variations of data items and specifies the preferred variation form or standard. When these schemes are applied to the data, the data is transformed or analyzed according to the predefined rules to produce standardized values.

sensitivity
in SAS data quality, a value that specifies the degree of complexity in newly created match codes. A higher sensitivity value results in greater match code complexity, which in turn results in a larger number of clusters, with fewer members in each cluster.

server administrator
a person who installs and maintains server hardware or software. See also metadata administrator.

server component
in SAS Management Console, a metadata object that specifies information about how to connect to a particular kind of SAS server on a particular computer.

slowly changing dimensions
a technique for tracking changes to dimension table values in order to analyze trends. For example, a dimension table named Customers might have columns for Customer ID, Home Address, Age, and Income. Each time the address or income changes for a customer, a new row could be created for that customer in the dimension table, and the old row could be retained.
This historical record of changes could be combined with purchasing information to forecast buying trends and to direct customer marketing campaigns.

snowflake schema
tables in a database in which a single fact table is connected to multiple dimension tables. The dimension tables are structured to minimize update anomalies and to address single themes. This structure is visually represented in a snowflake pattern. See also star schema.

source
an input to an operation.

star schema
tables in a database in which a single fact table is connected to multiple dimension tables. This is visually represented in a star pattern. SAS OLAP cubes can be created from a star schema.

surrogate key
a numeric column in a dimension table that is the primary key of that table. The surrogate key column contains unique integer values that are generated sequentially when rows are added and updated. In the associated fact table, the surrogate key is included as a foreign key in order to connect to specific dimensions.

target
an output of an operation.

transformation
a SAS task that extracts data, transforms data, or loads data into data stores.

transformation template
a process flow diagram that consists of a transformation object and one or more drop zones for sources, targets, or both.

unique key
one or more columns that can be used to uniquely identify a row in a table. A table can have one or more unique keys. Unlike a primary key, a unique key can contain null values.
See also primary key, foreign key.

Web service
a programming interface that enables distributed applications to communicate even if the applications are written in different programming languages or are running on different operating systems.
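Several of the glossary entries above (primary key, foreign key, surrogate key, business key, retained key, and Type 2 slowly changing dimensions) fit together in one pattern. The sketch below illustrates that pattern in standard SQL through Python's sqlite3 module rather than SAS code; the table and column names (customer_dim, sales_fact, and so on) are hypothetical examples, not objects from SAS Data Integration Studio.

```python
# Minimal sketch, assuming hypothetical table names: a dimension table with a
# surrogate primary key, a business key, and begin/end datetime columns, plus
# a fact table whose foreign key points at the surrogate key (a star schema
# in miniature).
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension table: customer_key is the surrogate key (generated integer);
# customer_id is the business key; valid_from/valid_to track history.
cur.execute("""
    CREATE TABLE customer_dim (
        customer_key INTEGER PRIMARY KEY,   -- surrogate key
        customer_id  TEXT NOT NULL,         -- business key
        address      TEXT,
        valid_from   TEXT NOT NULL,         -- begin-datetime column
        valid_to     TEXT                   -- NULL marks the current row
    )""")

# Fact table: the dimension's surrogate key appears here as a foreign key.
cur.execute("""
    CREATE TABLE sales_fact (
        sale_id      INTEGER PRIMARY KEY,
        customer_key INTEGER NOT NULL REFERENCES customer_dim(customer_key),
        amount       REAL NOT NULL
    )""")

# Initial load: one current row for business key C001.
cur.execute("INSERT INTO customer_dim VALUES "
            "(1, 'C001', '12 Oak St', '2007-01-01', NULL)")

# A changed address arrives: close out the old row and insert a new current
# row -- the Type 2 update pattern described in the glossary.
cur.execute("UPDATE customer_dim SET valid_to = '2007-06-30' "
            "WHERE customer_id = 'C001' AND valid_to IS NULL")
cur.execute("INSERT INTO customer_dim VALUES "
            "(2, 'C001', '98 Elm Ave', '2007-07-01', NULL)")

# A fact row recorded after the change references the new surrogate key.
cur.execute("INSERT INTO sales_fact VALUES (1, 2, 19.99)")

# The dimension now holds the full history for the business key.
rows = cur.execute("SELECT customer_key, address, valid_to "
                   "FROM customer_dim ORDER BY customer_key").fetchall()
print(rows)   # [(1, '12 Oak St', '2007-06-30'), (2, '98 Elm Ave', None)]
```

In SAS Data Integration Studio itself, the close-out-and-insert step shown here is handled by the SCD Type 2 Loader transformation, and surrogate key values are produced by the Surrogate Key Generator transformation.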
product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. indicates USA registration. Other brand and product names are trademarks of their respective companies. 2006 SAS Institute Inc. All rights reserved. 428713_1US.0307ContentsIntroductionIntroduction to SAS Data IntegrationAbout SAS Data IntegrationA Basic Data Integration EnvironmentOverview of a Data Integration EnvironmentSAS Management ConsoleSAS Data Integration StudioServersLibrariesAdditional InformationOverview of Building a Process FlowProblemSolutionTasksAdvantages of SAS Data IntegrationOnline Help for SAS Data Integration StudioAdministrative Documentation for SAS Data Integration StudioAccessibility Features in SAS Data Integration StudioAccessibility StandardsAbout the Main Windows and WizardsOverview of the Main Windows and WizardsMetadata Windows and WizardsProperty Windows for Tables and Other ObjectsWindows Used to View DataImpact Analysis and Reverse Impact Analysis WindowsImport and Export Metadata WizardsSource Designer WizardsTarget Table WizardSAS OLAP Cube WizardsData Surveyor WizardsJob Windows and WizardsProcess Designer WindowExpression Builder WindowJob Status Manager WindowSource Editor WindowNew Job WizardTransformation Generator WizardSAS Data Integration Studio Application WindowsMetadata Profile WindowDesktop WindowOptions WindowTree View on the DesktopTree ViewInventory TreeCustom TreeProcess Library TreeProject TreeQuick Properties PaneComparison Results TreeMetadata TreeGeneral User TasksGetting StartedRequired Components for SAS Data Integration StudioMain Tasks for Creating Process FlowsStarting SAS Data Integration StudioProblemSolutionTasksConnecting to a Metadata ServerProblemSolutionTasksAdditional InformationReconnecting to a Metadata ServerProblemSolutionSelecting a Default SAS Application ServerProblemSolutionTasksRegistering Any Libraries That You NeedProblemSolutionTasksRegistering Sources and 
TargetsProblemSolutionTasksWorking with Change ManagementProblemSolutionTasksAdditional InformationSpecifying Global Options in SAS Data Integration StudioProblemSolutionTasksImporting, Exporting, and Copying MetadataAbout Metadata ManagementWorking with SAS MetadataAbout Importing and Exporting SAS MetadataSAS Metadata That Can Be Imported and ExportedAutomatic ContentOptional ContentRestoration of Metadata AssociationsOptional Promotion of Access ControlsLogging after Import or ExportBest Practices for Importing or Exporting SAS MetadataPreparing to Import or Export SAS MetadataExporting SAS MetadataProblemSolutionTasksImporting SAS MetadataProblemSolutionTasksCopying and Pasting SAS MetadataProblemSolutionTasksWorking with Other MetadataAbout Exporting and Importing Other MetadataOther Metadata That Can Be Imported and ExportedUsage Notes for Importing or Exporting Other MetadataPreparing to Import or Export Other MetadataImporting As New Metadata (without Change Analysis)ProblemSolutionTasksImporting Metadata with Change AnalysisProblemSolutionTasksExporting Other MetadataProblemSolutionTasksWorking with TablesAbout TablesRegistering Tables with a Source DesignerProblemSolutionTasksRegistering Tables with the Target Table WizardProblemSolutionTasksViewing or Updating Table MetadataProblemSolutionUsing a Physical Table to Update Table MetadataProblemSolutionTasksSpecifying Options for TablesProblemSolutionTasksSupporting Case and Special Characters in Table and Column NamesAbout Case and Special Characters in SAS NamesAbout Case and Special Characters in DBMS NamesEnabling Name Options for Existing TablesSet DBMS Name Options in the Source DesignersSetting Default Name Options for Tables and ColumnsMaintaining Column MetadataProblemSolutionTasksIdentifying and Maintaining Key ColumnsProblemSolutionTasksMaintaining IndexesProblemSolutionTasksBrowsing Table DataProblemSolutionTasksAdditional InformationEditing SAS Table DataProblemSolutionTasksAdditional 
InformationUsing the View Data Window to Create a SAS TableProblemSolutionTasksAdditional InformationSpecifying Browse and Edit Options for Tables and External FilesProblemSolutionTasksWorking with External FilesAbout External FilesRegistering a Delimited External FileProblemSolutionTasksAdditional InformationRegistering a Fixed-Width External FileProblemSolutionTasksAdditional InformationRegistering an External File with User-Written CodeProblemSolutionTasksAdditional InformationViewing or Updating External File MetadataProblemSolutionOverriding the Code Generated by the External File WizardsProblemSolutionTasksAdditional InformationSpecifying NLS Support for External FilesProblemSolutionTasksAdditional InformationAccessing an External File With an FTP Server or an HTTP ServerProblemSolutionTasksAdditional InformationViewing Data in External FilesProblemSolutionTasksRegistering a COBOL Data File That Uses a COBOL CopybookProblemSolutionTasksCreating, Executing, and Updating JobsAbout JobsJobs with Generated Source CodeJobs with User-Supplied Source CodeRun JobsManage Job StatusCreating an Empty JobProblemSolutionTasksCreating a Process Flow for a JobProblemSolutionTasksAbout Job OptionsSubmitting a Job for Immediate ExecutionProblemSolutionTasksAccessing Local and Remote DataAccess Data in the Context of a JobAccess Data InteractivelyUse a Data Transfer TransformationViewing or Updating Job MetadataProblemSolutionTasksDisplaying the SAS Code for a JobProblemSolutionTasksCommon Code Generated for a JobOverviewLIBNAME StatementsSYSLAST Macro StatementsRemote Connection StatementsMacro Variables for Status HandlingUser Credentials in Generated CodeTroubleshooting a JobProblemSolutionTasksMonitoring JobsAbout Monitoring JobsPrerequisites for Job Status Code HandlingUsing the Job Status ManagerProblemSolutionTasksManaging Status HandlingProblemSolutionTasksManaging Return Code Check TransformationsProblemSolutionTasksMaintaining Status Code Conditions and 
ActionsProblemSolutionTasksDeploying JobsAbout Deploying JobsAbout Job SchedulingPrerequisites for SchedulingDeploying Jobs for SchedulingProblemSolutionTasksRedeploying Jobs for SchedulingProblemSolutionTasksUsing Scheduling to Handle Complex Process FlowsProblemSolutionTasksDeploying Jobs for Execution on a Remote HostProblemSolutionTasksAbout SAS Stored ProcessesPrerequisites for SAS Stored ProcessesDeploying Jobs as SAS Stored ProcessesProblemSolutionTasksRedeploying Jobs to Stored ProcessesProblemSolutionTasksViewing or Updating Stored Process MetadataProblemSolutionTasksAbout Deploying Jobs for Execution by a Web Service ClientRequirements for Jobs That Can Be Executed by a Web Service ClientOverview of RequirementsProcess Flow RequirementsData Format for Web Client Inputs and OutputsLibraries and Librefs for Web Client Inputs and OutputsWeb Streams for Web Client Inputs and Outputs(Optional) Parameters for User InputCreating a Job That Can Be Executed by a Web Service ClientProblemSolutionTasksDeploying Jobs for Execution by a Web Service ClientProblemSolutionTasksUsing a Web Service Client to Execute a JobProblemSolutionTasksWorking with TransformationsAbout TransformationsViewing the Code for a TransformationProblemSolutionTasksViewing or Updating the Metadata for TransformationsProblemSolutionTasksAdditional InformationCreating and Maintaining Column MappingsProblemSolutionTasksAbout Archived TransformationsWorking with Generated CodeAbout Code Generated for JobsOverviewLIBNAME StatementsSYSLAST Macro StatementsRemote Connection StatementsMacro Variables for Status HandlingUser Credentials in Generated CodeDisplaying the Code Generated for a JobProblemSolutionTasksDisplaying the Code Generated for a TransformationProblemSolutionTasksSpecifying Options for JobsProblemSolutionTasksSpecifying Options for a TransformationProblemSolutionTasksModifying Configuration Files or SAS Start Commands for Application ServersWorking with User-Written CodeAbout 
User-Written CodeEditing the Generated Code for a JobProblemSolutionTasksAdding User-Written Source Code to an Existing JobProblemSolutionTasksCreating a New Job That Consists of User-Written CodeProblemSolutionTasksEditing the Generated Code for a TransformationProblemSolutionTasksAbout the User Written Code TransformationCreating a Job That Includes the User Written Code TransformationProblemSolutionTasksCreating and Using a Generated TransformationProblemSolutionTasksAdditional InformationMaintaining a Generated TransformationProblemSolutionTasksOptimizing Process FlowsAbout Process Flow OptimizationManaging Process DataProblemSolutionTasksManaging ColumnsProblemSolutionTasksStreamlining Process Flow ComponentsProblemSolutionTasksUsing Simple Debugging TechniquesProblemSolutionTasksUsing SAS LogsProblemSolutionTasksReviewing Temporary Output TablesProblemSolutionTasksAdditional InformationUsing Impact AnalysisAbout Impact Analysis and Reverse Impact AnalysisPrerequisitesPerforming Impact AnalysisProblemSolutionTasksPerforming Impact Analysis on a Generated TransformationProblemSolutionTasksPerforming Reverse Impact AnalysisProblemSolutionTasksWorking with Specific TransformationsWorking with Loader TransformationsAbout Loader TransformationsAbout the SPD Server Table Loader TransformationAbout the Table Loader TransformationSetting Table Loader Transformation OptionsProblemSolutionTasksSelecting a Load TechniqueProblemSolutionTasksRemoving Non-Essential Indexes and Constraints During a LoadProblemSolutionConsidering a Bulk LoadProblemSolutionWorking with SAS SortsAbout SAS Sort TransformationsSetting Sort OptionsProblemSolutionOptimizing Sort PerformanceProblemSolutionCreating a Table That Sorts the Contents of a SourceProblemSolutionTasksWorking with the SQL Join TransformationAbout SQL Join TransformationsUsing the SQL Designer TabProblemSolutionTasksAdditional InformationReviewing and Modifying Clauses, Joins, and Tables in an SQL 
QueryProblemSolutionTasksUnderstanding Automatic JoinsA Sample Auto-Join ProcessSelecting the Join TypeProblemSolutionTasksAdding User-Written SQL CodeProblemSolutionAdditional InformationDebugging an SQL QueryProblemSolutionTasksAdding a Column to the Target TableProblemSolutionTasksAdding a Join to an SQL Query in the Designer TabProblemSolutionTasksCreating a Simple SQL QueryProblemSolutionTasksAdditional InformationConfiguring a SELECT ClauseProblemSolutionTasksAdding a CASE ExpressionProblemSolutionTasksCreating or Configuring a WHERE ClauseProblemSolutionTasksAdding a GROUP BY Clause and a HAVING ClauseProblemSolutionTasksAdding an ORDER BY ClauseProblemSolutionTasksAdding SubqueriesProblemSolutionTasksSubmitting an SQL QueryProblemSolutionTasksJoining a Table to ItselfProblemSolutionTasksUsing Parameters with an SQL JoinProblemSolutionConstructing a SAS Scalable Performance Data Server Star JoinProblemSolutionTasksAdditional InformationOptimizing SQL Processing PerformanceProblemSolutionPerforming General Data OptimizationProblemSolutionTasksInfluencing the Join AlgorithmProblemSolutionTasksSetting the Implicit Property for a JoinProblemSolutionEnabling Pass-Through ProcessingProblemSolutionTasksUsing Property Sheet Options to Optimize SQL Processing PerformanceProblemSolutionTasksWorking with Iterative Jobs and Parallel ProcessingAbout Iterative JobsAdditional InformationCreating and Running an Iterative JobProblemSolutionTasksAdditional InformationCreating a Parameterized JobProblemSolutionTasksAdditional InformationCreating a Control TableProblemSolutionTasksAdditional InformationAbout Parallel ProcessingSetting Options for Parallel ProcessingProblemSolutionTasksAdditional InformationWorking with Slowly Changing DimensionsAbout Slowly Changing Dimensions (SCD)SCD and the Star SchemaAbout the Star Schema Loading ProcessAbout Type 2 SCD Dimension TablesAbout Change Detection and the Loading Process for SCD Dimension TablesAbout Cross-Reference TablesAbout 
the Structure and Loading of Fact TablesAbout KeysAbout Generated KeysTransformations That Support Slowly Changing DimensionsLoading a Dimension Table Using Begin and End Datetime ValuesProblemSolutionTasksLoading a Dimension Table Using Version Numbers or Current-Row IndicatorsProblemSolutionTasksLoading a Fact TableProblemSolutionTasksGenerating Retained Keys for an SCD Dimension TableProblemSolutionTasksUpdating Closed-Out Rows in SCD Dimension TablesProblemSolutionTasksOptimizing SQL Pass-Through in the SCD Type 2 LoaderProblemSolutionTasksWorking with Message QueuesAbout Message QueuesPrerequisites for Message QueuesSelecting Message Queue TransformationsProblemSolutionTasksAdditional InformationProcessing a WebSphere MQ QueueProblemSolutionTasksAdditional InformationWorking with SPD Server Cluster TablesAbout SPD Server ClustersCreating an SPD Server ClusterProblemSolutionTasksMaintaining an SPD Server ClusterProblemSolutionAppendixesRecommended ReadingRecommended ReadingGlossaryIndex
- Delivers content from a Microsoft MQ message queue to SAS Data Integration Studio. The message queue content can be sent either to a table or to a SAS Data Integration Studio transformation.
- Data does not import correctly when using the External File wizard or the Table Properties window in SAS® Data Integration Studio (B51006). NOTE: If you install this hot fix, you must also install hot fix C63005 for Data Integration Studio Server Data 9.2-M2 on your metadata server machine.