Wednesday, February 8, 2017

Spool Space in Teradata

What are the different Spool statuses and their significance?

The answers the following queries:

1. What does spool request contain?
2. Where is the spool request stored?
3. What are different Spool statuses and their significance?
4. Which work processes will create spool requests?
5. What is the difference between spool request and output request?
6. Which work process will create output request or print request?

1. What does spool request contain?

ANS: Spool request contains detailed information regarding printer model, formatting to be done on the data to be output.

2. Where is the spool request stored?

ANS: The spool request that is generated is stored in the Temse(temporary sequential file)

3. What are different Spool statuses and their significance?

ANS: Spool requests will have different statuses like Completed (The output request printed successfully) Waiting (Spool request is in waiting status and spool system has not yet processed the output request and not yet sent to the host system)
In process (The spool work process is in the process of formatting output request for printing) Printing (The host spool is in the process of printing the output request   Problem (This indicates there are some minor errors. It means output request printed but contains some errors like incomplete character set etc)

Error (This status indicates some serious issues such as network error. It means the requests have not printed and they remain in the spool system until they are deleted or until they reach their expiration date. After expiry date, they are deleted by the reorganization jobs.

Archive (This status is set for spool requests that are sent to an archiving device. This status means that the spool request was already processed by the spool system and it is now waiting to be archived)
Time (This status means that the user who has created the spool has mentioned some specific time for output)
(This minus status means that the spool request is not yet sent to the host system and no output request exists)
(This plus status means that the spool request is being generated and it is stored in the spool system)

4. Which work processes will create spool requests?

ANS: Spool requests will be created by dialog or background work processes.   Please note that spool work processes do not create spool requests

5. What is the difference between spool request and output request?

ANS: Spool request contains detailed information regarding printer model, formatting to be done on the data to be output. Output request contains all the data in the format that the printer understands.

A spool workprocess formats the data specified in the spool request and generates the output request. Output request is also termed as print request.

6. Which work process will create output request?

Ans: Output request will be created by spool work processes.

Spool overflow issues :

What is spool overflow ? How to troubleshoot spool overflow issue ?

1. What is spool overflow ?
2. How to troubleshoot spool overflow ?
3. How to delete old spool requests?
4. How to avoid spool overflow issue?
5. How to check spool numbers?
6. How to set the spool number in SAP?
7. How to define storage location (Temse) of spool requests ?

1. What is spool overflow ?

ANS: Spool request that got generated are stored in a temporary sequential file called Temse. Please note that these are stored in Temse because these are not stored permanently. The size of the Temse is limited by the parameter definition.

The default size of Temse is 32,000 and it can be increased till 2 billion. However, defining very high value can effect the databasePerformance. Due to limited storage capacity of Temse, when this file got full, this throws an SAP spool overflow error.

If this happens, customers could not print from SAP. If you check in SM21 and ST22 transactions, you may encounter some error log or dumps like

ERROR IN SPOOL call: SPOOL OVERFLOW
SPOOL_INTERNAL_ERROR

System internal error in the spool system

2. How to troubleshoot spool overflow?

ANS:To resolve this spool issue, some of the old spools can be deleted from the sap system. Goto  transaction SPAD. From the initial screen, navigate to Administration -> Clean_up_spool. Once you click this, Spool Administration: Delete old spool requests screen will be displayed.

In that screen, you will be prompted to choose requests for deletion like

Old spool requests older than the maximum set age, Requests already printed .All spool requests and enter value for minimum age (in days) text box. Select the check boxes as per the requirement (eg:  delete all requests already printed and minimum age in days equal to 30 days) and execute to delete old spool requests, which will clear Temse and will resolve spool over flow issue. Apart from this there are others ways of resolving spool overflow issue. They are

1) Run the report RSPO0041 from SE38 transaction to delete old spool requests
2) Run the report RSPO0043 from SE38 transaction, which checks the Temse/spool consistency and reorganizes the spool
  
4. How to avoid spool overflow issue?

ANS: These reports RSPO0041 and RSPO0043 should be scheduled as background jobs with proper variants on a periodic basis which avoids spool overflow issue.

You can try to increase the spool numbers from default 32000 to a bit higher value but please note do not set this too high as it may impact database performance Proactive monitoring check can be made in RZ20 transaction to see how much the fill level of spool numbers is.

5. How to check spool numbers?

ANS: To check spool numbers: Go to RZ20 transaction, navigate to SAP CCMS Monitor Templates -> Spool System.

Here you can check the status of spool servers and the used numbers percentage of spool numbers. If you find that used numbers is greater than 80%, you can manually run above mentioned reports or delete old spool requests from SPAD tcode as mentioned earlier.

6. How to set the spool number in SAP?

ANS: Using Transaction SNRO (Number range object maintenance) and object SPO_NUM, you can set the spool number in an SAP system.Once you provide object as SPO_NUM, click on change icon.
It displays another screen. In that screen, click on Number Ranges push button. It displays another screen "Number range for spooler numbers". Click on change icon here and set the value of spooler (from number and to number). Please note that a change request may be generated here based on the client settings (if automatic recording of changes is set in SCC4 transaction of the client)

7. How to define storage location (Temse) of spool requests ?

Ans: Every Temse object contains a header in TST01 table and the actual object. This can be either stored in table TST03 or in file system. You can set a parameter rspo/store_location in RZ10 to define whether to store Temse objects  TST03 table or in filesystem.

If value G is set to this parameter, it will be stored in filesystem and if it is db it is stored in TST03 table.

Spool Space Issue :

If you want to check whether the stats are working well for your query, then render the below query, execute EXPLAIN plan

Explain plan generate:
F5 = Query
F6 = Explain

First method: 

EXPLAIN Select query

Second Method :

DIAGNOSTIC HELPSTATS ON FOR SESSION;
or
DIAGNOSTIC VERBOSEEXPLAIN ON FOR SESSION;

EXPLAIN select query ;

Press F5
In that explain information need to check any recommended stats ,

Then ,collect those recommended stats and run in sql assistant ,after that run again explain command

When I get an error stating ‘NO MORE SPOOL SPACE IN USER’ what does it mean?

IF user has spool space limit defined as 10GB and there are 10 AMPS. i.e. 1GB of spool space is available on each AMP under this user.
Spool space problem depends on

1. Volume of data queried/accessed
2. Index selection on table.
3. Stats on table
4. No of active session per user

If your processing data goes into one AMP which is holding only 1GB spool then it will fail with ‘NO MORE SPOOL SPACE IN USER’. I.e. data is skewed.

Information on current user’s profile and active sessions:

Session info:

Below query gives,

How many queries are running at this moment by user?

Locking row for access

SELECT* FROM dbc.sessioninfo WHERE username=’######’;

Spool space info:

Below query gives,

How much spool has been allocated to the user ?

Locking row for access

SELECT username,spoolspace,profilename FROM dbc.users WHERE username=’#####’;

Note:

If profilename is not null, then the spool is probably being allocated by the amount in dbc.profiles.

If profilename is null, then spool is being allocated by the amount defined in dbc.users .

SQL:

locking row for access select * from dbc.Users where username=’#####’;

Approach to resolve “spool space error”:

Follow below steps to resolve the error.

These are potential causes:

1. Statistics are stale. Run HELP STATS on each table involve in SQL.

2. Verify and confirm (using step 1) that there is no other active session running too many large concurrent queries.

Note: Spool space is shared across session under the user and can cause error if there are sessions running too many large queries.

3. The data is skewed and is running out of spool on one amp. After 10 minutes, you can check PDCRDATA to get skew information.

SQL:

LOCKING ROW FOR ACCESS
SELECT T1.STARTTIME “StartTime”
, T1.FIRSTSTEPTIME “FirstStepTime”
, T1.FIRSTRESPTIME “FirstResponseTime”
, T1.USERNAME “UserName”
, T1.SESSIONID “SessionId”
, T1.QUERYID “QueryId”
, T1.PROCID “ProcessId”
, (
EXTRACT(HOUR FROM ((T1.FIRSTRESPTIME-T1.FIRSTSTEPTIME) HOUR(4) TO SECOND))
*3600 + EXTRACT(MINUTE FROM ((T1.FIRSTRESPTIME-T1.FIRSTSTEPTIME) HOUR(4) TO SECOND)) * 60 +
EXTRACT(SECOND FROM ((T1.FIRSTRESPTIME-T1.FIRSTSTEPTIME) HOUR(4) TO SECOND))
) “FirstStepProcessedTime”
, (
EXTRACT(HOUR FROM ((T1.FIRSTRESPTIME – T1.STARTTIME) HOUR(4) TO SECOND)) *3600 +
EXTRACT(MINUTE FROM ((T1.FIRSTRESPTIME – T1.STARTTIME) HOUR(4) TO SECOND)) * 60 +
EXTRACT(SECOND FROM ((T1.FIRSTRESPTIME – T1.STARTTIME) HOUR(4) TO SECOND))
) “ElapsedTime”
, (
EXTRACT(HOUR FROM ((T1.FIRSTSTEPTIME – T1.STARTTIME) HOUR(4) TO SECOND)) *3600 +
EXTRACT(MINUTE FROM ((T1.FIRSTSTEPTIME – T1.STARTTIME) HOUR(4) TO SECOND)) * 60 +
EXTRACT(SECOND FROM ((T1.FIRSTSTEPTIME – T1.STARTTIME) HOUR(4) TO SECOND))
) “GapTime”
, T1.DELAYTIME “DelayTime”
, T1.NUMOFACTIVEAMPS “NumOfActiveAmps”
, T1.TDWMALLAMPFLAG “TDWMAllAmpFlag”
, T1.TDWMESTTOTALTIME “TDWMEstTotalTime”
, T1.ESTPROCTIME “EstProcTime”
, T1.ESTRESULTROWS “EstResultRows”
, T1.ESTMAXROWCOUNT “EstMaxRowCount”
, T1.AMPCPUTIME “AMPCPUTime”
,T1.TOTALIOCOUNT “TotalIOCount”
,T1.PARSERCPUTIME “ParserCPUTime”
,T1.NUMRESULTROWS “NumResultRows”
,T1.CACHEFLAG “CacheFlag”
,ZEROIFNULL(T1.MAXAMPCPUTIME/NULLIFZERO(T1.AMPCPUTIME/(HASHAMP()+1))) “CPUSkew”
,ZEROIFNULL(T1.MAXAMPIO/NULLIFZERO(T1.TOTALIOCOUNT/(HASHAMP()+1))) “IOSkew”
, ZEROIFNULL(T1.AMPCPUTIME*1000/NULLIFZERO(T1.TOTALIOCOUNT)) “PJI”
, ZEROIFNULL(T1.TOTALIOCOUNT/NULLIFZERO(T1.AMPCPUTIME*1000)) “UII”
,T1.MAXAMPCPUTIME*(HASHAMP()+1) “ImpactCPU”
,T1.MAXAMPIO*(HASHAMP()+1) “ImpactIO”
,T1.MAXCPUAMPNUMBER “MaxCPUAmpNumber”
,T1.MAXIOAMPNUMBER “MaxIOAmpNumber”
,T1.NUMSTEPS “NumSteps”
,T1.SPOOLUSAGE “SpoolUsage”
,T1.ERRORCODE “ErrorCode”
,T1.ERRORTEXT “ErrorText”
,T1.STATEMENTTYPE “StatementType”
,T1.DEFAULTDATABASE
,T1.WDID “WDId”
,T1.FINALWDID “FinalWDId”
, CASE
WHEN CHARS(T2.SQLTEXTINFO) <= 15000
THEN SUBSTR(T2.SQLTEXTINFO,1,15000)
ELSE NULL
END “SQLTextInfo”
, CASE
WHEN CHARS(T2.SQLTEXTINFO) > 15000
THEN T2.SQLTEXTINFO
ELSE NULL
END “LongSQLTextInfo”
FROM DBC.DBQLOGTBL T1
LEFT
JOIN DBC.DBQLSQLTBL T2
ON T1.PROCID=T2.PROCID
AND T1.QUERYID=T2.QUERYID
WHERE T2.SQLROWNO=1
and T1.USERNAME=’GHHUSDWETL’ —-OR T1.SESSIONID=<#####>;

Note: Modify Index column of tables if required.

4. Analyze the volume on each tables involved in SQL.

5. Run the EXPLAIN for the query in verbose mode and collect stats on recommended columns.

6. If incorrect PIs are chosen (Skewed table) , the INSERT staments can also spool out.

diagnostic verbose explain on for session;
diagnostic helpstats on for session;

Conclusion: The above steps are to optimize existing spool space allocated. Alternatively you can also looking at allocating more spool space depending on the user query.

Why do I get "No more spool space" error?

This is the most familiar error message in Teradata world:

"Failure 2646 No more spool space"
What does it really mean, what is it caused by?

What is spool space?

Spool space is a temporary area, that can be used to store the part-results during query processing, as well as volatile tables. All free space in the database that are not allocated by PERM data, technically can be used for spool area, since a PERM data does not want to allocate that area.

Each database users may have a "spool limit" that restricts the user to allocate more spool area at a time, than its limit. Keep in mind that all active sessions of a username must share the spool limit together.

Teradata is a massive parallel system, therefore the spool limit must be interpreted on AMP level:

Eg: 100AMP system, a user with 10G spool limit means: 100M spool/AMP

What is spool space limit good for?

This limitation method is a quite simple way to cut those queries from the system that would stuck too much resources. There is no exact relationship between high spool usage and ineffective query, but statistically the correlation is high.
Practically: a bad query is being kicked off before is consumes too much resources unnecessarily.

No more spool space scenarios

System ran out of spool space

This is the most rare situation, forget about. There are too few free space on the system, but this situation used to be avoided by defining a "SpoolReserve" database, where no objects are created, this way that area is always available for spool.
If many "big spool limit" users run high spool queries parallel, then this rare situation can yet occure.

Multiple session of the user are active together

This is a quite rare situation also. Check the active users from dbc.sessioninfo.

Volatile tables

All existing volatile tables reside in your available spool space, reducing the available. If you create many, and even with skewed distribution, you can stuff your spool up. Choose "primary index" carefully, when defining volatile tables also.

Improper execution plan These are the >90% of cases that cause the "No more spool space" errors. Let' see how:

"Duplication to all AMPs" of a non-small set of records

The root cause is typically missing or stale statistics. Either system thinks that much less records will be duplicated than the real (sometimes billions of records get in this kind of spools), or knows this exactly, but on the other branch of the query there are more low quality estimations, and this execution seems to be cheaper.

Redistribution of records by a hash that causes skewed distribution

Check the corresponding blog post: Accelerate skewed joins

Retrieve huge amount of records into spool (locally or redistributed onto the AMPs)
Specific query structures imply this execution, like: join to a view that "union all"-s big tables.

I suggest to use PRISE Tuning Assistant to identify what is the problem. It spectacularly displays which execution step falls in the problems above.
Increasing the spool limit will not solve the problems in the most cases.
Too big task

Sometimes a given SQL query requires big spool area to be performed, even with the best execution plan.

This is the only case when raising spool limit is the solution. But first you have to understand that the task is really big. PRISE Tuning Assistant is a good tool for identify this in a minute.
  
Spool space in Teradata :

Spool space is allocated to a logon with a spool space limit parameter.

This spool space parameter assigns a spool space limit to each amp.

Each amp gets an equal share of the spool space limit. Therefore,

                                                            Logon spool space limit
            Amp spoolspace limit =   ---------------------------------------------------
                                                            Number of amps

The message “Failure 2646 No more spool space in username” occurs whenever any amp exceeds its individual spoolspace limit.


For example,

If the logon is assigned 150MB of spool space, and there are 50 amps, then each amp will get a 3MB spool space limit.

And, the 2646 message will occur if any amp exceeds 3MB of spool space.


ERROR : No more spool space -

1   (1)    SHARING SPOOL –

Are the overnight queries running under a single ID, or small number of IDs compared to daytime? If yes, then that could be the problem. All simultaneous queries running under a single ID share the spool on a FIFO basis.

2) TIMING OF STATS COLLECTION -- are stats collected at night when some tables are empty?

Do these tables grow to a good size during the overnight ETL activity creating a stale stats scenario? Look at some of the staging databases and their tables.

3) STABLE ETL PROCESSES –

Is the ETL cycle predictable? Do significant processing threads run late and/or long? Depending on what is happening this could be a subset of "1" or "2".

4) SKEW –

Are any of the staging loads skewed? Any large tables with a skew number above 30? A few skewed tables or activities have an outsized impact on performance.

5) CHANGE –

A generic item = what is different about the night vs. the day (other than the sun has gone down)? Any recent changes in the nighttime activity set?

6) Check the spool limit to the users

7) It may be due to skewing (data may be skewed in AMP).Since data is loaded, it may cause hot AMPs.



No comments:

Post a Comment