Monday, September 5, 2016

HANA DB Locks

Home


1. Present lock situations

Check the Alerts Monitor in HANA Studio for alerts indicating lock situations, which may serve as a starting point for the following investigations:
Check for blocked transactions ("Performance -> Load", "Performance -> Blocked Transactions", "Performance -> Threads") to determine the number of blocked transactions.
                       When transactions are blocked, identify the session that blocks other transactions from ("Performance -> Sessions")or ("Performance -> Blocked Transactions").


                       To resolve the blocking situation, the following actions can be taken.

                       1. Check the application logic of the currently blocking session to see if the application logic can be changed to avoid the blocking situation.

                       2. Consider changing the schedule or parameterization of the application modifying those database objects and thereby blocking those transactions.

                       3. You can terminate the blocking session manually if you need to resolve the situation immediately.

                       4. When further analysis is needed to find out the root cause of the blocking transaction, follow the instructions in later part of this SAP Note and collect diagnosis information and send the information to SAP Support.

The following "System Information" views provide additional information on blocked transactions, locks and sessions: "Open Transactions", "Blocked Transactions", "Table locks", "Record Locks", "Sessions".
A possible root cause for a temporary increase of internal locks are contentions on system resources. background processes.
A possible root cause for a temporary increase of internal locks are contentions on those system resources.
                       To determine temporal coincidences of deteriorated performance with increasing lock contention caused by background jobs, use the HANA Studio monitors "Performance -> Load", and "Performance -> Job Progress" and the "System Information" views to check the status of "(Delta) Merge Statistics", "Failed Backups", "Backup catalog" "MVCC blocker transaction", "MVCC blocker connection".

                       To find the times when savepoints have been written, submit a query on the system view "SYS"."M_SAVEPOINTS". The view SYS"."M_CS_UNLOADS" logs phases of intensive column data unloads from memory.

To find wait situations at a thread level, open the HANA Studio monitor "Performance -> Threads", select "Create call stacks", and repeatedly refresh the current page every few minutes. Check which threads do not change their call stacks even after many repetitions.
Unchanged call stacks can have many other root causes than lock contention, e.g., long running queries, open sessions waiting for a user action, or background processes.
                       To find out the root cause of a long running query, the following checks can be performed.

Number of row store version is over 1,000,000:
                       1. Check number of row store versions using the following query:

                       select * from m_mvcc_tables

                       If the number of versions is over 1 million, it might affect overall system performance. Therefore, we need to find out which transaction blocks garbage collection, possibly by long-running or unclosed cursors, long-running serializable/repeatable read isolation mode transactions, or hanging threads.

                       2. From "System Information" views, identify the transaction and connection information that is blocking the garbage collection by checking "MVCC blocker transaction" and "MVCC blocker connection".

                       3. Identify Query String of the problemaic cusror or query using the following query:

                       select * from m_prepared_statements where statement_id = (select current_statement_id from m_connections where connection_id = (select connection_id from m_transactions where min_mvcc_snapshot_timestamp = (select min(Value) from m_mvcc_tables where name = 'MIN_SNAPSHOT_TS') and connection_id > 0))

                       4. Get more information on the session context:

                       select * from m_session_context where connection_id = (select connection_id from m_transactions where min_mvcc_snapshot_timestamp = (select min(Value) from m_mvcc_tables where name = 'MIN_SNAPSHOT_TS') and connection_id > 0)

                       You can get more detailed information on the session such as application program, application user from this query.

                       5. Analyze the query why it takes long

                       Check application and solve the problem. For example, application program has to be changed if there is any unclosed cursor or uncommitted transaction.

                       6. If a persistent sequence is used with "NO CACHE" option, then please check SAP Note 1977214 SAP HANA: Growing number of rowstore table versions when a persistent sequence is used.

                       7. If the problem is not resolved, then follow the instructions in later part of this SAP Note and collect diagnosis information and send the information to SAP Support.

Number of row store version is less than 1000000, but still a performance is not good and need further investigation, then follow the instructions in later part of this SAP Note and collect diagnosis information and send the information to SAP Support.
When further analysis is needed to find out the root cause of the wait situation at a thread level, follow the instructions in later part of this SAP Note and collect diagnosis information and send the information to SAP Support.
2. Past lock situations
Identify the exact time frame when the performance temporarily decreased.
For example, by using the "Performance -> Load" monitor, status of resource consumption such as CPU or memory usage or SQL workloads such as number of active connections, statements, blocked transactions, versions, active read / write requests, column unload status can be checked. HANA trace files also can be used to identify the time frame of problematic situation by checking alerts or warning / error messages.
                       The time frame is a valuable filter for further analysis.

                       When HANA database is not restarted after the performance downgrade, the "System Information" views can be checked further. If the performance issue is resolved without a database restart, then it indicates that the database was not in a hang situation, but the database performance was temporarily decreased.

                       If HANA database has been restarted after a hang situation, the history of database status can be found in "_SYS_STATISTICS" schema:

                       "HOST_LONG_RUNNING_STATEMENTS","HOST_LONG_RUNNING_SERIALIZABLE_TRANSACTION","HOST_LONG_IDLE_CURSOR","HOST_LONG_RUNNING_UNCOM MITTED_WRITE_TRANSACTION","HOST_MEMORY_STATISTICS","HOST_RESOURCE_UTILIZATION_STATISTICS","HOST_SAVEPOINTS", "HOST_VOLUME_IO_PERFORMANCE_STATISTICS"

To check if a savepoint was written during a pre-defined time period, submit a query on the view "SYS"."M_SAVEPOINTS" and specify a suitable time window for the corresponding attribute(s). A longer history can be found in "_SYS_STATISTICS"."HOST_SAVEPOINTS". Similarly, the view "SYS"."M_CS_UNLOADS" saves time information when column data is unloaded from memory.
To search for lock messages in trace files, open the HANA DB Studio "Diagnosis Files" view and primarily examine the indexserver alert trace files. In a scale-out environment, there is one such file for each host. Search for the strings "lock", "wait", and "transaction". Check if the messages refer to lock situations and the timestamps match the period of performance deterioration.
3. Collect Diagnosis Information for SAP Support
                       To collect Diagnosis Information ("runtimedump") including configurations and trace files, follow the SAP Notes 1837439 "Activating the Emergency Support Package for DB support" and 1732157 "SAP HANA: Howto get system dump". Send the system dump archive to SAP Customer Support who will provide you an SAPMATS container link for the upload. Small dump archives can be directly attached to OSS messages.

                       This collection of Diagnosis information is helpful for both present and past lock situations.

                       If the HANA DB System currently shows a serious hang situation, repeat the collection of Diagnosis files again after a few minutes and additionally send the second dump archive to SAP Customer Support.



2. SAP HANA locking issues?

You suffer from terminations and short dumps due to SQL errors 131 or 133:

SQL error 131: transaction rolled back by lock wait timeout
SQL error 133: transaction rolled back by detected deadlock
SQL error 146: resource busy and NOWAIT specified
The mvcc_anti_ager reports in a trace file:

There are too many lock items on this system.


3. Types of locks exist?

We can distinguish the following SAP HANA lock types:

Lock type Lock wait thread state Scope  Views Details
Record lock ConditionalVariable Wait transactional M_RECORD_LOCKS Exclusive locks on record level, typically caused by concurrent changes of the same records by different transactions
(RecordLockWaitCondStat / TransactionLockWaitCondStat) M_CONDITIONAL_VARIABLES
M_BLOCKED_TRANSACTIONS
M_OBJECT_LOCK_STATISTICS
Object lock ConditionalVariable Wait transactional M_OBJECT_LOCKS Locks on object level, typically caused by DDL operations requiring an object lock:
(TableLockWaitCondStat / TransactionLockWaitCondStat) M_CONDITIONAL_VARIABLES
M_BLOCKED_TRANSACTIONS Shared locks (INTENTIONAL EXCLUSIVE): Set in case of DELETE, INSERT, MERGE, SELECT FOR UPDATE, UPDATE and UPSERT operations on table
M_OBJECT_LOCK_STATISTICS Exclusive locks (EXCLUSIVE): Set in case of DDL operations on table and in case of an explicit LOCK TABLE operation 
Read / write lock SharedLock Enter internal M_READWRITELOCKS Read / write lock waits, e.g. waits during critical savepoint phase
ExclusiveLock Enter
Semaphore Semaphore Wait internal M_SEMAPHORES Low level locks based on semaphores, e.g. waits for a critical delta merge phase
Mutex Mutex Wait internal M_MUTEXES Low level locks based on mutexes
Barrier Barrier Wait internal Low level locks similar to mutexes
Speculative locks Speculative Lock Retry backoff internal Low level transactional memory locks (SAP HANA >= Rev. 122)
Speculative Lock Wait for fallback
liveCache lock internal M_LIVECACHE_LOCKS Locks related to integrated liveCache (if used)
Nameserver topology special The nameserver topology is locked based on a file (/tmp/.hdb_<sid>_<inst_id>_lock)


Record and object locks are mainly linked to the application transactions while read / write locks,
semaphores, mutexes and barriers are managed by SAP HANA internally.

Be aware that the table above concludes from a lock type to a thread state.
The opposite way is not generally valid. For example, there can be "ConditionalVariable Wait"
situation that are not linked to a record lock or object lock

4. To check if my SAP HANA database suffers from lock waits?

On a very elementary level you can identify lock waits based on the thread states.

Current threads can be displayed via:

Transaction DBACOCKPIT: Performance --> Threads
SAP HANA Studio: Performance --> Threads
SQL: "
HANA
_Threads_CurrentThreads" (SAP Note 1969700)
Historic thread activities can be determined via:

SQL: "HANA_Threads_ThreadSamples_FilterAndAggregation" and "HANA_Threads_ThreadSamples_AggregationPerTimeSlice" (SAP Note 1969700)
The following thread states indicate lock wait situations:

Barrier Wait
ConditionalVariable Wait
ExclusiveLock Enter
Mutex Wait
Semaphore Wait
SharedLock Enter
Sleeping

Not every occurrence of these states is critical. It happens frequently that a thread submits a request to another thread and waits for a
related semaphore until the called thread has returned the result.

5. Do timeouts for lock waits exist?

Transactional lock waits are terminated when the time limit defined with the following parameter is exceeded:

indexserver.ini -> [transaction] -> lock_wait_timeout
Its default value 1800000 which represents 1800000 ms / 1800 s / 30 minutes. This means that a lock wait is terminated after 30 minutes and the following error message is issued:

SQL error 131: transaction rolled back by lock wait timeout
This behavior is different from other databases like Oracle where no timeout for exclusive lock waits exist and transactions will wait for the lock until the lock is available or the transaction is manually terminated.

For internal lock waits no timeout is implemented.


6. How long certain types of locks are held?

Normally it is most important to analyze lock wait situations, i.e. the concurrent access to the same resources. For transactional locks you can additionally see how long locks are held (even if no transaction is waiting) by using the following approaches:

M_OBJECT_LOCKS, M_RECORD_LOCKS, M_OBJECT_LOCK_STATISTICS
SQL: "HANA_Locks_Transactional_Current" / SQL: "HANA_Locks_Transactional_Total" (SAP Note 1969700)

11 comments:

  1. Hi syamala,
    Thank You so much for this blog. It helped me lot. I am a Technical Recruiter by profession and first time working on this technology was bit tough for me, this article really helped me a lot to understand the details to get started with.

    I want to use the HANA Predictive Analysis Library in our SAP Cloud Platform and tried to check the installation following the official help site.
    The following statements do return the expected result:
    SELECT * FROM "SYS"."AFL_AREAS" WHERE AREA_NAME = 'AFLPAL';
    SELECT * FROM "SYS"."AFL_PACKAGES" WHERE AREA_NAME = 'AFLPAL';
    SELECT * FROM "SYS"."AFL_FUNCTIONS" WHERE AREA_NAME = 'AFLPAL';
    However, in the catalog the APL procedures don't show up in the _SYS_AFL schema (even though the Functions exist):

    My user is assigned to the roles AFL__SYS_AFL_AFLPAL_EXECUTE and AFL__SYS_AFL_AFLPAL_EXECUTE_WITH_GRANT_OPTION.
    When trying to call a PAL procedure (using the example code from the official help site), I get the error:
    Could not execute 'CALL _SYS_AFL.PAL_KMEANS(PAL_4_COLUMN_DATA_TBL, "#PAL_PARAMETER_TBL", ?, ?, ?, ?, ?)' in 25 ms 398 µs . SAP DBTech JDBC: [328]: invalid name of function or procedure: PAL_KMEANS: line 1 col 15 (at pos 14)

    By the way do you have any YouTube videos, would love to watch it. I would like to connect you on LinkedIn, great to have experts like you in my connection (In case, if you don’t have any issues).
    Please keep providing such valuable information.
    Obrigado,
    Kevin

    ReplyDelete
  2. Greetings Mate,

    Awesome article. Thanks for making that available. I've been using your help to build my own POC and will publish the steps in another blog soon.

    I have always worked on graphical Hana vdm modelling and SQL scripting. Now I now the entire development focus is shifting to CDs views. I have situation where the graphical views already created needs to be wrapped in a abap CDS view. Is it possible to use calculation views inside a CDS definition. I am not getting exactly what syntax should I use to call the calculation view.Would it be same as calling a schema table. Or do we need to wrap the calculation view in external views then use that external views in the CDS views. Can you please let me know you opinion or suggest me the directions which I need to explore?



    Appreciate your effort for making such useful blogs and helping the community.



    Regards,
    Irene Hynes

    ReplyDelete
  3. Hi Syamala,

    Thanks so much for this article! I tried to follow some instructions from few other article and got in over my head. This worked so quickly and your instructions were very easy to follow. Really appreciate this.
    SAP HANA manages concurrency through the use of multi-version concurrency control (MVCC) which gives every transaction a snapshot of the database at a point in time SAP HANA Training .
    After installing and configuring Agile Data Preparation SP23 on my HANA Express Edition 2.2 Server, I encounter the following error when trying to de-duplicate my data:
    pop.addInputNodeMapping('INPUT_1sap.hana.im.adp.target.transformation::FG_383B8DD8CB8E44C1B5ADD8E48804E417_1_TEMPLATE_','INPUT_1sap.hana.im.adp.target.transformation::FG_383B8DD8CB8E44C1B5ADD8E48804E417_1_TEMPLATE_')
    [4437]{205738}[26/1426526] 2018-03-05 08:00:19.066832 e cePlanExec cePlanExecutor.cpp(08147) : Error during Plan execution of model HXE::SAP_HANA_IM_ADP_STAGING:sap.hana.im.adp.target.transformation::FG_383B8DD8CB8E44C1B5ADD8E48804E417_1 (t -1), reason: executor: plan operation failed;Execution of Adapter operation within node CLEANSE_Addresssap.hana.im.adp.target.transformation::FG_383B8DD8CB8E44C1B5ADD8E48804E417_1_TEMPLATE_ failed: exception 141005: Failed to create or initialize HanaTransform object
    [2727]{-1}[-1/-1] 2018-03-05 08:00:20.584812 e xsa:sap.hana.im. SandBox.cpp(01560) : Found the following errors:
    ===========================
    ApplicationException: Last transformation was unsuccessful while running flowgraph. (line 30 position 1 in /sap/hana/im/adp/adp/Errors.xsjslib)
    Could anyone point me into the director where to look for the cause? I could not find any SAP Notes or other documentation regarding this behaviour.


    Excellent tutorials - very easy to understand with all the details. I hope you will continue to provide more such tutorials.

    Obrigado,
    Kevin

    ReplyDelete
  4. Hi,

    "Awesome article"?? "Thanks so much for this article"??

    You mean: stolen from SAP! Like from this SAP Notes:
    1858357 - HANA DB: Analysis of Wait Situations
    1999998 - FAQ: SAP HANA Lock Analysis

    Kind regards,
    Max

    ReplyDelete