SAP HANA:
SAP HANA is a leading in-memory database and data management platform, specifically developed to take full advantage of the capabilities
provided by modern hardware to increase application performance. By keeping all relevant data in main memory (RAM),
data processing operations are significantly accelerated.
The key performance indicators of SAP HANA appeal to many of our customers, and
thousands of deployments are in progress. SAP HANA has become the fastest growing product in SAP’s history.
Memory used for HANA:
As an in-memory database, it is particularly critical for SAP HANA to carefully manage and track its consumption of memory.
For this purpose, SAP HANA manages its own data memory pool by requesting memory from the OS, possibly in advance of using it.
The memory pool is used to store all the in-memory data and system tables, thread stacks, temporary computations and all the other data
structures required for managing a database.
At any given point, only parts of the memory pool are really in use. SAP refers to the total amount of memory actually in use as the SAP HANA Used Memory. This is the most precise indicator of the amount of memory that SAP HANA uses.
The SAP HANA Used Memory consists of code, stack, and data.
Since the code and program stack occupy only about 6 GB, almost all of the Used Memory is in fact used for storing tables, intermediate computations, and database management structures.
It is possible to partition large tables, and even distribute these partitions over multiple hosts.
In such distributed scenarios, this discussion of memory usage applies to each of the hosts of an SAP HANA system, separately.
Memory Sizing:
Memory sizing is the process of estimating, in advance, the amount of memory that will be required to run a certain workload on SAP HANA.
To understand memory sizing, you will need to answer the following questions:
1. What is the size of the data tables that will be stored in SAP HANA?
You may be able to estimate this based on the size of your existing data,
but unless you precisely know the compression ratio of the existing data and the anticipated growth factor,
this estimate may only be partially meaningful.
2. What is the expected compression ratio that SAP HANA will apply to these tables?
The SAP HANA Column Store automatically uses a combination of various advanced compression algorithms (dictionary, RLE, sparse, and more)
to best compress each table column separately. The achieved compression ratio depends on many factors,
such as the nature of the data, its organization and data-types, the presence of repeated values,
the number of indexes (SAP HANA requires fewer indexes), and more.
3. How much extra working memory will be required for DB operations and temporary computations?
The amount of extra memory will somewhat depend on the size of the tables (larger tables will create larger intermediate result-tables in operations like joins),
but even more on the expected workload in terms of the number of users and the concurrency and complexity of the
analytical queries (each query needs its own workspace).
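To make the dictionary compression mentioned in question 2 concrete, here is a minimal Python sketch. This is not HANA code: it only illustrates the idea that each distinct value is stored once in a dictionary while the column itself stores small integer IDs, so columns with many repeated values compress well. The fixed 8-byte value width and bit-packed IDs are simplifying assumptions.

```python
import math

def dictionary_encode(column):
    """Replace each value with an integer index into a sorted dictionary of distinct values."""
    dictionary = sorted(set(column))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in column]

def compression_estimate(column, value_bytes=8):
    """Rough ratio of uncompressed size to (dictionary + bit-packed IDs).

    Assumes fixed-width values; purely illustrative, not HANA's actual accounting.
    """
    dictionary, ids = dictionary_encode(column)
    bits_per_id = max(1, math.ceil(math.log2(max(len(dictionary), 2))))
    compressed = len(dictionary) * value_bytes + len(ids) * bits_per_id / 8
    uncompressed = len(column) * value_bytes
    return uncompressed / compressed
```

For example, a column of 1,000 country codes drawn from only two distinct values compresses by well over an order of magnitude under this model, which is why repeated values figure so prominently in the list of factors above.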
SAP HANA provides some control over compression, via the "optimize_compression" configuration section of the indexserver.
More details on data compression can be found in "How to Calculate Data Compression Rates in SAP HANA" (https://cw.sdn.sap.com/cw/docs/DOC-145836).
SAP Notes 1514966, 1637145 and 1736976 provide additional tools and information to help you size the required amount of memory,
but the most accurate method is ultimately to import several representative tables into an SAP HANA system,
measure the memory requirements, and extrapolate from the results.
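The "measure and extrapolate" step above can be sketched as a simple computation. This assumes, as a rough approximation, that a table's in-memory size scales linearly with its row count at a fixed compression ratio; the table names and numbers are hypothetical.

```python
def size_estimate_gb(samples, target_rows):
    """Extrapolate total memory from measured sample tables.

    samples:     {table_name: (rows_loaded, gb_measured)}
    target_rows: {table_name: expected_row_count_in_production}
    Assumes memory grows linearly with row count (a simplification).
    """
    return sum(gb * target_rows[name] / rows
               for name, (rows, gb) in samples.items())

# Hypothetical example: a 1M-row sample of LINEITEM measured at 2 GB,
# production expected to hold 10M rows.
estimate = size_estimate_gb({"LINEITEM": (1_000_000, 2.0)},
                            {"LINEITEM": 10_000_000})
```

Remember to leave generous headroom on top of such an estimate for the working memory discussed in question 3.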
The next section will explain how to measure and understand SAP HANA Used Memory.
This description corresponds to SAP HANA revision 70 (SAP HANA 1.0 SPS7) or later.
To display the current size of the Used Memory:
select HOST, round(INSTANCE_TOTAL_MEMORY_USED_SIZE/1024/1024/1024, 2) as "Used Memory GB" from M_HOST_RESOURCE_UTILIZATION
An SAP HANA system consists of multiple services that all consume some memory. Of particular interest is the "indexserver" service,
the main database service. The indexserver holds all the data tables and temporary results, and therefore dominates the SAP HANA Used Memory.
You can drill down into Used Memory. To obtain the amount of indexserver Used Memory:
select HOST, round(TOTAL_MEMORY_USED_SIZE/1024/1024/1024, 2) as "Used Memory GB" from M_SERVICE_MEMORY where SERVICE_NAME = 'indexserver'
To see the peak amount of indexserver Used Memory since the server was restarted:
select top 1 HOST, SERVER_TIMESTAMP, round(TOTAL_MEMORY_USED_SIZE/1024/1024/1024, 2) as "Used Memory GB" from _SYS_STATISTICS.HOST_SERVICE_MEMORY where SERVICE_NAME = 'indexserver' order by TOTAL_MEMORY_USED_SIZE desc
The following query shows the value of indexserver Used Memory at 7:00 AM during each of the last 30 days:
select top 30 HOST, SERVER_TIMESTAMP, round(TOTAL_MEMORY_USED_SIZE/1024/1024/1024, 2) as "Used Memory GB" from _SYS_STATISTICS.HOST_SERVICE_MEMORY where SERVICE_NAME = 'indexserver' and hour(SERVER_TIMESTAMP) = 7 and minute(SERVER_TIMESTAMP) = 0 order by SERVER_TIMESTAMP desc
SAP HANA Tables:
Column Tables:
The following simple query provides a high-level overview of the amount of memory used for column tables:
select round(sum(MEMORY_SIZE_IN_TOTAL)/1024/1024) as "Column Tables MB Used" from M_CS_TABLES
or providing per-schema details:
select SCHEMA_NAME as "Schema", round(sum(MEMORY_SIZE_IN_TOTAL) /1024/1024) as "MB Used" from M_CS_TABLES GROUP by SCHEMA_NAME order by "MB Used" desc
Note that SAP HANA loads column tables into memory column-by-column only upon use. This is sometimes called “lazy loading”. Hence, columns that are never used will not be loaded, which avoids memory waste. When SAP HANA runs out of allocatable memory, it will try to unload unimportant data (such as caches) and even least recently used table columns to free up some memory.
Thus, if it is important to precisely measure the total, or worst-case, amount of memory used for a particular table, it is best to ensure that the table is first loaded in its entirety, by executing the following SQL statement:
load TABLE_NAME all
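The lazy-loading and least-recently-used unloading behavior described above can be illustrated with a small sketch. This is a toy model, not HANA's actual memory manager: columns are loaded on first access and the least recently used ones are unloaded when a size limit would be exceeded.

```python
from collections import OrderedDict

class ColumnCache:
    """Toy model of lazy column loading with LRU unloading (illustrative only)."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.loaded = OrderedDict()  # column name -> size, oldest access first

    def access(self, column, size):
        if column in self.loaded:
            self.loaded.move_to_end(column)  # mark as most recently used
            return
        # Lazy load: the column enters memory only now, on first use.
        while self.loaded and sum(self.loaded.values()) + size > self.limit:
            self.loaded.popitem(last=False)  # unload least recently used column
        self.loaded[column] = size
```

In this model, a column that is never accessed never occupies memory, and a full `load ... all` corresponds to touching every column up front so that later measurements reflect the table's complete footprint.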
You can use the following technique to examine the amount of memory consumed by a specific table. This also shows which of its columns are loaded, and the compression ratio that was accomplished. For example, list all tables for schema “SYSTEM”:
select TABLE_NAME as "Table", round(MEMORY_SIZE_IN_TOTAL/1024/1024) as "MB Used" from M_CS_TABLES where SCHEMA_NAME = 'SYSTEM' order by "MB Used" desc
Or drill down into columns of a single table, for instance the table "LineItem", to view the actual size of the data, the “delta changes” and the compression ratio for each of its columns
select COLUMN_NAME as "Column", LOADED, round(UNCOMPRESSED_SIZE/1024/1024) as "Uncompressed MB", round(MEMORY_SIZE_IN_MAIN/1024/1024) as "Main MB", round(MEMORY_SIZE_IN_DELTA/1024/1024) as "Delta MB", round(MEMORY_SIZE_IN_TOTAL/1024/1024) as "Total Used MB", round(COMPRESSION_RATIO_IN_PERCENTAGE/100, 2) as "Compr. Ratio" from M_CS_COLUMNS where TABLE_NAME = 'LineItem'
Row Tables:
Some system tables are in fact row store tables. To get a sense of the total amount of memory used for these row tables, you can use the following query:
select round(sum(USED_FIXED_PART_SIZE + USED_VARIABLE_PART_SIZE)/1024/1024) as "Row Tables MB Used" from M_RS_TABLES
To examine the memory consumption of row tables of a particular schema, for instance the schema ‘SYS’, drill down as follows:
select SCHEMA_NAME, TABLE_NAME, round((USED_FIXED_PART_SIZE + USED_VARIABLE_PART_SIZE)/1024/1024, 2) as "MB Used" from M_RS_TABLES where SCHEMA_NAME = 'SYS' order by "MB Used" desc, TABLE_NAME
Memory Pool:
SAP HANA pre-allocates and manages its own memory pool, used for storing in-memory tables, for thread stacks, and for temporary results and other system data structures.
When more memory is required for table growth or temporary computations, the SAP HANA memory manager obtains it from the pool.
When the pool cannot satisfy the request, the memory manager will increase the pool size by requesting more memory from the operating system,
up to a predefined Allocation Limit. By default, the allocation limit is set to 90% of the first 64 GB of physical memory on the host plus 97% of each further GB.
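The default allocation-limit formula just stated can be written as a small function. The arithmetic follows directly from the text; treating physical memory as a single GB figure is a simplification.

```python
def default_allocation_limit_gb(physical_gb):
    """Default SAP HANA allocation limit:
    90% of the first 64 GB of physical memory, plus 97% of every GB beyond that."""
    first = min(physical_gb, 64.0)
    rest = max(physical_gb - 64.0, 0.0)
    return 0.90 * first + 0.97 * rest
```

For example, a host with 128 GB of RAM gets a default limit of 0.90 * 64 + 0.97 * 64 = 119.68 GB.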
You can see the allocation limit on the Overview tab of the Administration perspective of the SAP HANA studio, or view it with SQL:
select HOST, round(ALLOCATION_LIMIT/1024/1024/1024, 2) as "Allocation Limit GB" from PUBLIC.M_HOST_RESOURCE_UTILIZATION
Table (column or partition) unloading is generally undesirable, since it degrades performance later, when the data has to be reloaded for the queries that need it. You can identify pool exhaustion by examining the M_CS_UNLOADS system view. For instance, the following query returns the number of unloads during a particular one-hour time slot:
select count(*) from M_CS_UNLOADS where UNLOAD_TIME between '19.08.2013 09:00:00' and '19.08.2013 10:00:00'