What is Transparent Application Failover (TAF)?
TAF is an Oracle feature that helps maintain database availability by automatically reconnecting client applications to a different database instance when a failure occurs (such as a network failure or server crash) or when an instance is taken down for patching, so that planned maintenance can proceed with near-zero downtime. The goal is minimal service disruption for the application. It is typically used in Oracle Real Application Clusters (RAC) environments but can also be configured for single-instance databases protected by Oracle Data Guard. It requires configuration on both the Oracle side and the application server side.
TAF is handled at the client level: when a failure occurs, Oracle Net Services re-establishes the session with a surviving node, minimizing downtime. TAF supports different failover types, described below.
Key Concepts of TAF:
- Automatic Reconnection: If the application loses connection to the database, TAF attempts to reconnect it to a healthy instance or server.
- Seamless Failover: Reconnection is transparent to the application, so read-only work can continue without the application taking any action. Uncommitted transactions are the exception: they are rolled back, and the application has to restart them.
- Failover Types:
  - Select Failover: Enables automatic re-execution of queries after failover.
  - Session Failover: Reconnects the session but requires applications to restart any uncommitted transactions.
How Does TAF Work?
When a database node fails, TAF ensures that the connection is redirected to a surviving node within the cluster. This is commonly used in Oracle Real Application Clusters (RAC) environments, where multiple database instances share access to the same database.
TAF works as follows:
- A client application connects to the database using a service configured with failover options.
- If the active database instance becomes unavailable, TAF detects the failure.
- The client session is automatically reconnected to a surviving database instance.
- If select failover is configured, open SELECT statements are re-executed on the surviving instance, and fetching resumes after the last row the client had already received.
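The last step above can be pictured with a small model. This is only a conceptual sketch, not Oracle's implementation: the class name SelectFailoverModel and the method remainingRows are made up for illustration, and the query result is represented as an in-memory list.

```java
import java.util.List;

/** Conceptual model only: how a SELECT resumes after failover. Not Oracle's code. */
final class SelectFailoverModel {

    /**
     * After reconnecting, the query is executed again on the surviving instance
     * and the rows the client had already received are skipped.
     */
    static List<String> remainingRows(List<String> reExecutedResult, int alreadyFetched) {
        if (alreadyFetched < 0 || alreadyFetched > reExecutedResult.size()) {
            throw new IllegalArgumentException("alreadyFetched out of range");
        }
        return reExecutedResult.subList(alreadyFetched, reExecutedResult.size());
    }

    public static void main(String[] args) {
        List<String> rows = List.of("r1", "r2", "r3", "r4", "r5");
        // The client had fetched two rows when the instance failed.
        System.out.println(remainingRows(rows, 2)); // prints [r3, r4, r5]
    }
}
```

The model also shows why select failover costs time on large result sets: the whole query runs again before the remaining rows can be returned.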
Configuring TAF
To enable TAF, configure the failover options on the client (in tnsnames.ora or in the connect descriptor the application uses) or on the database service, and make sure the listener knows about the service.
- Oracle Client Configuration: Enable TAF in the client's tnsnames.ora file by adding a FAILOVER_MODE clause to CONNECT_DATA.
Example tnsnames.ora:
MYDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = primary_host)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = secondary_host)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = mydb)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 10)
        (DELAY = 5)
      )
    )
  )
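- Listener Configuration: listener.ora takes no TAF parameters. FAILOVER_MODE belongs in a client connect descriptor and is not a valid SID_DESC entry. The listener only has to know the service, either through dynamic registration or through a static entry such as the one below. To control TAF from the server side instead, set the failover type, method, retries, and delay on the database service (for example with srvctl or DBMS_SERVICE); service-level settings take precedence over the client's FAILOVER_MODE.
Example:
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = mydb)
      (GLOBAL_DBNAME = mydb)
    )
  )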
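TYPE = SELECT: Enables select failover. TYPE = SESSION re-creates the session only.
METHOD = BASIC: Connects to the backup instance only at failover time. The alternative, PRECONNECT, opens the backup connection in advance.
RETRIES: The number of reconnection attempts after a failure (10 in the tnsnames.ora example).
DELAY = 5: The delay in seconds between attempts.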
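RETRIES and DELAY together bound how long a client keeps trying before it gives up. A rough way to reason about that bound is sketched below. The class TafRetryBudget and its method are hypothetical helpers, and the figure ignores the time each connection attempt itself takes, so treat it as a lower estimate of the real worst case.

```java
/** Hypothetical helper: estimates the time a client spends waiting between TAF retries. */
final class TafRetryBudget {

    /** Total delay across all attempts, in seconds: RETRIES x DELAY. */
    static long worstCaseWaitSeconds(int retries, int delaySeconds) {
        if (retries < 0 || delaySeconds < 0) {
            throw new IllegalArgumentException("retries and delaySeconds must be non-negative");
        }
        return (long) retries * delaySeconds;
    }

    public static void main(String[] args) {
        // Values from the tnsnames.ora example: RETRIES = 10, DELAY = 5.
        System.out.println(worstCaseWaitSeconds(10, 5) + " seconds"); // prints 50 seconds
    }
}
```

If the surviving instance normally needs longer than this to start accepting connections, the client gives up before failover can succeed, so the product of the two values should comfortably exceed the expected recovery time.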
- Application Code: On the application side, ensure that the application is configured to support failover. This might include enabling features like automatic reconnect or re-running transactions that were rolled back during failover.
Example: JDBC Connection String
For Java applications using JDBC, TAF is available only through the JDBC OCI driver, because TAF is implemented in the Oracle Call Interface (OCI) client library; the thin driver does not support it. You can configure TAF as follows:
// Uses the OCI driver (jdbc:oracle:oci), which requires an Oracle client installation
String url = "jdbc:oracle:oci:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=node2)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=myservice)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=5)(DELAY=5))))";
Connection conn = DriverManager.getConnection(url, "username", "password");
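Long connect descriptors are easy to mistype, and a single missing parenthesis makes the URL unusable. One option is to assemble the string in code. The sketch below is illustrative only: TafUrlBuilder is a made-up helper, it hard-codes METHOD=BASIC, and it assumes the JDBC OCI driver prefix because TAF is an OCI feature.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that assembles a JDBC OCI URL with a FAILOVER_MODE clause. */
final class TafUrlBuilder {

    static String build(List<String> hosts, int port, String serviceName,
                        String type, int retries, int delaySeconds) {
        // One ADDRESS entry per host, in the order they should be tried.
        String addresses = hosts.stream()
                .map(h -> "(ADDRESS=(PROTOCOL=TCP)(HOST=" + h + ")(PORT=" + port + "))")
                .collect(Collectors.joining());
        return "jdbc:oracle:oci:@(DESCRIPTION=(ADDRESS_LIST=" + addresses + ")"
                + "(CONNECT_DATA=(SERVICE_NAME=" + serviceName + ")"
                + "(FAILOVER_MODE=(TYPE=" + type + ")(METHOD=BASIC)"
                + "(RETRIES=" + retries + ")(DELAY=" + delaySeconds + "))))";
    }

    public static void main(String[] args) {
        System.out.println(build(List.of("node1", "node2"), 1521, "myservice", "SELECT", 5, 5));
    }
}
```

Called with node1, node2, port 1521, service myservice, type SELECT, and 5 for both retries and delay, the helper produces the same descriptor as the example above with the OCI prefix.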
Best Practices for TAF Implementation
- Use Oracle RAC: TAF works best in Real Application Clusters (RAC) where multiple nodes share the database.
- Tune RETRIES and DELAY: Set appropriate values to avoid excessive reconnection attempts.
- Test Failover Scenarios: Regularly simulate failures to confirm that TAF behaves as expected. Involve the application team in these tests, since TAF depends on application-side configuration and error handling as much as on the database setup.
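- Monitor Performance: Use Oracle tools like AWR reports to track the impact of failover events, and check the FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns of V$SESSION (GV$SESSION on RAC) to see which sessions are TAF-enabled and which have failed over.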
Common Failures and Troubleshooting:
- TAF Not Triggering: Ensure that the TAF configuration in tnsnames.ora is correct, that the listeners on every node register the service, and that there is proper connectivity between the client and the server instances.
- Session Failover Problems: If an application misbehaves after a session failover, remember that TAF does not restore session state such as ALTER SESSION settings or PL/SQL package variables. The application must re-establish that state after reconnecting.
- Connection Failures: If failover takes too long or causes other issues, check network latency and database instance health.
Benefits of TAF
- Minimal Downtime: Ensures high availability by automatically reconnecting users.
- Improved User Experience: Users experience little to no disruption when a failure occurs.
- Flexibility: Supports different failover modes tailored to application needs.
- Seamless Integration: Works with Oracle RAC and other Oracle high-availability features.
Considerations and Limitations
- Transaction Loss: Uncommitted transactions are rolled back during failover and must be restarted by the application.
- Application Compatibility: Applications must be designed to handle failover scenarios.
- Limited to SELECT Queries: Only SELECT statements can be automatically resumed in SELECT failover mode.
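Because in-flight transactions are not replayed, applications usually need their own retry logic. The sketch below shows one possible shape for it, under stated assumptions: TafTransactionRetry, SqlWork, and SqlAction are hypothetical names, and only vendor code 25402 (ORA-25402: transaction must roll back) is treated as retryable. Other TAF-related errors may need different handling depending on the application.

```java
import java.sql.SQLException;

/** Hypothetical retry wrapper for transactions interrupted by a TAF failover. */
final class TafTransactionRetry {

    /** One unit of transactional work, ending with its own commit. */
    @FunctionalInterface
    interface SqlWork<T> { T run() throws SQLException; }

    /** A side-effecting database action such as a rollback. */
    @FunctionalInterface
    interface SqlAction { void run() throws SQLException; }

    // ORA-25402: transaction must roll back (raised after failover with an open transaction).
    private static final int ORA_TRANSACTION_MUST_ROLL_BACK = 25402;

    static <T> T runWithRetry(SqlWork<T> work, SqlAction rollback, int maxAttempts)
            throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                if (e.getErrorCode() != ORA_TRANSACTION_MUST_ROLL_BACK) {
                    throw e; // not a failover rollback; let the caller decide
                }
                last = e;
                rollback.run(); // clear the failed transaction before trying again
            }
        }
        throw last; // every attempt hit a failover rollback
    }
}
```

A caller would typically pass a lambda that performs its DML and commit as the work, and conn::rollback as the rollback action. Retrying is only safe when the unit of work can be repeated from the start without side effects outside the database.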
Conclusion
Transparent Application Failover (TAF) in Oracle enhances database availability by automatically redirecting connections to a surviving instance. With proper configuration and application-side error handling, TAF keeps disruption to users minimal, which makes it a valuable component of high-availability solutions for mission-critical applications.
By implementing TAF in conjunction with Oracle RAC, organizations can improve business continuity, minimize downtime, and maintain seamless operations even during database failures or patching.
For more information, please reach out to us.