Posted on April 13, 2021 by vladmihalcea

SQL Server JDBC – Set sendStringParametersAsUnicode to false

Last modified:

Are you struggling with performance issues in your Spring, Jakarta EE, or Java EE application?

Imagine having a tool that could automatically detect performance issues in your JPA and Hibernate data access layer long before pushing a problematic change into production!

With the widespread adoption of AI agents generating code in a heartbeat, having such a tool that can watch your back and prevent performance issues during development, long before they affect production systems, can save your company a lot of money and make you a hero!

Hypersistence Optimizer is that tool, and it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, Micronaut, or Play Framework.

So, rather than allowing performance issues to annoy your customers, you are better off preventing those issues using Hypersistence Optimizer and enjoying spending your time on the things that you love!

Introduction

In this article, I’m going to explain why you should always disable the sendStringParametersAsUnicode default JDBC Driver setting when using SQL Server.

Database table

Let’s assume we have the following database table:

SQL Server Post table

The PostID column is the Primary Key, and the Title column is of the VARCHAR type and has a secondary index as well:

CREATE INDEX IDX_Post_Title ON Post (Title)

The Post table contains the following records:

| PostID | Title                                       |
|--------|---------------------------------------------|
| 1      | High-Performance Java Persistence, part 1   |
| 2      | High-Performance Java Persistence, part 2   |
| 3      | High-Performance Java Persistence, part 3   |
| 4      | High-Performance Java Persistence, part 4   |
| ..     | ..                                          |
| 249    | High-Performance Java Persistence, part 249 |
| 250    | High-Performance Java Persistence, part 250 |

As you can see, the Title column is highly selective since every record has a different title value.

Unexpected CONVERT_IMPLICIT and Clustered Index Scan

When finding a Post row by its associated Title column value, we expect an Index Seek operation against the IDX_Post_Title index, but this is not what we get when using the default SQL Server JDBC settings.

For instance, if we enable the runtime query statistics to retrieve the associated execution plan of the SQL query that filters by the Title column:

executeStatement(entityManager, "SET STATISTICS IO, TIME, PROFILE ON");

try (PreparedStatement statement = connection.prepareStatement("""
    SELECT PostId, Title
    FROM Post
    WHERE Title = ? 
    """
)) {

    statement.setString(1, title);

    if (statement.execute() && statement.getMoreResults()) {
        LOGGER.info("Execution plan: {}{}",
            System.lineSeparator(),
            resultSetToString(statement.getResultSet())
        );
    }
}

We get the following SQL execution plan:

|StmtText                                                                                            |
|----------------------------------------------------------------------------------------------------|
|SELECT PostId, Title FROM Post WHERE Title = @P0                                                    |
|  |--Clustered Index Scan(OBJECT:([high_performance_sql].[dbo].[Post].[PK__Post__AA12603828AEBF55]),|
|     WHERE:(CONVERT_IMPLICIT(nvarchar(255),[high_performance_sql].[dbo].[Post].[Title],0)=[@P0]))   |

The Clustered Index Scan operation tells us that SQL Server has used the PostId Clustered Index to scan the leaf pages in search of the Title value we provided.

The reason why the IDX_Post_Title index was not used is because of the implicit conversion that was done between the provided NVARCHAR value and the VARCHAR value of the Title column.

Even if we provided the Title bind parameter value as a VARCHAR using the setString method:

statement.setString(1, title);

The SQL Server JDBC Driver behaved as if we used setNString method instead.

SQL Server JDBC sendStringParametersAsUnicode configuration

By default, SQL Server sends all String parameter values as NVARCHAR since the sendStringParametersAsUnicode configuration is set to true.

So, if we set the sendStringParametersAsUnicode configuration value to false

jdbc:sqlserver://localhost;instance=SQLEXPRESS;databaseName=high_performance_sql;sendStringParametersAsUnicode=false;

And, rerun the previous SQL query, we will get the following execution plan:

|StmtText                                                                        |
|--------------------------------------------------------------------------------|
|SELECT PostId, Title FROM Post WHERE Title = @P0                                |
|  |--Index Seek(OBJECT:([high_performance_sql].[dbo].[Post].[IDX_Post_Title]),  |
|       SEEK:([high_performance_sql].[dbo].[Post].[Title]=[@P0]) ORDERED FORWARD)|

That’s exactly what we were expecting from the start. There’s an Index Seek on the IDX_Post_Title index, and there’s no implicit conversion happening anymore.

Handing Unicode characters

Now, even if you disable the sendStringParametersAsUnicode setting, you can still persist Unicode data in NHAR, NVARCHAR or NLONGVARCHAR column.

So, if the Title column is of the NVARCHAR type:

CREATE TABLE Post (
    PostID BIGINT NOT NULL, 
    Title NVARCHAR(255), 
    PRIMARY KEY (PostID)
)

We can set the Title column using the setNString PreparedStatement method:

try (PreparedStatement statement = connection.prepareStatement("""
    INSERT INTO Post (Title, PostID) 
    VALUES (?, ?)
    """
)) {

    statement.setNString(1, "România");
    statement.setLong(2, 1L);

    assertEquals(1, statement.executeUpdate());
}

And, we can read the Title column using the getNString ResultSet method:

try (PreparedStatement statement = connection.prepareStatement("""
    SELECT Title, PostId
    FROM Post
    WHERE Title = ? 
    """
)) {

    statement.setNString(1, "România");

    try(ResultSet resultSet = statement.executeQuery()) {
        if (resultSet.next()) {
            assertEquals("România", resultSet.getNString(1));
            assertEquals(1L, resultSet.getLong(2));
        }
    }
}

If you’re using JPA and Hibernate, the NVARCHAR column needs to be annotated with the @Nationalized Hibernate annotation to instruct Hibernate that the underlying String attribute needs to be handled by the StringNVarcharType, as opposed to the default StringType:

@Entity(name = "Post")
public class Post {

    @Id
    @Column(name = "PostID")
    private Long id;

    @Column(name = "Title")
    @Nationalized
    private String title;
    
}

Awesome, right?

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

If you’re using SQL Server with Java, it’s a good idea to set the sendStringParametersAsUnicode configuration property value to false, as otherwise, you can face significant performance issues associated with CHAR, VARCHAR or LONGVARCHAR indexes.

In fact, even the SQL Server JDBC Driver performance tuning page recommends disabling this property, which is set to true for backward compatibility reasons.

Follow @vlad_mihalcea

Transactions and Concurrency Control eBook

High-Performance Java Persistence rocks!

Category: Database, Hibernate, SQL Server Tags: configuration, driver, JDBC, NVARCHAR, sendStringParametersAsUnicode, SQL Server, VARCHAR

2 Comments on “SQL Server JDBC – Set sendStringParametersAsUnicode to false”

GP
January 9, 2025

So, why is this setting even there, let alone enabled by default? Chestertons fence (https://en.wiktionary.org/wiki/Chesterton%27s_fence) suggests that there might be cases where this setting is useful? Maybe it was useful in some older version and now the situation changed?

Reply
- vladmihalcea
  January 9, 2025
  
  The SQL Standard defined VARCHAR and NVARCHAR for string values. Prior to UTF-8, VARCHAR was meant for English while NVARCHAR for any other language.
  
  The reason why SQL Server sends string values as NVARCHAR is based on this legacy limitation of the standard. Nowadays, UTF-8 allows us to store string values efficiently so that the ASCII letters found in English take a single byte and just extra letters (e.g., ă, â, ș, ț) require 2 bytes.
  
  Reply