  1. phpBB
  2. PHPBB-15847

Invalid UTF-8 encoding on mssqlnative driver




      The previous version of phpBB used this code in mssqlnative.php to build the connection options and connect to an MSSQL database:


      // Connect to database
      $this->db_connect_id = sqlsrv_connect($this->server, array(
      	'Database' => $this->dbname,
      	'UID' => $this->user,
      	'PWD' => $sqlpassword
      ));


      The latest release of phpBB adds a UTF-8 character set declaration to the connection options:

      // Connect to database
      $this->db_connect_id = sqlsrv_connect($this->server, array(
      	'Database' => $this->dbname,
      	'UID' => $this->user,
      	'PWD' => $sqlpassword,
      	'CharacterSet' => 'UTF-8'
      ));

      When I upgraded my site, every post that included Unicode characters started showing invalid character sequences in place of the proper characters. For instance, instead of the opening quote character “ the site would show the raw UTF-8 byte sequence for it (E2 80 9C, typically rendered as â€œ).

      The post_text column in the phpbb_posts table stores that character as its encoded UTF-8 byte sequence (E2 80 9C). Previously these bytes were sent to the browser as is, on a page declared as UTF-8, and the browser correctly displayed the Unicode character.

      When the connection to SQL Server is declared as UTF-8, the driver or SQL Server encodes the already encoded UTF-8 byte sequences stored in the post_text column a second time. The browser decodes only this second, outer encoding, so it displays the post text exactly as it is stored in the database, which is still UTF-8 encoded.
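
      The double encoding described above can be reproduced outside the database. This is only a sketch: it assumes the stored bytes are treated as Windows-1252 text before being converted to UTF-8 (the actual source code page depends on the column collation), it needs the mbstring extension, and the function name double_encode is hypothetical, not part of phpBB or the sqlsrv driver.

```php
<?php
// Hypothetical helper: simulates a connection layer that assumes the stored
// bytes are Windows-1252 text and converts them to UTF-8 once more.
function double_encode(string $storedBytes): string
{
    return mb_convert_encoding($storedBytes, 'UTF-8', 'Windows-1252');
}

// U+201C (opening double quote) already encoded as UTF-8: E2 80 9C.
$stored = "\xE2\x80\x9C";
$sent = double_encode($stored);

echo bin2hex($stored), "\n"; // e2809c
echo bin2hex($sent), "\n";   // c3a2e282acc593

// Decoding the outer layer once gives back the stored bytes, not a fixed page:
// a browser reading $sent as UTF-8 shows the three characters â, € and œ.
$decodedOnce = mb_convert_encoding($sent, 'Windows-1252', 'UTF-8');
echo bin2hex($decodedOnce), "\n"; // e2809c
```

      Each of the three original bytes becomes its own multi-byte UTF-8 character, which matches the symptom of three garbage characters appearing where one quote character was expected.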

      Removing this CharacterSet parameter from the connection string stops the double UTF-8 encoding from occurring and returns proper functionality to the site under SQL Server.





            Assignee: Unassigned
            Reporter: gsmaclean (Inactive)