phpBB / PHPBB-11612

Cache/wrap check result for UTF-8 support on regexp

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Minor
    • None
    • 3.0.11
    • Other
    • None

      In /includes/functions_user.php in function validate_username() we have this:

      	// generic UTF-8 character types supported?
      	if ((version_compare(PHP_VERSION, '5.1.0', '>=') || (version_compare(PHP_VERSION, '5.0.0-dev', '<=') && version_compare(PHP_VERSION, '4.4.0', '>='))) && @preg_match('/\p{L}/u', 'a') !== false)
      	{
      		$pcre = true;
      	}
      	else if (function_exists('mb_ereg_match'))
      	{
      		mb_regex_encoding('UTF-8');
      		$mbstring = true;
      	}

      What is checked here should rarely change, so the result should be cached somewhere instead of being computed again and again. Caching it for this function alone would make little sense. However, other code may need the same flexibility: for example, a #hashtag feature would have to parse all posts of a topic and should also accept non-Latin characters, so it would want to use UTF-8 while still providing fallbacks. Such code could then reuse the existing result instead of repeating all those checks and calls.

      Maybe something like $regex_support which can have (constant) values like:

      • PHPBB_RE_UTF8 (PCRE supports Unicode)
      • PHPBB_RE_MBEREG (as a fallback, multibyte EREG can be used) and
      • PHPBB_RE_NONE (fallback on ASCII only).
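
A minimal sketch of how such a cached check might look, assuming a per-request static cache is sufficient. The function name phpbb_regex_support() and the numeric values of the three constants are hypothetical and not part of phpBB; the detection logic is copied from validate_username().

```php
<?php
// Hypothetical constants, named as proposed above; the values are arbitrary.
define('PHPBB_RE_NONE', 0);   // fall back to ASCII only
define('PHPBB_RE_MBEREG', 1); // multibyte EREG can be used
define('PHPBB_RE_UTF8', 2);   // PCRE supports Unicode

/**
* Hypothetical helper: run the detection once per request and
* return the remembered result on every later call.
*/
function phpbb_regex_support()
{
	static $regex_support = null;

	if ($regex_support !== null)
	{
		return $regex_support;
	}

	// Same checks as in validate_username(), executed on the first call only
	if ((version_compare(PHP_VERSION, '5.1.0', '>=') || (version_compare(PHP_VERSION, '5.0.0-dev', '<=') && version_compare(PHP_VERSION, '4.4.0', '>='))) && @preg_match('/\p{L}/u', 'a') !== false)
	{
		$regex_support = PHPBB_RE_UTF8;
	}
	else if (function_exists('mb_ereg_match'))
	{
		mb_regex_encoding('UTF-8');
		$regex_support = PHPBB_RE_MBEREG;
	}
	else
	{
		$regex_support = PHPBB_RE_NONE;
	}

	return $regex_support;
}

// Usage sketch: a hashtag parser picks its pattern from the cached result
$pattern = (phpbb_regex_support() === PHPBB_RE_UTF8) ? '/#[\p{L}\p{N}_]+/u' : '/#[A-Za-z0-9_]+/';
```

Note that mb_regex_encoding('UTF-8') is only called during the first detection here. If other code changes the mbstring regex encoding later in the request, callers relying on PHPBB_RE_MBEREG would have to set it again themselves.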

            Marc
            AmigoJack