Details

    • Type: Bug
    • Status: Unverified Fix
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0.8
    • Fix Version/s: 3.0.10-RC1
    • Component/s: Posting
    • Labels:
      None

      Description

      The word filter can be easily evaded using control characters (Unicode characters 00 - 0F). The trick does not work with NUL (00), but I have tested it with SOH (01) and STX (02) and it works with both of them. All a user has to do when posting is insert one of these control characters into the word they don't want filtered, and the word is allowed in the post.

      My proposed fix, which I would be happy to implement myself, would be to simply strip all control characters from the post. I've never seen a control character used genuinely on a bulletin board.
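      To illustrate the report, here is a minimal JavaScript sketch of the bypass and of the proposed stripping step. The `censor` and `stripControlChars` functions are hypothetical stand-ins written for this example; they are not phpBB's actual filter code, and the range 00 - 0F is simply the one named above.

```javascript
// Hypothetical naive word filter: replaces each censored word with asterisks.
function censor(text, badWords) {
  let out = text;
  for (const word of badWords) {
    out = out.split(word).join("*".repeat(word.length));
  }
  return out;
}

// Strip the range named in the report, U+0000 to U+000F.
// Caveat: this range also contains tab (U+0009), line feed (U+000A)
// and carriage return (U+000D).
function stripControlChars(text) {
  return text.replace(/[\u0000-\u000F]/g, "");
}

const post = "this is a wo\u0001rd";
console.log(censor(post, ["word"]) === post);            // true: the filter missed the word
console.log(censor(stripControlChars(post), ["word"]));  // this is a ****
```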

      If you want to replicate this yourself and your keyboard doesn't allow you to type these characters, go into a JavaScript console (easiest) and type console.log('wo\u0001rd'), then copy the output into a post. You will not be able to see the control character.
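      Because the control character is invisible, it can help to confirm it is really there. The following is a small sketch for a JavaScript console; `codePoints` is a hypothetical helper name chosen for this example.

```javascript
// List each character of a string as a U+XXXX code point.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const sample = "wo\u0001rd";
console.log(sample.length);                  // 5: one more than the four visible letters
console.log(codePoints(sample).join(", "));  // U+0077, U+006F, U+0001, U+0072, U+0064
```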

        Activity

        Oleg Oleg [X] (Inactive) added a comment - - edited

        Unicode has some useful ones, e.g. http://en.wikipedia.org/wiki/Soft_return. But stripping 0x0-0xf is probably ok.

        On the other hand, if someone really wanted to defeat the word censor they could simply spell out bad w o r d s spaced out like that, embed zero-length bbcodes in them ([b][/b]) or a myriad of other ways.

        callum95 callum95 added a comment -

        Zero-length bbcodes are stripped automatically, but ba[b]d[/b]word would get through the filter. They could also write the word backwards and use an RLO character, but that's probably a tad extreme.
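        As a rough illustration of the BBCode case, one option would be to match censored words against a copy of the post with tags removed. This is only a sketch under assumptions: `stripBBCodeTags` and `containsCensoredWord` are hypothetical names, the tag pattern is a simplification, and nothing here reflects how phpBB actually parses BBCode.

```javascript
// Hypothetical helper: remove BBCode-style tags such as [b], [/b] or [url=...].
function stripBBCodeTags(text) {
  return text.replace(/\[\/?[a-z*]+(?:=[^\]]*)?\]/gi, "");
}

// Match against a tag-free, lower-cased copy; the original post is left untouched.
function containsCensoredWord(text, badWords) {
  const plain = stripBBCodeTags(text).toLowerCase();
  return badWords.some((word) => plain.includes(word.toLowerCase()));
}

console.log(containsCensoredWord("ba[b]d[/b]word", ["badword"])); // true
console.log(containsCensoredWord("b a d w o r d", ["badword"]));  // false: spacing still evades it
```

        The second call shows the limit of this approach: spaced-out spellings still get through.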

        Oleg Oleg [X] (Inactive) added a comment -

        Any objections to merging this change?

        Oleg Oleg [X] (Inactive) added a comment -

        It looks like an appropriate unicode font is required to display control characters at zero width. On my system I get the generic unicode symbol for \u0001.

        naderman Nils Adermann added a comment -

        The patch deletes good characters like line feeds
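        One way to address this objection, sketched in JavaScript, is to exclude the whitespace controls from the stripped range. Which characters to keep is an assumption here (tab, line feed and carriage return), and `stripUnsafeControlChars` is a hypothetical name; the actual patch may differ.

```javascript
// Strip U+0000 to U+000F except tab (U+0009), line feed (U+000A)
// and carriage return (U+000D), which legitimate posts rely on.
function stripUnsafeControlChars(text) {
  return text.replace(/[\u0000-\u0008\u000B\u000C\u000E\u000F]/g, "");
}

const multiLinePost = "line one\nwo\u0001rd\ttabbed";
console.log(stripUnsafeControlChars(multiLinePost) === "line one\nword\ttabbed"); // true
```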

        callum95 callum95 added a comment -

        https://github.com/phpbb/phpbb3/pull/334

          People

          • Assignee:
            bantu Andreas Fischer
            Reporter:
            callum95 callum95
          • Votes:
            0
            Watchers:
            0
