[PHPBB3-217] Multiline [url] not Converted Created: 25/Mar/06  Updated: 22/Jan/17  Resolved: 05/Jun/11

Status: Closed
Project: phpBB3
Component/s: Posting
Affects Version/s: 3.0.x
Fix Version/s: 3.0.9-RC1

Type: Bug Priority: Minor
Reporter: curtisj Assignee: rxu
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by PHPBB3-9864 Multiple line URLs using [url] tag ar... Closed
is duplicated by PHPBB3-9957 [url=...]Text \n with linebreak[/url]... Closed
GitHub Pull Request URL: https://github.com/flxf/phpbb3/tree/ticket/phpbb3-217
Fix URL: https://github.com/phpbb/phpbb3/pull/155
phpBB Import Key: http://www.phpbb.com/bugs/all/ticket.php?ticket_id=1309

 Description   

When using a [ url ] BBCode tag whose text content spans multiple lines, the regex will not match and convert to markup accordingly.

Please refer to this test post in the development forums for an example.



 Comments   
Comment by cheater512 [ 25/Mar/06 ]

Would a simple m after the preg delimiter fix it?
Might be a bit slower though.

Comment by curtisj [ 25/Mar/06 ]

Well, the "s" modifier is suited for such regex matching, and I haven't taken a look at the source code yet. I'm not a regular when it comes to phpBB source code, and happened upon it by accident. If the s modifier isn't already in use for the [url] regex pattern match, then that's probably all it would take to fix it.

Comment by Meik Sievertsen [X] (Inactive) [ 25/Mar/06 ]

This is not a bug. Newlines are not allowed in urls (or do you have seen them in urls recently?).

Comment by clodo [ 11/Jun/09 ]

The patch:
Open "includes/message_parser.php"
Find

array('#\[url(=(.*))?\](.*)\[/url\]#iUe'

Add the 's':

array('#\[url(=(.*))?\](.*)\[/url\]#iUes'

.

------------
Can i ask why can't be fixed?
Multiline urls are allowed in html, are allowed in ALL board engine i test (Invision, vBulletin, SMF for example).
I'm developing tools that generate bbcodes... ok that doesn't exists an official standard, but exists some "agreement" that are respected by all board, except phpBB. This issue is one of them.
I'm only want to understand why, thanks.

Comment by curtisj [ 12/Jun/09 ]

I think there may have been a misunderstanding when I initially posted this.

I wasn't pointing out that line terminators should be allowed within the actual URI, but within the textual data that's displayed:

[url=http://www.phpbb.com/]phpBB
forums[/url]

Currently, the above BBCode will not be parsed at all. Even if the devs would prefer it doesn't actually insert line breaks, it may be helpful to some users if the BBCode parser at least parse the tag, converting the line breaks into spaces. Most, if not all, user agents will parse this just fine:

<a href="http://www.phpbb.com/">phpBB
forums</a>

A patched regex pattern should not apply the /s modifier to the whole pattern, but only the group matching the text between the tags. It would make life easier to start using the /x modifier, too. It's a lot easier to maintain complex regex patterns.

Open "includes/message_parser.php"
Find

			'url'			=> array('bbcode_id' => 3,	'regexp' => array('#\[url(=(.*))?\](.*)\[/url\]#iUe' => "\$this->validate_url('\$2', '\$3')")),

In-line Find

'#\[url(=(.*))?\](.*)\[/url\]#iUe'

In-line replace:

'#\[url(=(.*))?\]((?s).*)\[/url\]#iUe'

Just a note: the BBCode parser will, in fact, convert users' line breaks to <br> tags.

Comment by clodo [ 12/Jun/09 ]

I quote curtisj.
Not also multiline url are allowed by html standard, but also Invision, vBulletin, SMF, in general all forum board engine i tested support multiline url.
Invision doesn't support in 1.x release, but was fixed at least in 2.x.

In my specific case, i have a tools that convert news from html to bbcode format, and multiline url are frequently used; for example, different thumbnail images, separated every X image, for creating a grid of screenshots; all surrounded by unique url to a complete gallery.
What i need to do, repeat the URL tag every line?

Honestly, if all the board engine, and also HTML, allow multiline url, this issue in phpBB must be a bug, imho. Secondly, if you don't want to fix, the ticket status "Will not fix" may be more correct.

Comment by Meik Sievertsen [X] (Inactive) [ 12/Jun/09 ]

To put it in the correct context - we won't allow URL's to span over multiple lines (obviously, because this is not allowed), regarding the explanation, text within the a tags - this is another story. We currently discuss this.

Comment by suitlocal [X] (Inactive) [ 12/Jun/09 ]

@Clodo - if it is by design it is by definition "not a bug". may be it is not a good design decision but from what has been said here it is by design all the same and hence "not a bug".

do yourself a favor and get a dictionary.

Comment by clodo [ 13/Jun/09 ]

I'm trying to explain better...

If you consider "bbcode" as "shortcut to html language", this issue is a bug, imho.
If you consider "bbcode" as "phpbb posting language", this issue is a design discussion.

Normally, bbcode are considered an html shortcut languages, to avoid interoperability problems between board (for example, a news written in bbcode syntax and posted in many different forum).
This issue cause this type of problem.

But i understand that Acyd Burn consider phpBBCode a specific phpBB markup language, so this issue maybe closed as "Will not fix".

Thanks Acyd for your attention.

Sorry for my english.

Comment by curtisj [ 13/Jun/09 ]

Clodo:

If you consider "bbcode" as "shortcut to html language", this issue is a bug, imho.


Actually, an abstraction over HTML might be a better description.

Clodo:

But i understand that Acyd Burn consider phpBBCode a specific phpBB markup language, so this issue maybe closed as "Will not fix".


Acyd Burn said it's currently under discussion, so let's just leave it at that. IMO, it's not that imperative that anyone try to persuade them, one way or the other.

The modification is also very simple, so if you want to run your boards with the patch, you can. It should be very easy to maintain in future releases, even if a patch isn't integrated officially.

Comment by clodo [ 13/Jun/09 ]

@curtisj
You are right, i already patched by phpBB.
But i also have a tool that generate bbcodes compliant with many forum engine; So, i need to add a disclaimer "phpBB not supported" or "you need to manually patch your phpBB to use the generated bbcodes"; to avoid that, i'm here.
My only alternative is detect multiline-url and repeat the url tag every line, but is a lot of work...

Comment by suitlocal [X] (Inactive) [ 21/Jun/09 ]

If you consider "bbcode" as "shortcut to html language", this issue is a bug, imho.
If you consider "bbcode" as "phpbb posting language", this issue is a design discussion.

as a shortcut to html bbcode is totally inadequate without the addition of custom bbcodes and even then html tags can have optional tags whereas i do not know of a single bbcode engine that supports that.

really bbcode is its own beast. on phpbb bbcode is "phpbb posting language" on smf bbcode is "smf posting language" on ipb bbcode is "ipb posting language". there is no formal ietf standard for bbcode and you should be thankful that informally it has become as standarised as it has.

@curtisj
You are right, i already patched by phpBB.
But i also have a tool that generate bbcodes compliant with many forum engine; So, i need to add a disclaimer "phpBB not supported" or "you need to manually patch your phpBB to use the generated bbcodes"; to avoid that, i'm here.
My only alternative is detect multiline-url and repeat the url tag every line, but is a lot of work...

why single out phpbb? have you tested the bbcode your tool generates on every single product in the universe that supports bbcode? i ask because to say "phpBB not supported" implies that everything else is which is a lie - you cannot know that. you should do a white list if anything - say what is officially supported - not what is not.

Comment by clodo [ 29/Jun/09 ]

as a shortcut to html bbcode is totally inadequate without the addition of custom bbcodes and even then html tags can have optional tags whereas i do not know of a single bbcode engine that supports that.

Ok, shortcut is a wrong word. Maybe "selection" of a subset of html tag?

really bbcode is its own beast. on phpbb bbcode is "phpbb posting language" on smf bbcode is "smf posting language" on ipb bbcode is "ipb posting language". there is no formal ietf standard for bbcode and you should be thankful that informally it has become as standarised as it has.

There isn't any standard, also because exists many different implementation.
But, imho, a possibile standardization is desirable, in the future.
To reach this possible objective, many common board engine apply the same rules to the "common" bbcodes. Also, to avoid problem if an author write an article in bbcode syntax and want to publish in many board.

Another example of this is the "size" tag: 1 to 7 are fixed "common" values, from little to biggest, and Invision/SMF/vBulletin use the same dimension/rules. A coincidence? maybe not.

why single out phpbb? have you tested the bbcode your tool generates on every single product in the universe that supports bbcode? i ask because to say "phpBB not supported" implies that everything else is which is a lie - you cannot know that. you should do a white list if anything - say what is officially supported - not what is not.

I have a public list of board engine where i test my tools (Invision, phpBB, SMF, vBulletin, Osiris), and give support/bugfix for that list.
The problem of this issue was reported by many users: now i'm here, and if isn't fixed, i need to specify why. If i simply remove "phpBB" from the supported list, people may ask to me why i don't support it.
phpBB is my favorite board engine, i understand that look as a "spite", but isn't. Simply, i can't have alternative right now.

Comment by Meik Sievertsen [X] (Inactive) [ 24/Oct/09 ]

I will revert this because this causes the following text to break (which is more sever than not having multiline support):

test [url] test
test [url=http://www.phpbb.com/]test[/url] test
Comment by rxu [ 31/Jan/10 ]

Actually, current regexp will brake the same text if wrote without newline as well. So, the general behavior stays the same.

test [url] test test [url=http://www.phpbb.com/]test[/url] test
Comment by rxu [ 06/Nov/10 ]

Proposed patch is similar to the reverted one and still brakes the text given by Meik above.

Comment by flxf [ 06/Nov/10 ]

Patch revised.

For any matches with invalid urls as in the example given by Meik, ignores first [url] tag and attempts to rematch from the subsequent [url] tag if it exists.

Comment by rxu [ 06/Nov/10 ]

Thanks for the patch.
The problem I see in it is that it's targeted on the particukar text braking problem, thus it's not universal. For example, it brakes the link text here

test [url=http://www.phpbb.com/]test
[url]http://phpbb.com[/url] test

The point is that we need universal solution that won't brake the text for the most of cases.

Comment by flxf [ 06/Nov/10 ]

The problem is that, as noted above, there is an absence of a standard to how to parse the bbcode in a 'standard' way. The fact of the matter is that the cases given above, it is bad input and we have undefined behaviour. Arguably, my patch produces 'a' parse that works. Do want that the inner-most tags take precedence?

Comment by naderman [ 05/Jun/11 ]

The github pull request states that the patch needs more fixing, so reopening. Please provide a new patch.

Comment by rxu [ 05/Jun/11 ]

The fix has been adjusted, also tests for url bbcode have been added.

Generated at Sat Oct 20 12:23:21 UTC 2018 using JIRA 7.9.2#79002-sha1:3bb15b68ecd99a30eb364c4c1a393359bcad6278.