git svn: do not overescape URLs (fallback case)
authorJonathan Nieder <jrnieder@gmail.com>
Sun, 14 Oct 2012 11:45:21 +0000 (14 04:45 -0700)
committerEric Wong <normalperson@yhbt.net>
Thu, 17 Jan 2013 23:28:12 +0000 (17 23:28 +0000)
Subversion's canonical URLs are intended to make URL comparison easy
and therefore have strict rules about what characters are special
enough to urlencode and what characters should be left alone.

When in the fallback codepath because unable to use libsvn's own
canonicalization function for some reason, escape special characters
in URIs according to the svn_uri__char_validity[] table in
subversion/libsvn_subr/path.c (r935829).  The libsvn versions that
trigger this code path are not likely to be strict enough to care, but
it's nicer to be consistent.

Noticed by using SVN 1.6.17 perl bindings, which do not provide
SVN::_Core::svn_uri_canonicalize (triggering the fallback code),
with libsvn 1.7.5, whose do_switch is fussy enough to care:

  Committing to file:///home/jrn/src/git/t/trash%20directory.\
  t9118-git-svn-funky-branch-names/svnrepo/pr%20ject/branches\
  /more%20fun%20plugin%21 ...
  svn: E235000: In file '[...]/subversion/libsvn_subr/dirent_uri.c' \
  line 2291: assertion failed (svn_uri_is_canonical(url, pool))
  error: git-svn died of signal 6
  not ok - 3 test dcommit to funky branch

After this change, the '!' in 'more%20fun%20plugin!' is not urlencoded
and t9118 passes again.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
perl/Git/SVN/Utils.pm

index 8b8cf37..3d1a093 100644 (file)
@@ -155,7 +155,7 @@ sub _canonicalize_url_path {
 
        my @parts;
        foreach my $part (split m{/+}, $uri_path) {
-               $part =~ s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
+               $part =~ s/([^!\$%&'()*+,.\/\w:=\@_`~-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
                push @parts, $part;
        }