public class UURIFactoryTest extends TestCase
Test UURIFactory for proper UURI creation across a variety of
important/tricky cases.
Be careful when writing this file: make sure it is saved with UTF-8 encoding.
author: igor, stack, gojomo
public void testSameAsNutchURLFilterBasic()
Check that our 'normalization' does the same as Nutch's.
The before-and-after pairs in this test were taken from the Nutch urlnormalizer-basic
TestBasicURLNormalizer class (December 2006, Nutch 0.9-dev).
final public void testSpaceDoubleEncoding()
Test space plus encoding ([ 1010966 ] crawl.log has URIs with spaces in them).
public void testStartsWithColon()
Ensure that URI strings beginning with a colon are treated
the way browsers treat them (as relative, rather than as absolute
with a zero-length scheme).
public void testStrayPercents()
Ensure that stray '%' characters do not prevent
UURI instances from being created, and are reasonably
escaped when encountered.
public void testTrailingPercents()
Ensure that stray trailing '%' characters do not prevent
UURI instances from being created, and are reasonably
escaped when encountered.
testAbsolute
final public void testAbsolute() throws URIException
testAnchors
final public void testAnchors() throws URIException
A UURI should always be without a 'fragment' segment, which is
unused and irrelevant for network fetches.
See [ 970666 ] #anchor links not trimmed, and thus recrawled.
throws: URIException
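The property checked here can be illustrated with a minimal sketch of fragment trimming. It uses plain string handling rather than UURIFactory itself; the class name FragmentTrimSketch, the helper stripFragment, and the example URI are assumptions made for illustration only.

```java
// Hypothetical stand-in: shows the "no fragment" property with plain strings,
// not the way UURIFactory actually builds a UURI.
class FragmentTrimSketch {

    /** Return the URI string with any '#fragment' suffix removed. */
    static String stripFragment(String uri) {
        int hash = uri.indexOf('#');
        // No '#' means there is no fragment to trim.
        return hash < 0 ? uri : uri.substring(0, hash);
    }

    public static void main(String[] args) {
        // Two links that differ only by anchor collapse to one fetchable URI.
        System.out.println(stripFragment("http://www.example.com/path?query#anchor"));
        System.out.println(stripFragment("http://www.example.com/path?query#other"));
    }
}
```

Both calls print the same string, which is why trimming the anchor avoids recrawling the same resource.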
testBadBaseResolve
final public void testBadBaseResolve() throws URIException
testCurlies
final public void testCurlies() throws URIException
testDnsHost
final public void testDnsHost() throws URIException
testDoubleEncoding
final public void testDoubleEncoding() throws URIException
testRelative
final public void testRelative() throws URIException
testRelativeDblPathSlashes
final public void testRelativeDblPathSlashes() throws URIException
testRelativeEmpty
final public void testRelativeEmpty() throws URIException
Test that an empty relative UURI does the right thing: that we get back the
base.
throws: URIException
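A rough sketch of the expectation, assuming the empty reference is handled as a special case before ordinary resolution. It uses java.net.URI as a stand-in for UURIFactory; the class EmptyRelativeSketch, the helper resolve, and the example URIs are hypothetical.

```java
import java.net.URI;

// Hypothetical stand-in for resolving a relative string against a base.
class EmptyRelativeSketch {

    /** Resolve relative against base; an empty reference yields the base itself. */
    static String resolve(String base, String relative) {
        if (relative.isEmpty()) {
            // Empty reference: treated here as "the base document".
            return base;
        }
        // Any other reference goes through ordinary java.net.URI resolution.
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.example.com/dir/page.html";
        System.out.println(resolve(base, ""));           // the base, unchanged
        System.out.println(resolve(base, "other.html")); // sibling of page.html
    }
}
```

The explicit empty-string check is a design choice of this sketch: it makes the "get back the base" outcome independent of how the underlying resolver treats an empty path.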
testRelativeURIWithTwoSlashes
final public void testRelativeURIWithTwoSlashes() throws URIException
testRelativeWithScheme
final public void testRelativeWithScheme() throws URIException
testSameAsNutchURLFilterBasic
public void testSameAsNutchURLFilterBasic() throws URIException
Check that our 'normalization' does the same as Nutch's.
The before-and-after pairs in this test were taken from the Nutch urlnormalizer-basic
TestBasicURLNormalizer class (December 2006, Nutch 0.9-dev).
throws: URIException
testSpaceDoubleEncoding
final public void testSpaceDoubleEncoding() throws URIException
testStartsWithColon
public void testStartsWithColon() throws URIException
Ensure that URI strings beginning with a colon are treated
the way browsers treat them (as relative, rather than as absolute
with a zero-length scheme).
throws: URIException
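One way to get the browser-like reading with a stock parser is sketched below. java.net.URI rejects a string such as ":foo" outright, so the sketch prefixes "./" to force a relative-path interpretation, as RFC 3986 section 4.2 suggests for path segments containing a colon. This is an assumption about one workable approach, not a description of what UURIFactory does internally; LeadingColonSketch, resolve, and the example URIs are hypothetical.

```java
import java.net.URI;

// Hypothetical stand-in: treat a leading-colon string as a relative path.
class LeadingColonSketch {

    /** Resolve relative against base, reading ":..." as a path, not a scheme. */
    static String resolve(String base, String relative) {
        // A leading ':' would otherwise be parsed as an empty scheme name,
        // which java.net.URI refuses; "./" marks it as a relative path.
        String ref = relative.startsWith(":") ? "./" + relative : relative;
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("http://www.example.com/path/page", ":foo"));
    }
}
```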
testStrayPercents
public void testStrayPercents() throws URIException
Ensure that stray '%' characters do not prevent
UURI instances from being created, and are reasonably
escaped when encountered.
throws: URIException
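What "reasonably escaped" could mean is sketched below under one assumed policy: any '%' that is not followed by two hexadecimal digits is rewritten as "%25", while valid escapes such as "%20" are left alone. The policy, the class StrayPercentSketch, and the helper escapeStrayPercents are illustrative assumptions; the escaping UURIFactory actually applies may differ.

```java
import java.util.regex.Pattern;

// Hypothetical policy: escape only the '%' characters that do not begin
// a valid two-hex-digit escape sequence.
class StrayPercentSketch {

    // '%' not followed by exactly-two-or-more hex digits (negative lookahead).
    private static final Pattern STRAY =
            Pattern.compile("%(?![0-9A-Fa-f]{2})");

    /** Replace each stray '%' with its own escape, "%25". */
    static String escapeStrayPercents(String s) {
        return STRAY.matcher(s).replaceAll("%25");
    }

    public static void main(String[] args) {
        System.out.println(escapeStrayPercents("http://www.example.com/pa%th"));
        System.out.println(escapeStrayPercents("http://www.example.com/a%20b"));
        System.out.println(escapeStrayPercents("http://www.example.com/100%"));
    }
}
```

The same pattern also covers a '%' at the very end of the string, since the lookahead fails when fewer than two characters remain.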
testTabsInURL
public void testTabsInURL() throws URIException
testThreeSlashes
final public void testThreeSlashes() throws URIException
testTilde
final public void testTilde() throws URIException
testTooLongAfterEscaping
final public void testTooLongAfterEscaping()
testTrailingEncodedSpace
final public void testTrailingEncodedSpace() throws URIException
testTrailingPercents
public void testTrailingPercents() throws URIException
Ensure that stray trailing '%' characters do not prevent
UURI instances from being created, and are reasonably
escaped when encountered.
throws: URIException
testTrimSpaceNBSP
final public void testTrimSpaceNBSP() throws URIException