Java Doc for Element.java (jericho-html, package au.id.jericho.lib.html)

java.lang.Object
   au.id.jericho.lib.html.Segment
      au.id.jericho.lib.html.Element

Element
final public class Element extends Segment implements HTMLElementName
Represents an element in a specific document, which encompasses a start tag, an optional end tag and all content in between.

Take the following HTML segment as an example:

<p>This is a sample paragraph.</p>

The whole segment is represented by an Element object. This is comprised of the StartTag "<p>", the EndTag "</p>", as well as the text in between. An element may also contain other elements between its start and end tags.

The term normal element refers to an element having a start tag with a type of StartTagType.NORMAL. This comprises all HTML elements and non-HTML elements.

Element instances are obtained from methods such as StartTag.getElement() and the Segment.findAllElements methods.

See also the HTMLElements class, and the XML 1.0 specification for elements.

Element Structure

The three possible structures of an element are listed below:

Single Tag Element:
Example:
<img src="mypicture.jpg">

The element consists only of a single start tag and has no element content (although the start tag itself may have tag content).
getEndTag() == null
isEmpty() == true
getEnd() == getStartTag().getEnd()

This occurs in the following situations:

  • An HTML element for which the end tag is forbidden.
  • An HTML element for which the end tag is required, but the end tag is not present in the source document.
  • An HTML element for which the end tag is optional, where the implicitly terminating tag is situated immediately after the element's start tag.
  • An empty-element tag.
  • A non-HTML element that is not an empty-element tag but is missing its end tag.
  • An element with a start tag of a type that does not define a corresponding end tag type.
  • An element with a start tag of a type that does define a corresponding end tag type but is missing its end tag.
Explicitly Terminated Element:
Example:
<p>This is a sample paragraph.</p>

The element consists of a start tag, content, and an end tag.
getEndTag() != null
isEmpty() == false (provided the end tag doesn't immediately follow the start tag)
getEnd() == getEndTag().getEnd()

This occurs in the following situations, assuming the start tag's matching end tag is present in the source document:

  • An HTML element for which the end tag is either required or optional.
  • A non-HTML element that is not an empty-element tag.
  • An element with a start tag of a type that defines a corresponding end tag type.
Implicitly Terminated Element:
Example:
<p>This text is included in the paragraph element even though no end tag is present.
<p>This is the next paragraph.

The element consists of a start tag and content, but no end tag.
getEndTag() == null
isEmpty() == false
getEnd() != getStartTag().getEnd()

This only occurs in an HTML element for which the end tag is optional.

The element ends at the start of a tag which implies the termination of the element, called the implicitly terminating tag. If the implicitly terminating tag is situated immediately after the element's start tag, the element is classed as a single tag element.

See the element parsing rules for HTML elements with optional end tags for details on which tags can implicitly terminate a given element.

See also the documentation of the HTMLElements.getEndTagOptionalElementNames method.
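
The three structures can be told apart using only the properties listed for each of them. The following is a minimal sketch of that check in plain Java. It does not call the parser, and the class ElementStructureSketch, the enum Structure and the method classify are hypothetical names introduced purely for illustration.

```java
/** Hypothetical model of the three element structures; not part of the parser's API. */
final class ElementStructureSketch {

    enum Structure { SINGLE_TAG, EXPLICITLY_TERMINATED, IMPLICITLY_TERMINATED }

    /**
     * @param hasEndTag   models getEndTag() != null
     * @param startTagEnd models getStartTag().getEnd()
     * @param elementEnd  models getEnd()
     */
    static Structure classify(boolean hasEndTag, int startTagEnd, int elementEnd) {
        if (hasEndTag) {
            return Structure.EXPLICITLY_TERMINATED;
        }
        // No end tag: the element either stops at the end of its start tag,
        // or runs on until an implicitly terminating tag.
        return elementEnd == startTagEnd
                ? Structure.SINGLE_TAG
                : Structure.IMPLICITLY_TERMINATED;
    }

    public static void main(String[] args) {
        String img = "<img src=\"mypicture.jpg\">";
        System.out.println(classify(false, img.length(), img.length()));

        String p = "<p>This is a sample paragraph.</p>";
        System.out.println(classify(true, "<p>".length(), p.length()));

        String open = "<p>First paragraph.\n<p>Second paragraph.";
        int nextStart = open.indexOf("<p>", 1); // start of the implicitly terminating tag
        System.out.println(classify(false, "<p>".length(), nextStart));
    }
}
```

Running main prints SINGLE_TAG, EXPLICITLY_TERMINATED and IMPLICITLY_TERMINATED, one per line, matching the three examples given in this section.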

Element Parsing Rules

The following rules describe the algorithm used in the StartTag.getElement method to construct an element. The detection of the start tag's matching end tag or other terminating tags always takes into account the possible nesting of elements.

  • If the start tag has a type of StartTagType.NORMAL:
    • If the name of the start tag matches one of the recognised HTML element names (indicating an HTML element):
      • If the end tag for an element of this name is forbidden, the parser does not conduct any search for an end tag and a single tag element is created.
      • If the end tag for an element of this name is required, the parser searches for the start tag's matching end tag.
        • If the matching end tag is found, an explicitly terminated element is created.
        • If no matching end tag is found, the source document is not valid HTML and the incident is logged as a missing required end tag. In this situation a single tag element is created.
      • If the end tag for an element of this name is optional, the parser searches not only for the start tag's matching end tag, but also for any other tag that implicitly terminates the element.
        For each tag (T2) following the start tag (ST1) of this element (E1):
      Note that the syntactical indication of an empty-element tag in the start tag is ignored when determining the end of HTML elements. See the documentation of the Element.isEmptyElementTag() method for more information.
    • If the name of the start tag does not match one of the recognised HTML element names (indicating a non-HTML element):
      • If the start tag is an empty-element tag, the parser does not conduct any search for an end tag and a single tag element is created.
      • Otherwise, section 3.1 of the XML 1.0 specification states that a matching end tag MUST be present, and the parser searches for the start tag's matching end tag.
        • If the matching end tag is found, an explicitly terminated element is created.
        • If no matching end tag is found, the source document is not valid XML and the incident is logged as a missing required end tag. In this situation a single tag element is created.
  • If the start tag has any type other than StartTagType.NORMAL:
    • If the start tag's type does not define a corresponding end tag type, the parser does not conduct any search for an end tag and a single tag element is created.
    • If the start tag's type does define a corresponding end tag type, the parser assumes that a matching end tag is required and searches for it.
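
The outcome of these rules can be summarised as a small decision table. The sketch below is a plain-Java model of the outcomes only, not the parser's implementation: the class ParsingRulesSketch, its enums and its methods are hypothetical, the three HTML end tag cases are modelled as FORBIDDEN, REQUIRED and OPTIONAL, and the results of the tag searches are passed in as booleans rather than computed. The branch for elements that end at an implicitly terminating tag follows the Element Structure section.

```java
/** Hypothetical decision table for the parsing rules; not the parser's own code. */
final class ParsingRulesSketch {

    enum EndTagRule { FORBIDDEN, REQUIRED, OPTIONAL }

    enum Structure { SINGLE_TAG, EXPLICITLY_TERMINATED, IMPLICITLY_TERMINATED }

    /**
     * Start tag of type NORMAL whose name is a recognised HTML element name.
     *
     * @param endTagFound              the matching end tag was found before any implicitly terminating tag
     * @param terminatorAfterStartTag  an implicitly terminating tag directly follows the start tag
     */
    static Structure htmlElement(EndTagRule rule, boolean endTagFound, boolean terminatorAfterStartTag) {
        switch (rule) {
            case FORBIDDEN:
                return Structure.SINGLE_TAG; // no search is conducted at all
            case REQUIRED:
                // A missing required end tag still yields a single tag element.
                return endTagFound ? Structure.EXPLICITLY_TERMINATED : Structure.SINGLE_TAG;
            default: // OPTIONAL
                if (endTagFound) {
                    return Structure.EXPLICITLY_TERMINATED;
                }
                return terminatorAfterStartTag ? Structure.SINGLE_TAG : Structure.IMPLICITLY_TERMINATED;
        }
    }

    /** Start tag of type NORMAL whose name is not a recognised HTML element name. */
    static Structure nonHtmlElement(boolean emptyElementTag, boolean endTagFound) {
        if (emptyElementTag) {
            return Structure.SINGLE_TAG; // no search is conducted
        }
        return endTagFound ? Structure.EXPLICITLY_TERMINATED : Structure.SINGLE_TAG;
    }

    /** Start tag of any type other than NORMAL. */
    static Structure nonNormalType(boolean typeDefinesEndTagType, boolean endTagFound) {
        if (!typeDefinesEndTagType) {
            return Structure.SINGLE_TAG; // no search is conducted
        }
        return endTagFound ? Structure.EXPLICITLY_TERMINATED : Structure.SINGLE_TAG;
    }

    public static void main(String[] args) {
        // An element whose end tag is forbidden.
        System.out.println(htmlElement(EndTagRule.FORBIDDEN, false, false));
        // An element with an optional end tag that is omitted, with content before the next tag.
        System.out.println(htmlElement(EndTagRule.OPTIONAL, false, false));
        // A non-HTML element that is missing its end tag.
        System.out.println(nonHtmlElement(false, false));
    }
}
```

Note that the HTML branch takes no flag for the "/>" syntax, mirroring the rule that the syntactical indication of an empty-element tag is ignored for HTML elements.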

See Also:   HTMLElements


Field Summary
final static Element NOT_CACHED

Element parentElement
    

Constructor Summary
 Element(Source source, StartTag startTag, EndTag endTag)
    

Method Summary
public String getAttributeValue(String attributeName)
     Returns the value of the attribute with the specified name (case insensitive).
public Attributes getAttributes()
     Returns the attributes specified in this element's start tag.
final public List getChildElements()
     Returns a list of the immediate children of this element in the document element hierarchy.
final List getChildElements(int depth)

public Segment getContent()
     Returns the segment representing the content of the element.
int getContentEnd()

public String getDebugInfo()

public int getDepth()
     Returns the nesting depth of this element in the document element hierarchy.
public EndTag getEndTag()
     Returns the end tag of the element.
public FormControl getFormControl()
     Returns the FormControl defined by this element.
public String getName()
     Returns the name of the start tag of this element, always in lower case.
public Element getParentElement()
     Returns the parent of this element in the document element hierarchy.
public StartTag getStartTag()
     Returns the start tag of the element.
public boolean isEmpty()
     Indicates whether this element has zero-length content.
public boolean isEmptyElementTag()
     Indicates whether this element is an empty-element tag.

Field Detail
NOT_CACHED
final static Element NOT_CACHED



parentElement
Element parentElement




Constructor Detail
Element
Element(Source source, StartTag startTag, EndTag endTag)




Method Detail
getAttributeValue
public String getAttributeValue(String attributeName)
Returns the value of the attribute with the specified name (case insensitive).

Returns null if the start tag does not have attributes, no attribute with the specified name exists or the attribute has no value.

This is equivalent to getStartTag().getAttributeValue(attributeName).
Parameters:
  attributeName - the name of the attribute to get.
Returns:
  the value of the attribute with the specified name, or null if the attribute does not exist or has no value.
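
Because null covers more than one situation, it may help to see the lookup modelled in isolation. The following plain-Java sketch is not the library's implementation: the class AttributeLookupSketch is hypothetical, a null map stands in for a start tag without attributes, and a null map value stands in for an attribute that has no value.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical model of a case-insensitive attribute lookup; not the parser's own code. */
final class AttributeLookupSketch {

    /**
     * @param attributes lower-case attribute names mapped to their values; null models a
     *                   start tag without attributes, a null value models a valueless attribute
     */
    static String getAttributeValue(Map<String, String> attributes, String attributeName) {
        if (attributes == null) {
            return null;
        }
        // Names are stored in lower case, so lower-casing the key gives a case-insensitive match.
        return attributes.get(attributeName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> img = new HashMap<>();
        img.put("src", "mypicture.jpg");
        img.put("ismap", null); // attribute present, but without a value

        System.out.println(getAttributeValue(img, "SRC"));   // attribute with a value
        System.out.println(getAttributeValue(img, "ismap")); // attribute without a value
        System.out.println(getAttributeValue(img, "alt"));   // attribute does not exist
        System.out.println(getAttributeValue(null, "src"));  // no attributes at all
    }
}
```

Only the first call yields a value; the other three all yield null, so a caller that needs to distinguish them would have to inspect the attributes returned by getAttributes() rather than rely on this method's return value.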




getAttributes
public Attributes getAttributes()
Returns the attributes specified in this element's start tag.

This is equivalent to getStartTag().getAttributes().
Returns:
  the attributes specified in this element's start tag.
See Also:   StartTag.getAttributes




getChildElements
final public List getChildElements()
Returns a list of the immediate children of this element in the document element hierarchy.

The objects in the list are all of type Element.

See the Source.getChildElements method for more details.
Returns:
  a list of the immediate children of this element in the document element hierarchy, guaranteed not null.
See Also:   Element.getParentElement()




getChildElements
final List getChildElements(int depth)



getContent
public Segment getContent()
Returns the segment representing the content of the element.

This segment spans between the end of the start tag and the start of the end tag. If the end tag is not present, the content reaches to the end of the element.

Note that before version 2.0 this method returned null if the element was empty, whereas now a zero-length segment is returned.
Returns:
  the segment representing the content of the element, guaranteed not null.
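
The span described above can be expressed with plain string positions. This is a minimal sketch under stated assumptions rather than the library's code: the class ContentSpanSketch and its content method are hypothetical, positions are passed in directly instead of being found by a parser, and -1 is used here to signal a missing end tag.

```java
/** Hypothetical model of how the content span is derived; not the parser's own code. */
final class ContentSpanSketch {

    /**
     * @param startTagEnd position just past the start tag
     * @param endTagBegin position where the end tag starts, or -1 if there is no end tag
     * @param elementEnd  position just past the whole element
     */
    static String content(String source, int startTagEnd, int endTagBegin, int elementEnd) {
        // Without an end tag, the content reaches to the end of the element.
        int contentEnd = endTagBegin >= 0 ? endTagBegin : elementEnd;
        return source.substring(startTagEnd, contentEnd);
    }

    public static void main(String[] args) {
        String p = "<p>This is a sample paragraph.</p>";
        System.out.println(content(p, "<p>".length(), p.indexOf("</p>"), p.length()));

        String img = "<img src=\"mypicture.jpg\">";
        // Single tag element: a zero-length string, never null.
        System.out.println(content(img, img.length(), -1, img.length()).length());
    }
}
```

The second call illustrates the behaviour noted for version 2.0 onwards: an element without content produces a zero-length result rather than null.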




getContentEnd
int getContentEnd()



getDebugInfo
public String getDebugInfo()



getDepth
public int getDepth()
Returns the nesting depth of this element in the document element hierarchy.

The Source.fullSequentialParse method should be called after construction of the Source object if this method is to be used.

A top-level element has a nesting depth of 0.

An element formed from a server tag always has a nesting depth of 0, regardless of whether it is nested inside a normal element.

See the Source.getChildElements method for more details.
Returns:
  the nesting depth of this element in the document element hierarchy.
See Also:   Element.getParentElement()
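
One way to picture the relationship between depth and parent is to count the ancestors of an element. The sketch below assumes exactly that relationship, which is consistent with a top-level element having a depth of 0 and no parent, but it is not taken from the library's source. The class DepthSketch and its Node type are hypothetical stand-ins for elements in the hierarchy.

```java
/** Hypothetical model of nesting depth as the number of ancestors; not the parser's own code. */
final class DepthSketch {

    static final class Node {
        final String name;
        final Node parent; // null for a top-level node

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }

        int depth() {
            int depth = 0;
            // Walk up the parent chain; each ancestor adds one level of nesting.
            for (Node n = parent; n != null; n = n.parent) {
                depth++;
            }
            return depth;
        }
    }

    public static void main(String[] args) {
        Node html = new Node("html", null);
        Node body = new Node("body", html);
        Node p = new Node("p", body);
        System.out.println(html.name + " " + html.depth());
        System.out.println(body.name + " " + body.depth());
        System.out.println(p.name + " " + p.depth());
    }
}
```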




getEndTag
public EndTag getEndTag()
Returns the end tag of the element.

If the element has no end tag this method returns null.
Returns:
  the end tag of the element, or null if the element has no end tag.




getFormControl
public FormControl getFormControl()
Returns the FormControl defined by this element.
Returns:
  the FormControl defined by this element, or null if it is not a control.



getName
public String getName()
Returns the name of the start tag of this element, always in lower case.

This is equivalent to getStartTag().getName().

See the Tag.getName method for more information.
Returns:
  the name of the start tag of this element, always in lower case.




getParentElement
public Element getParentElement()
Returns the parent of this element in the document element hierarchy.

The Source.fullSequentialParse method should be called after construction of the Source object if this method is to be used.

This method returns null for a top-level element, as well as any element formed from a server tag, regardless of whether it is nested inside a normal element.

See the Source.getChildElements method for more details.
Returns:
  the parent of this element in the document element hierarchy, or null if this element is a top-level element.
See Also:   Element.getChildElements()




getStartTag
public StartTag getStartTag()
Returns the start tag of the element.
Returns:
  the start tag of the element.



isEmpty
public boolean isEmpty()
Indicates whether this element has zero-length content.

This is equivalent to getContent().length() == 0.

Note that this is a broader definition than that of both the HTML definition of an empty element, which is only those elements whose end tag is forbidden, and the XML definition of an empty element, which is "either a start-tag immediately followed by an end-tag, or an empty-element tag". The other possibility covered by this property is the case of an HTML element with an optional end tag that is immediately followed by another tag that implicitly terminates the element.
Returns:
  true if this element has zero-length content, otherwise false.
See Also:   Element.isEmptyElementTag()




isEmptyElementTag
public boolean isEmptyElementTag()
Indicates whether this element is an empty-element tag.

It is signified by an element with the characters "/>" at the end of the start tag.

This is equivalent to isEmpty() && getStartTag().isEmptyElementTag().

The StartTag.isEmptyElementTag property only checks whether the start tag is syntactically an empty-element tag, whereas this property also makes sure the element is in fact empty.

A syntactical empty-element tag that is not actually empty can occur if the end tag of an HTML element is either required or optional, but the start tag is erroneously terminated with the characters "/>" in the source document. All major browsers ignore the syntactical hint of an empty element in this case, even in an XHTML document, so this parser does the same.
Returns:
  true if this element is an empty-element tag, otherwise false.
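
The difference between the syntactic check and the element-level check can be sketched on plain strings. The class EmptyElementTagSketch and its methods are hypothetical, and the start tag text and content text are supplied directly rather than obtained from a parsed document, so this is an illustration of the stated equivalence and not the library's implementation.

```java
/** Hypothetical model of the two empty-element checks; not the parser's own code. */
final class EmptyElementTagSketch {

    /** Models the start tag check: purely syntactic, looks only for a trailing "/>". */
    static boolean isSyntacticEmptyElementTag(String startTagText) {
        return startTagText.endsWith("/>");
    }

    /** Models isEmpty(): the element has zero-length content. */
    static boolean isEmpty(String contentText) {
        return contentText.length() == 0;
    }

    /** Models the element-level check: empty content and the "/>" syntax together. */
    static boolean isEmptyElementTag(String startTagText, String contentText) {
        return isEmpty(contentText) && isSyntacticEmptyElementTag(startTagText);
    }

    public static void main(String[] args) {
        // Written as an empty-element tag and has no content.
        System.out.println(isEmptyElementTag("<br/>", ""));
        // Erroneously written with "/>" but followed by content that belongs to the element.
        System.out.println(isEmptyElementTag("<p/>", "Some text"));
        // No content, but not written with "/>".
        System.out.println(isEmptyElementTag("<img src=\"mypicture.jpg\">", ""));
    }
}
```

Only the first case is reported as an empty-element tag. The second is the erroneous "/>" situation described above, and the third is empty without using the empty-element syntax.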




Fields inherited from au.id.jericho.lib.html.Segment
final int begin
List childElements
final int end
final Source source

Methods inherited from au.id.jericho.lib.html.Segment
final static StringBuffer appendCollapseWhiteSpace(StringBuffer sb, CharSequence text)
final public char charAt(int index)
public int compareTo(Object o)
final public boolean encloses(Segment segment)
final public boolean encloses(int pos)
final public boolean equals(Object object)
public String extractText()
public String extractText(boolean includeAttributes)
public List findAllCharacterReferences()
public List findAllElements()
public List findAllElements(String name)
public List findAllElements(StartTagType startTagType)
public List findAllElements(String attributeName, String value, boolean valueCaseSensitive)
public List findAllStartTags()
public List findAllStartTags(String name)
public List findAllStartTags(String attributeName, String value, boolean valueCaseSensitive)
public List findAllTags()
public List findAllTags(TagType tagType)
public List findFormControls()
public FormFields findFormFields()
final public int getBegin()
public List getChildElements()
public String getDebugInfo()
final public int getEnd()
public Renderer getRenderer()
public TextExtractor getTextExtractor()
public int hashCode()
public void ignoreWhenParsing()
final public boolean isWhiteSpace()
final public static boolean isWhiteSpace(char ch)
final public int length()
public Attributes parseAttributes()
final public CharSequence subSequence(int beginIndex, int endIndex)
public String toString()

Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
