| java.lang.Object java.nio.charset.CharsetEncoder
CharsetEncoder | abstract public class CharsetEncoder (Code) | | A converter that converts a sequence of 16-bit Unicode characters into a sequence of bytes in a specific charset.
The input character sequence is wrapped by a java.nio.CharBuffer and the output byte sequence by a java.nio.ByteBuffer. An encoder instance should be used in the following sequence, which is referred to as an encoding operation:
- Invoke the CharsetEncoder.reset() method to reset the encoder if it has been used;
- Invoke the CharsetEncoder.encode(CharBuffer,ByteBuffer,boolean) method repeatedly until no additional input is needed, with the endOfInput parameter set to false; the input buffer must be refilled and the output buffer must be flushed between invocations;
- Invoke the CharsetEncoder.encode(CharBuffer,ByteBuffer,boolean) method one last time, with the endOfInput parameter set to true;
- Invoke the CharsetEncoder.flush(ByteBuffer) method to flush the
output.
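The four steps above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names (EncodingOperationSketch, encodeChunks) are hypothetical, UTF-8 is chosen arbitrarily, and both buffers are assumed large enough that the output never needs draining between calls.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingOperationSketch {

    /** Encodes the chunks as ONE encoding operation and returns the bytes produced. */
    static byte[] encodeChunks(String... chunks) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.allocate(64);   // assumption: each chunk fits
        ByteBuffer out = ByteBuffer.allocate(256); // assumption: all output fits

        encoder.reset();                                    // step 1
        for (String chunk : chunks) {
            in.put(chunk).flip();                           // refill the input buffer
            CoderResult r = encoder.encode(in, out, false); // step 2: more input may follow
            if (!r.isUnderflow()) {
                throw new IllegalStateException("unexpected result: " + r);
            }
            in.compact(); // keep chars left unread, e.g. a lone high surrogate
        }
        in.flip();
        CoderResult last = encoder.encode(in, out, true);   // step 3: no more input
        if (!last.isUnderflow()) {
            throw new IllegalStateException("unexpected result: " + last);
        }
        CoderResult flushed = encoder.flush(out);           // step 4
        if (!flushed.isUnderflow()) {
            throw new IllegalStateException("unexpected result: " + flushed);
        }
        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // The surrogate pair U+D83D U+DE00 is deliberately split across two chunks.
        System.out.println(Arrays.toString(encodeChunks("caf\u00e9 ", "a\uD83D", "\uDE00b")));
    }
}
```

The compact() call matters when a chunk ends in the middle of a surrogate pair: the encoder leaves the unpaired character in the input buffer, and it is completed by the next chunk.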
The CharsetEncoder.encode(CharBuffer,ByteBuffer,boolean) method converts as many characters as possible, and stops only when the input characters have run out, the output buffer has been filled, or an error has occurred. A CoderResult instance is returned to indicate the reason for stopping; the invoker can inspect the result and choose a further action, such as refilling the input buffer, flushing the output buffer, or recovering from the error and trying again.
There are two common encoding errors. One is malformed input, which is reported when the input is not a legal 16-bit Unicode character sequence; the other is an unmappable character, which is reported when the input is legal but cannot be mapped to a valid byte sequence in the specific charset.
Both errors can be handled in three ways. The default is to report the error to the invoker through a CoderResult instance; the alternatives are to ignore the error or to replace the erroneous input with the replacement byte array. The replacement byte array is {(byte)'?'} by default and can be changed by invoking the CharsetEncoder.replaceWith(byte[]) method. The invoker of this encoder chooses how each error type is handled by specifying a CodingErrorAction instance via the CharsetEncoder.onMalformedInput(CodingErrorAction) and CharsetEncoder.onUnmappableCharacter(CodingErrorAction) methods.
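As an illustration of the three ways of handling errors, the sketch below encodes a string containing a character with no US-ASCII mapping under each action. The class and method names (ErrorActionSketch, encodeAscii) are made up for this example, and the choice of US-ASCII is an assumption made only to provoke an unmappable-character error.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ErrorActionSketch {

    /** Encodes text to US-ASCII under the given action and decodes the bytes back for display. */
    static String encodeAscii(String text, CodingErrorAction action, byte[] replacement)
            throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        if (replacement != null) {
            encoder.replaceWith(replacement); // only consulted when the action is REPLACE
        }
        ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
        return StandardCharsets.US_ASCII.decode(bytes).toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        String text = "caf\u00e9"; // U+00E9 has no US-ASCII mapping
        System.out.println(encodeAscii(text, CodingErrorAction.REPLACE, null));
        System.out.println(encodeAscii(text, CodingErrorAction.REPLACE, new byte[] {'_'}));
        System.out.println(encodeAscii(text, CodingErrorAction.IGNORE, null));
        try {
            encodeAscii(text, CodingErrorAction.REPORT, null);
        } catch (CharacterCodingException e) {
            System.out.println("reported: " + e);
        }
    }
}
```

With the convenience method encode(CharBuffer), a reported error surfaces as an exception; with encode(CharBuffer,ByteBuffer,boolean) it surfaces as the returned CoderResult instead.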
This class is abstract and encapsulates many operations of the encoding process that are common to all charsets. An encoder for a specific charset should extend this class and need only implement the CharsetEncoder.encodeLoop(CharBuffer,ByteBuffer) method for the basic encoding loop. If a subclass maintains internal state, it should also override the CharsetEncoder.implFlush(ByteBuffer) and CharsetEncoder.implReset() methods.
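A minimal subclass might look like the sketch below. Everything in it is hypothetical: the charset name X-SKETCH-ASCII7, the classes Ascii7 and Ascii7Encoder, and the helper encode are invented for illustration, and the charset is encode-only. The encoder is stateless, so implFlush and implReset are not overridden. Because the sketch has no decoder, it also overrides isLegalReplacement, on the assumption that the default check needs one.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.util.Arrays;

class CustomEncoderSketch {

    /** Hypothetical encode-only charset mapping U+0000..U+007F to single bytes. */
    static final class Ascii7 extends Charset {
        Ascii7() { super("X-SKETCH-ASCII7", null); }
        @Override public boolean contains(Charset cs) { return cs instanceof Ascii7; }
        @Override public CharsetDecoder newDecoder() {
            throw new UnsupportedOperationException("encode-only sketch");
        }
        @Override public CharsetEncoder newEncoder() { return new Ascii7Encoder(this); }
    }

    static final class Ascii7Encoder extends CharsetEncoder {
        Ascii7Encoder(Charset cs) { super(cs, 1.0f, 1.0f); } // one byte per char

        @Override
        protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out) {
            while (in.hasRemaining()) {
                if (!out.hasRemaining()) return CoderResult.OVERFLOW;
                char c = in.get(in.position()); // peek; consume only on success
                // Simplification: surrogates are reported as unmappable, not malformed.
                if (c > 0x7F) return CoderResult.unmappableForLength(1);
                out.put((byte) c);
                in.position(in.position() + 1);
            }
            return CoderResult.UNDERFLOW;
        }

        // Decided directly because this sketch cannot decode the bytes back.
        @Override
        public boolean isLegalReplacement(byte[] repl) {
            return repl.length == 1 && repl[0] >= 0;
        }
    }

    /** Runs a complete encoding operation with the given encoder. */
    static byte[] encode(CharsetEncoder encoder, String text) throws CharacterCodingException {
        ByteBuffer bb = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[bb.remaining()];
        bb.get(bytes);
        return bytes;
    }

    public static void main(String[] args) throws CharacterCodingException {
        System.out.println(Arrays.toString(encode(new Ascii7().newEncoder(), "abc")));
    }
}
```

Error actions, buffer growth and state checking are all inherited from the base class; the subclass only reports what it found through the returned CoderResult.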
This class is not thread-safe.
See Also: java.nio.charset.Charset See Also: java.nio.charset.CharsetDecoder |
Constructor Summary | |
protected | CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar) Constructs a new CharsetEncoder using the given Charset and the average and maximum number of bytes created by this encoder for one input character. | protected | CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement) Constructs a new CharsetEncoder using the given Charset, replacement byte array, and the average and maximum number of bytes created by this encoder for one input character. |
CharsetEncoder | protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar)(Code) | | Constructs a new CharsetEncoder using the given Charset and the average and maximum number of bytes created by this encoder for one input character.
Parameters: cs - this encoder's Charset, which creates this encoder Parameters: averageBytesPerChar - average number of bytes created by this encoder for one input character, must be positive Parameters: maxBytesPerChar - maximum number of bytes which can be created by this encoder for one input character, must be positive throws: IllegalArgumentException - if maxBytesPerChar or averageBytesPerChar is not positive |
CharsetEncoder | protected CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement)(Code) | | Constructs a new CharsetEncoder using the given Charset, replacement byte array, and the average and maximum number of bytes created by this encoder for one input character.
Parameters: cs - this encoder's Charset, which creates this encoder Parameters: averageBytesPerChar - average number of bytes created by this encoder for a single input character, must be positive Parameters: maxBytesPerChar - maximum number of bytes which can be created by this encoder for a single input character, must be positive Parameters: replacement - the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be verified by CharsetEncoder.isLegalReplacement(byte[]) throws: IllegalArgumentException - if any parameter is invalid |
averageBytesPerChar | final public float averageBytesPerChar()(Code) | | Gets the average number of bytes created by this encoder for a single input character.
Returns: the average number of bytes created by this encoder for a single input character |
canEncode | public boolean canEncode(char c)(Code) | | Checks whether the given character can be encoded by this encoder.
Note that this method can change the internal state of this encoder, so it should not be called while another encoding operation is ongoing; otherwise it throws IllegalStateException.
This method can be overridden for performance improvement.
Parameters: c - the given character Returns: true if the given character can be encoded by this encoder throws: IllegalStateException - if another encoding operation is ongoing, so that the current internal state is neither RESET nor FLUSH |
canEncode | public boolean canEncode(CharSequence sequence)(Code) | | Checks whether the given CharSequence can be encoded by this encoder.
Note that this method can change the internal state of this encoder, so it should not be called while another encoding operation is ongoing; otherwise it throws IllegalStateException.
This method can be overridden for performance improvement.
Parameters: sequence - the given CharSequence Returns: true if the given CharSequence can be encoded by this encoder throws: IllegalStateException - if the current internal state is neither RESET nor FLUSH |
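A small sketch of how canEncode might be used to test text against a charset before committing to it. The class and method names (CanEncodeSketch, fitsIn) are hypothetical; a fresh encoder is created for every check so that no other encoding operation can be in progress on it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CanEncodeSketch {

    /** Returns true if the whole text can be encoded in the given charset. */
    static boolean fitsIn(Charset cs, CharSequence text) {
        // A new encoder per call avoids disturbing an ongoing encoding operation.
        return cs.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        System.out.println("ISO-8859-1: " + fitsIn(StandardCharsets.ISO_8859_1, text));
        System.out.println("US-ASCII:   " + fitsIn(StandardCharsets.US_ASCII, text));
    }
}
```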
charset | final public Charset charset()(Code) | | Gets the Charset that created this encoder.
Returns: the Charset that created this encoder |
encode | final public ByteBuffer encode(CharBuffer in) throws CharacterCodingException(Code) | | This is a facade method for the encoding operation.
This method encodes the remaining character sequence of the given character buffer into a new byte buffer. It performs a complete encoding operation: it first resets the encoder, then encodes, and finally flushes.
This method should not be invoked if another encoding operation is ongoing.
Parameters: in - the input buffer Returns: a new ByteBuffer containing the bytes produced by this encoding operation. The buffer's position will be zero and its limit will be the number of bytes produced throws: IllegalStateException - if another encoding operation is ongoing throws: MalformedInputException - if an illegal input character sequence for this charset is encountered, and the action for malformed input is CodingErrorAction.REPORT throws: UnmappableCharacterException - if a legal but unmappable input character sequence for this charset is encountered, and the action for unmappable characters is CodingErrorAction.REPORT. Unmappable means the Unicode character sequence at the input buffer's current position cannot be mapped to an equivalent byte sequence. throws: CharacterCodingException - if another error occurs during the encoding operation |
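The facade can be exercised as in the following sketch, which reads the returned buffer between its position and limit and distinguishes the two reported error types. The names FacadeEncodeSketch and toBytes are hypothetical; the encoders keep their default error action, which is to report.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnmappableCharacterException;
import java.util.Arrays;

class FacadeEncodeSketch {

    /** Encodes the whole string in one call and copies out the produced bytes. */
    static byte[] toBytes(String text, Charset cs) throws CharacterCodingException {
        ByteBuffer bb = cs.newEncoder().encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[bb.remaining()]; // remaining() == limit - position
        bb.get(bytes);
        return bytes;
    }

    public static void main(String[] args) throws CharacterCodingException {
        System.out.println(Arrays.toString(toBytes("h\u00e9", StandardCharsets.UTF_8)));
        try {
            toBytes("\u20ac", StandardCharsets.ISO_8859_1); // legal text, no Latin-1 mapping
        } catch (UnmappableCharacterException e) {
            System.out.println("unmappable, length " + e.getInputLength());
        }
        try {
            toBytes("\uD83D", StandardCharsets.UTF_8); // lone high surrogate
        } catch (MalformedInputException e) {
            System.out.println("malformed, length " + e.getInputLength());
        }
    }
}
```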
encode | final public CoderResult encode(CharBuffer in, ByteBuffer out, boolean endOfInput)(Code) | | Encodes characters starting at the current position of the given input
buffer, and writes the equivalent byte sequence into the given output
buffer from its current position.
The buffers' positions are changed by the reading and writing operations, but their limits and marks are kept intact.
A CoderResult instance is returned according to the following rules:
- A CoderResult.malformedForLength(int) malformed-input result indicates that a malformed-input error was encountered; the erroneous characters start at the input buffer's position, and their number can be obtained from the result's CoderResult.length(). This kind of result can be returned only if the malformed-input action is CodingErrorAction.REPORT.
- CoderResult.UNDERFLOW indicates that as many characters as possible in the input buffer have been encoded. If there is no further input and no characters are left in the input buffer, the task is complete. Otherwise the client should call this method again, supplying more input characters.
- CoderResult.OVERFLOW indicates that the output buffer has been filled while some characters still remain in the input buffer. This method should be invoked again with a non-full output buffer.
- A CoderResult.unmappableForLength(int) unmappable-character result indicates that an unmappable-character error was encountered; the erroneous characters start at the input buffer's position, and their number can be obtained from the result's CoderResult.length(). This kind of result can be returned only if the unmappable-character action is CodingErrorAction.REPORT.
The endOfInput parameter indicates whether the invoker can provide further input. It should be true if and only if the characters in the current input buffer are all of the remaining input for this encoding operation. Note that it is common, and causes no error, for the invoker to pass false and then find that no more input is available; whereas passing true in several consecutive invocations may cause an error, because any input left over after each of them will be treated as malformed.
This method invokes the CharsetEncoder.encodeLoop(CharBuffer,ByteBuffer) method to implement the basic encoding logic for the specific charset.
Parameters: in - the input buffer Parameters: out - the output buffer Parameters: endOfInput - true if all the input characters have been provided Returns: a CoderResult instance indicating the result throws: IllegalStateException - if the encoding operation has already started or no more input is needed in this encoding operation. throws: CoderMalfunctionError - if the CharsetEncoder.encodeLoop(CharBuffer,ByteBuffer) method threw a BufferUnderflowException or BufferOverflowException |
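The result-handling rules can be sketched with a deliberately small output buffer, so that OVERFLOW actually occurs and the buffer has to be drained between invocations. The names CoderResultLoopSketch, encodeWithSmallBuffer and drain are hypothetical. UTF-8 is assumed, and the minimum capacity of 4 is an assumption tied to UTF-8, where one code point needs at most four bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class CoderResultLoopSketch {

    /** Encodes text to UTF-8 through an output buffer of only outCapacity bytes. */
    static byte[] encodeWithSmallBuffer(String text, int outCapacity)
            throws CharacterCodingException {
        if (outCapacity < 4) {
            throw new IllegalArgumentException("need room for one UTF-8 code point");
        }
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(text);          // all input is available up front
        ByteBuffer out = ByteBuffer.allocate(outCapacity);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        while (true) {
            CoderResult r = encoder.encode(in, out, true);
            if (r.isUnderflow()) break;                 // all input consumed
            if (r.isOverflow()) {                       // output full: drain, then retry
                drain(out, sink);
                continue;
            }
            r.throwException();                         // malformed or unmappable (REPORT)
        }
        while (encoder.flush(out).isOverflow()) {
            drain(out, sink);
        }
        drain(out, sink);
        return sink.toByteArray();
    }

    /** Moves the bytes written so far into the sink and empties the buffer. */
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }

    public static void main(String[] args) throws CharacterCodingException {
        System.out.println(encodeWithSmallBuffer("h\u00e9llo \u20ac", 4).length + " bytes");
    }
}
```

Passing true on every call is safe here only because the input buffer holds the complete input from the start; with streamed input, false would be passed until the final chunk.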
encodeLoop | abstract protected CoderResult encodeLoop(CharBuffer in, ByteBuffer out)(Code) | | Encodes characters into bytes. This method is called by CharsetEncoder.encode(CharBuffer,ByteBuffer,boolean).
This method implements the essential encoding operation, and it does not stop encoding until all the input characters have been read, the output buffer has been filled, or an error has been encountered. It then returns a CoderResult object indicating the result of the current encoding operation. The rules for constructing the CoderResult are the same as for CharsetEncoder.encode(CharBuffer,ByteBuffer,boolean).
When an error is encountered during the encoding operation, most implementations of this method return a relevant result object to the CharsetEncoder.encode(CharBuffer,ByteBuffer,boolean) method, while some performance-optimized implementations may handle the error and implement the error action themselves.
The buffers are scanned from their current positions, and their positions are modified accordingly, while their marks and limits remain intact. At most in.remaining() characters will be read, and at most out.remaining() bytes will be written.
Note that some implementations may pre-scan the input buffer and return CoderResult.UNDERFLOW until they receive sufficient input.
Parameters: in - the input buffer Parameters: out - the output buffer Returns: a CoderResult instance indicating the result |
flush | final public CoderResult flush(ByteBuffer out)(Code) | | Flushes this encoder.
This method calls CharsetEncoder.implFlush(ByteBuffer). Some encoders may need to write some bytes to the output buffer after they have read all input characters; subclasses can override CharsetEncoder.implFlush(ByteBuffer) to perform that writing.
The number of bytes written is never larger than out.remaining(). If an encoder wants to write more bytes than the output buffer has space remaining, CoderResult.OVERFLOW is returned, and this method must be called again with a byte buffer that has more space. Otherwise this method returns CoderResult.UNDERFLOW, which means the encoding operation has completed successfully.
During the flush, the output buffer's position is changed accordingly, while its mark and limit remain intact.
Parameters: out - the given output buffer Returns: CoderResult.UNDERFLOW or CoderResult.OVERFLOW throws: IllegalStateException - if this encoder hasn't read all input characters during one encoding operation, that is, if this method is called neither after CharsetEncoder.encode(CharBuffer) nor after CharsetEncoder.encode(CharBuffer,ByteBuffer,boolean) with true for the last boolean parameter |
implFlush | protected CoderResult implFlush(ByteBuffer out)(Code) | | Flushes this encoder. The default implementation does nothing and always returns CoderResult.UNDERFLOW; this method can be overridden if needed.
Parameters: out - the output buffer Returns: CoderResult.UNDERFLOW or CoderResult.OVERFLOW |
implOnMalformedInput | protected void implOnMalformedInput(CodingErrorAction newAction)(Code) | | Notifies the encoder that the CodingErrorAction specified for malformed-input errors has been changed. The default implementation does nothing; this method can be overridden if needed.
Parameters: newAction - the new action |
implOnUnmappableCharacter | protected void implOnUnmappableCharacter(CodingErrorAction newAction)(Code) | | Notifies the encoder that the CodingErrorAction specified for unmappable-character errors has been changed. The default implementation does nothing; this method can be overridden if needed.
Parameters: newAction - the new action |
implReplaceWith | protected void implReplaceWith(byte[] newReplacement)(Code) | | Notifies the encoder that its replacement has been changed. The default implementation does nothing; this method can be overridden if needed.
Parameters: newReplacement - the new replacement byte array |
implReset | protected void implReset()(Code) | | Reset this encoder's charset related state. Default implementation does
nothing, and this method can be overridden if needed.
|
isLegalReplacement | public boolean isLegalReplacement(byte[] repl)(Code) | | Checks whether the given argument is legal as this encoder's replacement byte array.
The given byte array is legal if and only if it can be decoded into 16-bit Unicode characters.
This method can be overridden for performance improvement.
Parameters: repl - the given byte array to be checked Returns: true if the given argument is legal as this encoder's replacement byte array. |
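One way to use this check is to validate a candidate replacement before installing it, so that replaceWith does not throw. The sketch below is illustrative only: ReplacementCheckSketch and trySetReplacement are hypothetical names, and US-ASCII is an arbitrary choice of charset.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class ReplacementCheckSketch {

    /** Installs repl as the replacement if it meets the documented requirements. */
    static boolean trySetReplacement(CharsetEncoder encoder, byte[] repl) {
        if (repl == null || repl.length == 0
                || repl.length > encoder.maxBytesPerChar()
                || !encoder.isLegalReplacement(repl)) {
            return false; // replaceWith would reject this array
        }
        encoder.replaceWith(repl);
        return true;
    }

    public static void main(String[] args) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        System.out.println(trySetReplacement(encoder, new byte[] {'_'}));
        System.out.println(trySetReplacement(encoder, new byte[] {(byte) 0x80})); // not valid US-ASCII
    }
}
```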
malformedInputAction | public CodingErrorAction malformedInputAction()(Code) | | Gets the CodingErrorAction this encoder applies when malformed input occurs during the encoding process.
Returns: this encoder's CodingErrorAction for malformed input |
maxBytesPerChar | final public float maxBytesPerChar()(Code) | | Gets the maximum number of bytes which can be created by this encoder for one input character; the value is always positive.
Returns: the maximum number of bytes which can be created by this encoder for one input character, always positive |
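A common use of this value is to size an output buffer for the worst case, so that a single encode call cannot overflow. The sketch below assumes that approach; BufferSizingSketch, worstCaseBuffer and encodesInOneCall are hypothetical names, and whether the bound also covers bytes written during flush is treated here as an assumption that holds for the charsets shown.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class BufferSizingSketch {

    /** Allocates an output buffer sized from maxBytesPerChar() for charCount input chars. */
    static ByteBuffer worstCaseBuffer(CharsetEncoder encoder, int charCount) {
        int capacity = (int) Math.ceil(charCount * (double) encoder.maxBytesPerChar());
        return ByteBuffer.allocate(capacity);
    }

    /** Returns true if the text was encoded and flushed without any overflow. */
    static boolean encodesInOneCall(Charset cs, String text) {
        CharsetEncoder encoder = cs.newEncoder();
        ByteBuffer out = worstCaseBuffer(encoder, text.length());
        CoderResult r = encoder.encode(CharBuffer.wrap(text), out, true);
        return r.isUnderflow() && encoder.flush(out).isUnderflow();
    }

    public static void main(String[] args) {
        System.out.println(encodesInOneCall(StandardCharsets.UTF_8, "h\u00e9llo \u20ac"));
        System.out.println(encodesInOneCall(StandardCharsets.UTF_16, "ab"));
    }
}
```

The average value from averageBytesPerChar() is better suited to an initial estimate that may still need to grow.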
replaceWith | final public CharsetEncoder replaceWith(byte[] replacement)(Code) | | Sets a new replacement value.
This method first checks the validity of the given replacement, then changes the replacement value, and finally calls the CharsetEncoder.implReplaceWith(byte[]) method with the new replacement as its argument.
Parameters: replacement - the replacement byte array; it cannot be null or empty, its length cannot be larger than maxBytesPerChar, and it must be a legal replacement, which can be verified by isLegalReplacement(byte[] repl) Returns: this encoder throws: IllegalArgumentException - if the given replacement does not satisfy the requirements mentioned above |
replacement | final public byte[] replacement()(Code) | | Gets the replacement byte array, which is never null or empty and is always legal.
Returns: the replacement byte array, which is never null or empty and is always legal |
reset | final public CharsetEncoder reset()(Code) | | Resets this encoder. This method resets the internal state and then calls implReset() to reset any state related to the specific charset.
Returns: this encoder |
unmappableCharacterAction | public CodingErrorAction unmappableCharacterAction()(Code) | | Gets the CodingErrorAction this encoder applies when an unmappable character occurs during the encoding process.
Returns: this encoder's CodingErrorAction for unmappable characters |
|
|