| java.lang.Object java.lang.Thread org.jgroups.stack.UpHandler org.jgroups.protocols.FD_SOCK
FD_SOCK | public class FD_SOCK extends Protocol implements Runnable(Code) | | Failure detection protocol based on sockets. Failure detection is ring-based. Each member creates a
server socket and announces its address together with the server socket's address in a multicast. A
pinger thread will be started when the membership goes above 1 and will be stopped when it drops below
2. The pinger thread connects to its neighbor on the right and waits until the socket is closed. When
the socket is closed by the monitored peer in an abnormal fashion (IOException), the neighbor will be
suspected. The main feature of this protocol is that no ping messages need to be exchanged between
any 2 peers, and failure detection relies entirely on TCP sockets. The advantage is that no activity
will take place between 2 peers as long as they are alive (i.e. have their server sockets open).
The disadvantage is that hung servers or crashed routers will not cause sockets to be closed, so
they won't be detected.
The FD_SOCK protocol will work for groups where members are on different hosts.
The costs involved are 2 additional threads: one that
monitors the client side of the socket connection (to monitor a peer) and another one that manages the
server socket. However, those threads will be idle as long as both peers are running.
author: Bela Ban May 29 2001 |
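The socket-based detection described above can be illustrated with a minimal sketch. It is not the FD_SOCK source: the class name, the `Outcome` enum, the `demo` helper and the value of the `NORMAL_TERMINATION` marker byte are all assumptions made for illustration. The monitor connects to a peer's server socket and blocks in `read()`; a marker byte is treated as a regular close, and anything else (end of stream or an IOException) makes the peer a suspect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch, not the FD_SOCK implementation.
class SocketMonitorSketch {

    /** What the monitor concluded about the peer. */
    enum Outcome { CLOSED_REGULARLY, SUSPECTED }

    /** Assumed marker byte a peer writes before closing on purpose. */
    static final int NORMAL_TERMINATION = 9;

    /** Connects to the peer and blocks until the connection ends. */
    static Outcome watch(InetAddress host, int port) {
        try (Socket sock = new Socket(host, port)) {
            // Blocks here; nothing is exchanged while the peer stays alive.
            int b = sock.getInputStream().read();
            return b == NORMAL_TERMINATION ? Outcome.CLOSED_REGULARLY : Outcome.SUSPECTED;
        } catch (IOException abnormalClose) {
            return Outcome.SUSPECTED;
        }
    }

    /** Loopback demo: a peer accepts one connection and closes it, with or without the marker. */
    static Outcome demo(boolean regular) {
        try (ServerSocket srv = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread peer = new Thread(() -> {
                try (Socket s = srv.accept()) {
                    if (regular) {
                        s.getOutputStream().write(NORMAL_TERMINATION);
                        s.getOutputStream().flush();
                    }
                } catch (IOException ignored) {
                    // the demo only cares about what the monitor observes
                }
            });
            peer.start();
            Outcome result = watch(srv.getInetAddress(), srv.getLocalPort());
            peer.join();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, `SocketMonitorSketch.demo(true)` models a peer that leaves cleanly, while `demo(false)` models a peer whose socket goes away without the marker.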
get_cache_retry_timeout | final static long get_cache_retry_timeout(Code) | | |
get_cache_timeout | long get_cache_timeout(Code) | | |
got_cache_from_coord | boolean got_cache_from_coord(Code) | | |
num_suspect_events | int num_suspect_events(Code) | | |
regular_sock_close | boolean regular_sock_close(Code) | | |
srv_sock_sent | boolean srv_sock_sent(Code) | | |
start_port | int start_port(Code) | | Start port for server socket (uses first available port starting at start_port). A value of 0 (default)
picks a random port.
|
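One way to read the start_port behaviour is as a linear scan over ports. The sketch below is an assumption about how such a scan could look, not the code FD_SOCK uses; `PortScanSketch` and `bindFrom` are hypothetical names. Passing 0 to `java.net.ServerSocket` lets the operating system choose a free port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical sketch of "first available port starting at start_port".
class PortScanSketch {

    /** Binds to the first free port at or above startPort; 0 delegates the choice to the OS. */
    static ServerSocket bindFrom(int startPort) {
        try {
            if (startPort == 0) {
                return new ServerSocket(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (int port = startPort; port <= 65535; port++) {
            try {
                return new ServerSocket(port);
            } catch (IOException portUnavailable) {
                // port is taken or not permitted: try the next one
            }
        }
        throw new IllegalStateException("no free port at or above " + startPort);
    }
}
```

Calling `bindFrom` with a port that is already in use returns a socket bound to a higher port.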
suspect_msg_interval | long suspect_msg_interval(Code) | | |
broadcastSuspectMessage | void broadcastSuspectMessage(Address suspected_mbr)(Code) | | Sends a SUSPECT message to all group members. Only the coordinator (or the next member in line if the coord
itself is suspected) will react to this message by installing a new view. To overcome the unreliability
of the SUSPECT message (it may be lost because we are not above any retransmission layer), the following scheme
is used: after sending the SUSPECT message, it is also added to the broadcast task, which will periodically
re-send the SUSPECT until a view is received in which the suspected process is no longer a member. The reason is
that, at some point, either the coordinator or another participant taking over for a crashed coordinator will
react to the SUSPECT message and issue a new view, at which point the broadcast task stops.
|
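The re-send scheme can be sketched as a small state machine. Everything below is hypothetical (class name, method names, and the use of strings for members); the `sent` list stands in for the transport, and `tick()` stands in for the periodic broadcast task that a timer would drive in real code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of "re-send SUSPECT until a view excludes the suspect".
class SuspectRebroadcastSketch {
    private final Set<String> pending = new LinkedHashSet<>();
    final List<String> sent = new ArrayList<>();   // stand-in for the network

    /** Broadcasts SUSPECT once and registers the member for periodic re-sends. */
    synchronized void suspect(String member) {
        sent.add("SUSPECT " + member);
        pending.add(member);
    }

    /** One run of the periodic task: re-send SUSPECT for every member still pending. */
    synchronized void tick() {
        for (String member : pending) {
            sent.add("SUSPECT " + member);
        }
    }

    /** A new view arrived: suspects that are no longer members need no further re-sends. */
    synchronized void viewReceived(Collection<String> members) {
        pending.retainAll(members);
    }

    /** True when nothing is left to re-send, i.e. the task could stop. */
    synchronized boolean isIdle() {
        return pending.isEmpty();
    }
}
```

A lost SUSPECT message is covered by the next `tick()`, and the re-sends end only once a view without the suspect has been installed.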
broadcastWhoHasSockMessage | void broadcastWhoHasSockMessage(Address mbr)(Code) | | |
getCacheFromCoordinator | void getCacheFromCoordinator()(Code) | | Determines coordinator C. If C is null and we are the first member, return. Else loop: send a GET_CACHE message
to the coordinator and wait for a GET_CACHE_RSP response. Loop until a valid response has been received.
|
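The control flow described for getCacheFromCoordinator can be sketched as a plain retry loop. This is a simplified illustration with hypothetical names: the `requestCache` supplier stands in for "send GET_CACHE and wait for GET_CACHE_RSP", returning an empty Optional when no valid response arrived, and the coordinator is represented by a string.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of the GET_CACHE retry loop.
class CacheFetchSketch {

    /**
     * Returns empty when there is no coordinator and we are the first member;
     * otherwise keeps asking until the supplier yields a response.
     */
    static <T> Optional<T> fetch(String coordinator, boolean firstMember,
                                 Supplier<Optional<T>> requestCache) {
        if (coordinator == null && firstMember) {
            return Optional.empty();          // nobody to ask, nothing to fetch
        }
        while (true) {
            Optional<T> response = requestCache.get();   // one GET_CACHE round trip
            if (response.isPresent()) {
                return response;
            }
            // no valid response: ask again
        }
    }
}
```

A production loop would also need to bound the wait per attempt; the get_cache_timeout and get_cache_retry_timeout fields suggest such limits exist, but how they are applied is not stated here.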
getNumSuspectEventsGenerated | public int getNumSuspectEventsGenerated()(Code) | | |
interruptPingerThread | void interruptPingerThread()(Code) | | Interrupts the pinger thread. The Thread.interrupt() method doesn't seem to work under Linux with JDK 1.3.1
(JDK 1.2.2 had no problems here), therefore we close the socket (setSoLinger has to be set!) if we are
running under Linux. This should be tested under Windows. (Solaris 8 and JDK 1.3.1 definitely work.)
Oct 29 2001 (bela): completely removed Thread.interrupt(), but used socket close on all OSs. This makes this
code portable and we don't have to check for OSs.
Does *not* need to be synchronized on pinger_mutex because the caller (down()) already has the mutex acquired.
See Also: org.jgroups.tests.InterruptTest, to determine whether Thread.interrupt() works for InputStream.read(). |
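The technique of closing a socket to release a thread blocked in `read()` can be demonstrated in isolation. The sketch below is an illustration with hypothetical names, not the FD_SOCK code, and it leaves out the setSoLinger setting mentioned above. It relies on the documented behaviour of `java.net.Socket.close()`: a thread blocked in an I/O operation on that socket receives a SocketException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: unblocking a reader by closing its socket.
class CloseToUnblockSketch {

    /** Returns true if closing the socket from another thread released the blocked reader. */
    static boolean closeUnblocksRead() {
        try (ServerSocket srv = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(srv.getInetAddress(), srv.getLocalPort());
             Socket accepted = srv.accept()) {
            CountDownLatch released = new CountDownLatch(1);
            Thread pinger = new Thread(() -> {
                try {
                    client.getInputStream().read();   // blocks: the peer never writes
                } catch (IOException socketClosed) {
                    // expected once the socket is closed from the other thread
                } finally {
                    released.countDown();
                }
            });
            pinger.start();
            Thread.sleep(200);    // give the pinger time to block in read()
            client.close();       // used instead of Thread.interrupt()
            return released.await(5, TimeUnit.SECONDS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```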
printSuspectHistory | public String printSuspectHistory()(Code) | | |
resetStats | public void resetStats()(Code) | | |
run | public void run()(Code) | | Runs as long as there are 2 or more members. Determines the member to be monitored and fetches its
server socket address (if n/a, sends a message to obtain it). Then creates a client socket and listens on
it until the connection breaks. If it breaks, emits a SUSPECT message. If the connection is closed regularly,
nothing happens. In both cases, a new member to be monitored will be chosen and monitoring continues (unless
there are fewer than 2 members).
|
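Choosing the member to monitor in a ring can be sketched as taking the next entry in the membership list and wrapping around at the end. The helper below is hypothetical (members are plain strings, and the method name is invented); it returns null when there are fewer than 2 members or the local member is not in the list.

```java
import java.util.List;

// Hypothetical sketch of picking the "neighbor on the right" in a ring.
class RingNeighborSketch {

    /** Returns the member after self in the list, wrapping around; null if there is nothing to monitor. */
    static String neighborOf(List<String> members, String self) {
        int index = members.indexOf(self);
        if (index < 0 || members.size() < 2) {
            return null;
        }
        return members.get((index + 1) % members.size());
    }
}
```

With members A, B and C, this gives A monitoring B, B monitoring C, and C monitoring A, so every member is watched by exactly one other.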
sendIHaveSockMessage | void sendIHaveSockMessage(Address dst, Address mbr, IpAddress addr)(Code) | | Sends or broadcasts an I_HAVE_SOCK response. If 'dst' is null, the response will be broadcast, otherwise
it will be unicast back to the requester.
|
sendPingInterrupt | void sendPingInterrupt()(Code) | | |
sendPingSignal | synchronized void sendPingSignal(int signal)(Code) | | |
sendPingTermination | synchronized void sendPingTermination()(Code) | | |
setupPingSocket | boolean setupPingSocket(IpAddress dest)(Code) | | Creates a socket to dest and assigns it to ping_sock. Also assigns ping_input.
|
signalToString | static String signalToString(int signal)(Code) | | |
startPingerThread | void startPingerThread()(Code) | | Does *not* need to be synchronized on pinger_mutex because the caller (down()) already has the mutex acquired.
|
startServerSocket | void startServerSocket()(Code) | | |
stopPingerThread | void stopPingerThread()(Code) | | |
stopServerSocket | void stopServerSocket()(Code) | | |
teardownPingSocket | void teardownPingSocket()(Code) | | |
Fields inherited from org.jgroups.stack.UpHandler | final protected Log log(Code)(Java Doc)
|