/*
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */

/*
 * K2.java
 * Copyright (C) 2001 University of Waikato, Hamilton, New Zealand
 *
 */
package weka.classifiers.bayes.net.search.global;

import java.util.Enumeration;
import java.util.Random;
import java.util.Vector;

import weka.classifiers.bayes.BayesNet;
import weka.core.Instances;
import weka.core.Option;
import weka.core.TechnicalInformation;
import weka.core.TechnicalInformation.Field;
import weka.core.TechnicalInformation.Type;
import weka.core.TechnicalInformationHandler;
import weka.core.Utils;

/**
 <!-- globalinfo-start -->
 * This Bayes Network learning algorithm uses a hill climbing algorithm restricted by an order on the variables.<br/>
 * <br/>
 * For more information see:<br/>
 * <br/>
 * G.F. Cooper, E. Herskovits (1990). A Bayesian method for constructing Bayesian belief networks from databases.<br/>
 * <br/>
 * G. Cooper, E. Herskovits (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning. 9(4):309-347.<br/>
 * <br/>
 * Works with nominal variables and no missing values only.
 * <p/>
 <!-- globalinfo-end -->
 *
 <!-- technical-bibtex-start -->
 * BibTeX:
 * <pre>
 * &#64;proceedings{Cooper1990,
 *    author = {G.F. Cooper and E. Herskovits},
 *    booktitle = {Proceedings of the Conference on Uncertainty in AI},
 *    pages = {86-94},
 *    title = {A Bayesian method for constructing Bayesian belief networks from databases},
 *    year = {1990}
 * }
 *
 * &#64;article{Cooper1992,
 *    author = {G. Cooper and E. Herskovits},
 *    journal = {Machine Learning},
 *    number = {4},
 *    pages = {309-347},
 *    title = {A Bayesian method for the induction of probabilistic networks from data},
 *    volume = {9},
 *    year = {1992}
 * }
 * </pre>
 * <p/>
 <!-- technical-bibtex-end -->
 *
 <!-- options-start -->
 * Valid options are: <p/>
 *
 * <pre> -N
 *  Initial structure is empty (instead of Naive Bayes)</pre>
 *
 * <pre> -P &lt;nr of parents&gt;
 *  Maximum number of parents</pre>
 *
 * <pre> -R
 *  Random order.
 *  (default false)</pre>
 *
 * <pre> -mbc
 *  Applies a Markov Blanket correction to the network structure,
 *  after a network structure is learned. This ensures that all
 *  nodes in the network are part of the Markov blanket of the
 *  classifier node.</pre>
 *
 * <pre> -S [LOO-CV|k-Fold-CV|Cumulative-CV]
 *  Score type (LOO-CV,k-Fold-CV,Cumulative-CV)</pre>
 *
 * <pre> -Q
 *  Use probabilistic or 0/1 scoring.
 *  (default probabilistic scoring)</pre>
 *
 <!-- options-end -->
 *
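 * <p>
 * A minimal usage sketch (illustrative only, not part of the original
 * documentation): it assumes the training data has already been loaded into
 * an <code>Instances</code> object named <code>data</code>, with nominal
 * attributes, no missing values, and the class index set. The K2 search is
 * plugged into a <code>weka.classifiers.bayes.BayesNet</code> classifier:
 * <pre>
 * BayesNet net = new BayesNet();
 * K2 k2 = new K2();
 * k2.setMaxNrOfParents(2);    // cap the number of parents per node
 * k2.setRandomOrder(true);    // shuffle the node ordering before searching
 * net.setSearchAlgorithm(k2);
 * net.buildClassifier(data);
 * </pre>
 *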
 * @author Remco Bouckaert (rrb@xm.co.nz)
 * @version $Revision: 1.6 $
 */
public class K2 extends GlobalScoreSearchAlgorithm implements
    TechnicalInformationHandler {

  /** for serialization */
  static final long serialVersionUID = -6626871067466338256L;

  /** Holds flag to indicate ordering should be random */
  boolean m_bRandomOrder = false;

  /**
   * Returns an instance of a TechnicalInformation object, containing
   * detailed information about the technical background of this class,
   * e.g., paper reference or book this class is based on.
   *
   * @return the technical information about this class
   */
  public TechnicalInformation getTechnicalInformation() {
    TechnicalInformation result;
    TechnicalInformation additional;

    result = new TechnicalInformation(Type.PROCEEDINGS);
    result.setValue(Field.AUTHOR, "G.F. Cooper and E. Herskovits");
    result.setValue(Field.YEAR, "1990");
    result.setValue(Field.TITLE,
        "A Bayesian method for constructing Bayesian belief networks from databases");
    result.setValue(Field.BOOKTITLE,
        "Proceedings of the Conference on Uncertainty in AI");
    result.setValue(Field.PAGES, "86-94");

    additional = result.add(Type.ARTICLE);
    additional.setValue(Field.AUTHOR, "G. Cooper and E. Herskovits");
    additional.setValue(Field.YEAR, "1992");
    additional.setValue(Field.TITLE,
        "A Bayesian method for the induction of probabilistic networks from data");
    additional.setValue(Field.JOURNAL, "Machine Learning");
    additional.setValue(Field.VOLUME, "9");
    additional.setValue(Field.NUMBER, "4");
    additional.setValue(Field.PAGES, "309-347");

    return result;
  }

  /**
   * buildStructure determines the network structure/graph of the network
   * with the K2 algorithm, restricted by its initial structure (which can
   * be an empty graph or a Naive Bayes graph).
   *
   * @param bayesNet the network
   * @param instances the data to work with
   * @throws Exception if something goes wrong
   */
  public void buildStructure(BayesNet bayesNet, Instances instances)
      throws Exception {
    super.buildStructure(bayesNet, instances);

    // initial ordering: the class variable first, followed by the
    // remaining attributes in the order they appear in the dataset
    int nOrder[] = new int[instances.numAttributes()];
    nOrder[0] = instances.classIndex();

    int nAttribute = 0;

    for (int iOrder = 1; iOrder < instances.numAttributes(); iOrder++) {
      if (nAttribute == instances.classIndex()) {
        nAttribute++;
      }
      nOrder[iOrder] = nAttribute++;
    }

    if (m_bRandomOrder) {
      // generate random ordering (if required)
      Random random = new Random();
      int iClass;
      if (getInitAsNaiveBayes()) {
        iClass = 0;
      } else {
        iClass = -1;
      }
      for (int iOrder = 0; iOrder < instances.numAttributes(); iOrder++) {
        // pick a random position to swap with, keeping the class variable
        // in place when the network was initialized as Naive Bayes
        int iOrder2 = random.nextInt(instances.numAttributes());
        if (iOrder != iClass && iOrder2 != iClass) {
          int nTmp = nOrder[iOrder];
          nOrder[iOrder] = nOrder[iOrder2];
          nOrder[iOrder2] = nTmp;
        }
      }
    }

    // determine the base score
    double fBaseScore = calcScore(bayesNet);

    // K2 algorithm: greedy search restricted by ordering
    for (int iOrder = 1; iOrder < instances.numAttributes(); iOrder++) {
      int iAttribute = nOrder[iOrder];
      double fBestScore = fBaseScore;

      // keep adding the parent (from earlier in the ordering) that improves
      // the score most, until no candidate improves it or the maximum
      // number of parents is reached
      boolean bProgress = (bayesNet.getParentSet(iAttribute)
          .getNrOfParents() < getMaxNrOfParents());
      while (bProgress
          && (bayesNet.getParentSet(iAttribute)
              .getNrOfParents() < getMaxNrOfParents())) {
        int nBestAttribute = -1;
        for (int iOrder2 = 0; iOrder2 < iOrder; iOrder2++) {
          int iAttribute2 = nOrder[iOrder2];
          double fScore = calcScoreWithExtraParent(iAttribute, iAttribute2);
          if (fScore > fBestScore) {
            fBestScore = fScore;
            nBestAttribute = iAttribute2;
          }
        }
        if (nBestAttribute != -1) {
          bayesNet.getParentSet(iAttribute).addParent(nBestAttribute,
              instances);
          fBaseScore = fBestScore;
          bProgress = true;
        } else {
          bProgress = false;
        }
      }
    }
  } // buildStructure

  /**
   * Sets the max number of parents.
   *
   * @param nMaxNrOfParents the max number of parents
   */
  public void setMaxNrOfParents(int nMaxNrOfParents) {
    m_nMaxNrOfParents = nMaxNrOfParents;
  }

  /**
   * Gets the max number of parents.
   *
   * @return the max number of parents
   */
  public int getMaxNrOfParents() {
    return m_nMaxNrOfParents;
  }

  /**
   * Sets whether to init as naive bayes.
   *
   * @param bInitAsNaiveBayes whether to init as naive bayes
   */
  public void setInitAsNaiveBayes(boolean bInitAsNaiveBayes) {
    m_bInitAsNaiveBayes = bInitAsNaiveBayes;
  }

  /**
   * Gets whether to init as naive bayes.
   *
   * @return whether to init as naive bayes
   */
  public boolean getInitAsNaiveBayes() {
    return m_bInitAsNaiveBayes;
  }

  /**
   * Sets the random order flag.
   *
   * @param bRandomOrder the random order flag
   */
  public void setRandomOrder(boolean bRandomOrder) {
    m_bRandomOrder = bRandomOrder;
  } // setRandomOrder

  /**
   * Gets the random order flag.
   *
   * @return the random order flag
   */
  public boolean getRandomOrder() {
    return m_bRandomOrder;
  } // getRandomOrder

  /**
   * Returns an enumeration describing the available options.
   *
   * @return an enumeration of all the available options.
   */
  public Enumeration listOptions() {
    Vector newVector = new Vector(0);

    newVector.addElement(new Option(
        "\tInitial structure is empty (instead of Naive Bayes)",
        "N", 0, "-N"));

    newVector.addElement(new Option("\tMaximum number of parents",
        "P", 1, "-P <nr of parents>"));

    newVector.addElement(new Option("\tRandom order.\n"
        + "\t(default false)", "R", 0, "-R"));

    Enumeration enu = super.listOptions();
    while (enu.hasMoreElements()) {
      newVector.addElement(enu.nextElement());
    }
    return newVector.elements();
  }

  /**
   * Parses a given list of options. <p/>
   *
   <!-- options-start -->
   * Valid options are: <p/>
   *
   * <pre> -N
   *  Initial structure is empty (instead of Naive Bayes)</pre>
   *
   * <pre> -P &lt;nr of parents&gt;
   *  Maximum number of parents</pre>
   *
   * <pre> -R
   *  Random order.
   *  (default false)</pre>
   *
   * <pre> -mbc
   *  Applies a Markov Blanket correction to the network structure,
   *  after a network structure is learned. This ensures that all
   *  nodes in the network are part of the Markov blanket of the
   *  classifier node.</pre>
   *
   * <pre> -S [LOO-CV|k-Fold-CV|Cumulative-CV]
   *  Score type (LOO-CV,k-Fold-CV,Cumulative-CV)</pre>
   *
   * <pre> -Q
   *  Use probabilistic or 0/1 scoring.
   *  (default probabilistic scoring)</pre>
   *
   <!-- options-end -->
   *
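   * <p>
   * As an illustrative sketch (not part of the original documentation), an
   * option string could be split with <code>weka.core.Utils.splitOptions</code>
   * and passed to this method; the values below (two parents, random order,
   * leave-one-out cross-validation) are arbitrary examples:
   * <pre>
   * K2 k2 = new K2();
   * k2.setOptions(Utils.splitOptions("-P 2 -R -S LOO-CV"));
   * </pre>
   *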
   * @param options the list of options as an array of strings
   * @throws Exception if an option is not supported
   */
  public void setOptions(String[] options) throws Exception {

    setRandomOrder(Utils.getFlag('R', options));

    m_bInitAsNaiveBayes = !(Utils.getFlag('N', options));

    String sMaxNrOfParents = Utils.getOption('P', options);

    if (sMaxNrOfParents.length() != 0) {
      setMaxNrOfParents(Integer.parseInt(sMaxNrOfParents));
    } else {
      setMaxNrOfParents(100000);
    }
    super.setOptions(options);
  }

  /**
   * Gets the current settings of the search algorithm.
   *
   * @return an array of strings suitable for passing to setOptions
   */
  public String[] getOptions() {
    String[] superOptions = super.getOptions();
    String[] options = new String[4 + superOptions.length];
    int current = 0;
    options[current++] = "-P";
    options[current++] = "" + m_nMaxNrOfParents;
    if (!m_bInitAsNaiveBayes) {
      options[current++] = "-N";
    }
    if (getRandomOrder()) {
      options[current++] = "-R";
    }
    // insert options from parent class
    for (int iOption = 0; iOption < superOptions.length; iOption++) {
      options[current++] = superOptions[iOption];
    }
    // Fill up rest with empty strings, not nulls!
    while (current < options.length) {
      options[current++] = "";
    }
    return options;
  }

  /**
   * @return a string to describe the RandomOrder option.
   */
  public String randomOrderTipText() {
    return "When set to true, the order of the nodes in the network is random."
        + " The default is false, in which case the order of the nodes in the"
        + " dataset is used."
        + " In either case, when the network was initialized as a Naive Bayes"
        + " network, the class variable remains first in the ordering.";
  } // randomOrderTipText

  /**
   * This will return a string describing the search algorithm.
   * @return The string.
   */
  public String globalInfo() {
    return "This Bayes Network learning algorithm uses a hill climbing algorithm "
        + "restricted by an order on the variables.\n\n"
        + "For more information see:\n\n"
        + getTechnicalInformation().toString()
        + "\n\n"
        + "Works with nominal variables and no missing values only.";
  }
}