Archive for November, 2006

Because server-specific APIs use linked C or C++

Wednesday, November 22nd, 2006

Because server-specific APIs use linked C or C++ code, server extensions can run extremely fast and make full use of the server’s resources. Server extensions, however, are not a perfect solution by any means. Besides being difficult to develop and maintain, they pose significant security and reliability hazards: a crashed server extension can bring down the entire server. And, of course, proprietary server extensions are inextricably tied to the server API for which they were written and often tied to a particular operating system as well.

Figure 1-3. The server extension life cycle

Active Server Pages

Microsoft has developed a technique for generating dynamic web content called Active Server Pages, or sometimes just ASP. With ASP, an HTML page on the web server can contain snippets of embedded code (usually VBScript or JScript, although it’s possible to use nearly any language). This code is read and executed by the web server before it sends the page to the client. ASP is optimized for generating small portions of dynamic content.

Support for ASP is built into Microsoft Internet Information Server Version 3.0 and above, available for free from http://www.microsoft.com/iis. Support for other web servers is available as a commercial product from Chili!Soft at http://www.chilisoft.com.

For more information on programming Active Server Pages, see http://www.microsoft.com/workshop/server/default.asp and http://www.activeserverpages.com/.

Server-side JavaScript

Netscape, too, has a technique for server-side scripting, which it calls server-side JavaScript, or SSJS for short. Like ASP, SSJS allows snippets of code to be embedded in HTML pages to generate dynamic web content. The difference is that SSJS uses JavaScript as the scripting language. With SSJS, web pages are precompiled to improve performance.

Support for server-side JavaScript is available only with Netscape FastTrack Server and Enterprise Server Version 2.0 and above.

For more information on programming with server-side JavaScript, see http://developer.netscape.com/tech/javascript/ssjs/ssjs.html.

Figure 1-2. The FastCGI life cycle Although FastCGI

Wednesday, November 22nd, 2006

Figure 1-2. The FastCGI life cycle

Although FastCGI is a step in the right direction, it still has a problem with process proliferation: there is at least one process for each FastCGI program. If a FastCGI program is to handle concurrent requests, it needs a pool of processes, one per request. Considering that each process may be executing a Perl interpreter, this approach does not scale as well as you might hope. (Although, to its credit, FastCGI can distribute its processes across multiple servers.) Another problem with FastCGI is that it does nothing to help the FastCGI program more closely interact with the server. As of this writing, the FastCGI approach has not been implemented by some of the more popular servers, including Microsoft’s Internet Information Server. Finally, FastCGI programs are only as portable as the language in which they’re written.

For more information on FastCGI, see http://www.fastcgi.com/.

mod_perl

If you are using the Apache web server, another option for improving CGI performance is using mod_perl. mod_perl is a module for the Apache server that embeds a copy of the Perl interpreter into the Apache httpd executable, providing complete access to Perl functionality within Apache. The effect is that your CGI scripts are precompiled by the server and executed without forking, thus running much more quickly and efficiently.

For more information on mod_perl, see http://perl.apache.org/.

PerlEx

PerlEx, developed by ActiveState, improves the performance of CGI scripts written in Perl that run on Windows NT web servers (Microsoft’s Internet Information Server, O’Reilly’s WebSite Professional, and Netscape’s FastTrack Server and Enterprise Server). PerlEx uses the web server’s native API to achieve its performance gains.

For more information, see http://www.activestate.com/plex/.

Other Solutions

CGI/Perl has the advantage of being a more-or-less platform-independent way to produce dynamic web content. Other well-known technologies for creating web applications, such as ASP and server-side JavaScript, are proprietary solutions that work only with certain web servers.

Server Extension APIs

Several companies have created proprietary server extension APIs for their web servers. For example, Netscape provides an internal API called NSAPI (now becoming WAI) and Microsoft provides ISAPI. Using one of these APIs, you can write server extensions that enhance or change the base functionality of the server, allowing the server to handle tasks that were once relegated to external CGI programs. As you can see in Figure 1-3, server extensions exist within the main process of a web server.

The Common Gateway Interface, normally referred to as

Tuesday, November 21st, 2006

The Common Gateway Interface, normally referred to as CGI, was one of the first practical techniques for creating dynamic content. With CGI, a web server passes certain requests to an external program. The output of this program is then sent to the client in place of a static file. The advent of CGI made it possible to implement all sorts of new functionality in web pages, and CGI quickly became a de facto standard, implemented on dozens of web servers.

It’s interesting to note that the ability of CGI programs to create dynamic web pages is a side effect of its intended purpose: to define a standard method for an information server to talk with external applications. This origin explains why CGI has perhaps the worst life cycle imaginable. When a server receives a request that accesses a CGI program, it must create a new process to run the CGI program and then pass to it, via environment variables and standard input, every bit of information that might be necessary to generate a response. Creating a process for every such request requires time and significant server resources, which limits the number of requests a server can handle concurrently. Figure 1-1 shows the CGI life cycle.

Figure 1-1. The CGI life cycle

Even though a CGI program can be written in almost any language, the Perl programming language has become the predominant choice. Its advanced text-processing capabilities are a big help in managing the details of the CGI interface. Writing a CGI script in Perl gives it a semblance of platform independence, but it also requires that each request start a separate Perl interpreter, which takes even more time and requires extra resources.

Another often-overlooked problem with CGI is that a CGI program cannot interact with the web server or take advantage of the server’s abilities once it begins execution because it is running in a separate process. For example, a CGI script cannot write to the server’s log file.

For more information on CGI programming, see CGI Programming on the World Wide Web by Shishir Gundavaram (O’Reilly).

FastCGI

A company named Open Market developed an alternative to standard CGI named FastCGI. In many ways, FastCGI works just like CGI; the important difference is that FastCGI creates a single persistent process for each FastCGI program, as shown in Figure 1-2. This eliminates the need to create a new process for each request.
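To make the CGI hand-off described above concrete, here is a minimal sketch of a CGI-style program written in Java. It is only an illustration, not code from the text: the class name CgiEcho and its helper methods are hypothetical, and the sketch assumes the web server supplies the standard CGI variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH in the environment and the request body on standard input.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical CGI-style program: the server starts one process like this per request.
class CgiEcho {

    // Builds the CGI response: a header line, a blank line, then the body.
    static String buildResponse(Map<String, String> env, String body) {
        String method = env.getOrDefault("REQUEST_METHOD", "GET");
        String query = env.getOrDefault("QUERY_STRING", "");
        StringBuilder out = new StringBuilder();
        out.append("Content-Type: text/plain\r\n\r\n");
        out.append("method=").append(method).append('\n');
        out.append("query=").append(query).append('\n');
        out.append("body=").append(body).append('\n');
        return out.toString();
    }

    // Reads CONTENT_LENGTH, treating a missing or malformed value as zero.
    static int contentLength(Map<String, String> env) {
        try {
            return Math.max(0, Integer.parseInt(env.getOrDefault("CONTENT_LENGTH", "0").trim()));
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public static void main(String[] args) throws IOException {
        // Request metadata arrives through environment variables...
        Map<String, String> env = System.getenv();

        // ...and the request body, if any, arrives on standard input.
        byte[] buffer = new byte[contentLength(env)];
        int read = 0;
        while (read < buffer.length) {
            int n = System.in.read(buffer, read, buffer.length - read);
            if (n < 0) {
                break; // input ended early
            }
            read += n;
        }
        String body = new String(buffer, 0, read, StandardCharsets.ISO_8859_1);

        // Whatever is written to standard output is sent back to the client.
        System.out.print(buildResponse(env, body));
    }
}
```

Because the server would launch a fresh JVM for every request, a program like this also shows why the per-request process cost weighs so heavily on CGI.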


Introduction In this chapter: History of Web

Tuesday, November 21st, 2006

Introduction

In this chapter:
History of Web Applications
Support for Servlets
The Power of Servlets

The rise of server-side Java applications is one of the latest and most exciting trends in Java programming. The Java language was originally intended for use in small, embedded devices. It was first hyped as a language for developing elaborate client-side web content in the form of applets. Until recently, Java’s potential as a server-side development platform had been sadly overlooked. Now, Java is coming into its own as a language ideally suited for server-side development.

Businesses in particular have been quick to recognize Java’s potential on the server: Java is inherently suited for large client/server applications. The cross-platform nature of Java is extremely useful for organizations that have a heterogeneous collection of servers running various flavors of the Unix and Windows operating systems. Java’s modern, object-oriented, memory-protected design allows developers to cut development cycles and increase reliability. In addition, Java’s built-in support for networking and enterprise APIs provides access to legacy data, easing the transition from older client/server systems.

Java servlets are a key component of server-side Java development. A servlet is a small, pluggable extension to a server that enhances the server’s functionality. Servlets allow developers to extend and customize any Java-enabled server (a web server, a mail server, an application server, or any custom server) with a hitherto unknown degree of portability, flexibility, and ease. But before we go into any more detail, let’s put things into perspective.

History of Web Applications

While servlets can be used to extend the functionality of any Java-enabled server, today they are most often used to extend web servers, providing a powerful, efficient replacement for CGI scripts. When you use a servlet to create dynamic content for a web page or otherwise extend the functionality of a web server, you are in effect creating a web application. While a web page merely displays static content and lets the user navigate through that content, a web application provides a more interactive experience. A web application can be as simple as a keyword search on a document archive or as complex as an electronic storefront. Web applications are being deployed on the Internet and on corporate intranets and extranets, where they have the potential to increase productivity and change the way that companies, large and small, do business.

To understand the power of servlets, we need to step back and look at some of the other approaches that can be used to create web applications.

Common Gateway Interface

Names of user interface buttons and menus

Monday, November 20th, 2006

Names of user interface buttons and menus

Constant Width is used for:
Anything that appears literally in a Java program, including keywords, data types, constants, method names, variables, class names, and interface names
Command lines and options that should be typed verbatim on the screen
All Java code listings
HTML documents, tags, and attributes

Constant Width Italic is used for:
General placeholders that indicate that an item is replaced by some actual value in your own program

Request for Comments

Please help us to improve future editions of this jsp blog by reporting any errors, inaccuracies, bugs, misleading or confusing statements, and plain old typos that you find anywhere in this jsp blog. Email your bug reports and comments to us at: bookquestions@oreilly.com. (Before sending a bug report, however, you may want to check for an errata list at http://www.oreilly.com/catalog/jservlet/ to see if the bug has already been submitted.)

Please also let us know what we can do to make this jsp blog more useful to you. We take your comments seriously and will try to incorporate reasonable suggestions into future editions.


Chapter 8, Security Explains the security issues involved

Monday, November 20th, 2006

Chapter 8, Security
Explains the security issues involved with distributed computing and demonstrates how to maintain security with servlets.

Chapter 9, Database Connectivity
Shows how servlets can be used for high-performance web-database connectivity.

Chapter 10, Applet-Servlet Communication
Describes how servlets can be of use to applet developers who need to communicate with the server.

Chapter 11, Interservlet Communication
Discusses why servlets need to communicate with each other and how it can be accomplished.

Chapter 12, Internationalization
Shows how a servlet can generate multilingual content.

Chapter 13, Odds and Ends
Presents a junk drawer full of useful servlet examples and tips that don’t really belong anywhere else.

Appendix A, Servlet API Quick Reference
Contains a full description of the classes, methods, and variables in the javax.servlet package.

Appendix B, HTTP Servlet API Quick Reference
Contains a full description of the classes, methods, and variables in the javax.servlet.http package.

Appendix C, HTTP Status Codes
Lists the status codes specified by HTTP, along with the mnemonic constants used by servlets.

Appendix D, Character Entities
Lists the character entities defined in HTML, along with their equivalent Unicode escape values.

Appendix E, Charsets
Lists the suggested charsets servlets may use to generate content in several different languages.

Please feel free to read the chapters of this jsp blog in whatever order you like. Reading straight through from front to back ensures that you won’t encounter any surprises, as efforts have been taken to avoid forward references. If you want to skip around, however, you can do so easily enough, especially after Chapter 5; the rest of the chapters all tend to stand alone. One last suggestion: read the “Debugging” section of Chapter 13 if at any time you find a piece of code that doesn’t work as expected.

Conventions Used in this jsp blog

Italic is used for:
Pathnames, filenames, and program names
New terms where they are defined
Internet addresses, such as domain names and URLs

Boldface is used for:
Particular keys on a computer keyboard

All the examples have been tested using Sun’s

Sunday, November 19th, 2006

All the examples have been tested using Sun’s Java Web Server 1.1.1, running in the Java Virtual Machine (JVM) bundled with the Java Development Kit (JDK) 1.1.5, on both Windows and Unix. A few examples require alternate configurations, and this has been noted in the text. The Java Web Server is free for educational use and has a 30-day trial period for all other use. You can download a copy from http://java.sun.com/products. The Java Development Kit is freely downloadable from http://java.sun.com/products/jdk or, for educational use, from http://www.sun.com/products-n-solutions/edu/java/. The Java Servlet Development Kit (JSDK) is available separately from the JDK; you can find it at http://java.sun.com/products/servlet/.

This jsp blog also contains a set of utility classes; they are used by the servlet examples, and you may find them helpful for your own general-purpose servlet development. These classes are contained in the com.oreilly.servlet package. Among other things, there are classes to help servlets parse parameters, handle file uploads, generate multipart responses (server push), negotiate locales for internationalization, return files, manage socket connections, and act as RMI servers. There’s even a class to help applets communicate with servlets. The source code for the com.oreilly.servlet package is contained within the text; the latest version is also available online (with javadoc documentation) from http://www.oreilly.com/catalog/jservlet/ and http://www.servlets.com.

Organization

This jsp blog consists of 13 chapters and 5 appendices, as follows:

Chapter 1, Introduction
Explains the role and advantage of Java servlets in web application development.

Chapter 2, HTTP Servlet Basics
Provides a quick introduction to the things an HTTP servlet can do: page generation, server-side includes, servlet chaining, and JavaServer Pages.

Chapter 3, The Servlet Life Cycle
Explains the details of how and when a servlet is loaded, how and when it is executed, how threads are managed, and how to handle the synchronization issues in a multithreaded system. Persistent state capabilities are also covered.

Chapter 4, Retrieving Information
Introduces the most common methods a servlet uses to receive information about the client, the server, the client’s request, and itself.

Chapter 5, Sending HTML Information
Describes how a servlet can generate HTML, return errors and other status codes, redirect requests, write data to the server log, and send custom HTTP header information.

Chapter 6, Sending Multimedia Content
Looks at some of the interesting things a servlet can return: dynamically generated images, compressed content, and multipart responses.

Chapter 7, Session Tracking
Shows how to build a sense of state on top of the stateless HTTP protocol. The first half of the chapter demonstrates the traditional session-tracking techniques used by CGI developers; the second half shows how to use the built-in support for session tracking in the Servlet API.

Authors of web pages with server-side includes Pages

Sunday, November 19th, 2006

Authors of web pages with server-side includes
Pages that use server-side includes to call CGI programs can use tags to add content more efficiently to a page.

Authors of web pages with different appearances
By this we mean pages that must be available in different languages, have to be converted for transmission over a low-bandwidth connection, or need to be modified in some manner before they are sent to the client. Servlets provide something called servlet chaining that can be used for processing of this type. Each servlet in a servlet chain knows how to catch, process, and return a specific kind of content. Thus, servlets can be linked together to do language translation, change large color images to small black-and-white ones, convert images in esoteric formats to standard GIF or JPEG images, or nearly anything else you can think of.

What You Need to Know

When we first started writing this jsp blog, we found to our surprise that one of the hardest things was determining what to assume about you, the reader. Are you familiar with Java? Have you done CGI or other web application programming before? Or are you getting your feet wet with servlets? Do you understand HTTP and HTML, or do those acronyms seem perfectly interchangeable? No matter what experience level we imagined, it was sure to be too simplistic for some and too advanced for others.

In the end, this jsp blog was written with the notion that it should contain predominantly original material: it could leave out exhaustive descriptions of topics and concepts that are well described online or in other books. Scattered throughout the text, you’ll find several references to these external sources of information.

Of course, external references only get you so far. This jsp blog expects you are comfortable with the Java programming language and basic object-oriented programming techniques. If you are coming to servlets from another language, we suggest you prepare yourself by reading a book on general Java programming, such as Exploring Java, by Patrick Niemeyer and Joshua Peck (O’Reilly). You may want to skim quickly the sections on applets and AWT (graphical) programming and spend extra time on network and multithreaded programming. If you want to get started with servlets right away and learn Java as you go, we suggest you read this jsp blog with a copy of Java in a Nutshell, by David Flanagan (O’Reilly), or another Java reference book, at your side.

This jsp blog does not assume you have extensive experience with web programming, HTTP, and HTML. But neither does it provide a full introduction to or exhaustive description of these technologies. We’ll cover the basics necessary for effective servlet development and leave the finer points (such as a complete list of HTML tags and HTTP 1.1 headers) to other sources.

About the Examples

In this jsp blog you’ll find nearly 100 servlet examples. The code for these servlets is all contained within the text, but you may prefer to download the examples rather than type them in by hand. You can find the code online and packaged for download at http://www.oreilly.com/catalog/jservlet/. You can also see many of the servlets in action at http://www.servlets.com.

In late 1996, Java on the server side

Sunday, November 19th, 2006

In late 1996, Java on the server side was coming on strong. Several major software vendors were marketing technologies specifically aimed at helping server-side Java developers do their jobs more efficiently. Most of these products provided a prebuilt infrastructure that could lift the developer’s attention from the raw socket level into the more productive application level. For example, Netscape introduced something it named “server-side applets”; the World Wide Web Consortium included extensible modules called “resources” with its Java-based Jigsaw web server; and with its WebSite server, O’Reilly Software promoted the use of a technology it (only coincidentally) dubbed “servlets.” The drawback: each of these technologies was tied to a particular server and designed for very specific tasks.

Then, in early 1997, JavaSoft (a company that has since been reintegrated into Sun Microsystems as the Java Software division) finalized Java servlets. This action consolidated the scattered technologies into a single, standard, generic mechanism for developing modular server-side Java code. Servlets were designed to work with both Java-based and non-Java-based servers. Support for servlets has since been implemented in nearly every web server, from Apache to Zeus, and in many non-web servers as well.

Servlets have been quick to gain acceptance because, unlike many new technologies that must first explain the problem or task they were created to solve, servlets are a clear solution to a well-recognized and widespread need: generating dynamic web content. From corporations down to individual web programmers, people who struggled with the maintenance and performance problems of CGI-based web programming are turning to servlets for their power, portability, and efficiency. Others, who were perhaps intimidated by CGI programming’s apparent reliance on manual HTTP communication and the Perl and C languages, are looking to servlets as a manageable first step into the world of web programming.

This jsp blog explains everything you need to know about Java servlet programming. The first five chapters cover the basics: what servlets are, what they do, and how they work. The following eight chapters are where the true meat is: they explore the things you are likely to do with servlets. You’ll find numerous examples, several suggestions, a few warnings, and even a couple of true hacks that somehow made it past technical review.

We cover Version 2.0 of the Servlet API, which was introduced as part of the Java Web Server 1.1 in December 1997 and clarified by the release of the Java Servlet Development Kit 2.0 in April 1998. Changes in the API from Version 1.0, finalized in June 1997, are noted throughout the text.

Audience

Is this jsp blog for you? It is if you’re interested in extending the functionality of a server, such as extending a web server to generate dynamic content. Specifically, this jsp blog was written to help:

CGI programmers
CGI is a popular but somewhat crude method of extending the functionality of a web server. Servlets provide an elegant, efficient alternative.

NSAPI, ISAPI, ASP, and Server-Side JavaScript programmers
Each of these technologies can be used as a CGI alternative, but each has limitations regarding portability, security, and/or performance. Servlets tend to excel in each of these areas.

Java applet programmers
It has always been difficult for an applet to talk to a server. Servlets make it easier by giving the applet an easy-to-connect-to, Java-based agent on the server.

Security In this chapter: HTTP Authentication

Sunday, November 19th, 2006

Security

In this chapter:
HTTP Authentication
Digital Certificates
Secure Sockets Layer (SSL)
Running Servlets Securely

So far we have imagined that our servlets exist in a perfect world, where everyone is trustworthy and nobody locks their doors at night. Sadly, that’s a 1950s fantasy world: the truth is that the Internet has its share of fiendish rogues. As companies place more and more emphasis on online commerce and begin to load their intranets with sensitive information, security has become one of the most important topics in web programming.

Security is the science of keeping sensitive information in the hands of authorized users. On the web, this boils down to three important issues:

Authentication
Being able to verify the identities of the parties involved

Confidentiality
Ensuring that only the parties involved can understand the communication

Integrity
Being able to verify that the content of the communication is not changed during transmission

A client wants to be sure that it is talking to a legitimate server (authentication), and it also wants to be sure that any information it transmits, such as credit card numbers, is not subject to eavesdropping (confidentiality). The server is also concerned with authentication and confidentiality. If a company is selling a service or providing sensitive information to its own employees, it has a vested interest in making sure that nobody but an authorized user can access it. And both sides need integrity to make sure that whatever information they send gets to the other party unaltered.

Authentication, confidentiality, and integrity are all linked by digital certificate technology. Digital certificates allow web servers and clients to use advanced cryptographic techniques to handle identification and encryption in a secure manner. Thanks to Java’s built-in support for digital certificates, servlets are an excellent platform for deploying secure web applications that use digital certificate technology. We’ll be taking a closer look at them later.

Security is also about making sure that crackers can’t gain access to the sensitive data on your web server. Because Java was designed from the ground up as a secure, network-oriented language, it is possible to leverage the built-in security features and make sure that server add-ons from third parties are almost as safe as the ones you write yourself.
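The integrity requirement described above can be illustrated with a small sketch. Note that this is not the digital certificate mechanism the text goes on to discuss: it swaps in a simpler technique, a message authentication code (HMAC-SHA256) computed with a shared secret key, purely to show what “verifying that content is unchanged” looks like in code. The class name IntegrityCheck, its methods, and the sample key and messages are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: tags a message so the receiver can detect tampering.
class IntegrityCheck {

    // Computes an HMAC-SHA256 tag over the message using a shared secret key.
    static byte[] tag(byte[] key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    // Recomputes the tag on the received message and compares it with the received tag.
    static boolean verify(byte[] key, String message, byte[] receivedTag) {
        return MessageDigest.isEqual(tag(key, message), receivedTag);
    }

    public static void main(String[] args) {
        byte[] key = "demo-shared-secret".getBytes(StandardCharsets.UTF_8); // assumption: both sides hold this key
        byte[] sentTag = tag(key, "pay 10 dollars");

        System.out.println(verify(key, "pay 10 dollars", sentTag));   // unaltered message
        System.out.println(verify(key, "pay 1000 dollars", sentTag)); // altered in transit
    }
}
```

A shared key like this covers integrity only between parties that already trust each other; establishing identity between strangers is the part that certificates address.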

public static void main (String [] argv) {

Saturday, November 4th, 2006

package java.lang;

public final class String
    implements java.io.Serializable, Comparable, CharSequence
{
    // This is a partial API listing

    public boolean matches (String regex)
    public String [] split (String regex)
    public String [] split (String regex, int limit)
    public String replaceFirst (String regex, String replacement)
    public String replaceAll (String regex, String replacement)
}

All the new String methods are pass-through calls to methods of the Pattern or Matcher classes. Now that you know how Pattern and Matcher are used and interoperate, using these String convenience methods should be a no-brainer. Rather than describe each method individually, we summarize them in Table 5-6.

Table 5-6. Regular expression methods of the String class

String method signature                                  java.util.regex equivalent
input.matches (String regex)                             Pattern.matches (String regex, CharSequence input)
input.split (String regex)                               pat.split (CharSequence input)
input.split (String regex, int limit)                    pat.split (CharSequence input, int limit)
input.replaceFirst (String regex, String replacement)    match.replaceFirst (String replacement)
input.replaceAll (String regex, String replacement)      match.replaceAll (String replacement)

In Table 5-6, assume that there is a String named input, a Pattern object named pat, and a Matcher named match:

    String input = "Mary had a little lamb";
    String [] tokens = input.split ("\\s+");    // split on whitespace

As of JDK 1.4, none of these regular expression convenience methods cache any expressions or do any other optimizations. Some JVM implementations may choose to cache and reuse pattern objects, but you should not rely on them. If you expect to apply the same pattern-matching operations repeatedly, it will be more efficient to use the classes in java.util.regex.

5.4 Java Regular Expression Syntax

Following is a summary of the regular expression syntax supported by the java.util.regex package, as released in JDK 1.4. Things change quickly in the Java world, so you should always check the current documentation provided with the Java implementation you’re using. The information provided here is a quick reference to get you started.
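As a small illustration of the advice about repeated pattern-matching operations, the sketch below contrasts the String convenience method with a Pattern compiled once and reused. The class name StringRegexDemo and its method names are hypothetical, and the sketch makes no claim about how much time is saved on any particular JVM; it only shows that the two forms yield the same tokens.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class contrasting the two ways to split on whitespace.
class StringRegexDemo {

    // Compiled once and reused for every call to splitPrecompiled().
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Convenience form: the regular expression is handed to String.split() on every call.
    static String[] splitConvenience(String input) {
        return input.split("\\s+");
    }

    // Reusable form: the already-compiled Pattern does the splitting.
    static String[] splitPrecompiled(String input) {
        return WHITESPACE.split(input);
    }

    public static void main(String[] args) {
        String input = "Mary had a little lamb";

        System.out.println(Arrays.toString(splitConvenience(input)));
        System.out.println(Arrays.toString(splitPrecompiled(input)));

        // matches() tests the entire input against the expression
        System.out.println(input.matches("Mary.*"));
    }
}
```

Both split calls print [Mary, had, a, little, lamb]; the precompiled form is the one to prefer inside a loop or a frequently called method.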


public static void main (String [] argv) {

Saturday, November 4th, 2006

    public static void main (String [] argv)
    {
        String input = "Thanks, thanks very much";
        String regex = "([Tt])hanks";
        Pattern pattern = Pattern.compile (regex);
        Matcher matcher = pattern.matcher (input);
        StringBuffer sb = new StringBuffer();

        // Loop while matches are encountered
        while (matcher.find()) {
            if (matcher.group(1).equals ("T")) {
                matcher.appendReplacement (sb, "Thank you");
            } else {
                matcher.appendReplacement (sb, "thank you");
            }
        }

        // Complete the transfer to the StringBuffer
        matcher.appendTail (sb);

        // Print the result
        System.out.println (sb.toString());

        // Let's try that again using the $n escape in the replacement
        sb.setLength (0);
        matcher.reset();

        String replacement = "$1hank you";

        // Loop while matches are encountered
        while (matcher.find()) {
            matcher.appendReplacement (sb, replacement);
        }

        // Complete the transfer to the StringBuffer
        matcher.appendTail (sb);

        // Print the result
        System.out.println (sb.toString());

        // and once more, the easy way (because this example is simple)
        System.out.println (matcher.replaceAll (replacement));

        // one last time, using only the String
        System.out.println (input.replaceAll (regex, replacement));
    }
}

5.3 Regular Expression Methods of the String Class

It should be pretty obvious from the preceding sections that strings and regular expressions go hand in hand. It’s only natural then that our old friend the String class has added some convenience methods to do common regular expression operations:


To generate a replacement sequence of ab, the

Saturday, November 4th, 2006

    Matcher matcher = pattern.matcher ("Thanks, thanks very much");
    StringBuffer sb = new StringBuffer();

    while (matcher.find()) {
        if (matcher.group(1).equals ("T")) {
            matcher.appendReplacement (sb, "Thank you");
        } else {
            matcher.appendReplacement (sb, "thank you");
        }
    }

    matcher.appendTail (sb);

Table 5-5 shows the sequence of changes applied to the StringBuffer by the above code.

Table 5-5. Using appendReplacement() and appendTail()

Append position   Execute                                 Resulting StringBuffer
0                 appendReplacement (sb, "Thank you")     Thank you
6                 appendReplacement (sb, "thank you")     Thank you, thank you
14                appendTail (sb)                         Thank you, thank you very much

This sequence of append operations results in the StringBuffer object sb containing the string “Thank you, thank you very much”. Example 5-8 is a complete code example showing this type of replacement, as well as alternate ways of performing the same substitution. In this simple case, the value of a capture group can be used because the first letter of the matched pattern is the same as that of the replacement. In a more complex case, there may not be an overlap between the input and the replacement values. Using Matcher.find() and Matcher.appendReplacement() allows you to programmatically mediate each replacement, possibly injecting different replacement values at each point along the way.

Example 5-8. Regular expression append/replace

package com.ronsoft.books.nio.regex;

import java.util.regex.Pattern;
import java.util.regex.Matcher;

/**
 * Test the appendReplacement() and appendTail() methods of the
 * java.util.regex.Matcher class.
 *
 * @author Ron Hitchens (ron@ronsoft.com)
 */
public class RegexAppend
{

To generate a replacement sequence of ab, the

Saturday, November 4th, 2006

The two append methods listed in the Matcher API are useful when iterating through an input character sequence, repeatedly invoking find():

    package java.util.regex;

    public final class Matcher
    {
        // This is a partial API listing

        public StringBuffer appendTail (StringBuffer sb)
        public Matcher appendReplacement (StringBuffer sb, String replacement)
    }

Rather than returning a new String with the replacement already performed, the append methods append to a StringBuffer object you provide. This allows you to make decisions about the replacement at each point a match is found or to accumulate the result of matching against multiple input strings. Using appendReplacement() and appendTail() gives you total control of the search-and-replace process.

One of the bits of state information remembered by Matcher objects is an append position. The append position is used to remember the amount of the input character sequence that has already been copied out by previous invocations of appendReplacement(). When appendReplacement() is invoked, the following process takes place:

1. Characters are read from the input starting at the current append position and appended to the provided StringBuffer. The last character copied is the one just before the first character of the matched pattern. This is the character at the index returned by start() minus one.
2. The replacement string is appended to the StringBuffer, with any embedded capture group references substituted as described earlier.
3. The append position is updated to be the index of the character following the matched pattern, which is the value returned by end().

The appendReplacement() method works properly only if a previous match operation was successful (usually a call to find()). You will be rewarded with a delightful java.lang.IllegalStateException if the last match returned false or if the method is called immediately following a reset.
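The two behaviors described above, accumulating results from several inputs in one buffer and the exception thrown when no match is current, can be sketched as follows. This is only an illustration: the class name AppendPositionSketch and both helper methods are hypothetical. It relies on reset(CharSequence) setting the append position back to zero for each new input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendPositionSketch {
    // Accumulate the result of matching against several inputs in one buffer
    static String joinReplaced(String[] inputs) {
        Pattern pattern = Pattern.compile("([Tt])hanks");
        Matcher matcher = pattern.matcher("");
        StringBuffer sb = new StringBuffer();

        for (String input : inputs) {
            // New input; the append position starts over at zero
            matcher.reset(input);

            while (matcher.find()) {
                // Copies input from the append position up to start() - 1,
                // appends the replacement, then moves the position to end()
                matcher.appendReplacement(sb, "$1hank you");
            }

            matcher.appendTail(sb);
            sb.append('\n');
        }

        return sb.toString();
    }

    // Calling appendReplacement() with no successful match pending
    static boolean failsWithoutMatch() {
        Matcher matcher = Pattern.compile("x").matcher("abc");
        try {
            matcher.appendReplacement(new StringBuffer(), "y");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] argv) {
        System.out.print(joinReplaced(new String[] { "Thanks a lot", "no thanks" }));
        System.out.println(failsWithoutMatch());
    }
}
```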
But don't forget that there may be remaining characters in the input beyond the last match of the pattern. You probably don't want to lose those, but appendReplacement() will not have copied them, and end() won't return a useful value after find() fails to find any more matches. The appendTail() method is there to copy the remainder of your input in this situation. It simply copies any characters from the current append position to the end of the input and appends them to the given StringBuffer. The following code is a typical usage scenario for appendReplacement() and appendTail():

    Pattern pattern = Pattern.compile ("([Tt])hanks");

To generate a replacement sequence of ab, the

Saturday, November 4th, 2006

To generate a replacement sequence of a\b, the String literal argument to replaceAll() must be a\\\\b (see Example 5-7). Be careful when counting those backslashes!

Example 5-7. Backslashes in regular expressions

    package com.ronsoft.books.nio.regex;

    import java.util.regex.Pattern;
    import java.util.regex.Matcher;

    /**
     * Demonstrate behavior of backslashes in regex patterns.
     *
     * @author Ron Hitchens (ron@ronsoft.com)
     */
    public class BackSlashes
    {
        public static void main (String [] argv)
        {
            // Substitute "a\b" for XYZ or ABC in input
            String rep = "a\\\\b";
            String input = "> XYZ <=> ABC <";
            Pattern pattern = Pattern.compile ("ABC|XYZ");
            Matcher matcher = pattern.matcher (input);

            System.out.println (matcher.replaceFirst (rep));
            System.out.println (matcher.replaceAll (rep));

            // Change all newlines in input to escaped, DOS-like CR/LF
            rep = "\\\\r\\\\n";
            input = "line 1\nline 2\nline 3\n";
            pattern = Pattern.compile ("\\n");
            matcher = pattern.matcher (input);

            System.out.println ("");
            System.out.println ("Before:");
            System.out.println (input);

            System.out.println ("After (dos-ified, escaped):");
            System.out.println (matcher.replaceAll (rep));
        }
    }

Here's the output from running BackSlashes:

    > a\b <=> ABC <
    > a\b <=> a\b <

    Before:
    line 1
    line 2
    line 3

    After (dos-ified, escaped):
    line 1\r\nline 2\r\nline 3\r\n
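The backslash counting gets easier to follow when each level of escaping is isolated in a small method. The sketch below is one way to do that; the class name ReplacementEscapesSketch and the sample inputs are invented. The last method assumes Java 5 or later, where Matcher.quoteReplacement() does the escaping for you.

```java
import java.util.regex.Matcher;

public class ReplacementEscapesSketch {
    // Source literal "a\\\\b" is the String a\\b, which the replacement
    // engine reads as the text a\b
    static String toBackslashB(String input) {
        return input.replaceAll("ABC|XYZ", "a\\\\b");
    }

    // Source literal "\\$" is the String \$, a literal dollar sign
    // rather than the start of a group reference
    static String toDollar(String input) {
        return input.replaceAll("USD ", "\\$");
    }

    // quoteReplacement() escapes every \ and $ in an arbitrary literal
    static String replaceLiterally(String input, String literal) {
        return input.replaceAll("ABC|XYZ", Matcher.quoteReplacement(literal));
    }

    public static void main(String[] argv) {
        System.out.println(toBackslashB("> XYZ <=> ABC <"));
        System.out.println(toDollar("USD 5"));
        System.out.println(replaceLiterally("XYZ", "a\\b$1"));
    }
}
```

When the replacement text comes from user input or a file, quoting it this way is usually safer than trying to double the backslashes by hand.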

The number of capture groups in the regular

Saturday, November 4th, 2006

    {
        public static void main (String [] argv)
        {
            // sanity check, need at least three args
            if (argv.length < 3) {
                System.out.println ("usage: regex replacement input ...");
                return;
            }

            // Save the regex and replacement strings with mnemonic names
            String regex = argv [0];
            String replace = argv [1];

            // Compile the expression; needs to be done only once
            Pattern pattern = Pattern.compile (regex);

            // Get a Matcher instance and use a dummy input string for now
            Matcher matcher = pattern.matcher ("");

            // print out for reference
            System.out.println (" regex: '" + regex + "'");
            System.out.println (" replacement: '" + replace + "'");

            // For each remaining arg string, apply the regex/replacement
            for (int i = 2; i < argv.length; i++) {
                System.out.println ("------------------------");

                matcher.reset (argv [i]);

                System.out.println (" input: '" + argv [i] + "'");
                System.out.println ("replaceFirst(): '"
                    + matcher.replaceFirst (replace) + "'");
                System.out.println (" replaceAll(): '"
                    + matcher.replaceAll (replace) + "'");
            }
        }
    }

And here's the output from running RegexReplace:

     regex: '([bB])yte'
     replacement: '$1ite'
    ------------------------
     input: 'Bytes is bytes'
    replaceFirst(): 'Bites is bytes'
     replaceAll(): 'Bites is bites'

Remember that regular expressions interpret backslashes in the strings you provide. Also remember that the Java compiler expects two backslashes for each one in a literal String. This means that if you want to escape a backslash in the regex, you'll need two backslashes in the compiled String. To get two backslashes in a row in the compiled regex string, you'll need four backslashes in a row in the Java source code.
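The two-then-four rule for backslashes in the pattern itself can be seen in a short sketch like this one. The class name RegexBackslashSketch and the sample path are made up for the illustration.

```java
import java.util.regex.Pattern;

public class RegexBackslashSketch {
    // Source literal "\\\\" is the String \\, which the regex engine
    // reads as one escaped, literal backslash
    private static final Pattern BACKSLASH = Pattern.compile("\\\\");

    // Source literal "\\d+" is the String \d+, the regex for a run of digits
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    static String toForwardSlashes(String path) {
        return BACKSLASH.matcher(path).replaceAll("/");
    }

    static boolean isDigits(String s) {
        return DIGITS.matcher(s).matches();
    }

    public static void main(String[] argv) {
        // The path below is C:\temp\data.txt once the compiler is done with it
        System.out.println(toForwardSlashes("C:\\temp\\data.txt"));
        System.out.println(isDigits("2006"));
    }
}
```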

The number of capture groups in the regular

Saturday, November 4th, 2006

The number of capture groups in the regular expression pattern is returned by the groupCount() method. This value derives from the original Pattern object and is immutable. Group numbers must be non-negative and no greater than the value returned by groupCount(). Passing a group number out of range will result in a java.lang.IndexOutOfBoundsException.

A capture group number can be passed to start() and end() to determine the subsequence matching the given capture group subexpression. It's possible for the overall expression to successfully match but one or more capture groups not to have matched. The start() and end() methods will return a value of -1 if the requested capture group is not currently set. As mentioned earlier, the entire regular expression is considered to be group zero. Invoking start() or end() with no argument is equivalent to passing an argument of zero. Invoking start() or end() for group zero will never return -1.

You can extract a matching subsequence from the input CharSequence using the values returned by start() and end() (as shown previously), but the group() methods provide an easier way to do this. Invoking group() with a numeric argument returns a String that is the matching subsequence for that particular capture group. If you call the version of group() that takes no argument, the subsequence matched by the entire regular expression (group zero) is returned. This code:

    String match0 = input.subSequence (matcher.start(), matcher.end()).toString();
    String match2 = input.subSequence (matcher.start (2), matcher.end (2)).toString();

is equivalent to this:

    String match0 = matcher.group();
    String match2 = matcher.group(2);

Finally, let's look at the methods of the Matcher object that deal with modifying a character sequence. One of the most common applications of regular expressions is to do a search-and-replace. The replaceFirst() and replaceAll() methods make this very easy to do. They behave identically except that replaceFirst() stops after the first match it finds, while replaceAll() iterates until all matches have been replaced. Both take a String argument that is the replacement value to substitute for the matched pattern in the input character sequence.

    package java.util.regex;

    public final class Matcher
    {
        // This is a partial API listing

        public String replaceFirst (String replacement)
        public String replaceAll (String replacement)
    }
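The group-related behavior described above can be checked with a small sketch. The pattern, the inputs, and the class name GroupSketch are assumptions chosen so that the second capture group is optional and can be left unset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupSketch {
    // Two capture groups; the second is optional and may stay unset
    static final Pattern PATTERN = Pattern.compile("([a-z]+)(\\d+)?");

    // start(2) is -1 when the overall match succeeded without group 2
    static int startOfGroup2(String input) {
        Matcher matcher = PATTERN.matcher(input);
        return matcher.find() ? matcher.start(2) : Integer.MIN_VALUE;
    }

    // group(n) returns the same text as subSequence(start(n), end(n))
    static boolean sameAsSubSequence(String input) {
        Matcher matcher = PATTERN.matcher(input);
        if (!matcher.find()) {
            return false;
        }
        String viaIndexes =
            input.subSequence(matcher.start(1), matcher.end(1)).toString();
        return viaIndexes.equals(matcher.group(1));
    }

    public static void main(String[] argv) {
        Matcher matcher = PATTERN.matcher("abc42 de");

        System.out.println(matcher.groupCount());
        System.out.println(startOfGroup2("abc"));
        System.out.println(startOfGroup2("abc42"));
        System.out.println(sameAsSubSequence("abc42"));

        // Both replace methods reset the matcher before they scan
        System.out.println(matcher.replaceFirst("[$1]"));
        System.out.println(matcher.replaceAll("[$1]"));
    }
}
```

Note that groupCount() does not include group zero, so for this pattern the valid arguments to group(), start(), and end() are 0, 1, and 2.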

The number of capture groups in the regular

Saturday, November 4th, 2006

As mentioned earlier, capture groups can be back-referenced within the regular expression. They can also be referenced from the replacement string you provide to replaceFirst() or replaceAll(). Capture group numbers can be embedded in the replacement string by preceding them with a dollar sign character. When the replacement string is substituted into the result string, each occurrence of $g is replaced by the value that would be returned by group(g). If you want to use a literal dollar sign in the replacement string, you must precede it with a backslash character (\$). To pass through a backslash, you must double it (\\). If you want to concatenate literal numeric digits following a capture group reference, separate them from the group number with a backslash, like this: 123$2\456. See Table 5-4 for some examples. See also Example 5-6 for sample code.

Table 5-4. Replacement of matched patterns

Regex pattern               Input                    Replacement   replaceFirst()                    replaceAll()
a*b                         aabfooaabfooabfoob       -             -fooaabfooabfoob                  -foo-foo-foo-
\p{Blank}                   fee fie foe fum          _             fee_fie foe fum                   fee_fie_foe_fum
([bB])yte                   Byte for byte            $1ite         Bite for byte                     Bite for bite
\d\d\d\d([- ])              card #1234-5678-1234     xxxx$1        card #xxxx-5678-1234              card #xxxx-xxxx-1234
(up|left)( *)(right|down)   left right, up down      $3$2$1        right left, up down               right left, down up
([CcPp][hl]e[ea]se)         I want cheese. Please.   <b> $1 </b>   I want <b> cheese </b>. Please.   I want <b> cheese </b>. <b> Please </b>.

Example 5-6. Regular expression replacement

    package com.ronsoft.books.nio.regex;

    import java.util.regex.Pattern;
    import java.util.regex.Matcher;

    /**
     * Exercise the replacement capabilities of the java.util.regex.Matcher class.
     * Run this code from the command line with three or more arguments.
     * 1) First argument is a regular expression
     * 2) Second argument is a replacement string, optionally with capture group
     *    references ($1, $2, etc)
     * 3) Any remaining arguments are treated as input strings to which the
     *    regular expression and replacement strings will be applied.
     * The effect of calling replaceFirst() and replaceAll() for each input string
     * will be listed.
     *
     * Be careful to quote the command line arguments if they contain spaces or
     * special characters.
     *
     * @author Ron Hitchens (ron@ronsoft.com)
     */
    public class RegexReplace

// Compile the email address detector pattern Pattern

Friday, November 3rd, 2006

The lookingAt() method is similar to matches() but does not require that the entire sequence be matched by the pattern. If the regular expression pattern matches the beginning of the character sequence, then lookingAt() returns true. The lookingAt() method always begins scanning at the beginning of the sequence. The name of this method is intended to indicate whether the matcher is currently "looking at" a target that starts with the pattern. If it returns true, then the start(), end(), and group() methods can be called to determine the extent of the matched subsequence (more about those methods shortly).

The find() method performs the same sort of matching operation as lookingAt(), but remembers the position of the previous match and resumes scanning after it. This allows successive calls to find() to step through the input and find embedded matches. On the first call following a reset, scanning begins at the first character of the input sequence. On subsequent calls, it resumes scanning at the first character following the previously matched subsequence. For each invocation, true is returned if the pattern was found; otherwise, false is returned. Typically, you'll use find() to iterate over some text to find all the matching patterns within it.

The version of find() that takes a positional argument does an implicit reset and begins scanning the input at the provided index position. Afterwards, no-argument find() calls can be made to scan the remainder of the input sequence if needed.

Once a match has been detected, you can determine where in the character sequence the match is located by calling start() and end(). The start() method returns the index of the first character of the matched sequence; end() returns the index of the last character of the match plus one. These values are consistent with CharSequence.subSequence() and can be used directly to extract the matched subsequence.

    CharSequence subseq;

    if (matcher.find()) {
        subseq = input.subSequence (matcher.start(), matcher.end());
    }

Some regular expressions can match the empty string, in which case start() and end() will return the same value. The start() and end() methods return meaningful values only if a match has previously been detected by matches(), lookingAt(), or find(). If no match has been made, or the last matching attempt returned false, then invoking start() or end() will result in a java.lang.IllegalStateException.

To understand the forms of start() and end() that take a group argument, we first need to understand expression capture groups. (See Figure 5-2.)

Figure 5-2. start(), end(), and group() values
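As a minimal sketch of the matching methods described in this section, the following compares matches(), lookingAt(), and find() on one input and records the start() and end() values of each match. The class name FindLoopSketch and the helper spans() are hypothetical; the pattern and input are borrowed from the "thanks" examples.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindLoopSketch {
    // Collect "start-end:text" for every match that find() steps through
    static List<String> spans(String input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();

        while (matcher.find()) {
            // end() is one past the last matched character, which is
            // exactly what subSequence() expects
            CharSequence subseq =
                input.subSequence(matcher.start(), matcher.end());
            result.add(matcher.start() + "-" + matcher.end() + ":" + subseq);
        }

        return result;
    }

    public static void main(String[] argv) {
        String input = "Thanks, thanks very much";
        Matcher matcher = Pattern.compile("[Tt]hanks").matcher(input);

        // matches() needs the whole input; lookingAt() only the beginning
        System.out.println(matcher.matches());
        System.out.println(matcher.lookingAt());

        // find(int) resets implicitly and scans from the given index
        System.out.println(matcher.find(7) + " at " + matcher.start());

        System.out.println(spans(input, "[Tt]hanks"));
    }
}
```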