Volume II - Advanced Features
Core Java® has long been recognized as the leading, no-nonsense tutorial and reference for experienced programmers who want to write robust Java code for real-world applications. Now, Core Java®, Volume II—Advanced Features, Tenth Edition, has been extensively updated to reflect the most eagerly awaited and innovative version of Java in years: Java SE 8. Rewritten and reorganized to illuminate powerful new Java features, idioms, and best practices for enterprise and desktop development, it contains hundreds of up-to-date example programs—all carefully crafted for easy understanding and practical applicability.
Writing for serious programmers solving real-world problems, Cay Horstmann deepens your understanding of today’s Java language and library. In this second of two updated volumes, he offers in-depth coverage of advanced topics including the new Streams API and date/time/calendar library, advanced Swing, security, code processing, and more. This guide will help you:
* Use the new Streams library to process collections more flexibly and efficiently
* Efficiently access files and directories, read/write binary or text data, and serialize objects
* Work with Java SE 8’s regular expression package
* Make the most of XML in Java: parsing, validation, XPath, document generation, XSL, and more
* Efficiently connect Java programs to network services
* Program databases with JDBC 4.2
* Elegantly overcome date/time programming complexities with the new java.time API
* Write internationalized programs with localized dates/times, numbers, text, and GUIs
* Process code with the scripting API, compiler API, and annotation processors
* Enforce security via class loaders, bytecode verification, security managers, permissions, user authentication, digital signatures, code signing, and encryption
* Master advanced Swing components for lists, tables, trees, text, and progress indicators
* Produce high-quality drawings with the Java 2D API
* Use JNI native methods to leverage code in other languages
If you’re an experienced programmer moving to Java SE 8, Core Java®, Tenth Edition, is the reliable, practical, and complete guide to the Java platform that has been trusted by developers for over twenty years.
Look for the companion volume, Core Java®, Volume I—Fundamentals, Tenth Edition (ISBN-13: 978-0-13-417730-4), for foundational coverage of Java 8 language concepts, UI programming, objects, generics, collections, lambda expressions, concurrency, functional programming, and more.
* Extract data from any source to perform real-time analytics
* Full of techniques and examples to help you crawl websites and extract data within hours
* A hands-on guide to web scraping and crawling with real-life problems and solutions
This book covers the long-awaited Scrapy v1.0, which empowers you to extract useful data from virtually any source with very little effort. It starts off by explaining the fundamentals of the Scrapy framework, followed by a thorough description of how to extract data from any source, clean it up, and shape it to your requirements using Python and third-party APIs. Next, you will be familiarised with the process of storing the scraped data in databases as well as search engines and performing real-time analytics on it with Spark Streaming. By the end of this book, you will have perfected the art of scraping data for your applications with ease.
WHAT YOU WILL LEARN
* Understand HTML pages and write XPath to extract the data you need
* Write Scrapy spiders with simple Python and do web crawls
* Push your data into any database, search engine or analytics system
* Configure your spider to download files, images and use proxies
* Create efficient pipelines that shape data in precisely the form you want
* Use Twisted Asynchronous API to process hundreds of items concurrently
* Make your crawler super-fast by learning how to tune Scrapy's performance
* Perform large scale distributed crawls with scrapyd and scrapinghub
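The pipeline idea in the list above can be sketched without installing Scrapy. The class below follows Scrapy's item-pipeline convention of a `process_item(item, spider)` method; the class name, the field names, the sample item, and the `run_pipelines` helper are hypothetical illustrations, and the book's own recipes may look different.

```python
class PriceCleanupPipeline:
    """Hypothetical pipeline stage: normalizes one scraped item."""

    def process_item(self, item, spider):
        # Collapse stray whitespace in the title.
        item["title"] = " ".join(item["title"].split())
        # Turn a price string such as "$1,299.00" into a float.
        raw = item["price"].replace("$", "").replace(",", "")
        item["price"] = float(raw)
        return item


def run_pipelines(item, pipelines, spider=None):
    """Pass one item through each stage in order (stand-in for the crawler engine)."""
    for stage in pipelines:
        item = stage.process_item(item, spider)
    return item


# Hypothetical scraped item with messy whitespace and a formatted price.
scraped = {"title": "  Core   Java \n", "price": "$1,299.00"}
cleaned = run_pipelines(dict(scraped), [PriceCleanupPipeline()])
print(cleaned)
```

Keeping each stage small and chaining several of them is one way to shape data step by step; in a real Scrapy project the stages would be enabled through the project settings rather than called by hand.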
ABOUT THE AUTHOR
Dimitrios Kouzis-Loukas has over fifteen years of experience as a top-notch software developer. He uses his acquired knowledge and expertise to teach a wide range of audiences how to write great software.
He studied and mastered several disciplines, including mathematics, physics, and microelectronics. His thorough understanding of these subjects helped him raise his standards beyond the scope of "pragmatic solutions." He knows that true solutions should be as certain as the laws of physics, as robust as ECC memories, and as universal as mathematics.
Dimitrios now develops distributed, low-latency, high-availability systems using the latest datacenter technologies. He is language agnostic, yet has a slight preference for Python, C++, and Java. A firm believer in open source software and hardware, he hopes that his contributions will benefit individual communities as well as all of humanity.
TABLE OF CONTENTS
1. Introducing Scrapy
2. Understanding HTML and XPath
3. Basic Crawling
4. From Scrapy to a Mobile App
5. Quick Spider Recipes
6. Deploying to Scrapinghub
7. Configuration and Management
8. Programming Scrapy
9. Pipeline Recipes
10. Understanding Scrapy's Performance
11. Distributed Crawling with Scrapyd and Real-Time Analytics
12. Installing and troubleshooting prerequisite software
Design and build Service-Oriented Architecture Solutions with the Oracle SOA Suite 10gR3
This book is a comprehensive guide, split into three sections. The initial section of the book provides an introduction to the Oracle SOA Suite and its various components, and will give you a fast-paced hands-on introduction to each of the key components in turn. The next section illustrates the usage of the various components of the SOA Suite to implement a real-world SOA-based solution with the help of an example of an online auction site (oBay). The final section covers other considerations such as the packaging, deployment, testing, security, and administration of SOA applications.
This book targets developers and technical architects who work in the SOA domain. The primary purpose of the book is to provide them with a "hands on" practical guide to using and applying the Oracle SOA Suite in the delivery of real-world composite applications. It presumes basic understanding of the concepts of SOA, as well as some of the key standards in this space, including web services (SOAP, WSDL), XML Schemas, and XSLT (and XPath).
This book is primarily a practical reference book for professional XSLT developers. It assumes no previous knowledge of the language, and many developers have used it as their first introduction to XSLT; however, it is not structured as a tutorial, and there are other books on XSLT that provide a gentler approach for beginners. The book does assume a basic knowledge of XML, HTML, and the architecture of the Web, and it is written for experienced programmers. There’s no assumption that you know any particular language such as Java or Visual Basic, just that you recognize the concepts that all programming languages have in common.
The book is suitable both for XSLT 1.0 users upgrading to XSLT 2.0, and for newcomers to XSLT. The book is also equally suitable whether you work in the Java or .NET world.
As befits a reference book, a key aim is that the coverage should be comprehensive and authoritative. It is designed to give you all the details, not just an overview of the 20 percent of the language that most people use 80 percent of the time. It’s designed so that you will keep coming back to the book whenever you encounter new and challenging programming tasks, not as a book that you skim quickly and then leave on the shelf. If you like detail, you will enjoy this book; if not, you probably won’t.
But as well as giving the detail, this book aims to explain the concepts, in some depth. It’s therefore a book for people who not only want to use the language but who also want to understand it at a deep level.
The book aims to tell you everything you need to know about the XSLT 2.0 language. It gives equal weight to the things that are new in XSLT 2.0 and the things that were already present in version 1.0. The book is about the language, not about specific products. However, there are appendices about Saxon (the author’s own implementation of XSLT 2.0), about the Altova XSLT 2.0 implementation, and about the Java and Microsoft APIs for controlling XSLT transformations, which will no doubt be upgraded to handle XSLT 2.0 as well as 1.0. A third XSLT 2.0 processor, Gestalt, was released shortly before the book went to press, too late to describe it in any detail. But the experience of XSLT 1.0 is that there has been a very high level of interoperability between different XSLT processors, and if you can use one of them, then you can use them all.
In the previous edition we split XSLT 2.0 and XPath 2.0 into separate volumes. The idea was that some readers might be interested in XPath alone. However, many bought the XSLT 2.0 book without its XPath companion and were left confused as a result; so this time, the material is back together. The XPath reference information is in self-contained chapters, so it should still be accessible when you use XPath in contexts other than XSLT.
The book does not cover XSL Formatting Objects, a big subject in its own right. Nor does it cover XML Schemas in any detail. If you want to use these important technologies in conjunction with XSLT, there are other books that do them justice.
This book contains twenty chapters and eight appendixes (the last of which is a glossary) organized into four parts. The following section outlines what you can find in each part, chapter, and appendix.
Part I: Foundations: The first part of the book covers essential concepts. You should read these before you start coding. If you ignore this advice, as most people do, then read them when you get to that trough of despair when you find it impossible to make the language do anything but the most trivial tasks. XSLT is different from other languages, and to make it work for you, you need to understand how it was designed to be used.
Chapter 1: XSLT in Context: This chapter explains how XSLT fits into the big picture: how the language came into being and how it sits alongside other technologies. It also has a few simple coding examples to keep you alert.
Chapter 2: The XSLT Processing Model: This is about the architecture of an XSLT processor: the inputs, the outputs, and the data model. Understanding the data model is perhaps the most important thing that distinguishes an XSLT expert from an amateur; it may seem like information that you can’t use immediately, but it’s knowledge that will stop you making a lot of stupid mistakes.
Chapter 3: Stylesheet Structure: XSLT development is about writing stylesheets, and this chapter takes a bird’s eye view of what stylesheets look like. It explains the key concepts of rule-based programming using templates, and explains how to undertake programming-in-the-large by structuring your application using modules and pipelines.
Chapter 4: Stylesheets and Schemas: A key innovation in XSLT 2.0 is that stylesheets can take advantage of knowledge about the structure of your input and output documents, provided in the form of an XML Schema. This chapter provides a quick overview of XML Schema to describe its impact on XSLT development. Not everyone uses schemas, and you can skip this chapter if you fall into that category.
Chapter 5: The Type System: XPath 2.0 and XSLT 2.0 offer strong typing as an alternative to the weak typing approach of the 1.0 languages. This means that you can declare the types of your variables, functions, and parameters, and use this information to get early warning of programming errors. This chapter explains the data types available and the mechanisms for creating user-defined types.
Part II: XSLT and XPath Reference: This section of the book contains reference material, organized in the hope that you can easily find what you need when you need it. It’s not designed for sequential reading, though you might well want to leaf through the pages to discover what’s there.
Chapter 6: XSLT Elements: This monster chapter lists all the XSLT elements you can use in a stylesheet, in alphabetical order, giving detailed rules for the syntax and semantics of each element, advice on usage, and examples. This is probably the part of the book you will use most frequently as you become an expert XSLT user. It’s a “no stone unturned” approach, based on the belief that as a professional developer you need to know what happens when the going gets tough, not just when the wind is in your direction.
Chapter 7: XPath Fundamentals: This chapter explains the basics of XPath: the low-level constructs such as literals, variables, and function calls. It also explains the context rules, which describe how the evaluation of XPath expressions depends on the XSLT processing context in which they appear.
Chapter 8: XPath: Operators on Items: XPath offers the usual range of operators for performing arithmetic, boolean comparison, and the like. However, these don’t always behave exactly as you would expect, so it’s worth reading this chapter to see what’s available and how it differs from the last language that you used.
Chapter 9: XPath: Path Expressions: Path expressions are what make XPath special; they enable you to navigate around the structure of an XML document. This chapter explains the syntax of path expressions, the 13 axes that you can use to locate the nodes that you need, and associated operators such as union, intersection, and difference.
Chapter 10: XPath: Sequence Expressions: Unlike XPath 1.0, in version 2.0 all values are sequences (singletons are just a special case). Some of the most important operators in XPath 2.0 are those that manipulate sequences, notably the «for» expression, which translates one sequence into another by applying a mapping.
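As a rough analogy only, written in Python rather than XPath: the XPath 2.0 expression `for $n in (1, 2, 3) return $n * 2` maps one sequence to another, a single item behaves as a sequence of length one, and sequences never nest. The helper names `as_sequence` and `for_expr` below are hypothetical and are not part of any XPath library.

```python
def as_sequence(value):
    """Model an XPath 2.0 value: a single item is a sequence of length one,
    and nested sequences are flattened."""
    if isinstance(value, (list, tuple)):
        flat = []
        for item in value:
            flat.extend(as_sequence(item))
        return flat
    return [value]


def for_expr(sequence, mapping):
    """Analogue of `for $x in sequence return mapping($x)`."""
    result = []
    for item in as_sequence(sequence):
        # Each result is itself a sequence and is spliced in flat.
        result.extend(as_sequence(mapping(item)))
    return result


doubled = for_expr((1, 2, 3), lambda n: n * 2)
pairs = for_expr((1, 2), lambda n: (n, n * 10))
print(doubled)  # [2, 4, 6]
print(pairs)    # [1, 10, 2, 20]
```

The second call mirrors `for $n in (1, 2) return ($n, $n * 10)`: the two-item results are concatenated into one flat sequence instead of forming a list of pairs.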
Chapter 11: XPath: Type Expressions: The type system was explained in Chapter 5; this chapter explains the operations that you can use to take advantage of types. This includes the «cast» operation, which is used to convert values from one type to another. A big part of this chapter is devoted to the detailed rules for how these conversions are done.
Chapter 12: XSLT Patterns: This chapter returns from XPath to a subject that’s specific to XSLT. Patterns are used to define template rules, the essence of XSLT’s rule-based programming approach. The reason for explaining them now is that the syntax and semantics of patterns depend strongly on the corresponding rules for XPath expressions.
Chapter 13: The Function Library: XPath 2.0 includes a library of functions that can be called from any XPath expression; XSLT 2.0 extends this with some additional functions that are available only when XPath is used within XSLT. The library has grown immensely since XPath 1.0. This chapter provides a single alphabetical reference for all these functions.
Chapter 14: Regular Expressions: Processing of text is an area where XSLT 2.0 and XPath 2.0 are much more powerful than version 1.0, and this is largely through the use of constructs that exploit regular expressions. If you’re familiar with regexes from languages such as Perl, this chapter tells you how XPath regular expressions differ. If you’re new to the subject, it explains it from first principles.
Chapter 15: Serialization: Serialization in XSLT means the ability to generate a textual XML document from the tree structure that’s manipulated by a stylesheet. This isn’t part of XSLT processing proper, so (following W3C’s lead) it’s separated into its own chapter. You can control serialization from the stylesheet using a declaration, but many products also allow you to control it directly via an API.
Part III: Exploitation: The final section of the book is advice and guidance on how to take advantage of XSLT to write real applications. It’s intended to make you not just a competent XSLT coder, but a competent designer too. The best way of learning is by studying the work of others, so the emphasis here is on practical case studies.
Chapter 16: Extensibility: This chapter describes the “hooks” provided in the XSLT specification to allow vendors and users to plug in extra functionality. The way this works will vary from one implementation to another, so we can’t cover all possibilities, but one important aspect that the chapter does cover is how to use such extensions and still keep your code portable.
Chapter 17: Stylesheet Design Patterns: This chapter explores a number of design and coding patterns for XSLT programming, starting with the simplest “fill-in-the-blanks” stylesheet, and extending to the full use of recursive programming in the functional programming style, which is needed to tackle problems of any computational complexity. This provides an opportunity to explain the thinking behind functional programming and the change in mindset needed to take full advantage of this style of development.
Chapter 18: Case Study: XMLSpec: XSLT is often used for rendering documents, so where better to look for a case study than the stylesheets used by the W3C to render the XML and XSLT specifications, and others in the same family, for display on the web? The resulting stylesheets are typical of those you will find in any publishing organization that uses XML to develop a series of documents with a compatible look-and-feel.
Chapter 19: Case Study: A Family Tree: Displaying a family tree is another typical XSLT application. This example works with semi-structured data—a mixture of fairly complex data and narrative text—that can be presented in many different ways for different audiences. It also shows how to tackle another typical XSLT problem, conversion of the data into XML from a legacy text-based format. As it happens, this uses nearly all the important new XSLT 2.0 features in one short stylesheet. But another aim of this chapter is to show a collection of stylesheets doing different jobs as part of a complete application.
Chapter 20: Case Study: Knight's Tour: Finding a route around a chessboard where a knight visits every square without ever retracing its steps might sound a fairly esoteric application for XSLT, but it’s a good way of showing how even the most complex of algorithms are within the capabilities of the language. You may not need to tackle this particular problem, but if you want to construct an SVG diagram showing progress against your project plan, then the problems won’t be that dissimilar.
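To make the knight's tour problem concrete, here is a minimal sketch in Python, not the book's XSLT stylesheet. It uses recursive search with backtracking and orders candidate moves by Warnsdorff's heuristic (fewest onward moves first); that ordering is an assumption chosen for this sketch, and the function names are hypothetical.

```python
MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knights_tour(size, start=(0, 0)):
    """Return a list of squares visiting every square once, or None."""
    visited = {start}
    path = [start]

    def free_neighbours(square):
        row, col = square
        return [
            (row + dr, col + dc)
            for dr, dc in MOVES
            if 0 <= row + dr < size
            and 0 <= col + dc < size
            and (row + dr, col + dc) not in visited
        ]

    def extend():
        if len(path) == size * size:
            return True
        # Warnsdorff's heuristic: try squares with the fewest onward moves first.
        candidates = sorted(free_neighbours(path[-1]),
                            key=lambda sq: len(free_neighbours(sq)))
        for square in candidates:
            visited.add(square)
            path.append(square)
            if extend():
                return True
            # Dead end: undo the move and try the next candidate.
            path.pop()
            visited.remove(square)
        return False

    return path if extend() else None


def is_valid_tour(path, size):
    """Check that the path covers the board using only legal knight moves."""
    steps_ok = all(
        (abs(r1 - r2), abs(c1 - c2)) in {(1, 2), (2, 1)}
        for (r1, c1), (r2, c2) in zip(path, path[1:])
    )
    return steps_ok and len(set(path)) == size * size == len(path)


tour = knights_tour(5)
print(is_valid_tour(tour, 5))
```

The recursion in `extend` carries the whole search state in its arguments and enclosing variables, which is the same shape a functional-style solution takes; a stylesheet would pass the board state as parameters instead of mutating a shared set.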
Part IV: Appendices
Appendix A: XPath 2.0 Syntax Summary: Collects the XPath grammar rules and operator precedences into one place for ease of reference.
Appendix B: Error Codes: A list of all the error codes defined in the XSLT and XPath language specifications, with brief explanations to help you understand what’s gone wrong.
Appendix C: Backward Compatibility: The list of things you need to look out for when converting applications from XSLT 1.0.
Appendix D: Microsoft XSLT Processors: Although the two Microsoft XSLT processors don’t yet support XSLT 2.0, we thought many readers would find it useful to have a quick summary here of the main objects and methods used in their APIs.
Appendix E: JAXP: the Java API for XML Processing: JAXP is an interface rather than a product. Again, it doesn’t have explicit support yet for XSLT 2.0, but Java programmers will often be using it in XSLT 2.0 projects, so the book includes an overview of the classes and methods available.
Appendix F: Saxon: At the time of writing Saxon (developed by the author of this book) provides the most comprehensive implementation of XSLT 2.0 and XPath 2.0, so its interfaces and extensions are covered in some detail.
Appendix G: Altova: Altova, the developers of XML Spy, have an XSLT 2.0 processor that can be used either as part of the development environment or as a freestanding component. This appendix gives details of its interfaces.
Appendix H: Glossary
Note: CD-ROM/DVD and other supplementary materials are not included as part of eBook file.
Referring to specific information inside an XML document is a little like finding a needle in a haystack: how do you differentiate the information you need from everything else? XPath and XPointer are two closely related languages that play a key role in XML processing by allowing developers to find these needles and manipulate embedded information. XPath describes a route for finding specific items by defining a path through the hierarchy of an XML document, abstracting only the information that's relevant for identifying the data. XPointer extends XPath to identify more complex parts of documents. The two technologies are critical for developers seeking needles in haystacks in various types of processing.
XPath and XPointer fills an essential need for XML developers by focusing directly on a critical topic that has been covered only briefly. Written by John Simpson, an author with considerable XML experience, the book offers practical knowledge of the two languages that underpin XML, XSLT and XLink. XPath and XPointer cuts through basic theory and provides real-world examples that you can use right away.
Written for XML and XSLT developers and anyone else who needs to address information in XML documents, the book assumes a working knowledge of XML and XSLT. It begins with an introduction to XPath basics. You'll learn about location steps and paths, XPath functions and numeric operators. Once you've covered XPath in depth, you'll move on to XPointer--its background, syntax, and forms of addressing. By the time you've finished the book, you'll know how to construct a full XPointer (one that uses an XPath location path to address document content) and completely understand both the XPath and XPointer features it uses.
XPath and XPointer contains material on the forthcoming XPath 2.0 spec and EXSLT extensions, as well as versions 1.0 of both XPath and XPointer.
A succinct but thorough hands-on guide, no other book on the market provides comprehensive information on these two key XML technologies in one place.
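The idea of a path through the document hierarchy can be illustrated with a short sketch. It uses Python's standard `xml.etree.ElementTree` module, which implements only a limited subset of XPath 1.0 and no XPointer; the sample document and its element and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are made up.
DOCUMENT = """
<library>
  <shelf topic="xml">
    <book id="b1"><title>First Title</title></book>
    <book id="b2"><title>Second Title</title></book>
  </shelf>
  <shelf topic="java">
    <book id="b3"><title>Third Title</title></book>
  </shelf>
</library>
"""

root = ET.fromstring(DOCUMENT)

# A location path: one step per level of the hierarchy.
all_titles = [t.text for t in root.findall("./shelf/book/title")]

# A predicate narrows a step to the nodes that match a condition.
xml_titles = [t.text for t in root.findall("./shelf[@topic='xml']/book/title")]

# '//' searches descendants at any depth.
third = root.find(".//book[@id='b3']/title").text

print(all_titles)
print(xml_titles)
print(third)
```

Each path names only what is needed to identify the target (element names and one attribute test) and ignores everything else in the document.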
Solutions to Real-World Problems
While the XML "buzz" still dominates talk among Internet developers, the critical need is for information that cuts through the hype and lets Java programmers put XML to work. Java & XML shows how to use the APIs, tools, and tricks of XML to build real-world applications, with the end result that both the data and the code are portable.
This second edition of Java & XML adds chapters on Advanced SAX and Advanced DOM, new chapters on SOAP and data binding, and new examples throughout. A concise chapter on XML basics introduces concepts, and the rest of the book focuses on using XML from your Java applications. Java developers who need to work with XML, or think that they will in the future--as well as developers involved in the new peer-to-peer movement, messaging, or web services--will find the new Java & XML a constant companion.
This book covers:
* The basics of XML, including DTDs, namespaces, XML Schema, XPath, and XSL
* The SAX API, including all handlers, the SAX 2 extensions, filters, and writers
* The DOM API, including DOM Level 2, Level 3, and the Traversal, Range, CSS, Events, and HTML modules
* The JDOM API, including the core, a look at XPath support, and JDOM as a JSR
* Using web publishing frameworks like Apache Cocoon
* Developing applications with XML-RPC
* Using SOAP and UDDI for web services
* Data Binding, using both DTDs and XML Schema for constraints
* Building business-to-business applications with XML
* Building information channels with RSS and dynamic content with XSP
Includes a quick reference on SAX 2.0, DOM Level 2, and JDOM.
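The handler-based style of SAX listed above can be sketched briefly. The book covers the Java API; this example instead uses Python's standard `xml.sax` module, whose `ContentHandler` callbacks follow the same SAX 2 event model. The `TitleCollector` class and the sample document are hypothetical.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None while outside a <title> element

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


SAMPLE = (
    b"<books>"
    b"<book><title>Java &amp; XML</title></book>"
    b"<book><title>XML Pocket Reference</title></book>"
    b"</books>"
)

handler = TitleCollector()
xml.sax.parseString(SAMPLE, handler)
print(handler.titles)
```

Unlike a DOM parse, nothing here builds a tree in memory: the parser pushes events to the handler, and the handler keeps only the state it needs.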
XML, the Extensible Markup Language, is the next-generation markup language for the Web. It provides a more structured (and therefore more powerful) medium than HTML, allowing you to define new document types and stylesheets as needed. Although the generic tags of HTML are sufficient for everyday text, XML gives you a way to add rich, well-defined markup to electronic documents.
The XML Pocket Reference is both a handy introduction to XML terminology and syntax, and a quick reference to XML instructions, attributes, entities, and datatypes. Although XML itself is complex, its basic concepts are simple. This small book combines a perfect tutorial for learning the basics of XML with a reference to the XML and XSL specifications. The new edition introduces information on XSLT (Extensible Stylesheet Language Transformations) and XPath.