cheerio v1.0.0-rc.2 Release Notes

Release Date: 2017-07-02
  • This release changes Cheerio's default parser to the Parse5 HTML parser. Parse5 is an excellent project that rigorously conforms to the HTML standard. It does not support XML, so Cheerio continues to use htmlparser2 when working with XML documents.

    This switch addresses many long-standing bugs in Cheerio, but some users may experience slower behavior in performance-critical applications. In addition, htmlparser2 is more forgiving of invalid markup, which can be useful when input is sourced from a third party and cannot be corrected. For these reasons, the load method also accepts a DOM structure as produced by the htmlparser2 library. See the project's "readme" file for more details on this usage pattern.

    Migrating from version 0.x

    cheerio.load( html[, options ] ) This method continues to act as a "factory" function. It produces functions that define an API that is similar to the global jQuery function provided by the jQuery library. The generated function operates on a DOM structure based on the provided HTML.

    In releases prior to version 1.0, the provided HTML was interpreted as a document fragment. Beginning with version 1.0, strings provided to the load method are interpreted as documents. The same input now produces a $ function that operates on a full HTML document, including an <html> document element with nested <head> and <body> tags. This mimics web browser behavior much more closely, but may require alterations to existing code.

    For example, the following code will produce different results between 0.x and 1.0 releases:

    var $ = cheerio.load('<p>Hello, <b>world</b>!</p>');
    
    $.root().html();
    
    //=> In version 0.x: '<p>Hello, <b>world</b>!</p>'
    //=> In version 1.0: '<html><head></head><body><p>Hello, <b>world</b>!</p></body></html>'
    

    Users wishing to parse, manipulate, and render full documents should not need to modify their code. Likewise, code that does not interact with the "root" element should not be affected by this change. (In the above example, the expression $('p') returns the same result across Cheerio versions--a Cheerio collection whose only member is a paragraph element.)

    However, users wishing to render document fragments should now explicitly create a "wrapper" element to contain their input.

    // First, create a Cheerio function "bound" to an empty document (this is
    // similar to loading an empty page in a web browser)
    var $ = cheerio.load('');
    // Next, create a "wrapper" element for the input fragment:
    var $wrapper = $('<div/>');
    // Finally, supply the input markup as the content for the wrapper:
    $wrapper.append('<p>Hello, <b>world</b>!</p>');
    
    $wrapper.html();
    //=> '<p>Hello, <b>world</b>!</p>'
    

    Change log:

    • Update History.md (and include migration guide) (Mike Pennisi)
    • Rename useHtmlParser2 option (Mike Pennisi)
    • Remove documentation for xmlMode option (Mike Pennisi)
    • Document advanced usage with htmlparser2 (Mike Pennisi)
    • Correct errors in Readme.md (Mike Pennisi)
    • Improve release process (Mike Pennisi)
    • 1.0.0-rc.1 (Mike Pennisi)
    • Update unit test (Mike Pennisi)
    • Normalize code style (Mike Pennisi)
    • Added support for nested wrapping. (Diane Looney)
    • Add nested wrapping test (Toni Helenius)
    • Added $.merge following the specification at https://api.jquery.com/jquery.merge/ Added test cases for $.merge (Diane Looney)
    • Clarify project scope in README file (Mike Pennisi)
    • .text() ignores script and style tags (#1018) (Haleem Assal)
    • Test suite housekeeping (#1016) (DianeLooney)
    • experiment with a job board (Matthew)
    • Change options format (inikulin)
    • Add test for #997 (inikulin)
    • Update .filter function docs. (Konstantin)
    • Standardise readme on ES6 variable declarations (Dekatron)
    • Use documents via $.load (inikulin)
    • Use parse5 as a default parser (closes #863) (inikulin)
    • Fix small typo in Readme (Darren Scerri)
    • Report test failures in CI (Mike Pennisi)
    • serializeArray should not ignore input elements without value attributes (Ricardo Gladwell)
    • Disallow variable shadowing (Mike Pennisi)
    • Update hasClass method (sufisaid)
    • Added MIT License fixes #902 (Prasanth Vaaheeswaran)
    • chore(package): update dependencies (greenkeeper[bot])
    • Use modular lodash package (#913) (Billy Janitsch)