All Versions: 11
Avg Release Cycle: 113 days
Latest Release: 807 days ago

Changelog History

  • v1.1.9 Changes

    April 12, 2020

    Important notice

    Drop Node.js 6 support.

    Bug fixes

    • #485 - Fix format in FS cache backend error constructor
  • v1.1.8 Changes

    June 13, 2019

    Bug fixes

    • #375 - Cache doesn't work as expected
  • v1.1.7 Changes

    April 09, 2019

    Important notice

    Dropped support for Node 4. The lowest supported version is now 6.13.0. This is due to requirements of some dependencies.

    New features

    • The README is now generated using jsdoc-to-markdown.

    Bug fixes

    • #419 - srcset source termination
    • #432 - Add a replacement in cleanURL
    • #439 - Links getting skipped due to escape sequence in href
    • #441 - Pattern for CSS url() syntax also wrongly matches some JS url() function calls
    • #443 - Oldest unfetched item duplicates
    • #447 - Fix TypeError ERR_INVALID_CALLBACK on fs.writeFile for node.js v10
  • v1.1.6 Changes

    October 06, 2017

    New features

    • Sitemap directives in /robots.txt are now added to the queue if Crawler#respectRobotsTxt is truthy.
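
To illustrate what a Sitemap directive in /robots.txt looks like, here is a minimal sketch that collects sitemap URLs from a robots.txt body. The helper name `extractSitemapUrls` and the sample file are assumptions made for this example; this is not simplecrawler's own parsing code.

```javascript
// Hypothetical helper (not from simplecrawler's source): return the URLs
// named by "Sitemap:" directives in the text of a robots.txt file.
function extractSitemapUrls(robotsTxt) {
  return robotsTxt
    .split(/\r?\n/)
    // Drop trailing comments and surrounding whitespace.
    .map((line) => line.replace(/#.*$/, "").trim())
    // The directive name is matched case-insensitively.
    .map((line) => /^sitemap\s*:\s*(\S+)/i.exec(line))
    .filter((match) => match !== null)
    .map((match) => match[1]);
}

const sampleRobotsTxt = [
  "User-agent: *",
  "Disallow: /private/",
  "Sitemap: http://example.com/sitemap.xml",
  "sitemap: http://example.com/news-sitemap.xml # news section",
].join("\n");

const sitemapUrls = extractSitemapUrls(sampleRobotsTxt);
console.log(sitemapUrls);
```

A crawler that respects robots.txt could then add each returned URL to its queue, which is the behaviour this release describes.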

    Bug fixes

    • #398 - fix issue where multiple cookies weren't properly serialized for outbound requests
    • #400 - fix issue where <meta name="robots"> tags weren't properly parsed
  • v1.1.5 Changes

    August 15, 2017

    Administrative

    • Welcome @konstantinblaesi! We invited a few people who have made recent and significant contributions to come on board as simplecrawler collaborators. @konstantinblaesi heeded our call and has already submitted several PRs, including an ambitious upstream patch!
    • simplecrawler now has its own GitHub organisation and lives at simplecrawler/simplecrawler. This enables more fine-tuned access controls for current and future collaborators. If you are interested in joining us, see #388.

    Important notice

    • We have dropped support for node 0.12. The lowest supported version is now 4.x. This enables us to use more modern language features and will hopefully enable more patches soon. See #382 for more details.

    Bug fixes

    • #364 - improved performance of Crawler#cleanExpandResources. See #382 for the relevant patch
    • #385 and #363 - @konstantinblaesi submitted an upstream patch to URI.js that employs the same validation logic when calling the URI constructor as when calling the hostname and port methods. See #393 and medialize/URI.js#345 for the relevant patches
  • v1.1.4 Changes

    July 16, 2017

    Bug fixes

    • #377 - fixed multiple issues with Crawler#removeFetchCondition and Crawler#removeDownloadCondition. Previously, those methods promised to throw an error if they couldn't find the targeted fetch/download condition, but they did not. The previous system for condition IDs was not stable either, since the IDs were indexes into an array whose length could change. Crawler#removeFetchCondition also had a bug where it looked for fetch condition references in the Crawler#_downloadConditions array rather than Crawler#_fetchConditions. All of these issues have now been fixed. Thanks to @venning for a great bug report and PR!
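
The ID stability problem described in #377 can be illustrated with a small sketch. The `ConditionRegistry` class below is hypothetical and is not simplecrawler's implementation; it only shows one way to hand out IDs that stay valid when other entries are removed, and to throw when the targeted condition does not exist.

```javascript
// Hypothetical registry, written for illustration only.
// IDs come from a counter rather than from array positions, so removing
// one condition never changes the ID of another.
class ConditionRegistry {
  constructor() {
    this.nextId = 0;
    this.conditions = new Map();
  }

  add(fn) {
    const id = this.nextId++;
    this.conditions.set(id, fn);
    return id;
  }

  remove(id) {
    // Map#delete returns false when the key was absent.
    if (!this.conditions.delete(id)) {
      throw new Error("Unable to find condition with ID " + id);
    }
  }
}

const registry = new ConditionRegistry();
const firstId = registry.add(() => true);
const secondId = registry.add(() => false);

registry.remove(firstId);
registry.remove(secondId); // still valid: removing firstId shifted nothing

let removalThrew = false;
try {
  registry.remove(secondId); // already removed, so this throws
} catch (err) {
  removalThrew = true;
}
console.log(removalThrew); // true
```

With array indexes as IDs, the second removal above could have targeted the wrong entry or nothing at all once the array had shrunk.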
  • v1.1.3 Changes

    June 21, 2017

    New features

    • #376 - added Crawler#sortQueryParameters option. This option will sort the query parameters in a URL, making it simple to avoid fetching duplicate pages on sites that use a lot of query parameters. Thanks @HaroldPutman!
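
As a rough illustration of why sorting query parameters helps avoid duplicate fetches, the sketch below normalises URLs with Node's built-in `URL` class. The function `sortQueryParameters` here is a stand-alone helper written for this example; it is an assumption about the general idea, not the code behind the Crawler#sortQueryParameters option.

```javascript
// Hypothetical stand-alone helper: sort a URL's query parameters so that
// URLs differing only in parameter order normalise to the same string.
function sortQueryParameters(rawUrl) {
  const url = new URL(rawUrl);
  // URLSearchParams#sort orders pairs by name and keeps the relative
  // order of values that share a name.
  url.searchParams.sort();
  return url.toString();
}

const normalisedA = sortQueryParameters("http://example.com/list?page=2&sort=asc");
const normalisedB = sortQueryParameters("http://example.com/list?sort=asc&page=2");

console.log(normalisedA === normalisedB); // true
```

A queue that deduplicates on the normalised string would treat both URLs as the same page.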
  • v1.1.2 Changes

    June 21, 2017

    Bug fixes

    • #371 - fixed an issue where custom cache back-ends would not be properly type checked
  • v1.1.1 Changes

    March 19, 2017

    Bug fixes

    • #360 - updated README to correctly reflect new async fetch/download conditions API
    • #353 - updated default srcset discovery function to be more permissive
    • #357 - ensure that the port parameter is properly removed from the Crawler#getRequestOptions return object when using a custom HTTP agent
  • v1.1.0 Changes

    March 10, 2017

    New features

    • Added the ability to make both fetch conditions and download conditions async. This change also deprecates the previous synchronous behavior (we will be removing it in the next major release). This suggestion was originally brought up by @maxcorbeau in #345.
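
To make the difference between synchronous and async conditions concrete, here is a minimal sketch of a runner that accepts both styles. Everything in it is an assumption made for illustration: the `evaluateConditions` name, the rule that a condition declaring a third parameter is treated as async, and the Node-style `callback(error, result)` convention. It is not simplecrawler's internal code.

```javascript
// Hypothetical runner: a resource is allowed only if every condition
// allows it. Conditions are checked in order and evaluation stops at the
// first error or the first condition that says no.
function evaluateConditions(conditions, queueItem, referrer, done) {
  let index = 0;

  function next(error, allowed) {
    if (error) return done(error);
    if (!allowed) return done(null, false);
    if (index === conditions.length) return done(null, true);

    const condition = conditions[index++];
    if (condition.length >= 3) {
      // Async style (assumed): the condition reports through a callback.
      condition(queueItem, referrer, next);
    } else {
      // Sync style: the return value is the verdict.
      next(null, Boolean(condition(queueItem, referrer)));
    }
  }

  next(null, true);
}

const exampleConditions = [
  // Sync: skip PDF files.
  (queueItem) => !/\.pdf$/i.test(queueItem.path),
  // Async: the verdict arrives later, for example after a lookup.
  (queueItem, referrer, callback) => {
    setImmediate(() => callback(null, queueItem.depth <= 2));
  },
];

evaluateConditions(
  exampleConditions,
  { path: "/index.html", depth: 1 },
  null,
  (error, allowed) => {
    if (error) throw error;
    console.log("fetch allowed:", allowed);
  }
);
```

The callback form lets a condition consult a database or another service before answering, which a plain return value cannot do.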