skale alternatives and similar modules
Based on the "Mad Science" category.
Alternatively, view skale alternatives based on common mentions on social networks and blogs.
-
dat
DISCONTINUED. :floppy_disk: peer-to-peer sharing & live syncronization of files via command line [ DEPRECATED - More info on active projects and modules at https://dat-ecosystem.org/ ] -
webcat
Mad science p2p pipe across the web using webrtc that uses your Github private/public key for authentication and a signalhub for discovery
SaaSHub - Software Alternatives and Reviews
* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.
Do you think we are missing an alternative of skale or a related project?
Popular Comparisons
README
[logo](docs/images/logo-skale.png)
High performance distributed data processing and machine learning.
Skale provides a high-level API in Javascript and an optimized parallel execution engine on top of NodeJS.
Features
- Pure javascript implementation of a Spark like engine
- Multiple data sources: filesystems, databases, cloud (S3, azure)
- Multiple data formats: CSV, JSON, Columnar (Parquet)...
- 50 high level operators to build parallel apps
- Machine learning: scalable classification, regression, clusterization
- Run interactively in a nodeJS REPL shell
- Docker [ready](docker/), simple local mode or full distributed mode
- Very fast, see [benchmark](benchmark/)
Quickstart
npm install skale
Word count example:
var sc = require('skale').context();
sc.textFile('/my/path/*.txt')
.flatMap(line => line.split(' '))
.map(word => [word, 1])
.reduceByKey((a, b) => a + b, 0)
.count(function (err, result) {
console.log(result);
sc.end();
});
Local mode
In local mode, worker processes are automatically forked and communicate with app through child process IPC channel. This is the simplest way to operate, and it allows to use all machine available cores.
To run in local mode, just execute your app script:
node my_app.js
or with debug traces:
SKALE_DEBUG=2 node my_app.js
Distributed mode
In distributed mode, a cluster server process and worker processes must be started prior to start app. Processes communicate with each other via raw TCP or via websockets.
To run in distributed cluster mode, first start a cluster server
on server_host
:
./bin/server.js
On each worker host, start a worker controller process which connects to server:
./bin/worker.js -H server_host
Then run your app, setting the cluster server host in environment:
SKALE_HOST=server_host node my_app.js
The same with debug traces:
SKALE_HOST=server_host SKALE_DEBUG=2 node my_app.js
Resources
- [Contributing guide](CONTRIBUTING.md)
- Documentation
- Gitter for support and discussion
- Mailing list for discussion about use and development
Authors
The original authors of skale are Cedric Artigue and Marc Vertes.
License
[Apache-2.0](LICENSE)
Credits
Logo Icon made by Smashicons from www.flaticon.com is licensed by CC 3.0 BY
*Note that all licence references and agreements mentioned in the skale README section above
are relevant to that project's source code only.