scrapyd-go

command module
Version: v0.0.0-...-bbcf28f
Published: Nov 5, 2019 License: Apache-2.0 Imports: 23 Imported by: 0

A drop-in replacement for scrapyd that is easier to scale and distribute across any number of commodity machines. Each scrapyd-go instance is a stateless microservice, and all instances must connect to the same redis server. Redis serves as a centralized registry for all instances, so each instance sees what the others see.

Why

scrapyd isn't bad, but it is very stateful, which makes it hard to deploy in a distributed environment such as k8s. I also wanted to add more features. So I started this project as a drop-in replacement for scrapyd, written on a modern, scalable stack: Go for the RESTful server and redis as the centralized registry.

TODOs

  • schedule.json
  • cancel.json
  • addversion.json
  • listprojects.json
  • listversions.json
  • listspiders.json
  • delproject.json
  • delversion.json
  • listjobs.json
  • daemonstatus.json
  • logs/{jobid}, new: realtime output of the job log
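
Because the endpoints above follow scrapyd's naming, an existing scrapyd client should in principle work unchanged. The following is a minimal sketch of building a schedule.json request with Python's standard library. It assumes scrapyd-go accepts the same form fields as scrapyd (`project` and `spider`); the project name `demo`, the spider name `quotes`, and the helper `build_schedule_request` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_schedule_request(base_url, project, spider, **spider_args):
    """Build (but do not send) a POST request for schedule.json."""
    # Assumption: like scrapyd, the server reads form-encoded fields,
    # with 'project' and 'spider' required and extra fields passed along.
    fields = {"project": project, "spider": spider, **spider_args}
    body = urlencode(fields).encode("utf-8")
    url = base_url.rstrip("/") + "/schedule.json"
    return Request(url, data=body, method="POST")


# Hypothetical project and spider names; :6800 is the default -listen port.
req = build_schedule_request("http://localhost:6800", "demo", "quotes")
print(req.get_method(), req.full_url)
```

To actually submit the job, pass `req` to `urllib.request.urlopen` while an instance is running.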

Configurations

scrapyd-go is configured entirely through command-line flags:

  -dir string
        the directory to use for local caching (default ".scrapyd-go")
  -listen string
        the address to bind to (default ":6800")
  -max2keep int
        the maximum jobs/logs to keep in memory (default 1000000)
  -poll int
        time in millisecond between each poll operation from queue(s) (default 10)
  -python string
        the python binary to use (default "python3")
  -redis string
        the redis server address (default "redis://:somepass@localhost:6379/1")
  -sync int
        time in seconds between each sync operation (default 15)
  -workers int
        the maximum workers count (default cpu-cores-count)
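
As an illustration of how these flags fit together, the sketch below assembles a command line from a dictionary of options and splits the default `-redis` URL into its parts. The helper `build_argv` is hypothetical, and reading the URL path as the redis database index is an assumption based on the common `redis://` URL convention, not something this README states.

```python
from urllib.parse import urlsplit


def build_argv(binary, options):
    """Turn {"listen": ":6800"} into [binary, "-listen", ":6800"]."""
    argv = [binary]
    for name, value in options.items():
        # Each flag is written as "-name value", matching the listing above.
        argv += ["-" + name, str(value)]
    return argv


redis_url = "redis://:somepass@localhost:6379/1"
argv = build_argv(
    "scrapyd-go",
    {"redis": redis_url, "listen": ":6800", "workers": 4},
)

parts = urlsplit(redis_url)
# (password, host, port, database index); the last item is an assumption.
redis_info = (parts.password, parts.hostname, parts.port, parts.path.lstrip("/"))
print(argv)
print(redis_info)
```

The resulting `argv` list can be handed to `subprocess.run` once the binary is installed.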

Installation

  • binary: go to the releases page and download the release for your OS
  • docker: $ docker pull alash3al/scrapyd-go
  • source: $ go get github.com/alash3al/scrapyd-go

Running

  • binary: $ ./scrapyd_bin_file -redis redis://localhost:6379/1
  • docker: $ docker run --link SomeRedisServerContainer -p 6800:6800 alash3al/scrapyd-go -redis redis://SomeRedisServerContainer:6379/1
  • source: $ scrapyd-go -redis redis://localhost:6379/1
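
Once an instance is running, one way to confirm it is up is to fetch daemonstatus.json and inspect the reply. The sketch below only parses a response body. It assumes scrapyd-go mirrors the JSON shape that scrapyd documents for this endpoint (a `status` field equal to `"ok"`); the sample body and the helper `is_healthy` are hypothetical, and the actual fields returned by scrapyd-go may differ.

```python
import json


def is_healthy(body):
    """Return True when a daemonstatus.json body reports status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON, so treat the instance as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


# Hypothetical sample in the shape scrapyd documents for this endpoint.
sample = '{"status": "ok", "running": 0, "pending": 0, "finished": 0}'
print(is_healthy(sample))
```

In practice the body would come from a GET request to `http://localhost:6800/daemonstatus.json`, or whatever address `-listen` is set to.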

Contributing

  • Fork the repo
  • Create a feature branch
  • Push your changes
  • Create a pull request

License

Apache License v2.0
