Releases: crawlab-team/crawlab
v0.5.0
Features / Enhancement
- Spider Market. Allow users to download open-source spiders into Crawlab.
- Batch actions. Allow users to interact with Crawlab in batches, e.g. batch run tasks, batch delete spiders, etc.
- Migrate MongoDB driver to MongoDriver.
- Refactor and optimize node-related logic.
- Change default task.workers to 16.
- Change default nginx client_max_body_size to 200m.
- Support writing logs to ElasticSearch.
- Display error details in Scrapy page.
- Removed Challenge page.
- Moved Feedback and Disclaimer pages to navbar.
Bug Fixes
- Fixed log not expiring issue because of failure to create TTL index.
- Set default log expire duration to 1 day.
- task_id index not created.
- docker-compose.yml fix.
- Fixed 404 page.
- Fixed issue where a worker node could not be created before the master node.
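
The log-expiry fixes in this release depend on a MongoDB TTL index. As a rough illustration only (Crawlab's backend is not written in Python, and this is not its actual code), the sketch below builds the arguments one might pass to pymongo's `Collection.create_index`. The field name `ts`, the collection name `logs`, and the helper name are hypothetical.

```python
from datetime import timedelta

# Default log lifetime taken from the release note: 1 day.
DEFAULT_LOG_EXPIRE = timedelta(days=1)


def ttl_index_spec(field="ts", expire=DEFAULT_LOG_EXPIRE):
    """Return (keys, options) describing a MongoDB TTL index.

    `field` is a hypothetical name for the log timestamp field.
    """
    # TTL indexes must be single-field indexes on a date field.
    keys = [(field, 1)]
    # MongoDB removes a document once this many seconds have passed
    # since the time stored in the indexed field.
    options = {"expireAfterSeconds": int(expire.total_seconds())}
    return keys, options


keys, options = ttl_index_spec()
# With pymongo this could be applied as:
#   db["logs"].create_index(keys, **options)  # "logs" is a hypothetical collection
```

If the index creation fails silently, no documents are ever expired, which matches the symptom described in the first bug fix.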
v0.4.10
Features / Enhancement
- Enhanced Log Management. Centralized log storage in MongoDB, reduced the dependency on PubSub, and enabled log error detection.
- API Token. Allow users to generate API tokens and use them to integrate into their own systems.
- Web Hook. Trigger a Web Hook HTTP request to a pre-defined URL when a task starts or finishes.
- Auto Install Dependencies. Allow installing dependencies automatically from requirements.txt or package.json.
- Auto Results Collection. Set results collection to results_<spider_name> if it is not set.
- Optimized Project List. Do not display the "No Project" item in the project list.
- Upgrade Node.js. Upgrade Node.js version from v8.12 to v10.19.
- Add Run Button in Schedule Page. Allow users to manually run task in Schedule Page.
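
The Auto Results Collection rule above amounts to a simple fallback. A minimal sketch, assuming the spider name is used verbatim in the collection name (the function name is hypothetical and not part of Crawlab):

```python
def resolve_results_collection(spider_name, configured=None):
    """Return the configured collection, or results_<spider_name> if unset."""
    if configured:  # treats None and "" as "not set"
        return configured
    return f"results_{spider_name}"
```

For example, a spider named `quotes` with no collection configured would write to `results_quotes`, while an explicitly configured name is left untouched.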
Bug Fixes
v0.4.9
Features / Enhancement
- Challenges. Users can achieve different challenges based on their actions.
- More Advanced Access Control. More granular access control, e.g. normal users can only view/manage their own spiders/projects and admin users can view/manage all spiders/projects.
- Feedback. Allow users to send feedback and ratings to the Crawlab team.
- Better Home Page Metrics. Optimized metrics display on home page.
- Configurable Spiders Converted to Customized Spiders. Allow users to convert their configurable spiders into customized spiders which are also Scrapy spiders.
- View Tasks Triggered by Schedule. Allow users to view tasks triggered by a schedule. #648
- Support Results De-Duplication. Allow users to configure de-duplication of results. #579
- Support Task Restart. Allow users to re-run historical tasks.
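
To illustrate what results de-duplication can mean in practice, here is a small sketch that de-duplicates scraped items on one key field. The two modes, `overwrite` (keep the latest item) and `ignore` (keep the first item), are assumptions made for this example, not a description of Crawlab's exact options.

```python
def deduplicate(items, key, mode="overwrite"):
    """De-duplicate a list of dict items on `key`.

    mode="overwrite": a later item replaces an earlier one with the same key.
    mode="ignore":    a later item with an already-seen key is dropped.
    """
    if mode not in ("overwrite", "ignore"):
        raise ValueError(f"unknown mode: {mode!r}")
    seen = {}  # key value -> item; dicts preserve first-insertion order
    for item in items:
        value = item[key]
        if value in seen and mode == "ignore":
            continue
        seen[value] = item
    return list(seen.values())
```

In both modes the output order follows the first appearance of each key value.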
Bug Fixes
v0.4.8
Features / Enhancement
- Support Installations of More Programming Languages. Now users can install or pre-install more programming languages including Java, .NET Core and PHP.
- Installation UI Optimization. Users can better view and manage installations on Node List page.
- More Git Support. Allow users to view the Git commit history and check out a specific commit.
- Support Hostname Node Registration Type. Users can set the hostname as the node key, which serves as the node's unique identifier.
- RPC Support. Added RPC support to better manage node communication.
- Run On Master Switch. Users can determine whether to run tasks on master. If not, all tasks will be run only on worker nodes.
- Disabled Tutorial by Default.
- Added Related Documentation Sidebar.
- Loading Page Optimization.
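
The Run On Master switch described above can be pictured as a filter over candidate nodes. This is a sketch under assumed names (`Node`, `is_master`, `eligible_nodes` are all hypothetical), not Crawlab's scheduling code:

```python
from dataclasses import dataclass


@dataclass
class Node:
    key: str         # e.g. the hostname, if hostname registration is used
    is_master: bool


def eligible_nodes(nodes, run_on_master):
    """Return the nodes that may receive tasks.

    With run_on_master disabled, only worker nodes are returned.
    """
    if run_on_master:
        return list(nodes)
    return [n for n in nodes if not n.is_master]
```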
Bug Fixes
v0.4.7
Features / Enhancement
- Better Support for Scrapy. Spiders identification, settings.py configuration, log level selection, spider selection. #435
- Git Sync. Allow users to sync git projects to Crawlab.
- Long Task Support. Users can add long-task spiders, which are expected to run without finishing. #425
- Spider List Optimization. Tasks count by status, tasks detail popup, legend. #425
- Upgrade Check. Check latest version and notify users to upgrade.
- Spiders Batch Operation. Allow users to run/stop spider tasks and delete spiders in batches.
- Copy Spiders. Allow users to copy an existing spider to create a new one.
- Wechat Group QR Code.
Bug Fixes
- Schedule Spider Selection Issue. Fields not responding to spider change.
- Cron Jobs Conflict. Possible bug when two spiders have cron jobs set to the same time. #515 #565
- Task Log Issue. Different tasks write to the same log file if triggered at the same time. #577
- Task List Filter Options Incomplete.
v0.4.6
Features / Enhancement
- SDK for Node.js. Users can apply SDK in their Node.js spiders.
- Log Management Optimization. Log search, error highlight, auto-scrolling.
- Task Execution Process Optimization. Allow users to be redirected to task detail page after triggering a task.
- Task Display Optimization. Added "Param" in the Latest Tasks table in the spider detail page. #295
- Spider List Optimization. Added "Update Time" and "Create Time" in spider list page.
- Page Loading Placeholder.
Bug Fixes
v0.4.5
Features / Enhancement
- Interactive Tutorial. Guide users through the main functionalities of Crawlab.
- Global Environment Variables. Allow users to set global environment variables, which will be passed into all spider programs. #177
- Project. Allow users to link spiders to projects. #316
- Demo Spiders. Added demo spiders when Crawlab is initialized. #379
- User Admin Optimization. Restrict privileges of admin users. #456
- Setting Page Optimization.
- Task Results Optimization.
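
As an illustration of how global environment variables can be passed into every spider program, the sketch below merges a set of global variables into the process environment before launching a command. The function name and the variable `CRAWLAB_DEMO_VAR` are hypothetical, and this is not Crawlab's own launcher.

```python
import os
import subprocess
import sys


def run_spider(cmd, global_env):
    """Run `cmd` with the global variables layered over the current environment."""
    env = {**os.environ, **global_env}  # global variables win on conflicts
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


# A stand-in "spider" that just prints the variable it received.
result = run_spider(
    [sys.executable, "-c", "import os; print(os.environ['CRAWLAB_DEMO_VAR'])"],
    {"CRAWLAB_DEMO_VAR": "hello"},
)
```

Inside a real spider, the same values would be read with `os.environ` in exactly the way the stand-in command does.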
Bug Fixes
- Unable to find spider file error. #485
- Click delete button results in redirect. #480
- Unable to create files in an empty spider. #479
- Download results error. #465
- crawlab-sdk CLI error. #458
- Page refresh issue. #441
- Results do not support JSON. #202
- Getting all spiders after deleting a spider.
- i18n warning.
v0.4.4
Features / Enhancement
- Email Notification. Allow users to send email notifications.
- DingTalk Robot Notification. Allow users to send DingTalk Robot notifications.
- Wechat Robot Notification. Allow users to send Wechat Robot notifications.
- API Address Optimization. Added relative URL path in frontend so that users don't have to specify CRAWLAB_API_ADDRESS explicitly.
- SDK Compatibility. Allow users to integrate Scrapy or general spiders with Crawlab SDK.
- Enhanced File Management. Added tree-like file sidebar to allow users to edit files much more easily.
- Advanced Schedule Cron. Allow users to edit schedule cron with visualized cron editor.
Bug Fixes
- nil retuened error.
- Error when using HTTPS.
v0.4.3
Features / Enhancement
- Dependency Installation. Allow users to install/uninstall dependencies and add programming languages (Node.js only for now) on the platform web interface.
- Pre-install Programming Languages in Docker. Allow Docker users to set CRAWLAB_SERVER_LANG_NODE as Y to pre-install Node.js environments.
- Add Schedule List in Spider Detail Page. Allow users to view / add / edit schedule cron jobs in the spider detail page. #360
- Align Cron Expression with Linux. Change cron expressions from 6 elements to 5 elements, matching the Linux cron format.
- Enable/Disable Schedule Cron. Allow users to enable/disable the schedule jobs. #297
- Better Task Management. Allow users to batch delete tasks. #341
- Better Spider Management. Allow users to sort and filter spiders in the spider list page.
- Added Chinese CHANGELOG.
- Added GitHub Star Button at Nav Bar.
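
For anyone migrating schedules written before the cron alignment above, a conversion could look like the following sketch. It assumes the old 6-element form carried a leading seconds field, which is an assumption about the previous format rather than something stated in these notes; the helper name is hypothetical.

```python
def to_five_field_cron(expr):
    """Convert a cron expression to the 5-element Linux form.

    Assumes a 6-element expression starts with a seconds field, which is dropped.
    """
    fields = expr.split()
    if len(fields) == 5:
        return " ".join(fields)      # already in Linux form
    if len(fields) == 6:
        return " ".join(fields[1:])  # drop the assumed seconds field
    raise ValueError(f"expected 5 or 6 fields, got {len(fields)}: {expr!r}")
```

Dropping the seconds field changes behavior for any schedule that fired more often than once per minute, so such schedules would need manual review.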
Bug Fixes
v0.4.2
Features / Enhancement
- Disclaimer. Added page for Disclaimer.
- Call API to fetch version. #371
- Configure to allow user registration. #346
- Allow adding new users.
- More Advanced File Management. Allow users to add / edit / rename / delete files. #286
- Optimized Spider Creation Process. Allow users to create an empty customized spider before uploading the zip file.
- Better Task Management. Allow users to filter tasks by selecting certain criteria. #341