Commit Graph

44 Commits

Author SHA1 Message Date
DebaucheryLibrarian ccb99e278c Added periodic memory logger. 2021-11-20 23:59:15 +01:00
DebaucheryLibrarian 29b8c5e38e Including unextracted scenes in date determination. 2021-10-28 02:10:30 +02:00
DebaucheryLibrarian 0864154a0e Added unextracted property to keep paginating when extracting scenes. 2021-10-28 01:59:53 +02:00
DebaucheryLibrarian a22c4d5679 Added beforeNetwork hook, used by MindGeek. Added Filthy Kings to Gamma. 2021-10-27 17:19:23 +02:00
DebaucheryLibrarian 100a35b4e8 Added before scene fetch method to prevent e.g. unnecessary session requests, moved scraper assignment to entity lookup. Removed channel URL hostname matching. 2021-10-26 23:42:32 +02:00
DebaucheryLibrarian 6c5d4389fe Not parsing HTML with jsdom when using http module directly to save memory. Added loading ellipsis to release grid pages. 2021-10-25 02:06:24 +02:00
DebaucheryLibrarian 49f891ba44 Ignoring 1-second scene duration from MindGeek API. 2021-10-17 19:59:05 +02:00
DebaucheryLibrarian 193af9bab5 Fixed session options in http module. 2021-03-23 15:25:21 +01:00
DebaucheryLibrarian c2a008afbe Added mimetype check to teasers and trailers. Added chapters to MindGeek scraper, fixed scene ID extraction getting stuck on numbers in domain name. Ordering chapters by timestamp. 2021-02-27 18:05:06 +01:00
DebaucheryLibrarian 37e39dc1ec Added S3 support for media files. Fixed MindGeek scraper for new poster data structure. 2021-02-22 02:33:39 +01:00
DebaucheryLibrarian 7ff222ce25 Passing recursive parameters to all scraper methods. Using throttle parameters in MindGeek scraper, fixed missing slug breaking scene and actor URLs. 2021-02-10 03:23:48 +01:00
DebaucheryLibrarian 824fb9ef37 Changed profile network argument to context. 2021-02-03 00:50:00 +01:00
DebaucheryLibrarian 6d93083581 Removed superfluous MindGeek scrapers. 2021-02-03 00:46:59 +01:00
DebaucheryLibrarian e38922f372 Removed redundant sitename from MindGeek session error. 2021-01-05 16:35:49 +01:00
DebaucheryLibrarian e1d6c9e489 Added site name to MindGeek session error. 2021-01-05 16:34:32 +01:00
DebaucheryLibrarian 9ca2ec6dd0 Fixed parent entity relations in seed file. Fixed MindGeek scraper session URL determination. 2021-01-05 16:27:20 +01:00
DebaucheryLibrarian 0633197793 Removed direct bhttp usage from scrapers in favor of local http module. Deleted legacy scrapers, as old code is available via git repo history. 2020-11-23 00:05:02 +01:00
DebaucheryLibrarian 39f8c037a5 Replaced bhttp with patched fork. Improved Jesse Loads Monster Facials scraper reliability (WIP). Added various tag photos. 2020-10-30 17:37:10 +01:00
DebaucheryLibrarian 3c9468b0f1 Fixed wrong MindGeek session acquire URL. 2020-09-18 03:27:00 +02:00
DebaucheryLibrarian a4929819df Using channel URL instead of composed URL for session retrieval, should fix Brazzers. 2020-09-18 02:54:05 +02:00
DebaucheryLibrarian 1bfdf4b232 Storing actor profiles from scene pages. 2020-08-30 04:18:47 +02:00
DebaucheryLibrarian 8611d738b0 Using UTC to query date ranges. Removed stray console log from MindGeek scraper. 2020-08-26 02:01:38 +02:00
DebaucheryLibrarian 52f66e7982 Fixed undefined location in FreeOnes scraper. 2020-08-24 18:24:07 +02:00
DebaucheryLibrarian 7fed5b7138 Moved Brazzers to MindGeek scraper to support new site. 2020-08-24 05:13:34 +02:00
DebaucheryLibrarian dff4d15872 Updated profile scrapers to use base actor instead of actor name. Fixes for Reality Kings and Cherry Pimps scrapers. 2020-07-21 01:44:51 +02:00
ThePendulum 98c19b560f Updated mindgeek scraper for entities. Various fixes. 2020-06-28 22:29:18 +02:00
ThePendulum 885aa4f627 Passing context object with site or network instead of scraper slug and 'site or network' to all profile scrapers. 2020-05-18 03:22:03 +02:00
ThePendulum 11eb66f834 Switched to tabs. Adding missing actor entries when scraping actors, with batch ID. 2020-05-14 04:26:05 +02:00
ThePendulum 0f09fd53eb Refactoring deep scrape. Added tag posters. 2020-03-16 04:10:52 +01:00
ThePendulum b03775fa07 Using generic slugify for MindGeek channel. 2020-02-29 05:00:50 +01:00
ThePendulum 8359f78e2e Fixed RK scraper returning dick size as bust size. 2020-02-23 22:01:12 +01:00
ThePendulum 97f5e49187 Refactored media module. Returning 320p and 720p videos from MindGeek as teasers instead of trailers. 2020-02-19 04:41:53 +01:00
ThePendulum 139f0ce7cb Allowing release scrapers to return actor details. Added True Amateurs. 2020-02-09 23:25:54 +01:00
ThePendulum 9d9eda29be Added scene count to actor inspect. Preferring network slug over data brand for scene URLs in MindGeek scraper, since milehighmedia.com's brand is milehigh, resulting in milehigh.com. 2020-02-09 03:09:06 +01:00
ThePendulum f921bb4ae9 Generating and using URL slugs for releases, improved slugify module. Added 'extract' parameter to MindGeek scraper to get scenes not associated with a channel (see Digital Playground). Added various high res logos. 2020-02-04 03:12:09 +01:00
ThePendulum 87e2d6bbfd Added actor releases to MindGeek module. 2020-02-01 04:42:35 +01:00
ThePendulum cde9aba0cb Redundant actor sources can now be bundled in configuration. Fixed Men network actor path. 2020-02-01 04:14:08 +01:00
ThePendulum 3f113310e3 Added Trans Angels to MindGeek. Interpreting MindGeek 'other' gender as transsexual. 2020-01-31 00:25:51 +01:00
ThePendulum ff61094b69 Added Men network and Icon Male to MindGeek. Added entropy filter to media module to help filter out generic avatars. Added Pure Taboo. Various logo updates. 2020-01-30 01:14:31 +01:00
ThePendulum 81ede3f511 Improved MindGeek avatar fix. 2020-01-29 02:31:55 +01:00
ThePendulum fc675ae144 Added Metro HD network using MindGeek scraper. Fixed MindGeek profile scraper avatar issue. 2020-01-29 02:24:19 +01:00
ThePendulum 76852daf6d Added VogoV (no trailer yet). Fixed MindGeek profile scraper. 2020-01-28 03:05:53 +01:00
ThePendulum 6d4fd5fd77 Added MindGeek profile scraper for all MG sites. 2020-01-27 22:54:14 +01:00
ThePendulum ef76909d3c Merged Reality Kings and Babes into MindGeek scraper. Kept classic latest wrapper for Look At Her Now and Tranny Surprise. 2020-01-14 04:50:42 +01:00