Commit Graph

58 Commits

Author SHA1 Message Date
DebaucheryLibrarian 1a24d39761 Updated MG session acquisition. 2023-03-27 00:01:38 +02:00
DebaucheryLibrarian 2943aea4d8 Added showcased migration. Added Love Her Films scraper. 2022-10-25 23:13:24 +02:00
DebaucheryLibrarian 85c73bad77 Improved MindGeek actor scraper. 2022-05-30 00:05:33 +02:00
DebaucheryLibrarian 61123fdb6a Added Accept-Language header to MindGeek requests, seems to help with acquiring sessions. 2022-03-30 01:17:54 +02:00
DebaucheryLibrarian e93e8ace5c Added deep scene force parameter to MindGeek scraper. 2022-03-27 00:27:26 +01:00
DebaucheryLibrarian fd8170f223 Added series. 2022-03-26 17:40:20 +01:00
DebaucheryLibrarian 5ff076cac3 Added DP Star Sex Challenges to Digital Playground. 2022-03-09 23:26:48 +01:00
DebaucheryLibrarian c6e977f842 Added movie support to MindGeek scraper. 2022-03-04 23:32:09 +01:00
DebaucheryLibrarian 496c29e569 Configured Reality Kings to fetch session from RK scene overview. 2022-02-11 22:14:44 +01:00
DebaucheryLibrarian 20da2d1cf6 Reusing batch ID for movies to preserve new-flag. 2022-01-20 00:54:10 +01:00
DebaucheryLibrarian 372db86927 Disabled MindGeek session bundling to analyze Too Many Requests errors. 2022-01-16 22:24:47 +01:00
DebaucheryLibrarian 9d7183ac69 Added PurgatoryX scraper. 2021-11-27 23:55:16 +01:00
DebaucheryLibrarian 20d0d860d3 Fixed MindGeek scraper trying to acquire session from mindgeek.com 2021-11-22 02:51:52 +01:00
DebaucheryLibrarian 6b4aa64d74 Improved MindGeek scraper session check to prevent crash when network session isn't available yet. 2021-11-22 02:44:03 +01:00
DebaucheryLibrarian 26539b74a5 Updated dependencies. Added periodic memory logger. 2021-11-20 23:59:48 +01:00
DebaucheryLibrarian 29b8c5e38e Including unextracted scenes in date determination. 2021-10-28 02:10:30 +02:00
DebaucheryLibrarian 0864154a0e Added unextracted property to keep paginating when extracting scenes. 2021-10-28 01:59:53 +02:00
DebaucheryLibrarian a22c4d5679 Added beforeNetwork hook, used by MindGeek. Added Filthy Kings to Gamma. 2021-10-27 17:19:23 +02:00
DebaucheryLibrarian 100a35b4e8 Added before scene fetch method to prevent e.g. unnecessary session requests, moved scraper assignment to entity lookup. Removed channel URL hostname matching. 2021-10-26 23:42:32 +02:00
DebaucheryLibrarian 6c5d4389fe Not parsing HTML with jsdom when using http module directly to save memory. Added loading ellipsis to release grid pages. 2021-10-25 02:06:24 +02:00
DebaucheryLibrarian 49f891ba44 Ignoring 1-second scene duration from MindGeek API. 2021-10-17 19:59:05 +02:00
DebaucheryLibrarian 193af9bab5 Fixed session options in http module. 2021-03-23 15:25:21 +01:00
DebaucheryLibrarian c2a008afbe Added mimetype check to teasers and trailers. Added chapters to MindGeek scraper, fixed scene ID extraction getting stuck on numbers in domain name. Ordering chapters by timestamp. 2021-02-27 18:05:06 +01:00
DebaucheryLibrarian 37e39dc1ec Added S3 support for media files. Fixed MindGeek scraper for new poster data structure. 2021-02-22 02:33:39 +01:00
DebaucheryLibrarian 7ff222ce25 Passing recursive parameters to all scraper methods. Using throttle parameters in MindGeek scraper, fixed missing slug breaking scene and actor URLs. 2021-02-10 03:23:48 +01:00
DebaucheryLibrarian 824fb9ef37 Changed profile network argument to context. 2021-02-03 00:50:00 +01:00
DebaucheryLibrarian 6d93083581 Removed superfluous MindGeek scrapers. 2021-02-03 00:46:59 +01:00
DebaucheryLibrarian e38922f372 Removed redundant sitename from MindGeek session error. 2021-01-05 16:35:49 +01:00
DebaucheryLibrarian e1d6c9e489 Added site name to MindGeek session error. 2021-01-05 16:34:32 +01:00
DebaucheryLibrarian 9ca2ec6dd0 Fixed parent entity relations in seed file. Fixed MindGeek scraper session URL determination. 2021-01-05 16:27:20 +01:00
DebaucheryLibrarian 0633197793 Removed direct bhttp usage from scrapers in favor of local http module. Deleted legacy scrapers, as old code is available via git repo history. 2020-11-23 00:05:02 +01:00
DebaucheryLibrarian 39f8c037a5 Replaced bhttp with patched fork. Improved Jesse Loads Monster Facials scraper reliability (WIP). Added various tag photos. 2020-10-30 17:37:10 +01:00
DebaucheryLibrarian 3c9468b0f1 Fixed wrong MindGeek session acquire URL. 2020-09-18 03:27:00 +02:00
DebaucheryLibrarian a4929819df Using channel URL instead of composed URL for session retrieval, should fix Brazzers. 2020-09-18 02:54:05 +02:00
DebaucheryLibrarian 1bfdf4b232 Storing actor profiles from scene pages. 2020-08-30 04:18:47 +02:00
DebaucheryLibrarian 8611d738b0 Using UTC to query date ranges. Removed stray console log from MindGeek scraper. 2020-08-26 02:01:38 +02:00
DebaucheryLibrarian 52f66e7982 Fixed undefined location in FreeOnes scraper. 2020-08-24 18:24:07 +02:00
DebaucheryLibrarian 7fed5b7138 Moved Brazzers to MindGeek scraper to support new site. 2020-08-24 05:13:34 +02:00
DebaucheryLibrarian dff4d15872 Updated profile scrapers to use base actor instead of actor name. Fixes for Reality Kings and Cherry Pimps scrapers. 2020-07-21 01:44:51 +02:00
ThePendulum 98c19b560f Updated mindgeek scraper for entities. Various fixes. 2020-06-28 22:29:18 +02:00
ThePendulum 885aa4f627 Passing context object with site or network instead of scraper slug and 'site or network' to all profile scrapers. 2020-05-18 03:22:03 +02:00
ThePendulum 11eb66f834 Switched to tabs. Adding missing actor entries when scraping actors, with batch ID. 2020-05-14 04:26:05 +02:00
ThePendulum 0f09fd53eb Refactoring deep scrape. Added tag posters. 2020-03-16 04:10:52 +01:00
ThePendulum b03775fa07 Using generic slugify for MindGeek channel. 2020-02-29 05:00:50 +01:00
ThePendulum 8359f78e2e Fixed RK scraper returning dick size as bust size. 2020-02-23 22:01:12 +01:00
ThePendulum 97f5e49187 Refactored media module. Returning 320p and 720p videos from MindGeek as teasers instead of trailers. 2020-02-19 04:41:53 +01:00
ThePendulum 139f0ce7cb Allowing release scrapers to return actor details. Added True Amateurs. 2020-02-09 23:25:54 +01:00
ThePendulum 9d9eda29be Added scene count to actor inspect. Preferring network slug over data brand for scene URLs in MindGeek scraper, since milehighmedia.com's brand is milehigh, resulting in milehigh.com. 2020-02-09 03:09:06 +01:00
ThePendulum f921bb4ae9 Generating and using URL slugs for releases, improved slugify module. Added 'extract' parameter to MindGeek scraper to get scenes not associated with a channel (see Digital Playground). Added various high res logos. 2020-02-04 03:12:09 +01:00
ThePendulum 87e2d6bbfd Added actor releases to MindGeek module. 2020-02-01 04:42:35 +01:00