Improve pagination behavior #11

Closed
opened 2020-09-18 22:10:15 +00:00 by pendulum · 1 comment

Some sites, such as HardX, will keep serving the last page when you paginate beyond the available pages. `needNextPage()` in `src/updates.js` does not currently handle this properly.

For `--last 1000`, it will keep paginating until it has found 1000 scenes, even if the last 300 came from that same last page. This probably affects `nullDateLimit` similarly, but that is less of a problem, as it will stop eventually.

However, for `--latest`, it will keep paginating until the last scene on the page is older than the requested date range. If the last scene on the last page is still newer than the date range, it will paginate forever, fetching the same last page each time.

This was originally resolved by checking `uniqueReleases.length`. However, if you had already scraped a full page before, it would not find any unique releases on page 1, and would thus stop paginating without checking page 2 for unique scenes.

A new approach will probably need to distinguish current unique releases from database unique releases.
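One way to make that distinction is to compare each page against the scenes already collected in the current run, and leave the database out of the decision. The following is only a rough sketch, not the actual code in `src/updates.js`: the `entryId` and `date` fields, the `options.last` and `options.after` names, and the helper `getRunUniqueScenes()` are all assumptions made for illustration.

```javascript
// Hypothetical sketch; names and data shapes are assumptions, not the real src/updates.js.

// Scenes on this page that were not seen on earlier pages of the same run.
function getRunUniqueScenes(pageScenes, accumulatedScenes) {
  const seenIds = new Set(accumulatedScenes.map((scene) => scene.entryId));
  return pageScenes.filter((scene) => !seenIds.has(scene.entryId));
}

// Decide whether another page should be fetched.
// options.last stands in for --last (a count), options.after for the --latest date limit.
function needNextPage(pageScenes, accumulatedScenes, options) {
  const runUniqueScenes = getRunUniqueScenes(pageScenes, accumulatedScenes);

  // Nothing new in this run: an empty page, or the site served the last page again.
  if (runUniqueScenes.length === 0) {
    return false;
  }

  if (options.last) {
    return accumulatedScenes.length + runUniqueScenes.length < options.last;
  }

  if (options.after) {
    // Assumes scenes are listed newest first, so the last one is the oldest.
    const oldestScene = pageScenes[pageScenes.length - 1];
    return oldestScene.date > options.after;
  }

  return false;
}
```

Because the check ignores what is already in the database, a fully scraped page 1 still counts as new within the run, so page 2 gets checked, while a repeated last page stops the loop for both `--last` and `--latest`.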

pendulum added the bug label 2020-09-18 22:10:57 +00:00
Author

Pagination now only uses the scraped scenes, and database duplicates are filtered from the accumulated result.
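The fix described here could look roughly like the loop below. This is an illustrative sketch under stated assumptions, not the committed code: `scrapeScenes()`, `fetchPage`, `knownEntryIds` and the `entryId` field are hypothetical names, and the fetch is synchronous only to keep the example short.

```javascript
// Hypothetical sketch; scrapeScenes, fetchPage and knownEntryIds are illustrative names.
function scrapeScenes(fetchPage, knownEntryIds, limit) {
  const accumulated = [];
  const seenIds = new Set();

  for (let page = 1; accumulated.length < limit; page += 1) {
    // Keep only scenes that have not been scraped earlier in this run.
    const fresh = fetchPage(page).filter((scene) => {
      if (seenIds.has(scene.entryId)) return false;
      seenIds.add(scene.entryId);
      return true;
    });

    // An empty page or a repeated last page ends pagination,
    // regardless of what the database already contains.
    if (fresh.length === 0) break;

    accumulated.push(...fresh);
  }

  // Database duplicates are filtered only after pagination has finished.
  return accumulated
    .slice(0, limit)
    .filter((scene) => !knownEntryIds.has(scene.entryId));
}

// Simulated site with two pages that keeps serving page 2 beyond the end.
const sitePages = [
  [{ entryId: 1 }, { entryId: 2 }],
  [{ entryId: 3 }, { entryId: 4 }],
];
const fetchSimulatedPage = (page) => sitePages[Math.min(page, sitePages.length) - 1];

// Page 1 is already in the database; page 2 is still reached and the loop terminates.
const result = scrapeScenes(fetchSimulatedPage, new Set([1, 2]), 1000);
console.log(result.map((scene) => scene.entryId)); // [ 3, 4 ]
```

Separating the two checks avoids both failure modes from the issue: a previously scraped page 1 no longer stops pagination early, and a site that repeats its last page no longer causes an endless loop.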

Reference: DebaucheryLibrarian/traxxx#11